Advances in Intelligent Systems and Computing 1239
Paulo Novais · Gianni Vercelli · Josep L. Larriba-Pey · Francisco Herrera · Pablo Chamoso Editors
Ambient Intelligence – Software and Applications 11th International Symposium on Ambient Intelligence
Advances in Intelligent Systems and Computing Volume 1239
Series Editor
Janusz Kacprzyk, Systems Research Institute, Polish Academy of Sciences, Warsaw, Poland

Advisory Editors
Nikhil R. Pal, Indian Statistical Institute, Kolkata, India
Rafael Bello Perez, Faculty of Mathematics, Physics and Computing, Universidad Central de Las Villas, Santa Clara, Cuba
Emilio S. Corchado, University of Salamanca, Salamanca, Spain
Hani Hagras, School of Computer Science and Electronic Engineering, University of Essex, Colchester, UK
László T. Kóczy, Department of Automation, Széchenyi István University, Gyor, Hungary
Vladik Kreinovich, Department of Computer Science, University of Texas at El Paso, El Paso, TX, USA
Chin-Teng Lin, Department of Electrical Engineering, National Chiao Tung University, Hsinchu, Taiwan
Jie Lu, Faculty of Engineering and Information Technology, University of Technology Sydney, Sydney, NSW, Australia
Patricia Melin, Graduate Program of Computer Science, Tijuana Institute of Technology, Tijuana, Mexico
Nadia Nedjah, Department of Electronics Engineering, University of Rio de Janeiro, Rio de Janeiro, Brazil
Ngoc Thanh Nguyen, Faculty of Computer Science and Management, Wrocław University of Technology, Wrocław, Poland
Jun Wang, Department of Mechanical and Automation Engineering, The Chinese University of Hong Kong, Shatin, Hong Kong
The series “Advances in Intelligent Systems and Computing” contains publications on theory, applications, and design methods of Intelligent Systems and Intelligent Computing. Virtually all disciplines such as engineering, natural sciences, computer and information science, ICT, economics, business, e-commerce, environment, healthcare, life science are covered. The list of topics spans all the areas of modern intelligent systems and computing such as: computational intelligence, soft computing including neural networks, fuzzy systems, evolutionary computing and the fusion of these paradigms, social intelligence, ambient intelligence, computational neuroscience, artificial life, virtual worlds and society, cognitive science and systems, Perception and Vision, DNA and immune based systems, self-organizing and adaptive systems, e-Learning and teaching, human-centered and human-centric computing, recommender systems, intelligent control, robotics and mechatronics including human-machine teaming, knowledge-based paradigms, learning paradigms, machine ethics, intelligent data analysis, knowledge management, intelligent agents, intelligent decision making and support, intelligent network security, trust management, interactive entertainment, Web intelligence and multimedia. The publications within “Advances in Intelligent Systems and Computing” are primarily proceedings of important conferences, symposia and congresses. They cover significant recent developments in the field, both of a foundational and applicable character. An important characteristic feature of the series is the short publication time and world-wide distribution. This permits a rapid and broad dissemination of research results. ** Indexing: The books of this series are submitted to ISI Proceedings, EI-Compendex, DBLP, SCOPUS, Google Scholar and Springerlink **
More information about this series at http://www.springer.com/series/11156
Paulo Novais · Gianni Vercelli · Josep L. Larriba-Pey · Francisco Herrera · Pablo Chamoso

Editors
Ambient Intelligence – Software and Applications 11th International Symposium on Ambient Intelligence
Editors
Paulo Novais, Departamento de Informática, University of Minho, ALGORITMI Center, Braga, Portugal
Gianni Vercelli, DIBRIS, University of Genoa, Genoa, Italy
Josep L. Larriba-Pey, Data Management Group, Technical University of Catalonia, Barcelona, Spain
Francisco Herrera, Department of Computer Science and Artificial Intelligence, ETS de Ingenierias Informática y de Telecomunicación, University of Granada, Granada, Spain
Pablo Chamoso, BISITE Research Group, University of Salamanca, Salamanca, Spain
ISSN 2194-5357 ISSN 2194-5365 (electronic) Advances in Intelligent Systems and Computing ISBN 978-3-030-58355-2 ISBN 978-3-030-58356-9 (eBook) https://doi.org/10.1007/978-3-030-58356-9 © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Springer imprint is published by the registered company Springer Nature Switzerland AG The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Preface
Artificial intelligence (AI) is one of the main trends attracting the interest of researchers, both in computer science and in many other areas. Ambient intelligence (AmI) is directly related to AI, to the extent that several authors describe it as "AI in the environment." This reflects a clear goal: AmI aims to apply technology and AI for the benefit of humans. More specifically, AmI studies the user's context and leverages the knowledge gained to provide intelligent solutions. In recent years, with the emergence of the Internet of Things (IoT) paradigm, which allows detailed measurement of relevant environmental information, both the number of studies and the number of applications related to AmI have increased considerably, making AmI one of the most important parts of computer science. AmI is applicable in trending areas such as smart cities, transportation, smart homes, ambient care and safety, and intelligent workplaces. ISAmI is the International Symposium on Ambient Intelligence, which aims to bring together researchers from the various disciplines that constitute the scientific field of AmI to present and discuss the latest results, new ideas, projects, and lessons learned. Brand-new ideas are greatly appreciated, as are relevant revisions and updates of previously presented work, project summaries, and PhD theses. This year's technical program presents both high quality and diversity, with contributions in well-established and evolving areas of research. Specifically, 48 papers were submitted by authors from over 10 different countries (Greece, Italy, Japan, the UK, Portugal, Spain, and Turkey, among others), representing a truly "wide area network" of research activity. The ISAmI technical program has selected 22 papers and, as in past editions, there will be special issues in JCR-ranked journals such as Information Fusion, Neurocomputing, Electronics, IEEE Open Journal of the Communications Society, and Smart Cities.
Moreover, the ISAmI'20 workshops have been a very useful tool for complementing the regular program with new or emerging topics of particular interest to the participating community.
This symposium is organized by the Universidade do Minho, Universitat Politècnica de Catalunya, University of Granada, Università di Genova, and University of Salamanca. The present edition was held in L'Aquila, Italy, from October 7 to 9, 2020.

Paulo Novais
Gianni Vercelli
Josep L. Larriba-Pey
Francisco Herrera
Pablo Chamoso
Organization of ISAmI 2020
http://www.isami-conference.net/
General Chair
Paulo Novais, Universidade do Minho, Portugal
Organizing Committee
Josep L. Larriba-Pey, Technical University of Catalunya, Spain
Francisco Herrera, University of Granada, Spain
Pablo Chamoso, University of Salamanca, Spain
Local Organizing Committee
Gianni Vercelli, Università di Genova, Italy
Workshop Organizing Committee
Joan Guisado, Universitat Politècnica de Catalunya, Spain
Alfonso González, University of Salamanca, Spain
Arnau Prat, Sparsity Technologies, Spain
Program Committee
Ana Almeida, ISEP-IPP, Portugal
Ana Alves, Centre for Informatics and Systems, University of Coimbra, Portugal
Ricardo Anacleto, ISEP, Portugal
Cesar Analide, University of Minho, Portugal
Cecilio Angulo, Universitat Politècnica de Catalunya, Spain
Lars Braubach, University of Hamburg, Germany
María-Pilar Cáceres-Reche, Department of Didactic & School Organization, Faculty of Sciences of Education, Spain
Valérie Camps, University of Toulouse - IRIT, France
Javier Carbo, University Carlos III of Madrid, Spain
Gonçalo Cardeal, Universidade de Lisboa, Portugal
Davide Carneiro, Polytechnic Institute of Porto, Portugal
Joao Carneiro, ISEP/GECAD, Portugal
Fabio Cassano, Università degli Studi di Bari Aldo Moro, Italy
Jose Carlos Castillo Montoya, Universidad Carlos III de Madrid, Spain
Alvaro Castro-Gonzalez, Universidad Carlos III de Madrid, Spain
João P. S. Catalão, University of Porto, Portugal
Silvio Cesar Cazella, UFCSPA, Brazil
Pablo Chamoso, University of Salamanca, Spain
Stefano Chessa, Department of Computer Science, University of Pisa, Italy
Stéphanie Combettes, IRIT, University of Toulouse, France
Luís Conceição, GECAD, Research Group on Intelligent Engineering and Computing for Advanced Innovation and Development, Portugal
Phan Cong-Vinh, Nguyen Tat Thanh University, Vietnam
Ricardo Costa, ESTG.IPP, Portugal
Rémy Courdier, LIM, Université de la Réunion, Réunion
Fernando De La Prieta, University of Salamanca, Spain
Patricio Domingues, ESTG, Leiria, Portugal
John Dowell, University College London, UK
Dalila Duraes, Department of Artificial Intelligence, Technical University of Madrid, Madrid, Spain
Luiz Faria, Knowledge Engineering and Decision Support Research, GECAD, Institute of Engineering, Polytechnic of Porto, Porto, Portugal
Florentino Fdez-Riverola, University of Vigo, Spain
Marta Fernandes, GECAD, Research Group on Intelligent Engineering and Computing for Advanced Innovation and Development, Polytechnic of Porto, Portugal
Bruno Fernandes, University of Minho, Portugal
Antonio Fernández-Caballero, Universidad de Castilla-La Mancha, Spain
João Ferreira, ISCTE, Portugal
Lino Figueiredo, ISEP, Portugal
Adina Magda Florea, University POLITEHNICA of Bucharest, AI-MAS Laboratory, Romania
Daniela Fogli, Università di Brescia, Italy
Celestino Goncalves, Instituto Politecnico da Guarda, Portugal
Organization of ISAmI 2020
Sérgio Gonçalves, University of Minho, Portugal
Alfonso González Briones, BISITE Research Group, Spain
David Griol, Universidad Carlos III de Madrid, Spain
Junzhong Gu, East China Normal University, China
Esteban Guerrero, Umeå University, Sweden
Hans W. Guesgen, Massey University, New Zealand
Guillermo Hernández, University of Salamanca, Spain
Javier Jaen, Universitat Politècnica de València, Spain
Jean-Paul Jamont, LCIS, Université de Grenoble, France
Vicente Julian, Universitat Politècnica de València, Spain
Jason Jung, Chung-Ang University, Korea
Leszek Kaliciak, AmbieSense, Norway
Anastasios Karakostas, Aristotle University of Thessaloniki, Greece
Alexander Kocian, University of Pisa, Italy
Igor Kotenko, St. Petersburg Institute for Informatics and Automation of the Russian Academy of Sciences (SPIIRAS), Russia
Joyca Lacroix, Philips Research, Netherlands
Guillaume Lopez, Aoyama Gakuin University, College of Science and Technology, Japan
José Machado, University of Minho, Portugal
João Paulo Magalhaes, ESTGF, Porto Polytechnic Institute, Portugal
Rafael Martinez Tomas, Universidad Nacional de Educación a Distancia, Spain
Constantino Martins, Knowledge Engineering and Decision Support Research (GECAD), Institute of Engineering, Polytechnic of Porto, Porto, Portugal
Rene Meier, Lucerne University of Applied Sciences and Arts, Switzerland
Antonio Meireles, ISEP, Portugal
Jose M. Molina, Universidad Carlos III de Madrid, Spain
José Pascual Molina Massó, Universidad de Castilla-La Mancha, Spain
Tatsuo Nakajima, Waseda University, Japan
Elena Navarro, University of Castilla-La Mancha, Spain
Jose Neves, University of Minho, Portugal
Paulo Novais, University of Minho, Portugal
Andrei Olaru, University POLITEHNICA of Bucharest, Romania
Miguel Oliver, Universidad de Castilla-La Mancha, Spain
Jaderick Pabico, University of the Philippines Los Banos, Philippines
Juan José Pantrigo Fernández, Universidad Rey Juan Carlos, Spain
Juan Pavón, Universidad Complutense de Madrid, Spain
Hugo Peixoto, University of Minho, Portugal
Ruben Pereira, ISCTE, Portugal
Antonio Pereira, Escola Superior de Tecnologia e Gestão do IPLeiria, Portugal
António Pinto, ESTG, P.Porto, Portugal
Tiago Pinto, Instituto Superior de Engenharia do Porto, Portugal
Isabel Praça, GECAD/ISEP, Portugal
Javier Prieto, University of Salamanca, Spain
Joao Ramos, University of Minho, Portugal
Carlos Ramos, Instituto Superior de Engenharia do Porto, Portugal
Alberto Rivas, BISITE Research Group, University of Salamanca, Spain
Sara Rodríguez, University of Salamanca, Spain
Teresa Romão, Faculdade de Ciências e Tecnologia/Universidade NOVA de Lisboa (FCT/UNL), Portugal
Albert Ali Salah, Bogazici University, Turkey
Altino Sampaio, Instituto Politécnico do Porto, Escola Superior de Tecnologia e Gestão de Felgueiras, Portugal
Manuel Filipe Santos, University of Minho, Portugal
Enzo Pasquale Scilingo, University of Pisa, Italy
Fernando Silva, Department of Informatics Engineering, School of Technology and Management, Polytechnic Institute of Leiria, Portugal
Fábio Silva, University of Minho, Portugal
S. Shyam Sundar, Penn State University & Sungkyunkwan University, USA/Korea
Radu-Daniel Vatavu, University Stefan cel Mare of Suceava, Romania
Lawrence Wai-Choong Wong, National University of Singapore, Singapore
Ansar-Ul-Haque Yasar, Universiteit Hasselt, IMOB, Belgium
Workshop Program Committee
Arnau Prat (Chair), Sparsity Technologies, Spain
Joan Guisado (Chair), Universitat Politècnica de Catalunya, Spain
Alfonso González (Chair), University of Salamanca, Spain
Josep Lluis Larriba, Universitat Politècnica de Catalunya, Spain
Juan M. Corchado, University of Salamanca, Spain
Pablo Chamoso, University of Salamanca, Spain
Javier Prieto, University of Salamanca, Spain
Fernando De La Prieta, University of Salamanca, Spain
Yves Perreal, Thales, France
Esther Bravo, S2R, EC, Brussels
Achim von der Embse, HaCon, Germany
Antonio Soares, Fertagus, Portugal
Hans van Lint, TUDelft, the Netherlands
Viktoriya Degeler, University of Groningen, the Netherlands
Marco Ferreira, Thales, Portugal
Martí Jofre, Pildo Labs, Spain
Carles Labraña, AMTU, Spain
Alex Deloukas, AMETRO, Greece
Ismini Stroumpou, AETHON, Greece
Contents

Main Track

eHealth4MS: Problem Detection from Wearable Activity Trackers to Support the Care of Multiple Sclerosis . . . 3
Thanos G. Stavropoulos, Georgios Meditskos, Sotirios Papagiannopoulos, and Ioannis Kompatsiaris

Society of “Citizen Science through Dancing” . . . 13
Risa Kimura, Keren Jiang, Di Zhang, and Tatsuo Nakajima

The ACTIVAGE Marketplace: Hybrid Logic- and Text-Based Discovery of Active and Healthy Ageing IoT Applications . . . 24
Thanos G. Stavropoulos, Dimitris Strantsalis, Spiros Nikolopoulos, and Ioannis Kompatsiaris

Explainable Intelligent Environments . . . 34
Davide Carneiro, Fábio Silva, Miguel Guimarães, Daniel Sousa, and Paulo Novais

Overcoming Challenges in Healthcare Interoperability Regulatory Compliance . . . 44
António Castanheira, Hugo Peixoto, and José Machado

Tools for Immersive Music in Binaural Format . . . 54
Andrea De Sotgiu, Mauro Coccoli, and Gianni Vercelli

A Computer Vision-Based System for a Tangram Game in a Social Robot . . . 61
Carla Menendez, Sara Marques-Villarroya, Jose C. Castillo, Juan Jose Gamboa-Montero, and Miguel A. Salichs

FullExpression Using Transfer Learning in the Classification of Human Emotions . . . 72
Ricardo Rocha and Isabel Praça

Deployment of an IoT Platform for Activity Recognition at the UAL’s Smart Home . . . 82
M. Lupión, J. L. Redondo, J. F. Sanjuan, and P. M. Ortigosa

Algorithms for Context-Awareness Route Generation . . . 93
Ricardo Pinto, Luís Conceição, and Goreti Marreiros

Detection Violent Behaviors: A Survey . . . 106
Dalila Durães, Francisco S. Marcondes, Filipe Gonçalves, Joaquim Fonseca, José Machado, and Paulo Novais

System for Recommending Financial Products Adapted to the User’s Profile . . . 117
M. Unzueta, A. Bartolomé, G. Hernández, J. Parra, and P. Chamoso

A COTS (UHF) RFID Floor for Device-Free Ambient Assisted Living Monitoring . . . 127
Ronnie Smith, Yuan Ding, George Goussetis, and Mauro Dragone

Using Jason Framework to Develop a Multi-agent System to Manage Users and Spaces in an Adaptive Environment System . . . 137
Pedro Filipe Oliveira, Paulo Novais, and Paulo Matos

Towards the Development of IoT Protocols . . . 146
Gonçalo Salazar, Lino Figueiredo, and Nuno Ferreira

Livestock Welfare by Means of an Edge Computing and IoT Platform . . . 156
Mehmet Öztürk, Ricardo S. Alonso, Óscar García, Inés Sittón-Candanedo, and Javier Prieto

Sleep Performance and Physical Activity Estimation from Multisensor Time Series Sleep Environment Data . . . 166
Celestino Gonçalves, Diogo Rebelo, Fábio Silva, and Cesar Analide

Face Detection and Recognition, Face Emotion Recognition Through NVIDIA Jetson Nano . . . 177
Vishwani Sati, Sergio Márquez Sánchez, Niloufar Shoeibi, Ashish Arora, and Juan M. Corchado

Video Analysis System Using Deep Learning Algorithms . . . 186
Guillermo Hernández, Sara Rodríguez, Angélica González, Juan Manuel Corchado, and Javier Prieto

Workshop on New Applications for Public Transport (NAPT)

Towards Learning Travelers’ Preferences in a Context-Aware Fashion . . . 203
A. Javadian Sabet, M. Rossi, F. A. Schreiber, and L. Tanca

Reputation Algorithm for Users and Activities in a Public Transport Oriented Application . . . 213
D. García-Retuerta, A. Rivas, Joan Guisado-Gámez, E. Antoniou, and P. Chamoso

Extraction of Travellers’ Preferences Using Their Tweets . . . 224
Juan J. Cea-Morán, Alfonso González-Briones, Fernando De La Prieta, Arnau Prat-Pérez, and Javier Prieto

Doctoral Consortium (DC)

Adaptivity as a Service (AaaS): Personalised Assistive Robotics for Ambient Assisted Living . . . 239
Ronnie Smith

Time in Multi-agent Systems . . . 243
Niklas Fiekas

Public Tendering Processes Based on Blockchain Technologies . . . 247
Yeray Mezquita

Low-Power Distributed AI and IoT for Measuring Lamb’s Milk Ingestion and Predicting Meat Yield and Malnutrition Diseases . . . 251
Ricardo S. Alonso

Clifford Algebras: A Proposal Towards Improved Image Recognition in Machine Learning . . . 258
David García-Retuerta

New Approach to Recommend Banking Products Through a Hybrid Recommender System . . . 262
Elena Hernández Nieves

An IoT-Based ROUV for Environmental Monitoring . . . 267
Marta Plaza-Hernández

Deep Symbolic Learning and Semantics for an Explainable and Ethical Artificial Intelligence . . . 272
Ricardo S. Alonso

Development of a Multiagent Simulator to Genetic Regulatory Networks . . . 279
Nilzair Barreto Agostinho, Adriano Velasque Wherhli, and Diana Francisca Adamatti

Manage Comfort Preferences Conflicts Using a Multi-agent System in an Adaptive Environment System . . . 284
Pedro Filipe Oliveira, Paulo Novais, and Paulo Matos

AI-Based Proposal for Epileptic Seizure Prediction in Real-Time . . . 289
David García-Retuerta

Digital Twin Framework for Energy Efficient Greenhouse Industry 4.0 . . . 293
Daniel Anthony Howard, Zheng Ma, and Bo Nørregaard Jørgensen

“Cooperative Deeptech Platform” for Innovation-Hub Members of DISRUPTIVE . . . 298
Niloufar Shoeibi

Engineering Multiagent Organizations Through Accountability . . . 305
Stefano Tedeschi

Circadian Rhythm and Pain: Mathematical Model Based on Multiagent Simulation . . . 309
Angélica Theis dos Santos, Catia Maria dos Santos Machado, and Diana Francisca Adamatti

Author Index . . . 313
Main Track
eHealth4MS: Problem Detection from Wearable Activity Trackers to Support the Care of Multiple Sclerosis

Thanos G. Stavropoulos1(B), Georgios Meditskos1, Sotirios Papagiannopoulos2, and Ioannis Kompatsiaris1

1 Information Technologies Institute, Centre for Research and Technology Hellas, 6th Km Charilaou Thermi, 57001 Thessaloniki, Greece
{athstavrikom,gmeditsk,ikom}@iti.gr
2 Department of Neurology III, Medical School, Aristotle University of Thessaloniki, 54124 Thessaloniki, Greece
[email protected]
Abstract. This paper presents eHealth4MS, an assistive technology system based on wearable trackers to support the care of Multiple Sclerosis (MS). Initially, the system integrates a tracker and a smartphone to collect and uniformly store movement, sleep and heart rate (HR) data in an ontology-based knowledge base. Then, ontology patterns are used to provide an initial approach to detecting problems and symptoms of interest, such as lack of movement, stress or pain, insomnia, excessive sleep, lack of sleep and restlessness. Finally, the system visualizes data trends and detected problems in dashboards and apps. This will allow patients to self-manage and clinicians to drive effective and timely interventions, and to monitor progress in future trials that evaluate the system's accuracy and effectiveness.

Keywords: Multiple Sclerosis · eHealth · IoT · Wearables · AAL · Ontologies
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2021. P. Novais et al. (Eds.): ISAmI 2020, AISC 1239, pp. 3–12, 2021. https://doi.org/10.1007/978-3-030-58356-9_1

1 Introduction

Multiple Sclerosis (MS) is an autoimmune disease that affects the brain and the spinal cord (the central nervous system), with serious implications for family, professional and social life. MS disrupts regular communication between the brain and the rest of the body, which may lead to vision impairment, muscle weakness, motor coordination and balance issues, cognitive deficit and depression. It may appear at any age, although it most often does between 15 and 45 years of age. Of the two million people suffering worldwide, 65–70% are women [1]. The course of the disease follows a relapsing-remitting cycle, but 60–70% of cases develop a steady progression of symptoms with or without remission periods, known as secondary-progressive MS. Regular monitoring, effective support and personalized guidance are vital to the best possible outcomes [2]. Holistic and tailored solutions help patients preserve independence, effective communication, social roles and productivity to the greatest extent
possible. The specialist doctor, who is in charge of designing the treatment plan, also plays a central role in the therapeutic team. However, such constant monitoring is often financially or practically impossible, and it frequently relies on unreliable, subjective observation. New assisted living technologies combined with intelligent applications may provide a reliable, remote monitoring solution, supporting patients, caregivers and doctors alike, ensuring personalized and constant care in order to hinder the disease's progression. Towards this, the young age of MS patients has been shown to help in the acceptance of technological aids [3]. However, holistic systems that can monitor multiple life aspects to aid in neurodegenerative diseases, such as MS, have yet to emerge. On the contrary, existing Ambient Assisted Living (AAL) systems can be exploited to support MS care in other aspects, such as communication means [3], clinical information exchange between doctors [4] and even diagnosis via videoconferencing [5]. Technology has also been used to record rehabilitation exercises at home, with limitations in the way, place and time where monitoring takes place [6]. Beyond MS, many eHealth monitoring systems focus on a single aspect, such as sleep [7]. Even systems that do combine information lack the smart, personalized inference mechanisms to assess the status of a patient with MS [8]. The vision of a holistic monitoring eHealth platform for MS is introduced by the EFPIA alliance1 of the European Commission and the pharmaceutical industry, including Novartis, Johnson & Johnson, Bayer, Roche and Sanofi. RADAR-BASE2 is a platform developed by EFPIA and academia, where such a monitoring platform is being developed with applications in multiple diseases, including MS [9].
This paper presents eHealth4MS, an intelligent system that collects, interlinks and interprets data from wearable devices in order to support remote and reliable monitoring of people with MS at home, leading to more effective, efficient and accessible treatment and self-management, and increasing the patients' Quality of Life (QoL). The system is based on Service-Oriented Architecture (SOA) principles to integrate heterogeneous Internet of Things (IoT) data sources and obtain QoL information, such as step count, calories burnt, distance walked, heart rate and sleep, through wearable smartwatches and smartphones. The data is processed and homogeneously stored in an ontology, and then interpreted in order to extract clinically relevant information related to the disease, such as physical activity and exercise, stress, and sleep quality patterns and problems, in relation to the patient profile. The outcomes are presented to patients for self-management, and to doctors for effective decision-making, through adaptive user interfaces on web and mobile applications. In detail, this paper aims to address the following research questions:

• How can IoT/QoL data from heterogeneous IoT platforms, such as wearable watches and smartphones, be retrieved in a modular manner?
• How can IoT/QoL data be represented and stored uniformly, then analyzed and interpreted in order to produce clinically relevant and valuable information for the care of MS?

1 European Federation of Pharmaceutical Industries and Associations: https://www.efpia.eu.
2 The RADAR-BASE Platform: https://radar-base.org/.
• How can data be visualized for patients and clinicians to facilitate the care of MS?

In response, the paper provides an initial prototype approach towards the following goals:

• To develop a heterogeneous data collection platform that integrates wearable watches and smartphones in a modular manner.
• To semantically describe and uniformly store heterogeneous IoT/QoL data from wearable watches and smartphones.
• To provide an initial approach to process, interpret and extract clinically relevant problems and symptoms from physical activity, sleep and HR data.
• To visualize data in end-user dashboards and apps for patients and clinicians.

To do so, the system builds upon previous integrated systems applied to assisted living with dementia [10]. While those systems were far more complex, the system proposed in this paper relies on simpler, more effective, but also modern, comfortable and acceptable sensors for daily life, as well as analysis and interpretation tailored for MS. Meanwhile, building upon previous experience in clinical trials with technology [11], the clinical evaluation of eHealth4MS is planned to include around thirty patients and control groups, testing the system for several months in a future study.
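The paper does not spell out concrete detection rules at this point; as a rough illustration of the kind of problem and symptom extraction the third goal describes, the following sketch flags lack of movement and sleep problems from a day's tracker summary. The thresholds and field names are illustrative assumptions, not values taken from the eHealth4MS system.

```python
# Hedged sketch: detect daily problems (lack of movement, lack of sleep,
# excessive sleep) from tracker summaries. All thresholds below are
# hypothetical placeholders, not clinical values from the paper.

STEP_MIN = 2000          # below this daily step count: "lack of movement"
SLEEP_MIN_H = 5.0        # below this many hours asleep: "lack of sleep"
SLEEP_MAX_H = 10.0       # above this many hours asleep: "excessive sleep"

def detect_problems(daily_summary):
    """daily_summary: dict with 'steps' (int) and 'sleep_hours' (float)."""
    problems = []
    if daily_summary["steps"] < STEP_MIN:
        problems.append("lack of movement")
    if daily_summary["sleep_hours"] < SLEEP_MIN_H:
        problems.append("lack of sleep")
    elif daily_summary["sleep_hours"] > SLEEP_MAX_H:
        problems.append("excessive sleep")
    return problems

print(detect_problems({"steps": 1500, "sleep_hours": 4.2}))
# ['lack of movement', 'lack of sleep']
```

In a real deployment such rules would be personalized to the patient profile (as the paper proposes) rather than fixed constants.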
2 Related Work

Technologies such as IoT promise to deliver eHealth solutions even for chronic diseases where pharmaceutical treatment is lacking, such as dementia, or where a more holistic lifestyle change is needed, such as cardiovascular disease. However, eHealth solutions to date are characterized by their focus on a single aspect of life each, such as physical activity or exercise, sleep quality, serious games, or alerts [7], while also lacking intelligent analysis and interpretation tailored to each field of disease [8]. Combined with market segmentation, their uptake is, as a result, limited. Especially for MS, eHealth applications are limited to message exchange and communication between doctors and patients [3, 4], telemedicine via videoconferencing [5], and rehabilitation exercise support in a specific place and time [6]. At the same time, the road for further developments in eHealth for MS is being paved. The young age of those suffering from MS has already proved to favor acceptance of technology as an aid on their behalf [3]. Meanwhile, the pharmaceutical industry, especially in alliance with the EU in the EFPIA initiative, co-develops research ideas for eHealth systems providing so-called "digital biomarkers" for monitoring, so as to more efficiently and objectively assess a participant's status in a clinical trial and, therefore, aid in drug development. RADAR-CNS3 is such a platform for collecting heterogeneous data, applied in the scenario of MS, but it has yet to leverage the potential for data analysis, interpretation and personalization [9].

3 The RADAR-CNS Project: https://www.radar-cns.org/.

Apart from IoT and digital biomarker data collection in eHealth, data models and intelligent analysis are progressing as well. Ontologies, such as OWL 2 [12], have
T. G. Stavropoulos et al.
attracted growing interest as a means for modelling and reasoning over contextual information, and human activities in particular. Activity recognition is often augmented with rules for representing richer relationships not supported by the standard ontology semantics, e.g. temporal reasoning and structured, composite activities [13]. The present paper integrates and extends data collection, semantic web modelling and analysis, and tailored user interfaces to address the needs of MS. The system collects data for multiple life aspects, such as physical activity and exercise, quality of sleep and stress, through comfortable wearable smartwatches and smartphones. Consequently, it represents them homogeneously, in novel models, and interprets them to extract medically relevant information. After knowledge is stored, following the latest advances in security and privacy in the cloud, it can be visualized through various web and mobile apps. Low-cost technology sets up the platform for higher chances of future uptake.
3 Integration and Data Collection
As in previous works [10], modular architectures based on SOA can integrate multiple kinds of wearable smartwatches, wristbands and any type of smart home sensor with extensibility. The same architecture is followed in this paper, adapted, however, to mobile and more modern wearable devices, as shown in Fig. 1. Most devices available in the market provide their data through their own well-defined Cloud APIs or Software Development Kits (SDKs). Therefore, two modules, so-called “Adapters”, are used in the eHealth4MS platform for the two devices to be integrated: 1) the smartwatch, a Fitbit Charge 3, and 2) a standard Android smartphone (Xiaomi Redmi 6A/7A). The Fitbit Adapter downloads Fitbit Charge 3 data from the Fitbit Cloud through secure user authorization (the OAuth protocol) and the Smartphone Adapter retrieves smartphone usage and sensor data through the open Android SDK.
Fig. 1. eHealth4MS system architecture
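To make the Adapter idea concrete, the sketch below (not the project's actual code) builds an authorized daily time-series request of the kind the Fitbit Adapter would issue against the Fitbit Web API. The endpoint path follows the publicly documented Fitbit time-series route, but the function name and token value are placeholders, and real OAuth token acquisition is omitted.

```python
# Illustrative request builder for a Fitbit Web API daily time-series call.
# Token handling is simplified: a real Adapter would obtain the access token
# through the OAuth authorization flow mentioned in the text.

def build_fitbit_request(resource: str, date: str, access_token: str):
    """Return (url, headers) for one day's worth of a Fitbit resource,
    e.g. resource="activities/steps" and date="2020-06-01"."""
    base = "https://api.fitbit.com/1/user/-"
    url = f"{base}/{resource}/date/{date}/1d.json"
    headers = {"Authorization": f"Bearer {access_token}"}
    return url, headers

url, headers = build_fitbit_request("activities/steps", "2020-06-01", "TOKEN")
```

The Smartphone Adapter would follow the same pattern against its own data source, so the Data Integration module only ever sees one uniform observation format.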
eHealth4MS: Problem Detection from Wearable Activity Trackers
Both data streams end up on the eHealth4MS Cloud online platform and, through it, reach the Data Integration module, which seamlessly stores them in the semantic knowledge base (a Cloud-based RDF triple store) described in the next section. To ensure SOA extensibility and modularity, the Adapters and the central Data Integration module are connected through REST APIs. Therefore, based on the principles of the Open API Initiative (OAI), new devices and data sources can be supported in the future by implementing new Adapters, i.e. new web services, using universal standards. Platform data then becomes available for further analysis and interpretation, as well as for display in user applications, through semantic search and consumption protocols (SPARQL endpoints).
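As a rough illustration of the Data Integration step, the following sketch turns an Adapter observation into a SPARQL INSERT DATA update using SOSA terms. The `ex:` namespace, the identifiers and the function itself are our own assumptions for the example, not the platform's actual interface; only the `sosa:` vocabulary comes from the standard.

```python
# Sketch: serialize one observation as a SPARQL Update string for the
# Cloud-based RDF triple store. Namespaces other than SOSA are invented.

SOSA = "http://www.w3.org/ns/sosa/"
XSD_DT = "http://www.w3.org/2001/XMLSchema#dateTime"

def observation_to_sparql(obs_id, sensor, prop, value, timestamp):
    return (
        f"PREFIX sosa: <{SOSA}>\n"
        "PREFIX ex: <http://example.org/ehealth4ms/>\n"
        "INSERT DATA {\n"
        f"  ex:{obs_id} a sosa:Observation ;\n"
        f"    sosa:madeBySensor ex:{sensor} ;\n"
        f"    sosa:observedProperty ex:{prop} ;\n"
        f"    sosa:hasSimpleResult {value} ;\n"
        f"    sosa:resultTime \"{timestamp}\"^^<{XSD_DT}> .\n"
        "}"
    )

update = observation_to_sparql("obs1", "fitbitCharge3", "StepCount",
                               5423, "2020-06-01T12:00:00Z")
```

A real deployment would POST such updates to the triple store's SPARQL endpoint; the point here is only the shape of the mapping from adapter data to SOSA observations.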
4 Knowledge Representation and Analysis
A significant challenge in remote monitoring solutions is the ability to identify and recognize the context signifying the presence of complex activities and situations, in order to support intelligent behaviour interpretation. An important factor to take into consideration is that contextual information is typically collected by multiple sensors and complementary modalities. The goal is to recognize the behaviour of the person with MS and discern traits that have been identified by the clinicians as relevant for diagnostic, status assessment, enablement and safety purposes, achieving medical ambient intelligence and situational awareness. For example, a long-duration movement outdoors may indicate a walking activity, and disrupted sleeping patterns, such as regularly waking up throughout the night and short sleep durations, may be evidence of sleep disorders and insomnia. Behaviour interpretation and situational awareness require the aggregation of collected information and its infusion with clinical knowledge and user preferences (profiles, history, etc.). To this end, eHealth4MS aggregates individual pieces of information provided by monitoring and clinical experts and then meaningfully fuses them in order to derive high-level interpretations of the person's behaviour and achieve situational awareness. Two constituents are considered for supporting the underlying fusion tasks, namely representation and reasoning support services. Representation provides the vocabulary and infrastructure for capturing and storing information relevant to monitoring, environment, clinical and profile knowledge. The reasoning services support the integrated interpretation of the person's behaviour and the recognition of clinically relevant activities and problems.
4.1 Representation A common prerequisite in context-aware, sensor-driven systems, such as eHealth4MS, is the ability to share and process information coming from heterogeneous devices and services. This translates into a twofold requirement. First, there is a need for commonly agreed vocabularies of consensual and precisely defined terms for the description of data in an unambiguous manner. Second, there is a need for mechanisms to integrate, correlate and semantically interpret these data. To achieve this, there is a need to model context at different levels of granularity and abstraction, and support the derivation of higher-level
interpretations. Context refers to any information that can be used to characterize the situation of a person or a computing entity; for example, the location of a person and the room temperature are aspects of context. The formalization of the ontological vocabulary follows the OWL 2 language and extends the Semantic Sensor Network (SSN) ontology for capturing measurements, following the horizontal and vertical modularization architecture of the standard, through a lightweight but self-contained core ontology called SOSA (Sensor, Observation, Sample, and Actuator) [14]. The current version of the domain ontology supports:
• Atomic activities and measurements detected by the monitoring infrastructure (e.g. steps and sleeping activities) and complex activities inferred through context interpretation (e.g. having a meal, walking, sleeping).
• Problems and situations of significance that the monitored people and the clinicians need to be informed about (e.g. sleep problems, correlations between heart rate and location).
4.2 Reasoning and Interpretation
The interpretation framework collectively analyses the aggregated observations and derives a higher-level understanding of behaviour, in terms of the activities and situations the person engages in, and identifies clinically defined functional problems. To this end, two components have been developed: the Online Event Detection (OED) component and the Semantic Interpretation (SI) component. The developed components support interpretation tasks at different levels of granularity. OED serves for understanding context in real time. SI, on the other hand, addresses situations that require encapsulating pieces of information of higher abstraction. More specifically, OED focuses primarily on the real-time detection of situations of interest and their aggregation with clinical and profile information, in order to trigger respective feedback and alerts.
Such events include elementary states and activities using sensor measurements (e.g. high physical activity for more than 2 min). In turn, the primary focus of SI is on the recognition of: i) complex situations and correlations (e.g. movement at night after the person has gone to sleep), ii) functional problems as defined by clinicians (e.g. interrupted sleep). SI espouses a hybrid approach that combines ontology- and rule-based reasoning. OWL 2 is used to model the domain concepts (activities, situations, problems, etc.); SPARQL rules are used to enhance typical ontology-based reasoning with complex activity and problem detection, temporal reasoning and incremental knowledge updates. The collected observations and measurements from the sensor network are sent to the OED module for real-time fusion with profile knowledge. OED uses the Drools CEP (Complex Event Processing) engine that queries the KB to retrieve profile information (e.g. behaviour patterns). The detected events are then sent to the alert and feedback services of the framework. Note that detected events by OED are also stored in the KB for further offline processing and fusion with other observations by SI. Figure 2 (left) presents the abstract representation and reasoning framework.
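The windowed detection that OED delegates to the Drools CEP engine can be sketched, in much simplified form, in plain Python: flag a sustained "high activity" event when step samples stay above a threshold for at least 2 minutes. The threshold and the assumed 10-second sampling period are invented for illustration, not clinical values.

```python
# Toy re-expression of CEP-style windowed event detection: find runs of
# high-activity samples lasting at least min_samples windows.

def detect_high_activity(samples, threshold=20, min_samples=12):
    """Return (start, end) index pairs of runs lasting >= min_samples.
    With one sample per 10 s, 12 samples correspond to the 2-minute rule."""
    events, run_start = [], None
    for i, s in enumerate(samples + [0]):  # sentinel closes a trailing run
        if s >= threshold and run_start is None:
            run_start = i
        elif s < threshold and run_start is not None:
            if i - run_start >= min_samples:
                events.append((run_start, i))
            run_start = None
    return events

# A 2.5-minute burst, a pause, then a short 50-second burst:
events = detect_high_activity([30] * 15 + [0] * 5 + [30] * 5)
```

In the actual system such detected events would be pushed to the alert services and also written back to the KB for offline fusion by SI, as described above.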
Fig. 2. Logical architecture of the representation and reasoning framework (left); Ontology pattern capturing the restlessness problem (right)
Figure 2 (right) depicts the ontology pattern we have defined for modelling restlessness problems. The pattern captures the number of sleep interruptions observed for a single day, which is further classified as a Restlessness problem by the reasoning and interpretation layer. In the same way, patterns detect problems of:
• “Stress or Pain” (high HR for long periods of time without movement),
• “Lack of Movement” (low steps in a day),
• “Lack of Exercise” (HR entirely out of cardio zone for a day),
• “Insomnia” (sleep latency at night),
• “Lack of sleep” (short total sleep in a day),
• “Too much sleep” (long total sleep in a day) and
• “Increased Napping” (too long or too many naps in a day).
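A minimal sketch of how such patterns could be evaluated over one day's aggregates is given below. The thresholds are illustrative assumptions, not the clinically defined values used in eHealth4MS, and the rule table stands in for what the system expresses as SPARQL rules over the ontology.

```python
# Toy rule table over a day's aggregated measurements; threshold values are
# invented for the example.

RULES = {
    "Lack of Movement":  lambda d: d["steps"] < 2000,
    "Lack of sleep":     lambda d: d["sleep_h"] < 5,
    "Too much sleep":    lambda d: d["sleep_h"] > 10,
    "Increased Napping": lambda d: d["naps"] > 3,
    "Restlessness":      lambda d: d["sleep_interruptions"] > 5,
}

def detect_problems(day):
    """Return the names of all patterns that fire for one day's data."""
    return [name for name, rule in RULES.items() if rule(day)]

day = {"steps": 1500, "sleep_h": 4.5, "naps": 1, "sleep_interruptions": 2}
problems = detect_problems(day)
```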
5 End-User Applications
Graphical User Interfaces (GUIs) and web applications in eHealth4MS are designed and developed to serve the users' needs (patients and caregivers, as well as their relatives and therapists), based on their requirements. These applications, in addition to communicating the results to healthcare professionals, carers and patients themselves, allow communication between stakeholders (e.g. by sending personal messages, questions, etc.). The dashboard visualization design takes into consideration user goals, behaviours, needs and profiles, as well as performance, acceptance, clinical and therapeutic value characteristics. Many design choices are based on previous works in other eHealth fields [10]. MS differs in the sense that it can cause both physical and mental problems and therefore requires: i) a holistic view of all aspects and ii) clear detection of problems. The implementation relies on open-source frameworks and libraries, mainly the well-established and modern Python Django web framework. Responsive design ensures adaptation to: i) mobile phones and tablets, mostly used by the patients; and ii) large computer screens, mostly used by the physicians. In addition to responsive design, user roles require adaptive views, with different permissions in the application. The patient view displays only abstract and positive observation features, while the
clinician view provides complete, detailed information to physicians. Python-Django's Model-View-Controller (MVC) approach appears effective in that it makes it easy to manage model data, controls and views. Communication with the analysis subsystems and the knowledge base operates on top of the semantic interfaces of data collection (SPARQL queries). Figure 3 presents a detailed view of the clinician dashboard, which visualizes trends of sensor information such as intraday, hourly, daily, monthly or yearly aggregates of steps, sleep totals with segmentation in sleep stages, and average heart rate. It also shows the analysis and interpretation outcomes in the form of detected problems, which allows clinicians to investigate occurrences and context, suggest interventions and view progress. In the future, the clinician view may allow them to modify the rules and patient profiles used to detect problems and symptoms, which are now modelled in the ontology.
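The aggregation behind such dashboard views can be sketched as follows; the field names and granularity are assumptions for the example, since the actual dashboard obtains its data through SPARQL queries rather than from Python tuples.

```python
# Sketch of dashboard-side aggregation: roll hourly step samples up into the
# daily totals that the charts plot.

from collections import defaultdict

def daily_totals(hourly):
    """hourly: iterable of (date, hour, steps) tuples -> {date: total_steps}."""
    totals = defaultdict(int)
    for date, _hour, steps in hourly:
        totals[date] += steps
    return dict(totals)

hourly = [("2020-06-01", 9, 500), ("2020-06-01", 10, 700), ("2020-06-02", 9, 300)]
```

The same pattern, with different keys, yields the monthly and yearly aggregates mentioned in the text.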
Fig. 3. The eHealth4MS dashboard showing daily steps, sleep duration per stage, HR and detected problems on a daily scale.
6 Conclusions and Future Work
This paper presents eHealth4MS, an assistive technology system based on wearable trackers to support the care of Multiple Sclerosis (MS). The system integrates a tracker and a smartphone to collect and uniformly store movement, sleep and heart rate (HR) data in a knowledge base. Then, problem detection techniques extract symptoms and behaviours clinically relevant to MS, such as lack of movement or exercise, stress or pain, insomnia, excessive sleep or lack of sleep, and restlessness. Finally, the system visualizes trends and problems, which may allow patients to self-manage and doctors to drive interventions, central to care, and to monitor progress more effectively. As future work, we foremost plan a clinical study to evaluate the system's usability and clinical value in a real-life environment for patients with MS. The technology package for each participant will include a wearable tracker and an Android smartphone, the eHealth4MS self-monitoring app and online access to the dashboard for doctors and caregivers. 45 participants will be recruited and randomly placed in three groups. The first group will be given the system and interventions from doctors, adjusted and supported by problem detection and progress monitoring from the system. The second group will act as controls, receiving interventions without the system's support, and the third group will neither be monitored nor receive interventions at all. Participants will sign a written consent form allowing the collection of personal information and monitoring data for the sole purpose of contributing to the research. To protect the participants' privacy, the information can be disclosed anonymously so as to contribute more broadly to future work, following guidelines from our previous studies [11]. After the period of six months, the evaluation will address both clinical and technological aspects of the system.
The clinical part will include a neuropsychological evaluation that will take place both at the beginning and after the end of the study, for each group. The tests will include those widely used for MS [15]. The usability of the applications will be assessed using the standard System Usability Scale (SUS) [16] and the User Experience Questionnaire (UEQ) [17], as well as open-ended questions for patients with MS, their relatives and the attending physicians, different for each group of interest. Likewise, the acceptance of the technology by patient users will be examined, taking into account earlier studies [18].
Acknowledgements. This research has been co-financed by the European Union and Greek national funds through the Operational Program Human Resources Growth, Education and Lifelong Learning, under the call for Support of Researchers with Emphasis on Young Researchers (Project: eHealth4MS).
References
1. Liguori, M., Marrosu, M.G., Pugliatti, M., Giuliani, F., De Robertis, F., Cocco, E., Zimatore, G.B., Livrea, P., Trojano, M.: Age at onset in multiple sclerosis. Neurol. Sci. 21, S825–S829 (2000)
2. Doshi, A., Chataway, J.: Multiple sclerosis, a treatable disease. Clin. Med. 17(6), 530 (2017). https://doi.org/10.7861/clinmedicine.17-6-530
3. Haase, R., Schultheiss, T., Kempcke, R., Thomas, K., Ziemssen, T.: Use and acceptance of electronic communication by patients with multiple sclerosis: a multicenter questionnaire study. J. Med. Internet Res. 14(5), e135 (2012)
4. Marrie, R., Leung, S., Tyry, T., Cutter, G.R., Fox, R., Salter, A.: Use of eHealth and mHealth technology by persons with multiple sclerosis. Multiple Scler. Relat. Disord. 27, 13–19 (2019)
5. Kane, R.L., Bever, C.T., Ehrmantraut, M., Forte, A., Culpepper, W.J., Wallin, M.T.: Teleneurology in patients with multiple sclerosis: EDSS ratings derived remotely and from hands-on examination. J. Telemed. Telecare 14(4), 190–194 (2008)
6. Huijgen, B.C.H., Vollenbroek-Hutten, M.M.R., Zampolini, M., Opisso, E., Bernabeu, M., van Nieuwenhoven, J., Ilsbroukx, S., Magni, R., Giacomozzi, C., Marcellari, V., Marchese, S.S., Hermens, H.J.: Feasibility of a home-based telerehabilitation system compared to usual care: arm/hand function in patients with stroke, traumatic brain injury and multiple sclerosis. J. Telemed. Telecare 14, 249–256 (2008). https://doi.org/10.1258/jtt.2008.080104
7. Chang, Y.-J., Chen, C.-H., Lin, L.-F., Han, R.-P., Huang, W.-T., Lee, G.-C.: Wireless sensor networks for vital signs monitoring: application in a nursing home. Int. J. Distrib. Sens. Netw. 8(11), 685107 (2012). https://doi.org/10.1155/2012/685107
8. Cislo, N., Arbaoui, S., Becis-Aubry, Y., Aubry, D., Parmentier, Y., Doré, P., Guettari, T., Ramdani, N.: A system for monitoring elderly and dependent people in nursing homes: the e-monitor’age concept. Stud. Inform. Univ. 11, 30–33 (2013)
9. Radaelli, M., Martinis, M., Locafaro, G., Temussi, S., Mulero, P., Magyari, M., Buron, M., Montalban, X., Soerensen, S., Leocani, L., Kieseier, B., Comi, G.: A new way to monitor multiple sclerosis. In: IMI 10th Anniversary Scientific Symposium (2018)
10.
Stavropoulos, T.G., Meditskos, G., Kompatsiaris, I.: DemaWare2: integrating sensors, multimedia and semantic analysis for the ambient care of dementia. Pervasive Mob. Comput. (2016). https://doi.org/10.1016/j.pmcj.2016.06.006
11. Lazarou, I., Karakostas, A., Stavropoulos, T.G., Tsompanidis, T., Meditskos, G., Kompatsiaris, I., Tsolaki, M.: A novel and intelligent home monitoring system for care support of elders with cognitive impairment. J. Alzheimer’s Dis. 54, 1561–1591 (2016). https://doi.org/10.3233/JAD-160348
12. Motik, B., Grau, B.C., Horrocks, I., Wu, Z., Fokoue, A., Lutz, C.: OWL 2 Web Ontology Language Profiles. W3C recommendation 27, 61 (2009)
13. Eiter, T., Ianni, G., Krennwallner, T., Polleres, A.: Rules and ontologies for the semantic web. In: Baroglio, C., Bonatti, P.A., Małuszyński, J., Marchiori, M., Polleres, A., Schaffert, S. (eds.) Reasoning Web. Springer, Heidelberg (2008). https://doi.org/10.1007/978-3-540-85658-0_1
14. Janowicz, K., Haller, A., Cox, S.J.D., Le Phuoc, D., Lefrançois, M.: SOSA: a lightweight ontology for sensors, observations, samples, and actuators. J. Web Semant. 56, 1–10 (2019)
15. Korakas, N., Tsolaki, M.: Cognitive impairment in multiple sclerosis. Cogn. Behav. Neurol. 29, 55–67 (2016). https://doi.org/10.1097/WNN.0000000000000097
16. Brooke, J.: SUS - a quick and dirty usability scale. Usabil. Eval. Ind. (1996)
17. Laugwitz, B., Held, T., Schrepp, M.: Construction and evaluation of a user experience questionnaire. In: HCI and Usability for Education and Work, pp. 63–76 (2008). https://doi.org/10.1007/978-3-540-89350-9_6
18. Stavropoulos, T.G., Meditskos, G., Andreadis, S., Kompatsiaris, I.: Real-time health monitoring and contextualised alerts using wearables. In: Proceedings of 2015 International Conference on Interactive Mobile Communication Technologies and Learning, IMCL 2015 (2015). https://doi.org/10.1109/IMCTL.2015.7359619
Society of “Citizen Science through Dancing” Risa Kimura(B) , Keren Jiang, Di Zhang, and Tatsuo Nakajima Department of Computer Science and Engineering, Waseda University, Tokyo, Japan {r.kimura,jiangkeren,zhang-di09,tatsuo}@dcl.cs.waseda.ac.jp
Abstract. Citizen science is scientific research conducted by nonprofessional scientists. Game-based citizen science tools have contributed to accelerating recent scientific progress. They have received great attention because nonexperts can find new scientific facts. However, the approach mainly fascinates hardcore gamers, because finding new facts is too challenging for casual users and does not engage them for long periods. In this paper, we present Citizen Science through Dancing, where two casual users collaboratively find better protein-protein docking through their body actions. Then, we enhance the basic approach with a virtual reality platform named CollectiveEyes to present multiple persons’ eye views in a virtual space. CollectiveEyes adds a social watching functionality to Citizen Science through Dancing to better fascinate casual users. We also report the opportunities and pitfalls of our current approach.
Keywords: Citizen science · Casual game players · Social watching · Gamification · Dance performance · Health · Learning · Wellbeing
1 Introduction
This study has developed a game-based citizen science tool for casual users that offers a playful and social way to find better protein docking, which is useful for discovering new drugs. Traditional similar tools, such as Udock [7] and Bioblox [16], explore protein-protein docking as puzzle-solving games on PCs and navigate proteins via the mouse and keyboard. Although many researchers have claimed that the game-based approach is effective in educating people to understand scientific facts, the approach mainly fascinates hardcore gamers, as they can find new protein combinations that were not previously known. Recently, there has been a new wave of video games named casual games, which seem to solve this problem of missing pull: games that are easy to learn to play fit a large number of casual players well and work in many different situations. Our approach, named Citizen Science through Dancing, is a casual game-based tool that navigates proteins via users’ body movements, where moving hands and feet, as in dancing, can be used to find proteins that dock better together, as shown in Fig. 1. Recently, watching others’ game play has become very popular [15]. Additionally, gaming has become a sport (esports), because players compete with other players and the plays can be watched by many audience types [14].
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 P. Novais et al. (Eds.): ISAmI 2020, AISC 1239, pp. 13–23, 2021. https://doi.org/10.1007/978-3-030-58356-9_2
Watching others’ plays can offer
the following two additional advantages for fascinating casual users. The first advantage is learning new play skills by watching others’ play. The second is enjoying others’ play. From the citizen science point of view, these advantages are essential, because audiences may increase their interest in the target sciences by watching various people’s plays. The advantages are significantly effective in increasing users’ intrinsic motivation to use citizen science tools. CollectiveEyes is a virtual reality platform used to share people’s eyes and ears, and we use CollectiveEyes to offer a social watching functionality to Citizen Science through Dancing to make it more appropriate for casual users. Figure 1 also shows the social watching functionality of Citizen Science through Dancing with CollectiveEyes.
Fig. 1. Gamified citizen science and Citizen Science through Dancing
The paper is structured as follows. In Sect. 2, we review citizen science and protein-protein docking as the background of our research. In Sect. 3, we present a casual game named Citizen Science through Dancing, where two users collaboratively find better protein-protein docking through playful body actions, with auditory feedback as a game mechanic for encouraging more casual users. Section 4 describes how we enhance Citizen Science through Dancing with a virtual reality platform named CollectiveEyes to add a social watching functionality as an additional game mechanic. We then report the opportunities and pitfalls of our current approach in Sect. 5, and finally Sect. 6 concludes the paper.
2 Citizen Science and Protein-Protein Docking Citizen science is scientific research conducted by nonprofessional scientists [2]. The purpose of citizen science is for nonprofessional people to voluntarily collect data for, share data with, or analyze data for professional scientists in the pursuit of contributing to scientific knowledge. Within the life sciences, at the intersection of citizen science, crowdsourcing and gamification, there have been many activities where game playing is used to solve scientific problems [10]. Game-based citizen science tools such as FoldIt
and EteRNA have contributed to accelerating recent scientific progress [1]. They have received great attention because nonexperts can find new scientific facts. Interactions occur when two or more proteins bind together, which is called protein-protein docking (PPD) [8]. PPD is the problem of predicting the three-dimensional structure of a protein complex from the structures of its components. Due to the difficulty of experimental methods for protein tertiary structure determination, there are more experimentally determined structures of single proteins than of protein complexes. Therefore, there is high demand for PPD, which gives important insight into the molecular recognition of unknown complexes. The typical use is for drug discovery. Several graphical interactive tools have been developed to explore PPD [7, 16]. One significant current trend is to develop tools for nonresearchers, such as Udock [7] and Bioblox [16], which explore PPD as a puzzle-solving game, as such tools might help promote interest in the research field.
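To give a feel for why PPD makes a natural puzzle, the toy scorer below counts atom pairs from the two proteins that fall within a contact distance. Real docking scoring functions are far more elaborate (and the cutoff here is invented), so this is only an illustration of the "search for a better-scoring pose" loop that such games expose.

```python
# Toy docking "score": number of inter-protein atom pairs within a cutoff.
# Coordinates are (x, y, z) tuples; cutoff is an illustrative value.

import math

def contact_score(protein_a, protein_b, cutoff=5.0):
    score = 0
    for a in protein_a:
        for b in protein_b:
            if math.dist(a, b) <= cutoff:  # Euclidean distance
                score += 1
    return score

score = contact_score([(0.0, 0.0, 0.0)],
                      [(3.0, 0.0, 0.0), (10.0, 0.0, 0.0)])
```

Moving or rotating one protein changes which pairs fall within the cutoff, which is exactly what the body-action controls in the next section manipulate.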
3 Citizen Science through Dancing This section explains how to navigate PPD manipulation through users’ playful body actions and how to offer auditory feedback in Citizen Science through Dancing. Similar to conventional gamified citizen science tools, users get higher scores when they find better protein combinations. For analyzing users’ body actions, we use Microsoft Kinect, and for visualizing PPD and auditory feedback, we use the Unity game engine.
Fig. 2. Manipulating proteins through dancing
The current version of Citizen Science through Dancing supports the following seven functions to manipulate PPD [4]. Five functions are used to navigate proteins. A user is able to move the two proteins (Move), rotate the side chains of proteins (RSChain), change the viewpoint (CView), zoom in/out on the proteins (Zoom) and rotate the proteins (Rotate). In addition to the above basic five functions, there are two additional functions. The first is to change the transparency of the protein surface (Trans), and the second is to cut the surface of the protein on any plane (Cut). Citizen Science through Dancing adopts different body actions and poses to operate the PPD manipulation functions presented above. Our tool will record the initial and end
positions of the hand and calculate the rotation direction. Then, it is easy to rotate the protein in the desired direction with a fixed angle. The user waves his/her hand to move the two proteins closer together, as shown in Fig. 2-n, squats to move them farther apart, as shown in Fig. 2-o, and does a T-shaped pose to make them stop moving, as shown in Fig. 2-p (Move). To change the viewpoint (CView), the user leans left/right to change the horizontal view angle, as shown in Fig. 2-a and 2-b, and leans forward/back to change the vertical view angle, as shown in Fig. 2-c and 2-d. The user pulls/pushes his/her hand to zoom in/out, as shown in Fig. 2-e and 2-f (Zoom). The user also raises two hands to go back to the original viewpoint, as shown in Fig. 2-g. The user moves his/her hand as a cursor and chooses the side chain that he/she wants to rotate by making a fist. Then, the chosen side chain will start to rotate until the user stops making a fist (RSChain). The user can rotate the protein by moving his/her hand in the desired direction (Rotate). The user swipes left/right/up/down to increase or decrease the transparency of the two protein surfaces, as shown in Fig. 2-h, i, j, and k (Trans). The user kicks left/right to cut the surface of the protein and raises the left/right hand to recover the surface, as shown in Fig. 2-l and m (Cut). Returning proper feedback is essential for better understanding of the current situation and for encouraging users to pursue higher scores by finding better protein combinations. Visual feedback is commonly used for delivering information. However, auditory feedback is usually better than visual written feedback. In particular, ambient auditory feedback is a promising way to improve feedback without increasing people’s mental load. We have considered using musical sound as ambient audio feedback, where the music’s tone, speed, and volume are configured according to the current score.
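The mapping from recognized poses to the manipulation functions described above can be sketched as a dispatch table. The gesture labels below are our own simplification of the mapping illustrated in Fig. 2, not the names used in the actual Kinect-based implementation.

```python
# Toy dispatch table from Kinect-detected poses to PPD manipulation commands.
# Labels and command tuples are invented for illustration.

COMMANDS = {
    "wave_hand":  ("Move", "closer"),
    "squat":      ("Move", "apart"),
    "t_pose":     ("Move", "stop"),
    "lean_left":  ("CView", "left"),
    "lean_right": ("CView", "right"),
    "pull_hand":  ("Zoom", "in"),
    "push_hand":  ("Zoom", "out"),
    "raise_both": ("CView", "reset"),
    "swipe":      ("Trans", "adjust"),
    "kick":       ("Cut", "cut"),
}

def dispatch(gesture):
    """Map a recognized gesture to a (function, argument) command."""
    return COMMANDS.get(gesture, ("NoOp", None))
```

A table like this keeps gesture recognition and protein manipulation decoupled, so new poses can be added without touching the rendering code.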
Fig. 3. Using Citizen Science through Dancing with two users
As shown in Fig. 3, Citizen Science through Dancing is usually used by two users. Each user can independently use the body actions described above. However, cooperating together may make PPD operations more enjoyable. We expect two use cases of our tool. The first use case is that two users who like dancing use the tool for their pleasure. Their purpose is to perform better dancing while aiming for better PPD exploration. The second
use case is for a biological expert and nonexpert to use the tool. The purpose of this use case is for experts to teach the essence of PPD to nonexperts through playful dancing.
4 CollectiveEyes as a Social Watching Infrastructure
Watching experts’ examination of PPD teaches other people how PPD contributes to the scientific progress of biology and the discovery of new drugs. In particular, if experts explain their examination while using the playful tool, the explanation increases people’s intrinsic motivation to know and learn molecular biology. This is particularly useful to motivate young people to learn biology. Watching others’ PPD trials may enhance our imagination and creativity, because we can watch how others use Citizen Science through Dancing. As shown in [13], people usually consider unexpected and alternative ways to use daily tools. Other people may use Citizen Science through Dancing in an unexpected way. By examining multiple users’ eye views, a user may become aware of other ways to use it. PPD is a kind of puzzle; thus, a user needs high skills to achieve better docking. Some users, such as hardcore gamers, use the tool seriously to acquire higher docking scores. They develop various skills to achieve higher scores. Watching their docking as a casual user is effective for learning their skills. These functionalities are essential for increasing casual users’ intrinsic motivation.
Fig. 4. Watching multiple eye views with CollectiveEyes
CollectiveEyes shows multiple eye views simultaneously in a virtual space [5]. We assume that each person is equipped with a wearable device containing a camera and microphone, typically wearable glasses. The current version of CollectiveEyes uses a head-mounted display (HMD) and puts a camera and microphone in front of the HMD. The HMD projects the view captured by a camera. When showing multiple views, the views are shown in a virtual space. CollectiveEyes offers two modes to present multiple eye views. The first is the spatial view mode, and the second is the temporal view mode, as shown in Fig. 4. When using the spatial view
mode, four views are automatically selected and shown in the virtual space. If one of the views does not interest a user, another view is shown instead of the removed view. When using the temporal mode, one view is randomly selected and displayed. The view can be successively changed to another view until the most desirable view is found. CollectiveEyes uses the Unity game engine to project multiple eye views in a virtual space. When using CollectiveEyes with Citizen Science through Dancing, each eye view presents a different user’s dancing scene, as shown in Fig. 4. A user can choose appropriate views, and comparing these multiple views is effective for focusing on differences in dancing.
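The two presentation modes can be modelled, in a much simplified way, as follows. The view identifiers and the replacement policy (take the next unused view) are placeholders for CollectiveEyes' actual selection logic, which the paper does not specify in detail.

```python
# Toy model of the two CollectiveEyes presentation modes.

class SpatialMode:
    """Keep four eye views on screen; swap out a dismissed one."""
    def __init__(self, available):
        self.available = list(available)
        self.shown = self.available[:4]

    def dismiss(self, view):
        self.shown.remove(view)
        unused = [v for v in self.available
                  if v not in self.shown and v != view]
        if unused:
            self.shown.append(unused[0])

class TemporalMode:
    """Show one view at a time, cycling until a desirable one is found."""
    def __init__(self, available):
        self.available = list(available)
        self.index = 0

    def next_view(self):
        view = self.available[self.index]
        self.index = (self.index + 1) % len(self.available)
        return view
```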
5 User Studies
We conducted two user studies to investigate our approach. In the first user study, participants were asked to play the role of a person shown in a scenario, to extract several insights on using social watching via CollectiveEyes in Citizen Science through Dancing. In the second user study, we examine whether two casual users’ collaborative performance is more effective than the traditional single-person approach.
Investigating Social Watching in Citizen Science through Dancing: Docking proteins is a kind of puzzle; thus, a user needs high skills to achieve better docking. Some users, such as hardcore gamers, use the tool seriously to acquire higher docking scores. They develop various skills to achieve higher scores. Watching their docking is effective for learning their skills. The acquired skills increase their self-efficacy, which enhances their intrinsic motivation. Watching others’ performance is enjoyable and may enhance one’s creativity and imagination. Additionally, the act of watching motivates one to use the tool oneself, and this use encourages one’s exercise. The user study was conducted at our university; we hired 11 participants (age m = 26.0, 10 males). In the user study, to test the above expectations, we asked each participant to perform the role of Risa in the scenario shown in Fig. 5, based on user enactments [9], while watching others’ dancing in Citizen Science through Dancing through CollectiveEyes. This approach helps participants understand how to use such a new tool as a participant. Additionally, using a virtual space to present the scenario increases the immersion of using the tool. Finally, we asked them the following five questions and then conducted semi-structured interviews based on the questions. We investigated participants’ scores on these points using a five-point Likert scale (5: induced, 4: to some extent, 3: cannot say either, 2: sometimes induced, 1: not induced) for each question.
In the user study, we used only the spatial mode of CollectiveEyes and simultaneously showed two views in a virtual space.

Q1. Does the tool increase the user's intrinsic motivation for learning biological science?
Q2. Does the tool stimulate the user's imagination and creativity?
Q3. Does the tool enhance the user's performance skills by watching others' PPD manipulation?
Society of “Citizen Science through Dancing”
Q4. Does the tool encourage the user's exercise through Citizen Science through Dancing?
Q5. Do you enjoy watching others' performances?
Risa is a college student who is studying medical science. She is also part of a performance club and is especially interested in dance performance. Kana, a friend of Risa's, is a researcher who is studying PPD. Today, Kana decided to use Citizen Science through Dancing with Risa to share her knowledge of PPD, because Kana believes that this knowledge is useful for deepening Risa's understanding of recent developments in medicine. However, Risa does not know how to use the program, so she decides to watch others' uses of Citizen Science through Dancing. She finds that two users' collaboration is important for finding better docking, and that the dance-like performance needs large whole-body actions. Because she likes dance performance, she is very interested in using the program, and others' performances spark various ideas about how such performance may help our society's scientific progress.
Fig. 5. Citizen science through dancing scenario
The results of the average scores m(sd) (m: mean, sd: standard deviation) for the above questions are Q1: 4.0(0.9), Q2: 4.27(1.1), Q3: 3.18(1.0), Q4: 4.27(1.0), and Q5: 3.73(1.1). Nine participants agreed that the tool is useful for increasing the motivation for learning biological science in terms of Q1. One typical comment was as follows: "It is effective for increasing my motivation because there is a good possibility that I'm interested in protein docking from my interest in dance performance." In terms of Q2, nine participants also agreed that visually watching others' plays stimulates the participants' imagination regarding using the tool. For example, a participant said, "I think that it is difficult to understand simply by receiving explanations in words, so if I can see how other users are actually using the tool, I will be able to imitate the appearance even if I cannot understand how to use it." On the other hand, for six participants, social watching did not seem to help significantly increase their play skills in terms of Q3. An example comment was as follows: "I think that the tool will increase my motivation, but it is rare that the skill itself will improve just by looking at others' uses of the tool." However, one participant very strongly agreed that imitating others' play can increase their skills. In terms of Q4, nine participants considered the tool useful in encouraging participants' exercise. One comment was that "I think that it will lead to health promotion if it becomes a habit to move the body." The answers for Q5 are influenced by the participants' interests in dance performance. For example, a participant said, "By looking at the dance performances of other users, you can compare them with your own dances, and even if you cannot dance, you can enjoy looking at the dances of other users"; however, another participant claimed that "It's good for learning how to use the tool, but it does not seem fun to watch others' uses of the tool."

The Effectiveness of Citizen Science through Dancing: In the experiment, to investigate the effectiveness of Citizen Science through Dancing, we developed two variations. The first variation is the desktop mode, in which a mouse and keyboard are used to operate the above functions, similar to other popular related tools such as Udock and Bioblox. The second variation is the one-user body-action mode, in which protein docking is navigated through the body actions of only one person. We also call the original mode the two-users body-action mode. We hired 15 participants (age m = 28.8, 5 females). Each participant was first given background knowledge of PPD and completed a profile questionnaire. Next, we performed a body-action tutorial. Then, we let the participant experience the one-user body-action mode, the two-users body-action mode and the desktop mode of the tool. In this experiment, the first user is the participant, and the second user is one of the authors; the second user helped the first user when the first user did not know what to do. Finally, we let the participant finish the questionnaires and the interview. We used a modified GEQ scale [3] in the user study to assess our tool. The appendix presents the actual questionnaires used in the user study and the meanings of the scores in the questionnaires.
Fig. 6. Comparing desktop mode vs. one-user body-action mode vs. two-users body-action mode
We adopted the IGG scale and the GSP scale, as shown in the appendix. Figure 6 compares the desktop mode, the one-user body-action mode and the two-users body-action mode using the IGG scale. The experiment also measured the social effects of the tool by using the GSP scale. Figure 7 shows the results of the experiment.

Fig. 7. Measuring social effects in the two-users body-action mode

Looking at the IGG scale scores, the participants were not less bored by the body-action modes than by the desktop mode, based on IGG 1. The participants thought that the body-action modes were more interesting than the desktop mode, based on IGG 2, IGG 3, IGG 7, IGG 8 and IGG 10, and they thought that the desktop mode was easier to learn than the body-action modes, based on IGG 6 and IGG 9. The participants also had fewer negative emotions while using the desktop mode than the body-action modes, based on IGG 4 and IGG 5. Looking at the GSP scale scores, the two-users body-action mode offered better social effects, and the participants successfully cooperated with others without significant problems. Based on the scores of IGG 2, IGG 7, IGG 8 and IGG 10, the two-users body-action mode was better than the one-user body-action mode. We expect that these results were influenced by social effects. On the other hand, in terms of the scores of IGG 4, the two-users body-action mode scored lower than the one-user body-action mode. We expect that the score of GSP 10 explains this result.
6 Conclusion and Future Direction

This paper presented a new approach that enhances a traditional gamified citizen science tool, called Citizen Science through Dancing, by incorporating users' playful body actions and auditory feedback as game mechanics for encouraging more casual users. We enhanced the basic approach with CollectiveEyes to increase its social effects by adding a social watching functionality as an additional game mechanic. The current study's focus is on encouraging more casual users, but in the next step, we also need to investigate whether our tool actually finds much better protein-protein docking than conventional GUI-based gamified citizen science tools. In the next step, we would also like to investigate opportunities for using our tool to facilitate experts' daily health and creativity through its capabilities of using whole-body actions. As shown in [11], body movement offers positive effects on people's creativity; thus, the approach may facilitate better PPD by stimulating creativity. Typical experts tend to sit on their chairs for long periods while performing daily research activities; however, sitting is not desirable behavior for human health [12] because physical activities are essential for a healthy lifestyle [6]. Therefore, the tool offers a new direction for encouraging PPD experts' health. However, in experts' daily research activities, efficiency in helping with their work is the most important criterion; thus, it is important to strike a balance between health and work efficiency.
Appendix: Game Experience Questionnaire

In the second user study, we used a subset of the Game Experience Questionnaire (GEQ). We adopted the "In-game GEQ" scale and the "GEQ Social Presence Module" scale. The following lists show the questionnaires used in the paper. A participant was asked to indicate how he/she felt while playing the game for each of the items on the following scale: 0, not at all; 1, slightly; 2, moderately; 3, fairly; 4, extremely.

In-game GEQ (IGG)

IGG 1. I felt bored
IGG 2. I found it impressive
IGG 3. I forgot everything around me
IGG 4. I felt frustrated
IGG 5. I felt irritable
IGG 6. I felt skillful
IGG 7. I felt completely absorbed
IGG 8. I felt content
IGG 9. I felt challenged
IGG 10. I felt good

GEQ - Social Presence Module (GSP)

GSP 1. I empathized with the other(s)
GSP 2. My actions depended on the other(s)' actions
GSP 3. I felt connected to the other(s)
GSP 4. The other(s) paid close attention to me
GSP 5. I paid close attention to the other(s)
GSP 6. I found it enjoyable to be with the other(s)
GSP 7. When I was happy, the other(s) was(were) happy
GSP 8. When the other(s) was(were) happy, I was happy
GSP 9. What the other(s) did affected what I did
GSP 10. What I did affected what the other(s) did
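The item scores above are aggregated per scale by averaging a participant's 0-4 ratings over the ten items of each module. A minimal sketch of that aggregation (the function name and the sample ratings are ours, for illustration only):

```python
def scale_score(item_ratings):
    """Average a participant's 0-4 ratings over the items of one GEQ scale."""
    if not all(0 <= r <= 4 for r in item_ratings):
        raise ValueError("GEQ items are rated on a 0-4 scale")
    return sum(item_ratings) / len(item_ratings)

# One made-up participant's ratings for the ten IGG items, in order IGG 1..10.
igg = [1, 3, 2, 1, 0, 2, 3, 3, 2, 3]
print(round(scale_score(igg), 1))  # → 2.0
```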
References

1. Bowser, A., Hansen, D., He, Y., Boston, C., Reid, M., Gunnell, L., Preece, J.: Using gamification to inspire new citizen science volunteers. In: Proceedings of the First International Conference on Gameful Design, Research, and Applications (Gamification 2013) (2013)
2. Hecker, S., Haklay, M., Bowser, A., Makuch, Z., Vogel, J., Bonn, A.: Citizen Science: Innovation in Open Science, Society and Policy. UCL Press, London (2018)
3. IJsselsteijn, W., Poels, K., De Kort, Y.A.W.: The game experience questionnaire: development of a self-report measure to assess player experiences of digital games. TU Eindhoven, Eindhoven, The Netherlands (2008)
4. Jiang, K., Zhang, D., Iino, T., Kimura, R., Nakajima, T., Shimizu, K., Ohue, M., Akiyama, Y.: A playful tool for predicting protein-protein docking. In: Proceedings of the 18th International Conference on Mobile and Ubiquitous Multimedia (2019), Article 40
5. Kimura, R., Nakajima, T.: A ubiquitous computing platform for virtualizing collective human eyesight and hearing capabilities. In: Proceedings of the 10th International Symposium on Ambient Intelligence (2019)
6. Kruk, J.: Physical activity and health. Asian Pac. J. Cancer Prev. 10(5), 721–728 (2009)
7. Levieux, G., Tiger, G., Mader, S., Zagury, J.F., Natkin, S., Montes, M.: Udock, the interactive docking entertainment system. Faraday Discuss. 169, 425–441 (2014)
8. Matsuzaki, Y., Uchikoga, N., Ohue, M., Akiyama, Y.: Rigid-docking approaches to explore protein–protein interaction space. In: Nookaew, I. (ed.) Network Biology. Advances in Biochemical Engineering/Biotechnology, vol. 160. Springer, Cham (2016)
9. Odom, W., Zimmerman, J., Davidoff, S., Forlizzi, J., Dey, A.K., Lee, M.K.: A fieldwork of the future with user enactments. In: Proceedings of the Designing Interactive Systems Conference (DIS 2012) (2012)
10. Spitz, R., Pereira Jr., C., Queiroz, F., Leite, L.C., Dam, P., Ferranti, M.P., Kogut, R., Oliveira, W.: Gamification, citizen science and civic engagement: in search of the common good. In: Proceedings of the 6th BALANCE-UNBALANCE 2017 [Arts + Sciences × Technology = Environment/Responsibility] A Sense of Place (2017)
11. Oppezzo, M., Schwartz, D.L.: Give your ideas some legs: the positive effect of walking on creative thinking. J. Exp. Psychol. Learn. Mem. Cogn. 40(4), 1142–1152 (2014)
12. Owen, N., Healy, G., Matthews, C.E., Dunstan, D.W.: Too much sitting: the population-health science of sedentary behavior. Exerc. Sport Sci. Rev. 38(3), 105 (2010)
13. Suri, J.F.: Thoughtless Acts?: Observations on Intuitive Design. Chronicle Books, San Francisco (2005)
14. Taylor, T.L.: Raising the Stakes: E-Sports and the Professionalization of Computer Gaming. The MIT Press, Cambridge (2015)
15. Taylor, T.L.: Watch Me Play: Twitch and the Rise of Game Live Streaming. Princeton University Press, Princeton (2018)
16. Thomason, A., Filippis, I., Leyton, P.Q., Sternberg, M., Latham, W., Leymarie, F.F.: Bioblox: protein docking game, Heidelberg, Germany. https://bioblox.org/. Accessed 20 Mar 2019
The ACTIVAGE Marketplace: Hybrid Logic- and Text-Based Discovery of Active and Healthy Ageing IoT Applications

Thanos G. Stavropoulos, Dimitris Strantsalis, Spiros Nikolopoulos, and Ioannis Kompatsiaris

Centre for Research and Technology Hellas, Information Technologies Institute, 57001 Thessaloniki, Greece
{athstavr,dstrants,nikolopo,ikom}@iti.gr
https://mklab.iti.gr/
Abstract. Currently, Active and Healthy Ageing (AHA) IoT solutions for the global ageing population remain segmented. The ACTIVAGE project aims to integrate prevalent IoT platforms, enhanced with data analytics, developer, deployment, security and privacy tools, to enable large-scale AHA pilots all over Europe. This paper presents the ACTIVAGE Marketplace, a one-stop-shop for developers to provide, and for end-users, carers and healthcare professionals to discover, AHA IoT Apps. The Marketplace offers standard functionality such as Developers uploading, monetizing and tracking analytics for their Apps, Users discovering, buying and downloading them, as well as Administrators managing and moderating content. In addition, it offers its own hybrid logic- and text-based method to discover alternatives from the wide variety of Apps, from sensor behavioral monitoring to exergames or pill reminder interventions. For this purpose, the Marketplace ontology defines App semantic properties, such as a hierarchy of App Categories, combined with keyword-based text-similarity search. Keywords: Marketplace · IoT · AHA · Ontology matching · Semantic discovery · Assisted living
1 Introduction
All around the world, mortality rates have fallen significantly over the past decades, leading to considerable changes in the age distribution of societies. People aged 60 are now expected to survive an additional 18.5 to 21.6 years, and soon the world will have a higher number of older adults than children [18]. This transformation is expected to continue, with the age group of elders (65+) growing from 18% to 28% of the EU population by the year 2060. Furthermore,

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2021
P. Novais et al. (Eds.): ISAmI 2020, AISC 1239, pp. 24–33, 2021. https://doi.org/10.1007/978-3-030-58356-9_3
according to the 2015 Ageing Report [1], one in three Europeans will be over 65, with a ratio of "working" to "inactive" population of 2 to 1, representing a heavy impact on health and social care systems. Indeed, population ageing creates a common challenge for societies, and the need to find ways to support these target groups grows. Citizen empowerment and incitation to self-equip is one of the explored options. The findings highlight the need for active ageing and independent living, the so-called Active and Healthy Ageing (AHA) services, targeted to older adults. Not only legislation and insurance but also technology has to revolutionize current health and social care systems (H&SC). Indeed, AHA services based on Internet-of-Things (IoT) technology are a promising, strategic component to support the creation of an ecosystem able to dynamically answer and prevent the challenges faced by H&SC. Various IoT networks in Europe are being deployed for sensing, measuring and controlling indoor and outdoor smart connected objects, while various IoT platforms are emerging on the market with the aim of supporting the concept of independent living [4,8]. However, the platforms create segmented ecosystems to develop, use and exploit their applications.

The ACTIVAGE project (http://www.activageproject.eu/) aims to integrate and leverage numerous established IoT platforms and services, enrich them with data analytics tools, end-to-end security and privacy management, and tools for developers and deployers, in order to bring about large-scale deployments of AHA IoT applications all over Europe. The IoT platforms integrated in ACTIVAGE include universAAL (uAAL) [4], OpenIoT [15], sensiNact [3], Sofia2 (https://github.com/Sofia2), FIWARE (https://www.fiware.org/), IoTivity [8] and SeniorSome [11]. Developers can now build "Apps", in the sense of self-contained applications built on top of a common platform, and users are able to universally download, install and deploy them across sites and ecosystems.

Still, a cross-platform portal for developers to advertise and for users to discover those Apps is needed. This paper presents the ACTIVAGE Marketplace, a one-stop-shop for developers and service providers to upload and monetize Apps and for end-users, carers and healthcare professionals to discover them. The added value of the ACTIVAGE Marketplace is its ability to cater for Apps built for any of the seven ACTIVAGE IoT platforms, crossing the boundaries of heterogeneous systems and allowing developers and end-users to take advantage of cross-site application deployment, reach new audiences and expand the ecosystem of IoT for AHA. Besides standard upload, download, moderation and management functionality, the Marketplace also offers its own intelligent App discovery method over the vast variety of Apps, from behavioral monitoring with sensors to intervention support with exergames or pill reminders. The hybrid discovery algorithm is based on the Marketplace's own knowledge representation model, the Marketplace ontology, combined with text-similarity keyword search algorithms.
2 Related Work
As ACTIVAGE unites several underlying IoT platforms, we first examined their respective marketplaces. Table 1 summarizes their overall capabilities to upload, provision and monetize Apps. Many platforms rely on external platforms and third-party services to publish and provision apps, such as the Google Play Store and GitHub. For example, the OpenIoT marketplace is intended for developers only, not end-users. SensiNact offers an online Eclipse project page with guidelines for contributing code, as does IoTivity on GitHub, but neither offers a marketplace platform. The fact that none of the existing solutions offers the full range of features of a modern marketplace highlights the need for a fully featured, all-in-one marketplace solution that does not rely on third parties. Based on the above, the ACTIVAGE Marketplace is inspired by IoT platform marketplaces to provide an integrated platform. It aims to provide all resources on how to become a developer, then publish, monitor and monetize an App built on any of the seven popular platforms integrated by ACTIVAGE. Moreover, it aims to extend previous IoT marketplaces with more provisioning and discovery tools, by offering suggestions of similar apps.

Table 1. Marketplace functionality of the ACTIVAGE platforms (features compared across uAAL, OpenIoT, sensiNact, Sofia2, FIWARE, IoTivity and SeniorSome: Marketplace, Open source, Publish app, Sell apps, Developer analytics and Multi-criteria search)
Regarding the suggestion of similar apps when searching, this functionality has often been addressed using machine learning approaches, namely recommender systems [13,14]. Two approaches are established: content-based, where mining is used to find similarities between items to make recommendations [10], and collaborative filtering, where items are recommended to users with similar behavior, as in Amazon [9]. However, the learning approaches require a lot of data to mine from, especially user activity in the latter approach, which may not be initially available, as in the case of a Marketplace specialized in AHA IoT applications. This is known as the cold start problem. Therefore, we present an alternative knowledge-based approach that does not require training but is instead based on ontologies and semantic discovery algorithms. The field of semantic matchmaking, i.e. the problem of finding an item whose offered parameter, denoted O, is semantically similar to a requested search parameter, denoted R, was mostly established in Semantic Web Service matchmaking problems [6,16,17]. Most algorithms for Web Services
follow variations of three main approaches: logic-based (i.e. based on logic reasoning with Semantic Web technologies), which explores the ontology taxonomy structure and reasoning; text-based or syntactic, which explores string similarity; and hybrid, which combines logic- and text-similarity. Examples of logic-based methods include iMatcher [17], which looks for offered concepts that are children of the requested concept, i.e. more specific (R > O). The main characteristics of Web Services are their inputs and outputs. SAWSDL-MX [6] looks for a more specific output but a more generic input (R_output > O_output ∧ R_input < O_input). URBE [12]'s logic-based approach calculates the distance between two concepts in the same ontology. Most algorithms fall back to text-similarity methods if logic-based methods fail, or combine both. Our previous work in services, TOMACO [16], follows these principles. As shown in its evaluation, TOMACO's most accurate method is the hybrid one, where text-based matching can compensate when the logic-based method fails to find an exact match. When both logic-based and text-based matches fail, logic-based alternatives such as parent, sibling and child concepts are suggested. Also, TOMACO does not require training, which lowers response time and mitigates the cold start problem. Therefore, this paper presents a transfer and adaptation of TOMACO from the field of Web Services to the field of Apps, in order to provide a logic-based, a text-based and a hybrid method to discover Apps.
3 The ACTIVAGE Marketplace
From the review of existing marketplaces in the previous section, and with the main vision to unify the IoT platforms by providing a uniform portal to publish, monetize and promote Apps, a set of requirements per user role was defined, as described below. The Marketplace "Users" are all the general end-users that wish to obtain, install and use applications hosted in the Marketplace. App Users, or simply Users in the context of this paper, have the ability to register, manage their profile, maintain wishlists and installed-application lists, and review and rate applications. They can be members of the general public or even deployers of services and equipment related to the ACTIVAGE AHA platforms. If at any point they wish to develop Apps besides using them, they can enable developer mode. This creates a separate Developer profile, as is the case in most online marketplaces (e.g. the Google Play store). Developers are programmers and engineers already working on the ACTIVAGE platforms, or other parties that have the knowledge and intention to develop compatible applications. In both cases, the Marketplace first of all provides material to educate and enable them to do so, i.e. the developer resources. Developers have the ability to upload and host applications in the Marketplace, monetizing them and widening their reach. They can also track performance and receive rating and review insights. The Marketplace Administrators are the end-users that have full access to its setup and manage users and Apps. Their main task for Marketplace maintenance is the validation of Apps for compliance. For this, the Marketplace offers
automatic virus checks and the ability to review the Apps that developers submit prior to publication. Administrators can also review reported content, including Apps, comments and reviews, and choose to remove it.

3.1 The Marketplace Ontology and Hybrid App Discovery
Due to the wide variety of hosted content, from monitoring Apps with sensors to intervention Apps with actuators and games, the Marketplace should offer the ability to efficiently categorise and discover content. While the standard web e-store methodology offering keyword search and category filters has already been presented, a semantic discovery algorithm was integrated to suggest alternative or similar Apps based on semantic metadata.

The Marketplace Ontology. The Marketplace Ontology is a model for AHA IoT Apps, available online (https://mklab.iti.gr/results/), primarily used for knowledge representation around the Apps as well as for logic-based search and discovery. The central concept, as depicted in Fig. 1, is "App", with the basic metadata properties of "description", "version", "price" and "beta" and object properties linking it to other concepts. Namely, each app is developed by a certain type of "User", which is a "Developer". It is built on a certain "Platform" and is installed in one or more "DS" (Deployment Sites, i.e. the hospitals, nursing homes, municipalities and end-user homes currently registered with the platforms) in a respective "Country". It may use one or more "Services" and "Devices" from the "AHAOntology", which links the Marketplace Ontology to the rest of the ACTIVAGE platform's data model. Finally, each App belongs to a "Category", from which various concepts central to search in the Marketplace branch out. Categories can be general, such as "Smart Home" (automation), "QoL" (Quality of Life), "Mobility" and, more specifically, "Transportation" services in a Smart City, or "Health" improvement and, more specifically, "eHealth" and "mHealth" using mobiles. A fundamental category is "Monitoring" Apps, which can further entail "Biometrics" (sensing blood pressure, glucose, weight etc.), "Behavioral Monitoring" (sensing steps, physical activity, stress, sleep and other lifestyle measures) and "Activity Monitoring", via activity recognition of e.g. cooking, exercise, doing chores and watching TV. Another fundamental category is "Intervention", which can entail "Reminders", such as pill reminders or reminders to move, or "Serious Games" and, more specifically, "Brain Games" or "Exergames".

Fig. 1. The marketplace ontology

Hybrid Logic- and Text-Based App Search. The matchmaking algorithms employed in the Marketplace aim to provide more alternatives to user search than the conventional ones, in the form of similar, alternative Apps, by employing semantic matchmaking. The matchmaking algorithms are inspired by and established in previous works in Semantic Web Services [16]. While services have
certain characteristics, such as input and output semantics, the main principles of semantic search criteria and syntactic matching, as well as the weighted hybrid approach, remain the same. All search algorithms employed in the Marketplace follow a ranking approach, where every offered App, O, is rated with a decimal value of similarity to the user's required search criteria, R, from 0 (no similarity) to 1 (maximum similarity).

Logic-Based Search is based on ontology concept similarity between a requested category concept R and the offered concept O of an App in the Marketplace. This method is employed when the user provides a set of one or more criteria of type "Category" from the Marketplace Ontology, which then rate the offered Apps with: 1 for an exact match (R = O), 0.6 for a parent or child, i.e. R is a subclass (R < O) or superclass (R > O) of O, 0.4 for siblings, i.e. R is a sibling of O, and 0 for any other relationship, as defined in Eq. (1). For example, searching for the category "Behavioral Monitoring" returns: 1 for the "Step Count" App, which belongs in this category, 0.6 for the "Elder Monitor IP Camera" in the parent "Monitoring" category, 0.4 for the "Oxymeter" and "Blood Pressure Monitor" Apps ("Biometrics") and the "Daily Activity Sensing" App ("Activity Monitoring"), and 0 for all other Apps. Notably, the scores 1, 0.6, 0.4 and 0 are simply selected for a uniform distribution of ratings in [0, 1].

Text-Based Search is based on text similarity using the Jaro algorithm [5]. The method, denoted ∼t, matches the keyword Rk against every offered App's title Ot with Jaro string similarity; if the similarity is above a set threshold of 0.7 (as calibrated in past studies [16]), it returns 1 and otherwise 0 (Eq. (2)). For example, when searching for the keyword "activity", text similarity with the App title "Stay Active" in category "Exergames" returns a match and, therefore, 1.
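The text-based matching step can be sketched as follows: a standard Jaro similarity implementation combined with the 0.7 threshold. Matching the keyword against individual title tokens is our assumption about how a short keyword such as "activity" can match a longer title such as "Stay Active"; the paper does not specify this detail:

```python
def jaro(s1: str, s2: str) -> float:
    """Jaro string similarity in [0, 1]."""
    if s1 == s2:
        return 1.0
    if not s1 or not s2:
        return 0.0
    window = max(max(len(s1), len(s2)) // 2 - 1, 0)
    m1, m2 = [False] * len(s1), [False] * len(s2)
    matches = 0
    for i, c in enumerate(s1):  # count characters matching within the window
        for j in range(max(0, i - window), min(len(s2), i + window + 1)):
            if not m2[j] and s2[j] == c:
                m1[i] = m2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    transpositions, k = 0, 0
    for i in range(len(s1)):  # count matched characters that are out of order
        if m1[i]:
            while not m2[k]:
                k += 1
            if s1[i] != s2[k]:
                transpositions += 1
            k += 1
    transpositions //= 2
    return (matches / len(s1) + matches / len(s2)
            + (matches - transpositions) / matches) / 3

def f_text(keyword: str, title: str, threshold: float = 0.7) -> int:
    """Eq. (2): 1 if the keyword matches any title token above the threshold."""
    return int(any(jaro(keyword.lower(), token) >= threshold
                   for token in title.lower().split()))
```

With token-level matching, "activity" matches the title "Stay Active" (via the token "active", similarity ≈ 0.82) but not "Oxymeter".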
The Hybrid approach combines logic and text similarity and is employed when the user provides both a set of required categories and a search keyword. It treats text-similarity matches as exact matches, on par with the semantic logic-based exact matches. Therefore, for a requested category concept
R and a keyword Rk, an offered App with the same category concept O or a matching title Ot is rated 1 in both cases. The remaining cases follow the logic-based method, rating Apps whose category O is a subclass or a superclass of R with 0.6, with 0.4 if the categories are siblings and 0 otherwise, as described in Eq. (3). For example, when looking for the keyword "activity" and the category "Activity Monitoring", hybrid search returns 1 for the "Daily Activity Sensing" App, which is in this category, and for the "Stay Active" App, although it belongs to the "Exergames" category, due to text-similarity. It then returns 0.4 for the "Oxymeter", "Blood Pressure Monitor" and "Step Count" Apps in sibling categories, and 0.6 for the "Elder Monitor IP Camera" in the parent category. The findings from the alternative algorithms are thus combined and presented as alternatives that would otherwise be lost without manual exploration. The next section presents this functionality implemented in the Marketplace.
fLogic(R, O) =
  1,    if R = O
  0.6,  if R < O ∨ R > O (parent or child)
  0.4,  if R and O are siblings
  0,    otherwise                                  (1)

fText(Rk, Ot) =
  1,    if Rk ∼t Ot
  0,    otherwise                                  (2)

fHybrid(R, O, Rk, Ot) =
  1,    if R = O ∨ Rk ∼t Ot
  0.6,  if R < O ∨ R > O
  0.4,  if R and O are siblings
  0,    otherwise                                  (3)
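Equations (1)-(3) translate directly into a small scorer over the category taxonomy. The sketch below encodes a fragment of the Marketplace ontology as a child-to-parent map (category names taken from Sect. 3.1); the `text_match` flag stands in for the ∼t keyword test of Eq. (2), and the encoding itself is our assumption, not the ontology's actual serialization:

```python
# Fragment of the category taxonomy from the Marketplace ontology,
# encoded as a child -> parent map.
PARENT = {
    "Biometrics": "Monitoring",
    "Behavioral Monitoring": "Monitoring",
    "Activity Monitoring": "Monitoring",
    "Reminders": "Intervention",
    "Serious Games": "Intervention",
    "Brain Games": "Serious Games",
    "Exergames": "Serious Games",
}

def f_logic(requested: str, offered: str) -> float:
    """Eq. (1): rate an offered category against the requested one."""
    if requested == offered:
        return 1.0                                    # exact match
    if PARENT.get(requested) == offered or PARENT.get(offered) == requested:
        return 0.6                                    # parent or child
    if PARENT.get(requested) is not None and \
            PARENT.get(requested) == PARENT.get(offered):
        return 0.4                                    # siblings
    return 0.0

def f_hybrid(requested: str, offered: str, text_match: bool) -> float:
    """Eq. (3): a text-similarity match counts like an exact logic-based match."""
    if text_match:
        return 1.0
    return f_logic(requested, offered)
```

This reproduces the worked example above: requesting "Behavioral Monitoring" rates an App in the same category 1, one in the parent "Monitoring" category 0.6 and one in the sibling "Biometrics" category 0.4, while a keyword match lifts an otherwise unrelated App to 1.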
3.2 Implementation and User Interface
The Marketplace architecture follows a standard web application technology stack, using modern and well-established frameworks. Specifically, as shown in Fig. 2, the Marketplace is primarily divided into a front-end and a back-end. The former provides the user interface and experience for Users, Developers and Administrators. The back-end implements logic and operations, often leveraging third-party APIs. Such operations include authentication with third-party APIs (Google, GitHub, Twitter and Facebook user accounts through OAuth) and automated emailing (e.g. for passwords and notifications via the Sendgrid Web API). The entire technology stack is built around Python and the Django framework, in addition to jQuery, SASS and HTML5, while the development stack is supported by tools for version control, such as GitHub and Travis, and Sphinx docs for documentation. The ontology and the logic-based, text-based and hybrid search algorithms are integrated in the Marketplace.

The Marketplace implementation fulfils the requirements per user role as listed at the beginning of this section and is publicly accessible online (https://marketplace.activage.iti.gr). Some of its major capabilities and its user interface are presented here. To begin with,
the "Home" page shows two sliding sets of Apps (carousels) hosting the ten most downloaded and the ten highest-rated Apps. Both lists are automatically and periodically populated through the built-in metrics and ranking system. To view more Apps, the user may navigate to "All Apps", which initially presents an exhaustive, paginated list of all Apps. Several filters on the left allow the user to narrow down the list using multiple criteria, such as Bundle Type, which is essentially the App format (File, Docker, Android, iOS, Hardware, etc.), Category, Platform, Deployment Site, Release Date range, Price range and keyword-based search. Using the category filter employs the logic-based method, while keyword search employs the text-based method to provide alternative suggestions, as described in Sect. 3.1. Using both employs the hybrid method, as shown in Fig. 3. The rest of the filters, namely Platform, Deployment Site, Release Date and Price, are explicit, so they do not produce alternative suggestions. Choosing an App shows its main view with its description, current ratings and comments, and allows the user to buy and download it or add it to their wishlist for later. They may also comment on, review and rate the App and report spam and malicious content. The Marketplace offers users general analytics in the "Statistics" page, showing historic trends of downloaded applications, comments and more. Users may edit their profile, review their wishlist and installed Apps list and also become developers, i.e. enable developer privileges. Developers can perform all regular User actions but can also access the developer dashboard, which allows them to upload, update and manage their Apps using user-friendly forms and controls that hide the complex ontology structure behind the data models. It also shows performance analytics, such as downloads, comments, ratings and payments received, as well as developer resources.
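The "All Apps" behavior described above combines explicit filters, which narrow the list, with category/keyword criteria, which rank alternatives. An illustrative sketch of that combination; the field names and the score callback are our own assumptions, not the Marketplace API:

```python
def browse(apps, platform=None, max_price=None, min_score=0.0,
           score=lambda app: 1.0):
    """Apply explicit filters first, then rank the remainder by a similarity score."""
    hits = [a for a in apps
            if (platform is None or a["platform"] == platform)
            and (max_price is None or a["price"] <= max_price)]
    scored = [(score(a), a) for a in hits]
    scored = [(s, a) for s, a in scored if s >= min_score]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [a["title"] for s, a in scored]

# Hypothetical catalogue and per-App similarity scores (e.g. from the hybrid matchmaker).
APPS = [
    {"title": "Step Count", "platform": "uAAL", "price": 0.0},
    {"title": "Oxymeter", "platform": "uAAL", "price": 4.99},
    {"title": "Stay Active", "platform": "FIWARE", "price": 0.0},
]
SCORES = {"Step Count": 1.0, "Oxymeter": 0.4, "Stay Active": 0.0}

print(browse(APPS, platform="uAAL", min_score=0.4,
             score=lambda a: SCORES[a["title"]]))  # → ['Step Count', 'Oxymeter']
```

Explicit criteria (Platform, Price) act as hard filters and produce no alternatives, while the score-based ranking surfaces near matches that a plain filter would discard.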
Administrators can additionally access the administration portal to manage all Apps, comments, ratings, reviews, users and payments. They can validate Apps with integrated malware and virus check tools (e.g. https://www.hybrid-analysis.com/); approve, reject or ask for revisions; and review and remove reported content.

Fig. 2. The Marketplace Architecture
Fig. 3. Marketplace Hybrid Logic- and Text-Based Search
T. G. Stavropoulos et al.

4 Conclusion and Future Work
The ACTIVAGE Marketplace is a one-stop-shop for AHA IoT applications built on any of the ACTIVAGE IoT platforms. It provides a common portal for Developers to upload, promote and monetize their Apps, while Users may easily discover and download them. Given the wide range of Apps, from sensor behavioral monitoring to intervention support with pill reminders and exergames, the Marketplace employs a smart hybrid method that combines logic-based search, based on its own semantic metadata model (the Marketplace ontology), with keyword text similarity. Additionally, it offers a complete set of functionality for Users, Developers and Administrators, including the ability to interact and provide ratings and feedback, as well as to manage and validate content, promoting the growth of the ecosystem.

As future work, the Marketplace will be validated by its active users and the general public. The validation includes open-ended questions intended for all three user roles, as well as closed-ended questions and ratings of the functionality and ease of use. Additionally, we intend to use standardised questionnaires for usability (the System Usability Scale, SUS) [2] and user experience (the User Experience Questionnaire, UEQ) [7]. On the other hand, the effectiveness of the search methods is harder to evaluate. It draws on work at the Information Retrieval frontier that mostly relates to machine learning. As in those works, we need to construct a dataset that contains not only a wide selection of Apps but also user queries and ground truth, in terms of the target, desired answers (Apps) for each query, in order to evaluate accuracy, precision, recall, F-score and other measures.

Acknowledgements. This work has received funding from EU H2020-IOT-732679 ACTIVAGE.
References

1. WHO: World report on ageing and health 2015. WHO (2017)
2. Brooke, J.: SUS - a quick and dirty usability scale. Usability Eval. Ind. (1996). https://doi.org/10.1002/hbm.20701
3. Gürgen, L., Munilla, C., Druilhe, R., Gandrille, E., Botelho do Nascimento, J.: sensiNact IoT platform as a service. In: Enablers for Smart Cities, pp. 127–147. John Wiley & Sons, Inc., August 2016. https://doi.org/10.1002/9781119329954.ch6
4. Hanke, S., Mayer, C., Hoeftberger, O., Boos, H., Wichert, R., Tazari, M.R., Wolf, P., Furfari, F.: universAAL – an open and consolidated AAL platform. In: Ambient Assisted Living, pp. 127–140. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-18167-2_10
5. Jaro, M.A.: Advances in record-linkage methodology as applied to matching the 1985 census of Tampa, Florida. J. Am. Stat. Assoc. 84(406), 414–420 (1989). https://doi.org/10.1080/01621459.1989.10478785
6. Klusch, M., Kapahnke, P., Zinnikus, I.: SAWSDL-MX2: a machine-learning approach for integrating semantic web service matchmaking variants. In: IEEE International Conference on Web Services, ICWS 2009, pp. 335–342 (2009)
7. Laugwitz, B., Held, T., Schrepp, M.: Construction and evaluation of a user experience questionnaire. In: HCI and Usability for Education and Work, pp. 63–76 (2008). https://doi.org/10.1007/978-3-540-89350-9_6
8. Lee, J.C., Jeon, J.H., Kim, S.H.: Design and implementation of healthcare resource model on IoTivity platform. In: 2016 International Conference on Information and Communication Technology Convergence, ICTC 2016, pp. 887–891. IEEE, October 2016. https://doi.org/10.1109/ICTC.2016.7763322
9. Linden, G., Smith, B., York, J.: Amazon.com recommendations: item-to-item collaborative filtering. IEEE Internet Comput. 7(1), 76–80 (2003)
10. Lops, P., de Gemmis, M., Semeraro, G.: Content-based recommender systems: state of the art and trends. In: Recommender Systems Handbook, pp. 73–105. Springer (2011). https://doi.org/10.1007/978-0-387-85820-3_3
11. Luimula, M., Ailio, P., Botha-Ravyse, C., Katajapuu, N., Korpelainen, R., Heinonen, A., Jamsa, T.: Gaming for health across various areas of life. In: Proceedings of the 9th IEEE International Conference on Cognitive Infocommunications, CogInfoCom 2018, pp. 247–252. IEEE (2019). https://doi.org/10.1109/CogInfoCom.2018.8639955
12. Plebani, P., Pernici, B.: URBE: web service retrieval based on similarity evaluation. IEEE Trans. Knowl. Data Eng. 21(11), 1629–1642 (2009)
13. Resnick, P., Varian, H.R.: Recommender systems. Commun. ACM 40(3), 56–59 (1997)
14. Schafer, J.B., Konstan, J., Riedl, J.: Recommender systems in e-commerce. Technical report. www.reel.com
15. Soldatos, J., Kefalakis, N., Hauswirth, M., Serrano, M., Calbimonte, J.P., Riahi, M., Aberer, K., Jayaraman, P.P., Zaslavsky, A., Žarko, I.P., Skorin-Kapov, L., Herzog, R.: OpenIoT: open source Internet-of-Things in the cloud. In: Lecture Notes in Computer Science, vol. 9001, pp. 13–25. Springer (2015). https://doi.org/10.1007/978-3-319-16546-2_3
16. Stavropoulos, T.G., Andreadis, S., Bassiliades, N., Vrakas, D., Vlahavas, I.: The Tomaco hybrid matching framework for SAWSDL semantic web services. IEEE Trans. Serv. Comput. 9(6), 954–967 (2016). https://doi.org/10.1109/TSC.2015.2430328
17. Wei, D., Wang, T., Wang, J., Bernstein, A.: SAWSDL-iMatcher: a customizable and effective semantic web service matchmaker. Web Semant. Sci. Serv. Agents World Wide Web 9(4), 402–417 (2011). https://doi.org/10.1016/j.websem.2011.08.001
18. Yasamy, M., Dua, T., Harper, M., Saxena, S.: Mental health of older adults, addressing a growing concern. World Health Organization, Department of Mental Health and Substance Abuse (2013)
Explainable Intelligent Environments

Davide Carneiro1,2(B), Fábio Silva1,2, Miguel Guimarães1, Daniel Sousa1, and Paulo Novais2

1 CIICESI, Escola Superior de Tecnologia e Gestão, Instituto Politécnico do Porto, Felgueiras, Portugal
{dcarneiro,fas,8150520,8160334}@estg.ipp.pt
2 Algoritmi Centre/Department of Informatics, Universidade do Minho, Braga, Portugal
[email protected]
Abstract. The main focus of an intelligent environment, as with other applications of Artificial Intelligence, is generally on the provision of good decisions towards the management of the environment or the support of human decision-making processes. The quality of the system is often measured in terms of accuracy or other performance metrics calculated on labeled data. Other equally important aspects are usually disregarded, such as the ability to produce an intelligible explanation for the user of the environment. That is, aside from proposing an action, prediction, or decision, the system should also propose an explanation that allows the user to understand the rationale behind the output. This is becoming increasingly important at a time in which algorithms gain growing importance in our lives and start to take decisions that significantly impact them. So much so that the EU recently regulated on the issue of a “right to explanation”. In this paper we propose a Human-centric intelligent environment that takes into consideration the domain of the problem and the mental model of the Human expert to provide intelligible explanations that can improve the efficiency and quality of decision-making processes.

Keywords: Intelligent environments · Explainable AI · Fraud detection

1 Introduction
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2021. P. Novais et al. (Eds.): ISAmI 2020, AISC 1239, pp. 34–43, 2021. https://doi.org/10.1007/978-3-030-58356-9_4

Artificial Intelligence is nowadays used in virtually all aspects of our lives, controlling our routines in pervasive and transparent ways, yet taking decisions with significant influence. These applications range from innocuous ones such as image or speech classification, used in our smartphones and virtual assistants [1], to critical ones such as autonomous vehicle driving, health diagnostics, or crime/recidivism risk assessment [2].

Generally, the more complex the problem/domain, the more complex the learned models are and, consequently, the harder they are to understand. This poses an interpretability problem: we often get a decision from a model, but we lack the information to properly judge and evaluate it. How good is it? How good are neighboring decisions? What is the rationale behind it?

There are domains in which the lack of an explanation is not relevant. However, in domains in which the lives of people are significantly affected, explanations are of the utmost importance. For instance, an individual should not be sent to jail, nor a credit card denied, with a simple “yes or no” answer. Such decisions should come with a proper explanation that allows the interested parties to understand the reasons behind them.

Indeed, we often fail to understand how these complex models work. This is not a problem while models work as expected. However, when there is the need to debug them, we often learn that we do not understand their inner workings. One of the best arguments in favor of the need for explanations, even when a model is apparently working appropriately, is given by [3]. The authors conducted an experiment whose task was to classify pictures containing either wolves or huskies. While the model performed fairly well, the use of saliency maps showed that the model was not deciding based on the pixels that constituted the animal, but was actually using the background of the picture, which contained mostly snow in the case of wolves and grass in the case of huskies. If we were to provide the model with an image of a wolf standing on a grassy background, it would probably get it wrong and we would have no idea why.

The need for explanations in AI is thus evident, much more so in critical applications. Indeed, the EU recently regulated on the “right to explanation” [4], ensuring that any critical and binding decision made by an automated algorithm must be accompanied by an intelligible explanation.
In line with this view, in this paper we propose a human-in-the-loop system that combines Human experts and Machine Learning. The system continuously learns from the interaction with the Human experts, and the efficiency of this process is improved through elements of explainable AI such as interpretability, interactivity, and counterfactual analyses. The system is also developed bearing in mind the mental model of the Human expert and the specific domain of fraud detection. However, the approach is general enough to be used in other domains.
2 Explanations and Human Factors
The concept of Explainable Artificial Intelligence (xAI) relates to the ability of a Human to understand the decision process of algorithms. In this context it is important to first distinguish between two important terms: explanation and interpretability. Indeed, one can explain a decision process without actually understanding the model which generated that decision, or the intricate relationships between cause and effect in the decision process [5].

Thus, the ability to understand how a decision algorithm behaves when its inputs are slightly altered relates to the interpretability of the model: in other words, the ability to predict how changes in the input change the decision output. On the other hand, explainability relates to how human cognition can understand the mechanics of the decision from its natural perception. The subtle difference is that to explain a decision we do not need to understand how that decision could be altered if the inputs were different. An explanation can also vary according to its degree of completeness, which is the extent to which it allows a complete understanding of all the domains for each attribute in the decision-making process [6].

Explanation is naturally easier in some models, namely statistical or rule-based algorithms. It is much harder and less intuitive in ensemble models or under the umbrella of connectionist methods, namely with algorithms such as Recurrent Neural Networks (RNNs). Indeed, explanations and interpretability are particularly difficult in these so-called “black-box” models, which are characterized by high complexity and abstraction levels. Nonetheless, many different approaches are being undertaken in both explainable and black-box models, which are reviewed in the next section.

2.1 Approaches to Enhance Explainability and Interpretability
The research community has developed several approaches to improve explanations and interpretability in Machine Learning (ML) models. These approaches are sometimes specific to a given algorithm, or generic and applicable to a broad range of them.

One of the most interesting examples is the use of counterfactuals, or evidence based on the interpretability of the model. These require a deep understanding of the machine learning model being used and of how changes in the input may alter the decision outcome [7]. Such explanations are characterized by a complete description of a specific decision and/or of how the decision would be altered given some changes in the input. This is a generic idea which may have different implementations depending on the algorithm being studied. In the literature we can find this approach in linear classification algorithms [8], where a linear machine learning model is exploited to find how changes in coefficients or inputs change the final decision.

Black-box models, such as multilayer perceptrons, can also embed this approach. In [9], a genetic algorithm is used to search an output domain to provide suggestions for credit risk assessment, which can be perceived as an approach to interpret and explain a neural network's decision process. This approach is similar to a technique known as LIME (Local Interpretable Model-Agnostic Explanations) [3], which develops an approximation of the model by testing what happens when certain aspects of the model's input are changed: it attempts to recreate the outputs through a process of experimentation. Still in the domain of credit scoring, there are also examples of ensemble explanation, implemented through layers of interpretability of machine learning models [10]. In this approach, the decision-making process is explained in different steps by an expert rule-based system.
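The perturbation idea behind LIME can be sketched as follows: sample around an instance, query the black box, and fit a proximity-weighted linear model whose coefficients act as a local explanation. The black-box function, the sampling kernel and its scale below are illustrative only; real LIME implementations also handle discretization and sparse explanations.

```python
import numpy as np

rng = np.random.default_rng(0)

def black_box(X):
    # Stand-in for an opaque model (e.g. a neural network):
    # sigmoid of 2*x1 - 3*x2, so x1 pushes the score up, x2 down.
    return 1.0 / (1.0 + np.exp(-(2.0 * X[:, 0] - 3.0 * X[:, 1])))

def local_surrogate(x0, n=500, scale=0.1):
    """LIME-style sketch: perturb around x0, weight samples by proximity,
    and fit a weighted linear model as a local explanation."""
    X = x0 + rng.normal(0.0, scale, size=(n, x0.size))
    y = black_box(X)
    w = np.exp(-np.sum((X - x0) ** 2, axis=1) / (2 * scale ** 2))  # proximity kernel
    A = np.hstack([X, np.ones((n, 1))])                            # add an intercept
    sw = np.sqrt(w)[:, None]                                       # weighted least squares
    coef, *_ = np.linalg.lstsq(A * sw, y * sw.ravel(), rcond=None)
    return coef[:-1]  # per-feature local importance (intercept dropped)

c = local_surrogate(np.array([0.5, 0.5]))
print(c)  # first coefficient positive, second negative, mirroring the black box
```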
In the case of black-box models, there are techniques to recreate the decision process through the analysis of the internals of such models. In the case of neural networks and deep learning models, there is a technique called DeepLIFT [11]. It works by taking the output and attempting to interpret the neurons that are significant to the original output. In short, it performs a sort of feature selection to explain the decision process based on the activated neurons. A similar approach to DeepLIFT is the layer-wise relevance propagation technique [12]. It also works backwards from the output, identifying the most relevant neurons within the neural network.

The general perception is that all models can be explained to some extent, some more than others. Moreover, some are easy to explain (generally those under a symbolic approach to AI) while others are more challenging (generally the connectionist models). However, explanations should also consider the mental model of the user and the domain of the application. In this paper we describe an intelligent environment for the domain of fraud detection that incorporates a series of concepts from explainable systems and is built to integrate with the work of a Human auditor.
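The backwards-redistribution idea behind layer-wise relevance propagation can be sketched for a tiny ReLU network. The weights and input are hypothetical, and this uses the basic LRP-0 rule (each neuron's relevance is split in proportion to the contributions a_j * w_jk); practical implementations use stabilized variants.

```python
import numpy as np

# Tiny two-layer ReLU network with hypothetical, fixed weights.
W1 = np.array([[ 1.0, -0.5,  0.2],
               [ 0.3,  0.8, -0.1],
               [-0.2,  0.4,  0.6],
               [ 0.5, -0.3,  0.9]])
W2 = np.array([0.7, -0.4, 0.5])

def forward(x):
    a1 = np.maximum(0.0, x @ W1)   # hidden ReLU activations
    return a1, a1 @ W2             # hidden layer, scalar output

def lrp(x, eps=1e-9):
    """LRP-0 sketch: push the output score backwards through the net,
    layer by layer, in proportion to each connection's contribution."""
    a1, out = forward(x)
    z2 = a1 * W2                                     # hidden -> output contributions
    r_hidden = z2 / (z2.sum() + eps) * out           # relevance of hidden neurons
    z1 = x[:, None] * W1                             # input -> hidden contributions
    r_input = (z1 / (z1.sum(axis=0) + eps) * r_hidden).sum(axis=1)
    return r_input                                   # relevance of each input feature

x = np.array([1.0, 0.5, -0.3, 2.0])
a1, out = forward(x)
print(lrp(x))  # per-input relevance; the scores sum (approximately) to the output
```

A useful sanity check on such rules is conservation: the relevance assigned to the inputs should add up to the network's output score.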
3 An Explainable Intelligent Environment for Tax Fraud Detection
The importance of explaining decisions in an Intelligent Environment has already been addressed in Sect. 2. However, nowadays, explanations are not only desirable from a perspective of interpretability but are starting to become a legal requirement. In the context of the GDPR, the EU recently regulated on algorithmic decision-making and, specifically, addressed the issue of a “right to explanation” [4]. There are particularly sensitive domains in which algorithmic decisions significantly affect one's life, such as credit scoring, sentencing, or fraud detection.

In this paper we present one such environment, in the domain of financial fraud detection, in the context of the Neurat funded project (31/SI/2017 39900). This environment is being built as a cooperative system in which Machine Learning tools and Human experts work together to increase the efficiency of tax audits. However, the use of Machine Learning, and in particular of supervised methods, requires vast amounts of labeled data. The problem is that data can only be labeled by Human experts (auditors) and, in this case, it comes at a high cost: auditors must undergo extensive training and their time is very limited. As a consequence, they are able to review only a small portion of the transactions of a company, usually by sampling, and thus provide a small amount of labeled data.

An Active Learning (AL) approach is being followed to implement this environment [13]. Generally, AL approaches aim to make ML less expensive by reducing the need for labeled data. To achieve this, a so-called Oracle, which may be a Human expert or some automated artifact, is included in a cycle in which a ML model is continuously improved by training on a growing pool of labeled data. The key element in this approach is the selection strategy for unlabeled
data, which will optimize selection queries so that learning occurs faster. Different data selection strategies may be implemented, but the goal is the same: to cover the search space as quickly as possible, minimizing the necessary labeled data. ML accuracy is maintained while the training set size is reduced.

However, we introduce two major changes to the “traditional” AL scheme (Fig. 1). First, we consider a pool of models rather than a single model [14]. New models are trained and added to the pool, which constitutes a voting/averaging ensemble whose weights are continuously optimized by a Genetic Algorithm. Over time, models with a smaller weight are removed from the ensemble. This allows the system to converge while using relatively simple models, trained with partial data, instead of a very large and complex one. Secondly, we add another input to the Oracle, which in this case is the Human auditor. The auditor has access to the selected instance i, which is now accompanied by a prediction p and an explanation e. Both are provided by the ensemble f as the result of f(i), that is, of asking the current ensemble to classify a specific instance. Now, when the auditor receives the instance to label (that is, when the auditor performs an audit action), they also receive the label proposed by the system as well as an intelligible explanation for it, tailored to this specific domain.
Fig. 1. Overview of the main elements of the proposed environment for fraud detection.
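The modified Active Learning loop can be sketched in Python. Everything below is a stand-in: toy one-feature data, threshold stumps instead of the real models, an automatic oracle in place of the Human auditor, and an unweighted vote in place of the GA-optimized ensemble weights described above.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy "transactions": one feature; ground-truth fraud when the feature > 0.6.
X = rng.random(200)
truth = (X > 0.6).astype(float)

labeled = {0, 1}            # indices the "auditor" has already labeled
ensemble = []               # growing pool of simple models

def fit_stump(idx):
    """Train a one-feature threshold stump on the current labeled pool."""
    ii = list(idx)
    xs, ys = X[ii], truth[ii]
    best = min(xs, key=lambda t: np.mean((ys - (xs > t)) ** 2))
    return lambda x, t=best: (x > t).astype(float)

def predict(x):
    # Unweighted average vote; the paper's system instead optimizes
    # per-model weights with a Genetic Algorithm and prunes weak models.
    return np.mean([m(x) for m in ensemble], axis=0)

for _ in range(20):                            # active-learning rounds
    ensemble.append(fit_stump(labeled))
    votes = np.array([m(X) for m in ensemble])
    # Selection strategy: query where the ensemble disagrees the most
    # (a small random jitter breaks ties in early rounds).
    uncertainty = votes.std(axis=0) + 0.01 * rng.random(len(X))
    uncertainty[list(labeled)] = -1.0          # never re-query audited items
    labeled.add(int(np.argmax(uncertainty)))   # the oracle supplies the label

acc = float(np.mean(predict(X).round() == truth))
print(f"accuracy after 20 queries: {acc:.2f}")
```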
To achieve this, we are using a modified version of the CART algorithm [15]. This algorithm builds a Decision Tree from a group of observations. Each node of the tree contains a boolean rule about the observations (e.g. the value of variable x is greater than y) and each leaf contains the result of the prediction for a given path in the tree. While the tree is being built, the training set is increasingly split at each node, leading to smaller subsets of the data. This splitting process ends when one or more stopping criteria are met, which may include a minimum size of the split or a minimum degree of variance/purity. Variance denotes how much the values of the dependent variable of a split are spread around their mean value (in regression tasks), while purity considers
the relative frequency of classes: if all classes have roughly the same frequency the node is deemed “impure”. The Gini index is used in the CART algorithm to measure impurity [16]. Formula 1, as proposed by [17], describes the relationship between the outcome y and the features x. Each instance of the training set is attributed to a single leaf node (subset R_m). I{x ∈ R_m} is a function that returns 1 if x is in the subset R_m and 0 otherwise. In a regression problem the predicted outcome y = c_l of a leaf node R_l is given by the average value of the instances in that same node.

    y = f(x) = \sum_{m=1}^{M} c_m I\{x \in R_m\}    (1)
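A minimal sketch of these two ingredients, the Gini impurity and the prediction rule of Formula 1, assuming a hypothetical one-feature tree whose leaves R_m are intervals with mean certainty scores c_m:

```python
import numpy as np

def gini(labels):
    """Gini impurity of a node: 1 - sum_k p_k^2 (lower means purer)."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def predict(x, leaves):
    """Formula (1): y = sum_m c_m * I{x in R_m}, where c_m is the mean of
    the training instances falling in leaf R_m (regression setting)."""
    return sum(c * (lo <= x < hi) for (lo, hi), c in leaves.items())

# Hypothetical leaf intervals R_m with mean fraud-certainty scores c_m
# (on the 0-10 scale used in the paper).
leaves = {(0.0, 0.3): 1.2, (0.3, 0.7): 4.8, (0.7, 1.0): 9.1}

print(gini(np.array([0, 0, 1, 1])))   # maximally impure two-class node: 0.5
print(predict(0.5, leaves))           # the instance falls in the middle leaf: 4.8
```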
While the algorithm can be used for both classification and regression tasks, in this work we use a regression tree, as the task is to assign a value between 0 and 10 representing the degree of certainty that a given instance constitutes fraud.

3.1 Generating Interpretable Explanations
A Decision Tree is, in itself, an explainable model: it can be analyzed visually to understand which variables and values are used at each level to take a decision. However, this may be difficult, for example, if the tree is too large. There is also additional information that can be provided that is not explicitly in the tree's structure. In this section we detail the explainable elements that are generated by the system to support the Human auditor in decision-making.

As the tree is being built and each split generated, additional information is stored in the node, including: the boolean rule that generates the split (mentioning the variable and the value interval), the prediction y based on that split (i.e. the average or most frequent value, depending on the problem), measures of dispersion or purity (variance, standard deviation and Gini index), and the indexes of the instances in the split. These values are then used to provide a notion of confidence and support to the decision-maker. Confidence is given by the dispersion and purity measures: the lower the dispersion or the higher the purity, the higher the confidence in the decision. Support is given by the number of instances in the split: the higher the number of instances, the higher the support.

This information on the nodes allows a group of explainable elements to be incorporated in the user interface. Figure 2 shows a prototype of the graphical user interface that is used to provide explanations. When auditors want to analyze a specific instance, they select that instance and are redirected to this interface, which receives the data of the instance, the prediction, and an explanation. The user interface has three main areas, marked in the Figure as (a) Explore, (b) Decision Path and (c) Last Results.

Area (a) allows the user to explore the search space and analyze each feature according to their relative importance. Features and values are collected
Fig. 2. Prototype of the main screen of the application, with some of the explainable elements created, and three main areas highlighted: Explore (a), Decision Path (b) and Last Results (c).
from the internal nodes when traversing the tree to make a prediction. In this context, feature relevance is based on how much that split/feature decreases dispersion (or increases purity). For each feature, the interface shows the following elements (depending on whether the variable is numeric or nominal): the domain of the feature (range/enumeration of possible values), the interval/values for which the prediction holds (blue bar or values highlighted in blue), and the value of the feature in the instance being audited (gray dot).

This allows the auditor to gain a sense of how risky the decision is. If the value of a given feature is very close to the upper or lower limits of the blue bar, it indicates that a slight change of this feature towards the limit would significantly alter the prediction of the tree. Likewise, the size of the blue bar is also related to this sense of risk: the shorter the bar, the riskier the decision. In the case of a nominal feature, multiple values can be highlighted to show for which values of the enumeration the prediction holds. The risk of the decision grows with fewer highlighted values.

In Fig. 2, the graphical interface is shown in “Edit Mode”. This means that the user may change the values of the variables to perform a counterfactual analysis, that is, to see what the prediction would be if the value of a feature had been v2 instead of v1. These “what-if” scenarios allow the auditor to interact with the tree and to understand how predictions would change under different scenarios. This contributes significantly to the interpretability and interactivity of the explanation, as addressed in Sect. 2. The user does this by changing the value of a feature by means of a slider, or by selecting a value from a list. The scenarios created by the user can be added to area (c) to be compared. The user
can also reset area (a), returning all the values and the associated prediction to the initial state of the instance being audited.

There is also a pagination mechanism that controls the amount of information provided to the user, to avoid overload. Indeed, depending on the training set, the number of levels/nodes/features in a tree may be too large to be efficiently analyzed by a Human. For this reason, the interface shows only the n most relevant features. The user can then request additional features (and the associated prediction) by clicking on the “More variables” button. These are gradually added, upon request, by decreasing relevance.

On the left side of the interface is the area marked as (b). This area shows the path followed through the tree to make the prediction. Like (a), this area may not show the whole path, as it implements the same pagination mechanism: when features are added to (a) they are also added to (b). This element allows the user to understand (part of) the reasons for a given prediction: “because feature f1 is smaller than or equal to v1 and feature f2 equals v2”. In this area the user may also click on a specific node to see its details (Fig. 3).

The details show, on the left side, the information for the feature that is also visible in (a). On the center and right, the “details” modal provides information regarding the confidence and support of the prediction. The graphical representation shows the prediction (blue dot) and the interval given by the standard deviation. A smaller interval indicates increased confidence, as instances in this split are more closely distributed around the mean, and vice-versa.
Fig. 3. Details of a split node, with confidence and support measures.
The central part of the modal shows values which include the support (the number of instances in this split) and a button that allows the user to access the instances that fall into this split. The user may thus visualize the instances,
which are shown sorted by similarity to the current instance, in descending order. Similarity is calculated based on a weighted sum of differences, given by the Euclidean distance for numerical variables and by the cosine similarity for the vector of nominal data (if any). While visualizing specific instances, the user may add them to a list for comparison (area (c)).

As the user moves down the path, splits become smaller but confidence increases. It is up to the user to decide how far down to travel: an early stop may lead to a more general decision (with high support and potentially low confidence), while going further down will lead to low support but high confidence.

Finally, in area (c) the user has access to a list of previous prediction results (the scenarios that were simulated) and/or to actual instances that were visualized by the user and added for comparison. This makes it easier to compare a group of scenarios or real cases and their results.
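The counterfactual “what-if” interaction and the similarity-based ordering can be sketched as follows. The stand-in tree, feature names and instances are hypothetical, and only the Euclidean part of the similarity measure is shown.

```python
import numpy as np

# Hypothetical audited instance and comparison pool; the feature
# names are illustrative only.
instance = {"amount": 0.82, "n_tx": 0.40}
pool = [
    {"amount": 0.80, "n_tx": 0.35},
    {"amount": 0.20, "n_tx": 0.90},
]

def predict(inst):
    """Stand-in for the regression tree: fraud-certainty score in [0, 10]."""
    return 9.1 if inst["amount"] >= 0.7 else 4.8

def what_if(inst, feature, value):
    """Counterfactual: re-run the prediction with one feature changed."""
    changed = {**inst, feature: value}
    return predict(changed)

def similarity(a, b):
    """Euclidean-based similarity for numeric features (the paper adds
    cosine similarity for the nominal part, omitted here)."""
    va = np.array([a[k] for k in sorted(a)])
    vb = np.array([b[k] for k in sorted(b)])
    return 1.0 / (1.0 + np.linalg.norm(va - vb))

print(predict(instance))                   # current prediction: 9.1
print(what_if(instance, "amount", 0.5))    # counterfactual scenario: 4.8
ranked = sorted(pool, key=lambda p: similarity(instance, p), reverse=True)
print(ranked[0])                           # nearest neighbour is listed first
```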
4 Conclusions and Future Work
With the growing use of AI models in our daily lives and the impact of their decisions, their inner workings must be more closely scrutinized. More and more, we require not only a decision or a prediction, but also an intelligible explanation that we can use to judge the quality of the decision. However, the vast majority of existing AI systems do not consider these elements.

In this paper we presented an adapted version of a human-in-the-loop system, based on Active Learning. We expand the “traditional” process flow with the provision of predictions and corresponding explanations for the unlabeled data that is presented to the Human expert. We believe that the provision of these explanations will contribute to the efficiency of the interaction between the Human and the system, as well as to the quality of the decisions made by the Human. The quantification of such improvements will be carried out in future work.

Among other aspects, the proposed system considers elements such as interactivity, counterfactual explanations, simulation, and rule-based explanations. The approach was developed taking into consideration the mental model of the auditor. Nonetheless, it is generic enough to be used in other domains, thus contributing to an increased awareness of users towards the Machine Learning models that they interact with.

Acknowledgments. This work was supported by the Northern Regional Operational Program, Portugal 2020 and the European Union, through the European Regional Development Fund (ERDF), in the scope of project number 39900 - 31/SI/2017, and by FCT - Fundação para a Ciência e a Tecnologia, through projects UIDB/04728/2020 and UID/CEC/00319/2019.
References

1. Ververidis, D., Kotropoulos, C., Pitas, I.: Automatic emotional speech classification. In: 2004 IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 1, pp. I–593. IEEE (2004)
2. Crawford, K.: Artificial intelligence's white guy problem. N.Y. Times 25 (2016)
3. Ribeiro, M.T., Singh, S., Guestrin, C.: “Why should I trust you?” Explaining the predictions of any classifier. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1135–1144 (2016)
4. Goodman, B., Flaxman, S.: European Union regulations on algorithmic decision-making and a “right to explanation”. AI Mag. 38(3), 50–57 (2017)
5. Došilović, F.K., Brčić, M., Hlupić, N.: Explainable artificial intelligence: a survey. In: Proceedings of the 2018 41st International Convention on Information and Communication Technology, Electronics and Microelectronics, MIPRO 2018, pp. 210–215, May 2018
6. Gilpin, L.H., Bau, D., Yuan, B.Z., Bajwa, A., Specter, M., Kagal, L.: Explaining explanations: an overview of interpretability of machine learning. In: Proceedings of the 2018 IEEE 5th International Conference on Data Science and Advanced Analytics, DSAA 2018, pp. 80–89 (2019)
7. Sokol, K., Flach, P.: Desiderata for interpretability: explaining decision tree predictions with counterfactuals. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 33, pp. 10035–10036 (2019)
8. Ustun, B., Spangher, A., Liu, Y.: Actionable recourse in linear classification. In: FAT* 2019 - Proceedings of the 2019 Conference on Fairness, Accountability, and Transparency, pp. 10–19 (2019)
9. Silva, F., Analide, C.: Information asset analysis: credit scoring and credit suggestion. Int. J. Electron. Bus. 9(3), 203 (2011)
10. Chen, C., Lin, K., Rudin, C., Shaposhnik, Y., Wang, S., Wang, T.: An interpretable model with globally consistent explanations for credit risk, pp. 1–10. arXiv preprint arXiv:1811.12615, November 2018
11. Shrikumar, A., Greenside, P., Kundaje, A.: Learning important features through propagating activation differences. In: 34th International Conference on Machine Learning, ICML 2017, vol. 7, pp. 4844–4866 (2017)
12. Binder, A., Montavon, G., Lapuschkin, S., Müller, K.R., Samek, W.: Layer-wise relevance propagation for neural networks with local renormalization layers. In: Lecture Notes in Computer Science, vol. 9887, pp. 63–71 (2016)
13. Settles, B.: From theories to queries: active learning in practice. In: Active Learning and Experimental Design Workshop in Conjunction with AISTATS, vol. 2010, pp. 1–18 (2011)
14. Ramos, D., Carneiro, D., Novais, P.: evoRF: an evolutionary approach to random forests. In: International Symposium on Intelligent and Distributed Computing, pp. 102–107. Springer (2019)
15. Singh, S., Gupta, P.: Comparative study of ID3, CART and C4.5 decision tree algorithms: a survey. Int. J. Adv. Inf. Sci. Technol. (IJAIST) 27(27), 97–103 (2014)
16. Lerman, R.I., Yitzhaki, S.: A note on the calculation and interpretation of the Gini index. Econ. Lett. 15(3–4), 363–368 (1984)
17. Molnar, C.: Interpretable Machine Learning. Lulu.com, Morrisville (2019)
Overcoming Challenges in Healthcare Interoperability Regulatory Compliance

António Castanheira, Hugo Peixoto(B), and José Machado

Centro Algoritmi, Universidade do Minho, Campus de Gualtar, 4700 Braga, Portugal
{hpeixoto,jmac}@di.uminho.pt
Abstract. There has been a significant increase in the quantity of information stored digitally by health institutions. Such information contains personal data from the actors in their universe; thus, it is crucial that it is governed by a set of rules, so that it can be understood without losing important data. With the increased use of digital tools for storing and exchanging information, ethical issues began to arise around the privacy of personal data. Questions about the access, processing, treatment and storage of personal data became increasingly important in society, leading to the creation of the General Data Protection Regulation (GDPR), in force at European level. GDPR is one of the main challenges in healthcare interoperability regulatory compliance. The proposed architecture therefore shows an approach to enforce GDPR compliance in the Agency for Integration, Diffusion and Archive Platform (AIDA), used by several healthcare units in Portugal, employing technologies such as ElasticSearch and Kibana.
Keywords: GDPR · ElasticSearch · Interoperability · Electronic Health Record

1 Introduction
In the last decades, the world has seen significant technological advancements, leading to a radical change in health administration systems: the information that was in the past stored on paper is now stored in digital systems using Electronic Healthcare Records, which can be defined as digitally stored health care information about an individual's lifetime [1]. One of the platforms that appeared with those advancements was the Agency for Integration, Diffusion and Archive of Medical Information (AIDA), an agent-based platform that ensures interoperability between health facilities [2], currently operating in several Portuguese health facilities (Porto Hospital Center, Alto Ave Hospital Center, Tâmega and Sousa Hospital Center and the Local Health Unit of Alto Alentejo) [3,4]. The increased use of digital tools to store and exchange information raised ethical issues in the context of privacy of personal data. Questions about access,

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2021
P. Novais et al. (Eds.): ISAmI 2020, AISC 1239, pp. 44–53, 2021. https://doi.org/10.1007/978-3-030-58356-9_5
treatment and storage of personal data became more important in society, leading to the creation of the GDPR, in force at European level. That regulation states that the controller of the data shall maintain a record of processing activities under its responsibility [5]. Therefore, as the AIDA platform manages personal information, its data treatment operations must be monitored. The solution proposed in this paper is a monitoring tool, using ElasticSearch to store the records and Kibana for the visualisation of the data treatment activities kept in them. The methodology used in the development of this tool was Design Science Research [6]. The first step is the identification of the problems and motivations: in this case, the problem is compliance with GDPR and the challenges in healthcare interoperability, which require the addition of a module to monitor the operations performed. The second step is to delineate the objectives: monitoring the data treatment activities, using tools to keep and inspect the records generated by the AIDA platform. The third step is the design and development of the module, described in Sect. 3. Section 3 also presents the demonstration and evaluation steps.
2 Background

2.1 Interoperability
The concept of interoperability has different nuances: some define it as the ability of two or more systems to exchange information and use it [7]; others as the ability of two parties, either human or machine, to exchange data or information [8]. This concept has gained importance in health care, as hospital systems increasingly consist of multiple components, often built on different hardware, languages and operating systems, which can incur heavy financial and bureaucratic costs and ultimately make communication between them difficult [4,9]. To facilitate the exchange of information, the Health Level Seven organisation created standards that specify the syntax of the information exchanged by the components and how it should be exchanged. As shown in Fig. 1, there are different levels of interoperability, differing in the communication capacity and the information shared between two entities [10]:

– Level 0: There is no communication between the systems.
– Level 1: There is a communication protocol between the systems.
– Level 2: Information exchanged between systems has a common format and structure.
– Level 3: In addition to the information exchanged between systems having a common format, its meaning is also interpreted in the same way.
– Level 4: All stakeholders know the context in which information is exchanged.
– Level 5: Systems have the ability to take advantage of state changes and understand the effects of information exchange.
– Level 6: Stakeholders conform to the assumptions and constraints of each real environment.
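As an illustration (not part of the cited model's formal definition), the levels above can be treated as an ordered scale: the interoperability two systems can actually achieve is bounded by the lower-capability of the two. A minimal JavaScript sketch, with level names taken from the model:

```javascript
// Levels of Conceptual Interoperability, as an ordered scale 0..6.
const LCIM_LEVELS = [
  "No interoperability", // 0: no communication
  "Technical",           // 1: shared communication protocol
  "Syntactic",           // 2: common format and structure
  "Semantic",            // 3: shared meaning of the data
  "Pragmatic",           // 4: shared context of the exchange
  "Dynamic",             // 5: shared understanding of state changes
  "Conceptual",          // 6: shared assumptions and constraints
];

// Two systems can only interoperate at the lower of their supported levels.
function effectiveLevel(systemA, systemB) {
  return Math.min(systemA, systemB);
}

// A semantic-capable system talking to a syntactic-only one
// is limited to syntactic interoperability.
console.log(LCIM_LEVELS[effectiveLevel(3, 2)]); // "Syntactic"
```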
Fig. 1. Interoperability levels
2.2 AIDA Platform
AIDA is an agent-based platform that ensures interoperability between health care facilities [2], currently operating in the Porto Hospital Center, Alto Ave Hospital Center, Tâmega and Sousa Hospital Center and the Local Health Unit of Alto Alentejo, all Portuguese health facilities [3]. The platform communicates with the components of each health facility, and sends, receives, manages and stores information regarding the functioning of the facilities. It exposes several web-service-based interfaces that allow queries to be executed, the data warehouse to be managed, and the information to be visualised easily. Its main goal is to allow the diffusion and integration of the information generated in health facilities. The platform uses a set of standards from the Health Level Seven organisation for the exchange, sharing, integration and retrieval of electronic health information, namely HL7 v2, which defines the electronic exchange of clinical data between systems [11].

2.3 Mirth Connect
Mirth Connect is an open-source health care messaging integration engine that uses a channel-based architecture to connect systems and allows messages to be filtered, routed or transformed based on user-defined rules [12,13]. It comprises a server, which works as a channel container, and a client, which provides an interface used to develop, test, deploy and monitor channels. A channel is constituted by four components: Source Connector, Filter, Transformer and Destination Connector. In the AIDA platform this engine acts as an intermediary between the components of the platform. The source connector receives messages from a component and can be configured to listen for messages or to fetch them from it. The filter component determines which messages should be accepted, based on a set of configurable rules. The transformer component extracts the data from the incoming message and parses it to the format in which it will
be sent to its destination. The destination connector handles the routing of the parsed message, connecting to components and transmitting it; it is also able to send the same message to multiple components.

2.4 ElasticSearch
Databases are a crucial element in almost every application nowadays, because they are the most reliable way of storing data. In the early days, file systems were the solution most often adopted when building an application; as time went by, databases appeared. They allow more flexibility than a file when storing large amounts of data, because they let an application program retrieve the smallest pieces of that information faster. Also, as applications tend to have many people looking at the same information, and possibly changing it, coordination is necessary. That coordination is achieved easily because database management systems handle concurrency using transactions during access to the database. The relational model is the most used in the market, with multiple solutions such as Postgres, MySQL or Oracle. However, the increasing use of digital systems generated huge data sets, and solutions that can handle such volumes became important. It was in this sense that the NoSQL movement appeared, bringing solutions like document-oriented databases. In the health area this model has become important because an increasing amount of information is generated in the hospital environment. The amount of data circulating in a system such as AIDA can be used to identify needs, provide services or monitor the operation of that system to ensure security and reliability. ElasticSearch is a RESTful document-oriented database. It is built to handle large volumes of data while being highly available, fault tolerant and scalable, and able to distribute itself across multiple machines. These capabilities allow it to be used to monitor applications through their records or for search functions, as it has the capacity to store the huge amount of information generated during the activity of those same applications.
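ElasticSearch places each document on a shard deterministically, so any node can locate it later: the default rule hashes a routing value (the document id, unless overridden) modulo the number of primary shards. A simplified JavaScript sketch of this rule (the real implementation uses Murmur3 hashing; the hash below is only illustrative):

```javascript
// Simplified stand-in for ElasticSearch's routing hash (really Murmur3).
function toyHash(s) {
  let h = 0;
  for (const ch of s) h = (h * 31 + ch.codePointAt(0)) >>> 0;
  return h;
}

// shard = hash(routing) % number_of_primary_shards
function routeToShard(docId, numPrimaryShards) {
  return toyHash(docId) % numPrimaryShards;
}

// The same id always lands on the same shard, spreading
// documents roughly evenly across the index's shards.
const shard = routeToShard("message-42", 4);
console.log(shard); // an integer in 0..3
```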
Each ElasticSearch instance is called a node, and the server can operate with only one, in single-server mode, or with more than one, as a cluster. ElasticSearch works in a distributed manner by acting Peer-to-Peer (P2P). A P2P network can be defined as a set of entities that share hardware resources (in this case, storage capacity) and are directly accessible to each other [14]. Documents are divided into indexes, each index being divided into shards, whose number is defined when the index is created. Documents have an associated type, which can be set when the index is created or be set dynamically by ElasticSearch when the document is inserted [15]. A mapping defines how each field is stored and indexed, including which parameters are filtered or removed and how text fields are handled [16]. Figures 2 and 3 show an ElasticSearch server, first with one node containing 2 indexes, each one with 4 shards, working in single-server mode (Fig. 2) and, after adding a new ElasticSearch node, in distributed mode (Fig. 3). As can be
Fig. 2. Single server mode
Fig. 3. Distributed mode
seen, the indexes are replicated to the new node's shards and the shards are spread over the 2 nodes in the cluster to balance server load. In Fig. 2, requests placed on an index are sent to all shards of that index and executed on each shard; the information found is aggregated and the response sent to the client. If a request is placed on all indexes, it is executed on all shards of both indexes. In the case of Fig. 3, requests are replicated to both nodes and then treated by each one as in single-server mode. Document insertion works the same way in both examples, with the ElasticSearch algorithm defining in which shard the document is stored and indexed [17], so the information can be split across all the shards in equal parts.

2.5 Docker
The increased use of technological resources to respond to the current market made necessary technologies that make the migration of applications between machines quick and easy. Virtualisation technologies were a step in that direction, as they eliminate dependencies on the machines. With this type of technology it is also easier to distribute an application across several virtual machines, thus spreading some of the processing performed by the application. Docker is a virtualisation technology that uses Linux Containers to standardise applications. A container is a standard unit of software that encapsulates the code and its dependencies, thus allowing the code to run in any computational environment [18]. Using this technology it is possible to run several applications on the same machine without spending many resources, thus avoiding the high costs that would result from the use of multiple machines.

2.6 Kibana
Kibana is an open-source ElasticSearch plugin that serves as a visualisation layer for the data stored in the indexes of this database. It provides various types of
visualisations in the form of data tables, charts, maps, histograms, heat maps or gauges. The main view of Kibana is divided into 4 main pages: Management, Discovery, Visualisation and Dashboards [19]. The management page allows the configuration of Kibana's internal settings, as well as editing the ElasticSearch indexes. It is possible to merge several ElasticSearch indexes into one Kibana index pattern, allowing disparate data sources to be queried together. In addition, it is possible to modify field formatters to alter how a field value is shown in Kibana. The discovery page allows the listing of the documents stored in a certain index and may include parameter filters. It is also possible to observe the total amount of stored documents and the amount over time, with this time interval being adjustable. Document listings with filters can be saved, to be used on other pages. The visualisation page makes possible the creation and editing of visualisations of the data. Views of various types can be created, for example the count of patient-related messages stored, or the variation of a given parameter over time. The dashboard page allows the creation, editing and viewing of dashboards, built using the graphics and counts created on the visualisation page. It is also possible to export dashboards, allowing them to be viewed on web pages. Despite all these features, Kibana possesses, in its open-source version, one main drawback: it has no user management. This feature is only available in the paid version. However, ElasticSearch has a plugin, SearchGuard, that offers encryption, user authentication and authorisation.
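Behind the discovery and dashboard views, Kibana issues ElasticSearch queries. A filter such as "messages for a given destination in the last 15 minutes" maps onto the ElasticSearch Query DSL roughly as follows; the field names (destination, timestamp) are assumptions for illustration, not the actual index schema:

```javascript
// Build an ElasticSearch Query DSL body combining a term filter
// (exact destination) with a relative time range, as Kibana does.
function lastMinutesByDestination(destination, minutes) {
  return {
    query: {
      bool: {
        filter: [
          { term: { destination: destination } },
          { range: { timestamp: { gte: `now-${minutes}m` } } },
        ],
      },
    },
  };
}

const body = lastMinutesByDestination("CARDIOLOGY", 15);
console.log(JSON.stringify(body, null, 2));
```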
3 Implementation
The proposed solution is a system that stores most of the information in the messages exchanged by the AIDA platform in ElasticSearch and makes it available in dashboards, using Kibana. An architecture of the proposed solution is shown in Fig. 4.
Fig. 4. Proposed solution’s architecture
In the existing infrastructure, messages are stored in two tables in an Oracle database. The first block represents the Oracle infrastructure, where one table logs HL7 sent messages and another logs HL7 received messages. The messages are generated and grouped by the departments of the health facility and stored by date and status. The second block represents the architecture of the solution presented hereby, where the messages are collected from the Oracle block by Mirth Connect channels - one for each table - that process and send them to ElasticSearch. This second block is a Docker infrastructure with several containers, namely one Mirth Connect container, one with the ElasticSearch image and the last one with Kibana. The messages sent to ElasticSearch are then presented in Kibana dashboards, as shown in Figs. 5, 6 and 7. This solution helps the AIDA platform comply with the GDPR legislation currently in force at European level: Kibana is used to monitor the data, and ElasticSearch keeps stored all the data generated and exchanged in the AIDA platform, making it possible to maintain a record of the data treatment operations performed in the platform and offering an easier way to monitor and search through those operations, using the text search abilities of these tools. GDPR compliance is also enforced by the possibility to recall which systems held the patient's information in each interaction and message exchange.
Fig. 5. Message listing dashboard
The solution was tested using Mirth Connect as integration engine, ElasticSearch and Kibana, with the SearchGuard plugin, and a PostgreSQL database that stores messages generated by a NodeJS script every 5 s. The messages contain a random destination, an HL7 event, a random episode and the date they are sent. In order to send the messages to ElasticSearch, a Mirth Connect channel was created. The source connector makes a query to the PostgreSQL database each
Fig. 6. Message count by destination dashboard
minute, fetching the messages stored during that minute. No filters are applied. The transformer extracts the data from the message and formats it to JSON. Finally, the destination connector sends it to the ElasticSearch server. Kibana presents the data in dashboards. Figure 5 gives an overview of the messages of the last 15 min, showing the 5 most relevant types of messages exchanged in the given period and the number of messages exchanged. It also shows the number of messages with no type given. Below, a list is given of all the messages exchanged in the last 15 min with all their attributes, allowing the user to perform a search query to filter the messages by any message field.
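Mirth Connect transformers are written in JavaScript, so the transformer step described above can be sketched as a pure function that extracts a few fields from a pipe-delimited HL7 v2 message and returns the JSON document sent to ElasticSearch. The output field names are illustrative, not the exact channel code:

```javascript
// Sketch of a Mirth-style transformer: HL7 v2 MSH fields -> JSON document.
// Field positions follow HL7 v2 (MSH-5 receiving application,
// MSH-7 timestamp, MSH-9 message type); output names are illustrative.
function hl7ToJson(rawMessage) {
  const msh = rawMessage.split("\r")[0].split("|");
  return {
    destination: msh[4], // MSH-5: receiving application
    timestamp: msh[6],   // MSH-7: date/time of message
    type: msh[8],        // MSH-9: message type, e.g. "ADT^A01"
  };
}

const sample =
  "MSH|^~\\&|AIDA|HOSPITAL|CARDIOLOGY|HOSPITAL|20200401120000||ADT^A01|1|P|2.4";
console.log(hl7ToJson(sample));
// { destination: 'CARDIOLOGY', timestamp: '20200401120000', type: 'ADT^A01' }
```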
Fig. 7. Message count by episode dashboard
Another dashboard can be observed in Fig. 6, where the number of messages sent to each destination in the last 5 min is shown. This way it is possible to know
the traffic each department has and to monitor the health of those departments, by monitoring the number of messages sent to each destination per minute. Lastly, Fig. 7 presents a dashboard with the number of messages regarding each episode exchanged in the last 15 min and, below, the list of those messages. Using this dashboard, it is possible to monitor the information exchanged in the platform regarding a specific episode, helping to ensure the GDPR compliance of the platform.
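The NodeJS generator used in the test setup can be sketched as follows: it builds one message with a random destination, HL7 event, episode and timestamp, to be inserted every 5 s. Names and value sets here are illustrative, not the original script:

```javascript
// Sketch of the test-message generator. Destinations, events and the
// episode id range are illustrative stand-ins for the real value sets.
const DESTINATIONS = ["CARDIOLOGY", "RADIOLOGY", "PHARMACY", "LAB"];
const EVENTS = ["ADT^A01", "ADT^A03", "ORM^O01", "ORU^R01"];

function pick(xs) {
  return xs[Math.floor(Math.random() * xs.length)];
}

function makeMessage() {
  return {
    destination: pick(DESTINATIONS),
    event: pick(EVENTS),
    episode: Math.floor(Math.random() * 10000), // random episode id
    date: new Date().toISOString(),
  };
}

// In the real setup, a line like the following inserts a message
// into PostgreSQL every 5 s:
//   setInterval(() => insertIntoPostgres(makeMessage()), 5000);
console.log(makeMessage());
```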
4 Conclusion
This project addresses an important topic in today's society: data protection and GDPR compliance. In the last few years, with the approval of the legislation in force, all health service entities must have features that ensure the safety of personal data. The solution presented hereby helps the implementation of the GDPR in the AIDA platform. Having log records of the information exchanged between different actors, it is possible to address one of the main challenges of GDPR compliance and also to overcome the challenges in healthcare interoperability regulatory compliance. Furthermore, this system also offers AIDA a feature that may help the platform maintainers detect errors and anomalies by analysing the dashboards, while saving data that can be used in the future to detect patterns with data mining. As future work, new dashboards will be added in order to give the user more information about the data exchanged in the platform.

Acknowledgements. This work has been supported by FCT - Fundação para a Ciência e a Tecnologia within the R&D Units Project Scope: UIDB/00319/2020.
References

1. Iakovidis, I.: Towards personal health record: current situation, obstacles and trends in implementation of electronic healthcare record in Europe. Int. J. Med. Inform. 52, 105–115 (1998)
2. Cardoso, L., Marins, F., Portela, F., Santos, M., Abelha, A., Machado, J.: The next generation of interoperability agents in healthcare. Int. J. Environ. Res. Public Health 11(5), 5349–5371 (2014)
3. Duarte, J., Portela, C.F., Abelha, A., Machado, J., Santos, M.F.: Electronic health record in dermatology service. In: Communications in Computer and Information Science, vol. 221, CCIS (PART 3). Springer (2011)
4. Neto, C., Brito, M., Lopes, V., Peixoto, H., Abelha, A., Machado, J.: Application of data mining for the prediction of mortality and occurrence of complications for gastric cancer patients. Entropy 21(12), 1163 (2019)
5. European Union: Regulation (EU) 2016/679 of the European Parliament and of the Council of 27 April 2016 on the protection of natural persons with regard to the processing of personal data and on the free movement of such data, and repealing Directive 95/46/EC (General Data Protection Regulation). Official Journal of the European Union, L 119 (2016)
6. Peffers, K., Tuunanen, T., Rothenberger, M.A., Chatterjee, S.: A design science research methodology for information systems research. J. Manage. Inf. Syst. 24, 45–77 (2007)
7. Rezaei, R., Chiew, T.K., Lee, S.P., Aliee, Z.S.: Interoperability evaluation models: a systematic review. Comput. Ind. 65(1), 1–23 (2014)
8. Mead, C.N.: Data interchange standards in healthcare IT - computable semantic interoperability: now possible but still difficult, do we really need a better mousetrap? J. Healthc. Inf. Manag. 20, 71–78 (2006)
9. Machado, J., Abelha, A., Neves, J., Santos, M.: Ambient intelligence in medicine. In: Proceedings of the IEEE BioCAS Biomedical Circuits and Systems Conference, Healthcare Technology, Imperial College, London, UK (2006)
10. Turnitsa, C.D., Diallo, S.Y., Tolk, A.: Applying the levels of conceptual interoperability model in support of integratability, interoperability, and composability for system-of-systems engineering. J. Syst. Cybernet. Inf. 5, 65–74 (2007)
11. HL7 standards v2 product suite (2020). http://www.hl7.org/implement/standards/productbrief.cfm?productid=185
12. Bortis, G.: Experiences with Mirth: an open source health care integration engine. In: ACM/IEEE 30th International Conference on Software Engineering (2008)
13. Brandão, A., Pereira, E., Esteves, M., Portela, F., Santos, M.F., Abelha, A., Machado, J.: A benchmarking analysis of open-source business intelligence tools in healthcare environments. Information 7(4), 57 (2016)
14. Schollmeier, R.: A definition of peer-to-peer networking for the classification of peer-to-peer architectures and applications. In: Proceedings of the First International Conference on Peer-to-Peer Computing (2001)
15. Divya, M.S., Goyal, S.K.: ElasticSearch: an advanced and quick search technique to handle voluminous data. Compusoft 2(6), 171 (2013)
16. Kuć, R., Rogoziński, M.: Mastering ElasticSearch. Packt Publishing Ltd., Birmingham (2017)
17. Dixit, B., Kuć, R., Rogoziński, M., Chhajed, S.: Elasticsearch: A Complete Guide. Packt Publishing Ltd., Birmingham (2017)
18. What is a container (2020). https://www.docker.com/resources/what-container
19. Bajer, M.: Building an IoT data hub with Elasticsearch, Logstash and Kibana. In: 5th International Conference on Future Internet of Things and Cloud Workshops (2017)
Tools for Immersive Music in Binaural Format Andrea De Sotgiu(B) , Mauro Coccoli(B) , and Gianni Vercelli(B) DIBRIS, Università degli Studi di Genova, Genova, Italy {andrea.desotgiu,mauro.coccoli,gianni.vercelli}@unige.it
Abstract. Using spatial audio in domotics and Ambient Assisted Living (AAL) applications can help create proper immersive experiences, which can foster more natural interaction mechanisms or, simply, improve the quality of life through a pleasant listening experience. Unfortunately, due to the lack of standardization of spatial audio formats, current speaker-based home automation devices do not support such innovative acoustic models, remaining tied to traditional mono and stereo models. With the aim of overcoming this situation, we present a survey on the current state of the art in the field of immersive music research, illustrating currently available tools useful to carry out audio post-production in Binaural format. We also analyze today's context of sound experiences, which is shifting the listener's preferences towards immersive experiences such as the consumption of content in spatial audio.

Keywords: Spatial audio · Immersive music · Binaural format
1 Introduction

We are at a turning point in which, more and more, homes are becoming smart homes and sophisticated domotics solutions are being applied to both appliances and entertainment systems. Moreover, owing to the recent acceleration in Natural Language Processing (NLP) techniques and to the unprecedented availability of a plethora of commercial "speaking" virtual assistants, audio is gaining space in Ambient Assisted Living (AAL) applications as a command interface and as a means of giving feedback and, possibly, alarms. For these reasons, it is interesting to investigate the possibility of introducing new ways of listening to both music and sounds, exploiting the immersive characteristics that the Binaural format can guarantee. From the gramophone to home theatre systems, listening to music has characterized the home environment over the decades. Although several solutions are available for stereophonic audio in this context, there does not seem to be a standard for spatial audio. It is complex to let multiple users perceive the same immersive experience considering the various morphological and architectural characteristics of an apartment, which generate problems such as an excess of early reflections and reverb. For this reason, we decided to focus on listening with headphones, a solution commonly used for various multimedia products (from music to home cinema).

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2021
P. Novais et al. (Eds.): ISAmI 2020, AISC 1239, pp. 54–60, 2021. https://doi.org/10.1007/978-3-030-58356-9_6
Sound engineering has attempted to faithfully reproduce the acoustic characteristics of the human auditory system. In addition to the ability to hear, the human being is also able to perceive the surrounding space by localizing the sounds heard through the ear. Thanks to the psychoacoustic principles known as the Head-Related Transfer Function (HRTF), it was possible to deduce which parameters determine the perception of a space (closed or open): (i) the shape of the auricle (pinna filter), (ii) the delay of perception (ITD - Interaural Time Difference) and (iii) the difference in amplitude of a sound (ILD - Interaural Level Difference) between the left and right ear [1]. Turning these studies into algorithms, new software tools have been developed that create the virtual spatialization of one or more sounds. If "everyday life is full of three-dimensional sound experiences" [2], spatial audio (also called 360° audio or 3D audio) aims to reproduce the same sensation of sound immersion that we experience daily. There are different contexts in which we can use spatial audio: gaming, virtual reality (VR), augmented reality (AR), mixed reality (XR), simulators, museum installations, music and broadcast transmission. It is possible to live sound experiences in three dimensions by wearing headphones or through loudspeakers. In this paper, we want to analyze the tools useful for the production of spatial audio within the musical context. Music listening is based on the principles of stereophony [3], using a front-oriented approach, in which the listener is placed in front of the speakers, generating a hypothetical triangle between her/him and the two speakers. Therefore, stereophony implies listening limited to the horizontal frontal plane, thus excluding the vertical plane and the part behind the head of the listener.
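The ITD mentioned above can be approximated analytically. A common approximation (Woodworth's spherical-head model, not taken from this paper) gives, for a head of radius a and speed of sound c, ITD(θ) = (a/c)(θ + sin θ) for a source at azimuth θ; a short sketch:

```javascript
// Woodworth's spherical-head approximation of the Interaural Time
// Difference (ITD): itd = (a / c) * (theta + sin(theta)),
// with theta the source azimuth in radians.
const HEAD_RADIUS = 0.0875;  // metres, a typical average head radius
const SPEED_OF_SOUND = 343;  // metres per second, at about 20 °C

function itdSeconds(azimuthRad) {
  return (HEAD_RADIUS / SPEED_OF_SOUND) * (azimuthRad + Math.sin(azimuthRad));
}

// A source straight ahead reaches both ears at the same time;
// a source at 90° yields the maximum ITD, roughly 0.66 ms.
console.log(itdSeconds(0));           // 0
console.log(itdSeconds(Math.PI / 2)); // ≈ 0.000656 s
```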
Although the majority of mass media models follow stereophonic principles, the introduction of spatial audio in the previously mentioned contexts is generating a change in audience preferences. Technological progress continuously places us in front of the possibility of enjoying different immersive experiences. The mass media (and consequently the audience) seem to be in a limbo between stereophony and spatial audio, as happened more than ten years ago between cell phones and smartphones: as soon as the previous model was surpassed, no one decided to go back. The only difference today is that no technology with spatial audio as a protagonist has yet definitively set the trend. However, as far as music is concerned, it seems that the first steps towards overcoming stereophony have already been made. Some music streaming services have made arrangements for the distribution of music mixed in spatial audio, which can be enjoyed both on headphones and through soundbars, to experience a new concept of listening to music. Besides, the Grammy Awards recently awarded a specific prize for the "Best Immersive Audio Album". We are therefore faced with the birth of a new kind of music that we could define as "immersive music", based on the principles of spatial audio. The remainder of the essay is organized as follows: in Sect. 2, the relevant scientific literature will be illustrated; Sect. 3 will describe useful tools for the post-production of immersive music; Sect. 4 will collect research questions and, finally, Sect. 5 will lead to the essay's conclusions.
2 Related Work

Thanks to the work done by the Audio Engineering Society, there is a large literature on spatial audio, available through the AES E-Library. A reference book is Spatial Audio
by Francis Rumsey, which provides the humanistic and scientific foundations of spatial audio, starting from the historical background and the birth of the technologies that can be included in this context. At the base of all there are studies on multichannel audio (from 4 to more channels), intended to overcome the two-channel stereo model (left and right). Today, among the featured models are the Ambisonics format and the Binaural format, adopted by various software tools for the virtual spatialization of sounds. The Ambisonics format was born in 1973 from the work of Michael Gerzon, who was inspired by quadraphony to build a model to record and reproduce sounds that the listener would perceive "around him", at 360°, intersecting the horizontal and vertical planes. The studies started from the recording phase of a sound, in which Gerzon needed to recreate an ideal sphericity using the polar figures of the microphones at his disposal. He used four microphones to achieve a definitive result: three figure-of-eight microphones that reproduced the three-dimensional planes (X, Y, Z) and an omnidirectional one (W) that added the missing details to the previous microphones. This technique was identified as 4-channel First-Order Ambisonics. Further research on this format introduced further orders with a greater number of channels: 9-channel Second-Order Ambisonics (SOA), 16-channel Third-Order Ambisonics (TOA), or even Higher-Order Ambisonics (HOA) with more channels. To date, this multichannel format allows the listener to be placed in the center of rooms or installations in which wired speakers surround the listener and allow him to fully live an immersive 360° experience.
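The channel counts above follow a simple rule: an order-N Ambisonics signal has (N + 1)² channels (4 for first order, 9 for second, 16 for third). First-order encoding of a mono source S at azimuth θ and elevation φ uses the classic B-format equations W = S/√2, X = S·cos θ·cos φ, Y = S·sin θ·cos φ, Z = S·sin φ; a short sketch of both:

```javascript
// An order-N Ambisonics signal carries (N + 1)^2 channels.
function ambisonicChannels(order) {
  return (order + 1) ** 2;
}

// Classic first-order B-format encoding of a mono sample `s`
// at the given azimuth/elevation (radians).
function encodeFOA(s, azimuth, elevation) {
  return {
    w: s / Math.SQRT2,                              // omnidirectional
    x: s * Math.cos(azimuth) * Math.cos(elevation), // front-back
    y: s * Math.sin(azimuth) * Math.cos(elevation), // left-right
    z: s * Math.sin(elevation),                     // up-down
  };
}

console.log(ambisonicChannels(1), ambisonicChannels(2), ambisonicChannels(3));
// 4 9 16
const b = encodeFOA(1.0, 0, 0); // source straight ahead
console.log(b); // { w: 0.707..., x: 1, y: 0, z: 0 }
```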
The Binaural format, on the other hand, refers to the psychoacoustic function known as “binaural fusion”, through which our brain combines the signals from the two ears (right and left), interprets their amplitudes, frequencies and delays, and returns all the information necessary to correctly perceive one or more sounds together with their spatial cues. As a consequence, this format has two channels (left/right), is closely linked to headphone listening, and results from the decoding of other formats such as, for example, the different orders of Ambisonics. Some software tools, discussed later in the article, allow us to perceive the virtual spatialization of a sound through headphones thanks to binaural decoding algorithms for multichannel signals. The binaural format was linked to Sound Source Localization even before becoming a subject of computer music. Several studies on speech recognition have used this technique to measure speech intelligibility, exploiting it in many different research areas [4, 5]. The scientific community continues to study methods for creating virtual rooms that improve the listener’s immersive experience [6]. One goal of this research is to improve the externalization of sound perception for headphone listeners, working on new algorithms that decode information better and replicate the Head Related Transfer Functions. The culture of listening should not be underestimated: decades spent listening to stereophonic products have shaped listeners’ habits, so today they need training towards binaural listening [7, 8]. The research field of Binaural Sound Source Localization can also be useful in this regard. Computer music and, specifically, immersive music in Binaural format can likewise take on the task of educating listeners through edutainment.
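To illustrate the two cues binaural fusion relies on, here is a deliberately crude sketch that renders a mono source using only an interaural time difference (Woodworth's spherical-head approximation) and a fixed interaural level difference. Actual binaural decoders convolve the signal with measured Head Related Transfer Functions, which this sketch does not attempt; the head radius and the 6 dB level difference are illustrative assumptions:

```python
import numpy as np

HEAD_RADIUS = 0.0875   # average head radius in metres (assumption)
SPEED_OF_SOUND = 343.0
FS = 48000

def render_binaural(mono, azimuth):
    """Very rough binaural rendering of a mono source at `azimuth`
    (radians, 0 = front, positive = left), using only interaural time
    and level differences."""
    # Woodworth's spherical-head approximation of the ITD
    itd = (HEAD_RADIUS / SPEED_OF_SOUND) * (azimuth + np.sin(azimuth))
    delay = int(round(abs(itd) * FS))            # ITD in samples
    gain_far = 10 ** (-6 / 20)                   # crude fixed ~6 dB ILD
    near = mono.copy()
    far = np.concatenate([np.zeros(delay), mono * gain_far])[: len(mono)]
    # positive azimuth -> source on the left -> right ear is the far ear
    return (near, far) if azimuth >= 0 else (far, near)

t = np.linspace(0, 0.5, FS // 2, endpoint=False)
left, right = render_binaural(np.sin(2 * np.pi * 440 * t), np.pi / 2)
```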
The new ‘Binaural music’ products to come will trace a new path that will initially be harder to understand but, if followed, will lead listeners to develop new perceptive skills. This, in turn, will allow an
Tools for Immersive Music in Binaural Format
57
extension of the users’ sensory perception, from which new technologies such as home automation and entertainment will benefit. Another important study, which continues the path started by Francis Rumsey’s Spatial Audio, is Agnieszka Roginska and Paul Geluso’s Immersive Sound: The Art and Science of Binaural and Multi-Channel Audio. In addition to taking up the scientific content of Rumsey’s work, the book also describes new technologies related to spatial and multichannel audio, such as object-based audio or wave field synthesis1. Although the scientific community has not yet defined a standard for spatial audio file formats, two seem to be the most accredited: the SOFA format - Spatially Oriented Format for Acoustics [9] - and the MPEG-H 3D Audio format - ISO/IEC 23008-3 (MPEG-H Part 3) [10], already adopted in broadcasting systems during live events such as the Eurovision Song Contest 2019 [11]. Considering the scientific literature presented in this section, this paper illustrates the current state of the art and the useful tools for the post-production of immersive music in binaural format (over headphones), providing detailed information that can allow researchers to more easily carry out work aimed at solving the research questions.
3 Tools: DAWs and Plugins

The post-production phase of a song is tackled using Digital Audio Workstations (DAWs). DAWs are software tools born of the need to transform the analog recording studio into a digital one, portable and accessible via computer. Several DAWs are dedicated exclusively to music, a market in which spatial audio is catching on, as previously mentioned. One of the most used DAWs in the professional field is Avid Pro Tools, which was also the first to provide tools for operating in spatial audio thanks to a partnership with Dolby Laboratories. In 2012, Dolby Atmos, an object-based audio system for cinema, was patented and presented to consumers, allowing viewers to enjoy immersive sound experiences with speakers on the sides and ceiling of the theatre [12]. Although still not common in all cinemas, since 2017 the technology has also arrived on smartphones and computers, offering the possibility of listening to spatial audio directly on headphones or through appropriate soundbars. Together with Avid Pro Tools, Dolby has also chosen Steinberg Nuendo for compatibility with the Dolby Atmos Production Suite, a series of plugins that allow complete post-production in the Dolby Atmos format. In 2019, Tidal, a music streaming service, allowed its users to listen to a playlist dedicated to music mixed in this format [13]. These software tools are not easy to obtain, as they require dedicated high-performance hardware and, therefore, considerable costs.

1 Object-based audio encodes each sound as an object inserted in a three-dimensional space, with metadata on its position. The term “object” derives from IT and object-oriented programming. Wave field synthesis (WFS) is a spatial audio rendering technique characterized by the creation of virtual acoustic environments: it produces artificial wave fronts synthesized by a large number of loudspeakers.
Most other DAWs do not yet come with native plugins for sound spatialization, and some do not even allow the management of multichannel audio (at the moment, only Avid Pro Tools, Reaper, and Steinberg Nuendo do). Worth mentioning is the interesting work carried out by Envelop, a team of developers whose mission is to give the community new tools for working with music in the spatial audio field. The reference DAW for this open-source project is Ableton Live, famous for its use in live performance. The trump card of this software has always been its integration with Max/Msp, a graphical programming environment dedicated to sound, through the Max For Live system, thanks to which developers can extend the software by creating their own plugins. With this system, the Envelop team has developed plugins that spatialize the channels of the DAW and produce different output formats (binaural, FOA, SOA, HOA, etc.) thanks to a multichannel audio decoder to be placed on a master bus (E4L - Master Bus). The package to be installed also includes a tool for spatializing the input or playback of an audio channel (E4L - Source Panner), a meter for monitoring signals (E4L - Meter), and other tools useful for creative sound design. Those who work in music can also choose to hybridize their workflow, inserting plugins developed for VR or Cinematic VR environments. A useful tool for the virtual spatialization of sounds is Facebook Spatial Workstation, compatible with the Avid Pro Tools and Reaper DAWs [14]. It is dedicated to the production of 360° videos and the management of their related audio but can also be used for creative purposes in music. From models like this one, similar plugins were born, such as dearVR Music (compatible with all DAWs).
The Apple-owned DAW Logic Pro X has also taken steps towards spatial audio by letting users opt for binaural panning on each channel of the software’s mixer. Instead of the classic pan-pot, the graphical interface displays a sphere, inviting the user to think about how to position the sound within a 360° panorama. As already mentioned, another source of continuous software and hardware improvements in the spatial audio sector is the world of virtual reality (AR, VR, XR). Thanks to game engines, creating environments for sound immersion becomes ever easier through the abstraction of object-based audio management within the development environments (Unity, Unreal, Google Resonance, etc.). Audio engines dedicated to virtual reality, such as Wwise or FMOD, are also taking hold. The interaction between the professional figures of sound designers and coders is increasing rapidly. Systems such as MatLab, Max/Msp, or Pure Data are useful tools for achieving new research objectives related to multichannel audio management.
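The difference between the two interfaces can be made concrete: a classic pan-pot outputs a single left/right gain pair, whereas a binaural panner needs a full direction on a sphere around the listener. A minimal sketch (illustrative, not Logic's implementation):

```python
import numpy as np

def constant_power_pan(pan):
    """Classic stereo pan-pot: `pan` in [-1, 1] (left..right) mapped to
    left/right gains on a quarter circle, so total power is constant
    (-3 dB at the centre)."""
    angle = (pan + 1) * np.pi / 4          # [-1, 1] -> [0, pi/2]
    return np.cos(angle), np.sin(angle)

def sphere_position(azimuth, elevation):
    """A binaural panner instead places the source on a unit sphere
    around the listener (angles in radians)."""
    return np.array([
        np.cos(elevation) * np.cos(azimuth),   # x: front
        np.cos(elevation) * np.sin(azimuth),   # y: left
        np.sin(elevation),                     # z: up
    ])

l, r = constant_power_pan(0.0)   # centre: both gains ~0.707
```

A binaural renderer then turns the sphere position into two ear signals, rather than simply scaling two channels.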
4 Research Questions

In the field of immersive music there is a lack of standards: from the absence of established mixing and mastering techniques to doubts about file export formats. There are few teaching resources on the subject (courses, scientific literature, workshops, summer schools, etc.). As previously mentioned, the key tools for this work are the Digital Audio
Workstations, which, to date, do not uniformly guarantee the same basic capabilities for treating spatial audio. The lack of native plugins and the inability to process multichannel audio sometimes hinder research work, and hybridization with software dedicated to other fields of study complicates everything further. Going into more detail, it is difficult to monitor the spectrum of frequencies moving around the listener’s head, since DAWs are only equipped with monitoring systems for mono or stereo audio. To do this, one can adopt solutions such as external plugins (Waves Nx Ambisonics is one example), which, however, are not compatible with all DAWs and all formats. Solving the monitoring problem would also help prevent another one: IHL - Inside the Head Locatedness. It is hard to make listeners perceive themselves as immersed in a sound environment when listening through headphones, because the sensation is sometimes that of perceiving everything “inside one’s own head” rather than in the surrounding environment. Today, a user with access to content that provides spatial audio will not only enjoy it in the cinema (or in an open space) but will also be encouraged to use headphones. In a context such as the cinema, although total visual immersion is not favoured (due to the presence of screens, chairs and neighbouring spectators, and to the static position of the viewer), the positioning of the speakers in the room and the new reproduction techniques (such as Dolby Atmos) allow a realistic and pleasant sound immersion. On the contrary, in a visually immersive setting such as using a VR headset at home, headphones alone may not be enough to deliver the right user experience [15].
5 Conclusions

In support of the research questions, we also mention that during the 147th convention of the Audio Engineering Society in New York it emerged, in different panels, how much spatial audio is still a horizon to be explored, a new “wild west” of audio. The current task of the scientific community is to find definitive standards for working correctly. At the same time, the audiovisual market is shifting its horizons and the audience with it. The entertainment business’s interest in the Binaural format can also bring innovation to other areas. The format has already been tested, with benefits, in conference calls between colleagues [16]. In the Ambient Assisted Living (AAL) context, giving a person a more realistic immersive experience could generate greater empathy and thus better support the user in daily life (e.g. in remote monitoring or medical assistance). It will be the task of the designers of future smart homes to understand how to integrate spatial audio as an immersive experience that can be enjoyed even without headphones. Clearly, today’s listener is still anchored to tradition, to frontally oriented monophonic and stereophonic sources, but is receiving stimuli, on different levels, that generate new interest in the matter. We are on the edge of a multimedia revolution that may change the consumption of all the audiovisual content available to us.
References

1. Roginska, A., Geluso, P.: Immersive Sound: The Art and Science of Binaural and Multi-Channel Audio. Taylor & Francis, Abingdon (2017)
2. Rumsey, F.: Spatial Audio. Taylor & Francis, Abingdon (2001)
3. Snow, W.B.: Basic principles of stereophonic sound. J. Soc. Motion Picture Telev. Eng. 61, 567–589 (1953)
4. Dávila-Chacón, J., Liu, J., Wermter, S.: Enhanced robot speech recognition using biomimetic binaural sound source localization. IEEE Trans. Neural Netw. Learn. Syst. 30, 138–150 (2019). https://doi.org/10.1109/TNNLS.2018.2830119
5. Lam, A., Lee, M.-L., Philbert, S.: Measuring speech intelligibility using head-oriented binaural room impulse responses (2019)
6. Klein, F., Neidhardt, A., Seipel, M., Sporer, T.: Training on the acoustical identification of the listening position in a virtual environment (2017)
7. Rikova, M.R., Vermeir, G.: Binaural sound source localization in real and virtual rooms. J. Audio Eng. Soc. 57, 16 (2009)
8. Sloma, U., Klein, F., Werner, S., Kannookadan, T.P.: Synthesis of binaural room impulse responses for different listening positions considering the source directivity (2019)
9. Majdak, P., et al.: Spatially oriented format for acoustics: a data exchange format representing head-related transfer functions. In: Audio Engineering Society Convention 134 (2013)
10. Herre, J., et al.: MPEG-H audio—the new standard for universal spatial/3D audio coding. J. Audio Eng. Soc. 62(12), 821–830 (2015)
11. Live MPEG-H Audio System production chain for the 2019 Eurovision Song Contest showcased at IBC show. Fraunhofer Audio Blog (2019)
12. Dolby Laboratories: White Paper, Dolby Atmos Next Generation Audio for Cinema (2014)
13. TIDAL and Dolby are bringing Dolby Atmos Music to TIDAL’s HiFi members - Dolby (news). https://news.dolby.com/en-WW/184250-tidal-and-dolby-are-bringing-dolby-atmos-music-to-tidal-s-hifi-members
14. Bellanti, E., Corsi, A., De Sotgiu, A., Vercelli, G.: “Changes”: an immersive spatial audio project based on low-cost open tools. In: De Paolis, L., Bourdot, P. (eds.) Augmented Reality, Virtual Reality, and Computer Graphics. Springer, Cham (2018)
15. Johansson, M.: VR for your ears: dynamic 3D audio is coming soon. IEEE Spectr. 56(2), 24–29 (2019)
16. Aguilera, E., Lopez, J.J., Gutierrez, P., Cobos, M.: An immersive multi-party conferencing system for mobile devices using binaural audio (2014)
A Computer Vision-Based System for a Tangram Game in a Social Robot Carla Menendez, Sara Marques-Villarroya, Jose C. Castillo(B) , Juan Jose Gamboa-Montero, and Miguel A. Salichs Systems Engineering and Automation Department, Universidad Carlos III de Madrid, Getafe, Spain {cmenendez,smarques,jocastil,jgamboa,salichs}@ing.uc3m.es
Abstract. Social Robotics is currently gaining ground as it allows robots to show an empathic behaviour toward humans, which facilitates the creation of emotional bonds with them. In this context, robots are starting to support caregivers in stimulation exercises for elderly people, such as cognitive stimulation therapy. This work focuses on endowing the social robot Mini with the perception capabilities to detect a classic game, the Tangram. Besides, the robot is able to follow the game mechanics and, using its interaction capabilities, leads the game and encourages users to complete it.
Keywords: Social Robotics · Computer vision · Cognitive stimulation

1 Introduction
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2021. P. Novais et al. (Eds.): ISAmI 2020, AISC 1239, pp. 61–71, 2021. https://doi.org/10.1007/978-3-030-58356-9_7

Population ageing may limit people’s performance in basic daily tasks. There are options for preventing and alleviating physical and mental impairment, such as non-pharmacological therapies [5] or the use of new technologies. Unfortunately, the economic cost of non-pharmacological therapies and the lack of qualified personnel are issues that have yet to be solved [12]. Therefore, technological solutions appear as a good way to alleviate these issues, assisting caregivers and elders at home and in healthcare centres. Among these solutions, Social Robotics appears as an alternative able to interact naturally with humans, helping to improve the health conditions and quality of life of elders [11]. Recent studies indicate that ICT-based cognitive interventions have positive effects on cognition, anxiety and depression in people suffering from dementia [1,10]. Technological devices, such as tablets, Virtual Reality equipment or Social Robots, are useful tools for working with people with cognitive problems [4]. These devices allow a higher rate of stimuli and activities while providing a sense of achievement and an enjoyable experience for the users. For instance, Eisapour et al. [6] aimed at improving physical exercise practice for older people with dementia and showed that games using Virtual Reality were comparable to
therapist-guided exercises. Bejan et al. [2] developed a user interface combined with a 3D virtual environment where patients suffering from dementia could delve into memories while interacting through gestures adjusted to the degree of dementia. The use of tablets also appears as a feasible, safe and potentially useful solution for agitation in older adults, including those with severe dementia [15]. The advantages of using tablets include mobility, multifunctionality, a simplified and tactile interface, customization, accessibility, connectivity and ease of acquisition. Care robots are gaining ground in this area. Melkas et al. conducted empirical research on the impacts of using the Zota robot in rehabilitation sessions in two nursing homes [9]. They concluded that the robot had a positive influence on users, as it created added value for care professionals by providing fun at work. Koh et al. studied the effects of an intervention using the PARO robot on cognition, emotion, problem behaviour and social interaction of older people with dementia [8]. The experimental group showed a significant improvement in social interaction. Tapus et al. [14] developed a pilot study to test the feasibility of robotic technology in the field of social care, aimed at providing personalized cognitive assistance, motivation and companionship to users who suffer cognitive changes related to ageing and Alzheimer’s disease. Recent studies relate the performance of cognitive stimulation exercises with a reduction in the risk of suffering from dementia [13]. Besides, games with manageable colourful objects, such as board games, are more effective because they produce greater sensory stimulation [16]. For these reasons, our work aims at combining the use of new technologies in cognitive stimulation by implementing the mechanics of a well-known game, the Tangram. We endowed a Social Robot with the ability to detect a physical board game and guide the user through the process of solving it.
We selected the Tangram because Frutos et al. demonstrated that this game helps expand problem-solving skills and logical thinking and also improves the development of perceptual reasoning and visual-spatial awareness [7]. The rest of this paper is organized as follows. Section 2 presents the main steps of the Tangram pieces detector. Next, Sect. 3 describes how the system is integrated into a Social Robot, and Sect. 4 presents the evaluation of the detection system with different camera positions and lighting conditions. Finally, Sect. 5 summarises the main findings and conclusions of this work.
2 Tangram Detection System
In this manuscript, we describe a modular solution integrated into a social robot to assist elderly people in playing a Tangram game. The proposed system consists of three parts: (i) camera distortion correction and play zone detection; (ii) Tangram pieces detection; and (iii) integration into the social robot. This last part is described in Sect. 3.
2.1 Camera Distortion Correction and Play Zone Detection
In this work, we establish an area (play zone) in which the Tangram can be solved. That area is delimited by a rectangular board of 45 × 30 cm with a white background and black edges. The acquisition device is a wide-angle camera that provides a wide field of view at short range. However, this kind of camera adds a fish-eye distortion to the images. Therefore, the first step is to correct the distortion of the lens and to detect the play zone. The distortion correction process1 scans the image by rows and columns and transforms each point to rectify the curvatures caused by the distortion.
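As an illustration of this row-and-column remapping, the sketch below builds the per-pixel sampling map for a simple one-parameter radial distortion model; both the model and the coefficient `k` are hypothetical stand-ins for the calibration of the actual lens:

```python
import numpy as np

def undistort_map(width, height, k):
    """Build the sampling map that rectifies a simple one-parameter
    radial (fish-eye-like) distortion: for every undistorted output
    pixel, compute where to read it from in the distorted image."""
    cx, cy = width / 2.0, height / 2.0
    xs, ys = np.meshgrid(np.arange(width), np.arange(height))
    x, y = xs - cx, ys - cy            # coordinates relative to centre
    r2 = x ** 2 + y ** 2
    factor = 1.0 + k * r2              # radial distortion factor
    map_x = cx + x * factor            # source x-coordinate per pixel
    map_y = cy + y * factor            # source y-coordinate per pixel
    return map_x, map_y

# hypothetical coefficient for a 640x480 image
map_x, map_y = undistort_map(640, 480, k=2e-7)
```

The image is then resampled through `map_x`/`map_y`: pixels far from the centre read from proportionally further out, straightening the curved board edges.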
Fig. 1. Pre-processing steps
Next, the playground detection process segments the image to isolate the borders of the board using a thresholding mechanism. A morphological filter is applied to eliminate isolated areas and to connect close regions. After filtering the image, the algorithm performs contour detection, filtering out those areas that do not correspond to the target one. Note that the area of the board is known beforehand and that the system has a rough estimate of the distance between board and camera (see Sect. 4.1). Among the suitable areas, only those with a shape close to a 4-vertex polygon are considered. Of the two possible shapes, the exterior of the board and the inner outline, the second is selected, as it proved to produce fewer false positives caused by the black outline of the board. To ease calculations in further steps, the algorithm crops the image, keeping the inner region of the board, and corrects the image perspective. The resulting image is also rotated to provide the same view as the user. Figure 1 summarises the main steps of this process.
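Once the four inner corners of the board are found, the perspective correction reduces to estimating a 3 × 3 homography. The sketch below (with hypothetical corner coordinates) solves the standard eight-unknown linear system:

```python
import numpy as np

def homography_from_corners(src, dst):
    """Estimate the 3x3 perspective transform mapping the four detected
    board corners `src` onto the rectified rectangle `dst` (both 4x2
    arrays), solving the 8-unknown linear system with h33 fixed to 1."""
    A, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y]); b.append(u)
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y]); b.append(v)
    h = np.linalg.solve(np.array(A, float), np.array(b, float))
    return np.append(h, 1.0).reshape(3, 3)

def warp_point(H, p):
    q = H @ np.array([p[0], p[1], 1.0])
    return q[:2] / q[2]

# hypothetical skewed quadrilateral mapped onto a 450x300 rectangle
# (1 px per mm of the 45x30 cm board)
corners = np.array([[52, 40], [590, 61], [605, 398], [38, 379]], float)
board = np.array([[0, 0], [450, 0], [450, 300], [0, 300]], float)
H = homography_from_corners(corners, board)
```

Applying the inverse of `H` to every output pixel yields the cropped, perspective-corrected board image.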
2.2 Tangram Detection
A Tangram game consists of 7 pieces with different shapes, colours and sizes. In particular, our version is composed of two large triangles (green and orange), a medium red triangle, two small triangles (black and violet), a yellow square and a blue rhomboid. The Tangram detection process has three stages: (i) first, the method performs a calibration step to establish the equivalence between pixels and centimetres, homogenizing detections among different camera positions; (ii) next, the image of the game board is pre-processed so that (iii) the different pieces can finally be detected based on their shape, size and colour.

1 https://github.com/machukas/openCV/blob/master/P3/camera calibration.
Fig. 2. Tangram pieces recognition pipeline.
Camera Calibration. Since a Tangram has similar pieces of different sizes, we decided to make the detection independent from the size in pixels of the objects in the image. For this reason, this phase calculates the pixel-to-centimetre conversion of the image obtained in Sect. 2.1. The calibration process is straightforward: a known piece, the green triangle, is placed on the play zone while the algorithm looks for a shape with three vertices. It then selects the largest distance between them, which corresponds to the hypotenuse. With this parameter, and using the actual length of that side of the triangle (9.3 cm in our case), it is possible to calculate the equivalence between centimetres and pixels. Every time the user wants to play Tangram, the robot executes the calibration step, but if the user plays several consecutive games, the process is not repeated.

Pre-processing. This phase removes the background and filters the resulting image to allow detecting the parts of the Tangram. Figure 2 shows the main steps of this process. The first step creates a masked image where the Tangram parts are isolated from the background: areas above a threshold keep their original colour, while the rest of the image is set to zero. This image is then equalized to enhance contrast. To cope with illumination changes, we used Contrast-Limited Adaptive Histogram Equalization [17], a method that divides the image into small blocks that are equalized separately.
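The calibration step described above can be sketched in a few lines (the vertex coordinates are hypothetical):

```python
import itertools
import math

HYPOTENUSE_CM = 9.3   # known length of the green triangle's hypotenuse

def cm_per_pixel(vertices):
    """Given the three detected vertices of the green calibration
    triangle (pixel coordinates), take the longest pairwise distance,
    i.e. the hypotenuse, and derive the centimetres-per-pixel factor."""
    hyp_px = max(math.dist(a, b)
                 for a, b in itertools.combinations(vertices, 2))
    return HYPOTENUSE_CM / hyp_px

# hypothetical detected vertices of a right triangle with 186 px legs
scale = cm_per_pixel([(100, 100), (100, 286), (286, 100)])
```

Every pixel distance measured later is multiplied by this factor, which is what makes the detector independent of camera height.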
Next, the algorithm reduces the noise in the image by applying three filters (a Median filter, a Gaussian filter and a Bilateral filter) to obtain a smooth image with high contrast. The last step is edge detection based on the Canny algorithm [3], which uses an adaptive thresholding mechanism during a segmentation process to extract the edges of the filtered masked image. A morphological filter is also applied to eliminate noise.

Detection of Tangram Pieces. After the pre-processing phase, the different parts of the Tangram are detected considering their shape and colour. Additionally, by analysing the relative positions of the pieces, the method also calculates how close the set of pieces is to a solution (target figure). First, the system calculates the centroid of the largest set of adjacent pieces on the playground using the contours calculated in the previous phase. Contours that do not have a suitable area are discarded. The algorithm then obtains the centroid of each piece contour and checks the RGB value of that point in the image. To increase robustness against changes in illumination, several samples (tones) of each possible colour are considered in a matching process that calculates the distance between the detected colour and the samples. Since this process is not perfect, the area and shape of the pieces are also considered in the detection. Additionally, we calculate the orientation of the pieces with respect to the biggest set of connected parts. For the triangles, the orientation is given by the angle between the X-axis of the image and the vector that goes from the centre of the triangle to the vertex facing the hypotenuse. The orientation of the square is calculated as the angle formed by the diagonal of the figure and the X-axis. Finally, we calculate the orientation of the rhomboid as the angle formed by its main diagonal and the X-axis. It should be noted that the user can solve the target figure in any orientation.
Therefore, after calculating the orientation of the isolated pieces, the system obtains the bounding box enclosing the largest detected area (the biggest set of connected pieces) on the board and calculates its orientation with respect to the X-axis of the image. This angle is used as a reference to perform a coordinate change.
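The colour-matching step can be sketched as a nearest-sample search in RGB space; the sample tones below are hypothetical placeholders for the ones the real system stores:

```python
import math

# a handful of sample tones per piece colour (hypothetical RGB values;
# the real system stores several samples per colour to cope with lighting)
COLOUR_SAMPLES = {
    "green":  [(40, 160, 60), (25, 120, 45)],
    "orange": [(240, 140, 30), (200, 110, 20)],
    "red":    [(200, 40, 40), (150, 25, 30)],
    "black":  [(25, 25, 25), (50, 50, 50)],
    "violet": [(130, 60, 160), (100, 45, 120)],
    "yellow": [(230, 210, 50), (200, 180, 40)],
    "blue":   [(40, 70, 200), (30, 55, 150)],
}

def classify_colour(rgb):
    """Label the RGB value read at a piece's centroid with the colour
    of the closest stored sample (Euclidean distance in RGB space)."""
    return min(
        COLOUR_SAMPLES,
        key=lambda name: min(math.dist(rgb, s)
                             for s in COLOUR_SAMPLES[name]),
    )

piece = classify_colour((35, 150, 55))
```

Because this matching alone is not reliable under all lighting, the detector also cross-checks each candidate's area and shape, as described above.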
3 Integration in the Social Robot
This section describes how the social robot Mini (described in Sect. 4) manages the Tangram game logic given the output of the Tangram detection module. The system also tries to make the human-robot interaction as friendly as possible. The game logic is also designed to help the user complete the target figure successfully. When a Tangram game starts, the robot performs some preparatory steps to ensure that the game will run correctly. First, the system tries to locate the game board as described in Sect. 2.1. For this purpose, it gives instructions about how
to place the board properly. If the detection fails, the robot tells the user that it does not want to play. Conversely, if the game board is correctly detected, the interaction continues with the camera calibration, following the steps shown in Sect. 2.2. The robot guides the user through this process via voice interaction, asking the user to place the green triangle on the board. If the triangle detection fails after three trials, the game finishes and Mini tells the user that it does not want to play. Otherwise, if the system detects the green triangle, the game begins. Once the game board and camera have been properly calibrated, the robot randomly chooses a target figure and shows it on the tablet. Then, the user starts building the target figure while the Tangram detection process (see Sect. 2.2) continuously provides information about the detected pieces (shape, colour, orientation, etc.). The system updates its information about the environment every 6 s, calculating the distance to the target figure for each detected piece. If no piece is detected within 30 s after this phase starts, the game returns to the calibration stage. Additionally, the game logic tries to engage and motivate the user throughout the game. For this purpose, the system includes a help mechanism in which the robot provides hints for completing the target figure. These clues are generated dynamically depending on the distance of each piece to the centroid of the largest group of pieces, the orientation error and the distance to neighbouring pieces. The system checks whether a piece is in the exact position of the target figure. Then, it goes through all detected pieces, selecting those with the correct orientation. The game logic checks whether a piece has any of its neighbours already placed on the board. If so, the system checks the distance between the centres of the adjacent pieces, considering that a piece is correctly placed if this distance is correct.
Once the system has checked all the pieces on the board, if no piece is correctly placed, the robot displays the target figure on the tablet again. If some pieces are misplaced with respect to the main cluster of pieces, the robot tells the user where to move one of those pieces. Finally, if there are two neighbouring pieces on the board but neither is correctly placed, the robot indicates which edges the user should connect to position the pieces correctly. The user can request a hint at any time by touching the robot’s belly. Besides, to achieve an engaging interaction, the robot encourages the user when it detects a well-placed piece, after some time without detecting updates in the piece positions, or when there is just one piece left to complete the target figure. We compiled a video showing a round of a real game with the Social Robot interacting with a user2.
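The hint-selection logic can be sketched roughly as follows (a simplification with hypothetical field names and thresholds, not the authors' exact rules):

```python
import math

def choose_hint(pieces, pos_tol=1.0, angle_tol=10.0):
    """Pick a hint in the spirit of the game logic described above:
    each piece carries its detected pose and its pose in the target
    figure (both in board coordinates, cm and degrees)."""
    placed = [p for p in pieces
              if math.dist(p["pos"], p["target_pos"]) <= pos_tol
              and abs(p["angle"] - p["target_angle"]) <= angle_tol]
    if not placed:
        return "show-target-figure"    # nothing placed: show the model again
    wrong = [p for p in pieces if p not in placed]
    if not wrong:
        return "figure-complete"
    # suggest moving the misplaced piece that is furthest from its slot
    worst = max(wrong, key=lambda p: math.dist(p["pos"], p["target_pos"]))
    return f"move-{worst['colour']}"

pieces = [
    {"colour": "green", "pos": (0.0, 0.0), "angle": 0,
     "target_pos": (0.0, 0.0), "target_angle": 0},
    {"colour": "blue", "pos": (9.0, 4.0), "angle": 45,
     "target_pos": (3.0, 2.0), "target_angle": 45},
]
hint = choose_hint(pieces)
```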
2 https://youtu.be/WpmEjTMO3Hk.

4 Data and Results

In this work, we used the social robot Mini, created at the RoboticsLab of the University Carlos III de Madrid (see Fig. 3). Mini aims at interacting with elderly
Fig. 3. Camera height on the Mini robot and distances between the robot and the play area
people with cognitive impairment. It has skills such as dancing and conversation, and can play cognitive stimulation games. To improve expressiveness, Mini has LEDs on its cheeks, heart and mouth, and 5 degrees of freedom to move its head, neck, arms and base. Mini also integrates touch sensors on the shoulders, belly and head, and a speaker to interact with users. The robot includes an external tablet to display multimedia content, such as videos and pictures, as well as buttons used for game and exercise interaction. In this project, we installed a USB fish-eye camera with a 180° field of view that provides images at a resolution of 1080p and 30 frames per second.

4.1 Experimental Setup
To validate the recognition of the Tangram pieces, we selected 15 different combinations of pieces on the board (see Sect. 1 of the supplementary materials3). These included all pieces in a single image, 2 touching pieces with colour combinations that challenged the detector, 3 touching pieces with different colour combinations, and 5 pieces together. We tested the detector against two environmental factors that could influence detection performance: camera distance to the board and lighting conditions. For each of the tests detailed next, we performed 1000 detections for each of the 15 combinations of pieces. First, we evaluated the detection performance while changing the distance between the camera and the game board, as shown in Fig. 3 (left). Given the camera’s wide field of view, it was possible to place it relatively close to the board, in the middle of the robot’s belly (25 cm of height). We also tried a second location (41 cm of height), shown in Fig. 3 (right), that also provided a good field of view. All the previous images in this paper, as well as the demo video, were acquired using this second distance.
Supplementary materials are available here: https://bit.ly/2Sk5R4q.
Table 1. Success rate (%) of piece recognition with the camera at two different heights (41 cm and 25 cm) and under low and high illumination. Medium illumination corresponds to the 41 cm height tests.

Piece           | 41 cm | 25 cm | Low illum. | High illum.
Blue Rhomboid   | 67.30 | 66.95 | 58.52      | 66.77
Yellow Square   | 93.88 | 91.10 | 14.88      |  5.92
Purple Triangle | 93.83 | 98.91 | 55.24      | 21.99
Black Triangle  | 85.12 | 88.90 | 66.45      | 81.60
Red Triangle    | 90.31 | 95.19 | 28.90      | 50.55
Orange Triangle | 89.78 | 96.41 | 40.66      | 28.96
Green Triangle  | 93.21 | 90.13 | 64.44      | 75.41
68 C. Menendez et al.
Tangram Social Robot
69
In the second round of tests, we varied the lighting conditions, keeping the camera height constant at 41 cm. We performed these tests with high illumination (adding an external light source) and with low illumination (turning off the lights). The medium lighting condition coincided with the previous round of distance tests with the camera at 41 cm, allowing a comparison.

4.2 Performance of the Pieces Detector
The results presented in this section are grouped by piece, varying camera distance and illumination. Table 1 shows the recognition success rate for each of the pieces at both camera heights. Six pieces have a high success rate (between 85% and 98%), but the detection accuracy for the blue rhomboid is approximately 67%. This is due to the obtuse angles of the rhomboid, which complicate the recognition of its vertices when the piece is in contact with another. On the other hand, the results obtained for both heights are similar: the maximum difference, found for the orange triangle, is less than 7%. From these tests, we can conclude that height is not a determining factor for piece detection, thanks to the calibration of the camera before recognition. Regarding lighting conditions, Table 1 also shows the success rate of each piece for the three types of lighting: high, medium and low, where the medium lighting corresponds to the 41 cm distance tests. In high lighting, the darker pieces (the black triangle, green triangle and blue rhomboid) have the highest hit rates: 81.60%, 75.41% and 66.77% respectively. In contrast, lighter colours have a lower success rate: the yellow square has the worst result, with a 5.92% hit rate, caused by the low contrast between the playground and the piece. When the illumination is low, we get similar results. The black and green triangles produce the highest success rates, with 66.45% and 64.44% respectively, and the yellow square again has the worst hit rate, at 14.88%. In contrast, the medium illumination (the 41 cm results in Table 1) shows good results for all of the pieces, above 85% in all cases except the blue rhomboid, at 67.3%. With these results, we conclude that lighting is a determining factor for this application; for the game to work correctly, we will rely on medium illumination.
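Each success rate in Table 1 is a simple percentage over the 1000 detections performed per piece and condition. As a stand-in sketch (the outcomes below are simulated, not the paper's data):

```python
import random

random.seed(0)
PIECES = ["Blue Rhomboid", "Yellow Square", "Purple Triangle", "Black Triangle",
          "Red Triangle", "Orange Triangle", "Green Triangle"]

def success_rate(outcomes):
    """Percentage of successful detections over a run of trials."""
    return 100.0 * sum(outcomes) / len(outcomes)

# Simulated stand-in for one test condition: 1000 detection outcomes per piece.
run = {p: [random.random() < 0.9 for _ in range(1000)] for p in PIECES}
rates = {p: round(success_rate(v), 2) for p, v in run.items()}
```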
Sections 2 and 3 of the supplementary materials show the detailed results for the 15 combinations of pieces described at the beginning of this section.
5 Conclusions
This paper shows how the application of computer vision techniques contributes to increasing the capabilities of social robots in scenarios such as the cognitive stimulation of elderly people. In this case, we have developed a system able to recognize Tangram pieces, together with the game logic that uses the HRI capabilities of the robot Mini to engage the user in the game. The system needs to correct the wide-angle distortion in the images and find the playground in order to recognize and classify the Tangram pieces accurately.
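Once the playground has been located, recognition can be made independent of the camera position by calibrating pixels to centimetres from the detected board. A minimal sketch of that idea, with a hypothetical board size and corner coordinates (not the paper's values):

```python
import numpy as np

BOARD_SIZE_CM = 30.0  # assumption: a square play area of 30 cm per side

def px_per_cm(corners_px):
    """Average side length (in pixels) of the detected board divided by its real size."""
    c = np.asarray(corners_px, dtype=float)
    sides = [np.linalg.norm(c[i] - c[(i + 1) % 4]) for i in range(4)]
    return float(np.mean(sides)) / BOARD_SIZE_CM

def to_cm(point_px, scale):
    """Convert a pixel coordinate (relative to the board origin) to centimetres."""
    return tuple(float(v) / scale for v in point_px)

corners = [(0, 0), (600, 0), (600, 600), (0, 600)]  # a 600-px square board
scale = px_per_cm(corners)          # 20 px per cm
print(to_cm((300, 150), scale))     # piece centroid at (15.0, 7.5) cm
```

The same scale holds for any camera height once the top-view transform has removed the perspective distortion, which is consistent with the height-independent results reported above.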
This process involves a series of segmentation and filtering steps to locate the game board and transform it into a top-view representation, which helps to find the pieces without perspective distortion. A calibration process has also been implemented to make the recognition independent of the camera position by establishing an equivalence between pixels and centimetres. The piece recognition creates a masked image that contains only the pieces and runs two detectors, one based on the colour of the pieces and another based on their shape. Combining these two methods, the system is able to provide an accurate result. Additionally, we wanted to check the performance and robustness of the implemented methods against different camera heights and lighting conditions. Results show that the heights tested did not have an impact on the performance of the game. However, illumination was an important factor in achieving a high recognition success rate. The small differences found in the two height tests lead us to believe that this system could be extended to two other robots with different sizes.

Acknowledgements. The research leading to these results received funding from the projects: "Robots Sociales para Estimulación Física, Cognitiva y Afectiva de Mayores (ROSES)", funded by the Spanish "Ministerio de Ciencia, Innovación y Universidades"; "Desarrollo de técnicas de VISión por computador para el alineamiento de HELIOstatos (VISHELIO)", funded by "Universidad Carlos III de Madrid"; and from RoboCity2030-DIH-CM, Madrid Robotics Digital Innovation Hub, S2018/NMT-4331, funded by "Programas de Actividades I+D en la Comunidad de Madrid" and co-funded by Structural Funds of the EU.
FullExpression: Using Transfer Learning in the Classification of Human Emotions

Ricardo Rocha and Isabel Praça

School of Engineering, Polytechnic of Porto (ISEP/IPP), Porto, Portugal
GECAD – Knowledge Engineering and Decision Support Research Centre, Porto, Portugal
{1100662,icp}@isep.ipp.pt
Abstract. During human evolution, emotion expression became an important social tool that contributed to the complexification of societies. Human-computer interaction is commonly present in our daily life, and industry is striving for solutions that can analyze human emotions, to improve workers' safety and security as well as process optimization. In this work we present a software system built using the transfer-learning technique on a deep learning model, and conclude on how it can classify human emotions through facial expression analysis. A Convolutional Neural Network model was trained and used in a web application. Several tools were created to facilitate the software development process, including the training and validation processes. Data was collected by combining several facial expression emotion databases. Software evaluation revealed an accuracy in identifying the correct emotions close to 80%.

Keywords: Emotions · Facial expressions · Artificial Intelligence · Deep Learning · Web application

1 Introduction
During human evolution, emotion expression became an important social tool that contributed to building more complex societies [1]. Human-computer interactions are present on a daily basis, and it seems that computers are not an emotional wall, meaning that humans are able to express emotions through computers with the same frequency as they do face-to-face [2]. Therefore, emotion recognition techniques aim to provide mechanisms to detect and classify human emotions, expressed in different forms such as facial, verbal and behavioral responses [3]. Emotions can be influenced by several factors such as human body format or environment context (e.g. food and weather).

The present work has been developed under the EUREKA – ITEA3 Project CyberFactory#1 (ITEA-17032) and Project CyberFactory#1PT (ANI–P2020 40124), co-funded by Portugal 2020.
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2021. P. Novais et al. (Eds.): ISAmI 2020, AISC 1239, pp. 72–81, 2021. https://doi.org/10.1007/978-3-030-58356-9_8
Likewise, humans can feel different types of emotions at the same time, and those emotions can have a cultural component too [4,5]. Paul Ekman [6] created a classification model called the Facial Action Coding System (FACS) and proposed seven universal emotional expressions: joy (happiness), sadness, surprise, fear, anger, disgust and contempt. Recently, advancements in the computer vision and Artificial Intelligence (AI) fields [7] have made it possible to improve the accuracy and speed of facial emotion classification in affective computing [8]. Likewise, advances in Artificial Neural Networks (ANN) and the new computational resources available have improved the general classification accuracy on computer vision problems [9], making them one of the most popular types of algorithms used nowadays [7,10]. Simple Machine Learning (ML) ANNs evolved into complex Convolutional Neural Network (CNN) architectures capable of solving general image classification tasks with high reliability, but at the cost of being computationally demanding. Architectures such as MobileNet [11] were created to be used on mobile devices, or devices with fewer resources available, without greatly compromising accuracy. Applying the transfer-learning technique, which takes the information stored in one model and applies it to a different but related problem, is a good choice when time and computation resources are limited [12]. The Deep Learning (DL) development process has four main steps: data gathering, exploration and normalization; training; testing; and inference. Most of the time spent by data scientists seems to be related to the collection and exploration of data, suggesting this is a key concern for success in the DL field. The amount and quality of data have a huge impact on the final accuracy results [13], but that is not the only factor: choosing the right loss and optimizer functions, as well as removing or adding the right neurons and layers, also seems to have an impact [14].
2 Research Works
The Facial Action Coding System (FACS) is widely used in emotion studies and software since it provides a systematic way of categorizing facial expression emotions. FACS is a coding language that maps all facial expressions, produced by facial muscle movements, into codes. Core emotions can be composed of one or many Action Units (AU), and FACS-based software tries to recognize the face landmarks (AU points) present in images or videos in order to correctly identify emotions. By using FACS it is possible to eliminate noise from differences in age, sex, skin colour or ethnicity, since it only looks at differences between facial muscles. Table 1 compares the emotion detection accuracy values of several FACS-based research works that used traditional algorithms to extract facial alignment or face landmarks to classify the emotions present in an image, or group of images. The accuracy values were extracted from each research work, meaning the same conditions were not always guaranteed. There is a set of steps, common to several research works, to get the best results in facial emotion recognition:
Table 1. FACS-based research works comparison

Reference | Accuracy value | Database used | Number of emotions
Human-computer interaction using emotion recognition from facial expression [15] | 98.8% | CK+ [16,17] | 6 (Anger, Disgust, Fear, Happy, Sad, Surprise)
A method to infer emotions from facial action units [18] | 97.0% | CK+ [16,17] | 6 (Anger, Disgust, Fear, Happy, Sad, Surprise)
Facial expression recognition using geometric normalization and appearance representation [19] | 96.5% | CK+ [16,17] | 6 (Anger, Disgust, Fear, Happy, Sad, Surprise)
Real-time emotion recognition from facial images using Raspberry Pi II [20] | 94.0% | Database locally created | 5 (Anger, Disgust, Happy, Neutral, Surprise)
Fast facial expression recognition based on local binary patterns [21] | 86.7% | JAFFE [22] | 5 (Anger, Disgust, Happy, Sad, Surprise)
Facial emotion recognition in modern distant education systems using SVM [23] | 83.9% | CK+ [16,17] | 5 (Anger, Disgust, Happy, Sad, Surprise)
Facial emotion recognition using fuzzy systems [24] | 78.8% | JAFFE [22] | 6 (Anger, Disgust, Fear, Happy, Sad, Surprise)
Emotion recognition from 3D videos using optical flow method [25] | 75.3% | BU3DFE [26] | 6 (Anger, Disgust, Fear, Happy, Sad, Surprise)
1. Facial Recognition: extraction of people's faces from images in order to avoid useless information. The Viola-Jones face detection algorithm [27] is an example that can be used to extract faces from images and can be found in several works [15,20,21,28]. Other works use the "Multi-pose Face Detection Based on Adaptive Skin Color and Structure Model" [29,30] or their own algorithms to extract faces [18];
2. Facial Alignment or Face Landmarking: detection of the edges of the face, eyes, nose, mouth and eyebrows. For this it is possible to use Local Binary Patterns (LBP) [19,21], the Active Shape Model (ASM) [20], Active Appearance Models (AAM) [23], the Fiducial Facial Point Detector [28], MATLAB tools [24], or even a hybrid approach of AAM and LBP [31]. Other authors have also proposed their own methods [15,25]. Some works skip the first step as the algorithm used does not need the initial facial recognition step (maybe due to the highly controlled test environment provided);
3. Dividing Landmarks by Facial Patches: division of the face into different parts (eyes, nose, mouth and eyebrows). The work by Sadeghi et al. (2013) divided the face into patches to remove the cheeks and surrounding face area, achieving a faster detection result in the classification phase and reducing computational costs [19];
4. Emotion Classification: based on the facial alignment step, an algorithm is used to detect the associated emotion. Different AI classification algorithms have been used, such as Least Mean Squares [30], Support Vector Machines [15,21,32], ANN [28,31] and AdaBoost [15]. Also, the work by Velusamy (2011) mapped facial landmarks into AUs in order to create their own algorithm that classified emotions based on different AU combinations [18].
Few works [33–35] use CNNs to detect emotions from images; however, it seems that EmotionalDan [36] is the CNN solution that achieves the highest accuracy value (≈75%).
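Local Binary Patterns, used for the landmarking/appearance step in several of the works above [19,21], can be illustrated with a minimal 3 × 3 implementation. The clockwise neighbour ordering below is one common convention, not taken from those papers:

```python
import numpy as np

def lbp_codes(img):
    """Basic 3x3 Local Binary Patterns: each interior pixel is encoded by
    comparing its 8 neighbours against it (bit = 1 if neighbour >= centre),
    read clockwise from the top-left neighbour."""
    img = np.asarray(img, dtype=float)
    c = img[1:-1, 1:-1]  # centre pixels (interior of the image)
    # Neighbour window offsets in clockwise order starting at the top-left.
    offs = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    code = np.zeros_like(c, dtype=int)
    for bit, (dy, dx) in enumerate(offs):
        nb = img[dy:dy + c.shape[0], dx:dx + c.shape[1]]
        code |= (nb >= c).astype(int) << bit
    return code

patch = np.array([[9, 9, 9],
                  [0, 5, 9],
                  [0, 0, 0]])
print(lbp_codes(patch))  # a single 8-bit texture code for the centre pixel
```

Histograms of these codes over facial patches are what the LBP-based works feed into their classifiers.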
3 FullExpression Ecosystem
The FullExpression Application is the product developed in this work and contains functionalities that allow users to generate, see and interact with emotion recognition results. In order to predict emotions from facial expressions, a total of two web applications and a DL training script were created, each playing a specific role: the FullExpression Web Tools Application has a set of functionalities which help to build and evaluate the FullExpression Application, and the FullExpression Training Script is a Python program used to train, fine-tune and generate DL models, in which the transfer-learning technique is applied over MobileNet V1 [11]. The model version used had an input size of 224 × 224 × 3. In order to apply the transfer-learning technique, the last layer of this model was replaced by a new one, with a 1 × 1 × 7 size, corresponding to the seven core emotions. In total, the model is composed of 3.2 million parameters and 89 layers. All layers were retrained during the training phase. Compared to FACS-based studies, FullExpression uses face extraction algorithms to extract faces, as most works do, but delegates the facial alignment/face landmarking, facial feature extraction and emotion classification steps to the DL model. Unlike EmotionalDan [36], it does not intentionally incorporate extra layers into the model in order to extract information about facial landmarks. Moreover, it is the only work which runs exclusively in browsers, using only browser resources to perform the emotion classification task. The two main functionalities of the FullExpression Application are emotion analysis in real time and emotion analysis from images. The first uses the webcam to capture real-time images and detect potential faces and the emotions expressed; the second allows users to generate emotional reports from their own images. Additionally, the user can interact with the report by searching images, downloading a version of the report in Excel format, or downloading the images organized by the corresponding emotion classification.
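The transfer-learning step described above (a pretrained backbone kept, only the final layer replaced by a seven-class head) can be illustrated with a toy NumPy stand-in. The frozen random "backbone" and synthetic images below are illustrative only, not MobileNet or the project's data:

```python
import numpy as np

rng = np.random.default_rng(0)
N_CLASSES = 7  # the seven core emotions

# Stand-in for a pretrained backbone: a frozen (untrainable) feature extractor.
W_backbone = rng.normal(size=(224 * 224, 32))

def features(images):
    """Frozen 'backbone': flatten each image and project it to 32 features."""
    return np.tanh(images.reshape(len(images), -1) @ W_backbone)

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def train_head(images, labels, lr=0.5, epochs=300):
    """Train only the new final layer (32 features -> 7 emotion classes)."""
    X, Y = features(images), np.eye(N_CLASSES)[labels]
    W_head = np.zeros((32, N_CLASSES))
    for _ in range(epochs):
        grad = X.T @ (softmax(X @ W_head) - Y) / len(X)
        W_head -= lr * grad
    return W_head

# Tiny synthetic "dataset": one grayscale 224x224 image per emotion.
imgs = rng.normal(size=(N_CLASSES, 224, 224))
labels = np.arange(N_CLASSES)
W_head = train_head(imgs, labels)
preds = softmax(features(imgs) @ W_head).argmax(axis=1)
```

The paper goes further than this sketch by also retraining the backbone layers; the sketch only shows why swapping the head suffices to repurpose a pretrained feature extractor.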
The principal steps of the emotion detection process are:
1. Detect faces in the image and extract them (using the Viola-Jones [37] algorithm);
2. Resize the extracted faces to 300 × 300 pixels;
3. Remove colour from the faces;
4. Run the trained DL model to predict the emotion expressed.
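Steps 2 and 3 can be sketched in plain NumPy; the BT.601 luma weights and nearest-neighbour resampling below are illustrative choices, while the application itself performs these steps in the browser:

```python
import numpy as np

def to_grayscale(img):
    """Remove colour using the ITU-R BT.601 luma weights (illustrative choice)."""
    return img @ np.array([0.299, 0.587, 0.114])

def resize_nn(img, size=300):
    """Nearest-neighbour resize of a 2-D image to size x size pixels."""
    h, w = img.shape
    rows = np.arange(size) * h // size  # source row for each output row
    cols = np.arange(size) * w // size  # source column for each output column
    return img[rows][:, cols]

face = np.zeros((120, 90, 3))   # a hypothetical extracted face crop (H x W x RGB)
gray = to_grayscale(face)       # shape (120, 90)
ready = resize_nn(gray)         # shape (300, 300), ready for the model
print(ready.shape)              # (300, 300)
```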
Fig. 1. FullExpression ecosystem components diagram
Moreover, Fig. 1 shows the FullExpression architecture, exposing the main component relationships:
– The Core library provides the core functionalities for all web applications, such as User Interface (UI) animations, loaders, interaction with charts, uploading images, statistical boards, access to the user's webcam, browser canvas manipulation, and Excel/general file manipulation and download. To do so, dependencies on external libraries were needed, including xlsx [38] (handling of Excel files), chartJS [39] (visualization of and interaction with charts), Angular Material (building and organizing the UI), jszip [40] (creating .zip files) and file saver [41] (downloading files);
– The Emotion Classification library provides functionalities to detect and classify emotions. It is composed of services that allow face detection (an implementation of the Viola-Jones algorithm [37]), image normalization and interaction with the model, in order to classify the emotions present in the images. The software uses tensorflow.js [42] to handle the model;
– The Confusion Matrix library provides functionalities to visualize and calculate statistical data from a given confusion matrix.
All web applications are driven by the separation of concerns (also known as modular architecture), promoting readability, maintainability, stability and reuse. They share a group of libraries, created for sharing functionalities across applications, and are prepared to be used by applications outside the FullExpression Ecosystem. They were designed to adapt the UI layout to different screen sizes and devices, and obey the single-page application principle. The FullExpression Ecosystem used the following technologies: Angular 8, TypeScript, Scss and Python as programming languages; NPM and Pip for package management; GitHub for version control; Firebase to deploy and serve both databases and applications; NodeJS to run the applications; and virtualenv to create isolated Python environments.
4 Experimental Findings
A database containing 4699 images was created, of which 3769 (≈80%) were used for training and 930 (≈20%) to evaluate the model. This database combines images from the KDEF & AKDEF [43] (≈62%), TFEID [44] (≈11%), Face Place [45] (≈22%) and JAFFE [22] (≈5%) databases. The software evaluation was performed against images from each individual database that were not used during the DL model training process. The Web Tools Application was responsible not only for creating a confusion matrix, but also for calculating metrics such as accuracy, precision, recall and F-score. The software correctly classified 78.44% of the total images containing facially expressed emotions. Happy (97.80%), disgusted (88.60%), surprised (86.30%) and neutral (81.70%) had the highest accuracy ratings compared to sad (73.10%), angry (69.10%) and afraid (62.50%). In addition, the accuracy results obtained for each core emotion varied from database to database (Fig. 2). For KDEF & AKDEF, the afraid emotion presented the lowest accuracy value, while the same emotion for the TFEID database is in the top 3 of highest accuracy ratings. Nevertheless, common to all databases, happy and disgusted are the top 2 emotions with the best accuracy ratings. Overall, the software correctly classified ≈85% of the total images from the KDEF & AKDEF database, ≈79% for the TFEID database, ≈74% for the JAFFE database and ≈43% for the Face Place database (Fig. 2).
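The metrics the Web Tools Application reports can all be derived from a confusion matrix; a minimal sketch, using a hypothetical 3-class matrix rather than the paper's 7-emotion results:

```python
import numpy as np

def metrics(cm):
    """Per-class precision/recall/F1 and overall accuracy from a confusion
    matrix whose rows are true classes and columns are predicted classes."""
    cm = np.asarray(cm, dtype=float)
    tp = np.diag(cm)                    # correct predictions per class
    precision = tp / cm.sum(axis=0)     # of those predicted as c, how many were c
    recall = tp / cm.sum(axis=1)        # of the true c, how many were found
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = tp.sum() / cm.sum()
    return accuracy, precision, recall, f1

# Hypothetical 3-class confusion matrix (not the paper's data).
cm = [[50, 5, 5],
      [10, 40, 10],
      [0, 10, 50]]
acc, p, r, f1 = metrics(cm)
print(round(acc, 3))  # 0.778
```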
Compared to the EmotionalDan [36] study, this work performs ≈4% better while training on only ≈26% as many images, suggesting that the amount of data is not the most important factor in DL model performance. Compared to the average accuracy of the FACS-based works, this work performs ≈8% worse, which means there is still room for DL-based approaches to improve in the emotion classification field.
Fig. 2. Software accuracy for different emotion databases
5 Conclusions
Emotions have an important role in building complex societies by providing an effective channel of communication. Industry is striving for solutions that can analyze human emotions, in an attempt to provide better user experiences. Previous works mixed ML algorithms with FACS extraction algorithms in order to classify facial expressions with the Paul Ekman core emotions. They presented good accuracy results, but few use DL solutions, which have proved to perform well on image classification. Moreover, none of the works analyzed ran ML and DL models in the browser.

With the purpose of using DL models to classify emotions in images, a web application was created. Thus, the FullExpression project demonstrated not only the possibility of running DL models in browsers, which opens new possibilities for the ML community, but also the possibility of creating emotion classification software based on DL models that accomplishes an accuracy of ≈80%. To create a DL model specialized in detecting emotions, a technique called transfer learning was used. This technique allowed the training of models much faster than building and training a new DL model from scratch. Since the FullExpression Application provides emotion image classification in real time, a model combining high accuracy values with low classification times was required, leading to the choice of MobileNet V1. And, with the goal of training and evaluating the DL model, a training script and a web application were created.

Moreover, the work presented can be useful in many different domains, as the perception of the emotional state of users can be determinant in: marketing (to understand the effect of marketing campaigns); health (in elderly people monitoring applications, mental disease diagnosis, interactive applications to monitor patients' behavior in hospital environments, among others); Industry 4.0 (to monitor humans collaborating with robots (cobots), making it possible to detect safety problems such as fatigue); and critical infrastructure operators (to monitor their attention levels and to support them in tiredness situations, among others).

More work could be done to improve the FullExpression ecosystem. The web application only classifies emotions from facial expressions, but humans also express emotions through physiological and behavioral responses. Data from these different ways of expressing emotions could be collected and used to improve accuracy results. Nonetheless, analyzing different human responses is not the only option to improve the machine's accuracy: data quantity, quality and diversity also have an important role.
References 1. Massey, D.S.: A brief history of human society: the origin and role of emotion in social life: 2001 presidential address. Am. Sociol. Rev. 67(1), 1–29 (2002). https://www.jstor.org/stable/3088931. http://www.jstor.org/stable/ 3088931?origin=crossref 2. Derks, D., Fischer, A.H., Bos, A.E.: The role of emotion in computer-mediated communication: a review. Comput. Hum. Behav. 24(3), 766–785 (2008) 3. Hockenbury, D., Hockenbury, S.: Discovering Psychology. Worth Publishers, New York (2007) 4. Cherry, K.: The James-Lange Theory of Emotion (2018). https://www. verywellmind.com/what-is-the-james-lange-theory-of-emotion-2795305 5. Bluhm, C.: Handbook on facial expression of emotion, November 2013. https:// search.proquest.com/docview/1518034163?accountid=13904 6. Ekman, P., Friesen, W.V.: Measuring facial movement. Environ. Psychol. Nonverbal Behav. 1(1), 55–75 (1976) 7. Ko, B.C.: A brief review of facial emotion recognition based on visual information. Sensors (Switzerland) 18(2), 401 (2018) 8. Picard, R.W.: Affective computing. Technical report, MIT Media Laboratory, Cambridge (1995) 9. Gershgorn, D.: The data that transformed AI research – and possibly the world (2017). https://qz.com/1034972/the-data-that-changed-the-direction-of-airesearch-and-possibly-the-world/ 10. Seif, G.: I’ll tell you why Deep Learning is so popular and in demand, pp. 2– 5 (2018). https://medium.com/swlh/ill-tell-you-why-deep-learning-is-so-popularand-in-demand-5aca72628780
Deployment of an IoT Platform for Activity Recognition at the UAL's Smart Home

M. Lupión, J. L. Redondo, J. F. Sanjuan, and P. M. Ortigosa(B)

Department of Informatics, University of Almería, CEiA3, Almería, Spain
[email protected], {jlredondo,jsanjuan,ortigosa}@ual.es
http://www.hpca.ual.es
Abstract. Currently, most smart homes are aimed at user comfort or even energy efficiency. However, there are many cases in which Ambient Assisted Living is used to monitor the health of elderly people or people with disabilities. In this paper, a proposal for an IoT system for activity recognition in a smart home is presented. Specifically, various low-cost sensors are incorporated into a home and send their data to the cloud. In addition, an activity recognition algorithm has been included to classify the information from the sensors and to determine which activity has been carried out. Results are also displayed in a web system, allowing the user to validate or correct them. This web system allows the visualization of the data generated by the sensors of the smart home, helps to easily monitor the activities carried out, and alerts the doctors or the user's family when bad habits or behavioural problems are detected.

Keywords: Ambient Assisted Living · Activity recognition · Machine learning algorithm

1 Introduction
Smart homes are considered to be intelligent environments that are composed of a wide variety of sensors and actuators capable of performing certain actions automatically with the aim of improving the user experience in the home [1]. Nowadays, smart homes not only aim to adapt the characteristics of the house to the user or to perform energy-saving tasks (lights are switched off in empty rooms, blinds are raised so that lights need not be turned on, etc.), but are also being used for services often referred to as Ambient Assisted Living (AAL). One of the goals of these services is to determine the wellness of elderly people, people with different pathologies, or people with disabilities living independently in their home [2]. This work has been supported by the Spanish Ministry of Economy, Industry and Competitiveness under grant RTI2018-095993-B-100, and the Spanish Junta de Andalucía under grants P18-RT-1193 and UAL18-TIC-A020-B, co-funded by FEDER funds. © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 P. Novais et al. (Eds.): ISAmI 2020, AISC 1239, pp. 82–92, 2021. https://doi.org/10.1007/978-3-030-58356-9_9
Data from these devices can be used to execute certain actions based on conditions, or to monitor the operations or values that generate them. However, the potential of IoT does not reside simply in allowing the completion of the above tasks, but in extracting knowledge from these data. As the number of devices grows, so does the amount of data that is generated and has to be stored. Therefore, it is necessary to provide IoT systems with the infrastructure needed to store and analyze all these generated data [3]. The work that has been carried out consists of the implementation of a low-cost IoT platform that is integrated into a smart home using sensors distributed throughout the home [4]. To implement the system, sensors that are non-intrusive for the user are used, with the aim of preserving the user's privacy [5]. In addition, a web tool in the cloud has been developed, which allows the user to access these data on mobile devices and from outside the home. It lets the user monitor the state of the home at all times, view a history of sensor data, and manage the placement of these sensors in rooms defined by the system users, among other functions. Besides monitoring, in this work an algorithm to recognize activities in real time is integrated into the IoT platform [6]. For the display of the data obtained by the algorithm, the web system also shows the user the activities that have been carried out in the house and statistics generated on the basis of this new information. Thus, by knowing the activities carried out, the user can become aware of various unhealthy behaviours such as, for example, a sedentary lifestyle, energy waste or opening the refrigerator a large number of times.
Activity recognition can also detect anomalous behaviour that may indicate a problem the user has had, in which case an emergency call can be made from the system to the doctor or to the user's family, to whom the system is connected at all times. There are many solutions [7,8] for turning homes into smart homes that recognize activities in real time. However, these solutions often do not provide enough functionality, and users cannot modify or customize the behavior of their smart devices. In addition, commercial smart home solutions are designed for young or middle-aged people, which means that they are often too complex for older people to use. This paper is organized as follows. In Sect. 2, we describe the infrastructure of the proposed IoT platform. In Sect. 3, we describe our proposed activity recognition algorithm and how it is integrated into the IoT platform. In Sect. 4, we discuss the experimental results. Lastly, we conclude in Sect. 5.
2 Infrastructure of the IoT System
The developed system (Fig. 1) is an Internet of Things system consisting of different low-cost, open-source devices, whose objective is to collect the data the user provides when interacting with the environment and to analyze them in order to detect the activities carried out by this user at home. In this IoT system, two parts with different functionalities can be identified.
Fig. 1. Infrastructure of the IoT system
First, the fog system is in charge of connecting the devices and obtaining the data they provide. The devices that are part of this system are the different types of sensors located in the house, together with a controller device that receives and stores the data from the sensors. Contact sensors (open/close detection), formed by two magnetic pieces that detect whether they are close to each other, are placed on elements such as the WC, doors, the refrigerator, etc. to detect whether they are open or closed. In addition, presence sensors (passive IR sensors, PIR) can differentiate among the heat emitted by different sources, so that they can distinguish between human heat and the heat emitted by static, inert objects. In this way, the presence of a person within their range of action is detected. Binary pressure sensors, which detect pressure on their surface, have also been placed on the bed and the sofa. Finally, temperature and humidity sensors are incorporated in order to monitor the environmental conditions of the house. The location of the sensors in the intelligent house is shown in Fig. 2. The controller that communicates with the different sensors is a Raspberry Pi 3B+ (SBC1), which receives and stores the data sent by the sensors. The connection between the sensors and the controller is made using a module called RaZberry, which uses the Z-Wave wireless protocol and has low power consumption. Secondly, various free cloud services have been used with the aim of relieving the controller of the most computationally costly operations. The database has been hosted in a service called MongoDB Atlas, which provides MongoDB database storage in a free cluster. In addition, Heroku has been used as a PaaS2, in which a web service has been deployed. It provides
1 Single Board Computer.
2 Platform as a Service.
Fig. 2. Location of sensors at smart home
the user with the tools to monitor and control the intelligent environment. Finally, a virtual machine from the University of Almería, hosted by OpenStack, has been created. The virtual machine runs a script written in Python that maintains a permanent connection with the database and analyzes the data generated by the sensors in order to detect the activities carried out by the users in the smart home. This virtual machine has 2 GB of RAM, 1 CPU, a 20 GB HDD and Ubuntu 18.04 LTS as its operating system. In general, when the user is at home, the IoT system is in constant operation. This means that the information from the different sensors is instantly sent to the controller, which in turn stores these data in a database in the cloud. This database can be accessed by different types of applications that can use the data for different purposes. In this work we have designed a machine learning algorithm, which accesses the newly generated data and then processes them to determine, almost instantaneously, the different actions that the user of the house has carried out.
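The paper describes the Python script on the virtual machine only at a high level. Purely as an illustration (not the authors' code), a permanent-connection loop that hands each newly stored sensor document to the recognition step could look like the following sketch, with an in-memory list standing in for the MongoDB collection and `fetch_unprocessed`/`on_event` as invented names:

```python
import time

def watch_events(fetch_unprocessed, on_event, poll_seconds=1.0, max_polls=None):
    """Repeatedly poll the data store; hand every new sensor document to
    the activity-recognition callback and mark it as processed."""
    polls = 0
    while max_polls is None or polls < max_polls:
        for doc in fetch_unprocessed():
            on_event(doc)  # e.g. trigger the per-room classification
            doc["processed"] = True
        polls += 1
        time.sleep(poll_seconds)

# Demo with an in-memory stand-in for the cloud database
store = [{"sensor": "Fridge", "room": "Kitchen", "state": "ON", "processed": False}]
seen = []
watch_events(lambda: [d for d in store if not d["processed"]],
             seen.append, poll_seconds=0, max_polls=1)
```

In a real deployment the polling would be replaced by a persistent connection to the cloud database, as the paper describes.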
3 Activity Detection Algorithm

3.1 Description of the Machine Learning Algorithm
A house monitored with multiple sensors generates a large amount of data that is difficult to interpret as a whole, and therefore it is complicated to draw conclusions about the activities being carried out in the house. An example of a real situation of data being collected in the intelligent house is shown in Table 1. As can be seen with the naked eye, changes in the sensors are displayed, which can correspond to activities such as "Prepare dinner" or "Go to the WC".
However, this deduction can be made in this case because only 8 sensor readings are involved. At the end of the day, many more data are generated, and it becomes quite complicated to draw conclusions from them easily, so it is necessary to design and implement an algorithm that processes these data intelligently to determine the different activities. In this way, if elderly, ill or disabled people reside in the intelligent house, medical staff or family members can monitor their activity more easily using the data provided by the algorithm.

Table 1. Sequence of sensors' notifications

Time      Sensor     Room      State
20:45:04  Fridge     Kitchen   ON
20:45:15  Fridge     Kitchen   OFF
20:45:20  Microwave  Kitchen   ON
20:45:22  Microwave  Kitchen   OFF
20:46:00  WC         Bathroom  ON
20:46:04  Microwave  Kitchen   ON
20:46:15  Microwave  Kitchen   OFF
20:47:04  WC         Bathroom  OFF
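Rows like those in Table 1 map naturally onto a small record type. A minimal sketch (field names chosen here, not taken from the paper):

```python
from dataclasses import dataclass

@dataclass
class SensorNotification:
    time: str    # "HH:MM:SS"
    sensor: str  # e.g. "Fridge", "Microwave", "WC"
    room: str    # e.g. "Kitchen", "Bathroom"
    state: str   # "ON" or "OFF"

sequence = [
    SensorNotification("20:45:04", "Fridge", "Kitchen", "ON"),
    SensorNotification("20:45:15", "Fridge", "Kitchen", "OFF"),
    SensorNotification("20:46:00", "WC", "Bathroom", "ON"),
]

# The per-room analysis starts by grouping notifications by room
kitchen = [n for n in sequence if n.room == "Kitchen"]
```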
Given the large amount of data to be analyzed in order to classify the information according to the different activities carried out by users, artificial intelligence algorithms are nowadays widely used for activity recognition. Among these are machine learning algorithms [9]. These algorithms try to classify data into different classes, either automatically, by analysing the relationships among all the data, or under supervision, where it is initially indicated to which class a series of initial data belong. In this work, a supervised algorithm is used, since the starting point is a classification of activities defined by the user or supervisor, and the algorithm learns to recognize them guided by the classification that the user initially makes of the different activities that occur in the dwelling. Specifically, the K-NN algorithm, widely used in the field of AAL, is adopted [10] due to its better performance in this case over other algorithms such as Decision Trees, Random Forest and Linear Regression. This algorithm classifies each new sensor data sequence based on the k most similar sequences previously labelled correctly by the user; in this case, k = 5, chosen after several tests with different values of k, because with higher values the algorithm failed to recognize activities with fewer samples in the dataset. To calculate the similarity between the new sensor data sequence and the other sequences in the dataset, the algorithm uses a custom distance function. This function gives a different "weight" to each attribute of the data being evaluated. Fig. 3 shows the attributes of the data from higher to lower importance: "Sensors' state", "Last activity", "Room", "Sensor which activates", "Duration" and "Day time". Since the scikit-learn library for Python has been used to program the algorithm, the dataset must contain only numbers in order to classify new data correctly. To achieve this, the stored dataset is transformed using pipelines and one-hot encoding, which convert categorical attributes such as "Sensor which activates" and "Last activity" into numerical data. To sum up, this machine learning algorithm does not build a model and is quite tolerant to noise. Its main disadvantage is that its performance degrades as the size of the dataset increases.
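The actual attribute weights and encoding are not published. Purely as an illustration of the idea of a weighted custom distance with a majority vote among k = 5 neighbours (all weights and field names below are invented), the classifier can be sketched in plain Python:

```python
from collections import Counter

# Illustrative weights, ordered from most to least important as in Fig. 3;
# the real values used in the paper are not published.
WEIGHTS = {"sensors_state": 6, "last_activity": 5, "room": 4,
           "triggering_sensor": 3, "duration": 2, "day_time": 1}

def distance(a, b):
    """Weighted mismatch count between two encoded samples (a numeric
    attribute such as duration would normally use a scaled difference)."""
    return sum(w * (a[k] != b[k]) for k, w in WEIGHTS.items())

def classify(sample, dataset, k=5):
    """Majority vote among the k labelled samples nearest to `sample`."""
    nearest = sorted(dataset, key=lambda row: distance(sample, row[0]))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

kitchen = {"sensors_state": "fridge_on", "last_activity": "None",
           "room": "Kitchen", "triggering_sensor": "Fridge",
           "duration": 5, "day_time": "evening"}
bath = {"sensors_state": "wc_on", "last_activity": "Shower",
        "room": "Bathroom", "triggering_sensor": "WC",
        "duration": 2, "day_time": "morning"}
dataset = [(kitchen, "Prepare dinner")] * 3 + [(bath, "WC")] * 2
```

With scikit-learn, the same idea is usually expressed by passing a callable `metric` to `KNeighborsClassifier(n_neighbors=5)`, as the paper suggests.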
Fig. 3. Algorithm’s data structure
The implemented algorithm recognizes activities performed in different rooms. Obviously, each room has a different number and type of sensors, as they have been adapted to the different recognisable activities. The kitchen is the room where the most activities have been defined, such as "preparing breakfast", "preparing food", "cleaning kitchen", etc. Consequently, more sensors have been installed there, such as those located at the refrigerator, microwave, dishwasher, pantry, dishwasher cabinet, etc. Recognizing activities in the kitchen therefore involves analyzing the quantity and sequence of changes that occur in the sensors of this room. On the other hand, there are rooms such as the bedroom in which fewer activities have been defined and the number of sensors is therefore lower. Thus, a pressure sensor located on the bed is enough to detect the "lying down" activity, and in this particular example it is sufficient to analyse the sensor status (on/off) to determine the activity. For this reason, depending on the room and on what is required to detect a particular activity, different information is considered and stored in the database. In the case of the kitchen, the number of times the room's sensors change while the user is in the room is stored in the data structure; in the case of the bedroom, the status of the sensors is stored, as in [11].
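The room-dependent feature extraction just described can be sketched as follows (function and key names are illustrative): a kitchen sample records how many times each sensor changed while presence was detected, whereas a bedroom sample only records the sensor status:

```python
def kitchen_features(changes):
    """Count how many times each kitchen sensor changed state while the
    PIR sensor detected the user in the room."""
    counts = {}
    for change in changes:
        counts[change["sensor"]] = counts.get(change["sensor"], 0) + 1
    return counts

def bedroom_features(status):
    """A single pressure sensor status (on/off) is enough for 'lying down'."""
    return {"bed_pressure": status["bed_pressure"]}

changes = [{"sensor": "Fridge"}, {"sensor": "Fridge"}, {"sensor": "Microwave"}]
```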
Algorithm Operations
As mentioned above, the designed algorithm performs the activity recognition that is done in each of the rooms. By considering the rooms in an isolated way,
the data coming from the rooms are analysed separately and simultaneously. This allows recognizing the activities of several users, as long as there is only one user per room. The algorithm works as follows. It is assumed that the algorithm is permanently connected to the database, in such a way that each new data entry is processed directly. Once a new datum generated by a sensor in a room is received, the recognition of the activity in that room is triggered. Each room stores the status of its associated sensors together with the starting time of the current activity. When the new datum arrives, the activity that was active at that moment may have ended. To analyze the possible new activity, the K-NN algorithm is executed, the solution is saved in the database and the room status is updated. For the rooms that, as previously commented, analyse all the changes that occur during the time the user is in the room, the beginning and end of the activity are given by the presence detected by the PIR sensor or sensors located in the room. Another way of splitting the data is analyzed in [12].

Table 2. Initial dataset activities (training phase)

Activity           Instances  %      Activity              Instances  %
Prepare breakfast  53         9.63   Hand-washing          37         6.73
Prepare lunch      47         8.55   Dressing              19         3.45
Prepare snack      28         5.09   Put washing machine   7          1.28
Prepare dinner     29         5.27   Sleep                 11         2.00
Cleaning kitchen   66         12.00  Sitting on the bed    23         4.18
WC                 85         15.45  Sitting 1 person      18         3.27
Shower             45         8.18   Sitting 2 people      15         2.73
Brushing teeth     40         7.27   Sitting 3 people      12         2.19
Shaving            15         2.73   Total                 550        100
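The per-room flow described in Sect. 3.2 can be sketched as follows: one isolated state per room, with the `classify` and `save` callables standing in for the K-NN execution and the database write (all names here are illustrative):

```python
rooms = {}  # room name -> {"sensors": {...}, "since": start of current activity}

def handle(event, classify, save):
    """Process one sensor notification for its room, in isolation."""
    state = rooms.setdefault(event["room"], {"sensors": {}, "since": None})
    if state["since"] is not None:
        # The activity active until now may have ended: classify and store it.
        save(event["room"], classify(state))
    state["sensors"][event["sensor"]] = event["state"]
    state["since"] = event["time"]

results = []
classify = lambda state: "Prepare dinner"            # stand-in for K-NN
save = lambda room, label: results.append((room, label))
handle({"room": "Kitchen", "sensor": "Fridge", "state": "ON",
        "time": "20:45:04"}, classify, save)
handle({"room": "Kitchen", "sensor": "Fridge", "state": "OFF",
        "time": "20:45:15"}, classify, save)
```

Because each room keeps its own state, events from different rooms can be handled concurrently, which is what enables multi-user recognition as long as there is one user per room.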
In this work, a set of activities has been defined considering two basic aspects: first, the location of the sensors in the smart home (Fig. 2); and second, some of the most significant activities considering that the user is an elderly, disabled or ill person. Table 2 shows the activities the algorithm is expected to identify. Once the activities and the sensor values that could help to identify them were defined, a training phase of the algorithm was carried out. In this phase, the supervisor tagged each activity to guide the learning of the algorithm. After the training phase, the algorithm is able to identify, almost instantly and with a high probability of success, the different activities being performed in the house.
4 Results
To measure the efficiency and effectiveness of the proposed IoT activity recognition system, the 17 activities shown in Table 2 were defined. For the training phase of the algorithm, data were collected for 3 weeks from the sensors installed in the house, where 4 users lived. During this time, 550 records were stored in the database and used to tag the 17 activities. Table 2 shows the total number of records used for each activity and its percentage with respect to the initial 550 records. The graphical interface, designed to show the sensors active at any given time and the activity the algorithm recognizes, helps considerably with correct labelling when the algorithm fails, allowing the user to rectify a classification (Fig. 4) even after the training phase and thus letting the algorithm keep improving. In this way, learning can also take place in later phases and the algorithm becomes increasingly successful. The fact that 4 users participated in the activity generation allows the algorithm to classify more accurately, since the sample is more heterogeneous.
Fig. 4. Change detected wrong activity
The designed system not only recognizes the activities; several graphics are also continuously updated in order to show, in an easy and fast way, the different activities performed during a period of time. As an example of one of these graphics, Fig. 5 shows a sequence of activities recognized by the algorithm in the smart home during one day. This allows the user to easily identify bad habits, or an activity performed at a time when it should not be. Regarding the efficiency of the algorithm in the recognition of activities, Table 3 summarizes the success rate in the classification of activities grouped by room. This success varies with the room, since the number of activities in a room and the similarity between them complicate correct classification.
Table 3. % Correctly classified activities in each room

Room         % Correctly classified  % Instances in dataset
Bathroom     72                      40.55
Kitchen      85                      43.82
Living Room  95                      8.18
Bedroom      97                      6.18
Outside      100                     1.27
Fig. 5. Timeline of detected activities
In general, better values are obtained for the rooms that contain contact sensors only, since the activities are more clearly defined. On the other hand, in rooms such as the kitchen and the bathroom, a wide variety of very similar activities can be performed (brushing teeth, shaving or washing hands), which the algorithm learns to differentiate by storing correctly classified instances. For this reason, it is necessary to build a good dataset and to allow the algorithm to keep being corrected, and thus keep training, even after the training phase has finished, especially for the activities that are more difficult to recognize.
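Since K-NN is instance-based (no model is built), letting the user "continue training" after the training phase reduces to appending the rectified sample to the dataset. A minimal sketch of that idea (data and names invented here):

```python
dataset = [({"room": "Bathroom", "sensors_state": "wc_on"}, "WC")]

def rectify(sample, correct_label, dataset):
    """Store the user's corrected label so that future K-NN votes
    for similar samples lean towards the right activity."""
    dataset.append((sample, correct_label))

# The algorithm said "WC", but the user relabels the sample as "Hand-washing"
rectify({"room": "Bathroom", "sensors_state": "tap_on"}, "Hand-washing", dataset)
```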
5 Conclusions and Future Work
In this project, an Internet of Things system has been developed that integrates various types of low-cost sensors in the smart home of the University of Almería. The data from these sensors are displayed in a web system. In addition, an activity recognition algorithm has been incorporated. It recognizes 17 activities with a
very high degree of certainty, allowing the user to visualize its results in the web system and also to correct or validate the data provided by the algorithm. The user can also incorporate into the algorithm new activities that are important to detect. In addition, the system shows different graphs. The most important one is a "timeline", which shows the activities performed throughout the day selected by the user. In this way, anomalies in the daily life of the inhabitants of the house can be spotted at first sight, and alerts can be created when they occur. A voice assistant has also been included in the smart home, allowing inhabitants to ask questions such as: Have I taken a shower? What time did I get up? How many times did I eat today? Therefore, a person who has difficulties interacting with a web system can interact with the smart home as he/she would with a human. As far as future work is concerned, since the sensors currently included in the IoT system do not allow identifying which user performs which activity, location sensors will be introduced to detect the activity performed by each user in the home. In addition, acceleration and gyroscope sensors (already incorporated in smartwatches) will be introduced to determine more specific activities from the user's movement alone, without requiring the user to interact with the currently integrated sensors. For the correct detection of user activities with these sensors, a deep-learning model could be tried, as mentioned in [13]. Additionally, these new sensors will improve the degree of accuracy on the current activities. Finally, alerts produced by anomalies in the inhabitants' behaviour will be sent to a doctor or a caregiver, depending on the situation. Everything will be incorporated into the current IoT system.
References

1. Ding, D., Cooper, R.A., Pasquina, P.F., Fici-Pasquina, L.: Smart homes and home health monitoring technologies for older adults: a systematic review. Int. J. Med. Inform. 91(2), 44–59 (2016)
2. Chan, M., Campo, E., Estève, D., Fourniols, J.-Y.: Smart homes - current features and future perspectives. Maturitas 64(2), 90–97 (2009)
3. Marjani, M., Nasaruddin, F., Gani, A., Karim, A., Hashem, I.A.T., Siddiqa, A., Yaqoob, I.: Big IoT data analytics: architecture, opportunities, and open research challenges. IEEE Access 5, 5247–5261 (2017)
4. Yin, J., Fang, M., Mokhtari, G., Zhang, Q.: Multi-resident location tracking in smart home through non-wearable unobtrusive sensors. LNCS, vol. 9677, pp. 3–13. Springer (2016)
5. Ding, D., Cooper, R.A., Pasquina, P.F., Fici-Pasquina, L.: Sensor technology for smart homes. Maturitas 69(2), 131–136 (2011)
6. Samarah, S., Zamil, M.G.A., Aleroud, A.F., Rawashdeh, M., Alhamid, M.F., Alamri, A.: An efficient activity recognition framework: toward privacy-sensitive health data sensing. J. Acoust. Soc. Am. 117(3), 988–988 (2017)
7. Chen, L., Hoey, J., Nugent, C.D., Cook, D.J., Yu, Z.: Sensor-based activity recognition. IEEE Trans. Syst. Man Cybern. Part C 42(6), 790–808 (2012)
8. Soulas, J., Lenca, P., Thépaut, A.: Unsupervised discovery of activities of daily living characterized by their periodicity and variability. Eng. Appl. Artif. Intell. 15, 90–102 (2015)
9. Alsheikh, M.A., Lin, S., Niyato, D., Tan, H.: Machine learning in wireless sensor networks: algorithms, strategies, and applications. IEEE Commun. Surv. Tutor. 16(4), 1996–2018 (2014)
10. Deng, Z., Zhu, X., Cheng, D., Zong, M., Zhang, S.: Efficient kNN classification algorithm for big data. Neurocomputing 195(3), 143–148 (2017)
11. Jurek, A., Nugent, C., Bi, Y., Wu, S.: Clustering-based ensemble learning for activity recognition in smart homes. J. Acoust. Soc. Am. 117(3), 988–988 (2014)
12. Wan, J., O'Grady, M.J., O'Hare, G.M.P.: Dynamic sensor event segmentation for real-time activity recognition in a smart home context. Pers. Ubiquit. Comput. 19, 287–301 (2015)
13. Wang, J., Chen, Y., Hao, S., Peng, X., Hu, L.: Deep learning for sensor-based activity recognition: a survey. Pattern Recogn. Lett. 119, 3–11 (2019)
Algorithms for Context-Awareness Route Generation

Ricardo Pinto1(B), Luís Conceição1,2(B), and Goreti Marreiros1(B)

1 GECAD – Research Group on Intelligent Engineering and Computing for Advanced Innovation and Development, Institute of Engineering, Polytechnic of Porto, 4200-072 Porto, Portugal
{1140482,msc,mgt}@isep.ipp.pt
2 ALGORITMI Centre, University of Minho, 4800-058 Guimarães, Portugal
Abstract. This work has as its main goal the investigation and experimentation on automatic generation of routes for tourists and visitors of points of interest, considering the knowledge of the routes, the profile of the visitor and the context awareness of the tour. The context of a trip can be captured through various sources of information, such as the location of the tourist, the time of the visit, and the weather conditions, as well as relevant aspects and characteristics of the user's activity and profile. The developments of this work are part of TheRoute project, and its main goal is the development of a route generation module that considers the context of the tourist, the trip and the environmental constraints. In order to solve the proposed problem, two algorithmic solutions were developed. One is an adaptation of the A* algorithm with cuts, while the other is based on Ant Colony Optimization, a Swarm Intelligence algorithm. The results of the experiments allowed us to conclude that A* with cuts, oriented by the heuristic for the path with the highest score, obtains the best combination of results for the defined satisfaction metrics. Keywords: Route generation · Points of interest · User profile
1 Introduction The growing influx of tourists registered in Portugal creates the need to enhance the country's cultural resources and promote new routes and itineraries. With these ideas in mind, the use of intelligent systems capable of combining different scientific and technological capabilities has been gaining importance in the market. TheRoute project (Tourism and Heritage Routes including Ambient Intelligence with Visitants' Profile Adaptation and Context Awareness) has as its main objective the research and experimentation in the scope of the automatic generation of routes for tourists and visitors of points of interest, considering the knowledge of the domains of the routes, the visitor's profile and context-awareness parameters [1]. The context of a trip can be captured through many sources of information, such as the location of the tourist, the schedule of the visit, and the weather conditions [2], as well © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 P. Novais et al. (Eds.): ISAmI 2020, AISC 1239, pp. 93–105, 2021. https://doi.org/10.1007/978-3-030-58356-9_10
as relevant aspects and characteristics of the user's personality. The generation of routes considers different restrictions of the environment, such as the attributes of the points of interest (opening and closing times, duration of the visit, entrance fee, etc.) and travel times. All these factors were measured and considered in previous stages of the project and are not part of the scope of this research work. Recommending a route in real time is one of the most determining factors in maximizing tourists' satisfaction with the system. To solve these optimization problems, algorithms based on heuristics are normally used. Heuristics are procedures that find good or almost optimal solutions to an optimization problem [3]. Given a computational problem, a heuristic is essentially a search algorithm that seeks the best solution to the problem from among the possible resolution hypotheses [4]. These algorithms use reasonable resources and are capable of producing acceptable solutions, but without any theoretical guarantee [4]. Currently, there is a huge variety of heuristic-based algorithms that can generate paths between different points. Some known examples are: Branch-and-Bound, Hill-Climbing, Tabu Search, Simulated Annealing, Swarm Intelligence and Genetic Algorithms. In Sect. 2, the methods developed to find a solution to the proposed problem are presented. Sect. 3 presents the experiments carried out to validate those methods. Finally, in Sect. 4, the conclusions of the work are presented.
2 Methods

Each point of interest is considered to have a link to all the others. This way, points of interest can be interpreted as nodes of a graph, all connected through paths or edges. Each of these paths has an associated weight or cost, measured according to a criterion relevant to the problem. As the tour must take place within a defined time window, the travel time was chosen to measure the effort of a movement between nodes in the graph. The goal is to find a route by executing a search algorithm on the graph, considering a specific starting point and a time limit to complete the entire route. For N nodes in a graph, the number of possible combinations is N!. Consequently, the greater the number of nodes in the graph, the greater the number of connections between them and the greater the number of possible combinations. For this reason, whatever algorithm is used to generate the final solution, it is important that it receives as few nodes and connections as possible, taking into account the constraints of the problem. In an earlier stage, the weather conditions are checked. The model used to represent a point of interest has three properties related to meteorological data: the minimum and maximum temperatures advisable to visit the place and its exposure to rain. Thus, it is possible to restrict the points that should be considered for a visit. Apart from this aspect, it is important to consider whether the opening hours are compatible with those of the visit. The day and time of the tour are chosen by the user when requesting a route recommendation. However, as time progresses during the search made by the algorithm, some of the places initially filtered may have already closed. This check must be performed by the algorithm and it will have to be carried
Algorithms for Context-Awareness Route Generation
95
out during its execution. Each point of interest belongs to several tourist categories, and the tourist has a user profile whose preferences are correlated with each of the categories in the system. Thus, in order to choose the best candidates for the visit, it is important to understand how closely they fit the user's tastes. User preferences for each category are assigned as a rating between zero and five. For each point of interest in the list of candidates, the scores of its categories are averaged, taking into account the respective values (categoryValue) in the user's profile. If the number of categories (c) is greater than zero, the score of the point of interest is calculated with the following formula:

POI Score = (\sum_{i=1}^{c} categoryValue_i) / c   (1)

If there are no matches between the categories of the point of interest and those in the user's profile, i.e. c = 0, the point of interest is automatically assigned a zero score. Finally, the definitive list of candidate points of interest is obtained; the selection criterion is a score equal to or greater than three, the intermediate value of the defined scale. After obtaining all the candidate points for the tour, it is necessary to check all the existing paths and distances between them. The displacement between points of interest is measured in temporal units, more specifically in minutes. The compilation of these data allows the construction of a graph. The starting point for exploring paths is the point of interest closest to the physical location of the tourist's mobile device. The only stop condition defined for the algorithm is the time remaining until the end of the trip. Satisfactory results were obtained with adaptations of A* and Ant Colony Optimization.

2.1 A*

A* is an algorithm widely used to find the shortest path between two points in a graph.
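The scoring and filtering step described in Sect. 2 can be illustrated with a minimal Java sketch; it computes Eq. (1) and applies the score-at-least-three criterion. Class and method names here are our own illustrations, not taken from TheRoute's code base.

```java
// Sketch of the candidate-scoring step (Eq. 1). Names are illustrative.
import java.util.Arrays;

public class PoiScore {

    /** Average of the user's ratings (0-5) for the POI's categories; 0 if none match. */
    static double score(double[] categoryValues) {
        int c = categoryValues.length;
        if (c == 0) return 0.0;                        // no matching categories -> score 0
        return Arrays.stream(categoryValues).sum() / c;
    }

    /** A POI is kept as a candidate when its score reaches the scale midpoint (3). */
    static boolean isCandidate(double[] categoryValues) {
        return score(categoryValues) >= 3.0;
    }

    public static void main(String[] args) {
        System.out.println(score(new double[]{4, 5, 2}));    // ~3.67 -> candidate
        System.out.println(isCandidate(new double[]{1, 2})); // false, below the midpoint
    }
}
```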
For the problem to be solved, instead of trying to minimize the sum of the costs of moving from one point of interest to another, the aim is to maximize the sum of their scores. In this approach there is no concept of a destination node, since only the source node is known. Starting from the origin node, an exploratory search is performed through the graph until it reaches the stop condition: the sum of the costs of traveling between points of interest, including visiting times, must be equal to or less than the total time indicated for the trip. Once the condition is reached, the search for that path ends, with the possibility of exploring others. The adopted implementation, in addition to the assumptions indicated above, also uses other strategies of A*. A queue, known as the "open set", stores the nodes that are of interest to explore during the execution of the algorithm. There is also a list of already visited nodes, known as the "closed set", which avoids cycles in the graph by not allowing an already visited node to be traversed again. At the beginning of the execution, the origin node is added both to the queue of nodes to be visited and to the list of visited nodes. Then, the queue of nodes to visit is iterated until there are no more nodes to explore. The first node in the queue is removed and each of its neighbors is explored. If the neighbor is not included in the list
of visited nodes, it is checked whether the sum of travel and visit times to that point exceeds the tour time limit or whether the point of interest is closed during the desired period. If either condition is met, the analyzed path is discarded and not re-verified. If it is possible to visit the point of interest in question and the heuristic chosen for evaluation is favorable, the node is added to the queue of nodes to be explored in future iterations. Otherwise, the node is ignored and will not be explored again. Figure 1 presents the pseudocode of the A* adaptation used to find a route.
Fig. 1. Pseudocode implementation of the A* algorithm
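Since the pseudocode of Fig. 1 is not reproduced here, the following Java sketch illustrates, under our own naming and simplifications, the exploratory loop just described: an open set of partial paths, a no-revisit check standing in for the closed set, a time-limit cut, and highest-score bookkeeping. It is a sketch of the idea, not the paper's implementation.

```java
// Illustrative reconstruction of the search loop described in Sect. 2.1.
import java.util.*;

public class RouteSearch {

    // travel[i][j] = travel time in minutes between POIs i and j
    static List<Integer> search(int[][] travel, int[] visitMin, double[] score,
                                int origin, int timeLimit) {
        Deque<List<Integer>> openSet = new ArrayDeque<>();   // partial paths to explore
        openSet.add(new ArrayList<>(List.of(origin)));
        List<Integer> best = List.of(origin);
        double bestScore = score[origin];
        while (!openSet.isEmpty()) {
            List<Integer> path = openSet.poll();
            int last = path.get(path.size() - 1);
            for (int n = 0; n < travel.length; n++) {
                if (path.contains(n)) continue;              // "closed set": no cycles
                if (elapsed(path, travel, visitMin) + travel[last][n] + visitMin[n] > timeLimit)
                    continue;                                // would exceed the tour time
                List<Integer> next = new ArrayList<>(path);
                next.add(n);
                double s = next.stream().mapToDouble(i -> score[i]).sum();
                if (s >= bestScore) {                        // "highest result" heuristic
                    bestScore = s;
                    best = next;
                }
                openSet.add(next);                           // keep exploring from here
            }
        }
        return best;
    }

    // Total visit plus travel time accumulated along a path.
    static int elapsed(List<Integer> path, int[][] travel, int[] visitMin) {
        int t = 0;
        for (int i = 0; i < path.size(); i++) {
            if (i > 0) t += travel[path.get(i - 1)][path.get(i)];
            t += visitMin[path.get(i)];
        }
        return t;
    }
}
```

As the paper notes next, this exhaustive exploration grows quickly with the number of nodes, which motivates the 120-minute cuts described below it in the text.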
However, this code is not fast enough to find a solution when the considered time periods are too long or when there are many nodes to analyze. For this reason, it was necessary to divide the graph exploration into smaller blocks, applying cuts to the search. After experimenting with several values, it was concluded that the algorithm became more stable for periods of 120 min. The total time of the visit is therefore divided into portions of 120 min and the algorithm presented above is invoked iteratively. In the specific case where the total visit time defined by the user is smaller than this value, a single iteration is made, taking that time into account. Each iteration returns a fragment of the final path which, after all iterations complete, is integrated into the route recommended to the tourist. For the developed algorithm, two variants were idealized, considering two different heuristics for accepting locations on a route.

Heuristic for the Highest Result. The heuristic for the highest result aims to find the set of points of interest for which the sum of their scores is the highest. As explained
earlier, the adopted approach manages to considerably decrease the search space, finding an acceptable solution using cuts. Consequently, this heuristic simply checks whether the path being analyzed has a total score equal to or greater than the maximum score recorded by the algorithm so far. If it does, this path replaces the previous best and is used as the point of comparison in subsequent iterations. The heuristic does not guarantee the optimal solution to the problem, but it does find an optimized solution for the tourist: it presents a path that takes his tastes and preferences into account, discarding routes that would not fit him, and it allows a recommendation to be suggested in a reasonable time.

Heuristic Based on Simulated Annealing. After analyzing the Simulated Annealing (SA) algorithm, the idea arose of using its heuristic to select the next node in the A* algorithm. The decision process in SA is carried out through a heuristic that tests the viability of the movement from one node to another through an acceptance probability function. This function returns a number (acceptance) that is compared with a random value between 0 and 1; if the acceptance is greater than the generated number, the movement is accepted. As the algorithm evolves, the probability of choosing bad movements decreases. The idealized acceptance function is similar to the one used in SA, with a slight change to make it compatible with the developed A* algorithm: since there is no temperature concept in A*, the temperature is replaced by the difference between the time limit of the visit and the time accumulated up to the iteration being verified. The acceptance probability function is:

A_p = e^{-(p_{max} - p_{current}) / (t_{final} - t_{current})}   (2)

Where:
p_max: maximum score accumulated up to the current iteration;
p_current: score accumulated up to the current iteration;
t_final: visit time limit;
t_current: time accumulated up to the current iteration.
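A direct transcription of this acceptance test in Java might look as follows. It is a sketch under the usual SA sign convention (negative exponent), so that the acceptance of worse moves shrinks as the remaining time t_final - t_current approaches zero; names are illustrative.

```java
// Sketch of the acceptance test in Eq. (2); variable names mirror the equation.
import java.util.Random;

public class SaAcceptance {

    /** Probability of accepting a move whose accumulated score is below the best so far. */
    static double acceptance(double pMax, double pCurrent, double tFinal, double tCurrent) {
        return Math.exp(-(pMax - pCurrent) / (tFinal - tCurrent));
    }

    /** SA-style decision: accept when the probability beats a uniform draw in [0, 1). */
    static boolean accept(double pMax, double pCurrent, double tFinal, double tCurrent,
                          Random rnd) {
        return acceptance(pMax, pCurrent, tFinal, tCurrent) > rnd.nextDouble();
    }
}
```

Note how the substitution of remaining time for temperature reproduces the SA cooling effect: early in the tour (large denominator) bad moves are accepted often; near the time limit the probability collapses toward zero.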
2.2 Ant Colony Optimization

Ant Colony Optimization (ACO) is a search algorithm that imitates the behavior of ants moving between their colony and a food source. Several ants leave the colony in search of food and, if they find any, leave a trail of pheromones along the way. The other ants in the surroundings follow this trail, heading the same way. The greater the number of ants on a path, the more attractive it becomes, as more pheromones are released. However, over time, the effect of these pheromones tends to fade due to evaporation. The shortest path keeps a higher density of pheromones and is therefore more attractive, so the ants converge on an approximately optimal path. For the problem in question, this analogy can be drawn between ants and tourists, since both wish to obtain a path that satisfies their objective. The implemented solution is an adaptation of the ACO algorithm idealized by [5], which can be partially visualized in Fig. 2. It is a multithreaded implementation
in which multiple "ants" are created; each starts from a fixed origin point (the point of interest closest to the tourist) and explores a path through the created graph until it reaches one of the stopping conditions. The calculations made by each ant are based on the work developed in [6] and run on independent threads. There are shared data structures, such as the matrix of distances between points of interest and the matrix of scores between vertices of the graph. Both matrices are filled before the explorer threads are created and remain immutable throughout their execution. A third matrix was created to hold the values resulting from the pheromone updates performed by the tasks running on different threads. Initially, this matrix is filled with the values of the score matrix divided by five; dividing the score by the width of the scale yields a probability of choice that varies between zero and one. Each ant follows a route from the user-defined starting point and calculates a possible best path. When passing through the various points on its path, the ant marks these places so as not to visit them again. If the time limit for the trip has not been exceeded and the place is open for a visit, the ant adds that point to the list of visited places and updates the pheromone matrix to help the others find the best path.
Fig. 2. Pseudocode implementation of the ant colony algorithm
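As the pseudocode of Fig. 2 is likewise not reproduced here, the following is a minimal Java sketch of two pieces just described: the pheromone matrix initialized as the score matrix divided by the scale width (5), and a roulette-wheel choice of the next unvisited node weighted by pheromone. Names are our own illustrations.

```java
// Sketch of pheromone initialisation and next-node choice for the ACO adaptation.
import java.util.Random;

public class Pheromones {

    /** Pheromone matrix starts as scores / 5, giving probability-like weights in [0, 1]. */
    static double[][] init(double[][] scores) {
        int n = scores.length;
        double[][] ph = new double[n][n];
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++)
                ph[i][j] = scores[i][j] / 5.0;   // 0-5 rating scale -> [0, 1]
        return ph;
    }

    /** Roulette-wheel choice of the next unvisited node, weighted by pheromone. */
    static int next(double[][] ph, int from, boolean[] visited, Random rnd) {
        double total = 0;
        for (int j = 0; j < ph.length; j++)
            if (!visited[j]) total += ph[from][j];
        double r = rnd.nextDouble() * total;
        for (int j = 0; j < ph.length; j++) {
            if (visited[j]) continue;            // ants never revisit a marked place
            r -= ph[from][j];
            if (r <= 0) return j;
        }
        return -1;                               // no unvisited node left
    }
}
```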
If an ant finds a path with a score equal to or greater than the last best found, the value and the respective path are recorded for later comparison with the results obtained by the other ants. The execution control of the different threads is done through
Java's ExecutorCompletionService. A fixed-size thread pool was created with the same number of threads as available CPU cores. Initially, thousands of "ants" (tasks) are created and submitted to the executor service, which queues them for execution. As tasks are completed, their results are stored in a list to be compared at the end of all runs.
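This thread-pool control can be sketched with the same Java API the implementation names. The Result record and the toy tasks below are illustrative stand-ins for the real per-ant computations.

```java
// Sketch of the ExecutorCompletionService control described in Sect. 2.2.
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.*;

public class AntRunner {

    record Result(double score, List<Integer> path) {}

    /** Submit all ant tasks and keep the best-scoring result as they complete. */
    static Result best(List<Callable<Result>> ants) throws Exception {
        int cores = Runtime.getRuntime().availableProcessors();
        ExecutorService pool = Executors.newFixedThreadPool(cores); // one thread per core
        CompletionService<Result> ecs = new ExecutorCompletionService<>(pool);
        for (Callable<Result> ant : ants) ecs.submit(ant);          // thousands of tasks
        Result best = null;
        for (int i = 0; i < ants.size(); i++) {                     // collect as they finish
            Result r = ecs.take().get();
            if (best == null || r.score() > best.score()) best = r;
        }
        pool.shutdown();
        return best;
    }

    public static void main(String[] args) throws Exception {
        List<Callable<Result>> ants = new ArrayList<>();
        for (int i = 1; i <= 4; i++) {
            final int k = i;                                        // toy "ant" with fixed score
            ants.add(() -> new Result(k, List.of(0, k)));
        }
        System.out.println(best(ants).score());                     // prints 4.0
    }
}
```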
3 Experiments and Evaluation

The validation of the presented algorithms was carried out through tests and simulations. In order to validate the consistency and performance of the developed module, test data were collected, similar to those expected in real conditions. The tests were performed on a portable computer equipped with an Intel Core i7-3635QM 2.40 GHz processor and 16 GB of RAM. The database used for the tests has a total of 266 points of interest. As all points of interest are linked by bidirectional paths, the total number of paths between each two points (edges of the graph) is, at most, 70,490. A user and his profile were created in order to simulate the use case of creating a route. Each time the tourist requests the creation of a route through the mobile application, the data that serves as input to the algorithm is sent: the coordinates where the tourist is located (latitude and longitude); the day of the week on which the tourist wants to travel; the start and end time of the tour; and the means of travel (on foot or by car). Considering this diversity of inputs, two datasets were defined for further analysis according to the evaluation metrics. Each of these datasets consists of a list of points of interest and their travel times. The car was chosen as the main means of transport to calculate travel times. The day of the week was chosen randomly, but it is the same for both scenarios in order to ensure that the opening and closing times of each point of interest do not change during the experiments. The start and end times of the visit differ in order to obtain different sets of points of interest, since not all have the same opening hours. Each of the referred scenarios generates a specific compilation of data. As it is a large amount of data, the datasets in Table 1 are displayed in a simplified way, showing only the content relevant to the analysis and experimentation.

Table 1. Datasets used in the experiments (simplified view)

Name      | Number of POIs | Connections between POIs | Starting time | Ending time | Average visiting time (minutes)
Dataset 1 | 92             | 8372                     | 14 h 00       | 18 h 00     | 22.326
Dataset 2 | 92             | 8372                     | 12 h 30       | 20 h 30     | 22.326
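As a quick sanity check on the graph sizes quoted above: with bidirectional links stored as two directed edges, 266 POIs yield at most 266 × 265 = 70,490 edges, and the 92 filtered POIs of each dataset yield 92 × 91 = 8,372, matching Table 1.

```java
// Verifying the edge counts reported in Sect. 3 and Table 1.
public class EdgeCount {

    /** Every POI linked to every other, in both directions: n * (n - 1) directed edges. */
    static int directedEdges(int pois) {
        return pois * (pois - 1);
    }

    public static void main(String[] args) {
        System.out.println(directedEdges(266)); // 70490, as reported in the text
        System.out.println(directedEdges(92));  // 8372, as in Table 1
    }
}
```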
To facilitate the treatment and analysis of the results, abbreviations are used for the names of the algorithms: Ant Colony Optimization is referred to as ACO and A* as AS. As two different heuristics are tested for AS, the heuristic choosing the highest score is designated AS-Max and the heuristic based on Simulated Annealing AS-SA.
3.1 Evaluation Metrics

Each algorithm needs to perform a different set of operations, which is reflected in the time it takes to return a valid solution. Performance is assessed by the quality-cost ratio of the results, that is, by the quality of the solution produced versus the execution time of the algorithm. The sum of the scores of the points of interest should be maximized, as it influences the tourist's satisfaction with the recommendation. This satisfaction is calculated using the formula in (3), taking into account the n points of interest to visit:

S_{tourist} = \sum_{i=1}^{n} POI Score_i   (3)
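Both evaluation metrics, the satisfaction sum in Eq. (3) and the occupancy rate defined next in Eq. (4), can be sketched as follows (illustrative names; times in minutes):

```java
// Sketch of the two evaluation metrics of Sect. 3.1.
public class Metrics {

    /** Eq. (3): tourist satisfaction is the sum of the visited POIs' scores. */
    static double satisfaction(double[] poiScores) {
        double s = 0;
        for (double p : poiScores) s += p;
        return s;
    }

    /** Eq. (4): share of the user-defined time window filled by the generated route. */
    static double occupancy(double routeStart, double routeEnd,
                            double setStart, double setEnd) {
        return (routeEnd - routeStart) / (setEnd - setStart) * 100.0;
    }

    public static void main(String[] args) {
        // e.g. a route from 14:00 to 17:58 inside a 14:00-18:00 window:
        System.out.println(occupancy(14 * 60, 17 * 60 + 58, 14 * 60, 18 * 60)); // ~99.2 %
    }
}
```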
The quality of the algorithm can also be verified by how well it uses the time entered by the user to carry out the route, that is, the generated route should provide the highest possible occupancy rate for the tourist. This rate can be expressed as a percentage using the formula in (4):

occupancy (%) = ((t_{rf} - t_{ri}) / (t_{sf} - t_{si})) \times 100   (4)
Where:
t_ri: start time of the generated route;
t_rf: end time of the generated route;
t_si: time set for the start of the route;
t_sf: time set for the end of the route.

3.2 Execution Time Scenario

The algorithm's execution time is one of the most determining factors for the viability of the solution to be included in TheRoute. For a better comparative analysis of the collected data, one graph was generated per dataset. In each case, the execution times in seconds are shown for the algorithms, each tested over 30 executions. Figure 3 shows the execution times of each algorithm with the data of Dataset 1 as input.

Fig. 3. Algorithms' execution times for Dataset 1 (seconds over 30 runs; series: ACO, AS-Max, AS-SA)
Considering a maximum travel time of 4 h and a total of 92 points of interest, the algorithm that has a more favorable and uniform execution time is AS-Max. With a time of less than 1 s in all executions, this algorithm proves to be much more efficient than the others. AS-SA is the one with the biggest time oscillations, exceeding the 5 s mark in its slowest execution. The ACO always presents results close to 3 s. Since it is the efficiency
in finding a valid path that is being evaluated by this metric, the AS-Max algorithm is, without any doubt, the one that reveals the best performance for Dataset 1. When considering a travel time of 8 h and the same 92 points of interest, the graph illustrated in Fig. 4 was obtained for the algorithms' execution times, considering the data from Dataset 2.

Fig. 4. Algorithms' execution times for Dataset 2 (seconds over 30 runs; series: ACO, AS-Max, AS-SA)
By analyzing the presented values, it is possible to conclude that the best-performing algorithm for Dataset 2 is again AS-Max. Once again, all its executions take close to 1 s, less than any of the times achieved by the other algorithms. ACO, in turn, comes second with execution times close to those achieved for Dataset 1, always around 3 s. AS-SA unequivocally reveals the worst results, again showing large differences between its best and worst executions; in its worst execution, the time needed to calculate the resulting path approaches 10 s.

Conclusions. From the performed analysis, it is concluded that the algorithm with the highest response speed is AS-Max in all the study cases carried out. The other algorithms also obtained very encouraging results. The peaks in the execution time of the AS-SA algorithm are due to the random nature of the choice of the next node to be visited. The extra time taken by ACO, even for shorter trips, can be justified by the overhead of creating the various threads and the number of tasks performed on them. However, all tests exceeded expectations, as valid routes were found in a very short time.
3.3 Route Score Scenario

Each of the simulations evaluated in the previous section obtained a score for the generated path. For a better analysis of this metric, bar graphs were used to compare the results obtained by the algorithms over their first 15 executions. This gives a good visual perception of the values without losing much information, since the remaining executions reveal values very close to those of the first ones. The results of the simulations for Dataset 1 can be verified through the scores obtained for each generated route, illustrated in Fig. 5.

Fig. 5. Route scores obtained by the algorithms for Dataset 1 (score per execution; series: ACO, AS-Max, AS-SA)

As can be verified, all the algorithms present very close scores for a relatively short path time (4 h). ACO is the algorithm that obtains the highest results in most cases, always reaching 46.25 points. AS-Max also presents constant values, in the order of 40 points. The score difference between these two algorithms is already significant (6.5 points), which may indicate that the selection made by ACO is better than that of AS-Max. AS-SA's results vary between 39 and 47 points, the minimum and maximum values recorded in this series of simulations: at its best, AS-SA exceeds the values obtained by ACO by a mere 0.75 points, and at its worst it falls only 1 point below AS-Max. The bar graph in Fig. 6 shows the scores obtained in each simulation performed for Dataset 2.

Fig. 6. Route scores obtained by the algorithms for Dataset 2 (score per execution; series: ACO, AS-Max, AS-SA)
Compared to the previously analyzed graph, the one in Fig. 6 already shows significant differences between the scores obtained by each algorithm. In this set of simulations, ACO has an enormous point advantage over the other two, obtaining 84.75 in all its executions. In second place, but far from ACO, comes AS-SA, with scores ranging from 77.25 to 80.25. Lastly comes AS-Max, obtaining just 77 points. The increase in route time also increased the disparities between the algorithms' tour scores, which highlights ACO as the one that obtains the best results.

Conclusions. The analysis carried out allows us to conclude that ACO is the most suitable for the metric under study. This way, it is possible to recommend a route that
satisfies the tourist, considering various personal aspects of his profile and the conditions of the surrounding context. AS-SA again proved quite fickle, obtaining considerable differences in values under identical test conditions; the Simulated Annealing heuristic once again has a great influence on the randomness of the results. In the case of AS-Max, the algorithm that obtained the worst score in most of the tests performed, it appears that the cuts made in the search affect the score of the obtained route.
3.4 Time Occupancy Rate to Complete the Route Scenario

Before the request for the creation of a route is fulfilled, the tourist specifies times for the beginning and end of the route. The developed algorithms take these values into account when searching for and building the solution presented to the user. However, as the routes are generated with the objective of obtaining a solution that matches the tourist's preferences, the time window defined for the route is not always fully filled. As this is a time requirement defined by the tourist, it is also important to assess the tour occupancy rate. In each of the study cases, 30 runs of each algorithm were performed. Minimum and maximum end times were obtained for each generated route, as well as the average percentage occupancy rate with respect to the time defined by the user to complete the route. Table 2 presents the obtained route schedules and the average occupancy rate on a 4-h route, calculated from the results obtained for Dataset 1.

Table 2. Route schedules and average occupancy time rate for Dataset 1

Algorithm | Start time | Minimum end time | Maximum end time | Average occupancy time rate
ACO       | 14:00:00   | 17:38:42         | 17:38:44         | 91.13%
AS-Max    | 14:00:00   | 17:58:41         | 17:58:41         | 99.45%
AS-SA     | 14:00:00   | 17:49:33         | 17:59:57         | 98.93%
As can be verified from the occupancy rates, all the algorithms reach values above 91%. AS-Max obtains the best average, generating routes that occupy 99.45% of the time defined by the user. AS-SA reaches the second-best average, with values around 98.93%. Finally, ACO comes in with 91.13%, below the other algorithms. While the AS-Max and AS-SA algorithms waste only a few minutes or seconds, in some cases ACO is unable to use approximately 20 min of a 4-h route. Table 3 contains the schedules and the average occupancy time for the routes generated for a period of 8 h, considering the data collected in Dataset 2. With the increase of the period defined to complete the route, the time is better used. ACO remains the algorithm with the worst performance in this metric, reaching an average value of 94.74%, which corresponds to a waste of approximately 25 min on an 8-h route. However, compared to the previous analysis,
Table 3. Route schedules and average occupancy time rate for Dataset 2

Algorithm | Start time | Minimum end time | Maximum end time | Average occupancy time rate
ACO       | 12:30:00   | 20:04:45         | 20:04:45         | 94.74%
AS-Max    | 12:30:00   | 20:29:10         | 20:29:10         | 99.83%
AS-SA     | 12:30:00   | 20:22:55         | 20:29:59         | 99.56%
the increase in unused time was only 5 min. The other algorithms also saw a slight improvement in the percentage of time used. Once again, AS-Max obtained the best result, with 99.83%, slightly ahead of AS-SA with 99.56% of the time used.

Conclusions. From the performed analysis, it is confirmed that the AS-Max algorithm is the one that best meets this metric in all the cases carried out. AS-SA also obtained very promising results, despite the inconsistency of its variations. The ACO algorithm fell short of expectations, wasting more time than the others.
4 Conclusions

The work developed arose from the need for a route generation module in the TheRoute project, an existing tourism recommendation system. One of the main objectives of this system is to present personalized routes, considering the user's personality and the analysis of aspects derived from the surrounding context. At an earlier stage, some of the existing solutions within the scope of tourist recommendation systems were analyzed. From that analysis, some algorithms capable of generating paths while examining a determined search space were found. These algorithms use considerable computational resources but can produce acceptable solutions. In the implementation stage, algorithmic solutions based on A* and Ant Colony Optimization were developed. All were tested against the defined metrics in possible real scenarios. The results were considered suitable for the problem at hand, since it is possible to obtain routes in short periods of time while considering aspects of the user's profile and the context analysis. As it achieved better results in two of the three identified satisfaction scenarios, the AS-Max algorithm is the one with the best quality-cost ratio for generating path recommendations.

Acknowledgments. This work was supported by National Funds through the FCT – Fundação para a Ciência e a Tecnologia (Portuguese Foundation for Science and Technology) within the Projects UIDB/00319/2020 and UIDB/00760/2020 and by the Luís Conceição Ph.D. Grant with the reference SFRH/BD/137150/2018.
References
1. Ramos, C., Marreiros, G., Martins, C., Faria, L., Conceição, L., Santos, J., Ferreira, L., Mesquita, R., Lima, L.S.: A context-awareness approach to tourism and heritage routes generation. In: Novais, P., Jung, J.J., Villarrubia González, G., Fernández-Caballero, A., Navarro, E., González, P., Carneiro, D., Pinto, A., Campbell, A.T., Durães, D. (eds.) 9th International Symposium on Ambient Intelligence – Software and Applications, pp. 10–23. Springer (2019)
2. Borràs, J., Moreno, A., Valls, A.: Intelligent tourism recommender systems: a survey. Expert Syst. Appl. 41, 7370–7389 (2014). https://doi.org/10.1016/j.eswa.2014.06.007
3. Eiselt, H.A., Sandblom, C.-L.: Heuristic algorithms. In: Eiselt, H.A., Sandblom, C.-L. (eds.) Integer Programming and Network Models, pp. 229–258. Springer, Berlin (2000). https://doi.org/10.1007/978-3-662-04197-0_11
4. Porumbel, D.C.: Heuristic algorithms and learning techniques: applications to the graph coloring problem. 4OR 10(4), 393–394 (2012)
5. Jungblut, T.: Ant Colony Optimization for TSP Problems (2015)
6. Dorigo, M., Di Caro, G.: The Ant Colony Optimization Meta-Heuristic. New Ideas in Optimization (1999)
Detection Violent Behaviors: A Survey

Dalila Durães1,2(B), Francisco S. Marcondes2, Filipe Gonçalves2,3, Joaquim Fonseca3, José Machado2, and Paulo Novais2

1 CIICESI, ESTG, Instituto Politécnico do Porto, Felgueiras, Portugal
[email protected]
2 Algoritmi Centre, University of Minho, Braga, Portugal
[email protected], {jmac,pjon}@di.uminho.pt
3 Bosch Car Multimedia, Braga, Portugal
{filipe.goncalves,joaquim.fonseca2}@pt.bosch.com
Abstract. Violent behavior detection is a particular problem within the broader problem of action recognition. In recent years, the detection and recognition of violence has been studied for several applications, namely surveillance. In this paper, we conducted a systematic review of the recent literature on this subject, covering a selection of researched papers. The selected works are classified into three main approaches to violence detection: video, audio, and multimodal audio and video. Our analysis provides a roadmap to guide future research in designing automatic violence detection systems. Techniques for extracting and describing features to represent behavior are also reviewed, and classification methods and structures for behavior modelling are provided.

Keywords: Violence detection · Action recognition · Video surveillance · Audio surveillance · Multimodal surveillance
1 Introduction
In recent years, the detection and recognition of violence has been studied for several applications, namely surveillance. In surveillance, the analysis and automatic detection of abnormal, dangerous and violent events is an important field of study [1]. Action recognition and violence detection have been studied, from different perspectives, in several disciplines, including psychology, biomechanics and computer vision [1]. In computer vision, several studies have been performed using different approaches: in some, the technique applied was visual detection; in others, audio detection; and in others still, combined audio and visual techniques. In the visual approach, a movement is a set of primitive actions and describes a whole-body movement, whereas a primitive action is an atomic motion that can
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2021
P. Novais et al. (Eds.): ISAmI 2020, AISC 1239, pp. 106–116, 2021. https://doi.org/10.1007/978-3-030-58356-9_11
be described in terms of a limb [2]. The detection of human movements consists of obtaining a set of actions; several consecutive actions, together with an interpretation of the movement being performed, are called activities [2]. In the audio approach, the signal produced by the sound contains information that visual data alone cannot represent, namely screams, explosions, abusive words and even sound passages conveying some kind of emotion. In this paper, we conducted a systematic review of the recent literature on this subject, covering a selection of researched articles. This paper is organized as follows. Section 2 introduces the concepts of action recognition and violence detection. Section 3 presents the sample setup. Section 4 presents the results and discussion, namely the video, audio, and multimodal audio and video approaches. Finally, Sect. 5 concludes the review with a global analysis and some future work for this research.
2 Concepts
The goal of an intelligent surveillance system for violence detection is to effectively detect events in real time so as to avoid dangerous situations. To that end, it is necessary to understand some important surveillance concepts.

2.1 Action Recognition
In human activity recognition, a system detects human activities. Human activities are classified into four categories, depending on the complexity of the actions and the number of body parts involved: gestures, actions, interactions, and group activities [3]. A gesture is a collection of movements made with the hands, head, or limbs to convey a particular meaning. Actions are a collection of multiple gestures performed by a single person. Interactions are a collection of human actions with at most two actors; when there are two actors, one must be a human and the other can be a person or an object. Group activities are a combination of gestures, actions, or interactions in which the number of actors is greater than two and there may be one or more interactive objects [3].

2.2 Violence Detection
Violence detection is a particular problem within the broader problem of action recognition. The objective of violence detection is to automatically and effectively determine whether violence occurs within a short period of time. In recent years, the automatic recognition of human actions in videos has become increasingly important for applications such as video surveillance, human-computer interaction, and content-based video retrieval [2,4].
108
D. Durães et al.
The purpose of violence detection is to automatically and effectively determine whether violence occurs. However, violence detection is in itself an extremely difficult problem, since the concept of violence is subjective. It is an important issue not only at the application level but also at the scientific level, because it has characteristics that differentiate it from generic action recognition.
3 Sample Setup
To obtain the sample, the search was carried out in February 2020 on the ACM DL NEXT platform (dlnext.acm.org). The first query was: [[All: violence] OR [All: fight] OR [All: aggression]] AND [[All: detection] OR [All: recognition] OR [All: surveillance]] AND [[All: indoor] OR [All: inside] OR [All: interior]]. For that query, we obtained 6,641 papers. Restricting the search to the last five years, the query returned 2,055 papers. In addition, requiring the keywords 'detection', 'recognition', or 'surveillance' in the title reduced the result to 183 papers. To narrow the query further, we introduced the abstract keywords 'review', 'survey', or 'benchmark'. The final query was: [[All: violence] OR [All: fight] OR [All: aggression]] AND [[Publication Title: detection] OR [Publication Title: recognition] OR [Publication Title: surveillance]] AND [[All: indoor] OR [All: inside] OR [All: interior]] AND [[Abstract: review] OR [Abstract: survey] OR [Abstract: benchmark]] AND [Publication Date: (02/01/2015 TO 02/29/2020)].
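The staged narrowing described above can be sketched as a chain of predicates over paper metadata. This is an illustrative sketch only: the record structure and field names are hypothetical stand-ins, not the ACM DL API.

```python
# Hypothetical paper records: dicts with "title" and "abstract" fields.
VIOLENCE = {"violence", "fight", "aggression"}
DETECTION = {"detection", "recognition", "surveillance"}
INDOOR = {"indoor", "inside", "interior"}

def matches(paper):
    """Boolean query: (violence terms) AND (detection terms) AND (indoor
    terms), each matched against any field of the paper."""
    text = " ".join(paper.values()).lower()
    return (any(k in text for k in VIOLENCE)
            and any(k in text for k in DETECTION)
            and any(k in text for k in INDOOR))

def title_filter(paper):
    # Restrict detection/recognition/surveillance to the title only.
    return any(k in paper["title"].lower() for k in DETECTION)

def abstract_filter(paper):
    # Keep only reviews, surveys, and benchmarks.
    return any(k in paper["abstract"].lower()
               for k in {"review", "survey", "benchmark"})

def run_query(papers):
    stage1 = [p for p in papers if matches(p)]          # broad query
    stage2 = [p for p in stage1 if title_filter(p)]     # title restriction
    return [p for p in stage2 if abstract_filter(p)]    # abstract restriction
```

Each stage only removes papers, mirroring the 6,641 → 2,055 → 183 → 41 narrowing (the date filter is omitted here since the synthetic records carry no date).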
Fig. 1. Weight of each keyword in the query. First column, complement; second, absolute weight; third, relative weight; and fourth, normalized weight.
The query and the keyword weights are shown in Fig. 1. The query retrieved 41 results published in the last five years. The first part of Fig. 1 presents the query submitted to the ACM DL search engine. The second part of Fig. 1 shows the weight of each keyword in the query. The first column, complement, depicts the number of papers retrieved when the keyword is removed from the query; the second, absolute weight, is the difference between the number of papers retrieved by the query and the complement; the third, relative weight, is the percentage that the absolute weight represents in relation to the number of retrieved papers; and the fourth, normalized weight, is the max-min normalization of the relative weight.
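The four columns just described can be reproduced in a few lines. The complement counts below are made-up illustrations, not the values in Fig. 1, and the sign convention for the absolute weight (taking the magnitude of the difference) is an assumption.

```python
def keyword_weights(retrieved, complements):
    """Compute the Fig. 1 columns. `retrieved` is the number of papers the
    full query returns; `complements[k]` is the count with keyword k removed.
    Taking the magnitude of (retrieved - complement) is an assumption."""
    absolute = {k: abs(retrieved - c) for k, c in complements.items()}
    relative = {k: 100.0 * a / retrieved for k, a in absolute.items()}
    lo, hi = min(relative.values()), max(relative.values())
    normalized = {k: (r - lo) / (hi - lo) if hi > lo else 0.0
                  for k, r in relative.items()}
    return absolute, relative, normalized
```

With illustrative complements {violence: 141, detection: 61, indoor: 41} and 41 retrieved papers, the normalized weights come out as 1.0, 0.2, and 0.0 respectively.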
4 Results and Discussion
Of the 41 papers obtained by the query, some contained information not related to the topic. After reading all the papers, it was possible to separate three different approaches: video, audio, and multimodal audio and video. For the video approach we have 18 papers from the query; for the audio approach, 3 papers; and for the multimodal audio and video approach, 3 papers.

4.1 Video Approach
Automated surveillance occurs when cameras are constantly monitored by a computer that, in real time, triggers an event when something suspicious happens. In video-based violence detection, it is difficult to capture effective and discriminative features because of the variations of the human body, caused essentially by scale, viewpoint, mutual occlusion, and dynamic scenes. In recent years, several surveys have been published on abnormal human behavior recognition [5–7], human behavior detection [8,9], crowd behavior [10], human recognition datasets [11–13], and foreground segmentation [14]. There is also research on fast violence detection [15–19], multi-feature descriptors for human activity tracking and recognition [20], segmentation [21], and vision-enhanced color fusion techniques [22]. Mabrouk and Zagrouba [5] study the two main steps composing a video surveillance system: behavior representation and behavior modelling. For behavior representation, they present the most popular global features (optical flow and motion information), local features (based on interest points: STIP, CSTIP, MoSIFT, or spatio-temporal volume, cube, blob), features widely used for fall detection (shape), features adapted for crowd monitoring (texture), and features adapted for tracking a single person (object tracking and trajectory extraction). Abnormal behavior recognition methods are classified by: (i) modelling frameworks and classification methods; and (ii) scene density and moving object interaction. For modelling frameworks and classification methods, the categories of classification methods are compared: supervised, semi-supervised (rule-based and model-based), and unsupervised; a comparison of frameworks and classification methods for abnormal behavior detection is also made. On scene density and moving object
interaction, a scene density and moving object interaction-based grouping is presented for uncrowded and crowded scenes. Performance evaluation is divided into datasets and evaluation metrics: the available datasets for evaluating video surveillance systems are presented, and the evaluation metrics summarize the performance evaluation results (accuracy, equal error rate, and area under the curve). Finally, a summary of existing video surveillance systems is presented. Gowsikhaa, Abirami, and Baskaran [8] present a survey of automated human behavior analysis from surveillance videos. They begin with a map of the human activity prediction architecture and a literature survey of low-level processing techniques, namely motion detection, object classification, and motion tracking methods. Then, high-level processing techniques are presented: (a) pre-processing and human behaviour recognition and analysis, comparing the activities recognized in different works; (b) a sample of the semantic descriptions used in the state of the art; (c) predicting the activities of a person with respect to an object; (d) performance evaluation; and (e) a comparison of low-level and high-level techniques in human behavior analysis. Finally, the challenges mentioned in human behavior analysis include human body modelling, handling occlusions, scene classification, person identification, techniques for activity perception, cameras revisited, modelling scenes, standardization, and domain specificity. Afsar, Cortez, and Santos [9] present a review of automatic visual detection of human behavior. The authors begin by presenting the publications on automatic human behavior detection from video by publication year, and indicate the techniques used: (i) initialization, (ii) tracking, (iii) pose, and (iv) recognition.
The initialization model covers the main approaches and a discussion comparing approaches for model initialization. Tracking covers background segmentation, temporal correspondence, and a discussion comparing approaches for tracking. Pose estimation covers model-free, indirect model use, and direct model use, with a discussion comparing approaches for pose estimation. Recognition covers scene interpretation, holistic recognition approaches, and action primitives and grammar, with a discussion comparing approaches for recognition. In addition, datasets related to human behavior detection are discussed, with state-of-the-art results on human behavior detection datasets. Furthermore, applications are presented in chronological order: human detection using 3D depth images, abnormal activity detection, action recognition from video, player modelling and robotics, pedestrian detection and in-home scenarios, and person tracking and identification. Maheshwari and Heda [10] present a review of analysis methods for crowd behavior in video surveillance. The whole process occurs in three phases: (a) video surveillance; (b) crowd analysis; and (c) methods for abnormal behavior detection in crowd scenes. (a) In video surveillance, the method consists of three major modules: i) background modeling, ii) blob analysis, and iii) crowd detection and tracking. (b) Crowd analysis is estimated using three basic steps: pre-processing, object tracking, and event/behavior recognition. The pre-processing step can involve pixel-level, texture-level, object-level, and frame-level analysis. The object tracking step can use region-based, active contour-based, feature-based, and model-based approaches. The event/behavior recognition step applies two approaches: object-based and holistic. (c) The methods for abnormal behavior detection in crowded scenes can use Gaussian Mixture Modeling (GMM), the Social Force Model (SFM), Hidden Markov Models (HMM), the Correlated Topic Model (CTM), Markov Random Fields (MRF), Sequential Monte Carlo (SMC), and Dynamic Oriented Graphs (DOG). Dubuisson and Gonzales [11] begin by discussing, for visual tracking, which dataset fits which need. They then present visual tracking issues: (a) problems inherent to visual tracking, such as illumination effects, scene clutter, changes in object appearance, abrupt changes in motion, occlusions, similar appearances, camera motion, appearance/disappearance, and frame quality; (b) quantitative evaluation of visual tracking, such as error score, accuracy score, success score, detection scores, curves for comparing tracking algorithms, and robustness of evaluation; (c) datasets for visual tracking; and (d) datasets for scene analysis and understanding based on visual tracking, covering human understanding (hand gesture tracking, face tracking, facial expression or emotion, body motion, individual action/activity, behavior, and interactions and social activities) and scene understanding (video surveillance, events, crowds, and sports games).

Table 1. Categories and related papers for the video approach.

Categories            Papers
Survey                [1–3, 5–7, 9–14]
Human behavior        [5, 8, 9, 15]
Recognition           [1, 2, 7, 9, 10, 12, 19, 20]
Crowd behavior        [6, 10, 19]
Dataset               [5, 7, 9, 11–14]
Segmentation          [7, 9, 10, 12, 14, 15, 17, 19–22]
Violent detection     [15–19]
Tracking              [6, 8, 9, 11, 13, 20]
Color fusion          [22]
Classification        [4–10, 12, 14–16, 18–20, 22]
Features extraction   [6, 8, 14–18, 20, 21]
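Background modeling, the first module in the crowd-surveillance pipelines surveyed above [5,10], is typically a per-pixel statistical model of the static scene. The following is a minimal running-average sketch, a didactic simplification of the Gaussian mixture models (GMM) those works describe; the learning rate and threshold values are arbitrary placeholders.

```python
import numpy as np

class RunningAverageBackground:
    """Per-pixel background model: exponential moving average of frames.
    A pixel is foreground when it deviates from the model by more than
    `threshold`. A simplified stand-in for the GMM background models
    discussed in the surveyed work."""

    def __init__(self, alpha=0.05, threshold=30.0):
        self.alpha = alpha          # learning rate of the moving average
        self.threshold = threshold  # intensity deviation marking foreground
        self.background = None

    def apply(self, frame):
        frame = frame.astype(np.float64)
        if self.background is None:
            self.background = frame.copy()  # bootstrap from the first frame
        mask = np.abs(frame - self.background) > self.threshold
        # Update the model only where the scene still looks like background.
        self.background = np.where(
            mask, self.background,
            (1 - self.alpha) * self.background + self.alpha * frame)
        return mask
```

The foreground mask produced here would then feed the blob-analysis and tracking modules of the pipeline.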
Mahmood, Khan, and Mahmood [22] provide a detailed survey and comparison of night vision imaging techniques. They present the theoretical foundations of the technical frameworks and paradigms proposed to enhance and fuse color in night images, namely infrared-spectrum-based self-adaptive enhancement, colormap-clustering-based color transfer, contrast and color enhancement techniques, region-based color transfer methods, image enhancement based on a selective Retinex fusion algorithm, color estimation and sparse representation, histogram-based enhancement revisited, and nature-inspired models. Next, a quantitative analysis is presented: image contrast metric, gradient metric, phase congruence (PC) metric, color natural metric, and an objective evaluation index. Finally, results and a comparative analysis are shown. Table 1 identifies, by category, the papers addressing each theme.

4.2 Audio Approach
In terms of bandwidth, memory storage, and computing requirements, the audio stream generally demands much less than the video stream. While standard cameras have a limited angular field of view, microphones can be omnidirectional (providing a spherical field of view). Due to the audio wavelength, many surfaces reflect the acoustic wave, allowing the acquisition of audio events even when obstacles are present in the direct path. Illumination and temperature are not problems for audio processing [25]. Souto, Mello, and Furtado [23] present an acoustic scene classification approach for domestic violence using machine learning. The methodology specifies the parameters used (MFCC, energy, and ZCR), the parameter extraction (medium-term parameter sequence processing and short-term processing), and the classification (SVM technique). They ran tests using an audio database; after training, the models obtained were an MFCC-SVM classifier, an Energy-SVM classifier, and a ZCR-SVM classifier. Rouas, Louradour, and Ambellouis [24] address audio event detection in a public transport vehicle. The first idea is to extract relevant events from the audio stream, which requires automatic audio segmentation (splitting an audio signal into several quasi-stationary consecutive zones), an activity detection algorithm (aimed at skipping silence and low-level noise zones of no interest), and a merging step (gathering successive activity segments). For the modelling and classification framework, features are first extracted, then a GMM method is used, followed by an SVM classifier within a classification framework. Crocco, Cristani, Trucco, and Murino [25] present an audio surveillance review. The authors begin by comparing audio data to visual data. They then explain background subtraction by monomodal analysis, presenting the features employed in this type of background subtraction.
Next, they explain background subtraction by multimodal analysis, highlighting the methods that require more or less offline learning. Audio event classification is then presented, with a taxonomy of the classification methods and the pros and cons of each category of approach. Furthermore, source localization and
tracking are explained, in particular audio source localization (a typology of the audio events considered in the literature, features employed in audio event classification, a general taxonomy of source localization, a taxonomy for time-delay-based localization methods, optimal working conditions for different localization methods, and features employed in sound localization), audio-visual source localization, audio source tracking, and audio-visual source tracking. In addition, situation analysis is addressed: one-layer systems and hierarchical systems. Audio features employed in situation analysis are also analysed: time, frequency, cepstrum, time-frequency, energy, biologically/perceptually driven features, and feature selection and feature learning. Finally, some open problems are presented, such as background subtraction, audio classification, audio source localization and tracking, situation analysis, audio-video fusion, privacy and audio encryption, and adversarial settings. Table 2 identifies, by category, the papers that refer to each theme.

Table 2. Categories and related papers for the audio approach.

Categories                                          Papers
Survey                                              [25]
Human behavior                                      [23, 24]
Background subtraction, Tracking, fusion methods    [25]
Dataset, Violent detection                          [23]
Segmentation                                        [24]
Features extraction, Classification                 [23–25]
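Two of the short-term parameters used by Souto et al. [23], energy and zero-crossing rate, can be computed directly with NumPy (MFCCs, the third parameter, would typically come from a dedicated audio library). This sketch only extracts the features; the SVM classification step they apply afterwards is omitted.

```python
import numpy as np

def short_term_features(signal, frame_len, hop):
    """Frame the signal and compute per-frame energy and zero-crossing rate,
    two of the parameters used in [23]. Returns an (n_frames, 2) array that
    could be fed to a classifier such as an SVM."""
    feats = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = np.asarray(signal[start:start + frame_len], dtype=np.float64)
        energy = np.mean(frame ** 2)
        # Fraction of consecutive samples whose sign changes.
        zcr = np.mean(np.abs(np.diff(np.sign(frame))) > 0)
        feats.append((energy, zcr))
    return np.array(feats)
```

High ZCR with moderate energy is characteristic of noisy or fricative content, while screams and impacts tend to show high energy, which is why these cheap features already carry discriminative information.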
4.3 Multimodal Audio and Video Approach
When it’s used Audio and video approach in surveillance, its called Multimodal surveillance [27]. Crocco, Cristiani, Trucco and Murino [26] began to explain audio analysis driven violence detection, video analysis driven violence detection, multimodal analysis driven violence detection, and knowledge-based semantics extraction for violence detection. Then some general description of the proposed methodology is presented. Next, audio classification for violence hints like audio class definition, audio feature extraction, and class probability estimation. After, presented visual classification for violence hints, like visual class definition, visual features (motion features, person detection features, and gunshot detection features), and video class probability estimation. In addition, machine learning-based fusion, namely fused feature vector and meta-classifier. Also, explain ontological fusion, which includes the ontological framework, violence ontology definition, visual semantics for violence, audio semantics for violence, video structure ontology,
inference engine design, and inference engine implementation. Finally, an experimental evaluation is presented, covering implementation issues, the scenario and setup, and the classification and detection results.
5 Conclusions
In this paper, we presented a survey of violent behavior detection. The concepts of action recognition and violence detection were reviewed. To carry out the research, a sample setup was created and the query yielded 41 papers. After analysing the papers, only 24 were specific to the theme. We also concluded that the papers fall into three different approaches: 18 papers address the video approach, 3 the audio approach, and 3 multimodal audio and video approaches. These three approaches were analyzed independently, and for each one the methods, techniques, and classification used were examined; for some approaches, datasets and fusion methods were also included. As future work, given the specificities of each environment and type of violence, we propose exploring surveillance inside a vehicle. Other sensors might also be included; for instance, a multimodal system to detect violent behavior inside a vehicle. Additionally, the conception of reference architectures incorporating security measures would also be useful.
Acknowledgments. This work is supported by the European Structural and Investment Funds in the FEDER component, through the Operational Competitiveness and Internationalization Programme (COMPETE 2020) [Project no. 039334; Funding Reference: POCI-01-0247-FEDER-039334]. This work has been supported by national funds through FCT - Fundação para a Ciência e a Tecnologia through project UIDB/04728/2020.
References
1. Ko, T.: A survey on behavior analysis in video surveillance for homeland security applications. In: Applied Imagery Pattern Recognition Workshop, AIPR 2008, 37th IEEE, pp. 1–8. IEEE (2008)
2. Poppe, R.: A survey on vision-based human action recognition. Image Vis. Comput. 28(6), 976–990 (2010)
3. Aggarwal, J.K., Ryoo, M.S.: Human activity analysis: a review. ACM Comput. Surv. (CSUR) 43(3), 16:1–16:43 (2011)
4. Sun, Q., Liu, H.: Learning spatio-temporal co-occurrence correlograms for efficient human action classification. In: 2013 IEEE International Conference on Image Processing, pp. 3220–3224. IEEE, September 2013
5. Mabrouk, A.B., Zagrouba, E.: Abnormal behavior recognition for intelligent video surveillance systems: a review. Expert Syst. Appl. 91, 480–491 (2018)
6. Lopez-Fuentes, L., van de Weijer, J., González-Hidalgo, M., Skinnemoen, H., Bagdanov, A.D.: Review on computer vision techniques in emergency situations. Multimedia Tools Appl. 77(13), 17069–17107 (2018)
7. Wang, P., Li, W., Ogunbona, P., Wan, J., Escalera, S.: RGB-D-based human motion recognition with deep learning: a survey. Comput. Vis. Image Underst. 171, 118–139 (2018)
8. Gowsikhaa, D., Abirami, S., Baskaran, R.: Automated human behavior analysis from surveillance videos: a survey. Artif. Intell. Rev. 42(4), 747–765 (2014)
9. Afsar, P., Cortez, P., Santos, H.: Automatic visual detection of human behavior: a review from 2000 to 2014. Expert Syst. Appl. 42(20), 6935–6956 (2015)
10. Maheshwari, S., Heda, S.: A review on crowd behavior analysis methods for video surveillance. In: Proceedings of the Second International Conference on Information and Communication Technology for Competitive Strategies, pp. 1–5, March 2016
11. Dubuisson, S., Gonzales, C.: A survey of datasets for visual tracking. Mach. Vis. Appl. 27(1), 23–52 (2016)
12. Zhang, J., Li, W., Ogunbona, P.O., Wang, P., Tang, C.: RGB-D-based action recognition datasets: a survey. Pattern Recogn. 60, 86–105 (2016)
13. Singh, T., Vishwakarma, D.K.: Video benchmarks of human action datasets: a review. Artif. Intell. Rev. 52(2), 1107–1154 (2019)
14. Komagal, E., Yogameena, B.: Foreground segmentation with PTZ camera: a survey. Multimedia Tools Appl. 77(17), 22489–22542 (2018)
15. Zhou, P., Ding, Q., Luo, H., Hou, X.: Violence detection in surveillance video using low-level features. PLoS One 13(10), e0203668 (2018)
16. Deniz, O., Serrano, I., Bueno, G., Kim, T.K.: Fast violence detection in video. In: 2014 International Conference on Computer Vision Theory and Applications (VISAPP), vol. 2, pp. 478–485. IEEE, January 2014
17. De Souza, F.D., Chavez, G.C., do Valle Jr., E.A., Araújo, A.D.A.: Violence detection in video using spatio-temporal features. In: 2010 23rd SIBGRAPI Conference on Graphics, Patterns and Images, pp. 224–230. IEEE, August 2010
18. Gao, Y., Liu, H., Sun, X., Wang, C., Liu, Y.: Violence detection using oriented violent flows. Image Vis. Comput. 48, 37–41 (2016)
19. Hassner, T., Itcher, Y., Kliper-Gross, O.: Violent flows: real-time detection of violent crowd behavior. In: 2012 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, pp. 1–6. IEEE, June 2012
20. Jalal, A., Mahmood, M., Hasan, A.S.: Multi-features descriptors for human activity tracking and recognition in indoor-outdoor environments. In: 2019 16th International Bhurban Conference on Applied Sciences and Technology (IBCAST), pp. 371–376. IEEE, January 2019
21. Komagal, E., Yogameena, B.: Region MoG and texture descriptor-based motion segmentation under sudden illumination in continuous pan and excess zoom. Multimedia Tools Appl. 77(8), 9621–9649 (2018)
22. Mahmood, S., Khan, Y.D., Mahmood, M.K.: A treatise to vision enhancement and color fusion techniques in night vision devices. Multimedia Tools Appl. 77(2), 2689–2737 (2018)
23. Souto, H., Mello, R., Furtado, A.: An acoustic scene classification approach involving domestic violence using machine learning. In: Anais do XVI Encontro Nacional de Inteligência Artificial e Computacional, vol. 16, Salvador, pp. 705–716. SBC (2018)
24. Rouas, J.L., Louradour, J., Ambellouis, S.: Audio events detection in public transport vehicle. In: 2006 IEEE Intelligent Transportation Systems Conference, pp. 733–738. IEEE, September 2006
25. Crocco, M., Cristani, M., Trucco, A., Murino, V.: Audio surveillance: a systematic review. ACM Comput. Surv. (CSUR) 48(4), 1–46 (2016)
26. Perperis, T., Giannakopoulos, T., Makris, A., Kosmopoulos, D.I., Tsekeridou, S., Perantonis, S.J., Theodoridis, S.: Multimodal and ontology-based fusion approaches of audio and visual processing for violence detection in movies. Expert Syst. Appl. 38(11), 14102–14116 (2011)
27. Dedeoglu, Y., Toreyin, B.U., Gudukbay, U., Cetin, A.E.: Surveillance using both video and audio. In: Multimodal Processing and Interaction, pp. 1–13. Springer, Boston, MA (2008)
System for Recommending Financial Products Adapted to the User's Profile

M. Unzueta¹, A. Bartolomé¹, G. Hernández¹,², J. Parra¹, and P. Chamoso¹(B)

¹ BISITE Research Group, University of Salamanca, Calle Espejo 2, 37007 Salamanca, Spain {srunzu15,alvarob97,guillehg,javierparra,chamoso}@usal.es
² AIR Institute, Edificio Parque Científico Universidad de Valladolid, Módulo 305, Paseo de Belén 11, Campus Miguel Delibes, 47011 Valladolid, Spain
Abstract. The breakthroughs in computing over the last decade have opened up a wide range of possibilities in the analysis of large volumes of data. Today, data has become a raw material that is exploited in virtually every business sector to analyze and improve processes and results. One sector where it has a special impact is the financial sector. One of the main applications of data analysis in this sector involves applying methodologies that discover patterns and trends in the value of financial products. However, data analysis can also be used to analyze users and not just products. The work presented in this article aims to analyze financial products and users in order to make product recommendations adapted to the objectives of each user at an individual level. To this end, a profile of each user is obtained and an analysis is made of which financial products are capable of satisfying their investment objectives within the set time frame.
Keywords: Fintech · Financial risk · Product recommendation

1 Introduction
Financial risk quantifies the likelihood of an adverse event occurring that would have negative financial consequences for a user or organization [1]. The measurement of financial risk is one of the current lines of research, clearly tied to the evolution of finance understood as Fintech [2]. Fintech studies digital innovations and financial business model innovations that incorporate technology in the financial sector [3]. The main services favored by the incorporation of Fintech technologies are regulation, back-office operations, currency and payments, lending, insurance, savings, and advice [4]. Currently, although more typical of the past, regulation and currency-and-payments activities are carried out by central banks, while the rest of the activities described are carried out by financial
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2021. P. Novais et al. (Eds.): ISAmI 2020, AISC 1239, pp. 117–126, 2021. https://doi.org/10.1007/978-3-030-58356-9_12
118
M. Unzueta et al.
institutions. Shortly, and in some countries already, all of them will be carried out by firms known as Fintechs. The main technologies applied under the Fintech paradigm are those related to Big Data and Artificial Intelligence, along with distributed computing, cryptography, and access through mobile devices and the Internet [5]. This study addresses the optimization of work in terms of investments, credit decisions, and application development, which is necessarily linked to the optimization of the level of personal financial risk. Risk profiling in this context refers to matching investment funds to the user's investment profile, obtained through a study of relevant user information. This information must necessarily reflect the investor's capacity to assume losses [6], and therefore the user's profile must be analyzed in detail, for example through questionnaires with strategic questions on personal and technical issues in the financial sector and suitably classified answers. The purpose of the current work is to establish risk profiles of users and risk profiles of financial products in order to match them optimally through artificial intelligence and make product recommendations at the user level. The paper is organized as follows. Sect. 2 discusses related work. Sect. 3 presents the proposal and the case studies that use and validate the proposed architecture in real scenarios. Sect. 4 shows the results. Finally, the discussion is presented in Sect. 5.
2 Related Work

2.1 Users' Risk
Existing Methodologies. The great interest that the terms robo-advisor and Fintech arouse today is mainly due to the fact that, from the scientific point of view, there is still a long way to go: the methodologies used so far are only statistical ones that meet the main conditions of risk indicators, namely coherence, related properties, and robustness [7]. This suggests that, with the emergence of big data techniques, the results can be significantly better. However, the latter requires acquiring a large volume of data to derive maximum benefit from technologies such as machine learning [8]. For example, in recent work Emfevid and Nyquist [9] propose their own methodology based on statistical methods, after a deep theoretical analysis of the variables and techniques to be applied (models based on logistic regressions or analysis of variance, ANOVA, among others). Basically, the current techniques are based on the deviation of the data from the average value, which is very useful in environments where volatility plays a very important role, as Eq. 1 shows:

σ = √( Σ (X − X̄)² / N )    (1)
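Eq. 1 is the population standard deviation of a series; a direct translation follows (the input values in the usage note are synthetic, for illustration only).

```python
import math

def volatility(values):
    """Population standard deviation per Eq. 1:
    sigma = sqrt( sum((X - mean)^2) / N )."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((x - mean) ** 2 for x in values) / n)
```

For example, volatility([2, 4, 4, 4, 5, 5, 7, 9]) evaluates to 2.0.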
Value at Risk (commonly known as VaR) is the most widespread methodology today. It is a statistical methodology that measures and quantifies the level of financial risk of a given investment portfolio over a specific time frame, acting as a benchmark for certain investments to be made, where the main variables to identify are the given time horizon and the confidence level [10]. Value at Risk is a measure generally used to evaluate the risk of a certain position or portfolio of financial assets. Several main methodologies currently exist for its calculation [11], which means that a greater profit margin can also be obtained from the use of machine learning technology. The previous literature recognizes the conjunction of all these methodologies with other methods, such as artificial intelligence or machine learning, in the permanent search to optimize investment decisions, trying to find a comprehensive methodology to quantify financial risk. Indeed, other models developed historically, such as those stated by Harry M. Markowitz (1952) and James Tobin (1958), already associated risk with the variance in the value of the portfolio, and the one developed by William Sharpe (1964) tried to relate investment incentives to the information available to investors; these could be implemented by developing econometric techniques such as the ARCH, GARCH, and EGARCH models, which also require more sophisticated techniques focused on the processes of extraction, transformation, and loading.
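One standard way to compute VaR is historical simulation, sketched below; the choice of method and the 95% confidence level are illustrative, not prescribed by the works cited above.

```python
import numpy as np

def historical_var(returns, confidence=0.95):
    """Value at Risk by historical simulation: the loss that is not exceeded
    with probability `confidence`, i.e. minus the (1 - confidence) quantile
    of the empirical return distribution."""
    return -np.percentile(returns, 100.0 * (1.0 - confidence))
```

For a return history spanning -10% to +10% in 1% steps, the 95% one-period VaR comes out at roughly a 9% loss.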
2.2 Background
The optimal fit between the risk profile of the user and the risk of a particular asset is a difficult matter in the field of finance. Portfolio selection models were first formulated in 1952 and consisted of allocating capital to a number of available assets in an attempt to maximize investment returns while considering a certain associated risk. Traditional and modern frameworks for robust return estimation are introduced by Fabozzi et al. (2007) [12], as well as robust covariance matrix estimation in the work of Lee et al. (2006) [13], shrinkage estimators in the work of Garlappi et al. (2007) [14], and the concept of Black and Litterman (1992) [15] for introducing investor views into an equilibrium framework.
3 Proposal
The current work is framed within the Spanish financial markets, whose reference body is the National Securities Market Commission (CNMV). We therefore propose to analyze the history of all available financial products to obtain their volatility and, from this value, to extract a metric that, instead of ranging from 1 to 7 [16], ranges up to 5, so that it corresponds directly to the risk of the users, whom we have classified into 5 profiles from least to most risky.
120
M. Unzueta et al.
The questions of the questionnaire are mainly elaborated from information obtained from the German bank Deutsche Bank, which identifies the profiles as “very conservative”, “conservative”, “moderate”, “dynamic” and “risky”, although information from the other sources reviewed is also used, as in the present section. The incorporation of this extra information is supported by the Risk Management Guide [17], which even presents a possible questionnaire at the end of its analysis. On the one hand, factors associated with the user’s personal profile are used [18] and, on the other hand, purely financial factors [19]. The rest of the information identified is obtained through the financial questionnaire. A series of questions have been defined to extract the information that the user risk analysis algorithms will take as input. Each question has a single answer, which is concrete and duly delimited, allowing it to be related to a type of risk profile. The questions developed cover the level of education, financial experience, financial knowledge from the point of view of work in the financial sector, the investment objective, the investment term, the acceptable loss, average monthly income and average monthly expenses, the capacity to live on savings alone, and the financial products already contracted [20–22]. To generate a risk profile associated with a user, the available information clearly cannot all have the same weight. For this reason, based on the information previously identified, a relative weight has been assigned to each item, which is taken into account in the algorithms to be applied. This weighting has been agreed by consensus among experts in the sector. The weight is structured in 5 levels: very low, low, medium, high and very high.
When designing the risk-profiling algorithms, a distinction must be made between a methodology based on statistical models and one based on machine learning, since, depending on the characteristics of the problem, one or the other may fit better. Each methodology, in turn, presents two approaches: one for risk profiling of users and another for risk profiling of financial products.

3.1 Methodology Based on Statistical Models
For a cold start (without a significant volume of users and operations available to feed machine learning techniques), we propose to use well-established and validated statistical techniques as a starting point. To this end, risk is assigned to the user using the initial questionnaire as a starting point. Subsequently, the risk of the financial products is calculated from their volatility (the higher the volatility, the higher the risk) and refined using moving averages, which can be computed over different time frames: short, medium and long term. Users’ Risk Profiling. When starting to use the application, the first thing the user must do is answer the questionnaire. The questions asked are designed
System for Recommending Financial Products Adapted to the User’s Profile
121
to classify the user on a scale with five categories, as described above. One option classifies the user as conservative, another as intermediate or balanced, and the last as risky. Therefore, the answers are treated as real numbers in the interval [0, 2], where 0 is conservative, 1 is balanced and 2 is aggressive. Questions with more than three possible answers have their values equidistributed over this same interval. The answers to all questions are combined into a total score using the following equation:

Ruser = (1/N) ∗ (w0∗p0 + w1∗p1 + ... + wN−1∗pN−1) = (1/N) ∗ Σ_{i=0}^{N−1} wi∗pi   (2)
From this formula, the initial risk of the user, Ruser, is obtained. N is the number of questions asked, pi is the rating of the i-th answer, and wi is the weight of the i-th question based on expert criteria. However, as indicated, the banking products already contracted (the purchase history) carry a very high weight when determining the risk a user is willing to assume. For this reason, after each purchase is incorporated into the system, the user risk value is corrected by 10% towards the level of risk assumed in the purchase. In this way, if the user stays in line with the products that the system suggests, his risk profile will remain at the same level and he should not change category (although it could oscillate if he is very close to the boundary between two of the five categories). If, instead, the operations indicated by the user are poorly aligned with the calculated profile, the user’s risk profile will gradually drift away from that obtained from the questionnaire, to the point where he will be asked to reconfirm a purchase that is not aligned with his purchase history. Therefore, the operation executed for each product included in the purchase history, which updates the value of the user risk profile, is defined by the following equation:

Ruser = 0.9 ∗ Ruser + 0.1 ∗ Rinvestment   (3)
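As a minimal sketch, the scoring of Eq. (2) and the update of Eq. (3) can be written as follows; the weights and answer values below are invented for illustration, not the calibrated ones agreed by the experts.

```python
def questionnaire_risk(answers, weights):
    """Eq. (2): weighted mean of answer scores, each answer in [0, 2]."""
    n = len(answers)
    return sum(w * p for w, p in zip(weights, answers)) / n

def update_risk(r_user, r_investment):
    """Eq. (3): each new purchase shifts the profile by 10% towards its risk."""
    return 0.9 * r_user + 0.1 * r_investment

# Hypothetical questionnaire: 0 = conservative, 1 = balanced, 2 = aggressive
answers = [0.0, 1.0, 2.0, 1.0]
weights = [1.0, 0.5, 1.5, 1.0]   # expert-assigned relative weights (illustrative)

r = questionnaire_risk(answers, weights)
# Three risky purchases gradually pull the profile upwards
for purchase_risk in [2.0, 2.0, 2.0]:
    r = update_risk(r, purchase_risk)
```

Note how the exponential form of Eq. (3) makes the questionnaire's influence decay geometrically with each purchase.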
Thus, the importance of the questionnaire decreases as new purchases are introduced. Financial Products’ Risk Profiling. Before describing the algorithm, it is necessary to introduce the dataset used to validate the implementation. Tentatively, it will not be the definitive dataset, since it is obtained through a crawler developed to recover the information from third parties, which is not the best technique for incorporating information into a final product; at the research stage, however, it is the most complete option among all the sources analyzed. More specifically, we have updated information on 8,682 ETFs around the world, with an average of nearly 5 years of history.
In order to assign a risk to each of the funds, particular attention is paid to the magnitude of the ‘gross’ changes in the funds. A fund whose value fluctuates only slightly is rated as safe, while a fund full of abrupt changes is classified as risky. Volatility is used to capture this variability:

Rinvestment = √( (1/(N−1)) ∗ Σ_{i=1}^{N} (xi − μ)² )   (4)

where Rinvestment is the investment risk, N is the total number of data points, xi is the value of the fund at time i, and μ is the average of the set of values considered. In particular, the standard deviation of the time series is measured over a specific time horizon. Its values are always positive and are capped at 1 (if the volatility is greater than 1, it is set to 1). This historical volatility provides a measure of the risk involved in investing in the fund. The valuation is independent of whether money will be gained or lost; it only quantifies the amount of variation. Risk Analysis in Financial Operations. With the above data calculated, an ordered list of funds is obtained for each user. The first items on the list present a risk closest to that which the user is willing to assume, while subsequent items present risks increasingly distant from the user’s preference. User risk is modelled by the questionnaire in the range [0, 2], and investment risk in the range [0, 1]. A linear transformation rescales both so that they lie in the common range [0, 5]. With this common framework, funds within the risk value obtained for the user can be suggested. This analysis is performed analogously for financial operations in the machine learning-based methodology presented below, so it is not described again for that case.
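A minimal sketch of the volatility measure of Eq. (4), its cap at 1, and the linear mapping of user and product risks onto the common [0, 5] scale; the fund values below are invented for illustration.

```python
import statistics

def investment_risk(values):
    """Eq. (4): sample standard deviation of the fund's history, capped at 1."""
    return min(statistics.stdev(values), 1.0)

def to_common_scale(risk, src_max):
    """Linear map from [0, src_max] onto the common [0, 5] scale."""
    return 5.0 * risk / src_max

fund_history = [10.0, 10.2, 9.9, 10.1, 10.0]     # invented daily values
r_inv = investment_risk(fund_history)

product_score = to_common_scale(r_inv, src_max=1.0)   # product risk lives in [0, 1]
user_score = to_common_scale(1.2, src_max=2.0)        # user risk lives in [0, 2]
```

Once both scores live on the same [0, 5] scale, funds can be ranked for a user by the absolute difference between the two values.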
3.2 Methodology Based on Machine Learning
The successful use of a machine learning model requires consideration of several preliminary stages applied to the data set. Fig. 1 shows the complete construction of a model in stages, using the concept of a pipeline (a set of interconnected operations in which the output of one constitutes the input of the next) through the functionality provided by the scikit-learn library in the Python language. The first stage, labeled kbest, consists of feature selection, in which the available attributes are ranked according to a measure of their relationship with the target to be predicted (in our case, the risk of the product or client). The measures usually used for this purpose are chi-square statistical tests or mutual information estimation. The second stage applies standardization techniques that make the attributes comparable. Depending on the range of variation of
[Fig. 1 diagram: Known cases (adjustment and validation) / New cases → Attribute selection (kbest) → Standardization (std) → Feature extraction (PCA) → Prediction model (model) → Prediction]
Fig. 1. Pipeline for machine learning application
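The pipeline of Fig. 1 can be sketched with scikit-learn as follows; this is an illustrative reconstruction, not the paper's actual code, using synthetic data, mutual information for the kbest stage, and a placeholder network topology.

```python
from sklearn.pipeline import Pipeline
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.neural_network import MLPClassifier
from sklearn.datasets import make_classification

# Synthetic stand-in for the (labelled) client/product data
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

pipe = Pipeline([
    ("kbest", SelectKBest(mutual_info_classif, k=8)),  # attribute selection
    ("std", StandardScaler()),                         # mean/std standardization
    ("pca", PCA(n_components=5)),                      # feature extraction
    ("model", MLPClassifier(hidden_layer_sizes=(16,),  # prediction model
                            max_iter=500, random_state=0)),
])
pipe.fit(X, y)   # fitting the pipeline fits every stage on the training data only
```

Because all stages are wrapped in one estimator, any later cross-validation refits the scaler and PCA inside each fold, which is exactly the leakage safeguard described in the text.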
the data, one or other techniques can be applied. In the example shown, standardization by the moments of the attribute (mean and standard deviation), understood as a random variable, has been considered. The use of a pipeline makes it possible to guarantee that the methodology is statistically rigorous, avoiding potential errors such as scaling over the entire data set, which would bias the final evaluation. The third stage consists of feature extraction, in which the attributes resulting from the previous steps are transformed to find mathematical combinations that better describe the variability present in them. The algorithm used for this purpose is Principal Component Analysis (PCA), based on the eigendecomposition of the covariance matrix of the attributes. The last stage involves an algorithm for either classification, if the risk of a customer or product is treated as a discrete scale, or regression, if it is treated continuously. Prediction algorithms generally have a large number of parameters that must ideally be set a priori, the so-called hyperparameters. In the example, a neural network has been indicated, with variations in some of its parameters such as the topology with which the neurons are arranged, the transfer functions used, the magnitude of the regularization parameters, and the numerical algorithm used to solve the underlying optimization problem. The methodology for adjusting the hyperparameters is the following: a fixed amount of data (around 20%) is reserved to make a final evaluation of the predictor’s performance. With the remainder, cross-validation is performed; that is, this subset is divided into k equal parts, and the algorithm is trained with a given set of hyperparameters on the data that results from excluding one of the parts, measuring the effectiveness on the excluded part.
By evaluating the k possible combinations, an average measure of the effectiveness of the hyperparameters is obtained, allowing the best combination to be selected. The typical strategies for this optimization are the exhaustive one, consisting of exploring the complete parameter space, which is usually infeasible due to the enormous computational cost it entails; and the random one, in which a prefixed number of hyperparameter combinations is chosen to traverse different regions of this space at random.
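The tuning protocol just described (20% held out for the final evaluation, k-fold cross-validation and a random search on the rest) can be sketched with scikit-learn's RandomizedSearchCV; the hyperparameter space below is illustrative, not the one used in the project.

```python
from sklearn.model_selection import train_test_split, RandomizedSearchCV
from sklearn.neural_network import MLPClassifier
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=300, n_features=8, random_state=0)

# Reserve ~20% for the final, unbiased evaluation
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

space = {
    "hidden_layer_sizes": [(8,), (16,), (16, 8)],   # network topology
    "activation": ["relu", "tanh"],                 # transfer function
    "alpha": [1e-4, 1e-3, 1e-2],                    # regularization magnitude
    "solver": ["adam", "lbfgs"],                    # optimization algorithm
}

# Random search: a prefixed number of combinations, each scored by k-fold CV
search = RandomizedSearchCV(MLPClassifier(max_iter=500, random_state=0),
                            space, n_iter=5, cv=3, random_state=0)
search.fit(X_tr, y_tr)

final_score = search.score(X_te, y_te)   # evaluated once, on the held-out 20%
```

The held-out test set is touched only after the search finishes, so `final_score` is not inflated by the hyperparameter selection itself.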
The possible inclusion or exclusion of steps can be considered one more hyperparameter, so that simpler pipelines (e.g. the one obtained by eliminating the principal component decomposition) can be more advantageous for risk prediction. The optimal configuration is inevitably data-dependent and will be evaluated on a pre-determined data set at a later stage of the project. User Risk Profiling. The procedure described above for analyzing users’ risk profiles using statistical methods assigned a continuous numerical value based on the results of a purpose-designed questionnaire. However, an intrinsic weakness of this design is the arbitrariness in the assignment of weights to its questions, so that some of them may be more important for determining the risk level than others. An additional problem is the validity of the hypothesis of continuity in the range of measures (when averaging or applying other algebraic operations, it is tacitly assumed that the distance between a 1 and a 2 is the same as between a 3 and a 4, but this need not be the case). One way to address these issues is to replace the algebraic techniques for assessing the questionnaire results with techniques obtained through supervised machine learning. This requires a class label to be applied to each of the clients. Risk Profiling of Financial Products. The risk indicators on the variability of the products described above have a precise and well-defined mathematical meaning. This makes the introduction of machine learning methods to replace them less attractive than in the case of customer risk assessment. One scenario where such techniques could improve the system is the need to assess the risk of a product for which there is not yet a data history of sufficient volume for the extraction of indicators.
In this case, either these or an overall indicator of the product can be estimated by constructing a classifier or a regressor following the general procedure described above.
4 Conclusion
The information used to determine the user’s risk profile has been refined, establishing the questions and possible response options that make up a mandatory questionnaire. Each possible answer is assigned to a profile according to its relationship with the different profiles into which a user can be classified (of which there are 5: very conservative, conservative, moderate, dynamic and risky). In addition, each question has its own weight, since not all the information has the same influence on the risk profile. With regard to the proposed methodologies, two development paths have been presented that are currently independent, but the testing stage will show whether they can be treated as complementary. The corresponding functional tests have been implemented and carried out to validate the implementation, while the evaluation and integration tests will be carried out in a second
stage of the optimization process, with the possibility of refining the algorithms as more information becomes available. Statistical techniques have allowed good results to be obtained for use during a cold start of the program. The use of well-established techniques makes it possible to obtain a high degree of reliability in the results and to assign to each user and each product risk values that will serve as reliable inputs for the recommender system. Scenarios have also been proposed in which machine learning can benefit a risk-profiling system. It is potentially useful for the classification of user profiles, where it overcomes several defects inherent in purely statistical classification through survey design, such as the arbitrariness in the assignment of weights to questions or the quantification applied to their answers. Some scenarios have also been put forward in which these methods can be applied to the classification of product risks, although in these cases it is more reasonable to use indicators of a mathematical nature whenever their extraction is possible. The choice of an optimal algorithm depends on the data set, so the algorithms cannot be assessed outside that context, except to note that the decision to treat risk discretely or continuously may condition the set of applicable algorithms. In this study, the continuous option is preferred because it allows a more precise description of the situation, although a discrete description may be used if risk labelling by an expert is required. In general, the proposed methodological framework allows the best algorithms to be determined in an unbiased way, including the adjustment of hyperparameters, which in some cases may be fundamental to the successful application of the algorithm.
5 Discussion
Our work proposes improvements in the methodological framework, allowing the algorithms to be improved in an unbiased way. For future development it is necessary to continue working towards a consensus on these algorithms, understanding that the investor’s profile may be affected not only by risks derived from the investment but also by systemic risks that may alter the return on the investment. Acknowledgments. This research has been supported by the Industrial Technology Development Center (CDTI) and the Ministry of Economy and Competitiveness (Spain) under project grant IDI-20180151 (“Proyectos de I+D de Cooperación Nacional”).
References

1. Miller, P., Kurunmäki, L., O’Leary, T.: Accounting, hybrids and the management of risk. Account. Organ. Soc. 33(7–8), 942–967 (2008)
2. Arner, D.W., Barberis, J., Buckley, R.P.: FinTech, RegTech, and the reconceptualization of financial regulation. Northwest. J. Int. Law Bus. 37, 371 (2016)
3. Philippon, T.: The fintech opportunity. Technical report, National Bureau of Economic Research (2016)
4. He, M.D., Leckow, M.R.B., Haksar, M.V., Griffoli, M.T.M., Jenkinson, N., Kashima, M.M., Khiaonarong, T., Rochon, C., Tourpe, H.: Fintech and Financial Services: Initial Considerations. International Monetary Fund (2017)
5. Gomber, P., Kauffman, R.J., Parker, C., Weber, B.W.: On the fintech revolution: interpreting the forces of innovation, disruption, and transformation in financial services. J. Manage. Inf. Syst. 35(1), 220–265 (2018)
6. Raz, T., Michael, E.: Use and benefits of tools for project risk management. Int. J. Proj. Manage. 19(1), 9–17 (2001)
7. Emmer, S., Kratz, M., Tasche, D.: What is the best risk measure in practice? A comparison of standard measures. J. Risk 18(2), 1–26 (2015)
8. Montoya, L.A., Arias, S.N.R., Benjumea, J.C.C.: Metodologías para la medición del riesgo financiero en inversiones. Scientia et Technica 12(32), 275–278 (2006)
9. Emfevid, L., Nyquist, H.: Financial risk profiling using logistic regression (2018)
10. Duffie, D., Pan, J.: An overview of value at risk. J. Deriv. 4(3), 7–49 (1997)
11. Novales, A.: Valor en riesgo. Departamento de Economía Cuantitativa (2014)
12. Fabozzi, F.J., Kolm, P.N., Pachamanova, D.A., Focardi, S.M.: Robust portfolio optimization. J. Portfolio Manage. 33(3), 40–48 (2007)
13. Kim, B., Lee, S.: Robust estimation for the covariance matrix of multivariate time series based on normal mixtures. Comput. Stat. Data Anal. 57(1), 125–140 (2013)
14. Garlappi, L., Uppal, R., Wang, T.: Portfolio selection with parameter and model uncertainty: a multi-prior approach. Rev. Financ. Stud. 20(1), 41–81 (2007)
15. Black, F., Litterman, R.: Global portfolio optimization. Financ. Anal. J. 48(5), 28–43 (1992)
16. Fondium: Escala de riesgo de los fondos de inversión, November 2017
17. Stoneburner, G., Goguen, A., Feringa, A.: Risk management guide for information technology systems. NIST Spec. Publ. 800(30) (2002)
18. Hallahan, T.A., Faff, R.W., McKenzie, M.D.: An empirical investigation of personal financial risk tolerance. Financ. Serv. Rev. 13(1), 57–78 (2004)
19. Erb, C.B., Harvey, C.R., Viskanta, T.E.: Political risk, economic risk, and financial risk. Financ. Anal. J. 52(6), 29–46 (1996)
20. Beal, D., Delpachitra, S.: Financial literacy among Australian university students. Econ. Papers J. Appl. Econ. Policy 22(1), 65–78 (2003)
21. Devlin, J.F.: Customer knowledge and choice criteria in retail banking. J. Strateg. Market. 10(4), 273–290 (2002)
22. Satyavolu, R.V., Kothari, S., Daredia, S.: System and method for providing a savings opportunity in association with a financial account. US Patent 8,650,105, 11 February 2014
A COTS (UHF) RFID Floor for Device-Free Ambient Assisted Living Monitoring

Ronnie Smith1,2(B), Yuan Ding2, George Goussetis2, and Mauro Dragone1,2

1 Edinburgh Centre for Robotics, Edinburgh, UK. [email protected]
2 Institute of Sensors, Signals and Systems, Heriot-Watt University, Edinburgh, UK. {yuan.ding,g.goussetis,m.dragone}@hw.ac.uk

Abstract. The complexity and the intrusiveness of current proposals for AAL monitoring negatively impact end-user acceptability, and ultimately still hinder widespread adoption by key stakeholders (e.g. public and private sector care providers) who seek to balance system usefulness with upfront installation and long-term configuration and maintenance costs. We present the results of our experiments with a device-free wireless sensing (DFWS) approach utilising commercial off-the-shelf (COTS) Ultra High Frequency (UHF) Radio Frequency Identification (RFID) equipment. Our system is based on antennas above the ceiling and a dense deployment of passive RFID tags under the floor. We provide baseline performance of state-of-the-art machine learning techniques applied to a region-level localisation task. We describe the dataset, which we collected in a realistic testbed, and which we share with the community. Contrary to past work with similar systems, our dataset was collected in a realistic domestic environment over a number of days. The data highlights the potential but also the problems that need to be solved before RFID DFWS approaches can be used for long-term AAL monitoring.

Keywords: Ambient Assisted Living (AAL) · Healthcare monitoring · Device-free wireless sensing · Radio-frequency identification (RFID) · Indoor localisation · Region-level tracking · Fingerprinting

1 Introduction
Sensor-based indoor monitoring is a cornerstone component of any Ambient Assisted Living (AAL) platform, providing information relating to users’ location, activities, falls, and well-being. Common approaches rely on sensing devices attached to the environment, or attached to the body (wearable). Installation and configuration costs of these solutions limit large scale adoption, and many people are simply uncomfortable living with sensor devices in their own home, especially cameras, and are unwilling, find impractical, or simply forget to carry wearable devices. © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 P. Novais et al. (Eds.): ISAmI 2020, AISC 1239, pp. 127–136, 2021. https://doi.org/10.1007/978-3-030-58356-9_13
128
R. Smith et al.
Device-free wireless sensing (DFWS) [1] is an emerging paradigm with the potential to address many of these concerns. Instances include systems using active/passive radar, WiFi and radio-frequency identification (RFID) technology. The underlying working principle in all cases is the use of radio frequency (RF) signals as a sensing medium: the presence, location and activity of humans can be estimated by analysing the way they influence the wireless signal. The key advantage of DFWS approaches is that they do not require users to wear or carry devices. However, ensuring ease of configuration and robust, reliable monitoring performance outside simplified laboratory conditions remains an outstanding challenge [1]. In this paper, we describe a monitoring system based on COTS UHF RFID equipment. Compared to other DFWS technologies, High Frequency (HF) and UHF RFID systems (operating at 13.56 MHz and between 860 and 950 MHz, respectively) are already in widespread use in monitoring applications such as access control, stock management and logistics. RFID tags are either active or passive: the former have their own power source and a long broadcast range (as much as several hundred metres), while the latter capture power from the interrogating reader to send a response signal with enough energy to travel a few metres. In particular, UHF RFID is increasingly framed as a source of useful data for healthcare monitoring applications [2], as the higher frequency allows for the rapid capture of data from a large number of passive tags, at a high data transfer rate, and with a read range (≥10 m) compatible with domestic environments, at relatively low cost. However, most HF/UHF RFID-based monitoring systems used in AAL applications to date comprise fixed readers in the environment that track moving tags attached to people/objects [3,4], or vice versa [5,6]. On the contrary, our system is designed for DFWS use: we employ only passive tags.
These are distributed under the floor, in a dense grid pattern, and are continuously interrogated by antennas concealed above the ceiling. Existing works, which have evaluated the use of similar dense deployments of passive RFID tags in walls or floors [7–10], have done so in small-scale settings and over a limited time-frame. On the contrary, we evaluated our system in a realistic and real-size domestic environment over a nine day period. We describe the fully annotated RFID dataset, and provide baseline performance measurements for a region-level localisation task implemented by analysing Received Signal Strength Indicator (RSSI) data using state-of-the-art machine learning techniques. Having utilised commercial off-the-shelf equipment, this paper can in part be considered a ‘how to’ guide. In particular, we highlight the impact of RSSI drift, which needs to be addressed before enabling long-term AAL monitoring in multipath-rich environments with similar setups.
2 Related Work
Much attention has been paid to using RFID to detect human object interactions by perceiving movements of tags attached on everyday objects. The majority of
these works, such as [11] and [12], have exploited active tags, which are bigger and require maintenance due to their battery. [3] demonstrated a passive RFID trilateration technique based on RSSI data, filtered using a Gaussian mean to reduce flickering. The system reaches an average accuracy of ±14.12 cm over an area of 6 m2 (a kitchen in a smart home laboratory). The information is used to identify the usage of items and contributes to complex behavioural pattern mining. Notably, constraints specific to object-tracking approaches include the fact that folding tags (to adhere to objects without flat surfaces) degrades performance, and that objects typically must be tagged with more than one tag, with multiple antennas carefully placed to increase their observability. Much work on RFID-based localisation stems from the LANDMARC approach, which relies on fixed-position (active) ‘reference tags’ and readers to track moving ‘target tags’ [13]. Accuracy typically depends on tag density, with the original architecture having a position error of 1–2 m using k-Nearest Neighbour to find the reference tags closest to the target. Later derivations include systems using Bayesian estimation [14] and fingerprinting [15], reaching errors as low as 15 cm with a reference tag spacing of 40 cm. Other approaches place the RFID reader on the moving object (e.g. [6]), which makes them impractical for AAL monitoring. More recently, wearable-free approaches have used readings from passive tags fixed to the walls of a smart home to predict human activities. The approaches in [7] and [8] both demonstrate how continuous reading of a ‘wall’ of RFID tags while a human moves in front of them can be used to accurately predict activities based on motion and pose. All the activities in these examples are carried out within a short distance of the wall (a few metres). Temporary occlusions or physical obstructions are inherent problems with this type of wall-mounted setup.
Crucially, all these systems have been tested only in tightly controlled/stable environments over short periods of time, a point also made by [16] in their analysis of RFID sensing applications.
3 Fingerprinting for Region-Level Localisation
In order to support the evaluation of an RFID-based DFWS monitoring system for AAL applications, we implemented an RFID fingerprinting approach for region-level localisation. Tracking users’ movements across different regions of their home is a key function for monitoring their health and behaviour [17]. The simple idea of fingerprinting is to learn the characteristic signature of the signal received from given locations. It is a widely adopted technique in RF-based localisation systems, as it avoids having to model multipath analytically [18]. However, this comes at the cost of having to collect supervised information through a calibration step. The other problem is that RF signals are strongly affected by multipath and other perturbations, such as interference from metal present in the environment and/or in objects, humans or RF-based appliances.
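As a toy illustration of the fingerprinting idea (not the system's actual pipeline), a nearest-neighbour classifier can match a live RSSI vector, one entry per tag, against fingerprints recorded per region during calibration; all values and region names below are invented.

```python
from sklearn.neighbors import KNeighborsClassifier

# Calibration fingerprints: RSSI (dBm) from three tags, recorded per region
calibration_rssi = [
    [-60, -72, -80], [-61, -71, -79],   # recorded while user was in 'bed' region
    [-75, -58, -70], [-74, -59, -71],   # recorded while user was in 'desk' region
]
regions = ["bed", "bed", "desk", "desk"]

clf = KNeighborsClassifier(n_neighbors=3).fit(calibration_rssi, regions)

# At run time: classify a fresh RSSI snapshot by its nearest fingerprints
print(clf.predict([[-62, -70, -78]])[0])   # prints "bed"
```

The calibration step is exactly the supervised-data cost noted above: every region must be visited and labelled before the classifier can be used.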
3.1 Testbed and Hardware Setup
We evaluated the feasibility of our fingerprinting approach in a realistic domestic environment. The experiments and results reported here were performed in the Robotic Assisted Living Testbed (RALT)1. This is a 60 m2, fully-furnished simulated apartment consisting of a bedroom, bathroom and combined kitchen/dining and living areas. Our system utilised commercial off-the-shelf IMPINJ hardware:
– 1 × IMPINJ Speedway Revolution R420 UHF RFID Reader
– 4 × Laird Far-Field RAIN RFID Antenna (865–868 MHz)
– Approx. 196 × IMPINJ RFID Monza 4QT Tags (excl. object tags)
Groups of four tags were affixed to 60 cm2 Ethylene Vinyl Acetate (EVA) floor mats, enabling easy installation. For our experiments, the kitchen and bedroom of the testbed were fully outfitted with tags, as shown in Fig. 1. Interrogating antennas are behind mineral fibre tiles, at a ceiling height of 2.4 m.
Fig. 1. Map of the testbed, showing positions of antennas and tags used during the study. Orange boxes represent boundaries of location zones.
Fig. 2. Single option within an activity pathway for the ‘early morning/wakeup’ phase. Participants executed at least nine of these per session.

3.2 Data Acquisition
Custom software captures and stores tag reports (comprising peak RSSI, phase angle, and calculated velocity) for each tag in a pre-configured tag dictionary, utilising the Low-Level Reader Protocol (LLRP) toolkit from IMPINJ. A snapshot of all tag reports is generated and stored every second along with a Unix timestamp, while values are latched by the software for five seconds from their last update (although a value can still be updated during this time).

1 http://ralt.hw.ac.uk/
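A minimal sketch of this snapshot/latching logic, assuming hypothetical report field names (`seen_at`, `rssi`, `phase`, `velocity`); the actual software's data structures are not described in the paper.

```python
import time

LATCH_SECONDS = 5   # a tag's last report stays valid for five seconds

def build_snapshot(tag_dictionary, last_reports, now):
    """One row per second: {tag_id: (rssi, phase, velocity)} or None if stale."""
    snapshot = {"timestamp": int(now)}
    for tag_id in tag_dictionary:
        report = last_reports.get(tag_id)
        if report and now - report["seen_at"] <= LATCH_SECONDS:
            snapshot[tag_id] = (report["rssi"], report["phase"], report["velocity"])
        else:
            snapshot[tag_id] = None   # latch window expired, no recent report
    return snapshot

now = time.time()
reports = {
    "tag01": {"seen_at": now - 2, "rssi": -61.5, "phase": 1.2, "velocity": 0.0},
    "tag02": {"seen_at": now - 9, "rssi": -70.0, "phase": 0.4, "velocity": 0.1},
}
snap = build_snapshot(["tag01", "tag02"], reports, now)
```

Latching smooths over tags that are interrogated less than once per second (common for tags far from an antenna, as the heatmap in Fig. 3 shows), while still marking genuinely stale tags as missing.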
Fig. 3. Heatmap of tag updates. Heatmap squares (right) correspond directly to floor tiles (left). Tiles with −1 have no tags. Antenna positions represented by ‘A’. Expressed in percent (%) of snapshots that contain an update for tags on the tile, e.g. a tile at 93% updates on average every 1.08 s, while a tile at 3% updates every 33.33 s.
Our IMPINJ reader was configured in ‘AutoSet Dense Reader Deep Scan’ mode, which widens the effective scan coverage area compared with faster modes such as ‘Max Throughput’. The heatmap in Fig. 3 shows the average inquiry frequency of tags for the bedroom. With two antennas, some tags update at a rate of almost once per second, while others (mostly those at a greater distance from an antenna) update considerably less frequently. The average read rate is 52.2%, which equates to 63.1 reads per second for the bedroom.

3.3 Activity Pathways Data Collection
Data collection was undertaken by recording raw RFID data together with video footage of six volunteers over ten sessions, for later annotation and processing. Each participant took part in one session of approximately 40–50 min; participant #2 took part in an additional four sessions, for a total of five sessions labelled #2A through #2E. Sessions were performed on four days within a total period of nine days. A particularly important point to note is that one session from participant #2, session A, was undertaken six days before the subsequent four sessions (B–E) with that same participant. Overall, this generated a total of 25,924 snapshots/seconds (≈7 h 12 min) of RFID data. Figure 4 is a frame extracted from the recorded video stream. Volunteers were asked to follow a set of prescribed activity 'pathways'. Each pathway defines the order in which activities are executed, mimicking real-world scenarios. Participants repeated pathway variations to simulate variance in human behaviour. An example of a single activity pathway (one of nine in each session) is shown in
R. Smith et al.
(a) Bedroom (CH4)
(b) Kitchen (CH7)
Fig. 4. Example of footage captured by the CCTV system.
Fig. 2. Individual activities may carry a minimum suggested time in seconds (to avoid early termination of repetitive tasks such as teeth brushing), or a 'NAT' instruction to behave naturally. Collected RFID data was annotated by manual inspection of the video footage: the annotator noted the delimiting points of each region where the user stood, sat, or moved, together with the associated activity being performed, at a resolution of 1 s. Regions are semantically relevant to common household activities and are highlighted in Fig. 1. At all other times the user was transitioning between regions, and the data is annotated with a class 'TRA' to reflect these transition states.

3.4 Experiments
Table 1 summarises accuracy results obtained using off-the-shelf classification algorithms included in the Weka Explorer environment.

Table 1. Comparison of algorithm accuracy (number of true positives and true negatives over the total number of samples) for region estimation. L1O = 'leave-one-out'. *Split before shuffling the training set; test set order retained. NB = Naive Bayes, RF = Random Forest, LR = Logistic Regression, IBk = k-Nearest Neighbours, SMO = Sequential Minimal Optimisation.

Evaluation type                  NB     RF     LR     IBk    SMO    DESN
Full dataset (participants #1–6)
  Cross-validated (10-fold)      49.77  94.77  82.67  94.83  82.58  –
  66/33% Train/Test Split*       45.80  57.63  37.62  40.08  42.21  75.00
  L1O: participant #6            43.82  72.71  57.52  52.19  61.96  –
Participant #2 only
  Cross-validated (10-fold)      59.88  94.26  85.17  94.32  84.33  –
  L1O: participant #2E           47.10  78.39  24.94  40.24  59.63  76.00
  L1O: participant #2A           12.22  41.37  10.67  27.93  31.54  20.00
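The 'split before shuffling' protocol marked with * in Table 1 can be sketched as a chronological hold-out in which only the training portion is shuffled. This is illustrative Python; the function name and seed are ours:

```python
import random

def chronological_split(samples, train_fraction=0.66, seed=42):
    """Split time-ordered samples so the test block is a contiguous,
    later-in-time chunk; shuffle only the training portion."""
    cut = int(len(samples) * train_fraction)
    train, test = samples[:cut], samples[cut:]  # split BEFORE any shuffling
    rng = random.Random(seed)
    rng.shuffle(train)                          # test set order is retained
    return train, test

snapshots = list(range(100))  # stand-in for 25,924 timestamped RFID snapshots
train, test = chronological_split(snapshots)
print(len(train), len(test))  # → 66 34
print(test[:3])               # → [66, 67, 68]  (contiguous, unshuffled)
```

Unlike random cross-validation, this keeps whole blocks of sequentially labelled data out of training, which is what exposes the concept drift discussed below.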
First, we include 10-fold cross-validation results for: (1) the entire dataset, and (2) only data from participant #2. These results indicate that it is entirely possible to learn and make predictions on RFID fingerprint data with a high degree of accuracy. However, the high accuracy obtained with this approach does not paint a full picture of long-term behaviour or day-to-day variation in RSSI data. We then evaluated the same models with a train/test split approach, splitting the dataset before shuffling the training set: this ensures that entire blocks of sequentially labelled location data are hidden from the algorithm while building the model, in order to simulate the way the system would function in practice after its calibration. We provide results for a 66/33% train/test split on all data, as well as some 'leave-one-out' (L1O) examples, in which a single participant session is excluded from training and used only for testing. There is one such example utilising the entire dataset and two exclusively utilising participant #2. These results reveal that all of these models struggle (to varying degrees) to generalise over time, whether predicting on data from the same or different individuals. This phenomenon is most likely attributable to concept drift, whereby the target changes by an unknown margin over time. The drift effect is most obvious when considering different train/test splits on data from participant #2 (bottom section of Table 1): the accuracy of Random Forest almost halves when predicting locations for the session (#2A) that was carried out on a different day, compared to predicting on a session (#2E) that was carried out on the same day as most of the training data. This experiment reveals that classification accuracy depends not just on a sufficient pool of training data, but on training data collected close in time to the test data.

Table 2. Confusion matrix for DeepESN, evaluated with a 66/33% train/test split.
Corresponds to the DESN result in Table 1. Rows are truth, columns are predicted.

         A    B    C    D    E    F    G    H    I    J
A      146   95   13    0    0    0    0   56    0    0
B       30  239  123    0    3    1    0   18    4    0
C       30  140  670   30   56   76    0  145   75   17
D        1    6   42  431    0    0    0   17    6    0
E        1    9   50    0 1338   64    0    6    0    7
F        1    1   18    3   44  171    0   16    0   20
G        0    0   18    0   42    6    0    1    0    0
H        1   76   75    0   24   25    0 1040    1   63
I        2   16   72   19   10    1    0    5 1333    2
J        3    7   44   81    0    0    0    1    0  194

A = bedroom wardrobe, B = bedroom drawers, C = TRA, D = bedroom mirror, E = kitchen worktop corner, F = kitchen worktop sink, G = kitchen worktop stove, H = kitchen table, I = bedroom bed, J = bedroom chair
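As a sanity check, the overall accuracy can be recovered from the Table 2 confusion matrix (rows = truth, columns = predicted); the matrix values below are taken directly from the table, while the helper function is ours:

```python
# Confusion matrix from Table 2 (rows = truth A..J, columns = predicted A..J).
matrix = [
    [146, 95, 13, 0, 0, 0, 0, 56, 0, 0],         # A = bedroom wardrobe
    [30, 239, 123, 0, 3, 1, 0, 18, 4, 0],        # B = bedroom drawers
    [30, 140, 670, 30, 56, 76, 0, 145, 75, 17],  # C = TRA
    [1, 6, 42, 431, 0, 0, 0, 17, 6, 0],          # D = bedroom mirror
    [1, 9, 50, 0, 1338, 64, 0, 6, 0, 7],         # E = kitchen worktop corner
    [1, 1, 18, 3, 44, 171, 0, 16, 0, 20],        # F = kitchen worktop sink
    [0, 0, 18, 0, 42, 6, 0, 1, 0, 0],            # G = kitchen worktop stove
    [1, 76, 75, 0, 24, 25, 0, 1040, 1, 63],      # H = kitchen table
    [2, 16, 72, 19, 10, 1, 0, 5, 1333, 2],       # I = bedroom bed
    [3, 7, 44, 81, 0, 0, 0, 1, 0, 194],          # J = bedroom chair
]

def accuracy(cm):
    correct = sum(cm[i][i] for i in range(len(cm)))  # diagonal = correct
    total = sum(sum(row) for row in cm)
    return correct / total

print(round(accuracy(matrix), 3))  # → 0.754
```

The diagonal sums to 5,562 of 7,381 test samples, i.e. the "around 75%" DESN accuracy reported in Table 1.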
The model may therefore be overfitting to the response of the RFID tags around a point in time, then struggling to generalise to new data. Our testbed remained in effectively the same configuration across all of the recording sessions, although it is virtually impossible to account for minor movement of objects and furnishings occurring during the study. Despite this, dramatic day-to-day changes in measured peak RSSI values are clearly observed in the raw data, as shown in Fig. 5. In both examples, a five-day gap in recordings results in changes to the average measured peak RSSI, with one tag no longer being read at all; neither tag is physically obstructed. Some of the observed behaviour may be attributed to typical fluctuations in transceiver operation, in addition to non-linear effects of RF multipath and physical disturbances, including from office spaces adjacent to our testbed.
Fig. 5. Example of how average measured RSSI can change over time, despite minimal changes to the environment. Here, recordings taken five days apart show significantly different average values. Four participants are shown on each plot, two per day.
Our next step was to experiment with sequence learning techniques, in order to leverage the temporal aspect of our data. In particular, we used the Deep Echo State Network (DeepESN/DESN) implementation proposed in [19], which is based on layered Echo State Networks [20]. These types of Recurrent Neural Networks have been demonstrated to be extremely efficient in processing noisy time series, including in human activity recognition (HAR) tasks using RSSI data from wearable devices [21]. We include in Table 1 (in column 'DESN') the results of a 66/33% train/test split over the whole dataset (25,924 data samples with 197 variables) and also over data pertaining only to user #2 trials (11,100 samples). A corresponding confusion matrix for this result is provided in Table 2. As input features we used the running average and standard deviation of RSSI data, computed over windows of five seconds and normalised to [−1, +1]. We obtained the best results with a DeepESN architecture with five layers, each consisting of a fully-connected reservoir with 200 units. The resulting
accuracy on the whole dataset, around 75%, is considerably higher than what we obtained with fingerprint methods (e.g. Random Forest) utilising the same train/test split, which suggests that this class of techniques is a better choice for handling the noise in RFID data. Notably, however, the DeepESN approach fares even worse in coping with the abrupt change between training and testing data collected nine days apart, as shown by the very poor accuracy (≈20%) on the train/test split with user #2.
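The input features used for the DeepESN — running mean and standard deviation of RSSI over five-second windows, scaled to [−1, +1] — can be sketched as follows. Window-edge handling and the min-max scaling are our assumptions:

```python
from statistics import mean, pstdev

WINDOW = 5  # seconds = samples, at one snapshot per second

def windowed_features(rssi_series):
    """For each 1 Hz sample, emit (running mean, running std) over the
    trailing five-second window."""
    feats = []
    for i in range(len(rssi_series)):
        window = rssi_series[max(0, i - WINDOW + 1):i + 1]
        feats.append((mean(window), pstdev(window)))
    return feats

def rescale(values, lo=-1.0, hi=1.0):
    """Min-max normalise a feature column into [lo, hi]."""
    vmin, vmax = min(values), max(values)
    span = (vmax - vmin) or 1.0
    return [lo + (hi - lo) * (v - vmin) / span for v in values]

rssi = [-60.0, -61.5, -59.0, -70.0, -58.5, -62.0]
means = [m for m, _ in windowed_features(rssi)]
print(rescale(means)[0])  # → 1.0 (this sample has the largest running mean)
```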
4 Conclusions and Future Work
In this work we have evaluated the application of COTS RFID equipment to the task of indoor region localisation, a vital source of data in AAL. Our approach shows potential in its simple installation, which benefits from the use of passive tags and requires minimal planning in terms of tag placement. However, we have also highlighted how the approach is prone to concept drift, in that measured RSSI values deviate further over time from the data used in the initial supervised learning. We make the dataset publicly available2 to allow our results to be reproduced and to support other researchers in the area. In our future work we plan to design an ad-hoc system tailored for AAL monitoring. We will also delve further into sequence learning techniques, only touched on here, in conjunction with automatic calibration approaches that measure the RSSI background (e.g. when no one is in the environment) and track and compensate for its drift over time.

Acknowledgements. This work was supported by the Engineering and Physical Sciences Research Council (grant EP/L016834/1), the Carnegie Research Incentive Grant (RIG008216), and by METRICS (H2020-ICT-2019-2-#871252).
References

1. Wang, J., Gao, Q., Pan, M., Fang, Y.: Device-free wireless sensing: challenges, opportunities, and applications. IEEE Netw. 32, 132–137 (2018)
2. Khan, S.F.: Health care monitoring system in Internet of Things (IoT) by using RFID. In: 2017 6th International Conference on Industrial Technology and Management (ICITM), pp. 198–204 (2017)
3. Fortin-Simard, D., et al.: Exploiting passive RFID technology for activity recognition in smart homes. IEEE Intell. Syst. 30, 7–15 (2015)
4. Wang, L., Gu, T., Tao, X., Lu, J.: Toward a wearable RFID system for real-time activity recognition using radio patterns. IEEE Trans. Mob. Comput. 16, 228–242 (2017)
5. Soonjun, S., Boontri, D., Cherntanomwong, P.: A novel approach of RFID based indoor localization using fingerprinting techniques. In: 2009 15th Asia-Pacific Conference on Communications, pp. 475–478. IEEE, Shanghai (October 2009). ISBN: 978-1-4244-4784-8
2 https://github.com/care-group/RFID-Datasets.
6. Saab, S.S., Nakad, Z.S.: A standalone RFID indoor positioning system using passive tags. IEEE Trans. Ind. Electron. 58, 1961–1970 (2011). ISSN: 0278-0046, 1557-9948
7. Yao, L., et al.: Compressive representation for device-free activity recognition with passive RFID signal strength. IEEE Trans. Mob. Comput. 17, 293–306 (2018). ISSN: 1536-1233
8. Oguntala, G.A., et al.: SmartWall: novel RFID-enabled ambient human activity recognition using machine learning for unobtrusive health monitoring. IEEE Access 7, 68022–68033 (2019). ISSN: 2169-3536
9. Ruan, W., et al.: TagFall: towards unobstructive fine-grained fall detection based on UHF passive RFID tags. In: Proceedings of the 12th EAI International Conference on Mobile and Ubiquitous Systems: Computing, Networking and Services. ACM, Coimbra (2015)
10. Torres, S., et al.: Artificial Intelligence in Medicine, pp. 86–95. Springer, Cham (2015)
11. Zhang, D., Yang, Y., Cheng, D., Liu, S., Ni, L.M.: COCKTAIL: an RF-based hybrid approach for indoor localization. In: 2010 IEEE International Conference on Communications, pp. 1–5 (2010)
12. Mori, T., et al.: Active RFID-based indoor object management system in sensor-embedded environment. In: 2008 5th International Conference on Networked Sensing Systems, pp. 224–224 (June 2008)
13. Ni, L.M., et al.: LANDMARC: indoor location sensing using active RFID. In: Proceedings of the First IEEE International Conference on Pervasive Computing and Communications (PerCom 2003), pp. 407–415 (2003)
14. Xu, H., et al.: An RFID indoor positioning algorithm based on Bayesian probability and k-nearest neighbor. Sensors 17, 1806 (2017). ISSN: 1424-8220
15. Zou, H., et al.: An RFID indoor positioning system by using weighted path loss and extreme learning machine. In: 2013 IEEE 1st International Conference on Cyber-Physical Systems, Networks, and Applications (CPSNA), pp. 66–71. IEEE, Taipei (August 2013)
16. Wang, J., Chang, L., Abari, O., Keshav, S.: Are RFID sensing systems ready for the real world? In: Proceedings of the 17th Annual International Conference on Mobile Systems, Applications, and Services, pp. 366–377 (2019)
17. Lowe, S.A., Ó Laighin, G.: Monitoring human health behaviour in one's living environment: a technological review. Med. Eng. Phys. 36, 147–168 (2014)
18. Paul, A.S., et al.: MobileRF: a robust device-free tracking system based on a hybrid neural network HMM classifier. In: Proceedings of the 2014 ACM International Joint Conference on Pervasive and Ubiquitous Computing, pp. 159–170 (2014)
19. Gallicchio, C., Micheli, A., Pedrelli, L.: Deep reservoir computing: a critical experimental analysis. Neurocomputing 268, 87–99 (2017)
20. Jaeger, H.: Echo state network. Scholarpedia 2, 2330 (2007)
21. Amato, G., et al.: A benchmark dataset for human activity recognition and ambient assisted living. In: International Symposium on Ambient Intelligence, pp. 1–9 (2016)
Using Jason Framework to Develop a Multi-agent System to Manage Users and Spaces in an Adaptive Environment System

Pedro Filipe Oliveira1,2(B), Paulo Novais1, and Paulo Matos2

1 Department of Informatics, Algoritmi Centre/University of Minho, Braga, Portugal
[email protected]
2 Department of Informatics and Communications, Polytechnic Institute of Bragança, Bragança, Portugal
{poliveira,pmatos}@ipb.pt
Abstract. Managing user preferences and local specifications in an IoT adaptive system is a current problem. This paper uses the Jason framework to develop a multi-agent system that realises a Smart Environment System and supports interaction between people and physical spaces, which users want to adapt smartly and transparently to their preferences. This work proposes a new approach, developed as a multi-agent system architecture with different layers, to achieve a solution that meets all the proposed objectives.

Keywords: Adaptive-system · AmI · Multi-agent · IoT · Jason · Argo

1 Introduction
The Artificial Intelligence field continues to grow at an exponential rate, especially in its applicability across different sectors. Currently, multi-agent systems are used to solve diverse situations, as in Ambient Intelligence. Ambient Intelligence (AmI) is a ubiquitous, electronic and intelligent environment, recognised by the interconnection of different technologies/systems, in order to carry out different daily tasks in a transparent and autonomous way for the user [6]. Multi-agent systems are made up of autonomous agents present in the environment which have the ability to make decisions derived from interpreted stimuli, as well as from connections with other agents, to achieve common goals [15]. Currently there are different languages and platforms for the development of this type of system, namely 3APL, Jack, Jade/Jadex and Jason, among others [4]. 3APL, Jadex and Jason use agents with cognitive reasoning models as an alternative to more traditional reactive models.

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2021
P. Novais et al. (Eds.): ISAmI 2020, AISC 1239, pp. 137–145, 2021. https://doi.org/10.1007/978-3-030-58356-9_14
P. F. Oliveira et al.
We focus on the cognitive Belief-Desire-Intention (BDI) model, which allows the creation of intelligent agents capable of making decisions based on the beliefs and perceptions, desires, and intentions that the agent may have at a given moment. Jason is a framework for the development of multi-agent systems (SMA), which has an interpreter for the AgentSpeak language developed in Java, implementing the previously mentioned BDI model. There is also an extension for Jason, called ARGO, a customised architecture that allows the programming of cognitive agents using controllers (taking advantage of sensors and actuators). There are already different works in the literature that present solutions for integrating SMA with AmI, and specifically with Smart Homes, using Jade [2,7], which is reactive, and using Jason with JaCaMo [1,8,9]. Projects that use Jason as the development language are mainly simulated, and there are no works in the literature on physical integration with real environments or hardware to meet ubiquitous computing using Jason as the SMA. ARGO is an architecture that aims to facilitate ubiquitous SMA programming using Jason, regardless of the chosen field. This work proposes an autonomous Smart Home model controlled by cognitive agents using Jason and ARGO to manage physical devices, since ARGO agents allow communication with different controllers (Arduino, Raspberry Pi). For this, the work uses a prototype of a house with six divisions, each with lighting, and a heating system. To evaluate the development of the SMA and the prototype, a series of performance tests were performed taking into account parameters such as the number of agents, number of controllers, speed of reasoning of the agents, moment of perception of the environment, and information filtering, in order to explore different implementation strategies for the system.
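The BDI decision style described above can be illustrated with a minimal, framework-free sketch. This is plain Python, not Jason's declarative AgentSpeak API; the class and the simple "commit to every unsatisfied desire" strategy are our own:

```python
# A minimal BDI-style decision loop (illustrative only; Jason/AgentSpeak
# agents are programmed declaratively with plans, not imperatively like this).
class BDIAgent:
    def __init__(self, desires):
        self.beliefs = {}       # information the agent takes as true
        self.desires = desires  # states the agent is motivated to reach
        self.intentions = []    # actions the agent has committed to

    def perceive(self, percepts):
        self.beliefs.update(percepts)

    def deliberate(self):
        # Commit to an action for each desire not yet satisfied by beliefs.
        for key, target in self.desires.items():
            if self.beliefs.get(key) != target:
                self.intentions.append((key, target))

    def act(self):
        executed = self.intentions[:]
        self.intentions.clear()
        return executed

agent = BDIAgent(desires={"room_temperature": 21})
agent.perceive({"room_temperature": 18})
agent.deliberate()
acts = agent.act()
print(acts)  # → [('room_temperature', 21)]
```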
The main expected contribution of this work is the possibility of applying SMA to ubiquitous prototypes using the Jason framework and the ARGO architecture applied to intelligent environments. This project also proposes a solution using a multi-agent system. Next, the problem will be detailed, together with a solution proposal that includes the whole architecture developed, which will later be implemented and tested.
2 Materials and Methods
Figure 1 shows the scenario of the environment in which this work is intended to be developed. In the figure, the user communicates with the system through different devices (smartphone, wearable, and other compatible devices), using technologies such as Near Field Communication (NFC) [14], Bluetooth Low Energy (BLE) [3] and Wi-Fi Direct [5]. Next, the system communicates with the Cloud to validate the information, and then manages the different components in the environment (climatisation systems, security systems, and other smart systems).
Fig. 1. Problem statement
Fig. 2. Contextualization of time/environment dimensions
To optimise the predictions of the proposed solution, an architecture for a multi-agent system was defined. We specified the roles that each agent should play, the negotiation process to be undertaken, the different scenarios in which this negotiation should take place, and the way it should be processed. For the project development, two phases are defined as follows:

• Hardware (local systems) installation;
• Multi-agent system development.

Firstly, the entire physical structure must be prepared, with the local devices (Raspberry Pi) equipped with the network technologies previously identified, so that they can detect the users present in the space. The comfort preferences of each user present in the space are sent to the agent every time an ARGO agent performs its reasoning cycle (by calling the getPercepts method, which must exist in all controllers that need to send perceptions to agents). Thus, the SMA must be programmed independently from the hardware, taking into account only the actions that must be performed to achieve the ideal comfort values for the space in question; these values are then sent to the actuators. The connection to the actuators was not considered in this work; it is assumed to be automatic and without any constraint for the user. A prototype was thus implemented in a house, taking into account the whole architecture of the SMA and the comfort actuators present in it. For this purpose, one Raspberry Pi is used per division: three on the ground floor (living room/kitchen, office, bedroom) and three on the first floor (one per room).
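The per-controller getPercepts flow described above might look like the following Python stand-in for a Raspberry-based local controller. The class, the snake_case method name, and the percept string format are our assumptions, not the actual Javino/ARGO interface:

```python
class RoomController:
    """Stand-in for a local controller: accumulates detected users and
    hands percepts to an ARGO agent on each reasoning cycle."""
    def __init__(self, room):
        self.room = room
        self.present_users = {}  # user_id -> preferred temperature

    def user_detected(self, user_id, preferred_temp):
        # e.g. triggered by NFC/BLE/Wi-Fi Direct presence detection
        self.present_users[user_id] = preferred_temp

    def get_percepts(self):
        # Called by the ARGO agent on every reasoning cycle.
        return [f"preference({user},{self.room},{temp})"
                for user, temp in sorted(self.present_users.items())]

office = RoomController("office")
office.user_detected("alice", 21)
office.user_detected("bob", 23)
print(office.get_percepts())
# → ['preference(alice,office,21)', 'preference(bob,office,23)']
```

Keeping the percepts as plain logical terms is what lets the SMA stay independent of the underlying hardware, as described above.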
Regarding the actuators, these divisions have a hydraulic radiant floor heating system heated by a heat pump, and a home automation system that controls the luminosity intensity in the different rooms. Figures 3 and 4 illustrate where the different local systems (Raspberry Pis) are placed on each floor.
Fig. 3. Ground floor
Fig. 4. First floor
For a more detailed view, a 3D model was designed in which the operation of the system for a specific space can be visualised, as can be seen in Fig. 5. In this model we can see the different people present in the space, as well as the local system present in it; the arrows illustrate the autonomous communication process between the users' peripherals (smartphones) and the local system, and between the local system and the central server (Cloud), which provides the information each agent needs in order to work and to reach the comfort preference values to use in the actuators. This work proposes an autonomous Smart Home model, controlled through cognitive agents, which obtain the final information to be applied by the actuators. To do this, a house with six divisions was prototyped with different comfort features, namely temperature, luminosity, audio and video. The parameters considered for performance evaluation are as follows:

• Number of agents used;
• Agent reasoning speed;
Fig. 5. Example of the system in a division
• Information filtering;
• Environment perception time.

This work resulted in the complete specification of an architecture that supports the solution found for the presented problem. It will now be implemented, tested and validated using real case studies, so as to gather statistical information to assess its effectiveness and performance in the context of application. This work aims to continue and finalise the doctoral work presented in previous editions [10–12]. Figures 1 and 2 thus exemplify, in a global way, the architecture of the system in which this work has been carried out.
3 Results
This section presents the technologies used in this project for the development of the entire SMA applied to AmI. Jason is a framework with its own language for the development of cognitive SMA, and using the customised ARGO agent architecture it is possible to bridge the gap between the SMA and the actuators and sensors present in the real world. Figure 6 shows the use case diagram. From this diagram we can see how the different agents work: the information they receive, how the negotiation process is carried out and who is involved in it, and how the final result of the negotiation is passed to the actuator. Initially, the agent that represents the local system receives its information, namely the security information (maximum values of temperature, gases, and others). For each user present at the location, there will be an agent who represents him and receives information about the user's preferences from the central system.
Fig. 6. AmI system - use case diagram
The negotiation process is then made up of the local system agent and each of the user agents present at the location. The result of the negotiation is then passed on to the different actuators present at the location.

3.1 The Jason Framework
Jason is a framework with an agent-oriented programming language; it has a Java interpreter for AgentSpeak for the development of intelligent cognitive agents using BDI. The BDI model consists of three basic constructions: beliefs, desires and intentions. Beliefs are information taken as true by the agent, which can be internal, acquired through the relationship with other agents, or acquired through perceptions observed in the environment. Desires represent an agent's motivation to achieve a certain objective, and intentions are the actions that the agent has committed to perform. AgentSpeak is a programming language focused on the agent approach, based on the principles of the BDI architecture. In addition to these concepts, the Procedural Reasoning System allows the agent to build a real-time reasoning system for performing complex tasks. Jason's agents have a reasoning cycle based on events that are generated from capturing perceptions of the environment, from messages exchanged with other agents, and from their own conclusions based on their reasoning. These events can be triggered by triggers that lead to the execution of plans (available in specific libraries) composed of several actions. Jason's agents are programmed based on the definition of objectives, intentions, beliefs, plans and actions internal to the agent, and actions performed in the environment. An SMA in Jason does not traditionally have an interface for capturing perceptions directly from the real world using sensors; by default, Jason uses a simulated environment. The customised ARGO architecture is used for this purpose.

Fig. 7. Architecture of the multi-agent system

3.2 The ARGO Architecture
ARGO is a customized Jason agent architecture to enable the programming of robotic and ubiquitous agents using different prototyping platforms. ARGO allows intermediation between cognitive agents and a real environment (using controllers) through Javino middleware, which communicates with the hardware (sensors and actuators). In addition, as the use of BDI on robotic platforms can generate bottlenecks in the processing of perceptions and, consequently, unwanted delays in execution, this extension also has a perception filtering mechanism at run time [13]. An SMA using Jason and ARGO can be made up of traditional Jason agents and ARGO agents that work simultaneously. Jason agents can carry out plans and actions only at the software level and communicate with other agents in the system (including ARGO agents).
An ARGO agent, on the other hand, is a traditional agent with additional characteristics, such as the ability to communicate with the physical environment, perceive it, modify it, and filter the perceived information. Figure 7 represents the separation of the architecture into different layers, so that the purpose of each layer, and the agents it contains, can be easily identified. The layers are described as follows:

• Data acquisition layer: imports the information necessary for the agents' operation, namely information from interior and exterior temperature and light sensors.
• User layer: contains an agent that represents each user and his preferences, which must be taken into account in the negotiation process.
• Local System layer: each local system is represented by an agent, which contains all the information necessary for its location, whether the referred user preferences or local/user safety values (maximum/minimum temperature, safety values for CO2, etc.).
• Simulation layer: hosts the negotiation between the different agents involved, namely the management of conflicts between the different users and local systems. After the negotiation process ends, the result is the set of values to apply in the place.
• Action layer: after the process is executed in the simulation layer and the values to be applied are obtained, these values are sent to the actuators that will apply them in the different automation systems present in the location.
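The negotiation in the simulation layer — reconciling the preferences of the user agents with the local system agent's safety limits — could look like the sketch below. The averaging strategy is our assumption; the paper does not fix a specific conflict-resolution rule:

```python
def negotiate_setpoint(user_preferences, safety_min, safety_max):
    """Blend the preferences of all users present in the space, then clamp
    the result to the local system agent's safety limits."""
    if not user_preferences:
        raise ValueError("no users present in the space")
    consensus = sum(user_preferences) / len(user_preferences)
    return max(safety_min, min(safety_max, consensus))

# Three users in the living room; heating limited to 5-25 degrees C by the
# local system agent.
print(negotiate_setpoint([20.0, 22.0, 24.0], safety_min=5.0, safety_max=25.0))  # → 22.0
print(negotiate_setpoint([30.0, 32.0], safety_min=5.0, safety_max=25.0))        # → 25.0
```

The clamped result is what the action layer would then forward to the heating and lighting actuators.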
4 Discussion and Conclusions
With this work, the full development of an architecture and respective cognitive model for a Smart Home was achieved, using an SMA with BDI agents developed with Jason and ARGO. The main objective of this work was to verify the potential that this type of architecture has for the development of ubiquitous SMA using low-cost hardware, such as Raspberry Pis. This is a feasible proposal for the problem addressed in this project: a viable solution that resolves all the constraints adjacent to the problem. The agent system modelling is fully developed. At this stage the agent layer is developed and implemented, and is now being tested in the testing environment developed for this project. As future work, the results of the testing phase will be analysed and evaluated, and those results will be used to improve this project and support other works in this field.

Acknowledgments. This work has been supported by FCT – Fundação para a Ciência e Tecnologia within the Project Scope: UID/CEC/00319/2019.
References

1. Andrade, J.P.B., Oliveira, M., Gonçalves, E.J.T., Maia, M.E.F.: Uma abordagem com sistemas multiagentes para controle autônomo de casas inteligentes [A multi-agent systems approach for autonomous control of smart homes]. In: XIII Encontro Nacional de Inteligência Artificial e Computacional (ENIAC) (2016)
2. Benta, K.I., Hoszu, A., Văcariu, L., Creţ, O.: Agent based smart house platform with affective control. In: Proceedings of the 2009 Euro American Conference on Telematics and Information Systems: New Opportunities to Increase Digital Citizenship, pp. 1–7 (2009)
3. Bluetooth Specification: Bluetooth core specification version 4.0. Specification of the Bluetooth System (2010)
4. Bordini, R.H., Hübner, J.F., Wooldridge, M.: Programming Multi-Agent Systems in AgentSpeak Using Jason, vol. 8. Wiley, Hoboken (2007)
5. Camps-Mur, D., Garcia-Saavedra, A., Serrano, P.: Device-to-device communications with Wi-Fi direct: overview and experimentation. IEEE Wirel. Commun. 20(3), 96–104 (2013)
6. Chaouche, A.C., Seghrouchni, A.E.F., Ilié, J.M., Saidouni, D.E.: A higher-order agent model with contextual planning management for ambient systems. In: Transactions on Computational Collective Intelligence, vol. XVI, pp. 146–169. Springer (2014)
7. Kazanavicius, E., Kazanavicius, V., Ostaseviciute, L.: Agent-based framework for embedded systems development in smart environments. In: Proceedings of the 15th International Conference on Information and Software Technologies IT, pp. 194–200 (2009)
8. Martins, R., Meneguzzi, F.: A smart home model to demand side management. In: Workshop on Collaborative Online Organizations (COOS 2013) @ AAMAS (2013)
9. Martins, R., Meneguzzi, F.: A smart home model using JaCaMo framework. In: 2014 12th IEEE International Conference on Industrial Informatics (INDIN), pp. 94–99. IEEE (2014)
10. Oliveira, P., Matos, P., Novais, P.: Behaviour analysis in smart spaces. In: 2016 Intl IEEE Conferences on Ubiquitous Intelligence & Computing, Advanced and Trusted Computing, Scalable Computing and Communications, Cloud and Big Data Computing, Internet of People, and Smart World Congress (UIC/ATC/ScalCom/CBDCom/IoP/SmartWorld), pp. 880–887. IEEE (2016)
11. Oliveira, P., Novais, P., Matos, P.: Challenges in smart spaces: aware of users, preferences, behaviours and habits. In: International Conference on Practical Applications of Agents and Multi-Agent Systems, pp. 268–271. Springer (2017)
12. Oliveira, P., Pedrosa, T., Novais, P., Matos, P.: Towards to secure an IoT adaptive environment system. In: International Symposium on Distributed Computing and Artificial Intelligence, pp. 349–352. Springer (2018)
13. Stabile, M.F., Sichman, J.S.: Evaluating perception filters in BDI Jason agents. In: 2015 Brazilian Conference on Intelligent Systems (BRACIS), pp. 116–121. IEEE (2015)
14. Want, R.: Near field communication. IEEE Pervasive Comput. 10(3), 4–7 (2011)
15. Wooldridge, M.: An Introduction to Multiagent Systems. Wiley, Hoboken (2009)
Towards the Development of IoT Protocols

Gonçalo Salazar1(B), Lino Figueiredo1, and Nuno Ferreira2

1 ISEP – Institute of Engineering, Polytechnic of Porto (ISEP/IPP), Porto, Portugal
{glb,lbf}@isep.ipp.pt
2 Porto, Portugal
[email protected]

Abstract. The increasing need for Internet of Things (IoT) devices will increase the need for resilient communication and network protocols; an evaluation of the state of the art is therefore required to establish the present and project the future of IoT networks. This paper analyses both communication and network protocols used in the IoT, as well as routing mechanisms and security procedures. It also uses this analysis of current technologies as a basis for considerations regarding future trends in IoT technology.

Keywords: IoT · Communication protocols · Network protocols

1 Introduction
The current trends and predictions all point to an even bigger increase in connected devices, bringing the number of IoT nodes into the billions. This effect is already provoking many changes in the human lifestyle through always-on, always-connected devices such as smartphones, and it will only grow as it reaches all the other devices that interact with people. The interface and limits between the analog and the digital world will be blurred, and everyday objects will make their status available to users in real time. To enable these changes, the communication methods used will be of utmost importance and will define whether users are able to communicate with all devices or will be locked in to a certain vendor or platform. The communication stack will divide itself into two categories, wide range and short range, with the first being able to communicate up to some kilometres and the second up to some hundreds of metres. However, due to the ubiquity of such devices, whether long or short range, one of the main requirements will be the low-power nature of the protocols used, enabling the long-term deployment of equipment. The next section will detail what the IoT is and describe some concepts surrounding the Internet of Things. Section 3 will present some of the different network protocols used, as well as their advantages and disadvantages. Section 4 will analyse and compare multiple communication protocols used in the IoT. Finally,

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2021. P. Novais et al. (Eds.): ISAmI 2020, AISC 1239, pp. 146–155, 2021. https://doi.org/10.1007/978-3-030-58356-9_15
Sect. 5 will present some future perspectives for IoT networks and Sect. 6 will present some conclusions.
2 IoT
The increase in electronic devices that is currently happening, with no signs of slowdown in the near future, leads to the ubiquity of such devices, creating what is called the Internet of Things. Ericsson predicts that in 2022, 29 billion devices will be connected, with 18 billion being IoT devices; of these, 16 billion will be short-range IoT devices [1]. "The IoT can be perceived as a far-reaching vision with technological and societal implications" [2]; this implies that the problems and issues that arise from this increase in connected systems and objects are not only technical problems relating to network configurations, power consumption, data volumes and security, but also concern the more human aspects of increased connectivity, such as tracking, privacy protection and human-machine interfaces. The IoT adds a new dimension to the "Any Place" and "Any Time" of traditional Information and Communication Technologies (ICTs): the "Any Thing" [2]. The things connected by the IoT can be both physical and virtual and may or may not interact with humans. The networks that they form can range from the already common TCP/IP-based networks to more ad hoc distributed mesh networks that do not have a standard structure. "There are two important aspects to scaling within the IoT: scaling up Internet technologies to a large number of inexpensive nodes, while scaling down the characteristics of each of these nodes and of the networks being built out of them, to make this scaling up economically and physically viable", according to the "Terminology for Constrained-Node Networks" [10]. Such networks have many possible applications, ranging from industrial control and sensing to the mobility industry and smart cities, allowing sensors and actuators to automate most of the processes involved.
Nevertheless, decision making cannot be completely offloaded to most of those devices due to their constraints on processing power, data storage and power consumption.
3 IoT Network Protocols
Multiple wireless network protocols exist, defined by multiple standards and operating at multiple frequencies. Those protocols have their own advantages and disadvantages, and different goals and applications. Most of the protocols presented are based on radio technologies defined by the IEEE in the 802.11 and 802.15 standards. The selection of the most suitable network protocol depends on the application and is tied to its goals, range and data-rate requirements.
3.1 Wi-Fi
Wi-Fi is a wireless networking technology based on the IEEE 802.11 standards. It is one of the most pervasive wireless technologies in use, being present in most consumer and industrial electronic devices. The large deployment of this technology makes it easy to use and to develop applications for, due to the existing knowledge. Wi-Fi works on both the 2.4 GHz and 5 GHz frequencies and has data rates between 2 Mbps and 1.73 Gbps [4], depending on the frequency and modulation used. However, the power consumption of Wi-Fi setups is quite high, making it sometimes unsuitable for the low-power applications that characterize most IoT deployments.

3.2 Long Range Wide Area Network (LoRaWAN)
LoRaWAN is a Low Power Wide Area Network (LPWAN) technology targeting the IoT requirements of security, bi-directional communication and long-range communications. The network is laid out in a star-of-stars topology where gateways act as transparent bridges between edge nodes and a central server. The communication between edge nodes and the gateways is performed at multiple frequencies and data rates; the selection of the data rate is a trade-off between communication range and message duration [5]. LoRaWAN is based on the LoRa modulation, a proprietary modulation scheme, which operates on the unlicensed spectrum, meaning that anyone can set up a network. It can operate with data rates between 0.3 kbps and 50 kbps, with ranges of 2 to 5 km in urban environments and up to 15 km in suburban environments [3], and low power consumption.
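The range versus data-rate trade-off follows directly from the LoRa modulation: each step up in spreading factor doubles the symbol duration, extending range at the cost of throughput. A minimal sketch of the standard LoRa bit-rate formula (illustrative, not part of the paper; the 125 kHz bandwidth and 4/5 coding rate are assumed defaults):

```python
# Raw LoRa PHY bit rate: each symbol carries SF bits and lasts 2^SF chips,
# so higher spreading factors trade data rate for link robustness/range.
def lora_bit_rate(sf, bw_hz=125_000, coding_rate=4 / 5):
    symbol_rate = bw_hz / (2 ** sf)        # symbols per second
    return sf * symbol_rate * coding_rate  # bits per second

for sf in (7, 12):
    print(f"SF{sf}: {lora_bit_rate(sf):.0f} bps")
# SF7 yields ~5469 bps, while SF12 drops to ~293 bps,
# close to the 0.3 kbps floor quoted above.
```

This makes the quoted 0.3–50 kbps span plausible: the lower bound corresponds roughly to the slowest spreading factor, the upper bound to faster configurations with wider bandwidth.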
3.3 Bluetooth Low Energy (BLE)
BLE is a wireless low-power technology that operates in the 2.4 GHz band, the same as Classic Bluetooth, but is optimized for battery consumption. It is capable of communicating at data rates between 125 kbps and 2 Mbps and can operate in three different topologies: point-to-point, broadcast and mesh. BLE has a maximum line-of-sight range of 300 m. BLE is currently one of the most widely distributed technologies present in devices, from smartphones to IoT applications in automation. BLE was designed to ensure that it does not suffer from interference from Wi-Fi and other technologies operating in that frequency range. It was also designed with power efficiency in mind, ensuring that low power consumption is obtained both during communication and while sleeping [8].
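The impact of that sleep-centric design can be seen with a back-of-the-envelope duty-cycle calculation; all the currents, timings and battery capacity below are illustrative assumptions, not figures from the paper:

```python
# Average current of a duty-cycled BLE sensor that wakes briefly each second
# to transmit, then sleeps. Low sleep current dominates the battery budget.
def average_current_ma(active_ms, active_ma, sleep_ma, interval_ms):
    sleep_ms = interval_ms - active_ms
    return (active_ms * active_ma + sleep_ms * sleep_ma) / interval_ms

# Assumed figures: 3 ms radio burst at 8 mA, 2 uA sleep, 1 s interval,
# 220 mAh coin cell.
avg = average_current_ma(active_ms=3, active_ma=8.0,
                         sleep_ma=0.002, interval_ms=1000)
battery_hours = 220 / avg
print(f"{avg:.4f} mA average, ~{battery_hours / 24 / 365:.1f} years")
```

Under these assumptions the average draw is around 26 µA, i.e. roughly a year on a coin cell, which is why sleep current matters more than radio current for this class of device.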
3.4 ZigBee
ZigBee is a low-power wireless technology that operates on the 2.4 GHz frequency with data rates of 250 kbps and ranges of up to 300 m in line of sight. It is a technology currently deployed in many industrial and home automation scenarios where power consumption is an issue [7]. The networks can have multiple structures, from star to mesh. Multiple ZigBee protocols exist and not all of them are compatible, leaving some specifications unable to talk to each other. Standard networks usually have a coordinator responsible for gathering and relaying messages between nodes; when configured as a mesh network, the coordinator's responsibility is to accept new nodes onto the network, and the network can survive even if the coordinator is no longer present.

3.5 Cellular
Multiple cellular-based communication solutions exist; however, all require an operator to set up the infrastructure and therefore fees are always required. They are characterized by long range (i.e., 35 km for Global System for Mobile Communications (GSM) and 200 km for High Speed Packet Access (HSPA)) and operate in licensed radio bands (900/1800/1900/2100 MHz). The multiple standards and technologies that exist for cellular-based communication can be divided into two groups: standard and low-power. Standard cellular networks include GSM/3G/4G, which are usually used for communication due to the high availability of the service, broad coverage, high data rates and long range. However, these technologies have a high power consumption, making them unsuitable for low-power applications. The low-power standards, CAT-M1 and NB-IoT, are more recent standards presented by 3GPP that are designed for different types of applications. Both technologies operate in the same licensed spectrum as the standard cellular network technologies but provide lower power consumption at the expense of bandwidth and latency [6].

3.6 SigFox
SigFox is a proprietary low-power, low-data-rate wireless technology operating on the unlicensed Industrial, Scientific and Medical (ISM) radio bands. The network operates in a one-hop star topology that requires an operator, usually SigFox, to set up the antennas. Due to this fact, the number of messages that can be sent daily is limited, the coverage is conditioned by the infrastructure deployed by the operators, and a fee is required to use the network. SigFox networks are mostly uplink only. The data rates are between 10 bps and 1 kbps, with ranges between 30 and 50 km in rural environments and between 3 and 10 km in urban environments [3]. The very low power consumption makes it suitable for applications where small amounts of data need to be sent and long range is required.

3.7 IPv6 over Low-Power Wireless Personal Area Networks (6LoWPAN)
6LoWPAN is an open standard defined by the IETF to enable IPv6 for small embedded devices. "The concept was born from the idea that the Internet Protocol could and should be applied to even the smallest of devices" [9]. The standard was developed to be used on top of existing IEEE 802.15.4-based networks such as BLE and ZigBee, enabling IPv6 addressing for such low-power devices. The 6LoWPAN standard allows multiple low-powered devices to communicate seamlessly with the Internet, differing only in the header compression, which is optimized for IEEE 802.15.4 networks; this allows interoperability through a gateway that does not need to translate packets. Since 6LoWPAN is IP-based, it also allows previous knowledge of, and tools for, IP networks to be leveraged.
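A key trick behind 6LoWPAN's header compression is that the IPv6 interface identifier can be derived statelessly from the node's link-layer (EUI-64/MAC) address, so a compressed header can elide it entirely and the receiver can reconstruct it. A sketch of that mapping for a 48-bit MAC (illustrative, not the paper's code):

```python
import ipaddress

# Derive an IPv6 link-local address from a 48-bit MAC: flip the
# universal/local bit of the first octet and insert ff:fe in the middle
# to form a modified EUI-64 interface identifier under the fe80::/64 prefix.
def link_local_from_mac(mac: str) -> ipaddress.IPv6Address:
    b = bytearray(int(x, 16) for x in mac.split(":"))
    eui64 = bytes([b[0] ^ 0x02]) + bytes(b[1:3]) + b"\xff\xfe" + bytes(b[3:6])
    return ipaddress.IPv6Address(b"\xfe\x80" + b"\x00" * 6 + eui64)

print(link_local_from_mac("00:11:22:33:44:55"))  # fe80::211:22ff:fe33:4455
```

Because both endpoints can recompute this address from the frame's source MAC, the 16-byte IPv6 address need not be carried on the air at all, which is where much of 6LoWPAN's header saving comes from.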
3.8 Summary
Table 1 presents a summary of the different evaluated network protocols and their main characteristics. The 6LoWPAN protocol does not have fixed values since it depends on the physical layer on top of which it is used; it can be used on top of any IEEE 802.15.4-based network.

Table 1. IoT network protocols summary

| Protocol | Range | Data rate | Frequency | Power consumption |
|----------|-------|-----------|-----------|-------------------|
| Wi-Fi | 50–100 m | 2–1730 Mbps | 2.4 or 5 GHz | High |
| LoRaWAN | 2–15 km | 0.3–50 kbps | 868/902/920 MHz | Low |
| BLE | 100 m | 125–2000 kbps | 2.4 GHz | Low |
| ZigBee | 300 m | 250 kbps | 2.4 GHz | Low |
| Cellular | 35–200 km | 50–300 Mbps | 900/1800/1900/2100 MHz | High |
| SigFox | 3–50 km | 10–1000 bps | 868/902/920 MHz | Low |
| 6LoWPAN | PHY dependent | PHY dependent | PHY dependent | PHY dependent |

4 IoT Communication Protocols
On top of the network protocols presented in Sect. 3, data needs to be transmitted between multiple devices that must communicate using the same "language". In the IoT, communication can be divided into two types: device to device and device to Internet. The communication between devices, whether edge nodes or gateways, needs to be coherent and have low overhead to ensure that the limited resources are correctly used; between the gateways and the Internet this limitation usually does not exist, so more common protocols can be used. Due to the recent growth of the IoT, a standard protocol does not yet exist; nonetheless, multiple contenders are available, each with its own strengths and weaknesses. All the protocols were designed with a certain set of applications in mind, but all try to reduce the bandwidth used and the number of non-payload bytes transmitted. The following sections present some of the more common ones, along with the advantages and disadvantages of each.

4.1 HTTP/2
HTTP/2 is a major revision of HTTP that tries to address some of the issues present in HTTP/1.1 while ensuring backwards compatibility with previous revisions. It is based partially on the experimental "SPDY" protocol developed by Google. The goals of the protocol were to preserve the core features of HTTP/1.1 while improving its efficiency, to allow multiplexing of requests via streams, and to add flow control and server-to-client push [11]. HTTP is built on top of TCP and can make use of security features such as SSL/TLS. Although some IoT applications might use HTTP, the overhead imposed by the protocol headers makes it unusable for constrained devices.

4.2 Message Queue Telemetry Transport (MQTT)
MQTT is a communication protocol, initially developed by IBM and currently maintained by OASIS, that works over TCP. The protocol follows a publisher/subscriber logic and was designed with reliability and low power in mind [12]. The use of a publish/subscribe model has several implications for the network structure. A broker is required to relay messages between publishers and subscribers; however, publishers do not need to be aware of the subscribers, and the performance of a publisher is independent of the number of subscribers. The protocol also allows for very small message headers, making it well suited to low-bandwidth networks. The payload does not have a standard format, so the application layer needs to ensure it is coherent between all the nodes. The MQTT protocol has some limitations, since it supports neither message queueing nor time-to-live; the broker stores just one message and delivers it to the subscribers that are awake. It is not possible to tell whether a device that is in low-power mode will ever receive the message.
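To make the "very small message headers" concrete, the following sketch hand-assembles a QoS 0 PUBLISH packet according to the MQTT 3.1.1 wire format (illustrative; the topic and payload are arbitrary examples, and only the single-byte remaining-length case is handled):

```python
# MQTT 3.1.1 QoS 0 PUBLISH: 1 byte of packet type/flags, 1 byte of
# "remaining length" (for packets < 128 bytes), a length-prefixed topic,
# then the raw payload -- no per-message payload length field is needed.
def mqtt_publish(topic: bytes, payload: bytes) -> bytes:
    var_header = len(topic).to_bytes(2, "big") + topic + payload
    assert len(var_header) < 128  # single-byte remaining-length only
    return bytes([0x30, len(var_header)]) + var_header

pkt = mqtt_publish(b"t/1", b"21.5")
print(len(pkt))  # 11 bytes total for a 4-byte payload
```

A 4-byte sensor reading thus costs only 7 bytes of protocol overhead (plus the topic itself), compared with the hundreds of bytes of headers a typical HTTP request carries.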
4.3 Constrained Application Protocol (CoAP)
CoAP was created by the IETF and is a specialized web transfer protocol for use with constrained nodes and constrained networks in the Internet of Things. The protocol is designed for machine-to-machine (M2M) applications such as smart energy and building automation [13]. It is based on the REST model, like many HTTP applications, where servers make resources available under a URL that can be accessed via methods such as POST, PUT, GET, etc. It has the advantage of allowing skill transfer from HTTP. It supports asynchronous message exchange, low header overhead and parsing complexity, and security capabilities using DTLS [14]. The data payloads can use multiple encodings such as JSON, CBOR and XML. However, due to the model used, it does not support one-to-many communication as other protocols do.
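The low header overhead is easy to quantify: every CoAP message starts with a fixed four-byte header (RFC 7252) carrying the version, message type, token length, request/response code and message ID. A minimal sketch (illustrative, not the paper's code):

```python
# The fixed 4-byte CoAP header: 2-bit version, 2-bit type, 4-bit token
# length in the first byte, then the code byte and a 16-bit message ID.
def coap_header(msg_type: int, code: int, message_id: int, tkl: int = 0) -> bytes:
    version = 1
    first = (version << 6) | (msg_type << 4) | tkl
    return bytes([first, code]) + message_id.to_bytes(2, "big")

# A Confirmable (type 0) GET (code 0.01 = 0x01) request:
hdr = coap_header(msg_type=0, code=0x01, message_id=0x1234)
print(hdr.hex())  # 40011234
```

Options and payload follow this header, but even a complete GET for a short URI path typically fits in well under twenty bytes, which is the point of the design for constrained links.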
4.4 Advanced Message Queueing Protocol (AMQP)
AMQP is a protocol standard for message-queueing communications, maintained by OASIS. It works on top of TCP or UDP and is focused on low-power devices, supporting one-to-one and one-to-many communication. It intends to define a low-barrier-of-entry protocol with built-in safety and reliability features, guaranteeing interoperability of implementations and stability of operations [13]. The devices communicate with a broker that supports message queuing and ensures that messages are correctly delivered to their destination. Unlike MQTT, it is not a pure publisher/subscriber protocol, although it can work as one. Compared to MQTT it has a greater overhead, due to the additional features it supports [15].

4.5 WebSockets
WebSockets is a communication protocol, different from but compatible with HTTP, that operates on top of the TCP layer. It is defined in RFC 6455 [16] as enabling "a two-way communication between a client running untrusted code in a controlled environment to a remote host that has opted-in to communications from that code". It allows two-way communication without requiring multiple HTTP connections to be made, reducing the overall overhead. Although some IoT applications using WebSockets exist, this protocol was designed with web browsers in mind; hence, concerns with overhead reduction and low power were not a priority, despite it using less bandwidth than HTTP.
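The "compatible with HTTP" part is the opening handshake: the connection starts as a single HTTP Upgrade request, after which the same TCP connection carries lightweight two-way frames. The server proves it speaks WebSockets by hashing the client's `Sec-WebSocket-Key` with a fixed GUID, as specified in RFC 6455 (sketch, using the key from the RFC's own example):

```python
import base64, hashlib

# Fixed GUID defined by RFC 6455 for the opening handshake.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"

def accept_key(client_key: str) -> str:
    """Compute the Sec-WebSocket-Accept value for a given client key."""
    digest = hashlib.sha1((client_key + WS_GUID).encode()).digest()
    return base64.b64encode(digest).decode()

print(accept_key("dGhlIHNhbXBsZSBub25jZQ=="))
# s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```

After this one HTTP round trip, each data frame carries as little as 2 bytes of framing overhead, which is why WebSockets uses less bandwidth than repeated HTTP requests even though it was not designed for constrained devices.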
4.6 Extensible Messaging and Presence Protocol (XMPP)
XMPP is a set of open technologies for instant messaging, presence, multi-party chat, voice and video calls, collaboration, lightweight middleware, content syndication, and generalized routing of XML data [17]. It was originally developed by the Jabber open-source community to provide an open and decentralized alternative to closed communication protocols. The standard for XMPP was defined by the IETF, and it is a proven standard with security features (based on TLS) and flexible applications. It is useful in IoT contexts since it already provides a tested protocol that is extensible and scalable, with support for one-to-one and one-to-many communication patterns that can be used depending on the application [18].

4.7 Data Distribution Service (DDS)
DDS is a middleware standard from the OMG that provides a broker-less, fully distributed communication platform. It intends to provide an interoperable, low-latency, reliable and scalable data transmission protocol [19]. DDS provides a data-centric layer for device-to-device or device-to-server communication, with multiple QoS and priority definitions per topic and device [19]. Although the communication is broker-less, a transparent broker can be used to relay messages. The data-centric architecture of DDS means that all messages include the contextual information needed to interpret them, so applications are aware of the data but not necessarily of its encoding [19].

4.8 Summary
Table 2 presents a summary of the different evaluated communication protocols, qualitatively analysed from the perspective of an IoT application, considering the constraints and requirements of such applications.

Table 2. IoT communication protocols summary

| Protocol | Overhead | Topology | Designed for IoT |
|----------|----------|----------|------------------|
| HTTP/2 | High | One-to-one | No |
| MQTT | Low | One-to-many | Yes |
| CoAP | Low | One-to-one | Yes |
| AMQP | Medium | One-to-many | Yes |
| WebSockets | High | One-to-one | No |
| XMPP | Medium | One-to-many | No |
| DDS | Medium | One-to-many | Yes |

5 Future Perspectives
The current trends in the growth of IoT networks and devices, combined with the reduction in size, cost and power consumption of such systems, will definitely lead to increased investment in novel communication methods and protocols. The communication between devices and with central systems will require a network of interconnected nodes and gateways that are able to communicate at different distances and relay the required information to its final destination. The needs of these systems will power an evolution in communication and network protocols; however, to accelerate the integration and deployment of such devices and keep up with the fast demands of the market, current know-how and methodologies will have to be leveraged. The inclusion of proven technologies and methods will allow for more robust systems; for instance, reusing and adapting the existing standards that created the Internet will enable a quicker integration between the Internet of Things and the World Wide Web. The adaptation of network protocols such as Wi-Fi will allow for the reuse of already deployed networks in different environments. Nevertheless, not all scenarios will allow for the reuse of existing technologies, and therefore new technologies that allow for lower-power and wider-range communication will emerge. Technologies such as LoRa and SigFox are examples of recent networks with those properties. These newer technologies are subject to validation and market adoption, but will enable multiple applications that were not possible several years ago. The development of new distributed communication protocols allows the decentralization of systems, increasing reliability and pushing data analysis closer to where the data is collected, reducing the amount of information that needs to travel long distances. These protocols will also move towards lower overhead to ensure that more data is transmitted in each communication. The continued miniaturization of the electronics that compose the IoT will push communication methods towards ever lower energy requirements, while driving an increase in the amount of data transmitted and the distance it travels. These evolutions will lead to ubiquitous distributed networks that will populate our everyday lives.
6 Conclusions
The existing landscape of IoT protocols is vast, and each protocol has its own set of advantages and disadvantages. This is, however, a beneficial situation, since the multiple deployment scenarios with different characteristics make a "one size fits all" protocol hard, if not impossible, to come by. The current IoT protocols have the potential to accelerate the deployment of the IoT, ensuring that it can meet the demands of the market while guaranteeing the resilience and security of the systems. However, the evolution trends will push such systems towards smaller, lower-power, longer-distance novel communication and network protocols, driving a shift in the IoT paradigm. The IoT will no longer exist in small contained deployments but will interconnect multiple systems of sensors and actuators, providing a clearer picture of the world.

Acknowledgements. This work has received funding from National Funds through FCT under the project UIDB/00760/2020.
References

1. Internet of Things Forecast. https://www.ericsson.com/en/mobility-report/internet-of-things-forecast. Accessed July 2018
2. ITU-T Study Group 13: Recommendation ITU-T Y.4000/Y.2060 - Overview of the Internet of Things (July 2012)
3. 11 Internet of Things (IoT) Protocols You Need to Know About. https://www.rs-online.com/designspark/eleven-internet-of-things-iot-protocols-you-need-to-know-about. Accessed Jan 2018
4. Different Wi-Fi Protocols and Data Rates. https://www.intel.com/content/www/us/en/support/articles/000005725/network-and-i-o/wireless-networking.html. Accessed Jan 2018
5. LoRa Alliance Technology. https://www.lora-alliance.org/technology. Accessed Jan 2018
6. CAT-M1 vs NB-IoT – Examining the Real Differences. https://www.iot-now.com/2016/06/21/48833-cat-m1-vs-nb-iot-examining-the-real-differences/. Accessed Jan 2018
7. Zigbee 3.0. http://www.zigbee.org/zigbee-for-developers/zigbee-3-0/. Accessed Jan 2018
8. Bluetooth Low Energy (BLE) Fundamentals. https://www.embedded.com/design/connectivity/4442870/Bluetooth-low-energy--BLE--fundamentals. Accessed Jan 2018
9. Mulligan, G.: The 6LoWPAN architecture. In: Proceedings of the 4th Workshop on Embedded Networked Sensors (2007)
10. Bormann, C., Ersue, M., Keranen, A.: RFC 7228 - Terminology for constrained-node networks. IETF (May 2014)
11. Belshe, M., Peon, R., Thomson, M.: RFC 7540 - Hypertext Transfer Protocol version 2 (HTTP/2). IETF (May 2015)
12. HiveMQ - MQTT Essentials. https://www.hivemq.com/mqtt-essentials/. Accessed Jan 2018
13. CoAP - RFC 7252 Constrained Application Protocol. http://coap.technology/. Accessed Jan 2018
14. Shelby, Z., Hartke, K., Bormann, C.: RFC 7252 - The Constrained Application Protocol (CoAP). IETF (June 2014)
15. From MQTT to AMQP and Back. http://vasters.com/blog/From-MQTT-toAMQP-and-back/. Accessed Jan 2018
16. Fette, I., Melnikov, A.: RFC 6455 - The WebSocket protocol. IETF (December 2011)
17. An Overview of XMPP. http://xmpp.org/about/technology-overview.html. Accessed Jan 2018
18. XMPP - Internet of Things (IoT). https://xmpp.org/uses/internet-of-things.html. Accessed Jan 2018
19. What is DDS? http://portals.omg.org/dds/what-is-dds-3/. Accessed Jan 2018
Livestock Welfare by Means of an Edge Computing and IoT Platform

Mehmet Öztürk, Ricardo S. Alonso, Óscar García, Inés Sittón-Candanedo, and Javier Prieto
Bisite Research Group, University of Salamanca, Salamanca, Spain {mehmet,ralorin,oscgar,isittonc,javierp}@usal.es
Air Institute, IoT Digital Innovation Hub, Salamanca, Spain

Abstract. The drop in the area of land available for agriculture and the growth of the population are creating more food demand, which makes farmers turn to new technologies to increase or maintain the quantity and quality of agricultural products. Cloud Computing has played an important role in the last decade. Unlike Cloud Computing, Edge Computing processes the generated data at the network edge, which allows for the implementation of services with shorter response times, a higher Quality of Service (QoS), increased security and lower costs. In this paper, we present a platform that combines IoT, Edge Computing, Machine Learning and Blockchain techniques, based on the Global Edge Computing Architecture, to monitor the state of livestock in real time, as well as to ensure the traceability and sustainability of the different processes involved in production. The platform's effectiveness is tested by comparison with a traditional cloud-based architecture.

Keywords: Internet of Things · Edge Computing · Distributed Ledger Technologies · Smart Farming · Precision agriculture · Livestock monitoring

1 Introduction
A report by the World Health Organization suggests that while 218 million tonnes of meat were produced in 1999, by 2030 this is projected to increase to around 376 million tonnes, while the population grows by 2.25 billion people from today's levels, reaching 9.15 billion by 2050 [1]. As this global demand for animal products (meat, milk, eggs) is set to increase by up to 70% by 2050, so too does the pressure on the livestock farming sector. This brings various issues: a need to increase resource efficiency, become more environmentally friendly, and ensure the safety and quality of the product. Hence, small- and medium-sized farms could benefit from the implementation of low-cost technologies in many dimensions [2].

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2021. P. Novais et al. (Eds.): ISAmI 2020, AISC 1239, pp. 156–165, 2021. https://doi.org/10.1007/978-3-030-58356-9_16

In the last decade, technology has become more accessible and cheaper [2]. IoT and the Cloud have brought new technology-related concepts such as Precision
Agriculture, which measures and responds to the variability of the agricultural data gathered by sensors [3]; and Smart Farming, which applies information and data technologies to analyse location, historical, real-time and forecast data before taking specific actions [4]. The Internet of Things (IoT) paradigm [5] is essential in the monitoring of resources, connecting multiple heterogeneous objects on livestock farms, such as buildings (e.g., barns), machinery (e.g., agricultural tractors) and living organisms (e.g., cattle) [6,7]. Edge Computing [8] reduces the costs associated with computing, storage and network resources in the Cloud by deploying services at the edge of the network; it also reduces service response times and increases the Quality of Service (QoS) and the security of applications [9]. Distributed Ledger Technologies can also be used in smart farming scenarios so that end consumers can track the processes behind a product, guaranteeing the integrity of the information [10]. This work presents a new agro-industry platform implementing Edge Computing, Artificial Intelligence and Blockchain techniques in Smart Farming environments to monitor, in real time, the state of livestock and feed grain, while ensuring the traceability and sustainability of the different processes involved in production. The rest of this paper is structured as follows. Section 2 identifies the most important trends in the application of the IoT and Edge Computing paradigms in Smart Farming scenarios. The Global Edge Computing Architecture (GECA) [11] is described in Sect. 3. Finally, conclusions and future work are outlined in Sect. 4.
2 Edge Computing and IoT in Smart Farming
Thanks to technical and communication advances, the number of sensors and devices that can be implemented in agricultural solutions has grown enormously [12]. This growth and accessibility have favored the emergence of IoT and Cloud solutions, giving rise to a phenomenon known as Smart Farming [3]. Smart Farming provides exhaustive analysis and performs precise actions (e.g., decision-support information, task automation), taking into account the location of assets, cattle or humans, and other data enriched by historical, real-time and forecast information and knowledge [3,4]. The term Internet of Things (IoT) refers to connecting multiple heterogeneous objects, such as machinery, vehicles or buildings, with electronic devices such as sensors and actuators, through different communication protocols, in order to gather and extract data [5]. In addition to the challenges encountered in the management of heterogeneous resources, the acquisition, processing and transmission of data also become problematic when dealing with millions of connected data sources [13]. Edge Computing (EC) is a competent paradigm for solving those problems. Multiple areas have benefited from combining IoT and Edge Computing, such as health care [14], augmented reality [15], energy and smart grids [16], smart cities [17] and, as in the case of this research, Smart Farming [3].
There are multiple use cases that provide IoT and Edge Computing solutions to specific problems in Smart Farming. [18] presented an IoT system that measures the quantity and quality of grain in a silo, using multiple sensors to measure temperature and humidity. [19] propose controlling irrigation for hydroponic precision farming by combining multiple sensors and pumps to take smart measurements of the pH and water at the hydroponic facility. Understanding the behavior of cattle and herds is important to improving animal welfare, which leads to better management, productivity and product quality; an example of this is designed by [20]. There are also approaches aimed at detecting and preventing plagues by means of IoT and Artificial Intelligence techniques [21]. [22] propose an electronic nose that detects apple mold by means of multiple sensors, neural networks, Linear Discriminant Analysis and Support Vector Machines. [23] presented a generic architecture formed by five layers: Perception Layer (data ingestion), Network Layer (communications), Middleware Layer (service management), Application Layer (management of application objects) and Business Layer (management of the whole system). Although this approach has five layers, the solution does not take security issues into account, which can have a negative impact on data management and product traceability. [24] present a full-stack solution for a connected farm. This module communicates with expert systems and controls the deployment of the IoT systems that monitor crops. However, it lacks flexibility given the great heterogeneity of existing communication devices and protocols; security is also an aspect that has not been addressed in this work. Agri-IoT is a framework that favors agricultural data analytics [25]. It is divided into three levels: a lower level (IoT devices and communication), an intermediate level (data management and analytics) and a higher level (application). The framework is tested in two scenarios (management of the fertility of dairy cows and soil fertility for crop cultivation) in which its performance is verified; however, the need to include open standards that would increase the flexibility of the framework is also evident. Finally, [26] propose a scalable framework for data analysis in which edge nodes pre-process and analyze the private data collected before sending the results to a remote server, which collects these results to estimate and predict the total yield of the crop. When [26] apply their framework in a real scenario on a tomato farm, the error rate is comparable to the case executed only on the server. In the works found in the state of the art, it can be observed that the developments lack security features and are limited in their generality. The next section presents a novel Edge Computing Reference Architecture that covers these gaps. This new architecture provides the basis for the development of a new platform for application in smart agricultural and livestock environments.
Livestock Welfare by Means of Edge Computing and IoT
3 An Intelligent Edge-IoT Platform for Monitoring Livestock
The novel Global Edge Computing Architecture (GECA) [11], designed as an Industry 4.0-oriented Edge Computing reference architecture, has been used to implement a new agro-industry platform, SmartDairyTracer. GECA is a layered, modular architecture that provides and manages solutions for different environments such as Industry 4.0, smart cities, smart energy [27] or smart farming [28]; it is briefly described in Sect. 3.1. The SmartDairyTracer platform, depicted in Sect. 3.2, is aimed at smart monitoring, sustainability and traceability of products at farms (i.e., crop and livestock farms), in the dairy industry (e.g., milk, cheese, butter, etc.) and during transportation to the end consumer. Its functionalities are enabled by GECA, which provides the platform with Internet of Things, Artificial Intelligence and blockchain technologies. The first stage of development focuses on livestock and crops. The new SmartDairyTracer platform is deployed and tested in a real scenario on a mixed dairy farm in Castrillo de la Guareña, in the province of Zamora (Spain).

3.1 Global Edge Computing Architecture
The edge computing architecture employed in this study was first proposed in [11] and consists of three principal layers: the IoT, Edge and Business Solution layers. The authors of the Global Edge Computing Architecture [11] analyzed four of the most important reference architectures in the field of Edge Computing applications in industrial environments [29–32]. Their aim was to build an architecture that was complete in all senses and covered all the important needs of these environments. The platform presented in this paper is based on the GECA architecture.

3.2 SmartDairyTracer: Livestock Monitoring, Sustainability and Traceability by Means of IoT, AI and Blockchain Technologies
Researchers from the University of Salamanca (Spain) and the Digital Innovation Hub (Salamanca, Spain) are building a consortium gathering different profiles (livestock managers, farmers, dairy and meat industries, IoT technology providers, ICT experts, energy engineers and scientific community researchers) with an extensive background in different activities and technologies (irrigation control, energy management and optimization, cattle welfare monitoring, pest and plague detection in crops) to involve the whole product value chain in the roll-out of several use cases. Making use of currently available innovative technologies and solutions (IoT, Distributed Ledger Technologies and AI, among others), these use cases will provide an integral and open solution in the form of a smart platform based on FIWARE [33] for the improvement of the whole farming industry: optimization of processes, reduction of water and energy consumption, reduction
M. Öztürk et al.
Fig. 1. Schema of the complete Smart Dairy Tracer platform following the Global Edge Computing Architecture.
of pesticides in associated crops, promotion of sustainable and environmentally friendly production, monitoring of animal welfare and deployment of a reliable agri-food traceability system. This solution will collect information from each stage (using IoT to monitor cow health and state, product processing or transport safety) and will share it through a reliable, secure and transparent platform (based on Distributed Ledger Technologies) to provide valuable information to stakeholders (enabling them to optimize their procedures through the integration of AI-assisted support) and customers (building a P2P "enabler" between producers and consumers, as well as providing an accurate traceability system and information about the health state and conditions of the livestock). The projected SmartDairyTracer platform, whose full architecture based on GECA is illustrated in Fig. 1, is focused on three main cornerstones: monitoring, through IoT technologies; sustainability, thanks to the application of AI algorithms; and traceability, achieved by means of innovative Distributed Ledger Technologies. SmartDairyTracer will include the following sources and IoT standards:

– Livestock farms: ambient sensors inside barns
– Agro-meteo stations in the crops used to feed livestock
– Cattle sensors: real-time location [34] and livestock health conditions (body temperature, breath, heart rhythm and rumination), using Wi-Fi technologies
– Factories: RFID tags and QR codes will be incorporated for the traceability of the different packaged products (milk, cheese, butter, etc.)
– Transportation: the time and conditions of transport (temperature, humidity, vibrations, etc.)
The values of the monitored parameters will be gathered through wireless IoT technologies, managed by means of Lightweight M2M [35] and transferred using FIWARE [33] to the IoT Service and Mediation layers, respectively. Auxiliary gateways and proxies will be deployed when necessary, both in the Edge on Site and in the Near Edge. The first version of the SmartDairyTracer platform has been implemented following the Industry 4.0-oriented Edge Computing reference architecture (GECA). Its main objective is to lay the foundations of the whole SmartDairyTracer platform and, accordingly, develop an agro-industry platform designed to monitor, track and optimize the management tasks of mixed crop-livestock farms. Thus, to test and validate the new agro-industry platform, it has been deployed in a real scenario on a dairy cow farm located in Castrillo de la Guareña, in the province of Zamora (Castile and León, Spain). The farm has two barns of 1850 and 1650 m², respectively, which hold 180 dairy cows. This dairy farm also has 302 ha of associated crops, including corn, alfalfa and rye used as fodder for the livestock. The main objective of implementing this use case is to monitor, by means of IoT and Edge Computing technologies, all the resources used in livestock tracing, as well as the parameters related to the livestock, their environment and the associated crops (alfalfa, corn and rye) used to feed them. Monitoring has a twofold purpose: it allows the resources used in Precision Livestock Farming (PLF) to be optimized through Business Intelligence, Data Analytics and Machine Learning techniques, while traceability is achieved by means of Distributed Ledger Technologies (blockchain).
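To make the FIWARE-based transfer concrete, the sketch below shapes a cattle reading as an NGSI-v2 entity of the kind an Orion Context Broker accepts at its `/v2/entities` endpoint. The entity id, type, attribute names and broker address are hypothetical placeholders, not values from the SmartDairyTracer deployment.

```python
# Hedged sketch: building a FIWARE NGSI-v2 entity from a sensor reading.
# Identifiers below are made-up examples, not the platform's data model.
import json

def build_ngsi_entity(entity_id, entity_type, **attrs):
    """Return an NGSI-v2 entity dict with Number-typed attributes."""
    entity = {"id": entity_id, "type": entity_type}
    for name, value in attrs.items():
        entity[name] = {"value": value, "type": "Number"}
    return entity

entity = build_ngsi_entity(
    "urn:ngsi:Cow:042", "Cow",            # hypothetical identifiers
    bodyTemperature=38.6, heartRate=62.0
)
print(json.dumps(entity, indent=2))
# In a live deployment this dict would be POSTed to the broker, e.g.:
#   requests.post("http://<broker>:1026/v2/entities", json=entity)
```

The attribute `value`/`type` pairing follows the NGSI-v2 entity representation; richer metadata (units, timestamps) can be attached per attribute in the same format.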
Following the requirements of the first stage to be implemented within the SmartDairyTracer platform, as well as the design patterns of the Global Edge Computing Architecture, the following specifications were defined for each of the three layers of the architecture, as reflected in the platform in Fig. 2.

– IoT Layer: includes all IoT devices designated for monitoring livestock-related parameters (location, activity patterns and health status, through biometric sensors) and their environment (ambient conditions of the barns, in order to detect potential stress and hazardous concentrations of gases).
– Edge Layer: collects all the information gathered by the IoT devices in the lower layer.
– Business Solution Layer: is deployed in the agro-industry platform as a set of coordinated components. SQL and NoSQL databases and back-end Web Services are deployed through serverless Function as a Service, together with the Artificial Intelligence algorithms of the Cloud Computing platform.

Thanks to the GECA reference architecture, the new platform is organized in a set of layers that allow components to be added and removed dynamically, making it possible to scale the implemented systems horizontally over time. The Cloud
Fig. 2. The first version of the agro-industry platform based on the novel Edge Computing reference architecture (GECA).
provides flexibility to the Business Solution Layer thanks to data pre-filtering in the Edge layer, as well as knowledge extraction at both layers. Moreover, thanks to the distributed ledger technologies underlying the entire architecture, the data read and collected by the sensors and devices at the IoT layer are unalterable as they are recorded for traceability through the Edge and Business Solution layers.
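A minimal sketch of the pre-filtering role the Edge layer plays here: discarding readings duplicated by link-layer retransmissions and forwarding only windowed averages to the Cloud instead of raw samples. The frame identifiers and window size are illustrative assumptions, not details of the platform.

```python
# Sketch of edge-node data reduction (assumed frame ids and window size).

def edge_reduce(frames, window=3):
    """frames: iterable of (frame_id, value) pairs as received from the IoT
    layer. Returns the per-window averages that would be sent to the Cloud."""
    seen, values, averages = set(), [], []
    for frame_id, value in frames:
        if frame_id in seen:           # duplicate from retransmission: discard
            continue
        seen.add(frame_id)
        values.append(value)
        if len(values) == window:      # forward one aggregate, not raw samples
            averages.append(sum(values) / window)
            values.clear()
    return averages

frames = [(1, 20.0), (1, 20.0), (2, 21.0), (3, 22.0),
          (4, 23.0), (5, 23.5), (5, 23.5), (6, 24.0)]
print(edge_reduce(frames))   # [21.0, 23.5] -- 8 frames reduced to 2 values
```

Eight incoming frames become two transmitted values, which is the traffic and storage saving the conclusions below quantify qualitatively.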
4 Conclusions and Future Work
The Edge Computing paradigm makes it possible to reduce the costs associated with computing, storage and network resources in the Cloud, through the implementation of services on low-cost Edge nodes, such as microcomputers like the Raspberry Pi 3 Model B, on which it is even possible to run Machine Learning algorithms at the Edge itself using TensorFlow Lite libraries. In this work, it has been demonstrated that it is possible to reduce the costs associated with the transfer of data between the IoT and the remote Cloud by introducing design rules coming from a reference architecture, the Global Edge Computing Architecture, in Precision Livestock Farming (PLF). Moreover, the introduction of Edge nodes improves the reliability of communications to the Cloud, reducing the number of missing values in the Cloud database. The Edge nodes of the Global Edge Computing Architecture filter and preprocess data coming from devices in the IoT layer. Moreover, they are responsible
for discarding the values that have been repeated due to the retransmission of frames from the physical sublayers (ZigBee, Wi-Fi) to the IoT layer. They can also perform averaging and regression data analysis at the Edge itself. In both cases, the amount of data transmitted to the Cloud is reduced, cutting the costs of data traffic as well as the need for computation and storage in the Cloud. Furthermore, at a qualitative level, the new version of the agro-industry platform benefits in terms of security, traceability and data integrity, due to the characteristics provided by the elements associated with the Distributed Ledger Technologies of the Global Edge Computing Architecture, including the blockchain itself and the Crypto-IoT boards and oracles. The platform will be implemented on several animal farms in the same region. The experiments will be conducted simultaneously on multiple farms, where Machine Learning techniques will be applied to learn about the different conditions in which cows suffer stress or illnesses that affect productivity. Moreover, the authors will complete the development of the SmartDairyTracer platform, which will give the agri-food industry and all those involved in it, such as farmers and stockbreeders, an opportunity to foster loyalty and gain new consumers.

Acknowledgments. This work has been partially supported by the European Regional Development Fund (ERDF) through the Interreg Spain-Portugal V-A Program (POCTEP) under grant 0677 DISRUPTIVE 2 E (Intensifying the activity of Digital Innovation Hubs within the PocTep region to boost the development of disruptive and last generation ICTs through cross-border cooperation). Inés Sittón-Candanedo has been supported by the IFARHU – SENACYT scholarship program (Government of Panama). The authors would like to give special thanks to Rancho Guareña Hermanos Olea Losa, S.L.
(Castrillo de la Guareña, Zamora, Spain) for their collaboration during the implementation and testing of the platform.
References

1. World agriculture towards 2030/2050: the 2012 revision, June 2012
2. Fleming, K., Waweru, P., Wambua, M., Ondula, E., Samuel, L.: Toward quantified small-scale farms in Africa. IEEE Internet Comput. 20(3), 63–67 (2016)
3. Wolfert, S., Ge, L., Verdouw, C., Bogaardt, M.-J.: Big data in smart farming - a review. Agric. Syst. 153, 69–80 (2017)
4. Wolfert, S., Goense, D., Sørensen, C.A.G.: A future internet collaboration platform for safe and healthy food from farm to fork. In: 2014 Annual SRII Global Conference, pp. 266–273. IEEE (2014)
5. Kethareswaran, V., Sankar Ram, C.: An Indian perspective on the adverse impact of internet of things (IoT). ADCAIJ: Adv. Distrib. Comput. Artif. Intell. J. 6(4), 35–40 (2017)
6. Patil, K.A., Kale, N.R.: A model for smart agriculture using IoT. In: 2016 International Conference on Global Trends in Signal Processing, Information Computing and Communication (ICGTSPICC), pp. 543–545, December 2016
7. Jayaraman, P., Yavari, A., Georgakopoulos, D., Morshed, A., Zaslavsky, A.: Internet of things platform for smart farming: experiences and lessons learnt. Sensors 16(11), 1884 (2016)
8. Ai, Y., Peng, M., Zhang, K.: Edge computing technologies for internet of things: a primer. Digit. Commun. Netw. 4(2), 77–86 (2018)
9. Lin, J., Wei, Y., Zhang, N., Yang, X., Zhang, H., Zhao, W.: A survey on internet of things: architecture, enabling technologies, security and privacy, and applications. IEEE Internet Things J. 4(5), 1125–1142 (2017)
10. Patil, A.S., Tama, B.A., Park, Y., Rhee, K.-H.: A framework for blockchain based secure smart green house farming. In: Advances in Computer Science and Ubiquitous Computing, pp. 1162–1167. Springer (2017)
11. Sittón-Candanedo, I., Alonso, R.S., Corchado, J.M., Rodríguez-González, S., Casado-Vara, R.: A review of edge computing reference architectures and a new global edge proposal. Future Gener. Comput. Syst. 99, 278–294 (2019)
12. Alonso, R.S., Tapia, D.I., Bajo, J., García, Ó., de Paz, J.F., Corchado, J.M.: Implementing a hardware-embedded reactive agents platform based on a service-oriented architecture over heterogeneous wireless sensor networks. Ad Hoc Netw. 11(1), 151–166 (2013)
13. Sittón, I., Rodríguez, S.: Pattern extraction for the design of predictive models in industry 4.0. In: International Conference on Practical Applications of Agents and Multi-Agent Systems, pp. 258–261. Springer (2017)
14. Rahmani, A.M., Gia, T.N., Negash, B., Anzanpour, A., Azimi, I., Jiang, M., Liljeberg, P.: Exploiting smart e-health gateways at the edge of healthcare internet-of-things: a fog computing approach. Future Gener. Comput. Syst. 78, 641–658 (2018)
15. Morabito, R., Cozzolino, V., Ding, A.Y., Beijar, N., Ott, J.: Consolidate IoT edge computing with lightweight virtualization. IEEE Netw. 32(1), 102–111 (2018)
16. Singh, S., Yassine, A.: IoT big data analytics with fog computing for household energy management in smart grids.
In: International Conference on Smart Grid and Internet of Things, pp. 13–22. Springer (2018)
17. Taleb, T., Dutta, S., Ksentini, A., Iqbal, M., Flinck, H.: Mobile edge computing potential in making cities smarter. IEEE Commun. Mag. 55(3), 38–43 (2017)
18. Agrawal, H., Prieto, J., Ramos, C., Corchado, J.M.: Smart feeding in farming through IoT in silos. In: The International Symposium on Intelligent Systems Technologies and Applications, pp. 355–366. Springer (2016)
19. Cambra, C., Sendra, S., Lloret, J., Lacuesta, R.: Smart system for bicarbonate control in irrigation for hydroponic precision farming. Sensors 18(5), 1333 (2018)
20. Chien, Y.-R., Chen, Y.-X.: An RFID-based smart nest box: an experimental study of laying performance and behavior of individual hens. Sensors 18(3), 859 (2018)
21. Potamitis, I., Rigakis, I., Tatlas, N.-A., Potirakis, S.: In-vivo vibroacoustic surveillance of trees in the context of the IoT. Sensors 19(6), 1366 (2019)
22. Jia, W., Liang, G., Tian, H., Sun, J., Wan, C.: Electronic nose-based technique for rapid detection and recognition of moldy apples. Sensors 19(7), 1526 (2019)
23. Khan, R., Khan, S.U., Zaheer, R., Khan, S.: Future internet: the internet of things architecture, possible applications and key challenges. In: 2012 10th International Conference on Frontiers of Information Technology, pp. 257–260. IEEE (2012)
24. Ryu, M., Yun, J., Miao, T., Ahn, I.-Y., Choi, S.-C., Kim, J.: Design and implementation of a connected farm for smart farming system. In: 2015 IEEE Sensors, pp. 1–4. IEEE (2015)
25. Kamilaris, A., Gao, F., Prenafeta-Boldú, F.X., Ali, M.I.: Agri-IoT: a semantic framework for internet of things-enabled smart farming applications. In: 2016 IEEE 3rd World Forum on Internet of Things (WF-IoT), pp. 442–447. IEEE (2016)
26. Park, J., Choi, J.-H., Lee, Y.-J., Min, O.: A layered features analysis in smart farm environments. In: Proceedings of the International Conference on Big Data and Internet of Things, BDIOT 2017, pp. 169–173, New York, NY, USA. ACM (2017)
27. Sittón-Candanedo, I., Alonso, R.S., García, Ó., Gil, A.B., Rodríguez-González, S.: A review on edge computing in smart energy by means of a systematic mapping study. Electronics 9(1), 48 (2020)
28. Alonso, R.S., Sittón-Candanedo, I., García, Ó., Prieto, J., Rodríguez-González, S.: An intelligent edge-IoT platform for monitoring livestock and crops in a dairy farming scenario. Ad Hoc Netw. 98, 102047 (2020)
29. Project FAR-EDGE: FAR-EDGE Project H2020, November 2017
30. INTEL-SAP: IoT Joint Reference Architecture from Intel and SAP. Technical report, INTEL-SAP, November 2018
31. Edge Computing Consortium and Alliance of Industrial Internet: Edge Computing Reference Architecture 2.0. Technical report, Edge Computing Consortium, November 2017
32. Tseng, M., Canaran, T.E., Canaran, L.: Introduction to Edge Computing in IIoT. Technical report, Industrial Internet Consortium (2018)
33. de la Prieta, F., Gil, A.B., Moreno, M., Muñoz, M.D.: Review of technologies and platforms for smart cities. In: Rodríguez, S., Prieto, J., Faria, P., Klos, S., Fernández, A., Mazuelas, S., Jiménez-López, M.D., Moreno, M.N., Navarro, E.M. (eds.) Distributed Computing and Artificial Intelligence, Special Sessions, 15th International Conference, Advances in Intelligent Systems and Computing, pp. 193–200. Springer International Publishing (2019)
34.
De Paz, J.F., Tapia, D.I., Alonso, R.S., Pinzón, C.I., Bajo, J., Corchado, J.M.: Mitigation of the ground reflection effect in real-time locating systems based on wireless sensor networks by using artificial neural networks. Knowl. Inf. Syst. 34(1), 193–217 (2013)
35. Trentin, I.F., Berlemont, S., Barone, D.A.C.: Lightweight M2M protocol: archetyping an IoT device, and deploying an upgrade architecture. In: 2018 IEEE International Conference on Pervasive Computing and Communications Workshops (PerCom Workshops), pp. 403–408, March 2018
Sleep Performance and Physical Activity Estimation from Multisensor Time Series Sleep Environment Data

Celestino Gonçalves1,2, Diogo Rebelo2, Fábio Silva2,3(B), and Cesar Analide2

1 Polytechnic Institute of Guarda, Guarda, Portugal
[email protected] 2 ALGORITMI Centre, University of Minho, Braga, Portugal
[email protected], [email protected] 3 CIICESI, ESTG, Politécnico do Porto, Porto, Portugal [email protected]
Abstract. Sleep is an essential physiological function, needed for the proper functioning of the brain and therefore for general well-being and a good quality of life. Currently, more and more people use mobile and wearable devices to monitor their sleep and physical activity. But while some of these recent devices may provide reliable measurements of sleep structure, they do not provide a consolidated and definite answer as to what has contributed to a given outcome. With the present study, we intend to verify and establish relationships between environmental factors, such as temperature, humidity, luminosity, noise or air quality, and sleep performance and physical activity, and to assess the legal issues raised under the GDPR. A multisensor monitoring system was used to obtain a real dataset consisting of 55 night sessions of time series sleep environment data. We have also explored the feasibility of using time series machine learning models to predict sleep stages and to estimate the level of physical activity of a person.

Keywords: Sleep monitoring · Sleep structure · Time series monitoring system · Machine learning · Sleep stages prediction · Activity estimation
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2021
P. Novais et al. (Eds.): ISAmI 2020, AISC 1239, pp. 166–176, 2021. https://doi.org/10.1007/978-3-030-58356-9_17

1 Introduction

It can be said that the scientific study of sleep only started about a hundred years ago, with the invention of the electroencephalograph; however, only in the last decades has it been possible to obtain a more rigorous understanding of this physiological function, which is essential for the proper functioning of the brain [1]. It is a complex process [2] that has only recently been given due attention by science and medicine, given the growing awareness of its importance [3]. The most widely used objective methods to measure sleep are polysomnography (PSG) and actigraphy [4]. PSG is considered the ground truth for measuring sleep, as it combines multiple sensors to measure various parameters, such as eye movement, brain activity, heart rate, muscle tone and physical movement; however,
it is an expensive and intrusive method, usually carried out in a clinical or laboratory environment, and is not normally used for data collection over very long periods, given the inconvenience and discomfort it causes in usual routines and the possible, unwanted effect it may have on the normal sleep process. Actigraphy consists of the use of a wristwatch-like device, which measures movement as an indicator of wakefulness. It is a more economical, non-intrusive alternative, which can be used in the domestic environment and for long periods of time; however, it is not as accurate as PSG [5]. This recent interest in sleep and concern for its measurement, coupled with the recent development of technologies and growth in the consumer market for wearable health devices, has allowed the emergence of multiple non-intrusive, affordable sleep monitoring solutions that users can use at home, in some cases with very acceptable levels of precision [6, 7]. So, we can say that nowadays, with all this technology and the quantity of devices available, it is relatively simple to obtain sleep data at home, unobtrusively and for long periods of time. However, what these solutions are not yet able to provide is a justification for the values obtained for the various sleep parameters, or for what may have contributed to some specific outcomes. In fact, sleep is multi-dimensional and involves not only objective but also subjective parameters [3, 8], so it has not been easy to obtain a strict and concise definition of sleep quality [4, 9]. On the other hand, sleep can also be affected by a particular set of contextual factors [2], such as environmental factors like temperature, humidity, luminosity, noise and poor air quality [10, 11].
Thus, it would be interesting and useful to be able to establish relationships between the obtained sleep data and the environmental factors relevant to these results, thereby helping people to be aware of and properly interpret their sleep architecture and performance. This new opportunity for research led us to devise a multisensor system for monitoring the sleep environment, to try to understand the effect that these factors have on the structure and quality of sleep. These issues were explored through an experimental study over a period of about two months.
2 Related Work

Typical human sleep comprises four to five sleep cycles in one night, each lasting about 90 to 120 min [2]. In each of these cycles, sleep goes through two distinct phases: NREM (non-rapid eye movement) and REM (rapid eye movement). The NREM sleep phase, or slow sleep, is usually subdivided into three states or depth levels: N1 (onset of sleep), N2 (light sleep) and N3 (deep sleep) [1]. Often, the N3 state is also referred to as slow wave sleep (SWS), presenting regular breathing and heart rate and low muscle tone. The REM sleep phase presents irregular breathing and heart rate, null muscle tone and, as the name implies, rapid eye movements. Sleep starts with the N1 state, which lasts only a few minutes, usually about five. The N2 state generally corresponds to almost 50% of total sleep time. The N3 state occurs essentially in the first third of sleep, whereas REM sleep occurs predominantly in the last third, with a total duration of about 20% of sleep time.
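The proportions quoted above can be made concrete with a toy computation over a hypnogram (one stage label per 30-s scoring epoch, as in standard sleep scoring); the hypnogram below is synthetic, not data from this study.

```python
# Toy sketch: stage shares from a per-epoch hypnogram (synthetic data).
from collections import Counter

def stage_shares(hypnogram):
    """hypnogram: list of stage labels ('N1','N2','N3','REM'), one per epoch.
    Returns the percentage of total sleep time spent in each stage."""
    counts = Counter(hypnogram)
    total = len(hypnogram)
    return {stage: round(100 * n / total, 1) for stage, n in counts.items()}

# ~7.3 h of synthetic sleep: N2 near 50% and REM near 20%, as in the text.
hypnogram = ["N1"] * 10 + ["N2"] * 480 + ["N3"] * 190 + ["REM"] * 200
print(stage_shares(hypnogram))
# {'N1': 1.1, 'N2': 54.5, 'N3': 21.6, 'REM': 22.7}
```

The same per-epoch representation is what the prediction models discussed later emit, so a summary like this is how a predicted night would be turned into sleep-architecture statistics.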
2.1 Sleep Monitoring Systems

Our search of related work on sleep monitoring systems that estimate sleep structure and account for the effect of contextual factors resulted in the following examples. The first example is a comparative study, presented by Liang and Chapa-Martell [21], examining the performance of seven machine learning algorithms in estimating sleep structure. Additionally, they considered three resampling techniques to overcome the imbalance among the classes that represent each sleep stage. The next example is Lullaby [10], a capture and access system for understanding the sleep environment. The authors of this system use temperature, light and motion sensors, audio and photos, and an off-the-shelf sleep sensor to acquire a complete recording of a person's sleep. The system visualizes graphs and parameter recordings, allowing users to find trends and potential causes of sleep disruptions relative to environmental factors, helping them understand their sleep environment. SleepExplorer [11] is a web-based visualization tool for making sense of correlations between personal sleep data and contextual factors. It imports data from commercial sleep trackers and online diaries and helps users to better understand their sleep and to discover novel relationships between sleep data and contextual factors. The contextual factors collected during the field study were organized under four main categories: physiological factors (weight, body temperature and menstrual cycles), psychological factors (mood, stress, tiredness and dreams), behavioral factors (steps, minutes very active, minutes fairly active, minutes lightly active, calories in, calories out, activity calories, coffee, coffee time, alcohol, electronic device usage, evening light, nap time, nap duration, social activities, exercise time and dinner time) and environmental factors (ambient temperature and ambient humidity).
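To make the resampling idea mentioned for the first example concrete, here is a minimal sketch of the simplest such technique: random oversampling of minority sleep-stage classes up to the majority count. It is our own illustration under assumed toy data, not one of the three techniques evaluated in [21].

```python
# Hedged sketch of random oversampling for sleep-stage class imbalance.
import random

def oversample(samples, labels, seed=0):
    """Duplicate random minority-class samples until all classes match the
    majority class size. Returns the balanced (samples, labels)."""
    rng = random.Random(seed)
    by_label = {}
    for x, y in zip(samples, labels):
        by_label.setdefault(y, []).append(x)
    target = max(len(xs) for xs in by_label.values())
    out_x, out_y = [], []
    for y, xs in by_label.items():
        picks = xs + [rng.choice(xs) for _ in range(target - len(xs))]
        out_x.extend(picks)
        out_y.extend([y] * target)
    return out_x, out_y

# Toy imbalance: three N2 epochs vs. one REM epoch.
X = [[0.1], [0.2], [0.3], [0.9]]
y = ["N2", "N2", "N2", "REM"]
Xb, yb = oversample(X, y)
print(yb.count("N2"), yb.count("REM"))   # 3 3
```

In practice such balancing is applied only to the training split, so the evaluation data keeps the natural stage distribution.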
Finally, Borazio and Laerhoven [22] present a long-term sleep monitoring system combining wearable and environmental data to automatically detect the user's sleep at home. The system uses inertial, ambient light and time data tracked from a wrist-worn sensor, synchronized with night vision camera footage for easy visual inspection.

2.2 Re-aligning EU Data Protection Law with the Right to Reasonable Inferences of a Multisensor Sleep Monitoring System

Recent developments in Artificial Intelligence and globalization have brought to the fore the legal framework outlining the myriad of concerns the GDPR [12] enshrines, specifically in the rules regarding inferential analytics. Nowadays, Machine Learning models draw privacy-invasive, non-intuitive and unverifiable inductions about individuals. And despite all the attempts of the European legislator to guarantee uniform protection against automated individual decision-making, data subjects are still granted little control over how data is used to draw inferences. However, the implementation of a machine learning model does not necessarily entail new opportunities for unfair, discriminatory or biased automated processing (Art 5(1)) [13]. Even some sensitive predictions, whose outcomes portray autonomous estimations, can still be privacy-friendly and lawful.
For example, our Multisensor Sleep Monitoring System can determine the relationship between detailed environment data and sleep performance, aiming to provide explanations for the root causes of the quality of human sleep structure. By using alternative and non-personal input variables, such as temperature, humidity, light intensity, noise and air quality, the decision tree trained and tested can predict sleep stages with 89% accuracy and feasibly estimate the level of daily activity of an individual [14]. Since both inferences link information relating to an identifiable natural person through rule-based predictive reasoning, data protection law classifies them within the scope of non-verifiable 'personal data' (Art 4(1)) [15]. Apart from the opposite lessons stated in the YS., M. and S. [16] and Nowak [17] jurisprudence of the European Court of Justice (ECJ) - which leaves open the question of whether a result and the subsequent decision are personal data - the three-step approach recommended by the Article 29 Working Party falls completely within the scope of this sleep monitoring system [14]. The approach under study thus allows environment attributes to be transformed into personal data. While predicting sleep stages and estimating physical activity, the results encompass sensitive purposes in accordance with their strict legal restrictions (Art 9(1)) (Art 4(2)). Consequently, to guarantee lawfulness of processing, data subjects must give explicit consent to the data controller (Art 9(2,a)(4)). The algorithmic outcomes fall within special categories of personal data that pertain to behavioral biometric patterns (Art 4(14)) (Rec(24)(51)) and data concerning health (Art 4(15)) (Rec(35)(53)). Thus, establishing autonomous correlations between environment data, sleep performance and how active a person was during the day is likely to be used to evaluate, treat in a certain way or improve the health and physical status of end-users [18].
Through machine learning workflows, sleep health is detected after data is collected from multiple sensors. The tracking system perceives environment features and then adjusts machine learning profiling to different people (Art 4(4)) (Rec (72)). We can observe from the real dataset, consisting of 55 sessions of time series data, that different users in different situations produce different results. The analysis spectrum in later stages then uses the inferred attributes in a decision support system whose aim is merely improving the user's wellbeing (Art 5(1,a,b)) [14]. Such information, derived from solely automated decision-making, does not have serious impactful effects on a person's legal status, rights or interests [15, 19]. Hence, predicting sleep stages and estimating subjects' physical activity does not in any way infringe the right not to be subject to a decision based solely on automated processing (Art 22(1)) (Rec 71). The decontextualized effect of the profiling under analysis does not contribute to the standardization of users to such an extent that an 'expropriated identity' can be imposed on them. Firstly, the inferences do not have a prolonged or permanent impact on the data subject. Secondly, in any case, suggestions reported to the user via direct notification to a smartphone app do not lead to the exclusion or discrimination of individuals. And, last but not least, when put into practice, the multisensor monitoring system does not seem to have to comply with the GDPR's unenforceable entanglements, particularly concerning the right to explanation and to obtain human intervention in autonomous data processing (Art 22(3)) [15, 20]. Similarly, these systems lie beyond the current limitations placed on the technophobic remit of Data Protection Law; this intelligent sleep system can thus avoid and surpass any detrimental effect on the GDPR's broader aim of protecting privacy against the novel risks
posed by machine learning techniques. After all, the widespread distinction between types of personal data based on identifiability and sensitivity does not make sense when applied to inferences.
3 Multisensor Sleep Monitoring System

In order to determine the relationship between environment data and sleep structure and performance, a prototype was developed to monitor the variables that characterize the sleep environment: temperature, humidity, light intensity, sound level, atmospheric pressure and air quality (Fig. 1).
Fig. 1. Multisensor sleep monitoring system architecture.
Data acquisition is based on an Arduino MEGA 2560 microcontroller, which receives data from diverse sensors of temperature, humidity, light, sound, atmospheric pressure and air composition and quality, such as oxygen gas, carbon dioxide gas, particle concentration and various sensors of the MQ series: carbon monoxide (MQ-7), propane (MQ-2), natural gas/methane (MQ-4), liquefied petroleum gas/isobutane/propane (MQ-6), alcohol/ethanol (MQ-3), hydrogen (MQ-8), and air quality (MQ-135). Through a Bluetooth communication module, the Arduino sends the data received from all the ambient sensors to a smartphone that syncs with a backend server supporting sleep session management. In order to associate sleep structure with environment data, our multisensor monitoring system uses an activity wristband, a Fitbit Charge 3, to characterize the sleep structure of each sleep session. The approach of our multisensor system is illustrated in Fig. 2. First, sleep environment data and Fitbit data, from multiple sleep sessions, are collected and aggregated into a single dataset. Then, machine learning workflows are trained and tested with the aim of predicting sleep architecture, that is, the sleep stage for each time interval, considering only environment data. The algorithms applied were Decision Tree [23], Random Forest [24], Gradient Boosted Trees [25], and Probabilistic Neural Networks [26]. Finally, that knowledge will allow us to assess and estimate sleep quality and sleep performance, and will be used in later stages as input to a decision support system to improve people's wellbeing. The attributes used were taken from the environment and from physical activity, in order to add context to the data series fed to common machine learning algorithms. The
Sleep Performance and Physical Activity Estimation
171
Fig. 2. Multisensor machine learning sleep architecture prediction methodology.
time series data are built with a data pre-processing stage following best practices from the CRISP-DM methodology [27].
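As an illustration of one such pre-processing step, per-minute summarization of raw sensor samples (average, minimum and maximum, as described in the results section) could look like the sketch below. The function and sample values are hypothetical, not the authors' code:

```python
# Illustrative sketch: group raw (timestamp, value) sensor samples into
# one-minute buckets and keep the average, minimum and maximum of each.
from statistics import mean

def aggregate_per_minute(samples):
    """samples: iterable of (timestamp_seconds, value) pairs.
    Returns {minute_index: (avg, min, max)}."""
    buckets = {}
    for ts, value in samples:
        buckets.setdefault(int(ts // 60), []).append(value)
    return {minute: (mean(vals), min(vals), max(vals))
            for minute, vals in sorted(buckets.items())}

# toy temperature readings: three samples in minute 0, one in minute 1
readings = [(0, 21.0), (30, 23.0), (59, 22.0), (60, 25.0)]
summary = aggregate_per_minute(readings)
# minute 0 -> (22.0, 21.0, 23.0); minute 1 -> (25.0, 25.0, 25.0)
```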
4 Results from Experiments

The multisensor sleep monitoring system was deployed in a bedroom and used by a 50-year-old male subject during 55 consecutive sleep sessions, between January and February of 2020. One sample per minute of each environment sensor was collected, through either direct sensing or aggregation of values, in which case the average, minimum and maximum were stored. The data collected by the Fitbit wristband sensors during the same period allowed the definition of the sleep structure for each of the 55 night sessions considered for this experiment: 1-Deep sleep, 2-Light sleep, 3-REM sleep, and 4-Awake. The distribution of sleep stages shows a strong imbalance, with light sleep accounting for more than 60% of the occurrences among the 4 sleep stages: 60.9% for Light sleep, 15.1% for REM sleep, 12.7% for Deep sleep and 11.2% for the Awake stage. For the period of the experiment, the Total Sleep Time (TST) ranged from 174 min (2 h 54 m) to 575 min (9 h 35 m); the average time in each sleep stage was 61.8 min for deep sleep, 238.1 min for light sleep, 61.8 min for REM sleep and 67.2 min for the awake stage. The overall sleep score, assessed by Fitbit proprietary algorithms, ranged from 38% (Poor) to 89% (Good).

4.1 Data Pre-processing

A dataset was prepared considering the data from the sleep environment sensors acquired by the Arduino microcontroller, and the data of sleep structure estimation, sleep characterization, and physical activity during the day, obtained from the Fitbit Charge 3 wristband. The timestamp values of each entry of the dataset were converted into two integer values,
representing the hour and the minute of the corresponding timestamp. In addition, some other data were added to each dataset entry, such as the number of the night session (1 to 55), the acquisition ID (epoch or row ID of each session), the starting timestamp of the corresponding session (also converted into two integer values), the total sleep time, and the time spent in each sleep stage (Awake, REM, Light and Deep), in minutes. Finally, to conclude the preparation of the dataset, each entry received the total number of steps and floors taken by the subject during the corresponding day and, for the respective timestamp, the sleep stage estimation, converted to a numerical value: 1 for the Deep stage, 2 for Light, 3 for REM and 4 for Awake. Each entry of the dataset corresponds to the sensor values acquired every minute (every 60 s), for the 55 sleep sessions. In the case of the sound sensor, acquired every 125 ms, the per-minute maximum and summary values were considered. The data gathered was used to support an inference system, based on classification machine learning models complying with the guidelines described in Sect. 2.2, thus providing legal support for our proposed system.

4.2 Machine Learning Models Performance

The data of 55 consecutive night sleep sessions, obtained from our multisensor system, was used to develop machine learning models that estimate sleep structure from historic time series data. As shown in Fig. 1, the data was stored in a local database which is accessible to machine learning workflows developed on the KNIME Analytics Platform [28]. The machine learning workflows use algorithms that are compatible with the PMML specification [27], which means they can be transported to other machine learning frameworks while retaining the same characteristics.
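A workflow of this kind can be sketched as follows. The sketch uses scikit-learn in place of the authors' KNIME/PMML workflows (an assumption), synthetic data standing in for the real dataset, and the 70%/30% learning/prediction split together with a Random Forest classifier:

```python
# Sketch of a sleep-stage classification workflow under stated assumptions:
# scikit-learn stands in for KNIME, and the feature matrix is synthetic
# (six toy "environment" attributes, labels in a small set of stages).
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
X = rng.normal(size=(1000, 6))              # toy per-minute attributes
y = (X[:, 0] > 0).astype(int) + 1           # separable toy labels {1, 2}

# 70% of rows for the learning phase, 30% for the prediction phase
X_train, X_test, y_train, y_test = train_test_split(
    X, y, train_size=0.70, random_state=0)

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)
acc = accuracy_score(y_test, model.predict(X_test))  # "scorer" step
```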
With this approach the research team is able to test different theories, allowing the rapid development of decision support systems. The assessment of each machine learning model is made by executing the respective machine learning workflow. The algorithms tested were: (1) Decision Tree, (2) Random Forest, (3) Gradient Boosted Trees and (4) Probabilistic Neural Network. The outputs and predictions are stored in the same database from which the sensor data was extracted. Scorer nodes are used to obtain model performance directly. The learning phase of the classification workflow used 70% of the dataset rows, while the remaining 30% were used for the prediction phase. In order to assess the capacity of the considered classification algorithms to correctly identify the different sleep stages, the predicted sleep stages were compared with the sleep stages estimated by the Fitbit wristband. Table 1 presents the accuracy results of sleep stage prediction for each considered machine learning model. The analysis of the results allows us to conclude that the Random Forest algorithm performed best, not only in global accuracy but also in individual sleep stage estimation. The Decision Tree algorithm achieved very close results. A benefit of this model is its ability to predict the sleep stages of a test subject without intrusive measures. Another advantage is that the best performing algorithm, Random Forest, is an explainable algorithm, which means we can extract rule-based explanations to characterize each decision. These rules dictate which attributes are relevant for each decision for each
subject, and may allow the study of which dimension is more relevant for the sleep stage prediction of each subject.

Table 1. Accuracy of sleep stage prediction.

Machine learning algorithm   Overall prediction (%)   Sleep stage prediction (%)
                             Accuracy   Error         4-Awake   3-REM   2-Light   1-Deep
Decision Tree                90.3       9.7           85.8      86.1    92.6      88.5
Random Forest                93.2       6.8           87.2      88.7    95.8      90.9
Gradient Boosted Trees       72.3       27.7          42.7      24.7    97.6      33.1
PNN                          82.5       17.5          52.7      56.9    98.3      63.2
For the various tested algorithms, we found that the elements that most influenced and contributed to a greater precision in the prediction of sleep stages were the Temperature and Humidity values, as well as the values of the MQ7, MQ8, MQ135 and PM 10.0 sensors. These observations motivate further studies to assess the influence of such elements on sleep estimation. Aside from identifying sleep stages from time series data, this approach is able to characterize how active a person is, estimating the number of steps taken and the number of floors climbed by the subject during the day from the analysis of time series sleep data. Using the same approach, with the number of steps and floors extracted from the Fitbit wristband and with sleep data as input, it was possible to estimate the daily number of steps and floors with good results. Table 2 presents the overall accuracy of physical activity prediction for the considered machine learning classifiers. This provides evidence that it is possible to estimate the activity of people by analyzing only sleep environment data with our multisensor architecture and the sleep classification from the wristband. The next stage is to remove the need for classified sleep patterns, which came from the Fitbit wearable, so that this solution uses only contextual and environmental data.

Table 2. Overall accuracy of physical activity prediction.

Machine learning algorithm   Steps prediction          Floors prediction
                             MAE        MSE            MAE        MSE
Decision Tree                2.107      8.988          2.480      10.077
Random Forest                4.725      32.979         3.635      21.536
Gradient Boosted Trees       4.796      8.588          3.781      19.125
PNN                          6.710      60.157         3.960      23.776

(MAE = mean absolute error; MSE = mean square error)
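The error measures reported in Table 2 follow the standard definitions of mean absolute error and mean square error; a minimal sketch with illustrative values (not the paper's data):

```python
# Standard regression error measures, as used for the step/floor estimates.
def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def mse(y_true, y_pred):
    """Mean square error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

# toy daily step counts: true values vs. model predictions
steps_true = [5000, 6200, 4300]
steps_pred = [5100, 6000, 4500]
# mae = (100 + 200 + 200) / 3 ~ 166.67
# mse = (10000 + 40000 + 40000) / 3 = 30000.0
```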
As in the first approach, Random Forest and Decision Tree were among the best performing algorithms, and they also allow the extraction of rule-based explanations for each decision. This allows the study of how exercise may influence sleep performance for test subjects, which can support other services, such as sleep and exercise recommendation systems tailored to each subject.
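As a hedged illustration of such rule extraction (again using scikit-learn rather than the authors' KNIME workflows, with toy data and illustrative feature names), a fitted decision tree can be rendered as human-readable rules:

```python
# Extract rule-based explanations from a fitted tree; the data and the
# feature names (temperature, humidity) are toy stand-ins, not the
# authors' dataset.
from sklearn.tree import DecisionTreeClassifier, export_text

X = [[18.0, 40], [19.0, 45], [24.0, 70], [25.0, 75]]  # toy attribute rows
y = [1, 1, 2, 2]                                      # toy stage labels
tree = DecisionTreeClassifier(random_state=0).fit(X, y)

rules = export_text(tree, feature_names=["temperature", "humidity"])
print(rules)  # prints the learned if/then decision rules
```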
5 Conclusions and Future Work

The study presented here estimates sleep quality and sleep structure using environmental and physical activity data. The results demonstrate that it is possible to predict sleep stages and exercise activity with good accuracy using time series machine learning models. The advantage of these techniques is that not only can sleep tracking be done with contextual and environment data, but we can also estimate how active the person was during the day using trained machine learning models. These are services that people should expect from future IoT and smart city platforms, tailored to each person's needs. From the legal point of view, this article also addresses the legitimacy of these systems, and how they can be built following the rules and regulations of the GDPR and local legislation. It is our belief that the work towards specialized inference systems will yield benefits for our society, enabling a better quality of life and innovative decision support systems based on services such as the one described in this article. As future work, we intend to test other machine learning models and check whether the results of our classification algorithms can be improved by first balancing the classes corresponding to the different sleep stages, since there is, as we saw, a great imbalance between them. Moreover, we will continue this study and include different test subjects sleeping in the same environment, to adjust the machine learning models to different people in the same environment. The team is also interested in predicting the activity of test subjects in a daily routine, such as sports and sedentarism, from the analysis of our multisensor system. One objective is to build an open IoT service that may be interlinked with other inference systems to advise and promote a better quality of life with unobtrusive systems.

Acknowledgments.
This work has been supported by FCT - Fundação para a Ciência e Tecnologia within the R&D Units Project Scope: UIDB/00319/2020. It has also been supported by national funds through FCT - Fundação para a Ciência e Tecnologia through project UIDB/04728/2020.
References

1. Sena, A.: Cérebro, Saúde e Sociedade. Lidel - Edições Técnicas, Lda, Lisboa (2016)
2. Carskadon, M.A., Dement, W.C.: Normal human sleep: an overview. Princ. Pract. Sleep Med. 4, 13–23 (2005)
3. Buysse, D.J.: Sleep health: can we define it? Does it matter? Sleep 37(1), 9–17 (2014)
4. Krystal, A.D., Edinger, J.D.: Measuring sleep quality. Sleep Med. 9, S10–S17 (2008)
5. Beattie, Z., Oyang, Y., Statan, A., Ghoreyshi, A., Pantelopoulos, A., Russell, A., Heneghan, C.: Estimation of sleep stages in a healthy adult population from optical plethysmography and accelerometer signals. Physiol. Meas. 38(11), 1968–1979 (2017)
6. Roomkham, S., Lovell, D., Cheung, J., Perrin, D.: Promises and challenges in the use of consumer-grade devices for sleep monitoring. IEEE Rev. Biomed. Eng. 11, 53–67 (2018)
7. Liang, Z., Martell, M.: Validity of consumer activity wristbands and wearable EEG for measuring overall sleep parameters and sleep structure in free-living conditions. J. Healthc. Inform. Res. 2(1–2), 152–178 (2018)
8. Åkerstedt, T., Hume, K., Minors, D., Waterhouse, J.: Good sleep - its timing and physiological sleep characteristics. J. Sleep Res. 6(4), 221–229 (1997)
9. Ohayon, M., et al.: National Sleep Foundation's sleep quality recommendations: first report. Sleep Health 3(1), 6–19 (2017)
10. Kay, M., Choe, E.K., Shepherd, J., Greenstein, B., Watson, N., Consolvo, S., Kientz, J.A.: Lullaby: a capture & access system for understanding the sleep environment. In: Proceedings of the 2012 ACM Conference on Ubiquitous Computing (2012)
11. Liang, Z., Ploderer, B., Liu, W., Nagata, Y., Bailey, J., Kulik, L., Li, Y.: SleepExplorer: a visualization tool to make sense of correlations between personal sleep data and contextual factors. Pers. Ubiquit. Comput. 20(6), 985–1000 (2016). https://doi.org/10.1007/s00779-016-0960-6
12. Regulation (EU) 2016/679 of the European Parliament and of the Council of 27 April 2016 on the protection of natural persons with regard to the processing of personal data and on the free movement of such data, and repealing Directive 95/46/EC. Off. J. Eur. Union (OJ) 59(294), 1–88 (2016)
13. European Parliament: European Parliament resolution of 14 March 2017 on fundamental rights implications of big data: privacy, data protection, non-discrimination, security and law enforcement (2016/2225(INI)) (2017)
14. Gonçalves, C., Silva, F., Novais, P., Analide, C.: Multisensor monitoring system to establish correlations between sleep performance and environment data.
In: Intelligent Environments 2019: Workshop Proceedings of the 15th International Conference on Intelligent Environments, vol. 26, pp. 26–35. IOS Press (2019)
15. Wachter, S., Mittelstadt, B.: A right to reasonable inferences: re-thinking data protection law in the age of big data and AI. Colum. Bus. Law Rev. 2019, 494 (2019)
16. YS v Minister voor Immigratie, Integratie en Asiel, Joined Cases C-141/12 and C-372/12, para 38–39 (2014)
17. Podstawa, K.: Peter Nowak v Data Protection Commissioner: Case C-434/16, para 53. Eur. Data Prot. L. Rev. 4, 252 (2018)
18. Article 29 Data Protection Working Party: Opinion 4/2007 on the concept of personal data, 01248/07/EN WP 136. European Commission, Brussels, Belgium (2007)
19. Article 29 Data Protection Working Party: Guidelines on automated individual decision-making and profiling for the purposes of Regulation 2016/679 (2017)
20. Wachter, S., Mittelstadt, B., Floridi, L.: Why a right to explanation of automated decision-making does not exist in the General Data Protection Regulation. Int. Data Priv. Law 7(2), 76–99 (2017)
21. Liang, Z., Martell, M.A.C.: Achieving accurate ubiquitous sleep sensing with consumer wearable activity wristbands using multi-class imbalanced classification. In: IEEE International Conference on Pervasive Intelligence and Computing, pp. 768–775 (2019)
22. Borazio, M., Van Laerhoven, K.: Combining wearable and environmental sensing into an unobtrusive tool for long-term sleep studies. In: Proceedings of the 2nd ACM SIGHIT International Health Informatics Symposium (2012)
23. Quinlan, J.R.: C4.5: Programs for Machine Learning. Morgan Kaufmann, San Francisco (1993)
24. Breiman, L.: Random forests. Mach. Learn. 45(1), 5–32 (2001)
25. Friedman, J.: Greedy function approximation: a gradient boosting machine. Ann. Stat. 29(5), 1189–1232 (2001)
26. Specht, D.: Probabilistic neural networks. Neural Netw. 3(1), 109–118 (1990)
27. Clifton, C., Thuraisingham, B.: Emerging standards for data mining. Comput. Stand. Interf. 23(3), 187–193 (2001)
28. Berthold, M.R., Cebron, N., Dill, F., Gabriel, T.R., Kötter, T., Meinl, T., Ohl, P., Thiel, K., Wiswedel, B.: KNIME - the Konstanz Information Miner: version 2.0 and beyond. ACM SIGKDD Explor. Newslett. 11(1), 26–31 (2009)
Face Detection and Recognition, Face Emotion Recognition Through NVIDIA Jetson Nano

Vishwani Sati1(B), Sergio Márquez Sánchez2(B), Niloufar Shoeibi2, Ashish Arora3, and Juan M. Corchado2,4,5,6

1 Amity School of Engineering and Technology, Noida, Uttar Pradesh, India
[email protected]
2 BISITE Research Group, University of Salamanca, Calle Espejo 2, 37007 Salamanca, Spain
{smarquez,niloufar.shoeibi,corchado}@usal.es
3 Indian Institute of Technology, Dharwad, Dharwad, India
[email protected]
4 AIR Institute, IoT Digital Innovation Hub (Spain), 37188 Salamanca, Spain
5 Department of Electronics, Information and Communication, Faculty of Engineering, Osaka Institute of Technology, Osaka 535-8585, Japan
6 Pusat Komputeran dan Informatik, Universiti Malaysia Kelantan, Karung Berkunci 36, Pengkalan Chepa, 16100 Kota Bharu, Kelantan, Malaysia

Abstract. This paper focuses on implementing face detection, face recognition and face emotion recognition through NVIDIA's state-of-the-art Jetson Nano. Face detection is implemented using OpenCV's deep learning-based DNN face detector, supported by a ResNet architecture, to achieve better accuracy than previously developed models. The results computed by the framework libraries of OpenCV, with the support of the above-mentioned hardware, displayed reliable accuracy even with changes in lighting and angle. For face recognition, the approach of deep metric learning using OpenCV, supported by a ResNet-34 architecture, is used. Face emotion recognition is achieved by developing a system in which the areas of the eyes and mouth are used to convey the analysis of the information into a merged new image, classifying the image as displaying any of the seven basic facial emotions. A powerful and low-power platform, the Jetson Nano carried out intensive computations of the algorithms easily, contributing to a high video processing frame rate.

Keywords: Face recognition · Emotion detection · OpenCV · Deep Neural Network · NVIDIA Jetson Nano

1 Introduction
As the data is increasing exponentially, supported by the steady doubling rate of computing power every year, computer vision has become a popular field of research. With researchers highly intrigued about finding insights into how our

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2021
P. Novais et al. (Eds.): ISAmI 2020, AISC 1239, pp. 177–185, 2021. https://doi.org/10.1007/978-3-030-58356-9_18
178
V. Sati et al.
brain works, computer vision has not limited itself to being only a research area of computer science; it has also become the object of neuro-scientific and psychological studies. Face detection and recognition, along with the analysis of facial expressions, is currently an active research area in the computer vision community. Face detection is a computer technology which, given a digital image or a video, detects facial features and determines the locations and sizes of human faces, ignoring anything else, such as trees, buildings or bodies present in the image or video. This localization and detection of the human face is a prerequisite for face recognition and/or the analysis of facial expressions, used in applications such as video surveillance, image database management and human-computer interfaces. Face recognition, introduced by Woodrow Wilson Bledsoe in the 1960s, has been constantly improved and optimized ever since, gradually maturing, with the technology more and more widely used in daily human life. Once used to confirm the identity of Osama Bin Laden after he was killed in a U.S. raid, face recognition systems are now increasingly used in smartphones for user authentication and device security, and for forensics by military and law enforcement professionals. Generally, the face recognition process involves two steps. First, the photo is searched to find a face, i.e., face detection, by processing the image to crop and extract the person's face for recognition. Second, the detected face is compared to a stored database of known faces, i.e., face recognition, contributing to the identification or verification of one or more persons in given still or video images of a scene. A machine can detect and recognize a person's face using a regular web camera.
However, factors like viewing a person from an angle, lack of proper lighting or brightness in an image, a blurry picture or a contrast in shadows can significantly increase the difficulty of detecting a face. Since the 1990s, face recognition has been a prominent area of research; however, it is still less reliable than face detection and far from being considered a reliable method of user authentication. Emotion recognition is a technology that has been gaining a lot of attention over the past decades with the development of artificial intelligence techniques. It can be achieved by inspecting body posture, voice tone or facial expressions. In this paper, we focus on recognizing emotions using facial expressions. Inferring the facial emotions of other people helps in human communication by revealing the intentions of others. Facial emotion recognition, being a thriving area of research, has applications in computer animation, human-machine interaction, and various educational processes – understanding the inner state of mind of the learner. The NVIDIA Jetson Nano, ideally suited as an IoT edge device because of its small size and connectivity options, is used for machine learning inferencing. Considered a powerful and low-power platform, the Nano has applications in home robots and intelligent gateways. Delivering 472 GFLOPS of compute performance while utilizing only 5 W of power, the Jetson Nano supports high-performance
Face Detection and Emotion Recognition
179
ML acceleration. Researchers have implemented algorithms on the NVIDIA Jetson Nano to solve problems in autonomous driving and traffic surveillance, medicine and farming, and drone navigation.
2 Related Work
With technologies of face detection and face recognition being widely used, interest in them dates back to the 1960s, when Bledsoe, Chan, and Bisson developed the first face recognition algorithm [1–4]. Most of the resources available for implementing recognition algorithms target Neural Networks, while Eigenfaces work better. Only a few resources, such as the recognition-from-video material and other techniques at the Face Recognition Homepage [5], the 3D Face Recognition Wikipedia page [6] and the Active Appearance Models page [7], explain how to achieve recognition better than with eigenfaces. However, many other techniques are mentioned in recent computer vision research papers from CVPR. Computer vision and machine vision conferences, such as CVPR10 and CVPR09 [8], discussed advancements in these techniques, which give slightly better accuracy. The issue of facial emotion recognition has been an important research topic and is being inspected and analysed in various other research areas [9]. Conventional ways of determining facial emotions have been introduced in some review papers [10,11]. The difference between conventional techniques and deep learning-based approaches for facial emotion recognition has been introduced by Ghayoumi [12] in a review paper. [13] discusses facial emotion recognition in detail. To classify emotions, algorithms like KNN and Random Forest are applied in [14]. The possibility of using deep learning for emotion detection was inferred from the high accuracy rate obtained by using filter banks and a Deep CNN [15] to identify emotions from facial images. For recognizing facial emotions, different databases were studied in [16]. Also, Hidden Markov Models and Deep Belief Networks, with a UAR of approximately 53%, have been used to recognize emotions from facial expressions in [17].
With significant research being done in the fields of IoT and Machine Learning, NVIDIA's Jetson Nano, the latest addition to the Jetson family of embedded computing boards, is being used as an edge computing platform for ML inferencing. Vittorio Mazzia et al. [18] have worked on the real-time detection of apples to estimate apple yields and, therefore, manage apple supplies. Researchers had also previously proposed machine vision systems for yield estimation; however, those algorithms required high computation power and an intensive hardware setup, in addition to which weight and power constraints made them unsuitable for real-time apple detection. For the machine learning algorithms, Mazzia et al. used the Jetson Nano, which contributed to accelerating complex machine learning tasks. Its light weight, low power consumption and form factor made the goal of yield estimation plausible. Siddhartha S. Srinivasa et al. [19] have devised MuSHR, a considerably economic robotic race car, an open source platform to advance research and education in the field
of robotics. The hardware architecture of the robotic car comprises NVIDIA's Jetson Nano, on which the computations are performed. Srinivasa et al. mention that the ability of the Nano to be loaded with the desired operating system and programs through an SD card has been a primary reason for its inclusion in the hardware architecture of MuSHR.
3 Methodology

3.1 Face Detection
OpenCV's Haar Cascades are popularly used to detect faces in images or videos. However, Haar Cascades have the disadvantage of not being able to detect faces that are not at a straight angle. In this paper, we have used OpenCV's DNN Face Detector, i.e., a deep learning-based face detector. While a MobileNet base network is used by other OpenCV SSDs, the DNN Face Detector utilizes ResNet as the base network, together with the Single Shot Detector (SSD) framework, which enables it to detect faces at angles other than straight angles as well. The algorithm for detecting faces using the deep learning OpenCV face detector runs locally and is not cloud based. Version 3.3 of OpenCV has a highly improved Deep Neural Networks (DNN) module. The frameworks supported by this module are Caffe, TensorFlow and Torch/PyTorch. We have used the Caffe-based face detector in this paper, which requires two sets of files: the prototxt file(s), used for defining the model architecture, and the Caffe model file, containing the weights for the actual layers. The process of face detection involves importing the necessary packages. Additional packages, like VideoStream (from imutils) and time, have to be imported only when detecting faces in videos, not in images. The webcam's video feed has been used for face detection in videos. The model is loaded and the video stream is initialized to allow the camera to warm up. To obtain the face detections, the frames from the video stream are looped over in order to pass the blob through the DNN. This enables us to compare the detections to the confidence threshold for the face boxes and draw the confidence values on the screen. After the OpenCV face detections are drawn, the frame is displayed on the screen until a key is pressed to break out of the loop, and clean-up is performed.

3.2 Face Recognition
Face recognition is usually performed using deep learning, in which a network is typically trained to accept a single input image and then output a classification for that particular image. However, in this paper we have used the concept of deep metric learning with OpenCV, which is slightly different from the usual deep learning approach. The architecture of the network used is based on ResNet-34. In the deep metric learning approach to face recognition, a real-valued feature vector is obtained as output instead of a single label. A list of 128 real-valued numbers is the output feature vector used for quantifying the face in the dlib facial recognition network. The network is trained using
triplets. The process of training the network using triplets can be explained step by step as follows. First, three images are fed to the network as input, where two images are of the same person and one is of a different person. The faces in the input images are quantified by the construction of a 128-d embedding for each by the neural net. While comparing the results, the weights of the neural net are slightly tweaked so that the measurements of the two images of the same person are closer together and those of the image of the different person are further away.

3.3 Face Emotion Recognition
In this paper, we have performed face emotion recognition using two major modules: facial image treatment and a back-propagation ANN that recognizes the facial expressions. The process can be explained as follows. First, a new image is provided as input. Second, it is passed through a series of phases, during which it is turned into a new merged image which is used for analysis in the ANN. The training set consists of images with seven different face emotions: Angry, Disgust, Fear, Happy, Sad, Surprise and Neutral, with which the ANN has been previously trained. The emotional state of the face is reported by the system once the group to which the image belongs is detected. The human face is constantly analysed by the system, and the information regarding the emotional states is extracted using the inputs of the eye and mouth zones. The extractions are then merged into a single new image, which is resized using the method of Nearest Neighbour Interpolation. The ANN is then provided with the current input data, and a back-propagation algorithm with a feed-forward architecture is used to recognize the facial expressions. In this paper, face detection, face recognition, and face emotion recognition are implemented using the NVIDIA Jetson Nano as the main core, as it has a high video processing frame rate and does not require an internet connection. The Nano is booted with the Ubuntu operating system, with libraries like OpenCV, NumPy and Keras installed on it.
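A minimal sketch of the merge-and-resize step described above, with images represented as plain nested lists; the real system operates on camera frames (e.g. NumPy arrays via OpenCV), and the function names here are illustrative:

```python
# Illustrative pre-processing: stack the eye and mouth crops into one
# image, then resize the result with nearest-neighbour interpolation.
def nearest_neighbour_resize(img, new_h, new_w):
    """Nearest-neighbour resize of a 2-D nested-list image."""
    h, w = len(img), len(img[0])
    return [[img[int(r * h / new_h)][int(c * w / new_w)]
             for c in range(new_w)] for r in range(new_h)]

def merge_regions(eyes, mouth):
    # stack the eye region on top of the mouth region (same width assumed)
    return eyes + mouth

eyes = [[1, 2], [3, 4]]    # toy 2x2 "eyes" crop
mouth = [[5, 6]]           # toy 1x2 "mouth" crop
merged = nearest_neighbour_resize(merge_regions(eyes, mouth), 6, 4)
# each source pixel is repeated to fill the 6x4 target grid
```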
4 Results

4.1 Face Detection
OpenCV's deep learning face detector shows better performance in terms of accuracy than the Haar Cascades. Figure 1(a) displays the detected faces, with an accuracy of 99.96% for the face closer to the webcam and 99.31% for the detected face a bit farther away. Also, with a powerful board like the Nano, the intensive computation of the DNN face detector can be carried out easily. Figure 1(b) displays the robustness of the face detection software, as it gives an accuracy of 89.99% even in conditions where the lighting is dim. The face is detected with an accuracy of 72.87% in Fig. 1(c), where the face is not at a straight angle; this would not have worked well with the Haar Cascades. Although not particularly fast, a robust face detection system was built, with better accuracy than the Haar Cascades, using OpenCV's deep learning-based face detector.
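The confidences reported above are produced by the detector's forward pass; the post-processing that compares each detection to a confidence threshold and scales its box to frame coordinates can be sketched as follows. This is pure Python for illustration; in the real pipeline the rows come from OpenCV's `cv2.dnn` detector output, which typically has shape (1, 1, N, 7):

```python
# Post-processing sketch for SSD-style detector output rows of the form
# [image_id, class_id, confidence, x1, y1, x2, y2], with box coordinates
# normalized to [0, 1]. Keeps only detections above the threshold.
def filter_detections(rows, conf_threshold, frame_w, frame_h):
    faces = []
    for _, _, conf, x1, y1, x2, y2 in rows:
        if conf >= conf_threshold:
            box = (int(x1 * frame_w), int(y1 * frame_h),
                   int(x2 * frame_w), int(y2 * frame_h))
            faces.append((conf, box))
    return faces

rows = [[0, 1, 0.9996, 0.10, 0.20, 0.40, 0.60],   # strong face detection
        [0, 1, 0.30,   0.50, 0.50, 0.60, 0.70]]   # low-confidence noise
faces = filter_detections(rows, 0.5, 640, 480)
# keeps only the 99.96% detection, scaled to pixel coordinates
```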
4.2 Face Recognition
The performance of the recognition process improves by adding colour processing and considering the application of edge detection techniques. Also, increasing the number of input images, with images from different angles and different lighting conditions, contributes significantly to improving the accuracy of the recognition procedure. Careful alignment of the pictures, and choosing low resolution images over high resolution images, give better recognition results. We used deep learning-based facial embeddings, capable of being executed in real time, which recognized the face with an accuracy of 79.85% (Fig. 2a) and 86.41% (Fig. 2b) in different scenarios and different backgrounds. However, it is easier to perform face recognition in real time when the same camera, background, expression, lighting and direction of view are used, as compared to recognition from a different direction, time or room.
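The embedding comparison underlying this recognition step can be sketched as below. The 0.6 distance threshold mirrors a common dlib default (an assumption here), and the vectors are toy values rather than real network outputs:

```python
# Sketch: two faces match when the Euclidean distance between their
# 128-d embeddings falls below a threshold (0.6 is a common dlib default;
# the embeddings below are toy values, not network outputs).
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def is_same_person(emb_a, emb_b, threshold=0.6):
    return euclidean(emb_a, emb_b) < threshold

known = [0.1] * 128     # stored embedding for an enrolled face
probe_a = [0.11] * 128  # close to the stored embedding -> same person
probe_b = [0.3] * 128   # far from the stored embedding -> different person
```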
Fig. 1. (a) Detected face with an accuracy of 99.96%, (b) face detected with an accuracy of 89.99% in dim lighting, and (c) face that is not at a straight angle.
Fig. 2. (a) Face recognition with an accuracy of 79.85%, (b) face recognition with an accuracy of 86.41%.
Face Detection and Emotion Recognition

4.3 Face Emotion Recognition
The methodology adopted in this paper can recognize the emotions in static face expressions. The images used for training and testing were fed as input to the ANN. Before the images were fed to the ANN, a pre-processing step was followed in which they were resized and merged. A union of the eye and mouth zones was obtained, as these zones are best suited for predicting emotions: the most visible indications of emotion appear there. Figures 3a and 3b display the results of face emotion detection by the ANN.
Fig. 3. (a and b) Face emotion detection by the ANN
The strengths and weaknesses of the face emotion detection system are shown in Table 1. Considering the diagonal elements, with 83.3% as the overall performance of the classifier, all emotions can be classified with an accuracy of more than 75%. Breaking down the details: among the 12 images of anger, the system successfully classified 9, while the other 3 were classified as sadness; the perceived reason is that in both anger and sadness, tightly pressed lips serve as a common feature. Of the 12 images with the expression of disgust, 10 were recognized successfully and the remaining 2 were recognized as sadness; the similarity of the mouth shape in both emotions produced the confusion. Of the 12 images of fear fed to the system, 9 were successfully classified and the remaining 3 were attributed to surprise, a confusion caused by the closeness of the eyebrows in both cases. 10 out of 12 images of happiness were classified correctly; the other 2 were recognized as anger, because both expressions show some teeth. Similarly, 10 out of 12 images with the
expression of sadness were classified accurately by the system; however, 2 were misclassified as disgust. The system accurately recognized all images with the surprise and neutral expressions.

Table 1. Confusion matrix of the system of face emotion detection (rows: actual emotion; columns: predicted emotion; values in %)

          Anger   Disgust  Fear    Happy   Sad     Surprise  Neutral
Anger     75.00    0.00     0.00    0.00   25.00     0.00      0.00
Disgust    0.00   83.33     0.00    0.00   16.67     0.00      0.00
Fear       0.00    0.00    75.00    0.00    0.00    25.00      0.00
Happy     16.67    0.00     0.00   83.33    0.00     0.00      0.00
Sad        0.00   16.67     0.00    0.00   83.33     0.00      0.00
Surprise   0.00    0.00     0.00    0.00    0.00   100.00      0.00
Neutral    0.00    0.00     0.00    0.00    0.00     0.00    100.00
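The per-class figures in Table 1 can be reproduced from the raw counts given in the text (12 test images per emotion). In the following check, the count matrix is our transcription of the prose above rather than data taken directly from the paper:

```python
import numpy as np

EMOTIONS = ["Anger", "Disgust", "Fear", "Happy", "Sad", "Surprise", "Neutral"]

# counts[i, j] = number of images of emotion i classified as emotion j,
# transcribed from the per-emotion breakdown in the text (12 images each).
counts = np.array([
    [9, 0, 0, 0, 3, 0, 0],   # anger: 3 confused with sadness
    [0, 10, 0, 0, 2, 0, 0],  # disgust: 2 confused with sadness
    [0, 0, 9, 0, 0, 3, 0],   # fear: 3 confused with surprise
    [2, 0, 0, 10, 0, 0, 0],  # happy: 2 confused with anger
    [0, 2, 0, 0, 10, 0, 0],  # sad: 2 confused with disgust
    [0, 0, 0, 0, 0, 12, 0],  # surprise: all correct
    [0, 0, 0, 0, 0, 0, 12],  # neutral: all correct
])

# Per-class accuracy: diagonal over row totals, as percentages.
per_class = 100.0 * np.diag(counts) / counts.sum(axis=1)
for name, acc in zip(EMOTIONS, per_class):
    print(f"{name:8s} {acc:6.2f}")
```

Every diagonal entry is at or above 75%, consistent with the discussion above.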
5 Conclusion
Face detection and recognition, useful for constructing numerous industrial and commercial applications, remain a challenge for many researchers. The methodologies can be made more efficient and their results improved by refining small features. As technology advances, more advanced features can be added to the system, which may help increase accuracy. Face emotion recognition has also been conducted using an ANN; using the same architecture, real-time face emotion recognition can be developed, increasing the reliability and the range of possibilities. Acknowledgments. This work was supported by the Spanish Junta de Castilla y León, Consejería de Empleo. Project: UPPER, aUgmented reality and smart Personal Protective Equipment (PPE) for intelligent pRevention of occupational hazards and accessibility, INVESTUN/18/SA/0001.
Video Analysis System Using Deep Learning Algorithms

Guillermo Hernández, Sara Rodríguez(B), Angélica González, Juan Manuel Corchado, and Javier Prieto

BISITE Research Group, University of Salamanca, Edificio Multiusos I+D+i, Calle Espejo 2, Salamanca, Spain
{guillehg,srg,angelica,corchado,javierp}@usal.es
Abstract. The detection of duplicate videos is an active field of research, motivated by the protection of intellectual property, the fight against piracy and the tracing of the origin of reused video segments. In this work, a method for the detection of duplicate videos is proposed and implemented, making use of deep learning methods and techniques typical of the field of information retrieval. This method has been evaluated with a data set commonly used in the field, obtaining high average accuracies, above 85%. The effect of the different layers of the convolutional neural network used by the algorithm, the aggregation mechanisms that can be applied to them, and the influence of the retrieval model have been studied, finding a set of parameters that optimizes the overall accuracy of the system.
Keywords: Deep learning · Video analysis · Knowledge discovery in images databases

1 Introduction
The detection of videos with duplicate content is an active research field, motivated by needs such as intellectual property protection, the fight against piracy and the tracing of the origin of reused video segments. The number of videos available on the Internet has increased significantly in recent years, largely due to the development of social networks. We can get an idea of the magnitude of this increase from the fact that about 100 h of audiovisual content are uploaded to the YouTube platform every minute [1]. In addition, the emergence of applications that facilitate the process of downloading, modifying and uploading videos has contributed to an increase in the amount of duplicate content [2], in line with the interests of the users. It is currently estimated that duplicate content constitutes 27% of the videos [3]. A possible definition of a "duplicate" video, which avoids being limited to exact duplicates, is the following, taken from Wu et al. [4]: "videos approximately

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2021. P. Novais et al. (Eds.): ISAmI 2020, AISC 1239, pp. 186–199, 2021. https://doi.org/10.1007/978-3-030-58356-9_19
the same as an original, with possible differences in file format, encoding settings, variations of a photometric nature (color, brightness), editing operations (inclusion of labels, logos, borders, ...), duration settings or other modifications (adding or deleting frames)". The intuitive idea behind this is that they are pairs of videos that a user would recognize as a modification of the original, despite the differences between them. The following objectives have been proposed in this work: (i) To study existing models for the detection of video duplicates through a systematic mapping process. (ii) To propose and implement an algorithm that makes use of deep learning techniques for the detection of video duplicates and that can be used in the framework of the previous system. (iii) To validate the capability of the proposed system with an appropriate data set.
2 Technological Background
This section presents a summary of the review of the state of the art, carried out using the systematic mapping technique [5–7]. Systematic mapping is a technique for extracting the existing scientific knowledge on a specific topic, identifying needs and gaps by categorizing the existing publications, obtained through a process that is as formal, well defined and repeatable as possible. The following stages of this process can be identified: (i) Formulation of the research questions that the mapping aims to answer; these are usually broad questions about the content of the specialist literature. (ii) Definition of a search system, based on a series of keywords with which a query is constructed, together with a set of bibliographic databases. (iii) Definition of inclusion/exclusion criteria for the consideration of papers in the review and, possibly, selection by means of quality indicators. (iv) Execution of the defined bibliographic search. (v) Selection of the papers that meet the criteria. (vi) Extraction of information from each paper to answer the questions posed above. This development is not strictly linear: the realization of the study itself may suggest reformulating it for improvement, adapting it in an iterative way. Examples of this may be the inclusion of new questions, the extraction of new data or the inclusion of additional relevant literature cited in the papers that would otherwise escape the selection criteria. Research Questions. The questions proposed to guide the systematic mapping are the following: (i) What techniques are used to detect duplicate videos? (ii) How have trends in this field evolved over time? (iii) What models have considered the detection of copied subsequences?
Although the last of these questions does not correspond directly to the objectives of this report, its inclusion was suggested by the content of the works analysed, as it provides potentially useful information for a continuation of this study.
Keywords. The keywords in English considered for the search were video and copy detection. A first search was used to detect possible common synonyms for the second of the terms, but it proved to be exhaustive in this respect. A simple search expression would therefore be ("video") AND ("copy detection"). Database. The database considered was Scopus1. The reason for limiting ourselves to a single database is to allow the development of the study in a reasonable time. Inclusion/Exclusion Criteria. The inclusion criteria considered are as follows: (i) The work must be a journal paper. (ii) The paper must be written in English. (iii) The paper must present a system for detecting duplicate videos. (iv) The paper must be accessible through the bibliographic resources of the University of Salamanca. To detail the possible causes of failure of the third of these criteria, the following exclusion criteria have been defined to complement the negation of the previous ones: (i) The work presents a system exclusively for 3D video. (ii) The paper is a literature review. Data Extraction Form. The constructed data extraction form is shown in Fig. 1. The fields chosen are geared towards answering the questions posed above. Regarding the values offered for the algorithm field, it was iterated on throughout the study of the literature, extending the set of valid values to the one shown. The last two questions on the form were suggested by the case studies found during the review period. Firstly, some of the works use the audio
Fig. 1. Data extraction form in the Parsifal tool.

1 http://www.scopus.com.
information from the videos instead of just the frames, so this possibility has been recorded; it provides some information on the techniques used, although it is not exactly an algorithm. Finally, some of the articles make specific reference to the detection of duplicated video segments, which motivated the third research question of this review, as indicated above. Selection of Works. The above search was refined through the following expression, in the specific syntax of Scopus:
TITLE-ABS-KEY("video") AND TITLE-ABS-KEY("copy detection")
AND SRCTYPE(J) AND LANGUAGE("English")

This expression requires that the terms used in the search appear in the title, the abstract or the keyword list of the article, in order to avoid results whose main subject is not the detection of duplicates. After this process, the remaining exclusion criteria were applied. Finally, 57 articles were selected for the review [5]. The causes of exclusion of the works that passed the search filter are detailed in Table 1.

Table 1. Reasons for exclusion of the discarded works.

Cause                                                                Occurrences
The work does not directly present a duplicate detection algorithm   32
The full text is not accessible with the available tools              9
The paper is a literature review                                      2
The work deals exclusively with 3D video                              2
Data Analysis. A trend can be observed with a maximum around the year 2013. At present, interest in the field seems to be maintained, though with a declining tendency. The breakdown by algorithm used can be studied in Fig. 2. It can be seen that most articles fall into the category of "feature extraction", making it clear that this category is too general to provide detailed information. We will return to this issue in the conclusions section. Regarding the algorithms, according to the selection considered, the most current techniques are convolutional networks [8,9], clustering (which is used to speed up searches) [10–12] and the inclusion of specific methods for frame selection [8,13–20]. Analysis. Regarding the first research question, on which techniques were used to detect duplicates, the exploration of the literature has shown that the extraction of features, of frames, of temporal sequences of frames, and of audio, is, in general, the means used.
Fig. 2. Algorithms used in the selected works by year of publication (series: feature extraction, probabilistic model, clustering, CNN, frame selection, nearest-neighbor, hashing; vertical axis: papers per year, 2002–2018).
Formally, they use a variety of mathematical tools, with a casuistry too complex for a classification to be established, as well as algorithms among which probabilistic techniques (often based on graphs) and convolutional neural networks predominate. From the number of publications according to the techniques they use, it is clear that the most current approaches are convolutional networks [8,9], the inclusion of clustering algorithms as a means of speeding up searches [10–12] and the consideration of specific methods for frame selection [8,13–15]. Some works have specifically studied the detection of copied segments; the works found in this respect are Refs. [9,10,21,22]. In the light of this study, a system is proposed that combines the advantages of the techniques studied. The following sections describe the proposal and the results obtained.
3 Proposed System
The proposed system and the data set used for validation are described below. The overall process, based on the workflows of Refs. [8,9,23], is the following: (i) Keyframes of the video are extracted. (ii) The frames are used as input to a convolutional neural network; the activations of its intermediate layers are collapsed by the max pooling technique, allowing the extraction of features from the frame. (iii) The features are handled in the bag-of-words model: each is compared against a codebook to find the most similar valid word (code word) it contains. (iv) The list of occurrences of each code word is converted to a weighted vector model. (v) The ordered list of most similar videos is obtained by means of a similarity measure. In the following subsections, the various components of the process are analysed in more detail; aggregation methods that can be used to improve the results of the scheme are also presented at the end of this section.
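Steps (ii)–(v) can be sketched end to end with NumPy. The AlexNet activations are replaced here by placeholder arrays and the codebook is assumed to be already trained, so this is an illustration of the data flow under those assumptions, not the authors' implementation:

```python
import numpy as np

def max_pool_features(activations):
    """Step (ii): collapse a conv layer's activation map (H, W, C) into a
    C-dimensional descriptor by max pooling over the spatial dimensions."""
    return activations.max(axis=(0, 1))

def quantize(features, codebook):
    """Step (iii): map each feature vector to the index of the most similar
    code word (nearest centroid, Euclidean distance)."""
    d = ((features[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)

def bow_histogram(words, vocab_size):
    """Step (iv), first half: occurrences of each code word in one video."""
    return np.bincount(words, minlength=vocab_size).astype(float)

def tfidf(histograms):
    """Step (iv): smoothed tf-idf weighting with Euclidean normalization
    (the 'standard' scheme described later in the paper)."""
    n = histograms.shape[0]
    df = (histograms > 0).sum(axis=0)
    idf = np.log((1 + n) / (1 + df)) + 1
    w = histograms * idf
    return w / np.linalg.norm(w, axis=1, keepdims=True)

def rank(query_vec, collection):
    """Step (v): cosine similarity; the vectors are unit-norm, so the
    scalar product suffices. Returns video indices, most similar first."""
    return np.argsort(collection @ query_vec)[::-1]
```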
Keyframe Extraction. A video can be hierarchically decomposed into scenes – narrative units with spatial or temporal continuity –, shots – perspectives of the elements that the observer captures – and frames – each image that makes up the video's animation. Key frames can be defined as the subset of frames that can briefly represent the visual component of a video [24], for which there are several automatic extraction methods [25]. A set of extracted key frames is already provided in the data set used for the validation of this work, so keyframe extraction techniques will not be studied in detail. Alternatively, frames could be sampled from the video with other techniques, although this possibility is beyond the scope of this paper. Feature Extraction. The neural network used was AlexNet [26], a deep convolutional network trained on the ImageNet dataset [27], on whose classification problem it obtained results that significantly improved the state of the art in 2012. Among the key points behind its success are the use of the rectified linear function x ↦ max(0, x) instead of alternatives based on the tanh or sigmoid functions, which saturate; the use of GPUs, spread over two different cards, to apply the convolution operations quickly; and the use of dropout to avoid overfitting [28]. The weights and the TensorFlow implementation of the network have been taken from the mirror prepared by E. Shelhamer, which is provided under an unrestricted license2. Codebook. The codebook, which transforms each feature into a valid word within its vocabulary, can be built using a clustering algorithm in which the centroids are the valid words. With a view to the progressive construction of the clustering in a real system, as well as to allow training with a large amount of data, the use of batch algorithms is convenient. In this work, the so-called "mini-batch k-means" algorithm [29] has been used. Vector Model with Weight.
Vectors counting the number of occurrences of valid words must be transformed into a representation that adequately takes into account the effect of their presence both at the video level and at the collection level. The usual tool for this is the so-called tf-idf (term frequency–inverse document frequency) weighting, for which several schemes are available [30,31]. In general, the representation will be a vector whose components are given by the product of two functions of the terms t, tf(t) and idf(t), to which some normalization scheme may later be applied, which can be expressed as

(tf(t_1) · idf(t_1), ..., tf(t_n) · idf(t_n))_norm.  (1)
Thus, the three functions tf, idf and ‖·‖_norm completely define the scheme used. The retrieval model considered as the basis for this work has been chosen by making the term frequency tf(t) correspond directly to the number of occurrences of the term, while the inverse document frequency has been modelled in a logarithmically smoothed manner, given by

idf(t) = log((1 + n) / (1 + df(t))) + 1,  (2)

where n is the number of documents in the retrieval set and df(t) is the number of documents containing the word t. The appearance of the two summands 1 in the fraction of (2) is occasionally referred to as smoothing, and ensures that the function is defined when a term does not appear in the document collection3. Finally, each vector has been normalized in the sense of the Euclidean norm

‖x‖_2 = sqrt(Σ_i x_i²).  (3)

2 https://github.com/BVLC/caffe/tree/master/models/bvlc_alexnet
In the results section (Sect. 4) the resilience of the system will be analysed on the basis of the model used, taking into account the variations in Table 2.

Table 2. Variations on the retrieval model. f(t) denotes the number of occurrences of the term t, n the number of documents and df(t) the number of documents with the term t.

Scheme          tf             idf                        Normalization
Standard        f(t)           log((1+n)/(1+df(t))) + 1   ‖·‖_2
Without idf     f(t)           1                          ‖·‖_2
No smoothing    f(t)           log(n/df(t)) + 1           ‖·‖_2
Tf logarithmic  1 + log(f(t))  log((1+n)/(1+df(t))) + 1   ‖·‖_2
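For concreteness, the tf and idf variants being compared can be evaluated on a toy example; the collection size, document frequency and term count below are made up for illustration:

```python
import numpy as np

# Toy collection: n = 4 documents, a term appearing in df = 2 of them,
# and occurring f = 3 times in the document under consideration.
n, df, f = 4, 2, 3

schemes = {
    "standard":       (f,             np.log((1 + n) / (1 + df)) + 1),
    "without idf":    (f,             1.0),
    "no smoothing":   (f,             np.log(n / df) + 1),
    "tf logarithmic": (1 + np.log(f), np.log((1 + n) / (1 + df)) + 1),
}

for name, (tf, idf) in schemes.items():
    print(f"{name:14s} tf = {tf:.3f}  idf = {idf:.3f}  weight = {tf * idf:.3f}")
```

After weighting, each document vector would still be divided by its Euclidean norm, as in all rows of the table.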
Measure of Similarity. In the last stage of the process, the weight vectors are compared with each other to find those most similar to a given one. Several notions of similarity can be applied [31]; in this paper we consider the cosine similarity, defined as

sim(A, B) = (A · B) / (‖A‖_2 ‖B‖_2) = (Σ_{i=1}^n A_i B_i) / (sqrt(Σ_{i=1}^n A_i²) · sqrt(Σ_{i=1}^n B_i²)).  (4)

Note that, with the standard model described above, the denominator is equal to one and the cosine similarity matches the scalar product.

3 Although in our case this is not possible, due to the use of the codebook.
Aggregation Mechanisms. In the work of Kordopatis-Zilos et al. [23], two proposals are presented that can be used in combination with the information extracted from several layers of the network. The first is so-called "layer aggregation", in which a codebook is built for each layer following the proposed method. The vector model then uses the occurrences of each word in the different books, giving rise to a vocabulary whose size increases by a factor given by the number of layers used in the aggregation. This method can be understood as an aggregation performed in step (iii). The alternative is so-called "vector aggregation", in which the feature vectors extracted from each layer are concatenated, and the resulting vector is used to generate a single codebook. In this case the aggregation can be understood as performed in step (ii). The data set considered for the validation of the system is the so-called CC_WEB_VIDEO [3,4]. This set of videos was compiled in November 2006, from a series of textual queries aimed at retrieving a number of videos popular at the time. For each query, the most frequent video was marked as 'original', and the relationship of the others to this video was tagged with a few categories of duplicates (exact duplicate, similar video, long version, ...). The same data set provides a series of frames for each video, extracted using a cut detection algorithm. A total of 398015 key frames are available, which are used as input for the analysis presented in the results section.
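The two mechanisms can be contrasted in a minimal NumPy sketch; the per-layer features and codebooks below are synthetic placeholders standing in for the AlexNet layer activations and the trained codebooks:

```python
import numpy as np

def quantize(features, codebook):
    """Nearest code word (Euclidean distance) for each feature vector."""
    d = ((features[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)

def layer_aggregation(per_layer_feats, per_layer_codebooks):
    """One codebook per layer; the per-layer histograms are concatenated,
    so the vocabulary grows by a factor equal to the number of layers."""
    parts = []
    for feats, cb in zip(per_layer_feats, per_layer_codebooks):
        words = quantize(feats, cb)
        parts.append(np.bincount(words, minlength=len(cb)))
    return np.concatenate(parts)

def vector_aggregation(per_layer_feats, joint_codebook):
    """Per-layer feature vectors are concatenated frame-wise and quantized
    against a single codebook built on the concatenated space."""
    joint = np.hstack(per_layer_feats)
    words = quantize(joint, joint_codebook)
    return np.bincount(words, minlength=len(joint_codebook))
```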
4 Results
The duplicate retrieval behaviour for the reference video of each query in the dataset can be summarized using a precision-recall diagram, such as those shown in Fig. 3. As the number of items retrieved as potential duplicates by the system increases, the recall (fraction of true duplicates retrieved over the total) increases or remains constant, which is reflected in the curve moving only to the right. Only if the videos retrieved as duplicates are actually duplicates will the precision (fraction of true duplicates over the videos retrieved) be high. Three characteristic cases have been chosen to explain the behaviour of the system. The curves always have a pedestal on the left that corresponds to the exact duplicates. Then, if the retrieval has been good, the curve descends slowly, while otherwise the fall is faster. To summarize the curves with a single value, we use the average precision, defined as

AP = Σ_n (R_n − R_{n−1}) P_n,  (5)

where the index n refers to each point of the precision-recall curve, whose respective coordinates are P_n and R_n. This value is equivalent to a numerical approximation of the area under the curve.
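Equation (5) can be computed directly from the ranked list of retrieved items. A minimal sketch, where `relevant` is a hypothetical 0/1 list marking the true duplicates in rank order:

```python
def average_precision(relevant, n_retrieved):
    """AP = sum_n (R_n - R_{n-1}) * P_n over the ranked list, i.e. the
    precision at each rank where recall increases (a relevant item appears)."""
    ap, hits, prev_recall = 0.0, 0, 0.0
    total_relevant = sum(relevant)
    for n in range(n_retrieved):
        if relevant[n]:
            hits += 1
        precision = hits / (n + 1)
        recall = hits / total_relevant
        ap += (recall - prev_recall) * precision
        prev_recall = recall
    return ap
```

A perfect ranking (all duplicates first) gives AP = 1; the earlier a non-duplicate appears, the lower the value.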
Fig. 3. Precision-recall diagram for queries with high (Q1, MAP = 0.931), medium (Q6, MAP = 0.710) and very low (Q22, MAP = 0.105) accuracy.
The behaviour of the system when using the characteristics of one layer or another of the neural network is shown in Fig. 4. As can be seen, the general tendency is for the deeper layers to identify duplicates more accurately. It should be noted that this evaluation has been carried out against the total number of videos in the collection, and not only against those relevant to each query; an evaluation against the latter would lead to more optimistic results. The first form has been chosen to better reflect a realistic usage scenario. The two aggregation mechanisms presented above were tested, yielding the results reflected in Fig. 5. As can be
Fig. 4. Average accuracy (vertical) for each query (horizontal). Each series shows the use of the characteristics of a different depth of the network (layer 1, MAP = 0.683; layer 2, MAP = 0.778; layer 3, MAP = 0.795; layer 4, MAP = 0.803; layer 5, MAP = 0.844). Dotted lines serve merely as a visual aid. Queries Q18 and Q22 are recognized as the most complicated cases of the data set (cf. Ref. [4]).
Fig. 5. Average accuracy for each aggregation mechanism: no aggregation, 0.844; layer aggregation, 0.857; vector aggregation, 0.794.
seen in the figure, layer aggregation provides a slight improvement to the system, while vector aggregation worsens the results. These observations are consistent with the original results of [23], where layer aggregation provides a slight improvement in the case of AlexNet and vector aggregation results in poorer performance than the best individual layers. The distribution of the similarity measure for the different types of duplicates provided by the labelling of the data set is shown in the box diagram of Fig. 6. As the median line (the line inside each box) shows, the copies marked simply as "similar" are those with the highest values of the coefficient, while the more complicated cases exhibit lower values in general, as expected. There are some cases in which the long version does bear a greater resemblance to the original, as shown by the outliers, but this is because the copies labelled as "long version" imply very different modifications. In the case of query 24, it is actually a longer video with the same content, hence the high similarity values. Those of query 1, on the other hand, include different shots, similar to the original ones but trimmed or rotated, which necessarily have lower similarity coefficients. With regard to the "significant alterations", they are sometimes more similar in the sense of the measure used (as shown by the numerous outliers). In conclusion, despite the different labels present in the data set, the distinction between them does not help to improve the analysis much. It should be borne in mind that the magnitude depends very much on the video under consideration, as shown by the Kolmogorov-Smirnov test [32] in Fig. 7. This test allows one to study, in a brief way, whether the data from two different samples can come from the same statistical distribution.
The result of the test is a p-value, a number between 0 and 1 indicating the probability that the observed data are compatible with the so-called null hypothesis, which is the hypothesis one wishes to prove false. In this case, this hypothesis is that "the distribution of the pair of random variables is the same".
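The statistic underlying the two-sample test is simply the largest gap between the empirical distribution functions of the two samples. A minimal NumPy sketch (in practice one would use scipy.stats.ks_2samp, which also provides the p-value):

```python
import numpy as np

def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: the maximum vertical
    distance between the empirical CDFs of the two samples."""
    a, b = np.sort(a), np.sort(b)
    grid = np.concatenate([a, b])
    cdf_a = np.searchsorted(a, grid, side="right") / len(a)
    cdf_b = np.searchsorted(b, grid, side="right") / len(b)
    return np.abs(cdf_a - cdf_b).max()
```

Identical samples give a statistic of 0, fully separated samples give 1; the p-value is then derived from this statistic and the sample sizes.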
There are only a few pairs of queries for which the test does not lead to the conclusion that the distributions are different. (Q6, Q8) is the most representative case, with a p-value of 0.68. Cases involving Q24 also have high p-values, but the small number of samples is suspected as the cause of the inconclusive evidence. In any case, it should be remembered that when the test does not allow one to conclude that the distributions are different, this does not imply that they are the same.
Fig. 6. Distribution of the similarity measure of the videos in the data set according to their labelling (similar, long version, another version, considerable alteration). Each box delimits the first and third quartiles of the distribution, with the median marked by the horizontal line. Whiskers (lines similar to error bars) delimit 1.5 times the interquartile range, without exceeding the maximum and minimum of the data. Observations outside this range ("outliers") are shown with a marker.
Fig. 7. p-values of the Kolmogorov-Smirnov test for query pairs (Q1–Q24). Values close to zero (dark blue) indicate that the distributions are significantly different, while outside this range no such conclusion is reached (which does not prove that they are equal).
5 Conclusions and Future Work
In summary, the following contributions have been made in this work: (i) A duplicate retrieval system has been built using deep learning techniques. (ii) The effect of the different layers of the AlexNet neural network has been analysed, finding an improvement with depth. (iii) The aggregation mechanisms proposed in Ref. [23] have been studied. Layer aggregation provides a slight improvement in the results, while vector aggregation produces worse results than the deepest individual layer; these results are consistent with those reported in that work. (iv) Several vector models have been studied for the description of codebook terms in the videos. The most advantageous one has turned out to be the logarithmic one, both for tf and idf. (v) The distribution of similarity measures among the duplicates of the data set has been studied. The distributions show the difficulty of defining a universal cut-off for the detection of duplicate videos, and it has been statistically shown that the distributions differ from query to query. Some ideas that may serve to extend this work in the future: (i) Revise and extend the review of the literature; possibilities for this, based on the study itself, have been identified in the background section. (ii) Study the influence of the methods used for resizing the frames before their introduction into the neural network. (iii) Use other pre-trained neural networks, such as VGGNet [33] or GoogLeNet [34]. Acknowledgments. This research has been supported by the project "Intelligent and sustainable mobility supported by multi-agent systems and edge computing (InEDGEMobility): Towards Sustainable Intelligent Mobility: Blockchain-based framework for IoT Security", Ref.: RTI2018-095390-B-C32 (MCIU/AEI/FEDER, UE).
References

1. Chou, C.-L., Chen, H.-T., Lee, S.-Y.: Pattern-based near-duplicate video retrieval and localization on web-scale videos. IEEE Trans. Multimed. 17(3), 382–395 (2015)
2. Liu, H., Zhao, Q., Wang, H., Lv, P., Chen, Y.: An image-based near-duplicate video retrieval and localization using improved edit distance. Multimed. Tools Appl. 76(22), 24435–24456 (2017)
3. Wu, X., Ngo, C.-W., Hauptmann, A., Tan, H.-K.: Real-time near-duplicate elimination for web video search with content and context. IEEE Trans. Multimed. 11(2), 196–207 (2009). https://doi.org/10.1109/TMM.2008.2009673
4. Wu, X., Hauptmann, A.G., Ngo, C.-W.: Practical elimination of near-duplicates from web video search. In: Proceedings of the 15th ACM International Conference on Multimedia, pp. 218–227. ACM (2007)
5. Hernández, G.: Sistema de análisis de vídeo mediante la utilización del marco metodológico de los sistemas de razonamiento basados en casos y el uso de algoritmos de aprendizaje profundo. In: Avances en Informática y Automática - Décimotercer Workshop (2019)
6. García-Peñalvo, F.: Revisiones y mapeos sistemáticos de literatura (2019). https://doi.org/10.5281/zenodo.2586725
7. Kitchenham, B., Charters, S.: Guidelines for performing systematic literature reviews in software engineering (2007)
8. Hu, Y., Lu, X.: Learning spatial-temporal features for video copy detection by the combination of CNN and RNN. J. Vis. Commun. Image Represent. 55, 21–29 (2018). https://doi.org/10.1016/j.jvcir.2018.05.013
9. Zhang, X., Xie, Y., Luan, X., He, J., Zhang, L., Wu, L.: Video copy detection based on deep CNN features and graph-based sequence matching. Wirel. Pers. Commun. 103(1), 401–416 (2018). https://doi.org/10.1007/s11277-018-5450-x
10. Law-To, J., Buisson, O., Gouet-Brunet, V., Boujemaa, N.: ViCopT: a robust system for content-based video copy detection in large databases. Multimed. Syst. 15(6), 337–353 (2009). https://doi.org/10.1007/s00530-009-0164-2
11. Liu, H., Zhao, Q., Wang, H., Lv, P., Chen, Y.: An image-based near-duplicate video retrieval and localization using improved edit distance. Multimed. Tools Appl. 76(22), 24435–24456 (2017). https://doi.org/10.1007/s11042-016-4176-6
12. Liao, K., Liu, G.: An efficient content based video copy detection using the sample based hierarchical adaptive k-means clustering. J. Intell. Inf. Syst. 44(1), 133–158 (2014). https://doi.org/10.1007/s10844-014-0332-5
13. Su, P.-C., Wu, C.-S.: Efficient copy detection for compressed digital videos by spatial and temporal feature extraction. Multimed. Tools Appl. 76(1), 1331–1353 (2017). https://doi.org/10.1007/s11042-015-3132-1
14. Guzman-Zavaleta, Z., Feregrino-Uribe, C., Morales-Sandoval, M., Menendez-Ortiz, A.: A robust and low-cost video fingerprint extraction method for copy detection. Multimed. Tools Appl. 76(22), 24143–24163 (2017). https://doi.org/10.1007/s11042-016-4168-6
15. Boukhari, A., Serir, A.: Weber Binarized Statistical Image Features (WBSIF) based video copy detection. J. Vis. Commun. Image Represent. 34, 50–64 (2016). https://doi.org/10.1016/j.jvcir.2015.10.015
16. Chamoso, P., González-Briones, A., Rodríguez, S., Corchado, J.M.: Tendencies of technologies and platforms in smart cities: a state-of-the-art review. Wirel. Commun. Mob. Comput. 2018, 17 (2018)
17. Li, T., Sun, S., Bolić, M., Corchado, J.M.: Algorithm design for parallel implementation of the SMC-PHD filter. Sig. Process. 119, 115–127 (2016)
18. Coria, J.A.G., Castellanos-Garzón, J.A., Corchado, J.M.: Intelligent business processes composition based on multi-agent systems. Expert Syst. Appl. 41(4), 1189–1205 (2014)
19. Bullón, J., González Arrieta, A., Hernández Encinas, A., Queiruga Dios, A., et al.: Manufacturing processes in the textile industry. Expert Syst. Fabrics Prod. 6(1), 41–50 (2017)
20. Casado-Vara, R., Martin-del Rey, A., Affes, S., Prieto, J., Corchado, J.M.: IoT network slicing on virtual layers of homogeneous data for improved algorithm operation in smart buildings. Future Gener. Comput. Syst. 102, 965–977 (2020)
21. Chiu, C.-Y., Wang, H.-M.: Time-series linear search for video copies based on compact signature manipulation and containment relation modeling. IEEE Trans. Circuits Syst. Video Technol. 20(11), 1603–1613 (2010). https://doi.org/10.1109/TCSVT.2010.2087471
22. Chiu, C.-Y., Tsai, T.-H., Liou, Y.-C., Han, G.-W., Chang, H.-S.: Near-duplicate subsequence matching between the continuous stream and large video dataset. IEEE Trans. Multimed. 16(7), 1952–1962 (2014). https://doi.org/10.1109/TMM.2014.2342668
23. Kordopatis-Zilos, G., Papadopoulos, S., Patras, I., Kompatsiaris, Y.: Near-duplicate video retrieval by aggregating intermediate CNN layers. In: International Conference on Multimedia Modeling, pp. 251–263. Springer (2017)
24. Panagiotakis, C., Doulamis, A., Tziritas, G.: Equivalent key frames selection based on ISO-content principles. IEEE Trans. Circuits Syst. Video Technol. 19(3), 447–451 (2009)
25. Paul, M.K.A., Kavitha, J., Rani, P.A.J.: Key-frame extraction techniques: a review. Recent Pat. Comput. Sci. 11(1), 3–16 (2018)
26. Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional neural networks. In: Advances in Neural Information Processing Systems, pp. 1097–1105 (2012)
27. Deng, J., Dong, W., Socher, R., Li, L.-J., Li, K., Fei-Fei, L.: ImageNet: a large-scale hierarchical image database. In: 2009 IEEE Conference on Computer Vision and Pattern Recognition, pp. 248–255. IEEE (2009)
28. Hinton, G.E., Srivastava, N., Krizhevsky, A., Sutskever, I., Salakhutdinov, R.R.: Improving neural networks by preventing co-adaptation of feature detectors. arXiv preprint arXiv:1207.0580 (2012)
29. Sculley, D.: Web-scale k-means clustering. In: Proceedings of the 19th International Conference on World Wide Web, pp. 1177–1178. ACM (2010)
30. Salton, G., Buckley, C.: Term-weighting approaches in automatic text retrieval. Inf. Process. Manag. 24(5), 513–523 (1988)
31. Baeza-Yates, R., Ribeiro-Neto, B.: Modern Information Retrieval. Addison-Wesley, Boston (1999)
32. Massey Jr., F.J.: The Kolmogorov-Smirnov test for goodness of fit. J. Am. Stat. Assoc. 46(253), 68–78 (1951)
33. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)
34. Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., Erhan, D., Vanhoucke, V., Rabinovich, A.: Going deeper with convolutions. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1–9 (2015)
Workshop on New Applications for Public Transport (NAPT)
Towards Learning Travelers' Preferences in a Context-Aware Fashion

A. Javadian Sabet1, M. Rossi2, F. A. Schreiber1, and L. Tanca1

1 Dipartimento di Elettronica, Informazione e Bioingegneria, Milano, Italy
{alireza.javadian,fabio.schreiber,letizia.tanca}@polimi.it
2 Dipartimento di Meccanica, Politecnico di Milano, Piazza Leonardo da Vinci 32, 20133 Milano, Italy
[email protected]
Abstract. Providing personalized offers, and services in general, for the users of a system requires perceiving the context in which the users' preferences are rooted. Accordingly, context modeling is becoming a relevant issue and an expanding research field. Moreover, the frequent changes of context may induce a change in the current preferences; thus, appropriate learning methods should be employed for the system to adapt automatically. In this work, we introduce a methodology based on the so-called Context Dimension Tree, a model for representing the possible contexts in the very first stages of Application Design, as well as an appropriate conceptual architecture to build a recommender system for travelers.

Keywords: Context Dimension Tree · Preferences · Journey planning · Data tailoring · Recommender systems

1 Introduction
The demand for systems that provide personalized services increases the need to extract knowledge from different sources and appropriately reshape it. Besides, services cannot be properly adapted just by considering the static information obtained from the users' profiles: using instead a combination of such profiles with the context in which the user is going to be served is definitely more realistic. Generally speaking, context can be recognized as a set of features (a.k.a. variables) contributing to the decision of a user in a system [5]. One of the novel and challenging characteristics of Intelligent Transportation Systems (ITSs) is the real-time personalization of information considering the user's (i) requirements, (ii) preferences, and (iii) behavioural profile [7]. This work investigates and presents the essential elements required to design a user-centered recommender system for the Travel Companion (TC) module currently being developed within the Shift2Rail (S2R) initiative as part of the Innovation Programme 4 (IP4). TC acts as an interface between the users (typically travelers) and the other modules of the S2R IP4 ecosystem, supporting the users in all steps of their travel. More precisely, since supporting context-dependent data and service tailoring is paramount to ensure personalized services, we aim at extending the work carried out in [1] on Traveler Context-aware User Preferences, by designing the Traveler Context Dimension Tree (CDT) and the conceptual system architecture that identifies the essential components dealing with the creation and management of travelers' preferences. Notice that the CDT discussed in Sect. 3.1 is a proof of concept and is not meant to be a complete real-life system.

The rest of the paper is organized as follows. Sect. 2 discusses some related work and necessary background; Sect. 3 explains the proposed methodology, and Sect. 4 contains some discussions and future works.

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2021. P. Novais et al. (Eds.): ISAmI 2020, AISC 1239, pp. 203–212, 2021. https://doi.org/10.1007/978-3-030-58356-9_20
2 Background and Related Work
This section discusses different trends in the fields of recommender systems and context-aware models.

Recommender Systems. Recommender Systems are designed to help users fulfill their needs by recommending them appropriate items. Different user profiling approaches emerged in the literature, with the aim of determining the users' requirements and behavioral patterns. Each approach falls into one of the so-called Explicit, Implicit or Hybrid categories. Explicit approaches, often referred to as static user profiling, predict the user preferences and activities through data mostly obtained from filling forms, and do not consider the context. Implicit approaches, instead, mostly disregard the users' static information and rely on the information obtained from observing their behaviors. Hybrid approaches are a combination of the other two [11].

Context-Aware Models. The demands and models for designing context-aware systems have been described in many works [2,3,14]. Bolchini et al. [4] introduced the Context Dimension Tree (CDT) model (and associated methodology), aimed at representing and later exploiting the information usage contexts to capture different situations in which the user can act, and formalize them hierarchically as a rooted labeled tree T = ⟨N, E, r⟩. An example of a CDT is depicted in Fig. 1 and thoroughly explained in the next sections: r is the root of the tree, which represents the most general context, and N is the set of nodes, which are either dimension nodes ND (black circles) or concept nodes NC, a.k.a. dimension values (white circles). For further analysis of the dimension and concept nodes, it is possible to add one or more parameter nodes (white squares) that characterize their parent node. The children of r should be dimension nodes, which are known as top dimensions. They define the main analysis dimensions. Each dimension node should have at least one concept or a parameter node.
Dimension nodes should not directly generate dimensions; that is, they cannot have immediate descendants that are dimension nodes themselves. Similarly, concept nodes cannot directly generate other concept nodes. If they are not followed by any parameter node, they represent a Boolean value, and in case of
continuous values or a large number of values, they are followed by suitable parameter node(s).
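The structural rules above can be captured in a small sketch of the tree T = ⟨N, E, r⟩. The class names and the tiny example tree below are illustrative assumptions, not the authors' implementation:

```python
# Minimal CDT data structure enforcing the structural rules of the model.
from dataclasses import dataclass, field
from typing import List

ROOT, DIMENSION, CONCEPT, PARAMETER = "root", "dimension", "concept", "parameter"

@dataclass
class Node:
    name: str
    kind: str
    children: List["Node"] = field(default_factory=list)

    def add(self, child: "Node") -> "Node":
        # The children of the root must be dimension nodes (top dimensions).
        if self.kind == ROOT and child.kind != DIMENSION:
            raise ValueError("top-level children must be dimensions")
        # Dimensions cannot directly generate dimensions,
        # and concepts cannot directly generate concepts.
        if self.kind == child.kind and self.kind in (DIMENSION, CONCEPT):
            raise ValueError(f"a {self.kind} cannot directly contain a {child.kind}")
        self.children.append(child)
        return child

def validate(node: Node) -> None:
    # Each dimension node needs at least one concept or parameter child.
    if node.kind == DIMENSION and not node.children:
        raise ValueError(f"dimension '{node.name}' has no concept or parameter")
    for c in node.children:
        validate(c)

# Tiny example: root -> Profile dimension -> Socio-economic concept (+ parameter)
r = Node("context", ROOT)
profile = r.add(Node("Profile", DIMENSION))
socio = profile.add(Node("Socio-economic", CONCEPT))
socio.add(Node("Age Category", PARAMETER))
validate(r)
```

Attempting to nest a dimension directly under another dimension, or leaving a dimension without any concept or parameter child, raises an error, mirroring the constraints stated above.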
3 Methodology

In this section we explore the travelers' preferences through the CDT methodology, and design a conceptual system architecture that includes the main components for the ranking of the proposed trips according to the travelers' context.

3.1 Traveler Context Dimension Tree (TCDT)
To enable context-aware recommendations for travel purposes, we identified the aspects characterizing contexts, which correspond to the TC users' choice criteria that are potentially useful to score the available trip choices. Figure 1 presents the proposed Traveler CDT (TCDT). Note that, in the application design phase, designing a CDT is performed independently of, yet in parallel with, the other routine activities involved in this phase [6].

The modeling mechanism of the TCDT intends to model neither all the available data and their structure, nor how they are acquired and where they are stored; rather, it models the information that constitutes the various contexts in which the travelers may find themselves during their reservation and travel experiences, information potentially useful for supporting the system in understanding and seconding the users' preferences. Consider the user variable Name as an example; the TCDT does not include it because it does not vary with the user's context; however, it might be useful within the user profile because one might decide to use it to estimate the user's gender.

To improve the performance of predictive models, it is a common practice to employ feature engineering techniques that transform the dataset's feature space [12]. When designing the TCDT of Fig. 1, we have expanded and transformed some raw features into different context dimensions, so that the same feature plays different roles in the TCDT to provide a better understanding of the context. An example is the Places dimension, explained in Sect. 3.2. In practice, the design of a CDT requires an iterative approach, and it depends on the final requirements of the application. In the proposed TCDT, we designed the top dimensions in such a way that any further modification can be applied by increasing or decreasing the level of granularity to tailor the TCDT.
Moreover, the TCDT encompasses some of the essential primary dimension and concept nodes to pave the way for further investigation and tailoring.

3.2 Main Dimensions and Concepts of the TCDT
Fig. 1. Proposed Traveler Context Dimension Tree. Black circles, white circles, and white squares represent, respectively, dimensions, concepts and parameters.

In the TCDT of Fig. 1, the Profile dimension captures the socio-economic characteristics of the users, along with their payment methods. Different groups can be extracted according to the values of the socio-economic factor concept node, such as geographical origin, profession, and so on. Each group carries a set of preferences that are rather stable, thus can be associated with the notion of user profile. The main motivations for this dimension are to enable a warm start for the system and also to provide the chance of detecting and possibly supporting some group behaviors. As an example, consider two regions X and Y in the category of geographical groups, and suppose that, for some reason,
users from X tend to choose eco-friendly travels much more frequently than users from Y. Investigating the reasons behind this tendency enables the authorities and the system to take the required actions (if applicable) for increasing the popularity of eco-friendly travels for the users living in region Y.

Besides the preferences obtained by analyzing the history of the user's choices, the Travel Exclusion dimension allows the system to filter out travel offers that include features, such as specific hotels and transport service providers (TSPs), that have already been explicitly excluded by the user.

In order to support the needs of people with disabilities and health-related issues, we put a particular emphasis on dedicating a group of dimensions to this problem, namely Health Issues and PRM, which stands for Person with Reduced Mobility. The suggested concept nodes for the PRM dimension identify some of the most critical mobility issues and their related parameters. For example, pregnancy is by nature a temporary concept, while the others include a parameter that represents whether the reduced mobility situation of the user is temporary or permanent. Another example of the parameters to be taken into account during the expansion stage of the TCDT is the Type parameter of the Wheelchair concept; possible values for this parameter are manual or motorized. Knowing this is essential because of the particular space each of these types requires, which must be considered when recommending trips. The same consideration applies to the Severity parameters, which can potentially limit the travel choices.

A person can belong to a community if they have joined that community. In the TCDT, the Communities dimension captures the memberships of the user. The Loyalty Cards concept captures the membership of the user in a community that potentially provides specific discounts.
Moreover, we introduced another concept node—Other—as a placeholder to capture other, less structured, communities that may follow different patterns compared to those based on Loyalty Cards. Tailoring the Communities dimension with more levels of granularity through a combination of domain experts’ knowledge and machine learning approaches like clustering is one of our future works. The Behavioral Status dimension describes the current situation of the user through three sub-dimensions. The Inactive concept is true if the user is not interacting with the TC. The Traveling concept, instead, captures the state in which the user is traveling, or has purchased a travel offer and is waiting for the upcoming trip. The Surfing concept encompasses both implicit and explicit momentary user behaviors. More precisely, the Interface and Gestures dimensions capture implicit behaviours. As an example, consider a context in which the user is interacting with the TC through a computer; since this interface provides more space for showing information, and potentially may suggest that the user has no urgent travel request, the TC promotes information regarding “eco-friendly” offers, which have lower CO2 emissions. The users may decide to click, scroll, or ignore this information, which in turn can provide useful insights about their preferences regarding eco-friendly offers. Explicit behaviours, instead, are captured through the Search Options dimension.
208
A. Javadian Sabet et al.
Eco-friendly traveling behaviors can be promoted through so-called ridesharing. For this reason, we foresee that when the user requests a travel offer through the TC and driving a car is a possibility, they can specify whether their Role is that of Driver or of Passenger. Through the TC, they may also specify the Purpose, Legs (One-way, Round Trip, Multi-Destination) and preferred Product (first-class trip, second-class trip, etc.) for the trip. Naturally, the user provides the locations that they are going to visit (at least source and destination). As far as the TCDT is concerned, this value is transformed into appropriate concepts such as Country, City and Zone. For example, consider the Zone concept as a representation of the location; it enables the TCDT to capture the factors contributing to the user's decision through its subdimensions, i.e., Distance from public transportation (PT), Hotels and Landmarks. In addition, the Weather Forecast dimension is used to capture weather information according to the Time when the user will be in that Zone. The same strategy is applied to transform the actual value of the requested departure and arrival times to the Time dimension and its descendant concepts. It may happen that the user has some Accompanying Items (e.g., a bike) and Pets, whose characteristics, such as their Type, Species, Size and Number, should be taken into account when recommending trips. Also, accompanying Persons not only affect travel choices from the logistic point of view, but, if the Person is also a user of the TC, their preferences should be taken into account.

3.3 Dynamic vs. Static Dimensions
We categorize dimensions as static and dynamic ones. Among the top dimensions, Profile, Travel Exclusion and Communities are static, since their values are rarely modified (though they can indeed change over time). On the other hand, Behavioral Status is a dynamic dimension capturing features that usually refer to a specific moment in time and are not necessarily valid for the future interactions of the user with the TC. Moreover, PRM and Health Issues are dimensions that can be categorized both as static and dynamic, depending on whether they are permanent or temporary. To clarify, we illustrate how the interpretation of a situation can change through the following example. For the sake of reporting to the Business Analytics Dashboard (discussed in Sect. 3.4) or of adapting the preferences learner, the TC needs to query the list of TSPs that are un-favored by users. Indeed, according to the TCDT, this information can be obtained from the Travel Exclusion dimension’s child—TSPs concept node—or from the Gestures dimension’s child—Ignored Topics node. Suppose a TSP with eco-friendly, but comparatively expensive travel offers has appeared in the list of Ignored Topics. As the latter is a dynamic concept, the TSP should not be considered one that users in general do not favor; indeed, the TSP might have been made less popular by the short-term circumstances (say, “Holidays in a touristic region”, in which users might tend to opt for the more economical offers). Consequently, any further action, like updates in the preferences learner, should be temporary. If, on the other hand, the TSP appeared in the list associated with
the TSPs concept node (which is static), it should be considered as one that is excluded by a group of users, and any decision for it might not be temporary.

3.4 System Architecture
Fig. 2. Conceptual system architecture showing the main elements engaged in learning the travelers' preferences.

Figure 2 shows the conceptual architecture defining the main blocks and elements required to learn users' preferences and to recommend the best travel options accordingly. Notice that Fig. 2 provides only a partial representation of the TC system, and it does not include all TC's blocks and functions; also, it does not specify where modules are deployed (in the cloud, on the client app, etc.). The architecture depicts three main actors, namely End Users, Travel Companion, and Third Parties. Naturally, Travel Companion is, for our purposes, the most important actor, for which in the following we provide a further breakdown into modules. The Knowledge Models block is useful to have a warm start for users who just registered to the system, and for which the system does not have any prior
data regarding their behavioral activities and specific preferences. Moreover, the Knowledge Models block provides the opportunity to study and understand the behavioral drifts that happen for the person, groups, and communities. Another advantage includes acquiring initial information for logistic purposes.

Social Media (SM) Core is composed of two main blocks, namely SM Miner Pipeline and SM Publisher. The SM Miner Pipeline queries different SM platforms seeking explicit mentions, keywords, and hashtags relevant to TC and travel-related contents; it employs Natural Language Processing techniques to harvest knowledge from those platforms. The SM Publisher enables the TC to publish tailored news, promotions, and responses to specific users on SM. In addition, it allows users to share their trip information and provides other socializing functionalities.

Recommender Core elements take as input user data, knowledge models, and service-related information and accordingly provide a ranked list of the trips for the user. The TC uses the S2R Interoperability Framework [13] to facilitate the exchange of information between TC and other modules through the automatic mapping between concepts, both semantically and technologically [8,10]. After the trip planning is finalized, Trip Tracker provides appropriate notifications about the trip (e.g., disruptions), which include both information that the user explicitly decided to receive, and information that is deemed useful according to the preferences that are implicitly learned. Finally, the Business Analytics Dashboard keeps track of the system performance according to different KPIs and provides a platform for observing the trends and behavioral drifts that are happening.

The main Third Parties, playing different roles concerning the provision of information and services to the TC, are the following. Data Providers provide many different types of information related to the user, service, etc.
For example, they could provide data regarding the safety of zones. Experts from different domains like sociology, transportation, etc. provide and modify the knowledge models of the TC. Weather Service Organizations provide weather forecasts associated with places that the user will visit. Travel Shoppers are the organizations and services which are in charge of planning the journeys. Different Social Media platforms can play two primary roles. On the one hand, they can be employed to collect data regarding the users' attitudes and preferences. On the other hand, TC makes use of them to publish tailored news and promotions.

3.5 Trip Recommendations
Each travel offer received by the TC from Travel Shoppers contains a set of variables that describe its characteristics, such as duration, type of vehicle, CO2 emissions, type of seat, and many others. As a first step in the recommendation of travel offers to the user, the Filter block (see Fig. 2) hides some of them according to the knowledge provided by the values associated with specific TCDT dimensions that are stronger preferences and act as a kind of personal constraints—e.g., offers that include TSPs listed in the Travel Exclusion dimension.
Then, the Ranker block receives the list of remaining travel offers, plus a vector of preferences containing the weights capturing the importance of the TCDT values to the user and to different communities and groups. For each received travel offer, the Ranker computes a numerical score in the interval [0, 1] according to some suitable evaluation metrics and uses this score to rank the offers. Note that, among the user's preferences, some explicit choices should be treated differently from the others. For instance, once the user has chosen first class (which is one of the offer categories), the context becomes more precise, because now the system knows that the first-class offers are the ones most likely to be chosen. Therefore, this specification acts as an additional filter that removes the offers not pertaining to this travel category, or assigns them a lower weight while scoring the offers.

We illustrate the ranking step through an example. Consider a traveler who is pregnant and who is traveling for leisure accompanied by her husband; for the trip, she has excluded a specific type of meal. Moreover, through preference learning, TC knows that she favors eco-friendly offers. Among the travel offers received from the Travel Shopper, TC filters out those that include the type of meal to be avoided. Considering her current context, the weight of her pregnancy condition is higher than that of her preference for eco-friendly solutions. As a result, a travel offer that includes a direct flight providing two aisle seats next to each other will have a higher score than another with the same characteristics but one window and one aisle seat separated by a corridor, which is less favorable considering the presence of the accompanying husband.
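The Filter and Ranker steps above can be sketched as follows. The offer fields, the form of the weight vector, and the scoring rule are simplifying assumptions of ours (the paper leaves the concrete evaluation metrics open):

```python
# Sketch of the Filter + Ranker blocks: hard exclusions first, then a
# normalized weighted match score in [0, 1] for each surviving offer.
from typing import Dict, List, Set

def filter_offers(offers: List[dict], excluded_tsps: Set[str]) -> List[dict]:
    # Personal constraints (e.g. TSPs in the Travel Exclusion dimension)
    # act as hard filters, not as weights.
    return [o for o in offers if o["tsp"] not in excluded_tsps]

def rank_offers(offers: List[dict], weights: Dict[str, float]) -> List[dict]:
    # weights: importance of each context/preference value to the user
    total = sum(weights.values()) or 1.0
    def score(offer: dict) -> float:
        matched = sum(w for feat, w in weights.items()
                      if offer["features"].get(feat))
        return matched / total  # normalized to the interval [0, 1]
    return sorted(offers, key=score, reverse=True)
```

An offer matching every weighted preference scores 1.0; one matching none scores 0.0, so the ordering directly reflects the user's preference vector.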
4 Conclusion and Future Work
The design of an advanced learning system for travelers’ preferences should be such that it not only provides the best possible rankings (and, consequently, suggestions) for travel offers, but it should also be capable of adapting to changes in the behaviors and preferences of users. The latter requirement is of great importance because preferences are highly dynamic, and they are prone to changes from time to time according to different contexts. In this work, we proposed a methodology to describe, at the conceptual level, the different contexts in which travelers can find themselves, with the advantage of being able to specify, for each traveler, how their preferences are affected by context changes. The methodology consists, on one side, in representing the characteristics of users, services and specific circumstances by means of a TCDT, and, on the other side, in designing a system architecture that identifies the potential sources of data and the interactions among the various system elements. We are currently working on enriching the proposed TCDT by increasing the dimensions’ level of granularity to explore the other contexts whose characteristics can contribute to the users’ preferences and to their traveling decisions. Last, but not least, the recommender system should be able to provide the appropriate exploration-exploitation tradeoff [9], which stems from the observation that due to the lack of information about the existence of offers, users may take actions that might mislead the learning system.
212
A. Javadian Sabet et al.
Acknowledgements. Work supported by Shift2Rail and the EU H2020 research and innovation programme under grant agreement No: 881825 (RIDE2RAIL)
References
1. D5.2 – Travel companion specifications. Technical report, IT2Rail Consortium (2017). http://www.it2rail.eu
2. Alegre, U., Augusto, J.C., Clark, T.: Engineering context-aware systems and applications: a survey. J. Syst. Software 117, 55–83 (2016)
3. Bolchini, C., Curino, C.A., Quintarelli, E., Schreiber, F.A., Tanca, L.: A data-oriented survey of context models. ACM SIGMOD Record 36(4), 19–26 (2007)
4. Bolchini, C., Curino, C.A., Quintarelli, E., Schreiber, F.A., Tanca, L.: Context information for knowledge reshaping. Int. J. Web Eng. Technol. 5(1), 88–103 (2009)
5. Bolchini, C., et al.: And what can context do for data? Commun. ACM 52(11), 136–140 (2009)
6. Bolchini, C., Quintarelli, E., Tanca, L.: CARVE: context-aware automatic view definition over relational databases. Inf. Syst. 38(1), 45–67 (2013)
7. Canale, S., Di Giorgio, A., Lisi, F., Panfili, M., Celsi, L.R., Suraci, V., Priscol, F.D.: A future internet oriented user centric extended intelligent transportation system. In: 2016 24th Mediterranean Conference on Control and Automation (MED), pp. 1133–1139. IEEE (2016)
8. Carenini, A., Dell’Arciprete, U., Gogos, S., Kallehbasti, M.M.P., Rossi, M., Santoro, R.: ST4RT – semantic transformations for rail transportation. In: Transport Research Arena (TRA 2018), pp. 1–10 (2018)
9. Drugan, M.M.: Reinforcement learning versus evolutionary computation: a survey on hybrid algorithms. Swarm Evol. Comput. 44, 228–246 (2019)
10. Hosseini, M., Kalwar, S., Rossi, M., Sadeghi, M.: Automated mapping for semantic-based conversion of transportation data formats. In: Proceedings of the International Workshop on Semantics for Transport (Sem4TRA), CEUR-WS, vol. 2447, pp. 1–6 (2019)
11. Kanoje, S., Girase, S., Mukhopadhyay, D.: User profiling trends, techniques and applications. arXiv preprint arXiv:1503.07474 (2015)
12. Polyzotis, N., Roy, S., Whang, S.E., Zinkevich, M.: Data lifecycle challenges in production machine learning: a survey. ACM SIGMOD Record 47(2), 17–28 (2018)
13. Sadeghi, M., Buchníček, P., Carenini, A., Corcho, O., Gogos, S., Rossi, M., Santoro, R.: SPRINT: semantics for PerfoRmant and scalable INteroperability of multimodal Transport. In: Transport Research Arena (TRA 2020), pp. 1–10 (2020, to appear)
14. Vert, G., Iyengar, S.S., Phoha, V.V.: Introduction to Contextual Processing: Theory and Applications. CRC Press, London (2016)
Reputation Algorithm for Users and Activities in a Public Transport Oriented Application

D. García-Retuerta¹, A. Rivas¹, Joan Guisado-Gámez², E. Antoniou³, and P. Chamoso¹(B), on behalf of the My-TRAC group

¹ BISITE, University of Salamanca, Calle Espejo 2, 37007 Salamanca, Spain {dvid,rivis,chamoso}@usal.es
² Data Management Group, Universitat Politècnica de Catalunya, Jordi Girona 1-3, 08034 Barcelona, Spain [email protected]
³ AETHON Engineering Consultants, 25 Em. Benaki Street, 10678 Athens, Greece [email protected]

Abstract. This article presents two reputation algorithms oriented to the transport sector. The first algorithm determines the reputation of users and thus encourages their participation through a point system. The second algorithm, closely related to the first, is also a reputation system, but oriented to the activities that a user can perform during a planned trip: it establishes a ranking among the most popular activities, one that is not based only on typical rating systems, since those can be easily tricked. The main scientific novelty compared to other proposed algorithms is that the proposed system regularly and automatically adapts to the existing content of the platform. The proposed system has been evaluated with synthetic data since, at the time of its definition, there were no real users or activities to work on. The results show that the system behaves as expected, given the conclusions obtained from the analysis of the different existing proposals.

Keywords: Reputation algorithm · Users’ reputation · Transport · Software application

1 Introduction
The vast amount of information generated on the Internet is highly valuable nowadays; if properly analysed, it is a source of knowledge. Thus, in systems that manage large volumes of information, it is necessary to apply analysis mechanisms, like those used in recommender systems, that are capable of extracting knowledge, for example, to ensure a satisfactory user experience by providing users with the content they are looking for. Analysing and filtering the information is especially necessary in cases where the user can interact directly with the content offered in the service.

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2021. P. Novais et al. (Eds.): ISAmI 2020, AISC 1239, pp. 213–223, 2021. https://doi.org/10.1007/978-3-030-58356-9_21
214
D. García-Retuerta et al.
The main objective of this article is to propose a reputation algorithm that facilitates recommendations on a series of trip-related activities, such as the purchase of tickets, the selection of the most appropriate means of transport, tourist activities, etc., which users will be able to use as a guide while planning their trips. After the evaluation of the implemented algorithms using synthetic data (no real data are available at this stage of the project, so a new validation with real data will be necessary), the results obtained show that the reputation value behaves as expected given the technical proposal detailed in this document. Next, a review of existing reputation systems is presented in Sect. 2. Section 3 describes the proposal. Section 4 presents the assessment made with synthetic data. Finally, Sect. 5 presents the conclusions.
2 Background
Numerous reputation algorithms have been proposed over the years. They are generally quite context-dependent: each problem requires studying the best solution and, in most cases, it is not enough to have one generic proposal or to apply a proposal designed for a different problem; the approach must always be adapted to the new problem. In the state of the art, context-dependent solutions have mostly been shown to perform better than those that do not consider the context [1]. The following paragraphs present different existing reputation systems, divided into groups of academic and commercial proposals.

2.1 Academic Proposals
Among the scientific proposals in the state of the art, two stand out: PageRank and EigenTrust. PageRank, presented in [2], is the most popular of all reputation algorithms; it was used in Google’s early days to order the websites in its search engine in an objective and mechanical way. Four years later, researchers from Stanford University proposed EigenTrust [3], an algorithm for reputation management in peer-to-peer (P2P) networks. With its application, they managed to minimize the impact of malicious peers on the performance of a P2P system. PathTrust [1] has been presented more recently. It is based on a model that exploits the graph of relationships among the participants of virtual organizations. Its authors indicate that the system builds on the two previous algorithms (PageRank and EigenTrust), which however are not directly applicable because their personalization is very limited. Below is a brief description of how each algorithm works, along with its advantages and disadvantages.
– PageRank [2]: Advantages: This algorithm converges in about 45 iterations, and its scaling factor is roughly linear in log(n). It uses graph theory to link
the pages. An important component of PageRank is that its calculation can be personalized. PageRank can also estimate web traffic and predict backlinks. Disadvantages: PageRank is based on random walks on graphs. The algorithm assumes the behaviour of a “random surfer”, but if a real web surfer ever gets into small loops of web pages, PageRank will produce false positives. The random-surfer model also assumes that the surfer periodically “gets bored” and jumps to another random page.
– EigenTrust [3]: Advantages: This is among the best-known and most successful reputation systems. It satisfactorily solves several problems of P2P systems, the context for which the algorithm was designed. Disadvantages: The main drawback of this system is its reliance on a set of pre-trusted peers, which causes nodes to centre around them. As a consequence, other peers are ranked low despite being honest, marginalizing their effect on the system [4].
– PathTrust [1]: Advantages: This model of reputation (using the trust relationships amongst the participants) particularly lends itself to resisting attacks based on faking positive feedback, in which a group of attackers collaborates to boost their reputation ratings by leaving false positive feedback for each other. In this model, such an attack only strengthens the trust relationships among the attackers themselves, not necessarily the path from an honest inquirer to an attacker, so the reputation from the honest inquirer’s point of view should remain unaffected. Another benefit of exploiting established relationships in member selection is the formation of long-term relationships. Disadvantages: The trust relationship between two participants is formed based on their past experience with each other. A participant leaves a feedback rating after each transaction, and these ratings are accumulated into a relationship value.
Thus, a single user can repeatedly boost positive or negative feedback.

2.2 Commercial Proposals
Currently, the most important reputation system proposals are those used by commercial applications. Generally, commercial reputation systems focus directly on assigning users a reputation within that commercial system; examples include the reputation systems of TripAdvisor, Waze, Amazon, BlaBlaCar and ResearchGate. The conclusion drawn from the review of the state of the art is that all the existing reputation system proposals, especially those for commercial systems, focus exclusively on their own context. This implies that a specific algorithm has to be designed to obtain good results. To do this, it is essential to identify the relevant factors and the extent to which they directly influence reputation. Likewise, although each type of parameter has its weight in the final score, each occurrence of the parameter may affect the associated factors differently. It is, therefore, necessary to determine how the score assigned to each occurrence of a parameter evolves over time.
Moreover, in the majority of the analysed commercial proposals, the user must know the highest reputation level that can be reached. This allows them to understand the relevance of the different scores.
3 Proposal for User and Users’ Choices Reputation Algorithm
In this section, we describe the algorithms that assign a reputation score to each user and to each user choice (activity), representing their ranking within the system. The two algorithms share a common basis; what differentiates them is the purpose for which they are used, and therefore they use different factors and metrics. As a result, each subsection describes either the part dedicated to the users’ reputation algorithm or the part dedicated to the users’ choices reputation algorithm. This section is structured as follows: first, the factors identified as essential to determine a user’s reputation in the system are presented. Then, the metrics associated with each of the individual factors are shown, followed by the description of the mechanism that provides the initial score as the output of each user or user choice algorithm. Finally, the proposed adaptive weight mechanisms of both algorithms are described. They adapt the weight of factors according to the dynamic characteristics of the application where the algorithms are applied (My-TRAC, an application oriented to providing new public-transport-oriented functionalities to users). Thus, the role of this mechanism is to re-establish the limits of each factor over time, as the number of users or the number of existing ratings changes, providing an adequate maximum score.

3.1 Mathematical Description of the Reputation System
Metrics are related to each of the identified factors. Therefore, each metric determines the reputation score provided by its corresponding factor, and each factor has its own metric. Moreover, the metrics affect the overall reputation, as it is calculated as the sum of all the factors’ scores. Thus, each metric provides a final score, calculated as the percentage reached by that user over the total weight of each factor, and these final scores are added to obtain the user/activity reputation. The number of instances required to reach the maximum score is established for each factor. In addition, the slope of a certain function determines how fast or how slowly the value for that parameter increases. It has been established that the evolution is not linear, just like ResearchGate’s calculation of its “RG Score”. Thus, the growth of the score of a specific factor will be either logarithmic or exponential, following the equations defined in Eq. (1) and Eq. (2):

scoreParameter_i = y_maximum · (log(slope · x/x_maximum + 1) / log(2 + slope))    (1)
The logarithmic equation shown in Eq. (1) is useful in cases where the slope should be greater in the initial instances and then gradually decrease in subsequent ones. For example, to encourage new users to rate activities, the first few ratings a user gives will have a considerable effect on their reputation; however, the user will not be able to keep gaining reputation at the same pace after producing a considerable number of ratings. Instead, further ratings will have a smaller impact on the user’s reputation. Logarithmic growth is regulated by the slope variable of the equation, whereas the maximum number of instances is regulated by x_maximum. This factor is dynamic due to the usage characteristics of the social network. So, in the case of the factor “ratings provided by users”, x_maximum can take a value of 200, meaning that a user with more than 200 ratings will obtain a 100% initial score, which will greatly contribute to the final score. In cases where the usage patterns of the application imply that users give a large number of ratings, the factor x_maximum is adjusted dynamically so that x_maximum = 2 · avgRatingsByUser. Finally, the factor y_maximum can reach 1, so that each factor has a score between 0 and 1.

scoreParameter_j = y_maximum · (x/x_maximum)^slope    (2)

The exponential equation shown in Eq. (2) is useful for factors in which the weight of the initial instances is lesser and becomes more important as the number of instances grows. For example, a user that opens the application three times will not notice a significant increase in their reputation in the system; however, a user who opened the application 200 times is considered a regular user and therefore obtains a pertinent reputation.
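A direct transcription of the two growth curves of Eqs. (1) and (2) can look as follows. Clamping x at x_maximum is our assumption (the text only states that scores are bounded by y_maximum), and the default slope values are illustrative:

```python
from math import log

def log_score(x, x_maximum, slope=1.0, y_maximum=1.0):
    """Eq. (1): fast initial growth that flattens out near x_maximum."""
    x = min(x, x_maximum)  # assumption: instances beyond the maximum add nothing
    return y_maximum * log(slope * x / x_maximum + 1) / log(2 + slope)

def exp_score(x, x_maximum, slope=2.0, y_maximum=1.0):
    """Eq. (2): slow initial growth that accelerates towards x_maximum."""
    x = min(x, x_maximum)
    return y_maximum * (x / x_maximum) ** slope

# For the ratings factor from the text (x_maximum = 200), the logarithmic
# curve rewards a user's first ratings much more than the exponential curve
# would, which only rewards sustained activity.
```

This makes the intended difference between the two curves concrete: for the same x, the logarithmic score dominates early on, while the exponential score catches up only as x approaches x_maximum.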
Although the mathematical approach described above is not directly based on any existing work to determine reputation, these types of equations are well known and widely used in the literature for multiple purposes. Mathematically speaking, the most similar work can be found in [5], where the authors present a trust management system based on reputation mechanisms. The mechanisms proposed in that paper base the evolution of reputation on the number of assessments, which follows a logarithmic distribution.

User Reputation Mathematical Model. All the equations related to users can be found in Fig. 1. The next variables are used as input:

– cdate refers to the current date.
– rdate refers to the registration date.
– nvaluations refers to the number of valuations of the user.
– nuses refers to the number of times the user launched the app.
– ntickets refers to the number of tickets the user bought.
– ngroups refers to the number of groups the user joined.

The next variables are the obtained outputs:

– s1 is the initial score of days registered.
Fig. 1. Equations related to the user reputation, inputs, outputs and factors.
– s2 is the initial score of the list of valuations.
– s3 is the initial score of the number of uses of the application.
– s4 is the initial score of the number of tickets purchased.
– s5 is the initial score of the number of groups.
M represents the number of occurrences of a given parameter needed to provide the maximum value/weight that it is capable of contributing to the total reputation (w). Both are static (but editable) variables obtained from the database. They refer to the maximum and the weight, respectively; the subscript indicates which factor they are related to. The final score S is defined as shown in Eq. (3):

S = Σ_{i=1}^{5} s_i    (3)
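The five-factor sum of Eq. (3) can be sketched as below. Which curve and which weight each factor uses is an illustrative assumption on our part; in My-TRAC, the maxima M and weights w are configurable values stored in the database:

```python
from math import log

# Illustrative computation of S (Eq. (3)) as the sum of the five factor
# scores. The curve assigned to each factor and the example values of M and
# w are assumptions, not the actual My-TRAC configuration.

def _log_curve(x, x_max, slope=1.0):
    x = min(x, x_max)
    return log(slope * x / x_max + 1) / log(2 + slope)

def _exp_curve(x, x_max, slope=2.0):
    x = min(x, x_max)
    return (x / x_max) ** slope

CURVES = {"days": _log_curve, "valuations": _log_curve,
          "uses": _exp_curve, "tickets": _exp_curve, "groups": _log_curve}

def user_reputation(inputs, M, w):
    """S = s1 + ... + s5, each factor contributing at most its weight w."""
    return sum(w[k] * CURVES[k](inputs[k], M[k]) for k in CURVES)

M = {"days": 365, "valuations": 200, "uses": 500, "tickets": 50, "groups": 10}
w = {"days": 10, "valuations": 30, "uses": 25, "tickets": 25, "groups": 10}

newcomer = {"days": 7, "valuations": 2, "uses": 3, "tickets": 0, "groups": 1}
regular = {"days": 300, "valuations": 150, "uses": 400, "tickets": 40, "groups": 8}
```

Because each s_k is bounded by its weight w_k, the total S is bounded by the sum of the weights, and a regular user always scores above a newcomer with strictly smaller inputs.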
This novel proposal incorporates advantages found in commercial systems, such as: i) dynamic adaptation of the reputation to the information of the system itself (non-linear growth); ii) dynamic adaptation at the parameter level, varying the specific weight that each parameter has on the final reputation; iii) engaging users with well-modelled changes in their reputation scores. To this end, mechanisms similar to those used in well-known proposals that have proven to work well (such as the one presented in [5]) have been incorporated, together with the peculiarities of My-TRAC, which determine the information to be used.

Users’ Choices Reputation Mathematical Model. All the equations related to users’ choices can be found in Fig. 2. The next variables are used as input:

– rauser_{k,i} is the rating of the k-th user on the i-th activity. The number of users who rated this activity is defined as n.
– reuser_k is the reputation of the k-th user who rated the activity. The number of users who rated this activity is defined as n.
Fig. 2. Equations related to the users’ choices reputation, inputs, outputs and factors.
– ndays is the number of days since the activity was created.
– nviews is the number of views of the activity.

The next variables are the obtained outputs:

– s1 is the initial score of the N-star ratings weighted average.
– s2 is the initial score of the number of views of the activity.

M and w are static (but editable) variables obtained from the database. They refer to the maximum and the weight, respectively; the subscript indicates which factor they are related to. The final score, S, is defined as shown in Eq. (4):

S = Σ_{j=1}^{2} s_j    (4)
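A sketch of the activity score of Eq. (4) follows. Weighting each star rating by the rater’s reputation is our reading of the “N-star ratings weighted average” factor, and the view curve and constants are illustrative assumptions:

```python
# Illustrative computation of the activity score S (Eq. (4)). The weighting
# scheme, the linear view curve and all constants are assumptions; the
# actual My-TRAC factors are configurable database values.

def activity_reputation(ratings, rater_reputations, n_views,
                        M_views=1000, w_ratings=60.0, w_views=40.0):
    total_rep = sum(rater_reputations)
    if total_rep > 0:
        # s1: reputation-weighted mean of 1..5 star ratings, rescaled to [0, 1]
        mean = sum(r * rep for r, rep in zip(ratings, rater_reputations)) / total_rep
        s1 = w_ratings * (mean - 1) / 4
    else:
        s1 = 0.0
    s2 = w_views * min(n_views, M_views) / M_views  # s2: linear view score
    return s1 + s2
```

One consequence of this weighting is that ratings left by high-reputation users move the activity score more than the same ratings left by newcomers, which is what makes the ranking harder to trick than a plain star average.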
3.2 Updating the Parameters and Their Weights
The information on My-TRAC is not static; it evolves over time. This obliges the metrics that are part of the reputation algorithms to adapt to the information. For this reason, it is crucial to implement mechanisms that update the configurable factors in each of the metrics. For example, during the pilot stage, when the application begins to obtain real user data, the system will start from zero; in the beginning, a smaller number of instances of each factor will be required to obtain a significant final reputation score for a user/activity, whereas the number of instances required will be much higher after a year of system operation. According to the number of instances of each of those factors, the metrics that make up the algorithm can be adapted automatically or manually. Both the weight that a parameter has on the final reputation and the number of occurrences that a parameter must have to obtain the maximum score can be updated. The weight of the parameter in the final reputation is set a priori but can be changed at any given point in order to correct certain anomalies or to reinforce desired behaviours. On the other hand, there are two ways of updating the number of occurrences that a parameter must have to obtain its maximum score:
– Manually: when an expert administrator/developer decides that it should be changed for some reason.
– Automatically: depending on the evolution of the information in the platform. For example, the rating an activity has in the system will not remain the same; it will change over time and, according to its evolution, the maximum weight of this parameter in the system can increase or decrease (if it receives many ratings, its weight will decrease).

In this first version of the model, the system’s automatic adaptation has not been evaluated, because the data used at this stage are generated synthetically. However, the system already implements automatic adaptation over those data. Once real data are obtained in the pilot stage, automatic adaptation will be evaluated.
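The automatic update can be sketched as follows. The “twice the per-user average” rule follows the example given for the ratings factor in Sect. 3.1; the floor value and the application of the rule to other factors are our assumptions:

```python
# Sketch of the automatic update of a factor's x_maximum as platform usage
# evolves. The 2 * average heuristic comes from Sect. 3.1 (ratings factor);
# the floor and the generalisation to other factors are assumptions.

def updated_x_maximum(counts_per_user, floor=10):
    """New number of occurrences required to reach the factor's maximum score."""
    if not counts_per_user:
        return floor  # pilot stage: no data yet, keep the bar low
    avg = sum(counts_per_user) / len(counts_per_user)
    return max(floor, round(2 * avg))
```

Early in the pilot, the floor keeps the maximum easy to reach; as usage grows, the bar rises automatically with the observed per-user average.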
4 Evaluation and Results
The only way to evaluate the correct functioning of the algorithm with the synthetic data is to analyse whether the obtained output behaves as expected, and then to judge whether the reputation score assigned to different users corresponds to the initial idea as a function of the values of each of the parameters affecting the reputation score. The evaluation of the obtained results is a subjective task; nevertheless, it is important to verify that the algorithms behave as expected.

4.1 Users’ Reputation Evaluation
When creating the synthetic dataset, the aim is to simulate the behaviour of real users. Since this study aims to model the use of the system by active users, no inactive users are generated, even though in a real system they could become the majority.
Fig. 3. Distribution of the users’ reputation with the generated synthetic data
In this way, there is a set of users who use the system a lot, a larger set who use it frequently, and an even larger one who use it sporadically. This has involved creating three ranges of usage possibilities when generating the data. This distribution of users is easily observed by analysing the scoreboards of those commercial applications whose scoreboards are public. For example, in Waze [6], one of the tools analysed in Sect. 2, a user with 100,000 points can reach the maximum level, “Waze Royalty”, which means being among the 1% most active users in the country, while the top users listed on the scoreboard have more than 1,000,000 points. Figure 3 shows the distribution of reputation among system users: on the x-axis are reputation intervals and on the y-axis the number of users with a reputation within each interval. The resulting scores present a Gaussian distribution, which denotes desirable behaviour, as it is the distribution one would expect from many natural phenomena.
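The three-tier synthetic population described above can be generated, for instance, as below. The tier proportions and activity ranges are illustrative assumptions; only the absence of inactive users is taken from the study design:

```python
import random

# Sketch of the three-tier synthetic user population: a small set of heavy
# users, a larger set of frequent users and the largest set of sporadic
# users. Proportions and ranges are illustrative assumptions.

def synthetic_users(n, seed=42):
    rng = random.Random(seed)
    users = []
    for _ in range(n):
        p = rng.random()
        if p < 0.10:              # heavy users (smallest group)
            uses = rng.randint(300, 500)
        elif p < 0.40:            # frequent users
            uses = rng.randint(51, 300)
        else:                     # sporadic users (largest group)
            uses = rng.randint(1, 50)
        users.append({"n_uses": uses, "n_valuations": rng.randint(0, uses)})
    return users

users = synthetic_users(1000)
```

Every generated user has at least one use (no inactive users), and sporadic users form the majority of the population, as in the setup described above.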
4.2 Activities Reputation Evaluation
On the other hand, the activities reputation algorithm, which determines the reputation of the activities included in My-TRAC, cannot be contrasted with the reputation of real activities, as the data have been obtained synthetically. For this reason, the evaluation will be extended when real data are available. In this case, the only case-specific restrictions applied when generating the synthetic dataset are:

– The identifier is a unique integer from 1 to 1,000.
– The inclusion date is between 01/09/2017 (start of the project) and 18/12/2018 (the date on which the evaluation was carried out).
– The number of views of an activity is higher than its number of ratings.

The distribution of the reputation of the 1,000 synthetically created activities is shown in Fig. 4, with the reputation values of the activities on the x-axis and the number of activities in the different reputation intervals on the y-axis. It can be observed that there is no activity with a reputation of less than 21, because the synthetic data were created to test the performance of the models with active users and successful activities. These circumstances are not expected to exist in reality, where there may be activities that receive no ratings at all during the pilot stage.
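A generator respecting the three restrictions listed above can be sketched as follows; the rating and view ranges are illustrative assumptions beyond those restrictions:

```python
import random
from datetime import date, timedelta

# Generator for the synthetic activity set following the three restrictions
# above (unique ids 1..1000, creation date within the project window, and
# more views than ratings). Rating/view ranges are illustrative.

def synthetic_activities(n=1000, seed=7):
    rng = random.Random(seed)
    start, end = date(2017, 9, 1), date(2018, 12, 18)
    span = (end - start).days
    activities = []
    for i in range(1, n + 1):
        n_ratings = rng.randint(1, 200)
        activities.append({
            "id": i,                                      # unique integer 1..1000
            "created": start + timedelta(days=rng.randint(0, span)),
            "n_ratings": n_ratings,
            "n_views": n_ratings + rng.randint(1, 2000),  # views > ratings
        })
    return activities

activities = synthetic_activities()
```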
Fig. 4. Distribution of the activities’ reputation with the generated synthetic data
5 Conclusion
The described evaluation has allowed us to verify that the software implementation of the designed algorithms provides the expected results. Moreover, the overall outcome is similar to what was initially intended: new users are encouraged to use the application by receiving higher scores for their first activities, and this score growth decreases as users become regular visitors. Thus, in order to achieve the maximum reputation, it is necessary to be a fairly active user of the system and not simply to maintain sporadic usage of the platform. As the system always tries to converge to a Gaussian distribution of the reputation for both users and activities, we can assess that a “good” reputation should be over 75. We can, therefore, conclude that, even though no real data were available, we fulfilled the goal of allowing users to determine the relevance of users and actions in the case study of the My-TRAC platform.

Acknowledgments. This research has been supported by the European Union’s Horizon 2020 research and innovation programme under grant agreement No 777640.
References
1. Kerschbaum, F., Haller, J., Karabulut, Y., Robinson, P.: PathTrust: a trust-based reputation service for virtual organization formation. In: International Conference on Trust Management, pp. 193–205. Springer (2006)
2. Page, L., Brin, S., Motwani, R., Winograd, T.: The PageRank citation ranking: bringing order to the web. Technical report, Stanford InfoLab (1999)
3. Kamvar, S.D., Schlosser, M.T., Garcia-Molina, H.: The EigenTrust algorithm for reputation management in P2P networks. In: Proceedings of the 12th International Conference on World Wide Web, pp. 640–651. ACM (2003)
4. Kurdi, H.A.: HonestPeer: an enhanced EigenTrust algorithm for reputation management in P2P systems. J. King Saud Univ.-Comput. Inf. Sci. 27(3), 315–322 (2015)
5. Zacharia, G., Maes, P.: Trust management through reputation mechanisms. Appl. Artif. Intell. 14(9), 881–907 (2000)
6. Waze: Your rank and points – Connected Citizens Program, 2 December 2019. https://wiki.waze.com/wiki/Your Rank and Points
Extraction of Travellers’ Preferences Using Their Tweets

Juan J. Cea-Morán¹, Alfonso González-Briones¹,²(B), Fernando De La Prieta¹, Arnau Prat-Pérez³, and Javier Prieto¹

¹ BISITE Research Group, University of Salamanca, Calle Espejo 2, 37007 Salamanca, Spain {juanju 97,alfonsogb,fer,javierp}@usal.es
² Air Institute, IoT Digital Innovation Hub, Carbajosa de la Sagrada, 37188 Salamanca, Spain
³ Sparsity-Technologies, Barcelona, Spain [email protected]
Abstract. New mapping and location service applications have focused on offering improved usability and service based on multimodal passenger experiences from door to door. This helps citizens to develop greater confidence in, and adherence to, multimodal transport services. These applications focus on adapting to the needs of users during their journeys thanks to the data, statistics and trends provided by the passengers’ experiences while using these platforms. The My-Trac application is dedicated to the research and development of these user-centred services to improve the multimodal experience using various techniques. Among these techniques are preference extraction systems, which extract user information from social networks such as Twitter. In this article, we present a system that builds a profile of a given user’s preferences based on the tweets published on their Twitter account. The system extracts the tweets from the profile, analyzes them using the proposed algorithms and returns, as a result, a document containing the categories and the degree of affinity that the user maintains with each one. In this way, the user can be offered activities or services with a high degree of affinity during the route to be taken.

Keywords: Users’ profiling · Data extraction · Natural language processing · Transport · Mapping application

1 Introduction
Human beings are social beings; we always seek to be in contact with other people and to have as much information as possible about the world around us. The emergence of the Internet has made it possible to define new forms of communication between people. It has also made a great deal of information on any subject available to the average user at any time.

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2021. P. Novais et al. (Eds.): ISAmI 2020, AISC 1239, pp. 224–235, 2021. https://doi.org/10.1007/978-3-030-58356-9_22
Profiling with Twitter Info
225
The main technological advance in which this reality materializes is the development of social networks. Facebook, Instagram and Twitter are three examples of social networks used by millions of people around the world: people who, thanks to these technologies, can start a conversation from different parts of the world, post photos from their last trip, or update their followers with their opinions and experiences. Regarding the latter, Twitter is undoubtedly the main exponent [1,2]. Twitter is a social network based on microblogging, i.e. messages of no more than 280 characters through which users express their opinions, tastes, experiences, etc. In addition, due to its nature as a social network, Twitter allows its users to follow other accounts that interest them, or to comment on events in real time using hashtags. All this translates into one word: information. The information provided by users on social networks can be used in various ways, many of them negative: being exposed on the Internet means that anyone can access your data and use it for profit. However, it can also be used to make life easier for users who choose this, always bearing in mind that there must be express consent on their part. This is precisely the case of the work presented here. Nowadays, information is a very valuable asset and, as such, Twitter is a great source of data when it comes to analyzing human behavior and interactions, or to knowing the opinion of certain users on certain topics. These systems can be used in mapping applications to improve the multimodal experience of users. Therefore, we propose to adapt these systems for adoption in mapping applications. This article proposes a system for the extraction of information about Twitter users. The system is capable of obtaining information about a particular user and of elaborating a profile with the user’s preferences in a series of pre-established categories.
A review of related work is presented in Sect. 2. Section 3 describes the proposal. Section 4 presents the evaluation. Finally, Sect. 5 presents the conclusions.
2 Background
The use of Machine Learning techniques to analyse information extracted from Twitter is a very common case study today, so it is worth examining what kind of research is being carried out on this subject. One of the main applications is the use of Twitter and Natural Language Processing techniques to extract a user’s opinion about what is being tweeted at a given time. The article “A system for real-time Twitter sentiment analysis of 2012 U.S. presidential election cycle”, written by Hao Wang et al. [3], presents a system for real-time polarity analysis of tweets related to candidates in the 2012 U.S. elections. The system collects tweets in real time, tokenizes and cleans them, identifies which candidate is being talked about in each tweet and analyzes the polarity. For training, it applies Naïve Bayes, a statistical classifier, using hand-categorized tweets as input. Another similar study is the one proposed by J.M. Cotelo et al. from the University of Seville: “Tweet Categorization by combining
content and structural knowledge" [4]. It proposes a method to extract users' opinions about the two main Spanish parties in the 2013 elections, using two processing pipelines: one based on the structural analysis of the tweets and the other on the analysis of their content. Another possible line of research is the categorization of Twitter content. This is the case of the article "Twitter Trending Topic Classification" by Kathy Lee et al. [5], which studies how to classify trending topics (highlighted hashtags) into 18 different categories using Topic Modelling techniques. Its key point lies in providing a solution based on the analysis of the network underlying the hashtags and not only the text: "our main contribution lies in the use of the social network structure instead of using only textual information, which can often be noisy considering the social network context". As can be seen, many current studies are oriented to the analysis of Twitter using Machine Learning tools. The challenge faced in this work is to find an optimal way to classify users according to their tweets. The following sections detail the objectives of the project and each of the parts that have been researched and tested in order to build a stable system that solves this task.
3
Proposal
The following sections break the proposal down into its different parts and provide in-depth details. From an abstract point of view, the proposal can be seen as a processing pipeline, as shown in Fig. 1. The different phases of this pipeline contribute to the achievement of the main objective: user classification.
Information extraction (tweets from user profile) → Cleaning and vectorization → Analysis and categorization → User categorized

Fig. 1. Processing pipeline representing the system processing steps.
3.1
Twitter Data Extraction
The Twitter data extraction mechanism is a fundamental element of the system. Its goal is to recover two types of data. On the one hand, the system extracts a set of anonymous tweets related to each of the defined preference categories; these tweets are used to train the data classification algorithms. On the other hand, the mechanism extracts information about the user for the analysis of their preferences. Twitter's API enables developers to perform all kinds of operations on the social network, so the system necessarily builds on this API. One option would be to implement a module that issues HTTP requests directly against the endpoints of interest; however, this involves a remarkably high development cost. Another option is to use one of the multiple Python libraries that encapsulate this logic and offer a simple interface to developers. The latter option has been chosen for the development of this system, more specifically the Tweepy library.

3.2
Preprocessing and Vectorization of Tweets
Once the data has been extracted, it must be prepared for the classification algorithms. First of all, the data must be cleaned of symbols that are strange or irrelevant to the problem at hand, and during this process the texts are divided into tokens. They must then be converted from text to numbers so that Machine Learning algorithms can work with them. Cleaning and preprocessing techniques are applied first, so that the text is ready for the vectorization algorithms; packages such as NLTK and Spacy are used. The key activity performed during preprocessing is the elimination of the most common words of the language being worked with (stop words). In Spanish these would be, for example, "y", "hacia", "aunque", "desde", "cuando", etc.; in this case the system works with English. Web links and usernames that may compromise the users' privacy must also be removed. Table 1 shows the results obtained after the tweets have gone through the preprocessing and preparation process carried out with the tools listed above. Vectorization is the application of models that convert texts into numerical vectors so that the classifying algorithms can work with the data. Two algorithms widely used in the field of NLP have been considered for this task: Bag of Words and TF-IDF. Each of the techniques described below serves different classification algorithms. Bag of Words [6] is a model that extracts characteristics from texts (also images, audio, etc.); it is therefore a feature extractor. The model consists of two parts: a representation of all the words in the text and a vector representing the number of occurrences of each word throughout the text, hence the name. The model completely ignores the structure of the text; it simply counts the number of times each word appears. It has been implemented through the Gensim library.
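As a concrete illustration of the cleaning steps described above (link and username removal, lowercasing, tokenization, stop-word filtering, term counting), here is a minimal standard-library sketch. The real system uses NLTK/Spacy with a full English stop-word list; the tiny STOP_WORDS set below is an assumption made only so the example is self-contained.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list; the actual system uses NLTK's full English list.
STOP_WORDS = {"this", "before", "a", "with", "the", "and", "to", "of", "for"}

def clean_and_tokenize(tweet):
    tweet = re.sub(r"https?://\S+", " ", tweet)  # remove web links
    tweet = re.sub(r"@\w+", " ", tweet)          # remove usernames (privacy)
    tokens = re.findall(r"[a-z]+", tweet.lower())
    return [t for t in tokens if t not in STOP_WORDS]

doc = clean_and_tokenize("Read This Before Taking a Road Trip with a Pet https://t.co/x")
print(doc)           # ['read', 'taking', 'road', 'trip', 'pet']  (cf. row 0 of Table 1)
print(Counter(doc))  # bag-of-words term counts for this document
```

The last line is the essence of the Bag of Words representation: a vocabulary plus per-document occurrence counts, with the text's structure discarded.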
Table 1. Preprocessing results with NLTK and Spacy.

Text                                             | NLTK tokenized                                     | Spacy tokenized
0 Read This Before Taking a Road Trip with a Pet | [read, taking, road, trip, pet]                    | [read, take, road, trip, pet]
1 @kenwardskorner @Senators @Canucks Also, does  | [also, name, imply, take, acid, road, trips, ...]  | [imply, acid, road, trip, look, like]
2 Our Art is our Passion #apnatruckart #truck    | [art, passion, apnatruckart, truck, art, uniqu...] | [art, passion, apnatruckart, truck, art, uniqu...]
3 Lelang drop acc budget 40–50 dong?             | [lelang, drop, acc, budget, dong]                  | [lelang, drop, acc, budget, dong]
4 We agree...and want everyone to know that ou   | [agree, want, everyone, know, tours, relaxed, ...] | [agree, want, know, tour, relaxed, low, mileage...]
5 Choosing a hotel for a break away with the fam | [choosing, hotel, break, away, family, special, ...] | [choose, hotel, break, away, family, special, ...]
6 @JassyJass Are you camping?                    | [camping]                                          | [camp]
7 How to Pack Your Electronics for Air Travel ht | [pack, electronics, air, travel]                   | [pack, electronic, air, travel]
8 Dasar low budget! =)=) https://t.co/2YUmUrGjj5 | [dasar, low, budget]                               | [dasar, low, budget]

Term Frequency – Inverse Document Frequency (TF-IDF) [7] is the product of two measures that indicate, numerically, the degree of relevance of a word in a document within a collection of documents. It is broken down into two parts. Term frequency measures the frequency with which a term appears in a document. There are several measurement options, the simplest being the gross frequency, i.e. the number of times a term t appears in a document d. However, in order to avoid a predisposition towards long documents, the normalized frequency is used:

    tf(t, d) = f(t, d) / max{f(t', d) : t' ∈ d}    (1)
As shown in Eq. (1), the frequency of the term is divided by the maximum frequency over the terms of the document. Inverse document frequency reduces the weight of a term that appears very frequently across the analysed documents, and increases it if the term appears infrequently:

    idf(t, D) = log( |D| / |{d ∈ D : t ∈ d}| )    (2)
As shown in Eq. (2), the total number of documents is divided by the number of documents containing the term. The complete measure is the product of the two, as shown in Eq. (3):

    tfidf(t, d, D) = tf(t, d) × idf(t, D)    (3)
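In scikit-learn, this vectorization step can be sketched as follows. Note that `TfidfVectorizer` uses raw term counts and a smoothed idf by default, so its weights differ slightly from Eqs. (1)–(3); the example documents are assumptions for illustration.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

docs = [
    "read taking road trip pet",
    "pack electronics air travel",
    "dasar low budget",
]
vectorizer = TfidfVectorizer()          # default settings: smoothed idf, l2 norm
tfidf = vectorizer.fit_transform(docs)  # sparse matrix: documents x terms
print(tfidf.shape)                      # (3, 12): 3 documents, 12 distinct terms
```

The resulting sparse matrix is exactly the kind of input the classification and topic-modelling algorithms in the next section expect.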
The implementation of this model has been done through the Sklearn library.

3.3
Topic Modelling
Topic Modelling is a typical NLP task that aims to discover abstract topics in texts; that is, it is a widely used data mining technique for discovering hidden semantic structures in a body of texts. In this case, the technique is used to discover the themes of the user's tweets, which should correspond to the defined categories. Most of the tested algorithms are based on the unsupervised learning paradigm (except KNN). These algorithms return a set of topics, as many as indicated in the training, and each topic represents a cluster of terms that should be related to one of the categories. Precisely for this reason, a large number of tweets have been retrieved as training data by searching for keywords in each of the categories. During the research period a total of 6 algorithms have been tested: LDA (Gensim), LDA (Sklearn), LSI (Sklearn), K-means, KNN and NMF. For each of them, different combinations of vectorizers and hyperparameters have been tested with the intention of fine-tuning the result. The discarding process has been based on two phases:
– Generation of topics: the first phase consisted of generating a set of topics matching the categories, avoiding as far as possible the appearance of terms belonging to other categories.
– Testing with users: algorithms that pass the previous phase and yield a correctly trained model should correctly analyze the tweets of specific users whose general theme is very clear (for example Tesla or Donald Trump).
The algorithm used in the final system is explained below. Non-negative Matrix Factorization (NMF) is an unsupervised learning algorithm belonging to the field of linear algebra. NMF reduces the dimensionality of an input matrix by factoring it into two matrices whose product approximates the original at a smaller rank: V ≈ W H. Let us suppose, observing Eq. (4), a vectorization of P documents with an associated dictionary of N terms (weights).
That is, each document is represented as a vector of N dimensions, so the whole collection of documents corresponds to a matrix

    V ∈ R^(N×P)    (4)
Where N is the number of rows of the matrix, each representing a term, and P is the number of columns, each representing a document. Equations (5) and (6) show matrices W and H. The value r marks the number of topics to be extracted from the texts. Matrix W contains the characteristic vectors that make up these topics; the dimensionality of these vectors is identical to that of the data in the input matrix V. Since only a few topic vectors are used to represent many data vectors, these topic vectors are forced to discover latent structures in the text. Matrix H indicates how to reconstruct an approximation of V by means of linear combinations of the columns of W.

    W ∈ R^(N×r)    (5)

Where N is the number of rows of W, each representing a term (weight), and r is the number of columns, i.e. the number of characteristics to be extracted.

    H ∈ R^(r×P)    (6)

Where r is the number of rows of H, i.e. the number of characteristics to be extracted, and P is the number of columns, with one column for each document. The matrix product of W and H is therefore a matrix of dimensions N × P corresponding to a compressed version of V. For the implementation of this algorithm, the Sklearn library has been used again. NMF Sklearn implementation:

    from sklearn.decomposition import NMF
    nmf = NMF(n_components=29, random_state=2, alpha=.1,
              l1_ratio=.5, init='nndsvda').fit(tfidf)
It should be noted that during the process of data preparation, a function has been implemented prior to vectorization to allow the 8,000 tweets in each category to be divided into a variable number of documents. Thus, tests have been performed to train the model with 80 documents per category (100 tweets per document), 100 documents per category (80 tweets per document), 8,000 documents per category (1 tweet per document), and so on. All this, in turn, modifies the number of components to be extracted by the model.
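The document-splitting step described above can be sketched as follows; the function name and exact grouping strategy are assumptions, since the paper does not show its implementation.

```python
def group_tweets(tweets, tweets_per_doc):
    """Concatenate fixed-size groups of tweets into training documents."""
    return [" ".join(tweets[i:i + tweets_per_doc])
            for i in range(0, len(tweets), tweets_per_doc)]

# e.g. 8,000 tweets in a category with 100 tweets per document -> 80 documents
docs = group_tweets(["tweet %d" % i for i in range(8000)], 100)
print(len(docs))  # 80
```

Varying `tweets_per_doc` reproduces the configurations mentioned in the text (80, 100 or 8,000 documents per category).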
4
Evaluation and Results
In Fig. 2 the most relevant terms have been identified for 4 different topics, which demonstrates how well the algorithm identifies the terms associated with each one. As can be seen, all of them are unambiguously related to their defined categories: Topic 1: Travel; Topic 2: Movies; Topic 9: Personal Finance; Topic 10: Pets & Animals. It should be noted that some of the initially defined categories were removed during the training of this model, due to the lack of tweets fitting into them: Pop Culture, Real Estate, Television and Events & Attractions, leaving a total of 25 categories in the system, as can be seen in Table 2.

Table 2. Final categories: Music and Radio, Fine Art, Movies, Science, Videogames, News and Politics, Books and Literature, Education, Career, Food and Drinks, Personal Health, Finance and Business, Religion and Spirituality, Family and Relationships, Home and Garden, Style and Fashion, Personal Finance, Medicine and Health, Pets and Animals, Software and Technology, Shopping, Events, Travel, Hobbies and Pop Culture, Motor Vehicles.
Thus, a total of 29 topic sets have been extracted (the 25 categories plus 4 extra topics). The reason is the observation in previous tests that there were always topics containing words that did not fit any subject and introduced noise; with the inclusion of these 4 extra topics, all these meaningless terms can be grouped into topics of their own. Once the correct generation of categorization sets has been verified, the next step is to check the effectiveness of the model when real tweets are used. The tests are performed using 1200 tweets recovered from different users, concatenated into a single document that is passed to the model. The final test results are shown in Table 3. It can be observed that the classified preferences correspond with the nature of each account.
Fig. 2. Example topics generated by NMF.
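Term lists like those in Fig. 2 can be produced by ranking each topic component of a fitted model. A small hypothetical helper (the arguments stand for `nmf.components_` and the vectorizer's vocabulary; neither name is from the paper):

```python
def top_terms(components, feature_names, n_top=5):
    """Return the n_top highest-weighted terms for each topic component."""
    topics = []
    for comp in components:
        idx = sorted(range(len(comp)), key=lambda i: comp[i], reverse=True)[:n_top]
        topics.append([feature_names[i] for i in idx])
    return topics

# toy component: one topic over a 3-term vocabulary
print(top_terms([[0.1, 0.9, 0.5]], ["hotel", "travel", "trip"], n_top=2))  # [['travel', 'trip']]
```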
4.1
Final System Integration
Having completed the research and testing process, a trained algorithm has been obtained that is capable of classifying different Twitter accounts according to the initially defined categories, and a reliable data extraction method has been developed. The next step is to create the system to be delivered to the client, which is the objective of this project. Python has been selected as the programming language, which suggests delivering the system as a Python package. Python packages are highly portable, being independent of the architecture of the machine on which they are deployed; this provides greater flexibility when selecting the methods of communication with the customer's platform. For each user, the model returns the associated categories along with the percentage weight that each category has for the user; the lower the percentage, the weaker the user's relation to the category. Only the three main categories
Table 3. Final NMF results.

Account         | Cat 1                     | Cat 2                    | Cat 3
Pontifex        | Religion and Spirituality | Family and Relationships | Software and Technology
Tesla           | Motor Vehicles            | Software and Technology  | Travel
BBCNews         | News and Politics         | Sports                   | Family and Relationships
NintendoAmerica | Videogames                | Hobbies and Pop Culture  | Sports
Theresa may     | News and Politics         | Finance and Business     | Personal Finance
Oprah           | Events                    | Family and Relationships | Sports
SkyFootball     | Sports                    | Events                   | Hobbies and Pop Culture
ScuderiaFerrari | Sports                    | Motor Vehicles           | News and Politics
IMDb            | Events                    | Movies                   | Hobbies and Pop Culture
Sciencemagazine | Science                   | Medicine and Health      | Software and Technology
Spotify         | Music and Radio           | Events                   | Family and Relationships
Airbnb          | Travel                    | Events                   | Career
are shown in the table (together with their associated percentage), as they are the most accurate for categorizing the user. The results of the final classification of the different Twitter accounts are given in Table 4.

Table 4. Final results with different accounts.

Account         | First category             | Second category                  | Third category
Tesla           | Motor Vehicles (70.92%)    | Software and Technology (10.07%) | Travel (3.86%)
realDonaldTrump | News and Politics (28.02%) | Finance and Business (11.36%)    | Sports (8.02%)
sciencemagazine | Science (24.41%)           | Medicine and Health (16.73%)     | Education (12.03%)
NintendoAmerica | Videogames (41.65%)        | Hobbies and Pop Culture (12.19%) | Sports (9.74%)
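The normalisation behind these percentages is not detailed in the paper; one plausible sketch (all names are assumptions) ranks the topic weights the model assigns to a user's document and rescales them to sum to 100%:

```python
def top_categories(weights, labels, top=3):
    """Normalise topic weights to percentages and keep the strongest ones."""
    total = sum(weights)
    ranked = sorted(zip(labels, weights), key=lambda p: p[1], reverse=True)
    return [(label, round(100 * w / total, 2)) for label, w in ranked[:top]]

print(top_categories([7.0, 1.0, 2.0], ["Motor Vehicles", "Travel", "Software"]))
# → [('Motor Vehicles', 70.0), ('Software', 20.0), ('Travel', 10.0)]
```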
5
Conclusions and Future Work
This article presents a novel approach to extracting the preferences of a Twitter profile, for use in mapping applications, by analyzing the tweets published on it. The approach has successfully defined a consistent and representative list of categories, and the necessary mechanisms for information extraction are available, both for model training and for end-user analysis. The system is unique thanks to the use of Catwigopy, and a web platform has been developed that allows customers to check the proper functioning of the system. Regarding future work, many fields of improvement and development have been identified. Tweets are not the only source of information from which the interests of a profile can be discerned: a user may write only about football but follow many news and political accounts, and with the current implementation only the Sports category could be extracted. One improvement would therefore be a model that analyzes followed users; this has already been started, by extracting the followers and creating word clouds with the most relevant ones. Similarly, hashtags also provide additional information suitable for analysis. Another line of work is the training of a model that analyzes tweets individually. This would open the door to polarity analysis, making it possible to know whether a user who writes about a certain category does so in a positive, negative or neutral way.

Acknowledgments. This research has been supported by the European Union's Horizon 2020 research and innovation programme under grant agreement No. 777640.
References
1. Java, A., Song, X., Finin, T., Tseng, B.: Why we Twitter: understanding microblogging usage and communities. In: Proceedings of the 9th WebKDD and 1st SNA-KDD 2007 Workshop on Web Mining and Social Network Analysis, pp. 56–65 (2007)
2. Bakshy, E., Hofman, J.M., Mason, W.A., Watts, D.J.: Everyone's an influencer: quantifying influence on Twitter. In: Proceedings of the Fourth ACM International Conference on Web Search and Data Mining, pp. 65–74 (2011)
3. Wang, H., Can, D., Kazemzadeh, A., Bar, F., Narayanan, S.: A system for real-time Twitter sentiment analysis of 2012 US presidential election cycle. In: Proceedings of the ACL 2012 System Demonstrations, pp. 115–120. Association for Computational Linguistics (2012)
4. Cotelo, J.M., Cruz, F.L., Enríquez, F., Troyano, J.A.: Tweet categorization by combining content and structural knowledge. Inf. Fus. 31, 54–64 (2016)
5. Lee, K., Palsetia, D., Narayanan, R., Patwary, Md.M.A., Agrawal, A., Choudhary, A.: Twitter trending topic classification. In: 2011 IEEE 11th International Conference on Data Mining Workshops, pp. 251–258. IEEE (2011)
6. Yale University: About Yale: Yale facts. https://www.yale.edu/about-yale/yale-facts (2017)
7. Evgenia, A., Vassilis, A., Konstantinos, D., Kosmides, P.: An offline, statistical method for cost efficient design of experiments and field trials involving electric vehicles. In: Proceedings of the 11th ITS European Congress, 06–09 June 2016, Glasgow, Scotland (2016)
Doctoral Consortium (DC)
Adaptivity as a Service (AaaS): Personalised Assistive Robotics for Ambient Assisted Living Ronnie Smith(B) Edinburgh Centre for Robotics, Edinburgh, UK [email protected]
Abstract. The need for personalised Ambient Assisted Living (AAL) solutions is widely recognised. However, many existing solutions lack flexibility in terms of long-term user-adaption, a natural effect of working within the constraints of lab-based evaluation. As such, few approaches directly address a potentially desirable heterogeneous future where AAL is accessible to all: people should be able to create solutions that work for them through bricolages of assorted off-the-shelf (OTS) devices, including robots. For AAL to succeed at scale, these devices must share experiences of user interactions within and outwith the home. Adaptivity as a Service (AaaS) sandboxes adaptivity into its own research domain and posits that adaptation should be managed by a highly specialised adaptivity service that ‘fits in’ with different AAL solutions. Keywords: Personalisation · Adaptive robotics · Ambient Assisted Living (AAL) · Human-in-the-Loop (HITL) · Digital Twin
1
Introduction
In AAL, adaptivity refers to how solutions adapt to the changing habits, situations, individual preferences and evolving needs of users. In practice, adaption might mean: changing how to respond to user commands and/or activities; how robots or other devices interact with humans physically and socially; how human intentions are perceived; and even modification of the interaction modalities themselves. Adaptivity can readily be viewed as a machine learning problem, dependent on high-quality training data. Hence, much early work on Human Activity Recognition (e.g. [1]), which provides the vital context-awareness needed to offer real-time assistance in AAL, is based on supervised learning. Likewise, interaction adaptivity is often based on Partially Observable Markov Decision Processes (POMDPs). In 'GrowMeUp', social robot decision making (e.g. next action) is tuned for each user using a POMDP trained on the positive/negative impact of actions on the user's state [2]. Historically, such data-driven approaches effectively start afresh with each new environment/user, since they do not transfer what they have learned previously to speed up the induction phase.
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2021
P. Novais et al. (Eds.): ISAmI 2020, AISC 1239, pp. 239–242, 2021. https://doi.org/10.1007/978-3-030-58356-9_23
Models are essential in many AAL approaches. For instance, in [3] activities are recognised by comparing sensor events over sliding (fixed-duration) time windows with templates of Activities of Daily Living in an ontological model. These templates specify criteria such as sensor events, duration, conditions to be met, and objects involved in an activity. Models are also used to incorporate predefined templates of potential users, such as the "dependent, assisted, at risk, and active" profiles used in the approach described in [4] to modify system behaviour. However, rigid modelling approaches are ultimately time-consuming to design, and are consequently unlikely to capture the diversity of elderly care requirements. A third category of approaches are 'hybrid' solutions, where data-driven and knowledge-driven (model-based) sources are combined to overcome the disadvantages of each. Work in [5] exemplifies the concept in the context of HAR: initial knowledge-engineered activity 'seeds' (templates) enable initial activity detection, while newly discovered activities are recorded and grouped for later labelling, often with the active participation of the end user. This expands the initial pool of activities correctly recognised by the system. 'GrowMeUp' represents another example of a hybrid approach, for the way it incorporates dynamic profiling mechanisms: pre-defined profile schemas (which specify what can be learned) are fleshed out using knowledge gathered over time by a social robot companion [6]. While hybrid approaches may ultimately be the key to long-term adaptivity, existing approaches have not taken advantage of the wealth of useful data that can be directly extracted from a population of individuals. There are also efforts to increase personalisation in the health care industry more widely.
For example, IBM and collaboration partners are currently working on ‘IBM Watson Health’ to harness the potential of a wealth of existing patient health data in a cloud-based cognitive system [7]. The implication of this for patients of the future is improved treatment aided by artificial intelligence and the consolidation of their health data that can be shared with their designated healthcare professionals throughout their life. The hypothesis advanced in this work is that all the dimensions of adaption in AAL could be delivered more effectively by adopting an Adaptivity as a Service (AaaS) approach.
2
A Proposal for Adaptivity as a Service in AAL
AaaS intends to address three types of long-term adaptation of AAL systems, namely: (1) context-awareness to account for predicted physical and mental decline, based on an individual’s known ageing-related conditions; (2) adaption in assistive functionality, including interaction modalities, to fit individual needs/wants; and (3) quick adaption to new users, based on experience. AaaS addresses these three types of adaption by using local and global levels to separate individual service delivery from centralised processing and storage. The intention is to slot into AAL platforms as a high-level intermediary between local control/decision making and devices such as robots, where it can intercept communication between the two. User models reside primarily in the global
level, where they are collocated in a population of many comparable users receiving adapted AAL. The collocation of this data seeks to accelerate learning potential and the amount of useful knowledge that can be extracted. Hybrid models should be employed in AaaS for user modelling and personalisation, with their implementation intrinsically linked. Unlike in HAR, the hybrid model in the proposed AaaS framework will be realised as a Digital Twin (DT), i.e. a digital representation of the user and high-level personal policies. AaaS relies on Human-in-the-Loop (HITL) learning to modify initial templates, where the parameters of what can be modified are defined beforehand as part of a knowledge engineering effort. The Digital Twin is a user model generated from given profile information. When a new user is created, a set of initial personal policies is generated and assigned to the DT, based on prior experience with other users of similar profile, that can be used to adapt system behaviour. Personal policies are nested high-level plans originally added by humans to a global policy bank. They describe: (1) assistive plans for execution in a robotic AAL environment, and (2) possible modifications to external plans to accommodate specific user wants/needs, including tunable parameters for (1) and (2). These high-level plans rely not on specific hardware, but on command execution via commonly available APIs, e.g. via the Robot Operating System (ROS). The DT consolidates provided and learned user information (demographic, health, care needs, preferences, etc.) in a single place. This will enable AaaS to continuously evolve its understanding of users over time and to harness that understanding for the benefit of all. The expectation is that such an approach will maximise useful knowledge retention and the opportunities arising from it, in order to improve both the long-term acceptability and the scalability of AAL solutions.
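Purely as an illustration of the data structures implied above (the paper does not prescribe an implementation, and every name here is hypothetical), a personal policy with tunable parameters attached to a Digital Twin might be sketched as:

```python
from dataclasses import dataclass, field

@dataclass
class PersonalPolicy:
    name: str
    plan: list        # high-level, hardware-agnostic assistive steps
    parameters: dict  # tunable values, adjusted through HITL feedback

@dataclass
class DigitalTwin:
    profile: dict     # demographic, health, care needs, preferences
    policies: list = field(default_factory=list)

def assign_initial_policies(twin, policy_bank, similar):
    """Seed a new user's DT with policies that worked for comparable profiles."""
    twin.policies = [p for p in policy_bank if similar(twin.profile, p)]

bank = [PersonalPolicy("morning-routine", ["wake", "remind-medication"], {"volume": 0.5})]
twin = DigitalTwin(profile={"age": 78, "conditions": ["dementia"]})
assign_initial_policies(twin, bank, lambda profile, policy: True)  # trivial matcher
print([p.name for p in twin.policies])  # ['morning-routine']
```

The `similar` matcher is the interesting research object: in AaaS it would be learned from the global population of users rather than hand-written.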
Personal policies should be updated using state-of-the-art HITL methods, including Reinforcement Learning (RL). Granular policy/scenario linkage would benefit from a spatiotemporal aspect: each policy execution is a standalone instance, which can later be grouped with like instances, since the same policy may apply under different circumstances and preferences. Over time, policies for specific scenarios settle to suit user wants, with emphasis on wants over needs. This has short- and long-term forecasting implications: short-term, AaaS should be able to assess whether actions planned by the local AAL system (which may not actively adapt) will suit the user by testing hypotheses against the DT, allowing for plan modification or suppression; long-term, it becomes possible to predict behaviour and health patterns in relation to health conditions specified in the user profile. Feeding back individual adaptions should enable automatic policy selection for new users, while enabling updates of global beliefs about personal traits and the impact of functional and cognitive impairments, via the aggregation of real-world experiences. At scale, there should be opportunities to investigate the merging of similar feedback to best reflect what has been learned, while eliminating outliers.
3
Research Questions
Although the vision for AaaS is rather extensive, research will initially focus on the underpinning scientific issues that must be addressed in order for it to become feasible. A number of key research questions are as follows:
– How can personalisation and adaptivity plans be represented and encoded at a high level?
– How can an interface for AaaS be created that enables third-party devices/platforms to integrate most easily?
– How can low-level granularity in policy/scenario linkage be achieved given the spatiotemporal aspect of human activities and daily routines?
– How can feedback from various users and sources be merged and generalised to best reflect learning from multiple similar users, while eliminating outliers?
– How does AaaS handle "unhealthy" feedback, where a policy has moulded to unhealthy individual wants?
4
Conclusion
This paper has established the fundamental principles of AaaS and highlights its novel HITL approach of a distributed architecture with local instances (individual homes) and the cloud, whereby an individual can both benefit from and contribute to a wider network of adaptivity. Future research will stem from determining the best approaches to meet key goals of AaaS. Acknowledgements. This work is supported by the Engineering and Physical Sciences Research Council (grant EP/L016834/1).
References
1. Martins, G.S., Santos, L., Dias, J.: User-adaptive interaction in social robots: a survey focusing on non-physical interaction. Int. J. Soc. Robot. 11, 185–205 (2019)
2. Martins, G.S., Al Tair, H., Santos, L., Dias, J.: aPOMDP: POMDP-based user-adaptive decision-making for social robots. Pattern Recogn. Lett. 118, 94–103 (2019). ISSN 0167-8655
3. Chen, L., Nugent, C.D., Wang, H.: A knowledge-driven approach to activity recognition in smart homes. IEEE Trans. Knowl. Data Eng. 24, 961–974 (2011)
4. Stavrotheodoros, S., Kaklanis, N., Tzovaras, D.: A personalized cloud-based platform for AAL support to cognitively impaired elderly people. IFMBE Proc. 66, 87–91 (2018). ISSN 1680-0737
5. Chen, L., Nugent, C., Okeyo, G.: An ontology-based hybrid approach to activity modeling for smart homes. IEEE Trans. Hum.-Mach. Syst. 44, 92–105 (2013)
6. Martins, G.S., Santos, L., Dias, J.: BUM: Bayesian user model for distributed social robots. In: RO-MAN, IEEE, Lisbon, August 2017, pp. 1279–1284 (2017). ISBN 978-1-5386-3518-6
7. Ahmed, M.N., Toor, A.S., O'Neil, K., Friedland, D.: Cognitive computing and the future of healthcare: the cognitive power of IBM Watson has the potential to transform global personalized medicine. IEEE Pulse 8, 4–9 (2017). ISSN 2154-2317
Time in Multi-agent Systems Niklas Fiekas(B) Department of Informatics, Clausthal University of Technology, Julius-Albert-Str. 4, 38678 Clausthal, Germany [email protected] Abstract. This is a research proposal, aiming to improve tooling and features of agent-oriented programming languages, in particular Jason, to handle virtual time and real-time deadlines. The main idea is to apply known techniques and patterns from asynchronous programming in classical programming languages, and extend Jason as necessary. Keywords: Agent-oriented programming
· Simulation · Soft real-time

1
Introduction
This early-stage proposal is motivated by two use cases that were encountered in prior research.

First, the SimSE (Simulating Software Evolution) project aims to simulate the behavior of human software developers, in order to make predictions about deadlines and software quality [1]. The environment is a graph created from the real software repository to be simulated. The developers are modelled as BDI agents in a multi-agent system, using Jason/AgentSpeak to describe plans. This kind of simulation is typically divided into discrete virtual time steps, but by default all actions and reasoning in Jason are performed instantly, with little support to model the time that a simulated human developer would take, much less to synchronize a simulated virtual clock across agents.

Second, the yearly Multi-Agent Programming Contest provides challenging scenarios to benchmark multi-agent programming languages, platforms and tools [2]. Contest games are played in discrete time steps (typically 4 s). The participants program agents that receive percepts and submit actions for each step. This is a soft real-time system. Missing the deadline to submit an action for the current step is not fatal, but it degrades the performance of the team. Furthermore, participants report a snowball effect: missing one deadline leads to a backlog of incoming percepts, and to more missed deadlines if the agent cannot catch up. It can be difficult to recover.

It seems likely that the described issues with virtual time, real time, and deadlines are common in multi-agent systems, especially when simulating real-world scenarios or moving closer to the implementation of physical systems, including ambient AI. The proposal is therefore to survey the state of solutions in multi-agent systems, and to improve them at the language and platform level.

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2021
P. Novais et al. (Eds.): ISAmI 2020, AISC 1239, pp. 243–246, 2021. https://doi.org/10.1007/978-3-030-58356-9_24
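The deadline problem described for the contest can be mitigated by bounding deliberation time and submitting a cheap fallback action, which keeps a slow step from snowballing into a backlog. A minimal sketch in Python's asyncio; the percept and action dictionaries are hypothetical stand-ins, not the actual contest protocol:

```python
import asyncio

STEP_DEADLINE = 4.0  # seconds per contest step (typical value from the text)

async def deliberate(percept):
    """Stand-in for agent reasoning; may take arbitrarily long."""
    await asyncio.sleep(percept["think_time"])
    return {"action": "move", "step": percept["step"]}

async def handle_step(percept, deadline=STEP_DEADLINE):
    """Submit the deliberated action, or a cheap fallback if the
    deadline would be missed - avoiding the backlog snowball."""
    try:
        return await asyncio.wait_for(deliberate(percept), timeout=deadline)
    except asyncio.TimeoutError:
        return {"action": "skip", "step": percept["step"]}

# One step that finishes in time and one that would overrun its deadline:
fast = asyncio.run(handle_step({"step": 1, "think_time": 0.01}))
slow = asyncio.run(handle_step({"step": 2, "think_time": 0.2}, deadline=0.05))
print(fast["action"], slow["action"])  # move skip
```

The key design point is that `asyncio.wait_for` cancels the overrunning deliberation, so the agent starts the next step with an empty queue instead of a backlog.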
2 Related Work
On the practical side, the motivating examples come from the SimSE project [1] and the yearly Multi-Agent Programming Contest. In the latter, participants describe their experience and practical issues while implementing multi-agent systems [2]. Jason 2.0 recently introduced the fork/join operators to support structured plan-level concurrency [3], and foundations for concurrent programming have been laid [4].

On the theoretical side, approaches using LTL, CTL and extensions [5] can show properties like safety and liveness (nothing bad ever happens; something good eventually happens again), but are not designed to answer whether agents can meet a particular deadline. Keeping time and achieving consensus in distributed systems comes with interesting algorithmic challenges [6], but this is not an issue with virtual time in simulated environments, nor when trying to meet the deadlines of a centralized game server (intended to simulate deadlines of local sensors and actuators). Real-time agent systems are typically not BDI-based, much less written in high-level agent languages like Jason [7].

Outside of the agent community, there is a wide array of work around asynchronous programming in classical programming languages. Many languages have adopted syntax extensions to support writing asynchronous programs in a straight-line fashion (e.g., Python, JavaScript, Rust, Kotlin). All in all, this is an active area of research (even with regard to classical programming languages). This proposal focuses on improvements to practical agent-oriented programming.
3 Proposal
The driving conjecture is that many of the techniques and patterns for asynchronous programming in classical languages can also be applied to agent-oriented languages, such as Jason. This is not obvious; consider, for example, negative results for the analogous conjecture with regard to testing and fuzzing [8].

Initial techniques to investigate are generators and coroutines. These involve computations that can be interrupted and resumed at predefined points, for example to wait for network I/O. Before using these techniques, they must be unified with the operational semantics of Jason agents. Instead of simply putting them on top, it seems possible to express the existing Jason semantics in these terms. For example, plans can also be interrupted in favor of other plans and resumed later. This is the first work package.

Existing Jason programs should be shown to be equivalent under the new semantics. This is the second work package, although it may be deferred in order to first gain more confidence in the practical relevance of the proposed semantics, with possible iterations on the design.

Many classical programming languages have added language-level features to support the mentioned techniques. The proposal therefore includes extending
the Jason language as necessary and providing a working implementation. This is a third work package, and will provide the means for further evaluation.

Generators and coroutines are frequently combined with timeouts and cancellation tokens, or used to build structured concurrency abstractions. In new Jason programs, timeouts and cancellations should interact in practically useful ways with the long-term desires and short-term intentions of agents, as well as with recovery plans. The final work package is evaluating this based on the use cases from the introduction.
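The interruption-and-resumption idea can be illustrated with plain Python generators. This is only a sketch of the concept, not the proposed Jason semantics: each yield marks a predefined point where a scheduler may suspend one plan and run another, analogous to how Jason alternates between intentions:

```python
def plan_goto(x, y):
    """A plan as a generator: each yield is a point where the
    interpreter may suspend this intention and run another."""
    yield ("move", x, 0)
    yield ("move", x, y)
    return "arrived"

def round_robin(intentions):
    """Minimal scheduler: interleave plans one step at a time,
    mirroring an interpreter alternating between intentions."""
    trace = []
    while intentions:
        gen = intentions.pop(0)
        try:
            trace.append(next(gen))    # run one step of this plan
            intentions.append(gen)     # then park it to resume later
        except StopIteration:
            pass                       # plan finished
    return trace

trace = round_robin([plan_goto(1, 1), plan_goto(2, 2)])
print(trace)  # [('move', 1, 0), ('move', 2, 0), ('move', 1, 1), ('move', 2, 2)]
```

The interleaved trace shows the essential property: neither plan runs to completion before the other is given a turn, yet each resumes exactly where it was suspended.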
4 Preliminary Results
An extensible Jason interpreter has been developed: https://github.com/niklasf/python-agentspeak. So far, the main novelty is the ability to pause the interpreter at any time, including during agent actions and even Prolog queries, and to quickly serialize the state. This is achieved by translating Jason programs to control-flow graphs with high-level instructions. The following instructions are sufficient to express all Jason plans:

- noop(agent, intention). Does nothing and always succeeds.
- push_query(query, agent, intention). Starts a Prolog query and adds it to the query stack. This is also used for actions that can yield multiple results.
- next_or_fail(agent, intention). Tries to find the next solution for the topmost Prolog query, a substitution of variables.
- pop_query(agent, intention). Removes the topmost Prolog query from the query stack.
- add_belief(term, agent, intention). Applies the current substitution to term and adds it to the belief base. Triggers a belief addition event.
- remove_belief(term, agent, intention). Unifies term with the first matching belief and removes it from the belief base. Triggers a belief removal event.
- test_belief(term, agent, intention). Tries to find a substitution such that term is a logical consequence of the belief base. Triggers a belief test event.
- call(trigger, goal_type, term, agent, intention). Tries to find a plan matching the trigger, goal type and term, and adds it as a sub-plan to the current intention.
- call_delayed(trigger, goal_type, term, agent, intention). Tries to find a matching plan and creates a new intention with it.

Initially, this was designed in order to apply data parallelism to multi-agent simulation (e.g., treating agent states as plain data and applying techniques like MapReduce to advance the simulation). However, the same design also allows bringing asynchronous programming to Jason.
This includes having agents wait for a synchronized virtual clock without wasting time in the real world, and actual asynchronous communication with the game server of the multi-agent programming contest. On the other hand, the current design is not yet satisfactory for soft real-time systems. While the instructions call, call_delayed, push_query, pop_query,
and of course trivially noop, are constant time, it is hard to predict whether queries to the belief base (add_belief, remove_belief, test_belief) and next_or_fail will complete before a given deadline. Also, importantly, the control-flow graph is currently limited to individual plans. This will not suffice, as interactions of errors, timeouts, cancellation and recovery plans appear to be essential.
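The key property of the instruction design above is that an intention is plain data that can be paused and serialized between any two instructions. A hypothetical miniature of this idea follows; the names and representation are illustrative only, not the actual python-agentspeak implementation:

```python
import pickle

# An intention's state is plain data: a program counter plus beliefs.
# Because instructions are small and atomic, the interpreter can stop
# after any one of them and serialize the state with pickle.

def noop(state):
    state["pc"] += 1

def add_belief(term):
    def instr(state):
        state["beliefs"].add(term)
        state["pc"] += 1
    return instr

program = [noop, add_belief("hungry"), noop]

def step(state):
    """Execute exactly one instruction; return False when the plan is done."""
    if state["pc"] >= len(program):
        return False
    program[state["pc"]](state)
    return True

state = {"pc": 0, "beliefs": set()}
step(state)                              # run a single instruction...
snapshot = pickle.dumps(state)           # ...then snapshot the paused state
while step(state):                       # resume and run to completion
    pass
print(state["pc"], state["beliefs"])  # 3 {'hungry'}
```

Note that only the state dictionary is serialized, never the instruction code itself, which is why the snapshot stays small and fast to produce.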
5 Evaluation Plan
To evaluate new approaches, it seems useful to return to the motivating examples. In the SimSE project, the goal is to simulate the behavior of human software developers. It will be interesting to see whether the existing agents can be simplified, and whether more detailed modelling can be supported in future iterations. For the Multi-Agent Programming Contest, participants submit the source code of their solutions and comment on difficulties that they encountered. Some teams used Jason for their solutions, so it will be possible to apply and evaluate new approaches based on these agent programs. Finally, language-level changes might impact the performance of the Jason interpreter. This can be evaluated in benchmarks and compared with previous versions and the original Jason interpreter.
References
1. Ahlbrecht, T., Dix, J., Fiekas, N., Grabowski, J., Herbold, V., Honsel, D., Waack, S., Welter, M.: Agent-based simulation for software development processes. In: Proceedings of the 14th European Conference on Multi-Agent Systems, EUMAS 2016. Springer, December 2016
2. Ahlbrecht, T., Dix, J., Fiekas, N.: The Multi-Agent Programming Contest 2018 - Agents Teaming Up in an Urban Environment. Springer (2019). https://doi.org/10.1007/978-3-030-37959-9
3. Concurrency in Jason. https://github.com/jason-lang/jason/blob/master/doc/tech/concurrency.adoc
4. Muscar, A., Badica, C.: Monadic foundations for promises in Jason. Inf. Technol. Control 43(1), 65–72. http://www.itc.ktu.lt/index.php/ITC/article/view/4586
5. Bordini, R.H., Fisher, M., Pardavila, C., Wooldridge, M.: Model checking AgentSpeak. In: Proceedings of the Second International Joint Conference on Autonomous Agents and Multiagent Systems, pp. 409–416, July 2003. https://doi.org/10.1145/860575.860641
6. Yang, T., Meng, Z., Dimarogonas, D.V., Johansson, K.H.: Global consensus for discrete-time multi-agent systems with input saturation constraints. Automatica 50, 499–506 (2014). https://doi.org/10.1016/j.automatica.2013.11.008
7. Julián, V., Botti, V.: Developing real-time multi-agent systems. Integr. Comput.-Aided Eng. 11(2), November 2002. https://content.iospress.com/articles/integrated-computer-aided-engineering/ica00172
8. Winikoff, M., Cranefield, S.: On the testability of BDI agent systems. J. Artif. Intell. Res. (2014). https://www.jair.org/index.php/jair/article/view/10903
Public Tendering Processes Based on Blockchain Technologies Yeray Mezquita(B) BISITE Research Group, University of Salamanca, Salamanca, Spain [email protected]
Abstract. The lack of transparency is one of the problems affecting public tendering processes. In this paper, we review some of the solutions to this problem based on blockchain technology and smart contracts. The conclusion is that the application of blockchain technology to public bidding processes is feasible, although the legislation of each country in which it is applied must also be taken into account.
Keywords: Blockchain · Public tendering · Review · Transparency · Corruption

1 Introduction
One of the problems of public administrations today is the lack of transparency when tendering for contracts. This lack of transparency leads to the rigging of public tenders, resulting in a market of favours that only benefits corrupt politicians and their cronies [2,6,8,9,11–15,20–23,39].

In the literature, in order to increase the transparency and security of different kinds of platforms involving actors with disparate interests, it has been proposed to make use of blockchain technology [1,3,5,24,30–32,35,38]. Blockchain technology allows the implementation of a distributed ledger underlying the platform, in which the stored data is immutable. Besides being used to store economic transactions [7,10,17,19,26,27,29,34,36,40], it can store all kinds of data, including the virtualization of real assets through smart contracts [4,16,18,28,33,37,42].

To improve the public tendering process, [25] studies how to design an architecture based on blockchain and governed by smart contracts. The experiments carried out in it show that the tendering scheme can be made fully open, autonomous, fair and transparent within the Ethereum blockchain. Although it may seem feasible, [41] studies the problems and challenges of adoption in the case of South Africa: an aversion to new technologies, integration with legacy systems, the cost of adoption, and gaining stakeholder support.

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2021
P. Novais et al. (Eds.): ISAmI 2020, AISC 1239, pp. 247–250, 2021. https://doi.org/10.1007/978-3-030-58356-9_25
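A basic building block behind transparent blockchain tendering designs is commit-reveal bidding: each bidder first publishes only a hash commitment of its bid, then reveals the bid after the deadline, so no bid can be changed once the tender closes. The following is an off-chain Python sketch of that primitive only; the function names are illustrative, and on-chain schemes such as the one in [25] implement this logic in Ethereum smart contracts:

```python
import hashlib
import secrets

def commit(bid: int) -> tuple:
    """Bidder side: produce a commitment digest and a private nonce.
    Only the digest is published before the deadline."""
    nonce = secrets.token_bytes(16)
    digest = hashlib.sha256(nonce + str(bid).encode()).hexdigest()
    return digest, nonce

def reveal_ok(digest: str, bid: int, nonce: bytes) -> bool:
    """Verifier side: check that the revealed bid matches the
    earlier commitment, so altered bids are detected."""
    return hashlib.sha256(nonce + str(bid).encode()).hexdigest() == digest

digest, nonce = commit(1_000)
print(reveal_ok(digest, 1_000, nonce))  # True  - honest reveal verifies
print(reveal_ok(digest, 900, nonce))    # False - a changed bid is rejected
```

The random nonce prevents an observer from brute-forcing small bid values from the published digest before the reveal phase.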
2 Conclusions
This work has addressed the problem of transparency in public tenders. To this end, several works that tackle this problem on different types of platforms have been studied; they show a clear trend towards applying blockchain technology to such systems. In addition, the feasibility of designing and implementing this kind of solution in a real environment has been studied, although without taking into account barriers such as national legislation and the aversion to new technologies that people usually have.

Acknowledgements. The research of Yeray Mezquita is supported by a pre-doctoral fellowship from the University of Salamanca and Banco Santander. Also, this work has been partially supported by the Salamanca Ciudad de Cultura y Saberes Foundation under the Talent Attraction Program (CHROMOSOME project).
References
1. Baruque, B., Corchado, E., Mata, A., Corchado, J.M.: A forecasting solution to the oil spill problem based on a hybrid intelligent system. Inf. Sci. 180(10), 2029–2043 (2010)
2. Boehm, F., Olaya, J.: Corruption in public contracting auctions: the role of transparency in bidding processes. Ann. Publ. Coop. Econ. 77(4), 431–452 (2006)
3. Casado-Vara, R., Prieto, J., De la Prieta, F., Corchado, J.M.: How blockchain improves the supply chain: case study alimentary supply chain. Procedia Comput. Sci. 134, 393–398 (2018)
4. Casado-Vara, R., Martin-del Rey, A., Affes, S., Prieto, J., Corchado, J.M.: IoT network slicing on virtual layers of homogeneous data for improved algorithm operation in smart buildings. Future Gener. Comput. Syst. 102, 965–977 (2020)
5. Chamoso, P., Pérez-Ramos, H., García-García, Á.: Altair: supervised methodology to obtain retinal vessels caliber. ADCAIJ: Adv. Distrib. Comput. Artif. Intell. J. 3(4), 48–57 (2014)
6. Corchado, J.M., Aiken, J.: Hybrid artificial intelligence methods in oceanographic forecast models. IEEE Trans. Syst. Man Cybern. Part C (Appl. Rev.) 32(4), 307–313 (2002)
7. Corchado, J.M., Corchado, E.S., Aiken, J., Fyfe, C., Fernandez, F., Gonzalez, M.: Maximum likelihood Hebbian learning based retrieval method for CBR systems. In: International Conference on Case-Based Reasoning, pp. 107–121. Springer (2003)
8. Corchado, J.M., Fyfe, C.: Unsupervised neural method for temperature forecasting. Artif. Intell. Eng. 13(4), 351–357 (1999)
9. Corchado, J.M., Lees, B.: A hybrid case-based model for forecasting. Appl. Artif. Intell. 15(2), 105–127 (2001)
10. Corchado, J.M., Pavón, J., Corchado, E.S., Castillo, L.F.: Development of CBR-BDI agents: a tourist guide application. In: European Conference on Case-Based Reasoning, pp. 547–559. Springer (2004)
11. Coria, J.A.G., Castellanos-Garzón, J.A., Corchado, J.M.: Intelligent business processes composition based on multi-agent systems. Expert Syst. Appl. 41(4), 1189–1205 (2014)
12. Dargham, J.A., Chekima, A., Moung, E.G., Omatu, S.: The effect of training data selection on face recognition in surveillance application. In: Distributed Computing and Artificial Intelligence, 12th International Conference, pp. 227–234. Springer (2015)
13. Di Giammarco, G., Di Mascio, T., Di Mauro, M., Tarquinio, A., Vittorini, P., et al.: SmartHeart CABG Edu (2015)
14. Díaz, F., Fdez-Riverola, F., Corchado, J.M.: gene-CBR: a case-based reasoning tool for cancer diagnosis using microarray data sets. Comput. Intell. 22(3–4), 254–268 (2006)
15. Fazekas, M., Kocsis, G.: Uncovering high-level corruption: cross-national objective corruption risk indicators using public procurement data. Br. J. Polit. Sci. 1–10 (2017)
16. Fdez-Riverola, F., Corchado, J.M.: FSfRT: forecasting system for red tides. Appl. Intell. 21(3), 251–264 (2004)
17. Fdez-Riverola, F., Iglesias, E.L., Díaz, F., Méndez, J.R., Corchado, J.M.: Applying lazy learning algorithms to tackle concept drift in spam filtering. Expert Syst. Appl. 33(1), 36–48 (2007)
18. Fdez-Riverola, F., Iglesias, E.L., Díaz, F., Méndez, J.R., Corchado, J.M.: SpamHunting: an instance-based reasoning system for spam labelling and filtering. Decis. Support Syst. 43(3), 722–736 (2007)
19. Fernández-Riverola, F., Diaz, F., Corchado, J.M.: Reducing the memory size of a fuzzy case-based reasoning system applying rough set techniques. IEEE Trans. Syst. Man Cybern. Part C (Appl. Rev.) 37(1), 138–146 (2006)
20. Ferreira, A.S., Aurora, P., Gonçalves, R.A.: An ant colony based hyper-heuristic approach for the set covering problem. ADCAIJ: Adv. Distrib. Comput. Artif. Intell. J. 4(1), 1–21 (2015)
21. Gómez Zotano, M., Gómez-Sanz, J., Pavón, J., et al.: User behavior in mass media websites (2015)
22. González-Briones, A., Prieto, J., De La Prieta, F., Herrera-Viedma, E., Corchado, J.M.: Energy optimization using a case-based reasoning strategy. Sensors 18(3), 865 (2018)
23. Griol, D., Molina, J., et al.: Measuring the differences between human-human and human-machine dialogs. ADCAIJ: Adv. Distrib. Comput. Artif. Intell. J. 4(2), 99 (2015)
24. Guillén, J.H., del Rey, A.M., Casado-Vara, R.: Security countermeasures of a SCIRAS model for advanced malware propagation. IEEE Access 7, 135472–135478 (2019)
25. Hardwick, F.S., Akram, R.N., Markantonakis, K.: Fair and transparent blockchain based tendering framework - a step towards open governance. In: 2018 17th IEEE International Conference on Trust, Security and Privacy in Computing and Communications/12th IEEE International Conference on Big Data Science and Engineering (TrustCom/BigDataSE), pp. 1342–1347. IEEE (2018)
26. Li, T., Sun, S., Bolić, M., Corchado, J.M.: Algorithm design for parallel implementation of the SMC-PHD filter. Sig. Process. 119, 115–127 (2016)
27. Li, T., Sun, S., Corchado, J.M., Siyau, M.F.: A particle dyeing approach for track continuity for the SMC-PHD filter. In: 17th International Conference on Information Fusion (FUSION), pp. 1–8. IEEE (2014)
28. Lima, A.C.E., de Castro, L.N., Corchado, J.M.: A polarity analysis framework for Twitter messages. Appl. Math. Comput. 270, 756–767 (2015)
29. Matsui, K., Kimura, K., Pérez, A.: Control prosody using multi-agent system. ADCAIJ: Adv. Distrib. Comput. Artif. Intell. J. 2(4), 49–56 (2013)
30. Mendez, J.R., Fdez-Riverola, F., Diaz, F., Iglesias, E.L., Corchado, J.M.: A comparative performance study of feature selection methods for the anti-spam filtering domain. In: Industrial Conference on Data Mining, pp. 106–120. Springer (2006)
31. Mezquita, Y., Casado, R., Gonzalez-Briones, A., Prieto, J., Corchado, J.M.: Blockchain technology in IoT systems: review of the challenges. Ann. Emerg. Technol. Comput. (AETiC) 3(5), 17–24 (2019)
32. Mezquita, Y., Gazafroudi, A.S., Corchado, J., Shafie-Khah, M., Laaksonen, H., Kamišalić, A.: Multi-agent architecture for peer-to-peer electricity trading based on blockchain technology. In: 2019 XXVII International Conference on Information, Communication and Automation Technologies (ICAT), pp. 1–6. IEEE (2019)
33. Mezquita, Y., Valdeolmillos, D., González-Briones, A., Prieto, J., Corchado, J.M.: Legal aspects and emerging risks in the use of smart contracts based on blockchain. In: International Conference on Knowledge Management in Organizations, pp. 525–535. Springer (2019)
34. Morente-Molinera, J.A., Kou, G., González-Crespo, R., Corchado, J.M., Herrera-Viedma, E.: Solving multi-criteria group decision making problems under environments with a high number of alternatives using fuzzy ontologies and multi-granular linguistic modelling methods. Knowl.-Based Syst. 137, 54–64 (2017)
35. Salazar, R., Rangel, J.C., Pinzón, C., Rodríguez, A.: Irrigation system through intelligent agents implemented with Arduino technology (2013)
36. Sergio, A., Carvalho, S., Marco, R.: On the use of compact approaches in evolution strategies. ADCAIJ: Adv. Distrib. Comput. Artif. Intell. J. 3(4), 13–23 (2014)
37. Tapia, D.I., Corchado, J.M.: An ambient intelligence based multi-agent system for Alzheimer health care. Int. J. Amb. Comput. Intell. (IJACI) 1(1), 15–26 (2009)
38. Tapia, D.I., Fraile, J.A., Rodríguez, S., Alonso, R.S., Corchado, J.M.: Integrating hardware agents into an enhanced multi-agent architecture for ambient intelligence systems. Inf. Sci. 222, 47–65 (2013)
39. Trindade, N., Antunes, L.: An architecture for agent's risk perception. ADCAIJ: Adv. Distrib. Comput. Artif. Intell. J. 2(2), 75–85 (2013)
40. Valdeolmillos, D., Mezquita, Y., González-Briones, A., Prieto, J., Corchado, J.M.: Blockchain technology: a review of the current challenges of cryptocurrency. In: International Congress on Blockchain and Applications, pp. 153–160. Springer (2019)
41. Williams-Elegbe, S.: Public procurement, corruption and blockchain technology in South Africa: a preliminary legal inquiry. In: Regulating Public Procurement in Africa for Development in Uncertain Times. LexisNexis (2019)
42. Závodská, A., Šramová, V., Anne-Maria, A.: Knowledge in value creation process for increasing competitive advantage. ADCAIJ: Adv. Distrib. Comput. Artif. Intell. J. 1(3), 35–47 (2012)
Low-Power Distributed AI and IoT for Measuring Lamb’s Milk Ingestion and Predicting Meat Yield and Malnutrition Diseases Ricardo S. Alonso(B) BISITE Research Group, University of Salamanca, Edificio Multiusos I+D+I, Calle Espejo 2, 37007 Salamanca, Spain [email protected]
Abstract. On most sheep dairy farms, it is common practice to separate lambs from their mothers shortly after birth and raise them by artificial lactation, i.e., with automatic lamb feeders. This work will build a low-power Distributed AI device combining IoT and Machine Learning, aimed at measuring each lamb's milk ingestion and predicting its future meat yield and possible malnutrition diseases. The device will consist of a Customized Low-Energy Computing (CLEC) unit that will identify each lamb through Bluetooth beacons, measure the amount of milk ingested by each lamb, and offer researchers and farmers predictions from ML models executed on the device itself through Distributed AI techniques. Keywords: Internet of Things · Artificial Intelligence · Recurrent Neural Networks · Precision agriculture · Smart farming
1 Introduction

Increasing the efficiency with which animals use feed is currently the subject of extensive research, both because of its relationship with competition for resources and because of its effect on the profitability of livestock farms and on environmental impact [1–5]. The development of genomics and metabolomics, and of systems for the automatic recording of multiple biological parameters (automatic intake control systems, accelerometers, temperature sensors, sensors for continuous recording of ruminal parameters, facial recognition systems, recording of milk production, automatic recording of weight, etc.), will make it possible to understand the mechanisms involved and to identify markers, genomic or biochemical, that make it possible to guide genetic selection or to develop more precise feeding strategies [6–14].

On most dairy sheep farms, it is common practice to separate the lambs from their mothers shortly after birth and raise them by artificial lactation. Most of these farms use automatic lamb feeders (which combine powdered milk and water) but do not have an individual intake control system [15–27]. The development of an automatic intake control system for these lamb feeders would represent a technological advance of unquestionable impact. On the one hand, it could warn the farmer of those animals that show abnormal intake behavior, indicative of a problem that compromises their welfare [28–34]. On the other hand, it would allow the recording of data that would make it possible to estimate the efficiency with which the animals use feed in this early stage of life, and whether this efficiency is related to efficiency in later stages (rearing, lactation) [35–47]. Furthermore, it would allow the selection of animals at early ages, increasing the efficiency of the herd [48–54].

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2021
P. Novais et al. (Eds.): ISAmI 2020, AISC 1239, pp. 251–257, 2021. https://doi.org/10.1007/978-3-030-58356-9_26

There are different methods based on Machine Learning, Edge Computing and Internet of Things techniques aimed at monitoring the status of livestock and predicting future meat and milk production [6, 53, 55–60]. However, many of these techniques are focused on cow's milk, with less research on sheep [61–66]. On the other hand, other techniques that are focused on predicting meat yield in sheep require expensive methods such as chemical analysis [67–70]. Others are focused on predicting the quality of the meat in the carcass once the growth process of the animal has been completed [71–79], not allowing correction of the feeding process to improve the productivity of the herd.
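As a simple baseline for the kind of abnormal-intake warning described above, each day's milk ingestion can be compared against a recent rolling window. The data and thresholds below are illustrative only; the proposed device would instead apply trained Recurrent Neural Network models on the CLEC unit:

```python
from statistics import mean, stdev

def flag_abnormal_intake(daily_ml, window=7, k=2.0):
    """Flag days whose milk intake deviates more than k standard
    deviations from the mean of the preceding window of days.
    Illustrative baseline, not the proposed RNN-based method."""
    flags = []
    for i in range(window, len(daily_ml)):
        hist = daily_ml[i - window:i]
        mu, sigma = mean(hist), stdev(hist)
        flags.append(abs(daily_ml[i] - mu) > k * sigma)
    return flags

# Seven stable days of intake (in ml), then a sharp drop on day eight:
intake = [510, 495, 505, 500, 498, 502, 507, 310]
print(flag_abnormal_intake(intake))  # [True]
```

Even this naive rule captures the essential requirement: the warning uses only per-lamb intake history already collected by the feeder, so it can run locally without any cloud connectivity.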
2 Conclusions

This work will build a low-power Distributed AI device combining IoT and Machine Learning, aimed at measuring each lamb's milk ingestion and predicting its future meat yield and possible malnutrition diseases. The device will consist of a Customized Low-Energy Computing (CLEC) unit that will identify each lamb through Bluetooth beacons, measure the amount of milk ingested by each lamb, and offer researchers and farmers predictions from ML models executed on the device itself through Distributed AI techniques.

This research will be innovative for several reasons. Firstly, it will initially be focused on sheep, where there is less research than on cattle. Secondly, the device will be highly non-invasive, as it will only require connection to the milk outlet connectors of the artificial lamb feeder. In addition, innovative Deep Learning models based on Recurrent Neural Networks, which have been used in other settings such as poultry weight prediction, will be applied. The Machine Learning models will be executed on the Distributed Artificial Intelligence devices themselves, so that connectivity between the devices and a remote Cloud platform will not be necessary. This is especially beneficial in livestock environments, typically located in remote rural areas where Internet connectivity can suffer from outages and coverage problems.

Acknowledgments. This work has been partially supported by the European Regional Development Fund (ERDF) through the Interreg Spain-Portugal V-A Program (POCTEP) under grant 0677_DISRUPTIVE_2_E (Intensifying the activity of Digital Innovation Hubs within the PocTep region to boost the development of disruptive and last generation ICTs through cross-border cooperation).
References
1. Shahinfar, S., Kelman, K., Kahn, L.: Prediction of sheep carcass traits from early-life records using machine learning. Comput. Electron. Agric. 156, 159–177 (2019)
2. da Rosa Righi, R., Goldschmidt, G., Kunst, R., Deon, C., da Costa, C.A.: Towards combining data prediction and internet of things to manage milk production on dairy cows. Comput. Electron. Agric. 169, 105156 (2020)
3. Justice, S.M.M., Britt, J., Miller, M., Greene, M., Davis, C., Koch, B., Jesch, E.: Predictions of lean meat yield in lambs using DEXA and proximate chemical analyses. Meat Muscle Biol. 2(2), 184 (2019)
4. Savoia, S., Albera, A., Brugiapaglia, A., Di Stasio, L., Ferragina, A., Cecchinato, A., Bittante, G.: Prediction of meat quality traits in the abattoir using portable and hand-held near-infrared spectrometers. Meat Sci. 161, 108017 (2020)
5. Johansen, S.V., Bendtsen, J.D., Martin, R., Mogensen, J.: Broiler weight forecasting using dynamic neural network models with input variable selection. Comput. Electron. Agric. 159, 97–109 (2019)
6. Li, T., Sun, S., Corchado, J.M., Siyau, M.F.: A particle dyeing approach for track continuity for the SMC-PHD filter. In: 17th International Conference on Information Fusion (FUSION), pp. 1–8. IEEE, July 2014
7. Wang, X., Tarrío, P., Bernardos, A.M., Metola, E., Casar, J.R.: User-independent accelerometer-based gesture recognition for mobile devices. ADCAIJ: Adv. Distrib. Comput. Artif. Intell. J. 1(3) (2012). ISSN 2255-2863
8. Urbano, J., Cardoso, H.L., Rocha, A.P., Oliveira, E.: Trust and normative control in multi-agent systems. ADCAIJ: Adv. Distrib. Comput. Artif. Intell. J. 1(1) (2012). ISSN 2255-2863
9. Oliveira, T., Neves, J., Novais, P.: Guideline formalization and knowledge representation for clinical decision support. ADCAIJ: Adv. Distrib. Comput. Artif. Intell. J. 1(2) (2012). ISSN 2255-2863
10. Fdez-Riverola, F., Iglesias, E.L., Díaz, F., Méndez, J.R., Corchado, J.M.: Applying lazy learning algorithms to tackle concept drift in spam filtering. Expert Syst. Appl. 33(1), 36–48 (2007)
11. Aige, M.B.: The online tourist fraud: the new measures of technological investigation in Spain. ADCAIJ: Adv. Distrib. Comput. Artif. Intell. J. 6(2) (2017). ISSN 2255-2863
12. Morente-Molinera, J.A., Kou, G., González-Crespo, R., Corchado, J.M., Herrera-Viedma, E.: Solving multi-criteria group decision making problems under environments with a high number of alternatives using fuzzy ontologies and multi-granular linguistic modelling methods. Knowl.-Based Syst. 137, 54–64 (2017)
13. Carneiro, D., Araújo, D., Pimenta, A., Novais, P.: Real time analytics for characterizing the computer user's state. ADCAIJ: Adv. Distrib. Comput. Artif. Intell. J. 5(4) (2016). ISSN 2255-2863
14. Alonso, R.S., García, Ó., Saavedra, A., Tapia, D.I., de Paz, J.F., Corchado, J.M.: Heterogeneous wireless sensor networks in a tele-monitoring system for homecare. In: International Work-Conference on Artificial Neural Networks, pp. 663–670 (2009)
15. Alonso, R.S., García, O., Zato, C., Gil, O., De la Prieta, F.: Intelligent agents and wireless sensor networks: a healthcare telemonitoring system. In: Trends in Practical Applications of Agents and Multiagent Systems, pp. 429–436. Springer (2010)
16. Li, T., Sun, S., Bolić, M., Corchado, J.M.: Algorithm design for parallel implementation of the SMC-PHD filter. Sig. Process. 119, 115–127 (2016)
17. Coria, J.A.G., Castellanos-Garzón, J.A., Corchado, J.M.: Intelligent business processes composition based on multi-agent systems. Expert Syst. Appl. 41(4), 1189–1205 (2014)
18. Silva, A., Oliveira, T., Neves, J., Novais, P.: Treating colon cancer survivability prediction as a classification problem. ADCAIJ: Adv. Distrib. Comput. Artif. Intell. J. 5(1) (2016). (ISSN 2255-2863) 19. Tapia, D.I., Fraile, J.A., Rodríguez, S., Alonso, R.S., Corchado, J.M.: Integrating hardware agents into an enhanced multi-agent architecture for Ambient Intelligence systems. Inf. Sci. 222, 47–65 (2013) 20. Corchado, J.M., Pavón, J., Corchado, E.S., Castillo, L. F.: Development of CBR-BDI agents: a tourist guide application. In: European Conference on Case-based Reasoning, pp. 547–559. Springer, Heidelberg, August 2004 21. Lima, A.C.E., de Castro, L.N., Corchado, J.M.: A polarity analysis framework for Twitter messages. Appl. Math. Comput. 270, 756–767 (2015) 22. Nihan, C.E.: Healthier? More Efficient? Fairer? An overview of the main ethical issues raised by the use of ubicomp in the workplace. ADCAIJ: Adv. Distrib. Comput. Artif. Intell. J. 2(1) (2013). (ISSN 2255-2863) 23. Macek, K., Rojicek, J., Kontes, G.D., Rovas, D.V.: Black-box optimization for buildings and its enhancement by advanced communication infrastructure. ADCAIJ: Adv. Distrib. Comput. Artif. Intell. J. 2(2) (2013). (ISSN 2255-2863) 24. Fdez-Riverola, F., Corchado, J.M.: FSfRT: forecasting system for red tides. Appl. Intell. 21(3), 251–264 (2004) 25. Alonso, R.S., Prieto, J., García, O., Corchado, J.M.: Collaborative learning via social computing. Front. Inf. Technol. Electron. Eng. 20(2), 265–282 (2019) 26. Alonso, R.S., Sittón-Candanedo, I., García, Ó., Prieto, J., Rodríguez-González, S.: An intelligent edge-IoT platform for monitoring livestock and crops in a dairy farming scenario. Ad Hoc Netw. 98, 102047 (2020) 27. Fdez-Riverola, F., Iglesias, E.L., Díaz, F., Méndez, J.R., Corchado, J.M.: SpamHunting: an instance-based reasoning system for spam labelling and filtering. Decis. Support Syst. 43(3), 722–736 (2007) 28. 
Casado-Vara, R., Martin-del Rey, A., Affes, S., Prieto, J., Corchado, J.M.: IoT network slicing on virtual layers of homogeneous data for improved algorithm operation in smart buildings. Future Gener. Comput. Syst. 102, 965–977 (2020) 29. Ueno, M., Suenaga, T., Isahara, H.: Classification of two comic books based on convolutional neural networks. ADCAIJ: Adv. Distrib. Comput. Artif. Intell. J. 6(1) (2017). (ISSN 2255-2863) 30. Baruque, B., Corchado, E., Mata, A., Corchado, J.M.: A forecasting solution to the oil spill problem based on a hybrid intelligent system. Inf. Sci. 180(10), 2029–2043 (2010) 31. Casado-Vara, R., Prieto, J., De la Prieta, F., Corchado, J.M.: How blockchain improves the supply chain: case study alimentary supply chain. Proc. Comput. Sci. 134, 393–398 (2018) 32. Silva, F., Analide, C.: Tracking context-aware well-being through intelligent environments. ADCAIJ: Adv. Distrib. Comput. Artif. Intell. J. 4(2) (2015). (ISSN 2255-2863) 33. Li, T., Sun, S.: Online adapting the magnitude of target birth intensity in the PHD filter. ADCAIJ: Adv. Distrib. Comput. Artif. Intell. J. 2(4) (2013). (ISSN 2255-2863) 34. Corchado, J.M., Aiken, J.: Hybrid artificial intelligence methods in oceanographic forecast models. IEEE Trans. Syst. Man Cybern. Part C (Appl. Rev.) 32(4), 307–313 (2002) 35. González-Briones, A., Prieto, J., De La Prieta, F., Herrera-Viedma, E., Corchado, J.M.: Energy optimization using a case-based reasoning strategy. Sensors 18(3), 865 (2018) 36. Díaz, F., Fdez-Riverola, F., Corchado, J.M.: gene-CBR: a case-based reasoning tool for cancer diagnosis using microarray data sets. Comput. Intell. 22(3–4), 254–268 (2006) 37. Corchado, J.M., Corchado, E.S., Aiken, J., Fyfe, C., Fernandez, F., Gonzalez, M.: Maximum likelihood Hebbian learning based retrieval method for CBR systems. In: International Conference on Case-Based Reasoning, pp. 107–121. Springer, Heidelberg, June 2003
Low-Power Distributed AI and IoT for Measuring Lamb’s Milk Ingestion
255
38. Alonso, R.S., Sittón-Candanedo, I., Rodríguez-González, S., García, Ó., Prieto, J.: A survey on software-defined networks and edge computing over IoT. In: International Conference on Practical Applications of Agents and Multi-Agent Systems, pp. 289–301 (2019) 39. Alonso, R.S., Tapia, D.I., Bajo, J., García, Ó., De Paz, J.F., Corchado, J.M.: Implementing a hardware-embedded reactive agents platform based on a service-oriented architecture over heterogeneous wireless sensor networks. Ad Hoc Netw. 11(1), 151–166 (2013) 40. Martínez Martín, E., Escrig Monferrer, M.T., Del Pobil, A.P.: A qualitative acceleration model based on intervals. ADCAIJ: Adv. Distrib. Comput. Artif. Intell. J. 2(2) (2013). (ISSN 2255-2863) 41. Guillén, J.H., del Rey, A.M., Casado-Vara, R.: Security countermeasures of a SCIRAS model for advanced malware propagation. IEEE Access 7, 135472–135478 (2019) 42. Corchado, J.M., Lees, B.: A hybrid case-based model for forecasting. Appl. Artif. Intell. 15(2), 105–127 (2001) 43. Satoh, I.: Bio-inspired self-adaptive agents in distributed systems. ADCAIJ: Adv. Distrib. Comput. Artif. Intell. J. 1(2) (2012). (ISSN 2255-2863) 44. Fernández-Riverola, F., Diaz, F., Corchado, J.M.: Reducing the memory size of a fuzzy case-based reasoning system applying rough set techniques. IEEE Trans. Syst. Man Cybern. Part C (Appl. Rev.) 37(1), 138–146 (2006) 45. Tapia, D.I., Corchado, J.M.: An ambient intelligence based multi-agent system for Alzheimer health care. Int. J. Ambient Comput. Intell. (IJACI) 1(1), 15–26 (2009) 46. Adam, E., Grislin-Le Strugeon, E., Mandiau, R.: MAS architecture and knowledge model for vehicles data communication. ADCAIJ: Adv. Distrib. Comput. Artif. Intell. J. 1(1) (2012). (ISSN 2255-2863) 47. Corchado, J.M., Fyfe, C.: Unsupervised neural method for temperature forecasting. Artif. Intell. Eng. 13(4), 351–357 (1999) 48. 
Mendez, J.R., Fdez-Riverola, F., Diaz, F., Iglesias, E.L., Corchado, J.M.: A comparative performance study of feature selection methods for the anti-spam filtering domain. In: Industrial Conference on Data Mining, pp. 106–120. Springer, Heidelberg, July 2006 49. De Paz, J.F., Tapia, D.I., Alonso, R.S., Pinzón, C.I., Bajo, J., Corchado, J.M.: Mitigation of the ground reflection effect in real-time locating systems based on wireless sensor networks by using artificial neural networks. Knowl. Inf. Syst. 34(1), 193–217 (2013) 50. García, Ó., Alonso, R.S., Martínez, D.I.T., Guevara, F., De La Prieta, F., Bravo, R.A.: Wireless sensor networks and real-time locating systems to fight against maritime piracy. IJIMAI 1(5), 14–21 (2012) 51. Mata, A., Corchado, J.M.: Forecasting the probability of finding oil slicks using a CBR system. Expert Syst. Appl. 36(4), 8239–8246 (2009) 52. Chamoso, P., González-Briones, A., Rodríguez, S., Corchado, J.M.: Tendencies of technologies and platforms in smart cities: a state-of-the-art review. Wirel. Commun. Mob. Comput. 2018, 17 (2018) 53. Glez-Bedia, M., Corchado, J.M., Corchado, E.S., Fyfe, C.: Analytical model for constructing deliberative agents. Eng. Intell. Syst. Electr. Eng. Commun. 10(3), 173–185 (2002) 54. Ochoa-Aday, L., Cervelló-Pastor, C., Fernández-Fernández, A.: Discovering the network topology: an efficient approach for SDN. ADCAIJ: Adv. Distrib. Comput. Artif. Intell. J. 5(2) (2016). (ISSN 2255-2863) 55. Fyfe, C., Corchado, J.M.: Automating the construction of CBR systems using Kernel methods. Int. J. Intell. Syst. 16(4), 571–586 (2001) 56. Choon, Y.W., Mohamad, M.S., Safaai Deris, R.M., Illias, C.K.C., Chai, L.E., Omatu, S., Corchado, J.M.: Differential bees flux balance analysis with OptKnock for in silico microbial strains optimization. PloS One 9(7), 1–13 (2014)
57. Sittón-Candanedo, I., Alonso, R.S., Corchado, J.M., Rodríguez-González, S., Casado-Vara, R.: A review of edge computing reference architectures and a new global edge proposal. Future Gener. Comput. Syst. 99, 278–294 (2019) 58. Sittón-Candanedo, I., Alonso, R.S., García, Ó., Gil, A.B., Rodríguez-González, S.: A review on edge computing in smart energy by means of a systematic mapping study. Electronics 9(1), 48 (2020) 59. Shoeibi, N., Shoeibi, N.: Future of smart parking: automated valet parking using deep Q-learning. In: Herrera-Viedma, E., Vale, Z., Nielsen, P., Martin Del Rey, A., Casado Vara, R. (eds.) Distributed Computing and Artificial Intelligence, 16th International Conference, Special Sessions, DCAI 2019. Advances in Intelligent Systems and Computing, vol. 1004. Springer, Cham (2020) 60. Pawlewski, P., Golinska, P., Dossou, P.-E.: Application potential of agent based simulation and discrete event simulation in enterprise integration modelling concepts. ADCAIJ: Adv. Distrib. Comput. Artif. Intell. J. 1(1) (2012). (ISSN 2255-2863) 61. Martín del Rey, A., Casado Vara, R., Hernández Serrano, D.: Reversibility of symmetric linear cellular automata with radius r = 3. Mathematics 7(9), 816 (2019) 62. Ueno, M., Mori, N., Matsumoto, K.: Picture information shared conversation agent: Pictgent. ADCAIJ: Adv. Distrib. Comput. Artif. Intell. J. 1(1) (2012). (ISSN 2255-2863) 63. Griol, D., García-Herrero, J., Molina, J.M.: Combining heterogeneous inputs for the development of adaptive and multimodal interaction systems. ADCAIJ: Adv. Distrib. Comput. Artif. Intell. J. 2(3) (2013). (ISSN 2255-2863) 64. Casado-Vara, R., Novais, P., Gil, A.B., Prieto, J., Corchado, J.M.: Distributed continuous-time fault estimation control for multiple devices in IoT networks. IEEE Access 7, 11972–11984 (2019) 65. Vilaro, A., Orero, P.: User-centric cognitive assessment. Evaluation of attention in special working centres: from paper to Kinect. ADCAIJ: Adv. Distrib. Comput. Artif. Intell. 
J. 2(4) (2013). (ISSN 2255-2863) 66. Romero, S., Fardoun, H.M., Penichet, V.M.R., Gallud, J.A.: Tweacher: new proposal for online social networks impact in secondary education. ADCAIJ: Adv. Distrib. Comput. Artif. Intell. J. 2(1) (2013). (ISSN 2255-2863) 67. Sittón-Candanedo, I., Alonso, R.S., García, Ó., Muñoz, L., Rodríguez-González, S.: Edge computing, IoT and social computing in smart energy scenarios. Sensors 19(15), 3353 (2019) 68. Shoeibi, N., Karimi, F., Corchado, J.M.: Artificial intelligence as a way of overcoming visual disorders: damages related to visual cortex, optic nerves and eyes. In: Herrera-Viedma, E., Vale, Z., Nielsen, P., Martin Del Rey, A., Casado Vara, R. (eds.) Distributed Computing and Artificial Intelligence, 16th International Conference, Special Sessions, DCAI 2019. Advances in Intelligent Systems and Computing, vol. 1004. Springer, Cham (2020) 69. Tapia, D.I., Alonso, R.S., García, Ó., de la Prieta, F., Pérez-Lancho, B.: Cloud-IO: cloud computing platform for the fast deployment of services over wireless sensor networks. In: 7th International Conference on Knowledge Management in Organizations: Service and Cloud Computing, pp. 493–504 (2013) 70. Fuentes, D., Laza, R., Pereira, A.: Intelligent devices in rural wireless networks. ADCAIJ: Adv. Distrib. Comput. Artif. Intell. J. 2(4) (2013). (ISSN 2255-2863) 71. Macintosh, A., Feisiyau, M., Ghavami, M.: Impact of the mobility models, route and link connectivity on the performance of position based routing protocols. ADCAIJ: Adv. Distrib. Comput. Artif. Intell. J. 3(1) (2014). (ISSN 2255-2863) 72. Casado-Vara, R., Chamoso, P., De la Prieta, F., Prieto, J., Corchado, J.M.: Non-linear adaptive closed-loop control system for improved efficiency in IoT-blockchain management. Inf. Fusion 49, 227–239 (2019)
73. Alam, N., Sultana, M., Alam, M.S., Al-Mamun, M.A., Hossain, M.A.: Optimal intermittent dose schedules for chemotherapy using genetic algorithm. ADCAIJ: Adv. Distrib. Comput. Artif. Intell. J. 2(2) (2013). (ISSN 2255-2863) 74. Tapia, D.I., Alonso, R.S., Rodríguez, S., de Paz, J.F., González, A., Corchado, J.M.: Embedding reactive hardware agents into heterogeneous sensor networks. In: 2010 13th International Conference on Information Fusion, pp. 1–8 (2010) 75. Tapia, D.I., Bajo, J., De Paz, J.F., Alonso, R.S., Rodríguez, S., Corchado, J.M.: Using multilayer perceptrons to enhance the performance of indoor RTLS. In: Proceedings of the Progress in Artificial Intelligence Workshop: Ambient Intelligence Environments, EPIA 2011 (2011) 76. Magaña, V.C., Organero, M.M., Álvarez-García, J.A., Rodríguez, J.Y.F.: Design of a speed assistant to minimize the driver stress. ADCAIJ: Adv. Distrib. Comput. Artif. Intell. J. 6(3) (2017). (ISSN 2255-2863) 77. Marín, P.A.R., Giraldo, M., Tabares, V., Duque, N., Ovalle, D.: Educational resources recommendation system for a heterogeneous student group. ADCAIJ: Adv. Distrib. Comput. Artif. Intell. J. 5(3) (2016). (ISSN 2255-2863) 78. Desquesnes, G., Lozenguez, G., Doniec, A., Duviella, É.: Planning large systems with MDPs: case study of inland waterways supervision. ADCAIJ: Adv. Distrib. Comput. Artif. Intell. J. 5(4) (2016). (ISSN 2255-2863) 79. Oliver, M., Molina, J.P., Fernández-Caballero, A., González, P.: Collaborative computer-assisted cognitive rehabilitation system. ADCAIJ: Adv. Distrib. Comput. Artif. Intell. J. 6(3) (2017). (ISSN 2255-2863)
Clifford Algebras: A Proposal Towards Improved Image Recognition in Machine Learning

David García-Retuerta

University of Salamanca, Patio de Escuelas Menores, 37008 Salamanca, Spain
[email protected]
https://bisite.usal.es/en/group/team/David

Abstract. Machine learning algorithms are designed to autonomously learn general rules from a set of examples. The importance of this task lies in its potential to provide predictions about the future and the past, as well as to improve the interpretability of the data. RGB images have proven to be a challenging topic for neural networks, as their 3 dimensions (Red, Green and Blue) have to be processed using mathematical techniques designed for 1-dimensional inputs. However, an implementation of neural networks using Clifford algebras can speed up the processing time and improve the performance, as the resulting network is based on a 4-dimensional space.
Keywords: Clifford algebras · Machine learning · Image recognition

1 Introduction
Clifford algebras are important associative algebras in mathematics [1–6]. They are unital associative algebras generated by a vector space equipped with a quadratic form. This field of study is strongly connected with the theory of quadratic forms and orthogonal transformations [7–10]. One of their most important applications is digital image processing, as they allow the use of quaternions when processing an image's pixels [11–16].

One particularly promising use case of Clifford algebras in image processing is the use of quaternions (and the associated Clifford theory) in the activation function of the perceptrons of an Artificial Neural Network (ANN) [15,17–20]. This application gave rise to the term Clifford neural networks (Clifford NN). The basic idea is to extend the number field from the traditionally used real numbers to quaternions q = a + ib + jc + kd, although it is also possible to simply extend the dimensionality of the weights and threshold values from 1 dimension to n-dimensional real-valued vectors (this approach is not studied in this article) [21–23].

Simple networks are unlikely to benefit much from the advantages of more complex algebras, as they have already been greatly optimised using the nowadays-standard methods. However, multi-layered Clifford NN models, for which a Clifford back-propagation learning algorithm has been derived, show promising results in recent research works. Similar ideas have been used to develop the so-called Clifford support vector machine (Clifford SVM) [24,25]. In this paper we present a new set of techniques meant to optimise signal and image processing, whose results are promising. Furthermore, fields like computer and robot vision can greatly benefit from them, as can certain control problems and the kinematics and dynamics of robots [26,27].

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2021
P. Novais et al. (Eds.): ISAmI 2020, AISC 1239, pp. 258–261, 2021. https://doi.org/10.1007/978-3-030-58356-9_27
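As a purely illustrative sketch of this quaternion extension (the function names and the component-wise "split" sigmoid are assumptions made for the example, not the implementation evaluated in the cited works), a quaternion-valued perceptron can be written in a few lines of Python. An RGB pixel (r, g, b) is embedded as the pure quaternion 0 + ir + jg + kb, the weights are quaternions combined through the Hamilton product, and the activation applies the real sigmoid to each of the 4 components:

```python
import numpy as np

def q_mult(p, q):
    """Hamilton product of two quaternions stored as arrays (a, b, c, d) ~ a + ib + jc + kd."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return np.array([a1*a2 - b1*b2 - c1*c2 - d1*d2,
                     a1*b2 + b1*a2 + c1*d2 - d1*c2,
                     a1*c2 - b1*d2 + c1*a2 + d1*b2,
                     a1*d2 + b1*c2 - c1*b2 + d1*a2])

def split_sigmoid(q):
    """'Split' activation: the real sigmoid applied to each of the 4 components."""
    return 1.0 / (1.0 + np.exp(-q))

def quaternion_perceptron(xs, ws, bias):
    """One perceptron whose inputs, weights and threshold are all quaternions."""
    s = bias.astype(float)
    for x, w in zip(xs, ws):
        s = s + q_mult(w, x)
    return split_sigmoid(s)

# An RGB pixel (r, g, b) embedded as the pure quaternion 0 + ir + jg + kb
pixel = np.array([0.0, 0.8, 0.2, 0.4])
out = quaternion_perceptron([pixel], [np.array([1.0, 0.0, 0.0, 0.0])], np.zeros(4))
```

Because the Hamilton product is non-commutative (ij = k but ji = -k), a single quaternion weight mixes the three colour channels in one multiplication, instead of treating them as independent real-valued inputs.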
2 Conclusion

This work proposes a novel technique for processing images which focuses on their colours, as it allows inputs to have up to 4 dimensions. The algorithm can be used in computer and robot vision, with results already proven in colour active contours. This paper provides a Clifford algebra-based adaptation of the activation function of the perceptrons in an ANN, which yields better network performance and greater versatility with multi-dimensional inputs. In our future work, we will extend the Clifford NN to new real-life cases which are likely to be well modelled, and will study hyper-parameter optimisation.

Acknowledgements. This paper has been partially supported by the Salamanca Ciudad de Cultura y Saberes Foundation under the Talent Attraction Programme (CHROMOSOME project).
References

1. Guillén, J.H., del Rey, A.M., Casado-Vara, R.: Security countermeasures of a SCIRAS model for advanced malware propagation. IEEE Access 7, 135472–135478 (2019) 2. Corchado, J.M., Lees, B.: A hybrid case-based model for forecasting. Appl. Artif. Intell. 15(2), 105–127 (2001) 3. Fernández-Riverola, F., Diaz, F., Corchado, J.M.: Reducing the memory size of a fuzzy case-based reasoning system applying rough set techniques. IEEE Trans. Syst. Man Cybern. Part C (Appl. Rev.) 37(1), 138–146 (2006) 4. Tapia, D.I., Corchado, J.M.: An ambient intelligence based multi-agent system for Alzheimer health care. Int. J. Ambient Comput. Intell. (IJACI) 1(1), 15–26 (2009) 5. Corchado, J.M., Fyfe, C.: Unsupervised neural method for temperature forecasting. Artif. Intell. Eng. 13(4), 351–357 (1999) 6. Mendez, J.R., Fdez-Riverola, F., Diaz, F., Iglesias, E.L., Corchado, J.M.: A comparative performance study of feature selection methods for the anti-spam filtering domain. In: Industrial Conference on Data Mining, pp. 106–120. Springer, Heidelberg, July 2006 7. Koskimäki, H., Siirtola, P.: Accelerometer vs. electromyogram in activity recognition. ADCAIJ: Adv. Distrib. Comput. Artif. Intell. J. 5(3) (2016). (ISSN 2255-2863)
260
D. García-Retuerta
8. Goyal, S., Goyal, G.K.: Machine learning ANN models for predicting sensory quality of roasted coffee flavoured sterilized drink. ADCAIJ: Adv. Distrib. Comput. Artif. Intell. J. 2(3) (2013). (ISSN 2255-2863) 9. Jiménez-Rodríguez, A., Castillo, L.F., González, M.: Studying the mechanisms of the somatic marker hypothesis in spiking neural networks (SNN). ADCAIJ: Adv. Distrib. Comput. Artif. Intell. J. 1(2) (2012). (ISSN 2255-2863) 10. Fähndrich, J., Ahrndt, S., Albayrak, S.: Formal language decomposition into semantic primes. ADCAIJ: Adv. Distrib. Comput. Artif. Intell. J. 3(1) (2014). (ISSN 2255-2863) 11. Griol, D., Molina, J.M.: A proposal to manage multi-task dialogs in conversational interfaces. ADCAIJ: Adv. Distrib. Comput. Artif. Intell. J. 5(2) (2016). (ISSN 2255-2863) 12. Santos, A., Nogueira, R., Lourenço, A.: Applying a text mining framework to the extraction of numerical parameters from scientific literature in the biotechnology domain. ADCAIJ: Adv. Distrib. Comput. Artif. Intell. J. 1(1) (2012). (ISSN 2255-2863) 13. Mata, A., Corchado, J.M.: Forecasting the probability of finding oil slicks using a CBR system. Expert Syst. Appl. 36(4), 8239–8246 (2009) 14. Chamoso, P., González-Briones, A., Rodríguez, S., Corchado, J.M.: Tendencies of technologies and platforms in smart cities: a state-of-the-art review. Wirel. Commun. Mob. Comput. 2018, 17 (2018) 15. Glez-Bedia, M., Corchado, J.M., Corchado, E.S., Fyfe, C.: Analytical model for constructing deliberative agents. Eng. Intell. Syst. Electr. Eng. Commun. 10(3), 173–185 (2002) 16. García-Retuerta, D., Bartolomé, Á., Chamoso, P., Corchado, J.M.: Counter-terrorism video analysis using hash-based algorithms. Algorithms 12(5), 110 (2019) 17. Beliz, N., Rangel, J.C., Hong, C.S.: Detecting DoS attack in web services by using an adaptive multiagent solution. ADCAIJ: Adv. Distrib. Comput. Artif. Intell. J. 1(2) (2012). (ISSN 2255-2863) 18. Fyfe, C., Corchado, J.M.: Automating the construction of CBR systems using kernel methods. Int. J. Intell. Syst. 16(4), 571–586 (2001) 19. Choon, Y.W., Mohamad, M.S., Safaai Deris, R.M., Illias, C.K.C., Chai, L.E., Omatu, S., Corchado, J.M.: Differential bees flux balance analysis with OptKnock for in silico microbial strains optimization. PloS One 9(7), e102744 (2014) 20. Hernández, G., García-Retuerta, D., Chamoso, P., Rivas, A.: Design of an AI-based workflow-guiding system for stratified sampling. In: International Symposium on Ambient Intelligence, pp. 105–111. Springer, Cham, June 2019 21. Li, T., Sun, S., Corchado, J.M., Siyau, M.F.: A particle dyeing approach for track continuity for the SMC-PHD filter. In: 17th International Conference on Information Fusion (FUSION), pp. 1–8. IEEE, July 2014 22. Martín del Rey, A., Casado Vara, R., Hernández Serrano, D.: Reversibility of symmetric linear cellular automata with radius r = 3. Mathematics 7(9), 816 (2019) 23. Casado-Vara, R., Novais, P., Gil, A.B., Prieto, J., Corchado, J.M.: Distributed continuous-time fault estimation control for multiple devices in IoT networks. IEEE Access 7, 11972–11984 (2019) 24. Loukanova, R.: Relationships between specified and underspecified quantification by the theory of acyclic recursion. ADCAIJ: Adv. Distrib. Comput. Artif. Intell. J. 5(4) (2016). (ISSN 2255-2863) 25. Matos, S., Araújo, H., Oliveira, J.L.: Biomedical literature exploration through latent semantics. ADCAIJ: Adv. Distrib. Comput. Artif. Intell. J. 2(2) (2013). (ISSN 2255-2863)
26. Casado-Vara, R., Chamoso, P., De la Prieta, F., Prieto, J., Corchado, J.M.: Non-linear adaptive closed-loop control system for improved efficiency in IoT-blockchain management. Inf. Fusion 49, 227–239 (2019) 27. Monino, J.L., Sedkaoui, S.: The algorithm of the snail: an example to grasp the window of opportunity to boost big data. ADCAIJ: Adv. Distrib. Comput. Artif. Intell. J. 5(3) (2016). (ISSN 2255-2863)
New Approach to Recommend Banking Products Through a Hybrid Recommender System

Elena Hernández Nieves

BISITE Digital Innovation Hub, University of Salamanca, Edificio Multiusos I+D+I, 37007 Salamanca, Spain
[email protected]
Abstract. This research aims to add value in the private banking sector by increasing the sale of financial products to the customer through personalised recommendations. It proposes a conceptual definition of a new process for recommending banking products in private banking. To this end, a hybrid method for recommending financial products is presented: collaborative filtering combined with content-based filtering. This task involves exploring intelligent algorithms to create the right recommendation for each client. The expected result is a prediction with a high degree of accuracy, achieved by integrating a hybrid method that ensures the personalization of the products suggested by a bank.

Keywords: Recommendation systems · Artificial Intelligence · Hybrid method · Fintech
1 Introduction

Today's financial market has evolved greatly. According to a report by Fintech Spain1, whose aim is to promote and bring financial technologies closer to the public, in 2014 there were 440 Fintech companies worldwide with a turnover of $7.47 trillion. The following year the number of Fintech companies worldwide reached 2,000, with a total turnover of $20 trillion. In Spain, where it is estimated that the market will be consolidated within 3 to 5 years, the following are considered to be growth levers: reputation and branding, user experience, the entry of investment funds that allow sufficient volume to be achieved, and greater agility in service.

The basis of this research is to overcome the current barriers and strengthen the relationship between banks and their clients when recommending financial products, by applying Artificial Intelligence techniques. This is fundamental in order to adapt, as far as possible, to the client. Recommendation systems [13–41] can adapt to the client's needs because they calculate the similarity between users and products, which makes them especially suitable for solving financial problems. Table 1 shows the recommendation approaches that have been considered for the study.

1 Fintech Inside - Fintech Spain. http://fintechspain.com/wp-content/uploads/FINTECH-Inside.pdf

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2021
P. Novais et al. (Eds.): ISAmI 2020, AISC 1239, pp. 262–266, 2021. https://doi.org/10.1007/978-3-030-58356-9_28
New Approach to Recommend Banking Products
263
Table 1. Recommendation approaches.

Approaches              | Typology     | Attributes
Collaborative filtering | Memory-based | Prompt incorporation of the newest information; unavailability of ratings to predict; difficulties in scalability
Collaborative filtering | Model-based  | Facility to recommend and personalize; data incorporation when generating the model
Content-based filtering |              | Forecasting by checking information from a resource against information describing the users' needs, priorities and patterns
The proposal is a hybrid approach that merges collaborative filtering methods with content-based techniques. The k-nearest neighbors (k-NN) algorithm will be used to make recommendations.
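As an illustration of the memory-based collaborative half of this hybrid approach (the toy rating matrix and the function names below are invented for the example, not taken from the system described here), a k-NN recommender with cosine similarity over a user-product matrix can be sketched as follows:

```python
import numpy as np

def cosine_sim(u, v):
    # Cosine similarity between two rating vectors (0 means "not rated/contracted")
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return u @ v / denom if denom else 0.0

def knn_recommend(ratings, user, k=2):
    """Rank unrated products for `user` using its k most similar users.

    ratings: (n_users, n_products) matrix of ratings, 0 = unrated.
    Returns product indices sorted by similarity-weighted neighbour ratings.
    """
    target = ratings[user]
    sims = np.array([cosine_sim(target, row) if i != user else -1.0
                     for i, row in enumerate(ratings)])
    neighbours = np.argsort(sims)[::-1][:k]          # k nearest users
    scores = sims[neighbours] @ ratings[neighbours]  # weighted vote per product
    scores[target > 0] = -np.inf                     # hide products the user already holds
    return np.argsort(scores)[::-1]

ratings = np.array([[5, 4, 0, 0],
                    [4, 5, 3, 0],
                    [0, 0, 4, 5],
                    [5, 5, 0, 1]])
print(knn_recommend(ratings, user=0, k=2)[:2])  # → [2 3]
```

In the hybrid scheme, a content-based component would complement these collaborative scores with product-attribute similarity, which helps precisely where Table 1 notes the weakness of memory-based filtering: clients or products with few ratings.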
2 Conclusion

This research proposes a methodological framework [42–58] to provide recommendations of banking products. Once the similarity function between two users has been defined, it is possible to obtain the coincidences between the defined products and the users. To test the framework, a dataset was generated that includes different groups of customers assigned to various products, following a realistic probability distribution [1–12]. It is possible to provide a one-time recommendation by selecting the product with the maximum weight, or by applying a random sample based on these recommendation weights.

Acknowledgments. This research is supported by the Ministry of Education of the Junta de Castilla y León and the European Social Fund through a predoctoral grant for the recruitment of research personnel, associated with the University of Salamanca research project "ROBIN: Robo-advisor intelligent".
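The one-time selection step described above (take the maximum-weight product, or draw a random sample in proportion to the weights) can be sketched as follows; the product names and weight values are illustrative only, not taken from the actual system:

```python
import random

def pick_product(weights, mode="max", rng=random):
    # weights: dict mapping product -> recommendation weight
    if mode == "max":
        return max(weights, key=weights.get)
    products = list(weights)
    return rng.choices(products, weights=[weights[p] for p in products], k=1)[0]

w = {"pension_plan": 0.6, "broker_account": 0.3, "deposit": 0.1}
print(pick_product(w))            # always "pension_plan", the maximum-weight product
print(pick_product(w, "sample"))  # weighted random draw among the three products
```

The weighted random mode trades a small loss in expected relevance for diversity, so repeated recommendations do not always surface the same product.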
References

1. Fernández-Isabel, A., Fuentes-Fernández, R.: Simulation of road traffic applying model-driven engineering. ADCAIJ: Adv. Distrib. Comput. Artif. Intell. J. 4(2) (2015). (ISSN 2255-2863) 2. Bicharra, A.C., Sanchez-Pi, N., Correia, L., Molina, J.M.: Multi-agent simulations for emergency situations in an airport scenario. ADCAIJ: Adv. Distrib. Comput. Artif. Intell. J. 1(3) (2012). (ISSN 2255-2863) 3. Alves, A.O., Ribeiro, B.: Consensus-based approach for keyword extraction from urban events collections. ADCAIJ: Adv. Distrib. Comput. Artif. Intell. J. 4(2) (2015). (ISSN 2255-2863)
264
E. H. Nieves
4. Pereira, A., Felisberto, F., Maduro, L., Felgueiras, M.: Fall detection on ambient assisted living using a wireless sensor network. ADCAIJ: Adv. Distrib. Comput. Artif. Intell. J. 1(1) (2012). (ISSN 2255-2863) 5. Carbó, J., Molina, J.M., Patricio, M.A.: Asset management system through the design of a Jadex agent system. ADCAIJ: Adv. Distrib. Comput. Artif. Intell. J. 5(2) (2016). (ISSN 2255-2863) 6. Baruque, B., Corchado, E., Mata, A., Corchado, J.M.: A forecasting solution to the oil spill problem based on a hybrid intelligent system. Inf. Sci. 180(10), 2029–2043 (2010) 7. Casado-Vara, R., Chamoso, P., De la Prieta, F., Prieto, J., Corchado, J.M.: Non-linear adaptive closed-loop control system for improved efficiency in IoT-blockchain management. Inf. Fusion 49, 227–239 (2019) 8. Casado-Vara, R., Martin-del Rey, A., Affes, S., Prieto, J., Corchado, J.M.: IoT network slicing on virtual layers of homogeneous data for improved algorithm operation in smart buildings. Future Gener. Comput. Syst. 102, 965–977 (2020) 9. Casado-Vara, R., Novais, P., Gil, A.B., Prieto, J., Corchado, J.M.: Distributed continuous-time fault estimation control for multiple devices in IoT networks. IEEE Access 7, 11972–11984 (2019) 10. Casado-Vara, R., Prieto, J., De la Prieta, F., Corchado, J.M.: How blockchain improves the supply chain: case study alimentary supply chain. Proc. Comput. Sci. 134, 393–398 (2018) 11. Chamoso, P., González-Briones, A., Rodríguez, S., Corchado, J.M.: Tendencies of technologies and platforms in smart cities: a state-of-the-art review. Wirel. Commun. Mob. Comput. 2018, 17 (2018) 12. Choon, Y.W., Mohamad, M.S., Safaai Deris, R.M., Illias, C.K.C., Chai, L.E., Omatu, S., Corchado, J.M.: Differential bees flux balance analysis with OptKnock for in silico microbial strains optimization. PloS One 9(7), 1–13 (2014) 13. Villavicencio, C.P., Schiaffino, S., Diaz Pace, J., Monteserin, A.: A group recommendation system for movies based on MAS. ADCAIJ: Adv. Distrib. Comput. 
Artif. Intell. J. 5(3) (2016). (ISSN 2255-2863) 14. Corchado, J.M., Aiken, J.: Hybrid artificial intelligence methods in oceanographic forecast models. IEEE Trans. Syst. Man Cybern. Part C (Appl. Rev.) 32(4), 307–313 (2002) 15. Corchado, J.M., Fyfe, C.: Unsupervised neural method for temperature forecasting. Artif. Intell. Eng. 13(4), 351–357 (1999) 16. Corchado, J.M., Lees, B.: A hybrid case-based model for forecasting. Appl. Artif. Intell. 15(2), 105–127 (2001) 17. Corchado, J.M., Corchado, E.S., Aiken, J., Fyfe, C., Fernandez, F., Gonzalez, M.: Maximum likelihood Hebbian learning based retrieval method for CBR systems. In: International Conference on Case-Based Reasoning, pp. 107–121. Springer, Heidelberg, June 2003 18. Corchado, J.M., Pavón, J., Corchado, E.S., Castillo, L.F.: Development of CBR-BDI agents: a tourist guide application. In: European Conference on Case-based Reasoning, pp. 547–559. Springer, Heidelberg, August 2004 19. Coria, J.A.G., Castellanos-Garzón, J.A., Corchado, J.M.: Intelligent business processes composition based on multi-agent systems. Expert Syst. Appl. 41(4), 1189–1205 (2014) 20. Peñaranda, C., Aguero, J., Carrascosa, C., Rebollo, M., Julián, V.: An agent-based approach for a smart transport system. ADCAIJ: Adv. Distrib. Comput. Artif. Intell. J. 5(2) (2016). (ISSN 2255-2863) 21. Lopez Sanchez, D., Gonzalez Arrieta, A.: Preliminary results on nonparametric facial occlusion detection. ADCAIJ: Adv. Distrib. Comput. Artif. Intell. J. 5(1) (2016). (ISSN 2255-2863) 22. Griol, D., Molina, J.M., De Miguel, A.S.: Developing multimodal conversational agents for an enhanced e-learning experience. ADCAIJ: Adv. Distrib. Comput. Artif. Intell. J. 3(1) (2014). (ISSN 2255-2863)
23. Díaz, F., Fdez-Riverola, F., Corchado, J.M.: gene-CBR: a case-based reasoning tool for cancer diagnosis using microarray data sets. Comput. Intell. 22(3–4), 254–268 (2006) 24. Fdez-Riverola, F., Corchado, J.M.: FSfRT: forecasting system for red tides. Appl. Intell. 21(3), 251–264 (2004) 25. Fdez-Riverola, F., Iglesias, E.L., Díaz, F., Méndez, J.R., Corchado, J.M.: Applying lazy learning algorithms to tackle concept drift in spam filtering. Expert Syst. Appl. 33(1), 36–48 (2007) 26. Fdez-Riverola, F., Iglesias, E.L., Díaz, F., Méndez, J.R., Corchado, J.M.: SpamHunting: an instance-based reasoning system for spam labelling and filtering. Decis. Support Syst. 43(3), 722–736 (2007) 27. Fernández-Riverola, F., Diaz, F., Corchado, J.M.: Reducing the memory size of a fuzzy casebased reasoning system applying rough set techniques. IEEE Trans. Syst. Man Cybern. Part C (Appl. Rev.) 37(1), 138–146 (2006) 28. Fyfe, C., Corchado, J.M.: Automating the construction of CBR systems using Kernel methods. Int. J. Intell. Syst. 16(4), 571–586 (2001) 29. Santos, G., Pinto, T., Vale, Z., Praça, I., Morais, H.: Enabling communications in heterogeneous multi-agent systems: electricity markets ontology. ADCAIJ: Adv. Distrib. Comput. Artif. Intell. J. 5(2) (2016). (ISSN 2255-2863) 30. Glez-Bedia, M., Corchado, J.M., Corchado, E.S., Fyfe, C.: Analytical model for constructing deliberative agents. Eng. Intell. Syst. Electr. Eng. Commun. 10(3), 173–185 (2002) 31. González-Briones, A., Prieto, J., De La Prieta, F., Herrera-Viedma, E., Corchado, J.M.: Energy optimization using a case-based reasoning strategy. Sensors 18(3), 865 (2018) 32. Guillén, J.H., del Rey, A.M., Casado-Vara, R.: Security countermeasures of a SCIRAS model for advanced malware propagation. IEEE Access 7, 135472–135478 (2019) 33. Bargaoui, H., Driss, O.B.: Multi-agent model based on tabu search for the permutation flow shop scheduling problem. ADCAIJ: Adv. Distrib. Comput. Artif. Intell. J. Salamanca 3(1) (2014). 
(ISSN 2255-2863) 34. Ko, H., Bae, K., Marreiros, G., Kim, H., Yoe, H., Ramos, C.: A study on the key management strategy for wireless sensor networks. ADCAIJ: Adv. Distrib. Comput. Artif. Intell. J. 3(3) (2014). (ISSN 2255-2863) 35. Román Gallego, J.Á., Rodríguez González, S.: Improvement in the distribution of services in multi-agent systems with SCODA. ADCAIJ: Adv. Distrib. Comput. Artif. Intell. J. 4(3) (2015). (ISSN 2255-2863) 36. Alemany, J., Heras, S., Palanca, J., Julián, V.: Bargaining agents based system for automatic classification of potential allergens in recipes. ADCAIJ: Adv. Distrib. Comput. Artif. Intell. J. 5(2) (2016). (ISSN 2255-2863) 37. Jimenez-Garcia, J.L., Baselga-Masia, D., Poza-Lujan, J.L., Munera, E., Posadas-Yagüe, J.L., Simó-Ten, J.E.: Smart device definition and application on embedded system: performance and optimization on a RGBD sensor. ADCAIJ: Adv. Distrib. Comput. Artif. Intell. J. 3(1) (2014). (ISSN 2255-2863) 38. Li, T., Sun, S., Bolić, M., Corchado, J.M.: Algorithm design for parallel implementation of the SMC-PHD filter. Sig. Process. 119, 115–127 (2016) 39. Li, T., Sun, S., Corchado, J.M., Siyau, M.F.: A particle dyeing approach for track continuity for the SMC-PHD filter. In: 17th International Conference on Information Fusion (FUSION), pp. 1–8. IEEE, July 2014 40. Lima, A.C.E., de Castro, L.N., Corchado, J.M.: A polarity analysis framework for Twitter messages. Appl. Math. Comput. 270, 756–767 (2015) 41. Lopes, Y., Cortés, M.I., Tavares Gonçalves, E.J., Oliveira, R.: JAMDER: JADE to multi-agent systems development resource. ADCAIJ: Adv. Distrib. Comput. Artif. Intell. J. 7(3), 63–98 (2018)
E. H. Nieves
42. Martín del Rey, A., Casado Vara, R., Hernández Serrano, D.: Reversibility of symmetric linear cellular automata with radius r = 3. Mathematics 7(9), 816 (2019) 43. Mata, A., Corchado, J.M.: Forecasting the probability of finding oil slicks using a CBR system. Expert Syst. Appl. 36(4), 8239–8246 (2009) 44. Mendez, J.R., Fdez-Riverola, F., Diaz, F., Iglesias, E.L., Corchado, J.M.: A comparative performance study of feature selection methods for the anti-spam filtering domain. In: Industrial Conference on Data Mining, pp. 106–120. Springer, Heidelberg, July 2006 45. Morente-Molinera, J.A., Kou, G., González-Crespo, R., Corchado, J.M., Herrera-Viedma, E.: Solving multi-criteria group decision making problems under environments with a high number of alternatives using fuzzy ontologies and multi-granular linguistic modelling methods. Knowl.-Based Syst. 137, 54–64 (2017) 46. Muzammul, M., Awais, M.: An empirical approach for software reengineering process with relation to quality assurance mechanism. ADCAIJ: Adv. Distrib. Comput. Artif. Intell. J. 7(3), 31–46 (2018) 47. Campillo-Sánchez, P., Botía, J.A., Gómez-Sanza, J.: Development of sensor based applications for the android platform: an approach based on realistic simulation. ADCAIJ: Adv. Distrib. Comput. Artif. Intell. J. 2(1) (2013). (ISSN 2255-2863) 48. Chamoso, P., De La Prieta, F.: Swarm-based smart city platform: a traffic application. ADCAIJ: Adv. Distrib. Comput. Artif. Intell. J. 4(2) (2015). (ISSN 2255-2863) 49. Faia, R., Pinto, T., Vale, Z.: Dynamic fuzzy clustering method for decision support in electricity markets negotiation. ADCAIJ: Adv. Distrib. Comput. Artif. Intell. J. 5(1) (2016). (ISSN 2255-2863) 50. Omatu, S., Araki, H., Fujinaka, T., Yano, M., Yoshioka, M., Nakazumi, H., Tanahashi, I.: Mixed odor classification for QCM sensor data by neural network. ADCAIJ: Adv. Distrib. Comput. Artif. Intell. J. 1(2) (2012). (ISSN 2255-2863) 51. 
Omatu, S., Wada, T., Chamoso, P.: Odor classification using agent technology. ADCAIJ: Adv. Distrib. Comput. Artif. Intell. J. 2(4) (2013). (ISSN 2255-2863) 52. Toscano, S.S.: Freedom of expression, right to information, personal data and the internet in the view of the inter-American system of human rights. ADCAIJ: Adv. Distrib. Comput. Artif. Intell. J. 6(1) (2017). (ISSN 2255-2863) 53. Tapia, D.I., Corchado, J.M.: An ambient intelligence based multi-agent system for Alzheimer health care. Int. J. Ambient Comput. Intell. (IJACI) 1(1), 15–26 (2009) 54. Tapia, D.I., Fraile, J.A., Rodríguez, S., Alonso, R.S., Corchado, J.M.: Integrating hardware agents into an enhanced multi-agent architecture for Ambient Intelligence systems. Inf. Sci. 222, 47–65 (2013) 55. Julián, V., Navarro, M., Botti, V., Heras, S.: Towards real-time argumentation. ADCAIJ: Adv. Distrib. Comput. Artif. Intell. J. 4(4) (2015). (ISSN 2255-2863) 56. Magaña, V.C., Organero, M.M.: Reducing stress and fuel consumption providing road information. ADCAIJ: Adv. Distrib. Comput. Artif. Intell. J. 3(4) (2014). (ISSN 2255-2863) 57. Parra, V., López, V., Mohamad, M.S.: A multiagent system to assist elder people by TV communication. ADCAIJ: Adv. Distrib. Comput. Artif. Intell. J. 3(2) (2014). (ISSN 2255-2863) 58. Yadav, M., Kr Purwar, R., Jain, A.: Design of CNN architecture for Hindi characters. ADCAIJ: Adv. Distrib. Comput. Artif. Intell. J. 7(3), 47–62 (2018)
An IoT-Based ROUV for Environmental Monitoring Marta Plaza-Hernández(B) BISITE Research Group, University of Salamanca, Edificio Multiusos I+D+i, Calle Espejo 2, 37007 Salamanca, Spain [email protected]
Abstract. Over the past five years, Internet of Things (IoT) technology has grown rapidly, finding application in several sectors. It plays an important role in environmental monitoring. This research proposal aims to develop a Remotely Operated Underwater Vehicle (ROUV) for the evaluation and monitoring of marine environments. Keywords: Internet of Things · ROUV · Environmental monitoring · Environmental conservation
1 Introduction
The Internet of Things (IoT) is a network of physical “smart” devices embedded with electronics, software, sensors and actuators, which enables interconnectivity and data exchange among devices. This technology has grown rapidly [1], finding applications in several sectors [2] (e.g. energy, healthcare, industry, IT and networks, security and public safety, and transportation). The European Union, through its Horizon 2020 programme, will allocate up to EUR 6.3 billion for research and development of ICT and IoT technologies [3, 4]. It is expected that by 2025, IoT will reach a potential market impact of USD 11.1 trillion [5]. Environmental monitoring for management and conservation purposes is a research field in which IoT technology plays a crucial role, especially given the rising concern about climate change. First, this research proposal will conduct a literature review of IoT applications in the field of marine environmental monitoring, starting from the exhaustive work by Xu et al. [6]. Then, this work aims to develop an IoT-based Remotely Operated Underwater Vehicle (ROUV) [7–30], which will comprise several sensors that collect environmental data from a selected area. The sensors will measure several physical and chemical parameters [31–44], such as water temperature and pressure, pH, salinity, dissolved oxygen and nitrate. The information collected will be transferred to the cloud [7, 44–64], so that environmental agencies can exploit it for decision-making.
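The sensing-to-cloud pipeline described above can be illustrated with a short sketch. Everything in it is hypothetical — the field names, value ranges and vehicle identifier are placeholders, and the readings are simulated — but it shows how one sample of the physical and chemical parameters listed above could be serialized as a JSON message for a cloud back end:

```python
import json
import random
import time

# Hypothetical parameter ranges used to simulate readings; a real ROUV
# would read these values from calibrated probes instead.
SENSOR_RANGES = {
    "temperature_c": (4.0, 28.0),
    "pressure_dbar": (0.0, 50.0),
    "ph": (7.5, 8.4),
    "salinity_psu": (33.0, 37.0),
    "dissolved_oxygen_mg_l": (4.0, 9.0),
    "nitrate_umol_l": (0.0, 30.0),
}

def sample_payload(vehicle_id: str) -> str:
    """Build one JSON telemetry message ready to publish to the cloud."""
    readings = {name: round(random.uniform(lo, hi), 2)
                for name, (lo, hi) in SENSOR_RANGES.items()}
    message = {
        "vehicle_id": vehicle_id,
        "timestamp": time.time(),
        "readings": readings,
    }
    return json.dumps(message)

payload = sample_payload("rouv-01")
print(payload)
```

In a deployment, the JSON string would be handed to a transport such as MQTT or HTTP; the serialization step itself is independent of the transport chosen.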
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 P. Novais et al. (Eds.): ISAmI 2020, AISC 1239, pp. 267–271, 2021. https://doi.org/10.1007/978-3-030-58356-9_29
M. Plaza-Hernández
2 Conclusions
IoT is considered one of the leading gateway technologies to digital transformation. It plays an essential role in environmental monitoring, helping agencies to understand the current state of the environment so that they can take management and preservation actions. This research proposal aims to perform a literature review of the use of IoT technologies in marine environments, and then to develop a ROUV that measures and monitors physical and chemical parameters.
Acknowledgments. This research has been supported by the project “The Surveying & MARiTime internet of thingS EducAtion (SMARTSEA)”, Reference: 612198-EPP-1-2019-1-ES-EPPKA2-KA, financed by the European Commission (Erasmus+: Higher Education - International Capacity Building).
References
1. Manyika, J., Chui, M., Bisson, P., Woetzel, J., Dobbs, R., Bughin, J., Aharon, D.: The Internet of Things: Mapping the Value Beyond the Hype. McKinsey Global Institute (2015) 2. Beecham Research Homepage: M2M Sector Map. http://beechamresearch.com/. Accessed 01 Dec 2020 3. European Commission: EU leads the way with ambitious action for cleaner and safer seas. https://ourocean2017.org/eu-leads-way-ambitious-action-cleaner-and-safer-seas. Accessed 01 July 2020 4. European Commission: Horizon2020 - Smart, Green and Integrated Transport. ec.europa.eu/programmes/horizon2020/en/h2020-section/smart-green-and-integrated-transport. Accessed 01 July 2020 5. Deloitte: https://www2.deloitte.com/tr/en/pages/technology-media-and-telecommunications/articles/internet-of-things-iot-in-shipping-industry.html. Accessed 01 Sept 2020 6. Xu, G., Shi, Y., Sun, X., Shen, W.: Internet of things in marine environment monitoring: a review. Sensors 19, 1711 (2019) 7. Li, T., Sun, S., Corchado, J.M., Siyau, M.F.: A particle dyeing approach for track continuity for the SMC-PHD filter. In: 17th International Conference on Information Fusion (FUSION), pp. 1–8. IEEE, July 2014 8. Blanco Valencia, X.P., Becerra, M.A., Castro Ospina, A.E., Ortega Adarme, M., Viveros Melo, D., Peluffo Ordóñez, D.H.: Kernel-based framework for spectral dimensionality reduction and clustering formulation: a theoretical study (2017) 9. Fdez-Riverola, F., Iglesias, E.L., Díaz, F., Méndez, J.R., Corchado, J.M.: Applying lazy learning algorithms to tackle concept drift in spam filtering. Expert Syst. Appl. 33(1), 36–48 (2007) 10. Morente-Molinera, J.A., Kou, G., González-Crespo, R., Corchado, J.M., Herrera-Viedma, E.: Solving multi-criteria group decision making problems under environments with a high number of alternatives using fuzzy ontologies and multi-granular linguistic modelling methods. Knowl.-Based Syst. 137, 54–64 (2017) 11. 
Li, T., Sun, S., Boli´c, M., Corchado, J.M.: Algorithm design for parallel implementation of the SMC-PHD filter. Sig. Process. 119, 115–127 (2016) 12. Coria, J.A.G., Castellanos-Garzón, J.A., Corchado, J.M.: Intelligent business processes composition based on multi-agent systems. Expert Syst. Appl. 41(4), 1189–1205 (2014)
13. Hassanat, A.: Greedy algorithms for approximating the diameter of machine learning datasets in multidimensional euclidean space: experimental results. ADCAIJ: Adv. Distrib. Comput. Artif. Intell. J. 7(3), 15–30 (2018) 14. Bullón, J., González Arrieta, A., Hernández Encinas, A., Queiruga Dios, A.: Manufacturing processes in the textile industry. Expert Systems for fabrics production. ADCAIJ: Adv. Distrib. Comput. Artif. Intell. J. 6(1) (2017). (ISSN: 2255-2863), Salamanca 15. Tapia, D.I., Fraile, J.A., Rodríguez, S., Alonso, R.S., Corchado, J.M.: Integrating hardware agents into an enhanced multi-agent architecture for Ambient Intelligence systems. Inf. Sci. 222, 47–65 (2013) 16. Corchado, J.M., Pavón, J., Corchado, E.S., Castillo, L.F.: Development of CBR-BDI agents: a tourist guide application. In: Funk, P., González Calero, P.A. (eds.) ECCBR 2004. LNCS (LNAI), vol. 3155, pp. 547–559. Springer, Heidelberg (2004). https://doi.org/10.1007/9783-540-28631-8_40 17. Lima, A.C.E., de Castro, L.N., Corchado, J.M.: A polarity analysis framework for Twitter messages. Appl. Math. Comput. 270, 756–767 (2015) 18. Fdez-Riverola, F., Corchado, J.M.: Fsfrt: forecasting system for red tides. Appl. Intell. 21(3), 251–264 (2004) 19. Cunha, R., Billa, C., Adamatti, D.: Development of a graphical tool to integrate the Prometheus AEOlus methodology and Jason platform. ADCAIJ Adv. Distrib. Comput. Artif. Intell. J. 6(2) (2017). (ISSN: 2255-2863), Salamanca 20. Rodríguez Marín, P.A., Duque, N., Ovalle, D.: Multi-agent system for knowledge-based recommendation of learning objects. ADCAIJ Adv. Distrib. Comput. Artif. Intell. J. 4(1) (2015). (ISSN: 2255-2863), Salamanca 21. Fdez-Riverola, F., Iglesias, E.L., Díaz, F., Méndez, J.R., Corchado, J.M.: SpamHunting: an instance-based reasoning system for spam labelling and filtering. Decis. Support Syst. 43(3), 722–736 (2007) 22. 
Casado-Vara, R., Martin-del Rey, A., Affes, S., Prieto, J., Corchado, J.M.: IoT network slicing on virtual layers of homogeneous data for improved algorithm operation in smart buildings. Fut. Gener. Comput. Syst. 102, 965–977 (2020) 23. Baruque, B., Corchado, E., Mata, A., Corchado, J.M.: A forecasting solution to the oil spill problem based on a hybrid intelligent system. Inf. Sci. 180(10), 2029–2043 (2010) 24. Sánchez-Carmona, A., Robles, S., Borrego, C.: Improving podcast distribution on Gwanda using PrivHab: a multiagent secure georouting protocol. ADCAIJ Adv. Distrib. Comput. Artif. Intell. J. 4(1) (2015). (ISSN: 2255-2863), Salamanca 25. Casado-Vara, R., Prieto, J., De la Prieta, F., Corchado, J.M.: How blockchain improves the supply chain: case study alimentary supply chain. Procedia Comput. Sci. 134, 393–398 (2018) 26. Corchado, J.M., Aiken, J.: Hybrid artificial intelligence methods in oceanographic forecast models. IEEE Trans. Syst. Man Cybern. Part C (Appl. Rev.) 32(4), 307–313 (2002) 27. González-Briones, A., Prieto, J., De La Prieta, F., Herrera-Viedma, E., Corchado, J.M.: Energy optimization using a case-based reasoning strategy. Sensors 18(3), 865 (2018) 28. Gonçalves, E., Cortés, M., De Oliveira, M., Veras, N., Falcão, M., Castro, J.: An analysis of software agents, environments and applications school: retrospective, relevance, and trends. ADCAIJ Adv. Distrib. Comput. Artif. Intell. J. 6(2) (2017). (ISSN: 2255-2863), Salamanca 29. Guimaraes, M., Adamatti, D., Emmendorfer, L.: An agent-based environment for dynamic positioning of the fogg behavior model threshold line. ADCAIJ Adv. Distrib. Comput. Artif. Intell. J. 7(1), 67–76 (2018) 30. Griol, D., Molina, J.M.: Simulating heterogeneous user behaviors to interact with conversational interfaces. ADCAIJ Adv. Distrib. Comput. Artif. Intell. J. 5(4), 59–69 (2016). (ISSN: 2255-2863), Salamanca
31. Díaz, F., Fdez-Riverola, F., Corchado, J.M.: gene-CBR: a case-based reasoning tool for cancer diagnosis using microarray data sets. Comput. Intell. 22(3–4), 254–268 (2006) 32. Corchado, J.M., Corchado, E.S., Aiken, J., Fyfe, C., Fernandez, F., Gonzalez, M.: Maximum likelihood Hebbian learning based retrieval method for CBR systems. In: Ashley, K.D., Bridge, D.G. (eds.) ICCBR 2003. LNCS (LNAI), vol. 2689, pp. 107–121. Springer, Heidelberg (2003). https://doi.org/10.1007/3-540-45006-8_11 33. Guillén, J.H., del Rey, A.M., Casado-Vara, R.: Security countermeasures of a SCIRAS model for advanced malware propagation. IEEE Access 7, 135472–135478 (2019) 34. Jassim, O., Mahmoud, M., Ahmad, M.S.: Research supervision management via a multi-agent framework. ADCAIJ Adv. Distrib. Comput. Artif. Intell. J. 3(4) (2014). (ISSN: 2255-2863), Salamanca 35. Corchado, J.M., Lees, B.: A hybrid case-based model for forecasting. Appl. Artif. Intell. 15(2), 105–127 (2001) 36. Fernández-Riverola, F., Diaz, F., Corchado, J.M.: Reducing the memory size of a fuzzy case-based reasoning system applying rough set techniques. IEEE Trans. Syst. Man Cybern. Part C (Appl. Rev.) 37(1), 138–146 (2006) 37. Tapia, D.I., Corchado, J.M.: An ambient intelligence based multi-agent system for Alzheimer health care. Int. J. Ambient Comput. Intell. (IJACI) 1(1), 15–26 (2009) 38. Corchado, J.M., Fyfe, C.: Unsupervised neural method for temperature forecasting. Artif. Intell. Eng. 13(4), 351–357 (1999) 39. Méndez, J.R., Fdez-Riverola, F., Díaz, F., Iglesias, E.L., Corchado, J.M.: A comparative performance study of feature selection methods for the anti-spam filtering domain. In: Perner, P. (ed.) ICDM 2006. LNCS (LNAI), vol. 4065, pp. 106–120. Springer, Heidelberg (2006). https://doi.org/10.1007/11790853_9 40. Cardoso, R.C., Bordini, R.H.: A multi-agent extension of a hierarchical task network planning formalism. ADCAIJ Adv. Distrib. Comput. Artif. Intell. J. 6(2) (2017). (ISSN: 2255-2863), Salamanca 41. 
Mateen, A., et al.: Secure data access control with perception reasoning. ADCAIJ Adv. Distrib. Comput. Artif. Intell. J. 7(1), 13–28 (2018) 42. Mata, A., Corchado, J.M.: Forecasting the probability of finding oil slicks using a CBR system. Expert Syst. Appl. 36(4), 8239–8246 (2009) 43. Chamoso, P., González-Briones, A., Rodríguez, S., Corchado, J.M.: Tendencies of technologies and platforms in smart cities: a state-of-the-art review. Wireless Commun. Mob. Comput. (2018) 44. Glez-Bedia, M., Corchado, J.M., Corchado, E.S., Fyfe, C.: Analytical model for constructing deliberative agents. Eng. Intell. Syst. Electr. Eng. Commun. 10(3), 173–185 (2002) 45. Teixeira, E.P., Goncalves, E., Adamatti, D.F.: Ulises: a agent-based system for timbre classification. ADCAIJ Adv. Distrib. Comput. Artif. Intell. J. 7(1), 29–40 (2018) 46. Pudaruth, S., et al.: Sentiment analysis from facebook comments using automatic coding in NVivo 11. ADCAIJ Adv. Distrib. Comput. Artif. Intell. J. 7(1), 41–48 (2018) 47. Fyfe, C., Corchado, J.M.: Automating the construction of CBR systems using kernel methods. Int. J. Intell. Syst. 16(4), 571–586 (2001) 48. Choon, Y.W., Mohamad, M.S., Deris, S., Illias, R.M., Chong, C.K., Chai, L.E., Omatu, S., Corchado, J.M.: Differential bees flux balance analysis with OptKnock for in silico microbial strains optimization. PloS one 9(7) (2014) 49. Martín del Rey, A., Casado Vara, R., Hernández Serrano, D.: Reversibility of symmetric linear cellular automata with radius r = 3. Mathematics 7(9), 816 (2019) 50. Casado-Vara, R., Novais, P., Gil, A.B., Prieto, J., Corchado, J.M.: Distributed continuous-time fault estimation control for multiple devices in IoT networks. IEEE Access 7, 11972–11984 (2019)
51. Munera, E., Poza-Lujan, J.-L., Posadas-Yagüe, J.-L., Simó-Ten, J.-E., Blanes, F.: Integrating smart resources in ROS-based systems to distribute services. ADCAIJ Adv. Distrib. Comput. Artif. Intell. J. 6(1) (2017). (ISSN: 2255-2863), Salamanca 52. Jasim, Y.A.: Improving intrusion detection systems using artificial neural networks. ADCAIJ Adv. Distrib. Comput. Artif. Intell. J. 7(1), 49–65 (2018) 53. Jasim, Y.A., Saeed, M.G.: Developing a software for diagnosing heart disease via data mining techniques. ADCAIJ Adv. Distrib. Comput. Artif. Intell. J. 7(3), 99–114 (2018) 54. Casado-Vara, R., Chamoso, P., De la Prieta, F., Prieto, J., Corchado, J.M.: Non-linear adaptive closed-loop control system for improved efficiency in IoT-blockchain management. Inf. Fusion 49, 227–239 (2019) 55. de Melo, M.J., et al.: Robust and adaptive chatter free formation control of wheeled mobile robots with uncertainties. ADCAIJ Adv. Distrib. Comput. Artif. Intell. J. 7(2), 27–42 (2018) 56. Ferreira, M.R., Kawakami, C.: Ransomware-Kidnapping personal data for ransom and the information as hostage. ADCAIJ Adv. Distrib. Comput. Artif. Intell. J. 7(3), 5–14 (2018) 57. Rincón, J., Poza, J.L., Posadas, J.L., Julián, V., Carrascosa, C.: Adding real data to detect emotions by means of smart resource artifacts in MAS. ADCAIJ Adv. Distrib. Comput. Artif. Intell. J. 5(4) (2016). (ISSN: 2255-2863), Salamanca 58. Bremer, J., Lehnhoff, S.: Decentralized coalition formation with agent-based combinatorial heuristics. ADCAIJ Adv. Distrib. Comput. Artif. Intell. J. 6(3) (2017). (ISSN: 2255-2863), Salamanca 59. Teixeira, E.P., Goncalves, E.M.N., Adamatti, D.F.: Ulises: a agent-based system for timbre classification. ADCAIJ Adv. Distrib. Comput. Artif. Intell. J. 6(2) (2017). (ISSN: 2255-2863), Salamanca 60. Becerril, A.A.: The value of our personal data in the big data and the internet of all things era. ADCAIJ Adv. Distrib. Comput. Artif. Intell. J. 7(2), 71–80 (2018) 61. 
Ali, Z., Kiran, H.M., Shahzad, W.: Evolutionary algorithms for query optimization in distributed database systems: a review. ADCAIJ Adv. Distrib. Comput. Artif. Intell. J. 7(3), 115–128 (2018) 62. Becerra-Bonache, L., López, M.D.J.: Linguistic models at the crossroads of agents, learning and formal languages. ADCAIJ Adv. Distrib. Comput. Artif. Intell. J. 3(4) (2014). (ISSN: 2255-2863), Salamanca 63. De Castro, L.F.S., Alves, G.V., Borges, A.P.: Using trust degree for agents in order to assign spots in a smart parking. ADCAIJ Adv. Distrib. Comput. Artif. Intell. J. 6(2) (2017). (ISSN: 2255-2863), Salamanca 64. Bicharra Garcia, A.C., Vivacqua, A.S.: ACoPla: a multiagent simulator to study individual strategies in dynamic situations. ADCAIJ Adv. Distrib. Comput. Artif. Intell. J. 7(2), 81–91 (2018)
Deep Symbolic Learning and Semantics for an Explainable and Ethical Artificial Intelligence Ricardo S. Alonso(B) BISITE Research Group, University of Salamanca, Edificio Multiusos I+D+I, Calle Espejo 2, 37007 Salamanca, Spain [email protected]
Abstract. The main objective of this research is to investigate new hybrid neuro-symbolic algorithms for the construction of an open-source Deep Symbolic Learning framework that allows the training and application of explainable and ethical Deep Learning models. This framework will be supported by an ontology and a layer model that take into account which user is responsible for interpreting each output result according to his or her role, as well as the ethical implications of those results. Keywords: Deep Learning · Deep Symbolic Learning · Explainable artificial intelligence · Ethical artificial intelligence · Interpretable machine learning
1 Introduction
Today, the applications of Artificial Intelligence (AI) [1–15] and, more specifically, Deep Learning (DL) [16–30], are part of the daily life of all citizens [31]. This includes AI applications aimed at predicting whether we are good credit payers, which banking products are suitable to be recommended to us [32], and applications that can estimate our age, sex and race based merely on a picture [33]. To do this, it is necessary to collect data from users in order to build training datasets, with the ethical implications that this entails. On the other hand, an incorrect or incomplete data population can produce results that affect users’ rights (e.g., identifying a person as an animal in an image, or denying credit to a person due to an incomplete model) [34]. The second AI winter of the late 1980s and early 1990s was followed by the emergence of intelligent agents and machine learning (ML) based on statistical methods, including the first mentions of DL [35]. The wide availability of highly parallel computing on GPUs/TPUs [36], together with large storage capacity for training data, explains why DL techniques have gained great popularity in recent years, providing many successful results. In this sense, DL is subdivided into different branches (which can be combined with each other) with specialized deep neural networks at the application level, such as recurrent neural networks for Natural Language Processing (NLP)
and text and speech recognition [37], convolutional neural networks for image/video recognition [38], or auto-encoders for image generation [39]. DL techniques offer outstanding results in terms of classification and prediction accuracy compared to classic machine learning (ML) [23, 40–48]. The main trade-offs are, first, performance, since DL requires more computing resources than classic ML methods [49–60]; and, second, interpretation, since making sense of connectionist (or neural) models that include multiple layers with non-linear interactions in each intermediate layer is an extremely complicated or even disputable task [3, 61–64] (e.g., convolutional neural networks used to recognize people/animals in images [65–69]). Thus, it is necessary to develop innovative and open solutions that allow the scientific and industrial community to build DL models that make it possible, at the same time: i) to take into account ethical aspects in the training datasets and in the results provided by the algorithms; ii) to facilitate the explainability and interpretability of the models built, and the reasons for their effectiveness, to help data scientists decide which direction to take. This research proposes the development of hybrid neuro-symbolic AI algorithms [70–74] to create a new Deep Symbolic Learning (DSL) framework, which allows us to link the performance of deep neural networks with the interpretability of the models built by symbolic algorithms [72, 75–85]. Such an open-source framework will allow, in the medium and long term, scientific researchers, industry and policy makers to research, develop and supervise their own models taking into account ethical aspects and explainability. This will make it possible to respect users’ rights and, at the same time, to better understand the built models, easing their learning, reducing experimentation time and accelerating their evolution, thereby achieving more efficient models.
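As a deliberately tiny, self-contained illustration of the neuro-symbolic idea discussed above (not the framework proposed here), the sketch below trains a single logistic neuron on simulated two-dimensional data and then distils it into a one-feature decision stump, the simplest possible symbolic surrogate. The fidelity score measures how faithfully the human-readable rule reproduces the network's behaviour; all data, names and thresholds are illustrative.

```python
import math
import random

random.seed(0)

# Simulated, linearly separable toy data (illustrative stand-in for a real
# training set): class 1 clusters around (1, 1), class 0 around (-1, -1).
def make_point(label):
    cx = 1.0 if label else -1.0
    return ([cx + random.gauss(0, 0.4), cx + random.gauss(0, 0.4)], label)

data = [make_point(i % 2) for i in range(200)]

# Connectionist part: one logistic neuron trained by gradient descent.
w, b, lr = [0.0, 0.0], 0.0, 0.1
for _ in range(300):
    for x, y in data:
        p = 1.0 / (1.0 + math.exp(-(w[0] * x[0] + w[1] * x[1] + b)))
        g = p - y                      # gradient of the log-loss
        w[0] -= lr * g * x[0]
        w[1] -= lr * g * x[1]
        b -= lr * g

def neural_predict(x):
    return 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else 0

# Symbolic part: distil the neuron into a one-feature threshold rule by
# picking the (feature, threshold) pair that best mimics the network.
targets = [neural_predict(x) for x, _ in data]

def stump_fidelity(feat, thr):
    hits = sum((1 if x[feat] > thr else 0) == t
               for (x, _), t in zip(data, targets))
    return hits / len(data)

feat, thr = max(((f, x[f]) for x, _ in data for f in (0, 1)),
                key=lambda ft: stump_fidelity(*ft))
fidelity = stump_fidelity(feat, thr)
rule = f"IF x{feat + 1} > {thr:.2f} THEN class=1 ELSE class=0"
print(rule, f"(fidelity vs. network: {fidelity:.0%})")
```

A real DSL system would replace the neuron with a deep network and the stump with richer symbolic structures (rule sets, ontologies, mixture-of-experts gates), but the performance-versus-interpretability trade-off it negotiates is the same one shown here.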
2 Conclusions
Deep Learning (DL) includes applications aimed at predicting whether we are good credit payers, which banking products are suitable to be recommended to us, and applications that can estimate our age, sex and race based merely on a picture. However, one of the main problems of DL is the large amount of data that deep neural networks require to be trained. In this regard, an incomplete dataset can provide results that raise ethical problems (e.g., identifying a person as an animal or denying a credit). Another problem is that the interpretation of deep neural models is an extremely complicated task. In this sense, traditional symbolic AI algorithms allow the interpretability of the models at the expense of offering poorer performance than DL techniques. A possible solution is the application of hybrid neuro-symbolic AI techniques. In this way, the aim is to take advantage of the performance of neural networks and the explainability of symbolic logic. However, the applications of Deep Symbolic Learning (DSL) have barely been exploited so far. In this regard, the main objective of this research is to produce significant advances in the explainability of DL models through a new approach based on the hybridization of connectionist AI approaches (e.g., DL) with symbolic learning techniques (e.g., mixture of experts), called DSL. As a result, this research will build an open-source framework that allows the training and application of explainable and ethical DL models. This
R. S. Alonso
framework will be supported by a new ontology and a new layer model that will consider the ethical implications of a biased dataset or biased classification and prediction models. This approach will be characterized by the creation of interpretable and ethical DSL models for natural language processing, text and speech recognition, sentiment analysis, image and video recognition, and image generation. To validate the research carried out, a platform will be implemented for a case study in a Smart Home environment that ingests data from Edge-IoT devices.
Acknowledgments. This work has been partially supported by the European Regional Development Fund (ERDF) through the Spanish Ministry of Science, Innovation and University State Research Agency under grant RTC-2017-6611-8 (TWINPICS - Social computing and sentiment analysis for detection of duplicate profiles used for terrorist propaganda and other criminal purposes).
References
1. Althubaiti, S., et al.: Ontology-based prediction of cancer driver genes. Sci. Rep. 9(1), 17405 (2019) 2. López, M., Pedraza, J., Carbó, J., Molina, J.M.: The awareness of privacy issues in ambient intelligence. Adv. Distrib. Comput. Artif. Intell. J. 3(2), 71–84 (2014). ISSN: 2255-2863, Salamanca 3. Li, T., Sun, S., Corchado, J.M., Siyau, M.F.: A particle dyeing approach for track continuity for the SMC-PHD filter. In: 17th International Conference on Information Fusion (FUSION), pp. 1–8. IEEE (July 2014) 4. Bullon, J., et al.: Manufacturing processes in the textile industry. Expert Systems for fabrics production. Adv. Distrib. Comput. Artif. Intell. J. 6(4), 15–23 (2017) 5. Fdez-Riverola, F., Iglesias, E.L., Díaz, F., Méndez, J.R., Corchado, J.M.: Applying lazy learning algorithms to tackle concept drift in spam filtering. Exp. Syst. Appl. 33(1), 36–48 (2007) 6. Alonso, R.S., García, Ó., Saavedra, A., Tapia, D.I., de Paz, J.F., Corchado, J.M.: Heterogeneous wireless sensor networks in a tele-monitoring system for homecare. In: Omatu, S., et al. (eds.) IWANN 2009. LNCS, vol. 5518, pp. 663–670. Springer, Heidelberg (2009). https://doi.org/10.1007/978-3-642-02481-8_99 7. Alonso, R.S., García, O., Zato, C., Gil, O., De la Prieta, F.: Intelligent agents and wireless sensor networks: a healthcare telemonitoring system. In: Demazeau, Y., et al. (eds.) Trends in Practical Applications of Agents and Multiagent System. Advances in Intelligent and Soft Computing, vol. 71, pp. 429–436. Springer, Heidelberg (2010). https://doi.org/10.1007/978-3-642-12433-4_51 8. de Castro, L.F.S., Vaz Alves, G., Borges, A.P.: Using trust degree for agents in order to assign spots in a Smart Parking. Adv. Distrib. Comput. Artif. Intell. J. 6(4), 5 (2017) 9. Moung, E.: A comparison of the YCBCR color space with gray scale for face recognition for surveillance applications. Adv. Distrib. Comput. Artif. Intell. J. 6(4), 25–33 (2017) 10. 
Morente-Molinera, J.A., Kou, G., González-Crespo, R., Corchado, J.M., Herrera-Viedma, E.: Solving multi-criteria group decision making problems under environments with a high number of alternatives using fuzzy ontologies and multi-granular linguistic modelling methods. Knowl. Based Syst. 137, 54–64 (2017) 11. Kethareswaran, V., Sankar Ram, C.: An Indian perspective on the adverse impact of Internet of Things (IoT). Adv. Distrib. Comput. Artif. Intell. J. 6(4), 35–40 (2017)
12. Li, T., Sun, S., Bolić, M., Corchado, J.M.: Algorithm design for parallel implementation of the SMC-PHD filter. Sig. Process. 119, 115–127 (2016) 13. Alonso, R.S., Prieto, J., García, Ó., Corchado, J.M.: Collaborative learning via social computing. Front. Inf. Technol. Electron. Eng. 20(2), 265–282 (2019). https://doi.org/10.1631/FITEE.1700840 14. Alonso, R.S., Sittón-Candanedo, I., García, Ó., Prieto, J., Rodríguez-González, S.: An intelligent Edge-IoT platform for monitoring livestock and crops in a dairy farming scenario. Ad Hoc Netw. 98, 102047 (2020) 15. Cunha, R., Billa, C., Adamatti, D.: Development of a Graphical Tool to integrate the Prometheus AEOlus methodology and Jason Platform. Adv. Distrib. Comput. Artif. Intell. J. 6(2), 57–70 (2017) 16. Coria, J.A.G., Castellanos-Garzón, J.A., Corchado, J.M.: Intelligent business processes composition based on multi-agent systems. Exp. Syst. Appl. 41(4), 1189–1205 (2014) 17. Siyau, M.F., Li, T., Loo, J.: A novel pilot expansion approach for MIMO channel estimation. Adv. Distrib. Comput. Artif. Intell. J. 3(3), 12–20 (2014). ISSN: 2255-2863, Salamanca 18. Tapia, D.I., Fraile, J.A., Rodríguez, S., Alonso, R.S., Corchado, J.M.: Integrating hardware agents into an enhanced multi-agent architecture for Ambient Intelligence systems. Inf. Sci. 222, 47–65 (2013) 19. Corchado, J.M., Pavón, J., Corchado, E.S., Castillo, L.F.: Development of CBR-BDI agents: a tourist guide application. In: Funk, P., González Calero, P.A. (eds.) ECCBR 2004. LNCS (LNAI), vol. 3155, pp. 547–559. Springer, Heidelberg (2004). https://doi.org/10.1007/978-3-540-28631-8_40 20. Alonso, R.S., Sittón-Candanedo, I., Rodríguez-González, S., García, Ó., Prieto, J.: A survey on software-defined networks and edge computing over IoT. In: International Conference on Practical Applications of Agents and Multi-agent Systems, pp. 289–301 (2019) 21. 
Alonso, R.S., Tapia, D.I., Bajo, J., García, Ó., De Paz, J.F., Corchado, J.M.: Implementing a hardware-embedded reactive agents platform based on a service-oriented architecture over heterogeneous wireless sensor networks. Ad Hoc Netw. 11(1), 151–166 (2013) 22. Lima, A.C.E., de Castro, L.N., Corchado, J.M.: A polarity analysis framework for Twitter messages. Appl. Math. Comput. 270, 756–767 (2015) 23. Fdez-Riverola, F., Corchado, J.M.: FSfRT: forecasting system for red tides. Appl. Intell. 21(3), 251–264 (2004) 24. Fdez-Riverola, F., Iglesias, E.L., Díaz, F., Méndez, J.R., Corchado, J.M.: SpamHunting: an instance-based reasoning system for spam labelling and filtering. Decis. Support Syst. 43(3), 722–736 (2007) 25. Casado-Vara, R., del Rey, A.M., Affes, S., Prieto, J., Corchado, J.M.: IoT network slicing on virtual layers of homogeneous data for improved algorithm operation in smart buildings. Future Gener. Comput. Syst. 102, 965–977 (2020) 26. Baruque, B., Corchado, E., Mata, A., Corchado, J.M.: A forecasting solution to the oil spill problem based on a hybrid intelligent system. Inf. Sci. 180(10), 2029–2043 (2010) 27. De Paz, J.F., Tapia, D.I., Alonso, R.S., Pinzón, C.I., Bajo, J., Corchado, J.M.: Mitigation of the ground reflection effect in real-time locating systems based on wireless sensor networks by using artificial neural networks. Knowl. Inf. Syst. 34(1), 193–217 (2013) 28. García, Ó., Alonso, R.S., Martínez, D.I.T., Guevara, F., De La Prieta, F., Bravo, R.A.: Wireless sensor networks and real-time locating systems to fight against maritime piracy. IJIMAI 1(5), 14–21 (2012) 29. Sittón-Candanedo, I., Alonso, R.S., Corchado, J.M., Rodríguez-González, S., Casado-Vara, R.: A review of edge computing reference architectures and a new global edge proposal. Fut. Gener. Comput. Syst. 99, 278–294 (2019) 30. Casado-Vara, R., Prieto, J., De la Prieta, F., Corchado, J.M.: How blockchain improves the supply chain: case study alimentary supply chain. Procedia Comput. 
Sci. 134, 393–398 (2018)
276
R. S. Alonso
31. Adadi, A., Berrada, M.: Peeking inside the black-box: a survey on explainable artificial intelligence (XAI). IEEE Access 6, 52138–52160 (2018) 32. Hernandez, E., Hernández, G., Gil, A., Rodríguez, S., Corchado, J.M.: Fog computing architecture for personalized recommendation of banking products. Exp. Syst. Appl. 140, 112900 (2020) 33. Liu, H., Lu, J., Feng, J., Zhou, J.: Group-aware deep feature learning for facial age estimation. Patt. Recogn. 66, 82–94 (2017) 34. Sánchez-Morales, A., Sancho-Gómez, J., Martínez-García, J., et al.: Improving deep learning performance with missing values via deletion and compensation. Neural Comput. Appl., 1–12 (2019). https://doi.org/10.1007/s00521-019-04013-2 35. Dechter, R.: Learning while searching in constraint-satisfaction problems. University of California, Computer Science Department, Cognitive Systems Laboratory, pp. 178–183 (1986) 36. Jouppi, N., Young, C., Patil, N., Patterson, D.: Motivation for and evaluation of the first tensor processing unit. IEEE Micro 38(3), 10–19 (2018) 37. Hassan, A., Mahmood, A.: Convolutional recurrent deep learning model for sentence classification. IEEE Access 6, 13949–13957 (2018) 38. Rivas, A., Chamoso, P., González-Briones, A., Corchado, J.M.: Detection of cattle using drones and convolutional neural networks. Sensors 18(7), 2048 (2018) 39. Xu, W., Keshmiri, S., Wang, G.: Adversarially approximated autoencoder for image generation and manipulation. IEEE Trans. Multimed. 21(9), 2387–2396 (2019) 40. Liu, Y., Yuan, X., Gong, X., Xie, Z., Fang, F., Luo, Z.: Conditional convolution neural network enhanced random forest for facial expression recognition. Patt. Recogn. 84, 251–261 (2018) 41. Sittón, I., Alonso, R.S., Hernández, E., Rodríguez, S., Rivas, A.: Neuro-symbolic hybrid systems for industry 4.0: a systematic mapping study. In: International Conference on Knowledge Management in Organizations, pp. 455–465 (2019) 42. 
Corchado, J.M., Aiken, J.: Hybrid artificial intelligence methods in oceanographic forecast models. IEEE Trans. Syst. Man Cybern. Part C (Appl. Rev.) 32(4), 307–313 (2002) 43. González-Briones, A., Prieto, J., De La Prieta, F., Herrera-Viedma, E., Corchado, J.M.: Energy optimization using a case-based reasoning strategy. Sensors 18(3), 865 (2018) 44. Díaz, F., Fdez-Riverola, F., Corchado, J.M.: gene-CBR: a case-based reasoning tool for cancer diagnosis using microarray data sets. Comput. Intell. 22(3–4), 254–268 (2006) 45. Corchado, J.M., Corchado, E.S., Aiken, J., Fyfe, C., Fernandez, F., Gonzalez, M.: Maximum likelihood Hebbian learning based retrieval method for CBR systems. In: Ashley, K.D., Bridge, D.G. (eds.) ICCBR 2003. LNCS (LNAI), vol. 2689, pp. 107–121. Springer, Heidelberg (2003). https://doi.org/10.1007/3-540-45006-8_11 46. Ribeiro, C., et al.: Customized normalization clustering meth-odology for consumers with heterogeneous characteristics. Adv. Distrib. Comput. Artif. Intell. J. 7(2), 53–69 (2018) 47. Guillén, J.H., del Rey, A.M., Casado-Vara, R.: Security countermeasures of a SCIRAS model for advanced malware propagation. IEEE Access 7, 135472–135478 (2019) 48. Corchado, J.M., Lees, B.: A hybrid case-based model for forecasting. Appl. Artif. Intell. 15(2), 105–127 (2001) 49. Pawlewski, P., Kluska, K.: Modeling and simulation of bus assembling process using DES/ABS approach. Adv. Distrib. Comput. Artif. Intell. J. 6(1), 59 (2017). ISSN: 2255-2863, Salamanca 50. Silveira, R.A., Comarella, R.L., Campos, R.L.R., Vian, J., De La Prieta, F.: Learning objects recommendation system: issues and approaches for retrieving, indexing and recommend learning objects. Adv. Distrib. Comput. Artif. Intell. J. 4(4), 69 (2015). ISSN: 2255-2863, Salamanca
Deep Symbolic Learning and Semantics
277
51. Fernández-Riverola, F., Diaz, F., Corchado, J.M.: Reducing the memory size of a fuzzy casebased reasoning system applying rough set techniques. IEEE Trans. Syst. Man Cybern. Part C (Appl. Rev.) 37(1), 138–146 (2006) 52. Sittón-Candanedo, I., Alonso, R.S., García, Ó., Gil, A.B., Rodríguez-González, S.: A review on edge computing in smart energy by means of a systematic mapping study. Electronics 9(1), 48 (2020) 53. Sittón-Candanedo, I., Alonso, R.S., García, Ó., Muñoz, L., Rodríguez-González, S.: Edge computing, IoT and social computing in smart energy scenarios. Sensors 19(15), 3353 (2019) 54. Tapia, D.I., Alonso, R.S., García, Ó., de la Prieta, F., Pérez-Lancho, B.: Cloud-IO: cloud computing platform for the fast deployment of services over wireless sensor networks. In: 7th International Conference on Knowledge Management in Organizations: Service and Cloud Computing, pp. 493–504 (2013) 55. Tapia, D.I., Corchado, J.M.: An ambient intelligence based multi-agent system for alzheimer health care. Int. J. Ambient Comput. Intell. 1(1), 15–26 (2009) 56. Gómez, J., Alamán, X., Montoro, G., Torrado, J.C., Plaza, A.: Am ICog – mobile technologies to assist people with cognitive disabilities in the workplace. Adv. Distrib. Comput. Artif. Intell. J. 2(4), 9–17 (2013). ISSN: 2255-2863, Salamanca 57. Corchado, J.M., Fyfe, C.: Unsupervised neural method for temperature forecasting. Artif. Intell. Eng. 13(4), 351–357 (1999) 58. Méndez, J.R., Fdez-Riverola, F., Díaz, F., Iglesias, E.L., Corchado, J.M.: A comparative performance study of feature selection methods for the anti-spam filtering domain. In: Perner, P. (ed.) ICDM 2006. LNCS (LNAI), vol. 4065, pp. 106–120. Springer, Heidelberg (2006). https://doi.org/10.1007/11790853_9 59. Serna, F.J.A., Iniesta, J.B.: The delimitation of freedom of speech on the Internet: the confrontation of rights and digital censorship. Adv. Distrib. Comput. Artif. Intell. J. 7(1), 5–12 (2018) 60. 
Mata, A., Corchado, J.M.: Forecasting the probability of finding oil slicks using a CBR system. Exp. Syst. Appl. 36(4), 8239–8246 (2009) 61. Chamoso, P., González-Briones, A., Rodríguez, S., Corchado, J.M.: Tendencies of technologies and platforms in smart cities: a state-of-the-art review. Wirel. Commun. Mob. Comput. 2018(1), 1–17 (2018) 62. Glez-Bedia, M., Corchado, J.M., Corchado, E.S., Fyfe, C.: Analytical model for constructing deliberative agents. Eng. Intell. Syst. Electr. Eng. Commun. 10(3), 173–185 (2002) 63. Fyfe, C., Corchado, J.M.: Automating the construction of CBR Systems using Kernel methods. Int. J. Intell. Syst. 16(4), 571–586 (2001) 64. Choon, Y.W., et al.: Differential bees flux balance analysis with OptKnock for in silico microbial strains optimization. PLoS ONE 9(7), e102744 (2014) 65. Tapia, D.I., Alonso, R.S., Rodríguez, S., de Paz, J.F., González, A., Corchado, J.M.: Embedding reactive hardware agents into heterogeneous sensor networks. In: 2010 13th International Conference on Information Fusion, pp. 1–8 (2010) 66. Tapia, D.I., Bajo, J., De Paz, J.F., Alonso, R.S., Rodríguez, S., Corchado, J.M.: Using multilayer perceptrons to enhance the performance of indoor RTLS. In: Proceedings of the Progress in Artificial Intelligence Workshop: Ambient Intelligence Environments, EPIA 2011 (2011) 67. Martín del Rey, A., Casado Vara, R., Hernández Serrano, D.: Reversibility of symmetric linear cellular automata with radius r = 3. Mathematics 7(9), 816 (2019) 68. Casado-Vara, R., Novais, P., Gil, A.B., Prieto, J., Corchado, J.M.: Distributed continuous-time fault estimation control for multiple devices in IoT networks. IEEE Access 7, 11972–11984 (2019)
278
R. S. Alonso
69. Shoeibi, N., Shoeibi, N.: Future of smart parking: automated valet parking using deep Qlearning. In: Herrera-Viedma, E., Vale, Z., Nielsen, P., Martin Del Rey, A., Casado Vara, R. (eds.) DCAI 2019. AISC, vol. 1004, pp. 177–182. Springer, Cham (2020). https://doi.org/10. 1007/978-3-030-23946-6_20 70. Vera, J.S.E.: Human rights in the ethical protection of youth in social networks-the case of Colombia and Peru. Adv. Distrib. Comput. Artif. Intell. J. 6(4), 71–79 (2017) 71. Casado-Vara, R., Chamoso, P., De la Prieta, F., Prieto, J., Corchado, J.M.: Non-linear adaptive closed-loop control system for improved efficiency in IoT-blockchain management. Inf. Fusion 49, 227–239 (2019) 72. Farias, G.P., et al.: Predicting plan failure by monitoring action sequences and duration. Adv. Distrib. Comput. Artif. Intell. J. 6(4), 55–69 (2017). ISSN: 2255-2863, Salamanca 73. Van Haare Heijmeijer, A., Vaz Alves, G.: Development of a Middleware between SUMO simulation tool and JaCaMo framework. Adv. Distrib. Comput. Artif. Intell. J. 7(2), 5–15 (2018) 74. Durik, B.O.: Organisational metamodel for large-scale multi-agent systems: first steps towards modelling organisation dynamics. Adv. Distrib. Comput. Artif. Intell. J. 6(3), 17 (2017). ISSN: 2255-2863, Salamanca 75. da Silveira Glaeser, S., et al.: Modeling of Circadian Rhythm under influence of Pain: an approach based on Multi-agent Simulation. Adv. Distrib. Comput. Artif. Intell. J. 7(2), 17–25 (2018) 76. Srivastava, V., Purwar, R.: An extension of local mesh peak valley edge based feature descriptor for image retrieval in bio-medical images. Adv. Distrib. Comput. Artif. Intell. J. 7(1), 77–89 (2018) 77. Silveira, R.A., Klein Da Silva Bitencourt, G., Gelaim, T.Â., Marchi, J., De La Prieta, F.: Towards a model of open and reliable cognitive multiagent systems dealing with trust and emotions. Adv. Distrib. Comput. Artif. Intell. J. 4(3), 57 (2015). ISSN: 2255-2863, Salamanca 78. 
González, C., Burguillo, J.C., Llamas, M., Laza, R.: Designing intelligent tutoring systems: a personalization strategy using case-based reasoning and multi-agent systems. Adv. Distrib. Comput. Artif. Intell. J. 2(1), 41–54 (2013). ISSN: 2255-2863, Salamanca 79. Ayala, D., Roldán, J.C., Ruiz, D., Gallego, F.O.: An approach for discovering keywords from Spanish tweets using Wikipedia. Adv. Distrib. Comput. Artif. Intell. J. 4(2), 73–87 (2015). ISSN: 2255-2863, Salamanca 80. del Rey, Á.M., Batista, F.K., Queiruga Dios, A.: Malware propagation in Wireless Sensor Networks global models vs individual-based models. Adv. Distrib. Comput. Artif. Intell. J. 6(3), 5–15 (2017). ISSN: 2255-2863, Salamanca 81. Cooper, V.N., Haddad, H.M., Shahriar, H.: Android malware detection using Kullback-Leibler divergence. Adv. Distrib. Comput. Artif. Intell. J. 3(2), 17–25 (2014). ISSN: 2255-2863, Salamanca 82. Kamaruddin, S.B.A., Ghanib, N.A.M., Liong, C.Y., Jemain, A.A.: Firearm classification using neural networks on ring of firing pin impression images. Adv. Distrib. Comput. Artif. Intell. J. 1(3), 177–182 (2012). ISSN: 2255-2863, Salamanca 83. Castellanos Garzón, J.A., Ramos González, J.: A gene selection approach based on clustering for classification tasks in Colon cancer. Adv. Distrib. Comput. Artif. Intell. J. 4(3), 1 (2015). ISSN: 2255-2863, Salamanca 84. Shoeibi, N., Karimi, F., Corchado, J.M.: Artificial intelligence as a way of overcoming visual disorders: damages related to visual cortex, optic nerves and eyes. In: Herrera-Viedma, E., Vale, Z., Nielsen, P., Martin Del Rey, A., Casado Vara, R. (eds.) DCAI 2019. AISC, vol. 1004, pp. 183–187. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-23946-6_21 85. Ueno, M., Mori, N., Matsumoto, K.: Picture models for 2-scene comics creating system. Adv. Distrib. Comput. Artif. Intell. J. 3(2), 53–64 (2014). ISSN: 2255-2863, Salamanca
Development of a Multiagent Simulator to Genetic Regulatory Networks

Nilzair Barreto Agostinho, Adriano Velasque Wherhli, and Diana Francisca Adamatti

Universidade Federal do Rio Grande (PPGMC/FURG), Av. Itália, km 8, Bairro Carreiros, Rio Grande, RS, Brazil
[email protected], [email protected], [email protected], http://www.c3.furg.br

Abstract. Biological systems are highly complex, and separating them into individual parts facilitates their study. Representing biological systems as Genetic Regulatory Networks (GRN), which form a map of the interactions between the molecules in an organism, is a standard way of capturing this biological complexity. GRN are composed of genes that are translated into transcription factors, which in turn regulate other genes. For simulation and inference purposes, many different mathematical and algorithmic models have been adopted to represent GRN in the past few years. In this paper we present the first efforts to develop a simulator that uses a Multiagent System (MAS) to model generic GRN. To accomplish this, we develop a MAS composed of agents that mimic the biochemical processes of gene regulation.

Keywords: Multiagent systems · Genetic regulatory network · Simulation

1 Introduction
Although the central dogma of biology states that information flows through macromolecules, from DNA to RNA and from RNA to proteins, life could not exist from macromolecules alone. Thus, for the central dogma to be truly descriptive, it should include small molecules [8]. These small molecules are key elements in various topics in the life sciences, for example, the origins of life, memory and cognition, sensing and signaling, the understanding of cell circuitry, and disease treatments [8]. Methodologies and tools that can improve the knowledge about these intricate molecular interactions, and the modelling and inference of biological networks, are very important in this task. An example is the work in [6], which uses fuzzy cognitive maps for GRN reconstruction. In this work, a developing version of a MAS that serves as a framework for representing a regulatory network is presented. At the current stage, an initial version shows the interaction between two transcription factors that interact at a specific binding site, and the preliminary results are promising.

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2021
P. Novais et al. (Eds.): ISAmI 2020, AISC 1239, pp. 279–283, 2021. https://doi.org/10.1007/978-3-030-58356-9_31
2 Operation of a GRN
Genes are activated or inhibited by so-called transcription factor (TF) proteins, which are themselves gene products. A GRN describes the interaction between the TFs and the genes that they regulate. This process allows cells to make the proteins they need at the appropriate times and in the appropriate amounts. The basic mechanism of regulatory control is accomplished through the binding of TFs to sites, called TF binding sites, that are located in the promoter region of a gene, as described in [8]. A TF can interact with a binding site and activate or inhibit the associated gene, and thus increase or decrease the production of proteins, according to a specific rate.
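As a minimal illustration of this activate/inhibit mechanism, the production level of one gene can be advanced by a signed rate per bound TF. This sketch is our own simplification, not the paper's model; the TF names and rate values are invented for the example.

```python
def update_protein(level, bound_tfs, effects, dt=1.0):
    """Advance the protein level of one gene by one step: each TF
    currently bound to the promoter contributes its production
    (positive) or inhibition (negative) rate. `effects` maps a TF
    name to its signed rate; levels never go below zero."""
    rate = sum(effects.get(tf, 0.0) for tf in bound_tfs)
    return max(0.0, level + rate * dt)

# Hypothetical TFs: one activator, one repressor.
effects = {"activator_A": 0.5, "repressor_B": -0.3}
level = update_protein(10.0, {"activator_A", "repressor_B"}, effects)
print(level)  # -> 10.2 (net rate 0.5 - 0.3 over one step)
```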
3 Proposed Model
The proposed model is an abstraction of the underlying biochemical processes and simulates the interaction between a TF and a target gene. MAS tools are very flexible, allowing the representation of each individual in a system and simplifying the evaluation of new hypotheses about the model. Thus, this work presents an initial version of a MAS as the modelling framework for representing a regulatory network. The simulator is implemented in NetLogo (https://ccl.northwestern.edu/netlogo/), a free multiagent-oriented software. Two configuration files are defined to set up the model: agent definition and constraint definition (Tables 1a and 1b).

Table 1. Agent and constraint definitions for the initial model

(a) Agent definition
Id | Size | Color | Quantity | Gene
0  | 1    | White | 0        | "X"
1  | 1    | Red   | 30       | "Y"

(b) Constraint definition
Regulator | Action | Regulate | Quantity
"Y"       | 1      | "X"      | 3

In Table 1a, there are 0 "X" genes and 30 "Y" genes; each gene is defined here in terms of its size (on a two-dimensional plane), color, quantity, and name. In Table 1b, the "1" in the Action column defines the action: the action of "Y" upon "X" is activation (Action 1), and 3 agents of "Y" are required to produce one agent of type "X". The simulator environment is presented in Fig. 2, where "X" and "Y" are represented as white and red circles, respectively. There is one green square in the environment that represents the binding site for "Y".

The process of regulation among genes emerges from their interactions at the binding sites and can be modeled as an enzyme kinetics process, also known as Michaelis-Menten kinetics. At low substrate concentration [S], the reaction rate varies linearly with [S] (first-order kinetics) [5]. The behavior of the Michaelis-Menten curve is used for comparisons with the results of the model [5] (as presented in Fig. 1).
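For reference, the Michaelis-Menten rate curve used for comparison can be reproduced numerically with the standard equation v = Vmax·[S]/(Km + [S]); the Vmax and Km values below are arbitrary choices for the sketch, not parameters of the simulator.

```python
def reaction_rate(s, vmax=10.0, km=5.0):
    """Michaelis-Menten reaction rate v = Vmax*[S]/(Km + [S]).
    For [S] << Km the rate grows roughly linearly with [S]
    (first-order kinetics); for [S] >> Km it saturates at Vmax."""
    return vmax * s / (km + s)

# Sweep substrate concentrations to trace the curve of Fig. 1a.
for s in (0.1, 1.0, 5.0, 50.0, 500.0):
    print(f"[S] = {s:6.1f} -> v = {reaction_rate(s):.3f}")
```

Note that at [S] = Km the rate is exactly Vmax/2, which is the usual sanity check for this curve.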
Fig. 1. Subfigure (a) shows a typical Michaelis-Menten reaction rate curve. Subfigure (b) shows typical variation in substrate concentration following a Michaelis-Menten approach.
Fig. 2. Subfigure (a) depicts the simulation environment. The binding sites are represented as green squares and the agents as white and red circles. Subfigure (b) shows the state of the simulation environment after 1648 ticks and the respective agent (gene) concentration time series.
4 Results
At the current stage of the development of our simulator, the genes (agents) interact at the binding sites according to the defined constraints and exhibit the behavior shown in Figs. 2a, 2b and 2c. In the simulator environment, "X" and "Y" are represented as white and red circles, respectively. There is one green square in the environment that represents the binding site for "Y". In NetLogo, each time step is represented by a tick. Figure 2a shows the simulation environment with the genes and binding sites after the first few ticks of the simulation. In the left panel of Fig. 2b the simulation environment is shown after 1684 ticks. Figure 2c presents the concentrations of genes "X" and "Y" throughout the simulation, after 1684 ticks.
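The per-tick interaction can be caricatured outside NetLogo: the constraint that three "Y" agents at the binding site yield one "X" agent becomes a simple counting rule. This is a deliberately simplified sketch of the model (function names, the visit probability, and the fact that "Y" agents are not consumed are our assumptions).

```python
import random

def tick(x_count, y_count, agents_needed=3, p_visit=0.1):
    """One simulation tick: each "Y" agent independently reaches the
    binding site with probability p_visit; every group of
    `agents_needed` bound "Y" agents produces one "X" agent
    (Action 1 from the constraint definition). In this sketch the
    "Y" agents are not consumed."""
    bound = sum(1 for _ in range(y_count) if random.random() < p_visit)
    produced = bound // agents_needed
    return x_count + produced, y_count

random.seed(42)          # reproducible run
x, y = 0, 30             # initial quantities from the agent definition
for _ in range(100):
    x, y = tick(x, y)
print(f"X after 100 ticks: {x}")
```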
5 Conclusions
This study presents the first efforts to simulate GRN using the MAS approach. In the present study, a MAS is defined with few rules and parameters. The system consists of only two genes, where one regulates the other. The resulting concentration curves present behaviour similar to the curves derived from the Michaelis-Menten equation and presented in Fig. 1b. Although these are preliminary results, they indicate that this is a promising subject for further research. We intend to validate our approach by comparing it to the work of [3], also supported by [1,2], which yields results such as the ones presented in Fig. 3.
Fig. 3. Results of [3]. In this figure a regulatory circuit is shown along with the resulting concentrations of its components. This is an example of the results we will use to compare with our approach.
Furthermore, we intend to compare our simulation studies with the model of the circadian cycle of the plant Arabidopsis thaliana [7] and with Bio-PEPA [4] and other frameworks.

Acknowledgments. We would like to thank CAPES (Coordination for the Improvement of Higher Education Personnel) for the financial support through a Doctorate Scholarship.
References 1. Adler, M., Alon, U.: Fold-change detection in biological systems. Curr. Opin. Syst. Biol. 8, 81–89 (2018) 2. Adler, M., Szekely, P., Mayo, A., Alon, U.: Optimal regulatory circuit topologies for fold-change detection. Cell Syst. 4, 171–181 (2017) 3. Alon, U.: Network motifs: theory and experimental approaches. Nat. Rev. Genet. 8, 450–461 (2007) 4. Haydarlou, R., Jacobsen, A., Bonzanni, N., Feenstra, K.A., Abeln, S., Heringa, J.: BioASF: a framework for automatically generating executable pathway models specified in BioPAX. Bioinformatics 32, i60–i69 (2016) 5. Johnson, K., Goody, R.: The original Michaelis constant: Translation of the 1913 Michaelis-Menten paper. Biochemistry 50, 8264–8269 (2011) 6. Liu, J., Chi, Y., Liu, Z., He, S.: Ensemble multi-objective evolutionary algorithm for gene regulatory network reconstruction based on fuzzy cognitive maps. IET J. 4, 24–36 (2019)
7. Pokhilko, A., Fernandez, A., Edwards, K., Southern, M., Halliday, K., Millar, A.: The clock gene circuit in Arabidopsis includes a repressilator with additional feedback loops. Mol. Syst. Biol. 8, 574 (2012) 8. Schreiber, S.: Small molecules: the missing link in the central dogma. Nat. Chem. Biol. 1, 64–66 (2005)
Manage Comfort Preferences Conflicts Using a Multi-agent System in an Adaptive Environment System

Pedro Filipe Oliveira 1,2, Paulo Novais 1, and Paulo Matos 2

1 Department of Informatics, Algoritmi Centre, University of Minho, Braga, Portugal
2 Instituto Politécnico de Bragança, Campus de Santa Apolónia, 5300-253 Bragança, Portugal
[email protected]
Abstract. Managing the conflicting comfort preferences of different users and spaces in an IoT adaptive system is a current problem. This paper proposes a protocol and hierarchical rules to develop a multi-agent system that achieves an Adaptive Environment System, one that solves conflict management autonomously for the users and independently of user schedules and routines.
Keywords: Adaptive-system · AmI · Multi-agent · IoT · Conflicts

1 Introduction
As in any environment where different users are present, there will be an inherent conflict of interest regarding the most diverse issues. This problem is also present in this case of developing an intelligent environment that adapts to the comfort preferences of each user. It is well known that in everyday life we deal with people with different comfort preferences, both in terms of temperature or humidity values and in terms of musical or video tastes, etc. This is all the more critical because these values interfere with people's well-being, or even their health, namely in terms of allergies, diseases, etc. In this project, a protocol was created which resolves this type of conflict as much as possible, thus trying to reach an optimum preference value that satisfies as many of the users present in the environment as possible. This work resulted in the complete specification of an architecture that supports the solution found to solve the presented problem. It will now be implemented, tested and validated using real case studies, so as to gather statistical information to assess its effectiveness and performance in the context of application. This work aims to continue and finalize the doctoral work presented in previous editions [6–9].

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2021
P. Novais et al. (Eds.): ISAmI 2020, AISC 1239, pp. 284–288, 2021. https://doi.org/10.1007/978-3-030-58356-9_32
2 Materials and Methods
Figure 1 shows the scenario in which this work was developed. The user communicates with the system through different devices (smartphone, wearable and other compatible devices), for which different technologies can be used, such as Near Field Communication (NFC) [10], Bluetooth Low Energy (BLE) [1] or Wi-Fi Direct [2]. The system then communicates with the Cloud to validate the information, and performs the management of the different components and actuators in the environment (climatization systems, security systems and other smart systems) (Fig. 2).
Fig. 1. Problem statement [6]

Fig. 2. Contextualization of time/environment dimensions [6]

3 Results
This section presents the technologies used in this project for the development of the entire multi-agent system applied to AmI. Figure 3 represents the different architecture layers. The agent that represents the local system receives its information, namely the security information (maximum values of temperature, gases, and others). For each user present at the location there is also an agent who represents him; this agent receives information about the user's preferences from the central system, which is used in the negotiation process. The negotiation process is then carried out between the local system agent and each of the user agents present at the location. The result of the negotiation is then passed on to the different actuators present in the location (Fig. 4). In the course of this project, the focus will be mainly on domestic/family environments, professional environments (workplaces) and public spaces, where a large number of people are usually present.
Fig. 3. Architecture of the multi-agent system [3–5, 9]
Fig. 4. AmI System - Use Case diagram
One of the rules used for conflict resolution is a hierarchy of preferences. Starting with family contexts, the preference value of adult elements (parents) is weighted over that of the children, in a ratio of 1 to 0.75. Another element of the hierarchy is the preference value of the space itself, if it exists, for which a proportion of 1.5 is used. Such values may exist in spaces with some kind of conditioning, such as kitchens/WCs or other constrained spaces. The proportions used for these rules are detailed in Table 1. In the professional context, the proportion values are also defined hierarchically; in this context the professional hierarchy of the space is used, as well as the space preference value if it exists. These proportions are detailed in Table 2. Regarding public/social spaces, the predominant value is obviously the space value, with a proportion of 2, and each user has a proportion of 0.15, since in these spaces it is natural that there is little variation in the values, owing to the high movement of people. These proportions are detailed in Table 3. The formula used to obtain the optimum preference value for the different spaces is the following:
\[
\mathit{prefValue} = \frac{\sum_{user=1}^{n} \mathit{userPref} \times \mathit{userHierProportion} + (\mathit{spacePref} \times \mathit{spaceProportion})}{\sum_{user=1}^{n} \mathit{userHierProportion} + \mathit{spaceProportion}}
\]
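As a worked example, the weighted-average formula above can be implemented directly. The proportions below come from the home-space rules (adults 1, child 0.75, space 1.5), while the preference values themselves (temperatures in °C) are invented for illustration.

```python
def pref_value(user_prefs, user_props, space_pref=None, space_prop=0.0):
    """Optimum preference for a space: the sum of each user's
    preference times their hierarchical proportion, plus the optional
    space term, divided by the sum of all proportions used."""
    num = sum(p * w for p, w in zip(user_prefs, user_props))
    den = sum(user_props)
    if space_pref is not None:
        num += space_pref * space_prop
        den += space_prop
    return num / den

# Home space: two adults (proportion 1), one child (0.75), and a
# conditioned space (proportion 1.5). Preference values are invented.
v = pref_value([22.0, 20.0, 24.0], [1.0, 1.0, 0.75],
               space_pref=21.0, space_prop=1.5)
print(round(v, 2))  # -> 21.53, i.e. (22 + 20 + 18 + 31.5) / 4.25
```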
Table 1. Type of users and proportions - Home space

Type    | Proportion
Adult   | 1
Child   | 0.75
Visitor | 1
Space   | 1.5

Table 2. Type of users and proportions - Work space

Type        | Proportion
Hierarchy 1 | (100-1)
Hierarchy 2 | (100-2)
Hierarchy n | (100-n)
Space       | 150

Table 3. Type of users and proportions - Public/Social space

Type   | Proportion
User 1 | 0.15
User 2 | 0.15
User n | 0.15
Space  | 2

4 Discussion and Conclusions
This work resulted in the complete specification of an architecture that supports the solution found to solve the presented problem. The agent system model is fully developed. At this stage the agent layer has been developed and implemented, and is now in a testing phase in the testing environment developed for this project. It will next be tested and validated using real case studies, so as to gather statistical information to assess its effectiveness and performance in the context of application. As future work, the test results will be evaluated in order to identify possibilities for optimizing the model and its implementation.

Acknowledgments. This work has been supported by FCT – Fundação para a Ciência e Tecnologia within the Project Scope: UID/CEC/00319/2019.
References

1. Bluetooth Specification: Bluetooth Core Specification Version 4.0. Specification of the Bluetooth System (2010)
2. Camps-Mur, D., Garcia-Saavedra, A., Serrano, P.: Device-to-device communications with Wi-Fi direct: overview and experimentation. IEEE Wirel. Commun. 20(3), 96–104 (2013)
3. González-Briones, A., Chamoso, P., De La Prieta, F., Demazeau, Y., Corchado, J.M.: Agreement technologies for energy optimization at home. Sensors 18(5), 1633 (2018)
4. González-Briones, A., De La Prieta, F., Mohamad, M.S., Omatu, S., Corchado, J.M.: Multi-agent systems applications in energy optimization problems: a state-of-the-art review. Energies 11(8), 1928 (2018)
5. González-Briones, A., Prieto, J., De La Prieta, F., Herrera-Viedma, E., Corchado, J.M.: Energy optimization using a case-based reasoning strategy. Sensors 18(3), 865 (2018)
6. Oliveira, P., Matos, P., Novais, P.: Behaviour analysis in smart spaces. In: 2016 International IEEE Conferences on Ubiquitous Intelligence & Computing, Advanced and Trusted Computing, Scalable Computing and Communications, Cloud and Big Data Computing, Internet of People, and Smart World Congress (UIC/ATC/ScalCom/CBDCom/IoP/SmartWorld), pp. 880–887. IEEE (2016)
7. Oliveira, P., Novais, P., Matos, P.: Challenges in smart spaces: aware of users, preferences, behaviours and habits. In: International Conference on Practical Applications of Agents and Multi-Agent Systems, pp. 268–271. Springer (2017)
8. Oliveira, P., Novais, P., Matos, P.: Generating real context data to test user dependent systems - application to multi-agent systems. In: International Conference on Practical Applications of Agents and Multi-Agent Systems, pp. 180–187. Springer (2019)
9. Oliveira, P.F., Novais, P., Matos, P.: A multi-agent system to manage users and spaces in an adaptive environment system. In: International Conference on Practical Applications of Agents and Multi-Agent Systems, pp. 330–333. Springer (2019)
10. Want, R.: Near field communication. IEEE Pervasive Comput. 10(3), 4–7 (2011)
AI-Based Proposal for Epileptic Seizure Prediction in Real-Time

David García-Retuerta

University of Salamanca, Patio de Escuelas Menores, 37008 Salamanca, Spain
[email protected]

Abstract. Epilepsy is of great importance to researchers, as this group of neurological disorders affects roughly 1% of the global population. Modelling the behaviour of epileptic brains and generating predictions of when the next seizures will take place has the potential to contribute to several research lines and even to evolve into real-life treatments. Using a combination of mathematical methods and machine learning techniques can achieve a high performance in seizure prediction, which shall be based on a properly labelled dataset, manually evaluated by an expert.
Keywords: Epilepsy · Algorithms · Machine learning

1 Introduction
Epilepsy has always been an important challenge for society, and it still is nowadays [1–9]. Recent discoveries have found treatments for many neurological disorders, which shows the great potential of our current scientific tools. In particular, machine learning has been achieving promising results in recent years [10–15]. Its great capacity to find complex patterns hidden within vast amounts of data and to generalise them, finding the underlying rules, allows researchers to apply its algorithms in their research with great versatility. Fields like medicine, engineering, pharmacology and several more have greatly benefited from machine learning. In particular, its revolutionary advantages in image recognition, natural language processing (NLP) and data science are useful in almost all disciplines [16–24]. This article tackles the problem of epileptic seizure prediction using a mixture of machine learning and new mathematical algorithms, which is a novel approach in this field. In this paper, an algorithm which outputs the probability of imminent seizures is presented. Based on the output, a system which produces short-term alerts before a seizure is created. In our approach, electroencephalography (EEG) is used to detect local field potentials and therefore produce the data which is used as input [25,26]. The data is labelled discerning the ictal events, the inter-ictal events and the normal brain state. The classification is carried out using several well-studied mathematical algorithms. These labels are used to create a time series whose behaviour is modelled by a CNN (convolutional neural network) and an RNN (recurrent neural network). Finally, the output is used to create the early-warning alert system, which issues a warning if the output of the network is higher than a certain threshold [27].

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2021
P. Novais et al. (Eds.): ISAmI 2020, AISC 1239, pp. 289–292, 2021. https://doi.org/10.1007/978-3-030-58356-9_33
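The alert rule described above reduces to thresholding the network's output probability per EEG window. The sketch below assumes a stream of per-window seizure probabilities as input; the threshold value and function name are our own illustrative choices, not values from the paper.

```python
def early_warnings(probabilities, threshold=0.8):
    """Return the indices of EEG windows whose predicted seizure
    probability exceeds the alarm threshold. Choosing a lower
    threshold trades more false alarms (type I errors) for fewer
    missed seizures (type II errors), matching the stated design
    goal of minimising type II errors."""
    return [i for i, p in enumerate(probabilities) if p > threshold]

# Hypothetical per-window probabilities emitted by the model.
stream = [0.10, 0.25, 0.83, 0.91, 0.40, 0.95]
print(early_warnings(stream))  # -> [2, 3, 5]
```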
2 Conclusion
This work proposes a novel system for brain data processing with the ability to predict future seizures, using machine learning and mathematical algorithms. The algorithm has been applied to data obtained from an EEG of the brains of mice. The main goals of the system are to obtain a properly labelled dataset and to achieve accurate predictions of ictal events shortly before they occur. Furthermore, the algorithm is designed to minimise type II errors, as it is preferable to detect "too many" possible seizures than to miss any of them. The most significant results obtained in this work are listed below. This paper provides a novel, artificial-intelligence-based, real-time system, which enables a correct early-warning alarm system for seizures. We also address its possible applications in patients' treatments and in industry. In future work, we will extend the system to more complex data from humans and will improve its performance so that it can work in real time with (complex) human data.

Acknowledgements. This paper has been partially supported by the Salamanca Ciudad de Cultura y Saberes Foundation under the Talent Attraction Programme (CHROMOSOME project).
References

1. Teixido, M., Palleja, T., Tresanchez, M., Font, D., Moreno, J., Fernández, A., Palacín, J., Rebate, C.: Optimization of the virtual mouse HeadMouse to foster its classroom use by children with physical disabilities. ADCAIJ: Adv. Distrib. Comput. Artif. Intell. J. 2(4) (2013). ISSN: 2255-2863, Salamanca
2. Li, T., Sun, S., Corchado, J.M., Siyau, M.F.: A particle dyeing approach for track continuity for the SMC-PHD filter. In: 17th International Conference on Information Fusion (FUSION), pp. 1–8. IEEE (July 2014)
3. Costa, A., Heras, S., Palanca, J., Novais, P., Julián, V.: Persuasion and recommendation system applied to a cognitive assistant. ADCAIJ: Adv. Distrib. Comput. Artif. Intell. J. 5(2) (2016). ISSN: 2255-2863, Salamanca
4. Fdez-Riverola, F., Iglesias, E.L., Díaz, F., Méndez, J.R., Corchado, J.M.: Applying lazy learning algorithms to tackle concept drift in spam filtering. Expert Syst. Appl. 33(1), 36–48 (2007)
5. Keyhanipour, A.H., Moshiri, B.: Designing a web spam classifier based on feature fusion in the layered multi-population genetic programming framework. ADCAIJ: Adv. Distrib. Comput. Artif. Intell. J. 2(3) (2013). ISSN: 2255-2863, Salamanca
AI-Based Proposal for Epileptic Seizure Prediction in Real-Time
6. Ameller, M.A., González, M.A.: Minutiae filtering using ridge-valley method. ADCAIJ: Adv. Distrib. Comput. Artif. Intell. J. 5(1) (2016). ISSN: 2255-2863, Salamanca
7. Morente-Molinera, J.A., Kou, G., González-Crespo, R., Corchado, J.M., Herrera-Viedma, E.: Solving multi-criteria group decision making problems under environments with a high number of alternatives using fuzzy ontologies and multi-granular linguistic modelling methods. Knowl.-Based Syst. 137, 54–64 (2017)
8. Li, T., Sun, S., Bolić, M., Corchado, J.M.: Algorithm design for parallel implementation of the SMC-PHD filter. Signal Process. 119, 115–127 (2016)
9. Coria, J.A.G., Castellanos-Garzón, J.A., Corchado, J.M.: Intelligent business processes composition based on multi-agent systems. Expert Syst. Appl. 41(4), 1189–1205 (2014)
10. Fernández-Fernández, A., Cervelló-Pastor, C., Ochoa-Aday, L.: Energy-aware routing in multiple domains software-defined networks. ADCAIJ: Adv. Distrib. Comput. Artif. Intell. J. 5(3) (2016). ISSN: 2255-2863, Salamanca
11. García-Retuerta, D., Bartolomé, Á., Chamoso, P., Corchado, J.M.: Counter-terrorism video analysis using hash-based algorithms. Algorithms 12(5), 110 (2019)
12. Khayati, N., Lejouad-Chaari, W.: A distributed and collaborative intelligent system for medical diagnosis. ADCAIJ: Adv. Distrib. Comput. Artif. Intell. J. 2(2) (2013). ISSN: 2255-2863, Salamanca
13. Tapia, D.I., Fraile, J.A., Rodríguez, S., Alonso, R.S., Corchado, J.M.: Integrating hardware agents into an enhanced multi-agent architecture for ambient intelligence systems. Inf. Sci. 222, 47–65 (2013)
14. Corchado, J.M., Pavón, J., Corchado, E.S., Castillo, L.F.: Development of CBR-BDI agents: a tourist guide application. In: European Conference on Case-Based Reasoning, pp. 547–559. Springer, Heidelberg (August 2004)
15. Lima, A.C.E., de Castro, L.N., Corchado, J.M.: A polarity analysis framework for Twitter messages. Appl. Math. Comput. 270, 756–767 (2015)
16. Fdez-Riverola, F., Corchado, J.M.: FSfRT: forecasting system for red tides. Appl. Intell. 21(3), 251–264 (2004)
17. García-Retuerta, D., Bondía, R.A., Tejedor, J.P., Corchado, J.M.: Inteligencia artificial para la asignación automática de categorías constructivas. 94 SEXTA, vol. 111 (2018)
18. Fdez-Riverola, F., Iglesias, E.L., Díaz, F., Méndez, J.R., Corchado, J.M.: SpamHunting: an instance-based reasoning system for spam labelling and filtering. Decis. Support Syst. 43(3), 722–736 (2007)
19. Casado-Vara, R., Martin-del Rey, A., Affes, S., Prieto, J., Corchado, J.M.: IoT network slicing on virtual layers of homogeneous data for improved algorithm operation in smart buildings. Future Gener. Comput. Syst. 102, 965–977 (2020)
20. Baruque, B., Corchado, E., Mata, A., Corchado, J.M.: A forecasting solution to the oil spill problem based on a hybrid intelligent system. Inf. Sci. 180(10), 2029–2043 (2010)
21. Rodrigues, M., Gonçalves, S., Fdez-Riverola, F.: E-learning platforms and e-learning students: building the bridge to success. ADCAIJ: Adv. Distrib. Comput. Artif. Intell. J. 1(2) (2012). ISSN: 2255-2863, Salamanca
22. Casado-Vara, R., Prieto, J., De la Prieta, F., Corchado, J.M.: How blockchain improves the supply chain: case study alimentary supply chain. Procedia Comput. Sci. 134, 393–398 (2018)
23. Corchado, J.M., Aiken, J.: Hybrid artificial intelligence methods in oceanographic forecast models. IEEE Trans. Syst. Man Cybern. Part C (Appl. Rev.) 32(4), 307–313 (2002)
24. González-Briones, A., Prieto, J., De La Prieta, F., Herrera-Viedma, E., Corchado, J.M.: Energy optimization using a case-based reasoning strategy. Sensors 18(3), 865 (2018)
25. Díaz, F., Fdez-Riverola, F., Corchado, J.M.: gene-CBR: a case-based reasoning tool for cancer diagnosis using microarray data sets. Comput. Intell. 22(3–4), 254–268 (2006)
26. Corchado, J.M., Corchado, E.S., Aiken, J., Fyfe, C., Fernandez, F., Gonzalez, M.: Maximum likelihood Hebbian learning based retrieval method for CBR systems. In: International Conference on Case-Based Reasoning, pp. 107–121. Springer, Heidelberg (June 2003)
27. Castro, J., Marti-Puig, P.: Real-time identification of respiratory movements through a microphone. ADCAIJ: Adv. Distrib. Comput. Artif. Intell. J. 3(3) (2014). ISSN: 2255-2863, Salamanca
Digital Twin Framework for Energy Efficient Greenhouse Industry 4.0

Daniel Anthony Howard(B), Zheng Ma, and Bo Nørregaard Jørgensen

University of Southern Denmark, Campusvej 55, 5230 Odense, Denmark
[email protected]
Abstract. This paper introduces ongoing research on enabling industrial greenhouse growers to optimize production using multi-agent systems and digital twin technology. The project seeks to develop a production process framework for greenhouses, based on several case studies, that can be applied to different greenhouse facilities to enable broad implementation in the industrial horticulture sector. The research will incorporate AI technology to support the production process agent in forecasting and learning optimal operating conditions within set parameters, which will be fed back to the grower through a common information model. Furthermore, the production agent will communicate with other process agents to co-optimize the essential aspects of production. In turn, this allows growers to optimize production cost with minimal risk to product quality while helping to uphold grid stability. The findings of this research project may be beneficial for developing industry-specific energy flexibility solutions that incorporate product and process constraints.

Keywords: Industry 4.0 · Greenhouse · Multi-agent system · Digital twin
1 Problem Statement

The greenhouse sector has been identified as a sector with significant unexploited process flexibility and, as discussed by Ma et al., there is also a potential for monetary gain for the participating growers [1]. Several projects have already studied the planning and control of the lighting used for production within greenhouses [2–4]. However, further research is required to understand the flexibility constraints imposed by product quality.

1.1 Greenhouse Industry 4.0

This research is part of the Greenhouse Industry 4.0 project, funded by the Danish Energy Technology Development and Demonstration Program (EUDP). The project takes a holistic approach by optimizing operation based on inputs from three developed digital twins. The digital twins will cover the energy system and the climate compartment, and the research presented here will develop a digital twin for the production process flow.

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2021. P. Novais et al. (Eds.): ISAmI 2020, AISC 1239, pp. 293–297, 2021. https://doi.org/10.1007/978-3-030-58356-9_34
1.2 Research Aim and Objectives

This research aims to develop a digital twin of the greenhouse production flow: an artificial intelligence (AI)-based simulation model of the production flow for investigating the effects of co-optimizing the production schedule, plant growth, energy consumption and cost, by considering influential factors including production deadlines, quality assessment, (district) heating demand, gas and electricity prices, and weather forecasts. To achieve this aim, the following objectives will be pursued:

• Development of a multi-agent simulation model for the greenhouse production flow
• Identification of a Common Information Model interface
• Investigation of the effects of co-optimizing the factors influencing the production flow
• Simulation of energy efficiency and demand response potentials
The research will be conducted as part of the Greenhouse Industry 4.0 project between September 2019 and October 2023.
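As a minimal illustration of the first objective, a multi-agent production-flow simulation can be reduced to process agents that report their rated power while active; the agent names, power ratings and schedules below are invented for illustration and are not project data:

```python
class ProcessAgent:
    """Minimal process agent: consumes its rated power while active."""
    def __init__(self, name, rated_kw, schedule):
        self.name = name
        self.rated_kw = rated_kw
        self.schedule = schedule  # list of 0/1 activity flags per timestep

    def power_at(self, t):
        return self.rated_kw * self.schedule[t]

def baseline_consumption(agents, horizon):
    """Aggregate kW per timestep across all process agents."""
    return [sum(a.power_at(t) for a in agents) for t in range(horizon)]

agents = [
    ProcessAgent("lighting", rated_kw=40.0, schedule=[1, 1, 0, 0]),
    ProcessAgent("irrigation", rated_kw=5.0, schedule=[0, 1, 1, 0]),
]
print(baseline_consumption(agents, horizon=4))  # -> [40.0, 45.0, 5.0, 0.0]
```

A baseline of this kind is what the flexibility analysis in Sect. 2 would then perturb, e.g. by shifting schedules in response to electricity prices.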
2 Related Work

2.1 Brewery Fermentation Process Optimization

A case study applying multi-agent-based simulation to a production process was conducted at a Danish brewery. The goal was to establish a baseline consumption model for the fermentation tanks within the brewery and subsequently investigate any potential for the process to deliver energy flexibility based on market conditions. Brewing is a quality-oriented process, and this was emphasized by developing a process-agent logic using the constraints of the product agents (beer), primarily the temperature zones for the yeast [5].

2.2 Greenhouse Flexibility Potential

Before co-optimization using the information provided by the other digital twins in the system, a baseline flexibility potential for the process should be established as a line of reference. To determine the flexibility potential, a multi-agent system representing the current greenhouse production process was developed. Using this multi-agent system and data provided by the growers, it was possible to estimate the current power consumption and flow of the processes. Furthermore, a product agent representing the process product (the plant) could be developed to account for variations in growth depending on the environment.

2.3 Integration of Simulation Software with Optimization Platform

Since the simulation software representing the digital twin of the production process must interact with an optimization platform through data exchange, it is important to establish the integration possibilities between the two platforms. As the digital twins developed in the project will run on different platforms, it is beneficial to connect them using a
horizontal integration approach to minimize the number of connections per subsystem. Using an in-memory data grid (Hazelcast), it was possible to establish a connection between the simulation platform (AnyLogic) and the optimization platform.

This leads to the research question: will the potential energy flexibility for implicit demand response in industrial greenhouses, estimated with a process framework that incorporates production deadlines, be more effective than a traditional estimation of the energy flexibility potential at preventing overrun delivery deadlines?
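The hand-off between the two platforms can be sketched with an in-process queue standing in for the in-memory data grid; the topic string and the 0.9 setpoint scaling are purely illustrative assumptions, and the real Hazelcast and AnyLogic APIs differ from this toy:

```python
import queue

# A thread-safe queue stands in for the in-memory data grid (the project
# uses Hazelcast between AnyLogic and the optimization platform).
grid = queue.Queue()

def simulation_step(state):
    """Simulation side: publish the current process state to the grid."""
    grid.put({"topic": "greenhouse/state", "payload": state})

def optimizer_poll():
    """Optimizer side: consume a state message and return a setpoint.
    The 0.9 power reduction is a placeholder for real optimization."""
    msg = grid.get(timeout=1)
    return {"setpoint": msg["payload"]["power_kw"] * 0.9}

simulation_step({"power_kw": 50.0})
print(optimizer_poll())  # -> {'setpoint': 45.0}
```

The point of the pattern is the decoupling: neither side calls the other directly, so each digital twin can run on its own platform and only agree on the message schema.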
3 Hypothesis

To aid in answering the research question, a set of hypotheses was constructed:

• H1: If the frequency of unmet production deadlines is related to energy flexibility, then greenhouses utilizing energy flexibility will have a higher frequency of unmet production deadlines.
• H2: If product growth is related to light/CO2/humidity exposure, then forecasting these parameters (light/CO2/humidity) will yield the estimated completion time.
• H3: If the production price depends on multiple compartments, then co-optimizing with the other compartments will result in the cheapest production price.
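H2 can be made concrete with a toy forecast-driven completion estimate; the additive growth index, the weights and the assumption that each factor is normalised to [0, 1] are invented for illustration and are not the project's growth model:

```python
def estimated_completion(growth_target, forecast, weights=(0.5, 0.3, 0.2)):
    """Accumulate a daily growth index from forecast light/CO2/humidity
    (each normalised to [0, 1]) and return the first day on which the
    target is reached, or None if it is not reached within the horizon."""
    wl, wc, wh = weights
    accumulated = 0.0
    for day, (light, co2, humidity) in enumerate(forecast, start=1):
        accumulated += wl * light + wc * co2 + wh * humidity
        if accumulated >= growth_target:
            return day
    return None

forecast = [(0.8, 0.6, 0.7), (0.9, 0.7, 0.6), (0.5, 0.5, 0.5)]
print(estimated_completion(growth_target=1.4, forecast=forecast))  # -> 2
```

Feeding such an estimate a price signal is then what H1 tests: shifting exposure to cheap hours delays the completion day and may push it past the delivery deadline.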
4 Proposal

The research proposes the development of a digital twin that utilizes AI technology within the production process agent. Considering inputs from the surrounding agents, it proposes a method for optimizing production while incorporating product deadlines. The novelty of the research hence lies in incorporating the product deadline and estimating the best-practice production schedule (where possible) to minimize production cost.
5 Preliminary Results

The research conducted thus far has focused on establishing a baseline for production process modeling. From the established model, it has been possible to estimate the energy consumption of the production process, in accordance with the first of the research objectives. An overview of the production process flow can be seen in Fig. 1.
Fig. 1. Greenhouse production process model overview
6 Reflections

The research project is still in the initial phase of development, and it is imperative for successful completion to establish the correct system architecture for data handling. Moving forward, the goal will be to incorporate AI technology into the agent logic, allowing the model to be trained on its inputs. Inputs will be supplied by other digital twins, to which the communication flow will also need to be established. As the project relies on multiple parties to contribute their work, an initial estimate of the input parameters has been set; it can be exchanged for real data inputs once they become available.
References

1. Ma, Z., Jørgensen, B.N.: Energy flexibility of the commercial greenhouse growers: the potential and benefits of participating in the electricity market. In: 2018 IEEE Power & Energy Society Innovative Smart Grid Technologies Conference (ISGT), 19–22 February 2018, pp. 1–5 (2018). https://doi.org/10.1109/isgt.2018.8403368
2. Christensen, K., Ma, Z., Værbak, M., Demazeau, Y., Jørgensen, B.N.: Agent-based decision making for adoption of smart energy solutions. Presented at the IV International Congress of Research in Sciences and Humanities International Research Conference (SHIRCON 2019), Lima, Peru, 12–15 November 2019. https://doi.org/10.1109/SHIRCON48091.2019.9024880
3. Howard, D.A., Ma, Z., Aaslyng, J.M., Jørgensen, B.N.: Data architecture for digital twin of commercial greenhouse production. Presented at the 2020 RIVF International Conference on Computing and Communication Technologies, Ho Chi Minh City, Vietnam, 6–7 April 2020. https://doi.org/10.1109/RIVF48685.2020.9140726
4. Christensen, K., Ma, Z., Demazeau, Y., Jørgensen, B.N.: Agent-based modeling for optimizing CO2 reduction in commercial greenhouse production with the implicit demand response. Presented at the 6th IEEJ International Workshop on Sensing, Actuation, Motion Control, and Optimization (SAMCON2020), Shibaura Institute of Technology, Tokyo, 14–16 March 2020. http://id.nii.ac.jp/1031/00127067/
5. Howard, D., et al.: Optimization of energy flexibility in cooling process for brewery fermentation with multi-agent simulation. Presented at the 6th IEEJ International Workshop on Sensing, Actuation, Motion Control, and Optimization (SAMCON2020), Shibaura Institute of Technology, Tokyo, 14–16 March 2020. http://id.nii.ac.jp/1031/00127065/
“Cooperative Deeptech Platform” for Innovation-Hub Members of DISRUPTIVE

Niloufar Shoeibi(B)

BISITE Research Group, University of Salamanca, Salamanca, Spain
[email protected]
Abstract. The “Cooperative Deeptech Platform” project aims at designing and implementing an architecture based on DLT/blockchain technologies and virtual agent organizations, which will provide intelligent computing to edge nodes in IoT environments. The platform will facilitate the integration of all the technology developed by the members of the DISRUPTIVE project and will incorporate additional and complementary elements for building large IoT projects: Smart Cities, Smart Grids, Industry 4.0, etc. DISRUPTIVE contemplates the development of disruptive technology, but not the integration of the generated elements; this project will therefore be a valuable complement to DISRUPTIVE. The proposal seeks the integration of information technologies for the improvement of quality of life. In the “Cooperative Deeptech Platform” project, research will be carried out on latest-generation technologies and environments such as DLT/blockchain, Big Data, IoT and edge computing. To this end, the design of an architecture that uses intelligent systems will be studied in order to balance the flow of information in a network, facilitating the integration of the different technologies already developed by the members of the project while also developing innovative components.

Keywords: Innovation-Hub platform · DLT/blockchain · Cooperative platform
1 Introduction

DISRUPTIVE is a Spain–Portugal cross-border cooperation project that seeks to improve research and development (R&D) infrastructures and the capacity to develop excellence in R&D, and to promote centers of competence, especially those of European interest. The project seeks to promote research activity, increase the number of researchers specialized in disruptive technologies, and generate a driving effect with a significant impact on the development and competitiveness of the cooperation area. Specifically, the project proposes research in DLT (Distributed Ledger Technology), blockchain, IoT, edge computing, artificial intelligence, etc. In short, this project aims to promote a cross-border Digital Innovation Hub (macro IHL) that brings together skills, making the CyL-PN region an innovative pole of reference for cutting-edge ICT technologies. The proposed work will allow the development of the research promoted by

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2021. P. Novais et al. (Eds.): ISAmI 2020, AISC 1239, pp. 298–304, 2021. https://doi.org/10.1007/978-3-030-58356-9_35
the DISRUPTIVE project. The main objective of the project is to facilitate the training of new researchers to support interregional activities for R&D excellence. The combination of IoT devices and blockchain technology for information encryption will ensure the traceability and reliability of information in a distributed way across the system. In addition, we will seek to propose and implement new algorithms based on machine learning techniques that will help in the creation of distributed intelligent systems. This research work constitutes an opportunity for both the present and the future of the research group in which it will be developed. Likewise, it will have a great impact on all the members of the DISRUPTIVE project, who will benefit from a new technology integration platform as well as acquire knowledge in different fields: edge computing, blockchain, IoT, etc. Moreover, the results of the project will be applied in different use cases of social, scientific and technological interest, in collaboration with research groups and companies of Castilla y León, promoting a growing IT sector.
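As an illustration of how hash-linked records provide the traceability mentioned above, the following sketch chains IoT readings with SHA-256; it is a toy stand-in written for this text, not the DLT/blockchain stack the project will actually use:

```python
import hashlib
import json

def add_block(chain, payload):
    """Append an IoT reading to a minimal hash-linked log; each block
    commits to the previous one, so later tampering is detectable."""
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    body = json.dumps({"payload": payload, "prev": prev_hash}, sort_keys=True)
    chain.append({"payload": payload, "prev": prev_hash,
                  "hash": hashlib.sha256(body.encode()).hexdigest()})

def verify(chain):
    """Recompute every link; return False if any block was altered."""
    prev_hash = "0" * 64
    for block in chain:
        body = json.dumps({"payload": block["payload"], "prev": prev_hash},
                          sort_keys=True)
        if block["prev"] != prev_hash or \
           hashlib.sha256(body.encode()).hexdigest() != block["hash"]:
            return False
        prev_hash = block["hash"]
    return True

chain = []
add_block(chain, {"sensor": "temp-01", "value": 21.5})
add_block(chain, {"sensor": "temp-01", "value": 21.7})
print(verify(chain))                  # -> True
chain[0]["payload"]["value"] = 99.9   # tamper with a stored reading
print(verify(chain))                  # -> False
```

A real DLT adds distributed consensus on top of this linking; the sketch only shows the integrity property that makes the readings trustworthy.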
2 Proposed Method

The “Cooperative Deeptech Platform” poses technological challenges that fall within the promotion and generation of frontier knowledge. Specifically, the proposed architecture, which facilitates the management of information and knowledge in the IoT field, contributes to the generation of new knowledge on the integration of the following technologies:

– Distributed ledger technologies (DLT) [1–12] and blockchain [13–24]. In this project, the design of the technological framework needed to integrate DLTs into the core of the architecture is a highly complex challenge. Given the highly disruptive capacity of DLTs, new products and services based on this technological innovation are expected to be developed.
– Edge computing [25–32]. In this context, it is important to identify the right balance between centralized processing, carried out in the cloud, and processing displaced to the edges, whose advantages include the reduction and optimization of high-volume data traffic in Big Data architectures. In this project, research will be conducted on intelligent systems that allow decision-making to be decentralized.
– Virtual organizations of light agents for IoT [33–40]. In the context of this work, we will pursue the implementation of virtual organizations of agents on IoT devices in networks that exploit the edge computing paradigm.
– Intelligent models [41–55], which can balance the system and establish mechanisms to identify where in the network data processing and information analysis are carried out, based on existing knowledge. Distributing the processing efficiently in an edge computing system, and moving information processing into the very devices where the information is created, is not an easy task.
This project proposes the use of neuro-symbolic artificial intelligence systems that can work intelligently with large flows of information (in a centralized or distributed manner) while distributing the computational load between a powerful central calculation server and a network of light agents.
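In its simplest form, the edge/cloud load-balancing decision described above could look like the following sketch; the payload limit, the load cap and the function name are illustrative assumptions, not parameters from the project:

```python
def place_task(payload_kb, edge_load, *, edge_limit_kb=512, load_cap=0.8):
    """Decide where a light agent should process a task: keep small
    payloads at the edge while the node has spare capacity, otherwise
    ship them to the central server."""
    if payload_kb <= edge_limit_kb and edge_load < load_cap:
        return "edge"
    return "cloud"

print(place_task(128, edge_load=0.4))   # edge: small payload, idle node
print(place_task(2048, edge_load=0.4))  # cloud: payload too large
print(place_task(128, edge_load=0.95))  # cloud: node saturated
```

The intelligent models cited above would learn such placement rules from observed traffic rather than hard-coding thresholds; the sketch only fixes the shape of the decision.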
As use cases, we will work on ongoing projects in intelligent electrical distribution and smart cities. The project also aims to promote publications in these areas in journals with a high impact factor, in addition to those generated at workshops and international conferences, which will be included in the dissemination and exploitation plan.
3 Conclusion

The “Cooperative Deeptech Platform” project will also help to strengthen the international leadership of the research team, as a number of the collaborating companies have a broad international presence. Likewise, it will be possible to expand the current network of contacts through events organized within the dissemination plan, which will be attended by international experts from the ICT sector, urban planning, etc.

The work process designed to be followed during the development of the thesis is based on the Action-Research (AR) methodology. The methodology identifies the problem and formulates it as a hypothesis, based on concepts defined within a quantitative model of reality. Then, the information is collected, organized and analyzed, continuing with the design of a proposal focused on solving the problem. Finally, the conclusions are formulated after evaluating the results obtained from the research. Following this methodology requires defining a series of phases that will allow the proposed objectives to be achieved and the hypothesis to be demonstrated. The proposed phases are the following:

1. Definition of the problem: approaching the problem together with the environment that defines it, in order to establish the objectives and the hypothesis of the research.
2. Review of the state of the art: analysis of the problem and of solutions in similar environments carried out by other researchers. The review process will focus both on the techniques applied in expression analysis and on the alternatives and methods for carrying out automatic planning. The review of the state of the art must be a continuous process throughout the research.
3. Proposal of models and validation of the fulfillment of the objectives as the different components are specified. The models are broken down into a series of components to facilitate the validation process and thus improve the research process.
4. Study of the results obtained, through comparison with other procedures, to determine whether the objectives and the hypothesis initially proposed have been achieved.
5. Publication of the results obtained throughout the research, both at congresses and in journals. Publication at congresses is of great importance because it allows attendance at conferences that facilitate the first-hand exchange of ideas.
Acknowledgments. This work has been partially supported by the European Regional Development Fund (ERDF) through the Interreg Spain-Portugal V-A Program (POCTEP) under grant 0677_DISRUPTIVE_2_E (Intensifying the activity of Digital Innovation Hubs within the PocTep region to boost the development of disruptive and last generation ICTs through cross-border cooperation).
References

1. Coghill, J.G.: Distributed usage logging: what to consider. J. Electron. Resour. Med. Libr. 16(2), 81–86 (2019)
2. Li, T., Sun, S., Corchado, J.M., Siyau, M.F.: A particle dyeing approach for track continuity for the SMC-PHD filter. In: 17th International Conference on Information Fusion (FUSION), pp. 1–8. IEEE (July 2014)
3. Fdez-Riverola, F., Iglesias, E.L., Díaz, F., Méndez, J.R., Corchado, J.M.: Applying lazy learning algorithms to tackle concept drift in spam filtering. Expert Syst. Appl. 33(1), 36–48 (2007)
4. Morente-Molinera, J.A., Kou, G., González-Crespo, R., Corchado, J.M., Herrera-Viedma, E.: Solving multi-criteria group decision making problems under environments with a high number of alternatives using fuzzy ontologies and multi-granular linguistic modelling methods. Knowl.-Based Syst. 137, 54–64 (2017)
5. Li, T., Sun, S., Bolić, M., Corchado, J.M.: Algorithm design for parallel implementation of the SMC-PHD filter. Signal Process. 119, 115–127 (2016)
6. Coria, J.A.G., Castellanos-Garzón, J.A., Corchado, J.M.: Intelligent business processes composition based on multi-agent systems. Expert Syst. Appl. 41(4), 1189–1205 (2014)
7. Gómez Zotano, M., Gómez-Sanz, J., Pavón, J.: User behavior in mass media websites. ADCAIJ: Adv. Distrib. Comput. Artif. Intell. J. 4(3), 47–56 (2015)
8. Tapia, D.I., Fraile, J.A., Rodríguez, S., Alonso, R.S., Corchado, J.M.: Integrating hardware agents into an enhanced multi-agent architecture for Ambient Intelligence systems. Inf. Sci. 222, 47–65 (2013)
9. Corchado, J.M., Pavón, J., Corchado, E.S., Castillo, L.F.: Development of CBR-BDI agents: a tourist guide application. In: Funk, P., González Calero, P.A. (eds.) ECCBR 2004. LNCS (LNAI), vol. 3155, pp. 547–559. Springer, Heidelberg (2004). https://doi.org/10.1007/978-3-540-28631-8_40
10. Lima, A.C.E., de Castro, L.N., Corchado, J.M.: A polarity analysis framework for Twitter messages. Appl. Math. Comput. 270, 756–767 (2015)
11. Fdez-Riverola, F., Corchado, J.M.: FSfRT: forecasting system for red tides. Appl. Intell. 21(3), 251–264 (2004)
12. Kushida, T.: Distributed logging service with distributed hash table for cloud. In: Hsu, C.-H., Kallel, S., Lan, K.-C., Zheng, Z. (eds.) IOV 2019. LNCS, vol. 11894, pp. 158–173. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-38651-1_15
13. Bashynska, I., Malanchuk, M., Zhuravel, O., Olinichenko, K.: Smart solutions: risk management of crypto-assets and blockchain technology. Int. J. Civil Eng. Technol. (IJCIET) 10(2), 1121–1131 (2019)
14. Závodská, A., Šramová, V., Aho, A.-M.: Knowledge in value creation process for increasing competitive advantage. ADCAIJ: Adv. Distrib. Comput. Artif. Intell. J. 1(3), 35–47 (2012). ISSN: 2255-2863, Salamanca
15. Fdez-Riverola, F., Iglesias, E.L., Díaz, F., Méndez, J.R., Corchado, J.M.: SpamHunting: an instance-based reasoning system for spam labelling and filtering. Decis. Support Syst. 43(3), 722–736 (2007)
16. Kenji, M., Kimura, K., Pérez, A.: Control prosody using multi-agent system. ADCAIJ: Adv. Distrib. Comput. Artif. Intell. J. 2(4), 49–56 (2013). ISSN: 2255-2863, Salamanca
17. Casado-Vara, R., Martin-del Rey, A., Affes, S., Prieto, J., Corchado, J.M.: IoT network slicing on virtual layers of homogeneous data for improved algorithm operation in smart buildings. Future Gener. Comput. Syst. 102, 965–977 (2020)
18. Sergio, A., Carvalho, S., Rego, M.: On the use of compact approaches in evolution strategies. ADCAIJ: Adv. Distrib. Comput. Artif. Intell. J. 3(4), 13–23 (2014). ISSN: 2255-2863, Salamanca
19. Baruque, B., Corchado, E., Mata, A., Corchado, J.M.: A forecasting solution to the oil spill problem based on a hybrid intelligent system. Inf. Sci. 180(10), 2029–2043 (2010)
20. Agüero, J., Rebollo, M., Carrascosa, C., Julián, V.: MDD-approach for developing pervasive systems based on service-oriented multi-agent systems. ADCAIJ: Adv. Distrib. Comput. Artif. Intell. J. 2(3), 55–64 (2013). ISSN: 2255-2863, Salamanca
21. Casado-Vara, R., Prieto, J., De la Prieta, F., Corchado, J.M.: How blockchain improves the supply chain: case study alimentary supply chain. Procedia Comput. Sci. 134, 393–398 (2018)
22. Trindade, N., Antunes, L.: An architecture for agent's risk perception. ADCAIJ: Adv. Distrib. Comput. Artif. Intell. J. 2(2), 75–85 (2013). ISSN: 2255-2863, Salamanca
23. Di Giammarco, G., Di Mascio, T., Di Mauro, M., Tarquinio, A., Vittorini, P.: SmartHeart CABG Edu. ADCAIJ: Adv. Distrib. Comput. Artif. Intell. J. 4(1) (2015). ISSN: 2255-2863, Salamanca
24. Ocheja, P., Flanagan, B., Ueda, H., Ogata, H.: Managing lifelong learning records through blockchain. Res. Pract. Technol. Enhanced Learn. 14(1), 1–19 (2019). https://doi.org/10.1186/s41039-019-0097-0
25. Corchado, J.M., Aiken, J.: Hybrid artificial intelligence methods in oceanographic forecast models. IEEE Trans. Syst. Man Cybern. Part C (Appl. Rev.) 32(4), 307–313 (2002)
26. González-Briones, A., Prieto, J., De La Prieta, F., Herrera-Viedma, E., Corchado, J.M.: Energy optimization using a case-based reasoning strategy. Sensors 18(3), 865 (2018)
27. Chamoso, P., Pérez-Ramos, H., García-García, Á.: ALTAIR: supervised methodology to obtain retinal vessels caliber. ADCAIJ: Adv. Distrib. Comput. Artif. Intell. J. 3(4), 48–57 (2014). ISSN: 2255-2863, Salamanca
28. Díaz, F., Fdez-Riverola, F., Corchado, J.M.: gene-CBR: a case-based reasoning tool for cancer diagnosis using microarray data sets. Comput. Intell. 22(3–4), 254–268 (2006)
29. Dargham, J.A., Chekima, A., Moung, E.G., Omatu, S.: The effect of training data selection on face recognition in surveillance application. ADCAIJ: Adv. Distrib. Comput. Artif. Intell. J. 3(4), 227–234 (2014). ISSN: 2255-2863, Salamanca
30. Salazar, R., Rangel, J.C., Pinzón, C., Rodríguez, A.: Irrigation system through intelligent agents implemented with Arduino technology. ADCAIJ: Adv. Distrib. Comput. Artif. Intell. J. 2(3) (2013). ISSN: 2255-2863, Salamanca
31. Corchado, J.M., Corchado, E.S., Aiken, J., Fyfe, C., Fernandez, F., Gonzalez, M.: Maximum likelihood Hebbian learning based retrieval method for CBR systems. In: Ashley, K.D., Bridge, D.G. (eds.) ICCBR 2003. LNCS (LNAI), vol. 2689, pp. 107–121. Springer, Heidelberg (2003). https://doi.org/10.1007/3-540-45006-8_11
32. Griol, D., Molina, J.: Measuring the differences between human-human and human-machine dialogs. ADCAIJ: Adv. Distrib. Comput. Artif. Intell. J. 4(2) (2015). ISSN: 2255-2863, Salamanca
33. Girau, R., Cossu, R., Farina, M., Pilloni, V., Atzori, L.: Virtual user in the IoT: definition, technology and experiments. Sensors 19(20), 4489 (2019)
34. Guillén, J.H., del Rey, A.M., Casado-Vara, R.: Security countermeasures of a SCIRAS model for advanced malware propagation. IEEE Access 7, 135472–135478 (2019)
35. Corchado, J.M., Lees, B.: A hybrid case-based model for forecasting. Appl. Artif. Intell. 15(2), 105–127 (2001)
36. Fernández-Riverola, F., Diaz, F., Corchado, J.M.: Reducing the memory size of a fuzzy case-based reasoning system applying rough set techniques. IEEE Trans. Syst. Man Cybern. Part C (Appl. Rev.) 37(1), 138–146 (2006)
37. Yousefpour, A., Fung, C., Nguyen, T., Kadiyala, K., Jalali, F., Niakanlahiji, A., Jue, J.P.: All one needs to know about fog computing and related edge computing paradigms: a complete survey. J. Syst. Architect. (2019)
“Cooperative Deeptech Platform” for Innovation-Hub Members
303
38. Tapia, D.I., Corchado, J.M.: An ambient intelligence based multi-agent system for alzheimer health care. Int. J. Ambient Comput. Intell. (IJACI) 1(1), 15–26 (2009) 39. Corchado, J.M., Fyfe, C.: Unsupervised neural method for temperature forecasting. Artif. Intell. Eng. 13(4), 351–357 (1999) 40. Degaonkar, S.P., Mitkar, A.: U.S. Patent No. 10,552,294. U.S. Patent and Trademark Office, Washington, DC (2020) 41. Salehi, B., Ghanbaran, A.H., Maerefat, M.: Intelligent models to predict the indoor thermal sensation and thermal demand in steady state based on occupants’ skin temperature. Build. Environ. 169, 106579 (2020) 42. Méndez, J.R., Fdez-Riverola, F., Díaz, F., Iglesias, E.L., Corchado, J.M.: A comparative performance study of feature selection methods for the anti-spam filtering domain. In: Perner, P. (ed.) ICDM 2006. LNCS (LNAI), vol. 4065, pp. 106–120. Springer, Heidelberg (2006). https://doi.org/10.1007/11790853_9 43. Ferreira, A.S., Pozo, A., Gonçalves, R.A.: an ant colony based hyper-heuristic approach for the set covering problem. ADCAIJ Adv. Distrib. Comput. Artif. Intell. J. 4(1) (2015). (ISSN: 2255-2863), Salamanca 44. Cofini, V., De La Prieta, F., Di Mascio, T., Gennari, R., Vittorini, P.: Design smart games with requirements, generate them with a click, and revise them with a GUIs. ADCAIJ Adv. Distrib. Comput. Artif. Intell. J. 1(3), 55–68 (2012). (ISSN: 2255-2863), Salamanca 45. Mata, A., Corchado, J.M.: Forecasting the probability of finding oil slicks using a CBR system. Expert Syst. Appl. 36(4), 8239–8246 (2009) 46. Chamoso, P., González-Briones, A., Rodríguez, S., Corchado, J.M.: Tendencies of technologies and platforms in smart cities: a state-of-the-art review. Wireless Commun. Mob. Comput. (2018) 47. Glez-Bedia, M., Corchado, J.M., Corchado, E.S., Fyfe, C.: Analytical model for constructing deliberative agents. Eng. Intell. Syst. Electr. Eng. Commun. 10(3), 173–185 (2002) 48. 
Brondino, M., Dodero, G., Gennari, R., Melonio, A., Raccanello, D., Torello, S.: Achievement emotions and peer acceptance get together in game design at school. ADCAIJ Adv. Distrib. Comput. Artif. Intell. J. 3(4), 1–12 (2014). (ISSN: 2255-2863), Salamanca 49. Frikha, M., Mhiri, M., Gargouri, F.: A semantic social recommender system using ontologies based approach for tunisian tourism. ADCAIJ Adv. Distrib. Comput. Artif. Intell. J. 4(1) (2015). (ISSN: 2255-2863), Salamanca 50. Rossi, S., Barile, F., Caso, A.: Dominance weighted social choice functions for group recommendations. ADCAIJ Adv. Distrib. Comput. Artif. Intell. J. 4(1) (2015). (ISSN: 2255-2863), Salamanca 51. Fyfe, C., Corchado, J.M.: Automating the construction of CBR systems using kernel methods. Int. J. Intell. Syst. 16(4), 571–586 (2001) 52. Choon, Y.W., Mohamad, M.S., Deris, S., Illias, R.M., Chong, C.K., Chai, L.E., Omatu, S., Corchado, J.M.: Differential bees flux balance analysis with OptKnock for in silico microbial strains optimization. PloS one, 9(7), e102744 (2014) 53. Aly, H.H.: A novel approach for harmonic tidal currents constitutions forecasting using hybrid intelligent models based on clustering methodologies. Renew. Energy 147, 1554–1564 (2020) 54. Alvarado-Pérez, J.C., Peluffo-Ordóñez, D.H., Therón, R.: Bridging the gap between human knowledge and machine learning. ADCAIJ Adv. Distrib. Comput. Artif. Intell. J. 4(1) (2015). (ISSN: 2255-2863), Salamanca 55. Martín del Rey, A., Casado Vara, R., Hernández Serrano, D.: Reversibility of symmetric linear cellular automata with radius r = 3. Mathematics 7(9), 816 (2019) 56. Casado-Vara, R., Novais, P., Gil, A.B., Prieto, J., Corchado, J.M.: Distributed continuous-time fault estimation control for multiple devices in IoT networks. IEEE Access 7, 11972–11984 (2019)
304
N. Shoeibi
57. Carvalhal, C., Deusdado, S., Deusdado, L.: Crawling PubMed with web agents for literature search and alerting services. ADCAIJ Adv. Distrib. Comput. Artif. Intell. J. 2(1) (2013). (ISSN: 2255-2863), Salamanca 58. Pinto, T., Marques, L., Sousa, T.M., Praça, I., Vale, Z., Abreu, S.L.: Data-mining-based filtering to support solar forecasting methodologies. ADCAIJ Adv. Distrib. Comput. Artif. Intell. J. 6(3) (2017). (ISSN: 2255-2863), Salamanca 59. Valdivia, A.K.C.: Between the profiles pay per view and the protection of personal data: the product is you. ADCAIJ Adv. Distrib. Comput. Artif. Intell. J. 6(1), 51–58 (2017). (ISSN: 2255-2863), Salamanca 60. Casado-Vara, R., Chamoso, P., De la Prieta, F., Prieto, J., Corchado, J.M.: Non-linear adaptive closed-loop control system for improved efficiency in IoT-blockchain management. Inf. Fusion 49, 227–239 (2019) 61. Isaza, G., Mejía, M.H., Castillo, L.F., Morales, A., Duque, N.: Network management using multi-agents system. ADCAIJ Adv. Distrib. Comput. Artif. Intell. J. 1(3), 49–54 (2012). (ISSN: 2255-2863), Salamanca 62. Guivarch, V., Camps, V., Péninou, A.: AMADEUS: an adaptive multi-agent system to learn a user’s recurring actions in ambient systems. ADCAIJ Adv. Distrib. Comput. Artif. Intell. J. 1(3), 1–10 (2012). (ISSN: 2255-2863), Salamanca 63. López-Fernández, H., Reboiro-Jato, M., Pérez Rodríguez, J.A., Fdez-Riverola, F., GlezPeña, D.: The artificial intelligence workbench: a retrospective review. ADCAIJ Adv. Distrib. Comput. Artif. Intell. J. 5(1) (2016). (ISSN: 2255-2863), Salamanca 64. López Sánchez, D., Revuelta, J., De La Prieta, F., Dang, C.: Analysis and visualization of social user communities. ADCAIJ Adv. Distrib. Comput. Artif. Intell. J. 4(3) (2015). (ISSN: 2255-2863), Salamanca 65. Martins, C.,Silva, A.R., Martins, C., Marreiros, G.: Supporting informed decision making in prevention of prostate cancer. ADCAIJ Adv. Distrib. Comput. Artif. Intell. J. 3(3), 1–11 (2014). (ISSN: 2255-2863), Salamanca 66. 
Jasmine, K.S., Gavani Prathviraj, S., Rajashekar, P.I., Devi, K.S.: Inference in belief network using logic sampling and likelihood weighing algorithms. ADCAIJ Adv. Distrib. Comput. Artif. Intell. J. 2(3) (2013). (ISSN: 2255-2863), Salamanca 67. Khan, W.Z., Ahmed, E., Hakak, S., Yaqoob, I., Ahmed, A.: Edge computing: a survey. Fut. Gener. Comput. Syst. 97, 219–235 (2019)
Engineering Multiagent Organizations Through Accountability

Stefano Tedeschi(B)

Dipartimento di Informatica, Università degli Studi di Torino, Turin, Italy
[email protected]
Abstract. In my PhD I have been investigating the notions of accountability and responsibility in multiagent organizations. The main objective of the work is to develop both a conceptual model and a programming framework encompassing the two concepts as engineering tools, which, in our view, are fundamental for the realization of robust systems.

Keywords: Accountability · Responsibility · Multiagent organizations

1 Introduction
Multiagent systems (MAS) have proved effective in the development of complex systems composed of heterogeneous actors operating in distributed environments. Normative agent organizations offer abstractions that define strategies for decomposing complex goals into simpler ones and for allocating them to agents. Key features of most organizational models, e.g. [6], are a functional decomposition of the goal and a normative system. Norms establish what agents should do to achieve the organizational goal. However, none of these approaches addresses accountability, that is, who should report to whom for the fulfillment of their duties, and how. This results in two limitations: (i) it is difficult to identify who should give restitution to whom for a certain state of the organization; (ii) it is difficult to take appropriate countermeasures in abnormal situations. On this foundation, the aim of my PhD has been to investigate the notions of accountability and responsibility as engineering tools to systematize the development of MAS organizations. We claim that the realization of distributed systems would benefit from an explicit representation of accountability and responsibility, two fundamental concepts at the basis of human organizations. The overall objective is to develop both a formal conceptual model and a programming framework to guide the design and engineering of robust MAS organizations.
2 Related Work
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2021. P. Novais et al. (Eds.): ISAmI 2020, AISC 1239, pp. 305–308, 2021. https://doi.org/10.1007/978-3-030-58356-9_36

Agent organizations [7] embody patterns of interaction imposed on agents to ensure a coherent global behavior. Normative organizations provide the means to
realize the correct behavior, capturing what agents should do and which sanctions are applied if they do not comply. One drawback is that, when the system faces an abnormal situation and some agent fails, sanctions are of little utility. What is missing is support for agents to provide accounts of what happened, to propagate such feedback through appropriate channels, and to reach the agents equipped to cope with it, making the system robust. We claim that accountability and responsibility serve this purpose intuitively, yet effectively. In [8], accountability is "a primary characteristic of governance where there is a sense of agreement and certainty about the legitimacy of expectations." Along this line, accountability implies that some actors have the right to hold other actors to a set of standards [10]. Notably, [11] points out that the lack of an adequate representation of the relationships between actors obfuscates accountability, possibly compromising the functioning of the system. Concerning responsibility in the context of information systems, [9] represents it as a unique charge assigned to an agent. It is worth noting that accountability and responsibility are not primitive concepts; rather, they are properties that emerge in carefully designed systems.
3 Proposal
We propose to explicitly introduce the notions of accountability and responsibility as software engineering tools for use in MAS organizations. By this, I mean the realization in software of the ability to trace, evaluate, and communicate accountability and responsibility. The problem will be addressed (i) by supplying a formal model and definition of computational accountability, clarifying its relation with the sibling notion of responsibility; and (ii) by providing modeling and programming tools that simplify the realization of accountability-supporting organizations. The purpose is to come up with a formalization of the two concepts as first-class entities, to be used both by the designer to describe the expected behavior of the system and by the agents to direct their conduct. The first part of my project has focused on the development of a methodology and framework for the design of such organizations. The construction of a comprehensive system requires many elements: a formal model of accountability and responsibility, an engine to distribute responsibilities, an automated forum to discern the accountability of the involved agents, and a mechanism to keep track of who could be accountable for what in which situation. The second part will be devoted to the development of an actual programming platform implementing this framework on top of some of the main platforms for MAS organizations. One platform that seems particularly promising is JaCaMo [6], because it provides a very good integration of the concepts characterizing agents, environments, and organizations.
4 Preliminary Results
Support for accountability and responsibility has found a first realization in the ADOPT protocol for creating and manipulating accountability relationships [3]. The main intuition is that, when an agent participates in an organization,
it must accept a set of accountability requirements, expressed as social commitments. The protocol specifies the shape of these commitments and controls their creation. In [4], an information model was proposed that describes which data should be available to identify accountabilities in a group of interacting agents. The model, given in Object-Role Modeling, identifies the main concepts in the process of accountability determination: mutually held expectation and control. The accountability relationship is central; it is constrained so that a principal accountable for an achievement must be in control of it, and so that there must be a mutually held expectation on that principal for the achievement.

In [1], we proposed to enrich the specification of an organization with a set of accountability and responsibility specifications. R(x, q), denoting a responsibility, expresses an expectation on any agent playing role x to pursue condition q. A(x, y, r, u) expresses that x, the a-giver, is accountable towards y, the a-taker, for condition u when condition r holds. Accountability relationships are collected in an accountability specification, while responsibilities are grouped in a responsibility distribution. The designer specifies a set of acceptable accountability specifications and responsibility distributions. Central is the notion of accountability fitting, which links accountability and responsibility together.

We then studied how robustness can be achieved via accountability and responsibility. In [2] we presented two programming patterns for developing agents according to the accountability fitting. The proposal allows the accountability/responsibility specification to be mapped into a set of well-defined agent plans. Such plans define the behavior agents should exhibit to produce accounts for the goals they are responsible for, directed to the agents entitled to treat them.
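As a loose illustration (not the framework's actual implementation, which targets JaCaMo), the two relation types and one simplified reading of the fitting constraint can be sketched in Python. The `fits` check, the role names, and the conditions below are assumptions made for this example, not the paper's formal definition:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Responsibility:
    # R(x, q): an expectation that any agent playing role x pursues condition q
    role: str
    condition: str

@dataclass(frozen=True)
class Accountability:
    # A(x, y, r, u): x (a-giver) is accountable to y (a-taker) for u when r holds
    a_giver: str
    a_taker: str
    context: str
    condition: str

def fits(acc_spec, resp_dist):
    """Simplified fitting check (our assumption): every accountability's
    a-giver must carry a responsibility for the condition it is
    accountable for."""
    covered = {(r.role, r.condition) for r in resp_dist}
    return all((a.a_giver, a.condition) in covered for a in acc_spec)

# A hypothetical specification and distribution:
spec = [Accountability("seller", "buyer", "paid", "goods_delivered")]
dist = [Responsibility("seller", "goods_delivered")]
ok = fits(spec, dist)
```

Under this simplified reading, the specification fits the distribution because the seller is both responsible and in a position to account for `goods_delivered`; adding an accountability whose condition no responsibility covers would make the check fail.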
The same approach has been applied to the field of Business Processes (BPs) as well. BPs realize a business goal by coordinating the tasks undertaken by multiple parties. When processes are distributed, MAS organizations are promising candidates for realizing them. Again, we claimed that, to effectively engineer distributed processes, a modeler should be equipped with abstractions for capturing the relationships between the actors, and not only between the process activities. This line has been developed, e.g., in [5].
5 Evaluation
The proposal is being evaluated in two ways. Regarding the formal model, the proposal has been compared to the main approaches to accountability and responsibility from other areas (e.g., the social sciences, public administration, psychology). Accountability is central in many fields that study human interaction, and the very same approaches can be effective in the context of intelligent systems. The objective is to propose a characterization for use in MAS that captures as many declinations as possible. Moreover, the proposal has been discussed in the context of two widely accepted models of interaction, i.e., normative models and those based on social commitments. As for the programming model, the plan is to evaluate it mainly inside the JaCaMo framework. We are working to enrich JaCaMo's organizational model and infrastructure with accountability
and responsibility. At the same time, BPs offer interesting real-world scenarios to show the benefits coming from an explicit account of accountability and responsibility both at design and execution time.
6 Conclusions
Accountability has interesting implications in terms of robustness, and the final part of my PhD will focus on this aspect. Robustness is typically defined as "the ability of a software to keep an 'acceptable' behavior [...] in spite of exceptional or unforeseen execution conditions." Casting this view into the context of MAS organizations, an organization is robust when its agents can react to abnormal events, possibly taking into account contextual information provided by others. Accountable software turns out to be robust, that is, capable of keeping within acceptable standards despite the occurrence of abnormal situations. By way of accountability, an organization designer can specify how (relevant) contextual information produced during the achievement of a goal flows from one agent to another, so as to provide an adequate context for the a-taker's decision-making, especially in front of invalid or exceptional situations.
References

1. Baldoni, M., Baroglio, C., Boissier, O., May, K.M., Micalizio, R., Tedeschi, S.: Accountability and responsibility in agent organizations. In: Proceedings of PRIMA 2018. LNCS. Springer (2018)
2. Baldoni, M., Baroglio, C., Boissier, O., Micalizio, R., Tedeschi, S.: Engineering business processes through accountability and agents. In: Proceedings of AAMAS 2019, IFAAMAS, pp. 1796–1798 (2019)
3. Baldoni, M., Baroglio, C., May, K.M., Micalizio, R., Tedeschi, S.: Computational accountability in MAS organizations with ADOPT. Appl. Sci. 8(4), 489 (2018)
4. Baldoni, M., Baroglio, C., May, K.M., Micalizio, R., Tedeschi, S.: MOCA: an ORM model for computational accountability. Intelligenza Artificiale 13(1), 5–20 (2019)
5. Baldoni, M., Baroglio, C., Micalizio, R., Tedeschi, S.: Implementing business processes in JaCaMo+ by exploiting accountability and responsibility. In: Proceedings of AAMAS 2019, IFAAMAS, Demo Track, pp. 2330–2332 (2019)
6. Boissier, O., Bordini, R.H., Hübner, J.F., Ricci, A., Santi, A.: Multi-agent oriented programming with JaCaMo. Sci. Comput. Program. 78(6), 747–761 (2013)
7. Carabelea, C., Boissier, O.: Coordinating agents in organizations using social commitments. Electron. Not. Theor. Comput. Sci. 150(3), 73–91 (2006)
8. Dubnick, M.J., Justice, J.B.: Accounting for accountability. In: Annual Meeting of the American Political Science Association (2004)
9. Feltus, C.: Aligning access rights to governance needs with the Responsability MetaModel (ReMMo) in the frame of enterprise architecture. Ph.D. thesis, University of Namur, Belgium (2014)
10. Grant, R.W., Keohane, R.O.: Accountability and abuses of power in world politics. Am. Polit. Sci. Rev. 99(1), 29–43 (2005)
11. Nissenbaum, H.: Accountability in a computerized society. Sci. Eng. Ethics 2(1), 25–42 (1996)
Circadian Rhythm and Pain: Mathematical Model Based on Multiagent Simulation

Angélica Theis dos Santos(B), Catia Maria dos Santos Machado, and Diana Francisca Adamatti

Programa de Pós-Graduação em Modelagem Computacional, Universidade Federal do Rio Grande, Av. Italia s/n km 08, Rio Grande, Brazil
{theisangelica,catiamachado,dianaadamatti}@furg.br
https://www.furg.br/
Abstract. The circadian rhythm controls the unconscious activities of living beings through the biological clock. External influences, such as pain, migraine, depression or anxiety, cause dysfunction in the synchronization and desynchronization of the human body. We propose to study the mathematical and computational properties that can describe the modeling of the circadian rhythm, specifically under the influence of pain. We use a mathematical model of two processes, composed of the circadian and homeostatic rhythms. The computational model, a multiagent system, will use a pain variable, which will be defined through a non-invasive questionnaire answered by people. Preliminary results show that pain directly influences the quality of sleep, as well as the development of daily activities.
Keywords: Circadian rhythm · Pain · Multiagent simulation · Mathematical model

1 Introduction
Rhythmic processes are intrinsic to the human body and account for a large part of its indispensable processes. Among all rhythmic processes, the circadian rhythm stands out. It is characterized by biological processes that oscillate over the sleep/wake period, adjusted to 24 h [2]. The formal study of the circadian rhythm belongs to chronobiology, the area of the biological sciences that studies the biological clocks that control these rhythms and are responsible for the activities of living beings. Above all, the rhythms are associated with vital functions, such as hormones, the digestive system, and the feeling of sleep and hunger, and with external influences, such as pain, anxiety or depression [8].

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2021. P. Novais et al. (Eds.): ISAmI 2020, AISC 1239, pp. 309–312, 2021. https://doi.org/10.1007/978-3-030-58356-9_37
Chronobiology is centered on the body, which is controlled by models that are always interconnected; the human body functions like mathematical models working in sync. Mathematical models are useful for representing real situations, making predictions, and assisting in decision support. In this way, the main variables that model the circadian rhythms can be seen as a sinusoidal curve over a 24 h period [2]. An important modeling approach is agent-based simulation, which makes it possible to represent a real population in artificial form: each individual of the population is represented by an agent, agents form groups, and each group has its own rules and behaviors [5]. This paper presents ongoing research on the influence of pain on the circadian rhythm, as well as on the daily activities of individuals who work or study. We expect to develop a multiagent environment that reliably describes the behavior of the circadian rhythm and pain.
2 Theoretical Basis
Each person's biological clock is synchronized according to her/his daytime activities, so the internal timetable is accurate. Internal regulation requires adjustment mechanisms that govern synchronization. This synchronization is performed by an adjustment phenomenon called "entrainment". An external factor that drives the adjustment is called a "zeitgeber". Zeitgebers are the biological clock's synchronizers. In this way, the circadian and homeostatic rhythms are synchronized by the zeitgebers, and they are always interconnected by a type of pacemaker [3,4].

One important mechanism of sleep-wake regulation occurs via the S̃ process, which depends on the duration and quality of sleep [3]. The duration of wakefulness increases the activity of the process, thus increasing the sleep time. Sleep-wake is regulated by the model of two processes: the circadian rhythm (process C) and the homeostatic rhythm (process S) [4]. The circadian rhythm is controlled by a pacemaker located in the brain, which is independent of wakefulness and sleep. A circadian rhythm can be expressed by a sinusoidal curve that reaches its maximum level in the early morning and its minimum in the early evening. The homeostatic rhythm, known as the S process, arises from the wakefulness that comes from the S̃ process. This rhythm is the pressure arising from sleep accumulated during the day, which decreases during the night. The homeostatic rhythm increases sinusoidally from the beginning of wakefulness to the beginning of sleep, after which it decreases until its end. NREM (non-rapid eye movement) sleep is high in the first stage, during the first sleep, and exhibits a sinusoidal decrease throughout the period [3].

According to the International Association for the Study of Pain (IASP), pain is "a sensory, emotional and unpleasant experience associated with bodily injury" [1,7].
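The two-process mechanism described above can be sketched as a toy simulation in which homeostatic pressure S rises during wakefulness, decays during sleep, and switches state when it crosses thresholds modulated by the circadian process C. All parameter values, thresholds and exponential forms below are illustrative assumptions, not the values used in [4] or in this work:

```python
import math

def two_process(hours=72, dt=0.1, tau_rise=18.2, tau_decay=4.2,
                upper=0.85, lower=0.17, amp=0.1):
    """Toy two-process simulation (illustrative parameters): homeostatic
    pressure S saturates toward 1 while awake, decays while asleep, and
    the switching thresholds oscillate with the circadian process C."""
    S, asleep, t = 0.5, False, 0.0
    trace = []
    while t < hours:
        C = amp * math.sin(2 * math.pi * t / 24.0)      # process C
        if asleep:
            S *= math.exp(-dt / tau_decay)              # sleep: S decays
            if S <= lower + C:                           # lower threshold: wake
                asleep = False
        else:
            S = 1 - (1 - S) * math.exp(-dt / tau_rise)  # wake: S rises toward 1
            if S >= upper + C:                           # upper threshold: sleep
                asleep = True
        trace.append((t, S, asleep))
        t += dt
    return trace

trace = two_process()
sleep_fraction = sum(1 for _, _, a in trace if a) / len(trace)
```

Over a few simulated days this produces alternating wake and sleep bouts whose timing is shaped by both S and C, which is the qualitative behavior the model of [4] formalizes.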
Multiagent systems study the behavior of an independent set of agents with different characteristics, evolving in an environment [5].
3 Methodology
The study and diagnosis of the dynamics and functioning of the circadian rhythm is fundamental for chronobiology and for science. In this work, we propose to study a mathematical model of two processes, which describes the curves of the circadian rhythm [4]; a computational model based on a multiagent system, building on the implementation by [6]; and the insertion into this model of a pain variable, which directly affects the circadian rhythm. The differential of our work is that we could not find in the literature any work combining a multiagent system, the two-process model, and pain.

Initially, the mathematical model [4] was studied, and we concluded that the two-process model is the most suitable. Then, the circadian rhythm implementation in NetLogo proposed by [6] was studied. At this stage, no multiagent platforms other than NetLogo were evaluated. Using agent-based simulation, it is possible to represent a real population in an artificial way, where each individual of the population is represented by an agent and all agents form a group, each of which has its own rules and behaviors. In the implementation, a multiagent system was used, which allows everyday applications to be shown in simulations. We inserted the pain variable through Eq. (1). This equation was found empirically and presented good results, allowing an analysis of the individual's efficiency:

pain → (1 − (pain/10) · 0.2955)    (1)

Figure 1 shows the interface in NetLogo, as well as the simulation of an individual with random pain on each day of the week.
Fig. 1. Interface with pain influence, using the empirical equation of pain.
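Eq. (1) can be read as an efficiency factor that falls linearly from 1 at no pain to about 0.70 at maximum pain. A minimal sketch follows; the weekly pain levels are hypothetical sample data, not questionnaire results:

```python
def efficiency(pain_level):
    """Efficiency factor from Eq. (1): a pain level on the 0-10 scale
    reduces productive capacity linearly, by up to ~29.55% at maximum pain."""
    if not 0 <= pain_level <= 10:
        raise ValueError("pain level must lie on the 0-10 scale")
    return 1 - (pain_level / 10) * 0.2955

# Hypothetical week of self-reported pain levels:
week = {"Mon": 2, "Tue": 0, "Wed": 5, "Thu": 7, "Fri": 3, "Sat": 1, "Sun": 0}
daily = {day: round(efficiency(p), 3) for day, p in week.items()}
```

Here `efficiency(0)` is 1.0 and `efficiency(10)` is 0.7045, so under this equation maximal pain removes roughly 30% of an individual's productive capacity.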
At this point in the research, we are collecting data via a non-invasive questionnaire. The data include age, bedtime, wake-up time, level of pain on each day of the week, location of the pain, and productivity at work or study. We will then analyze these data to discover a more realistic value for the pain equation.
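One way the coefficient in Eq. (1) could be re-estimated from the questionnaire is by least squares over (pain, productivity) pairs. The sketch below, including the sample answers, is our own assumption and not the authors' planned analysis:

```python
def fit_pain_coefficient(samples):
    """Least-squares estimate of k in: productivity = 1 - (pain / 10) * k,
    from (pain, productivity) pairs. Closed-form solution of the normal
    equation for this one-parameter model."""
    num = sum((1 - prod) * (pain / 10) for pain, prod in samples)
    den = sum((pain / 10) ** 2 for pain, prod in samples)
    return num / den

# Hypothetical answers: (pain level 0-10, self-rated productivity 0-1)
answers = [(0, 1.00), (2, 0.94), (5, 0.86), (7, 0.79), (10, 0.71)]
k = fit_pain_coefficient(answers)
```

On these made-up answers the fit lands near 0.29, close to the empirical 0.2955 of Eq. (1); with real questionnaire data the same computation would yield the "more realistic value" sought above.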
4 Conclusion
In the literature, there are several studies that address the circadian rhythm and pain in an integrated way, but none integrates them with a multiagent system. It therefore becomes important to model the multiagent system for circadian rhythm and pain, which is in progress, because we want to show how much pain affects productivity in study and/or work. As future work, we intend to collect more answers to the questionnaire and to verify that the pain equation is reliable. We will then implement and test it in NetLogo, to validate it scientifically. This work intends to show the interdisciplinarity between artificial intelligence, mathematical and biological models, and how they can help in real daily situations.
References

1. International Association for the Study of Pain. https://www.iasp-pain.org/. Accessed 22 Oct 2017
2. Asgari-Targhi, A., Klerman, E.B.: Mathematical modeling of circadian rhythms. Wiley Interdisc. Rev. Syst. Biol. Med. 11(2), e1439 (2019)
3. Borbély, A.A., Daan, S., Wirz-Justice, A., Deboer, T.: The two-process model of sleep regulation: a reappraisal. J. Sleep Res. 25(2), 131–143 (2016)
4. Daan, S., Beersma, D., Borbély, A.A.: Timing of human sleep: recovery process gated by a circadian pacemaker. Am. J. Physiol. Regul. Integr. Comp. Physiol. 246(2), R161–R183 (1984)
5. Li, Z., Duan, Z.: Cooperative Control of Multi-agent Systems: A Consensus Region Approach. CRC Press, Boca Raton (2017)
6. Skeldon, A.: Are you listening to your body clock? http://personal.maths.surrey.ac.uk/st/A.Skeldon/sleep.html (2014). Accessed 20 Feb 2020
7. Turk, D.C., Monarch, E.S.: Biopsychosocial perspective on chronic pain. In: Turk, D.C., Gatchel, R.J. (eds.) Psychological Approaches to Pain Management: A Practitioner's Handbook, pp. 3–29. The Guilford Press (2002)
8. Zaki, N.F., Spence, D.W., BaHammam, A.S., Pandi-Perumal, S.R., Cardinali, D.P., Brown, G.M.: Chronobiological theories of mood disorder. Eur. Arch. Psychiatry Clin. Neurosci. 268(2), 107–118 (2018)
Author Index
A Adamatti, Diana Francisca, 279, 309 Agostinho, Nilzair Barreto, 279 Alonso, Ricardo S., 156, 251, 272 Analide, Cesar, 166 Antoniou, E., 213 Arora, Ashish, 177 B Bartolomé, A., 117 C Carneiro, Davide, 34 Castanheira, António, 44 Castillo, Jose C., 61 Cea-Morán, Juan J., 224 Chamoso, P., 117, 213 Coccoli, Mauro, 54 Conceição, Luís, 93 Corchado, Juan Manuel, 177, 186 D De La Prieta, Fernando, 224 De Sotgiu, Andrea, 54 Ding, Yuan, 127 dos Santos Machado, Catia Maria, 309 dos Santos, Angélica Theis, 309 Dragone, Mauro, 127 Durães, Dalila, 106 F Ferreira, Nuno, 146 Fiekas, Niklas, 243 Figueiredo, Lino, 146 Fonseca, Joaquim, 106
G Gamboa-Montero, Juan Jose, 61 García, Óscar, 156 García-Retuerta, David, 213, 258, 289 Gonçalves, Celestino, 166 Gonçalves, Filipe, 106 González, Angélica, 186 González-Briones, Alfonso, 224 Goussetis, George, 127 Guimarães, Miguel, 34 Guisado-Gámez, Joan, 213 H Hernández, Guillermo, 117, 186 Howard, Daniel Anthony, 293 J Javadian Sabet, A., 203 Jiang, Keren, 13 Jørgensen, Bo Nørregaard, 293 K Kimura, Risa, 13 Kompatsiaris, Ioannis, 3, 24 L Lupión, M., 82 M Ma, Zheng, 293 Machado, José, 44, 106 Marcondes, Francisco S., 106 Marques-Villarroya, Sara, 61 Marreiros, Goreti, 93 Matos, Paulo, 137, 284
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 P. Novais et al. (Eds.): ISAmI 2020, AISC 1239, pp. 313–314, 2021. https://doi.org/10.1007/978-3-030-58356-9
Meditskos, Georgios, 3 Menendez, Carla, 61 Mezquita, Yeray, 247 N Nakajima, Tatsuo, 13 Nieves, Elena Hernández, 262 Nikolopoulos, Spiros, 24 Novais, Paulo, 34, 106, 137, 284 O Oliveira, Pedro Filipe, 137, 284 Ortigosa, P. M., 82 Öztürk, Mehmet, 156 P Papagiannopoulos, Sotirios, 3 Parra, J., 117 Peixoto, Hugo, 44 Pinto, Ricardo, 93 Plaza-Hernández, Marta, 267 Praça, Isabel, 72 Prat-Pérez, Arnau, 224 Prieto, Javier, 156, 186, 224 R Rebelo, Diogo, 166 Redondo, J. L., 82 Rivas, A., 213 Rocha, Ricardo, 72 Rodríguez, Sara, 186 Rossi, M., 203
S Salazar, Gonçalo, 146 Salichs, Miguel A., 61 Sánchez, Sergio Márquez, 177 Sanjuan, J. F., 82 Sati, Vishwani, 177 Schreiber, F. A., 203 Shoeibi, Niloufar, 177, 298 Silva, Fábio, 34, 166 Sittón-Candanedo, Inés, 156 Smith, Ronnie, 127, 239 Sousa, Daniel, 34 Stavropoulos, Thanos G., 3, 24 Strantsalis, Dimitris, 24 T Tanca, L., 203 Tedeschi, Stefano, 305 U Unzueta, M., 117 V Vercelli, Gianni, 54 W Wherhli, Adriano Velasque, 279 Z Zhang, Di, 13