English Pages 1493 [1492] Year 2023
Lecture Notes in Networks and Systems 739
Kohei Arai Editor
Intelligent Computing Proceedings of the 2023 Computing Conference, Volume 2
Series Editor
Janusz Kacprzyk, Systems Research Institute, Polish Academy of Sciences, Warsaw, Poland
Advisory Editors
Fernando Gomide, Department of Computer Engineering and Automation (DCA), School of Electrical and Computer Engineering (FEEC), University of Campinas (UNICAMP), São Paulo, Brazil
Okyay Kaynak, Department of Electrical and Electronic Engineering, Bogazici University, Istanbul, Türkiye
Derong Liu, Department of Electrical and Computer Engineering, University of Illinois at Chicago, Chicago, USA; Institute of Automation, Chinese Academy of Sciences, Beijing, China
Witold Pedrycz, Department of Electrical and Computer Engineering, University of Alberta, Alberta, Canada; Systems Research Institute, Polish Academy of Sciences, Warsaw, Poland
Marios M. Polycarpou, Department of Electrical and Computer Engineering, KIOS Research Center for Intelligent Systems and Networks, University of Cyprus, Nicosia, Cyprus
Imre J. Rudas, Óbuda University, Budapest, Hungary
Jun Wang, Department of Computer Science, City University of Hong Kong, Kowloon, Hong Kong
The series “Lecture Notes in Networks and Systems” publishes the latest developments in Networks and Systems—quickly, informally and with high quality. Original research reported in proceedings and post-proceedings represents the core of LNNS. Volumes published in LNNS embrace all aspects and subfields of, as well as new challenges in, Networks and Systems. The series contains proceedings and edited volumes in systems and networks, spanning the areas of Cyber-Physical Systems, Autonomous Systems, Sensor Networks, Control Systems, Energy Systems, Automotive Systems, Biological Systems, Vehicular Networking and Connected Vehicles, Aerospace Systems, Automation, Manufacturing, Smart Grids, Nonlinear Systems, Power Systems, Robotics, Social Systems, Economic Systems and other. Of particular value to both the contributors and the readership are the short publication timeframe and the world-wide distribution and exposure which enable both a wide and rapid dissemination of research output. The series covers the theory, applications, and perspectives on the state of the art and future developments relevant to systems and networks, decision making, control, complex processes and related areas, as embedded in the fields of interdisciplinary and applied sciences, engineering, computer science, physics, economics, social, and life sciences, as well as the paradigms and methodologies behind them. Indexed by SCOPUS, INSPEC, WTI Frankfurt eG, zbMATH, SCImago. All books published in the series are submitted for consideration in Web of Science. For proposals from Asia please contact Aninda Bose ([email protected]).
Editor Kohei Arai Faculty of Science and Engineering Saga University Saga, Japan
ISSN 2367-3370 ISSN 2367-3389 (electronic) Lecture Notes in Networks and Systems ISBN 978-3-031-37962-8 ISBN 978-3-031-37963-5 (eBook) https://doi.org/10.1007/978-3-031-37963-5 © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Springer imprint is published by the registered company Springer Nature Switzerland AG The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Preface
With profound pride and privilege, we present the proceedings of the Computing Conference 2023. It was held over two days, 22 to 23 June 2023, in London, UK, in hybrid mode. The conference was hugely successful and was attended by 200 delegates from more than 60 countries across the globe. It covered a wide range of topics, including the Internet of Things, Artificial Intelligence, Ambient Intelligence, e-Learning and Machine Vision. The conference provided a coveted platform for renowned and budding researchers and industry experts to present their innovative and insightful research. The synergy of studies by academia and industry experts will surely give a great thrust to the technological advancement of the world. The conference had four keynote addresses, paper presentations and engaging networking breaks, which allowed the delegates to build long-term associations. We received 539 paper submissions, out of which we selected 193 papers on the criteria of originality, applicability and presentation. The selected papers provide a vast pool of knowledge and expertise in solving routine, repetitive and rigorous tasks. They are also a window onto future living trends. The studies also offer important threads for future research and beckon all bright minds to foray into those fields. The conference, without doubt, ignited a spark of great interest amongst its distinguished audience. The astounding success of the conference would not have been possible without the precious contribution of many people. The key stakeholders were the authors, who contributed such thought-provoking studies. The arduous task of review and evaluation by the Technical Committee members cannot be overlooked. The session chairs' role was noteworthy. We extend our heartfelt gratitude to all the above key contributors.
This note of thanks would be incomplete without the mention of our esteemed keynote speakers, who enthralled everyone with their unique research. The organizing committee's efforts cannot go unnoticed, as they seamlessly managed such a huge event, and in hybrid mode at that. Our special thanks to them as well. We have sincerely endeavoured to publish the cherry-picked studies for our avid scientific readers. The encouraging response from our authors, participants and readers is indeed our dose of motivation. We hope to continue bringing the most unique and path-breaking research in future as well, with your enthusiastic support.
Kohei Arai
Contents
Training and Diagnosis of Retinal OCT Images with Auxiliary Data Using TripleGAN When Imbalanced Class Occurs
    Justin Joshua Park
A Context-Aware Street Light System Based on Multi-variate Forecast and Fuzzy Logic
    Fouad Agramelal, Mohamed Sadik, Essaid Sabir, and Abouzahir Saad
Using Smartphone Sensing for Recognition of Game Player Attributes During Gameplay
    Muhammad Saad Khaquan, Muhammad Ehatisham-ul-Haq, Fiza Murtaza, Aasim Raheel, Aamir Arsalan, and Muhammad Awais Azam
ESSL-Polyp: A Robust Framework of Ensemble Semi-supervised Learning in Polyp Segmentation
    Toan Pham Van and Sang Dinh Viet
Data-Driven Design of an Active Wake Steering Control for a Wind Farm Benchmark
    Silvio Simani, Saverio Farsoni, and Paolo Castaldi
Design Principles for Interactive and Reflective Journaling with AI
    Max Angenius and Maliheh Ghajargar
AI Ethics on Blockchain: Topic Analysis on Twitter Data for Blockchain Security
    Yihang Fu, Zesen Zhuang, and Luyao Zhang
Intelligent Virtual Assistants (IVAs): Trust and Zero Trust
    Allison Wylde
Implementation of a Smart House Using a Programmable Logical Controller (PLC)
    Mustafa Ayad, Oday Alkaragole, Dominic Comito, and Khaled Ayad
Use of Cell Phones in Near Surface Seismology
    Hyunseo Lee
Techno-Economic Assessment in Communications: New Challenges
    Carlos Bendicho and Daniel Bendicho
A Low-Cost Thermal Imaging Device for Monitoring Electronic Systems Remotely
    Jack Humphreys and Emanuele Lindo Secco
Mobile Application for Bidding Durians
    Check-Yee Law, Yong-Wee Sek, Choo-Chuan Tay, Tze-Hui Liew, and Ya-Qin Loh
ADAGIO, a BDI Music Recommender Telegram Chatbot
    Arantxa Garayzar-Cristerna and Wulfrano Arturo Luna-Ramirez
School Bus Routing Problem-Algorithm Optimization Using Graph Theory
    Galazios Konstantinos and Alexiou Dimitra
py-microdots: Position Encoding in the Euclidean Plane Based on the Anoto Codec
    Christoph Heindl
An Empirical Criterion for Transitive Closure of a Searchability Matrix
    Marius Orlowski
Detection of Computational Propaganda on Social Networks: A Survey
    Bodor Moheel Almotairy, Manal Abdullah, and Dimah Alahmadi
Quantum Computing Techniques for Multi-knapsack Problems
    Abhishek Awasthi, Francesco Bär, Joseph Doetsch, Hans Ehm, Marvin Erdmann, Maximilian Hess, Johannes Klepsch, Peter A. Limacher, Andre Luckow, Christoph Niedermeier, Lilly Palackal, Ruben Pfeiffer, Philipp Ross, Hila Safi, Janik Schönmeier-Kromer, Oliver von Sicard, Yannick Wenger, Karen Wintersperger, and Sheir Yarkoni
Analysis of Syntactic Errors of Novice Python Programmers in a Nigeria University
    Philip Olu Jegede, Emmanuel Ajayi Olajubu, Olusegun Ojo Bakare, Isaac Oluwafemi Elesemoyo, and Josiah Owolabi
A Constructive Heuristic “MDSA” Solving the Flexible Job Shop Scheduling Problem
    Vassil Guliashki and Gašper Mušič
J-Parallelio: Automatic Parallelization Framework for Java Virtual Machine
    Piotr Listkiewicz, Krzysztof Stuglik, Mateusz Kulczyk, and Marcin Pietron
Hyper Burst Buffer: A Lightweight Burst Buffer I/O Library for High Performance Computing Applications
    Erick Fredj and Michael Laufer
Electromagnetic Quantum Memory in Coherent Domains of Condensed Matter and Its Prospects for Quantum Hypercomputing
    Luigi Maxmilian Caligiuri
Encoding Nets-Within-Nets in Maude
    Lorenzo Capra and Michael Köhler-Bußmeier
Usage of High-Performance System in Impulsive Modelling of Hepatitis B Virus
    Ekaterina Gospodinova and Ivan Torlakov
Performance Comparison of Operations in the File System and in Embedded Key-Value Databases
    Jesse Hines, Nicholas Cunningham, and Germán H. Alférez
A Review of Usability Evaluation Methods for eHealth Applications
    Aslina Baharum, Siti Rahayu Abdul Aziz, and Nurul Hidayah Mat Zain
Access Control in Mobile Crowdsensing: Requirements, Challenges and Open Issues
    Hajar El Gadi, Hanan El Bakkali, Driss Benhaddou, Houda Benbrahim, Wahiba Abou-zbiba, and Zaina Maqour
A Game with a Purpose for Building Crowdsourced Semantic Relations Datasets for Named Entities
    André Fernandes dos Santos and José Paulo Leal
Performing Different Clustering Methods for Mapping the European Union Member States using Green Energy, Digitalization, and R&D Indicators: A Five-Year Comparison (2016-2020)
    Andreea Pernici and Stelian Stancu
Modeling of the Modes of Operation of the AES Algorithm in the Cryptool 2 Environment
    Olga Manankova and Mubarak Yakubova
Investigating the Stability of SMOTE-Based Oversampling on COVID-19 Data
    Jih Soong Tan, Hui Jia Yee, Ivan Boo, Ian K. T. Tan, and Helmi Zakariah
Real-Time DNI and DHI Prediction Using Weather Information via LGBM
    Geonwoo Bae
Finding Suitable Data Mining Techniques for Software Development Effort Estimation
    Julius Olufemi Ogunleye
Universal Hidden Monotonic Trend Estimation with Contrastive Learning
    Edouard Pineau, Sébastien Razakarivony, Mauricio Gonzalez, and Anthony Schrapffer
Applying CRISP-DM Methodology in Developing Machine Learning Model for Credit Risk Prediction
    Kuldeep Rawat
A Bidirectional Encoder Representations from Transformers and CNN Based Prediction Model on Competitive Products
    Yuefei Chen and Xinzhe Wang
Diagnosis of Parkinson’s Disease Through Spiral Drawing Classification via VGG19 and AnoGAN
    Hyunmin Lee
Modularity in Deep Learning: A Survey
    Haozhe Sun and Isabelle Guyon
Improving Distributed Caching Using Reinforcement Learning
    Ashwin Alaparthi, K. Shriprajwal, J. S. Sooraj, M. S. Suraj, and T. S. B. Sudarshan
Digital Watermarking Method for Copyright Protection of Deep Neural Networks
    Yuliya Vybornova
Address Search Correction Using Deep Learning
    Hoang-Quan Tran, Thanh-Nam Vo, Duc-Hao Do, and Thanh-Duc Chau
Analysis of Neural Network Architectures for Semantic Segmentation of Seismic Data
    Gabriel Danilo Figueiredo da Silva, João T. Dias, Luciana Faletti Almeida, and Milena Faria Pinto
MA-CC: Cross-Layer Congestion Control via Multi-agent Reinforcement Learning
    Jianing Bai, Tianhao Zhang, Chen Wang, and Guangming Xie
Depthwise Separable Dilated Convolutions for Low-Complexity Acoustic Scene Classification
    Chukwuebuka Olisaemeka and Lakshmi Babu Saheer
Localizing and Idiomatizing Nonidiomatic Python Code with Deep Learning
    Balázs Szalontai, Ákos Kukucska, András Vadász, Balázs Pintér, and Tibor Gregorics
Cascaded 3D Object Segmentation with Volumetric Propagation Network
    Yi Wu, Xin Wang, Yi Lu, Feng Gao, and Youbing Yin
An Improved CNN Model for Image Forgery Detection
    K. R. Jisha and N. Sabna
Low-Cost Model-Free Deep Reinforcement Learning on Continuous Control
    Huihui Zhang, Xu Han, Yanlong Cheng, and Cong Yan
A General Unbiased Training Framework for Deep Reinforcement Learning
    Huihui Zhang, Xu Han, Yanlong Cheng, and Cong Yan
Application of Convolutional Neural Networks with Quasi-Reversibility Method Results for Option Forecasting
    Zheng Cao, Wenyu Du, and Kirill V. Golubnichiy
A Comparison of LSTM and GRU Networks for Learning Symbolic Sequences
    Roberto Cahuantzi, Xinye Chen, and Stefan Güttel
Attitude Comparison Between Parents and Primary and Secondary School Children Regarding Computer Games and Its Influences
    Neven Groš, Andrija Bernik, and Danijel Radošević
How to Teach Programming to Beginners in a Playful Way?
    Veronika Stoffova, Veronika Gabaľová, and Aliya Katyetova
ATLAS – A Three-Layer Action Structure
    Per Skafte Hansen
Investigating the Assessment Submission Rates of a First Year Programming Module at an African Open and Distance e-Learning University
    Dalize van Heerden and Leila Goosen
The Impact of an App on Students’ Anxiety, Well-Being, and Resilience: A Pilot Efficacy Study
    Iolie Nicolaidou
Monolayer Network Representation and Analysis of the Curriculum
    Durdica Vukic, Sanja Candrlic, and Alen Jakupovic
Vision Based Machine Learning Algorithms for Out-of-Distribution Generalisation
    Hamza Riaz and Alan F. Smeaton
Fake Review Recognition Using an SVM Model
    Alexander I. Iliev, Anandreddy Nimmala, Rashid Abdul Rahiman, Sonia Raju, and Swetha Chilakalanerpu
Early Detection of Rust in Coffee Plantations Through Convolutional Neural Networks
    Luis Guillermo Cruz-Estrada and Wulfrano Arturo Luna-Ramírez
Towards Analog Implementation of Spiking Neural Networks for Audio Signals
    Maciej Wielgosz, Andrzej Skoczeń, Jerzy Dąbrowski, Aleksandra Dąbrowska, and Waldemar Tabaczynski
AI-ML Analytics: A Comprehensive Investigation on Sentimental Analysis for Social Media Forensics Textual Data
    Yashas Hariprasad, Suraj Lokesh, Nagarjun Tumkur Sharathkumar, Latesh Kumar KJ, Chance Miller, and Naveen Kumar Chaudhary
Using Machine Learning to Predict Grocery Sales
    Kamil Samara and Mark Stanich
Domain-Compatible Synthetic Data Generation for Emergency Vehicle Detection
    Dipika Khullar, Negin Sokhandan, Ninad Kulkarni, Yash Shah, and Suchitra Sathyanarayana
Real-Time Material Clustering for Mixed Reality Applications Based on Dichromatic Reflection Model
    Kostiantyn Rudenko and Peter Kapec
Comparative Analysis of Augmented Reality Software Development Kits for the Development of Educational Applications Within Unity Game Engine
    Andrej Čep, Andrija Bernik, and Danijel Radošević
YOLOv5 for Road Events Based Video Summarization
    Nitya Saxena and Mamoona Naveed Asghar
Style Accessory Occlusion Using CGAN with Paired Data
    Sujith Gunjur Umapathy and Alexander I. Iliev
Artificial Intelligence Applied to Detection of Leaks in Ducts and Pipes
    Lucas T. Silva, Rael S. Oliveira, Anderson C. Calderini, Leonardo B. A. da Silva, José A. F. C. R. Rodrigues, and João T. Dias
Adoption of the Organization of Help According to the Use Case Technique in the Open-Source Software Development Process
    Lucrecia Llerena, Nancy Rodriguez, Angelita Bosquez, John W. Castro, and Lister Mera
Design of a Low-Cost RUV Stereo System for Monitoring of a Trout Farm
    Alexander Fernandez, Paola Fonseca, and Wilder Nina
Proposing an AI-Based Approach to Raise Environmental Awareness
    Jeongwook Kim
Ontology Based Text Understanding and Text Generation for Legal Technology Applications
    Anton Ivaschenko, Oleg Golovnin, Ilya Syusin, Arkadiy Krivosheev, and Margarita Aleksandrova
Open-Domain Question Answering with Topic Clustering
    Arinc Gurkan and Lakshmi Babu-Saheer
A Model for Troubleshooting Automation Based on Text Similarity
    Laura Tomaz Da Silva, Júlia Colleoni Couto, Davi Kniest, Julia Godoy, Daniel Callegari, Felipe Meneguzzi, and Duncan Ruiz
An Analysis on the Importance of Persuasion Strategies in Environmental-Oriented Online Crowdfunding Projects
    Yun Biao, Ying-Pei Yu, Hao-Yun Chuang, and Yu-Yun Chang
A Subword-Centric Character Level Representation for Named Entity Recognition
    Feng Gao, Changsheng Liu, Yue Pan, Xin Wang, and Youbing Yin
Attention Is not Always What You Need: Towards Efficient Classification of Domain-Specific Text: Case-Study: IT Support Tickets
    Yasmen Wahba, Nazim Madhavji, and John Steinbacher
Exploring the Relationship Between News Articles and Stocks Market Movements: A Sentiment Analysis Perspective
    Alaa Marshan, Musa Mbedzi, and Athina Ioannou
L3Cube-MahaSBERT and HindSBERT: Sentence BERT Models and Benchmarking BERT Sentence Representations for Hindi and Marathi
    Ananya Joshi, Aditi Kajale, Janhavi Gadre, Samruddhi Deode, and Raviraj Joshi
Design and Development of a Mobile Robotics Module with ROS/Gazebo Twin for Flexible, Adaptive Hands-On Learning: In-Campus/Remote/Hybrid
    Mario Mata and Ryan M. Gibson
Educational System Through Software Robots Vision
    Monica-Ioana Vulpe and Stelian Stancu
Synchronized Colored Petri Net Based Multimodal Modeling and Real-Time Recognition of Conversational Spatial Deictic Gestures
    Aditi Singh and Arvind K. Bansal
Maximum Independent Set Formation on a Finite Grid by Myopic Robots
    Raja Das, Avisek Sharma, and Buddhadeb Sau
Generation of Time-Varying Feedback-Based Wheel Lock Attack Policies with Minimal Knowledge of the Traction Dynamics
    Alireza Mohammadi and Hafiz Malik
SCAHunter: Scalable Threat Hunting Through Decentralized Hierarchical Monitoring Agent Architecture
    Mohiuddin Ahmed, Jinpeng Wei, and Ehab Al-Shaer
Security Issues in Cyber Threat Intelligence Exchange: A Review
    Moses Olaifa, Joey Jansen van Vuuren, Deon Du Plessis, and Louise Leenen
Design of a True Random Number Generator Based on MRAM Devices
    Manuel Aguilar Rios, Saloni Jain, and Bertrand Cambou
Analysis of Detection Systems in a Software-Defined Network
    Oluwapelumi Fakolujo and Amna Qureshi
Slow Slopes for Optimizing the Fault Detection in Secure QDI Circuits
    G. Ait Abdelmalek, Z. Lamine, R. Ziani, and R. Mokdad
Is Your Surveillance Camera App Watching You? A Privacy Analysis
    Vera Schmitt, James Nicholson, and Sebastian Möller
Extension for ASPICE and Cybersecurity Process Assessment Model
    Christian Schlager, Georg Macher, Richard Messnarz, and Eugen Brenner
New Results on Algebraic Constructions of Extremal Graph Theory and Implementations of New Algorithms of Postquantum Cryptography
    Vasyl Ustimenko and Tymoteusz Chojecki
Improving AFL++ CmpLog: Tackling the Bottlenecks
    Sander J. Wiebing, Thomas Rooijakkers, and Sebastiaan Tesink
Public Private Partnership and Smart City: An Investigation into the Navigation App “Makani”
    Mohamed Salama, Hind Zantout, and Sakeena El Hussien
Exploring the Incremental Improvements of YOLOv5 on Tracking and Identifying Great White Sharks in Cape Town
    Luxolo Kuhlane, Dane Brown, and Alden Boby
Author Index
Training and Diagnosis of Retinal OCT Images with Auxiliary Data Using TripleGAN When Imbalanced Class Occurs
Justin Joshua Park
The Thacher School, Ojai, CA, USA
[email protected]
Abstract. As a result of the recent COVID-19 outbreak, global interest in healthcare issues is growing. Because of the pandemic, people increasingly meet online, and we are rapidly entering an era of hyperconnection between industries and IT. When a doctor treats a patient, two issues can arise: the possibility of misdiagnosis and physical limitations. Now that care is moving online, however, deep learning can be combined with remote medical treatment, which can make diagnosis more accurate and decision making easier. Deep learning nevertheless requires a lot of data to train, and medical data is personal information that is difficult to obtain because of security concerns. To address this lack of training data in healthcare deep learning, we created additional data using our proposed model, Triple GAN. The data imbalance problem reduces accuracy, and by solving it we obtained a noticeable improvement in accuracy. This method can also be applied to other industries where data is difficult to obtain.
Keywords: HealthCare · Convolutional Neural Network (CNN) · Deep Learning · Generative Adversarial Networks (GANs)
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023. K. Arai (Ed.): SAI 2023, LNNS 739, pp. 1–10, 2023. https://doi.org/10.1007/978-3-031-37963-5_1
1 Introduction
Since the coronavirus pandemic, many changes have taken place in healthcare. They include new methods of diagnosis such as telemedicine and deep learning. These methods show higher accuracy, reduce misdiagnosis, and use data-driven criteria for judgment. In addition, as the use of smart devices becomes more common, eye diseases are becoming more prevalent. Examples are Choroidal Neovascularization (CNV), Diabetic Macular Edema (DME), DRUSEN, and Age-related Macular Degeneration (AMD). CNV is a disease in which new blood vessels form in the choroid. These vessels are dangerous because they leak fluid into the retina, which can cause blindness. DME occurs when a high blood sugar level harms the blood vessels in the retina, so patients with diabetes are at risk of this disease. Damaged vessels can tear and leak fluid into the retina, which causes swelling. DRUSEN are yellow spots under the retina that result from old age. Drusen do not directly cause AMD but increase a person's risk of developing it, and a large amount of drusen is a sign of AMD. AMD blurs a patient's central vision as a result of damage to the macula, and it is a leading cause of vision loss in aging people.
We classified the above diseases by applying deep learning. Deep learning is already being applied in various fields, for example in DALL-E 2 and in medical diagnosis. Made by OpenAI, DALL-E 2 performs very well at creating image data: it converts text into a desired image and also allows image editing using text [999]. When applied to medical diagnosis, deep learning performs better than previous techniques. Deep learning learns from a large amount of data and makes decisions in a data-driven way. However, since medical data is personal information, it is difficult to collect enough of it for training. Without enough information, it is hard to diagnose diseases, as was the case when the coronavirus first spread. To address these problems, we propose a method that improves the performance of existing deep learning models by generating new data with a GAN.
2 Background
Recently, more industries have applied machine learning, and it has already shown better results than previous methods. Deep learning's benefits include the good features it generates, its higher accuracy, and its continuous improvement. It is called deep learning because it stacks several layers modeled on human neural networks. With these advantages, convolutional neural networks are mainly used for image-based disease diagnosis in healthcare, ensuring faster diagnosis and higher accuracy. Deep learning is also IT-based, so simultaneous processing is possible: multiple people can be diagnosed at the same time with just one model.
Another reason to use deep learning is the misdiagnosis rate. In the US alone, the misdiagnosis rate amounts to almost 5%, which can lead to dire consequences in some cases. That is about 12 million Americans who receive a misdiagnosis each year, and nearly 33% of those misdiagnoses may result in serious consequences. With a variety of data, the best diagnosis can be made. However, as noted earlier, medical data is personal information and is therefore difficult to collect. Moreover, each class or disease requires a certain amount of data for learning. To combat this problem, we used a GAN that generates new data. The resulting model quickly becomes an expert in diagnosis by utilizing retinal data.
Among the most important layers are the convolution layers. They extract features, which are then summarized by average pooling or max pooling. The model stems from various CNN [1] models such as LeNet, AlexNet [2], ZFNet [3], GoogleNet [4], VGGNet [5], UNet [6], ResNet [7] and VIT [8].
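The pooling step mentioned above can be illustrated with a minimal pure-Python sketch. It is not the paper's implementation: the function name `pool2d`, the 2×2 non-overlapping window, and the 4×4 feature map are assumptions chosen only to show how max pooling and average pooling summarize the same features differently.

```python
def average(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def pool2d(feature_map, size=2, mode="max"):
    """Downsample a 2D feature map with non-overlapping size x size windows."""
    if mode not in ("max", "average"):
        raise ValueError("mode must be 'max' or 'average'")
    reduce = max if mode == "max" else average
    rows, cols = len(feature_map), len(feature_map[0])
    pooled = []
    for r in range(0, rows - size + 1, size):
        pooled_row = []
        for c in range(0, cols - size + 1, size):
            # Collect the values covered by the current window.
            window = [feature_map[r + i][c + j]
                      for i in range(size) for j in range(size)]
            pooled_row.append(reduce(window))
        pooled.append(pooled_row)
    return pooled


# Hypothetical 4x4 feature map, e.g. the output of one convolution filter.
feature_map = [
    [1, 3, 2, 0],
    [5, 7, 4, 4],
    [0, 2, 9, 1],
    [2, 0, 3, 3],
]

max_pooled = pool2d(feature_map, mode="max")      # [[7, 4], [2, 9]]
avg_pooled = pool2d(feature_map, mode="average")  # [[4.0, 2.5], [1.0, 4.0]]
```

Max pooling keeps only the strongest response in each window, while average pooling smooths the responses; both halve the spatial size here.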
Training and Diagnosis of Retinal OCT Images
LeNet is a well-known CNN model that takes an input image and analyzes small samples within it so that the model can identify and classify what the image contains. However, LeNet was not widely used because of its computational cost. AlexNet was a breakthrough in 2012 and helped CNNs become widespread in today's deep learning techniques. It was developed by Alex Krizhevsky to win a competition. ZFNet was the next step, analyzing AlexNet's features and creating a feature map that could present a clearer image. GoogleNet is a more complicated model that is composed of Inception modules. However, GoogleNet learns its layers very well, which is why it is very useful. VGGNet is a model known for being easy to use and for using a 3 × 3 kernel for feature extraction. VGG was developed by stacking layers deeply through various studies. ResNet is a residual model that uses skip connections to mitigate the vanishing gradient problem. UNet combines an encoder and a decoder and is used for the complicated segmentation process: the model analyzes each individual pixel so that the computer can understand what the pixel is and which part of the input it belongs to. Finally, ViT [8] is an example of applying the Transformer, which originated in the NLP field, to images. Generative Adversarial Nets (GANs) [9] have a structure that learns by opposing a Generator and a Discriminator. The Generator learns to create the desired data from noise, and the Discriminator distinguishes between the data created by the Generator and the existing data. In this way, they learn from each other by competing, which leads to good performance in data creation. In Triple GAN [10], a classifier is added to the GAN to label the data created by the trained generator. The newly labeled data can be used for additional learning and is also useful in fields with little data.
Objective
We create a Data (Class) Imbalance problem and proceed with the experiment. Because the health-care system cannot give us data, we utilize Triple GAN to create the data that is necessary for training and essential to increasing performance. The experiment covers both the balanced and the imbalanced situation, and in the imbalanced case the model still learns because Triple GAN creates data in the form required for learning.
3 Related Works
CNN-Based Model
ResNet is used as a core model in many fields of CNN research. It comes in a variety of model sizes and has a structure that uses the most basic kernel. In addition, it has benefits in backpropagation. ResNet has performance comparable to the SoTA (State-of-The-Art) on most benchmark datasets.
GAN-Based Model
Triple GAN is a semi-supervised model that is used to create new images. The new images can be used as auxiliary data for training, which helps when data cannot be
J. J. Park
obtained or a class imbalance problem occurs. When the Generator creates an image, the Generator and the Discriminator act as adversaries and learn together. After that, the Classifier labels the created image by inference.
Class Imbalance
There are various ways to solve the class imbalance problem. One method increases the existing data through data augmentation. Another method undersamples the majority class and oversamples the minority class.
4 Materials and Method
Fig. 1. The Images above are the Different Diseases we Tested (Choroidal Neovascularization, Diabetic Macular Edema, DRUSEN, and AMD)
Description
We used retinal OCT (Optical Coherence Tomography) images from a dataset on Kaggle (Fig. 1). This dataset combines NORMAL, CNV, DME, DRUSEN, and AMD data. The data was merged to create the Class Imbalance that arises when a new disease appears. There are a total of 135,922 images of different sizes. We allocated 20% of the total data to the test set and then conducted the experiment. As Fig. 1 shows, it is difficult to identify and distinguish the relevant features accurately with the naked eye. We classify these images through Deep Learning.
Data Preprocessing
We used the 135,922 images for preprocessing. For preprocessing, we applied Resize
to fit the data size, and all images were changed to (224, 224). The images were then converted to grayscale, giving dimensions of (224, 224, 1).
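The preprocessing step can be illustrated with a minimal pure-Python sketch. The paper does not state which resize interpolation or grayscale conversion was used, so nearest-neighbour sampling and the common 0.299/0.587/0.114 luminance weights are assumptions here, and the function names `to_grayscale`, `resize_nearest`, and `preprocess` are hypothetical.

```python
def to_grayscale(rgb_image):
    """Convert an H x W image of (r, g, b) tuples to H x W luminance values.

    The luminance weights are an assumption, not taken from the paper.
    """
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
            for row in rgb_image]


def resize_nearest(image, out_h=224, out_w=224):
    """Resize a 2-D nested list with nearest-neighbour sampling (assumed method)."""
    in_h, in_w = len(image), len(image[0])
    return [[image[i * in_h // out_h][j * in_w // out_w] for j in range(out_w)]
            for i in range(out_h)]


def preprocess(rgb_image, size=224):
    """Return a size x size x 1 nested list, mirroring the (224, 224, 1) shape."""
    gray = resize_nearest(to_grayscale(rgb_image), size, size)
    return [[[pixel] for pixel in row] for row in gray]


# Usage on a tiny synthetic 3 x 4 white image.
sample = [[(255, 255, 255)] * 4 for _ in range(3)]
result = preprocess(sample)
print(len(result), len(result[0]), len(result[0][0]))
```

In practice an image library would perform both steps; the sketch only makes the target shape explicit.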
Fig. 2. Overall Process and Flow Diagram: Triple GAN Generates new Labeled Target. ResNet50 can Proceed with Class Balanced Training
Convolutional Neural Network (CNN)
A Convolutional Neural Network is a multilayer network that classifies objects. Its layers are known as the convolution layer, the pooling layer, and the fully connected layer. Once an image is input, it is first processed in the convolution layer, which applies kernels to create a feature map. Then, the pooling layer downsizes the map by max pooling or average pooling. Next, the fully connected layer classifies objects with activation functions. Finally, for multi-class classification, a softmax function is used as the activation function. Equation (1) is the loss function we applied to learning; we used Cross Entropy. Here $i$ indexes the $N$ data samples, $j$ indexes the $M$ classes, $y$ indicates the label, and $p$ indicates the predicted probability. Equation (2) is the optimizer update we applied; we used the Adam optimizer. Here $\theta$ denotes the model parameters and $t$ is the time step, $\eta$ is the learning rate, $\hat{m}_t$ is the exponential average of the gradient, $\hat{v}_t$ is the exponential average of the squared gradient, and $\epsilon$ is a small constant.

$$\text{loss} = -\frac{1}{N}\sum_{i=1}^{N}\sum_{j=1}^{M} y_{ij}\log(p_{ij}) \tag{1}$$

$$\theta_{t+1} = \theta_t - \frac{\eta}{\sqrt{\hat{v}_t} + \epsilon}\,\hat{m}_t \tag{2}$$
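As a hedged illustration of Eqs. (1) and (2), the sketch below computes the mean categorical cross-entropy and performs one Adam update for a single scalar parameter. The decay rates `beta1` and `beta2`, the default learning rate, and the function names are assumptions (the usual Adam defaults), since the paper does not report them.

```python
import math


def cross_entropy(y_true, y_prob, eps=1e-12):
    """Mean categorical cross-entropy, Eq. (1).

    y_true holds one-hot labels, y_prob the softmax probabilities.
    """
    total = 0.0
    for labels, probs in zip(y_true, y_prob):
        total += sum(y * math.log(max(p, eps)) for y, p in zip(labels, probs))
    return -total / len(y_true)


def adam_step(theta, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update of a scalar parameter, Eq. (2). The step counter t starts at 1."""
    m = beta1 * m + (1 - beta1) * grad          # running average of the gradient
    v = beta2 * v + (1 - beta2) * grad * grad   # running average of the squared gradient
    m_hat = m / (1 - beta1 ** t)                # bias-corrected estimates
    v_hat = v / (1 - beta2 ** t)
    theta = theta - lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta, m, v


# Usage: two samples, two classes, uninformative predictions.
print(cross_entropy([[1, 0], [0, 1]], [[0.5, 0.5], [0.5, 0.5]]))
print(adam_step(theta=1.0, grad=2.0, m=0.0, v=0.0, t=1))
```

In a Keras workflow both pieces are provided by the library; the sketch only spells out what the two equations compute.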
Pre-Trained CNN
For feature extraction, we utilized pretrained CNN models such as ResNet. These models were pretrained on the ImageNet dataset and then trained on our retinal images. Training on our large number of images resulted in a higher accuracy than a plain CNN model. These models efficiently extract image features for classification.
Triple GAN
Triple GAN adds a classifier C, which characterizes the conditional distribution $p_c(y|x) \approx p(y|x)$. The generator G creates an image according to the class, $p_g(x|y) \approx p(x|y)$, and the discriminator D determines whether a pair of data $(x, y)$ comes from the true distribution $p(x, y)$. These three players work together to help the model understand and learn from the data. The learning direction follows the formula below, where $\alpha \in (0, 1)$ balances the classifier and generator terms.

$$\min_{C,G}\max_{D} U(C, G, D) = E_{(x,y)\sim p(x,y)}[\log D(x, y)] + \alpha E_{(x,y)\sim p_c(x,y)}[\log(1 - D(x, y))] + (1 - \alpha) E_{(x,y)\sim p_g(x,y)}[\log(1 - D(G(y, z), y))] \tag{3}$$
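To make the three terms of the Triple GAN objective concrete, the following sketch estimates U(C, G, D) from lists of discriminator outputs. It is only a Monte-Carlo illustration under the assumption that the discriminator outputs lie strictly between 0 and 1; the function name and the default `alpha=0.5` are hypothetical, and no network training is shown.

```python
import math


def triple_gan_utility(d_real, d_classifier, d_generator, alpha=0.5):
    """Monte-Carlo estimate of U(C, G, D) in Eq. (3).

    d_real:       D(x, y) on real labelled pairs
    d_classifier: D(x, y) on pairs whose label was predicted by the classifier C
    d_generator:  D(G(y, z), y) on pairs whose image was produced by the generator G
    """
    def mean(values):
        return sum(values) / len(values)

    real_term = mean([math.log(d) for d in d_real])
    cls_term = mean([math.log(1.0 - d) for d in d_classifier])
    gen_term = mean([math.log(1.0 - d) for d in d_generator])
    return real_term + alpha * cls_term + (1.0 - alpha) * gen_term


# Usage: a discriminator that cannot tell the three sources apart outputs 0.5 everywhere.
print(triple_gan_utility([0.5, 0.5], [0.5, 0.5], [0.5, 0.5]))
```

The discriminator tries to increase this value, while the classifier and generator try to decrease it.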
5 Results
Experimental Setup
The machine used for all experiments had a 12th-generation Intel i5-12400F CPU, an NVIDIA GeForce RTX 3070 Ti GPU with 8 GB of memory, and 16 GB of RAM. For deep learning, we used the Keras library, and all results are reported as the average over repeated experiments. To examine the Data Imbalance problem, we fixed the backbone to ResNet (Fig. 2). Although ResNet models of various sizes exist, the experiment was conducted with ResNet50. The batch size is 100, the loss is Cross Entropy, the optimizer is Adam, and the metric is accuracy. Data Augmentation was not applied because the data contained independent variables. Each test set was allocated 20% of the data, with the remainder used as Train + Valid data. Triple GAN was run in the same format. For optimization, we used mini-batch stochastic gradient descent training.
ImBalanced Data
The experiment was conducted under data imbalance (Table 1). Learning was carried out on ResNet50 under the same conditions as above. In Fig. 3, the accuracy of train and
Table 1. Shows a Table when the Data Class is Unbalanced. AMD's Data can be Seen as Remarkably Small (New Diseases and Hard to Obtain Data)
Data      Num       Portion
AMD       429       0.0039
CNV       37,205    0.3421
DME       11,348    0.1043
DRUSEN    8,616     0.0792
NORMAL    51,140    0.4703
Total     108,738   1
Fig. 3. Shows that the Train and Valid do not Converge and are not well Trained
valid does not converge and keeps fluctuating, which indicates underfitting. The final accuracy on the test set was 48.35%.
Balanced Data
In Table 2, using the 2017 OCT data, we built a balanced setting and trained on a small set of 400 images per class. The data was extracted randomly in each repetition. Although the amount of data is small, the experiment was conducted in a situation in which the imbalance problem was resolved. As Fig. 4 shows, the data is small but evenly distributed, and training was much more successful, with an accuracy of 97.12% on the test sets. The number 400 was chosen because the number of AMD test samples is 50. Since the test set is 20%, we applied 400, which is the remaining 80%. For testing, 50 random samples were selected and the experiment was conducted.
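The balanced setting can be sketched as random undersampling to a fixed number of samples per class. The helper names `undersample` and `class_portions` and the fixed seed are hypothetical; the class counts in the usage lines are those of Table 1, and the paper does not describe its exact sampling code.

```python
import random


def undersample(labels, per_class, seed=0):
    """Return sorted indices that keep at most `per_class` samples of every class."""
    rng = random.Random(seed)
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)
    kept = []
    for indices in by_class.values():
        kept.extend(rng.sample(indices, min(per_class, len(indices))))
    return sorted(kept)


def class_portions(labels):
    """Fraction of the data set that belongs to each class."""
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    return {label: count / len(labels) for label, count in counts.items()}


# Usage with the imbalanced class counts of Table 1.
counts = {"AMD": 429, "CNV": 37205, "DME": 11348, "DRUSEN": 8616, "NORMAL": 51140}
labels = [name for name, n in counts.items() for _ in range(n)]
balanced = [labels[i] for i in undersample(labels, per_class=400)]
print(class_portions(balanced))
```

Changing the seed between repetitions would reproduce the random re-extraction described above.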
Table 2. Shows a Table when the Data Class is Balanced. Experiment was Conducted with the Ratio of All Data Uniformly Matched, and the Number was Determined based on the Smallest AMD Data
Data      Num     Portion
AMD       400     0.2
CNV       400     0.2
DME       400     0.2
DRUSEN    400     0.2
NORMAL    400     0.2
Total     2000    1
Fig. 4. Train and Valid are Rising and Converging very Consistently. It can be seen that the Number of Data is small and the Learning is very good
Auxiliary data by Triple GAN
Table 3. Shows when the Data Class is Balanced
Data      Num       Portion
AMD       12,892    0.10
CNV       38,000    0.30
DME       12,000    0.09
DRUSEN    10,000    0.08
NORMAL    51,500    0.40
Total     124,392   1
Fig. 5. This Figure Shows a Class Balanced Data Situation. Both Train and Valid are Converging Properly
In Table 3, the auxiliary data set was generated with Triple GAN until the class portions reached a certain ratio. Train and Valid converge similarly during learning. The model performs well on validation data it has not learned from, and it reaches an accuracy of 72.08% on the test set.
Accuracy
As Fig. 5 shows, the proposed model of ResNet with Triple GAN reached an accuracy of 72.08%, compared with 48.35% for the model trained on the imbalanced data.
6 Conclusion
We conducted a study on the Data Imbalance problem that occurs when Deep Learning is applied to the healthcare field. Applying Triple GAN to resolve data imbalance in retinal data yields a noticeable performance improvement. With this method, if new data is generated through a GAN, a more accurate diagnosis can be made for a disease that has appeared for the first time or for a disease for which data is difficult to obtain. In the future, GAN-based models can be applied to various fields, enabling faster and more accurate diagnosis of newly discovered diseases.
References
1. LeCun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proc. IEEE 86(11), 2278–2324 (1998)
2. Krizhevsky, A., Sutskever, I., Hinton, G.E.: Imagenet classification with deep convolutional neural networks. Adv. Neural Inf. Process. Syst. 25 (2012)
3. Zeiler, M.D., Fergus, R.: Visualizing and understanding convolutional networks. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) Computer Vision – ECCV 2014. ECCV 2014. Lecture Notes in Computer Science, vol. 8689, pp. 818–833. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-10590-1_53
4. Szegedy, C., et al.: Going deeper with convolutions. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1–9 (2015)
5. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint http://arxiv.org/abs/1409.1556 (2014)
6. Ronneberger, O., Fischer, P., Brox, T.: U-net: Convolutional networks for biomedical image segmentation. In: International Conference on Medical Image Computing and Computer-Assisted Intervention, pp. 234–241. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-24574-4_28
7. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016)
8. Dosovitskiy, A., et al.: An image is worth 16x16 words: Transformers for image recognition at scale. arXiv preprint arXiv:2010.11929 (2020)
9. Goodfellow, I., et al.: Generative adversarial nets. Adv. Neural Inf. Process. Syst. 27 (2014)
10. Li, C., Xu, T., Zhu, J., Zhang, B.: Triple generative adversarial nets. Adv. Neural Inf. Process. Syst. 30 (2017)
A Context-Aware Street Light System Based on Multi-variate Forecast and Fuzzy Logic
Fouad Agramelal(✉), Mohamed Sadik, Essaid Sabir, and Abouzahir Saad
NEST Research Group, LRI Laboratory, ENSEM, Hassan II University of Casablanca, Casablanca, Morocco
{fouad.agramelal,m.sadik,e.sabir,saad.abouzahir}@ensem.ac.ma
Abstract. The rapid growth in urbanization increases the number of street lamps worldwide in rural areas and cities. This places a heavy demand on electric energy, which stems from fossil fuels, thus increasing greenhouse emissions. By using stand-alone photo-voltaic street lamps, gas emissions are lowered. However, the vital aspect of street lamps is the safety of users at nighttime, and the batteries of these systems are quickly depleted under adverse weather conditions. In order to increase safety and smartly manage energy consumption, in this work we suggest using a fuzzy controller with traffic and solar radiation forecasts. The controller adapts light according to traffic demand, solar radiation in the upcoming three days, and battery level. First, we tested and validated several multi-step forecast models to predict solar radiation. Then the system was described and simulations were carried out. The obtained results indicate that the designed light controller is capable of lowering energy consumption, thus prolonging the system's autonomy while at the same time assuring road safety.
Keywords: Smart Street Light · Multi-Step Forecast · LSTM · GRU · Fuzzy Logic Controller
1 Introduction
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023. K. Arai (Ed.): SAI 2023, LNNS 739, pp. 11–25, 2023. https://doi.org/10.1007/978-3-031-37963-5_2
While nearly 20% of the global electric energy is consumed by street lamps (SLs), the primary source of this energy is fossil fuels. According to the International Energy Agency (IEA), energy production accounts for more than two-thirds of global greenhouse emissions [1]. Therefore, a significant shift towards green energy is needed to reduce carbon footprints. According to the latest United Nations Climate Change Conference (COP27), held in Egypt, an investment of approximately 6 trillion US dollars annually in green energy is needed to eliminate emissions by 2050 [3]. In addition, recent technological progress has made it possible to further diminish greenhouse emissions. For instance, the
F. Agramelal et al.
latest advancements in LED technology significantly reduced the energy consumption of indoor and outdoor lighting. In the case of SLs, this opened new frontiers through solar panels and intelligent control strategies. SLs play a significant role in the life of citizens during nighttime, as they prevent crime and help in obstacle avoidance and orientation. They also give a feeling of safety and help reduce accidents by 35%, in contrast to places with a lack or absence of illumination [8]. Photo-voltaic (PV) SLs are advantageous over traditional grid-connected ones, as they require less wiring and are easy to install in remote areas. PV-SLs can be deployed in several architectures, notably a stand-alone mode, where the system uses energy stored in batteries, or a grid-connected architecture. In either case, energy management is inefficient, as lighting is not provided on demand, i.e., according to vehicular traffic, and it disregards future energy production. Consequently, electric energy is not efficiently exploited in a grid-connected architecture, and safety may be jeopardized in a stand-alone architecture when there is not enough solar energy. Prior works exploited wireless sensor networks and intelligent control to drive the luminosity of SLs. For example, in [10], the authors proposed the usage of sensors to detect vehicles on highways. In their approach, the road is divided into sections, where each section represents the distance between two adjacent street lamps. The light is turned off or dimmed when the vehicle counter of a section equals 0. Simulation findings indicate that turning off or dimming lamps when sections are empty can save up to 57% of energy. However, turning off the lamps may endanger the safety of road users if the sensors miss a detection. In [9], the authors presented a traffic-aware light management system, where an adaptive dimming profile is applied around the user upon its detection by sensors.
In addition, the authors suggested partitioning the lit portion of the road around the user into zones, wherein, in the case of a motorist, street lamps within a 100 m distance of the subject are turned fully on. Simulations of their approach show significant energy reduction. Nevertheless, this approach is challenging to apply when there is high activity on the road. Other approaches rely on adjusting the dimming level per traffic flow. For example, in [11], a light control based on local regulations is implemented, where a downgrade/upgrade of the lighting class is only applied when the traffic count changes. At the same time, others exploited neural networks and fuzzy logic controllers. For instance, in [13], an LSTM model was used to forecast the power output of a PV panel using historical weather data. Based on forecasts, several methods were applied to calculate the dimming level of SLs for the upcoming nights. Results indicate a low probability of discharging the battery below 30%, even in adverse weather conditions. In [6], the authors used a combination of fuzzy logic and neural networks to control the highway’s stand-alone SLs. The neural networks were used to forecast traffic in the upcoming hour while the fuzzy logic controller calculated the dimming level based on the forecasted traffic and the battery’s energy. Simulations show that the controller can maintain battery energy even on long nights while only providing light based on demand. In [7], the authors suggested using a separate controller for each carriageway on a highway, as each has different traffic volume statistics. The authors added a third variable to the prior fuzzy
A Dual Carriageway Smart Street Lighting Controller
logic controller used in [6], namely the solar radiation forecast, as it mirrors future energy production. A multi-variable traffic forecast model was conceived for each carriageway. The simulation indicates that the system effectively adjusts the lamp brightness according to traffic and battery level while considering future energy production. Nevertheless, limiting the solar forecast to only one day in advance can limit the effectiveness of the controller, as the energy produced by PV panels can be minimal over several consecutive days. In this paper, we enhance the work presented in [7] by using a multi-step multi-variate forecast model to predict solar radiation in the upcoming three days. Then we use a modified version of the fuzzy logic controller to adjust the illuminance level instead of the dimming level of the lamps. The remainder of this paper is organized as follows: Sect. 2 describes the proposed light controller. Section 3 describes the development and testing of several solar radiation forecast models for the upcoming three days. Section 4 outlines the design of the fuzzy light controller; it also explains the design of the simulation model and provides a comparative study against other light controllers in the literature. Then, in Sect. 5, we conclude the paper with remarks and future orientations.
2 System Description
As mentioned before, this study is an enhancement of previous work where a separate lane control scheme was used to adjust the brightness of PV-powered highway SLs. In the proposed model, each segment of SLs is controlled by the so-called 'Master lamp.' This lamp collects data from the sensors installed within the SLs, communicates with the central controller, and performs the necessary calculations for control (Fig. 2). The system's kernel, the fuzzy logic controller (FLC), is installed within the master lamp. In previous work, we based the FLC on the hourly prediction of traffic flow, the battery state, and the forecasted amount of solar irradiation for the next day, which was taken from the local weather station. Here, we extend the solar forecast to the next three days. This particular amount of time represents the capacity of the battery to hold energy without charging. A flowchart of the system's controller process is presented in Fig. 3. After system initialization, the system checks the ambient illuminance through an LDR sensor. If it falls below a certain threshold, the system is triggered, and the controller reads the stored traffic data (volume, speed, and occupancy) of the past 48 h, the stored daily solar irradiance, and the current battery level. The collected data are then used as inputs to the forecast models. The forecasted values of traffic $X_t$ and irradiance $irr$, along with the battery level $En_t$, are then forwarded to the FLC, Eq. (4). The controller bases its calculation on the minimum amount of stored energy among the street lamps it controls in order to ensure a uniform light distribution. We used the same forecast models for each carriageway as in [7] and the same fuzzy rules (Fig. 1). However, the FLC is set to output the illuminance level on the road surface expressed in lux, instead of the dimming
level of lamps. This is carried out so that the lighting adjustment follows the CIE light regulation [4], except for using a continuous range of illumination (i.e., from 7.5 to 50 lx) instead of light classes, as suggested in [5]. The illuminance is then converted to the required power of the LED lamps using Eq. 23.
Fig. 1. Surface View of the Used FLC (a) Light Level in Accordance with Traffic Volume and Battery Level (b) Light Level in Accordance with Traffic Volume and Solar Irradiance.
Fig. 2. Overview of the Proposed System.
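$$Y_t = \{\text{Solar rad}_t, \text{Feature1}_t, \text{Feature2}_t, \ldots\} \tag{1}$$

$$irr_{d+1}, irr_{d+2}, irr_{d+3} = \text{forecast}(Y_{t-1}, Y_{t-2}, \ldots, Y_{t-n}) \tag{2}$$

$$irr = irr_{d+1} + irr_{d+2} + irr_{d+3} \tag{3}$$

$$\phi_t = \text{FLC}(X_t, En_t, irr) \tag{4}$$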
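The data flow of Eqs. (3) and (4) can be sketched as follows. This is only an illustration of how the inputs are assembled: `control_step` and `echo_flc` are hypothetical names, the fuzzy rule base of the paper is not reproduced, and the stand-in controller simply echoes the three inputs it receives.

```python
def control_step(traffic_forecast, battery_levels, irradiance_forecast, flc):
    """Assemble the FLC inputs for one segment of street lamps.

    traffic_forecast:    forecasted traffic X_t for the coming hour
    battery_levels:      stored energy En_t of every lamp in the segment
    irradiance_forecast: forecasted irradiance for days d+1, d+2 and d+3
    flc:                 callable standing in for the fuzzy logic controller
    """
    if len(irradiance_forecast) != 3:
        raise ValueError("expected a three-day irradiance forecast")
    irr = sum(irradiance_forecast)      # Eq. (3): total irradiance over three days
    energy = min(battery_levels)        # the lamp with the least stored energy decides
    return flc(traffic_forecast, energy, irr)  # Eq. (4)


def echo_flc(traffic, energy, irr):
    """Stand-in controller (not the paper's rule base): returns the inputs it was given."""
    return {"traffic": traffic, "energy": energy, "irr": irr}


# Usage with made-up values (traffic and energy normalised, irradiance in Wh/m2).
print(control_step(0.6, [0.9, 0.4, 0.7], [4000, 5000, 3000], echo_flc))
```

Using the minimum battery level mirrors the statement that the controller bases its calculation on the lowest stored energy in the segment.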
Fig. 3. Flowchart of the Proposed System Controller.
3 Solar Irradiation Prediction
3.1 Solar Irradiation Data and Metrics
In order to forecast the solar irradiance for the next three days, a multistep time series forecast model is needed. To this end, the solar irradiation data and other weather parameters are collected from an open repository, the Global Modeling and Assimilation Office (GMAO). The area of interest is chosen as in the previous work: the SR51S freeway in California. The collected weather data comprises the daily shortwave irradiation level expressed in Wh/m2, together with the daily temperature (°C), pressure (hPa), wind speed (m/s), wind direction (deg), and relative humidity. The daily data ranges from 2008/01/02 to 2022/07/01, totaling 5295 daily records across 14 years. The solar irradiance data is depicted in Fig. 4. First we remove outliers and missing values
by using neighboring data in temporal order. A correlation test on the dataset (Fig. 5) shows that temperature, wind speed, and wind direction are highly correlated with solar irradiance. The other variables are less correlated, so we dropped them and did not use them to make forecasts. Then, we split the data into 80% for training/validation and the rest for testing. The input data is then rearranged into an F × W × S data structure, where F denotes the number of features used (4 in this case), W is the size of the time window, and S is the total number of samples after data rearrangement. The output data is rearranged into a Wf × S structure, where Wf is the future time window of daily irradiation data in the following three days and S is the number of samples. We use three performance metrics to evaluate the forecast accuracy: the root mean square error (RMSE) and the mean absolute error (MAE), both expressed in Wh/m2, and the mean absolute percentage error (MAPE). The following equations define all error metrics:
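$$RMSE = \left(\frac{1}{N}\sum_{t=1}^{N}\left(X_t - \hat{X}_t\right)^2\right)^{1/2} \tag{5}$$

$$MAE = \frac{1}{N}\sum_{t=1}^{N}\left|X_t - \hat{X}_t\right| \tag{6}$$

$$MAPE = \frac{100\%}{N}\sum_{t=1}^{N}\left|\frac{X_t - \hat{X}_t}{X_t}\right| \tag{7}$$

where $X_t$ is the observed value of solar irradiance, $\hat{X}_t$ is the corresponding predicted value at time $t$, and $N$ is the total number of predictions.
3.2 Forecast Models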
This subsection presents the candidate forecast models to employ in the system. We experiment with models ranging from a simple ANN to various vanilla deep learning models such as DNN, CNN, LSTM, and GRU, along with combinations of deep learning models, referred to as hybrid models, such as CNN-LSTM and biLSTM.
Artificial Neural Network. ANNs are heuristic approaches composed of interconnections between artificial neurons (i.e., units), where each neuron can process inputs and forward its output to other neurons. The output of a neuron is the weighted sum of its inputs passed through an activation function. The simplest way to combine these neurons is the feed-forward method, in which information is only propagated forward from the input units toward the output units. In an ANN architecture, neurons are distributed within three distinct layers: an input layer, a hidden layer, and an output layer (Fig. 6). By adding further hidden layers, the model can extract and model complex features and non-linear relationships between its inputs and outputs. These models are called Deep Neural Networks (DNN) because of the deep stack of layered neurons.
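The feed-forward computation described above can be sketched in a few lines of pure Python. The layer sizes, weights, and the choice of ReLU in the hidden layer are illustrative assumptions and do not correspond to the tuned models of the paper; `dense` and `forward` are hypothetical helper names.

```python
def dense(inputs, weights, biases, activation):
    """One fully connected layer: each neuron applies the activation to its weighted sum."""
    return [activation(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


def relu(value):
    return max(0.0, value)


def identity(value):
    return value


def forward(inputs, layers):
    """Propagate the inputs through a list of (weights, biases, activation) layers."""
    signal = inputs
    for weights, biases, activation in layers:
        signal = dense(signal, weights, biases, activation)
    return signal


# Usage: two inputs, one hidden layer with two neurons, one linear output neuron.
layers = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0], relu),
    ([[1.0, 2.0]], [0.5], identity),
]
print(forward([2.0, 1.0], layers))
```

Appending more `(weights, biases, activation)` tuples to `layers` turns the same code into a deeper network.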
Fig. 4. Yearly Solar Data Overview.
Fig. 5. Data Heatmap.
Fig. 6. Neural Network Model (inputs X t-1, X t-2, ..., X t-n; output Xt).
Long Short Term Memory Model. Long short-term memory (LSTM) is an enhancement of recurrent neural networks (RNN). It mainly handles the vanishing gradient problem by introducing a memory: each regular recurrent node is replaced by a memory cell, so the model can incorporate the effect of distant past records when computing the output. Figure 7 shows the general structure of an LSTM model. The inputs to the cell are the cell state $C_{t-1}$ and the hidden state $h_{t-1}$ from the previous time step, together with the input $x_t$ at the current time step. The flow of information inside the LSTM model is controlled via three fully connected layers, i.e., the forget gate $f_t$, the input gate $i_t$, and the output gate $O_t$. Using a sigmoid activation function, the gates determine whether to keep the current memory value or delete it and whether the memory cell should influence the current output. The main equations of an LSTM model are as follows:

$$f_t = \sigma(w_f \cdot [h_{t-1}, x_t] + b_f) \tag{8}$$

$$i_t = \sigma(w_i \cdot [h_{t-1}, x_t] + b_i) \tag{9}$$

$$g_t = \tanh(w_g \cdot [h_{t-1}, x_t] + b_g) \tag{10}$$

$$C_t = f_t * C_{t-1} + i_t * g_t \tag{11}$$

$$O_t = \sigma(w_o \cdot [h_{t-1}, x_t] + b_o) \tag{12}$$

$$h_t = O_t * \tanh(C_t) \tag{13}$$
Fig. 7. LSTM Cell Model.
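A single LSTM time step following Eqs. (8) to (13) can be written out as a short sketch. The `params` dictionary layout and the helper names are assumptions made for illustration; in the study the cell is provided by the deep learning library rather than written by hand.

```python
import math


def sigmoid(value):
    return 1.0 / (1.0 + math.exp(-value))


def affine(weights, vector, bias):
    """Compute W . vector + b for a weight matrix given row by row."""
    return [sum(w * v for w, v in zip(row, vector)) + b
            for row, b in zip(weights, bias)]


def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step; params maps 'f', 'i', 'g', 'o' to (weights, bias)."""
    concat = h_prev + x_t                      # [h_{t-1}, x_t]

    def gate(name, activation):
        weights, bias = params[name]
        return [activation(a) for a in affine(weights, concat, bias)]

    f_t = gate("f", sigmoid)                   # Eq. (8), forget gate
    i_t = gate("i", sigmoid)                   # Eq. (9), input gate
    g_t = gate("g", math.tanh)                 # Eq. (10), candidate memory
    o_t = gate("o", sigmoid)                   # Eq. (12), output gate
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, g_t)]  # Eq. (11)
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]                  # Eq. (13)
    return h_t, c_t


# Usage: hidden size 1, input size 1, all weights and biases set to zero.
zero = ([[0.0, 0.0]], [0.0])
params = {"f": zero, "i": zero, "g": zero, "o": zero}
print(lstm_step([1.0], [0.0], [2.0], params))
```

With zero weights every gate equals 0.5 and the candidate memory is 0, so the cell simply halves its previous state, which makes the role of the forget gate easy to see.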
Gated Recurrent Unit Model. Gated recurrent unit (GRU) networks are similar to the LSTM topology but simpler to compute. A reset gate and an update gate replace the three gates of the LSTM model. Each is computed using a fully connected layer with a sigmoid activation function. The reset gate determines how much of the previous state to remember, while the update gate determines how much of the past information is carried over to the next state (Fig. 8). The main equations of the GRU model are listed below:

$$z_t = \sigma(W_z \cdot [h_{t-1}, x_t]) \tag{14}$$

$$r_t = \sigma(W_r \cdot [h_{t-1}, x_t]) \tag{15}$$

$$\tilde{h}_t = \tanh(W \cdot [r_t * h_{t-1}, x_t]) \tag{16}$$

$$h_t = (1 - z_t) * h_{t-1} + z_t * \tilde{h}_t \tag{17}$$
Fig. 8. GRU Cell Model.
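The GRU update of Eqs. (14) to (17) can be sketched in the same spirit. As in the equations, no bias terms are used; the `params` layout and the function names are illustrative assumptions rather than the implementation used in the study.

```python
import math


def sigmoid(value):
    return 1.0 / (1.0 + math.exp(-value))


def matvec(weights, vector):
    """Compute W . vector for a weight matrix given row by row."""
    return [sum(w * v for w, v in zip(row, vector)) for row in weights]


def gru_step(x_t, h_prev, params):
    """One GRU time step; params maps 'z', 'r', 'h' to weight matrices."""
    concat = h_prev + x_t                                       # [h_{t-1}, x_t]
    z_t = [sigmoid(a) for a in matvec(params["z"], concat)]     # Eq. (14), update gate
    r_t = [sigmoid(a) for a in matvec(params["r"], concat)]     # Eq. (15), reset gate
    gated = [r * h for r, h in zip(r_t, h_prev)] + x_t          # [r_t * h_{t-1}, x_t]
    h_tilde = [math.tanh(a) for a in matvec(params["h"], gated)]  # Eq. (16)
    return [(1 - z) * h + z * c
            for z, h, c in zip(z_t, h_prev, h_tilde)]           # Eq. (17)


# Usage: hidden size 1, input size 1; the candidate state only looks at the input.
params = {"z": [[0.0, 0.0]], "r": [[0.0, 0.0]], "h": [[0.0, 1.0]]}
print(gru_step([1.0], [0.0], params))
```

Compared with the LSTM step, there is no separate cell state: the hidden state is a direct blend of its previous value and the candidate state.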
Convolutional-LSTM Model. In this model, we merge a convolutional neural network (CNN), also known as a ConvNet, and an LSTM model to make predictions. ConvNets are widely applied to grid-like data such as images and are very effective in feature extraction. They are also applied in other applications like natural language processing and time-series forecasting. In this context, the model input data is first handled by the CNN layers and then passed to the LSTM layers. In this study, the adopted structure is as follows: the CNN part of the model is a combination of two consecutive uni-dimensional convolution layers, a uni-dimensional max-pooling layer, and a flatten layer; it is followed by an LSTM layer and a dense layer.
Bi-directional LSTM Model. Derived from the bidirectional RNN [12], this model is a combination of two LSTM models working together: the first one learns the sequence of the provided input data, and the second one learns the reverse of the same input sequence. Then, the output sequences of the two LSTMs are merged using a function such as concatenation, averaging, summation, or multiplication. In this study, we implement two consecutive bidirectional LSTM layers with 'tanh' as the activation function, followed by a dropout layer and a dense layer. The merging step is left at its default, i.e., the concatenation function.
3.3 Forecast Results
The model design variables, i.e., the number of layers, the number of neurons, and the type of activation function, along with hyperparameters such as the batch size, the number of epochs, and the learning rate, are subject to optimization. Therefore, we experimented with two time windows for each forecast model and used grid search to find the optimal design variables and hyperparameters in each case. Keras Tuner was used to execute the grid search, and a Google Colab GPU was used to speed up the search. The obtained scores for each time window are reported in Table 1. It can be observed that the GRU model with a seven-day time window outperforms the other models, with an RMSE of 865.57 Wh/m2, an MAE of 637.17 Wh/m2, and a MAPE of 20.71%. The second-best model is the 14-day time-window GRU model, with an RMSE and MAE of 870.08 Wh/m2 and 640.2 Wh/m2, respectively, and a MAPE of 20.79%. A graphical comparison of the top five models for each of the three future days is depicted in Figs. 9, 10 and 11, respectively. Based on the obtained results, the 7-day time-window GRU model is adopted in this study to predict solar irradiation. The optimal hyperparameters of the best models are reported in Table 2. The reported error values remain large. In a future study, we plan to use a finer temporal resolution to predict solar irradiation.

Table 1. Forecast Scores of the Different Prediction Models.
           7 days                       14 days
Model      RMSE     MAPE    MAE         RMSE     MAPE    MAE
ANN        904.48   21.7    648.98      942.26   22.41   718.44
DNN        956.26   22.92   736.76      919.56   22.38   680.6
CNN        886.5    21.38   640.53      886.3    21.82   650.48
LSTM       896.79   20.85   660.4       907.36   22.07   629.97
GRU        865.57   20.71   637.17      870.08   20.79   640.2
biLSTM     893.74   21.28   654.71      960.03   23.43   663.53
CNNLSTM    935.06   22.52   677.38      976.1    23.13   738.93
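The windowed data layout and the three error metrics of Eqs. (5) to (7) can be sketched together in pure Python. The function names and the samples-first ordering of the windows are assumptions made for readability (the text describes an F × W × S arrangement), and the toy series below is not the GMAO data.

```python
import math


def make_windows(series, window, horizon=3, target=0):
    """Split a multivariate daily series into (inputs, targets) pairs.

    series: list of per-day feature lists; `target` is the irradiance column.
    Each input holds `window` consecutive days, each target the next `horizon` days.
    """
    inputs, targets = [], []
    for start in range(len(series) - window - horizon + 1):
        stop = start + window
        inputs.append(series[start:stop])
        targets.append([day[target] for day in series[stop:stop + horizon]])
    return inputs, targets


def rmse(observed, predicted):
    """Root mean square error, Eq. (5)."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed))


def mae(observed, predicted):
    """Mean absolute error, Eq. (6)."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def mape(observed, predicted):
    """Mean absolute percentage error in percent, Eq. (7); observed values must be non-zero."""
    return 100.0 / len(observed) * sum(abs((o - p) / o) for o, p in zip(observed, predicted))


# Usage: 12 toy days with two features, a 7-day window and a 3-day horizon.
series = [[float(day), 0.0] for day in range(1, 13)]
inputs, targets = make_windows(series, window=7)
print(len(inputs), targets[0])
print(rmse([100.0, 200.0], [110.0, 180.0]), mae([100.0, 200.0], [110.0, 180.0]))
```

Switching `window` from 7 to 14 reproduces the two time-window settings compared in Table 1.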
Table 2. Optimal Training Parameters of the Best Forecast Models.
Model      Layer1   Dropout   Layer2   Dropout   Batch size   Epochs   RMSE     MAE      MAPE
7d GRU     96       0.1       128      0.1       128          50       865.57   673.17   20.71
14d GRU    128      0.1       96       0         128          60       870.08   640.2    20.79
Fig. 9. Comparison between Forecast Models for Day+1.
4 System Assessment
In this section, we evaluate the proposed light controller and compare it against several light controllers from the literature: (i) a traditional light controller with no dimming; (ii) a controller based on future hourly traffic, stored battery energy, and the expected solar irradiation for the following day [7]; and (iii) an adaptive light controller based on the Italian lighting regulation, where light is classified per traffic volume [11]. The system is simulated in the Matlab environment. The lamp is a 30 W LED module with an efficacy of 130 lm/W. The battery is sized to keep each PV-powered SL functioning for 10 h daily, with an autonomy of three consecutive days. The battery voltage is assumed to be 12 V, with a capacity of 100 Ah. The total stored energy in joules is given by Eq. 21. The PV panel is sized using the PVsyst software, for an output power of 213.5 W at the maximum power point (MPP). In the studies mentioned above, the light controller outputs the level of lighting flux φ, so a link that converts light flux into power needs to be established. As in [7], the 30 W LED module is modeled as a 5 × 3 matrix of ’LUXEON Rebel’ LED points [2]. Using the datasheet, a link between the light flux φ and the current of each LED point Id is first established using the regression expression in Eq. 19. The instantaneous lamp power is then calculated with Eqs. 18–20. The term Rd represents the dynamic resistance, and
F. Agramelal et al.
Fig. 10. Comparison between Forecast Models for Day+2.
Fig. 11. Comparison between Forecast Models for Day+3.
Vth is the threshold voltage of the LED module. The simulation distinguishes two modes: discharge and charge. The discharge mode, which occurs during nighttime, is activated once the ambient illuminance falls below a certain threshold. The FLC computes the required road illuminance, which is then converted to the instantaneous power value using Eq. 23. The remaining battery energy is then estimated by subtracting the energy drawn at this instantaneous power from the previous value. For the other studies, the battery’s state of charge is computed in the same manner, the only difference being that flux, rather than road illuminance, is converted to power. Conversely, the charging mode is activated once the ambient illuminance rises above a set threshold. With the previously sized PV panel, the instantaneous power is calculated by converting the solar irradiance using Eq. 22. It is worth noting that for an accurate calculation of the power generated by the PV solar
panel, the irradiance on the plane of array (POA) needs to be computed. Consequently, the accurate position of the panel in the studied area, with a known tilt angle and diffuse horizontal irradiance (DHI), amongst other parameters, must be known. Therefore, in this study, we limit the computation of generated power to the GHI value.

$$V_o = R_d I_o + V_{th} \tag{18}$$

$$I_d = 66.9713\,\varphi^3 + 38.3589\,\varphi^2 + 595.0029\,\varphi - 0.59462 \tag{19}$$

$$P_t = V_o \, I_o = V_o \, (3 I_d) \tag{20}$$

$$E_0 = C_0 U_0 \tag{21}$$

$$P = \mathrm{MPP} \times \frac{irr}{1000} \times \left(1 + \gamma(\theta - 25)\right) \tag{22}$$

$$E = \frac{P \cdot \eta}{2h} \tag{23}$$

The simulation is carried out during winter, representing the worst-case scenario, i.e., nighttime is longer than daytime, and the generated power is mostly insufficient due to cloud cover. Figure 12 shows the evolution of the battery’s state of charge over time for the entirety of December for each lighting control strategy.
Fig. 12. Comparison between Different Control Schemes during Winter.
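The energy bookkeeping behind the simulation can be sketched from Eqs. 18 to 22. The code below is a minimal illustration under stated assumptions, not the authors' Matlab model. The value of `gamma`, the reading of θ as a temperature in °C, the clipping of the stored energy to the range [0, E0], and all function names are hypothetical. The units of Id in Eq. 19 are not stated in the text, so `lamp_power` leaves unit consistency to the caller.

```python
def led_current(phi):
    """Eq. 19: regression from light flux phi to the current of one LED point."""
    return 66.9713 * phi**3 + 38.3589 * phi**2 + 595.0029 * phi - 0.59462


def lamp_power(i_d, r_d, v_th):
    """Eqs. 18 and 20: P_t = V_o * I_o, with I_o = 3 * I_d and V_o = R_d * I_o + V_th.

    r_d and v_th are not given in the text; the caller supplies them
    in units consistent with i_d.
    """
    i_o = 3.0 * i_d
    v_o = r_d * i_o + v_th
    return v_o * i_o


def battery_energy_joules(capacity_ah=100.0, voltage_v=12.0):
    """Eq. 21: E_0 = C_0 * U_0, converted from watt-hours to joules."""
    return capacity_ah * voltage_v * 3600.0


def pv_power(irr, theta, mpp=213.5, gamma=-0.004):
    """Eq. 22: PV output from irradiance irr (W/m^2).

    gamma is a hypothetical coefficient; theta is assumed to be in deg C.
    """
    return mpp * (irr / 1000.0) * (1.0 + gamma * (theta - 25.0))


def step_energy(e_prev, power_w, dt_s, e_max, charging):
    """One simulation step: add PV energy when charging, subtract lamp energy otherwise."""
    delta = power_w * dt_s
    e_next = e_prev + delta if charging else e_prev - delta
    return min(max(e_next, 0.0), e_max)  # assumption: energy is clipped to [0, e_max]


e0 = battery_energy_joules()                               # 100 Ah at 12 V
e_after_night_hour = step_energy(e0, 30.0, 3600.0, e0, charging=False)
```

With the stated 100 Ah and 12 V, Eq. 21 gives 1200 Wh, i.e. 4.32 MJ, and one hour of the undimmed 30 W lamp draws 108 kJ, or 2.5% of that capacity.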
Assuming that the battery starts at a 100% charge, conventional lighting shows the worst behavior: its charge not only drops below 30% but is also fully depleted by the end of the month. The other control strategies and the proposed controller keep the state of charge above 60%. In some cases, controller (iii) discharges faster, as it changes the required lighting flux in steps, as opposed to the other controllers, where the change in illuminance is stepless and seemingly imperceptible to road users. The other reason is that this system is traffic adaptive only, with no knowledge of the battery’s charge state or expected
generated energy. The proposed controller is shown to be efficient, as it keeps an average battery charge of about 94%, with the minimum being 82%. In contrast, controllers (ii) and (iii) keep an average battery charge of about 92% and 83%, with the minimum being 80% and 63%, respectively.
5 Conclusion
This paper proposes a smart controller for PV-powered autonomous street lamps to extend their usage time in adverse traffic and weather conditions. The system uses an FLC to adapt the lamp’s brightness, and it uses weather and traffic data to forecast the solar radiation over the upcoming three days and the traffic volume in the next hour. First, we tested several forecast models using appropriate metrics. Then a multi-variate GRU model was validated and chosen to predict solar radiation. Next, using both predicted parameters and the battery level, the FLC outputs the appropriate illuminance on the road surface. Finally, modeling and simulation of the system were conducted in the Matlab environment. Compared to other light controllers, simulations show that the suggested controller can keep an average battery charge above 90% in the worst-case scenario, i.e., winter time, while providing adequate lighting to road users. In future work, we plan to use in-situ photo-voltaic street lamps and to forecast the actual generated power directly instead of the solar radiation. We also plan to fuse the proposed light controller with video processing techniques to detect vehicles and to merge it with other light control strategies.
References
1. IEA, greenhouse gas emissions. https://www.iea.org/data-and-statistics/data-tools/greenhouse-gas-emissions-from-energy-data-explorer. Accessed 10 Nov 2022
2. LUMILEDS, LUXEON Rebel ES. https://lumileds.com/wp-content/uploads/files/DS61.pdf. Accessed 30 Mar 2022
3. UNEP, COP-27 Announcement. https://www.unep.org/news-and-stories/story/cop27-ends-announcement-historic-loss-and-damage. Accessed 10 Nov 2022
4. CIE 115:2010 recommendations for the lighting of roads for motor and pedestrian traffic. International Commission on Illumination, Vienna, Austria (2010)
5. Shlayan, N., Challapali, K., Cavalcanti, D., Oliveira, T., Yang, Y.: A novel illuminance control strategy for roadway lighting based on greenshields macroscopic traffic model. IEEE Photon. J. 10(1), 1–11 (2018)
6. Agramelal, F., Sadik, M., El Hannani, A., Moubarak, Y.: A traffic-aware street lighting system based on fuzzy logic controller. In: 2022 IEEE 18th International Colloquium on Signal Processing & Applications (CSPA), pp. 132–137. IEEE (2022)
7. Agramelal, F., Sadik, M., Sabir, E.: A dual carriageway smart street lighting controller based on multi-variate traffic forecast. In: Kacprzyk, J., Ezziyyani, M., Balas, V.E. (eds.) AI2SD 2022. LNNS, vol. 637. Springer, Cham (2022). https://doi.org/10.1007/978-3-031-26384-2_41
8. Jackett, M., Frith, W.: Quantifying the impact of road lighting on road safety-a New Zealand study. IATSS Res. 36(2), 139–145 (2013)
9. Lau, S.P., Merrett, G.V., Weddell, A.S., White, N.M.: A traffic-aware street lighting scheme for smart cities using autonomous networked sensors. Comput. Electr. Eng. 45, 192–207 (2015)
10. Mustafa, A.M., Abubakr, O.M., Derbala, A.H., Ahmed, E., Mokhtar, B.: Towards a smart highway lighting system based on road occupancy: model design and simulation. In: Sucar, E., Mayora, O., Muñoz de Cote, E. (eds.) Applications for Future Internet. LNICST, vol. 179, pp. 22–31. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-49622-1_4
11. Petritoli, E., Leccese, F., Pizzuti, S., Pieroni, F.: Smart lighting as basic building block of smart city: an energy performance comparative case study. Measurement 136, 466–477 (2019)
12. Schuster, M., Paliwal, K.K.: Bidirectional recurrent neural networks. IEEE Trans. Signal Process. 45(11), 2673–2681 (1997)
13. Tukymbekov, D., Saymbetov, A., Nurgaliyev, M., Kuttybay, N., Dosymbetova, G., Svanbayev, Y.: Intelligent autonomous street lighting system based on weather forecast using LSTM. Energy 231, 120902 (2021)
Using Smartphone Sensing for Recognition of Game Player Attributes During Gameplay
Muhammad Saad Khaquan1, Muhammad Ehatisham-ul-Haq1(B), Fiza Murtaza1, Aasim Raheel2, Aamir Arsalan3, and Muhammad Awais Azam4
1 Department of Creative Technologies, Air University Islamabad, Islamabad, Pakistan [email protected]
2 University of Engineering and Technology Taxila, Taxila, Pakistan
3 Fatima Jinnah Women University, Rawalpindi, Pakistan
4 Technology Innovation Research Group, School of Information Technology, Whitcliffe, Wellington, New Zealand
Abstract. With the recent advances in smart sensing technologies, many people are now using smartphones to do routine tasks. In particular, the use of smartphones for gaming and infotainment purposes has increased significantly, as a result of which the gaming industry is expanding globally. Smartphones can now sense the interactions of a user with the device based on the embedded sensors. These human-smartphone interactions can be recognized using machine learning approaches to sense and automate/control different tasks being performed on the device. In particular, smartphone-based games have a lot of advantages, where the game controls can be seamlessly adjusted automatically based on the game player’s interaction to achieve a better gaming experience. In this regard, we propose a smartphone sensor-based method to recognize the attributes of a user, i.e., game player, during gameplay. The proposed scheme is based on the idea that different game players have their own ways of interacting with the device when playing games. The smartphone inertial sensors can be used to track these interactions and recognize different attributes (such as expertise level, gender, and identity) of a game player. This information can further be used in the games for adaptive control to maximize the user’s gaming experience. The proposed scheme is validated based on different experiments, and the overall average accuracy of 71.3% is achieved in the best case; thus, satisfactory results are achieved.
Keywords: Attribute Recognition · Game Player Interaction · Machine Learning · Smartphone Sensing · Ubiquitous Computing
1 Introduction
Computing and sensing technologies have drastically changed and improved in the past few years, making smartphones more widespread and constantly present.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023. K. Arai (Ed.): SAI 2023, LNNS 739, pp. 26–38, 2023. https://doi.org/10.1007/978-3-031-37963-5_3
Recognition of Game Player Attributes During Gameplay
Nowadays, more and more people are using smartphones for different purposes. A smartphone is a phone that can do more than just make calls and send texts: it can also be used to access the internet, take pictures, and run different applications. Many smartphones have embedded sensors that enable activities such as playing games. These usually include inertial sensors, such as an accelerometer and a gyroscope, which measure how the phone is moving or positioned. They can be used for different purposes, such as counting steps or even controlling game inputs. With the widespread availability and usability of smartphones, the gaming business has found a new platform. As a result, there has been a huge increase in the number of people playing mobile games over the last few years [1,5]. The gaming industry is a big part of the entertainment industry and has become a way of life for many people, who use it for learning and enjoyment. End users of the products and services in this industry have become more numerous in recent years because of the advancement and availability of smart computing technologies. In addition, the gaming industry has changed because people can now play their favorite games on their smartphones instead of video game consoles, which allows them to play anywhere and at any time. Games designed for smart devices and tablets now compete with traditional computer-based games and have caught the attention of a wide range of people from different walks of life [7]. Virtual reality-based games are being developed that are controlled and played with the help of sensors placed on the human body. In this regard, recognition of the game player context, i.e., attribute information, during the gameplay can help the gaming industry to evaluate the gaming experience of different people.
In [5], the authors proposed a system that uses a combination of the accelerometer and gyroscope to classify mobile game players as experts or novices. In [4], the authors analyzed how smartphone users can be identified using embedded accelerometer sensor data. They concluded that different users have their own way of interacting with a smartphone, which can be tracked using smartphone motion sensors for user identification based on machine learning algorithms. In [12], the authors proposed a new method for user identification, which is based on the characteristics of mobile phone usage (such as position changes when carrying, talking, or taking other actions). They validated their scheme by acquiring accelerometer sensor data from different users for several months during their ordinary smartphone usage. In [21], the authors proposed a method to predict the performance of game players using sensor data, together with a scheme to predict whether a player is likely to win a future game. In [3,8,11,14], the authors analyzed how experienced and non-experienced athletes punch, and how those punches vary in terms of force, velocity, and time. They used the expert-novice paradigm to study 31 athletes, looking at how well they executed four different types of punches: the cross, jab, uppercut, and hook strike. These studies also utilized smartphone sensors to track the player’s performance and classify punches with convolutional neural network models in boxing video games. In [19,20], the authors also utilized data from sensors to predict the player’s performance in a game. To obtain this data, they collected information from professional and amateur players.
M. S. Khaquan et al.
The player’s performance is evaluated using game logs from a multiplayer game. A recurrent neural network is used to assess the player’s performance at each moment in time. The authors found that an attention mechanism can improve the overall performance of the network. In [10,18], a torso sensor is installed to monitor the athlete’s physical state in real-time. The relevant basketball movement is divided into different sections so that the athlete’s physical state can be monitored more easily. In [15,17], the authors analyzed how accurate an inertial sensor device is at measuring distance and speed during a soccer-specific circuit. They also measured the speed and movements of a player from video analysis to study different academy-level soccer players. In [22], the authors proposed a method to allow people suffering from a stroke to regain movement in their hands. Their proposed system used sensors to track the movement of the person’s muscles and used that information to provide feedback to the person for improving their movement. The system also included two games that the person can play to help them practice the movement they are trying to regain. In [13,16], a wearable device was used to detect the forward, backward, and rotating movements of game players. The data regarding the body/hand movement is sent to the Unity 3D game engine using Bluetooth. The game engine used that data to control the movements of the character in the game. Mobile-based video games are becoming increasingly popular, thus requiring them to be automatically adaptive. The success of a chosen game depends on its ability to capture the end user’s attention, the level of enjoyment of a particular recreation, and responsiveness. In this regard, recognition of game player attributes, such as experience/expertise level, gender, age, and identity, is very important to keep track of how well a player is doing during a game. 
The game settings and difficulty levels can be adjusted based on the player’s abilities, making the game more exciting and enjoyable for the player. Computing devices are becoming increasingly capable of understanding people, and many methods have been proposed for recognizing individual attributes such as gender. For example, researchers have proposed various methods using face, speech, brain signals, and video-processing techniques for gender recognition. In [9], genders are recognized by analyzing the differences between male and female voices. In [24], a technique is proposed for gender recognition based on facial images. In [23], a convolutional neural network is used to analyze electroencephalography (EEG) data in order to identify individual users. In [6], a method is proposed to classify gender using EEG during mobile gameplay. The capacity of an online game to model expected player behavior is very important because it allows developers to collect data-driven insights for creating a well-informed game design. Therefore, in this research work, we aim to recognize game player attributes using smartphone-embedded sensors, which can further help in raising the quality of the gameplay experience.
1.1 Major Contributions
In this research work, we propose a new way of recognizing game player attributes based on mobile sensors (accelerometer and gyroscope), figuring out how good someone is at playing a mobile game. We classify the game players based on their
skill level and categorize them into either experts or novices. We also identify the gender of a game player during the gameplay based on how the person interacts with the phone. Moreover, we identify the individual game player based on smartphone movements and interactions during the gameplay. In this regard, we use an existing dataset for smartphone-based gameplay to train and test the proposed model with the help of a Random Forest classifier. To the best of our knowledge, this is the first time that a smartphone sensor-based method is proposed to recognize multiple attributes of the game player during gameplay. The results section shows the efficacy of our proposed idea and method. The proposed idea could be used to devise new methods for virtual reality-based games, in which wearable sensors can be used to recognize the game player attributes and interactions for further game adaptation.
1.2 Paper Organization
The rest of the paper is organized as follows: Sect. 2 describes the proposed method in detail, whereas Sect. 3 presents a detailed analysis of the achieved results. Section 4 concludes the findings of this research work and provides recommendations for future work.
2 Proposed Methodology
The proposed scheme for game player attributes recognition consists of the following key stages: 1) data acquisition and pre-processing, 2) feature extraction and selection, and 3) attribute classification, as shown in Fig. 1. The details regarding each stage are provided below.
Fig. 1. Proposed Methodology for Game Player Attribute Recognition.
2.1 Data Acquisition and Pre-processing
We used an existing dataset [5] for game player attribute recognition, which was collected from thirty-eight (38) healthy individuals (19 males and 19 females) when playing the “Traffic Racer” game on a smartphone. The inertial sensors
data were recorded during the gameplay using a Huawei Nova 3i mobile phone with Android v.8.1 via the CrowdSense application. Every participant completed three (03) trials of the gameplay, each with a duration of 100 s. More details regarding the dataset can be obtained from [5]. After data acquisition, we applied a moving average smoothing filter (size: 1 × 3) to the inertial sensor data for noise removal and then performed feature extraction.
2.2 Feature Extraction and Selection
After the pre-processing stage, we extracted twenty time-domain features from the sensor data: maximum and minimum amplitude, maximum and minimum latency, latency-amplitude ratio, absolute latency-amplitude ratio, peak-to-peak value (amplitude, time, slope), mean, standard deviation, skewness, kurtosis, energy and normalized energy, entropy, absolute mean of the first and second difference of the signal, and absolute mean of the first and second difference of the normalized signal. We extracted features separately from each axis of the accelerometer and gyroscope sensors, resulting in a feature vector of size [1 × (20 × 3)] = [1 × 60] per sensor. To further enhance the performance of the extracted features, we applied rank-based feature selection, using the information gain (i.e., entropy) value of each feature for ranking [2]. We finally chose the top twenty features per sensor based on their entropy scores for the subsequent classification tasks. The entropy scores of the remaining features were very low, so we ignored them in the feature selection variant of the proposed method.
2.3 Attribute Classification
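Before the classifier is applied, each sensor axis is smoothed and reduced to a feature vector as described in Sect. 2.2. The following is a minimal sketch of that step, covering the 1 × 3 moving average and a subset of the listed time-domain features. The exact feature definitions are not given in the text, so the choices below (population rather than sample standard deviation, shorter windows at the signal edges, "absolute mean of the difference" read as the mean of absolute differences) are assumptions, and the function names are illustrative.

```python
import math


def moving_average(signal, size=3):
    """Centered moving average; edge samples use the neighbors that exist."""
    half = size // 2
    smoothed = []
    for i in range(len(signal)):
        window = signal[max(0, i - half): i + half + 1]
        smoothed.append(sum(window) / len(window))
    return smoothed


def time_domain_features(signal):
    """A subset of the time-domain features for one sensor axis (needs >= 3 samples)."""
    n = len(signal)
    mean = sum(signal) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in signal) / n)  # population std
    d1 = [b - a for a, b in zip(signal, signal[1:])]           # first difference
    d2 = [b - a for a, b in zip(d1, d1[1:])]                   # second difference

    def central_moment(order):
        return sum((x - mean) ** order for x in signal) / n

    return {
        "max_amplitude": max(signal),
        "min_amplitude": min(signal),
        "peak_to_peak": max(signal) - min(signal),
        "mean": mean,
        "std": std,
        "skewness": central_moment(3) / std ** 3 if std > 0 else 0.0,
        "kurtosis": central_moment(4) / std ** 4 if std > 0 else 0.0,
        "energy": sum(x * x for x in signal),
        "mean_abs_first_diff": sum(abs(d) for d in d1) / len(d1),
        "mean_abs_second_diff": sum(abs(d) for d in d2) / len(d2),
    }


# Made-up samples, for illustration only
smoothed = moving_average([1.0, 2.0, 6.0])
features = time_domain_features([1.0, 3.0, 2.0, 4.0])
```

Applying `time_domain_features` to each of the three axes of a sensor and concatenating the results would give one vector per sensor, analogous to the [1 × 60] vector described above when all twenty features are used.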
After feature extraction, the next stage is to classify the feature vectors into game player attributes. For this purpose, we used the random forest (RF) classifier for training our model because of its efficient performance and robustness in the existing studies for smartphone sensor-based classification. The RF algorithm uses decision trees to label the input data and make predictions. Each decision tree makes its prediction, and then RF combines these predictions to get a final prediction. The outcome of the classifier is based on the majority voting criterion, which means that the class with the most predictions becomes the output of the prediction. For experimentation, we used an RF classifier with 100 decision trees. We trained the RF classifier independently for three types of attribute classification, including expertise level classification (expert/novice), gender classification (male/female), and game player classification (user identification). The feature vectors were labeled independently against each type of experiment.
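The majority voting criterion described above can be illustrated in a few lines. This sketch only shows how per-tree labels are combined into one output; it does not train any trees, and the vote counts are made up. Note that some libraries combine trees differently (scikit-learn's `RandomForestClassifier`, for instance, averages the trees' class probabilities), and the text does not state which implementation was used.

```python
from collections import Counter


def majority_vote(tree_predictions):
    """Return the label predicted by the largest number of trees.

    Ties are resolved in favor of the label that appears first in the list.
    """
    counts = Counter(tree_predictions)
    return counts.most_common(1)[0][0]


# 100 hypothetical tree outputs for a single feature vector
votes = ["expert"] * 58 + ["novice"] * 42
label = majority_vote(votes)
```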
3 Analysis of Results
This section presents the detailed results and analysis of the proposed scheme. As mentioned earlier, we perform three types of experimentation for the recognition of game player attributes, i.e., expert/novice classification, male/female
classification, and game player identification. The detailed results of these experiments are given in the following subsections.
3.1 Analysis of Expertise Level Classification
In our first type of experiment, we recognized the expertise level of the game player as either expert or novice. In the existing dataset, 22 of the 38 game players were labeled as experts and the remaining 16 as novices. The expertise level of each user was labeled based on the gameplay score [5]. For the classification of expert and novice game players, we passed the extracted feature vectors (with and without feature selection) as input to the RF classifier and used a five-fold cross-validation scheme for evaluation. Table 1 and Table 2 provide the average results obtained for the expert/novice classification of the game player using the proposed scheme without and with feature selection, respectively.

Table 1. Average Results Obtained for Expertise Level Classification using RF Classifier (without Feature Selection).

| Metric | Accelerometer | Gyroscope | Acc. + Gyr |
|---|---|---|---|
| Accuracy (%) | 73.6 | 66 | 76.3 |
| Recall (%) | 73.5 | 75 | 76 |
| Precision (%) | 73.6 | 58.6 | 76.4 |
| F1 Score (%) | 73.5 | 65.3 | 76 |
| Specificity (%) | 73.5 | 59.2 | 76 |
| Feature Vector Size | 60 | 60 | 120 |

Table 2. Average Results Obtained for Expertise Level Classification using RF Classifier (with Feature Selection).

| Metric | Accelerometer | Gyroscope | Acc. + Gyr |
|---|---|---|---|
| Accuracy (%) | 77.1 | 66.6 | 78.9 |
| Recall (%) | 77.2 | 66.5 | 78.7 |
| Precision (%) | 77.1 | 66.5 | 78.9 |
| F1 Score (%) | 77.1 | 66.5 | 78.8 |
| Specificity (%) | 77.2 | 66.5 | 78.7 |
| Feature Vector Size | 20 | 20 | 40 |
It can be observed from Table 1 that the accelerometer sensor achieves an accuracy rate of 73.6% for expertise-level classification, whereas the gyroscope
Fig. 2. Comparison of Average Accuracy Rate Obtained for Expertise Level Classification.
achieves an average accuracy of 66%. If we combine the accelerometer sensor with the gyroscope, the accuracy rate improves to 76.3%. The precision, F1 score, and specificity are also better for the accelerometer than for the gyroscope, whereas the recall of the gyroscope (75%) is slightly higher. The best-case results are obtained with the fusion of accelerometer and gyroscope sensors. Table 2 provides the results for the expertise level classification with feature selection. It can be seen that the average accuracy rate achieved with the accelerometer sensor is 77.1%, which is 10.5 percentage points higher than that obtained with the gyroscope, i.e., 66.6%. When fusing both accelerometer and gyroscope, the average accuracy reaches 78.9%, which shows the efficacy of feature selection for performance improvement. Figure 2 provides a comparison of the expertise level classification results obtained with and without feature selection. Feature selection eliminates redundant features that may lead to misclassification. Thus, the results are further improved with feature selection, where we selected the top-ranked features for expertise-level classification. Moreover, the feature vector obtained with feature selection is one third of the size of the original feature vector. This shows that applying feature selection yields higher accuracy at a lower computational cost.
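The five metrics reported in Tables 1 and 2 can all be derived from the entries of a binary confusion matrix. The sketch below uses the standard definitions with made-up counts. How the paper averages the metrics over classes and folds is not stated, so this is an illustration of the definitions rather than a reproduction of the reported values.

```python
def binary_metrics(tp, fp, tn, fn):
    """Standard metrics from true/false positive and negative counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    recall = tp / (tp + fn)            # sensitivity: share of positives found
    precision = tp / (tp + fp)         # share of positive predictions that are right
    specificity = tn / (tn + fp)       # share of negatives correctly rejected
    f1 = 2 * precision * recall / (precision + recall)
    return {
        "accuracy": accuracy,
        "recall": recall,
        "precision": precision,
        "specificity": specificity,
        "f1": f1,
    }


# Made-up counts with "expert" as the positive class
metrics = binary_metrics(tp=8, fp=2, tn=6, fn=4)
```

A classifier that labels many novices as experts would show a high recall but low precision and specificity for the expert class, which is the pattern seen in the gyroscope column of Table 1.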
3.2 Analysis of Gender Classification
Gender is one of the important attributes of a game player. In the dataset we used for experimentation, there were 19 male and 19 female participants who acted as game players. For gender classification, we passed the extracted feature vectors (labeled as either “male” or “female” based on the ground truths) as input to the RF classifier and used a five-fold cross-validation scheme for evaluation. Table 3 and Table 4 provide the average results obtained for gender classification of the game player without and with feature selection, respectively, and Fig. 3 compares the two cases.

Table 3. Average Results Obtained for Gender Classification using RF Classifier (without Feature Selection).

| Metric | Accelerometer | Gyroscope | Acc. + Gyr |
|---|---|---|---|
| Accuracy (%) | 71 | 59.6 | 58.7 |
| Recall (%) | 71 | 59.6 | 58.7 |
| Precision (%) | 71 | 59.5 | 58.8 |
| F1 Score (%) | 71 | 59.7 | 58.6 |
| Specificity (%) | 71 | 59.6 | 58.7 |
| Feature Vector Size | 60 | 60 | 120 |

Table 4. Average Results Obtained for Gender Classification using RF Classifier (with Feature Selection).

| Metric | Accelerometer | Gyroscope | Acc. + Gyr |
|---|---|---|---|
| Accuracy (%) | 66.6 | 55.2 | 66.6 |
| Recall (%) | 66.6 | 55.2 | 66.6 |
| Precision (%) | 66.8 | 55.2 | 66.8 |
| F1 Score (%) | 66.5 | 55.2 | 66.5 |
| Specificity (%) | 66.6 | 55.2 | 66.6 |
| Feature Vector Size | 20 | 20 | 40 |
It can be observed from Table 3 that the accelerometer sensor achieves an accuracy rate of 71% for gender classification, whereas the gyroscope achieves an average accuracy of 59.6%. If we combine the accelerometer sensor with the gyroscope, the accuracy rate drops to 58.7%. Thus, the best-case results are obtained using the accelerometer sensor individually. Table 4 provides the gender classification results when using the feature selection. It can be seen that the average accuracy rate for the accelerometer sensor is 66.6%, whereas for gyroscope it is 55.2%. When fusing both accelerometer and gyroscope sensors
Fig. 3. Comparison of Average Accuracy Rate Obtained for Gender Classification.
with feature selection, the average accuracy obtained is 66.6%, which is the same as that achieved with the accelerometer only. Thus, adding the gyroscope to the accelerometer does not improve the gender classification results.
3.3 Analysis of Game Player Classification
For game player classification, the proposed scheme aims to recognize each user based on their gameplay behavior and hand movements measured using smartphone sensors. We assigned a different user label to each game player; as there were 38 participants in the dataset, we used 38 labels for classification. We then trained the RF classifier on the extracted features with these user labels. We utilized the five-fold cross-validation scheme for evaluation, and the obtained results are provided in Table 5 and Table 6. Figure 4 provides a comparison of the user classification results obtained with and without feature selection.
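The five-fold cross-validation used throughout the experiments partitions the feature vectors into five folds and tests on each fold once while training on the other four. The sketch below generates such index splits. It is a plain, unshuffled k-fold under assumed names; whether the original evaluation shuffled or stratified the samples is not stated in the text.

```python
def k_fold_indices(n_samples, k=5):
    """Yield (train, test) index lists; every sample is tested exactly once."""
    indices = list(range(n_samples))
    # Spread the remainder over the first folds so sizes differ by at most one
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


# Illustrative sample count only
folds = list(k_fold_indices(12, k=5))
```

The reported score for each configuration would then be the average of the metric over the five test folds.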
Table 5. Average Results Obtained for Game Player Classification using RF Classifier (without Feature Selection).

| Metric | Accelerometer | Gyroscope | Acc. + Gyr |
|---|---|---|---|
| Accuracy (%) | 50 | 27.1 | 62.2 |
| Recall (%) | 50 | 27.2 | 62.3 |
| Precision (%) | 49 | 27.3 | 61.5 |
| F1 Score (%) | 48.1 | 25.4 | 60.3 |
| Specificity (%) | 98.6 | 98 | 99 |
| Feature Vector Size | 60 | 60 | 120 |

Table 6. Average Results Obtained for Game Player Classification using RF Classifier (with Feature Selection).

| Metric | Accelerometer | Gyroscope | Acc. + Gyr |
|---|---|---|---|
| Accuracy (%) | 52.6 | 34.2 | 64 |
| Recall (%) | 52.6 | 34.2 | 64 |
| Precision (%) | 53.1 | 38.8 | 69.1 |
| F1 Score (%) | 51.1 | 34.2 | 63.6 |
| Specificity (%) | 98.7 | 98.2 | 99 |
| Feature Vector Size | 20 | 20 | 40 |
It can be observed from Table 5 that the accelerometer achieves an accuracy rate of 50% for game player identification during gameplay, whereas the gyroscope achieves an average accuracy of 27.1%. Thus, the accelerometer performs considerably better than the gyroscope. When we combine the accelerometer sensor with the gyroscope, the accuracy rate improves to 62.2%. Table 6 provides the results obtained for game player identification using feature selection. In this case, the average accuracy rate achieved with the accelerometer sensor is 52.6%, which is 18.4 percentage points higher than that obtained with the gyroscope, i.e., 34.2%. When fusing both accelerometer and gyroscope, the average accuracy reaches 64%. Hence, feature selection tends to provide better results for game player classification.
Fig. 4. Comparison of Average Accuracy Rate Obtained for Game Player Classification.
3.4 Discussions
It can be observed from the results reported in Tables 1, 2, 3, 4, 5 and 6 that the proposed scheme performs well for expertise-level classification, with a best average accuracy of 78.9%. For gender recognition, the best accuracy rate achieved is 71%, which is satisfactory. In contrast, the results achieved for game player classification have a best average accuracy of 64%, which is lower than those attained for the other experiments. However, game player classification involves 38 classes, which makes it hard to attain high accuracy. It can be concluded from these results that it is possible to recognize game player attributes based on how a player interacts with the smartphone and plays the game. The speed of hand movements and the motion of the smartphone itself reflect many characteristics of the game player, which can be tracked and learned, as demonstrated by the proposed method.
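Two observations about the 38-class experiment can be made concrete with a small calculation. First, with 38 equally likely classes, guessing at random would be correct about 2.6% of the time, so 64% is far above chance. Second, a one-vs-rest specificity stays high even at moderate accuracy, because for each class almost all other samples are true negatives; this is one plausible reading of the 98% to 99% specificity in Tables 5 and 6, although the text does not state how specificity was averaged. The function names and the small confusion matrix below are illustrative assumptions.

```python
def chance_accuracy(n_classes):
    """Expected accuracy of uniform random guessing over balanced classes."""
    return 1.0 / n_classes


def accuracy(confusion):
    """confusion[i][j] counts samples of true class i predicted as class j."""
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[c][c] for c in range(len(confusion)))
    return correct / total


def macro_specificity(confusion):
    """Average one-vs-rest specificity TN / (TN + FP) over all classes."""
    k = len(confusion)
    total = sum(sum(row) for row in confusion)
    specificities = []
    for c in range(k):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp
        fp = sum(confusion[r][c] for r in range(k)) - tp
        tn = total - tp - fn - fp
        specificities.append(tn / (tn + fp))
    return sum(specificities) / k


baseline = chance_accuracy(38)

# Made-up 3-class confusion matrix, for illustration only
toy = [[2, 1, 0],
       [0, 2, 1],
       [1, 0, 2]]
toy_accuracy = accuracy(toy)
toy_specificity = macro_specificity(toy)
```

Even in this three-class toy case the specificity exceeds the accuracy, and the gap widens as the number of classes grows.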
4 Conclusions
Recognizing game player attributes has become an interesting area of research with the increase in the pervasiveness of smartphones. It has a variety of applications in gaming, multimedia, and virtual and augmented reality. Thus, in this study, we propose a method for the analysis and recognition of game player attributes during gameplay. We utilize an existing dataset (comprising 38 participants) for experimentation and make use of the smartphone inertial sensors for classifying the expertise level, gender, and identity of a game user. We achieve the best average accuracy of 78.9%, 71%, and 64% for expertise-level classification, gender recognition, and game player identification, respectively, using the RF classifier. The results obtained by the proposed scheme are satisfactory, considering that little work has been done in this area based on smartphone sensors. In future studies, the proposed method can be extended to analyze game player attributes based on games belonging to multiple and diverse genres. Further, multiple sensing modalities (including physiological sensors) and novel feature extraction methods can be used to improve the accuracy of the system.
ESSL-Polyp: A Robust Framework of Ensemble Semi-supervised Learning in Polyp Segmentation
Toan Pham Van1 and Sang Dinh Viet2(B)
1 Sun-Asterisk R&D Department, Tokyo, Japan [email protected]
2 BKAI Research Center, Hanoi University of Science and Technology, Hanoi, Vietnam [email protected]
Abstract. We propose a robust framework called ESSL-Polyp that combines ensemble and semi-supervised learning to improve polyp segmentation accuracy. The intuition comes from our previous experiments with semi-supervised learning on polyp segmentation, in which semi-supervised models usually generalized better than supervised models with the same amount of training data, especially on out-of-domain datasets. In this paper, instead of using all labeled data, we split it into k folds, each defining a sub-dataset with labeled and unlabeled parts, to train the corresponding semi-supervised models. The ensemble of semi-supervised models is used to generate the final predictions. We achieve an average Dice score of 0.8557 on five popular benchmark datasets, including Kvasir, CVC-ClinicDB, ETIS-LaribPolypDB, CVC-ColonDB, and CVC-300. Meanwhile, the supervised baseline using the same training dataset only has an average Dice score of 0.8264. Our method yields especially strong performance compared to the supervised approach on out-of-domain datasets such as ETIS-LaribPolypDB, CVC-ColonDB, and CVC-300. The source code and pre-trained models are available at https://sal.vn/essl-polyp.
Keywords: Semi-Supervised Learning · Polyp Segmentation · Ensemble Learning
1 Introduction
Colonoscopy is a widely adopted technique for the detection of colon polyps. In light of the recent achievements in deep learning, several image segmentation techniques have been introduced to aid oncologists in reducing the time taken for diagnosis. It is worth noting, however, that deep learning models are generally data-hungry: they require a vast corpus of data to achieve high levels of accuracy [2]. Obtaining a labeled dataset is very expensive, especially for medical images. It requires not only time and effort but also the participation of people with expertise in this field [3]. Furthermore, this data type requires pixel-level annotation, leading to inconsistency between human annotators even given the same input image.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023. K. Arai (Ed.): SAI 2023, LNNS 739, pp. 39–52, 2023. https://doi.org/10.1007/978-3-031-37963-5_4
Semi-supervised learning uses both labeled and unlabeled data to improve the learning performance of a model, without requiring human intervention. The fundamental concept underlying semi-supervised learning is to employ the confident predictions generated by the model as pseudo labels to refine the model parameters [4]. Recent research studies [5,6] have demonstrated that it is possible to attain levels of accuracy comparable to supervised learning while requiring fewer annotations. In our polyp segmentation experiments in particular, we observe that learning from pseudo labels helps the semi-supervised model generalize better to out-of-domain data. In the previous trials, we used only 20% of the labeled data and treated the rest as unlabeled. The experimental findings indicate that the semi-supervised model exhibits superior performance compared to the supervised model, particularly on out-of-domain datasets.
Ensemble learning aims to enhance generalization ability by combining multiple base models. Each base model (also called a component model, ensemble member, or weak learner because of its simplicity) learns to represent a different sub-space of the problem space. The ensemble method builds a meta-model that is intended to perform better than any of its base models. The efficacy of ensemble learning is contingent upon the diversity among the base models, as reported by Krogh and Vedelsby [7]. In other words, the combination of identical base models is unlikely to yield an improvement in the final performance.
In the majority of ensemble methods, the base models are trained only on labeled data under a supervised setting, with diversity introduced by manipulating the training data. Examples of such methods include Bagging [8] and Adaboost [9]. Inspired by the achievements in semi-supervised and ensemble learning, we introduce a novel and resilient framework for the task of polyp segmentation. Our proposed method is compared against a supervised strategy with the same model architecture, as well as other state-of-the-art supervised approaches. The experimental results demonstrate that our method outperforms most of the supervised methods in average Dice score, specifically on out-of-domain datasets such as ETIS-LaribPolypDB, CVC-ColonDB, and CVC-300. Our main contributions are as follows:
– We propose a novel strategy called ESSL-Polyp that leverages semi-supervised learning along with model ensembling for the polyp segmentation task. We achieve an average Dice score of 0.8557 on five popular datasets, including Kvasir, CVC-ClinicDB, ETIS-LaribPolypDB, CVC-ColonDB, and CVC-300. Meanwhile, the supervised method on the same training dataset has an average Dice score of only 0.8264. Our method outperforms many other existing methods on most datasets.
– To the best of our knowledge, this is the first study to integrate ensemble and semi-supervised learning for the purpose of polyp segmentation.
The rest of this paper is structured as follows. Section 2 presents related work. The proposed method is presented in Sect. 3. Section 4 describes the datasets and details implementation methods. The experimental results are presented in Sect. 5. Finally, Sect. 6 concludes with discussions and future research directions.
2 Related Work
2.1 Polyp Segmentation
Polyp segmentation takes an endoscopic image as input and outputs a binary mask that localizes the pixels containing the polyp. Pixels with a value of 1 belong to the polyp, and pixels with a value of 0 do not. Several studies have endeavored to construct dedicated models for polyp segmentation, such as HarDNet-MSEG [10], AG-ResUNet++ [11], and ColonFormer [12]. It is noteworthy, however, that all of the aforementioned methods are based on supervised learning, which often necessitates a substantial quantity of labeled data. Obtaining dense pixel-level annotations for semantic segmentation is an arduous and costly task, particularly for medical segmentation data. Therefore, it is crucial for practical applications to adopt an efficient learning strategy to address the issue of data scarcity.
2.2 Semi-Supervised Learning for Semantic Segmentation
Supervised learning techniques are effective when ample labeled examples are available for learning. Nevertheless, acquiring labeled instances can be demanding, costly, or time-consuming in various applications, such as object recognition, document classification, and webpage categorization, as they necessitate empirical research or skilled human annotators. Semi-supervised learning uses both labeled and unlabeled data during training: it aims to enhance a base model trained on a dataset of N labeled samples $D_{sup} = \{(x_n, y_n) \mid n = 1, \ldots, N\}$ by additionally utilizing a dataset of M unlabeled samples $D_{unsup} = \{x_m \mid m = N+1, \ldots, N+M\}$. Many semi-supervised approaches, such as pseudo labeling [15], cross pseudo supervision [13], few-shot learning [14], and deep adversarial learning, have been applied to medical image data. Our previous experiment focused on generating high-quality pseudo-labels for semantic segmentation with an online learning strategy and a momentum teacher [16]. This approach uses only 20% of the labeled data but achieves a better result than the supervised method that uses all labeled data. Figure 1 demonstrates some failure cases of the supervised model on out-of-domain data. GradCAM's [17] visualization shows that the supervised method has a high uncertainty with these data. Meanwhile, the semi-supervised method generalizes better than the supervised method, even when using less labeled data. Encouraged by our previous results, in this work, we split the original training dataset into k folds. Each fold is used once as the labeled set while the k − 1 remaining folds form the unlabeled set. We train k semi-supervised models on the original dataset and combine them with various ensemble strategies such as the last layer average, validation-based weighted average, and soft voting. The output of the ensemble of k models is used for evaluation on five testing datasets, including Kvasir, CVC-ClinicDB, ETIS-LaribPolypDB, CVC-ColonDB, and CVC-300.
Fig. 1. Effectiveness of the Semi-Supervised Method on the ETIS-LaribPolypDB Dataset, which is Out-of-Domain with Respect to the Training Data. Each Column Represents a Different Feature Map and Binary Mask of the Supervised (using Complete Labeled Data) and Semi-Supervised Model (using 20% of Labeled Data and the Remaining as Unlabeled). (a) and (b) are the Input Image and Corresponding Ground Truth, (c) and (d) are the GradCAM's Visualization of the Segmentation Head and the Output Binary Mask of the Supervised Model, (e) and (f) are the GradCAM's Visualization of the Segmentation Head and the Output Binary Mask of the Semi-Supervised Model.
2.3 Ensemble Learning
Ensemble learning is a widely used meta approach in machine learning that aims to enhance the predictive performance of a model by combining the predictions of multiple models. Specifically, let $C = \{f_1(x), f_2(x), \ldots, f_K(x)\}$ be a collection of models in an ensemble. An ensemble method produces its final output by computing a weighted combination of the models in the ensemble. This is accomplished by:
$$F(x) = \sum_{i=1}^{K} w_i f_i(x) \qquad (1)$$
where $F(x)$ is the final model, and $w_i$ is either the weight of the i-th component base model or the weighted vote of the i-th model in the final decision model $F(x)$. Ensemble models are typically trained on multiple subsamples from the training data. There are two main types of ensemble frameworks: dependent and independent [18]. In a dependent framework, the output of one model is used to generate the next model, allowing knowledge from previous iterations to guide subsequent learning. Boosting is an example of this approach. In the independent approach, each model is built independently, and the outputs are combined using voting methods. In our proposed pipeline, we adopt the independent ensemble approach, where we combine the outputs of k semi-supervised models using soft voting, average, and weighted average methods.
2.4 Encoder-Decoder Architecture
The encoder-decoder architecture is one of the most widely used and effective deep learning architectures for medical image segmentation tasks. To encode an input image, the encoder first downscales it by computing feature representations at different resolution scales, and then produces feature maps that contain the encoded information. These feature maps are upsampled in the decoder and restored to the full segmentation map. As its primary model, this paper uses a fully convolutional network (FPN) [1] architecture with the DenseNet169 backbone [19]. This architecture is used in both the supervised and semi-supervised strategies in our experiments.
3 Proposed Method
In this section, we first describe the architecture of the proposed method. As depicted in Fig. 2, the overall pipeline consists of three steps: 1) data separation, which splits the original dataset into k folds, 2) training k independent segmentation models on the k folds with a semi-supervised strategy, and 3) ensembling the k models to generate the final output.
Fig. 2. Overview of our Architecture. We Split the Original Dataset into k Sub-Datasets with k-Folds Separation. We Divide each Sub-Dataset into Two Parts, Including Labeled and Unlabeled Data for Training Semi-Supervised Models. Finally, we Combine k Semi-Supervised Models to Generate the Final Output.
3.1 Data Separation
The core idea behind our method is to leverage the power of semi-supervised models on all the labeled data. In our previous method, we observed that the semi-supervised model has comparable or better performance than a supervised model, even when using only 20% of the labeled data. In those experiments, we randomly sampled the labeled data from the original dataset and treated the remaining data as unlabeled. An independent semi-supervised model was trained on this separation. Naturally, we expected that a combination of models trained on a k-fold separation could be more accurate than a single independent model. A visual comparison of feature maps extracted with different sub-models is shown in Fig. 3. Each semi-supervised model trained on a different part of the data behaves differently, even though the attention in the polyp regions is largely similar across models. An ensemble method is therefore needed to obtain a better result.
Fig. 3. Visual Comparison of Feature Maps Extracted with Different Sub-Models with k = 3. (a) and (b) are the Input Image and Ground Truth Mask, (c), (d), and (e) are the Feature Maps Extracted from each Sub-Model Trained with Semi-Supervised Learning.
Our separation strategy is inspired by the traditional cross-validation method but serves a different purpose. In the field of machine learning, cross-validation is a widely used technique to evaluate the performance of a model on unseen data. This method involves dividing the data into k groups, with each group being used as a validation set once, and the remaining k − 1 groups used for training the model. This process is referred to as k-fold cross-validation. The value of k is a single parameter that determines the number of groups into which the data will be divided. For example, a value of k = 5 would result in a 5-fold cross-validation. In our proposed method, we use a modified version of cross-validation, where one fold is used as the labeled training data, and the remaining k − 1 folds are treated as unlabeled data to train semi-supervised models. This approach is illustrated in Fig. 4. By utilizing unlabeled data in the training process, our method aims to improve the model's performance while reducing the need for a large amount of labeled data.
Fig. 4. Example of our 5-Folds Data Separation Strategy.
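The separation described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the function name `make_semi_supervised_splits`, the random shuffling, and the fixed seed are hypothetical choices, and the sample count of 1450 is simply the sum of the 900 and 550 training images mentioned in Sect. 4.1 (any validation hold-out is ignored here).

```python
import random


def make_semi_supervised_splits(num_samples, k, seed=0):
    """Return k (labeled, unlabeled) index splits.

    In split i, fold i is the labeled set and the other k - 1 folds
    together form the unlabeled set (hypothetical helper, for illustration).
    """
    if not 2 <= k <= num_samples:
        raise ValueError("k must be between 2 and num_samples")
    indices = list(range(num_samples))
    random.Random(seed).shuffle(indices)
    # Strided slicing yields k disjoint folds whose sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    splits = []
    for i, labeled in enumerate(folds):
        unlabeled = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((sorted(labeled), sorted(unlabeled)))
    return splits


splits = make_semi_supervised_splits(num_samples=1450, k=5)
for fold_id, (labeled, unlabeled) in enumerate(splits):
    print(fold_id, len(labeled), len(unlabeled))
```

Every image is labeled in exactly one split and unlabeled in the other k − 1, so the k sub-models together see every annotation once, which is the property the ensemble step relies on.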
3.2 Training k Semi-Supervised Models
The process of semi-supervised learning involves training models on both labeled and unlabeled data. In our approach, we have k independent sub-models {f1, · · · , fk}, each of which is trained with a different fold as its labeled set and the remaining k − 1 folds as its unlabeled set. We utilize a two-step technique for semi-supervised training, in which we first train a teacher model on the labeled dataset. During the teacher training process, we maintain a slowly updated copy of the model, updated by an exponential moving average (EMA) of its weights and called the momentum teacher. The best momentum teacher on the validation set is then used to generate pseudo-labels during student training. Online pseudo-labeling [16] is used during student training with the momentum teacher model. At the same time, the weights of the original student model are used to update, via EMA, both the momentum teacher and the momentum version of the student. Finally, after the training is completed, the momentum student is used to make the final predictions. By using a momentum-based approach, we aim to reduce the effects of noisy pseudo-labels and make our training process more robust. The overall pipeline of this training strategy is shown in Fig. 5.
3.3 Ensemble Semi-supervised Models
When using semi-supervised learning on each fold, we expect each sub-model to exploit the strengths of this method on a different part of the original dataset. The combination of k sub-models covers the full original labeled data while aggregating the strengths of the semi-supervised models. We use soft voting and hard voting to combine the segmented output masks of the individual models. In the soft voting strategy, we use the values from the last layer of the segmentation head directly, without applying the binary threshold. The sigmoid activation function is applied in this layer to create a probability mask. The closer a pixel value is to 1, the higher its probability of belonging to the segmented polyp region. The outputs of the individual models are weighted equally in Eq. 1. In the hard voting strategy, we first convert each probability mask to a binary mask with a per-pixel threshold T. In this work, we choose T = 0.5 for simplicity. Hard voting is then applied to the K binary masks: for each pixel, it predicts the class with the largest number of votes from the models.
Fig. 5. Overview of our Semi-Supervised Learning Strategy. The Green Dashed Arrow Corresponds to Pseudo Labeling on Unlabeled Data in each Training Iteration. The Blue Dashed Arrow Corresponds to Online Updating the Teacher during Training with EMA. And, the Red Dashed Arrow Corresponds to Updating the Student Network during Training with EMA. After the Training Finished, we used the Momentum Student Network to make the Final Predictions. (Color figure online)
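The difference between the two voting strategies is easiest to see on a tiny example. The sketch below is a plain-Python illustration under stated assumptions, not the authors' code: masks are nested lists of sigmoid probabilities, the function names `soft_vote` and `hard_vote` are hypothetical, a pixel is treated as foreground when its value is strictly greater than the threshold, and ties in hard voting (possible only for even K) go to the background.

```python
def soft_vote(prob_masks, weights=None, threshold=0.5):
    """Weighted average of probability masks (Eq. 1), thresholded once at the end."""
    k = len(prob_masks)
    if weights is None:
        weights = [1.0 / k] * k  # equal weights, as described in the text
    rows, cols = len(prob_masks[0]), len(prob_masks[0][0])
    fused = [
        [sum(w * m[r][c] for w, m in zip(weights, prob_masks)) for c in range(cols)]
        for r in range(rows)
    ]
    return [[1 if p > threshold else 0 for p in row] for row in fused]


def hard_vote(prob_masks, threshold=0.5):
    """Threshold each mask first, then take a per-pixel majority vote."""
    k = len(prob_masks)
    rows, cols = len(prob_masks[0]), len(prob_masks[0][0])
    result = []
    for r in range(rows):
        row = []
        for c in range(cols):
            votes = sum(1 for m in prob_masks if m[r][c] > threshold)
            row.append(1 if 2 * votes > k else 0)  # assumption: ties go to background
        result.append(row)
    return result


# Three hypothetical 2 x 2 probability masks from K = 3 sub-models.
masks = [
    [[0.9, 0.6], [0.8, 0.1]],
    [[0.4, 0.6], [0.7, 0.2]],
    [[0.4, 0.1], [0.9, 0.3]],
]
soft = soft_vote(masks)
hard = hard_vote(masks)
print(soft)
print(hard)
```

In the top-left pixel one very confident model outweighs two mildly negative ones under soft voting but loses the majority vote, while the top-right pixel shows the opposite case. Soft voting keeps the confidence information that hard voting discards at the thresholding step.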
4 Dataset and Experiments Setup
4.1 Dataset
In this paper, we utilized the datasets proposed in [10], which include Kvasir [26], CVC-ClinicDB [27], CVC-ColonDB [28], CVC-300 [29], and ETIS-Larib Polyp DB [30]. These datasets were used to make a detailed comparison with other state-of-the-art (SOTA) models. The training dataset comprised 900 images from Kvasir-SEG and 550 from CVC-ClinicDB. The test dataset comprised 798 images drawn from different datasets, namely CVC-300, CVC-ClinicDB, CVC-ColonDB, ETIS-LaribPolypDB, and Kvasir-SEG. For our experiments, we used a validation dataset of 10
4.2 Evaluation Metrics
Our proposed method is evaluated using two widely used evaluation metrics for medical image segmentation tasks: mean Dice coefficient (mDice) and mean Intersection over Union (mIoU). The mDice coefficient measures the similarity between the predicted and ground truth segmentation masks, while the mIoU measures the ratio of the intersection to the union of the predicted and ground truth masks. With TP, FP, and FN denoting the numbers of true positive, false positive, and false negative pixels, they are defined as follows:
$$\mathrm{mIoU} = \frac{TP}{TP + FP + FN} \qquad (2)$$
$$\mathrm{mDice} = \frac{2 \cdot TP}{2 \cdot TP + FP + FN} \qquad (3)$$
4.3 Implementation Details
In our experiments, we utilized a computer equipped with an Intel Core i5-7500 CPU running at 3.4 GHz, 32 GB of RAM, a GeForce GTX 1080 Ti GPU, and a 1 TB SSD. The models were developed using the PyTorch Lightning framework. The teacher model was first trained using labeled images for 200 epochs, followed by primary semi-supervised training with both labeled and unlabeled images for 200 epochs. The batch size was set to 16, and all inputs were uniformly resized to 352 × 352 during both supervised and semi-supervised training. The Adam optimizer was employed to optimize the model parameters, with a learning rate of 0.001 for training the segmentation networks. We evaluated the performance of our method using mDice and mIoU, as defined in Sect. 4.2.
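As a concrete reading of Eqs. 2 and 3, the following sketch computes both scores for a pair of small binary masks. It is illustrative only: the function names are hypothetical, the masks are plain nested lists, the convention of returning 1.0 when both masks are empty is an assumption, and averaging the per-image scores in `mean_score` is one plausible interpretation of "mean" that the text does not spell out.

```python
def confusion_counts(pred, target):
    """Count true positive, false positive, and false negative pixels."""
    tp = fp = fn = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p == 1 and t == 1:
                tp += 1
            elif p == 1 and t == 0:
                fp += 1
            elif p == 0 and t == 1:
                fn += 1
    return tp, fp, fn


def iou_score(pred, target):
    tp, fp, fn = confusion_counts(pred, target)
    denom = tp + fp + fn
    return 1.0 if denom == 0 else tp / denom  # assumption: empty vs. empty scores 1.0


def dice_score(pred, target):
    tp, fp, fn = confusion_counts(pred, target)
    denom = 2 * tp + fp + fn
    return 1.0 if denom == 0 else 2 * tp / denom  # same empty-mask assumption


def mean_score(pairs, score_fn):
    """Average a per-image score over (prediction, ground truth) pairs."""
    return sum(score_fn(p, t) for p, t in pairs) / len(pairs)


pred = [[1, 1, 0], [0, 1, 0]]
target = [[1, 0, 0], [0, 1, 1]]
# Here TP = 2, FP = 1, FN = 1, so IoU = 2/4 and Dice = 4/6.
print(iou_score(pred, target))
print(dice_score(pred, target))
print(mean_score([(pred, target), (target, target)], dice_score))
```

For a single image the two scores are linked by Dice = 2·IoU / (1 + IoU), which is why Dice is never lower than IoU in the result tables.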
5 Result of Experiments
5.1 Ablation Studies
We use two different values of k in the data separation step: k = 3 and k = 5. For each separation, we test our ensemble step with two methods: hard and soft voting. We compare the ensemble results with the baselines of supervised training and semi-supervised training.
Comparison with Supervised Baseline with Same Model Architecture. Table 1 shows the details of our experiments with different values of k. With k = 3 and soft voting, we achieve the best result, an average Dice score of 0.8557. Meanwhile, the supervised baseline trained with the full labeled dataset only achieves an average Dice score of 0.8264. With this configuration, the ensemble of semi-supervised models outperforms the supervised model on all datasets with the same model architecture. In particular, our method generalizes robustly to out-of-domain data such as ETIS, CVC-300, and CVC-ColonDB. The supervised baseline (last row of Table 1) has lower performance on these datasets, most clearly on ETIS and CVC-ColonDB.
Effectiveness of Different k Value. In our experiments, we use two values of k: k = 3 and k = 5. As shown in Table 1, k = 3 gives better overall accuracy, with an average Dice score of 0.8557. This result is reasonable: for k = 3, more labeled data is used to train each independent semi-supervised model (30% of labeled data in each fold). With k = 5, the model has slightly lower overall accuracy, with an average Dice score of 0.8519, but delivers better accuracy on some datasets such as CVC-300 and ETIS.
Effectiveness of Voting Method. The results in Table 1 show that soft voting is the more effective option in our ensemble strategy. In both experiments, soft voting gives a higher average Dice score than hard voting: 0.856 versus 0.839 for k = 3, and 0.852 versus 0.844 for k = 5.
Table 1. A Comparison of our Method with Different K and Supervised Baseline. The Bold and Red Color is the Highest Result.
SSL | K | Soft voting | Kvasir mDice | Kvasir mIoU | CVC-ClinicDB mDice | CVC-ClinicDB mIoU | CVC-ColonDB mDice | CVC-ColonDB mIoU | CVC-300 mDice | CVC-300 mIoU | ETIS mDice | ETIS mIoU | Average mDice | Average mIoU
Yes | 3 | Yes | 0.921 | 0.869 | 0.905 | 0.858 | 0.772 | 0.697 | 0.905 | 0.842 | 0.775 | 0.701 | 0.856 | 0.793
Yes | 3 | No | 0.916 | 0.861 | 0.888 | 0.835 | 0.757 | 0.689 | 0.891 | 0.825 | 0.741 | 0.662 | 0.839 | 0.773
Yes | 5 | Yes | 0.918 | 0.865 | 0.893 | 0.840 | 0.763 | 0.689 | 0.909 | 0.846 | 0.776 | 0.705 | 0.852 | 0.789
Yes | 5 | No | 0.920 | 0.866 | 0.895 | 0.847 | 0.765 | 0.690 | 0.882 | 0.817 | 0.759 | 0.685 | 0.844 | 0.789
- | - | - | 0.899 | 0.847 | 0.902 | 0.851 | 0.741 | 0.663 | 0.904 | 0.839 | 0.686 | 0.621 | 0.826 | 0.764
5.2 Comparison with State-of-the-Art Methods
In order to demonstrate the effectiveness of our proposed method, we conducted a comparative analysis with several state-of-the-art models, including UNet [22], UNet++ [20], SFA [21], PraNet [23], MSNet [24] and Shallow Attention [25]. The evaluation was carried out using the mIoU and mDice metrics, and the results are presented in Table 2. With K = 3 and soft voting, our proposed model outperforms all of the benchmark models on four of the five datasets (Kvasir, CVC-ColonDB, CVC-300, and ETIS). On CVC-ClinicDB, MSNet and Shallow Attention obtain higher scores, while our method still outperforms UNet, UNet++, SFA, and PraNet.
Table 2. A Comparison of our Method with State-of-the-Art Supervised Models.
SSL | Methods | Kvasir mDice | Kvasir mIoU | CVC-ClinicDB mDice | CVC-ClinicDB mIoU | CVC-ColonDB mDice | CVC-ColonDB mIoU | CVC-300 mDice | CVC-300 mIoU | ETIS mDice | ETIS mIoU
- | UNet [22] | 0.818 | 0.746 | 0.823 | 0.755 | 0.512 | 0.444 | 0.710 | 0.627 | 0.398 | 0.335
- | UNet++ [20] | 0.821 | 0.743 | 0.794 | 0.729 | 0.483 | 0.410 | 0.707 | 0.624 | 0.401 | 0.344
- | SFA [21] | 0.723 | 0.611 | 0.700 | 0.607 | 0.469 | 0.347 | 0.467 | 0.329 | 0.297 | 0.217
- | PraNet [23] | 0.898 | 0.840 | 0.899 | 0.849 | 0.709 | 0.640 | 0.871 | 0.797 | 0.628 | 0.567
- | MSNet [24] | 0.907 | 0.862 | 0.921 | 0.879 | 0.755 | 0.678 | 0.869 | 0.807 | 0.719 | 0.664
- | Shallow Attention [25] | 0.904 | 0.847 | 0.916 | 0.859 | 0.753 | 0.670 | 0.888 | 0.815 | 0.750 | 0.654
Yes | Ours (K = 3, soft voting) | 0.921 | 0.869 | 0.905 | 0.858 | 0.772 | 0.697 | 0.905 | 0.842 | 0.775 | 0.701
Yes | Ours (K = 5, soft voting) | 0.918 | 0.865 | 0.893 | 0.840 | 0.763 | 0.689 | 0.909 | 0.846 | 0.776 | 0.705
Yes | Ours (K = 3, hard voting) | 0.916 | 0.861 | 0.888 | 0.835 | 0.757 | 0.689 | 0.891 | 0.825 | 0.741 | 0.662
Yes | Ours (K = 5, hard voting) | 0.920 | 0.866 | 0.895 | 0.847 | 0.765 | 0.690 | 0.882 | 0.817 | 0.759 | 0.685
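Table 2 has no Average column, so a quick way to compare methods across datasets is to average their five mDice scores. The sketch below does this for the strongest baselines, using only numbers copied from Table 2. It assumes that the paper's "average" is the unweighted mean over the five datasets, which is consistent with the Average column of Table 1 but is not stated explicitly; the dictionary name `table2_mdice` is hypothetical.

```python
# mDice per dataset, in the order Kvasir, CVC-ClinicDB, CVC-ColonDB, CVC-300, ETIS
# (values copied from Table 2).
table2_mdice = {
    "PraNet": [0.898, 0.899, 0.709, 0.871, 0.628],
    "MSNet": [0.907, 0.921, 0.755, 0.869, 0.719],
    "Shallow Attention": [0.904, 0.916, 0.753, 0.888, 0.750],
    "Ours (K = 3, soft voting)": [0.921, 0.905, 0.772, 0.905, 0.775],
}

# Assumption: unweighted mean over datasets, ignoring their different sizes.
averages = {name: sum(scores) / len(scores) for name, scores in table2_mdice.items()}

for name, avg in sorted(averages.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {avg:.4f}")
```

Under this assumption the K = 3 soft-voting ensemble has the highest average mDice of the four. Its value computed from the rounded table entries differs slightly in the fourth decimal place from the 0.8557 quoted in the text, which is expected when averaging three-decimal figures.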
Figure 6 presents visual comparisons of our proposed method with different fully supervised competitors on typical challenging cases in the Kvasir and CVC-ClinicDB datasets. Our method shows comparable performance to state-of-the-art supervised models on the in-domain datasets. In addition, our method exhibits superior generalization capabilities on out-of-domain datasets such as ETIS, CVC-300, and CVC-ColonDB, as illustrated in Fig. 7. These qualitative results demonstrate the robustness and generalizability of our proposed method.
Fig. 6. Comparisons with Different State-of-the-Art Methods on the Kvasir-SEG and CVC-Clinic DB. (a) Input Image. (b) Ground Truth. (c) Ours. (d) Unet++. (e) Unet. (f) SFA. (g) PraNet. Our Method has Comparable Results with SOTA Supervised in Predicting Samples from the In-Domain Dataset.
Fig. 7. Comparisons with Different State-of-the-Art Methods on the Out-of-Domain Dataset (ETIS, CVC-300, CVC-ColonDB). (a) Input Image. (b) Ground Truth. (c) Ours. (d) Unet++. (e) Unet. (f) SFA. (g) PraNet. Our Method Outperformed SOTA Supervised in Predicting Samples from the Out-of-Domain Dataset.
6 Conclusion and Future Works
This study proposes a novel framework of ensemble semi-supervised learning methods for polyp segmentation. The original training set is divided into k folds, and a semi-supervised model is trained on each fold. Subsequently, the k independent semi-supervised models are ensembled to produce the final prediction. The proposed pipeline achieves an average Dice score of 0.8557, surpassing the supervised baseline with the same model architecture. Our method exhibits superior performance over several state-of-the-art supervised learning methods on most datasets, particularly for out-of-domain data. Future research will focus on exploring additional ensemble and semi-supervised learning strategies to better leverage unlabeled data.
Acknowledgment. The project described in this paper was funded by Vingroup Innovation Foundation (VINIF) under project code VINIF.2020.DA17. Additionally, this work received partial support from Sun-Asterisk Inc. The authors express their gratitude to their colleagues at Sun-Asterisk Inc for their valuable advice and expertise, which proved to be instrumental in the successful completion of this experiment.
References
1. Long, J., Shelhamer, E., Darrell, T.: Fully convolutional networks for semantic segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3431–3440 (2015)
2. Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., Fei-Fei, L.: Imagenet: a large-scale hierarchical image database. In: 2009 IEEE Conference on Computer Vision and Pattern Recognition, pp. 248–255. IEEE (2009)
3. Chaitanya, K., Karani, N., Baumgartner, C.F., Erdil, E., Becker, A., Donati, O., Konukoglu, E.: Semi-supervised task-driven data augmentation for medical image segmentation. Med. Image Anal. 68, 101934 (2021)
4. Van Engelen, J.E., Hoos, H.H.: A survey on semi-supervised learning. Mach. Learn. 109(2), 373–440 (2020)
5. Sohn, K., et al.: Fixmatch: simplifying semi-supervised learning with consistency and confidence. In: Advances in Neural Information Processing Systems, vol. 33, pp. 596–608 (2020)
6. Zhang, B., et al.: Flexmatch: boosting semi-supervised learning with curriculum pseudo labeling. Adv. Neural. Inf. Process. Syst. 34, 18408–18419 (2021)
7. Krogh, A., Vedelsby, J.: Neural network ensembles, cross validation, and active learning. In: Advances in Neural Information Processing Systems, vol. 7 (1994)
8. Quinlan, J.R.: Bagging, boosting, and C4.5. In: AAAI/IAAI, vol. 1, pp. 725–730 (1996)
9. Freund, Y., Schapire, R.E.: Experiments with a new boosting algorithm. In: ICML, vol. 96 (1996)
10. Huang, C.-H., Wu, H.-Y., Lin, Y.-L.: Hardnet-MSEG: a simple encoder-decoder polyp segmentation neural network that achieves over 0.9 mean dice and 86 fps. arXiv preprint arXiv:2101.07172 (2021)
11. Zhang, Y., Liu, H., Hu, Q.: TransFuse: fusing transformers and CNNs for medical image segmentation. In: de Bruijne, M., et al. (eds.) MICCAI 2021. LNCS, vol. 12901, pp. 14–24. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-87193-2_2
12. Duc, N.T., Oanh, N.T., Thuy, N.T., Triet, T.M., Dinh, V.S.: Colonformer: an efficient transformer based method for colon polyp segmentation. IEEE Access 10, 80575–80586 (2022)
13. Zhang, Y., Gong, Z., Zheng, X., Zhao, X., Yao, W.: Semi-supervision semantic segmentation with uncertainty-guided self cross supervision. arXiv preprint arXiv:2203.05118 (2022)
14. Li, Y., Data, G.W.P., Fu, Y., Hu, Y., Prisacariu, V.A.: Few-shot Semantic Segmentation with Self-supervision from Pseudo-classes. arXiv preprint arXiv:2110.11742 (2021)
15. Wang, Y., et al.: Semi-Supervised semantic segmentation using unreliable pseudo-labels. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 4248–4257 (2022)
16. Van, T.P., et al.: Online pseudo labeling for polyp segmentation with momentum networks. arXiv preprint arXiv:2209.14599 (2022)
17. Selvaraju, R.R., Cogswell, M., Das, A., Vedantam, R., Parikh, D., Batra, D.: Grad-CAM: visual explanations from deep networks via gradient-based localization. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 618–626 (2017)
18. Rokach, L.: Ensemble-based classifiers. Artif. Intell. Rev. 33(1), 1–39 (2010)
19. Iandola, F., Moskewicz, M., Karayev, S., Girshick, R., Darrell, T., Keutzer, K.: Densenet: implementing efficient convnet descriptor pyramids. arXiv preprint arXiv:1404.1869 (2014)
20. Zhou, Z., Siddiquee, M.M.R., Tajbakhsh, N., Liang, J.: Unet++: redesigning skip connections to exploit multiscale features in image segmentation. IEEE Trans. Med. Imaging 39(6), 1856–1867 (2019)
21. Fang, Y., Chen, C., Yuan, Y., Tong, K.: Selective feature aggregation network with area-boundary constraints for polyp segmentation. In: Shen, D., et al. (eds.) MICCAI 2019. LNCS, vol. 11764, pp. 302–310. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-32239-7_34
22. Ronneberger, O., Fischer, P., Brox, T.: U-Net: convolutional networks for biomedical image segmentation. In: Navab, N., Hornegger, J., Wells, W.M., Frangi, A.F. (eds.) MICCAI 2015. LNCS, vol. 9351, pp. 234–241. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-24574-4_28
23. Fan, D.-P., et al.: PraNet: parallel reverse attention network for polyp segmentation. In: Martel, A.L., et al. (eds.) MICCAI 2020. LNCS, vol. 12266, pp. 263–273. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-59725-2_26
24. Zhao, X., Zhang, L., Lu, H.: Automatic polyp segmentation via multi-scale subtraction network. In: de Bruijne, M., Cattin, P.C., Cotin, S., Padoy, N., Speidel, S., Zheng, Y., Essert, C. (eds.) MICCAI 2021. LNCS, vol. 12901, pp. 120–130. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-87193-2_12
25. Wei, J., Hu, Y., Zhang, R., Li, Z., Zhou, S.K., Cui, S.: Shallow attention network for polyp segmentation. In: de Bruijne, M., et al. (eds.) MICCAI 2021. LNCS, vol. 12901, pp. 699–708. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-87193-2_66
26. Jha, D., et al.: Kvasir-SEG: a segmented polyp dataset. In: Ro, Y.M., et al. (eds.) MMM 2020. LNCS, vol. 11962, pp. 451–462. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-37734-2_37
27. Bernal, J.F., et al.: WM-DOVA maps for accurate polyp highlighting in colonoscopy: Validation vs. saliency maps from physicians. Comput. Med. Imaging Graph. 43, 99–111 (2015)
28. Tajbakhsh, N., Gurudu, S.R., Liang, J.: Automated polyp detection in colonoscopy videos using shape and context information. IEEE Trans. Med. Imaging 35(2), 630–644 (2015)
29.
V´ azquez, D., et al.: A benchmark for endoluminal scene segmentation of colonoscopy images. J. Healthcare Eng. 2017 (2017) 30. Silva, J., Histace, A., Romain, O., Dray, X., Granado, B.: Toward embedded detection of polyps in WCE images for early diagnosis of colorectal cancer. Int. J. Comput. Assist. Radiol. Surg. 9(2), 283–293 (2014)
Data–Driven Design of an Active Wake Steering Control for a Wind Farm Benchmark
Silvio Simani1(B), Saverio Farsoni1, and Paolo Castaldi2
1 Department of Engineering, University of Ferrara, Ferrara, Italy
[email protected]
2 Department of Electrical, Electronic, and Information Engineering, University of Bologna, Bologna, Italy
http://www.silviosimani.it
Abstract. Wake steering yaws upstream wind turbines to deflect their wakes away from downstream turbines, thus increasing the generated power. However, most wake steering methods rely on lookup tables obtained offline, which map a set of conditions, such as wind speed and direction, to yaw angles for each turbine in a farm. These tables assume that all turbines are operational and can be significantly non–optimal when one or more turbines do not provide the rated power because of low wind speed, faults, routine maintenance, or emergency maintenance. This work presents an intelligent wake steering method that adapts to the actual working conditions of the turbines when determining yaw angles. Using a hybrid model– and learning–based method, i.e. an active control, a neural network is trained online to determine yaw angles from operating conditions, including turbine status. Unlike purely model–based approaches, which use lookup tables provided by the wind turbine manufacturer or generated offline, the proposed control solution does not need to solve, e.g., an optimisation problem for each combination of non–optimal turbine working conditions in a farm; integrating a learning strategy into the control design yields an active control scheme.
Keywords: Fault Diagnosis · Neural Network · Data–Driven Approach · Model–Based Scheme · Wind Farm Simulator
1 Introduction
Wake steering can increase the net power produced by a wind farm by yawing upstream turbines to redirect their wakes away from downstream turbines, as shown e.g. in [7] for turbines in commercial wind farms. These works proposed the use of a lookup table that, for each Wind Speed (WS), Wind Direction (WD), and Turbulence Intensity (TI) sequence, provides the yaw angles that maximise net power. This table is generated offline by solving an optimisation problem for each WS, WD, and TI sequence by exploiting a wake model. For their wake models, e.g. the works [7] suggested in particular the employment of the lifting line [11], Gauss Curl–Hybrid [8], and Gaussian [3] models. One common key aspect not considered by these works is the faulty condition (and the failure, i.e. the shutdown) of one or more turbines in a farm, which often occurs due to low wind speed (below the cut–in speed), anomalous working conditions, routine maintenance, or emergency maintenance. Therefore, simply applying a lookup table that does not consider the health status of the turbines can lead to non–optimal control, such as yawing a turbine to direct its wake away from a downstream turbine that is not working properly, thus decreasing the power of the upstream turbine without gaining any power increase from the downstream turbine. To account for turbine conditions, the lookup table approach would need to optimise over turbine efficiency and availability; this could be prohibitively expensive for a large wind farm due to the large number of possible combinations of turbine availability. One possible way to take into account turbine efficiency reduction, or even shutdown due to faults or failures, while avoiding optimisation over all combinations of turbine availability, is learning from real–time data. Learning–based control, sometimes in connection with Reinforcement Learning (RL) methods, has recently demonstrated interesting properties, and it may be considered for the problem of wake steering when all wind turbines are active. For example, the work [4] exploited RL for maximising wind farm power generation. On the other hand, [2] applied RL to the related problem of power tracking, thus matching the output power of the wind farm to the requirements of the electricity grid.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023. K. Arai (Ed.): SAI 2023, LNNS 739, pp. 53–61, 2023. https://doi.org/10.1007/978-3-031-37963-5_5
This paper proposes a Hybrid Model– and Learning–Based (HMLB) approach to wake steering for the first time, with the aim of managing wind turbine working conditions that can vary in availability after faults and shutdowns. Note that the HMLB scheme was developed earlier for dynamic control, whilst this work develops a similar approach for steady–state conditions. The method can also be seen as model–based or white box RL that is able to take into account possible faults and failures affecting the wind farm. The manuscript is organised as follows. Section 2 presents the HMLB solution for the steady–state setting considered in this paper, including how it applies to real–time (online) control and offline training. The training phase of the ALBC approach requires the generation of data through a model; therefore, a learning–enabled version of the wake model is presented. Section 3 recalls the wind farm simulator. The proposed solution is exploited for the active wake steering of a wind farm in Sect. 4. A more realistic validation of the achieved results is addressed in Sect. 4.1 using the Hardware–In–the–Loop (HIL) tool. Finally, Sect. 5 summarises the achievements of the paper and highlights possible directions for future work.
2 Hybrid Model– and Learning–Based Control Method
The active control method exploited here [5] is an ALBC method, originally developed for dynamic control. The same approach can be applied in steady–state conditions, specifically for active wake steering. This approach can also be regarded as model–based or white box RL. In particular, this work explains how the approach can be used for real–time (online) active control of a particular wind farm, whilst an offline training mechanism is exploited. In real–time conditions, the active control strategy passes the current state and the exogenous inputs to a policy, $P_{\theta^*}$, with optimised parameters $\theta^*$, which generates the control action. In this steady–state case, the considered policy takes in only the exogenous inputs: the Wind Speed (WS), $v_w$, the Wind Direction (WD), $\phi$, and the turbine status $S$. The policy is trained for a particular wind farm, where the number of turbines and their locations are fixed. The vector $S$ indicates whether each turbine in the farm is active, inactive, or faulty. The policy generates a vector, $\gamma$, with a yaw angle, $\gamma_i$, for each $i$–th turbine in the farm. Thus, the ALBC requires solving the problem defined by the relation of Eq. (1):

$$\gamma = P_{\theta^*}(v_w, \phi, S) \qquad (1)$$
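As a hedged illustration of the inputs of Eq. (1), the following Python sketch assembles the exogenous inputs into a single vector that a policy could consume. The `Status` enumeration, its numeric codes, and the helper `policy_input` are hypothetical and chosen only for illustration; the text does not specify how the status vector $S$ is encoded.

```python
from enum import IntEnum

import numpy as np


class Status(IntEnum):
    """Hypothetical numeric codes for the three turbine conditions named in the text."""
    INACTIVE = 0
    FAULTY = 1
    ACTIVE = 2


def policy_input(wind_speed, wind_dir_deg, status):
    """Concatenate v_w, phi and the per-turbine status vector S into one input vector."""
    s = np.asarray([int(x) for x in status], dtype=float)
    return np.concatenate(([float(wind_speed), float(wind_dir_deg)], s))


# Example: a 9-turbine farm in which the last turbine is faulty (assumed ordering).
x = policy_input(8.0, 45.0, [Status.ACTIVE] * 8 + [Status.FAULTY])
```

A trained policy would map such a vector to the yaw vector $\gamma$ with one entry per turbine.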
The policy $P_{\theta^*}$ is obtained by using a Neural Network (NN), as recalled in this section. The optimised parameters, $\theta^*$, represent the weights and biases of the NN obtained during training. This study proposes a different data–driven approach, based on NNs, which is exploited to implement the fault diagnosis block. This section briefly recalls their general structure and properties, which are used to implement the policy generator for the signals $\gamma(t)$. A NN is therefore realised in order to reproduce the behaviour of the policy relation of Eq. (1) using a proper set of input and output measurements. The NN structure consists of different layers of neurons, each modelled as a static function $f$. This function is described by an activation function with multiple inputs, properly weighted by unknown parameters that determine the learning capabilities of the whole network. These learning structures can be categorised according to the way in which their neurons are connected to each other. This work proposes to use a feed–forward network, also called a multi-layer perceptron, where the neurons are grouped into unidirectional layers. Moreover, this multi-layer perceptron is provided with a tapped delay line, i.e. it is a feed–forward network whose inputs come from a delay line. This solution, defined as a 'quasi–static' NN, is adopted because it represents a suitable tool to predict dynamic relationships between the input–output measurements and the considered policy function $P_{\theta^*}$. In this way, a NARX (Nonlinear AutoRegressive with eXogenous input) description is obtained, since the nonlinear (static) network is fed by the delayed samples of the system inputs and outputs selected by the analysis tool described in Sect. 3. Indeed, if properly trained, the NARX network can estimate the current (and the next) control vector $\gamma(t)$ on the basis of the selected past measurements of the system inputs and outputs $u_j(t)$ and $y_l(t)$, which represent the wind turbine status $S$. The other inputs of the NN are the WS and WD, i.e. $v_w$ and $\phi$.
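The tapped delay line described above can be sketched as follows. This is a minimal illustration, assuming that the regressor stacks the most recent past samples of every selected signal and that the static network has a single hidden layer with a tanh activation; the function names, the activation, and the layer sizes are assumptions, not the authors' implementation.

```python
import numpy as np


def tapped_delay_regressor(signals, t, n_delays):
    """Stack the past samples (t-1, ..., t-n_delays) of each signal into one vector."""
    if t < n_delays:
        raise ValueError("not enough past samples for the requested delays")
    parts = [np.asarray(s, dtype=float)[t - n_delays:t][::-1] for s in signals]
    return np.concatenate(parts)


def init_mlp(n_in, n_hidden, n_out, seed=0):
    """Small random weights and zero biases for a one-hidden-layer network."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(scale=0.1, size=(n_hidden, n_in))
    W2 = rng.normal(scale=0.1, size=(n_out, n_hidden))
    return W1, np.zeros(n_hidden), W2, np.zeros(n_out)


def mlp_forward(x, W1, b1, W2, b2):
    """Static map F: tanh hidden layer followed by a linear output (one yaw per turbine)."""
    return W2 @ np.tanh(W1 @ x + b1) + b2


# Illustrative signals: one measured input u, one measured output y.
u = np.arange(10.0)
y = 10.0 * np.arange(10.0)
x = tapped_delay_regressor([u, y], t=5, n_delays=2)
gamma = mlp_forward(x, *init_mlp(x.size, n_hidden=8, n_out=9))
```

Because the network itself is static, all the dynamics enter through the delayed samples in the regressor, which is what makes the overall scheme 'quasi–static'.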
Therefore, with reference to the policy generator, which is used to design the yaw vector $\gamma$, this NARX network is described by the relation of Eq. (2):

$$\gamma(t) = F\big(u_j(\cdot),\, y_l(\cdot),\, v_w(\cdot),\, \phi(\cdot)\big) \qquad (2)$$
where $u_j(\cdot)$ and $y_l(\cdot)$ are the generic $j$–th and $l$–th components of the measured inputs and outputs $u$ and $y$, respectively, which are selected via the sensitivity analysis table recalled in Sect. 3. They represent the status $S$ of the turbines in the wind farm, thus including possible fault (efficiency reduction) and failure (shutdown) conditions. $F(\cdot)$ is the function realised by the static NN, which depends on the layer architecture, the number of neurons, their weights, and their activation functions. The NARX network is thus used as the active policy estimator $P_{\theta^*}$. $t$ is the time step, whilst the signals $u_j(\cdot)$, $y_l(\cdot)$, $v_w(\cdot)$, and $\phi(\cdot)$ have a number of delays $n_d$ that has to be properly selected.
During training, the ALBC scheme samples the exogenous inputs. For each batch of samples, the ALBC scheme runs a forward pass of the policy on those inputs to generate a control action, calculates the loss from the inputs and the control action by using the model, runs a backward pass of the policy to get the gradients of the loss with respect to the policy parameters, and takes a gradient step to update the policy parameters in the direction that decreases the loss. The ALBC strategy has been implemented in the learning framework [6]. The loss function $V$ in Eq. (3) is defined as the negative of the average power produced by the turbines:

$$V = -\frac{1}{N_t} \sum_{i=1}^{N_t} W_i(v_w, \phi, S) \qquad (3)$$

where $W_i(v_w, \phi, S)$ represents the power generated by the $i$–th turbine given the WS $v_w$, the WD $\phi$, and the yaw $\gamma$ of all turbines in the farm; $N_t$ indicates the number of wind turbines that are able to contribute to the power generation. This power is computed using the Gaussian wake model and the wind farm simulator addressed in Sect. 3. A key aspect of the proposed ALBC, and where it differs from model–free RL, is that during the backward pass of the training phase, the ALBC generates the predicted output signals through the model to get the exact gradients.
This is in contrast to model–free RL, which does not exploit the model and thus needs to estimate the gradient from samples. To use the ALBC for wake steering, a version of the wake model able to predict future measurements with a learning–enabled strategy is required.
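The training loop described above (sample the exogenous inputs, forward pass, loss through the model, backward pass, gradient step) can be sketched on a deliberately small example. Everything in this sketch is an assumption made for illustration: the two–turbine farm, the cosine–cubed power loss of the yawed upstream turbine, the Gaussian–shaped wake deficit with parameters `A` and `SIGMA`, and the linear policy used in place of the NN. It is not the Gaussian wake model [3], FLORIS, or the authors' implementation. The point it illustrates is that the gradient of the loss is obtained exactly by differentiating through the model, rather than estimated from samples.

```python
import numpy as np

A, SIGMA = 0.5, 0.3  # hypothetical wake deficit and wake width (rad)


def loss_and_grad(yaw, s2):
    """Toy model: loss V = -(mean power of active turbines) and exact dV/dyaw.

    yaw: yaw angle of the upstream turbine (rad); s2: 1 if the downstream
    turbine is active, 0 if it is shut down. The upstream turbine is always active.
    """
    n_active = 1.0 + s2
    p_up = np.cos(yaw) ** 3                      # yawing reduces upstream power
    wake = A * np.exp(-(yaw / SIGMA) ** 2)       # deficit seen downstream
    p_down = s2 * (1.0 - wake)
    loss = -(p_up + p_down) / n_active
    dp_up = -3.0 * np.cos(yaw) ** 2 * np.sin(yaw)
    dp_down = s2 * wake * 2.0 * yaw / SIGMA ** 2
    return loss, -(dp_up + dp_down) / n_active


def policy(theta, s2):
    """Linear stand-in for the NN policy: yaw as a function of turbine status."""
    return theta[0] + theta[1] * s2


def train(steps=2000, batch=64, lr=0.1, seed=0):
    rng = np.random.default_rng(seed)
    theta = np.array([0.1, 0.1])
    for _ in range(steps):
        s2 = rng.integers(0, 2, size=batch).astype(float)  # sample exogenous inputs
        yaw = policy(theta, s2)                             # forward pass
        _, dv_dyaw = loss_and_grad(yaw, s2)                 # gradient through the model
        grad = np.array([dv_dyaw.mean(), (dv_dyaw * s2).mean()])  # chain rule to theta
        theta = theta - lr * grad                           # gradient step
    return theta


theta = train()
```

In this toy setting the learned yaw goes to zero when the downstream turbine is shut down and settles at a non–zero angle when it is active, which mirrors the status–dependent behaviour that the lookup–table approach cannot provide without re–optimisation.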
3 Wind Farm Simulator
This benchmark model implements a simple wind farm with 9 wind turbines arranged in a square grid layout [10]. The distance between the wind turbines in both directions is 7 times the rotor diameter, $L$. Two measuring masts are located in front of the wind turbines, one in each of the wind directions $\phi$ considered in this benchmark model, e.g. 0° and 45°. The wind speed is measured by these masts, which are located at a distance of 10 times $L$ in front of the wind farm. The wind turbines of the farm are identified by their row and column indices in the coordinate system illustrated in Fig. 1, which sketches the layout of the wind farm with the 9 turbines of the square grid and the masts along the wind directions. It is worth noting that the original simulator described in [10] has been modified by the authors in order to include the wake model considered in this work.
Fig. 1. Layout of the Wind Farm with 9 Wind Turbines.
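The grid layout of Fig. 1 can be reproduced with a short sketch. The rotor diameter value, the turbine numbering, and the convention that a wind direction of 0° blows along the positive x axis are assumptions made here for illustration; only the 3 × 3 grid and the spacing of 7 rotor diameters come from the text.

```python
import numpy as np


def grid_layout(rotor_diameter, n_rows=3, n_cols=3, spacing=7.0):
    """Return an (n_rows * n_cols, 2) array of turbine (x, y) positions in metres."""
    rows, cols = np.meshgrid(np.arange(n_rows), np.arange(n_cols), indexing="ij")
    xy = np.stack([cols.ravel(), rows.ravel()], axis=1).astype(float)
    return xy * spacing * rotor_diameter


def upstream_order(xy, wind_dir_deg):
    """Sort turbine indices from most upstream to most downstream (assumed convention)."""
    phi = np.deg2rad(wind_dir_deg)
    along_wind = xy @ np.array([np.cos(phi), np.sin(phi)])
    return np.argsort(along_wind, kind="stable")


xy = grid_layout(rotor_diameter=100.0)  # hypothetical rotor diameter of 100 m
order_0 = upstream_order(xy, 0.0)
order_45 = upstream_order(xy, 45.0)
```

Such an ordering indicates which turbines can shed wakes onto others for a given wind direction, and therefore which yaw angles are worth adjusting.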
The farm uses generic 4.8 MW wind turbines, which were described in [10]. Each turbine is a three–bladed, horizontal–axis, pitch–controlled, variable–speed wind turbine. Each wind turbine is described by a simplified model including control logic, variable parameters, and 3 states. The $i$–th wind turbine model generates the electrical power, $P_{i,g}(t)$, the collective pitch angle, $\beta_i(t)$, and the generator speed, $\omega_{i,g}(t)$. Note that the wind farm simulator has been modified by the authors in order to include the control of the wind turbine yaw angles $\gamma_i(t)$. The second control input is thus represented by the pitch angle $\beta_i(t)$ modified by the baseline wind farm controller [10]. Two scenarios with different wind directions, both driven by the same wind speed sequence $v_w(t)$ (possibly with a time shift), are considered. The wind sequence contains a wind speed $v_w$ increasing from 5 m/s to 15 m/s, with a peak value of about 23 m/s. In this benchmark model a very simple wind farm controller is used, which provides the wind turbine controllers with a power reference $P_{i,\mathrm{ref}}(t)$. More details on the wind farm model considered in this paper can be found in [10]. It is worth noting that the wind farm considered here could be seen as a simplistic model. However, the work [10] describes how the simulator can fit realistic wind farms.
With these assumptions, the complete continuous–time description of the wind farm under diagnosis has the following form:

$$\begin{cases} \dot{x}_c(t) = f_c\big(x_c(t), u(t)\big) \\ y(t) = x_c(t) \end{cases} \qquad (4)$$

where $u(t) = [v_w(t), \beta_i(t)]^T$ and $y(t) = x_c(t) = [\omega_{i,g}(t), P_{i,g}(t)]^T$ are the input and the monitored output measurements, respectively. The subscript $i$ indicates the measurement from the $i$–th wind turbine of the wind farm ($i = 1, \ldots, 9$). $f_c(\cdot)$ represents the continuous–time nonlinear function describing the model of the plant under investigation. In this benchmark, three faults are considered that influence the measured variables from the wind turbine, i.e. $\beta_i(t)$, $\omega_{i,g}(t)$, and $P_{i,g}(t)$. It is also assumed that the considered faults can be detected at the wind farm level by comparing the performance with that of the other wind turbines in the wind farm, but that they are difficult to detect at the wind turbine level.
4 Simulation Results
The wind farm system described in Sect. 3 is considered, consisting of 9 National Renewable Energy Laboratory (NREL) reference turbines [9]. The atmospheric conditions are constant, with a wind shear of 0.12, a wind veer of 0.0, and a turbulence intensity of 0.06. The incoming hub–height wind direction can take the two values recalled in Sect. 3, whilst the incoming hub–height wind speed ranges from 3 m/s to 24 m/s. The turbine status $S$ describes the working conditions of the turbines, which can be active, inactive (shutdown), or faulty. These conditions are provided e.g. by a Fault Detection and Isolation (FDI) scheme already proposed by the authors e.g. in [12]. Different working conditions of the wind turbines of the farm are considered by means of a set of turbine statuses with nine, eight, seven, and six fault–free turbines, respectively. The turbines are identified by their row and column indices in the layout matrix. The ALBC policy has been implemented by means of a properly trained NN with one hidden layer of 30 neurons and an input layer of 15 neurons. A number of 4 delays has been exploited. An adaptive learning rate was used in the learning algorithm. Table 1 shows the loss function of Eq. (3) for each turbine status vector $S$ in the test set, where the faulty turbines are highlighted. Four turbine statuses are considered in the test set. The loss in the test set is computed according to Eq. (3). Table 1 shows that, for all turbine status cases in the test set, the averaged loss approximates the reference optimal loss quite accurately. The reference loss is the loss for the yaw angles found by optimisation using FLORIS for just the fault–free turbines, as described in [1]. This is the (approximately) optimal loss value, which is correct up to the accuracy of the FLORIS optimisation method. The trained ALBC policy proposed in this work is applied to the test set.
Table 1. Values of the Loss Function of Eq. (3) Labelled by the Number of Fault–Free Turbines $N_t$.

Reference Loss Function   Average Loss Function   $N_t$
-751                      -755                    9
-735                      -731                    8
-726                      -728                    7
-714                      -717                    6

Table 2 shows the net wind farm power for each turbine status in the test set, averaged over the wind speeds in the test set, both for the ALBC and for the approximate optimal power from FLORIS for the fault–free turbines. The power obtained using the baseline controller, which sets all the turbines to zero yaw (relative to the incoming wind direction), is also reported. The power achieved via the developed ALBC closely matches the optimal power for each of the turbine statuses, indicating that this ALBC method is able to adapt effectively to turbine status. The power obtained by the ALBC and the optimal power are both higher than the power under baseline control, indicating the importance of wake steering for this test case.

Table 2. Wind Farm Power (kW) for each Wind Turbine Status S in Simulation.

Fault–Free Turbines $N_t$   ALBC Method   Optimal Power   Baseline Controller
6                           4321          4334            3981
7                           4987          5001            4498
8                           5521          5567            4987
9                           6123          6145            5426
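The comparison in Table 2 can be quantified with a few lines of arithmetic. The sketch below only reuses the values reported in Table 2; the variable and function names are illustrative.

```python
# Net wind farm power (kW) from Table 2: N_t -> (ALBC, optimal, baseline)
TABLE_2 = {
    6: (4321, 4334, 3981),
    7: (4987, 5001, 4498),
    8: (5521, 5567, 4987),
    9: (6123, 6145, 5426),
}


def relative_figures(albc, optimal, baseline):
    """Return (gain of ALBC over baseline, gap of ALBC to optimal), both in percent."""
    gain = 100.0 * (albc - baseline) / baseline
    gap = 100.0 * (optimal - albc) / optimal
    return gain, gap


for n_t, row in sorted(TABLE_2.items()):
    gain, gap = relative_figures(*row)
    print(f"N_t={n_t}: gain over baseline {gain:.1f}%, gap to optimal {gap:.2f}%")
```

With the values of Table 2, the gain over the baseline controller lies between roughly 8% and 13%, whilst the gap to the optimal power stays below 1% for every turbine status.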
Table 3 summarises the net wind farm power for each wind speed in the test set, averaged over the turbine statuses in the test set, for the ALBC, the optimal power, and the baseline control. The power obtained by the ALBC closely matches the optimal power and is higher than the baseline power for each of the wind speeds, indicating that the ALBC identifies the correct relationship between wind speed and yaw angles.

4.1 Hardware–in–the–Loop Validation
The HIL test–rig has been implemented in order to verify and validate the proposed solutions in more realistic real–time working conditions. These experimental tests aim at validating the results obtained in simulation by considering the almost real conditions that the wind turbine systems under analysis may face during operation. Table 4 summarises the results obtained using this real–time HIL set–up.
Table 3. Wind Farm Power (kW) for each Wind Speed $v_w$ in the Test Set.

Wind Speed $v_w$ (m/s)   ALBC Scheme   Optimal Power   Baseline Controller
4                        987           996             961
5                        1998          2001            1876
6                        3112          3124            3001
7                        5882          5898            5679
8                        8299          8321            8012
Table 4. Wind Farm Power (kW) for each Wind Turbine Status S with the HIL Test–Bed.

Fault–Free Turbines $N_t$   ALBC Method   Optimal Power   Baseline Controller
6                           4327          4338            3987
7                           4992          5008            4502
8                           5528          5572            4991
9                           6129          6153            5433
It is worth observing the consistency of the almost real–time tests of Table 4 with the simulation results reported in Sect. 4. Although the average performances obtained in simulation seem to be better than those obtained using the HIL platform, some issues have to be taken into account. Indeed, the numerical accuracy of the on–board electronics, which involves floating–point calculations, is more restrictive than that of the CPU of the simulator. Moreover, the A/D and D/A conversions can also account for possible deviations. Note that real situations do not require transferring data from a computer to the on–board electronics, so that this error is not actually introduced. However, the obtained deviations are not critical, and the developed control strategies can also be considered for real applications.
5 Conclusion
This work presented an intelligent wake steering method that adapts to the actual working conditions of the turbines when determining yaw angles. In particular, it exploited a hybrid model– and learning–based method, i.e. an active control, where a neural network was trained online to determine yaw angles from operating conditions, including wind turbine status. In fact, wake steering is usually employed to yaw upstream wind turbines in order to deflect their wakes away from downstream turbines, thus increasing the generated power. However, most wake steering methods rely on lookup tables obtained offline, which map a set of conditions, such as wind speed and direction, to yaw angles for each turbine in a farm. These tables assume that all turbines are operational and can be significantly non–optimal when one or more turbines do not provide the rated power because of low wind speed, faults, routine maintenance, or emergency maintenance. Unlike purely model–based approaches, which use lookup tables usually provided by the wind turbine manufacturer or generated offline, the proposed control solution did not rely on solving optimisation problems, e.g. for each combination of turbine working conditions in the wind farm.
References
1. Andersson, L.E., Anaya-Lara, O., Tande, J.O., Merz, K.O., Imsland, L.: Wind farm control - Part I: a review on control system concepts and structures. IET Renew. Power Gener. 15(10), 2085–2108 (2021). https://doi.org/10.1049/rpg2.12160
2. Arroyo, J., Manna, C., Spiessens, F., Helsen, L.: Reinforced model predictive control (RL-MPC) for building energy management. Appl. Energy 309(1), 1–16 (2022). https://doi.org/10.1016/j.apenergy.2021.118346
3. Bastankhah, M., Porte-Agel, F.: Experimental and theoretical study of wind turbine wakes in yawed conditions. J. Fluid Mech. 806(1), 506–541 (2016). https://doi.org/10.1017/jfm.2016.595
4. Dong, H., Xie, J., Zhao, X.: Wind farm control technologies: from classical control to reinforcement learning. Progress Energy 4(3), 1–19 (2022). https://doi.org/10.1088/2516-1083/ac6cc1
5. Drgona, J., Kis, K., Tuor, A., Vrabie, D., Klauco, M.: Differentiable predictive control: deep learning alternative to explicit model predictive control for unknown nonlinear systems. J. Process Control 116(1), 80–92 (2022). https://doi.org/10.1016/j.jprocont.2022.06.001
6. Dueben, P.D., Schultz, M.G., Chantry, M., Gagne II, D.J., Hall, D.M., McGovern, A.: Challenges and benchmark datasets for machine learning in the atmospheric sciences: definition, status, and outlook. Artif. Intell. Earth Syst. 1(3), 1–11 (2022)
7. Howland, M.F., et al.: Collective wind farm operation based on a predictive model increases utility-scale energy production. Nat. Energy 7(1), 818–827 (2022). https://doi.org/10.1038/s41560-022-01085-8
8. King, J., et al.: Control-oriented model for secondary effects of wake steering. Wind Energy Sci. 6(3), 701–714 (2021). https://doi.org/10.5194/wes-6-701-2021
9. Odgaard, P.F., Stoustrup, J., Kinnaert, M.: Fault-tolerant control of wind turbines: a benchmark model. IEEE Trans. Control Syst. Technol. 21(4), 1168–1182 (2013). https://doi.org/10.1109/TCST.2013.2259235. ISSN 1063-6536
10. Odgaard, P.F., Stoustrup, J.: Fault tolerant wind farm control – a benchmark model. In: Proceedings of the IEEE Multiconference on Systems and Control – MSC 2013, Hyderabad, India, pp. 1–6 (2013)
11. Shapiro, C., Gayme, D.F., Meneveau, C.: Modelling yawed wind turbine wakes: a lifting line approach. J. Fluid Mech. 841(1), 1–12 (2018). https://doi.org/10.1017/jfm.2018.75
12. Simani, S., Farsoni, S., Castaldi, P.: Residual generator fuzzy identification for wind farm fault diagnosis. In: Proceedings of the 19th World Congress of the International Federation of Automatic Control – IFAC 2014, Cape Town, South Africa, 24–29 August 2014, vol. 19, pp. 4310–4315. IFAC & South Africa Council for Automation and Control, IFAC. Invited paper for the special session “FDI and FTC of Wind Turbines in Wind Farms” organised by P. F. Odgaard and S. Simani. https://doi.org/10.3182/20140824-6-ZA-1003.00052
Design Principles for Interactive and Reflective Journaling with AI
Max Angenius1(B) and Maliheh Ghajargar1,2
1 School of Arts and Communication, Malmö University, Malmo, Sweden
[email protected], [email protected]
2 Internet of Things and People Research Center, Malmö University, Malmo, Sweden
Abstract. Designing for reflection and journaling have been prominent research areas in HCI and Interaction Design. However, designing for the experience of journaling that is supported by conversations with AI, i.e. a Conversational Agent (CA), to foster reflection seems to be a relatively unexplored area. Furthermore, while there is an abundance of general guidelines and design principles for designing human-AI interactions, a set of guidelines for designing an interactive and reflective journaling experience with AI is lacking. This paper is a first attempt to address that need. We present the results of a qualitative user study on interactive and reflective journaling. We were interested in attending to our participants’ experiences and finding out their needs regarding the interactive journaling experience with a CA. The user needs were then translated into design requirements and thereafter into themes or design principles. Some of our findings suggest that one of the important factors in journaling is the personal aesthetics of writing, achieved by using carefully selected personal tools, specific materiality, and interactions. Further, the flow of writing is considered sacred; hence it is almost like an untouchable, reflective, ritualistic flow. Reflecting on the findings, we believe the outcome of this study can create opportunities for designing human-AI interactions that are generative and reflective for activities that require such qualities, such as journaling or creativity.
Keywords: Journaling · Reflection · Interaction Design · Human-AI Interaction · Conversational Design
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023. K. Arai (Ed.): SAI 2023, LNNS 739, pp. 62–81, 2023. https://doi.org/10.1007/978-3-031-37963-5_6

1 Introduction
The daily practice of journaling and the narration of personal experiences, thoughts, and emotions is a human activity that has been around for a long while [1]. Benefits of journaling as a reflective practice include the discovery of meaning, gaining perspective about others, and developing critical thinking and affective skills, among others [2]. Research about journaling includes areas such as nursing education [2]; learning and education [3–5]; and behavioral medicine [6], to mention a few. Reflective journaling is a written activity [2] and provides a vehicle for inner dialogue that connects feelings, thoughts, and actions [4]. In Interaction Design, journaling has been studied in different contexts. For instance, in the context of education, computer-mediated journaling (infusing reflective journaling with technology) was studied by Stamper [7] and Ong and Pemberton [8]. Some of the results of previous studies confirm that reflective journaling using blogging is enjoyable and effective, and that feedback is a motivational factor in continuing the learning efforts [8]. Food journaling is another area of interest for Interaction Design research, where journaling as a tool for self-reflection helps support healthy eating behaviors [9]. Internet memes have been researched as an alternative way of journaling that has shown positive effects on helping participants process emotional experiences [1]. Computer-supported reflective learning in a workplace context is another research area within HCI, where journaling is aimed at using reflection as a tool for learning [10]. Kocielnik [11] researched multimodal interactive journaling combining written and voice modalities in a Conversational Agent (CA). However, a particular technique (video, voice, text, memes, etc.) does not guarantee success in every context, and each modality needs to be chosen according to the specific needs and requirements of the situation [7]. CAs use language models; hence conversation is the primary mode of interaction and engagement with users generally [12] and for collaboration in specific contexts [13]. CAs can also augment human abilities such as reflection and supporting memory [10]. The CA’s capabilities to interact and engage in a conversation with users, employing Natural Language Processing (NLP) models and personalization based on the user’s data, show that this technology is an exciting design research area to explore, specifically in the context of journaling.
However, like any other technology, CAs also pose challenges, such as trust and transparency [12] and their tendency to become highly personified and hence misleading [14], which need to be carefully considered when designing interactions with this technology. Over the years, HCI and interaction design communities have developed design principles to guide a better and more transparent human-AI interaction design. In this paper, we take stock of the community effort and intend to contribute knowledge by offering a set of design principles specifically for designing interactive and reflective journaling experiences with CA. Hence, this paper explores the following research question: RQ: How can we design human-Conversational Agents interactions for an interactive and reflective journaling experience? We were interested in attending to the experiences of journalers and finding out about their needs regarding the interactive journaling experience with CA. This paper seeks to contribute knowledge by offering a set of design principles for human-AI interactions that are generative and reflective for journaling, but perhaps also for other activities that require such qualities. In this paper we present an overview of the literature on journaling as a reflective activity and conversational agents as a technology. Then we present several existing sets of design principles for human-AI interaction. Further, we report on our qualitative user study, which includes a series of semi-structured interviews. We conclude by reporting the results of the study, which led to defining a set of design principles for interactive journaling with AI. Lastly, we offer some discussion points and directions for future work.
M. Angenius and M. Ghajargar
2 Background 2.1 Journaling and Interaction Design Journaling is a method of documenting personal experiences by writing down everyday thoughts and emotions as well as connecting various bits of information [1, 2, 15] to gain insight, reflect and track personal growth [5]. Journaling can support learning, personal growth, and professional development [4]. Writing itself as an activity is one of the most effective methods to trigger reflection [5] and to enable organization of thoughts, clarifications, and connections, by processing both old and new knowledge [2, 5]. Writing helps people to develop a reflective practice and to be actively engaged in their process of learning [2, 8], creating meaning and context from experiences [8]. Some of the benefits of journaling are, for instance, shaping knowledge [5]; developing affective skills for yourself and others [2]; and challenging established patterns of thinking and unquestioned beliefs [4]. Ullrich and Lutgendorf’s [6] study on journaling about stressful and traumatic events suggests that growth needs a more holistic approach, which requires engaging both cognitive and emotive efforts. Further, negative emotions are not uncommon during reflection, and it is essential to consider the contextual aspect of journaling–i.e., when to journal and how to support action on the reflection [16]. There are several ways of journaling, for instance the Double Entry Journal [5], the Dialogue Journal, and the Personal Journal [4]. The Double Entry Journal is used for individual dialogue on experiences to construct personal meaning, followed by a discussion with a group of peers to construct knowledge in a social setting [5]. The Dialogue Journal is a private dialogue between a student and instructor–where the instructor comments on the student’s journal entries, and the student comments on the instructor’s comments.
This interactive dialogue between the student and their instructor enables critical thinking in an open manner. The Personal Journal is a solitary style of journaling where a person writes about and reflects upon their own inner thoughts and experiences as a dialogue with self [4]. Journaling as an individual practice is a valuable tool for reflection but might also have negative aspects, such as endlessly looping concepts without having personal beliefs challenged. Using journaling as a collaborative and interactive activity increases the potential for personal growth because it enables feedback from others, which allows for critical self-assessment [4]. Engaging in reflection through conversation allows the individual to construct knowledge and to enhance the sense making of the content [5]. Hughes et al. further argue that learning occurs as knowledge moves from an initial investigative state to a socially constructed form [5]. Hubbs and Brand bring up three necessary conditions and ethical considerations when reflective journaling becomes a dialogue between multiple people: (1) perceived trustworthiness of the journal reader, (2) clarity of expectations, and (3) quantity and quality of the feedback. The authors also highlight the activity of collaborative review, that is, when the student and instructor discuss the nature and quality of the reflection, which can only be constructive if the three previously mentioned conditions are met [4]. Different ways of journaling have previously been explored in Interaction Design and have been used for different areas of application. For instance, Bullet Journal
blends the craft-based physical nature of bullet journaling with the more digital approach typically used in personal informatics [17]. It is a social journaling practice for self-improvement, where writers frequently share their designs with others, as a way to freely express themselves in a personal way [17]. The open-ended use of various materials and personal aesthetics in bullet journaling allows the user to be engaged in a process of self-creation, reflecting on their life and appreciating themselves and the world around them even if it is imperfect [17]. Another example is MEMEory, which is a mobile meme journaling application using memes as a medium for reflecting on daily events [1]. Terzimehić et al. found that memes were considered an engaging, expressive, and memorable way of documenting experiences–more motivating and enjoyable than text. Eat4Thought is a food journaling application developed by Zhang and Parker [9] to identify eating behavior; they found that contextualizing people’s eating experiences and drawing attention to specific experiences supported reflection on eating behavior. Ong and Pemberton [8] used computer-mediated journaling (blogging) to enhance the learning experience in a classroom environment and found that using a digital tool was enjoyable and that receiving feedback on reflective journals was motivating. Finally, Robota, by Kocielnik et al. [11], is a CA using both written and voice modalities to support self-learning. Robota is especially interesting because the system is similar to what we set out to explore in this paper, but for a different context, namely the home environment. 2.2 Designing for Reflection Reflection is a meaning-making process to discover and understand relationships and connections between experiences and actions to develop new perspectives [10, 15, 18, 19].
Reflective thought, according to Dewey, is about moving an experience from a state of ambiguity to a state of clarity [cited in 25], and reflection is about understanding and reframing a situation [7, 20, 21]. Reflection is self-directed, purposeful, disciplined, and cyclical [7]; among other qualities, it fosters mindfulness, aligns decision-making with identity [10], supports the evaluation of actions [10, 16], and improves self-knowledge [10, 22]. Reflection is influenced both by internal, individual activity and by external, collective components such as relationships to artifacts, activities, places, and people, and it can also be a social activity [19]. Reflection can be supported by computing tools [19] or writing techniques, e.g., reflective questions, techniques for dialogue and discussion, non-verbal techniques, evaluative review of events, and encouragement [20]. Baumer et al. reviewed the use of the concept of reflection in Interaction Design and found a lack of explicit definition within the community, which they argue is a sign of a lack of deep consideration of or engagement with the phenomenon of reflection. The authors suggest that it is essential to have a more nuanced understanding of reflection, the means of assessing it, and strategies to support it [22]. Attempting to illustrate the process of reflection, Pirzadeh et al. [15] have suggested three key stages: (1) awareness of uncomfortable feelings and thoughts, (2) critical analysis, and (3) development of a new perspective. Additionally, Fleck and Fitzpatrick [20] suggest five levels of reflection: (1) description, (2) reflective description, (3) dialogic reflection, (4) transformative reflection, and (5) critical reflection–suggesting that reflection can achieve different levels of depth. Interactive and everyday use artifacts can mediate reflection. There are possibilities in moving from designing useful and aesthetic artifacts
to designing artifacts that embody those qualities but also are thought-provoking agents in our lives [21, 23]. Fleck and Fitzpatrick [20] suggest ways of supporting reflection with technology, for example by recording and providing information about experiences and events over time to support memory, providing prompts and questions for reflection, categorizing experiences with tags, and connecting people to gain different perspectives. Reflection has also been used as a design concept, in projects such as Reflection Companion [10] and WanderingMind [15]. The Reflection Companion concept, which is particularly relevant to the present paper, is a mobile conversational system that supports reflection on personal data related to physical activity, and which successfully supports reflection in the form of awareness, understanding, and new insights for the future [10]. 2.3 Conversational Agents Conversation as Design Material Hall mentions three conversational cultures that have evolved alongside humans over millennia: oral culture, literate culture, and secondary orality. The oral culture was dominant in the age before writing and drawing and is described as temporal and face-to-face [24]. Literate culture came after oral culture as humanity began to illustrate and write [24]. Finally, Hall describes Secondary Orality as the newest culture, with qualities of immediacy and social awareness, and as group-minded, conversational, collaborative, and intertextual. Conversation is the oldest interface [24] and is a cascade of behaviors and cues unfolding as two speakers respond to each other [12]. A conversation is more than just words [25] and more than just an exchange of phrases; instead, it is goal-oriented, relational, contextual, and emotional [24]. Conversation as an interface and as an interaction modality has specific characteristics such as turn-taking, overlap, interruption, cues, and repair [12].
Words in conversations are materials for design and an ingredient in interaction–they are a fundamental part of the experience and, therefore, of design [24]. Non-verbal communication is as important as words–and includes body language, facial expressions, vocalizations, etc. [12]. Conversational Interface Three paradigms of human-AI interaction are intermittent, continuous, and proactive, and they have different complexity and use cases [26]. Intermittent human-AI interaction is a turn-taking paradigm where the user initiates an action and the AI responds, forming a loop; continuous is a paradigm where the AI listens to an uninterrupted input stream and responds throughout the interaction; and proactive is a paradigm where the system actively initiates and completes tasks independently [26]. Conversation design is a form of user experience design focusing on language-centered interaction [12] and is not only about designing an interface for having a conversation but requires the application of principles of how humans speak with each other and express different ideas [12, 24]. Like any other technology, Conversational Interfaces come with their own challenges and benefits. For instance, some challenges are calibrating users’ trust in the technology, and more importantly the semantic complexity
of human language that is often not captured and understood by technology, and one of their benefits is their judgment-freeness and perceived patience by users [12]. Conversational Agent A Conversational Agent (CA) is a dialogue-based system [25], e.g., chatbot or virtual assistant. As CAs have been widely adopted, research needs to understand the user experience, including pragmatic and hedonic qualities [13]. Designing a positive experience with CAs requires the CA to recognize the user’s attention, actively fulfill this attention, collaborate with the user, and engage in a conversation [13]. The use of CAs is rarely the primary activity, according to Luger and Sellen [25]; instead, it is an AI technology to support and mediate other actions. Spoken dialogue artifacts such as CAs have the potential to make human-computer interaction collaborative through mixed-initiative interaction [27]. In addition to the challenges of conversational interfaces mentioned above, CAs also have limited capabilities in understanding complex sentences and the semantic meaning of words [13]. Additionally, because of their portrayed intelligence, users might have unrealistic expectations [14]. Studies show that even if people are aware that they are talking to a computer, they project a personality onto it, that is, the character traits and behaviors, word choice, voice, and appearance that create a human-like element in an artifact [12, 28, 29]. Personality is also connected to expectations, as the names of CAs–e.g., Siri, Alexa–allow a user to imagine a human-like character and behavior. Furthermore, the character is the core part of the user experience design [12], which unifies the experience in a synthesized whole, and brings the artifact to life [24]. 2.4 Design Guidelines and Principles Design principles, criteria and guidelines are a series of descriptive guides that aid the design process.
They are often based on empirical studies of specific kinds of design, categories of users or applications. We reviewed nine such guidelines that are increasingly used and cited in the HCI and design community in order to analyze their affinities to designing interactive reflective journaling with CA. Some examples of design principles that have been developed especially in HCI and human-AI interaction research areas are Amershi et al. [30]’s Guidelines for Human-AI Interaction, Kulesza et al. [31]’s Principles of Explanatory Debugging to Personalize Interactive Machine Learning, Hummels & van Dijk [32]’s Principles of Embodied Sensemaking, and Cronholm & Göbel [33]’s Design Principles for Human-Centred AI. Amershi et al. suggest 18 general guidelines for human-AI interaction that are intentionally designed to be easily used to guide designers interested in applying AI in design [30]. They recognize that each AI technology has its own specific features, so the general guidelines need to be put into context. Kulesza et al. propose two main principles with eight sub-principles for Explanatory Debugging, created to ground and support the replication of the method [31]. The main principles are (1) Explainability and (2) Correctability [31], and they focus on the user and AI understanding each other to improve over time. Hummels and van Dijk’s guideline, although not specifically made for AI, is still relevant in this context; it suggests seven design principles to support face-to-face embodied sensemaking technology–taking the body-in-action as a starting point. The
principles should be viewed as scaffolds to help designers shift perspectives to deal with complex challenges and technologies [32]. Cronholm and Göbel propose three design principles for designing decision-support systems that combine human and artificial intelligence. The design principles focus on decision-making and mutual learning [33]. Ahmad et al. propose six design principles for Personality-Adaptive Conversational Agents (PACA) specifically for the mental health domain. The design principles include meta-requirements connected to each design principle. Adaptivity is the keyword in this set of design principles [34]. Design principles are not only developed and used in academic research, but also in industry. Both Google and IBM have developed principles for designing for AI [35, 36]. Google has designed seven design principles–or objectives–for developing AI technology responsibly [36]. This set of principles touches on topics such as bias, accountability, and privacy, and can be used broadly in developing any category of AI-infused artifact, but it lacks support for the particularities of application and use. For instance, the principles are not specifically connected to the context of interactive, reflective journaling–which is fair and was not their intent. IBM has also created design principles focused on AI ethics [35] and a more comprehensive guide for designing for AI, which elaborates on aspects of AI such as foundations, characteristics, and factors–e.g., intent for the design, the relationship between humans and AI, and elements of AI [37].
3 Methodology The point of departure of this work was to understand how Conversational Agents, as an AI technology, can aid an interactive reflective journaling experience. We conducted a literature study [38] on journaling, reflection, CA, and principles of designing for human-AI interaction. Based on the literature study, we found that CA technologies can be beneficial for reflection and journaling–considering their specific characteristics, such as the ability to engage users in a conversation [13] and to help them recall memories and hence reflect upon them [21]. However, existing human-AI design principles felt general and broad and, in some respects, lacked specific design considerations that we learned to be necessary for designing a better interactive and reflective journaling experience with CA–e.g., the social dimension of journaling and the nature and quality of conversation in general [4, 5, 24]; the critical role of context and meaning for reflection [16] and context-related personal experiences [8]; conversation as an interaction modality in interactive journaling to enhance reflection [12, 24]; conversation as goal-oriented, relational and emotional [24, 25]; and lastly transparency and context-awareness specific to CA technologies [14]. Hence, we found that there is an opportunity to develop a set of guidelines specifically for designing interactive and reflective journaling with CA. To attain this goal, as well as to understand people’s habits and needs regarding reflective and interactive journaling, a series of qualitative semi-structured user interviews was conducted [12, 39] with six participants. We analyzed the data through coding and categorizing [40]. Each of the six recruited participants took part in a 30-min semi-structured interview [41, 42]. The interview questions were divided into current journaling experience and future interactive journaling. The people interviewed in this project were all
familiar and experienced with journaling, but their experience and ways of journaling were different. For instance, some interviewees journaled occasionally when they felt the need, while others journaled several times a week about different things, e.g., thoughts and emotions, planning daily activities, personal development, and dreams. The participants were from different countries worldwide, were adults between 24 and 35 years old, and were comfortable using digital technology. Five of the six identified themselves as women, and the sixth identified themself as a man. We understand that we had a relatively small group of participants for this study and that the results may not be representative of a larger society. However, we still believe that our small study and its insights can be a first step towards conducting larger empirical studies and developing a more comprehensive set of design principles for interactive journaling with AI.
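As a rough illustration of the coding-and-categorizing step [40], the sketch below groups coded interview excerpts into themes and lists which participants contributed to each theme. The excerpts, the code labels, and the code-to-theme mapping are hypothetical placeholders and not the study's data; only the participant labels (P1, P2, ...) follow the convention used in this paper.

```python
from collections import defaultdict

# Hypothetical coded excerpts as (participant, code) pairs. Not the study's data.
coded_excerpts = [
    ("P1", "looking back"),
    ("P2", "letting go"),
    ("P4", "looking back"),
    ("P5", "uninterrupted flow"),
    ("P2", "uninterrupted flow"),
]

# Hypothetical mapping from low-level codes to broader themes.
code_to_theme = {
    "looking back": "remember and forget",
    "letting go": "remember and forget",
    "uninterrupted flow": "act of writing",
}


def categorize(excerpts, mapping):
    """Group participants by theme; codes without a mapping go to 'uncategorized'."""
    grouped = defaultdict(set)
    for participant, code in excerpts:
        grouped[mapping.get(code, "uncategorized")].add(participant)
    # Sort participant labels so the result is stable and easy to read.
    return {theme: sorted(people) for theme, people in grouped.items()}


themes = categorize(coded_excerpts, code_to_theme)
for theme, people in themes.items():
    print(f"{theme}: {len(people)} participant(s) -> {', '.join(people)}")
```

Counting distinct participants per theme, rather than raw excerpts, mirrors the way the results below are phrased ("two participants", "almost all the participants").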
4 Results The data collected from interviews were transcribed, mapped, coded and categorized [40]. The analysis of the interviews resulted in seven themes with associated sub-themes, which helped to define the set of design principles for interactive and reflective journaling. 4.1 The Whys of Journaling All six participants mentioned a variety of reasons why they journal, e.g., to keep track of what is happening in the present, to reflect on previous experiences, to set goals, to process emotional or challenging events, etc. Generally, the purpose of journaling was described as an activity of self-reflection about life to gain insight by capturing, documenting, and analyzing experiences and events–including but not limited to appreciation, decision-making, learning languages, and planning of the day. One participant talked about their purpose while journaling: “So, it was more like writing down the things that, you know, came up. And then I really like also going back and looking into what I dealt with, for example a year ago. Um, and so something like a calendar, diary, journaling works for me. And the way I also do it, it’s not only like keeping track of events that occur every day, but also a little bit more on what’s bothering me.” (P1) Some of the participants used more than one journal simultaneously for different purposes, e.g., calendar, tracking daily activities, reflecting on experiences, planning for the future, or language learning: “I journal for different reasons. I have one, which is my let’s say logbook or entry book […] for journaling and to jot down notes, thoughts, or, my insight, my anxiety or good things or bad things just without having a lot of structure. […]. Then I have a kind of, more structured […] and in that one, I try to have also a sort of a log, it’s a sort of a log for the future.
[…] and the third one, just for language.” (P4) 4.2 The Aesthetics of Journaling It is essential to create something personal during journaling. Aesthetics are essential to how a person expresses themselves and their personal beliefs, activities, and thoughts.
Participants wanted to have freedom to choose and control their input modality, whether it is writing, sketching, sculpting or photography. Two participants found it relaxing to use only black pencils, while some other participants used varying colors: “I have a set of pens, at least I write in black its weird but I’m really OCD with colors. And it relaxes me to write in black.” (P4) “I actually invested in lots of pens because they have these like pretty colors and it’s really nice to write with them. So, it’s specifically, for my writings.” (P5) The tactile feeling of tangible elements of writing was also fundamental to the aesthetics of journaling. Most of the participants used pen and paper or paper journals–one used a notes application. Pen and paper allow for different ways of expression, such as visualizing, making connections, mind maps, and using multiple colors. Pens were considered to increase the feeling of being present in the here and now while journaling: “I’d also like to draw, like to visualize. So that’s why I also think papers, you know, are nicer. Kind of also making connections between like maybe doing a mind map and see how it’s all related […] like different maybe also colors I like to use.” (P1) Four participants reported journaling either in the morning, in the evening, or sometimes both. Some people expressed the need to be able to do it wherever and whenever, while others reported the need for a specific space to do it. Journaling was done in bed or at a desk at home. The space needs to feel safe and quiet, and journaling preferably happens when the person is alone and has their own space. Some people add personal rituals such as drinking hot beverages, technology-free moments, candles, and ambient music.
One participant explained that journaling in the morning had different goals compared to journaling in the evening–the morning was for planning for the future, while the evening was to declutter and debrief: “I usually like at the end of the day, like I’m sitting down in my bed, like I light some scented candles or something, just something to like, relax me, I guess that’s it. And I usually write with like ambient music playing in the background, like things that I just normally do to calm myself down and to get into a good headspace.” (P3) Most participants describe journaling as relaxing in one form or another, calling it therapeutic, improving mental peace and health, and helping them stay grounded and relaxed. Diving deep into a reflection about why something is the way it is brings clarity and, at the same time, creating something with your hands, e.g., writing, is relaxing: “It’s relaxing for me to do it because it usually there is something that bothers me and after, I don’t know, it doesn’t bother me anymore.” (P2) 4.3 The Act of Writing The Flow of Writing Is Sacred A few participants mentioned that the moment of writing down thoughts is almost holy, hence they do not want to be disturbed during journaling–e.g., by pens that stop working, incoming calls or messages, and other distractions: “I think so, because if I have a bad pen, for example, I’m trying to write something and, you know, for example, it stops in the middle. That’s annoying. Cause as I mentioned […] when I’m in the bed and writing, I’m just really into the flow and I want to get it out. I don’t want to be interrupted by anything, but I just want to, to keep focusing on, on writing.” (P5) Furthermore, adding
any interactive artifact should not disrupt the writing flow. Instead, where artifacts can provide value is during the processing time, which happens after writing. Writing Is a Skill Two participants mentioned that being good at writing is a relevant factor in journaling, that is, how one can express authentic thoughts and experiences through good writing. One of them acknowledged that the practice of journaling was a way of improving their writing skills: “Because sometimes I feel that I have so many thoughts, but I feel lazy sometimes. And I don’t want to write down all the stuff because it takes time.” (P4) 4.4 Supporting Reflective Practice Journaling is an activity with different goals, each of which requires a different level of reflection. Deep reflection is not always the goal; sometimes the person just wants to review and report what happened. Furthermore, the levels of reflection mentioned by Fleck and Fitzpatrick [20] could be seen as a process that you go through and stop depending on the level of depth you seek. The deeper the reflection, the more time is needed. Finally, it was mentioned that it could be a challenge to reflect alone because no one is challenging the person–this again depends on the level of reflection: “The third one [level of reflection] is a more in-depth analysis. And I do that when I really have time […] so, this happens not as often as the others do.” (P4) One of the participants mentioned that sometimes they do not feel comfortable confiding personal experiences and vulnerabilities to another human, so a machine could be an alternative, or a mirror, an extension of themselves that responds as a partner in a dialogue: “I don’t feel like I feel comfortable talking to somebody else, but […] I would be open to talk to a machine.” (P2) Other participants described technology in the context of journaling as a supporting player that could prompt deeper reflection.
It was described as a turn-taking conversation that dissects or investigates topics, which is reminiscent of the qualities of intermittent human-AI interaction [26], the characteristics of conversational interfaces [12], and the nature of use of Double Entry Journals [5] and Dialogue Journals [4]. In addition, one of the participants mentioned that learning more about reflection was another way technology could help the user: “So, it could be informative on, on what stages [of reflection] are you achieving. Maybe not in the process, because I feel like where, when do you start you have to do it and maybe nothing will disturb you at the time, but maybe in the part of reading it again and trying to understand what happened, then it might be handy to have an informative approach on what was this [depth of reflection].” (P2) 4.5 Journaling to Remember and to Forget Almost all the participants mentioned that journaling was a way to store information with the possibility of looking back later. Furthermore, the participants mentioned that technology can help with capturing everything in one place and then organizing, categorizing, and analyzing it. Additionally, it would help them to go back, revisit, and search for entries. This was considered time-consuming given the analog nature of traditional journaling with physical notebooks: “Then I feel like it is like documented in a way, because I mean,
if it’s just a conversation with a friend, I would, you know, not know where to capture it, but then like with the journaling, I would have it somewhere and so, I have a place where to save it.” (P1) A few participants wrote in order to let something go–even to burn it up!–and to clear their head. Having a feature that allows the person to destroy a journal entry was also mentioned as important in an interactive artifact: “Sometimes I throw them away. I literally tear them apart and put them in the garbage bin.” (P2) 4.6 Journaling and Conversations Diverse Points of View All participants mentioned talking to other people about personal issues, although these were not always the same issues they would journal about. Having a friend or family member to talk to was considered valuable to get others’ perspectives, contribute to other issues, or have someone else help you dig deeper and reflect. Another effect of talking to someone was that the process of verbalizing thoughts and feelings already helped the participant to process them. That dialogue, and the kind of personal relationship, further determines what type of thoughts or feelings the participant would share: “You’re also being able to dig a bit deeper, [with questions like] why this issue is bothering you, for example. But I think it’s also just nice to get someone else’s opinion on things […] someone else’s perspective can put things into a different light that can help you understand things better.” (P5) More-than-Human Participation Journaling technologies and tools were described as unfiltered, judgment-free, and a silent listener with infinite patience. This was contrasted with talking with a human being, since people have neither infinite patience nor infinite time. One participant described writing to their dog and viewed it as writing to someone who was not judging them. “I feel like when it comes to journaling, I can be very unfiltered.
It’s not that I need someone to give me a solution, but it’s sometimes something that you can just vent, which doesn’t happen when I talk to people because almost most of the time people’s reaction is like to have an opinion about it or to kind of advice in some ways.” (P3) AI as a Journaling Companion When it comes to external support for journaling, an interactive journaling artifact could support as a companion for self-development personalized to specific goals, supporting behavior change, and giving positive feedback and affirmations and blocking negative thoughts. In addition to this, it could be a platform for sharing journal entries with selected people: “…the interaction should be very personalized to your own. Uh, traits and, um, and self. So, for example, if I want to watch myself not talking to myself really badly, then the interactive journal would provide some sort of help or, um, say like, okay, you shouldn’t call yourself stupid […] basically. Sort of like a personal guide that you were on a journey of self-discovery or self-improvement.” (P5)
4.7 Ubiquitous, Interactive Journaling
A Ubiquitous and Reflective Space Two participants mentioned the idea of ubiquity in the context of journaling. During the interview, they pondered what interactive journaling would look like, and how it would work, without a user interface or screens. “I’ve been writing down for so long. That was my… kind of old fashion. So, I’m thinking maybe it could be something that you don’t have to have a physical book or pen. Not even like a specified device to focus on. So, you could just like sit on your table and start talking.” (P4)
Different Ways of Input Participants had different views on input for interactive journaling. One participant described writing as a good way because some topics might be too emotional to verbalize. Another participant said outright that they would not like to use vocal input and that they like the act of writing. Bodily movement as an input was hypothetically considered fascinating, novel, and attractive to at least try. Another mentioned that they did not favor using a computer keyboard and preferred writing with pen and paper: “I tried many applications to have a digital logbook. I start, but then I always end up with a normal physical analog. I feel it’s less distracting because like logging in a computer with many other applications and also the screen. No!” (P4)
5 Human-AI Interaction Design Principles for Interactive Journaling
The themes identified from the user interviews and insights from the literature review were used to create 11 design principles (Table 1). These design principles represent guidelines for designing human-AI interactions for interactive journaling. The list is not exhaustive, but it covers characteristics and functionalities that contribute to a successful design. The principles can also be used as an analytical tool to evaluate an existing interactive journaling experience [39]. Examples of design principles developed in HCI and human-AI interaction research include Amershi et al.’s [30] Guidelines for Human-AI Interaction, Kulesza et al.’s [31] Principles of Explanatory Debugging to Personalize Interactive Machine Learning, Hummels & van Dijk’s [32] Principles of Embodied Sensemaking, Cronholm & Göbel’s [33] Design Principles for Human-Centred AI, and Wärnestål’s [29] Design Guidelines for Design of AI-Driven Services. The set of design principles presented in this paper builds upon these existing principles and guidelines and repeats some of them, such as the need for transparency and explainability, but it also includes elements and needs that are peculiar to journaling as a practice and to CAs as a technology. Below, we unpack those peculiarities in relation to our design principles as concluding discussion points.
Table 1. Design Principles for Interactive and Reflective Journaling with Conversational Agent

| Design Principles | User Needs | Artifact Requirements |
|---|---|---|
| 01. The system acts as a confidant | The journaler needs to have the opportunity to build synergy with the system | The artifact’s features and interactions need to be intentionally designed to be judgment-free and perceived as a patient listener |
| 02. The system expresses a personality | The journaler attributes a personality to the artifacts and builds a close relationship with them. The relationship sets expectations and impacts synergy | The personality of an interactive artifact needs to be carefully crafted to support this relationship, without anthropomorphizing or creating a deceitful character |
| 03. The system supports personal expressions and aesthetics | The journaler needs to be able to express their personal feelings and thoughts and to have the freedom to use various materials and tools | The user should be allowed to express themselves personally, e.g., through writing and speaking to the CA. In addition to this, the system features need to support users in different journaling goals, depths of reflection, and modes of interaction |
| 04. The system prompts deeper reflection | The journaler needs encouragement and guidance for deeper reflection, e.g., through prompts or questions | The system needs to support reflection by providing, e.g., prompts, questions, or a reflective dialogue, as part of the paradigm of intermittent human-AI interaction [26] |
| 05. The system augments human memory | The journaler needs to remember the previous events, thoughts and feelings written about in their journal. This supports the user in recognizing patterns, reflections, and personal growth | The system needs to collect data to support memory augmentation, pattern recognition, and continual adaptive use over time, using ethical and explicit data collection: e.g., recognizing written and voice input from the user, categorization of journal entries, applying the paradigm of intermittent human-AI interaction [26] |
| 06. The system updates and adapts | The journaler’s needs, goals and expectations change over time and in different contexts. The journaler needs a flexible journaling experience and practice | The system needs to be able to adapt to the user’s needs and context. The system needs to be designed for longitudinal context and be able to collect data from multiple sessions over an extended period of time |
| 07. The system encourages social interactions for reflection | The journaler needs social interaction to externalize thoughts and to seek second opinions and perspectives. The user needs to share their experiences and knowledge with others to get alternative perspectives | The system needs to provide a platform for co-writing and sharing journal entries. It needs to provide features such as question and answer, space for comments or drawings on individual journal entries |
| 08. The system participates just enough | Journalers consider the moment of writing as sacred and as a meditative state that should not be disturbed | The system can act proactively and collect data, but explicit interactions with the user need to be done intermittently, waiting for its turn (when the user has finished journaling or needs the system to interact) |
| 09. The system explains and is transparent | The journaler needs to know about the system’s functionalities, data collection methods, and activities in the background | The system needs to explain what it can and cannot do. This is especially important in forming a better relationship with the user |
| 10. The system onboards slowly | The user needs to get to know the system slowly and the first impression of a system is essential in making a more sustainable relationship | The system needs to slowly onboard the user using progressive disclosure. During the onboarding process, the previously mentioned principles are relevant to bring up as information |
| 11. The system lets the user manage and control the data and use | The journaler needs to be able to leave the system anytime. They need to have control over data usage and be able to manage it | The system must provide the user with opportunities to stop data collection at any moment and to be able to delete journal entries as they require |
6 Concluding Discussions
6.1 Supporting Collaborations and Reflections
Reflection has different key stages [15] and levels of depth [20]. While the deeper levels of reflection do not seem to be essential in journaling, it is still important to be aware of an experience and to describe it in depth; it all depends on the individual’s goal for reflection and journaling. We reflected this need in Design Principle 04 (DP04). Designing artifacts that evoke reflection through exploration and collaboration with the user is a suitable way of allowing the user to craft personal information and aesthetics [18] (DP03, DP04). CA technology can support that kind of interaction and experience by making the interaction between the user and the artifact conversational and collaborative [27]. One benefit of making journaling more collaborative is the potential for feedback, allowing the person to self-assess critically [4] (DP04, DP07). This type of dialogic reflection allows the person to construct knowledge and enhance the meaning of the content [5]. Furthermore, interactive artifacts can support reflection by recording and categorizing information and by providing information and prompts for reflection [20], both of which are relevant in journaling (DP04, DP05).
6.2 Encouraging Personal Expressions and Aesthetics
Another aspect of journaling is personal expression and aesthetics (DP03). Although the results of this project suggest that it is not as essential as it is in the style of bullet journaling [17], the user still needs to express themselves in a personalized manner and be able to craft their own aesthetics of the journaling experience. That can be supported through multimodal interaction, e.g., written and verbal. When designing for multimodal interaction, it is crucial to understand how different modalities function together [11, 12]. Different modalities suit human capabilities differently, e.g., humans can speak faster than they write but read faster than they hear [12]. 
These needs and requirements are presented in Design Principle 03 (DP03).
6.3 Acting as a Patient Confidant
When it comes to having someone else read an individual’s journal, the person may experience fear or insecurity about being judged [4]. But technology is often perceived as judgment-free [12], and the results of this paper further confirm this, as presented in Principles 01 and 02. An agent’s non-human qualities in the context of journaling were appreciated by the participants when sharing personal thoughts and emotions (DP01). While some thoughts or feelings may be too personal to share with another human, results from the project suggest that this is not the case for sharing with a CA. Instead, it is seen as a judgment-free companion with infinite patience that allows for unfiltered streams of thoughts (DP01). Typically, interaction with agents is not the primary goal but is used to support or mediate other actions [25], which was the concept’s goal in this project (DP08). For example, having the agent as support to suggest actions and ask for help was a positive addition to the journaling experience (DP04, DP07, DP08) that
requires the agent to have more agency in the interaction in the context of journaling (DP04, DP07, DP08). In comparison, the expectation was that the agent should mainly behave according to the intermittent paradigm of human-AI interaction, which is turn-based [26]. The results suggested that the CA can be allowed to have more agency through continuous and proactive paradigms of human-AI interaction (DP06, DP07).
6.4 Conversational Cultures and Journaling
Additionally, participants’ desire to discuss with the agent to dive deeper into their reflection (DP01, DP02) suggests an opportunity for mixed-initiative interaction between the human and the agent in the context of reflective journaling (DP07). Another exciting aspect of applying different interaction modalities is how they shape the conversation in the context of journaling in relation to conversational cultures. The interaction moved between the three conversational cultures mentioned by Hall [24]: oral culture, literate culture, and secondary orality. Keeping the journal as a physical object allows for individual consumption and a sense of private ownership where users can express themselves personally [24] (DP01, DP02). Meanwhile, the oral qualities of not being judged on the correctness of word choice (DP01) and being present in the moment [24] (DP08) are both inherent in journaling, regardless of written or verbal input. Adding a CA to the context of journaling brings in elements of secondary orality, such as immediate intertextual collaborative conversation [24] (DP04, DP07). Using different input modalities and conversing with an agent allows the user to adopt different qualities of conversational cultures depending on their choices (DP03, DP06). This allows the individual to take advantage of the human ability to interpret and appropriate technology to their liking, which is an integral part of aesthetic interaction [43] (DP04, DP06). Furthermore, according to Petersen et al. 
[44], important aspects of aesthetic interaction include (1) inviting the user to create a sense of meaning, (2) provoking thoughts, and (3) encouraging the individual to think differently about using the artifact in different ways (DP06).
6.5 Ethical Aspects of Journaling with AI
In designing and developing AI artifacts, such as CAs, aspects such as transparency, explainability, and expectations have been the subject of numerous studies [29, 30, 45] (DP09). The artifact needs to help the user to understand how it works and what to expect from the interaction (DP09, DP10). This is a challenge for CAs because the functionality is somewhat invisible [12]. Explaining what the artifact can do is one strategy to decrease ambiguity, but that comes with a trade-off in how valuable the explanation is to the user experience [46]. One approach uses progressive disclosure to give the user just the information they need [24] (DP10). Privacy is another essential factor in journaling [4] with AI [29, 47]. Hubbs and Brand [4] mentioned that when journaling becomes a dialogue, the perceived trustworthiness of the dialogue partners and clarity of expectations become essential (DP02, DP09). The person using the artifact must feel safe when journaling about private thoughts. The data collected by the artifact needs to be protected from being misused, and the artifact
can clarify this by explaining what data is collected, how it is used, and by whom (DP09). Additionally, the user needs to be able to turn off the artifact and its data collection at any time [29] (DP11). Another concern is technological failure of the artifact, which could disrupt the meditative nature of journaling (DP08) or result in previous journal entries being erased if data is lost (DP05). Additional ethical concerns and challenges with adding voice interaction and AI to journaling include language, accent, and gender biases. The artifact must be able to adapt to the individual user’s context, culture, and abilities (DP06). The ethical side of journaling with an interactive artifact using AI needs to be explored extensively in future developments of this project.
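The user-control and transparency requirements discussed above (DP09, DP11) can be made concrete with a small data-structure sketch. The following is only an illustration under our own assumptions: the class and method names (`JournalStore`, `stop_collection`, `delete_entry`, `describe_data_use`) are hypothetical and do not come from the study, and a real artifact would also need persistent, protected storage.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, Optional


@dataclass
class JournalEntry:
    entry_id: int
    text: str
    created_at: datetime


@dataclass
class JournalStore:
    """Hypothetical in-memory journal store (illustrative names only)."""
    collecting: bool = True  # DP11: the user can switch data collection off
    entries: Dict[int, JournalEntry] = field(default_factory=dict)
    next_id: int = 1

    def add_entry(self, text: str) -> Optional[JournalEntry]:
        # Nothing is stored while the user has stopped data collection.
        if not self.collecting:
            return None
        entry = JournalEntry(self.next_id, text, datetime.now(timezone.utc))
        self.entries[entry.entry_id] = entry
        self.next_id += 1
        return entry

    def stop_collection(self) -> None:
        self.collecting = False

    def delete_entry(self, entry_id: int) -> bool:
        # DP11: the user can delete any entry; True means something was removed.
        return self.entries.pop(entry_id, None) is not None

    def describe_data_use(self) -> Dict[str, object]:
        # DP09: a plain summary of what is currently held about the user.
        return {
            "collecting": self.collecting,
            "stored_entries": len(self.entries),
            "fields_stored": ["entry_id", "text", "created_at"],
        }


# Usage: store one entry, stop collection, then delete the stored entry.
store = JournalStore()
first = store.add_entry("Felt calmer after writing today.")
store.stop_collection()
skipped = store.add_entry("This entry is not stored.")
deleted = store.delete_entry(first.entry_id)
summary = store.describe_data_use()
```

In this sketch the collection switch is checked at the single point where data enters the store, so stopping collection cannot be bypassed by other features; whether that matches a given artifact's architecture is a design decision outside the scope of the study.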
7 Conclusions
This project explored the intersection of journaling, reflection, and Conversational Agents. It explored how we might design a CA to facilitate an interactive and reflective journaling experience. Our findings suggest that CAs have a place in the context of journaling. The possibility to express oneself freely, in a judgment-free and personalized manner, seemed to be the most important user need, followed by the value of the aesthetics of the journaling experience. The purpose of journaling, the depth of reflection, and aesthetics such as colors and type of journal vary significantly among people; hence, the interactive journaling system needs to accommodate that variety by being adaptable. Reflection is a core component of most journaling practices, and the journaler benefits from having external prompts to support the reflective practice. The CA can deliver these types of prompts. Additionally, a CA can support the users’ need for memory augmentation and pattern recognition. Using conversation as the primary form of interaction, a CA can provide the user with a dialogue partner or external source of feedback, allowing for the opportunity to make journaling more like a conversation. We learned that some personal activities have particular meaning for users in the context of journaling, such as writing skills, the aesthetics of the experience, and the sacred moment of journaling. We realized that most AI design principles address functionalities, transparency, ethics, and privacy. While these are all relevant and useful in designing better and more just AI systems, they often do not address activities with particular and personal meaning in users’ lives. This paper sought to address that gap and included those needs in the set of design principles for human-AI interaction design, specifically for the context of interactive journaling. 
The principles are based on a qualitative study and a literature study about reflection, conversational agents, and interactive journaling. Compared with more general AI guidelines in the field, our design principles can be viewed as an additional context-specific layer of recommendations for applying AI and conversational agents to the context of journaling. Moving forward, more studies are needed to produce more comprehensive design principles for interactive journaling with AI, and this paper aims to take the first step in that direction.
Acknowledgments. We would like to thank the people who participated in the user studies.
References
1. Terzimehić, N., Schött, S.Y., Bemmann, F., Buschek, D.: Memeories: internet memes as means for daily journaling. In: Designing Interactive Systems Conference 2021, pp. 538–548. Association for Computing Machinery, New York (2021). https://doi.org/10.1145/3461778.3462080
2. Blake, T.K.: Journaling; an active learning technique. Int. J. Nurs. Educ. Sch. 2 (2005). https://doi.org/10.2202/1548-923X.1116
3. Corbin Frazier, L., Eick, C.: Approaches to critical reflection: written and video journaling. Reflective Pract. 16, 575–594 (2015). https://doi.org/10.1080/14623943.2015.1064374
4. Hubbs, D.L., Brand, C.F.: The paper mirror: understanding reflective journaling. J. Exper. Educ. 28, 60–71 (2005). https://doi.org/10.1177/105382590502800107
5. Hughes, H.W., Kooy, M., Kanevsky, L.: Dialogic reflection and journaling. The Clearing House: J. Educ. Strat. Issues Ideas 70, 187–190 (1997). https://doi.org/10.1080/00098655.1997.10544193
6. Ullrich, P.M., Lutgendorf, S.K.: Journaling about stressful events: effects of cognitive processing and emotional expression. Ann. Behav. Med. 24, 244–250 (2002). https://doi.org/10.1207/S15324796ABM2403_10
7. Stamper, C.E.: Fostering reflective thinking through computer-mediated journaling. Arizona State University (1996)
8. Ong, L.T.R., Pemberton, R.: Enhancing classroom learning through computer-mediated reflective writing and peer feedback. J. Mod. Lang. 19, 99–120 (2009)
9. Zhang, Y., Parker, A.G.: Eat4Thought: a design of food journaling. In: Extended Abstracts of the 2020 CHI Conference on Human Factors in Computing Systems, pp. 1–8. Association for Computing Machinery, New York (2020). https://doi.org/10.1145/3334480.3383044
10. Kocielnik, R., Xiao, L., Avrahami, D., Hsieh, G.: Reflection companion: a conversational system for engaging users in reflection on physical activity. Proc. ACM Interact. Mob. Wearable Ubiquitous Technol. 2, 70:1–70:26 (2018). https://doi.org/10.1145/3214273
11. Kocielnik, R., Avrahami, D., Marlow, J., Lu, D., Hsieh, G.: Designing for workplace reflection: a chat and voice-based conversational agent. In: Proceedings of the 2018 Designing Interactive Systems Conference, pp. 881–894 (2018)
12. Deibel, D., Evanhoe, R.: Conversations with Things: UX Design for Chat and Voice. Rosenfeld Media (2021)
13. Yang, X., Aurisicchio, M., Baxter, W.: Understanding affective experiences with conversational agents. In: Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems, pp. 1–12. Association for Computing Machinery, New York (2019). https://doi.org/10.1145/3290605.3300772
14. Liao, Q.V., Shmueli-Scheuer, M., Wen, T.-H., Yu, Z.: User-aware conversational agents. In: Proceedings of the 24th International Conference on Intelligent User Interfaces: Companion, pp. 133–134 (2019)
15. Pirzadeh, A., He, L., Stolterman, E.: Personal informatics and reflection: a critical examination of the nature of reflection. In: CHI 2013 Extended Abstracts on Human Factors in Computing Systems, pp. 1979–1988 (2013)
16. Nakamura, K., Feng, H., Priss, S., Mei, H.: Designing for night-time reflection: how to support night-time reflection through non-digital means. In: The 39th ACM International Conference on Design of Communication, pp. 386–388 (2021)
17. Tholander, J., Normark, M.: Crafting personal information - resistance, imperfection, and self-creation in bullet journaling. In: Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems, pp. 1–13. Association for Computing Machinery, New York (2020)
18. Ghajargar, M., De Marco, A., Montagna, F.: In: Wise Things: when smart objects stimulate reflection (2017)
19. Ghajargar, M., Wiberg, M., Stolterman, E.: Designing IoT systems that support reflective thinking: a relational approach. Int. J. Des. 12, 21–35 (2018)
20. Fleck, R., Fitzpatrick, G.: Reflecting on reflection: framing a design landscape. In: Proceedings of the 22nd Conference of the Computer-Human Interaction Special Interest Group of Australia on Computer-Human Interaction, pp. 216–223. Association for Computing Machinery, New York (2010). https://doi.org/10.1145/1952222.1952269
21. Ghajargar, M., Wiberg, M.: Thinking with interactive artifacts: reflection as a concept in design outcomes. Des. Issues 34, 48–63 (2018). https://doi.org/10.1162/DESI_a_00485
22. Baumer, E.P.S., Khovanskaya, V., Matthews, M., Reynolds, L., Schwanda Sosik, V., Gay, G.: Reviewing reflection: on the use of reflection in interactive system design. In: Proceedings of the 2014 Conference on Designing Interactive Systems, pp. 93–102. Association for Computing Machinery, New York (2014). https://doi.org/10.1145/2598510.2598598
23. Ghajargar, M., Bardzell, J.: Synthesis of forms: integrating practical and reflective qualities in design. In: Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems, pp. 1–12. Association for Computing Machinery, New York (2021). https://doi.org/10.1145/3411764.3445232
24. Hall, E.: Conversational Design. A Book Apart (2018)
25. Luger, E., Sellen, A.: “Like having a really bad PA”: the gulf between user expectation and experience of conversational agents. In: Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems, pp. 5286–5297 (2016)
26. van Berkel, N., Skov, M.B., Kjeldskov, J.: Human-AI interaction: intermittent, continuous, and proactive. Interactions 28, 67–71 (2021). https://doi.org/10.1145/3486941
27. Allen, J.F., Byron, D.K., Dzikovska, M., Ferguson, G., Galescu, L., Stent, A.: Toward conversational human-computer interaction. AI Mag. 22, 27 (2001). https://doi.org/10.1609/aimag.v22i4.1590
28. Norman, D.A.: How might people interact with agents. Commun. ACM 37, 68–71 (1994)
29. Wärnestål, P.: Design av AI-drivna tjänster. Studentlitteratur AB (2021)
30. Amershi, S., et al.: Guidelines for human-AI interaction. In: Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems, pp. 1–13. Association for Computing Machinery, New York (2019). https://doi.org/10.1145/3290605.3300233
31. Kulesza, T., Burnett, M., Wong, W.-K., Stumpf, S.: Principles of explanatory debugging to personalize interactive machine learning. In: Proceedings of the 20th International Conference on Intelligent User Interfaces, pp. 126–137 (2015)
32. Hummels, C., Van Dijk, J.: Seven principles to design for embodied sensemaking. In: Proceedings of the Ninth International Conference on Tangible, Embedded, and Embodied Interaction, pp. 21–28 (2015)
33. Cronholm, S., Göbel, H.: Design Principles For Human-Centred AI. ECIS 2022 Research Papers (2022)
34. Ahmad, R., Siemon, D., Gnewuch, U., Robra-Bissantz, S.: Designing personality-adaptive conversational agents for mental health care. Inf. Syst. Front. 1–21 (2022). https://doi.org/10.1007/s10796-022-10254-9
35. AI Ethics. https://www.ibm.com/artificial-intelligence/ethics. Accessed 27 July 2022
36. Our Principles. https://ai.google/principles/. Accessed 27 July 2022
37. Fundamentals. https://www.ibm.com/design/ai/fundamentals/#design-factors-for-ai. Accessed 27 July 2022
38. Hanington, B., Martin, B.: Universal methods of design expanded and revised: 125 Ways to research complex problems, develop innovative ideas, and design effective solutions. Rockport publishers (2019)
39. van Boeijen, A., Daalhuizen, J., Zijlstra, J.: Delft Design Guide: Perspectives, Models, Approaches, Methods. BIS Publishers (2020)
40. Adu, P.: A Step-by-step Guide to Qualitative Data Coding. Routledge (2019)
41. Myers, M.D.: Qualitative Research in Business and Management. SAGE (2013)
42. Sharp, H., Preece, J., Rogers, Y.: Interaction Design: Beyond Human-Computer Interaction. Wiley (2019)
43. Bardzell, J., Bardzell, S.: Humanistic HCI. Synth. Lect. Hum.-Center. Inform. 8, 1–185 (2015). https://doi.org/10.2200/S00664ED1V01Y201508HCI031
44. Petersen, M.G., Iversen, O.S., Krogh, P.G., Ludvigsen, M.: Aesthetic interaction: a pragmatist’s aesthetics of interactive systems. In: Proceedings of the 5th Conference on Designing Interactive Systems: Processes, Practices, Methods, and Techniques, pp. 269–276 (2004)
45. Abdul, A., Vermeulen, J., Wang, D., Lim, B.Y., Kankanhalli, M.: Trends and trajectories for explainable, accountable and intelligible systems: An HCI research agenda. In: Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems, pp. 1–18 (2018)
46. Bunt, A., Lount, M., Lauzon, C.: Are explanations always important? a study of deployed, low-cost intelligent interactive systems. In: Proceedings of the 2012 ACM international conference on Intelligent User Interfaces, pp. 169–178. Association for Computing Machinery, New York (2012). https://doi.org/10.1145/2166966.2166996
47. Ghajargar, M., Persson, J., Bardzell, J., Holmberg, L., Tegen, A.: The UX of interactive machine learning. In: Proceedings of the 11th Nordic Conference on Human-Computer Interaction: Shaping Experiences, Shaping Society, pp. 1–3 (2020)
AI Ethics on Blockchain: Topic Analysis on Twitter Data for Blockchain Security
Yihang Fu1,2, Zesen Zhuang1,2, and Luyao Zhang1,2(B)
1 Duke Kunshan University, Suzhou, Jiangsu 215316, China [email protected]
2 Data Science Research Center and Social Science Division, Duke Kunshan University, Suzhou, China
Abstract. Blockchain has empowered computer systems to be more secure using a distributed network. However, the current blockchain design suffers from fairness issues in transaction ordering. Miners are able to reorder transactions to generate profits, the so-called miner extractable value (MEV). Existing research recognizes MEV as a severe security issue and proposes potential solutions, including prominent Flashbots. However, previous studies have mostly analyzed blockchain data, which might not capture the impacts of MEV in a much broader AI society. Thus, in this research, we applied natural language processing (NLP) methods to comprehensively analyze topics in tweets on MEV. We collected more than 20000 tweets with #MEV and #Flashbots hashtags and analyzed their topics. Our results show that the tweets discussed profound topics of ethical concern, including security, equity, emotional sentiments, and the desire for solutions to MEV. We also identify the comovements of MEV activities on blockchain and social media platforms. Our study contributes to the literature at the interface of blockchain security, MEV solutions, and AI ethics.
Keywords: AI Ethics · Blockchain Security · Natural Language Processing (NLP) · Twitter · Flashbots · MEV
The corresponding author Luyao Zhang is supported by National Science Foundation China on the project entitled “Trust Mechanism Design on Blockchain: An Interdisciplinary Approach of Game Theory, Reinforcement Learning, and Human-AI Interactions.” (Grant No. 12201266). Yihang Fu is supported by the Summer Research Scholar (SRS) program 2022 under Prof. Luyao Zhang’s project titled “Trust Mechanism Design: Blockchain for Social Good” at Duke Kunshan University. Zesen Zhuang is supported by the Social Science Divisional Chair’s Discretionary Fund for undergraduate research in Prof. Luyao Zhang’s related interdisciplinary research courses as a Teaching and Research Assistant at Duke Kunshan University. Both Zesen Zhuang and Luyao Zhang are also with SciEcon CIC, a not-for-profit organization aiming at cultivating interdisciplinary research of both profound insights and practical impacts in the United Kingdom. We thank the anonymous referees at Computing Conference for their professional and thoughtful comments.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023
K. Arai (Ed.): SAI 2023, LNNS 739, pp. 82–100, 2023. https://doi.org/10.1007/978-3-031-37963-5_7
1 Introduction
Blockchain has enabled computer systems to be more secure using a distributed network [3,25,26,28]. However, the current blockchain design suffers from fairness issues in transaction ordering [8]. Miner extractable value (MEV), a term coined by Daian et al. in 2020 [6], refers to the value that miners can extract by reordering the transactions on the blockchain. For example, on Proof-of-Work (PoW) Ethereum [17], miners can order, include, and exclude transactions in the mem-pool, a pool where transactions are stored or sorted temporarily before being added to new blocks. Researchers [20] found that from 2015 to 2020, the 199724 frontrunners had cumulative profits of more than 18.4 billion USD. Since the transition of Ethereum from PoW to Proof-of-Stake (PoS), miners no longer have a role in the blockchain protocol. Instead, validators take charge of validating transactions on the blockchain. However, the method of extracting value by manipulating the transaction order still exists. Therefore, people now use MEV as an abbreviation for the maximum extractable value in PoS Ethereum, the so-called Ethereum 2.0. Existing research recognizes MEV as a severe security issue and proposes potential solutions [7,24], including the prominent Flashbots [21]. However, previous studies have mostly analyzed blockchain data, which might not capture the impacts of MEV in a much broader AI society. Therefore, we extend the study of MEV from blockchain data to a broader community on social media platforms. Specifically, our study targets two research questions (RQs):
1. RQ1: What are the main keywords and topics being discussed in tweets with #MEV and #flashbots hashtags, and what are the connections between those keywords?
2. RQ2: What are the connections between the MEV activities on blockchain and discussions on social media platforms?
In this study, we applied natural language processing (NLP) methods to analyze topics in tweets on MEV comprehensively. 
We queried more than 20000 tweets with #MEV and #Flashbots hashtags from 2019 to October 2022. We also included the corresponding Google Trends data for the same period for reference and comparison. To explore the connections between the MEV activities on blockchain and discussions on social media platforms, we collected the gross profit data of MEV from Flashbots. Our results show that the tweets discussed profound topics of ethical concern, including security, equity, emotional sentiments, and the desire for solutions to MEV. According to the keyword statistics, the discussion about MEV is highly concentrated on the Ethereum blockchain. The result also indicates that the MEV problem is one of the most urgent problems on the Ethereum blockchain, and practical solutions are in high demand. In addition to Flashbots, the topics mention several alternative solutions to MEV, but Flashbots appears to be the most promising one. Some potential nontraditional solutions are also mentioned, such as machine learning. Moreover, other nontechnical keywords indicate that people generally express negative emotions toward MEV, e.g., a feeling of unfairness. We also identify the co-movements of MEV activities on blockchain and social media platforms. Our study contributes to the literature at the interface of blockchain security, MEV
84
Y. Fu et al.
solutions, and AI ethics. In Sect. 2, we discuss the related literature and background. Section 3 introduces the data and methodology. Section 4 presents the results of the two research questions respectively. Section 5 discusses and concludes. We provide a glossary Table 7 in the Appendix.
Fig. 1. Common topics between MEV and flashbots, and unique topics in the two hashtags.
2 Related Literature and Background
Our research contributes to three lines of literature: blockchain security, MEV solutions, and AI ethics.
2.1 Blockchain Security: MEV Issues
The Ethereum blockchain facilitates transactions with the use of smart contracts. In Ethereum, nodes collect transaction information from the network, and miners record the transactions into blocks. Before being added to the blocks, transactions are temporarily stored and sorted in the mem-pool. Miners select transactions in the mem-pool and execute Proof of Work. The miner who wins the PoW race can add the block to the network [17]. The order of transactions is predetermined. The execution depends on the initial transaction sets in front of the block or in the same block. However, as Daian et al. [6] showed when they introduced frontrunning on cryptocurrency decentralized exchanges (DEXs), miners can change the order of the transactions. In general, MEV is an activity in which attackers (or profit seekers) discover certain instabilities and look for extractable values [1]. MEV extraction uses different strategies to obtain profits. One of the most common strategies is called a sandwich attack. For example, if A submits a transaction to purchase a token, attackers who spot A's pending transaction can buy the token ahead of A, which pushes up the price A pays, and then sell the token right after A's purchase at the higher price. The attackers manage to extract a profit from the series of transactions. Another commonly used strategy is called the arbitrage attack. In arbitrage, the same good is purchased and sold simultaneously in different markets, and profits are made from the difference in the price of the same good in different markets. In Ethereum, if two or more DEXs offer the same token at different prices simultaneously, one can buy the token where it is cheaper and sell it where the price is higher. Our research contributes to the literature by analyzing concerns about blockchain security discussed on social media platforms.
2.2 MEV Solutions: Flashbots and Alternatives
Methods to mitigate MEV problems are divided into two main categories: democratization and minimization. MEV minimization sets up rules to make MEV impossible or to increase the risk so that it outweighs the MEV benefits. For example, Ethereum 2.0 upgrades from proof-of-work (PoW) to proof-of-stake (PoS) and introduces slashing [14] to punish misbehavior regarding MEV. Third-party researchers propose minimization solutions such as fair sequencing services [7] and Conveyor [11]. Alternatively, MEV democratization tries not to eliminate but to democratize MEV behavior so that everyone has access to information that is available to miners. The most popular approaches, such as Flashbots [21], direct transactions to private relays to reduce the mem-pool bidding war [22].¹ Flashbots aims to mitigate the negative externalities of the current MEV by establishing a fair, transparent, and permissionless ecosystem. Two initiatives mainly support it, MEV-geth and MEV-inspect [6]. Specifically, Flashbots provides a new auction ecosystem with three primary roles: searchers, relays, and miners. Searchers seek MEV opportunities. Once they find potential transactions promoted by peers, they create a bundle that contains a set of transactions. The bundle includes the fee paid to the miners and the searchers themselves. Searchers then send the bundles to the relays instead of the mem-pool. Relays collect the bundles and send them to the miners. Since the bundles are sent to Flashbots, the miners are exclusively Flashbots miners. Miners then collect bundles and select the most profitable ones, and only one transaction can be accounted for in each block. Miners can determine which transaction to mine based on MEV-geth, a forked version of the Go-Ethereum client [21]. Our research contributes to the literature by evaluating MEV solution discussions on social media platforms and their connections to MEV activities on the blockchain.
2.3 AI Ethics
Researchers have expressed concern about the ethical aspects of security issues related to blockchain technologies. Bertino et al. [4] noted that if data were gathered and used based on ethical principles of data transparency, it would provide a novel way for policy-makers to assess the mechanism of blockchain transactions. Another group of researchers proposed a list of ethical concerns about blockchain technology categorized into four areas: technology stack, cryptocurrencies, smart contracts, and decentralization. One of the major concerns in the cryptocurrency area is whether the coin mining mechanism is ethically sustainable and fair [18]. Regarding this question, Weintraub et al. assessed Flashbots. They argued that in the new auction mechanism of Flashbots (as introduced previously), only some users can receive fair profits, and miners benefit more than searchers [21]. However, although these researchers provided comprehensive insights into the ethical discussion, existing research has yet to evaluate the ethical issues of blockchain based on the real-life reactions of blockchain users. In this article, we measure and evaluate people's reactions and feedback toward blockchain security issues on social media platforms.
1 There are also similar solutions such as Eden Network [13] and CoW Protocol [15].
3 Data and Methodology
The data and code in this project are open-sourced and can be accessed at https://github.com/SciEcon/blockchain-ethics.
Table 1. Sample Tweets Data.
Index | Date | Tweets
0 | 2021-12-31 15:53:38+00:00 | @willwarren No worries @foldfinance solves thi...
1 | 2021-12-31 05:56:23+00:00 | Vshadow \n#Imgoodgirl \n...
2 | 2021-12-31 05:49:28+00:00 | This is what a sandwich attack looks like. The...
3 | 2021-12-29 21:11:16+00:00 | #Memoria...líderes de jxc apoyaron públicament...
4 | 2021-12-29 17:34:14+00:00 | even if you don’t have that much money to clai...
3.1 Data
We collected three datasets: tweets, Google Trends data, and on-chain MEV records. For tweets, we used snscrape² to query primary data for our research. snscrape is a Python library that scrapes posts from a variety of social networks by topic or hashtag. We queried two datasets, one with the hashtag #mev and the other with the hashtag #flashbots, covering 2019-01-01 to 2022-10-01. In total, we found 20574 tweets with the hashtag #mev and 852 tweets with the hashtag #flashbots. The queried data includes two columns, date and content. Table 1 shows examples of the downloaded data. Next, we used the Python library pytrends³ to query Google Trends data for two topics, “MEV” and “flashbots”. pytrends provides an API to automatically download reports from Google Trends⁴. Then, we merged the Google Trends data with the tweets by date, as in Table 2. In addition, we queried Ethereum MEV records from the Flashbots MEV-Explore dashboard⁵ to compare on-chain and social media activities.
2 https://github.com/JustAnotherArchivist/snscrape.
3 https://github.com/GeneralMills/pytrends.
4 https://trends.google.com/trends.
Table 2. Sample Merged Data: Column date Shows the Date in YYYY-MM-DD Format; Column google trend Shows the Google Trends Index; Column tweet volume is the Count of Tweets with a Specific Topic (“MEV” for Example); Column tweet len is the Sum of the Length of the Tweets in One Day.
Index | date | google trend | tweet volume | tweet len
0 | 2021-04-11 | 23 | 3 | 20.666667
1 | 2021-04-18 | 23 | 2 | 37.500000
2 | 2021-04-25 | 0 | 1 | 42.000000
3 | 2021-05-02 | 0 | 1 | 34.000000
4 | 2021-05-09 | 24 | 1 | 19.000000
3.2 Methodology
Our NLP methods include keyword analysis and Latent Dirichlet Allocation (LDA), similar to the quantitative methods in [19].
Keywords Analysis Methods. Analyzing how the discussion of a topic on social media develops, and which terms are highly relevant to it, helps us understand the history and development of the topic and its future trends. In this study, we trace and then quantify the activity of the hashtags #mev and #flashbots on Twitter and compare them with their Google Trends profiles. We first conduct the Spearman correlation test between tweet volume and Google Trends data and then plot the time series for each hashtag to reveal the correlation between their activity on Twitter and Google. Next, we count and sort the keywords' appearances in tweets (irrelevant tokens such as emojis and common words are excluded) and draw a word cloud that shows the topics discussed on social media that are most relevant to the two hashtags #mev and #flashbots. After that, we use the Python library NetworkX⁶ to draw a network of keywords. An edge in the network indicates that two keywords occur in the same topic, and the thickness of the edge is proportional to the frequency of co-occurrence.
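The keyword counting and co-occurrence steps described above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the authors' code: the stop-word list, the tokenizer, and the sample tweets are hypothetical, and co-occurrence is counted per tweet, which is an assumption about how "the same topic" is operationalized.

```python
from collections import Counter
from itertools import combinations
import re

# Hypothetical stop-word list; the paper does not publish the one it used.
STOPWORDS = {"the", "a", "is", "to", "of", "and", "on", "in", "this", "what"}

def tokenize(tweet):
    """Lowercase a tweet, keep hashtags and ASCII words, drop emojis and stop words."""
    tokens = re.findall(r"#?[a-z][a-z0-9_]*", tweet.lower())
    return [t for t in tokens if t not in STOPWORDS]

def keyword_counts(tweets):
    """Count how often each keyword appears across all tweets."""
    counts = Counter()
    for tweet in tweets:
        counts.update(tokenize(tweet))
    return counts

def cooccurrence_edges(tweets):
    """Count how often each unordered keyword pair appears in the same tweet."""
    edges = Counter()
    for tweet in tweets:
        unique = sorted(set(tokenize(tweet)))
        edges.update(combinations(unique, 2))
    return edges

# Hypothetical sample tweets.
tweets = [
    "This is what a sandwich attack looks like #mev",
    "#flashbots bundles protect against the sandwich attack #mev",
    "gas wars on ethereum #mev",
]
counts = keyword_counts(tweets)
edges = cooccurrence_edges(tweets)
print(counts.most_common(3))
print(edges[("#mev", "sandwich")])
```

Each `(u, v)` pair and its count could then be passed to NetworkX as a weighted edge, for example with `Graph.add_edge(u, v, weight=w)`, so that edge thickness can follow the co-occurrence frequency.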
5 Flashbots MEV-Explore public dashboard, https://explore.flashbots.net/, consists of various historical statistics of MEV activities on the Ethereum blockchain.
6 https://networkx.org/.
Latent Dirichlet Allocation for Topic Analysis. We utilize Latent Dirichlet Allocation (LDA) [5] to reveal the topic tendency of the collected tweets on #mev and #flashbots. LDA is a statistical model that groups data and explains why some groups of data are similar. The LDA model is widely used in natural language processing. Our research utilizes an LDA model implemented in the Python library gensim [16]. The gensim implementation of LDA has three hyperparameters: (1) an integer K, the number of topics; (2) a rational number α between 0 and 1 that controls the per-document topic distribution, where a higher α results in more topics in a document; (3) a rational number β between 0 and 1 that controls the per-topic word distribution, where a higher β results in more keywords in a topic. For results, the LDA model produces the probability of the corpus as shown in equation (1).

$$p(D \mid \alpha, \beta) = \prod_{d=1}^{M} \int p(\theta_d \mid \alpha) \left( \prod_{n=1}^{N_d} \sum_{z_{dn}=1}^{K} p(z_{dn} \mid \theta_d)\, p(w_{dn} \mid z_{dn}, \beta) \right) d\theta_d \qquad (1)$$

In equation (1), M is the number of documents (here, tweets) in the corpus D, θ_d is the topic mixture of document d, z_dn is the topic assigned to the n-th word w_dn of document d, and N_d is the number of words in document d. The model is trained with various α, β, and K. We use a coherence score for parameter optimization. A high coherence score for a topic indicates a higher semantic similarity among the keywords in that topic. We manually tried out K ∈ {1, 5, 10, 15, 20, 25, 30} and then adopted the SA-LDA [12] algorithm to optimize the hyperparameters α and β. Ultimately, we obtained K = 20, α = 0.31, and β = 0.61 with a coherence score of 0.4296 for the hashtag #flashbots, and K = 5, α = 0.25, and β = 0.91 with a coherence score of 0.4687 for the hashtag #mev.
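To make the roles of K, α, and β concrete, the sketch below fits LDA on a toy corpus with collapsed Gibbs sampling in plain Python. This is a swapped-in technique chosen to keep the example self-contained: it is not the gensim implementation used in the paper and not the authors' code. The function name `lda_gibbs`, the toy documents, and the iteration count are hypothetical.

```python
import random

def lda_gibbs(docs, K, alpha, beta, iters=200, seed=0):
    """Fit LDA by collapsed Gibbs sampling; docs is a list of token lists."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    index = {w: i for i, w in enumerate(vocab)}
    V = len(vocab)
    n_dk = [[0] * K for _ in docs]      # words of document d assigned to topic k
    n_kw = [[0] * V for _ in range(K)]  # times word w is assigned to topic k
    n_k = [0] * K                       # total words assigned to topic k
    z = []                              # z[d][n]: topic of the n-th word of document d

    # Start from a uniformly random topic assignment.
    for d, doc in enumerate(docs):
        z.append([])
        for w in doc:
            k = rng.randrange(K)
            z[d].append(k)
            n_dk[d][k] += 1
            n_kw[k][index[w]] += 1
            n_k[k] += 1

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for n, w in enumerate(doc):
                wi, k = index[w], z[d][n]
                # Remove the current assignment before resampling it.
                n_dk[d][k] -= 1
                n_kw[k][wi] -= 1
                n_k[k] -= 1
                # Conditional probability of each topic for this word (unnormalized).
                weights = [
                    (n_dk[d][t] + alpha) * (n_kw[t][wi] + beta) / (n_k[t] + V * beta)
                    for t in range(K)
                ]
                k = rng.choices(range(K), weights=weights)[0]
                z[d][n] = k
                n_dk[d][k] += 1
                n_kw[k][wi] += 1
                n_k[k] += 1

    # Point estimates of theta (per document) and phi (per topic) from the final sample.
    theta = [[(n_dk[d][k] + alpha) / (len(doc) + K * alpha) for k in range(K)]
             for d, doc in enumerate(docs)]
    phi = [[(n_kw[k][wi] + beta) / (n_k[k] + V * beta) for wi in range(V)]
           for k in range(K)]
    return vocab, theta, phi


# Hypothetical, already tokenized tweets.
docs = [
    ["sandwich", "attack", "mev", "frontrunning"],
    ["flashbots", "bundle", "relay", "miners"],
    ["mev", "sandwich", "frontrunning", "attack"],
    ["flashbots", "relay", "bundle", "auction"],
]
vocab, theta, phi = lda_gibbs(docs, K=2, alpha=0.25, beta=0.91)
for d, mixture in enumerate(theta):
    print(d, [round(p, 2) for p in mixture])
```

In the sampling weights, α is added to the per-document topic counts and β to the per-topic word counts, which is how larger values spread a document over more topics and a topic over more words. In gensim's `LdaModel`, the corresponding arguments are named `num_topics`, `alpha`, and `eta`.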
4 Results
4.1 Answers to RQ1
This section answers RQ1 based on the four analyses below:
1. We calculate and rank the frequent keywords among the tweets under the hashtags #MEV and #Flashbots.
2. We use LDA to analyze people's reactions and emotions behind the potential MEV security issue.
3. We use Network Analysis (NA) to seek the intrinsic relationships between keywords under each Twitter hashtag.
4. We compare the Google Trends data with real-world events.
Keywords. We rank the frequency of keywords for both hashtags in Table 3 and Table 4. By calculating the word count of the keywords in each topic, we found similarities between these two hashtags. First, both contain hashtags with related semantic meanings. For example, blockchain terminologies such as Ethereum, miner, and crypto are the most salient keywords apart from #MEV and #Flashbots themselves, appearing more than 300 times. Also, we observed that #MEV and #Flashbots are each mentioned under the other hashtag. This reflects the coherent relationship between these two hashtags. As Tables 3 and 4 show, Flashbots was mentioned under the topic of MEV nearly 400 times, and MEV was mentioned nearly 400 times as well under the topic of Flashbots. After investigating the word frequency, we also explored the keyword connections by building word bigrams. As we can see in Fig. 2, after selecting for semantic meaning, the most frequent bigrams for #flashbots are “mev” & “flashbots”, “crucible” & “copper”, and “flashbots” & “mist”. The most frequent bigrams for #MEV are “extractable” & “value”, “mev” & “flashbots”, “macri” & “macri”, “miner” & “extractable”, “front” & “running”, “defi” & “mev”, and “cpb” & “mev”.
Table 3. Wordcount for #flashbots.
Keywords | Frequency
#flashbots | 513
#Flashbots | 340
MEV | 219
#MEV | 152
ETH | 85
mist | 84
Flashbots | 76
opensea | 72
#Ethereum | 67
Support | 65
artist | 65
grow | 65
Shill | 65
Shizzlebotz | 65
#nftshill | 65
#PolygonNFT | 65
#openseaNFT | 65
#mev | 49
thegostep | 49
transactions | 47
gas | 46
miners | 41
MIST | 41
#DeFi | 40
Ethereum | 39
team | 39
bertcmiller | 31
transaction | 31
#FlashBots | 29
#mistX | 28
NFT | 26
front | 25
#riverfrontrocks | 25
#ethereum | 24
EST | 24
Table 4. Wordcount for #MEV.
Keywords | Frequency
#MEV | 19928
#arbitrage | 7511
Ecocent | 5966
MEV | 5957
WETH | 5339
video | 4398
simply | 4357
explained | 4352
System | 4351
#HotWater | 4349
info | 3930
USDT | 3318
USDC | 3290
ROI | 2908
ESP | 2749
view | 2578
WBNB | 2382
triangular | 2102
sandwich | 2078
spatial | 1570
profit | 1267
DAI | 1225
days | 1175
contract | 1075
eye | 1016
#SandwichAttacker | 1009
SAP | 910
EigenPhi | 877
#mev | 816
pass | 809
WBTC | 768
BUSD | 754
LDA Analysis. We used gensim, an open-source library for unsupervised topic modeling, to implement the text analysis. We selected 1, 3, 5, 10, 15, 20, and 25 as the candidate numbers of topics and calculated the corresponding coherence scores. We found that when k = 1 (the number of topics), the coherence score of #flashbots (0.482) is the highest. When k = 3, the coherence score of #MEV, which equals 0.47, is the highest outcome in our model. This indicates that when the number of topics for #flashbots is 1 and the number of topics for #MEV is 3, the words in the corpus are relatively more semantically interpretable to humans. Figure 3 plots the coherence scores under different numbers of topics for the two hashtags. Since the coherence score also depends on the two hyperparameters, we adopted the SA-LDA algorithm of Pathik and Shukla [12] to find a pair of approximately suitable α and β.
We found that when α = 0.31, β = 0.61, and the number of topics = 20, the coherence value of #flashbots equals 0.4296, which is the highest among all the outcomes of the tested combinations. In the same way, when α = 0.25, β = 0.91, and the number of topics = 5, the approximate highest coherence value of #MEV equals 0.4687. After excluding some words without semantic meaning, for example, emojis and persons' names, we identify the sets of words categorized by different topics in Table 6. We summarize our main findings in Fig. 1. Our results show that the tweets discussed profound topics of ethical concern, including security, equity, emotional sentiments, and the craving for solutions to MEV. Table 5 lists the top 30 salient terms generated by the LDA model with the selected parameters for #MEV and #flashbots. Combined with the LDA analysis, we find that the top 30 salient terms, including some frequently mentioned keywords such as ethereum, smart contract, and solution, are not categorized by the topics. That is to say, the LDA model did not successfully recognize some of their semantic meanings.
Fig. 2. Bigram of Keywords: The Bigram of Keywords, which Illustrates the most Frequent Keyword Pairs that were Mentioned in Tweets under the Hashtag of #MEV and #Flashbots.
Table 5. Top 30 most Salient Terms of Two Hashtags. Top 30 Most Relevant Terms (Overall term frequency).
#Flashbots: mist, mistx, flashbots, crucible, gt, gas, bundles, copper, team, poap, alchemist, eth, samiches, mev, going, pm, bertcmiller, good, future, week, thanks, see, ethereum, thegostep, crypto, leaksblockchain, new, block, via, miners
#MEV: macri, mev, extractable, ethereum, value, see, op, chain, mist, inflate, si, aiz, eth, flashbots, video, door, mehwishhayat, chainlink, energy, mempool, front, capital, love, miner, us, worldpastsaday, look, people, makes, fair
Fig. 3. Coherence Score for #MEV.
Keywords Network. We used NetworkX in Python to visualize the keyword networks shown in Figs. 4 and 5. Each node represents a commonly used keyword extracted from tweets of #MEV and #flashbots, and each edge represents a connection between two keywords. The number of edges at a node is the number of other keywords it is connected to. With this analysis, we can identify the most commonly mentioned keywords and the keywords that co-occur with them. We can see in the networks that the most frequent keywords that appear together with #flashbots are ethereum, smartcontract, solution, etc. The most frequent keywords connected with #MEV are flashbots, sandwich, miner, etc.
Fig. 4. Network of Keywords for #MEV.
Fig. 5. Network of Keywords for #flashbots.
Google Trend and Twitter Data. We also examined Google Trends data for two keywords (namely, #MEV and #flashbots) to compare their Twitter volume and offline activities. We used the Google Trends API wrapper pytrends to query the Google Trends data. Google Trends reveals the popularity of top search queries in Google Search across various regions and languages. In general, the Google Trends series of #MEV and #flashbots show moderate consistency with the respective Twitter volumes. We ran the Spearman correlation test between Twitter volume and Google Trends, and the results showed that #MEV had stronger consistency (coefficient = 0.45) than #flashbots (coefficient = 0.202). Furthermore, we observed that each peak in the Flashbots Google Trends series is offset by a certain time interval from the corresponding peak in Flashbots Twitter volume. Figure 6 illustrates the staggered peaks of Google Trends and Twitter volume for #Flashbots. Also, we found that every peak of Google Trends or Twitter volume of #Flashbots could be matched with a major offline event of the Flashbots organization. For example, in January 2021, Flashbots Auction Alpha (v0.1) was made available for miners and searchers to adopt. The same year, in May, August, and September, Flashbots Auction Alpha was updated and its latest versions were published, corresponding to the peaks shown in Fig. 7. This will be further discussed in the discussion part. However, the situation of #MEV was slightly different. Although the Spearman correlation test indicates a higher correlation between Google Trends and Twitter volume, those two lines do not show much consistency in peaks.
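The Spearman correlation test used above is the Pearson correlation computed on ranks, with tied values sharing their average rank. The following is a minimal standard-library sketch of that definition, not the authors' code. The helper names are hypothetical, and the input is just the five sample rows of Table 2, so the resulting coefficient is illustrative and unrelated to the 0.45 and 0.202 reported for the full series.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation; undefined (division by zero) if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman rank correlation between two equally long sequences."""
    return pearson(average_ranks(x), average_ranks(y))

# The google trend and tweet volume columns from the sample rows of Table 2.
google_trend = [23, 23, 0, 0, 24]
tweet_volume = [3, 2, 1, 1, 1]
rho = spearman(google_trend, tweet_volume)
print(round(rho, 3))
```

On real data, `scipy.stats.spearmanr` computes the same coefficient and also returns a p-value.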
Fig. 6. Time Series for Google Trend and Twitter Volume of #Flashbots.
Fig. 7. Time Series for Google Trend and Twitter Volume of #MEV.
4.2 Answers to RQ2
To respond to this research question, we query the gross profit data of MEV from 2019 to 2022 using the Flashbots API. Gross profit here refers to the amount that attackers acquire from MEV arbitrages. As presented in Fig. 8, we observe three spikes between late 2020 and July 2021. We then compare the gross profit data with the Twitter volume data. Interestingly, we find that each spike in gross profit data corresponds to a spike in Twitter volume data. We also find that after July 2021, the gross profit of MEV remained at a relatively stable and low value while the Twitter volume was at a high level. Thus, the high Twitter volume could not be explained by the gross profit of MEV. Instead, we discovered some potential causes from offline events. In the third quarter of 2021, EIP-1559 [9], a proposal to reform the Ethereum fee market, and a new version of Zero-Knowledge Rollups (ZK-rollups) [2], regarded as one of the complete solutions to prevent MEV problems, were released. We calculated the frequency of the keywords before and after July 2021 separately, and we found that the frequency of the keywords “Solution”, “Prevent”, and “Attack” was higher after July 2021 than before. Thus, the release of ZK-rollups and the new Ethereum transaction mechanism might be the driving force for the high social media appearance of MEV topics.
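The before/after keyword comparison can be sketched as below. This is an illustrative standard-library example under stated assumptions, not the authors' code: the sample tweets are hypothetical, the function name `keyword_share` is made up, the cutoff is taken as 2021-07-01, and frequency is normalized as the share of tweets mentioning a keyword in each period (the paper does not say whether raw or normalized counts were compared).

```python
from collections import Counter
from datetime import date

def keyword_share(tweets, keywords, cutoff):
    """Per keyword, the fraction of tweets mentioning it before and from the cutoff date."""
    totals = {"before": 0, "after": 0}
    hits = {"before": Counter(), "after": Counter()}
    for day, text in tweets:
        period = "before" if day < cutoff else "after"
        totals[period] += 1
        lowered = text.lower()
        for kw in keywords:
            if kw in lowered:  # simple substring match on the lowercased tweet
                hits[period][kw] += 1
    return {
        kw: {p: (hits[p][kw] / totals[p] if totals[p] else 0.0) for p in totals}
        for kw in keywords
    }

# Hypothetical (date, text) pairs standing in for the collected tweets.
tweets = [
    (date(2021, 3, 1), "New sandwich attack on a DEX #mev"),
    (date(2021, 5, 10), "MEV profits spike again"),
    (date(2021, 8, 2), "Is there a solution to prevent MEV?"),
    (date(2021, 9, 15), "zk-rollups prevent the sandwich attack"),
]
shares = keyword_share(tweets, ["solution", "prevent", "attack"], date(2021, 7, 1))
for kw, share in shares.items():
    print(kw, share)
```

Normalizing by the number of tweets in each period keeps the comparison meaningful when the overall tweet volume differs before and after the cutoff, which is the case here.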
Fig. 8. MEV Gross Profit, Tweet Volume, and Google Trend.
Fig. 9. #Flashbots Time-Series Analysis with Landmark Events.
Table 6. Semantic Category of the most Salient Keywords.
Topic | #MEV | #Flashbots
1 | Terminologies | Terminologies
2 | Platforms, companies | Platforms, companies
3 | Concerns, Worries | Mechanisms, auction, etc.
4 | Fairness | Trust, grateful, etc.
5 | Complaints | Future, wishes
5 Conclusion and Discussion
In conclusion, we apply Natural Language Processing (NLP) methods to comprehensively analyze topics in tweets about MEV. Our results show that the tweets discussed profound topics of ethical concern, including security, equity, emotional sentiments, and the craving for solutions to MEV. We also identify the co-movements of MEV activities on blockchain and on social media platforms. Our study contributes to the literature at the interface of blockchain security, MEV solutions, and AI ethics. Regarding the different peak times of Google Trends and Twitter volume for #Flashbots, we identify connections between the peaks and the new version releases in Fig. 9. We observed that every new version of Flashbots is announced through Twitter. Therefore, the peaks of the Google Trends series tend to appear right after the peaks of Twitter volume. The increases in Google and Twitter trends around the new Flashbots version releases are likely to come from the core teams of Flashbots. Future research could further explore how the discussions of #mev and #flashbots differ between users of different backgrounds and change upon blockchain mechanism upgrades [27]. For example, how do the topics differ between the core developer team and the broader community of the general public? In the bigrams and the networks in Fig. 2, for #MEV, the keywords that appear together with MEV (or mev), such as ethereum, machinelearning, and solution, indicate that MEV happens more often on the Ethereum blockchain; for #Flashbots, frequent keywords such as frontrunning and sandwich explain the core problems addressed by Flashbots, namely, sandwich attacks. Further research could study how the topics differ for alternative MEV solutions other than Flashbots. Table 6 shows the topics successfully categorized by LDA. In #flashbots, the most frequently discussed topics concern blockchain terminologies and DEX platforms. Another salient topic under this hashtag expresses people's emotional and ethical sentiments, such as fairness, trust, gratefulness, and expectation. In contrast, the topics under the hashtag #MEV show people's negative concerns about blockchain security issues such as inflation and unfairness. However, existing research points out that the LDA model has problems with sentiment analysis of short text [23]. Therefore, we will consider improving the model in future studies.
Appendix
Table 7. The Glossary Table.
Keywords | Definition | Citation
Blockchain | Blockchain is a form of DLT (Distributed Ledger Technology) that stores data in a chain of blocks. Each block needs to be verified, validated, and then chained to another block | [10]
Ethereum | A blockchain platform | https://ethereum.org/en/developers/docs/intro-to-ethereum/
DEXs | “Decentralized Exchanges where a smart contract or other forms of peer-to-peer network executes exchange functionality.” | [6]
Consensus Mechanism | “The entire stack of protocols, incentives, and ideas that allow a network of nodes to agree on the state of a blockchain.” | https://ethereum.org/en/developers/docs/consensus-mechanisms/
PoW | Proof of Work is the mechanism that once allowed the decentralized Ethereum network to come to a consensus (i.e. all nodes agree) on things like account balances and the order of transactions | [10]
PoS | Proof of Stake is the current consensus mechanism of Ethereum Blockchain. Ethereum uses proof-of-stake, where validators explicitly stake capital in the form of ETH into a smart contract on Ethereum | https://ethereum.org/en/developers/docs/consensus-mechanisms/pos/
The Merge | The event that Ethereum decided to change the PoW mechanisms to PoS since the latter is more secure and less resource-intensive | https://ethereum.org/en/upgrades/merge/
References
1. Sandwich attack (2021). https://www.mev.wiki/attack-examples/sandwich-attack [Accessed: Whenever]
2. MEV-resistant ZK-Rollups with Practical VDE (PVDE) (2022). https://ethresear.ch/t/mev-resistant-zk-rollups-with-practical-vde-pvde/12677 [Accessed: Whenever]
3. Ao, Z., Horvath, G., Zhang, L.: Are decentralized finance really decentralized? A social network analysis of the Aave protocol on the Ethereum blockchain. arXiv preprint arXiv:2206.08401 (2022)
4. Bertino, E., Kundu, A., Sura, Z.: Data transparency with blockchain and AI ethics. J. Data Inf. Qual. (JDIQ) 11(4), 1–8 (2019)
5. Blei, D.M., Ng, A.Y., Jordan, M.I.: Latent Dirichlet allocation. J. Mach. Learn. Res. 3(Jan), 993–1022 (2003)
6. Daian, P., et al.: Flash Boys 2.0: Frontrunning, Transaction Reordering, and Consensus Instability in Decentralized Exchanges. arXiv:1904.05234, April 2019
7. Juels, A.: Fair Sequencing Services: Enabling a Provably Fair DeFi Ecosystem (2020). https://blog.chain.link/chainlink-fair-sequencing-services-enabling-a-provably-fair-defi-ecosystem/ [Accessed: Whenever]
8. Kelkar, M., Deb, S., Kannan, S.: Order-fair consensus in the permissionless setting. In: Proceedings of the 9th ACM on ASIA Public-Key Cryptography Workshop, pp. 3–14 (2022)
9. Liu, Y., Lu, Y., Nayak, K., Zhang, F., Zhang, L., Zhao, Y.: Empirical analysis of EIP-1559: transaction fees, waiting times, and consensus security. In: Proceedings of the 2022 ACM SIGSAC Conference on Computer and Communications Security, CCS ’22, pp. 2099–2113. Association for Computing Machinery, New York (2022). https://doi.org/10.1145/3548606.3559341, https://arxiv.org/abs/2305.02552
10. Nakamoto, S.: Bitcoin: a peer-to-peer electronic cash system. In: Decentralized Business Review, p. 21260 (2008)
11. Network, A.: What’s Automata (IV): Conveyor (2021). https://medium.com/atanetwork/whats-automata-iv-conveyor-93c9335e4f43
12. Pathik, N., Shukla, P.: Simulated annealing based algorithm for tuning LDA hyper parameters. In: Pant, M., Kumar Sharma, T., Arya, R., Sahana, B.C., Zolfagharinia, H. (eds.) Soft Computing: Theories and Applications. AISC, vol. 1154, pp. 515–521. Springer, Singapore (2020). https://doi.org/10.1007/978-981-15-4032-5_47
13. Piatt, C., Quesnelle, J., Sheridan, C.: EDEN Network Whitepaper (2021). https://edennetwork.io/EDEN Network Whitepaper 2021 07.pdf
14. Piet, J., Fairoze, J., Weaver, N.: Extracting Godl [sic] from the Salt Mines: Ethereum Miners Extracting Value, March 2022. arXiv:2203.15930 [cs]
15. Protocol, C.: CoW Protocol Overview (n.d.). https://docs.cow.fi/
16. Řehůřek, R., Sojka, P.: Software framework for topic modelling with large corpora. In: Proceedings of the LREC 2010 Workshop on New Challenges for NLP Frameworks, pp. 45–50. ELRA, Valletta, Malta, May 2010. http://is.muni.cz/publication/884893/en
17. Smith, C.: Maximal extractable value (MeV) (2022). https://ethereum.org/en/developers/docs/mev/ [Accessed: Whenever]
18. Tang, Y., Xiong, J., Becerril-Arreola, R., Iyer, L.: Ethics of blockchain: a framework of technology, applications, impacts, and research directions. In: Information Technology and People (2019)
19. Tong, X., Li, Y., Li, J., Bei, R., Zhang, L.: What are People Talking about in #BlackLivesMatter and #StopAsianHate? In: Proceedings of the 2022 AAAI/ACM Conference on AI, Ethics, and Society (2022). https://doi.org/10.1145/3514094.3534202
20. Ferreira Torres, C., Camino, R., et al.: Frontrunner jones and the raiders of the dark forest: An empirical study of frontrunning on the Ethereum blockchain. In: 30th USENIX Security Symposium (USENIX Security 21), pp. 1343–1359 (2021)
21. Weintraub, B., Ferreira Torres, C., Nita-Rotaru, C., State, R.: A flash (bot) in the pan: Measuring maximal extractable value in private pools. arXiv preprint arXiv:2206.04185 (2022)
22. Wiki, M.: Flashbots (n.d.). https://www.mev.wiki/solutions/faas-or-meva/flashbots
23. Di, W., Yang, R., Shen, C.: Sentiment word co-occurrence and knowledge pair feature extraction based LDA short text clustering algorithm. J. Intell. Inf. Syst. 56(1), 1–23 (2021)
24. Yang, S., Zhang, F., Huang, K., Chen, X., Yang, Y., Zhu, F.: Sok: Mev countermeasures: Theory and practice. arXiv:2212.05111 [cs]. December 2022
25. Zhang, L., Ma, X., Liu, Y.: Sok: Blockchain decentralization. arXiv preprint arXiv:2205.04256 (2022)
26. Zhang, L., Tian, X.: On blockchain we cooperate: an evolutionary game perspective. arXiv preprint arXiv:2212.05357 (2022)
27. Zhang, L., Zhang, F.: Understand waiting time in transaction fee mechanism: An interdisciplinary perspective. arXiv preprint arXiv:2305.02552 (2023)
28. Zhang, Y., Chen, Z., Sun, Y., Liu, Y., Zhang, L.: A comparative study of decentralized banks, Blockchain network analysis. arXiv preprint arXiv:2212.05632 (2022)
Intelligent Virtual Assistants (IVAs): Trust and Zero Trust
Allison Wylde
Cardiff University, Cardiff CF10 3EU, Wales, UK
Abstract. Intelligent virtual assistants (IVAs) for cyber security appear to offer promising solutions to tackle the problem of gaps in the future cyber security workforce. However, this paper argues that a problem emerges as artificial intelligence (AI) partners take on their roles. In AI, implicit trust is the norm, yet in cyber security, zero trust protocols are now mandated. The contribution of this conceptual paper is firstly to present an argument for the deployment of zero trust protocols to effectively manage our future AI partners, and secondly, to set out the first steps in a process to assess the operationalization of zero trust. By leveraging well-established theory on trust from organization and conflict management studies, zero trust can be evaluated. The zero trust assessment involves determining propensity to trust; experience of trust; and a trust assessment based on ability, benevolence, and integrity; followed by a study of acceptance of vulnerability and of risk taking. Implications for practitioners, policymakers and academics include an argument that the deployment, assessment, and management of a zero trust posture will promote explainable, trustworthy, and secure AI. Further studies are called for.
Keywords: Artificial Intelligence · AI · Trust · Zero Trust · Cyber Security
1 Introduction Worldwide the impact of cyber criminals and cyber crime is increasing in frequency scale and impact. At the same time estimates of gaps in the cyber security workforce in 2022, are estimated to be more than 3.4 million cyber security practitioners [1]. The focus of this paper is to explore aspects of trust in the operations of AI trained intelligent virtual assistants (IVAs) as they are deployed for cyber security. Although much is known about AI, IVAs, and trust, to the best of the author’s knowledge little is published regarding how to assess trust in a context of cyber security operations (involving IVAs) that require mandated zero trust protocols. What follows in this paper is not a systematic survey of state of the art, rather a review of key aspects in the context of trying to solve the problem. The contribution of the paper is at the intersection between AI, trust, and cyber security. Through presenting an argument for the adoption of zero trust in the operations of AI and IVAs deployed for cyber security this paper sets out a fresh approach and a new process to allow assessment of the operationalization of zero trust. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 K. Arai (Ed.): SAI 2023, LNNS 739, pp. 101–108, 2023. https://doi.org/10.1007/978-3-031-37963-5_8
The paper is structured as follows. Section 1, the introduction, presents the reader with the central argument that zero trust is an important and overlooked protocol essential for cyber security. In Sect. 2, a discussion on AI with a particular focus on IVAs is presented. In Sect. 3, trust theory from management and organization studies is reviewed, with a focus on the key elements involved in trust building, which are developed further to aid understanding. Section 4 discusses zero trust in the context of standards and guidelines, and Sect. 5 considers zero trust protocols for AI, concerning IVAs deployed in operation. This work is part of a larger study which also examines aspects of workforce acceptance; however, due to space limitations, this is not covered in this paper. In the final Sect. 6, the conclusion, important implications for practitioners, policymakers and academics are presented.
2 AI and Intelligent Virtual Assistants (IVAs)
AI is not new; in fact, it is already in use in banking and in healthcare. This section summarizes AI and IVAs along with some unforeseen risks and challenges with their adoption. AI has been in use since the 1950s, with systems able to make inferences and later, in the 1960s and 70s, to conduct searches for, and the retrieval of, medical literature [2]. Later developments helped improve diagnostic accuracy and procedural accuracy, which resulted in improved patient outcomes [2]. IVAs were developed from chat bots, which themselves were originally developed to provide online conversation through text or speech to simulate a human [3]. IVAs are now more sophisticated, using natural language processing (NLP) to match users’ words, whether as text or spoken, and to process images [4]. Many IVAs use AI to learn through searches, lookups, backend databases, rules, and reference engines that provide answers, which may not always be reliable [5]. The development of IVAs has continued with more recent examples, such as those developed by Apple (Siri on the iPhone), Microsoft, Amazon (Alexa on Echo devices), and Google (Google Assistant on Google-enabled Android devices). Activating the IVA’s listening, recording (voice or image) and processing functions requires a call or wake signal such as ‘Hey Siri’, ‘OK Google’ or ‘Alexa’. Once activated, the IVA carries out tasks, often involving communicating with multiple devices, such as smart speakers or headphones, or smart home fridges, thermostats, lights, and security systems [5]. The IVAs are also linked with third-party vendors, allowing bank balances to be checked and items such as pizzas, video streaming or ride hires to be purchased [5]. In October 2022, a humanoid android device named Ai-Da attended a high-level UK parliament committee hearing and provided evidence using a language model [6].
Not surprisingly, several risks have emerged. Notably, the authors of [5] explore the ecosystem of IVAs and highlight risks such as eavesdropping, voice recording, and hacking, and call for a better understanding of growing security and privacy threats. Key privacy issues are described by [7] as the ‘end of privacy’. Security threats include malicious commands which could result in unwanted ordering, or attacks on property through opening doors and allowing thefts. Unintentional ordering of products has occurred; in one example, a six-year-old in Dallas, Texas, ordered a doll’s house (Sparkle Mansion) and cookies (snacks), much to her parents’ surprise when they arrived [5]. Other instances
include recordings of private conversations being used by Apple and Google for product training purposes [8]. Several cyber security challenges have been identified in these incidents; the main problems are arguably linked to current procedures based on presumptive trust in the systems and networks. These incidents have resulted in unplanned purchases, inconvenience and, at worst, privacy violations [8]. However, in cases where critical processes or high-risk AI systems or environments are involved, such as healthcare, financial services or nuclear power generation, unplanned or unexpected events, such as cyber security breaches or actions, could prove fatal, economically damaging and/or catastrophic. The impacts of high-risk incidents could extend beyond a single individual, organization or country. In a nuclear event, the outcome could extend across several nations or indeed, in the worst case, world-wide. It is thus essential to understand the basic trust functions and assumptions and to create further awareness of potential cyber security threats, as well as to provide approaches that allow for monitoring and prevention. For this paper, the focus is on understanding the nature and the role of trust, discussed next.
3 Trust
The key assumptions and elements in the construct of trust are well-researched. Trust has also been characterized as operating in and at different levels and between different respondents [9]. As the paper is not a systematic survey, the focus here is on arriving at a framework for analysis that is applicable to the context of virtual assistants. Two key approaches from trust research are drawn on to create a framework through which trust can be examined. First, the integrative trust formation model [10] from organization and management studies is examined, followed by a consideration of important components from conflict resolution studies [11]. What is presented is not a complete state of the art, but rather a review of specific material, in the context of trying to create a frame to understand the puzzle at the heart of this paper. Trust is widely viewed as multi-faceted and based on several elements. These include the presence of positive expectations of a trustee’s trustworthiness and an assessment of trustworthiness [10]. Next, trust is assessed based on three components: ability (does an individual have the ability necessary to perform a particular action?), benevolence (does the individual act in good conscience?) and lastly, integrity (does the individual act with integrity?) [10]. This assessment is moderated by the trustor’s propensity to trust and by the trustor’s willingness to accept vulnerability and to take risk in the relationship [10]. In conflict resolution studies, trust development is viewed as based on the foundations of a trustor’s willingness to act and their ability to trust, as moderated by trust experiences [11]. The review above gives rise to a standard definition of trust as based on a trustor’s positive expectations and willingness to be vulnerable to the actions of the trustee, with an expectation that the trustee will undertake an action important to the trustor, irrespective of control or monitoring by the trustor [10].
Table 1. A Throughput Model of Trust Building, Expanding [10] and [11]
Trustor’s antecedents | Trustor’s assessment | Trustor’s actions
Propensity | Propensity | Propensity
Ability to trust | Ability | Acceptance of vulnerability
Trust, prior experience | Benevolence | Assessment of risk
(none) | Integrity (ABI) | Risk taking behavior

In this paper trust is conceptualized based on the definition as presented. Importantly, trust is linked to and tied into an individual trustor’s propensity to trust [10], personality, belief systems and trust experiences [11], set out in Table 1, above. Trust has also been studied at different levels and with different referents, in teams, organizations and institutions [9]. Trust has also been examined in non-person based relations, for example, trust in a policy [10] or a technology [12]. What is important in the context of this paper is the idea that trust exists in entities outside of, and indeed beyond, human relations [12]. Considered next are a definition for zero trust, and zero trust operations to help understanding.
4 Zero Trust
The cyber security approach zero trust was first proposed in 2010 [13]. Zero trust counters an over-dependence on, and the presence of, presumptive trust and trusted systems [13, 14]. Key views of zero trust are discussed next. In the context of cyber security, trust is viewed as a vulnerability [13]. In this paper zero trust is based on no presumptive trust, and instead on a risk-based approach to granting trust [13]. Zero trust relies on continuous monitoring and verification [13]. The zero trust approach deals with limitations in traditional fixed boundary or perimeter-based trust approaches. Trust cannot be granted based on location; in fact, in modern networked, cloud-based, and internet of things (IoT) environments, organizational boundaries no longer exist [13, 14]. Recent policy and guidelines from international bodies and governments promote zero trust and, in the US, zero trust is mandated [15]. Guidance on zero trust implementation involves the verification of identity, whether of an individual, device, process, and/or service, such as IoT [14], as well as an assessment of context and state [13]. The NIST definition of zero trust involves minimizing uncertainty and enforcing decisions based on least privilege, per-request access in information systems in the face of a network which is viewed as compromised [16]. A zero trust architecture therefore comprises an enterprise’s cyber security plan (zero trust based), component relationships, workflow planning, and access policies [16]. In sum, a NIST zero trust enterprise is the sum of the physical and virtual network infrastructure and zero trust policies [16]. NIST defines the terms user, subject, and resource: users and subjects are entities that may request information from resources (assets, applications, workflows, network accounts, services, and devices), which may substitute as data [16]. Interestingly, the term user is reserved for humans, while subject is the standard term for all other entities [16].
Zero trust minimizes access to identified subjects and assets requiring access, based on an authentic
subject/user and valid request, while continuously authenticating and authorizing each request [16]. The process, referred to as policy decision point and policy enforcement point (PDP/PEP) judgements, may be managed by trust algorithms [16]. NIST also adds that zero trust was in operation long before it was named zero trust [16]. The UK NCSC’s ten principles for zero trust include: knowledge of architecture; the creation of a single strong identity; strong device identity; authentication of everything; no trust in any network; and the selection of services designed for zero trust [14]. The NCSC’s approach involves policy and continuous, authorized decision-making to help in the practical implementation of zero trust [14]. Prominent approaches and protocols in zero trust application, such as zero knowledge and garbled circuits [17], are not discussed here due to limitations of scope. Operationalizing zero trust therefore relies on continuous decision-making and monitoring to ensure that confidential and sensitive information is not discoverable. Next, the application of trust and zero trust in the context of AI and IVAs is discussed.
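The per-request, least-privilege decision described in this section can be sketched in a few lines of Python. This is an illustrative sketch only and not NIST's trust algorithm: the names `Request`, `POLICY` and `decide`, the subject identifiers, and the two boolean signals standing in for authentication and device state are all hypothetical assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    """One access request; every request is evaluated on its own."""
    subject_id: str       # a user (human) or subject (device, service, IVA)
    credential_ok: bool   # assumed result of authenticating this request
    device_healthy: bool  # assumed context/state signal for this request
    resource: str
    action: str


# Hypothetical least-privilege policy: each (subject, resource) pair lists
# only the actions that subject needs. Anything not listed is denied.
POLICY = {
    ("iva-01", "alerts"): {"read"},
    ("analyst-7", "alerts"): {"read", "close"},
}


def decide(req: Request) -> bool:
    """Policy decision: deny by default, with no presumptive trust."""
    if not req.credential_ok:      # identity not verified for this request
        return False
    if not req.device_healthy:     # context/state check failed
        return False
    allowed = POLICY.get((req.subject_id, req.resource), set())
    return req.action in allowed   # least privilege: explicit grants only


if __name__ == "__main__":
    print(decide(Request("iva-01", True, True, "alerts", "read")))   # True
    print(decide(Request("iva-01", True, True, "alerts", "close")))  # False
```

Because `decide` is called for every request and holds no memory of earlier approvals, a subject that was allowed a moment ago is checked again in full, which is one simple way to read continuous authentication and authorization.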
5 IVAs: Trust, Zero Trust
As highlighted above, IVAs are in operation now, and their use is set to grow in the domain of cyber security. What this paper suggests is that, in light of the challenges highlighted earlier, it is necessary to address the pressing issue of trust. Trust in this domain is explored next. As a starting point, one emerging policy is considered: the US Government’s October 2022 AI Bill of Rights [18, 19]. This is followed by a consideration of the expanded trust models presented above, with the context of zero trust included. At the time of writing, the US Government’s 2022 AI Bill of Rights [18, 19] has just been released, setting out five principles (underpinned by trust) to ensure that AI is trusted and trustworthy and protects the American public in the age of AI. The five principles offer guidance on the design, use, and deployment of automated systems, encompassing: safe and effective systems; algorithmic discrimination protections; data privacy; notice and explanation; and human alternatives, consideration, and fallback [19].
Several important aspects of trust receive attention: management, in the form of stewardship; service, in terms of independence and genuine unfiltered access to the whole system; trustworthiness in the design, development, use and evaluation of AI products and services; innovation, in approaches to ensure trustworthiness, accuracy and explainability [20]; cyberspace [21], which should be secure and trustworthy; bias (systemic, statistical and human), which should not chip away at public trust; data brokers, who should be prevented from breeding corrosive distrust; public understanding and knowledge, which should be fostered through better explainability, to allow humans to appropriately trust and effectively manage the emerging generation of AI partners [20], addressing opaque decision-making processes which result in a lack of public trust; and finally, recognizing the importance of placing trust in people and not technology [19]. Although these key approaches provide a basis for an assessment of trust, what appears not to be considered is a posture of zero trust, discussed next. This paper suggests that a posture based on zero trust could be harnessed [13] to operationalize the US Government’s calls for the appropriate trust and effective management of emerging AI partners [19]. As highlighted, current approaches are based on
presumptive trust [13], which can result in challenges such as unknown cyber security threats and/or breaches. Adopting zero trust as a risk-based approach could arguably overcome these current limitations and challenges. Implementing zero trust relies on an initial assessment of identity, whether this is an individual, or a device, service, or software [13, 15]. Once the identity has been verified, trust can be granted on a least privilege principle [14]. A posture of zero trust also involves continuous monitoring and verification [13]. Deploying zero trust could achieve the goal of securing a trustworthy cyberspace [19]. The operations of IVAs for cyber security in the context of zero trust are considered next. This paper suggests that responsible stewardship, trustworthiness, and improved understanding [19] of IVAs could be demonstrated through implementing zero trust [13]. Returning to Table 1, above, all activities in the trust throughput model [10, 11] are reliant on the trustor’s propensity. In zero trust, propensity is based on no presumptive trust [13–15, 20]. This approach contrasts with trust, where propensity is viewed as multidimensional and presumptive, that is, implicitly trustful [10]. Given the viewpoint of zero trust, the antecedents of trust, involving both a trustor’s ability to trust and their experiences of trust [11], are dialed into a zero trust posture [20]. Such a position serves as a positive reinforcement to the posture of zero trust [20]. In the next stage of the throughput model, the assessment of trust, in this example for zero trust, the propensity is again set as no presumptive trust [20]. In the assessment of ABI, as all traffic on the network is viewed as hostile and the state of the network is founded on a view of trust as a vulnerability [13–15], zero trust is once more reinforced [20].
In zero trust, if the assessment of identity (of device or service in the case of an IVA) is verified, then confidence may be gained in the user and trust may be granted [14, 15]. In the next phase of the trust model, the trustor’s actions include assessing risk, accepting vulnerability, and taking risks [10]. In zero trust, policy and enforcement decisions are based on authentication and authorization; these occur with no acceptance of risk or vulnerability [14, 15, 20]. Summing up, the approach presented here could provide steps to overcome cyber security challenges and help inform policy decision point and policy enforcement point (PDP/PEP) judgements [14, 16].
6 Concluding Remarks
This paper has examined IVAs and the problems of trust, zero trust and AI, and has presented reflections on current policy at the intersection between AI, trust, and cyber security. The contribution of the paper is to argue for the adoption of zero trust in the operations of AI and IVAs deployed for cyber security, along with the elaboration of a new approach to assessing and evaluating zero trust. Suggested benefits of adopting these approaches, together with promising avenues for future research and implications, are discussed next. This paper has demonstrated that leveraging well-established trust theory from organization studies and conflict resolution studies allows progress to be made in improving the steps in decision-making for policy and policy enforcement [14, 18]. Through this approach, users can benefit from transparency and understanding, helping to fulfill the
requirements for Trustworthy AI [20] and achieving the goals of the 2022 US AI Bill of Rights [21]. Promising avenues for future research include further study of decision-making processes in policy development and in enforcement as zero trust is actioned. Further development of the conceptual model as presented could help practitioners better understand the key issues involved in building confidence in, and effectively assessing, managing, and monitoring, our future generations of AI partners [20]. In conclusion, important implications for practitioners, policymakers and academics presented in this paper include an argument for the implementation of zero trust protocols and a process for implementation. It is hoped that scholars, practitioners, and policy makers will take up this call for further study and development.
Acknowledgements. Many thanks to colleagues for helpful discussions both online and in person. Thank you to the anonymous reviewers for their comments and observations that have improved this work.
References
1. ISC2 2022, Cyber security workforce study 2022. https://www.isc2.org//-/media/ISC2/Research/2022-WorkForce-Study/ISC2-Cybersecurity-Workforce-Study.ashx. Accessed 19 Oct 2022
2. Kaul, V., Enslin, S., Gross, S.A.: History of artificial intelligence in medicine. Gastrointest. Endosc. 92(4), 807–812 (2020). ISSN 0016-5107
3. Adamopoulou, E., Moussiades, L.: An overview of chatbot technology. In: Maglogiannis, I., Iliadis, L., Pimenidis, E. (eds.) AIAI 2020. IAICT, vol. 584, pp. 373–383. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-49186-4_31
4. Schmidt, B., et al.: Industrial virtual assistants: challenges and opportunities. In: Proceedings of the 2018 ACM International Joint Conference and 2018 International Symposium on Pervasive and Ubiquitous Computing and Wearable Computers, pp. 794–801 (2018)
5. Chung, H., Iorga, M., Voas, J., Lee, S.: Alexa, can I trust you? Computer 50(9), 100–104 (2017)
6. The Guardian. https://www.theguardian.com/technology/2022/oct/11/typos-and-shutdowns-robot-gives-evidence-to-lords-committee. Accessed 31 Oct 2022
7. Casilli, A.A.: Against the hypothesis of “the end of private life”. Revue française des sciences de l’information et de la communication [in English], 3 (2013). https://journals.openedition.org/rfsic/630. Accessed 30 Oct 2022
8. Forbes. https://www.forbes.com/sites/jeanbaptiste/2019/07/30/confirmed-apple-caught-in-siri-privacy-scandal-let-contractors-listen-to-private-voice-recordings/. Accessed 30 Oct 2022
9. Fulmer, A., Gelfand, M.: At what level (and in whom) we trust: trust across multiple organizational levels. J. Manag. 38(4), 1167–1230 (2012)
10. Mayer, R., Davis, J., Schoorman, F.D.: An integrative model of organizational trust. Acad. Manag. Rev. 20(3), 709–734 (1995)
11. Deutsch, M.: Trust and suspicion. J. Conflict Resolut. 2(4), 265–279 (1958)
12. Mcknight, D.H., Carter, M., Thatcher, J.B., Clay, P.: Trust in a specific technology: an investigation of its components and measures. ACM Trans. Manag. Inf. Syst. 2(2), 1–25 (2011)
13. Kindervag, J.: No more chewy centers: Introducing the zero trust model of information security. Forrester Research, Sept 14, updated Sept 17 (2010). https://media.paloaltonetworks.com/documents/Forrester-No-More-Chewy-Centers.pdf. Accessed 12 Dec 2022
14. SH., S.: Zero trust architecture design principles. National Cyber Security Centre (NCSC), 20/Nov/2019. https://www.ncsc.gov.uk/blog-post/zero-trust-architecture-design-principles. Accessed 30 Oct 2022
15. The White House. https://www.whitehouse.gov/?s=zero+trust. Accessed 31 Oct 2022
16. Rose, S., Borchert, O., Mitchell, S., Connelly, S.: Zero trust architecture, NIST special publication 800-207, NIST, Aug/2020. https://nvlpubs.nist.gov/nistpubs/SpecialPublications/NIST.SP.800-207.pdf. Accessed 30 Oct 2022
17. Wylde, A.: Zero trust: never trust always verify. In: 7th International conference on Cyber Security for Trustworthy and Transparent Artificial Intelligence (CYBER SA 2021, IEEE), pp. 1–4 (2021)
18. The White House. https://www.whitehouse.gov/ostp/news-updates/2022/10/04/blueprint-for-an-ai-bill-of-rights-a-vision-for-protecting-our-civil-rights-in-the-algorithmic-age. Accessed 31 Oct 2022
19. The White House. https://www.whitehouse.gov/wp-content/uploads/2022/10/Blueprint-for-an-AI-Bill-of-Rights.pdf/. Accessed 31 Oct 2022
20. NIST. https://www.nist.gov/system/files/documents/2022/03/17/AI-RMF-1stdraft.pdf
21. NSF. https://www.nsf.gov/pubs/2022/nsf22517/nsf22517.pdf
Implementation of a Smart House Using a Programmable Logical Controller (PLC)
Mustafa Ayad1(B), Oday Alkaragole1, Dominic Comito1, and Khaled Ayad2
1 The State University of New York at Oswego, Oswego, NY 13126, USA
[email protected]
2 Computer Department, Al Jafara University, Tripoli, Libya
Abstract. Programmable logic controllers (PLCs) are industrial, solid-state digital computers that emulate the interconnection of many relays to perform logic tasks. A PLC can be utilized in many ways, mainly in industrial settings such as the steel industry or power generation companies. However, PLCs can also be used to make a house smarter. This paper presents a scale model of a house showing the capabilities of PLCs as an alternative to Smart Home Devices. Our designed smart house has two full rooms and a garage, each designed with realistic settings. The first room, the bedroom, has automatic window blinds, overhead lights connected to the blinds, and a cooling system. Downstairs, in the living room, there is front door security, an overhead light that is also connected to the blinds, and another cooling system. The cooling system turns on when the room temperature rises above the selected comfort temperature. In addition, the front door security is triggered whenever the door is opened, and an alarm sounds until the switch to turn it off is flipped. We believe the listed features of the smart home are enough to convince people to make their homes smart too. The Click PLC, a Human Machine Interface (HMI), sensors, actuators, and ladder logic are used to realize the project. The primary purpose of this work is to gain experience working with widely used equipment while also finding a way to make life more efficient, convenient, and comfortable. Finally, the Smart House prototype is implemented, and the primary targeted functions are tested successfully.
Keywords: Programmable Logic Control · Click PLC · Ladder Logic · HMI
1 Introduction
The Smart House concept emerged four decades ago as engineers started to apply the concept of smart buildings in real projects. In recent times, smart homes have been designed to make life around the house more convenient and efficient [1]. Smart houses use advanced technologies with remote or centrally controlled functions and services. In a smart home, the residents’ wishes and needs concerning all or some parts of equipment and functionality are prioritized [2]. In today’s modern era, automation is rapidly advancing and has become a part of our homes and offices [3].
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 K. Arai (Ed.): SAI 2023, LNNS 739, pp. 109–124, 2023. https://doi.org/10.1007/978-3-031-37963-5_9
Automation can generally be described as a process following
pre-determined sequential steps with little or no human effort. Automation is provided using various sensors to observe the production processes, together with actuators and different techniques and devices [4]. The home automation system must therefore be designed to be effective, easy to apply, and affordable. The Programmable Logic Controller (PLC) is considered an alternative to such systems. PLCs are commonly used in almost every field of industrial automation, including security monitoring, energy consumption management, and the control of machines and automatic production lines [5]. A PLC is an electronic device designed for industrial use that controls a system or groups of systems through analog/digital data input/output terminals. The PLC system provides general control utilizing inherent functions of timing, counting, data processing, comparing, sorting, data transfer, and arithmetic operations. The use of a PLC is also advantageous for several reasons: changes can be made in software, and because data is retained for a long time during a power failure, the algorithm resumes once power is restored. The programming language used to program a PLC is ladder logic. Ladder logic represents a program by a graphical diagram based on relay logic hardware circuit diagrams. It is primarily used to develop software for PLCs used in industrial control applications. Programs in this language resemble ladders, with two vertical rails and a series of horizontal rungs between them [6]. Ladder logic is widely used to program PLCs that require sequential control of a process or manufacturing operation. Ladder logic is useful for simple but critical control systems or for reworking old hardwired relay circuits. As PLCs have become more sophisticated, they have also been used in complex automation systems. The ladder logic program is often used along with a human-machine interface (HMI) program operating on a computer workstation [6].
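To make the rung idea concrete, the following Python sketch mimics how a PLC repeatedly scans its inputs and solves one ladder rung as a Boolean expression. It is only an illustration under stated assumptions and not the ladder program used in this project: the rung models the front-door alarm from the abstract (the alarm sounds once the door opens and stays on until the off switch is flipped), and the names `solve_rungs`, `run_scans`, `door_open` and `alarm_off_switch` are hypothetical.

```python
def solve_rungs(inputs: dict, alarm_was_on: bool) -> dict:
    """Solve one scan of a single seal-in (latching) rung.

    Ladder form of the rung (contacts left, coil right):
        --[ door_open ]--+--[/ alarm_off_switch ]--( alarm )
        --[ alarm     ]--+
    """
    door_open = inputs["door_open"]
    alarm_off = inputs["alarm_off_switch"]
    # Parallel branches are OR, series contacts are AND, [/ ] is NOT.
    alarm = (door_open or alarm_was_on) and not alarm_off
    return {"alarm": alarm}


def run_scans(input_sequence: list) -> list:
    """Repeat the scan cycle: read inputs, solve rungs, update outputs."""
    alarm = False
    history = []
    for inputs in input_sequence:
        alarm = solve_rungs(inputs, alarm)["alarm"]
        history.append(alarm)
    return history


if __name__ == "__main__":
    sequence = [
        {"door_open": False, "alarm_off_switch": False},  # idle
        {"door_open": True, "alarm_off_switch": False},   # door opens
        {"door_open": False, "alarm_off_switch": False},  # door closes again
        {"door_open": False, "alarm_off_switch": True},   # off switch flipped
    ]
    print(run_scans(sequence))  # [False, True, True, False]
```

The second branch of the rung feeds the alarm output back as a contact, which is why the alarm keeps sounding after the door closes; this seal-in pattern is a common way to build a latch out of relay-style logic.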
This paper attempts to improve on the idea of smart home devices by using devices different from those usually used. The idea is to use a PLC interfaced with sensors and actuators and program it to perform functions that make life around a house convenient and efficient. Another reason for using PLCs is that they do not depend on a Wi-Fi signal as smart home devices usually do. If the Wi-Fi were ever to cut out, the functions around the house would still run exactly the same. The smart house model used in this paper is a scale model of a house for display purposes, and utilizing these functions in a real home would not be much different, just ten times the scale. For financial reasons, it was necessary to choose the most practical yet challenging functions when deciding the final project specifications. The number of functions a PLC can handle is not fixed, so in the future, more input and output modules can be added to increase the number of functions this project can have. Throughout the design process, there were many design flaws and obstacles on the way to the final product, but the skills learned throughout training helped to analyze and resolve the problems that were faced. Also, using PLCs introduces working with the type of equipment widely used in industry today. The paper is organized as follows. Section 2 presents the problem formulation, and Sect. 3 presents some related work. Then, Sects. 4 and 5 introduce the project specifications and the PLC description. Next, the smart house schematic and analysis are shown in Sects. 6 and 7. Finally, the conclusion is presented in Sect. 8.
2 Problem Formulation
Throughout the entire process of making a Smart Home, there were many successes and failures. Researching and learning how a PLC works and how to program it was essential before starting the design process. Once we understood its functionalities and how to use the software, we could formulate the smart house idea and the project design. When first coming up with the idea of utilizing PLCs to create a smart house, there were many questions about where to start, but more technical questions came as the design process started. At the infancy of the project, the original design was to use a dollhouse purchased online with the functionalities as follows:
• Install a contact sensor in the mailbox. Then, when mail gets delivered, the AOL theme “you’ve got mail” plays throughout the home, and you will receive a text message.
• Receive a text alert when a garage door is left open.
• The buzzer will sound if the front door is left open.
• Need to be reminded to take out the trash? Program a notification to appear on every touchscreen and mobile device at the same time once a week.
• The “Wake up” scene opens the curtains, gradually raises lighting over five minutes and sets the temperature to your preferred level.
• The “Sleeping” scene closes the curtains, shuts the lights off and lowers the temperature to your preferred level.
However, we decided to build a house model from plywood boards instead of buying a house online. The functionalities list was a good start, but some of our ideas came with challenges as the project progressed.
Flipping Motors Polarity. The blinds and the garage door motors are geared 12 V DC motors. The one used for the blinds is 10 rpm, and the motor used for the garage door is 50 rpm because the garage door size is approximately five times larger.
The first part of the design process is straightforward: the blinds move in one direction, and if the polarity of the motor is flipped, they change direction. Mechanical relays were used to switch the polarity without having to do so manually. Relays are devices that switch the power supplied to another device on and off. Two relays are wired oppositely and connected to two input buttons. These two buttons select the direction in which the motors turn, either raising or lowering the blinds or garage door.
Saving Output. With the amount of functionality intended for the project, there were not enough digital outputs. Two digital outputs were used on the blinds alone to make them rise and fall, one digital output was used for the lights in series, and one digital output for the buzzer for the front door security. With all four digital outputs in use, the project was left without a garbage reminder, a mail indicator, and a garage door. In order to save the garage door, two switches are used so that both motors share the same two outputs. One switch is used to connect the forward motion relay to each motor, and the other switch is used to connect the reverse motion relay to each motor. By flipping the
switches in the same orientation, each motor can run separately from the same lines of programming.
Not Enough Power. When connecting the first motor to the relays, there was a problem. The power was going into the relay but not coming out of the relay to power the motor. The indicator light turned on once the power went through, but it was dimmer than when power was directly input into the relay from the function generator. After troubleshooting the relay and confirming that it was functioning correctly, adding another power supply seemed to be the solution. When the relay was powered with another 24 V, its indicator light was brighter and it output exactly 24 V to the motors.
The Number of Digital Inputs and Outputs. The CLICK PLC has four digital inputs along with four digital outputs. The intention for the digital card was to have blinds and a garage door going up and down, two indicator lights (one connected to a contact sensor in a mailbox and the other on a timer as a reminder to take the trash out), front door security, temperature monitoring, and lights in two of the rooms. The inputs include two buttons, a contact sensor, a thermometer, a switch, and a limit switch, and the outputs include the buzzer, two motors, two indicator lights, two lights, and two fans. The lights and the fans were put in parallel, giving the outputs the same amount of voltage while sharing an output. After saving as many outputs as possible, there were still too many outputs to support without purchasing another input and output module with four more digital inputs and outputs. Instead of purchasing another module card, we decided to use an HMI to model an actual residential smart device instead of having our system hardwired with buttons and switches. The HMI allows the inputs to be displayed via a touchscreen connected to a smartphone.
Analog Output Current.
The “air cooling system” works as follows: whenever the temperature read from the thermometer in the house rises above 80 °F, the PLC outputs 2 V, turning the fans on at a slow speed. When the temperature rises above 90 °F, the PLC outputs 5 V, turning the fans on at full speed. The problem is that the analog output is specified for only 5 V and 4–20 mA. The available fans are rated 5 V at 0.15 A, 12 V at 0.1 A, and 24 V at 0.08 A. At first, the 5 V fans were purchased in the hope that they would still work, just not at full efficiency, but the voltage dropped by about 4 V whenever the fan was connected, so instead of 5 V, only 0.9 V reached the fan. Next, the 12 V fans were purchased, with the same result. Instead of giving up on this idea, the 24 V fans were purchased; they can run off the 5 V output but require a jump start. To turn on, the fans need to be flicked, after which they run consistently.
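The two-level fan command and the current mismatch described above can be sketched in a few lines of Python. This is only an illustration, not the ladder program used in the project: the function and constant names are hypothetical, and treating 20 mA (the top of the quoted 4–20 mA range) as the most current the analog output can source is an assumption.

```python
# Illustrative sketch only; names are hypothetical and not taken from the PLC program.
LOW_THRESHOLD_F = 80.0        # fans start at slow speed above this temperature
HIGH_THRESHOLD_F = 90.0       # fans run at full speed above this temperature
MAX_OUTPUT_CURRENT_A = 0.020  # ASSUMPTION: upper end of the quoted 4-20 mA range

# Rated currents of the three fans mentioned in the text, in amperes.
FAN_RATED_CURRENT_A = {"5 V fan": 0.15, "12 V fan": 0.10, "24 V fan": 0.08}


def fan_command_volts(temperature_f: float) -> float:
    """Return the analog output voltage for a given temperature reading."""
    if temperature_f > HIGH_THRESHOLD_F:
        return 5.0
    if temperature_f > LOW_THRESHOLD_F:
        return 2.0
    return 0.0


def current_shortfall_a(rated_current_a: float) -> float:
    """Return how much the fan's rated current exceeds the assumed output limit."""
    return max(0.0, rated_current_a - MAX_OUTPUT_CURRENT_A)


if __name__ == "__main__":
    for temp in (75.0, 85.0, 95.0):
        print(f"{temp:.0f} F -> {fan_command_volts(temp):.1f} V")
    for name, amps in FAN_RATED_CURRENT_A.items():
        print(f"{name}: shortfall {current_shortfall_a(amps):.3f} A")
```

Under that assumption every listed fan asks for more current than the output can supply, which is consistent with the voltage sag reported above; the 24 V fan simply has the smallest shortfall.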
3 Related Work
There have been many projects utilizing PLCs to implement various ideas, such as smart home control, but authors approach their designs differently. In [7], the project uses Siemens PLCs to control the lights, blinds, temperature, irrigation, and security. Their project has many input and output modules, so the functions can be used in many different ways. For example, the lights can be turned on using switches, motion control,
and a crepuscular sensor. The irrigation function is switched on automatically according to the soil humidity measured by a humidity sensor. The PLC reads its value through the analog input. If the soil humidity is below the set reference value, the PLC opens an electro-valve circuit, activating the irrigation and watering the grass. If, on the other hand, the soil humidity read by the sensor is higher than the maximum reference humidity, the PLC commands the closing marker, breaks the electro-valve circuit, and stops the irrigation based on the initial specifications. In [8], the authors discussed the fundamental PLC batch process control techniques. They used a batch process with two inlet pipes and one outlet pipe to demonstrate a structured approach to PLC implementation. The batch process was reduced to simple level, temperature, and mixing control operations, simplifying the overall programming task and focusing more on the critical safety and operational issues. In [9], the authors attempted to combine a PLC with a home environment to achieve multi-faceted home automation control functions. Their study used a human-computer interaction interface to humanize the integration of home control-related processes. The authors rely on ladder diagrams to perform various home automation control functions, including anti-theft control, lighting control, temperature control, fire control, and sleep mode.
Fig. 1. a. Living Room and b. Bedroom
In our paper, we designed a small house with two full rooms and a garage, each with realistic settings. The first room, the bedroom, has automatic window blinds, overhead lights connected to the blinds, and a cooling system. Downstairs, the living room has front door security, an overhead light that is also connected to the blinds, and another cooling system. The cooling system turns on when the room temperature
rises above the selected comfort temperature. The front door security works as follows: whenever the door is opened, an alarm sounds until the switch that turns it off is flipped, as shown in Fig. 1.
4 Project Specifications
The entire project can be broken into three rooms with different functionalities. First, the bedroom has automatic blinds connected to the overhead lights, and temperature monitoring to keep the room at a consistent temperature. Next, the downstairs room, or kitchen/living room, has front door security, a hidden control switch, and the same temperature monitoring. Last, the garage has an automatic garage door.
Bedroom. The bedroom is programmed with two scenes. The first is the going-to-bed scene, in which the blinds descend, the lights turn on while the resident gets ready for bed, and the lights turn off after a set amount of time. The second is the waking-up scene: after the lights turn on in the morning to wake the resident, the blinds open and the lights turn off. The blinds are driven by a 12 V, 10 rpm motor. The blinds themselves are made from a 6 × 10” felt rectangle folded in an accordion fashion. String was threaded through the center of the folds on either side of the felt and run through a drawer handle, which serves as a curtain rod, to the motor. The string was attached to the motor shaft by passing it through a piece of heat shrink tubing that was then shrunk onto the shaft. A thermometer is installed in the bedroom’s top left corner for temperature monitoring. This thermometer is linked to a 24 V fan in the top right corner of the room; if the temperature reading rises above room temperature, the cooling fan turns on at a low setting. As the temperature reading rises, the power provided to the fans rises, increasing their speed. When the temperature reading falls back to room temperature, the fans slow down and turn off once the room is back at the desired temperature.
Kitchen/Living Room. When the front door is opened, the door presses on a limit switch, activating a timer in the program.
After this timer finishes counting, it turns on the output assigned to it, which is a 12 V buzzer. The only way to turn this alarm (buzzer) off is with a switch hidden in the house. The switch is hidden so that, if a burglar did break in, the only people who know how to turn the alarm off would be family or close friends of the residents. If the alarm were to sound for a certain amount of time, it would alert anyone at home that they are in potential danger and call the police. Also, in the top right corner of the room is another 12 V fan tied to the same temperature monitor as the bedroom. This keeps the house at a consistent temperature throughout.
Garage. The garage door was made similarly to the blinds. It consists of a 1 × 1’ piece of felt with six 2 × 2” pieces sewn symmetrically onto it to give it a realistic look. The felt was attached to the house model with three screws and nuts through the top of the garage. String was then run from the bottom of the felt, up through the middle of the felt and the top of the garage, to the center of another motor. Since both motors run off the same lines of
program and the garage is five times larger than the window, the garage is controlled using a 12 V, 50 rpm motor. Like the blinds, the garage door can go up and down.
HMI. The HMI is used to control the outputs of the PLC. Its main purpose is to model a residential smart home device. The HMI used in this project is a C-more Micro Panel, programmed using the corresponding C-more software. Programming the HMI mainly consisted of connecting the inputs and outputs of the PLC to signals that the HMI can read. The project comprises five screens that can be switched back and forth. On the first screen, one button activates the forward motion of the blinds motor to raise the blinds, and the other starts the reverse motion to lower them. An indicator light on this screen shows whenever the blinds are left open. The second screen also has two buttons, but these are for the garage door motor, along with an indicator light showing whenever the garage is left open. The third screen has an indicator light that shows when the front door is open, that is, whenever the door presses the limit switch, and indicates that the alarm is on. The fourth screen is for temperature monitoring. It has a temperature display in degrees Fahrenheit, along with two indicator lights, one for when the fan is on in the bedroom and the other for when the fan is on in the kitchen. The fifth screen is the “home” screen, which consists of shortcuts to each screen so that you can switch back and forth between the home screen and every function screen.
5 PLC Description
The PLC used in this project is a CLICK C0-12ARE-1-D, shown in Fig. 2, an Ethernet analog series PLC that requires a 24 V DC power supply. Its technical specifications include Ethernet and serial ports, four discrete inputs, two analog input channels, four discrete (relay) outputs, and two analog output channels. The second important device for the PLC is the Central Processing Unit (CPU). It is microprocessor-based and supports arithmetic operations, logic operators, block memory moves, a computer interface, local area networking, functions, etc. Moreover, the CPU performs a large number of self-checks on the PLC controller so that any errors are discovered early. An example of an industrial CPU, the Siemens SIMATIC S7-300 [10], can be seen in Fig. 3.
Fig. 2. CLICK PLC
Fig. 3. Industrial CPU
The internal paths along which the digital signals flow within the PLC are called system buses. The system consists of four busses:
• The CPU uses the data bus for sending data between the different elements,
• The address bus to send the addresses of locations for accessing stored data,
• The control bus for signals relating to internal control actions,
• The system bus is used for communications between the I/O ports and the I/O unit.
The PLC system memory consists of ROM and RAM. The system ROM gives permanent storage for the operating system and the fixed data used by the CPU, while system RAM is used for data. The RAM is where information is stored on the status of input and output devices and the values of timers and counters, and other internal devices. EPROMs are used for ROMs that can be programmed, and then the program is made permanent, as in Fig. 4.
Fig. 4. System Bus
Regarding the PLC I/O sections, inputs monitor field devices, such as switches and sensors, while outputs control other devices, such as motors, pumps, solenoid valves, and lights. The PLC used in this project has four digital inputs, four digital outputs, two analog inputs, and two analog outputs. As for power supplies, PLC controllers typically run on either 24 VDC or 220 VAC. Some PLC controllers have the power supply as a separate module, while small and medium
series already contain the supply module. The power supplies used in this project are an Omron S8VK-G01524 and a Mean Well open frame [11]. The S8VK-G01524, seen in Fig. 5a, is a DIN rail 24 V AC-to-DC power supply with one output. It supports two-phase input, operates over a –40 to 70 °C temperature range, and is mainly intended for power management. The Mean Well PD-45 open frame, seen in Fig. 5b, is a universal 5 and 24 V AC-to-DC power supply with a low leakage current of less than 0.5 mA. It is equipped with short circuit, overload, and over-voltage protection and is cooled by free air convection [12, 13]. The HMI used for this project is the C-more micro panel, seen in Fig. 6, which is the operator interface that controls the functions of the PLC. A Human Machine Interface provides push-button panels or switch banks and a textual or graphical view of system conditions and operations. HMIs offer robust monitoring, control, status reporting, and many other functions. The panel is programmed using signals defined in the PLC program and connected to ports on the HMI, and it can be laid out in any way on up to 999 different display screens. The display screens were created to suit the user’s needs [14].
Fig. 5. a. Omron S8VK-G01524 b. Mean Well PD-45 Open Frame
Fig. 6. C-More Micro Panel
Software. The PLC chosen for this project was CLICK with internal digital/analog, input and output modules to make it affordable. In more complex applications, we could use separate input and output modules. The software we will use for this PLC is
called CLICK Programming 3.01. The CLICK Programming Software is designed to be a convenient and adaptable application. The tools, layout, and software interaction provide step-by-step guidance that makes it easy to navigate. The programming language used in this software is ladder logic, which depicts the program as a graphical diagram derived from the circuit diagrams of relay logic hardware. The ladder logic for the three rooms is divided into three figures, as seen in Figs. 7, 8, and 9.
Fig. 7. The Ladder Logic of Room 1
Figure 7 shows the ladder logic for the front door security. The limit switch is connected to input X001, which is configured as a normally open input. When the limit switch is activated, the timer on the right side of this rung counts to its set point of 5 s. After 5 s, the output of T1 activates, energizing the bottom rung, which turns on output Y001, assigned to the buzzer. Finally, the security selector switch is the control switch hidden behind the fridge in the kitchen that turns off or resets the alarm.
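The behaviour of these rungs can be imitated with a small scan-based simulation in Python. This is a sketch under stated assumptions, not a transcription of Fig. 7: the class and parameter names are hypothetical, the timer is modelled as a plain on-delay timer, and the buzzer is assumed to latch until the hidden switch is used, which matches the description of the alarm but is not confirmed by the ladder diagram.

```python
from dataclasses import dataclass


@dataclass
class DoorAlarm:
    """Scan-based model of the front door security rungs (illustrative only)."""

    setpoint_s: float = 5.0  # T1 set point quoted in the text
    elapsed_s: float = 0.0   # time accumulated while the door input is on
    buzzer: bool = False     # stands in for output Y001

    def scan(self, door_switch: bool, reset_switch: bool, dt_s: float) -> bool:
        """Run one scan; door_switch stands in for X001, reset_switch for the hidden selector."""
        if reset_switch:
            # The hidden switch clears both the timer and the alarm.
            self.elapsed_s = 0.0
            self.buzzer = False
            return self.buzzer
        if door_switch:
            self.elapsed_s += dt_s
        else:
            # An on-delay timer resets when its input drops.
            self.elapsed_s = 0.0
        if self.elapsed_s >= self.setpoint_s:
            # ASSUMPTION: the alarm latches until it is reset.
            self.buzzer = True
        return self.buzzer


if __name__ == "__main__":
    alarm = DoorAlarm()
    for second in range(1, 7):
        state = alarm.scan(door_switch=True, reset_switch=False, dt_s=1.0)
        print(f"t={second} s buzzer={'on' if state else 'off'}")
```

Calling `scan` once per simulated second with the door held open leaves the buzzer off for the first four scans and turns it on at the fifth.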
Fig. 8. The Ladder Logic of Room 2
Figure 8 shows the ladder logic for the blinds and the garage door. There are two main rungs, one for the up direction and the other for the down direction. The output Y003 is tied to a 6 s timer, so the motor is powered for only 6 s; once the timer reaches 6 s, power to the motor is cut off. The C signals are signals that the HMI can use and read. They are placed on the same rungs as the inputs, so activating a control on the HMI has the same effect as activating inputs X004 and X002.
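One direction rung of this logic can be sketched in Python as follows. The sketch assumes that the request (physical input or HMI bit) stays on while the motor runs and that releasing it resets the timer; the class name and argument names are hypothetical, and the exact rung layout in Fig. 8 may differ.

```python
RUN_LIMIT_S = 6.0  # run time quoted in the text


class MotorRung:
    """One direction rung: a physical input OR an HMI bit drives the motor output."""

    def __init__(self) -> None:
        self.elapsed_s = 0.0  # how long the output has been powered for this request

    def scan(self, button: bool, hmi_bit: bool, dt_s: float) -> bool:
        """Run one scan and return True while the motor output is powered."""
        request = button or hmi_bit  # the C bit sits in parallel with the X input
        if not request:
            # ASSUMPTION: dropping the request resets the run timer.
            self.elapsed_s = 0.0
            return False
        if self.elapsed_s >= RUN_LIMIT_S:
            return False  # timer done: power to the motor is cut
        self.elapsed_s += dt_s
        return True


if __name__ == "__main__":
    up_rung = MotorRung()
    for second in range(1, 9):
        powered = up_rung.scan(button=False, hmi_bit=True, dt_s=1.0)
        print(f"t={second} s motor={'on' if powered else 'off'}")
```

With one-second scans and the HMI bit held on, the output stays powered for six scans and is then cut, matching the 6 s limit described above.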
Fig. 9. The Ladder Logic of Room 3
Figure 9 shows the temperature monitoring function. If the temperature that DF5 reads is equal to or above the set temperature of 80 °F, the outputs DF9 and DF10 are activated. When DF9 and DF10 are activated, the analog outputs are supplied with power, turning on the fans in the bedroom and the kitchen. When DF5 reads a temperature below 80 °F, the power is turned off, turning the fans off.
6 Smart House Schematic
Figure 10 shows the overall schematic of the project, that is, the way the PLC is wired. As seen on the left side, both power supplies are plugged into a wall outlet. The main power supply is next to the CPU and powers the CPU and outputs. The local PLC digital inputs and outputs are shown in Fig. 11. The PLC and motor diagram is shown in Fig. 12. The HMI communicates with the PLC via a telephone cable. The laptop sends programs to the PLC through an Ethernet cable, and the HMI signals are shown in Fig. 13. Two external relays make the motors move forward and in reverse.
Fig. 10. Project Schematic
Fig. 11. PLC Digital Input and Output
Fig. 12. Motor Connected to PLC Diagram
Fig. 13. HMI Signals
7 Analysis
Most parameters used throughout this project come from the CLICK PLC user manual [15]. This manual consists of six chapters (Getting Started, Specifications, Installation and Wiring, PLC Communication, Maintenance, and Troubleshooting) to help with questions
when using the PLC. When first wiring the PLC and testing whether the code works correctly, LED indicators next to the inputs and outputs show whenever they are being used or called. In the CLICK software, after the project is run, the output voltage and temperature reading can be monitored by going to Monitor at the top, then Data View. When there was a problem with the fan, the data view reported that the analog output was supplying 5 V, yet the 5 V fan was not moving. A multimeter measurement between the PLC and the fan showed an output voltage of 0.9 V, meaning that the voltage dropped between the PLC and the fan. The next step was to look at the parameters of the fan, whose rated current was far higher than what the PLC could supply. Rather than completely changing the design and idea of the fans, it was essential to find a fan that can run off the analog outputs of the PLC. After trying other fans with lower currents, the lowest current available on a miniature fan of the size needed for the project was 0.08 A. With the help of a push start, the fans run continuously, which is sufficient to demonstrate that the fans turn on when the temperature increases. Failure should be expected when starting a new project using unfamiliar devices. The main reason the Smart Home is implemented using PLCs is that a major flaw of at-home devices such as smart plugs or smart lights is that they run off Wi-Fi. If the Wi-Fi in the house cuts out or stops working, the devices are useless and cannot be controlled remotely. The PLC, on the other hand, is connected via an Ethernet cable, so it does not depend on Wi-Fi. Furthermore, since a PLC is ruggedized, it can withstand harsh environments; in a suitably protected case, it could be installed outside the residence to control outside functionalities. When the PLC is plugged in and the switch in the top left corner of the CPU module is flipped to run, start by finding the HMI.
The HMI will be on the home screen; if it is not, pressing F3 will bring up the home screen. On this screen, there are four options for going to other screens. If F1 is pressed, the blinds screen appears. On this screen, two buttons either make the blinds rise and turn the lights on or make the blinds go down and turn the lights off. Two indicator lights also show whether the blinds are open or closed. To return to the home screen, F3 has to be pressed. From the home screen, pressing F2 brings up the temperature screen. This screen displays the temperature in the bedroom, along with two indicator smiley faces. When these faces smile, power is being supplied to the fans in the bedroom and the kitchen. When the faces frown, the power to the fans has been turned off. Again, pressing F3 returns to the home screen, and pressing F3 once more brings up the garage door screen, as in Fig. 14. For the power to be switched from the blinds to the garage door, the switches on the side of the garage need to be flipped up. Power is supplied to the blinds only if the switches are down. After returning to the home screen by pressing F3, pressing F4 brings up the security screen. This screen indicates whether the front door is open or closed and shows an alert whenever the buzzer or alarm is on, as seen in Fig. 14. An HMI is usually used as a vital tool for operators and line supervisors to coordinate and control industrial manufacturing processes.
Fig. 14. HMI Different Functions
8 Conclusion
Smart house devices appeal to a wide variety of people worldwide. Being able to control the devices around your house through one interface or mobile device is efficient and convenient. For example, someone getting ready for bed may not remember whether they closed the garage door after carrying in the groceries earlier that day. They can check whether the door is closed and, if it was left open, close it, all without leaving the spot. While working on this small project, we designed a scaled prototype house with two small rooms and one garage. We implemented different functionality in the smart house using a PLC, ladder logic, and an HMI. All intended functions worked perfectly, and we achieved our objectives for the project. Our progress in this project encouraged us to extend it to cover more functions required for an easy life. Moreover, we are confident that we can use PLCs for more controlled and automated applications. In the future, we plan to control all the functionality through a mobile app and add functions outside the house, such as grass irrigation control and front gate security.
References
1. Mtshali, P., Freedom, K.: A smart home appliance control system for physically disabled people. In: 2019 Conference on Information Communications Technology and Society (ICTAS), pp. 1–5. IEEE (2019)
2. Talal, M., et al.: Smart home-based IoT for real-time and secure remote health monitoring of triage and priority system using body sensors: multi-driven systematic review. J. Med. Syst. 43(3), 1–34 (2019)
3. Jabbar, W.A., et al.: Design and fabrication of smart home with the Internet of Things enabled automation system. IEEE Access 7, 144059–144074 (2019)
4. Peng, J.Y., Zhang, D., Deng, Y.W., Li, R.: A review on sustainable smart homes and home automation in tmall, baidu and know the topic: big data analytics approach. Curr. State Art Artif. Intell. Ubiquitous Cities, 155–167 (2022)
5. Massaro, A., Angelo, G.: Re-engineering process in a food factory: an overview of technologies and approaches for the design of pasta production processes. Product. Manufact. Res. 8(1) (2020)
6. Mehta, B.R., Reddy, Y.J.: Chapter 4 - batch automation systems. In: Mehta, B.R., Reddy, Y.J. (eds.) Industrial Process Automation Systems, Butterworth-Heinemann (2015)
7. Barz, C., Deaconu, S.I., Latinovic, T., Berdie, A., Pop-Vadean, A., Horgos, M.: PLCs used in smart home control. In: IOP Conference Series: Materials Science and Engineering, vol. 106, no. 1, p. 012036. IOP Publishing (2016)
8. Kamel, K., Eman, K.: PLC batch process control design and implementation fundamentals. J. Electron. 2(03), 155–161 (2020)
9. Cheng, Y-H., et al.: Smart home environment management using programmable logic controller. Eng. Lett. 28(4) (2020)
10. Siemens Stories, Global Website. https://new.siemens.com/global/en/company/stries/home.html?gclid=Cj0KCQiA15yNBhDTARIsAGnwe0WX6Ny8mvjxG3B6dwOXd_H5lMOfw6EumfSvURd46Qg1m2Y4aTW_EOAaAu4UEALw_wcB#Industry. Accessed 01 Dec 2021
11. S8VK-G01524 - AC/DC DIN Rail Power Supply (PSU), 1 output, 15 W, 24 VDC, 650 mA. Newark. https://www.newark.com/omron-industrial-automation. Accessed 01 Dec 2021
12. Mean well-switching power supply manufacturer. MEAN WELL Switching Power Supply Manufacturer. https://www.meanwell.com/. Accessed 01 Dec 2021
13. Electrical Power Products, AutomationDirect. https://www.automationdirect.com/adc/shopping/catalog/power_products_(electrical)#sort=undefined%20asc&start=0. Accessed 01 Dec 2021
14. More touch panels EA9 series in-depth, C. https://www.automationdirect.com/c-more/home. Accessed 01 Dec 2021
15. CLICK PLC User Manual. https://cdn.automationdirect.com/static/manuals/c0userm/c0userm.html. October 2021
Use of Cell Phones in Near Surface Seismology
Hyunseo Lee
Seoul International School, Seoul, Republic of Korea
[email protected]
Abstract. The experiment aimed to test the possibility of using phones, instead of specialized geophones, to measure the signals and waves used in near-surface seismology. Yeouido Han River Park was selected as the site based on the presence of grassland, high cultural sounds, a subway line passing underground, and deposited Quaternary alluvium. Using Sensor Log, an app that records the acceleration of the phone with its accelerometer, a cultural noise experiment and a hammer energy source experiment were conducted. In the cultural noise experiment, data were collected to explore whether phones can record cultural seismic waves; in the hammer energy source experiment, data were collected to explore whether phones can record seismic waves from a particular energy source. MATLAB was used to analyze the data and create a visual representation. The results revealed that it is possible to use phones as a replacement for geophones, making the analysis of surface waves possible without specialized equipment in a wide array of situations.
Keywords: Accelerometer · Cultural Noise · Near Surface Seismology · Phones · Seoul
1 Introduction
This study explores the possibility of replacing the geophones used in methods such as Multichannel Analysis of Surface Waves (MASW) with phones and their accelerometer boards.
1.1 Near Surface Seismology
Near surface seismology is a technique that uses seismic data to characterize the mechanical properties of the near surface (0−30 m). There are several techniques, such as MASW [5]. MASW uses the phase velocity of surface wave arrivals at different frequencies to recover shear velocity versus depth. With 3-component seismometers, a technique known as the H/V method has been used to estimate the depth to the basement [2]. Typically, a built-for-purpose acquisition system including an active source such as a weight drop or hammer is used along with geophones for each of these methods. In this study, we examine the possibility of using the 3-component accelerometers in cell phones as a replacement for conventional geophones. With such a system, investigators could collect data at a fraction of the cost.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 K. Arai (Ed.): SAI 2023, LNNS 739, pp. 125–133, 2023. https://doi.org/10.1007/978-3-031-37963-5_10
1.2 Phone Accelerometer
Fig. 1. Phone Accelerometer Board (a) https://dealna.com/Article/Post/313/How-the-Accelerometer-in-Your-Phone-Could-Help-Identify-You (b) https://turbofuture.com/cell-phones/10-Best-Accelerometer-Apps
As shown in Fig. 1, a phone accelerometer is a built-in sensor that measures the force of acceleration caused by movements or gravitational force.
2 Data Acquisition Method
2.1 Selection of Location
Area selection was based on the presence of grassland, high cultural sounds, a subway line passing underground, and deposited Quaternary alluvium. Proximity to the river ensured that Quaternary alluvium was deposited in the area. As such, the area along the Han River was examined to find experimental locations. Figure 2 indicates possible areas along the Han River [3]. Parks with grassland have been established along the Han River. Among all the Han River parks, those without subway lines passing underneath and those with low cultural sound were eliminated. After this process, Yeouido Han River Park was selected as the experiment site. As Fig. 3 indicates, the purple subway line passes under the grassland area of the park. Moreover, adjacent to the park is a four-lane road that can provide high cultural sounds.
2.2 Sensor Log App
Data collection was carried out with an app called Sensor Log. To measure with Sensor Log, the phone should be placed on flat ground with the screen facing up, without a phone case or other accessories. As shown in Fig. 4, Sensor Log measures the acceleration of the phone along three axes. For each measurement, the acceleration is recorded at a logging rate of 100 Hz and exported as a CSV file. The CSV file contains the time stamp and the acceleration along each axis. The CSV file is later analyzed using MATLAB.
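As an illustration of the file layout described above, the following Python sketch reads a small CSV with a time stamp and three acceleration columns and checks the logging rate. It is not the MATLAB analysis used in the study, and the column names (`time`, `x`, `y`, `z`) and the sample values are hypothetical placeholders; the headers in an actual Sensor Log export may differ.

```python
import csv
import io

# Hypothetical sample; real exports may use different column names and units.
SAMPLE_CSV = """time,x,y,z
0.00,0.001,0.002,-1.000
0.01,0.003,0.001,-0.998
0.02,0.002,0.000,-1.001
0.03,0.001,0.001,-0.999
0.04,0.000,0.002,-1.000
"""


def load_acceleration(csv_text: str):
    """Parse CSV text into four lists: time stamps and x, y, z acceleration."""
    times, x, y, z = [], [], [], []
    for row in csv.DictReader(io.StringIO(csv_text)):
        times.append(float(row["time"]))
        x.append(float(row["x"]))
        y.append(float(row["y"]))
        z.append(float(row["z"]))
    return times, x, y, z


def estimate_sample_rate_hz(times) -> float:
    """Estimate the logging rate from the first and last time stamps."""
    duration_s = times[-1] - times[0]
    return (len(times) - 1) / duration_s


if __name__ == "__main__":
    times, x, y, z = load_acceleration(SAMPLE_CSV)
    print(f"{len(times)} samples at about {estimate_sample_rate_hz(times):.1f} Hz")
```

Checking the estimated rate against the nominal 100 Hz is a quick way to spot dropped samples before any spectral analysis.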
Fig. 2. Map of Seoul based on Geologic Features https://www.researchgate.net/figure/Simplified-geological-map-of-Seoul-Modified-from-Kim-et-al-2016_fig8_322204463
Fig. 3. Map of Yeouido Han River Park
Fig. 4. Sensor Log Screen Shot
2.3 Phone Set-Up
The first set of experiments focused on determining which phone set-up would be most effective in capturing seismic waves. Three set-ups were tested: a phone without any supplements, a phone pressed down with a heavy book, and a phone attached to a plastic stake. A simple experiment showed that the plastic stake was the most effective method for measuring seismic waves. The stake was first hammered into the grassland, and the phone was set up as in Fig. 5. Once the stake had been hammered in far enough that its head was at ground level, double-sided tape and strong glue were used to attach the phone to the head of the stake. Then, another experiment was conducted to determine the time of day at which the phone can most effectively collect data. Using a phone attached to a plastic stake, measurements were taken during the early morning (5:00–7:00 A.M.), mid-day (11:00 A.M.–1:00 P.M.), and late night (10:00 P.M.–12:00 A.M.). The time range was restricted by the operating hours of the subway. The results showed that measurements taken during the early morning contained less ground-level cultural noise and clearer seismic waves. Lastly, among the iPhone models tested (X, 11, and 12), the iPhone 12 was the most efficient in collecting data.
Fig. 5. Image of a Phone Attached to a Plastic Stake
2.4 Cultural Noise Experiment
The Cultural Noise Experiment was conducted to collect cultural seismic waves, such as the waves from the subway line, using phones. Three iPhone 12s were used to collect data on August 7th, 2022, between 5:40 A.M. and 7:00 A.M., with each measurement being 15 min long. The experiment time was restricted by the start of subway operation and by the increase in ground-level cultural noise later in the morning. Thus, the experiment was conducted on two consecutive days to measure 14 different locations, as in Fig. 6.
Fig. 6. Cultural Noise Experiment Phone Location on Red Dots
The phones are placed in a line perpendicular to the subway line. Each phone was also placed so that the phone itself is parallel to the line formed by the phones and points toward the subway line. The reference phone is placed three meters from the subway line and stays fixed for all measurements. As shown in Fig. 7, starting from the reference phone, phones are placed at three-meter intervals; if there was an obstacle, the phone was placed after
the obstacle. Due to limitations in the number of phones that could be utilized, for every measurement, two iPhone 12s (excluding one used as a reference phone) were moved according to the interval.
Fig. 7. Cultural Noise Experiment Set-Up
2.5 Hammer Energy Source Experiment
The Hammer Energy Source Experiment was conducted to collect seismic waves from an energy source, such as the waves from hammering a steel plate, using a phone. Again, three iPhone 12s were used to collect data on August 28th, 2022, between 5:40 A.M. and 7:00 A.M. For each phone location, two measurements were made: five minutes of no hammering and 40 s of hammering. For the hammering measurement, a 20 cm × 20 cm × 2 cm steel plate was used, as shown in Fig. 8. The steel plate was placed on flat grassland and, over the 40 s, was hit with a hammer at full power every 10 s.
Fig. 8. Steel Plate
The phones were placed so that a straight line, including the hammer source, is formed. The reference phone was placed two meters from the energy source and was
not moved throughout the whole experiment. The phones were set up as shown in Fig. 9. Starting from the reference phone location, the other two iPhone 12s were moved in two-meter intervals for each measurement.
Fig. 9. Hammer Energy Source Experiment
3 Experimental Results
Fig. 10. Cultural Noise in High Traffic Time Results (Z-Component: Dark Orange, X-Component: Blue, Y-Component: Yellow)
Figure 10 shows the Fourier magnitude spectrum for a receiver 12 m from the reference phone computed from approximately x seconds of data. The data were band-passed from 5–45 Hz prior to computing the magnitude spectrum. The average spectral magnitude is around 5 × 10⁻³ g, which is an order of magnitude above the approximately 5 × 10⁻⁴ g self-noise level of the cell phone accelerometers and indicates that each component is measuring cultural noise signals. The y component has a large spectral peak between 30 and 35 Hz, which is similar to the spectral peak observed in another study measuring subway-generated noise with a conventional geophone [4].
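The processing described here, a 5–45 Hz band limit followed by a Fourier magnitude spectrum, can be sketched with the Python standard library alone. This is not the MATLAB code used in the study: the function names are hypothetical, the band limit is applied as a simple brick-wall mask on the frequency bins rather than with a designed filter, and the small recursive FFT requires a power-of-two record length. With 100 Hz sampling the Nyquist frequency is 50 Hz, so the 45 Hz upper edge stays inside the resolvable range.

```python
import cmath
import math


def fft(samples):
    """Recursive radix-2 FFT; the input length must be a power of two."""
    n = len(samples)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return [complex(samples[0])]
    even = fft(samples[0::2])
    odd = fft(samples[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def band_limited_magnitude(samples, sample_rate_hz, low_hz=5.0, high_hz=45.0):
    """Return (frequencies, magnitudes) of the one-sided spectrum inside the band.

    Magnitudes are scaled so that a sine of amplitude A on an exact frequency
    bin appears as A. Bins outside [low_hz, high_hz] are set to zero, which is
    a brick-wall approximation of a band-pass filter.
    """
    n = len(samples)
    mean = sum(samples) / n
    spectrum = fft([s - mean for s in samples])  # remove the DC offset first
    freqs, mags = [], []
    for k in range(n // 2 + 1):
        freq = k * sample_rate_hz / n
        freqs.append(freq)
        in_band = low_hz <= freq <= high_hz
        mags.append(2.0 * abs(spectrum[k]) / n if in_band else 0.0)
    return freqs, mags


if __name__ == "__main__":
    rate_hz, n = 100.0, 256
    # Synthetic test signal: a 25 Hz component of amplitude 0.005 g.
    signal = [0.005 * math.sin(2 * math.pi * 25.0 * i / rate_hz) for i in range(n)]
    freqs, mags = band_limited_magnitude(signal, rate_hz)
    peak = max(range(len(mags)), key=lambda k: mags[k])
    print(f"peak at {freqs[peak]:.1f} Hz")
```

A 15 min recording at 100 Hz contains 90,000 samples, which is not a power of two, so in practice the record would be truncated or zero-padded, or a general-purpose FFT routine would be used instead.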
Fig. 11. Hammer Energy Source Results (Z-Component: Blue, X-Component: Red, Y-Component: Green)
Figure 11 shows the Fourier magnitude spectrum for a similar receiver location from the hammer-on-plate data. The data were band-pass filtered from 5 to 45 Hz prior to computing the magnitude spectrum. Note that the y component is the smallest in these data and that the frequency spectrum peaks at much higher frequencies than in the cultural noise data. Hammer-on-plate data are known to be weak at low frequencies, unlike cultural noise recordings. Moreover, the predominant signals from a hammer source are expected in the vertical plane.
4 Discussion and Conclusion
The experimental results indicate that it is possible to replace specialized devices with phones to measure signals and waves of use in near-surface seismology. However, some caveats apply to this method. Since the phones had no means of synchronization other than timing the measurements with a separate device, the recordings could not be started and stopped at exactly the same moment. Furthermore, because the measurements had to be started and stopped manually, the first and last few seconds of each measurement had to be trimmed during the analysis. Further development of this technique could include a trigger switch that synchronizes the timing of the measurements.
This technique of replacing geophones with commonly used cell phones allows seismic research to be done without any specialized equipment and in a variety of settings. In archaeological research, near-surface seismology would be effective in determining the presence of artifacts or human remains underground. In those situations, collecting seismic waves and signals with phones rather than with specialized geologic equipment would allow near-surface seismology to be used effectively to aid the research. Furthermore, in civil engineering and urban architecture, near-surface seismology could be used to determine the presence of sewerage systems and other underground structures. This technique could significantly reduce the cost and complexity of near-surface seismology analysis. In South Korea especially, there are multiple instances where construction was delayed for several years because archaeological remains were discovered under the site [1]. Using near-surface seismology to analyze what lies underground before construction begins could lead to large economic and social benefits. There is also historical value, as artifacts and remains could be excavated using proper and careful methods rather than being discovered during urban construction.
References
1. Bandun, R.: [Cityscapes] Exploring beneath Gwanghwamun’s Surface. Koreatimes, 27 Apr 2021. www.koreatimes.co.kr/www/nation/2021/05/177_307682.html
2. Hong, T.-K., et al.: Roles of subway speed and configuration on subway-induced seismic noises in an urban region. J. Appl. Geophys., 13 May 2022. www.sciencedirect.com/science/article/abs/pii/S0926985122001392?fr=RR-2&ref=pdf_download&rr=7632b5dbfb669325
3. Lee, J.-Y., Kwon, K.D., Raza, M.: Current water uses, related risks, and management options for Seoul megacity, Korea. Environ. Earth Sci. 77(1), 1–20 (2017). https://doi.org/10.1007/s12665-017-7192-6
4. Psarropoulos, P.: Impact of Tunnels and Underground Spaces on the Seismic Response of Overlying Structures. IntechOpen, 3 December 2019. www.intechopen.com/chapters/69298
5. Jumrik, T., Dey, A.: A review of active and passive MASW techniques. In: National Workshop Engineering Geophysics for Civil Engineering and Geo-Hazards (EGCEG) (2012)
Techno-Economic Assessment in Communications: New Challenges
Carlos Bendicho and Daniel Bendicho
Bilbao, Spain [email protected]
Abstract. This article presents a brief history of Techno-Economic Assessment (TEA) in Communications, a proposed redefinition of TEA, and the new challenges arising from a dynamic context with cloud-native virtualized networks, the Helium Network and similar blockchain-based decentralized networks, the new network as a platform (NaaP) paradigm, carbon pricing, network sharing, and web3, metaverse and blockchain technologies. The authors formulate the research question and show the need to improve TEA models to integrate and manage all this increasing complexity. This paper also proposes the characteristics TEA models should have and their current degree of compliance for several use cases: 5G and beyond, software-defined wide area network (SD-WAN), secure access service edge (SASE), secure service edge (SSE), and cloud cybersecurity risk assessment. The authors also present the extensibility of TEA to request for proposals (RFP) processes and other industries, and conclude that there is an urgent need for agile and effective TEA in Comms that allows the industrialization of agile decision-making so that all market stakeholders can choose the optimal solution for any technology, scenario and use case.
Keywords: 5G · 6G · Blockchain · Network as a Platform · Techno-Economic Assessment · Techno-Economics · SD-WAN · SASE · SSE · Metaverse · Web3
1 Introduction
The advent of new communications, computing and cybersecurity technologies such as 5G, 6G, software-defined networking (SDN), network function virtualization (NFV), secure access service edge (SASE) and secure service edge (SSE) has increased the technical complexity and diversity of architectural solutions. Thus, decision makers need effective Techno-Economic Assessment (TEA) models to choose the optimal solution. Current techno-economic models in communications prioritize economic feasibility instead of both technical and economic viability of complex technical solutions. Besides, they are mainly focused on network deployment and telecom operators’ perspectives and seldom consider end customers’ needs.
C. Bendicho: Independent ICT Researcher, Member, IEEE, Member, ACM, Member, FITCE, COIT.
D. Bendicho: Independent Researcher.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023. K. Arai (Ed.): SAI 2023, LNNS 739, pp. 134–150, 2023. https://doi.org/10.1007/978-3-031-37963-5_11
Therefore, assessment in Comms is becoming a crucial need for organizations, and it should be considered in the same way that assessment in cloud adoption and migration is included in the current portfolios of service providers and consulting companies. For instance, techno-economic assessment will make it possible to address, in optimal conditions, the transformation of a global managed network services market expected to exceed $100 billion by 2030, growing at a 6% CAGR from 2021 [1], considering different flavors of SD-WAN, SASE or SSE adoption. And this is only one example from one domain in communications. It can be extended to any other domain, for example the 5G infrastructure and services market, with a $1 trillion investment forecast by operators from 2019 to 2025 as stated by GSMA [2].
Current techno-economic analysis models and tools are especially focused on network deployment from the telecom operators’ perspective [3]. However, a techno-economic assessment can also be applied to the deployment or adoption of any other new access network technology, backhaul, transport or any complete technical solution, such as aerial base stations deployment for emergencies or events coverage, for any market stakeholder: vendors, telecom operators, communications service providers (CSPs), mobile network operators (MNOs), mobile virtual network operators (MVNOs), multinational corporations (MNCs), enterprise organizations, small and medium enterprises (SMEs), public administration, technological consulting companies, regulators, and so on.
Hence, there is a need for reliable and agile techno-economic assessment (TEA) models and tools that allow the industrialization of agile decision-making so that all market stakeholders can select the optimal solution for any technology, scenario and use case.
This paper is structured as follows. Section 2 presents a brief history of TEA in Communications summarizing the review of the literature. Section 3 proposes a redefinition of TEA. Section 4 presents new challenges for TEA derived from the evolving technological and market context. In Sect. 5 the authors formulate the research question. Section 6 shows some use cases of TEA application (5G and beyond, SD-WAN, SASE and SSE, Cybersecurity Risk Assessment), the characteristics TEA models should have and related gaps detected in the literature. Section 7 presents some current challenges concerning RFP processes that could be addressed by using TEA. Section 8 introduces the extension of TEA models for Comms to other domains and industries, and finally Sect. 9 presents the conclusions.
2 Brief History of TEA in Comms
Summarizing the review of the literature presented in detail by the author in [3] and [4], techno-economic modeling for access networks was first seeded in the United States in the late 1980s and early 1990s. In 1989, a publication based on dynamic programming predicted that the most appropriate moment to invest massively in FTTH deployment would not be before 2010, considering the forecasting of costs, income and interest rates [5]. In the 1990s, studies began to focus on the detailed cost analysis of components with a “bottom-up” approach, always oriented to the deployment of access networks and ignoring the end-user perspective.
In the 1990s, techno-economic modeling for access networks also took root in Europe through European Union (EU) publicly funded projects aimed at choosing the most appropriate alternative for access network deployment by telecom operators and at promoting standards and recommendations, giving rise to the STEM, TITAN, OPTIMUM, TONIC, and ECOSYS models. The ECOSYS model had a dissemination effect that materialized in the emergence of new proprietary models for power line communications (PLC), optical networks and 3G-LTE, the BATET model that distinguished between fixed and nomadic (mobile) layers, the COSTA model based on MUSE (a TONIC model extension), and further models for hybrid FTTx and WiFi or WiMAX networks and fiber to the distribution point (FTTdp). Since 2008, more EU publicly funded projects have continued the research: BONE, OASE for optical networks TEA, and DISCUS. These prompted a further spread of proprietary models for the deployment of broadband in rural areas comparing fiber to the home (FTTH) and hybrid fiber coax (HFC) DOCSIS 3.0, for hybrid fiber and WiFi networks, for LTE, for converged FTTH/FTTB and LTE access networks, and eventually for 5G deployment [6]. In 2019, reference [6] included open-source code in a public GitHub repository. Previous TEA models resulting from publicly funded EU projects are made available to the public on request by contacting project interlocutors. All models in the literature are based on the traditional definition of TEA [7] and mainly consider the deployment of access network technologies from the perspective of operators, vendors, manufacturers and standardization bodies.
3 Redefinition of Techno-Economic Assessment (TEA)
The traditional definition of Techno-Economic Assessment is stated as follows: “Techno-economic models are a method used to evaluate the economic viability of complex technical systems”, according to Smura’s doctoral thesis [7]. This definition omits any evaluation of technical viability. Aligned with this definition, current decision-making processes regarding the choice of networking technologies are based almost exclusively on economic criteria, which bears the risk of committing serious technical errors that may compromise the expected economic viability. TEA models in the literature use economic output parameters like capital expenses (CapEx) and operational expenses (OpEx), or even Net Present Value (NPV) and Internal Rate of Return (IRR), to evaluate the economic feasibility of a project, but they rarely use technical input or output parameters to evaluate the technical suitability of the proposed system. Most TEA models just calculate costs in a given scenario without considering technical requirements and end users’ needs.
Therefore, a broader concept of techno-economic analysis or assessment (TEA) is required. The new definition must emphasize the evaluation of technical feasibility and must be supported by techno-economic models that develop it. From this perspective, the author proposed a new definition in [3] as follows: “The techno-economic models are methods that allow the evaluation of the technical and economic viability of complex technical systems.” This new definition emphasizes both the technical and economic aspects of modeling, considering the technical feasibility and satisfaction of specific technical and economic requirements and needs.
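For readers less familiar with the economic output parameters mentioned above, the following Python sketch computes NPV and IRR from a series of yearly net cash flows. It is a minimal illustration under stated assumptions, not part of any TEA model in the literature: the function names `npv` and `irr`, the bisection search and all monetary figures are hypothetical.

```python
def npv(rate, cash_flows):
    """Net Present Value; cash_flows[0] occurs at t = 0 (undiscounted)."""
    return sum(cf / (1.0 + rate) ** t for t, cf in enumerate(cash_flows))


def irr(cash_flows, lo=-0.99, hi=1.0, tol=1e-9):
    """Internal Rate of Return found by bisection.

    Assumes the NPV changes sign exactly once between lo and hi, which
    holds for an initial investment followed by positive net inflows.
    """
    f_lo = npv(lo, cash_flows)
    if f_lo * npv(hi, cash_flows) > 0:
        raise ValueError("NPV does not change sign on [lo, hi]")
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        f_mid = npv(mid, cash_flows)
        if f_lo * f_mid <= 0:
            hi = mid                 # root lies in [lo, mid]
        else:
            lo, f_lo = mid, f_mid    # root lies in [mid, hi]
    return (lo + hi) / 2.0


# Hypothetical project: CapEx of 1000 at t = 0, then five years of
# revenues (500) minus OpEx (200), i.e. a net inflow of 300 per year.
capex, revenue, opex, years = 1000.0, 500.0, 200.0, 5
flows = [-capex] + [revenue - opex] * years

project_npv = npv(0.08, flows)   # discount rate of 8% is an assumption
project_irr = irr(flows)
```

Both figures describe economic feasibility only; under the redefinition proposed here they would be weighed alongside technical parameters rather than used on their own.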
On the other hand, techno-economic models in the literature are eminently oriented towards the dynamics of the deployment of access networks promoted by vendors, manufacturers and operators, ignoring the perspective of end users. Hence, there is a need for techno-economic assessment models that reflect and respond to both perspectives to contribute to market equilibrium.
4 New Challenges for TEA in Comms
The communications context is extraordinarily dynamic, continuously incorporating new methods, technologies and innovations that have new impacts and implications for TEA. This section presents some of these innovations and their implications for TEA in communications.
4.1 Validation, Test and Assurance on New Cloud-Native Networks
A new paradigm for designing and operating networks is derived from the validation, testing, and assurance needed to deliver high availability in large, complex cloud-native virtualized networks. It is based on IntOps (Integrator and Operator) best practices supported by GitOps methodology and technology, which provide workflows for infrastructure and applications with continuous evolution and continuous integration and delivery (CI/CD) [8]. GitOps is a set of practices for managing infrastructure and application configurations using Git, an open-source version control system, as the single source of truth for declarative infrastructure and applications [8]. IntOps is a shift in strategy, currently being adopted by many telecom operators, in order to become both integrators and operators [8]. The engineering and quality assurance (QA) areas must design and integrate new cloud-native network functions dynamically and with as much automation as possible (CI: continuous integration). These functions need to be automatically tested for carrier-grade scalability, security, capacity, performance, quality, availability, resilience and compliance (CT: continuous testing) in order to be delivered end-to-end along the network (CD: continuous delivery). Continuous testing (CT) is therefore necessary and critical every time a cloud-native network function changes, for instance, every time there is a change in the container image registry for a given cloud-native network function that will be executed on Kubernetes (container orchestrator) clusters (nodes), or every time a new cloud-native network function is created. CT is required even one step before the image is uploaded to the registry, and also afterwards during execution time, to detect any failure and provide feedback for a new version, considering both the customer experience and the runtime environment.
Both CI/CD and CT pose new technical and economic impacts and implications for the techno-economic assessment (TEA) of network and security solutions for any market stakeholder. They require capital expenses (CapEx) and operational expenses (OpEx) for tools and knowledge acquisition as well as cultural change. In the long term, continuous integration, testing and delivery (CI/CT/CD) implies OpEx reduction, although CT impacts CapEx and requires OpEx that is smaller by orders of magnitude.
4.2 Blockchain-Based Decentralized Networks: The Helium Network and Similar Networks
The Helium Network [9] is a decentralized wireless network initially conceived for Internet of Things (IoT) devices and sensors. It uses the LoRaWAN open-source wireless protocol and is based on a blockchain with a native protocol, called the Helium Consensus protocol, which feeds a two-sided market made up of coverage providers and coverage consumers. The LoRaWAN specification is a Low Power, Wide Area Network open-source protocol designed to connect battery-operated devices wirelessly to the internet in regional, national or global networks [9]. The Helium Network is now extending to provide a decentralized 5G network on Citizens Broadband Radio Service (CBRS) spectrum, called Helium 5G, powered by coverage providers’ hotspots. Coverage providers, who in Helium are the so-called miners in blockchain terminology, purchase access node equipment, connect it via the Internet and demonstrate their contribution to Helium wireless network coverage at a cryptographically verified physical location and time through a Proof-of-Coverage algorithm, submitting proofs to the Helium network from their own access node. In the Helium network, IoT or 5G CBRS devices pay to send and receive data via the Internet using the Helium distributed access network. Coverage providers earn tokens for providing wireless network coverage for those devices, for data traffic transactions passing through their access nodes and for validating the integrity of the Helium network. The result is a decentralized wireless access network that is commoditized by competition and provides global coverage at a fraction of current costs.
Therefore, blockchain-based decentralized networks such as Helium pose a new paradigm that must be considered in TEA from both technical and economic perspectives, as this type of decentralized network will impact technical functionalities and performance as well as OpEx. There are also regulatory aspects and uncertainties to consider for TEA, as these blockchain-based networks and markets have not been regulated to date.
4.3 Evolution of Networks: Network as a Platform (NaaP)
The advent of fully programmable networks that expose capabilities via application programming interfaces (APIs) to developers and applications in cloud hyperscalers’ marketplaces will provide telecom operators with a source of revenue in addition to traditional business-to-business (B2B) or business-to-consumer (B2C) markets, converting telco networks into platforms in the middle of two-sided markets. This idea has come around in iterative cycles, and this time the materialization of the Network as a Platform (NaaP) paradigm is much closer, as GSMA operators have joined the Linux Foundation in the so-called CAMARA community. CAMARA was launched at MWC 2022 for operator platform API development and standardization, following the “code-first” principle and applying agile methodologies. Telecom operators within the GSMA Operator Platform API Group (OPAG) have been advancing in recent years, and many have internal APIs available for the CAMARA community.
These network as a platform capabilities will facilitate a continuous and dynamic dialogue between business, information technology (IT) and operational technology (OT) applications and the network, through which the network congestion state will be provided to applications so that they can invoke, when appropriate, services such as better network performance (QoS), lower network latency, network latency stability and so on. This will allow communications service providers (CSPs) to monetize such network services from application developers and over-the-top companies (OTTs), adding these revenues to those coming from traditional B2B, B2C and wholesale markets. On the other hand, network as a platform requires a secured API exposure layer to minimize threats and vulnerabilities and grant secure access via appropriate identity and access management (IAM), directly impacting OpEx. Therefore, the new network as a platform (NaaP) paradigm impacts the revenue and OpEx dimensions of TEA and, at the same time, these new network capabilities are technically advantageous features to be considered and weighed from the technical perspective in techno-economic assessment.
4.4 Sustainability Issues: Carbon Price
The objective of net zero emissions under the United Nations (UN) Sustainable Development Goals (SDG) drives governance, risk and compliance (GRC) objectives and environmental, social and governance (ESG) criteria for sustainability and environmental governance in stakeholders’ organizations. One measure adopted in this direction is carbon pricing calculation and its inclusion as OpEx in any project economics. Therefore, the best practice will be to add carbon pricing as an OpEx item in techno-economic assessment. On the other hand, the inclusion per se of carbon pricing information in a solution to be assessed constitutes a technical input parameter to be considered and weighed in techno-economic assessment versus other alternative solutions.
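One simple way to include carbon pricing as an OpEx item can be sketched as follows in Python. This is only an illustrative assumption about how such a term might be computed (energy use times grid emission factor times carbon price); the function name `carbon_opex` and every numeric value are hypothetical and do not come from the paper or its references.

```python
def carbon_opex(energy_kwh_per_year, grid_kg_co2_per_kwh, price_per_tonne):
    """Yearly carbon cost of a solution, to be added to its OpEx.

    energy_kwh_per_year: electricity consumed by the solution
    grid_kg_co2_per_kwh: emission factor of the electricity supply
    price_per_tonne:     carbon price per tonne of CO2
    """
    tonnes_co2 = energy_kwh_per_year * grid_kg_co2_per_kwh / 1000.0
    return tonnes_co2 * price_per_tonne


# Hypothetical figures for one candidate network solution.
base_opex = 120000.0                 # yearly OpEx before carbon pricing
carbon = carbon_opex(
    energy_kwh_per_year=2_000_000,   # assumed consumption
    grid_kg_co2_per_kwh=0.25,        # assumed emission factor
    price_per_tonne=80.0,            # assumed carbon price
)
total_opex = base_opex + carbon
```

Computed per alternative, the same term can also serve as the input parameter mentioned above when solutions are compared against each other.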
4.5 Network Sharing
Telecom operators are deploying different strategies to share network infrastructure, either with other network operators or with non-operator companies from different sectors (energy companies, media companies, hyperscalers, municipalities, private equity, etc.), mainly seeking cost savings as well as quicker rollouts. Hence, network sharing has an impact on TEA CapEx and OpEx calculations as well as on technical viability: it confronts operators with new challenges by increasing operational complexity, reducing opportunities for service differentiation and control over the network, and increasing the probability of technical incompatibilities with partners’ equipment.
4.6 Metaverse, Web3 and Blockchain
The development and future evolution of the metaverse will increase pressure on computing power and networks, as it needs large-scale data processing such as graphics rendering for the myriad of virtual worlds that will emerge. The authors envision the metaverse as one of the future massive use cases of 5G and beyond, gathering consumer and enterprise users around virtual, augmented and extended reality mobile devices. There are estimates of $5 trillion in metaverse value generation by 2030 [16] and even of 4 hours of metaverse presence per capita [17]. Massive interconnected metaverses will require real-time processing of interactions, considering navigation, tactile and haptic interactions, including the persistence and footprint of users in every metaverse. Today, there are 5.4B mobile internet connections per day [GSMA], which means that more than 5 billion users will be able to access those metaverses. Moreover, the development and evolution of web3 and blockchain technologies, whether or not associated with the metaverse, will add even more pressure on computing power and networks. There are forecasts of 20x the data and 1000x the computing power being required. Telcos will need to deploy massive edge computing and low-latency access technologies (fiber to the home (FTTH)/fiber to the room (FTTR), Wi-Fi 6, 5G standalone (5G SA) and beyond), as well as extend network softwarization towards programmable networks that offer network capabilities via network APIs, making real the Network as a Service (NaaS) or Network as a Platform (NaaP) paradigm introduced in Sect. 4.3. Therefore, the evolution of the metaverse, web3 and blockchain will raise sustainability concerns such as their impact on carbon footprint. They could also benefit from Network as a Platform approaches to get the most out of the network and improve users’ experience. These technologies will impact TEA in Comms by increasing the CapEx required to deploy networks capable of coping with such increases in traffic, network performance, stability and latency, as well as carbon price costs and OpEx, and by requiring more complex architectures including edge computing to cope with these new technical challenges.
4.7 Blockchain Secured SDN
The centralized control in software-defined networking (SDN) is susceptible to Denial of Service (DoS) attacks and to a single point of failure in the control plane [18]. SDN is also subject to security challenges and therefore to potential threats related to the communication and interaction of the application and data planes with the control plane [18]. Latah and Kalkan propose in reference [18] a Blockchain Secured SDN (BC-SecSDN), a hierarchical architecture to improve SDN security and synchronization, offloading proactive decision making from SDN controllers to smart contracts on the blockchain. This novel approach brings new challenges related to the limitations of blockchain (BC) systems in latency, throughput and consensus protocols. These limitations need to be overcome to meet the requirements of SDN high-load scenarios [18]. Hence, the combination of SDN and blockchain poses new challenges for TEA in Comms, impacting the technical complexity of the required architectures, as well as the associated CapEx and OpEx.
5 Need for a Universal Techno-Economic Assessment Model?
Given this complex landscape, with increasing technical complexity and the implications for TEA presented above, there is a need for techno-economic assessment models that provide flexibility and the capability to adapt to this dynamic context of continuous evolution and innovation, and that therefore have generalization and universal application attributes. So the research question the author formulates is: “Is it possible to define universally applicable, scalable, flexible and generalizable techno-economic models that make it possible to compare any networking & security solutions in order to help the different market agents make agile decisions?”. In reference [3], the author studied techno-economic models for access technologies, considering the following relevant points:
• Traditionally, techno-economic models are defined as methods used to evaluate the economic viability of complex technical systems [7], ignoring an authentic evaluation of technical viability that takes into account the requirements and preferences of users, which carries the risk of committing serious technical errors that can compromise the expected economic viability.
• The techno-economic models in the literature are eminently oriented towards the dynamics of the deployment of access networks promoted by manufacturers, vendors and operators, ignoring the perspective of end users.
• Since the 1990s, different projects with public funding have been developed within the framework of different European R&D programs with the aim of developing and evolving access networks, which have given rise to most of the existing literature regarding models of techno-economic evaluation of access technologies.
The review and analysis of the literature made by the author and presented in [3, 4], together with the wide variety of architectural modalities and technologies, the high CapEx and OpEx required for access networks, and the significant volume of scientific research on techno-economics supported by EU public funding, led to the conclusion that it was worthwhile to define the characteristics required of a universal and generalizable theoretical model for the techno-economic evaluation of access technologies, in order to develop a specific classification of the literature and detect areas for improvement. After defining the characteristics and detecting areas for improvement, the author proposes in [3, 4] a new techno-economic model called the Universal Techno-Economic Model (UTEM), which presents a higher degree of compliance than the models in the literature and achieves the dual main objective of his research work:
1. It defines a techno-economic model of universal, scalable, flexible and generalizable application that allows the evaluation and comparison of multiple access network technologies in different scenarios and use cases.
2. It develops a methodology for applying the techno-economic model to facilitate its use by different market agents, providing guidelines for the design of scenarios, application of the model and proper interpretation of the results obtained.
Given the complementary views presented in previous sections and their implications for TEA, it makes sense to continue this line of research and improve techno-economic assessment models so that they integrate and manage all this increasing complexity and contribute to agile decision-making for any market stakeholder.
6 Some Use Cases in Comms, Networking and Security
6.1 TEA Models for 5G and Beyond
There are numerous combinations of technologies that can compose a 5G and beyond solution: multiple radio access technologies (RATs) (e.g., 5G New Radio, LTE, WiFi, etc.), network virtualization, new generation antennas (massive multiple input multiple output, mMIMO), multiple frequency carrier bands, cloud, edge computing, multi-access edge computing (MEC), heterogeneous deployments (HetNets) with macro-cells and small cells (micro-cells, pico-cells, femto-cells), network slicing, and so on. There is also an increasing myriad of use cases for 5G and beyond, starting from eMBB (enhanced Mobile Broadband), URLLC (Ultra-Reliable Low-Latency Communications), and mMTC (massive Machine-Type Communications), providing different services in different industries:
• Industry 4.0: connected factories, AGVs, Digital Twins, Cloud Robotics, Video Edge Analytics, Drones control, etc.
• TV, Media & Events: 360º video for events, automatic production, corporate events broadcasting, etc.
• eHealth: remote surgery, VR/AR aided rehabilitation, 5G music therapy, etc.
• Intelligent Mobility: intelligent automotive, platooning, V2X, etc.
• Tourism & Entertainment: AR/VR, 360º reality, holographic reality, cloud gaming, simultaneous translation, etc.
• Education: real-time learning content distribution, AR/VR/360º reality, remote real-time assistance, etc.
• Operations, Security and Emergency: Mission Critical Push-To-Talk, Mission Critical Data/Video, 5G Enhanced positioning for Mission Critical services, Proximity Services (ProSe), remote real-time assistance, monitoring and control of operations, drones fleet management, aerial base stations deployment, etc.
5G enables AI application and digital transformation across all industries and organizations, and eventually the whole of our societies.
Many of these use cases require the deployment of 5G public networks and/or 5G private networks involving different market stakeholders (manufacturers, vendors, CSPs, telecom operators, MVNOs, cloud providers, customer enterprise and corporate organizations, retail companies, cities and town halls, regulators, and technological consultants).
Therefore, the different architectural flavors of 5G challenge industry stakeholders as technical complexity increases, and decision-makers need an effective and agile techno-economic assessment to choose the optimal solution. All techno-economic assessment models for 5G in the literature were developed for a small number of specific scenarios, as reviewed in reference [10]. They also lack the generalization capabilities needed to adapt them to different use cases and evolving 5G architectures. The models reviewed in [10] are not flexible enough to integrate new technical and economic parameters or the perspectives of stakeholders other than telecom operators, and they do not allow agile assessment through automation. Based on that review of the literature, reference [10] proposed the characteristics of a theoretical 5G techno-economic assessment reference model that considers all stakeholders' perspectives, the technical and economic feasibility of any current or future 5G architecture, and automation capabilities for agile assessment. It defines 15 characteristics, named C1–C15, to which the author adds one more (C16) in the present paper to consider the open-source availability of any TEA model for 5G, as suggested in reference [6]. Acronyms are included for easy identification in Table 1 (RT, NT, MB, MT, LL, IN, AN, SI, EC, DC, CR, CB, OT, MP, AU, and OS):
• C1. Assessment of both RAN and Transport Network scenarios (acronym RT).
• C2. Business viability: NPV and TCO (acronym NT).
• C3. Evaluation of the enhanced Mobile Broadband (eMBB) use case (acronym MB).
• C4. Evaluation of the massive machine-type communications (mMTC) use case (acronym MT).
• C5. Evaluation of the ultra-reliable low-latency communications (URLLC) use case (acronym LL).
• C6. Evaluation of Inspection KPIs (acronym IN).
• C7. Evaluation of Analytical KPIs (acronym AN).
• C8. Evaluation of Simulation KPIs (acronym SI).
• C9. The economic assessment provides OPEX, CAPEX, Revenues, and ARPU as output parameters (acronym EC).
• C10. Demand and Capacity assessment (acronym DC).
• C11. Evaluation of different degrees of centralization of RAN or CRAN scenarios (acronym CR).
• C12. Consideration of Cost-Benefit analysis for decision (acronym CB).
• C13. Overall Technical Performance (acronym OT).
• C14. Multi-perspective: including all stakeholders' perspectives, not only the mobile network operator deployment perspective (acronym MP).
• C15. Capabilities to Automate Assessment (acronym AU).
• C16. Open-Source availability of the TEA model (acronym OS).
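Characteristics C2, C9 and C12 rest on standard discounted-cash-flow arithmetic. The following is a minimal sketch of that arithmetic only, not the method of any reviewed model: the function names, the 10% discount rate and every monetary figure are hypothetical and chosen purely for illustration.

```python
def npv(rate, cash_flows):
    """Net present value; cash_flows[0] occurs now (year 0), the rest yearly."""
    return sum(cf / (1.0 + rate) ** year for year, cf in enumerate(cash_flows))


def tco(capex, yearly_opex):
    """Undiscounted total cost of ownership: up-front CAPEX plus all OPEX."""
    return capex + sum(yearly_opex)


# Hypothetical three-year deployment (arbitrary monetary units).
capex = 1000.0
opex = [200.0, 200.0, 200.0]
revenues = [600.0, 700.0, 800.0]

# Year 0 carries the investment; later years carry revenue minus OPEX.
net_flows = [-capex] + [r - o for r, o in zip(revenues, opex)]
project_npv = npv(0.10, net_flows)
project_tco = tco(capex, opex)

print(f"NPV at 10%: {project_npv:.2f}, TCO: {project_tco:.2f}")
```

A positive NPV at the chosen discount rate is the usual viability signal, while TCO lets alternatives with different CAPEX/OPEX splits be compared on cost alone.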
Table 1 presents the classification of the reviewed models in the literature regarding their compliance with these 16 characteristics of the theoretical reference model, showing that there is ample room for improvement as the most compliant techno-economic models in the literature (see blue bars in Fig. 1) present a 50% degree of compliance with the theoretical reference model in the literature.
At the same time, Fig. 1 and Table 1 (see last column) show a 93.8% degree of compliance for the UTEM model (in green), thus reducing the gap in the literature.

Table 1. Classification of Models in the Literature According to the Characteristics Established for the Theoretical Reference Techno-Economic Assessment Model for 5G [10]

Model | Total score | Compliance
Theoretical Model | 16 | 100%
UTEM Model (Bendicho, 2016) | 15 | 93.8%
Roblot et al., 2019 | 8 | 50%
Oughton et al., 2019 | 8 | 50%
Maternia et al., 2018 | 7 | 44%
mmMAGIC, 2017 | 7 | 44%
J.R. Martín et al., 2019 | 7 | 44%
Mesogiti et al., 2020 | 7 | 44%
Smail et al., 2017 | 6 | 38%
Bouras et al., 2019 | 6 | 38%
Neokosmidis et al., 2019 | 6 | 38%
Walia et al., 2017 | 6 | 38%
Yaghoubi et al., 2018 | 6 | 38%
Bouras et al., 2015 | 5 | 31%
Bouras et al., 2016 | 5 | 31%
Kolydakis & Tomkos, 2014 | 4 | 25%
Nikolikj & Janevski, 2014 | 4 | 25%
Kusuma & Suryanegara, 2019 | 4 | 25%
Bouras et al., 2018 | 3 | 19%
Arévalo et al., 2018 | 3 | 19%

[The per-characteristic marks (columns C1 RT to C16 OS) and the heat-map row of column totals are not legible in this copy.]
As shown in the last row (heat map) of Table 1, the gap in the literature is more pronounced for C6: Evaluation of Inspection KPIs; C7: Evaluation of Analytical KPIs; C13: Overall Technical Performance; C14: Multi-perspective; C15: Capabilities to Automate Assessment (AU); and C16: Open Source Availability of the TEA model (OS).
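The compliance percentages in Table 1 follow from a simple count: each characteristic a model meets scores 1, and the total is divided by 16. The sketch below illustrates that count; the function name `compliance` is hypothetical, and the UTEM example relies only on the statements in this section that the model scores 15 and does not meet C16.

```python
# Acronyms of the 16 characteristics C1..C16, in the order given in the text.
CHARACTERISTICS = ["RT", "NT", "MB", "MT", "LL", "IN", "AN", "SI",
                   "EC", "DC", "CR", "CB", "OT", "MP", "AU", "OS"]


def compliance(met):
    """Return (score, percent) for the set of characteristic acronyms a model meets."""
    met = set(met)
    unknown = met - set(CHARACTERISTICS)
    if unknown:
        raise ValueError(f"unknown acronyms: {sorted(unknown)}")
    score = len(met)
    return score, 100.0 * score / len(CHARACTERISTICS)


# UTEM: every characteristic except open-source availability (OS).
utem_score, utem_pct = compliance(set(CHARACTERISTICS) - {"OS"})

# A model meeting half of the characteristics reaches 50%, the best score
# among the other reviewed models.
half_score, half_pct = compliance(CHARACTERISTICS[:8])
```

Here `utem_pct` is 93.75, which the paper reports rounded as 93.8%.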
[Bar chart: degree of compliance per model, from 100% (theoretical model) and 93.8% (UTEM) down to 19%, matching the last column of Table 1.]
Fig. 1. Ranking of TEA Models for 5G Considering their Degree of Compliance with the Characteristics of the Theoretical Reference TEA Model for 5G [10]
Regarding C16, it must be stated that the UTEM model does not consider open-source availability, as a planned technology transfer process to industry is under way.

6.2 TEA Models for SASE, SD-WAN, SSE

The advent of software-defined networking (SDN) and network function virtualization (NFV), and more recently of secure access service edge (SASE) and security service edge (SSE), the latter driven by the spread of work-from-home (WFH) and hybrid work models during the SARS-CoV-2 pandemic, has created a myriad of wide-area network (WAN) vendor solutions based on combinations of these technologies. The industry analyst firm Gartner coined the SASE concept in 2019 [11], including these five core networking and security functions: SD-WAN (Software-Defined Wide Area Network, with separate data and control planes, QoS-based traffic prioritization, centralized security policies deployable with a few clicks from a central orchestrator control panel and dashboard, and NFV functions deployable across the whole network on CPEs or in the cloud, integrating headquarters, branches, datacenters, and virtual private and public clouds), Secure Web Gateway (SWG), Cloud Access Security Broker (CASB), Zero-Trust Network Access (ZTNA), and Firewall-as-a-Service (FWaaS), with the ability to identify sensitive data (DLP) or malware, decrypt content at line speed, and monitor sessions continuously for risk and trust levels.
Several factors increase technical complexity and multiply the alternative solutions open to any interested organization. The first is the coexistence of legacy WAN portions based on Multi-Protocol Label Switching (MPLS) with a progressive migration to SD-WAN or SASE (full-stack vendors or multi-vendor disaggregated options) with SSE integration. The second is the choice of acquisition model: do-it-yourself (DIY, CapEx-intensive) versus a managed-service subscription (OpEx-only), the latter offered either by a generalist provider such as a telecom operator or by a systems integrator with deep sector expertise (e.g., manufacturing, retail, financial) or regional/national expertise. Hence, decision-makers need sound and agile techno-economic assessment models to choose the optimal solution. However, a review of the literature shows a lack of techno-economic studies and TEA models oriented to this domain, except for a few papers (e.g., references [12] and [13]) that touch on the topic. Considering that WAN solutions include different access network technologies for the different sites to be connected (headquarters, branches, datacenters, private and public clouds), it is possible to take TEA models for access network technologies and leverage their extensibility to include additional technical and economic parameters related to the networking and security WAN domain (SD-WAN, SASE and/or SSE), as well as additional parameters for the techno-economic implications of the complementary approaches discussed in Section IV (Continuous Testing, Blockchain-based decentralized networks, Network as a Platform, and Carbon price). Table 2 shows the ranking of different TEA models for access network technologies according to their degree of compliance with the characteristics defined for a theoretical universal TEA model [4].
Among those characteristics is "Extensibility and Flexibility", defined as the ease of adding new technical and economic input and output parameters. Table 2 shows a significant gap in the literature for this characteristic, as only the UTEM model presents a degree of compliance higher than 0. Therefore, there is an urgent need to bridge this gap in order to develop TEA models for the SD-WAN, SASE and/or SSE domain that satisfy market needs.

6.3 Cloud Cybersecurity Risk Assessment Use Case

The applicability of TEA to cloud cybersecurity risk assessment (RA) is discussed in reference [14]. A review of the literature shows that cloud cybersecurity RA approaches that leverage the Cloud Security Alliance (CSA) STAR Registry and consider an organization's security requirements present a higher degree of compliance with the defined reference model, but they still lack economic quantification of risk, an aspect that can be improved by using appropriate TEA models.
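The text does not say how risk should be quantified economically. One widely used textbook formulation, shown here only as an illustration of what such quantification can look like, is annualized loss expectancy (ALE = SLE × ARO). The function names and all figures below are hypothetical and are not taken from reference [14].

```python
def single_loss_expectancy(asset_value, exposure_factor):
    """SLE: expected loss from one incident (exposure_factor between 0 and 1)."""
    if not 0.0 <= exposure_factor <= 1.0:
        raise ValueError("exposure_factor must be between 0 and 1")
    return asset_value * exposure_factor


def annualized_loss_expectancy(asset_value, exposure_factor, annual_rate):
    """ALE = SLE x ARO, where ARO is the expected number of incidents per year."""
    return single_loss_expectancy(asset_value, exposure_factor) * annual_rate


# Hypothetical cloud workload worth 500,000 that loses 40% of its value per
# incident, with one incident expected every four years (ARO = 0.25).
ale_before = annualized_loss_expectancy(500_000, 0.40, 0.25)

# Hypothetical control that halves the incident rate and costs 15,000 a year.
ale_after = annualized_loss_expectancy(500_000, 0.40, 0.125)
control_cost = 15_000
net_benefit = (ale_before - ale_after) - control_cost
```

A positive `net_benefit` means the control reduces expected annual loss by more than it costs, which is the kind of figure a techno-economic model can feed into a cost-benefit comparison.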
Table 2. Ranking of TEA Models for Access Network Technologies Regarding their Degree of Compliance with the Characteristics of the Theoretical Reference TEA Model [4]

Each model is scored from 0 to 100 on eleven characteristics: Predictive Ability; Ability to integrate with other models; Technical and Economic Comparability; Flexibility and Extensibility; Technical and Economic Universality; Geographic universality; Universality in user orientation; Universality in incorporating "micro" and "macro" approaches; Orientation to User of the Model Requirements; Multiaccess Universality; Universality in combination of access technologies.

Model | Total score | Compliance
Maximum possible score | 1100 | 100%
Author's UTEM model, Bendicho (2016) | 1009 | 92%
Shahid & Machuca (2017) | 684 | 62%
Pereira & Ferreira (2009) | 617 | 56%
Pereira (2007) | 592 | 54%
Olsen et al., ECOSYS (2006) | 551 | 50%
Monath et al., MUSE (2005) | 551 | 50%
Oughton et al. (2019) | 543 | 49%
Feijoo et al., RURAL (2011) | 526 | 48%
Vergara et al., Model COSTA (2010) | 509 | 46%
Olsen et al., TITAN (1996) | 468 | 43%
Jankovich et al., EURESCOM (2000) | 476 | 43%
Smura, WiMAX only, TONIC & ECOSYS (2005) | 493 | 45%
Zagar et al. (rural broadband in Croatia) (2010) | 476 | 43%
Pecur, FIWI (2013) | 467 | 42%
Martín et al., HFC only (2011) | 418 | 38%
Van der Wee et al., FTTH only, OASE (2012) | 418 | 38%
Van der Merwe et al., FTTH only (2009) | 410 | 37%

[The per-characteristic scores and the row of column totals are not fully legible in this copy.]
7 TEA in Request for Proposals (RFP) Processes

Publicly funded economic recovery instruments worldwide, such as the $1.5+ trillion in the U.S. (Build Back Better) or the €1.8+ trillion in the EU (Next Generation EU and Multiannual Financial Framework), require tremendous effort from public servants to prepare, publish and assess requests for proposals (RFPs). In late June 2022, nearly a quarter of the way through the 2021–2026 period for Recovery and Resilience Plan (RRP) grants and loans to EU countries, less than 14% of the budget had been executed [15]. In mid-October 2022, 30.5% of the way through the 2021–2026 period, only one country (Spain) had reached 44.65% budget execution, after receiving its second payment from the EU in late July 2022. France had reached 31.8%, Greece 24.68%, Italy 23.97%, Portugal 20%, and Slovakia 19.3%, while the rest (15 applicant countries) were still at the 13% budget pre-financing milestone [15]. As RFP preparation and assessment are still largely manual, there is an urgent need to systematize, automate and industrialize them by incorporating agile TEA tools to minimize current bottlenecks. The authors will also conduct future research work in this direction.
8 Extension of TEA Application to Other Industries

Some TEA models for Comms can also be extended to other domains and industries, such as optimal process design in energy, chemicals, biotechnology and bioengineering, which are key to sustainability and to achieving the United Nations Sustainable Development Goals (SDGs). The challenges the European Union must face on the way to a net-zero Europe by 2050 include reducing emissions by 55% by 2030 [19]. This goal requires reducing emissions mainly in power, transportation, buildings, industry and agriculture by investing about €28 trillion in clean technologies and techniques over the next 30 years [19]. The sheer magnitude of the aggregate investment required for this energy transition, together with the variety of possible architectural and technology pathways to the net-zero goal, makes it necessary to apply agile and effective techno-economic assessment models in these domains and industries. Therefore, the authors will also conduct future research work in this direction, considering both the solution deployment perspective and end users' needs in order to choose the optimal solution.
9 Conclusions and Future Work

This paper has presented a brief history of Techno-Economic Assessment (TEA) in communications (Comms), summarizing the review of the literature. It has also introduced the redefinition of TEA proposed by the author as a conclusion drawn from that review of the state of the art. This article has also identified the new challenges and implications for TEA derived from the evolving technological and market context, considering cloud-native virtualized networks, the Helium Network and similar blockchain-based decentralized networks, the new network as a platform (NaaP) paradigm, carbon pricing, network sharing, as well as the metaverse, web3 and blockchain technologies. The authors have formulated the research question and shown the need to improve and develop TEA models capable of integrating and managing all of this increasing complexity. This paper has also shown the characteristics TEA models should have and their current degree of compliance for several use cases: 5G and beyond, software-defined wide area network (SD-WAN), secure access service edge (SASE), security service edge (SSE), and cloud cybersecurity risk assessment. For the SD-WAN, SASE and SSE use cases, the authors have identified a gap in the literature regarding specific TEA models for this domain and proposed the use of TEA models developed for access network technologies. The authors have also outlined some challenges concerning request for proposal (RFP) processes and proposed extending TEA to that domain and to other industries, including its possible key contribution to net-zero goals, given the scale of the investments required. Therefore, there is an urgent need for agile and effective TEA in communications that allows the industrialization of agile decision-making, so that all market stakeholders can choose the optimal solution for any technology, scenario and use case. Moreover, techno-economic assessment can help societies face transformation with more accurate and agile decision-making tools for all stakeholders in different domains and industries, so it is key to develop research lines focused on extending the most compliant TEA models in communications to other domains and industries.
As shown in this paper, there is ample room to improve TEA models in communications so that they integrate complementary views, respond to the challenges described herein, and perform better in different use cases (5G and beyond, SD-WAN, SASE, SSE, cybersecurity risk assessment, and so on), always considering all stakeholders' perspectives. The authors hope their research in this direction will contribute to the general adoption of techno-economic assessment in Comms, in the same way that assessment for cloud adoption and migration is included in the current portfolios of service providers and consulting companies.
References

1. Managed Network Services Market. The Brainy Insights (2022). https://www.prnewswire.com/news-releases/managed-network-services-market-size-worth-101-54-billion-by-2030-the-brainy-insights-301487277.html. Accessed 21 June 2022
2. GSMA Intelligence. The 5G era for operators: investing in core networks, capturing B2B opportunities. GSMA Association (2020)
3. Bendicho, C.: Model for Techno-Economic Assessment of Access Network Technologies. Ph.D. dissertation, Bilbao School of Engineering, University of the Basque Country, Bilbao (2016)
4. Bendicho, C.: Techno-economic assessment in communications: models for access network technologies. In: Arai, K. (ed.) FICC 2021. AISC, vol. 1363, pp. 315–341. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-73100-7_24
5. Reed, D., Sirbu, M.A.: An optimal investment strategy model for fiber to the home. J. Lightwave Technol. 7(11), 1868–1875 (1989)
6. Oughton, E.J., Katsaros, K., Entezami, F., Kaleshi, D., Crowcroft, J.: An open-source techno-economic assessment framework for 5G deployment. IEEE Access 7, 155930–155940 (2019). https://doi.org/10.1109/ACCESS.2019.2949460
7. Smura, T.: Techno-Economic Modelling of Wireless Network and Industry Architecture. Ph.D. dissertation, 23/2012, Aalto University Publication Series, 2012. Aalto University School of Science and Technology, Finland. ISBN 978-952-60-4525-2
8. IntOps, the new model to unlock the full potential of 5G: A Spirent and Deutsche Telekom Vision Paper. Spirent & Deutsche Telekom (2022). https://www.spirent.com/assets/whitepaper-intops-the-new-model-to-unlock-the-full-potential-of-5g. Accessed 15 June 2022
9. Haleem, A., et al.: Helium: A Decentralized Wireless Network. Helium Systems, Inc., Release 0.4.2 (2018). http://whitepaper.helium.com. Accessed 13 June 2022
10. Bendicho, C.: Techno-economic assessment models for 5G. In: Proceedings of The 4th International Conference on Modern Research in Science, Engineering and Technology (2021). https://doi.org/10.33422/4th.msetconf.2021.03.01
11. Orans, L., Skorupa, J., MacDonald, N.: The Future of Network Security is in the Cloud. Gartner (2019)
12. Andromeda, S., Gunawan, D.: Techno-economic analysis from implementing SD-WAN with 4G/LTE, a case study in XYZ company. In: 2020 International Seminar on Intelligent Technology and Its Applications (ISITIA), IEEE, pp. 345-531 (2020)
13. Asif, R., Ghanem, K.: AI secured SD-WAN architecture as a latency critical IoT enabler for 5G and beyond communications. In: 2021 IEEE 18th Annual Consumer Communications & Networking Conference (CCNC), pp. 1–6 (2021). https://doi.org/10.1109/CCNC49032.2021.9369477
14. Bendicho, C.: Cyber security in cloud: risk assessment models. In: Arai, K. (ed.) Intelligent Computing. LNNS, vol. 283, pp. 471–482. Springer, Cham (2022). https://doi.org/10.1007/978-3-030-80119-9_28
15. European Commission. Recovery and Resilience Scoreboard. https://ec.europa.eu/economy_finance/recovery-and-resilience-scoreboard/disbursements.html?lang=en. Accessed 30 June 2022 and 15 Oct 2022
16. McKinsey & Company. What is the Metaverse (2022). https://www.mckinsey.com/featured-insights/mckinsey-explainers/what-is-the-metaverse. Accessed 17 Aug 2022
17. McKinsey & Company. Probing Reality and Myth in the Metaverse (2022). https://www.mckinsey.com/industries/retail/our-insights/probing-reality-and-myth-in-the-metaverse. Accessed 13 June 2022
18. Latah, M., Kalkan, K.: When SDN and blockchain shake hands. Commun. ACM 65(9), 68–78 (2022). https://doi.org/10.1145/3500920
19. McKinsey & Company. Net-Zero Europe (2020)
A Low-Cost Thermal Imaging Device for Monitoring Electronic Systems Remotely

Jack Humphreys and Emanuele Lindo Secco

School of Mathematics, Computer Science & Engineering, Liverpool Hope University, Liverpool L16 9JD, UK
{19002075,seccoe}@hope.ac.uk
Abstract. This work aims to provide a low-cost solution that produces good-quality thermal images which can be monitored remotely. The main application of such a system is the monitoring of electronic systems or installations, as most industrial fires are caused by electronic or electrical systems. The paper presents the design and integration of the system, as well as a set of preliminary validation tests. The device was developed by combining a low-cost Raspberry Pi board and a Pimoroni MLX90640 thermal camera with software written in C++ and Python. Multiple tests were then performed to validate the system's accuracy. The temperature of an everyday hot object was measured over time and compared with the measurements of a commercial IR gun (Etekcity Lasergrip 1080). Tests were performed indoors and outdoors to check the effect of environmental noise. The overall results show an average difference of ±1.9 °C.

Keywords: Electronics · Thermal Imaging · Raspberry Pi · Remote viewing · Servo
1 Introduction

Currently, devices that provide the user with an accurate video feed of temperature and other forms of data are expensive and not always easy to obtain. Yet many occupations, hobbies and DIY jobs need this information.

The Function of Thermal Imaging

A thermal image is produced from infra-red radiation, unlike normal images and cameras, which use visible light in a similar way to the human eye. For context, the human eye works by detecting visible light that has been reflected off an object and turning it into an image. A thermal imaging device works in a slightly different way, forming an image from heat instead of visible light. Many factors can affect the infra-red image produced. Two of the main ones are radiation from the object and radiation emitted by the atmosphere itself. Radiation from an object is noticeable by the color red. Normally, humans have a very hard time perceiving infra-red light and can only do so under very specific circumstances, which meant that soon after infra-red was discovered there was a desire to create a device capable of making visible the light emitted by warm objects.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023. K. Arai (Ed.): SAI 2023, LNNS 739, pp. 151–161, 2023. https://doi.org/10.1007/978-3-031-37963-5_12

History of Thermal Imaging: 1800s–2000s

To properly understand infrared and thermal imaging technology, it helps to look at how the topic has developed over the years. This history shows many of the core features and discoveries that led to today's level of thermal imaging.
1800 - British astronomer Sir William Herschel first discovered infrared radiation in 1800. He made the discovery by directing sunlight through a prism, which displayed the visible spectrum of colors from blue to red.
1900 - In 1900 Max Planck produced the formula known as "Planck's radiation law". This formula explains the energy distribution of the radiation emitted by a blackbody.
1929 - The next major advance came in 1929, when Kalman Tihanyi invented the first infrared-sensitive camera. This was an electronic television camera for anti-aircraft defence and was made in Britain.
1956 - However, the first conventional infrared camera, called the Evaporograph, was not developed until 1956.
1963 - The first Forward Looking Infrared (FLIR) system was created by Texas Instruments in 1963. This is incredibly useful as it allows an infrared image to be created without the need to scan with a moving sensor.
1969 - AT&T Bell Labs created the first charge-coupled device (CCD) in 1969. This was revolutionary for cameras in general and would eventually be used to create the first ever digital camera. Figure 2 is an image of the first ever CCD.
1970 - The first naval thermal imager was created in 1970, made possible by the invention of the pyroelectric tube.
1978 - After the first FLIR device was created in 1963, the organization called FLIR was established in 1978. To this day people are still buying its cameras, as the advances it made in this technology were very important.
As the name suggests, the company built FLIR technology into its cameras.
1980s - One of the most common pieces of technology used for thermal imaging is the microbolometer, although it comes at a cost in the thousands. Microbolometer technology was first made in 1980 by Honeywell and is used as the detector of infrared radiation. When the radiation strikes the detector material, the material is heated by an amount that depends on the wavelength, which in turn changes its electrical resistance.
Early 2000s - The very high cost of thermal cameras began to decrease in the early 2000s as the technology was optimized. By 2006 it had become much more accessible to businesses willing to invest.

Application of Thermal Imaging

Thermal imaging can be applied to a wide range of scenarios and industries, as temperature is a very useful piece of data. It is therefore worth reviewing some of the best-known published work to understand the potential of thermal imaging.
a. Clinical Thermography - The healthcare industry already makes use of different forms of radiation to provide patients with an accurate diagnosis, for example X-ray and computed tomography (CT) scans. Thermal imaging is being used increasingly in medicine because many conditions cause changes of temperature in the body.
b. Food Quality and Agriculture - One use of thermal imaging in food is the detection of foreign bodies. For example, food and a foreign body cool at different rates, so thermal images can be used to distinguish the two.
c. Monitoring Electronic/Electrical and Mechanical Systems - Industrial systems often have many different components, which can be electrical, mechanical or of several other kinds. Given the large number of systems in use, faults and problems are bound to occur at some point. Many different techniques are therefore needed to prevent or detect a problem as soon as possible, and thermal imaging is one of the options.
d. Military and Emergency Services - Thermal imaging technology was originally developed for military applications, such as identifying and tracking enemies at night or in low-light conditions. Some devices can even detect a human presence twelve miles away.

Against this background and range of applications, this study aims to deliver a low-cost device for automatic temperature detection for use in real-world applications. Moreover, this work aims to be a practical solution that allows the user either to connect remotely to the device or to attach a display for a more portable solution. Remote viewing is important if the device has to be left monitoring an area that humans cannot enter without proper protection. In the first instance, however, the focus is on a system that is stationary and that monitors and processes different forms of data related to thermal imaging.
2 Methodology & Research Design

This section details how the research was carried out, covering the design and development process. It sets out the hardware and software requirements, together with diagrams of the system layout and the choices made to reach a good standard. The methodology describes how the development of the project progressed step by step. The prototype system is described and the reasoning behind it explained. All new code produced is commented and tested. Furthermore, the device was tested in different locations and at differing temperatures to establish how accurate and useful the collected data is.

2.1 Hardware

The proposed system consists of an embedded computer (namely a Raspberry Pi board), a thermal camera and a servo motor that orients the camera (Fig. 1). The main board is a Raspberry Pi 4B with 2 GB of RAM. Similar boards such as the Pi 3, or a board with less RAM, could also be used. However, the Pi 4 currently has the highest processing power and allows the generated image to be as good as possible on a Pi. A starter kit was chosen from ThePiHut. The pack includes a Pi 4, a 32 GB Class 10 A1 microSD card, a Pi 4 case, a micro-HDMI cable, and a USB-C Pi power supply. The HDMI cable is not necessary for the final product, as the device will be operated remotely; however, it is needed for the development and debugging phase.
Fig. 1. System Design
Several other developers sell versions of the MLX90640, but the Pimoroni camera seemed the best option because its pin layout maps very well onto the pins of the Raspberry Pi. Otherwise they are all similar, with an array of 768 thermal sensors, equal to 32 × 24 pixels [1]. The sensor can detect temperatures from −40 to 300 °C with an accuracy of ±1 °C. These are good figures for the price and, judging by projects created by other users, should be sufficient to produce the desired image. If the device needed to be tailored more towards security and surveillance, there is a version with a wider field of view of 110°. The device used for this project is the 55° version, as it should perform slightly better for the required application.

A micro servo motor is an optional extra. It makes it possible to redirect the thermal camera when, for example, observing and monitoring the temperature of an external plant. The idea is to add on-screen buttons to the application that let the user control the motor remotely, so that the camera can turn left and right. If this works, it is a satisfactory proof of concept, and another axis could be added in the future for increased control.

Naturally, the easiest way to view the camera remotely is with a mobile phone. Any phone with access to Wi-Fi or a mobile network will most likely be capable of running the remote viewer software.
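The 768-value frame described above has to be arranged into a 32 × 24 grid before it can be displayed. The sketch below shows that step on a synthetic frame, without any camera attached. It assumes the values arrive as a flat list in row-major order; the helper names are hypothetical and are not part of the Pimoroni or Melexis libraries.

```python
WIDTH, HEIGHT = 32, 24   # MLX90640 array size: 32 x 24 = 768 pixels


def to_rows(frame):
    """Split a flat list of 768 temperatures into 24 rows of 32 (row-major assumed)."""
    if len(frame) != WIDTH * HEIGHT:
        raise ValueError(f"expected {WIDTH * HEIGHT} values, got {len(frame)}")
    return [frame[r * WIDTH:(r + 1) * WIDTH] for r in range(HEIGHT)]


def hottest_pixel(frame):
    """Return (row, column, temperature) of the warmest pixel in a flat frame."""
    index = max(range(len(frame)), key=lambda i: frame[i])
    return index // WIDTH, index % WIDTH, frame[index]


def to_gray(frame):
    """Scale temperatures linearly to 0..255 so the frame can be shown as an image."""
    lo, hi = min(frame), max(frame)
    if hi == lo:                       # uniform scene: avoid division by zero
        return [0] * len(frame)
    return [round(255 * (t - lo) / (hi - lo)) for t in frame]


# Synthetic frame: 22 °C background with one 47.5 °C hot spot at row 5, column 10.
frame = [22.0] * (WIDTH * HEIGHT)
frame[5 * WIDTH + 10] = 47.5

rows = to_rows(frame)
spot = hottest_pixel(frame)
gray = to_gray(frame)
```

On the real device the flat list would come from the camera library, and the grey levels would normally be passed through a color map and upscaled before display.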
2.2 Software

The operating system recommended for the Raspberry Pi is Pi OS, a Linux-based system that works very well for almost any task that can be completed on a normal desktop OS. A few online sources were used to find the best way to set everything up properly [2, 3]. Pimoroni provides its own Python library, heavily adapted from the Melexis library, along with examples that allow programmers to develop their own applications [1]. This library is essential for making the camera function, although libraries by other developers are also available. Geany is an IDE and text editor that comes pre-installed with Pi OS. It can compile C++ and run Python code, so it can be used for this project. C++ is used for generating the image, and the user interface is created in Python [4–7]. VNC Viewer [8] is another application pre-installed with Pi OS, and it allows the device to be viewed remotely. From prior experience, VNC Viewer is very easy to set up and has very little input delay if the wireless connection is stable, so it is the obvious choice for this project. A few other libraries are also needed, as recommended by Pimoroni, to use the full potential of the camera, along with some libraries for the Python GUI and GPIO control:
• C++: Linux I2C dev library, Bcm2835 library, Linux SDL2 dev library
• Python: Tkinter Python interface library, RPi.GPIO library, time library [7].

2.3 Hardware Design Layout and Test Setup

The MLX90640 is connected to pins 1, 3, 5 and 9 of the Raspberry Pi board. In this order, the camera connectors are wired to the 3V3 Power, GPIO 2 (SDA), GPIO 3 (SCL) and Ground pins. Similarly, the wires of the servo are connected to the board according to the color codes provided by the manufacturer. The same color code is also reported in Fig. 1 for convenience.
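The on-screen left/right buttons described in Sect. 2.1 ultimately have to turn a target angle into a PWM duty cycle on the servo's signal pin. The sketch below shows that conversion. The 50 Hz frame rate and the 1–2 ms pulse range are typical hobby-servo values assumed here, not figures from the paper, and should be checked against the servo's datasheet. The function names are hypothetical; `move_servo` uses the RPi.GPIO library listed in Sect. 2.2 and only runs on a Raspberry Pi, with GPIO12 (BCM numbering) as its default pin.

```python
PWM_FREQUENCY_HZ = 50    # assumed: typical hobby-servo frame rate (20 ms period)
MIN_PULSE_MS = 1.0       # assumed pulse width at 0 degrees
MAX_PULSE_MS = 2.0       # assumed pulse width at 180 degrees


def angle_to_duty_cycle(angle_deg):
    """Convert a servo angle (0..180 degrees) to a PWM duty cycle in percent."""
    angle = max(0.0, min(180.0, float(angle_deg)))     # clamp to the valid range
    pulse_ms = MIN_PULSE_MS + (MAX_PULSE_MS - MIN_PULSE_MS) * angle / 180.0
    period_ms = 1000.0 / PWM_FREQUENCY_HZ
    return 100.0 * pulse_ms / period_ms


def move_servo(angle_deg, pin=12):
    """Move the servo on a Raspberry Pi; RPi.GPIO is imported lazily so the
    conversion above can be used and tested on any machine."""
    import time
    import RPi.GPIO as GPIO

    GPIO.setmode(GPIO.BCM)
    GPIO.setup(pin, GPIO.OUT)
    pwm = GPIO.PWM(pin, PWM_FREQUENCY_HZ)
    pwm.start(angle_to_duty_cycle(angle_deg))
    time.sleep(0.5)          # give the servo time to reach the position
    pwm.stop()
    GPIO.cleanup(pin)


centre_duty = angle_to_duty_cycle(90)    # mid-position of the camera
```

A "left" or "right" button handler would then call `move_servo` with the current angle decreased or increased by a fixed step.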
A 1 k resistor is added to the servos Pulse Width Modulation (PWM) wire in order to limit the current from the GPIO12 pin and protect the system in case connection mistakes may occur. The power supply then connects to the USB-C port. Figure 2 shows the overall integrated system, according to the scheme of Fig. 1. Since the Pi 4 has a built-in dual band Wi-Fi (IEEE 802.11), the system is also inherently able to wirelessly connect to other devices such as, for example, mobile phones; similarly, a desktop PC can be remotely streamed without the use of any physical connections such as a HDMI cable as portrayed in Fig. 1. 2.4 Testing Procedure The first test will be completed by measuring the temperature of a cup of coffee over the period of 2 min with a sampling rate of 10 s. Figure 3 shows the experimental set-up. The results of this test are reported in Table 1. This will be useful to prove whether the device is accurate or not. Hopefully, the test will be long enough to produce a substantial enough difference every ten seconds so that a range of temperatures can be tested. If the
Fig. 2. System Integration Combining the Low-Cost MLX90640 Thermal Camera (1) and ServoMotor (2) with the Raspberry Pi and Wi-Fi Communication Module (3).
difference is negligible then the time limit can simply be extended and more readings can be taken. It is worth noting that the MLX90640 datasheet claims an accuracy of ±1 °C, whilst the IR gun has an accuracy of ±2 °C, so a small difference within that range would not be a surprise.
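The agreement criterion implied by these two accuracy figures can be sketched as follows: two readings are treated as consistent when their tolerance bands intersect, which is equivalent to the gap between them being no larger than the sum of the two tolerances (3 °C here). The function name and the sample readings are hypothetical and serve only as an illustration.

```python
def tolerance_bands_overlap(reading_a, tol_a, reading_b, tol_b):
    """Return True if [a - tol_a, a + tol_a] and [b - tol_b, b + tol_b] intersect."""
    return abs(reading_a - reading_b) <= tol_a + tol_b


# Illustrative readings only (not taken from the measurements).
# Camera at ±1 °C, IR gun at ±2 °C.
print(tolerance_bands_overlap(47.0, 1.0, 49.5, 2.0))  # gap of 2.5 °C -> True
print(tolerance_bands_overlap(47.0, 1.0, 50.5, 2.0))  # gap of 3.5 °C -> False
```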
Fig. 3. Experimental Setting of the Coffee Cup Indoor Test with the Thermal IR Gun, Etekcity Lasergrip 1080 (1) and the Low-Cost Raspberry Pi-Based System with the MLX90640 Thermal Camera (2).
The test will take place twice, with one done outside and the other inside. Therefore, a temperature difference column is important to remove any initial difference between both tests allowing it to be as fair as possible. If the device is working correctly then both tables should follow a similar pattern in relation to the indoor and outdoor environmental temperature difference.

Table 1. Test 1 Results
| t [s] | MLX90640 [°C] | Etekcity Lasergrip 1080 [°C] | Difference [°C] |
|---|---|---|---|
| 0 | 47.5 | 49.7 | 2.2 |
| 10 | 47.3 | 49.5 | 2.2 |
| 20 | 47.1 | 49.3 | 2.2 |
| 30 | 46.9 | 49.1 | 2.2 |
| 40 | 46.9 | 49.1 | 2.2 |
| 50 | 46.8 | 49.1 | 2.3 |
| 60 | 46.9 | 49.0 | 2.1 |
| 70 | 46.7 | 48.9 | 2.2 |
| 80 | 46.6 | 48.8 | 2.2 |
| 90 | 46.7 | 48.8 | 2.1 |
| 100 | 46.2 | 48.7 | 2.5 |
| 110 | 46.4 | 48.6 | 2.2 |
| 120 | 46.3 | 48.4 | 2.1 |
| 130 | 46.5 | 48.4 | 1.9 |
| 140 | 46.2 | 48.3 | 2.1 |
| 150 | 46.1 | 48.1 | 2.0 |
| 160 | 45.9 | 47.9 | 2.0 |
| 170 | 45.8 | 47.8 | 2.0 |
| 180 | 45.7 | 47.5 | 2.2 |
Ideally, there will be no real difference between the two tests apart from the rate at which the temperature decreases.
3 Results
The results of the first successful temperature test are shown in Fig. 4. The temperatures recorded by the Raspberry Pi can be seen along the red line and the temperatures recorded by the IR gun are indicated by the blue line. The ambient temperature was 23.5 °C at the time; this will be useful for comparison with the results from the second temperature test.
Fig. 4. Results of Test 1.
Furthermore, from the data collected it can be assumed that the device developed on the Raspberry Pi is functioning properly under the test conditions. There is a consistent gap between both devices of at least 1.9 °C. Whilst this seems like quite a large margin, it is not a failure: once the margin of error is taken into account, the temperatures always overlap, as displayed in Fig. 4. Both devices come with a margin of error provided by the manufacturer, which is useful for determining whether the results are accurate or not. Since the temperature difference is so consistent, this seems the most likely cause of the gap. However, the temperature detected by the Pi showed several small drops and increases, whereas the IR gun reading only decreased throughout the test. The first example appears at around 50 s, when the temperature drops to 46.8 °C but then increases 10 s later to 46.9 °C. This occurred four times during the test, with the worst example being a 0.2 °C increase. This is not a major problem, as the difference is not greater than 1 °C, and it is probably caused by the large amount of data that the camera detects in comparison with the IR gun. Constantly detecting a 32 × 24 pixel area will result in some anomalies or missed readings. Furthermore, even though the code provided by Pimoroni adapts in case of dead pixels, this could be a potential cause of the problem.
The second test took place under the same conditions, the only difference being the environment in which the results were recorded. The ambient temperature at the time was around 13 °C, a difference of 10.5 °C from the previous test. This is evident in the results, which behaved much the same as in test one; as expected, the rate of temperature change increased. However, the average temperature difference between the two devices also increased.
The average difference in the first test was about 2.15 °C, whereas in the second test it was around 2.27 °C. This is not too worrying, as the two values are not vastly different, but it could become a problem if the device is used in harsher conditions. However, the problem may not lie solely with the Pi and could be caused by the IR gun. Overall, both tests were successful and showcased the effectiveness of the device as the readings were within an
Fig. 5. The Outcome of the GUI and SDLSCALE Program which Allow the Visualization of the Thermal Image.
acceptable level. With further refinement and understanding of the software, the problems shown could be mitigated in future development. Figure 5 displays a successful implementation of the objectives, as the temperatures can clearly be seen by the user and the motor moves when the buttons are pressed. Figure 5 also shows what the configuration of the Raspberry Pi system looks like up close. This thermal image was produced by following the instructions provided with the Pimoroni MLX90640 library, which was adapted from the Melexis MLX90640 library. The example program SDLSCALE has been modified to allow the data to be exported for the creation of the GUI.
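The summary figures quoted for the first test can be recomputed from Table 1. The sketch below is only an illustration and not the authors' analysis script; the three lists are transcribed from Table 1, the difference column is used as tabulated, and the helper name rises is hypothetical.

```python
from statistics import mean

# Readings transcribed from Table 1 (t = 0 s to 180 s in 10 s steps), in °C.
MLX90640 = [47.5, 47.3, 47.1, 46.9, 46.9, 46.8, 46.9, 46.7, 46.6, 46.7,
            46.2, 46.4, 46.3, 46.5, 46.2, 46.1, 45.9, 45.8, 45.7]
IR_GUN = [49.7, 49.5, 49.3, 49.1, 49.1, 49.1, 49.0, 48.9, 48.8, 48.8,
          48.7, 48.6, 48.4, 48.4, 48.3, 48.1, 47.9, 47.8, 47.5]
# Difference column exactly as tabulated in Table 1.
DIFFERENCE = [2.2, 2.2, 2.2, 2.2, 2.2, 2.3, 2.1, 2.2, 2.2, 2.1,
              2.5, 2.2, 2.1, 1.9, 2.1, 2.0, 2.0, 2.0, 2.2]


def rises(series):
    """Return every positive step between consecutive readings, rounded to 0.1 °C."""
    return [round(b - a, 1) for a, b in zip(series, series[1:]) if b > a]


print("mean difference:", round(mean(DIFFERENCE), 2))
print("smallest difference:", min(DIFFERENCE))
print("camera rises:", rises(MLX90640))
print("IR gun rises:", rises(IR_GUN))
```

Run on the tabulated values, this gives a mean difference of 2.15 °C and a smallest difference of 1.9 °C, and it finds four increases in the camera series (the largest being 0.2 °C) and none in the IR gun series, in line with the figures discussed above.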
4 Conclusion
The main purpose of this work was to create a device that could produce an accurate thermal image that could be viewed remotely by the user. This condition has been met, as highlighted in the results section. Additionally, the servo motor was implemented successfully to allow the user to control the direction in which the camera is facing, which can also be done remotely. Ultimately, all the aims and objectives of this project have been completed. Further research should address the processing and analysis of the image in order, for example, to identify and detect any hazardous condition [9] and possibly extract useful information automatically [10, 11]. An in-depth validation of the system in a real industrial scenario is missing and would benefit a proper testing of the proposed architecture. In particular, a comparison against currently available systems under different experimental conditions (e.g. temperature, heat flux disturbances, humidity, environmental light) could be carried out, in which the performance of the proposed device is compared against gold-standard instrumentation of a higher class.
Other capabilities could also be implemented at a further stage, where the system automatically recognizes and classifies the information reported in the thermal image, processing and extracting this information with Computer Vision techniques based, for example, on Convolutional Neural Networks [12, 13] and Machine Learning techniques such as SVM, LDA or NN [14, 15]. Moreover, the wireless communication system and its reliability should be tested as well, providing a final overview of the system performance.
Acknowledgment. This work was presented in dissertation form in fulfilment of the requirements for the BEng in Electronic and Computer Engineering for the student J Humphreys under the supervision of EL Secco from the Robotics Laboratory, School of Mathematics, Computer Science and Engineering, Liverpool Hope University.
References
1. Pimoroni. Python library for the MLX90640 thermal camera (2022). https://github.com/pimoroni/mlx90640-library. Accessed 31 March 2022
2. Hrisko, J.: High Resolution Thermal Camera with Raspberry Pi and MLX90640. Maker Portal (2020). https://makersportal.com/blog/2020/6/8/high-resolution-thermal-camera-with-raspberry-pi-and-mlx90640. Accessed 13 April 2022
3. Shaffner, T.: MLX90640 Thermal Camera with Raspberry Pi 4. PiThermalCam (2021). https://tomshaffner.github.io/PiThermalCam/. Accessed 13 April 2022
4. Docs.python.org. tkinter — Python interface to Tcl/Tk — Python 3.10.4 documentation (2022). https://docs.python.org/3/library/tkinter.html. Accessed 13 April 2022
5. Raspberry Pi OS – Raspberry Pi. Raspberry Pi (2022). https://www.raspberrypi.com/software/. Accessed 13 April 2022
6. Melexis. MLX90640 library functions (2021). https://github.com/melexis/mlx90640-library. Accessed 13 April 2022
7. PyPI. RPi.GPIO (2022). https://pypi.org/project/RPi.GPIO/. Accessed 13 April 2022
8. RealVNC®. RealVNC® - Remote access software for desktop and mobile | RealVNC (2022). https://www.realvnc.com/en/. Accessed 28 June 2022
9. Secco, E.L., Secco, E.L., Abdulrahman, R., et al.: Development of cost-effective endurance test rig with integrated algorithm for safety. Advances in Intelligent Systems and Computing, 1139, 14. https://doi.org/10.1007/978-981-15-3287-0_14
10. McHugh, D., Buckley, N., Secco, E.L.: A low-cost visual sensor for gesture recognition via AI CNNS. Intelligent Systems Conference (IntelliSys), Amsterdam, The Netherlands (2020)
11. Secco, E.L., McHugh, D.D., Buckley, N.: A CNN-based computer vision interface for prosthetics' application. EAI MobiHealth 2021 - 10th EAI International Conference on Wireless Mobile Communication and Healthcare, pp. 41–59. https://doi.org/10.1007/978-3-031-06368-8_3
12. Buckley, N., Sherrett, L., Secco, E.L.: A CNN sign language recognition system with single & double-handed gestures. IEEE Signature Conference on Computers, Software, and Applications (2021). https://doi.org/10.1109/COMPSAC51774.2021.00173
13. Myers, K., Secco, E.L.: A Low-Cost Embedded Computer Vision System for the Classification of Recyclable Objects. Lecture Notes on Data Engineering and Communications Technologies 61. https://doi.org/10.1007/978-981-33-4582-9_2
14. Latif, B., Buckley, N., Secco, E.L.: Hand gesture & human-drone interaction. Intelligent Systems Conference (IntelliSys) 3, 299–308 (2022)
15. Maereg, A.T., Lou, Y., Secco, E.L., King, R.: Hand Gesture Recognition Based on Near-Infrared Sensing Wristband. In: Proceedings 15th International Joint Conference on Computer Vision, Imaging & Computer Graphics Theory & Applications, pp. 110–117 (2020). https://doi.org/10.5220/0008909401100117
Mobile Application for Bidding Durians
Check-Yee Law1(B), Yong-Wee Sek2, Choo-Chuan Tay2, Tze-Hui Liew1, and Ya-Qin Loh1
1 Multimedia University, Melaka, Malaysia
{cylaw,thliew}@mmu.edu.my
2 Universiti Teknikal Malaysia Melaka, Melaka, Malaysia
{ywsek,tay}@utem.edu.my
Abstract. Normally, there are intermediaries between durian farmers and the end consumers. As a result, the retail durian price in the market is generally twice the farm price. As durians are perishable and taste best when they are fresh, intermediaries are necessary in the distribution channel. This is because the production volume per single farmer is not large, and reaching the downstream customers would require well-planned logistics and suitable transportation as well as strong marketing strategies and networking. With the popularization of the Internet, online bidding is suitable for both durian buyers and sellers. This paper presents the design and development of a mobile application, known as BidDurian, to facilitate the process of selling and bidding for durians. Complete prototype testing has been conducted and the prototype performs well. Future enhancement may include a live video streaming auction function and the inclusion of data analytics.
Keywords: Bidding System · Durian · Durian Market · Mobile Application · Price Ceiling · The King of Fruits · Winning Bidder
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 K. Arai (Ed.): SAI 2023, LNNS 739, pp. 162–174, 2023. https://doi.org/10.1007/978-3-031-37963-5_13
1 Introduction
Durian is a tropical fruit. It has the reputation of “The King of Fruits” for its unique pungent aroma and creamy custard-like flesh [1]. It is a very nutritious fruit that is rich in various healthy plant compounds, fibres, minerals, B vitamins, and vitamin C [2]. In Malaysia, durians are grown commercially and in home gardens as well. Johor is the largest producer of durians in Malaysia. Other major producers include Pahang, Penang, Negeri Sembilan, Perak, and Kelantan. There are many varieties of durians grown in Malaysia, and each state is prolific in producing different types of durians. For example, the Raub district of Pahang is famous for the Musang King and D24 varieties, whereas Johor is known for the IOI and Kampung varieties [3]. Durians are priced based on variety. For example, a premium durian such as the Black Thorn was priced at RM100 per kg and Musang King at RM75 per kg, whereas Hor Lor and Red Prawn were priced at RM50 per kg [4]. Durian market prices fluctuate with various demand and supply factors, and with seasonality. Durian yield depends heavily on weather conditions. Heavy rain and strong winds during blossom would impact the pollination of durian and wash away durian flower buds. This results in a low durian yield, and hence an increase in durian price. In Malaysia, the durian season typically starts in Penang around mid-April or early May, followed by the harvest season in Johor and then moving up north to Pahang, Perak and Kelantan [3]. However, there are times when durian fruits from different states ripen at the same time, which leads to abundant supply and lower prices [4]. Besides that, the durian price is also affected by the increased price of durian fertilizers and the high cost of foreign labour to look after the durian plantations [5]. In addition, demand is an important factor. Durian in Malaysia is consumed domestically and exported to other countries. China is the main importer, followed by Singapore and Hong Kong [3]. When demand from other countries is weak, excess durians are pushed into domestic consumption and the price drops. During the durian season, durian lovers spend time and effort to find favourable and affordable durians. Normally, there are intermediaries between durian farmers and the end consumers. Wholesalers mediate the interaction between farmers and customers by buying in bulk and selling in small quantities for distribution to downstream customers. As durians are perishable and taste best when they are fresh, intermediaries are necessary in the distribution channel. This is because the production volume per single farmer is not large, and reaching the downstream customers would require well-planned logistics and suitable transportation as well as strong marketing strategies and networking [6]. With the existence of intermediaries in the distribution channel, the retail durian price in the market is generally twice the farm price. In this era of popularization of the Internet, online bidding is suitable for both durian buyers and sellers.
This paper presents the design and development of a mobile application to facilitate the process of selling and bidding durians known as BidDurian. Section 2 reviews some similar bidding systems and the comparison of the key features of the systems. Section 3 describes BidDurian system with the illustrations of a context diagram, user activity diagrams, and some screenshots of the prototype interface to explain the bidding process. This is followed by some functional testing results related to the bidding process in Sect. 4. This paper is then concluded in Sect. 5 with some future work.
2 Related Work
Online auction refers to price negotiation and transaction activities implemented through the Internet. This involves the use of the Internet to publish information about the goods or services to be tendered on the website and sell them to the successful bidder through competitive bidding. Sellers can post items for auction, whereas buyers can freely make bids during the bidding process, and no advance deposit is needed. It is a relatively safe trading environment created under the website’s self-built credit evaluation system, in which both buyers and sellers can find reliable trading partners. In order to design an online durian bidding system, various bidding systems are reviewed to obtain a better understanding of how similar bidding systems work and of their key features.
2.1 Chilindo Online Bidding System
Chilindo is an e-commerce auction platform [7]. Buyers bid against other users, and only the user with the highest bid before the closing time can buy the goods. The platform offers various items for auction, including clothing accessories, daily necessities, children’s products, electronic equipment, etc. One of Chilindo’s prominent characteristics is that it is a safe platform. Chilindo applies encrypted Secure Sockets Layer (SSL) connections to all payments made by credit card to protect customers from online fraud or theft. Users can place an order for a product in the shopping cart only after they have successfully bid for it. Users do not need to pay any deposit in advance to bid for a product. However, bidders have to be aware that the bid price is not the final payment price, because the final payment includes an additional 10% platform service fee and freight. Therefore, the user needs to manually calculate the final payment price when bidding for products.
2.2 eBay Bidding System
eBay started from auctions and gradually developed into a cross-border e-commerce company with a business-to-consumer (B2C) vertical sales model [8]. It is mainly aimed at individual customers or small businesses and has always retained auctions as a particular form of sales. An auction can quickly attract people in a short period. Merchants can set the auction duration, such as 1, 3, 5, 7, or 10 days, and the platform charges a listing fee based on the auction duration. One or two hours before the end of the auction, the goods are listed at the top of the page to help buyers snap them up. Once a user has clicked on the Place Bid button on the item details page, the user can key in the bid price ceiling or click on the recommended bid amount. By clicking the Confirm Bid button, users agree to commit to buying the item if they win the bid.
The notification page informs users about the bidding status via an auto-generated message so that they can take further action. The buyer can also contact the seller on the message page. Some drawbacks of this platform are that it is difficult to find the bidding status of a specified product and that the notification list is disorganized.
2.3 ZenMarket Auction and Shopping Service
ZenMarket is a Japanese proxy auction and shopping service [9]. ZenMarket helps people buy items from Japanese online shopping platforms, such as Rakuten, Amazon, Yahoo!, and other online marketplaces, and then helps ship the goods to all parts of the world. In addition, ZenMarket supports bidding features for Yahoo! Auctions, which is Japan’s most popular auction site. On the product details page, users can navigate to place a bid by clicking on the Bid button. The information on the product details page is mostly focused on the auction information, not product information. One of the key features of ZenMarket is that users can opt to make bids as either a Normal Bid or a Delayed Bid. For a Delayed Bid, the bid is placed within one to six minutes before the auction ends.
2.4 Bidding System Key Features Comparison
Based on system review, most online bidding systems possess One-Time Bid, Instant-Buy, and Bidding History features. However, not all systems are equipped with Auto-Bid function to support the bidding process. Bidders would need to always track the bidding records if they want to be the winning bidders. We propose Auto-Bid function with Auto-Bid Increment Amount and Price Ceiling feature to help bidders bid for their favourable durians with their affordable budget. Table 1 compares the bidding systems including Chilindo, eBay, ZenMarket, and the proposed system, BidDurian.

Table 1. Comparison of Bidding Systems
| Feature | Chilindo | eBay | ZenMarket | BidDurian |
|---|---|---|---|---|
| Set end time of auction | No | Yes | Yes | Yes |
| One-time bid | Yes | Yes | Yes | Yes |
| Set bid price ceiling | No | Yes | No | Yes |
| Auto-bid function | Yes | Yes | No | Yes |
| Set auto-bid increment | No | Yes | No | Yes |
| Instant buy features | Yes | Yes | Yes | Yes |
| Product recommendation by third party | No | No | No | Yes |
| Order history | Yes | Yes | Yes | Yes |
3 The Prototype Design of Durian Bidding System
The durian bidding system is a mobile application that facilitates the durian trading process. The React Native framework is used to develop the mobile application, whereas Firebase Database is used for database management. Figure 1 shows the context diagram of BidDurian. It contains three entities: the Buyer, the Seller, and the Bank. Buyer and Seller users need to Sign Up to use the system. A Seller user can post durian information for bidding, and view orders and the details of the winning bidders. Buyer users can view information about durians on auction and place bids with the One-Time Bid, Auto-Bid or Instant-Buy function. Both Seller and Buyer users can view the Bidding History. The system records the payment details of the buyer and sends them to the bank. The bank then sends a credit card authentication response to the system, and the system records the order status and responds to the buyer and seller. This paper focuses on the bidding process of the BidDurian system.
Fig. 1. Context Diagram of BidDurian System
A user activity diagram and some system interface screenshots are used to explain the bidding process of the proposed system. Figure 2 shows the activity diagram of a Seller. Once signed up on the Landing Page of the system, as in Fig. 3(a), a user needs a one-time login to use the system. A Seller user can post a new auction for the durians available, as in Fig. 3(b). Seller users are able to describe the durian variety and taste, add a durian image, and enter the auction details. The post will then be available in the Auction List. Through the Auction List, a Seller user can view the Bidding History and the Order History, and track the winning bidder details and the buyer details. Figure 4 illustrates the process of a Buyer searching for durians and placing a bid. Upon login, a Buyer user can search for favourite durians by variety. The durian variety with its taste and rarity rating is shown below the search column, as in Fig. 5(a). The information on durian taste and rarity rating is adopted from the YouTrip website [10]. This information is for user reference and helps users find their favourite durian varieties. Users can search for durians by variety using the Search Bar. The Product List (Fig. 5(b)) is shown after a user clicks on one of the varieties on the Search Page. The information available in the Product List includes the durian image, name, quantity information, remaining auction time, and Buy-It-Now price. A user can view various auctions of the same durian variety in the Product List.
Fig. 2. The Activity Diagram of a Seller User
When a Buyer user clicks on a durian on auction in the Product List, the Buyer user can view the description of the durian on auction. The information available includes the durian image, Buy-It-Now (Instant-Buy) price, auction time left, auction end date and time, number of bids, and the product description posted by the Seller user. The Bid button, as in Fig. 6(a), leads Buyer users to the Bidding Page. A Buyer user can bid by choosing the bid mode, either One-Time Bid or Auto-Bid. Figure 6(b) shows an example of the Bidding Page. In Auto-Bid mode, one needs to set the maximum bid amount, that is, the Price Ceiling of the bid, and the increment amount (RM1, RM5, RM10 or RM20). In this bidding mode, a user does not need to track the bidding page regularly, as auto-bidding runs in the background and sends the bids on the bidder's behalf. The One-Time Bid user
Fig. 3. (a) Landing Page (b) Post a New Auction
will win the bid if and only if no other bidder bids higher than the One-Time Bid bidder. In Buy-It-Now mode, users get the durians at the Buy-It-Now price set by the Seller. This means a user can purchase the durians directly without waiting until the auction ends. On confirming the order, a user is directed to the Payment Page. The bidding activities are recorded in the Bidding History, whereas Buy-It-Now purchases are recorded in the Order History. Based on the Bidding History and the Order History, a Seller can then contact the winning bidder or the buyer for durian delivery. The following example illustrates the auto-bidding function. Assume there are three bidders on one auction with the following bidding modes:
Bidder A: Place one-time bid, Price Ceiling is set at RM50
Bidder B: Place auto-bid with RM1 increment, Price Ceiling is set at RM59
Bidder C: Place auto-bid with RM5 increment, Price Ceiling is set at RM75
Bidder A then places a one-time bid again with the Price Ceiling set at RM69
Fig. 4. The Activity Diagram of a Buyer User
Figure 7(a) shows the Bid History, which consists of the bidding records made by Bidder A, Bidder B, and Bidder C. Bidder A starts the bidding with a One-Time-Bid for RM50. When Bidder B joins in Auto-Bid mode with an increment of RM1 and a Price Ceiling of RM59, the system automatically bids for Bidder B at RM51. Bidder C then joins in Auto-Bid mode with an increment of RM5 and a Price Ceiling of RM75, and the system automatically places Bidder C’s bid at RM56. The Auto-Bid function then adjusts Bidder B’s bid to RM57, followed by Bidder C’s bid at RM62. As the next bid for Bidder B would exceed the Price Ceiling of RM59, the Auto-Bid for Bidder B stops. When Bidder A makes another One-Time-Bid at RM69, the system again adjusts the bidding price of Bidder C, to RM74. Since there are no more bids before the auction end time, Bidder C is the winning bidder of this auction. A Congratulations message pops up when Bidder C taps on Check My Bid Results, as in Fig. 7(b). The seller can then contact Bidder C.
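The auto-bidding behaviour in this example can be sketched as a short simulation. This is an illustrative reconstruction under stated assumptions and not the BidDurian implementation (which is built with React Native and Firebase): it assumes that each auto-bidder who is not currently leading bids the current highest amount plus its own increment, as long as the result does not exceed its Price Ceiling, and that auto-bidders are processed in the order in which they joined. The class and method names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class AutoBidder:
    name: str
    increment: int  # RM added on top of the current highest bid (assumed positive)
    ceiling: int    # Price Ceiling: the auto-bid never exceeds this amount


@dataclass
class Auction:
    history: list = field(default_factory=list)       # (bidder, amount), oldest first
    auto_bidders: list = field(default_factory=list)  # in joining order

    @property
    def highest(self):
        # The latest record is always the highest, since every new bid must exceed it.
        return self.history[-1] if self.history else (None, 0)

    def one_time_bid(self, name, amount):
        if amount <= self.highest[1]:
            raise ValueError("bid must exceed the current highest bid")
        self.history.append((name, amount))
        self._run_auto_bids()

    def join_auto_bid(self, name, increment, ceiling):
        self.auto_bidders.append(AutoBidder(name, increment, ceiling))
        self._run_auto_bids()

    def _run_auto_bids(self):
        # Repeat until no auto-bidder is both outbid and able to raise within its ceiling.
        changed = True
        while changed:
            changed = False
            for bidder in self.auto_bidders:
                leader, amount = self.highest
                next_bid = amount + bidder.increment
                if bidder.name != leader and next_bid <= bidder.ceiling:
                    self.history.append((bidder.name, next_bid))
                    changed = True


auction = Auction()
auction.one_time_bid("A", 50)
auction.join_auto_bid("B", increment=1, ceiling=59)
auction.join_auto_bid("C", increment=5, ceiling=75)
auction.one_time_bid("A", 69)
for record in auction.history:
    print(record)
print("winner:", auction.highest)
```

Under these assumptions the simulated history is A at RM50, B at RM51, C at RM56, B at RM57, C at RM62, A at RM69 and C at RM74, which matches the sequence described for Fig. 7(a), with Bidder C winning at RM74.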
Fig. 5. (a) Search by Durian Variety (b) Product List
Fig. 6. (a) Bid Button that Leads Users to Bidding Page (b) Bidding Page
Fig. 7. (a) Bidding Records (b) Check Bid Result
4 Prototype Testing
Complete prototype testing, including unit testing and integration testing, has been conducted. The testing aims to ensure that the prototype performs well and produces the expected outcomes. For the presentation of this paper, only the test cases related to the bidding process are shown. Table 2 presents the test results.
Table 2. Prototype Testing Results
| No | Test case | Expected outcome | Result |
|---|---|---|---|
| 1 | Seller Business Console Page | | |
| | a) Add new product/auction | A Seller user is able to edit the form | Successful |
| | b) Add new image | A Seller user is able to add a new image | Successful |
| | c) Publish new post | The auction post is stored in the database | Successful |
| | d) Set minimum bid price | Display minimum bid price | Successful |
| | e) Date picker | A Seller user can use the date picker to set the auction end date | Successful |
| | f) Date picker validation | A Seller user is alerted to set an appropriate date (Auction end date > Today’s date) | Successful |
| 2 | Home Page | | |
| | a) Varieties | A Buyer user is able to view durian varieties | Successful |
| | b) Search function | A Buyer user should be able to search durians by durian varieties | Successful |
| | c) Tap to view a specific durian variety | All auction posts of this durian variety are shown (Product List page) | Successful |
| 3 | Product List Page | | |
| | a) Auctions (Product List) | A user is able to view all auctions posted by various sellers | Successful |
| | b) Tap to view a specific auction for bidding | The specific post of the durian on auction is shown with the Bid button | Successful |
| 4 | Product Details Page | | |
| | a) Product details | Show product details such as durian image, Buy-It-Now price, auction time left, auction end date and time, number of bids, and product description posted by the Seller user | Successful |
| | b) Tap on the Bid button | Navigate to bidding screen | Successful |
| 5 | Bidding Page | | |
| | a) Bidding page | Show 3 bidding modes: Auto-Bid, One-Time-Bid, Buy-It-Now with Price Ceiling input box and Place Bid button | Successful |
| | b) Place Bid with Auto-Bid | A Buyer user is able to place an Auto-Bid by setting the increment amount and the Price Ceiling | Successful |
| | c) Place Bid with One-Time-Bid | A Buyer user is able to place a One-Time-Bid by tapping on the One-Time-Bid button and setting the Price Ceiling (One-Time-Bid amount) | Successful |
| | d) Place Bid with Buy-It-Now | A Buyer user is directed to the Payment page | Successful |
| 6 | My Auction Page (Seller) | | |
| | a) View product post | A user is able to see all the product posts | Successful |
| | b) Navigate to show product details | A user is able to view the product details | Successful |
| | c) Navigate to show Bidding History and Order History | A user is able to track Bidding History and Order History | Successful |
| | d) View Winning Bidder details | Show Winning Bidder’s details | Successful |
| | e) View Buyer details | Show a Buyer’s details | Successful |
| | f) Bid Validation | Stop accepting bids when an auction has ended | Successful |
| 7 | My Bids Page (Buyer) | | |
| | a) Bidding List | A user can track his/her bidding | Successful |
| | b) Check Bid Result | A user is able to know whether the bid was won or lost | Successful |
| | c) Bid button validation | Direct a user to the Bidding Page if the auction is still available, otherwise show an alert message for an expired auction | Successful |
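Two of the validation rules in Table 2, the date picker validation (test case 1f) and the bid validation (test case 6f), can be expressed as simple predicates. The sketch below only illustrates the stated rules and is not the prototype's code; the function names and the sample dates are hypothetical.

```python
from datetime import date, datetime


def is_valid_end_date(end_date: date, today: date) -> bool:
    """Date picker rule from test case 1f: the auction end date must be after today."""
    return end_date > today


def accepts_bids(auction_end: datetime, now: datetime) -> bool:
    """Bid validation from test case 6f: no bid is accepted once the auction has ended."""
    return now < auction_end


# Hypothetical dates, for illustration only.
print(is_valid_end_date(date(2023, 6, 2), today=date(2023, 6, 1)))
print(accepts_bids(datetime(2023, 6, 2, 12, 0), now=datetime(2023, 6, 2, 12, 0)))
```

Passing the current date and time in as arguments, rather than reading the clock inside the functions, keeps the rules easy to unit test.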
5 Conclusion and Future Work
Through online auctions, sellers can directly publish their durian auctions online whereas potential buyers can easily obtain information about durians on auction, and start bidding. In this way, sellers can sell directly to the end customers without intermediaries. The whole process also saves durian enthusiasts time to find and buy durians they like at
a cheaper price, achieving a win-win deal. Future enhancement may include a live video streaming auction function to make the bidding process more attractive and successful. Data analytics can also be included to guide users in the marketing and transaction of durians.
References
1. Ketsa, S., Wisutiamonkul, A., Palapo, Y., Paull, R.E.: Chapter 4: The Durian, Botany, Horticulture, and Utilization. In: Horticultural Reviews, Warrington, I. (ed.), John Wiley & Sons, Inc., pp. 125–211 (2019)
2. Jennings, K.-A.: Durian Fruit: Potent smell but Incredibly Nutritious (2019)
3. Bedi, R.S.: CNA Explains: Why Durian Supply from Malaysia to Singapore has Grown this Season (2022)
4. Bernama: Prices of Premium Durian in Penang Seen Dipping in June. MalayMail (2022)
5. See, K., Manoharan, S., See, R.Y.: Durian prices up as bad weather hits harvest and labour, fertiliser costs rise. The Straits Times (2022)
6. Zakaria, N.A., Abdul Rahim, A.R.: An overview of fruit supply chain in Malaysia. Jurnal Mekanikal, 37, 36–46 (2014)
7. Chilindo.com. Thousands of auctions starting at 1 THB (2021). www.chilindo.com
8. ebay, I.: eBay shop by category (2022). https://www.ebay.com/
9. ZenMarket, I.: Kaedah pembelian di pasaran online melalui ZenMarket (2021). www.zenmarket.jp/
10. YouTrip. 12 Types of Durians and How to Pick the Best One (2021). https://www.you.co/sg/blog/types-of-durians-how-to-pick-the-best-durian/
ADAGIO, a BDI Music Recommender Telegram Chatbot
Arantxa Garayzar-Cristerna and Wulfrano Arturo Luna-Ramirez(B)
Universidad Autónoma Metropolitana, Unidad Cuajimalpa, Santa Fe. Cuajimalpa de Morelos, Ciudad de México, Mexico
{arantxa.garayzar,wluna}@cua.uam.mx
Abstract. Within the digital market, music is the product that has generated the most revenue in recent years. One of the technologies that help make this happen is recommendation systems, which are used in most music distribution platforms. These tools are developed with the goal of gaining more users, neglecting the need to further improve recommendation and communication with the user. In this paper, we report a) the design of a Belief-Desire-Intention architecture for the implementation of an agent integrated with a messaging platform, which allows focusing on the user’s needs regarding music recommendation while remaining an independent system (without being part of any streaming service); and b) the successful integration of two platforms: Python AgentSpeak and Telegram.
Keywords: BDI Agents · System Integration · Chatbots · Recommender Agents · Telegram · Python · AgentSpeak
1 Introduction
Lately, the music industry has been strongly affected by new technologies, especially in the way music is consumed and distributed. Music downloads and streaming services such as Apple Music, Spotify, SoundCloud, or Last.fm represent the largest source of revenue in the digital market. Music is the most popular type of multimedia content on the internet [12]; the music industry reported revenues of $9,704.7 million in 2019 and $10,749 million in 2020 [6]. Most music listening platforms integrate recommendation systems that use collaborative filtering and music features to make recommendations to their users. Some of the companies that have implemented this type of recommendation are Last.fm, Pandora, AllMusic and Shazam [12]. Although these services use in-platform recommendation systems, the development of independent recommenders (not part of a streaming service) is still an under-explored area [4]. For this reason, it would be worthwhile to have tools that allow the user to get the benefit of a recommendation without the need for a subscription, and within a system that is not limited to providing recommendations.

Existing recommender systems present problems such as cold start, poor tolerance to differences in musical taste, recommendations biased toward popular items, and recommendations based on particular characteristics that limit variety. These problems have not been widely addressed [3], since the main focus is to gain more users and not to improve the recommendation or the direct communication with the user. An alternative that comes closer to solving the problems of existing recommender systems is the use of agents. In this project, we propose to use the well-known Belief-Desire-Intention (BDI) agents for the development of the proposed system.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023. K. Arai (Ed.): SAI 2023, LNNS 739, pp. 175–184, 2023. https://doi.org/10.1007/978-3-031-37963-5_14
2 Recommendation Systems
Recommendation systems are software tools that provide a set of items that might be of interest to a particular user. They can also be understood as data processing systems that actively collect information in order to generate recommendations based on the user's tastes and needs. Users interested in the recommendation must specify their preferences by providing feedback to the system, thus updating the knowledge base about the user [8]. Within recommendation systems there are different ways of generating recommendations for the user; each of these ways is known as a model. Some recommendation models have become a standard in the area. The most commonly used among music recommendation systems are the collaborative-filtering-based model, the content-based model, the knowledge-based model and hybrid models. The latter are among the most popular, since the combination of more than one model has achieved better recommendations [3].
3 BDI Agents
According to [13], the BDI architecture, based on mentalistic notions such as beliefs, desires, and intentions, is a cutting-edge model for developing agent systems. The agent's beliefs are the information about the world available to it, and may be correct or incorrect depending on the current state of affairs in the environment. Desires represent the goals and tasks to be achieved. Finally, intentions are those desires that the agent is committed to fulfilling. They are the action-oriented behavior, i.e., a guide for the decisions the agent will make throughout its execution; in other words, they give a unique sense to the system and to the actions it will take to fulfill the goals for which it was developed [7].

Building a recommender system based on the BDI agent model provides the opportunity to focus on the user and the user's relationship with the system, as well as with the recommendations. Therefore, the agent takes the user's interests as goals, it should be provided with plans to execute the recommendation and the interaction with the user, and its beliefs should reflect the interests of the user and not of the platforms.

In the case of the chatbot presented here, when a user sends a request, it becomes part of the agent's beliefs, which are combined with other information such as the music data to be recommended. The BDI reasoning continues by triggering desires about recommendation items suitable to the user request, and ends with an intention to provide a recommendation that is communicated to the user.
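The request-to-recommendation cycle described above can be illustrated with a minimal Python sketch. It is not the authors' AgentSpeak program and does not use Python-AgentSpeak; the class `ToyBDIAgent`, its method names, and the catalogue contents are hypothetical and serve only to show how a request becomes a belief, how beliefs trigger desires, and how one desire is adopted as an intention.

```python
from dataclasses import dataclass, field


@dataclass
class ToyBDIAgent:
    """Toy belief-desire-intention loop; every name here is hypothetical."""
    beliefs: dict = field(default_factory=dict)
    desires: list = field(default_factory=list)
    intentions: list = field(default_factory=list)

    def perceive(self, requested_genre):
        # The user's request becomes a belief, next to the music data.
        self.beliefs["requested_genre"] = requested_genre

    def deliberate(self):
        # Desires: every catalogue item that suits the believed request.
        genre = self.beliefs.get("requested_genre")
        catalog = self.beliefs.get("catalog", {})
        self.desires = [("recommend", song) for song in catalog.get(genre, [])]

    def commit(self):
        # Intention: the single desire the agent commits to acting on.
        self.intentions = self.desires[:1]

    def act(self):
        # Carry out the intention by producing the message for the user.
        if not self.intentions:
            return "No recommendation available."
        _, song = self.intentions.pop(0)
        return f"Recommended: {song}"


# Illustrative data only: a two-song catalogue for one genre.
agent = ToyBDIAgent(beliefs={"catalog": {"lofi": ["Track A", "Track B"]}})
agent.perceive("lofi")
agent.deliberate()
agent.commit()
reply = agent.act()
print(reply)  # Recommended: Track A
```

In a full BDI interpreter, the deliberation and commitment steps are driven by plans written in the agent language rather than by hard-coded methods; the sketch only mirrors the order of the steps.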
4 Related Work
Recently, the popularity of music listening platforms has been growing, and with it the number of available songs, to the point that searching for them has become a time-consuming and ineffective task. To give an overview of recommendation systems, the following are some non-agent-based systems that have been developed to make this task easier.

Convolutional Neural Networks. The system proposed in [1] uses convolutional recurrent neural networks to create recommendations based on the similarity of the characteristics of the music, that is, on how close those characteristics are to those of the music to be recommended. This technique was chosen because it combines recurrent and convolutional neural networks, each with its own advantages. Recurrent neural networks give better results when working with sequences of words and paragraphs, while convolutional neural networks allow working with features such as chords and music tempo.

MusiCube. This is a recommendation system that uses a genetic algorithm to evaluate user preferences. The system is based on a graphical interface that places dots within a designated area. Each dot has a color that represents the user's preference for a song. Red represents songs that were liked by the user, blue is used for songs that were not liked, orange for suggestions that have not been rated by the user, and yellow for songs that are not being suggested or have not been evaluated [11].

Foafing the Music. This project uses FOAF, a framework for describing users, their relationships, interests and social connections. It uses a hybrid recommendation system that combines a context-based and a content-based model [2].

T-RECSYS. This is a music recommendation system based on a hybrid model that combines content-based modeling and collaborative filtering. It was developed at the University of Nevada, USA. The project uses a deep learning model developed in Python with the Keras and TensorFlow libraries. The tool was made with the intention of improving the recommendation algorithms currently used in the music industry [5].

MusicRoBot. This project is a conversational music recommendation system implemented in WeChat, a messaging platform. The system uses an emotion-based model to perform the recommendation: it recommends a song that matches the emotion expressed by the user in a text message [14].
5 Adagio System Overview
This section describes the Adagio project and its components. The Prometheus methodology [10] was used as a guide for the development of this project, and provided the necessary tools for the design and development of the system agents. Figure 1 shows a guide to the elements that constitute the diagrams and their meaning.
Fig. 1. Entities within the Prometheus Methodology
Figure 2 shows an important element of the Prometheus methodology, the System Overview Diagram. This diagram contains the main elements of Adagio and shows how they interact with each other.
Fig. 2. Prometheus System Overview Diagram of Adagio System
The system design includes the Adagio agent, which is responsible for performing the recommendation and interacting with the user through a text message chat. It also includes the Data Manager agent, which is responsible for querying the databases and keeping them updated, and which communicates with Adagio to provide this information. The Adagio Interaction Diagram, which describes the interaction between the components of the system, can be seen in Fig. 3. This diagram shows how the recommendation is made as soon as the user makes the request within the chat.
Fig. 3. Interaction Diagrams of Adagio Recommendation
To implement this design, different tools were integrated. To program the agents, Python 3 and an AgentSpeak interpreter called Python-AgentSpeak [9] were used. Since the objective of this project involved direct interaction with the user, we used the Telegram messaging platform and its Telegram-API. To carry out these activities, two Python 3 scripts were created. The first script, called Telegram, is in charge of controlling the chat interface. The second, called PythonAgentSpeak, executes the actions of the agent. The Adagio agent design itself was programmed in an ASL file. In addition, to facilitate communication between the ASL file and the Python scripts, two pickle files were used to exchange the user and agent responses between the different parts.

The Adagio system can be seen operating in Fig. 4, which shows the welcome message presenting the commands available to the user. Fig. 5 shows the recommendation in action, from the use of the /recomienda command to the system response.
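The exchange of user and agent responses through two pickle files can be sketched as follows. This is only an illustration under stated assumptions: the file names `user_request.pkl` and `agent_reply.pkl`, the helper functions, and the message fields are hypothetical, and the paper does not specify how the two scripts synchronise their reads and writes.

```python
import pickle
import tempfile
from pathlib import Path


def write_message(path, payload):
    """Serialise a Python object to a pickle file, replacing old content."""
    with open(path, "wb") as handle:
        pickle.dump(payload, handle)


def read_message(path):
    """Return the object stored in a pickle file, or None if it is missing."""
    path = Path(path)
    if not path.exists():
        return None
    with open(path, "rb") as handle:
        return pickle.load(handle)


# Hypothetical file names standing in for the two pickle files.
workdir = Path(tempfile.mkdtemp())
user_file = workdir / "user_request.pkl"
agent_file = workdir / "agent_reply.pkl"

# Chat-side script: store what the user typed.
write_message(user_file, {"chat_id": 1, "command": "/recomienda", "genre": "LoFi"})

# Agent-side script: read the request and store the answer.
request = read_message(user_file)
write_message(agent_file, {"chat_id": request["chat_id"],
                           "text": f"Recommendation for {request['genre']}"})

# Chat-side script: read the answer to send back to the user.
reply = read_message(agent_file)
print(reply["text"])  # Recommendation for LoFi
```

Pickle files should only be loaded when they come from a trusted source, since unpickling can execute arbitrary code; this is acceptable here because both files are written by the system's own scripts.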
Fig. 4. Welcome to the User within Telegram (Spanish)
Fig. 5. Adagio interacting with a user (Spanish) through a set of commands ending in a recommendation for the genre LoFi
6 Testing and Results
To test the Adagio chatbot, we conducted a test with 11 users. Users were asked to use the Adagio bot and to answer a survey about how they interacted with it. One of the aspects we wanted to test was the welcome to new users: once the chat starts, the user is shown the list of commands available for interaction. As shown in Fig. 6, most users found this welcome useful or very useful. Users were also asked about their perception of the chatbot in general, to ascertain whether they considered the tool useful. Most users answered that they found it either moderately useful or useful. The user responses are shown in Fig. 7.
Fig. 6. User Answers about the Usability of Adagio Main Menu
Fig. 7. User Answers about Adagio Usefulness
Since the number of recommendations is currently limited, it was important to ask users whether they would consider using Adagio on a regular basis if additional recommendations were added. In general, the responses were inclined toward frequent use of the chatbot, as can be seen in Fig. 8.
Fig. 8. Adagio usage Frequency when recommendation items would be increased according to users
Lastly, it was important to understand the users' opinions about how innovative this project is and how they perceived it.
Fig. 9. User Pondering about Adagio Innovative Degree
As shown in Fig. 9, 63.6% of users found Adagio innovative, while 36.4% found it an average tool.
7 Conclusion and Future Work
Music recommendation systems have clearly been successful in the music industry, whose revenues are increasing year after year. However, certain problems have arisen that are important to address. Mainly, the role of the user should be considered, so as to offer varied recommendations based on a more in-depth analysis of the user's tastes. It is also worth looking for platforms specialized in recommendation that are available to more users and operate in an already familiar environment. We also believe that it is important to seek platforms that protect the user's privacy and do not depend on creating an account or paying a subscription, since they offer the user a secure and private environment in which to keep obtaining recommendations.

During the development of this project there were difficulties integrating the different tools used, especially AgentSpeak, since there were very few projects that used it and none of them were close to Adagio's objectives. In addition, AgentSpeak is a little-explored tool that provides few options for a complete implementation of agents and for communication between them. However, we believe it is worth further work, since another problem is the lack of tools for agent development in different programming languages. Despite these difficulties, we achieved a successful integration of two different technologies, enabling the BDI agent to use Telegram as its environment and offering the user a state-of-the-art messaging platform as a way to access information from around the world.

For future work, we believe it would be very valuable to improve the recommendation engine and to add a connection to sites for listening to music independently of a restricted streaming service. Furthermore, we plan to include more information about the user within the conversation while preserving user privacy, i.e., without sharing data and leaving it available only to the user. Finally, a key point for the future is to carry out a wide range of tests with numerous users, since the tests carried out so far were only technical and involved a few final users.

Acknowledgment. This work would not have been possible without the financial support of the División de Ciencias de la Comunicación y Diseño, Universidad Autónoma Metropolitana-Cuajimalpa.
References
1. Adiyansjah, A.A., Gunawan, S., Suhartono, D.: Music recommender system based on genre using convolutional recurrent neural networks. Proc. Comput. Sci. 157, 99–109 (2019)
2. Celma, O., Ramirez, R., Herrera, P.: Foafing the music: a music recommendation system based on RSS feeds and user preferences. In: 6th International Conference on Music Information Retrieval (ISMIR), pp. 464–467, London, UK (2005)
3. Celma, O., Lamere, P.: Music recommendation and discovery revisited. In: Proceedings of the Fifth ACM Conference on Recommender Systems, pp. 7–8 (2011)
4. Celma, Ò.: Music Recommendation and Discovery. Springer, Heidelberg (2010). https://doi.org/10.1007/978-3-642-13287-2
5. Fessahaye, F., et al.: T-RECSYS: a novel music recommendation system using deep learning. In: 2019 IEEE International Conference on Consumer Electronics (ICCE), pp. 1–6. IEEE, Las Vegas, NV, USA, January 2019
6. Friedlander, J.P.: 2020 Year-End Music Industry Revenue Report. RIAA
7. García, D.R., Simari, G.R., García, A.J.: Planificación de agentes BDI. In: VI Workshop de Investigadores en Ciencias de la Computación (2004)
8. Mahmood, T., Ricci, F.: Improving recommender systems with adaptive conversational strategies. In: Proceedings of the 20th ACM Conference on Hypertext and Hypermedia, pp. 73–82 (2009)
9. Niklas, F.: Python-agentspeak (2017). https://github.com/niklasf/python-agentspeak
10. Padgham, L., Winikoff, M.: Developing Intelligent Agent Systems: A Practical Guide. Wiley Series in Agent Technology. John Wiley, Chichester, England, Hoboken, NJ (2004). OCLC: ocm56035292
11. Saito, Y., Itoh, T.: MusiCube: a visual music recommendation system featuring interactive evolutionary computing. In: Proceedings of the 2011 Visual Information Communication - International Symposium on - VINCI 2011, pp. 1–6, Hong Kong, China. ACM Press (2011)
12. Song, Y., Dixon, S., Pearce, M.: A Survey of Music Recommendation Systems and Future Perspectives (2012)
13. Wooldridge, M.J.: Reasoning About Rational Agents. Intelligent Robotics and Autonomous Agents. MIT Press, Cambridge (2000)
14. Zhou, C., Jin, Y., Zhang, K., Yuan, J., Li, S., Wang, X.: MusicRoBot: towards conversational context-aware music recommender system. In: Pei, J., Manolopoulos, Y., Sadiq, S., Li, J. (eds.) DASFAA 2018. LNCS, vol. 10828, pp. 817–820. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-91458-9_55
School Bus Routing Problem-Algorithm Optimization Using Graph Theory
Galazios Konstantinos¹ and Alexiou Dimitra²
¹ Hellenic Open University, Aristotelous 18, 26335 Patra, Greece
² Department of Spatial Planning and Development, School of Engineering, Aristotle University of Thessaloniki, Thessaloniki, Greece
[email protected]
Abstract. Scientific and technological progress has resulted in increasing the complexity of every human being’s daily life. Companies, institutions and states constantly need to find modern tools to assist them in making the best decisions possible. Graph theory has multiple applications in many everyday problems. It makes use of a set of nodes (or points) that are connected to each other by a set of edges (or lines). It can resolve and simplify problems by using algorithms. One of the major problems plaguing modern societies is the School Bus Routing Problem (SBRP) as it is related to many areas of everyday life such as the transportation of products and people, the routing of packages on the internet, etc. This paper deals with this problem and its solution. In particular, its goal is to create an algorithm that will contain optimization techniques such as the use of Dijkstra’s algorithm as well as the use of a specific mathematical function that will search for the shortest distance from a node to the next available one. Using the above techniques, the design of an effective timetable of a specific fleet of school buses will be sought with the ultimate goal of picking up students from their houses and delivering them to a specific location (fixed point) at the school, while overcoming specific restrictions and meeting certain conditions. In the present thesis, a bibliographic reference is made to Graph theory and the School Bus Routing Problem (SBRP) and can act as a supplementary tool for future dissertations. Furthermore, a specific algorithm for solving the aforementioned problem is proposed. All calculations will be carried out using Python and Mathematica (version 12.0). The results of the algorithm will be extracted and scientifically analyzed. Keywords: Graph Theory · Graph · Algorithm · School Bus Routing Problem · SBRP · Dijkstra
1 Introduction
Although the development of technology has helped to reduce the time and procedures involved in transportation between countries, the need for techniques that reduce routing time is greater now than ever before. This paper addresses such issues and suggests solutions, and it is intended to help anyone dealing with problems of this kind. Its results can contribute to improvements in various areas, such as the reduction of environmental pollution, economic benefits, better transportation of products, and safer and faster transportation of personnel.

The objective of this paper is to present a theoretical approach to graph theory and the School Bus Routing Problem (SBRP). For a single fixed location (the school node), an algorithm is designed to optimize a schedule for picking up students from their residences (which are specific nodes) and dropping them off at this location [1, 2]. The proposed algorithm satisfies the following constraints:
• the average time students spend on buses is reduced;
• bus stop locations are selected so that the average walking distance from the students' residences to those locations is minimized;
• the transfer of all students to the fixed location is completed within a given time window;
• the maximum capacity of the buses is respected;
• the required number of buses, which is a function of all the above, is reduced.

The optimal initial vertex in the process of generating the subset of student pickups for each bus is calculated. At the same time, the relationship between the number k of buses and the allowed walking distance W of the students is studied, and the pickup of more students from certain points is sought as a function of the distance W [1]. Finally, the total delivery time of all students to the fixed point within the specified time window is calculated in relation to the number of buses.

The main purpose of this work is to show, in theory as well as in practice, that it is possible to solve the SBRP. The problem is analyzed thoroughly, and all the tools used to solve it are described in detail. The most difficult step, the mathematical formula for node selection, is also presented.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023. K. Arai (Ed.): SAI 2023, LNNS 739, pp. 185–218, 2023. https://doi.org/10.1007/978-3-031-37963-5_15
2 Dijkstra's Algorithm
An important aspect of solving problems using graphs is finding the optimal paths. The two most popular algorithms used to solve this problem are the Bellman-Ford algorithm and Dijkstra's algorithm. Dijkstra's algorithm is used on both directed and undirected connected graphs with non-negative weights. In the case of negative weights, the Bellman-Ford algorithm should be used, because Dijkstra's algorithm can produce incorrect results [3].

Dijkstra's algorithm finds the optimal paths from one node to all others in a graph. It is a greedy algorithm: at each step it selects the locally optimal choice, and the choices made at all previous steps together form the optimal solution to the problem. Before running the algorithm, the following should be defined:
• Graph G(V, E).
• N: The number of nodes.
• w(v, u): The weight of the edge between node v and node u.
• Start: The starting node.
• Dist[]: Vector of size N in which the distance of each node i from the starting node is dynamically stored.
• Prev[]: Vector of size N in which, for each node, the immediately preceding node on the shortest path from the starting node to that node is stored.

During the execution of the algorithm all vertices are divided into two sets:
• Set T, initially T = ∅, which stores all nodes that have been tested and for which the optimal path has been found.
• Set F, initially F = V, which contains all the nodes of the graph still to be tested. Each time a node is tested it is removed from this set.

When the set F becomes empty the algorithm terminates, since there is no node left to test. At each step, the node v that is not contained in T and has the smallest current value Dist[v] is selected. Then, for each neighbor u of v that is not in T, if the path through v is shorter than the best path to u known so far, Dist[u] is updated with the new value and Prev[u] is set to v.
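The description above can be sketched in Python as follows. This is a minimal illustration that keeps the names `Dist`, `Prev`, `T` and `F` from the text; the function name, the 0-based node labels and the four-node example graph are assumptions made for the sketch and are not taken from the paper.

```python
import math


def dijkstra(n, edges, start):
    """Return (Dist, Prev) for an undirected graph with non-negative weights.

    n     -- number of nodes, labelled 0 .. n-1
    edges -- iterable of (v, u, w) triples, w being the weight w(v, u)
    start -- starting node
    """
    neighbours = {v: [] for v in range(n)}
    for v, u, w in edges:
        neighbours[v].append((u, w))
        neighbours[u].append((v, w))

    Dist = [math.inf] * n   # best known distance from the starting node
    Prev = [None] * n       # previous node on the best known path
    Dist[start] = 0
    T = set()               # nodes already tested (optimal path found)
    F = set(range(n))       # nodes still to be tested

    while F:
        # Select the untested node with the smallest current distance.
        v = min(F, key=lambda node: Dist[node])
        if Dist[v] == math.inf:
            break           # the remaining nodes cannot be reached
        F.remove(v)
        T.add(v)
        # Update every untested neighbour if the path through v is shorter.
        for u, w in neighbours[v]:
            if u not in T and Dist[v] + w < Dist[u]:
                Dist[u] = Dist[v] + w
                Prev[u] = v
    return Dist, Prev


# Hypothetical example graph with four nodes.
toy_edges = [(0, 1, 4), (0, 2, 1), (2, 1, 2), (1, 3, 5)]
Dist, Prev = dijkstra(4, toy_edges, start=0)
print(Dist)  # [0, 3, 1, 8]
print(Prev)  # [None, 2, 0, 1]
```

Selecting the minimum by scanning F costs O(N) per step, which is adequate for small graphs; a priority queue is the usual choice when the graph is large.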
3 School Bus Routing Problem (SBRP)
3.1 Description of the SBRP
The School Bus Routing Problem (SBRP) was first reported by Newton and Thomas in 1969. According to Park and Kim, the SBRP refers to the design of an efficient schedule for a fleet of school buses that picks up the students of a school from specific stops. The schedule should satisfy specific criteria such as the maximum bus capacity, the maximum time a student may spend inside the bus, and the maximum time window from pickup to the arrival of the students at the school.

To solve an SBRP, the term "efficient routing" should be defined. The requirements and priorities of the problem should be fully defined from the beginning, so that a good (or optimal) solution can be sought. For example, suppose we have two possible efficient scenarios:
• In scenario A, the average time a student spends on a bus is lower, but each bus travels more kilometers on average.
• In scenario B, the average time a student spends on a bus is higher than in scenario A, but each bus travels fewer kilometers on average.

If the main criterion is economic, then scenario B is the more efficient routing. If the study is based on the minimum time a student spends on a bus and the economic factor is not of interest, then scenario A is the acceptable solution.

3.2 Subdivision of the SBRP
A school bus routing problem can be divided into five sub-problems [4].
1. Graph preparation: The creation of a single graph, and its adjacency matrix, containing the residences (stops) of the students and the school stop.
2. Selection of bus stops: It uses the previous graph with the ultimate goal of finding all bus stop locations for each bus. It includes limiting the maximum walking distance from the student's residence to the bus stop. The goal of this sub-problem is to assign each student the closest pickup stop of the corresponding bus. Its solution is divided into two steps:
• The creation of the stops of the corresponding bus.
• Assigning the stop to the corresponding student.
3. Creation of routes: The data from solving the previous two sub-problems is used to find the optimal routes for each bus. The creation of routes can be classified according to the bus fleet:
• Homogeneous networks, in which all nodes (student residences) have the same function within the network. Each node is interchangeable in the basic function it performs.
• Heterogeneous networks, in which there are two or more different classes of nodes categorized according to their utility.
Routes can also be classified according to the bus load:
• Single load, i.e. the bus picks up students for one school.
• Mixed load, i.e. a bus can pick up students for different schools.
4. School bell adaptation: This sub-problem is used only in the mixed load case. It adjusts the arrival time of students at school.
5. Route scheduling: This sub-problem is used only in the mixed load case. Its objective is to determine the exact sequence of routes that each bus will run.

3.3 A Mathematical Programming Formulation of the SBRP
For the mathematical formulation of a school bus routing problem we must first make the following assumptions:
• The delivery node (school) is unique.
• The buses are identical, with a predetermined capacity.
• There is only one category of student.
• Each student residence (node) is unique.
• A stop can only be visited by one bus.
• The number of students per stop cannot exceed the capacity of the bus.
• At a stop, students cannot be divided into groups and shared among different buses.
• Each bus can only make one journey.

The key optimization criterion of the SBRP is the total distance travelled by all buses. Below we describe the algorithm of Toth and Vigo.
We define the table of variables used in the objective function and constraints of the SBRP.

Table 1. Table of Variables of the Toth and Vigo Algorithm [2]

| Symbol | Meaning |
|---|---|
| Variables – Data | |
| C | Bus capacity |
| V | Set of possible bus stops, with \|V\| = n |
| E | Set of edges between two nodes |
| S | Set of student nodes |
| c_ij | Weight (or distance) between node i and node j |
| s_il | 1 if student l can approach stop i, otherwise 0 |
| i = 0 | School index |
| Decision variables | |
| x_ijk | 1 if bus k travels from stop i to stop j, otherwise 0 |
| y_ik | 1 if bus k visits stop i, otherwise 0 |
| z_ilk | 1 if student l gets on bus k at stop i, otherwise 0 |
The objective function and constraints of the SBRP are the following [1, 2, 4, 5].
• The objective function minimizes the total distance travelled by all buses.
$$\min \left( \sum_{i \in V} \sum_{j \in V} c_{ij} \sum_{k=1}^{n} x_{ijk} \right)$$
• Flow conservation: a bus k that visits stop i (y_ik = 1) enters that stop exactly once and leaves it exactly once.
$$\text{s.t.} \quad \sum_{j \in V} x_{ijk} = \sum_{j \in V} x_{jik} = y_{ik}, \quad \forall i \in V,\; k = 1, \ldots, n$$
• Connectivity of the route of bus k (subtour elimination).
$$\sum_{i,j \in Q} x_{ijk} \le |Q| - 1, \quad \forall Q \subseteq V \setminus \{v_0\},\; \forall k$$
• Each stop can be visited at most once, except for the stop corresponding to the school.
$$\sum_{k=1}^{n} y_{ik} \le 1, \quad \forall i \in V \setminus \{0\}$$
• Each student is picked up only at a stop he or she can approach.
$$\sum_{k=1}^{n} z_{ilk} \le s_{il}, \quad \forall l \in S,\; \forall i \in V$$
• The capacity of bus k is not exceeded.
$$\sum_{i \in V} \sum_{l \in S} z_{ilk} \le C, \quad k = 1, \ldots, n$$
• Student l is not picked up by bus k at stop i if that bus does not visit the stop.
$$z_{ilk} \le y_{ik}, \quad \forall i, l, k$$
• Each student is picked up exactly once.
$$\sum_{i \in V} \sum_{k=1}^{n} z_{ilk} = 1, \quad \forall l \in S$$
• The decision variables must be binary, that is, they take the values 1 or 0 [1, 2, 4].
$$y_{ik} \in \{0, 1\}, \quad \forall i \in V,\; k = 1, \ldots, n$$
$$x_{ijk} \in \{0, 1\}, \quad \forall i, j \in V,\; i \ne j,\; k = 1, \ldots, n$$
$$z_{ilk} \in \{0, 1\}, \quad \forall i \in V,\; l \in S,\; k = 1, \ldots, n$$
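As an illustration of what the constraints above require, the following Python sketch checks a candidate plan against the stop, capacity, reachability and pickup constraints. It is not a solver and is not part of the Toth and Vigo algorithm; the function name, the dictionary-based data layout and the example data are assumptions. The flow-conservation and connectivity constraints are not checked, because representing each route as an ordered list of stops satisfies them by construction.

```python
from collections import Counter


def check_solution(routes, pickups, reach, capacity):
    """List the constraints a candidate plan violates (empty list = feasible).

    routes   -- {bus: [stops visited, school excluded]}        (y_ik = 1)
    pickups  -- {student: (stop, bus)}                         (z_ilk = 1)
    reach    -- {student: set of stops within walking range}   (s_il = 1)
    capacity -- common bus capacity C
    """
    violations = []

    # Each stop other than the school is visited at most once.
    visits = Counter(stop for stops in routes.values() for stop in stops)
    for stop, count in visits.items():
        if count > 1:
            violations.append(f"stop {stop} is visited {count} times")

    # The capacity of each bus is not exceeded.
    load = Counter(bus for _, bus in pickups.values())
    for bus, count in load.items():
        if count > capacity:
            violations.append(f"bus {bus} carries {count} > {capacity} students")

    for student, (stop, bus) in pickups.items():
        # A student boards only where the assigned bus actually stops.
        if stop not in routes.get(bus, []):
            violations.append(f"bus {bus} does not visit stop {stop}")
        # A student boards only at a stop he or she can approach.
        if stop not in reach.get(student, set()):
            violations.append(f"student {student} cannot reach stop {stop}")

    # Every student is picked up; the dict already rules out double pickups.
    for student in reach:
        if student not in pickups:
            violations.append(f"student {student} is not picked up")
    return violations


# Hypothetical plan: two buses, three students.
routes = {1: [3, 5], 2: [7]}
reach = {"a": {3}, "b": {3, 5}, "c": {7}}
pickups = {"a": (3, 1), "b": (5, 1), "c": (7, 2)}
print(check_solution(routes, pickups, reach, capacity=2))  # []
print(check_solution(routes, pickups, reach, capacity=1))
```

With capacity 1 the second call reports that bus 1 carries two students, which corresponds to a violation of the capacity constraint.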
4 Implementation of an Algorithm to Solve the School Bus Routing Problem
In this paper, we describe a methodology that deals with the school bus routing problem (SBRP). The SBRP is an integrated system of pickup and delivery of like or unlike entities (students, organization personnel, objects) from fixed points called stops to a fixed point called the starting point (a school, a public service, a factory) within an urban area. We mainly consider picking up students from their residences and delivering them to the starting node, which is the school. A virtual map is provided in graph form, containing the school (starting point) and all the students' residences (nodes), together with all distances between the nodes (weights).

4.1 Problem Description
When planning the routes of a school's bus fleet to pick up all students, the following are taken into account:
• The main objective is to reduce the average time students spend on buses.
• Maximum permissible walking distance from a student's residence to a bus stop.
• Transportation of students to school should be completed within a specified time window.
• Fixed bus capacity.
• Lowering of the required number of buses, if possible.

Below (Fig. 1) we provide the graph G(V, E, D) that is used throughout this section to build the methodology (and analysis) for solving the SBRP. This paper studies the SBRP only in terms of the average time a student spends on a bus, not in terms of the average distance a bus travels or the number of buses used to transport all students.

Fig. 1. School – Residences Graph.

• Each element i ∈ V is called a vertex. It represents the residence of a student, except for node 8, which represents the school.
• Each pair of two nodes (i, j) ∈ E forms an edge. It represents the road with the minimum mileage between the two residences. The distance d_ij ∈ D between two vertices is called the weight and represents the distance in kilometers between these two vertices.

4.2 Stepwise Algorithm
Variables: In this section an attempt is made to solve the SBRP using graph theory. Before the solution can be implemented, it is necessary to construct the data that are used throughout the process.
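The graph given above is undirected and weighted. The number of vertices is 15. Table 2 shows the set of edges E with weights D [6].

Table 2. Table of Edges E

| N | Edge (i, j) | Weight (D) |
|---|---|---|
| 1 | (1, 2) | 7 |
| 2 | (1, 3) | 3 |
| 3 | (1, 4) | 6 |
| 4 | (2, 4) | 2 |
| 5 | (2, 5) | 8 |
| 6 | (3, 8) | 2 |
| 7 | (4, 5) | 5 |
| 8 | (4, 7) | 9 |
| 9 | (4, 8) | 3 |
| 10 | (5, 6) | 2 |
| 11 | (6, 9) | 6 |
| 12 | (6, 10) | 7 |
| 13 | (7, 11) | 7 |
| 14 | (8, 12) | 2 |
| 15 | (8, 13) | 8 |
| 16 | (9, 10) | 4 |
| 17 | (10, 11) | 3 |
| 18 | (10, 15) | 9 |
| 19 | (11, 13) | 5 |
| 20 | (12, 13) | 3 |
| 21 | (12, 14) | 6 |
| 22 | (13, 15) | 8 |
| 23 | (14, 15) | 10 |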
From the given network of the urban area under investigation the adjacency matrix A is extracted (see Table 3). Each element a_ij ∈ A is calculated as follows [6]:
$$a_{ij} = \begin{cases} 1, & \text{if } e_{ij} \in E \\ 0, & \text{if } e_{ij} \notin E \end{cases}$$
Since the graph G is undirected:
$$a_{ij} = a_{ji}$$
Table 3. Adjacency Matrix A

0 1 1 1 0 0 0 0 0 0 0 0 0 0 0
1 0 0 1 1 0 0 0 0 0 0 0 0 0 0
1 0 0 0 0 0 0 1 0 0 0 0 0 0 0
1 1 0 0 1 0 1 1 0 0 0 0 0 0 0
0 1 0 1 0 1 0 0 0 0 0 0 0 0 0
0 0 0 0 1 0 0 0 1 1 0 0 0 0 0
0 0 0 1 0 0 0 0 0 0 1 0 0 0 0
0 0 1 1 0 0 0 0 0 0 0 1 1 0 0
0 0 0 0 0 1 0 0 0 1 0 0 0 0 0
0 0 0 0 0 1 0 0 1 0 1 0 0 0 1
0 0 0 0 0 0 1 0 0 1 0 0 1 0 0
0 0 0 0 0 0 0 1 0 0 0 0 1 1 0
0 0 0 0 0 0 0 1 0 0 1 1 0 0 1
0 0 0 0 0 0 0 0 0 0 0 1 0 0 1
0 0 0 0 0 0 0 0 0 1 0 0 1 1 0
The adjacency matrix with the corresponding weights is then calculated (see Table 4). Each element Nij ∈ N is calculated as follows:

\[
N_{ij} =
\begin{cases}
d_{ij}, & \text{if } a_{ij} = 1 \\
0, & \text{if } a_{ij} = 0
\end{cases}
\]

Since the graph G is undirected, Nij = Nji.
• The urban network N represents the set of nodes, i.e. the student residences together with the school node. This paper assumes that there is only one school [7, 8].
• The set S ⊂ N contains only the student residence nodes. Without loss of generality it is assumed that each element of S corresponds to one student. The total number of students is |S| = |N| − 1 = m ⇒ m = 14.
• In the urban network N it is assumed that the origin-destination node (school) is school_node = 8.
• Number of buses: k = 3.
• Capacity of students per bus: C = 5.
• Without loss of generality it is assumed that all buses have the same capacity.
• Maximum permissible walking distance of pupils from their home to the nearest bus stop: W = 2.
Table 4. Adjacency Matrix with Weights for the Network

0  7  3  6  0  0  0  0  0  0  0  0  0  0  0
7  0  0  2  8  0  0  0  0  0  0  0  0  0  0
3  0  0  0  0  0  0  2  0  0  0  0  0  0  0
6  2  0  0  5  0  9  3  0  0  0  0  0  0  0
0  8  0  5  0  2  0  0  0  0  0  0  0  0  0
0  0  0  0  2  0  0  0  6  7  0  0  0  0  0
0  0  0  9  0  0  0  0  0  0  7  0  0  0  0
0  0  2  3  0  0  0  0  0  0  0  2  8  0  0
0  0  0  0  0  6  0  0  0  4  0  0  0  0  0
0  0  0  0  0  7  0  0  4  0  3  0  0  0  9
0  0  0  0  0  0  7  0  0  3  0  0  5  0  0
0  0  0  0  0  0  0  2  0  0  0  0  3  6  0
0  0  0  0  0  0  0  8  0  0  5  3  0  0  8
0  0  0  0  0  0  0  0  0  0  0  6  0  0  10
0  0  0  0  0  0  0  0  0  9  0  0  8  10 0
4.3 Calculation of the Optimal Distance of the Network N Connecting All Pairs of Vertices

From the network N, by running Dijkstra's algorithm once from each node, the matrix Distij is calculated; it contains the optimal distance of each node i from every other node j with i ≠ j (see Table 5). The implementation of this algorithm is not the subject of this paper, so only the results are presented. To extract the results, a dedicated Python script was used, employing the Atom framework.

4.4 Construction of Eccentricity Matrix

From the Dist matrix of optimal distances (Dijkstra matrix) the eccentricity matrix ei is derived. The eccentricity of a vertex i is defined as its maximum distance from all other vertices [9]. See Table 6 for the eccentricity of each node.

\[
e_i = \max\{\, dist_{ij},\ \forall j \in V,\ i \neq j \,\}
\]

4.5 Calculation of Matrix walkij

From the Dist matrix of optimal distances (Dijkstra matrix), we derive the matrix walkij, which indicates whether a student can reach a possible stop on foot (as shown in Table 7):

\[
walk_{ij} =
\begin{cases}
1, & \text{if } 0 \le dist_{ij} \le W,\ \forall j \in V \\
0, & \text{if } dist_{ij} > W
\end{cases}
\]
Table 5. The Dist Matrix Contains the Optimal Distances between Two Vertices. Results after using Dijkstra

0  7  3  6  11 13 15 5  19 18 15 7  10 13 18
7  0  7  2  7  9  11 5  15 16 15 7  10 13 18
3  7  0  5  10 12 14 2  18 15 12 4  7  10 15
6  2  5  0  5  7  9  3  13 14 13 5  8  11 16
11 7  10 5  0  2  14 8  8  9  12 10 13 16 18
13 9  12 7  2  0  16 10 6  7  10 12 15 18 16
15 11 14 9  14 16 0  12 14 10 7  14 12 20 19
5  5  2  3  8  10 12 0  16 13 10 2  5  8  13
19 15 18 13 8  6  14 16 0  4  7  15 12 21 13
18 16 15 14 9  7  10 13 4  0  3  11 8  17 9
15 15 12 13 12 10 7  10 7  3  0  8  5  14 12
7  7  4  5  10 12 14 2  15 11 8  0  3  6  11
10 10 7  8  13 15 12 5  12 8  5  3  0  9  8
13 13 10 11 16 18 20 8  21 17 14 6  9  0  10
18 18 15 16 18 16 19 13 13 9  12 11 8  10 0
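The paper presents only the resulting distances. As an illustration of how such a matrix can be obtained, the following sketch runs a textbook heap-based Dijkstra search from every vertex of the graph in Table 2. It is not the authors' script; the names `build_graph`, `dijkstra`, `GRAPH` and `DIST` are hypothetical.

```python
import heapq

# Edge list of Table 2 as (i, j, weight); vertex 8 is the school.
EDGES = [
    (1, 2, 7), (1, 3, 3), (1, 4, 6), (2, 4, 2), (2, 5, 8), (3, 8, 2),
    (4, 5, 5), (4, 7, 9), (4, 8, 3), (5, 6, 2), (6, 9, 6), (6, 10, 7),
    (7, 11, 7), (8, 12, 2), (8, 13, 8), (9, 10, 4), (10, 11, 3),
    (10, 15, 9), (11, 13, 5), (12, 13, 3), (12, 14, 6), (13, 15, 8),
    (14, 15, 10),
]


def build_graph(edges):
    """Adjacency lists of the undirected weighted graph."""
    graph = {}
    for i, j, w in edges:
        graph.setdefault(i, []).append((j, w))
        graph.setdefault(j, []).append((i, w))
    return graph


def dijkstra(graph, source):
    """Shortest distances from source to every vertex (non-negative weights)."""
    dist = {v: float("inf") for v in graph}
    dist[source] = 0
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry, a shorter path was already found
        for v, w in graph[u]:
            candidate = d + w
            if candidate < dist[v]:
                dist[v] = candidate
                heapq.heappush(heap, (candidate, v))
    return dist


GRAPH = build_graph(EDGES)
DIST = {v: dijkstra(GRAPH, v) for v in GRAPH}  # DIST[i][j], 1-based vertices
print(DIST[8][9])  # distance between the school (8) and vertex 9
```

Any all-pairs method (for example Floyd-Warshall) would produce the same matrix on a graph of this size.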
Table 6. Eccentricities for each Node

Node          1  2  3  4  5  6  7  8  9  10 11 12 13 14 15
Eccentricity  19 18 18 16 18 18 20 16 21 18 15 15 15 21 19
Table 7. Student Approaching to Various Stops

1 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 1 0 1 0 0 0 0 0 0 0 0 0 0 0
0 0 1 0 0 0 0 1 0 0 0 0 0 0 0
0 1 0 1 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 1 1 0 0 0 0 0 0 0 0 0
0 0 0 0 1 1 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 1 0 0 0 0 0 0 0 0
0 0 1 0 0 0 0 1 0 0 0 1 0 0 0
0 0 0 0 0 0 0 0 1 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 1 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 1 0 0 0 0
0 0 0 0 0 0 0 1 0 0 0 1 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 1 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
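Both the eccentricities of Table 6 and the walk matrix of Table 7 follow directly from the Dist matrix. The sketch below hard-codes the values of Table 5 and applies the two definitions of Sects. 4.4 and 4.5 with W = 2. The function names are hypothetical and the code is only an illustration of the definitions.

```python
# Dist matrix copied from Table 5 (row i, column j; vertices are 1-based).
DIST = [
    [0, 7, 3, 6, 11, 13, 15, 5, 19, 18, 15, 7, 10, 13, 18],
    [7, 0, 7, 2, 7, 9, 11, 5, 15, 16, 15, 7, 10, 13, 18],
    [3, 7, 0, 5, 10, 12, 14, 2, 18, 15, 12, 4, 7, 10, 15],
    [6, 2, 5, 0, 5, 7, 9, 3, 13, 14, 13, 5, 8, 11, 16],
    [11, 7, 10, 5, 0, 2, 14, 8, 8, 9, 12, 10, 13, 16, 18],
    [13, 9, 12, 7, 2, 0, 16, 10, 6, 7, 10, 12, 15, 18, 16],
    [15, 11, 14, 9, 14, 16, 0, 12, 14, 10, 7, 14, 12, 20, 19],
    [5, 5, 2, 3, 8, 10, 12, 0, 16, 13, 10, 2, 5, 8, 13],
    [19, 15, 18, 13, 8, 6, 14, 16, 0, 4, 7, 15, 12, 21, 13],
    [18, 16, 15, 14, 9, 7, 10, 13, 4, 0, 3, 11, 8, 17, 9],
    [15, 15, 12, 13, 12, 10, 7, 10, 7, 3, 0, 8, 5, 14, 12],
    [7, 7, 4, 5, 10, 12, 14, 2, 15, 11, 8, 0, 3, 6, 11],
    [10, 10, 7, 8, 13, 15, 12, 5, 12, 8, 5, 3, 0, 9, 8],
    [13, 13, 10, 11, 16, 18, 20, 8, 21, 17, 14, 6, 9, 0, 10],
    [18, 18, 15, 16, 18, 16, 19, 13, 13, 9, 12, 11, 8, 10, 0],
]
W = 2  # maximum permissible walking distance, in graph weight units


def eccentricities(dist):
    """e_i = max_j dist_ij (the zero diagonal never attains the maximum)."""
    return [max(row) for row in dist]


def walk_matrix(dist, w):
    """walk_ij = 1 if dist_ij <= w, else 0."""
    return [[1 if d <= w else 0 for d in row] for row in dist]


def within_walking_distance(dist, stop, w):
    """Vertices (1-based) whose distance to `stop` is at most w."""
    return [j + 1 for j, d in enumerate(dist[stop - 1]) if d <= w]


ECC = eccentricities(DIST)
WALK = walk_matrix(DIST, W)
print(within_walking_distance(DIST, 4, W))  # vertices that can walk to stop 4
```

The helper `within_walking_distance` corresponds to reading one row of Table 7; it is the lookup used later when a stop also serves nearby residences.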
4.6 Input of Vertices Qi, Bus_Stopi

In this step we partition the set of vertices S into k (the number of buses) suitable subsets called Qi. The set Qi contains elements of the set N and denotes the students that bus i will pick up. Furthermore, the set Bus_stopi denotes the stops that bus i will make. The following basic conditions should hold:
• Qi ⊆ S and Bus_stopi ⊆ Qi ⊆ S.
• $Q_i = \{q^i_1, q^i_2, \ldots, q^i_{l_i}\}$, where i = 1, 2, ..., k and $l_i \le C$. The element $q^i_{l_i}$ expresses that the particular bus will pick up the respective student from his/her home. That is, it is an element of the set S.
• S = Q1 ∪ Q2 ∪ ··· ∪ Qk [5, 8].
• Qi ∩ Qj = ∅ and Bus_stopi ∩ Bus_stopj = ∅, ∀i ≠ j [8].
• |Qi| ≤ C and |Bus_stopi| ≤ |Qi| ≤ C.

The vertices belonging to some Qi are said to be bound. This means that the students contained in these sets have been picked up by a bus and no further visit to the corresponding residence is needed. We define the set F ⊆ S to be the set of all bound vertices. The complement of this set, Fc = S − F, contains the free vertices, i.e. students who can still be picked up by a bus. The school node is excluded from the set F.

The first step in completing the sets Qi is to insert the first vertices into the corresponding sets. Since this paper studies the average time a student stays on a bus, it assumes that the pickup of the first student on each bus takes place at time t0 = 0. The travel time from the school to the residence of the first student is therefore not calculated. The study is carried out only from the point of view of the student and not from the point of view of the bus driver. Using the eccentricity matrix ei, the first vertices of the sets Qi are added. The selection method is described below:

1. The vertices with the lowest eccentricity are selected. The reason is that these vertices have better access to their most remote areas.
\[
Q_i = \operatorname{index} \min_{i=1,\ldots,k} \{e_i\}, \quad \text{where } \min_{i=1,\ldots,k} \{e_i\} \le k
\]

2. If there are more than k vertices, the next criterion is to select the vertices with the largest number of neighbors. The reason is that such a vertex has better access to other vertices and hence a greater contribution to the graph.

\[
Q_i = \max_{i \,\in\, \operatorname{index} \min_{i=1,\ldots,k} \{e_i\}} \ \sum_{j \in V} a_{ij}
\]

3. If there are still more than k vertices, one is chosen at random.

For graph N, the following is observed: e(13) = e(11) = e(12) = 15. So the corresponding vertices are the initial vertices of the Qi.

Q1    Q2    Q3
13    11    12
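The three selection rules can be sketched as a single sort. This is an illustration under stated assumptions: the eccentricities are those of Table 6, the neighbor counts are the row sums of Table 3, and the random choice of rule 3 is replaced by the vertex number so that the result is reproducible. The function name `initial_vertices` is hypothetical.

```python
# Eccentricity of vertices 1..15 (Table 6) and number of neighbors
# of each vertex (row sums of the adjacency matrix in Table 3).
ECC = [19, 18, 18, 16, 18, 18, 20, 16, 21, 18, 15, 15, 15, 21, 19]
DEGREE = [3, 3, 2, 5, 3, 3, 2, 4, 2, 4, 3, 3, 4, 2, 3]
SCHOOL = 8
K = 3  # number of buses


def initial_vertices(ecc, degree, school, k):
    """Pick k start vertices: lowest eccentricity, then most neighbors.

    Assumption: remaining ties are broken by vertex number instead of
    at random, so that repeated runs give the same answer.
    """
    students = [v for v in range(1, len(ecc) + 1) if v != school]
    students.sort(key=lambda v: (ecc[v - 1], -degree[v - 1], v))
    return students[:k]


SEEDS = initial_vertices(ECC, DEGREE, SCHOOL, K)
print(SEEDS)
```

For this network exactly three student vertices share the minimum eccentricity of 15, so rules 2 and 3 only affect the order in which they are assigned to Q1, Q2 and Q3.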
At time t0 = 0 each bus is at its given location. The table below shows the distances travelled by each bus with at least one student on board.

Distance(Q1)    Distance(Q2)    Distance(Q3)
0               0               0
The set of bound vertices F is shown below.

F = {11, 12, 13}

The initial vertices of the Qi are the initial bus departure stops. The stops of all buses are shown below.

Bus_Stop1    Bus_Stop2    Bus_Stop3
13           11           12
In this section we analyze and present the process of finding the minimal sub-graph of N containing G(Qi). Through an algorithmic procedure we place the appropriate nodes in the corresponding subsets Qi ⊆ S. The node elements (student houses) are denoted $q^{Bus}_{node} = q^i_r$, with $q^i_r \in Q_i$.
Below are the basic criteria for selecting an element and placing it in the corresponding set.

1. The main criterion for selecting an element is the objective function. It is defined as the minimum distance between the sums of the shortest distances contained in the Qi set [3].

\[
q^i_j \mid V_i = \min_{r \in Q_i} \min_{j} a^i_{jr}, \quad \forall i = 1, \ldots, k,\ \forall j \in Free,\ \forall r \in Q_i
\]

2. The second criterion applies only if there are two or more possible $q^i_r$ which have the same value. In this case the node with the smallest eccentricity is chosen.

\[
q^i_j \mid \min_i e\left(q^i_j\right), \quad \forall i = 1, \ldots, k,\ \forall j \in Free
\]

3. The last criterion applies only if there are two or more possible $q^i_r$ which have identical eccentricities. In this case the node with more neighboring elements is chosen.

In each step, the following are calculated:
• The possible element nodes $q^i_r$, using the objective function.
• The predominant element $q^i_r$.
• Then finding all nodes that satisfy the maximum pedestrian distance criterion.
• Placing them in the corresponding set Qi.
• Finding the next bus stop i.
• Finding the minimum distance that bus i will travel from the departure stop to the pick-up stop of the next student(s).
• The insertion of a node (or several) into the set F, i.e., its reclassification from unbound to bound.

Below is a step-by-step representation of the algorithm.

1st Iteration

Finding possible elements using the objective function:

\[
\left\{ \begin{array}{l} q^1_3 \\ q^2_{10} \\ q^3_3 \end{array} \right.
\quad \text{with corresponding values:} \quad
\left\{ \begin{array}{l} V_1 = 7 \\ V_2 = 3 \\ V_3 = 4 \end{array} \right.
\]

The second element is selected, since min(V1, V2, V3) = 3. We select the element $q^2_{10}$. No unbound neighboring element of node {10} satisfies the maximum pedestrian distance requirement. The element is added to the second column. This means that the student residing at house {10} will take bus 2.

Q1    Q2        Q3
13    11, 10    12

The vertex {10} is the next stop of the second bus. The stops of all buses are shown below.

Bus_Stop1    Bus_Stop2    Bus_Stop3
13           11, 10       12

The second bus will travel from {11} to {10}, a distance of Dist11,10 = 3.

Distance(Q1)    Distance(Q2)    Distance(Q3)
0               0, 3            0

The vertex {10} now belongs to the set F of bound vertices.

F = {10, 11, 12, 13}
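The candidate values of the 1st iteration can be reproduced with a short sketch. It reads the objective function as the smallest Dist value between a free vertex and any member of Qi, which is an interpretation of the text rather than the authors' code, and it is checked here against the 1st iteration only. The names `candidates`, `DIST_ROWS` and `CAND` are hypothetical.

```python
# Rows 11, 12 and 13 of the Dist matrix (Table 5), keyed by vertex number.
DIST_ROWS = {
    11: [15, 15, 12, 13, 12, 10, 7, 10, 7, 3, 0, 8, 5, 14, 12],
    12: [7, 7, 4, 5, 10, 12, 14, 2, 15, 11, 8, 0, 3, 6, 11],
    13: [10, 10, 7, 8, 13, 15, 12, 5, 12, 8, 5, 3, 0, 9, 8],
}
SCHOOL = 8
Q = {1: [13], 2: [11], 3: [12]}  # state before the 1st iteration


def candidates(q_sets, dist_rows, school, n=15):
    """For each bus, return (free vertex, value) with the smallest distance
    to any vertex already in that bus's set (assumed reading of V_i)."""
    bound = {v for members in q_sets.values() for v in members}
    free = [v for v in range(1, n + 1) if v != school and v not in bound]
    result = {}
    for bus, members in q_sets.items():
        value, node = min(
            (dist_rows[r][j - 1], j) for r in members for j in free
        )
        result[bus] = (node, value)
    return result


CAND = candidates(Q, DIST_ROWS, SCHOOL)
BEST_BUS = min(CAND, key=lambda bus: CAND[bus][1])
print(CAND, BEST_BUS)
```

The eccentricity and neighbor-count tie-breaks of criteria 2 and 3, the capacity check and the walking-distance step are left out of this sketch.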
2nd Iteration

Finding possible elements using the objective function:

\[
\left\{ \begin{array}{l} q^1_3 \\ q^2_9 \\ q^3_3 \end{array} \right.
\quad \text{with corresponding values:} \quad
\left\{ \begin{array}{l} V_1 = 7 \\ V_2 = 4 \\ V_3 = 4 \end{array} \right.
\]

The second and third elements have equal values, since min(V1, V2, V3) = 4. According to the second criterion, the third element has the smaller eccentricity, since min(e(9), e(3)) = min(21, 18) ⇒ e(3) = 18. We select the element $q^3_3$. No unbound neighboring element of node {3} satisfies the maximum pedestrian distance requirement. The element is inserted in the third column. This means that the student who resides in house {3} will take bus 3.

Q1    Q2        Q3
13    11, 10    12, 3

The vertex {3} is the next stop of the third bus. The stops of all buses are shown below.

Bus_Stop1    Bus_Stop2    Bus_Stop3
13           11, 10       12, 3

The third bus will travel from {12} to {3}, a distance of Dist12,3 = 4.

Distance(Q1)    Distance(Q2)    Distance(Q3)
0               0, 3            0, 4

Vertex {3} now belongs to the set F of bound vertices.

F = {3, 10, 11, 12, 13}
3rd Iteration

Finding possible elements using the objective function:

\[
\left\{ \begin{array}{l} q^1_4 \\ q^2_9 \\ q^3_4 \end{array} \right.
\quad \text{with corresponding values:} \quad
\left\{ \begin{array}{l} V_1 = 8 \\ V_2 = 4 \\ V_3 = 5 \end{array} \right.
\]

The second element is selected, since min(V1, V2, V3) = 4. We select the element $q^2_9$. No unbound neighboring element of node {9} satisfies the maximum pedestrian distance requirement. The element is added to the second column. This means that the student living in house {9} will take bus 2.

Q1    Q2           Q3
13    11, 10, 9    12, 3

Vertex {9} is the next stop of the second bus. The stops of all buses are shown below.

Bus_Stop1    Bus_Stop2    Bus_Stop3
13           11, 10, 9    12, 3

The second bus will travel from house {10} to house {9}, a distance of Dist10,9 = 4.

Distance(Q1)    Distance(Q2)    Distance(Q3)
0               0, 3, 4         0, 4

Vertex {9} now belongs to the set F of bound vertices.

F = {3, 9, 10, 11, 12, 13}
4th Iteration

Finding possible elements using the objective function:

\[
\left\{ \begin{array}{l} q^1_4 \\ q^2_6 \\ q^3_4 \end{array} \right.
\quad \text{with corresponding values:} \quad
\left\{ \begin{array}{l} V_1 = 8 \\ V_2 = 6 \\ V_3 = 5 \end{array} \right.
\]

The third element is selected, since min(V1, V2, V3) = 5. We select the element $q^3_4$. The neighboring element {2} of node {4} satisfies the maximum pedestrian distance requirement. This means that the student residing at location {2} should take the third bus arriving at location {4}. Both are entered in the third column. This means that the students living in houses {4, 2} will board bus 3 at the same stop.

Q1    Q2           Q3
13    11, 10, 9    12, 3, 4, 2

Vertex {4} is the next stop of the third bus. The stops of all buses are shown below.

Bus_Stop1    Bus_Stop2    Bus_Stop3
13           11, 10, 9    12, 3, 4

The third bus will travel from house {3} to house {4}, a distance of Dist3,4 = 5.

Distance(Q1)    Distance(Q2)    Distance(Q3)
0               0, 3, 4         0, 4, 5

Vertices {2, 4} now belong to the set F of bound vertices.

F = {2, 3, 4, 9, 10, 11, 12, 13}
5th Iteration

Finding possible elements using the objective function:

\[
\left\{ \begin{array}{l} q^1_{15} \\ q^2_6 \\ q^3_1 \end{array} \right.
\quad \text{with corresponding values:} \quad
\left\{ \begin{array}{l} V_1 = 8 \\ V_2 = 6 \\ V_3 = 7 \end{array} \right.
\]

The second element is selected, since min(V1, V2, V3) = 6. We select the element $q^2_6$. The neighboring element {5} of node {6} satisfies the maximum pedestrian distance requirement. This means that the student residing at location {5} should take the second bus arriving at location {6}. Both are entered in the second column. This means that the students living in houses {6, 5} will take bus 2.

Q1    Q2                 Q3
13    11, 10, 9, 6, 5    12, 3, 4, 2

Vertex {6} is the next stop of the second bus. The stops of all buses are shown below.

Bus_Stop1    Bus_Stop2       Bus_Stop3
13           11, 10, 9, 6    12, 3, 4

The second bus will travel from house {9} to house {6}, a distance of Dist9,6 = 6.

Distance(Q1)    Distance(Q2)    Distance(Q3)
0               0, 3, 4, 6      0, 4, 5

Vertices {6, 5} now belong to the set F of bound vertices.

F = {2, 3, 4, 5, 6, 9, 10, 11, 12, 13}
6th Iteration

Finding possible elements using the objective function:

\[
\left\{ \begin{array}{l} q^1_{15} \\ - \\ q^3_1 \end{array} \right.
\quad \text{with corresponding values:} \quad
\left\{ \begin{array}{l} V_1 = 8 \\ - \\ V_3 = 7 \end{array} \right.
\]

The third element is selected, since min(V1, V3) = 7. The second bus is not included in the objective function because it has reached the maximum number of students (|Q2| = C[2] = 5). We select the element $q^3_1$. No unbound neighboring element of node {1} satisfies the maximum pedestrian distance requirement. The element is inserted in the third column. This means that the student who resides in house {1} will take bus 3.

Q1    Q2                 Q3
13    11, 10, 9, 6, 5    12, 3, 4, 2, 1

Vertex {1} is the next stop of the third bus. The stops of all buses are shown below.

Bus_Stop1    Bus_Stop2       Bus_Stop3
13           11, 10, 9, 6    12, 3, 4, 1

The third bus will travel from house {2} to house {1}, a distance of Dist2,1 = 7.

Distance(Q1)    Distance(Q2)    Distance(Q3)
0               0, 3, 4, 6      0, 4, 5, 7

Vertex {1} now belongs to the set F of bound vertices.

F = {1, 2, 3, 4, 5, 6, 9, 10, 11, 12, 13}
7th Iteration

Finding possible elements using the objective function:

\[
\left\{ \begin{array}{l} q^1_{15} \\ - \\ - \end{array} \right.
\quad \text{with corresponding values:} \quad
\left\{ \begin{array}{l} V_1 = 8 \\ - \\ - \end{array} \right.
\]

The first element is chosen, since min(V1) = 8. The second and third buses are not counted in the objective function because they have reached the maximum number of students (|Q2| = |Q3| = C = 5). We select the element $q^1_{15}$. No unbound neighboring element of node {15} satisfies the maximum pedestrian distance requirement. The element is added to the first column. This means that the student who resides at house {15} will take bus 1.

Q1        Q2                 Q3
13, 15    11, 10, 9, 6, 5    12, 3, 4, 2, 1

Vertex {15} is the next stop of the first bus. The stops of all buses are shown below.

Bus_Stop1    Bus_Stop2       Bus_Stop3
13, 15       11, 10, 9, 6    12, 3, 4, 1

The first bus will travel from house {13} to house {15}, a distance of Dist13,15 = 8.

Distance(Q1)    Distance(Q2)    Distance(Q3)
0, 8            0, 3, 4, 6      0, 4, 5, 7

Vertex {15} now belongs to the set F of bound vertices.

F = {1, 2, 3, 4, 5, 6, 9, 10, 11, 12, 13, 15}
8th Iteration

Finding possible elements using the objective function:

\[
\left\{ \begin{array}{l} q^1_{14} \\ - \\ - \end{array} \right.
\quad \text{with corresponding values:} \quad
\left\{ \begin{array}{l} V_1 = 10 \\ - \\ - \end{array} \right.
\]

The first element is chosen, since min(V1) = 10. The second and third buses are not taken into account in the objective function because they have reached the maximum number of students (|Q2| = |Q3| = C = 5). We select the element $q^1_{14}$. No unbound neighboring element of node {14} satisfies the maximum pedestrian distance requirement. The element is inserted in the first column. This means that the student who resides at house {14} will take bus 1.

Q1            Q2                 Q3
13, 15, 14    11, 10, 9, 6, 5    12, 3, 4, 2, 1

Vertex {14} is the next stop of the first bus. The stops of all buses are shown below.

Bus_Stop1     Bus_Stop2       Bus_Stop3
13, 15, 14    11, 10, 9, 6    12, 3, 4, 1

The first bus will travel from house {15} to house {14}, a distance of Dist15,14 = 10.

Distance(Q1)    Distance(Q2)    Distance(Q3)
0, 8, 10        0, 3, 4, 6      0, 4, 5, 7

Vertex {14} now belongs to the set F of bound vertices.

F = {1, 2, 3, 4, 5, 6, 9, 10, 11, 12, 13, 14, 15}
9th Iteration

Finding possible elements using the objective function:

\[
\left\{ \begin{array}{l} q^1_7 \\ - \\ - \end{array} \right.
\quad \text{with corresponding values:} \quad
\left\{ \begin{array}{l} V_1 = 20 \\ - \\ - \end{array} \right.
\]

The first element is chosen, since min(V1) = 20. The second and third buses are not counted in the objective function because they have reached the maximum number of students (|Q2| = |Q3| = C = 5). We select the element $q^1_7$. No unbound neighboring element of node {7} satisfies the maximum pedestrian distance requirement. The element is added to the first column. This means that the student residing at house {7} will take bus 1.

Q1               Q2                 Q3
13, 15, 14, 7    11, 10, 9, 6, 5    12, 3, 4, 2, 1

Vertex {7} is the next stop of the first bus. The stops of all buses are shown below.

Bus_Stop1        Bus_Stop2       Bus_Stop3
13, 15, 14, 7    11, 10, 9, 6    12, 3, 4, 1

The first bus will travel from house {14} to house {7}, a distance of Dist14,7 = 20.

Distance(Q1)    Distance(Q2)    Distance(Q3)
0, 8, 10, 20    0, 3, 4, 6      0, 4, 5, 7

Vertex {7} now belongs to the set F of bound vertices.

F = {1, 2, 3, 4, 5, 6, 7, 9, 10, 11, 12, 13, 14, 15}
Final Results

The algorithm terminates successfully after all nodes have been placed in their respective sets Qi. Indeed, the set F of bound nodes is identical to the set S of all student residences.

Q1               Q2                 Q3
13, 15, 14, 7    11, 10, 9, 6, 5    12, 3, 4, 2, 1

Bus_Stop1        Bus_Stop2       Bus_Stop3
13, 15, 14, 7    11, 10, 9, 6    12, 3, 4, 1

Distance(Q1)    Distance(Q2)    Distance(Q3)
0, 8, 10, 20    0, 3, 4, 6      0, 4, 5, 7

F = {1, 2, 3, 4, 5, 6, 7, 9, 10, 11, 12, 13, 14, 15}
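The final sets can be checked mechanically against the conditions stated at the beginning of Sect. 4.6. The sketch below is only a consistency check on the reported result; the function name `check_partition` is hypothetical.

```python
SCHOOL = 8
S = set(range(1, 16)) - {SCHOOL}  # the 14 student residences
C = 5                             # capacity of each bus

Q = {1: [13, 15, 14, 7], 2: [11, 10, 9, 6, 5], 3: [12, 3, 4, 2, 1]}
BUS_STOP = {1: [13, 15, 14, 7], 2: [11, 10, 9, 6], 3: [12, 3, 4, 1]}


def check_partition(q, stops, students, capacity):
    """True if the sets Q_i partition S, respect the capacity, and every
    stop of bus i is the residence of a student assigned to bus i."""
    assigned = [v for members in q.values() for v in members]
    covers_all = set(assigned) == students
    disjoint = len(assigned) == len(set(assigned))
    within_capacity = all(len(m) <= capacity for m in q.values())
    stops_valid = all(set(stops[bus]) <= set(q[bus]) for bus in q)
    return covers_all and disjoint and within_capacity and stops_valid


print(check_partition(Q, BUS_STOP, S, C))
```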
The following are observed:
• The set of all bound vertices is identical to the set of all student residences: F ≡ S.
• The set of all students picked up by the three buses is identical to the set of all student residences: S = Q1 ∪ Q2 ∪ Q3.
• No bus contains a student node that is contained in another bus: Q1 ∩ Q2 ∩ Q3 = ∅.
• No bus contains a student pick-up stop that is contained in another bus: Bus_stop1 ∩ Bus_stop2 ∩ Bus_stop3 = ∅.
• The number of students picked up by a bus is less than or equal to the capacity of that bus: |Q1| = 4 < C[1] = 5, |Q2| = 5 = C[2] = 5, |Q3| = 5 = C[3] = 5.
• The number of pick-up stops of a bus is less than or equal to the capacity of that bus: |Bus_stop1| = 4 < C[1] = 5, |Bus_stop2| = 4 < C[2] = 5, |Bus_stop3| = 4 < C[3] = 5.

4.7 Statistical Results

In an urban area, a bus must, according to the law, run at an average speed of 60 km/h. However, because of traffic, traffic lights, pedestrian crossings and various other exogenous factors, this paper assumes that a bus runs at an average speed of 40 km/h. At each stop, some time is spent getting a student on the bus and seating them safely. The reasons vary, such as fastening the students in their seats or delays caused by their parents. This paper assumes that approximately 1.5 min is spent at each stop on the entire process, from the moment the student enters the bus until the moment the bus starts.
• Average speed: 40 km/h.
• Total time per stop: 1.5 min.
• Total distance (in km) travelled by each bus, including the distance between the last bus stop and the school:
Q1    Q2     Q3
5     2.3    2.1

• Total journey time of each bus (in min):

Q1     Q2      Q3
7.5    3.45    3.15

• Total journey time including delays for each stop (in min):

Q1      Q2      Q3
13.5    9.45    9.15

• Average journey time for each bus is 10.7 min.
• Average journey time for each child on a given bus (in min):

Q1       Q2      Q3
3.375    1.89    1.83
• The average time each child spends on a bus is 2.365 min.

4.8 Suggested SBRP Algorithm

Consider the following sets:
• V(G), the set containing all nodes including the school node.
• V(E), the set containing all edges that connect two different nodes.
• V(w), the set containing all edge weights.

Below is the proposed algorithm that solves the SBRP:
The above algorithm uses the following notations:
• Inserting a value into a variable: variable = value
• Inserting a value in a list: List ← value

4.9 Results Summary

According to the final results, the following can be observed:
1. The first bus: It will run the route 13–15–14–7–8. It will pick up the students 13, 15, 14, 7. It will travel a total distance of 5 km. The duration of the whole trip is 13.5 min. The average stay time for each child is 3.375 min.
2. The second bus: It will run the route 11–10–9–6–8. It will pick up the students 11, 10, 9, 6, 5. One student will walk to a stop (student 5 to stop 6). The bus will travel a total distance of 2.3 km. The duration of the whole journey is 9.45 min. The average stay time for each child is 1.89 min.
3. The third bus: It will run the route 12–3–4–1–8. It will pick up the students 12, 3, 4, 2, 1. One student will walk to a stop (student 2 to stop 4). The bus will travel a total distance of 2.1 km. The duration of the whole journey is 9.15 min. The average stay time for each child is 1.83 min.

The first bus picks up students from the most remote areas. The other two buses pick up nodes at short distances.
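The figures summarised above can be recomputed from the leg distances with a few lines of arithmetic. The sketch assumes that one weight unit of the graph equals 100 m (inferred from dist 9,8 = 16 being reported as 1.6 km) and takes the last leg of each route, from the final stop to the school, from the Dist matrix of Table 5. The names `ROUTES` and `route_stats` are hypothetical.

```python
SPEED_KMH = 40.0   # assumed average bus speed (Sect. 4.7)
STOP_MIN = 1.5     # assumed time spent at each stop, in minutes
UNITS_PER_KM = 10  # assumption: one graph weight unit corresponds to 100 m

# Leg lengths in graph units; the last leg of each route is the trip from
# the final stop to the school (Dist 7,8 = 12, Dist 6,8 = 10, Dist 1,8 = 5).
ROUTES = {
    1: {"legs": [8, 10, 20, 12], "stops": 4, "students": 4},
    2: {"legs": [3, 4, 6, 10], "stops": 4, "students": 5},
    3: {"legs": [4, 5, 7, 5], "stops": 4, "students": 5},
}


def route_stats(route):
    """Distance (km), driving time, total time and time per student (min)."""
    km = sum(route["legs"]) / UNITS_PER_KM
    drive_min = km / SPEED_KMH * 60
    total_min = drive_min + route["stops"] * STOP_MIN
    return {
        "km": km,
        "drive_min": drive_min,
        "total_min": total_min,
        "per_student_min": total_min / route["students"],
    }


STATS = {bus: route_stats(route) for bus, route in ROUTES.items()}
MEAN_TOTAL = sum(s["total_min"] for s in STATS.values()) / len(STATS)
MEAN_PER_STUDENT = sum(s["per_student_min"] for s in STATS.values()) / len(STATS)
print(STATS[1], MEAN_TOTAL, MEAN_PER_STUDENT)
```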
5 Introduction

5.1 Conducting Experiments on Bus Capacity

In this section, the algorithmic routing process is studied in terms of bus capacity. Three buses are used, each with a different number of student pick-ups. Only cases in which the maximum travel time of a bus is less than 16 min are presented. It is observed that the best choice in terms of the capacity of the three buses is the case where buses 1 and 3 can pick up 5 children while bus 2 can only pick up 4 (see Table 8).
Table 8. Conducting an Experiment with Variable Capacity

Capacity      Number of students    Maximum distance (km)    Maximum route time (min)
C[2, 6, 6]    [2, 6, 6]             3.9                      13.35
C[3, 4, 7]    [3, 4, 7]             3.5                      12.75
C[3, 5, 6]    [3, 5, 6]             4.3                      13.95
C[4, 3, 7]    [4, 3, 7]             5                        13.5
C[4, 4, 7]    [3, 4, 7]             3.5                      12.75
C[4, 5, 5]    [4, 5, 5]             5                        13.5
C[4, 5, 6]    [3, 5, 6]             4.3                      13.95
C[4, 6, 4]    [4, 6, 4]             3.9                      13.35
C[4, 6, 5]    [3, 6, 5]             3.9                      13.35
C[4, 7, 3]    [3, 2, 9]             5                        13.5
C[4, 7, 5]    [3, 6, 5]             3.9                      13.5
C[4, 8, 5]    [3, 6, 5]             3.9                      13.35
C[4, 9, 4]    [4, 6, 4]             3.9                      13.35
C[4, 9, 5]    [3, 6, 5]             3.9                      13.35
C[5, 3, 6]    [5, 3, 6]             4.3                      13.95
C[5, 4, 5]    [5, 4, 5]             4.2                      12.3
C[5, 4, 7]    [3, 4, 7]             3.5                      12.75
C[5, 5, 5]    [4, 5, 5]             5                        13.5
C[5, 6, 3]    [5, 6, 3]             4.7                      14.1
C[5, 6, 5]    [4, 6, 4]             3.9                      13.35
C[4, 7, 3]    [4, 7, 3]             5                        13.5
C[5, 7, 4]    [4, 6, 4]             3.9                      13.5
C[6, 3, 6]    [5, 3, 6]             4.3                      13.95
C[6, 3, 7]    [4, 3, 7]             5                        13.5
5.2 Conducting Experiments on the Number of Buses

The following experiments are conducted with the same capacity for each bus. A maximum of four buses is set (Table 9). It is observed that the best option in terms of maximum journey time for a bus is to use three buses with a capacity of six persons. Although the five-passenger and six-passenger options have the same maximum journey time, the latter is preferred because the total distance travelled by all buses is shorter and bus 2 picks up more students from a stop.
Table 9. Conducting an Experiment with Variable Number of Buses

Case    Capacity      Number of students    Maximum distance (km)    Maximum route time (min)
1       C[14]         [14]                  10.4                     33.6
2       C[7, 7]       [7, 7]                5.9                      17.85
3       C[8, 8]       [8, 6]                5.5                      18.75
4       C[5, 5, 5]    [4, 5, 5]             5                        13.5
5       C[6, 6, 6]    [3, 6, 5]             3.9                      13.5
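The choice of case 5 can be expressed as a two-level comparison on the rows of Table 9. The sketch below orders the cases by maximum route time and breaks ties with the maximum distance column, which is used here as a stand-in for the total distance mentioned in the text (an assumption, since the table reports only the maximum). The name `CASES` is hypothetical.

```python
# Rows of Table 9: (case, capacities, max distance in km, max route time in min).
CASES = [
    (1, [14], 10.4, 33.6),
    (2, [7, 7], 5.9, 17.85),
    (3, [8, 8], 5.5, 18.75),
    (4, [5, 5, 5], 5.0, 13.5),
    (5, [6, 6, 6], 3.9, 13.5),
]

# Smallest maximum route time first; ties are broken by the shorter distance.
BEST = min(CASES, key=lambda case: (case[3], case[2]))
print(BEST[0], BEST[1])
```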
5.3 Conducting Experiments on the Maximum Permissible Walking Distance

The following experiments are performed with the same capacity for each bus. The largest maximum pedestrian distance considered realistic is 600 m. For each case, the configuration with the smallest maximum travel time is recorded (see Table 10).

Table 10. Conducting an Experiment with a Variable Number of Walking Distances

Case    Walking distance (m)    Capacity         Number of students    Maximum distance (km)    Maximum route time (min)
1       0                       C[14]            [14]                  10.8                     37.2
2       100                     C[14]            [14]                  10.8                     37.2
3       200                     C[14]            [14]                  10.4                     33.6
4       300                     C[14]            [14]                  9.1                      27.15
5       400                     C[14]            [14]                  9.1                      27.15
6       500                     C[14]            [14]                  8                        24
7       600                     C[14]            [14]                  7.6                      21.9
8       0                       C[7, 7]          [7, 7]                6.1                      19.65
9       100                     C[7, 7]          [7, 7]                6.1                      19.65
10      200                     C[7, 7]          [7, 7]                5.9                      17.85
11      300                     C[9, 9]          [9, 5]                5.3                      14.4
12      400                     C[9, 9]          [9, 5]                5.3                      14.4
13      500                     C[8, 8]          [6, 8]                5                        13.5
14      600                     C[7, 7]          [7, 7]                4.3                      12.45
15      0                       C[5, 5, 5]       [4, 5, 5]             5                        13.5
16      100                     C[5, 5, 5]       [4, 5, 5]             5                        13.5
17      200                     C[6, 6, 6]       [3, 6, 5]             3.9                      13.35
18      300                     C[6, 6, 6]       [3, 6, 5]             3.9                      13.35
19      400                     C[6, 6, 6]       [4, 4, 6]             3.3                      9.6
20      500                     C[5, 5, 5]       [5, 5, 4]             3.6                      11.4
21      600                     C[5, 5, 5]       [4, 5, 5]             4                        10.5
22      0                       C[5, 5, 5, 5]    [3, 5, 3, 3]          2.3                      10.95
23      100                     C[5, 5, 5, 5]    [3, 5, 3, 3]          2.6                      10.95
24      200                     C[4, 4, 4, 4]    [3, 4, 3, 4]          3.3                      10.95
25      300                     C[5, 5, 5, 5]    [4, 3, 2, 5]          3.5                      9.7
26      400                     C[4, 3, 2, 5]    [4, 7, 5, 6]          3.2                      9.3
27      500                     C[5, 5, 5, 5]    [5, 3, 3, 3]          3.2                      9.3
28      600                     C[6, 6, 6, 6]    [4, 2, 3, 5]          2.2                      6.3
It is observed that the best option in terms of maximum travel time for a bus is to use four buses with a capacity of six people and a maximum permissible walking distance of 600 m. There is a significant improvement in the time taken to deliver students to school. The best case (the last one, 6.3 min) takes only about 17% of the time of the worst case (the first one, 37.2 min).
6 Presentation of Algorithm Results

6.1 Optimal Solution

The optimal solution is defined as the minimum delivery time for all students. In the ideal case, with no constraint on the maximum number of buses, one bus per student would deliver each student to school immediately. In reality, this scenario corresponds to each parent driving their own child to school. Indeed, from the triangle inequality we know:

\[
d_{i,school} \le d_{i,j} + d_{j,school}, \qquad Time_{i,school} \le Time_{i,j} + Time_{j,school}
\]

where i is the initial node, j is an intermediate node and school is the school node. The time using intermediate nodes becomes even longer if the time spent at each stop is taken into account.
The above case is unrealistic for the purposes of this paper, because the aim is to find an optimal solution for picking up students with as few vehicles as possible in the least time. The time T∗ is calculated only for comparison with the final results of the algorithm. From the Dist matrix (Table 5) it is observed that in row 8, which contains all the distances from the school node, the maximum distance is in column 9, that is, dist 9,8 = 1.6 km. All other distances are shorter, so their travel times are covered by this case. It has been assumed that buses run at an average speed of u = 40 km/h and that about 1.5 min (Tstation) is spent at each stop. It is assumed that the start of the journey is also a stop.

\[
T^{*} = Time_{9,8} = \frac{dist_{9,8}}{u} + T_{station}
\]

\[
T^{*} = 3.8 \ \text{min}
\]
6.2 Results in Terms of Capacity The results of the experiment in terms of bus capacity are shown in the diagram in Fig. 2.
Fig. 2. The Results of the Experiment in Terms of Bus Capacity
The following are observed:
• The best case in terms of pick-up and transfer time of students to the school terminal is the 16th, in which the respective bus capacities are 5, 4, 5. The maximum route time is 12.3 min with a maximum distance of 4.2 km. This time is 3.25 times longer than the optimum, which is acceptable: $T_{16} = 3.25 \cdot T^{*}$.
• The final time varies with the capacity of the buses. A bus with a certain capacity can affect the whole process because the outcome depends not only on the distance to the next stop but also on the maximum permitted walking distance: if the students at a stop exceed the remaining capacity of the bus in question, they will have to be picked up by another bus.
• Apart from the best-case scenario, no other safe conclusions can be drawn.

6.3 Results in Terms of the Number of Buses

The results of the experiment in terms of the number of buses are shown in the diagram in Fig. 3.
Fig. 3. Experiment Results in Terms of the Number of Buses
The following are observed:
• The best case in terms of pick-up and transfer time of students to the school terminal is the 5th, that is, the use of three buses. The delivery time of all students is the same in the 4th and 5th cases, but the latter is preferred because the buses travel a shorter distance and pick up more students from one stop. The respective bus capacities are 6, 6, 6. The maximum route time is 13.5 min with a maximum trip distance of 3.9 km. This time is 3.55 times longer than the optimum, which is acceptable: $T_{5} = 3.55 \cdot T^{*}$.
• The final time decreases as the number of buses increases. Comparing the case of one bus with the case of two buses, the total time is reduced by about half.
• In the fourth and fifth cases, although three buses are used in both, the specific capacity can affect the whole process because the outcome depends not only on the distance to the next stop but also on the maximum permissible walking distance. Thus, while in the fourth case more time is spent on approaching the stops, in the fifth case more time is spent on picking up several students from one stop. In such cases the administrations of organizations or companies usually prefer the option with the lowest economic cost, such as case 5.
• As the number of buses increases, the time decreases more and more slowly. Going from one bus to two reduces the time by about 50%, whereas adding a third bus reduces it by only a further 25% relative to two buses (about 60% relative to the time with one bus). With an increase in the number of buses the total time decreases and approaches the optimal time T∗. An evaluation should be conducted to decide the allowable time, from which the total number of buses can be calculated. That is, the administration should determine the maximum permissible time for the delivery of students and choose the number of buses accordingly.

6.4 Results in Terms of Maximum Permissible Walking Distance

The results of the experiment in terms of the maximum permissible walking distance of the students are presented in the diagram in Fig. 4.
Fig. 4. Results of the Experiment in Terms of Maximum Permissible Walking Distance
The following are observed:
• The best case in terms of pick-up and transfer time of students to the school terminal is the 28th, that is, the use of four buses with a maximum permissible walking distance of 600 m. The respective bus capacities are 6, 6, 6, 6, 6. The total time of all journeys is 6.3 min with a maximum travel distance of 2.2 km. The total time is 1.66 times longer than the optimum, which is acceptable: T5 = 1.66 · T*.
• The delivery time of all students decreases in all cases as the maximum permissible walking distance increases. It is observed that depending on
the number of buses, the final time decreases. Simply comparing, for W = 600 m, the case of one bus with the case of two buses shows the rapid decrease in the total delivery time. However, increasing the number of buses does not produce a proportional decrease in the delivery time of students. As the number of buses increases, the time decreases more and more slowly. While going from one bus to two reduces the time by about 44%, the use of a third bus reduces it by only 15.7% compared to two buses, or by about 52% of the total time using one bus.
• As the distance W increases, the mileage decreases. This is a logical conclusion, since some buses will pick up more students from certain stops.
• In any case, the delivery time depends completely on the maximum allowable walking distance. The morphology of the graph and the choice of an appropriate W can significantly influence the final result. For example, in a provincial network where each student resides in a separate village, with the village stops approximately 1.5 km apart, choosing a value of W may not be meaningful. While an increase in the maximum permissible walking distance reduces the delivery time, a reasonable value should be set in advance. Of course, forcing a student to walk 600 m as in the above experiment, perhaps in an urban network with poor pedestrian conditions or in poor weather, may not be acceptable. The management of the organization or private company should assess what the appropriate maximum permissible walking distance is, and conclusions should be drawn accordingly. For example, in the above experiment the option for a student to walk a maximum of 200 m is acceptable. Using one bus, it would take approximately 33.6 min to deliver all students to school. Using four buses, it would take about 10.95 min, which is the optimal solution.
Perhaps the administration should take into account the cost of the bus driver and come up with an average solution that satisfies them. That is, the use of two buses (reducing driver costs), with a maximum permissible walking distance of 200 m (more safety for the students) and a final delivery time of about 17.85 min (quite a remarkable time spent on the bus).
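The figures quoted above can be cross-checked with a few lines of arithmetic. The following is only an illustrative calculation, not part of the proposed algorithm: it derives the optimum time T* implied by the two reported ratios and the relative time savings for W = 200 m. All function and variable names are hypothetical; the numeric inputs are the values stated in the text.

```python
# Illustrative cross-check of the reported delivery times.
# All inputs are values quoted in the text; the names are hypothetical.

def implied_optimum(total_time_min, ratio_to_optimum):
    """Optimum time T* implied by the relation T = ratio * T*."""
    return total_time_min / ratio_to_optimum

def reduction_percent(time_before_min, time_after_min):
    """Relative reduction of the delivery time, in percent."""
    return 100.0 * (time_before_min - time_after_min) / time_before_min

# Case 5: 13.5 min = 3.55 * T*; case 28: 6.3 min = 1.66 * T*.
t_opt_case5 = implied_optimum(13.5, 3.55)
t_opt_case28 = implied_optimum(6.3, 1.66)

# W = 200 m: delivery time in minutes, keyed by the number of buses.
times_w200 = {1: 33.6, 2: 17.85, 4: 10.95}
saving_two_buses = reduction_percent(times_w200[1], times_w200[2])
saving_four_buses = reduction_percent(times_w200[1], times_w200[4])

print(f"T* implied by case 5:  {t_opt_case5:.2f} min")
print(f"T* implied by case 28: {t_opt_case28:.2f} min")
print(f"Two buses save {saving_two_buses:.1f}% versus one bus")
print(f"Four buses save {saving_four_buses:.1f}% versus one bus")
```

Both ratios imply an optimum of roughly 3.8 min, which suggests that the two experiments refer to the same T*.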
7 Conclusions
Solving an SBRP problem is multifactorial. The requirements of the problem must be fully defined at the beginning to derive the optimal solution. The final conclusions are evaluated according to the final pick-up and delivery time of all students, which is the maximum delivery time over all buses. One, and perhaps the most basic, initial requirement is to define the maximum allowable walking distance W. It was observed that the longer this distance is, the shorter the delivery time. That is, the number of stops decreases, resulting in more students being picked up from specific stops. It was also observed that its combination with bus capacity is equally important. In particular, in cases where many students are able to walk to and attend a particular stop, the capacity of the bus determines whether that bus will attend that stop. Otherwise, if it takes a different route, it will have to incur a certain time cost.
The number of buses is also a catalyst for the final delivery time of the students. As the number of buses increases, the final delivery time decreases. However, the management of the organization or company should define at the beginning the maximum number of buses that it can provide in order to be able to find the optimal route of all buses. Lastly, no firm conclusions can be drawn from bus capacity alone. However, as mentioned above, its combination with the number of buses and the maximum permissible walking distance has a significant influence on the final results.
Listed below are some suggestions for the further development of this work. First of all, implementing the proposed algorithm in a programming language such as Python will save time in finding the optimal solution. The algorithm will be tested on various problems in order to check its correctness. Moreover, instead of studying each problem separately, an attempt should be made to group various problems into families with similar characteristics. The purpose is to find the differences between the routing problems, analyze them, and then identify all the changes that should be made to the algorithm so that it can be executed.
Statements and Declarations
The authors have no relevant financial or non-financial interests to disclose.
py-microdots: Position Encoding in the Euclidean Plane Based on the Anoto Codec
Christoph Heindl
Visual Computing, PROFACTOR GmbH, Steyr, Austria
[email protected]
Abstract. Determining the precise position of devices over large spaces is a challenging problem. In this work, we provide an in-depth review of the Anoto structured pattern codec to solve this localization problem in the Euclidean plane. Starting from the preferred embodiment of the code used in many digital pen applications, we proceed to generalize the underlying principles to enable codes tailored to application-specific requirements. To facilitate working with these codes, we introduce a modern Python library named py-microdots to encode, decode and customize Anoto-based patterns.
Keywords: Positional Encoding · Planar Localization · Structured Patterns
1 Introduction
This work considers a method for localizing devices in the Euclidean plane by observing parts of a structured pattern. In its preferred embodiment, the pattern consists of microdots arranged in a grid, with each dot deliberately offset from its nominal grid position to encode 2 bits of information. Decoding the absolute device location merely requires observing any subset of 6 × 6 dots as depicted in Fig. 1. Assuming a grid point spacing of about 0.3 mm, one could theoretically code the area of Europe and Asia unambiguously. This code was developed by Anoto and patented in the early 2000s. Because the pattern is easy to produce, leaves only a small visual footprint and requires few computing resources to decode, the technology has quickly found its way into many digital applications such as digital paper and interactive books and games. Since its introduction, the Anoto coding has only been studied in a handful of scientific papers, despite the fact that it seems exceptionally useful for many tasks such as indoor positioning and camera calibration. The primary reason for this is, we suspect, that genuine information can only be found in published patent information. As existing patents are about to expire and only a few scientific papers on this versatile localization approach have been published recently, we believe it is worthwhile to revisit the Anoto approach.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 K. Arai (Ed.): SAI 2023, LNNS 739, pp. 219–235, 2023. https://doi.org/10.1007/978-3-031-37963-5_16
This paper is organized as follows: After discussing related work in Sect. 2, we highlight the mathematical concepts underpinning the central codec ideas in Sect. 3. Section 4 then explains the coding principles based on the preferred coding embodiment and subsequently generalizes these principles to arbitrary embodiments. In Sect. 5 we introduce our Python library py-microdots¹ to process Anoto-based codings. Finally, we present our main findings in Sect. 6 and discuss further research directions in Sect. 7.
Fig. 1. Overview. The Anoto Structured Pattern Encodes a Unique 2D Position for every Possible 6 × 6 Sub-Array of Dots (Green and Purple Rectangles with Decoded Locations). Theoretically, at a Grid Resolution of 0.3 mm, this Coding Remains Unique over the Area of Europe and Asia. For Clarity, the Dots are Significantly Scaled up and Nominal Grid Lines are Shown.
2 Related Work
This work is mostly based on genuine details given in the Anoto patents [7–10]. In addition, the works of Aboufadel et al. [1], Hostettler et al. [4], Zhang et al. [12] and the dissertation of Ozgur [11] provide helpful insights into the Anoto codec. To our knowledge, there are two open source software projects dealing with the Anoto codec. The first one is freedot [3] by Flaherty, which consists of a collection of C++ and Python scripts for reverse engineering existing Anoto devices. The second library, called libdots [4], is written in C/C++. It provides routines for encoding and decoding the preferred embodiment of the code. In contrast to our work, it also contains an image processing pipeline tailored to a specific sensor setup. Neither of the aforementioned software packages appears to expose a generic API for handling custom embodiments of the codec. Our contributions include (i) an in-depth explanation of the Anoto code with many details not found in other works (e.g. section coordinates, pattern rotations and encoding details), (ii) a generalization of the preferred embodiment to arbitrary codings and (iii) a modern Python library to handle preferred and custom embodiments. We intentionally leave image processing for dot pattern recognition to future work, in which we intend to explore a general approach that should work well with different code embodiments and across sensor devices.
¹ https://github.com/cheind/py-microdots
3 Preliminaries
This section briefly introduces mathematical concepts related to the Anoto codec principle. The interested reader is referred to classical textbooks on number theory [5,6] for more information. At this point, the following concepts may seem unrelated, but they will be tied together in Sect. 4 to form the Anoto code.
3.1 De Bruijn Sequences
A cyclic de Bruijn sequence of order $n$ over an alphabet of size $k$ is a $k$-ary sequence of length $k^n$ with the following properties: (i) all possible $k$-ary strings of length $n$ are contained and (ii) each string of length $n$ appears exactly once. We denote the set of possible de Bruijn sequences for a particular configuration by $B(k, n)$. Related to this concept is the set of cut-down or quasi de Bruijn sequences, denoted $QB(k, n, m)$, which contains all universal cycles of length $1 \le m \le k^n$ such that any $k$-ary string of length $n$ appears at most once. Examples for each sequence type are given below:
$x \in B(2, 4)$: (0, 0, 0, 1, 1, 1, 1, 0, 1, 1, 0, 0, 1, 0, 1, 0)
$x \in QB(2, 4, 12)$: (0, 0, 0, 1, 1, 1, 0, 1, 1, 0, 0, 1).
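The two defining properties can be checked mechanically. The following minimal sketch is not taken from py-microdots; the function names are hypothetical and chosen only for illustration. It enumerates all cyclic windows of length n and tests the two example sequences given above.

```python
from itertools import product

def cyclic_windows(seq, n):
    """All length-n windows of seq, wrapping around at the end (cyclic)."""
    m = len(seq)
    return [tuple(seq[(i + j) % m] for j in range(n)) for i in range(m)]

def is_de_bruijn(seq, k, n):
    """True if seq has length k**n and contains every k-ary n-string exactly once."""
    windows = cyclic_windows(seq, n)
    all_strings = set(product(range(k), repeat=n))
    return len(seq) == k ** n and set(windows) == all_strings

def is_quasi_de_bruijn(seq, k, n):
    """True if no k-ary n-string occurs more than once among the cyclic windows."""
    windows = cyclic_windows(seq, n)
    return (1 <= len(seq) <= k ** n
            and all(0 <= s < k for s in seq)
            and len(set(windows)) == len(windows))

# The two example sequences from the text.
full = (0, 0, 0, 1, 1, 1, 1, 0, 1, 1, 0, 0, 1, 0, 1, 0)
cut_down = (0, 0, 0, 1, 1, 1, 0, 1, 1, 0, 0, 1)

print(is_de_bruijn(full, 2, 4))            # True
print(is_quasi_de_bruijn(cut_down, 2, 4))  # True
```

The cut-down sequence fails the stricter `is_de_bruijn` test because its length is 12 rather than 16, while every one of its 12 cyclic windows is still unique.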
3.2 Prime Factor Integer Bases
Given an integer $n$ and its factorization into prime factors $(p_1, \ldots, p_m)$, one may express any integer $e \in [0, n)$ using $m$ coefficients $(c_1, \ldots, c_m)$ with $c_i \in [0, p_i)$ by

$$e = \sum_{i=1}^{m} b_i c_i \qquad (1)$$

with bases $b_i = \prod_{j=1}^{i-1} p_j$ and $b_1 = 1$. Using matrix notation, we may write $e = b^T c$, where $b = (b_1 \cdots b_m)^T$ is the basis column vector and $c = (c_1 \cdots c_m)^T$ is a column vector composed of coefficients. It is important to note that any permutation of the sequence $(p_1, \ldots, p_m)$ gives rise to a different sequence of bases and will hence change the coefficients. Table 1 gives an example. Projecting the integer $e$ onto the set of prime factor bases can be accomplished via modular arithmetic and integer division as follows. Let $q = e$ and take the largest basis $b_m$. Then the quotient of the integer division $\lfloor q / b_m \rfloor$ equals $c_m$. Next, set $q = q \bmod b_m$ and repeat with the next largest basis $b_{m-1}$ until all bases have been processed.
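The projection procedure described above can be sketched in a few lines. This is a minimal illustration under the definitions of Eq. (1), not the py-microdots implementation; the function names are hypothetical.

```python
def prime_factor_bases(primes):
    """Bases b_1 = 1 and b_i = p_1 * ... * p_(i-1) for the given factor order."""
    bases = [1]
    for p in primes[:-1]:
        bases.append(bases[-1] * p)
    return bases

def project(e, primes):
    """Coefficients (c_1, ..., c_m) of e, found from the largest basis downwards."""
    bases = prime_factor_bases(primes)
    coeffs = [0] * len(primes)
    q = e
    for i in reversed(range(len(primes))):
        # Integer division yields c_i, the remainder is carried to the next basis.
        coeffs[i], q = divmod(q, bases[i])
    return coeffs

def reconstruct(coeffs, primes):
    """Inverse operation: e = sum of b_i * c_i, as in Eq. (1)."""
    return sum(b * c for b, c in zip(prime_factor_bases(primes), coeffs))

# Coefficients of e = 5 and e = 11 for the three factor orders of n = 12.
for primes in ([2, 2, 3], [2, 3, 2], [3, 2, 2]):
    print(primes, project(5, primes), project(11, primes))
```

For the factor order [2, 2, 3], `project(5, ...)` returns [1, 0, 1] and `project(11, ...)` returns [1, 1, 2], which correspond to the entries 101 and 112 in the first column of Table 1.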
Table 1. The Table Shows the Coefficients of Integers in [0, 12) (in Rows) Factored According to Different Prime Factor Basis Permutations (in Columns).

e     [2,2,3]   [2,3,2]   [3,2,2]
0     000       000       000
1     100       100       100
2     010       010       200
3     110       110       010
4     001       020       110
5     101       120       210
6     011       001       001
7     111       101       101
8     002       011       201
9     102       111       011
10    012       021       111
11    112       121       211
Table 2. Remainder Tuples $(i \bmod n_1, \ldots, i \bmod n_m)$ for $i \in \mathbb{N}$

1, is shown to be represented by the expression given in Eq. (4):

$$Z_k = Z_o + \sum_{i=1}^{m_2} Z_o^{2} + \cdots + \sum_{i=1}^{m_{2^k-1}} Z_o^{2^k-1} + Z_o^{2^k} \qquad (4)$$

where $m_j \in \mathbb{N}$ are integers arising from the execution of Eq. (3). For example, if one would perform three iterations of Eq. (3), one would obtain for $Z_3$ the following equations:

$$Z_1 = Z_o + Z_o \cdot Z_o, \qquad Z_2 = Z_1 + Z_1 \cdot Z_1, \qquad Z_3 = Z_2 + Z_2 \cdot Z_2$$

or, more explicitly,

$$Z_3 = Z_o + 3Z_o^2 + 6Z_o^3 + 9Z_o^4 + 10Z_o^5 + 8Z_o^6 + 4Z_o^7 + Z_o^8$$

Thus, the coefficients are $m_2 = 3$, $m_3 = 6$, $m_4 = 9$, $m_5 = 10$, $m_6 = 8$, $m_7 = 4$, and $m_1 = m_8 = 1$, where the subscript $j$ of the parameter $m$ denotes the power of the matrix $Z_o^j$. The logic OR operation on any binary matrix $A$ leaves the matrix $A$ unchanged, i.e. $A = A + A$, for any natural number $n \in \mathbb{N}$. Hence

$$\sum_{i=1}^{N} Z_o^k = Z_o^k \qquad (5)$$

for any $k, N \ge 2$, and therefore the precise value of the coefficients $m_j$ is irrelevant. Using the identity of Eq. (5), Eq. (4) can be rewritten as follows:

$$Z_k = \sum_{l=1}^{2^k} Z_o^l \qquad (6)$$
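The coefficients quoted for three iterations can be reproduced by treating Z_o as a formal variable, since powers of a single matrix commute. The sketch below uses ordinary integer arithmetic (not Boolean arithmetic) purely to count how often each power occurs; the function names are hypothetical.

```python
def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists (index = power)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def expansion_coefficients(k):
    """Coefficients m_j of Z_o^j after k applications of Z <- Z + Z*Z."""
    z = [0, 1]  # the polynomial consisting of Z_o alone
    for _ in range(k):
        squared = poly_mul(z, z)
        padded = z + [0] * (len(squared) - len(z))
        z = [p + s for p, s in zip(padded, squared)]
    return z

coeffs = expansion_coefficients(3)
print(coeffs[1:])  # [1, 3, 6, 9, 10, 8, 4, 1], i.e. m_1 ... m_8
```

Every power from 1 to 2^k appears with a positive coefficient, which is why, under Boolean addition, only the presence of each power matters and Eq. (6) follows.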
An Empirical Criterion for Transitive Closure of a Searchability Matrix
According to the “squaring” matrix relation given in Eq. (1), the transitive closure solution $Z^*$ has been found for all $k$ larger than or equal to $\lceil \ln_2 n \rceil$. Hence, it suffices to perform the iterative loops up to $k = \lceil \ln_2 n \rceil$. Thus, the matrix $Z^*$ for transitive closure can be calculated by evaluation of Eq. (7):

$$Z^* = \sum_{l=1}^{2^{\lceil \ln_2 n \rceil}} Z_o^l \qquad (7)$$
In cases where the equation $n = 2^k$ holds for $n \in \mathbb{N}$, the rounding is substituted by the argument itself, that is, $\lceil k \rceil = k$. It is worth noting that the factor $\ln_2 n$ in the runtime has also been discovered in comparable investigations [19], albeit in a slightly different context. Hence, the proposed recursive relation (3) is tantamount to the squaring method shown in Eq. (1). Relation (3), however, avoids the direct calculation of all power matrices $A^k$ for $k = 1, \ldots, n$ separately, and hence is computationally more efficient. The equivalency between the recursive matrix relation in Eq. (3) and the squaring theorem in Eq. (1) establishes a new path to the transitive closure. Although the algorithm is faster than the squaring algorithm in Eq. (1), it is slower than most of the other techniques [20, 21] and hence does not appear to be very enticing at first glance. However, the recursive approach of Eq. (3) may be used to construct a simple criterion for determining whether a given adjacency matrix is transitively closed or not. The operation can be used to determine whether a given matrix $A$ fulfills Eq. (8a):

$$A = A + A \cdot A \qquad (8a)$$

If Eq. (8a) is fulfilled, then $A$ is already the final solution, i.e. $A = Z^*$. In case $A$ is not equal to $Z^*$ (the transitive closure solution), the inequality

$$A \neq A + A \cdot A \qquad (8b)$$
would hold. Even if $A \neq Z^*$, there are chances that the matrix $A$ could be on the verge of reaching the transitive closure after a couple of iterations. This means that the iteration of Eq. (3) could be repeated until the relation in Eq. (8a) is obtained. This could be very useful in the case of sparse matrices, or in situations when an educated guess is warranted or there is otherwise sufficient information indicating that the final solution is already established or that the given matrix is very close to the final solution. Hence, depending on how close the matrix is to the final solution $Z^*$, the relation (3) may effectively reduce the computing cost to at most $O(n^2 \ln_2 n)$ and possibly to much less, down to $O(n^2)$ operations, in many actual applications where a high degree of reachability of a graph $H$ is already given. In such cases, only a few iterations of the relation (3) are required, and consequently a lot of computational time is saved.
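The iteration of Eq. (3) with Eq. (8a) as a stopping test can be sketched as follows. This is a plain, unoptimized illustration with hypothetical function names; matrices are lists of 0/1 rows, and the Boolean product is computed naively, so the sketch makes no attempt to reach the operation counts discussed above. The four-node chain used as input is an assumed example, not the graph of Fig. 1.

```python
def bool_product(a, b):
    """Boolean matrix product: entry (i, j) is 1 if some k has a[i][k] and b[k][j]."""
    n = len(a)
    return [[int(any(a[i][k] and b[k][j] for k in range(n)))
             for j in range(n)] for i in range(n)]

def bool_sum(a, b):
    """Element-wise logical OR of two 0/1 matrices."""
    return [[x | y for x, y in zip(row_a, row_b)]
            for row_a, row_b in zip(a, b)]

def recursion_step(z):
    """One application of Z <- Z + Z*Z, as in Eq. (3)."""
    return bool_sum(z, bool_product(z, z))

def closure_by_recursion(a):
    """Iterate Eq. (3); stop early as soon as Eq. (8a) holds.

    Returns the resulting matrix and the number of steps that changed it.
    """
    n = len(a)
    max_steps = max(1, (n - 1).bit_length())  # ceil(log2(n)) for n >= 1
    z = [row[:] for row in a]
    for steps_done in range(max_steps):
        nxt = recursion_step(z)
        if nxt == z:  # Eq. (8a): nothing new was added
            return z, steps_done
        z = nxt
    return z, max_steps

# Assumed example: a directed chain 0 -> 1 -> 2 -> 3.
chain = [[0, 1, 0, 0],
         [0, 0, 1, 0],
         [0, 0, 0, 1],
         [0, 0, 0, 0]]
closure, steps = closure_by_recursion(chain)
print(steps)
for row in closure:
    print(row)
```

For a matrix that is already transitively closed, the function returns after the first comparison with a step count of zero, which is the situation the criterion is meant to detect.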
M. Orlowski
As a result, a superfluous use of the Warshall $O(n^3)$ brute force approach or of the more advanced $O(n^s)$ methods with $s < 3$ can be entirely avoided. Figure 1 illustrates how the various methods work on the example of a simple matrix $A$ representing a directed graph.
Fig. 1. A Four-Node Graph is Shown with Directed Links along with the Corresponding Adjacency Matrix
In the standard “squaring” closure relation (Eq. (1)), one could not avoid calculating the power matrices $A$ (by default), and $A^2$, $A^3$, $A^4$:

$$A^2 = \begin{pmatrix} 0&0&0&0\\ 0&1&1&0\\ 0&0&0&0\\ 1&1&0&1 \end{pmatrix},\quad A^3 = \begin{pmatrix} 0&0&0&0\\ 1&0&0&1\\ 0&0&0&0\\ 0&1&1&0 \end{pmatrix},\quad A^4 = \begin{pmatrix} 0&0&0&0\\ 0&1&1&0\\ 0&0&0&0\\ 1&0&0&1 \end{pmatrix}$$

If one adds up $A + A^2 + A^3 + A^4$, one obtains the transitive closure $Z^*$:

$$Z^* = A^1 + A^2 + A^3 + A^4 = \begin{pmatrix} 0&0&1&0\\ 1&1&1&1\\ 0&0&0&0\\ 1&1&1&1 \end{pmatrix}$$

Following the Warshall-Roy-Floyd algorithm given in Eq. (2), the consecutive matrices $A_1$, $A_2$, $A_3$, and $A_4$ are given explicitly by

$$\begin{pmatrix} 0&0&1&0\\ 1&0&1&0\\ 0&0&0&0\\ 0&1&0&0 \end{pmatrix},\quad \begin{pmatrix} 0&0&1&0\\ 1&0&1&1\\ 0&0&0&0\\ 1&1&1&1 \end{pmatrix},\quad \begin{pmatrix} 0&0&1&0\\ 1&0&1&1\\ 0&0&0&0\\ 1&1&1&1 \end{pmatrix},\quad \begin{pmatrix} 0&0&1&0\\ 1&1&1&1\\ 0&0&0&0\\ 1&1&1&1 \end{pmatrix}$$

respectively. The last matrix represents the transitive closure, i.e. $A_4 = Z^*$. In case of the recursive matrix relation given in Eq. (3), one needs to calculate only two matrices, $Z_1$ and $Z_2$:

$$Z_1 = \begin{pmatrix} 0&0&1&0\\ 1&1&1&1\\ 0&0&0&0\\ 1&0&0&1 \end{pmatrix},\quad Z_2 = \begin{pmatrix} 0&0&1&0\\ 1&1&1&1\\ 0&0&0&0\\ 1&1&1&1 \end{pmatrix}$$
It is seen that the matrix $Z_2$ represents the final solution, i.e. $Z_2 = Z^*$. Let us consider another example of an adjacency matrix representing a directed graph $H$ of five nodes, shown below, with the task of finding the transitive closure for this graph:

$$A = \begin{pmatrix} 1&1&0&0&0\\ 1&1&0&0&0\\ 0&0&1&1&1\\ 0&0&1&1&1\\ 0&0&1&1&1 \end{pmatrix}$$

When using the “squaring relation” (1), all the required power matrices $A$, $A^2$, $A^3$, $A^4$, and $A^5$ have to be calculated and added up in order to obtain the final solution. When employing the Warshall-Roy-Floyd procedure or one of the cutting-edge algorithms to compute the transitive closure $Z^*$, one would have to apply the full algorithm to be confident that the transitive closure for the underlying network has been properly identified. However, we can infer from visual inspection (which is simple for this particular matrix) that matrix $A$ already reflects the transitive closure. The comparatively simple criterion presented in Eq. (8a) can be used to confirm this educated guess. Indeed, one finds that the identity $A = A + A \cdot A$ holds for this matrix, obviating the need for a full-fledged calculation using Warshall-Roy-Floyd or any other direct or indirect algorithm. The advantage of using the criterion (8) is that it can be established with an $O(n^2)$ operation that the transitive closure is already reached, instead of using the most advanced transitive closure algorithms, which have run times as low as $\ln(n^{2.49})$ [22].
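The check of Eq. (8a) on the five-node matrix can be written out directly. The sketch below is a straightforward entry-by-entry test with a hypothetical function name; it stops at the first violation and is meant only to illustrate the criterion, not to match the operation count quoted above. The second matrix is an assumed variant with one link removed, added for contrast.

```python
def is_transitively_closed(a):
    """True if A = A + A*A holds under Boolean arithmetic (Eq. (8a)).

    The identity fails exactly when a two-step path i -> k -> j exists
    although the direct entry a[i][j] is 0.
    """
    n = len(a)
    for i in range(n):
        for j in range(n):
            if not a[i][j] and any(a[i][k] and a[k][j] for k in range(n)):
                return False
    return True

# Five-node adjacency matrix from the text.
A = [[1, 1, 0, 0, 0],
     [1, 1, 0, 0, 0],
     [0, 0, 1, 1, 1],
     [0, 0, 1, 1, 1],
     [0, 0, 1, 1, 1]]

# Assumed variant: the entry in the third row, fifth column is removed.
B = [row[:] for row in A]
B[2][4] = 0

print(is_transitively_closed(A))  # True
print(is_transitively_closed(B))  # False
```

In the variant, the third node still reaches the fifth via the fourth, so the product adds an entry that is missing from the matrix and the inequality of Eq. (8b) holds.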
Fig. 2. A Flowchart of how the Proposed Method can be used to Determine the Transitive Closure in Conjunction with Extant Algorithms
In Fig. 2 a flowchart is provided of how the proposed method could be used in concert with the extant algorithms. The maximum number of iterations m = 0.5ln(n) is dictated by the circumstance that the numerical burden of using criterion in Eq. (8) should be considerably smaller than the numerical burden of using existing algorithms
to determine the transitive closure. This restriction indicates that the proposed relation (3) resulting in the two criteria given in Eq. (8a) and in Eq. (8b), can upfront decide whether an application of heavy duty algorithms is needed.
4 Summary
A novel recursive matrix relation has been proposed to find the transitive closure for a given directed graph. Although this relation is computationally inferior to the most advanced algorithms when the transitive closure has to be computed from scratch, it has the distinct advantage of yielding a simple criterion that enables the user to determine whether a given reachability matrix already represents the transitive closure or is within a few iterations of reaching it. In the latter case, a few iterations may be sufficient to establish the transitive closure with a smaller computational burden than the most efficient algorithms for transitive closure. The criterion of being close to the final solution may be established by monitoring, over a few initial iterations, the change in the number of matrix entries equal to “1”. If the number of additional “1” entries from iteration to iteration is small, and if that number decreases quickly from iteration to iteration, then in many cases this indicates that the transitive closure can be reached with relatively few additional iterations of Eq. (3). If a given matrix is far from the transitive closure, then the present criterion would indicate that a full-fledged algorithm must be applied to find the transitive closure and, in that case, nothing has been gained from the present method.
References
1. Warshall, S.: A theorem on boolean matrices. J. Assoc. Comput. Mach. 9, 11–12 (1962)
2. Roy, B.: Transitivité et connexité. C.R. Acad. Sci. Paris Sér. A-B 249, 216–218 (1959)
3. Floyd, W.R.: Algorithm 97: shortest path. Comm. ACM 5, 344–345 (1962)
4. Dijkstra, E.W.: A note on two problems in connexion with graphs. Num. Math. 1, 269–271 (1959)
5. Schnorr, C.P.: An algorithm for transitive closure with linear expected time. SIAM J. Computing 7(2), 127–133 (1978)
6. Poor, H.: A Hypertext History of Multiuser Dimensions. MUD History (1986). http://www.ccs.neu.edu/home/pb/mud-history.html
7. Neadeau, T., Gray, K.: Software Defined Networking. O’Reilly, USA (2013)
8. Comen, T., Leiserson, T.C., Rivest, S., Stein, K.E.: Introduction to Algorithms. MIT Press, USA (2009)
9. Chazelle, B.: Cuttings. In: Handbook of Data Structures and Applications, CRC Press, Boca Raton, pp. 25.1–25.10 (2005)
10. Aho, A., Hopcroft, E., Ullman, J.D.: The Design and Analysis of Computer Algorithms. Addison-Wesley (2008)
11. Aho, A., Sethi, R., Ulmann, J.D.: Compilers, Principles, Techniques. Addison Wesley 7(8), 9 (1986)
12. Lin, C., He, D., Huang, X., Khan, M.K., Choo, K.R.: A new transitively closed undirected graph authentication scheme for block-based identity management system. IEEE Access 2837650, 28203 (2018)
13. Abdellah, A.S., Saif, S., ElDeeb, H.E., Abd-Elrahman, E., Taher, M.: A secured blockchain-based information-centric network. J. Comput. Sci. 18(4), 266–280 (2022). https://doi.org/10.3844/jcssp.2022.266.280
14. Skiena, S.: Sorting and Searching. The Algorithm Design Manual, Springer. p. 480 (2008)
15. Shimon, E.: Graph Algorithms. (2nd ed.), Cambridge University Press, pp. 46–48 (2011)
16. Rosen, K.H.: Discrete Mathematics and its Applications, 7th edn. McGraw Hill, New York (2012)
17. Chan, T.M.: All-pairs shortest paths with real weights in O(n^3 / log n) time. Algorithmica 50, 236–243 (2008)
18. Galil, Z., Margalit, O.: All pairs shortest paths for graphs with small integer length edges. J. Comput. Syst. Sci. 54, 243–254 (1997)
19. Benedikt, M., Senellart, P.: Databases. In: Blum, Edward K.; Aho, V. (eds.), Computer Science. The Hardware, Software, pp. 169–229 (2011)
20. Batsamut, V., Manzura, S., Kosiak, O., Garmash, V., Kukharets, D.: Fast algorithm for calculating transitive closures of binary relations in the structure of a network object. Int. J. Comput. 20(4), 560–566 (2021). https://doi.org/10.47839/ijc.20.4.2444
21. Green, O., Du, Z., Patel, S., Xie, Z., Liu, H., Bader, D.A.: Anti-section transitive closure. In: 2021 IEEE 28th International Conference on High Performance Computing, Data, and Analytics (HiPC), pp. 192–201 (2021). https://doi.org/10.1109/HiPC53243.2021.00033
22. Sridhar, S.: Design and Analysis of Algorithms. Oxford University Press (2014)
Detection of Computational Propaganda on Social Networks: A Survey
Bodor Moheel Almotairy, Manal Abdullah, and Dimah Alahmadi
Department of Information Systems, King Abdulaziz University, Jeddah, Saudi Arabia
[email protected]
Abstract. Propaganda campaigns try to influence people’s minds to achieve a certain goal. The rise of social networks has provided propagandists with unparalleled opportunities to reach millions of social network users and create a large-scale illusion. This phenomenon is called “Computational Propaganda.” It involves the use of computational enhancement tools to create a sense of widespread agreement. Computational propaganda has impacted a variety of domains in diverse ways. Unfortunately, propagandists rely on advanced artificial intelligence methods to change their strategies with every new campaign, which allows them to evade detection methods easily. This research draws on a systematic literature review of recently published material to investigate and compare the state-of-the-art computational propaganda detectors. The findings of the research indicate that detection efforts should be combined to detect computational propaganda in real time. Moreover, the cooperative behaviors between malicious human users and malicious social bots have not been highlighted enough, although they are among the most important practices of computational propaganda. This study is important because it evaluates the current detection methods, which would enlighten future research as it identifies research gaps in this field.
Keywords: Computational Propaganda · Astroturfing · Misinformation · Opinion Manipulation · Social Media
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 K. Arai (Ed.): SAI 2023, LNNS 739, pp. 244–263, 2023. https://doi.org/10.1007/978-3-031-37963-5_18
1 Introduction
Today, thanks to the Web, it is very easy for anyone to set up a news medium from either a blog or a website. News media play a critical role in promoting freedom of expression since they increase the scope and platform for people to express their views and ideas without restrictions. A large percentage of small news sources provide untrustworthy information. On the other hand, social networks have made it possible for people to express their views to a large audience. While news media platforms developed from blogs and websites have helped societies in terms of self-expression, the platforms have also been associated with negative consequences that have affected societies in different ways. The major consequence of these platforms is the potential manipulation of information. Propaganda is the major technique through which misinformation and disinformation are spread. However, the
public has largely come to assume that propaganda is used to convey false information to society. Propaganda is the use of psychological and certain rhetorical approaches to appeal to the emotions of the audience [1]. State actors were the major campaigners of propaganda in the past. In the era of social networks, online “Computational Propaganda” has impacted a variety of domains in diverse ways [2]. Computational propaganda, which uses technical means to disseminate information and create propaganda, has made it easier for individuals and groups to spread propaganda [3]. Computational propaganda spreads messages at a greater scale by relying primarily on coordinated approaches. To achieve greater coordination, current computational propaganda leverages groups of fully automated accounts called botnets [2]. Perpetrators of computational propaganda use logical fallacies to appeal to the emotions of the audience and make them believe in the reliability and the authenticity of the information being spread to them. The perpetrators also select facts that appeal to the emotions of the audience, which evades the notion of “lies” in the propaganda. Computational propaganda players (malicious users) appear under different masks such as seminar users [4], astroturfers [5], internet water army [4], sockpuppets [6], cyborgs [7], and troll armies [8]. One of the major cases that researchers found concerned the defeat of one of the candidates in the Massachusetts special election in 2009. The findings revealed that nine fake accounts were used to generate tweets that directed the target audiences to a specific website that contained defaming information about the affected candidate [9]. Large corporations have also found themselves involved, which adds reported cases that had no political motivations. Microsoft is one of the large corporations that found itself involved in an astroturf campaign [10].
The idea of computational propaganda is also extending into healthcare. Automated bots have been used in healthcare during the pandemic to run anti-vaccine campaigns [11]. As a result, the disinformation appeals to the target audience, primarily the public, leading them to place confidence in simplistic solutions at the expense of scientific research that is based on empirical evidence. Social network usage is currently the most popular online activity according to Statista¹, which means that unethical internet usage has limitless implications. An estimated 3.6 billion people used the internet in 2020, and the number is anticipated to rise to over 4.4 billion users by 2025. A 2022 study by Amy Watson revealed that most adults largely depend on social media as their daily source of news and information [12]. This implies that the platforms play a significant role in the daily lives of people around the world, especially for younger generations. There are strong indications that consumption of social media news and information will go up significantly in the coming years, regardless of whether the audience trusts the reliability of information from the chosen network. Besides, 2020 studies have shown that about 81 countries have fallen victim to social media manipulation of online public opinion [7]. The large increase in the number of users of social media, as well as the number of countries that have been exposed to such malicious practices, embodies the magnitude of the danger.
¹ https://www.statista.com
B. M. Almotairy et al.
Understanding and detecting computational propaganda has attracted research communities. Many methods, frameworks, and data analyses have been proposed. However, the field of computational propaganda detection is still in its infancy, and the severity of the problem calls for special consideration from societies and research communities if a solution is to be attained. This research is motivated by the scarcity of approaches to detect computational propaganda. It is a content analysis that aims to systematize the studies carried out on comprehending and detecting computational propaganda. To ensure replicability and relevance, this paper adopts a systematic approach. A systematic review reflects a satisfactory level of trustworthiness of the existing body of literature [13]. Its purpose is to discover, summarize, and analyze any relevant literature in a transparent and replicable way [13]. The remaining parts of this paper are organized as follows: Sect. 2 defines computational propaganda. Section 3 discusses in detail the computational propaganda detection methods in previous studies. Section 4 compares the features that were used in computational propaganda detection. Section 5 discusses the role of bots in spreading computational propaganda. Section 6 shows concrete and promising research possibilities in the field of computational propaganda detection.
2 Computational Propaganda

“Propaganda” is a term coined in the 17th century from the word “propagation”, which referred to spreading the ideas and thoughts of the Catholic faith in the New World [1]. Propaganda later took on a different meaning, referring to opposition to the Protestant faith. In 1938, the Institute for Propaganda Analysis (IPA) came up with a new definition of propaganda. The IPA defined propaganda as “the expression of actions, views, or opinions of other people or groups with the intentions of influencing their ideas, activities, or thoughts in a manner to achieve desired ends, for example, considering the opinions or actions as unacceptable or immoral” [14]. Reviewing the history of propaganda, Viorel Țuțui [15] noted that every new public communication medium comes along with a new wave of propaganda. As such, forms of public communication such as writing; religious, literary, and other artistic works; music and storytelling; buildings and giant statues for ancient kings; and the invention of radio, television, and the printing press have all been used for propaganda. In 2017, Woolley & Tsyrenzhapova [16] defined computational propaganda as all propaganda created and shared using technical or computer-aided means. This involves using algorithms, big data analytics, automation, and human curation to disseminate false or misleading information through social media networks while influencing public opinion [16, 17]. The main concern with deceptive social media campaigns is their coordinated effort to spread messages at scale [17]. Computational propaganda is sometimes created and executed through algorithmic bots and data mining, which are built and controlled using contemporary technologies such as machine learning and artificial intelligence [18].
For example, social media platforms such as Facebook, Twitter, and Google apply different algorithms to forecast what the readers or users want to see, promote engagement, and increase their revenues.
Detection of Computational Propaganda on Social Networks
This technique filters the user’s history of clicks, likes, and shares through algorithms to provide the content that the user prioritizes [18]. As users engage more with the content, more efficient and effective classifiers are developed that define each user’s preferences and emotional reactions and affirm the existing biases surfaced by the algorithms [18]. From these predictions, user groups can be isolated into different echo chambers: social spaces that reinforce the beliefs of like-minded people and lead to greater polarization [18]. As a result of these classifiers and predictions, misleading information and false news are spread and shared on the internet [16]. This makes it difficult to detect malicious intentions in mass media, where machine-driven communications (MADCOMs) are integrated with machine learning and artificial intelligence to produce video, audio, and text content tailored to users’ backgrounds and personalities [19]. MADCOMs can easily use chatbots that rely on natural language to help users conduct online meetings or discussions, and to threaten or troll other people. With the evolution of deep learning, it is now easier to manipulate sound, video, and images for impersonation or deepfakes. Sound, images, and video are used to make it look like a particular person said or did something they did not. As a result, differentiating between real and fake audio or video content is difficult [19].
3 Computational Propaganda Detection Taxonomy

Reviewing previous literature on the strategies and automatic detection of computational propaganda allows researchers to identify the categories related to the subject and their impacts. Such studies help reveal more facts about the subject and its long-term consequences in various societies, and they make researchers aware of the existing research gaps. This research follows a synthesis and interpretation approach, which can help in the effective identification and analysis of computational propaganda and its overall effects in diverse settings. The systematic literature review settled on 30 papers selected with criteria formulated by the researchers. The results guided the researchers in developing a taxonomy that structures the outcomes into categories. The researchers identified the different technical approaches to computational propaganda detection and their overall significance. The taxonomy has two major categories based on the detector’s aim: detecting propagandist content and detecting propagandist malicious accounts. Figure 1 shows the major categories this research proposes, and the discussion focuses on the literature relevant to each of them. Further classification into sub-categories helps the study achieve its objectives. The main criteria for defining the taxonomy are the critical issues covered in online computational propaganda detection and how the literature tackles them. The techniques in the taxonomy are applicable beyond computational propaganda detection, making it easier for experts to assess other elements affecting computational propaganda. Thus, the research work can be categorized into various groups, and the focus is on the groups that best describe the study’s scope.
3.1 Detecting Propagandist Posts

According to the survey, the text analysis perspective and the social network analysis perspective were the main approaches used in detecting computational propagandist posts. The following subsections review the two perspectives in turn.

3.1.1 Text Analysis Perspective

The history of this perspective is short because experts lacked suitable annotated datasets for training supervised models. This subsection discusses propaganda detection using text analysis and how the approach allows experts to distinguish between the elements that affect the reliability of content. The literature shows that authors have relied on writing style and complexity features, mixed code, stylometry, and semantic features. Table 1 lists the text features used in the studies.

Writing Style and Complexity

Some early propaganda identification approaches were shaped by the corpora that had been produced. Rashkin et al. [20] focused on distinguishing real news from other types of information such as satire, hoaxes, and propaganda. To do so, they compiled a large collection of documents from English real news sources and from unreliable news sources for the three other categories. They relied on an n-gram representation with n ∈ [1, 3] and trained three main models.
The Maximum Entropy classifier (MaxEnt), Long Short-Term Memory networks (LSTM), and Naïve Bayes were the three models the researchers relied on. They found that the LSTM was more effective than the other models when text was the only input. The remaining two models, in contrast, registered significant improvements when Linguistic Inquiry and Word Count (LIWC, https://www.liwc.app/) features were added. However, using an n-gram representation with a Maximum Entropy classifier can substantially decrease performance when testing on news articles from unseen sources compared to seen sources. Barrón-Cedeño et al. [21] assessed the hypothesis that this drop is due to word n-grams being dependent on specific topics. The hypothesis is that such a model learns the news source rather than the concept of propaganda. To address the limitation, Barrón-Cedeño et al. [21] changed the data distribution so that the test data originated from news sources unseen during training. Consequently, the approach penalized models that learn to predict the styles of specific news sources seen during training instead of solving the actual task. The approach also enabled the researchers
to be aware of features that can overcome such limitations. The researchers proposed a binary classification setting that distinguishes propaganda from non-propaganda, relying on the TSHP [20] and QProp [21] corpora. They conducted several experiments investigating diverse representations, such as writing style, readability, and keywords, using the MaxEnt classifier. They found that distant supervision and rich representations enable the model to predict the source without distinguishing propaganda from non-propaganda. Consequently, models relying on writing style and readability representations are more likely to outperform those relying on word-level representations. Therefore, style is more crucial than topic in propaganda detection.
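The n-gram representations discussed above can be illustrated with a minimal sketch. It is not the feature pipeline of Rashkin et al. [20] or Barrón-Cedeño et al. [21]; the tokenizer, the function names, and the choice of character trigrams are assumptions made only for illustration. Word n-grams with n ∈ [1, 3] tend to capture topic vocabulary, while character n-grams lean more towards style.

```python
from collections import Counter
import re


def word_ngrams(text, n_min=1, n_max=3):
    """Count word n-grams for every n in [n_min, n_max] (hypothetical helper)."""
    # Simplistic tokenizer (an assumption): lowercase runs of letters, digits, apostrophes.
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    counts = Counter()
    for n in range(n_min, n_max + 1):
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts


def char_ngrams(text, n=3):
    """Count overlapping character n-grams after collapsing whitespace."""
    normalized = re.sub(r"\s+", " ", text.lower()).strip()
    return Counter(normalized[i:i + n] for i in range(len(normalized) - n + 1))


# Example: the resulting counts could feed any bag-of-features classifier.
features = word_ngrams("They lie. They always lie!")
style = char_ngrams("They lie. They always lie!")
```

In this sketch sentence boundaries are ignored, so n-grams can span two sentences; a real pipeline would likely handle that differently.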
[Figure 1: taxonomy tree. The two top-level branches are detecting propagandist posts (through text analysis and social network analysis) and detecting malicious accounts; each branch is subdivided by analysis type and by supervised, semi-supervised, or unsupervised approach.]
Fig. 1. Computational Propaganda Detection Taxonomy
Earlier research did not address the importance of explaining credibility assessments. Popat et al. [22] assess the credibility of emerging and long-tail claims with a poor web presence. Alternatively, Da San Martino et al. [23] proposed a novel task that relies on a deep neural network. They performed a fine-grained analysis of texts by detecting the article fragments that contain propaganda techniques. Their approach differed from previous studies [20, 21], because they considered the linguistic style of the claim and language features (F_L) [24]. F_L features included assertive, report, and factive verbs, hedges, and subjective phrases. Stance, reliability, and trend were the main reliability factors, and distant supervision with a conditional random field (CRF) was used to model the interactions between them. Da San Martino et al. [25] proposed a shared task comprising two subtasks: Span
Identification and Technique Classification. Vorakitphan et al. [26] proposed a supervised approach to classify textual snippets both as propaganda messages and according to the propaganda technique applied. Writing style and complexity also help to distinguish real news from fake news and satire [27]. Misinformation and propaganda became common during the COVID-19 pandemic. Khanday et al. [11] analyzed Twitter data, and the extracted tweets were grouped into propaganda and non-propaganda classes. Combining three textual features allowed the researchers to classify the tweets into the two groups. Khanday et al. [28] found that support vector machines achieved the best results among the traditional machine learning algorithms on news tweets.

Semantic Features

Semantic information is crucial in detecting propaganda in a published text. Li et al. [29] addressed the sentence-level propaganda detection task by developing an automatic system with different features. The features included term frequency-inverse document frequency (TF-IDF) [30], text length, emotion, readability level, Linguistic Inquiry and Word Count (LIWC), emphatic content, and Bidirectional Encoder Representations from Transformers (BERT) [31]. The results revealed that sentence length and complexity improve performance more than the terms and expressions used to capture propaganda.

Stylometry

Stylometry is helpful in detecting hyper-partisanship in news reports that promote extreme left or extreme right opinions. Potthast [32] reviewed articles from nine sources to distinguish between real news, fake news, and satire. The study applied a stylometric analysis originally designed for authorship verification [33], which made it possible to detect factuality and bias by assessing the writing style and the conveyed message. The researchers also examined word and paragraph lengths, and they discovered that the writing styles of left-leaning and right-leaning publications were similar.
However, their representations did not have the necessary qualities to support fake news classification.

Embedding Word

The problem of computational propaganda detection requires models that generalize, which can also aid the detection of fake news. Nasir et al. [34] proposed a novel hybrid deep learning model that combines convolutional and recurrent neural networks to classify fake news. Validated on two fake news datasets, ISOT [34] and FA-KES [35], the model achieved better detection results than non-hybrid baseline methods. Further experiments on the generalization of the approach to other datasets yielded promising outcomes.
3.1.2 Social Network Analysis

Identifying the intent of a propaganda campaign requires further analysis of the profiles of the social media users who spread propaganda in a network. This analysis allows experts to detect coordinated inauthentic behavior on a platform.

Early Approaches

Early detection of malicious coordination involved classifying individual accounts and posts in a network as legitimate or dubious. With supervised machine learning approaches, each account or post is analyzed separately and receives a label from the detection system. The assumption is that each dubious account has features that distinguish it from legitimate accounts, and social bot detection systems such as Botometer [36] allow professionals to identify malicious accounts. Botometer reviews more than 1200 features of a social media account, covering the network structure, content, and temporal features. From the information diffusion perspective, Agarwal et al. [37] examined the complexity of the bot networks used in the 2014 Crimean Water Crisis and the 2015 Dragoon Ride Exercise. The botnet in the 2014 Crimean Water Crisis was easy to capture, while the 2015 Dragoon Ride Exercise involved a more complex bot network. The authors discovered that the bots seeding information in the Crimean case were easily identifiable, whereas the Dragoon Ride bots were more sophisticated. In the Dragoon Ride case, a smaller number of coordinated bots seeded the information, and the propaganda was disseminated through sophisticated networks, so more effective network analysis approaches were required to detect it. There is little evidence on the processes that polarize digital networks through the large-scale spread of propaganda and unverified news that can go viral quickly. Guarino et al.
[38] showed the significance of characterizing propaganda networks on Twitter by combining three approaches. First, they suggested using the content of tweets to determine the polarization of users with respect to the central theme. Second, they separated users with different viewpoints who were engaged in spreading propaganda and disinformation on a particular theme. Lastly, they proposed analyzing the social ties within the clusters and identifying the most active users in the network. The authors found that propaganda network clusters were more polarized than other clusters, reflecting users’ ideological leanings. Authority and hub centrality metrics were crucial in identifying propagandist posts on a network and their specific roles. Moreover, the polarization and clustering structures in the retweet data provided vital insights into users’ knowledge, socialization, and overall engagement with specific misinformation topics.

Recent Approaches

In recently proposed detectors, researchers adopted unsupervised approaches for spotting anomalous patterns in the temporal tweeting and retweeting behaviors of groups of accounts. Mazza et al. [39] computed the distance (or similarity) between account activity time series as the Euclidean distance between the feature vectors produced by an LSTM autoencoder, and then grouped the accounts with a hierarchical density-based clustering algorithm. This method seems promising since researchers argue that coordination must be considered a key feature
to model the detectors. The primary justification for this approach is that malicious accounts coordinate their posting to amplify their effect in a network. Similarly, analyzing many posts provides more data to fuel powerful AI algorithms [40]. For instance, such systems implement network-based techniques to detect suspicious connectivity patterns among posts [41]. The authors applied two filters to the data to highlight different suspicious behaviors. The first filter identifies accounts that consistently retweet the same source within the first 10 s. This filter’s most prominent feature is that it identifies groups of promoters (who systematically retweet) and promoted accounts. However, the authors noted that coordination cannot be proven from retweet timing alone, contrary to previously held views [39]. The second filter identifies groups of coordinated posts in which similar content is deliberately posted within the first 10 s.

3.2 Detecting Malicious Accounts

Previous studies discussed the methods used to detect malicious accounts from four perspectives: authorship attribution, statistical analysis, mixing semantic and non-semantic features, and individual behavior analysis. The following subsections discuss the methods in detail.

3.2.1 Authorship Attribution Perspective

Authorship attribution is the process of assigning authorship responsibility for a piece of text. It remains one of the most widely used approaches in research and has primarily been employed to detect spam [42] and plagiarism [43]. It is mainly implemented with machine learning algorithms, and it stands out from other applications of machine learning because it is specifically designed to identify an author whose text is available online [43].
Authorship attribution has also been employed in the detection of computational propaganda, as done by Peng et al. [44]. Their research was based on a methodology proposed in their previous study [45]. They analyzed ‘below the line’ (BtL) comments made by the public on Australian news media outlets, including The Guardian, ABC Drum, and the Australian Independent Media Network. The comments were aggregated by individual profile, and for each profile the authors performed a bit-level n-gram analysis of its comments. Two pieces of text were then compared by applying the n-gram analysis and determining their distance. KNN classification was then performed to attribute a piece of text to one of the authors within a group. The research revealed a case of possible astroturfing affecting two news media websites. The absence of metadata analysis is the main limitation of the study. Moreover, the research does not address astroturfing that is performed by crowds within short periods.
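The profile, distance, and KNN steps described above can be sketched as follows. This is only an illustration under stated assumptions, not the method of Peng et al. [44, 45]: it uses byte-level rather than bit-level n-grams, cosine distance as a stand-in for whatever distance the authors used, and hypothetical function names and sample texts.

```python
from collections import Counter
import math


def byte_ngram_profile(text, n=3):
    """Relative frequencies of byte n-grams (a simplification of bit-level n-grams)."""
    data = text.encode("utf-8")
    counts = Counter(data[i:i + n] for i in range(len(data) - n + 1))
    total = sum(counts.values()) or 1
    return {gram: c / total for gram, c in counts.items()}


def cosine_distance(p, q):
    """1 - cosine similarity between two sparse profiles (assumed distance)."""
    dot = sum(v * q.get(g, 0.0) for g, v in p.items())
    norm_p = math.sqrt(sum(v * v for v in p.values()))
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    if norm_p == 0 or norm_q == 0:
        return 1.0
    return 1.0 - dot / (norm_p * norm_q)


def knn_attribute(unknown_text, labelled_texts, k=1, n=3):
    """Attribute a text to the majority author among its k nearest profiles.

    labelled_texts is a list of (author, text) pairs, e.g. the aggregated
    comments of each profile.
    """
    target = byte_ngram_profile(unknown_text, n)
    ranked = sorted(
        (cosine_distance(target, byte_ngram_profile(text, n)), author)
        for author, text in labelled_texts
    )
    votes = Counter(author for _, author in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical usage with made-up comments:
known = [
    ("alice", "the carbon tax is a disaster for working families"),
    ("bob", "lovely weather in melbourne today, off to the footy"),
]
guess = knn_attribute("this carbon tax is a total disaster for families", known)
```

Two profiles that are attributed to the same author despite nominally belonging to different users would be a signal worth inspecting for astroturfing; the sketch itself makes no such judgment.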
3.2.2 Statistical Analysis Perspective

Statistical methods are mathematical tools, including formulas, techniques, and models [46], that enable researchers to collect, organize, analyze, and interpret raw data [46]. Neyazi [47] employed statistical methods to distinguish propaganda created by humans from propaganda created by bots on Twitter. Neyazi used the methods proposed by Howard & Kollanyi [48] in their research on the Brexit campaigns (see footnote 3), which are based on the frequency of tweets. For example, handles that created more than 50 tweets per day are defined as heavily automated, but these handles are not automatically considered to be bots. By analyzing and understanding malicious accounts, social media companies can develop sophisticated automated strategies to stop them before they reach their targets, alleviating their negative effects on consumers. For example, Alvari & Shakarian [49] employed the well-known Hawkes process, a statistical technique, to quantify how effectively malicious accounts leverage different social media platforms to disseminate malicious information.

3.2.3 Mixing Semantic and Non-semantic Features Perspective

Both semantic and non-semantic content-specific features have long been employed as principal tools in research on detecting improper user behavior on the web. Miller [50] studied the potential benefits of incorporating both semantic and non-semantic features in detecting online astroturfing. To do so, Miller extracted profile data from Weibo, a China-based microblogging website. Semantic features such as TF-IDF weighted unigrams and bigrams [30], the number of government words, and punctuation counts were computed for each of the short descriptions drawn from the profiles.
On the other hand, the non-semantic features employed by Miller include the normal number of posts, the count of followers, and whether the accounts were verified or belonged to companies. The main goal of the research was to detect computational propaganda in China. Using three classifiers, namely a logistic regression classifier, a support vector machine, and an ensemble classifier, more than 50% of malicious accounts were detected. However, semantic features from the posts made by the profile holders were not considered in this study, although they could have further enhanced the classifiers’ accuracy. Boididou et al. [51] proposed a system that leverages credibility-oriented features. These features are drawn from tweets and from the profiles of the users who publish them, and a two-step classification model based on a novel semi-supervised learning scheme is then applied. The scheme uses the agreement of two independent pre-trained models on new posts as a signal to guide the retraining of the classification model. The authors analyzed a dataset of tweets sharing images and videos that had been either debunked as fake or confirmed as real. The analysis confirmed that with the proposed features, alongside bagging
3 https://en.wikipedia.org/wiki/Campaigning_in_the_2016_United_Kingdom_European_Union_membership_referendum
in the initial classifiers and the semi-supervised learning scheme, classification accuracy improved significantly. However, this approach is limited because it treats the dataset as a single batch, even though malicious accounts continually change their strategies over time.

3.2.4 Individual Behavior Analysis Perspective

Because viral cascades are rare, the users behind them are expected to hold suspicious accounts. Shaabani et al. [52] addressed this issue with the aim of finding the malicious users in viral cascades. They proposed an unsupervised framework based on causality that also makes use of label propagation. With this method, users can be identified without relying on network structure, user information, cascade path information, or content. The strategy depends only on the activities of the users and their timespan, so the element of time is critical to detecting such users successfully. Existing techniques spend more time collecting information on the cascade path, network, or content. This gap can be overcome by applying time-decay extensions of the causal metrics, as presented by Alvari, Shaabani, and Shakarian [53]. With this approach, causal inference is applied to identify the activities of malicious accounts within the shortest time possible. The time-decay causality metric was integrated into a causal community detection-based algorithm. The algorithm is first applied to find groups of accounts that share causality features, and a classification algorithm follows. With this integrated approach, accounts can be classified as malicious or not. The scheme exclusively relies on the users’ action log.
The efficiency and effectiveness of their strategy are reflected in the results obtained on a real-world Twitter dataset. In 2019, Shaabani et al. [54] adopted the causal inference framework proposed by Alvari et al. [53] and incorporated graph-based metrics to distinguish malicious accounts from normal users within the shortest time of their activity. Both supervised and semi-supervised strategies are proposed, without considering content or network information. The advantage of the proposed framework is demonstrated on real-world Twitter datasets. Darwish et al. [4] introduced the concept of “seminar users” in the Arabic context. These are users who frequently engage in computational propaganda, and their presence is socially troubling because it risks distorting social media analysis. The authors therefore adopted a method that automatically detects such users by monitoring how they interact, the diversity of their tweet content, and their inclination towards offensive and sentiment-bearing words. User similarity features were not employed because of the high computational cost of mining the graph of user interactions. Further, a contentious political topic was explored to observe the potency and prevalence of such users. Nerino [55] developed a strategy based on machine learning (ML) bot detection algorithms with the aim of identifying propagandist bots. Nerino showed that the features identified by researchers as effective in distinguishing propagandist human accounts from propagandist bots are the critical determinants of the performance
of supervised machine learning algorithms. These features are used directly in ML training and not just for creating the dataset of Twitter accounts intended as training sets. The success of the detection process therefore depends heavily on identifying the right features. The profile settings and account activity are identified as the two feature sets that best predict whether a Twitter account is automated.

3.2.5 Social Network Analysis

Commonly, the analysis of social networks employs methods that focus on broad campaigns rather than on the small groups responsible for them. Weber et al. [56] proposed a novel temporal window strategy to bring to light the latent networks of cooperating accounts. The strategy relies on two main elements: metadata and account interactions. An unsupervised algorithm identifies the groups of accounts whose behaviors enable them to execute goal-based strategies. The approach was validated on two datasets with confirmed ground truth. In 2021, Weber et al. [57] improved on their 2020 version [56] by adding another layer. This layer aims to guarantee that a detected group is not only real but also composed of coordinated users who are responsible for malicious behavior. The main element of the new layer is the training and validation of three supervised ML algorithms on features extracted from accounts and their groupings. Individual behavior, collective behavior, and homophily were specifically considered in the process.
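The idea of a temporal window that exposes latent networks of cooperating accounts can be sketched as below. This is not the algorithm of Weber et al. [56, 57]; it is a minimal illustration that assumes fixed, non-overlapping windows, a hypothetical event format of (account, timestamp, target), and an arbitrary weight threshold. Accounts that act on the same target in the same window are linked, and the link weight counts how often that happens.

```python
from collections import defaultdict
from itertools import combinations


def coactivity_network(events, window_seconds=900):
    """Build a weighted co-activity graph from interaction events.

    events: iterable of (account, timestamp_seconds, target), where target is
    an assumed interaction key such as a retweeted post id, hashtag, or URL.
    Returns {(account_a, account_b): weight} with account_a < account_b.
    """
    buckets = defaultdict(set)
    for account, timestamp, target in events:
        window = int(timestamp // window_seconds)  # fixed, non-overlapping windows
        buckets[(window, target)].add(account)

    edges = defaultdict(int)
    for accounts in buckets.values():
        for a, b in combinations(sorted(accounts), 2):
            edges[(a, b)] += 1
    return dict(edges)


def coordinated_pairs(edges, min_weight=2):
    """Keep only pairs that co-acted at least min_weight times (assumed threshold)."""
    return {pair for pair, weight in edges.items() if weight >= min_weight}


# Hypothetical usage with made-up events:
events = [
    ("a", 10, "t1"), ("b", 20, "t1"),
    ("a", 1000, "t2"), ("b", 1100, "t2"),
    ("c", 5000, "t1"),
]
pairs = coordinated_pairs(coactivity_network(events))
```

Repeated co-activity alone does not prove malicious coordination, which is consistent with the later addition of a supervised validation layer described above.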
4 Features Used to Detect Computational Propaganda

To detect propaganda, researchers have been considering hybrid feature engineering that combines different types of features. Combining features improves performance significantly because, in most combinations, the different feature families capture different aspects [21]. A review of the features selected in previous studies helps to differentiate between feature types: embedded features, psychological features, stylistic features, linguistic features, complexity features, simplicity features, reliability features, network topology features, user-based features (user’s activity, profile, reliability), and non-semantic features. Table 1 and Table 2 summarize the combinations of features used in the reviewed studies. Some studies rely on linguistic features [11, 20–23, 25, 26, 28, 32, 34, 44, 58], network topology features [39, 41], users’ activity features [52, 53], or semantic or non-semantic features [47, 55]. At the same time, other studies mix features to capture different aspects, such as mixing semantic and non-semantic features [4, 27, 29, 50, 51], linguistic and network topology features [36, 38], or non-semantic and network topology features [36, 37, 56, 57].
B. M. Almotairy et al.
As the comparison in the tables shows, some studies have investigated linguistic features that were not used before [4, 21, 22, 25–27, 50, 58]. Barrón-Cedeño et al. [21] investigated the NEws LAndscape (NELA) features proposed by Horne et al. [59]. NELA incorporates 130 content-based characteristics culled from the literature that assess, among other things, complexity, morality, sentiment, and prejudice in news articles. Among all the features, the authors found that character n-gram features and NELA features perform best for improvement, while adding lexicon and vocabulary richness features on top of them works best for testing. In 2020, Tundis, Mukherjee, and Mühlhäuser [58] investigated embedded propagandist mixed-code. They proposed a technique for identifying propaganda mixed-code features that used deep learning models and achieved 86 percent accuracy. Popat et al. [22] were the only ones to combine the F L Words, stance features, and source reliability features to detect the reliability of a message. This combination improves the accuracy of the distant supervision model. Horne & Adalı [27] combined psychological features with stylistic and complexity features. They found that the number of nouns, lexical redundancy (TTR), word count, and the number of quotes, used with an SVM classifier, reached 71 percent cross-validation accuracy. Vorakitphan, Cabrio, and Villata [26] found that combining argumentation features with semantic features improved propaganda detection performance on two common benchmarks, in sentence-level classification and in fragment-level classification with specialized persuasion strategies. Finally, Da San Martino et al. [25] added rhetorical features to their experiments. They found that the accuracy of the results differs substantially between languages.
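Two of the linguistic features mentioned above, the type-token ratio (TTR) and character n-grams, are simple enough to sketch. The code below is only an illustration under stated assumptions: whitespace tokenization and lowercasing are simplifications, and the function names are hypothetical. The cited studies may compute these features differently.

```python
from collections import Counter


def type_token_ratio(text):
    """Distinct words divided by total words (0.0 for empty text)."""
    tokens = text.lower().split()
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def char_ngrams(text, n=3):
    """Count overlapping character n-grams of length n."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


# Invented sample sentence with deliberate repetition.
sample = "the enemy is the enemy of the people"
ttr = type_token_ratio(sample)
trigrams = char_ngrams(sample, n=3)
print(round(ttr, 3), trigrams["the"])
```

A low TTR signals repetitive wording, and the n-gram counts can be fed to a classifier as a sparse feature vector alongside other feature families.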
5 Using Bots to Spread Propaganda
Research has shown that “botnets,” or computer programs, are the leading cause of online traffic, particularly on social media platforms. These programs are coordinated across different computers and designed to perform different tasks on behalf of human beings [37]. Social bots can influence public opinion in many ways, such as by spamming hashtags and scamming Twitter users [60]. This phenomenon is more readily observed on Twitter than on any other platform [37]. According to Agarwal et al. [37], posts made by both bots and human/bot hybrids were rampant on Twitter, with its algorithms presenting users with both fabricated and manipulated information. This led to a focus on two specific events recorded in the Russian region: the Crimean Water Crisis of 2014 and the Dragoon Ride Exercise of 2015. The aim of the study was to identify how botnets were used to spread propaganda during these two events. It found that automated and coordinated bots are being used with increasing sophistication, which makes them evasive even when high-level detection techniques are applied. More sophisticated bot detection methodologies therefore need to be developed, and the new techniques must be intelligent enough to evolve together with the behavior of the bots. This suggestion is in line with what the research community has already established.
Detection of Computational Propaganda on Social Networks
Table 1. The Linguistic Features Used in Propaganda Detection
Table 2. The Linguistic Features Used in Propaganda Detection
The use of bots has been extensively covered by Neyazi [47], with a focus on how they have been applied in India. Posts regarding the Indian television viewership data, the Surgical Strike, and the Uri Attack were the specific events studied by Neyazi with the aim of understanding how bots have been and can be utilized in the social realm. One of the major conclusions is that the use of bots polarized the country socially. Also, it was established that this use infiltrates the news media. Therefore, the need to apply computational techniques by which bots can be identified was highlighted by the author.
Before an effective method can be settled upon, developers must first identify the behaviors of suspicious bots. Mazza et al. [39] identified several retweet behaviors. In the first, a user posts a tweet within seconds of another tweet, and the user's activity divides clearly into sessions separated by gaps with no activity. In the second, the activity sessions are regular and of approximately the same length, with the user active at the same times in the separate sessions. In the third, a user's retweets can be traced far back in time, which results from systematically retweeting a feed in reverse chronological order. Nerino [55] used both theoretical and empirical exploration of the computational propaganda diffused by automated agents to determine its impact, employing a two-step analysis that integrated a mixed-method approach. The primary finding was that the info-cues of computational propaganda play the most prominent role in its effects on users, because they embed persuasion strategies that are meant to trigger precise cognitive deliberation.
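The session-based behaviors described above can be made concrete with a small sketch. It is not the method of Mazza et al. [39]: the gap threshold `max_gap` and both function names are assumptions for illustration. The code groups an account's activity timestamps into sessions and reports each session's duration, so that suspiciously regular sessions become visible.

```python
def split_sessions(timestamps, max_gap):
    """Group activity timestamps into sessions.

    A new session starts whenever the gap to the previous event
    exceeds `max_gap` (same unit as the timestamps).
    """
    sessions = []
    for t in sorted(timestamps):
        if sessions and t - sessions[-1][-1] <= max_gap:
            sessions[-1].append(t)
        else:
            sessions.append([t])
    return sessions


def session_lengths(sessions):
    """Duration of each session (last event minus first event)."""
    return [s[-1] - s[0] for s in sessions]


# Invented timestamps in seconds: three bursts, one hour apart.
retweet_times = [0, 5, 9, 3600, 3604, 3610, 7200, 7203, 7209]
sessions = split_sessions(retweet_times, max_gap=60)
lengths = session_lengths(sessions)
print(len(sessions), lengths)
```

Near-identical session lengths and evenly spaced session starts, as in this toy data, are the kind of regularity the second behavior refers to. A real detector would need a principled threshold and a statistical test rather than visual inspection.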
6 Discussion and Research Gap
Computational propaganda has affected many domains. There have been several reports of computational propaganda campaigns involving large corporations and famous figures. The examples described in Section 1 make clear that its consequences are severe, regardless of the area in which it occurs [61]. Despite initial encouraging findings, the early detection techniques had several drawbacks. The following paragraphs discuss these drawbacks from several aspects: network topology, textual analysis, malicious social bots, stream data analysis, autonomous systems, hybrid systems, and adversarial systems and datasets.
Network Topology Analysis. In social media, the social network analysis approach is successful in detecting coordinated malicious accounts, but malicious accounts might work alone and generate new original content [62]. Furthermore, the network structure of malicious accounts is dynamic, and the topic of discussion may drastically change the community structure of an interaction network [41]. Thus, traditional social network analysis algorithms, which assess malicious users at the group level, are no longer adequate.
Text Analysis. Existing machine learning techniques focus on understanding and detecting patterns of observed malicious activity by analyzing published texts. Existing approaches often presume that posts from fraudulent users are almost identical, since the material is frequently pre-written by the supporters. However, computational propaganda comes in different forms. New and updated strategies and rogue users have been discovered, allowing malicious users to bypass detection easily [6, 20, 63]. Moreover, existing propaganda detection techniques rely on writing style more than on understanding the topics [21]. Using only stylistic language features is not sufficient [21, 22, 32]; it is difficult to discern without a comprehensive picture of the subject’s information landscape [21]. Thus, traditional machine learning classification algorithms, which assess each piece of content separately, are no longer adequate.
Malicious Social Bot. Computational propaganda is spread by a hybrid of bots and human users, which makes it even more difficult to detect [37]. Several researchers have tried to identify either the techniques of propagandist social bots or the techniques of propagandist human users, but the cooperation between the two does not appear to have been widely studied yet [60]. Thus, research on the cooperation behaviors of malicious human users and malicious social bots is required.
Real Time. Based on the literature, the proposed methods imply that improvements emerge only after evidence of new malicious conduct has been gathered. Malicious users profit from the long period (the time it takes to design, model, implement, and publish a new detector) during which they are free to tamper with our online environment. Thus, there is a need for a method that detects malicious users and propagandist messages in real time.
Hybrid System. Based on the literature, early proposed methods investigated only one perspective of computational propaganda features, either the NLP perspective or the network perspective. Compared to a single class of features, a hybrid feature space that considers network structure, user interaction, time span, content, metadata and so on can provide more value. A solid structure that takes into account each class of features and the most successful characteristic within each class is required.
7 Conclusion
This research reviewed state-of-the-art computational propaganda detection methodologies and the selected features of each one as one of its contributions to the field. It discussed in detail the previously discovered characteristics of the propagandist content, users, and network structures. Moreover, it demonstrated how the fast growth of an adversary’s strategies impairs current propaganda detection tools. Further, it justified its call for moving toward real-time detection methods. It also advocated for the importance of combining NLP, SNA, and machine learning efforts to improve the detectors’ performance by combining and improving the current state-of-the-art approaches. Finally, it demonstrated some real research possibilities in the field of computational propaganda detection.
References
1. Jowett, G., O’Donnell, V.: Propaganda & Persuasion. SAGE (2012)
2. Chu, Z., Gianvecchio, S., Wang, H., Jajodia, S.: Who is tweeting on twitter: human, bot, or cyborg? In: Proceedings - Annual Computer Security Applications Conference, ACSAC, pp. 21–30 (2010). https://doi.org/10.1145/1920261.1920265
3. Bolsover, G., Howard, P.: Computational propaganda and political big data: moving toward a more critical research agenda. Big Data 5(4), 273–276 (2017). https://doi.org/10.1089/big.2017.29024.cpr
4. Darwish, K., Alexandrov, D., Nakov, P., Mejova, Y.: Seminar users in the Arabic Twitter sphere. In: Ciampaglia, G.L., Mashhadi, A., Yasseri, T. (eds.) SocInfo 2017. LNCS, vol. 10539, pp. 91–108. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-67217-5_7
5. Mahbub, S., Pardede, E., Kayes, A.S.M., Rahayu, W.: Controlling astroturfing on the internet: a survey on detection techniques and research challenges. Int. J. Web Grid Serv. 15(2), 139–158 (2019). https://doi.org/10.1504/IJWGS.2019.099561
6. Wu, L., Liu, H.: Detecting crowdturfing in social media. In: Alhajj, R. (ed.) Encyclopedia of Social Network Analysis and Mining, pp. 1–9. Springer New York (2017). https://doi.org/10.1007/978-1-4614-7163-9_110196-1
7. Bradshaw, S., Bailey, H., Howard, P.: Computational Propaganda | Industrialized Disinformation: 2020 Global Inventory of Organized Social Media Manipulation (2020). https://comprop.oii.ox.ac.uk/research/posts/industrialized-disinformation/. Accessed 11 Feb 2021
8. Broniatowski, D.A., et al.: Weaponized health communication: Twitter bots and Russian trolls amplify the vaccine debate. Am. J. Public Health 108(10), 1378–1384 (2018). https://doi.org/10.2105/AJPH.2018.304567
9. Ratkiewicz, J., et al.: Truthy: mapping the spread of astroturf in microblog streams. In: Proceedings of the 20th International Conference Companion on World Wide Web, WWW 2011, pp. 249–252 (2011). https://doi.org/10.1145/1963192.1963301
10. Pfister, D.S.: The logos of the blogosphere: flooding the zone, invention, and attention in the lott imbroglio. Argumentation and Advocacy 47(3), 141–162 (2011). https://doi.org/10.1080/00028533.2011.11821743
11. Khanday, A.M.U.D., Khan, Q.R., Rabani, S.T.: Identifying propaganda from online social networks during COVID-19 using machine learning techniques. Int. J. Inf. Technol. 13(1), 115–122 (2020). https://doi.org/10.1007/s41870-020-00550-5
12. Watson, A.: Share of adults who use social media as a source of news in selected countries worldwide as of February 2022 (2022). https://www.statista.com/statistics/718019/social-media-news-source/. Accessed 20 Oct 2022
13. Petticrew, M., Roberts, H.: Systematic reviews - do they ‘work’ in informing decision-making around health inequalities? Health Econ. Policy Law 3(2), 197–211 (2008). https://doi.org/10.1017/S1744133108004453
14. How to Detect Propaganda. Bulletin of the American Association of University Professors 24(1), 49–55 (1938)
15. Țuțui, V.: The propaganda machine in the age of social media. Journal of the Seminar of Discursive Logic, Argumentation Theory and Rhetoric 18(2), 117–138 (2020)
16. Woolley, S., Howard, P.: Computational Propaganda | Computational Propaganda Worldwide: Executive Summary (2017). comprop.oii.ox.ac.uk. Accessed 11 Feb 2021
17. Woolley, S., Howard, P.: Political communication, computational propaganda, and autonomous agents: introduction. Int. J. Commun. 10, 4882–4890 (2016). https://par.nsf.gov/biblio/10021331. Accessed 09 Feb 2021
18. Cinelli, M., de Francisci Morales, G., Galeazzi, A., Quattrociocchi, W., Starnini, M.: The echo chamber effect on social media. Proc. Natl. Acad. Sci. USA 118(9) (2021). https://doi.org/10.1073/PNAS.2023301118/-/DCSUPPLEMENTAL
19. Diakopoulos, N.: Algorithmic accountability: journalistic investigation of computational power structures. Digit. J. 3(3), 398–415 (2018). https://doi.org/10.1080/21670811.2014.976411
20. Rashkin, H., et al.: Truth of varying shades: analyzing language in fake news and political fact-checking. In: 2017 Conference on Empirical Methods in Natural Language Processing, pp. 2931–2937 (2017)
21. Barrón-Cedeño, A., Jaradat, I., da San Martino, G., Nakov, P.: Proppy: organizing the news based on their propagandistic content. Inf. Process. Manag. 56(5), 1849–1864 (2019). https://doi.org/10.1016/j.ipm.2019.03.005
22. Popat, K., Mukherjee, S., Strötgen, J., Weikum, G.: Where the truth lies: explaining the credibility of emerging claims on the web and social media. In: 26th International World Wide Web Conference 2017, WWW 2017 Companion, pp. 1003–1012 (2017). https://doi.org/10.1145/3041021.3055133
23. Da San Martino, G., Yu, S., Barrón-Cedeño, A., Petrov, R., Nakov, P.: Fine-grained analysis of propaganda in news article. In: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pp. 5635–5645 (2019). https://doi.org/10.18653/v1/D19-1565
24. Mukherjee, S., Weikum, G.: Leveraging joint interactions for credibility analysis in news communities. In: Proceedings of the International Conference on Information and Knowledge Management, vol. 19, 23-Oct-2015, pp. 353–362 (2015). https://doi.org/10.1145/2806416.2806537
25. da San Martino, G., Barrón-Cedeño, A., Wachsmuth, H., Petrov, R., Nakov, P.: SemEval-2020 Task 11: Detection of Propaganda Techniques in News Articles. In: Proceedings of the Fourteenth Workshop on Semantic Evaluation, pp. 1377–1414 (2020). https://doi.org/10.18653/v1/2020.semeval-1.186
26. Vorakitphan, V., Cabrio, E., Villata, S.: ‘Don’t discuss’: investigating semantic and argumentative features for supervised propagandist message detection and classification. In: Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021), pp. 1498–1507 (2021). https://doi.org/10.26615/978-954-452-072-4_168
27. Horne, B.D., Adalı, S.: This just in: fake news packs a lot in title, uses simpler, repetitive content in text body, more similar to satire than real news. In: Proceedings of the International AAAI Conference on Web and Social Media, vol. 11, no. 1 (2017). https://github.com/rpitrust/fakenewsdata1. Accessed 03 Jul 2021
28. Khanday, A.M.U.D., Khan, Q.R., Rabani, S.T.: Analysing and predicting propaganda on social media using machine learning techniques. In: Proceedings - IEEE 2020 2nd International Conference on Advances in Computing, Communication Control and Networking, ICACCCN 2020, pp. 122–127 (2020). https://doi.org/10.1109/ICACCCN51052.2020.9362838
29. Li, J., Ye, Z., Xiao, L.: Detection of propaganda using logistic regression. In: Proceedings of the Natural Language Processing for Internet Freedom: Censorship, Disinformation, and Propaganda, pp. 119–124 (2019). https://doi.org/10.18653/V1/D19-5017
30. Hasan, K.S., Ng, V.: Conundrums in unsupervised keyphrase extraction: making sense of the state-of-the-art. In: Coling 2010: Poster, pp. 365–373 (2010)
31. Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. arXiv preprint arXiv:1810.04805 (2018)
32. Potthast, M., Kiesel, J., Reinartz, K., Bevendorff, J., Stein, B.: A stylometric inquiry into hyperpartisan and fake news. In: Proceedings of the Conference (Long Papers) ACL 2018 56th Annual Meeting of the Association for Computational Linguistics, vol. 1, pp. 231–240 (2017)
33. Koppel, M., Schler, J., Bonchek-Dokow, E.: Measuring differentiability: unmasking pseudonymous authors. J. Mach. Learn. Res. 8, 1261–1276 (2007)
34. Nasir, J.A., Khan, O.S., Varlamis, I.: Fake news detection: a hybrid CNN-RNN based deep learning approach. Int. J. Inf. Manage. Data Insights 1(1), 100007 (2021). https://doi.org/10.1016/J.JJIMEI.2020.100007
35. Abu Salem, F.K., Al Feel, R., Elbassuoni, S., Jaber, M., Farah, M.: FA-KES: a fake news dataset around the Syrian war. In: Proceedings of the International AAAI Conference on Web and Social Media, vol. 13, pp. 573–582 (2019). https://doi.org/10.5281/ZENODO.2607278
36. Yang, K.-C., Varol, O., Davis, C.A., Ferrara, E., Flammini, A., Menczer, F.: Arming the public with artificial intelligence to counter social bots. Hum. Behav. Emerg. Technol. 1(1), 48–61 (2019). https://doi.org/10.1002/hbe2.115
37. Agarwal, N., Al-khateeb, S., Galeano, R., Goolsby, R.: Examining the use of botnets and their evolution in propaganda dissemination. Defence Strateg. Commun. 2(2), 87–112 (2017)
38. Guarino, S., Trino, N., Celestini, A., Chessa, A., Riotta, G.: Characterizing networks of propaganda on twitter: a case study. Appl. Netw. Sci. 5(1), 1–22 (2020). https://doi.org/10.1007/s41109-020-00286-y
39. Mazza, M., Cresci, S., Avvenuti, M., Quattrociocchi, W., Tesconi, M.: RTbust: exploiting temporal patterns for botnet detection on twitter. In: WebSci 2019 - Proceedings of the 11th ACM Conference on Web Science, pp. 183–192 (2019)
40. Zhang, J., Zhang, R., Zhang, Y., Yan, G.: The rise of social botnets: attacks and countermeasures. IEEE Trans. Dependable Secure Comput. 15(6), 1068–1082 (2018). https://doi.org/10.1109/TDSC.2016.2641441
41. Pacheco, D., Flammini, A., Menczer, F.: Unveiling coordinated groups behind white helmets disinformation. In: The Web Conference 2020 - Companion of the World Wide Web Conference, WWW 2020, pp. 611–616 (2020). https://doi.org/10.1145/3366424.3385775
42. Iqbal, F., Hadjidj, R., Fung, B.C.M., Debbabi, M.: A novel approach of mining write-prints for authorship attribution in e-mail forensics. Digit. Investig. 5, S42–S51 (2008). https://doi.org/10.1016/J.DIIN.2008.05.001
43. Stamatatos, E.: Intrinsic plagiarism detection using character n-gram profiles. In: SEPLN 2009 Workshop on Uncovering Plagiarism, Authorship and Social Software Misuse (PAN 09), pp. 38–46 (2009)
44. Peng, J., Detchon, S., Choo, K.-K.R., Ashman, H.: Astroturfing detection in social media: a binary n-gram-based approach. Concurrency and Computation: Practice and Experience 29(17), e4013 (2017). John Wiley and Sons Ltd. https://doi.org/10.1002/cpe.4013
45. Peng, J., Choo, K.K.R., Ashman, H.: Bit-level n-gram based forensic authorship analysis on social media: identifying individuals from linguistic profiles. J. Netw. Comput. Appl. 70, 171–182 (2016). https://doi.org/10.1016/j.jnca.2016.04.001
46. Bain, L., Engelhardt, M.: Statistical analysis of reliability and life-testing models: theory and methods, 2nd ed. Routledge (2017). https://doi.org/10.1201/9780203738733
47. Neyazi, T.A.: Digital propaganda, political bots and polarized politics in India. Asian J. Commun. 30(1), 39–57 (2019). https://doi.org/10.1080/01292986.2019.1699938
48. Howard, P.N., Kollanyi, B.: Bots, strongerin, and brexit: computational propaganda during the UK-EU referendum. SSRN Electron. J. (2016). https://doi.org/10.2139/SSRN.2798311
49. Alvari, H., Shakarian, P.: Hawkes process for understanding the influence of pathogenic social media accounts. In: Proceedings - 2019 2nd International Conference on Data Intelligence and Security, ICDIS 2019, pp. 36–42 (2019). https://doi.org/10.1109/ICDIS.2019.00013
50. Miller, B.: Automated detection of Chinese government astroturfers using network and social metadata. SSRN Electr. J., 35 (2016). https://doi.org/10.2139/SSRN.2738325
51. Boididou, C., Papadopoulos, S., Zampoglou, M., Apostolidis, L., Papadopoulou, O., Kompatsiaris, Y.: Detection and visualization of misleading content on twitter. Int. J. Multimedia Inf. Retrieval 7(1), 71–86 (2017). https://doi.org/10.1007/s13735-017-0143-x
52. Shaabani, E., Guo, R., Shakarian, P.: Detecting pathogenic social media accounts without content or network structure. In: Proceedings - 2018 1st International Conference on Data Intelligence and Security, ICDIS 2018, pp. 57–64 (2018). https://doi.org/10.1109/ICDIS.2018.00016
53. Alvari, H., Shaabani, E., Shakarian, P.: Early identification of pathogenic social media accounts. In: 2018 IEEE International Conference on Intelligence and Security Informatics, ISI 2018, pp. 169–174 (2018). https://doi.org/10.1109/ISI.2018.8587339
54. Shaabani, E., Mobarakeh, A.S., Alvari, H., Shakarian, P.: An end-to-end framework to identify pathogenic social media accounts on twitter. In: Proceedings - 2019 2nd International Conference on Data Intelligence and Security, ICDIS 2019, pp. 128–135 (2019). https://doi.org/10.1109/ICDIS.2019.00027
55. Nerino, V.: Tricked into supporting: a study on computational propaganda persuasion strategies. Italian Sociol. Rev. 11(3) (2021). https://doi.org/10.13136/isr.v11i4S.438
56. Weber, D., Neumann, F.: Who’s in the gang? Revealing coordinating communities in social media. In: Proceedings of the 2020 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2020, pp. 89–93 (2020). https://doi.org/10.1109/ASONAM49781.2020.9381418
57. Weber, D., Neumann, F.: Amplifying influence through coordinated behaviour in social networks. Soc. Netw. Anal. Min. 11(1), 1–42 (2021). https://doi.org/10.1007/S13278-021-00815-2/FIGURES/18
58. Tundis, A., Mukherjee, G., Mühlhäuser, M.: Mixed-code text analysis for the detection of online hidden propaganda. In: Proceedings of the 15th International Conference on Availability, Reliability and Security, p. 7. https://doi.org/10.1145/3407023
59. Horne, B.D., Khedr, S., Adalı, S.: Sampling the news producers: a large news and feature data set for the study of the complex media landscape. In: 12th International AAAI Conference on Web and Social Media, ICWSM 2018, pp. 518–527 (2018). https://arxiv.org/abs/1803.10124v4. Accessed 22 Dec 2021
60. Abokhodair, N., Yoo, D., McDonald, D.W.: Dissecting a social botnet: growth, content and influence in twitter. In: Proceedings of the 18th ACM Conference on Computer Supported Cooperative Work & Social Computing, pp. 839–851 (2016). https://doi.org/10.1145/2675133.2675208
61. Haikarainen, J.: Astroturfing as a global phenomenon, undefined (2014)
62. Jagabathula, S., Subramanian, L.: Reputation-based worker filtering in crowdsourcing. In: 27th International Conference on Neural Information Processing Systems, pp. 2492–2500 (2014)
63. Kaplan, A.M., Haenlein, M.: Users of the world, unite! The challenges and opportunities of social media. Bus. Horiz. 53(1), 59–68 (2010). https://doi.org/10.1016/j.bushor.2009.09.003
Quantum Computing Techniques for Multi-knapsack Problems
Abhishek Awasthi1, Francesco Bär2, Joseph Doetsch3, Hans Ehm4, Marvin Erdmann5, Maximilian Hess4, Johannes Klepsch5, Peter A. Limacher2, Andre Luckow5, Christoph Niedermeier6, Lilly Palackal4, Ruben Pfeiffer4, Philipp Ross5, Hila Safi6, Janik Schönmeier-Kromer2, Oliver von Sicard6, Yannick Wenger3, Karen Wintersperger6(B), and Sheir Yarkoni7
1 BASF SE, Ludwigshafen am Rhein, Germany [email protected]
2 SAP SE, Walldorf, Germany [email protected]
3 Lufthansa Industry Solutions AS GmbH, Raunheim, Germany [email protected]
4 Infineon Technologies AG, Neubiberg, Germany {maximilian.hess,Lilly.Palackal}@infineon.com
5 BMW AG, Munich, Germany [email protected]
6 Siemens AG, Munich, Germany [email protected]
7 Volkswagen AG, Munich, Germany [email protected]
Abstract. Optimization problems are ubiquitous in various industrial settings, and multi-knapsack optimization is one recurrent task faced daily by several industries. The advent of quantum computing has opened a new paradigm for computationally intensive tasks, with promises of delivering better and faster solutions for specific classes of problems. This work presents a comprehensive study of quantum computing approaches for multi-knapsack problems, by investigating some of the most prominent and state-of-the-art quantum algorithms using different quantum software and hardware tools. The performance of the quantum approaches is compared for varying hyperparameters. We consider several gate-based quantum algorithms, such as QAOA and VQE, as well as quantum annealing, and present an exhaustive study of the solutions and the estimation of runtimes. Additionally, we analyze the impact of warm-starting QAOA to understand the reasons for the better performance of this approach. We discuss the implications of our results in view of utilizing quantum optimization for industrial applications in the future. In addition to the high demand for better quantum hardware, our results also emphasize the necessity of more and better quantum optimization algorithms, especially for multi-knapsack problems.
This paper was developed within the Quantum Technology and Application Consortium (QUTAC). QUTAC: [email protected].
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023. K. Arai (Ed.): SAI 2023, LNNS 739, pp. 264–284, 2023. https://doi.org/10.1007/978-3-031-37963-5_19
QC for Multi-knapsack Problems
Keywords: Knapsack problem · QAOA · VQE · Quantum Annealing
1 Introduction
The knapsack problem deals with a set of items, each having a value and a weight, which are assigned to a knapsack with a certain capacity. The task is to maximize the total value of items placed in the knapsack while respecting the capacity of the knapsack. This optimization problem is also referred to as a one-dimensional 0/1 knapsack problem and is known to be NP-hard [16]. Generalizing the problem to M knapsacks (multi-knapsack problem) and valuing items differently in each knapsack complicates the problem further [24]. However, these more delicate versions of the knapsack problem can be used as models for many real-world use cases. The scale of many industrial applications, combined with the multi-objective nature of the M-dimensional 0/1 knapsack problem, poses a significant challenge to classical solution approaches [24]. Therefore, new computational paradigms such as quantum computing are explored, with the goal of finding a speed-up over traditional approaches [5,31]. Quantum computing allows solving certain types of complex problems significantly faster than classical devices [10,30]. Due to its broad applicability, the knapsack problem and its variants have been well studied, and several classical approaches have been proposed to find solutions to this class of problems. Surveys on heuristic algorithms for multiple categories of knapsack problems can be found in [19] and [35]. The field of research dedicated to quantum computing solutions for knapsack problems is considerably younger and smaller. Two techniques that are built upon the quantum approximate optimization algorithm (QAOA) are introduced in [34]: one of them “warm-starts” the quantum optimization algorithm by seeding it with an initial solution using a greedy classical method, and the other uses special mixing Hamiltonians to improve the exploration of the solution space. Results indicate that both approaches outperform similarly shallow classical heuristics in one-dimensional knapsack problem instances. Reformulations of this version of the knapsack problem are evaluated in [27]. For the multi-knapsack problem, two quantum-inspired evolutionary algorithms (QIEA) are presented in [22], solving knapsack problem instances with more than 10,000 items. Besides these promising findings of the potential of quantum computing for the knapsack use case, there are studies indicating challenges and even an inability of quantum optimization techniques to outperform classical methods, at least in the era of near-term noisy intermediate-scale quantum (NISQ) devices. In [25], it is shown that a D-Wave 2000Q quantum annealer could not provide optimal solutions for many small-scale knapsack problems due to the limitations of the hardware. A general theory on the limitations of optimization algorithms on NISQ devices is presented in [9]. Since most of the optimization algorithms for NP-hard problems are heuristic in nature, frameworks such as QUARK [8] are essential to obtain a thorough comparison of the performances of classical and quantum algorithms on various hardware backends.
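The one-dimensional 0/1 knapsack problem defined at the start of this section, and the kind of greedy classical seed that a warm-start could use, can be sketched in a few lines. This is an illustration only: the value-to-weight ratio rule is a common textbook heuristic and not necessarily the greedy method of [34], the function names are hypothetical, and the exact solver assumes positive integer weights.

```python
def greedy_knapsack(values, weights, capacity):
    """Pick items in order of value/weight ratio while they still fit."""
    order = sorted(range(len(values)),
                   key=lambda i: values[i] / weights[i], reverse=True)
    chosen, remaining = [], capacity
    for i in order:
        if weights[i] <= remaining:
            chosen.append(i)
            remaining -= weights[i]
    return sorted(chosen), sum(values[i] for i in chosen)


def exact_knapsack(values, weights, capacity):
    """Optimal total value via dynamic programming over integer capacities."""
    best = [0] * (capacity + 1)
    for v, w in zip(values, weights):
        # Iterate capacities downwards so each item is used at most once.
        for c in range(capacity, w - 1, -1):
            best[c] = max(best[c], best[c - w] + v)
    return best[capacity]


# Small invented instance on which the greedy rule is suboptimal.
values = [60, 100, 120]
weights = [10, 20, 30]
capacity = 50
greedy_items, greedy_value = greedy_knapsack(values, weights, capacity)
optimal_value = exact_knapsack(values, weights, capacity)
print(greedy_items, greedy_value, optimal_value)
```

On this instance the greedy selection is feasible but not optimal, which is the typical situation in which a heuristic seed is used as a starting point for further optimization rather than as a final answer.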
A. Awasthi et al.
This work presents a comprehensive benchmark of different quantum algorithms for a use case relevant to many industries. We study an M-dimensional 0/1 knapsack problem and carry out an extensive comparison of results obtained via QAOA, warm-start QAOA, VQE, Quantum Annealing as well as the iterative heuristic solver with Simulated Annealing. Much work about benchmarking quantum algorithms for optimization problems focuses mainly on a single type of algorithm (circuit model or adiabatic model). However, to identify the most promising quantum algorithm, holistic benchmarking of several quantum approaches needs to be carried out. We study the suitability of quantum algorithms for the multi-knapsack problem with the help of key performance indicators (KPIs) relevant to industrial applications. Since we are interested in the performance of the algorithms, we benchmark the gate-based algorithms on noiseless simulators (while quantum annealing has been carried out on quantum hardware). Nonetheless, in contrast to most of the studies published in this field, we present a practical estimation of the runtimes on quantum hardware for the QAOA and VQE algorithms. Additionally, we compare the warm-start QAOA described in [6] as well as another new variant of warm-start QAOA with the standard algorithms mentioned above, which is another novel contribution of this work to the best of our knowledge. The remainder of the paper is structured as follows: Sect. 2 provides the business motivation and possible use cases. The modeling of the multi-knapsack problem is described in Sect. 3, while Sect. 4 gives a brief overview of the quantum algorithms being used, i.e., QAOA, VQE and Quantum Annealing. The information about the problem instances and the definitions of the KPIs used in this work are provided in Sect. 5. We then present our benchmarking results for the studied algorithms, along with the runtimes using a quantum annealer and an estimation of the runtimes of the gate-based algorithms on a real quantum device. We conclude our work with Sect. 6 and provide an analysis of the results obtained as well as an outlook.
2 Business Motivation
The Quantum Technology and Application Consortium (QUTAC) and its members focus on industry use cases for quantum computing applications [26]. The general knapsack problem has many applications in decision-making processes along the entire value chain, such as optimizing portfolios [17], business operations and supply chains. A prominent example of a multi-knapsack problem in many industries, including the automotive, semiconductor and chemical industry, is the optimization of complex supply chains, since products are usually processed in a global manufacturing network rather than in a single factory. Hence, the need for planning and communication between the manufacturing sites emerges to realize an optimal global manufacturing process. Optimization techniques are crucial to addressing common supply chain challenges and increasing the responsiveness to disruptions, e.g., optimizing freight and warehouse capacities, labor planning and carbon emissions, thereby also enhancing
the overall sustainability [12]. For the semiconductor industry alone, where supply chain management is particularly complex and dynamic, optimization plays a major role in daily integrated supply chain planning. One major goal is to solve the demand-capacity matching problem, which corresponds to maximizing the available product units relative to the units promised to the customer. This task consists of several computationally hard optimization problems, since more than a million order confirmations and reorder confirmations have to be calculated regularly. Therefore, high-quality solutions need to be delivered in a reasonable time. In the following, we will restrict ourselves to a simplified model in which we assume that the demand, capacity, and costs are given. Note that this is a strong assumption, as determining the inputs of the model is a computationally hard task itself. The simplified demand-capacity match can be encoded as a multi-knapsack problem.
3 Modeling
In this section, we present a formal description of the multi-knapsack optimization problem, along with the mathematical formulation of the QUBO model. Given N items and M knapsacks, the objective is to assign as many valuable items as possible to each of the knapsacks while not exceeding the capacity of any knapsack. The problem can be stated as follows. Let $j \in \{0, 1, \dots, N-1\}$; then $w_j \in \mathbb{N}_0$ denotes the weight of item $j$ and $v_{i,j} \in \mathbb{N}_0$ denotes the value of item $j$ in knapsack $i \in \{0, 1, \dots, M-1\}$. The capacity of knapsack $i$ is denoted by $c_i \in \mathbb{N}_0$. We define a decision variable $x_{i,j}$ such that $x_{i,j} = 1$ if item $j$ is assigned to knapsack $i$, and $x_{i,j} = 0$ otherwise. We can now formulate the corresponding QUBO model, including the problem constraints and objective term.

– Any item $j$ can be assigned to at most one knapsack. It is possible that an item is not assigned to any knapsack.

\[ H_{\text{single}} = \sum_{j=0}^{N-1} \left( \sum_{i=0}^{M-1} x_{i,j} \right) \cdot \left( \sum_{i=0}^{M-1} x_{i,j} - 1 \right). \tag{1} \]

– Ensure that no knapsack's capacity is exceeded. This is achieved by introducing slack bits $y_{i,b}$ with binary expansion, based on the work of Lucas [21]. The filling of knapsack $i \in \{0, \dots, M-1\}$, which can be smaller than its capacity, is thereby expressed as $c_i - \sum_{b=0}^{\lfloor \log_2 c_i \rfloor} 2^b \cdot y_{i,b}$. Using this formulation, no fillings larger than the knapsack capacity can be encoded, even when $c_i$ is not a power of two. If the sum corresponds to a number larger than $c_i$, the actual filling becomes negative and thus leads to an even larger penalty in the Hamiltonian. The capacity term of the Hamiltonian reads

\[ H_{\text{capacity}} = \sum_{i=0}^{M-1} \left[ \left( \sum_{j=0}^{N-1} w_j \cdot x_{i,j} \right) + \left( \sum_{b=0}^{\lfloor \log_2 c_i \rfloor} 2^b \cdot y_{i,b} \right) - c_i \right]^2. \tag{2} \]
– The objective term is formulated such that our original maximization objective is converted into a minimization problem:

\[ H_{\text{obj}} = - \sum_{i=0}^{M-1} \sum_{j=0}^{N-1} v_{i,j} \cdot x_{i,j}. \tag{3} \]

With the above QUBO terms, we can now formulate the complete QUBO for multi-knapsack optimization problems as

\[ H = A \cdot H_{\text{single}} + B \cdot H_{\text{capacity}} + C \cdot H_{\text{obj}}. \tag{4} \]

The coefficients $A, B > 0$ are the penalty weights, and $C > 0$ is the objective weight. The minimization of $H$ results in an optimal solution to the multi-knapsack problem. We have to choose $A$, $B$, $C$ such that $A/C > \max_{i,j}(v_{i,j})$ and $B/C > \max_{i,j}(v_{i,j})$. This ensures that any value gained by breaking a constraint is offset by an even larger penalty. Without loss of generality, we therefore choose $C = 1$ and set $A = B = 2 \cdot \max_{i,j}(v_{i,j})$.
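To make the QUBO of Eqs. (1)–(4) concrete, the following minimal sketch evaluates H for a given bitstring and enumerates all bitstrings of a tiny instance. The instance data (`w`, `v`, `c`) and all function names are hypothetical and chosen only for illustration; the slack bits are assumed to run over b = 0, ..., ⌊log2 c_i⌋, and the weights follow the choice C = 1, A = B = 2 · max(v) made above.

```python
from itertools import product
from math import floor, log2


def slack_bits(c):
    """Number of slack bits for capacity c, assuming b = 0 .. floor(log2(c))."""
    return floor(log2(c)) + 1


def energy(x, y, w, v, c, A, B, C=1):
    """Evaluate H = A*H_single + B*H_capacity + C*H_obj for one bitstring.

    x[i][j] = 1 if item j is placed in knapsack i; y[i][b] are the slack bits.
    """
    M, N = len(c), len(w)
    h_single = 0
    for j in range(N):
        s = sum(x[i][j] for i in range(M))
        h_single += s * (s - 1)                      # Eq. (1)
    h_capacity = 0
    for i in range(M):
        load = sum(w[j] * x[i][j] for j in range(N))
        slack = sum(2 ** b * y[i][b] for b in range(len(y[i])))
        h_capacity += (load + slack - c[i]) ** 2     # Eq. (2)
    h_obj = -sum(v[i][j] * x[i][j] for i in range(M) for j in range(N))  # Eq. (3)
    return A * h_single + B * h_capacity + C * h_obj  # Eq. (4)


def brute_force(w, v, c):
    """Enumerate every bitstring of a tiny instance and return the minimum of H."""
    M, N = len(c), len(w)
    A = B = 2 * max(max(row) for row in v)           # C = 1
    n_slack = [slack_bits(ci) for ci in c]
    n_bits = M * N + sum(n_slack)
    best = None
    for bits in product((0, 1), repeat=n_bits):
        x = [list(bits[i * N:(i + 1) * N]) for i in range(M)]
        y, pos = [], M * N
        for k in n_slack:
            y.append(list(bits[pos:pos + k]))
            pos += k
        e = energy(x, y, w, v, c, A, B)
        if best is None or e < best[0]:
            best = (e, x, y)
    return best


# Hypothetical toy instance: 3 items, 2 knapsacks (10 binary variables in total).
w = [2, 3, 1]
v = [[3, 4, 2], [2, 5, 1]]
c = [3, 2]
e_min, x_min, y_min = brute_force(w, v, c)
print(e_min, x_min, y_min)
```

For a valid assignment both penalty terms vanish, so the energy reduces to the negative total value of the packed items.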
4 Quantum Optimization Algorithms
This work provides a comparison between several quantum algorithms for the multi-knapsack problem. Since most of these algorithms are well known and explained in detail in the literature, we provide only a short overview below.

4.1 Quantum Approximate Optimization Algorithm (QAOA)
The QAOA is a popular variational algorithm inspired by the adiabatic theorem, devised to produce approximate solutions for combinatorial optimization problems [7]. For brevity, we outline the basics of the algorithm below; for an in-depth explanation of the algorithm we refer the reader to [38]. The QAOA algorithm optimizes a problem Hamiltonian $C$ by constructing a predefined parameterized quantum circuit and optimizing the circuit parameters with classical iterative algorithms. Concretely, the QAOA algorithm requires a quantum circuit to sample a quantum state $|\psi_{\gamma,\beta}\rangle = \prod_{l=p}^{1} U(B, \beta_l) \cdot U(C, \gamma_l) \cdot |s\rangle$, where $|s\rangle$ is the uniform superposition state, $U(B, \beta_l) = e^{-i\beta_l B}$ is the unitary operator resulting from a mixing Hamiltonian and $U(C, \gamma_l)$ is an operator for the problem Hamiltonian $C$. The mixing Hamiltonian is defined as the sum of Pauli-X ($\sigma^x$) observables acting on all the $n$ qubits, $B = \sum_{j=1}^{n} \sigma_j^x$. The optimization task is to maximize/minimize $\langle\psi_{\gamma,\beta}|C|\psi_{\gamma,\beta}\rangle$, the expectation value of the problem Hamiltonian $C$ in the state $|\psi_{\gamma,\beta}\rangle$. For the implementation of QAOA there are a few hyperparameters, such as the initialization of $\gamma$, $\beta$ and the number of layers $p$, that need to be specified. We provide the analysis of these aspects in Sect. 5. A schematic diagram of the QAOA circuit is provided in Fig. 1.
[Figure: starting from |s⟩, the circuit applies U(C, γ1), U(B, β1), U(C, γ2), U(B, β2), ..., U(C, γp), U(B, βp) in sequence to prepare |ψγ,β⟩.]
Fig. 1. Schematic Diagram of the QAOA Circuit with p Layers.
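The layer structure described above can be illustrated with a small state-vector simulation. The sketch below is not the Qiskit implementation used in this work; it assumes a diagonal problem Hamiltonian given as a list of cost values (the hypothetical `costs`), takes U(C, γ) = exp(−iγC), and replaces the classical optimizer by a coarse grid search over a single layer.

```python
import cmath
import math


def qaoa_state(costs, gammas, betas):
    """Return the QAOA state vector for a diagonal cost Hamiltonian.

    costs[z] is the cost C(z) of basis state z; len(costs) must equal 2**n.
    """
    dim = len(costs)
    n = dim.bit_length() - 1
    amp = 1 / math.sqrt(dim)
    state = [complex(amp, 0.0)] * dim              # uniform superposition |s>
    for gamma, beta in zip(gammas, betas):
        # exp(-i*gamma*C) is diagonal in the computational basis.
        state = [a * cmath.exp(-1j * gamma * c) for a, c in zip(state, costs)]
        # exp(-i*beta*sum_j X_j) factorizes into one rotation per qubit:
        # exp(-i*beta*X) = cos(beta)*I - i*sin(beta)*X.
        cos_b, sin_b = math.cos(beta), math.sin(beta)
        for q in range(n):
            mask = 1 << q
            new = state[:]
            for z in range(dim):
                if not z & mask:
                    a0, a1 = state[z], state[z | mask]
                    new[z] = cos_b * a0 - 1j * sin_b * a1
                    new[z | mask] = -1j * sin_b * a0 + cos_b * a1
            state = new
    return state


def expectation(state, costs):
    """Expectation value <psi|C|psi> for a diagonal C."""
    return sum(abs(a) ** 2 * c for a, c in zip(state, costs))


# Hypothetical 3-qubit cost values; coarse grid search for p = 1.
costs = [0, -1, -2, 3, -1, 2, 1, 4]
grid = [k * math.pi / 16 for k in range(-16, 17)]
best = min(
    (expectation(qaoa_state(costs, [g], [b]), costs), g, b)
    for g in grid
    for b in grid
)
print(best)
```

With γ = β = 0 the state stays in the uniform superposition, so the grid search can never return a value above the plain average of the costs.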
4.1.1 Warm-Start QAOA: Warm-start QAOA (WS-QAOA) is a variant of QAOA developed by Egger et al. [6]. The difference essentially lies in the initial state and the mixing Hamiltonian, which are defined based on the solution of the relaxed QUBO (which admits variables in [0, 1] instead of {0, 1}). Suppose there are in total $L$ variables in the original QUBO and let $c_l^*$ be the optimal value of the $l$th variable in the relaxed QUBO. The WS-QAOA circuit is then initialized with the state $|\phi^*\rangle$, such that

\[ |\phi^*\rangle = \bigotimes_{l=0}^{L-1} \hat{R}_Y(\theta_l) |0\rangle^{\otimes L} = \bigotimes_{l=0}^{L-1} \left( \sqrt{1 - c_l^*}\, |0\rangle + \sqrt{c_l^*}\, |1\rangle \right), \tag{5} \]

where $\hat{R}_Y(\theta_l)$ is a rotation gate on the $l$th qubit parameterized by the angle $\theta_l = 2 \arcsin\left(\sqrt{c_l^*}\right)$. This initial state ensures that the probability of measuring qubit $l$ in state $|1\rangle$ is $c_l^*$. The second difference from the standard QAOA is the mixer Hamiltonian. Instead of the Pauli-X Hamiltonian, WS-QAOA utilizes a Hamiltonian whose ground state is $|\phi^*\rangle$ with eigenvalue $-n$. Formally, the mixer Hamiltonian in WS-QAOA is $\hat{H}_M^{ws} = \sum_{l=0}^{L-1} \hat{H}_{M,l}^{ws}$ such that

\[ \hat{H}_{M,l}^{ws} = \begin{pmatrix} 2c_l^* - 1 & -2\sqrt{c_l^*(1 - c_l^*)} \\ -2\sqrt{c_l^*(1 - c_l^*)} & 1 - 2c_l^* \end{pmatrix}. \tag{6} \]

The mixer operator is $\exp(-i\beta \hat{H}_{M,l}^{ws})$, which is implemented in WS-QAOA using the single-qubit rotation gates $\hat{R}_Y(\theta_l)\hat{R}_Z(-2\beta)\hat{R}_Y(-\theta_l)$. The implementation of the WS-QAOA in this work is identical to the warm-start QAOA described by Egger et al. [6].

4.1.2 Warm-Started Standard QAOA: As we will see in our results and in the study presented by Egger et al. [6], the warm-start QAOA performs much better than the standard QAOA. To clearly understand the effect of warm-starting, i.e., initializing the QAOA (WS-QAOA) circuit with the relaxed solution of a QUBO, we propose a variant of WS-QAOA in which the circuit is initialized with the relaxed QUBO solution, but the mixer Hamiltonian is left unchanged from the standard QAOA. We call this approach the warm-started standard QAOA, and for brevity we refer to it as WS-Init-QAOA. The WS-Init-QAOA can be formally expressed as a variational quantum circuit which samples $|\psi_{\gamma,\beta}\rangle = \prod_{l=p}^{1} U(B, \beta_l) \cdot U(C, \gamma_l) \cdot |\phi^*\rangle$, where $|\phi^*\rangle$ is the state encoding the relaxed QUBO solution, explained in Sect. 4.1.1. $U(B, \beta_l) = e^{-i\beta_l B}$ is the parameterized
unitary operator resulting from the Pauli-X mixing Hamiltonian and $U(C, \gamma_l)$ is an operator for the problem Hamiltonian $C$. Note that the only difference from the WS-QAOA is the mixing Hamiltonian.

4.2 Variational Quantum Eigensolver
Like QAOA, the variational quantum eigensolver (VQE) [23] consists of a parameterized quantum circuit $U(\theta)$, where $\theta \in [0, 2\pi]^n$ are angles for single-qubit rotation gates. Analogous to QAOA, these parameters are tuned to minimize the expectation value $\langle\psi(\theta)|C|\psi(\theta)\rangle$, where $|\psi(\theta)\rangle := U(\theta)|0\rangle$ is the trial state prepared by the circuit $U(\theta)$ and $C$ is the Hamiltonian encoding the optimization problem. The VQE offers more freedom in the choice of the circuit. QAOA can be seen as a special case of VQE, namely if we choose a QAOA circuit as $U(\theta)$. A common requirement for the circuit $U(\theta)$ is hardware efficiency [15], meaning that the circuits consist of parameterized one-qubit rotations and two-qubit entangling gates and are kept relatively shallow in order to deal with short coherence times on NISQ devices. The drawback of using generic ansatz circuits is that a larger number of parameters (compared to the QAOA) is needed to guarantee expressivity [4] of the circuit, i.e., the circuit's ability to prepare the ground state of any choice of target Hamiltonian. For a full technical review of methods and best practices for VQE, we refer the reader to [33].

4.3 Quantum Annealing
Quantum annealing (QA) is a metaheuristic quantum optimization algorithm inspired by Adiabatic Quantum Optimization (AQO) and Adiabatic Quantum Computing (AQC). The algorithm starts by initializing a system of qubits in a simple-to-prepare optimum (the ground state of the initial Hamiltonian), which is slowly evolved to represent a combinatorial optimization problem expressed as an Ising model or QUBO (the final Hamiltonian) [14]. At the end of this process, the qubits' states represent a possible solution to the combinatorial optimization problem in the final Hamiltonian, with a nonzero probability of being a global optimum (i.e., the ground state of the final Hamiltonian). The specific evolution parameters which govern the path from initial to final Hamiltonian dictate the success of the QA algorithm, specifically the probability of measuring a ground state of the final Hamiltonian. For a full description of the physics and implementation of QA, we refer the reader to [11]; for a review of applications tested using QA, we refer the reader to [37]. Quantum annealing (or any QUBO solving method) can be expanded upon via the Iterative Heuristic Solver (IHS), originally developed by Rosenberg et al. [28]. The general idea of this metaheuristic procedure is to split a QUBO problem into several smaller subproblems and solve these iteratively instead of solving the entire problem at once. Suppose we aim to minimize $x^T Q x$, where $Q \in \mathbb{R}^{n \times n}$ is the QUBO matrix and $x \in \{0, 1\}^n$ the binary solution vector. In the basic version of the IHS, we start with a randomly generated initial configuration of $x$ and repeatedly perform the following steps:
1. Randomly choose k variables from x and fix the other n − k variables.
2. Optimize over the k chosen variables using the underlying QUBO solver.
3. Check for an improvement in solution quality over the previous iteration.

If no improvement is detected in step 3 for several iterations, we assume convergence and output the final state of the vector x as the optimized solution. It is to be noted that an iterative metaheuristic like this has, by design, a significantly larger runtime than other, more direct approaches to solving a QUBO problem. However, the IHS avoids having to embed the full QUBO matrix of a problem, which is potentially large, on a quantum annealer. This significantly increases the size of the problems that can be solved within hardware limits, which makes it worthwhile to test the approach alongside our other presented methods.
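The three steps above can be sketched as follows. This is only an illustration of the loop structure: the underlying QUBO solver is replaced by an exhaustive search over the k free variables (a stand-in for simulated or quantum annealing), and the function names, the `patience` convergence rule and the toy matrix `Q` are hypothetical.

```python
import random
from itertools import product


def qubo_energy(Q, x):
    """Return x^T Q x for a binary vector x."""
    n = len(x)
    return sum(Q[a][b] * x[a] * x[b] for a in range(n) for b in range(n))


def solve_subproblem(Q, x, free):
    """Stand-in QUBO solver: exhaustively optimize the `free` variables of x."""
    best_x, best_e = x[:], qubo_energy(Q, x)
    for bits in product((0, 1), repeat=len(free)):
        cand = x[:]
        for idx, bit in zip(free, bits):
            cand[idx] = bit
        e = qubo_energy(Q, cand)
        if e < best_e:
            best_x, best_e = cand, e
    return best_x, best_e


def iterative_heuristic_solver(Q, k, patience=10, max_iter=200, seed=0):
    """Basic IHS loop: repeatedly re-optimize k randomly chosen variables."""
    rng = random.Random(seed)
    n = len(Q)
    x = [rng.randint(0, 1) for _ in range(n)]            # random initial configuration
    energy = qubo_energy(Q, x)
    stale = 0
    for _ in range(max_iter):
        free = rng.sample(range(n), k)                   # step 1: choose k variables
        x_new, e_new = solve_subproblem(Q, x, free)      # step 2: optimize them
        if e_new < energy:                               # step 3: check for improvement
            x, energy, stale = x_new, e_new, 0
        else:
            stale += 1
            if stale >= patience:                        # assume convergence
                break
    return x, energy


# Hypothetical 4-variable QUBO: reward each variable, penalize adjacent pairs.
Q = [
    [-1, 2, 0, 0],
    [0, -1, 2, 0],
    [0, 0, -1, 2],
    [0, 0, 0, -1],
]
x_opt, e_opt = iterative_heuristic_solver(Q, k=2)
print(x_opt, e_opt)
```

Because each subproblem solve keeps the current configuration as a candidate, the energy never increases from one iteration to the next.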
5 Results
In this section, we first discuss the different problem instances considered in this work for the multi-knapsack problem, along with a description of the measures used to assess the performance of QAOA, VQE, and quantum annealing. Subsequently, we present detailed results obtained via benchmarking these algorithms over different scenarios. It must be noted that in this work we have executed the QAOA and VQE algorithms using state-vector simulations, while quantum annealing was carried out on real quantum devices. Additionally, we present the comparative performance of quantum annealing versus classical heuristic methods.

5.1 Problem Instances
Table 1. Benchmark Scenarios.

Scenario                1    2    3    4
Knapsacks               1    2    2    2
Items                   8    4    5    6
Qubits                 12   14   16   19
Optimal Sol. (v_opt)   22   12   13   13
Prefactors (A = B)     32   10    8    8
To compare the performance of the various optimization methods and algorithms, four different instances of the knapsack problem are considered, increasing in problem size and complexity, which are listed in Table 1. The optimal solution was derived classically for each problem size using 0/1 integer programming. The maximal value of the optimal item distribution and the prefactors A and B of the penalty terms in the Hamiltonian in Eq. (4) are given in the last two rows of Table 1. The parameters for the problem instances were initially chosen randomly and partly modified afterwards in order to ensure different levels of complexity.

5.2 Measures for Solution Quality
When solving the knapsack problem with a quantum program, each solver's output is a probability distribution of quantum states represented as bitstrings, characterized by executing the program several times and taking measurements. To inspect the validity of the solutions, the sum of all penalty terms in the QUBO is evaluated. If the total penalty value is equal to zero, the bitstring is considered a valid solution. Note that in this way, bitstrings encoding an item distribution with a total weight not exceeding the knapsack capacity ($\sum_{j=0}^{N-1} w_j \cdot x_{i,j} \le c_i$) are considered invalid if the corresponding slack bits $y_{i,b}$ are not correct (i.e., $H_{\text{capacity}} > 0$). The best solution $x_{\min}$ is defined as the bitstring with the lowest energy. This solution is characterized by the total value of the corresponding item distribution $v_{\text{tot}}$, divided by the value of the known optimal solution $v_{\text{opt}}$. This relative value, also known as the approximation ratio, is averaged over several runs $N_{\text{run}}$ for all valid best solutions with the same parameters to obtain the mean closeness to optimum $C_{\text{opt}}(x_{\min})$, such that

\[ C_{\text{opt}}(x_{\min}) = \frac{1}{N_{\text{run}}} \sum_{r=1}^{N_{\text{run}}} \frac{v_{\text{tot}}(x_{\min,r})}{v_{\text{opt}}} \cdot 100. \tag{7} \]
In practice, it is useful to find a distribution of items that is not necessarily optimal, but still has a high $C_{\text{opt}}$ value. We consider all valid bitstrings $x$ within a probability distribution which have a $C_{\text{opt}}$ value above a certain threshold $C_{\text{lim}}$, which we chose to be 90%. For each set of parameters, the amplitudes¹ $a(x)$ of all $x$ with $C_{\text{opt}}(x) \ge C_{\text{lim}}$ are added and the sums are averaged over all runs. Thus, we define the overlap of the sampled solutions with the 0.90-opt solutions as $O_{90}$, where

\[ O_{90} = \frac{1}{N_{\text{run}}} \sum_{r=1}^{N_{\text{run}}} \; \sum_{x \,|\, C_{\text{opt}}(x) \ge C_{\text{lim}}} a_r(x). \tag{8} \]
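The two measures in Eqs. (7) and (8) can be computed from sampled bitstrings as in the sketch below. The data layout (one list of sample records per run, each with hypothetical `count`, `energy`, `value` and `valid` fields) is an assumption made for illustration. Following the footnote, the amplitude of a bitstring is taken as the square root of its sampled probability, and runs whose lowest-energy bitstring is invalid are skipped in the closeness average, which is one possible reading of "all valid best solutions".

```python
from math import sqrt


def closeness_to_optimum(runs, v_opt):
    """Eq. (7): mean of 100 * v_tot(x_min) / v_opt over runs with a valid best solution."""
    ratios = []
    for samples in runs:
        best = min(samples, key=lambda s: s["energy"])   # lowest-energy bitstring
        if best["valid"]:
            ratios.append(100.0 * best["value"] / v_opt)
    return sum(ratios) / len(ratios) if ratios else float("nan")


def overlap_90(runs, v_opt, c_lim=90.0):
    """Eq. (8): summed amplitudes of valid bitstrings with C_opt >= c_lim, averaged over runs."""
    total = 0.0
    for samples in runs:
        shots = sum(s["count"] for s in samples)
        for s in samples:
            if s["valid"] and 100.0 * s["value"] / v_opt >= c_lim:
                total += sqrt(s["count"] / shots)        # amplitude = sqrt(probability)
    return total / len(runs)


# Hypothetical sampling results of two runs with 100 shots each.
runs = [
    [
        {"count": 4, "energy": -22, "value": 22, "valid": True},
        {"count": 1, "energy": -20, "value": 20, "valid": True},
        {"count": 45, "energy": -15, "value": 15, "valid": True},
        {"count": 50, "energy": 30, "value": 25, "valid": False},
    ],
    [
        {"count": 9, "energy": -20, "value": 20, "valid": True},
        {"count": 91, "energy": -10, "value": 10, "valid": True},
    ],
]
print(closeness_to_optimum(runs, v_opt=22), overlap_90(runs, v_opt=22))
```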
Note that the best solution may have a high relative value while $O_{90}$ remains small, since there may be no further solutions with a relative value above 90%.

5.3 Results for QAOA, WS-QAOA and WS-Init-QAOA
In this section, we discuss detailed results of the QAOA, WS-QAOA and WS-Init-QAOA, implemented using Qiskit [13]. For all results presented here, the quantum circuits were sampled with n_samp = 10,000 shots. The classical optimization of γ and β is carried out with the off-the-shelf optimizer of the SciPy Python library using the sequential least squares programming (SLSQP) algorithm [29], where the maximal number of iterations was limited to 10,000.

¹ The amplitude of a state x is defined as the square root of its probability in the sampled solution.

[Figure: three panels, (a) Standard QAOA, (b) WS-QAOA and (c) WS-Init-QAOA, each plotting the overlap for 0.90-opt, O90 (axis range 0 to 0.16), against the number of qubits (12, 14, 16, 19) for p = 1 to p = 6 layers.]
Fig. 2. Standard QAOA, Warm-Start QAOA and WS-Init-QAOA Overlaps with 0.90-opt Results for Different Problem Sizes and Number of QAOA Layers.
We carried out tests for the three QAOA-based algorithms for the four different problem instances with a number of QAOA layers ranging from p = 1 to p = 6. For each problem instance, we randomly initialize the γ and β angles in the range [−π, π]. Additionally, to better assess the convergence of results, we repeat each test 20 times and report the average and the standard deviation values. Figure 2 presents the 0.90-opt overlap O90 results for QAOA, WS-QAOA and WS-Init-QAOA for problem instances ranging from 12 to 19 qubits and for varying numbers of QAOA layers p, along with their standard deviation. The overlap values evidently decrease with increasing problem size, reaching ≈ 0 for the last scenario with 19 qubits for standard QAOA. Increasing the number of layers yields little to no improvement in the solution values. This can be seen as an indicator that adding more layers does not increase the subspace of states explored by the QAOA circuit. The overlap values obtained for WS-QAOA (Fig. 2b) are much better than for QAOA. As discussed in Sect. 4.1.1, WS-QAOA differs from the standard QAOA in two aspects, the initial state $|\phi^*\rangle$ and the mixing Hamiltonian $\hat{H}_M^{ws}$ whose ground state is $|\phi^*\rangle$. Our results show that these two modifications lead to a better performance than standard QAOA. Remarkably though, WS-Init-QAOA (Fig. 2c) seems to perform much better than WS-QAOA for all the instances. This is an interesting result considering that the mixer Hamiltonian for the WS-Init-QAOA drives the quantum state toward the uniform superposition state (as in standard QAOA). On the other hand, the mixer Hamiltonian for WS-QAOA is designed to bring the quantum state to the optimum continuous solution. Regardless, it is apparent that just improving the choice of initial state leads to even better results than modifying the mixer operator along with the initial state. We would like to emphasize this aspect and suggest that the good solutions obtained by WS-QAOA are mainly due to the classical pre-processing (i.e., the continuous solutions to the relaxed QUBO), and not due to the modified mixing Hamiltonian ($\hat{H}_M^{ws}$).

[Figure: three panels, (a) Standard QAOA, (b) WS-QAOA and (c) WS-Init-QAOA, each plotting the closeness to optimum (%) (axis range 50 to 100) against the number of qubits (12, 14, 16, 19) for p = 1 to p = 6 layers.]
Fig. 3. Closeness to Optimum of the Best Solutions Obtained from Standard QAOA, Warm-Start QAOA and WS-Init-QAOA for Different Problem Sizes and Number of QAOA Layers.
As far as the closeness of the best solution to the optimum is concerned, both warm-start approaches outperform standard QAOA, as shown in Fig. 3. While the standard QAOA values (Fig. 3a) drop below the 90% mark already for 14-qubit problems, the WS-QAOA (Fig. 3b) maintains 90% of the optimal solution even for 19 qubits, and the WS-Init-QAOA results remain well above 90% (Fig. 3c). The number of layers seems to have only a minor influence on the quality of results. Regarding the stability of the results, we notice that the standard deviation of both warm-start variants is smaller than that of the standard QAOA. WS-Init-QAOA again seems to provide better values than the WS-QAOA. In summary, while the number of QAOA layers does not have a considerable influence, using warm-start QAOA increases the quality of the results. Moreover, from the comparison of the two warm-start approaches we see that choosing a better initial state for the quantum circuit leads to significant improvements of the result quality. On the other hand, the additional modification of the mixing Hamiltonian incorporated in the WS-QAOA algorithm does not seem to have a distinct effect, since the WS-Init-QAOA approach, which just uses a different initial state, gives the best results. Other variations of QAOA, including better initial rotation angles as suggested in [38], need to be looked at to fully understand the true potential of QAOA for multi-knapsack problems.

5.4 Results from VQE
The ansatz circuit chosen for the VQE experiments is adopted from the work of Liu et al. [20] and consists of parameterized single-qubit rotations and two-qubit entangling gates without parameterization, schematically shown in Fig. 4.
Fig. 4. A Quantum Circuit for 4-Qubit VQE with a Single Layer.
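As a rough illustration of such a layered ansatz, the sketch below prepares a trial state from parameterized single-qubit rotations followed by unparameterized two-qubit entangling gates and evaluates the expectation value of a diagonal cost Hamiltonian. The specific gate choice (RY rotations and a linear chain of CNOT gates) and the hypothetical `costs` are assumptions and may differ from the circuit of Liu et al. [20]; a plain random search over the angles stands in for the classical optimizer.

```python
import math
import random


def apply_ry(state, q, theta):
    """Apply RY(theta) to qubit q of a real state vector."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    mask = 1 << q
    out = state[:]
    for z in range(len(state)):
        if not z & mask:
            a0, a1 = state[z], state[z | mask]
            out[z] = c * a0 - s * a1
            out[z | mask] = s * a0 + c * a1
    return out


def apply_cnot(state, control, target):
    """Apply a CNOT gate: flip `target` on basis states where `control` is 1."""
    out = state[:]
    for z in range(len(state)):
        if z & (1 << control):
            out[z ^ (1 << target)] = state[z]
    return out


def ansatz_state(thetas, n, layers):
    """Prepare U(theta)|0...0> with `layers` blocks of RY rotations and a CNOT chain."""
    state = [0.0] * (2 ** n)
    state[0] = 1.0
    angles = iter(thetas)                      # expects n * layers angles
    for _ in range(layers):
        for q in range(n):
            state = apply_ry(state, q, next(angles))
        for q in range(n - 1):
            state = apply_cnot(state, q, q + 1)
    return state


def expectation(state, costs):
    """Expectation value of a diagonal cost Hamiltonian for a real state vector."""
    return sum(a * a * c for a, c in zip(state, costs))


def random_search(costs, n, layers, trials=200, seed=1):
    """Crude stand-in for the classical optimizer: keep the best random parameter set."""
    rng = random.Random(seed)
    best_e, best_t = float("inf"), None
    for _ in range(trials):
        thetas = [rng.uniform(0, 2 * math.pi) for _ in range(n * layers)]
        e = expectation(ansatz_state(thetas, n, layers), costs)
        if e < best_e:
            best_e, best_t = e, thetas
    return best_e, best_t


# Hypothetical 3-qubit cost values.
costs = [0, -1, -2, 3, -1, 2, 1, 4]
best_e, best_t = random_search(costs, n=3, layers=1)
print(best_e)
```

In this sketch each layer contributes n angles, so the number of parameters grows linearly with the number of layers.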
We conducted the experiments with p = 1 and p = 2 layers, sampling 10,000 shots from the corresponding quantum circuits. All results are averaged over 20 runs, and the error bars in Fig. 5 indicate the standard deviation. Parameters are randomly initialized in the range [0, 2π], and the COBYLA optimizer is used for the classical parameter tuning. Compared to the QAOA results, the VQE reaches slightly worse approximation rates in general. However, when good solutions are found, they are sampled with higher probability compared to QAOA. The quality of the solutions improves when we take an ansatz circuit with 2 layers, although this results in almost twice as many parameters and thus a substantial calculation overhead on the parameter tuning side.
[Figure: two panels for p = 1 and p = 2, (a) Overlap with 0.90-opt, O90, and (b) Closeness to Optimum (%), each plotted against the number of qubits (12, 14, 16, 19).]
Fig. 5. Overlap and Closeness Results for VQE Algorithm for the Multi-Knapsack Problem for 1 and 2 Layers Circuit Ansatz.
As can be seen from the high number of iterations needed for convergence of the classical optimizer in Table 3, VQE exhibits a longer runtime than QAOA, even when one accounts for the simulation time (see Sect. 5.5). These runtimes may serve as a motivation to look for strategies which reduce the number of classical iterations. Overall, the benefits of VQE, i.e., hardware-adapted quantum circuits which suffer less from noise, can only become apparent when running the algorithms on actual quantum hardware. Thus, despite the worse approximation rates and longer run times, VQE deserves further exploration in the context of quantum optimization algorithms.
5.5 Runtime Estimation on Quantum Devices for QAOA and VQE
As described in the previous sections, the results for QAOA and VQE are obtained by simulations of the quantum circuits. In order to assess the performance of each quantum algorithm and provide a meaningful comparison with the results from quantum annealing, we estimate the runtimes for standard QAOA and VQE on a quantum processing unit (QPU). A simple model to describe the total runtime $T$ of a variational quantum algorithm is explained in [36] and reads

\[ T = n_{\text{iter}} \cdot \left[ n_{\text{samp}} \cdot (t_{\text{circ}} + t_{\text{meas}}) + t_{\text{opt}} + t_{\text{comm}} \right], \tag{9} \]

where $n_{\text{iter}}$ denotes the number of iterations of the optimizer, $n_{\text{samp}}$ is the number of samples taken to measure the quantum state and $t_{\text{circ}}$, $t_{\text{meas}}$, $t_{\text{opt}}$ describe the times to execute all gates of the quantum circuit, measure all qubits and perform the classical optimization, respectively.

Table 2. Gate Execution Times from Qiskit Backend FakeBrooklyn.

Gate   Exec. time (ns)
RZ     0
SX     35.56
X      35.56
CX     370 ± 80
The communication time between the quantum and the classical computer used for optimization is represented by $t_{\text{comm}}$. To provide a rough estimate of the expected runtimes on a QPU, we assume that the times for classical optimization, measurement and communication remain in a similar range as for the simulation and just consider how $t_{\text{circ}}$ is changed. The circuit execution time on a QPU, $\tilde{t}_{\text{circ}}$, can be calculated from the gate execution times and the structure of the circuit. As an example, we choose the IBM-Q Brooklyn device consisting of 65 qubits. The properties such as the topology and gate execution times are estimated using the FakeBrooklyn backend from the FakeProvider module of Qiskit. The corresponding execution times for the native gate set are averaged over all qubits and the mean values are presented in Table 2. The quantum circuits for QAOA and VQE are then transpiled to the FakeBrooklyn backend using the standard Qiskit transpiler with optimization level 3 [13]. For each circuit, the execution time is derived from a schedule describing the execution of all gates on all qubits, taking into account parallel execution of gates as well as constraints on the timings imposed by two-qubit gates. The QPU runtimes were estimated for QAOA and VQE with p = 1, averaging over 20 compilation runs to account for the stochastic placement of SWAP gates within the standard Qiskit transpilation pass. The mean runtimes $\tilde{t}_{\text{circ}}$ for a single execution of the circuit and their standard deviation are shown in Fig. 6, along with the corresponding mean circuit depths. As described in Sect. 4.2, the VQE algorithm needs fewer gates, resulting in a mean depth that is smaller by about a factor of 2 compared to QAOA. Due to the different kinds of gates being used, the circuit runtimes are even reduced by a factor of 4.

[Figure: two panels, (a) QAOA Runtime and Circuit Depth and (b) VQE Runtime and Circuit Depth, each showing the runtime per circuit execution (ms) and the circuit depth for a single layer against the number of qubits (12, 14, 16, 19).]
Fig. 6. Mean Circuit Execution Times $\tilde{t}_{\text{circ}}$ on an IBM-Q Device Estimated from the Gate Times and Mean Circuits Depths for QAOA and VQE for p = 1.

To estimate the overall runtime $\tilde{T}$ on the QPU, we replace $t_{\text{circ}}$ in Eq. (9) with $\tilde{t}_{\text{circ}}$, keeping the other contributions unchanged. The execution times and the overall runtimes on the simulator are obtained from a Python profiler, averaging over 5 runs with random initial parameters. In Fig. 7, the resulting runtimes for the simulator are compared with the estimate on the QPU. For QAOA, the runtimes on the QPU are about 30 times larger than on the simulator. Both runtimes show a similar dependence on the problem size and complexity, following the trend observed in Fig. 6a for the circuit depth and circuit execution time. In general, we observed no considerable difference in the simulator runtimes and number of optimizer iterations $n_{\text{iter}}$ between standard QAOA, WS-QAOA and WS-Init-QAOA. The overall runtimes for VQE on the simulator and QPU differ only by a factor of 2–3 and are both larger than for QAOA. While the QPU circuit runtimes $\tilde{t}_{\text{circ}}$ are smaller for VQE, the mean circuit runtimes on the simulator lie in the same range as for QAOA, with $t_{\text{circ}}^{\text{vqe}} \sim (2.5 - 4.2)\,\mu\text{s}$ and $t_{\text{circ}}^{\text{qaoa}} \sim (0.6 - 3.9)\,\mu\text{s}$. However, the VQE algorithm contains more parameters to optimize and thus it takes more iterations to find the optimal parameter values, as shown by the values in Table 3. This increases the runtimes on the simulator as well as the QPU estimate accordingly.
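The estimation procedure can be sketched in two small functions: an as-soon-as-possible schedule that derives a circuit duration from per-gate times, and the runtime model of Eq. (9). This is a simplified stand-in for the schedule produced by the Qiskit transpiler, not the procedure used for the reported numbers. The gate list, the circuit time of 60 μs and the zero values for measurement, optimization and communication times are hypothetical placeholders; only the gate durations are the mean values from Table 2.

```python
# Mean gate durations in nanoseconds, taken from Table 2.
GATE_NS = {"rz": 0.0, "sx": 35.56, "x": 35.56, "cx": 370.0}


def circuit_duration(gates, durations=GATE_NS):
    """ASAP schedule: each gate starts once all of its qubits are free.

    `gates` is a list of (name, qubits) tuples in program order.
    """
    free_at = {}
    for name, qubits in gates:
        start = max((free_at.get(q, 0.0) for q in qubits), default=0.0)
        end = start + durations[name]
        for q in qubits:
            free_at[q] = end
    return max(free_at.values(), default=0.0)


def total_runtime(n_iter, n_samp, t_circ, t_meas, t_opt, t_comm):
    """Eq. (9): T = n_iter * [n_samp * (t_circ + t_meas) + t_opt + t_comm]."""
    return n_iter * (n_samp * (t_circ + t_meas) + t_opt + t_comm)


# Hypothetical 3-qubit circuit fragment; single-qubit gates on different qubits run in parallel.
gates = [
    ("sx", (0,)),
    ("sx", (1,)),
    ("cx", (0, 1)),
    ("rz", (1,)),
    ("cx", (1, 2)),
    ("x", (0,)),
]
duration_ns = circuit_duration(gates)

# Placeholder inputs: 80 optimizer iterations, 10,000 shots, 60 microseconds per circuit,
# all other contributions set to zero for illustration.
runtime_s = total_runtime(n_iter=80, n_samp=10_000, t_circ=60e-6,
                          t_meas=0.0, t_opt=0.0, t_comm=0.0)
print(duration_ns, runtime_s)
```

The schedule makes explicit why two-qubit gates dominate the duration: they occupy both qubits for the full CX time, so any gate sharing a qubit has to wait.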
[Figure: two panels, (a) Overall Runtimes for QAOA and (b) Overall Runtimes for VQE, each comparing the total runtime (seconds) on the simulator with the estimation on IBM-Q against the number of qubits (12, 14, 16, 19).]
Fig. 7. Mean Overall Runtimes on the Simulator and Estimated on an IBM-Q Device for QAOA and VQE with p = 1. The Error Bars are Computed by Error Propagation.

Table 3. Mean Number of Optimizer Iterations for QAOA and VQE for p = 1.

Num. qubits   n_iter (QAOA)   n_iter (VQE)
12            80 ± 20         700 ± 100
14            70 ± 10         900 ± 100
16            56 ± 7          1000 ± 100
19            90 ± 20         1400 ± 200

5.6 Results from Annealing
We tested both simulated and quantum annealing, along with the IHS with simulated annealing. For quantum annealing, we focused on two of the devices available on Amazon Braket: D-Wave 2000Q (2,048 qubits) and the larger D-Wave Advantage 6.1 (5,760 qubits). For the IHS, we used simulated annealing with 50 iterations in a single run and set the number of optimization parameters to 12. All annealing algorithms were executed with 1,000 reads and repeated ten times to evaluate the algorithms' stability. The O90 results displayed in Fig. 8a show a higher overlap for simulated annealing, between 0.25 ± 0.01 in the scenario with 19 qubits and 0.43 ± 0.01 in the scenario with 14 qubits, compared to the quantum annealing options, which perform similarly and do not exceed 0.14 ± 0.05 (D-Wave 2000Q in the scenario with 14 qubits). For IHS, the 0.90-opt overlap cannot be computed in a meaningful way due to the nature of the algorithm. Figure 8b displays the closeness of the found solution relative to the optimum. The simulated annealing and IHS approaches find optimal solutions for each of the tested knapsack instances. With quantum annealing, the optimal solutions were obtained only for Scenario 2 with 14 qubits. For the other scenarios, D-Wave Advantage 6.1 exhibits solution qualities from 95.4 ± 7.4% for the largest problem instance up to 98.2 ± 3.8% for Scenario 1 and outperforms D-Wave 2000Q, which yields 88.5 ± 11.0% to 94.5 ± 6.4%.

[Figure: four panels plotted against the number of logical qubits (12, 14, 16, 19) for D-Wave Advantage 6.1, D-Wave 2000Q, Simulated Annealing and IHS with Sim. Anneal.: (a) Overlap with 0.90-opt, (b) Closeness to Optimum (%), (c) Number of Physical Qubits, (d) Runtime (Seconds).]
Fig. 8. Overlap, Closeness to Optimum, Number of Required Physical Qubits for Annealing and Runtime.

Figure 8c displays the number of physical qubits needed to embed the logical qubits from the QUBO, compensating for the limited connectivity of the physical qubits. As expected, the D-Wave 2000Q requires more physical qubits to embed the same QUBO than the D-Wave Advantage 6.1, as its architecture has lower connectivity [3]. Since finding the embedding is done by a non-deterministic algorithm, the required number of physical qubits on each device varies between runs of the same scenario. In scenarios with two knapsacks (Scenarios 2–4), the number of physical qubits grows proportionally to the number of logical qubits needed to model the respective problem. However, in Scenario 1 more physical qubits are required to embed the QUBO than in Scenario 2, even though fewer logical qubits are necessary to represent the problem with one knapsack and eight items (see Table 1). This shows the importance of considering the physical properties of the particular quantum hardware and not solely looking at the theoretical number of logical qubits.

The runtimes for various annealing approaches are compared in Fig. 8d. For simulated annealing, the average runtimes are below 1 s for all scenarios, while the runtimes of the IHS are the longest of all annealing approaches considered in
this work, ranging from 10.8 ± 0.3 s to 11.9 ± 3.1 s. With runtimes between 6.8 ± 0.7 s and 9.6 ± 2.2 s (D-Wave Advantage 6.1) and between 8.4 ± 1.7 s and 10.2 ± 1.7 s (D-Wave 2000Q), quantum annealing approaches are about a factor of 3–7 faster than the QPU estimates for QAOA and about 10–30 times faster than the estimates for VQE presented in the previous section. The quantum annealing runtimes grow only moderately with the problem size compared to the pronounced increase observed especially with VQE. Note that in our analysis of quantum annealing runtimes, we also include the communication overhead between the user and the QPU as well as measurement times, which were not included in the estimation for QAOA and VQE. Thus, the runtimes for the gate-based approaches are expected to increase even more in practice, and also when more layers are added to the circuits. In general, using a different hardware platform will also influence the runtime, since the gate times depend on the physical realization of the QPU. The circuit runtime also has to be compared to the coherence time of the qubits: while the gate execution times on other platforms such as trapped ions or cold atoms are in general longer than for superconducting QPUs such as the IBM-Q device considered here, those platforms usually also feature longer coherence times. Moreover, realizations based on ions or atoms exhibit better connectivity between the qubits, which in turn reduces the number of gates needed to run the circuit on the QPU and thus also reduces the amount of noise introduced on NISQ devices [36]. Eventually, the absolute values of the runtimes have to be judged along with the quality of the delivered results within the specific context of the application.

5.7 Comparison of the Quantum Algorithms
The result quality provided by the different quantum solvers is summarized in Fig. 9, where the results for the gate-based approaches are averaged over different numbers of layers. While the result quality decreases with the problem size for all approaches, this effect is most pronounced for VQE and standard QAOA. Overall, the best results are delivered by WS-Init-QAOA and QA on the D-Wave Advantage 6.1 device, which show quite comparable values for the 0.90-opt overlap and the closeness to optimum. As discussed in the previous section, however, the runtimes for QA on a QPU are shorter than the corresponding estimates for QAOA. When comparing empirical results from different quantum algorithms, it is important to consider how the differences in implementation can affect both the results and their interpretation. For quantum annealers, it is known that noise, embedding overhead, and high precision requirements are all detrimental to performance. Simulated QAOA and VQE do not suffer from these issues, but rather are limited by the quality of the parameter settings found with classical optimization and by statistical sampling accuracy. Conversely, quantum annealing samples are known to primarily populate local minima due to classical effects after the freeze-out point [2], which is not the case for QAOA and VQE.
Fig. 9. Average Overlap and Closeness to Optimum Values for All the Quantum Approaches, over Different Problem Sizes. Panels: (a) Overlap with 0.90-opt; (b) Closeness to Optimum. Both panels are plotted against the number of qubits (12, 14, 16, 19) for QAOA, WS-QAOA, WS-Init-QAOA, VQE, D-Wave Advantage 6.1 and D-Wave 2000Q.
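As a rough illustration of the two quality measures compared in this section, the sketch below computes a closeness-to-optimum value and a 0.90-opt overlap from a list of sampled objective values. The definitions used here are assumptions made for illustration only (closeness as the best sampled value relative to the known optimum, overlap as the fraction of reads that reach at least 90% of the optimum); the function names and the sample values are hypothetical and are not taken from the experiments.

```python
from typing import Sequence


def closeness_to_optimum(values: Sequence[float], optimum: float) -> float:
    """Best sampled objective value as a percentage of the known optimum.

    Assumed definition, for illustration only.
    """
    if not values or optimum <= 0:
        raise ValueError("need at least one read and a positive optimum")
    return 100.0 * max(values) / optimum


def overlap_x_opt(values: Sequence[float], optimum: float, x: float = 0.90) -> float:
    """Fraction of reads whose objective value is at least x * optimum.

    Assumed definition, for illustration only.
    """
    if not values or optimum <= 0:
        raise ValueError("need at least one read and a positive optimum")
    hits = sum(1 for v in values if v >= x * optimum)
    return hits / len(values)


# Hypothetical objective values of ten reads; 0 stands for an infeasible read.
reads = [0, 40, 85, 91, 95, 100, 60, 92, 0, 88]
optimum = 100

print("closeness to optimum (%):", closeness_to_optimum(reads, optimum))
print("0.90-opt overlap:", overlap_x_opt(reads, optimum))
```

Under these assumed definitions, a solver can reach a closeness of 100% while still having a low overlap, which is why the two measures are reported side by side.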
6 Conclusion and Outlook
The aim of this work was to compare different approaches for solving the multi-knapsack problem in view of practical applications. We have defined appropriate measures for this comparison, such as the 0.90-opt overlap, and have also discussed the implementation of the considered quantum algorithms on real hardware as well as their limitations. The direct comparison of the most common gate-based algorithms with quantum annealing and simulated annealing provided in this work will help to better estimate the potential of these approaches for solving more complex optimization problems such as the knapsack problem in the future. Since practitioners might not have access to various types of quantum computing hardware, the suggested estimation of runtimes on real hardware derived from simulations of quantum circuits can be useful for carrying out benchmarks of quantum algorithms. Our results show that adapting a standard algorithm such as QAOA can considerably improve the quality of the delivered results. Comparable advantages might be achieved for VQE by finding initialization and warm-starting strategies similar to those for QAOA. Deriving optimized starting angles for QAOA (or VQE) as described in [38], or tailoring the schedule of quantum annealing to minimize transfer to higher energy states [38], constitute other promising approaches. Moreover, we can conclude that variational gate-based approaches will profit from better optimization strategies for the circuit parameters to lower the number of iterations, especially in the case of VQE. As future work, it would be of great interest to study the QAOA and VQE algorithms incorporating the ideas from Koretsky et al. [18] and Braine et al. [1], which provide an alternative and qubit-efficient way of formulating inequality constraints in a QUBO model without requiring binary slack bits. These techniques seem promising, since the convergence of slack bits in all quantum algorithms based on QUBOs is generally hard to achieve.
Moreover, it would also be interesting to study and implement the qubit-efficient encoding of optimization problems, especially with VQE [32]. Tackling realistic problems with millions of qubits, as described in Sect. 2, is out of scope for the currently available NISQ devices. Considering the roadmaps of various quantum hardware vendors, however, quantum computers with up to 10,000 qubits might become available within two to three years. Thus, using strategies to find optimized problem formulations and algorithms with a reduced number of qubits and quantum operations will be of key importance to fit realistic problems onto those intermediate-sized quantum computers. In addition, matching the design of algorithms and quantum computing hardware is seen as another quite promising approach in the NISQ era.
References
1. Braine, L., Egger, D.J., Glick, J.R., Woerner, S.: Quantum algorithms for mixed binary optimization applied to transaction settlement. IEEE Trans. Quantum Eng. 2, 1–8 (2019)
2. D-Wave Systems Inc.: Annealing Implementation and Controls (2022)
3. D-Wave Systems Inc.: D-Wave QPU Architecture: Topologies (2022)
4. Du, Y., Tu, Z., Yuan, X., Tao, D.: Efficient measure for the expressivity of variational quantum algorithms. Phys. Rev. Lett. 128(8), 080506 (2022)
5. Ebadi, S., et al.: Quantum optimization of maximum independent set using rydberg atom arrays. Science 376(6598), 1209–1215 (2022)
6. Egger, D.J., Mareček, J., Woerner, S.: Warm-starting quantum optimization. Quantum 5, 479 (2021)
7. Farhi, E., Goldstone, J., Gutmann, S.: A quantum approximate optimization algorithm. arXiv preprint arXiv:1411.4028 (2014)
8. Finžgar, J.R., Ross, P., Klepsch, J., Luckow, A.: QUARK: A Framework for Quantum Computing Application Benchmarking. arXiv preprint arXiv:2202.03028 (2022)
9. França, D.S., García-Patrón, R.: Limitations of optimization algorithms on noisy quantum devices. Nat. Phys. 17(11), 1221–1227 (2021)
10. Grover, L.K.: A fast quantum mechanical algorithm for database search. In: Proceedings of the Twenty-Eighth Annual ACM Symposium on Theory of Computing (STOC), pp. 212–219. Association for Computing Machinery, New York (1996)
11. Hauke, P., Katzgraber, H.G., Lechner, W., Nishimori, H., Oliver, W.D.: Perspectives of quantum annealing: methods and implementations. Rep. Prog. Phys. 83(5), 054401 (2020)
12. Henrich, J., Li, J., Mazuera, C., Perez, F.: Future-proofing the supply chain (2022)
13. IBM: QisKit SDK (2022)
14. Kadowaki, T., Nishimori, H.: Quantum annealing in the transverse ising model. Phys. Rev. E 58, 5355–5363 (1998)
15. Kandala, A., et al.: Hardware-efficient variational quantum eigensolver for small molecules and quantum magnets. Nature 549(7671), 242–246 (2017)
16. Karp, R.M.: Reducibility among combinatorial problems. In: Miller, R.E., Thatcher, J.W., Bohlinger, J.D. (eds.) Complexity of Computer Computations, pp. 85–103. Springer, Boston (1972). https://doi.org/10.1007/978-1-4684-2001-2_9
17. Kellerer, H., Pferschy, U., Pisinger, D.: Knapsack Problems, p. 461. Springer, Berlin (2004). https://doi.org/10.1007/978-3-540-24777-7
18. Koretsky, S., et al.: Adapting quantum approximation optimization algorithm (QAOA) for unit commitment. In: 2021 IEEE International Conference on Quantum Computing and Engineering (QCE), pp. 181–187 (2021)
19. Laabadi, S., Naimi, M., El Amri, H., Achchab, B.: The 0/1 multidimensional knapsack problem and its variants: a survey of practical models and heuristic approaches. Am. J. Oper. Res. 8, 395–439 (2018)
20. Liu, X., Angone, A., Shaydulin, R., Safro, I., Alexeev, Y., Cincio, L.: Layer VQE: a variational approach for combinatorial optimization on noisy quantum computers. IEEE Trans. Quantum Eng. 3, 1–20 (2022)
21. Lucas, A.: Ising formulations of many NP problems. Front. Phys. 2, 5 (2014)
22. Patvardhan, C., Bansal, S., Srivastav, A.: Quantum-inspired evolutionary algorithm for difficult knapsack problems. Memetic Comput. 7, 135–155 (2015)
23. Peruzzo, A., et al.: A variational eigenvalue solver on a photonic quantum processor. Nat. Commun. 5(1), 4213 (2014)
24. Pisinger, D.: Where are the hard knapsack problems? Comput. Oper. Res. 32(9), 2271–2284 (2005)
25. Pusey-Nazzaro, L., Date, P.: Adiabatic quantum optimization fails to solve the knapsack problem. arXiv preprint arXiv:2008.07456 (2020)
26. Quantum Technology and Application Consortium - QUTAC: Industry quantum computing applications. EPJ Quantum Technol. 8(25) (2021)
27. Quintero, R.A., Zuluaga, L.F.: Characterizing and benchmarking QUBO reformulations of the knapsack problem. Technical report, Department of Industrial and Systems Engineering, Lehigh University (2021)
28. Rosenberg, G., Vazifeh, M., Woods, B., Haber, E.: Building an iterative heuristic solver for a quantum annealer. Comput. Optim. Appl. 65(3), 845–869 (2016)
29. SciPy: SciPy documentation (2022)
30. Shor, P.W.: Algorithms for quantum computation: discrete logarithms and factoring. In: Proceedings 35th Annual Symposium on Foundations of Computer Science (FOCS), pp. 124–134 (1994)
31. Streif, M., Yarkoni, S., Skolik, A., Neukart, F., Leib, M.: Beating classical heuristics for the binary paint shop problem with the quantum approximate optimization algorithm. Phys. Rev. A 104(1), 012403 (2021)
32. Tan, B., Lemonde, M.-A., Thanasilp, S., Tangpanitanon, J., Angelakis, D.G.: Qubit-efficient encoding schemes for binary optimisation problems. Quantum 5, 454 (2021)
33. Tilly, J., et al.: The variational quantum eigensolver: a review of methods and best practices. Phys. Rep. 986, 1–128 (2022)
34. van Dam, W., Eldefrawy, K., Genise, N., Parham, N.: Quantum optimization heuristics with an application to knapsack problems. arXiv preprint arXiv:2108.08805 (2021)
35. Wilbaut, C., Hanafi, S., Salhi, S.: A survey of effective heuristics and their application to a variety of knapsack problems. IMA J. Manag. Math. 19(3), 227–244 (2008)
36. Wintersperger, K., Safi, H., Mauerer, W.: QPU-system co-design for quantum HPC accelerators. In: Schulz, M., Trinitis, C., Papadopoulou, N., Pionteck, T. (eds.) Architecture of Computing Systems, pp. 100–114. Springer, Cham (2022). https://doi.org/10.1007/978-3-031-21867-5_7
37. Yarkoni, S., Raponi, E., Bäck, T., Schmitt, S.: Quantum annealing for industry applications: introduction and review. Rep. Prog. Phys. 85(10), 104001 (2022)
38. Zhou, L., Wang, S.-T., Choi, S., Pichler, H., Lukin, M.-D.: Quantum approximate optimization algorithm: performance, mechanism, and implementation on near-term devices. Phys. Rev. X 10(2), 021067 (2020)
Analysis of Syntactic Errors of Novice Python Programmers in a Nigeria University
Philip Olu Jegede (1), Emmanuel Ajayi Olajubu (2), Olusegun Ojo Bakare (1, B), Isaac Oluwafemi Elesemoyo (3), and Josiah Owolabi (4)
(1) Institute of Education, Obafemi Awolowo University, Ile-Ife, Nigeria [email protected]
(2) Department of Computer Science and Engineering, Obafemi Awolowo University, Ile-Ife, Nigeria [email protected]
(3) Department of Computer Engineering, Elizade University, Ilara-Mokin, Ondo State, Nigeria
(4) Department of Educational Foundation, National Open University of University, Abuja, Nigeria
Abstract. This study analysed the syntactic errors of novice Python programmers, with the intention of generating an error log aimed at enhancing effective pedagogy in the training of beginner Python programmers. A sequential exploratory mixed-method design was adopted. Seventy conveniently sampled 200-level Python programming students in a South-Western Nigerian university participated in the study. Python code written in a summative assessment was analysed for errors, which were categorised by error type, by the concept in which the errors were committed and by the achievement level of the students who committed them. The resulting data were analysed using descriptive statistics as well as the Pearson Product Moment Correlation Coefficient (PPMC). Qualitative analysis through interviews was also employed to identify possible causes of such errors. The findings showed that 90 errors were committed and classified as follows: Missing symbols (30.0%), Invalid symbols (21.1%), Mismatched symbols (25.6%), Inappropriate naming (8.9%) and Excessive symbols (14.4%). Most of the errors committed were from Input and Output concepts (35.6%), followed by Loops (25.6%). Significant relationships were found between Excessive symbols and each of Invalid and Mismatched symbols; Inappropriate naming and Missing symbols were also significantly related. Classroom implications of the findings were discussed.
Keywords: Syntactic Errors · Novice Programmers · Python Programming · Coding · Pedagogy
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 K. Arai (Ed.): SAI 2023, LNNS 739, pp. 285–295, 2023. https://doi.org/10.1007/978-3-031-37963-5_20

1 Introduction
Teaching programming languages to beginners or novices remains challenging. One of the major efforts at enhancing the acquisition of programming skills and fostering achievement is to develop an error log that guides classroom instruction and consequently improves learning outcomes. Previous efforts at achieving this have largely been in Java [1, 2] and C [3], as well as in Python [4]. In Samoa, for example, Mow [1] conducted a study to understand the common errors that undergraduate students learning Java programming commit. Mow found that the novice programmers made errors which were categorized into five classes, namely: variable not found (59.1%), identifier expected (13.5%), class not found (7%), mismatched brackets (6.4%), and invalid method declaration (7%). However, Mow's study [1] was silent on how the categories of errors identified could be used to improve the teaching and learning of programming amongst novice programmers. The study nevertheless suggested that further research should use the identified errors to improve the pedagogical process in classroom programming.
Recently, Jegede et al. [2] investigated errors made by Nigerian undergraduate novice Java programmers and identified 598 errors that were common to beginners in Java programming. The errors were classified into five categories, namely: Invalid Symbol, Inappropriate Naming, Excessive Symbol, Mismatched Symbol and Missing Symbol. The authors found that Missing Symbol errors numbered 195 (33%), while the least committed errors were Invalid Symbols (11.9%). By concept, Methods and Classes housed the highest number of errors, 119 (35.8%), followed by Other Object Concepts (34%) and Decision Making (29.1%), with Looping (10.4%) housing the fewest.
Python is adjudged to be the fastest growing programming language in the world [5]. Similarly, in 2020 the Popularity of Programming Language (PYPL) index ranked Python the most popular language; it had grown the most over the preceding five years (19.4%), while Java had lost the most (−7.2%) [6]. Also, [7] posited that Python has a general-purpose nature and is extensively used for a wide range of tasks, including web development [8, 9], machine learning and data analysis [10].
However, Python, being a fast-growing programming language, has focused on developer experience and attempted to lower the barrier to programming so that even schoolchildren can write productive code [5]. Therefore, understanding how easily novice undergraduates can learn the Python programming language becomes imperative, in the light of motivating beginners to learn it. A few studies [4, 11, 12] have pointed out errors that novices often confront while learning to code in Python. Zhang et al. [11] mentioned errors such as undefined variables, or data pre-processing. Likewise, a study conducted by Case [4] on an animated pedagogical agent for assessing novice programmers identified five broad categories of Python programming errors, namely: syntax errors, semantic errors, logical errors, strategic errors, and errors arising from incorrect use of letter-case in Python. The errors identified by Case [4] could be regarded as common to other programming languages such as Structured Query Language (SQL) [13, 14], C and MATLAB [3], among others. This implies that novices learning SQL, C, Java and MATLAB could confront the errors identified by Case [4]. However, previous studies in Java [1, 2], for example, regarded syntax errors as the most prevalent errors that novice programmers confronted while learning programming. Thus, research in learning Python programming needs to extend the findings of Case [4] by analysing the syntax errors novice programmers make. It would seem that none of the previous works on Python errors specifically analysed syntax errors. Errors were largely analysed on an omnibus scale
which may obscure some salient information. The current study thus sets out to analyse the syntactic errors of novice Python programmers, with a view to developing pedagogical skills that could enhance the training of novice Python programmers. Five major research objectives guided this study. These were to:
1. identify the types of errors made by novice programmers in Python based on concepts;
2. determine the error types made by low achieving, average achieving and high achieving beginner programmers in Python;
3. identify errors made in Python programming across concepts based on the achievement level of the student programmers;
4. examine the relationships between each of the Python programming error types and the other programming error types; and
5. explore the novice Python programmers' perception of how to minimize the programming errors.
2 Methods
A sequential exploratory mixed-method design was adopted for the study. A class of 70 students at 200-level offering Python programming in a southwestern Nigerian university participated in the study. The study spanned one academic session. Introductory concepts in Python were taught (with theory and practicals) in the first semester, after which the students were examined. A summative assessment based on the semester examination provided the basis for the categorization of students into "low achieving" (between 0 and 49%), "average achieving" (between 50 and 59%) and "high achieving" (between 60 and 100%) in Python programming. In the second semester, students were taken through practical sessions in Python, and solutions to coding exercises were submitted through an online platform. The submissions covered concepts such as arithmetic operations, control statements, loops, input and output, and Python collections. Arithmetic operations involved equations and solely the use of mathematical operators; this also encompassed accepting data from the user on the command line using the input statement. Control statements involved making decisions. They are required when more than one outcome is possible and conditions must be tested to reach a conclusion. Loop statements, as in other programming languages, involved repeating an action over a range of items. The actions may involve decision making, arithmetic and other operations. Python collections made use of the Python list, tuple and dictionary data structures. These data structures are collections of items which are not necessarily of the same type. The dictionary, unlike the list and the tuple, is a key-value pair datatype. Identified errors were categorized into Invalid Symbols, Mismatched Symbols, Missing Symbols, Inappropriate Naming and Excessive Symbols. Invalid Symbols, according to Bringula et al.
[15], consist of errors such as "No period between class name and method name", capitalized keywords, and replacing ( and ) with < and > or in. In this work, Mismatched Symbols covered cases where an invalid variable was used as an iterator (for example, the statement for Len(x) in range(m)), where return was used under a control statement outside a function, or where an opening quote was paired with a closing parenthesis or an opening parenthesis with a closing quote. Missing Symbols covered situations such as calling the print function without the parentheses required in Python 3, or omitting the colon symbol used
when defining classes, functions, control statements and loop code blocks, or omitting an arithmetic operator during an arithmetic operation (such as omitting the multiplication symbol when multiplying a number by a variable, or by a variable followed by parentheses). Invalid Symbols covered cases where a variable was put in quotes when it was to be printed, where 'of' was used in place of the assignment symbol (=), where a variable that had not been defined was called, where arithmetic operators were used on the left-hand side of the assignment operator, where parentheses were opened without being closed, where an unavailable function was called, or where an arithmetic operation started or ended with an arithmetic operator. Inappropriate Naming covered circumstances where reserved words were used as identifiers and vice versa, or where a variable or function was called with the wrong casing. Excessive Symbols covered cases where excess parentheses, quotes or operators were used in a Python statement. Interference occurred when the style of another programming language known to the student (for example Java) was reflected in the code, for example using the Java for-each loop in Python. Quantitative analysis was done using descriptive statistics to determine the error distribution based on concepts and on the achievement levels of the beginner programmers. The Pearson Product Moment Correlation Coefficient was used to determine the relationships between error types. Interviews were also conducted with selected students (5 in all) to elicit their opinions on how programming errors can be minimized.
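To make the five categories concrete, the sketch below pairs each category with one or more short snippets of the kind described above and checks them with Python's built-in compile(). The snippets are illustrative assumptions, not code taken from the students' submissions, and the helper name syntax_error_message is hypothetical. Only errors that Python 3 rejects at compile time are shown; some of the errors listed in the text, such as calling an undefined variable or using the wrong letter-case, surface only when the code runs.

```python
# Illustrative snippets for each error category (assumed examples, not study data).
EXAMPLES = {
    "Missing Symbols": [
        'print "hello"',                      # print called without parentheses
        "for i in range(3)\n    print(i)",    # colon omitted after the loop header
    ],
    "Mismatched Symbols": [
        "for len(x) in range(m):\n    pass",  # function call used as the loop variable
        'print("hello)',                      # opening quote closed by a parenthesis
        "if True:\n    return 1",             # return under a control statement, outside a function
    ],
    "Invalid Symbols": [
        "If x > 1:\n    pass",                # capitalised keyword
        "x + 1 = 5",                          # arithmetic operator left of the assignment
    ],
    "Inappropriate Naming": [
        "class = 5",                          # reserved word used as an identifier
    ],
    "Excessive Symbols": [
        'print(("hi")))',                     # one closing parenthesis too many
    ],
}


def syntax_error_message(source):
    """Return the compiler's message if source is rejected, otherwise None."""
    try:
        compile(source, "<snippet>", "exec")
    except SyntaxError as exc:
        return exc.msg
    return None


for category, snippets in EXAMPLES.items():
    for snippet in snippets:
        first_line = snippet.splitlines()[0]
        print(f"{category}: {first_line!r} -> {syntax_error_message(snippet)}")
```

The exact wording of each message depends on the Python version, so the sketch prints whatever the interpreter reports rather than assuming a particular text.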
3 Results
The Python concepts considered for this study were Arithmetic Operation, Control Statement, Loops, Python Collections, and Input and Output. Errors made by beginner programmers in the study were categorized into Missing Symbols, Invalid Symbols, Mismatched Symbols, Inappropriate Naming and Excessive Symbols, as explained earlier. In all, 90 errors were committed: Missing Symbols (30%), Invalid Symbols (21.1%), Mismatched Symbols (25.6%), Inappropriate Naming (8.9%) and Excessive Symbols (14.4%). Thus, Missing and Mismatched Symbols errors were the most committed in the study, while Inappropriate Naming was the least committed. Most of the errors committed in the study were in the Input and Output concept (36.5%), followed by Loops (25.6%) (Table 1). Low achievers and high achievers were more susceptible to Missing Symbols errors (30.4% and 57.1% respectively), while average achievers committed more Mismatched Symbols errors (35%) (see Table 2). Errors committed by low achievers were mainly in the Loops concept, while average and high achievers committed errors mostly in the Input and Output concept (48.3% and 14.3% respectively, as stated in Table 3). In Table 4, Missing Symbols was found to be related to Inappropriate Naming, Invalid Symbols to Excessive Symbols, and Mismatched Symbols to Excessive Symbols. The primary reason for these relationships is perhaps that some errors could appear under two related categories simultaneously. For Missing Symbols and Inappropriate Naming, for example, the absence of quotes around a string can lead the interpreter to take the string as an identifier, which is thereby named inappropriately. A similar situation exists between Invalid and Excessive Symbols; an example is the placing of an identifier in quotes. The quotes are Excessive Symbols which at the same time make the item to be printed invalid (The string will be printed instead of the value
of the identifier). In the case of Mismatched and Excessive Symbols, there were instances of a return statement under a control statement outside a function; this excessive statement is also mismatched. It remains for further studies to identify other reasons, apart from those raised here, why some categories of errors are related.

Table 1. Error Categorization and Concept

| Error type | Arithmetic operation | Control statement | Loops | Python collections | Input and output | Total |
|---|---|---|---|---|---|---|
| Missing symbols | 7 (22.2%) | 8 (29.6%) | 8 (29.6%) | 0 (0%) | 4 (18.5%) | 27 (30%) |
| Invalid symbols | 5 (26.3%) | 3 (15.8%) | 2 (10.5%) | 0 (0%) | 9 (47.4%) | 19 (21.1%) |
| Mismatched symbols | 0 (0%) | 8 (34.8%) | 6 (26.1%) | 0 (0%) | 9 (39.1%) | 23 (25.6%) |
| Inappropriate naming | 2 (25%) | 0 (0%) | 4 (50%) | 1 (12.5%) | 1 (12.5%) | 8 (8.9%) |
| Excessive symbols | 1 (7.7%) | 0 (0%) | 3 (23.1%) | 0 (0%) | 9 (69.2%) | 13 (14.4%) |
| Total | 14 (15.6%) | 19 (21.1%) | 23 (25.6%) | 1 (1.1%) | 33 (36.7%) | 90 |
Table 2. Error Types and Achievement Level

| Achievement level | Missing symbols | Invalid symbols | Mismatched symbols | Inappropriate naming | Excessive symbols | Total |
|---|---|---|---|---|---|---|
| Low achieving | 7 (30.4%) | 6 (26.1%) | 2 (8.7%) | 3 (13%) | 5 (21.7%) | 23 (25%) |
| Average achieving | 16 (26.7%) | 12 (20%) | 21 (35%) | 3 (5%) | 8 (13.3%) | 60 (67%) |
| High achieving | 4 (57.1%) | 1 (14.3%) | 0 (0%) | 2 (28.6%) | 0 (0%) | 7 (7.8%) |
| Total | 27 (30%) | 19 (21.1%) | 23 (25.6%) | 8 (8.9%) | 13 (14.4%) | 90 |
Table 3. Concept-Based Errors and Achievement Level

| Achievement level | Arithmetic operations | Control statements | Loops | Python collections | Input and output | Total |
|---|---|---|---|---|---|---|
| Low achieving | 8 (34.8%) | 0 (0%) | 11 (47.8%) | 1 (4.3%) | 3 (13%) | 23 (25.6%) |
| Average achieving | 4 (6.7%) | 19 (31.7%) | 8 (13.3%) | 0 (0%) | 29 (48.3%) | 60 (66.7%) |
| High achieving | 2 (28.6%) | 0 (0%) | 4 (57.1%) | 0 (0%) | 1 (14.3%) | 7 (7.8%) |
| Total | 14 (15.6%) | 19 (21.1%) | 23 (25.6%) | 1 (1.1%) | 33 (36.7%) | 90 |
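The overlap between error categories described in the results can be reproduced in a few lines. In this sketch (the variable name and values are illustrative, not taken from the students' submissions), quoting an identifier makes Python display the name itself rather than its value, and leaving the quotes off a string makes Python look the word up as an undefined name.

```python
total = 42

# Excessive/Invalid overlap: quoting the identifier turns it into a string literal.
shown_quoted = "total"        # the text that print("total") displays
shown_unquoted = str(total)   # the text that print(total) displays
print(shown_quoted)
print(shown_unquoted)

# Missing/Inappropriate overlap: without quotes, the word is looked up as a name.
try:
    print(hello)              # intended: print("hello")
except NameError as exc:
    message = str(exc)
    print("NameError:", message)
```

The first case runs without any error message, which may be why it is easy for a beginner to overlook; the second stops the program only when the line is reached.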
Further efforts were made to drill down to classroom practices in Python programming that could minimize the identified errors. Five students in the Python class with a large number of errors were interviewed on their opinions regarding how such errors can be minimized. Extracts from three of the students are as follows. Student #2 opined that "more than tutorials, the lecturers should teach us the basis (of programming) before going to tutorials. ………" Student #3 likewise stated that "the teachers should simplify the basis at the beginning of the course more, I think it will be better. This is because if you have a good foundation, that is what will give you grasp of the course." A similar opinion was expressed by student #1. This perhaps accounts for why most of the errors made were in the Input and Output concept, which is foundational. It calls for greater instructional attention to the foundations before advanced concepts are taught. Student #1 stated that "………, more practical will enable us to understand concepts of the course. In addition, the practical should not be to test us. Like the practical we did were to test our ability. The practical should be more of learning process, teaching us how to do it (how to code) because most of the practical we did were to test our ability………" The implication of this is that practical sessions are often treated by the instructors as assessment sessions. Thus, the learning environment appears unconducive and tense. Opportunities to ask questions and make costless mistakes appeared largely limited. Hence, the possibility of repeating the same mistake and committing the same errors over and over is high. Further information obtained concerned the devices used by the students to learn Python programming outside the classroom environment. The devices were Android phones and laptops. However, users of laptops reported better coding ability than those using Android phones. The reason adduced for this was that the IDLE features that were present on laptops were missing on phone devices. Thus, students learning Python programming with phones were more susceptible to errors than those learning with laptops. According to student #2,
"once they (lecturer) taught us in class, I have the app on my phone because I do not have a system then. I use my android phone. I only use what I have then. I could have preferred computer but I used what I have then." Similarly, student #3 stated that "I use my phone then. But now, I have a laptop to run my code." Further inquiry revealed that the IDLE in the lab practical sessions had no IntelliSense and as such could not suggest appropriate syntax during coding, while the environments used by students to practise had this feature. This inconsistency did not make for effective learning. An IDLE without this feature appeared to be relatively difficult for beginner programmers. This is illustrated in the extract from student #1: "when I input command on my system at home, it tells me what is wrong with it (code). If I input wrong command, it's analysis, it gives me different words, approaches I could take to correct it (code) and suggestions that I want. The one in the lab is purely for testing. Immediately you input your code, if you feel like it's not giving you the right thing. You try until you feel like you have the right thing (code). So, you may not know the particular code (command) you want to input. If you input the wrong thing like you make a mistake on a line, it will not suggest the correct command to be used."

Table 4. Relationships between error types (Pearson correlation)

| | Missing | Invalid | Mismatched | Inappropriate | Excessive |
|---|---|---|---|---|---|
| Missing | 1 | .142 | .222 | .427** | .143 |
| Invalid | .142 | 1 | .170 | .084 | .363** |
| Mismatched | .222 | .170 | 1 | -.090 | .561** |
| Inappropriate | .427** | .084 | -.090 | 1 | .172 |
| Excessive | .143 | .363** | .561** | .172 | 1 |

**. Correlation is significant at the 0.01 level (2-tailed).
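The relationships between error types reported in Table 4 are Pearson product-moment correlations. A minimal sketch of that calculation is given below, assuming the coefficient is computed over per-student counts of each error type; the counts shown are invented for illustration and are not the study's data, and the function name pearson_r is an assumption.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length with at least two values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical number of errors of each type made by eight students.
excessive = [0, 1, 0, 2, 1, 0, 3, 0]
mismatched = [0, 1, 1, 2, 0, 0, 3, 1]

print("r(excessive, mismatched) =", round(pearson_r(excessive, mismatched), 3))
```

A value near 1 would indicate that students who make many errors of one type also tend to make many of the other; testing whether a coefficient is significant at the 0.01 level requires an additional step that is not shown here.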
4 Discussion
The study found that missing symbols constituted the most committed errors. This is consistent with the finding of [2] on the taxonomy of errors made by beginner programmers in Java. However, in Java programming, missing symbols errors reside largely in the objects and classes concepts, which are usually the introductory concepts for every student of Java, whereas in Python missing symbols errors were committed mostly in loops and control statements. This is probably because colons and indentation largely characterize loops and control statements, and these are easy for beginners to omit due to inattentiveness (Bringula et al. [15]). The pattern also cut across achievement levels. The finding is also consistent with Jadud [16] and Mow [1], who found that missing symbols constitute the highest percentage of errors committed by students. Another common error made by novice programmers was mismatched symbols, which largely resided in the input and output concepts. An example that aptly illustrates this is students interchanging parentheses and quotation marks. Furthermore, many of the students practised with smartphones rather than personal computers (PCs): equipment with no IntelliSense, which hence could not prompt or identify bugs. Another finding of this study is that the Input and Output concept had the highest number of errors. In fact, almost all the error types were reasonably represented under this concept. The reason for this is perhaps that the input and output concept cuts across all other concepts in Python programming. This is analogous to the objects and classes concept in Java programming, which is a cross-cutting concept. Earlier results in Java showed that objects and classes harboured a greater number of bugs. It would seem that a similar reason accounts for this. The types of errors that appeared here were mostly excessive symbols and invalid symbols.
For excessive symbols, students added unneeded quotation marks around variables they tried to output, thereby converting the variable name into a string. It was also found that excessive symbols are significantly related to both mismatched and invalid symbols. This is not far-fetched, because students who commit excessive-symbol errors leave unrequired symbols in the body of the code, which can render the code invalid. In the current study, students were cumulatively assessed and scored in every laboratory session, so every error committed came at a cost. Yet one way to bring about meaningful learning in programming is to give students the opportunity to learn from their own mistakes. This kind of learning should be supervised in such a way that each error plays a significant role in the learning process [17]. A further point is that the beginner programmers learnt Python using smartphones and laptops. Opinions are mixed regarding the effectiveness of smartphones for learning programming languages. Advocates of the practice hold that such devices make learning programming possible anywhere and at any time [18]. A number of mobile apps have been developed to make this possible [19]. Tillmann, et al. [20] stated that since the mobile phone is handy, it is easier for students to program on the go.
Analysis of Syntactic Errors of Novice Python Programmers
The work also suggested that the mobile programming environment should be standardized and made available across platforms, or that all students of a class should adopt a particular phone platform, or that the school should make available a homogeneous set of mobile devices for teaching purposes, similar to how many schools already make PCs or laptops available. The difficulty with this suggestion is that it is hard to regulate the mobile phones used by students and to make them use the same platform, since mobile phones do not generally use the same operating system. The work also suggested the use of a Bluetooth keyboard for ease of typing, but Bluetooth keyboards that can be used comfortably are bigger than the mobile phones themselves; as a result, the mobility that should be the gain of learning programming on a smartphone is jeopardized. Apart from these, some online respondents expressed their views on the limitations of this device. For example, reacting to Fower’s position and invitation to use an Android device, a programming student who had earlier been using a smartphone to program said: “It seems to have an error when I try to put in an input.” The proffered solution was: “The input needs to be typed in the input text file and read in the input ().” All of these point to the fact that, despite the benefit of learning anywhere and anytime made possible by smartphones, there are drawbacks. Most existing PC IDEs are complex, use highly graphic interfaces, or work in integrated architectures that may not be suitable to implement as is in a mobile programming environment [21]. Another issue regarding the programming learning environment in the study was that even laptop users learnt with an IDE that has IntelliSense but were assessed in an environment with no intelligent editor. Intelligent editors (though easier to learn programming with) return fewer bugs and may give beginners a false impression of coding mastery.
In a similar study regarding Java programming, Jegede, et al. [2] opined that learning is often enhanced with an unintelligent editor, as learners are forced to think, reason and spot any bugs in the code on their own. This may be controversial, as one need not memorize all the syntax and the core libraries to be a good programmer, even though there is a need to understand the core syntax. By the same reasoning that leads to the rejection of IntelliSense, one might also reject any tool designed to increase programmers’ productivity, compilers and debuggers included.
5 Pedagogical Implications
This study should lead to Python programming “error logs”. The compiled list of errors often committed by novice programmers will assist instructors in helping learners in their future classes to avoid such errors. Also, for a meaningful and result-oriented pedagogical process in programming instruction, the choice of electronic devices to be used both at home and in school is key. This becomes more important when dealing with beginners. To avoid confusing novice learners, devices having the same features should be used for learning, practice and testing. When the devices used for instruction in school have different features from the ones used for practice at home, errors are more easily committed. For effective pedagogy, therefore, efforts should first be made to ensure that similar devices are used for learning and assessment at school as well as for practising at home. Devices that contain intelligent editors with built-in resources to handle some of the errors should also be encouraged, to reduce errors in programming. One good pedagogical practice, especially in a novice programming class, is to proceed from the known to the unknown. As discovered in this study, more effort should be made to explain the basic foundational concepts of Python programming. Experienced Python instructors who are also very familiar with the errors that novice programmers commit should take time to explain these basic concepts to beginners, with diverse illustrations that specifically point to the errors to which learners are susceptible. The interviews with students during this study revealed their displeasure with, and the challenge posed by, being confronted with exercises and tests at the beginning of their practical experience in the laboratory, when they expected to be guided on what to do and how to do it. Therefore, to help students overcome programming errors, test taking should be the last activity in the computer laboratory, after each concept has been successfully taught. Students should be taken through hands-on activities for learning Python programming by first observing their instructors demonstrate the process. In virtually all classroom settings, including those of novice programmers, learners are never the same in ability. In terms of learning rate and efficiency, they are often divided into low, average and high achiever groups. Closely related to this is the fact that the areas and types of errors vary across the groups.
In this study, for instance, low achievers were seen to commit more errors in loops concepts, while average and high achievers committed more errors in input and output concepts. As for the types of errors committed, low and high achievers committed more mismatched-symbol errors. If the learners are grouped, the peculiarities in the areas of errors committed across the groups will be a good guide in the design and implementation of pedagogical approaches for the different groups, helping them surmount the challenge of errors in Python. Jegede, et al. [2], in a similar study on errors committed by novice programmers in the Java programming language, suggested that instruction should be based on achievement levels, because errors (in quantity and type) have been found to be achievement-based. Therefore, for effective pedagogy, stratifying novice programmers according to their programming achievement becomes highly desirable.
References
1. Mow, I.T.C.: Analyses of student programming errors in Java programming courses. J. Emerg. Trends Comput. Inf. Sci. 3, 739–749 (2012)
2. Jegede, P.O., Olajubu, E.A., Ejidokun, A.O., Elesemoyo, I.O.: Concept-based analysis of Java programming errors among low, average and high achieving novice programmers. J. Inf. Technol. Educ. Innov. Pract. 18, 049–059 (2019)
3. Ettles, A., Luxton-Reilly, A., Denny, P.: Common logic errors made by novice programmers. In: Proceedings of the 20th Australasian Computing Education Conference, pp. 83–89 (2018)
4. Case, D.R.: An Animated Pedagogical Agent for Assisting Novice Programmers Within a Desktop Computer Environment. Ph.D. thesis, Staffordshire University (2012)
5. Kamaruzzaman, M.: Top 10 in-demand programming languages to learn in 2020: In-depth analysis and ranking of the top programming languages for job seekers and new developers. 104182 (2020)
6. PYPL: PopularitY of Programming Language (PYPL) [Online]
7. Saabith, A.S., Fareez, M.M.M., Vinothraj, T.: Python current trend applications-an overview popular web development frameworks in Python. Int. J. Adv. Eng. Res. Dev. 6 (2019)
8. Leenings, R., Winter, N.R., Sarink, K., Ernsting, J., Jiang, X., Dannlowski, U., et al.: The PHOTON Wizard - Towards Educational Machine Learning Code Generators. arXiv preprint arXiv:2002.05432, pp. 1–6 (2020)
9. Avhad, P., Bhanushali, H., Bhatt, K., Rathod, M.: Canteen Automation System with Payment Gateway. Available at SSRN 3568597, pp. 1–3 (2020)
10. Shafi, J., Waheed, A., Krishna, P.V.: Analysing human activity patterns by chest-mounted wearable devices. In: Emerging Research in Data Engineering Systems and Computer Communications, pp. 389–401. Springer (2020)
11. Zhang, Y., Chen, Y., Cheung, S.C., Xiong, Y., Zhang, L.: An empirical study on TensorFlow program bugs. In: Proceedings of the 27th ACM SIGSOFT International Symposium on Software Testing and Analysis, pp. 129–140 (2018)
12. Humbatova, N., Jahangirova, G., Bavota, G., Riccio, V., Stocco, A., Tonella, P.: Taxonomy of Real Faults in Deep Learning Systems. arXiv:1910.11015 (2019)
13. Von Dollen, A.C.: Data-Driven Database Education: A Quantitative Study of SQL Learning in an Introductory Database Course. M.Sc. thesis, California Polytechnic State University, San Luis Obispo (2019)
14. Taipalus, T., Perälä, P.: What to expect and what to focus on in SQL query teaching. In: Proceedings of the 50th ACM Technical Symposium on Computer Science Education, pp. 198–203 (2019)
15. Bringula, R.P., Manabat, G.M.A., Tolentino, M.A.A., Torres, E.L.: Predictors of errors of novice Java programmers. World J. Educ. 2, 3–15 (2012)
16. Jadud, M.C.: Methods and tools for exploring novice compilation behavior. In: Proceedings of the Second International Workshop on Computing Education Research, pp. 73–84 (2006)
17. Ahmadzadeh, M., Elliman, D., Higgins, C.: The impact of improving debugging skill on programming ability. Innov. Teach. Learn. Inf. Comput. Sci. 6, 72–87 (2007)
18. Lin, Y.-B., Shieh, M.Z.: To learn programming through Internet of Things. IEEE Internet Things Mag. 5(4), 168–172 (Accepted) (2020)
19. Pears, A., Rogalli, M.: mJeliot: a tool for enhanced interactivity in programming instruction. In: Proceedings of the 11th Koli Calling International Conference on Computing Education Research, pp. 16–22 (2011)
20. Tillmann, N., Moskal, M., De Halleux, J., Fahndrich, M., Bishop, J., Samuel, A., et al.: The future of teaching programming is on mobile devices. In: Proceedings of the 17th ACM Annual Conference on Innovation and Technology in Computer Science Education, pp. 156–161 (2012)
21. Mbogo, C.C.: Scaffolding Java Programming on a Mobile Phone for Novice Learners (2015)
A Constructive Heuristic “MDSA” Solving the Flexible Job Shop Scheduling Problem
Vassil Guliashki1 and Gašper Mušič2
1 Institute of Information and Communication Technologies, Bulgarian Academy of Sciences, Sofia, Bulgaria [email protected]
2 Faculty of Electrical Engineering, University of Ljubljana, Ljubljana, Slovenia [email protected]
Abstract. This paper considers a novel heuristic algorithm called “MDSA” (Minimal Deviation Strategy Algorithm) for solving Flexible Job Shop Scheduling Problems (FJSSP). In the optimization of the machine schedule, one objective function is considered: the “makespan”, i.e., the maximum completion time over all operations of all jobs, is minimized. The algorithm is based on a strategy of minimizing machine idle times and scheduling the operations so that they are executed in an optimal time (Shortest Processing Time - SPT) or in a time with minimum deviation from the SPT. The aim is that the sum of the working time of a machine and its idle times is approximately equal to this sum for each of the other machines, i.e. the completion times of the machines should be approximately equal. The heuristic algorithm is constructive and generates a near-optimal solution in a time that depends polynomially on the number of operations and the number of machines. The performance of the algorithm on an illustrative example is considered. We compare the MDSA performance on 20 test instances taken from the literature with the performance results of a Genetic Algorithm (GA) on the same instances. The obtained results are encouraging and show that this type of heuristic can be used to quickly find approximately optimal solutions, which in turn can serve as initial solutions for exact methods and can save a large amount of computational effort and time, especially for problems of large size.
Keywords: Combinatorial Optimization · Flexible Job Shop Scheduling Problem · Heuristic Algorithm
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023. K. Arai (Ed.): SAI 2023, LNNS 739, pp. 296–306, 2023. https://doi.org/10.1007/978-3-031-37963-5_21

1 Introduction
Due to their wide applicability, job shop scheduling problems have attracted the attention of many researchers in recent decades. There is a trend towards an ever-increasing intellectualization of production processes. This requires faster and more accurate solution of scheduling optimization problems, the size of which is getting larger and larger. For this reason, various exact methods are being developed (see for example [1, 2]). Unfortunately, these problems are combinatorial, and as their size increases, the computational cost of the exact methods increases exponentially. Therefore heuristic algorithms are becoming increasingly important, because they are able to find solutions that are close to the optimum in a reasonable time and at a relatively small computational cost. A review of heuristic algorithms developed for Flexible Job Shop Scheduling Problems (FJSSP) is presented in [3]. A modified Simulated Annealing method for FJSSP is suggested in [4]. An Ant Colony approach for production scheduling problems is used in [5]. An Evolutionary algorithm for solving FJSSP is applied in [6]. The same class of problems has been attacked by means of Genetic Algorithms (GA) in [7–11]. Practical applications of the job shop scheduling problem in many real-life situations are discussed in [2, 12]. The management of smart electric grids also requires the solution of scheduling problems [13]. We cannot mention here all the heuristic algorithms developed for FJSSP. It should be noted that evolutionary algorithms and GA are slower than many other heuristics, because they are population-based. This disadvantage is amplified when the population size increases, while a smaller population size lowers the quality of the generated solutions. In this paper a constructive heuristic algorithm for FJSSP, called “MDSA” (Minimal Deviation Strategy Algorithm), is proposed. A similar heuristic approach was presented in [14], but the corresponding algorithm is simpler and can efficiently solve only small-size problems. The paper is organized as follows: The novel proposed algorithm is presented in Sect. 2. An example illustrating the performance of this algorithm is considered in Sect. 3. In Sect. 4 the MDSA algorithm is compared with GA. Finally, conclusions are drawn in Sect. 5.

1.1 Job Shop Scheduling Problem (JSSP) Statement
The JSSP includes a set of n jobs J_1, …, J_n, which should be processed on m machines M_1, …, M_m. Each job consists of several operations, which have to be processed in a specific order: J_i = (O_i,1, …, O_i,j(i)), where j(i) is the number of operations of job i, and i = 1, …, n. The machine on which each operation will be processed is defined in advance. The processing times for each operation on each machine are given. They are denoted by p_i,k, i = 1, …, n; k = 1, …, j(i).
The optimal schedule S according to the chosen criterion (or criteria) should be found. The most popular criterion is the minimization of the makespan, denoted by C_max. This model was the earliest to be developed (see [15]).

1.2 Flexible Job Shop Scheduling Problem (FJSSP) Model
The FJSSP model is an extension of the above JSSP model. The difference is that each operation can be performed not only on one specific machine, but on any machine of a given subset of machines. This subset is usually different for each operation. In general, it is not defined in advance on which machine each operation will be processed. This model corresponds better to real situations and is used when some or all machines can process more than one operation (but not simultaneously), i.e. several or all machines are multifunctional. The processing times of one operation on different machines are usually different. This model was proposed in [16]. In [10] two variants of the FJSSP model are considered: 1) when every operation can be processed on all machines, the problem is a Totally Flexible Job Shop Problem (T-FJSP); 2) when not every operation, but at least one, can be processed on all machines, the problem is a Partial Flexible Job Shop Problem (P-FJSP). The FJSSP has high computational complexity and is classified as an NP-hard problem (see [17]).
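To make the FJSSP model more tangible, the following minimal sketch shows one possible in-memory representation of an instance and a check of the two flexibility variants. The toy instance, the nested-list layout and the function names are assumptions made for illustration; they are not taken from the paper. The sketch also simplifies the classification by treating every instance that is not totally flexible as partially flexible.

```python
import math

INF = math.inf  # marks a machine that cannot process an operation

# Hypothetical toy instance: proc[job][k][machine] is the processing time of
# the k-th operation of the job on that machine.
proc = [
    [[3, 5, INF], [INF, 2, 4]],  # job 1: two operations, three machines
    [[6, INF, 2], [1, 3, 2]],    # job 2: two operations, three machines
]


def eligible_machines(times):
    """Indices of the machines that can process one operation."""
    return [machine for machine, t in enumerate(times) if t != INF]


def flexibility(instance):
    """Return 'T-FJSP' if every operation can run on every machine,
    otherwise 'P-FJSP' (simplifying assumption of this sketch)."""
    total = all(
        len(eligible_machines(op)) == len(op)
        for job in instance
        for op in job
    )
    return "T-FJSP" if total else "P-FJSP"


print(flexibility(proc))
```

In the classical JSSP of Sect. 1.1 every operation would have exactly one eligible machine, so `eligible_machines` would always return a single index.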
2 The Novel Algorithm “MDSA”
2.1 Basic Concepts of the Algorithm “MDSA”
• We calculate the weight W_i of each job i (see Eq. (1)) as the sum of the shortest processing times t(O_ik)_min of the operations O_ik of this job, where k = 1, …, j(i), and j(i) is the number of operations in job i.

\[ W_i = \sum_{k=1}^{j(i)} t(O_{ik})_{\min} \tag{1} \]

• We arrange the jobs J_i in order of decreasing weight W_i. If two different jobs have the same sum of shortest times, the job with the smaller index comes first.
• For each machine, we calculate its potential workload M_l, l = 1, …, m, which is equal to the sum of only the shortest processing times of the operations that can be processed by it, i.e. each operation contributes its shortest processing time to the machine on which this shortest time is attained. We arrange the machines in descending order of the calculated sums. We say that a machine that is ahead of another machine in this order is busier (it has a higher potential workload) than the later machine.
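The three preparatory steps above can be sketched in a few lines. The processing times below are those of Table 1 in Sect. 3; the data layout, the function names and the tie-breaking choice (an operation whose shortest time is attained on several machines is counted on the first of them) are assumptions of this sketch, not part of the published algorithm.

```python
import math

INF = math.inf
# Processing times of Table 1: TIMES[job][operation] = (on M1, on M2, on M3).
TIMES = {
    1: [(7, INF, 4), (INF, 3, INF), (3, INF, 6), (2, 4, INF)],
    2: [(8, 12, INF), (INF, INF, 4), (7, 14, INF), (8, INF, 4)],
    3: [(10, 15, 8), (INF, 2, 6), (2, INF, 4), (6, 3, INF)],
    4: [(INF, 9, 5), (6, INF, 2), (INF, 7, 12), (9, 6, 3)],
    5: [(10, INF, 15), (INF, 7, 14), (5, 8, INF), (4, 6, 8)],
}


def job_weights(times):
    """Eq. (1): weight of a job = sum of its shortest processing times."""
    return {job: sum(min(op) for op in ops) for job, ops in times.items()}


def job_order(weights):
    """Jobs by decreasing weight; equal weights are ordered by smaller index."""
    return sorted(weights, key=lambda job: (-weights[job], job))


def potential_workloads(times, n_machines=3):
    """Add each operation's shortest time to the machine that attains it."""
    loads = [0] * n_machines
    for ops in times.values():
        for op in ops:
            fastest = op.index(min(op))  # first fastest machine (assumption)
            loads[fastest] += op[fastest]
    return loads


weights = job_weights(TIMES)
order = job_order(weights)
loads = potential_workloads(TIMES)
machine_order = sorted(range(len(loads)), key=lambda l: -loads[l])
print(weights, order, loads, machine_order)
```

For this data the sketch reproduces the values used in the worked example: the job order J5, J2, J4, J3, J1 and the potential workloads 41, 22 and 30 for M1, M2 and M3.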
2.2 Basic Ideas on Which the “MDSA” Algorithm is Based
• The main objective is to generate a makespan value that differs minimally from the mean machine workload time, expressed by Eq. (2):

\[ t_{\mathrm{mean}} = \frac{\sum_{i=1}^{n} \sum_{j=1}^{j(i)} t(O_{ij})_{\min}}{m} \tag{2} \]

where m is the number of machines, and n is the number of jobs.
• The schedule is constructed by assigning the operations with the same index to the machines on which they will be executed in the shortest time. The consecutive assignment of the operations follows the order of the jobs J_i by weight.
• When creating the schedule, the goal is to keep the sum of the idle times of the machines as low as possible. Therefore, when an idle time occurs on the current machine, a check is made to see whether an operation with the next higher index can be inserted into the schedule before the current operation so that the idle time is reduced or eliminated. If there is such a possibility, the assignment is made with precedence. If there is no such possibility, the algorithm looks for an operation that can be redirected to the current machine, into the idle time slot, from the machine on which it has its shortest processing time.
• The redirection takes place in two cases:
  - if possible, to eliminate or reduce an idle time period occurring on the current machine;
  - if possible, to relieve the most loaded machine (or another relatively highly loaded machine) at the expense of a less loaded machine.
  In both cases, the potential workload of the corresponding machines is recalculated.
• The redirection takes place by selecting, among the possible operations, the one whose processing time increases the least. With each redirection, the potential workloads of the machine that receives the operation and of the machine from which the operation comes are recalculated.
• If there are several machines on which the current operation can be processed in the same shortest time, then the machine with the lowest potential workload is selected for the assignment.
• The last-index operations of each job are assigned according to the rule: to the machine on which the operation will have the earliest completion time.
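Two of the selection rules listed above lend themselves to a compact illustration: choosing which operation to redirect (the one whose processing time increases the least) and breaking ties between equally fast machines (the one with the lowest potential workload). The sketch below is a simplified reading of these two rules only, not the complete MDSA; the function names and data layout are hypothetical, and the candidate data are the first operations of jobs 4, 3 and 5 from Table 1.

```python
import math

INF = math.inf


def time_loss(op_times, target):
    """Extra processing time when an operation runs on machine `target`
    instead of on its fastest machine."""
    return op_times[target] - min(op_times)


def pick_redirect(candidates, target):
    """Among candidate operations, pick the one whose processing time
    increases the least on `target`; return (name, loss) or None."""
    feasible = {
        name: time_loss(times, target)
        for name, times in candidates.items()
        if times[target] != INF
    }
    if not feasible:
        return None
    name = min(feasible, key=feasible.get)
    return name, feasible[name]


def pick_machine(op_times, workloads):
    """Among the machines offering the shortest processing time, pick the
    one with the lowest potential workload."""
    best = min(op_times)
    tied = [l for l, t in enumerate(op_times) if t == best]
    return min(tied, key=lambda l: workloads[l])


# Times on (M1, M2, M3) taken from Table 1; index 1 is machine M2.
candidates = {"O41": (INF, 9, 5), "O31": (10, 15, 8), "O51": (10, INF, 15)}
choice = pick_redirect(candidates, target=1)
print(choice)
```

With these candidates the rule selects O41 with a time loss of 4, which matches the decision taken in the worked example of Sect. 3 (O31 would lose 7 time units, and O51 cannot run on M2 at all).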
3 Illustrative Example for “MDSA” Performance
The example considered here includes three machines and five jobs. It is taken from [11]. The data are presented in Table 1.

Table 1. Example 1. Processing times for all operations on all machines

Job   Machine   O*1   O*2   O*3   O*4
J1    M1        7     ∞     3     2
      M2        ∞     3     ∞     4
      M3        4     ∞     6     ∞
J2    M1        8     ∞     7     8
      M2        12    ∞     14    ∞
      M3        ∞     4     ∞     4
J3    M1        10    ∞     2     6
      M2        15    2     ∞     3
      M3        8     6     4     ∞
J4    M1        ∞     6     ∞     9
      M2        9     ∞     7     6
      M3        5     2     12    3
J5    M1        10    ∞     5     4
      M2        ∞     7     8     6
      M3        15    14    ∞     8
For this example there are 20 operations of n = 5 jobs to be processed on m = 3 machines. Each job includes j(i) = 4 operations. Hence, the mean machine workload time can be evaluated according to Eq. (3):

\[ t_{\mathrm{mean}} = \frac{\sum_{i=1}^{5} \sum_{j=1}^{4} t(O_{ij})_{\min}}{3} = 31 \tag{3} \]
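The value in Eq. (3) can be checked quickly, because the double sum in the numerator is simply the sum of the job weights of Eq. (1). The sketch below uses the five job weights that the example reports (12, 23, 15, 17, 26); the variable names are illustrative.

```python
# Job weights W1..W5 as reported for this example (sums of shortest times).
weights = [12, 23, 15, 17, 26]
n_machines = 3

total_shortest = sum(weights)           # numerator of Eq. (3)
t_mean = total_shortest / n_machines    # mean machine workload time
print(total_shortest, t_mean)
```

The total of 93 time units spread over three machines gives the mean workload of 31 stated in Eq. (3).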
If there exists a feasible schedule without idle times in which all machines have a uniformly distributed workload and all operations are processed in their Shortest Processing Times (SPT), then its makespan C*max is the ideal optimal solution, and C*max = tmean.
The weights of the jobs are evaluated as follows: W1 = 12; W2 = 23; W3 = 15; W4 = 17; W5 = 26. Hence the order of the jobs is J = {J5, J2, J4, J3, J1}. The potential workloads of the machines are M1 = 41; M3 = 30; M2 = 22.
Before the assignment of the operations with second index 1, machine M2 obviously remains unloaded. We check the possibilities of redirecting an operation from the bottleneck machine M1 to machine M2. Operation O51 cannot be redirected (on M2 its processing time is ∞), but operation O21 can be redirected to machine M2. The operations with second index 1 are then assigned as follows: O51 to M1 with completion time 10; O21 to M2 with completion time 12; the operations O41, O31 and O11 to M3, with the completion time of O11 equal to 17. Redirection of operations with second index 2 is impossible. After the assignment of all operations with this second index, the following partial schedule is obtained: O52 on M2 with completion time 19; O32 on M2 with completion time 21; O12 on M2 with completion time 24; O22 on M3 with completion time 21; O42 on M3 with completion time 23. Obviously O53 can be processed on M1 with earliest starting time 19. Hence an idle time period of 9 time units, between 10 and 19, arises on M1. It is not possible to redirect an operation to this point. Proceeding in the same manner, a schedule with makespan value Cmax = 38 is finally generated. This is the completion time of all machines. The conclusion is that the redirection of O21 to M2 is not suitable, taking into account the idle time period on M1.
For this reason, we perform the assignment of the operations with second index 1 as follows: O51 to M1 with completion time 10; O21 to M1 with completion time 18. Among the remaining operations, O41 can be redirected to M2 with a time loss of 4 time units. Operation O31 can also be redirected to M2, but with a time loss of 7. Hence, operation O41 is assigned to M2 with completion time 9. The operations O31 and O11 are assigned to M3, with completion time 12 for O11.
After the assignment of the operations with second index 2, the following partial schedule is obtained: O32 on M2 with completion time 11 (O32 precedes O52 because O52 can start on M2 no earlier than time 10); then O52 is assigned to M2 with completion time 18; O12 to M2 with completion time 21; O42 to M3 with completion time 14 (O42 precedes O22 because O22 can start on M3 no earlier than time 18). Hence an idle time period of 4 time units, between 14 and 18, arises on M3. We check the possibility of assigning at this point an operation with a higher second index, 3, ahead of the next operation with the current second index, O22. The most suitable operation is O33, which is assigned to M3 with completion time 18 (the time loss is 2). Then O22 is assigned to M3 with completion time 22.
After the assignment of the operations with second index 3, the following partial schedule is obtained: O53 on M1 with completion time 23; O23 on M1 with completion time 30; O43 on M2 with completion time 28. We check the possibilities of redirecting an operation from the bottleneck machine M1 to another machine: O13 is redirected from M1 to M3 with completion time 28.
Finally, the last operations, with second index 4, are assigned according to the J order and the rule “earliest finishing time”. The following schedule is generated: O54 on M1 with completion time 34; O34 on M2 with completion time 31; O14 on M2 with completion time 35; O44 on M3 with completion time 31 (O44 precedes O24 because O24 can start on M3 no earlier than time 30); O24 is assigned to M3 with completion time 35. The completion times of all operations with second index 4 are {35, 35, 31, 31, 34}, so the makespan of the generated schedule is Cmax = 35.
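The final schedule of the example can be checked mechanically. The sketch below is a hypothetical verification helper, not part of MDSA: it takes the processing times of Table 1 and the machine assignments described above, with start times derived here from the stated completion times and durations, and verifies that no operation starts before its predecessor in the same job has finished and that no two operations overlap on a machine.

```python
import math

INF = math.inf
# Processing times of Table 1: TIMES[job][operation] = (on M1, on M2, on M3).
TIMES = {
    1: [(7, INF, 4), (INF, 3, INF), (3, INF, 6), (2, 4, INF)],
    2: [(8, 12, INF), (INF, INF, 4), (7, 14, INF), (8, INF, 4)],
    3: [(10, 15, 8), (INF, 2, 6), (2, INF, 4), (6, 3, INF)],
    4: [(INF, 9, 5), (6, INF, 2), (INF, 7, 12), (9, 6, 3)],
    5: [(10, INF, 15), (INF, 7, 14), (5, 8, INF), (4, 6, 8)],
}
# (job, operation, machine, start); starts derived from the completion times.
SCHEDULE = [
    (5, 1, 1, 0), (2, 1, 1, 10), (5, 3, 1, 18), (2, 3, 1, 23), (5, 4, 1, 30),
    (4, 1, 2, 0), (3, 2, 2, 9), (5, 2, 2, 11), (1, 2, 2, 18),
    (4, 3, 2, 21), (3, 4, 2, 28), (1, 4, 2, 31),
    (3, 1, 3, 0), (1, 1, 3, 8), (4, 2, 3, 12), (3, 3, 3, 14),
    (2, 2, 3, 18), (1, 3, 3, 22), (4, 4, 3, 28), (2, 4, 3, 31),
]


def check_schedule(times, schedule):
    """Return the makespan; raise ValueError if the schedule is infeasible."""
    end, by_machine = {}, {}
    for job, k, machine, start in schedule:
        duration = times[job][k - 1][machine - 1]
        if duration == INF:
            raise ValueError(f"O{job}{k} cannot run on M{machine}")
        end[(job, k)] = start + duration
        by_machine.setdefault(machine, []).append((start, start + duration))
    expected = {(j, k) for j, ops in times.items() for k in range(1, len(ops) + 1)}
    if set(end) != expected or len(schedule) != len(expected):
        raise ValueError("every operation must be scheduled exactly once")
    for job, k, _, start in schedule:  # precedence inside each job
        if k > 1 and start < end[(job, k - 1)]:
            raise ValueError(f"O{job}{k} starts before O{job}{k - 1} ends")
    for machine, intervals in by_machine.items():  # one operation at a time
        intervals.sort()
        for (_, e1), (s2, _) in zip(intervals, intervals[1:]):
            if s2 < e1:
                raise ValueError(f"overlap on M{machine}")
    return max(end.values())


makespan = check_schedule(TIMES, SCHEDULE)
print(makespan)
```

For the schedule above the check passes and reports a makespan of 35, the value listed for this instance in Table 2.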
4 Comparison of “MDSA” to a GA
Genetic algorithms have been successfully applied to FJSSP by several authors [7–10]. Therefore, it was decided to compare the results of the MDSA heuristic with those of a GA. A specific version of GA was applied, using a direct operation list representation of schedules as chromosomes [7]. A random set of feasible operation sequences is used as the initial population. Two sets of chromosome manipulation operators are used to change the GA population at each step of the algorithm: sequence crossover and mutation, and machine assignment crossover and mutation. The two crossover and mutation strategies are executed in parallel and combined with local search as described in [7]. Compared to [7], the number of generations of a single GA run was reduced to 50 to reduce computation time. The performance of the presented heuristic is compared with the results of the genetic algorithm on 20 test instances from literature sources. The test results are summarized in Table 2. The genetic algorithm was started 10 times for each test instance. This is the reason why, on several instances, the relative error has more than one value. The MDSA heuristic was run only once for each test instance. From the viewpoint of efficiency, the MDSA heuristic is much faster than the genetic algorithm, which is to be expected, since MDSA is not a population-based algorithm. The relative error ε is calculated according to Eq. (4):

\[ \varepsilon = \frac{C_{\mathrm{final}} - C^{*}_{\max}}{C^{*}_{\max}} \tag{4} \]

where C*max is the optimal value of the makespan criterion, and Cfinal is the best makespan value obtained by the corresponding algorithm.
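As a small worked illustration of Eq. (4), the sketch below recomputes two of the relative errors reported in Table 2: the MDSA result on instance 18 (optimum 26, obtained makespan 27) and the worst GA result on instance 20 (optimum 60, obtained makespan 67). The function name is illustrative, and the error is expressed as a percentage.

```python
def relative_error(c_final, c_opt):
    """Relative error of Eq. (4), returned as a percentage."""
    return 100.0 * (c_final - c_opt) / c_opt


# Instance 18: optimum 26, best MDSA makespan 27.
err_mdsa = round(relative_error(27, 26), 2)
# Instance 20: optimum 60, worst GA makespan over the 10 runs 67.
err_ga_worst = round(relative_error(67, 60), 2)
print(err_mdsa, err_ga_worst)
```

Rounded to two decimals, these are the 3.85% and 11.67% figures quoted in the table and in the conclusions.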
Table 2. Test results by MDSA and GA

Instance №  Source  Size        Optimal C*max  MDSA Cmax  ε MDSA  GA Cmax  ε GA
1           [20]    M3J2O5      53             53         0%      53       0%
2           [24]    M3J2O5      11             11         0%      11       0%
3           [25]    M3J3O8      11             11         0%      11       0%
4           [26]    M3J3O9      45             45         0%      45       0%
5           [27]    M3J4O13     28             28         0%      28       0%
6           [11]    M3J5O20     35             35         0%      35       0%
7           [28]    M4J3O8      12             12         0%      12       0%
8           [29]    M5J3O7      11             11         0%      11       0%
9           [30]    M5J3O8      5              5          0%      5        0%
10          [6]     M5J4O12     11             11         0%      11       0%
11          [21]    M6J6O36     54             54         0%      54       0%
12          [10]    M8J8O27     14             14         0%      14       0%
13          [22]    M10J8O34    23             23         0%      23       0%
14          [23]    M10J10O30   7              7          0%      7        0%
15          [18]    M11J3O30    568            568        0%      568      0%
16          [18]    M20J10O50   93             93         0%      96–100   3.23%–7.53%
17          [19]    M6J10O55    40             40         0%      40       0%
18          [19]    M6J10O58    26             27         3.85%   27, 28   3.85%, 7.69%
19          [19]    M8J15O150   204            204        0%      204      0%
20          [19]    M8J15O90    60             61         1.67%   62–67    3.33%–11.67%
5 Conclusions
The novel constructive heuristic algorithm “MDSA” (Minimal Deviation Strategy Algorithm) for solving Flexible Job Shop Scheduling Problems (FJSSP) presented in this paper was developed with the aim of solving FJSSP of moderate and large size near-optimally. The comparison with the genetic algorithm is clearly in favor of the novel “MDSA” algorithm. The obtained results are very encouraging. The MDSA algorithm has a relative error of up to 3.85%, and that only for some relatively hard test instances of Brandimarte. The genetic algorithm has a relative error of up to 11.67%. The conclusion is that MDSA should be carefully tested on a set of very hard test instances, and possibly refined.
From a computational point of view, MDSA only performs tests and comparisons whose number depends polynomially on the number of operations and the number of machines in the current problem. Hence it could be proved that the computational complexity depends polynomially on the number of operations and the number of machines in the problem. It can be expected that this heuristic will be able to quickly generate solutions that are relatively close to the optimal solutions and can be used as starting points for exact methods with high computational complexity.
Acknowledgment. This study is partly supported by the Bulgarian National Science Fund, project “Mathematical models, methods and algorithms for solving hard optimization problems to achieve high security in communications and better economic sustainability”, Grant No: KP06-N52/7.
References 1. Brucker, P.: Scheduling Algorithms, 5th edn. Springer-Verlag, Berlin, Heidelberg (2007) 2. Pinedo, M.: Scheduling: Theory. Springer Verlag, New York, Algorithms and Systems (2012) 3. Zheng, Y., Lian, L., Mesghouni, K.: Comparative study of heuristics algorithms in solving flexible job shop scheduling problem with condition based maintenance. JIEM 7(2), 518–531 (2014) 4. Najid, N.M., Dauzere-Peres, S., Zaidat, A.: A modified simulated annealing method for flexible job shop scheduling problem. In: IEEE International Conference on Systems, Man and Cybernetics, vol. 5, pp. 6–15 (2002) 5. Berrichi, A., Yalaoui, F., Amodeo, L., Mezghiche, M.: Bi-objective ant colony optimization approach to optimize production and maintenance scheduling. Comput. Oper. Res. 37, 1584– 1596 (2010) 6. Kacem, I., Hammadi, S., Borne, P.: Approach by localization and multi- evolutionary optimization for flexible job-shop scheduling problems. IEEE Trans. Syst. Man Cybern. Part C Appl. Rev. 32, 1–13 (2002) 7. Mušiˇc, G.: PN-GA based optimization of flexible job shop schedules. In: 10th Vienna International Conference on Mathematical Modelling MATHMOD 2022, IFAC-PapersOnLine, vol. 55, no. 20, pp. 517–522 (2022) 8. Hussain, M.F., Joshi, S.B.: A genetic algorithm for job shop scheduling problems with alternate routing. In: IEEE International Conference on Systems, Man, and Cybernetics, vol. 3, pp. 2225–2230 (1998) 9. Gao, J., Gen, M., Sun, L.: Scheduling jobs and maintenances in flexible job shop with a hybrid genetic algorithm. J. Intell. Manuf. 17, 493–507 (2006) 10. Motaghedi-Iarijani, A., Sabri-Iaghaie, K., Heydari, M.: Solving flexible job shop scheduling with multi objective approach. IJIEPR 21(4), 197–209 (2010) 11. Mušiˇc, G.: Generation of feasible Petri net based scheduling problem solutions. In: 8th Vienna International Conference on Mathematical Modelling MATHMOD 2015, IFACPapersOnLine, vol. 48, no. 1, pp. 856–861 (2015) 12. 
Pinedo, M.: Planning and Scheduling in Manufacturing and Services. Springer Verlag, Berlin, Heidelberg, New York (2005) 13. Corn, M., Cerne, G., Skrjanc, I., Atanasijevic-Kunc, M., Scheduling of electric energy in smart grids using a combination of neural networks and local optimization. In 8th EUROSIM Congress on Modelling and Simulation, pp. 95–100 (2013)
306
V. Guliashki and G. Mušiˇc
14. Guliashki, V., Mušiˇc, G., Marinova, G.: A heuristic “minimal deviation” algorithm for solving flexible job shop scheduling problems. In: Proceedings of 4. International Scientific Conference on Mechanical Engineering Technologies and Applications COMETa 2018, pp. 799–806 (2018) 15. Lawler, E., Lenstra, J., Rinnooy Kan, A., Shmoys, D.: Sequencing and scheduling: algorithms and complexity. In: Rinnooy Kan, A.H.G., Graves, S.C., Zipkin, P.H., (eds.), Logistics of Production and Inventory, vol. 4, Chapter 9, pp. 445–522, Elsevier (1993) 16. Brucker, P., Schlie, R.: Job shop with multi-purpose machine. Computing 45, 369–375 (1990) 17. Garey, M.R., Johnson, D.S., Sethi, R.: The complexity of flowshop and jobshop scheduling. Math. Oper. Res. 1(2), 117–129 (1976) 18. Geiger, M.J.: Research report RR-12–01–01, January 2012, ISSN: 2192–0826, Helmut Schmidt University, Hamburg, Germany, https://d-nb.info/1023241773/34 (2012) 19. Brandimarte, P.: Routing and scheduling in a flexible job shop by tabu search. Ann. Oper. Res. 41, 157183 (1993) 20. Xing, L.N., Chen, Y.W., Yang, K.W.: An efficient search method for multi-objective flexible job shop scheduling problems. J. Intell. Manuf. 20(3), 283–293 (2009) 21. Hurink, J., Jurish, B., Thole, M.: Tabu search for the job- shop- scheduling problem with multi-purpose machines. OR Spectrum 15(4), 205–215 (1994) 22. Wang, L., Cai, J., Li, M., Liu, Z., Flexible job shop scheduling problem using an improved ant colony optimization. Sci. Program. (2017) 23. Mesghouni K., Hammadi, S., Borne, P.: Evolution programs for job shop scheduling. In: IEEE International Conference on Systems, Man, and Cybernetics (SMC’97), Orlando, FL, USA, pp. 720–724, (1997) 24. Liu, H., Abraham, A., Wang, Z.: A multi-swarm approach to multi-objective flexible job-shop scheduling problems. Fund. Inform. 95(4), 465–489 (2009) 25. 
Huang, S., Tian, N., Wang, Y., Ji, Z.: Multi-objective flexible job-shop scheduling problem using modified discrete particle swarm optimization. Springerplus 5(1), 1–22 (2016). https:// doi.org/10.1186/s40064-016-3054-z 26. Udaiyakumar, K.C., Chandrasekaran, M.: Application of firefly algorithm in job shop scheduling problem for minimization of makespan. Procedia Eng. 97, 1798–1807 (2014) 27. Low, C., Yip, Y., Wu, T.H.: Modelling and heuristics of FMS scheduling with multiple objectives. Comput. Oper. Res. 33(3), 674–694 (2006) 28. Tang, J., Zhang, G., Lin, B., Zhang, B.: A hybrid algorithm for flexible job-shop scheduling problem. Procedia Eng. 15, 3678–3683 (2011) 29. Javadi, R., Hasanzadeh, M.: A new method for hybridizing metaheuristics for multi-objective flexible job shop scheduling. In: 2012 2nd International e-Conference on Computer and Knowledge Engineering (ICCKE), pp. 105–110. IEEE (October 2012) (2012) 30. Mesghouni, K., Hammadi, S., Borne, P.: Evolutionary algorithms for job-shop scheduling. Int. J. Appl. Math. Comput. Sci. 14(1), 91–103 (2004)
J-Parallelio: Automatic Parallelization Framework for Java Virtual Machine
Piotr Listkiewicz, Krzysztof Stuglik, Mateusz Kulczyk, and Marcin Pietron(B)
Institute of Computer Science, AGH - University of Science and Technology, Cracow, Poland
[email protected]
https://deep-cogni.com/
Abstract. Manual translation of an algorithm from its sequential version to a parallel counterpart is time consuming and requires specific knowledge of the hardware accelerator architecture, parallel programming or the programming environment. Automating this process makes porting the code much easier and faster. The key aspect in this case is how efficient the generated parallel code will be. The paper describes J-Parallelio, a framework for automatic analysis of bytecode and its parallelisation on multicore processors. The process consists of a few steps. The first step decompiles the JVM bytecode and translates it to an internal abstract syntax tree; then dependency extraction and memory analysis are performed. Finally, the mapping process is performed, which consists of a set of rules responsible for translating the input virtual machine code to its parallel version. The main novelty is that the framework works on pure Java virtual machine code and can generate parallel code for multicore processors. This makes the system portable, and it can work with different JVM-based languages after some small modifications. The efficiency of the automatically translated codes was compared with that of their manually written counterparts on chosen benchmarks.
Keywords: Automatic Parallelization · Java Virtual Machine · Multicore Processors · HPC

1 Introduction
Over the last few years we have observed many attempts to build tools that help with automatic code parallelisation on different hardware platforms. We have also observed intensive research into the parallelisation of source codes written in C/C++ and their speed-up using the OpenMP and OpenCL environments [8,10,11]. Recently, tools for transforming C to CUDA or OpenCL for GPU acceleration have become a common topic of interest in automatic code parallelisation [3,4,6,7,10]. There were also attempts to automatically speed up other languages (e.g. Java [1,16,17,18]) on multicore processors. In the J-Parallelio system, a fairly new approach is presented.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 K. Arai (Ed.): SAI 2023, LNNS 739, pp. 307–320, 2023. https://doi.org/10.1007/978-3-031-37963-5_22
The framework is fully based on Java virtual machine code and is language independent (Fig. 1). The main input is Java virtual machine code. The JVM code is parsed, partially decompiled and transformed to intermediate structures, then analysed, instrumented and transformed back to virtual machine code in a parallelised version. The crucial element of its functionality is an engine that combines a decompiler and abstract syntax tree builder, a pool of algorithms responsible for dependency and memory access analysis, and a set of rules for Java virtual machine code transformation. In the first stage, the system parses the JVM code and extracts loops from it. It then generates an Abstract Syntax Tree with a Data Flow Graph, which helps to identify the potential parallelism by extracting internal dependencies. The system mainly concentrates on loop analysis. The framework performs the dependency analysis using the same algorithms as other automatic parallelisation systems [4,5,8,10]. Finally, the system maps the loop iterations and data structures to the hardware accelerator by using built-in translation rules. After the engine analysis, the system generates a parallel version of the Java virtual machine code. It instruments the code with instructions responsible for the creation of threads and automatically maps the loops to multi-threaded execution. The work presents the results of mapping a few benchmarks from sequential to parallel versions. The main advantage of the presented approach is its portability. There are several languages which are based on the Java virtual machine. JRuby and Jython are perhaps the most well-known ports of existing languages (Ruby and Python respectively). Of the new languages that have been created from scratch to compile to Java bytecode, Clojure, Groovy and Scala may be the most popular examples. The paper is organised as follows: the second section describes the related works in the field of automatic parallelisation. The next sections concentrate on the methodology and the algorithms which analyse the input sequential JVM code and translate it to a parallel version. The fifth section presents the results, and the sixth the conclusions and future work.
2 Related Works
The articles [1,18] and [19] present approaches for automatic Java source code parallelisation. In [1] dependency extraction methods are used for Java source code analysis. A translator is built that takes sequential Java code and generates a highly parallel version of the same program. The translation process interprets the AST nodes for signatures such as read-write access and execution-flow modifications, among others, and generates a set of dependencies between executable tasks. The presented approach has been applied to recursive Fibonacci and FFT algorithms. The methods obtained a 10.97x and 9.0x increase in speed on a twelve-core machine. The latter two methods [18] and [19] concentrate on parallelisation using trace information. The approach presented in [18] collects on-line trace information during program execution, and dynamically recompiles methods that can be executed in parallel. In [19], the authors implement a system
that demonstrates the benefits and addresses the challenges of using traces for data-parallel applications. They propose an execution model for automatic parallelisation based on traces. In [16], a novel approach is described and evaluated for the automatic parallelisation of programs that use pointer-based dynamic data structures written in Java. The approach exploits parallelism among methods by creating an asynchronous thread of execution for each method invocation in a program. The only work in which Java code parallelisation is done directly on Java virtual machine code is shown in [15]. It is semi-automatic: there is no detailed JVM code analysis, no decompilation and no automatic JVM code transformation or generation. It presents the process of the automatic instrumentation of virtual machine code by preparing and invoking special adapters which can run the original methods in a multi-threaded Java environment. The results are presented using the Mandelbrot benchmark. The drawback of the article is the lack of results for more benchmarks. Several other systems designed for automatic code parallelisation are mainly based on the C/C++ languages. YUCCA [2], designed by KPIT Technologies, is an automatic parallelisation tool for projects written in C. It provides source-to-source conversion: on input, it takes the source code of an application written in C and produces a parallelised version of the source code as output. The YUCCA output is a multithreaded version of the input with Pthreads or OpenMP pragmas inserted at appropriate places. YUCCA uses Pthreads to perform task parallelisation and OpenMP to make loops run in parallel. YUCCA consists of two main parts: the front-end, which is responsible for parsing the source code, and the back-end, which performs static dependency analysis to identify parts of the code that are worth parallelising. PLUTO [8,9] is an automatic parallelisation tool based on a polyhedral model.
PLUTO performs source-to-source transformation: it exposes coarse-grained parallelism and at the same time ensures data locality. The core transformation framework mainly works by finding affine transformations for efficient tiling. PLUTO performs parallelisation with OpenMP and the code is also transformed for locality. The tool provides a number of options to tune aspects such as tile sizes, unroll factors and outer loop fusion structure. C-to-CUDA [4] and PPCG [5] propose similar steps to solve the automatic GPGPU code-generation problem. They find parallel loops and create a polyhedral model from them; they then tile and map the loops to GPU blocks and threads and determine where to place the data. Par4All [10] is an automatic parallelising and optimising compiler for C and Fortran sequential programs. The purpose of this source-to-source compiler is to adapt existing applications to various hardware targets such as multicore systems, high performance computers and GPUs. It creates a new source code and thus allows the original source code of the application to remain unchanged. The auto-parallelisation feature of the Intel C++ Compiler [12] automatically translates serial portions of the input program into semantically equivalent multithreaded code. Automatic parallelisation determines the loops that are good candidates, performs the data-flow analysis to verify correct parallel execution, and
partitions the data for threaded code generation as is needed in programming with OpenMP directives. The OpenMP and auto-parallelisation applications provide the performance gains from shared memory on multiprocessor systems. AutoPar [11] is a tool which can automatically insert OpenMP pragmas into input serial C/C++ codes. For input programs with existing OpenMP directives, the tool double checks the correctness when the right option is turned on. Compared to conventional tools, AutoPar can incorporate user knowledge (semantics) to discover more parallelisation opportunities. The iPat/OMP [13] tool provides users with the assistance needed for the OpenMP parallelisation of a sequential program. This tool is implemented as a set of functions in the Emacs editor. All the activities related to program parallelisation, such as selecting a target portion of the program, invoking an assistance command, and modifying the program based on the assistance information shown by the tool, can be handled in the source program editor environment. OMP2MPI [14] automatically generates MPI source code from OpenMP, allowing the program to exploit non-shared-memory architectures such as clusters or Network-on-Chip-based (NoC-based) Multiprocessor Systems-on-Chip (MPSoC). OMP2MPI provides a solution that allows further optimisation by an expert who wants to achieve better results.
3 Methodology
The framework consists of a few submodules. The first is a decompilation module comprising the AST building components. It is responsible for transforming the Java bytecode to Java instructions and building an Abstract Syntax Tree from them. The loop extraction module works in parallel with decompilation. It extracts the loops from the Java bytecode. The loops are then analysed by a specialised algorithm to extract potential parallelism (see Sect. 3.3). After the analysis, the bytecode is mapped to a multithreaded version. The bytecode is instrumented with special instructions which are responsible for thread, task and memory management. The multithreaded bytecode can then be run or decompiled to any language based on the Java virtual machine. The main phases of the framework method are:
– Decompilation
– Abstract syntax tree building
– Loops extraction
– Data flow and dependency analysis
– Multithread mapping and JVM code instrumentation
3.1 Java Virtual Machine and BCEL
A Java virtual machine (JVM) is an abstract computing machine that enables a computer to run a Java program or a program in another language that is based on the JVM (e.g. Clojure, Groovy or Scala). One of the organisational units of JVM bytecode is a class. The JVM has instructions for the following groups of tasks: arithmetic operations, load and store, type conversion, object creation and manipulation, stack management (push/pop stack operations), control transfer (branching), field access, method invocation, throwing exceptions and monitor-based concurrency. The JVM operation set can be represented as: OP ARG, where OP belongs to the set of elementary operations available in the JVM and ARG is an argument. An argument can be a constant or a variable. Each instruction can take one or two arguments. The bytecode instruction set currently consists of 212 instructions, with a further 44 opcodes reserved. The presented framework uses the Byte Code Engineering Library (BCEL), which enables reading and manipulating Java bytecode. BCEL is intended to give users a convenient way to analyse, create, and manipulate Java class files. Classes are represented by objects which contain all the symbolic information of the given class: methods, fields and bytecode instructions, in particular. Such objects can be read from an existing file, transformed by a program (e.g. a class loader at run-time) and written to a file again.

Fig. 1. Automatic Java Virtual Machine Parallelization System. (Diagram blocks: Java virtual machine code; partial decompiler; abstract syntax tree; dependency and memory analysis; space and time mapper; JVM code instrumentation; virtual machine code generation; hardware accelerator configuration; code upload and execution.)

3.2 Decompilation and AST Building
The first stage of the framework is responsible for translating the raw Java bytecode to higher level instructions. Next, the translated instructions should be analysed to extract the dependencies between program components. An algorithm was built to fulfil this goal. Each single bytecode operation can pop elements from or push elements onto the JVM stack. Each single instruction in the high-level language ends when the stack is empty. Therefore, to extract a whole Java instruction, the module monitors the state of the stack. The approach is presented in Algorithm 1. It loads a class implementation in Java bytecode and gets a list of its methods (line 1). It also initializes the stack of the virtual machine (S) and the decompiled instruction collections (Ji and J).
Algorithm 1. Decompilation of the JVM
 1: S ← ∅, Ji ← ∅, J ← ∅, C ← load class, MC ← get methods in C
 2: for m in MC do
 3:   Iset ← get instructions from m
 4:   while Iset ≠ ∅ do
 5:     i ← remove first instruction from Iset
 6:     push i to S
 7:     while S ≠ ∅ do
 8:       i ← remove first instruction from Iset
 9:       if i ∈ OP then
10:         ARG ← pop ARG from S
11:         R ← OP ARG
12:         push R to S
13:       end if
14:       if i ∈ PUSH then
15:         push ARG to S
16:       end if
17:       if i ∈ POP then
18:         Ji ← pop from S
19:       end if
20:     end while
21:     return Ji
22:     J ← Ji ∪ J
23:   end while
24: end for
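The stack-monitoring idea behind Algorithm 1 can be illustrated with a minimal sketch. It is not the framework's implementation: it assumes a toy instruction format of the form "mnemonic operand", uses variable names where real bytecode uses local-variable indices, and supports only a handful of integer mnemonics. The class and method names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Toy model (an assumption of this sketch): each instruction is "mnemonic [operand]".
// Real bytecode addresses local variables by index; names are used here for readability.
class StackDecompilerSketch {

    // Emits one high-level statement for every store instruction.
    static List<String> decompile(List<String> bytecode) {
        Deque<String> stack = new ArrayDeque<>();     // simulated operand stack (S)
        List<String> statements = new ArrayList<>();  // decompiled statements (J)
        for (String instruction : bytecode) {
            String[] parts = instruction.trim().split("\\s+");
            switch (parts[0]) {
                case "iload":   // push a variable
                case "iconst":  // push a constant
                    stack.push(parts[1]);
                    break;
                case "iadd":
                case "isub":
                case "imul": {  // pop two operands, push the combined expression
                    String right = stack.pop();
                    String left = stack.pop();
                    stack.push("(" + left + " " + symbol(parts[0]) + " " + right + ")");
                    break;
                }
                case "istore":  // the expression on top of the stack becomes a statement
                    statements.add(parts[1] + " = " + stack.pop());
                    break;
                default:
                    throw new IllegalArgumentException("unsupported: " + instruction);
            }
        }
        return statements;
    }

    // Maps an arithmetic mnemonic to its infix operator.
    static String symbol(String mnemonic) {
        switch (mnemonic) {
            case "iadd": return "+";
            case "isub": return "-";
            default:     return "*";
        }
    }

    public static void main(String[] args) {
        List<String> code = Arrays.asList(
                "iload a", "iload b", "iadd", "iconst 2", "imul", "istore c",
                "iload c", "istore d");
        // Prints "c = ((a + b) * 2)" and then "d = c".
        decompile(code).forEach(System.out::println);
    }
}
```

Each store marks the point at which the simulated stack returns to the state it had before the statement started, which is the condition Algorithm 1 uses to delimit a high-level instruction.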
Then the algorithm iterates over the class methods. It gets an instruction list from each method m (line 3). It translates each instruction in sequence and monitors the state of the stack. At the beginning it pushes the first instruction i of Iset to S (line 6). In a while loop it processes the next bytecode instructions until the stack S is empty. If it recognizes an operation instruction, it takes the arguments of the operation from the stack (line 10). It forms the three-address instruction (line 11) and pushes it to the stack. If a PUSH operation is recognized (line 14), it pushes the argument of the bytecode instruction to the stack. Finally, if a single POP instruction is met, the algorithm takes the final decompiled instruction (Ji) from the stack (line 18). All decompiled instructions are stored in the list J (line 22). Algorithm 2 presents the building of the Abstract Syntax Tree from the list of decompiled instructions. It initializes the data structures (line 1) in which it stores the dependencies between instructions (T) and the list of assigned variables (L). Then, it goes through the list of instructions and takes the left and right hand variables from the analysed instruction (lines 3–5). It checks in a loop whether each right hand variable of Ji (from JiR) already exists in L. If it does, the algorithm takes the last instruction in which the specified right hand variable (var) has appeared (line 8).
Algorithm 2. AST Building
 1: T ← ∅, L ← ∅
 2: while J ≠ ∅ do
 3:   Ji ← remove first instruction from J
 4:   JiR ← get right hand variables of Ji
 5:   JiL ← get left hand variable of Ji
 6:   for var in JiR do
 7:     if var ∈ L then
 8:       IR ← take last instruction from L: var ∈ IR
 9:       d ← (IR → Ji)
10:       T ← T ∪ d
11:     end if
12:   end for
13:   L ← JiR ∪ L
14: end while
15: return T
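The dependency extraction of Algorithm 2 can be sketched as follows. This is a simplified reading, not the authors' code: it assumes each decompiled statement has one assigned variable and a list of read variables, and it records the last statement that assigned each variable so that a later reader can be linked to its producer. The `Stmt` type and all names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class DependencySketch {

    // Hypothetical statement form: one assigned variable and the variables it reads.
    static final class Stmt {
        final String target;
        final List<String> reads;

        Stmt(String target, String... reads) {
            this.target = target;
            this.reads = Arrays.asList(reads);
        }
    }

    // Returns dependency edges as {producerIndex, consumerIndex} pairs (the set T).
    static List<int[]> dependencies(List<Stmt> program) {
        Map<String, Integer> lastWriter = new HashMap<>(); // variable -> last assigning statement
        List<int[]> edges = new ArrayList<>();
        for (int k = 0; k < program.size(); k++) {
            Stmt s = program.get(k);
            for (String var : s.reads) {
                Integer producer = lastWriter.get(var);
                if (producer != null) {
                    edges.add(new int[] {producer, k}); // d = (IR -> Ji)
                }
            }
            lastWriter.put(s.target, k);
        }
        return edges;
    }

    public static void main(String[] args) {
        List<Stmt> program = Arrays.asList(
                new Stmt("c", "a", "b"),   // 0: c = a + b
                new Stmt("d", "c"),        // 1: d = c * 2
                new Stmt("e", "a"),        // 2: e = a + 1
                new Stmt("f", "d", "e"));  // 3: f = d + e
        // Prints "0 -> 1", "1 -> 3" and "2 -> 3".
        for (int[] edge : dependencies(program)) {
            System.out.println(edge[0] + " -> " + edge[1]);
        }
    }
}
```

Statements 0 and 2 have no edge between them, so a scheduler would be free to run them concurrently; this is the kind of independence the AST and data flow graph are meant to expose.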
The dependency d between these two instructions (IR and Ji) is extracted (line 9) and added to the set T. After the instruction is processed, its right hand variables are added to the set L (line 13). At the end of the algorithm the set of dependencies T is returned (line 15), which can be used directly to create the AST.

3.3 Loops Extraction and Data Flow Analysis
Loop extraction and loop analysis are the next step of the Java bytecode analysis. This is a crucial stage in automatic parallelisation because loops are the main source of hidden parallelism. Algorithm 3 presents how loops are extracted from the bytecode. The process starts by finding a GOTO jump instruction. Then the argument of the jump instruction is read (the jump address, line 2). The address is the start of the program loop. The algorithm goes to this location and parses the iteration variable with its initialization value and the loop boundary condition. It takes the list of instructions from the location just after the conditional bytecode instruction up to the GOTO instruction (line 6). The extracted body of the loop can be decompiled using Algorithm 1. Loops are very often nested. Therefore, to extract the hierarchy of the loops, the presented method should be run recursively.
Algorithm 3. The JVM Loop Extraction
1: Igoto ← find GOTO instruction
2: Ls ← take address from Igoto instruction
3: jump to Ls
4: i ← take iteration variable
5: B ← take condition boundary of the loop
6: L ← take block from condition to Igoto
7: decompile loop L //Algorithm 1
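A minimal sketch of the loop detection step of Algorithm 3 is given below. It assumes a simplified listing in which the list index of an instruction serves as its address and variables are named; real bytecode uses byte offsets and local-variable indices. The sketch only locates backward `goto` jumps and the first conditional branch after the jump target; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class LoopExtractionSketch {

    // Returns one {head, condition, backEdge} triple per backward goto.
    // The loop body is the instruction range (condition, backEdge).
    static List<int[]> extractLoops(List<String> code) {
        List<int[]> loops = new ArrayList<>();
        for (int pc = 0; pc < code.size(); pc++) {
            String[] parts = code.get(pc).trim().split("\\s+");
            if (!parts[0].equals("goto")) {
                continue;
            }
            int target = Integer.parseInt(parts[1]);
            if (target >= pc) {
                continue; // forward jump, not a loop back edge
            }
            int condition = target;
            // Scan from the loop head to the first conditional branch.
            while (condition < pc && !code.get(condition).startsWith("if")) {
                condition++;
            }
            loops.add(new int[] {target, condition, pc});
        }
        return loops;
    }

    public static void main(String[] args) {
        // Toy listing for: for (i = 0; i < n; i++) sum += i;
        List<String> code = Arrays.asList(
                "iconst_0",        // 0
                "istore i",        // 1
                "iload i",         // 2  loop head
                "iload n",         // 3
                "if_icmpge 11",    // 4  loop boundary condition
                "iload sum",       // 5  body starts
                "iload i",         // 6
                "iadd",            // 7
                "istore sum",      // 8
                "iinc i 1",        // 9  body ends
                "goto 2",          // 10 back edge
                "return");         // 11
        for (int[] loop : extractLoops(code)) {
            // Prints "head=2 condition=4 backEdge=10".
            System.out.println("head=" + loop[0] + " condition=" + loop[1] + " backEdge=" + loop[2]);
        }
    }
}
```

For nested loops every backward jump yields its own triple, and the nesting hierarchy can be recovered from the containment of the reported address ranges.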
After loop extraction the loop analysis is performed. Formal analysis is based on a polyhedral model; algorithms for dependency detection are run by using symbolic Fourier-Motzkin elimination.
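As a small worked illustration of how Fourier-Motzkin elimination can decide whether a loop carries a dependence, consider the hypothetical loop `for (i = 1; i <= N; i++) a[i + 1] = a[i] + 1;` (an example chosen here, not one of the paper's benchmarks). The write in iteration i and the read in iteration i' conflict if both iterations lie within the loop bounds and address the same array element:

```latex
% Write in iteration i touches a[i+1]; read in iteration i' touches a[i'].
\begin{align*}
  & 1 \le i \le N, \qquad 1 \le i' \le N, \qquad i' = i + 1 \\
  % Substitute i' = i + 1 into the bounds of i':
  & 1 \le i \le N, \qquad 0 \le i \le N - 1 \\
  % Eliminate i: every lower bound must not exceed every upper bound.
  & 1 \le N, \qquad 1 \le N - 1, \qquad 0 \le N, \qquad 0 \le N - 1 \\
  & \Longrightarrow \quad N \ge 2
\end{align*}
```

The projected system is satisfiable for every N ≥ 2 (for instance i = 1, i' = 2), so the loop carries a dependence of distance 1 and cannot be split across threads as written. If the elimination had produced a contradiction, the iterations would be independent. Fourier-Motzkin elimination is exact over the rationals; in general an additional check for integer solutions is needed.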
4 Java Virtual Machine Automatic Parallelisation Module
Algorithm 4 describes the process of instrumenting the JVM bytecode. The first two steps are responsible for initialising the thread executors and the task list (lines 1 and 2). The next part depends on the type of parallelism. If data-driven parallelism is recognized (lines 3–7, e.g. histogram), the input data is divided into independent data chunks (line 4). The output data is copied so that each parallel thread gets a separate instance (line 5). Then subtask methods are created to work on these data chunks (line 6). At the end, the single-threaded code responsible for merging the result data is added (line 7).

Algorithm 4. The JVM Parallelisation
 1: Ex ← create executors
 2: [T1, T2, ..., TN] ← create task list
 3: if Pt is DP then
 4:   [Di1, Di2, ..., DiN] ← divide the input data Di
 5:   [Do1, Do2, ..., DoN] ← make copy of output data Do
 6:   P ← add pool of subtasks (Din, Don)
 7:   Do ← add merging output data D
 8: end if
 9: if Pt is IP then
10:   [(start1, step1), (start2, step2), ..., (startN, stepN)] ← divide iterations to chunks
11:   P ← add pool of subtasks in parallelised regions (startn, stepn)
12: end if
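The data-driven branch of Algorithm 4 (lines 3–7) corresponds, at source level, to a pattern like the following hedged sketch of a parallel histogram. The framework emits equivalent instrumentation directly in bytecode; the sketch instead uses the standard `java.util.concurrent` API, and the class name, method name and chunking scheme are assumptions made for illustration. Input values are assumed to lie in the range 0 to bins − 1.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ParallelHistogramSketch {

    // Counts occurrences of each value in data using `threads` worker tasks.
    static long[] histogram(int[] data, int bins, int threads) {
        ExecutorService executor = Executors.newFixedThreadPool(threads); // create executors
        try {
            List<Future<long[]>> tasks = new ArrayList<>();               // task list
            int chunk = (data.length + threads - 1) / threads;            // ceiling division
            for (int t = 0; t < threads; t++) {
                final int start = Math.min(data.length, t * chunk);       // input chunk bounds
                final int end = Math.min(data.length, start + chunk);
                tasks.add(executor.submit(() -> {
                    long[] local = new long[bins];                        // private output copy
                    for (int k = start; k < end; k++) {
                        local[data[k]]++;
                    }
                    return local;
                }));
            }
            long[] merged = new long[bins];
            for (Future<long[]> task : tasks) {                           // single-threaded merge
                long[] local = task.get();
                for (int b = 0; b < bins; b++) {
                    merged[b] += local[b];
                }
            }
            return merged;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("histogram task failed", e);
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        int[] data = {0, 1, 1, 2, 2, 2};
        // Prints [1, 2, 3].
        System.out.println(Arrays.toString(histogram(data, 3, 4)));
    }
}
```

Giving each task its own output array avoids synchronisation on shared bins at the cost of one extra array per task and a final merge pass, which is the trade-off described for lines 5 and 7 of Algorithm 4.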
The JVM instrumentation is finally used in the main parallelisation algorithm, which is shown in Algorithm 5. The algorithm tries to find the best parallel configuration of the input sequential program. The parallelisation concentrates mainly on the program loops extracted by Algorithm 3 (line 2). Then the dependency analysis is performed (line 3). A loop can be fully parallelisable, partially parallelisable or unable to run in parallel (line 4). In the case of fully parallel loops, the main decision is which loop to choose in a nested loop structure. Additionally, loops can be transformed by interchanging, tiling, skewing etc. By making appropriate selections from this choice of transformations it is possible to achieve a better mapping and a more efficient implementation. All these configurations are generated in line 5. Then they are tested in a loop (lines 6–10). Each candidate transformation is parallelised in the JVM. Next, the candidates are run with a reduced number of iterations r (line 8). Finally, the most efficient configuration is chosen (lines 12 and 13).
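The selection step described above (run every candidate on a reduced workload, keep the fastest) can be sketched as a small timing harness. This is an illustration under assumptions, not the framework's code: the candidates are ordinary Java callbacks rather than generated bytecode variants, the parameter r is interpreted as a reduced problem size, and a single un-warmed timing per candidate is used, which is noisy on a real JVM. All names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.IntConsumer;

class ConfigurationSelectionSketch {

    // Times each candidate on a reduced problem size r and returns the name of the fastest one.
    static String selectFastest(Map<String, IntConsumer> candidates, int r) {
        String best = null;
        long bestNanos = Long.MAX_VALUE;
        for (Map.Entry<String, IntConsumer> candidate : candidates.entrySet()) {
            long start = System.nanoTime();
            candidate.getValue().accept(r);            // e = run(L, r)
            long elapsed = System.nanoTime() - start;
            if (elapsed < bestNanos) {                 // keep the arg min over the timings
                bestNanos = elapsed;
                best = candidate.getKey();
            }
        }
        return best;
    }

    // Hypothetical workload: n x n matrix product in two loop orders (loop interchange).
    static void multiply(int n, boolean interchanged) {
        double[][] a = new double[n][n], b = new double[n][n], c = new double[n][n];
        for (int i = 0; i < n; i++) {
            for (int x = 0; x < n; x++) {
                for (int y = 0; y < n; y++) {
                    if (interchanged) {
                        c[i][y] += a[i][x] * b[x][y];  // i-k-j order
                    } else {
                        c[i][x] += a[i][y] * b[y][x];  // i-j-k order
                    }
                }
            }
        }
    }

    public static void main(String[] args) {
        Map<String, IntConsumer> candidates = new LinkedHashMap<>();
        candidates.put("i-j-k", n -> multiply(n, false));
        candidates.put("i-k-j", n -> multiply(n, true));
        // The winner depends on the machine and JIT state, so no fixed output is claimed.
        System.out.println("selected: " + selectFastest(candidates, 200));
    }
}
```

A production version would repeat each measurement and discard warm-up runs before comparing candidates.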
Algorithm 5. The Parallelisation Algorithm
 1: E ← ∅, V ← ∅
 2: Ld ← Algorithm 3
 3: Lp ← Fourier-Motzkin(Ld)
 4: Pt ← check parallelism type(Ld)
 5: Lt ← get loop transformations
 6: for lt in Lt do
 7:   L = parallelise jvm(lt) //Algorithm 4
 8:   e = run(L, r)
 9:   E ← E ∪ e
10:   V ← V ∪ L
11: end for
12: idmin ← arg min E
13: return V[idmin]

5 Results
The presented framework was run on the following benchmarks: matrix multiplication, histogram computing, the vanilla NBody problem and the Fast Fourier Transform. Two parameters, efficiency and speedup, are the main indicators of the quality of a parallelisation algorithm. The efficiency (E) is defined as:

$$E(N, P) = \frac{S(N, P)}{P} = \frac{T(N, 1)}{P \cdot T(N, P)} \tag{1}$$

and the speedup (S) as:

$$S(N, P) = \frac{T(N, 1)}{T(N, P)} \tag{2}$$

where:
– N is the size of the problem,
– P is the number of cores,
– T(N, P) is the execution time for a problem of size N on P cores.

In Table 1 and Figs. 2 and 3 the results for matrix multiplication are described.
Table 1. Results of Matrix Multiplication with Efficiency Parameter E and Acceleration S.

P  | 1024 × 1024        | 4096 × 4096          | 8192 × 8192
   | T[s]   E     S     | T[s]     E     S     | T[s]      E     S
1  | 1.15   1     1     | 463.41   1     1     | 4248.56   1     1
2  | 0.78   0.74  1.48  | 227.99   1.02  2.03  | 2180.12   0.97  1.95
4  | 0.42   0.68  2.72  | 114.42   1.01  4.05  | 1087.61   0.98  3.91
8  | 0.30   0.49  3.90  | 60.57    0.96  7.65  | 568.00    0.93  7.48
10 | 0.37   0.31  3.10  | 59.24    0.78  7.82  | –         –     –
12 | 0.39   0.24  2.94  | 57.50    0.67  8.06  | –         –     –
14 | 0.42   0.20  2.76  | 55.49    0.60  8.35  | –         –     –
16 | 0.46   0.16  2.49  | 54.98    0.53  8.43  | 538.77    0.49  7.89
In Table 1, the results for different matrix sizes are shown (1024 × 1024, 4096 × 4096 and 8192 × 8192). The efficiency is presented for the serial and multicore versions. The results are reported for different numbers of cores. It can be observed that for smaller matrices (1024 × 1024) the peak performance is reached when using eight cores.
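The two indicators from Eqs. (1) and (2) can be recomputed directly from measured execution times. The following minimal sketch does so for one entry of Table 1 (the 4096 × 4096 case on eight cores, using the T[s] values reported there); the class and method names are hypothetical.

```java
import java.util.Locale;

class ScalingMetricsSketch {

    // Speedup S(N, P) = T(N, 1) / T(N, P).
    static double speedup(double serialSeconds, double parallelSeconds) {
        return serialSeconds / parallelSeconds;
    }

    // Efficiency E(N, P) = S(N, P) / P.
    static double efficiency(double serialSeconds, double parallelSeconds, int cores) {
        return speedup(serialSeconds, parallelSeconds) / cores;
    }

    public static void main(String[] args) {
        double serial = 463.41;   // T(4096, 1) from Table 1
        double parallel = 60.57;  // T(4096, 8) from Table 1
        // Prints "S = 7.65, E = 0.96", matching the corresponding table row.
        System.out.println(String.format(Locale.ROOT, "S = %.2f, E = %.2f",
                speedup(serial, parallel), efficiency(serial, parallel, 8)));
    }
}
```

An efficiency close to 1 means the cores are used almost ideally; the drop for the 1024 × 1024 case at higher core counts reflects the thread management overhead dominating the small workload.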
Fig. 2. Logarithmic Chart from Serial and Parallel Time Execution of Matrix Multiplication.
Fig. 3. Logarithmic Chart from Matrix Multiplication on Different Number of Processor Cores.
If the size is bigger (4096 × 4096 and 8192 × 8192), the best speedup is achieved when using all sixteen available cores. Figures 2 and 3 present these results using a logarithmic scale. Figure 2 shows a comparison of execution times between the serial and the automatically generated parallel versions. It can be observed that from a size of around 256 × 256 the generated code outperforms the serial one.
Fig. 4. Scalability of Histogram.
Figures 4 and 5 show histogram efficiency in relation to the size of the input data and the number of cores. The histogram algorithm has data-driven parallelism.
Fig. 5. Histogram Efficiency.
When the amount of data is about 10^7 or higher, the acceleration becomes noticeable (Fig. 4). Figure 6 presents the Nbody efficiency.
Fig. 6. Nbody Efficiency.
It shows that the maximum speedup achieved by the automatically generated code was around 1.5 (for eight cores). Figure 7 describes the FFT scalability. The serial version is slightly more efficient than the parallel one. In all experiments the automatic parallel version is at most 15% worse than its manually created counterpart.
Fig. 7. Logarithmic Chart from FFT with Different Number of Data Points.
All experiments were run on an Intel Core i9-9900K processor, 3.6 GHz, RAM 16MB. Each experiment was repeated five times and average values were computed. The version of Java used in the experiments was JDK 12.0.
6 Conclusions and Future Work
The presented results show that the described automatic translation algorithms can speed up various algorithms running on the Java virtual machine. Moreover, in many cases the generated parallel code can be as efficient as manually written code. Additionally, the described system can choose a proper accelerator and an appropriate strategy by using machine learning approaches. Future work will concentrate on further improvements in automatic parallelisation and on testing the JVM parallelisation modules on more languages such as Scala, JRuby, etc. New improvements will also concern machine learning techniques for execution parameter prediction and partial parallel code generation. Further work will also concentrate on testing parallelisation on more complex benchmark algorithms.
References
1. Rafael, J., Correia, I., Fonseca, A., Cabral, B.: Dependency-based automatic parallelization of java applications. In: Lopes, L., et al. (eds.) Euro-Par 2014. LNCS, vol. 8806, pp. 182–193. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-14313-2_16
2. Smitha, K.P., Sahasrabudhe, A., Vaidya, V.: Method of extracting parallelization in very large applications through automated tool and iterative manual intervention. Center for Research in Engineering Sciences and Technology (CREST), KPIT Technologies, Pune, India (2012)
3. Elmqvist, H., Olsson, H., Goteman, A., Roxling, V., Zimmer, D., Pollok, A.: Automatic GPU code generation of Modelica functions. In: 11th International Modelica Conference, Paris, France (2015). https://doi.org/10.3384/ecp15118235
4. Baskaran, M.M., Ramanujam, J., Sadayappan, P.: Automatic C-to-CUDA code generation for affine programs. In: Gupta, R. (ed.) CC 2010. LNCS, vol. 6011, pp. 244–263. Springer, Heidelberg (2010). https://doi.org/10.1007/978-3-642-11970-5_14
5. PPCG Project. http://freecode.com/projects/ppcg
6. Setoain, J., Tenllado, C., Gomez, J.I., Arenaz, M., Prieto, M., Tourino, J.: Towards Automatic Code Generation for GPUs (2010)
7. Hou, K., Wang, H., Feng, W.: GPU-UniCache: Automatic Code Generation of Spatial Blocking for Stencils on GPUs (2017)
8. Bondhugula, U., Baskaran, M., Krishnamoorthy, S., Ramanujam, J., Rountev, A., Sadayappan, P.: Automatic transformations for communication-minimized parallelization and locality optimization in the polyhedral model. In: Hendren, L. (ed.) CC 2008. LNCS, vol. 4959, pp. 132–146. Springer, Heidelberg (2008). https://doi.org/10.1007/978-3-540-78791-4_9
9. Bondhugula, U., Hartono, A., Ramanujam, J., Sadayappan, P.: A practical automatic polyhedral parallelizer and locality optimizer. In: ACM SIGPLAN Programming Languages Design and Implementation (PLDI), Tucson, Arizona (2008)
10. Amini, M., et al.: Par4All: from convex array regions to heterogeneous computing. In: IMPACT 2012: Second International Workshop on Polyhedral Compilation Techniques, Paris, France (2012)
11. Liao, C., Quinlan, D.J., Willcock, J.J., Panas, T.: Semantic-aware automatic parallelization of modern applications using high-level abstractions. J. Parallel Program. 38, 361–378 (2010)
12. Intel Corporation. Intel® oneAPI DPC++/C++ Compiler (2021). https://software.intel.com/
13. Ishihara, M., Honda, H., Sato, M.: Development and implementation of an interactive parallelization assistance tool for OpenMP: iPat/OMP. IEICE Trans. Inf. Syst. 89(2), 399–407 (2006)
14. Saa-Garriga, A., Castells-Rufas, D., Carrabina, J.: OMP2MPI: automatic MPI code generation from OpenMP programs. In: High Performance Energy Efficient Embedded Systems (2015)
15. Felber, P.A.: Semi-automatic parallelization of java applications. In: Meersman, R., Tari, Z., Schmidt, D.C. (eds.) OTM 2003. LNCS, vol. 2888, pp. 1369–1383. Springer, Heidelberg (2003). https://doi.org/10.1007/978-3-540-39964-3_86
16. Chan, B., Abdelrahman, T.S.: Run-time support for the automatic parallelization of java programs. J. Supercomput. 28, 91–117 (2004)
17. Han, G., Zhang, C., Lam, K.T., Wang, C.L.: Java with auto-parallelization on graphics coprocessing architecture. In: 42nd International Conference on Parallel Processing (ICPP 2013), Lyon, France (2013)
18. Sun, Y., Zhang, W.: On-line trace based automatic parallelization of java programs on multicore platforms. In: 15th Workshop on Interaction between Compilers and Computer Architectures (2011)
19. Bradel, B.J., Abdelrahman, T.S.: Automatic trace-based parallelization of java programs. In: International Conference on Parallel Processing, ICPP 2007 (2009)
Hyper Burst Buffer: A Lightweight Burst Buffer I/O Library for High Performance Computing Applications
Erick Fredj(B) and Michael Laufer
Toga Networks, a Huawei Company, Tel Aviv, Israel
[email protected]
Abstract. The growth of the computing power of large-scale HPC clusters has led to the need for large high-bandwidth I/O systems. However, directly writing large and bursty datasets to centralized parallel file systems can raise significant I/O contention issues on storage servers. To meet this requirement, high-speed local storage layers, known as Burst Buffers, are being deployed in next-generation supercomputers. In this paper, we introduce the design of a burst buffer library integrated at the MPI layer, Hyper Burst Buffer (HBB), designed for aggregated write operations. HBB was implemented as a part of the Huawei Hyper-MPI (HMPI) framework, providing a storage framework that utilizes the efficient communication infrastructure of MPI and the fast I/O capabilities of node-local NVMe storage devices. HBB seamlessly caches aggregation buffers to a local storage device and employs a lightweight background daemon that flushes burst buffer entries to a remote parallel file system, thereby achieving asynchronous I/O that is transparent to the application and requires minimal memory overhead. The choice to integrate a burst buffer library at the MPI level encourages adoption by researchers in the scientific community due to the simple setup and usage. Our experiments on diverse scientific applications demonstrate that HBB is able to reduce the I/O overhead of applications by an order of magnitude as well as decrease the total runtime of some I/O intensive cases by 50%.
Keywords: Burst Buffer · Data Storage · HPC · MPI · MPI-I/O · Parallel I/O

1 Introduction
Scientific applications for High Performance Computing (HPC) face a growing I/O challenge. This places increasing demands on HPC storage subsystems, which leads to bottlenecks in simulation pipelines such as weather forecasting models. Because of the complex interdependencies between I/O middleware and hardware, obtaining good I/O performance for a wide range of applications is a significant challenge. The parallel file system and I/O middleware layers all provide optimizations which, in theory, can result in improved performance. Unfortunately, finding the right combination of parameters takes considerable time and expertise, so in practice I/O performance is often poor. As we move into the domain of extreme parallelism at Exascale, we must address this I/O challenge if such systems are to provide adequate performance and efficiency to the scientific application user communities.
This study investigates a new burst buffer library integrated at the MPI level that has been tested across a variety of scientific applications, including Numerical Weather Prediction (NWP) and Computational Fluid Dynamics (CFD). The paper is structured as follows. Section 2 briefly discusses related burst buffer and asynchronous I/O storage technologies that have been introduced in recent years. Section 3 discusses the background and current state of parallel I/O implementations integrated at the MPI level, known as MPI-I/O. Section 4 details the architecture and implementation of the proposed Hyper Burst Buffer solution, as well as key design considerations. Section 5 introduces the I/O intensive HPC applications that serve as the test beds for the performance evaluation of this work, namely the Weather Research and Forecasting Model (WRF) [18] and the PyFR CFD package [24]. Section 6 analyzes the results of a set of I/O performance evaluations, and shows the significant wall time savings afforded by the solution proposed in this paper. Finally, Sect. 7 presents conclusions and next steps of this work.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023. K. Arai (Ed.): SAI 2023, LNNS 739, pp. 321–337, 2023. https://doi.org/10.1007/978-3-031-37963-5_23
2
Related Work
Improving I/O performance on large-scale HPC systems through the use of burst buffers, as well as asynchronous I/O that overlaps compute and data storage, has gained broad attention over the past decade. A number of studies have introduced new I/O middleware libraries. MPI-I/O [7], PnetCDF [14], and HDF5 [20] boost I/O performance using parallel I/O involving a number of participating processes. These libraries primarily aim to optimize I/O for use with a remote Parallel File System (PFS). While they do greatly improve I/O performance compared to serial methods, their performance is still capped by the bandwidth and availability of the PFS.
Recently, burst buffer based solutions have been developed to address the increasing I/O needs of the post-petascale era. Temporary burst buffer file system libraries, such as GekkoFS [21], BeeOND [1], BurstFS [23], and UnifyFS [15], create a user-space distributed parallel file system on node-local burst buffers. These solutions show promising results on many workloads, but due to their ephemeral nature the distributed data must still be flushed to the parallel file system at the end of the job, and the end-to-end job completion time is not always considered in their evaluations. Additionally, in order to write to the multi-node distributed burst buffer array, additional communication (on top of any existing MPI-I/O communication during the shuffle phase of the two-phase algorithm) is required within the burst
buffer file system library. This is in contrast to the inherently communication-free solution proposed here, which caches local buffers to a local storage device. The work closest to the one proposed here is arguably the effort by Sugihara et al. [19], who also implement a burst buffer library within the MPI-I/O layer (ROMIO) that interfaces with their own underlying Gfarm file system. While the researchers claim performance that is significantly higher than competing burst buffer solutions such as BeeOND [1], the need to install an additional unconventional file system likely hinders the adoption of this otherwise high performance solution. The Hyper Burst Buffer library, on the other hand, interfaces directly with any underlying POSIX compliant file system. Additionally, the library is expected to be included directly in a future release of the open-source Hyper MPI library, allowing for easy installation and use.
Another related work is the integration of burst buffer functionality within the PnetCDF middleware library [12]. The work uses a log-based format to store individual I/O requests on the burst buffer before sending a single aggregated call to the underlying MPI-I/O library (Fig. 2 shows the relation between the I/O library and the MPI-I/O layer). This method allows large contiguous chunks of data to be written to the PFS rather than many small and fragmented ones, without the use of extra system memory for buffering. At large scale, this approach achieves an impressive speedup over default PnetCDF, but its main disadvantage is that the flushing is done in a blocking manner, with the aggregated I/O request being handed off to the normal MPI-I/O file write call while the compute cores sit idle. In addition, the PnetCDF burst buffer solution is only relevant for applications using the PnetCDF library, while applications using the HDF5 library, for example, would not be accelerated.
As our solution is integrated at the lower MPI level, it accelerates all applications that utilize MPI-I/O collective write operations.
Proprietary burst buffer solutions offered by hardware and software vendors have also been released in recent years. DDN offers a rack-mounted burst buffer device, the Infinite Memory Engine (IME) [2], that provides high-speed flash based storage close to the compute nodes while coordinating flushes to the PFS. Similarly, Cray’s DataWarp [11] uses dedicated burst buffer nodes located close to the compute nodes that can be used in place of the PFS, with the ability to handle bursty workloads that would otherwise overload the centralized storage system. A downside of this solution is that the data transfer between the burst buffer nodes and the PFS must be managed by the user. Cray also offers a higher-performing software solution, Data Elevator [9], that uses fewer computational resources than DataWarp and also automatically manages the flushes to the PFS. This solution uses the HDF5 VOL plug-in mechanism, which limits its usefulness to HDF5 based applications. Lastly, the ORNL tools SymphonyFS Distributed Filesystem and Spectral Intercept Library [17] are both designed to integrate the node-local burst buffers on Summit with their Spider 3 PFS. Spectral is a relatively straightforward tool to speed up N-N, file-per-process I/O patterns. This tool uses a client and server
method to sync these files from the node-local burst buffer to the parallel file system asynchronously while the application continues. This approach is sound for N-N, file-per-process I/O patterns (which do not pass through MPI-I/O), but is not applicable to N-1, single-shared-file output patterns that utilize MPI-I/O. To that end, ORNL is also developing SymphonyFS, a FUSE-based client that transparently combines the node-local burst buffer file system with the remote parallel file system, while syncing and coordinating the data transfers in the background. The combined burst buffer and PFS appear as a single coherent namespace to the compute nodes on the system, which allows the SymphonyFS solution to work with N-1 I/O patterns. The ORNL team has published test results of synthetic I/O cases with SymphonyFS, but has yet to show performance speedups using real-world HPC applications.
3
Background
The storage solution space for High Performance Computing clusters has evolved alongside the compute hardware, albeit at a slower pace than the compute elements generating the data. The addition of accelerators such as Graphics Processing Units (GPUs) and dedicated vector processing units has in recent years further compounded this issue. While Solid State Drives (SSDs) have been around for over a decade, their price per GB has been deemed too high for them to serve as the primary storage device for large-scale clusters, due to the sheer capacity that is needed. As such, exascale machines being designed for next-generation supercomputing centers will continue to use the mechanical Hard Disk Drive as their main storage target. As an intermediate step, dedicated I/O nodes equipped with high-speed solid state storage are being added into the HPC network topology to act as dedicated “Burst Buffer Nodes” that absorb the bursty I/O data generated [3]. An increasingly common trend is to add a high-speed node-local NVMe SSD to the compute nodes themselves [17], which allows for a linear scaling of potential I/O bandwidth as more nodes are employed for a given job. Utilizing this potential high-speed bandwidth to accelerate real-world HPC applications is the aim of this work.
At the heart of the two main HPC I/O middleware libraries, PHDF5 and PnetCDF, lies the MPI-I/O library located within the MPI implementation (see OMPIO and ROMIO in Fig. 2). MPI-I/O is used to facilitate collective write and read operations from all participating MPI ranks in a job. The primary method used by both main MPI-I/O libraries, OMPIO and ROMIO, is the “two-phase” algorithm [8]. In the first phase, known as the “Shuffle Phase”, a subset of processes is designated as “aggregators”. Each aggregator is assigned a group of processes, from which it collects the scattered data and rearranges it into a single temporary buffer known as the “collective buffer” or “aggregator buffer”. In the second phase, the “I/O Phase”, the collective buffers are written to the PFS. A schematic demonstrating a single cycle of a collective two-phase write can be seen in Fig. 1.
Fig. 1. Schematic of a Two-Phase MPI-I/O Collective Write Operation. Distributed Process Data is Aggregated in Aggregator Buffers during the Shuffle Phase before being Sent to the Parallel File System during the I/O Phase.
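The two-phase idea can be illustrated with a small single-process simulation. This is a minimal sketch and not the OMPIO or ROMIO implementation: the function names, the even split of the file into per-aggregator domains, and the tiny 16-byte file are assumptions made purely for illustration, and plain Python lists stand in for MPI communication.

```python
from typing import List, Tuple

Piece = Tuple[int, bytes]  # (byte offset in the shared file, payload)


def shuffle_phase(per_process: List[List[Piece]], file_size: int,
                  num_aggregators: int) -> List[bytearray]:
    """Phase 1: route each byte to the aggregator owning its file domain."""
    domain = -(-file_size // num_aggregators)  # ceiling division
    buffers = [bytearray(max(0, min(domain, file_size - a * domain)))
               for a in range(num_aggregators)]
    for pieces in per_process:
        for offset, payload in pieces:
            for i, byte in enumerate(payload):
                agg, local = divmod(offset + i, domain)
                buffers[agg][local] = byte
    return buffers


def io_phase(buffers: List[bytearray]) -> Tuple[bytes, int]:
    """Phase 2: one contiguous write per non-empty aggregator buffer."""
    shared_file = bytearray()
    writes = 0
    for buf in buffers:
        if buf:
            shared_file += buf  # stands in for a single large PFS write
            writes += 1
    return bytes(shared_file), writes


# Four processes, each holding two interleaved 2-byte fragments of a 16-byte file.
per_process = [[(2 * p + 8 * k, bytes([65 + p]) * 2) for k in range(2)]
               for p in range(4)]
file_bytes, num_writes = io_phase(shuffle_phase(per_process, 16, 2))
print(file_bytes, num_writes)
```

In this toy case the eight scattered fragments reach the file system as two contiguous writes, one per aggregator, which is the effect the schematic in Fig. 1 depicts.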
This two-phase technique is used for a number of reasons. Firstly, if no collective buffering were used, each process would send multiple fragmented I/O requests to the PFS at the same time, and at scale this will cause I/O contention issues as the number of concurrent I/O requests greatly exceeds the number of storage targets, resulting in a performance slowdown. Additionally, the nominal collective buffer size can be adjusted to match the ideal stripe size for the PFS, further increasing performance compared to small fragmented writes. The next section discusses the design of the Hyper Burst Buffer library and the tight integration within the MPI-I/O software layer, with the aim of utilizing high speed node-local storage for collective writes.
4
Design and Implementation
In this section, we present an architectural overview of the proposed burst buffer framework, called Hyper Burst Buffer. The Hyper Burst Buffer library is composed of two parts: the MPI-I/O Burst Buffer Cache module and a separate Burst Buffer Flush Daemon. The pair of modules can be seen in Fig. 2 in relation to the other components of the HPC I/O stack. The MPI-I/O Burst Buffer Cache module interrupts the two-phase algorithm explained in Sect. 3 and packages remote PFS write requests from the aggregator buffers into data packets, which are cached on the node-local burst buffer. Meanwhile, the Burst Buffer Flush Daemon orchestrates coordinated background flushes of the burst buffer data packets to the PFS. As a result, Hyper Burst Buffer is designed to run both modules concurrently, with the flushing daemon running on I/O-only cores. A schematic of the Hyper Burst Buffer operation can be seen in Fig. 3. The following sections discuss the design and implementation of the two modules.
Fig. 2. HPC I/O Software Stack, Showing the Hyper Burst Buffer Cache Module Integrated into MPI-I/O as well as the Accompanying Burst Buffer Flush Daemon Software, Both Highlighted in Red.
4.1
MPI-I/O Burst Buffer Cache Module
The Hyper Burst Buffer MPI-I/O cache module is implemented within the OMPIO MPI-I/O framework of the Open MPI based Huawei Hyper-MPI. In particular, it extends the File Byte Transfer Layer (FBTL). In Open MPI based implementations, the FBTL provides the abstraction layer for read and write operations and effectively makes up the second phase of the two-phase algorithm in MPI-I/O collective I/O operations [4]. A new FBTL, posix_bb, which is based on the posix FBTL, was written as part of the Hyper Burst Buffer solution. The layout of the I/O backend space of HMPI, based on Open MPI, with the new posix_bb FBTL can be seen in Fig. 4. The new posix_bb FBTL extends posix by adding burst buffer cache functionality.
Within the posix_bb FBTL, two-phase collective write operations are packaged into a burst buffer data packet that consists of a metadata header as well as the raw collective buffer data. The metadata header and data together represent the minimum needed to reconstitute the distributed burst buffer data into a single file using the Burst Buffer Flush Daemon. The metadata header is made up of the final file system path, the byte offset into the target file, and the length of the raw collective buffer data. An illustration of the burst buffer data packet can be seen in Fig. 5 (illustration not to scale). The nominal data packet size is dictated by the size of the collective buffers in MPI-I/O, and is 32 MB by default. This is configurable at runtime in the OMPIO based implementation by setting --mca io_ompio_bytes_per_agg to the desired size in bytes. Although not examined in this study, increasing the buffer size would create fewer burst buffer data packets and potentially further reduce write overhead.
Read operations, on the other hand, are untouched when using the Hyper Burst Buffer library, and normal PFS file reads are executed. While a complex prefetching read-ahead method utilizing burst buffers could be implemented, the bulk of I/O time for large jobs in HPC applications is spent in writes, as read-heavy workloads such as post-processing jobs generally use a lower node count than the computation.
Fig. 3. Hyper Burst Buffer Operation Schematic. Aggregator Buffers are Cached in Packets on the Node-Local NVMe Burst Buffer between the Two Phases of the “two-phase” Collective I/O Method in MPI-I/O. The “I/O Phase” is Moved to a Background Process, Allowing for Overlapping I/O and Compute.
Fig. 4. I/O Modules as Part of HMPI, an Open MPI based Implementation. The New posix_bb FBTL Module is Marked in Red as Part of the Hyper Burst Buffer Solution.
Fig. 5. Layout of the Hyper Burst Buffer Data Packet. The Metadata Header Occupies the First Few Bytes of the Packet. The Figure is not to Scale.
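The packet layout of Fig. 5 can be made concrete with a short sketch. The three header fields (file path, offset, length) come from the text, but the binary encoding below is an assumption: the paper does not specify field widths, byte order, or how the path is delimited, so the little-endian `<HQQ` layout and the function names `pack_packet` and `unpack_packet` are hypothetical.

```python
import struct
from typing import Tuple

# Assumed fixed-size prefix: path length (uint16), file offset (uint64),
# data length (uint64). The real HBB header encoding is not given in the paper.
HEADER = struct.Struct("<HQQ")


def pack_packet(path: str, offset: int, data: bytes) -> bytes:
    """Build one packet: metadata header followed by raw collective buffer data."""
    encoded_path = path.encode("utf-8")
    return HEADER.pack(len(encoded_path), offset, len(data)) + encoded_path + data


def unpack_packet(packet: bytes) -> Tuple[str, int, bytes]:
    """Recover (target path, byte offset, data) from a packet."""
    path_len, offset, length = HEADER.unpack_from(packet, 0)
    body = HEADER.size + path_len
    path = packet[HEADER.size:body].decode("utf-8")
    data = packet[body:body + length]
    if len(data) != length:
        raise ValueError("truncated burst buffer packet")
    return path, offset, data


# Hypothetical target path and a 32 MB offset, used only as an example.
packet = pack_packet("/pfs/run/output.nc", 32 * 1024 * 1024, b"\x01\x02\x03")
print(unpack_packet(packet))
```

Because every packet carries its own target path and offset, packets can be replayed in any order, and at any later time, to reconstitute the output file, which is the property the metadata header design relies on.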
With this approach, the I/O phase of the two-phase method is no longer limited by the bandwidth and availability of the remote PFS; instead, it writes data packet files directly to high-speed local solid state disks that excel at writing these small packets. Moreover, the potential I/O bandwidth now scales linearly as more nodes are used in a job. Thus we effectively utilize node-local burst buffers for high performance MPI-I/O collective writes. The decision to use a metadata header approach, as opposed to some form of distributed metadata server-client on the compute nodes, provides an extra layer of resiliency, as all of the data needed to reconstitute the output files is located in full on the node-local burst buffers. In case of a hardware or software failure, the burst buffers can be flushed at a later time with no loss of data. Additionally, one advantage of the multi-layered abstraction approach used in an Open MPI based implementation like HMPI is that an MCA component, in this case the posix_bb FBTL, can easily be requested at runtime by adding --mca fbtl posix_bb to the mpirun command. The next section details the second half of the Hyper Burst Buffer solution, the Hyper Burst Buffer Flush Daemon.
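The linear scaling of potential bandwidth mentioned above can be put into numbers with a back-of-the-envelope calculation. The sketch below takes the rated sequential write speed of 1100 MB/s quoted for the node-local NVMe drives in Sect. 6.1 and the default 32 MB packet size; the function names are hypothetical, and the results are idealized upper and lower bounds rather than measured values.

```python
def aggregate_write_bandwidth(nodes: int, per_node_mb_s: float) -> float:
    """Upper bound on combined burst buffer write bandwidth in MB/s."""
    return nodes * per_node_mb_s


def packet_cache_time(packet_mb: float, per_node_mb_s: float) -> float:
    """Lower bound on the time in seconds to cache one packet on a local drive."""
    return packet_mb / per_node_mb_s


RATED_WRITE_MB_S = 1100.0  # rated sequential write speed of one node-local drive

for nodes in (1, 2, 4, 8):
    print(nodes, aggregate_write_bandwidth(nodes, RATED_WRITE_MB_S))
print(packet_cache_time(32.0, RATED_WRITE_MB_S))
```

Under these assumptions, 8 nodes offer up to 8800 MB/s of combined local write bandwidth, whereas the bandwidth of a shared PFS does not grow with the node count of the job.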
4.2
Hyper Burst Buffer Flush Daemon
The Hyper Burst Buffer Flush Daemon is tasked with transferring to the PFS the distributed burst buffer data packets that are generated by the MPI-I/O Burst Buffer Cache module. The daemon is implemented as a multi-node, stand-alone MPI program designed to run on dedicated cores on the compute nodes; testing has found a single flush daemon process per CPU socket to be sufficient. The flush daemon periodically wakes and scans the burst buffer directory for BB data packets. If any are found, the BB data packet files on a given node are first sorted by output order and then distributed evenly between the daemon ranks on that node. These ranks then systematically read in their burst buffer files, parse the metadata header, and finally send the data to the PFS. After the packets have been consumed, they are deleted from the drive. Additionally, a mechanism was implemented to skip over all burst buffer packets that are still being written by the main MPI application.
Another feature of the HBB Flush Daemon is a configurable data-transfer limit per wake cycle. This limit prevents the burst buffer from being flushed in full on wake-up, which could otherwise create I/O contention in the PFS as well as network congestion at scale that could affect the computation happening concurrently in the main MPI application. A final important design decision was to dedicate the flush daemon cores to I/O and not have them participate in compute. This decision was made to minimize cache dilution and interference with the MPI ranks of the main job. Additionally, as more applications are ported to accelerators such as GPUs, CPU cores are freed up to handle background tasks such as the flush daemon described here. The effect of the dedicated processes on the compute time is investigated in Sect. 6 of this work.
Another mode of operation offered by the HBB Flush Daemon is to run the flush daemon program only at the end of the job. This mode may be useful for jobs that prefer to use the cores for compute rather than I/O. This type of use case is similar to the BeeOND or GekkoFS temporary burst buffer file systems, which must perform a “stage-out” to the PFS at the end of the job.
As a consequence of this asynchronous implementation, a corner case could arise where the main MPI application issues a read call to access a file that has not been completely written to the PFS. This type of read-after-write I/O pattern is uncommon in HPC applications [17], but it can cause unexpected behavior. The problem could be mitigated by using POSIX file locking or by adding a centralized locking entity to ensure that file reads wait until any writes to the same file have completed, but this has not been implemented in this work.
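One wake cycle of such a daemon can be sketched as follows. This is an illustrative single-process model under stated assumptions, not the HBB daemon itself: the header layout (`<HQQ`), the `.part` suffix marking packets that are still being written, sorting by file name as a stand-in for output order, and all function names are hypothetical. A POSIX file system is assumed, and partial-write handling is omitted for brevity.

```python
import os
import struct
from typing import List

HEADER = struct.Struct("<HQQ")  # assumed: path length, file offset, data length
IN_PROGRESS_SUFFIX = ".part"    # hypothetical marker for unfinished packets


def list_ready_packets(bb_dir: str) -> List[str]:
    """Finished packet files in output order (here: sorted by file name)."""
    return sorted(os.path.join(bb_dir, name) for name in os.listdir(bb_dir)
                  if not name.endswith(IN_PROGRESS_SUFFIX))


def assign_to_ranks(packets: List[str], num_ranks: int) -> List[List[str]]:
    """Distribute packets evenly (round-robin) between daemon ranks on a node."""
    return [packets[rank::num_ranks] for rank in range(num_ranks)]


def flush_packet(packet_path: str) -> int:
    """Parse the header, write the data at its offset, then delete the packet."""
    with open(packet_path, "rb") as f:
        raw = f.read()
    path_len, offset, length = HEADER.unpack_from(raw, 0)
    body = HEADER.size + path_len
    target = raw[HEADER.size:body].decode("utf-8")
    # Open without O_TRUNC so data from previously flushed packets survives.
    fd = os.open(target, os.O_WRONLY | os.O_CREAT, 0o644)
    try:
        os.lseek(fd, offset, os.SEEK_SET)
        os.write(fd, raw[body:body + length])
    finally:
        os.close(fd)
    os.remove(packet_path)
    return length


def wake_cycle(bb_dir: str, byte_limit: int) -> int:
    """Flush ready packets until the per-cycle transfer limit is reached."""
    flushed = 0
    for packet_path in list_ready_packets(bb_dir):
        if flushed >= byte_limit:  # the limit may be overshot by one packet
            break
        flushed += flush_packet(packet_path)
    return flushed
```

A real daemon would call something like `wake_cycle` periodically from each rank on its own share of the packets; the byte limit spreads the flush over several cycles so that the PFS and the network are not saturated while the main application computes.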
5
I/O Workloads - HPC Applications
This section introduces two leading open source HPC applications that are used as I/O testing platforms. The Weather Research and Forecasting (WRF) model [18] and the PyFR CFD solver [24] were selected as representative workloads spanning the major application categories in HPC. Real-world HPC applications, as opposed to I/O benchmarks, were chosen for testing in order to demonstrate the ability to positively affect the total runtime of applications, as I/O is now overlapped with computation. I/O benchmarks, on the other hand, generally output files sequentially with no compute phase, making them mostly irrelevant for asynchronous I/O solutions such as HBB.
5.1
Weather Research and Forecasting Model
We now give a brief introduction to the WRF model. For more details, the reader is referred to [18].
WRF Background. WRF is a state-of-the-art mesoscale Numerical Weather Prediction (NWP) system intended both for forecasting and atmospheric research. It is an open source project, officially supported by the National Center for Atmospheric Research (NCAR), that has become a true community model through its long-term development driven by the interests and contributions of a worldwide user base. The software framework of WRF has facilitated such extensions and supports efficient, massively parallel computation across a broad range of computing platforms. The governing equation set of the WRF model is based on compressible, non-hydrostatic atmospheric motion with multiple physics processes such as cloud and precipitation, boundary layer turbulence, land-ocean-air interaction, radiative transfer in the atmosphere, and energy transfer at the surface. In a number of previous works [5,13,16], WRF I/O times have been found to occupy a significant portion of the total runtime of WRF runs, and even to overtake the compute time in certain situations. For this reason, WRF was chosen as a representative test case in this work.
WRF I/O Backends. WRF’s well-defined I/O API provides several different implementations of its I/O layer. The ones relevant for the present work are:
– Serial NetCDF: The default I/O option in the WRF model. When this I/O option is selected, all write data is sent to the first MPI rank, and this rank alone writes out a file through the NetCDF library. While rank 0 is writing to disk, all other ranks wait until the write has fully concluded before continuing computation. This method performs well at low process counts, but at higher counts the write can quickly dominate the computation time. Due to the poor I/O performance of this method at scale, we do not present results using this method.
– Parallel NetCDF: WRF’s parallel I/O option that utilizes MPI-I/O. When this method is employed, all MPI ranks cooperate to write a single output file in parallel using PnetCDF, which directly accesses MPI-I/O. In large-scale cases this can offer an order of magnitude increase in write bandwidth compared to the Serial NetCDF method.
– Quilt Servers: A third technique for WRF file writes uses dedicated I/O processes (“servers”) that deal exclusively with I/O, enabling the compute processes to continue with their work without waiting for data to be written to disk. Data from multiple compute ranks are merged (“quilted”) together by a dedicated I/O rank by means of MPI communication calls and held in system memory until they are written to the PFS. This has previously been found to be the highest performing I/O option available in WRF in a comprehensive Cray study focusing on WRF I/O performance [16], even though compute resources are sacrificed for I/O activities.
5.2
PyFR
PyFR is an HPC optimized Python framework for solving the compressible and incompressible Navier-Stokes equations on unstructured grids using the Flux Reconstruction (FR) approach [24]. PyFR is designed to scale from a laptop computer all the way up to the largest supercomputing clusters. When run at scale with an industry relevant test case, I/O performance can be a bottleneck, and I/O time can even exceed compute time when solution files are written at high frequency, such as for creating transient flow visualizations. For I/O, PyFR uses the HDF5 library by way of the Python h5py library [6]. By default PyFR uses the serial HDF5 library, whereby PyFR first transfers all I/O data to MPI rank 0 before performing a serial HDF5 write operation. To confront the I/O bottleneck that occurs at scale when using a serial I/O methodology, PyFR can also use the Parallel HDF5 (PHDF5) library. When used, PHDF5 performs a collective write operation using the MPI-I/O backend, and thus has the potential to benefit from the solution proposed in this work. The following section describes the I/O intensive test cases along with results showing the impact of the Hyper Burst Buffer solution on the total runtime of the application.
6
Results
In this section we evaluate the I/O performance improvements of real-world HPC workloads utilizing the Hyper Burst Buffer library. The workloads were chosen to reflect intensive but realistic I/O use cases of Numerical Weather Prediction as well as a Computational Fluid Dynamics application, in order to span the gamut of parallel I/O HPC applications.
6.1
Experimental Setup
We used a small research cluster called Thunder for testing in this work. The Thunder research cluster consists of 10 nodes, each with 2 × 18-core Intel(R) Xeon(R) Gold 6240 CPUs (36 cores per node). Each node contains 384 GB of RAM, along with a 24.75 MB L3 cache for each CPU. The memory is accessed via 4 NUMA nodes and up to 6 memory channels per CPU (131 GB/s bandwidth). The nodes are connected via fast Mellanox ConnectX-6 100 GbE interconnects. In terms of storage, the BeeGFS file system [1] was striped across eight 10k RPM spinning hard disk drives on a dedicated storage node. Finally, each compute node was outfitted with an Intel DC P4510 1 TB NVMe SSD with rated sequential read and write speeds of 2850 MB/s and 1100 MB/s, respectively.
6.2
WRF Results
For WRF testing, v4.2, the latest major release version, was used and configured for distributed memory mode (i.e. dmpar). In order to support MPI-I/O based parallel I/O, WRF was compiled against PnetCDF v1.12.1 with GCC v10.2. Two WRF test cases were selected for I/O testing: the classic continental US (“CONUS”) benchmark at 2.5 km XY spatial resolution, and the 2017 Hurricane Maria run at 3 km resolution. The CONUS 2.5 km case, along with the less intensive CONUS 10 km case, are the official benchmarks for WRF and are available on the WRF web site https://www2.mmm.ucar.edu/wrf/WG2/bench/. Unfortunately, these official benchmarks are only compatible with WRF v3 and not the v4.2 used here. As such, the New CONUS 2.5 km benchmark, which was developed for WRF 4+ in an NCAR study [13], was used. Similarly, the Maria 3 km case was selected as a relevant and timely test case from the same NCAR study. These two WRF test cases were further modified, with the frequency of WRF output files (history files) increased to one file every 10 simulated minutes to represent a time scale that is relevant for time-resolved analysis of transient weather, as opposed to the default 180 min used in the NCAR study as well as the CONUS benchmark suite.
Test runs with 8 compute nodes (288 MPI ranks) were performed 5 times for each test configuration, and the average cumulative compute and write (i.e. I/O) times were extracted from the .rsl file generated by WRF. Additional overhead not attributed to compute or I/O was recorded and categorized as other. Three I/O configurations were evaluated for each WRF test case: first, the baseline parallel I/O option, PnetCDF; second, the WRF Quilt Server performing asynchronous I/O; and lastly, our application-agnostic Hyper Burst Buffer. Two cores per node (16 cores total) were used as dedicated I/O cores during the WRF Quilt Server runs. The Hyper Burst Buffer was configured using the default nominal burst buffer cache packet size of 32 MB. The Burst Buffer Flush Daemon was set to use 2 cores per node to match the setup of the WRF Quilt Server. The WRF serial I/O results are not presented as they were greatly outperformed by the other options at this process count.
Fig. 6. WRF Runtime Results for the 2017 Hurricane Maria 3 km XY Spatial Resolution Test Case for Different I/O Methods, Running on 8 Compute Nodes (288 MPI Ranks). The Hyper Burst Buffer Result Shows I/O Performance that Approaches the WRF Specific “Quilt Server” Performance, while Remaining Application Agnostic.
Test results comparing various I/O methods in WRF for the Hurricane Maria 3 km case are shown in Fig. 6. We can observe that when using the baseline WRF parallel I/O option, PnetCDF, the I/O time dominates the compute time, while the custom coded WRF Quilt Server dramatically reduces the total runtime by utilizing dedicated processes to overlap computation with I/O. The Hyper Burst Buffer solution proposed in this work approaches the performance of the WRF-specific Quilt Server, and cuts the total runtime of the test case by 55% compared to the baseline PnetCDF. The I/O time measured for the Hyper Burst Buffer solution includes the communication during the shuffle phase of the “two-phase” MPI-I/O method as well as the time needed to write the data in parallel to the node-local NVMe drives.
WRF runtime results for the CONUS 2.5 km benchmark case can be seen in Fig. 7. Due to the higher spatial resolution of this case, the I/O volume is increased compared to the Maria 3 km case. In this case we can observe that the application agnostic Hyper Burst Buffer solution outperforms the WRF-specific Quilt Server I/O method, while reducing the overall runtime of the WRF case by 61% compared to the baseline WRF parallel I/O option, PnetCDF. The effect of the nominal burst buffer packet size on these results remains to be investigated.
Fig. 7. WRF Runtime Results for the CONUS 2.5 km Benchmark, for Different I/O Methods, Running on 8 Compute Nodes (288 MPI Ranks). The Application Agnostic Hyper Burst Buffer Solution is able to Outperform the WRF Specific “Quilt Server”.
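The reported runtime reductions can be read through a simple runtime model. The model below is an interpretive sketch and is not stated in the paper: the symbols are introduced here for illustration, with $T_{\text{compute}}$, $T_{\text{write}}$ and $T_{\text{other}}$ standing for the compute, write and other categories recorded for the baseline, and $T_{\text{cache}}$ for the time the application spends in the shuffle phase and in caching packets to the node-local drive. It assumes that compute and other times are unchanged and that the background flush is fully overlapped with computation.

```latex
\begin{align}
T_{\text{base}} &= T_{\text{compute}} + T_{\text{write}} + T_{\text{other}}, \\
T_{\text{HBB}}  &\approx T_{\text{compute}} + T_{\text{cache}} + T_{\text{other}}, \\
R &= \frac{T_{\text{base}} - T_{\text{HBB}}}{T_{\text{base}}}
   = \frac{T_{\text{write}} - T_{\text{cache}}}{T_{\text{base}}}.
\end{align}
```

Under this model, when $T_{\text{cache}}$ is small, a reduction $R$ above 50% requires $T_{\text{write}} > T_{\text{compute}} + T_{\text{other}}$. This is consistent with the observation that the baseline I/O time dominates the compute time in the cases where reductions of 55% and 61% are reported.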
6.3
PyFR Results
PyFR v1.12 was used for testing on the Thunder research cluster. It was installed together with the h5py package linked against Parallel HDF5 (“PHDF5”) to facilitate parallel I/O through MPI-I/O. Of note is that the default installation method of PyFR supports only serial I/O, where the data from the distributed ranks is gathered on rank 0 before being written out to disk from that single rank, while the other ranks wait for the write to complete. Additionally, the optional LIBXSMM package [10] was used to accelerate the matrix algebra during the computation phase of the solver. A test case of a 3D SD7003 airfoil was used to evaluate the I/O performance of PyFR. The test case was based on the case used in Vermeire [22], but updated for current PyFR input file requirements. After solving for an initial warm-up period, the simulation was restarted to run for 1 additional convective time unit, with a data output every 0.01 convective time units. This output frequency is approximately what might be used to generate flow visualization animations, and represents a real-world I/O intensive CFD computation. Moreover, the PyFR source code was slightly modified to record write times for each output file in order to allow for an analysis of I/O performance.
Hyper Burst Buffer
Fig. 8. PyFR Runtime Results for the SD7003 Airfoil Case for Different I/O Methods. The Hyper Burst Buffer Solution is able to Totally Remove the I/O Overhead.
Three I/O configurations were used to evaluate the I/O times in PyFR. First, the serial I/O option was run, followed by the parallel I/O option utilizing PHDF5, which is considered the baseline for this case. Lastly, our Hyper Burst Buffer solution was tested, using the PHDF5 parallel I/O back-end configuration that utilizes MPI-I/O. PyFR was run on 8 nodes using the OpenMP back-end with 1 MPI rank per socket, for a total of 16 MPI ranks. The number of OpenMP threads was set to 18 threads per rank for the serial I/O and PHDF5 cases, but to 17 threads per rank for the Hyper Burst Buffer case, to leave resources for the Burst Buffer Flush Daemon that was launched concurrently with the main simulation workload, resulting in 2 flushing processes per node. A time breakdown of the PyFR runs can be seen in Fig. 8. The parallel I/O option using PHDF5 leads to a large reduction in total runtime compared to the default serial I/O option for this I/O-intensive case. In particular, the I/O time measured for the serial I/O case surpasses the compute time. Once the Hyper Burst Buffer solution is employed, the I/O time does not register on the chart, as the total write duration for the run was under 10 s. Additionally, the measured compute times of all three configurations are virtually identical, indicating that compute performance was not negatively affected by the loss of the two threads that were reallocated to the concurrently running Burst Buffer Flush Daemon.
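The rank and thread counts quoted for the PyFR runs can be recomputed with a few lines of Python. This is only a bookkeeping sketch: the values of 2 sockets per node and 18 cores per socket are inferred from the text (16 ranks on 8 nodes, 18 threads per rank), the function name `job_layout` is hypothetical, and the assumption that each flushing process takes over exactly one core is ours.

```python
def job_layout(nodes, sockets_per_node, cores_per_socket, flush_procs_per_socket=0):
    """Return MPI/OpenMP counts for a run with one MPI rank per socket.

    Assumption (not stated in the text): each flushing process occupies
    one core that would otherwise run an OpenMP thread.
    """
    mpi_ranks = nodes * sockets_per_node
    omp_threads = cores_per_socket - flush_procs_per_socket
    return {
        "mpi_ranks": mpi_ranks,
        "omp_threads_per_rank": omp_threads,
        "flush_procs_per_node": sockets_per_node * flush_procs_per_socket,
    }


# Serial I/O and PHDF5 cases: all cores run OpenMP threads.
baseline = job_layout(nodes=8, sockets_per_node=2, cores_per_socket=18)

# Hyper Burst Buffer case: one core per socket is left for the flush daemon.
with_hbb = job_layout(nodes=8, sockets_per_node=2, cores_per_socket=18,
                      flush_procs_per_socket=1)
```

Under these assumptions the sketch reproduces the 16 MPI ranks, the 18 versus 17 threads per rank, and the 2 flushing processes per node mentioned in the text.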
7 Conclusions
We have presented Hyper Burst Buffer - an application agnostic, lightweight burst buffer I/O library integrated into MPI-I/O to transparently accelerate inuser MPI-I/O applications. Our design couples a distributed burst buffer caching
mechanism to cache contiguous data buffers within the MPI-I/O layer, with a background flushing daemon to complete the transfer to the remote Parallel File System. The solution was then tested on MPI-I/O applications from Numerical Weather Prediction to Computational Fluid Dynamics, while utilizing the two leading parallel I/O middle-ware libraries, PHDF5 and PnetCDF. The results of our real-world test cases indicate that when using our solution, the HBB library may reduce I/O overhead by an order of magnitude when compared to the baseline MPI-I/O case, resulting in a two-fold improvement in total application run-time for some test cases. Furthermore, we discovered that our overlapped compute-I/O flush methodology outperforms the application-specific asynchronous I/O WRF Quilt server without any of the associated memory overhead. The Hyper Burst Buffer solution has demonstrated the ability to realize the potential of the node-local burst buffer for collective write operations.
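The combination of a node-local cache and a background flushing daemon can be illustrated with a toy write-back cache. The sketch below is not the Hyper Burst Buffer implementation and does not use MPI-I/O: the class name `ToyBurstBuffer`, the use of two temporary directories to stand in for the NVMe drive and the parallel file system, and the single flushing thread are all assumptions made for illustration.

```python
import os
import queue
import shutil
import tempfile
import threading


class ToyBurstBuffer:
    """Toy write-back cache: writes land in a fast local directory and a
    background thread flushes them to a slower 'remote' directory."""

    def __init__(self, local_dir, remote_dir):
        self.local_dir = local_dir
        self.remote_dir = remote_dir
        self._pending = queue.Queue()
        self._flusher = threading.Thread(target=self._flush_loop, daemon=True)
        self._flusher.start()

    def write(self, name, data):
        # Phase 1: the caller only waits for the node-local write.
        with open(os.path.join(self.local_dir, name), "wb") as fh:
            fh.write(data)
        self._pending.put(name)

    def _flush_loop(self):
        # Phase 2: runs concurrently with the caller's compute phase.
        while True:
            name = self._pending.get()
            if name is None:  # sentinel: no more files to flush
                break
            shutil.copyfile(os.path.join(self.local_dir, name),
                            os.path.join(self.remote_dir, name))

    def close(self):
        # Block until every cached file has reached the remote directory.
        self._pending.put(None)
        self._flusher.join()


def demo():
    """Write three outputs interleaved with stand-in compute work."""
    with tempfile.TemporaryDirectory() as local, \
            tempfile.TemporaryDirectory() as remote:
        bb = ToyBurstBuffer(local, remote)
        for step in range(3):
            bb.write(f"out_{step}.bin", bytes([step]) * 1024)
            _ = sum(i * i for i in range(10_000))  # stand-in for compute
        bb.close()
        return sorted(os.listdir(remote))
```

Calling `demo()` returns the names of the files that reached the "remote" directory. The point of the design is that `write` returns as soon as the local copy exists, so the transfer to slower storage overlaps with the next compute step.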
References
1. BeeOND™: BeeGFS on demand
2. Betke, E., Kunkel, J.: Benefit of DDN’s IME-FUSE for I/O intensive HPC applications. In: Yokota, R., Weiland, M., Shalf, J., Alam, S. (eds.) ISC High Performance 2018. LNCS, vol. 11203, pp. 131–144. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-02465-9_9
3. Bhimji, W., et al.: Extreme I/O on HPC for HEP using the burst buffer at NERSC. J. Phys. Conf. Ser. 898, 082015 (2017)
4. Chaarawi, M., Gabriel, E.: Automatically selecting the number of aggregators for collective I/O operations. In: 2011 IEEE International Conference on Cluster Computing, pp. 428–437. IEEE (2011)
5. Christidis, Z.: Performance and scaling of WRF on three different parallel supercomputers. In: Kunkel, J.M., Ludwig, T. (eds.) ISC High Performance 2015. LNCS, vol. 9137, pp. 514–528. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-20119-1_37
6. Collette, A.: Python and HDF5. O’Reilly, Sebastopol (2013)
7. Corbett, P., et al.: Overview of the MPI-IO parallel I/O interface (1995)
8. del Rosario, J.M., Bordawekar, R., Choudhary, A.: Improved parallel I/O via a two-phase run-time access strategy. ACM SIGARCH Comput. Archit. News 21(5), 31–38 (1993)
9. Dong, B., Byna, S., Wu, K., Johansen, H., Johnson, J.N., Keen, N.: Data elevator: low-contention data movement in hierarchical storage system. In: 2016 IEEE 23rd International Conference on High Performance Computing (HiPC), pp. 152–161 (2016)
10. Heinecke, A., Henry, G., Hutchinson, M., Pabst, H.: LIBXSMM: accelerating small matrix multiplications by runtime code generation. In: SC 2016: International Conference for High Performance Computing, Networking, Storage and Analysis. IEEE (2016)
11. Henseler, D., Landsteiner, B., Petesch, D., Wright, C., Wright, N.J.: Architecture and design of Cray DataWarp. Cray User Group CUG (2016)
12. Hou, K., et al.: Integration of burst buffer in high-level parallel I/O library for exascale computing era. In: 2018 IEEE/ACM 3rd International Workshop on Parallel Data Storage & Data Intensive Scalable Computing Systems (PDSW-DISCS). IEEE (2018)
13. Kyle, A.: Weather Research and Forecast (WRF) Scaling, Performance Assessment and Optimization (2018)
14. Li, J., et al.: Parallel netCDF. In: Proceedings of the 2003 ACM/IEEE Conference on Supercomputing - SC 2003. ACM Press (2003)
15. Moody, A., et al.: UnifyFS: a distributed burst buffer file system - 0.1.0 (2017)
16. Morton, D., Nudson, O., Stephenson, C.: Benchmarking and Evaluation of the Weather Research and Forecasting (WRF) Model on the Cray XT5 (2009)
17. Oral, S., et al.: End-to-end I/O portfolio for the summit supercomputing ecosystem. In: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, pp. 1–14. ACM, New York (2019)
18. Skamarock, W.C., et al.: A description of the advanced research WRF version 3. NCAR Technical Note 475+STR (2008)
19. Sugihara, K., Tatebe, O.: Design of locality-aware MPI-IO for scalable shared file write performance. In: 2020 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 1080–1089. IEEE (2020)
20. The HDF Group: Hierarchical data format version 5, 2000–2010
21. Vef, M.-A., et al.: GekkoFS - a temporary distributed file system for HPC applications. In: 2018 IEEE International Conference on Cluster Computing (CLUSTER), pp. 319–324. IEEE (2018)
22. Vermeire, B.C., Witherden, F.D., Vincent, P.E.: On the utility of GPU accelerated high-order methods for unsteady flow simulations: a comparison with industry-standard tools. J. Comput. Phys. 334, 497–521 (2017)
23. Wang, T., Mohror, K., Moody, A., Sato, K., Yu, W.: An ephemeral burst-buffer file system for scientific applications. In: SC 2016: International Conference for High Performance Computing, Networking, Storage and Analysis. IEEE (2016)
24. Witherden, F.D., Farrington, A.M., Vincent, P.E.: PyFR: an open source framework for solving advection-diffusion type problems on streaming architectures using the flux reconstruction approach. Comput. Phys. Commun. 185(11), 3028–3040 (2014)
Electromagnetic Quantum Memory in Coherent Domains of Condensed Matter and Its Prospects for Quantum Hypercomputing
Luigi Maxmilian Caligiuri
Foundation of Physics Research Center (FoPRC), 87100 Cosenza, Italy
http://www.foprc.org
Abstract. Several theoretical and experimental results have shown that the onset of Quantum Electrodynamics (QED) coherence in condensed matter causes the formation of an array of closely packed macroscopic regions, called “coherent domains” (CDs), in which the elementary matter components (atoms/molecules) oscillate in synchrony with each other and with a self-generated electromagnetic field trapped inside them. Such coherent domains are quantum objects described by a macroscopic wavefunction with a well-defined quantum phase that characterizes the common oscillation of the matter and e.m. fields. In the case of liquid water, the theory predicts the existence of “excited” energy levels of coherent domains in the form of coherent cold vortices of quasi-free electrons, so that they behave like superconductors and, if coupled through a thin insulating layer, act like a Josephson junction. In this paper we show that such quantum domains can be used to “memorize” quantum information and suggest a method to store/retrieve it by exploiting the electromagnetic “memory” effect. Our results could in principle be used to realize quantum storage “devices” in the context of the novel quantum computational schemes and systems already proposed by this author in previous publications.
Keywords: QED Coherence in Matter · Water · Macroscopic Quantum Phase · Plasma · Quantum Information
1 Introduction
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023. K. Arai (Ed.): SAI 2023, LNNS 739, pp. 338–354, 2023. https://doi.org/10.1007/978-3-031-37963-5_24

Macroscopic complex systems are characterized by high stability at mesoscopic and macroscopic time-scales. In particular, they stand out for the spontaneous emergence of order out of the quantum fluctuations that feature in the dynamics of their elementary components at microscopic scales. As is well known, the only correct description of a macroscopic matter system composed of a high number of elementary components that takes into account the underlying collective behavior of its microscopic components is given by Quantum Field Theory (QFT). According to the latter, the macroscopic stability of a system described by a quantum field $\psi(x,t)$ requires the system’s Lagrangian $L$ to be invariant under the local phase transformation of the quantum field [1]

$$\psi(x,t) \rightarrow \psi'(x,t) = \exp\left[iq\theta(x,t)\right]\psi(x,t) \tag{1}$$
that, in turn, implies the introduction of a gauge field $A_\mu(x,t)$ corresponding, at the scales of atoms and molecules, to the electromagnetic field, such that $L$ is also invariant under the local gauge transformation

$$A_\mu(x,t) \rightarrow A_\mu(x,t) - \partial_\mu\theta(x,t) \tag{2}$$
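To see why the phase transformation of the matter field and the gauge transformation of $A_\mu$ have to go together, it may help to write out one step explicitly. The sketch below assumes the covariant derivative $D_\mu = \partial_\mu + iqA_\mu$; this sign convention is chosen here to match the two transformations and is not stated in the text.

```latex
\begin{aligned}
D'_\mu \psi' &= \left(\partial_\mu + iqA_\mu - iq\,\partial_\mu\theta\right)\left(e^{iq\theta}\psi\right) \\
&= e^{iq\theta}\left(\partial_\mu\psi + iq(\partial_\mu\theta)\psi + iqA_\mu\psi - iq(\partial_\mu\theta)\psi\right) \\
&= e^{iq\theta} D_\mu\psi .
\end{aligned}
```

Under this assumption $D_\mu\psi$ picks up the same phase factor as $\psi$ itself, so terms such as $\psi^* D_\mu\psi$ or $|D_\mu\psi|^2$ keep their form, which is what the invariance of $L$ requires.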
The emergence of the gauge field $A_\mu$ causes the matter and e.m. fields to perform common tuned oscillations whose dynamics, if suitable conditions on the density and temperature of the system are satisfied, manifests itself as the occurrence of a new macroscopic coherent quantum state [2,3]. In other words, the system spontaneously undergoes a quantum phase transition from a non-coherent ground state, called the “perturbative ground state” (PGS), in which all the elementary constituents perform only uncorrelated zero-point quantum oscillations, to a so-called “coherent ground state” (CGS) in which phase-correlated common oscillations take place. The CGS is characterized by a phase locking between the phases of the matter and e.m. fields due to their synchronization, and by the formation of macroscopic quantum domains, called “coherent domains” (CDs), in which such synchronized oscillation occurs. Due to their collective synchronized oscillations, the whole ensemble of elementary constituents of the system is described by a unique macroscopic quantum wavefunction $\Psi(x,t)$ with a well-defined phase $\Theta(x,t)$, namely

$$\Psi(x,t) = \Psi_0(x,t)\,e^{i\Theta(x,t)} \tag{3}$$
Such a quantum wavefunction is conceptually similar to that describing superfluid or superconductive systems. Just as in the latter the macroscopic wavefunction describes the dynamics of the whole ensemble of superconductive electrons, in the case of coherent domains it describes the collective coherent behavior of the matter and e.m. fields, whose quanta can behave as a condensate of quasi-particles observed at the macroscopic scale. The coherent dynamics has a number of very meaningful consequences. First of all, the “condensation” process towards the CGS endows it with an energy gap per elementary component (atom or molecule) that makes it much more stable than the non-coherent ground state, usually considered as the system ground state, stabilizing it against thermal and environmental noise [4]. This energy reduction and the phase-locking regime ensure long-range correlation and order inside coherent domains that, in turn, determine a decrease of the entropy of the system. We have already shown [5] that this allows a CD to store quantum information codified in its macroscopic quantum phase, given by Eq. (3), so that controlling and modifying the latter would allow us to read/write such information. The coherent state of water admits excited energy levels, themselves coherent, in the form of cold vortices of quasi-free electrons whose “vorticity” is quantified by the quantum phase of the whole CD they belong to [6]. These quasi-free electrons, all described by a unique macroscopic wavefunction like Eq. (3), are strongly phase-correlated and, being coherent, can move within a coherent domain without encountering electrical resistance, as in a superconductor [7]. The storage/retrieval of such information can then be achieved by making a CD interact with other CDs or with an external field. It is worth noting that [7], due to their superconductive-like behavior, a pair of water coherent domains can also be “assembled” to simulate a Josephson junction, enabling us to realize experimental set-ups to control/modify the CD quantum phase. In this paper, we show how to exploit the electromagnetic “memory” effect to manipulate quantum information inside water coherent domains. Apart from its intrinsic theoretical interest, such a possibility could have deep implications in the field of quantum computation, as already suggested by this author [8], as well as in quantum cryptography and secure communications. In fact, despite their potential, all of today’s state-of-the-art realizations of quantum computational and storage systems are affected by many issues that limit the number of quantum systems suitable for such purposes to, for example, trapped ions, superconductors, quantum dots, molecular spins and optical cavities, each characterized by specific strengths and weaknesses.
In particular, one of the most critical issues common to all current implementations of quantum computing and quantum information is decoherence (due to the interaction of the quantum system with its surroundings), which affects the stability of the quantum state and impairs the phase information required for quantum computation and the storage of quantum information. Furthermore, the storage and retrieval of computational results require a measurement of the quantum state of the system, which calls for a delicate and effective compromise between stability and measurability; otherwise, the scalability of any quantum computational and information system would be heavily reduced, preventing it from being actually useful. In fact, the more memory and speed are required, the more stable and fine-tuned the system must be in order to be really more efficient than its classical counterpart. So far, this is only possible for a limited number of qubits, under controlled environmental conditions and with expensive equipment. On the contrary, we show that a network of interacting coherent water domains would be able to realize a very stable (due to the energy gap characterizing the coherent state) physical substrate for the storage of quantum information that can be manipulated by electromagnetic potentials able to modify the macroscopic quantum phases describing the coherent domains themselves.
In addition, the proposed system would allow for the storage of a very large quantity of information, codified in its phase structure and proportional to the number of coherent domains, which can be very high even in a relatively small volume of water and is then virtually unbounded. The plan of the paper is as follows. In Sect. 2, some fundamentals of the theory of QED coherence in condensed matter are given, with special reference to the description of the coherent dynamics of liquid water, the formation of water coherent domains and the properties of their excited energy spectrum. Section 3 summarizes some of our previous results on the description of water coherent domains as superconductive systems and on how to employ them to realize Josephson junctions. The capacity of water coherent domains to store quantum information is discussed in Sect. 4. Finally, Sect. 5 describes the features of the electromagnetic memory effect and how to exploit it to store and retrieve quantum information inside water coherent domains by manipulating their macroscopic quantum phase, and proposes an experimental set-up for this purpose.
2 Some Fundamentals of QED Coherence in Matter
2.1 The Transition from the Perturbative to Coherent Macroscopic State and Its Main Features
According to the theory of QED coherence in matter, first developed by G. Preparata [4] on the basis of QFT, the stability and the properties of condensed matter are the result of a “superradiant” quantum phase transition of the radiation field coupled to the matter system towards a new state characterized by the emergence of a large classical electromagnetic field resonating with an atomic transition between the ground state and a particular excited state of the matter system. In particular, he proved that such a phase transition characterizes a large class of systems, whose stationary (time-asymptotic) behavior is ruled by the energy gap of the two levels involved in the resonating dynamics, i.e.

$$E = \hbar\omega_0 = \frac{hc}{\lambda} \tag{4}$$

where $\lambda$ is the wavelength of the electromagnetic field, $c$ the speed of light in vacuum and $\hbar = h/2\pi$ the reduced Planck constant. The associated limit cycle of the system, which includes a very high number of elementary components (atoms and/or molecules), is reached when the density is higher than a critical value $\rho_c$ and the temperature is lower than a critical value $T_0$. It is a more stable state in which the matter components oscillate in phase with an electromagnetic field inside macroscopic spatial regions called “coherent domains” (CDs), which arise from the coherent oscillations between matter and e.m. field and whose extent is of the order of the wavelength given by Eq. (4). A very meaningful consequence of the transition towards the coherent state is the release of energy into the environment, so that every coherent domain is associated with an energy gap per molecule (or atom) $E/N$ with respect to the non-coherent state, ruled by the pair of energy levels governing the coherent oscillations [4]. The stationary (time-asymptotic) state of the system is described by the following equations

$$\dot\theta_2 = -gA\,\frac{a_1}{a_2}\,e^{i\psi(t)} \tag{5}$$

$$\dot\theta_1 = -gA\,\frac{a_2}{a_1}\,e^{-i\psi(t)} \tag{6}$$

$$-\frac{\dot\phi^2}{2} + \dot\phi + \mu = -g\,\frac{a_1 a_2}{A}\,e^{-i\psi(t)} \tag{7}$$
where $\chi_i(t) = a_i e^{i\theta_i(t)}$ ($i = 1, 2$) is the matter field, $A(t) = A e^{i\phi(t)}$ the electromagnetic field, $\mu$ and $g$ two parameters related to the intrinsic features of the system that rule the transition to the coherent state and its dynamics, and

$$\psi(t) = \theta_1(t) - \theta_2(t) - \phi(t) \tag{8}$$

The system of Eqs. (5) to (7) admits a consistent solution only if [4]

$$\psi(t) = 0,\ 2\pi \tag{9}$$

from which, since $\psi$ is then constant in time, one has

$$\dot\phi(t) = \dot\theta_1(t) - \dot\theta_2(t) \tag{10}$$

Equation (10) establishes a phase-locking constraint between the matter and e.m. fields, so that the oscillation frequency of the e.m. field is just equal to the difference between the matter field energy levels (resonance condition). Very remarkably, this implies that every coherent domain is described by a macroscopic wavefunction, like Eq. (3), with a well-defined phase due to the correlation among the oscillations of the elementary components, which are phase-locked by means of the electromagnetic field. Without entering into a detailed discussion of Eq. (10) and its main consequences, which can be found in [4], we stress here in particular that by controlling the quantum phase of a coherent domain we could, in principle, rule its dynamics and vice versa. As we will see in the following, this allows the possible use of coherent domains as storage devices for quantum information. But first we must stress some of the remarkable features of the coherent state of water.
2.2 The Coherent Behavior of Liquid Water
The coherent ground state of water is characterized by oscillations between the molecular ground state $|0\rangle$ and the excited level $|1\rangle$ at $E = 12.06$ eV, just 0.54 eV below the ionization threshold of the water molecule [4]. This state can be expressed as the quantum superposition

$$|coh\rangle = \cos\alpha\,|0\rangle + \sin\alpha\,|1\rangle \tag{11}$$

where $\sin^2\alpha \geq 0.103$ [9], meaning that more than 0.10 electrons per water molecule can be considered as quasi-free particles. A calculation shows that, at the usual density of liquid water, a water CD can include up to $n_e = 2\cdot10^4$ quasi-free electrons, so that every coherent domain of water is a huge reservoir of quasi-free electrons that are easily excited by supplying energy from the outside. Since these electrons belong to a coherent state, their excitations take the form of “cold” vortices (which cannot lose energy by thermal dissipation) whose angular momentum is quantized and which represent the excited energy spectrum of the water coherent domain [6], given by

$$E_n = \frac{L_n^2}{2I} - g\,\mathbf{L}_n\cdot\mathbf{B} \tag{12}$$

where $n$ indicates the $n$-th excited level, $\mathbf{L}_n$ the angular momentum associated with the vortex, $g = e/2m$ (the gyromagnetic ratio), $I$ the moment of inertia and $\mathbf{B}$ the external magnetic field. It is very important to remark that the value of $L$ is related to the “vorticity” of the coherent domain. The ensemble of such quasi-free electrons inside a water coherent domain can be considered as a plasma that performs coherent oscillations, in which the vortices control the amplitude and the frequency $\omega_r$ of the coherent oscillation. In particular it has been shown [4] that, for a plasma of electrons performing coherent oscillations in tune with a resonating electromagnetic field, the amplitude $A$ of the electromagnetic field and the amplitude $\alpha$ of the plasma oscillations are related by the equation

$$A^2\left(1 - \dot\phi\right) + \alpha^2 = 0 \tag{13}$$

and they occur at the common oscillation frequency given by

$$\omega_r = \left(1 - \dot\phi\right)\omega_R \tag{14}$$

where $\omega_R$ is the pulsation of the small oscillations of the charges around their equilibrium position and $\phi$ is the phase of the electromagnetic field. The phase $\Theta$ defining the quantum state associated with a water coherent domain then also includes the contribution of the quantized angular frequency of the coherent vortices (depending on their rotational frequency), so that the whole energy spectrum of the coherent domains is described by the same macroscopic wavefunction (3). This allows us to modify the phase of a water CD by changing its energy level through interaction with the environment, since the coherent quasi-free electrons can be easily excited by an external energy intake, even a weak one, provided that it is lower than the energy gap per molecule $E/N$ of the CD.
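As a quick numerical illustration of Eq. (4), the resonant wavelength for the 12.06 eV level quoted at the start of this subsection can be computed directly. The sketch below only applies $\lambda = hc/E$ with the exact SI values of the constants; the function name `wavelength_nm` is hypothetical, and reading the result as the order of magnitude of the CD size follows the statement made after Eq. (4), not an independent calculation.

```python
# Exact SI values of the constants (2019 redefinition of the SI).
PLANCK_H = 6.62607015e-34            # J s
LIGHT_C = 299792458.0                # m / s
ELEMENTARY_CHARGE = 1.602176634e-19  # J per eV


def wavelength_nm(energy_ev):
    """Wavelength (nm) of radiation resonant with a level spacing, Eq. (4)."""
    energy_joule = energy_ev * ELEMENTARY_CHARGE
    return PLANCK_H * LIGHT_C / energy_joule * 1e9


level_ev = 12.06                 # excited level quoted in the text
ionization_ev = level_ev + 0.54  # the level lies 0.54 eV below the threshold
lam = wavelength_nm(level_ev)
print(f"lambda = {lam:.1f} nm, ionization threshold = {ionization_ev:.2f} eV")
```

The script gives a wavelength of about 103 nm (roughly 1030 Å) and an ionization threshold of 12.60 eV, consistent with the two energies stated in the text.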
3 Water Coherent Domains as Superconductive Systems and Josephson Junctions
As we have seen, coherent domains are macroscopic objects described by a unique wavefunction with well-defined amplitude and phase. Furthermore, water coherent domains are also characterized by the presence of a plasma of quasi-free electrons able to move within them, in the form of cold vortices, without encountering electrical resistance. On the other hand, the prevailing macroscopic model of superconductivity is based precisely on the assumption that the collective behavior of Cooper pairs in the superconductive medium can be described by a macroscopic wavefunction $\Psi(x,t)$. In the same way, the theory of QED coherence in matter predicts that the state describing the coherent dynamics of the matter and electromagnetic fields inside coherent domains is an eigenstate of the quantum phase operator, so that the associated wavefunction has a well-defined phase. As for the amplitude, from the normalization condition

$$\int_V \Psi^*(x,t)\,\Psi(x,t)\,dV = N \tag{15}$$

we obtain

$$|\Psi(x,t)| = \sqrt{n(x,t)} \tag{16}$$

where $N$ is the total number of particles in the coherent domain and $n(x,t)$ its local density, so that the amplitude of the wavefunction (3) has a well-defined value as well. In a previous work [7] this author has shown how water coherent domains can be described as superconductive systems by relating the macroscopic phase to their “structural” features. Although the calculations were developed for cylindrical coherent domains, the model is quite general and can be extended to different geometrical configurations of water coherent domains. In particular, if $L = I\omega$ indicates the angular momentum of the coherent vortex, $I$ its moment of inertia and $\omega$ the rotational frequency, we have, from the quantization of the angular momentum,

$$L = \frac{2\pi I}{\nu_0}\,l \tag{17}$$

where $\nu_0 = 2\pi I/\hbar$ and $l = 0, \pm1, \pm2, \ldots$. For a cylindrical coherent domain, Eq. (17) becomes [7]

$$|L|_l = 0.13\,\pi^2\,\frac{N}{V}\,\frac{m_e L R^4}{\nu_0}\,l \tag{18}$$

The coherent vortices are described by a quantized angular momentum of the quasi-free (“superconductive”) electrons in the water coherent domain and, in turn, generate a “supercurrent” and a static magnetic field across the domain whose quantized flux is given by [7]

$$\Phi_l = kL_l = \frac{2\pi k I}{\nu_0}\,l \tag{19}$$

where $k$ is a constant related to the geometry of the coherent domain. Since the quantization of magnetic flux is a distinctive feature of superconductors, water coherent domains can be considered as superconductors in which the “super-electrons” are represented by the quasi-free electrons.
This conclusion would seem to contradict the experimental observation that water is such a poor electrical conductor at the macroscopic scale. Nevertheless, no conflict exists, since in water the average distance between two coherent domains is $d \simeq 750$ Å, too large a distance to allow the coherent “super-electrons” to tunnel between coherent domains. This is no longer the case if the coherent domains can be placed very close to each other. This is just the principle on which the Josephson junction (JJ) is based: a system of two nearby superconductors separated by a thin insulating layer [7]. In a JJ a “supercurrent” $J_s$ flows, through the quantum tunnel effect, from one superconductor of the junction to the other when a difference $\varphi = \theta_2 - \theta_1$ between the macroscopic quantum phases of the associated wavefunctions is established, namely

$$J_s(t) = J_c\sin\varphi(t) \tag{20}$$

where $J_c$ is a parameter characteristic of the junction that represents the maximum supercurrent density able to flow across it. If water coherent domains can be considered, as regards the dynamics of the coherent quasi-free electrons, as macroscopic quantum objects behaving like superconductors, then they could be assembled to realize a Josephson junction-like system. We have already shown that this is just the case [7] if we consider two nearby water coherent domains separated by a thin layer of dielectric material. In this case the supercurrent flowing (due to the tunneling of the coherent quasi-free electrons) in a JJ composed of a pair of water coherent domains is given by [7]

$$J_s = J_c\sin\left[\frac{2\pi\beta}{\nu_0}\,(m - n)\right] \tag{21}$$

where $\beta$ is a parameter depending on the structural features of the coherent domains, $m$ and $n$ are the quantization numbers of the magnetic fluxes produced by the two coherent domains, respectively, and $J_c$ is the maximum value of the supercurrent flowing through the junction.
Then, according to our previous results [7], if we supply the coherent domains composing the JJ with energy from the environment (for example in the form of electromagnetic or chemical energy), we could induce quantized magnetic fluxes in the coherent domains that in turn generate a quantum phase difference and then a supercurrent in the junction as a function of such difference, according to Eq. (21). Conversely, if the JJ is characterized by a given phase difference, measuring the supercurrent flowing through it could allow for the determination of the phase difference “stored” in the junction. We will show, in the following, how to exploit this feature to store/retrieve quantum information inside water coherent domains.
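The write/read idea described above can be sketched numerically using Eqs. (20) and (21). In the sketch below the function names are hypothetical, and the values of $J_c$, $\beta$ and $\nu_0$ are placeholders chosen only to make the arithmetic transparent; they are not values from the paper.

```python
import math


def supercurrent(j_c, beta, nu_0, m, n):
    """Eq. (21): J_s = J_c * sin(2*pi*beta*(m - n) / nu_0)."""
    return j_c * math.sin(2.0 * math.pi * beta * (m - n) / nu_0)


def phase_difference_from_current(j_s, j_c):
    """Invert Eq. (20) on the principal branch, |phi| <= pi/2."""
    if abs(j_s) > abs(j_c):
        raise ValueError("|J_s| cannot exceed the critical current J_c")
    return math.asin(j_s / j_c)


# Illustrative numbers only (placeholders, not taken from the paper).
j_c, beta, nu_0 = 1.0, 1.0, 4.0

# "Write": equal flux quantum numbers give no phase difference, no current.
j_equal = supercurrent(j_c, beta, nu_0, m=3, n=3)

# "Write": m - n = 1 gives a phase difference of 2*pi/4 = pi/2 here.
j_quarter = supercurrent(j_c, beta, nu_0, m=1, n=0)

# "Read": recover the phase difference from the measured supercurrent.
phi = phase_difference_from_current(j_quarter, j_c)
```

Since the sine is not one-to-one, a single current measurement only fixes the phase difference up to the usual ambiguity (here the principal value is returned). A practical read-out scheme would need additional information to resolve it.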
4 Information Stored in Water Coherent Domains
The theory of QED coherence in matter predicts that the CGS of a system (as well as its excited states in the case of water) is characterized by long-range order, due to the phase-correlated oscillations of the matter and electromagnetic fields inside coherent domains. Furthermore, the transition towards the CGS is associated with the release of energy into the environment and the emergence of a negative energy gap per molecule. This means that the coherent domain has lower entropy than the non-coherent state and a higher level of order (codified in its macroscopic quantum phase), in turn associated with a quantity of information stored inside it. The quantity of information storable in a coherent domain can be estimated by considering the concept of “bound information”, as shown in [10]. According to this approach, the information associated with a given system is defined by

$$I = k_B\ln\frac{P_0}{P_1} \tag{22}$$

where $P_0$ quantifies the number of possibilities of the system in its initial state (for which $I = I_0$), $P_1$ that in the final state (for which $I_1 > 0$) and $k_B$ is the Boltzmann constant. By considering the relation between entropy and information for a QED coherent system including $N$ oscillating elementary components we have [5]

$$I = -\left(\frac{\Delta E}{N}\right)_{coh}\frac{F_{coh}(T)\,N}{(\ln 2)\,k_B T} \tag{23}$$

where $\Delta E/N$ is the energy “gap” per molecule in the coherent state and $F_{coh}(T)$ is the fraction of all molecules belonging to the coherent state at temperature $T$. For a number $n_{CD}$ of coherent domains in our system the total information associated with them is written as $I(n_{CD}) = n_{CD} I_{CD}$, where $I_{CD}$ is the information stored in a single coherent domain, given by Eq. (23) for $N = N_{CD}$. The quantity $I_{CD}$ is proportional to the energy gap per molecule of the coherent state. On the other hand, the latter has been found to depend on the interaction among nearby coherent domains. In particular we can recognize:
– a “static” interaction occurring when two or more coherent domains are sufficiently close to each other;
– a “dynamic” interaction occurring when two or more coherent domains oscillate in synchrony with each other, namely a form of “supercoherence” (a coherence between coherent domains).
In general, both these kinds of interaction cause a decrease of the energy gap characterizing interacting coherent domains. Leaving a complete analysis of this effect in the case of supercoherence to forthcoming publications, we will consider in the following the first case only. For a network of closely packed (spaced by $d = 2R_{CD}$) spherically symmetric coherent domains, the “binding” energy due to the energy gap is given by [5]

$$V_{tot} = \frac{n_{CD}\left(n_{CD} - 1\right)}{2}\,\frac{32}{9\pi^2}\,\frac{\Delta E_1}{N}(0) \tag{24}$$

where $\Delta E_1/N\,(0) < 0$ is the value of the energy gap of a single isolated coherent domain at its center. For $n_{CD} > 1$ we can then write Eq. (23) as

$$I_{tot}(n_{CD}) = 2I_1\,n_{CD}\left(n_{CD} - 1\right) \tag{25}$$
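The main feature of Eq. (25) is that the total information grows with the number of pairs of domains, that is, roughly quadratically in $n_{CD}$, and not linearly. The short sketch below only evaluates the formula as written; the function name `total_information` is hypothetical and $I_1$ is kept as a unit factor so that only the scaling is shown.

```python
def total_information(i_single, n_cd):
    """Eq. (25): I_tot = 2 * I_1 * n_CD * (n_CD - 1), stated for n_CD > 1."""
    if n_cd < 2:
        raise ValueError("Eq. (25) is stated for n_CD > 1")
    return 2 * i_single * n_cd * (n_cd - 1)


# I_1 is set to 1 so that the result is expressed in units of I_1.
scaling = {n: total_information(1, n) for n in (2, 10, 100, 1000)}
```

In units of $I_1$ this gives 4, 180, 19 800 and 1 998 000 for 2, 10, 100 and 1000 domains: each tenfold increase in the number of domains raises the total by about a factor of one hundred.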
In the case of water coherent domains a calculation shows [5] that, for interfacial water¹ at room temperature, $I_1 \cong 2.70\cdot10^8$ bits and then, by using Eq. (25),

$$I_{tot}(n_{CD}) \cong 5.40\cdot10^5\,n_{CD}\left(n_{CD} - 1\right)\ \text{bits} \tag{26}$$

which allows the storage of more than about $I = 10^{37}$ bits in a volume containing 2 cm³ of coherent water. In order to better point out the relationship between phase and information in coherent domains we remark that, in the case of the water coherent ground state, the following equation holds [4]

$$\frac{\Delta E}{N} = \hbar\omega_0\left(1 - \dot\phi\right) \tag{27}$$

where $\omega_0$ is given by Eq. (4) and $\phi$ is the phase of the coherent e.m. field, which is locked to the phases of the matter field according to Eq. (10). All the above considerations refer to coherent domains in their ground state, but in the case of water a further coherent dynamics, involved in the formation of the excited energy levels of the CGS, contributes to the energy gap of the whole coherent system. In order to quantify this contribution we will picture the ensemble of quasi-free electrons in the water CD as a plasma of negative charges together with the positive charges (the protons) left in the water molecules by the quasi-free electrons (recall that in the CGS we have on average one quasi-free electron per water molecule). In a real plasma we can generally characterize the charge oscillations around their equilibrium position by a frequency $\omega_R$ and an amplitude $\alpha$, where
$$\omega_R = \frac{\kappa e}{\sqrt{m}}\left(\frac{N}{V}\right)^{1/2} \qquad (28)$$

$e$ and $m$ being the electron charge and mass and κ a positive parameter that quantifies the level of inhomogeneity of the real plasma. According to the theory of QED coherence in matter, in a real plasma, when suitable conditions on the oscillation frequency are met, a quantum phase transition towards a coherent state takes place in which all the charges oscillate, around their equilibrium positions, in tune with an electromagnetic field, in a way entirely similar to the electron-cloud dynamics giving rise to the CGS. If ωP is the oscillation frequency of the “ideal” plasma

$$\omega_P = \frac{e}{\sqrt{m}}\left(\frac{N}{V}\right)^{1/2} \qquad (29)$$

having the same density as the real one (corresponding to κ = 1), the plasma becomes coherent if [4]

$$\frac{\omega_R}{\omega_P} < 1.88 \qquad (30)$$

In this case the stable coherent state is characterized by the energy

$$E = N\,\omega_R\,\alpha_{max}^{2}\,\frac{A^{2} + \alpha^{2} - 3g\alpha A}{\alpha^{2}} \qquad (31)$$

$^{1}$ Water that is very close to a hydrophilic surface.
L. M. Caligiuri
where N is, as usual, the number of particles, α is the plasma oscillation amplitude and α_max its maximum value, A is the amplitude of the electromagnetic field in the coherent state and

$$g = \left(\frac{2\pi}{3}\right)^{1/2}\frac{\omega_P}{\omega_R} \qquad (32)$$

The following equation holds A2 1 − φ˙ + α2 = 0 with φ˙ = 1 +
α 1 − 2g A
(33)
(34)
where φ is, as usual, the phase of the electromagnetic field, related to the phase ψ of the matter field of the plasma oscillators by the phase-locking constraint in the coherent state

$$\phi - \psi = \frac{\pi}{2} \qquad (35)$$

While the values of $A$, $\alpha$ and $\dot{\phi}$ are fixed by Eqs. (33)–(35), $\alpha_{max}$ must be determined through further considerations. In particular, it is found [4] that it can be related to the maximum mean value $\langle\xi^{2}\rangle_{max}$ of the squared displacement of the plasma particles from the equilibrium position by the relation

$$\alpha_{max} = \left(m\,\omega_P\,\langle\xi^{2}\rangle_{max}\right)^{1/2} \qquad (36)$$
An estimate of $\langle\xi^{2}\rangle_{max}$ can be obtained by assuming, in our case, that the quasi-free electrons can move freely within the whole volume occupied by a coherent domain, so that we can assume

$$\langle\xi^{2}\rangle_{max} = \frac{1}{R_{CD}}\int_{0}^{R_{CD}} r^{2}\,dr = \frac{R_{CD}^{2}}{3} \qquad (37)$$

where $R_{CD}$ is the mean radius of a spherical water coherent domain. Using Eq. (37) in Eq. (36) we have

$$\alpha_{max}^{2} = m\,\omega_P\,\frac{R_{CD}^{2}}{3} \qquad (38)$$

and inserting the latter in Eq. (31) leads to

$$E_{plasma} = \frac{N m\,\omega_R\,\omega_P\,R_{CD}^{2}}{3}\;\frac{A^{2} + \alpha^{2} - 3g\alpha A}{\alpha^{2}} \qquad (39)$$
It is easy to see that there exists a value $g_c$ of $g$ such that for $g > g_c$ we have $E_{plasma} < 0$ and the coherent state is the true stable state, namely

$$g_c = \frac{A^{2} + \alpha^{2}}{3\alpha A} \qquad (40)$$
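The sign change of the plasma energy can be checked numerically. The sketch below is only an illustration: it evaluates the dimensionless factor $(A^{2} + \alpha^{2} - 3g\alpha A)/\alpha^{2}$ appearing in Eq. (39) and the value of $g$ at which that factor vanishes. It assumes that Eq. (32) reads $g = (2\pi/3)^{1/2}\,\omega_P/\omega_R$; the amplitudes used in the usage example are arbitrary and are not values from the paper.

```python
import math


def coupling_g(omega_p: float, omega_r: float) -> float:
    """Coupling of Eq. (32), assumed here to be sqrt(2*pi/3) * omega_P / omega_R."""
    return math.sqrt(2.0 * math.pi / 3.0) * omega_p / omega_r


def reduced_energy(a_field: float, alpha: float, g: float) -> float:
    """Dimensionless factor (A^2 + alpha^2 - 3*g*alpha*A) / alpha^2 of Eq. (39)."""
    return (a_field ** 2 + alpha ** 2 - 3.0 * g * alpha * a_field) / alpha ** 2


def critical_coupling(a_field: float, alpha: float) -> float:
    """Value of g at which the factor above changes sign (A and alpha positive)."""
    return (a_field ** 2 + alpha ** 2) / (3.0 * alpha * a_field)


if __name__ == "__main__":
    a_field, alpha = 1.0, 1.0  # arbitrary illustrative amplitudes
    g_c = critical_coupling(a_field, alpha)
    for g in (0.5 * g_c, g_c, 2.0 * g_c):
        print(f"g = {g:.3f}  reduced energy = {reduced_energy(a_field, alpha, g):+.3f}")
    # Coupling at the threshold frequency ratio of Eq. (30)
    print(f"g at omega_R / omega_P = 1.88: {coupling_g(1.0, 1.88):.3f}")
```

Since the prefactor $N m\,\omega_R\,\omega_P R_{CD}^{2}/3$ of Eq. (39) is positive, the sign of $E_{plasma}$ is the sign of the reduced factor, which is why only that factor is evaluated.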
If we further assume that the non-coherent ground state of the plasma has E = 0, we can write, for the energy gap associated with the phase transition of the plasma of quasi-free electrons towards the coherent state, $\Delta E = E_{CS} - E_{PGS} = E_{plasma}$ […]

[…] m (‘>’ being restricted to the support of H(t)). If t is enabled in m it may fire, leading to m′ = m + O(t) − I(t). A PT system is a pair S := ⟨N, m0⟩. The interleaving semantics of S takes the form of a T-labelled directed graph whose nodes are the markings reachable from m0, and whose edges match direct state transitions. (Enriched) PT systems are Turing-powerful.

Nets-Within-Nets. In recent years we studied systems with a changeable layout in the object-net formalism [12,16], which follows the nets-within-nets paradigm proposed by Valk [25]. Hornets [13] are an extension of object-nets with algebraic operators that allow one to modify the net-tokens’ structure through transition firing. It is a generalisation of the approach of algebraic nets [23], where algebraic data types replace anonymous, black tokens. Our Maude formalisation refers to the class of elementary object-net systems (Eos) [12], which are the two-level specialisation of object-nets [12,16]. For our theoretical studies, we also introduced elementary Hornets [19], which have a two-level nesting structure, in analogy to Eos. It turns out that
Encoding Nets-Within-Nets in Maude
elementary Hornets have greater complexity than Eos, though they are more expressive. On the one hand, we have shown in [15,17,18] that most problems (including reachability and liveness) for safe Eos are PSpace-complete; that is, safe Eos are no more complex than PT nets for these problems. On the other hand, we have shown in [14,20] that for safe, elementary Hornets “the reachability problem requires exponential space”.
[Figure: a system-net transition t with input places p1 to p3 and output places p4 to p6, carrying the synchronisation inscription <t1, t2>; the net-tokens contain the transitions t1 (places a1, b1) and t2 (places a2, b2, c2).]
Fig. 1. An Elementary Object Net System (Eos).
With object-nets, we mean PN where tokens (graphically denoting a PN marking) are nets again, i.e., we have a nested marking. An Eos consists of a system-net whose places may hold net-tokens of a certain type. In our encoding, the graph structure of both the system-net and the net-tokens is a PT net. Net-tokens, however, do not allow any further nesting of nets, i.e., they represent marked PT nets. In the example given in Fig. 1 there are two different types of net-tokens (that we call net1, net2), corresponding to the colours yellow and grey; the system-net and the net-tokens consist of one transition each; only weight-one input/output edges are present. Eos events are nested accordingly. There may be three different kinds of events:

1. System-autonomous: A system-net transition (e.g., t) fires autonomously, which consistently moves the net-tokens from t’s preset (places pi, i : 1 . . . 3) to the postset (pj, j : 4 . . . 6), without changing their marking.
2. Object-autonomous: A net-token, e.g., that of type net1 in the system-net place p2, fires transition t1 by “moving” a black token from a1 to b1. The net-token remains in place p2.
3. Synchronisation (the situation illustrated in Fig. 1): Whenever we add matching synchronisation inscriptions, e.g., between the system-net transition t and the nested transitions t1, t2, then they must fire synchronously: the net-tokens move from the preset to the postset of t, consistently; at the same time, the black tokens inside the nested nets move from the preset to the postset of t1, t2. Whenever a synchronisation is specified, the involved transitions
L. Capra and M. Köhler-Bußmeier
cannot fire autonomously. Synchronised and autonomous events, instead, may be simultaneously present. Notice that there may be several firing instances for a system-net transition: if many net-tokens of a certain type are in the transition’s preset, their cumulative marking is distributed on the output places of the same type (possibly after a nested firing step, in the case of a synchronisation). Each possible combination of adding/distributing net-token markings results in a separate instance. We give further details in Sect. 3.
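The difference between the two autonomous kinds of events can be sketched in a few lines of Python. This is not the authors' encoding: a nested marking is modelled here as a dictionary from system-net places to lists of net-token markings (each a `Counter`), synchronisation inscriptions are ignored, and `system_autonomous` is simplified to a hypothetical one-input, one-output transition.

```python
from collections import Counter


def fire_inner(token: Counter, pre: Counter, post: Counter) -> Counter:
    """Fire a transition inside one net-token: consume `pre`, produce `post`."""
    if any(token[p] < w for p, w in pre.items()):
        raise ValueError("inner transition not enabled")
    result = Counter(token)
    result.subtract(pre)
    result.update(post)
    return +result  # unary plus drops places whose count fell to zero


def object_autonomous(marking: dict, place: str, index: int,
                      pre: Counter, post: Counter) -> dict:
    """Object-autonomous event: the net-token changes its marking but stays in `place`."""
    tokens = list(marking[place])
    tokens[index] = fire_inner(tokens[index], pre, post)
    return {**marking, place: tokens}


def system_autonomous(marking: dict, source: str, target: str, index: int = 0) -> dict:
    """System-autonomous event (simplified): move one net-token, marking untouched."""
    remaining = list(marking[source])
    moved = remaining.pop(index)
    return {**marking, source: remaining,
            target: list(marking.get(target, [])) + [moved]}


if __name__ == "__main__":
    m = {"p2": [Counter({"a1": 1})]}
    m1 = object_autonomous(m, "p2", 0, Counter({"a1": 1}), Counter({"b1": 1}))
    m2 = system_autonomous(m1, "p2", "p4")
    print(m1)
    print(m2)
```

A synchronised event would combine both steps atomically, which is what the third case in the list above describes.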
3 Maude Implementation of Eos
In this section, we describe and briefly discuss our Maude formalization of Eos, which extends that of (rewritable) PT systems given in [4,6]. In our description, we use a few Maude code excerpts and refer to Fig. 1 as an example. All the Maude source files are available at https://github.com/lgcapra/rewpt/new/EOS.

3.1 Maude
Maude [8] is an expressive, purely declarative language with a rewriting logic semantics [3]. Statements are (conditional) equations (eq) and rules (rl). Both sides of a rule/equation are terms of a given kind that may contain variables. Rules and equations have a simple rewriting semantics in which instances of the left-hand side are replaced by corresponding instances of the right-hand side. Maude’s expressivity is achieved through: equational pattern matching modulo operator equational attributes; user-definable operator syntax/evaluation strategy; sub-typing and partiality; generic types; reflection.

A Maude functional module (fmod) contains only equations and is a functional program defining one or more operations through equations, used as simplification rules. A functional module (with all the imported modules) specifies an equational theory in membership equational logic [1]. Formally, such a theory is a pair (Σ, E ∪ A), where Σ is the signature, that is, the specification of all the (sub)sort, kind$^{1}$, and operator declarations; E is the set of (conditional) equations and membership axioms, and A is the set of operator equational attributes (assoc, comm, ...). The model of (Σ, E ∪ A) is the initial algebra (denoted T_{Σ/E∪A}), which is both junk- and confusion-free and mathematically corresponds to the quotient of the (ground) term-algebra. Under certain conditions on E and A, the final values (canonical forms) of all ground terms form an algebra isomorphic to the initial algebra, i.e., the denotational and operational semantics coincide.

A Maude system module (mod) contains rewrite rules and possibly equations. Rules represent local transitions in a concurrent system. Formally, a system module specifies a generalized rewrite theory [3], a four-tuple R = (Σ, E ∪ A, φ, R)

$^{1}$ A kind is an equivalence class grouping sorts directly or indirectly related by subsort order; terms in a kind without a specific sort are undefined or error terms.
where (Σ, E ∪ A) is a membership equational theory; φ specifies, for each operator in Σ, the frozen arguments; and R is a set of rewrite rules$^{2}$. A rewrite theory specifies a concurrent system. (Σ, E ∪ A) defines the algebraic structure of the states. R and φ specify the system’s concurrent transitions. The initial model of R associates to each kind k a labeled transition system (category) whose states are T_{Σ/E∪A,k}, and whose transitions take the form $[t] \xrightarrow{[\alpha]} [t']$, with [t], [t′] ∈ T_{Σ/E∪A,k}, and [α] an equivalence class of rewrites modulo the equational theory of proof-equivalence. The executability condition for system modules is expressed by ground coherence, which ensures that a rewriting strategy in which terms are first reduced to canonical form and then rewritten according to the rules is both sound and complete.

3.2 Maude Formalization
The Eos formalization relies on three generic functional modules, BAG{X}, MAP+X,Y, SET+X (the last two being extensions of built-in modules). These modules may be arbitrarily nested thanks to a flexible mechanism of parameterized views (instantiating the type-parameters of a generic module). Differently from other Maude formalizations of PNs [22,24], bags are not merely represented as free commutative monoids on sets. A few bag-operators provide much more abstraction: . , + , [ ] - , ’ , set, * . The first two are constructors, i.e., appear in canonical forms. We can thus intuitively/conveniently represent a bag as a commutative/associative weighted sum, e.g., 3 . a + 1 . b. The module MAP+ defines a map term as a “set” of entries built using the associative/commutative juxtaposition , . Sub-sort Entry of Map has as a unique constructor |-> . Module MAP+ supplies, among others, a predicate verifying the uniqueness of a map’s keys, which is widely exploited (in data structures building on MAP) in membership equations implementing consistency checks.

PT System Formalization. The Maude specification in [4], here summarized, supplies an efficient operational semantics for dynamically reconfigurable PT nets and represents the basis for the Eos formalization. According to the Eos definition, however, dynamic adaptation comes down to net-token manipulation. Reconfiguration at the system-net level is part of ongoing work. Places/transitions are denoted by indexed/labelled terms, e.g., p(2,"net1"), t(1,"sys"). A transition’s incidence matrix is a triplet (constructor [ , , ] defined in module IMATRIX) of terms of sort Bag{Place} (defined in BAGP, an instance of BAG{X})$^{3}$. The modules PT-NET and PT-SYS hold the signature of a PT net/system. A net is a term of sort Map{Tran,Imatrix} (renamed Net), i.e., a semicolon-separated set of entries t(k,"lab") |-> [i,o,h], each entry being a term in subsort ImatrixT of Net.
A PT system is the empty juxtaposition (__ : Net Bag{Place} -> [System]) of a Net term and a Bag{Place} term representing the net’s marking. The use of a kind as the operator’s range means that it defines a partial function: the reason is that the net sub-term must be a consistent, non-empty map. A membership axiom characterizes well-defined System terms. This approach, typical of membership equational logic, is a good trade-off between rewriting efficiency and code compactness. The system module PT-EMU (listed below) specifies the operational semantics of PT systems by exploiting the effective algebraic representation of PT nets.

    mod PT-EMU is
      pr PT-SYS .
      var T : Tran .
      vars I O H S : Bag{Place} .
      var N N' : Net .
      crl [firing] : N S => N S + O - I if T |-> [I,O,H] ; N' := N /\ I ’ S .
    endm

$^{2}$ Rules don’t apply to frozen arguments.
$^{3}$ They represent the input, output, inhibitor connections, respectively.
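The firing rule that PT-EMU encodes on bags can be mimicked in Python with `collections.Counter` standing in for Bag{Place}. This is a sketch, not a translation of the module: the function names are hypothetical, and the inhibitor test (a place p in the support of H blocks the transition as soon as the marking holds at least H(p) tokens) is an assumption about the enabling condition, which the excerpt above does not spell out in full.

```python
from collections import Counter
from typing import Optional


def enabled(marking: Counter, inp: Counter, inh: Optional[Counter] = None) -> bool:
    """Enabling test: inputs are covered and no inhibitor threshold is reached."""
    marking = Counter(marking)
    inh = inh or Counter()
    has_inputs = all(marking[p] >= w for p, w in inp.items())
    not_inhibited = all(marking[p] < w for p, w in inh.items() if w > 0)
    return has_inputs and not_inhibited


def fire(marking: Counter, inp: Counter, out: Counter,
         inh: Optional[Counter] = None) -> Counter:
    """Return m' = m + O - I, as in the right-hand side of rule [firing]."""
    if not enabled(marking, inp, inh):
        raise ValueError("transition not enabled")
    new_marking = Counter(marking)
    new_marking.update(out)
    new_marking.subtract(inp)
    return +new_marking  # drop places left with zero tokens


if __name__ == "__main__":
    # A transition with input a1 and output b1, fired on the marking 1.a1 + 1.b1
    m = Counter({"a1": 1, "b1": 1})
    print(fire(m, Counter({"a1": 1}), Counter({"b1": 1})))
```

Keeping the marking as a multiset makes the update a plain bag sum and difference, which is the same design choice that lets the Maude rule stay a one-liner.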
The conditional rewrite rule firing intuitively encodes the PT firing rule. All the involved operators are bag-operators. The matching equation (t := t’) in the rule’s condition makes it very compact. The model-specific part consists of a system module importing PT-EMU and containing two zero-arity operators of range Net and Bag{Place}, respectively, describing a given PT system.

3.3 EOS Specification
The Eos specification extends and reflects in part that of (reconfigurable) PT systems. A few functional modules specify the Eos algebraic structure. A system module (EOS-EMU) specifies the (non-trivial) Eos operational semantics. A couple of auxiliary modules (BAG-SPLIT, MAP-PROD) include some operators needed to mimic the inner steps of transition firing, in particular, the computation of firing instances of a system-net transition and the consequent (possibly nondeterministic) distribution of net-tokens on its post-set. Finally, a specific system module instantiates a given Eos model. For the reader’s convenience, we mix the textual description with a few code excerpts. We use Fig. 1 to illustrate the main concepts.

Eos net. A term describing an Eos net (module EOS-NET) is the empty juxtaposition of three sub-terms of sorts Net, Map{String,Net} (renamed NeTypeS) and Map{Tran,Map{String,Bag{Tran}}} (renamed Syncmap). The resulting whole term is of kind [Sysnet] (because of possible inconsistencies among its components). The 2nd and 3rd sub-terms specify the types of net-tokens and the synchronization between system- and object-net transitions (for each object-net type), respectively. These sub-terms are equipped with ad-hoc operators and separately defined (modules NET-TYPES and SYNCHRO). Note that, according to the Eos definition, a system-net transition may synchronize with multiple occurrences of object-net transitions. The type of net-tokens a system-net place may hold is that associated with the place’s label in the NeTypeS sub-term. If there is no association, the system-net place may only hold “black-tokens”. The three categories of events possible in an Eos, instead, meet the following conventions.
1. System-autonomous: system transitions not occurring in the Syncmap sub-term.
2. Object-autonomous: nested transitions for which the predicate op synchronized : Syncmap Tran -> Bool evaluates to false (this predicate checks that a given transition appears among the values, which are maps in turn, of the Syncmap sub-term).
3. Synchronisations: defined implicitly by exclusion.

A membership axiom connotes well-formed Eos as terms of sort Sysnet (N:Net, Ty:NeTypeS, Sy:Syncmap are variables; the predicate coherent(Sy, Ty) checks that every nested transition occurring in Syncmap belongs to the corresponding net-type).

    cmb N Ty Sy : Sysnet if welldef(N) and-then welldef(Ty) and-then not(repeatedKeys(Sy)) and-then coherent(Sy, Ty) .
Using a uniform syntax (that of “enriched” PT nets) for the system-net and net-tokens is convenient in terms of description/algebraic manipulation and significantly enhances Eos expressivity. Furthermore, the adopted signature may be easily adapted to support an arbitrary nesting of nets.

Eos System. The Eos dynamics is mimicked using a structured state representation, in which the basic generic types are reciprocally nested. A term of sort Map{Place,Bag{Bag{Place}}} specifies an Eos marking as a map from system-net places to multisets of multisets of net-token places: a term of sort Bag{Place} indeed represents a marking of the net-token type associated with a certain system-net place, and its multiplicity in the top multiset is the number of net-tokens in the system-net place with that marking. The nil ground term (representing the empty Bag{Place}) may also denote, without any ambiguity, anonymous tokens in untyped system-net places. For example, the Eos marking in Fig. 1 is described by the term ("net1", "net2" refer to the net-token types):

    p(1,"net1") |-> 1 . nil + 1 . (1 . p(1, "a1") + 1 . p(2, "b1")) ;
    p(2,"net1") |-> 1 . 1 . p(1, "a1") ;
    p(3,"net2") |-> 1 . (1 . p(1, "a2") + 1 . p(2, "b2"))
A marked Eos is formally described by the empty juxtaposition of a Sysnet sub-term and a Map{Place,Bag{Bag{Place}}} sub-term (module EOSYS). Due to possible inconsistencies, the operator’s range is the kind [Eosystem]. As usual, we use a membership axiom to connote terms of sort Eosystem as those in which every system-net place is a key in the Eos marking sub-term. No check is done on net-token places, for the sake of efficiency and coherently with the fact that (in a mutable context) they may contain isolated places.

    var S : Sysnet . var M : Map{Place,Bag{Bag{Place}}} .
    cmb S M : Eosystem if not(repeatedKeys(M)) and-then places(net(S)) subset keySet(M) .
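The nested state representation can be pictured with ordinary Python containers. The sketch below is an illustration under stated assumptions, not the EOSYS module: a net-token marking (a Bag{Place}) becomes a hashable `frozenset` of (place, count) pairs, a system-net place maps to a `Counter` of such bags, and the helper names `bag`, `net_token_count` and `covers` are hypothetical. `covers` only mirrors the key-coverage idea of the membership axiom; since a Python dictionary cannot hold repeated keys, no `repeatedKeys` analogue is needed.

```python
from collections import Counter


def bag(**places: int) -> frozenset:
    """Hashable multiset of net-token places; bag() plays the role of nil."""
    return frozenset((p, n) for p, n in places.items() if n > 0)


NIL = bag()

# The marking quoted in the text for Fig. 1: system-net place -> bag of net-token markings
marking = {
    ("p", 1, "net1"): Counter({NIL: 1, bag(a1=1, b1=1): 1}),
    ("p", 2, "net1"): Counter({bag(a1=1): 1}),
    ("p", 3, "net2"): Counter({bag(a2=1, b2=1): 1}),
}


def net_token_count(m: dict, place: tuple) -> int:
    """Number of net-tokens (of any marking) held by a system-net place."""
    return sum(m.get(place, Counter()).values())


def covers(m: dict, system_places) -> bool:
    """True when every listed system-net place is a key of the marking."""
    return set(system_places) <= set(m)


if __name__ == "__main__":
    print(net_token_count(marking, ("p", 1, "net1")))
    print(covers(marking, [("p", 1, "net1"), ("p", 2, "net1")]))
```

Two net-tokens with the same inner marking collapse into one entry with multiplicity two, which is exactly what a bag of bags expresses.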
With a view to allowing an arbitrary nesting of nets, generalized multisets (whose elements may be bags in turn) should be used instead of nested bags.

Eos Operational Semantics. In accordance with the two-level structure of Eos, two firing rules are encoded as rewrite rules: one for system-net transitions (possibly taking account of synchronizations), the other for autonomous net-token transitions. As in High-Level PN, a system-net transition t may fire in different modes in an Eos marking m. An instance of t has the same algebraic representation as m, i.e., it is a Map{Place,Bag{Bag{Place}}} term whose map’s keys are t’s preset. In other words, an instance of t is a sub-multiset of m. Both Eos firing rules use the firing rule of PT systems (inhibitor edges at system-net level currently have a merely numerical meaning). The system-net firing rule exploits two main operators (module EOSYS):

    op firingmap : Eosystem -> Map{Tran, Set{Map{Place,Bag{Bag{Place}}}}} .
    op firings : ImatrixT Map{Place, Bag{Bag{Place}}} Syncmap -> [Set{Map{Place, Bag{Bag{Place}}}}] .
firingmap computes the enabled firing instances for every system-net transition, taking account of synchronizations. It builds on a few auxiliary operators defined in generic modules BAG-SPLIT, MAP-PROD, in particular:

    op split : Bag{X} Nat -> Set{Bag{X}} .
    op prod : Map{X,Set{Y}} -> [Set{Map{X,Y}}] .
split splits a bag into sub-bags of a given size; prod performs a kind of Cartesian product of maps having sets as values. In the event of synchronization, t’s instances are filtered according to the enabling of synchronized nested transitions in the current marking. The intrinsic complexity of all these operations may be alleviated using the memo operator attribute. System module EOS-EMU formalizes the Eos operational semantics.

    mod EOS-EMU is
      pr EOSYS .
      inc PT-EMU .
      var FM : Map{Tran, Set{Map{Place,Bag{Bag{Place}}}}} . *** whole firing map
      var TI : Entry{Tran, Set{Map{Place,Bag{Bag{Place}}}}} . *** transition instance
      var NeFS : NeSet{Map{Place,Bag{Bag{Place}}}} .
      var FS : Set{Map{Place,Bag{Bag{Place}}}} . *** output firing set
      vars I O M M' : Map{Place,Bag{Bag{Place}}} . *** firing instance/EOS marking
      vars N N' : Net . var T : Tran . var Ty : NeTypeS . var Sy : Syncmap .
      var Q : Imatrix . var S : String . vars J K : NzNat .
      vars B B' : Bag{Place} . var B2 : Bag{Bag{Place}} .
      rl [select] : I U NeFS => I . *** non-deterministic extraction of an instance
      crl [inst] : N Ty Sy M => N Ty Sy (M - I) + O
        if N' ; T |-> Q := N /\ firingmap(N Ty Sy M)[T] => I /\ firings(T |-> Q, I, Sy) => O .
      crl [aut] : SN (p(J,S) |-> K . B + B2) => SN (p(J,S) |-> (K . B - 1 . B) + 1 . B' + B2)
        if N (S |-> (T |-> Q ; N') ; Ty) Sy := SN /\ not(synchronized(Sy, T)) /\ (T |-> Q) B => (T |-> Q) B' .
    endm
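The two auxiliary operators can be approximated in Python with `itertools`. The sketch assumes that split returns every distinct sub-bag of the requested size and that prod picks one value per key in all possible ways; both readings follow the one-line descriptions above and are not taken from the BAG-SPLIT or MAP-PROD sources.

```python
from collections import Counter
from itertools import combinations, product


def split(bag: Counter, size: int) -> set:
    """All distinct sub-bags of `bag` containing exactly `size` elements.

    Each sub-bag is returned as a frozenset of (element, count) pairs so
    that duplicates arising from repeated elements collapse in the set.
    """
    elements = sorted(bag.elements())
    return {frozenset(Counter(combo).items())
            for combo in combinations(elements, size)}


def prod(mapping: dict) -> list:
    """Cartesian product of a map with set values: one chosen value per key."""
    keys = sorted(mapping)
    value_lists = [sorted(mapping[k]) for k in keys]
    return [dict(zip(keys, choice)) for choice in product(*value_lists)]


if __name__ == "__main__":
    print(split(Counter({"x": 2, "y": 1}), 2))
    print(prod({"p1": {"a", "b"}, "p2": {"c"}}))
```

Applying a split per preset place and then a prod over the resulting map is one way to enumerate candidate firing instances, which is the role the text assigns to these operators inside firingmap.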
Rules inst and aut encode the firing of a system-net transition and of an autonomous nested transition, respectively. Rule inst relies in turn on select, which emulates the non-deterministic selection of an instance. We exploit the fact that conditional rewrite rules can have very general conditions involving matching equations, memberships and also other rewrites. Rule inst implements double non-determinism: first for selecting an (enabled) instance of t, then for choosing one of the possible output markings generated by that instance. The rule uses the operators + , - defined on Map{Place,Bag{Bag{Place}}} terms. In the event of synchronization, the operator firings embeds the changes to the markings of net-tokens. Rule aut exploits the synchronized predicate and the PT system firing rule encoded in module PT-EMU. The firing of an instance of t may be non-deterministic, i.e., result in alternative multisets of net-tokens to distribute on t’s post-set (Sect. 1). Operator firings calculates this set for a system-net transition instance (the transition’s incidence matrix and the synchronization map are the other arguments). It builds on:

    op distribute : Bag{Place} Bag{Place} -> [Set{Map{Place,Bag{Bag{Place}}}}] .
that distributes the (pre-calculated) cumulative marking obtained by an instance of t (1st arg) to t’s post-set (2nd arg), assuming that the two arguments refer to system-net places of the same type. This operation is tricky and reduces to enumerating the partitions of a multiset. Not to reinvent the wheel, we have rephrased in Maude (which is far from trivial) Knuth’s algorithm [11], which avoids generating duplicates. As a concrete example of Maude formalization of Eos, we include the system module SIMPLE-EOS, which specifies the Eos in Fig. 1, where we assume that only the system-net places {pi}, i : 1 . . . 3, are initially marked.

    mod SIMPLE-EOS is
      inc EOS-EMU .
      ops net type1 type2 : -> Net .
      op netype : -> NeTypeS .
      op m0 : -> Map{Place,Bag{Bag{Place}}} .
      op eosnet : -> Sysnet .
      op sync : -> Syncmap .
      eq net = t(1,"sys") |-> [1 . p(1,"net1") + 1 . p(2,"net1") + 1 . p(3,"net2"),
                               1 . p(4,"net1") + 1 . p(5,"net2") + 1 . p(6,"net2"), nil] .
      eq type1 = t(1, "") |-> [1 . p(1,"a1"), 1 . p(2,"b1"), nil] .
      eq type2 = t(2,"") |-> [1 . p(1,"a2") + 1 . p(2,"b2"), 1 . p(3,"c2"), nil] .
      eq netype = "net1" |-> type1 ; "net2" |-> type2 .
      eq sync = t(1,"sys") |-> ("net1" |-> 1 . t(1,"") ; "net2" |-> 1 . t(2,"")) .
      eq eosnet = net netype sync .
      eq m0 = p(1,"net1") |-> 1 . nil + 1 . (1 . p(1,"a1") + 1 . p(2,"b1")) ;
              p(2,"net1") |-> 1 . 1 . p(1,"a1") ;
              p(3,"net2") |-> 1 . (1 . p(1,"a2") + 1 . p(2,"b2")) .
    endm
We refer to the Eos with the term eosnet m0 using a simple aliasing. By running the reduce inline command of the Maude interpreter

    Maude> red firinginstances(eosnet m0, t(1, "sys")) .
we get two enabled instances for the system-net transition t(1,"sys"), in accordance with the two net-tokens of type "net1" in place p(1,"net1") (Fig. 1). The following command, instead, searches (using a breadth-first strategy) for reachable final (!) states.

    Maude> search eosnet m0 =>! X:Eosystem .

The command has four matches, which express the nondeterminism of Eos transition firing: for each instance of t(1,"sys") there are indeed two possible output markings, corresponding to the ways to distribute the net-token of type "net2" on places p5, p6.
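The count of four matches can be reproduced with a small enumeration. The sketch below is a naive enumerator written for illustration; it is not the Knuth-based algorithm mentioned earlier, the names `compositions` and `distribute` are hypothetical, and output places are treated as distinguishable, each receiving a (possibly empty) share of the cumulative marking.

```python
from collections import Counter
from itertools import product


def compositions(n: int, k: int):
    """Yield every k-tuple of non-negative integers that sums to n."""
    if k == 1:
        yield (n,)
        return
    for first in range(n + 1):
        for rest in compositions(n - first, k - 1):
            yield (first,) + rest


def distribute(tokens: Counter, places: list) -> list:
    """Enumerate all ways to spread the multiset `tokens` over `places`."""
    items = sorted(tokens.items())
    per_item = [list(compositions(count, len(places))) for _, count in items]
    results = []
    for combo in product(*per_item):
        assignment = {p: Counter() for p in places}
        for (item, _), counts in zip(items, combo):
            for place, c in zip(places, counts):
                if c:
                    assignment[place][item] = c
        results.append(assignment)
    return results


if __name__ == "__main__":
    # After the synchronised firing of t2, the net2 marking holds one token in c2
    outputs = distribute(Counter({"c2": 1}), ["p5", "p6"])
    instances = 2  # the two enabled instances reported by the reduce command
    print(len(outputs), instances * len(outputs))
```

The single token in c2 can end up in p5 or in p6, giving two output markings per instance and therefore four final states in total, which matches the search result quoted above.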
4 An Example: The Production Line
We consider a production plant with two production lines as an example. This scenario has been used as a case study for Rewritable PT Nets [4] in previous work. Here, we will model the scenario in terms of nets-within-nets. In the scenario we have raw material and two operations, t1 and t2, working on pieces of it. We have two production lines (i.e. robots), both being capable of performing t1 and t2. We assume that the two lines have different qualities for these operations, i.e. it is better to execute t1 in line 1 and t2 in line 2. This is the standard production plan. It is possible that during the execution one of the two lines becomes faulty. A double failure has a negligible probability and is not modelled here.
Fig. 2. The System Net modelling the Production Lines.
4.1 Eos-Model
The scenario is well suited for an Eos-model as we have two obvious levels in our model: the production site and the production plans. The model is specified in the syntax of the Renew tool [21]. The system level shown in Fig. 2 describes the two production lines, the execution life cycle of the production plan, and the dropout of lines. In the system net the place p0 indicates normal operation. In this mode the transition t0 takes two tokens (i.e. the raw material) from place p1 and activates the normal production plan, i.e. it generates a net-token of type plan via the inscription x:new plan. The two production lines are shown as blocks. Their transitions are side conditions to the place production plan. They synchronise via channel inscriptions of the form x:line1 t1(). These channels have a counterpart in the net-tokens (e.g. the net plan, shown in Fig. 3, has the corresponding inscription line1 t1()). Therefore, we have a synchronous firing of the transition in the production line and in the production plan.
Fig. 3. The Object Net plan modelling the standard Production Plan.
The net-tokens describe the different production plans: the production plan for normal operation is given as the net plan shown in Fig. 3, while we also have two fall-back plans, fallback 1 and 2, for the case of dropouts. Specifically, the production plan from Fig. 3 specifies a synchronisation via line1 t1() and via line2 t2(), so we will execute t1 in line 1 and t2 in line 2. When the production plan is finished we synchronise via do t3() and delete the net-token, i.e. the plan. For simplicity the scenario restarts the production via the transition t4, which regenerates two raw materials on place p1. This reset makes the main loop of the model live, i.e. it allows an infinite repetition of production. Now, we will look at the adaption part and how the model preserves this liveness even in the case of dropouts. On the left part of Fig. 2 (the yellow nodes) we have the adaption part: transitions t5 and t6 model the dropout of one production line. Whenever we have a dropout, the standard production plan is not executable anymore. Therefore, the transitions fall back to production line 1/2 withdraw it. After a dropout the places p7 and p8 indicate which of the two lines is down. According to this information we will select the appropriate fallback plan, i.e. in the case of a dropout in line one (p7 is marked) we switch
to the fall-back plan 2 and vice versa. The two fallback blocks have almost the same structure as the original block. The only difference is that we generate a different net-token via x:new fallback1 in the case of the side condition p8 and via x:new fallback2 in the case of the side condition p7. To avoid lots of crossing arcs we use the Renew feature of so-called virtual places (drawn with a double outer line) denoting references to the original places. The two fallback plans fallback 1 and 2 are not shown here as they look almost identical to the basic plan (Fig. 3). The only difference is the channels used: fallback 1 works only with production line 1, which is expressed by the channels line1 t1() and line1 t2(). Analogously for fallback 2. The Maude specification of the adaptable Production Line Eos is highly modular (at both net levels): we obtained it by composing a few base sub-nets through the associative/commutative net-juxtaposition operator (;).

    fmod pline-NET is *** structures used in the Production Line EOS
      pr PT-NET * (op emptyPbag to nil) .
      pr CONVERSION .
      vars S S' : String . var N : Nat . vars L L' : Bool .
      *** auxiliary operators used for node labelling
      op l : Bool -> NzNat . *** boolean values denote the two lines/operations
      eq l(true) = 1 .
      eq l(false) = 2 .
      op str : Nat -> String . *** converts a number to a string
      eq str(N:Nat) = string(N:Nat,10) .
      op str : String Bool -> String .
      eq str(S,L) = S + str(l(L)) . *** concatenates S with the number of line L
      *** base net-token modules
      op line : String Bool Bool -> Net .
      *** the 1st arg denotes the operation t_i, the 2nd line j, the 3rd the net-token's label
      eq line(S,L,L') = t(l(L),str(S + ":line",L')) |-> [1 . p(1 + l(L),S), 1 . p(3 + l(L),S), nil] .
      op start : String -> Net . *** t0 (start-up): the arg is the net-token's label
      eq start(S) = t(0,S) |-> [nil, 1 . p(1,S) + 1 . p(2,S) + 1 . p(3, S), 1 . p(1,S)] .
      *** an inhibitor edge used instead of a marked place
      op assemble : String -> Net . *** t3 (assembler)
      eq assemble(S) = t(3,S + ":do_t3 ") |-> [1 . p(4,S) + 1 . p(5, S), 1 . p(6,S), nil] .
      *** net-tokens
      ops plan : -> Net .
      eq plan = start("plan") ; line("plan",true,true) ; line("plan",false,false) ; assemble("plan") .
      ops fallback : Bool -> Net . *** same structure as plan
      eq fallback(L) = start(str("fb",L)) ; line(str("fb",L),L,L) ; line(str("fb",L), not(L),L) ;
                       assemble(str("fb",L)) .
      *** system-net's modules
      op pline : String Bool Bool -> Net .
      eq pline(S,L,L') = t(l(L),str(S + "_x:line",L')) |->
            [1 . p(2,S) + 1 . p(6 + l(L'),"-"), 1 . p(2,S) + 1 . p(6 + l(L'),"-") , nil] .
      ops sysnet lifecycle : -> Net . *** object-nets and system-net modules
      eq lifecycle = t(0,"x:new_plan") |-> [2 . p(1,"") + 1 . p(0,""), 1 . p(2,"plan") + 1 . p(0,"") , nil] ;
                     t(4,"") |-> [1 . p(6,""), 2 . p(1,""), nil ] ;
                     t(3,"x:do_t3 ") |-> [1 . p(2,"plan"), 1 . p(6,""), nil ] .
      ops adapt fbkcycle : Bool -> Net .
      eq adapt(L) = t(4 + l(L),"") |-> [1 . p(0,"") + 1 . p(6 + l(L),"-"), 1 . p(6 + l(L),"") , nil] ;
                    t(7,str(" fb_to_pl",not(L))) |->
                      [ 1 . p(2,"plan") + 1 . p(6 + l(L),""), 1 . p(6 + l(L),"") + 2 . p(1,""), nil] .
      eq fbkcycle(L) = t(0,str(" x:new_fb",L)) |->
                         [2 . p(1,"") + 1 . p(6 + l(not(L)),""), 1 . p(2,str("fb",L)) + 1 . p(6 + l(not(L)),""), nil] ;
                       t(3,str("fb",L) + "_x:do_t3 ") |-> [1 . p(2,str("fb",L)), 1 . p(6,""), nil ] .
      eq sysnet = lifecycle ; pline("plan",true,true) ; pline("plan",false,false) ; *** nominal life-cycle
                  adapt(true) ; adapt(false) ; *** adaptation sub-net
                  fbkcycle(true) ; pline("fb1",true,true) ; pline("fb1",false,true) ; *** fallback to line 1
                  fbkcycle(false) ; pline("fb2",true,false) ; pline("fb2",false,false) . *** fallback to line 2
      op K : -> NzNat . *** model's parameter (#raw pieces)
      eq K = 2 .
    endfm

    fmod PLINE-EOS is *** production-line EOS -- aliasing for the main components
      pr EOSYS . pr pline-NET .
      op netype : -> NeTypeS . *** net-token types
      op eosnet : -> Sysnet . *** EOS net-structure
      op sync : -> Syncmap . *** synchronizations
      op eosm0 : -> Map{Place,Bag{Bag{Place}}} . *** initial marking
      eq netype = "plan" |-> plan ; "fb1" |-> fallback(true) ; "fb2" |-> fallback(false) .
      eq sync = t(1, "plan_x:line1") |-> ("plan" |-> 1 . t(1, "plan:line1")) ;
                t(2, "plan_x:line2") |-> ("plan" |-> 1 . t(2, "plan:line2")) ;
                t(1, "fb1_x:line1") |-> ("fb1" |-> 1 . t(1,"fb1:line1")) ;
                t(2, "fb1_x:line1") |-> ("fb1" |-> 1 . t(2,"fb1:line1")) ;
                t(1, "fb2_x:line2") |-> ("fb2" |-> 1 . t(1,"fb2:line2")) ;
                t(2, "fb2_x:line2") |-> ("fb2" |-> 1 . t(2,"fb2:line2")) ;
                t(3, "x:do_t3 ") |-> ("plan" |-> 1 . t(3, "plan:do_t3 ")) ;
                t(3, "fb1_x:do_t3 ") |-> ("fb1" |-> 1 . t(3, "fb1:do_t3 ")) ;
                t(3, "fb2_x:do_t3 ") |-> ("fb2" |-> 1 . t(3, "fb2:do_t3 ")) . *** system-net modules
      eq eosnet = sysnet netype sync .
      eq eosm0 = p(1, "") |-> (2 * K) . nil ; p(7, "-") |-> 1 . nil ; p(8, "-") |-> 1 . nil ;
                 p(0, "-") |-> 1 . nil .
    endfm
4.2 Analysis of the Maude Representation
Below we report some evidence of formal verification with the only intent of showing the potential of our approach based on Maude. Since we specify
L. Capra and M. K¨ ohler-Bußmeier
adaptation in a way that preserves the liveness of the production process, we focus on this kind of property. We use the two basic analysis tools, namely the reduce command, which prints the canonical form of a ground term, and the search state-space explorer (both built into the Maude interpreter). As usual, we manage wordy terms using intuitive aliases. For example, the net term describes the system-net component of the Eos, whereas the eosnet term includes the net-token description and synchronizations. Terms m0 and eosm0 denote the initial marking of the net (seen as a PT system) and of the Eos, respectively. We can use the reduce command both to unfold these aliases and to perform a preliminary formal validation: if the canonical form is assigned a specific sort (i.e., Eosystem), the initial term represents a well-defined Eos specification:
Maude> reduce in PLINE-EOS : eosnet eosm0 .
result Eosystem: (... unfolded term)
The next two searches operate on the PT system we obtain from the system-net by replacing net-tokens with anonymous tokens. One of the advantages of Eos, indeed, is that we can consider and analyse the two Eos levels separately. The first search verifies a state invariant that characterizes the production plan's life-cycle, in a configuration where place p1 initially holds four tokens. The search has no matches (solutions), consistent with the fact that we search for a counter-example. Two things are worth noticing. An implicit outcome of the command is that the PT system derived from the system-net is bounded, because the state space turns out to be finite. Moreover, this invariant (and the model's boundedness as well) is actually structural, i.e., it holds for any configuration with 2 ∗ K tokens initially in place p1 (K being the model's parameter). Note, however, that a plain state-space exploration cannot prove parametric invariants. The second search checks the absence of final (dead) states. Again, there is no match.
Maude> search in pline-SYS : net m0 =>* X:System such that marking(X:System)[p(1,"")] + 2 * marking(X:System)[p(2,"plan")] + 2 * marking(X:System)[p(2,"fb1")] + 2 * marking(X:System)[p(2,"fb2")] + 2 * marking(X:System)[p(6,"")] =/= 4 .
Maude> search in pline-SYS : net m0 =>! X:System .
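The two searches above can be mimicked outside Maude with a small breadth-first exploration. The following is a minimal sketch, assuming a hypothetical two-place toy net (it is not the paper's production-line net): the names `TRANSITIONS`, `successors` and `explore` are illustrative only. It looks for a counter-example to a weighted token invariant and for dead markings in the same pass.

```python
from collections import deque

# Hypothetical toy PT net (not the paper's net): place order is (p1, p2).
# Each transition maps to (tokens consumed, tokens produced).
TRANSITIONS = {
    "start":  ((2, 0), (0, 1)),   # takes 2 tokens from p1, puts 1 token in p2
    "finish": ((0, 1), (2, 0)),   # takes 1 token from p2, returns 2 tokens to p1
}

def successors(marking):
    """Markings reachable from `marking` by firing one enabled transition."""
    result = []
    for pre, post in TRANSITIONS.values():
        if all(m >= c for m, c in zip(marking, pre)):
            result.append(tuple(m - c + p for m, c, p in zip(marking, pre, post)))
    return result

def explore(m0, violates):
    """Breadth-first exploration; terminates only if the net is bounded."""
    seen, queue = {m0}, deque([m0])
    counterexamples, dead = [], []
    while queue:
        m = queue.popleft()
        if violates(m):
            counterexamples.append(m)   # plays the role of a match of `=>* ... such that`
        nxt = successors(m)
        if not nxt:
            dead.append(m)              # plays the role of a match of `=>!`
        for s in nxt:
            if s not in seen:
                seen.add(s)
                queue.append(s)
    return seen, counterexamples, dead

# Invariant of the toy net: m(p1) + 2 * m(p2) = 4 when p1 initially holds 4 tokens.
states, counterexamples, dead = explore((4, 0), lambda m: m[0] + 2 * m[1] != 4)
```

As in the Maude searches, empty `counterexamples` and `dead` lists mean that no reachable marking violates the invariant or is a deadlock, and the finiteness of `states` witnesses boundedness for this particular initial marking only.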
In the same way, we can search starting from an Eosystem term that specifies the whole Eos. For example, the following (unmatched) search is much more significant than the previous ones, because it formally verifies that the complete Eos, including nested nets, is deadlock-free (and, implicitly, bounded). Both the state space and the execution time grow considerably.
Encoding Nets-Within-Nets in Maude
Table 1. Performance of Search Command as the Model's Parameter Varies.

K     # states      # rewrites      time (ms)
2     262           19 801          38
5     2 802         180 539         453
10    13 932        995 724         3 192
20    104 007       19 737 494      56 090
50    4 186 132     111 564 504     906 253
Maude> search in pline-EOS : eosnet eosm0 =>! E:Eosystem .
Table 1 reports the performance of the last search as the system's parameter varies. The data refer to an Intel Core i7-6700 equipped with 32 GB of RAM. We may also check LTL formulae using the corresponding on-the-fly model checker of the Maude system. For example, we can verify that the Eos initial marking is a home-marking, i.e., it is reachable from each reachable Eos marking. State-space exploration techniques suffer from possible state-space explosion, which may only be alleviated using some heuristics, e.g., by carrying out bounded searches (the search command has a number of options) or by making some abstractions (as we did in the first two searches). Another important drawback is that, in general, we cannot infer parametric outcomes, i.e., outcomes that do not depend on the initial marking. Structural analysis, which considers the PT graph structure, is an effective alternative (and complement) to state-space inspection. We may use it to infer parametric outcomes, e.g., structural state-invariants (semiflows). Structural analysis of Eos looks promising because it may take advantage of the fact that the types of net-tokens flowing through the system-net's places form a finite set.
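To illustrate why a semiflow gives a parametric result, the sketch below checks a place semiflow directly on an incidence matrix. It assumes a hypothetical two-place, two-transition toy net (not the paper's system-net); `INCIDENCE`, `is_p_semiflow` and `weighted_sum` are illustrative names. A non-negative, non-zero weight vector y with y·C = 0 keeps the weighted token count constant under every firing, whatever the initial marking.

```python
# Hypothetical toy net: rows are places (p1, p2), columns are transitions
# (start, finish); each entry is the net token change when the transition fires.
INCIDENCE = [
    [-2, 2],
    [1, -1],
]

def is_p_semiflow(weights, incidence):
    """True if `weights` is a non-negative, non-zero vector y with y . C = 0."""
    if any(w < 0 for w in weights) or not any(weights):
        return False
    n_transitions = len(incidence[0])
    return all(
        sum(w * row[t] for w, row in zip(weights, incidence)) == 0
        for t in range(n_transitions)
    )

def weighted_sum(weights, marking):
    """Value of the invariant y . m for a given marking m."""
    return sum(w * m for w, m in zip(weights, marking))

# y = (1, 2) is a semiflow, so m(p1) + 2 * m(p2) stays equal to its initial
# value 2 * K for every K, without exploring any state space.
invariant_values = {K: weighted_sum([1, 2], (2 * K, 0)) for K in (2, 5, 50)}
```

The check costs one matrix-vector product, in contrast with the growth of the state space reported in Table 1.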
5 Summary and Outlook
In this paper, we have defined a Maude representation of nets-within-nets, more concretely of Eos. What are the Strengths of the Approach? The Maude formalization presented here is an extension of our previous work on the formalization of (rewritable) PT nets [4]. Therefore, a lot of code could be reused, which benefits the implementation's reliability and efficiency as well. The formalisation preserves central design aspects of Eos; namely, it supports a uniform view: the system-net and the net-tokens have the same structure in Maude, which is essential as the same is true for Eos. This aspect is relevant when the architecture is extended from the two-level case to an unbounded nesting of net-tokens in a marking [16] or to Hornets [13]. Standard Maude facilities for formal
verification (e.g., state-space search and model-checking) may be used at no additional cost. Additionally, the Eos firing rule is defined in such a way that moving net-tokens around cannot be distinguished from moving ordinary tokens in PT nets. Therefore, one can easily define abstractions on the system's state (e.g., forgetting about the marking of net-tokens), which is essential for performing state-space exploration efficiently. Our approach naturally allows for model extensions. A natural one would be the use of inhibitor arcs. While the extension from Eos to Hornets is a rather large step, which involves several new constructs such as net-algebra operators, the formalization of Hornets in Maude seems very simple (a basic net-algebra has already been defined, which allows us to introduce new types of net-tokens by composing existing ones). So, we may easily go one step further, towards "rewritable" Eos, where the structure of both the system net and the net-tokens may change over time. What was Complex? The most challenging aspect of the formalisation was the integration of the so-called firing modes. Roughly speaking, the firing rule of a system-autonomous event in an Eos collects the tokens of all net-tokens coming from the system-net places in the preset. When the system-net transition fires, it distributes all these tokens over freshly generated net-tokens in the postset. The firing rule allows any of these possible distributions, an aspect which requires some tricky handling of the binding in Maude. Limitations. The current formalisation fulfils the requirement that it provides a link to the world of programming. But we have to admit that, as in any algebraic specification, terms describing Eos may be wordy, structurally complex and (consequently) difficult to read and manage. An aliasing mechanism (used here in a naive way) might greatly help a modeller. Also, syntactic sugar would sweeten the approach.
Of course, an automated translation from a high-level (graphical) description of Eos to the corresponding Maude module would be highly desirable. Ongoing Work. In this paper, we were mainly concerned with the Maude encoding of Eos. Our main motivation for this is to obtain a representation closer to the usual programming language world. But of course, our intention is also to benefit from the advantages of a formal specification, i.e., the possibility of applying analysis techniques more easily. In the case of Maude, the first idea is to apply state-space-related techniques, like LTL model checking. We would also like to integrate structural PN techniques for Eos. For the analysis of Eos, we need to tackle scalability issues, as the Eos state space grows even faster than that of PT systems. Possible approaches to scalability are the canonization of net-tokens (which has been implemented in [5] in a general, non-optimized way) and the use of abstractions on markings to obtain condensed state spaces. Fortunately, the latter can be expressed quite easily in Maude by adding extra equations on markings.
From a more general perspective, our final goal is to propose a unifying Maude-based (algebraic) PN framework for all the existing variations inspired by the nets-within-nets blueprint, like Hornets and Reflective PN [7].
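The firing modes discussed under "What was Complex?" can be pictured as an enumeration problem: the tokens collected from the consumed net-tokens must be split, in every possible way, over the freshly generated net-tokens. The following is a minimal sketch under a simplified reading of the firing rule, not the Maude encoding itself; `compositions` and `distributions` are hypothetical helper names, and the fresh net-tokens are assumed to be distinguishable (for instance, by the system-net place that receives them).

```python
from itertools import product

def compositions(n, k):
    """All k-tuples of non-negative integers that sum to n (requires k >= 1)."""
    if k == 1:
        yield (n,)
        return
    for first in range(n + 1):
        for rest in compositions(n - first, k - 1):
            yield (first,) + rest

def distributions(bag, k):
    """All ways to split a token bag {place: count} over k fresh net-tokens.

    Each result is a list of k markings (dicts); places with 0 tokens are omitted.
    """
    places = sorted(bag)
    per_place = [list(compositions(bag[p], k)) for p in places]
    for choice in product(*per_place):
        yield [
            {p: choice[i][j] for i, p in enumerate(places) if choice[i][j] > 0}
            for j in range(k)
        ]

# Example: 2 tokens on "a" and 1 token on "b", split over 2 fresh net-tokens.
modes = list(distributions({"a": 2, "b": 1}, 2))
```

For a place holding n tokens there are C(n + k - 1, k - 1) splits over k net-tokens, so the number of firing modes is the product of these binomials over all places, which gives a feel for why the binding needs careful handling.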
References
1. Bouhoula, A., Jouannaud, J.-P., Meseguer, J.: Specification and proof in membership equational logic. Theoret. Comput. Sci. 236(1), 35–132 (2000)
2. Bruni, R., Corradini, A., Gadducci, F., Lluch Lafuente, A., Vandin, A.: Modelling and analyzing adaptive self-assembly strategies with Maude. In: Durán, F. (ed.) WRLA 2012. LNCS, vol. 7571, pp. 118–138. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-34005-5_7
3. Bruni, R., Meseguer, J.: Generalized rewrite theories. In: Baeten, J.C.M., Lenstra, J.K., Parrow, J., Woeginger, G.J. (eds.) ICALP 2003. LNCS, vol. 2719, pp. 252–266. Springer, Heidelberg (2003). https://doi.org/10.1007/3-540-45061-0_22
4. Capra, L.: A Maude implementation of rewritable Petri nets: a feasible model for dynamically reconfigurable systems. In: Gleirscher, M., van de Pol, J., Woodcock, J. (eds.) Proceedings First Workshop on Applicable Formal Methods, virtual, 23 November 2021. Electronic Proceedings in Theoretical Computer Science, vol. 349, pp. 31–49. Open Publishing Association (2021)
5. Capra, L.: Canonization of reconfigurable PT nets in Maude. In: Lin, A.W., Zetzsche, G., Potapov, I. (eds.) Reachability Problems, pp. 160–177. Springer, Cham (2022). https://doi.org/10.1007/978-3-031-19135-0_11
6. Capra, L.: Rewriting logic and Petri nets: a natural model for reconfigurable distributed systems. In: Bapi, R., Kulkarni, S., Mohalik, S., Peri, S. (eds.) ICDCIT 2022. LNCS, vol. 13145, pp. 140–156. Springer, Cham (2022). https://doi.org/10.1007/978-3-030-94876-4_9
7. Capra, L., Cazzola, W.: Self-evolving Petri nets. JUCS J. Univ. Comput. Sci. 13(13), 2002–2034 (2007)
8. Clavel, M., et al.: All About Maude - A High-Performance Logical Framework: How to Specify, Program, and Verify Systems in Rewriting Logic. LNCS, Springer, Heidelberg (2007). https://doi.org/10.1007/978-3-540-71999-1
9. Hachicha, M., Halima, R.B., Kacem, A.H.: Formal verification approaches of self-adaptive systems: a survey. Procedia Comput. Sci. 159, 1853–1862 (2019). Knowledge-Based and Intelligent Information & Engineering Systems: Proceedings of the 23rd International Conference KES2019
10. De La Iglesia, D.G., Weyns, D.: MAPE-K formal templates to rigorously design behaviors for self-adaptive systems. ACM Trans. Auton. Adapt. Syst. 10(3), 1–31 (2015)
11. Knuth, D.E.: The Art of Computer Programming, Volume 4, Fascicle 3: Generating All Combinations and Partitions. Addison-Wesley Professional, Boston (2005)
12. Köhler, M., Rölke, H.: Properties of object Petri nets. In: Cortadella, J., Reisig, W. (eds.) ICATPN 2004. LNCS, vol. 3099, pp. 278–297. Springer, Heidelberg (2004). https://doi.org/10.1007/978-3-540-27793-4_16
13. Köhler-Bußmeier, M.: Hornets: nets within nets combined with net algebra. In: Franceschinis, G., Wolf, K. (eds.) PETRI NETS 2009. LNCS, vol. 5606, pp. 243–262. Springer, Heidelberg (2009). https://doi.org/10.1007/978-3-642-02424-5_15
14. Köhler-Bußmeier, M.: On the complexity of the reachability problem for safe, elementary Hornets. Fundamenta Informaticae 129, 101–116 (2014). Dedicated to the Memory of Professor Manfred Kudlek
15. Köhler-Bußmeier, M.: A survey on decidability results for elementary object systems. Fund. Inform. 130(1), 99–123 (2014)
16. Köhler-Bußmeier, M., Heitmann, F.: On the expressiveness of communication channels for object nets. Fund. Inform. 93(1–3), 205–219 (2009)
17. Köhler-Bußmeier, M., Heitmann, F.: Safeness for object nets. Fund. Inform. 101(1–2), 29–43 (2010)
18. Köhler-Bußmeier, M., Heitmann, F.: Liveness of safe object nets. Fund. Inform. 112(1), 73–87 (2011)
19. Köhler-Bußmeier, M., Heitmann, F.: Complexity results for elementary Hornets. In: Colom, J.-M., Desel, J. (eds.) PETRI NETS 2013. LNCS, vol. 7927, pp. 150–169. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-38697-8_9
20. Köhler-Bußmeier, M., Heitmann, F.: An upper bound for the reachability problem of safe, elementary Hornets. Fund. Inform. 143, 89–100 (2016)
21. Kummer, O., et al.: An extensible editor and simulation engine for Petri nets: Renew. In: Cortadella, J., Reisig, W. (eds.) ICATPN 2004. LNCS, vol. 3099, pp. 484–493. Springer, Heidelberg (2004). https://doi.org/10.1007/978-3-540-27793-4_29
22. Padberg, J., Schulz, A.: Model checking reconfigurable Petri nets with Maude. In: Echahed, R., Minas, M. (eds.) ICGT 2016. LNCS, vol. 9761, pp. 54–70. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-40530-8_4
23. Reisig, W.: Petri nets and algebraic specifications. Theoret. Comput. Sci. 80, 1–34 (1991)
24. Stehr, M.-O., Meseguer, J., Ölveczky, P.C.: Rewriting logic as a unifying framework for Petri nets. In: Ehrig, H., Padberg, J., Juhás, G., Rozenberg, G. (eds.) Unifying Petri Nets. LNCS, vol. 2128, pp. 250–303. Springer, Heidelberg (2001). https://doi.org/10.1007/3-540-45541-8_9
25. Valk, R.: Object Petri nets: using the nets-within-nets paradigm. In: Desel, J., Reisig, W., Rozenberg, G. (eds.) ACPN 2003. LNCS, vol. 3098, pp. 819–848. Springer, Heidelberg (2004). https://doi.org/10.1007/978-3-540-27755-2_23
26. Weyns, D., Iftikhar, M.U., De La Iglesia, D.G., Ahmad, T.: A survey of formal methods in self-adaptive systems. In: Proceedings of the Fifth International C* Conference on Computer Science and Software Engineering, C3S2E 2012, pp. 67–79. Association for Computing Machinery, New York (2012)
27. Weyns, D., Iftikhar, U.M.: ActivFORMS: a formally-founded model-based approach to engineer self-adaptive systems. ACM Trans. Softw. Eng. Methodol. 32(1), 1–48 (2022)
Usage of High-Performance System in Impulsive Modelling of Hepatitis B Virus

Ekaterina Gospodinova(B) and Ivan Torlakov

Technical University of Sofia, Sofia, Bulgaria
ekaterina [email protected]

Abstract. The focus of the current paper is the software analysis of an OpenMPI parallel impulsive delayed reaction-diffusion model for the study of the hepatitis B virus. Integral manifolds have been added as a notion to the model under consideration. The existence of a pulse model of the dynamics of viral infections serves as the foundation for this idea. In conjunction with a Poincaré-type inequality, an extension of the Lyapunov technique is used to prove the qualitative criteria for the existence, constancy, and boundedness of integral manifolds. An artificial neural network is used to test the proposed impulsive control paradigm. The approach can be expanded to include qualitative examinations of a wide variety of epidemiological issues and is appropriate in a variety of contexts.

Keywords: Reaction-Diffusion Equations · Integral Manifolds · Stability · Neural Network · Parallelism · OpenMPI

1 Introduction
One of the most important issues in mathematical biology is the optimal control of epidemic models, which has a direct bearing on the applicability of the suggested impulse control model. Epidemiologists can use impulsive effects as a very powerful treatment control strategy. Additionally, since the integral manifold approach can be used in various circumstances, many epidemiological issues can be studied using qualitative research methods. The effects of temporal delays on the dynamical properties of the reaction-diffusion process and on the dynamical models of the virus have drawn a lot of interest [17–21]. To account for the time between the virus's entry into the target cell and the generation of virus particles, various models have been devised. Several fields, including machine learning and intelligent networks, are incorporating neural networks. Dynamic features such as stability, synchronization, and periodicity are crucial for these applications [9,12,14,24,27]. The efficiency of impulse control, which helps to explain the actual practice of epidemic prevention in reaction-diffusion ecosystems, is demonstrated in [20]. The structure of stable periodic solutions for a SIR epidemic dynamics model with impulsive vaccination control is presented in [29].

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023. K. Arai (Ed.): SAI 2023, LNNS 739, pp. 373–385, 2023. https://doi.org/10.1007/978-3-031-37963-5_26
E. Gospodinova and I. Torlakov
An approach for studying the global dynamics and bifurcation of state-dependent SIR models is provided in [26]. Because of this, techniques for impulse control are highly effective, and mathematical models can offer a useful way to rationally design therapy based on the traits of specific agents. Our research's main objective is to close this gap because, as far as we are aware, model (1) suggested in [6] has not been examined using an impulsive control technique. The majority of the stability results for impulsive epidemic models have been reported in [7,15,17,21,23]. In fact, the stability of the states is a qualitative characteristic that is quite significant for such systems. Existing studies of impulsive models in biology and epidemiology have focused on the qualitative behavior of just one state of interest: the unique stationary state (unique equilibrium point), a periodic state, or a semi-trivial periodic state. Reaction-diffusion dynamics and related equations are used to study ecology, neurology, biology, and population dynamics [12,18,25]. Several reaction-diffusion models are also used in epidemiology and virology [7,19,20,23]. Due to their accuracy in capturing both temporal and spatial evolution, these epidemic models are still being studied [12,25,27]. Moreover, a thorough analysis of the effects of time delays on the properties of virus dynamic models based on reaction-diffusion theory has been carried out in [21,22]. It is well known that models with delays have been introduced to account for the time elapsed between the entry of the virus into the cell and the production of fresh viral particles. For instance, to examine the effects of the intracellular delay and of spatial virus diffusion on the overall dynamics of a viral model, the work [6] uses the delayed diffusion system given below.
In this study, we examine the existence of an integral manifold for an impulsive model of the dynamics of the hepatitis B virus with a Beddington-DeAngelis functional response of the infection rate. Impulses are a type of control applied at specific moments. The Lyapunov-Razumikhin approach and the comparison principle are used to derive the main findings. A mathematical model to simulate the hepatitis B virus (HBV) in actual three-dimensional space was put forth by Hattaf and Yousfi in [6]. Using the notation from that paper, we consider a reaction-diffusion model with delays of the following kind:

$$
\begin{cases}
\dot{H}(z,t) = -dH(z,t) - \dfrac{\beta W(z,t)H(z,t)}{1+\alpha_1 H(z,t)+\alpha_2 W(z,t)} + \lambda,\\[2mm]
\dot{I}(z,t) = -aI(z,t) + e^{-a\tau}\,\dfrac{\beta W(z,t-\tau)H(z,t-\tau)}{1+\alpha_1 H(z,t-\tau)+\alpha_2 W(z,t-\tau)},\\[2mm]
\dot{W}(z,t) = d_W \Delta W - mW(z,t) + kI(z,t),
\end{cases}
\tag{1}
$$

where $H(z,t)$, $I(z,t)$ and $W(z,t)$ denote the densities of susceptible (receptive) cells, infected cells and free virus at position $z$ and time $t$; $\lambda$ is the production rate of susceptible cells, $dH$ is their death rate, and $\beta$ is the virus infection rate of susceptible cells; $aI$ is the death rate of infected cells; $kI$ and $mW$ are, respectively, the production rate of free virus by infected cells and its decay rate; $\Delta W$ is the Laplace operator applied to $W$, $d_W$ is the diffusion coefficient, and the ratio
HPC in IM of Hepatitis B
$$
\frac{\beta W(z,t)H(z,t)}{1+\alpha_1 H(z,t)+\alpha_2 W(z,t)}
\tag{2}
$$
where $\alpha_1$ and $\alpha_2$ are constants, represents the Beddington-DeAngelis functional response of the infection rate; $e^{-a\tau}$ denotes the survival rate of infected cells before they become productively infected, and $\tau$ is a positive constant that represents the delay. The Beddington-DeAngelis functional response and the extended model (1) of the delayed viral dynamics are described in [7]. Without taking into account the effects of diffusion, the authors assume that when the target cells are infected at the appropriate time, new virus-producing cells are created. When examining the propagation of epidemics, spatial effects must be considered. When considering cell proliferation, the delay comes from the cells [6]. In fact, it is biologically reasonable to include time lags in epidemiological models, since infectious processes are not instantaneous. Model (1), for instance, reduces to the time-delayed diffusive hepatitis virus model with a saturation response examined in [28] when $\alpha_1 = 0$, where the time delay denotes the intracellular incubation period, i.e., the time interval between the infection of a cell and the moment at which it starts to infect neighboring cells. The article also provides a very helpful overview of some delayed models used in epidemiology, along with the origins of their delay components. Note that the dynamical behavior of these models has strong similarities to entropy phenomena seen in biological and viral systems [1,21]. In an effort to provide effective drug therapies, many authors have taken into account impulsive vaccination and active impulsive drug treatment processes [17,27,30]. Impulsive differential equations provide a compelling description of these treatments, which can be viewed as impulsive. In fact, mathematical models based on them have sparked a lot of interest in methods for treating a variety of illnesses. For several significant qualitative results on impulsive differential systems, we cite [19,21,22].
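The special cases of the functional response (2) mentioned above are easy to check numerically. The following is a minimal sketch; the function name `infection_rate` and the sample values are illustrative assumptions, not taken from the paper.

```python
def infection_rate(H, W, beta, alpha1, alpha2):
    """Beddington-DeAngelis infection term beta*W*H / (1 + alpha1*H + alpha2*W)."""
    return beta * W * H / (1.0 + alpha1 * H + alpha2 * W)

# Illustrative values only (hypothetical, not fitted to any data).
H, W, beta = 2.0, 3.0, 0.5

mass_action = infection_rate(H, W, beta, 0.0, 0.0)   # alpha1 = alpha2 = 0: beta*W*H
saturation = infection_rate(H, W, beta, 0.0, 1.0)    # alpha1 = 0: beta*W*H / (1 + alpha2*W)
full_bd = infection_rate(H, W, beta, 1.0, 1.0)       # both saturation constants active
```

With both constants set to zero the term is the bilinear mass-action rate, and with only $\alpha_1 = 0$ it is the saturation response in the free virus; for non-negative $H$ and $W$, increasing either constant can only decrease the rate.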
However, the particular impulsive control approach also plays a significant role in the many instances where it is suitable. The main benefits of such a strategy are that, because it is only applied at discrete instants, it can significantly reduce the cost of control and the amount of information communicated. The authors of [30] applied impulsive controls to the SQEIAR model in order to reduce the numbers of infected, susceptible, asymptomatic and symptomatic individuals, and subsequently to eradicate the infection. The COVID-19 pandemic is an example that has been given a lot of attention. An impulsive control technique was presented in [13] to assess a drug's impact on the treatment of hepatitis C. The effect of impulsive control on the actual practice of preventing epidemics in rapidly diffusing ecosystems is demonstrated in [16]. The dynamics and bifurcation of state-dependent impulsive SIR models are investigated, and an impulsive vaccination technique is presented, in [14]. The structure of stable periodic solutions under impulsive vaccination control is described in [2]. Because of this, impulsive control techniques are particularly effective, and impulsive mathematical models can offer a useful way to rationally design therapy based on the traits of specific agents [16].
In the present study, the generalized model under consideration is subjected to the impulsive control method. Investigating how impulses affect the qualitative behavior of the states, and determining how to adjust the stability behavior of the system by employing the proper impulsive perturbations, are two of the primary research concerns. The notion of integral manifolds is also introduced for the extended impulsive model. Existence and stability concepts for integral manifolds are more general than the equivalent concepts used to describe single states. The method of integral manifolds is a commonly accepted tool for the qualitative analysis of many systems, including systems in nonlinear mechanics, partial differential equations, impulsive differential systems defined on a torus, systems with nonlinear oscillations, uncertain systems with variable impulsive disturbances, and fractional models [8]. The relationship between this method and models in science, engineering, and biology is therefore essential in terms of both theoretical advancement and practical applications. The primary motivation and major contribution of the current study is that integral manifolds have not yet been applied to viral spread models in epidemiology [2,16,17,20]. Although the system's behavior may exhibit unpredictable patterns, the impulsive control method is considered for a class of biological delayed reaction-diffusion models and enables synchronization of a complex system using only modest control pulses. Moreover, conditions for the asymptotic stability of an integral manifold related to the investigated impulsive model are established. The analysis is enhanced by the use of a Poincaré-type integral inequality, which enables a more accurate estimation of the reaction-diffusion terms.
Because delayed reaction-diffusion models are still often employed in science, biology, engineering, neurocomputing, and other fields, qualitative analysis of their behavior can be useful in a variety of contexts. Investigating how impulses affect the qualitative behavior of the states, and demonstrating how neural networks and parallelism can be used to regulate the stability behavior of the system by employing the right impulsive perturbations, are two of the key objectives of the project.
2 Initial Data Model
In this part we extend and improve several recent findings in the literature [6,17,18] and suggest an impulsive generalization of model (1). To that end, the following notation is used: for $z = (z_1, z_2, z_3)^T \in \mathbb{R}^3$ the norm is $\|z\| = |z_1| + |z_2| + |z_3|$, and $\mathbb{R}_+ = [0, \infty)$. We also consider an open, bounded set $\Omega$ in $\mathbb{R}^3$ with smooth boundary $\partial\Omega$ and measure $\operatorname{mes}\Omega > 0$. Let $0 = (0, 0, 0)^T \in \Omega$. The impulsive delayed reaction-diffusion epidemic model is introduced in this study in the following manner:
$$
\begin{cases}
\dot{H}(z,t) = -dH(z,t) - \dfrac{\beta W(z,t)H(z,t)}{1+\alpha_1 H(z,t)+\alpha_2 W(z,t)} + \lambda,\\[2mm]
\dot{I}(z,t) = -aI(z,t) + e^{-a\tau}\,\dfrac{\beta W(z,t-\tau)H(z,t-\tau)}{1+\alpha_1 H(z,t-\tau)+\alpha_2 W(z,t-\tau)},\\[2mm]
\dot{W}(z,t) = d_W \Delta W - mW(z,t) + kI(z,t), \quad t \neq t_i,\\[1mm]
H(z,t_i^+) = H(z,t_i) + P_{1i}(H(z,t_i)),\\
I(z,t_i^+) = I(z,t_i) + P_{2i}(I(z,t_i)),\\
W(z,t_i^+) = W(z,t_i) + P_{3i}(W(z,t_i)), \quad i = 1, 2, \dots,
\end{cases}
\tag{3}
$$
where $z \in \Omega$, $t \ge t_0$, $t_0 \in \mathbb{R}_+$, and:
1. The impulsive instants $t_i$ are such that $0 < t_1 < t_2 < \dots < t_i < t_{i+1} < \dots$ and $\lim_{i\to\infty} t_i = \infty$;
2. All parameters in the first three equations have the same meaning as in (1) for $t \neq t_i$, $i = 1, 2, \dots$;
3. The functions $P_{ji}$, $j = 1, 2, 3$, $i = 1, 2, \dots$, are real and determine the controlled outputs $H(\cdot, t_i^+)$, $I(\cdot, t_i^+)$, $W(\cdot, t_i^+)$ at times $t_i$, $i = 1, 2, \dots$.
Model (3) expands and generalizes a few of the earlier models studied in [6,19,21] to account for impulsive behaviour. The impulsive control strategy can be applied to models of type (1) by including the impulsive control equations. In fact, effective impulsive controllers can be designed using the impulsive functions $P_{ji}$, $j = 1, 2, 3$, $i = 1, 2, \dots$. Recall that the function

$$
f(H, W) = \frac{WH}{1+\alpha_1 H+\alpha_2 W}
\tag{4}
$$
is Lipschitz [6]; let $L > 0$ denote the Lipschitz constant of this function. The boundary and initial values for model (3) are set as

$$
H(z,t) = 0,\quad I(z,t) = 0,\quad W(z,t) = 0,\qquad t \in \mathbb{R},\ z \in \partial\Omega,
\tag{5}
$$

$$
\begin{cases}
H(z,t-t_0) = \varphi_{0H}(z,t-t_0) \ge 0,\\
I(z,t-t_0) = \varphi_{0I}(z,t-t_0) \ge 0,\\
W(z,t-t_0) = \varphi_{0W}(z,t-t_0) \ge 0,
\end{cases}
\qquad z \in \Omega,\ t \in [t_0-\tau, t_0],
\tag{6}
$$
where φ0H(z, ξ), φ0I(z, ξ), φ0W(z, ξ) are nonnegative real-valued functions defined on Ω × [−τ, 0], bounded and piecewise continuous with respect to ξ, with at most a finite number of discontinuity points of the first kind ξ ∈ [−τ, 0] at which φ0H(z, ξ+), φ0H(z, ξ−), φ0I(z, ξ+), φ0I(z, ξ−), φ0W(z, ξ+), φ0W(z, ξ−) exist and are nonnegative, and φ0H(z, ξ−) = φ0H(z, ξ), φ0I(z, ξ−) = φ0I(z, ξ), φ0W(z, ξ−) = φ0W(z, ξ). The class of all functions φ0 = (φ0H(z, ξ), φ0I(z, ξ), φ0W(z, ξ)) with such components will be denoted by PC, and by PCB we will denote the class of all bounded functions φ0 ∈ PC. It is assumed that d, β, a, k, m are positive constants and that the motion of the virus follows Fickian diffusion [21,22].
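To make the interplay of the delay and the impulses in model (3) concrete, the following is a minimal sketch of a simulation under strong simplifying assumptions: the spatial variable and the diffusion term are dropped (a well-mixed setting), time stepping is explicit Euler, the initial function is constant on [−τ, 0], and the impulses are taken to be linear, P_ji(x) = −γ_j x. All parameter values and the name `simulate` are hypothetical and chosen only for illustration; this is not the authors' numerical scheme.

```python
import math

def simulate(T=10.0, dt=0.01, tau=1.0, impulse_times=(2.0, 4.0, 6.0, 8.0),
             gamma=(0.1, 0.5, 0.5), lam=1.0, d=0.1, beta=0.5, a=0.5,
             k=1.0, m=1.0, alpha1=0.1, alpha2=0.1, init=(5.0, 1.0, 1.0)):
    """Euler simulation of the diffusion-free version of model (3).

    Returns the list of states (H, I, W) at times 0, dt, 2*dt, ..., T.
    """
    n_delay = int(round(tau / dt))
    n_steps = int(round(T / dt))
    impulse_steps = {int(round(t / dt)) for t in impulse_times}
    hist = [tuple(init)] * (n_delay + 1)        # constant history on [-tau, 0]
    for step in range(1, n_steps + 1):
        H, I, W = hist[-1]                      # current state
        Hd, _, Wd = hist[-1 - n_delay]          # state tau time units earlier
        inf_now = beta * W * H / (1.0 + alpha1 * H + alpha2 * W)
        inf_del = beta * Wd * Hd / (1.0 + alpha1 * Hd + alpha2 * Wd)
        new = [
            H + dt * (lam - d * H - inf_now),
            I + dt * (-a * I + math.exp(-a * tau) * inf_del),
            W + dt * (-m * W + k * I),          # diffusion term omitted (assumption)
        ]
        if step in impulse_steps:               # jump x(t_i^+) = (1 - gamma_j) * x(t_i)
            new = [(1.0 - g) * x for g, x in zip(gamma, new)]
        hist.append(tuple(new))
    return hist[n_delay:]

controlled = simulate()
uncontrolled = simulate(impulse_times=())
```

Comparing `controlled` with `uncontrolled` shows the instantaneous drops produced by the impulses at the chosen instants; with gains between 0 and 1 the jumps keep all three components non-negative.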
To simplify the notation, we denote by $U(z,t) = U(z,t;\varphi_0)$, $U = (H, I, W)$, any solution of (3) with initial function $\varphi_0 = (\varphi_{0H}(z,\xi), \varphi_{0I}(z,\xi), \varphi_{0W}(z,\xi)) = (H_0(z,\xi), I_0(z,\xi), W_0(z,\xi)) \in PC$. The norm of $U = (H, I, W)$ for $t \ge t_0$ is denoted by $\|U(\cdot,t)\|_2$, and the norm of a function $\varphi \in PC$ is denoted by $\|\cdot\|_\tau$ and defined by $\|\varphi\|_\tau = \sup_{-\tau \le \xi \le 0} \|\varphi(z,\xi)\|_2$. It is well known [16,27,30] that the solutions

$$
U(z,t;\varphi_0) = (H(z,t;\varphi_0), I(z,t;\varphi_0), W(z,t;\varphi_0))
\tag{7}
$$

of the impulsive initial-boundary value problem (3), (5), (6)
are piecewise continuous functions with points of discontinuity of the first kind, at which they are continuous from the left. We cite [19,21,24,29] for additional results on the fundamental and qualitative dynamical properties of impulsive reaction-diffusion systems. By taking impulsive conditions into account, we can model instantaneous effects on susceptible cells, infected cells, and free virus. Model (1), which generalizes a number of important models in biology, was examined in [6] with regard to the global asymptotic stability of an infected equilibrium. In this study we expand on these findings and introduce the integral manifolds method to the impulsive model (3) [16–20,27]. Let $\Theta = \{U \in \mathbb{R}^3 : H \ge 0,\ I \ge 0,\ W \ge 0\}$, and assume that $H(z,t_i^+) \ge 0$, $I(z,t_i^+) \ge 0$, $W(z,t_i^+) \ge 0$ for any $z \in \Omega$ and $i = 1, 2, \dots$. In the following we focus only on solutions $U$ that evolve in the set $\Theta$, which is natural from a biological standpoint. An arbitrary manifold $M$ in the space $\Omega \times [t_0 - \tau, \infty) \times \Theta$ is called an integral manifold of model (3) if, for any state $U(z,t) = U(z,t;\varphi_0)$ of (3) such that $(z, \xi, \varphi_0(z,\xi)) \in M$, $(z,\xi) \in \Omega \times [-\tau, 0]$, we have $(z, t, U(z,t)) \in M$ for $z \in \Omega$ and $t \ge t_0$. The same assumptions made in [18] are assumed to hold in this paper. As is widely known, boundedness and permanence are two qualitative properties that are extremely important for biological models [2,28]. These notions will be applied within the integral manifolds approach. In this section, we define the class of Lyapunov-like functions [16] that will be used to investigate the qualitative properties of integral manifolds of model (3). Denote $t_0 = 0$ and consider the sets

$$
\vartheta_i = \{(U,t) : t_{i-1} < t < t_i,\ U = (H, I, W) \in \Theta\},\quad i = 1, 2, \dots,\qquad \vartheta = \bigcup_{i=1}^{\infty} \vartheta_i.
\tag{8}
$$
Keep in mind that the classical energy method also frequently employs the Lyapunov function approach, since a Lyapunov function can be thought of as an energy function for isolated systems. Finally, the Poincaré-type integral inequality in [3,8] is also used in our qualitative analysis.
3 Integral Manifolds' Asymptotic Stability
We now present the main asymptotic stability result for the integral manifold $M$ of model (3). Following the study in [18], and based on Theorem 1 and its proof, we can claim that $M$ is an integral manifold of model (3). In this study, the functions $P_{ji}$ have the following form:

$$
P_{ji}(H(z,t_i)) = -\gamma_{ji} H(z,t_i),
\tag{9}
$$
where $0 < \gamma_{ji} < 2$, $j = 1, 2, 3$, $i = 1, 2, \dots$, and for the model's parameters the same conditions hold as in point 4 of Theorem 1. In addition, for the function $W \in \mathcal{W}_M$ in (7) there exist functions $\sigma : \mathbb{R}_+ \to (0, \infty)$ and $w_3 \in K$ such that

$$
2\mu W(U,t) \le -\sigma(t)\, w_3\big(D(U, M(z,t))\big),\quad t \neq t_i,\ U \in \Theta,\ z \in \Omega,
\tag{10}
$$

$$
\int_0^{\infty} \sigma(s)\, w_3\big(w_2^{-1}(\eta)\big)\, ds = \infty \quad \text{for each sufficiently small value of } \eta > 0.
\tag{11}
$$
The integral manifold M of the impulsive epidemic model (3) is then uniformly globally asymptotically stable. To satisfy requirement 3 of Theorem 1, the impulsive controllers Pji, j = 1, 2, 3, i = 1, 2, . . . are designed. The key idea of this research is to use impulsive control to preserve the stability properties of system (1). Researchers trying to evaluate the stability of the model from experiments face a considerable barrier, because only a small part of the data associated with (H, I, W) can be measured continuously. They can overcome this difficulty with the help of the explicit structure provided by model (3). The established result can be used by epidemiologists to determine the effects of unmeasured variables (H, I, and W), as well as what kinds of perturbations will help to stabilize the system. Our research helps to explain how impulses affect the stability of the states. The described asymptotic stability result can also be used to impulsively synchronize a variety of epidemic models. For the method to be effective, the state of the response system must be changed using synchronizing impulses at specific moments. The result of Theorem 1 guarantees that the impulsive controller can synchronize the controlled system (3) onto system (1) uniformly and globally.
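The role of the bound $0 < \gamma_{ji} < 2$ for linear impulses of the form (9) can be seen from a short calculation. The following is only an illustrative worked step for a single scalar component $x$ (standing for $H$, $I$ or $W$ at a fixed $z$); it is not part of the proof of Theorem 1.

```latex
% Jump condition of (3) with a linear impulse P_{ji}(x) = -\gamma_{ji} x:
x(t_i^+) = x(t_i) + P_{ji}\bigl(x(t_i)\bigr) = (1 - \gamma_{ji})\, x(t_i),
\qquad\text{so}\qquad
|x(t_i^+)| = |1 - \gamma_{ji}|\, |x(t_i)| .
% For x(t_i) \neq 0 the jump is a strict contraction exactly when
|1 - \gamma_{ji}| < 1 \iff 0 < \gamma_{ji} < 2 .
% The sign of the state is preserved only on the smaller range
1 - \gamma_{ji} \ge 0 \iff \gamma_{ji} \le 1 .
```

Hence every impulse with $0 < \gamma_{ji} < 2$ strictly reduces the magnitude of a non-zero component, while the non-negativity assumption $H(z,t_i^+) \ge 0$, $I(z,t_i^+) \ge 0$, $W(z,t_i^+) \ge 0$ made in Section 2 is automatically met for non-negative states when $\gamma_{ji} \le 1$.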
4 OpenMPI
For distributed memory systems, it is possible to create effective parallel programs using message passing libraries. These libraries offer functions for starting and setting up a messaging environment, as well as for sending and receiving data packets. The OpenMPI library was utilized to produce the study’s findings.
Because the number of options to examine is very large, we use a parallel technique to search for a solution to this task in a practical amount of time. The computer hardware employed allows a larger amount of data to be summarized, and hence the results to be displayed in a more natural manner. OpenMPI is an open source implementation of the Message Passing Interface (MPI). The initial version of the standard, MPI-1.0, was published in 1993 [4,5,10,11]. A cluster of computers running OpenMPI uses the CPUs of those computers to perform calculations. The hardware and system design of high-performance computing solutions give them a power and speed advantage over conventional computers. High-performance computing systems use hundreds of processors for parallel computing, and each processor executes multiple computational payloads at once. Because OpenMPI runs on distributed clusters, this technology allows us to achieve high scalability. Since all synchronization is explicit, source code development becomes more difficult. Even so, since the code is executed in parallel via OpenMPI, a considerable speed-up is achieved.
5 Results
The software receives inputs and variables for validation in order to keep system (1) stable and to determine the stability of the neural network based on Theorem 1. A cluster of eight computers, each with four Intel® Xeon® processors, is used to carry out the computation. The aim is to find the regions in which the neural network is stable. OpenMPI 4.1.5 is employed to connect the various parts of the architecture. Systems that use parallel computing are well suited to modeling and simulating real events, and parallel execution reduces the computation time several times over. A validation rule is provided to ensure that condition 3 of Theorem 2 is met: the input data are checked for the existence of the required positive number, and $P_{ji}$ are validated as functions. For the function h(t, z) to exist, Theorem 1 must hold. Using the model in system (1), more than 1.5 billion unique scenarios have been taken into account. Because of the amount of data, only a small number of results are shown in this paper. The results are summarized for μ greater than zero. Some findings are not shown because they do not meet the stability conditions of system (1). The information is organized into four major areas: cij, dij, pij, and qij. When one of the matrices cij, dij, pij, or qij is varied, the remaining ones are held constant at the following values:
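\[
\begin{aligned}
(c_{ij}) &= \begin{pmatrix} c_{11} & c_{12} \\ c_{21} & c_{22} \end{pmatrix} = \begin{pmatrix} 1.5 & 1 \\ 1.1 & -0.1 \end{pmatrix}, &
(d_{ij}) &= \begin{pmatrix} d_{11} & d_{12} \\ d_{21} & d_{22} \end{pmatrix} = \begin{pmatrix} 0.8 & 0.9 \\ 0.1 & 0.7 \end{pmatrix}, \\
(p_{ij}) &= \begin{pmatrix} p_{11} & p_{12} \\ p_{21} & p_{22} \end{pmatrix} = \begin{pmatrix} 1.2 & -0.1 \\ 1.4 & 1.3 \end{pmatrix}, &
(q_{ij}) &= \begin{pmatrix} q_{11} & q_{12} \\ q_{21} & q_{22} \end{pmatrix} = \begin{pmatrix} 0.7 & 0.3 \\ 0.6 & 0.7 \end{pmatrix}.
\end{aligned}
\tag{12}
\]

Data processing is done in two ways. In the first, a double loop is used: whereas the other three arrays have fixed values, as indicated above, the array under study starts with the smallest input parameter and increases in steps until the maximum value is reached. In the second, the values of each array are determined using a nested loop, going from the smallest to the largest value for each column and row. The results obtained for the cij parameter are shown in Fig. 1 and Fig. 2.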
Fig. 1. µ Results in c1j .
Fig. 2. µ Results in c2j .
As the example data show, the largest value is observed when any value of the array cij tends to zero. When the array's values are all zeros, as in c = [[0, 0], [0, 0]], the result is μ = 0.78. According to Theorem 1, a stable minimum value of μ = 0.1 is reached for array values such as c = [[-2, -0.05], [-1.4, -0.5]], c = [[-1.1, -0.95], [-1.85, 0.1]], c = [[-0.8, -0.2], [-1.4, 0.65]], and a variety of others. The results for the dij parameter are shown in Fig. 3 and Fig. 4.
Fig. 3. µ Results in d1j .
Fig. 4. µ Results in d2j .
The same behaviour is observed for dij as for cij: the highest value is observed when dij tends to zero. The array d = [[0.0, 0.0], [0.0, 0.0]] gives the maximum value μ = 0.528. The minimum value μ = 0.15 is obtained for d = [[-1.1, -0.05], [-0.95, 0.1]], d = [[-0.2, -0.95], [-0.65, 0.4]], d = [[0.55, -0.5], [-0.05, -1.1]], and many others. The results for the pij parameter are presented in Fig. 5 and Fig. 6.
Fig. 5. µ Results in p1j .
Fig. 6. µ Results in p2j .
The data show that the largest value tends to occur when the entries of p1j are zero. With p1j = [0.0, 0.0] and any value of p2j, such as [-0.5, -0.5] or [1.3, 0.1], the maximum value μ = 0.384 is obtained. The array values pij = [[-0.05, -0.05], [0.55, -2]], pij = [[-0.05, 0.1], [-1.7, 0.85]], pij = [[0.1, 0.25], [1.15, -1.4]], and many others lead to the minimum stable value μ = 0.1. The results for the qij parameter are shown in Fig. 7 and Fig. 8.
Fig. 7. µ Results in q1j .
Fig. 8. µ Results in q2j .
The qij array exhibits similar behaviour to the cij and dij arrays: the highest value is obtained when the array values tend to zero. With qij = [[0.0, 0.0], [0.0, 0.0]], the maximum value for the current iteration of the data is μ = 0.348. The minimum stable value μ = 0.1 is obtained for qij = [[-0.8, -0.05], [-0.8, -0.05]], qij = [[-0.05, 0.1], [-0.65, -0.2]], qij = [[0.55, 0.4], [-0.05, 0.7]], and many others.
6 Conclusions
This study examines the problem of integral manifolds in the impulsive control model of delayed reaction-diffusion systems used in epidemiology. New criteria for the existence, boundedness, constancy, and uniform global asymptotic stability of the integral manifolds related to the proposed model are established. The results are obtained using a generalization of the Lyapunov approach and a Poincaré-type inequality. By considering integral manifolds rather than a single stable state, the proposed criteria extend various existing results on the reaction-diffusion dynamics of epidemics and viruses to the impulsive case. The results make it possible to consider an impulsive control therapy technique. The mechanism of integral manifolds is versatile and covers a large number of equilibrium states. The conclusions of the study can be extended. We consider impulsive neural networks with time-varying delays. Our study is primarily focused on the practical examination of neural networks. A parallel technique accelerates the process. The operational components of this study will be examined thoroughly using the OpenMPI libraries; this will be discussed in a subsequent paper.

Acknowledgments. The authors would like to thank the Research and Development Sector at the Technical University of Sofia for the financial support.
References

1. Baez, J., Pollard, B.: Relative entropy in biological systems. Entropy 18(2), 46 (2016)
2. Chatterjee, A.N., Basir, F.A., Takeuchi, Y.: Effect of DAA therapy in hepatitis C treatment–an impulsive control approach. Math. Biosci. Eng. 18, 1450–1464 (2021)
3. Cheung, W.-S.: Some new Poincaré-type inequalities. Bull. Aust. Math. Soc. 63(2), 321–327 (2001)
4. Cohen, M.A., Grossberg, S.: Absolute stability of global pattern formation and parallel memory storage by competitive neural networks. IEEE Trans. Syst. Man Cybern. SMC-13(5), 815–826 (1983)
5. Gan, Q.: Adaptive synchronization of Cohen-Grossberg neural networks with unknown parameters and mixed time-varying delays. Commun. Nonlinear Sci. Numer. Simul. 17(7), 3040–3049 (2012)
6. Hattaf, K., Yousfi, N.: Global stability for reaction-diffusion equations in biology. Comput. Math. Appl. 66(8), 1488–1497 (2013)
7. Huang, G., Ma, W., Takeuchi, Y.: Global analysis for delay virus dynamics model with Beddington-DeAngelis functional response. Appl. Math. Lett. 24(7), 1199–1203 (2011)
8. Lai, X., Yao, T.: Exponential stability of impulsive delayed reaction-diffusion cellular neural networks via Poincaré integral inequality. Abstr. Appl. Anal. 2013, 1–10 (2013)
9. Lefèvre, J., Mangin, J.-F.: A reaction-diffusion model of human brain development. PLoS Comput. Biol. 6(4), 1–10 (2010)
10. Li, Y., Linghong, L.: Global exponential stability and existence of periodic solution of Hopfield-type neural networks with impulses. Phys. Lett. A 333(1), 62–71 (2004)
11. Lisena, B.: Dynamical behavior of impulsive and periodic Cohen-Grossberg neural networks. Nonlinear Anal. Theory Methods Appl. 74(13), 4511–4519 (2011)
12. Connell McCluskey, C.: Global stability for an SIR epidemic model with delay and nonlinear incidence. Nonlinear Anal. Real World Appl. 11(4), 3106–3109 (2010)
13. Nowak, M.A., Bonhoeffer, S., Hill, A.M., Boehme, R., Thomas, H.C., McDade, H.: Viral dynamics in hepatitis B virus infection. Proc. Natl. Acad. Sci. 93(9), 4398–4402 (1996)
14. Okubo, A., Levin, S.A.: Diffusion and Ecological Problems: Modern Perspectives. Springer, New York (2001). https://doi.org/10.1007/978-1-4757-4978-6
15. Peng, R., Liu, S.: Global stability of the steady states of an SIS epidemic reaction-diffusion model. Nonlinear Anal. Theory Methods Appl. 71(1), 239–247 (2009)
16. Rao, R.: Impulsive control and global stabilization of reaction-diffusion epidemic model. Math. Methods Appl. Sci. (2021)
17. Stamov, G., Stamova, I., Spirova, C.: Reaction-diffusion impulsive fractional-order bidirectional neural networks with distributed delays: Mittag-Leffler stability along manifolds. AIP Conf. Proc. 2172(1), 050002 (2019)
18. Stamov, G., Stamova, I., Spirova, C.: Impulsive reaction-diffusion delayed models in biology: Integral manifolds approach. Entropy 23(12) (2021)
19. Stamova, I., Stamov, G.: Mittag-Leffler synchronization of fractional neural networks with time-varying delays and reaction-diffusion terms using impulsive and linear controllers. Neural Netw. 96, 22–32 (2017)
20. Stamova, I., Stamov, G.: Lyapunov approach for almost periodicity in impulsive gene regulatory networks of fractional order with time-varying delays. Fractal Fract. 5(4) (2021)
21. Stamova, I.M., Stamov, G.T.: Lyapunov-Razumikhin method for impulsive functional differential equations and applications to the population dynamics. J. Comput. Appl. Math. 130(1), 163–171 (2001)
22. Tong, Y., Lei, C.: An SIS epidemic reaction-diffusion model with spontaneous infection in a spatially heterogeneous environment. Nonlinear Anal. Real World Appl. 41, 443–460 (2018)
23. Wang, J., Xie, F., Kuniya, T.: Analysis of a reaction-diffusion cholera epidemic model in a spatially heterogeneous environment. Commun. Nonlinear Sci. Numer. Simul. 80, 104951 (2020)
24. Wang, K., Wang, W.: Propagation of HBV with spatial dependence. Math. Biosci. 210(1), 78–95 (2007)
25. Wang, K., Wang, W., Song, S.: Dynamics of an HBV model with diffusion and delay. J. Theor. Biol. 253(1), 36–44 (2008)
26. Xiang, H., Liu, B.: Solving the inverse problem of an SIS epidemic reaction-diffusion model by optimal control methods. Comput. Math. Appl. 70(5), 805–819 (2015)
27. Rui, X., Ma, Z.: An HBV model with diffusion and time delay. J. Theor. Biol. 257(3), 499–509 (2009)
28. Xu, Z., Zhao, Y.: A reaction-diffusion model of dengue transmission (2014)
29. Yang, J., Liang, S., Zhang, Y.: Travelling waves of a delayed SIR epidemic model with nonlinear incidence rate and spatial diffusion. PLoS ONE 6(6), 1–14 (2011)
30. Zhang, L., Wang, Z.-C., Zhao, X.-Q.: Threshold dynamics of a time periodic reaction-diffusion epidemic model with latent period. J. Differ. Eq. 258(9), 3011–3036 (2015)
Performance Comparison of Operations in the File System and in Embedded Key-Value Databases

Jesse Hines, Nicholas Cunningham, and Germán H. Alférez

School of Computing, Southern Adventist University, PO Box 370, Collegedale, TN 37315-0370, USA
{jessehines,nicholascunningham,harveya}@southern.edu
Abstract. A common scenario when developing local PC applications such as games, mobile apps, or presentation software is storing many small files or records as application data and needing to retrieve and manipulate those records with some unique ID. In this kind of scenario, a developer has the choice of simply saving the records as files with their unique ID as the filename or using an embedded on-disk key-value database. Many file systems have performance issues when handling large numbers of small files, but developers may want to avoid a dependency on an embedded database if it offers little benefit or has a detrimental effect on performance for their use case. Despite the need for benchmarks to enable informed answers to this design decision, little research has been done in this area. Our contribution is the comparison and analysis of the performance of the insert, update, get, and remove operations and the space efficiency of storing records as files vs. using key-value embedded databases including SQLite3, LevelDB, RocksDB, and Berkeley DB.

Keywords: Databases · File Systems · Database Performances

1 Introduction
A common scenario when developing local desktop applications is the need to persist many small files or records as application data and to retrieve and manipulate those records with some unique ID, essentially forming a key-value store. For example, a game developer may need to store records for each game entity or game level, or note-taking software may need to store a large number of small text records. In this kind of scenario, a developer has two main choices: leveraging the file system for storage or using an embedded key-value database.

If the developer chooses to use the file system to store the records, they can simply save each record to disk with its unique ID as the filename. This has the advantage of being simple to implement and adding no extra dependencies. However, file systems can have space and performance issues when handling large numbers of small files [3,9].

As an alternative to the file system, the developer can choose to use an embedded database. Nonetheless, adding a dependency on an embedded database adds a fair amount of technical overhead to a project and increases overall system complexity. If the embedded database is statically linked with an application, it will typically increase compilation time and executable size [4]. If it is linked dynamically, it can complicate installation of an application, as external dependencies will need to be installed. Therefore, a developer will likely want to avoid adding a dependency on an embedded database if their use case is not significantly benefited by it.

Despite this being a common scenario, little research has been done comparing simple file system options to embedded key-value databases. Our contribution is the analysis and comparison of the performance of four popular open source embedded databases (SQLite3, LevelDB, RocksDB, and Berkeley DB) with storing records on the file system. This research will enable developers to make informed decisions on what tools are best for their scenario.

This paper is organized as follows. Section 2 presents the existing research work on file system and embedded database performance. Section 3 presents the background knowledge behind our approach. Section 4 presents the methodology of the research and Sect. 5 presents the evaluation of the results. Section 6 presents our conclusions and directions for future research.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023. K. Arai (Ed.): SAI 2023, LNNS 739, pp. 386–400, 2023. https://doi.org/10.1007/978-3-031-37963-5_27
2 State of the Art
There is a body of existing research comparing various databases [5,8]. Additionally, there is existing research comparing the performance of storing blob data in SQL servers vs. the file system [10,11]. However, most of the existing research is not recent, and is mostly focused on database servers. Minimal research has been done on this question in the realm of local desktop applications and comparing the file system to embedded key-value databases such as LevelDB, RocksDB, and Berkeley DB.

One of the few research works comparing key-value databases to the file system is the work done by Patchigolla et al. [6] in 2011. This research performed detailed benchmarks of database IO performance on mobile devices vs. file IO. The research was specifically focused on Android devices, and the benchmarks compared the SQLite and Perst embedded databases. This research concluded that if the number of records is less than 1,000, then using files is a good decision; otherwise an embedded DB is a better option. We believe extending this research into the realm of local desktop applications and taking embedded key-value databases into account will be valuable to future developers.

A related question to the comparison of embedded databases and the file system, but within the scope of SQL database servers, is whether to store large pieces of data such as images in the database directly or on the file system. On SQL servers, the two typical methods for this are to either store the data directly in an SQL BLOB column or to only store the file path in the database and save the data as a file. A widely cited paper on this question is the work done by Sears et al. [10], which analyzed the performance of storing files in SQL Server as BLOBs vs. files. They concluded that data smaller than 256 KB is more efficiently handled by database BLOBs, while the file system is better for data larger than 1 MB. Stancu-Mara and Baumann [11] did detailed benchmarking of BLOB handling in various SQL database servers, including PostgreSQL and MySQL. They benchmarked read and write for BLOBs of various sizes, concluding that MySQL was the fastest at handling BLOBs. However, both of these benchmarks were done in the late 2000s, and their conclusions may no longer be applicable to modern systems.

There are many benchmarks comparing NoSQL databases. For instance, Gupta et al. compare the performance and functionality of four NoSQL databases with widely different focuses: MongoDB, Cassandra, Redis, and Neo4j [5]. In other research work, Puangsaijai et al. [8] compare the key-value database Redis with the relational database MariaDB for insert, update, remove, and select operations.

Related to measuring the performance gains of using key-value databases is the research on using key-value databases to improve performance. In this context, Tulkinbekov et al. [13] address the problem of write amplification with key-value databases. Write amplification occurs when writes are repeated multiple times, which can lead to performance issues and extra wear and tear on the memory. Their fix is to introduce their own key-value store, CaseDB, which not only solves write amplification but also outperforms LevelDB and WiscKey, an LSM-tree-based key-value store. Similarly, Chandramouli et al. introduced FASTER [1], a new embedded key-value database system to improve performance in large scale applications. Their research was focused on improving concurrent performance in large scale applications with millions of accesses a second.

An alternative to using key-value databases to replace the file system is to use key-value databases to improve the file system itself. Specifically, Chen et al. [2] applied key-value database technology to improve the performance of the file system. They created a file system middleware named FILT which uses key-value database technology to achieve up to 5.8x the performance of existing file systems.
3 Underpinnings of Our Approach

This section presents the fundamental concepts of our approach as shown in Fig. 1.

3.1 Embedded Databases
Embedded databases are databases that are included in an application rather than run as a separate server [12]. Since they are embedded in the application itself, they do not require a separate server or Internet access to use. Embedded databases are meant for scenarios where an application is storing data that is only needed on the local machine.

Some embedded databases, such as SQLite, are full relational databases. However, simpler NoSQL alternatives such as key-value databases have become popular. Key-value databases only allow efficient access to each entry by a unique key, and do not support more complex queries. While this is restrictive, it is sufficient for many use cases and allows the database to be much simpler and more optimized.

Fig. 1. Underpinnings of our Approach. [Concept diagram: SQLite3, LevelDB, RocksDB (a fork of LevelDB), and Berkeley DB are each an embedded database; ext4 and NTFS are each a filesystem; embedded database and filesystem are each a storage method.]

SQLite3¹ is one of the most popular embedded databases. SQLite3 is an open source C library that implements an embedded SQL database engine. It is used extensively in mobile devices, Python apps, and browsers, among many other applications². While SQLite is a fully featured SQL database, for our purposes we treat it as a key-value store by using just one table with a primary key.

LevelDB³ is an open-source, key-value embedded database sponsored by Google. As a key-value database, it maps arbitrary binary string keys to values and does not support SQL queries. LevelDB offers features such as high performance, compression of data, and "snapshots" to provide consistent read-only views of the data. LevelDB is used in many applications including Chrome's IndexedDB⁴, Bitcoin Core⁵, and Minecraft Bedrock Edition⁶.

RocksDB⁷ is another open-source, key-value embedded database. RocksDB is sponsored by Facebook, and is used in much of their infrastructure. RocksDB started as a fork of LevelDB, and has a similar API. Facebook found that LevelDB did not perform well with massive parallelism or datasets larger than RAM, so they created RocksDB with a focus on scaling to large applications.

¹ https://www.sqlite.org
² https://www.sqlite.org/mostdeployed.html
³ https://github.com/google/leveldb
⁴ https://chromium.googlesource.com/chromium/src/+/a77a9a1/third_party/blink/renderer/modules/indexeddb/docs/idb_data_path.md
⁵ https://bitcoindev.network/understanding-the-data
⁶ https://github.com/Mojang/leveldb-mcpe
⁷ http://rocksdb.org
Berkeley DB⁸ is an open source embedded database provided by Oracle. Berkeley DB advertises itself as "a family of embedded key-value database libraries that provides scalable, high-performance data management services to applications." Berkeley DB is very feature rich, and encompasses multiple different database technologies. Our benchmark makes use of the efficient key-value API, but Berkeley DB can also create relational SQL databases, and use multiple different index types, among many other features. Some programs that use Berkeley DB are Bogofilter, Citadel, and SpamAssassin.

3.2 File System Performance
File systems can have issues when handling large numbers of small files. For instance, most modern file systems, including FAT, ext3, and ext4, allocate space for files in units of a cluster or block, no matter how small the file is. For large files this is inconsequential, but if there is a very large number of files smaller than the cluster size, it can lead to a large amount of space being wasted [3]. Directory lookup can also take a performance hit if a large number of files are directly under a single folder [9]. This file system performance bottleneck has led many applications, including browsers and web caches, to use "nested hash file structures" with files placed into intermediate sub-directories based on their hash [7]. This avoids placing too many files directly under a single folder and can improve performance. As seen in Fig. 2, the top level directory may contain folders 00 through ff, where each sub-directory contains files whose hash digest starts with those characters. Files would be spread evenly across all 256 sub-directories. This process can be repeated for as many levels of nesting as are required to get a reasonable number of files in the "leaf" directories.
c4/
    ca/
        820dcc509.png
    ae/
        36f067f89c.png
ec/
    cb/
        e28308fd9f.png

Fig. 2. Nested Structure.

⁸ https://www.oracle.com/database/technologies/related/berkeleydb.html
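The mapping from a hash digest to a location in a nested hash file structure can be illustrated with a short C++ sketch. The function name `nestedPath` and its parameters `levels` and `charsPerLevel` are assumptions made for illustration; they are not taken from the benchmark's source code.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Builds the relative path of a record inside a nested hash file structure.
// `digest` is the hex digest used as the record key; `levels` and
// `charsPerLevel` are illustrative parameters, not names from the paper.
std::string nestedPath(const std::string& digest,
                       std::size_t levels,
                       std::size_t charsPerLevel) {
    // The digest must be long enough to leave a non-empty file name.
    assert(digest.size() > levels * charsPerLevel);

    std::string path;
    std::size_t pos = 0;
    for (std::size_t i = 0; i < levels; ++i) {
        // Each level consumes the next `charsPerLevel` hex characters.
        path += digest.substr(pos, charsPerLevel);
        path += '/';
        pos += charsPerLevel;
    }
    // The remaining characters form the file name in the leaf directory.
    path += digest.substr(pos);
    return path;
}
```

With two levels of two characters each, the digest `c4ca820dcc509` maps to `c4/ca/820dcc509`, which corresponds to the first entry of Fig. 2 without its file extension.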
4 Methodology
The benchmark algorithm iterates over the storage methods and usage patterns, generates random data for the operations, and then takes measurements of the performance. These steps are described in detail in this section. Listing 1.1 shows the pseudocode outline of the algorithm used for the benchmark. The algorithm loops over different combinations of store type, data type (compressible text or incompressible binary data), record size range, and record count range (Line 3). For each combination it benchmarks 1,000 iterations of each of the 4 key-value operations (Lines 11, 16, 20 and 25) using a random access pattern and random data. It then records the peak memory usage and disk space efficiency for each combination (Lines 30 and 33). Finally, it outputs the results to a CSV file (Line 35).

4.1 Comparing Storage Methods
Four different embedded key-value databases were compared, SQLite3, LevelDB, RocksDB, and Berkeley DB. While SQLite is a fully featured SQL database, for our purposes we treat it as key-value database by just having one table with a primary key column and a value column. Then, the embedded databases were compared with two strategies for storing key-value records on disk, flat and nested. The flat storage strategy places all the records as files with their key as their name under one directory, as seen in Fig. 3. The nested storage strategy uses a nested hash file structure as described in Sect. 3.2 and Fig. 2.
c4ca820dcc509.png
c4ae36f067f89c.png
eccbe28308fd9f.png

Fig. 3. Flat Structure.
Three levels of nesting were chosen with 2 hexadecimal characters per level. We chose a depth of 3 as it can hold up to 16,777,216 (256³) records while still keeping around 256 records in the leaf nodes, and our benchmark only tests up to 10,000,000 records. As seen in Fig. 4, wrappers were created for each of the 6 different storage methods that implemented an abstract interface that could be used by the benchmark. The wrappers had methods for insert, update, get, and remove, and kept count of how many records were in the database.
 1  baseMem = getPeakMem()
 2
 3  for each combination of
 4      (storeType, dataType, size, count):
 5    if averageRecordSize * count.max > 10 GiB:
 6      skip
 7
 8    resetPeakMem()
 9    store = new storeType with initial data
10
11    repeat 1000 times:
12      key = newKey(store)
13      value = gen(dataType, size)
14      benchmark store.insert(key, value)
15
16    repeat 1000 times:
17      key = pickKey(store)
18      benchmark store.get(key)
19
20    repeat 1000 times:
21      key = pickKey(store)
22      value = gen(dataType, size)
23      benchmark store.update(key, value)
24
25    repeat 1000 times:
26      key = pickKey(store)
27      benchmark store.remove(key)
28      store.insert(key, gen(dataType, size))
29
30    peakMem = getPeakMem() - baseMem
31    dataSize = getDataSize(store)
32    diskSize = getDiskUsage(store)
33    spaceEfficiency = dataSize / diskSize
34
35  write measurements to file

Listing 1.1. Pseudocode of Benchmark
Fig. 4. Class Diagram of the Store Types. [Class diagram: the abstract interface Store declares void insert(string key, string value), void update(string key, string value), string get(string key), remove(string key), and count(); it is implemented by SQLite3Store, LevelDBStore, RocksDBStore, BerkeleyDBStore, FlatFolderStore, and NestedFolderStore.]
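The interface in Fig. 4 could be expressed in C++ roughly as follows. This is a minimal sketch: the method names follow the figure, while the exact signatures (const references, `std::size_t` for the count) and the in-memory `MapStore` class are assumptions added only so that the interface can be exercised without any database library.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Abstract interface shared by all storage methods, following Fig. 4.
class Store {
public:
    virtual ~Store() = default;
    virtual void insert(const std::string& key, const std::string& value) = 0;
    virtual void update(const std::string& key, const std::string& value) = 0;
    virtual std::string get(const std::string& key) = 0;
    virtual void remove(const std::string& key) = 0;
    virtual std::size_t count() const = 0;
};

// Hypothetical in-memory store, used here only to exercise the interface.
class MapStore : public Store {
public:
    void insert(const std::string& key, const std::string& value) override {
        const bool inserted = records_.emplace(key, value).second;
        assert(inserted);  // insert expects a key that is not yet present
        (void)inserted;
    }
    void update(const std::string& key, const std::string& value) override {
        records_.at(key) = value;  // throws std::out_of_range if key is missing
    }
    std::string get(const std::string& key) override {
        return records_.at(key);  // returns a copy owned by the caller
    }
    void remove(const std::string& key) override {
        records_.erase(key);
    }
    std::size_t count() const override { return records_.size(); }

private:
    std::map<std::string, std::string> records_;
};
```

Programming the benchmark loop against `Store` rather than a concrete class is what lets the same timing code run unchanged for each of the six storage methods.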
Both the SQLite and Berkeley DB C interfaces return char* pointers that become invalid after another database operation is performed. The wrappers therefore copy the buffer into memory managed by our code. This adds some overhead that could be avoided if a developer only needed the data briefly and did not need to perform any other database operations until they were finished with the data. However, we assume that the most common use case will make a copy to avoid risking dangling pointers and memory safety issues.

4.2 Comparing Usage Patterns
Multiple usage patterns were compared for each of the storage methods. Four record size ranges were compared (less than 1 KiB, 1–10 KiB, 10–100 KiB, and 100 KiB–1 MiB) along with five record counts (100, 1,000, 10,000, 100,000, and 1,000,000). All combinations of sizes and counts that led to databases smaller than 10 GiB were benchmarked (Line 5 of Listing 1.1). This benchmark measured random access reads and updates. A valuable direction for future work could be to benchmark different access patterns. LevelDB and RocksDB implement data compression. Since this could have an impact on the size of the records on disk and the cost of IO operations, both compressible text data and incompressible random binary data were tested.
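The selection of size and count combinations can be sketched as below. The sketch assumes that the average record size of a range is its midpoint and that the smallest range starts at 1 byte; neither detail is stated in the text, and all type and function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

struct SizeRange {
    std::uint64_t minBytes;
    std::uint64_t maxBytes;
};

// One benchmarked configuration: a record size range and a record count.
using Combination = std::pair<SizeRange, std::uint64_t>;

// Returns the (size range, count) pairs whose estimated total size stays
// within `limitBytes`. The average record size is assumed to be the midpoint.
std::vector<Combination> selectCombinations(
        const std::vector<SizeRange>& sizes,
        const std::vector<std::uint64_t>& counts,
        std::uint64_t limitBytes) {
    std::vector<Combination> selected;
    for (const SizeRange& size : sizes) {
        assert(size.minBytes <= size.maxBytes);
        const std::uint64_t average = (size.minBytes + size.maxBytes) / 2;
        for (std::uint64_t count : counts) {
            if (average * count > limitBytes) {
                continue;  // estimated database size exceeds the limit
            }
            selected.emplace_back(size, count);
        }
    }
    return selected;
}

// The size ranges and record counts listed in the text, with a 10 GiB limit.
std::vector<Combination> paperCombinations() {
    const std::uint64_t KiB = 1024, MiB = 1024 * KiB, GiB = 1024 * MiB;
    const std::vector<SizeRange> sizes = {
        {1, KiB}, {KiB, 10 * KiB}, {10 * KiB, 100 * KiB}, {100 * KiB, MiB}};
    const std::vector<std::uint64_t> counts = {
        100, 1000, 10000, 100000, 1000000};
    return selectCombinations(sizes, counts, 10 * GiB);
}
```

Under these assumptions the largest size range is skipped from 100,000 records upward, since roughly 562 KiB per record times 100,000 records already exceeds 10 GiB.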
4.3 Generating Data
Data for values in insert and update statements was randomly generated. Incompressible data was generated with the Mersenne Twister 19937 algorithm (as in the C++ standard library) to generate pseudorandom bytes. Compressible text was selected from 250 public domain English books downloaded from Project Gutenberg⁹.

⁹ https://www.gutenberg.org/
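Generating incompressible values with the standard library's Mersenne Twister could look like the following sketch. The function name `randomBytes` is illustrative, and the actual benchmark may fill its buffers differently.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <string>

// Fills a buffer of `size` bytes with pseudorandom values drawn from the
// Mersenne Twister engine std::mt19937. The function name is illustrative.
std::string randomBytes(std::size_t size, std::mt19937& engine) {
    // uniform_int_distribution is not defined for char types, so draw ints
    // in [0, 255] and narrow each one to a byte.
    std::uniform_int_distribution<int> byteValue(0, 255);
    std::string data(size, '\0');
    for (char& byte : data) {
        byte = static_cast<char>(byteValue(engine));
    }
    assert(data.size() == size);
    return data;
}
```

Seeding the engine with a fixed value makes a run reproducible, which can be convenient when comparing stores on identical data.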
394
J. Hines et al.
Keys were generated using a SHA-1 hash of the index, based on the order in which keys were inserted (Line 12 of Listing 1.1). Using this method lets the benchmark choose a random key from the store by hashing a random number in the range [0, store.count). This allows the benchmark to simulate random access get (Line 17), update (Line 21), and remove (Line 26) operations without storing a list of all keys inserted. This strategy does cause a complication when benchmarking the remove operation, however. After a random remove we can no longer assume the keys are in a continuous range. This is easily remedied by re-inserting the key after each removal to keep the keys continuous (Line 28).
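The idea of deriving keys from insertion indices can be sketched as follows. Note the substitution: the paper uses SHA-1, which is not part of the C++ standard library, so this sketch uses a 64-bit FNV-1a hash purely as a dependency-free stand-in. The function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <string>

// Derives the key for the record inserted at position `index`.
// NOTE: the paper uses SHA-1; a 64-bit FNV-1a hash is substituted here only
// to keep the sketch free of external dependencies.
std::string keyForIndex(std::uint64_t index) {
    std::uint64_t hash = 14695981039346656037ULL;  // FNV-1a offset basis
    for (int i = 0; i < 8; ++i) {
        hash ^= (index >> (8 * i)) & 0xFF;  // mix in one byte of the index
        hash *= 1099511628211ULL;           // FNV-1a prime
    }
    static const char digits[] = "0123456789abcdef";
    std::string key(16, '0');
    for (int i = 15; i >= 0; --i) {
        key[i] = digits[hash & 0xF];
        hash >>= 4;
    }
    return key;
}

// Picks the key of a random existing record without storing a key list:
// any index in [0, count) maps back to a key that was inserted earlier.
std::string pickKey(std::size_t count, std::mt19937_64& engine) {
    assert(count > 0);
    std::uniform_int_distribution<std::uint64_t> index(0, count - 1);
    return keyForIndex(index(engine));
}
```

Because a key is a pure function of its index, the benchmark only needs the current record count to address any stored record, as long as the indices in use stay contiguous.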
4.4 Taking Measurements
For each combination of store and usage pattern we measured multiple attributes. The primary attribute we measured was the time taken for each of the fundamental key-value operations: insert, update, get, and remove. For each combination we ran each operation 1,000 times and recorded the average, minimum, and maximum values in nanoseconds. Since the memory usage of an application is also an important factor of performance, we measured the peak memory usage for each combination as well (Line 30). To measure peak memory usage we used the Linux getrusage syscall¹⁰. This syscall returns the peak resident set size used by the process in kilobytes. We reset the peak memory between configurations by using the special file /proc/self/clear_refs¹¹. Writing a "5" to this file resets the process's peak memory usage or "high water mark" measurement and allows us to get the peak memory usage for a specific period of time. We then subtracted a baseline measurement of the memory usage taken at the beginning of the benchmark to show only the memory used by the store. Another factor besides performance that may be important for developers is the "space efficiency" of a storage solution, especially if they are working with large amounts of data or limited space. Embedded databases can add storage overhead or, if they have compression, greatly reduce the space used. The file system itself can also waste space when storing many tiny files. We measured the approximate space efficiency of each store by comparing the amount of data put in with the actual space taken up on disk (Line 33). We measured space efficiency by counting the number of bytes actually inserted into the store, and then using the du command¹² to measure the space taken on disk, including wasted block space.
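The timing and peak-memory measurements described above can be sketched in C++ for Linux as follows. The helper names are hypothetical and the benchmark's own code may differ; the sketch relies only on `std::chrono::steady_clock`, the `getrusage` call, and the `/proc/self/clear_refs` file mentioned in the text.

```cpp
#include <sys/resource.h>

#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstdint>
#include <fstream>
#include <functional>
#include <limits>

struct TimingStats {
    std::int64_t avgNs;
    std::int64_t minNs;
    std::int64_t maxNs;
};

// Runs `operation` `iterations` times and reports the average, minimum, and
// maximum duration in nanoseconds.
TimingStats timeOperation(const std::function<void()>& operation,
                          int iterations) {
    assert(iterations > 0);
    std::int64_t total = 0;
    std::int64_t minNs = std::numeric_limits<std::int64_t>::max();
    std::int64_t maxNs = 0;
    for (int i = 0; i < iterations; ++i) {
        const auto start = std::chrono::steady_clock::now();
        operation();
        const auto stop = std::chrono::steady_clock::now();
        const std::int64_t ns =
            std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start)
                .count();
        total += ns;
        minNs = std::min(minNs, ns);
        maxNs = std::max(maxNs, ns);
    }
    return {total / iterations, minNs, maxNs};
}

// Peak resident set size of this process; Linux reports it in kilobytes.
long peakMemoryKiB() {
    struct rusage usage {};
    const int rc = getrusage(RUSAGE_SELF, &usage);
    assert(rc == 0);
    (void)rc;
    return usage.ru_maxrss;
}

// Resets the peak ("high water mark") by writing "5" to clear_refs.
// Returns false if the file cannot be written, e.g. on non-Linux systems.
bool resetPeakMemory() {
    std::ofstream clearRefs("/proc/self/clear_refs");
    clearRefs << "5";
    clearRefs.flush();
    return clearRefs.good();
}
```

A monotonic clock is used here so that system clock adjustments during a run cannot produce negative durations; whether the original benchmark uses the same clock is not stated.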
4.5 Implementing the Algorithm
We chose to implement the benchmark in C++. All the embedded databases we compared are written in C or C++ and so could be used in a C++ project. Using C++ allowed us to compile and link the embedded databases, giving us more control over their configuration. Additionally, using C++ ensures there is no extra overhead from language bindings. The benchmark is open source and the code is available on GitHub (footnote 13).
10 https://man7.org/linux/man-pages/man2/getrusage.2.html.
11 https://man7.org/linux/man-pages/man5/proc.5.html.
12 https://man7.org/linux/man-pages/man1/du.1.html.
4.6 Choosing Hardware
The benchmark was run on a virtual machine provided by Southern Adventist University. The VM ran Ubuntu Server 21.10 and was given 2 cores of an AMD EPYC 7402P processor, 8 GiB of DDR4 2667 MT/s RAM, and 250 GiB of Vess R2600ti HDD storage.
5 Results
This section describes notable patterns in the benchmark results and our analysis of them. A condensed summary is given in Table 1, which lists the most efficient storage option for each usage pattern. The operation measured and the record size are listed on the vertical axis, while data compressibility and record count are listed on the horizontal axis. Each cell is color-coded by storage option (yellow = Berkeley DB, red = LevelDB, etc.) for ease of reading. Combinations of record size and count that would result in more than 10 GiB of files were skipped and left blank in the table. The full results of the benchmark are available on GitHub (see footnote 13) in CSV format, along with an Excel spreadsheet that imports the CSV data and shows interactive charts.
5.1 Filesystem vs DBs
The choice between the file system and an embedded database is complex and depends heavily on the particular usage pattern and the types of operations being performed. For instance, in this benchmark an embedded database is faster when working with records smaller than 1 KiB, and Berkeley DB is the fastest in most scenarios with such tiny records. In the 1 to 10 KiB range, databases are still better in most scenarios, though the file system is faster on gets and inserts when there are 100,000 records. In the 10 to 100 KiB range, file system performance is better than or very close to that of the databases for get and insert operations. RocksDB or LevelDB is faster for remove operations with record counts below 10,000 as long as compression is turned off, while Berkeley DB leads on the update benchmarks. For 100 KiB to 1 MiB records, the file system is still generally faster for get and insert operations, though LevelDB without compression is faster for 1,000 records or fewer. Berkeley DB performs well on update operations. RocksDB without compression is fastest for removes until about 10,000 records, after which the file system is faster.
13 https://github.com/jesse-r-s-hines/KeyValueStoreBenchmark.
Table 1. Summary of the Best Storage Options for each Usage Pattern. Cells are color-coded by store (yellow = Berkeley DB, red = LevelDB, etc.). Rows give the operation (insert, update, get, remove, space) and the record size (< 1 KiB, 1–10 KiB, 10–100 KiB, 100 KiB–1 MiB); columns give incompressible and compressible data at record counts of 100, 1,000, 10,000, 100,000, and 1,000,000.
As expected, the file system had poor space efficiency on tiny records. Its space efficiency on records smaller than 1 KiB was about 12% because of block allocation overhead, while the embedded databases were able to pack multiple records into a single block. The baseline storage overhead of the embedded databases appears to be minimal: both Berkeley DB and RocksDB were more space efficient than the flat folder store even with only 100 records. At large record sizes, the differences in space efficiency between the databases and the file system are minimal, as the wasted space at the end of each block becomes less of a factor. However, for compressible data, LevelDB and RocksDB can use compression to reach a space efficiency of up to 156%, at the cost of more CPU overhead.
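The block allocation overhead mentioned above can be illustrated with a small calculation. The sketch below is an assumption-laden example rather than the benchmark's measurement code: it assumes a 4 KiB allocation block (a common ext4 default, not stated in the text), ignores inode and directory overhead, and uses the hypothetical names `allocated_bytes` and `space_efficiency`.

```cpp
#include <cstdint>

// Bytes occupied on disk by a file of `size` bytes when space is handed
// out in whole blocks of `block_size` bytes (metadata is ignored).
std::uint64_t allocated_bytes(std::uint64_t size, std::uint64_t block_size) {
    return (size + block_size - 1) / block_size * block_size;
}

// Space efficiency in percent: bytes of data put in, divided by bytes taken
// on disk. Values above 100 are possible when a store compresses its data.
double space_efficiency(std::uint64_t data_bytes, std::uint64_t disk_bytes) {
    return 100.0 * static_cast<double>(data_bytes) /
           static_cast<double>(disk_bytes);
}
```

Under these assumptions a hypothetical 500-byte record stored as its own file occupies one 4096-byte block, giving 500/4096, or roughly 12%, which is the same order as the file system figure reported above. A store that packs several such records into one block avoids most of that waste.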
5.2 File System Patterns
The file system scales to large record counts quite well until about 1,000,000 records of 1–10 KiB, where the get operation time increases 500-fold for both compressible and incompressible data. This spike does not occur at smaller record sizes. Insert and update times are dominated by record size rather than record count. The measurements of file system operations have several outliers. For instance, on the insert benchmark for 1,000,000 compressible 1–10 KiB records, the nested folder store shows a 33-fold slowdown compared to incompressible data. Since the file system does not perform any compression, it seems unlikely that text vs. binary data would make such a large difference, and if it did, it should affect both the flat store and the nested store, which it does not. Looking at the measurements, the total time of the 1,000 repetitions was 776 ms, while the maximum single measurement was 707 ms: a single iteration took about 91% of the benchmark's time. Similar spikes occurred elsewhere in file system operations. We suspect that some operating system factor caused intermittent stalls in file system operations, likely involving the file system cache.
5.3 Flat Folder vs. Nested Folder
Using the nested directory structure seems to be mostly counterproductive. In most cases the nested folder store is slower than the simpler flat structure. Unsurprisingly, it was also less space efficient than the flat folder store because of the extra directory overhead. It offers some benefit in a few scenarios: on the remove operation at 10–100 KiB with 100,000 records, the nested structure is about 30% faster than the flat one, and it is also faster for updates with 10–100 KiB records and a record count of 100,000.
5.4 Effects of Compression
As one would expect, compression increases the space efficiency of the data store. The databases with built-in compression, LevelDB and RocksDB, were able to achieve over 150% space efficiency on compressible text data. However, this came at the cost of notably decreased performance: the reduced I/O does not seem to make up for the extra CPU overhead of performing the compression.
5.5 SQLite3 Patterns
SQLite scales well in most scenarios; for instance, with records smaller than 1 KiB the get operation is only 60% slower at 1,000,000 records than at 100. An interesting deviation occurs at 1,000,000 records of 1–10 KiB incompressible data, where SQLite takes 64 times longer than at 100,000 records. This spike does not occur with compressible data, however. Perhaps SQLite handles text and binary data differently in certain scenarios. The update operation took about 20–30% longer on incompressible binary data than on compressible text data. Space efficiency increased with record size and record count, reaching nearly 100% at the largest sizes. However, the constant factor is very high, and SQLite is one of the worst-performing storage options. This is unsurprising, as SQLite is a full relational database and has many more features and overheads than necessary for a simple key-value store.
5.6 LevelDB Patterns
Compression adds significant overhead to LevelDB. With compression turned on, performance on all operations at large record sizes or large record counts is up to 20 times slower than with it off. LevelDB shows major performance degradation for gets starting at 1,000,000 records for record sizes in the < 1 KiB to 100 KiB range and at 100,000 records for 100 KiB to 1 MiB records, as well as for inserts in the 100 KiB to 1 MiB size range with over 100,000 records. With compression enabled, the space efficiency of text data can reach as high as 155%.
5.7 RocksDB Patterns
Since RocksDB is a fork of LevelDB, we expected to see similar results. However, their performance differed in several ways. RocksDB performance degrades significantly at 100 KiB to 1 MiB records for all operations, and the degradation for the get operation with large record counts and compression was more pronounced than in LevelDB. Unlike LevelDB, RocksDB did not show notable performance degradation for inserts at high record counts.
5.8 Berkeley DB Patterns
Berkeley DB is one of the best-performing databases in this benchmark, with operation time increasing slowly as the record count increases. However, there are some notable spikes. In the 1–10 KiB, 1,000,000-record benchmark there were significant slowdowns, some up to 100 times slower, for get, update, and remove operations with both compressible and incompressible data.
6 Conclusions and Future Work
This benchmark showed that key-value store performance is a complex issue that is highly dependent on the particular workload. According to the presented results, if space efficiency is a concern, developers should use an embedded database instead of the file system, particularly one with compression. If performance is the primary concern, Berkeley DB is one of the most performant stores in our benchmark, being the fastest or close to it in most cases, particularly for insert and update operations. At record sizes above 10 KiB, the simple flat file system store is often the fastest option or at least very close to matching the performance of the embedded key-value databases.
Although the way persistent data is structured can have a significant impact on application performance, other factors, such as the configuration of the database and the operating system, might also affect data handling performance. Therefore, future work will cover additional variables, such as the frequency of data access and modification. The file system and embedded key-value databases will also be evaluated on different computer architectures. Moreover, this benchmark only ran on the ext4 file system; different file systems, in particular the widespread NTFS, may have different performance profiles. Examining the effects of each embedded database’s configuration, as well as different access patterns such as sequential access, would also be a valuable avenue for research. Research into key-value store performance on different hardware, particularly SSDs vs. HDDs, is needed as well. Performance may also be affected by complex caching patterns or by background processes run by the embedded databases, such as LevelDB’s background compaction (footnote 14). More research into the exact causes of the various outliers and performance degradations noted in this dataset would be valuable as well.
14 https://github.com/google/leveldb/blob/main/doc/impl.md#compactions.
References
1. Chandramouli, B., Prasaad, G., Kossmann, D., Levandoski, J., Hunter, J., Barnett, M.: Faster. Proc. VLDB Endowment 11(12), 1930–1933 (2018)
2. Chen, C., Deng, T., Zhang, J., Zou, Y., Zhu, X., Yin, S.: Optimizing KV-embedded file systems through flat indexing. In: FILT, November 2020
3. Chen, T.Y., Chang, Y.H., Chen, S.H., Hsu, N.I., Wei, H.W., Shih, W.K.: On space utilization enhancement of file systems for embedded storage systems. ACM Trans. Embed. Comput. Syst. 16(3), 1–28 (2017)
4. Collberg, C.S., Hartman, J.H., Babu, S., Udupa, S.K.: SLINKY: static linking reloaded. In: Proceedings of the Annual Conference on USENIX Annual Technical Conference, ATEC 2005, USA, p. 34. USENIX Association (2005)
5. Gupta, A., Tyagi, S., Panwar, N., Sachdeva, S., Saxena, U.: NoSQL databases: critical analysis and comparison. In: 2017 International Conference on Computing and Communication Technologies for Smart Nation (IC3TSN), pp. 293–299 (2017)
6. Lutes, K., Patchigolla, V.N.R., Springer, J.: Embedded database management performance. In: Information Technology: New Generations, Third International Conference on, Los Alamitos, CA, USA, pp. 998–1001. IEEE Computer Society, April 2011
7. Patil, S., Gibson, G.: Scale and concurrency of GIGA+: File system directories with millions of files. In: Proceedings of the 9th USENIX Conference on File and Storage Technologies, FAST 2011, USA, p. 13. USENIX Association (2011)
8. Puangsaijai, W., Puntheeranurak, S.: A comparative study of relational database and key-value database for big data applications. In: 2017 International Electrical Engineering Congress (iEECON), pp. 1–4 (2017)
9. Ruan, L., Ding, Y., Dong, B., Li, X., Xiao, L.: Small files problem in parallel file system. In: Network Computing and Information Security, International Conference on, Los Alamitos, CA, USA, May 2011, vol. 2, pp. 227–232. IEEE Computer Society (2011)
10. Sears, R., Van Ingen, C., Gray, J.: To BLOB or not to BLOB: large object storage in a database or a filesystem? CoRR, arXiv:abs/cs/0701168 (2007)
11. Stancu-Mara, S., Baumann, P.: A comparative benchmark of large objects in relational databases. In: Proceedings of the 2008 International Symposium on Database Engineering & Applications, IDEAS 2008, New York, NY, USA, pp. 277–284. Association for Computing Machinery (2008)
12. Techopedia: What is an embedded database? Definition from Techopedia, December 2014
13. Tulkinbekov, K., Kim, D.-H.: CaseDB: lightweight key-value store for edge computing environment. IEEE Access 8, 149775–149786 (2020)
A Review of Usability Evaluation Methods for eHealth Applications
Aslina Baharum1(B), Siti Rahayu Abdul Aziz2, and Nurul Hidayah Mat Zain2
1 Computing and Information Systems, School of Engineering and Technology, Sunway University, 5, Jalan Universiti, 47500 Bandar Sunway, Selangor, Malaysia
[email protected], [email protected]
2 Faculty of Computer and Mathematical Sciences, Universiti Teknologi MARA (UiTM), Kampus Jasin Melaka, 77300 Merlimau, Melaka, Malaysia
{rahayu748,nurul417}@uitm.edu.my
Abstract. Some literature addresses the usability of eHealth applications; however, it does not fully address the usability methods themselves or describe which methods are used in the majority of usability evaluations of eHealth applications. Therefore, this paper aims to capture the usability methods that are commonly used in usability evaluation of eHealth applications by reviewing the available literature. The review adopted a systematic literature review (SLR) method to achieve this goal. The results add to the knowledge of the usability methods currently used in eHealth application evaluation, which can be useful for future work.
Keywords: eHealth · Usability · Systematic Literature Review (SLR) · Questionnaire
1 Introduction
Human-computer interaction (HCI) refers to the interaction between humans and computers. One of the important concepts in HCI is usability, defined as how easily and how well a product or design can be used by a user. More specifically, usability refers to how easy it is to use a system, how well it supports the user’s tasks, and how satisfied the user is with the experience. Usability is important in HCI because a usable interface can improve user productivity, reduce user frustration, and increase user satisfaction with the system. A well-designed and functional interface can also reduce the need for training and support, leading to lower costs and increased user adoption.
Usability evaluation is the assessment of the usability of a product or design. It covers five factors: learnability, efficiency, memorability, errors, and satisfaction [12]. Learnability refers to how easy it is for users to learn how to use the system. Efficiency refers to how quickly users can perform tasks with the system. Memorability refers to how easily users can remember how to use the system after a period of time. Errors refer to the number of mistakes users make while using the system and how easily they can recover from those mistakes. Satisfaction refers to how pleased users are with the system and whether they find it enjoyable to use. Usability evaluation is important because it helps determine how usable a product or design is and whether it satisfies its usability goals. It also helps uncover the changes needed to improve the performance of a product or design.
eHealth refers to the use of information and communication technologies (ICT) to deliver health services. It allows consumers to get health services easily. There are more than 350,000 eHealth applications available on the market. Electronic health records (EHRs), telemedicine (remote consultations and monitoring), mobile health (mHealth) apps, wearable technology, health information exchange (HIE), and health analytics are just a few examples of the many applications that fall under the umbrella term “eHealth.” eHealth technologies can improve the quality, safety, and efficiency of healthcare services, enhance patient engagement and empowerment, and facilitate better coordination and communication among healthcare providers. Usability is one of the important characteristics of an application. Therefore, usability evaluation is a crucial phase in developing an eHealth application.
This paper reviews the usability evaluation methods used for eHealth applications. It is structured as follows: Sect. 2 presents the related studies, Sect. 3 presents the methodology used to collect the associated papers, Sect. 4 presents the results and summaries drawn from the collected papers, Sect. 5 presents the discussion, and Sect. 6 concludes.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 K. Arai (Ed.): SAI 2023, LNNS 739, pp. 401–410, 2023. https://doi.org/10.1007/978-3-031-37963-5_28
2 Related Studies
Maramba et al. (2019) [17] conducted a scoping review of 133 articles published between April 2014 and October 2017. The goal was to identify the methods currently used in usability evaluation of eHealth applications. The review found that the common usability evaluation methods were questionnaires, task completion, think-aloud, interviews, heuristic evaluation, and focus groups. Of these six methods, the questionnaire was the most frequently used.
Davis et al. (2020) [7] conducted a comprehensive review of 28 articles published between January 2005 and September 2019. The goal was to review the most commonly used usability evaluation methods in eHealth interventions for HIV. The commonly used methods were interviews, think-aloud, heuristic evaluation, focus groups, questionnaires, and scenarios. The review also found that the eHealth interventions were delivered through mobile applications (68%), websites (25%), and PC software (7%). The most used method for usability evaluation was the questionnaire.
Wronikowska et al. (2021) [26] reviewed usability evaluation methods, metrics, and techniques used to evaluate eHealth systems designed for hospital staff, covering articles published between 1986 and 2019. After screening abstracts, 130 full texts were examined. In the 51 studies included in the review, ten distinct types of usability evaluation methods were employed, classified into the following categories: user trials analysis, heuristic evaluations, interviews, and questionnaires. User trials were the most frequently reported method, employed in 44 studies (86%). Questionnaires were used in 40 studies (78%), interviews in ten studies (20%), and heuristic evaluation in seven studies (14%). The majority of authors used multiple methods to evaluate electronic systems.
Ansaar et al. (2020) [1] conducted a systematic literature review on the process of mHealth application usability evaluation, covering 19 studies published after 2010. The selected studies used various empirical evaluation methods: questionnaires (15 studies), interviews (four studies), the “think out loud” method (one study), and the log method (one study). Several studies employed a mixed-method approach by combining questionnaires with interviews, and one study combined questionnaires with the log method. Numerous researchers recommend a mixed-methods approach because the benefits and drawbacks of the various methods can compensate for one another.
Soltanzadeh et al. (2022) [23] reviewed the usability of telehealth and telemedicine systems. The final full-text review of usability factors in health and health care included 119 articles. In studies evaluating the usability of mHealth apps, questionnaires were the most common method of data collection, followed by field studies, interviews, observations, think-aloud protocols, and app-generated data. Most of the reviewed articles used ISO 9241-11 or Nielsen’s model to evaluate the usability of telehealth systems.
Larbi et al. (2020) [15] reviewed the methods and evaluation criteria used for diabetes self-management apps and digital interventions, and how patients participated in these evaluations. Papers published from 2015 onward were reviewed. The most common evaluation techniques were questionnaires, interviews, and user-group meetings, with questionnaires the most common of all. Cognitive impact, clinical impact, and usability were the most frequently evaluated criteria.
3 Methodology
This review adopts the SLR method to identify, select, assess, and interpret the existing literature on usability methods for eHealth applications. The steps taken are described in the following subsections.
3.1 Search Process
The main goal of this review is to survey the available literature and capture the usability methods that are commonly used in usability evaluation of eHealth applications. Three search terms were used to find the relevant literature: “Usability Method for eHealth Applications”, “Usability Evaluation of eHealth Applications”, and “eHealth Usability Testing Evaluation”. The literature was drawn from several electronic database resources: Google Scholar, Journal of Medical Internet Research (JMIR), ScienceDirect, Oxford Academic, SAGE journals, Multidisciplinary Digital Publishing Institute (MDPI), Dove Medical Press, Wiley Online Library, American Journal of Occupational Therapy (AJOT), and Heliyon. The literature consists of journal papers and articles, selected according to inclusion and exclusion criteria. A paper was considered for inclusion if it (1) focused on eHealth applications, including websites, PC software, and smartphone and tablet applications; (2) provided knowledge about the usability evaluation method used for the eHealth application; (3) was a full or short paper published between 2018 and 2022; and (4) was written in English. Papers that did not meet these criteria were excluded.
3.2 Review Process
The selection of suitable papers from the electronic database resources focused on the keywords: usability evaluation in eHealth, usability testing method in eHealth, and usability in eHealth applications. All papers were selected carefully to allow effective review and data extraction. Table 1 shows the number of papers selected per electronic database resource. In total, 20 papers were selected, with between one and eight papers per resource. The most papers came from JMIR, with eight. The fewest came from Oxford Academic, MDPI, Dove Medical Press, Wiley Online Library, AJOT, and Heliyon, with one paper each. Table 2 lists the details of the 20 selected papers, which were published between 2018 and 2021.
Table 1. Number of Papers Selected Per Electronic Database Resource
Electronic Database Resource    No. of Papers
JMIR                            8
ScienceDirect                   4
Oxford Academic                 1
SAGE journals                   2
MDPI                            1
Dove Medical Press              1
Wiley Online Library            1
AJOT                            1
Heliyon                         1
Total                           20
Table 2. List of Selected Papers
Paper ID    Authors                           Year
E1          Brett et al. [2]                  2018
E2          Børøsund et al. [3]               2018
E3          Calvillo-Arbizu et al. [4]        2019
E4          Cho et al. [5]                    2019
E5          Cho et al. [6]                    2018
E6          Donald et al. [8]                 2021
E7          Fuller-Tyszkiewicz et al. [9]     2018
E8          Gilbertson-White et al. [10]      2019
E9          Grasaas et al. [11]               2019
E10         Jarvis et al. [13]                2019
E11         King et al. [14]                  2018
E12         Lopes et al. [16]                 2021
E13         Marien et al. [18]                2019
E14         Neubert et al. [19]               2018
E15         Pinem et al. [20]                 2020
E16         Quintana et al. [21]              2020
E17         Sin et al. [22]                   2019
E18         Vamos et al. [24]                 2019
E19         Willems et al. [25]               2021
E20         Zhou et al. [27]                  2019
4 Results
After the 20 papers were selected, they were reviewed in detail. Six usability evaluation methods were identified; Table 3 shows the methods used in each paper. The most frequently used method in the usability evaluation of eHealth applications is the questionnaire, and the least used is heuristic evaluation: twelve papers used a questionnaire and three used heuristic evaluation. Survey and observation were each used in four papers. Eight papers used interviews and seven used think-aloud. Table 4 shows the number of usability evaluation methods used by each paper. The largest group, ten papers, used only one method. Four papers used two methods, and another four used three methods. Two papers used four methods.
Table 3. Usability Evaluation Methods in eHealth
Table 4. No. of Usability Evaluation Methods Used Per Paper
No. of Methods Used    Paper ID                                            Total
1                      E3, E4, E8, E10, E12, E14, E15, E16, E19, E20       10
2                      E1, E2, E6, E7                                      4
3                      E11, E13, E17, E18                                  4
4                      E5, E9                                              2
5 Discussion Based on the results obtained from the 20 selected papers, a majority of the papers used only one method of usability evaluation. From the ten papers that used only one method, a total of 70% which indicates 7 out 10 papers used questionnaire [4, 13, 16, 19, 20, 25, 27] as a method to evaluate the usability of the eHealth application. The remaining paper used think aloud [5], survey [10] and interview [21]. Since questionnaire is the most commonly used for usability evaluation for eHealth application, the discussion of this review paper will be discussing the type of questionnaire used from the 20 selected papers. The questionnaire usability evaluation for iCanCope with Pain application [11] were carried out in phase 2 where the test was done in laboratory setting. The participants
A Review of Usability Evaluation Methods
407
spent around 60 min on a researcher-administered test in which one of the tasks was to answer the System Usability Scale (SUS) questionnaire for the application. The usability evaluation of the PEM+ application [13] was done by caregivers completing the Usefulness, Satisfaction, and Ease of Use (USE) questionnaire. The caregivers rated 22 questions on a 6-point scale ranging from 1 (strongly disagree) to 6 (strongly agree). The usability evaluation of the AvaliaDor application [16] was done through the Post-Study System Usability Questionnaire (PSSUQ). The questionnaire consists of 19 questions that users score on an 8-point Likert scale. In phase 2 of the MedRec application development [18], the participants were invited to a session in which a scenario was presented and they had to access the web-based prototype of the application to get the patients’ medication list. The usability evaluation of the application was done by the participants answering the SUS questionnaire. The SymptomMapper application [19] was evaluated by two groups of participants, doctors and patients. The patients evaluated the application by answering a usability questionnaire designed by the developers. The doctors evaluated the application by answering a survey consisting of three questionnaires (SUS, AttrakDiff 2, and ISONORM 9241/10). The usability evaluation in the first iteration of the mobile JKN application [20] was done using the SUS questionnaire, and the evaluation in the second iteration was done using PSSUQ. The COPe-support application prototype [22] was evaluated through a questionnaire in which carers rated each question on a 5-point Likert scale. The usability of the Patient Journey application [25] was evaluated using the SUS questionnaire, and the attitude of the users toward the application was evaluated using the eHealth Impact Questionnaire (eHIQ). The users scored both questionnaires on a 5-point Likert scale. The questionnaire is a common method because of its advantages. It is an affordable and practical way to gather data because it can be administered online at little or no cost. In addition, a questionnaire enables information gathering from a large audience. Questionnaires also offer a quick way to obtain results, and the results can be analysed easily.
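Since several of the reviewed studies rely on SUS, a short sketch of how one participant's SUS answers are usually converted into a single score may be helpful. It assumes the standard SUS scoring rule (ten items rated 1 to 5; odd-numbered items contribute the rating minus 1, even-numbered items contribute 5 minus the rating; the sum is multiplied by 2.5). The function name and the sample ratings are hypothetical and are not taken from any of the reviewed studies.

```python
def sus_score(responses):
    """Return the SUS score (0-100) for one participant's ten ratings."""
    if len(responses) != 10:
        raise ValueError("SUS expects exactly 10 item ratings")
    if any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("each rating must be one of 1, 2, 3, 4, 5")
    total = 0
    for item_number, rating in enumerate(responses, start=1):
        if item_number % 2 == 1:
            total += rating - 1   # odd items are positively worded
        else:
            total += 5 - rating   # even items are negatively worded
    return total * 2.5


# Hypothetical ratings from a single participant (items 1 to 10).
example = [4, 2, 5, 1, 4, 2, 4, 2, 5, 1]
print(sus_score(example))  # prints 85.0
```

Per-participant scores are then typically averaged across the sample. Other instruments mentioned above, such as PSSUQ or USE, use different scales and scoring rules, so this function would not apply to them unchanged.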
6 Conclusion
This review paper aimed to capture the usability evaluation methods that are commonly used for eHealth applications, and this aim has been achieved. The questionnaire is the most often used method for eHealth application usability evaluation. Using two or more methods gives a better evaluation. The results from this review paper provide additional knowledge about the evaluation methods currently used in the eHealth application domain. One of the most important takeaways from this study is that researchers and developers should consider combining multiple usability evaluation methods to gain a more comprehensive understanding of the usability of an eHealth application. Combining questionnaires with usability testing or heuristic evaluation, for instance, could provide a more thorough analysis of usability strengths and weaknesses. In addition, future
research should investigate the efficacy of other evaluation techniques, such as eye-tracking and biometric measurements, which have been demonstrated to provide valuable insights into user behaviour. The need for standardised usability evaluation metrics and guidelines for eHealth applications is an additional significant implication of this review paper. By establishing reliable and consistent evaluation criteria, researchers and developers can ensure that their eHealth applications meet users’ needs and deliver the intended benefits. This can also contribute to the development of more effective and efficient eHealth applications, resulting in improved health outcomes for users. In conclusion, the findings of this review emphasise the significance of employing multiple evaluation methods and establishing standardised usability evaluation metrics for eHealth applications. These insights can guide future research and development in the eHealth domain, creating more effective and user-friendly eHealth applications.
Acknowledgment. The researchers thank Universiti Teknologi MARA (UiTM) for the support of resources and facilities needed for the preparation of this research. This study is currently funded by the Teja Grant 2022 (GDT2022/1–20).
References
1. Ansaar, M.Z., Hussain, J., Bang, J., Lee, S., Shin, K.Y., Young Woo, K.: The mHealth applications usability evaluation review. In: 2020 International Conference on Information Networking (ICOIN) (2020). https://doi.org/10.1109/icoin48656.2020.9016509
2. Brett, J., Boulton, M., Watson, E.: Development of an e-health app to support women prescribed adjuvant endocrine therapy after treatment for breast cancer. Patient Prefer. Adherence 12, 2639–2647 (2018). https://doi.org/10.2147/PPA.S187692
3. Børøsund, E., et al.: A stress management app intervention for cancer survivors: design, development, and usability testing. JMIR Form Res. 2(2), e19 (2018). https://formative.jmir.org/2018/2/e19
4. Calvillo-Arbizu, J., et al.: User-centred design for developing e-Health system for renal patients at home (AppNephro). Int. J. Med. Inform. 125, 47–54 (2019). https://doi.org/10.1016/j.ijmedinf.2019.02.007
5. Cho, H., Powell, D., Pichon, A., Kuhns, L.M., Garofalo, R., Schnall, R.: Eye-tracking retrospective think-aloud as a novel approach for a usability evaluation. Int. J. Med. Inform. 129, 366–373 (2019). https://doi.org/10.1016/j.ijmedinf.2019.07.010
6. Cho, H., Yen, P.-Y., Dowding, D., Merrill, J.A., Schnall, R.: A multi-level usability evaluation of mobile health applications: a case study. J. Biomed. Inform. 86, 79–89 (2018). https://doi.org/10.1016/j.jbi.2018.08.012
7. Davis, R., Gardner, J., Schnall, R.: A review of usability evaluation methods and their use for testing ehealth HIV interventions. Curr. HIV/AIDS Rep. 17(3), 203–218 (2020). https://doi.org/10.1007/s11904-020-00493-3
8. Donald, M., et al.: A web-based self-management support prototype for adults with chronic kidney disease (my kidneys my health): co-design and usability testing. JMIR Form Res. 5(2), e22220 (2021). https://formative.jmir.org/2021/2/e22220
9. Fuller-Tyszkiewicz, M., et al.: A mobile app–based intervention for depression: end-user and expert usability testing study. JMIR Ment Health 5(3), e54 (2018). https://mental.jmir.org/2018/3/e54
10. Gilbertson-White, S., Yeung, C.W., Saeidzadeh, S., Tykol, H., Vikas, P., Cannon, A.: Engaging stakeholders in the development of an ehealth intervention for cancer symptom management for rural residents. J. Rural Health 35, 189–198 (2019). https://doi.org/10.1111/jrh.12297
11. Grasaas, E., et al.: iCanCope with pain: cultural adaptation and usability testing of a self-management app for adolescents with persistent pain in Norway. JMIR Res Protoc. 8(6), e12940 (2019). https://www.researchprotocols.org/2019/6/e12940
12. Hustak, T., Krejcar, O.: Principles of usability in human-computer interaction. In: (Jong Hyuk) Park, J.J., Chao, H.-C., Arabnia, H., Yen, N.Y. (eds.) Advanced Multimedia and Ubiquitous Engineering. LNEE, vol. 354, pp. 51–57. Springer, Heidelberg (2016). https://doi.org/10.1007/978-3-662-47895-0_7
13. Jarvis, J.M., et al.: Usability of the participation and environment measure plus (PEM+) for client-centered and participation-focused care planning. Am. J. Occup. Therapy 73(4), 7304205130p1–7304205130p8 (2019). https://doi.org/10.5014/ajot.2019.032235
14. King, S., Boutilier, J.A., MacLaren Chorney, J.: Managing chronic pain in the classroom: development and usability testing of an ehealth educational intervention for educators. Can. J. Sch. Psychol. 33(2), 95–109 (2018). https://doi.org/10.1177/0829573516674308
15. Larbi, D., Randine, P., Årsand, E., Antypas, K., Bradway, M., Gabarron, E.: Methods and evaluation criteria for apps and digital interventions for diabetes self-management: systematic review. J. Med. Internet Res. 22(7), e18480 (2020). https://doi.org/10.2196/18480
16. Lopes, F., Rodrigues, M., Silva, A.: User-centered development of a mobile app for biopsychosocial pain assessment in adults: usability, reliability, and validity study. JMIR Mhealth Uhealth 9(5), e25316 (2021). https://mhealth.jmir.org/2021/5/e25316
17. Maramba, I., Chatterjee, A., Newman, C.: Methods of usability testing in the development of eHealth applications: a scoping review. Int. J. Med. Inform. 126, 95–104 (2019). https://doi.org/10.1016/j.ijmedinf.2019.03.018
18. Marien, S., et al.: A user-centered design and usability testing of a web-based medication reconciliation application integrated in an eHealth network. Int. J. Med. Inform. 126, 138–146 (2019). https://doi.org/10.1016/j.ijmedinf.2019.03.013
19. Neubert, T., Dusch, M., Karst, M., Beissner, F.: Designing a tablet-based software app for mapping bodily symptoms: usability evaluation and reproducibility analysis. JMIR Mhealth Uhealth 6(5), e127 (2018). https://mhealth.jmir.org/2018/5/e127
20. Pinem, A.A., Yeskafauzan, A., Handayani, P.W., Azzahro, F., Hidayanto, A.N., Ayuningtyas, D.: Designing a health referral mobile application for high-mobility end users in Indonesia. Heliyon 6(1), e03174 (2020). ISSN 2405–8440. https://doi.org/10.1016/j.heliyon.2020.e03174
21. Quintana, M., et al.: Feasibility-usability study of a tablet app adapted specifically for persons with cognitive impairment—SMART4MD (support monitoring and reminder technology for mild dementia). Int. J. Environ. Res. Public Health 17(18), 6816 (2020). https://doi.org/10.3390/ijerph17186816
22. Sin, J., Woodham, L.A., Henderson, C., Williams, E., Sesé Hernández, A., Gillard, S.: Usability evaluation of an eHealth intervention for family carers of individuals affected by psychosis: a mixed-method study. Digit. Health (2019). https://doi.org/10.1177/2055207619871148
23. Soltanzadeh, L., Babazadeh Sangar, A., Majidzadeh, K.: The review of usability evaluation methods on tele health or telemedicine systems. Front. Health Inform. 11(1), 112 (2022). https://doi.org/10.30699/fhi.v11i1.357
24. Vamos, C.A., et al.: The development of a theory-based eHealth app prototype to promote oral health during prenatal care visits. Transl. Behav. Med. 9(6), 1100–1111 (2019). https://doi.org/10.1093/tbm/ibz047
25. Willems, S., et al.: A clinical journey mobile health app for perioperative patients: cross-sectional study. JMIR Hum Factors 8(1), e20694 (2021). https://humanfactors.jmir.org/2021/1/e20694
26. Wronikowska, M.W., et al.: Systematic review of applied usability metrics within usability evaluation methods for hospital electronic healthcare record systems. J. Eval. Clin. Pract. 27(6) (2021). https://doi.org/10.1111/jep.13582
27. Zhou, L., Bao, J., Setiawan, I., Saptono, A., Parmanto, B.: The mHealth app usability questionnaire (MAUQ): development and validation study. JMIR Mhealth Uhealth 7(4), e11500 (2019). https://mhealth.jmir.org/2019/4/e11500
Access Control in Mobile Crowdsensing: Requirements, Challenges and Open Issues
Hajar El Gadi1(B), Hanan El Bakkali1, Driss Benhaddou3, Houda Benbrahim2, Wahiba Abou-zbiba2, and Zaina Maqour1
1 SSL, IT Rabat Center, ENSIAS, Mohammed V University in Rabat, Rabat, Morocco hajar [email protected]
2 IRDA, IT Rabat Center, ENSIAS, Mohammed V University in Rabat, Rabat, Morocco
3 Engineering Technology Department, University of Houston, Houston, TX 77204, USA
Abstract. Many industries, including smart cities, healthcare, and others, have undergone radical change due to the fast-growing number of smart devices and associated sensors. Mobile CrowdSensing (MCS) is currently attracting increasing interest since smartphones now have numerous sensing, computation, and networking capabilities that enable them to carry out complex tasks and exchange data, which enhances the delivery of a variety of services. Even if MCS offers a promising paradigm, providing personalized information often comes at the cost of accessing users’ private information without their consent or with the risk of maliciously manipulating the collected data by unauthorized entities. Therefore, access control has to be enforced in MCS-based applications, as it represents a fundamental security mechanism that can efficiently manage resource access activities by allowing only authorized users to have access to the needed information resources. In the literature, several access control models are available, each with different characteristics that make them more or less suitable for the MCS context. In this paper, we highlight the main concepts and major limitations of the most used access control models through recent work from the MCS literature. Then, we deduce the key requirements of access control in the context of mobile crowdsensing. Finally, we provide future directions for research on access control for MCS.
Keywords: Access Control · Mobile Crowdsensing · AC Requirements
1 Introduction
The Internet of Things (IoT) [24] is a tool for enhancing many aspects of both private and public life. It has a wide range of applications, from health care to transportation, from the environment to business. Mobile crowdsensing, or MCS [22], is a recent IoT trend that has transformed this technology into a vital sensing mechanism. Mobile crowdsensing denotes a broad class of sensing approaches in which a group of individuals collectively collect and share data to extract useful information related to a particular domain of common interest. This new sensing paradigm, promising as it is, is built mainly on data collection [4], and as the technology develops, data is continuously transmitted and shared. Various types and large volumes of data are involved, and sensitive and private content could be shared without sufficient protection measures. Therefore, it brings many security challenges as well [3,28,33]. Data loss, unauthorized data access, and data tampering are among the most important threats affecting MCS. Thus, security services such as authentication, integrity, confidentiality, and access control [5] are critical in this sharing environment to bring trust in the use of MCS. More specifically, access control provides opportunities for overcoming the aforementioned MCS security challenges and their privacy issues. When it is properly designed and enforced, it can effectively monitor resource access and prevent unauthorized information flow. In this paper, we focus on access control as one of the most important aspects of security and privacy in MCS. We explore the following categories of access control models: Mandatory access control (MAC), Discretionary access control (DAC), Role-based access control (RBAC), and Attribute-based access control (ABAC), presenting their main concepts and limitations through recent work from the existing literature. This allows us to identify the access control challenges and key requirements in an MCS context. Furthermore, we determine open issues and future research directions for developing efficient access control models in a mobile crowdsensing context.
The remainder of this paper is structured as follows: Sect. 2 introduces access control and its models along with their concepts and limitations, while recent access control work from the MCS literature is presented in Sect. 3. In Sect. 4, the main challenges of access control models in the recent MCS literature are highlighted, Sect. 5 extracts the key requirements of access control in MCS, and Sect. 6 concludes the paper and introduces open problems for research.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023. K. Arai (Ed.): SAI 2023, LNNS 739, pp. 411–421, 2023. https://doi.org/10.1007/978-3-031-37963-5_29
2 Background
In this section, the main concepts of access control and MCS are introduced; then the conventional AC models are presented along with their core approaches and limitations in an MCS context.
2.1 Mobile Crowdsensing Concepts
The authors of [26] formally define MCS as a platform that enables citizens equipped with sensor devices (smartphones, tablets, wearable devices) to collect and transmit sensor data and to process a massive amount of information through social resources, without any additional cost for deploying sensors.
The authors of [17] consider MCS a class of applications where individuals with sensing and computing devices collectively share data and extract information to measure, map, analyze, estimate, or infer (predict) processes or phenomena of common interest. Because of its flexibility and ubiquity, MCS is an effective and efficient method for data collection and processing. However, despite the benefits it brings, MCS still suffers from some limitations and threats related to data security, trust, and privacy, such as data loss or falsification, identity spoofing, and DoS attacks [26].
2.2 Access Control Related Definition
Access control [9,15] is an important part of information security technology. Another term for access control is authorization, which is the process of verifying a user’s privileges and access rights to system resources after authentication. It determines whether or not access to the resource is granted to an authenticated user. The fundamental objective of any access control system is to limit a user’s access to what they should be able to do and to protect information from unauthorized access [23]. An access control mechanism comprises identification, authentication, and authorization [25]. In the identification phase, the subject (user) presents credentials in order to be authenticated. After providing legitimate credentials, the user is allowed to access only those resources that have been granted by an administrator (or the resource owner) through access control permissions/rules. The authorization logic is formalized in the user’s permissions and access control rules. Authentication, on the other hand, is the verification, by a trusted identity provider, of the identity of a user (subject) requesting access to the resources of a system: hardware, service, application, or others. It can be done using several methods based on knowledge (password), possession (token, badge, OTP), biometrics (fingerprints, iris, facial recognition), and behavior (gesture, action). Multi-factor authentication can be achieved by combining two or more of the above-mentioned authentication methods to verify the user’s identity. It is strongly recommended because it complicates the task of an attacker [30]. While authentication mechanisms ensure that system users are who they say they are, they do not specify which operations users should or should not perform within the system. Authorization is required to provide security measures that regulate subject access to objects.
2.3 Conventional Access Control Models
There are a wide variety of methods, models, technologies, and administrative capabilities used to propose and design access control systems [11,29]. The following introduces the main models and their concepts:
DAC: a policy of restricting object access based on the identity of users, the groups to which they belong, or both. The object’s “owner” has the control permission to grant access permissions on the object to other subjects. These rights are mostly represented by an access matrix or an access control list (ACL) [18].
MAC: a security policy in which the system strictly constrains the permissions of a subject to access an object. In MAC models, a security level is assigned to each subject (its clearance level) and to each object (its classification level). Four security levels are used: TS (Top Secret), S (Secret), C (Confidential), and U (Unclassified) [34]. MAC is controlled by the administrator and is classified into two types: multilevel security, associated with the Bell-LaPadula confidentiality model and the Biba integrity model, and multilateral security, associated with the Chinese Wall model [12].
RBAC: Role-based access control [18] is an approach to restricting system access to authorized users. People are categorized into roles that correspond to the enterprise’s organizational structure, and permissions are then allocated to these roles rather than being given to specific individuals. A user must activate one of their roles (or a subset of their roles) as part of a session (through a Principal/Subject), which is typically determined by the login and password information provided during the authentication procedure. RBAC has three model components: Core RBAC, Hierarchical RBAC, and Constrained RBAC.
ABAC: ABAC [13] is an authorization model that grants user access through attributes rather than roles. ABAC bases access decisions on the attributes (characteristics) of the subject or user making the access request, the resource being requested, what the user intends to do with the resource, and the environment (geolocation, network, etc.) or context of the request. In addition to the AC components introduced above, this model involves subject attributes, such as name, date of birth, home address, education record, and job function; object (or resource) attributes, which describe and identify the object; and environmental conditions, which are dynamic, subject- and object-independent factors such as time of day, location, threat level, and temperature.
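To make the differences between the four conventional models concrete, the following Python sketch shows one minimal decision function per model. It is an illustrative sketch only: the users, objects, roles, attribute names, and the sample ABAC rule are hypothetical, and the MAC check assumes the Bell-LaPadula confidentiality rules ("no read up", "no write down").

```python
# DAC: the owner maintains an access control list (ACL) per object.
acl = {"trip_log": {"alice": {"read", "write"}, "bob": {"read"}}}

def dac_allows(user, obj, action):
    return action in acl.get(obj, {}).get(user, set())

# MAC: the system compares security levels (Bell-LaPadula confidentiality).
LEVELS = {"U": 0, "C": 1, "S": 2, "TS": 3}

def mac_allows(clearance, classification, action):
    if action == "read":    # no read up
        return LEVELS[clearance] >= LEVELS[classification]
    if action == "write":   # no write down
        return LEVELS[clearance] <= LEVELS[classification]
    return False

# RBAC: permissions attach to roles; a user acts through an activated role.
role_permissions = {
    "requester": {("task", "create")},
    "participant": {("task", "read"), ("reading", "submit")},
}
user_roles = {"alice": {"requester"}, "bob": {"participant"}}

def rbac_allows(user, active_role, obj, action):
    if active_role not in user_roles.get(user, set()):
        return False
    return (obj, action) in role_permissions.get(active_role, set())

# ABAC: a rule is a predicate over subject, object, action and environment.
def abac_allows(subject, obj, action, env):
    return (action == "read"
            and subject.get("job_function") == "analyst"
            and obj.get("sensitivity") != "high"
            and 8 <= env.get("hour", -1) < 18)

print(dac_allows("bob", "trip_log", "write"))              # False
print(mac_allows("S", "TS", "read"))                       # False
print(rbac_allows("bob", "participant", "reading", "submit"))  # True
print(abac_allows({"job_function": "analyst"},
                  {"sensitivity": "low"}, "read", {"hour": 10}))  # True
```

Only the ABAC function takes the environment into account; the other three decide from identity, level, or role alone, which is the limitation that the rest of the paper returns to in the MCS context.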
3 AC Recent Work in MCS
Many access control models have been proposed to address the limitations mentioned above as well as the main security and privacy challenges in MCS. In this section, we cite recent AC work from the MCS literature, summarized in Table 1 below:
Table 1. Enhanced AC Models.

| Work | Access control system | Characteristics | Consideration |
|---|---|---|---|
| Ning Ye et al. (2014) [36] | ABAC-based authorization | Lower storage and communication overheads; fine-grained AC | IoT resource restrictions |
| Xinwen Zhang et al. (2008) [37] | UCON | Layered approach with policy, enforcement, and implementation | Pre-access execution; execution process; authorization issue |
| A. A. E. Kalam et al. (2003) [19] | OrBAC | Separate concrete and abstract levels; a variety of context data types: historical, spatial, temporal, and user-declared information | RBAC model enhanced |
| Jose L. Hernandez et al. (2013) [1] | CapBAC | Capabilities-based | IoT industry; large-scale projects |
| Lawrence Kerr et al. (2016) [20] | MAC-based | Use of a number of security and environment attributes | MAC model improved |
| Sangsig Kim et al. (2014) [21] | RBAC and MAC combined | Administration flexibility | Reducing the complexity and error proneness of development |
| Yaira K. Rivera et al. (2019) [31] | RBAC and MAC combined | Information confidentiality | Health information system technologies |
| Ezedine Barka et al. (2015) [10] | RBAC-based | Establishes access control policies for WoT; use of cryptographic keys | Web of things; device security and privacy |
| Biwen Chen et al. (2021) [14] | Fog-assisted MCS anonymous bilateral AC | Simultaneous traceability, user anonymity, data confidence, and fine-grained bilateral AC | MCS anonymization |
| Dengpan Ye et al. (2016) [35] | Attribute-tree based | Fine-grained approach; context recognition; increased accuracy | Context awareness |
| Jingwei Wang et al. (2021) [32] | FGTAC model | Fine-grained task access management | MCS task management |
| S. Fugkeaw (2021) [16] | LW-C-CP-ARBE model | Lightweight decryption for data from third parties; minimizes the computation on mobile devices by delegating it to a proxy server | Mobile cloud environment |
The authors in [36] suggested an effective authentication and access control system for the perception layer of the Internet of Things using an ABAC-based authorization technique. It offers considerably lower storage and communication overheads to address IoT resource restrictions while maintaining fine-grained access control. The UCON model proposed in [37] is regarded as a next-generation access control model. Compared to conventional access control methods like RBAC and ABAC, it introduces a number of improvements: the authorization issue is addressed continuously, before the access execution, throughout the execution, and afterwards. The OrBAC model proposed in [19] addresses existing issues in RBAC models. It distinguishes between the concrete level (user, object, action) and the abstract level (roles, views, activities) by adding the concept of “organization” as a new dimension. The context data included in this model range from historical to spatial, temporal, and user-declared information. The CapBAC access control presented in [1] is based on capabilities. It has been used extensively in the IoT industry and implemented in numerous large-scale projects. The authors of [20] proposed an access control approach centered strictly on the MAC model, enriched with a number of security and environment attributes. The authors of [21] and [31] proposed access control models combining RBAC and MAC, using the flexibility offered by RBAC in terms of role administration and the confidentiality of information provided by MAC under the “need-to-know” principle. The web of things (WoT) methodology and the RBAC model are used as a foundation by the authors of [10] to develop an access control mechanism that improves device security and privacy. The researchers in [14] created an effective fog-assisted MCS anonymous bilateral access control mechanism. The protocol can simultaneously offer traceability, user anonymity, data confidence, and fine-grained bilateral access control. To recognize the context with a fine-grained approach, a model for context awareness is proposed in [35]. This study also introduces an access control mechanism based on attribute trees to increase accuracy. The authors of [32] proposed the FGTAC system for fine-grained task access management in mobile crowdsensing, where the access policy of the encrypted task is completely hidden in the system to protect the privacy of the requester and task performers. The study in [16] introduced the LW-C-CP-ARBE model, which provides lightweight decryption and write access for data that is outsourced in collaborative mobile cloud computing. Other hybrid models are presented in [7].
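The idea of continuous authorization attributed to UCON above (a check before access, re-checks while access is ongoing, and an update afterwards) can be illustrated with a small Python sketch. This is a simplified illustration under our own assumptions and not the model of [37]: the `Session` class, the `zone` and `uses` attributes, and the sample policy are all hypothetical.

```python
class Session:
    """Toy usage session whose decision is re-evaluated while access is ongoing."""

    def __init__(self, attrs, policy):
        self.attrs = dict(attrs)   # mutable subject/context attributes
        self.policy = policy       # predicate(attrs) -> bool
        self.active = False
        self.started = False

    def try_access(self):
        # Pre-authorization: evaluate the policy before usage starts.
        self.active = self.policy(self.attrs)
        self.started = self.started or self.active
        return self.active

    def on_attribute_change(self, **changes):
        # Ongoing authorization: re-check whenever an attribute changes.
        self.attrs.update(changes)
        if self.active and not self.policy(self.attrs):
            self.active = False    # revoke access in mid-session
        return self.active

    def end(self):
        # Post-usage update: record the usage so later decisions can see it.
        if self.started:
            self.attrs["uses"] = self.attrs.get("uses", 0) + 1
        self.active = False
        self.started = False


def in_zone_with_quota(attrs):
    # Hypothetical policy: contributor must be in the sensing zone, under quota.
    return attrs.get("zone") == "downtown" and attrs.get("uses", 0) < 3


s = Session({"zone": "downtown", "uses": 0}, in_zone_with_quota)
print(s.try_access())                        # True: pre-check passes
print(s.on_attribute_change(zone="suburb"))  # False: revoked during usage
s.end()
print(s.attrs["uses"])                       # 1: usage recorded afterwards
```

A conventional RBAC or ABAC check would correspond to `try_access` alone; the re-evaluation on attribute change is what distinguishes the continuous style of decision.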
4 Access Control Challenges in MCS
MCS is a dynamic environment with inherent characteristics and a rising research field, as it is a crucial part of IoT systems. It is currently one of the most active research areas in the field of information science and technology because many different sensors are connected and managed. MCS provides cost savings to the government while steadily enhancing the quality of life for citizens. By combining a sizable amount of Big Data, it delivers smart services, allows more intelligent decision-making, and enhances collaboration and coordination between various entities, including citizens. One major example of this paradigm’s use is in smart cities, which involve a variety of subsystems.
The main challenges in such applications [26,27] are mainly centered on reliability and availability in both data sensing and transfer, within the network of sensors and to the external network. In this context, reliability refers to the sensors’ availability, accuracy, communication, and system robustness, particularly in the face of potentially harmful outdoor conditions. Thus, efficient data acquisition, event synchronization, ordering, and response time in emergency situations are essential. Integrity and confidentiality are also crucial, because a third party can deduce information from the sensing activity. Additionally, data of various granularities and sensitivity related to user privacy are sent as a result of in-network data aggregation activities, and these data have to be protected. Privacy is more crucial than ever when it comes to the collection of personal data by smartphones and other forms of pervasive sensors. Future research must prioritize the integration of security and privacy preservation measures if it is to succeed in gaining user permission, confidence, and adoption of Smart Cities. Establishing consumer confidence in emerging technologies must be the primary concern; otherwise, people will not trust the services provided by Smart Cities. As for usability, non-expert users are likely to use these services. They may not be familiar with the inner workings of the security mechanism, so it is crucial that the system be perceived as straightforward, clear, and unobtrusive from their point of view. Access control models that reduce user effort in system management and enable the more autonomous establishment of a security context are therefore desired.
Providing an adequate access control model for MCS is a vital but challenging topic. Indeed, authentication and authorization issues have been intensively investigated through existing protocols for use cases outside constrained environments. However, in constrained environments, those issues [6] are still in their infancy. In fact, additional and different requirements pose challenges to the use of various security protocols. In particular, the need arises for a dynamic and fine-grained access control mechanism where users and resources are constrained.
5 Access Control Requirements in MCS
Access control must satisfy the main security properties mentioned above: confidentiality (preventing unauthorized resource divulgation), integrity (preventing resource modification without authorization), availability (assuring access to resources by legitimate users when needed), and privacy (the ability of an entity to determine “if”, “when”, and “to whom” personal information may be shared or disclosed), as they present the main security challenges for this type of application [27]. In addition, [2] and [8] present different features for access control systems in an IoT environment, from which we can derive the AC requirements in an MCS context. The access control requirements are summarized in Table 2 below:
Table 2. Access Control Key Requirements.

| Requirement | Summary |
|---|---|
| Usability | Access control must be easily administered, expressed, modified and operated. It must allow high-level rules/conditions of access rights for better management of the increased complexity of MCS |
| Transparency | Access control models must enable transparent access for legitimate users while separating unauthorized users. Users should understand who knows what about them. Users should know how their data will be used, with whom it is shared, and how long it is kept |
| Dynamicity | Access control models should be dynamic, with the ability to change policies at runtime based on the system requirements |
| Flexibility | Access control models should adapt to different contexts |
| User-driven | Access control models should allow users to have full and granular access control over the data they share on the network or in the cloud |
| High granularity | Access control must be fine-grained and allow for the protection of sensitive assets |
| Auditability | Access control models must be designed to enable the system to report access attempts, especially failed ones |
| Context-awareness | Access control models should support dynamic access rights, as the context may change, such as the identity of the subject, its history, the time the request was made, the security context, the network availability, or the workload of the device |
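Several of the requirements in Table 2 can be read as concrete properties of a policy decision point. The sketch below is a minimal illustration under our own assumptions, not a proposed MCS mechanism: the `PolicyEngine` class, the rule names, and the request fields (`subject`, `owner`, `granularity`, `context`) are hypothetical. It shows rules that can be changed at runtime (dynamicity), an owner-controlled rule (user-driven), a rule that depends on request context (context-awareness), and a log of every attempt (auditability).

```python
from datetime import datetime, timezone


class PolicyEngine:
    def __init__(self):
        self.rules = {}       # rule name -> predicate(request) -> bool
        self.audit_log = []   # one entry per decision

    def set_rule(self, name, predicate):
        # Dynamicity: rules can be added or replaced while the system runs.
        self.rules[name] = predicate

    def remove_rule(self, name):
        self.rules.pop(name, None)

    def decide(self, request):
        # Deny by default; grant only if at least one rule matches.
        granted = any(rule(request) for rule in self.rules.values())
        # Auditability: record every attempt, including failed ones.
        self.audit_log.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "subject": request.get("subject"),
            "resource": request.get("resource"),
            "granted": granted,
        })
        return granted

    def failed_attempts(self):
        return [entry for entry in self.audit_log if not entry["granted"]]


def owner_full_access(request):
    # User-driven: the data owner can always reach their own data.
    return request.get("subject") == request.get("owner")


def coarse_location_in_daytime(request):
    # Context-awareness: decision depends on granularity and time of request.
    hour = request.get("context", {}).get("hour", -1)
    return request.get("granularity") == "coarse" and 6 <= hour < 22


engine = PolicyEngine()
engine.set_rule("owner", owner_full_access)
request = {"subject": "city_service", "owner": "alice",
           "resource": "gps_trace", "granularity": "coarse",
           "context": {"hour": 14}}
print(engine.decide(request))   # False: no rule covers this request yet
engine.set_rule("coarse_daytime", coarse_location_in_daytime)
print(engine.decide(request))   # True: policy changed at runtime
print(len(engine.failed_attempts()))  # 1
```

Usability and transparency are not captured by code of this kind; they depend on how such rules are presented to, and understood by, non-expert users.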
6 Conclusion and Open Issues
An access control model generally seeks to establish a permission relationship between the subject and the object. However, the access control models presented in this paper continue to assign permissions to users based on “who”, i.e., the user’s identity. When it comes to mobile devices, these models may not be able to guarantee the security of sensitive data. The device’s mobility causes uncertainty about its location, environment, and other contexts. Even the same user should be given different permissions depending on the situation. When creating and implementing an access control model, there are various challenges to take into account because of the openness, heterogeneity, and nature of the MCS system. Compared to other domains, the MCS paradigm introduces unique access control challenges. While we reviewed the most common models and recent work on them, much work still needs to be done. Judged against the requirements deduced above, none of the models presented in this paper is capable of addressing all of the numerous challenges posed by MCS systems. New research on innovative AC models and mechanisms is thus required to provide a suitable solution for MCS applications. Two key questions can be used to determine future work: (1) Do we need to use current access control models and mechanisms while attempting to fix their flaws in order to adapt them to the MCS environment (a conservative approach), or create new models in order to satisfy the requirements of crowdsensing (an evolutionary approach)? (2) From an architectural point of view, is it more efficient to manage access control using a distributed, centralized, or hybrid approach?
Acknowledgments. This research received funding from the Moroccan Ministry of Equipment, Transport and Logistics (METL) and the National Road Safety Agency (NARSA), and was supported by the Moroccan National Center for Scientific and Technical Research (CNRST). The author Hajar El Gadi received the Fulbright scholarship.
References 1. Hernandez-ramos, J.l.: Distributed capability-based access control for the internet of things (2013) 2. Alnefaie, S.: A survey on access control in IoT: models, architectures and research opportunities (2021) 3. Liu, X.: Security, privacy and trust challenges in mobile crowdsensing (2021) 4. Data-oriented mobile crowdsensing: a comprehensive survey (2019) 5. A survey on mobile crowdsensing systems: challenges, solutions, and opportunities (2019) 6. Abou-zbiba, W., El Gadi, H., El Bakkali, H., Benbrahim, H., Benhaddou, D.: A novel mobile CrowdSensing architecture for road safety, vol. 183 (2021) 7. Aftab, M.U.: Traditional and hybrid access control models: a detailed survey. Secur. Commun. Netw. 2022, (2022) 8. Alnefaie, S., Alshehri, S., Cherif, A.: A survey on access control in IoT: models, architectures and research opportunities. Int. J. Secur. Netw. 16, 60–76 (2021) 9. Ausanka-Crues, R.: Methods for access control : advances and limitations (2001) 10. Barka, E., Mathew, S.S., Atif, Y.: Securing the web of things with role-based access control. In: El Hajji, S., Nitaj, A., Carlet, C., Souidi, E.M. (eds.) C2SI 2015. LNCS, vol. 9084, pp. 14–26. Springer, Cham (2015). https://doi.org/10.1007/978-3-31918681-8 2 11. Bertin, E., Hussein, D., Sengul, C., Frey, V.: Access control in the internet of things: a survey of existing approaches and open research questions. Annales des Telecommun./Ann. Telecommun. 74, 375–388 (2019) 12. Bugiel, S., Heuser, S., Sadeghi, A.R.: Flexible and fine-grained mandatory access control on android for diverse security and privacy policies (2013) 13. Chen, B., Wang, Z., Xiang, T., Yang, L., Yan, H., Li, J.: ABAC: anonymous bilateral access control protocol with traceability for fog-assisted mobile crowdsensing. In: Tan, Y., Shi, Y., Zomaya, A., Yan, H., Cai, J. (eds.) DMBD 2021. CCIS, vol. 1454, pp. 430–444. Springer, Singapore (2021). https://doi.org/10.1007/978-98116-7502-7 40
420
H. El Gadi et al.
14. Chen, B., Wang, Z., Xiang, T., Yang, L., Yan, H., Li, J.: Abac: anonymous bilateral access control protocol with traceability for fog-assisted mobile crowdsensing. vol. 1454 CCIS (2021) 15. Colombo, P., Ferrari, E.: Access control technologies for big data management systems: literature review and future trends. Cybersecurity 2, 12 (2019) 16. Fugkeaw, S.: A fine-grained and lightweight data access control model for mobile cloud computing. IEEE Access 9, 836–848 (2021) 17. Ganti, R.K., Ye, F., Lei, H.: Mobile crowdsensing: Current state and future challenges. IEEE Commun. Mag. 49, 32–39 (2011) 18. Ubale Swapnaja, A., Modani Dattatray, G., Apte Sulabha, S.: Analysis of DAC mac RBAC access control based models for security. Int. J. Comput. Appl. 104, 6–13 (2014) 19. Kalam, A.A.E., et al.: Organization based access control. In: Proceedings - POLICY 2003: IEEE 4th International Workshop on Policies for Distributed Systems and Networks, pp. 120–131 (2003) 20. Kerr, L., Alves-Foss, J.: Combining mandatory and attribute-based access control. In: Proceedings of the Annual Hawaii International Conference on System Sciences, 2016-March:2616–2623, March 2016 21. Kim, S., Kim, D.K., Lu, L., Song, E.: Building hybrid access control by configuring RBAC and mac features. Inf. Softw. Technol. 56, 763–792 (2014) 22. Liu, J., Shen, H., Narman, H.S., Chung, W., Lin, Z.: A critical component for the internet of things, A survey of mobile crowdsensing techniques (2016) 23. Liu, Z., Gu, W., Xia, J.: Review of access control model (2020) 24. Madakam, S.: Internet of things (IoT): a literature review. J. Comput. Commun. 03, 164–173 (2015) 25. Mishra, R., Yadav, R.: Access control in IoT networks: analysis and open challenges. SSRN Electron. J. (2020) 26. Nguyen, T.N., Zeadally, S.: Mobile crowd-sensing applications: Data redundancies, challenges, and solutions. ACM Trans. Internet Technol. (TOIT) 22 (2021) 27. 
Ouaddah, A., Mousannif, H., Abou Elkalam, A., Ouahman, A.A.: Access control in the internet of things: Big challenges and new opportunities (2017) 28. Owoh, N.P., Singh, M.M.: Security analysis of mobile crowd sensing applications. Appl. Comput. Inf. 18, 2–21 (2022) 29. Penelova, M.: Access control models. Cybern. Inf. Technol. 21, 77–104 (2021) 30. Shantha, R., Joshitta, M., Arockiam, L.: Authentication in IoT environment: a survey. Int. J. Adv. Res. Comput. Sci. Softw. Eng. 6, 2277 (2016) 31. Sanchez, Y.K.R., Demurjian, S.A., Baihan, M.S.: A service-based RBAC & MAC approach incorporated into the FHIR standard. Digital Commun. Netw. 5, 214–225 (2019) 32. Wang, J., Yin, X., Ning, J.: Fine-grained task access control system for mobile crowdsensing. Secur. Commun. Netw. 2021, 1–13 (2021) 33. Xiong, J.B., Bi, R.W., Tian, Y.L., Liu, X.M., Ma, J.F.: Security and privacy in mobile crowdsensing: models, progresses, and trends. Jisuanji Xuebao/Chin. J. Comput. 44, 1949–1966 (2021) 34. Xu, L., Zhang, H., Du, X., Wang, C.: Research on mandatory access control model for application system. vol. 2 (2009)
Access Control in Mobile Crowdsensing
421
35. Ye, D., Mei, Y., Shang, Y., Zhu, J., Ouyang, K.: Mobile crowd-sensing context aware based fine-grained access control mode. Multimedia Tools Appl. 75(21), 13977–13993 (2015). https://doi.org/10.1007/s11042-015-2693-3 36. Ye, N., Zhu, Y., Wang, R.-C., Malekian, R., Qiao-Min, L.: An efficient authentication and access control scheme for perception layer of internet of things. Appl. Math. Inf. Sci 8, 1617–1624 (2014) 37. Zhang, X., Nakae, M., Covington, M.J., Sandhu, R.: Toward a usage-based security framework for collaborative computing systems. ACM Trans. Inf. Syst. Secur. (TISSEC) 11, 2 (2008)
A Game with a Purpose for Building Crowdsourced Semantic Relations Datasets for Named Entities

André Fernandes dos Santos and José Paulo Leal
CRACS & INESC Tec LA, Faculty of Sciences, University of Porto, Porto, Portugal
[email protected], [email protected]
Abstract. Semantic measures evaluate and compare the strength of relations between entities. To assess their accuracy, semantic measures are compared against human-generated gold standards. Existing semantic gold standards are mainly focused on concepts. Nevertheless, semantic measures are frequently applied both to concepts and instances. Games with a purpose are used to offload to humans computational or data collection needs, improving results by using entertainment as motivation for higher engagement. We present Grettir, a system which allows the creation of crowdsourced semantic relations datasets for named entities through a game with a purpose where participants are asked to compare pairs of entities. We describe the system architecture, the algorithms and implementation decisions, the first implemented instance (dedicated to the comparison of music artists) and the results obtained.

Keywords: Semantic Relations · Crowdsourcing · Dataset · Gamification

1 Introduction
Semantic measures (SMs) evaluate how close or how related the meanings of two things are (e.g. concepts, words, sentences or named entities). Due to the inherently psychological nature of this process, SMs are evaluated using datasets that average the human perception of those semantic relationships. Several such datasets are available. They are usually built by asking participants to rate relationships between pairs of things. Requesting a numeric value makes the result easier for computers to use. People, however, have an easier time comparing things than assigning numeric values. Moreover, most of these datasets focus on the comparison of concepts, with very few concerning named entities, even though semantic measures are frequently applied to named entities as well.

To address both shortcomings of the current methods for building semantic datasets, we present Grettir, a platform for creating crowdsourced gold standards for semantic measures between named entities. Grettir can be used to implement games in which users are asked to pick the most related pair of named entities among a set of three. The players' choices are used to generate a list of pairs of entities sorted by the strength of their semantic relation.

Shooting Stars is the first, and currently the only, instance of Grettir. The theme for this game is music artists. Players are presented with three music artists and are asked to find which artist is less related to the other two. Figure 1 presents the main view of Shooting Stars.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023. K. Arai (Ed.): SAI 2023, LNNS 739, pp. 422–439, 2023. https://doi.org/10.1007/978-3-031-37963-5_30
Fig. 1. Shooting Stars Main View.
The contributions described in this paper can be summarized as follows:
1. We propose using games to build semantic relations datasets to increase the motivation of the users.
2. We created a framework for generating such games in which the players' activity builds a semantic relations dataset.
3. We generated a first instance dedicated to musical artists.
4. We obtained a small dataset consisting of a list of pairs of music artists sorted by the strength of their relationship.
5. We performed usability and validation tests.

This paper is structured as follows. Section 2 provides a brief overview of the field of semantic measures, their evaluation using semantic datasets and how gamification has been used to build crowdsourced human-generated datasets. Section 3 describes our approach for implementing Grettir. Section 4 presents its architecture, and Sect. 5 details the implementation and the main algorithms. Sections 6 and 7 present, respectively, the methodology followed for testing and validating the first instance of Grettir, and the results obtained in those tests, which are then discussed in Sect. 8. Section 9 summarizes the research described in this paper, lists some future improvements and highlights its main contributions.
2 Background

2.1 Semantic Measures
Semantic measures (SMs) evaluate how close or how related the meanings of two things are. SMs are widely used to identify the nature and strength of the semantic relationships between concepts, words, sentences or named entities, among other elements. This evaluation is useful in areas such as computational linguistics, language understanding, bioinformatics and information retrieval.

Semantic measures are based on the analysis of information describing elements extracted from semantic sources. These semantic sources can be classified as either unstructured (e.g. plain text), semi-structured (e.g. dictionaries), or structured. The latter include a large range of computer understandable resources, from structured vocabulary to highly formal knowledge representations.

Semantic similarity takes into account only taxonomic relationships. Semantic relatedness considers all types of relationships. While car and train are similar concepts because both of them are types of vehicles, car and wheel are related concepts because the latter is part of the former [13].

The knowledge-based approach to computing semantic measures relies on semantic graphs extracted from structured sources. The properties of a semantic graph, and of its nodes and edges, contain semantic evidence regarding the interconnections and the semantics of relationships between elements. This information is analyzed to produce a (usually numeric) value. Within this approach, several methods have been defined to compare elements in single and multiple knowledge bases: information theoretical methods [1,8], feature-based methods [9,10] and structural methods. The latter use the graph structure (nodes and arcs) to compare elements, relying on graph-traversal approaches, such as shortest path or random walk techniques.

Several publicly available semantic graphs are currently composed of billions of nodes and edges; DBpedia, Freebase, OpenCyc, Wikidata and YAGO have been compared by [4]. Such graphs frequently include not only information regarding concepts and their relations, but also instances. Several path-based semantic measures applied to such semantic graphs can be used to compare concepts with concepts, but also concepts with instances or instances with instances [3,7].
2.2 Evaluation of Semantic Measures
Accuracy of a measurement can be defined as “closeness of that measurement to the real value being measured” [2]. When discussing SMs, however, the notion of “real value” is far from straightforward: it only has meaning when compared against values of the same measure for other pairs of entities (i.e. an arbitrary measure reporting house and building as having a similarity of 1 is meaningless on its own), and it represents an inherently subjective appreciation. Consequently, there are two methods for evaluating the accuracy of SMs: directly, through comparison with averaged values reported by humans, or indirectly, by measuring the performance of applications which are highly dependent on the semantic measures (e.g. term disambiguation, classification) [35].

When evaluating SMs directly, gold standard datasets are often gathered by recruiting a variable number of people and asking them to rate (i.e. to assign a numeric value to) the semantic relationship between pairs of entities. In recent years, crowdsourced marketplaces such as Amazon Mechanical Turk have been leveraged to build datasets by providing online workers with a small monetary reward to complete the same task [5,11,12]. Asking people to rate relationships presents several challenges [15]:
1. It requires assigning a numeric value to a subjective appreciation.
2. It requires remembering the values attributed to previous pairs, so that the current rating can be contextualized.
3. It should also require knowledge of the next pairs, for the same reason.
4. The necessary precision is not known beforehand.

Instead of requiring participants to assign a numeric value to semantic relations, when building the MEN dataset, [6] asked them to choose the most related pair among two candidate pairs. According to the authors it would “constitute a more natural way to evaluate the target pairs, since humans are comparative in nature”, among other operational reasons.

In the process of direct evaluation, SMs are applied to the pairs contained in a gold standard dataset. Then, the correlation between the resulting values and the ones in the dataset is calculated. Measuring the correlation allows focusing on covariance rather than the absolute values. Pearson’s correlation coefficient [20], a measure of the linear correlation between two sets of values, is one of the most common ways of evaluating how similar the results of a SM and a semantic dataset are. Some measures and datasets do not provide a numeric value for the semantic relationships between entities, but instead give a list of pairs, sorted by the strength of their relationship, e.g. f(house, building) > f(house, phone) > f(phone, steak). In such cases, where only the order (or rank) is relevant, it is common to use Spearman’s rank correlation coefficient [19] instead.

Semantic gold standards play an important role in the direct evaluation of semantic measures. The quality of the datasets themselves, on the other hand, is often considered to be the inter-agreement of participants – the correlation
between the scores given by human annotators – which is often calculated using the same approaches: Pearson’s or Spearman’s correlation coefficients.

Despite the importance of benchmarks in the direct evaluation of semantic measures, and the previously mentioned fact that semantic measures are often applied to instances, most available benchmarks are focused on concepts. Harispe et al., for example, provide a list of more than 20 existing semantic measures datasets [35], but report only one [3] as including entities.

2.3 Games Beyond Entertainment
Serious games are games whose primary purpose is not entertainment [14]. These include games for training and simulation, education, health and tourism, among others [16]. Serious games can be classified according to their gameplay, purpose and scope under the G/P/S model [14].

Casual games are a genre with a loose and somewhat controversial definition [33], but one that typically includes games with simple controls, easy-to-learn gameplay and support for short play sessions [31]. It includes older games such as the Solitaire card game from Windows 3.0, and Tetris [32]. More recently, casual games have become increasingly popular on mobile devices [34].

Games with a purpose (GWAPs) are games in which computational processes are outsourced to humans in an entertaining way [17,18]. This means that players are expected to play the game for fun but, whether they are aware of it or not, while they play they are also contributing to some other goal. GWAPs have been used for music and sound annotation [21] and metadata validation [22], corpora annotation [23], sentiment analysis [24], and more, along with tasks related to linked data and the semantic web [25–27]. Siorpaes and Hepp, notably, have done extensive research on using GWAPs to help weave the Semantic Web [28,29]. A more recent example is WordGuess, a GWAP for vocabulary training [30].
3 Approach
Grettir is a platform to build semantic relationship datasets for named entities. These datasets can be used as gold standards in the evaluation of semantic measures. To improve the process of enlisting human participants, the system is implemented as a casual game in which participants compete asynchronously with each other while their answers are used to build the dataset. Players are asked to pick the odd element (the intruder) in a group of three elements. Implicitly, they are choosing, from a set of three entities {A, B, C}, which pair is the most strongly related. For example, if the player picks the entity A as the intruder, they are simultaneously asserting that the elements of the pair {B, C} are more related to each other than the elements of either {A, B} or {A, C}. This allows sorting the pairs of entities without explicitly asking the players to assign a numeric value to the strength of the relationships.
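The way a single intruder pick translates into pair orderings can be illustrated with a minimal Python sketch. The function names `implied_pair` and `implied_orderings` are hypothetical and not part of Grettir; the sketch only restates the rule given above, namely that picking A from {A, B, C} asserts that {B, C} is more related than both {A, B} and {A, C}.

```python
from itertools import combinations


def implied_pair(triple, intruder):
    """Return the pair that remains once the intruder is removed from the triple."""
    if intruder not in triple:
        raise ValueError("intruder must be one of the three entities")
    return frozenset(e for e in triple if e != intruder)


def implied_orderings(triple, intruder):
    """List the (stronger_pair, weaker_pair) statements implied by one pick."""
    winner = implied_pair(triple, intruder)
    others = [frozenset(p) for p in combinations(triple, 2) if frozenset(p) != winner]
    return [(winner, other) for other in others]


if __name__ == "__main__":
    # Picking "A" as the intruder ranks {B, C} above {A, B} and above {A, C}.
    for stronger, weaker in implied_orderings(("A", "B", "C"), "A"):
        print(sorted(stronger), "is more related than", sorted(weaker))
```

Each pick therefore yields two ordering statements and no numeric rating, which is what lets the dataset be built from comparisons alone.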
3.1 Gameplay
Each game in Grettir is composed of 10 rounds. In a game, three players play against each other: a single human player and two automated bots, i.e. artificial intelligence-based players whose names are randomly picked from robot characters from popular culture (e.g. 2001: A Space Odyssey’s HAL, Futurama’s Bender or Portal’s GLaDOS).

In each round, three named entities are displayed. The human player is asked to find the intruder among them, i.e. to pick which of the three entities is less related to the other two. Figure 1 presents the game screen for Shooting Stars (described in Sect. 6.2), displaying the names and pictures of three musical artists. The human player makes their pick, and it is compared against the picks of the bots (described in Sect. 5.1). The correct pick¹ is considered to be the one shared by at least two of the players (the user and the two bots). Every player who picked that entity is awarded 10 points. The player who picked differently is awarded no points. If every player picks a different entity, there is no correct answer, and no one wins any points.

When all the rounds for that game have been played, the human player is presented with the score board. The points won during this game sequence are added to their total score, and they can proceed to play another (different) game sequence. A diagram representing the life-cycle of a Grettir game can be found in Fig. 2.
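The scoring rule for a single round can be sketched as follows. This is an illustrative Python version under stated assumptions: the function name `score_round` and the dictionary-based representation of picks are hypothetical, while the 10-point reward and the "shared by at least two players" rule come from the description above.

```python
from collections import Counter

POINTS_PER_ROUND = 10  # reward stated in the text


def score_round(picks):
    """Score one round.

    picks maps each player name to the entity that player chose as intruder.
    Returns (correct_entity_or_None, points_per_player).
    """
    counts = Counter(picks.values())
    entity, votes = counts.most_common(1)[0]
    # With three players, an entity is "correct" only if at least two chose it.
    correct = entity if votes >= 2 else None
    points = {
        player: POINTS_PER_ROUND if correct is not None and choice == correct else 0
        for player, choice in picks.items()
    }
    return correct, points


if __name__ == "__main__":
    print(score_round({"human": "A", "HAL": "A", "Bender": "C"}))
    print(score_round({"human": "A", "HAL": "B", "Bender": "C"}))
```

When all three picks differ, no entity reaches two votes, so the sketch returns no correct answer and awards no points, matching the rule in the text.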
Fig. 2. Game Play Activity Diagram.
3.2 Game Design Elements
Grettir was implemented as a game to try to maximize user engagement. Arguably, a higher level of engagement leads to more players and more games played and, as a result, to better and larger datasets. To this end we added several common game design elements to the system:

Opponents: In each game, a single human player is faced by two opponents, which are in fact automated bots whose game picks follow the distribution of past players' picks (see Sect. 5.1).
Goal: The objective of the player is to find which of the three entities displayed is less like the other two. Because the bots play according to past players' choices, the goal is actually to guess what the common opinion regarding the three entities is.
Points: Each player who picks the correct entity in a round is given 10 points. No points are awarded if every player picks a different entity. No penalties are given for choosing incorrectly.
Leaderboards: During the course of a game the points for each round are added and displayed to the user. At the end of the game, the leaderboard for the game is presented. Additionally, the global leaderboard is also displayed, presenting the scores for the totality of the games played by the top ranking players.
Adjusted difficulty: Each time a player starts a new game, the sequence of triples of entities for that game is generated taking into account the current estimated difficulty of the triples, and the past performance of the player (see Sect. 5.1). Stronger players are given harder triples, while weaker players are presented with easier triples.

Games implemented with Grettir can be classified as casual serious games with a purpose. They present several features which make them easier to play:

Mobile ready: Grettir is implemented with an HTTP API. This allows the front-end of each instance to be implemented as a web application, a desktop application or a mobile native application (or anything that can speak HTTP).
No registration needed: When users land on the main page they can start playing immediately, with no previous registration needed.
Resumable on other devices: Creating an account allows the player to sign in and resume an ongoing game on another device.

¹ Being a subjective measure, there is no right or wrong answer when evaluating relatedness. But the game format demands winners and losers. Grettir uses the most picked entity to make that decision.
4 System Architecture
Grettir has been implemented as a modular application. It is composed of three main components: the backend API, the game frontend and the statistics backoffice. A representation of the Grettir architecture can be found in Fig. 3.

The game frontend is the part of the application that is visible to the human player. It can be written in anything that can make HTTP requests, as it must fetch data from, and write data to, the backend API. This component should be customized for each instance of Grettir to ensure that it matches the game theme. For Shooting Stars (presented in Sect. 6.2), it consists of a single page responsive web application developed in Vue.js. Players are allowed to play without registering or logging in. However, if they do, they are able to resume the game on other devices. When registering, players can optionally fill in some additional personal information, such as age and gender, which can later be used to perform a more detailed analysis of how these attributes might influence the perception of relatedness.

Fig. 3. Grettir Architecture.

The stats backoffice is meant to be used by the game administrators. It provides quick access to multiple charts which give a real-time overview of the system status: for example, the number of players registered each month, the average number of games played by each player in each month, or the distribution of players by gender. Additionally, it also displays the list of players, and the list of pairs sorted so far. This backoffice uses Charts.js and it also relies on the backend API.

The third component is the main engine of the game. The backend API is a Node.js application which exposes a MongoDB database by making available a RESTful interface in JSON format. The database is used to store the players' profiles, the entities used in the game, and all the data needed to operate the game and generate new game sequences (e.g. games played and players' picks). An entity relationship diagram for the database can be found in Fig. 4. The API handles all the requests coming from the game frontend and the statistics backoffice. It is also responsible for making all the calculations needed to generate new game sequences for players, and to extract the list of pairs sorted by their relatedness value, as described in Sect. 5.1.
5 Implementation Decisions
The goal of Grettir is the production of human-generated gold standards for semantic relationships. These are output as a list of pairs of entities, sorted from the most related pair to the least related one. To build such lists, players are presented with triples of entities, and must choose which entity is less related to the other two. Implicitly, they are choosing the most related pair (i.e. the other two entities). The data gathered from all players' choices must then be processed to allow the sorting of the pairs.

Each time a player starts a new game the system needs to select triples to generate a new game sequence. Prioritizing triples corresponding to pairs which are ambiguous (that is, whose results are tied) would ensure that the ambiguity is resolved as soon as possible. However, constraints related to gameplay concerns must also be considered. For example, a player should not be asked to play the same triple twice, and the difficulty of the triples being played should match their skill level.
Fig. 4. Entity Relationship Diagram.
Given the uncertainty about the total number of games that will ever be played, game sequences are initially generated using a smaller subset of entities which is gradually extended when:
1. at least one player has played all the possible triples with the current set of entities; or
2. the players' picks up to that point are sufficient to unambiguously generate a sorted list of pairs of entities.

This maximizes the size of the dataset and minimizes the ambiguity of the pairs, while still taking into account the players' experience.

5.1 Algorithms
The main algorithms at the core of this game are the sorting of the pairs of entities and the generation of new game sequences for players. These two are interconnected: when generating a new game sequence, triples of entities are picked taking into account which pairs are still considered ambiguous; additionally, the pairs are sorted according to their relatedness using data gathered from the triples played.

Pair Comparison. For each possible triple of entities, three fields keep track of how many times each of the pairs has been chosen: picks.AB.count, picks.AC.count and picks.BC.count. These values are used to compare pairs. The comparison can be represented graphically as the comparison of the sides of a triangle where A, B and C are the vertices and the width of each side is proportional to the corresponding count field (Fig. 5a).
Fig. 5. Pair Comparison Visual Representation: (a) Direct, (b) Indirect.
Let $C_R$ be a function that compares the strength of the relatedness $R$ between the elements of two pairs. If the pairs share a common element, e.g. {A, B} and {B, C}, then $C_R(R_{\{A,B\}}, R_{\{B,C\}})$ can be calculated directly by looking into the values of picks.AB.count and picks.BC.count of the triple {A, B, C}. The result is −1, 0 or 1 if A and B are, respectively, less, equally or more related than B and C:

$$C_R(R_{\{A,B\}}, R_{\{B,C\}}) = \begin{cases} -1 & \text{if } R_{\{A,B\}} < R_{\{B,C\}} \\ 1 & \text{if } R_{\{A,B\}} > R_{\{B,C\}} \\ 0 & \text{otherwise} \end{cases} \tag{1}$$

This comparison is transitive. If {A, B} are more related than {χ, γ} and the latter are more related than {C, D}, then {A, B} are more related than {C, D}:

$$R_{\{A,B\}} > R_{\{\chi,\gamma\}} \land R_{\{\chi,\gamma\}} > R_{\{C,D\}} \implies R_{\{A,B\}} > R_{\{C,D\}} \tag{2}$$

Pair comparison is more complex when the pairs do not share an element: {A, B} and {C, D}. In such cases, there are four pairs directly comparable to the two original ones: {A, C}, {A, D}, {B, C}, and {B, D}. By applying the transitivity rule stated in Eq. 2 we can sum the results of comparing both of the original pairs with all the intermediate (and directly comparable) ones and obtain an indirect comparison (Eq. 3), where {χ, γ} ranges over the four intermediate pairs. We can extend the previous graphical representation to the comparison of two sides of a tetrahedron which do not share a vertex (Fig. 5b).

$$C_R(R_{\{A,B\}}, R_{\{C,D\}}) = \sum_{\{\chi,\gamma\}} \left[ C_R(R_{\{A,B\}}, R_{\{\chi,\gamma\}}) + C_R(R_{\{\chi,\gamma\}}, R_{\{C,D\}}) \right] \tag{3}$$

Sometimes pair comparison is inconclusive, either because there are not yet enough picks for those triples/pairs or because they are contradictory (imagine a regular triangle and tetrahedron in Fig. 5). In such cases, those pairs are considered, at that moment, indistinguishable from each other. This might mean that more data regarding those pairs is needed (i.e. more players playing the triples involved in their comparison) or these pairs might really not be consensual among the players.
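The direct and indirect comparisons can be sketched in Python as follows. This is an illustrative reading of Eqs. 1 and 3, not the Grettir implementation: the `picks` mapping (triple to per-pair counts, both keyed by `frozenset`) is a hypothetical stand-in for the picks.AB.count fields, and the sketch assumes that a higher count for a pair means a stronger relatedness and that the sign convention follows the case definition of Eq. 1.

```python
from itertools import product


def sign(x):
    """Return -1, 0 or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def compare_direct(picks, p, q):
    """Eq. 1: compare two pairs that share exactly one element.

    picks maps a triple (frozenset of 3 entities) to a dict {pair: count}.
    Returns 1 if p was chosen more often than q, -1 if less, 0 if tied or unknown.
    """
    triple = p | q
    if len(triple) != 3:
        raise ValueError("pairs must share exactly one element")
    counts = picks.get(triple, {})
    return sign(counts.get(p, 0) - counts.get(q, 0))


def compare_indirect(picks, p, q):
    """Eq. 3: compare two disjoint pairs through the four intermediate pairs."""
    if p & q:
        raise ValueError("pairs must be disjoint")
    total = 0
    for x, y in product(sorted(p), sorted(q)):
        mid = frozenset((x, y))  # shares one element with p and one with q
        total += compare_direct(picks, p, mid) + compare_direct(picks, mid, q)
    return total


if __name__ == "__main__":
    fs = frozenset
    picks = {
        fs("ABC"): {fs("AB"): 5, fs("AC"): 1, fs("BC"): 0},
        fs("ABD"): {fs("AB"): 4, fs("AD"): 1, fs("BD"): 1},
        fs("ACD"): {fs("AC"): 3, fs("AD"): 2, fs("CD"): 1},
        fs("BCD"): {fs("BC"): 2, fs("BD"): 2, fs("CD"): 2},
    }
    print(compare_direct(picks, fs("AB"), fs("BC")))
    print(compare_indirect(picks, fs("AB"), fs("CD")))
```

A result of 0 from either function corresponds to the inconclusive case described above: the pairs are, for the moment, indistinguishable.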
Pair Sorting. The process of building the sorted list of pairs is performed recurrently, triggered by the need to determine which triples are the most relevant and should therefore be included in future game sequences (see the next section on Triple Selection for more detail). This process consists of creating an empty list and performing sorted inserts of pairs. In each insertion, the candidate pair is compared against some of the pairs already on the list (which ones depends on the insertion algorithm being used). Whenever a candidate pair PC is found to be indistinguishable from a pair PL already on the list (their comparison returns a 0), the process is halted. The triples corresponding to the comparison of PC and PL are marked as isRelevant (more on that later) and the list of pairs inserted so far is returned as the current sorted list. In order to decrease the number of potentially inconclusive comparisons (which halt the sorting process), we decided to use an AVL tree to make the ordered insertions, as it minimizes the number of comparisons needed. Furthermore, the order in which pairs are attempted to be inserted is given by the order of the last sorting of the pairs list or by their id field.

Triple Selection. The reliability of the results for a Triple can be measured by observing two fields:
– support: a counter which is incremented every time the triple is played,
– certainty: the maximum value found in the picks count fields of the triple.

Low values of support are found in triples which have been played only a few times. For triples with high support, low values of certainty mean that players tend to disagree; high values in both fields are found in triples which, on average, are easy for players to disambiguate.

New game sequences are generated by choosing triples which are more useful for the pair sorting process. These correspond to pairs which halted the previous generation of the sorted list of pairs due to their ambiguity. This is the reason why such pairs are marked as isRelevant during the sorting process. Prioritizing these triples ensures the efficiency of the pair sorting algorithm in terms of the number of player picks required.

On the other hand, there are concerns regarding gameplay and player experience. Users should enjoy the game because better engagement might translate into more games played or more shares with family and friends. To this end, players are never asked to play the same triple twice; every game sequence a user plays is made of triples that user has never seen before. Additionally, we use the players' previous scores to infer their ability. Taking all this into account, the selection of triples for a new game sequence for a player P is performed as follows:
1. First, we start by excluding all the triples already played by P.
2. Then we sort the remaining triples. Triples marked as isRelevant are placed at the top, and the others at the bottom.
A Game with a Purpose
3. This sorting is refined using the certainty field: a triple with low certainty is more in need of players' input than one with high certainty.
4. This results in a list of triples which is arguably sorted from the most difficult (relevant triples, related to inconclusive pair comparisons, and with low certainty values) to the least difficult (triples related to already sorted pairs, with high values of certainty).
5. Then we use the player's ability to determine the place in the list from which to select triples (better players will have triples selected from the top, worse players from the bottom).
Bot Picks. In each round, the automated bots must decide which entity to pick. They do so by generating a random pick following the distribution of past players' picks. If, for a given triple, entity A has been chosen by players 70% of the time, entity B 20% and entity C 10%, the bot's choice will be a weighted random choice with those weights. This means that bots will tend to follow the opinion of the majority, while still allowing some variance.
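The selection steps and the bot behavior described above can be illustrated with a minimal Python sketch. It is not Grettir's actual code: the `Triple` record, the function names, the representation of player ability as a number in [0, 1], and the uniform fallback for triples with no picks yet are all assumptions made for illustration.

```python
import random
from dataclasses import dataclass


@dataclass
class Triple:
    """Hypothetical record for one triple of entities."""
    id: int
    picks: dict  # entity name -> number of times players picked it
    is_relevant: bool = False  # set when the triple halted the pair sorting

    @property
    def certainty(self) -> int:
        # Maximum value among the pick counts, as described in the text.
        return max(self.picks.values(), default=0)


def rank_triples(triples, played_ids):
    """Steps 1-4: drop played triples, then relevant first, low certainty first."""
    remaining = [t for t in triples if t.id not in played_ids]
    return sorted(remaining, key=lambda t: (not t.is_relevant, t.certainty))


def select_triples(triples, played_ids, ability, size):
    """Step 5: ability in [0, 1] (assumed scale); 1.0 selects from the top."""
    ranked = rank_triples(triples, played_ids)
    start = round((1.0 - ability) * max(0, len(ranked) - size))
    return ranked[start:start + size]


def bot_pick(triple, rng=random):
    """Weighted random pick following the distribution of past players' picks."""
    entities = list(triple.picks)
    weights = [triple.picks[e] for e in entities]
    if sum(weights) == 0:
        # Assumption: with no history yet, fall back to a uniform choice.
        return rng.choice(entities)
    return rng.choices(entities, weights=weights, k=1)[0]
```

With pick counts of 7, 2 and 1 for entities A, B and C, `bot_pick` returns A about 70% of the time, which matches the example in the text.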
6 Methodology
This section describes the usability and validation tests we performed for Grettir, and the creation and deployment of Shooting Stars, the first instance of Grettir.
6.1 Usability and Validation Tests
Before publicly deploying the first instance of Grettir, we performed tests of usability and user validation. The usability testing consisted of asking seven MSc-level computer science students to play test instances of Grettir, observing their behavior, and in the end asking them to fill out a form with usability-related questions. According to the literature, this number of users should be enough to identify most usability issues [36,37]. For the validation of our algorithms, we instantiated Grettir with items from publicly available semantic datasets, and compared the sorted lists of pairs returned by Grettir with those datasets. Both experiments were performed simultaneously: the users tested the usability of the platform while generating data for our validation assessment. Test users were asked to fill out a form after playing these instances of the game. Most questions asked them to rate statements on a Likert scale, e.g. “I understood that I was playing against the average results of other players” or “The ‘Show Tutorial’ feature is helpful”. The remaining questions were open-ended, devised to let users communicate what they felt were the biggest issues and points to be improved in the game, e.g. “Did you find anything wrong in the game’s UI? (layout problems, bugs, etc.)”. For the validation of our algorithms, we needed lists of pairs of items small enough so that a few users could provide enough data to sort them. Additionally, given our sorting algorithm's tendency to produce dense lists (where the pairs sorted include most combinations of items with each other), our gold standard
A. F. dos Santos and J. P. Leal
lists should be similar. For this purpose, we analyzed the semantic datasets listed in [35] which were available online, selecting the ones containing the largest cliques. Consider a semantic dataset D composed of tuples (c1, c2, r) where c1 and c2 are concepts and r is the strength of their relatedness. We then define a clique QD as a set of concepts belonging to D in which, for any two concepts q1 and q2 belonging to QD, there is a corresponding tuple (q1, q2, r) in D. The resulting cliques were not big enough for our purposes (fewer than 8 items), so we extended them by adding entities that, despite not being connected in D to all the other entities in QD, were nevertheless connected to a high number of them. The datasets containing the largest extended cliques were the MEN Test Collection [6] and the WordNet Synset Relatedness [5]. This resulted in sets of size 10 to 12, available at https://github.com/andrefs/grettir-datasets. Given the previously described difficulty of finding semantic datasets featuring named entities (which partially motivated this work), the cliques we obtained include concepts such as floor, kitchen, staircase, ample and beam. We then proceeded to generate some test instances of Grettir featuring the items extracted from the cliques. These instances were used for the usability testing; in the end, we extracted the sorted list of pairs from each instance, with the goal of comparing them with the sorted lists of pairs corresponding to the cliques extracted from the semantic datasets.
6.2 Shooting Stars
The first instance of Grettir is named Shooting Stars, and it is focused on the relatedness of musical artists. It is publicly available at https://stars.andrefs.com. In Shooting Stars we selected a list of 50 music artists and aimed to include widely known artists from different musical genres, such as Bob Dylan, Rihanna, Édith Piaf or Elvis Presley. Players are asked “Who is the intruder? Which of these 3 artists is the least related with the other two?”. They are given no further instructions on which criteria should be used to compare the artists, and instructed to “Click or tap on the intruder to select it!”. They are also informed that if their opinion “matches at least one of the other two players, you score! So, it’s not really about your opinion but whether you can guess other people’s opinions!”. A screenshot of the main view of Shooting Stars can be found in Sect. 1, Fig. 1. We publicized the game using a number of approaches, e.g. family and friends and social networks. The one which seemed to provide the best results was a university mailing list, which by our estimates reached over 20k people, including students, faculty and other staff.
7 Results
This section describes the results of the preliminary usability and validation tests of Grettir, and those obtained with Shooting Stars, the first instance of Grettir.
7.1 Usability and Validation Tests
The answers to the usability questionnaire allowed us to identify and fix minor issues, most concerning small flaws or bugs in the user interface. The fact that these were computer science students means they were more knowledgeable regarding technical aspects of applications. No major usability issues were found. Regarding the validation tests, we did not manage to gather enough users to have data to order all the possible pairs in each instance. We were therefore unable to draw any conclusions regarding our algorithms' accuracy from these tests.
7.2 Shooting Stars
Up until the moment of writing this paper, Shooting Stars had 1001 different users, 653 of which played at least one game (a user profile is automatically created when a new user accesses the web page). A total of 1013 games were played and finished. This resulted in 12 different artists being paired with each other and a total of 13 pairs sorted. The list is updated and available for download at https://github.com/andrefs/grettir-datasets/stars-2021115.txt.
1. Elton John – Bruce Springsteen
2. Matt Bellamy (Muse) – Steven Tyler (Aerosmith)
3. John Lennon (The Beatles) – Prince
4. Steven Tyler (Aerosmith) – Prince
5. Steven Tyler (Aerosmith) – John Lennon (The Beatles)
6. John Lennon (The Beatles) – Elton John
7. Elton John – Madonna
8. Steven Tyler (Aerosmith) – Madonna
9. Adele – Eddie Vedder (Pearl Jam)
10. Madonna – Eddie Vedder (Pearl Jam)
11. Rihanna – John Lennon (The Beatles)
12. Rihanna – Matt Bellamy (Muse)
13. Matt Bellamy (Muse) – John Lennon (The Beatles)
Due to the algorithm used to build our dataset, we cannot compare the sorted pairs established by each individual player. As such, we cannot calculate the inter-annotator agreement (IAG), a common metric to evaluate the quality of human-generated datasets, as the correlation between each player's sorted pairs. Nevertheless, we estimated the IAG by averaging the consensus among all relevant triples (the number of players who agreed on the most voted entity divided by the total number of votes for that triple). We obtained an average IAG of 0.693 with a standard deviation of 0.183.
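The consensus-based estimate described above can be sketched in a few lines of Python. This is an illustration, not the authors' code: the function names and the dictionary representation of pick counts are assumptions, and since the text does not say whether a population or sample standard deviation was reported, the population form is used here.

```python
from statistics import mean, pstdev


def consensus(picks):
    """Share of votes received by the most voted entity of one triple."""
    total = sum(picks.values())
    return max(picks.values()) / total


def average_consensus(triples_picks):
    """Mean and standard deviation of the consensus over all voted triples."""
    # Triples without votes are skipped to avoid dividing by zero.
    scores = [consensus(p) for p in triples_picks if sum(p.values()) > 0]
    # Assumption: population standard deviation (pstdev).
    return mean(scores), pstdev(scores)
```

For a triple where entities A, B and C received 7, 2 and 1 votes, the consensus is 7 / 10 = 0.7.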
8 Discussion
Observing the user interaction and analyzing the answers to the usability questionnaire allowed us to preemptively fix a few minor issues in the game interface.
The algorithm validation was inconclusive due to the low number of test subjects gathered. With Shooting Stars, we were able to produce a list of pairs of musical artists sorted by their relatedness, but smaller than we anticipated (13 pairs) due to the low number of total games played. The game was not addictive enough to attract more players and to have them playing more games. One reason for this low engagement might be the game domain. If this is the case, an instance about chess, football players or historical monuments might attract more players. Another reason might be the game mechanics. The lack of real-time human opponents, or the simple find-the-intruder approach, might have held engagement back. Lastly, the game design, kept visually simple and lacking sound effects, might also have played a part in these results.
Semantic relations datasets are usually built by requiring human participants to assign numeric values to relations between pairs of entities. Similarly to the previously mentioned MEN dataset [6], we wanted to build a dataset using another approach: asking participants to compare pairs of entities. This method requires more human interaction: even when minimizing pair comparisons by using an AVL tree, sorting n pairs requires asking users about O(n log n) pairs, instead of the Θ(n) pairs the usual method would require. Nevertheless, our claim is that attributing a numeric value to a relationship can be difficult. This is especially true when that number must belong to a closed interval and you do not know all the entities beforehand: how related are Eddie Vedder and Rihanna? What if you now need to compare Rihanna and Mozart? We hoped that this problem of dataset construction by pair comparison might be more easily tackled by formulating it as a game; the results so far have been inconclusive.
The code written for Grettir and Shooting Stars is open source and published under a GNU GPLv3 license.
You can find the code, links to all the different components, and installation instructions (including a Docker version) at https://github.com/andrefs/shooting-stars.
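To make the comparison cost discussed in this section concrete, the following Python sketch performs halting sorted insertion and counts the comparisons it asks for. It is an illustration under stated assumptions rather than Grettir's implementation: a binary search over a plain list stands in for the AVL tree (both need a logarithmic number of comparisons per insertion), and `compare` is a hypothetical three-way comparator returning -1, 0 or 1, where 0 means the two pairs could not be distinguished.

```python
def sort_pairs(pairs, compare):
    """Insert pairs one by one; halt on the first inconclusive comparison.

    Returns (sorted_list, halting_comparison_or_None, comparisons_made).
    """
    sorted_pairs = []
    comparisons = 0
    for candidate in pairs:
        lo, hi = 0, len(sorted_pairs)
        # Binary search for the insertion point (stand-in for an AVL tree).
        while lo < hi:
            mid = (lo + hi) // 2
            comparisons += 1
            result = compare(candidate, sorted_pairs[mid])
            if result == 0:
                # Inconclusive: return what has been sorted so far, plus the
                # two pairs whose triples would be marked as relevant.
                return sorted_pairs, (candidate, sorted_pairs[mid]), comparisons
            if result < 0:
                hi = mid
            else:
                lo = mid + 1
        sorted_pairs.insert(lo, candidate)
    return sorted_pairs, None, comparisons


# Hypothetical relatedness scores, used only to drive the comparator.
scores = {"a": 0.9, "b": 0.1, "c": 0.5, "d": 0.5}


def by_score(x, y):
    return (scores[x] > scores[y]) - (scores[x] < scores[y])
```

Each insertion into a list of k pairs needs at most about log2(k) + 1 comparisons, so sorting n pairs stays within the O(n log n) bound mentioned above, whereas assigning one numeric score per pair would need n questions.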
9 Conclusions
Semantic measures mimic the human capability of evaluating how similar or how related two things are. These measures are algorithms which extract semantic clues from a number of semantic sources. They are frequently used to compare pairs of concepts, and often pairs of instances as well. Semantic relations datasets can be used to evaluate such measures. These datasets are usually built by asking human annotators to explicitly attribute a numeric value to the strength of the relationship between two entities. Semantic relations datasets can also be built by asking human annotators to compare pairs of entities against each other, and rank them. These comparisons are then used to sort the pairs of entities, producing a sorted list of pairs. This method requires more input from the annotators, but each question asked is more naturally answered.
Formulating this problem as a game led people to contribute to the dataset construction who otherwise probably would not have. Nevertheless, it did not attract the number of players or games played which would make this effort a success. Grettir is a platform which can be used to implement other instances of this game, focused on other domains. The main effort in producing these other instances would be implementing a different interface, with a different design which would match the game lore.
Funding Information. André Santos has a Ph.D. Grant SFRH/BD/129225/2017 from Fundação para a Ciência e Tecnologia (FCT), Portugal. This work is also financed by National Funds through the Portuguese funding agency, FCT - Fundação para a Ciência e a Tecnologia, within project LA/P/0063/2020.
References
1. Lin, D.: An information-theoretic definition of similarity. In: ICML 1998, pp. 296–304 (1998)
2. Metrology, J.: International vocabulary of metrology - basic and general concepts and associated terms (VIM). (BIPM Sèvres 2008) (2008)
3. Ziegler, C., Simon, K., Lausen, G.: Automatic computation of semantic proximity using taxonomic knowledge. In: Proceedings of the 15th ACM International Conference on Information and Knowledge Management, pp. 465–474 (2006)
4. Färber, M., Bartscherer, F., Menne, C., Rettinger, A.: Linked data quality of DBpedia, Freebase, OpenCyc, Wikidata, and YAGO. Semantic Web 9, 77–129 (2018)
5. Boyd-Graber, J., Fellbaum, C., Osherson, D., Schapire, R.: Adding dense, weighted connections to WordNet. In: Proceedings of the Third International WordNet Conference, pp. 29–36 (2006)
6. Bruni, E., Tran, N., Baroni, M.: Multimodal distributional semantics. J. Artif. Intell. Res. 49, 1–47 (2014)
7. Albertoni, R., De Martino, M.: Semantic similarity of ontology instances tailored on the application context. In: Meersman, R., Tari, Z. (eds.) OTM 2006. LNCS, vol. 4275, pp. 1020–1038. Springer, Heidelberg (2006). https://doi.org/10.1007/11914853_66
8. Pirró, G., Euzenat, J.: A feature and information theoretic framework for semantic similarity and relatedness. In: The Semantic Web-ISWC 2010, pp. 615–630 (2010)
9. Bodenreider, O., Aubry, M., Burgun, A.: Non-lexical approaches to identifying associative relations in the gene ontology. In: Pacific Symposium on Biocomputing, p. 91 (2005)
10. Ranwez, S., Ranwez, V., Villerd, J., Crampes, M.: Ontological distance measures for information visualisation on conceptual maps. In: On the Move to Meaningful Internet Systems 2006: OTM 2006 Workshops, pp. 1050–1061 (2006)
11. Radinsky, K., Agichtein, E., Gabrilovich, E., Markovitch, S.: A word at a time: computing word relatedness using temporal semantic analysis. In: Proceedings of the 20th International Conference on World Wide Web, pp. 337–346 (2011)
12. Halawi, G., Dror, G., Gabrilovich, E., Koren, Y.: Large-scale learning of word relatedness with constraints. In: Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1406–1414 (2012)
13. Barzegar, S., Davis, B., Zarrouk, M., Handschuh, S., Freitas, A.: SemR-11: a multilingual gold-standard for semantic similarity and relatedness for eleven languages. In: Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018) (2018). https://aclanthology.org/L18-1618
14. Djaouti, D., Alvarez, J., Jessel, J.: Classifying serious games: the G/P/S model. In: Handbook of Research on Improving Learning and Motivation Through Educational Games: Multidisciplinary Approaches, pp. 118–136 (2011)
15. Jones, N., Brun, A., Boyer, A.: Comparisons instead of ratings: towards more stable preferences. In: 2011 IEEE/WIC/ACM International Conferences on Web Intelligence and Intelligent Agent Technology, vol. 1, pp. 451–456 (2011)
16. Dörner, R., Göbel, S., Effelsberg, W., Wiemeyer, J.: Serious Games. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-40612-1
17. Von Ahn, L.: Games with a purpose. Computer 39, 92–94 (2006)
18. Von Ahn, L., Dabbish, L.: Designing games with a purpose. Commun. ACM 51, 58–67 (2008)
19. Zar, J.: Spearman rank correlation. In: Encyclopedia of Biostatistics, vol. 7 (2005)
20. Freedman, D., Pisani, R., Purves, R.: Statistics (International Student Edition), 4th edn. WW Norton & Company, New York (2007)
21. Law, E., Von Ahn, L., Dannenberg, R., Crawford, M.: TagATune: a game for music and sound annotation. In: ISMIR, vol. 3, p. 2 (2007)
22. Dulačka, P., Šimko, J., Bieliková, M.: Validation of music metadata via game with a purpose. In: Proceedings of the 8th International Conference on Semantic Systems, pp. 177–180 (2012)
23. Fort, K., Guillaume, B., Chastant, H.: Creating zombilingo, a game with a purpose for dependency syntax annotation. In: Proceedings of the First International Workshop on Gamification for Information Retrieval, pp. 2–6 (2014)
24. Pearl, L., Steyvers, M.: Identifying emotions, intentions, and attitudes in text using a game with a purpose. In: Proceedings of the NAACL HLT 2010 Workshop on Computational Approaches to Analysis and Generation of Emotion in Text, pp. 71–79 (2010)
25. Celino, I., Re Calegari, G., Fiano, A.: Refining linked data with games with a purpose. Data Intell. 2, 417–442 (2020)
26. Vannella, D., Jurgens, D., Scarfini, D., Toscani, D., Navigli, R.: Validating and extending semantic knowledge bases using video games with a purpose. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 1294–1304 (2014)
27. Calegari, G., Fiano, A., Celino, I.: A framework to build games with a purpose for linked data refinement. In: International Semantic Web Conference, pp. 154–169 (2018)
28. Siorpaes, K., Hepp, M.: Games with a purpose for the semantic web. IEEE Intell. Syst. 23, 50–60 (2008)
29. Siorpaes, K., Hepp, M.: OntoGame: weaving the semantic web by online games. In: Bechhofer, S., Hauswirth, M., Hoffmann, J., Koubarakis, M. (eds.) ESWC 2008. LNCS, vol. 5021, pp. 751–766. Springer, Heidelberg (2008). https://doi.org/10.1007/978-3-540-68234-9_54
30. Oguz, C., Blessing, A., Kuhn, J., Im Walde, S.: WordGuess: using associations for guessing, learning and exploring related words. In: Proceedings of the 17th Conference on Natural Language Processing (KONVENS 2021), pp. 235–241 (2021)
31. Kuittinen, J., Kultima, A., Niemelä, J., Paavilainen, J.: Casual games discussion. In: Proceedings of the 2007 Conference on Future Play, pp. 105–112 (2007)
32. Juul, J.: A Casual Revolution: Reinventing Video Games and Their Players. MIT Press, Cambridge (2010)
33. Chess, S., Paul, C.: The end of casual: long live casual. Games Cult. 14, 107–118 (2019)
34. Mäyrä, F., Alha, K.: Mobile gaming. In: The Video Game Debate, vol. 2, pp. 107–120 (2020)
35. Harispe, S., Ranwez, S., Janaqi, S., Montmain, J.: Semantic Similarity from Natural Language and Ontology Analysis. Synthesis Lectures on Human Language Technologies, vol. 8, pp. 1–254 (2015)
36. Nielsen, J.: Why you only need to test with 5 users. (Useit.com Alertbox) (2000)
37. Nielsen, J., Landauer, T.: A mathematical model of the finding of usability problems. In: Proceedings of the INTERACT 1993 and CHI 1993 Conference on Human Factors in Computing Systems, pp. 206–213 (1993)
Performing Different Clustering Methods for Mapping the European Union Member States using Green Energy, Digitalization, and R&D Indicators
A Five-Year Comparison (2016-2020)
Andreea Pernici and Stelian Stancu
The Bucharest University of Economic Studies, 010374 Bucharest, Romania
{andreea.pernici,stelian.stancu}@csie.ase.ro
Abstract. The scientific context of the paper is strongly correlated with the European public agenda of building a modern, state-of-the-art, sustainable economic model. In this initiative, there are certain recurring concepts, considered cores of economic development: green energy, digitalization, and R&D being some of the most important. The three elements will be complementary and although they can be studied individually, they demonstrate their true dynamism when they are analyzed in an aggregated manner. Therefore, the current paper proposes a model which will concentrate on the indicators that describe the new economic framework, while also providing insights into the evolution of EU member states over a five-year period. Using different clustering methods, including an ensemble technique, the proposed model shows both the progress made by every state, as well as the regional evolution. The model will result in an optimized allocation of states in two or three clusters: performers, pioneers, and transitioners and will ultimately show that the European Union can be described by a certain degree of stability in the new, green, digitalized, and innovative economy.
Keywords: Green Energy · Digitalization · R&D · Cluster Analysis · Ensemble Model
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023
K. Arai (Ed.): SAI 2023, LNNS 739, pp. 440–461, 2023. https://doi.org/10.1007/978-3-031-37963-5_31
1 The Economic Model of the Future
1.1 Introduction
Since we are referring to the European Union member states, the best starting point is to briefly depict the continental institutional agenda and the role played by green energy, digitalization, and R&D in the new economic model. The focus of the EU, reinforced over the last few years, has recently materialized into a new framework introduced by the European Commission in 2022: Towards a green, digital, and resilient economy [1]. The growth model behind it is based on structural reforms, investments, and unprecedented transformations that aim to create a new, sustainable, digitalized, inclusive, and resilient economy that handles environmental and political pressures while also providing a better quality of life. However, the main components behind it are not new, having been at the top of the organization’s agenda for a while now; the innovation lies in the way they are connected, with the clear goal of preparing the regional structure for potential shocks. Therefore, having a broad consensus for the future is certainly useful, but to measure the actual progress, it is essential to look at the ongoing regional dynamics over a five-year timeframe. The current model aims to do just that: to identify what have been, are, and could be the geographic engines of growth, as well as the regional clusters that need further support in the transition to the new economic model. At the same time, another question at hand is whether the latest strategies and action plans adopted by the European Union prior to the 2022 model, such as A European Green Deal [2] or A Europe fit for the digital age [3], did result in harmonized and coordinated progress. Starting from that question, we state two conceptual suppositions that will be either confirmed or rejected in the applied section of the paper:
S1: The joint efforts of the European Union to reconfigure the economic model will result in a certain level of stability, with the member states registering comparable growth over the years, without many regional movements.
S2: The transition towards a green energy, digitalization, and R&D economy is a long-term process that is currently under way at full speed, with the North and West European countries being the main initiators.
The tool employed in the current research is cluster analysis, one of the most popular unsupervised learning techniques, which groups entities according to their similarity.
Our model will profile the European Union member states through three types of methods (hierarchical, partitional, and model-based) and five algorithms: Hclust, Diana, K-Means, PAM, and Mclust. After analyzing the results obtained in each case, an ensemble model (consensus_cluster) will be built to integrate all five and identify the optimized allocation of states. The results will be presented for 2016 and 2020, capturing the evolution over a five-year timeframe. However, before proceeding to the model illustration we will briefly present the green energy transition, the reform of digitalization, and the impact of R&D, pointing out the main ideas from the related literature on how they have become vectors of economic growth. Afterward, the main indicators that describe each of the components will be identified and finally aggregated in a dataset that will outline the gradual steps taken by each member state on the path to the new economy.
1.2 Green Energy
Apart from its functional role of creating a sustainable lifestyle and future, green energy is one of the best examples of contemporary elements that have quickly been recognized as sources of economic growth. That is why the term Green Energy Economy (GEE) has become more and more frequent in both institutional and scholarly contexts. One of the characteristics of GEE is its potential to align and generate interest from stakeholders in very different branches of activity. Whether it is public
administration or the private sector, all agents are becoming more invested in green reform, describing it as a national priority. The goal is clear: the transition towards an economy that uses green energy not only as a productive vector that meets basic human needs such as electricity but also as an influential element that reduces political codependency and changes the international agenda. This type of economy is expected to be a massive magnet for investment, to directly generate innovation, and to produce almost immediate economies of scale on an industrial, technological, and social level [4]. All those elements will lead to a different perception of economic growth, which can offer a sense of safety in the future. Nonetheless, what is different from other development initiatives is that green energy has been designed and pushed forward in a downstream decision flow, starting from the international institutions. Once again, the initial goal was clear: reaching carbon neutrality by massively reducing greenhouse gas emissions and avoiding a full-scale climate emergency [5]. However, in the process of achieving that objective, green energy changed its definition, from a mandatory life condition to an emerging new structure, holding leverage on the existing industry. Ultimately, the field has come to be understood in multiple ways, as a challenge, an opportunity, an instrument, or even a goal, all of them being briefly described below. To start with the challenge, the demanding part of this transition is to find the exact balance between public health priorities and the pressures coming from the massive energy producers, whose existence is more and more threatened, even though they still produce a substantial share of the national value. On the other side, the opportunity presented by green energy lies in its potential to attract investments and research funds that ultimately produce economic growth.
This has been strongly correlated to the historical background. As Mundaca [6] also explains, the rise of interest in green reform was accelerated by the financial crisis in late 2010, which prompted almost immediate restrictive policies and pushed stakeholders to look for alternative sources of recovery. Going even further in the definition exercise, green energy proves to be an instrument, since it involves productive technologies, personnel, and most importantly, substantial allocations of budgets. Being an industry itself, it builds different and complex processes, from the development of the science and technology necessary in testing and production facilities to negotiations with big energy producers, or even marketing and awareness campaigns for the general public. From all this we can grasp the main objective and the reason green energy could also be interpreted as a goal, having the power to alter the lifestyle from a micro level to a macro one. Having all these conceptualizations of the green energy economy, the next step is to briefly enumerate the main theoretical approaches that capture the actual economic impact on a country’s progress. One of them is to study the policy choices, strategies, and stimulus packages behind it, a process that has been carried out repeatedly in the related literature [7, 8]. Another approach is to focus on the macroeconomic effect, by identifying the impact of diverse energy aspects on aggregates such as GDP, employment, or inflation [9, 10]. Another possible area of quantification is to use the disruptive economy concept, since the green energy transition presents multiple disruptive components such as the zero-sum industrial and technological competition or the devaluation of the core productive sector [11, 12]. On this note, another relevant
characteristic of the GEE is that although technology remains a fundamental growth engine, its mere existence and development will not be sufficient. Instead, the reforms, regulatory measures, and policies implemented by the government authorities will be the decisive factor in the implementation equation. Finally, machine learning has recently become a popular framework for analyzing green energy. Here, the focus is on predicting different types of energy prices or measures of production [13, 14], in an effort to promote climate change mitigation [15]. Therefore, having shown that the concept is strongly linked to the European political agenda and capable of driving economic growth, we will proceed to capture it using Eurostat indicators. They will be described in the dataset section of the paper and eventually used to assess the continental progress, along with the digitalization and R&D variables.
1.3 Digitalization
Going further in the description of the new economy, digitalization has become a leitmotif of the last decade. The starting point in analyzing it is to define two related key concepts that are frequently used incorrectly: digitization and digitalization. While the first one is defined as the process of converting data from a physical format to a digital one, the second one, digitalization, is a more complex subject that aims to improve whole business and administration processes. Historically, although the concept of digitization is considered to have started in 1679, when the first binary system was invented [16], digitalization as we know it today has grown in popularity over the last few years, being strongly linked to the private sector. All major companies have started to include different aspects of technology in their daily business, with the production, executive, and decision flows being now prevalently based on computers, the internet, and data.
However, despite their differences, the core of both concepts seems to be simple: the digitally codified information, which has become a strategic resource and is exploited continuously from both a quantitative and a qualitative point of view [17]. This very resource therefore shifts from a straightforward theoretical concept to a practical challenge. Being at the center of many processes and involving different types of final deliverables, either services, goods, or knowledge, it is hard to measure where digitalization starts and ends. There have been many attempts to measure its contribution, but for the current approach, the proposed framework is to study the agents it affects and their interactions. That said, in any given economy or society there are three main agents: the government, the citizens, and the private sector, all three being extremely interconnected and interdependent. By studying their interaction, we can identify the main effect that digitalization has on the current economic model. Starting with the relationship between authorities and citizens, digitalization has been giving the government a previously unknown power to reach and engage its public. With the correct resource allocation, governments can now build processes that facilitate dialogue, ensure financial and fiscal transactions, and improve the overall level of public information transparency. This interaction is fundamental for the general performance of a state, ultimately helping to rebuild citizens’ trust and maximize their satisfaction.
However, before reaching that goal, there is another crucial aspect that the government needs to ensure, namely basic accessibility. Therefore, the first digitalization milestone would be to develop a high-performance national network for fixed and mobile internet connection. Further on, another essential interaction, which can be described as the main engine of a sustainable economy, is the one between the government and the private sector. This relationship focuses mainly on building a digitalized environment that protects entrepreneurs and facilitates the administrative processes that arise throughout a company’s activity. The interaction between these two agents is arguably the most complex, since they have different objectives that need to be synchronized. Companies have been the most eager to test and implement new technologies that drive innovation [18, 19], improve certain processes [20], generate economies of scale, and increase overall profits. Therefore, multiple conflicts of interest between the two actors can arise. Perhaps the most obvious example is the labor market, which has been one of the sectors most affected by the race between the machine and the worker [21]. Private digitalization efforts have generated new jobs, while others have become obsolete because they are no longer profitable. Many jobs are now threatened, with a direct impact on unemployment and the wider macroeconomy. As a result, governments should act with care, so as not to discourage business innovation while still ensuring the protection and training of citizens. The effects on the labor force are also strongly correlated with the third interaction, the one between citizens and the private sector. As mentioned before, labor movements have been decisive, first changing the skills needed to enter the market and then imposing a constant state of necessary learning.
More than that, digitalization has helped entrepreneurs provide new kinds of goods and services that have improved the overall lifestyle, from health to transport, education, and entertainment. That ultimately created new needs and a different level of purchasing power. From a business perspective, new needs mean new lines of revenue and profit, so the cyclical effect of digitalization is again observed. To summarize, digitalization has affected all aspects of life by altering interactions at a national level. Moreover, moving to the international arena, we can observe clear competition for the status of a digitalized country, which is seen as a major source of progress and investment attractiveness. Consequently, there have been many efforts to measure its level. In the European Union, the most relevant remains the Digital Economy and Society Index (DESI), which aims to summarize the performance of the member states on four pillars: human capital, connectivity, integration of digital technology, and digital public services [22]. Using this as a reference, in the proposed model we will use four indicators that are captured in the DESI, which will be described further on in the paper. Along with the green energy dataset, we can now measure two vectors that have a crucial potential to reshape the economy.

1.4 Research and Development (R&D)

Lastly, Research and Development (R&D) represents a common agreement between governments and businesses to allocate budgets with the clear objective of generating innovative activities and producing new, cyclical value. Although it is often
Performing Different Clustering Methods for Mapping
limited by other national priorities and the actual expenditure is not high (compared to other sectors), a very clear relationship between R&D and economic progress can be profiled. To elaborate, R&D is a field that requires a balance between public sector efforts and private ones. However, due to the nature of the goods and services involved, R&D activities have the potential to result in substantial profits, so private companies have a clear incentive to invest in this direction. That is why, when mentioning R&D, most stakeholders are referring to the strategies of multinational corporations, where research funds tend to increase with the expansion of the company overseas [23]. But the public sector benefits are not to be dismissed either: a country that directs its attention towards knowledge acquisition is a country that can accelerate all its administrative and productive processes, with the clear result of enhanced economic well-being. Therefore, the role of the government should be to find the right budget and institutional parameters that can stimulate innovation and entrepreneurial competition, while also increasing employment and labor productivity. Nevertheless, the question remains: is R&D a vector for economic growth? One way to answer it is to study the relationship between R&D and Total Factor Productivity (TFP) at a national level [24]. A similar framework focuses on the correlation between Foreign Direct Investment (FDI) and R&D [23], with a bidirectional effect: FDI accelerates research and development efforts, but at the same time it is stimulated by national initiatives in this field. Lastly, the relationship between productivity and R&D has also been considered, with positive effects [25].
However, regardless of the framework, the main challenge is that budgets are not allocated uniformly: countries either invest in their own national innovation systems [26] or rely on the private sector to fill the funding gap [27]. One actor that can help unify these efforts and provide a bigger-picture vision is the OECD (Organisation for Economic Co-operation and Development), whose mission includes monitoring international data on the resources devoted to R&D [28], while also drawing up guidelines and recommendations. The institution is therefore present in many case studies in the related literature, as a source of primary data and valuable insights. Moving away from the macroeconomic perspective, other studies have preferred to focus on the role played by R&D in difficult contexts, such as political transitions [29] or emerging economies [30], when the need to stimulate innovation and breakthroughs is considerably greater. Nowadays, however, there is a prevalent shift in the allocation of R&D funding. Whereas decades ago R&D was focused on large-scale technologies, mostly productive or explorative, such as astronomy, physics, or biochemistry, R&D funds are now shifting towards data, artificial intelligence, and renewable sources of energy, as tools to achieve further discoveries. That is why a model that includes green energy, digitalization, and R&D is becoming extremely relevant in the current context. In conclusion, we now have an overview of all three vectors and their potential to generate economic development. Before integrating them into a unified dataset and analyzing the regional allocations, we will briefly describe the clustering algorithms that we will apply.
2 Cluster Analysis

Cluster analysis is one of the most frequently used tools for processing and reducing a set of observations to a small number of entities, following the principle of minimum intra-class variability and maximum inter-class variability. This technique has been used extensively in all fields of activity, with broad applicability to political, economic, or technical issues. When linked to the three components studied in the current paper, most examples relate to the use of renewable energy in the European Union [31–33], the correlation between R&D variables and large-scale planning [34] or innovation indicators [35], or R&D as part of general macroeconomic progress [36]. However, an aggregated model with all three elements was not found, so this application aims to contribute to a unified body of related literature. More than that, by profiling a comparative overview over a five-year period, the results can be used as a benchmarking method and an evaluation of national performance. Cluster analysis belongs to the unsupervised machine-learning techniques, which search for patterns in unlabeled datasets. As with all ML methods, it has been developed extensively in recent years, producing a variety of algorithms. In the current paper, three families of methods will be profiled, based on different ways of splitting the observations. In total, five algorithms will be applied, all in the RStudio environment, as follows:

• Hierarchical (connectivity-based): Hclust and Diana
• Partitioning (centroid-based): K-Means and K-Medoids (PAM)
• Gaussian (model-based): GMM (Mclust)

In the last step, we will compute an ensemble model, using the bagging approach. This method will encompass all five algorithms and ultimately result in an optimized allocation of the European Union member states.
2.1 Hierarchical (Connectivity-Based) Method

The fundamental principle behind this method is that any object is connected to its neighbors by a certain proximity distance or degree of relationship. There are two essential aspects to decide when a hierarchical method is computed: how to calculate the distances between observations and how to calculate those between clusters. For the first aspect, multiple options are available; for example, the Euclidean, Maximum, or Manhattan formulas can be used as distance measures. For the second, dissimilarity indicators need to be computed in order to choose the optimal pair of clusters to merge at each step. One of the most frequent criteria is Ward’s method, which minimizes the total within-cluster variance. Therefore, for the proposed model, the Euclidean distance (1) and the Ward criterion (2) will be applied.

$$ d_{i,j} = \sqrt{\sum_{m=1}^{p} (x_{i,m} - x_{j,m})^2} \tag{1} $$

$$ W = \sum_{k=1}^{g} W_k, \qquad W_k = \sum_{i=1}^{n_k} \sum_{m=1}^{p} (x_{i,m} - \bar{x}_{k,m})^2 \tag{2} $$

Here p is the number of variables, g is the number of clusters, n_k is the number of observations in cluster k, and \bar{x}_{k,m} is the mean of variable m within cluster k.
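As an illustration of Eqs. (1) and (2), the following is a minimal Python sketch (the paper itself works in R, so this is not the hclust implementation). It computes the Euclidean distance, the within-cluster sum of squares W_k, and the increase in W that a merge would cause, then picks the pair of clusters with the smallest increase, which is one agglomerative step under Ward's criterion. The toy data and the helper names `euclidean`, `within_ss`, and `ward_cost` are assumptions made for this example.

```python
import math

def euclidean(x, y):
    """Eq. (1): Euclidean distance between two observations with p variables."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def within_ss(cluster):
    """W_k in Eq. (2): squared deviations of a cluster's points from its mean."""
    p = len(cluster[0])
    mean = [sum(pt[m] for pt in cluster) / len(cluster) for m in range(p)]
    return sum((pt[m] - mean[m]) ** 2 for pt in cluster for m in range(p))

def ward_cost(a, b):
    """Increase in the total W caused by merging clusters a and b."""
    return within_ss(a + b) - within_ss(a) - within_ss(b)

# Hypothetical toy data: three clusters of two-variable observations.
clusters = [[(1.0, 1.0), (1.0, 2.0)], [(2.0, 1.0)], [(8.0, 8.0), (9.0, 8.0)]]

# One agglomerative step: merge the pair with the smallest increase in W.
pairs = [(i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))]
best_pair = min(pairs, key=lambda ij: ward_cost(clusters[ij[0]], clusters[ij[1]]))
print(best_pair, round(ward_cost(clusters[0], clusters[1]), 4))
```

Repeating this step until a single cluster remains would produce the merge sequence that a dendrogram displays.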
Within the hierarchical family of methods, there are two general approaches. The first is the divisive one (top-down), which starts from a single class of objects that is divided step by step into n smaller entities; the second is the agglomerative one (bottom-up), which starts from n entities that are merged step by step until a final number of clusters is reached. In the current model, we will use the Hclust function as the agglomerative method and Diana as the divisive one.

2.2 Partitioning (Centroid-Based) Method

This type of technique constructs various partitions, evaluating the closeness of observations to a central value based on a specific criterion. It is mostly used when the number of classes is known a priori, performing iterative relocations until the grouped entities reach maximum similarity. That is why, most of the time, we need to identify the optimal number of clusters and pass that selection to the algorithm. Mathematically, it can be described as follows: given a dataset D of n objects, find a partition into k clusters, with k ≤ n. An important note: after the method is applied, each cluster must contain at least one object and each object must belong to exactly one cluster. From this family of methods, two distinct algorithms will be presented: K-Means and K-Medoids. K-Means is a centroid-based technique that uses distances between points as the cluster formation rule. It was first proposed by MacQueen in 1967 [37] and the principle behind it is simple: every cluster has a center (or centroid) and each observation is assigned to the closest centroid, based on distances. The algorithm starts by partitioning the objects randomly into k subsets. Initially, the centroids are the mean points of the subsets, and each object is assigned based on the distance between itself and the respective centroid.
After the first allocation, the algorithm iterates, recalculating the centroids and the distances until it reaches a stable solution. The mathematical formula behind it is given in Eq. (3), where x is an object, C_i is the i-th cluster, m_i is the mean (centroid) of C_i, and k is the number of clusters. The value E is the objective criterion to be minimized, namely the sum of squared deviations from the cluster means.

$$ E = \sum_{i=1}^{k} \sum_{x \in C_i} |x - m_i|^2 \tag{3} $$
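The iteration described above can be sketched in a few lines of Python (an illustration only; the paper uses the kmeans function in R). As a simplification, this sketch starts from explicitly supplied centroids rather than from a random partition, so that the run is reproducible. The data and the names `sq_dist`, `mean_point`, and `kmeans` are assumptions for the example.

```python
def sq_dist(x, y):
    """Squared Euclidean distance, the term |x - m_i|^2 in Eq. (3)."""
    return sum((a - b) ** 2 for a, b in zip(x, y))

def mean_point(points):
    """Component-wise mean of a list of points."""
    return tuple(sum(c) / len(points) for c in zip(*points))

def kmeans(points, initial_centroids, max_iter=100):
    """Alternate assignment and centroid updates until the labels stop changing."""
    centroids = list(initial_centroids)
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(len(centroids)), key=lambda i: sq_dist(p, centroids[i]))
                      for p in points]
        if new_labels == labels:
            break
        labels = new_labels
        for i in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:  # keep the previous centroid if a cluster becomes empty
                centroids[i] = mean_point(members)
    # Objective E of Eq. (3) for the final allocation.
    E = sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))
    return labels, centroids, E

# Hypothetical toy data with two visible groups; the start is deliberately poor.
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
labels, centroids, E = kmeans(points, initial_centroids=[points[0], points[1]])
print(labels, round(E, 4))
```

Because the result depends on the starting centroids, practical implementations usually repeat the procedure from several starts and keep the run with the lowest E.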
K-Medoids is a method based on the concept of a medoid, the point in a cluster whose total dissimilarity to all other points in the cluster is minimal. The technique was developed by Kaufman and Rousseeuw in 1990 [38], who introduced the Partitioning Around Medoids (PAM) algorithm. The formula behind it is given in Eq. (4), where C_i is the medoid and P_i is any object assigned to it. The main difference from K-Means lies in the medoid selection: PAM tries out the objects as candidate medoids, computes the resulting cost c, and only then keeps the objects that yield the lowest value.

$$ c = \sum_{C_i} \sum_{P_i \in C_i} |P_i - C_i| \tag{4} $$
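To make the cost in Eq. (4) concrete, here is a small Python sketch. Note the substitution: instead of PAM's build and swap heuristic, it evaluates every possible set of k medoids exhaustively, which is only feasible for tiny datasets but makes the criterion easy to see. The data and the names `dist`, `medoid_cost`, and `best_medoids` are hypothetical.

```python
import math
from itertools import combinations

def dist(x, y):
    """Euclidean distance between two objects."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def medoid_cost(points, medoids):
    """c in Eq. (4): each object contributes its distance to the nearest medoid."""
    return sum(min(dist(p, m) for m in medoids) for p in points)

def best_medoids(points, k):
    """Exhaustive search over all k-subsets of objects (tiny datasets only)."""
    return min(combinations(points, k), key=lambda ms: medoid_cost(points, ms))

# Hypothetical toy data with two visible groups.
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
medoids = best_medoids(points, 2)
print(medoids, medoid_cost(points, medoids))
```

Unlike a centroid, each medoid is always one of the observed objects, which is why the method can be run on any dissimilarity measure.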
2.3 Gaussian (Model-Based) Method

Whereas the previous methods belong to the hard classification group, model-based algorithms perform soft classification. That means that besides the actual allocation of objects into clusters, the algorithms also provide the probabilities that the respective data points belong to each cluster. The fundamental principle is that any cluster can be represented mathematically by a parametric probability distribution. Moreover, each data entry is considered to be drawn from a mixture of components, which helps fit the dataset to an optimized distribution. If that distribution is normal (Gaussian), the algorithm becomes a Gaussian Mixture Model (GMM), with each component described by Eq. (5), where μ is the mean vector, Σ is the covariance matrix, and p is the number of variables.

$$ N(x : \mu, \Sigma) = \frac{1}{(2\pi)^{p/2} |\Sigma|^{1/2}} \exp\left( -\frac{1}{2} (x - \mu)^{T} \Sigma^{-1} (x - \mu) \right) \tag{5} $$
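The idea of soft classification can be illustrated with a short Python sketch. For simplicity it uses the univariate case (p = 1) of Eq. (5), so the covariance matrix reduces to a variance. Given assumed mixture weights, means, and variances, it returns the probability that a point belongs to each component. All parameter values and the names `normal_pdf` and `responsibilities` are assumptions for this example, not values from the paper.

```python
import math

def normal_pdf(x, mu, var):
    """Univariate case (p = 1) of Eq. (5)."""
    return math.exp(-0.5 * (x - mu) ** 2 / var) / math.sqrt(2 * math.pi * var)

def responsibilities(x, weights, means, variances):
    """Probability that x was generated by each mixture component."""
    dens = [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
    total = sum(dens)
    return [d / total for d in dens]

# Hypothetical two-component mixture with equal weights and unit variances.
weights, means, variances = [0.5, 0.5], [0.0, 4.0], [1.0, 1.0]
r_mid = responsibilities(2.0, weights, means, variances)  # equidistant from both means
r_low = responsibilities(0.5, weights, means, variances)  # close to the first mean
print([round(r, 4) for r in r_mid], [round(r, 4) for r in r_low])
```

A hard allocation can be recovered by assigning each point to the component with the highest probability, while the probabilities themselves show how uncertain that allocation is.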
This method is seen as a more complex version of K-Means, being an expectation-maximization technique that calculates the likelihood that a given data point comes from a respective mixture of Gaussian distributions. To do that, the parameters used are the mean, the covariance, and the mixture weights, with the final goal of maximizing that likelihood. The maximization is achieved by iteratively updating the parameters and, implicitly, the cluster proportions, centers, and shapes. From a general perspective, compared to K-Means, GMM has the potential to analyze complex data and handle outliers more efficiently.

2.4 Ensemble Models

Finally, ensemble techniques are among the most in-demand machine-learning models, due to their potential to optimize algorithms and improve learning results. The main principle is the combination of several models that together produce better predictive performance than each applied individually. There are three main categories in the ensemble family: bagging, boosting, and stacking, all three having the power to solve complex problems and reduce variance. For the current paper, we will only study bagging, since it meets our goal of combining multiple algorithms for enhanced learning. The algorithm works as follows: starting from several weak learners, applied independently of each other, a deterministic averaging process combines them to obtain a model with less variance. Although it is mostly used for decision trees, this technique can be applied successfully to clustering exercises, improving performance and offering a final verdict. Therefore, in the proposed model we will use consensus clustering, a robust bagged ensemble that can integrate all the algorithms previously described.
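One common way to combine several partitions into a consensus is a co-association matrix, sketched below in Python. This is an assumption about how such a combination can be done, not a description of the function used in the paper. For every pair of objects, the matrix records the share of partitions that place them in the same cluster; objects that co-cluster in a majority of partitions are then linked, and the connected groups form the consensus clusters. The input labelings and the names `co_association` and `consensus_labels` are hypothetical.

```python
def co_association(labelings):
    """Share of partitions in which each pair of objects lands in the same cluster."""
    n = len(labelings[0])
    return [[sum(lab[i] == lab[j] for lab in labelings) / len(labelings)
             for j in range(n)] for i in range(n)]

def consensus_labels(labelings, threshold=0.5):
    """Link objects that co-cluster in more than `threshold` of the partitions,
    then label the connected components."""
    coassoc = co_association(labelings)
    n = len(coassoc)
    labels = [None] * n
    current = 0
    for start in range(n):
        if labels[start] is not None:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] is None and coassoc[i][j] > threshold:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels

# Hypothetical output of three algorithms on five objects.
# Cluster names are arbitrary, so only co-membership matters.
labelings = [
    [0, 0, 0, 1, 1],
    [1, 1, 0, 0, 0],
    [0, 0, 0, 1, 1],
]
consensus = consensus_labels(labelings)
print(consensus)
```

Working with co-membership rather than raw labels avoids the problem that "cluster 1" in one algorithm need not correspond to "cluster 1" in another.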
3 Dataset and Results

As mentioned before, the dataset comprises several indicators that best describe the green energy, digitalization, and R&D phenomena (Table 1). Starting with the first one, the green energy transition is described by four main variables referring to the share of energy that comes from renewable sources, the total value of environmental taxes, the
greenhouse gas emissions, and the energy import dependency. Secondly, digitalization focuses on four variables concerning general computer and internet use and access. Lastly, the R&D indicators describe the domestic budget and expenditure devoted to it, as well as the associated human resources. All variables have been extracted from the Eurostat database, with the respective EU taxonomy code attached, and aim to illustrate the new economic model in a simplified way.

Table 1. Indicators Description

| Vector | Indicator | Description | EU Taxonomy |
|---|---|---|---|
| GREEN ENERGY | REN_ENERGY | Share of energy from renewable sources (%) | [NRG_IND_REN] |
| GREEN ENERGY | ENV_TAX_REV | Environmental tax revenues (%) | [ENV_AC_TAX] |
| GREEN ENERGY | GAS_EMISSIONS | Greenhouse gas emissions (%) | [SDG_13_20] |
| GREEN ENERGY | ENERGY_DEP | Energy import dependency (%) | [SDG_07_50] |
| DIGITALIZATION | INT_ACCESS | Level of internet access – households (%) | [TIN00134] |
| DIGITALIZATION | INT_USE | Internet use by individuals in the last 3 months (%) | [TIN00028] |
| DIGITALIZATION | INT_NEVER | Individuals who have never used the internet (%) | [TIN00028] |
| DIGITALIZATION | COMP_EMP | Use of computers and the internet by employees (%) | [ISOC_CI_CM_PN2] |
| R&D | GDE_R&D | Gross domestic expenditure on R&D (%) | [SDG_09_10] |
| R&D | R&D_PERSONNEL | R&D personnel (%) | [SDG_09_30] |
| R&D | HRST | Human resources in science and technology (%) | [TSC00025] |
| R&D | GBARD | Government budget allocation for R&D (%) | [GBA_NABSTE] |
Therefore, the 12 indicators are applied to the 27 EU member states, for the year 2016 in comparison to 2020. A methodological note: the initial model was computed for all five years between 2016 and 2020, but since the clustering results were not significantly altered, it was limited to the comparison between the two end years.

3.1 Descriptive Statistics

A summary of the descriptive statistics (min, mean, max, and standard deviation) can be found in Table 2, while Fig. 1 shows the evolution of each indicator. For the green energy category, we can observe that the average share of renewable energy has increased considerably, while the level of gas emissions has dropped on average (by 7 percentage points), which is a good result considering the sustainability goals imposed by the international institutions. Regarding digitalization, both the access and use indicators have improved, while the share of individuals who have never used the internet has decreased significantly, from 16.3% to 9.3%. At the same time, the R&D
indicators show an evolution, but one not as visible as in the other two vectors. The level of HRST seems to be increasing at a faster pace, while the actual budget allocations increased only slightly. Regarding the standard deviation, we can see that for the green energy and R&D fields the values are very similar in 2016 and 2020, which suggests harmonized progress across all member states and therefore supports our first supposition. For the digitalization axis, the dispersion drops in 2020, meaning once again that the general evolution has been synchronizing at a regional level, mainly due to the joint efforts and guidelines imposed by the European institutions.

Table 2. Descriptive Statistics - 2016 Versus 2020
Fig. 1. Average Indicators Evolution, 2016 – 2020
After briefly analyzing the evolution of the data, we proceed with the standardization of the variables. Since there are no significant differences in the distribution of clusters between the standardized and unstandardized data, we will proceed with the latter. Regarding the outliers, for 2016 only one outlier has been identified, for the GAS_EMISSIONS variable, in the case of Malta. For 2020, we identify two outlier values, for Sweden and Finland, on the variables COMP_EMP and REN_ENERGY. However, with only three deviant values, we will not remove them from the analysis, so as to keep a complete comparison of the 27 EU member states over the years. In the next section, we illustrate the results obtained for all five clustering algorithms applied.

3.2 Applied Algorithms

Hierarchical Method. As mentioned before, the distances between observations are calculated using the Euclidean distance, while Ward’s method (ward.D2) is used for computing the distances between clusters. We present two types of hierarchical clustering: agglomerative, using the hclust function, and divisive, using the diana algorithm. The resulting dendrograms are shown in Fig. 2 and Fig. 3.
Fig. 2. Cluster Dendrograms, Agglomerative Method (hclust) - 2016 Versus 2020
Fig. 3. Cluster Dendrograms, Divisive Method (diana) - 2016 Versus 2020
The optimal number of clusters. The next step is to identify the appropriate number of clusters. To do that, we compare three methods: Elbow, NbClust, and Silhouette. For both hierarchical algorithms the outputs are similar, so we present the results only for the agglomerative one (2016 versus 2020).
The Elbow method is based on the Within-cluster Sum of Squares (WSS), the sum of the squared distances between each object (x_i) and the centroid of its cluster (c_i) (6).

$$ WSS = \sum_{i} (x_i - c_i)^2 \tag{6} $$

The method is frequently presented graphically and interpreted as follows: as the value of k increases, the value of WSS decreases steeply, until it reaches a plateau. The point where the curve bends before that plateau, namely the elbow, corresponds to the optimal number of clusters.
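A minimal Python sketch of the quantity behind the Elbow plot follows. It evaluates Eq. (6) for three hand-written partitions of a toy dataset (k = 1, 2, 3) and compares the successive drops in WSS. In practice the partitions would come from running a clustering algorithm for each k; the data, the partitions, and the name `wss` are assumptions made for the example.

```python
def wss(points, labels):
    """Eq. (6): squared distance of every object to the centroid of its cluster."""
    total = 0.0
    for lab in set(labels):
        members = [p for p, l in zip(points, labels) if l == lab]
        centroid = [sum(c) / len(members) for c in zip(*members)]
        total += sum((x - c) ** 2 for p in members for x, c in zip(p, centroid))
    return total

# Hypothetical toy data and hand-written partitions for k = 1, 2, 3.
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
partitions = {
    1: [0, 0, 0, 0, 0, 0],
    2: [0, 0, 0, 1, 1, 1],
    3: [0, 0, 0, 1, 1, 2],
}
wss_by_k = {k: wss(points, labels) for k, labels in partitions.items()}

# Reduction in WSS gained by moving from k - 1 to k clusters.
drops = {k: wss_by_k[k - 1] - wss_by_k[k] for k in (2, 3)}
print({k: round(v, 3) for k, v in wss_by_k.items()}, {k: round(v, 3) for k, v in drops.items()})
```

In this toy case almost all of the reduction is obtained at k = 2 and a third cluster adds very little, which is the pattern that an elbow at k = 2 represents.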
Fig. 4. Elbow Method Output – 2016 Versus 2020
In our case, the Elbow plots in Fig. 4 show a sharp bend at k = 2 clusters, so this will be considered the optimal allocation for both the 2016 and 2020 datasets. Unlike the Elbow method, NbClust is a function that uses over 30 indices to evaluate and propose the appropriate number of classes. It is a complex method with a simple presentation, the final decision being based on a majority vote (the output can be seen in Fig. 5). We can see that the decision differs between the two years compared. For 2016, the 2-cluster and 3-cluster allocations are tied (8 votes each), while for 2020 the 3-cluster allocation receives the most votes.
Fig. 5. NbClust Output – 2016 Versus 2020
Since the two previous methods did not yield a stable configuration, we can use Silhouette as a tie-breaker. The Silhouette coefficients are constructed to show how similar an object is to its own class (cohesion) compared to other clusters (separation). The coefficient is calculated with formula (7) and can
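take either negative or positive values, in the [–1, 1] interval. In formula (7), a is the mean distance between the object and the other members of its own cluster, and b is the mean distance between the object and the members of the nearest neighboring cluster. An object is well allocated when its Silhouette coefficient is close to 1; values near zero indicate an object lying between two clusters, and negative values suggest a misallocation.

$$ s = \frac{b - a}{\max(a, b)} \tag{7} $$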
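Formula (7) can be checked on a toy example with the Python sketch below, where a is taken as the mean distance to the other members of the object's own cluster and b as the smallest mean distance to any other cluster (the standard definition, assumed here). The same object is scored once under a sensible allocation and once after being deliberately mislabelled. The data and the names `dist` and `silhouette` are hypothetical, and at least two clusters are assumed.

```python
import math

def dist(x, y):
    """Euclidean distance between two objects."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def silhouette(points, labels, i):
    """Formula (7) for object i; assumes the partition has at least two clusters."""
    own = [dist(points[i], p) for j, (p, lab) in enumerate(zip(points, labels))
           if lab == labels[i] and j != i]
    if not own:  # singleton cluster: the coefficient is conventionally set to 0
        return 0.0
    a = sum(own) / len(own)
    b = min(
        sum(dist(points[i], p) for p, lab in zip(points, labels) if lab == other)
        / labels.count(other)
        for other in set(labels) if other != labels[i]
    )
    return (b - a) / max(a, b)

# Hypothetical toy data with two visible groups.
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
s_good = silhouette(points, [0, 0, 0, 1, 1, 1], 0)  # object 0 placed with its neighbors
s_bad = silhouette(points, [1, 0, 0, 1, 1, 1], 0)   # object 0 placed in the far group
print(round(s_good, 4), round(s_bad, 4))
```

The sign flip between the two calls is the behavior used in the text to flag a possible clustering error for an observation with a negative coefficient.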
In our case, the Silhouette plot has been computed for both k = 2 and k = 3. However, since the k = 3 case produced more observations with negative values, only the Silhouette plots for k = 2 are presented (Fig. 6). We can also interpret this as an argument for the final decision of allocating the states into 2 classes.
Fig. 6. Silhouette Output – 2016 Versus 2020
A special note: for the 2020 case, the 10th observation (France) takes a negative value, so we will keep in mind that it might be a clustering error. Finally, the actual allocation of the member states can be observed in Fig. 7 and Table 3. In both years we can see a cluster of highly developed countries (Pioneers) and one of states that are still finding their rhythm in the transition to the new economic model (Transitioners).
Fig. 7. Clustering Output – Hierarchical Method - 2016 Versus 2020
Table 3. Clustering Distribution - Hierarchical Method – 2016 Versus 2020

| Year | Cluster 1 PIONEERS | Cluster 2 TRANSITIONERS |
|---|---|---|
| 2016 | Sweden, Finland, Denmark, Germany, Netherlands, Luxembourg, Austria, Ireland, Belgium, France | Estonia, Czechia, Slovenia, Slovakia, Lithuania, Malta, Cyprus, Poland, Hungary, Croatia, Greece, Latvia, Lithuania, Romania, Bulgaria, Portugal, Spain |
| 2020 | Sweden, Finland, Denmark, Germany, Netherlands, Luxembourg, Austria, Ireland, Belgium, Spain | Estonia, Czechia, Slovenia, Slovakia, Lithuania, Malta, Cyprus, Poland, Hungary, Croatia, Greece, Latvia, Lithuania, Romania, Bulgaria, Portugal, France |
What is interesting is that we do not see an inferior cluster, with states that underperform or have not even started to make efforts in the direction of green energy, digitalization, or R&D investments. That could be attributed to the several strategies that made national initiatives mandatory, so that the efforts made by the member states would be equalized. However, one country managed to move from the second cluster to the first over the years: Spain. In addition, France has been allocated to the second cluster in 2020 but, bearing in mind the Silhouette results, we can assume that it was wrongly allocated and truly belongs in the list of pioneers.

Partitioning Method. K-Means. Moving on to the second method of clustering, this time the algorithm requires the number of clusters to be specified in advance. We use the results generated in the hierarchical section and apply the kmeans function for a k = 2 cluster allocation. The classification is shown in Fig. 8 and Table 4.
Fig. 8. Clustering Output – K-Means - 2016 Versus 2020
Table 4. Clustering Distribution – K-Means - 2016 Versus 2020

| Year | Cluster 1 PIONEERS | Cluster 2 TRANSITIONERS |
|---|---|---|
| 2016 | Sweden, Finland, Denmark, Germany, Netherlands, Luxembourg, Austria, Estonia, France, Ireland, Belgium | Slovenia, Slovakia, Lithuania, Malta, Cyprus, Poland, Hungary, Croatia, Greece, Latvia, Lithuania, Romania, Bulgaria, Portugal, Spain, Czechia |
| 2020 | Sweden, Finland, Denmark, Germany, Netherlands, Luxembourg, Austria, Estonia, France, Ireland, Belgium, Spain, Czechia | Slovenia, Slovakia, Lithuania, Malta, Cyprus, Poland, Hungary, Croatia, Greece, Latvia, Lithuania, Romania, Bulgaria, Portugal |
When interpreting the results, we can observe that the main configuration remains unchanged, with a couple of differences between 2016 and 2020. The first is that over the five-year period two countries have made the transition to the Pioneers cluster: Spain (the same result obtained with the hierarchical method) and Czechia. Another small distinction is that for 2016 the algorithm also includes Estonia as a member of the highly developed cluster, as opposed to the previous method. Overall, however, the results are extremely similar, which once again supports the initial supposition that a certain level of stability and synchronized evolution can be seen at the regional level. More than that, we can see that the Northern and Western states are all clustered in the Pioneers group, which means that they do register slightly better results, leading this whole process of transitioning.

K-Medoid (PAM). A second algorithm from the partitioning family is Partitioning Around Medoids (PAM). It is applied similarly to K-Means and, since we already have the optimal number of clusters, we run the pam function in RStudio and show the final distribution of states in Fig. 9 and Table 5.
Fig. 9. Clustering Output – K-Medoid - 2016 Versus 2020
Table 5. Clustering Distribution – K-Medoid - 2016 Versus 2020

| Year | Cluster 1 PIONEERS | Cluster 2 TRANSITIONERS |
|---|---|---|
| 2016 | Sweden, Finland, Denmark, Germany, Netherlands, Belgium, Luxembourg, Austria, France, Estonia, Spain, Czechia, Ireland | Slovenia, Slovakia, Lithuania, Malta, Cyprus, Poland, Hungary, Croatia, Greece, Latvia, Lithuania, Romania, Bulgaria, Portugal |
| 2020 | Sweden, Finland, Denmark, Germany, Netherlands, Belgium, Luxembourg, Austria, France | Slovenia, Slovakia, Lithuania, Malta, Cyprus, Poland, Hungary, Croatia, Greece, Latvia, Lithuania, Romania, Bulgaria, Portugal, Estonia, Spain, Czechia, Ireland |
In this case, however, the results differ when it comes to the evolution of the member states. From an allocation of 13 countries in the Pioneers cluster in 2016, only 9 observations have been classified in the same group in 2020. As a result, Estonia, Spain, Czechia, and Ireland were downgraded to the Transitioners cluster. That could be explained by the fact that PAM is less sensitive to outliers and can homogenize the data more easily than K-Means. Since in 2020 the analyzed indicators register higher levels overall, especially for the Nordic countries, the four countries mentioned above did not match that performance in comparison, so the algorithm did not allocate them to the same class. Nonetheless, the allocation of the Northern and Western countries to the highly developed cluster remains valid, as in the previous cases.

Model-Based Method. The Gaussian Mixture Model is the last algorithm employed as a clustering method. To compute it, we used the model-based function Mclust in RStudio. It fits the mixture by maximizing the likelihood function and uses the Bayesian Information Criterion to select the model, which also protects the results against overfitting. The algorithm requires the number of clusters, so we use k = 3, since for the k = 2 allocation only the Nordic countries were assigned to a distinct group. Consequently, a separate cluster was created to comprise the three Nordic countries, which we will name the Performers. The other two classes keep their names, but their distribution between 2016 and 2020 shows an interesting dynamic. For example, five countries (Czechia, Malta, Poland, Slovakia, and Hungary) move down from the Pioneers cluster to the Transitioners one in 2020. However, the reverse phenomenon can also be observed, Austria being the country that managed to jump to the superior category. The complete distribution of states can be seen in Fig. 10 and Table 6.
Fig. 10. Clustering Output – Mclust - 2016 Versus 2020
Table 6. Clustering Distribution – Mclust - 2016 Versus 2020

| Year | Cluster 1 PERFORMERS | Cluster 2 PIONEERS | Cluster 3 TRANSITIONERS |
|---|---|---|---|
| 2016 | Finland, Denmark, Sweden | Germany, Netherlands, Belgium, Luxembourg, France, Spain, Ireland, Czechia, Malta, Poland, Slovakia, Hungary | Slovenia, Lithuania, Cyprus, Croatia, Greece, Latvia, Lithuania, Romania, Bulgaria, Portugal, Estonia, Austria |
| 2020 | Finland, Denmark, Sweden | Germany, Netherlands, Belgium, Luxembourg, France, Spain, Ireland, Austria | Slovenia, Lithuania, Cyprus, Croatia, Greece, Latvia, Lithuania, Romania, Bulgaria, Portugal, Estonia, Czechia, Malta, Poland, Slovakia, Hungary |
3.3 Ensemble Model

Although the results obtained so far give the general picture, there are some differences that complicate the final distribution. Therefore, we apply an ensemble technique, namely the consensus_cluster function, in order to decide on an optimal assignment. For this method, we have set the number of repeats to 10, so each algorithm is rerun 10 times. We have run the algorithm first for a 2-cluster allocation, with the results available in Table 7. Regarding the results, there are only two differences from the 2016 allocation, Estonia being downgraded and Spain upgraded, similar to the hierarchical clustering results. Overall, in 2020 the West and the North are, as expected, part of the highly developed
cluster, while the South and East European countries are marked as still transitioning (Fig. 11).
Table 7. Clustering Distribution – Ensemble Model – 2016 Versus 2020
2016, Cluster 1 (PIONEERS): Sweden, Finland, Denmark, Germany, Netherlands, Luxembourg, Austria, Ireland, Belgium, France, Estonia
2016, Cluster 2 (TRANSITIONERS): Czechia, Slovenia, Slovakia, Lithuania, Malta, Cyprus, Poland, Hungary, Croatia, Greece, Latvia, Lithuania, Romania, Bulgaria, Portugal, Spain
2020, Cluster 1 (PIONEERS): Sweden, Finland, Denmark, Germany, Netherlands, Luxembourg, Austria, Ireland, Belgium, Spain
2020, Cluster 2 (TRANSITIONERS): Czechia, Slovenia, Slovakia, Lithuania, Malta, Cyprus, Poland, Hungary, Croatia, Greece, Latvia, Lithuania, Romania, Bulgaria, Portugal, France, Estonia
Fig. 11. Clustering Final Mapping – Ensemble Model – 2016 Versus 2020
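The idea behind a consensus assignment can be illustrated with a simple majority vote over repeated clustering runs. This is only a sketch of the voting principle in Python: the actual consensus_cluster procedure used in the paper may work differently (for example through resampling and consensus matrices), the labels are assumed to be already aligned across runs, and the units and votes below are hypothetical placeholders, not the paper's results.

```python
from collections import Counter

# Hypothetical labels from five clustering runs for three placeholder units.
# These values are invented for illustration only.
runs = {
    "unit_a": ["Pioneers", "Pioneers", "Pioneers", "Transitioners", "Pioneers"],
    "unit_b": ["Transitioners", "Transitioners", "Pioneers", "Transitioners", "Transitioners"],
    "unit_c": ["Pioneers", "Transitioners", "Pioneers", "Transitioners", "Pioneers"],
}


def majority_vote(labels):
    """Return the most frequent label and the share of runs that agree with it."""
    label, count = Counter(labels).most_common(1)[0]
    return label, count / len(labels)


consensus = {unit: majority_vote(labels) for unit, labels in runs.items()}
for unit, (label, agreement) in consensus.items():
    print(f"{unit}: {label} (agreement {agreement:.0%})")
```

The agreement share gives a rough indication of how stable each assignment is across runs; units with a share close to one half are the ones for which the individual methods disagree.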
Since the results were not surprising and actually synchronized with the general macroeconomic evolution of the region, for a more nuanced and detailed overview, we have also run the ensemble model for the three clusters’ distribution (Performers, Pioneers, and Transitioners). The complete allocation can be observed in Fig. 12 and will ultimately show once again that the Nordic countries are the most performant ones when it comes to green energy, digitalization, and R&D.
Fig. 12. Clustering Final Mapping – Ensemble Model – 2016 Versus 2020
3.4 Confirming Suppositions
Having illustrated the clustering results, we can now conclude that both initial suppositions have been confirmed. Regarding the first one (S1), we have shown that the new economic model generated relatively few movements over the five-year period and a constant level of growth. This was indicated from two directions: first by analyzing the descriptive statistics and the standard deviation, and then by capturing the dynamics in the cluster allocations. Secondly, although the new model might surprise with the constancy it demonstrates at a regional level, the Northern and Western countries perform slightly better, being the first to adopt the EU action plans and proving their commitment to the green, digital, and innovative economy. In other words, the differences in the continental structure come more from a different pace and rhythm of transition than from a completely different vision. Therefore, we can consider that the second supposition has also been validated.
4 Conclusions
To sum up, green energy, digitalization, and R&D are key factors for the new economic order, having the potential to reconstruct entire flows, from the adjacent industry to the production value and even the well-being of the population. Each has a strong disruptive component and different parameters that need to be studied before reaching a full transition. Green energy is extremely dependent on public regulations, and it continues to gain huge popularity. Digitalization is a vast field that unifies both the public and the private sectors and directly affects every aspect of our lifestyle. R&D is strongly connected to the prior two elements, playing a massive role in the attractiveness of a country for foreign investors. When all three are combined, we can profile a new economic model, defined as a direct goal by the European Union. Having that in mind, the exercise of mapping the European countries and profiling their evolution has shown that the EU’s strategy is an efficient one, providing national growth and stability at a regional level. The new economic framework has the potential to balance the international scene and bring about a healthy macroeconomic equilibrium,
without any disadvantaged countries. However, the transition is a long-term process and although the progress is visible, it will not be immediate. Acknowledgments. This paper was co-financed by The Bucharest University of Economic Studies during the PhD program of Andreea Pernici.
References 1. European Commission. https://ec.europa.eu/commission/presscorner/detail/en/IP_22_1467. Accessed 11 Oct 2022 2. European Commission. https://ec.europa.eu/info/strategy/priorities-2019-2024/europeangreen-deal_en. Accessed 11 Oct 2022 3. European Commission. https://ec.europa.eu/info/strategy/priorities-2019-2024/europe-fitdigital-age_en. Accessed 11 Oct 202 4. Midilli, A., Dincer, I., Ay, M.: Green energy strategies for sustainable development. Energy Policy 34(18), 3623–3633 (2006) 5. UN Homepage. https://unstats.un.org/sdgs/report/2021/goal-13/. Accessed 09 Oct 2022 6. Mundaca, L., Neij, L., Markandya, A., Hennicke, P., Yan, J.: Towards a green energy economy? assessing policy choices, strategies and transitional pathways. Appl. Energy 179, 1283–1292 (2016) 7. Mundaca, L., Richter, J.L.: Asessing ‘green energy economy’ stimulus packages: evidence from the US programs targeting renewable energy. Renew. Sustain. Energy Rev. 42, 1174– 1186 (2015) 8. Wüstenhagen, R., Bilharz, M.: Green energy market development in Germany: effective public policy and emerging customer demand. Energy Policy 34(13), 1681–1696 (2006) 9. Yushchenko, A., Patel, M.K.: Contributing to a green energy economy? a macroeconomic analysis of an energy efficiency program operated by a Swiss utility. Appl. Energy 179, 1304–1320 (2016) 10. Markandya, A., Arto, I., González-Eguino, M., Román, M.V.: Towards a green energy economy? tracking the employment effects of low-carbon technologies in the European Union. Appl. Energy 179, 1342–1350 (2016) 11. Knuth, S.: Green devaluation: Disruption, divestment, and decommodification for a green economy. Capital. Nat. Social. 28(1), 98–117 (2017) 12. Schuelke-Leech, B.A.: Disruptive technologies for a green new deal. Curr. Opin. Environ. Sci. Health 21, 100245 (2021) 13. Lai, J.P., Chang, Y.M., Chen, C.H., Pai, P.F.: A survey of machine learning models in renewable energy predictions. Appl. Sci. 10(17), 5975 (2020) 14. 
Buturache, A.N., Stancu, S.: Wind energy prediction using machine learning. Low Carbon Economy 12, 1–21 (2021) 15. Jabeur, S.B., Khalfaoui, R., Arfi, W.B.: The effect of green energy, global environmental indexes, and stock markets in predicting oil price crashes: evidence from explainable machine learning. J. Environ. Manage. 298, 113511 (2021) 16. Leibniz, G.W.: An explanation of binary arithmetic using the characters 0 and 1, with remarks about its utility and the meaning it gives to the ancient Chinese figures of Fuxi. Memoires de l’Académie Royale des Sciences, 3, 85–93 (1703) 17. Valenduc, G., Vendramin, P.: Digitalisation, between disruption and evolution. Trans. Euro. Rev. Labour Res. 23(2), 121–134 (2017)
Performing Different Clustering Methods for Mapping
461
18. Tajudeen, F.P., Nadarajah, D., Jaafar, N.I., Sulaiman, A.: The impact of digitalisation vision and information technology on organisations’ innovation. Euro. J. Innov. Manag. 25(2), 607– 629 (2021) 19. Salvi, A., Vitolla, F., Rubino, M., Giakoumelou, A., Raimo, N.: Online information on digitalisation processes and its impact on firm value. J. Bus. Res. 124, 437–444 (2021) 20. Bechtsis, D., Tsolakis, N., Vlachos, D., Iakovou, E.: Sustainable supply chain management in the digitalisation era: the impact of automated guided vehicles. J. Clean. Prod. 142, 3970–3984 (2017) 21. Degryse, C.: Digitalisation of the economy and its impact on labour markets. ETUI research paper-working paper (2016) 22. European Comission Shaping Europe’s digital future. https://digital-strategy.ec.europa.eu/en/ policies/desi. Accessed 09 Oct 2022 23. Petit, M.L., Sanna-Randaccio, F.: Endogenous R&D and foreign direct investment in international oligopolies. Int. J. Ind. Organ. 18(2), 339–367 (2000) 24. Ho, Y.P., Wong, P.K., Toh, M.H.: The impact of R&D on the Singapore economy: an empirical evaluation. Singapore Econ. Rev. 54(01), 1–20 (2009) 25. Zachariadis, M.: R&D-induced Growth in the OECD? Rev. Dev. Econ. 8(3), 423–439 (2004) 26. Soete, L.,Verspagen, B., Ter Weel, B: Systems of innovation. In Handbook of the Economics of Innovation, vol. 2, pp. 1159–1180 (2010) 27. Hall, B.H., Lerner, J.: The financing of R&D and innovation. In: Handbook of the Economics of Innovation, vol. 1, pp. 609–639 (2010) 28. OECD Research and Development Statistics (RDS) Homepage https://www.oecd.org/sci ence/inno/researchanddevelopmentstatisticsrds.htm. Accessed 10 Nov 2022 29. Czarnitzki, D., Licht, G.: Additionality of public R&D grants in a transition economy: the case of Eastern Germany. Econ. Transit. 14(1), 101–131 (2006) 30. Ozturk, E.: The impact of R&D sourcing strategies on basic and developmental R&D in emerging economies. Euro. J. Innov. Manag. 21(4), 522–542 (2018) 31. 
Kacperska, E., Łukasiewicz, K., Pietrzak, P.: Use of renewable energy sources in the European Union and the visegrad group countries—results of cluster analysis. Energies 14(18), 5680 (2021) 32. Parobek, J., et al.: Energy utilization of renewable resources in the European Union-cluster analysis approach. BioResources 11(1), 984–995 (2016) 33. Pacesila, M., Burcea, S.G., Colesca, S.E.: Analysis of renewable energies in European Union. Renew. Sustain. Energy Rev. 56, 156–170 (2016) 34. Mathieu, R.G., Gibson, J.E.: A methodology for large-scale R&D planning based on cluster analysis. IEEE Trans. Eng. Manage. 40(3), 283–292 (1993) 35. Watts, R.J., Porter, A.L.: R&D cluster quality measures and technology maturity. Technol. Forecast. Soc. Chang. 70(8), 735–758 (2003) 36. Pernici, A., Stancu, S.: Constructing a composite indicator for measuring the socio-economic development of a country, using PCA and Machine Learning classification models, to appear in the 40th IBIMA Conference. Seville, Spain (2022) 37. MacQueen, J.: Some methods for classification and analysis of multivariate observations. Comput. Chem. 4, 257–272 (1967) 38. Kaufman, L., Rousseeuw, P.J.: Partitioning Around Medoids (Program PAM). Wiley Series in Probability and Statistics. John Wiley & Sons, Inc., Hoboken, NJ, USA (1990)
Modeling of the Modes of Operation of the AES Algorithm in the Cryptool 2 Environment
Olga Manankova(B) and Mubarak Yakubova
Almaty University of Power Engineering and Telecommunications Name After Gumarbek Daukeev, Almaty, Kazakhstan
[email protected]
Abstract. Each encryption method has its own strengths and weaknesses. To choose a suitable cryptographic algorithm for securing transmitted data, it is necessary to understand how that algorithm performs. The article discusses the modeling of four modes of the Advanced Encryption Standard (AES) algorithm using the Cryptool 2 visualization environment. The visualization tools for modern cryptographic algorithms in Cryptool 2 allow you to track the content of cryptographic transformations at each stage, which simplifies the understanding of complex algorithms in software development. The following modes are investigated: ECB (Electronic Codebook), CFB (Cipher Feedback), OFB (Output Feedback) and CBC (Cipher Block Chaining). The mode simulations presented in this article serve as a basis for investigating the performance of AES cryptosystems.
Keywords: AES · ECB · CFB · OFB · CBC · Cryptool
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 K. Arai (Ed.): SAI 2023, LNNS 739, pp. 462–469, 2023. https://doi.org/10.1007/978-3-031-37963-5_32
1 Introduction
Modern encryption methods have their advantages and disadvantages. To know which one to apply to protect transmitted data, it is necessary to understand the structure of the algorithm [1–4]. AES (Advanced Encryption Standard), also known as Rijndael, is a symmetric block cipher that has been well analyzed and is now widely used. It encrypts a 128-bit plaintext block into a 128-bit ciphertext block, or decrypts a 128-bit ciphertext block into a 128-bit plaintext block. AES-128, AES-192 and AES-256 process data blocks in 10, 12 or 14 rounds, respectively. Each round is a specific sequence of transformations and works with two 128-bit blocks: the “current” block and the “round key”. All rounds use different round keys, which are obtained using a key expansion algorithm. This algorithm is independent of the data being encrypted and can be performed independently of the encryption/decryption phase. The data block passes sequentially through the following stages: it is first XORed (added modulo 2) with the first 128 bits of the key, and the output is the “current” block (this stage is also called the zero round, since it uses the zero round key, the first 128 bits of the cipher key). The “current” block then goes through 10, 12 or 14 rounds of encryption, after which it becomes the encrypted (or decrypted) block.
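The zero round described above is a plain byte-wise XOR of the data block with the first 128 bits of the key. The following minimal Python sketch shows only this step together with the round counts per key size; the names `ROUNDS` and `add_round_key` and the sample block and key are illustrative assumptions, and the remaining round transformations and the key expansion are deliberately not implemented.

```python
# Number of rounds per AES key size (in bits), as stated in the text.
ROUNDS = {128: 10, 192: 12, 256: 14}


def add_round_key(block: bytes, round_key: bytes) -> bytes:
    """XOR (modulo 2 addition) of a 128-bit block with a 128-bit round key."""
    if len(block) != 16 or len(round_key) != 16:
        raise ValueError("block and round key must both be 16 bytes (128 bits)")
    return bytes(b ^ k for b, k in zip(block, round_key))


# Illustrative values only: a 16-byte block and the first 128 bits of a key.
block = b"sixteen byte msg"
round_key_0 = bytes.fromhex("000102030405060708090a0b0c0d0e0f")

current = add_round_key(block, round_key_0)   # the "current" block after the zero round
print(current.hex())
print(ROUNDS[128], "further rounds follow for a 128-bit key")
```

Because XOR is its own inverse, applying `add_round_key` again with the same round key restores the original block, which is why the same operation appears in both encryption and decryption.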
The AES algorithm has eight main modes of operation. This article discusses the construction of four of them: ECB, CFB, OFB and CBC. The mode modeling presented here serves as a basis for investigating the performance of AES cryptosystems and for their future use to improve the security level of data transmitted in an IP network [5–9].
2 Simulation of AES Schemes
Visualization tools for modern cryptographic algorithms in Cryptool allow you to track the content of cryptographic transformations at each stage. This simplifies the understanding of complex algorithms in software development. The information to be encoded and the key are entered using the “Input text” functional block. Since the program does not allow you to work with text input of the key, you need to add an intermediate link to convert the key to HEX format. For this, the “String decoder” block is added, where the “Hexadecimal” function is selected. After that, we encode using the AES functional block, which specifies all the functions for converting text to cipher. Figure 1 shows the encryption setting block for the following parameters:
1. Cryptographic algorithm: AES;
2. Action: in the first case, encrypt, then decrypt;
3. Key size: 128;
4. Chaining mode: ECB, CBC, CFB or OFB.
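Outside Cryptool 2, the role of the “String decoder” block with the “Hexadecimal” setting can be approximated by a hex-to-bytes conversion. The sketch below is an assumption about what that step amounts to, not a description of the tool's internals; `key_from_hex` and the sample key are hypothetical.

```python
def key_from_hex(hex_text: str) -> bytes:
    """Convert a hexadecimal key string into the 16 raw bytes of a 128-bit key."""
    key = bytes.fromhex(hex_text)          # raises ValueError on non-hex input
    if len(key) != 16:
        raise ValueError(f"expected 16 bytes for a 128-bit key, got {len(key)}")
    return key


# Illustrative key only.
key = key_from_hex("00112233445566778899aabbccddeeff")
print(len(key) * 8, "bit key:", key.hex())
```

Checking the length at this point mirrors the “Key size: 128” setting in the list above: a key of any other length is rejected before it reaches the cipher.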
Fig. 1. AES Encryption Block
2.1 AES Cipher Scheme Simulation with ECB Mode
Electronic Codebook (ECB) is the simplest mode (simple replacement): the plaintext is processed sequentially, block by block, and each block is encrypted using the same key. If the message is longer than the block length of the algorithm, it is divided into blocks of the appropriate length, and the last block is padded, if necessary, with fixed values. In this mode, identical plaintext blocks are converted into identical ciphertext blocks. ECB mode is suited to small amounts of data, such as session key encryption. Its significant disadvantage is that a block of plaintext appearing more than once in a message always produces the same ciphertext, so if the message has many identical blocks, this pattern will be detected during cryptanalysis. As a result, ECB is considered insecure for large messages. To implement the ECB mode, you need to supply a key (AES is a symmetric algorithm, which means that the same key is needed for encryption and decryption). The implementation of this mode is shown in Fig. 2. We pass the text input block through the String Decoder and feed it to the input as the message to be encrypted. We feed the same block as input to the Substring () operation, which returns a substring of the original string. We enter 0 and 16 as parameters for selecting the substring, that is, we take 16 characters from the original string, starting from the zero character. Next, the extracted 16-character (128-bit) substring is converted into a byte array (Converter) and the resulting value is fed to the key input. Next, we check the cryptographic strength of AES in ECB mode through the “Search Key” component (Fig. 3). The analysis showed that once three or more key pairs are cleared, the computer takes a long time to find the key (65 days for 5 pairs). The process of decoding the text is not feasible if at least one pair is lost.
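The block-equality leak of ECB can be demonstrated without Cryptool 2. The sketch below uses only the Python standard library, so the block cipher is a toy 4-round Feistel construction built on SHA-256 that stands in for AES; it is NOT AES and is not secure, and all function names, the key and the message are illustrative assumptions. Only the mode logic (independent encryption of each 16-byte block) corresponds to the text.

```python
import hashlib

BLOCK = 16  # bytes, the AES block size


def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def _f(key: bytes, i: int, half: bytes) -> bytes:
    # Round function of the toy Feistel cipher (a stand-in, NOT AES).
    return hashlib.sha256(key + bytes([i]) + half).digest()[:8]


def toy_encrypt_block(key: bytes, block: bytes) -> bytes:
    left, right = block[:8], block[8:]
    for i in range(4):
        left, right = right, _xor(left, _f(key, i, right))
    return left + right


def toy_decrypt_block(key: bytes, block: bytes) -> bytes:
    left, right = block[:8], block[8:]
    for i in reversed(range(4)):
        left, right = _xor(right, _f(key, i, left)), left
    return left + right


def ecb(key: bytes, data: bytes, block_fn) -> bytes:
    """Apply block_fn to every 16-byte block independently (ECB)."""
    if len(data) % BLOCK:
        raise ValueError("pad the message to a multiple of 16 bytes first")
    return b"".join(block_fn(key, data[i:i + BLOCK])
                    for i in range(0, len(data), BLOCK))


key = b"0123456789abcdef"                 # illustrative 16-byte key
message = b"ATTACK AT DAWN!!" * 2         # two identical 16-byte blocks
ciphertext = ecb(key, message, toy_encrypt_block)

# Identical plaintext blocks give identical ciphertext blocks.
print(ciphertext[:BLOCK] == ciphertext[BLOCK:])
# Decryption with the same key restores the message.
print(ecb(key, ciphertext, toy_decrypt_block) == message)
```

The first comparison is the pattern that a cryptanalyst can exploit: the equality of two plaintext blocks is visible in the ciphertext regardless of how strong the underlying block cipher is.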
The key-search analysis thus illustrates the cryptographic strength of AES encryption. As a block cipher mode, ECB encrypts messages whose length is a multiple of n bits by encrypting each n-bit part separately. Its security properties are weak, since it leaks the equality of blocks, both across block positions and over time. ECB has significant legacy value and value as a building block for other schemes, but the mode does not achieve any generally desirable security goal in its own right and should be used with great care; it should not be considered a general-purpose privacy mode.
2.2 AES Cipher Scheme Simulation with CBC Mode
In the CBC (Cipher Block Chaining) mode, the message to be encrypted, the encryption key (128 bits for AES-128) and the initialization vector (also 128 bits) must be supplied as input. A block algorithm is designed to encrypt blocks of a certain length. However, it is possible to convert a block algorithm into a stream cipher algorithm using the last two modes (CFB and OFB). A stream encryption algorithm eliminates the need to break the message into an integer number of blocks of a sufficiently large length; therefore, it can work in real time. Thus, if a stream of characters is transmitted, each character can be encrypted and transmitted at once, using the character-oriented mode of the block cipher algorithm.
Fig. 2. Implementation of Encryption and Decryption in ECB Mode
Fig. 3. The Process of Finding the Key
One advantage of this stream-oriented use of a block cipher is that the ciphertext will be the same length as the original. In CBC mode, the first plaintext block and the initialization vector are XORed, and the result, together with the key, is fed to the encryption input. The resulting ciphertext block is then XORed with the next plaintext block to be encrypted, and so on, until all blocks are used. To implement the part with the first block in the Cryptool 2 environment, the result of the XOR operation between the first plaintext block and the initialization vector is fed to the message input. The key input is a 16-byte key, and the plaintext block is a 16-byte substring of the original string.
To begin with, we split the original message into 16-byte blocks, in much the same way as we did when implementing the ECB mode. We choose an initial message of 32 bytes. We separate the first 16 characters from the original message (2 Number Input blocks and a String Operations block with the Substring () operation). The resulting substring is XORed with a random 16-byte numeric sequence (that is, with the initialization vector) and the result is fed to the input of the encryption block as the input message. We supply the key to the key input of the encryption block. Since in this mode the encrypted blocks are chained, the same key is supplied to the second encryption block as to the first one. After encryption, a block of ciphertext is obtained. We pass it through the converter to get a byte array, and XOR this byte array with the byte array obtained from the plaintext of the current (second) message block. The result of the XOR is fed to the input of the second encryption block as its input message. Figures 4 and 5 show the complete scheme of the mode. Further, the analysis of AES cryptographic strength in the CBC mode through the “Search Key” component showed the same result as with the ECB mode.
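The two-stage CBC scheme described above can be sketched as follows. As an assumption made to keep the example self-contained, the block cipher is a toy 4-round Feistel construction based on SHA-256 that stands in for AES (it is NOT AES and not secure), the initialization vector is fixed for reproducibility rather than random, and all names are hypothetical. Only the chaining logic follows the text.

```python
import hashlib

BLOCK = 16  # bytes


def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def _f(key: bytes, i: int, half: bytes) -> bytes:
    # Round function of the toy stand-in cipher (NOT AES).
    return hashlib.sha256(key + bytes([i]) + half).digest()[:8]


def toy_encrypt_block(key: bytes, block: bytes) -> bytes:
    left, right = block[:8], block[8:]
    for i in range(4):
        left, right = right, _xor(left, _f(key, i, right))
    return left + right


def toy_decrypt_block(key: bytes, block: bytes) -> bytes:
    left, right = block[:8], block[8:]
    for i in reversed(range(4)):
        left, right = _xor(right, _f(key, i, left)), left
    return left + right


def cbc_encrypt(key: bytes, iv: bytes, data: bytes) -> bytes:
    previous, out = iv, []
    for i in range(0, len(data), BLOCK):
        mixed = _xor(data[i:i + BLOCK], previous)    # plaintext XOR IV / previous ciphertext
        previous = toy_encrypt_block(key, mixed)     # same key for every block
        out.append(previous)
    return b"".join(out)


def cbc_decrypt(key: bytes, iv: bytes, data: bytes) -> bytes:
    previous, out = iv, []
    for i in range(0, len(data), BLOCK):
        block = data[i:i + BLOCK]
        out.append(_xor(toy_decrypt_block(key, block), previous))
        previous = block
    return b"".join(out)


key = b"0123456789abcdef"            # illustrative 16-byte key
iv = bytes(range(16))                # fixed here for the demo; random in practice
message = b"ATTACK AT DAWN!!" * 2    # 32 bytes, two identical blocks

ciphertext = cbc_encrypt(key, iv, message)
print(ciphertext[:BLOCK] != ciphertext[BLOCK:])      # chaining hides the repetition
print(cbc_decrypt(key, iv, ciphertext) == message)
```

In contrast to ECB, the two identical plaintext blocks no longer produce identical ciphertext blocks, because the second block is mixed with the first ciphertext block before encryption.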
Fig. 4. Complete Encryption Scheme CBC Mode (Part 1 of the Scheme)
Fig. 5. Complete Encryption Scheme CBC Mode (Part 2 of the Scheme)
2.3 AES Cipher Scheme Simulation with CFB Mode
CFB (Cipher Feedback) mode is a ciphertext feedback mode (also called feedback gamming mode) in which, during encryption, each block of plaintext is added modulo 2 to the output obtained by encrypting the previous ciphertext block (the initialization vector for the first block). The encryption scheme in CFB mode is shown in Fig. 6. To demonstrate encryption, the input message is a 32-byte string, divided into two 16-byte blocks (accordingly, encryption takes place in 2 stages). This setup is the same for the remaining schemes and will not be repeated. In the first stage, we supply a 16-byte initialization vector and a key to the input of the encryption block. The encryption result (the sequence obtained at the output of the encryption block) is converted into a byte array and XORed with the first plaintext block. As a result, we get the first block of the encrypted message. In the second stage, we supply the ciphertext obtained from the first part of the message, together with the key, to the input of the encryption block. The result of encryption is converted into a byte array and XORed with the second block of plaintext. As a result, we get the second block of the encrypted message.
Fig. 6. Complete Encryption Scheme CFB Mode
2.4 AES Cipher Scheme Simulation with OFB Mode
OFB (Output Feedback) mode turns the block cipher into a synchronous stream cipher: it generates keystream blocks, which are then added modulo 2 to the plaintext blocks to get the ciphertext. Just as with other stream ciphers, flipping a bit in the ciphertext produces a flipped bit in the plaintext at the same location. This property allows many error correction codes to function normally even when error correction is applied before encryption. Encryption in OFB mode is very similar to encryption in CFB mode; the difference is that the output of the encryption block, rather than the ciphertext, is fed back to the input of the next encryption block. The implementation of this scheme in Cryptool 2 is therefore almost the same as for CFB.
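The keystream structure of OFB and its bit-flipping property can be illustrated with a short sketch. As before, the encryption function `E` is a stand-in built from HMAC-SHA256 (an assumption for self-containment, NOT AES), and the names and sample values are hypothetical. Because the keystream does not depend on the data, the same routine serves for encryption and decryption.

```python
import hashlib
import hmac

BLOCK = 16  # bytes


def E(key: bytes, block: bytes) -> bytes:
    # Keyed stand-in for the AES encryption block (NOT AES).
    return hmac.new(key, block, hashlib.sha256).digest()[:BLOCK]


def ofb_xcrypt(key: bytes, iv: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt: XOR the data with the OFB keystream."""
    output, out = iv, []
    for i in range(0, len(data), BLOCK):
        output = E(key, output)          # the cipher OUTPUT is fed back, not the ciphertext
        out.append(bytes(d ^ k for d, k in zip(data[i:i + BLOCK], output)))
    return b"".join(out)


key = b"0123456789abcdef"                          # illustrative key
iv = bytes(range(16))                              # fixed for the demo
message = b"two blocks of sixteen bytes each"      # 32 bytes

ciphertext = ofb_xcrypt(key, iv, message)
print(ofb_xcrypt(key, iv, ciphertext) == message)

# Flip one bit of the ciphertext: only the same bit of the plaintext changes.
tampered = bytearray(ciphertext)
tampered[5] ^= 0x01
recovered = ofb_xcrypt(key, iv, bytes(tampered))
changed = [i for i in range(len(message)) if recovered[i] != message[i]]
print(changed)
```

The single changed position shows why a transmission error in OFB stays confined to the affected bit instead of spreading to the rest of the block.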
3 Conclusion
In this work, using the Cryptool 2 software environment, we simulated the AES cipher scheme in different operating modes. This simulation allows you to verify its operation and monitor the results at each stage. The analysis of the cryptographic strength of each mode showed that this cipher is quite resistant to brute force, on which a large number of methods for breaking symmetric encryption systems depend. These studies will be used to improve the security of transmitted data in the IP network.
References
1. Abood, O.G., Guirguis, S.K.: A survey on cryptography algorithms. Int. J. Sci. Res. Publ. 8(7), 495–516 (2018)
2. Hossain, M., Hossain, M., Imtiaz, S., Uddin, M.: Performance analysis of different algorithms. Int. J. Adv. Res. Comput. Sci. Softw. Eng. 6(3), 659–665 (2016)
3. Usman, M., Akram, U.: Ensuring data security by AES for global software development in cloud computing. In: 2014 International Conference on IT Convergence and Security (ICITCS), pp. 1–7 (2014)
4. Islam, N., Shahid, Z., Puech, W.: Denoising and error correction in noisy AES-encrypted images using statistical measures. Sign. Process. Image Commun. 41, 15–27 (2016)
5. Ma, J., Tao, J., Keranen, M., Mayo, J., Shene, C.-K., Wang, C.: AESvisual: A Visualization Tool for AES Cipher (2014)
6. Przybylski, S., Wacker, A., Wander, M., Enkler, F., Vacek, P.: Plugin Developer Manual – How to build your own plugins for CrypTool 2 (2016). https://www.cryptool.org/trac/CrypTool2/wiki
7. Konshin, S.V., Yakubova, M.Z., Nishanbayev, T.N., Manankova, O.A.: Research and development of an IP network model based on PBX asterisk on the opnet modeler simulation package. In: International Conference on Information Science and Communications Technologies (ICISCT 2020), 20486746 (2020). https://doi.org/10.1109/ICISCT50599.2020.9351405
8. Manankova, O.A., Yakubov, B.M., Serikov, T.G., Yakubova, M.Z., Mukasheva, A.K.: Analysis and research of the security of a wireless telecommunications network based on the IP PBX Asterisk in an Opnet environment. J. Theor. Appl. Inf. Technol. 99(14), 3617–3630 (2021)
9. Manankova, O., Yakubova, M., Baikenov, A.: Cryptanalysis the SHA-256 hash function using rainbow tables. Indonesian J. Electr. Eng. Inform. (IJEEI) 10(4) (2022). https://doi.org/10.52549/ijeei.v10i4.4247
Investigating the Stability of SMOTE-Based Oversampling on COVID-19 Data
Jih Soong Tan1(B), Hui Jia Yee2, Ivan Boo2, Ian K. T. Tan3, and Helmi Zakariah2
1 Priority Dynamics Sdn Bhd, Subang Jaya, Malaysia [email protected]
2 AIME Healthcare Sdn Bhd, Kuala Lumpur, Malaysia {huijia,ivanboo teams,drhelmi}@aime.life
3 Heriot-Watt University Malaysia, Putrajaya, Malaysia [email protected]
Abstract. Predictive analytic methods for medical diagnosis can be helpful in supporting decision-making on medical treatment, which in turn reduces the need for medical experts’ attention. However, imbalanced data problems often exist in medical diagnosis datasets and negatively impact the models’ predictive performance. The results of learning algorithms on imbalanced data are biased and often overfit the majority class. The Synthetic Minority Over-sampling Technique (SMOTE) was proposed to deal with this over-fitting challenge. The application of SMOTE requires the over-sampling of the minority class(es). However, there are only vague guidelines on how much oversampling of the minority class is suitable. Therefore, experiments on oversampling using SMOTE with different oversampling ratio setups are done on a medical diagnosis dataset. It is observed that an increase in the oversampling rate reduces the accuracy and precision. Oversampling to a uniform level and excessive oversampling can cause poorer performance. Both recall and precision should be considered, based on the costs, when deciding the best oversampling percentage.
Keywords: Boosting · Data Sampling · Data Pre-processing · SMOTE · COVID-19
1 Introduction
Predictive classification analytics methods are useful in the medical domain for health diagnosis, supporting better medical treatment and reducing death rates in cases like diabetes, kidney disorders, cancers, and coronary artery disease. However, the data is often imbalanced, as the rate of these cases happening is relatively low. Without enough data, the learning algorithms cannot get enough information to achieve optimized classification results. When there are no significant differences between the classes, the learning algorithms tend to be biased toward the majority class. Therefore, the results of the learning algorithms on imbalanced data are unreliable to apply in real-life applications. This is especially important in healthcare, where any false prediction can cause life-threatening consequences.
The Synthetic Minority Over-sampling TEchnique (SMOTE) [4] is an oversampling technique that produces synthetic samples of the minority classes to address the issue of data imbalance. It creates new synthetic samples by interpolating between the numerical feature values of neighbouring minority samples. The method potentially improves the data quality, which indirectly improves the learning algorithms’ results. However, there are no proper guidelines on the amount of oversampling that should be applied to the minority class. Oversampling to a uniform ratio between classes might overfit the learning algorithms. Hence, this paper details the investigation of oversampling using SMOTE with different ratio setups on a COVID-19 dataset, to predict patients who may require hospitalisation upon being confirmed COVID-19 positive.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 K. Arai (Ed.): SAI 2023, LNNS 739, pp. 470–480, 2023. https://doi.org/10.1007/978-3-031-37963-5_33
2 Literature Review
Imbalanced data, where the number of samples per class is not balanced, is a significant problem in the machine learning field [5]. In medical data especially, the number of samples can be very imbalanced. Consider a medical diagnosis scenario with an imbalance ratio of 1:100: a classification algorithm that tries to optimize accuracy will reach about 99% accuracy just by predicting every output as the majority class. Researchers have worked on two distinct approaches to tackle the imbalanced data problem: algorithmic-level adjustments that let the training model fit the imbalanced data, and data-level augmentations.
2.1 Algorithmic Level
Most approaches for handling imbalanced data focus on the minority class, due to the typically high cost of misclassifying a minority class sample [16]. Cost-sensitive learning approaches were introduced to balance the cost of predicting majority and minority classes by assigning the cost of each class to the training models [15]. Cost-sensitive learning can be done by assigning different weights to each class in the training models and by changing the models’ threshold based on the misclassification costs [7]. However, in most real-world cases, the actual cost of each class is not easily known, even with a balanced dataset [9,19]. One-class learning is an algorithm-level approach that learns the classifier from the representation of a single class, typically the minority class. As a result, it produces a biased solution, since the learning algorithm learns from a single class only. One-class learning is mainly effective on highly imbalanced data that consists of a high-dimensional noisy feature vector [17].
2.2 Data Level
Random oversampling of the minority samples and random undersampling of the majority samples are the two common solutions for imbalanced data [9].
Both approaches aim to reduce the imbalance ratio in the dataset: random oversampling replicates randomly chosen minority samples, while random undersampling discards randomly chosen majority samples. However, both come at a cost. Random undersampling risks eliminating samples that are significant for model learning, and random oversampling risks overfitting the learning model. One-sided selection [12] is an extension of random undersampling. The approach uses Tomek links to identify noisy and borderline majority samples and the Condensed Nearest Neighbour rule (CNN, the Hart algorithm) to remove redundant majority samples that lie far from the decision border. Similarly, the Edited Nearest Neighbour rule (ENN) [21] was proposed to remove samples whose class differs from that of at least two of their three nearest neighbours. Other than extensions based on random undersampling, there are also extensions based on random oversampling. The Synthetic Minority Oversampling TEchnique (SMOTE) was proposed by Chawla et al. [4] to address the overfitting problem of random oversampling. The SMOTE algorithm rebalances the initial training set by oversampling: instead of simply replicating samples as random oversampling does, it produces new minority class samples by interpolating between a minority class sample and its neighbouring minority class samples. The algorithm thus operates on the feature values and their relationships rather than on whole data points. SMOTE has been widely accepted, and more than 85 variants have been proposed to improve its basic form. This was reported by Fernandez et al. [8], in a review co-authored by one of the original authors of SMOTE. Closely related works include Borderline-SMOTE [10], ADASYN [11], MWMOTE [3], MDO [1], and GMC-SMOTE [18]. Borderline-SMOTE, one of the extensions of SMOTE, identifies the minority samples near the class borderline and oversamples only those [10]. It treats minority samples surrounded entirely by majority samples as noise and excludes them from oversampling.
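The interpolation step of basic SMOTE described above can be sketched in a few lines of plain Python. This is a simplified illustration under stated assumptions, not the reference implementation: base samples are drawn at random rather than processed one by one, only numerical features are handled, and the function name `smote_sketch`, its parameters and the toy minority points are hypothetical.

```python
import math
import random


def smote_sketch(minority, n_new, k=5, seed=0):
    """Generate n_new synthetic points by interpolating between a randomly
    chosen minority sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest minority-class neighbours of the base sample (excluding itself)
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbour)))
    return synthetic


# Toy minority class with two numerical features (illustrative values only).
minority = [(1.0, 1.0), (2.0, 1.0), (1.0, 2.0), (2.0, 2.0)]
new_points = smote_sketch(minority, n_new=6, k=2)
for point in new_points:
    print(point)
```

Every synthetic point lies on a line segment between two existing minority samples, so the new samples stay inside the region already occupied by the minority class instead of duplicating existing points.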
ADASYN [11] is another extension of SMOTE that uses a weighted distribution to decide the number of new synthetic samples to generate for each minority sample. In this way, ADASYN creates a better-quality dataset by generating more synthetic samples for the minority samples that are harder to learn. MWMOTE identifies a set of hard-to-learn minority samples and assigns them weights based on the distance to their nearest majority samples. It then generates new synthetic samples from the weighted samples using a clustering approach, which ensures that the new synthetic samples lie inside a minority class cluster [3]. MDO [1] is an extension of SMOTE inspired by the Mahalanobis distance. The synthetic samples produced by MDO have the same Mahalanobis distance from the class mean as the other minority samples. Thus, the minority class becomes easier to learn because the covariance structure is preserved in the newly generated synthetic samples, while the risk of overlapping samples is reduced. GMC-SMOTE [18] attempts to produce a dataset with a Gaussian distribution, which occurs in many natural settings. The paper did provide insights into the limits of SMOTE. Although tremendous effort has gone into solving imbalanced data problems through sampling, there is no proper standard for defining the most suitable class distribution [20]. Oversampling the minority class to a 50:50 ratio between the minority and majority classes is not always the best approach. Hence, experiments on oversampling using SMOTE with different ratio setups are performed.
3 Experiment Setup
3.1 Experimental Datasets
We use the COVID-19 Assessment Centers (CAC) datasets, which contain comprehensive COVID-19 patient data. The datasets were generated from the collaboration between SELangkah and CAC [2]. SELangkah is a COVID-19 health-check mobile app launched by the Selangor state government in Malaysia in May 2022 [6]. The app provides features such as contact tracing, screening and vaccination registration, and patient monitoring. The datasets comprise a wide range of valuable features of interest to this study, including biological features, underlying comorbidities, symptoms, triage categorization, and demographic features. Two datasets are provided: i) the first dataset is used to identify the predictive factors that determine whether a patient requires direct admission or can be monitored under home quarantine, and ii) the second dataset is used to predict whether a home quarantine patient requires admission during their home quarantine due to the deterioration of their health status. The datasets were cleaned using the following procedures: a) remove records that contain missing values; b) remove records that contain outliers in the entry of biological data such as blood pressure, height, weight, and oxygen level; and c) remove irrelevant variables such as doctor notes. The number of rows and columns of the datasets before and after cleaning are described in Table 1.

Table 1. Comparison of Data Dimension before and after Data Cleaning.

Dataset   Before data cleaning      After data cleaning
          Rows       Columns        Rows       Columns
i         139163     113            64590      71
ii        215353     123            100105     65
After cleaning, the datasets are split into 4 age groups as advised by medical professionals at CAC:
1. ages 0–5,
2. ages 6–17,
3. ages 18–59, and
4. ages 60+
Table 2. The Data Distribution for the Prepared Dataset.

Dataset   Oversampling setup   Number of records       Class ratio
                               Class 0     Class 1
(i)(3)    Default              44077       3867        11.40:1
(i)(3)    300%                 44077       11601       3.80:1
(i)(3)    400%                 44077       15468       2.85:1
(i)(3)    500%                 44077       19335       2.28:1
(i)(3)    Uniform              44077       44077       1:1
(ii)(3)   Default              62543       12321       5.08:1
(ii)(3)   300%                 62543       36963       1.69:1
(ii)(3)   400%                 62543       49284       1.27:1
(ii)(3)   500%                 62543       61605       1.02:1
(ii)(3)   Uniform              62543       62543       1:1
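The record counts and class ratios in Table 2 follow directly from the oversampling percentage. The short sketch below reproduces them, assuming that a setup of 300% means the minority class ends up at three times its original size; the helper name `oversampled_counts` is hypothetical.

```python
def oversampled_counts(n_majority, n_minority, percentage=None):
    """Return (minority size after oversampling, majority:minority ratio).

    percentage=300 means the minority class grows to 300% of its original
    size; percentage=None means oversampling to a uniform (1:1) distribution.
    """
    if percentage is None:
        new_minority = n_majority
    else:
        new_minority = n_minority * percentage // 100
    return new_minority, round(n_majority / new_minority, 2)

# Dataset (i)(3): 44077 majority records and 3867 minority records
for pct in (100, 300, 400, 500, None):
    print(pct, oversampled_counts(44077, 3867, pct))
```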
After splitting the datasets, groups (1) and (2) contain too few records to train a prediction model, and group (4) is already balanced. Therefore, only group (3) is selected for the experiments. As a result, there are two datasets in total, namely (i)(3) and (ii)(3). A train-test split with a ratio of 80:20 is used to generate the train and test sets for the experiments, where 80% of the data is used as training data and the remaining 20% is used as test data for model evaluation.
3.2 Imbalanced Data Handling with SMOTE
The two training datasets are both imbalanced, to different degrees. Table 2 describes the prepared datasets for the training of learning algorithms. In the default setup, (i) has a much more imbalanced data distribution than (ii). Different SMOTE percentages were applied to the minority class of each dataset to study the effect of oversampling. We oversampled the minority class to 300%, 400%, and 500% of its original size as suggested by [4]. We also oversampled the minority class to a uniform level, as this is the common setup for SMOTE.
3.3 Feature Selection
The Boruta algorithm [13] was used to perform feature selection on the prepared datasets. Boruta is a feature selection wrapper algorithm built around a random forest classification algorithm, capable of working with any classification algorithm that produces variable importance measures [14]. It performs a top-down search for relevant features by comparing the importance of the original attributes with the importance achievable at random, estimated using permuted copies of the attributes. In this experiment, we ran Boruta for 100 iterations. The features ranked first by Boruta are selected as the feature set. For (i)(3), a total of 21 features are selected. For (ii)(3), a total of 29 features are selected. The selected features are described in Table 3.
3.4 Learning Algorithm
This study is concerned with the quality of the dataset, so a single suitable machine learning algorithm is chosen to maintain consistency across the experiments. The random forest (RF) algorithm is chosen among the machine learning classification algorithms. The random forest is a supervised machine learning algorithm that can be used for both classification and regression problems. It is an ensemble method that builds randomized decision trees, and the class with the most votes becomes the model's output. Aggregating the votes from multiple decision trees gives the model better accuracy and more stable predictions. The RF is implemented using the scikit-learn library with the default settings.
3.5 Evaluation Metrics
For evaluation purposes, we use accuracy, precision, recall, and F1-score. With the combination of these metrics, we can identify the effect of the different SMOTE setups from different aspects. Accuracy is the most common way to evaluate a model and is the percentage of correct predictions. Precision is the percentage of positive predictions that are correct. Recall is the percentage of the positive samples in the data that are correctly predicted as positive. Lastly, the F1-score is the harmonic mean of precision and recall.
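The four metrics can be written directly in terms of confusion-matrix counts. The sketch below uses the standard definitions; the counts in the usage line are hypothetical, while the last line checks the F1-score reported for dataset (i)(3) in the default setup against its precision and recall.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall and F1-score from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = f1_from(precision, recall)
    return accuracy, precision, recall, f1

def f1_from(precision, recall):
    """F1-score as the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Hypothetical counts: 30 true positives, 10 false positives,
# 70 false negatives, 890 true negatives
acc, prec, rec, f1 = classification_metrics(tp=30, fp=10, fn=70, tn=890)

# Precision 75.50% and recall 15.39% give the reported F1-score of 25.57%
f1_default_i = round(f1_from(75.50, 15.39), 2)
```

The hypothetical counts illustrate why accuracy alone is misleading on imbalanced data: accuracy is 0.92 even though only 30% of the positive samples are found.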
4 Result and Discussion
The results of RF on different dataset oversampling setups are tabulated in Table 4. The best result on each metric is bold-faced, and the worst result on each metric is underlined. Overall, as the oversampling percentage increases, the accuracy and precision decrease while the recall increases.
– Default: Both datasets (i) and (ii) have the highest accuracy and precision but the lowest recall.
– 300%: Dataset (i) has a significant decrease in precision compared to the default setup, and a significant increase in recall and F1-score. Dataset (ii) has a slight decrease in precision and a slight increase in recall.
Table 3. The Features Selected for Dataset (i) and (ii).

Dataset   Selected Features            Description
(i)       triage category              Triage category status
          adult runny nose             Runny nose status
          adult cough                  Cough status
          adult temp                   Body temperature
          adult spo2                   Blood oxygen level
          adult loss smell             Loss of smell status
          adult loss taste             Loss of taste status
          adult myalgia                Myalgia status
          dis htn                      Hypertension status
          dis dm                       Diabetes Mellitus status
          dis pregnancy                Pregnancy status
          pr                           Pulse rate
          rr                           Respiration rate
          height                       Height
          weight                       Weight
          patient source               The place of origin
          GENDER                       Gender
          RACE                         Race
          AGE                          Age
          bp systolic                  Systolic blood pressure value
          bp diastolic                 Diastolic blood pressure value
(ii)      triage category              Triage category status
          adult runny nose             Runny nose status
          adult cough                  Cough status
          adult difficulty breathing   Difficult breathing status
          adult fever                  Fever status
          adult temp                   Body temperature
          adult spo2                   Blood oxygen level
          adult loss smell             Loss of smell status
          adult diarrhea               Diarrhea status
          adult vomit                  Vomit status
          adult lethargic              Lethargic status
          adult myalgia                Myalgia status
          adult chest pain             Chest pain status
          adult loss appetite          Loss of appetite status
          adult excess lethargic       Excess lethargic status
          adult worsening symptoms     Worsening symptoms status
          dis htn                      Hypertension status
          dis dm                       Diabetes Mellitus status
          dis pregnancy                Pregnancy status
          pr                           Pulse rate
          rr                           Respiration rate
          height                       Height
          patient source               The place of origin
          GENDER                       Gender
          RACE                         Race
          AGE                          Age
          bp systolic                  Systolic blood pressure value
          bp diastolic                 Diastolic blood pressure value
Table 4. The Experiment Results on Different Dataset Setup.

Dataset   Oversampling Setup   Accuracy (%)   Precision (%)   Recall (%)   F1-score (%)
(i)(3)    Default              92.67          75.50           15.39        25.57
(i)(3)    300%                 92.32          57.65           23.04        32.92
(i)(3)    400%                 92.05          53.23           23.55        32.65
(i)(3)    500%                 91.92          51.22           25.59        34.13
(i)(3)    Uniform              91.26          45.04           30.99        36.71
(ii)(3)   Default              90.24          79.82           53.06        63.74
(ii)(3)   300%                 89.83          75.77           54.54        63.43
(ii)(3)   400%                 89.75          74.91           55.04        63.45
(ii)(3)   500%                 89.67          73.56           56.43        63.86
(ii)(3)   Uniform              89.57          73.42           55.67        63.32
– 400%: Both datasets (i) and (ii) have a slight decrease in precision compared to the 300% setup, and a slight increase in recall. The F1-score rises marginally for dataset (ii) but dips slightly for dataset (i).
– 500%: Dataset (i) continues to decrease in accuracy and precision while increasing in recall and F1-score. Dataset (ii) has the highest recall and F1-score.
– Uniform: Both datasets (i) and (ii) have the lowest accuracy and precision.
The setup with the highest class imbalance ratio produces the highest accuracy and precision due to the bias toward the majority class. For dataset (i), the recall is very low in the default setup due to the lack of significant features to train the model. Increasing the oversampling percentage can increase the model's recall and F1-score by increasing the bias towards the minority class. However, this will tremendously reduce the precision of the model. For dataset (ii), the recall in the default setup is significantly higher than in dataset (i). However, increasing the oversampling percentage in dataset (ii) yields only a minor improvement in recall and almost no improvement in the F1-score.
4.1 Cost of Oversampling
The costs of false positives and false negatives can differ between applications. For example, in a medical prediction model, the cost of a false negative is higher than the cost of a false positive, as a false negative could result in the loss of someone's life. Depending on the cost, SMOTE can be used to reduce the chance of false negatives, that is, to increase the recall rate, by oversampling the minority class. However, oversampling to a uniform level or over-oversampling the minority class removes the natural traits of the dataset distribution and causes poorer results and overfitting [18]. In the experiment results, oversampling to a uniform level gives the lowest precision in both datasets and does not guarantee the highest F1-score.
4.2 Optimal Oversampling Setup
Ideally, a COVID-19 admission prediction model built on datasets (i) and (ii) should increase the recall rate without sacrificing too much precision. The cost of false negatives is very high, as they can result in life-threatening cases. The cost of false positives is lower than that of false negatives, but it is still relatively high because a false positive leads to an unnecessary admission and thus a waste of resources. Hence, when SMOTE oversampling is used to decide on the best model for real-life applications, a balance between precision and recall should be considered. To keep this balance, the precision must remain at least 50% while the recall rate is improved. Using that logic, the 500% oversampling setup is the optimal setup for both datasets, where the precision rates are at least 50% and the recall rates rank either first or second.
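The selection rule described above (keep precision at or above 50%, then prefer the highest recall) can be applied mechanically to the figures in Table 4. The following sketch is one reading of that rule; the function name `pick_setup` and the dictionary layout are hypothetical.

```python
# Table 4 values: setup -> (accuracy, precision, recall, F1-score), all in %
table4 = {
    "(i)(3)": {
        "Default": (92.67, 75.50, 15.39, 25.57),
        "300%": (92.32, 57.65, 23.04, 32.92),
        "400%": (92.05, 53.23, 23.55, 32.65),
        "500%": (91.92, 51.22, 25.59, 34.13),
        "Uniform": (91.26, 45.04, 30.99, 36.71),
    },
    "(ii)(3)": {
        "Default": (90.24, 79.82, 53.06, 63.74),
        "300%": (89.83, 75.77, 54.54, 63.43),
        "400%": (89.75, 74.91, 55.04, 63.45),
        "500%": (89.67, 73.56, 56.43, 63.86),
        "Uniform": (89.57, 73.42, 55.67, 63.32),
    },
}

def pick_setup(rows, min_precision=50.0):
    """Among setups meeting the precision floor, return the one with the highest recall."""
    eligible = {name: m for name, m in rows.items() if m[1] >= min_precision}
    return max(eligible, key=lambda name: eligible[name][2])

best = {dataset: pick_setup(rows) for dataset, rows in table4.items()}
```

For dataset (i)(3) the uniform setup has the highest recall but is excluded by the precision floor, which is why the rule lands on 500% in both datasets.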
5 Conclusion
The results show that SMOTE is an effective oversampling method for increasing recall and F1-score. Increasing the oversampling rate of the minority class reduces the accuracy and precision but does not always increase the recall rate and F1-score. Oversampling to a uniform level and over-oversampling the minority class can cause poorer performance, as they eliminate the natural traits of the dataset distribution. In real-life applications, recall and precision should be weighed according to their respective costs when deciding on the best model. To conclude, the optimal oversampling setup is 500%, with precision rates of at least 50% and with recall rates and F1-scores ranked in the top two.
References
1. Abdi, L., Hashemi, S.: To combat multi-class imbalanced problems by means of over-sampling techniques. IEEE Trans. Knowl. Data Eng. 28(1), 238–251 (2015)
2. admin selangkah. Tutorial: Covid assessment centre (CAC) & home quarantine registration (2021)
3. Barua, S., Islam, M.M., Yao, X., Murase, K.: MWMOTE-majority weighted minority oversampling technique for imbalanced data set learning. IEEE Trans. Knowl. Data Eng. 26(2), 405–425 (2012)
4. Chawla, N.V., Bowyer, K.W., Hall, L.O., Kegelmeyer, W.P.: SMOTE: synthetic minority over-sampling technique. J. Artif. Intell. Res. 16, 321–357 (2002)
5. Chawla, N.V., Japkowicz, N., Kotcz, A.: Special issue on learning from imbalanced data sets. ACM SIGKDD Explor. Newsl. 6(1), 1–6 (2004)
6. CodeBlue. Selangkah covid-19 app wins Singapore AI award for health tech (2022)
7. Elkan, C.: The foundations of cost-sensitive learning. In: International Joint Conference on Artificial Intelligence, vol. 17, pp. 973–978. Lawrence Erlbaum Associates Ltd. (2001)
8. Fernández, A., Garcia, S., Herrera, F., Chawla, N.V.: SMOTE for learning from imbalanced data: progress and challenges, marking the 15-year anniversary. J. Artif. Intell. Res. 61, 863–905 (2018)
9. Galar, M., Fernández, A., Barrenechea, E., Bustince, H., Herrera, F.: An overview of ensemble methods for binary classifiers in multi-class problems: experimental study on one-vs-one and one-vs-all schemes. Pattern Recogn. 44(8), 1761–1776 (2011)
10. Han, H., Wang, W.-Y., Mao, B.-H.: Borderline-SMOTE: a new over-sampling method in imbalanced data sets learning. In: Huang, D.-S., Zhang, X.-P., Huang, G.-B. (eds.) ICIC 2005. LNCS, vol. 3644, pp. 878–887. Springer, Heidelberg (2005). https://doi.org/10.1007/11538059_91
11. He, H., Bai, Y., Garcia, E.A., Li, S.: ADASYN: adaptive synthetic sampling approach for imbalanced learning. In: 2008 IEEE International Joint Conference on Neural Networks (IEEE World Congress on Computational Intelligence), pp. 1322–1328. IEEE (2008)
12. Kubat, M., Matwin, S., et al.: Addressing the curse of imbalanced training sets: one-sided selection. In: ICML, vol. 97, pp. 179–186. Citeseer (1997)
13. Kursa, M.B., Jankowski, A., Rudnicki, W.R.: Boruta-a system for feature selection. Fundamenta Informaticae 101(4), 271–285 (2010)
14. Kursa, M.B., Rudnicki, W.R.: Feature selection with the Boruta package. J. Stat. Softw. 36, 1–13 (2010)
15. Liu, X.-Y., Zhou, Z.-H.: The influence of class imbalance on cost-sensitive learning: an empirical study. In: Sixth International Conference on Data Mining (ICDM 2006), pp. 970–974. IEEE (2006)
16. Margineantu, D.: When does imbalanced data require more than cost-sensitive learning. In: Proceedings of the AAAI 2000 Workshop on Learning from Imbalanced Data Sets, pp. 47–50 (2000)
17. Raskutti, B., Kowalczyk, A.: Extreme re-balancing for SVMs: a case study. ACM SIGKDD Explor. Newsl. 6(1), 60–69 (2004)
18. Tan, J.S., Tan, I.K.T., Soon, L.K., Ong, H.F.: Improved automated essay scoring using gaussian multi-class SMOTE for dataset sampling. In: Proceedings of the 15th International Conference on Educational Data Mining, pp. 647–651. International Educational Data Mining Society (2022)
19. Weiss, G.M., McCarthy, K., Zabar, B.: Cost-sensitive learning vs. sampling: which is best for handling unbalanced classes with unequal error costs? Dmin 7(35–41), 24 (2007)
20. Weiss, G.M., Provost, F.: Learning when training data are costly: the effect of class distribution on tree induction. J. Artif. Intell. Res. 19, 315–354 (2003)
21. Wilson, D.L.: Asymptotic properties of nearest neighbor rules using edited data. IEEE Trans. Syst. Man Cybern. 408–421 (1972)
Real-Time DNI and DHI Prediction Using Weather Information via LGBM
Geonwoo Bae
Choate Rosemary Hall, 333 Christian Street, Wallingford, CT 06492, USA
[email protected]
Abstract. Climate change is happening faster than expected, causing severe problems. Renewable energy is proposed as a solution due to its high environmental benefits and low energy costs. As solar power does not emit air pollutants or greenhouse gases, this study focuses on the benefits of solar energy and how it is superior to other green energy sources. A problem that comes with solar energy is the difficulty of calculating DNI and DHI in non-capital cities in Korea. Constructing solar power plants and conducting research on solar energy are not accessible outside of developed capital cities. Therefore, this paper analyzes climate characteristics that affect solar energy production and proposes a method to calculate DNI and DHI in non-capital cities efficiently. Climate information such as air pressure, wind direction, and wind speed was set as the independent variables, and DNI and DHI were set as the dependent variables. LGBM was used as the base model, and the RMSE was calculated. When DNI was the dependent variable, an RMSE of 37.63 was obtained, and when DHI was the dependent variable, an RMSE of 200.15 was obtained. In addition, through variable importance and correlation analysis, it was found that temperature, wind direction, relative humidity, and wind speed had a high effect on DNI and DHI. This study is significant as it addresses the existing problem of the unpredictability of DNI and DHI in Korea using primary climate data while also contributing to the expansion of domestic solar energy research in the future.
Keywords: Renewable Energy · Solar Energy · Machine Learning
1 Introduction
Loss of sea ice, melting glaciers and ice sheets, sea level rise, and more intense heat waves are not future problems. Greater emissions of greenhouse gases are already increasing global temperature, and severe weather damage is increasing and intensifying. Climate change is progressing more rapidly than anticipated, and the severity of the effects caused by it brings attention to a solution to lessen carbon emissions [1]. Renewable power is a flourishing innovation that brings down the costs of energy while delivering a subset of sources that have high environmental benefits. It is generated by sources that can be replenished within a relatively short period, and they offer lower emissions of carbon and other types of pollution. Renewable energy sources make up only 11% of energy uses worldwide, but their benefits on climate change have increased demand for non-fossil fuel-based energy and are gradually replacing fossil energy [2].
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 K. Arai (Ed.): SAI 2023, LNNS 739, pp. 481–489, 2023. https://doi.org/10.1007/978-3-031-37963-5_34
However, not all “renewable energy sources” help the environment. Though biomass and large hydroelectric dams produce energy free of fossil fuels, they harm wildlife and adversely affect the climate. Solar energy, on the other hand, is much more versatile than other renewable energy sources. Solar, or photovoltaic, cells are made from silicon or other materials that transform sunlight directly into electricity. Distributed solar systems generate electricity locally for homes and businesses, either through rooftop panels or community projects that power entire neighborhoods. Solar panels do not take up much space, can be placed anywhere sunlight reaches, and do not produce air pollutants or greenhouse gases [3]. Furthermore, the cost of installing solar panels has dropped by 60% compared to previous decades, and these benefits underpin the growth potential of solar power generation. For instance, solar contributed the most new generating capacity to the grid in 2021, accounting for 46% of all new electric capacity added, as depicted in Fig. 1 [4]. However, despite these benefits, solar energy is susceptible to the climate, and it is challenging to collect energy on rainy days. This problem must be solved to establish solar energy as an efficient, stable source.
Fig. 1. U.S. Annual Additions of New Electric Generating Capacity: Coal, Natural Gas, Wind, Solar, and Other
Therefore, this paper proposes a solar forecast system where different aspects that affect solar energy production are monitored and analyzed to predict daily productivity. In order to maximize energy production, building solar energy plants at the appropriate location is essential, and by using AI and machine learning, people can find the best site to build the solar energy plant on. Furthermore, in Korea, calculating Direct Normal Irradiance (DNI) and Diffuse Horizontal Irradiance (DHI) is difficult except in the major cities, which could prevent building solar energy plants or conducting solar energy-related research. This is because calculating both DNI and DHI is a time-consuming process, so the Korea Meteorological Administration only provides global horizontal solar irradiance (GHI). However, DNI and DHI are also required to calculate the amount of solar radiation reaching the surface of the solar cell module (plane of array, POA), which is used to predict the amount of power generated by the solar system [5]. Ultimately, the purpose of this paper was to analyze properties that affect solar energy production and suggest methods to calculate GHI, DNI, and DHI, which can be used to build solar farms and panels efficiently. Lastly, this research also shows the estimated DNI and DHI scores in each region in Korea.
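To illustrate why DNI and DHI are needed in addition to GHI, the sketch below combines the three components into a plane-of-array estimate. It assumes the simple isotropic-sky transposition model and a ground albedo of 0.2; these are illustrative assumptions, not necessarily the method of [5], and the function names are hypothetical.

```python
import math

def ghi_from_components(dni, dhi, zenith_deg):
    """Closure relation: GHI = DNI * cos(solar zenith angle) + DHI."""
    return dni * max(math.cos(math.radians(zenith_deg)), 0.0) + dhi

def poa_isotropic(dni, dhi, ghi, incidence_deg, tilt_deg, albedo=0.2):
    """Plane-of-array irradiance under an isotropic-sky assumption (W/m^2)."""
    tilt = math.radians(tilt_deg)
    # Direct beam projected onto the module surface
    beam = dni * max(math.cos(math.radians(incidence_deg)), 0.0)
    # Diffuse sky radiation seen by the tilted module
    sky = dhi * (1 + math.cos(tilt)) / 2
    # Radiation reflected from the ground onto the module
    ground = ghi * albedo * (1 - math.cos(tilt)) / 2
    return beam + sky + ground

# Hypothetical inputs: DNI 800 W/m^2, DHI 100 W/m^2, solar zenith angle 60 degrees
ghi = ghi_from_components(800.0, 100.0, 60.0)
# A horizontal module (tilt 0) sees exactly the GHI
poa_flat = poa_isotropic(800.0, 100.0, ghi, incidence_deg=60.0, tilt_deg=0.0)
# A module tilted 30 degrees toward the sun (incidence angle 30 degrees) collects more
poa_tilted = poa_isotropic(800.0, 100.0, ghi, incidence_deg=30.0, tilt_deg=30.0)
```

GHI alone cannot be split into its beam and diffuse parts without further modelling, which is why a site that reports only GHI is not sufficient for estimating the output of a tilted module.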
2 Related Work
Khan et al. utilized various data mining techniques, including processing historical load data and the load time series's characteristics, to propose a novel hybrid approach for energy consumption forecasting. As a case study, the team obtained the actual energy consumption time series data of Jeju Island. Using a machine learning-based hybrid approach, combining multilayer perceptron (MLP), support vector regression (SVR), and CatBoost, the team analyzed the power consumption trend of both renewable and nonrenewable energy. The group performed exploratory data analysis, pre-processing, and train-test split before training the model. Various metrics were used to test the advantages of the proposed model: mean absolute error, mean absolute percent error, mean squared error, and root mean squared logarithmic error. The proposed model resulted in MAE of 15.72754, MSE of 472.9644, RMSE of 21.74774, and RMSLE of 0.037885, and the values were compared with Lasso, Ridge, GradientBoost, MLP, SVR, and XGBoost [6].
Lai et al. utilized a survey to provide a review and analysis of machine-learning models in renewable-energy predictions. Procedures such as data preprocessing techniques, parameter selection algorithms, and prediction performance measurements were depicted. The team analyzed sources of renewable energy, values of the mean absolute percentage error, and values of the coefficient of determination. The team concluded that the applications of machine-learning techniques to renewable energy have been increasing, and the uses of artificial intelligence techniques and hybrid models in solar energy and wind-energy predictions are the majority. Machine-learning models with larger R² values lead to more accurate renewable-energy predictions regarding MAPE.
The team stated that areas of future studies could include renewable-energy sources other than solar and wind and the analysis of data preprocessing techniques and machine-learning models in renewable energy predictions. The use of new metaheuristics, such as a coronavirus optimization algorithm for machine-learning parameter selection, is also an encouraging opportunity for future research [7].
Koo et al. constructed an artificial neural network (ANN) model to calculate hourly global horizontal solar irradiance (GHI) from Korea Communication, Ocean, and Meteorological Satellite (COMS) Meteorological Imager (MI) images. Solar position variables and five COMS MI channels were used as inputs for the ANN model, which had a window size of five for the input satellite images and two hidden layers, with 30 nodes on each hidden layer. The temporal and spatial applicability of the ANN model for solar irradiance mapping was validated. The final ANN ensemble model, which calculated the hourly GHI from 10 independent ANN models, exhibited a correlation coefficient
(R) of 0.975 and root mean square error (RMSE) of 54.44 W/m², which were better results than for other remote-sensing-based works for Korea. The ANN model was then used to generate hourly GHI maps of Korea by using COMS images. The COMS-based ANN model developed in this study is expected to estimate real-time solar irradiance throughout Korea with significant accuracy (

by their Degree of Influence (DI). The VAF is calculated as follows, and this variable has the potential to change the UFP by up to 35% (±).

VAF = (TDI × 0.01) + 0.65    (2)
As shown in Eq. (3), the Adjusted FP (AFP) is the product of UFP and VAF. The work effort (Effort Calculation) in person-hours must then be determined. One FP and the number of hours required to build it are related by the Productivity Factor (PF), which
is employed as a constant. There are two distinct approaches to calculating the PF value. First, the PF can be calculated as the average of all PF values from previous projects. Second, a PF may be derived from categorical variables, making it specialized for particular project types [17].

AFP = UFP × VAF    (3)

Effort Calculation = AFP × PF    (4)
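Equations (2) to (4) chain together into a short effort calculation, sketched below. The valid range of 0 to 70 for TDI is an assumption taken from standard function point analysis (14 general system characteristics, each rated 0 to 5), which is consistent with the ±35% adjustment stated above; the project figures in the usage lines are hypothetical.

```python
def value_adjustment_factor(tdi):
    """Eq. (2): VAF = (TDI * 0.01) + 0.65, with TDI assumed to lie in 0..70."""
    if not 0 <= tdi <= 70:
        raise ValueError("TDI is expected to be between 0 and 70")
    return tdi * 0.01 + 0.65

def estimate_effort(ufp, tdi, pf):
    """Return (AFP, effort in person-hours) using Eq. (3) and Eq. (4)."""
    afp = ufp * value_adjustment_factor(tdi)  # Eq. (3): AFP = UFP * VAF
    effort = afp * pf                         # Eq. (4): Effort = AFP * PF
    return afp, effort

# Hypothetical project: 200 unadjusted function points, TDI of 35,
# and a productivity factor of 8 person-hours per function point
afp, effort = estimate_effort(ufp=200, tdi=35, pf=8)
```

With TDI at its extremes, the VAF ranges from 0.65 to 1.35, so the adjusted size differs from the UFP by at most 35% in either direction.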
3.3 Generalized Linear Model (GLM)
The generalized linear model (GLM), a generalization of conventional linear regression, is used to describe the relationship between features, depending on the nature of the response variable, its distribution and variance, and a suitable link function. It offers a large range of inference tools and, because of its versatility, is effective at handling non-linear relationships. It extends the concept of the general linear model (for example, a regression, or an ordinary least squares regression equation), the kind of model that is frequently employed in basic statistics. To determine the “best fit” model, it uses a variety of different probability distributions. The model predicts results using a variety of methods, including Bayesian hypothesis testing. With a GLM, the dependent variable can be expressed in terms of a linear combination of the independent variables. The traditional GLM method, simple linear regression, performs well when the dependent variable has a normal distribution.
The Generalized Linear Model and Probability Distributions
By allowing each outcome of the dependent variable (y) to come from any of a large range of probability distributions, the generalized linear model expands on the capabilities of ordinary linear regression. These are:
• Conventional normal distribution.
• Binomial distribution.
• Poisson distribution.
• Gamma distribution.
The GLM (Generalized Linear Model) consists of the following three components:
• An exponential family probability distribution (as stated above).
• A linear predictor, η = Xβ. The linear predictor carries the information about the independent variables within the model.
• A link function, which connects the linear predictor and the expected value.
The four underlying assumptions of a linear regression model are linearity, independence of errors, homoscedasticity, and normality of the error distribution [18].
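The three components can be made concrete with a small Poisson GLM that uses a log link and a single covariate. This is a minimal pure-Python sketch fitted by Newton's method on noise-free toy data; the function name `fit_poisson_glm` and the data are hypothetical, and a statistics library would normally be used instead.

```python
import math

def fit_poisson_glm(xs, ys, n_iter=25):
    """Fit log(E[y]) = b0 + b1 * x for a Poisson response by Newton's method."""
    b0, b1 = math.log(sum(ys) / len(ys)), 0.0  # start from the intercept-only fit
    for _ in range(n_iter):
        # Linear predictor eta = b0 + b1 * x, inverse link mu = exp(eta)
        mus = [math.exp(b0 + b1 * x) for x in xs]
        # Score vector (gradient of the Poisson log-likelihood)
        g0 = sum(y - mu for y, mu in zip(ys, mus))
        g1 = sum((y - mu) * x for x, y, mu in zip(xs, ys, mus))
        # Fisher information matrix entries
        h00 = sum(mus)
        h01 = sum(mu * x for x, mu in zip(xs, mus))
        h11 = sum(mu * x * x for x, mu in zip(xs, mus))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system H * delta = g
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

xs = [0.0, 1.0, 2.0, 3.0, 4.0]
# Noise-free toy responses generated from known coefficients 0.5 and 0.3
ys = [math.exp(0.5 + 0.3 * x) for x in xs]
b0, b1 = fit_poisson_glm(xs, ys)
```

Swapping the distribution and link (for example, a normal response with the identity link) changes only the inverse-link line and the weights, which is the sense in which ordinary linear regression is a special case of the GLM.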
3.4 Deep Learning Neural Networks
Artificial systems called neural networks were inspired by biological neural networks. These systems acquire task-specific knowledge by being exposed to a range of dataset examples. The idea is that the system derives distinguishing characteristics from the data it receives without being pre-programmed with knowledge of these datasets. In the Deep Learning Neural Networks (or simply Neural Networks) paradigm, a feedforward multilayer artificial neural network is constructed on an H2OFrame using CPUs. Artificial neural network characteristics:
• It can draw knowledge from prior data.
• It works with both linear and nonlinear functions, producing accurate software effort predictions for complicated datasets.
Computational models of threshold logic are the foundation of neural networks. Threshold logic is built from algorithms and mathematics. Neural network research is founded either on the study of the brain or on the use of neural networks in computing (artificial intelligence). This work has also advanced the theory of finite automata. A typical neural network has neurons, connections, weights, biases, a propagation function, and a learning rule as its constituent parts. A neuron j, with an activation a_j(t), a threshold θ_j, an activation function f, and an output function f_out, receives an input p_j(t) from its predecessor neurons. Connections, each with a weight and a bias, govern how neuron i sends output to neuron j. The propagation function computes the input to a neuron as the weighted sum of the outputs of its predecessor neurons. The learning rule alters the network's variable weights and thresholds.
Supervised and Unsupervised Learning
Neural networks learn through supervised learning. An input variable x and an output variable y are used in supervised machine learning. A training dataset is used to train the algorithm. The algorithm iteratively makes predictions on the training data and is corrected against the right responses.
When the algorithm performs to an acceptable standard, learning ceases. With input data X and no corresponding output variables, machine learning is unsupervised. Its aim is to model the underlying structure of the data in order to learn more about it. Classification and regression are the key terms for supervised machine learning; clustering and association are the key terms for unsupervised machine learning.
Evolution of Neural Networks
Neural plasticity is the subject of Hebbian learning. Hebbian learning involves long-term potentiation and is unsupervised. It deals with If-Then logic, Exclusive-OR circuits, and pattern recognition. The Exclusive-OR problem that Hebbian learning could not address was resolved through backpropagation, which also made multilayer networks practical and effective. If an error was discovered, it was corrected at each layer by changing the weights at each node. As a result, Support Vector Machines, linear classifiers, and max-pooling were all developed. Recurrent neural networks and backpropagation-based feedforward networks are also affected by the vanishing gradient problem. Deep learning is the term for this. For biophysical simulation and neuromorphic computing, hardware-based architectures are used. They use convolution to construct a new kind of neural computing using
analog, in conjunction with a large-scale component analysis. Backpropagation for multilayered feed-forward neural networks was also resolved in this way. With linked layers (completely or sparsely connected) and a final classification layer, convolutional networks alternate between convolutional layers and max-pooling layers. The learning is completed without any unsupervised pre-training. Every filter can be thought of as a weights vector that needs to be trained. For both small and large neural networks, the shift variance must be guaranteed. Development Networks is addressing this.
Types of Neural Networks
1. A multilayer perceptron that uses a nonlinear activation function and has three or more layers.
2. Convolutional neural network, which uses a variant of the multilayer perceptron.
3. A recursive neural network that uses weights to produce structured predictions.
4. Recurrent neural network, which links the neurons in a directed loop.
5. Long short-term memory network, which uses the recurrent neural network design without an activation function.
6. Sequence-to-sequence modules that use two recurrent networks, and shallow neural networks that create a vector space from a large amount of text.
The neural network in this example will use three vectors: classifications vector Y, weights vector W, and characteristics vector X. 100 iterations of the code will be used to fit the properties to the classes. After looping through the weights vector W, the predictions are generated, weighed, and displayed as output. The neural network handles backpropagation.
3.5 Decision Trees
The best and most widely used technique for categorization and prediction is the decision tree. In a decision tree, which resembles a flowchart, each internal node indicates a test on an attribute, each branch shows the test's result, and every leaf node (or terminal node) holds a class label.
It is a reliable machine-learning method that also serves as the foundation for more advanced and popular algorithms such as Random Forest, XGBoost, AdaBoost, and LightGBM.
Construction of Decision Tree: A tree is "trained" by splitting the source set into subsets based on an attribute value test. Repeating this operation on each derived subset is called recursive partitioning. The recursion ends when the subset at a node has the same value of the target variable throughout, or when splitting no longer improves the predictions. Building a decision tree classifier requires no parameter setting or domain knowledge, which makes it suitable for exploratory knowledge discovery. Decision trees can handle high-dimensional data. Decision tree classifiers typically have good accuracy, and decision tree induction is a common inductive method for learning classification knowledge.
Decision Tree Representation: Decision trees classify instances by sorting them down the tree from the root to a leaf node, which provides the classification of the instance. An instance is classified by starting at the root node of the tree, testing the attribute that the
node specifies, and moving down the branch of the tree that corresponds to the value of the attribute. The same procedure is then repeated for the subtree rooted at the new node.
Advantages and disadvantages of the Decision Tree method.
Advantages:
• Decision trees are able to generate understandable rules.
• Decision trees perform classification with little computational effort.
• Decision trees can handle both continuous and categorical variables.
• Decision trees clearly show which fields are most important for classification or prediction.
Disadvantages:
• Decision trees are less suitable for estimation tasks where the goal is to predict the value of a continuous attribute.
• Decision trees are prone to errors in classification problems with many classes and a relatively small number of training examples.
• Decision trees can be computationally expensive to train. Growing a decision tree requires extensive computation: at each node, each candidate splitting field must be sorted before its best split can be found. Some algorithms use combinations of fields, in which case a search for the best combining weights is also needed.
3.6 Evaluation Method
Data mining techniques were applied to the analysis. The data set was divided into training and testing sets in the following proportions:
• training set: 80%
• testing set: 20%.
The Mean Absolute Error (MAE), Mean Squared Error (MSE), Root Mean Squared Error (RMSE), and Root Mean Squared Log Error (RMSLE) were used to assess the accuracy of the estimates.
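The evaluation protocol above can be sketched in a few lines. This is a minimal illustration only, assuming a random shuffle before the 80/20 split and the usual textbook definitions of the four error measures; the function names are illustrative and are not taken from the paper or from any particular library.

```python
import math
import random


def split_80_20(rows, seed=0):
    """Shuffle the rows and return (train, test) in an 80/20 proportion."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    cut = (len(rows) * 80) // 100
    return rows[:cut], rows[cut:]


def mae(actual, predicted):
    """Mean Absolute Error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mse(actual, predicted):
    """Mean Squared Error."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root Mean Squared Error."""
    return math.sqrt(mse(actual, predicted))


def rmsle(actual, predicted):
    """Root Mean Squared Log Error (values must be greater than -1)."""
    squared_log_errors = [
        (math.log1p(p) - math.log1p(a)) ** 2 for a, p in zip(actual, predicted)
    ]
    return math.sqrt(sum(squared_log_errors) / len(actual))


# Hypothetical values, for illustration only.
train, test = split_80_20(range(10))
actual_effort = [1.0, 2.0, 3.0]
predicted_effort = [1.0, 2.0, 5.0]
print(len(train), len(test))
print(mae(actual_effort, predicted_effort), rmse(actual_effort, predicted_effort))
```

Because RMSE squares the residuals before averaging, it penalizes large individual errors more heavily than MAE, which is why the two are often read side by side.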
4 Discussion
This article presents a combined approach to effort estimation at the early stage of a project, using function points for software sizing together with data mining techniques. For this purpose, H2O AutoML was applied, which provided guidance through the modelling process. Software cost estimation data was sourced from Kaggle, and the database provided information about the size of implemented software measured in function points. The data types are shown in Fig. 3, while a sample of the dataset is shown in Table 2. The information about completed projects that was sourced from the ISBSG database was filled with null values, and removing the null values left a database with only one record. Therefore, that database file was discarded. Three prediction algorithms were selected from the regression and machine learning areas: Generalized Linear Model (GLM), Deep Learning Neural Networks (DLNN), and Decision Trees - Gradient Boosting Machine (GBM). GLM is a generalization of traditional linear regression that depends on the nature, distribution, and variance of the response variable. It can analyse large datasets with non-linear variables efficiently. DLNN generate values from inputs using nonlinear activation functions and interconnected layers of neurons. They learn through a back-propagation process that adjusts weights based on expected outcomes. As a result, they can adapt to the data and ignore unnecessary information. GBM is a technique for turning weak learners into strong ones. It trains many models gradually, additively, and sequentially, combining the predictions of multiple decision trees to produce the final prediction. For each project, the Adjusted Function Point (AFP) count was computed by multiplying the Unadjusted Function Point (UFP) count by the Value Adjustment Factor (VAF), Eq. (3), while the Effort was computed by multiplying AFP by the Productivity Factor (PF), Eq. (4).
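The two multiplications described above can be written out directly. The sketch below follows only the verbal description (AFP = UFP × VAF, Effort = AFP × PF); the function names, the input values and the unit of the productivity factor are hypothetical and chosen purely for illustration.

```python
def adjusted_function_points(ufp, vaf):
    """AFP = UFP * VAF, as described for Eq. (3)."""
    return ufp * vaf


def estimated_effort(afp, productivity_factor):
    """Effort = AFP * PF, as described for Eq. (4)."""
    return afp * productivity_factor


# Hypothetical project: 200 unadjusted function points, VAF of 1.1,
# and an assumed productivity factor of 8 effort units per function point.
afp = adjusted_function_points(200, 1.1)
effort = estimated_effort(afp, 8)
print(afp, effort)
```

Keeping the two steps separate makes it easy to inspect the intermediate AFP value, which also appears as a column of its own in the dataset.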
Fig. 3. Data Types
For modelling purposes, the dataset of projects was split using H2O into a training set (80%) and a test set (20%). The former was used for building models, and the latter for validating their effort estimation capability. Three data mining prediction algorithms (Generalized Linear Models (GLM), Deep Learning Neural Networks (DLNN), and Decision Trees - Gradient Boosting Machine (GBM)) were applied using H2O AutoML for both dependent variables. Error and accuracy measures were compared in order to assess their potential usefulness for deployment within organizations. The error measures used to assess the accuracy of the software estimation models were Mean Absolute Error (MAE), Mean Squared Error (MSE), Root Mean Squared Error (RMSE), and Root Mean Squared Log Error (RMSLE). An overview of the error and accuracy measures is shown in Table 3, while Fig. 4 shows their graphical comparison. As can be seen in Fig. 4, the best performing algorithm for effort prediction was the Generalized Linear Model. Additionally, the difference between RMSLE and MAE is extremely small for all effort estimation models. Therefore, it can be stated that large errors did not occur. The difference in prediction accuracy between the algorithms used was almost insignificant, and any one of them may be used individually for effort estimation. Finally, all models built on sizing and measurement with function points have very good software project effort prediction capability. Models could be even more accurate if deployed for a specific organization and trained on a homogeneous dataset.

Table 2. Sourced Data Sample

| Project_Id | Project_Class | Project_Length | Actual_Effort | Transactions_Count | Entities | Project_Size_AFP | Adjustment_Points | Project_Size_UFP | Productivity_Factor | Effort_Calculation |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 0.289474 | 0.196888 | 0.257913 | 0.118421 | 0.220114 | 0.617021 | 0.227704 | 0.178947 | 0.154132 |
| 0.0125 | 0.0125 | 0.078947 | 0.217534 | 0.192263 | 0.307895 | 0.235294 | 0.595745 | 0.240038 | 0.273684 | 0.238481 |
| 0.025 | 0.025 | 0 | 0.011071 | 0.008206 | 0.139474 | 0.025617 | 0.276596 | 0.019924 | 0.178947 | 0.034949 |
| 0.0375 | 0.0375 | 0.105263 | 0.140335 | 0.19578 | 0.294737 | 0.233397 | 0.531915 | 0.228653 | 0.168421 | 0.15118 |
| 0.05 | 0.05 | 0.078947 | 0.068522 | 0.12544 | 0.228947 | 0.152751 | 0.404255 | 0.13852 | 0.126316 | 0.079777 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 0.95 | 0.95 | 0.289474 | 0.036505 | 0.229777 | 0.426316 | 0.308349 | 0.723404 | 0.333966 | 0.031579 | 0.046873 |
| 0.9625 | 0.9625 | 0.289474 | 0.096349 | 0.227433 | 0.173684 | 0.21537 | 0.617021 | 0.22296 | 0.094737 | 0.082844 |
| 0.975 | 0.975 | 0.605263 | 0.383603 | 0.424385 | 0.489474 | 0.488615 | 0.744681 | 0.526565 | 0.221053 | 0.380149 |
| 0.9875 | 0.9875 | 0.289474 | 0.228007 | 0.511137 | 0.444737 | 0.542694 | 0.808511 | 0.602467 | 0.105263 | 0.207393 |
| 1 | 1 | 0.921053 | 1 | 1 | 0.615789 | 1 | 0.617021 | 1 | 0.252632 | 0.842347 |

Table 3. Tabular Display of Errors and Accuracy Measures

| Algorithms | Error: RMSE | Error: MSE | Error: MAE | Error: RMSLE | Arithmetic Mean |
|---|---|---|---|---|---|
| GLM | 0.026497 | 0.000702 | 0.018321 | 0.018709 | 0.01605725 |
| DLNN | 0.048318 | 0.002335 | 0.0333 | 0.039605 | 0.0308895 |
| GBM | 0.087824 | 0.007713 | 0.052218 | 0.060318 | 0.05201825 |

Fig. 4. Errors and Accuracy Measures in Graphical Representation
5 Conclusion
Effort estimation of software projects at an early stage, during initiation and planning, is considered one of the most challenging tasks in project management, and one on which project success depends. The reason is the lack of knowledge about the functionalities of the final product and about the activities that must be performed in order to develop and implement the software. Therefore, project managers and other project practitioners act on incomplete information during the estimation process, where uncertainty and risk are significant. For estimation purposes, they mostly use traditional manual techniques that are derived from expert knowledge or based on analogy. These methods tend to be error-prone and deliver over-optimistic estimates that result in cost and schedule overruns, which can contribute to project failure. The aim of this paper is to present a combined approach of functional size measurement and three data mining techniques for effort estimation. Based on the pre-processed dataset, models were built separately for effort estimation. Moreover, a merged approach was explored that combines the predictions delivered by GLM, DLNN and GBM using their mean value. The obtained results showed that individual data mining methods can be used to estimate effort and duration based on function point-measured software sizing. Nevertheless, the proposed merged approach of combining GLM, DLNN and GBM predictions by averaging generated even more accurate predictions, and overcame the risk of over-fitting and of false predictions delivered by individual algorithms, depending on the quality of the data used for training. The practical application of the suggested methodology as a decision support tool for early effort estimation of small to large software projects should be the primary focus of future study. It can be connected with currently available project management tools, and ideally it would be maintained by a project management office that can guarantee good input data quality and model updates to preserve accuracy in a project's changing environment.
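The merged approach mentioned in the conclusion, averaging the predictions of the three models, can be illustrated as follows. This is only a sketch of simple unweighted averaging; the function name and all prediction values are hypothetical and do not come from the paper's experiments.

```python
def average_predictions(*model_predictions):
    """Return, for each project, the mean of the predictions of all models."""
    n_models = len(model_predictions)
    return [sum(values) / n_models for values in zip(*model_predictions)]


# Hypothetical effort predictions for two projects from three models.
glm_pred = [10.0, 20.0]
dlnn_pred = [12.0, 18.0]
gbm_pred = [14.0, 25.0]

merged = average_predictions(glm_pred, dlnn_pred, gbm_pred)
print(merged)
```

Averaging tends to help when the individual models err in different directions, since such errors partly cancel; it gives no benefit when all models share the same bias.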
Acknowledgment. I, Julius Olufemi Ogunleye (the author), would like to express my gratitude to Ass. Prof. Zdenka Prokopova and Ass. Prof. Petr Silhavy for their support and guidance in making this research work possible. This work was supported by the Faculty of Applied Informatics, Tomas Bata University in Zlín, under Project IGA/CebiaTech/2023/001.
References
1. Sami, M.: 5 Steps to Software Development Effort Estimation (2018)
2. Silhavy, P., Silhavy, R., Prokopova, Z.: Categorical variable segmentation model for software development effort estimation. IEEE Access 7, 9618–9626 (2019)
3. Pospieszny, P., Czarnacka-Chrobot, B., Kobyliński, A.: Application of function points and data mining techniques for software estimation - a combined approach. In: Kobyliński, A., Czarnacka-Chrobot, B., Świerczek, J. (eds.) IWSM/Mensura 2015. LNBIP, vol. 230, pp. 96–113. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-24285-9_7
4. Varshini, A.G.P., Kumari, K.A.: Predictive Analytics Approaches for Software Effort Estimation: A Review (2020)
5. Nassif, A.B., et al.: Software Development Effort Estimation Using Regression Fuzzy Models (2019)
6. International Function Point Users Group (IFPUG): Simple Function Point (SFP) Counting Practices Manual Release 2.1 (2021)
7. Kamber, H., et al.: Data Mining: Concepts and Techniques (3rd Ed.). Morgan Kaufmann. ISBN 978-0-12-381479-1 (2011)
8. ACM SIGKDD: Data Mining Curriculum (2006-04-30). Retrieved 2014-01-27
9. Clifton, C.: Encyclopedia Britannica: Definition of Data Mining (2010). Retrieved 9 Dec 2010
10. Trevor, H., et al.: The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Archived from the original on 2009-11-10 (2009). Retrieved 7 Aug 2012
11. Jiawei, H., Micheline, K.: Data Mining: Concepts and Techniques (2000)
12. Sehra, S.K., et al.: Analysis of Data Mining techniques for software effort estimation (2014)
13. Dejaeger, K., et al.: Data Mining Techniques for Software Effort Estimation: A Comparative Study (2012)
14. Weiss, G.M., Davison, B.D.: Data mining. In: Bidgoli, H. (ed.) Handbook of Technology Management. Wiley (2010)
15. Berson, A., et al.: An Overview of Data Mining Techniques (Excerpts from the book by Alex Berson, Stephen Smith, and Kurt Thearling) (2005)
16. Mehmed, K.: Data Mining Concepts, Models, Methods, and Algorithms, 2nd edn (2011)
17. Software Testing Help: Data Mining Techniques: Algorithm, Methods & Top Data Mining Tools (2020)
18. Kushwaha, D.S., Misra, A.K.: Software Test Effort Estimation (2008)
Universal Hidden Monotonic Trend Estimation with Contrastive Learning
Edouard Pineau1(B), Sébastien Razakarivony2, Mauricio Gonzalez1, and Anthony Schrapffer1
1 EthiFinance, Paris, France [email protected]
2 Safran Tech, Châteaufort, France
Abstract. In this paper, we describe a universal method for extracting the underlying monotonic trend factor from time series data. We propose an approach related to the Mann-Kendall test, a standard monotonic trend detection method and call it contrastive trend estimation (CTE). We show that the CTE method identifies any hidden trend underlying temporal data while avoiding the standard assumptions used for monotonic trend identification. In particular, CTE can take any type of temporal data (vector, images, graphs, time series, etc.) as input. We finally illustrate the interest of our CTE method through several experiments on different types of data and problems.
Keywords: Contrastive Learning · Trend Detection · Mann-Kendall Test · Time Series Analysis

1 Introduction
Our paper focuses on the estimation of a monotonic trend factor underlying temporal data. Such estimation is interesting in many fields, e.g., health monitoring [1], survival analysis [2] or climate change monitoring [3]. In all these fields and related trend estimation problems, we observe samples generated by a monitored system (e.g., an ageing mechanical system, a credit debtor, earth's weather and climate conditions) at different times in its life, and we assume that the state of the system drifts monotonically. These observed samples may be of any type (e.g., vectors, images, time series, graphs), depending on the monitored system. Figure 1 illustrates the general context of trend estimation.

Fig. 1. Illustration of the Context of the Paper's Contribution. We have a Monitored System S that Generates Data Samples (Colored Curves) at Random Time. The Hidden Trend τ Underlying the System (Colors from Green to Red) Represents the Hidden State of S that Changes Monotonically Until a State Restoration is Applied (Tools in Hexagons): Samples between Two State Restorations form a Sequence with a Monotonic Hidden Trend. The Relation between Trend and Observed Data may be an Arbitrary Function yet Assumed to Preserve the Information about the Trend.

More generally, when studying temporal data, it is common to assume the existence of structural latent factors, supposed meaningful, that generated the data [4]. These components are generally allocated into four groups. The trend components are monotonic long-term signals. The cycle components are factors exhibiting rises and falls that are not of a fixed frequency. The seasonality components are periodic patterns occurring at a fixed frequency. The irregularity factors represent the rest of the information (considered as a noise). We assume independent structural factors. The challenging yet essential task is the identification of one or several of these factors, which is called blind source separation [5], independent component analysis [6] or disentanglement [7]. In this paper, the objective is to detect, isolate and identify only the trend component. [8] shows that if we know one hidden component under time series data, we can find the others conditionally. Hence, finding the trend component is not only useful for many monitoring problems, it is relevant for further analysis. Often, trend estimation methods seek monotonic variations in the values of the data or in expert-based statistics computed from data [9,10]. In practice, the trend can be deeply hidden in the data or may be not well defined because of a lack of information about the monitored system. Hence, we may not know which variable or statistics to follow to find the trend. In this paper, we learn to infer the trend factor from data (of any type) without labels or expert supervision, using only samples' time index. To do so, we develop a general method based on Contrastive Learning (CL). CL recently received high interest in self-supervised representation learning [11], in particular for time series data (see, e.g., [12–14]). Our CL approach uses a loss inspired by the Mann-Kendall test [15], a standard trend detection method.

The rest of the paper presents our universal trend inference method called Contrastive Trend Estimation (CTE). Section 2 presents the method. Section 3 analyzes the theoretical foundation of our method in terms of identifiability. Section 4 lists related works on trend detection and estimation. Section 5 presents a set of experiments to illustrate the interest of our approach for trend estimation and survival analysis. Concluding remarks are presented in Sect. 6.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023. K. Arai (Ed.): SAI 2023, LNNS 739, pp. 507–521, 2023. https://doi.org/10.1007/978-3-031-37963-5_36

2 Contrastive Trend Detection
Notations. Let $X$ be a sequence of $N_X \in \mathbb{N}$ observed samples generated by a monitored system denoted by $S$. We assume that a hidden state of $S$ drifts monotonically. We denote by $\mathcal{X}$ the dataset of all sequences $X$ in which there exists a hidden monotonic factor. We denote by $t_i$ the time index of the $i$-th observed sample, $i \in \{1, \dots, N_X\}$. We assume that each sequence $X \in \mathcal{X}$ has been generated from structural factors through a function $F$, such that at least the information about the trend is not annihilated (in blind source separation problems, $F$ would be assumed invertible). That is, for each $X$ there exists $Z^X := (\tau^X, c^X, s^X, \epsilon^X)$ such that $X_{t_i} = F(Z^X_{t_i})$, where $\tau^X$, $c^X$, $s^X$, and $\epsilon^X$ represent respectively (resp.) the monotonic trend, the cycle, the seasonality, and the irregularity that generated $X$. The paper's goal is to estimate the factor $\tau^X$ from $X$.

Our CTE Approach. For each $X \in \mathcal{X}$, we select two sampling times $(t_u, t_v) \in \{t_1, \dots, t_{N_X}\}^2$, such that, without loss of generality (w.l.o.g.), $t_u < t_v$. The value of the hidden trend at the sampling time $t$ for $X_t$ is denoted $\tau^X_t$. Since we do not have access to the true hidden trend, we need assumptions about $\tau^X$. We use the natural Assumption 1 to estimate $\tau^X$.

Assumption 1. (Monotonicity). For each sequence $X \in \mathcal{X}$ and all sample couples $(X_{t_u}, X_{t_v})$, we have that $t_u \le t_v \iff \tau^X_{t_u} \le \tau^X_{t_v}$.

To extract the trend component, we use a neural network (NN) $F_\phi$ with parameters $\phi$ that embeds each sample $X_t$ into a $d_e$-dimensional vector space, with which we define $g_{\beta,\phi} : \mathbb{R}^{d_e} \times \mathbb{R}^{d_e} \to [0, 1]$, a parametric logistic regressor defined as follows:

$$g_{\beta,\phi}(X_{t_u}, X_{t_v}) = \sigma\left(\beta^\top F_\phi(X_{t_v}) - \beta^\top F_\phi(X_{t_u})\right), \tag{1}$$

where $\sigma(x) := (1 + e^{-x})^{-1}$ is the sigmoid function. Let $C_{uv} := \mathbb{1}_{\{\tau^X_{t_u} \le \tau^X_{t_v}\}}$ be the indicator function that describes the trend direction between $t_u$ and $t_v$ for any sample $X$. Under Assumption 1, we also have $C_{uv} = \mathbb{1}_{\{t_u \le t_v\}}$. Then, we can build $C_{uv}$ from the samples' time indices. We can then learn the posterior distribution $p(C_{uv} \mid X_{t_u}, X_{t_v})$, i.e., learn the identity:

$$p(C_{uv} = 1 \mid X_{t_u}, X_{t_v}) = g_{\beta,\phi}(X_{t_u}, X_{t_v}). \tag{2}$$

As for common binary classification problems, training is done by minimizing the binary cross entropy (BCE) between $C_{uv}$ and the regressor $g_{\beta,\phi}(X_{t_u}, X_{t_v})$, for all pairs of time indices $(t_u, t_v)$, $\forall X \in \mathcal{X}$, i.e., by minimizing:

$$R(\beta, \phi; \mathcal{X}) = -\mathbb{E}_{X \in \mathcal{X}}\left[\sum_{i,j=1}^{N_X} C_{ij} \log g_{\beta,\phi}\left(X_{t_i}, X_{t_j}\right)\right]. \tag{3}$$
Remark 1. Equation (3) is similar to the Mann-Kendall statistic of Eq. (7), presented in Sect. 4 on related work.
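The pairwise objective of Eq. (3) can be illustrated with a small, self-contained sketch. It is not the authors' implementation: the embedding F is replaced by the identity map, the regressor is trained by plain gradient descent instead of a neural-network optimizer, and the toy sequence (a seasonal feature plus a noisy linear trend) is hypothetical. All function names are illustrative.

```python
import math
import random


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def trend_score(beta, x):
    """beta^T F(x), with F taken to be the identity map (an assumption)."""
    return sum(b * xi for b, xi in zip(beta, x))


def cte_loss_and_grad(beta, seq):
    """Mean pairwise loss in the spirit of Eq. (3) and its gradient w.r.t. beta.

    seq is ordered by time, so every pair (i, j) with i < j has C_ij = 1.
    Pairs with i == j only add a constant and are skipped.
    """
    loss, grad, n_pairs = 0.0, [0.0] * len(beta), 0
    for i in range(len(seq)):
        for j in range(i + 1, len(seq)):
            p = sigmoid(trend_score(beta, seq[j]) - trend_score(beta, seq[i]))
            loss -= math.log(max(p, 1e-12))
            for k in range(len(beta)):
                grad[k] -= (1.0 - p) * (seq[j][k] - seq[i][k])
            n_pairs += 1
    return loss / n_pairs, [g / n_pairs for g in grad]


def fit_cte(seq, steps=200, lr=1.0):
    """Fit beta by gradient descent on the pairwise loss."""
    beta = [0.0] * len(seq[0])
    for _ in range(steps):
        _, grad = cte_loss_and_grad(beta, seq)
        beta = [b - lr * g for b, g in zip(beta, grad)]
    return beta


# Hypothetical toy sequence: feature 0 is seasonal, feature 1 carries the trend.
rng = random.Random(0)
n = 30
true_trend = [t / (n - 1) for t in range(n)]
sequence = [
    [math.sin(2 * math.pi * t / 5), true_trend[t] + rng.gauss(0.0, 0.01)]
    for t in range(n)
]

beta = fit_cte(sequence)
estimated_trend = [trend_score(beta, x) for x in sequence]
print(beta)
```

Only the time order of the samples is used as supervision. The learned score is expected to recover the trend up to a monotonic transformation, not its absolute values.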
Once the parameters $(\phi, \beta)$ are fitted, we build an estimator $\beta^\top F_\phi(X_t)$ of the trend factor $\tau^X_t$. In the next section, we show to what extent this estimator effectively estimates the hidden trend factor.

3 Identifiability Study
We assume that $F_\phi$ is a universal approximation function (e.g., a sufficiently large NN) and that the amount of data is large enough (equivalent to infinite data) such that we achieve the identity of Eq. (2).

Definition 1. (Minimal sufficiency). A sufficient statistic $T$ is minimal sufficient if for any sufficient statistic $U$, there exists a function $h$ such that $T = h(U)$. If $U$ is also minimal, then $h$ is a bijection.

Proposition 1. $\beta^\top\left(F_\phi(X_{t_v}) - F_\phi(X_{t_u})\right)$ is a minimal sufficient statistic for the trend label $C_{uv}$.

Proof. First we recall that logistic regression learns likelihood ratios, i.e., the logit $\beta^\top\left(F_\phi(X_{t_v}) - F_\phi(X_{t_u})\right)$ is a log-likelihood difference. In fact, using the Bayes rule, we get

$$p(C_{uv} = 1 \mid X_{t_u}, X_{t_v}) = \frac{p(X_{t_u}, X_{t_v} \mid C_{uv} = 1)\, p(C_{uv} = 1)}{p(X_{t_u}, X_{t_v})}. \tag{4}$$

Moreover, using properties of the sigmoid function $\sigma$ and Eq. (2), we have

$$e^{\beta^\top\left(F_\phi(X_{t_v}) - F_\phi(X_{t_u})\right)} = \frac{p(C_{uv} = 1 \mid X_{t_u}, X_{t_v})}{p(C_{uv} = 0 \mid X_{t_u}, X_{t_v})}. \tag{5}$$

Finally, from Eq. (4) and Eq. (5) we obtain

$$e^{\beta^\top\left(F_\phi(X_{t_v}) - F_\phi(X_{t_u})\right)} = \frac{p(X_{t_u}, X_{t_v} \mid C_{uv} = 1)\, p(C_{uv} = 1)}{p(X_{t_u}, X_{t_v} \mid C_{uv} = 0)\, p(C_{uv} = 0)}.$$

We note that $p(C_{uv} = 1) = p(C_{uv} = 0)$, since we randomly choose $t_u$ and $t_v$ simultaneously within $\{1, \dots, N_X\}$. Hence,

$$e^{\beta^\top\left(F_\phi(X_{t_v}) - F_\phi(X_{t_u})\right)} = \frac{p(X_{t_u}, X_{t_v} \mid C_{uv} = 1)}{p(X_{t_u}, X_{t_v} \mid C_{uv} = 0)}, \tag{6}$$

that is a likelihood ratio. Theorem 2 of [16] states that the density ratio is a minimal sufficient statistic. Then, from Definition 1, $\beta^\top\left(F_\phi(X_{t_v}) - F_\phi(X_{t_u})\right)$ is a minimal sufficient statistic for $C_{uv}$.

Remark 2. Learning the likelihood ratio of Eq. (6) without explicitly knowing the likelihood is called likelihood-free inference [17]. Related to our approach, [18] explains how classification can be used to do likelihood-free inference using CL. Compared to maximum likelihood approaches, its main advantage lies in the fact that contrasting two models makes it possible to cancel out some computationally intractable terms (such as the log-determinant of the Jacobian of the embedding function or the partition function of the model).
Lemma 1. There exists a monotonic function $h$ such that $\beta^\top\left(F_\phi(X_{t_v}) - F_\phi(X_{t_u})\right) = h(\tau^X_{t_v} - \tau^X_{t_u})$.

Proof. $\tau^X_{t_v} - \tau^X_{t_u}$ is by definition a minimal sufficient statistic. Hence, since the estimator $\beta^\top\left(F_\phi(X_{t_v}) - F_\phi(X_{t_u})\right)$ is minimal (see Proposition 1), Definition 1 states that there exists a bijective function $h : \mathbb{R} \to \mathbb{R}$ (hence monotonic), such that $\beta^\top\left(F_\phi(X_{t_v}) - F_\phi(X_{t_u})\right) = h(\tau^X_{t_v} - \tau^X_{t_u})$.

Main result 1. Using the estimator $\beta^\top F_\phi(X_t)$, we can identify the true state $\tau^X_t$ from $X_t$ up to a monotonic transformation $h$.

Proof. Let $t_{ref}$ be a reference sample time of a new system. Hence, any future sample $X_t$ with $t > t_{ref}$ is sampled from a degraded system. We can assume w.l.o.g. $\tau^X_{t_{ref}} = 0$ (since there is no absolute notion of state). From Lemma 1 and assuming Eq. (2) is achievable (infinite data and $F_\phi$ assumptions), the learned parameters $(\beta, \phi)$ are such that $\beta^\top\left(F_\phi(X_{t_v}) - F_\phi(X_{t_{ref}})\right) = h(\tau^X_{t_v})$. Hence, defining the shift constant $C = \beta^\top F_\phi(X_{t_{ref}})$, we get $\beta^\top F_\phi(X_{t_v}) = h(\tau^X_{t_v}) + C$.

4 Related Work

4.1 Standard Trend Detection Methods

The trend is any monotone factor underlying temporal data, a "general direction and tendency" [19]. It can then be a drift in values, moments, interactions between observed variables, or more generally in the parameters of the generative model from which data have been sampled (i.e., the model of the system $S$). Different methods may be used depending on the prior information about the trend. We list commonly used trend detection or estimation methods, some recent applications and the relation with our approach.

Mann-Kendall Test. It evaluates whether observed values tend to increase or decrease over time using a nonparametric form of monotonic trend regression analysis [15,20,21]. The Mann-Kendall test analyzes the sign of the difference between data samples, with a total of $N(N-1)/2$ possible pairs, where $N$ is the number of observations. It accepts missing values, and the data is not required to follow a particular distribution. The Mann-Kendall statistic for a univariate time series $X$ observed at timesteps $\{t_1, \dots, t_N\}$ is:

$$S = \sum_{i=1}^{N-1} \sum_{j=i+1}^{N} \mathrm{sign}(X_{t_i} - X_{t_j}), \tag{7}$$

where $\mathrm{sign}(x) := \mathbb{1}_{x>0} - \mathbb{1}_{x<0}$. The null hypothesis (absence of trend) is rejected when $S$ and $2S/(N(N-1))$ are significantly different from 0. Equation (7) is directly related to the CTE problem of Eq. (3).
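A short sketch of the Mann-Kendall statistic may help make the pairwise structure concrete. The code below uses the common convention sign(x_j − x_i) for j > i, so that an increasing series gives S > 0; Eq. (7) as printed takes the difference in the opposite order, which only flips the sign of S. The normal approximation of the variance assumes no tied values. Function names are illustrative.

```python
import math


def sign(x):
    """Return 1, 0 or -1 according to the sign of x."""
    return (x > 0) - (x < 0)


def mann_kendall_s(series):
    """Sum of the signs of all pairwise differences (later minus earlier)."""
    n = len(series)
    return sum(
        sign(series[j] - series[i]) for i in range(n - 1) for j in range(i + 1, n)
    )


def kendall_tau(series):
    """Normalized statistic 2S / (N (N - 1)), which lies in [-1, 1]."""
    n = len(series)
    return 2.0 * mann_kendall_s(series) / (n * (n - 1))


def mann_kendall_z(series):
    """Z-score of S under the no-trend hypothesis (no ties assumed)."""
    n = len(series)
    s = mann_kendall_s(series)
    variance = n * (n - 1) * (2 * n + 5) / 18.0
    if s > 0:
        return (s - 1) / math.sqrt(variance)
    if s < 0:
        return (s + 1) / math.sqrt(variance)
    return 0.0


increasing = [1.0, 2.0, 3.0, 4.0]
print(mann_kendall_s(increasing), kendall_tau(increasing), mann_kendall_z(increasing))
```

The double loop over pairs is the same structure as the sum in Eq. (3), with the hard sign replaced there by a learned, differentiable comparison.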
Finding Relation between Time and Observations. It consists in regressing time index $t$ on a response variable $X_t$. For example, $X_t = \beta_0 + \sum_{p=1}^{P} \beta_p t^p + \epsilon$. The null test hypothesis is $H_0 : \beta_p = 0\ \forall p$ (i.e., the absence of trend). A more general flexible model uses functions of time that can be estimated with smoothing splines, spline regression, or kernel smoothers [22]. We note that this method can be used by regressing time not on the observation $X$ but on an embedding of $X$, i.e. $F(X_t) = t$. Our CTE implicitly does time regression with an adapted and efficient CL procedure that does not consider the absolute time value.

Residual of the Decomposition of Data into Stationary Components. It assumes the existence of stationary generative factors (e.g., cycle or seasonality) and a trend that can be seen as a residual. For example, the empirical mode decomposition (EMD) [23] is a framework to decompose time series into oscillatory sources $\{c_k\}_{k=1}^{K}$, whose number of modes is strictly decreasing. By construction, $c_K$ is the trend. The CTE method directly filters the information present in $\{c_k\}_{k=1}^{K-1}$ to find $c_K$ (up to a monotonic transform) if $c_K$ is effectively a monotonic factor.

Application 1. Climate studies commonly use the methods presented above to extract and analyze trends [3, 24]. Climate data, e.g., hydro-climatic data [25] like soil moisture [24] or drought information [3], contain significant nonlinear long-term trends. Hence, generic methods like EMD are now commonly used [3, 25, 26]. Yet, when data dimensionality or dataset size grows, or when the data type is not standard (e.g., large topographic data), EMD finds its limits, despite the recent development of faster EMD methods [27]. The use of NNs in our CTE method, trained with an efficient universal procedure, overcomes this problem. CTE can then at least be a comparative method on climate data analysis problems, and at best outperform currently used trend extraction methods.

Monitoring of Rolling Statistics. It consists of computing and monitoring a set of statistics $s_{X_t}$ for all samples $X_t$ at each time $t$. $s_{X_t}$ can be a moment, a covariance (for multivariate data) or autocovariance (for time series data) matrix, a cumulant, parameters of a model [28,29], etc. The sequence $\{s_{X_t}\}_{t \in \{1, \dots, N_X\}}$ is then monitored to find a trend. Generally, any embedding $F(X_t)$ may be monitored as long as $F(X_t)$ is informative about $\tau^X$ (e.g., sufficient statistics). The difficulty is to choose the right sample statistics or embedding in which the trend is hidden. Our CTE approach learns the optimal statistics/embedding, and therefore requires no expert definition of the statistics to be monitored.

Blind Source Separation (BSS). It identifies the source factors that generated the observed data, among which the trend if it exists. A common way to do BSS is to apply independent component analysis (ICA) [6]. Recent works proposed solutions to the general nonlinear BSS problem using new nonlinear ICA methods based on contrastive learning [8,30]. When structural components are independent, ICA factors include the trend factor (up to a scalar transform, as for CTE). These methods may use NNs as embedding functions and can then deal with many data types. Our CTE is a particular case of these nonlinear ICA methods with a stronger inductive bias towards trend detection.
Slow Feature Analysis. Considering that the trend is the slowest non-constant factor underlying data, slow feature analysis (SFA) [31] is a natural solution for trend extraction. It consists of extracting features $F_\phi(X_t)$ from data such that $\|F_\phi(X_t) - F_\phi(X_{t-1})\|^2$ is the lowest possible, under the constraint that the components of $\{F_\phi(X_{t_i})\}_{i=1}^{N_X}$ are orthogonal. Recent works showed the relation between SFA and BSS [32–34].

Finally, the different standard methods presented above are related to our approach, which is more general and universal, and adapts to all situations with all types of data, thanks to the use of NNs and an ad-hoc learning procedure.

4.2 Relation with Survival Analysis

A field related to the trend detection problem is survival analysis (SA). The objective of SA is to estimate the lifespan of monitored systems (e.g., the time-to-death of a patient, the time-to-default of a loan, the time-to-extinction of species) from data. An illustration of typical survival analysis is given in Fig. 2.
Fig. 2. In Survival Analysis, Several Systems Enter a Study by giving a Data Sample. At the end of the Study, we know if a System has Failed or has Survived. The Objective is to Predict the Lifespan from Data using Survival Information.
Using previous notations, we have $X_t$ an observed sample at time $t$, $T_X$ the (unknown) lifespan of the system that generated $X$, $T_X - t$ its failure time at time $t$, $\tau_t$ its (unobserved) state at sampling time $t$, and $T_X - \tau_t$ its remaining useful life (RUL) at time $t$. The objective is to model the conditional survival function $S(\tau_t \mid x) = P(T_X > \tau_t \mid X_t = x)$, hence the probability that the current state is lower than death-time. A standard way to estimate the survival function $S$ is to use hazard functions $h$, defined as $h(\tau_t \mid x) := -\partial_{\tau_t} \log\left(S(\tau_t \mid x)\right)$. CTE shares several features with survival analysis. First, lifespan is generally assumed to be a monotonic hidden state. Second, the most used performance metric for survival analysis is the concordance index (CI) [35], which measures the fraction of pairs of samples that can be correctly ordered in terms of estimated lifespan. It is then directly related to our objective of Eq. (3).
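The concordance index mentioned above can be sketched as a pairwise count. This simplified version assumes that every lifespan is fully observed (no censoring) and that a higher score means a longer predicted lifespan; both are assumptions made for illustration, and the function name is hypothetical.

```python
def concordance_index(lifespans, scores):
    """Fraction of comparable pairs ordered consistently by the scores.

    Pairs with equal lifespans are not comparable and are skipped;
    pairs with tied scores count for one half.
    """
    concordant, comparable = 0.0, 0
    n = len(lifespans)
    for i in range(n):
        for j in range(i + 1, n):
            if lifespans[i] == lifespans[j]:
                continue
            comparable += 1
            score_diff = scores[i] - scores[j]
            if score_diff == 0:
                concordant += 0.5
            elif (lifespans[i] > lifespans[j]) == (score_diff > 0):
                concordant += 1.0
    return concordant / comparable


# Hypothetical lifespans and model scores for four systems.
lifespans = [1.0, 2.0, 3.0, 4.0]
scores = [0.1, 0.4, 0.3, 0.9]
print(concordance_index(lifespans, scores))
```

A value of 1 means that all comparable pairs are ranked correctly, while 0.5 corresponds to random ordering. In practice, survival libraries also handle censored observations, which this sketch deliberately omits.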
Remark 3. As illustrated in Fig. 2, survival analysis framework generally does not have one system from which a sequence of data is recorded, but several systems from which one data point is recorded (when the systems enter the study). Yet, it is common to assume that all samples have been generated from “equivalent systems”, at different moments of its life. Hence, we assume that some of the generative factors, from which we want to disentangle the monotonic factor, represent individual system’s features. The idea of directly training the model to maximize the CI, as in Eq. (3), exists in survival analysis literature. Authors of [36] describe the ranking problem similarly to Eq. (3), and relate it to the standard proportional hazard model (PHM) h(τ |x) = h0 (τ ) exp(Fφ (x)) [37]. Nevertheless, it is commonly restricted to linear Fφ (x) = φT x, for computational reasons and because first order is sufficient in many cases. Authors of [38] use a NN for Fφ to create a personalized treatment recommendation system, but use a CTE-like function (1) only for the prediction (i.e., recommendation) part, showing that the subsequent recommendation system, under PHM assumption, is the difference between the embedding of two samples. In [39] authors fit a lifespan prediction model learned by pair, which regresses the difference between two samples embedding on the difference between samples target RUL. An alternative to PHM is the multi-task logistic regression (MTLR) [40]. It consists in building a series of logistic regression models fitted on different time intervals to estimate the probability that the event of interest (e.g., death) happened within each interval. Another alternative to Cox model is the proportional odds model (POM) O(τ |x) = O0 (τ ) exp(Fφ (x)) [41], where O(τ |x) is the odd of individual surviving beyond time τ . We remind that an odd is the S(τ |x) S(τ |x) ratio 1−S(τ |x) , such that exp(Fφ (x)) = 1−S(τ |x) for a constant baseline function O0 (τ ). 
The POM is therefore directly related to the ratio of Eq. (6). Yet, we surprisingly found no reference of CTE-like survival analysis with learning procedure on Eq. (3) as described in our paper. We therefore apply CTE on standard survival analysis datasets as a practical contribution in Sect. 5.2.
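As an illustration of the ranking view discussed in Remark 3, the following is a minimal sketch of a concordance index for right-censored data. It assumes the usual Harrell-style pairwise definition (Sect. 4.2 is not reproduced here, so the exact form used in this paper may differ); the function name, the toy data and the convention that a higher score means an earlier event are assumptions made for this example.

```python
def concordance_index(times, events, scores):
    """Harrell-style concordance index for right-censored data (sketch).

    times:  observed time (event or censoring) of each sample.
    events: 1 if the event was observed, 0 if the sample is right-censored.
    scores: predicted risk, e.g. an embedding value; higher = earlier event (assumed).
    """
    concordant = 0.0
    comparable = 0
    for i in range(len(times)):
        if not events[i]:
            continue  # a censored sample cannot be the earlier member of a pair
        for j in range(len(times)):
            if times[i] < times[j]:
                comparable += 1
                if scores[i] > scores[j]:
                    concordant += 1.0   # correctly ordered pair
                elif scores[i] == scores[j]:
                    concordant += 0.5   # ties count for half
    if comparable == 0:
        raise ValueError("no comparable pair")
    return concordant / comparable


# Toy data (hypothetical): the third sample is right-censored.
times = [2.0, 4.0, 6.0, 8.0]
events = [1, 1, 0, 1]
risk = [0.9, 0.2, 0.1, 0.3]
ci = concordance_index(times, events, risk)
print(f"CI = {100 * ci:.1f}")  # reported as a percentage
```

Only pairs whose earlier member had an observed event are compared, which is how right-censoring is usually handled in this metric.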
5 Experiments
In this section, we propose experiments to illustrate our approach on several datasets with different types of hidden trend. We denote by $\mathcal{X}^{train}$ and $\mathcal{X}^{test}$ the training and test sets of sequences with hidden trends (validation sets are built from the training set). All the results in the tables are computed on the test set. We test our CTE method on several datasets, where each observation $X_t$ may be a multivariate time series (Sect. 5.1) or a sample in survival analysis experiments (Sect. 5.2). We finally discuss the effect of noise on CTE in Sect. 5.3. For the first two experiments, we compare with NN-based time series decomposition: NN SFA [33] (NFSA) and temporal contrastive learning ICA [8] (TCL-ICA). This choice is motivated by the fact that other standard methods (see Sect. 4) that are relevant for trend extraction are not adapted to
Universal Hidden Monotonic Trend Estimation with Contrastive Learning
high-dimensional data like images or multivariate time series. For the survival analysis, we compare CTE to several models implemented in the PySurvival library [42].
Remark 4. In all experiments, the bold numbers are the best results in terms of means; yet, taking into account the standard deviations (which can be large), we cannot always claim that CTE is the absolute best model, only that it at least matches the other models' results in several trend factor extraction tasks. The objective is to illustrate the flexibility of the CTE approach and its universality in trend extraction problems.
5.1 Ball-Springs Systems' Health Monitoring
In this section, we use the same experimental setup as in [34]. We use the model of this paper, called Seq2Graph, as a comparison.
Dataset. We simulate 15000 samples (10000 for train/validation and 5000 for test) from a 10-ball-springs system, consisting of the simultaneous trajectories of 10 identical balls in a 2-dimensional space, each ball being randomly connected to some others by springs. Each sample $X_{t_i}$, $i \in [\![1, 50]\!]$, is a time series with 10×2 variables (10 balls in a 2D space) and 50 time steps. The initialization $X_{t_1}$ of each time series $X_t$ is normally sampled. To simulate the drift, for each sequence $X = \{X_{t_1}, \ldots, X_{t_{50}}\}$, a constant ageing factor $\alpha_X \sim U([0.9, 1])$ is applied to the system: at each time step $t$, we randomly choose a spring $(i, j) \in \{1, \ldots, 10\}^2$ and multiply its rigidity by $(\alpha_X)^t$, i.e., an exponential ageing coefficient with respect to the sample time index. Every 50 samples, we take a new random system, sample another ageing factor, and repeat the simulation for the next sequence of 50 samples. The objective is to extract the trend information quantified by $\alpha_X$, without the knowledge that it is hidden in the causal interactions between balls.
Results. We compare NFSA, TCL-ICA and CTE, plus the model Seq2Graph taken from the source of this experiment [1]. We use a relational NN (RelNN, as in Seq2Graph) and a 3-layer convolutional NN (CNN) as embedding functions $F_\phi$, fitted using the Adam optimizer [43]. Results are given in Table 1.
Table 1. Absolute Correlation between Estimated and True Trend, using Two Types of Embedding Neural Architecture. Means and Standard Deviations are Computed using 5-Fold Train/Test Split, with Randomly Initialized NNs for Each Fold.
         Seq2Graph     NFSA          TCL-ICA       CTE
RelNN    0.97 ± 0.03   0.93 ± 0.03   0.95 ± 0.02   0.97 ± 0.02
CNN      −             0.94 ± 0.03   0.60 ± 0.05   0.97 ± 0.02
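The ageing mechanism described in the Dataset paragraph above can be sketched as follows. This is only an illustration of the rigidity drift, not of the ball dynamics; the dictionary layout, the random topology (connection probability 0.3), the initial rigidity of 1.0 and the choice to pick only among existing springs are assumptions made for this example.

```python
import math
import random


def simulate_ageing(springs, alpha, n_steps, rng):
    """Return the spring rigidities after each time step (sketch).

    springs: dict {(i, j): rigidity} for connected ball pairs (assumed layout).
    alpha:   ageing factor in [0.9, 1]; alpha = 1 means no ageing.
    """
    current = dict(springs)
    keys = sorted(current)
    history = []
    for t in range(1, n_steps + 1):
        pair = rng.choice(keys)      # pick one spring at random
        current[pair] *= alpha ** t  # exponential ageing coefficient
        history.append(dict(current))
    return history


rng = random.Random(0)
n_balls = 10
# Hypothetical random topology: each pair of balls is connected with probability 0.3.
springs = {(i, j): 1.0
           for i in range(n_balls) for j in range(i + 1, n_balls)
           if rng.random() < 0.3}
if not springs:
    springs[(0, 1)] = 1.0  # guard so that at least one spring exists

alpha = rng.uniform(0.9, 1.0)
history = simulate_ageing(springs, alpha, n_steps=50, rng=rng)
total = math.prod(history[-1].values())  # overall drift accumulated by the system
```

Under these assumptions the product of all rigidities after 50 steps equals alpha raised to 1 + 2 + ... + 50 = 1275, whichever springs were drawn, which is one way to see that the hidden trend is monotonic in the time index.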
We see that with RelNN architecture, finding the trend hidden in the variable interactions is easy for all methods. Yet, when using a more generic function
Fφ, the results of TCL-ICA drop. This sensitivity to the embedding function was already shown in the experiments of [8]. Naturally, CTE, which is specialized in trend extraction, is the most robust and best-performing approach.
5.2 Survival Analysis Experiments
We showed in Sect. 4.2 how the CTE method is related to SA. In this section, we illustrate this relation by applying CTE to survival analysis problems.
Datasets. We use four public survival analysis datasets. Customer churn prediction (Churn) consists in estimating the percentage of customers that stop using the products and services of a company. Survival analysis for customer churn helps companies predict when a customer is likely to quit, given the customer's personal features. The dataset contains 2000 samples with 12 features, of which 53.4% are right-censored. Credit risk (Credit) is the risk carried by a loan company when people borrow money. It corresponds to the likelihood of a borrower's credit default with respect to personal features. Survival analysis for credit risk predicts if and when a borrower is likely to default. The dataset contains 1000 samples with 19 features, of which 30.0% are right-censored. Treatment effects on survival time are fundamental for pharmaceutical laboratories. It is possible to do survival analysis of patients. Two public datasets exist. First, German Breast Cancer Study Group (GBCSG2) contains a subset of variables from the German breast cancer study [44]. It studies the effects of a hormone treatment on survival time without cancer recurrence. The dataset contains 686 samples with 8 features, of which 56.4% are right-censored. Finally, the predictive maintenance of mechanical equipment consists in predicting when a piece of equipment will fail. We use a public dataset, called Maintenance, whose data is extracted from sensors on manufacturing machines to predict which will fail soon. The dataset contains 1000 samples with 5 features, of which 60.3% are right-censored.
Results. We compare our CTE survival analysis with five other standard models: the linear and neural Cox proportional hazard models (Cox-PHM) [37,38], the extra-tree and random forest survival analysis (RFS) [45] and a multi-task logistic regression (MTLR) survival analysis [40]. In tree-based survival analysis, the cumulative hazard function ($\int_t h(t|x)\,dt$) is estimated with bags of trees. The other models have been presented in Sect. 4.2; we do not provide additional information here. An exhaustive introduction to these models is provided on the website of the PySurvival library [42], which we used to implement the comparative methods. For these methods, we chose the hyperparameters used in the PySurvival tutorials when available.
For the survival experiments, we used a 10-fold train-test setup. Each dataset is divided into 10 folds used for cross-validation, i.e., one fold serves as the testing set while the others compose the training set. This 10-fold train-test separation is repeated several times for robustness of the results. Table 2 shows the mean and standard deviation of the concordance index (CI, see Sect. 4.2) computed on the test samples.
Table 2. Concordance Index for 6 Survival Analysis Models (Including CTE) on Four Datasets. Means and Standard Deviations are Computed using 10-Fold Train/Test Split Repeated 5 Times using Randomly Initialized NNs (for Survival Analysis Models that use NNs) for each Fold.
                Churn        Credit       GBCSG2       Maintenance
Linear Cox      88.1 ± 1.0   77.0 ± 1.9   66.3 ± 5.5   96.1 ± 1.7
Neural Cox      87.5 ± 2.4   75.0 ± 2.6   64.4 ± 4.2   99.3 ± 1.0
Extra Tree      85.9 ± 1.2   71.2 ± 4.3   63.8 ± 5.4   94.1 ± 1.5
Random Forest   84.5 ± 1.5   71.5 ± 3.2   67.6 ± 4.4   93.1 ± 2.1
Multitask       89.2 ± 1.7   71.4 ± 3.7   67.9 ± 7.7   93.0 ± 2.9
CTE (ours)      89.9 ± 0.9   77.2 ± 1.8   67.9 ± 5.7   99.6 ± 0.4
5.3 Noisy Trend in CTE
This subsection of the experiments serves as a discussion on the impact of noise on trend detection, which is common in real-world problems. For example, in climate change trend estimation, the complex and numerous interactions between the environment and the variables of interest may perturb the estimator. A noisy trend can be characterized by a contamination of the density $p(\tau_{t_u}, \tau_{t_v} \mid C_{uv})$ with another density $\nu(\tau_{t_u}, \tau_{t_v} \mid C_{uv}; M)$:
$$p_\nu(\tau_{t_u}, \tau_{t_v} \mid C_{uv}) := (1 - \eta)\, p(\tau_{t_u}, \tau_{t_v} \mid C_{uv}) + \eta\, \nu(\tau_{t_u}, \tau_{t_v} \mid C_{uv}; M) \tag{8}$$
with $\eta \in [0, 1]$ the prevalence of the contamination, as named in [46], and $M$ representing the maximum temporal dispersion of the noise (i.e., if $|t_u - t_v| > M$ then $\nu = 0$).
Dataset. To illustrate the impact of noise on CTE, we introduce another dataset where trend detection is a useful task: the NASA public Commercial Modular Aero-Propulsion System Simulation (CMAPSS) dataset [28], which consists of turbine engine degradation and maintenance simulations. We use the dataset FD001, which contains 100 time series of the output of the turbine-engine system, recorded at sea level. Time series are on average 206 time steps long and have 13 non-constant variables. The engine is operating normally at the start of each time series and develops a fault of unknown initial magnitude in its
Fig. 3. Example of CTE on CMAPSS Data with Noisy Trend. Top: Distribution of Couples $(\tau^{\nu}_{t_u}, \tau^{\nu}_{t_v})$, for Different Couples $(\eta, M)$, Respectively (0.2, 2), (0.2, 5), (0.5, 5), (0.5, 10), Representing the Levels of Noise. Bottom: Mean Accuracy of the Trend Classification for All Pairs of Samples of the Test (Noise-Free) Set, using Loss Eq. (3) Versus L1 Loss (Result from [47]).
first moments. We only know that the impact of this fault on the system grows in magnitude until the system fails. We extract from these time series sub-trajectories of length 25, with a rolling window with stride 5, to make our dataset. We contaminate the source with different parameters $(\eta, M)$ in Eq. (8): for each sequence $X \in \mathcal{X}$, we randomly select a proportion $\eta$ of time steps in $\{t_1, \ldots, t_{N_X}\}$. For each selected time step $t_i$, we take the $M$ time steps around $t_i$ ($\{t_{i-M/2+1}, \ldots, t_{i+M/2}\}$) and randomly shuffle them. We name $t^{\nu}_t$ this new indexing of the samples, and $\tau^{\nu}_t := \tau_{t^{\nu}_t}$ the corresponding noisy trend process. We then supervise the CTE with labels $C^{\nu}_{uv} := \mathbb{1}\{t^{\nu}_u \le t^{\nu}_v\}$, which are no longer equal to $C_{uv}$.
Results. In our CTE framework, a noisy trend is equivalent to a noisy classification problem, for which a large literature exists [48]. We use robust classification losses [47], for example a symmetric loss like the L1 loss (see the bottom panel of Fig. 3).
Remark 5. In case of heavy contamination (e.g., we do not know that the monitored system has been restored), we can use a γ-crossentropy loss [46]. It has been used in [49] to develop a TCL-ICA robust to source contamination.
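The windowing and contamination procedure described above can be sketched as follows. This is a hedged illustration rather than the authors' code: the function names are hypothetical, M/2 is taken as an integer division, and windows are clipped at the sequence boundaries, none of which is specified in the text.

```python
import random


def rolling_windows(series, length=25, stride=5):
    """Sub-trajectories of fixed length taken with a rolling window."""
    return [series[s:s + length]
            for s in range(0, len(series) - length + 1, stride)]


def contaminate_indices(n_steps, eta, m, rng):
    """Noisy time indexing: a permutation of range(n_steps) (sketch).

    A proportion eta of time steps is selected; around each selected step,
    the window {i - m//2 + 1, ..., i + m//2} is shuffled. Clipping at the
    boundaries and integer division of m are assumptions of this example.
    """
    noisy = list(range(n_steps))
    n_selected = round(eta * n_steps)
    for center in rng.sample(range(n_steps), n_selected):
        lo = max(0, center - m // 2 + 1)
        hi = min(n_steps, center + m // 2 + 1)
        window = noisy[lo:hi]
        rng.shuffle(window)
        noisy[lo:hi] = window
    return noisy


def pair_labels(index):
    """Labels C_uv = 1 if index[u] <= index[v], else 0, for all pairs u != v."""
    n = len(index)
    return {(u, v): int(index[u] <= index[v])
            for u in range(n) for v in range(n) if u != v}


rng = random.Random(0)
clean = list(range(25))
noisy = contaminate_indices(len(clean), eta=0.2, m=5, rng=rng)
clean_labels = pair_labels(clean)
noisy_labels = pair_labels(noisy)
# Number of pairwise labels flipped by the contamination.
flipped = sum(noisy_labels[k] != clean_labels[k] for k in clean_labels)
```

Because the shuffles are applied one after another, overlapping windows can move a sample further than a single window would; the sketch does not try to enforce a strict displacement bound.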
6 Conclusion
This paper proposes a new universal trend estimation method using the contrastive learning framework, CTE. Our model is supported by theoretical identifiability results and numerous experimental validations. We showed on several datasets that the strong inductive bias of CTE enables a more robust and accurate trend estimation than two other universal factor extraction methods. Moreover, using the strong relation between trend detection and survival analysis, we applied CTE on survival analysis problems and showed that it outperforms
standard models. In future work, we plan to add interpretability layers to CTE for industrial reasons, for example, using an attention mechanism in the embedding functions. We also plan to extend our model to treat the trend detection problem when several independent trends underlie the data.
References
1. Pineau, E., Razakarivony, S., Bonald, T.: Unsupervised ageing detection of mechanical systems on a causality graph. In: ICMLA (2020)
2. Miller, R.G., Jr.: Survival Analysis. John Wiley & Sons, Hoboken (2011)
3. Huang, S.H., Mahmud, K., Chen, C.J.: Meaningful trend in climate time series: a discussion based on linear and smoothing techniques for drought analysis in Taiwan. Atmosphere 13(3), 444 (2022)
4. Harvey, A.C., Shephard, N.: 10 structural time series models (1993)
5. Choi, S., Cichocki, A., Park, H.-M., Lee, S.-Y.: Blind source separation and independent component analysis: a review. Neural Inf. Proc.-Lett. Rev. 6(1), 1–57 (2005)
6. Hyvärinen, A., Oja, E.: Independent component analysis: algorithms and applications. Neural Netw. 13(4–5), 411–430 (2000)
7. Bengio, Y., Courville, A., Vincent, P.: Representation learning: a review and new perspectives. IEEE Trans. Pattern Anal. Mach. Intell. 35(8), 1798–1828 (2013)
8. Hyvarinen, A., Morioka, H.: Unsupervised feature extraction by time-contrastive learning and nonlinear ICA. In: Advances in Neural Information Processing Systems, pp. 3765–3773 (2016)
9. Chen, K., Wang, J.: Design of multivariate alarm systems based on online calculation of variational directions. Chem. Eng. Res. Des. 122, 11–21 (2017)
10. Niknam, S.A., Kobza, J., Hines, J.W.: Techniques of trend analysis in degradation-based prognostics. Int. J. Adv. Manuf. Technol. 88(9–12), 2429–2441 (2017)
11. Le-Khac, P.H., Healy, G., Smeaton, A.F.: Contrastive representation learning: a framework and review. IEEE Access 8, 193907–193934 (2020)
12. Franceschi, J.Y., Dieuleveut, A., Jaggi, M.: Unsupervised scalable representation learning for multivariate time series. In: Advances in Neural Information Processing Systems, pp. 4650–4661 (2019)
13. Banville, H., Albuquerque, I., Hyvarinen, A., Moffat, G., Engemann, D.A., Gramfort, A.: Self-supervised representation learning from electroencephalography signals. In: IEEE 29th International Workshop on Machine Learning for Signal Processing (MLSP), pp. 1–6. IEEE (2019)
14. Wang, X., Gupta, A.: Unsupervised learning of visual representations using videos. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 2794–2802 (2015)
15. Mann, H.B.: Nonparametric tests against trend. Econometrica 13(3), 245–259 (1945)
16. Goh, C.: Econ 2 0A: sufficiency, minimal sufficiency and the exponential family of distributions (2001)
17. Thomas, O., Dutta, R., Corander, J., Kaski, S., Gutmann, M.U.: Likelihood-free inference by ratio estimation. arXiv preprint: arXiv:1611.10242 (2016)
18. Gutmann, M.U., Dutta, R., Kaski, S., Corander, J.: Likelihood-free inference via classification. Stat. Comput. 28(2), 411–425 (2018)
19. Goldsmith, F.B.: Monitoring for Conservation and Ecology, vol. 3. Springer Science & Business Media, Cham (2012)
20. Kendall, M.G.: In: Griffin, C (ed.) Rank Correlation Methods. 4th ed. (1975)
21. Gilbert, R.O.: Statistical Methods for Environmental Pollution Monitoring. Wiley, Hoboken (1987)
22. Gray, K.L.: Comparison of trend detection methods (2007)
23. Huang, N.E., et al.: The empirical mode decomposition and the Hilbert spectrum for nonlinear and non-stationary time series analysis. Proc. R. Soc. London. Ser. A: Math., Phys. Eng. Sci. 454(1971), 903–995 (1998)
24. Laura, A.-M., José, M.-F., María, P., Ángel, G.-Z., Pilar, B.-V., Jaime, G.: Analysis of soil moisture trends in Europe using rank-based and empirical decomposition approaches. Global Planet. Change 215, 103868 (2022)
25. Carmona, A.M., Poveda, G.: Detection of long-term trends in monthly hydroclimatic series of Colombia through Empirical Mode Decomposition. Clim. Change 123(2), 301–313 (2014)
26. Wei, F., et al.: Vegetation dynamic trends and the main drivers detected using the ensemble empirical mode decomposition method in east Africa. Land Degrad. Dev. 29(8), 2542–2553 (2018)
27. Zhang, J., et al.: Serial-EMD: fast empirical mode decomposition method for multidimensional signals based on serialization. Inf. Sci. 581, 215–232 (2021)
28. Saxena, A., Goebel, K.: Turbofan engine degradation simulation data set. NASA Ames Prognostics Data Repository (2008)
29. Adamowski, K., Prokoph, A., Adamowski, J.: Development of a new method of wavelet aided trend detection and estimation. Hydrol. Proc.: Int. J. 23(18), 2686–2696 (2009)
30. Hyvarinen, A., Sasaki, H., Turner, R.: Nonlinear ICA using auxiliary variables and generalized contrastive learning. In: The 22nd International Conference on Artificial Intelligence and Statistics, pp. 859–868. PMLR (2019)
31. Wiskott, L., Sejnowski, T.J.: Slow feature analysis: unsupervised learning of invariances. Neural Comput. 14(4), 715–770 (2002)
32. Blaschke, T., Zito, T., Wiskott, L.: Independent slow feature analysis and nonlinear blind source separation. Neural Comput. 19(4), 994–1021 (2007)
33. Schuler, M., Hlynsson, H.D., Wiskott, L.: Gradient-based training of slow feature analysis by differentiable approximate whitening. In: Asian Conference on Machine Learning, pp. 316–331. PMLR (2019)
34. Pineau, E., Razakarivony, S., Bonald, T.: Time series source separation with slow flows. In: ICML Workshop on Invertible Neural Networks, Normalizing Flows, and Explicit Likelihood Models (2020)
35. Harrell, F.E., Jr., Lee, K.L., Mark, D.B.: Multivariable prognostic models: issues in developing models, evaluating assumptions and adequacy, and measuring and reducing errors. Stat. Med. 15(4), 361–387 (1996)
36. Steck, H., Krishnapuram, B., Dehing-Oberije, C., Lambin, P., Raykar, V.C.: On ranking in survival analysis: bounds on the concordance index. In: Advances in Neural Information Processing Systems, pp. 1209–1216 (2008)
37. Cox, D.R.: Regression models and life-tables. J. Roy. Stat. Soc.: Ser. B (Methodological) 34(2), 187–202 (1972)
38. Katzman, J.L., Shaham, U., Cloninger, A., Bates, J., Jiang, T., Kluger, Y.: Deepsurv: personalized treatment recommender system using a cox proportional hazards deep neural network. BMC Med. Res. Methodol. 18(1), 24 (2018)
39. Jing, B., et al.: A deep survival analysis method based on ranking. Artif. Intell. Med. 98, 1–9 (2019)
40. Yu, C.N., Greiner, R., Lin, H.C., Baracos, V.: Learning patient-specific cancer survival distributions as a sequence of dependent regressors. In: Advances in Neural Information Processing Systems, pp. 1845–1853 (2011)
41. Bennett, S.: Analysis of survival data by the proportional odds model. Stat. Med. 2(2), 273–277 (1983)
42. Fotso, S., et al.: PySurvival: open source package for survival analysis modeling (2019). https://www.pysurvival.io/
43. Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. arXiv preprint: arXiv:1412.6980 (2014)
44. Schumacher, M., et al.: Randomized 2 x 2 trial evaluating hormonal treatment and the duration of chemotherapy in node-positive breast cancer patients. German breast cancer study group. J. Clin. Oncol. 12(10), 2086–2093 (1994)
45. Ishwaran, H., Kogalur, U.B., Blackstone, E.H., Lauer, M.S., et al.: Random survival forests. Ann. Appl. Stat. 2(3), 841–860 (2008)
46. Fujisawa, H., Eguchi, S.: Robust parameter estimation with a small bias against heavy contamination. J. Multivar. Anal. 99(9), 2053–2081 (2008)
47. Ghosh, A., Kumar, H., Sastry, P.: Robust loss functions under label noise for deep neural networks. arXiv preprint: arXiv:1712.09482 (2017)
48. Han, B., et al.: A survey of label-noise representation learning: past, present and future. arXiv preprint: arXiv:2011.04406 (2020)
49. Sasaki, H., Takenouchi, T., Monti, R., Hyvarinen, A.: Robust contrastive learning and nonlinear ICA in the presence of outliers. In: Conference on Uncertainty in Artificial Intelligence, pp. 659–668. PMLR (2020)
Applying CRISP-DM Methodology in Developing Machine Learning Model for Credit Risk Prediction
Kuldeep Rawat
Elizabeth City State University, Elizabeth City, NC 27909, USA
[email protected]
Abstract. Banks and other financial institutions must assess credit risk when deciding on loan applications. Approving loans to ‘risky’ individuals is the largest source of financial loss. In other words, borrowers who default cause the largest amount of loss to the banks, and these institutions need to minimize this risk before agreeing to approve loans. Quantitative modeling using machine learning techniques can now be used to get better insights from data, automate the process, reduce data management, and increase overall profitability. We used the CRoss Industry Standard Process for Data Mining (CRISP-DM) methodology in developing the machine learning solution for the given task of building a binary classifier that can predict which loan applicants are likely to default and which are not. Both linear and non-linear models were evaluated as baseline models to identify the potential candidate model for the final design solution to achieve the highest performance in terms of accuracy and recall values. Performance results are presented in the form of a 2-by-2 confusion matrix, classification reports, and ROC curve for easy understanding. The great benefit of the proposed design solution is that the bank can deploy a model that can automatically filter out potential customers who might default, thus reducing data management requirements and the burden on underwriters.
Keywords: Machine Learning · Random Forest · Credit Risk Prediction · CRISP-DM Methodology · Confusion Matrix · Feature Importance
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 K. Arai (Ed.): SAI 2023, LNNS 739, pp. 522–538, 2023. https://doi.org/10.1007/978-3-031-37963-5_37
1 Introduction
The ability to seek and issue loans is beneficial for both lending institutions and the consumer. However, it carries a great risk when the borrower is unable to repay the loan as per the schedule or defaults on payments. Hence, predicting loan default has become the subject of research in the financial lending sector. Banks and other financial lending institutions must assess credit risk when deciding on loan applications. Approving loans to individuals who are likely to default on payment is the largest source of financial loss. The lending institutions need to minimize this risk before agreeing to approve a loan. Financial institutions and lending platforms have adopted credit scoring as a vital tool for credit risk assessment. The high demand for
loans has led to the need for significant improvements in the models for credit scoring and loan default prediction. As a result, financial institutions devote large amounts of resources to predicting the creditworthiness of loan applicants and developing strategies that will minimize the risk of default. We used the CRoss Industry Standard Process for Data Mining (CRISP-DM) methodology [1, 2] in developing the solution for the given task of building a binary classifier that can predict which loan applicants are likely to default and which are not.
Advances in machine learning (ML) have impacted both industry and academic research in the past decades. Machine learning, a subfield of Artificial Intelligence (AI), is being applied to almost every human activity, including pattern recognition, image classification, business, agriculture, transportation, and finance. Machine learning techniques have made breakthroughs in computer vision, image processing, recommender systems, and natural language processing, which has led to a lot of interest in their application to other diverse areas, including finance and credit risk analysis. Conventional machine learning techniques such as support vector machines, naïve Bayes, logistic regression, K-Nearest Neighbor, decision trees, and ensemble methods are proving to be more effective than statistical methods [3]. ML models have the potential to uncover subtle relations, capture various nonlinearities, and process unstructured data without the need for humans to derive theoretical models.
In the past, experts were hired, and their professional opinions were used as input into models for assessing individual credit risk. Lately, the focus has shifted to developing an automated process for executing the same task [4–7]. As a result, in recent years financial institutions have focused on applying machine learning techniques to credit risk assessment.
Quantitative modeling using machine learning techniques can now be used to get better insights from data, automate the process, reduce data management, and increase overall profitability. Machine learning systems can help model developers reduce model risk and improve general model predictive power. Both linear models (Logistic Regression and Linear Discriminant Analysis) and nonlinear models (K-Nearest Neighbor, Support Vector Machine, Decision Tree, Random Forest, and Gradient Boosting Machine) were tested as baseline models to identify a potential candidate model for the final design solution. Among these baseline models, the Random Forest model, which is an ensemble of decision trees, was selected as the candidate model for further investigation as it achieved the highest accuracy and recall values. Random Forest is a supervised machine learning algorithm that can be used for both classification and regression [8]. It constructs many decision trees as part of the training process and, for a classification problem, outputs the class that is the mode of the classes predicted by the individual trees. The decision trees in a random forest are built to be largely uncorrelated with one another. The baseline random forest model was then calibrated by tuning four hyperparameters: (i) the maximum number of trees, (ii) the number of features selected per tree, (iii) the maximum depth of each tree, and (iv) the minimum number of observations for each leaf. Models were evaluated using performance metrics including Accuracy, Precision, Recall, F1-score, and the receiver operating characteristic (ROC) curve.
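To make the evaluation metrics named above concrete, the following minimal sketch derives accuracy, precision, recall and F1-score from the four cells of a 2-by-2 confusion matrix. The function names and the toy labels are hypothetical (they are not taken from the study); the label 1 is assumed to mean "defaulted", matching the BAD target.

```python
def confusion_counts(y_true, y_pred):
    """Return (tn, fp, fn, tp) for binary labels, with 1 = default."""
    tn = fp = fn = tp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 1 and pred == 0:
            fn += 1
        elif truth == 0 and pred == 1:
            fp += 1
        else:
            tn += 1
    return tn, fp, fn, tp


def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for the positive (default) class."""
    tn, fp, fn, tp = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels (hypothetical): 2 defaulters out of 10 applicants.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
always_repaid = [0] * 10                       # predicts "repaid" for everyone
some_model = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]   # finds one of the two defaulters

naive = classification_metrics(y_true, always_repaid)
model = classification_metrics(y_true, some_model)
```

In this toy case both predictors reach the same accuracy, but only the second has non-zero recall, which illustrates why accuracy alone can be misleading when defaulters are a minority and why recall is reported alongside it.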
2 CRISP-DM Methodology
The data science (machine learning) solution was developed following the CRoss Industry Standard Process for Data Mining (CRISP-DM) methodology, which is an industry-proven process with six key phases [2]. The phases are depicted in Fig. 1. A brief description of each phase is as follows.
• Business understanding – What does the business need?
• Data understanding – What data do we have/need? Is it clean?
• Data preparation – How do we organize the data for modeling?
• Modeling – What modeling techniques should we apply?
• Evaluation – Which model best meets the business objectives?
• Deployment – How do stakeholders access the results?
Fig. 1. Phases of CRISP-DM Methodology [1]
2.1 Business Understanding
The Business Context. When a bank or any financial institution receives a loan application, the bank must make a decision for loan approval based on the applicant’s profile. Two types of risks are associated with the bank’s decision. First, if the applicant is likely to repay the loan, then not approving the loan results in a loss of business to the company. Second, if the applicant is likely to default or not likely to repay the loan, then approving the loan may lead to a financial loss for the bank. Banks can make a considerable profit through the disbursement of loans to reliable customers. It is understood that every loan application carries a risk of default where customers fail to repay the loan. Banks and financial institutions can restrict bad loans by ensuring that loans are made only to borrowers who are likely to be able to repay.
The Business Objective. The objective is to use machine learning models to determine the risk associated with the loan application using the Home Equity dataset that has information on each applicant and whether or not the loan was repaid successfully. Once we build a robust model that predicts the probability of default, we can use this model to assess the potential risks associated with future borrowers.
The key questions
• What are the key driving factors (variables that influence or are strong indicators of) whether the customer will default or repay the loan?
• Which factor affects defaulting on loans the most?
• What are the recommendations to the bank management to minimize loan default?
• What does the profile of a customer who is more likely to default on loan repayment look like?
2.2 Data Understanding – Exploratory Data Analysis
Data Description. The dataset used was the Home Equity dataset (HMEQ) available on Kaggle. The dataset contained information for 5,960 recent home equity loans [9].
The target (BAD) is a binary variable that indicates whether an applicant has ultimately defaulted or has been severely delinquent. The dataset indicated that adverse outcomes occurred in 1,189 cases (20 percent). Twelve input variables (features) were captured for each applicant. The detailed data dictionary is as follows.
• BAD: 1 = client defaulted on loan, 0 = loan repaid.
• LOAN: Amount of loan approved.
• MORTDUE: Amount due on the existing mortgage.
• VALUE: Current value of the property.
• REASON: Reason for the loan request (HomeImp = home improvement, DebtCon = debt consolidation, which means taking out a new loan to pay off other liabilities and consumer debts).
• JOB: The type of job that the loan applicant has, such as manager, self, etc.
• YOJ: Years at present job.
• DEROG: Number of major derogatory reports (which indicate serious delinquency or late payments).
• DELINQ: Number of delinquent credit lines (a line of credit becomes delinquent when a borrower does not make the minimum required payments 30 to 60 days past the day on which the payments were due).
• CLAGE: Age of the oldest credit line in months.
• NINQ: Number of recent credit inquiries.
• CLNO: Number of existing credit lines.
• DEBTINC: Debt-to-income ratio (all your monthly debt payments divided by your gross monthly income). This number is one way the lenders measure your ability to manage the monthly payments to repay the money you plan to borrow.
Some of the questions that guided the EDA process to gain insights were: (i) What types of variables are there in the dataset? (ii) What does their distribution look like? (iii) Do we have missing values? (iv) What are the relationships between the features? (v) Do we observe any outliers? and (vi) What is the relationship between the features and the target?
A quick data snapshot is presented in Fig. 2.
Fig. 2. Data Snapshot
The 5-point summary of the feature columns is shown in Table 1. The next graph in Fig. 3 indicates the correlation heatmap of missing values (nullity correlation) indicating how strongly the presence or absence of one feature affects the presence of another. Table 2 summarizes the missing value information for the feature columns. The correlation heatmap between independent variables is shown in Fig. 4.
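The two summaries mentioned above (Tables 1 and 2) can be illustrated with a short sketch. It assumes missing entries are represented as `None` and that quartiles use linear interpolation; the helper names and the toy column are hypothetical and are not the code used in the study.

```python
import statistics


def five_point_summary(values):
    """Min, quartiles and max of the non-missing values of one column."""
    present = [v for v in values if v is not None]
    q1, q2, q3 = statistics.quantiles(present, n=4, method="inclusive")
    return {"count": len(present), "min": min(present),
            "25%": q1, "50%": q2, "75%": q3, "max": max(present)}


def missing_pct(n_missing, n_rows):
    """Percentage of missing values, rounded to two decimals."""
    return round(100.0 * n_missing / n_rows, 2)


# Toy column (hypothetical values) with one missing entry.
column = [1100, None, 16300, 23300, 89900, 11100]
summary = five_point_summary(column)
n_missing = sum(v is None for v in column)

# The percentages in Table 2 follow from the counts and the 5,960 rows,
# e.g. MORTDUE (518 missing) and DEBTINC (1267 missing).
mortdue_pct = missing_pct(518, 5960)
debtinc_pct = missing_pct(1267, 5960)
```

The count reported in a five-point summary is the number of non-missing values, so for each feature it equals 5,960 minus the missing-value count of that feature.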
Table 1. Five-Point Data Summary of the Feature Columns
Feature   Count    Mean        Std         Min       25%        50%        75%         Max
LOAN      5960.0   18607.97    11207.48    1100.00   11100.00   16300.00   23300.00    89900.00
MORTDUE   5442.0   73760.82    44457.61    2063.00   46276.00   65019.00   91488.00    399550.00
VALUE     5848.0   101776.05   57385.78    8000.00   66075.50   89235.50   119824.25   855909.00
YOJ       5445.0   8.9         7.6         0.00      3.00       7.00       13.00       41.00
DEROG     5252.0   0.254570    0.846047    0.00      0.00       0.00       0.00        10.00
DELINQ    5380.0   0.449442    1.127266    0.00      0.00       0.00       0.00        15.00
CLAGE     5652.0   179.77      85.81       0.00      115.12     173.47     231.56      1168.23
NINQ      5450.0   1.186055    1.728675    0.00      0.00       1.00       2.00        17.00
CLNO      5738.0   21.296096   10.138933   0.00      15.00      20.00      26.00       71.00
DEBTINC   4693.0   33.78       8.602       0.524     29.14      34.82      39.00       203.31
Fig. 3. Nullity Correlation Map
As seen in Fig. 4, VALUE and MORTDUE variables have a strong correlation. This multicollinearity is confirmed by the high value for Variance Inflation Factor (VIF) shown in Fig. 5. Figure 6 depicts that Debt Consolidation was the main reason category (~ 66% applicants) for taking out the home equity loan. Figure 7(a) and (b) indicate that for the individuals who defaulted, the median year on the job (YOJ) and the length of credit history were lower than the applicants who repaid the loan. Observations & Insights. Some of the key insights are summarized as follows: • Eleven (11) features out of 12 have missing values, ranging from just under 2% (VALUE) to as high as 21% for DEBTINC. • In the given dataset, other than LOAN variable no other feature column has all values present and hence we cannot use prediction (model-based) methods for imputation. • Seventy-five (75%) of applicants do not have any major derogatory reports and do not have any delinquent credit lines.
Table 2. Missing Value Summary for the Dataset
Feature   #Missing value count   % of missing values
LOAN      0                      0.00%
MORTDUE   518                    8.69%
VALUE     112                    1.88%
REASON    252                    4.23%
JOB       279                    4.68%
YOJ       515                    8.64%
DEROG     708                    11.88%
DELINQ    580                    9.73%
CLAGE     308                    5.17%
NINQ      510                    8.56%
CLNO      222                    3.72%
DEBTINC   1267                   21.26%
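The percentages in Table 2 follow directly from the non-missing counts in Table 1 and the 5,960 rows of the dataset. The short sketch below reproduces that arithmetic for the numerical features; the function name `missing_summary` is illustrative and not part of the original work.

```python
# Non-missing counts per numerical feature, taken from the Count column of Table 1.
TOTAL_ROWS = 5960
NON_MISSING = {
    "LOAN": 5960, "MORTDUE": 5442, "VALUE": 5848, "YOJ": 5445, "DEROG": 5252,
    "DELINQ": 5380, "CLAGE": 5652, "NINQ": 5450, "CLNO": 5738, "DEBTINC": 4693,
}


def missing_summary(counts, total):
    """Return {feature: (missing_count, missing_percent)} rounded to 2 decimals."""
    summary = {}
    for feature, present in counts.items():
        missing = total - present
        summary[feature] = (missing, round(100.0 * missing / total, 2))
    return summary


summary = missing_summary(NON_MISSING, TOTAL_ROWS)
for feature, (count, percent) in summary.items():
    print(f"{feature:8s} {count:5d} {percent:6.2f}%")
```

The categorical features REASON and JOB are not covered here because Table 1 lists only numerical columns.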
• Seventy-five percent (75%) of applicants fall under the acceptable debt-to-income (DTI) ratio for typical loan applicants. However, the maximum debt-to-income ratio is very high (~203).
• A very high number (4771) of loan applicants successfully repaid the loan.
• The target variable is imbalanced (80% repaid; 20% defaulted).
2.3 Data Preparation
Data preparation is a key step in the machine learning pipeline, as the predictive model is only as good as the data that is fed to it. The given dataset had missing values, outliers, categorical features, feature values at different scales, etc. The following approaches were used for each specific case:
• Missing values: median imputation was used for numerical features, and the mode was used for categorical features.
• Outliers: outliers were identified using the Inter Quartile Range (IQR) technique, and the values were then clipped to either the lower whisker or the upper whisker.
• Categorical features: REASON and JOB were encoded as numerical values using one-hot encoding.
• Data scaling: StandardScaler was used to standardize features by subtracting the mean and then scaling to unit variance, thereby bringing all features to a comparable measurement scale.
• Multicollinearity: the feature with the highest VIF value (VALUE) was dropped.
• Feature engineering: additional features such as the VALUE-to-LOAN ratio, the DELINQ-to-CLNO ratio, and MORTDUE + LOAN were created. In addition, binary flag input features were created to indicate whether a row or a column contained a missing value that was imputed. These additional binary flag columns tell the model whether a row contained a missing value, which may help improve prediction accuracy.
Fig. 4. Correlation Heatmap for All Independent Variables
     Feature   VIF
0    LOAN      5.456789
1    MORTDUE   16.554373
2    VALUE     20.572541
3    YOJ       2.587619
4    CLAGE     6.401455
5    CLNO      7.370337
6    DEBTINC   10.731341
Fig. 5. Relation between MORTDUE and VALUE
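Two of the preparation steps described above, median imputation with a missing-value flag and IQR-based clipping, can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the author's code: the sample values are made up, the function names are hypothetical, and the "whiskers" are taken to be the usual fences Q1 - 1.5·IQR and Q3 + 1.5·IQR.

```python
import statistics


def impute_median_with_flag(values):
    """Replace None by the median of the observed values.

    Returns the filled list and a 0/1 flag list marking imputed positions.
    """
    observed = [v for v in values if v is not None]
    median = statistics.median(observed)
    flags = [1 if v is None else 0 for v in values]
    filled = [median if v is None else v for v in values]
    return filled, flags


def clip_to_whiskers(values, k=1.5):
    """Clip values to [Q1 - k*IQR, Q3 + k*IQR] (assumed whisker definition)."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [min(max(v, lower), upper) for v in values]


# Hypothetical debt-to-income values with two gaps and one extreme outlier.
debtinc = [29.1, 34.8, None, 39.0, 33.0, 203.3, 31.5, None, 36.2]
filled, flags = impute_median_with_flag(debtinc)
clipped = clip_to_whiskers(filled)
print(flags)
print(clipped)
```

The flag list plays the role of the binary indicator columns mentioned above: it preserves the information that a value was missing even after the gap has been filled.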
Fig. 6. Reason for Taking a Loan
Fig. 7. (a) YOJ versus BAD; (b) CLAGE versus BAD
2.4 Modeling
Machine learning models can help automate the process of identifying customers who are likely to default, thereby helping businesses with efficient data management and quick decision-making. Predicting whether an individual will default is a supervised binary classification problem. Training a model that learns as much as possible from past customers' profiles can help minimize the risk of future loan defaults.
Baseline Model Building. A baseline model test harness was created to evaluate seven different classification algorithms:
1. Linear models: Logistic Regression (LR) and Linear Discriminant Analysis (LDA)
2. Non-linear models: KNN, Decision Tree (DT), Support Vector Classifier (SVC), Gradient Boosting Machine (GBM), and Random Forest (RF)
The performance of these baseline algorithms is compared in the box and whisker plots in Fig. 8(a) and (b). The baseline algorithms were trained with default parameters and 10-fold cross-validation. The models were evaluated using the accuracy metric as a starting point, to get a quick idea of which model has the most predictive power for the given binary classification task. With a mean accuracy of 89.7% (standard deviation 1.09%), Random Forest was the best-performing baseline model. It also achieved a recall of 65.25%, the highest among all the algorithms. The mean accuracy of the linear models, LR and LDA, was significantly lower (80.7%). The impact of class imbalance was seen across all models, with the Logistic Regression and SVM classifiers performing the worst. This baseline performance gives us enough confidence to investigate the Random Forest model further as a viable solution to the given classification problem, since it achieves both a higher recall and a higher accuracy.
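A baseline test harness of the kind described above can be sketched with scikit-learn as follows. This is only an illustration: the data are a synthetic stand-in for the prepared HMEQ features (generated with roughly the same 80/20 class imbalance), only two of the seven algorithms are included, and the forest size is reduced to keep the run short.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_validate

# Synthetic stand-in for the prepared dataset (assumption, not the HMEQ data).
X, y = make_classification(
    n_samples=600, n_features=10, n_informative=5,
    weights=[0.8, 0.2], random_state=7,
)

# Two representative baselines; the remaining algorithms plug in the same way.
models = {
    "LR": LogisticRegression(max_iter=1000),
    "RF": RandomForestClassifier(n_estimators=50, random_state=7),
}

# 10-fold cross-validation, stratified so each fold keeps the class ratio.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=7)

results = {}
for name, model in models.items():
    scores = cross_validate(model, X, y, cv=cv, scoring=("accuracy", "recall"))
    results[name] = {
        "accuracy_mean": scores["test_accuracy"].mean(),
        "accuracy_std": scores["test_accuracy"].std(),
        "recall_mean": scores["test_recall"].mean(),
    }
    print(name, results[name])
```

Reporting recall next to accuracy matters here because, with an imbalanced target, a model can reach high accuracy while missing many defaulters.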
The selection of the optimal model also depends on the use case. With default as the positive class, a Type II error (false negative) is more relevant when the goal is to minimize the incorrect classification of defaulting borrowers as creditworthy. A Type I error (false positive), on the other hand, is more relevant when the goal is to minimize denying a loan to a creditworthy customer.
Tuning the Random Forest. Grid search with cross-validation (GridSearchCV) was used for hyperparameter tuning in an attempt to find the optimum hyperparameter values. GridSearchCV performs an exhaustive search over the specified parameter values of a model. The following parameters were optimized by cross-validated grid search over a parameter grid: (i) n_estimators: the number of trees in the forest, (ii) min_samples_split: the minimum number of samples required to split an internal node, (iii) min_samples_leaf: the minimum number of samples required to be at a leaf node, and (iv) max_features ("auto", "sqrt", "log2", or None): the number of features to consider when looking for the best split. The best parameters that came out of the grid search were:
RandomForestClassifier(class_weight='balanced', criterion='entropy', max_depth=7, max_features=3, max_samples=0.9, min_samples_leaf=20, n_estimators=160, random_state=7)
Fig. 8. (a) Box and Whisker Plots of Algorithm Performance: Accuracy; (b) Box and Whisker Plots of Algorithm Performance: Recall
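The grid search over the four listed parameters can be sketched as below. The grid values, the synthetic data, the 3-fold split, and the choice of recall as the scoring metric are all assumptions made for illustration; the paper does not state the grid it searched. The option "auto" is left out because recent scikit-learn releases no longer accept it for max_features.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in for the prepared dataset (assumption).
X, y = make_classification(
    n_samples=400, n_features=10, n_informative=5,
    weights=[0.8, 0.2], random_state=7,
)

# Small illustrative grid over the four parameters named in the text.
param_grid = {
    "n_estimators": [20, 40],
    "min_samples_split": [2, 10],
    "min_samples_leaf": [1, 20],
    "max_features": ["sqrt", "log2", None],
}

search = GridSearchCV(
    RandomForestClassifier(class_weight="balanced", random_state=7),
    param_grid,
    scoring="recall",  # assumed objective: recall on the default class
    cv=3,
)
search.fit(X, y)

best_params = search.best_params_
print(best_params, round(search.best_score_, 3))
```

Because the search is exhaustive, its cost grows with the product of the list lengths (here 2 x 2 x 2 x 3 = 24 candidates per fold), which is why grids are usually kept coarse at first.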
Interpreting the Random Forest Model. Interpretability is important in this case, as we wish to understand and explain the phenomenon being modeled and to begin to trust the model's decisions. We used the feature importance tool provided through the Random Forest model [8, 10]. The default feature importance plot and the permutation importance plot [11] are shown in Fig. 9 and Fig. 10, respectively.
Fig. 9. Random Forest Model Default Feature Importance
The permutation importance strategy does not require retraining the model after permuting each feature column; it only re-runs the perturbed test samples through the already trained model. As seen in Fig. 9 and Fig. 10, both the default RF feature importance and the permutation importance plots indicate that the debt-to-income ratio is the most important feature. It is interesting to note that permutation importance ranks REASON_HomeImpl as the third most important feature, whereas it is ranked lower on the default RF feature importance plot. Similarly, YOJ is ranked lower on the importance scale by the permutation importance technique. Surprisingly, for the dataset provided, DELINQ (number of delinquent credit lines) and DEROG (number of major derogatory reports) have no importance in determining whether an applicant will default.
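The two importance measures compared above can be computed side by side with scikit-learn, as in the following sketch. The data are synthetic (three informative and three noise features), so the rankings it prints say nothing about the HMEQ features; the point is only to show that the permutation scores come from an already fitted model and a held-out set.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Synthetic data (assumption): 3 informative features plus 3 noise features.
X, y = make_classification(
    n_samples=500, n_features=6, n_informative=3, n_redundant=0, random_state=7,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=7,
)

model = RandomForestClassifier(n_estimators=50, random_state=7)
model.fit(X_train, y_train)

# Default (impurity-based) importances, computed during training.
mdi = model.feature_importances_

# Permutation importances: shuffle one column at a time on the test set and
# measure the drop in score, without retraining the model.
perm = permutation_importance(model, X_test, y_test, n_repeats=5, random_state=7)

for i in range(X.shape[1]):
    print(f"feature {i}: MDI={mdi[i]:.3f} permutation={perm.importances_mean[i]:.3f}")
```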
Fig. 10. Random Forest Permutation Importance
We can now look to further improve the performance of the tuned Random Forest model. One technique is feature engineering, that is, creating new features from the existing ones, for example the ratio of the approved loan amount to the existing property value. Further, we can also look at the individual decision trees generated by the Random Forest algorithm. Figure 11 shows a plot of one such decision tree, visualizing how the split was made at every node of that estimator.
Fig. 11. The Single Estimator Decision Tree from the Random Forest Model
2.5 Evaluation
In the application discussed in this paper, the bank is interested in identifying applicants who are likely to default. In this business context, False Positives (applicants that are flagged as possible defaults) are more acceptable than False Negatives (defaulting applicants that are not identified correctly). In essence, we optimize for Recall while maintaining acceptable Precision and Accuracy, because failing to identify applicants who are likely to default can lead to financial loss for the bank. Tools such as a confusion matrix and a classification report were used to determine the performance of the final classifier model, as shown in Fig. 12.
Fig. 12. Confusion Matrix and Classification Report for Train/Test datasets
Accuracy is defined as the ratio of the number of samples correctly classified by the classifier to the number of samples for a given test data set. The formula is as follows:

$$\mathrm{Accuracy} = \frac{\mathrm{TruePositives} + \mathrm{TrueNegatives}}{\mathrm{TruePositives} + \mathrm{TrueNegatives} + \mathrm{FalsePositives} + \mathrm{FalseNegatives}} \tag{1}$$

Recall, also known as the True Positive Rate, is the fraction of all positive instances the classifier correctly identifies as positive.

$$\mathrm{Recall} = \frac{\mathrm{TruePositives}}{\mathrm{TruePositives} + \mathrm{FalseNegatives}} \tag{2}$$

$$\mathrm{Precision} = \frac{\mathrm{TruePositives}}{\mathrm{TruePositives} + \mathrm{FalsePositives}} \tag{3}$$

F1-Score is defined as a harmonic mean of Precision and Recall. The formula for F1-score is:

$$\mathrm{F1\text{-}score} = 2 \cdot \frac{\mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} \tag{4}$$
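Equations (1) to (4) can be evaluated directly from the four cells of a confusion matrix. The sketch below does so for a hypothetical set of counts; the numbers are made up for illustration and are not the results reported in Fig. 12.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute accuracy, recall, precision and F1 from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)          # Eq. (1)
    recall = tp / (tp + fn)                             # Eq. (2)
    precision = tp / (tp + fp)                          # Eq. (3)
    f1 = 2 * precision * recall / (precision + recall)  # Eq. (4)
    return {"accuracy": accuracy, "recall": recall, "precision": precision, "f1": f1}


# Hypothetical counts: 100 actual defaulters, 900 actual repayers.
metrics = classification_metrics(tp=80, tn=850, fp=50, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

In this made-up example the accuracy is high (0.93) while the precision is much lower (about 0.62), which illustrates why a single metric is not sufficient on an imbalanced target.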
ROC Curve. The receiver operating characteristic (ROC) curve is a graphical plot that illustrates the performance of a binary classifier. The true positive rate (TPR) is the proportion of observations that were correctly predicted to be positive out of all positive observations (TP/(TP + FN)). Similarly, the false positive rate (FPR) is the proportion of observations that are incorrectly predicted to be positive out of all negative observations (FP/(TN + FP)). For example, in the loan default application, the true positive rate is the rate at which applicants who default are correctly identified as defaulters. The ROC curve is generated by plotting the TPR against the FPR at various threshold settings. The initial output of an RF classifier is not a label; instead, it is the probability that a particular observation belongs to a certain class. This probability is then turned into a class by selecting a threshold, which is 0.5 by default. Since misidentifying a defaulting applicant could lead to financial loss, we want to optimize for Recall on the positive class, that is, to minimize the false negatives. We adjust the threshold to optimize for this outcome, and the ROC curve (True Positive Rate versus False Positive Rate) was used to visualize this trade-off, as shown in Fig. 13.
Fig. 13. ROC Curve for Tuned RF Model
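The effect of moving the decision threshold can be illustrated with a few lines of Python. The labels and probabilities below are hypothetical, and the function name is illustrative; each threshold yields one (TPR, FPR) point on the ROC curve.

```python
def rates_at_threshold(y_true, y_prob, threshold):
    """Return (TPR, FPR) when probabilities >= threshold are labelled positive."""
    tp = fp = tn = fn = 0
    for label, prob in zip(y_true, y_prob):
        predicted = 1 if prob >= threshold else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 0:
            tn += 1
        else:
            fn += 1
    return tp / (tp + fn), fp / (fp + tn)


# Hypothetical labels (1 = default) and predicted default probabilities.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_prob = [0.9, 0.6, 0.45, 0.3, 0.55, 0.4, 0.35, 0.2, 0.1, 0.05]

default_point = rates_at_threshold(y_true, y_prob, 0.5)
lowered_point = rates_at_threshold(y_true, y_prob, 0.3)
print("threshold 0.5:", default_point)
print("threshold 0.3:", lowered_point)
```

Lowering the threshold from 0.5 to 0.3 raises the recall on this toy data from 0.5 to 1.0, at the price of a higher false positive rate, which is the trade-off the ROC curve makes visible.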
2.6 Deployment/Monitoring
Once the model is built and tested and its performance metrics are acceptable, one can build machine learning pipelines for model deployment. The model can also be made available to underwriters as a web-based service or through cloud-based platforms. The key point is that, once the model is deployed, its performance needs to be monitored regularly and, if necessary, the model retrained with new data.
3 Conclusion
The machine learning model development process for loan default prediction using the industry-recognized CRISP-DM methodology was discussed in the paper. The six phases of the CRISP-DM process were applied at each stage of model development. The dataset used was the Home Equity dataset (HMEQ), which contained 5,960 home equity loans. The Random Forest model was proposed as the final solution design for this application. Its higher recall and accuracy give confidence in the model's performance. Instead of searching for the most important feature while splitting a node, it searches for the best feature among a random subset of features. The baseline random forest model was then calibrated by tuning the hyperparameters: (i) the maximum number of trees, (ii) the number of features selected per tree, (iii) the maximum depth of each tree, and (iv) the minimum number of observations for each leaf. Performance results were presented in the form of a 2-by-2 confusion matrix and classification reports for easy understanding. Further, RF's ability to compute feature importance (default RF feature importance and permutation importance) was fully exploited to get insights into the most powerful variables for the prediction from the given dataset. Debt-to-income ratio, CLAGE, LOAN, and MORTDUE were some of the most important features that impacted the prediction. The random forest algorithm provides an easy way to measure the relative importance of each feature for the prediction. By looking at the feature importance, we can decide which features to possibly drop using the recursive feature elimination technique because they do not contribute enough (or sometimes nothing at all) to the prediction process. This is important because a general rule in machine learning is that the more features you have, the more likely your model will suffer from overfitting, and vice versa.
The benefit of the proposed design solution is that the bank can deploy a model that can automatically filter out potential customers who might default thus reducing data management requirements and the burden on underwriters. However, an area of concern is the stability of the machine learning model in scenarios where there are structural changes over time in a loan processing portfolio. In these situations, financial institutions should consider increasing monitoring and back-testing frequency to keep model behavior on track.
References
1. Zipporah, L.: Understanding CRISP-DM and its importance in Data Science projects. https://medium.com/analytics-vidhya/understanding-crisp-dm-and-its-importance-in-data-science-projects-91c8742c9f9b. Accessed 15 Sept 2022
2. Hota, N.: What is CRISP DM? https://www.datascience-pm.com/crisp-dm-2/. Accessed 10 Aug 2022
3. Shi, S., Tse, R., Luo, W., D'Addona, S., Pau, G.: Machine learning-driven credit risk: a systemic review. Neural Comput. Appl. 34, 14327–14339 (2022)
4. Luo, C., Wu, D., Wu, D.: A deep learning approach for credit scoring using credit default swaps. Eng. Appl. Artif. Intell. 65, 465–470 (2017)
5. Chen, Y.R., Leu, J.S., Huang, S.A., et al.: Predicting default risk on peer-to-peer lending imbalanced datasets. IEEE Access 9, 73103–73109 (2021)
6. Turkson, R., Baagyere, E., Wenya, G.: A machine learning approach for predicting bank credit worthiness. In: 2016 Third International Conference on Artificial Intelligence and Pattern Recognition (AIPR) (2016)
7. Nureni, A., Oluwadunsin, A.: Loan approval prediction based on machine learning approach. FUDMA J. Sci. (2022)
8. Breiman, L.: Random forests. Mach. Learn. 45, 5–32 (2001)
9. Home Equity Dataset (HMEQ_Data). https://www.kaggle.com/datasets. Accessed 14 Apr 2022
10. Parr, T., Turgutlu, K., Csiszar, K., Howard, J.: Beware default Random Forest Importance. https://explained.ai/rf-importance/#corr_collinear. Accessed 21 Aug 2022
11. Permutation Importance vs Random Forest Feature Importance (MDI), scikit-learn documentation. https://scikit-learn.org/stable/auto_examples/inspection/plot_permutation_importance.html. Accessed 5 Sept 2022
A Bidirectional Encoder Representations from Transformers and CNN Based Prediction Model on Competitive Products
Yuefei Chen and Xinzhe Wang
Columbia University in the City of New York, New York, USA
[email protected], [email protected]
Abstract. This paper deals with Automatic Sales Leads Finding. The authors aim to find and classify the potential customers of a group of companies that have similar products. The authors propose a novel model and measurements to explore and find the potential customers of each brand. The model is based on bidirectional encoder representations from transformers and convolutional neural networks to classify potential customers. Two popular brands, Apple and Samsung, are selected to evaluate the model. Public tweets related to these brands are collected as the data for this paper, and the final accuracy is above 75%.
Keywords: Natural Language Processing · Convolutional Neural Network · Bidirectional Encoder Representations from Transformers · Automatic Sales Leads Finding
1 Introduction
1.1 Background
Social networks are developing day by day. Especially with the rapid development of the mobile Internet in the past decade, social platforms are expanding every day. The public's reliance on social media networks and on staying connected through them has gained rapid momentum, especially during the current Covid-19 pandemic lockdown measures, in many areas from education to health and entertainment. With the positive cycle of the increasing popularity of mobile social networks and the increased time users spend on them, users generate more and more traces on the Internet. There is extensive research focusing on digging out hidden information beneath the ocean of social network data. Several studies [5,6,15,18] used different methods to predict users' personalities from their public data in different social networks. There are various theories in psychology that attempt to generalize human personality. Among them, the Myers-Briggs Type Indicator (MBTI) is one of the most popular and most widely used theories in business activities.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023. K. Arai (Ed.): SAI 2023, LNNS 739, pp. 539–551, 2023. https://doi.org/10.1007/978-3-031-37963-5_38
MBTI is an introspective self-report questionnaire indicating differing psychological preferences in how people perceive the world and make decisions. The MBTI test divides one's personality into four dimensions and uses a letter to represent each dimension: I (Introversion) or E (Extraversion), S (Sensing) or N (Intuition), T (Thinking) or F (Feeling), J (Judging) or P (Perceiving). For example, ENTP represents the personality of extraversion, intuition, thinking, and perceiving. In total, there are 16 types of personality. MBTI is widely used in industry, both in enterprises for employee personality classification and employee management and for personal development such as academic or career planning. The study [17] examined the proportion of employees with different personalities in a given company and concluded that there is a specific relationship between the MBTI indicators and corporate style. Another paper identified the MBTI type as a valid and reliable assessment. The research in [11] examined the relationship between students' MBTI type and their choice of major and overall GPA in a college.
1.2 Motivation
The huge amount of traces in social networks may reveal users' personalities, preferences, and tendencies in choosing products. It is very promising to profile users and guide business activities by mining and analyzing users' social information. The ever-expanding social networks provide a huge platform for advertising and marketing. The huge user volume on today's social networks is both an opportunity and a challenge. It is not easy to find a set of potential customers for a particular business among an astronomical number of users before they explicitly show their interest. But the benefits of finding potential customers are very obvious as well: (1) minimizing the cost of advertising and (2) avoiding the spread of useless information, which improves the user experience of the social network.
1.3 Goal
The goal of our work is to find a user's tendency among a group of companies that have similar products, given the user's public social network content. In the paper, the top two best-selling mobile phone brands, Apple and Samsung, are used as an example of such a company group. Given a randomly chosen Twitter user, the model developed in the paper should determine which brand the user is more inclined to.
1.4 Progress and Contribution
Progress. At first, we obtained a suitable dataset from Kaggle to train our MBTI classification model, which is based on Bidirectional Encoder Representations from Transformers (BERT). We listed a group of companies, obtained the latest 10,000 fans of every brand, and scraped their tweets.
Secondly, we continued to obtain raw data from Twitter and cleaned it for future use. The model was further developed and tuned to increase the accuracy of MBTI prediction. Additionally, we built a web front-end framework to display our work. New features were added to the model to assist the classification of users' preferences and tendencies. In the final part, the BERT MBTI prediction model and the features were integrated to yield the final classification result. The web report was supplemented with data visualization and model prediction results.
Contribution. In the end, this paper developed a prediction model that uses a user's public tweet data to predict the user's MBTI type, the topics that the user might be interested in, the sentiment of the user's tweets, and finally the brand, among a pool of brands, that the user is more likely to be inclined to. To our knowledge, a model of this type is rarely seen. An overview of the model is shown in Fig. 1. Firstly, the BERT model is trained and tested using the MBTI dataset from Kaggle. Secondly, given a Twitter user, all of the user's tweets are scraped and cleaned. Thirdly, the cleaned data are fed into the Google API and the trained BERT model for prediction. The results from the Google API and BERT are then used by the CNN for the final prediction.
2 Related Work
In this section, two categories of related work are reviewed: user analysis on social media platforms and MBTI personality analysis. With the continuous development of Internet social networking platforms, users are leaving more and more personal traces on them. Analyzing and profiling users through their social network content is an increasingly hot topic. The study in [13] conducted personality prediction on data collected from the online forum Reddit. The research in [18] analyzed the personality of Facebook users using their posts as a data source. The author in [3] used tweets as a data source to derive personality traits. Different ML models were used to examine the correlations between different MBTI types, and their results were compared. The result shows that random forest has the best performance in predicting both the small binary categories and the overall MBTI type. The researchers in [12] used tweets as a data source and compared the Big Five and MBTI prediction performance of six different supervised machine learning methods, using TF-IDF as the feature extraction method. Although the MBTI has been criticized in the psychology community, where its classification is considered too vague and subject to a "Barnum effect", it is one of the most widely considered and used models for depicting one's personality. Many researchers work on MBTI prediction using machine learning or natural language processing methods. The study in [4] proposed a machine learning model based on gradient boosting to predict users' MBTI types. The author in [10] did a comprehensive survey of the performance of different models, such as SVM, LSTM, and deep learning methods, in predicting MBTI types. The research in [9] proposed a model with a compound class label component to improve prediction accuracy. The study in [7] compared model performance under different features and achieved a prediction accuracy of 88%.
Fig. 1. Overview of the Model.
3 Methodology
In this section, we first introduce the multi-source data used in this paper, and we then explain the details of our models in three parts.
3.1 Data Collection
Two data sources are used in this paper: MBTI data obtained from Kaggle for model training, and tweet data obtained from Twitter for integrated model training and testing.
MBTI Data. This dataset comes from Kaggle and consists of more than 8600 subjects. Each subject represents an interviewee, along with his or her text material and a tested standard MBTI type. The MBTI type distribution in the dataset matches the estimated real-world MBTI type distribution. This is part of the reason we chose MBTI as the personality indicator, since datasets containing Big Five model information are no longer available. This dataset is then used for training and testing the BERT MBTI classification model.
Tweet Data. Twitter serves as the data source for this paper since it is one of the biggest social network platforms. Followers of Apple's and Samsung's official Twitter accounts were collected first and regarded as potential customers. All tweets of the nearly 4000 fans of these brands were obtained using the official Twitter Python APIs, and all of the data are publicly available. In addition, fans of an arbitrary brand were collected, along with their respective tweets. We checked the accounts they follow to make sure they are neither Apple fans nor Samsung fans. This pool of fans is then used in the classification test. After the tweet content was obtained, non-English content was removed, including tweets from non-English users and tweets consisting entirely of emojis or hyperlinks. The tweets belonging to each user were then concatenated and treated as one long text, which helps increase the accuracy of MBTI prediction and prevents followers who tweet a lot from taking a disproportionate share of the company's MBTI prediction.
3.2 Models
In the model description, the paper introduces the three parts of the model. The structure of the model is shown in Fig. 2. The model is built from three main parts. The first part is the BERT model, which takes a user's posts as input and predicts the user's MBTI personality. The second part is the Google Cloud Natural Language Processing API, which implements two functions in the model: Sentiment Analysis and Multi-label classification. The results of these two parts are then combined as inputs for the third part, a convolutional neural network. It is a deep neural network that classifies users as potential customers of one of the products (Apple or Samsung) based on their sentiments, MBTI personality, and topic interests.
Fig. 2. In the Workflow Diagram of the Model, the Inputs of All Three Starting Nodes are the Posts of Each User. The Output is the Potential Customer Result of the User.
PART1: BERT Model Prediction of MBTI. The BERT model is a bidirectional encoder representation model based on transformers. The transformer connects the encoder and decoder through attention mechanisms; it is based solely on attention, dispensing with recurrence and convolutions entirely [19]. In this paper, to predict the MBTI, the model is based on BERT and classifies users, given their combined posts, into the 16 MBTI personality types. The model structure is shown in Fig. 3. The posts of each user first need to be preprocessed. The text is tokenized using the tokenizer of the pre-trained BERT model "bert-base-uncased". The tokens are then padded into sequences of equal length for BERT processing. After that, an attention mask is built from the tokenized and padded sentences. It is an array indicating which positions hold padding tokens. This mask is passed to the self-attention mechanism in BERT and tells it not to include the padded tokens in its interpretation. The sentences are then fed as inputs into the BERT layer, which is also the pre-trained "bert-base-uncased" model. This pre-trained model is intended to be fine-tuned on downstream tasks that use the whole sentence (potentially masked) to make decisions, such as sequence classification, token classification, or question answering [2]. The output of the BERT layer is passed to a dense layer with an activation function. The output has 16 dimensions, one for each MBTI personality type. The result is a table of probabilities between 0 and 1, one per MBTI personality type, and the type with the highest probability is taken as the user's MBTI personality.
PART2: Sentiment and Topic Interests Classification. This part is based on the Google Cloud Natural Language Processing API [1]. The Natural Language API has five modules: Analyzing Sentiments, Analyzing Entities, Analyzing Entity Sentiments, Analyzing Syntax, and Classifying Content. According to the needs of the paper, Analyzing Sentiments and Classifying Content are selected to analyze the posts. The API is shown in Fig. 4.
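The padding, attention-mask, and highest-probability steps described for PART1 can be sketched without any deep learning library, as below. Everything in this block is an assumption made for illustration: the token ids are example values, the helper names are hypothetical, and the order of the 16 output classes is arbitrary, since the paper does not state it. In practice these steps would be carried out by the tokenizer and model of a BERT library.

```python
PAD_ID = 0  # assumed id of the padding token


def pad_and_mask(token_ids, max_len, pad_id=PAD_ID):
    """Truncate or pad a token id list to max_len and build its attention mask.

    The mask is 1 for real tokens and 0 for padding positions.
    """
    ids = list(token_ids[:max_len])
    mask = [1] * len(ids)
    padding = max_len - len(ids)
    return ids + [pad_id] * padding, mask + [0] * padding


# Assumed class order for the 16-way output: ISTJ, ISTP, ISFJ, ..., ENFP.
MBTI_TYPES = [a + b + c + d for a in "IE" for b in "SN" for c in "TF" for d in "JP"]


def predict_type(probabilities):
    """Return the MBTI type whose output probability is highest."""
    best = max(range(len(probabilities)), key=lambda i: probabilities[i])
    return MBTI_TYPES[best]


ids, mask = pad_and_mask([101, 7592, 2088, 102], max_len=6)
print(ids, mask)

probs = [0.02] * 16
probs[MBTI_TYPES.index("ENTP")] = 0.7
print(predict_type(probs))
```

The mask lets the self-attention layers ignore the padded positions, so that posts of different lengths can share one batch without the padding influencing the prediction.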
Analyzing Sentiments. Analyzing Sentiments is used to evaluate the sentiment of each user's posts. It analyzes the attitude expressed within a document and returns a value between -1 and 1. For example, one user's post said, "Yay! I just scored more than 100 of Red Cable Club members! Try the quiz now and win RedCoins on Red Cable Priv." In this sample, the user scored 100 on a club member quiz, so the post shows happiness and excitement, and the result is positive, with a score of 0.7. All the posts of each user are combined into one line, and the API returns the sentiment of each user.
Fig. 3. The Pre-Trained BERT Model Structure. The Inputs of the BERT Layer are Tokenized and Padded Posts with Attention Masks.
Fig. 4. The Google Cloud Natural Language Processing API.
Classifying Content. Classifying Content extracts topics from a post. Because the classification is multi-label, a document can be assigned multiple labels. As an example, one user's post said, "...spent this beautiful day with my team... golf is a great sport for team building. AND...I got a birdie!!! Par 3 hole 8! I have no idea what that means but guess what? It was caught on video! liveinspired justdoit teamgolf Southwind." In this example, the user posts that he or she played a golf game and got a birdie, so the post topics are Sport, individual sport, and golf. The topic interests explored in this paper are based on Classifying Content. The program first classifies each post into several topics. These topics are then counted per user name. The topics of the posts are very broad and scattered, which means that the counts for some topics are sparse. Thus, the topics were refined, and some topics were filtered out due to their sparsity and small effect on the model.
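The counting and filtering of topics described above can be sketched as follows. The input is assumed to be a list of (user, topics) pairs as they might come back from the content classifier; the user names, topic labels, and the `min_users` sparsity threshold are hypothetical, since the paper does not state how sparse topics were identified.

```python
from collections import Counter


def topic_counts_by_user(labelled_posts):
    """Count how often each topic occurs in the posts of each user."""
    counts = {}
    for user, topics in labelled_posts:
        counts.setdefault(user, Counter()).update(topics)
    return counts


def drop_sparse_topics(counts, min_users=2):
    """Keep topics seen for at least min_users users; return topics and vectors."""
    users_per_topic = Counter()
    for topic_counter in counts.values():
        users_per_topic.update(topic_counter.keys())
    kept = sorted(t for t, n in users_per_topic.items() if n >= min_users)
    features = {user: [c[t] for t in kept] for user, c in counts.items()}
    return kept, features


# Hypothetical classifier output: one entry per post.
posts = [
    ("u1", ["Sports", "Golf"]),
    ("u1", ["Sports"]),
    ("u2", ["Sports", "Mobile Phones"]),
    ("u3", ["Mobile Phones"]),
]
counts = topic_counts_by_user(posts)
kept_topics, features = drop_sparse_topics(counts)
print(kept_topics)
print(features)
```

Each user ends up with a fixed-length vector of topic counts, which is the form in which topic interests can be combined with the sentiment score and the MBTI probabilities.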
PART3: Potential Customers Prediction. The potential customer prediction is based on Convolutional Neural Network classification. CNNs are primarily used in difficult image-driven pattern recognition, but they also perform well on datasets with a large number of features. In this paper, the final feature set collected from the API and the BERT model contains 104 features: 87 from topic interest extraction, one from sentiment analysis, and 16 from MBTI personality prediction. The convolutional neural network contains several layers. The first layer is the embedding layer, whose size is 1 × 104 × 10 because the inputs have a single channel and 104 features. This layer transforms the inputs into lower-dimensional dense vectors. The next layer is a Conv1d layer with a convolutional kernel. It is the main building block of the CNN model, and its parameters are learned throughout training. A max pooling layer is then added to downsample the output of the Conv1d layer; it takes the maximum value within a sliding window. Its output is the input of the flatten layer, which flattens it into 816 × 1 values. At the end of the model, two dense layers with activation functions reduce the vector to a classification result. The model output is between 0 and 1. A value closer to 0 indicates that the user is more similar to Samsung followers, while a value close to 1 indicates that the user is more similar to Apple followers. The CNN model structure is summarized in Fig. 5.
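The layer sizes quoted above can be checked with simple shape arithmetic. The paper gives the 104 input features, the embedding width of 10, and the flattened size of 816, but not the kernel size, the number of filters, or the pooling window. The values used below (kernel 3, 16 filters, pool 2) are therefore an assumption: one configuration that happens to reproduce 816, not necessarily the one the authors used.

```python
def conv1d_output_length(length, kernel_size, stride=1, padding=0):
    """Output length of a 1-D convolution along the sequence axis."""
    return (length + 2 * padding - kernel_size) // stride + 1


def maxpool1d_output_length(length, pool_size):
    """Output length of 1-D max pooling with stride equal to the window size."""
    return length // pool_size


N_FEATURES = 87 + 1 + 16   # topic interests + sentiment + MBTI probabilities
EMBED_DIM = 10             # embedding width stated in the text

# Assumed hyperparameters (not given in the paper).
KERNEL_SIZE, FILTERS, POOL_SIZE = 3, 16, 2

conv_len = conv1d_output_length(N_FEATURES, KERNEL_SIZE)
pool_len = maxpool1d_output_length(conv_len, POOL_SIZE)
flat_size = pool_len * FILTERS

print("embedding output:", (1, N_FEATURES, EMBED_DIM))
print("after Conv1d:", (conv_len, FILTERS))
print("after max pooling:", (pool_len, FILTERS))
print("after flatten:", flat_size)
```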
4 Result
In this section, we first evaluate the MBTI prediction part, which is based on the BERT model, and compare it with other baseline models. We then analyze the sentiment and topic interests extracted from the users’ posts. Finally, potential customers are predicted from the results of the sentiment analysis, the topic analysis, and the BERT model.
4.1 MBTI Prediction
Our group tried three supervised models to predict the personality of the users: BERT, CNN, and RNN. The CNN model ends with a dense layer of four sigmoid activations that predict the four dimensions of personality. Each value is between 0 and 1 and represents the predicted class probability. For the RNN, four separate models are trained, each with one sigmoid activation, so that together they predict the four dimensions of personality. Neither of these two models performs as well as the BERT model, which is why BERT was chosen as the component for predicting potential customers. Table 1 compares the three models. The test accuracy of both the RNN model and the CNN model is below 50%. BERT performs best on the testing set, with an accuracy of 68%, and it is selected as the model to predict the MBTI of users.
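To make the four-sigmoid output concrete, the sketch below turns four predicted probabilities into a four-letter MBTI type. The 0.5 threshold and the choice of which letter corresponds to a high probability are assumptions made for illustration; the text does not specify the encoding.

```python
# Letter pairs for the four MBTI dimensions. Which letter maps to a
# probability >= threshold is an assumption of this sketch.
DIMENSIONS = [("I", "E"), ("S", "N"), ("T", "F"), ("J", "P")]


def mbti_from_probs(probs, threshold=0.5):
    """Map four sigmoid outputs to a four-letter MBTI type."""
    if len(probs) != len(DIMENSIONS):
        raise ValueError("expected one probability per MBTI dimension")
    letters = []
    for (low_letter, high_letter), p in zip(DIMENSIONS, probs):
        letters.append(high_letter if p >= threshold else low_letter)
    return "".join(letters)


print(mbti_from_probs([0.2, 0.1, 0.3, 0.9]))
```

The same function applies whether the four probabilities come from one network with four sigmoid units (the CNN variant) or from four separate single-output models (the RNN variant).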
A Bidirectional Encoder Representations
Fig. 5. The Convolutional Neural Network Model.

Table 1. The MBTI Prediction Performance Table.
Model Type    Testing Set Accuracy
RNN model     20.8%
CNN model     43%
BERT model    68%
4.2 Sentiment Analysis and Topic Interests Selection
Sentiment Analysis. The paper analyzes the sentiment of each user based on their posts. The relation between potential customers and sentiment is shown in Fig. 6. The graph shows the sentiment distribution for each kind of user; the most common sentiment value in both groups is 0, which represents a neutral attitude. The two groups nevertheless have different sentiment distributions, which suggests that sentiment has at least a weak relation with being a potential customer of Apple or Samsung.
Fig. 6. The Distribution of Sentiment under Different Kinds of Potential Customers, When Apple = 0 (Blue Charts), The Bars belong to Samsung. When Apple = 1 (Yellow Charts), The Bars belong to Apple.
Topic Interests Selection. The topic interests are also based on the users’ posts. In the training set, there are over 500 topic interests across the 1750 users, and most of them have only 1 or 2 related posts. These topics were manually refined and filtered into 87 features to which the model is sensitive. All selected features are listed in Appendix A.
4.3 Potential Customers Prediction
Potential customer prediction is based on the MBTI personality, sentiment, and refined topic interests. The Convolutional Neural Network model is trained to predict the result, which is a binary value of 0 or 1. A value of 1 means that the user is similar to Apple’s fans, that is, he or she is a potential customer of Apple provided that the user is not already a fan of Apple. Likewise, 0 means that the user is a potential customer of Samsung. Table 2 compares the two prediction models. The baseline method is a Support Vector Machine (SVM) with a Radial Basis Function kernel [16], γ = 5, and regularization parameter 1, which performed better than the other parameter settings tried. The SVM model has a higher training accuracy but a lower testing accuracy, which suggests that it may be overfitting. In contrast, the CNN model has a higher testing accuracy and thus better performance. The ROC curve of the model is displayed in Fig. 7. The AUC value is 0.8225, so the model can be considered to have excellent discriminating capability [14]. More importantly, a user pool was built from nearly a hundred randomly selected users who are fans of neither Apple nor Samsung and might be potential customers of either. For the part of the pool reported here, 19 users are predicted to be potential fans of Apple and the other 44 users are predicted to lean toward Samsung.
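As a reminder of what the two reported metrics measure, the following sketch computes accuracy at a fixed threshold and the AUC as the fraction of (positive, negative) pairs that the model ranks correctly. The labels and scores are made-up toy values, not the paper's data, and the 0.5 threshold is an assumption.

```python
def accuracy(labels, scores, threshold=0.5):
    """Fraction of users whose thresholded score matches the label."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


def roc_auc(labels, scores):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy data: 1 = Apple-like, 0 = Samsung-like (hypothetical values).
labels = [1, 0, 1, 0, 1, 0]
scores = [0.9, 0.2, 0.6, 0.7, 0.8, 0.1]

print(accuracy(labels, scores), roc_auc(labels, scores))
```

Unlike accuracy, the AUC does not depend on a single threshold, which is why it is reported alongside the ROC curve.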
Fig. 7. The ROC Curve of the Model, which Illustrates the Diagnostic Ability of a Binary Classifier as the Discrimination Threshold is Varied [8].
Table 2. The Potential Customer Model Prediction Performance Table.
Model Type    Training Set    Testing Set
SVM model     98.29%          65%
CNN model     91.43%          75.81%

4.4 UI Interface
Additionally, a website front end was created for data visualization. Because our model has few interactive elements, the front end is presented in the form of a report. The web report has five main parts: Abstract, Motivation, System, MBTI, and Result. The abstract part, shown in Fig. 8, contains a link to a YouTube video about the paper. The MBTI part visualizes the MBTI results of three brands, each with four separate MBTI indicators and an integrated MBTI type distribution. Figure 9 shows the predicted distribution for Apple: a total of 1218 fans, accounting for 78% of all fans, are ISTP. The results part, shown in Fig. 10, presents the final prediction result and the intermediate prediction results in tables.
Fig. 8. Front Page of Web Front.
Fig. 9. Predicted MBTI Distribution of Apple.
Fig. 10. Prediction Result Table.
5 Conclusion
Under the topic of Automatic Sales Leads Finding, a prediction model based on BERT and the Google natural language API is developed. The model takes the public tweets of a specific user as input, uses BERT to predict the user’s MBTI type, and uses the Google API to predict the sentiment index and possible topics of interest in the tweets. A final CNN takes the MBTI type, sentiment index, and topics as input and predicts the user’s likely preference within a group of closely related companies. Our MBTI prediction accuracy reached 68%, and the final prediction accuracy reached about 75%. In the future, we will continue to improve the MBTI prediction model, as it remains a challenge, and we will add more real data to train and optimize our model. In addition, the analysis of online platforms such as social media is a promising research area for marketing. Beyond finding potential customers for competing products, the unstructured data on social media can be used to extract other features and analyze the effects of other business factors.
References
1. Cloud Natural Language, Google Cloud
2. BERT base uncased model (2021)
3. Abidin, N.H.Z., et al.: Improving intelligent personality prediction using Myers-Briggs type indicator and random forest classifier. Int. J. Adv. Comput. Sci. Appl. (2020)
4. Amirhosseini, M.H., Kazemian, H.: Machine learning approach to personality type prediction based on the Myers-Briggs type indicator®. Multimodal Technol. Interact. 4, 03 (2020)
5. Bai, S., Zhu, T., Cheng, L.: Big-five personality prediction based on user behaviors at social network sites. CoRR, abs/1204.4809 (2012)
6. Başaran, S., Ejimogu, O.H.: A neural network approach for predicting personality from Facebook data. SAGE Open 11(3), 21582440211032156 (2021)
7. Bharadwaj, S., Sridhar, S., Choudhary, R., Srinath, R.: Persona traits identification based on Myers-Briggs type indicator (MBTI) - a text classification approach. In: 2018 International Conference on Advances in Computing, Communications and Informatics (ICACCI), pp. 1076–1082 (2018)
8. Carter, J.V., Pan, J., Rai, S.N., Galandiuk, S.: ROC-ing along: evaluation and interpretation of receiver operating characteristic curves. Surgery 159(6), 1638–1645 (2016)
9. Cerkez, N., Vrdoljak, B., Skansi, S.: A method for MBTI classification based on impact of class components. IEEE Access 9, 146550–146567 (2021)
10. Cui, B., Qi, C.: Survey analysis of machine learning methods for natural language processing for MBTI personality type prediction (2017)
11. DiRienzo, C., Das, J., Synn, W., Kitts, J., McGrath, K.: The relationship between MBTI® and academic performance: a study across academic disciplines. J. Psychol. Type (2010)
12. Garg, S., Garg, A.: Comparison of machine learning algorithms for content based personality resolution of tweets. Soc. Sci. Human. Open 4(1), 100178 (2021)
13. Gjurkovic, M., Snajder, J.: Reddit: a gold mine for personality prediction. In: PEOPLES@NAACL-HLT, pp. 87–97 (2018)
14. Hosmer, D.W., Jr., Lemeshow, S., Sturdivant, R.X.: Applied Logistic Regression, vol. 398. John Wiley & Sons, Hoboken (2013)
15. Lin, J., Mao, W., Zeng, D.D.: Personality-based refinement for sentiment classification in microblog. Knowl.-Based Syst. 132, 204–214 (2017)
16. Musavi, M.T., Ahmed, W., Chan, K.H., Faris, K.B., Hummels, D.M.: On the training of radial basis function classifiers. Neural Netw. 5(4), 595–603 (1992)
17. Reynierse, J.H.: An MBTI model of entrepreneurism and bureaucracy: the psychological types of business entrepreneurs compared to business managers and executives. J. Psychol. Type 40, 3–19 (1997)
18. Tandera, T., Suhartono, D., Wongso, R., Prasetio, Y.L.: Personality prediction system from Facebook users. Procedia Comput. Sci. 116, 604–611 (2017). Discovery and innovation of computer science technology in artificial intelligence era: The 2nd International Conference on Computer Science and Computational Intelligence (ICCSCI 2017)
19. Vaswani, A., et al.: Attention is all you need. In: Advances in Neural Information Processing Systems, vol. 30 (2017)
Diagnosis of Parkinson’s Disease Through Spiral Drawing Classification via VGG19 and AnoGAN
Hyunmin Lee
The Hockaday School, 11600 Welch Rd, Dallas, TX 75229, United States
[email protected]
Abstract. Millions of people worldwide suffer from Parkinson’s disease, a neurodegenerative disease that comes with symptoms such as tremors and difficulty in bodily function. Around 60,000 new cases arise annually, and yet no standardized way of diagnosing Parkinson’s exists to this day. To enable a more effective process of medication, therapy, and treatment, early diagnosis of PD through a standardized measure is crucial. Previous attempts to achieve this include the analysis of MRI data, speech signals, and the drawing of spirals. Our objective in the study was to enhance the spiral drawing test by combining a CNN model (VGG19) and an anomaly detection system, AnoGAN. The initial trial with the VGG19 model produced an accuracy score of 94%, higher than that of an existing study which produced an accuracy score of 88%. To further refine this algorithm, we incorporated AnoGAN, which allowed for more effective detection of anomalous data and enabled a process of cross-checking and feedback, increasing the accuracy further. This algorithm produced an anomaly score for each of the spiral drawings in the data set; a larger score indicated a higher likelihood that the drawing belongs to a PD patient. Though we have not yet established an explicit threshold for the minimum anomaly score at which one is diagnosed with PD, this will be possible once more data has been accumulated and used to train the algorithm. Overall, the high accuracy and the standardized nature of this design suggest its possible application in hospitals in the future.
Keywords: Artificial Intelligence · Parkinson’s Disease · AnoGAN · Deep Learning
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 K. Arai (Ed.): SAI 2023, LNNS 739, pp. 552–560, 2023. https://doi.org/10.1007/978-3-031-37963-5_39
1 Introduction
Parkinson’s disease (PD), a neurodegenerative brain disorder, gradually impairs its patients’ bodily functions, perception, and sensations. With approximately 10 million people worldwide living with PD and up to 60,000 new cases each year [1], PD has been and will continue to be a pertinent issue in our society. Before the typical symptoms appear, many patients experience precursors such as sleep problems, nightmares, loss of smell, and restless legs. Afterward, during the onset of PD, they experience less noticeable symptoms such as mild tremors, softness of voice, and slow handwriting [2]. While these symptoms begin in one part of the body, they gradually spread to other parts and become more severe. Patients may experience other symptoms such as depression, difficulty speaking, and urinary problems as well. Unfortunately, there is currently no cure for PD; patients cannot eliminate its effects but can only alleviate the symptoms through medication, surgery, or therapy. Early diagnosis is therefore crucial, as it allows for more effective medication, therapy to slow down progression, and possible diagnosis of other neurological disorders. However, because there are no laboratory tests or conclusive screenings for Parkinson’s, diagnosis is only possible through neurological examinations. These examinations are not definitive, since a patient may not meet the diagnostic criteria but still have PD or, conversely, be diagnosed and later learn that they do not have PD [3]. According to Parkinson’s UK, more than a quarter of patients were initially misdiagnosed with a different condition and received the incorrect treatment [4]. As illustrated in Fig. 1, the incidence rate of Parkinson’s has also been increasing over the years [5]. This raises the crucial question of how PD can be diagnosed in its early stages in the most effective and accurate way possible.
Fig. 1. Parkinson’s Incidence Rates Across Decades (1976–1985, 1986–1995, 1996–2005)
Causing slow movement, muscle stiffness, and impaired balance, Parkinson’s disease affects the daily lives of its patients in numerous ways. For example, patients experience a tremor in their hands or fingers, which makes it difficult for them to draw spirals, a simple task for healthy people [2]. Spiral drawing classification is therefore considered a possible method to test for PD, but its performance still has room for improvement [6]. Because it is typically difficult to gain access to large quantities of data from PD patients, the limited size of the data set lowers the accuracy and functionality of a deep learning algorithm [6].
However, an anomaly detection algorithm can help overcome this hindrance: it requires only data from healthy people, which can be accumulated without much difficulty. Using this approach, a more accurate method of diagnosis can be developed and implemented in hospitals. Our investigation consists of two main parts: conventional spiral drawing classification and anomaly detection. Section 2 reviews related work, Sect. 3 describes the materials and methods, Sect. 4 presents the results, and Sect. 5 discusses the implications of this work. Section 6 concludes and identifies areas for future study.
2 Related Work
Since hand tremors are one of the major symptoms of PD, the disease impairs the ability to sketch or write normally, and this characteristic has been used to identify patients with PD. Chakraborty et al. extracted training data from the results of the wave drawing test and the spiral drawing test performed on healthy people and PD patients. To ensure that both drawings would be used for classification, they created a multistage classifier architecture divided into three main sections. Within this system, Logistic Regression and a Random Forest Classifier were used for classification. This resulted in a model with an accuracy of 93.3%, which has the potential to improve with a larger amount of data and more types of drawings [6].
In the past, the Static Spiral Test (SST) has often been used to determine whether someone has Parkinson’s, as it allows one’s hand tremor level to be evaluated. Along with the SST, Khatamino et al. conducted a more complex test, the Dynamic Spiral Test (DST), in which the test spiral does not have a fixed position on the screen and disappears for certain time intervals. A CNN-based model with the SGD optimizer was trained on the data from the SST and DST. The DST produced a higher accuracy of around 88%, indicating that it has better discriminative power than the traditional SST and should thus be incorporated in medical tests [7].
Patients with PD typically experience vocal disorders and struggle to control the volume of their speech or to pronounce syllables. Since this symptom can be assessed early on, Haq et al. aimed to distinguish healthy people from patients with PD based on speech signals. They evaluated the accuracy of various machine learning classifiers, such as logistic regression, SVM, KNN, and DNN. The DNN had a classification accuracy of 98%, specificity of 95%, and sensitivity of 99%, the best among the diagnosis systems compared [8].
Patients with PD experience a gradual degradation of motor function in their brain. The cortical and subcortical parts of PD patients’ brains function differently from those of healthy people. As a result, the electroencephalogram (EEG) signals in their brains are typically abnormal and can be used for early diagnosis. Using a CNN model with a data set of 20 PD patients and 20 healthy people, an algorithm with an accuracy of 88.25%, sensitivity of 84.71%, and specificity of 91.77% was produced even with a limited number of subjects [9].
Since PD is a neurodegenerative disorder that progresses gradually, it is important to distinguish the prodromal, first, second, and third stages in order to identify its severity and predict future progression. MRI can detect pathologies in the brain and was thus adopted in one such study. After the MRI data were classified using a CNN model, abnormalities at different stages could be detected. Moreover, the accuracy rate of 94% demonstrated the potential for this method to become more widely used in the future [10].
PD affects patients in various ways; the main symptoms include difficulty in moving, psychological and cognitive problems, and prodromal features such as hyposmia and rapid eye movement sleep behavior disorder. These symptoms progress slowly and can be treated through pharmacologic or nonpharmacologic approaches such as medication, exercise, and speech therapy.
Generally, existing studies on the diagnosis of PD depend heavily on a small sample of PD patients and thus may not operate with the same accuracy when applied to a larger amount of data. As the second work above shows, a model might successfully classify the data at hand, but it is difficult to obtain large quantities of spiral drawings from PD patients compared to healthy people. This limitation must be overcome to create a model that can be applied effectively in clinics and hospitals.
3 Materials and Methodologies
3.1 Data Description
The data comprises spiral drawings from 62 PD patients and 15 healthy people and was collected with the LCD monitor and digitized pen of a Wacom Cintiq 12WX graphics tablet. The device recorded handwriting drawings and allowed for the testing of coordination in PD patients. Since deep learning algorithms require a considerable amount of data, data augmentation was applied to the dataset. ImageDataGenerator from the Keras library was used to generate more than 4,000 images. Parameters such as rotation_range, brightness_range, and rescale were used to create varied images [11].
3.2 VGG19
VGG19 is a convolutional neural network (CNN) commonly used for transfer learning; its architecture is shown in Fig. 2. It consists of a total of 19 weight layers: 16 convolution layers and three fully connected layers. In a convolution layer, a filter, also known as a kernel, traverses the input image and performs a convolution operation to extract features. A pooling layer decreases the size of the data from the previous layer, and the fully connected layers finally classify the data into labels. Unlike basic CNN models, VGG19 was pre-trained on the large “ImageNet” dataset, and the trained layers were then frozen to preserve the pre-trained weights. The kernel size of the convolution layers is 3 × 3, and the rectified linear unit (ReLU) is used as the activation function. The pre-trained VGG19 model can be downloaded from the Keras library, and its input size should be 224 × 224 [12].
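The three operations named above (a 3 × 3 kernel sliding over the image, ReLU, and max pooling) can be illustrated on a tiny image in plain Python. This is a didactic sketch, not the VGG19 implementation: it uses a single hand-written edge kernel and no padding, whereas VGG19 learns many kernels per layer and, in common implementations, pads its convolutions so that the spatial size is preserved.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image (cross-correlation, no padding)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh) for dj in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def relu(feature_map):
    """Replace negative responses with zero."""
    return [[max(0, v) for v in row] for row in feature_map]


def max_pool_2x2(feature_map):
    """Keep the maximum of each non-overlapping 2 x 2 window."""
    return [
        [
            max(feature_map[i][j], feature_map[i][j + 1],
                feature_map[i + 1][j], feature_map[i + 1][j + 1])
            for j in range(0, len(feature_map[0]) - 1, 2)
        ]
        for i in range(0, len(feature_map) - 1, 2)
    ]


# Toy 6 x 6 image with a vertical edge between columns 2 and 3.
image = [[0, 0, 0, 1, 1, 1] for _ in range(6)]
# Hand-written vertical-edge kernel (in a real CNN the weights are learned).
kernel = [[-1, 0, 1] for _ in range(3)]

response = conv2d_valid(image, kernel)      # 4 x 4 feature map
features = max_pool_2x2(relu(response))     # 2 x 2 after pooling
print(response[0], features)
```

The kernel responds only where the window straddles the edge, and pooling halves each spatial dimension while keeping the strongest response in each window.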
Fig. 2. Structure of VGG Model Applied to the System (Convolution Layers, Max-Pooling Layer, Fully Connected Layer)
3.3 GAN
The generative adversarial network (GAN) is an unsupervised learning algorithm, which does not need any label or target in the dataset. It was proposed in 2014 by Goodfellow et al., and the model consists of two parts: a generator and a discriminator. A random vector is used as the input of the generator. The purpose of the generator is to create plausible data, while the discriminator aims to classify whether the data is fake or real [13].
3.4 DCGAN
Since the GAN is based on solving a minimax problem between the generator and the discriminator, unstable training is a critical limitation of the algorithm. The deep convolutional generative adversarial network (DCGAN) was proposed to resolve this drawback. DCGAN eliminated the fully connected layers of the original GAN model and replaced them with convolutional layers. Furthermore, batch normalization was applied to the model to enable stable training [14].
3.5 AnoGAN
Since supervised learning such as classification or regression requires labels in the dataset, this approach has difficulty diagnosing a particular disease, especially when only a small amount of patient data is available. AnoGAN was proposed to resolve this data imbalance problem with an unsupervised learning approach. The overall architecture of the model is similar to DCGAN, but the generator in AnoGAN is trained only on normal data, as seen in Fig. 3. Furthermore, the residual loss and the anomaly score are calculated between the generated normal data and the anomalous data, which makes it possible to visualize the difference between them. Since the generator requires only normal data for training, this model can be applied to the diagnosis of Parkinson’s disease with imbalanced datasets [15].
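A minimal sketch of how an anomaly score of this kind can be computed is given below. It assumes the weighted-sum form used in the AnoGAN paper [15], in which a residual term (pixel-wise difference between the query image and its closest generated image) is combined with a discrimination term (difference between discriminator features of the two images). The weight `lam = 0.1`, the flattened-list representation, and all values are assumptions for illustration, and the search for the closest generated image is taken as already done.

```python
def l1_distance(a, b):
    """Sum of absolute differences between two equally long vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(abs(x - y) for x, y in zip(a, b))


def anomaly_score(query, generated, query_feats, generated_feats, lam=0.1):
    """Weighted sum of residual loss and feature-level discrimination loss."""
    residual = l1_distance(query, generated)
    discrimination = l1_distance(query_feats, generated_feats)
    return (1.0 - lam) * residual + lam * discrimination


# Hypothetical flattened pixels and discriminator features.
query_pixels = [0.0, 1.0, 1.0, 0.0]
generated_pixels = [0.0, 1.0, 0.0, 0.0]   # closest image the generator can produce
query_features = [2.0, 4.0]
generated_features = [1.0, 2.0]

score = anomaly_score(query_pixels, generated_pixels,
                      query_features, generated_features)
print(score)
```

Because the generator has only seen normal drawings, it reconstructs a healthy spiral well and a tremor-affected spiral poorly, so the residual term, and with it the score, grows for patient drawings. The per-pixel residual is also what can be displayed as highlighted regions on the image.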
Fig. 3. Structure of GAN Model Applied to Model
4 Result
The initial classification resulted in an accuracy of around 83%. As shown in Fig. 4, there are abrupt variations in the training and validation loss. As a result, the training and validation accuracy values shown on the right are inconsistent and fluctuating.
Fig. 4. Training and Validation Accuracy Values for Initial Classification
After adding the batch normalization layer to the original algorithm, the accuracy increased to 92%. Furthermore, the values of the training and validation loss showed a general downward trend with fewer fluctuations. As a result, the training and validation accuracy also showed a general increasing trend with less inconsistency compared to the initial classification, as shown in Fig. 5.
Fig. 5. Training and Validation Accuracy Values after Batch Normalization
With anomaly detection, a large anomaly score of 33552.07812 is produced when the spiral drawing of a patient is put in the system. The aberrant parts of the drawing are marked in red, as shown in the picture on the right from Fig. 6. In comparison, for the spiral drawing of the healthy person shown on the left, the anomaly score produced is less than one-tenth of that of the PD patient, and there are no anomalous parts marked in red, demonstrating that the system successfully differentiates the two categories.
Fig. 6. Anomaly Detection in Healthy People and PD Patients (PD Patient on the Right)
Once the spiral drawings are accumulated in the database, VGG19 and AnoGAN are used together. When the patient dataset is not large enough, AnoGAN, trained on normal data, detects the anomalous data (the patients’ data) and, along with VGG19, trains the system to classify the images. After cross-checking and feedback, the algorithm can differentiate between the spiral drawings of healthy people and PD patients. This cross-checking process enables a higher accuracy despite the small amount of PD patient data currently available. The algorithm can therefore allow clinics and hospitals to diagnose their patients accurately and efficiently. The proposed system is depicted in Fig. 7.
Fig. 7. Proposed System Design with VGG19 and AnoGAN
5 Discussion
While an existing study using the spiral test and a CNN algorithm had an accuracy of 88%, our system (VGG19) had an initial accuracy of 94%, demonstrating its viability. Overall, the system incorporated and tested in this work complements existing algorithms that aim to detect Parkinson’s disease. Effective anomaly detection through AnoGAN allows for more accurate differentiation between healthy people and PD patients, making it a feasible and dependable method of diagnosis that can be applied in clinics and hospitals. While this algorithm was applied specifically to the spiral drawing test, it has the potential to be applied to other methods of detection, such as MRI data. The system design can thus make an impact on the field of Parkinson’s and reduce the difficulties in diagnosing it. However, while we can infer whether someone has PD from the comparative size of the anomaly score, we do not yet have a clear threshold that separates healthy people from PD patients, and this threshold must be determined through further investigation.
6 Conclusion
The system we proposed in this paper uses the VGG19 and AnoGAN models to classify the spiral drawings of healthy people and PD patients. Due to the current lack of an effective way to diagnose PD, our primary objective was to create an algorithm that accurately differentiates between the two categories even with the limited quantity of PD patient spiral drawings available. We addressed this challenge by incorporating AnoGAN, which worked with the VGG19 network to detect anomalies in the patients’ data and achieved a higher accuracy score (92%) than existing systems. In future work, we aim to establish a definite threshold for the anomaly score by which PD patients will be diagnosed. The system put forth in this study thus has the potential to be implemented in hospitals in the future; once implemented, it could significantly change the way potential patients, clinicians, and the medical field as a whole approach the diagnosis of PD.
References
1. Parkinson’s News Today. https://parkinsonsnewstoday.com/parkinsons-disease-statistics/. Accessed 13 July 2022
2. NIH. https://www.nia.nih.gov/health/parkinsons-disease. Accessed 2 July 2022
3. Johns Hopkins Medicine. https://www.hopkinsmedicine.org/health/treatment-tests-and-therapies/how-parkinson-disease-is-diagnosed. Accessed 23 July 2022
4. Parkinson’s UK. https://www.parkinsons.org.uk/news/poll-finds-quarter-people-parkinsons-are-wrongly-diagnosed#:~:text=In%20our%20survey%20of%20more%20than%202%2C000%20people%2C. Accessed 17 July 2022
5. Savica, R., Grossardt, B.R., Bower, J.H., Ahlskog, J.E., Rocca, W.A.: Time trends in the incidence of Parkinson disease. JAMA Neurol. 73(8), 981–989 (2016)
6. Chakraborty, S., Aich, S., Han, E., Park, J., Kim, H.-J.: Parkinson’s disease detection from spiral and wave drawings using convolutional neural networks: a multistage classifier approach. In: 2020 22nd International Conference on Advanced Communication Technology (ICACT), pp. 298–303 (2020)
7. Khatamino, P., Cantürk, İ., Özyılmaz, L.: A deep learning-CNN based system for medical diagnosis: an application on Parkinson’s disease handwriting drawings. In: 2018 6th International Conference on Control Engineering & Information Technology (CEIT), pp. 1–6 (2018)
8. Haq, A.U., et al.: Comparative analysis of the classification performance of machine learning classifiers and deep neural network classifier for prediction of Parkinson disease. In: 2018 15th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 101–106 (2018)
9. Oh, S.L., et al.: A deep learning approach for Parkinson’s disease diagnosis from EEG signals. Neural Comput. Appl. 32(15), 10927–10933 (2018). https://doi.org/10.1007/s00521-018-3689-5
10. Mozhdehfarahbakhsh, A., Chitsazian, S., Chakrabarti, P., Chakrabarti, T., Kateb, B., Nami, M.: An MRI-based deep learning model to predict Parkinson’s disease stages. medRxiv (2021)
11. Kaggle. https://www.kaggle.com/datasets/team-ai/parkinson-disease-spiral-drawings. Accessed 25 July 2022
12. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)
13. Goodfellow, I., et al.: Generative adversarial networks. Commun. ACM 63(11), 139–144 (2020)
14. Radford, A., Metz, L., Chintala, S.: Unsupervised representation learning with deep convolutional generative adversarial networks. arXiv preprint arXiv:1511.06434 (2015)
15. Schlegl, T., Seeböck, P., Waldstein, S.M., Schmidt-Erfurth, U., Langs, G.: Unsupervised anomaly detection with generative adversarial networks to guide marker discovery. In: Niethammer, M., et al. (eds.) IPMI 2017. LNCS, vol. 10265, pp. 146–157. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-59050-9_12
Modularity in Deep Learning: A Survey
Haozhe Sun1 and Isabelle Guyon1,2
1 LISN/CNRS/INRIA, Université Paris-Saclay, Gif-sur-Yvette, France
[email protected]
2 ChaLearn, Berkeley, USA
Abstract. Modularity is a general principle present in many fields. It offers attractive advantages, including, among others, ease of conceptualization, interpretability, scalability, module combinability, and module reusability. The deep learning community has long sought to take inspiration from the modularity principle, either implicitly or explicitly. This interest has been increasing over recent years. We review the notion of modularity in deep learning around three axes: data, task, and model, which characterize the life cycle of deep learning. Data modularity refers to the observation or creation of data groups for various purposes. Task modularity refers to the decomposition of tasks into sub-tasks. Model modularity means that the architecture of a neural network system can be decomposed into identifiable modules. We describe different instantiations of the modularity principle, and we contextualize their advantages in different deep learning sub-fields. Finally, we conclude the paper with a discussion of the definition of modularity and directions for future research.
Keywords: Deep Learning · Modularity · Neural Networks
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 K. Arai (Ed.): SAI 2023, LNNS 739, pp. 561–595, 2023. https://doi.org/10.1007/978-3-031-37963-5_40
1 Introduction
Modularity is a general principle present in many fields such as biology [22,29,52,58,59,80,81,84,108,140,180,186,195,241], complex systems [217,218], mathematics [14,31], system design [55,91,167,177,208], computer science [17,83], and graph theory [93,168,170,182]. While sharing the same name, there is no universally agreed upon definition of modularity [26]. However, we can identify a shared definition [9,207]: in general, modularity is the property of an entity whereby it can be broken down into a number of sub-entities (referred to as modules). This definition has different instantiations in different fields with their nuances [205], from which various properties may arise. Such field-specific properties include autonomy of modules (limited interaction or limited interdependence between modules) [7,14,15,17,52,87,96,113,119,167,170,177,208,260], functional specialization of modules [52,59,80,84,91,140,195], reusability of modules [6,12,14,43,51,55,61,142,174–176,184,188,192,207], combinability of modules [6,12,144,151,165,177,184,188,239], and replaceability of modules [174,175,177].
As a general principle, modularity is a descriptive property and an organizational scheme. It is a means of representing entities (data, tasks, models) to be able to manipulate them, conceptually or practically [17,55,92,177]. Though modular entities are not necessarily hierarchical [177], many modular entities have a hierarchical structure [217] in the sense that multiple modules of a lower hierarchy level can form one module of a higher hierarchy level. The modules of the lower hierarchy level are of finer granularity than those of the higher hierarchy level. At the same level of the hierarchy, modules can refer to an exclusive division of the overall entity (hard division) or overlapping parts of the overall entity (soft division). The decomposed modules can be homogeneous (similar modules) or heterogeneous (dissimilar modules).
From the very beginning of neural network research in the last century, the community has been interested in bringing the notion of modularity to neural networks [13,15,119,192], and this interest has been revived recently [6,9,12,39,61,77,78,135,209,239]. The publication trend (Fig. 1) shows an increasing interest in the modularity principle within deep learning over recent years. This survey investigates the notion of modularity in deep learning around three axes: data, task, and model. The organization of the survey is shown in Fig. 2.
Fig. 1. Publication Trend of “Modular Deep Learning” from 1990 to 2021. The Ratio of the Count of Publications Containing “Modular Deep Learning” and “Modular Neural Network” among Publications Containing “Deep Learning” and “Neural Network”, Indexed by Google Scholar. The Horizontal Axis is the Publication Year.
Fig. 2. Organization of this Survey. The First Three Sections Discuss how the Modularity Principle is Instantiated in the Three Axes: Data, Task, and Model Architecture. We then Cover other Modularity Notions for Completeness. Finally, we Discuss the Definition of Modularity and Directions for Future Research. The Introduction and Conclusion are ignored in this Figure.
Modularity in Deep Learning: A Survey
2 Data Modularity
Data is an entity used to represent knowledge and information. In the context of machine learning and deep learning, it can take various forms, e.g., image, audio, and text. Data samples can be interpreted as points in a high-dimensional space (fixed-length dense vectors) [8,136,138]. A collection of data samples is a dataset. Datasets can be used to train or test deep learning models, in which case they are referred to as training or test datasets. In these scenarios, data is the input of deep learning models (neural networks) [94].

Data modularity is the observation or creation of data groups; it refers to how a dataset can be divided into different modules for various purposes. The division of the dataset into modules facilitates conception and data manipulation. Data modularization can influence the training of learning machines [71,100,227]. Some algorithms leverage data modularity so that each data module is processed by a different solver [187].

We identify two types of data modularity: intrinsic data modularity and imposed data modularity. Intrinsic data modularity means identifiable dataset divisions that are naturally present in the data and are not introduced by a human practitioner. Imposed data modularity means identifiable dataset divisions that a human practitioner introduces. The rationale of this taxonomy is that when the dataset reaches the practitioner who analyses it, it already contains some form of intrinsic modularity, including that stemming from the class labels. The people who collect the data are not considered practitioners.

2.1 Intrinsic Data Modularity
Intrinsic data modularity means identifiable dataset divisions that are naturally present in the data and are not introduced by a human practitioner. Any supervised learning dataset can be divided according to the classes (labels); data points belonging to the same class are supposed to be close to each other in a hidden space, which is what allows classification algorithms to find solutions. Classes sharing common semantics can be further grouped to form super-classes. For example, ImageNet [63] has a class hierarchy (see Fig. 3(a)) which is used by Meta-Dataset [235]. The Omniglot dataset [142] and the OmniPrint datasets [227] contain character images organized in scripts, where each script (super-class) contains several characters (classes); the Meta-Album dataset [236] is a meta-dataset including 40 datasets, where each dataset can be considered as a super-class. The super-classes provide information about class similarity, allowing datasets to be split according to the semantics [258].

In addition to the classes or super-classes, data points can also be grouped by one or several metadata such as time, location, and gender. Such metadata is available, for example, in the Exif data of photos. The OmniPrint data synthesizer generates data together with a comprehensive set of metadata, including font, background, foreground, margin size, shear angle, rotation angle, etc. [227] (see Fig. 3(b)). The NORB dataset collected stereo image pairs of 50 uniform-colored toys under 36 angles, 9 azimuths, and 6 lighting conditions, where the angles, azimuths, and lighting conditions serve as the metadata [145].

Some datasets contain intrinsic clusters in the high-dimensional feature space. Such intrinsic clusters can stem from the underlying data generative process, where latent categorical variables determine the natural groups of data. An illustrative example is a Gaussian Mixture distribution where data points are assumed to be generated from a mixture of a finite number of Gaussian distributions with unknown parameters [101]. Some datasets have intrinsic manifolds; an illustrative example is the moons dataset shown in Fig. 3(c), where the two manifolds interlace while preserving an identifiable division, so that each manifold can be considered as a module. Both of the above examples fall into the category of data clustering. When data samples are interconnected in the form of a graph [154,249], this is called graph partitioning.

Fig. 3. Illustration of Modularity in Data. (a) Intrinsic Data Modularity based on Super-Classes, Images, and Class Hierarchy in ImageNet [63]; (b) Intrinsic Data Modularity based on Styles Characterized by a set of Metadata, the Upper-Left Circle Contains Black-on-White Characters, the Upper-Right Circle Contains White-on-Black Characters, the Lower Circle Contains Characters with Natural Foreground and Background, All Characters are Drawn from the same set of Classes (Small-case Latin Characters), these Three Circles Illustrate the Division of a Character Dataset based on its Metadata; (c) Intrinsic Manifolds in the form of a Moon Dataset, where each Data Manifold can be Considered as a Module; (d) Few-Shot Learning Episodes, Reprinted from [191]. (a), (b) and (c) are Examples of Intrinsic Data Modularity, (d) is an Example of imposed Data Modularity.

One question which arises is how to determine the optimal clustering of a dataset. Luxburg et al. [240] argue
that there are no optimal domain-independent clustering algorithms and that clustering should always be studied in the context of its end-use.

Multi-modal deep learning aims to build models that can process and relate information from multiple modalities. Here, modality refers to the way in which something happens or is experienced, e.g., data in the form of image, text, or audio [19]. Multi-modal datasets fall into the category of intrinsic data modularity in the sense that the data in each modality can be considered a module. For example, the VQA v2.0 dataset [97] consists of open-ended questions about images; the SpeakingFaces dataset [3] consists of aligned thermal and visual spectra image streams of fully-framed faces synchronized with audio recordings of each subject speaking.

2.2 Imposed Data Modularity
Imposed data modularity means identifiable dataset divisions which are introduced by a human practitioner. When training deep learning models [94], human practitioners usually divide the whole training dataset into mini-batches, which can be seen as a kind of imposed data modularity. The gradient is computed using one mini-batch of data for each parameter update; one training epoch means passing through all the mini-batches. This iterative learning regime is called stochastic gradient descent [199]. Mini-batches reduce the memory requirement for backpropagation, which makes training large deep learning models possible. On the other hand, batch size also influences learning behavior. Smith et al. [221] showed that the benefits of decaying the learning rate could be obtained by instead increasing the training batch size. Keskar et al. [131] showed that learning with large batch sizes usually gives worse generalization performance.

Instead of using a sequence of mini-batches sampled uniformly at random from the entire training dataset, curriculum learning [100] uses non-uniform sampling of mini-batches such that the mini-batch sequence exhibits an increasing level of difficulty. A related concept is active learning [193], which assumes that different data points in a dataset have different values for the current model update; it tries to select the data points with the highest value to construct the actual training set.

In few-shot learning and meta-learning, model performance is usually tested on few-shot episodes. Few-shot episodes are typically formed by drawing N classes from the class pool and K examples for each selected class; such episodes are called N-way-K-shot episodes [79,223] (Fig. 3(d)).
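The construction of an N-way-K-shot episode can be made concrete with a short sketch. The Python code below is illustrative only and is not taken from the cited works: the function name `sample_episode`, the toy label list, and the separate query set of `n_query` examples per class (a common convention, assumed here) are all assumptions made for the example.

```python
import random
from collections import defaultdict


def sample_episode(labels, n_way, k_shot, n_query, rng):
    """Draw one N-way-K-shot episode from a labelled dataset.

    labels[i] is the class of sample i. Returns (support, query), two lists
    of (sample_index, episode_label) pairs. The query set is an assumption
    of this sketch, not something prescribed by the text.
    """
    # Group sample indices by class (the intrinsic modules of the dataset).
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)

    # Imposed modularity: draw N classes, then K support and n_query query
    # samples per class, without replacement.
    classes = rng.sample(sorted(by_class), n_way)
    support, query = [], []
    for episode_label, c in enumerate(classes):
        picked = rng.sample(by_class[c], k_shot + n_query)
        support += [(i, episode_label) for i in picked[:k_shot]]
        query += [(i, episode_label) for i in picked[k_shot:]]
    return support, query


# Toy dataset (hypothetical sizes): 10 classes with 20 samples each.
toy_labels = [c for c in range(10) for _ in range(20)]
support, query = sample_episode(
    toy_labels, n_way=5, k_shot=1, n_query=3, rng=random.Random(0)
)
```

Each call yields a small self-contained classification problem whose labels are re-indexed from 0 to N-1, so the same class can carry a different label from one episode to the next.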
For such scenarios, the meta-training phase may or may not employ the same episodic learning regime [235]; recent studies [141,242,244] and competition results [71] suggest that episodic meta-training is not more effective than vanilla pretraining with access to the global class pool.

Data augmentation is a way to generate more training data by applying transformations to existing data [216]. The transformed versions of the same data point can be seen as a module. Some transformations, such as rotation and translation, form a group structure [196]. The effect of such data augmentation
can be understood as averaging over the orbits of the group that keeps the data distribution approximately invariant, which leads to variance reduction [40].

In addition to splitting the dataset into subsets of samples, each data sample can be split into subdivisions of features, referred to as feature partitioning. A dataset can be represented as a matrix where each row represents one data sample and each column represents one feature dimension. It can then be divided along the sample and feature dimensions. Schmidt et al. [207] process each feature partition with a different model. For image classification tasks, input images can be split into small patches that can be processed in parallel [67,121].

2.3 Conclusion of Data Modularity
We argue that data without structure contains no useful information for learning dependencies (e.g., between feature and label). Some dependencies boil down to the emergence or the creation of groups. Intrinsic data modularity relates to the semantic relationship between samples and how data samples are similar or dissimilar. Imposed data modularity, on the other hand, relates to the way that practitioners organize data at hand to better train learning machines. Future research for data-centric deep learning may investigate the relationship between intrinsic and imposed data modularity. For example, does intrinsic data modularity promote imposed data modularity? How does this interplay affect model training? Data modularity describes how the input of deep learning models can be modularized. On the other hand, the end goal (the output) of deep learning models can also be modularized, which is the topic of the next section.
3 Task Modularity
Fig. 4. Illustration of Sub-Task Decomposition. The Upper Figure Illustrates the Parallel Decomposition of a Task. The Lower Figure Illustrates the Sequential Decomposition of a Task.
Deep learning models are tools to solve tasks, ranging from the classification of entities to the generation of realistic photos. Solving a task amounts to achieving a corresponding objective. In deep learning, we usually model an objective by an
explicit differentiable objective function (also known as a loss function), allowing end-to-end training. This perspective can be generalized to any task, even if the objective function is implicit and does not entail a differentiable form. For example, the task of “purchasing a cup of tea” can be characterized by an indicator function that returns a penalty if no tea can be purchased or a bonus otherwise. In deep learning, tasks are often related to data; but they are different. Given the same dataset, one can define various tasks on top of it. For example, the MNIST dataset can be used either for an image classification benchmarking task [158] or for a pixel sequence classification benchmarking task [96,139], the OmniPrint-meta[1–5] datasets [227] can be used either for a few-shot learning benchmarking task or for domain adaptation benchmarking task. Tasks define the objective; they are orthogonal to how the end goal should be achieved. This section presents task modularity i.e., sub-task decomposition. Sub-task decomposition means that a task could be factorized or decomposed into subtasks. Sub-task decomposition facilitates conceptualization and problem-solving. The divide-and-conquer principle breaks down a complex problem into easier sub-problems [15,57,118,187]. By solving each individual sub-problem and combining the solutions, the complex problem can be solved more efficiently. The sub-task decomposition facilitates the integration of expert knowledge, and the a priori knowledge can further facilitate problem-solving. Sub-task decomposition can also promote reuse if the overall task is compositional; the solution to sub-tasks may be reused in other tasks [64,161,173,185,219]. The sub-task decomposition can be categorized into two regimes: parallel decomposition and sequential decomposition (Fig. 4). Parallel decomposition means that the sub-tasks can be executed in parallel. 
Sequential decomposition means that the sub-tasks need to be executed in order; certain sub-tasks cannot be executed before the previous sub-task is finished. In practice, these two regimes can be mixed. For example, a sub-task from a sequential decomposition can be further decomposed in parallel, which leads to a directed acyclic graph workflow.

3.1 Parallel Sub-Task Decomposition
A parallel sub-task decomposition is called homogeneous if the decomposed sub-tasks are similar. One typical example is dividing a multi-class classification problem into multiple smaller classification problems [92]. Given a neural network trained to perform a multi-class classification problem, Csordás et al. [61] use parameter masks to identify subsets of parameters solely responsible for individual classes. Kim et al. [133] learn to split a neural network into a tree structure to handle different subsets of classes. They assume that different classes use different features; the tree-structured neural network ensures that the later layers do not share features across different subsets of classes. Pan et al. [174,175] and Kingetsu et al. [134] decompose a multi-class classification model into reusable, replaceable and combinable modules, where each module is a binary classifier. Such modules can be recombined without retraining to obtain a new multi-class classifier. These methods can be useful in situations
where the classes to be classified frequently change. Abbas et al. [2] use transfer learning and class decomposition to improve the performance of medical image classification. Such sub-task decomposability is an implicit prerequisite of the model editing problem [127,160,162,163,220]. Model editing aims to modify a specific sub-task learned by a trained neural network without damaging model performance on other inputs, e.g., it aims to patch the mistake of the model for a particular sample. If the task cannot be decomposed into disentangled sub-tasks, then model editing cannot be achieved.

A parallel sub-task decomposition is termed heterogeneous if the decomposed sub-tasks are dissimilar; such decomposition is usually problem-dependent and requires expert knowledge of the task at hand. Belay et al. [25] decompose the recognition task of Amharic characters into a vowel recognition task and a consonant recognition task to reduce overall task complexity. Cao et al. [36] decompose the full self-attention into question-wide and passage-wide self-attentions to speed up inference for question answering tasks. Ding et al. [66] decompose the facial recognition task into multiple facial component recognition tasks. Zhou et al. [266] decompose the neural network learning task into structure learning and parameter learning to learn equivariance from data automatically. Gatys et al. [89] decompose the natural image synthesis task into a content component and a style component, which allows recombining the content and the style in a combinatorial way to generate new images.

3.2 Sequential Sub-Task Decomposition
Sequential sub-task decomposition reflects the sequential pipeline of the task. A simple example is the division of a machine learning task into a preprocessing stage (data cleaning and normalization) and a model inference stage [190]. In reinforcement learning, a complex task can usually be decomposed [219] into a sequence of sub-tasks or steps. As an illustrative example, imagine that the task of manufacturing an artifact Z requires purchasing the raw material X, forging X to produce parts Y, and then assembling the parts Y into the end product Z. Both X and Y can take different values independently (X ∈ {x1, x2, x3, ...}, Y ∈ {y1, y2, y3, ...}). Different values of X and Y can be recombined, which forms a combinatorial number of possible scenarios to learn. This pipeline can be factorized into three stages: (1) raw material purchase, (2) forging to produce parts, and (3) assembling of parts. Reinforcement learning agents would learn more efficiently if the learning happens at the granularity of the factorized stages instead of the overall task [56]. Furthermore, such a factorization enables the independence of credit assignment [181]; the failure of the overall task can be traced back to the problematic stages, while the other stages can remain untouched. For example, if the raw material is of bad quality, then the purchase sub-task needs to be improved; the forging sub-task and the assembling sub-task do not need to be changed [39].

The sequential pipeline is omnipresent in practical applications, e.g., optical character recognition (OCR) and natural language processing (NLP). When facing a multi-script (multi-language) recognition task, the pipeline can consist
of a script identification stage and a script-specific recognition stage [112,210], which decouples the domain classifier and the domain-specific solver. The text-in-the-wild recognition task [41] usually consists of a decoupled text detector (to localize the bounding box of the text) and recognizer (to recognize the text from the bounding box) [41]. Traditional OCR methods also decompose the word recognition task into a character segmentation task and a character recognition task [37,48,128,204]. The traditional NLP pipeline includes sentence segmentation, word tokenization, part-of-speech tagging, lemmatization, filtering stop words, and dependency parsing [125]. In bioinformatics, the scientific workflow (data manipulations and transformations) groups similar or strongly coupled workflow steps into modules to facilitate understanding and reuse [55].

3.3 Conclusion of Task Modularity
The sub-task decomposition can be parallel, sequential, or mixed (directed acyclic graph). We provided examples from the literature that leverage subtask decomposition to reduce task complexity or promote the reuse of sub-task solutions. Task modularity can help integrate expert knowledge and promote model interpretability when paired with model modularity, as will be discussed in the next section. Future research may focus on how to automate the process of sub-task decomposition or make the problem-dependent sub-task decomposition techniques transferable to other tasks, which is an important step for AutoML. It would reduce the demand for highly qualified deep learning engineers, which can reduce expert bias and entry barriers to deep learning.
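As a concrete illustration of the homogeneous parallel decomposition reviewed in Sect. 3.1 (a multi-class problem split into per-class binary problems whose solutions are recombined), consider the minimal Python sketch below. It is not the method of any cited work: the per-class scorers are hand-written toy functions standing in for trained binary modules, and all names (`make_classifier`, `modules`, the class names) are hypothetical.

```python
def make_classifier(modules):
    """Combine per-class binary scorers into one multi-class classifier.

    modules maps a class name to a function x -> score, where a higher
    score means "x is more likely to belong to this class".
    """
    def classify(x):
        # Evaluate every module independently; the highest score wins.
        return max(modules, key=lambda name: modules[name](x))
    return classify


# Hypothetical stand-ins for trained binary modules on scalar inputs.
modules = {
    "negative": lambda x: -x,
    "positive": lambda x: x,
}
clf = make_classifier(modules)

# Adding a class adds one module; the existing modules are left untouched.
modules_with_zero = dict(modules, near_zero=lambda x: 1.0 - abs(x))
clf3 = make_classifier(modules_with_zero)

print(clf(2.0), clf(-3.0), clf3(0.1), clf3(5.0))
```

The point of the sketch is only the interface: because each sub-task has its own module, the set of classes can change by adding or removing entries without modifying the others.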
4 Model Modularity
This section presents model modularity. It means that the architecture of the neural network system (one neural network or a system of neural networks) consists of identifiable sub-entities (modules). Model modularity is different from task modularity. A task defines an objective; task modularity focuses on decomposing the objective into sub-objectives. Model modularity focuses on the architecture of the neural network system; it decomposes the solution into sub-solutions.

4.1 Advantages of Model Modularity
Model modularity provides ease of conceptual design and implementation. For example, modern neural networks consist of repeated layer/block patterns (modules). Examples include fully-connected neural networks [94], vanilla convolutional neural networks, ResNet [105,251], Inception [230] and models searched by Neural Architecture Search (NAS) [74,270]. The design with homogeneous modules allows for a more concise description of the model architecture in the
sense of Kolmogorov complexity (short description length) [148,149]. For example, instead of specifying how each primitive operation (e.g., sum, product, concatenation) interacts in a computational graph, the model can be described as a collection of modules that interact with each other [92]. The standardization of such neural network building blocks (fully-connected layers, convolutional layers) also enabled the development of highly optimized hardware and software ecosystems for fast computation [1,90,98,156,178]. Together with sub-task decomposition (task modularity), model modularity offers ease of expert knowledge integration [12,25,95,211,214] and interpretability [118,135,137,184]. Interpretability can have different forms. For example, each neural network module could be assigned a specific interpretable sub-task. On the other hand, selective module evaluation provides insights on how different samples/tasks are related [6,12,119,209] in the context of conditional computation [28]. The model decomposition into modules promotes reusability and knowledge transfer [33]. Though each neural network is typically trained to perform a specific task, its (decomposed) modules could be shared across tasks if appropriate mechanisms promote such reusability. The simplest example would be the classical fine-tuning paradigm of large pretrained models [50,104,106,258]. This paradigm typically freezes the pretrained model and only retrains its last classification layer to adapt it to the downstream task. Pretrained models are typically pretrained on large datasets [201,225,254]. The large amount and diversity of training data make pretrained models’ intermediate features reusable for other downstream tasks. More recently, the finer-grained reusability of neural network systems has attracted the attention of researchers. 
Such methods assume that the tasks share underlying patterns and keep an inventory of reusable modules (each module is a small neural network) [6,12,135,239]. Each module learns different facets (latent factors or atomic skills) of the knowledge required to solve each task. The selective/sparse use and dynamic reassembling/recombination of these modules can promote sample efficiency [184] and combinatorial generalization [6, 12,62,117]. Combinatorial generalization is also known as compositional generalization, “infinite use of finite means” [47], and systematic generalization. It aims to generalize to unseen compositions of known functions/factors/words [38,60,82,132, 143,173,185], it is the ability to systematically recombine previously learned elements to map new inputs made up from these elements to their correct output [206]. For example, new sentences consist of new compositions of a known set of words. Combinatorial generalization is argued to be important to achieve human-like generalization [23,114,115,134,142,146,153,164,183,184,237,243]. Learning different facets of knowledge with different modules in a reusable way could be one solution to combinatorial generalization. Modular systems have been shown effective for combinatorial generalization [197] in various fields e.g., natural language processing [114,144,169,183,184], visual question answering [12,16,62], object recognition [38,142,176,185], and robotics [6,51,64,179].
The modularization of neural network systems promotes knowledge retention. If different knowledge is localized into different modules, targeted knowledge updates and troubleshooting [134,174,175] will be possible. This can alleviate gradient interference of different tasks [126,155,261] and catastrophic forgetting [4,6,10,72,85,120,129,172,202,233,239].

Modular neural network systems facilitate model scaling in two ways. (1) Modular models like fully-connected models and ResNet can be scaled up (or down) by simply stacking more (or fewer) modules to increase (or decrease) the model capacity to fit larger (or smaller) datasets [105]. (2) Modular methods based on sparsely activated Mixture-of-Experts [209] decouple computation cost from model size. They allow drastically increasing the model capacity without increasing compute cost because only a small fraction of the model is evaluated on each forward pass [21,49,68,75,102,209]. An extreme example of these sparsely activated models is Switch Transformer [76], which contains 1.6 trillion parameters, pushing the competition of large model sizes [35,222] to the next level.

4.2 Typical Modules in Deep Learning Models
This section reviews some typical modules in the deep learning literature. Almost all systems are modular to some degree [205]; neural network systems can almost always be decomposed into subsystems (modules) [18] following different points of view. More specifically, they usually have a hierarchical structure in which a module of a higher hierarchy level is made of modules of a lower hierarchy level. The elementary layer of modern neural networks (e.g., fully-connected layer, convolutional layer) can be seen as a module on its own. On the other hand, any neural network as a whole can also be considered as a module, e.g., in the context of ensembles [268], Mixture-of-Experts [119], and Generative Adversarial Networks (GAN) [95]. Some works [26,61,134,226] define modules as sub-neural networks where part of the parameters are masked out (set to 0). In these cases, overlapping modules can be obtained when the masks overlap.
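The mask-based view of modules can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the procedure of the cited works: the parameter vector, the two masks, and the helper names `apply_mask` and `overlap` are all hypothetical, and the masks are fixed by hand rather than learned.

```python
def apply_mask(weights, mask):
    """Return the sub-network selected by a binary mask (other weights set to 0)."""
    return [w * m for w, m in zip(weights, mask)]


def overlap(mask_a, mask_b):
    """Indices of the parameters that belong to both masked modules."""
    return [i for i, (a, b) in enumerate(zip(mask_a, mask_b)) if a and b]


weights = [0.5, -1.2, 0.3, 2.0, -0.7]  # flattened parameters of a toy network
mask_a = [1, 1, 1, 0, 0]               # module A keeps parameters 0, 1, 2
mask_b = [0, 0, 1, 1, 1]               # module B keeps parameters 2, 3, 4

module_a = apply_mask(weights, mask_a)
module_b = apply_mask(weights, mask_b)
shared = overlap(mask_a, mask_b)       # parameter 2 is shared: a soft division
```

With disjoint masks the modules would form a hard division of the parameters; here the shared index makes the two modules overlap.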
Fig. 5. Examples of a Module. (a) A Fully-Connected Layer; (b) A Basic ResNet Module, Reprinted from [105]; (c) an LSTM Module, Reprinted from [44].
4.2.1 Modules for Non-sequential Data

Fully-connected layers (Fig. 5(a)) imitate the connections between neurons in biological neural networks but connect every input neuron to every output neuron [94]. In practice, a fully-connected layer is implemented as a matrix multiplication between input data and learnable parameters.

Convolutional layers introduce the inductive bias of translation equivariance. Conceptually, a convolutional layer (with a single output channel) can be obtained from a fully-connected layer by enforcing local connectivity and parameter sharing [94]. Local connectivity means that each neuron only connects to a subset of neurons of the previous layer; parameter sharing means that the same learnable parameters are used across receptive fields. In practice, a convolutional layer is implemented as a collection of kernels/filters shifted over the input data [156,178]. Each kernel performs a dot product between input data and learnable parameters. Depending on the number of dimensions over which kernels are shifted, a convolutional layer is termed, e.g., 1D, 2D, or 3D. 2D convolutional layers are widely used in computer vision tasks [138,147].

Locally connected layers are similar to convolutional layers except that they remove the constraint of parameter sharing (across kernels). This is useful when one wants to impose local receptive fields while there is no reason to think each local kernel should be the same [94]. Low-rank locally connected layers relax spatial equivariance and provide a trade-off between locally connected layers and convolutional layers. The kernel applied at each position is constructed as a linear combination of a basis set of kernels with spatially varying combining weights. Varying the number of basis kernels allows controlling the degree of relaxation of spatial equivariance [73].
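The relation between fully-connected and convolutional layers stated above can be checked numerically on a toy case. The following Python sketch assumes a 1D input, a single output channel, stride 1, no padding, and no bias; the function names and the example values are illustrative choices, not part of the cited works.

```python
def dense(x, W):
    """Fully-connected layer without bias: y = W x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def conv1d(x, kernel):
    """1D convolution as used in deep learning (cross-correlation)."""
    k = len(kernel)
    return [sum(kernel[j] * x[i + j] for j in range(k))
            for i in range(len(x) - k + 1)]


def conv_as_dense_weights(kernel, n_in):
    """Weight matrix of the fully-connected layer equivalent to conv1d."""
    k = len(kernel)
    W = []
    for i in range(n_in - k + 1):
        row = [0.0] * n_in      # local connectivity: zero outside the window
        row[i:i + k] = kernel   # parameter sharing: the same kernel in each row
        W.append(row)
    return W


x = [1.0, 2.0, 3.0, 4.0, 5.0]
kernel = [1.0, 0.0, -1.0]
W = conv_as_dense_weights(kernel, len(x))

# Both layers compute the same output on this input.
assert dense(x, W) == conv1d(x, kernel)
```

In this sketch a locally connected layer would correspond to keeping the zero pattern of `W` while letting each row hold its own kernel values.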
Standard convolutional layers offer translation equivariance; a line of research focuses on generalizing this to other equivariances (rotation, reflection), referred to as group convolutional layers [24,53,54,65,88,246–248]. On the other hand, depthwise separable convolutional layers [46,109,213] factorize a standard convolutional layer into a depthwise convolutional layer and a pointwise convolutional layer, which reduces model size and computation.

Multiple layers can be grouped into a building block (a module of a higher hierarchy level). Examples include the building blocks of ResNet [105], Inception [229,230], ResNeXt [251], and Wide ResNet [262]. Inception [229,230] has parallel kernels of multiple sizes within each block and merges their results to extract information at varying scales. Inception also includes several techniques to reduce computation cost, e.g., factorizing large kernels into smaller kernels and using 1 × 1 convolution to reduce dimensionality. A ResNet block [105] (Fig. 5(b)) contains a sequence of convolutional layers; it adds a skip-connection (also known as residual connection, identity mapping) from the beginning to the end of the block to alleviate vanishing gradients. Many variants of the ResNet block have been proposed. For example, Wide ResNet [262] increases the block width; ResNeXt [251] aggregates parallel paths within each block.

The block design could be automatically searched instead of handcrafted. In order to narrow down the model search space, some Neural Architecture Search methods [74,116,257,270] automatically search the optimal design pattern for
a block (also known as a cell) while fixing the block composition scheme (also known as meta-architecture). Once the block design patterns are searched, the full model is instantiated by repeating the searched blocks following the predefined block composition scheme. For example, NAS-Bench-101 [257] defines the block search space as all possible directed acyclic graphs on V nodes (V ≤ 7) while limiting the maximum number of edges to 9.

McNeely-White et al. [159] report that the features learned by Inception and ResNet are almost linear transformations of each other, even though these two architectures have a remarkable difference in architectural design philosophy. This result explains why the two architectures usually perform similarly and highlights the importance of training data. This result is corroborated by Bouchacourt et al. [30], who argue that invariance generally stems from the data itself rather than from architectural bias.

4.2.2 Modules for Sequential Data

When the input data is sequential, e.g., time series, text, audio, or video, Recurrent Neural Networks (RNN) [200] come into play. The RNN module processes the sequential data one element at a time; the output (also known as the hidden state) of the RNN module at the previous time step is recursively fed back to the RNN module, which allows it to aggregate information across different time steps. The vanilla RNN module suffers from short-term memory issues; it cannot effectively preserve information over long sequences. To overcome this issue, gated recurrent unit (GRU) [45] and long short-term memory (LSTM) [107] modules use gates to control which information should be stored or forgotten in the memory, which allows better preservation of long-term information. In GRU and LSTM modules, gates are neural networks with trainable parameters. While GRU modules are faster to train than LSTM modules, their performance comparison varies depending on the scenario.
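The recurrence of a vanilla RNN module can be written down in a few lines. The sketch below is a deliberately simplified, scalar version (one input feature, one hidden unit) with arbitrarily chosen weights; the names `rnn_step` and `run_rnn` and the parameter values are assumptions made for illustration, and no gating is implemented.

```python
import math


def rnn_step(x_t, h_prev, w_x, w_h, b):
    """One step of a scalar vanilla RNN: h_t = tanh(w_x * x_t + w_h * h_prev + b)."""
    return math.tanh(w_x * x_t + w_h * h_prev + b)


def run_rnn(xs, w_x=0.5, w_h=0.9, b=0.0, h0=0.0):
    """Process a sequence one element at a time and return all hidden states."""
    h = h0
    states = []
    for x_t in xs:
        # The previous hidden state is fed back into the same module.
        h = rnn_step(x_t, h, w_x, w_h, b)
        states.append(h)
    return states


# A single non-zero input followed by zeros: with these toy weights, the
# trace of the first input is still present at later steps but shrinks.
states = run_rnn([1.0, 0.0, 0.0])
```

With these particular weights the hidden state decays from step to step, a small-scale picture of why information from early inputs is hard to preserve over long sequences without gating.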
GRU surpasses LSTM in long text and small dataset scenarios while LSTM outperforms GRU in other scenarios [255]. Contrary to RNN, GRU, and LSTM, which process sequential data one element at a time, self-attention layers [238] process the data sequence in parallel. For each data point in a data sequence (e.g., each time step of a time series), a self-attention layer creates three transformed versions, referred to as query vector, key vector, and value vector, through linear transformations. Between each pair of data points, the dot product between the query vector of one and the key vector of the other reflects how much those two data points are related within the sequence. These dot products are then normalized and combined with the corresponding value vectors to get the new representation of each data point in the sequence. An enhanced version of self-attention layers is multi-head self-attention layers, which extract different versions of query vector, key vector, and value vector for each data point. Multi-head self-attention layers improve performance by capturing more diverse representations. A transformer block combines multi-head self-attention layers, fully-connected layers, normalization layers, and skip-connections. Models built upon transformer blocks have achieved state-of-the-art performance in a wide range of tasks such as natural language processing [130]
and speech synthesis [150]. Transformer models can be applied to image modality by transforming each input image into a sequence of small image patches [67]. Despite the lack of image-specific inductive bias (translation equivariance, locality), vision transformers can achieve state-of-the-art performance when combined with a large amount of training data [20,67,103].
4.3 Composition of Modules
Section 4.2 presents typical modules in the literature. Section 4.3 discusses how to organize these modules to form a model (or a module of a higher hierarchy level).
Fig. 6. Illustration of Module Composition. (a) Sequential Concatenation. (b) Ensembling. (c) Tree-Structure Composition. (d) General Directed Acyclic Graph. (e) Conditional Composition. (f) Cooperation Composition.
4.3.1 Static Composition of Modules
Static composition means that the composed structure does not vary with input; the same structure is used for all input samples or tasks. One straightforward way to compose modules is sequential concatenation (Fig. 6(a)): multiple (typically homogeneous) modules are sequentially concatenated into a chain to form a model, where a module’s output is the next module’s input. Examples of sequential concatenation include fully-connected models [94] and ResNet models [105]. This composition scheme typically does not assume an explicit sub-task decomposition; the chain of concatenated modules can instead be seen as a series of information extraction steps [5,234,258], in which the extracted features transition from low-level to high-level. Ensembling composition [124,171,268], on the other hand, organizes modules in a parallel manner (Fig. 6(b)). The principle of ensembling is to aggregate
(e.g., averaging) the results of multiple modules (weaker learners) to obtain a more robust prediction. The rationale is that different modules are expected to provide complementary and diverse views of input data. Each module processes its data independently, without relying on the other modules at inference time. The regularization method Dropout [224], which randomly deactivates neurons during training, can be seen as an implicit ensemble method of overlapping modules. Sequential composition and parallel composition can be combined, e.g., in the form of a tree structure (Fig. 6(c)). A typical scenario of tree-structure composition is a model with a shared feature extractor and multiple task-specific heads [215,265]. All the above composition schemes are special cases of DAG (Directed Acyclic Graph, Fig. 6(d)). The general DAG composition scheme is typically found in models searched by Neural Architecture Search [152,192,252]. Cooperation composition (Fig. 6(f)) assumes that each module is a standalone neural network with specific functionality and that these neural networks cooperate during training or inference; it is a neural network system that consists of multiple separate neural networks. Different from ensembling composition, modules in cooperation composition are typically heterogeneous and interact with each other more diversely. For example, siamese networks [34,42,122] consist of two neural networks (modules) which work together to produce different versions of the input data. Generative Adversarial Networks (GAN) [95,269] train a generator under the guidance of a discriminator. The same spirit applies to teacher and student neural networks [231]. Some deep reinforcement learning methods implement the Actor-Critic [228] with two separate neural networks, such as AlphaGo [214], A3C [166], ACKTR [250].
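The static composition schemes of Fig. 6(a)-(c) discussed in this subsection can be expressed as higher-order functions. The sketch below is purely illustrative: toy scalar functions stand in for neural network modules, and the names `sequential`, `ensemble`, and `tree` are hypothetical helpers, not an API from the cited works.

```python
from functools import reduce


def sequential(*modules):
    """Chain modules: each module's output is the next module's input."""
    return lambda x: reduce(lambda value, module: module(value), modules, x)


def ensemble(*modules):
    """Run modules independently on the same input and average their outputs."""
    return lambda x: sum(module(x) for module in modules) / len(modules)


def tree(shared, heads):
    """Shared feature extractor followed by several task-specific heads."""
    def model(x):
        features = shared(x)
        return {task: head(features) for task, head in heads.items()}
    return model


# Toy scalar "modules" standing in for neural network layers.
def double(x):
    return 2.0 * x


def add_one(x):
    return x + 1.0


def square(x):
    return x * x


chain = sequential(double, add_one)         # computes add_one(double(x))
vote = ensemble(double, add_one, square)    # mean of three independent outputs
multi = tree(double, {"a": add_one, "b": square})

y_chain = chain(3.0)   # 7.0
y_vote = vote(3.0)     # (6 + 4 + 9) / 3
y_multi = multi(3.0)   # {"a": 7.0, "b": 36.0}
```

In all three cases the structure is fixed before any input is seen, which is what distinguishes static composition from the conditional composition of Sect. 4.3.2.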
Continual learning with deep replay buffer [211] consists of a continual neural network learner and a generative neural network serving as the replay buffer. Some other continual learning methods [10,202,233,239] continuously expand model capacity for new tasks by adding new modules which work in cooperation with old modules.
4.3.2 Conditional Composition of Modules
Conditional composition (Fig. 6(e)) is complementary to static composition in the sense that the composed modules are selectively (conditionally, sparsely, or dynamically) activated (used or evaluated) for each particular input. The input conditioning can happen at the granularity of individual samples [12,119,135] as well as tasks [118,155,157,184,226]. In the literature, this paradigm is also termed conditional computation [27,28]. The idea of conditional computation can be traced back to Mixture-of-Experts (MoE) introduced in the last century. An MoE is a system composed of multiple separate neural networks (modules), each of which learns to handle a sub-task of the overall task [118,267] e.g., a subset of the complete training dataset. A gating network computes the probability of assigning each example to each module [119,123] or a sparse weighted combination of modules [209]. Two issues of MoE are module collapse [135,164,209] and shrinking batch size [209], both of which are related to the balance of module utilization. Module collapse means
under-utilization of modules or lack of module diversity. Due to the self-reinforcing behavior of the gating network during training, premature modules may be selected and thus trained even more. The gating network may end up converging to always selecting a small subset of modules while the other modules are never used. Shrinking batch size means the batch size is reduced for each conditionally activated module. Large batch sizes are necessary for modern hardware to perform inference efficiently because they alleviate the cost of data transfers [209]. MoE can be generalized to e.g., stacked MoE [70,77,135,189,198] or hierarchical MoE [209,256] (Fig. 7). Eigen et al. [70] first explored stacked MoE; they introduced the idea of using multiple MoE with their own gating networks. In order to train stacked MoE, Kirsch et al. [135] use a generalized Viterbi Expectation-Maximization algorithm, Rosenbaum et al. [198] employ a multi-agent reinforcement learning algorithm, and Fernando et al. [77] use a genetic algorithm. MoE systems do not always have explicit gating networks; for instance, Fernando et al. [77] rely on the results of the genetic algorithm to decide the module routing scheme.
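The sparse weighted combination computed by a gating network, and the two utilization issues mentioned above, can be illustrated with a small top-k gated MoE. This is a sketch under stated assumptions, not the implementation of [209]: the experts and the gating function are hand-written toy functions, and `top_k_gate` and `moe_forward` are hypothetical names.

```python
import math


def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_gate(scores, k):
    """Keep the k largest gate scores and renormalize them; others get weight 0."""
    keep = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    probs = softmax([scores[i] for i in keep])
    weights = [0.0] * len(scores)
    for i, p in zip(keep, probs):
        weights[i] = p
    return weights


def moe_forward(x, experts, gate, k=2):
    """Sparse weighted combination: only experts with non-zero weight are evaluated."""
    weights = top_k_gate(gate(x), k)
    output = sum(w * experts[i](x) for i, w in enumerate(weights) if w > 0.0)
    return output, weights


# Hypothetical toy experts and a hand-written (untrained) gating function.
experts = [lambda x: x, lambda x: -x, lambda x: x * x, lambda x: 1.0]


def gate(x):
    return [x, -x, abs(x) - 1.0, 0.0]


batch = [-2.0, -0.5, 0.5, 2.0, 3.0]
usage = [0] * len(experts)
for x in batch:
    y, weights = moe_forward(x, experts, gate, k=2)
    for i, w in enumerate(weights):
        usage[i] += int(w > 0.0)   # number of samples routed to each expert
```

Here `usage` records how many samples of the batch reach each expert. Each expert sees only a fraction of the five inputs, which is the shrinking batch size effect; a trained gate whose `usage` concentrated on one or two experts would exhibit module collapse.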
Fig. 7. Extension of Mixture-of-Experts (MoE). (a) A Stacked MoE, which Stacks Multiple MoE Layers into a Chain. (b) A Hierarchical MoE, where a Primary Gating Network Chooses a Sparse Weighted Combination of “Modules”, each of which is an MoE with its Own Gating Network.
Inspired by MoE, some deep learning methods keep an inventory of reusable specialized modules that can be conditionally reassembled for each input. This approach has been advocated to promote knowledge transfer, sample efficiency, and generalization. For example, in visual question answering, Neural Module Networks [12,62,111] dynamically reassemble modules into a neural network to locate the attention (region of interest) on the questioned image. The question’s parsing guides the reassembling process so that the reassembled model reflects
the structure and semantics of the question. For this particular task, the compositionality of modules comes from the compositionality of visual attention. Following the question’s syntax, the reassembled modules sequentially modify the attention onto the questioned image. For example, the module associated with the word “cat” locates the image region containing a cat, and the module associated with the word “above” shifts up the attention. Zhang et al. [264] investigated adding new abilities to a generic network by directly transplanting the module corresponding to the new ability, dubbed network transplanting. Some work relies on the hypothesis that the tasks at hand share some commonalities i.e., hidden factors are shared across tasks. Each hidden factor can be learned by a separate module from the module inventory for transfer learning and meta-learning. For example, Alet et al. [6] use simulated annealing to meta-learn an inventory of modules reusable across tasks to achieve combinatorial generalization. The parameters of an inventory of modules are optimized during meta-training; the trained modules are reassembled during the meta-test with an optional parameter fine-tuning process. They demonstrated the utility of their method for robotics tasks. Ponti et al. [184] assume that each task is associated with a subset of latent discrete skills from a skill inventory. They try to generalize more systematically to new tasks by disentangling and recombining different facets of knowledge. More precisely, they jointly learn a skill-specific parameter vector for each latent skill and a binary task-skill allocation matrix. For each new task, the new model’s parameter vector is created as the average of the skill-specific parameter vectors corresponding to the skills present in the new task (in addition to a shared base parameter vector). The conditional composition scheme also has other forms. For example, Teerapittayanon et al. 
[232] save computation on easy input data via early exiting; later layers will be skipped if the intermediate feature’s prediction confidence passes a predefined threshold. Fuengfusin et al. [86] train models whose layers can be removed at inference time without significantly reducing the performance to allow adaptive accuracy-latency trade-off. Similarly, Yu et al. [259] train models which are executable at customizable widths (the number of channels in a convolutional layer). Xiong et al. [253] sparsely activate convolutional kernels within each layer for each particular input sample, which provides an example of the conditional composition of overlapping modules.
4.4 Conclusion of Model Modularity
Section 4 presents how the notion of modularity is instantiated in the architecture of neural network systems. The structure of neural network modules (Sect. 4.2) and the way to organize the modules (Sect. 4.3) provide a complementary view of model modularity. While all modern neural networks are modular to some extent, different instantiations of the modularity principle offer different advantages (Sect. 4.1). The advantages include ease of conceptual design and implementation, ease of expert knowledge integration, better interpretability, ease of knowledge transfer
and reuse, better generalization and sample efficiency, ease of knowledge retention, ease of troubleshooting, and better scalability.
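As one concrete instance of the conditional composition schemes reviewed in Sect. 4.3.2, early exiting can be sketched in a few lines. The code below is a simplified illustration under stated assumptions, not the method of [232]: the stage functions, the toy two-class classifier, and the threshold value 0.9 are all hypothetical.

```python
def early_exit_predict(x, stages, threshold=0.9):
    """Run stages in order; stop once an intermediate classifier is confident.

    `stages` is a list of (feature_fn, classifier_fn) pairs. Each classifier
    returns class probabilities for the current intermediate feature.
    Returns (predicted label, number of stages actually evaluated).
    """
    feature = x
    for depth, (feature_fn, classifier_fn) in enumerate(stages, start=1):
        feature = feature_fn(feature)
        probs = classifier_fn(feature)
        confidence = max(probs)
        if confidence >= threshold or depth == len(stages):
            return probs.index(confidence), depth   # later stages are skipped


# Hypothetical toy setting: each stage amplifies the feature, and the
# classifier is more confident when the feature is far from zero.
def amplify(feature):
    return 2.0 * feature


def classify(feature):
    p = min(max(0.5 + 0.25 * feature, 0.0), 1.0)   # clipped linear score
    return [1.0 - p, p]


stages = [(amplify, classify)] * 3
easy = early_exit_predict(1.0, stages)   # exits after the first stage
hard = early_exit_predict(0.1, stages)   # needs all three stages
```

The set of modules that is evaluated therefore depends on the input: the easy sample uses one stage, the hard sample uses the full chain.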
5 Other Notions of Modularity
There remain some other notions of modularity in the deep learning literature. In graph theory, the term “modularity” refers to a measure commonly used in community detection. It measures the density of connections within a community (module) compared to between communities [170]. This measure can be applied to graph clustering problems in the form of modularity optimization [32,110,203,212]. Inspired by this measure, Filan et al. [78] investigate the parameter clustering pattern that emerges from the training of a neural network. They view a neural network as an undirected weighted graph (edge weights are the absolute value of network parameters) and apply spectral clustering on the obtained graph. They observe that some neural networks trained on image classification tasks have some clustering properties of their parameters: edge weights are stronger within one cluster than between clusters. Watanabe et al. [245] have obtained similar results. Béna et al. [26] adapted the graph-theoretic modularity measure to define structural modularity and define functional specialization through three heuristic measures. The functional specialization can be intuitively understood as the extent to which a sub-network can do a sub-task independently. To investigate the relationship between structural and functional modularity, they design a scenario where a model with two parallel modules (with an adjustable number of interconnections) is used to predict whether the parity of the two digits is the same or different. They show that enforcing structural modularity via sparse connectivity between two communicating modules does lead to functional specialization of the modules. However, this phenomenon only happens at extreme levels of sparsity. With even a moderate number of interconnections, the modules become functionally entangled. Mittal et al.
[164] observed that modular systems (weighted combinations of parallel modules) with good module specialization also perform well as a whole; however, end-to-end training alone is not enough to achieve good module specialization.
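The graph-theoretic modularity measure mentioned at the start of this section is commonly written in Newman's form, Q = (1/2m) Σ_ij [A_ij − k_i k_j / 2m] δ(c_i, c_j), where A is the weighted adjacency matrix, k_i the weighted degree of node i, m the total edge weight, and c_i the community of node i. The following sketch computes Q for a small example; the function name `modularity` and the toy graph are assumptions made for illustration.

```python
def modularity(adjacency, communities):
    """Newman modularity Q of a partition of an undirected weighted graph.

    adjacency[i][j] is the (symmetric) edge weight between nodes i and j;
    communities[i] is the community label of node i.
    """
    n = len(adjacency)
    degree = [sum(row) for row in adjacency]
    two_m = sum(degree)                      # twice the total edge weight
    q = 0.0
    for i in range(n):
        for j in range(n):
            if communities[i] == communities[j]:
                q += adjacency[i][j] - degree[i] * degree[j] / two_m
    return q / two_m


# Toy graph: two triangles joined by a single edge between nodes 2 and 3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
A = [[0.0] * 6 for _ in range(6)]
for i, j in edges:
    A[i][j] = A[j][i] = 1.0

good = modularity(A, [0, 0, 0, 1, 1, 1])   # one community per triangle
bad = modularity(A, [0, 1, 0, 1, 0, 1])    # arbitrary split across triangles
```

The partition that follows the triangles yields a positive Q, whereas the arbitrary split yields a negative one. For a neural network viewed as a weighted graph, as in [78], `adjacency` would hold absolute parameter values.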
Fig. 8. Illustration of a Disentangled Representation.
The term “modularity” is related to the notion of independence in some literature. For example, Galanti et al. [87] use modularity to refer to the ability of hypernetworks [99] to learn a different function for each input instance. A line of research has been carried out on learning disentangled representation. Intuitively, disentangled representation aims to reverse the underlying data generating process and retrieve its latent factors into the learned representation (Fig. 8). One of the desirable properties of a disentangled representation [69,194,263] is “modularity”. In this context, a modular representation is a representation where each dimension of the representation conveys information about at most one latent generative factor.
6 Discussion
Defining modularity is, in itself, a challenging problem. The notion of modularity is present in literature across many different fields [14,17,22,29,31,55,58,59,80,81,83,84,91,93,140,167,168,170,177,180,182,186,195,208,217,218,241]. While many researchers have a strong intuition about what it means for an entity to be modular, there has yet to be a universal agreement on what defines modularity. The same is true even within the field of deep learning. As rightly said by Béna et al. [26]: “Modularity of neural networks is a bit like the notion of beauty in art: everyone agrees that it’s important, but nobody can say exactly what it means”. We argue that the difficulty of defining modularity stems from the fact that the notion of modularity usually comes with many different properties: replaceability of modules, combinability of modules, reusability of modules, autonomy of modules (limited interaction or limited interdependence between modules), functional specialization of modules. Authors from different fields typically only retain one or two of the above properties to claim an entity to be modular. In this survey, we define modularity as the property of an entity whereby it can be broken down into a number of sub-entities (referred to as modules). This definition is the prerequisite of the properties mentioned above; it is the greatest common definition of the notion of modularity. By recursively applying the definition of modularity, a modular entity is an entity that can be broken down into sub-entities, where each sub-entity can be further broken down into sub-sub-entities. This recursion can be repeated for discrete entities until the atomic elements (minimum indivisible modules) are reached. In that case, a set of atomic elements {a ∈ D} can formally characterize a discrete entity D; a subset of atomic elements can then characterize a module M ⊆ D. The above framework applies to data modularity (Sect. 2) and model modularity (Sect. 4).
The reason is that data and models are both discrete: data samples and model parameters are stored in physical computers where everything is represented quantitatively. On the other hand, we need to use a different framework for task modularity because tasks are usually not discrete. As discussed in Sect. 3, each task can be characterized by an objective function F. In this sense, task modularity can be formally characterized by (objective) function compositions.
A task is decomposable if there exists a set of functions {f1, f2, ...} that, when composed together, retrieve the form of the original objective function F. For discrete entities, one needs to choose the atomic elements. Naively, one could choose each data sample in a dataset and each neuron in a neural network as the atomic elements. However, both choices remain open to discussion because they are not in fact the smallest indivisible modules. Regarding data modularity, the dataset division can happen both at the sample dimension and the feature dimension, which means that each data sample can be divided into smaller elements e.g., feature vectors of reduced length or image patches. Regarding model modularity, the modularization can happen at the granularity of parameters e.g., modules can be obtained by masking out parameters [26,61,134,226]. Consequently, one can choose the scalar numbers stored in physical computers (often represented by floating-point numbers) as the atomic elements. The atomic elements for data are the individual dimensions of data samples; the atomic elements for models are the individual scalar parameters of the neural network. It entails that, in some cases, there needs to be some relationship R among atomic elements {a ∈ D} because an arbitrary subset of atomic elements does not necessarily form a valid module if the relationship R is broken. In the above example, the relationship R indicates which scalar numbers should come together to form data samples or how to use each scalar parameter along the feedforward computation in the computational graph of neural networks. In consequence, an entity can be a set or a system; a system is a set equipped with relationships R among atomic elements {a ∈ D}.
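The view of a module as a subset M ⊆ D of scalar parameters, obtained by masking, can be illustrated on a single linear layer. This is a minimal sketch under stated assumptions: the weight values and the two binary masks are arbitrary, and `apply_mask` and `linear` are hypothetical helpers rather than the procedures of [26,61,134,226]. The position of each scalar in the matrix plays the role of the relationship R, since it determines how the scalar is used in the forward computation.

```python
def apply_mask(params, mask):
    """Return the sub-network obtained by zeroing out unselected parameters."""
    return [[w if keep else 0.0 for w, keep in zip(w_row, m_row)]
            for w_row, m_row in zip(params, mask)]


def linear(params, x):
    """y = W x for a list-based weight matrix W and input vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in params]


# Atomic elements: the six scalar weights of a 2x3 linear layer.
W = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0]]

# Two modules defined as complementary subsets of those scalars.
mask_a = [[1, 1, 0],
          [0, 0, 0]]
mask_b = [[0, 0, 1],
          [1, 1, 1]]

x = [1.0, 1.0, 1.0]
y_a = linear(apply_mask(W, mask_a), x)   # [3.0, 0.0]
y_b = linear(apply_mask(W, mask_b), x)   # [3.0, 15.0]
y = linear(W, x)                         # [6.0, 15.0]
```

Because the two masks partition the parameters of a linear layer, the outputs of the two modules sum to the output of the full layer; with nonlinearities between layers this additivity no longer holds in general.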
7 Future Research
As modularity is a general principle, this survey covered many elements from different sub-fields of deep learning; each sub-field can provide a lot of future avenues of research on its own. To name a few, McNeely-White et al. [159] and Bouchacourt et al. [30] showed that given the same training data, learned features exhibit similar properties across models with markedly different architectural inductive biases. Is it still worth improving neural network architectures if data dominate learning results? Future research may validate the results of McNeely-White et al. [159] and Bouchacourt et al. [30] by extending their research to more kinds of models and training datasets in a more systematic way. If these results still hold, one may need to ground these results theoretically. On the other hand, whether neural networks can learn and behave compositionally is still an open question [11,114]. It entails that we need a domain-agnostic way to test the compositionality of neural networks. Different aspects of the modularity principle can be further investigated to improve deep learning models. It boils down to designing new deep learning methods that provide e.g., better interpretability, reusability, scalability, and efficiency. While model modularity may, to some extent, reflect task modularity, it is still unclear whether data modularity directly corresponds with model modularity. One research avenue is to automate imposed data modularization regarding
specific models in the spirit of AutoML. Similarly, automating task modularization can facilitate problem-solving and reduce human-introduced bias.
8 Conclusion
Deep learning is becoming dominant in many applications, such as computer vision and natural language processing. It is natural to ask ourselves whether there are guidelines for designing deep learning algorithms. Modularity is one guiding principle that has been put forward in the literature. This survey reveals that modularity is pervasive in three related yet distinct axes of deep learning: data, task, and model architecture. We observed that some modularity concepts come in the form of a prior, while others come in the form of a posterior. The efforts of bringing the modularity principle into deep learning are not new; however, reviewing deep learning literature using the point of view of modularity is relatively new. This survey provides a step towards clarifying and investigating the notion of modularity in deep learning and elsewhere.
Acknowledgments. We gratefully acknowledge constructive feedback and suggestions from Birhanu Hailu Belay, Romain Egele, Felix Mohr, Hedi Tabia, and the reviewers. This work was supported by ChaLearn and the ANR (Agence Nationale de la Recherche, National Agency for Research) under AI chair of excellence HUMANIA, grant number ANR-19-CHIA-0022.
References
1. Accelerate Fast Math with Intel® oneAPI Math Kernel Library. https://www.intel.com/content/www/us/en/developer/tools/oneapi/onemkl.html
2. Abbas, A., Abdelsamea, M.M., Gaber, M.M.: DeTraC: transfer learning of class decomposed medical images in convolutional neural networks. IEEE Access 8, 74901–74913 (2020)
3. Abdrakhmanova, M., et al.: Speakingfaces: a large-scale multimodal dataset of voice commands with visual and thermal video streams. Sensors 21(10), 3465 (2021)
4. Abraham, W.C., Robins, A.: Memory retention - the synaptic stability versus plasticity dilemma. Trends Neurosci. 28(2), 73–78 (2005)
5. Alain, G., Bengio, Y.: Understanding intermediate layers using linear classifier probes. arXiv preprint: arXiv:1610.01644 (2016)
6. Alet, F., Lozano-Pérez, T., Kaelbling, L.P.: Modular meta-learning. arXiv:1806.10166 [cs, stat], May 2019
7. Alias Parth Goyal, A.G., et al.: Neural production systems. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Wortman Vaughan, J. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 25673–25687. Curran Associates, Inc. (2021)
8. Almeida, F., Xexéo, G.: Word embeddings: a survey, January 2019
9. Amer, M., Maul, T.: A review of modularization techniques in artificial neural networks. Artif. Intell. Rev. 52, 527–561 (2019)
10. Anderson, A., Shaffer, K., Yankov, A., Corley, C.D., Hodas, N.O.: Beyond fine tuning: a modular approach to learning on small data, November 2016
11. Andreas, J.: Measuring compositionality in representation learning. In: International Conference on Learning Representations (2019)
12. Andreas, J., Rohrbach, M., Darrell, T., Klein, D.: Neural module networks. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, pp. 39–48. IEEE, June 2016
13. Auda, G., Kamel, M.: Modular neural networks: a survey. Int. J. Neural Syst. 9(2), 129–51 (1999)
14. Avigad, J.: Modularity in mathematics. Rev. Symbolic Logic 13(1), 47–79 (2020)
15. Azam, F.: Biologically Inspired Modular Neural Networks. PhD thesis, Virginia Tech, May 2000
16. Bahdanau, D., Murty, S., Noukhovitch, M., Nguyen, T.H., de Vries, H., Courville, A.: Systematic generalization: what is required and can it be learned? In: International Conference on Learning Representations (2019)
17. Baldwin, C.Y., Clark, K.B.: Design Rules: The Power of Modularity, vol. 1, 1st edn. MIT Press, Cambridge (1999)
18. Balestriero, R., LeCun, Y.: POLICE: Provably optimal linear constraint enforcement for deep neural networks, November 2022
19. Baltrušaitis, T., Ahuja, C., Morency, L.-P.: Multimodal machine learning: a survey and taxonomy. IEEE Trans. Pattern Anal. Mach. Intell. 41(2), 423–443 (2018)
20. Bao, H., Dong, L., Piao, S., Wei, F.: BEiT: BERT pre-training of image transformers. In: International Conference on Learning Representations (2022)
21. Barham, P., et al.: Pathways: asynchronous Distributed Dataflow for ML. arXiv:2203.12533 [cs], March 2022
22. Barrett, H.C., Kurzban, R.: Modularity in cognition: framing the debate. Psychol. Rev. 113(3), 628–647 (2006)
23. Battaglia, P.W., et al.: Relational inductive biases, deep learning, and graph networks. arXiv:1806.01261 [cs, stat], October 2018
24. Bekkers, E.J., Lafarge, M.W., Veta, M., Eppenhof, K.A., Pluim, J.P., Duits, R.: Roto-translation covariant convolutional networks for medical image analysis. arXiv:1804.03393 [cs, math], June 2018
25. Belay, B., Habtegebrial, T., Liwicki, M., Belay, G., Stricker, D.: Factored convolutional neural network for amharic character image recognition. In: 2019 IEEE International Conference on Image Processing (ICIP), pp. 2906–2910 (2019)
26. Béna, G., Goodman, D.F.M.: Extreme sparsity gives rise to functional specialization. arXiv:2106.02626 [cs, q-bio], June 2021
27. Bengio, E., Bacon, P.L., Pineau, J., Precup, D.: Conditional Computation in Neural Networks for faster models. arXiv:1511.06297 [cs], January 2016
28. Bengio, Y., Leonard, N., Courville, A.: Estimating or propagating gradients through stochastic neurons for conditional computation. arXiv:1308.3432 [cs], August 2013
29. Bongard, J.: Evolving modular genetic regulatory networks. In: Proceedings of the 2002 Congress on Evolutionary Computation. CEC2002 (Cat. No.02TH8600), vol. 2, pp. 1872–1877, May 2002
30. Bouchacourt, D., Ibrahim, M., Morcos, A.: Grounding inductive biases in natural images: Invariance stems from variations in data. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 19566–19579. Curran Associates, Inc. (2021)
31. Bourbaki, N.: The architecture of mathematics. Am. Math. Mon. 57(4), 221–232 (1950)
32. Brandes, U., et al.: On modularity clustering. IEEE Trans. Knowl. Data Eng. 20(2), 172–188 (2007)
33. Braylan, A., Hollenbeck, M., Meyerson, E., Miikkulainen, R.: Reuse of neural modules for general video game playing. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 30 (2016)
34. Bromley, J., Guyon, I., LeCun, Y., Sackinger, E., Shah, R.: Signature verification using a “Siamese” time delay neural network. In: Cowan, J., Tesauro, G., Alspector, J. (eds.) Advances in Neural Information Processing Systems, vol. 6. Morgan-Kaufmann (1994)
35. Brown, T., et al.: Language models are few-shot learners. In: Larochelle, H., Ranzato, M., Hadsell, R., Balcan, M.F., Lin, H. (eds.) Advances in Neural Information Processing Systems, vol. 33, pp. 1877–1901. Curran Associates, Inc. (2020)
36. Cao, Q., Trivedi, H., Balasubramanian, A., Balasubramanian, N.: DeFormer: decomposing pre-trained transformers for faster question answering. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pp. 4487–4497. Association for Computational Linguistics, July 2020
37. Casey, R.G., Lecolinet, E.: A survey of methods and strategies in character segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 18(7), 690–706 (1996)
38. Chang, M.B., Gupta, A., Levine, S., Griffiths, T.L.: Automatically composing representation transformations as a means for generalization. In: International Conference on Learning Representations (2019)
39. Chang, M., Kaushik, S., Levine, S., Griffiths, T.: Modularity in reinforcement learning via algorithmic independence in credit assignment. In: International Conference on Machine Learning, pp. 1452–1462. PMLR, July 2021
40. Chen, S., Dobriban, E., Lee, J.H.: A group-theoretic framework for data augmentation. In: Larochelle, H., Ranzato, M., Hadsell, R., Balcan, M.F., Lin, H. (eds.) Advances in Neural Information Processing Systems, vol. 33, pp. 21321–21333. Curran Associates, Inc. (2020)
41. Chen, X., Jin, L., Zhu, Y., Luo, C., Wang, T.: Text recognition in the wild: a survey. arXiv:2005.03492 [cs], December 2020
42. Chen, X., He, K.: Exploring simple siamese representation learning. In: 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 15745–15753, Nashville, TN, USA. IEEE, June 2021
43. Chen, Y., et al.: Modular meta-learning with shrinkage. In: Advances in Neural Information Processing Systems, vol. 33, pp. 2858–2869 (2020)
44. Chevalier, G.: Long short-term memory (LSTM cell). Wikipedia, September 2022
45. Cho, K., Van Merrienboer, B., Bahdanau, D., Bengio, Y.: On the properties of neural machine translation: encoder–decoder approaches. In: Syntax, Semantics and Structure in Statistical Translation, p. 103 (2014)
46. Chollet, F.: Xception: deep learning with depthwise separable convolutions. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1251–1258 (2017)
47. Chomsky, N.: Aspects of the Theory of Syntax. MIT Press, Cambridge (1965)
48. Choudhary, A., Rishi, R., Ahlawat, S.: A new character segmentation approach for off-line cursive handwritten words. Procedia Comput. Sci. 17, 88–95 (2013)
49. Chowdhery, A., et al.: PaLM: scaling language modeling with pathways. arXiv:2204.02311 [cs], April 2022
50. Chu, B., Madhavan, V., Beijbom, O., Hoffman, J., Darrell, T.: Best practices for fine-tuning visual classifiers to new domains. In: Hua, G., J´egou, H. (eds.) ECCV 2016. LNCS, vol. 9915, pp. 435–442. Springer, Cham (2016). https://doi.org/10. 1007/978-3-319-49409-8 34 51. Clavera, I., Held, D., Abbeel, P.: Policy transfer via modularity and reward guiding. In: 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 1537–1544. IEEE (2017) 52. Clune, J., Mouret, J.-B., Lipson, H.: The evolutionary origins of modularity. Proc. R. Soc. b: Biol. Sci. 280(1755), 20122863 (2013) 53. Cohen, T., Welling, M.: Group equivariant convolutional networks. arXiv:1602.07576 [cs, stat], June 2016 54. Cohen, T.S., Welling, M.: Steerable CNNs. arXiv:1612.08498 [cs, stat], December 2016 55. Cohen-Boulakia, S., et al.: Scientific workflows for computational reproducibility in the life sciences: status, challenges and opportunities. Futur. Gener. Comput. Syst. 75, 284–298 (2017) 56. CColas, C., Fournier, P., Chetouani, M., Sigaud, O., Oudeyer, P.Y.: Curious: Intrinsically motivated modular multi-goal reinforcement learning. In: International Conference on Machine Learning, pp. 1331–1340. PMLR (2019) 57. Cormen, T.H., Leiserson, C.E., Rivest, R.L., Stein, C.: Introduction to Algorithms (2022) 58. Cosmides, L., Tooby, J.: Cognitive adaptations for social exchange. Undefined, 163–228 (1992) 59. Cosmides, L., Tooby, J.: Origins of domain specificity: the evolution of functional organization. In: Hirschfeld, L.A., Gelman, S.A. (eds.) Mapping the Mind: Domain Specificity in Cognition and Culture, pp. 85–116. Cambridge University Press, Cambridge (1994) 60. Csord´ as, R., Irie, K., Schmidhuber, J.: CTL++: evaluating generalization on never-seen compositional patterns of known functions, and compatibility of neural representations. In: Proceedings Conference on Empirical Methods in Natural Language Processing (EMNLP), December 2022 61. 
Csordás, R., van Steenkiste, S., Schmidhuber, J.: Are neural nets modular? inspecting functional modularity through differentiable weight masks. In: International Conference on Learning Representations (2021) 62. D’Amario, V., Sasaki, T., Boix, X.: How modular should neural module networks be for systematic generalization? In: Thirty-Fifth Conference on Neural Information Processing Systems (2021) 63. Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., Fei-Fei, L.: ImageNet: a large-scale hierarchical image database. In: 2009 IEEE Conference on Computer Vision and Pattern Recognition, pp. 248–255 (2009) 64. Devin, C., Gupta, A., Darrell, T., Abbeel, P., Levine, S.: Learning modular neural network policies for multi-task and multi-robot transfer. In: 2017 IEEE International Conference on Robotics and Automation (ICRA), pp. 2169–2176. IEEE (2017) 65. Dieleman, S., De Fauw, J., Kavukcuoglu, K.: Exploiting cyclic symmetry in convolutional neural networks. arXiv:1602.02660 [cs], May 2016 66. Ding, C., Tao, D.: Trunk-branch ensemble convolutional neural networks for video-based face recognition. IEEE Trans. Pattern Anal. Mach. Intell. 40(4), 1002–1014 (2017)
Modularity in Deep Learning: A Survey
67. Dosovitskiy, A., et al.: An image is worth 16x16 words: transformers for image recognition at scale. In: International Conference on Learning Representations (2021) 68. Du, N., et al.: Glam: efficient scaling of language models with mixture-of-experts. In: International Conference on Machine Learning, pp. 5547–5569. PMLR (2022) 69. Eastwood, C., Williams, C.K.: A framework for the quantitative evaluation of disentangled representations. In: Sixth International Conference on Learning Representations (ICLR 2018), May 2018 70. Eigen, D., Ranzato, M.A., Sutskever, I.: Learning factored representations in a deep mixture of experts. In: ICLR Workshop (2014) 71. El Baz, A., et al.: Lessons learned from the NeurIPS 2021 MetaDL challenge: backbone fine-tuning without episodic meta-learning dominates for few-shot learning image classification. In: Kiela, D., Ciccone, M., Caputo, B. (eds.) Proceedings of the NeurIPS 2021 Competitions and Demonstrations Track, volume 176 of Proceedings of Machine Learning Research, pp. 80–96. PMLR, December 2022 72. Ellefsen, K.O., Mouret, J.B., Clune, J.: Neural modularity helps organisms evolve to learn new skills without forgetting old skills. PLoS Comput. Biol. 11(4), e1004128 (2015) 73. Elsayed, G.F., Ramachandran, P., Shlens, J., Kornblith, S.: Revisiting spatial invariance with low-rank local connectivity. arXiv:2002.02959 [cs, stat], August 2020 74. Elsken, T., Metzen, J.H., Hutter, F.: Neural architecture search. pp. 69–86 75. Fedus, W., Dean, J., Zoph, B.: A review of sparse expert models in deep learning, September 2022 76. Fedus, W., Zoph, B., Shazeer, N.: Switch transformers: Scaling to trillion parameter models with simple and efficient sparsity. J. Mach. Learn. Res. 23(120), 1–39 (2022) 77. Fernando, C., et al.: PathNet: evolution channels gradient descent in super neural networks. arXiv:1701.08734 [cs], January 2017 78. Filan, D., Casper, S., Hod, S., Wild, C., Critch, A., Russell, S.: Clusterability in neural networks. 
arXiv:2103.03386 [cs], March 2021 79. Finn, C., Abbeel, P., Levine, S.: Model-agnostic meta-learning for fast adaptation of deep networks. arXiv:1703.03400 [cs], July 2017 80. Fodor, J.A.: The modularity of mind, April 1983 81. Fodor, J.A.: The Mind Doesn’t Work That Way: The Scope and Limits of Computational Psychology. MIT Press, Cambridge (2000) 82. Fodor, J.A., Pylyshyn, Z.W.: Connectionism and cognitive architecture: a critical analysis. Cognition 28(1–2), 3–71 (1988) 83. Ford, M.: Architects of Intelligence: The Truth about AI from the People Building It. Packt Publishing, Birmingham, first published: November 2018 edition (2018) 84. Frankenhuis, W.E., Ploeger, A.: Evolutionary psychology versus fodor: arguments for and against the massive modularity hypothesis. Philos. Psychol. 20(6), 687– 710 (2007) 85. French, R.: Using semi-distributed representations to overcome catastrophic forgetting in connectionist networks (1991) 86. Fuengfusin, N., Tamukoh, H.: Network with sub-networks: layer-wise detachable neural network. J. Robot., Netw. Artif. Life 7(4), 240–244 (2020) 87. Galanti, T., Wolf, L.: On the modularity of hypernetworks. arXiv:2002.10006 [cs, stat], November 2020
88. Gao, H., Ji, S.: Efficient and invariant convolutional neural networks for dense prediction. In: 2017 IEEE International Conference on Data Mining (ICDM), pp. 871–876 (2017) 89. Gatys, L.A., Ecker, A.S., Bethge, M.: Image style transfer using convolutional neural networks. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, pp. 2414–2423. IEEE, June 2016 90. Gavali, P., Banu, J.S.: Chapter 6 - deep convolutional neural network for image classification on CUDA platform. In: Sangaiah, A.K. (ed.) Deep Learning and Parallel Computing Environment for Bioengineering Systems, pp. 99–122. Academic Press (2019) 91. Gentile, P.: Theory of modularity, a hypothesis. Procedia Comput. Sci. 20 (2013) 92. Ghazi, B., Panigrahy, R., Wang, J.: Recursive sketches for modular deep learning. In: Proceedings of the 36th International Conference on Machine Learning, pp. 2211–2220. PMLR, May 2019 93. Gómez, D., Rodríguez, J.T., Yáñez, J., Montero, J.: A new modularity measure for fuzzy community detection problems based on overlap and grouping functions. Int. J. Approximate Reasoning 74, 88–107 (2016) 94. Goodfellow, I., Bengio, Y., Courville, A.: Deep Learning. MIT Press, Cambridge (2016) 95. Goodfellow, I.J., et al.: Generative adversarial networks. arXiv:1406.2661 [cs, stat], June 2014 96. Goyal, A., et al.: Recurrent independent mechanisms. In: International Conference on Learning Representations (2021) 97. Goyal, Y., Khot, T., Summers-Stay, D., Batra, D., Parikh, D.: Making the V in VQA matter: elevating the role of image understanding in visual question answering. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 6904–6913 (2017) 98. Gray, S., Radford, A., Kingma, D.P.: GPU Kernels for Block-Sparse Weights. Technical report 99. Ha, D., Dai, A., Le, Q.V.: HyperNetworks. arXiv:1609.09106 [cs], December 2016 100. 
Hacohen, G., Weinshall, D.: On the power of curriculum learning in training deep networks. In: Chaudhuri, K., Salakhutdinov, R. (eds.) Proceedings of the 36th International Conference on Machine Learning, volume 97 of Proceedings of Machine Learning Research, pp. 2535–2544. PMLR, June 2019 101. Hastie, T., Tibshirani, R., Friedman, J.: The Elements of Statistical Learning: Data Mining, Inference and Prediction, 2nd edn. Springer, Cham (2009) 102. He, J., et al.: FasterMoE: modeling and optimizing training of large-scale dynamic pre-trained models. In: Proceedings of the 27th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, pp. 120–134 (2022) 103. He, K., Chen, X., Xie, S., Li, Y., Dollar, P., Girshick, R.: Masked autoencoders are scalable vision learners. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 16000–16009 (2022) 104. He, K., Girshick, R., Dollar, P.: Rethinking ImageNet pre-training. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 4918– 4927 (2019) 105. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016) 106. Hinton, G.E., Salakhutdinov, R.R.: Reducing the dimensionality of data with neural networks. Science 313(5786), 504–507 (2006)
107. Hochreiter, S., Schmidhuber, J.: Long short-term memory. Neural Comput. 9(8), 1735–1780 (1997) 108. Hofman, M.A.: Evolution of the human brain: when bigger is better. Front. Neuroanat. 8, 15 (2014) 109. Howard, A.G., et al.: MobileNets: efficient convolutional neural networks for mobile vision applications. arXiv preprint: arXiv:1704.04861 (2017) 110. Hu, G., et al.: Deep stock representation learning: from candlestick charts to investment decisions. arXiv:1709.03803 [q-fin], February 2018 111. Hu, R., Andreas, J., Rohrbach, M., Darrell, T., Saenko, K.: Learning to reason: end-to-end module networks for visual question answering. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV) (2017) 112. Huang, J., et al.: A multiplexed network for end-to-end, multilingual OCR. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 4547–4557 (2021) 113. Huizinga, J., Clune, J., Mouret, J.B.: Evolving neural networks that are both modular and regular: Hyperneat plus the connection cost technique. In: Proceedings of the 2014 Annual Conference on Genetic and Evolutionary Computation, pp. 697–704 (2014) 114. Hupkes, D., Dankers, V., Mul, M., Bruni, E.: Compositionality decomposed: how do neural networks Generalise? J. Artif. Intell. Res. 67, 757–795 (2020) 115. Hupkes, D., et al.: State-of-the-art generalisation research in NLP: a taxonomy and review, October 2022 116. Hutter, F., Kotthoff, L., Vanschoren, J. (eds.): Automatic Machine Learning: Methods, Systems, Challenges. Springer, Cham (2019) 117. Islam, R., et al.: Discrete factorial representations as an abstraction for goal conditioned reinforcement learning, October 2022 118. Jacobs, R.A., Jordan, M.I., Barto, A.G.: Task decomposition through competition in a modular connectionist architecture: the what and where vision tasks. Cogn. Sci. 15(2), 219–250 (1991) 119. Jacobs, R.A., Jordan, M.I., Nowlan, S.J., Hinton, G.E.: Adaptive mixtures of local experts. 
Neural Comput. 3(1), 79–87 (1991) 120. Javed, K., White, M.: Meta-learning representations for continual learning. arXiv:1905.12588 [cs, stat], October 2019 121. Jin, T., Hong, S.: Split-CNN: splitting window-based operations in convolutional neural networks for memory system optimization. In: Proceedings of the Twenty-Fourth International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS 2019, New York, NY, USA, pp. 835–847. Association for Computing Machinery (2019) 122. Jing, L., Zhu, J., LeCun, Y.: Masked siamese ConvNets, June 2022 123. Jordan, M.I., Jacobs, R.A.: Hierarchical mixtures of experts and the EM algorithm. Neural Comput. 6(2), 181–214 (1994) 124. Cheng, J., Bibaut, A., van der Laan, M.: The relative performance of ensemble methods with deep convolutional neural networks for image classification. J. Appl. Stat. 45(15), 2800–2818 (2018) 125. Jurafsky, D., Martin, J.H.: Speech and Language Processing. (3rd draft ed.) (2019) 126. Kanakis, M., Bruggemann, D., Saha, S., Georgoulis, S., Obukhov, A., Van Gool, L.: Reparameterizing convolutions for incremental multi-task learning without task interference. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12365, pp. 689–707. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58565-5_41
127. Kassner, N., Tafjord, O., Schütze, H., Clark, P.: BeliefBank: adding memory to a pre-trained language model for a systematic notion of belief. In: Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pp. 8849–8861 (2021) 128. Kaur, A., Baghla, S., Kumar, S.: Study of various character segmentation techniques for handwritten off-line cursive words: a review. Int. J. Adv. Sci. Eng. Technol. 3(3), 154–158 (2015) 129. Ke, Z., Liu, B., Ma, N., Xu, H., Shu, L.: Achieving forgetting prevention and knowledge transfer in continual learning. In: Advances in Neural Information Processing Systems, vol. 34, pp. 22443–22456 (2021) 130. Devlin, J., Chang, M.W., Lee, K., Toutanova, K.: BERT: pre-training of deep bidirectional transformers for language understanding. In: Proceedings of NAACL-HLT, pp. 4171–4186 (2019) 131. Keskar, N.S., Mudigere, D., Nocedal, J., Smelyanskiy, M., Tang, P.T.P.: On large-batch training for deep learning: generalization gap and sharp minima. In: ICLR (2017) 132. Keysers, D., et al.: Measuring compositional generalization: a comprehensive method on realistic data. In: International Conference on Learning Representations (2020) 133. Kim, J., Park, Y., Kim, G., Hwang, S.J.: SplitNet: learning to semantically split deep networks for parameter reduction and model parallelization. In: Proceedings of the 34th International Conference on Machine Learning, pp. 1866–1874. PMLR, July 2017 134. Kingetsu, H., Kobayashi, K., Suzuki, T.: Neural network module decomposition and recomposition, December 2021 135. Kirsch, L., Kunze, J., Barber, D.: Modular networks: learning to decompose neural computation. In: Advances in Neural Information Processing Systems, vol. 31. Curran Associates, Inc., (2018) 136. Koh, E., Dubnov, S.: Comparison and analysis of deep audio embeddings for music emotion recognition, April 2021 137. Krishnamurthy, Y., Watkins, C.: Interpretability in gated modular neural networks. 
In: eXplainable AI Approaches for Debugging and Diagnosis (2021) 138. Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional neural networks. In: Advances in Neural Information Processing Systems, vol. 25 (2012) 139. Krueger, D., et al.: Zoneout: regularizing RNNs by randomly preserving hidden activations. In: International Conference on Learning Representations (2017) 140. Kurzweil, R.: How to Create a Mind: The Secret of Human Thought Revealed. Penguin Books, USA (2013) 141. Laenen, S., Bertinetto, L.: On episodes, prototypical networks, and few-shot learning. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Wortman Vaughan, J. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 24581–24592. Curran Associates, Inc., (2021) 142. Lake, B.M., Salakhutdinov, R., Tenenbaum, J.B.: Human-level concept learning through probabilistic program induction. Science 350(6266), 1332–1338 (2015) 143. Lake, B., Baroni, M.: Generalization without systematicity: on the compositional skills of sequence-to-sequence recurrent networks. In: International Conference on Machine Learning, pp. 2873–2882. PMLR (2018) 144. Lake, B.M.: Compositional generalization through meta sequence-to-sequence learning. arXiv:1906.05381 [cs], October 2019
145. LeCun, Y., Huang, F.J., Bottou, L.: Learning methods for generic object recognition with invariance to pose and lighting. In: Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, CVPR 2004, vol. 2, pp. II–104, June 2004 146. LeCun, Y.: A path towards autonomous machine intelligence, version 0.9.2, 2022-06-27 (2022) 147. LeCun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proc. IEEE 86(11), 2278–2324 (1998) 148. LeCun, Y., Denker, J., Solla, S.: Optimal brain damage. In: Advances in Neural Information Processing Systems, vol. 2 (1989) 149. Li, M., Vitányi, P.: An Introduction to Kolmogorov Complexity and Its Applications, 3rd edn. Springer Publishing Company, Incorporated, Cham (2008) 150. Li, N., Liu, S., Liu, Y., Zhao, S., Liu, M.: Neural speech synthesis with transformer network. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 33, pp. 6706–6713 (2019) 151. Li, Z., Wu, B., Liu, Q., Wu, L., Zhao, H., Mei, T.: Learning the compositional visual coherence for complementary recommendations. In: Bessiere, C. (ed.) Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, IJCAI-20, pp. 3536–3543. International Joint Conferences on Artificial Intelligence Organization, July 2020 152. Liu, H., Simonyan, K., Vinyals, O., Fernando, C., Kavukcuoglu, K.: Hierarchical representations for efficient architecture search. In: International Conference on Learning Representations (2018) 153. Loula, J., Baroni, M., Lake, B.M.: Rearranging the familiar: testing compositional generalization in recurrent networks. In: BlackboxNLP@EMNLP, pp. 108–114 (2018) 154. Ma, J., Cui, P., Kuang, K., Wang, X., Zhu, W.: Disentangled graph convolutional networks. In: Chaudhuri, K., Salakhutdinov, R. (eds.) Proceedings of the 36th International Conference on Machine Learning, volume 97 of Proceedings of Machine Learning Research, pp. 
4212–4221. PMLR, June 2019 155. Maninis, K.K., Radosavovic, I., Kokkinos, I.: Attentive single-tasking of multiple tasks. In: 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, pp. 1851–1860. IEEE, June 2019 156. Abadi, M., et al.: TensorFlow: large-scale machine learning on heterogeneous systems (2015) 157. Masse, N.Y., Grant, G.D., Freedman, D.J.: Alleviating catastrophic forgetting using context-dependent gating and synaptic stabilization. Proc. Natl. Acad. Sci. 115(44), E10467–E10475 (2018) 158. Mazzia, V., Salvetti, F., Chiaberge, M.: Efficient-CapsNet: capsule network with self-attention routing. Sci. Rep. 11(1), 1–13 (2021) 159. McNeely-White, D., Beveridge, J.R., Draper, B.A.: Inception and ResNet features are (almost) equivalent. Cogn. Syst. Res. 59, 312–318 (2020) 160. Meng, K., Bau, D., Andonian, A., Belinkov, Y.: Locating and editing factual associations in GPT, February 2022 161. Meyerson, E., Miikkulainen, R.: Modular universal reparameterization: Deep multi-task learning across diverse domains. In: Advances in Neural Information Processing Systems, vol. 32 (2019) 162. Mitchell, E., Lin, C., Bosselut, A., Finn, C., Manning, C.D.: Fast model editing at scale. arXiv:2110.11309 [cs], October 2021 163. Mitchell, E., Lin, C., Bosselut, A., Manning, C.D., Finn, C.: Memory-based model editing at scale. In: International Conference on Machine Learning (2022)
164. Mittal, S., Bengio, Y., Lajoie, G.: Is a modular architecture enough? (2022) 165. Mittal, S., Raparthy, S.C., Rish, I., Bengio, Y., Lajoie, G.: Compositional attention: disentangling search and retrieval. In: International Conference on Learning Representations (2022) 166. Mnih, V., et al.: Asynchronous methods for deep reinforcement learning. arXiv e-prints, arXiv:1602.01783, February 2016 167. Modrak, V., Soltysova, Z.: Development of the modularity measure for assembly process structures. Math. Probl. Eng. 2021, e4900748 (2021) 168. Muff, S., Rao, F., Caflisch, A.: Local modularity measure for network clusterizations. Phys. Rev. E 72(5), 056107 (2005) 169. Murty, S., Sharma, P., Andreas, J., Manning, C.D.: Characterizing intrinsic compositionality in transformers with tree projections, November 2022 170. Newman, M.E.J.: Modularity and community structure in networks. Proc. Natl. Acad. Sci. 103(23), 8577–8582 (2006) 171. Opitz, M., Possegger, H., Bischof, H.: Efficient model averaging for deep neural networks. In: Lai, S.-H., Lepetit, V., Nishino, K., Sato, Y. (eds.) ACCV 2016. LNCS, vol. 10112, pp. 205–220. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-54184-6_13 172. Ostapenko, O., Rodriguez, P., Caccia, M., Charlin, L.: Continual learning via local module composition. In: Advances in Neural Information Processing Systems, vol. 34, pp. 30298–30312 (2021) 173. Ostapenko, O., Rodriguez, P., Lacoste, A., Charlin, L.: Attention for compositional modularity. In: NeurIPS 2022 Workshop on All Things Attention: Bridging Different Perspectives on Attention (2022) 174. Pan, R., Rajan, H.: On decomposing a deep neural network into modules. In: Proceedings of the 28th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, ESEC/FSE 2020, New York, NY, USA, pp. 889–900. Association for Computing Machinery (2020) 175. 
Pan, R., Rajan, H.: Decomposing convolutional neural networks into reusable and replaceable modules. In: Proceedings of The 44th International Conference on Software Engineering (ICSE 2022), December 2021 176. Parascandolo, G., Kilbertus, N., Rojas-Carulla, M., Schölkopf, B.: Learning independent causal mechanisms. In: International Conference on Machine Learning, pp. 4036–4044. PMLR (2018) 177. Parnas, D.L.: On the criteria to be used in decomposing systems into modules. Commun. ACM 15(12), 1053–1058 (1972) 178. Paszke, A., et al.: PyTorch: an imperative style, high-performance deep learning library. In: Wallach, H., Larochelle, H., Beygelzimer, A., d’Alché-Buc, F., Fox, E., Garnett, R. (eds.) Advances in Neural Information Processing Systems, vol. 32, pp. 8024–8035. Curran Associates, Inc., (2019) 179. Pathak, D., Lu, C., Darrell, T., Isola, P., Efros, A.A.: Learning to control self-assembling morphologies: a study of generalization via modularity. In: Advances in Neural Information Processing Systems, vol. 32 (2019) 180. Pereira-Leal, J.B., Levy, E.D., Teichmann, S.A.: The origins and evolution of functional modules: lessons from protein complexes. Philos. Trans. R. Soc. B: Biol. Sci. 361(1467), 507–517 (2006) 181. Peters, J., Janzing, D., Schölkopf, B.: Elements of Causal Inference: Foundations and Learning Algorithms. Adaptive Computation and Machine Learning Series. MIT Press, Cambridge (2017)
182. Poisot, T.: An a posteriori measure of network modularity. F1000Research 2, 130 (2013) 183. Ponti, E.: Inductive Bias and Modular Design for Sample-Efficient Neural Language Learning. PhD thesis, University of Cambridge (2021) 184. Ponti, E.M., Sordoni, A., Bengio, Y., Reddy, S.: Combining modular skills in multitask learning, March 2022 185. Purushwalkam, S., Nickel, M., Gupta, A., Ranzato, M.A.: Task-driven modular networks for zero-shot compositional learning. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 3593–3602 (2019) 186. Pylyshyn, Z.: Is vision continuous with cognition?: The case for cognitive impenetrability of visual perception. Behav. Brain Sci. 22(3), 341–365 (1999) 187. Qiao, J.-F., Meng, X., Li, W.-J., Wilamowski, B.M.: A novel modular RBF neural network based on a brain-like partition method. Neural Comput. Appl. 32(3), 899–911 (2020) 188. Rahaman, N., et al.: Dynamic inference with neural interpreters. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 10985–10998. Curran Associates, Inc., (2021) 189. Ramachandran, P., Le, Q.V.: Diversity and depth in per-example routing models. In: International Conference on Learning Representations (2019) 190. Ranganathan, G., et al.: A study to find facts behind preprocessing on deep learning algorithms. J. Innov. Image Process. (JIIP) 3(01), 66–74 (2021) 191. Ravi, S., Larochelle, H.: Optimization as a model for few-shot learning. In: International Conference on Learning Representations (2017) 192. Reisinger, J., Stanley, K.O., Miikkulainen, R.: Evolving reusable neural modules. In: Deb, K. (ed.) GECCO 2004. LNCS, vol. 3103, pp. 69–81. Springer, Heidelberg (2004). https://doi.org/10.1007/978-3-540-24855-2_7 193. Ren, P., et al.: A survey of deep active learning. ACM Comput. Surv. (CSUR) 54(9), 1–40 (2021) 194. 
Ridgeway, K., Mozer, M.C.: Learning deep disentangled embeddings with the F-statistic loss. In: Advances in Neural Information Processing Systems, vol. 31. Curran Associates, Inc., (2018) 195. Robbins, P.: Modularity of mind. In: Zalta, E.N. (ed.) The Stanford Encyclopedia of Philosophy. Metaphysics Research Lab, Stanford University, winter 2017 edition (2017) 196. Rose, J.S.: A Course on Group Theory. Courier Corporation, Massachusetts (1994) 197. Rosenbaum, C., Cases, I., Riemer, M., Klinger, T.: Routing networks and the challenges of modular and compositional computation, April 2019 198. Rosenbaum, C., Klinger, T., Riemer, M.: Routing networks: adaptive selection of non-linear functions for multi-task learning. In: International Conference on Learning Representations (2018) 199. Ruder, S.: An overview of gradient descent optimization algorithms. arXiv preprint: arXiv:1609.04747 (2016) 200. Rumelhart, D.E., Hinton, G.E., Williams, R.J.: Learning internal representations by error propagation. Technical report, California Univ San Diego La Jolla Inst for Cognitive Science (1985) 201. Russakovsky, O., et al.: ImageNet large scale visual recognition challenge. Int. J. Comput. Vis. 115, 211–252 (2015) 202. Rusu, A.A., et al.: Progressive neural networks. arXiv:1606.04671 [cs], September 2016
203. Salha-Galvan, G., Lutzeyer, J.F., Dasoulas, G., Hennequin, R., Vazirgiannis, M.: Modularity-aware graph autoencoders for joint community detection and link prediction, June 2022 204. Schenkel, M., Weissman, H., Guyon, I., Nohl, C., Henderson, D.: Recognition-based segmentation of on-line hand-printed words. In: Hanson, S., Cowan, J., Giles, C. (eds.) Advances in Neural Information Processing Systems, vol. 5. Morgan-Kaufmann (1992) 205. Schilling, M.: Toward a general modular systems theory and its application to interfirm product modularity. Acad. Manag. Rev. 25 (2000) 206. Schmidhuber, J.: Towards compositional learning in dynamic networks (1990) 207. Schmidt, A.L., Bandar, Z.U.: Modularity - a concept for new neural network architectures. November 2001 208. Shao, Y., Zavala, V.M.: Modularity measures: concepts, computation, and applications to manufacturing systems. AIChE J. 66(6), e16965 (2020) 209. Shazeer, N., et al.: Outrageously large neural networks: the sparsely-gated mixture-of-experts layer. In: 5th International Conference on Learning Representations, ICLR 2017, Toulon, France, April 24-26, 2017, Conference Track Proceedings. OpenReview.net (2017) 210. Shi, B., Bai, X., Yao, C.: Script identification in the wild via discriminative convolutional neural network. Pattern Recogn. 52, 448–458 (2016) 211. Shin, H., Lee, J.K., Kim, J., Kim, J.: Continual learning with deep generative replay. In: Guyon, I., Luxburg, U.V., Bengio, S., Wallach, H., Fergus, R., Vishwanathan, S., Garnett, R. (eds.) Advances in Neural Information Processing Systems, vol. 30. Curran Associates, Inc., (2017) 212. Shiokawa, H., Fujiwara, Y., Onizuka, M.: Fast algorithm for modularity-based graph clustering. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 27, pp. 1170–1176 (2013) 213. Sifre, L.: Rigid-Motion Scattering for Image Classification. PhD thesis (2014) 214. 
Silver, D., et al.: Mastering the game of go with deep neural networks and tree search. Nature 529(7587), 484–489 (2016) 215. Silver, D., et al.: Mastering the game of go without human knowledge. Nature 550(7676), 354–359 (2017) 216. Simard, P.Y., Steinkraus, D., Platt, J.C.: Best practices for convolutional neural networks applied to visual document analysis. In: Seventh International Conference on Document Analysis and Recognition, Proceedings , pp. 958–963, August 2003 217. Simon, H.A.: The architecture of complexity. Proc. Am. Philos. Soc. 106(6), 467– 482 (1962) 218. Simon, H.A., Ando, A.: Aggregation of variables in dynamic systems. Econometrica 29(2), 111–138 (1961) 219. Simpkins, C., Isbell, C.: Composable modular reinforcement learning. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 33, pp. 4975–4982 (2019) 220. Sinitsin, A., Plokhotnyuk, V., Pyrkin, D., Popov, S., Babenko, A.: Editable neural networks. In: International Conference on Learning Representations (2019) 221. Smith, S.L., Kindermans, P.J., Ying, C., Le, Q.V.: Don’t decay the learning rate, increase the batch size. In: International Conference on Learning Representations (2018) 222. Smith, S., et al.: Using DeepSpeed and Megatron to train Megatron-Turing NLG 530B, a large-scale generative language model, February 2022
223. Snell, J., Swersky, K., Zemel, R.: Prototypical networks for few-shot learning. arXiv:1703.05175 [cs, stat], June 2017 224. Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., Salakhutdinov, R.: Dropout: a simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 15(56), 1929–1958 (2014) 225. Sun, C., Shrivastava, A., Singh, S., Gupta, A.: Revisiting unreasonable effectiveness of data in deep learning era. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 843–852 (2017) 226. Sun, G., et al.: Task switching network for multi-task learning. In: 2021 IEEE/CVF International Conference on Computer Vision (ICCV), Montreal, QC, Canada, pp. 8271–8280. IEEE, October 2021 227. Sun, H., Tu, W.W., Guyon, I.: OmniPrint: a configurable printed character synthesizer. In: Thirty-Fifth Conference on Neural Information Processing Systems Datasets and Benchmarks Track (Round 1) (2021) 228. Sutton, R.S., Barto, A.G.: Reinforcement Learning: An Introduction, 2nd edn. The MIT Press, Cambridge (2018) 229. Szegedy, C., Ioffe, S., Vanhoucke, V., Alemi, A.: Inception-v4, Inception-ResNet and the impact of residual connections on learning. In: Thirty-First AAAI Conference on Artificial Intelligence (2017) 230. Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., Wojna, Z.: Rethinking the inception architecture for computer vision. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2818–2826 (2016) 231. Tarvainen, A., Valpola, H.: Mean teachers are better role models: weight-averaged consistency targets improve semi-supervised deep learning results. arXiv:1703.01780 [cs, stat], April 2018 232. Teerapittayanon, S., McDanel, B., Kung, H.T.: BranchyNet: fast inference via early exiting from deep neural networks. In: 2016 23rd International Conference on Pattern Recognition (ICPR), pp. 2464–2469 (2016) 233. Terekhov, A.V., Montone, G., O’Regan, J.K.: Knowledge transfer in deep block-modular neural networks. 
In: Wilson, S.P., Verschure, P.F.M.J., Mura, A., Prescott, T.J. (eds.) LIVINGMACHINES 2015. LNCS (LNAI), vol. 9222, pp. 268–279. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-22979-9_27 234. Tishby, N., Zaslavsky, N.: Deep learning and the information bottleneck principle. In: 2015 IEEE Information Theory Workshop (ITW), pp. 1–5 (2015) 235. Triantafillou, E., et al.: Meta-dataset: a dataset of datasets for learning to learn from few examples. In: International Conference on Learning Representations (2019) 236. Ullah, I., et al.: Meta-album: multi-domain meta-dataset for few-shot image classification (2022) 237. Vankov, I.I., Bowers, J.S.: Training neural networks to encode symbols enables combinatorial generalization. Philos. Trans. R. Soc. B 375(1791), 20190309 (2020) 238. Vaswani, A., et al.: Attention is all you need. In: Guyon, I., et al. (eds.) Advances in Neural Information Processing Systems, vol. 30. Curran Associates, Inc., (2017) 239. Veniat, T., Denoyer, L., Ranzato, M.A.: Efficient continual learning with modular networks and task-driven priors. In: 9th International Conference on Learning Representations, ICLR 2021 (2021) 240. Von Luxburg, U., Williamson, R.C., Guyon, I.: Clustering: science or art? In: Proceedings of ICML Workshop on Unsupervised and Transfer Learning. JMLR Workshop and Conference Proceedings, pp. 65–79, June 2012 241. Wagner, G.P., Altenberg, L.: Perspective: complex adaptations and the evolution of evolvability. Evolution 50(3), 967–976 (1996)
242. Wang, H., Zhao, H., Li, B.: Bridging multi-task learning and meta-learning: towards efficient training and effective adaptation. In: International Conference on Machine Learning, pp. 10991–11002. PMLR (2021) 243. Wang, J., Sezener, E., Budden, D., Hutter, M., Veness, J.: A combinatorial perspective on transfer learning. In: Larochelle, H., Ranzato, M., Hadsell, R., Balcan, M.F., Lin, H. (eds.) Advances in Neural Information Processing Systems, vol. 33, pp. 918–929. Curran Associates, Inc., (2020) 244. Wang, R., Pontil, M., Ciliberto, C.: The role of global labels in few-shot classification and how to infer them. In: Advances in Neural Information Processing Systems, vol. 34, pp. 27160–27170 (2021) 245. Watanabe, C., Hiramatsu, K., Kashino, K.: Modular representation of layered neural networks. Neural Netw. 97, 62–73 (2018) 246. Weiler, M., Cesa, G.: General $E(2)$-equivariant steerable CNNs. arXiv:1911.08251 [cs, eess], April 2021 247. Weiler, M., Hamprecht, F.A., Storath, M.: Learning steerable filters for rotation equivariant CNNs. arXiv:1711.07289 [cs], March 2018 248. Worrall, D.E., Garbin, S.J., Turmukhambetov, D., Brostow, G.J.: Harmonic networks: deep translation and rotation equivariance. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 5028–5037 (2017) 249. Wu, L., et al.: Learning the implicit semantic representation on graph-structured data. In: Jensen, C.S., et al. (eds.) DASFAA 2021. LNCS, vol. 12681, pp. 3–19. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-73194-6_1 250. Wu, Y., Mansimov, E., Grosse, R.B., Liao, S., Ba, J.: Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation. arXiv e-prints, arXiv:1708.05144, August 2017 251. Xie, S., Girshick, R., Dollar, P., Tu, Z., He, K.: Aggregated residual transformations for deep neural networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), July 2017 252. 
Xie, S., Kirillov, A., Girshick, R., He, K.: Exploring randomly wired neural networks for image recognition. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), October 2019 253. Xiong, C., Zhao, X., Tang, D., Jayashree, K., Yan, S., Kim, T.K.: Conditional convolutional neural network for modality-aware face recognition. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 3667–3675 (2015) 254. Yalniz, I.Z., Jegou, H., Chen, K., Paluri, M., Mahajan, D.: Billion-scale semisupervised learning for image classification. CoRR, abs/1905.00546 (2019) 255. Yang, S., Yu, X., Zhou, Y.: LSTM and GRU neural network performance comparison study: taking yelp review dataset as an example. In: 2020 International Workshop on Electronic Communication and Artificial Intelligence (IWECAI), pp. 98–101 (2020) 256. Yao, B., Walther, D., Beck, D., Fei-Fei, L.: Hierarchical mixture of classification experts uncovers interactions between brain regions. In: Bengio, Y., Schuurmans, D., Lafferty, J., Williams, C., Culotta, A. (eds.) Advances in Neural Information Processing Systems, vol. 22. Curran Associates, Inc., (2009) 257. Ying, C., Klein, A., Christiansen, E., Real, E., Murphy, K., Hutter, F.: NASbench-101: towards reproducible neural architecture search. In: International Conference on Machine Learning, pp. 7105–7114. PMLR (2019) 258. Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? arXiv:1411.1792 [cs], November 2014 259. Yu, J., Yang, L., Xu, N., Yang, J., Huang, T.L.: Slimmable neural networks. In: International Conference on Learning Representations (2019)
260. Yu, L., et al.: MAttNet: modular attention network for referring expression comprehension. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1307–1315 (2018) 261. Yu, T., Kumar, S., Gupta, A., Levine, S., Hausman, K., Finn, C.: Gradient surgery for multi-task learning. In: Advances in Neural Information Processing Systems, vol. 33, pp. 5824–5836. Curran Associates, Inc., (2020) 262. Zagoruyko, S., Komodakis, N.: Wide residual networks. arXiv preprint: arXiv:1605.07146 (2016) 263. Zaidi, J., Boilard, J., Gagnon, G., Carbonneau, M.-A.: Measuring disentanglement: a review of metrics. arXiv:2012.09276 [cs], January 2021 264. Zhang, Q., Yang, Y., Yu, Q., Wu, Y.N.: Network transplanting. arXiv:1804.10272 [cs, stat], December 2018 265. Zhang, Y., Yang, Q.: A survey on multi-task learning. IEEE Trans. Knowl. Data Eng. (2021) 266. Zhou, A., Knowles, T., Finn, C.: Meta-learning symmetries by reparameterization. arXiv:2007.02933 [cs, stat], October 2020 267. Zhou, T., Wang, S., Bilmes, J.A.: Diverse ensemble evolution: curriculum datamodel marriage. In: Bengio, S., Wallach, H., Larochelle, H., Grauman, K., CesaBianchi, N., Garnett, R. (eds.) Advances in Neural Information Processing Systems, vol. 31. Curran Associates, Inc., (2018) 268. Zhou, Z.-H.: Ensemble Methods: Foundations and Algorithms. CRC Press, Boca Raton (2012) 269. Zhu, J.Y., Park, T., Isola, P., Efros, A.A.: Unpaired image-to-image translation using cycle-consistent adversarial networks. In: Computer Vision (ICCV), 2017 IEEE International Conference On (2017) 270. Zoph, B., Vasudevan, V., Shlens, J., Le, Q.V.: Learning transferable architectures for scalable image recognition. arXiv:1707.07012 [cs, stat], April 2018
Improving Distributed Caching Using Reinforcement Learning
Ashwin Alaparthi, K. Shriprajwal(B), J. S. Sooraj, M. S. Suraj, and T. S. B. Sudarshan
Department of Computer Science and Engineering, PES University, Bangalore 5606085, India
[email protected]
Abstract. With the increase in the number of edge devices and the exponential rise in data collection, caching has gained importance as a way to ensure fast access to required data while reducing the load on the servers and the backhaul links. One way to do this efficiently is through cooperative caching. Cooperative caching allows the nodes to share data among themselves, reducing the load on the central servers and providing faster retrieval of requested data. The existing methods for placing data in machine caches are deterministic and have well-defined limits on the gain that can be achieved. Reinforcement learning methods are stochastic in nature and are not constrained by these limits. The method proposed in this paper circumvents the gain limit of traditional deterministic methods in distributed caching by using reinforcement learning algorithms, and can achieve a hit rate of up to 86%.
Keywords: Reinforcement Learning · Distributed Caching · Cooperative Caching
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 K. Arai (Ed.): SAI 2023, LNNS 739, pp. 596–610, 2023. https://doi.org/10.1007/978-3-031-37963-5_41
1 Introduction
With the increase in the number of devices connected to the internet, from the growing number of smartphone users to the massive spike in the Internet of Things, one of the biggest challenges is handling a large number of requests while ensuring that end users get a response in a reasonable amount of time. There are multiple ways to improve the underlying network so that data can be accessed faster. The bandwidth of the network can be improved by upgrading the links so that data is transmitted faster, but this is not pragmatic, as it can be very expensive to upgrade every link in the network. The computational power of the servers would also need to increase to handle the growing load. While this can be handled through scalable cloud architecture [1, 2], it also leads to higher expense.
An alternative is the use of caching techniques. This is a much cheaper and more feasible alternative, as it drastically reduces the total response time while reducing the load on the system. Caching is the method of storing a copy of a file locally on a system closer to the edge devices, so that the system does not need to fetch the file again for a new request if the stored data has been unchanged since the last request. This avoids network latency by placing frequently requested data closer to the edge devices and allows the user near-instantaneous access to the requested file. Caching also helps in reducing the number of accesses to secondary memory [3].
Content delivery and placement are significant limitations for many applications that distribute data such as media content, including audio and video streaming services and file distribution via cloud storage. The problem of availability and decreased response times shifts the focus from the performance of the centralized server system to the availability of data in the cache. Since cache memory is limited, it is important to use optimal cache management algorithms to ensure that the maximum number of user requests are met at the cache layer.
Distributed caching is a solution for bringing the data closer to the edge nodes. It is an extension of traditional caching methods in which the cache is stored across multiple servers, so that the system scales horizontally in size and transactional capacity. An improvement on distributed caching is cooperative caching, in which data is shared and coordinated among the different nodes in the environment. This sharing of data results in a higher hit rate while also reducing the amount of data transferred from the central servers.
The Optimal Content Placement Problem has been proved to be NP-hard [4]. Traditional distributed caching algorithms, such as the greedy placement algorithms used to determine what data needs to be cached, have a well-defined limit on the hit rate that can be gained through caching [5]. This can become a major limitation as the number of devices in the system increases. Reinforcement Learning (RL) algorithms can be used to circumvent these limits, as no theoretical limit has been proven for these algorithms. They may be used to search for an optimal caching policy that performs better than traditional caching algorithms.
In this work, we propose using reinforcement learning algorithms to achieve a hit rate of up to 86%, higher than that of the classical algorithms considered. The rest of the paper is divided into seven sections. Section 2 provides a more detailed description of the problem statement. Section 3 gives a brief overview of existing literature relevant to the field. Sections 4 and 5 go over the solution and its implementation details, such as the algorithms and workloads used. Section 6 collates and discusses the results of this work and Sect. 7 compares them with a few existing studies. Section 8 concludes the paper with a summary and future work.
2 Problem Statement
Since it has been proven that classical algorithms have a well-defined limit on the gain that can be achieved through caching, we propose using reinforcement learning algorithms to determine caching policies, evaluate them on the Zipf distribution, and compare them with classical algorithms. Here, we create and train an agent that can optimally place and move content from one node to another, taking into account different rates of traffic, node sizes, and types. For training the agents, popular reinforcement learning algorithms are used, such as Deep Deterministic Policy Gradient (DDPG) [6], Advantage Actor-Critic (A2C), and Proximal Policy Optimization (PPO) [7], and their performance when used for cache placement in distributed caching is measured. We propose a cooperative version of these reinforcement learning agents and compare their performance with classical algorithms.
The focus of this work is limited to finding methods to determine the optimal caching policy, and it does not focus on the details of individual files, underlying cache hierarchy, and hardware configurations of the different elements simulated in the system.
3 Literature Survey
The work by M. A. Maddah-Ali et al. [5] proposes a novel coded caching scheme that exploits both local and global caching gains, leading to a multiplicative improvement over previously known schemes, and is within a constant factor of the information-theoretic optimum. The caching problem is divided into two phases: the placement phase, in which files are pushed into the users' local caches, and the delivery phase, in which the users request files. From a set of theorems, it is proved that no deterministic caching scheme can improve on the memory-rate tradeoff of the proposed coded caching scheme by more than a factor of 12.
The work by J. Xiong et al. [4] tackles the problem of optimal caching in converged networks by caching content at a maximum assumed distance of 2 hops. After proving that caching at the routers to improve equivalent throughput is an NP-hard problem, it uses a reinforcement learning-based approach to solve it. Here, the DDPG algorithm gave good results for the metrics considered, with better results when the user distribution is non-uniform. Classical approaches to distributed caching can sometimes involve learning the popularity of files, as shown in [8].
The work by Y. Zhang et al. [9] presents a cooperative edge caching algorithm based on deep reinforcement learning to deal with the large amount of data being generated. They consider a multi-agent actor-critic algorithm in which each edge server makes a caching decision locally. The caching decisions are optimized with the help of a centralized server that evaluates each action based on the global caching state of the system. Their solution is benchmarked against various classical algorithms, and the cooperative-MAAC algorithm was found to perform best at reducing backhaul traffic.
The work by T. Qin et al. [10] tackles the file-bundling caching problem, in which a certain number of queries require a response in a serial manner. It shows that the classic LRU algorithm performs very well even in such a setting, comparable to the FF algorithm. The work also proposes an online, randomized variant of the marking algorithm for the same problem, which can be used in both central and distributed settings and is more practically feasible.
Another work that uses reinforcement learning in the domain of distributed caching is that of L. Lu et al. [11]. The authors propose that user requests can be modeled by a Markov process, which makes reinforcement learning a viable method for the caching problem. They state that using Q-learning and Q-learning with value function approximation improves the result while requiring only a reasonable amount of training time. They then use simulations to test reinforcement learning in such a setting and report their findings.
Caching Salon [12] provides a broad overview of the important aspects of caching algorithms. It also gives an introduction to different classical and learning-based caching algorithms.
It can be observed from existing literature that using reinforcement learning as a solution to the problem of distributed content caching is an underexplored domain. Most solutions tend to use the classical approach, or to focus on using one specific reinforcement learning algorithm (DDPG) to train agents. The effects of varying the conditions of the environment on the hit rate of the solutions are unknown. The environments are assumed to be homogeneous, and the effect of adding non-caching nodes to the environment has not been benchmarked. It is known that increasing the Zipf factor results in a better hit rate, but the exact effect is not available in existing literature.
4 System Design Overview
Generally, work on reinforcement learning consists of two parts: the agent(s) and the environment(s). The agent interacts with the environment and takes an action based on the observations it receives from the environment. The environment calculates the effect of the agent's action and returns a reward to the agent. The agent then fine-tunes its future actions based on the reward.
Environment
The environment is built using the OpenAI Gym [13] framework. OpenAI Gym has a collection of built-in environments for a wide range of applications, which can be used to test the performance of new reinforcement learning algorithms. It can also be used to create third-party custom environments to suit various requirements. In this work, a custom training environment is built specifically to train reinforcement learning agents for content placement in distributed caching systems. A wide range of agents was tested on it to measure their performance in content placement.
Agent
Various RL algorithms are used to train agents for the proposed Gym environment. The Stable Baselines library is used to build these agents. Stated below are a few of the algorithms used in this work to build RL agents:
• Advantage Actor Critic (A2C): This algorithm is considered for training as it is one of the most common RL algorithms available and promises the best results.
• Proximal Policy Optimization (PPO): This algorithm is considered for training as it keeps each new policy at a minimum deviation from the previous policy at a low cost, making it ideal for the proposed system since it ensures that the agent learns from recent requests. PPO is used as a baseline in reinforcement learning-related literature and thus was benchmarked [7].
• Deep Deterministic Policy Gradient (DDPG): This algorithm is used extensively in the literature on using reinforcement learning to improve the caching decisions of distributed caching systems. Many researchers claim that DDPG performed best for their workload. Thus, DDPG has been chosen here as an algorithm whose performance is benchmarked [4, 6].
Workloads and Benchmarking
From the literature, it is known that user requests follow the Zipf distribution. Therefore, we use this distribution to generate the workloads on which the reinforcement learning agents are trained. However, most Zipf implementations consider one file per request, which does not suit the purposes of file-bundling caching. So, a modified Zipf function has been implemented to suit the requirements. The hit rate of the trained RL agents is compared with that of classical algorithms like Least Recently Used (LRU) and First In First Out (FIFO) to gauge the performance of the proposed approach.
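The two classical baselines named above can be sketched for a single node as follows. This is a minimal illustration and not the implementation used in this work; the function names and the example trace are hypothetical, files are assumed to be of equal size, and the cache size is assumed to be at least 1.

```python
from collections import OrderedDict, deque


def lru_hit_rate(requests, cache_size):
    """Fraction of requests served from an LRU cache holding `cache_size` files."""
    cache = OrderedDict()  # keys ordered from least to most recently used
    hits = 0
    for f in requests:
        if f in cache:
            hits += 1
            cache.move_to_end(f)  # mark as most recently used
        else:
            if len(cache) >= cache_size:
                cache.popitem(last=False)  # evict the least recently used file
            cache[f] = True
    return hits / len(requests)


def fifo_hit_rate(requests, cache_size):
    """Fraction of requests served from a FIFO cache holding `cache_size` files."""
    queue = deque()  # insertion order, oldest file on the left
    cached = set()
    hits = 0
    for f in requests:
        if f in cached:
            hits += 1  # a hit does not change the eviction order
        else:
            if len(queue) >= cache_size:
                cached.discard(queue.popleft())  # evict the oldest file
            queue.append(f)
            cached.add(f)
    return hits / len(requests)


trace = [1, 2, 3, 1, 4, 1, 2]  # hypothetical request trace of file ids
print(lru_hit_rate(trace, 3), fifo_hit_rate(trace, 3))
```

On this short trace LRU keeps file 1 because it was recently reused, whereas FIFO evicts it as the oldest entry, which is why the two policies can differ even at equal cache size.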
5 Proposed Methodology
5.1 Building Custom Environments
The environment is used to simulate a distributed caching system and contains information on the number of nodes present, the cache size of each node, and the types of files that can be requested. For this work, the approach is to design and implement a custom environment that encompasses all the relevant details of a distributed caching system. The action space consists of all possible actions that an agent can perform, while the observation space consists of all possible observations that the agent can infer. In this case, the action space of the agent represents the set of all possible file placements across all nodes in the system. The observation space is the current state of the distributed caching system.
Two different reward functions were used while training the RL agents. The first reward function gives the agent a reward of +1 when its action leads to a cache hit. The second reward function gives the agent a reward of +3 when an action leads to a cache hit and a penalty of −1 when an action leads to a cache miss. The first reward function gave better results than the second, so it was used to train the agents.
A version of the environment that supports cooperative caching has also been implemented. Cooperative caching is a technique that coordinates the caches of multiple systems to increase overall efficiency. The effect of using cooperative caching along with reinforcement learning in making the caching decisions is also benchmarked for distributed caching systems. It is assumed that a circular connection exists between the nodes in the network of the environment. A hit is registered if the file requested is present either in the node that received the request or in nodes that are 2 hops away from the node.
A heterogeneous environment has also been implemented, in which the network consists of caching and non-caching nodes. Each non-caching node uses a different Zipf distribution to make requests to the caching nodes present in the network.
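The hit rule and the two reward functions described above can be sketched as follows. This is an illustration under stated assumptions rather than the environment code used in this work: all names are hypothetical, "2 hops away" is read as "within 2 hops" on the ring, and the first reward function is assumed to return 0 on a miss since the text only specifies its value on a hit.

```python
def ring_distance(a, b, num_nodes):
    """Hop count between nodes a and b on a ring of num_nodes nodes."""
    d = abs(a - b) % num_nodes
    return min(d, num_nodes - d)


def is_hit(caches, node, file_id, max_hops=0):
    """True if file_id is cached at `node` or at a node within max_hops of it.

    caches: list of sets, where caches[i] holds the file ids stored at node i.
    max_hops=0 models the non-cooperative case (only the local cache counts).
    """
    n = len(caches)
    return any(
        file_id in caches[other]
        for other in range(n)
        if ring_distance(node, other, n) <= max_hops
    )


def reward_hit_only(hit):
    """First reward function: +1 on a hit (0 on a miss is an assumption)."""
    return 1 if hit else 0


def reward_hit_miss(hit):
    """Second reward function: +3 on a hit, -1 on a miss."""
    return 3 if hit else -1


# Hypothetical 6-node ring in which node i caches only file i + 1.
caches = [{1}, {2}, {3}, {4}, {5}, {6}]
print(is_hit(caches, 0, 5, max_hops=2))  # file 5 is at node 4, two hops from node 0
print(is_hit(caches, 0, 4, max_hops=2))  # file 4 is at node 3, three hops from node 0
```

Because the neighbourhood is defined by hop count on the ring, the number of caches consulted per request stays fixed as more nodes are added.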
5.2 Workload Generation
The workload is a generator that, each time it is called, yields a vector of the files being requested. Over a short window the requests look random, while over the whole stream of requests the number of requests per file follows the Zipf distribution. The literature survey shows that most existing research on content placement for caches uses the Zipf distribution to benchmark methods, as it is a good approximation of real-world workloads. Most Zipf implementations consider one file per request, which does not suit the purposes of file-bundling caching. A modified Zipf function was therefore implemented to suit the requirements, as shown in Fig. 1.
Fig. 1. Visualization of Zipf Distribution with Respect to the File Number
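A generator of this kind could look like the sketch below. The modified Zipf function used in this work is not specified in detail, so the sketch assumes one requested file per node per call, 1-based file ids with file 1 the most popular, and independent draws; the names `zipf_weights` and `bundled_workload` are hypothetical.

```python
import random


def zipf_weights(num_files, z):
    """Unnormalised Zipf weights: the file of rank k gets weight 1 / k**z."""
    return [1.0 / (k ** z) for k in range(1, num_files + 1)]


def bundled_workload(num_files, num_nodes, z=1.4, iterations=100, seed=0):
    """Yield one request vector per iteration, holding one file id per node.

    Each entry is drawn independently, so short runs look random while the
    long-run request counts approach the Zipf shape.
    """
    rng = random.Random(seed)
    files = list(range(1, num_files + 1))
    weights = zipf_weights(num_files, z)
    for _ in range(iterations):
        yield rng.choices(files, weights=weights, k=num_nodes)


# Example: 50 files, 10 nodes, Zipf factor 1.4, three iterations.
for requests in bundled_workload(50, 10, iterations=3):
    print(requests)
```

Fixing the seed makes a workload reproducible across agents, which is convenient when comparing hit rates on the same request stream.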
This work includes experiments on static and dynamic workloads. In a static workload, the popularities of the files stay constant over the iterations of the workload. In a dynamic workload, the popularities of the files vary over the iterations of the workload in an arithmetic progression with a parameter to vary the distance between the most popular files. The dynamic workloads where the most popular file varies randomly are not considered. The two graphs in Fig. 2 and Fig. 3 represent the time-varying view of the static and dynamic workloads respectively. The graphs plot the number of times a file is requested with time for each file. The x-axis represents time and the y-axis represents the number of nodes that requested the file in a particular iteration. It is observed that the most common file in Fig. 3 is changing with time, representing the dynamic workload.
Figure 3 considers the workload W = [1, 3, 5, 7, 9], where the most common file for the first 20 iterations is File 1. For the 21st to 40th iterations, the most common file is File 3. For the 41st to 60th iterations, the most common file changes to File 5, and so on. The most popular files over the course of the workload are therefore Files 1, 3, 5, 7, and 9. We represent this workload as [1, 3, 5, 7, 9], or simply as (1, 10, 2), where the last parameter is the distance between the most common files over the course of the workload.
Fig. 2. Iterations vs Requests Graph for Static Zipf Workload
Fig. 3. Iterations vs Requests Graph for Dynamic Zipf Workload
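The (start, stop, step) notation and the 20-iteration schedule of the dynamic workload can be written out as a short sketch. The function names are hypothetical, iterations are counted from 1 as in the text, and holding the last file once the schedule is exhausted is an assumption, since the text only says "and so on".

```python
def popular_files(start, stop, step):
    """Expand a (start, stop, step) triple, e.g. (1, 10, 2) -> [1, 3, 5, 7, 9]."""
    return list(range(start, stop, step))


def most_popular_at(iteration, schedule, period=20):
    """Most popular file at a 1-based iteration, changing every `period` iterations.

    After the schedule is exhausted the last file is kept (an assumption).
    """
    index = min((iteration - 1) // period, len(schedule) - 1)
    return schedule[index]


schedule = popular_files(1, 10, 2)
for it in (1, 20, 21, 41, 100):
    print(it, most_popular_at(it, schedule))
```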
5.3 Simulation
The simulation environment used is that of distributed caching. It is assumed that all files present in the network are of constant and equal size. Each node can cache at most as many files as its cache size. The Zipf factor is kept at 1.4 unless mentioned otherwise.
6 Results
For tabulating and comparing the results of the reinforcement learning algorithms, the following scenarios have been considered.
6.1 Varying Cache Size
To observe the effect of the cache size on the hit rate, the number of files was fixed at 50 and the number of nodes at 10, and the cache size was varied from 2 to 10 in increments of 2.
Fig. 4. Effect of Varying the Number of Cached Files on Hit Rate (Non-Cooperative)
From Fig. 4 it can be observed that A2C and PPO outperform LRU and FIFO for small cache sizes in the non-cooperative environment, but with the increase in cache size, LRU and FIFO outperform the RL algorithms, while only A2C provides comparable results. In general, the hit rate increases with the cache size as depicted in Fig. 5. The cooperative A2C agent outperforms all other algorithms considered in this work. Cooperative PPO outperforms the traditional caching algorithms when the cache sizes are small, but the improvement in hit rate is small as the cache size increases. Cooperative DDPG performs poorly when compared to other algorithms for the workload considered.
Fig. 5. Effect of Varying the Number of Cached Files on Hit Rate (Cooperative)
6.2 Dynamic Workloads
For the same configuration of 50 files, 10 nodes, and cache size varying from 2 to 10 files, the hit rate of the cooperative A2C agent dips slightly on dynamic workloads. This is expected, as the most common file changes with time. However, the hit rate is still comparable with the static workload scenario, as depicted in Fig. 6. Thus, it is possible to use A2C in an online manner.
Fig. 6. Performance of A2C on Static vs Dynamic Workloads
Table 1 considers two instances of the dynamic workload, P = [1, 2, 3, 4, 5] and Q = [1, 3, 5, 7, 9], with differing ranges of file popularity. The RL agents typically perform better on workload P than on workload Q. This is because, for workload P, when file 1 is the most requested, file 2 is the second-most requested file, as illustrated in Fig. 1. As such, a good policy will cache file 2 as well when file 1 is the most popular file. Therefore, when the most requested file changes from file 1 to file 2 for this workload, the hit rate does not drop significantly, as the agent already knows that file 2 is frequently requested. However, for workload Q, when file 1 is the most requested, file 3 is not requested as often as file 2, and therefore there is a drop in hit rate when the most requested file changes from file 1 to file 3. The user request model in [11], which models a user's most popular file using a Markov chain, can be used to create and dynamically update the mapping from the file identifier to the most popular file, keeping the current most popular files as close as possible in order to ensure consistently high hit rates. Figure 7 further shows that the hit rate is higher for distributions where the popular files are concentrated. The x-axis represents the file range, for example, (1, 14, 3) represents the file range [1, 4, 7, 10, 13], and the y-axis represents the hit rate.
Table 1. Table Showing the Performance of Two Different Ranges of File Popularity
Configuration (files, nodes, cache size)    [1, 2, 3, 4, 5]    [1, 3, 5, 7, 9]
50,10,2                                     0.5352             0.5089
50,10,4                                     0.6185             0.589
50,10,6                                     0.6801             0.6519
50,10,8                                     0.7076             0.6885
50,10,10                                    0.7505             0.7291
Fig. 7. Comparing the Hit Rate of Different Workloads for a System with 50 Files, 10 Nodes, and a Cache Size of 10
6.3 Cooperative Caching
The effect of cooperative caching for RL agents was measured, and it was found that there is a significant increase in the hit rate even when rudimentary cooperative caching is used, as seen in Fig. 8.
Fig. 8. Cooperative A2C vs Non-Cooperative A2C
6.4 Varying Zipf Factor
The Zipf factor is varied from z = 1 to z = 2 in increments of 0.2. As shown in Fig. 9, the hit rate increases proportionately with the increase in the Zipf factor. This can be explained by the skew of the Zipf function: as z increases, the distribution becomes more concentrated on the most popular files. Thus, the agent gains more reward by caching the popular files than it does under a distribution with a lower Zipf factor.
Fig. 9. Effect of Varying the Number of Cached Files on Hit Rate (Non-Cooperative)
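The concentration effect can be checked numerically with the sketch below, which computes the share of requests that fall on the k most popular files under an ideal Zipf distribution. It is not taken from the experiments: the function name is hypothetical, and the result only corresponds to the hit rate of a single node that statically holds the top k files under independent requests, not to the hit rate of a trained agent.

```python
def top_k_mass(num_files, z, k):
    """Probability that a request targets one of the k most popular files."""
    weights = [1.0 / (rank ** z) for rank in range(1, num_files + 1)]
    return sum(weights[:k]) / sum(weights)


# 50 files and a cache of 10 files, for the Zipf factors used in this section.
for z in (1.0, 1.2, 1.4, 1.6, 1.8, 2.0):
    print(f"z={z:.1f}  top-10 mass={top_k_mass(50, z, 10):.3f}")
```

The printed share grows with z, which is consistent with the explanation that a fixed-size cache covers more of the request stream as the distribution becomes more skewed.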
6.5 Introduction of Non-caching Nodes
Nodes that do not cache files are included in the environment to make it non-homogeneous. These nodes act as requestors and request aggregators in the system. The number of files is kept constant at 50, the number of caching nodes at 10, and the cache size of each node at 10 files, while the number of non-caching nodes in the network is varied from 5 to 45 in increments of 10. As observed in Table 2, the hit rate increases steadily with the number of non-caching nodes, from 0.6434 at 5 non-caching nodes to 0.8317 at 45. As the number of non-caching nodes increases, the total number of requests in the environment increases as well. Since each node follows a Zipf distribution, there is an increase in the number of times the most popular files are requested. This leads to higher hit rates in the environment, since the popular files are already cached at the caching nodes.
Table 2. Table Depicting the Effect of Varying Non-Caching Nodes on Hit Rate
Configuration (files, caching nodes, cache size, non-caching nodes)    Hit Rate
50,10,10,5                                                             0.6434
50,10,10,15                                                            0.6671
50,10,10,25                                                            0.6854
50,10,10,35                                                            0.7639
50,10,10,45                                                            0.8317
6.6 Varying Number of Files
To find the dependence of the hit rate on the number of files present in the environment, the cache size and the number of nodes were fixed at 10 files and 10 nodes respectively. The number of files was varied from 10 to 100 for the cooperative A2C agent, and its performance was compared to classical algorithms like FIFO and LRU. From Fig. 10, it can be observed that the cooperative A2C algorithm outperforms FIFO and LRU in most cases. Irrespective of the number of files present, the hit rate of A2C remains nearly constant. For FIFO and LRU, we observe a performance drop initially as the number of files increases, but for a larger number of files, we do not observe an appreciable impact on the hit rate.
Fig. 10. Effect of Varying the Number of Files on Hit Rate [A2C]
6.7 Varying Number of Nodes
To find the dependence of the hit rate on the number of nodes in the environment, the number of files was kept constant at 50 and the cache size at 10 files per node. The number of nodes in the system was varied from 2 to 10, and the performance of the A2C agent was compared against FIFO and LRU.
Fig. 11. Effect of Varying the Number of Nodes on Hit Rate [A2C]
From Fig. 11, it can be observed that the hit rate is largely independent of the number of nodes in the system. Traditional algorithms are not cooperative in nature and are therefore independent of the number of nodes. In case of the cooperative A2C agent, only the neighbouring nodes are considered when caching decisions are made.
The number of neighbours remains constant even as the number of nodes increases in the system, thus not significantly affecting the hit rate.
7 Comparative Study
It can be seen from Table 3 that this work performs well in comparison to existing literature by using A2C and cooperative caching in tandem.
Table 3. Comparison of Cooperative A2C with Prior Work
Parameters         Cooperative A2C    F-RAN [11]    Multi-Agent [9]
Number of files    50                 20            NA
Cache size         10                 5             10–50 units
Hit Rate           0.86               0.45          0.75
8 Conclusion
This work aims to underscore the importance of reinforcement learning and cooperative caching in improving content placement in distributed caching systems. It is observed that using cooperative caching along with A2C for the content placement problem can result in a hit rate of up to 86%, which exceeds the hit rate achieved by classical algorithms and many existing methods. In the existing literature, DDPG seems to be the preferred method to tackle this problem, but in this work, we show that it does not always outperform algorithms like PPO and A2C. The performance of reinforcement learning agents is tested under varying conditions and the effect of each parameter is highlighted. This study has achieved its objectives, which were to show that reinforcement learning could be an effective solution to the file-bundling caching problem and to benchmark the various parameters that could affect the performance of reinforcement learning agents in a distributed caching environment.
In the future, the newest reinforcement learning algorithms can be benchmarked and compared against existing methods for the problem of content placement in distributed environments. Larger reinforcement learning agents can be trained using more training resources, which will likely result in better performance. Methods involving a combination of classical and reinforcement learning approaches may lead to interesting solutions and better performance.
References
1. Chen, M., Hao, Y., Hu, L., Hossain, M.S., Ghoneim, A.: Edge-CoCaCo: toward joint optimization of computation, caching, and communication on edge cloud. IEEE Wirel. Commun. 25(3), 21–27 (2018)
2. Dash, D., Kantere, V., Ailamaki, A.: An economic model for self-tuned cloud caching. In: 2009 IEEE 25th International Conference on Data Engineering, pp. 1687–1693. IEEE (2009)
3. Xia, C., Torrellas, J.: Improving the data cache performance of multiprocessor operating systems. In: Proceedings. Second International Symposium on High-Performance Computer Architecture, pp. 85–94. IEEE (1996)
4. Xiong, J., Fang, Y., Cheng, P., Shi, Z., Zhang, W.: Distributed caching in converged networks: a deep reinforcement learning approach. IEEE Trans. Broadcast. 67(1), 201–211 (2020)
5. Maddah-Ali, M.A., Niesen, U.: Fundamental limits of caching. IEEE Trans. Inf. Theory 60(5), 2856–2867 (2014)
6. Lillicrap, T.P., et al.: Continuous control with deep reinforcement learning. arXiv preprint arXiv:1509.02971 (2015)
7. Schulman, J., Wolski, F., Dhariwal, P., Radford, A., Klimov, O.: Proximal policy optimization algorithms. arXiv preprint arXiv:1707.06347 (2017)
8. Suksomboon, K., et al.: PopCache: cache more or less based on content popularity for information-centric networking. In: 38th Annual IEEE Conference on Local Computer Networks, pp. 236–243. IEEE (2013)
9. Zhang, Y., et al.: Cooperative edge caching: a multi-agent deep learning based approach. IEEE Access 8, 133212–133224 (2020)
10. Qin, T., Etesami, S.R.: Optimal online algorithms for file-bundle caching and generalization to distributed caching. ACM Trans. Model. Perform. Eval. Comput. Syst. 6(1), 1–23 (2021)
11. Lu, L., Jiang, Y., Bennis, M., Ding, Z., Zheng, F.C., You, X.: Distributed edge caching via reinforcement learning in fog radio access networks. In: 2019 IEEE 89th Vehicular Technology Conference (VTC2019-Spring), pp. 1–6. IEEE (2019)
12. Zhao, Y., et al.: Caching salon: from classical to learning-based approaches. In: 2019 IEEE International Conference on Service-Oriented System Engineering (SOSE), pp. 269–2695. IEEE (2019)
13. Brockman, G., et al.: OpenAI Gym. arXiv preprint arXiv:1606.01540 (2016)
Digital Watermarking Method for Copyright Protection of Deep Neural Networks

Yuliya Vybornova
Samara National Research University, Samara 443086, Russia
Abstract. This paper proposes a new method for copyright protection of deep neural networks designed for solving image classification tasks. The main idea of the method is to embed digital watermarks into the deep model by fine-tuning on a unique set of images, called triggers, represented in the form of pseudo-holographic signals (pseudo-holograms). A pseudo-hologram is a two-dimensional sinusoidal signal that encodes a binary sequence of arbitrary length. By changing the phase of each sinusoid, it is possible to form various pseudo-holograms encoding the same binary sequence. The proposed watermarking method constructs a training set by producing the required number of pseudo-holograms from binary sequences that are unique for each class. Thus, the class label assigned to each pseudo-hologram depends on the sequence encoded in it. Watermark verification is performed by sending various random pseudo-holograms as model input and evaluating the classification accuracy. A high rate of successful predictions indicates that the input images were constructed from the identification key of the legal owner. Experimental studies confirm the efficiency of the method for various model architectures and show compliance with all quality criteria required of deep model watermarking methods.

Keywords: Copyright Protection · Digital Watermarking · Black-Box Watermarking · Deep Neural Networks · Pseudo-Hologram · Image Classification
1 Introduction

As a rule, training a modern deep neural network comprises model architecture design, data collection and pre-processing, as well as hyperparameter selection. In addition to human resources, expensive hardware resources are also required. In particular, powerful graphics processing units (GPUs) capable of training such models are needed. Thus, trained models can be considered as intellectual property, and their legal owners may face a number of problems caused by the lack of necessary measures for copyright protection. Attackers may distribute proprietary models or illegally use them to provide data analysis services. In this regard, there is a need to create tools that allow the owner to prove the fact of unauthorized distribution of a deep neural network.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 K. Arai (Ed.): SAI 2023, LNNS 739, pp. 611–628, 2023. https://doi.org/10.1007/978-3-031-37963-5_42
Digital watermarking techniques have been widely used in the past two decades as a means of copyright protection for multimedia data (such as images, video and audio). In 2017, the first attempts were made to transfer the main ideas of watermark embedding to the case when the carrier is a deep neural network. For example, image watermarking modifies pixel values according to a given scheme, whereas embedding a watermark into a deep model modifies the model parameters. Depending on the availability of model weights at the verification stage, there are two main approaches to deep model watermarking: white-box methods, requiring access to model parameters, and black-box methods, applicable when only model predictions are available. In the first case, the watermark is embedded by introducing changes directly into the model weights. In the latter case, watermarking is performed by fine-tuning the model on a unique dataset comprising special samples called triggers, so that the model gives predictions different from those of a model without a watermark. Current research within both approaches is focused on the development of a technique for watermarking deep neural networks that simultaneously satisfies a number of criteria.

1. Fidelity: the accuracy of solving the initial problem of data analysis should not deteriorate after watermark embedding.
2. Robustness: the watermark should resist model fine-tuning.
3. Information capacity: the amount of embedded information should be sufficient for the model owner to rightfully claim copyright.
4. Security: an unauthorized party should not be able to detect the presence of a digital watermark in a model.
5. Reliability: watermark verification should have a minimal false negative rate, while a model containing no watermark should not be falsely verified.
6. Uniqueness: the watermark must guarantee a one-to-one correspondence between the model and its legal owner.
The new method proposed in this paper is designed for copyright protection of deep neural networks trained to solve image classification tasks. Based on a new black-box approach, the proposed method satisfies all the above quality criteria due to the use of a unique trigger set formed by synthesizing pseudo-holographic images (pseudo-holograms). A pseudo-holographic image is a signal that encodes a binary sequence of a given length in the form of sinusoidal functions. The use of such a signal as a digital watermark for deep neural networks has a number of advantages compared to utilizing classical raster images or random abstract images. The proposed method is based on the idea that pseudo-holograms encoding the same sequence have the same amplitude spectrum and look rather similar, so they should fall into one class. For pseudo-holograms synthesized from different sequences, the spectral characteristics differ, so such images should be assigned different class labels. Thus, it is proposed to generate a set of triggers so that the pseudo-holograms of each class are synthesized from a binary sequence unique to that class.

The paper is organized as follows. Section 2 comprises an overview of existing research. In Sect. 3, a detailed description of the proposed method for deep model watermarking is provided. The experimental study of the method efficiency in terms of the quality criteria is described in Sect. 4. The conclusion is presented in Sect. 5.
2 Related Work

2.1 White-Box Watermarking

In such a scenario, watermark verification requires access to the parameters of a suspected model [1–9]. In [1, 4], the authors embed a binary watermark sequence into specific parameters of a target model by adding a regularization function to the original loss function, thereby causing a statistical bias in these parameters. In [2], the authors propose to add an additional layer to the network architecture. The parameters of this layer depend both on the weights of the previous convolutional layer and on the digital signature of the owner. Embedding a digital watermark in this case consists in training the model in such a way that any attempt to forge the digital signature leads to a significant drop in the efficiency of solving the original classification problem. In [7], the authors apply one of the most popular multimedia watermarking strategies, called quantization index modulation (QIM), to control the changes introduced into the model parameters during embedding. Since watermark verification is impossible without access to the model weights, it is advisable to use white-box watermarking methods in tasks of deep model integrity and authenticity protection, for example, detection of forgery in critical artificial intelligence systems [9]. As for the task of copyright infringement confirmation, a more suitable solution can be found among the black-box watermarking methods described below.

2.2 Black-Box Watermarking

Direct access to the weights of a deep learning model is not always possible: instead of distributing someone else's neural network, attackers can use it in their client applications. In this case, to prove copyright, the legal owner can only query the remote model and receive computed predictions in response. For this reason, black-box watermarking methods appeared and gained popularity.
The idea of the black-box approach to deep learning model watermarking consists in fine-tuning the model to slightly change its behavior at the prediction stage. When particular samples, called triggers, are submitted to the input, the watermarked model gives the expected results, which differ from those of the original model without a watermark [10–24]. The digital watermark is embedded into the target model by training it to correctly recognize triggers while preserving the accuracy of solving the original task. Such retraining is possible due to the redundant parameterization of deep neural networks. The digital watermark verification procedure is performed by submitting triggers to the input of the remote model and comparing its predictions with the corresponding labels. A high proportion of correct predictions indicates the presence of a watermark. In the case of models for solving classification tasks, black-box watermarking methods differ depending on how "trigger-class label" pairs are formed.
For example, in [10], trigger samples are formed by adding some distortion to the original data in order to induce incorrect predictions of the watermarked network. In [11], the trigger set consists of abstract images. Another technique for trigger construction is to superimpose visible marks on images of the original dataset. For this, a text logo is used as the mark in [12–15], and special labels are generated from the signature of the owner in [16, 17]. In [18], the logo is added to the image using an encoder that makes the logo visually indistinguishable. In [19], the authors show that at the verification stage the superimposed logo can be removed using an autoencoder, and propose to generate a trigger set by changing the labels of randomly selected training samples. Watermarking methods that manipulate images of the original dataset can affect the accuracy of solving the original classification problem. Besides, such schemes do not exclude the possibility of false positive and false negative results at the watermark verification stage and, therefore, cannot guarantee unambiguous copyright confirmation. In addition, the watermark embedded into the model can be removed by fine-tuning on a dataset that does not contain triggers. The low watermark resistance can be explained by the proximity of the original samples and triggers: fine-tuning can lead the watermarked network to reclassify trigger samples into the original classes [25]. Moreover, most existing watermarking methods do not establish a strong connection between the owner and the watermark set. For example, an attacker can find samples on which the stolen model produces erroneous predictions, claim this set as a trigger set and, accordingly, claim copyright ownership. Thus, there is an uncertainty in the verification of digital watermarks, which can cost the rightful owner their intellectual property.
Finally, after registration in a specialized certification center, the rightful owner will need to ensure reliable storage of the trigger set, the size of which can, generally speaking, amount to gigabytes of data. At the end of this section, it is worth mentioning that, apart from the above methods created for protection of deep neural networks used for image classification, there are also other studies in this area. For example, in [26–28], a digital watermark is embedded into deep neural networks designed for audio classification, and in [29–31] the authors propose digital watermarking techniques for image processing models. In [32], a watermarking scheme for protection of deep reinforcement learning models is proposed.
3 Proposed Method

3.1 General Idea

The key idea of the proposed method is to use a special type of two-dimensional signal called a pseudo-holographic image (or pseudo-hologram) as a digital watermark. A pseudo-holographic image is a signal encoding a binary sequence $S = s_1, s_2, \ldots, s_l$ of a given length $l$ in the form of sinusoidal functions [33]. To produce a pseudo-hologram, it is necessary to artificially synthesize spectral components, which are subsequently mapped into two-dimensional sinusoids via the inverse discrete Fourier transform (DFT). The spectrum of a pseudo-hologram is constructed as follows. Depending on the encoded bit of the sequence $S$, the spectral impulses are successively arranged on two discrete "rings" of different radii $r$ and $r + \Delta r$. Impulses are located in such a way that one ring encodes the "ones" of the binary sequence, and the other ring encodes the "zeros". A rule for writing and reading the encoded bits must also be fixed. An example of a complex spectrum synthesized for the sequence 101101001011101 is shown in Fig. 1. Note that in order to obtain a real image, the upper and lower parts of its spectrum should be symmetrical, so the figure provides only the upper half of each spectral ring. The sequence can be decoded by reading it in counterclockwise order, starting from the point where impulses are located on both rings.
Fig. 1. Construction of Pseudo-Hologram Spectrum
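The spectrum construction described above can be illustrated with a short NumPy sketch. It is only one possible reading of the scheme: the image size `n`, the radii `r` and `dr`, the equal angular spacing of bits over the upper half-plane, the choice of the inner ring for "ones", and the unit-amplitude start marker at angle 0 are all assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def ring_point(radius, theta):
    """Integer frequency coordinates (u, v) nearest to a polar position."""
    return int(round(radius * np.cos(theta))), int(round(radius * np.sin(theta)))

def pseudo_hologram(bits, phases, n=64, r=12, dr=6):
    """Grayscale pseudo-hologram for `bits`, with one phase (radians) per bit."""
    spectrum = np.zeros((n, n), dtype=complex)

    def put(u, v, value):
        spectrum[v % n, u % n] = value               # impulse in the upper half
        spectrum[-v % n, -u % n] = np.conj(value)    # mirrored impulse keeps the image real

    # Assumed start marker: impulses on both rings at angle 0.
    for radius in (r, r + dr):
        put(*ring_point(radius, 0.0), 1.0)
    # Bits at equally spaced angles in (0, pi), written counterclockwise.
    for k, (bit, alpha) in enumerate(zip(bits, phases), start=1):
        theta = np.pi * k / (len(bits) + 1)
        radius = r if bit else r + dr                # assumed: inner ring = 1, outer ring = 0
        put(*ring_point(radius, theta), np.exp(1j * alpha))
    return np.fft.ifft2(spectrum).real

def decode(image, length, r=12, dr=6):
    """Read the bits back by comparing spectral magnitudes on the two rings."""
    n = image.shape[0]
    mag = np.abs(np.fft.fft2(image))
    bits = []
    for k in range(1, length + 1):
        theta = np.pi * k / (length + 1)
        u1, v1 = ring_point(r, theta)
        u0, v0 = ring_point(r + dr, theta)
        bits.append(1 if mag[v1 % n, u1 % n] > mag[v0 % n, u0 % n] else 0)
    return bits

bits = [int(c) for c in "101101001011101"]           # example sequence from the text
rng = np.random.default_rng(0)
image = pseudo_hologram(bits, rng.uniform(0.0, 2.0 * np.pi, len(bits)))
recovered = decode(image, len(bits))
```

Because only the phases change between calls, two images generated from the same bits share the same amplitude spectrum, which is the property the method relies on.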
It should be noted that different pseudo-holograms can be produced from the same sequence. For this, after the impulses are arranged on the spectral rings, they are assigned random values. Since the impulses are complex numbers, random values should be produced both for the real and imaginary parts. The process of impulse value generation is described in the next subsection. When the spectrum is formed, a two-dimensional grayscale image can be obtained using the inverse DFT. The use of pseudo-holographic images as triggers for neural network watermarking makes it possible to avoid the drawbacks of most existing black-box schemes that use classical images for trigger set construction. First of all, by proper selection of watermarking parameters, the proposed method makes it possible to maintain the accuracy of the initial classification task as well as to control the number of verification errors. One of the most important features of the method is the opportunity to provide a guaranteed one-to-one correspondence between the model and its owner, since pseudo-holograms are constructed using a unique identification key. The identification key is a combination of $K$ unique sequences, used for construction of $K$ trigger subsets, one per class: $S = \Vert_{i=1}^{K} S_i = S_1 \Vert \ldots \Vert S_K$, where $\Vert$ is the concatenation operator and $K$ is the number of classes. Each subset of pseudo-holograms is assigned the class label corresponding to the sequence encoded in each pseudo-hologram of this subset, namely the index of the sequence in the identification key (e.g., the sequence with index i is used to produce the trigger subset assigned the label of the ith class). Thus, pseudo-holograms encoding the same sequence are always assigned the same class label, but pseudo-holograms encoding different sequences must be in different classes. The use of the identification key makes it possible to avoid false positive results during model verification: pseudo-holograms produced from keys different from the owner's will not be correctly recognized by the watermarked model. Besides, there is no need to store pseudo-holograms: with a known algorithm for complex spectrum construction, a trigger or verification set can be reproduced from the identification key at any moment when the legal owner needs to protect a model or check model ownership. In addition, the ability to produce different pseudo-holograms from a single binary sequence makes it possible to generate any number of triggers necessary for efficient watermark embedding. This means that the trigger set can always be made large enough to train the model to high accuracy at the verification stage. Furthermore, since the features of pseudo-holograms and images of the original dataset are very different, the embedded watermarks are sufficiently robust to model fine-tuning. Thus, it can be assumed that the proposed method satisfies all the quality criteria put forward for methods of deep model watermarking. This assumption is confirmed by the experimental results in Sect. 4.

3.2 Watermark Generation

In this paper, a trigger set is constructed using a new algorithm for synthesizing color pseudo-holograms. The proposed algorithm ensures a sufficient number of unique training and verification triggers for efficient watermarking.
A color pseudo-hologram encoding a sequence of a given length is formed by combining three halftone pseudo-holograms based on this sequence. All three pseudo-holograms have the same arrangement of impulses in the complex spectrum, but the values of these impulses are set randomly for each halftone image using a pseudo-random number generator (PRNG), which leads to a phase shift of the two-dimensional sinusoids in the resulting pseudo-holograms. The phase difference of the sinusoidal signals provides a diversity of brightness values for the R, G, and B channels of the output color image and thus makes it possible to obtain various color shades. To ensure the uniqueness of each halftone pseudo-hologram, it is necessary to use non-repeating initialization values at the input of the PRNG used for generation of impulse values. Thus, to produce a set of m color pseudo-holograms, it is necessary to form an initialization vector of at least 3m unique values, so that each value is used to produce l random impulse values for only one halftone pseudo-hologram.
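The link between the random impulse value and the phase shift of the sinusoid can be written out explicitly. As a sketch under assumed notation (an $N \times N$ image, a unit-modulus impulse $e^{i\alpha}$ at integer frequency $(u, v)$ and its complex conjugate at $(-u, -v)$; none of these symbols are fixed by the text), the inverse DFT of such a pair is

```latex
\frac{1}{N^2}\left( e^{i\alpha}\, e^{2\pi i (ux + vy)/N}
                  + e^{-i\alpha}\, e^{-2\pi i (ux + vy)/N} \right)
= \frac{2}{N^2} \cos\!\left( \frac{2\pi (ux + vy)}{N} + \alpha \right).
```

Under these assumptions, changing $\alpha$ shifts the sinusoid but leaves the impulse magnitude untouched, which is consistent with the statement that pseudo-holograms encoding the same sequence share the same amplitude spectrum.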
The algorithm for generation of m unique color pseudo-holograms based on the same binary sequence is as follows.

1. First, a PRNG with good statistical properties is selected.
2. Next, the set of values allowable for PRNG initialization is determined, and an ordered subset of 3m distinct values is randomly selected: $U = \{u_1, u_2, \ldots, u_{3m}\}$.
3. In order to provide better randomness properties during construction of color pseudo-holograms, a random permutation $\pi : U \to U$ is performed, whose output is the initialization vector for the complex spectrum generator: $V = \{v_1, v_2, \ldots, v_{3m}\} = \{\pi(u_1), \pi(u_2), \ldots, \pi(u_{3m})\}$.
4. Each element $v_t$ ($t = 1, \ldots, 3m$) of the resulting vector is a seed for a PRNG producing l impulse values for one halftone pseudo-hologram. More precisely, taking $v_t$ as input, the PRNG outputs l angle values $\alpha_{tr}$ used for calculation of the complex impulse values, where $r = 1, \ldots, l$ is the index of the impulse.
5. The real and imaginary parts of the complex value $f_{tr}$ of each impulse are calculated as follows: $\operatorname{Re} f_{tr} = \cos \alpha_{tr}$, $\operatorname{Im} f_{tr} = \sin \alpha_{tr}$.
6. When the values for all l impulses are calculated, the t-th halftone pseudo-hologram is obtained by calculating the inverse DFT of the resulting spectrum.
7. Finally, every three pseudo-holograms are combined into one color image in the RGB color space.

Examples of color pseudo-holograms encoding the same sequence of length l = 10 are shown in Fig. 2. Pseudo-holograms based on different sequences of length l = 10 are shown in Fig. 3. In the task of neural network watermarking, color pseudo-holograms are more suitable as triggers than grayscale ones. First, most modern architectures of deep neural network classifiers are designed to operate on color images. Second, the new algorithm for watermark generation provides a greater variety of pseudo-holograms in the training set.
Fig. 2. Color Pseudo-Holograms Encoding the Same Sequence (Assigned Same Class Label)
Fig. 3. Color Pseudo-Holograms Encoding Different Sequences (Assigned Different Class Labels)
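Steps 2 to 7 of the generation algorithm in this subsection can be sketched as follows. This is a hedged illustration rather than the author's implementation: the function name, the `coords` argument (a list of impulse positions, whose derivation from the bit sequence is left outside the sketch), the seed range, the image size and the final rescaling to [0, 1] are all assumptions.

```python
import random
import numpy as np

def color_pseudo_holograms(coords, m, n=64, master_seed=0):
    """Return m RGB pseudo-holograms that share the impulse layout `coords`.

    coords: list of (u, v) integer frequency positions, one per encoded bit
            (hypothetical input; the bit-to-position rule is not shown here).
    """
    picker = random.Random(master_seed)
    seeds = picker.sample(range(2**31), 3 * m)   # step 2: 3m distinct seed values
    picker.shuffle(seeds)                        # step 3: random permutation
    channels = []
    for seed in seeds:
        # Step 4: each seed drives a PRNG that outputs l phase angles.
        alphas = np.random.default_rng(seed).uniform(0.0, 2.0 * np.pi, len(coords))
        spectrum = np.zeros((n, n), dtype=complex)
        for (u, v), alpha in zip(coords, alphas):
            value = np.cos(alpha) + 1j * np.sin(alpha)   # step 5: Re f = cos a, Im f = sin a
            spectrum[v % n, u % n] = value
            spectrum[-v % n, -u % n] = np.conj(value)    # symmetric half keeps the image real
        gray = np.fft.ifft2(spectrum).real               # step 6: inverse DFT
        gray = (gray - gray.min()) / (gray.max() - gray.min())  # assumed rescaling to [0, 1]
        channels.append(gray)
    # Step 7: every three halftone images become the R, G, B channels of one image.
    return np.stack(channels).reshape(m, 3, n, n).transpose(0, 2, 3, 1)

coords = [(12, 3), (9, 8), (3, 12), (-6, 17)]    # toy layout, for illustration only
images = color_pseudo_holograms(coords, m=4)     # array of shape (4, 64, 64, 3)
```

Keeping only `master_seed` and the impulse layout is enough to regenerate the same images, which mirrors the remark that triggers need not be stored.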
3.3 Watermark Embedding

Like other existing black-box watermarking schemes, the proposed method for copyright protection of deep neural networks consists in fine-tuning a carrier model using a trigger set. The originality of the method lies in constructing the trigger set by synthesizing pseudo-holographic images instead of using classical images. To make the above clear, let W be the training dataset for watermark embedding (i.e., the watermarking dataset), T be a set of pseudo-holographic triggers (i.e., the trigger set), and I be a set of images from the original dataset. Thus, the training set for the watermarking procedure is formed as W = T ∪ I. The procedure for model watermarking is performed as follows.

1. First of all, pseudo-holograms for each class are synthesized. For this:
   a) Initially, K sequences $S_i$, $i = 1, \ldots, K$, of a given length l are generated, where K is the number of classes. The generated sequences constitute the identification key S of the copyright owner.
   b) Then, for each sequence, a set of m different color pseudo-holograms is obtained using the algorithm described in Sect. 3.2. In total, after this step is completed, m × K = |T| pseudo-holograms will be synthesized.
   c) Each pseudo-hologram encoding a sequence $S_i$ is assigned the label of the ith class.
2. Next, |I| images of the original dataset are randomly selected. The resulting set I is combined with the set of pseudo-holograms T.
3. The model is trained on the dataset W until the accuracy of watermark verification is sufficient to unambiguously confirm copyright.
4. The copyright verification procedure is performed by evaluating model predictions on a verification set formed by synthesizing random pseudo-holograms from the sequences $S_i$, $i = 1, \ldots, K$, generated at step 1a). Note that the pseudo-holograms in the verification set must be different from the trigger samples of the training set. The class labels of the pseudo-holograms obtained at this step are also assigned depending on the encoded sequence.
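The bookkeeping behind this procedure (key generation, trigger labels, W = T ∪ I, and a verification set disjoint from the training triggers) can be sketched with the standard library alone. All names and the descriptor format below are hypothetical; triggers are represented only by their bit sequence and three PRNG seeds, on the assumption that the images themselves are synthesized on demand, and class labels are 0-based.

```python
import random

def make_key(num_classes, length, rng):
    """Identification key: `num_classes` distinct random bit sequences."""
    key = []
    while len(key) < num_classes:
        seq = tuple(rng.getrandbits(1) for _ in range(length))
        if seq not in key:                      # sequences must be unique per class
            key.append(seq)
    return key

def build_sets(key, m, m_verify, dataset_size, num_original, rng):
    """Return (W, V): shuffled training descriptors and verification descriptors."""
    # Three distinct PRNG seeds per color trigger, one per halftone channel.
    seeds = iter(rng.sample(range(2**31), 3 * (m + m_verify) * len(key)))

    def triggers(count):
        return [{"kind": "trigger", "bits": seq, "label": label,
                 "seeds": tuple(next(seeds) for _ in range(3))}
                for label, seq in enumerate(key) for _ in range(count)]

    T = triggers(m)                             # |T| = m * K training triggers
    V = triggers(m_verify)                      # verification triggers with unseen seeds
    I = [{"kind": "original", "index": i}
         for i in rng.sample(range(dataset_size), num_original)]
    W = T + I                                   # W = T ∪ I
    rng.shuffle(W)
    return W, V

rng = random.Random(0)
key = make_key(num_classes=10, length=20, rng=rng)
W, V = build_sets(key, m=50, m_verify=10, dataset_size=50000,
                  num_original=5000, rng=rng)
```

With these toy settings, W holds 500 trigger descriptors plus 5000 original-image indices, and V holds 100 verification descriptors.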
4 Experimental Results

Initially, two deep learning models were prepared as carriers for watermark embedding. For this, a pre-trained model with the VGG11 architecture from the PyTorch library [34] was fine-tuned on the CIFAR10 and CIFAR100 datasets [35]. The training parameters are shown in Table 1. In all experiments, fine-tuning was performed using stochastic gradient descent (SGD) optimization with a fixed step at each iteration, learning_rate = $10^{-3}$, and the cross-entropy loss function. Additionally, images of the training set were modified using random cropping and random horizontal flip transformations.

Table 1. Carrier Models

Dataset            CIFAR-10   CIFAR-100
Batch size         32         64
Number of epochs   50         100
Test Accuracy      0.9370     0.7791
4.1 Fidelity and Capacity

The purpose of this experiment is to estimate the minimum size of the watermarking dataset sufficient both for unambiguous verification of the digital watermark and for maintenance of the initial model accuracy, and thereby to simultaneously investigate the information capacity and fidelity of the proposed method. Obviously, high verification accuracy can be achieved by increasing the number of pseudo-holograms in the trigger set, while the accuracy of solving the initial task can be improved by increasing the number of samples taken from the original dataset. Thus, this experiment investigates the influence of the |T| and |I| set sizes on the accuracy of classifying the original test images and on the accuracy of digital watermark verification. For this, the carrier models were fine-tuned on different watermarking datasets formed for various combinations of the parameters |T| and |I|.

For the case of 10 classes (a classifier pre-trained on the CIFAR10 dataset), an identification key $S^{\{10\}}$ was obtained by random generation of 10 unique binary sequences of length l = 20. Based on the produced sequences, trigger sets were synthesized. Three cases were considered: 10, 50 and 100 pseudo-holograms for each class. Accordingly, the size of the pseudo-hologram set varied as |T| ∈ {100, 500, 1000}. For all cases, one verification set comprising 100 pseudo-holograms was generated from the sequences of the identification key, with pseudo-holograms different from those of the trigger set used in the training phase.

For the case of 100 classes (CIFAR100), the identification key $S^{\{100\}}$ consisted of 100 unique sequences of length l = 20. Based on the key sequences, sets of 10, 20 and 50 pseudo-holograms were generated for each class. Accordingly, the size of the trigger set varied as |T| ∈ {1000, 2000, 5000}. The verification set in this case consisted of 1000 pseudo-holograms.
The set I was formed by selecting random images from the original dataset. The considered sets constituted 10%, 20% and 30% of the original CIFAR10/CIFAR100 dataset. Accordingly, for both datasets, the size of the original image set varied as |I| ∈ {5000, 10000, 15000}. Note that the parameter l = 20 is fixed for all further experiments. This parameter affects the frequency of the pseudo-hologram sinusoids. Its influence on the watermarking quality was investigated in an additional experiment, which is not included in this paper. According to the obtained results, the use of long sequences (i.e., l > 100) can require a larger trigger set for efficient watermarking. So, l should be selected depending on the number of classes, taking into account possible restrictions on the size of the trigger set.

The experiment on the fidelity and capacity of the proposed method was conducted as follows. The watermarking procedure was performed by fine-tuning the carrier models on a dataset comprising both the set of pseudo-holograms and the set of original images. To analyze the watermarking efficiency in terms of the quality of solving the original classification problem, at each epoch the model accuracy AccT was estimated on the original test set of the CIFAR10/CIFAR100 dataset. To assess the ability of the model to correctly verify the embedded watermark, the accuracy AccV was calculated on a verification set of pseudo-holograms. As a result of the experiment, for each combination of parameters, the model with the maximum verification accuracy AccV was selected. When several models shared the maximum AccV, the one with the best test accuracy AccT was chosen. In the first case, the carrier model pre-trained on the CIFAR10 dataset was fine-tuned for 50 epochs using the watermarking dataset.
A preliminary study showed that for a batch size of 32, when the verification accuracy reaches values AccV ≥ 0.99, the best model accuracy on the test set is AccT = 0.9350, which is slightly lower than the accuracy of the original model. In this regard, a batch size of 48 was chosen for watermark embedding. The results of the experiment are shown in Table 2.

Table 2. Effect of Watermarking Dataset Size on Model Accuracy (CIFAR10)

|T|     |I|      AccT     AccV
100     5000     0.9363   0.79
100     10000    0.9338   0.82
100     15000    0.94     0.68
500     5000     0.9301   1.0
500     10000    0.9375   0.99
500     15000    0.9367   0.99
1000    5000     0.9313   1.0
1000    10000    0.9362   1.0
1000    15000    0.937    1.0
The obtained results show that for |T| = 100 the verification accuracy AccV does not reach a value sufficient to consider the watermarking successful. For |T| ∈ {500, 1000} the verification accuracy AccV is close to one for all combinations of parameters, but adding |I| = 5000 original images to the watermarking dataset is not enough to achieve an accuracy AccT close to that of the original model.
However, it is worth noting that even for a small number of the original images, the accuracy of a watermarked model can be increased by setting a threshold value of AccV , sufficient for unambiguous verification of the watermark. The chart in Fig. 4 shows the values of AccT in the case when AccV ≥ 0.92. There is no column for |T| = 100 since in this case the model does not reach the threshold value.
Fig. 4. Test Accuracy for AccV ≥ 0.92 (CIFAR10)
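The two selection rules used in this experiment can be sketched in a few lines. The first branch follows the stated rule (maximum AccV, ties broken by AccT). The second branch is one plausible reading of the thresholded variant, namely picking the best AccT among epochs whose AccV meets the threshold; the text does not spell this out, so it should be treated as an assumption. The per-epoch log is made up for illustration and does not come from the paper.

```python
def select_checkpoint(history, min_acc_v=None):
    """Pick one (epoch, acc_t, acc_v) record from a per-epoch log."""
    if min_acc_v is None:
        # Stated rule: highest verification accuracy, ties broken by test accuracy.
        return max(history, key=lambda rec: (rec[2], rec[1]))
    # Assumed thresholded rule: best test accuracy among sufficiently verified epochs.
    eligible = [rec for rec in history if rec[2] >= min_acc_v]
    return max(eligible, key=lambda rec: (rec[1], rec[2])) if eligible else None

# Toy log with invented values: (epoch, AccT, AccV).
history = [(10, 0.936, 0.90), (20, 0.938, 0.93), (30, 0.934, 1.00), (40, 0.935, 1.00)]

best_unconstrained = select_checkpoint(history)                 # epoch 40 in this toy log
best_thresholded = select_checkpoint(history, min_acc_v=0.92)   # epoch 20 in this toy log
```

In the toy log the thresholded rule trades a little verification accuracy for a higher test accuracy and an earlier epoch, which is the kind of effect the threshold is meant to produce.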
According to Fig. 4, after introducing a threshold value, the accuracy of the watermarked model significantly increases, even for |I| = 5000. In the second case, for each combination of parameters, the carrier model pre-trained on the CIFAR100 dataset (one hundred classes) was fine-tuned for 100 epochs with a batch size of 128. The results of the experiment are presented in Table 3.

Table 3. Effect of Watermarking Dataset Size on Model Accuracy (CIFAR100)

|T|     |I|      AccT     AccV
1000    5000     0.7582   0.918
1000    10000    0.7651   0.934
1000    15000    0.7678   0.913
2000    5000     0.7564   0.993
2000    10000    0.7637   0.987
2000    15000    0.7674   0.994
5000    5000     0.7418   1.0
5000    10000    0.7565   1.0
5000    15000    0.7621   1.0
According to Table 3, the watermark verification accuracy AccV is sufficiently high for all values of |T| and |I| and reaches the maximum possible value at |T| = 5000. However, the accuracy of solving the original classification problem AccT is lower than the accuracy of the original model without the watermark. In this regard, as in the CIFAR10 experiment, a threshold value of the verification accuracy was introduced. The chart in Fig. 5 shows the values of AccT for the case when AccV ≥ 0.9.
Fig. 5. Test Accuracy for AccV ≥ 0.9 (CIFAR100)
According to Fig. 5, after the introduction of the threshold value AccV ≥ 0.9, the watermarked model achieves the accuracy of the original model on the dataset constructed with parameters |T| = 2000 and |I| = 15000. In addition, it should be noted that for the specified parameters, without using a threshold, the maximum value of the verification accuracy AccV is achieved in 98 epochs, and after the threshold introduction, in 61 epochs. So the addition of such a limitation makes it possible not only to maintain the original model accuracy, but also to reduce the time of watermark embedding. According to the conducted experiment, in both cases, the size of the watermarking dataset affects the watermarking efficiency and the preservation of the initial accuracy of the model after the embedding procedure. Thus, correctly chosen embedding parameters make it possible to achieve high watermark verification accuracy without degrading the accuracy of solving the original classification problem.

4.2 Reliability and Uniqueness

This experiment is aimed at demonstrating that a model without the embedded watermark cannot be falsely verified, and also that the watermark embedded into a deep model corresponds only to the legal owner of the identification key. For this, the carrier model pre-trained on CIFAR10 was watermarked for 50 epochs with a batch size of 48 using a trigger set of size |T| = 1000 and a set of original images of size |I| = 15000. Also, the carrier model pre-trained on CIFAR100 was watermarked for 100 epochs with a batch size of 128 using a trigger set of size |T| = 5000 and a set of original images of size |I| = 15000. After this, for the case of CIFAR10, five random keys $S_1^{\{10\}}, \ldots, S_5^{\{10\}}$, different from the key $S^{\{10\}}$ of the legal user, were generated from random sequences of length l = 20. As for $S^{\{10\}}$, each key consisted of 10 sequences. For each key, a verification set comprising 10 pseudo-holograms per class was constructed, giving five verification sets in total.

The compromised verification sets were also constructed for the case of CIFAR100. The only difference was that each key $S_1^{\{100\}}, \ldots, S_5^{\{100\}}$ consisted of 100 random sequences, in accordance with the number of classes in the dataset.
The experiment consisted in evaluating the verification accuracy AccV for both carrier and watermarked models on legal and compromised sets of pseudo-holograms. The results of the experiment are shown in Tables 4 and 5.

Table 4. Verification Accuracy for Different Identification Keys (CIFAR10)

Model               $S^{\{10\}}$   $S_1^{\{10\}}$   $S_2^{\{10\}}$   $S_3^{\{10\}}$   $S_4^{\{10\}}$   $S_5^{\{10\}}$
Carrier Model       0.20           0.08             0.23             0.01             0.04             0.15
Watermarked Model   1.00           0.02             0.13             0.21             0.10             0.06
Table 5. Verification Accuracy for Different Identification Keys (CIFAR100)

Model               $S^{\{100\}}$   $S_1^{\{100\}}$   $S_2^{\{100\}}$   $S_3^{\{100\}}$   $S_4^{\{100\}}$   $S_5^{\{100\}}$
Carrier Model       0.012           0.003             0.017             0.008             0.005             0.008
Watermarked Model   1.00            0.032             0.027             0.013             0.023             0.001
The results of the experiment show that the watermark can be correctly verified only by the model into which it is embedded. The probability of false verification of a model protected with another person's watermark is negligible. Malicious or erroneous misappropriation of models without an embedded watermark is also ruled out. Thus, only the legal owner can claim ownership of a protected model, using a personal identification key to construct the verification image set.
4.3 Robustness
The purpose of this experiment is to assess the robustness of the method against attacks aimed at watermark removal. The experiment was performed as follows. For the carrier model pre-trained on CIFAR10, the watermark was embedded by training for 50 epochs with a batch size of 48 on a trigger set of size |T| = 1000 and a set of original images of size |I| = 15000. The carrier model pre-trained on CIFAR100 was watermarked for 100 epochs with a batch size of 128 on a trigger set of size |T| = 5000 and a set of original images of size |I| = 15000. Two watermark removal attacks were investigated: fine-tuning of all model parameters, and fine-tuning of only the last layer with the remaining parameters frozen. In both cases the watermarked model was fine-tuned for 100 epochs on three sets of different sizes. Each set was obtained by randomly sampling a given share of the original dataset, namely 10%, 30% and 50% of CIFAR10/CIFAR100.
The verification accuracy was calculated every 10 epochs of the attack. First, the model pre-trained and watermarked on CIFAR10 was considered. Figure 6 shows the values of AccV after fine-tuning all model parameters for various sizes of the training set.
Fig. 6. Verification Accuracy after Fine-Tuning All Model Parameters (CIFAR10)
According to the obtained results, the verification accuracy AccV ≥ 0.93 when the attacker dataset comprises 10% of the CIFAR10 images (i.e., |I| = 5000), and AccV ≥ 0.89 for |I| = 15000, which indicates a high resistance of the embedded watermark to the fine-tuning attack. For |I| = 25000 the verification accuracy decreases (AccV ≥ 0.83 during the last epochs), but the watermark is still detectable. When only the final layer of the watermarked model is fine-tuned, AccV ≥ 0.97 regardless of the training set size, and for |I| = 5000 the accuracy AccV = 1 at each epoch. Consequently, the embedded watermark is completely robust against this attack. Next, the model pre-trained and watermarked on CIFAR100 was investigated. Figure 7 shows the corresponding results for the attack performed by fine-tuning all model parameters. According to Fig. 7, after 60 epochs of the fine-tuning attack it is still possible to correctly verify the embedded digital watermark regardless of the dataset size. When the attack is performed for 100 epochs, however, the watermark remains detectable only for |I| = 5000 and |I| = 15000. When the attacker dataset is larger, the verification accuracy starts to decrease significantly after 60 epochs, and after 100 epochs AccV = 0.319. The difference in results between the models trained on CIFAR10 and CIFAR100 is explained by the fact that, in the case of 10 classes, the selected number of pseudo-holograms per class is twice as high as in the case of 100 classes. Thus, increasing the size of the trigger set can help to make the embedded watermark more robust. When only the final layer of the watermarked model is fine-tuned, AccV ≥ 0.993 regardless of the training set. Thus, the embedded watermark also resists this attack.
Fig. 7. Verification Accuracy after Fine-Tuning All Model Parameters (CIFAR100)
According to the results of the experiment, the method is completely robust against fine-tuning of the final layer and highly robust against fine-tuning of all model parameters.
4.4 Efficiency
This experiment focuses on the applicability of the method to different deep neural network architectures. Five architectures besides VGG were considered. The experiment was performed as follows. First, for each architecture, a carrier model was obtained by fine-tuning the corresponding pre-trained model from the PyTorch library on the CIFAR10 dataset for 50 epochs with a batch size of 32. The optimizer and loss were the same as for VGG11 in the previous experiments. After that, the carrier models were watermarked by fine-tuning on a dataset constructed with |T| = 1000 and |I| = 15000 for 50 epochs with a batch size of 48. Finally, for each watermarked model, the accuracy AccT on the original CIFAR10 test set was compared to the accuracy AccT of the corresponding carrier model. The results of the experiment are shown in Table 6. The table reports the maximum test accuracy of the models with verification accuracy AccV = 1. According to the results, the proposed method is efficient for all investigated models, since the initial model accuracy is preserved and, at the same time, all watermarks from the verification set are classified without errors.
Table 6. Watermarking Results for Different Model Architectures
Model         AccT, original model  AccT, watermarked model  AccV
VGG11         0.9370                0.9370                   1.0
ResNet-18     0.9517                0.9530                   1.0
Inception-v3  0.9683                0.9703                   1.0
Densenet-121  0.9668                0.9671                   1.0
MobileNetV2   0.9528                0.9545                   1.0
AlexNet       0.9165                0.9201                   1.0
5 Conclusion
In this paper, a new black-box watermarking method for copyright protection of deep neural networks is proposed. The method is based on the construction of a trigger set by synthesizing color pseudo-holograms. Each trigger encodes a binary sequence, which is mapped to a given class label. All sequences together constitute an identification key, which makes it possible to unambiguously verify the legal owner of the model. There is no need to store the set for watermark embedding or verification, since it can be reproduced using the key. If the model contains no watermark or the watermark does not correspond to the identification key, the ownership verification will fail. Thus, the method excludes false positive confirmation of copyright. On the other hand, if the identification key corresponds to the watermarked model, the accuracy on the verification set will be high enough to confirm the rights of the legal owner. The efficiency of the method has been tested on various deep neural network architectures. The experimental study has shown that the watermarking procedure can be performed without decreasing the model accuracy on the original classification task. Furthermore, the method is highly robust against fine-tuning attacks aimed at watermark removal.
Acknowledgments. The reported study was funded by RSF (Russian Science Foundation) grant № 21-71-00106, https://rscf.ru/en/project/21-71-00106/
References 1. Uchida, Y., Nagai, Y., Sakazawa, S., Satoh, S.: Embedding watermarks into deep neural networks. In: Proceedings of the 2017 ACM on International Conference on Multimedia Retrieval, pp. 269–277 (2017) 2. Fan, L., Ng, K.W., Chan, C.S.: Rethinking deep neural network ownership verification: Embedding passports to defeat ambiguity attacks. In: Proceedings of the Advances in Neural Information Processing Systems, pp. 4714–4723 (2019) 3. Wang, T., Kerschbaum, F.: Robust and undetectable white-box watermarks for deep neural networks. arXiv: 1910.14268 (2019) 4. Nagai, Y., Uchida, Y., Sakazawa, S., Satoh, S.: Digital watermarking for deep neural networks. Int. J. Multimedia Inf. Retrieval 7(1), 3–16 (2018). https://doi.org/10.1007/s13735018-0147-1
Digital Watermarking Method for Copyright Protection
627
5. Chen, H., Darvish Rohani, B., Koushanfar, F.: DeepMarks: A digital fingerprinting framework for deep neural networks. In: Proceedings of the 2019 on International Conference on Multimedia Retrieval (ICMR 2019), pp. 105–113 (2019) 6. Wang, J., Wu, H., Zhang, X., Yao, Y.: Watermarking in deep neural networks via error backpropagation. Electron. Imaging 2020(4), 221–229 (2020) 7. Kuribayashi, M., Tanaka, T., Suzuki, S., Yasui, T., Funabiki, N.: White-box watermarking scheme for fully-connected layers in fine-tuning model. In: Proceedings of the 2021 ACM Workshop on Information Hiding and Multimedia Security, pp. 165–170 (2021) 8. Wang, T., Kerschbaum, F.: RIGA: covert and robust white-box watermarking of deep neural networks. In: Proceedings of the Web Conference 2021, pp. 993–1004 (2021) 9. Botta, M., Cavagnino, D., Esposito, R.: NeuNAC: a novel fragile watermarking algorithm for integrity protection of neural networks. Inf. Sci. 576, 228–241 (2021) 10. Le Merrer, E., Pérez, P., Trédan, G.: Adversarial frontier stitching for remote neural network watermarking. Neural Comput. Appl. 32(13), 9233–9244 (2019). https://doi.org/10.1007/s00 521-019-04434-z 11. Adi, Y., Baum, C., Cisse, M., Pinkas, B., Keshet, J.: Turning your weakness into a strength: watermarking deep neural networks by backdooring. In: Proceedings of the 27th USENIX Security Symposium (USENIX Security 2018), pp. 1615–1631 (2018) 12. Deeba, F., Tefera, G., She, K., Memon, H.: Protecting the intellectual properties of digital watermark using deep neural network. In: Proceedings of the 2019 4th International Conference on Information Systems Engineering (ICISE), pp. 91–95 (2019) 13. Zhang, J., Gu, Z., Jang, J., Wu, H., Stoecklin, M.P., Huang, H., Molloy, I.: Protecting intellectual property of deep neural networks with watermarking. In: Proceedings of the 2018 on Asia Conference on Computer and Communications Security, pp. 159–172 (2018) 14. 
Sakazawa, S., Myodo, E., Tasaka, K., Yanagihara, H.: Visual decoding of hidden watermark in trained deep neural network. In: 2019 IEEE Conference on Multimedia Information Processing and Retrieval (MIPR), pp. 371–374 (2019) 15. Wang, G., Chen, X., Xu, C.: Adversarial watermarking to attack deep neural networks. In: ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1962–1966 (2019) 16. Guo, J., Potkonjak, M.: Watermarking deep neural networks for embedded systems. In: Proceedings of the 2018 IEEE/ACM International Conference on Computer-Aided Design (ICCAD), pp. 1–8 (2018) 17. Jebreel, N. M., Domingo-Ferrer, J., Sánchez, D., Blanco-Justicia, A.: KeyNet: an asymmetric key-style framework for watermarking deep learning models. Appl. Sci. 11 (2021). https:// doi.org/10.3390/app11030999 18. Li, Z., Hu, C., Zhang, Y., Guo, S.: How to prove your model belongs to you: a blind-watermark based framework to protect intellectual property of DNN. In: Proceedings of the 35th Annual Computer Security Applications Conference, pp. 126–137 (2019) 19. Namba, R., Sakuma, J.: Robust watermarking of neural network with exponential weighting. In: Proceedings of the 2019 ACM Asia Conference on Computer and Communications Security, pp. 228–240 (2019) 20. Rouhani, B.D., Chen, H., Koushanfar, F.: Deepsigns: a generic watermarking framework for ip protection of deep learning models. arXiv:1804.00750 (2018) 21. Zhang, Y.-Q., Jia, Y.-R., Niu, Q., Chen, N.-D.: DeepTrigger: a watermarking scheme of deep learning models based on chaotic automatic data annotation. IEEE Access 8, 213296–213305 (2020) 22. Zhong, Q., Zhang, L.Y., Zhang, J., Gao, L., Xiang, Y.: Protecting IP of deep neural networks with watermarking: a new label helps. In: Lauw, H.W., Wong, R.-W., Ntoulas, A., Lim, E.-P., Ng, S.-K., Pan, S.J. (eds.) PAKDD 2020. LNCS (LNAI), vol. 12085, pp. 462–474. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-47436-2_35
628
Y. Vybornova
23. Xu, X., Li, Y., Yuan, C.: “Identity bracelets” for deep neural networks. IEEE Access 8, 102065–102074 (2020) 24. Zhao, J., Hu, Q., Liu, G., Ma, X., Chen, F., Hassan, M.: AFA: adversarial fingerprinting authentication for deep neural networks. Comput. Commun. 150, 488–497 (2020) 25. Cao, X., Jia, J., Gong, N.Z.: IPGuard: Protecting the intellectual property of deep neural networks via fingerprinting the classification boundary. In: Proceedings of the 2021 ACM Asia Conference on Computer and Communications Security (ASIA CCS 2021), pp. 14–25 (2021) 26. Kim, W., Lee, K.: Digital watermarking for protecting audio classification datasets. In: ICASSP 2020–2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 2842–2846 (2020) 27. Chen, H., Zhang, W., Liu, K. Chen, K., Fang, H., Yu, N.: Speech pattern based black-box model watermarking for automatic speech recognition. In: Proceedings of the 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 3059– 3063 (2022) 28. Wang, Y., Wu, H.: Protecting the intellectual property of speaker recognition model by blackbox watermarking in the frequency domain. Symmetry 14(3), 619 (2022) 29. Wu, H., Liu, G., Yao, Y., Zhang, X.: Watermarking neural networks with watermarked images. IEEE Trans. Circuits Syst. Video Technol. 31(7), 2591–2601 (2021) 30. Zhang, J., Chen, D., Liao, J., Zhang, W., Feng, H., Yu, N.: Deep model intellectual property protection via deep watermarking. IEEE Trans. Pattern Anal. Mach. Intell. 44, 4005–4020 (2021) 31. Quan, Y., Teng, H., Chen, Y., Ji, H.: Watermarking deep neural networks in image processing. IEEE Trans. Neural Netw. Learn. Syst. 32(5), 1852–1865 (2021) 32. Chen, K., Guo, S., Zhang, T. Li, S., Liu, Y.: Temporal watermarks for deep reinforcement learning models. In: Proceedings of the 20th International Conference on Autonomous Agents and MultiAgent Systems (AAMAS 2021), pp. 314–322 (2021) 33. 
Vybornova, Y.: Method for protection of heterogeneous data based on pseudo-holographic watermarks. In: Proceedings of 2021 9th International Symposium on Digital Forensics and Security (ISDFS), pp. 1–5 (2021) 34. Torchvision models subpackage. https://pytorch.org/vision/stable/models.html 35. CIFAR10 and CIFAR100 datasets. http://www.cs.toronto.edu/~kriz/cifar.html
Address Search Correction Using Deep Learning
Hoang-Quan Tran1,2, Thanh-Nam Vo1,2, Duc-Hao Do3, and Thanh-Duc Chau1,2(B)
1 Faculty of Information Technology, University of Science, Ho Chi Minh City, Vietnam
2 Vietnam National University Ho Chi Minh City, Ho Chi Minh City, Vietnam [email protected]
3 FPT University, Ho Chi Minh City, Vietnam
Abstract. Most digital map applications allow users to search for addresses. However, implementations using Levenshtein distance and string matching methods are only suitable for exact address matching and do not handle ambiguous inputs such as typos or alias addresses well. We present a machine-learning approach to the Address Correction problem based on Grammatical Correction and Machine Translation. We build a Sequence-to-Sequence (Seq2Seq) Attention model, using Gated Recurrent Units (GRUs) with different Attention forms (including Dot, General and Concatenate Attention). For comparison, we build a long short-term memory (LSTM) Seq2Seq and a Transformer model. According to our experiments, the GRU Attention model obtains the best overall accuracy across the datasets. Based on the proposed architecture, we build an end-to-end pipeline for Address Searching with ambiguous inputs using Beam Search for Vietnamese, United States and Australian addresses.
Keywords: Address Correction · GRU · Attention Mechanism · Machine Translation · Beam Search
1 Introduction
Searching for an address is a common task in every digital map and requires both accuracy and speed. The problem can be summarized as follows: given an input address $s$, we want to find the set of addresses $S = \{s_1, s_2, \ldots, s_n\}$ related to $s$, where $S$ is a closed set whose items are existing addresses. The traditional approach to the Address Correction problem uses Levenshtein distance (minimum edit distance), Hamming distance or other fuzzy string-matching methods, which calculate a similarity score between a target address and the input address; candidates are suggested based on their ranking. These methods are basic approaches but do not behave correctly when the input address is poorly formatted, contains typos, uses aliases instead of a formatted address, etc.
H.-Q. Tran and T.-N. Vo contributed equally to this work.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023. K. Arai (Ed.): SAI 2023, LNNS 739, pp. 629–642, 2023. https://doi.org/10.1007/978-3-031-37963-5_43
We split the Address Searching task into two subtasks: Address Correction (converting the input address into its correct form) and Address Suggestion (recommending addresses related to the input). In this paper, we propose a new approach to the Address Correction problem using Machine Translation. Our approach uses token-level error correction: given an input address $s$, we tokenize the address with a corpus-based tokenizer, then pass the tokenized vector to a GRU Attention Seq2Seq model. For comparison, we perform the same task with a traditional LSTM Seq2Seq model and a Transformer model. The evaluations were made on several datasets: AU-200K, US-200K, AU-1M and US-1M for Australian and United States addresses, and HCM-200K, HCM-1M and HCM-real for Vietnamese addresses. This paper does not elaborate further on the Address Suggestion phase; instead, we propose a naïve suggestion method using Beam Search. The contributions of this paper are as follows:
1. We created AU-200K, US-200K, AU-1M, US-1M, HCM-200K, HCM-1M and HCM-real as specialized datasets for the Address Correction task.
2. We introduced a new Machine Translation-based approach to the Address Correction problem and a complete address-searching pipeline with Beam Search.
The paper is structured as follows: Sect. 2 reviews the literature, and Sect. 3 introduces our Machine Translation architectures applied to the Address Correction problem. Section 4 introduces the datasets and how we built them. Section 5 covers the training configuration and evaluation, and Sects. 6 and 7 present the discussion of the outcomes and the conclusion, respectively.
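For reference, the traditional fuzzy-matching baseline mentioned at the start of this section can be sketched as follows. This is only an illustrative sketch, assuming a small in-memory list of known addresses; the helper names `levenshtein` and `rank_candidates` are hypothetical and do not come from any existing search engine.

```python
def levenshtein(a, b):
    """Minimum number of single-character insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]

def rank_candidates(query, addresses, top_k=3):
    """Rank known addresses by edit distance to the query (smaller is better)."""
    return sorted(addresses, key=lambda s: levenshtein(query.lower(), s.lower()))[:top_k]

known = ["155 BATTERY PARK DR", "94 SUBURBAN ST", "15 ELLALONG STREET PELAW MAIN"]
best = rank_candidates("155 BATTEYR PARK DR", known, top_k=1)
```

Such a ranking works for small typos but compares the query against every stored address and has no notion of aliases or abbreviations, which is the limitation that motivates the learned approach.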
2 Related Works
Several works pursue the same purpose through a different task named Address Standardization. Christen et al. (2005) [5] introduced an automated probabilistic approach to the Address Standardization problem, in which they used a national address guideline and a comprehensive address database to build a Hidden Markov Model (HMM) for cleaning, standardising and verifying raw input addresses. Bui et al. (2019) [2] proposed a method which leverages a Named Entity Recognition (NER) model as a suggestion to re-rank potential address candidates obtained in the fuzzy matching stage, using a log-linear model and Conditional Random Fields (CRF) NER-tagging. Among deep learning approaches, Sharma et al. (2018) [8] proposed a Multilayer Feedforward Network trained with stochastic gradient descent. Later, Cao et al. (2021) [3] introduced a BERT-based Siamese Neural Network, which embeds the raw input address and the target correct address into a single latent multi-dimensional space and then uses a ranking score to select the highest-ranked address.
These approaches share a common trait: the input address goes through a NER-tagging stage. The resulting address is then corrected tag by tag rather than as a whole. For example, the address “227 Nguyen Van Le Duan, TPHCM” would be standardised to “number: 227 Nguyen Van, street: Le Duan, city: Hanoi”. This is not a precise correction, especially with a closed address dataset.
3 Address Correction Using Machine Translation-Based Grammatical Correction Approach
3.1 Problem Statement
Table 1. Example of Input and Corrected Address
Table 1 shows some cases that an address search engine faces when dealing with human input. In our approach, we treat each input address as a grammatically incorrect sentence and the correct address as a grammatically correct sentence. Hence, the problem becomes Grammatical Error Correction, where we correct a sentence based on a grammar baseline. Numbers, aliases, abbreviations and special characters in an address are preserved. This can be extended further to a Machine Translation problem, in which we translate a wrong sentence into a correct one. Some previous approaches used NER-tagging to standardize addresses before passing them to the search engine. As mentioned above, the NER approach has the drawback that if a tag is wrong, the output will be wrong too. For this reason, we skip the NER-tagging stage in order to preserve the structure of the address. Our proposed framework is shown in Fig. 1.
3.2 Address Correction
LSTM Seq2Seq. Sutskever et al. (2014) [9] introduced Seq2Seq as an efficient architecture for Machine Translation. The goal of Seq2Seq is to estimate the probability $P(y_1, \ldots, y_{T'} \mid x_1, \ldots, x_T)$, where $(x_1, \ldots, x_T)$ is the input sentence and $(y_1, \ldots, y_{T'})$ is the output sentence, whose length $T'$ may differ from the input sentence's length $T$. The original implementation of Seq2Seq uses Recurrent Neural Network (RNN) blocks. Given the previous state $h_{t-1}$ and the current input $x_t$, the current hidden state $h_t$ and the current output $y_t$ are calculated as
$$h_t = \sigma(W^{hx} x_t + W^{hh} h_{t-1}), \qquad y_t = W^{yh} h_t \tag{1}$$
Fig. 1. Proposed Framework with GRU Attention and Beam Search
During backpropagation, RNNs suffer from vanishing gradients and perform poorly on long sequences. Hochreiter et al. (1997) [6] introduced the LSTM as a replacement for RNNs when dealing with long sequences. At timestep $t$, the LSTM block receives the input $x_t$ and the hidden states $h_{t-1}$ and $c_{t-1}$ from the previous step $t - 1$. The output $o_t$ and the current hidden states $h_t$ and $c_t$ are calculated as¹
$$\begin{aligned} f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f) \\ i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i) \\ o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o) \\ c_t &= f_t \odot c_{t-1} + i_t \odot \sigma(W_c x_t + U_c h_{t-1} + b_c) \\ h_t &= o_t \odot \tanh(c_t) \end{aligned} \tag{2}$$
We replicated the Seq2Seq model with 256-unit LSTM blocks, as shown in Fig. 2. The embedding dimension is 300 for both the Encoder and the Decoder, while the maximum address length is a predefined constant, given later in Table 9. We also implemented a bidirectional version with Bidirectional LSTM (BiLSTM) blocks. In this scenario the Decoder uses 512 LSTM units.
¹ ⊙ is the notation for element-wise multiplication.
Fig. 2. LSTM Seq2Seq Network
GRU and Attention Mechanism. Bahdanau et al. (2014) [1] introduced the Attention mechanism, and then Luong et al. (2015) [7] presented the concept of Global and Local Attention. For this approach, we use Global Attention for the Attention layer.
Fig. 3. GRU Attention Model Architecture
We improve the Seq2Seq architecture of the previous section by adding a 256-unit Attention layer to the Decoder for each of the three versions of Global Attention (Dot, General and Concatenate), where the score function is defined as
$$\mathrm{score}(h_t, \bar{h}_s) = \begin{cases} h_t^{\top} \bar{h}_s & \text{(dot)} \\ h_t^{\top} W_a \bar{h}_s & \text{(general)} \\ v_a^{\top} \tanh(W_a [h_t; \bar{h}_s]) & \text{(concat)} \end{cases} \tag{3}$$
However, the training process was slowed down by the computational cost of LSTMs when used with a large dataset, together with the extra calculations of the Attention mechanism. Cho et al. (2014) [4] introduced GRUs, which use less memory and train more quickly. At the current timestep $t$, with input $x_t$ and previous hidden state $h_{t-1}$ from timestep $t - 1$, the current hidden state $h_t$ is calculated as
$$\begin{aligned} z_t &= \sigma(W_z [h_{t-1}, x_t]) \\ r_t &= \sigma(W_r [h_{t-1}, x_t]) \\ \tilde{h}_t &= \tanh(W [r_t \odot h_{t-1}, x_t]) \\ h_t &= (1 - z_t) \odot h_{t-1} + z_t \odot \tilde{h}_t \end{aligned} \tag{4}$$
We replaced the LSTM blocks with 256-unit GRUs in both the Encoder and the Decoder; the architecture is shown in Fig. 3. In the bidirectional (BiGRU) version, the number of GRU units in the Decoder is set to 512. In the BiGRU scenarios with General and Concatenate Attention, the number of Attention units is set to 512.
Transformer. The Transformer was first introduced in the paper Attention is all you need (Vaswani et al. 2017) [10]. It avoids the token-by-token processing of recurrent models by passing the whole sequence to the model at once. In the original paper, the model dimension $d_{model}$ is set to 512, the number of Encoder and Decoder layers $N$ to 6, the number of attention heads $h$ to 8, the dropout rate $P_{drop}$ to 0.1 and the feed-forward dimension $d_{ff}$ to 2048. The Adam optimizer is configured with $\beta_1 = 0.9$, $\beta_2 = 0.98$, $warmup\_steps = 4000$, $\epsilon = 10^{-9}$ and the learning rate defined by the function
$$lrate = d_{model}^{-0.5} \cdot \min(step\_num^{-0.5},\ step\_num \cdot warmup\_steps^{-1.5}) \tag{5}$$
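The learning-rate schedule of Eq. (5) can be written directly as a small function. The sketch below is plain Python with a hypothetical function name `transformer_lrate`; the default arguments follow the values quoted from the original Transformer paper, and clamping the step to at least 1 is an assumption added here to avoid raising zero to a negative power.

```python
def transformer_lrate(step_num, d_model=512, warmup_steps=4000):
    """Learning rate of Eq. (5): linear warm-up, then inverse square-root decay."""
    step_num = max(step_num, 1)  # assumption: steps are counted from 1
    return d_model ** -0.5 * min(step_num ** -0.5,
                                 step_num * warmup_steps ** -1.5)

# The rate grows during warm-up, peaks at step_num == warmup_steps, then decays.
warmup_rate = transformer_lrate(100)
peak_rate = transformer_lrate(4000)
late_rate = transformer_lrate(8000)
```

Both arguments of `min` coincide at `step_num == warmup_steps`, which is where the schedule switches from the linear warm-up to the decay branch. A smaller `d_model` scales the whole curve up by a constant factor.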
Due to hardware limitations and the size of our datasets, we implemented a lighter version of the Transformer, in which the model dimension $d_{model}$ is reduced to 128, the number of layers $N$ to 4 and the feed-forward dimension $d_{ff}$ to 512.
3.3 Address Suggestion
Beam Search. Classical Seq2Seq models use a greedy strategy: they use the current token $\hat{y}_i$ and the previous tokens $\hat{y}_{i-1}, \hat{y}_{i-2}, \ldots, \hat{y}_1$ to calculate the probabilities $P(\hat{y}_{i+1} \mid \hat{y}_1, \ldots, \hat{y}_i)$ of the next token $\hat{y}_{i+1}$, then take the token with the highest probability. Hence, a classical Seq2Seq model produces only a single sequence, which is insufficient for the whole address-searching pipeline (which both corrects the input address and suggests addresses related to it). Using Beam Search, we generate multiple predictions from one input string, which overcomes this drawback of the greedy approach. The concept of the algorithm is illustrated in Fig. 4: at each step, Beam Search generates the beam width tokens that have the highest joint probability. The beam width partial strings with the highest probability are then used to predict the next token, until each of them reaches the end token. We attach Beam Search to the Decoder for the suggestion phase, which turns it into a complete address-searching pipeline.
Fig. 4. Beam Search with a Beam Width of 3 Tokens
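The Beam Search procedure described above can be sketched in a few lines. This is a simplified illustration, not the pipeline's actual decoder: the callback `next_token_probs`, the token spellings `<s>` and `</s>`, and the toy probability table are all assumptions made for the example, standing in for the Decoder's softmax output.

```python
import math

def beam_search(next_token_probs, start, end, beam_width=3, max_len=10):
    """Return up to beam_width finished sequences with their log-probabilities."""
    beams = [([start], 0.0)]  # (tokens, cumulative log-probability)
    finished = []
    for _ in range(max_len):
        candidates = []
        for tokens, score in beams:
            for tok, p in next_token_probs(tokens).items():
                if p > 0.0:
                    candidates.append((tokens + [tok], score + math.log(p)))
        if not candidates:
            break
        # Keep only the beam_width most probable extensions.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for tokens, score in candidates[:beam_width]:
            (finished if tokens[-1] == end else beams).append((tokens, score))
        if not beams:
            break
    finished.sort(key=lambda c: c[1], reverse=True)
    return finished[:beam_width]

# Toy next-token distribution keyed on the last token only (illustration).
TOY = {
    "<s>": {"155": 1.0},
    "155": {"battery": 0.7, "battle": 0.3},
    "battery": {"park": 0.6, "place": 0.4},
    "battle": {"rd": 1.0},
    "park": {"dr": 1.0},
    "place": {"</s>": 1.0}, "rd": {"</s>": 1.0}, "dr": {"</s>": 1.0},
}

def toy_model(tokens):
    return TOY[tokens[-1]]

suggestions = beam_search(toy_model, "<s>", "</s>", beam_width=3)
greedy = beam_search(toy_model, "<s>", "</s>", beam_width=1)
```

With `beam_width=1` the procedure reduces to the greedy strategy and returns a single address, whereas a wider beam returns several ranked candidates that can serve as suggestions. Hypotheses that have not reached the end token within `max_len` steps are simply dropped in this sketch.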
4 Dataset
4.1 Data Source
United States and Australian. We collected United States and Australian addresses from openaddresses.io², specifically from the Northeastern United States and from New South Wales, Australia. Most United States and Australian addresses follow the format: number, street (e.g. 94 (number) Suburban St. (street)).
Vietnamese. For Vietnamese addresses, VietMap.vn³ provided us with a set of approximately 200000 unique addresses in Ho Chi Minh City, Vietnam. Most addresses follow the format: number, street, ward, district, city (e.g. 227 … (No. 227 …), … 4 (Ward 4), … 5 (District 5), TP. … Minh (Ho Chi Minh City)).
² https://openaddresses.io
³ https://vietmap.vn
4.2 Building the Dataset for Training and Evaluation
Human-Input Data Augmentation for Training. None of the plain data collected contains human-input addresses, which can include spelling mistakes or mistypes. We therefore generated datasets from the originals using the following approaches:
– Missing letters (e.g. “battery” to “battyr”)
– Swapped letters (e.g. “battery” to “batteyr”)
– Doubled/repeated letters (e.g. “battery” to “baaterry”)
– Fat-finger errors (letters replaced by neighbouring/close letters) (e.g. “park” to “pasrk”)
For Vietnamese addresses, some additional common errors are included:
– No IME support (e.g. “Huỳnh Tấn Phát” will be converted to “huynhf tanas phats”, “huyfnh taans phast”, ... in Telex or “huynh2 tan61 phat1”, “huy2nh ta61n pha1t”, ... in VNI)
– No diacritics (e.g. “Huỳnh Tấn Phát” will be converted to “huynh tan phat”)
– Abbreviations
– Swapped token positions
For each address, we generate several human-input addresses, each with one error at a random position within the string, using any combination of the above methods. Examples of generated United States, Australian and Vietnamese addresses are presented in Table 2, Table 3 and Table 4, respectively.
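The character-level augmentations listed above can be sketched as follows. This is a minimal illustration under stated assumptions: the keyboard-neighbour map `NEIGHBOURS` is a small hypothetical excerpt of a QWERTY layout, the fat-finger error is modelled as an extra neighbouring key inserted after the intended one (matching the “park” to “pasrk” example), and all function names are invented for the example.

```python
import random
import unicodedata

# Hypothetical, partial map of neighbouring QWERTY keys (illustration only).
NEIGHBOURS = {"a": "qwsz", "e": "wsdr", "r": "edft", "k": "jiolm", "g": "fhtvb"}

def drop_letter(s, i):
    """Missing letter: remove the character at position i."""
    return s[:i] + s[i + 1:]

def swap_letters(s, i):
    """Swapped letters: exchange the characters at positions i and i + 1."""
    if i + 1 >= len(s):
        return s
    return s[:i] + s[i + 1] + s[i] + s[i + 2:]

def double_letter(s, i):
    """Doubled letter: repeat the character at position i."""
    return s[:i] + s[i] + s[i:]

def fat_finger(s, i, rng):
    """Fat finger: insert a neighbouring key after the character at position i."""
    near = NEIGHBOURS.get(s[i].lower())
    if not near:
        return s
    return s[:i + 1] + rng.choice(near) + s[i + 1:]

def strip_diacritics(s):
    """No diacritics: drop combining marks after Unicode decomposition."""
    decomposed = unicodedata.normalize("NFD", s)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # The letter d-with-stroke has no decomposition, so it is mapped explicitly.
    return stripped.replace("đ", "d").replace("Đ", "D")

def augment(address, rng):
    """Apply one randomly chosen error at one random position."""
    i = rng.randrange(len(address))
    ops = [drop_letter, swap_letters, double_letter,
           lambda s, k: fat_finger(s, k, rng)]
    return rng.choice(ops)(address, i)

rng = random.Random(42)  # fixed seed only to make the sketch reproducible
noisy = [augment("155 BATTERY PARK DR", rng) for _ in range(3)]
```

Each noisy string would be paired with its clean source address to form one (human input, address) training example. Telex/VNI keystroke expansion, abbreviation and token swapping are not covered by this sketch.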
Table 2. Example of the United States Dataset Structure
human input               address
155 BATTERY PASRK DR      155 BATTERY PARK DR
155 BATTERY PAREK DR      155 BATTERY PARK DR
155 BATTEYR PARK DR       155 BATTERY PARK DR
...                       ...
Table 3. Example of the Australian Dataset Structure
human input                        address
15 ELLASLONG STREET PELAW MAIN     15 ELLALONG STREET PELAW MAIN
15 ELLAZLONG STREET PELAW MAIN     15 ELLALONG STREET PELAW MAIN
15 ELLALONV STREET PELAW MAIN      15 ELLALONG STREET PELAW MAIN
15 ELLALONG STREET PELAW MAIIN     15 ELLALONG STREET PELAW MAIN
15 ELLALONG STREET PESLAW MAIN     15 ELLALONG STREET PELAW MAIN
We created six address datasets of two sizes (200K: 200000 records; 1M: $10^6$ records) from the three plain data sources above (the Northeastern United States, New South Wales in Australia, and Ho Chi Minh City). We split each dataset in the ratio 9:0.5:0.5 (train/dev/test) if its size is $10^6$, and 6:2:2 otherwise. The dataset statistics, including total lines, total unique addresses, average address length in tokens and average address length in characters, are presented in Table 5 and Table 6.
Real-User Input. VietMap.vn provided us with 1000 unique addresses entered by real users in Ho Chi Minh City, collected from sales and logistics. Some of these addresses do not exist in the HCM-200K and HCM-1M datasets generated above, hence we cannot use the models pretrained on those datasets for evaluation. To keep the evaluations comparable between datasets, we corrected these addresses by hand, then merged them with a portion (over 25000) of the unique Ho Chi Minh City addresses to generate over 500000 pseudo-input addresses. These addresses are used for training, while the real user inputs are kept untouched for testing. The statistics of this set are given in Table 7.
Table 4. Example of the Vietnamese Dataset Structure
Table 5. Dataset Insight, 200K Datasets
Dataset   Lines   Unique  Avg. tokens  Avg. length
AU-200K   200000  7997    4.431455     27.316635
US-200K   200000  8641    3.511445     16.25124
HCM-200K  200000  9607    14.206825    57.602715
Table 6. Dataset Insight, 1M Datasets
Dataset  Lines  Unique  Avg. tokens  Avg. length
AU-1M    10^6   40480   4.480528     27.404619
US-1M    10^6   44934   3.895049     18.667574
HCM-1M   10^6   47993   13.545150    53.779572
Table 7. Dataset Insight, HCM-Real
File             Lines   Unique  Avg. tokens  Avg. length
HCMC-real.train  510718  29309   13.158955    56.297847
HCMC-real.dev    127678  29309   13.162033    56.304986
HCMC-real.test   999     986     14.331331    62.040040
4.3 Preprocessing
We applied some preprocessing steps, including lowercasing and adding start and stop tokens to the address. The start and stop tokens are required to mark the beginning and end of a sentence. Examples of input and preprocessed addresses are presented in Table 8.
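The preprocessing step can be sketched as a single function. The exact spelling of the start and stop tokens is not recoverable from the text, so `<start>` and `<end>` below are placeholders chosen for this example, as is the function name `preprocess`.

```python
START, END = "<start>", "<end>"  # hypothetical token spellings

def preprocess(address):
    """Lowercase the address and wrap it in start and stop tokens."""
    return f"{START} {address.lower().strip()} {END}"

example = preprocess("155 BATTERY PARK DR")
```

The same function would be applied to both the human-input string and the target address before tokenization, so that the Decoder can learn when a corrected address ends.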
Table 8. Preprocessing Example
4.4 Maximum Token Length and Tokenizer
We defined the maximum length of a sequence as a constant. The tokenizer size, on the other hand, is based on the number of unique tokens in each dataset. Both are shown in Table 9.
Table 9. Maximum Tokens of a Sequence based on Datasets
Dataset   Maximum tokens  Tokenizer size (Input/Output)
AU-200K   10              55943/4556
AU-1M     10              160693/8982
US-200K   10              30138/3795
US-1M     10              85070/7945
HCM-200K  32              13222/2221
HCM-1M    32              54269/12153
HCM-real  32              54643/14558
5 Experiments and Evaluation
5.1 Training Configuration
We use a batch size of 512 samples, with early stopping enabled, monitoring val_loss with patience = 5 and min_delta = 0.0001. Most architectures are compiled with the Adam optimizer and the sparse_categorical_crossentropy loss, except the Transformer, which, following its original paper, requires a masked categorical cross-entropy loss function and a masked accuracy function. TensorFlow 2.0 was used for training on the Google Colaboratory and Kaggle platforms.
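The early-stopping rule used in this configuration can be illustrated without any deep learning framework. The class below is a simplified sketch of patience-based stopping on the validation loss; it is not the framework's callback, and the name `EarlyStopper` and the sample loss values are invented for the example.

```python
class EarlyStopper:
    """Stop when the validation loss has not improved by min_delta for `patience` epochs."""

    def __init__(self, patience=5, min_delta=1e-4):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")
        self.wait = 0

    def should_stop(self, val_loss):
        if val_loss < self.best - self.min_delta:
            self.best = val_loss  # sufficient improvement: reset the counter
            self.wait = 0
        else:
            self.wait += 1        # no sufficient improvement this epoch
        return self.wait >= self.patience

stopper = EarlyStopper(patience=5, min_delta=1e-4)
val_losses = [0.5, 0.4, 0.39995, 0.4, 0.41, 0.4, 0.4, 0.3]  # made-up values
stopped_at = None
for epoch, loss in enumerate(val_losses):
    if stopper.should_stop(loss):
        stopped_at = epoch
        break
```

In this made-up run the third value improves on the best loss by less than `min_delta`, so it does not reset the counter and training stops after five consecutive epochs without sufficient improvement.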
5.2 Evaluation
BLEU scores are used to assess translation quality in Machine Translation, primarily for models where the source and target languages have complex underlying semantics, which is not the case for the structure of an address. Therefore, we use accuracy on a single address correction as the evaluation metric. Consider the human-input set $X = \{x_1, x_2, \ldots, x_n\}$, the predicted address set $\hat{Y} = \{\hat{y}_1, \hat{y}_2, \ldots, \hat{y}_n\}$ and the corrected (ground-truth) address set $Y = \{y_1, y_2, \ldots, y_n\}$, where $\hat{y}_i$ and $y_i$ are the address predicted by the model from the input $x_i$ and the correct address, respectively. Let $n_{identical}$ be the total number of pairs $(\hat{y}_i, y_i)$ that are identical. The accuracy is then calculated as

$$\mathrm{Accuracy}(\hat{Y}, Y) = \frac{n_{identical}}{n} \qquad (6)$$

The accuracy of the models on the test sets is presented in Table 10.
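Eq. (6) is an exact-match accuracy over whole addresses. A minimal sketch follows; the function name and the sample addresses are hypothetical and only illustrate the counting.

```python
def exact_match_accuracy(predicted, ground_truth):
    """Fraction of predictions identical to their ground-truth address (Eq. 6)."""
    if len(predicted) != len(ground_truth):
        raise ValueError("predicted and ground_truth must have the same length")
    if not ground_truth:
        return 0.0
    n_identical = sum(1 for y_hat, y in zip(predicted, ground_truth) if y_hat == y)
    return n_identical / len(ground_truth)


# Hypothetical example: two of the three predictions match exactly.
predicted = ["13 main street", "7 oak avenue", "21 hill road"]
ground_truth = ["13 main street", "7 oak avenue", "12 hill road"]
accuracy = exact_match_accuracy(predicted, ground_truth)
```

Because the comparison is on the full string, a prediction with a single wrong token, such as a wrong house number, counts as entirely incorrect.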
Table 10. Accuracy on Test Sets

                 Australia         United States     Ho Chi Minh City
Model            200K     1M       200K     1M       200K     1M       real
LSTM             0.9662   0.9826   0.9397   0.9874   0.9957   0.9956   0.9624
BiLSTM           0.9659   0.9841   0.9373   0.9878   0.9956   0.9955   0.9366
Transformer      0.9626   0.9829   0.9377   0.9870   0.9947   0.9949   0.9475
GRU Dot          0.9729   0.9803   0.9707   0.9943   0.9977   0.9979   0.9727
GRU General      0.9782   0.9814   0.9737   0.9935   0.9977   0.9979   0.9743
GRU Concat       0.9766   0.9866   0.9740   0.9938   0.9977   0.9979   0.9713
BiGRU Dot        0.9738   0.9849   0.9707   0.9926   0.9975   0.9978   0.9685
BiGRU General    0.9761   0.9883   0.9707   0.9938   0.9976   0.9979   0.9728
BiGRU Concat     0.9764   0.9899   0.9689   0.9940   0.9974   0.9980   0.9662

6 Discussion
We achieve 97% to 99% accuracy on the 200K datasets and 98% to 99% accuracy on the 1M datasets when using unidirectional and bidirectional GRU with Attention. On the real user input set, our proposed architectures reach about 97% accuracy. Compared against the LSTM, BiLSTM, and Transformer approaches, GRU Attention performs clearly better in some cases and approximately the same or slightly better in the rest. While GRU and LSTM require less memory and less training data, the Transformer requires more data, and the data itself has to be varied to achieve notable results. The semantics, patterns and contexts of an address are too shallow for this and are therefore not well suited to the Transformer. With GRU, these contexts are sufficient for a good outcome. Additionally, the Transformer would require a large and varied address dataset, as well as an enormous parameter set and suitable hardware, to achieve better results. With optimized computation and memory usage, GRU trains in less time and with fewer resources. It is important to note that, although the accuracy of our model is high, it is still not able to resolve certain problems. If there is an error in the house number and the erroneous number is still valid for the same street and city, our model will predict the wrong number, although the street name and city will be correct. Moreover, if the input stops at a token that can lead to more than one acceptable output, the output may be inaccurate. For example, if the input is 13 kim ... and the output is 13 ..., models may predict another valid address, something like 13 ... We believe the pipeline with Beam Search and a larger address dataset can resolve both of these problems.
7 Conclusion
We introduced a new approach to Address Correction using Machine Translation-based networks with the GRU Attention model. Unlike other (and current state-of-the-art) methods, this approach does not require a NER-tagging phase before correction. We built a corpus of seven datasets, which have different origins (United States, Australia and Vietnam; generated and real user inputs) and different sizes. Evaluations on these datasets yield a minimum of 97% and a maximum of 99% accuracy for our proposed architecture. Although the Transformer is the state-of-the-art language model in most language-related tasks, our study shows that it is not suitable for the Address Correction task, where language semantics has little impact on the correction. This study also indicates that traditional Seq2Seq models with an Attention mechanism can achieve the best performance because of their simple architecture and the power of the Attention mechanism. In the future, we plan to present a method that completes the Address Searching pipeline by using a Trie. Beam Search can generate related addresses; however, there is a chance that the generated tokens form a faulty address rather than a correct one. We suggest using a data structure (e.g. Trie, B-Tree) to search for related addresses in a closed set. We also want to perform experiments on the latest Transformer-based architectures, including BERT, T5 and GPT, to get a full picture of how different architectures perform and whether they are suitable for the Address Correction problem.
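The closed-set lookup suggested as future work could, for instance, be sketched with a token-level trie. This is only an illustration under assumptions: the class `AddressTrie`, its methods, and the sample addresses are hypothetical and do not come from the paper.

```python
class AddressTrie:
    """Token-level trie over a closed set of known addresses."""

    def __init__(self):
        self.children = {}
        self.is_end = False

    def insert(self, tokens):
        node = self
        for token in tokens:
            node = node.children.setdefault(token, AddressTrie())
        node.is_end = True

    def complete(self, prefix):
        """Return every stored address that starts with the given tokens."""
        node = self
        for token in prefix:
            node = node.children.get(token)
            if node is None:
                return []
        results = []
        stack = [(node, list(prefix))]
        while stack:
            current, path = stack.pop()
            if current.is_end:
                results.append(" ".join(path))
            for token, child in current.children.items():
                stack.append((child, path + [token]))
        return sorted(results)


# Hypothetical closed set of valid addresses.
trie = AddressTrie()
for address in ["13 main street", "13 main road", "15 main street"]:
    trie.insert(address.split())

candidates = trie.complete(["13", "main"])
```

Only addresses present in the closed set can be returned, which is the property that guards against faulty addresses assembled token by token.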
Analysis of Neural Network Architectures for Semantic Segmentation of Seismic Data

Gabriel Danilo Figueiredo da Silva1,2(B), João T. Dias1,2, Luciana Faletti Almeida1,2, and Milena Faria Pinto1,2

1 Federal Center for Technological Education - CEFET/RJ, Rio de Janeiro, Brazil
[email protected]
2 Electrical Engineering Graduate Program, Rio de Janeiro, RJ, Brazil
Abstract. With the advancement of technology, the characterization of oil reservoirs has become more accurate, and its complexity has increased, mainly with the acquisition of three-dimensional (3D) seismic data for locating oil and gas reservoirs. This article presents two deep neural network architectures applied to the semantic segmentation of seismic images. Only 100 images are used for training the networks. By applying the Contrast Limited Adaptive Histogram Equalization (CLAHE) filter with the U-Net architecture, it was possible to increase its performance and reach 99.37% accuracy, thus surpassing the state of the art obtained by the Danet-FCN3 architecture, which had reached an accuracy of 98.13%.
Keywords: U-Net · Danet-FCN3 · Semantic Segmentation

1 Introduction
Seismic data can be acquired in several ways. The most commonly used is the reflection method, which consists of a set of seismic shots performed at different points (latitude and longitude), located using Universal Transverse Mercator (UTM) coordinates, with the reflections captured at the surface by hydrophones or geophones [14]. As the waves are acquired indirectly, they carry a large amount of noise, such as the noise produced by tides and winds, which makes interpretation difficult [14]. The reconstruction of seismic data is a considerable challenge, even with technological advances in data acquisition and processing. As a result, artificial intelligence is playing an increasingly important role in processing this data [5]. The technique used to understand the subsurface is called seismic facies analysis; it measures seismic reflection parameters such as configuration, continuity, magnitude, and frequency within the layers (strata) of a depositional sequence [30]. This analysis is very important in the oil and gas industry because it provides information about the likely distribution of rock formations and geological bodies. It may also indicate lithology and possible hydrocarbon accumulation. Although this process is very useful, it can take months to complete, and as the amount of geophysical information continues to increase, it can overwhelm human interpreters [24].

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023. K. Arai (Ed.): SAI 2023, LNNS 739, pp. 643–658, 2023. https://doi.org/10.1007/978-3-031-37963-5_44

The interpretation of offshore seismic data plays a fundamental role in the exploration of oil wells and is indispensable for geoscientists and for the discovery of hydrocarbons accumulated in reservoirs. However, it is a costly activity in human terms, as it requires a great deal of time to interpret and process the seismic data [23]. The characterization of oil and gas reservoirs became more complex with the arrival of volumetric (3D) data and seismic attributes, as these data also provide qualitative information on the geometry and physical properties of the lithological layers [25]. In addition to these difficulties, increasingly tight deadlines, driven by the need for companies to be more competitive, and the growing amount of data are also complicating factors [14]. Geoscientists in the oil and gas industry analyze seismic data to find oil or natural gas reservoirs and are already used to working with large amounts of data. With the emergence of Deep Learning methods, studies were carried out to apply this technique as an alternative for solving the difficulties encountered by manual and assisted classification techniques and by unsupervised learning of seismic facies [21]. This type of algorithm is used to extract a representation of the data so that seismic facies can be classified automatically. However, to exploit the full potential of the data, the labels of the database must be well defined and of good quality so that deep learning techniques can be applied [17]. For seismic data classification tasks, many supervised machine learning models have been explored, such as support vector machines (SVM) [23], decision trees [18], multilayer perceptron neural networks (MLP) and convolutional neural networks [23].
These algorithms use labels predefined in the seismic data by humans who know the area. Therefore, their success depends on the seismic data labels and the expert's general knowledge. In this research work, a supervised architecture was chosen because the dataset already has labels (the original masks), making it easier for the architecture to find patterns and to train on the available data. Machine learning algorithms applied to seismic data can be further divided into two categories: classical clustering algorithms and image segmentation algorithms. Classical clustering algorithms have been successfully applied in seismic data exploration, for example Kohonen Maps (Self-Organizing Maps) [15], k-Means [32], and Fuzzy c-Means [13], among others. Segmentation algorithms are the most common for seismic images due to their good results; some of them have shown promising results, such as Danet-FCN3 [9], DeepLabv3Plus [2] and U-Net [29]. Each algorithm has its own complexity, which must be taken into consideration when training a model. In image segmentation algorithms, seismic layers are treated as independently processed image sequences. Therefore, a post-processing step is required to combine the results into seismic volumes. Another work, using unsupervised learning through the Watershed and SLIC (Simple Linear Iterative Clustering) techniques, is presented in [3]. These techniques were used to create a mask for seismic data without labels. In another article, no post-processing technique was necessary to achieve a satisfactory result. It successfully created output masks from input seismic data using a Danet-FCN (Dual Attention Network Fully Convolutional Network), which is a supervised architecture. Two seismic databases were used. The first was Penobscot, from the Scotian Basin (offshore Nova Scotia, Canada), and the other was from the Netherlands, from the Central-Graben Basin. During network training, it achieved a pixel accuracy greater than 97% [12]. In addition, a convolutional neural network (CNN) was trained on the seismic image databases of block F3 of the Central-Graben Basin and of Penobscot. In that article, the authors did not use any semantic image segmentation technique. Nevertheless, they reached an F1-Score of up to 98.7% using less than 5% of the database for training. However, of the 8 classes, classes 3 to 7 were grouped into a single class, turning the task into a classification problem with only 4 classes [10]. Regarding semantic segmentation, in this work, two deep neural network architectures were specially designed to semantically segment seismic images with a minimal amount of training data: U-Net and Danet-FCN3. It has been shown that this approach can achieve more than 99% accuracy. Furthermore, the qualitative results show that the model obtained can produce masks very close to the human interpretation [27]. A study was published on the classification of seismic facies and how they can be divided for better understanding by a neural network [10]. It was also shown that transfer learning can be used to accelerate and improve the training of networks [8]. As further work, the authors applied these techniques to the semantic segmentation of the rock layers [7]. As shown in Table 1, several works in this area used different segmentation networks. However, only a few used fewer than 100 images together with a low-complexity architecture.
In [31], the authors achieved 88% mIoU (mean Intersection over Union) using only 41 images. However, the article does not mention how many classes were used. In [4], the authors used the state-of-the-art PSP-Net (ResNet-34) to segment the seismic data into 7 classes. They achieved good results by applying post-processing, a contour detection method that fixes discontinuities in the segmentation, but using it increases the complexity of the network. The PSP-Net is a highly complex network based on a classical ResNet-34 architecture that involves 63.5 million parameters [16], almost double that of U-Net + CLAHE. Moreover, all the articles cited lack information about hyperparameter settings, such as the optimizer used and the learning rate. In addition, the network with the best performance in seismic image classification (Danet-FCN3) has a complex architecture that uses several layers of convolutions and transposed convolutions, resulting in high computational complexity. Other architectures, such as U-Net, are much less complex but have lower performance in classifying seismic images [7].
Table 1. Related works in semantic segmentation in seismic data.

Author                  Number of Slices  Metric Used          Classes  Tile Size   Architecture
Daniel Civitarese [7]   412               98.25% (mIoU)        7        40 × 40     Danet-FCN
Daniel Civitarese [9]   276               97.00% (accuracy)    6        40 × 40     Danet-FCN3
Daniel Civitarese [10]  160               75.80% (F1-Score)    4        40 × 40     CNN
Bilal Abid [2]          13421             93.31% (mIoU)        7        128 × 128   DeepLab v3Plus + augm.
Zengyan Wang [31]       41                88.30% (IoU)         –        Not Used    Dilated U-Net + Attention
Reinaldo Silva [28]     Not Mentioned     98.00% (mIoU)        6        64 × 25     CNN
Daniel Civitarese [12]  100               99.00% (mIoU)        7        80 × 120    Danet-FCN3
Maykol Trinidad [29]    1102              89.50% (FWIU)        6        Not Used    Atrous U-Net
Xiaoyu Chen [6]         1102              77.50% (mIoU)        6        Not Used    HRNetV2W32
Danilo Calhes [4]       32                89.16% (mIoU)        7        64 × 64     PSP-Net
Proposed Work           100               81.44% (mIoU)        10       128 × 128   U-Net + CLAHE
Thus, this work presents a careful study of the design of two neural network models for the semantic segmentation of seismic images. It proposes the U-Net + CLAHE architecture, which is less complex than Danet-FCN3 and still achieves a better result. The following contributions are expected:
– Implementation of an artificial neural network trained and adjusted to perform semantic segmentation on the proposed seismic database. The quality of the trained network can be verified through the Intersection over Union (IoU) metric on the test set drawn from the database;
– Demonstration of the importance of using pre-processing techniques to improve the performance of the network architecture.
In addition to carefully studying the architectures used for the semantic segmentation of seismic images, this is a replicable experiment, given that the database used is public.
2 Semantic Segmentation Architectures
The U-Net architecture can be defined as an extended version of the fully convolutional network (FCN) that provides more precise segmentations with small training sets. The main difference from the FCN lies in the decoder layers, which have many feature maps and are therefore symmetric to the encoder layers, giving the network a "U" shape, as shown in Fig. 1. The number of feature maps in the decoder is increased by concatenating the output of an unpooling layer with the output of the convolution layer of the corresponding encoder. In this way, the network learns both the information content of the image and where this content is located in the image. The contracting path of U-Net consists of the repeated application of two 3 × 3 convolutions, each followed by a rectified linear unit (ReLU), and a 2 × 2 max pooling operation with stride 2 for downsampling. At each downsampling step, the number of feature channels is doubled. Each step in the expansive path consists of an upsampling of the feature map followed by a 2 × 2 convolution (up-convolution) that halves the number of feature channels, a concatenation with the correspondingly cropped feature map from the contracting path, and two 3 × 3 convolutions, each followed by a ReLU. Cropping is required due to the loss of edge pixels in each convolution. In the final layer, a 1 × 1 convolution is used to map each 64-component feature vector to the desired number of classes [26]. Regarding the segmentation task, the neural network must be able to combine the location information with the contextual information from the feature maps of the image to be predicted. The U-Net architecture can join the contextual information obtained from the contraction path (encoder) with the location information acquired from the expansion path (decoder) with good performance [26]. To further improve the performance of the U-Net architecture, the images were pre-processed, as described later.
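The channel doubling, halving and cropping described above can be traced with simple shape arithmetic. The sketch below assumes unpadded 3 × 3 convolutions, four resolution levels, 64 base channels and a 572-pixel square input; these values are assumptions taken from the original U-Net design, not from this article, and the function name is hypothetical.

```python
def unet_shapes(input_size=572, base_channels=64, depth=4):
    """Trace (spatial size, channels) through an unpadded U-Net."""
    size, channels = input_size, base_channels
    encoder = []
    for _ in range(depth):
        size -= 4                      # two unpadded 3x3 convs lose 2 px each
        encoder.append((size, channels))
        size //= 2                     # 2x2 max pooling with stride 2
        channels *= 2                  # channels double at each downsampling
    size -= 4                          # two 3x3 convs at the bottom of the "U"
    bottleneck = (size, channels)

    decoder = []
    for skip_size, _ in reversed(encoder):
        size *= 2                      # upsampling doubles the spatial size
        channels //= 2                 # the up-convolution halves the channels
        crop = skip_size - size        # pixels cropped from the encoder map
        size -= 4                      # two unpadded 3x3 convs after the concat
        decoder.append((size, channels, crop))
    return encoder, bottleneck, decoder


encoder, bottleneck, decoder = unet_shapes()
```

Under these assumptions, the last decoder stage carries 64 channels, which is the 64-component feature vector that the final 1 × 1 convolution maps to the class scores.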
Fig. 1. The Description of the U-Net Architecture, Separating the Encoder from Decoder [26]
2.1 Danet-FCN3
The Danet-FCN3 architecture was based on the VGG-FCN topology, in which the fully connected layers at the end of the network are converted into 1 × 1 convolutions. This architecture uses residual blocks to extract the features (the encoder) and transposed units with the same structure to recreate the original image with all its labels (the decoder) [19]. Figure 2 shows all layers of the Danet-FCN3 architecture, where S is the stride and #CH is the number of channels (filters) used in all operations within the block. Thus, the first entry, C5 S2 64, means a convolution with 64 filters of size 5 × 5 and a stride of 2. Res and Rest are composed of 3 residual units and 3 transposed residual units, respectively. The Res block has one residual unit with a stride equal to 2, followed by two residual units with a stride equal to 1. The Rest block has two transposed residual units with a stride equal to 1, followed by one transposed residual unit with a stride equal to 2. Ru and Rut are the residual and transposed residual units, respectively, as shown in Fig. 3. Each residual unit is composed of a convolution followed by Batch Normalization (Conv+BN), a ReLU activation, and a second Conv+BN whose output is added to the input (shortcut), followed by a ReLU activation. The transposed residual unit follows the same architecture but uses transposed convolutions. The parameters used can be found in [9].
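The block notation used for the Danet-FCN3 layers can be read mechanically. The following sketch parses an entry such as C5 S2 64 into convolution settings and lists the stride pattern of the Res and Rest blocks as described in the text; the function names and the returned dictionary keys are hypothetical, chosen only for illustration.

```python
import re


def parse_conv_entry(entry):
    """Parse an entry like 'C5 S2 64' into kernel size, stride and filters."""
    match = re.fullmatch(r"C(\d+)\s+S(\d+)\s+(\d+)", entry.strip())
    if match is None:
        raise ValueError(f"unrecognized layer entry: {entry!r}")
    kernel, stride, filters = (int(group) for group in match.groups())
    return {"kernel_size": (kernel, kernel), "stride": stride, "filters": filters}


def block_strides(transposed=False):
    """Strides of the three units in a Res block, or in a Rest block if transposed."""
    return [1, 1, 2] if transposed else [2, 1, 1]


first_layer = parse_conv_entry("C5 S2 64")
```

The mirrored stride order reflects the description above: the encoder block downsamples in its first unit, while the transposed block upsamples in its last unit.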
Fig. 2. Topology of the Danet-FCN3 Model. Note on the Right of this Figure, a Description of each Block [19]
3 Geological Information
This work uses the F3 block database from the Dutch sector of the North Sea. This dataset contains a seismic survey of approximately 384 km². It is a public database in the Graben Basin, situated 180 km off the coast of the Netherlands, which can be accessed in the repository available in [1]. In the seismic block of Fig. 4, the inline slices are perpendicular to the inline axis, and the crossline slices are perpendicular to the crossline axis, both extending along depth. The database without pre-treatment has 1602 slices: across these 384 km², there are 651 inlines and 951 crosslines. The inlines have a dimension of 951 × 462, and the crosslines 651 × 462. Low-quality or damaged images were removed from the database so as not to interfere with the network's learning. The inline and crossline images have their labels already provided by the database author. In Fig. 5, it is possible to see an inline slice of a 3D seismic block separated by class.

Fig. 3. The Ru Block is the Residual Unit and the Rut Block is the Transposed Residual Unit [19]
Fig. 4. Example of Inlines and Crosslines in a 3D Seismic Block
Geologists have identified 9 interfaces in the database, the number increasing with depth from H1 to H9. Seismic facies can be described as:
Fig. 5. In (A) it Shows the Original Image and in (B) it Shows the Image with the 10 Separated Classes
– Class 1: Ocean;
– Class 2: The upper reflective layer is a layered seismic facies, and the lower reflective layer is a homogeneous seismic facies;
– Class 3: Seismic facies consist of low-amplitude reflectors and continuous emitters, and are mainly associated with neritic environments dominated by clay deposits;
– Class 4: The reflectors mainly exhibit features in the form of discontinuous mounds;
– Class 5: The reflectors are predominantly subparallel and have variable amplitudes;
– Class 6: The reflectors have a sigmoid progradational configuration of low energy and medium to low amplitude;
– Class 7: Seismic facies between H6 and H5 mainly consist of high-amplitude parallel reflectors;
– Class 8: These facies present semi-continuous and low-amplitude reflections, which makes their identification difficult;
– Class 9: This interval presents contorted to mounded facies of low amplitude;
– Class 10: Facies between H9 and H8 are noisy, possibly due to acquisition noise or seismic processing failure.
4 Database Preparation
As discussed in [20] and [11], different types of substrates can be identified using the texture characteristics of seismic images. The process can therefore be automated using a neural network model that distinguishes the textures in the seismic images. The first step was to apply a filter to the image, the Contrast Limited Adaptive Histogram Equalization (CLAHE), to increase the contrast of the image pixels. This filter is a modification of the Adaptive Histogram Equalization (AHE), and histogram equalization is a relatively simple image enhancement method. The CLAHE filter can be described as follows:
1. Image acquisition;
2. Acquisition of input values, such as the number of regions, the clip limit, and the distribution type parameters;
3. The original image is divided into tiles;
4. The process is applied to all tiles;
5. A gray level map is generated, and the histogram is clipped. The pixels are divided equally among the gray levels, and the average number of pixels per gray level is given by Eq. 1 [22]:

$$N_{average} = \frac{N_{CR-Xp} \cdot N_{CR-Yp}}{N_{gray}} \qquad (1)$$

where $N_{average}$ is the average number of pixels, $N_{gray}$ is the number of gray levels in a tile, $N_{CR-Xp}$ is the number of pixels in the X direction of the tile, and $N_{CR-Yp}$ is the number of pixels in the Y direction of the tile;
6. Interpolation of the gray level mapping to create an enhanced image.
As a result, the most important details of the seismic images, which are the interfaces, are highlighted more clearly, as shown in Fig. 6.
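Step 5 and Eq. 1 can be illustrated for a single tile with a short sketch. It covers only the averaging and histogram clipping, not the full CLAHE filter; the function names, the clip limit, and the uniform redistribution of the clipped excess are assumptions made for illustration.

```python
def average_pixels_per_gray_level(n_x, n_y, n_gray):
    """Eq. 1: average number of pixels per gray level in one tile."""
    return (n_x * n_y) / n_gray


def clip_histogram(hist, clip_limit):
    """Clip each histogram bin at clip_limit and spread the excess evenly.

    Simplification: the excess is redistributed in a single pass, so a bin
    may end slightly above the limit; full implementations often iterate.
    """
    excess = sum(max(count - clip_limit, 0) for count in hist)
    clipped = [min(count, clip_limit) for count in hist]
    bonus = excess / len(hist)
    return [count + bonus for count in clipped]


n_average = average_pixels_per_gray_level(128, 128, 256)
clipped = clip_histogram([10, 0, 2, 0], clip_limit=4)
```

Clipping limits how much any single gray level can dominate the tile histogram, which is what keeps the contrast enhancement from amplifying noise in flat regions.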
Fig. 6. Seismic Images. (A) Original Image. (B) Image using the CLAHE Filter and Resizing. It is Possible to See the Difference of Size between (A) and (B)
The original images have floating-point values in a range of approximately [−30000, 33000]. As a result, it was necessary to rescale the values to the grayscale range [0, 255], stored as 8-bit integers. This step causes extreme amplitudes to be disregarded, thus improving the representation of the images in the database. The range from 0 to 255 is used for all 10 classes during training. Figure 5 presents the 8-bit image and its mask with the 10 classes.
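The rescaling to 8 bits can be sketched as a linear mapping with clipping. The article does not state the exact mapping, so the linear form, the clipping bounds and the function name below are assumptions.

```python
def rescale_to_uint8(values, lower, upper):
    """Linearly map amplitudes from [lower, upper] to integers in [0, 255].

    Amplitudes outside the bounds are clipped first, so extreme values
    collapse onto 0 or 255.
    """
    scaled = []
    for value in values:
        clipped = min(max(value, lower), upper)
        scaled.append(round((clipped - lower) / (upper - lower) * 255))
    return scaled


# Hypothetical amplitudes, including one beyond the assumed upper bound.
pixels = rescale_to_uint8([-30000.0, 33000.0, 50000.0], -30000.0, 33000.0)
```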
To work with images from the database when GPU memory is limited, it is necessary to subdivide them into small tiles, a process widely used in [20] and [11], because feeding large images to a neural network makes the model very expensive and computationally complex. It was therefore chosen to work with tiles of 128 × 128 taken from the enlarged images of 1024 × 512. To balance the training database, the number of samples per class was balanced, which makes metrics such as Precision or mIoU easier to interpret when measuring network performance. The mIoU is the mean of the ratio between the area of the intersection and the area of the union of the classes, computed on another 100 images that were not previously used for training or validation. Within the image segmentation context, the IoU is calculated between the convolutional network output and the original mask. Once the entire predicted mask is formed, we calculate the IoU for each class in that image and average the results to get the mIoU for the image. The mIoU Max and mIoU Min are the maximum and minimum mIoU in the validation images, respectively. To follow the same standard of test images as in [12], 100 seismic images were used, which generated 3200 tiles, of which 70% were used for network training and 30% for validation.
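The tile count and the per-image mIoU described above can be checked with a small sketch. Non-overlapping tiles and flattened label lists are assumptions made for simplicity, and the function names are hypothetical.

```python
def tile_origins(height, width, tile=128):
    """Top-left corners of the non-overlapping tiles that fit in the image."""
    return [(row, col)
            for row in range(0, height - tile + 1, tile)
            for col in range(0, width - tile + 1, tile)]


def mean_iou(predicted, target, num_classes):
    """Mean IoU over the classes present in the prediction or the mask."""
    ious = []
    for cls in range(num_classes):
        intersection = sum(1 for p, t in zip(predicted, target) if p == cls and t == cls)
        union = sum(1 for p, t in zip(predicted, target) if p == cls or t == cls)
        if union:
            ious.append(intersection / union)
    return sum(ious) / len(ious) if ious else 0.0


tiles_per_image = len(tile_origins(512, 1024))
total_tiles = 100 * tiles_per_image
score = mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2)
```

An image of 1024 × 512 yields 8 × 4 = 32 tiles of 128 × 128, so 100 images give the 3200 tiles reported in the text.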
5 Experiments
Several approaches use deep learning to classify rock layers, such as [9] and [10]. In both articles, the authors classified each layer using the Netherlands database [1]. There are some approaches to improve the performance of neural networks trained with few images, such as data augmentation. One of these techniques is the use of a sliding window on the image, generating new tiles from the overlap of existing tiles. The use of this technique was investigated; however, in both architectures, it decreased performance. For training, a batch size of 128 and 500 epochs were used. The optimizer was RMSprop with v = 0.9, ϕ = 0.9, ε = 1.0, and dropout ρ = 0.5, chosen for its excellent performance, with cross-entropy as the loss function and modes of temporal weight. These parameters were chosen after several tests, in which they showed the best performance. The same initialization was used for all convolutional kernels, while the biases were initialized with zeros. The models were set with a weight decay coefficient of 5 × 10^−4, μ = 0.997, and lr = 0.01. The Danet-FCN3 architecture is trained with images without pre-processing, and the U-Net architecture uses pre-processing. Figure 7 and Fig. 8 show the training history of the U-Net and Danet-FCN3 networks, respectively. The training of Danet-FCN3 takes longer to converge, requiring more training epochs than U-Net. In addition, the Danet-FCN3 architecture shows strong fluctuations in validation accuracy. Table 2 shows the results of training Danet-FCN3 and U-Net. Using the CLAHE filter to pre-process the images was beneficial for the U-Net network, allowing it to match the performance of Danet-FCN3. There are several ways to measure the performance of a semantic segmentation network, such as Precision, mIoU, mIoU Max, and mIoU Min. Precision is the proportion of correct evaluations among the system responses pointing to a given class. The errors of the Danet-FCN3 architecture were concentrated in the lower left corner of the images. The errors of the U-Net were concentrated in the first layers H0, H1, and H2. Further study is needed to investigate the reason for these errors. The most likely cause is the size of the tiles: as these layers are very thin, the tiles end up containing more than one class, which makes it difficult to train the network.

Fig. 7. History of U-Net Architecture Training Epochs

Fig. 8. History of Danet-FCN3 Architecture Training Epochs

Table 2. Performance Evaluation of the U-Net and Danet-FCN3 Architecture

                        U-Net     Danet-FCN3   U-Net+CLAHE
Images                  100       100          100
Precision               99.10%    98.13%       99.37%
mIoU                    73.28%    76.74%       81.44%
mIoU Max                74.54%    78.64%       82.41%
mIoU Min                64.55%    66.17%       79.25%
Millions of parameters  31.03     39.2         34.05
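The RMSprop settings reported for training can be made concrete with a scalar update step. The article does not define its symbols, so treating the two 0.9 values as the squared-gradient decay and the momentum, and 1.0 as the stabilizing constant, is an assumption; the function name and the sample weight and gradient are hypothetical.

```python
import math


def rmsprop_step(weight, grad, mean_square, velocity,
                 lr=0.01, decay=0.9, momentum=0.9, eps=1.0):
    """One RMSprop update with momentum for a single scalar weight."""
    # Exponential moving average of the squared gradient.
    mean_square = decay * mean_square + (1.0 - decay) * grad * grad
    # Momentum applied to the gradient scaled by its running magnitude.
    velocity = momentum * velocity + lr * grad / math.sqrt(mean_square + eps)
    return weight - velocity, mean_square, velocity


weight, mean_square, velocity = rmsprop_step(1.0, 2.0, 0.0, 0.0)
```

A stabilizing constant as large as 1.0 damps the adaptive scaling considerably, so early updates behave much like plain momentum steps.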
Figure 9 and Fig. 10 show the masks created after training the networks. The performance can be considered good, even though only a few images were used for training. Despite some flaws in the classification of the tiles, it is possible to distinguish all 10 classes of the seismic image.
Fig. 9. Mask Created from the Danet-FCN3 Architecture
Even with its low complexity, the U-Net shows good results using only 100 images, making all the interfaces visible even with 10 classes, which no other article has achieved. With an mIoU of 81.44%, it is possible to see all 10 classes, making U-Net + CLAHE a good option for those who want strong results from a low-complexity architecture for semantic segmentation.
Fig. 10. Mask Created from the U-Net+CLAHE Architecture
6 Conclusions and Future Work
The interpretation of seismic images is an arduous task that requires a high computational cost. However, deep learning and image segmentation techniques have been helping considerably in this area, with results increasingly similar to human perception. Therefore, this work proposed the use of the CLAHE filter and U-Net to perform semantic segmentation of seismic data. The results reached 99.37% accuracy, surpassing Danet-FCN3, which achieved an accuracy of 98.13%. Even with pre-processing, the U-Net architecture has fewer parameters to train and achieves a higher mIoU than Danet-FCN3. For future work, it is desirable to apply the same techniques used with U-Net to new databases to evaluate the architecture's performance, to vary the size of the tiles to find the optimal size for training, and to use the same filtering in other architectures to see if there is a performance gain. Finally, some post-processing could be applied to improve the resulting masks. Acknowledgments. The authors have no conflicts of interest to declare that are relevant to the content of this article. The authors would also like to thank the following Brazilian federal agencies for their support of this work: CEFET-RJ, CAPES, CNPq and FAPERJ.
MA-CC: Cross-Layer Congestion Control via Multi-agent Reinforcement Learning
Jianing Bai¹, Tianhao Zhang¹,²(B), Chen Wang¹, and Guangming Xie¹
¹ Peking University, Beijing 100871, China
tianhao [email protected]
² Tsinghua University, Beijing 100084, China
Abstract. Deep reinforcement learning (DRL) has injected new vitality into congestion control (CC), enabling Internet communication applications to utilize network capacity efficiently. Existing methods employ a single DRL-based agent to perform CC under either the Active Queue Management (AQM) or the Transmission Control Protocol (TCP) scheme. To enable AQM and TCP to learn to work cooperatively, this paper studies CC from the new perspective of multi-agent systems by leveraging multi-agent reinforcement learning (MARL). To this end, we propose a MARL-based congestion control framework, MA-CC, which enables senders and routers to gradually learn cross-layer strategies that dynamically adjust the congestion window and the packet drop rate. We evaluate the proposed scheme in a typical dumbbell-like network model built on the ns-3 simulator. The results show that MA-CC outperforms traditional rule-based and learning-based congestion control algorithms by providing higher throughput while maintaining low transmission latency.
Keywords: Congestion Control · Multi-agent Reinforcement Learning
1 Introduction
Recently, successful applications in automatic driving, video streaming, and online games have required a higher Quality of Service (QoS) from the data transmission environment, which poses new challenges for the design of network protocols that perform congestion control (CC) at different layers. There are two primary schemes devoted to controlling network congestion. One is transmission control protocol (TCP) CC [9], which is deployed at the transport layer to avoid congestion by adjusting the sending rate. The other is active queue management (AQM) [10], which is deployed at the network layer to control the buffer and avoid overflow [13]. The traditional TCP/AQM system is rule-based, with parameters tuned manually to adapt to network communication environments. However, since the network environment is complicated, it is difficult for designers to obtain enough expert knowledge to design one rule-based mechanism for various scenarios. Therefore, an intelligent CC scheme is required to cope with the challenges of the complex network environment.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 K. Arai (Ed.): SAI 2023, LNNS 739, pp. 659–671, 2023. https://doi.org/10.1007/978-3-031-37963-5_45
Fortunately, reinforcement learning (RL) has shown massive potential for real-time decision-making in dynamic environments and has been applied to various complex real-world tasks in recent years [26]. As a result, an increasing number of researchers leverage RL in communications and networking [6], especially in congestion control, to utilize network capacity efficiently [17]. However, existing learning-based CC protocols mostly use a single agent to perform CC under either the TCP or the AQM scheme, adjusting the congestion window (CWND) or the packet drop rate separately [22]. No existing work enables AQM to cooperate efficiently and intelligently with TCP.
In this paper, we consider congestion control a multi-agent decision-making problem and propose a novel framework called MA-CC, which makes AQM and TCP learn to cooperate using multi-agent reinforcement learning (MARL). Specifically, MA-CC consists of two types of agents: TCP agents that dynamically adjust the CWND and AQM agents that control the packet drop rate. Using a typical MARL method, the value decomposition network (VDN) algorithm [25], the two types of agents can work cooperatively to perform cross-layer congestion control. The performance of MA-CC is evaluated in a typical dumbbell-like network model built upon the ns-3 simulator. Compared to existing rule-based and learning-based CC algorithms, MA-CC achieves state-of-the-art performance in terms of throughput and delay. The main contributions of this paper are summarized as follows:
– As far as the authors are aware, this is the first time that a MARL-based approach enabling TCP and AQM to learn to cooperate has been proposed to address the cross-layer congestion control problem. Simulation results show that our proposed MA-CC method outperforms typical rule-based and learning-based congestion control algorithms.
– We design a typical dumbbell-like network scenario in the ns3-gym simulator and model it as a multi-agent decision-making problem, which speeds up the research and development of MARL in the congestion control area.
2 Related Work
In this section, we first introduce some existing traditional rule-based network protocols and then discuss the work that exploits machine learning for networking protocols, primarily focusing on reinforcement learning for congestion control mechanisms.
2.1 Rule-Based Protocols
There are various rule-based protocols for solving the congestion control problem, which can be divided into end-to-end CC and network-assisted CC. End-to-end mechanisms, usually applied to TCP CC, rely on implicit signals from the networks, such as delay and the loss of packets. Tahoe [16] and Reno [15] introduce three core phases in CC: slow start, congestion avoidance, and fast recovery.
Based on Reno, NewReno [9] is a classical and default congestion control protocol in use today. BBR [4] is a novel mechanism that performs well in TCP by continuously detecting the maximum link capacity and employing two parameters, namely RTprop and BtlBw, to model the end-to-end network capacity. Besides, Vegas [3], fast active queue management scalable TCP (FAST) [18], TCP LoLa [14], and TIMELY [20] treat increasing RTT as a congestion signal and adjust the CWND to keep the RTT in the desired range. Moreover, Veno [11], TCP-Africa [19], and Google Congestion Control (GCC) [5] combine the loss and delay signals to evaluate congestion. Although the principle of end-to-end CC is simple to realize, it cannot precisely identify the state of the network environment using only those implicit signals. For example, packet loss is not necessarily caused by congestion but may also be caused by physical line failure, equipment failure, virus attack, routing information error, etc.
To address this problem, network-assisted CC mechanisms that manage the queue length at the network layer have been proposed. There is a body of research related to the AQM scheme. The RED algorithm [10] is the default algorithm for router congestion control; it marks data packets arriving at the router and drops them with a certain probability. The controlling queue delay (CoDel) algorithm [21] is an AQM based on packet sojourn time, which tracks the minimum queuing delay experienced by the packets. Based on CoDel, the proportional integral controller enhanced (PIE) algorithm [23] improves robustness by using additional parameters.
2.2 Learning-Based Protocols
The dynamics and complexity of network scenarios have brought significant challenges for CC. Thus, over the past fifteen years, there has been a lot of effort to implement intelligent congestion control solutions that improve the performance of the network system. At the transport layer, Remy [27], based on a customized objective function consisting of throughput and delay, attempts to find a mapping from a precomputed lookup table. Instead of using a hardwired mapping, PCC [7] and PCC Vivace [8] leverage empirically observed performance metrics and online (convex) optimization from machine learning to choose the best sending rate automatically. Moreover, Orca [1] combines the traditional CC algorithm Cubic with RL to compute the CWND. Similarly, RL-TCP [22] uses RL to change the CWND of TCP. Besides the above mechanisms, which rely on the collaboration of senders and receivers, there are also CC mechanisms that work at the network layer. For example, QRED [24] adjusts the maximum dropping probability according to the network situation based on the RED scheme and the Q-learning algorithm. RL-AQM [2] also presents a new AQM algorithm based on RL that manages the network resources to keep the queuing delay and the packet loss rate low in different communication situations. Although these learning-based works use RL to inject vigorous vitality into CC, they all consider the problem from the perspective of a single agent, which cannot make TCP and AQM cooperatively perform cross-layer congestion control.
3 MARL-Based Congestion Control
In this section, we propose MA-CC, a novel MARL-based framework that uses both TCP agents and AQM agents to dynamically and cooperatively perform cross-layer congestion control.
3.1 Overview
Fig. 1. Architecture of the Proposed MA-CC Schemes.
The framework of MA-CC is illustrated in Fig. 1. The key insight behind our design is that network performance can be improved by having the AQM scheme cooperate with TCP congestion control. Therefore, our MA-CC approach integrates a multi-agent reinforcement learning framework with the TCP and AQM designs to perform cross-layer congestion control cooperatively. To this end, we consider cross-layer congestion control as a sequential decision-making problem and formulate it as a decentralized partially observable Markov decision process (Dec-POMDP), defined by a tuple $\langle (\mathcal{N}+\mathcal{M}), \mathcal{S}, \mathcal{A}, \mathcal{P}, \mathcal{O}, \mathcal{Z}, \gamma \rangle$. Here $\mathcal{N}$ is the set of TCP agents with $|\mathcal{N}| = N$, $\mathcal{M}$ is the set of AQM agents with $|\mathcal{M}| = M$, and $\mathcal{S}$ is the set of states. At each time step, each agent $i \in (\mathcal{N}+\mathcal{M})$ chooses an action $a^i$ from its action set $\mathcal{A}^i$, and all agents together take a joint action $\mathbf{a}$. The state $s \in \mathcal{S}$ transitions to the next state $s'$ according to the transition function $\mathcal{P}(s' \mid s, \mathbf{a})$, and all agents receive a shared reward $r(s, \mathbf{a})$. Moreover, each agent obtains only a partial observation $o^i \in \mathcal{O}^i$ according to the observation function $\mathcal{Z}(s, i): \mathcal{S} \times (\mathcal{N}+\mathcal{M}) \to \mathcal{O}^i$ and learns an individual policy $\pi^i(a^i \mid o^i)$. The objective of all agents is to maximize the cumulative return $\mathbb{E}\left[\sum_{t=0}^{\infty} \gamma^t r_t\right]$, where $\gamma \in [0, 1]$ is the discount factor. According to the above modeling, MA-CC consists of the following elements:
– Agents: There are two types of agents, TCP agents that work at the transport layer and AQM agents that work at the network layer.
– State: It consists of the bounded histories of the network statistics and measurements that an agent can obtain from the outside environment.
– Action: Under the related TCP/AQM scheme, the agent chooses an action after perceiving the current state, according to its RL-based policy.
– Reward: It reflects the desirability of the action picked to perform the cross-layer congestion control.
In short, two types of RL-based agents interact with each other and with the network environment. They aim to obtain high rewards and thereby improve network communication performance. They take actions (e.g., varying the enqueue rate and the sending rate) after observing environment states (e.g., transmission delay and throughput). Next, we describe the state, action, and reward of MA-CC in detail.
3.2 Task Description
According to the TCP/AQM scheme, we separately design the state, action, and reward for the TCP agent (labeled as Agent1) and the AQM agent (labeled as Agent2).
STATE: At each time step $t$, the system monitors the state of the network environment and forms a statistical observation for each agent. The observation of Agent1 is $o^1_t = (\mathrm{segmentsAcked}_t, \mathrm{bytesInFlight}_t, \mathrm{RTT}_t)$, and the observation of Agent2 is $o^2_t = (\mathrm{queueLength}_t, \mathrm{dequeueRate}_t, \mathrm{currQueueDelay}_t)$, which are defined as follows:
– segmentsAcked: It is the sum of Segments Acknowledged, indicating the number of segments acknowledged by the receiver in a fixed time, which reflects the available bandwidth and its variation.
– bytesInFlight: It is the sum of Bytes in Flight, indicating the number of bytes that have been sent out but unacknowledged by the receiver, which is an essential indicator of optimal Kleinrock’s point.
– RTT: It is the Round-Trip Time of a packet, indicating the amount of time it takes for a data packet to go from the sender to the receiver and back, which is a key element of the network latency.
– queueLength: It is the queue length of packets, indicating the remaining buffer space.
– dequeueRate: It is the dequeue rate, an important indicator of the packet processing rate and the link’s capacity.
– currQueueDelay: It is the current queuing delay, which, together with the propagation delay, affects the RTT and the response speed of the communication.
ACTION: In our formulation, the TCP agent (Agent1) is deployed at the transport layer, and its actions influence the sending rate. That is, at each time step $t$, Agent1 increases, decreases, or maintains the current congestion window $\mathrm{CWND}_t$ through three discrete actions:
$$\mathrm{CWND}_t = \begin{cases} \mathrm{CWND}_{t-1} + \mathrm{segmentSize}, & a^1_t = 0 \\ \max(\mathrm{CWND}_{t-1} - \mathrm{segmentSize},\, 1), & a^1_t = 1 \\ \mathrm{CWND}_{t-1}, & a^1_t = 2 \end{cases} \tag{1}$$
where segmentSize indicates the maximum amount of TCP data sent in each segment. As for Agent2, under the AQM scheme, it executes an action to adjust the buffer queue length. Specifically, at each time step $t$, Agent2 determines the probability $p_t$ of dropping an incoming packet in each flow through two discrete actions:
$$p_t = \begin{cases} 0, & a^2_t = 0 \\ 1, & a^2_t = 1 \end{cases} \tag{2}$$
that is, the action decides whether incoming packets in each flow are dropped during the time interval.
REWARD: The control objective is to achieve high QoS, i.e., to maximize throughput and stability while minimizing delay and packet loss rate. Therefore, according to the performance metrics of the TCP and AQM schemes, we design the reward functions for Agent1 and Agent2 as follows:
$$\begin{aligned} r^1_t &= a \times \mathrm{segmentsAcked}_t - b \times \mathrm{bytesInFlight}_t - c \times \mathrm{RTT}_t \\ r^2_t &= d \times \mathrm{dequeueRate}_t - e \times \mathrm{currQueueDelay}_t - f \times \mathrm{queueLength}_t - g \times \mathrm{lossRate}_t \end{aligned} \tag{3}$$
where $(a, b, c, d, e, f, g) > 0$ are predetermined constants, $r^1$ consists of three components to reflect loss rate, throughput, and delay, and $r^2$ consists of four elements to reflect throughput, delay, loss rate, and stability.
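The action rules in Eqs. (1) and (2) and the rewards in Eq. (3) can be sketched in a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, the CWND is treated as a plain integer, and the weight values are placeholders, since the text only states that the constants are positive.

```python
def next_cwnd(prev_cwnd: int, action: int, segment_size: int = 1) -> int:
    """Eq. (1): action 0 increases, 1 decreases (floored at 1), 2 holds."""
    if action == 0:
        return prev_cwnd + segment_size
    if action == 1:
        return max(prev_cwnd - segment_size, 1)
    if action == 2:
        return prev_cwnd
    raise ValueError(f"unknown TCP action: {action}")


def drop_probability(action: int) -> float:
    """Eq. (2): action 0 keeps incoming packets, action 1 drops them."""
    if action not in (0, 1):
        raise ValueError(f"unknown AQM action: {action}")
    return float(action)


def tcp_reward(segments_acked: float, bytes_in_flight: float, rtt: float,
               a: float = 1.0, b: float = 1.0, c: float = 1.0) -> float:
    """First line of Eq. (3); a, b, c are placeholder weights."""
    return a * segments_acked - b * bytes_in_flight - c * rtt


def aqm_reward(dequeue_rate: float, curr_queue_delay: float,
               queue_length: float, loss_rate: float,
               d: float = 1.0, e: float = 1.0,
               f: float = 1.0, g: float = 1.0) -> float:
    """Second line of Eq. (3); d, e, f, g are placeholder weights."""
    return (d * dequeue_rate - e * curr_queue_delay
            - f * queue_length - g * loss_rate)
```

In practice the weights would have to be scaled so that terms measured in different units (segments, bytes, seconds, packets) contribute comparably; the unit weights above are not meant as a recommendation.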
3.3 Value Decomposition Network
To demonstrate the capability of the proposed MA-CC, we utilize a typical cooperative MARL method, the value decomposition network (VDN) algorithm [25], to learn policies that choose actions and achieve cross-layer congestion control. Specifically, VDN is a value-based MARL method which assumes that, under the centralized training with decentralized execution (CTDE) paradigm, the total Q-value of the multi-agent system can be decomposed into the sum of the Q-values of the individual agents:
$$Q_{tot}\big((o^1, \dots, o^n), (a^1, \dots, a^n)\big) \approx \sum_{i=1}^{n} Q^i(o^i, a^i), \tag{4}$$
Algorithm 1. The VDN method for MA-CC.
1: Randomly initialize Q network $Q^i_{\theta^i}(o^i, a^i)$ with weights $\theta^i$ for each agent $i \in \{1, 2\}$; initialize target network $\bar{Q}^i$ with weights $\bar{\theta}^i$
2: Initialize the empty replay buffer $D$; initialize the target replace rate $\tau$
3: for $e = 0, \dots, M - 1$ do
4:     Initialize networking simulation environment
5:     Receive initial observation $o^1_0$ and $o^2_0$
6:     for $t = 0, \dots, T - 1$ do
7:         For each agent $i$, sample action $a^i$ from $Q^i$ by the epsilon-greedy policy with the exploration rate $\epsilon$
8:         Take the joint observation $\mathbf{o} = [o^1, o^2]$, joint action $\mathbf{a} = [a^1, a^2]$, joint reward $\mathbf{r} = [r^1, r^2]$, and the joint next observation $\mathbf{o}' = [o'^1, o'^2]$
9:         Store $\langle \mathbf{o}, \mathbf{a}, \mathbf{r}, \mathbf{o}' \rangle$ in replay buffer $D$
10:        Sample a minibatch $B$ from $D$ to update Q networks by minimizing: $\mathbb{E}_B\left[\sum_i Q^i_{\theta^i}(o^i, a^i) - \gamma \sum_i \max_{a'} \bar{Q}^i_{\bar{\theta}^i}(o'^i, a') - r_{tot}\right]^2$
11:        Update the target networks: $\bar{\theta}^i \leftarrow \tau \theta^i + (1 - \tau)\bar{\theta}^i$
12:    end for
13: end for
where the individual Q-value $Q^i$ of each agent $i$ is updated by minimizing the TD error of $Q_{tot}$, i.e.,
$$J_{Q_{tot}} = \mathbb{E}_{(\mathbf{o}_t, \mathbf{a}_t, r_t, \mathbf{o}_{t+1}) \sim D}\left[Q_{tot}(\mathbf{o}_t, \mathbf{a}_t) - \hat{Q}_{tot}(\mathbf{o}_t, \mathbf{a}_t)\right]^2, \tag{5}$$
where $D$ represents the replay buffer, and $\hat{Q}_{tot}$ is the TD target of the team Q-value as follows:
$$\hat{Q}_{tot}(\mathbf{o}_t, \mathbf{a}_t) = r(\mathbf{o}_t, \mathbf{a}_t) + \gamma \max_{\mathbf{a}_{t+1}} Q_{tot}(\mathbf{o}_{t+1}, \mathbf{a}_{t+1}), \tag{6}$$
where $\mathbf{o} \doteq (o^1, \dots, o^n)$ and $\mathbf{a} \doteq (a^1, \dots, a^n)$.
The training loop proceeds as follows. We randomly initialize the Q networks for Agent1 and Agent2 and initialize the empty replay buffer $D$. Then, the training process starts. For every training episode $e$, the networking simulation environment is initialized, and the two types of agents perceive their current observations $\mathbf{o}$. Next, the agents choose actions $\mathbf{a}$ from their Q networks by the epsilon-greedy policy with the exploration rate $\epsilon$, i.e.,
$$a^i = \begin{cases} \arg\max_a Q^i(o^i, a) & \text{if } \mathrm{random}(0, 1) \ge \epsilon, \\ \mathrm{random}(\mathcal{A}^i) & \text{if } \mathrm{random}(0, 1) < \epsilon, \end{cases} \tag{7}$$
where $\epsilon = 0.995^{|D|}$ and $|D|$ represents the amount of data stored in the replay buffer $D$. The rewards $\mathbf{r}$ are obtained by executing the actions, and the simulation environment transfers to the next state with the next observations $\mathbf{o}'$. Then, the transition $\langle \mathbf{o}, \mathbf{a}, \mathbf{r}, \mathbf{o}' \rangle$ is stored in the buffer $D$, from which a minibatch $B$ is sampled to update the Q networks by minimizing $J_{Q_{tot}}$. The target networks are used to make the target values change slowly to improve the stability of learning.
This iteration repeats within each episode until the time step t reaches the maximum value T. The pseudocode of the VDN for MA-CC is shown in Algorithm 1.
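The building blocks of Algorithm 1 can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, Q-values are passed in as plain lists instead of being produced by neural networks, and the team reward is a single scalar. The sketch covers epsilon-greedy selection (Eq. 7), the TD target under the additive decomposition (Eqs. 4 and 6), the TD error inside Eq. (5), and the soft target update (line 11 of Algorithm 1).

```python
import random
from typing import List, Sequence


def epsilon_greedy(q_values: Sequence[float], epsilon: float,
                   rng: random.Random) -> int:
    """Eq. (7): random action with probability epsilon, else greedy."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def vdn_td_target(team_reward: float,
                  next_q_per_agent: Sequence[Sequence[float]],
                  gamma: float) -> float:
    """Eq. (6) with Q_tot decomposed as in Eq. (4).

    Because Q_tot is a sum of per-agent terms, maximizing over the joint
    action equals summing each agent's own maximum.
    """
    return team_reward + gamma * sum(max(q) for q in next_q_per_agent)


def vdn_td_error(q_taken_per_agent: Sequence[float], td_target: float) -> float:
    """Term inside the square of Eq. (5): sum_i Q^i(o^i, a^i) - target."""
    return sum(q_taken_per_agent) - td_target


def soft_update(target: Sequence[float], online: Sequence[float],
                tau: float) -> List[float]:
    """Line 11 of Algorithm 1: theta_bar <- tau*theta + (1 - tau)*theta_bar."""
    return [tau * w + (1.0 - tau) * w_bar for w, w_bar in zip(online, target)]


if __name__ == "__main__":
    # Toy numbers only: 3 CWND actions for the TCP agent, 2 for the AQM agent.
    q_tcp_next, q_aqm_next = [1.0, 3.0, 2.0], [0.5, 0.0]
    target = vdn_td_target(1.0, [q_tcp_next, q_aqm_next], gamma=0.9)
    print(target, vdn_td_error([2.0, 1.0], target))
```

The decentralized execution part is visible in `epsilon_greedy`: each agent only needs its own Q-values to act, while the summed quantities are used during training only.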
Fig. 2. The Dumbbell-Shaped Network Topology with N Senders, N Routers and O Receivers, Implemented in the ns-3 Simulator. The Bottleneck Link has a Bandwidth of 30 Mbps, a Delay of 100 ms, and a Packet Error Rate of 0.03%; the Access Links have a Bandwidth of 100 Mbps and a Delay of 20 ms.
4 Evaluation
In this section, we first give the experimental setup, then present the performance of MA-CC and compare it with other rule-based and learning-based congestion control schemes. Finally, we provide a brief analysis of the simulation results.
4.1 Experimental Setup
Recently, the ns-3 simulator and OpenAI Gym were combined to produce a benchmarking simulation system, named ns3-gym [12], for promoting research at the intersection of reinforcement learning and networking. Since ns3-gym simplifies feeding the RL agent with data from the network system, we build the test scenario and conduct experiments in ns3-gym. Specifically, the test network topology built in ns3-gym is shown in Fig. 2, which represents a natural and complex network environment where multiple flows sent by various devices compete for the bottleneck link’s bandwidth. Furthermore, since we consider congestion control a multi-agent decision-making problem, the TCP and AQM schemes in ns3-gym are modified under MA-CC to connect with MARL.

Table 1. Simulation Parameters.
Symbol    Parameter Name                       Parameter Value
M         Number of TCP agent and sender       3
N         Number of AQM agent and router       2
γ         Discount factor                      0.9
α         Learning rate                        0.01
τ         Target replace rate                  0.05
ε         Initial exploration rate             0.9
t_step    Step duration                        0.2 s
-         Queue size                           385 packets

4.2 Performance Analysis
Using the VDN algorithm, we trained universal neural networks for each agent and evaluated them online. The discount factor γ is set to 0.9, the initial exploration rate ε to 0.9, and the target replace rate τ to 0.05. The simulation parameters used in the implementation are summarized in Table 1. The neural network for this task consists of two fully connected layers with two ReLU activation functions. We train the neural network for ten iterations, with 1000 episodes per iteration. To evaluate the proposed MA-CC, we choose well-known rule-based and learning-based protocols at the two layers for a comprehensive comparison, including NewReno, BBR, RED, CoDel, RL-TCP, and RL-AQM. We consider several critical performance metrics, including throughput, delay (RTT), and the stability of CWND and queue length. Throughput counts the amount of data successfully transmitted in a given unit of time, measured in Mbps. RTT measures packet transmission delay, which is the time it takes for a signal to be sent plus the time it takes for the acknowledgment of that signal to be received. The stability of CWND and queue length is measured by the mean absolute deviation (MAD) of CWND and queue length in a given time interval:
$$d = \frac{\sum_{i=1}^{n} |x_i - \bar{x}|}{n} \tag{8}$$
where $\bar{x}$ is the mean of the $n$ samples $x_i$.
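As a small worked example of Eq. (8), the following sketch computes the MAD of a sampled trace. The function name and the sample values are illustrative assumptions and are not taken from the experiments.

```python
from typing import Sequence


def mean_absolute_deviation(samples: Sequence[float]) -> float:
    """Eq. (8): average absolute distance of the samples from their mean."""
    if not samples:
        raise ValueError("at least one sample is required")
    mean = sum(samples) / len(samples)
    return sum(abs(x - mean) for x in samples) / len(samples)


if __name__ == "__main__":
    # Hypothetical CWND trace: mean is 10, deviations are 0, 2, 2, 0.
    print(mean_absolute_deviation([10, 12, 8, 10]))
    # A perfectly steady trace has zero deviation.
    print(mean_absolute_deviation([5, 5, 5]))
```

A lower value means the CWND or queue length stays closer to its average over the interval, which is how the stability comparison in this section should be read.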
The overall simulation results are shown in Fig. 3(A) and Fig. 3(B), where the timeline plots show how throughput and delay change in the first, fifth, and tenth training iterations, respectively. The rise in throughput and the decline in delay as training progresses suggest that the two types of agents are gradually learning how to improve performance. MA-CC does not yet control congestion effectively in the first training iteration. However, after several training iterations, MA-CC learns how to improve throughput and reduce delay, and its performance is expected to exceed that of the traditional CC algorithms. In addition, one can see that the performance of MA-CC tends to be stable after ten training iterations, which indicates the strong learning capability of the MARL-based CC algorithm in the network communication environment.
Fig. 3. Real-Time (A) Throughput and (B) RTT of MA-CC of Three Iterations.
To further demonstrate the performance of MA-CC, and considering the different trade-offs between latency and throughput of existing schemes, we compare MA-CC with six other AQM and TCP combinations, including RED with BBR, CoDel with BBR, RED with RL-TCP, RL-AQM with NewReno, and RL-AQM with RL-TCP. The comparison results are summarized in Fig. 4, including the average throughput, average RTT, mean absolute deviation of CWND, and mean absolute deviation of queue length.
Fig. 4. Comparison Results: (A) Average Throughput Comparisons, (B) Average RTT Comparisons, (C) Mean Absolute Deviation of CWND Comparisons, and (D) Mean Absolute Deviation of Queue Length Comparisons.
From Fig. 4(A), one can see that the average throughput of MA-CC reaches 28459 Kbps, while that of the combination of RED and RL-TCP reaches 26384 Kbps. The average throughput of the other combinations is lower than 15000 Kbps. From Fig. 4(B), one can see that the average RTT of MA-CC is 287 ms, while that of the combination of RL-AQM and RL-TCP is similarly low at 294 ms. The average RTT of the other combinations is higher than 300 ms. Note that the base RTT with empty queues is 2 × (100 + 2 × 20) = 280 ms. From these results, one can see that MA-CC achieves both higher throughput and lower latency on average than the baselines. Moreover, from Fig. 4(C), one can see that the MAD of CWND of MA-CC is as low as 242, while that of the combination of RL-AQM and NewReno is 607. The MAD of CWND of the other combinations is higher than 17000. From Fig. 4(D), the MAD of queue length of MA-CC is as low as 2, while that of the other combinations is higher than 160. Whereas the other schemes fluctuate too much to remain stable, MA-CC keeps the MAD of CWND and queue length at a relatively low level. This shows that MA-CC learns a better control behavior than the baselines and maintains more stable performance.
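The base RTT quoted above and the headroom left by the reported averages can be checked with a few lines of arithmetic. The sketch below uses the link settings from Fig. 2; the function names are illustrative, and two assumptions are made for the sake of the example: that 30 Mbps equals 30000 Kbps, and that the reported average throughput is the aggregate over the bottleneck link.

```python
def base_rtt_ms(bottleneck_delay_ms: float, access_delay_ms: float) -> float:
    """Round trip over one bottleneck link and two access links, no queuing."""
    one_way = bottleneck_delay_ms + 2 * access_delay_ms
    return 2 * one_way


def excess_delay_ms(measured_rtt_ms: float, base_ms: float) -> float:
    """Delay above the propagation floor (queuing and other overheads)."""
    return measured_rtt_ms - base_ms


def utilization(throughput_kbps: float, capacity_kbps: float) -> float:
    """Fraction of the bottleneck capacity that is used."""
    return throughput_kbps / capacity_kbps


if __name__ == "__main__":
    base = base_rtt_ms(100, 20)
    print(base)                           # propagation floor in ms
    print(excess_delay_ms(287, base))     # MA-CC average RTT
    print(excess_delay_ms(294, base))     # RL-AQM with RL-TCP average RTT
    print(round(utilization(28459, 30_000), 3))
```

Under these assumptions, the MA-CC average RTT sits 7 ms above the 280 ms floor while using roughly 95% of the bottleneck bandwidth.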
5 Conclusion
In this paper, we formulate congestion control as a multi-agent decision-making problem and propose MA-CC, a novel framework based on multi-agent reinforcement learning. MA-CC enables AQM and TCP to learn to cooperate at different network layers to dynamically configure a suitable CWND and packet drop rate. By comparing MA-CC with combinations of well-known schemes, our evaluation in the ns-3 simulator shows that, with limited training, MA-CC utilizes network resources efficiently and achieves more stable network communication with higher throughput and lower latency. In the future, we will implement MA-CC in a real-world communication environment. We also plan to extend this work to other networking scenarios and to explore new multi-agent reinforcement learning methods suitable for networking and communications based on our proposed MA-CC framework.
References
1. Abbasloo, S., Yen, C.Y., Chao, H.J.: Classic meets modern: a pragmatic learning-based congestion control for the internet. In: Proceedings of the Annual Conference of the ACM Special Interest Group on Data Communication on the Applications, Technologies, Architectures, and Protocols for Computer Communication, pp. 632–647 (2020)
2. AlWahab, D.A., Gombos, G., Laki, S.: On a deep Q-Network-based approach for active queue management. In: 2021 Joint European Conference on Networks and Communications & 6G Summit (EuCNC/6G Summit), pp. 371–376. IEEE (2021)
3. Brakmo, L.S., O’Malley, S.W., Peterson, L.L.: TCP Vegas: new techniques for congestion detection and avoidance. In: Proceedings of the Conference on Communications Architectures, Protocols and Applications, pp. 24–35 (1994)
4. Cardwell, N., Cheng, Y., Gunn, C.S., Yeganeh, S.H., Jacobson, V.: BBR: congestion-based congestion control. Commun. ACM 60(2), 58–66 (2017)
5. Carlucci, G., De Cicco, L., Holmer, S., Mascolo, S.: Analysis and design of the google congestion control for web real-time communication (WebRTC). In: Proceedings of the 7th International Conference on Multimedia Systems, MMSys 2016, New York, NY, USA. Association for Computing Machinery (2016)
6. Chen, Y., et al.: Reinforcement learning meets wireless networks: a layering perspective. IEEE Internet Things J. 8(1), 85–111 (2021)
7. Dong, M., Li, Q., Zarchy, D., Godfrey, P.B., Schapira, M.: PCC: Re-architecting congestion control for consistent high performance. In: 12th USENIX Symposium on Networked Systems Design and Implementation (NSDI 15), pp. 395–408 (2015)
8. Dong, M., et al.: PCC vivace: online-learning congestion control. In: 15th USENIX Symposium on Networked Systems Design and Implementation (NSDI 18), pp. 343–356 (2018)
9. Floyd, S., Henderson, T.-T., Gurtov, A.: The NewReno modification to TCP’s fast recovery algorithm. RFC 2582, 05 (1999)
10. Floyd, S., Jacobson, V.: Random early detection gateways for congestion avoidance. IEEE/ACM Trans. Networking 1(4), 397–413 (1993)
11. Fu, C.P., Liew, S.C.: TCP Veno: TCP enhancement for transmission over wireless access networks. IEEE J. Sel. Areas Commun. 21(2), 216–228 (2003)
12. Gawlowicz, P., Zubow, A.: Ns-3 meets OpenAI gym: the playground for machine learning in networking research. In: ACM International Conference on Modeling, Analysis and Simulation of Wireless and Mobile Systems (MSWiM), p. 11 (2019)
13. Gettys, J.: Bufferbloat: dark buffers in the internet. IEEE Internet Comput. 15(3), 96 (2011)
14. Hock, M., Neumeister, F., Zitterbart, M., Bless, R.: TCP LoLa: congestion control for low latencies and high throughput. In: 2017 IEEE 42nd Conference on Local Computer Networks (LCN), pp. 215–218. IEEE (2017)
15. Jacobson, V.: Modified TCP congestion avoidance algorithm. End2end Interest Mailing List (1990)
16. Jacobson, V.L.: Congestion avoidance and control. ACM SIGCOMM Comput. Commun. Rev. (1988)
17. Jiang, H., et al.: When machine learning meets congestion control: a survey and comparison. Comput. Netw. 192, 108033 (2021)
18. Jin, C., et al.: FAST TCP: from theory to experiments. IEEE Netw. 19(1), 4–11 (2005)
19. King, R., Baraniuk, R., Riedi, R.: TCP-Africa: an adaptive and fair rapid increase rule for scalable TCP. In: Proceedings IEEE 24th Annual Joint Conference of the IEEE Computer and Communications Societies, vol. 3, pp. 1838–1848. IEEE (2005)
20. Mittal, R., et al.: TIMELY: RTT-based congestion control for the datacenter. ACM SIGCOMM Comput. Commun. Rev. 45(4), 537–550 (2015)
21. Nichols, K., Jacobson, V.: Controlling queue delay. Commun. ACM 55(7), 42–50 (2012)
22. Nie, X., et al.: Dynamic TCP initial windows and congestion control schemes through reinforcement learning. IEEE J. Sel. Areas Commun. 37(6), 1231–1247 (2019)
23. Pan, R., Natarajan, P., Baker, F., White, G.: Proportional integral controller enhanced (PIE): a lightweight control scheme to address the bufferbloat problem. Technical report (2017)
24. Yuhan, S., Huang, L., Feng, C.: QRED: a q-learning-based active queue management scheme. J. Internet Technol. 19(4), 1169–1178 (2018)
25. Sunehag, P., et al.: Value-decomposition networks for cooperative multi-agent learning based on team reward. In: Proceedings of the 17th International Conference on Autonomous Agents and MultiAgent Systems, pp. 2085–2087 (2018)
26. Sutton, R.S., Barto, A.G.: Reinforcement Learning: An Introduction. MIT Press, Cambridge (2018)
27. Winstein, K., Balakrishnan, H.: TCP ex machina: computer-generated congestion control. ACM SIGCOMM Comput. Commun. Rev. 43(4), 123–134 (2013)
Depthwise Separable Dilated Convolutions for Low-Complexity Acoustic Scene Classification
Chukwuebuka Olisaemeka(B) and Lakshmi Babu Saheer
Anglia Ruskin University, Cambridge CB1 1PT, UK
[email protected], [email protected]
Abstract. Research in low-complexity acoustic scene classification aims to encourage the design of classification systems that target devices with limited memory and computational resources. Researchers in this domain also aim to build systems that can generalize across multiple devices. To achieve these objectives, this paper proposes a model using Depthwise Separable Convolutional layers, which reduce the number of parameters and computations required compared to standard convolutional layers. This research further proposes the use of dilated kernels, which increase the receptive field of the convolutional layers without increasing the number of parameters to be learned. Finally, quantization is applied to reduce the model size. The proposed system achieves an average test accuracy of 39% and a log loss of 1.878 on the TAU Urban Acoustic Scenes 2022 Mobile development dataset, which is part of the DCASE acoustic scene challenge. The proposed system reports an accuracy of 36.4% and a log loss of 2.055 on the evaluation dataset, with a parameter count of 96.473k and 3.284 MMACs. The performance on the evaluation dataset shows that the proposed architecture performs better than the baseline architecture in the classification of the “metro” scene in the DCASE challenge.
Keywords: Acoustic Scene Classification · Dilated Convolutions · Depthwise Separable Convolution · Low-Complexity · DCASE Challenge
1 Introduction
Acoustic scene classification (ASC) is the task of recognizing the environment in which an audio recording was made [1]. The recording is usually assigned to one of a set of predefined classes such as “tram”, “metro” and other acoustic scenes. The ability of machines to recognize the environment in which they are embedded is beneficial in robotic navigation [2], intelligent wearables [3], and context-aware applications [4]. Detection and Classification of Acoustic Scenes and Events (DCASE) [5] is a platform that organizes annual challenges in the field of sound scene and event research, which has contributed to advancements in this field.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023. K. Arai (Ed.): SAI 2023, LNNS 739, pp. 672–682, 2023. https://doi.org/10.1007/978-3-031-37963-5_46
This report documents the submission for the DCASE 2022 Task 1 [1], which focuses on developing a low-complexity acoustic scene classification system aimed at mobile devices, which are characterized by limited computational and memory resources. In the previous year, the DCASE 2021 Task 1A [6] enforced the low-complexity requirement through a constraint on the number of parameters allowed in the model [7,8]. The DCASE 2022 Task 1 includes an additional constraint on the multiply-accumulate operations count. This task also aims at developing models which generalize across different recording devices.
In previous years, state-of-the-art performance in ASC has been achieved using convolutional neural networks [9] along with residual networks [7,10]. In this work, Depthwise Separable Convolutions are proposed for their ability to reduce the number of parameters and computations compared to regular convolutions [11–13]. Depthwise Separable Convolutions consist of a depthwise convolution followed by a pointwise convolution. The proposed model also makes use of dilated convolutions as a way to increase the receptive field of the convolution layers and therefore integrate more information [14,15].
The rest of the report is structured as follows. Section 2 reviews prior work. Section 3 describes the acoustic features, the proposed system, and the compression approach. Section 4 presents the experimental results, and Sect. 5 concludes the work.
2 Prior Work
Research into small and efficient low-complexity models has led to systems such as Xception [11] and MobileNet [16], which use depthwise separable convolution operations. MobileNetV2 [17] extends the MobileNet architecture and includes inverted residuals. Depthwise separable convolutions have also been used in acoustic scene classification, for example in the work of Xu et al. [18], who used a modified version of MobileNet, and in the work of Lee et al. [19]. Hu et al. [20] used residual networks in conjunction with depthwise separable convolutions by creating an ensemble of a modified version of MobileNetV2 and a VGG-like fully connected neural network for the DCASE 2020 Task 1B, in which their system was ranked as the 2nd best performing system. A modified version of the Trident ResNet (which consists of three parallel ResNets [21]) with depthwise separable convolutions in place of standard convolutions achieved a top 10 position in the team ranking of the DCASE 2021 Task 1A [22].
Another approach that has been used to obtain small and efficient networks is compressing the network after training through techniques such as pruning and weight quantization. Kim et al. [7] proposed a broadcasted residual network called BC-ResNet-ASC, along with residual normalization along the frequency dimension and the use of pruning, quantization, and knowledge distillation for model compression. This system won 1st position in the DCASE 2021 Task 1A. Jeong et al. [22] also used weight quantization for model compression. Koutini et al. [10] proposed a frequency-damped variant of CP ResNet that uses grouping and pruning to reduce the model size and complexity while increasing the width (number of channels) of the model, which helped the team attain 3rd position in the team ranking of the DCASE 2021 Task 1A.
To generalize to multiple devices, both seen and unseen, some systems have employed dilated convolutions, also called atrous convolutions. Ren et al. [23] used atrous CNNs with time-frequency attention. The system proposed by Suh et al. [21], in which the Trident ResNet is used with dilated convolutions along the frequency dimension, won 1st place in the DCASE 2020 Task 1A.
3 Proposed Method
3.1 Acoustic Features
The acoustic features used in this research for training the network are log-mel (logarithmic-magnitude Mel-scale filter bank) features, which represent the frequency content of the audio recording as it varies with time [24]. This time-frequency representation is the feature of choice for acoustic scene classification, as seen in the top-performing systems of DCASE 2021 Task 1A [7,8]. To extract these features, a short-time Fourier transform (STFT) is applied to the audio recordings, sampled at 44.1 kHz, over windows of 40 ms with a 50% overlap, after which a 40-band Mel-scale filter bank is applied. Finally, a logarithmic conversion is applied to the Mel-band energies.
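As a rough check of the framing described above, the following sketch computes the window and hop lengths and the resulting number of frames for a 1 s clip, together with the edges of a 40-band Mel filter bank. It is only an illustration: the centred (padded) framing, the HTK-style Mel formula, the 0 Hz to Nyquist frequency range, and all function names are assumptions that are not stated in the paper.

```python
import math

SAMPLE_RATE = 44_100      # Hz, as stated in the text
WINDOW_SECONDS = 0.040    # 40 ms analysis window
OVERLAP = 0.5             # 50% overlap between consecutive windows
N_MELS = 40               # number of Mel bands


def frame_layout(n_samples):
    """Return (window length, hop length, frame count).

    Assumes the signal is padded so that frames are centred on multiples of
    the hop length (an assumption; the paper does not specify the padding).
    """
    win = round(WINDOW_SECONDS * SAMPLE_RATE)
    hop = round(win * (1 - OVERLAP))
    n_frames = 1 + n_samples // hop
    return win, hop, n_frames


def hz_to_mel(freq_hz):
    """HTK-style Mel scale (one common convention, assumed here)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_band_edges(n_mels, f_min, f_max):
    """Edges of n_mels triangular filters, equally spaced on the Mel scale."""
    lo, hi = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (hi - lo) / (n_mels + 1)
    return [mel_to_hz(lo + i * step) for i in range(n_mels + 2)]


win, hop, n_frames = frame_layout(SAMPLE_RATE)   # one second of audio
edges = mel_band_edges(N_MELS, 0.0, SAMPLE_RATE / 2)
print(win, hop, n_frames)   # window length, hop length, frames per 1 s clip
print(len(edges))           # n_mels + 2 band edges
```

Under these assumptions a 1 s clip yields a 40 × 51 feature map, which agrees with the first output shape listed in Table 1.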
3.2 Network Architecture
The two-dimensional audio-spectral images are modelled using a Convolutional Neural Network (CNN) architecture. A CNN is chosen for this task due to the ability of convolutional layers to serve as feature extractors by learning discriminative features through convolutions and non-linear transformations [25] of the audio spectrum information [26]. Several CNN architectures were experimented with, along with empirical hyperparameter tuning, before selecting the model with the best performance, which is presented in this report. The proposed architecture is summarized in Table 1. The log-mel features are fed into two consecutive Depthwise Separable Convolutional (DSC) layers with 16 kernels of 7 × 7 spatial dimensions. The border type used for these convolution layers is “same”, which indicates that the input features are padded with zeros to prevent the output of the DSC layers from having a reduced size [27]. Each DSC layer is followed by a batch normalization layer, which helps alleviate the problem of internal covariate shift and thus enables generalization, while also speeding up training [28]. The rectified linear unit (ReLU) is used as the activation function. The output of this group is down-sampled by a max-pooling layer, only along the frequency dimension. This is followed by a dropout operation with a dropout probability of 0.3, as a regularization procedure to prevent overfitting [29].
The next layer in the architecture is another DSC layer, but with 32 kernels of 7 × 7 spatial dimensions. This layer also preserves the size of its input and again uses ReLU activation followed by batch normalization, frequency pooling, and dropout with a dropout probability of 0.3. The output is then permuted, reshaped, and fed into a set of four consecutive dilated DSC layers. Each layer has 32 kernels with 5 × 5 spatial dimensions and uses size-preserving convolution with the ReLU activation. These layers also make use of batch normalization and dropout with a dropout probability of 0.3. Dilation in convolution defines the number of spaces placed between the values in the kernel window, which are adjacent in regular convolutions. These holes have the advantage of increasing the receptive field of the kernel without adding more parameters [15]. The receptive field is expanded exponentially by stacking layers of dilated convolutions with increasing dilation rates, as shown in Fig. 1 [34], but without causal padding. Accordingly, the four dilated DSC layers use dilation rates of 1, 2, 4, and 8, respectively. Dilated convolutions along the temporal dimension can capture long temporal dependencies. Hence, they are used in the proposed architecture as an alternative to recurrent neural network models.
Table 1. Proposed Architecture for DCASE 2022 Task 1. DSConv2D and DSConv1D Represent Depthwise Separable 2D and 1D Convolution Layers Respectively. All Convolution Layers use “Same” Convolutions with a Stride of 1 × 1.
Layer     Number of units  Kernel size  Activation function  Dropout rate  Output      Dilation rate  Batch Norm
DSConv2D  16               7×7          relu                 -             40, 51, 16  -              Yes
DSConv2D  16               7×7          relu                 0.3           40, 51, 16  -              Yes
Max Pool  -                5×1          -                    -             8, 51, 16   -              -
DSConv2D  32               7×7          relu                 0.3           8, 51, 32   -              Yes
Max Pool  -                2×1          -                    -             4, 51, 32   -              -
Permute   -                -            -                    -             51, 4, 32   -              -
Reshape   -                -            -                    -             51, 128     -              -
DSConv1D  32               5×5          relu                 0.3           51, 32      1              Yes
DSConv1D  32               5×5          relu                 0.3           51, 32      2              Yes
DSConv1D  32               5×5          relu                 0.3           51, 32      4              Yes
DSConv1D  32               5×5          relu                 0.3           51, 32      8              Yes
Flatten   -                -            -                    -             1632        -              -
FC        50               -            relu                 -             50          -              -
FC        10               -            softmax              -             10          -              -
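The saving that depthwise separable convolutions offer over regular convolutions can be illustrated by counting weights for the three 2D layers of Table 1. This is a minimal sketch: the single-channel input and the single bias vector per layer are assumptions, the function names are illustrative, and the totals are not meant to reproduce the reported overall parameter count, which depends on implementation details not given in the paper.

```python
def standard_conv2d_params(k_h, k_w, c_in, c_out, bias=True):
    """Weights of a regular 2D convolution: one k_h x k_w x c_in kernel per output channel."""
    return k_h * k_w * c_in * c_out + (c_out if bias else 0)


def separable_conv2d_params(k_h, k_w, c_in, c_out, bias=True):
    """Depthwise weights (one k_h x k_w filter per input channel) plus pointwise (1 x 1) weights."""
    depthwise = k_h * k_w * c_in
    pointwise = c_in * c_out
    return depthwise + pointwise + (c_out if bias else 0)


# (kernel size, input channels, output channels) of the three 2D layers in
# Table 1, assuming a single-channel log-mel input (an assumption).
layers = [(7, 1, 16), (7, 16, 16), (7, 16, 32)]
for k, c_in, c_out in layers:
    std = standard_conv2d_params(k, k, c_in, c_out)
    sep = separable_conv2d_params(k, k, c_in, c_out)
    print(f"{k}x{k}, {c_in}->{c_out}: standard={std}, separable={sep}")
```

The gap widens as the number of input channels grows, because the regular layer multiplies the kernel area by both channel counts while the separable layer adds the two contributions.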
The dilated layers in the architecture are followed by a fully connected (FC) dense layer with 50 neurons, the ReLU activation function, and a dropout probability of 0.3. Finally, the output of the fully connected layer is fed into the output layer, which is another fully connected layer with 10 neurons, matching the number of classes being predicted. The softmax activation function is used in this output layer to express the degree of confidence in each predicted class [30].
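The output shapes listed in Table 1 can be traced with a few lines of arithmetic. The sketch below follows the (frequency, time, channels) shape through the layers, assuming a 40 × 51 single-channel input and pooling that acts on the frequency axis only; the function and layer labels are illustrative.

```python
def trace_shapes(n_mels=40, n_frames=51):
    """Follow the (frequency, time, channels) shape through the layers of Table 1."""
    shapes = []
    freq, time, ch = n_mels, n_frames, 1

    # "Same" padding with stride 1 keeps the frequency and time sizes unchanged.
    ch = 16
    shapes.append(("DSConv2D x2", (freq, time, ch)))

    freq //= 5                        # max pooling over frequency only (5 x 1)
    shapes.append(("Max Pool 5x1", (freq, time, ch)))

    ch = 32
    shapes.append(("DSConv2D", (freq, time, ch)))

    freq //= 2                        # max pooling over frequency only (2 x 1)
    shapes.append(("Max Pool 2x1", (freq, time, ch)))

    # Permute to (time, frequency, channels), then merge frequency and channels.
    shapes.append(("Permute", (time, freq, ch)))
    features = freq * ch
    shapes.append(("Reshape", (time, features)))

    # The dilated 1D layers keep the time axis and output 32 channels.
    features = 32
    shapes.append(("DSConv1D x4", (time, features)))
    shapes.append(("Flatten", (time * features,)))
    return shapes


for name, shape in trace_shapes():
    print(f"{name:<14}{shape}")
```

The flattened vector of 51 × 32 = 1632 values is what the 50-neuron dense layer receives.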
Fig. 1. Stacked Dilated Convolutions with Increasing Dilation Rates [34].
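The growth of the receptive field with stacked dilations can be sketched as follows. The example assumes stride-1 convolutions with a kernel of length 5 along the time axis, which is one reading of the 1D layers in Table 1; the function name is illustrative.

```python
def receptive_field(kernel_size, dilation_rates):
    """Receptive field (in time steps) of stacked stride-1 dilated convolutions."""
    field = 1
    for rate in dilation_rates:
        # Each layer widens the field by (kernel_size - 1) * dilation steps.
        field += (kernel_size - 1) * rate
    return field


dilated = receptive_field(kernel_size=5, dilation_rates=[1, 2, 4, 8])
undilated = receptive_field(kernel_size=5, dilation_rates=[1, 1, 1, 1])
print(dilated, undilated)
```

With dilation rates of 1, 2, 4, and 8 the stack covers 61 time steps, more than the 51 frames of a clip, whereas four undilated layers with the same number of weights cover only 17.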
3.3 Compression
Post-training quantization maps the continuous (floating-point) values of a trained model to a finite set of discrete values. Full integer quantization converts all floating-point tensors in the model, which include constant tensors such as weights and biases, and variable tensors such as model input, activations, and model outputs. Quantization is performed to reduce the model size and to improve CPU and hardware accelerator latency, at the cost of a slight reduction in model accuracy. A mini-batch of 100 acoustic features was used to perform full integer post-training quantization on the trained model, which converted it from a 32-bit floating-point format to an 8-bit integer format [8].
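The idea behind 8-bit quantization can be illustrated with a small affine (scale and zero-point) mapping. This is a generic sketch of the technique, not the conversion tool used by the authors; the calibration values are made up and stand in for the representative mini-batch mentioned above.

```python
def quantization_params(values, q_min=-128, q_max=127):
    """Derive an affine (scale, zero_point) pair from observed float values."""
    lo = min(min(values), 0.0)   # the representable range must include 0.0
    hi = max(max(values), 0.0)
    scale = (hi - lo) / (q_max - q_min) or 1.0   # avoid a zero scale for constant input
    zero_point = int(round(q_min - lo / scale))
    return scale, zero_point


def quantize(values, scale, zero_point, q_min=-128, q_max=127):
    """Map floats to 8-bit integers, clamping to the representable range."""
    return [max(q_min, min(q_max, int(round(v / scale)) + zero_point)) for v in values]


def dequantize(q_values, scale, zero_point):
    """Map 8-bit integers back to approximate floats."""
    return [(q - zero_point) * scale for q in q_values]


# Hypothetical calibration values standing in for a representative mini-batch.
calibration = [-0.62, -0.10, 0.0, 0.27, 0.85, 1.93]
scale, zero_point = quantization_params(calibration)
q = quantize(calibration, scale, zero_point)
restored = dequantize(q, scale, zero_point)
max_error = max(abs(a - b) for a, b in zip(calibration, restored))
print(q)
print(f"largest round-trip error: {max_error:.4f} (scale {scale:.4f})")
```

Each stored value shrinks from four bytes to one, and the round-trip error stays within one quantization step, which is the source of the small accuracy loss mentioned above.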
4 Experiments
4.1 Experimental Setup
The proposed model is trained and evaluated on the TAU Urban Acoustic Scenes 2022 Mobile, development dataset [31]. The dataset contains recordings from 12 European cities in 10 different acoustic scenes using 4 different devices. Artificial data for 11 simulated mobile devices were created from the original recordings. The audio recordings were made in Amsterdam, Barcelona, Helsinki, Lisbon, London, Lyon, Madrid, Milan, Prague, Paris, Stockholm, and Vienna. The real devices are designated A, B, C, and D. The 11 additional mobile devices S1–S11 are simulated using the audio recorded with device A and modifications such as impulse responses of real devices and dynamic range compression, to simulate realistic recordings. The audio recordings are single-channel, 1 s long, and sampled at 44.1 kHz. The acoustic scenes are “airport”, “shopping mall”, “metro station”, “street pedestrian”, “public square”, “street traffic”, “tram”, “bus”, “metro”, and “park”. The development set contains data from 10 cities out of the 12 and 9 devices out of 11: 3 real devices (A–C) and 6 simulated devices (S1–S6). The dataset contains 230,350 audio segments divided into a training split of 70% and a test split of 30%. The training split does not contain audio recordings from the S4, S5, and S6 devices, in order to test the generalization ability of the proposed models. The evaluation set contains data from all 12 cities and 11 devices. Devices S7–S11 are unseen during development and are only included in the evaluation set.
The proposed model is trained on the development set for 200 epochs with a batch size of 64, using the Adam optimizer with a learning rate of 0.001 while monitoring the categorical cross-entropy loss and categorical accuracy. After the first 50 epochs, the weights of the epoch with the best categorical accuracy are saved; these weights are retrieved at the end of the training process and used for testing.
Table 2. Comparison of Baseline Method with Proposed Method on TAU Urban Acoustic Scenes 2022 Mobile, Development Dataset.
Scene              Baseline log loss  Baseline accuracy  Proposed log loss  Proposed accuracy
airport            1.748              29.1%              1.770              34.5%
bus                1.723              31.6%              2.132              34.4%
metro              1.538              40.0%              1.550              45.6%
metro station      1.724              37.5%              2.028              30.2%
park               1.291              61.4%              1.775              52.7%
public square      2.037              27.1%              2.607              19.6%
shopping mall      1.781              39.8%              1.678              44.5%
street pedestrian  1.656              33.6%              2.141              26.3%
street traffic     1.050              68.5%              1.141              65.8%
tram               1.389              49.7%              1.961              36.3%
Average            1.594              41.8%              1.878              39.0%
4.2 Results
Table 2 shows the result of training and testing the proposed system using the training/test split of the development set. The proposed system reported a log loss of 1.878 and an accuracy of 39%, compared with 1.594 and 41.8% for the baseline system of the DCASE 2022 Task 1. The performance improved for some scenes. In particular, the “shopping mall” scene improves in both log loss and accuracy, and the accuracy improves for the “airport”, “bus”, “metro” and “shopping mall” scenes. The confusion matrix in Fig. 2 shows a high misclassification rate for the “metro station”, “public square” and “street pedestrian” scenes because of the presence of sound events that are prominent in other scenes. The performance could possibly be improved further with more rigorous hyperparameter tuning.
Fig. 2. Confusion Matrix of the Proposed Architecture Results on TAU Urban Acoustic Scenes 2022 Mobile, Development Dataset.
The breakdown of the log-loss results per device is shown in Table 3, which can be directly compared with the baseline model performance in Table 4. Similar to the previous observations, scenes such as “airport”, “metro”, “shopping mall” and “street traffic” perform better with the proposed model. The “airport”, “shopping mall” and “metro” scenes are more consistent across most devices, both real and simulated, whereas others are more device-specific; for example, “street traffic” performs better on real devices than on simulated ones.
The performance of the proposed architecture is reported on the DCASE 2022 Task 1 results page [32]; selected metrics are shown in Tables 5, 6 and 7. Table 6 shows the log loss and accuracy scores on the evaluation dataset. The proposed architecture reported a log loss of 2.055 and an accuracy of 36.4%, and was placed in 39th position out of 48 submitted architectures. It can be observed from Table 5 that the proposed architecture performs better than the baseline in the classification of the “metro” acoustic scene, which could be a result of the proposed architecture being able to classify scenes that have a prominent and recurrent sound event, such as a “moving train” in the metro scene.
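The two metrics reported in these tables can be computed from predicted class probabilities as sketched below. The predictions are made-up toy values, not results from the paper, and the function names are illustrative.

```python
import math


def log_loss(probabilities, labels, eps=1e-15):
    """Mean negative log-probability assigned to the true class."""
    total = 0.0
    for probs, label in zip(probabilities, labels):
        p = min(max(probs[label], eps), 1.0)   # clip to avoid log(0)
        total -= math.log(p)
    return total / len(labels)


def accuracy(probabilities, labels):
    """Fraction of examples whose most probable class is the true class."""
    hits = sum(
        1 for probs, label in zip(probabilities, labels)
        if max(range(len(probs)), key=probs.__getitem__) == label
    )
    return hits / len(labels)


# Toy three-class predictions (made-up numbers, not results from the paper).
predictions = [
    [0.7, 0.2, 0.1],     # confident and correct
    [0.4, 0.5, 0.1],     # wrong, but only mildly so
    [0.1, 0.1, 0.8],     # confident and correct
    [0.9, 0.05, 0.05],   # confident and wrong
]
true_labels = [0, 0, 2, 1]

print(f"log loss = {log_loss(predictions, true_labels):.3f}")
print(f"accuracy = {accuracy(predictions, true_labels):.2f}")

# Reference point: a classifier that assigns 0.1 to each of ten classes.
uniform_ten = log_loss([[0.1] * 10], [3])
print(f"uniform ten-class log loss = {uniform_ten:.3f}")
```

Log loss penalizes confident mistakes heavily, so a system can have a reasonable accuracy and still a high log loss. As a reference point, assigning equal probability to all ten scenes gives a log loss of ln 10 ≈ 2.303.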
Table 3. Proposed Architecture Scene/Device Log Loss Results on TAU Urban Acoustic Scenes 2022 Mobile, Development Dataset.
Scene              A     B     C     S1    S2    S3    S4    S5    S6
airport            1.25  1.73  1.34  1.92  2.02  1.45  2.01  1.96  2.25
bus                1.47  1.74  1.51  2.20  2.08  2.39  2.86  2.49  2.45
metro              0.96  1.59  1.40  2.08  1.82  1.48  1.43  1.69  1.51
metro station      2.08  2.13  2.32  2.03  2.36  1.93  2.00  1.96  1.45
park               0.72  0.45  0.87  1.68  1.64  1.95  3.23  2.01  3.41
public square      1.71  1.94  1.94  2.53  2.62  2.22  3.36  3.57  3.56
shopping mall      1.77  1.39  1.59  1.60  1.72  1.80  1.84  1.39  2.01
street pedestrian  1.36  1.89  1.05  2.03  1.91  2.40  2.57  2.62  2.45
street traffic     0.66  1.07  1.15  0.90  1.53  1.21  1.15  1.02  1.59
tram               1.32  1.98  1.57  1.31  1.87  1.34  2.17  3.43  2.64
Average            1.33  1.59  1.57  1.83  1.96  1.82  2.26  2.22  2.33
Table 4. Baseline Architecture Scene/Device Log Loss Results on TAU Urban Acoustic Scenes 2022 Mobile, Development Dataset.
Scene              A      B      C      S1     S2     S3     S4     S5     S6
airport            1.197  1.506  1.543  1.993  1.651  1.345  2.140  2.053  2.294
bus                0.905  1.694  1.159  1.766  1.525  1.774  2.251  2.133  2.296
metro              1.073  1.392  1.489  2.239  1.620  1.399  1.620  1.749  1.264
metro station      1.501  1.764  1.720  2.057  1.970  1.619  1.938  1.455  1.492
park               0.390  0.363  0.602  1.261  0.985  1.390  2.213  1.981  2.434
public square      1.429  1.504  1.848  2.004  1.891  1.723  1.910  2.807  3.215
shopping mall      1.765  1.536  1.850  1.798  1.580  2.172  1.777  1.827  1.724
street pedestrian  1.200  1.680  1.628  1.493  1.625  1.702  1.969  1.719  1.889
street traffic     0.764  1.226  1.062  0.803  1.139  1.156  1.083  0.732  1.482
tram               1.032  1.406  1.167  1.174  1.433  1.009  1.557  2.127  1.592
Average            1.126  1.407  1.408  1.659  1.542  1.529  1.846  1.858  1.968
The parameter count of the proposed system is 96,473, which meets the requirement of the task (128,000 parameters or fewer). The multiply-accumulate count (MAC) of the proposed system is 3,283,692, which also meets the requirement of the task (no more than 30,000,000 MACs) and is much lower than the baseline MAC of 29,238,120. As seen in Table 7, the proposed architecture has a smaller parameter count than the best performing system [33] and requires close to 90% fewer multiply-accumulate operations. Overall, the proposed model shows potential as a small-footprint ASC model that can be explored for further performance improvements in the future.
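The complexity figures quoted above can be checked against the task limits with simple arithmetic. The numbers are taken from the text; the helper names are illustrative.

```python
PARAM_LIMIT = 128_000        # task limit on the number of parameters
MAC_LIMIT = 30_000_000       # task limit on multiply-accumulate operations

proposed_params = 96_473
proposed_macs = 3_283_692
baseline_macs = 29_238_120


def within_budget(params, macs):
    """True when a model satisfies both complexity constraints."""
    return params <= PARAM_LIMIT and macs <= MAC_LIMIT


def relative_reduction(value, reference):
    """Fractional saving of `value` with respect to `reference`."""
    return 1.0 - value / reference


print(within_budget(proposed_params, proposed_macs))
print(f"parameter budget used: {proposed_params / PARAM_LIMIT:.1%}")
print(f"MAC budget used:       {proposed_macs / MAC_LIMIT:.1%}")
print(f"MACs saved vs. baseline: {relative_reduction(proposed_macs, baseline_macs):.1%}")
```

The proposed system uses roughly three quarters of the parameter budget but only about a ninth of the MAC budget.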
Table 5. Log Loss and Accuracy Results for the Different Acoustic Scenes on TAU Urban Acoustic Scenes 2022 Mobile, Evaluation Dataset.
                   Log loss                                    Accuracy
Scene              Best Performing  Baseline  Proposed         Best Performing  Baseline  Proposed
                   System                     Architecture     System                     Architecture
Airport            1.430            1.596     1.970            47.4%            32.2%     31.8%
Bus                0.790            1.368     2.535            75.5%            50.6%     27.1%
Metro              0.932            1.489     1.331            66.1%            37.9%     50.5%
Metro Station      1.521            1.943     2.868            60.4%            39.8%     30.9%
Park               1.120            1.692     2.020            85.4%            52.2%     44.5%
Public Square      0.502            1.635     2.054            43.6%            25.7%     17.2%
Shopping Mall      1.129            1.289     1.795            55.1%            58.2%     46.9%
Street Pedestrian  1.709            1.891     2.608            31.8%            27.9%     20.1%
Street Traffic     0.953            1.219     1.604            66.8%            64.4%     54.0%
Tram               0.822            1.202     1.765            64.4%            53.4%     40.5%
Table 6. Comparison of Best Performing System, Baseline Method and the Proposed Method on TAU Urban Acoustic Scenes 2022 Mobile, Evaluation Dataset.
System                  Team ranking  Log loss  Accuracy
Best Performing System  1             1.091     59.6%
Baseline                12            1.532     44.25%
Proposed Architecture   17            2.055     36.4%
Table 7. System Complexity Comparison.
System                  MACs  Parameters
Best Performing System  28M   121K
Baseline                29M   46K
Proposed Architecture   3M    96K
5 Conclusion
A model based on depthwise separable convolutions is proposed in this report in order to achieve a low-complexity acoustic scene classification model. The use of stacked dilated convolutions enlarges the receptive field, integrating more contextual information without adding parameters. The proposed method achieves a parameter count of 96.473k and 3.284 MMACs, with a log loss of 1.878 and an accuracy of 39% on the development dataset. Even though the overall average scores are below the performance of the baseline model, some scenes were identified as performing better with the proposed system. It is likely that further hyperparameter tuning can help improve the overall performance. Data augmentation techniques could also be implemented in the future to improve the proposed system. This could help to achieve better generalization and lessen the effect of the imbalance in the number of audio recordings from the different device types. The use of residual learning is also suggested as a means to achieve deeper networks that still remain within the confines of the task requirements.
References
1. Martín-Morató, I., et al.: Low-complexity acoustic scene classification in DCASE 2022 challenge (2022). https://doi.org/10.48550/ARXIV.2206.03835
2. Chu, S., Narayanan, S., Kuo, C.C.J., Mataric, M.J.: Where am I? Scene recognition for mobile robots using audio features. In: 2006 IEEE International Conference on Multimedia and Expo, pp. 885–888. IEEE (2006)
3. Xu, Y., Li, W.J., Lee, K.K.: Intelligent Wearable Interfaces. John Wiley & Sons, New York (2008)
4. Malkin, R.G., Waibel, A.: Classifying user environment for mobile applications using linear autoencoding of ambient audio. In: Proceedings (ICASSP '05), IEEE International Conference on Acoustics, Speech, and Signal Processing, 2005, vol. 5, pp. v-509. IEEE (2005)
5. http://dcase.community/workshop2022/
6. http://dcase.community/challenge2021/
7. Kim, B., Yang, S., Kim, J., Chang, S.: Domain generalization on efficient acoustic scene classification using residual normalization (2021). https://doi.org/10.48550/arXiv.2111.06531
8. Yang, C.H.H., et al.: A lottery ticket hypothesis framework for low-complexity device-robust neural acoustic scene classification (2021). https://doi.org/10.48550/arXiv.2107.01461
9. Abeßer, J.: A review of deep learning based methods for acoustic scene classification. Appl. Sci. 10(6), 2020 (2020)
10. Koutini, K., Jan, S., Widmer, G.: Cpjku submission to dcase21: Cross-device audio scene classification with wide sparse frequency-damped CNNs. DCASE2021 Challenge, Tech. Rep. (2021)
11. Chollet, F.: Xception: Deep learning with depthwise separable convolutions. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1251–1258 (2017)
12. Kaiser, L., Gomez, A.N., Chollet, F.: Depthwise separable convolutions for neural machine translation (2017). https://doi.org/10.48550/arXiv.1706.03059
13. Santos, A.G., de Souza, C.O., Zanchettin, C., Macedo, D., Oliveira, A.L., Ludermir, T.: Reducing squeezenet storage size with depthwise separable convolutions. In: 2018 International Joint Conference on Neural Networks (IJCNN), pp. 1–6. IEEE (2018). https://doi.org/10.1109/IJCNN.2018.8489442
14. Rekabdar, B., Mousas, C.: Dilated convolutional neural network for predicting driver’s activity. In: 2018 21st International Conference on Intelligent Transportation Systems (ITSC), pp. 3245–3250. IEEE (2018). https://doi.org/10.1109/ITSC.2018.8569818
15. Zhang, X., Zou, Y., Shi, W.: Dilated convolution neural network with LeakyReLU for environmental sound classification. In: 2017 22nd International Conference on Digital Signal Processing (DSP), pp. 1–5. IEEE (2017). https://doi.org/10.1109/ICDSP.2017.8096153
16. Howard, A.G., et al.: MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications. arXiv (2017). https://doi.org/10.48550/arXiv.1704.04861
17. Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4510–4520. IEEE (2018)
18. Xu, J.X., Lin, T.C., Yu, T.C., Tai, T.C., Chang, P.C.: Acoustic scene classification using reduced MobileNet architecture. In: 2018 IEEE International Symposium on Multimedia (ISM), pp. 267–270. IEEE (2018). https://doi.org/10.1109/ISM.2018.00038
19. Lee, Y., Lim, S., Kwak, I.Y.: CNN-based acoustic scene classification system. Electronics 10(4), 371 (2021). https://doi.org/10.3390/electronics10040371
20. Hu, H., et al.: Device-robust acoustic scene classification based on two-stage categorization and data augmentation. arXiv (2020). https://doi.org/10.48550/arXiv.2007.08389
21. Suh, S., Park, S., Jeong, Y., Lee, T.: Designing acoustic scene classification models with CNN variants. Tech. Rep., DCASE2020 Challenge (2020)
22. Jeong, Y., Park, S., Lee, T.: Trident Resnets with Low Complexity for Acoustic Scene Classification. DCASE2021 Challenge (2021)
23. Ren, Z., Kong, Q., Han, J., Plumbley, M.D., Schuller, B.W.: CAA-Net: conditional atrous CNNs with attention for explainable device-robust acoustic scene classification. IEEE Trans. Multim. 23, 4131–4142. IEEE (2020). https://doi.org/10.1109/TMM.2020.3037534
24. Wu, Y., Lee, T.: Enhancing sound texture in CNN-based acoustic scene classification. In: ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 815–819. IEEE (2019). https://doi.org/10.1109/ICASSP.2019.8683490
25. Sharif Razavian, A., Azizpour, H., Sullivan, J., Carlsson, S.: CNN features off-the-shelf: an astounding baseline for recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, pp. 806–813 (2014)
26. Lu, Z.: Sound event detection and localization based on CNN and LSTM. Detection Classification Acoust. Scenes Events Challenge, Tech. Rep. (2019)
27. Chauhan, R., Ghanshala, K.K., Joshi, R.C.: Convolutional neural network (CNN) for image detection and recognition. In: 2018 First International Conference on Secure Cyber Computing and Communication (ICSCCC), pp. 278–282. IEEE (2018). https://doi.org/10.1109/ICSCCC.2018.8703316
28. Ioffe, S., Szegedy, C.: Batch normalization: Accelerating deep network training by reducing internal covariate shift. In: International Conference On Machine Learning, vol. 37, pp. 448–456. PMLR (2015)
29. Garbin, C., Zhu, X., Marques, O.: Dropout vs. batch normalization: an empirical study of their impact to deep learning. Multim. Tools Appl. 79(19), 12777–12815 (2020). https://doi.org/10.1007/s11042-019-08453-9
30. Kouretas, I., Paliouras, V.: Simplified hardware implementation of the softmax activation function. In: 2019 8th International Conference on Modern Circuits and Systems Technologies (MOCAST), pp. 1–4. IEEE (2019). https://doi.org/10.1109/MOCAST.2019.8741677
31. Heittola, T., Mesaros, A., Virtanen, T.: Acoustic scene classification in DCASE 2020 challenge: generalization across devices and low complexity solutions. arXiv (2020). https://doi.org/10.48550/arXiv.2005.14623
32. https://dcase.community/challenge2022/task-low-complexity-acoustic-scene-classification-results
33. Schmid, F., Masoudian, S., Koutini, K., Widmer, G.: CP-JKU submission to Dcase22: distilling knowledge for low-complexity convolutional neural networks from a Patchout audio transformer. DCASE2022 challenge, Tech. Rep. (2022)
34. Oord, A.V.D., et al.: WaveNet: a generative model for raw audio. arXiv (2016). https://doi.org/10.48550/arXiv.1609.03499
Localizing and Idiomatizing Nonidiomatic Python Code with Deep Learning
Balázs Szalontai(B), Ákos Kukucska, András Vadász, Balázs Pintér, and Tibor Gregorics
Faculty of Informatics, Department of Software Technology and Methodology, Eötvös Loránd University, Pázmány Péter stny. 1/c, Budapest 1117, Hungary
{BUKP00,A0SF3C,W2KI49,PINTER,GT}@inf.elte.hu
Abstract. It is a common mistake to neglect high-level constructs while programming, often made by programmers new to Python. Such bad practice leads to verbose and nonidiomatic code, which generally also runs slower. Our goal is to refactor code containing such low-level snippets while preserving the original behaviour of the program. We present a method to localize and idiomatize nonidiomatic parts in Python source code. Localization is formulated as a sequence tagging task, where each line is either part of a low-level snippet or not. The task is solved by a recurrent neural network. Generating the idiomatic counterparts from nonidiomatic code is tackled by a sequence-to-sequence model with attention. The method was evaluated on a real-world dataset containing programs written by students, and it outperformed previous work with higher precision and F1-score.
Keywords: Python · Programming Idioms · Refactoring · Sequence-to-Sequence · Sequence Tagging
1 Introduction
Many programmers new to Python are prone to implement nonidiomatic solutions in their code. Instead of using higher-level concepts, they program using for or while loops along with if statements, particularly if they come from lower-level programming languages. This can make the code verbose and less clean than its idiomatic counterpart. Furthermore, nonidiomatic code runs slower in most cases than idiomatic code in Python. A method which can find these code snippets and transform them into their idiomatic counterparts could be useful.
There is a wide range of source code transformation methods to improve code which contains certain code smells or error-prone parts [14,19]. Most of the recent methods rely on machine learning techniques (e.g. deep learning) due to their effectiveness in this task [2]. Making source code more idiomatic can be considered a subfield of source code transformation, like fixing bugs [4,9,18]. The main difference is that instead of correcting code with wrong behaviour, our goal is to perform behaviour-preserving code transformations (refactoring). We aim to localize and fix nonidiomatic parts of the code in order to make it cleaner and more readable.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023. K. Arai (Ed.): SAI 2023, LNNS 739, pp. 683–702, 2023. https://doi.org/10.1007/978-3-031-37963-5_47
To our knowledge, the first work to consider idiomatizing Python code was Szalontai et al. [16], who localized and fixed nonidiomatic code snippets embedded in a larger context of source code. The task of localizing and fixing code is divided into three subtasks: (i) the snippets are localized, then for each snippet (ii) the type of the nonidiomatic pattern, and (iii) the key variables are determined, which are used for generating the idiomatic counterpart of the nonidiomatic snippet. While good results were achieved, the approach could be improved upon. One of the most notable drawbacks is that the patterns are hardcoded in the method, which makes it harder to extend to further patterns.
In this paper, we present an improved approach which utilizes a sequence-to-sequence architecture to generate the idiomatic counterpart of the nonidiomatic snippet in one pass. The architecture allows us to solve the process of localizing and fixing nonidiomatic code in just two subtasks: (i) the snippets are localized, then for each snippet (ii) the idiomatic counterpart is generated as a substitute for the nonidiomatic snippet. As the sequence-to-sequence architecture learns how to translate the nonidiomatic snippets into their idiomatic counterparts, no hardcoding is required, and the system is extensible with new coding patterns. This also makes our method language independent, that is, the same method could be applied to another programming language; only a dataset and a tokenizer for the language are required. There is another advantage of dividing the problem into two subtasks.
Thanks to the separated localizer, the idiomatizer component only has to generate the idiomatic snippet, without having to copy the surrounding context (unlike, for example, [8,17,18], where the entire source code serves as the input).

The two subtasks are solved as follows. The task of localizing nonidiomatic snippets is formulated as a sequence tagging task: each line of the source code gets tagged according to its position in the nonidiomatic snippet: START, END, IN or OUT. We solve this tagging task by applying a convolution on the embedded tokens, then using two kinds of pooling layers to unify the tokens of each line. Then we use BiLSTM layers and fully connected layers to obtain both a context-dependent and a context-free representation for each line, which are used to predict the tag of the line. The idiomatic alternative is generated using a learned end-to-end approach: a sequence-to-sequence model with attention mechanism. This approach allows for easy extension of the patterns to be refactored. We made use of this advantage when extending the dataset, which made our method capable of fixing a wider spectrum of code without having to change the underlying architecture.

Our contributions are as follows:
– Compared to previous work, our idiomatizer component uses a sequence-to-sequence architecture with attention mechanism, which makes it an end-to-end learned approach instead of one with hardcoded patterns.
– Using a sequence-to-sequence architecture, the method becomes easily extensible by providing more data while training.
– We extended the dataset with a frequently used version of the maximum pattern, which returns only the index of the maximum element, thus making the method capable of refactoring a wider range of code.
– Our method outperforms previous work in precision and overall F1-score.

The paper is organized as follows. We review related literature in Sect. 2. In Sect. 3, we present the formal approach of the method, discuss the two ways of tokenizing code, and describe the neural network architectures. Section 4 explains the means to obtain the datasets: the generation of nonidiomatic and idiomatic snippets, and inserting the nonidiomatic snippets into real-world projects. Our method is evaluated in Sect. 5, and compared with the most closely related work in Sect. 6. Finally, we conclude in Sect. 7.
2 Related Work
The use of deep learning in source code refactoring is increasing. In this section, we review related literature. First we provide a short summary of the work of Szalontai et al. [16], which is probably the most closely related work to ours. Then we discuss how some of the sequence processing techniques we use were applied in previous work. Finally, we review the progress in detecting and fixing errors in source code via deep learning.

2.1 Previous Approaches of Localizing and Fixing Nonidiomatic Snippets
Szalontai et al. [16] present an approach to find nonidiomatic snippets in Python source code and replace them with cleaner and more performant alternatives. Their algorithm is divided into three subtasks. First, the nonidiomatic snippets are localized. This subtask is formulated as a sequence tagging problem, and solved with a recurrent architecture. In our work, the task of localizing the nonidiomatic snippets is also formulated as a sequence tagging task, but in a different manner: instead of tagging tokens, we perform line-by-line tagging. The approach of tagging lines seems to fit more with the task of localizing. Second, for each snippet found, the type of the nonidiomatic pattern needs to be determined. This classification task is solved with a feedforward neural network. Third, the key variables are determined. This is also formulated as a sequence tagging task and is solved by a recurrent architecture. A predefined pattern is chosen as a frame for the idiomatic code, then the key variables are substituted into the frame. Based on these pieces of information and some further manually extracted features (e.g., the condition), the idiomatic alternative is generated and substituted.
Using such hardcoded patterns is disadvantageous in multiple aspects. For example, it is difficult to extend the range of patterns that can be refactored. In contrast, our method is readily extensible as it learns to generate the idiomatic alternative.

Very recently another approach has been presented by Zhang et al. [20] to a similar problem. Their method is capable of refactoring 9 types of nonidiomatic patterns in Python (these patterns are generally less complex compared to the patterns adopted by Szalontai et al.). In order to refactor Python code, syntactic patterns are defined to detect nonidiomatic snippets, then atomic AST-rewriting operations are used to generate the idiomatic alternative. Our work is more closely related to the work of Szalontai et al. regarding both the goal and the methodology.

2.2 Sequence Tagging
While sequence tagging problems were introduced in Natural Language Processing, there is a growing trend in formulating software engineering problems as sequence tagging tasks. Hellendoorn et al. [10] apply sequence tagging to identify the type of each token in JavaScript. They label tokenized JavaScript source code with the types of the tokens similarly to a natural language Part of Speech Tagging task. Danish et al. [6] use a similar approach when developing a tool for code analysis and verification. They are able to identify and tag units-of-measure in scientific Fortran code. The lenient parser introduced by Ahmed et al. [1] is able to parse and type code fragments, thus allowing the early detection and repair of potential errors. They overcome issues such as fragmentation or corruption by training high-capacity transformer models. The training is based on a dataset consisting of source files taken from GitHub and corrupted by them. The task of distinguishing between identifiers and non-identifier symbols (e.g. keywords, operators, delimiters) in a token sequence is formulated as a sequence tagging problem.

2.3 Sequence-to-Sequence
We use a sequence-to-sequence model for source code transformation. The first sequence-to-sequence models [15] were originally motivated by translating natural language texts (Neural Machine Translation). The approach of formulating source code refactoring as a neural machine translation task has been applied multiple times before, such as [4,8,9,18]. One common approach is to first gather data from git commits (bug fixes), then train a model to perform the refactoring.

2.4 Code Transformation and Refactoring with Deep Learning
Hata et al. [9] and Tufano et al. [18] train sequence-to-sequence models to perform code transformations. Training data is gathered from git commits that involve bug fixing. Chen et al. [4] use similar techniques to provide training data for a sequence-to-sequence network. They use a copy mechanism to overcome the difficulties caused by rare identifiers.

Gupta et al. [8] present a method to fix syntactic errors in C. They use a sequence-to-sequence architecture with GRU encoder and decoder, aided by attention mechanism. The fixes are performed iteratively. The transformations also rely on some heuristics in order to filter out some erroneous modifications. An example of such a heuristic is that the transformations need to preserve identifiers and keywords.

Chakraborty et al. [3] present a two-step encoder-decoder model for code transformation. Source code is parsed as an abstract syntax tree. The first step is to perform structural changes to the code by constructing the abstract syntax tree of the modified code. The second step is to concretize the previously generated code fragment by predicting the tokens conditioned on the abstract syntax tree that was generated in the first step.

Refactoring is a subfield of source code transformation. It is “the process of changing a software system in such a way that it does not alter the external behavior of the code, yet improves its internal structure.” [7]. There are multiple approaches to refactoring, which are summarized by Mens et al. [14]. Although refactoring appears to be less frequently addressed by the deep learning community (compared to other code transformation tasks), our goal is to use deep learning to refactor code so that it becomes more idiomatic. Aniche et al. [2] examine six different machine learning algorithms to predict software refactorings. These are: Logistic Regression, Naive Bayes, Support Vector Machine, Decision Trees, Random Forest, and Neural Network. Although the examined neural network has a feedforward architecture (with three dense layers and dropout), more advanced architectures are also mentioned. These include the sequence-to-sequence architecture, which is used for refactoring in our work.
3 Method
In this section, after explaining the main goal of our method, a formal approach is given and the neural network architectures are shown. The method is capable of locating and correcting six chosen nonidiomatic code patterns. Their names and an example for each along with its idiomatic alternative can be seen in Table 1. The patterns are:
– Count: counts the elements in a list for which a given predicate is satisfied.
– Maximum: finds the maximum value of the elements in a list or its index or both.
– Search: returns an element of a list for which a given predicate is satisfied.
– Sum: sums the values of elements in a list for which a given predicate is satisfied.
– All: decides whether all of the elements in a list satisfy a given predicate.
– Any: decides whether any of the elements in a list satisfy a given predicate.
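Since the refactoring is meant to preserve behaviour, each pattern and its idiomatic alternative should agree on every input. The following is a minimal sketch of such a check for three of the six patterns (count, maximum, any) on random non-empty integer lists. The function names, the predicate `> 0` and the random test harness are illustrative assumptions and are not part of the presented method.

```python
import random


def count_loop(arr):
    # Nonidiomatic "count" pattern: explicit index loop with a counter.
    count = 0
    for loop_ind in range(len(arr)):
        if arr[loop_ind] > 0:
            count += 1
    return count


def count_idiomatic(arr):
    return len([1 for elem in arr if elem > 0])


def max_loop(arr):
    # Nonidiomatic "maximum" pattern: tracks both the value and its index.
    max_ind, max_val = 0, arr[0]
    for loop_ind in range(1, len(arr)):
        if max_val < arr[loop_ind]:
            max_val = arr[loop_ind]
            max_ind = loop_ind
    return max_ind, max_val


def max_idiomatic(arr):
    # max() returns the first maximal item, matching the strict "<" above.
    return max(enumerate(arr), key=lambda x: x[1])


def any_loop(arr):
    # Nonidiomatic "any" pattern: while loop with a Boolean flag.
    found = arr[0] > 0
    loop_ind = 0
    while loop_ind < len(arr) - 1 and not found:
        loop_ind += 1
        found = arr[loop_ind] > 0
    return found


def any_idiomatic(arr):
    return any(a > 0 for a in arr)


def check_equivalence(trials=200, seed=0):
    """Compare each loop with its idiomatic form on random non-empty lists."""
    rng = random.Random(seed)
    for _ in range(trials):
        arr = [rng.randint(-5, 5) for _ in range(rng.randint(1, 8))]
        assert count_loop(arr) == count_idiomatic(arr)
        assert max_loop(arr) == max_idiomatic(arr)
        assert any_loop(arr) == any_idiomatic(arr)
    return True
```

Note that the loop-based maximum and any patterns read `arr[0]` and therefore assume a non-empty list, which is why the sketch only generates lists of length one or more.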
Table 1. The Six Kinds of Nonidiomatic Snippets that we Work with in this Paper. Each is Presented with its Label, an Example, and an Idiomatic Alternative.

pattern: count
nonidiomatic snippet:
    count = 0
    for loop_ind in range(len(l)):
        if arr[loop_ind] > 0:
            count += 1
idiomatic alternative:
    count = len([1 for elem in arr if elem > 0])

pattern: maximum
nonidiomatic snippet:
    max_ind = 0; max = arr[0]
    for loop_ind in range(1, len(l)):
        if max < arr[loop_ind]:
            max = arr[loop_ind]
            max_ind = loop_ind
idiomatic alternative:
    max_ind, max = max(enumerate(arr), key=lambda x: x[1])

pattern: search
nonidiomatic snippet:
    found = arr[0] > 0
    loop_ind = 0
    while loop_ind < len(l)-1 and not found:
        found = arr[loop_ind+1]
        loop_ind = loop_ind + 1
    if found:
        element = arr[loop_ind]
idiomatic alternative:
    for x in arr:
        if x > 0:
            element = x
            found = True
            break

pattern: sum
nonidiomatic snippet:
    sum = 0
    for loop_ind in range(len(l)):
        if arr[loop_ind] > 0:
            sum = sum + arr[loop_ind]
idiomatic alternative:
    sum = sum(elem for elem in arr if elem > 0)

pattern: all
nonidiomatic snippet:
    all = arr[0] > 0
    loop_ind = 0
    while loop_ind < len(l)-1 and all:
        all = arr[loop_ind+1] > 0
        loop_ind = loop_ind + 1
idiomatic alternative:
    all = all(elem > 0 for elem in arr)

pattern: any
nonidiomatic snippet:
    any = arr[0] > 0
    loop_ind = 0
    while loop_ind < len(arr[1:]) and not any:
        loop_ind += 1
        any = arr[loop_ind] > 0
idiomatic alternative:
    any = any(a > 0 for a in arr)

3.1 Formal Approach
In this section, we formalize the process of localizing and idiomatizing nonidiomatic code snippets. The refactoring procedure is represented by the function REFACTOR : Ch∗ → Ch∗, where Ch is the set of characters and Ch∗ is the set of character sequences. This function expects source code to be analyzed and refactored, and outputs its idiomatic version. The definition of REFACTOR is supported by the function SUBSTITUTE : Ch∗ × (N × N)∗ × Ch∗∗ → Ch∗, where N is the set of natural numbers (0, 1, ...). As its arguments it expects the source code to be analyzed, index pairs representing the locations (i.e., start and end) of nonidiomatic snippets, and the idiomatic alternatives of the snippets. Using these arguments, each snippet is replaced in the original code at its given location by its given alternative. The function SUBSTITUTE returns the refactored source code. Using this function, REFACTOR is defined as:
REFACTOR(SC) := SUBSTITUTE(SC, LOCATE(SC), IDIOMATIZE(SNIPPET(SC))).

LOCATE : Ch∗ → (N × N)∗ is the function that returns the locations of the snippets in the full-length source code represented by index pairs, and SNIPPET : Ch∗ → Ch∗∗ returns the list of snippets. LOCATE is used in the substitution process, whereas SNIPPET provides the input for constructing the improved snippets. The function IDIOMATIZE generates the idiomatic snippets: given the nonidiomatic snippets as the input, it generates their idiomatic alternatives. These functions are implemented as recurrent neural networks: LOCATE and SNIPPET are implemented by a network that solves a sequence tagging problem, and IDIOMATIZE is implemented by a sequence-to-sequence model with attention. Further details are presented in Sect. 3.3. Table 2 summarizes the above defined functions and Fig. 1 shows a visual overview of them.

Table 2. A Summary of All the above Defined Functions Including their Types, the Expected Input Value and their Output.
name | type | expects | returns
REFACTOR | Ch∗ → Ch∗ | source code to be refactored | refactored source code
SUBSTITUTE | Ch∗ × (N × N)∗ × Ch∗∗ → Ch∗ | source code, locations of snippets, idiomatic alternatives | source code modified at the given locations using the given alternatives
LOCATE | Ch∗ → (N × N)∗ | source code to be refactored | locations of the snippets to be replaced represented by index pairs
SNIPPET | Ch∗ → Ch∗∗ | source code to be refactored | the found snippets
IDIOMATIZE | Ch∗∗ → Ch∗∗ | localized nonidiomatic snippets | idiomatic snippets

3.2 Tokenizing Source Code
The first step of preprocessing source code so that it can serve as input for the models is tokenizing. As is typical in the literature, we would like to represent the code as a sequence of tokens (represented by indices). A tokenizer can be described as a function Ch∗ → N∗, expecting source code as a sequence of characters and turning it into a sequence of tokens (each represented by an index).
Fig. 1. A Visual Overview of the Refactoring Process. First, the Nonidiomatic Snippet is Localized. Second, an Idiomatic Alternative is Generated. Finally, the Nonidiomatic Snippet is Replaced by the Idiomatic Alternative.
One of our goals was to minimize the possibility of Out of Vocabulary (OOV) tokens occurring. This is achieved by unifying certain tokens by their categories. While many kinds of tokens (identifiers, string literals, most of the numbers, comments) are unified, the following tokens are kept in their original form:
– keywords (e.g. while, for, if, not, in)
– built-in functions (e.g. print, sum, max)
– functions applicable on lists, dictionaries, sets and strings (e.g. sort, keys, isalpha)
– those numbers (usually 0 or 1) that play a main role in describing the structure of a coding pattern (e.g. “... = 0”, “... += 1”, “lambda x: x[1]”)

We incorporate two slightly different tokenizing approaches. We use the function TOKENIZE_CODE for tokenizing an entire source code, and TOKENIZE_SNIPPET for tokenizing a nonidiomatic snippet. The difference between the two approaches is visualized in Fig. 2.
Fig. 2. The Two Approaches to Tokenizing. TOKENIZE_SNIPPET is used to Tokenize Nonidiomatic Snippets, while TOKENIZE_CODE is used for Larger Source Files.
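The category-based unification used for whole source files can be sketched with the standard tokenize module as follows. This is a simplified illustration and not the authors' implementation: the helper names `unify_tokens` and `to_indices`, the choice to keep exactly the numbers 0 and 1, the short list of kept method names, and the single NUM category for all other numbers are assumptions made for the example. Layout tokens (new lines, indentation) are skipped here for brevity.

```python
import builtins
import io
import keyword
import tokenize

BUILTIN_NAMES = set(dir(builtins))         # print, sum, max, len, ...
KEPT_METHODS = {"sort", "keys", "isalpha"}  # assumption: a small sample only
KEPT_NUMBERS = {"0", "1"}                   # assumption: only 0 and 1 are kept


def unify_tokens(source):
    """Return the token strings of `source` with categories unified."""
    result = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.NAME:
            keep = (keyword.iskeyword(tok.string)
                    or tok.string in BUILTIN_NAMES
                    or tok.string in KEPT_METHODS)
            result.append(tok.string if keep else "VAR")
        elif tok.type == tokenize.NUMBER:
            result.append(tok.string if tok.string in KEPT_NUMBERS else "NUM")
        elif tok.type == tokenize.STRING:
            result.append("STR")
        elif tok.type == tokenize.COMMENT:
            result.append("#")
        elif tok.type == tokenize.OP:
            result.append(tok.string)
        # NEWLINE, NL, INDENT, DEDENT and ENDMARKER are skipped in this sketch.
    return result


def to_indices(tokens, vocab):
    """Map token strings to integer indices, growing `vocab` as needed."""
    return [vocab.setdefault(tok, len(vocab)) for tok in tokens]
```

For instance, `unify_tokens("count += 1\n")` yields `["VAR", "+=", "1"]`: the identifier is unified, while the structurally relevant number is kept.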
When tokenizing the complete source code with TOKENIZE_CODE, the identifiers are replaced with VAR, the string literals are replaced with STR, the numbers are replaced with INT, FLOAT, NON_DECIMAL_NUM or NUM, and the comments are replaced with the token #. In case of a multiline string literal, we insert an STR token for each line with new line tokens in between each of them. Multiline comments are handled the same way.

When tokenizing a nonidiomatic code snippet with TOKENIZE_SNIPPET, we need to be able to differentiate between identifiers, numeric values and string literals (otherwise we would lose the ability to construct the idiomatic code based on the nonidiomatic snippet). Using an approach inspired by Chirkova et al. [5], we replace the identifiers with LIST, FUNC or VAR, supplemented with an index. We choose the prefix based on whether the identifier is a list, a function, or a regular variable. The numbering is performed consistently, meaning that all occurrences of a certain identifier will be replaced by the same token. As the result of numbering, the distribution of identifiers persists even after preprocessing. We encode numeric values and string literals with the same approach, using the NUM and STR prefixes.

The inverse of the function TOKENIZE_SNIPPET is also considered. We convert a sequence of tokens to valid source code using the function TOKENS2CODE. It expects the tokenized code and a dictionary that maps the tokens starting with STR, NUM, VAR, LIST, ... to their original form. For splitting source code into a sequence of tokens we use the tokenize package from the Python Standard Library with the modification of splitting up tokens of multiple tabs.

3.3 Neural Architectures
As mentioned before, the two key subtasks of our method are implemented as neural networks: we localize the nonidiomatic snippets with a sequence tagging network (Mloc), and idiomatize these snippets one-by-one using a sequence-to-sequence network with attention mechanism (Midiom). Table 3 summarizes the functions defined in this section.

Table 3. A Summary of the Functions Defined in Sects. 3.2 and 3.3 Including their Names, their Expected Input, and their Output.
name | expects | returns
TOKENIZE_CODE | source code | tokenized source code
TOKENIZE_SNIPPET | nonidiomatic snippet | tokenized snippet
TOKENS2CODE | tokenized code/snippet | source code (snippet)
Mloc | source code tokens | predicted tags for each line (Start, In, End, Out)
Midiom | nonidiomatic snippet | idiomatic snippet
TAGS2LOC | tags of source code lines (Start, In, End, Out) | location(s) of nonidiomatic snippet(s)
TAGS2SNP | source code and tags of lines (Start, In, End, Out) | nonidiomatic snippet(s)

3.3.1 Snippet Location (Mloc)

The first subtask is solved by functions LOCATE and SNIPPET. As described above, LOCATE returns the locations of snippets as index pairs and SNIPPET returns a sequence of the snippets to be substituted.
This task is formulated as a sequence tagging problem and is solved by a recurrent neural network (Mloc ). It tags each line of a tokenized source code with one of four tags: START, IN, END, or OUT. The model architecture can be separated into two subcomponents: first we compute intermediate representations for each line of the program separately, then we process the list of intermediate representations further to obtain the final line representations, and use these to determine the tags of the lines. The model expects input programs of the same size both in terms of the length and the number of lines. This is achieved by padding the training examples with the special PAD token in the following way. First, we pad each line to have exactly 30 tokens. Then we extend the program with padding lines (a padding line has 30 PAD tokens) until we reach 300 lines. The longer lines/programs are truncated. The number 30 turned out to be an adequate choice for the line length limit: very few lines need to be truncated, while no memory issues arose while training. The number 300 as the maximum number of lines is rather arbitrary and has no practical significance, since the input program can be split, then the predictions can be concatenated. The task of the model’s first subcomponent is to produce an intermediate representation for a single line of the program. In order to do that, we first embed the tokens of the line into a 32-dimensional vector space. Then we apply convolution to the embedded vectors with 64 filters and a window size of 4. We perform two types of pooling on the output of the convolution. One of them is a typical average pooling and the other one is a custom pooling that we call min-max pooling. The latter is similar to max pooling and works in the following way. We first calculate the average, minimum and maximum value at each position. Then we choose either the minimum or maximum value: the one that is farther away from the average. 
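The two pooling operations can be illustrated in plain Python on the convolution output of one line, given as a list of positions with one value per filter. This is a sketch under two assumptions that the text does not state explicitly: pooling is taken across the positions of the line separately for every filter, and ties between the minimum and the maximum are resolved in favour of the maximum. The function names are illustrative.

```python
def average_pooling(features):
    """features: list of positions, each a list with one value per filter."""
    return [sum(column) / len(column) for column in zip(*features)]


def min_max_pooling(features):
    """Per filter, keep the extreme value that lies farther from the average."""
    pooled = []
    for column in zip(*features):
        avg = sum(column) / len(column)
        lo, hi = min(column), max(column)
        # Assumption: ties are resolved in favour of the maximum.
        pooled.append(hi if hi - avg >= avg - lo else lo)
    return pooled
```

For a line with three positions and two filters, `min_max_pooling([[1.0, -4.0], [2.0, 0.0], [9.0, -1.0]])` returns `[9.0, -4.0]`: the first filter is dominated by a large positive activation, the second by a large negative one, which plain max pooling would discard.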
As the result of pooling, we get two 64-dimensional representations for each line of the source code. The process is visualized in Fig. 4. The component is applied to each line separately to obtain the intermediate representations.

The task of the second subcomponent is to compute the final line representations and process them to tag each line of the code. We feed both of the intermediate representations (obtained by avg and min-max pooling) to two layers: a bidirectional LSTM [11] with 32 units that returns the whole sequence of outputs, and a fully connected layer with 64 units (which is applied to each line). The BiLSTM provides a hidden representation of each line considering their context, whereas the fully connected layer provides one that ignores the context. Both approaches have their own benefits, thus we decided to combine the two. At this point, we have four representations per line from the two poolings: the two representations produced by the BiLSTMs and the two produced by the fully connected layers. We obtain a single representation by concatenating the outputs of the BiLSTMs together, applying a dropout of 20%, concatenating the outputs of the fully connected layers together, also applying a dropout of 20%, then finally concatenating the resulting two representations together to attain a single final representation of the line. The tag of the line is determined by applying a fully connected layer with 4 units (the number of possible tags) to this line representation. We use softmax as the output activation function, categorical cross-entropy as the loss function and Adam [12] (with 0.005 as the learning rate) as the optimizer. The visual representation of the model’s second subcomponent can be seen in Fig. 5.

Fig. 3. An Example of how the Localizer Model Tags each Line of a Source Code in Order to Determine the Location of a Nonidiomatic Snippet. The Model is Denoted by Mloc.

With the obtained tagging of the code lines, it is easy to extract nonidiomatic code snippets and their locations: these are the parts of the code where the lines got tagged with START, IN, ..., IN, END. The locations are extracted by the TAGS2LOC function: it takes line tags (the result of Mloc), and returns the sequence of beginning and ending indices. The snippets themselves are collected by the TAGS2SNP function: it also takes the result of Mloc as its argument, but it also needs the source code to extract the snippets. Based on these, the definitions of LOCATE and SNIPPET can both be provided:
– LOCATE(SC) := TAGS2LOC(Mloc(TOKENIZE_CODE(SC)))
– SNIPPET(SC) := TAGS2SNP(SC, Mloc(TOKENIZE_CODE(SC)))

3.3.2 Generating the Idiomatic Alternative (Midiom)

Generating the idiomatic alternative (the second subtask) is performed by the IDIOMATIZE function using a sequence-to-sequence model with attention mechanism, which is used to “translate” a nonidiomatic code snippet into its idiomatic version. We embed both the nonidiomatic snippet and the currently known part of the idiomatic snippet into a 16-dimensional vector space. We apply a BiLSTM with 32 units to encode the nonidiomatic snippet, and we keep all of the outputs for the attention.
Fig. 4. The First Subcomponent of Mloc . The Figure Shows how the Intermediate Representation is Calculated for a Single Line of the Program. After Embedding the Tokens into a 32-Dimensional Vector Space, we Apply Convolution to the Line. Finally, we use Two Types of Pooling Layers: Average and Min-Max Pooling.
Fig. 5. The Second Subcomponent of Mloc. This Describes how the Lines are Tagged Starting from the Intermediate Representations Returned by the Pooling Layers. We use BiLSTM and Fully Connected Layers to Obtain Both Context-Dependent and Context-Free Representations of the Lines. These are then Concatenated and Turned into Tags.

The idiomatic snippet is constructed by an LSTM with 64 units. Its initial state is the state returned by the encoder BiLSTM. The output of the decoder LSTM is supplemented by a context, computed according to the attention mechanism based on the outputs of the encoder BiLSTM and decoder LSTM. This representation is finally fed to a fully connected layer to determine the next token using the softmax activation function. We use categorical cross-entropy as the loss function and Adam (with 0.001 as the learning rate) as the optimizer.

As a reminder, the function IDIOMATIZE has type Ch∗∗ → Ch∗∗, indicating that it transforms a sequence of nonidiomatic snippets to a sequence of idiomatic snippets. Using the model, we define IDIOMATIZE as the following list comprehension:

IDIOMATIZE(LS) := [ TOKENS2CODE(Midiom(TOKENIZE_SNIPPET(S))) | S ∈ LS ],

where we iterate over the list of snippets LS, and transform each snippet to its idiomatic counterpart (TOKENS2CODE undoes the tokenization).
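With IDIOMATIZE defined, all components of REFACTOR from Sect. 3.1 are available. The composition can be sketched in Python as below. Several details are assumptions made for illustration only: locations are taken to be inclusive, zero-based line index pairs, the replacement inherits the indentation of the first replaced line, and `locate`, `snippet` and `idiomatize` are passed in as plain callables that stand in for the two neural models.

```python
def substitute(source, locations, alternatives):
    """Replace each (start, end) line range with its idiomatic alternative.

    Assumption: indices are 0-based and inclusive, and refer to lines.
    """
    lines = source.split("\n")
    # Work from the last location backwards so earlier indices stay valid.
    for (start, end), alt in sorted(zip(locations, alternatives), reverse=True):
        first = lines[start]
        indent = first[: len(first) - len(first.lstrip())]
        lines[start:end + 1] = [indent + line for line in alt.split("\n")]
    return "\n".join(lines)


def refactor(source, locate, snippet, idiomatize):
    """REFACTOR(SC) := SUBSTITUTE(SC, LOCATE(SC), IDIOMATIZE(SNIPPET(SC)))."""
    return substitute(source, locate(source), idiomatize(snippet(source)))


# Hypothetical stand-ins for the localizer and the idiomatizer.
def demo_locate(source):
    return [(1, 4)]


def demo_snippet(source):
    return ["\n".join(source.split("\n")[1:5])]


def demo_idiomatize(snippets):
    return ["count = len([1 for elem in arr if elem > 0])" for _ in snippets]


DEMO_SOURCE = "\n".join([
    "arr = [1, -2, 3]",
    "count = 0",
    "for i in range(len(arr)):",
    "    if arr[i] > 0:",
    "        count += 1",
    "print(count)",
])
```

Calling `refactor(DEMO_SOURCE, demo_locate, demo_snippet, demo_idiomatize)` replaces the four-line loop with the single idiomatic line and leaves the surrounding lines untouched, which is the behaviour SUBSTITUTE is specified to have.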
4 Dataset Generation
As two neural networks are utilized in the refactoring procedure, a training dataset is required for both. These are:
1. A collection of (nonidiomatic, idiomatic) code pairs. This is used as the training set of Midiom.
2. A collection of Python source files from GitHub projects into which nonidiomatic code snippets were inserted randomly. This is used as the training set of Mloc.

We obtain the first dataset by generating pairs of nonidiomatic and idiomatic snippets via a context-free grammar. The second dataset is created using the nonidiomatic snippets from the first dataset and randomly inserting them into GitHub projects.

4.1 Generating Nonidiomatic and Idiomatic Snippets
In order to create the first dataset, which contains pairs of nonidiomatic and idiomatic snippets, we first create nonidiomatic templates via a context-free grammar for each of the six snippet patterns that we consider in this paper. The generator is written in Python using the Natural Language Toolkit (NLTK) [13]. Next, we make some modifications to these patterns:
– For each snippet containing a condition (e.g. pred(arr[i])), further snippets are generated with concretized Boolean expressions in place of the general conditions. Such a concrete Boolean expression is a comparison between the current element (usually arr[i]) and a random value (integer, real number, string). For string literals, we only consider the == and != operators, whereas for numbers, all comparison operators are considered. We also generate Boolean expressions that combine two simple Boolean expressions with and or or. The resulting Boolean expressions may look like: 5 > arr[i]; arr[i] < −5; 1 == arr[i]; arr[i] == “main . . .
– For each snippet, we generate further snippets in which the list (which we iterate over) is replaced with a fixed row/column of a matrix/tuple. An example for this modification is when the subexpression arr[i] gets replaced with arr[i][j] or arr[1][i] (. . . ), where j and 1 are the indices of the row/column that is being iterated over.
– In nonidiomatic code, the length of the list is sometimes stored in a separate variable. Therefore, for each snippet containing the subexpression len(arr), we generate another one which uses a variable (N) to access the length.

195 136 templates are generated in this step; Fig. 6 shows the distribution of the patterns. Once the nonidiomatic patterns are obtained, we generate the idiomatic alternatives based on the following pieces of information:
– The type of the pattern (appended to each nonidiomatic pattern during generation).
– The names of identifiers (consistent during the entire generation procedure).
– Certain further features of the generated nonidiomatic snippet:
  • the condition (if the snippet contains any),
  • if the snippet iterates over one row/column in a matrix/tuple, we need to know the indices being used,
  • if the type of snippet is max, we need to know whether the minimum or maximum value is being calculated,
  • if the type of snippet is max, we need to know whether the theoretical indexing starts from one instead of zero.

This is the final step of generating the dataset containing pairs of nonidiomatic and idiomatic snippets. We preprocess the snippets in order to train Midiom as follows. After tokenizing the pairs of snippets using TOKENIZE_SNIPPET (described in Sect. 3.2), we mark the beginning of each sequence with a BOS token and the ending of each sequence with an EOS token. In order to achieve consistent snippet length, the PAD special token is used for padding the end of the sequences.
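The preprocessing of a snippet pair for Midiom can be sketched as follows. The sketch is deliberately simplified and its details are assumptions rather than the authors' implementation: every non-kept identifier receives the VAR prefix (the LIST and FUNC prefixes would require deciding what kind of object an identifier denotes), only the numbers 0 and 1 are kept verbatim, layout tokens are ignored, and the inverse mapping joins tokens with single spaces, which is only adequate for single-line snippets. The names `tokenize_snippet`, `add_markers` and `tokens2code` are illustrative.

```python
import builtins
import io
import keyword
import tokenize

KEPT = set(dir(builtins)) | {"0", "1"}  # assumption: builtins and 0/1 are kept
SPECIAL = {"BOS", "EOS", "PAD"}


def tokenize_snippet(snippet):
    """Return (tokens, mapping) with consistently numbered placeholders."""
    mapping = {}   # placeholder -> original text
    assigned = {}  # (prefix, original text) -> placeholder
    tokens = []
    for tok in tokenize.generate_tokens(io.StringIO(snippet).readline):
        if tok.type == tokenize.NAME:
            kept = keyword.iskeyword(tok.string) or tok.string in KEPT
            prefix = None if kept else "VAR"
        elif tok.type == tokenize.NUMBER:
            prefix = None if tok.string in KEPT else "NUM"
        elif tok.type == tokenize.STRING:
            prefix = "STR"
        elif tok.type == tokenize.OP:
            prefix = None
        else:
            continue  # layout tokens and comments are ignored in this sketch
        if prefix is None:
            tokens.append(tok.string)
            continue
        key = (prefix, tok.string)
        if key not in assigned:
            # Same original text always maps to the same placeholder.
            count = sum(1 for p, _ in assigned if p == prefix)
            assigned[key] = f"{prefix}{count + 1}"
            mapping[assigned[key]] = tok.string
        tokens.append(assigned[key])
    return tokens, mapping


def add_markers(tokens, length):
    """Frame a sequence with BOS/EOS and pad it to `length` with PAD."""
    framed = ["BOS"] + tokens + ["EOS"]
    return framed + ["PAD"] * max(0, length - len(framed))


def tokens2code(tokens, mapping):
    """Undo the placeholder substitution (single-line snippets only)."""
    return " ".join(mapping.get(t, t) for t in tokens if t not in SPECIAL)
```

Because the mapping is kept alongside the tokens, a generated sequence such as `VAR1 = sum ( VAR2 )` can be turned back into code that uses the original identifiers of the snippet.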
4.2 Inserting the Nonidiomatic Snippets into Real-World Projects
In order to train Mloc to locate nonidiomatic snippets, we downloaded large amounts of Python programs from GitHub and inserted nonidiomatic snippets into them. The programs are first tokenized using TOKENIZE_CODE (described in Sect. 3.2). We tag each line with one of the possible tags (START, IN, END, OUT) based on where the nonidiomatic snippet got inserted. Figure 3 shows an example of how the tagging works. In order to achieve consistent input size, an upper bound is set for program size (number of lines and number of tokens in a line), then padding is applied to the program source files and their taggings (using the special PAD token). The padding lines (which consist of only PAD tokens) get tagged with OUT.
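The construction of one training example for Mloc, together with the inverse step used at inference time, can be sketched as below. The limits of 300 lines and 30 tokens per line are those given in Sect. 3.3.1. The function names, the assumption that an inserted snippet spans at least two lines, and the way `tags2loc` discards a START that is not closed by an END are illustrative choices that the text does not specify.

```python
MAX_LINES, MAX_TOKENS = 300, 30  # limits given in Sect. 3.3.1
PAD = "PAD"


def insert_and_tag(program_lines, snippet_lines, position):
    """Insert a snippet (two or more lines) and tag every resulting line."""
    lines = program_lines[:position] + snippet_lines + program_lines[position:]
    tags = ["OUT"] * len(lines)
    last = position + len(snippet_lines) - 1
    for index in range(position, last + 1):
        tags[index] = "IN"
    tags[position], tags[last] = "START", "END"
    return lines, tags


def pad_program(token_lines, tags, max_lines=MAX_LINES, max_tokens=MAX_TOKENS):
    """Pad or truncate token lines and tags to a fixed size."""
    padded = [(line + [PAD] * max_tokens)[:max_tokens]
              for line in token_lines[:max_lines]]
    padded_tags = list(tags[:max_lines])
    while len(padded) < max_lines:
        padded.append([PAD] * max_tokens)  # padding lines are tagged OUT
        padded_tags.append("OUT")
    return padded, padded_tags


def tags2loc(tags):
    """Return (start, end) index pairs for every START ... END run."""
    locations, start = [], None
    for index, tag in enumerate(tags):
        if tag == "START":
            start = index
        elif tag == "END" and start is not None:
            locations.append((start, index))
            start = None
        elif tag == "OUT":
            start = None  # assumption: an unclosed START is discarded
    return locations
```

Inserting a three-line snippet at position 1 of a three-line program gives the tags OUT, START, IN, END, OUT, OUT, from which `tags2loc` recovers the location (1, 3).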
5 Evaluation
The method was tested on the same dataset using the same testing approach as Szalontai et al. [16] to facilitate easy comparisons. The testing dataset contains scripts coded mostly by students with no significant Python experience. The programs had been originally submitted as homework assignments. The dataset contains 13373 Python files, some of which include nonidiomatic snippets. These programs generally expect the input from the console and print the output to it as well.

The testing approach is made up of two stages: automated and manual evaluation. Automated testing is done with a testing tool which compares the original and the refactored programs by generating inputs for the programs, running them, then comparing their outputs. Manual testing is applied if no appropriate inputs are found for the original program. These programs are manually tagged to tell whether the fix was successful or not.
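The core of the automated stage, comparing the console output of the original and the refactored program on the same inputs, can be sketched as follows. This is not the testing tool used in the evaluation: input generation is left out (the candidate inputs are simply passed in), and the names `run_program` and `same_behaviour`, the five-second timeout and the decision to compare exit codes as well as standard output are assumptions for the example.

```python
import subprocess
import sys


def run_program(source, stdin_text, timeout=5):
    """Run Python source in a subprocess; return (exit code, stdout)."""
    completed = subprocess.run(
        [sys.executable, "-c", source],
        input=stdin_text,
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return completed.returncode, completed.stdout


def same_behaviour(original, refactored, inputs):
    """True if both programs agree on the exit code and output for all inputs."""
    for stdin_text in inputs:
        if run_program(original, stdin_text) != run_program(refactored, stdin_text):
            return False
    return True


ORIGINAL = """arr = [int(x) for x in input().split()]
count = 0
for i in range(len(arr)):
    if arr[i] > 0:
        count += 1
print(count)
"""

REFACTORED = """arr = [int(x) for x in input().split()]
count = len([1 for elem in arr if elem > 0])
print(count)
"""
```

Here `same_behaviour(ORIGINAL, REFACTORED, ["1 -2 3\n", "-1 -1\n"])` holds, whereas a faulty substitution that changes the printed count would be rejected on the first input.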
Fig. 6. The Distribution of the Number of Different Patterns.
5.1 Precision and Recall
We ran our method on each program in the dataset. Out of the 13373 programs, changes were made to 727. The automated testing approach could be applied in 640 cases, of which 481 were identified as correct. For the rest (87) of the programs, the manual tagging procedure identified 23 correct modifications. In total, this gives 504 correct localizations and substitutions (out of the 727 cases). Thus the precision of our method is 69.33% (504/727).

According to a sample of 300 programs, the estimated ratio of programs that contain one or more snippets to be refactored is 9.33% (28/300). Our method correctly localized and substituted 504 snippets, which is 3.76% of all of the programs (504/13373). This indicates the estimated recall of the whole system: 40.30% (3.76%/9.33%).

5.2 Precision of Subsystems
After the automated evaluation, we went through the programs that were not correctly refactored and tagged them according to whether the nonidiomatic snippets were correctly localized or not. We found that in 35 source files the localization was correct but the substitution was inaccurate. The total number of correct localizations was therefore 539 (504 + 35), so the precision of the localization algorithm is 74.14% (539/727). Having determined this, we also calculated the precision of substitution, that is, the ratio of correct substitutions to correct localizations. We found 504 correctly fixed source files, which means that this precision is 93.50% (504/539).
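The figures in Sects. 5.1 and 5.2 can be re-derived from the reported counts with a few lines of arithmetic. The variable names are ours. Because the sketch works with unrounded ratios, the recall comes out near 40.4% rather than the 40.30% obtained from the rounded percentages; the F1-score is the harmonic mean of the precision and recall of the whole system.

```python
total_programs = 13373
modified = 727                       # programs changed by the method
correct = 481 + 23                   # automated + manual confirmations = 504
localized_only = 35                  # localization right, substitution wrong

precision = correct / modified                        # whole system
snippet_rate = 28 / 300                               # estimated from the sample
recall = (correct / total_programs) / snippet_rate    # estimated recall

correct_localizations = correct + localized_only      # 539
localization_precision = correct_localizations / modified
substitution_precision = correct / correct_localizations

f1 = 2 * precision * recall / (precision + recall)

print(f"precision              {precision:.2%}")
print(f"estimated recall       {recall:.2%}")
print(f"localization precision {localization_precision:.2%}")
print(f"substitution precision {substitution_precision:.2%}")
print(f"F1-score               {f1:.2f}")
```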
6 Discussion
Our modifications yielded improvements compared to the most closely related work [16] in both precision (45.35% → 69.33%) and F1-score (0.46 → 0.51), while the recall slightly decreased (47.27% → 40.30%). In this section we aim to identify how applying a different set of approaches influenced the results. We believe that our method is most different in the approach of generating the idiomatic alternative: an end-to-end approach is used instead of a hardcoded one. This learned approach not only reduces the number of components compared to the previous work, but also allows for easy expandability. We take advantage of easy expandability by extending the dataset with a new version of the maximum pattern. Another significant difference is our approach to localization: by applying a more advanced approach to solve the task of localizing a nonidiomatic snippet, we managed to reduce the ratio of false positive localizations.
B. Szalontai et al.
6.1 Localizing Nonidiomatic Snippets
As our results in Sect. 5.2 show, the precision of the localization component was 74.14%. This indicates that our method outperforms the previous work (58.14%) in this regard. The improvement tells us that we managed to decrease the false positive localizations. This is the result of applying a more advanced approach to localize the nonidiomatic snippet. Localizing the nonidiomatic snippet is formulated in both approaches as a sequence tagging task. The key difference is the approach of tagging line by line instead of token by token. In the previous work, the IN and OUT tags were used to tag the tokens, whereas in this work, we tag line by line with START, IN, END or OUT. The tagging model was also improved. In the previous approach, a fairly simple architecture was used: after embedding each token, a BiLSTM was applied that returned the whole sequence of outputs, then a fully connected layer was applied to each element of the sequence returned by the BiLSTM with softmax as the activation function. In this work, we present a more advanced network architecture for tagging each line of the source code, as detailed in Subsect. 3.3.1.
6.2 Generating the Idiomatic Alternative
By inspecting the results of Sect. 5.2, we can see that the idiomatizer component achieves a very high precision: 93.50%. In contrast, the previous work proposes a fairly complicated, partially handcrafted approach to generate the idiomatic alternative, which achieves a lower precision of 83.71%. The sequence-to-sequence architecture streamlines and outperforms the hardcoded approach. In the previous work, an idiomatized snippet is built by filling the holes in a predefined frame. The frame is selected based on the type of the algorithmic pattern, which is determined from the nonidiomatic snippet by a feedforward network. The identifiers to fill the holes are extracted using a recurrent model that tags the tokens with their type (LIST, TARGET, ...), where the type determines which hole to fill. Some additional features are also required, which are extracted in a handcrafted manner. In this work, the subtask of idiomatizing is formulated similarly to a neural machine translation task, using a sequence-to-sequence model with attention. This learned end-to-end approach is easier to extend, much clearer, and more precise. The set of snippets that can be refactored is also extended compared to the previous work. When analyzing code written by students, we spotted multiple occurrences of a new version of the maximum pattern, which returned only the index of the maximum value. Therefore we now have three versions of the pattern: one which returns the index of the maximum value, one that returns the maximum value only, and one that returns the value and its index. Applying this extension to the dataset did not require further changes to the method itself, which demonstrates how easily the method can be extended. It also makes the method capable of fixing a wider spectrum of programs.
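To make the three versions of the maximum pattern concrete, the sketch below pairs hand-written loops of the kind a beginner might produce with possible idiomatic counterparts. These snippets are illustrative assumptions; they are not taken from the dataset and are not necessarily the exact form the model generates.

```python
def max_value_loop(xs):
    # Nonidiomatic: explicit index loop tracking the largest value.
    best = xs[0]
    for i in range(len(xs)):
        if xs[i] > best:
            best = xs[i]
    return best

def max_index_loop(xs):
    # Nonidiomatic: explicit index loop tracking the position of the maximum.
    best_i = 0
    for i in range(len(xs)):
        if xs[i] > xs[best_i]:
            best_i = i
    return best_i

def max_value(xs):
    # Idiomatic: the built-in max returns the largest value.
    return max(xs)

def max_index(xs):
    # Idiomatic: maximize over the indices, comparing by the indexed value.
    return max(range(len(xs)), key=xs.__getitem__)

def max_value_and_index(xs):
    # Idiomatic: value and index together.
    i = max_index(xs)
    return xs[i], i

data = [4, 9, 2, 9]
print(max_value(data), max_index(data), max_value_and_index(data))
```

Both the loop and the built-in return the first position of the maximum when the value occurs more than once, so the replacement preserves behaviour on ties.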
7 Conclusion
We presented a method for locating and fixing nonidiomatic snippets by substituting them with more Pythonic alternatives. Localizing the nonidiomatic snippets is solved using convolutional, BiLSTM, and fully connected layers. The idiomatic alternative is generated using a sequence-to-sequence model with attention mechanism. Our architecture yields considerable improvements compared to recent results on this task, is simpler, and is extensible with new patterns. An interesting question we are going to investigate in future work is whether the architecture can be applied to another programming language. This can theoretically be done as our method does not rely on language-dependent hardcoded patterns. We have already started the work to confirm this by using the presented method to locate and fix nonidiomatic snippets in Erlang code. The results so far seem promising. Another future direction worth investigating is parsing the input code. Abstract syntax trees might provide a better representation than the sequence of tokens we are currently using, as they do a better job of capturing dependencies between tokens that are further apart from each other. An important concern for us is to make the method suitable for educational purposes. We believe that the much improved precision compared to previous work is an important step in this direction. The method could be used to demonstrate coding solutions using higher-level concepts of the language, and thus draw attention to the importance of high-level thinking, effectiveness, and readability.
Acknowledgments. Supported by the ÚNKP-22-3 New National Excellence Program of the Ministry for Culture and Innovation from the source of the National Research, Development and Innovation Fund.
References
1. Ahmed, T., Devanbu, P., Hellendoorn, V.J.: Learning lenient parsing & typing via indirect supervision. Emp. Softw. Eng. 26(2), 1–31 (2021)
2. Aniche, M., Maziero, E., Durelli, R., Durelli, V.H.S.: The effectiveness of supervised machine learning algorithms in predicting software refactoring. IEEE Trans. Softw. Eng. 48(4), 1432–1450 (2022). https://doi.org/10.1109/TSE.2020.3021736
3. Chakraborty, S., Ding, Y., Allamanis, M., Ray, B.: Code editing with tree-based neural models. IEEE Trans. Softw. Eng. (99), 1–1 (2020)
4. Chen, Z., Kommrusch, S., Tufano, M., Pouchet, L.-N., Poshyvanyk, D., Monperrus, M.: Sequencer: sequence-to-sequence learning for end-to-end program repair. IEEE Trans. Softw. Eng. 47(9), 1943–1959 (2019)
5. Chirkova, N., Troshin, S.: A simple approach for handling out-of-vocabulary identifiers in deep learning for source code. In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 278–288 (2021)
6. Danish, M., Allamanis, M., Brockschmidt, M., Rice, A., Orchard, D.: Learning units-of-measure from scientific code. In: 2019 IEEE/ACM 14th International Workshop on Software Engineering for Science (SE4Science), pp. 43–46. IEEE (2019)
7. Fowler, M.: Refactoring: Improving the Design of Existing Code. Addison-Wesley Professional, Boston (2018)
8. Gupta, R., Pal, S., Kanade, A., Shevade, S.: DeepFix: fixing common C language errors by deep learning. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 31 (2017)
9. Hata, H., Shihab, E., Neubig, G.: Learning to generate corrective patches using neural machine translation. arXiv preprint arXiv:1812.07170 (2018)
10. Hellendoorn, V.J., Bird, C., Barr, E.T., Allamanis, M.: Deep type inference. In: Proceedings of the 2018 26th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, pp. 152–162 (2018)
11. Hochreiter, S., Schmidhuber, J.: Long short-term memory. Neural Comput. 9(8), 1735–1780 (1997)
12. Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. In: ICLR (Poster) (2015)
13. Loper, E., Bird, S.: NLTK: the natural language toolkit. In: Proceedings of the ACL-02 Workshop on Effective Tools and Methodologies for Teaching Natural Language Processing and Computational Linguistics - Volume 1, ETMTNLP 2002, pp. 63–70. Association for Computational Linguistics, USA (2002)
14. Mens, T., Tourwé, T.: A survey of software refactoring. IEEE Trans. Softw. Eng. 30(2), 126–139 (2004)
15. Sutskever, I., Vinyals, O., Le, Q.V.: Sequence to sequence learning with neural networks. In: Advances in Neural Information Processing Systems, vol. 27 (2014)
16. Szalontai, B., Vadász, A., Borsi, Z.R., Várkonyi, T.A., Pintér, B., Gregorics, T.: Detecting and fixing nonidiomatic snippets in Python source code with deep learning. In: Arai, K. (ed.) IntelliSys 2021. LNNS, vol. 294, pp. 129–147. Springer, Cham (2022). https://doi.org/10.1007/978-3-030-82193-7_9
17. Tufano, M., Pantiuchina, J., Watson, C., Bavota, G., Poshyvanyk, D.: On learning meaningful code changes via neural machine translation. In: 2019 IEEE/ACM 41st International Conference on Software Engineering (ICSE), pp. 25–36. IEEE (2019)
18. Tufano, M., Watson, C., Bavota, G., Di Penta, M., White, M., Poshyvanyk, D.: An empirical investigation into learning bug-fixing patches in the wild via neural machine translation. In: Proceedings of the 33rd ACM/IEEE International Conference on Automated Software Engineering, pp. 832–837 (2018)
19. Wong, W.E., Gao, R., Li, Y., Abreu, R., Wotawa, F.: A survey on software fault localization. IEEE Trans. Softw. Eng. 42(8), 707–740 (2016)
20. Zhang, Z., Xing, Z., Xia, X., Xu, X., Zhu, L.: Making Python code idiomatic by automatic refactoring non-idiomatic Python code with pythonic idioms. arXiv preprint arXiv:2207.05613 (2022)
Cascaded 3D Object Segmentation with Volumetric Propagation Network
Yi Wu¹, Xin Wang²(✉), Yi Lu¹, Feng Gao¹, and Youbing Yin²(✉)
¹ Keya Medical, Seattle, WA, USA
² Keya Medical, Shenzhen, Guangdong, China
{xinw,yin}@keyamedna.com
Abstract. Deep convolutional neural networks have been widely adopted for automatic object segmentation of volumetric medical images as they are powerful tools for learning visual representations from images. However, it is still time-consuming to train and test the segmentation networks on the high dimensional volumetric data. In this paper, we design an efficient coarse-to-fine deep learning framework that can not only accelerate the segmentation process but also improve the accuracy. This is achieved by (a) training a segmentation network with downsampled volumetric images, and (b) restoring the fine details of the object via a refinement network. Specifically, we propose a novel deep learning building block, Volumetric Propagation Network (VPN), that can preserve the structure of objects via modeling the global pairwise relations between voxels. The module can be flexibly embedded into any type of convolutional network for volumetric data processing. To illustrate its efficiency, the proposed approach is validated with the challenging task of segmenting the ascending aorta from Computed Tomography Angiography (CTA) images. Our experiments show that the proposed cascaded network architecture outperforms state-of-the-art volumetric segmentation networks while being an order of magnitude faster.
Keywords: Deep Learning · Volumetric Propagation Network · 3D Object Segmentation
1 Introduction
In recent years, we have witnessed great success of using deep convolutional neural networks (CNNs) in computer vision [8] and medical image analysis [2,3,14]. After the seminal work on the fully convolutional network (FCN) for semantic segmentation [11], the performance of image segmentation has been significantly improved [4]. The U-Net architecture [14] extends FCN by using skip connections to combine low-level feature maps with higher-level ones, which achieves great success in 2D biomedical image segmentation. To segment objects in volumetric image data, one way is to integrate results from 2D CNNs on three orthogonal 2D patches to incorporate 3D contextual information [15]. Another type of approach is extending 2D convolution to 3D and directly training the 3D network using volumetric data [1,9]. Although the 3D network can better model the 3D structure information, it consumes more computational resources.
For a large 3D object in a volumetric image, the memory constraint of current GPUs makes it impossible to directly feed the whole object at its original size into the network for training. One feasible approach is cropping one volume including part of the object as the input for the network. We call this network, trained in high resolution, H-Net. H-Net only sees parts of the object, so it cannot encode the whole structure of the object. To encode more structure/context information, one way is to enlarge the cropping size and increase the network depth, but this takes more time for training and testing. Furthermore, during testing, it needs to scan the whole volumetric image volume by volume, which is slow, and the segmentation results near the volume boundary may not be robust. As it cannot take full advantage of the whole structure of the object, it may generate more false alarms, as shown in Fig. 1(a) and Fig. 7. Another possible method is downsampling the volumetric image so that each training sample for the network can include the whole object. We denote this network, trained in low resolution, as L-Net. Although L-Net is faster to train and test, its output is in low resolution. To generate the segmentation result at the original size, the output needs to be upscaled. However, as shown in Fig. 1(b), the boundary of the upsampled result is not accurate.
Fig. 1. (a) Segmentation Result of H-Net; (b) Segmentation Result of L-Net; (c) Segmentation Result of VPN; First Row: Segmentation Results Shown in 3D; Second Row: Segmentation Results Shown in Sagittal Plane. Best Viewed in Color.
In this paper, we propose a novel coarse-to-fine framework that can handle the above-mentioned problems and enjoys the benefits of both H-Net and L-Net. It is more robust to train and test than H-Net and its segmentation accuracy is also higher. With the cascaded segmentation strategy, in practice it can be an order of magnitude faster than H-Net. With the help of the proposed volumetric propagation network (VPN), the boundary of the segmentation is more accurate than the upsampled result of L-Net, as shown in Fig. 1(c).
Fig. 2. Illustration of the Proposed Coarse-to-Fine Segmentation Framework. ↓ and ↑ Denote Downsampling and Upsampling Operations, Respectively. The Proposed VPN Takes the Volumetric Image of Original Size and Upsampled Coarse Segmentation as Input, and Outputs the Refined Segmentation Result.
This work was supported by Shenzhen Science and Technology Program (Grant No. KQTD2016112809330877).
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023. K. Arai (Ed.): SAI 2023, LNNS 739, pp. 703–715, 2023. https://doi.org/10.1007/978-3-031-37963-5_48
2 The Framework Outline
The proposed coarse-to-fine segmentation framework is illustrated in Fig. 2. First, we use the downsampled volumetric images to train a 3D CNN for coarse segmentation. Then, we train VPN with the samples cropped from the original images and the corresponding upsampled coarse segmentation maps.
Coarse Segmentation (L-Net). For coarse segmentation, we train a 3D U-Net [1] with soft Dice loss [12] using downsampled input, as shown in Fig. 3. We use batch normalization (BN) [6] for fast convergence and ReLU [5] for nonlinear activation. The input is downsampled two times through the encoder network and then upsampled two times through the decoder network to recover the input resolution. Each 3D Conv module contains sequential convolution, BN and ReLU operations. A 3D max pooling follows two sequential Conv modules. For fast training and to feed the network with enough samples in a mini-batch (e.g. 8), we crop out the background far away from the object. Then, the cropped volumetric image is downsampled to a quarter of the original size so that the whole object can be directly fed into the network and the whole structure can be learned by the network. The numbers of channels for each layer are 64, 128, 256, 128, 64, respectively.
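For readers unfamiliar with the soft Dice loss, a minimal NumPy sketch is given below. It assumes one common form of the loss, with squared terms in the denominator and a small smoothing constant `eps`; the exact variant and any framework-specific details used for L-Net are not stated in the text, so this should be read as an illustration only.

```python
import numpy as np

def soft_dice_loss(prob, target, eps=1e-6):
    """Soft Dice loss between a predicted probability map and a binary mask.
    Assumed form: 1 - 2*sum(p*g) / (sum(p^2) + sum(g^2))."""
    prob = np.asarray(prob, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    intersection = np.sum(prob * target)
    denominator = np.sum(prob * prob) + np.sum(target * target)
    return 1.0 - (2.0 * intersection + eps) / (denominator + eps)

# Toy 8 x 8 x 8 volume with a cubic object in the middle.
mask = np.zeros((8, 8, 8))
mask[2:6, 2:6, 2:6] = 1.0

perfect = soft_dice_loss(mask, mask)          # prediction equals the mask
disjoint = soft_dice_loss(1.0 - mask, mask)   # prediction misses the object
uncertain = soft_dice_loss(np.full(mask.shape, 0.5), mask)
```

Because the loss is computed from overlap rather than per-voxel error, it remains informative when the object occupies only a small fraction of the volume.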
Fig. 3. The 3D U-Net Architecture for the L-Net Training and Guidance Network Training with an Example Input of Size 64 × 64 × 64. Blue Cubes Represent Feature Maps. The Number of Channels is Shown above each Map. This Network Includes Five Stages and each Stage has Two Convolutions. The Numbers of Channels after Convolution in each Stage are 64, 128, 256, 128, 64, Respectively. The Number of Channels is Doubled after Downsampling and Halved after Upsampling. This Network Architecture can be Represented as 64-128-256-128-64 in Table 1.
Segmentation Refinement (VPN). As shown in Fig. 1(b), the boundary of the upsampled coarse segmentation is not accurate. To get a more accurate segmentation result, we propose a novel volumetric propagation network (VPN) that can refine the coarse segmentation with the guidance from the image of original size. With spatially varying parameters that support the propagation process, this module is equivalent to the standard anisotropic diffusion process [10]. The transformation of maps is controlled by a Laplacian matrix that is constituted by the parameters of the volumetric propagation module. Since the propagation module is differentiable, its parameters can be learned by a typical deep 3D CNN that is connected to this module, through joint training. In the next section, we introduce the proposed VPN in detail.
3 Volumetric Propagation Network
In [10], Liu et al. proposed a spatial propagation network that can learn semantically-aware affinity between pixels of a 2D image. In this work, based on the same linear propagation theory, we propose a volumetric propagation network (VPN) that can learn the affinity between voxels in volumetric data. In VPN, we introduce a nine-way connection where, for each voxel, the neighbors in all different directions can be considered in 3D. We show that the propagation structure formulates a dense affinity matrix so that voxels are related globally, even though only local connections (i.e., the nine-way connection) are established for each voxel.
Fig. 4. The Network Architecture of VPN. Blue Cubes Represent Feature Maps. The Network Requires Two Inputs: The Input Image and the Probability Map of Coarse Segmentation (For Better Visualization, we use the Binary Mask Here.). The Inputs are Cropped from Images of Original Size as Shown in the Red Box. The Guidance Network Uses 3D U-Net (Illustrated in Fig. 3) to get Affinity Matrix for Propagation. The Propagator Propagates the Probability Map According to Eq. (3).
3.1 Linear Volumetric Propagation
We apply a linear transformation by means of the volumetric propagation network, where a volume is scanned slice by slice. Without loss of generality, in the following, we take the left-to-right direction as an example. Denote $X \in \mathbb{R}^{n \times n \times n}$ and $H \in \mathbb{R}^{n \times n \times n}$ as the two 3D maps before and after volumetric propagation. $x_t \in \mathbb{R}^{N \times 1}$ and $h_t \in \mathbb{R}^{N \times 1}$ ($N = n^2$) represent their $t$-th slice (reshaped from a 2D image to a column), respectively. We use a linear transformation matrix $w_t \in \mathbb{R}^{N \times N}$ to linearly propagate information from left to right between adjacent slices. For each slice, $h_t$ is a linear, weighted combination of the previous slice $h_{t-1}$ and the corresponding slice $x_t$ in $X$:

$$h_t = (I - d_t)\, x_t + w_t h_{t-1}, \quad t \in [2, n], \quad h_1 = x_1 \tag{1}$$

where $I \in \mathbb{R}^{N \times N}$ is an identity matrix, and $d_t \in \mathbb{R}^{N \times N}$ is a diagonal matrix with the diagonal element as:

$$d_t(i, i) = \sum_{j=1,\, j \neq i}^{N} w_t(i, j) \tag{2}$$

To propagate across the entire volumetric image, the 3D matrix $H$ is updated slice by slice recursively. Let $H_v \in \mathbb{R}^{M \times 1}$ and $X_v \in \mathbb{R}^{M \times 1}$ ($M = n^3$) be the vectorized versions of $H$ and $X$, respectively. After the recursive scanning, we can get:

$$H_v = G X_v \tag{3}$$

where

$$G = \begin{bmatrix} I & 0 & \cdots & \cdots & 0 \\ w_2 & \lambda_2 & 0 & \cdots & \cdots \\ w_3 w_2 & w_3 \lambda_2 & \lambda_3 & 0 & \cdots \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \vdots & \vdots & \cdots & \cdots & \lambda_n \end{bmatrix}, \quad \lambda_t = I - d_t \in \mathbb{R}^{N \times N}$$
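As a numerical sanity check of Eqs. (1)–(3), the following NumPy sketch runs the slice-by-slice recursion on toy sizes and compares it with the explicit block matrix $G$. The function names are ours, the sizes are far smaller than a real volume, and the random $w_t$ are given a zero diagonal as an assumption so that Eq. (2) accounts for every entry of each row.

```python
import numpy as np

def degree(w_t):
    # Eq. (2): sum of the off-diagonal entries of each row of w_t.
    return w_t.sum(axis=1) - np.diag(w_t)

def propagate(x, w):
    """Eq. (1). x has shape (n, N): n slices of N values; w has shape (n, N, N),
    where w[0] is unused because h_1 = x_1."""
    n, _ = x.shape
    h = np.empty_like(x)
    h[0] = x[0]
    for t in range(1, n):
        h[t] = (1.0 - degree(w[t])) * x[t] + w[t] @ h[t - 1]
    return h

def build_G(w):
    """Assemble the block lower-triangular matrix G of Eq. (3)."""
    n, N, _ = w.shape
    lam = [np.eye(N)] + [np.eye(N) - np.diag(degree(w[t])) for t in range(1, n)]
    G = np.zeros((n * N, n * N))
    for s in range(n):
        block = lam[s]                       # diagonal block (lambda_1 = I)
        G[s * N:(s + 1) * N, s * N:(s + 1) * N] = block
        for t in range(s + 1, n):
            block = w[t] @ block             # w_t ... w_{s+1} lambda_s
            G[t * N:(t + 1) * N, s * N:(s + 1) * N] = block
    return G

rng = np.random.default_rng(0)
n, N = 4, 3
w = rng.uniform(0.0, 0.3, size=(n, N, N))
for t in range(n):
    np.fill_diagonal(w[t], 0.0)              # assumption: zero diagonal
x = rng.normal(size=(n, N))

h = propagate(x, w)
G = build_G(w)
matches = np.allclose(G @ x.reshape(-1), h.reshape(-1))
row_sums = G.sum(axis=1)
```

Under the stated assumption, every entry of `row_sums` is one, so each output value is an affine combination of the input values.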
The summation of elements in each row of $G$ equals one. The sequence generated by Eq. (3), $\{U_t \mid U_t = G U_{t-1}\}_{t=2}^{T}$, is a diffusion process which can be expressed with a partial differential equation (PDE), $\partial_T U = -LU$, where $L = D - A$ is the Laplacian matrix, $D$ is the degree matrix with the elements composed by $d_t$ in Eq. (2), and $A$ is the affinity matrix constituted by the off-diagonal elements of $G$ [10]. $L$ defines the volumetric propagation and $A$ describes the similarities between any two voxels. Therefore, learning all transformation matrices $w_t$ in Eq. (1) is equivalent to learning the image affinity matrix $A$. The linear volumetric propagation in Eq. (1) is differentiable, so it can be easily inserted into a standard feed-forward neural network. In the following, we show how the affinity matrix $A$ can be learned in a data-driven manner by a deep 3D CNN.
3.2 Learning Data-Driven Affinity
The affinity matrix indicates the pairwise similarities of a specific input, which should also be conditioned on the content of this input; namely, different input images should have different affinity matrices. Therefore, we design them as the outputs of a deep 3D CNN, which can be directly conditioned on an input image. One simple way is to set the output of the deep 3D CNN to the same size as the input matrix, with each voxel fully connected to all the voxels from the previous slice. When the input has c channels (c = 8 for the example shown in Fig. 4), the output needs m × c × 6 channels (there are m connections from the previous slice per voxel per channel, and six different directions). For a 64 × 64 × 64 × 8 feature map, m = 64 × 64, so it needs an output of 64 × 64 × 64 × 196608. Obviously, this is far too large to be implemented in a real-world system. Instead of using full connections between the adjacent slices, we show that certain local connections, corresponding to a sparse transform matrix, can also formulate densely connected affinity. Specifically, we introduce the nine-way connection to implement Eq. (1), as shown in Fig. 5.
3.3 Nine-Way Connection
The proposed nine-way connection enables each voxel to connect to nine voxels from the previous slice, as shown in Fig. 5. Denoting by $x_{k,t}$ and $h_{k,t}$ the $k$-th voxels in the $t$-th slice, the propagation with nine-way connection can be represented as:

$$h_{k,t} = \Big(1 - \sum_{j \in \mathbb{N}} p_{j,t}\Big)\, x_{k,t} + \sum_{j \in \mathbb{N}} p_{j,t}\, h_{j,t-1} \tag{4}$$

where $\mathbb{N}$ is the set of nine voxels in the previous slice and $p_{j,t}$ is a scalar weight indicating the propagation strength between $h_{j,t-1}$ and $x_{k,t}$. In this way, $w_t$ in Eq. (1) is constituted by $p_{j,:}$, $j \in \mathbb{N}$. The affinity matrix $A$ with linear propagation is composed of the off-diagonal elements of $G$ in Eq. (3). The three-way connection adopted in [10] can only propagate affinity in 2D. However, the proposed nine-way connection can form a relatively dense $A$ with the multiplication of several different 3D transformation matrices so that voxels can be densely and globally associated in 3D. Model stability is important for a linear system. To maintain the stability of spatial propagation in 3D, we regularize all the weights of each voxel in $H$ by keeping the summation of their absolute values less than or equal to one: $\sum_{j \in \mathbb{N}} |p_{j,t}| \le 1$.
Fig. 5. Illustration of Nine-Way Propagation in Six Directions. The Nine Green Voxels are the Neighbors of the Blue Voxel in the Previous Slice.
3.4 Implementation
As shown in Fig. 2, the VPN contains a deep 3D CNN, namely the guidance network that outputs all elements of the transformation matrix, and a linear propagation module that propagates the coarse segmentation using the learned transformation matrix. The linear propagator receives the probability map of coarse segmentation and the weights learned by the deep guidance network, and outputs a refined result. The structure of a guidance network can be any regular 3D CNN. In this paper, we simply take it to be the same as the network for the coarse segmentation, i.e. 3D U-Net. Suppose that the input into the propagator is a 3D probability map of size n × n × n × c; the guidance network needs to output a weight map of size n × n × n × c × (9 × 6)¹, because each voxel in the input 3D map requires 9 scalar weights per direction, and there are 6 directions in total (as shown in Fig. 5). The propagator needs to propagate affinity for different directions so it contains 6 independent hidden layers. In each layer, it propagates the input map with the corresponding weight map using Eq. (4). All submodules are differentiable so they can be trained jointly through back-propagation. To integrate the propagations from different directions and get the final propagation result, we use the node-wise max-pooling [10].
¹ In the example shown in Fig. 4, n = 32, c = 8 so the size of the weight map is 32 × 32 × 32 × 432 where 8 × 9 × 6 = 432.
Table 1. Comparison Results. The Network Architecture of VPN is for its Guidance Network Only. One Iteration is for Processing of One Mini-Batch (8 Images) Including Forward and Backward Operations when Training.
Name        Input             Network architecture             Time per iteration   IOU
H-Net-64    64 × 64 × 64      64-128-256-128-64                1.26 s               0.645
H-Net-128   128 × 128 × 128   16-32-64-128-256-128-64-32-16    3.65 s               0.934
L-Net       64 × 64 × 64      64-128-256-128-64                1.26 s               0.926
VPN         64 × 64 × 64      64-128-256-128-64                2.22 s               0.948
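The propagator described above can be sketched in NumPy for a single-channel volume. This is an illustration under several assumptions that are ours, not the paper's: the function names are hypothetical, neighbours outside the slice are treated as zeros, the constraint on the weights is enforced by rescaling whenever the sum of absolute values exceeds one, and the six weight maps are supplied directly instead of being predicted by a guidance network.

```python
import numpy as np

# Offsets (dy, dx) of the nine neighbours in the previous slice.
OFFSETS = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]

def weight_channels(c, connections, directions=6):
    """Channels the guidance network must output per voxel."""
    return c * connections * directions

def normalize_weights(p):
    # Assumed way to keep sum_j |p_j| <= 1 for every voxel.
    total = np.abs(p).sum(axis=-1, keepdims=True)
    return p / np.maximum(total, 1.0)

def propagate_nine_way(x, p):
    """Eq. (4) along axis 0. x: (T, H, W); p: (T, H, W, 9), p[0] unused."""
    x = np.asarray(x, dtype=np.float64)
    p = normalize_weights(np.asarray(p, dtype=np.float64))
    T, H, W = x.shape
    h = np.empty_like(x)
    h[0] = x[0]
    for t in range(1, T):
        prev = np.pad(h[t - 1], 1)           # zero padding around the slice
        gathered = np.zeros((H, W))
        for k, (dy, dx) in enumerate(OFFSETS):
            gathered += p[t, :, :, k] * prev[1 + dy:1 + dy + H, 1 + dx:1 + dx + W]
        h[t] = (1.0 - p[t].sum(axis=-1)) * x[t] + gathered
    return h

def propagate_six_directions(x, weights):
    """Scan forwards and backwards along each axis, then merge the six
    results with a node-wise maximum. weights: six (T, H, W, 9) arrays
    given in the scanning layout of each direction."""
    x = np.asarray(x, dtype=np.float64)
    results = []
    for axis in range(3):
        for reverse in (False, True):
            v = np.moveaxis(x, axis, 0)
            if reverse:
                v = v[::-1]
            out = propagate_nine_way(v, weights[len(results)])
            if reverse:
                out = out[::-1]
            results.append(np.moveaxis(out, 0, axis))
    return np.maximum.reduce(results)

full = weight_channels(8, 64 * 64)   # full connection to the previous slice
nine = weight_channels(8, 9)         # nine-way connection

x = np.arange(4 * 5 * 5, dtype=np.float64).reshape(4, 5, 5)
copy_weights = np.zeros((4, 5, 5, 9))
copy_weights[..., 4] = 1.0           # all weight on the voxel directly behind
copied = propagate_nine_way(x, copy_weights)
```

With all weight on the centre neighbour, each slice simply copies the previous one, and with all-zero weights the input passes through unchanged; learned weights fall between these extremes.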
4 Experimental Results
We validate the effectiveness of the proposed approach on the task of ascending aorta segmentation. Our collected aorta CTA dataset contains 364 images for training, 100 for validation and 100 for testing. The image intensity is truncated to the range of [−100, 1000] and then normalized to [0, 1]. Table 1 summarizes the networks for evaluation. We implement all networks (L-Net for coarse segmentation, VPN for refinement and two H-Nets) using PyTorch [13] with the ADAM optimizer [7], and set the base learning rate to 0.0001. All networks are trained from scratch with a mini-batch size of 8 using one Nvidia Tesla P40 GPU. The validation set is evaluated after each training epoch, i.e. one iteration over all training images. The number of training epochs is set to 100. Note that the iteration number (i.e., the number of network parameter updates) of L-Net is 8 times smaller than that of the other networks, because it takes 8 downsampled whole images as one mini-batch while the others randomly crop 8 patches from one image of original size as the mini-batch. We take the model checkpoint that achieves the highest performance on the validation set for testing and use the Intersection-Over-Union (IOU) on the testing set to measure the performance of the networks.
When testing, L-Net takes the whole downsampled image as input and the upsampled result is evaluated. For the other networks, we densely scan the image of original size with a window of size 128 × 128 × 128 and there is no overlap between adjacent windows. For the guidance network of VPN, before feeding the input into the 3D U-Net, we convolve it with stride 2, namely, downsample it to half, for fast computing. The probability input is first convolved in the same way. Finally, the output of the propagator is upsampled (with convolution) to the same resolution as the input. The randomly cropped 8 samples of size 64 × 64 × 64 from one volumetric image are fed into the network as one mini-batch.
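The intensity preprocessing and the evaluation metric described above can be written down compactly. The sketch below is a NumPy illustration with function names of our choosing; returning 1.0 when both masks are empty is a convention we assume, not something stated in the text.

```python
import numpy as np

def normalize_intensity(volume, low=-100.0, high=1000.0):
    """Truncate intensities to [low, high], then rescale linearly to [0, 1]."""
    clipped = np.clip(np.asarray(volume, dtype=np.float64), low, high)
    return (clipped - low) / (high - low)

def iou(pred, target):
    """Intersection-Over-Union of two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    union = np.logical_or(pred, target).sum()
    if union == 0:
        return 1.0  # assumed convention for two empty masks
    return np.logical_and(pred, target).sum() / union

values = normalize_intensity([-500.0, -100.0, 450.0, 1000.0, 3000.0])

a = np.zeros((4, 4, 4), dtype=bool)
b = np.zeros((4, 4, 4), dtype=bool)
a[:2] = True     # 32 voxels
b[1:3] = True    # 32 voxels, 16 of them shared with a
score = iou(a, b)
```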
H-Net-64 uses exactly the same network architecture as L-Net. The only difference during training lies in the generation of samples. The mini-batch in L-Net includes 8 downsampled whole images while H-Net-64 randomly crops 8 image patches of size 64 × 64 × 64 from one original image. In this way, H-Net-64 encodes less context information than L-Net. The training of H-Net-64 is difficult and the result is poor, as shown in Table 1 and Fig. 6. To learn more context information, we cropped patches of size 128 × 128 × 128 to train H-Net-128, where the number of network stages is increased from five in H-Net-64 to nine. Due to the constraint of GPU memory, we decrease the number of channels for high-resolution feature maps by four times in H-Net-128.
Fig. 6. Learning Curves. *Note that During Training and Validation L-Net is Evaluated using the Downsampled Annotations, so its Loss is Lower than Others'.
As shown in Fig. 6, the training process of H-Net is not stable. The training losses decrease consistently but the validation losses fluctuate significantly. In contrast, the training of L-Net and VPN is more stable: both the training and validation losses decrease consistently. This is because training on the downsampled images (L-Net) is easier than directly training on the original images (H-Net). The training of VPN benefits from the guidance from the coarse segmentation of L-Net and it converges very fast. When training L-Net, images are padded to the size of 64 × 64 × 64 (some may be cropped due to larger size). Although the network architecture of L-Net is the same as H-Net-64, it is easier to train and the performance is much better. Although the average IOU of H-Net-128 is a little higher than that of L-Net, in some cases it performs much worse, as shown in Fig. 1. It may contain some holes on the boundary of the scanning window. This may be attributed to the zero-padding used in the network training. VPN adopts the same scanning window strategy as H-Net-128 when testing, but with the guidance from the coarse segmentation, it does not have the boundary problem.
Figure 7 illustrates some segmentation results of H-Net-128 and VPN. The two worst cases of H-Net-128 on the testing set are shown in Fig. 7(a) and (b). In these cases, H-Net-128 outputs serious false alarms. This is because training on high resolution with limited training samples is more difficult; the network cannot effectively learn the whole structure information of the object. In contrast, with the proposed coarse-to-fine segmentation strategy, VPN can segment the object with high accuracy. Furthermore, the segmentation result of H-Net-128 on the boundary of the scanning window is not stable, which results in the holes in the 3D segmentation mask. VPN, on the other hand, is robust on the window boundary, which is one benefit of the coarse-to-fine segmentation strategy. In the case shown in Fig. 7(c), the IOU score of H-Net-128 is a little higher than that of VPN. However, the result of H-Net-128 still contains some small false alarms, while VPN generates a clean and accurate segmentation result.
Note that in the previous comparison, for fair evaluation of segmentation accuracy we adopt the same dense scan testing strategy for VPN as for H-Net. However, in practice, we can use L-Net to quickly locate the region of the object on the downsampled image and then refine the upscaled coarse segmentation result around this region using VPN. With this cascade strategy, we can segment an object of size 200 × 100 × 100 in a 400 × 680 × 680 image in around 4 s. In comparison, it takes H-Net-128 about 1 min to densely scan the whole volumetric image with a window of size 128 × 128 × 128. The proposed coarse-to-fine segmentation method is not only more accurate but also an order of magnitude faster.
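The cascade strategy amounts to locating the object on the coarse mask and refining only around it. The sketch below shows the bookkeeping involved, under assumptions of ours: the helper names are hypothetical, a downsampling factor of 4 per axis is assumed, and the window counts only indicate how much less volume has to be scanned; they are not timing measurements.

```python
import math
import numpy as np

def bounding_box(mask):
    """Smallest axis-aligned box containing all foreground voxels."""
    coords = np.argwhere(mask)
    return coords.min(axis=0), coords.max(axis=0) + 1

def roi_in_original(coarse_mask, scale, original_shape, margin=0):
    """Map the coarse bounding box back to original-resolution slices."""
    lo, hi = bounding_box(coarse_mask)
    lo = np.maximum(lo * scale - margin, 0)
    hi = np.minimum(hi * scale + margin, original_shape)
    return tuple(slice(int(a), int(b)) for a, b in zip(lo, hi))

def num_windows(shape, window=128):
    """Non-overlapping windows needed to cover a volume of the given shape."""
    return math.prod(math.ceil(s / window) for s in shape)

original_shape = (400, 680, 680)
scale = 4                                     # assumed per-axis factor
coarse = np.zeros((100, 170, 170), dtype=bool)
coarse[10:60, 20:45, 30:55] = True            # toy coarse segmentation

roi = roi_in_original(coarse, scale, original_shape)
roi_shape = tuple(s.stop - s.start for s in roi)

dense_windows = num_windows(original_shape)   # scanning the whole image
cascade_windows = num_windows(roi_shape)      # scanning only the located region
```

In this toy setting the located region measures 200 × 100 × 100 voxels and needs 2 windows instead of 144, which is consistent in spirit with the large speed-up reported for the cascade.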
Fig. 7. First Line in each Subfigure: Result of H-Net-128; Second Line: Result of VPN. First Column: 3D Segmentation Mask; Second Column: Result on Axial Plane; Third Column: Result on Sagittal Plane; Fourth Column: Result on Coronal Plane.
5 Conclusion
In this paper, we design an efficient coarse-to-fine deep learning framework that not only accelerates segmentation but also improves accuracy. The segmentation network trained on the downsampled images generates the coarse segmentation, and the proposed VPN then propagates affinity slice by slice to generate the high-resolution result. We demonstrate the effectiveness of the proposed approach on the task of segmenting the ascending aorta from CTA images. The experimental results show that the proposed method outperforms other state-of-the-art segmentation networks while being an order of magnitude faster. In the future, we will further test the proposed approach on other segmentation tasks, such as the segmentation of the lung and liver.
Acknowledgments. This work was supported by Shenzhen Science and Technology Program (Grant No. KQTD2016112809330877).
An Improved CNN Model for Image Forgery Detection
K. R. Jisha1(B) and N. Sabna2
1 APJ Abdul Kalam Technological University (KTU), Thiruvananthapuram, Kerala, India
[email protected]
2 Rajagiri School of Engineering and Technology, Kakkanad, Kerala, India
[email protected]
Abstract. As digital image forgery is flourishing at an alarming rate, a serious effort to detect and classify such forgeries is essential. Passive image forgery detection techniques have gained a lot of interest among researchers and have achieved tremendous advances in countering technologically advanced image forgery tools. At present, deep learning based detection methods are widely adopted because of their automatic feature extraction capabilities. This paper highlights a method to improve the performance of a Convolutional Neural Network model in the context of image forgery detection. The study can be extended to analyze some of the best CNN-based deep learning classification models, so that the one that best fits the image forgery detection application can be identified. To this end, VGG19, one of the best CNN classification models, is considered, and the experimental results show that applying the proposed approach improves the model's performance in terms of accuracy, precision, recall and F1-score. The experimental analysis is carried out using popular image forgery datasets. The architecture is implemented in Python 3 using the Keras API on TensorFlow, and the simulation is run on Google Colab using a GPU runtime.
Keywords: Digital Image Forgery · Deep Learning · Convolutional Neural Network
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 K. Arai (Ed.): SAI 2023, LNNS 739, pp. 716–727, 2023. https://doi.org/10.1007/978-3-031-37963-5_49
1 Introduction
Image forgery detection and localization is a very active area of research in the field of digital image forensics, driven by the emergence of advanced image manipulation tools and their ease of access. The prime goal of image forensics is to confirm the authenticity of an image. Image forensics techniques fall into two categories: passive and active methods [1]. Active methods require some prior information about the image in order to validate it. Digital signatures and watermarks are examples of this category. In these methods, a signature or a watermark is attached to the image by the sender. The receiver has information about these attachments and, on receiving the image, checks for the authorized attachment and retrieves the original image. In contrast to active methods, passive or blind detection methods check the authenticity of the image without any prior information about it. These methods utilize the traces left behind by the processing steps in the different phases of acquisition and storage of the image. Changes in these traces can be analyzed to detect tampered regions. Copy-move and splicing are the two major categories of passive forgeries that are very hard to identify with the naked eye. A lot of work has been completed in the passive category of forgery detection, ranging from traditional methods to the recent technology of deep learning [2]. The traditional methods included very complex feature extraction phases. Moreover, in most cases they could identify only one specific type of forgery by identifying certain features of the image. The emergence of GPU technologies and the successful application of deep learning models in computer vision paved the way for the use of deep learning architectures in image forgery detection. A deep learning architecture combines the feature extraction and classification processes, which the model performs automatically [3]. This is a data-driven process: the model automatically learns the features of the image and uses them to detect forgeries, if any. DNN (Deep Neural Networks), RNN (Recurrent Neural Networks) and CNN (Convolutional Neural Networks) are some of the popular deep learning models. The most popular among these are the CNN models [4]. There exist many variants of CNN models such as AlexNet [5], ResNet, VGG Net [6], DenseNet [7] etc. In this paper, the VGG Net model is considered and a new approach is proposed to improve the accuracy and other performance metrics of the model in the context of forgery detection. The performance of the new approach is compared with the traditional VGG-19 model.
The model is trained, and its performance is evaluated by testing it on images from three image forgery datasets: CoVerage [8], CG-1050 [9] and CASIA-1 [10]. The rest of the paper is organized as follows: Sect. 2 summarizes related works in this field, Sect. 3 describes the proposed approach, Sect. 4 summarizes the experimental works carried out and Sect. 5 discusses the results and their analysis. Finally, the paper is concluded with future scope in the conclusion section.
2 Related Works
Many techniques have been proposed in the field of image forgery detection and localization. This section briefly discusses some of the existing methods. A convolutional neural network method based on Fals-Unet is proposed by El Biach et al. [11] to locate manipulated regions. The Fals-Unet architecture consists of a CNN-based encoder and a decoder followed by a pixel-based classification layer. Fals-Unet is reported to localize manipulated regions effectively in terms of F1-score, MCC, AUC, and Jaccard index. A novel image forgery localization method, TransForensics, inspired by Transformers, is proposed by Hao et al. [12]. The proposed network contains three main components: an FCN backbone for feature extraction, dense self-attention encoders, and dense correction modules for performance optimization. A suitably modified version of
MobileNetV2, effective on copy-move forgery detection with post-processed attacks, is proposed by Abbas et al. [13]. In this method, MobileNetV2, a highly efficient and lightweight network, was modified and fine-tuned by framing a new FC layer in place of the old pre-trained FC layer. The model detects copy-move forgeries and is shown to be a time- and resource-friendly deep learning framework for digital image forgery detection on embedded devices. A general view of the foundations of CNNs, their recent advancements and some major application areas is given by Ghosh et al. [4]. The focus is an elaborate discussion of the foundations and concepts of CNNs, the training process, and recent advances and applications. An improved manipulation localization architecture to segment manipulated regions from non-manipulated ones was proposed by Bappy et al. [14]. Long short-term memory (LSTM) cells with resampling features and a convolutional encoder-decoder network are utilized in that work. A hybrid CNN-LSTM model to capture discriminative features between manipulated and non-manipulated regions is developed by Bappy et al. [15]. The boundary discrepancy, i.e., the spatial structure, between manipulated and non-manipulated regions is obtained with the combination of LSTM and convolution layers. A robust deep learning-based system for identifying image forgeries in the context of double image compression is proposed by Ali et al. [16]. The difference between an image's original and recompressed versions is used to train the model in this approach. Two deep learning approaches, a model with a custom architecture and a model using transfer learning, are proposed by Rodriguez-Ortega et al. [17]. In all cases, the influence of the depth of the network is recorded in terms of precision (P), recall (R) and F1 score. A CNN-based self-attention model including a conditional random field (CRF) is proposed by Rao et al. [18].
A novel method using multiple stacked autoencoder layers is proposed by Bibi et al. [19]. Pretrained AlexNet and VGG-16 [6] are utilized for feature extraction. The method shows good accuracy in comparison with some state-of-the-art approaches.
3 Proposed Approach in Improving Performance
Using the pretrained weights of CNN models has drawn much interest in deep learning research across various applications. Applying pretrained weights helps improve the performance of a new, untrained model. This paper highlights an approach to improve the accuracy of a model that already uses the pretrained weights of CNN models. Adopting this approach improves model performance beyond that of models using only pretrained weights in the image forgery detection application. Thus, the overall accuracy of the model in classifying and localizing forgeries in images can be improved using this methodology. A model tuned by this approach is more advantageous than a bare CNN model. In this approach, VGG-19, one of the best CNN classification models, is considered. It is trained on ImageNet, a very large image database consisting of 1.2 million images corresponding to 1000 classes [20]. The weights of this pretrained network [21] are utilized in this work.
Fig. 1. Proposed Approach for the Performance Improvement. (a) Freezing Pretrained Layers of VGG-19 and Training the Custom Fully Connected Layers using Image Forgery Datasets Forming the Trained Model. (b) Loading the Trained Model, Unfreezing the Initial Layers and Training the Complete Model using Image Forgery Dataset giving rise to new Improved Version of the Model
Utilizing the already learned pretrained weights helps increase the accuracy of the architecture compared with an untrained network. The pretrained VGG-19 network is then improved using the proposed approach, which is described in Fig. 1. The method is to freeze the pretrained layers, train the model and save its weights, and then load the saved weights, unfreeze the complete model and train it on the forgery datasets. In the second step, we load the VGG-19 model trained in the previous phase, unfreeze all the initial layers, and train the entire architecture again on the image forgery dataset with a very low learning rate. Once this process is completed, prediction is carried out and the performance metrics are calculated and compared with those of the previous experiment. Results show that the proposed method improves the accuracy and F1 score of the model even with a small dataset such as CoVerage. The work is carried out in two experimental phases. In the first phase, the VGG-19 model is considered and fine-tuned, and the performance metrics [22] are calculated.
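The freeze-then-unfreeze schedule described above can be illustrated on a deliberately tiny, framework-free model. This is only a toy sketch of the training schedule, not the authors' Keras implementation: the two "backbone" weights stand in for the pretrained VGG-19 layers, the single "head" weight stands in for the custom fully connected layers, and the data, initial values, learning rates and step counts are all hypothetical.

```python
def mse(head, base, data):
    """Mean squared error of the toy model head * (base . x)."""
    return sum((head * (base[0] * x[0] + base[1] * x[1]) - y) ** 2
               for x, y in data) / len(data)


def train(head, base, data, lr, steps, train_base):
    """Plain gradient descent. When train_base is False the backbone
    weights are frozen and only the head weight is updated."""
    base = list(base)
    n = len(data)
    for _ in range(steps):
        g_head, g_base = 0.0, [0.0, 0.0]
        for x, y in data:
            feat = base[0] * x[0] + base[1] * x[1]
            err = head * feat - y
            g_head += 2 * err * feat / n
            g_base[0] += 2 * err * head * x[0] / n
            g_base[1] += 2 * err * head * x[1] / n
        head -= lr * g_head
        if train_base:
            base = [b - lr * g for b, g in zip(base, g_base)]
    return head, base


# Hypothetical data generated from y = 3*x1 + 1.5*x2.
data = [((1, 0), 3.0), ((0, 1), 1.5), ((1, 1), 4.5), ((2, 1), 7.5)]
pretrained_base = [1.0, 1.0]  # stands in for the pretrained backbone

# Phase 1: backbone frozen, only the newly added head is trained.
head, base_after_phase1 = train(1.0, pretrained_base, data,
                                lr=0.05, steps=200, train_base=False)
loss_phase1 = mse(head, base_after_phase1, data)

# Phase 2: everything unfrozen, training continues with a lower learning rate.
head, base_after_phase2 = train(head, base_after_phase1, data,
                                lr=0.01, steps=2000, train_base=True)
loss_phase2 = mse(head, base_after_phase2, data)

print(loss_phase1, loss_phase2)
```

In this toy setting the frozen backbone limits how well the head alone can fit the data, and unfreezing lets the loss decrease further. Whether the same holds for a real network and dataset is an empirical question, which is what the experiments in this paper examine.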
Fig. 2. VGG-19 Architecture
VGG (Visual Geometry Group) is a deep Convolutional Neural Network architecture with multiple layers [6]. It is trained on ImageNet, a very large image database consisting of 1.2 million images corresponding to 1000 classes. The complete VGG-19 model architecture is shown in Fig. 2. VGG-19 is a variant of VGG Net with 19 weight layers (16 convolutional layers and 3 fully connected layers), together with 5 max pooling layers and a final softmax classifier layer. The VGG Net accepts an input image of size 224 × 224; a 224 × 224 RGB input is fed to the first convolutional layer of VGG-19. Convolutional layers use kernels of size 3 × 3 with a stride of 1 pixel, and max pooling is performed over 2 × 2 pixel windows with stride 2. After five blocks of convolution layers, each followed by max pooling, the fully connected layers follow. The first two fully connected layers have 4096 channels each and the third has 1000 channels. The final layer is the softmax classifier layer. In the second experimental phase, the proposed approach is carried out to improve the accuracy [23] and F1 score [24], thereby improving the model for the image forgery detection application. The results of the proposed approach are compared with those of the existing approach, and they show that the proposed approach outperforms it.
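The layer counts and sizes quoted above can be checked with a short shape trace. The sketch below assumes the standard VGG-19 layout from the original VGG design [6], in which every 3 × 3 convolution preserves the spatial size (padding of 1) and every pooling layer halves it; the function name and the returned dictionary keys are illustrative choices.

```python
# Integers are the output channels of 3x3 convolutions (stride 1, size-preserving
# padding assumed); "M" marks a 2x2 max pooling layer with stride 2.
VGG19_CFG = [64, 64, "M", 128, 128, "M", 256, 256, 256, 256, "M",
             512, 512, 512, 512, "M", 512, 512, 512, 512, "M"]
FC_UNITS = [4096, 4096, 1000]


def trace_vgg19(height=224, width=224, channels=3):
    """Follow an input through VGG-19 and count layers and parameters."""
    n_conv = n_pool = params = 0
    for item in VGG19_CFG:
        if item == "M":
            height, width = height // 2, width // 2
            n_pool += 1
        else:
            params += (3 * 3 * channels + 1) * item  # kernel weights + biases
            channels = item
            n_conv += 1
    flattened = height * width * channels
    fan_in = flattened
    for units in FC_UNITS:
        params += (fan_in + 1) * units  # dense weights + biases
        fan_in = units
    return {"conv_layers": n_conv, "pool_layers": n_pool,
            "feature_map": (height, width, channels),
            "flattened": flattened, "parameters": params}


summary = trace_vgg19()
print(summary)
```

The trace gives 16 convolutional and 5 pooling layers, a 7 × 7 × 512 feature map before the fully connected layers, and about 143.7 million parameters, most of which sit in the first fully connected layer. This is one reason why replacing the original classifier with a smaller custom head is common in transfer learning.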
4 Experiments
The model architecture is implemented in Python 3 using Keras [25] with a TensorFlow backend [27]. The Adam optimizer [27] is used with β1 = 0.9, β2 = 0.999 and a learning rate of 0.00001. Training was conducted for various numbers of epochs and the results are consolidated. The simulations are run on Google Colab using a GPU runtime. In the first experimental phase, the pretrained model is considered: the layers preceding the final fully connected and classifier layers are frozen, and the succeeding layers are removed and replaced by layers appropriate for the classification task. Since the application is a binary classification problem, two dense layers and an output classifier layer using the softmax function are included. All layers except these are frozen, and the newly added final layers are trained with image forgery datasets, namely CoVerage, CG-1050 and CASIA 1. Then, after unfreezing the previously frozen layers, the complete model is trained with a very low learning rate for different numbers of epochs and batch sizes, and the validation accuracy is found to improve with increasing batch size. The improved model is then subjected to early stopping, and the experimental results show that the resulting accuracy is the same as that obtained after training for a large number of epochs without early stopping. The resulting trained model is used for prediction, and the performance metrics are computed from the predictions.
4.1 Datasets Preparation
The three datasets used for evaluating the model are CoVerage, CG-1050 and CASIA-1. Table 1 shows the details of these tampered image databases. The datasets are divided into randomly chosen subsets: training (70%), validation (20%) and test (10%). The images are resized to 224 × 224, the size accepted by the VGG-19 model.
Table 1. Datasets
Dataset     Number of Images   Image Size
CoVerage    100                400 × 486
CG-1050     1050               4608 × 3456
CASIA-1     1721               348 × 256
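The random 70/20/10 partition described in Sect. 4.1 could be produced along the following lines. This is a minimal sketch using only the Python standard library; the function name, the fixed seed and the file names are hypothetical, and the paper does not state how its split was actually implemented.

```python
import random


def split_dataset(items, train=0.7, val=0.2, seed=0):
    """Shuffle the items and cut them into training, validation and test
    subsets; the test subset receives whatever remains."""
    items = list(items)
    random.Random(seed).shuffle(items)  # seeded for a reproducible split
    n_train = round(train * len(items))
    n_val = round(val * len(items))
    return (items[:n_train],
            items[n_train:n_train + n_val],
            items[n_train + n_val:])


# Hypothetical file names for a dataset of 100 images, the size of CoVerage.
files = [f"img_{i:03d}.png" for i in range(100)]
train_set, val_set, test_set = split_dataset(files)
print(len(train_set), len(val_set), len(test_set))
```

With only 100 images, the test subset contains 10 images, so each test image changes the measured accuracy by 0.1. This is worth keeping in mind when reading the CoVerage results.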
4.2 Performance Metrics Evaluation
The performance metrics [22] considered in this study to evaluate and verify the efficiency of the proposed method are accuracy, precision, recall and F1 score. Accuracy is given by
\[ \text{Accuracy},\ A = \frac{TP + TN}{TP + FP + TN + FN} \tag{1} \]
Precision, also termed the positive predictive value, is the number of correct positive predictions out of the total number of positive predictions made by the model. It can be represented as
\[ \text{Precision},\ P = \frac{TP}{TP + FP} \tag{2} \]
Recall, also termed the true positive rate or the sensitivity of the model, is the number of correct positive predictions out of the total number of actual positive samples in the problem. It can be represented as
\[ \text{Recall},\ R = \frac{TP}{TP + FN} \tag{3} \]
F1 score is the performance metric usually considered when false positives and false negatives are equally important in the analysis of the application. It is also suitable when the dataset is unbalanced. F1 score is given by
\[ F1 = \frac{2 \cdot P \cdot R}{P + R} = \frac{2\,TP}{2\,TP + FN + FP} \tag{4} \]
In the above expressions, TP and FP denote the numbers of true positive and false positive predictions, and TN and FN denote the numbers of true negative and false negative predictions, respectively.
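Equations (1) to (4) translate directly into code. The sketch below is a plain Python illustration; the function name is an illustrative choice, the zero-denominator handling is an added convention, and the confusion-matrix counts are hypothetical values chosen only to exercise the formulas.

```python
def classification_metrics(tp, fp, tn, fn):
    """Accuracy, precision, recall and F1 score as defined in Eqs. (1)-(4).
    Ratios with a zero denominator are reported as 0.0 (a convention
    assumed here, not one stated in the paper)."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * tp / (2 * tp + fn + fp) if (2 * tp + fn + fp) else 0.0
    return accuracy, precision, recall, f1


# Hypothetical counts: 5 forged images detected, 5 missed, no false alarms.
accuracy, precision, recall, f1 = classification_metrics(tp=5, fp=0, tn=0, fn=5)
print(accuracy, precision, recall, round(f1, 2))
```

For these counts the two forms of Eq. (4) agree: 2PR/(P + R) = 2 · 1 · 0.5 / 1.5 and 2TP/(2TP + FN + FP) = 10/15 both give about 0.67.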
5 Experimental Results and Discussion
The analysis is done in two experimental phases. In both phases the analysis is carried out using three different datasets: CoVerage, CG-1050 and CASIA 1. The accuracies and other performance metrics such as precision, recall and F1 score for the different datasets are compared in Table 2.
Table 2. Comparison of Performance Metrics of the Already Existing and Modified VGG-19 Model
            VGG-19                                       Modified VGG-19 Using Proposed Approach
Dataset     Accuracy   F1 Score   Precision  Recall      Accuracy   F1 Score   Precision  Recall
CoVerage    0.1        0.1        0.1        0.1         0.5        0.67       1          0.5
CG-1050     0.84       0.92       0.98       0.86        0.94       0.94       1          0.88
CASIA-1     0.46       0.55       0.7        0.47        0.47       0.63       1          0.46
The experimental results show that the model after the second modification has an improved validation accuracy and F1 score. The training and validation losses, which indicate how well the model fits the training and validation data, are also analyzed and plotted. The loss characteristics of the traditional VGG-19 and of the proposed method with early stopping applied are shown in Fig. 3. The validation loss and accuracy curves show that the model is fine-tuned after this entire training process. The comparison of accuracies, F1 score, precision and recall is represented graphically in Figs. 4 and 5. The graphs show that the training and validation losses converge faster with the proposed method than with the other model. Overall, the results indicate that the proposed approach performs better than the baseline VGG-19 on most of the reported metrics.
Fig. 3. Training Loss and Validation Loss Curves of CoVerage, CG-1050 and CASIA 1 Respectively (Left to Right) (a) VGG-19 and (b) Modified VGG-19 using Proposed Method
5.1 Experiment #1
In this phase, transfer learning using the VGG19 model is carried out, and the model is trained on the three datasets for different numbers of epochs and batch sizes. Results show that the validation accuracy improves with increasing epochs and batch sizes. Early stopping is applied to this fine-tuned version of the model, and the model is saved. Predictions are made using the saved model, and the corresponding performance metrics are obtained and tabulated.
5.2 Experiment #2
In this experimental phase, the proposed approach is applied to the model and compared with the results of the previous experimental phase. In this approach, the initial layers of the VGG-19 model are frozen and the final fully connected layers are trained using the forgery dataset. On completion of this training, all layers of the model are unfrozen and the model is trained using the image forgery datasets with a very low learning rate of 0.00001. The experiment is carried out for all three datasets. The performance of the model is then evaluated using the metrics, and the results are compared with those of the first experiment.
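Both experiments rely on early stopping, but the stopping criterion itself is not spelled out. A common patience-based rule is sketched below as an assumption: training stops once the validation loss has failed to improve for a given number of consecutive epochs. The function name, the patience value and the loss values are hypothetical.

```python
def early_stopping_epoch(val_losses, patience=2, min_delta=0.0):
    """Return (stop_epoch, best_epoch), both zero-based. Training stops at
    the first epoch where the validation loss has not improved by more than
    min_delta for `patience` consecutive epochs."""
    best_loss = float("inf")
    best_epoch = None
    waited = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss - min_delta:
            best_loss, best_epoch, waited = loss, epoch, 0
        else:
            waited += 1
            if waited >= patience:
                return epoch, best_epoch
    return len(val_losses) - 1, best_epoch


# Hypothetical validation losses, one value per epoch.
losses = [0.90, 0.70, 0.60, 0.61, 0.62, 0.55]
stopped_short = early_stopping_epoch(losses, patience=2)
ran_to_end = early_stopping_epoch(losses, patience=3)
print(stopped_short, ran_to_end)
```

With a patience of 2 the run stops before the late improvement in the last epoch, while a patience of 3 reaches it. The choice of patience can therefore affect whether early stopping matches the accuracy of a long run, as reported in Sect. 4.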
Fig. 4. Comparison of Accuracies of the VGG 19 Model and the Improved Model using Datasets – Coverage, CG-1050 and Casia-1
Fig. 5. Comparison of Precision, Recall and F1-Score of VGG-19 Model and the Modified Model using Datasets – Coverage, CG-1050 and Casia-1
6 Conclusion
In this paper, a method to improve a CNN model is proposed, making the model suitable for image forensics applications. The study is carried out using VGG19, one of the best CNN classification models. The performance of the model is evaluated using image forgery datasets, namely CoVerage, CG-1050 and CASIA 1, and the results are compared with those of the improved model on the same datasets. The performance metrics accuracy, precision, recall and F1 score are considered for comparison. The results show a percentage improvement in terms of validation accuracy, precision, recall and F1 score using the proposed approach. The loss and validation characteristic curves of the improved model also compare favorably with those of the existing one. Future work will consider applying the same methodology to other efficient CNN image classification models such as ResNet 50 and MobileNet, so that the CNN model that best fits the image forgery detection application can be identified and used in forgery localization tasks in image forensics applications.
Acknowledgment. This work is part of the Research program under APJ Abdul Kalam Technological University (KTU), Thiruvananthapuram, Kerala. The authors thank Rajagiri School of Engineering and Technology, Kakkanad, Kerala for supporting with all the necessary facilities for carrying out this work.
References
1. Walia, S., Kumar, K.: An eagle-eye view of recent digital image forgery detection methods. Commun. Comput. Inf. Sci. 828, 469–487 (2018). https://doi.org/10.1007/978-981-10-8660-1_36
2. Thakur, R., Rohilla, R.: Recent advances in digital image manipulation detection techniques: a brief review. In: Forensic Science International, vol. 312. Elsevier Ireland Ltd, 01 July 2020. https://doi.org/10.1016/j.forsciint.2020.110311
3. Kuznetsov, A.: Digital image forgery detection using deep learning approach. J. Phys. Conf. Ser. 1368(3) (2019). https://doi.org/10.1088/1742-6596/1368/3/032028
4. Ghosh, A., Sufian, A., Sultana, F., Chakrabarti, A., De, D.: Fundamental concepts of convolutional neural network. In: Balas, V.E., Kumar, R., Srivastava, R. (eds.) Recent Trends and Advances in Artificial Intelligence and Internet of Things. ISRL, vol. 172, pp. 519–567. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-32644-9_36
5. Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional neural networks. Adv. Neural Inf. Process. Syst. 25 (2012)
6. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556, September 2014
7. Huang, G., Liu, Z., van der Maaten, L., Weinberger, K.Q.: Densely connected convolutional networks. In: Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2261–2269, July 2017. https://doi.org/10.1109/CVPR.2017.243
8. Wen, B., Zhu, Y., Subramanian, R., Ng, T.-T., Shen, X., Winkler, S.: COVERAGE – a novel database for copy-move forgery detection. In: Proceedings of the 2016 IEEE International Conference on Image Processing (ICIP), pp. 161–165, September 2016. https://doi.org/10.1109/ICIP.2016.7532339
9. Castro, M., Ballesteros, D.M., Renza, D.: A dataset of 1050-tampered color and grayscale images (CG-1050). Data Brief 28, 104864 (2020). https://doi.org/10.1016/j.dib.2019.104864
10. Dong, J., Wang, W., Tan, T.: CASIA image tampering detection evaluation database. In: Proceedings of the 2013 IEEE China Summit and International Conference on Signal and Information Processing, pp. 422–426, July 2013. https://doi.org/10.1109/ChinaSIP.2013.6625374
11. El Biach, F.Z., Iala, I., Laanaya, H., Minaoui, K.: Encoder-decoder based convolutional neural networks for image forgery detection. Multimedia Tools Appl. (2021). https://doi.org/10.1007/s11042-020-10158-3
12. Hao, J., Zhang, Z., Yang, S., Xie, D., Pu, S.: TransForensics: image forgery localization with dense self-attention. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 15055–15064 (2021)
13. Abbas, M.N., Ansari, M.S., Asghar, M.N., Kanwal, N., O’Neill, T., Lee, B.: Lightweight deep learning model for detection of copy-move image forgery with post-processed attacks. In: Proceedings of the SAMI 2021 – IEEE 19th World Symposium on Applied Machine Intelligence and Informatics, pp. 125–130, January 2021. https://doi.org/10.1109/SAMI50585.2021.9378690
14. Bappy, J.H., Roy-Chowdhury, A.K., Bunk, J., Nataraj, L., Manjunath, B.S.: Exploiting spatial structure for localizing manipulated image regions. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 4970–4979 (2017)
15. Bappy, J.H., Simons, C., Nataraj, L., Manjunath, B.S., Roy-Chowdhury, A.K.: Hybrid LSTM and encoder-decoder architecture for detection of image forgeries, March 2019. https://doi.org/10.1109/TIP.2019.2895466
16. Ali, S.S., Ganapathi, I.I., Vu, N.S., Ali, S.D., Saxena, N., Werghi, N.: Image forgery detection using deep learning by recompressing images. Electron. (Switz.) 11(3), 403 (2022). https://doi.org/10.3390/electronics11030403
17. Rodriguez-Ortega, Y., Ballesteros, D.M., Renza, D.: Copy-move forgery detection (CMFD) using deep learning for image and video forensics. J. Imaging 7(3), 59 (2021). https://doi.org/10.3390/jimaging7030059
18. Rao, Y., Ni, J., Xie, H.: Multi-semantic CRF-based attention model for image forgery detection and localization. Signal Process. 183, 108051 (2021). https://doi.org/10.1016/j.sigpro.2021.108051
19. Bibi, S., Abbasi, A., Haq, I.U., Baik, S.W., Ullah, A.: Digital image forgery detection using deep autoencoder and CNN features. Hum.-centric Comput. Inf. Sci. 11, 1–17 (2021). https://doi.org/10.22967/HCIS.2021.11.032
20. Deng, J., Dong, W., Socher, R., Li, L.-J., Kai, L., Li, F.-F.: ImageNet: a large-scale hierarchical image database. In: Proceedings of the 2009 IEEE Conference on Computer Vision and Pattern Recognition, pp. 248–255 (2009). https://doi.org/10.1109/CVPR.2009.5206848
21. Weiss, K., Khoshgoftaar, T.M., Wang, D.: A survey of transfer learning. J. Big Data 3(1), 1–40 (2016). https://doi.org/10.1186/s40537-016-0043-6
22. Bekkar, M., Kheliouane Djemaa, D., Akrouf Alitouche, D.: Evaluation measures for models assessment over imbalanced data sets, vol. 3, no. 10 (2013). www.iiste.org
23. Akosa, J.S.: Predictive accuracy: a misleading performance measure for highly imbalanced data. In: Proceedings of the SAS Global Forum, pp. 942–2017 (2017)
24. Sokolova, M., Japkowicz, N., Szpakowicz, S.: LNAI 4304 - beyond accuracy, F-score and ROC: a family of discriminant measures for performance evaluation (2006)
25. Ketkar, N.: Introduction to keras. In: Deep Learning with Python, pp. 97–111. Apress (2017). https://doi.org/10.1007/978-1-4842-2766-4_7
26. USENIX Association, ACM SIGMOBILE, ACM Special Interest Group in Operating Systems, and ACM Digital Library. Papers presented at the Workshop on Wireless Traffic Measurements and Modeling, 5 June 2005, Seattle, WA, USA. USENIX Association (2005)
27. Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization, December 2014
Low-Cost Model-Free Deep Reinforcement Learning on Continuous Control
Huihui Zhang(B), Xu Han, Yanlong Cheng, and Cong Yan
Dongsheng Intelligent Technology Co., Ltd., Suzhou, China
[email protected]
Abstract. Training agents to perform reinforcement learning (RL) is a difficult task in most cases since they are often misguided by the reward signal, getting into invalid states and behaving in unwanted ways when exploring and interacting with the environment. Sometimes the cost needs to be considered when optimizing the performance during RL training. An additional cost signal from the environment may help to lead agents towards the desired actions, maximizing the expected return at low cost. In most practical situations, the state or action space is usually large or continuous, which makes model-free training more difficult if too much constraint is given for low cost. Therefore, in continuous Markov Decision Processes, we present a notion to optimize the set objective function at low cost with free exploration, which avoids the performance degrading. In this paper, we propose a novel approach to replace the objective and cost with surrogate functions, as well as applying a state-dependent multiplier to provide a highly efficient tradeoff between them.
Keywords: Reinforcement Learning · Deep Neural Networks · Markov Decision Process · Low Cost · Model-Free
1 Introduction
Reinforcement Learning (RL) has received great attention and can be applied to multiple areas ranging from online learning and recommender engines to natural language understanding and generation [8]. Although several different approaches have been proposed and have achieved success in games [15,19,25], the applications of RL to realistic situations remain limited in the literature. This is mainly due to the cost incurred in real-world physical systems, which conflicts with the long-term optimization policies of RL. Therefore, it is important to design RL algorithms that operate at low cost while retaining efficiency.
A closely related area is constrained Markov Decision Processes (CMDP), which focus on guaranteeing safety requirements. To that end, a CMDP method may sacrifice performance by limiting free exploration throughout training. In CMDP RL problems, an agent interacts with the environment subject to one or more constraints that encode realistic safety requirements while optimizing its standard objective. In most work, the constraints are modeled as a threshold on the expected discounted cumulative costs. Although optimal policies for finite CMDPs with known models can be obtained by linear programming, there are not many results for CMDPs when the model is unknown or the state or action space is large or continuous [6].
A common approach to CMDPs is the Lagrangian method [2,3,7,9,22], which augments the objective function to be optimized with a constraint term. It can be solved by first fixing the Lagrange multiplier and adopting dynamic programming (DP) methods to reach a temporarily optimal policy, then changing the Lagrange multiplier and starting a new round. However, this results in a saddle point problem, which is apt to run into numerical stability issues [12]. More importantly, a Lagrange multiplier adjusted step by step converges too slowly when the state or action space is large or continuous.
Lyapunov functions have also been adopted to study RL in the scenario of CMDP. The research in [16] first studied CMDP based on the notion of a Lyapunov function, by which an agent learns to control a system by switching safely among a number of given base-level controllers in a basically efficient way. This work is followed by [5], which proposed policy and value iteration algorithms and provided related theoretical analyses. Although these methods claim to ensure safety for the whole training process, they can hardly be applied in the scenario of large or continuous state-action spaces due to the lack of freedom for exploration. Moreover, the Lyapunov functions are either costly to construct or highly dependent on the optimal policy, which makes them harder to apply in practice.
In addition, constrained policy optimization (CPO) [1] replaces the objective and constraints with surrogate functions and claims to be the first general-purpose policy search algorithm to train neural network policies while guaranteeing safety throughout training for CMDP. However, this method relies on cost shaping, which makes it challenging to analyze convergence and to extend the method to other RL algorithms.
To avoid these problems and restore free exploration in model-free RL, we apply deep neural networks to surrogate functions of the Lagrange dual form in this paper. We first provide some preliminary background knowledge about CMDP and its Lagrange dual form. Then we parameterize the cost function, i.e., the discounted cumulative cost, with respect to the state-action pair, and propose a surrogate approach to the Lagrange dual form. Third, we separate this surrogate function into two surrogate objectives to update the parameters of the actor and the multiplier respectively, based on which a novel algorithm is proposed to adaptively train neural network policies and organize the training steps without any cost shaping. Theoretical proofs show that the proposed algorithm has the properties of asymptotic convergence and robust low cost. Finally, we provide evaluation results to show that our method can effectively reduce the cost while retaining good performance.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023. K. Arai (Ed.): SAI 2023, LNNS 739, pp. 728–745, 2023. https://doi.org/10.1007/978-3-031-37963-5_50
2 Related Works
The most related work is [4], which also uses state-dependent Lagrangian multipliers and variants of Lagrangian relaxation; however, its problem setting is different from ours. The works in [14,17,24] all belong to the family of safe reinforcement learning with state-wise constraints under the setting of a discrete state space. When it comes to a continuous state space, state-wise constraints can hardly be guaranteed without violation.
3 Background
3.1 CMDP
We start from some general notations of CMDP. Let χ be the continuous compact state space, and A be the action space. The policy π is restricted to the set of Markov stationary policies. Then we define the action-value (Q-value) function as
$$Q_\pi(s,a) = \mathbb{E}_{p_\pi(h|s_0,a_0)}\left[\sum_{t=0}^{\infty}\gamma^t r(s_t,a_t)\,\middle|\, s_0=s,\ a_0=a\right], \tag{1}$$
where $r(\cdot,\cdot)$ is the immediate reward, and $\gamma\in(0,1)$ is the discount factor for future rewards. The action-value function represents the discounted return for the agent that starts from the initial state $s$, executes action $a$, and follows the policy $\pi$ thereafter. Besides, $p_\pi(h|s_0,a_0)$ is the joint probability of an episode $h$ given the initial state $s_0$ and initial action $a_0$, which is given by
$$p_\pi(h|s_0,a_0) = \prod_{t=0}^{\infty} p(s_{t+1}|s_t,a_t)\,\pi(a_{t+1}|s_{t+1}), \tag{2}$$
where $p(s_{t+1}|s_t,a_t)$ is the transition probability for an agent moving from $s_t$ to $s_{t+1}$ after taking action $a_t$. Then we denote the immediate cost as $c(s,a)$, by which the cost function can be defined as
$$C_\pi(s,a) = \mathbb{E}_{p_\pi(h|s_0,a_0)}\left[\sum_{t=0}^{\infty}\gamma^t c(s_t,a_t)\,\middle|\, s_0=s,\ a_0=a\right]. \tag{3}$$
Given an initial state $s^i$, the safety constraint is defined as $\int_{a\in A}\pi(a|s^i)\,C_\pi(s^i,a)\,da \le d_0$, where $d_0\in\mathbb{R}_{\ge 0}$ is the constant threshold for the safety constraint. In general, the standard form of continuous CMDP is similar to that of [5], i.e., to find the optimal policy which satisfies
$$\pi^* = \arg\max_{\pi}\,\{Q_\pi(s) \mid C_\pi(s)\le d_0,\ \forall s\in\chi\}, \tag{4}$$
where $C_\pi(s)=\int_{a\in A}\pi(a|s)\,C_\pi(s,a)\,da$ and $Q_\pi(s)=\int_{a\in A}\pi(a|s)\,Q_\pi(s,a)\,da$. The discounted cumulative cost is chosen to construct the cost function because it follows the recursive Bellman equation, that is
$$C_\pi(s,a) = c(s,a) + \gamma\int_{s'\in\chi} p(s'|s,a)\,C_\pi(s')\,ds', \tag{5}$$
where the same property applies to Qπ (s, a). The proof of Eq. (5) can be found in Appendix A.
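The recursion in Eq. (5) can be illustrated on a single sampled trajectory. The sketch below is only an illustration under simplifying assumptions (one finite episode, so the expectation over transitions collapses to the observed path); the function name `discounted_cost` and the sample costs are hypothetical and not taken from the paper.

```python
def discounted_cost(costs, gamma):
    """Return C_t = sum_k gamma^k * c_{t+k} for every step t of one finite episode.

    Works backwards through the episode using the sample-path form of the
    Bellman recursion: C_t = c_t + gamma * C_{t+1}.
    """
    values = [0.0] * len(costs)
    running = 0.0
    for t in reversed(range(len(costs))):
        running = costs[t] + gamma * running
        values[t] = running
    return values


# Hypothetical immediate costs observed along one episode.
costs = [1.0, 0.5, 0.0, 2.0]
gamma = 0.9

C = discounted_cost(costs, gamma)
# Direct evaluation of the defining sum for the first step, for comparison.
direct = sum(gamma ** k * c for k, c in enumerate(costs))
```

Here `C[0]` and `direct` agree up to floating-point error, which is the sample-path counterpart of the recursive property used to bootstrap the cost network later on.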
3.2 Lagrange Dual Form
The Lagrangian approach can transform the CMDP problem into an equivalent minmax unconstrained problem, i.e., the Lagrange dual form, which can be solved using a dual linear programming method. The cost term is linked with the objective to transform Eq. (4) into an unconstrained dual form, which is given by [22]
$$\pi^* = \arg\min_{\lambda\ge 0}\max_{\pi}\left[Q_\pi(s) - \lambda\,(C_\pi(s)-d_0)\right],\quad \forall s\in\chi, \tag{6}$$
where λ represents the Lagrange multiplier, which trades off the objective optimization against the safety concern. The Lagrange multiplier λ determines the influence of the cost term relative to the action-value function. Specifically, Eq. (6) reduces to an ordinary unconstrained MDP problem when λ = 0. Generally, Eq. (6) needs to be solved by a two-timescale approach: on the faster timescale, the policy π or its parameter is solved from Eq. (6) with λ fixed, while on the slower timescale, λ is slowly increased until the constraint is satisfied without losing the optimal solution to Eq. (6). Due to the potential nonconvexity of the action-value function, the method may lead to a sub-optimal solution, and may even become unstable when convergence is difficult.
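The two-timescale idea can be sketched on a toy problem. The example below is an illustration only, not the method of this paper: it assumes a scalar decision variable `a` in place of a policy, a hypothetical concave objective Q(a) = -(a - 2)^2, and a hypothetical cost C(a) = a with threshold d0 = 1. The inner loop plays the role of the faster timescale and the multiplier update plays the role of the slower one.

```python
def solve_toy_cmdp(d0=1.0, fast_lr=0.1, slow_lr=0.01,
                   outer_steps=2000, inner_steps=20):
    """Two-timescale Lagrangian iteration on a toy constrained problem.

    Maximizes Q(a) = -(a - 2)^2 subject to C(a) = a <= d0 through the
    Lagrangian L(a, lam) = Q(a) - lam * (C(a) - d0).
    """
    a, lam = 0.0, 0.0
    for _ in range(outer_steps):
        # Faster timescale: gradient ascent on a with the multiplier fixed.
        for _ in range(inner_steps):
            grad_a = -2.0 * (a - 2.0) - lam
            a += fast_lr * grad_a
        # Slower timescale: raise lam while the constraint is violated,
        # projecting onto lam >= 0.
        lam = max(0.0, lam + slow_lr * (a - d0))
    return a, lam


a_star, lam_star = solve_toy_cmdp()
```

For this toy problem the iteration settles near a = 1 and λ = 2, i.e., on the constraint boundary. The many outer steps needed even in one dimension hint at why a step-by-step multiplier adjustment becomes slow in large or continuous spaces.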
4 State-Dependent Multiplier
We adopt two deep neural networks to approximate the Q-value function and the cost function, which map the input $(s,a)$ to the action-value and the cost value, respectively. Due to the recursive property of the Bellman equation in Eq. (5), the parameters of the two networks can be updated via bootstrapping, using the current estimates to predict the targets. Since we focus on continuous control, the policy π is also parameterized as a neural network with respect to the state $s$, i.e., $\mu(s)$, which is a mapping from states to actions. With the parameterized policy, the loss functions are respectively given by
$$L_\pi(\omega) = \mathbb{E}_\pi\left[\left(r + \gamma Q(s',\mu(s'|\theta')|\omega') - Q(s,a|\omega)\right)^2\right], \tag{7}$$
$$L_\pi(\Omega) = \mathbb{E}_\pi\left[\left(c + \gamma C(s',\mu(s'|\theta')|\Omega') - C(s,a|\Omega)\right)^2\right], \tag{8}$$
where π represents the behavior policy, which is usually different from the target policy $\mu(s'|\theta')$ mapping $s'$ to the next action $a'$ through a target actor network parameterized by $\theta'$. The behavior policy also generates the action $a$ based on the current state $s$, and then the immediate reward $r$, the immediate cost $c$, and the next state $s'$ are obtained by the agent interacting with the MDP environment. The tuple $(s,a,r,c,s')$ composes a five-tuple transition slot for each interaction. $\omega$ is the parameter of the Q-network to be optimized, which is normally different from the target Q-network parameter $\omega'$. Similarly, $\Omega$ is the parameter of the C-network to be optimized and $\Omega'$ is its target network parameter.
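The two bootstrapped losses share one form, differing only in whether the reward or the cost is the learning signal. The following is a minimal NumPy sketch of that shared form, assuming the network outputs have already been evaluated on a small batch; `td_loss` and all numeric values are hypothetical illustrations rather than the authors' implementation.

```python
import numpy as np


def td_loss(prediction, signal, next_target_value, gamma):
    """Mean squared bootstrapped error shared by Eq. (7) and Eq. (8).

    prediction:        Q(s, a | omega) or C(s, a | Omega) from the online network
    signal:            immediate reward r or immediate cost c
    next_target_value: value of (s', mu(s' | theta')) under the target network
    """
    target = signal + gamma * next_target_value  # treated as a constant target
    return float(np.mean((target - prediction) ** 2))


gamma = 0.9
r = np.array([1.0, 0.0])          # immediate rewards
c = np.array([0.2, 0.5])          # immediate costs
q_pred = np.array([2.0, 1.0])     # online Q-network outputs
q_next = np.array([1.0, 2.0])     # target Q-network outputs at the next state
c_pred = np.array([0.5, 1.0])     # online C-network outputs
c_next = np.array([1.0, 0.0])     # target C-network outputs at the next state

q_loss = td_loss(q_pred, r, q_next, gamma)
c_loss = td_loss(c_pred, c, c_next, gamma)
```

In an automatic-differentiation framework the target term would typically be excluded from the gradient, which is what evaluating it with the separate target parameters achieves.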
Therefore, we propose a surrogate approach to solving Eq. (6) instead of dealing directly with the saddle point problem, in which adjusting the multiplier is inefficient. In the actor-critic method [11], a parameterized actor function $\mu(s|\theta)$ is adopted to specify the current policy. With Eq. (6), the multiplier can be approximated by a state-dependent function $\Lambda(s|\lambda)$, and the surrogate objective is given by
$$J(\theta,\lambda) = \mathbb{E}_s\left[Q(s,\mu(s|\theta)|\omega) - \Lambda(s|\lambda)\left(C(s,\mu(s|\theta)|\Omega) - d_0\right)\right], \tag{9}$$
where the parameters θ and λ can be optimized by maximizing Eq. (9). Directly implementing bootstrapping with neural networks in Eq. (7) and Eq. (8) has proved to be unstable in many environments [13]. Since the parameters of the critic network $\omega$, cost network $\Omega$ and actor network $\theta$ are implemented with “soft” target updates [13], the multiplier parameter $\lambda$ can also have a “soft” target backup $\lambda'$ to improve stability.
5 Low-Cost Deep Deterministic Policy Gradient Method
In deep Q-network (DQN) [23], a replay buffer is used to store the transitions of a rollout. At each time step, these transitions are uniformly sampled as mini-batches to update the networks. For statistical computation, we can substitute sampled mini-batches into the loss functions, which are given by
$$L(\omega) = \frac{1}{N}\sum_{n=1}^{N}\left(r_n + \gamma Q(s'_n,a'_n|\omega') - Q(s_n,a_n|\omega)\right)^2, \tag{10}$$
$$L(\Omega) = \frac{1}{N}\sum_{n=1}^{N}\left(c_n + \gamma C(s'_n,a'_n|\Omega') - C(s_n,a_n|\Omega)\right)^2, \tag{11}$$
where $a'_n = \mu(s'_n|\theta')$ is the action taken from the target actor network, and $(s_n,a_n,r_n,c_n,s'_n)$ is the $n$-th tuple of the mini-batch drawn from the replay buffer. In the surrogate function Eq. (9), λ is parameterized to be updated together with θ in one surrogate. We can separate the multiplier from Eq. (9) so that it is approximated independently. Besides, it can have a target network to improve the stability by applying the “soft” target updates. Then two surrogate objectives separated from Eq. (9), for the update of the actor and the multiplier respectively, can be obtained as
$$J(\theta) = \frac{1}{N}\sum_{n=1}^{N}\left[Q(s_n,\mu(s_n|\theta)|\omega) - \Lambda(s_n|\lambda')\,\delta_1\right], \tag{12}$$
$$J(\lambda) = \frac{1}{N}\sum_{n=1}^{N}\left[Q(s_n,\mu(s_n|\theta')|\omega') - \Lambda(s_n|\lambda)\,\delta_2\right], \tag{13}$$
where $\delta_1 = C(s_n,\mu(s_n|\theta)|\Omega) - d_0$ and $\delta_2 = C(s_n,\mu(s_n|\theta')|\Omega') - d_0$. $\lambda'$ is the parameter of the target multiplier, which can be updated from $\lambda$ by the “soft” target updates. The surrogate objective for updating $\lambda$ in Eq. (13) adopts the target critic, cost and actor networks, and the objective function for updating $\theta$ in Eq. (12) adopts the target multiplier. It is worth noting that maximizing Eq. (12) and then minimizing Eq. (13) is consistent with Eq. (6). Then $(\omega',\Omega',\theta',\lambda')$ are soft updated by $(\omega,\Omega,\theta,\lambda)$, in the way of
$$\begin{aligned}
\omega' &\leftarrow \tau \arg\min_{\omega} L(\omega) + (1-\tau)\,\omega',\\
\Omega' &\leftarrow \tau \arg\min_{\Omega} L(\Omega) + (1-\tau)\,\Omega',\\
\theta' &\leftarrow \tau \arg\max_{\theta} J(\theta) + (1-\tau)\,\theta',\\
\lambda' &\leftarrow \tau \arg\min_{\lambda} J(\lambda) + (1-\tau)\,\lambda',
\end{aligned} \tag{14}$$
where $\tau < 1$ limits the adjustment speed of target values to control the stability of learning. According to the above analyses, we propose the low-cost deep deterministic policy gradient (LC-DDPG or CDDPG) algorithm. We start the algorithm from the agent’s interaction with an environment through a rollout of observations, actions, rewards and costs. The task of the agent is to choose actions that maximize the discounted return while reducing the cost. An episode terminates when the agent reaches the goal or encounters the timeout. The transition slots $(s,a,r,c,s')$ collected during the rollout are stored in the memory and randomly sampled as mini-batches for updating the network parameters following Eq. (18), Eq. (19), Eq. (12) and Eq. (13). The pseudocode of LC-DDPG is organized as Algorithm 1.
Theorem 1. The LC-DDPG algorithm asymptotically converges as the iteration i → ∞ with a properly chosen learning rate.
Theorem 2. The LC-DDPG algorithm can robustly keep low cost, i.e., the cost will return below the threshold even if the neural networks are wrongly trained.
The proofs of Theorem 1 and Theorem 2 can be found in Appendix B and C, respectively. It should be noted that this low-cost training method can also be applied in other algorithms like the soft actor-critic (SAC) [10].
Algorithm 1. LC-DDPG Algorithm
1: Input: The mini-batch size N, the update maximum M, the timeout step T, the soft update parameter τ, and the constraint threshold d0.
2: Initialization: Initialize parameters (ω, Ω, θ, λ) ← (ω0, Ω0, θ0, λ0) and (ω', Ω', θ', λ') ← (ω0, Ω0, θ0, λ0) randomly; initialize replay buffer R and the counter i ← 0.
3: while i < M do
4:     Reset randomly the initial state s_1.
5:     for t = 1, ..., T do
6:         Select action a_t according to the current behavior policy, i.e., μ(s_t|θ_i) plus exploration noise;
7:         Execute action a_t, get next state s_{t+1}, immediate reward r_t and immediate cost c_t;
8:         Store transition (s_t, a_t, r_t, c_t, Λ_t, s_{t+1}) in R;
9:         if R is full then
10:            Randomly and uniformly sample the slots (s_i, a_i, r_i, c_i, Λ_i, s_{i+1}) from R;
11:            Maximize the expected return shown in Eq. (12), and then update θ_i;
12:            Minimize the alternative expected return shown in Eq. (13), and then update λ_i;
13:            Minimize the Q loss function shown in Eq. (18), and then update ω_i;
14:            Minimize the cost loss function shown in Eq. (19), and then update Ω_i;
15:            Execute the “soft” target updates shown in Eq. (14) to update θ'_i, ω'_i, Ω'_i and λ'_i;
16:            i ← i + 1; s_t ← s_{t+1};
17:        end if
18:    end for
19: end while
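Two building blocks of Algorithm 1 are simple enough to sketch in isolation: the batch surrogate shared by Eq. (12) and Eq. (13), and the “soft” target update of Eq. (14). The NumPy code below is a hedged illustration that assumes network outputs are already available as arrays and that parameters are plain lists of arrays; the helper names `surrogate` and `soft_update` and the sample numbers are hypothetical.

```python
import numpy as np


def surrogate(q_values, multipliers, costs, d0):
    """Batch mean of Q - Lambda * (C - d0), the common form of Eqs. (12)/(13).

    For the actor objective, `multipliers` would come from the target
    multiplier network; for the multiplier objective, `q_values` and `costs`
    would come from the target critic, cost and actor networks.
    """
    return float(np.mean(q_values - multipliers * (costs - d0)))


def soft_update(target_params, online_params, tau):
    """Move each target parameter a fraction tau toward its online counterpart."""
    return [tau * p + (1.0 - tau) * p_t
            for p, p_t in zip(online_params, target_params)]


# Hypothetical batch of two states.
q = np.array([1.0, 2.0])      # Q(s_n, mu(s_n))
lam = np.array([0.5, 0.0])    # Lambda(s_n), nonnegative
c = np.array([3.0, 1.0])      # C(s_n, mu(s_n))
j_value = surrogate(q, lam, c, d0=1.0)

# Hypothetical single-tensor parameter sets.
target = [np.zeros(2)]
online = [np.ones(2)]
new_target = soft_update(target, online, tau=0.01)
```

With τ = 0.01, the value listed for LC-DDPG in Table 1, each target parameter moves only one percent of the way toward the online parameter per update, which is what damps the bootstrapping instability mentioned in Sect. 4.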
6 Experiments
Under the setting of our experiments, the average reward is the mixed result of reward shaping and policy, and a higher average reward indicates a better learnt policy. The discounted cumulative cost is subject to the threshold, and a lower cost is preferred once the convergence performance is optimal. The hyperparameters of the proposed algorithm and the baselines are listed in Table 1 of Appendix D. The average reward means the cumulative reward averaged over the number of steps consumed in one episode, and the average cost represents the cumulative cost without discount during the episode. An evaluation procedure, which observes the episode from the start point to the goal and records the reward and cost at every step, is launched every 500 update periods. The results of 100 episodes are then averaged for each evaluation procedure to improve accuracy.
6.1 Continuous Maze
We employ the continuous Maze to evaluate the performance of the LC-DDPG algorithm against the baselines deep deterministic policy gradient (DDPG), twin delayed deep deterministic policy gradient (TD3), SAC and proximal policy optimization (PPO) [18], based on the same 10 seeds for a fair comparison. Table 1 in Appendix D shows the hyperparameters required for the continuous Maze experiment, and the detailed experiment setting can be found in Appendix E.1.
Fig. 1. (a) Average Reward in Continuous Maze of Size 2; (b) Average Cost in Continuous Maze of Size 2 (c) Average Reward in Continuous Maze of Size 3; (d) Average Cost in Continuous Maze of Size 3 Versus Update Periods.
Figure 1 illustrates the average rewards and costs of the size-2 and size-3 mazes versus update periods. From Fig. 1(a) and Fig. 1(b), we can see that LC-DDPG converges faster than DDPG and TD3 while having the lowest average cost in the maze of size 2. In Fig. 1(c) and Fig. 1(d), LC-DDPG still has more stable convergence and lower cost than DDPG and TD3 in the maze of size 3. In general, a constraint on cost has a negative effect on convergence. However, LC-DDPG is more efficient in this experiment because the cost function works as an indicator for the agent to keep a safe distance from the obstacles and avoid some unwanted actions.
6.2 Gym Environment
We employ DDPG, TD3, SAC and PPO as baselines on a series of Gym benchmarks, namely Cartpole, Acrobot, Pendulum and Halfcheetah-v3, to compare with LC-DDPG. The settings of these experiments are available at Gym¹, and the details of the modified rewards and the setting of costs can be found in Table 2 in Appendix E.2. The results are evaluated at a constant interval (e.g., every 500 update periods in Cartpole), and each evaluation procedure includes fixed repetitions to average the results. Results of the selected algorithms are based on the same 10 seeds to ensure fairness. The applied hyperparameters for these experiments are listed in Table 1 of Appendix D. In Fig. 2(a), PPO has the most stable performance with nearly no oscillation, although its advantage is achieved by consuming many more samples. The stability of LC-DDPG is no worse than that of the other baselines, while its cost to achieve such performance is much lower, as shown in Fig. 2(b), where LC-DDPG has an overwhelming advantage over the other baselines in reducing cost. In Fig. 3(a) and Fig. 3(b), LC-DDPG also has faster and more stable convergence, at lower cost than the other baselines. A similar conclusion can be drawn from Fig. 4(a) and Fig. 4(b). In Fig. 5(a) and Fig. 5(b), LC-DDPG trades the lowest cost for intermediate performance between DDPG and TD3.
Fig. 2. (a) Average Reward; (b) Average Cost Versus Update Periods in Cartpole.
Fig. 3. (a) Average Reward; (b) Average Cost Versus Update Periods in Acrobot.
Fig. 4. (a) Average Reward; (b) Average Cost Versus Update Periods in Pendulum.
Fig. 5. (a) Average Reward; (b) Average Cost Versus Update Periods in Halfcheetah-v3.
¹ https://github.com/openai/gym/tree/master/gym/envs/classic_control.
7 Conclusion
In this paper, we proposed a novel method that adopts deep neural networks to tackle low-cost RL problems. The proposed LC-DDPG algorithm has the properties of asymptotic convergence and robust low cost. Evaluation results show that our proposed algorithm hardly degrades the performance while greatly reducing the cost during training. The details of the network architectures can be found in Appendix F.
Appendix
A Proof of Eq. (5)
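Proof. For $s\in\chi$,
$$\begin{aligned}
C_\pi(s,a) &= \mathbb{E}_{p_\pi(h|s_0,a_0)}\left[\sum_{t=0}^{\infty}\gamma^t c(s_t,a_t)\,\middle|\, s_0=s,\ a_0=a\right]\\
&= \mathbb{E}_{p(s'|s,a)}\left\{c(s,a) + \mathbb{E}_{p_\pi(h_1^\infty|s_1)}\left[\sum_{t=1}^{\infty}\gamma^t c(s_t,a_t)\,\middle|\, s_1=s'\right]\right\}\\
&= \mathbb{E}_{p(s'|s,a)}\left\{c(s,a) + \gamma\,\mathbb{E}_{p_\pi(h|s_0,a_0)}\left[\sum_{t=0}^{\infty}\gamma^t c(s_t,a_t)\,\middle|\, s_0=s'\right]\right\}\\
&= c(s,a) + \gamma\,\mathbb{E}_{p(s'|s,a)}\left[C_\pi(s')\right],
\end{aligned} \tag{15}$$
where
$$p_\pi(h_1^\infty|s_1) = \prod_{t=1}^{\infty}\pi(a_t|s_t)\,p(s_{t+1}|s_t,a_t). \tag{16}$$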
B Proof of Theorem 1
Proof. This proof is based on Lemma 1 of [20], which is reproduced below for convenience.
Lemma 1 of [20]: Consider a stochastic process $(\alpha_t,\Delta_t,F_t)$, $t\ge 0$, where $\alpha_t,\Delta_t,F_t: X\to\mathbb{R}$, which satisfies the equations
$$\Delta_{t+1}(x) = (1-\alpha_t(x))\,\Delta_t(x) + \alpha_t(x)\,F_t(x),\quad x\in X,\ t=0,1,2,\cdots \tag{17}$$
Let $P_t$ be a sequence of increasing σ-fields such that $\alpha_0,\Delta_0$ are $P_0$-measurable and $\alpha_t$, $\Delta_t$ and $F_{t-1}$ are $P_t$-measurable, $t=1,2,\cdots$ Assume that the following hold:
1. the set of possible states $X$ is fixed.
2. $0\le\alpha_t(x)\le 1$, $\sum_t\alpha_t(x)=\infty$, $\sum_t\alpha_t^2(x)<\infty$ w.p.1.
3. $\mathbb{E}\{F_t(\cdot)|P_t\}\le\gamma\Delta_t + c_t$, where $\gamma\in[0,1)$ and $c_t$ converges to zero w.p.1.
4. $\mathrm{Var}\{F_t(x)|P_t\}\le K(1+\Delta_t)^2$, where $K$ is some constant.
Then $\Delta_t$ converges to zero with probability one (w.p.1).
Within the scope of this paper, the MDP state space is fixed, satisfying condition 1 in Lemma 1 of [20], and condition 2 of the lemma holds by proper selection of the learning rate. According to [21], even the commonly used constant learning rate can make algorithms converge in distribution.
$$L(\omega) = \frac{1}{N}\sum_{n=1}^{N}\left(r_n + \gamma Q(s'_n,a'_n|\omega') - Q(s_n,a_n|\omega)\right)^2, \tag{18}$$
$$L(\Omega) = \frac{1}{N}\sum_{n=1}^{N}\left(c_n + \gamma C(s'_n,a'_n|\Omega') - C(s_n,a_n|\Omega)\right)^2. \tag{19}$$
We apply Lemma 1 of [20] with $P_t = \{Q(\cdot,\cdot|\omega_0), C(\cdot,\cdot|\Omega_0), s_0, a_0, r_1, s_1, \cdots, s_t, a_t\}$. Following the update rule for optimizing Eq. (18) and Eq. (19) and using the current policy to produce the action $a_{t+1}=\mu(s_{t+1}|\theta)$, we have
$$Q(s_t,a_t|\omega_{t+1}) = (1-\alpha_t)\,Q(s_t,a_t|\omega_t) + \alpha_t\left[r_t + \gamma Q(s_{t+1},a_{t+1}|\omega_t)\right], \tag{20}$$
$$C(s_t,a_t|\Omega_{t+1}) = (1-\alpha_t)\,C(s_t,a_t|\Omega_t) + \alpha_t\left[c_t + \gamma C(s_{t+1},a_{t+1}|\Omega_t)\right]. \tag{21}$$
Define the costed action-value function as
$$\hat{Q}(s,a|\omega,\Omega,\lambda) = Q(s,a|\omega) - \Lambda(s|\lambda)\left(C(s,a|\Omega) - d_0\right). \tag{22}$$
Using the definition of Eq. (22) under the setting of our proposed algorithm, we denote $\Delta_t = \hat{Q}(\cdot,\cdot|\omega_t,\Omega_t,\lambda) - \hat{Q}^*(\cdot,\cdot)$, which is the difference between the penalized action-value function and the optimal penalized value function. Then we have
$$\begin{aligned}
\Delta_{t+1}(s_t,a_t) &= \hat{Q}(s_t,a_t|\omega_{t+1},\Omega_{t+1},\lambda) - \hat{Q}^*(s_t,a_t)\\
&= Q(s_t,a_t|\omega_{t+1}) - \Lambda(s_t|\lambda)\left(C(s_t,a_t|\Omega_{t+1}) - d_0\right) - \hat{Q}^*(s_t,a_t)\\
&= (1-\alpha_t)\left[\hat{Q}(s_t,a_t|\omega_t,\Omega_t,\lambda) - \hat{Q}^*(s_t,a_t)\right] + \alpha_t F_t\\
&= (1-\alpha_t)\,\Delta_t(s_t,a_t) + \alpha_t F_t(s_t,a_t),
\end{aligned} \tag{23}$$
where the third equality is due to the substitution of Eq. (20) and Eq. (21), and
$$F_t(s_t,a_t) = r_t + \gamma\,\hat{Q}(s_{t+1},a_{t+1}|\omega_t,\Omega_t,\lambda) - \hat{Q}^*(s_t,a_t). \tag{24}$$
Since the reward is bounded within the scope of this paper, the action-values are also bounded, so condition 4 in Lemma 1 of [20] holds. According to the proof of Theorem 2 of [20], we have $\mathbb{E}[F_t(s_t,a_t)|P_t]\le\gamma\Delta_t$, which satisfies condition 3 in Lemma 1 of [20]. Finally, it can be concluded that $\hat{Q}(\cdot,\cdot|\omega_t,\Omega_t,\lambda)$ converges to $\hat{Q}^*(\cdot,\cdot)$ with probability 1.
C Proof of Theorem 2
Proof. We rewrite (12) and (13) in expected forms as
$$J(\theta) = \mathbb{E}_s\left[Q(s,\mu(s|\theta)|\omega) - \Lambda(s|\lambda')\,\delta_1(s|\theta)\right],$$
$$J(\lambda) = \mathbb{E}_s\left[Q(s,\mu(s|\theta')|\omega') - \Lambda(s|\lambda)\,\delta_2(s|\theta')\right].$$
We assume the worst case starting from the $i$-th iteration, when the expected cost term is nonnegative, i.e., $\mathbb{E}_s[\delta_2(s|\theta'_i)]\ge 0$. Then the expected weighted cost term $\mathbb{E}_s[\Lambda(s|\lambda_i)\,\delta_2(s|\theta'_i)]\ge 0$ since the multiplier is nonnegative as well. According to the maximization of (13), $\Lambda(s|\lambda_i)$ is unbounded above with respect to $\lambda_i$ when the expected weighted cost term is nonnegative, which means $\exists\lambda_{i+1}$ so that $J(\theta)\le 0$, $\forall\theta$. Here we have made a simple premise of $\lambda'_{i+1}=\lambda_{i+1}$, which can be realized by setting the soft update parameter $\tau_\lambda$ as 1. Under the above circumstance, for the maximization of $J(\theta_{i+1})$, there must be $\mathbb{E}_s[\Lambda(s|\lambda'_{i+1})\,\delta_1(s|\theta_{i+1})]\le 0$. With the premise of $\theta'_{i+1}=\theta_{i+1}$, we can conclude $\max_s\Lambda(s|\lambda_{i+2})\,\mathbb{E}_s[\delta_2(s|\theta'_{i+1})]\le\mathbb{E}_s[\Lambda(s|\lambda_{i+2})\,\delta_2(s|\theta'_{i+1})]\le 0$, i.e., $\mathbb{E}_s[\delta_2(s|\theta'_{i+1})]\le 0$, which means the constraint requirement is recovered.
D Hyperparameters
Table 1 lists the common hyperparameters shared by all experiments and their respective settings.
Table 1. List of Hyperparameters.
Shared Env | Value | Description | Algorithm applied
LR a | 0.001 | Learning rate of actor | DDPG, SAC, LC-DDPG, SAC, TD3
LR a | 0.0001 | Learning rate of actor | PPO
LR c | 0.001 | Learning rate of critic | DDPG, SAC, LC-DDPG
LR c | 0.0004 | Learning rate of critic | PPO
LR c1 | 0.001 | Learning rate of critic1 | TD3
LR c2 | 0.001 | Learning rate of critic2 | TD3
LR p | 0.001 | Learning rate of cost | LC-DDPG
LR λ | 0.001 | Learning rate of multiplier | LC-DDPG
τ a | 0.01 | Soft update parameter of actor | DDPG, SAC, LC-DDPG, SAC, TD3
τ a | 0.0001 | Soft update parameter of actor | PPO
τ c | 0.01 | Soft update parameter of critic | DDPG, SAC, LC-DDPG
τ c | 0.0001 | Soft update parameter of critic | PPO
τ c1 | 0.01 | Soft update parameter of critic1 | TD3
τ c2 | 0.01 | Soft update parameter of critic2 | TD3
τ p | 0.01 | Soft update parameter of cost | LC-DDPG
τ λ | 0.01 | Soft update parameter of multiplier | LC-DDPG
γ | 0.9 | Discount horizon factor | DDPG, SAC, LC-DDPG, SAC, TD3, PPO
γp | 0.95 | Discount horizon factor for cost | DDPG, SAC, LC-DDPG, SAC, TD3, PPO
Interval | 500 | Eval period | DDPG, SAC, LC-DDPG, SAC, TD3, PPO
Test | 100 | Episodes per eval period | DDPG, SAC, LC-DDPG, SAC, TD3, PPO
Var dr | 0.9995 | Exploration variance decay rate | DDPG, SAC, LC-DDPG, SAC, TD3
Batch | 200 | Size of each mini-batch | DDPG, SAC, LC-DDPG, SAC, TD3, PPO
Actor size | 10 | Number of parallel actors | PPO
Bias | 10^−6 | The positive bias added to normal variance | PPO
Batch iter | 100 | Iterations of mini-batches | PPO
ε | 0.2 | Clipping parameter | PPO

Cartpole | Value | Description | Algorithm applied
Max EPS | 500 | Maximal steps per episode training | DDPG, SAC, LC-DDPG, SAC, TD3, PPO
Runout | 1000 | Maximal steps per episode eval | DDPG, SAC, LC-DDPG, SAC, TD3, PPO
Var i | 10.0 | Initial exploration variance | DDPG, SAC, LC-DDPG, SAC, TD3
Memory | 50000 | Size of replay buffer | DDPG, SAC, LC-DDPG, SAC, TD3
Train num | 100000 | Updating iterations (Not episodes) | DDPG, SAC, LC-DDPG, SAC, TD3, PPO

Acrobot | Value | Description | Algorithm applied
Max EPS | 500 | Maximal steps per episode training | DDPG, SAC, LC-DDPG, SAC, TD3
Runout | 500 | Maximal steps per episode eval | DDPG, SAC, LC-DDPG, SAC, TD3
Var i | 10.0 | Initial exploration variance | DDPG, SAC, LC-DDPG, SAC, TD3
Memory | 10000 | Size of replay buffer | DDPG, SAC, LC-DDPG, SAC, TD3
Train num | 800000 | Updating iterations (Not episodes) | DDPG, SAC, LC-DDPG, SAC, TD3

Maze | Value | Description | Algorithm applied
Max EPS | 500 | Maximal steps per episode training | DDPG, SAC, LC-DDPG, SAC, TD3
Runout | 100 | Maximal steps per episode eval | DDPG, SAC, LC-DDPG, SAC, TD3
Var i | 10.0 | Initial exploration variance | DDPG, SAC, LC-DDPG, SAC, TD3
Memory | 10000 | Size of replay buffer | DDPG, SAC, LC-DDPG, SAC, TD3
Train num | 800000 | Updating iterations (Not episodes) | DDPG, SAC, LC-DDPG, SAC, TD3

Pendulum | Value | Description | Algorithm applied
Max EPS | 200 | Maximal steps per episode training | DDPG, SAC, LC-DDPG, SAC, TD3, PPO
Runout | 100 | Maximal steps per episode eval | DDPG, SAC, LC-DDPG, SAC, TD3, PPO
Var i | 10.0 | Initial exploration variance | DDPG, SAC, LC-DDPG, SAC, TD3
Memory | 30000 | Size of replay buffer | DDPG, SAC, LC-DDPG, SAC, TD3
Train num | 800000 | Updating iterations (Not episodes) | DDPG, SAC, LC-DDPG, SAC, TD3, PPO
E Experiment Details
E.1 Continuous Maze
The continuous maze filled with obstacles is suitable for this evaluation. During the training process, the agent may travel through the obstacles in order to reach the goal. However, the agent can asymptotically learn to keep a safe distance from the obstacles by receiving cost signals as warnings from the environment. The environment of the continuous maze problem has a continuous state-action space, which is shown in Fig. 6(a) and Fig. 6(b). At every step, the agent can move in any direction with a fixed step size to any possible position in the maze, even traveling across the barriers represented by the gray grids. The wall is drawn as the dark solid line on the edge of the maze. The task of the agent is to move from the starting point to the goal colored yellow without hitting any barrier or wall. During the task, the agent receives a reward of −1000 if it hits or goes through a barrier, and −50 if it hits the wall. A reward of 100 is assigned to the agent if it arrives at the goal within the timeout. In other blank areas, the reward is set to the negative distance from the agent to the goal. At every step where the received reward is nonpositive, the agent is given a cost of $\frac{1}{1+d}$, where $d$ represents the minimum distance of the agent from any barrier or wall.
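The maze reward and cost rules described above can be written compactly. The sketch below is an assumption-laden illustration: the function names, the boolean event flags, the Euclidean distance metric, the priority order when several events coincide, and the zero cost on positive-reward steps are all choices made here for concreteness and are not stated in the paper.

```python
import math


def maze_reward(hit_barrier, hit_wall, at_goal, position, goal):
    """Reward for one maze step, following the rules of Appendix E.1.

    The order of the checks is an assumption for the case where several
    events happen in the same step.
    """
    if hit_barrier:
        return -1000.0
    if hit_wall:
        return -50.0
    if at_goal:
        return 100.0
    # Blank area: negative (assumed Euclidean) distance to the goal.
    return -math.dist(position, goal)


def maze_cost(reward, min_obstacle_distance):
    """Cost 1 / (1 + d) on steps with nonpositive reward, otherwise 0 (assumed)."""
    if reward <= 0.0:
        return 1.0 / (1.0 + min_obstacle_distance)
    return 0.0


step_reward = maze_reward(False, False, False, (0.0, 0.0), (3.0, 4.0))
step_cost = maze_cost(step_reward, min_obstacle_distance=1.0)
goal_cost = maze_cost(100.0, min_obstacle_distance=1.0)
```

The cost grows toward 1 as the agent approaches a barrier or wall and decays as it moves away, which is how the cost signal acts as a proximity warning.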
Fig. 6. (a) Continuous Maze with 1 Array of Barriers (Size 2); (b) Continuous Maze with 2 Arrays of Barriers (Size 3); (c) Cartpole Environment.
E.2 Classical Control Environment
Among these, the cartpole environment is illustrated in Fig. 6(c). We add the absolute value of the distance into the rewards to encourage the cart to move farther instead of staying at the origin for stability. The setting of this experiment is as follows. At every step, when the cart moves randomly towards the left or right, a reward of 1 plus a value positively proportional to the position and velocity of the cart is given if the deviation angle of the pole is less than 24°. Otherwise, the position of the cart and the angle of the pole are reset for the next episode. The goal of the experiment is to keep the environment from resetting until the timeout, which is set as 1000 steps. Overall, the cost of each experiment is set proportional to the absolute value of the action, which determines the torque applied by the agent. Accordingly, the influence of the action is excluded from the reward shaping. The settings of rewards and costs are listed in Table 2.
Table 2. Reward Shaping of Continuous Control Experiments.
Env | Reward | Cost | Explanation
Continuous Cartpole | (cos θ)^3 + 0.01|x| | 0.1|a| | θ is the pole angle, x is the cart position, a is the action
Acrobot | −1 if not terminal else 100 | 0.01|a| | a is the action
Pendulum | θ^2 + 0.1 θ̇^2 | 0.1|a| | θ is the pendulum angle, θ̇ is the angular velocity, a is the action
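The entries of Table 2 translate directly into small functions. The sketch below transcribes the tabulated expressions as they appear, with angles assumed to be in radians and scalar actions; the function names are hypothetical, and no sign or scaling beyond what the table shows is assumed.

```python
import math


def cartpole_reward(theta, x):
    """Continuous Cartpole reward from Table 2: (cos theta)^3 + 0.01 * |x|."""
    return math.cos(theta) ** 3 + 0.01 * abs(x)


def acrobot_reward(terminal):
    """Acrobot reward from Table 2: -1 on every non-terminal step, 100 at termination."""
    return 100.0 if terminal else -1.0


def pendulum_reward(theta, theta_dot):
    """Pendulum reward expression as tabulated: theta^2 + 0.1 * theta_dot^2."""
    return theta ** 2 + 0.1 * theta_dot ** 2


def action_cost(action, scale):
    """Cost proportional to |a|; scale is 0.1 for Cartpole and Pendulum, 0.01 for Acrobot."""
    return scale * abs(action)


upright_reward = cartpole_reward(theta=0.0, x=2.0)
swing_reward = pendulum_reward(theta=1.0, theta_dot=2.0)
torque_cost = action_cost(-2.0, scale=0.1)
```

Keeping the action magnitude out of the reward and placing it only in the cost matches the statement above that the influence of the action is excluded from the reward shaping.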
F Network Architecture
We construct the critic network using a fully-connected MLP with two hidden layers. The input is composed of the state and action, and the output is a value representing the Q-value. The ReLU function is adopted to activate the first hidden layer. The setting of the actor network is similar to that of the critic network, except that the input is the state and the output is multiplied by the action supremum after the tanh nonlinearity. The multiplier network Λ is constructed similarly to the actor network, except that the tanh nonlinearity is replaced by clipping Λ to [0, ∞). The network architectures are plotted in Fig. 7.
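The three output heads described above can be sketched with plain NumPy forward passes. This is a hedged illustration, not the authors' code: the hidden width, the weight initialization, the ReLU on the second hidden layer (the text only states it for the first), and every function name are assumptions introduced here.

```python
import numpy as np


def init_params(sizes, rng):
    """Create (weight, bias) pairs for consecutive layer sizes (assumed small Gaussian init)."""
    return [(rng.normal(0.0, 0.1, (m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]


def forward(x, params):
    """Two hidden layers with ReLU (second ReLU is an assumption), linear output."""
    *hidden, (w_out, b_out) = params
    for w, b in hidden:
        x = np.maximum(0.0, x @ w + b)
    return x @ w_out + b_out


def critic(state, action, params):
    """Q(s, a): the input is the concatenated state and action."""
    return forward(np.concatenate([state, action], axis=-1), params)


def actor(state, params, action_sup):
    """mu(s): tanh output scaled by the action supremum."""
    return action_sup * np.tanh(forward(state, params))


def multiplier(state, params):
    """Lambda(s): output clipped to [0, inf) instead of tanh."""
    return np.maximum(0.0, forward(state, params))


rng = np.random.default_rng(0)
state_dim, action_dim, hidden_dim = 3, 1, 32   # hypothetical sizes
states = rng.normal(size=(4, state_dim))

actor_params = init_params([state_dim, hidden_dim, hidden_dim, action_dim], rng)
critic_params = init_params([state_dim + action_dim, hidden_dim, hidden_dim, 1], rng)
mult_params = init_params([state_dim, hidden_dim, hidden_dim, 1], rng)

actions = actor(states, actor_params, action_sup=2.0)
q_values = critic(states, actions, critic_params)
lambdas = multiplier(states, mult_params)
```

The clipping in `multiplier` keeps Λ(s|λ) nonnegative for every state, which is the property the proof of Theorem 2 relies on.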
Fig. 7. Architecture of Deterministic Networks.
References
1. Achiam, J., Held, D., Tamar, A., Abbeel, P.: Constrained policy optimization. In: Proceedings of the 34th International Conference on Machine Learning, vol. 70, pp. 22–31. JMLR.org (2017)
2. Altman, E.: Constrained Markov decision processes with total cost criteria: Lagrangian approach and dual linear program. Math. Methods Oper. Res. 48(3), 387–417 (1998)
3. Altman, E.: Constrained Markov Decision Processes, vol. 7. CRC Press, Boca Raton (1999)
4. Bohez, S., Abdolmaleki, A., Neunert, M., Buchli, J., Heess, N., Hadsell, R.: Value constrained model-free continuous control. arXiv preprint arXiv:1902.04623 (2019)
5. Chow, Y., Nachum, O., Duenez-Guzman, E., Ghavamzadeh, M.: A Lyapunov-based approach to safe reinforcement learning. In: Advances in Neural Information Processing Systems, pp. 8103–8112 (2018)
6. Chow, Y., Nachum, O., Faust, A., Ghavamzadeh, M., Duenez-Guzman, E.: Lyapunov-based safe policy optimization for continuous control. arXiv preprint arXiv:1901.10031 (2019)
7. Dalal, G., Dvijotham, K., Vecerik, M., Hester, T., Paduraru, C., Tassa, Y.: Safe exploration in continuous action spaces. arXiv preprint arXiv:1801.08757 (2018)
8. Gattami, A.: Reinforcement learning for multi-objective and constrained Markov decision processes. arXiv preprint arXiv:1901.08978 (2019)
9. Geibel, P., Wysotzki, F.: Risk-sensitive reinforcement learning applied to control under constraints. J. Artif. Intell. Res. 24, 81–108 (2005)
10. Haarnoja, T., Zhou, A., Abbeel, P., Levine, S.: Soft actor-critic: off-policy maximum entropy deep reinforcement learning with a stochastic actor. In: Proceedings of the 35th International Conference on Machine Learning, Stockholm, Sweden, PMLR 80, pp. 1861–1870 (2018)
11. Konda, V.R., Tsitsiklis, J.N.: Actor-critic algorithms. In: Advances in Neural Information Processing Systems, pp. 1008–1014 (2000)
12. Lee, J.D., Panageas, I., Piliouras, G., Simchowitz, M., Jordan, M.I., Recht, B.: First-order methods almost always avoid saddle points. arXiv preprint arXiv:1710.07406 (2017)
13. Lillicrap, T.P., et al.: Continuous control with deep reinforcement learning. arXiv preprint arXiv:1509.02971 (2015)
14. Liu, T., Zhou, R., Kalathil, D., Kumar, P., Tian, C.: Learning policies with zero or bounded constraint violation for constrained MDPs. Adv. Neural. Inf. Process. Syst. 34, 17183–17193 (2021)
15. Mnih, V., et al.: Human-level control through deep reinforcement learning. Nature 518(7540), 529–533 (2015)
16. Perkins, T.J., Barto, A.G.: Lyapunov design for safe reinforcement learning. J. Mach. Learn. Res. 3(December), 803–832 (2002)
17. Satija, H., Amortila, P., Pineau, J.: Constrained Markov decision processes via backward value functions. In: International Conference on Machine Learning, pp. 8502–8511. PMLR (2020)
18. Schulman, J., Wolski, F., Dhariwal, P., Radford, A., Klimov, O.: Proximal policy optimization algorithms. arXiv preprint arXiv:1707.06347 (2017)
19. Silver, D., et al.: Mastering the game of go without human knowledge. Nature 550(7676), 354 (2017)
20. Singh, S., Jaakkola, T., Littman, M.I.: Convergence results for single-step on-policy reinforcement-learning algorithms. Mach. Learn. 38(3), 287–308 (2000)
21. Singh, S., Jaakkola, T., Littman, M.I.: Convergence results for single-step on-policy reinforcement-learning algorithms. Mach. Learn. 38(3), 287–308 (2000)
22. Tessler, C., Mankowitz, D.J., Mannor, S.: Reward constrained policy optimization. arXiv preprint arXiv:1805.11074 (2018)
23. Van Hasselt, H., Guez, A., Silver, D.: Deep reinforcement learning with double q-learning. In: Thirtieth AAAI Conference on Artificial Intelligence (2016)
24. Wachi, A., Sui, A.: Safe reinforcement learning in constrained Markov decision processes. In: International Conference on Machine Learning, pp. 9797–9806. PMLR (2020)
25. Wiering, M., Van Otterlo, M.: Reinforcement learning. Adapt. Learn. Optim. 12, 3 (2012)
A General Unbiased Training Framework for Deep Reinforcement Learning
Huihui Zhang(B), Xu Han, Yanlong Cheng, and Cong Yan
Dongsheng Intelligent Technology Co., Ltd., Suzhou, China
[email protected]
Abstract. Deep reinforcement learning (DRL) combines deep neural networks with reinforcement learning, enabling agents to learn the best actions in a virtual environment in order to attain their goals. In DRL, experience replay enables reinforcement learning agents to memorize and reuse past experiences, and it has been widely used to stabilize the training process by randomizing over training data and removing correlations in the observation sequence. However, existing experience replay lacks importance sampling to remove the bias errors induced by samples following different probabilities. Since there is no way to gain knowledge about the transition distribution in model-free learning, we propose a novel training framework that amends the distribution of the samples from the beginning. Specifically, we employ Monte Carlo sampling to obtain the input states, and then follow the Markov decision process to obtain four-tuple transition slots. The network parameters are updated synchronously using a batch of independently and identically distributed (IID) transition slots instead of experience replay. This training framework optimizes an unbiased approximation of the loss function, whose estimate exactly matches the real probability distribution of the data inputs without importance sampling, and thus has substantial advantages in sample efficiency and convergence rate over the existing DRL training framework. Since it only changes the mechanism of collecting and exploiting samples, it can be generalized to other reinforcement learning algorithms. Moreover, we propose several algorithms under our new framework to deal with typical discrete and continuous scenarios. These algorithms prove to be far more efficient, and provide examples for existing and future algorithms on how to apply our framework.
Keywords: Reinforcement Learning · Deep Neural Networks · Markov Decision Process
1 Introduction
Reinforcement learning (RL) has been successfully applied to multiple areas ranging from online learning and recommender engines to natural language understanding and generation [2,20,22,23,28]. However, its applicability is limited to domains where the features are handcrafted and the state spaces are fully observed and low-dimensional. Deep reinforcement learning (DRL), which combines deep neural networks with RL [1,7,9], can provide more effective representations of the environment and learn successful policies directly from high-dimensional sensory inputs, extending the applicable fields of RL.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023. K. Arai (Ed.): SAI 2023, LNNS 739, pp. 746–760, 2023. https://doi.org/10.1007/978-3-031-37963-5_51
Although DRL has achieved success in solving a variety of control tasks [10,15–17,21,25,27], there exist many biases in the existing algorithms. One kind of bias is the overestimation that typically exists in the deep Q-network (DQN), and some algorithms have been proposed to address it by altering the maximization strategy or estimating the Q-values twice [4,25]. However, they provide no theoretical bounds for the overestimation due to its randomness, and thus the overestimation cannot really be removed. In fact, the performance of double deep Q-networks (DDQN) is worse than that of DQN in some cases [25], and the Twin Delayed Deep Deterministic policy gradient algorithm (TD3) proposed in [4] has only a limited advantage over Deep Deterministic Policy Gradient (DDPG) [10], which is the continuous variant of DQN.
Besides the overestimation problem, there also exists a systematic bias in the mechanism of collecting and exploiting data in algorithms based on experience replay. Experience replay memorizes past observations and then samples randomly from the memory pool. However, there is a mismatch between the distributions of the target policy and the behavior policy, which makes it necessary to apply importance sampling (IS) to off-policy methods in order to weight the transition slots with different probabilities. Without IS, experience replay induces accumulated errors, which bring instability and lower efficiency to the training tasks. One solution to reduce this bias is to apply IS to model-based RL or to RL with experience replay [3,6,8,12,14,19,24,26], but these algorithms have either high cost or low efficiency.
Therefore, we develop a new mechanism of collecting and exploiting training samples that changes the distribution of the samples instead of using experience replay, which improves the efficiency of DRL training algorithms. Monte Carlo sampling is adopted as a convenient choice for our sample collection method. The paper makes the following contributions. First, we propose an Unbiased Deep Reinforcement Learning (UDRL) framework, which has the properties of uniform convergence and conditioned policy improvement. Second, we further propose an Enhanced UDRL, which can work asynchronously and sample-efficiently while remaining unbiased under several assumptions. Third, two representative algorithms for discrete action spaces and continuous control problems are proposed based on UDRL and Enhanced UDRL. Fourth, extensive evaluations are conducted to compare the proposed algorithms with algorithms based on experience replay in terms of computational efficiency, stability, convergence rate and sample efficiency.
2 Approach
2.1 Background
DRL was initially applied to discrete-action MDPs using the DQN algorithm [17], which combines neural networks, Q-learning and experience replay. Subsequently, experience replay was adopted by DDPG [10] to solve RL problems with continuous state and action spaces. In DRL, the agent continually interacts with the environment to obtain a sequence of observations until a terminal state is reached, which is referred to as an episode. During each episode, the agent chooses actions following a behavior policy to receive rewards and determine the next states, which are used to train the network parameters and update the target Q-values instantly or periodically. Generally, the action-value (Q-value) is represented as the discounted cumulative reward with respect to state and action, that is
$$Q^{\pi}(s, a) = \mathbb{E}_{p_{\pi}(h \mid s_0, a_0)}\left[\sum_{t=0}^{\infty} \gamma^{t} r(s_t, a_t) \,\middle|\, s_0 = s,\ a_0 = a\right], \tag{1}$$
where $r(s, a)$ is the immediate reward, $s$ is the initial state, and $\gamma \in (0, 1)$ is the discount factor for future rewards. Besides, $p_{\pi}(h \mid s_0, a_0)$ is the joint probability of the MDP sequence of an episode given the initial state $s_0$ and action $a_0$, achieved by the behavior policy $\pi$.
2.2 General Loss Function for DRL
We adopt neural networks to approximate Q-values. The neural network derives the behavior policy $\pi$ in (1), and maps the input $(s, a)$ to its Q-value, which is originally the discounted accumulated reward of an episode sequence following the MDP. For both discrete and continuous MDPs, the update of network parameters is based on the Bellman equation [5]. Thus, we set the target value based on the Bellman equation and organize the general loss function for DRL as
$$L(\omega) = \mathbb{E}_{(s, a, r, s') \sim P}\left[\left(r + \gamma Q(s', \mu(s') \mid \omega') - Q(s, a \mid \omega)\right)^{2}\right], \tag{2}$$
where $P$ represents the probability distribution of the input $s$, $a$ is the action drawn from a behavior policy based on $s$, and $r$ and $s'$ are the immediate reward and next state received by interacting with the MDP environment. Given the behavior policy and the MDP environment, $P$ determines the joint distribution of the slot $(s, a, r, s')$. $\mu(s')$ is the target policy mapping $s'$ to the next action $a'$, which is normally different from the behavior policy in off-policy algorithms. $\omega$ is the parameter of the neural network to be optimized, which is normally different from the target network parameter $\omega'$, and $\gamma \in (0, 1)$ is the discount horizon factor.
A key point we can draw from (2) is that the distribution $P$ of the inputs does not need to follow the MDP, because the neural networks used to approximate the Q-values in (1) already contain all the MDP transitions. Traditional DRL algorithms average the loss values over a sequence of MDP observations as the target loss for optimization to update the network parameters. However, this kind of average is inaccurate for an episode of MDP samples and thus may cause divergence or instability issues. DRL deals with the instability by a biologically inspired mechanism called experience replay [11,13,18], which randomly samples over the history data to smooth the data distribution and alleviate data correlations in each episode. It is reasonable to assume that, with a replay buffer whose size is large enough, the samples of the experience pool can be approximately seen as independent, which makes us doubt whether the MDP is required for the sensory inputs of DRL training, i.e., whether $s$ of one transition slot should follow $s'$ of the last transition slot in the MDP sequence.
2.3 UDRL Framework
$P$ in (2) is the joint distribution of a single slot, which can be sampled based on some assumptions. The most favorable choice is independently and identically distributed (IID) sampling because it provides an unbiased approximation of (2). The loss function of UDRL is given by
$$L_{UDRL}(\omega) = \frac{1}{N}\sum_{n=1}^{N}\left(r_n + \gamma Q(s'_n, \mu(s'_n) \mid \omega') - Q(s_n, a_n \mid \omega)\right)^{2}. \tag{3}$$
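To spell out why IID sampling makes (3) an unbiased approximation of (2), the short derivation below may help. Here $\ell_n$ is shorthand introduced only for this illustration (it is not notation from the paper) for the squared temporal-difference error of the $n$th slot, and the slots are assumed to be drawn from $P$ with $\omega$ and $\omega'$ held fixed.

```latex
\ell_n = \left(r_n + \gamma Q(s'_n, \mu(s'_n) \mid \omega') - Q(s_n, a_n \mid \omega)\right)^{2},
\qquad
\mathbb{E}\left[L_{UDRL}(\omega)\right]
  = \frac{1}{N}\sum_{n=1}^{N}\mathbb{E}_{(s_n, a_n, r_n, s'_n) \sim P}\left[\ell_n\right]
  = \frac{1}{N}\cdot N \cdot L(\omega)
  = L(\omega).
```

Only the identical-distribution part is needed for this equality; independence additionally lets the variance of the estimate shrink as $1/N$, assuming $\ell_n$ has finite variance. If the slots came from a distribution other than $P$, the first step would fail unless each term were reweighted, which is the role importance sampling plays.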
Based on (3), we propose the UDRL framework to solve the problems brought by the systematic bias in experience replay without IS [3,6,8,12,14,19,24,26]. The update of the RL algorithm at iteration $i$ is based on the policy at iteration $i-1$, that is, $Q(s, a \mid \omega_i) \leftarrow r + \gamma Q(s', \mu_{i-1}(s') \mid \omega')$. We can then guarantee the properties of uniform convergence and conditioned policy improvement.
The specific processes of UDRL are as follows. First, we randomly sample batches following a certain distribution to obtain IID initial state observations. After taking actions based on the sampled initial states and the exploration policy, the rewards and the next states can be determined, and then the target Q-values can be computed to update the network parameters. Second, by taking a one-step action, we obtain IID single-step transition slots used for each update period. We train on the batch composed of IID samples to update the network parameters and replace it with the batch of the next iteration. Besides providing an unbiased approximation, this has the advantage that we do not need to store a large amount of past collected data to smooth the data distribution and use it to create fake unrelated samples, thus saving the costs of storing and exploiting the sampled data. More importantly, we also prove that deeper transition slots can be IID under some assumptions, which ensures asynchronous and off-policy training. Throughout our method, the IID principle should be maintained to avoid the systematic bias due to the lack of IS mentioned in Sect. 1.
2.4 Applying UDRL to Discrete State-Action Space
To demonstrate the feasibility of our UDRL framework, in this subsection we propose an Unbiased Deep Q-Network (UDQN) to deal with the discrete state-action space under our UDRL framework. Similar to DQN, we choose the action that maximizes the target value as the result of the target policy $\mu(\cdot)$ in (3). Then, the loss function of UDQN that needs to be minimized is given by
$$L_{UDQN}(\omega_i) = \frac{1}{N}\sum_{n=1}^{N}\left(r_{i,n} + \gamma \max_{a'} Q(s'_{i,n}, a' \mid \omega_{i-1}) - Q(s_{i,n}, a_{i,n} \mid \omega_i)\right)^{2}, \tag{4}$$
where $\omega_i$ is the parameter of the neural network at iteration $i$, which parameterizes the Q-value function as $Q(\cdot, \cdot \mid \omega_i)$. The network parameter $\omega_i$ determines $\pi_i$, and thus any change to the behavior policy at iteration $i-1$ will lead to different updates of $\omega_i$. Accordingly, the agent's transition slots $(s_{i,n}, a_{i,n}, r_{i,n}, s'_{i,n})_{n=1,\cdots,N}$ are IID samples in the $i$th batch. The pseudocode of UDQN is given in Algorithm 1. Here we use a counter to record the update/iteration periods instead of the episodes and steps to track the training process, since UDRL does not rely on terminal states.
Algorithm 1. UDQN Algorithm
Input: The batch size $N$ and the maximum number of batches $M$.
Initialization: Initialize the network parameters $\omega \leftarrow \omega_0$ randomly.
for $i = 1, \ldots, M$ do
    Sample $S_i = (s_{i,1}, s_{i,2}, \cdots, s_{i,N})$ IID;
    Choose actions $A_i = (a_{i,1}, a_{i,2}, \cdots, a_{i,N})$ for $S_i$ according to the $\epsilon$-greedy policy;
    Execute actions $A_i$, get next states $S'_i = (s'_{i,1}, s'_{i,2}, \cdots, s'_{i,N})$ and immediate rewards $R_i = (r_{i,1}, r_{i,2}, \cdots, r_{i,N})$;
    Minimize the loss function shown in (4) by gradient descent, and then update $\omega_i$;
end for
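Algorithm 1 can be sketched in a few lines of NumPy. The sketch below is only an illustration under stated assumptions: the six-state chain environment, its reward, the names `step` and `train_udqn`, the uniform state sampler, the constant exploration rate and the learning rate are all hypothetical and not taken from the paper, and a lookup table stands in for the Q-network so that the gradient of (4) can be written by hand.

```python
import numpy as np

# Hypothetical toy problem (not from the paper): a 6-state chain where the
# simulator can be started from any state, so states can be sampled IID.
N_STATES, N_ACTIONS = 6, 2          # actions: 0 = left, 1 = right
GAMMA, EPSILON, LR = 0.9, 0.3, 1.0  # illustrative values only


def step(states, actions):
    """One-step simulator: returns (rewards, next_states) for a batch."""
    moves = np.where(actions == 1, 1, -1)
    next_states = np.clip(states + moves, 0, N_STATES - 1)
    rewards = (next_states == N_STATES - 1).astype(float)
    return rewards, next_states


def train_udqn(batch_size=200, max_batches=3000, seed=0):
    rng = np.random.default_rng(seed)
    # A table stands in for the Q-network (a linear model on one-hot states).
    q = np.zeros((N_STATES, N_ACTIONS))
    for _ in range(max_batches):
        q_prev = q.copy()  # parameters of the previous iteration
        # Sample S_i IID (uniformly here) instead of following an episode.
        s = rng.integers(0, N_STATES, size=batch_size)
        # Epsilon-greedy behavior policy.
        greedy = q_prev[s].argmax(axis=1)
        random_a = rng.integers(0, N_ACTIONS, size=batch_size)
        a = np.where(rng.random(batch_size) < EPSILON, random_a, greedy)
        # Execute one step from every sampled state.
        r, s_next = step(s, a)
        # One gradient-descent step on the loss in (4), target held fixed.
        target = r + GAMMA * q_prev[s_next].max(axis=1)
        td_error = target - q[s, a]
        grad = np.zeros_like(q)
        np.add.at(grad, (s, a), -2.0 * td_error / batch_size)
        q -= LR * grad
        # The batch is discarded here; nothing is stored for replay.
    return q


if __name__ == "__main__":
    q_table = train_udqn()
    print(np.round(q_table, 2))
    print("greedy actions:", q_table.argmax(axis=1))
```

The main difference from a DQN loop is that no episode is rolled out and nothing is kept: every batch is drawn afresh and dropped after one update. This presumes a simulator that can be started from an arbitrary state.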
2.5 Applying UDRL to Continuous State-Action Space
In this part, we propose the Unbiased Deep Deterministic Policy Gradient (UDDPG) to deal with RL problems in continuous state and action spaces. In continuous state and action spaces, the greedy policy is too slow to be practically applied. Therefore, an action approximation function $\mu(s \mid \theta)$ is adopted to specify the deterministic policy. The current actor network is updated by maximizing the expected return, i.e., the average of the Q-values parameterized by the current Q-network, with respect to the actor network parameter $\theta$. The expected return is given by
$$J(\theta_i) = \frac{1}{N}\sum_{n=1}^{N} Q(s_{i,n}, a_{i,n}(\theta_i) \mid \omega_i), \tag{5}$$
where $N$ represents the sample size, $a_{i,n}(\theta_i) = \mu(s_{i,n} \mid \theta_i)$, and $Q(s_{i,n}, a_{i,n} \mid \omega_i)$ is the Q-value parameterized by the critic parameter $\omega_i$ at the $n$th sample and $i$th iteration. The maximization of (5) can be achieved by the gradient ascent method. Replacing the optimal action $\mu(s'_n)$ of the target network in (3) with the action chosen by the target actor network, which is updated partly by (5), we have the loss function to update the current UDDPG critic network, that is
$$L(\omega_i) = \frac{1}{N}\sum_{n=1}^{N}\left(r_{i,n} + \gamma Q(s'_{i,n}, \mu(s'_{i,n} \mid \theta'_i) \mid \omega'_i) - Q(s_{i,n}, a_{i,n} \mid \omega_i)\right)^{2}, \tag{6}$$
where $\mu(s \mid \cdot)$ is the parameterized actor network, and $s_{i,n}$, $a_{i,n}$ and $s'_{i,n}$ represent the current state, the current action and the next state of the $n$th sample, respectively. $\omega_i$, $\omega'_i$ and $\theta'_i$ respectively represent the parameters of the current critic network, the target critic network and the target actor network at iteration $i$, and $(\omega'_i, \theta'_i)$ are soft-updated from $(\omega_i, \theta_i)$ in the way of
$$\theta'_i \leftarrow \tau \arg\max_{\theta_i} J(\theta_i) + (1 - \tau)\theta'_i, \qquad \omega'_i \leftarrow \tau \arg\min_{\omega_i} L(\omega_i) + (1 - \tau)\omega'_i, \tag{7}$$
where $\tau < 1$ constrains the target values to change slowly so that the stability of learning can be greatly improved. (7) is called “soft” target updates [10], relying on the optimization results from (5) and (6).
The details of UDDPG are as follows. First, we sample within the continuous state space in batches to obtain IID initial state observations. Then, we choose actions from an exploration policy, which is given by the current actor network parameterized by $\theta$ plus added noise. The rewards and the next states are determined through interacting with the environment following the chosen actions. After that, the target Q-values are computed based on the target critic network and the target actor network, whose parameters are $\omega'$ and $\theta'$, respectively. There are three updating processes following the computation of the target Q-values. First, the parameters of the current actor network $\theta$ are updated by optimizing the expected return in (5). Second, the parameters of the current critic network $\omega$ are updated by minimizing the loss function in (6). Third, the parameters of the target networks $(\omega', \theta')$ are updated according to (7). After finishing the training processes, a new iteration is launched by starting another batch of Monte Carlo state observations. The whole procedure of UDDPG is organized as Algorithm 2.
Algorithm 2. UDDPG Algorithm
Input: The batch size $N$, the maximum number of batches $M$, and the soft update parameter $\tau$.
Initialization: Initialize the network parameters $(\omega, \theta, \omega', \theta') \leftarrow (\omega_0, \theta_0, \omega'_0, \theta'_0)$ randomly.
for $i = 1, \ldots, M$ do
    Sample $S_i = (s_{i,1}, s_{i,2}, \cdots, s_{i,N})$ IID;
    Choose actions $A_i = (a_{i,1}, a_{i,2}, \cdots, a_{i,N})$ for $S_i$ according to the current actor network $\mu(S_i \mid \theta_i)$ plus exploration noise;
    Execute actions $A_i$, get next states $S'_i = (s'_{i,1}, s'_{i,2}, \cdots, s'_{i,N})$ and immediate rewards $R_i = (r_{i,1}, r_{i,2}, \cdots, r_{i,N})$;
    Maximize the expected return shown in (5) by gradient ascent, and then update $\theta_i$;
    Minimize the loss function shown in (6) by gradient descent, and then update $\omega_i$;
    Execute the “soft” target updates shown in (7) to update $\theta'_i$ and $\omega'_i$;
end for
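The last step of Algorithm 2, the “soft” target update in (7), is simple enough to illustrate on its own. The sketch below assumes the parameters are held as dictionaries of NumPy arrays and that the current parameters passed in are already the result of the gradient steps on (5) or (6); the function name `soft_update` and the layer names are hypothetical, and the default value of `tau` follows Table 1.

```python
import numpy as np


def soft_update(target_params, current_params, tau=0.01):
    """Blend current parameters into the target parameters as in (7).

    Both arguments are dicts mapping a (hypothetical) layer name to an array.
    A new dict is returned; the inputs are left unchanged.
    """
    if not 0.0 < tau < 1.0:
        raise ValueError("tau must lie strictly between 0 and 1")
    return {
        name: tau * current_params[name] + (1.0 - tau) * target_params[name]
        for name in target_params
    }


if __name__ == "__main__":
    theta = {"w": np.ones((2, 2)), "b": np.ones(2)}           # current actor
    theta_target = {"w": np.zeros((2, 2)), "b": np.zeros(2)}  # target actor
    for _ in range(3):
        theta_target = soft_update(theta_target, theta, tau=0.01)
    # With theta held fixed, the gap to theta shrinks by (1 - tau) per update.
    print(theta_target["b"])
```

With `tau = 0.01` the target networks move only 1% of the way towards the current networks per iteration, which is what keeps the target Q-values in (6) slowly varying.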
2.6 Enhanced UDRL Framework
Besides being unbiased, the UDRL framework also has an advantage over DRL in that it saves the costs of storing and exploiting the sampled data, thus saving memory and computation for each update. However, it is limited in sample efficiency because it discards every training batch after use, thus wasting valuable samples that might be expensive to obtain. This motivates us to take advantage of the history batch inputs to achieve a tradeoff between convergence rate and sample efficiency. We propose the Enhanced UDRL (EUDRL) framework, which adopts a buffer memory to store the Monte Carlo samples of each batch. Instead of training on each batch directly, it randomly samples from the memory pool and trains on the selected samples. Different from DRL, which trains one mini-batch per interaction, EUDRL updates the network parameters several times before the next batch arrives, which means that it executes the random sampling process several times and obtains several mini-batches per interaction. If the number of mini-batches per batch is equal to the batch size, each update period consumes just a single sample on average. EUDRL also meets the standard that ensures the transitions sampled from the memory pool are IID. For simplicity, the EUDRL-related algorithms are referred to as EUDQN and EUDDPG for discrete and continuous state and action spaces, respectively.
At the beginning of the algorithm, the memory size $D$, the maximum number of updates $M$ per batch and the mini-batch size $N$ of each sample from the memory are initialized based on the input. The agent keeps Monte Carlo sampling and interacting with the environment to obtain IID samples until the memory is full, and then it trains on mini-batches of size $N$ for $M$ steps, which is called a training cycle. During the training cycle, the network parameters are updated $M$ times. At the end of each training cycle, the oldest $N$ samples are popped out. After that, a new batch of size $N$ is sampled and pushed into the memory.
For brevity, the pseudocodes of EUDQN and EUDDPG are omitted here.
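Although the EUDQN and EUDDPG pseudocode is omitted, the memory schedule described above can be sketched independently of the learning rule. The sketch below is an assumption-laden illustration rather than the authors' implementation: `eudrl_schedule`, `sample_transition` and `update` are hypothetical names, the callbacks stand in for the environment interaction and the parameter update, and each mini-batch is drawn without replacement, a detail the text does not specify.

```python
import random
from collections import deque


def eudrl_schedule(memory_size, updates_per_batch, minibatch_size,
                   num_cycles, sample_transition, update, seed=0):
    """Run the fill / train / pop / push cycle described above.

    sample_transition() -> one IID transition slot (hypothetical callback).
    update(minibatch)   -> one parameter update (hypothetical callback).
    Returns (number of collected samples, number of updates).
    """
    if minibatch_size > memory_size:
        raise ValueError("mini-batch cannot be larger than the memory")
    rng = random.Random(seed)
    memory = deque()
    collected = updates = 0
    # Fill the memory with IID samples before any training.
    while len(memory) < memory_size:
        memory.append(sample_transition())
        collected += 1
    for _ in range(num_cycles):
        # Training cycle: M mini-batch updates drawn from the memory.
        pool = list(memory)
        for _ in range(updates_per_batch):
            update(rng.sample(pool, minibatch_size))
            updates += 1
        # Pop the oldest N samples, then push a fresh batch of size N.
        for _ in range(minibatch_size):
            memory.popleft()
        for _ in range(minibatch_size):
            memory.append(sample_transition())
            collected += 1
    return collected, updates


if __name__ == "__main__":
    counter = iter(range(10**9))   # dummy transitions: 0, 1, 2, ...
    batches = []
    collected, updates = eudrl_schedule(
        memory_size=1000, updates_per_batch=200, minibatch_size=200,
        num_cycles=5, sample_transition=lambda: next(counter),
        update=batches.append)
    print(collected, updates)  # 2000 1000
```

Once the memory is full, each cycle collects $N$ new samples and performs $M$ updates, so the cost is $N/M$ new samples per update. With $N = M = 200$ this is one sample per update, compared with 200 fresh samples per update when every batch of size 200 is used once and discarded.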
3 Experiments
3.1 Maze
In this experiment, we evaluate the performance of UDRL and EUDRL in a discrete MDP. We aim to solve the maze problem with large discrete state-action spaces and many obstacles. The environment of the maze problem is discrete, as shown in Fig. 1(a). Each step has four possible directions (up, down, left and right), taking the agent to an adjacent grid cell. The goal of the agent is to move from the initial state, labelled as start, towards the goal while evading the obstacles represented by the gray cells. During this process, the agent receives a negative reward if it touches a gray cell, which means that these areas are forbidden. Once the agent arrives at the goal, it is rewarded with a score of 100. Besides, the state at the upper-left side of the goal is assigned a score of 50 as a bonus reward, to see whether the agent can pick up all positive rewards before reaching the goal without bumping into the obstacles. In the other states, represented by blank cells, the rewards are zero.
Fig. 1. (a) The Maze Environment with Discrete State-Action Spaces and Endless Obstacles. The Whole Picture is Described by Surroundings of the Initial State and Final State for Brevity; (b) The Robot Arm Environment with a Grasp and Move Task. The Finger of this Robot Arm is at the End of the Section. The Objective is to Reach the Moving Goal and keep Grasping the Moving Goal for Several Steps.
The evaluation results using the same hyperparameter values in Table 1 are shown in Fig. 2(a)–2(c). These figures compare the computational efficiency of UDQN and EUDQN with that of DQN for square mazes ranging from 81 to 196 discrete states. Specifically, an evaluation procedure, which observes the agent from the start point to the terminal state (the goal) and records the reward of each full test episode, is launched every 100 update periods. The timeout is set to 200 steps, which means that an episode terminates if the agent cannot arrive at the goal within 200 steps. The average reward is the accumulated reward averaged over the number of steps consumed during each episode, and the results of 100 episodes are averaged for each evaluation procedure to ensure accuracy. In Fig. 2(a), DQN, UDQN and EUDQN all converge within 80000 update periods due to the relatively small state space, although DQN oscillates around the local optimum with the 50-score bonus at the upper-left side of the goal (see Fig. 1(a)) at the early stage. Also from Fig. 2(a), we observe that DQN converges at around 41 and 53 thousand update periods for 81 and 100 states, respectively, which is more than 10 times the number of update periods needed for UDQN to converge. From Fig. 2(b), we notice that DQN begins to lose its stability and diverges when faced with more obstacles, while UDQN and EUDQN converge much faster and more stably than DQN. When it comes to the 169–196 states shown in Fig. 2(c), DQN diverges within 80000 update periods, while UDQN and EUDQN remain robust. Although EUDQN is a bit slower than UDQN in terms of convergence rate, it is much more sample efficient, considering that the number of samples consumed by UDQN is roughly 200 times that consumed by EUDQN given the hyperparameters in Table 1. It is also noticed that the converged value decreases as the state space becomes larger, because the 150-score bonus is averaged over more steps from the start point to the goal. In conclusion, both UDQN and EUDQN achieve much higher computational efficiency and stability compared with DQN.
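The dependence of the converged value on maze size can be made explicit. Assuming the agent follows an obstacle-free path that collects both positive rewards, and writing $T$ (a symbol introduced here, not in the paper) for the number of steps on that path, the per-step average reward is

```latex
\bar{r}(T) = \frac{100 + 50}{T} = \frac{150}{T},
\qquad\text{e.g.}\qquad
\bar{r}(15) = 10, \quad \bar{r}(25) = 6 .
```

The step counts 15 and 25 are purely illustrative; the actual path lengths depend on the maze layouts, which are not listed in the text.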
Fig. 2. Computational Efficiency Comparisons among DQN, UDQN and EUDQN Ranging from (a) 81–100 States in a Square Maze; (b) 121–144 States in a Square Maze; (c) 169–196 States in a Square Maze. The Converged Value is Fixed and can be Computed by Averaging Rewards over the Number of Steps.
We also compare the convergence rate, in other words, the time required to converge, by estimating the average time cost of each update period. The time cost per update period differs even if the batch size of UDQN is the same as the mini-batch size of DQN, because DQN needs time to process memory storage and uses experience replay. The ratio of the average time cost of DQN to that of UDQN is around 1.73 under the same configuration. Multiplying the time cost per update period by the number of update periods required for convergence of the two algorithms (see Fig. 2(a)), we can see that UDQN is far faster than DQN: with more than 10 times as many update periods, each about 1.73 times as costly, DQN needs more than 17 times the time of UDQN to converge.
3.2 Robot Arm
We conduct the “robot arm” experiment, a grasp-and-move task, to evaluate the performance of the UDDPG (EUDDPG) algorithms in an MDP with continuous state and action spaces. The environment of the “robot arm” experiment is shown in Fig. 1(b). In this figure, the green circle at the center of the space is fixed to confine one side of the arm. The other circles connect the sections so that the finger, represented by the end of the arm, can travel over the whole space. We plot just three sections to represent a general arm that may contain any number of sections. In the experiment setting, the goal, represented by the yellow box, moves randomly, so the state representation includes both the positions of the joints and their positions relative to the goal. Accordingly, the action space has the same dimension as the number of sections, containing the rotation angle of each section. The reward is set as the negative of the distance from the finger to the goal plus a bonus, which is 1 when the finger is located within the goal box. Once the finger catches the goal, i.e., the position of the finger lies within the goal box, it needs to stick to the moving goal for several steps to ensure the stability of the grasp task. For this reason, the agent needs to observe an episode after catching the goal. During these steps, the agent is reset if it falls out of the goal or has finished the grasp steps.
Table 1. List of Hyperparameters of Maze and Robot arm Experiments.
Hyperparameter | Value | Description
learning rate | 0.001 | The learning rate used by the gradient descent optimizer
discount factor | 0.9 | The discount horizon factor γ to estimate the target value
batch size | 200 | The sample size of UDQN or UDDPG per batch
mini-batch size | 200 | The sample size of DQN, EUDQN, DDPG and EUDDPG
mini-batch maximum | 200 | Mini-batch samples in (EUDQN, EUDDPG) per batch
Maze | Value | Description
initial exploration rate | 0.9 | Initial value of ε in the ε-greedy policy
final exploration rate | 0.0001 | Minimum value of ε in the ε-greedy policy
observation size | 2500 | The number of samples collected before training in DQN
replay memory size | 10000 | The size of the replay buffer used in DQN and EUDQN
Robot arm | Value | Description
initial exploration variance | 1.0 | Initial variance of the Gaussian exploration noise added to actions
variance decay rate | 0.9995 | The exploration variance is multiplied by this rate per training
soft update parameter | 0.01 | τ used in “soft” target updates
maximum episode steps | 200 | The timeout steps for DDPG
replay memory size | 30000 | The size of the replay buffer used in DDPG and EUDDPG
Although Algorithm 2 provides a general idea for the UDDPG implementation, several steps need to be added or revised when it is applied to the grasp-and-move task. Specifically, there should be job-done signals after executing actions to determine whether to continue holding on to the goal or to start a new potential episode. We have implemented the DDPG and UDDPG (EUDDPG) algorithms under the “robot arm” environment for a fair comparison, using the same hyperparameter values listed in Table 1. Both DDPG and UDDPG (EUDDPG) exploit the prior knowledge of the space domain in Fig. 1(b) to ensure convergence. For generality, we gradually increase the number of sections in the “robot arm” experiment to observe the performance of the algorithms when faced with a growing state dimension. Notably, one more section increases the state dimension by 4, including the coordinates (2 dimensions) of the joint and its coordinates relative to the goal (2 dimensions). Figure 3 compares the computational efficiency of UDDPG and EUDDPG with that of DDPG for 2–7 sections (9–29 state dimensions) by fitting scatterplots generated from 800 thousand update periods. Specifically, the evaluation interval is 500 update periods; each evaluation observes the robot arm, initialized randomly, moving its finger gradually closer to the randomly located goal until it completes the grasp task or reaches the timeout.
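The reward and the state-dimension bookkeeping described above can be written down compactly. This is a minimal sketch under stated assumptions: the function names are hypothetical, the goal box is assumed to be an axis-aligned square whose half-width `goal_half_width` is an invented parameter (the text gives no box size), and the constant offset of 1 in the state dimension is inferred only from the reported range of 9 to 29.

```python
import math


def state_dimension(num_sections):
    """State size implied by the reported range (2 sections -> 9, 7 -> 29).

    Each section contributes 4 values; the constant offset of 1 is inferred
    from the reported numbers and its meaning is not stated in the text.
    """
    return 4 * num_sections + 1


def reward(finger_xy, goal_xy, goal_half_width):
    """Negative finger-to-goal distance, plus a bonus of 1 inside the goal box.

    goal_half_width is a hypothetical parameter: the text does not give the
    size of the goal box. The box is assumed square and axis-aligned.
    """
    dx = finger_xy[0] - goal_xy[0]
    dy = finger_xy[1] - goal_xy[1]
    inside = abs(dx) <= goal_half_width and abs(dy) <= goal_half_width
    return -math.hypot(dx, dy) + (1.0 if inside else 0.0)


if __name__ == "__main__":
    print([state_dimension(n) for n in range(2, 8)])
    print(reward((0.0, 0.0), (3.0, 4.0), goal_half_width=0.5))  # far away
    print(reward((3.0, 4.2), (3.0, 4.0), goal_half_width=0.5))  # inside box
```

Because the distance term is never positive, the reward is positive only while the finger is inside the goal box, so a policy that keeps holding the moving goal scores higher per step than one that merely passes through it.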
Fig. 3. Computational Efficiency Comparison between DDPG and UDDPG for a Robot Arm with (a) 2–3 Sections; (b) 4–5 Sections; (c) 6–7 Sections. The Converged Value Shows More Sensitivity of Agent to the Moving Goal.
The timeout is set to 100 steps, which means that an episode restarts if the robot arm cannot fulfill the grasp task within 100 steps. During each episode, the accumulated reward is recorded and averaged over the number of consumed steps. Figure 3 shows the average reward of 100 episodes for each evaluation procedure. In Fig. 3(a), when the state dimension is relatively low, DDPG, UDDPG and EUDDPG all converge within 800 thousand update periods. From Fig. 3(b), we can see that as the state dimension grows, the convergence of DDPG becomes unstable and its converged value is low, while UDDPG and EUDDPG remain robust, although EUDDPG is slightly worse than UDDPG in terms of convergence rate and converged value. Since the goal is moving, a different converged value can be interpreted as the sensitivity that enables the robot arm to follow the behavior of the goal, i.e., a higher converged value indicates that the agent reacts more promptly to the randomly moving goal. In general, a larger state dimension results in slower convergence, a lower converged value (less sensitivity) and higher instability, which can be observed from the DDPG curves in Fig. 3(c). However, UDDPG proves to be the most robust in all three aspects. EUDDPG is slightly inferior to UDDPG in these aspects, but it is more sample-efficient.
3.3 Classical Control Environment
We apply the new framework to several classical control experiments to show its advantages. The hyperparameters are again taken from Table 1. Figure 4 presents the results of average reward versus update periods for the Pendulum, Acrobot, continuous Mountain Car and continuous maze experiments. Besides, we show the computational results of the discrete and continuous Cartpole experiments in Fig. 5(a) and 5(b), respectively. In these two experiments, we revised the rewards of the Cartpole environment to make them more challenging. Specifically, the distance and velocity are added to the rewards so that the cart goes as far and as fast as possible instead of staying at the origin to keep stable. The results of velocity and distance versus update periods for the continuous Cartpole experiment are given in Fig. 5(c) and 5(d). From these figures, we see that the UDDPG algorithm is able to drive the cart farther and faster while keeping it more stable.
Fig. 4. Computational Efficiency Comparison (a) in Pendulum; (b) in Acrobot; (c) in Mountain Car Experiment; (d) in Continuous Maze.
Fig. 5. (a) Average Reward Versus Update Periods in Discrete Cartpole; (b) Average Reward Versus Update Periods in Continuous Cartpole; (c) Velocity Versus Update Periods in Continuous Cartpole; (d) Distance Versus Update Periods in Continuous Cartpole.
4 Conclusion
In this paper, we developed a general unbiased training framework for DRL that can be applied to existing and future reinforcement learning algorithms. The proposed UDRL (EUDRL) framework achieves an unbiased approximation of the loss function with the property of uniform convergence. Under the proposed framework, we developed the UDQN (EUDQN) algorithms for discrete state-action spaces and the UDDPG (EUDDPG) algorithms for continuous state-action spaces. Extensive evaluation results show that our UDRL framework is more computationally efficient and stable than the existing DRL for both discrete and continuous state-action spaces. EUDRL is slightly inferior to UDRL in terms of convergence rate, but it is more sample efficient from the perspective of sample utilization. Overall, UDRL (EUDRL) outperforms the existing DRL based on both theoretical analysis and experimental validation.
Application of Convolutional Neural Networks with Quasi-Reversibility Method Results for Option Forecasting
Zheng Cao1(B), Wenyu Du1, and Kirill V. Golubnichiy2
1 University of Washington, Seattle, WA 98195, USA {zc68,wenyudu}@uw.edu
2 University of Calgary, Calgary T2N 1N4, Canada [email protected] https://www.duduncan.com
Abstract. This paper presents a novel way to apply mathematical finance and machine learning (ML) to forecast stock option prices. Following previous results, we create and evaluate new empirical mathematical models for the Black-Scholes equation to analyze data for 92,846 companies. We solve the Black-Scholes (BS) equation forwards in time as an ill-posed inverse problem, using the Quasi-Reversibility Method (QRM), to predict option prices one day ahead. For each company, we have 13 elements including stock and options' daily prices, volatility, minimizer, etc. Because the market is so complicated that no perfect model exists, we apply ML to train algorithms to make the best prediction. The current stage of research combines QRM with Convolutional Neural Networks (CNN), which learn information across a large number of data points simultaneously. We implement CNN to generate new results by validating and testing on sample market data. We test different ways of applying CNN and compare our CNN models with previous models to see whether a higher profit rate can be achieved.
Keywords: Stock Option Prediction · Quasi-Reversibility Method · Machine Learning · Convolutional Neural Networks · Black-Scholes Equation
1 Introduction
For the past few decades, scholars from mathematics, economics, and computer science have been devoted to finding different approaches to predict the economic market. Economic market data are known for their chaotic patterns and randomness, and since no rigorous mathematical theory accurately formulates the growth trend, we use machine learning algorithms to train models that study the historical data and make predictions. Several stock option prediction models have been created by Dr. Golubnichiy and his colleagues, Dr. Mikhail V. Klibanov and Dr. Andrey V. Nikitin: the Quasi-Reversibility Method (QRM), Binary Classification, and Regression Neural Network ML (Regression NN) [3], in the paper Quasi-Reversibility Method and Neural Network Machine Learning to Solution of Black-Scholes Equations (which appeared in the AMS Contemporary Mathematics journal). QRM is an analytical approach to finding the minimizer by solving the Black-Scholes equation as an ill-posed inverse problem, whereas the latter two combine QRM with ML to improve the precision of the results. In this paper, we instead apply another machine learning method, the Convolutional Neural Network (CNN), to help improve stock option prediction performance. Beginning with the QRM mechanics and the previous models' results, the paper lays out a detailed road map of the CNN modeling: from the choice of features from the stock option data, to the data set distribution, to the inner architecture of the neural network, and finally to the analysis and evaluation of different approaches, constraints, and future developments and applications. If our CNN model successfully forecasts prices for a majority of stock options, it may be possible to deploy the model in the real world and help investors make better investment decisions.
Z. Cao and W. Du contributed equally to this work.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023. K. Arai (Ed.): SAI 2023, LNNS 739, pp. 761–770, 2023. https://doi.org/10.1007/978-3-031-37963-5_52
1.1 Notice
This research is for academic study only, not for any financial applications or advice.
2 Previous Work
Klibanov, Kuzhuget, and Golubnichiy [4] developed a new empirical mathematical way of modeling and producing more accurate stock option prices with initial and boundary conditions. The Black-Scholes (BS) equation is used when we are given a time in the past and want to determine the current value [2]. In financial mathematics, the BS equation is a parabolic partial differential equation that targets European style options [5]. Because we are trying to predict future option prices in this research, the task falls into the class of ill-posed problems. An ill-posed problem is one whose solution does not exist or is unstable. The remedy for an ill-posed problem is regularization, and the key is to convert the system into a linear functional form. Here u is the minimizer and hence our prediction. We have a vector X, which contains 13 elements including the previous minimizer u and the volatility coefficient σ [6, Chapter 7, Theorem 7.7]:
$$\frac{\partial u(s,\tau)}{\partial \tau} = \frac{\sigma^2}{2}\, s^2\, \frac{\partial^2 u(s,\tau)}{\partial s^2}, \qquad u(s,0) = f(s). \tag{1}$$
The payoff function at maturity, t = T, is f(s) = max(s − K, 0), where K is the strike price [6] and s > 0. The time remaining until maturity at a given time t is
$$\tau = T - t. \tag{2}$$
The option price function is defined by the Black-Scholes formula:
$$u(s,\tau) = s\,\Phi(\theta_+(s,\tau)) - e^{-r\tau} K\,\Phi(\theta_-(s,\tau)). \tag{3}$$
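As a quick numerical illustration of formula (3), the sketch below evaluates the Black-Scholes price of a European call. It assumes the standard definition θ±(s, τ) = [ln(s/K) + (r ± σ²/2)τ] / (σ√τ), which the text does not spell out, and takes Φ to be the standard normal cumulative distribution function. The function names and the sample parameters are illustrative only and are not taken from the paper.

```python
import math


def normal_cdf(x):
    """Standard normal CDF, Phi(x), written with the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def black_scholes_call(s, k, r, sigma, tau):
    """Price u(s, tau) of a European call following formula (3).

    Assumption (not stated in the text): the standard definition
    theta_+/- = (ln(s/k) + (r +/- sigma^2 / 2) * tau) / (sigma * sqrt(tau)).
    """
    if tau <= 0.0:
        # At maturity the price reduces to the payoff f(s) = max(s - K, 0).
        return max(s - k, 0.0)
    denom = sigma * math.sqrt(tau)
    theta_plus = (math.log(s / k) + (r + 0.5 * sigma ** 2) * tau) / denom
    theta_minus = theta_plus - denom
    return s * normal_cdf(theta_plus) - math.exp(-r * tau) * k * normal_cdf(theta_minus)


# Illustrative parameters (hypothetical, not from the paper's data set).
price = black_scholes_call(s=100.0, k=100.0, r=0.05, sigma=0.2, tau=1.0)
print(f"call price: {price:.4f}")
```

Setting tau to zero returns the payoff max(s − K, 0), which matches the initial condition u(s, 0) = f(s) in Eq. (1).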
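Based on the Itô formula, we have:
$$du = \left( -\frac{\partial u(s, T-t)}{\partial \tau} + \frac{\sigma^2}{2}\, s^2\, \frac{\partial^2 u(s, T-t)}{\partial s^2} \right) dt + \sigma s\, \frac{\partial u(s, T-t)}{\partial s}\, dW. \tag{4}$$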
If Eq. (3) is solved forwards in time to forecast prices of stock options, it is an ill-posed inverse problem. By solving the previous equation, we obtain the solution as the minimizer and apply it to the trading strategy to generate prediction results. After testing on real market data, we find that these new mathematical approaches yield better prediction results, which may help traders make better investment decisions. Table 1 summarizes the results of QRM, Binary Classification, and Regression Neural Network Machine Learning. The Precision column indicates the percentage of profitable options among those selected by each model.
Table 1. Previous Models' Results.
Method | Accuracy | Precision | Recall
QRM | 49.77% | 55.77% | 52.43%
Binary Classification | 56.36% | 59.56% | 70.22%
Regression NN | 55.42% | 60.32% | 61.29%
From these results, it is evident that QRM is promising. However, when combined with QRM, the two machine learning models, Binary Classification and Regression NN, both improved on the results of QRM. Therefore, we decided to explore a different neural network, convolutional neural networks, with results from QRM. Whereas Binary Classification and Regression NN make predictions for each stock option independently, CNNs are better suited to take into consideration multiple stock options at a time when making predictions. This will allow information about the larger market to be included in the prediction. Our task is to implement a new machine learning algorithm with the result of the QRM using CNN.
3 CNN Modeling
Because the stock option market is affected by nearly every aspect of society and it is impossible to build a perfect prediction model for the future economic market, we apply machine learning (ML) to train algorithms that help improve the prediction. ML can automatically improve through experience and data. We input 13 elements for each company as parameters: the stock ask and bid prices for today (2 values); the option ask price, option bid price, and volatility for today and the past 2 days (9 values); and the minimizer obtained from the QRM for today and tomorrow (2 values). Therefore, for the total of 92,846 companies, we have training, validation, and testing data vectors, each representing a selected company for the given time frame. The input data are split into 3 categories: 75.74% training, 10.64% validation, and 13.62% testing, for a total of 92,846 × 13 = 1,206,998 values.
3.1 Sample Data Set Images
The sample data from which we draw the features used to train the ML model include 31 columns: option name, grid count, beta, date, option ask +2, option ask0, option bid –2, option bid –1, option bid –1, option bid0, stock ask0, stock bid0, ivol –2, ivol +1, ivol –0, est +1, est +2, option mean +1, option mean +2, option ask +1, option ask +2, option bid +2, stock ask –1, stock bid –2, stock bid +1, stock bid +1, ivol +1, ivol +2, option type, minimizer error +1, and minimizer error +2. The "+" and "–" here refer to days relative to the present day, e.g. "stock ask –2" means the stock ask price of the day before yesterday. Note that the data frame has 31 columns, but only 13 elements (features) for each option on the given day are considered as the input. All these data are real-world stock option information between late 2019 and early 2021 imported from Bloomberg.com [1].
Table 2. CNN Data Set Distribution.
Data Set Types | Total Number
Training Data | (70,322, 13)
Validation Data | (9,875, 13)
Testing Data | (12,649, 13)
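The row counts in Table 2 can be checked against the percentages and the total number of values quoted at the start of Section 3. The short sketch below does only that arithmetic; the variable names are illustrative.

```python
# Row counts reported in Table 2; every row carries 13 features.
splits = {"training": 70_322, "validation": 9_875, "testing": 12_649}
n_features = 13

total_rows = sum(splits.values())
total_values = total_rows * n_features
# Share of rows in each split, as a percentage rounded to two decimals.
shares = {name: round(100.0 * rows / total_rows, 2) for name, rows in splits.items()}

print(total_rows)    # number of data vectors across the three splits
print(total_values)  # rows times features
print(shares)
```

The totals agree with the 92,846 data vectors and 1,206,998 values stated in the text, and the shares reproduce the 75.74% / 10.64% / 13.62% split.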
Table 2 summarizes the distribution of the input vectors used to train, validate, and test the machine learning algorithm. In each pair, the first value is the number of data points and the second is the number of features. We first train the algorithms using gradient descent, then tune the hyperparameters during the validation stage, and lastly evaluate the model precision and accuracy with the test samples. Once we finish training, we test once and only once. Changing the model after testing is prohibited, because a model adjusted in response to the test result can no longer be assessed fairly on that test set. One example of a hyperparameter tuned during validation is the prediction threshold: if the model output is greater than or equal to this threshold, we predict that this stock goes up. We create different models based on the choice of hyperparameters, and the validation data set is used to evaluate and select the best parameter combination. Currently, we have two approaches for building the Convolutional Neural Network (CNN) as a substitute for the Regression Neural Network to help predict stock options. CNN Approach 1 is finished, and the section below walks through the modeling procedure; the second approach is introduced for future development.
3.2 Machine Learning Input Vector Normalization
Before diving into the detailed modeling procedures, we first normalize the input vector as follows:
$$\mu = \frac{u_a(0) + u_a(-\tau) + u_a(-2\tau) + u_b(0) + u_b(-\tau) + u_b(-2\tau)}{6} \tag{5}$$
$$op_n = \frac{u(t) - \mu}{\sigma} \tag{6}$$
$$s_n = \frac{(s - s_t) - \mu}{\sigma} \tag{7}$$
where $op_n$ is the normalized option price, $s_n$ is the normalized stock price, $s$ is the stock price, $s_t$ is the strike price, and $\sigma$ is the standard deviation.
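A minimal sketch of Eqs. (5)-(7) for a single option is given below. It assumes that σ is the population standard deviation of the same six ask and bid prices that enter Eq. (5); the text only calls it "the standard deviation", so this choice is an assumption. The function name and the sample prices are hypothetical.

```python
from statistics import fmean, pstdev


def normalize_inputs(ask, bid, stock, strike):
    """Apply Eqs. (5)-(7) to one option.

    ask, bid: option ask/bid prices at times 0, -tau, -2*tau (three values each).
    stock, strike: stock price s and strike price s_t.
    Assumption: sigma is the population standard deviation of the six
    option prices; the paper does not say which deviation is meant.
    """
    prices = list(ask) + list(bid)
    mu = fmean(prices)  # Eq. (5): mean of the six ask and bid prices
    sigma = pstdev(prices)
    if sigma == 0.0:
        raise ValueError("all six option prices are equal, so sigma is zero")
    option_n = [(u - mu) / sigma for u in prices]  # Eq. (6), one value per price
    stock_n = ((stock - strike) - mu) / sigma      # Eq. (7)
    return mu, sigma, option_n, stock_n


# Hypothetical prices, for illustration only.
mu, sigma, option_n, stock_n = normalize_inputs(
    ask=[2.0, 2.2, 2.4], bid=[1.8, 2.0, 2.2], stock=105.0, strike=100.0
)
print(mu, sigma, stock_n)
```

Under this assumption the normalized option prices have zero mean and unit standard deviation, which puts option and stock features on a comparable scale.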
3.3 CNN Approach 1.0
We choose n companies, each with 13 pieces of information (including the normalized features above), as the input data, use 6 convolution layers plus 1 max pool layer, and produce an n-by-1 output vector with entries in (0, 1). In this way we study all companies in the input data set simultaneously, and the algorithm seeks to establish hidden relationships among them and thereby a corresponding trading strategy. We input a total of 70,322 companies as the training data.
3.4 CNN Approach 1.1
Upon realizing that the result of CNN Approach 1.0 was extremely one-sided, we modified the last stage of the model to use a sigmoid function. After around 100 epochs, the training loss became stable. Figure 1 shows the evolution of the training loss over epochs. The final CNN model includes 5 convolutional layers, each with 2-by-1 reflection padding to maintain dimensionality. The output of the fifth layer is passed through a max pooling layer and then mapped into (0, 1) by the sigmoid function.
Fig. 1. Gradient Descent Training Loss.
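The remark that padding keeps the dimensionality unchanged through each convolutional layer can be illustrated with a one-dimensional toy case. The sketch below assumes a kernel of length 5 padded by 2 entries on each side, which is only one possible reading of the "2-by-1" padding. The kernel values and helper names are hypothetical, and this is not the authors' network.

```python
def reflect_pad(x, pad):
    """Mirror `pad` entries at each end without repeating the edge value."""
    if pad >= len(x):
        raise ValueError("pad must be smaller than the sequence length")
    left = x[1:pad + 1][::-1]
    right = x[-pad - 1:-1][::-1]
    return left + x + right


def conv1d_same(x, kernel):
    """Sliding dot product over the reflection-padded input.

    The output has exactly len(x) entries (assumes an odd kernel length).
    """
    pad = len(kernel) // 2
    padded = reflect_pad(x, pad)
    return [
        sum(w * padded[i + j] for j, w in enumerate(kernel))
        for i in range(len(x))
    ]


signal = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
kernel = [0.2] * 5  # hypothetical moving-average kernel
out = conv1d_same(signal, kernel)
print(len(signal), len(out))
```

Because the output length equals the input length, several such layers can be stacked without shrinking the company axis, so one prediction per company remains available at the end.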
The threshold c is a hyperparameter that we tune to optimize the model. Any output smaller than c is classified as 0; all other outputs, between c and 1, are classified as 1. The threshold itself has no direct relationship with the training loss above. Using the validation data set, we obtain 0.54 as the threshold value that produces the maximum accuracy. The confusion matrix in Fig. 2 visualizes the true labels versus the predicted labels on the test data set. It shows that this model predicts more negatives than positives. However, when the model does predict a positive, it is more often right than wrong, as indicated by the percentage of positive predictions that are correct. Evaluated on the training data with the best threshold c = 0.54, the model attains an accuracy of 51.62%, a precision of 53.76%, and a recall of 47.14%; evaluated on the testing data with the same threshold, it attains an accuracy of 51.49%, a precision of 57.14%, and a recall of 2.78%. This precision is in line with what the confusion matrix suggests. Table 3 compares the model with QRM and Approach 1.0.
Table 3. CNN Approach 1.0 and 1.1 Results.
Method | Accuracy | Precision | Recall
QRM | 49.77% | 55.77% | 52.43%
CNN App 1.0 Testing | 51.15% | N/A | 0.0%
CNN App 1.1 Training | 51.62% | 53.76% | 47.14%
CNN App 1.1 Testing | 51.49% | 57.14% | 2.78%
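The thresholding rule and the three metrics reported in Table 3 can be sketched as follows. The validation outputs and labels are made up for illustration, and the grid of candidate thresholds is an assumption; the paper does not describe how the candidates were chosen. Precision is returned as None when no positive is predicted, which corresponds to the "N/A" entry for Approach 1.0.

```python
def classify(outputs, c):
    """Map model outputs in (0, 1) to labels: 1 if output >= c, else 0."""
    return [1 if y >= c else 0 for y in outputs]


def metrics(labels, preds):
    """Return (accuracy, precision, recall) from true and predicted labels."""
    tp = sum(1 for t, p in zip(labels, preds) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(labels, preds) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(labels, preds) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(labels, preds) if t == 0 and p == 0)
    accuracy = (tp + tn) / len(labels)
    precision = tp / (tp + fp) if tp + fp else None  # undefined without positive predictions
    recall = tp / (tp + fn) if tp + fn else None     # undefined without true positives
    return accuracy, precision, recall


def best_threshold(outputs, labels, candidates):
    """Pick the candidate with the highest accuracy (the first one wins ties)."""
    return max(candidates, key=lambda c: metrics(labels, classify(outputs, c))[0])


# Made-up validation outputs and labels, for illustration only.
val_outputs = [0.10, 0.35, 0.52, 0.55, 0.58, 0.61, 0.70, 0.90]
val_labels = [0, 0, 0, 1, 0, 1, 1, 1]
candidates = [i / 100 for i in range(40, 71)]  # hypothetical grid 0.40 ... 0.70

c = best_threshold(val_outputs, val_labels, candidates)
print(c, metrics(val_labels, classify(val_outputs, c)))
```

Raising the threshold makes positive predictions rarer, which tends to trade recall for precision; this is the pattern visible in the testing row of Table 3.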
Fig. 2. Confusion Matrix Visualization.
Compared with the original QRM, which has an accuracy of 49.77% and a precision of 55.77%, the CNN Approach increases the precision, which is equivalent to the profitable option rate, to 57.14% and increases the accuracy by approximately 1.72 percentage points. With this improvement, when traders use our model to invest across a large number and range of options, the expected profitable rate increases by 1.37 percentage points per trading unit. The relatively high number of false negatives, together with the relatively low number of false positives (see Fig. 2, Confusion Matrix Visualization), may explain the low recall value of 2.78%.
3.5 CNN Approach 2
Recognizing both the potential in CNN and the limits of Approach 1, we propose another CNN model to explore. Differing from the first approach, we select clusters of companies, e.g. 11 companies as a set to train the algorithm. The clusters are obtained from “neighboring” companies, and each time we only make one prediction for the center company of the cluster. Challenges with Approach 2 include choosing neighboring companies in the cluster and assigning weights to different companies of each cluster (how to indicate certain companies as more important in the prediction).
4 Trading Strategy
Let s denote the stock price, t the time, and σ(t) the volatility of the option. The historical implied volatility listed in the market data of [1] is used in our particular case. We assume that σ = σ(t) to avoid using other historical data for the volatility. Let ub(t) and ua(t) be the bid and ask prices of the option at time t, and sb(t) and sa(t) the bid and ask prices of the stock at time t. It is also known that ub(t) < ua(t) and sb(t) < sa(t). We buy an option if the following holds: EST(τ) ≥ REAL(0). The trading strategy was developed by Golubnichiy and his colleagues, and the results of our models are listed in the following section. It is with this trading strategy that we generated the previously mentioned 57.14% precision (profitable rate) for CNN Approach 1.1. In the following sections, we merge our CNN approaches (1.0 and 1.1) into one combined result, the CNN Approach: 51.49% accuracy, 57.14% precision, 2.78% recall.
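The buying rule EST(τ) ≥ REAL(0) and the resulting profitable rate can be sketched as below. The sketch assumes that EST(τ) is the model's estimated option price one step ahead, that REAL(0) is today's observed price, and that a purchase counts as profitable when the observed price at τ exceeds REAL(0); none of these definitions is spelled out in the text, and all prices are made up.

```python
def buy_signals(est_tau, real_0):
    """Buy an option when its estimated future price is at least today's price."""
    return [e >= r for e, r in zip(est_tau, real_0)]


def profitable_rate(signals, real_0, real_tau):
    """Fraction of purchased options whose observed price went up.

    Returns None when nothing was bought, since the rate is then undefined.
    """
    bought = [(r0, r1) for buy, r0, r1 in zip(signals, real_0, real_tau) if buy]
    if not bought:
        return None
    wins = sum(1 for r0, r1 in bought if r1 > r0)
    return wins / len(bought)


# Hypothetical prices for four options, for illustration only.
est_tau = [2.10, 1.45, 3.30, 0.95]   # estimated prices at time tau
real_0 = [2.00, 1.50, 3.30, 0.90]    # observed prices today
real_tau = [2.20, 1.40, 3.10, 1.00]  # observed prices at time tau

signals = buy_signals(est_tau, real_0)
rate = profitable_rate(signals, real_0, real_tau)
print(signals, rate)
```

With these definitions the profitable rate is the precision of the buy signal, which is how the 57.14% figure is interpreted in the text.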
5 Result
In comparison to the purely mathematical approach, QRM, the modified CNN approach yields higher accuracy and precision while sacrificing recall. Table 4 lists the results of the three models developed by Klibanov et al. [3] and of the CNN.
Table 4. Final Results on Test Data.
Method | Accuracy | Precision | Recall
QRM | 49.77% | 55.77% | 52.43%
Binary Classification | 56.36% | 59.56% | 70.22%
Regression NN | 55.42% | 60.32% | 61.29%
CNN Approach | 51.49% | 57.14% | 2.78%
These models all produce different results from the same input data set. However, one cannot simply conclude that one approach is superior to the others, because of the many hidden constraints and limitations in the choice of data set, features, hyperparameters, etc. These results suggest that the CNN model is able to extract relevant information from the input and learn to make predictions. They also emphasize the high volatility of the stock market and the difficulty of simulating the real-world economy with mathematical models. The input stock options are trending toward growth while the outputs are somewhat concentrated. More work can be done to further optimize the model. Since the accuracy is improved compared to QRM, our model might be a better fit for investing in options with both trends (including purchasing options predicted to decrease in value). One possible explanation for the low recall value is that most options in the data set increase in value, which skews the predicted trend and lowers the recall. The research team's hypothesis is that the current model may fit better on a different stock options data set. The CNN approach appears promising, and better prediction results from future models seem quite possible. Stock option traders may refer to the trading strategy mentioned above and the model's prediction results when seeking better investment profits.
6 Future Developments and Applications
There is great potential for both further development and applications of this research project. In this section, we describe some limitations and directions for future improvement. For the mathematical approach, more factors, such as transaction costs and the magnitude of each stock option's change, could be taken into account instead of relying solely on the direction of change. How to mathematically compute the hedging of various combinations of portfolios is also a challenge. For machine learning modeling, different ML models can be developed by changing the algorithms, weights, and layers of the current models. We believe that there might exist some negative factors in the data that affect the prediction results: COVID-19 and the ongoing warfare might have increased the instability of the stock market. As for the current CNN model, additional testing and evaluation can be done on different markets, not limited to the selected U.S. companies over a short time range. Lastly, for the trading strategy, risk may be incorporated, and special circumstances that rock the market could be taken into consideration.
A new project is being developed to improve the program modeling and trading strategies. The project applies a Recurrent Neural Network and Long Short-Term Memory (LSTM) to make an additional prediction and a Binomial Asset Pricing/Trading Model to discretely optimize the results. Treating each state as an individual date, we model the process as a stochastic process to reduce the run-time computation from 2^n to n, where n stands for the nth day since the start. We are optimistic that this option forecasting and trading research 2.0 can help resolve some, if not all, of the above-mentioned constraints.
7 Summary
In conclusion, a new machine learning approach is modeled to help predict stock option trends with solutions generated from the Quasi-Reversibility Method and the Black-Scholes equation. After processing the Convolutional Neural Network results with the trading strategy, we obtained a 57.14% profitable rate for options, with an accuracy of 51.49% and a recall of 2.78%, for the current CNN approach. Since the accuracy is improved compared to QRM, this model might be a better fit for investing in options with both trends (including purchasing options predicted to decrease). Based on the final result of CNN Approach 1, a hypothesis on modeling the CNN Approach is established for future improvements. Different parameters and machine learning models will be applied and evaluated to produce new predictions. As introduced in the section on future developments and applications, we believe there is great potential for the current option forecasting model, which applies both mathematical equations and machine learning algorithms. Table 5 summarizes all previous modeling results produced by Dr. Klibanov et al. [3] together with those presented in this paper. While profitable option rates vary depending on model selection and procedure, the specific choice of time and range of the source data may also affect the final results.
Table 5. Percentages of Options with Profits/Losses for Different Methods.
Method | Profitable Options | Options with Loss
QRM | 55.77% | 44.23%
Binary Classification | 59.56% | 40.44%
Regression NN | 60.32% | 39.68%
CNN Approach | 57.14% | 42.86%
Most significantly, every machine learning model improved on the profitable option rate of QRM alone, which may help investors make better, more profitable trading decisions.
References
1. Bloomberg: Stock and Option Data from late 2019 to early 2021. https://bloomberg.com
2. Hull, J.C.: Options, Futures, and Other Derivatives. Pearson Education Limited, London (2022)
3. Klibanov, M.V., Shananin, A.A., Golubnichiy, K.V., Kravchenko, S.M.: Forecasting stock options prices via the solution of an ill-posed problem for the Black-Scholes equation. arXiv preprint arXiv:2202.07174
4. Klibanov, M.V., Kuzhuget, A.V., Golubnichiy, K.V.: An ill-posed problem for the Black-Scholes equation for a profitable forecast of prices of stock options on real market data. Inverse Prob. 32(1), 015010 (2016)
5. Shreve, S.E.: Stochastic Calculus for Finance II. Continuous-Time Models. Springer, Cham (2003)
6. Bjork, T.: Arbitrage Theory in Continuous Time. Oxford University Press, Oxford (1999)
A Comparison of LSTM and GRU Networks for Learning Symbolic Sequences
Roberto Cahuantzi(B), Xinye Chen, and Stefan Güttel
Department of Mathematics, The University of Manchester, Manchester M13 9PL, UK [email protected]
Abstract. We explore the architecture of recurrent neural networks (RNNs) by studying the complexity of string sequences that they are able to memorize. Symbolic sequences of different complexity are generated to simulate RNN training and study parameter configurations with a view to the network's capability of learning and inference. We compare Long Short-Term Memory (LSTM) networks and gated recurrent units (GRUs). We find that an increase in RNN depth does not necessarily result in better memorization capability when the training time is constrained. Our results also indicate that the learning rate and the number of units per layer are among the most important hyper-parameters to be tuned. Generally, GRUs outperform LSTM networks on low-complexity sequences while on high-complexity sequences LSTMs perform better.
Keywords: Recurrent Neural Network · LSTM · GRU · Sequence Learning
1 Introduction
The recurrent neural network (RNN) is an extremely expressive sequential model for learning sequence data and plays an important role in sequence-to-sequence learning tasks such as image captioning [18,25], speech modeling [17], symbolic reasoning tasks [11,14,29], and time series prediction [5,31]. Reliable and computationally efficient methods to forecast trends and mine patterns in sequence data are very desirable. Recent sequential models have achieved significant success in temporal sequence forecasting; see e.g. [23], which introduces a probabilistic forecasting methodology based on an autoregressive recurrent neural network model. An interpretable deep learning time series prediction framework is proposed in [20]. Much effort has also gone into studying the architecture of sequential models; see e.g. [9], which gives an empirical exploration of RNNs by conducting a thorough search over different RNN architectures. The study [16] compares a sophisticated hybrid neural network model to simpler network models and more traditional statistical methods (such as hidden Markov models) for trend prediction, with the hybrid model achieving the best results. Another hybrid forecasting method that combines RNNs and exponential smoothing is discussed in [24]. Comparisons of LSTM and GRU networks on numerical time series data tasks can be found in [28]. Despite these significant advances in sequential models, there is also a growing literature suggesting that data pre-processing is just as important to the performance as the model architecture. In this realm, [21] shows that discretization of data can improve the forecasting performance of neural network models. Another critical aspect is the choice of metric used to evaluate the performance of models in forecasting or other inference tasks. In these settings, the Euclidean distance metric and its variants, such as the mean squared error, are often used. However, these metrics can be sensitive to noise in the data, an effect that becomes even more pronounced with time series of high dimensionality. Hence, [5,15] argue that symbolic time series representations, which naturally offer dimensionality reduction and smoothing, are useful tools that allow for discrete (i.e. symbolic) modeling. Here we give empirical insights into the connections between the hyperparameters of popular RNNs and the complexity of the string sequences to be learned (and forecasted). This study is partly inspired by [6], who evaluate the performance of many variants of LSTM cells via extensive tests with three benchmark problems, all rather different from our string learning task.
Among our main findings are that: (1) the learning rate is one of the most influential parameters when training RNNs to memorize sequences (with values near 10^{-2} found to be the best in our setup in terms of training time and forecast accuracy); (2) for the tasks considered here it is often sufficient to use just common RNNs with a single layer and a moderate number of units (such as around 100 units); (3) GRUs outperform LSTM networks on low complexity sequences while on high complexity sequences the order is reversed. The Python code used to perform our experiments is publicly available at https://github.com/robcah/RNNExploration4SymbolicTS. To help the community study machine learning models on symbolic sequences, the methods of this paper have been included in a Python library, slearn [3], that enables the production of synthetic symbolic sequences of user-specified complexity and the comparative study of models. Note that another common approach to using deep learning for regression is global forecasting models (GFMs), which are employed on large collections of temporal data; see e.g. [1,19]. This method is appealing due to the improved generalizability of the resulting models, which reduces the proneness to overfitting and potentially lowers the overall training time. However, the model complexity is significantly higher and the selection of hyper-parameters is even more involved. Therefore, it is hard to understand the relationship between model architecture and input complexity. Here we take a different, simpler approach by training one RNN model at a time to learn symbolic sequences of various string complexity. This provides more direct insight into the learning capability of a single RNN depending on the complexity of the symbolic sequences it is meant to learn. Since a GFM is expected to be at least as complex as the model required to learn a single sequence, we believe that our study also sheds light on some parameter choices for GFMs.
R. C.'s work has been funded by the UK's Alan Turing Institute. S. G. has been supported by a Fellowship of the Alan Turing Institute, EPSRC grant EP/N510129/1.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023. K. Arai (Ed.): SAI 2023, LNNS 739, pp. 771–785, 2023. https://doi.org/10.1007/978-3-031-37963-5_53
2 Methodology
Our approach to quantifying the learning capabilities of RNNs is to generate symbolic sequences of different string complexities (with complexity measured in terms of the compressibility of the string), to train RNNs on a part of that string until a predefined stopping criterion is reached, and then to quantify the accuracy of the forecast of the following string characters in an appropriate text similarity metric. Below we provide details for each of these steps.
2.1 String Generation and LZW Complexity
As training and test data for this study, we produce a collection of strings with quantifiable complexities. These strings are henceforth referred to as seed strings. A Python library was written to generate these seed strings, allowing the user to choose the target complexity and the number of distinct symbols to be used. One way to quantify complexity is due to Kolmogorov [13]: the length of the shortest possible description of the string in some fixed universal language without losing information. For example, a string of a thousand characters that simply repeats "ab" can be described succinctly as 500*"ab", while a string of the same length with its characters chosen at random does not have a compressed representation; the latter string would therefore be considered more complex. A more practical approach to estimating complexity uses lossless compression methods [10,30]. The Lempel-Ziv-Welch (LZW) compression [26] is widely recognized as an approximation to Kolmogorov complexity, and it serves as the basis of our complexity metric because its implementation is simple and can be adapted to generate strings of a target compression rate. The LZW algorithm creates a dictionary of substrings and an array of dictionary keys from which the original string can be fully recovered, as illustrated in Fig. 1. We define the LZW complexity of a seed string as the length of its associated LZW array, an upper bound on the Kolmogorov complexity.
2.2 Training, Test, and Validation Data
The data used to train and evaluate the RNN models is obtained by repeating each seed string until a string s of predefined minimal length is reached. The trailing n characters of s are split off to form a validation string v. The remaining leading characters are traversed with a sliding window of n characters to produce input and output arrays X and y, respectively, for training and testing. Here, the input array X is of dimension m × n × p, where m is the number of input sequences (m = |s| − 2n, where |s| denotes the length of s), n is the length of each input sequence, and p is the dimension of the binary vectors used for the one-hot encoding of each of the distinct characters. The output array y contains the next symbol following each string sequence encoded in X and is
R. Cahuantzi et al.
Fig. 1. An Illustration of LZW Compression. Assume that we have an Alphabet of Three Symbols (A, B, C) and the Seed String "ABABCBABAB". As the LZW Algorithm Traverses the Seed String from Left to Right, a Dictionary of Substrings is Built (Table on the Right). If a Combination of Characters Already Contained in the Dictionary is Found, the Related Index Substitutes the Matching Substring. In this Case, the Resulting Array is [1, 2, 4, 3, 5, 8], Corresponding to an LZW Complexity of 6.
of dimension m × p. The pair (X, y) is split 95% vs 5% to produce the training and test data, respectively. (This rather low fraction of test data is justified because repeated pairs (X, y) occur in the data due to the repetitions in the string s.) The test data is used to compute the RNN accuracy and loss function values. Finally, the one-hot encoding of the validation string v results in an array of dimension n × p. The trained RNN model is then used to forecast the validation string v, and a text similarity measure quantifies the forecast accuracy. As an example, consider a seed string "abc" with p = 3 distinct characters, which is repeated to reach a string s of at least 100 characters. In this case, s = "abcabcabc...abc" has length 102. The trailing n = 10 characters are split off for the validation, resulting in v = "cabcabcabc". The remaining 92 leading characters of s are then traversed with a sliding window of width n = 10 to form the m = 82 input-output data pairs (X, y) as follows:
(abcabcabca, b)
(bcabcabcab, c)
(cabcabcabc, a)
...
(abcabcabca, b)
The one-hot encoding a = [1, 0, 0], b = [0, 1, 0], c = [0, 0, 1] results in the final arrays used for the training and testing.
2.3 Recurrent Neural Networks
We consider two types of RNN architecture: Long Short-Term Memory (LSTM) cells [7] and Gated Recurrent Units (GRUs) [4]. Different versions of these units exist in the literature, so we briefly summarize the ones used here. A standard LSTM cell includes three gates: the forget gate $f_t$, which determines how much of the previous data to forget; the input gate $i_t$, which evaluates the information to be written into the cell memory; and the output gate $o_t$, which decides how to calculate the output from the current information:
\begin{equation}
\begin{aligned}
i_t &= \sigma(W_i X_t + R_i h_{t-1} + b_i) \\
f_t &= \sigma(W_f X_t + R_f h_{t-1} + b_f) \\
o_t &= \sigma(W_o X_t + R_o h_{t-1} + b_o).
\end{aligned}
\tag{1}
\end{equation}
Fig. 2. A Simple RNN Cell on a Single Time-Step (Left) and the Unfolded Interpretation of the same RNN (Right).
Here, the W, R, and b variables represent the matrices and vectors of trainable parameters. The LSTM unit is defined by
\begin{equation}
\begin{aligned}
\dot{C}_t &= \tanh(W_c X_t + R_c h_{t-1} + b_c) \\
C_t &= f_t \odot C_{t-1} + i_t \odot \dot{C}_t \\
h_t &= o_t \odot \tanh(C_t) \\
y_t &= \sigma(W_y h_t + b_y).
\end{aligned}
\tag{2}
\end{equation}
In words, the candidate cell state $\dot{C}_t$ is calculated using the input data $X_t$ and the previous hidden state $h_{t-1}$. The cell memory or current cell state $C_t$ is calculated using the forget gate $f_t$, the previous cell state $C_{t-1}$, the input gate $i_t$ and the candidate cell state $\dot{C}_t$. The Hadamard product $\odot$ is simply the element-wise product of the involved matrices. The output $y_t$ is calculated by applying the corresponding weights ($W_y$ and $b_y$) to the hidden state $h_t$.
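To make Eqs. (1) and (2) concrete, the following is a minimal sketch of a single LSTM time step in plain Python. It is not the authors' implementation: the function names (`sigmoid`, `affine`, `lstm_step`), the parameter layout (a dictionary of `(W, R, b)` triples keyed by gate) and the toy values are assumptions made for illustration, and the output layer $y_t$ is omitted.

```python
import math

def sigmoid(v):
    """Element-wise logistic function for a list of floats."""
    return [1.0 / (1.0 + math.exp(-x)) for x in v]

def affine(W, x, R, h, b):
    """Return W x + R h + b, with matrices given as lists of rows."""
    return [
        sum(w * xi for w, xi in zip(w_row, x))
        + sum(r * hi for r, hi in zip(r_row, h))
        + b_k
        for w_row, r_row, b_k in zip(W, R, b)
    ]

def lstm_step(x, h_prev, c_prev, params):
    """One LSTM time step; params[k] = (W_k, R_k, b_k) for k in "ifoc"."""
    def pre(k):
        W, R, b = params[k]
        return affine(W, x, R, h_prev, b)

    i = sigmoid(pre("i"))                      # input gate, Eq. (1)
    f = sigmoid(pre("f"))                      # forget gate, Eq. (1)
    o = sigmoid(pre("o"))                      # output gate, Eq. (1)
    c_cand = [math.tanh(z) for z in pre("c")]  # candidate cell state, Eq. (2)
    # Element-wise (Hadamard) products give the new cell and hidden states.
    c = [fk * cp + ik * cc for fk, cp, ik, cc in zip(f, c_prev, i, c_cand)]
    h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
    return h, c

# Toy check: one unit, two inputs, all-zero parameters (illustrative values only).
zero = ([[0.0, 0.0]], [[0.0]], [0.0])
params = {k: zero for k in "ifoc"}
h, c = lstm_step([1.0, -1.0], [0.0], [2.0], params)
```

With all parameters set to zero, every gate equals 0.5 and the candidate state is 0, so the cell state is simply halved; this makes the role of the forget gate easy to verify by hand.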
Fig. 3. General Structure of LSTM (Left) and GRU (Right) Units.
GRUs are similar to LSTMs but use fewer parameters and only two gates: the update (ut ) and reset (rt ) gates. The gate ut tunes the update speed of the hidden state while the gate rt decides how much of the past information to forget
by resetting parts of the memory. The GRU unit is defined by the following set of equations, in which $\dot{h}_t$ stands for the candidate hidden state:
\begin{equation}
\begin{aligned}
u_t &= \sigma(W_u x_t + R_u h_{t-1} + b_u) \\
r_t &= \sigma(W_r x_t + R_r h_{t-1} + b_r) \\
\dot{h}_t &= \tanh(W_h x_t + R_h (r_t \odot h_{t-1}) + b_h) \\
h_t &= (1 - u_t) \odot h_{t-1} + u_t \odot \dot{h}_t \\
y_t &= \sigma(W_y h_t + b_y)
\end{aligned}
\tag{3}
\end{equation}
Figure 2 and Fig. 3 illustrate the general RNN architecture and its variants LSTM and GRU.
2.4 Text Similarity Metrics
Our experimental tests are performed on symbolic sequences rather than numerical data (i.e., data associated with quantitative values), so well-justified text similarity metrics are needed for a reasonable assessment and convincing conclusions. Here we briefly discuss the metrics used in this paper for string prediction accuracy. Due to the non-Euclidean nature of symbolic representations, the accuracy of the forecast is best quantified via text edit metrics such as the Damerau-Levenshtein (DL) and Jaro-Winkler (JW) distances. The DL distance counts the number of edit steps required to transform one string into another [2]. The JW distance is a more elaborate metric that is less sensitive to string insertions and changes in character positions; see [27]. The following briefly explains the text distance algorithms used in this project, to give an intuitive understanding of these metrics. According to [2], the DL text distance can be formalised by Eq. 4:
\begin{equation}
dl_{a,b}(i,j) = \min
\begin{cases}
0 & \text{if } i = j = 0, \\
dl_{a,b}(i-1, j) + 1 & \text{if } i > 0 \text{ (deletion)}, \\
dl_{a,b}(i, j-1) + 1 & \text{if } j > 0 \text{ (insertion)}, \\
dl_{a,b}(i-1, j-1) + 1_{(a_i \neq b_j)} & \text{if } i > 0 \text{ and } j > 0 \text{ (substitution)}, \\
dl_{a,b}(i-2, j-2) + 1 & \text{if } i > 1 \text{ and } j > 1 \text{ and } a_i = b_{j-1} \text{ and } a_{i-1} = b_j \text{ (transposition)}.
\end{cases}
\tag{4}
\end{equation}
Here, $dl_{a,b}(i,j)$ denotes the distance between the first i characters of a and the first j characters of b. The symbols $a_i$ and $b_j$ stand for the characters of the strings at positions i and j, respectively. The expression $1_{(a_i \neq b_j)}$ takes the value 0 if $a_i = b_j$ and 1 otherwise. The Jaro-Winkler (JW) distance, from [27], is denoted $d_{jw}$ and is based on the Jaro similarity ($sim_j$), which is defined by Eq. 5:
\begin{equation}
sim_j = \frac{1}{3}\left(\frac{m}{|s_1|} + \frac{m}{|s_2|} + \frac{m - t}{m}\right)
\tag{5}
\end{equation}
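The recurrence in Eq. 4 can be sketched with a small dynamic-programming table. The code below is an illustrative implementation of that recurrence only (often called the restricted DL or optimal string alignment distance); the function name `dl_distance` is an assumption, and this is not necessarily the routine used in the experiments.

```python
def dl_distance(a: str, b: str) -> int:
    """Edit distance following the recurrence in Eq. 4."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # i deletions turn a[:i] into the empty string
    for j in range(cols):
        d[0][j] = j  # j insertions turn the empty string into b[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution (or match)
            )
            # Transposition of two adjacent characters
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]
```

For instance, comparing the validation string "cabcabcabc" with a forecast in which two adjacent characters are swapped, such as "cabcabcbac", gives a distance of 1, whereas a plain Levenshtein distance would count two substitutions.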
0. The parameters p and q are called the AR and MA orders, respectively [26]. Pui Cheong Fung et al. [21] describe ARIMA forecasting, also known as Box and Jenkins forecasting, as capable of dealing with non-stationary time series data because of its "integrate" step. The "integrate" component involves differencing the time series to convert a non-stationary time series into a stationary one. For seasonal time series data, it is likely that short-run non-seasonal components also contribute to the model. Hence there is a need to estimate a seasonal ARIMA model, which incorporates both non-seasonal and seasonal factors in a multiplicative model. According to Falinouss [4], the general form of a seasonal ARIMA model is denoted ARIMA(p, d, q) × (P, D, Q)_S, where p is the non-seasonal AR order, d is the non-seasonal differencing, q is the non-seasonal MA order, P is the seasonal AR order, D is the seasonal differencing, Q is the seasonal MA order, and S is the time span of the repeating seasonal pattern. The most important step in estimating a seasonal ARIMA model is to identify the values of (p, d, q) and (P, D, Q). Based on the time plot of the data, if for example the variance grows with time, then a variance-stabilizing transformation and differencing should be applied. The preliminary
Exploring the Relationship Between News Articles and Stocks Market Movements
values of the autoregressive order p, the order of differencing d, the moving average order q and their corresponding seasonal parameters P, D and Q can be identified by:
• Using the autocorrelation function (ACF) to measure the amount of linear dependence between observations in a time series that are separated by a lag k,
• Using the partial autocorrelation function (PACF) to determine how many autoregressive terms p are necessary,
• Using the inverse autocorrelation function (IACF) for detecting over-differencing.
The parameter d is the number of times the series is differenced to turn a non-stationary time series into a stationary one.
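The differencing behind the "integrate" step, and its seasonal counterpart with period S, can be sketched in a few lines. The helper name `difference` and the toy series are illustrative assumptions and do not come from the paper.

```python
def difference(series, lag=1):
    """Return the lag-differenced series y[t] - y[t - lag]."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]

# A linear trend becomes constant after one ordinary difference (d = 1).
trend = [3, 5, 7, 9, 11]
first_diff = difference(trend)

# A pattern repeating every S = 4 steps with a slow upward drift:
# seasonal differencing (lag = S, D = 1) removes the repeating pattern.
seasonal = [10, 20, 30, 40, 11, 21, 31, 41, 12, 22, 32, 42]
seasonal_diff = difference(seasonal, lag=4)

# Differencing twice (d = 2) simply applies the operation again.
second_diff = difference(difference([1, 4, 9, 16, 25]))
```

Each application shortens the series by `lag` observations, which is one practical reason to keep d and D as small as stationarity allows.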
3 Dataset, Research Methodology and Analysis Results
To achieve the aim of this research we use a publicly available dataset, which can be obtained from the Kaggle website, an online community engaged with data science [27]. The data consists of historical stock prices from Yahoo Finance and news articles from Reddit News for the period from 2012 to 2016. In all data science projects, Data Mining (DM) plays a highly significant role, since it provides a way to turn previously unused data into useful information. For this research, the Knowledge Discovery in Databases (KDD) data science methodology is adopted. This methodology is divided into multiple steps, which are explained in the following sub-sections [6], and the Python programming language was used to perform the technical steps of this research.
3.1 Data Selection
The historical stock prices range from 1970 to 2018 and contain open, close, low and high prices. For the purposes of this research, data for the period 2012 to 2016 is used to match the available news headlines data. The stocks used for this research are from the major banks industry, and the five stocks selected are New York Community Bancorp (NYCB), Farmers National Banc Corp (FMNB), German American Bancorp (GABC), FNB Corporation (FNB) and Deutsche Bank (DXB).
3.2 Data Exploration and Pre-Processing
The stock price datasets for each of the banks listed above are checked for missing and duplicate values, and the distribution of prices over time is inspected for any extreme inconsistency. Summary statistics, including the mean, median, minimum and maximum, are calculated to better understand the data. Additionally, histograms and box plots are used to understand the distribution of the values in the data set and check for outliers.
A. Marshan et al.
For example, for New York Community Bancorp (NYCB), the mean value for both the open and close prices is similar at $15.18, and the minimum and maximum values of the close price are $11.54 and $19.02, respectively. For the close price, the largest number of observations falls between $15.50 and $16.20, whilst the range from $11.50 to $12.50 contains the fewest, as shown in Fig. 1 (left) below. Figure 1 (right) shows the histogram for open stock prices, with the highest frequency in the range from $15.40 to $16.00. Both histograms approximately follow a normal distribution.
Fig. 1. NYCB Close Price (Left) and Open Price (Right) Prices Histograms
Neither the close nor the open stock prices have any outliers, as depicted in Fig. 2. The box plots suggest an approximately normal distribution, which is also reflected in the histograms.
Fig. 2. NYCB Close Price (Left) and Open Price (Right) Prices Boxplots
For the FMNB stock close and open prices, Fig. 3 shows right-skewed histograms for close (left) and open (right) prices. A log transformation and normalization were used to deal with data skewness [8]. Additionally, Fig. 4 depicts the box plots for FMNB's close and open prices, which show that the data contain outliers.
Fig. 3. FMNB Close Price (Left) and Open Price (Right) Prices Histograms
Fig. 4. FMNB Close Price (Left) and Open Price (Right) Prices Boxplots
The same data exploration steps were applied to the close and open stock prices of the other banks (GABC, FNB and DXB) to better understand the distribution of the data set and check for outliers, which were imputed using mean values calculated per year and per month. For the news dataset, data pre-processing steps that include tokenization and the removal of duplicates, short words and stop words are carried out to prepare the data for transformation, which is the next step in the KDD data science methodology.
3.3 Models' Development and Interpretation
In this sub-section we examine the correlation, autocorrelation and time lag between sentiments in news articles and stock prices. According to [26], autocorrelation is when a time series is linearly related to a lagged version of itself, whilst correlation is when two different variables are linearly related. The coefficient of correlation between two values in a time series is called the autocorrelation function (ACF); for example, the ACF for a time series $y_t$ is given by Eq. 4:
\begin{equation}
\mathrm{Corr}(y_t, y_{t-k}), \quad k = 1, 2, \ldots
\tag{4}
\end{equation}
The value of k is the time gap being considered and is called the lag. A lag 1 autocorrelation (i.e., k = 1 in the above formula) is the correlation between values that
are one time period apart (i.e., one day in the case of the dataset used in this research). More generally, a lag k autocorrelation is the correlation between values that are k time periods apart. For NYCB stock prices, we start by calculating and plotting the autocorrelation of the close and open prices for a range of time lags (1 to 25). Figure 5 shows the autocorrelation plots for close (left) and open (right) prices of the NYCB stock with 95% confidence bands. The plots in Fig. 5 show that the autocorrelations at time lags between 1 and 25 are strong and statistically significant.
Fig. 5. NYCB Close Price (Left) and Open Price (Right) Autocorrelation
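As a rough illustration of what such a plot computes, the sketch below evaluates the usual sample autocorrelation at lag k together with an approximate 95% significance bound. The function names and the bound of ±1.96/√N (the common white-noise approximation) are assumptions; the paper does not state how its confidence bands were obtained.

```python
import math

def autocorrelation(y, k):
    """Sample autocorrelation of the series y at lag k (k >= 0)."""
    n = len(y)
    mean = sum(y) / n
    denom = sum((v - mean) ** 2 for v in y)
    num = sum((y[t] - mean) * (y[t - k] - mean) for t in range(k, n))
    return num / denom

def significance_bound(n, z=1.96):
    """Approximate 95% bound for the autocorrelation of white noise."""
    return z / math.sqrt(n)

prices = [1.0, 2.0, 3.0, 4.0, 5.0]  # toy series, not the NYCB data
r1 = autocorrelation(prices, 1)
is_significant = abs(r1) > significance_bound(len(prices))
```

A lag is usually read as statistically significant when the absolute autocorrelation exceeds the bound, which is the visual test applied to plots such as Fig. 5.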
Next, we calculate and plot the autocorrelation of the polarity computed during data transformation, as explained before; see Fig. 6. The plot indicates that time lags 5, 9 and 12 are the most suitable values for the model, with time lag 12 being the best.
Fig. 6. News Polarity Time Series Autocorrelation
In addition, using cross-correlation we observe a correlation between the stock price and sentiment polarity time series, although it is weak. The highest correlations were observed at time lags 1 to 5 and again at lag 7, which suggests that sentiment polarity extracted from news articles has an effect on NYCB stock prices during the first week after a news release. In addition, we compute the Pearson correlation to understand the effect that news articles have on NYCB stock prices. Table 1 shows a positive correlation between
both open and close prices and Polarity, with values of 0.12 and 0.13, respectively, although the correlation is weak. Moreover, Table 1 shows that there is a negative correlation (the coefficient is -0.22) between closing prices and negative sentiments, implying that negative sentiments have a negative effect on stock prices. In the bottom part of Table 1, the focus is on the correlation between the stock close price and the different sentiments, because we want to see the effect of the sentiments implied in news articles on stock prices at the end of the day.
Table 1. Pearson's Correlation between Open and Closing Prices vs Polarity and Negative, Positive and Neutral Sentiments
Variable pair        Pearson correlation
Open vs Polarity      0.12488
Close vs Polarity     0.12839
Close vs Negative    -0.22003
Close vs Positive     0.07285
Close vs Neutral      0.16987
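The Pearson coefficients in Table 1, and the lagged cross-correlation mentioned earlier, can be sketched as follows. The function names and the toy series are hypothetical and only illustrate the computation; they are not the paper's data or code.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def lagged_pearson(sentiment, price, lag):
    """Correlate sentiment on day t with the price on day t + lag (lag >= 0)."""
    if lag == 0:
        return pearson(sentiment, price)
    return pearson(sentiment[:-lag], price[lag:])

# Toy data in which the price copies the sentiment of the previous day.
sentiment = [0, 1, 0, 2, 0, 3, 0]
price = [9, 0, 1, 0, 2, 0, 3]
r_lag1 = lagged_pearson(sentiment, price, 1)
```

Scanning `lag` over a range such as 0 to 25 and recording the coefficient for each value reproduces the kind of lag analysis described above, under the assumption that both series are aligned by trading day.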
Moreover, using a new variable called "label", which has the value 1 if the price increased from the previous day and 0 otherwise, machine learning models including Random Forest (RF), Naive Bayes (NB) and Support Vector Machines (SVM) are used to predict the stock trend for NYCB. The data was split into a training set (80%) and a test set (20%), which are used to train the models and to evaluate their ability to predict the stock market trend. Table 2 compares the accuracy achieved by each model, from which we can see that the RF model performs best.
Table 2. ML Models Performance Comparison
Model                    Accuracy
Random Forest            0.63
Naïve Bayes              0.44
Support Vector Machine   0.45
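The construction of the "label" variable, the 80/20 split and the accuracy measure can be sketched as below. The helper names and the price series are hypothetical, and the chronological split is an assumption, since the paper does not say whether the split was random or ordered in time; the classifiers themselves are not reproduced here.

```python
def make_labels(prices):
    """1 if the price rose compared with the previous day, else 0."""
    return [1 if prices[t] > prices[t - 1] else 0 for t in range(1, len(prices))]

def split_train_test(rows, train_fraction=0.8):
    """Chronological split: the first part for training, the rest for testing."""
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

prices = [10, 11, 11, 9, 12, 13, 13.5, 13, 14, 15, 16]  # toy close prices
labels = make_labels(prices)
train, test = split_train_test(labels)
score = accuracy([1, 0, 1, 1], [1, 1, 1, 0])
```

A chronological split avoids training on days that come after the test period, which matters for time series; a shuffled split would be an equally valid reading of the text.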
Finally, in order to use an ARIMA model, the data is checked for stationarity, and differencing is used to make it stationary. Then, an ARIMA model is created. The values used as the ARIMA parameters are p = 5, d = 1 and q = 0, where p is the non-seasonal Auto
Regression (AR) order, d is the non-seasonal differencing, and q is the non-seasonal Moving Average (MA) order. To evaluate the ARIMA model, two different error functions are used: Mean Squared Error (MSE), which calculates the average squared difference between the observed and predicted values (errors), and Symmetric Mean Absolute Percentage Error (SMAPE). SMAPE is commonly used as an accuracy measure based on relative errors. The formula is shown in Eq. 5 below, where $A_t$ is the actual value and $F_t$ is the forecast value.
\begin{equation}
\mathrm{SMAPE} = \frac{100\%}{n} \sum_{t=1}^{n} \frac{|F_t - A_t|}{(|A_t| + |F_t|)/2}
\tag{5}
\end{equation}
The MSE value is 0.048, which is small, implying that the predicted values lie close to the observed values; a small MSE is preferred over a large one. The SMAPE value is 5.2, implying that the forecast is off by 5.2% on average, which indicates a good fit. To visualize how the model performed against the actual prices, Fig. 7 shows a plot of the training, test and predicted prices based on the ARIMA model.
Fig. 7. Plot for Training, Testing and Predicted Prices
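The two error measures can be written directly from their definitions. The following is a small sketch with made-up numbers; the function names are illustrative, and `smape` assumes that an actual value and its forecast are never both zero.

```python
def mse(actual, forecast):
    """Mean squared error between observed and predicted values."""
    return sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)

def smape(actual, forecast):
    """Symmetric mean absolute percentage error (in percent), as in Eq. 5."""
    n = len(actual)
    total = sum(
        abs(f - a) / ((abs(a) + abs(f)) / 2)
        for a, f in zip(actual, forecast)
    )
    return 100.0 / n * total

actual = [100.0, 200.0]    # hypothetical observed prices
forecast = [110.0, 180.0]  # hypothetical predicted prices
error_mse = mse(actual, forecast)
error_smape = smape(actual, forecast)
```

Because MSE depends on the scale of the prices while SMAPE is a relative measure, the two can rank stocks differently, so it is useful to report both.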
Figure 8 shows a magnified plot of the price predicted by the ARIMA model with parameters p = 5, d = 1 and q = 0 against the actual price. The two curves closely follow each other. The data analysis steps used for NYCB stock prices were also applied to the close and open stock prices of the other banks (FMNB, GABC, FNB and DXB) to understand the effect of sentiments extracted from news headlines on these stocks.
Fig. 8. Plot for Actual and Predicted Prices
4 Research Findings and Discussion
In this study, we investigate the causal relationship between news articles and stock market movements using NLP techniques to extract sentiments from news articles. The distribution of the data is examined through histograms, and outliers are detected through box plots. Stock prices for NYCB show a normal distribution and no outliers, GABC shows right skewness and outliers, and FNB has a normal distribution and outliers. On the other hand, FMNB shows a right-skewed distribution and outliers, whilst DXB has left skewness and outliers. Time series plots for stock prices as well as for sentiment scores are generated in order to search for patterns and relationships. From the time series plots for open and close prices and those for sentiment scores, it is noted that the sentiment scores fluctuate randomly with no clear patterns, whilst the prices have clear trends. An ARIMA model was applied for stock forecasting, and Symmetric Mean Absolute Percentage Error (SMAPE) and Mean Squared Error (MSE) values are computed to validate the suitability of the time lag used in the ARIMA model [26]. Autocorrelation for open prices reveals the following patterns for the stocks.
• NYCB: positive correlation, implying that it is a good model if applied to the data, and the lags used are strong and statistically significant.
• GABC: positive correlation until y(t) = 25, thereafter no strong correlation.
• FMNB: positive correlation until y(t) = 10, thereafter no correlation and it is scattered.
• FNB: positive correlation.
• DXB: positive correlation only from y(t) = 22 and dense after y(t) = 25.
Table 3 shows the SMAPE and MSE for each stock:
Table 3. SMAPE and MSE for Each Stock
Stock   SMAPE    MSE
NYCB    5.203    0.048
FMNB    15.383   0.054
GABC    15.805   0.130
FNB     8.192    0.043
DXB     4.918    0.075
A small MSE implies that the predicted values lie close to the observed values, so a smaller MSE is preferred over a larger one. Table 3 shows that FNB has the smallest MSE and is the preferred option in that respect. DXB has the smallest SMAPE value, with a forecast that is off by 4.92%, which is a good fit. Pearson correlation reveals that news polarity sentiment scores and stock prices have weak relationships. In addition, Random Forest (RF), Support Vector Machine (SVM) and Naïve Bayes (NB) algorithms were used to predict future market movements using sentiments extracted from news headlines. Comparing the accuracy of these models shows that the RF model performed best. Generally, the results of this study show that there is a correlation between sentiments and stock prices, as seen while exploring the effect of news articles on the NYCB stock prices, where good news on the 5th of January 2012 resulted in an increase in the open price on the following day. However, as alluded to by Lima et al. [15], factors other than the sentiments depicted in news articles, such as demand and supply and interest rates, can also affect stock prices. Additionally, the time lag between the release of the news and the market's response is of considerable importance for estimating when the market is going to respond to new information. For example, this research focused on daily stock movements and news articles rather than hourly statistics, which could help to uncover a stronger correlation. This study shows detailed steps to explore the causality and correlation between sentiments extracted from news articles and stock market changes. In addition, investigating the time lag between the news release time and the stock movement helps in finding when and how the stock market will react to this release [21, 26]. Compared to other studies that investigated this correlation [2, 14, 15], this work presents a more comprehensive approach to understanding how stock markets are affected by news articles.
5 Conclusion and Future Research Directions
This research aimed to explore the sentiments that can be detected from news articles and the causality between these sentiments and the changes observed in the stock market. Past research has argued that news headlines and articles have a significant effect on stock price movement. Thus, exploring this correlation can help with stock price prediction. In this work we explored the causality between news articles extracted from Reddit News and stock prices downloaded from Yahoo Finance. ARIMA models for the prices of the financial stocks were fitted, and SMAPE values were calculated to ascertain which model fits the given data. In addition, correlations between sentiments extracted from news headlines and stock prices were calculated; such a correlation does exist, although it is weak. Moreover, Random Forest, Support Vector Machine and Naïve Bayes algorithms were used to determine the best model to predict stock market movements while considering news sentiment polarity as an additional predictor. Comparing the models' accuracies shows that Random Forest performs best, scoring 66%. Findings based on daily statistics show that there is indeed a correlation between news articles and stock prices, although it is not very strong. These models can be used by traders and investors in deciding which stocks to buy or sell at any given time. One of the research gaps identified is the time lag between the time the news is released and the time the stock market price changes as a result of the published information. If the files used in the research had carried time stamps in hours, better predictions and models would have been achieved. In addition to the time lag, a more granular level of sentiment analysis (i.e., emotions) can provide the ability to better predict stock market movement [28]. By understanding the emotions attached to the positive, negative and neutral sentiments, better correlation results can be achieved. Finally, when dealing with seasonal time series, Seasonal ARIMA (SARIMA) can be used for forecasting [30]. Additionally, Deep Learning-based models that use Long Short-Term Memory (LSTM) to extract emotions from news articles and correlate them with changes in stock prices can be useful to better understand the effect of news articles on stock market movement and to achieve better stock price prediction accuracy.
References
1. Berry, T.D., Howe, K.M.: Public information arrival. J. Financ. 49(4), 1331–1346 (1993)
2. Bhardwaj, A., et al.: Sentiment analysis for Indian stock market prediction using Sensex and Nifty. Procedia Comput. Sci. 70, 85–91 (2015). https://doi.org/10.1016/j.procs.2015.10.043
3. Fama, E.F.: Efficient Capital Markets: a review of theory and empirical work. J. Financ. 25(2), 383–417 (1970). Paper Proceedings of Twenty-Eighth Annual Meeting on American Finance Association, New York
4. Falinouss, P.: Stock trend prediction using news articles. Master's thesis (2007)
5. Fama, E.F.: Efficient capital markets: II. J. Financ. 46(5), 1575–1617 (1991)
6. Hendrickx, T., Cule, B., Meysman, P., Naulaerts, S., Laukens, K., Goethals, B.: Mining association rules in graphs based on frequent cohesive Itemsets. In: Cao, T., Lim, E.-P., Zhou, Z.-H., Ho, T.-B., Cheung, D., Motoda, H. (eds.) PAKDD 2015. LNCS (LNAI), vol. 9078, pp. 637–648. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-18032-8_50
7. Garcia, V.F., Liu, L.: Macroeconomic determinants of stock market development. J. Appl. Econ. 2(1), 29–59 (1999). https://doi.org/10.1080/15140326.1999.12040532
8. Hair, J.F., et al.: Multivariate Data Analysis. Pearson, Upper Saddle River (2010)
9. Hargreaves, C., Hao, Y.: Prediction of stock performance using analytical techniques. J. Emerg. Technol. Web Intell. 5(2), 136–142 (2013). https://doi.org/10.4304/jetwi.5.2.136-142
10. Hegazy, O., et al.: A machine learning model for stock market prediction. Int. J. Comput. Sci. Telecommun. 4(12), 4057–4062 (2013). https://doi.org/10.22214/ijraset.2021.35822
11. Hendahewa, C., Pavlovic, V.: Analysis of causality in stock market data. In: Proceedings of the 2012 11th International Conference on Machine Learning and Applications, ICMLA 2012, vol. 1, pp. 288–293 (2012). https://doi.org/10.1109/ICMLA.2012.56
12. Ignatow, G., Mihalcea, R.: Sentiment analysis (2018). https://doi.org/10.4135/9781483399782.n14
13. Joshi, K., Bharathi, P., Jyothi, P.: Stock trend prediction using news sentiment analysis (2013)
14. Li, X., et al.: News impact on stock price return via sentiment analysis. Knowl. Based Syst. 69(1), 14–23 (2014). https://doi.org/10.1016/j.knosys.2014.04.022
15. Lima, M.L., et al.: Using sentiment analysis for stock exchange prediction. Int. J. Artif. Intell. Appl. 7(1), 59–67 (2016). https://doi.org/10.5121/ijaia.2016.7106
16. Malkiel, B.G.: Random Walk Down Wall Street. W. W. Norton Co, London (1996)
17. Marshan, A., Kansouzidou, G., Ioannou, A.: Sentiment analysis to support marketing decision making process: a hybrid model. In: Arai, K., Kapoor, S., Bhatia, R. (eds.) FTC 2020. AISC, vol. 1289, pp. 614–626. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-63089-8_40
18. Mitchell, M.L., Mulherin, J.H.: The impact of public information on the stock market. J. Financ. 49(3), 923–950 (1994)
19. Nam, K.H., Seong, N.Y.: Financial news-based stock movement prediction using causality analysis of influence in the Korean stock market. Decis. Support Syst. 117, 100–112 (2019). https://doi.org/10.1016/j.dss.2018.11.004
20. Penman, S.H.: The distribution of earnings news over time and seasonalities in aggregate stock returns. J. Financ. Econ. 18(2), 199–228 (1987). https://doi.org/10.1016/0304-405X(87)90039-0
21. Pui Cheong Fung, G., et al.: Stock prediction: Integrating text mining approach using real-time news. In: IEEE/IAFE Conference on Computational Intelligence for Financial Engineering Proceedings, pp. 395–402, January 2003. https://doi.org/10.1109/CIFER.2003.1196287
22. Radinsky, K., et al.: Learning causality for news events prediction. In: WWW 2012 – Proceedings of 21st Annual Conference on World Wide Web, pp. 909–918 (2012). https://doi.org/10.1145/2187836.2187958
23. Schumaker, R.P., Chen, H.: Textual analysis of stock market prediction using breaking financial news: the AZFin text system. ACM Trans. Inf. Syst. 27, 2 (2009). https://doi.org/10.1145/1462198.1462204
24. Schumaker, R.P., Chen, H.: Textual analysis of stock market prediction using financial news articles. In: Association for Information Systems - 12th Association for Information Systems, AMCIS 2006. 3, 1422–1430 (2006)
25. Shah, D., et al.: Predicting the effects of news sentiments on the stock market. In: Proceedings of 2018 IEEE International Conference on Big Data, Big Data 2018, pp. 4705–4708 (2019). https://doi.org/10.1109/BigData.2018.8621884
26. Siami-Namini, S., Namin, A.S.: Forecasting Economics and Financial Time Series: ARIMA vs. LSTM, pp. 1–19 (2018)
27. Sun, J.: Daily News for Stock Market Prediction. https://www.kaggle.com/datasets/aaron7sun/stocknews. Accessed 01 June 2022
28. Taffler, R.: Emotional finance: investment and the unconscious. Eur. J. Financ. 24(7–8), 630–653 (2018). https://doi.org/10.1080/1351847X.2017.1369445
29. Thompson, R.B., Olsen, C., Dietrich, J.R.: Attributes of news about firms: an analysis of firm-specific news reported in the Wall Street Journal Index. J. Account. Res. 25(2), 245–274 (1987)
30. Vagropoulos, S.I., et al.: Comparison of SARIMAX, SARIMA, modified SARIMA and ANN-based models for short-term PV generation forecasting. In: 2016 IEEE International Energy Conference, ENERGYCON 2016, pp. 1–5 (2016). https://doi.org/10.1109/ENERGYCON.2016.7514029
L3Cube-MahaSBERT and HindSBERT: Sentence BERT Models and Benchmarking BERT Sentence Representations for Hindi and Marathi
Ananya Joshi1,3(B), Aditi Kajale1,3, Janhavi Gadre1,3, Samruddhi Deode1,3, and Raviraj Joshi2,3
1 MKSSS' Cummins College of Engineering for Women, Pune, Maharashtra, India
2 Indian Institute of Technology Madras, Chennai, India
3 L3Cube, Pune, India
Abstract. Sentence representations from vanilla BERT models do not work well on sentence similarity tasks. Sentence-BERT models specifically trained on STS or NLI datasets are shown to provide state-of-the-art performance. However, building these models for low-resource languages is not straightforward due to the lack of these specialized datasets. This work focuses on two low-resource Indian languages, Hindi and Marathi. We train sentence-BERT models for these languages using synthetic NLI and STS datasets prepared using machine translation. We show that the strategy of NLI pre-training followed by STSb fine-tuning is effective in generating high-performance sentence-similarity models for Hindi and Marathi. The vanilla BERT models trained using this simple strategy outperform the multilingual LaBSE trained using a complex training strategy. These models are evaluated on downstream text classification and similarity tasks. We evaluate these models on real text classification datasets to show that embeddings obtained from synthetic data training generalize to real datasets as well, and thus represent an effective training strategy for low-resource languages. We also provide a comparative analysis of sentence embeddings from FastText models, multilingual BERT models (mBERT, IndicBERT, xlm-RoBERTa, MuRIL), multilingual sentence embedding models (LASER, LaBSE), and monolingual BERT models based on L3Cube-MahaBERT and HindBERT. We release L3Cube-MahaSBERT and HindSBERT, the state-of-the-art sentence-BERT models for Marathi and Hindi respectively. Our work also serves as a guide to building low-resource sentence embedding models.

Keywords: Natural Language Processing · Text Classification · Sentiment Analysis · Marathi Sentence Representations · Hindi Sentence Representations · Sentence-BERT · Indian Regional Languages · Low Resource Languages

A. Joshi: Authors contributed equally.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023
K. Arai (Ed.): SAI 2023, LNNS 739, pp. 1184–1199, 2023. https://doi.org/10.1007/978-3-031-37963-5_82
1 Introduction
On sentence-pair regression tasks like semantic textual similarity, BERT has achieved new state-of-the-art performance [7]. Semantic similarity aims to find sentences whose meaning is similar to that of a target sentence and is used in applications like clustering and semantic search [26]. Initial approaches based on BERT require feeding both sentences into the network, which makes the task computationally intensive. This design renders BERT unsuitable for unsupervised tasks like clustering as well as for semantic similarity search. In another naive approach, individual sentences are provided as input to BERT to derive fixed-size sentence embeddings, using either the average of the BERT output layer (known as the BERT embedding) or the output of its [CLS] token. However, previous works have shown such sentence embeddings to be unsuitable for semantic similarity tasks. The average embedding is known to work better than the [CLS] token [5,23,27,33]. Alternatively, some works have shown [CLS] to work well when the domain of pre-training and target fine-tuning is the same.

More recently, computationally efficient Sentence-BERT models were proposed and shown to work well on similarity-based tasks [27]. Sentence-BERT is a variant of the standard pre-trained BERT that generates sentence embeddings using siamese and triplet networks. Although monolingual and multilingual BERT models have been available for low-resource languages, Sentence-BERT models are still missing due to the non-availability of specialized NLI and STS datasets. Hence, we present Sentence-BERT models for low-resource languages using synthetic NLI and STS datasets prepared using machine translation.

A significant amount of research has been done on the English language [2,6,13], but Indian regional languages like Marathi and Hindi lack sufficient language resources [14,32]. Marathi, having its origin in Sanskrit, is rich in morphology [17]. The fast-growing internet presence of these languages suggests the need for research and development. Building Sentence-BERT models for Marathi and Hindi is complex due to the lack of specialized datasets in these languages. Hence, we use synthetic NLI and STS datasets created using translation for training the Sentence-BERT models. Synthetic datasets are a desirable substitute in the absence of real datasets because of their high translation quality, which is also highlighted in [1].

We perform a comparative study of sentence-BERT models along with FastText models and monolingual [15,16] and multilingual [19] BERT models for both Hindi and Marathi. The models are compared based on their classification accuracy scores and embedding similarity scores. Classification categorizes a set of data into respective classes and is used to evaluate the discriminative capability of sentence embeddings. The embedding similarity score captures the capability of the embeddings to compute the semantic relatedness of sentences. The sentence embeddings are evaluated on real classification datasets to ensure that the sentence-BERT models do not overfit the noise in the synthetic corpus. We show that SBERT models trained on the translated corpus perform competitively on classification datasets, thus indicating the high quality of the translations.
Our primary observations and contributions are as follows:
• We show that FastText-based sentence embeddings act as a strong baseline and perform better than or competitively with most of the popular multilingual BERT models like IndicBERT, mBERT, and xlm-RoBERTa.
• Among the vanilla BERT models, the monolingual BERT models based on MahaBERT and HindBERT perform best on classification tasks, whereas MuRIL performs best on embedding similarity tasks. Although IndicBERT was shown to have state-of-the-art performance on classification tasks, the embedding quality of this model is very poor. Overall, the zero-shot capability of LaBSE is the best among all the models.
• We study the effect of single-step and two-step training using NLI and STS datasets on monolingual and multilingual base models and provide a comparative analysis of their performance.
• We introduce HindSBERT¹,² and MahaSBERT³,⁴, the sentence-BERT models trained using translated versions of the NLI and STSb datasets, which outperform the other models tested in this paper. These models are fine-tuned versions of HindBERT and MahaBERT respectively. To the best of our knowledge, this is the first documented work on dedicated sentence-BERT models for the low-resource languages Marathi and Hindi.

The rest of the paper is structured as follows. Section 2 explores the research work that compares BERT-based models, and analyzes and suggests ways to improve their performance. Previous work related to the development of sentence-BERT models is also surveyed. Section 3 puts forth the details of the datasets used in this work. Section 4.1 describes the models used, Sect. 4.2 describes the experiment and evaluation setup, and Sect. 4.3 explains the findings from our experiments. We conclude the paper with a summary of all the observations in Sect. 5. This work is released as a part of the MahaNLP project [17].
2 Related Work
BERT [8] is a pre-trained transformer network and one of the most effective language models across NLP tasks such as text classification. However, some shortcomings of BERT have been identified. [22] investigates the deficiency of BERT sentence embeddings on semantic textual similarity and proposes a flow-based calibration that can effectively improve performance. In [24], the authors introduce an attention-based pooling strategy that preserves the layer-wise signals captured in each layer and learns digested linguistic features for downstream tasks. It demonstrates that training a layer-wise attention layer with contrastive learning objectives outperforms BERT and pre-trained language model variants as well.

Previous research [28] has shown how a German monolingual BERT model based on RoBERTa outperforms all other tested German and multilingual BERT models with a little tuning of hyperparameters. Similarly, a Czech monolingual RoBERTa language model has been presented in [30], wherein the authors show that the model significantly outperforms equally-sized multilingual and Czech language-oriented model variants. A similar study has been undertaken for the Marathi language as well. The work in [32] compares the standard multilingual BERT-based models with their corresponding Marathi monolingual versions. It highlights the superior performance and sentence representations of the monolingual variants over the multilingual ones on single-language tasks. The work in [16] presents MahaFT, Marathi FastText embeddings trained on a Marathi monolingual corpus, and shows that it performs competitively with other publicly available FastText embeddings. The sentence embeddings of BERT and LASER for the Hindi language have been evaluated in [14], which reports the sub-optimal zero-shot performance of these sentence embeddings on a variety of Hindi text classification tasks.

In [10], the authors present LaBSE, a language-independent sentence embedding model that supports 109 languages. In comparison to the previous state of the art, the model achieves superior performance on a variety of text retrieval or mining tasks, as well as increased language coverage. In this work, we perform an extensive evaluation of this model. The work in [27] presents Sentence-BERT (SBERT), a computationally efficient BERT fine-tuned in a siamese or triplet network architecture. The authors show that training on NLI followed by training on STSb leads to a marginal improvement in performance. Our work is centered around the SBERT architecture presented in that work. The work in [3] explores Sentence-ALBERT (SAlBERT) and experiments with a CNN sentence-embedding network for SBERT and SAlBERT. The findings show that the CNN architecture improves AlBERT models on the STS benchmark.

1 https://huggingface.co/l3cube-pune/hindi-sentence-bert-nli
2 https://huggingface.co/l3cube-pune/hindi-sentence-similarity-sbert
3 https://huggingface.co/l3cube-pune/marathi-sentence-bert-nli
4 https://huggingface.co/l3cube-pune/marathi-sentence-similarity-sbert
3 Datasets
This section lists the public datasets utilized in our experiments.

IndicXNLI⁵ consists of English XNLI data translated into eleven Indic languages, including Hindi and Marathi [1]. The train (392702), validation (2490), and evaluation (5010) sets of English XNLI are translated from English into each of the eleven Indic languages. From IndicXNLI, we use the training samples of the corresponding language to train HindSBERT and MahaSBERT.

The STS benchmark (STSb)⁶ is a widely used dataset for assessing supervised Semantic Textual Similarity (STS) systems. In STS, sentence pairs are annotated with a score indicating their similarity, with scores ranging from 0 to 5. The data includes 8628 sentence pairs from three groups: captions, news, and forums. It is divided into 5749 train, 1500 dev, and 1379 test pairs [31]. We translate the STSb dataset into Marathi and Hindi using Google Translate for training and evaluating MahaSBERT-STS and HindSBERT-STS. The translated dataset is made publicly accessible⁷.

The downstream evaluation of BERT-based models is performed on the following Marathi and Hindi classification datasets. The number of samples in these datasets is summarized in Table 1:
• L3Cube-MahaSent: A sentiment analysis dataset in Marathi that includes tweets classified as positive, negative, and neutral [20]. The number of train, test, and validation examples are 12114, 2250, and 1500 respectively.
• IndicNLP News Articles: A dataset containing Marathi news articles classified into three categories: sports, entertainment, and lifestyle. The dataset has 4779 records in total, of which 3823, 479, and 477 are found in the train, test, and validation sets respectively [21].
• iNLTK Headlines: A dataset containing Marathi news article headlines from three different categories: entertainment, sports, and state. It has 12092 records, which are divided into 9672 train, 1210 test, and 1210 validation samples.
• BBC News Articles: A Hindi text classification corpus extracted from the BBC news website. It consists of 3466 train and 865 test samples. 500 samples chosen randomly from the train data are used for validation.
• IITP Product reviews: A sentiment analysis set of Hindi product reviews from 3 classes: positive, negative, and neutral. It contains 4181 training, 522 validation, and 522 test samples.
• IITP Movie reviews: A sentiment analysis set of Hindi movie reviews divided into 3 classes: positive, negative, and neutral. It contains 2479 training, 309 validation, and 309 test samples.

5 https://github.com/divyanshuaggarwal/IndicXNLI
6 https://huggingface.co/datasets/stsb_multi_mt
4 Experiments
4.1 Models
A. FastText Models. FastText word embeddings are commonly used for morphologically rich languages. This method extends the word2vec model by representing each word as a bag of character n-grams, which avoids the out-of-vocabulary problem [16]. L3Cube-MahaFT [16] is a FastText model trained on a Marathi monolingual corpus of 24.8M sentences and 289M tokens. FB-FT⁸ is a set of FastText embedding models made available by Facebook, trained on Wikipedia and the Common Crawl corpus [11]. FB-FT is available for both Hindi and Marathi.

7 https://github.com/l3cube-pune/MarathiNLP
8 https://fasttext.cc/docs/en/crawl-vectors.html
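To make the bag-of-character-n-grams idea concrete, the following is a minimal sketch rather than the FastText implementation itself. The function names are hypothetical, the n-gram range of 3 to 6 is assumed from the defaults of the fastText library (it is not a value reported in this paper), and English words are used purely for readability.

```python
from typing import List


def char_ngrams(word: str, min_n: int = 3, max_n: int = 6) -> List[str]:
    """Return the character n-grams of a word, using < and > as boundary markers."""
    marked = f"<{word}>"  # markers let prefixes and suffixes be told apart
    grams = []
    for n in range(min_n, max_n + 1):
        for start in range(len(marked) - n + 1):
            grams.append(marked[start:start + n])
    return grams


def shared_ngrams(word_a: str, word_b: str) -> List[str]:
    """N-grams common to both words: the subword evidence an unseen form can reuse."""
    return sorted(set(char_ngrams(word_a)) & set(char_ngrams(word_b)))


# Trigrams of an example word.
trigrams = char_ngrams("where", 3, 3)  # ['<wh', 'whe', 'her', 'ere', 're>']

# Two inflected forms overlap in their n-grams, so a vector can still be
# composed for a form that never occurred in the training corpus.
overlap = shared_ngrams("playing", "played")
```

Because a word vector is built from the vectors of its n-grams, inflected or rare forms in a morphologically rich language still receive a representation as long as some of their n-grams were seen during training.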
Table 1. Number of Samples Present in Various Hindi and Marathi Datasets

Dataset                   Training   Validation   Test
Multilingual
  IndicXNLI               392702     2490         5010
  STSb                    5749       1500         1379
Monolingual-Marathi
  L3Cube-MahaSent         12114      1500         2250
  Articles                3814       476          477
  Headlines               9671       476          1209
Monolingual-Hindi
  BBC News Articles       3466       500          865
  IITP- Product reviews   4181       522          522
  IITP- Movie reviews     2479       309          309
B. BERT Models. BERT is a deep bi-directional Transformer-based model trained on a large unlabeled corpus [8]. A variety of transformer-based pre-trained BERT models are publicly available. We have explored multiple monolingual and multilingual models in this work. We tried three different pooling strategies for each of these models: CLS embeddings, MEAN embeddings, and MAX embeddings from all tokens. The following are the standard multilingual BERT models that include Hindi and Marathi among their training languages:
• IndicBERT⁹: a multilingual ALBERT model developed by Ai4Bharat and trained on a large volume of data. The training languages include 12 major Indian languages [18].
• xlm-RoBERTa¹⁰: the RoBERTa model supporting numerous languages. It is pre-trained with the masked language modeling (MLM) objective on 2.5TB of filtered CommonCrawl data containing 100 languages [4].
• mBERT¹¹: a BERT-base model pre-trained on 104 languages using the next sentence prediction (NSP) and masked language modeling (MLM) objectives [9].
• MuRIL¹² (Multilingual Representations for Indian Languages): a BERT-based model pre-trained on 17 Indic languages and parallel data [19], which includes the translations as well as transliterations of each of the 17 monolingual corpora.
• LaBSE¹³ (Language-agnostic BERT sentence embedding): the model [10] provides good results when looking for sentence translations. It is trained to output vectors close to each other for pairs of bilingual sentences which are translations of each other.
• LASER¹⁴ (Language-Agnostic Sentence Representations): released by Facebook [12], it provides multilingual sentence representations supporting 90+ languages, including low-resource languages like Hindi and Marathi. It uses a single model to handle a variety of languages and embeds all languages jointly in a single shared space.

The following monolingual models are used for comparison with the multilingual models:

Marathi:
• MahaBERT¹⁵: a multilingual BERT model [16], fine-tuned using 752M tokens from the L3Cube-MahaCorpus and other freely accessible Marathi monolingual datasets.
• MahaAlBERT¹⁶: a Marathi monolingual model [16] extended from AlBERT, trained on L3Cube-MahaCorpus and other public Marathi monolingual datasets.
• MahaRoBERTa¹⁷: a Marathi RoBERTa model [16] built upon a multilingual RoBERTa model and fine-tuned on publicly available Marathi monolingual datasets including L3Cube-MahaCorpus.
• MahaTweetBERT¹⁸: a MahaBERT model [25] fine-tuned on the Marathi Tweets dataset.

Hindi:
• HindBERT¹⁹: a multilingual BERT model fine-tuned on publicly available Hindi monolingual datasets [15].
• HindALBERT²⁰: a Hindi AlBERT model [15] trained on publicly available Hindi monolingual datasets.
• HindRoBERTa²¹: a multilingual RoBERTa model [15] fine-tuned on publicly available Hindi monolingual datasets.
• HindTweetBERT²²: the HindBERT model [15] fine-tuned on Hindi Tweets.
• Sentence-similarity-hindi²³: a sentence-transformer model. It can be used for tasks like clustering or semantic search because it maps sentences and paragraphs to a 768-dimensional dense vector space [29].

9 https://huggingface.co/ai4bharat/indic-bert
10 https://huggingface.co/xlm-roberta-base
11 https://huggingface.co/bert-base-multilingual-cased
12 https://huggingface.co/google/muril-base-cased
13 https://huggingface.co/setu4993/LaBSE
14 https://github.com/facebookresearch/LASER
15 https://huggingface.co/l3cube-pune/marathi-bert-v2
16 https://huggingface.co/l3cube-pune/marathi-albert-v2
17 https://huggingface.co/l3cube-pune/marathi-roberta
18 https://huggingface.co/l3cube-pune/marathi-tweets-bert
19 https://huggingface.co/l3cube-pune/hindi-bert-v2
20 https://huggingface.co/l3cube-pune/hindi-albert
21 https://huggingface.co/l3cube-pune/hindi-roberta
22 https://huggingface.co/l3cube-pune/hindi-tweets-bert
23 https://huggingface.co/hiiamsid/sentence_similarity_hindi
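The three pooling strategies applied to the BERT models (CLS, MEAN, and MAX) can be illustrated with a small sketch. It operates on a toy matrix of token vectors instead of real model outputs, the function names are hypothetical, and skipping padded positions via an attention mask is an assumption, since the paper does not describe how padding is handled.

```python
from typing import List

Vector = List[float]


def _kept(token_vectors: List[Vector], attention_mask: List[int]) -> List[Vector]:
    """Drop padded positions (mask value 0) before pooling."""
    return [vec for vec, keep in zip(token_vectors, attention_mask) if keep]


def cls_pooling(token_vectors: List[Vector]) -> Vector:
    """Use the vector of the first token, which is [CLS] for BERT inputs."""
    return list(token_vectors[0])


def mean_pooling(token_vectors: List[Vector], attention_mask: List[int]) -> Vector:
    """Average each dimension over the real (non-padded) tokens."""
    kept = _kept(token_vectors, attention_mask)
    return [sum(column) / len(kept) for column in zip(*kept)]


def max_pooling(token_vectors: List[Vector], attention_mask: List[int]) -> Vector:
    """Take the per-dimension maximum over the real (non-padded) tokens."""
    kept = _kept(token_vectors, attention_mask)
    return [max(column) for column in zip(*kept)]


# Toy example: 4 token positions, 2 dimensions, last position is padding.
token_vectors = [[1.0, 0.0], [3.0, 4.0], [5.0, -2.0], [9.0, 9.0]]
attention_mask = [1, 1, 1, 0]

cls_vec = cls_pooling(token_vectors)                    # [1.0, 0.0]
mean_vec = mean_pooling(token_vectors, attention_mask)  # [3.0, 0.666...]
max_vec = max_pooling(token_vectors, attention_mask)    # [5.0, 4.0]
```

All three strategies reduce a variable-length sequence of token vectors to one fixed-size sentence vector; they differ only in which tokens contribute and how.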
C. SBERT Models. The Sentence-BERT models are created using translated versions of the STSb and NLI datasets. We experiment with two multilingual base models, LaBSE and MuRIL, and with the monolingual base models HindBERT for Hindi and MahaBERT for Marathi. We experiment with three different setups, evaluating the CLS and mean pooling strategies for each setup:
1. Single-step training of the base model on the IndicXNLI dataset alone [Fig. 2]. In this method, about 256,180 training sentence triplets (anchor, entailment, contradiction) are used for training with the MultipleNegativesRankingLoss function. Training is done for 1 epoch with a batch size of 4, the AdamW optimizer, and a learning rate of 2e-05. The Hindi and Marathi models trained using this setup are termed HindSBERT and MahaSBERT respectively.
2. Single-step training of the base model on the translated STSb dataset alone, where combinations of sentence pairs and corresponding similarity scores are used [Fig. 3]. Training is done for 4 epochs with CosineSimilarityLoss, which trains the network with a siamese network structure. The AdamW optimizer is used with a learning rate of 2e-05. The base models used for training these models are HindBERT and MahaBERT respectively. The network structure for single-step training is illustrated in Fig. 1.
3. Two-step training, where the SBERT models obtained from setup 1 are fine-tuned using the translated STSb dataset for 4 epochs with a batch size of 8 [Fig. 4]. The AdamW optimizer is used with a learning rate of 2e-05, and the loss function used is CosineSimilarityLoss. The Hindi and Marathi models trained using this setup are termed HindSBERT-STS and MahaSBERT-STS respectively. The base models used for training these models are HindBERT and MahaBERT respectively.

Fig. 1. Network Structure of Single-Step Trained Sentence-BERT
Fig. 2. Training the BERT Base Model on the IndicXNLI Dataset alone

4.2 Evaluation Methodology
We evaluate different monolingual and multilingual BERT models and the sentence-BERT models on their embedding similarity scores and classification accuracies. The embedding similarity score denotes the Spearman's rank correlation between the cosine similarity scores of the model embeddings and the ground truth labels provided in STSb [31]. A high embedding similarity score indicates that the similarities computed from the embeddings agree closely with the benchmark annotations. For the text classification datasets, the sentence embeddings are calculated by passing the text through the respective BERT model or FastText model. The embedding-label pairs are then classified using the K Nearest Neighbours (KNN) algorithm. It is a non-parametric, supervised learning classifier that leverages proximity to classify or predict how a particular data point will be grouped. The distance metric used is the Minkowski distance, the generalized form of the Euclidean and Manhattan distance metrics. A validation dataset is used to compute the optimal value of k. This value of k is then used to compute the accuracy on the test dataset, which is reported in this work.
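The KNN-based classification protocol can be sketched as follows. This is a minimal illustration on toy 2-dimensional "embeddings", not the evaluation code used for the reported numbers: the function names, the candidate values of k, and the toy data are hypothetical, and the Minkowski order p = 2 (Euclidean) is an assumption, since the paper does not state the value of p.

```python
from collections import Counter
from typing import List, Sequence

Vector = Sequence[float]


def minkowski(a: Vector, b: Vector, p: float = 2.0) -> float:
    """Minkowski distance: p=1 is Manhattan, p=2 is Euclidean."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def knn_predict(train_x, train_y, query, k: int, p: float = 2.0):
    """Majority label among the k training points closest to the query."""
    order = sorted(range(len(train_x)), key=lambda i: minkowski(train_x[i], query, p))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def knn_accuracy(train_x, train_y, eval_x, eval_y, k: int, p: float = 2.0) -> float:
    hits = sum(knn_predict(train_x, train_y, x, k, p) == y for x, y in zip(eval_x, eval_y))
    return hits / len(eval_x)


def select_k(train_x, train_y, val_x, val_y, candidates: List[int], p: float = 2.0) -> int:
    """Pick the k with the highest validation accuracy (first one wins on ties)."""
    return max(candidates, key=lambda k: knn_accuracy(train_x, train_y, val_x, val_y, k, p))


# Toy sentence embeddings with sentiment labels (illustrative values only).
train_x = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]]
train_y = ["neg", "neg", "neg", "pos", "pos", "pos"]
val_x, val_y = [[0.5, 0.5], [5.5, 5.5]], ["neg", "pos"]
test_x, test_y = [[1, 1], [4, 4]], ["neg", "pos"]

best_k = select_k(train_x, train_y, val_x, val_y, candidates=[1, 3, 5])
test_accuracy = knn_accuracy(train_x, train_y, test_x, test_y, best_k)
```

Since KNN has no trainable parameters, the resulting accuracy depends only on the geometry of the embedding space, which is what makes it a reasonable probe of embedding quality.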
Fig. 3. Training the BERT Base Model on the STSb Dataset alone
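The two training objectives named in Sect. 4.1 can be written out on toy vectors. This is a sketch under stated assumptions, not the training code behind the released models: the definitions follow our understanding of the losses of the same names in the sentence-transformers library (which the paper does not name explicitly), the scale of 20 and the rescaling of STSb scores from 0-5 to 0-1 are assumptions, and no gradient step or encoder is involved.

```python
import math
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def multiple_negatives_ranking_loss(anchors: List[Vector], positives: List[Vector],
                                    negatives: List[Vector], scale: float = 20.0) -> float:
    """Cross-entropy over scaled cosine scores (NLI triplets, setup 1).

    For anchor i the correct candidate is positives[i]; every other positive
    and every hard negative in the batch serves as a negative.
    """
    candidates = list(positives) + list(negatives)
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [scale * cosine(anchor, cand) for cand in candidates]
        top = max(logits)  # subtract the maximum for numerical stability
        log_z = top + math.log(sum(math.exp(l - top) for l in logits))
        total += log_z - logits[i]
    return total / len(anchors)


def cosine_similarity_loss(pairs: List[Tuple[Vector, Vector]], gold_scores: List[float],
                           max_score: float = 5.0) -> float:
    """Mean squared error between cosine similarity and the rescaled gold score (STSb, setups 2 and 3)."""
    errors = [(cosine(u, v) - g / max_score) ** 2 for (u, v), g in zip(pairs, gold_scores)]
    return sum(errors) / len(errors)


# Toy batch: entailments coincide with their anchors, contradictions point the other way.
anchors = [[1.0, 0.0], [0.0, 1.0]]
entailments = [[1.0, 0.0], [0.0, 1.0]]
contradictions = [[-1.0, 0.0], [0.0, -1.0]]

good = multiple_negatives_ranking_loss(anchors, entailments, contradictions)
bad = multiple_negatives_ranking_loss(anchors, entailments[::-1], contradictions)
sts = cosine_similarity_loss([([1.0, 0.0], [1.0, 0.0]), ([1.0, 0.0], [0.0, 1.0])], [5.0, 0.0])
```

In this toy batch the loss is close to zero when each anchor is nearest to its own entailment and grows large when the entailments are swapped, which is the ranking behaviour the NLI step relies on; the STS step then fits the cosine score to the graded similarity labels.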
We use three methods for training the sentence-BERT models, as described in the previous section. In all three methods, the models are tested on the translated STSb test dataset to examine their accuracies. We compare these SBERT models with monolingual and multilingual BERT models on the quality of the embeddings generated.

Fig. 4. Fine-Tuning the Model Trained on IndicXNLI with Translated STSb Dataset

4.3 Evaluation Results
Classification accuracy and embedding similarity scores have been calculated on three Marathi and three Hindi datasets. The results for the Marathi language are displayed in Table 2, while the corresponding results for the Hindi language are displayed in Table 3. In Table 2, the IndicNLP News Articles and iNLTK Headlines datasets are denoted as Articles and Headlines respectively. In Table 3, the BBC News Articles, IITP Product reviews, and IITP Movie reviews datasets are termed News Articles, IITP-Products, and IITP-Movies respectively. We find that MahaFT has a slight edge over FB-FT for classification tasks, and performs competitively with the monolingual BERT models. The monolingual models have either shown comparable performance or outperformed the multilingual versions on all the datasets. This shows the importance of monolingual training for obtaining rich sentence embeddings. However, models having the highest embedding similarity score do not necessarily have the best classification accuracy. We find that Sentence-BERT generates embeddings of a significantly higher quality than FastText and BERT models. Single-step training on STSb (setup 2) produces significantly better performance than single-step training on the NLI dataset (setup 1), as the test data is STSb. For the base models HindBERT and MahaBERT, mean pooling provides a considerable advantage over the CLS pooling strategy. The difference in accuracies between the two pooling strategies is negligible when LaBSE is used as the base model. Thus, the mean pooling strategy is used for the two-step training process (setup 3). Fine-tuning the NLI pre-trained models using STSb outperforms single-step STSb training. Through fine-tuning, a considerable boost in accuracy is achieved for the MuRIL, MahaBERT, and HindBERT base models. The results obtained by mean pooling are presented in Table 4.
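As a reminder of how the embedding similarity column in these tables is defined (Sect. 4.2), the following sketch computes the Spearman rank correlation between cosine similarities and gold scores. It is an illustration only: the function names and toy vectors are hypothetical, and a library routine such as a standard Spearman implementation would normally be used instead of this hand-written version.

```python
import math
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2.0 + 1.0
        for t in range(i, j + 1):
            result[order[t]] = average_rank
        i = j + 1
    return result


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation is the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def embedding_similarity_score(pairs: List[Tuple[Vector, Vector]], gold: List[float]) -> float:
    return spearman([cosine(u, v) for u, v in pairs], gold)


# Toy sentence-pair embeddings with STSb-style gold scores in [0, 5].
pairs = [([1.0, 0.0], [1.0, 0.0]), ([1.0, 0.0], [1.0, 1.0]), ([1.0, 0.0], [0.0, 1.0])]
gold = [5.0, 3.0, 0.5]
score = embedding_similarity_score(pairs, gold)  # cosine order matches gold order
```

Because only the ranking matters, a model can obtain a high score even when its raw cosine values are compressed into a narrow range, which is one reason this score and classification accuracy need not agree.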
When the three vanilla BERT models of MuRIL, MahaBERT/HindBERT, and LaBSE are compared, we find that LaBSE gives the best performance, followed by MuRIL and then MahaBERT/HindBERT. But after applying the two-step training process over these base models, the resultant sentence-BERT models produced by MahaBERT/HindBERT perform better than those produced by MuRIL and LaBSE.

We take a set of 10 sample sentence pairs, chosen randomly from a news corpus. We use various multilingual and monolingual Hindi and Marathi models to compute the cosine similarity of each sentence pair. The results of this experiment are presented in Table 5 and Table 6. We observe that the difference in cosine similarity scores provided by mBERT, MuRIL, MahaBERT, HindBERT, MahaRoBERTa, and HindRoBERTa is insubstantial, thereby making the results non-intuitive. In contrast, the cosine similarity scores obtained from MahaSBERT/HindSBERT and MahaSBERT-STS/HindSBERT-STS are intuitive and distinguishable. The difference in the scores of two sentence pairs at opposite ends of the spectrum (an exactly similar sentence pair and a completely dissimilar sentence pair) is most evident in the embeddings provided by both Sentence-BERT models, and least evident in the similarity scores of the embeddings provided by MuRIL. This points to the need to apply a suitable normalization method to the MuRIL similarity scores in order to interpret them effectively.

Thus, we present MahaSBERT and HindSBERT: sentence-BERT models trained on the MahaBERT and HindBERT base models respectively. They are trained through the two-step process described in setup 3 above. They give the best performance among all SBERT models of the corresponding language evaluated in this paper. We demonstrate that the Sentence-BERT models made public through this work perform competitively with or better than the presently available alternatives for Hindi and Marathi.

Table 2. Results for Monolingual and Multilingual Marathi Models

Model            Pooling   Embedding Similarity   L3Cube-MahaSent   Articles   Headlines
Fast text models
Facebook FB-FT   AVG       0.53                   0.75              0.99       0.9
L3Cube MahaFT    AVG       0.5                    0.76              0.99       0.92
Multilingual variants
IndicBERT        CLS       0.12                   0.72              0.84       0.73
                 AVG       0.35                   0.76              0.9        0.85
                 MAX       0.36                   0.74              0.92       0.8
xlm-RoBERTa      CLS       0.3                    0.72              0.99       0.82
                 AVG       0.39                   0.74              0.97       0.81
                 MAX       0.35                   0.66              0.92       0.72
mBERT            CLS       0.16                   0.68              0.94       0.72
                 AVG       0.48                   0.7               0.98       0.83
                 MAX       0.49                   0.67              0.95       0.78
MuRIL            CLS       0.3                    0.72              0.99       0.9
                 AVG       0.59                   0.78              0.98       0.91
                 MAX       0.5                    0.74              0.92       0.8
LASER            -         0.62                   0.67              0.93       0.73
LaBSE            CLS       0.7                    0.75              0.98       0.84
                 AVG       0.72                   0.75              0.98       0.84
                 MAX       0.71                   0.73              0.99       0.89
Monolingual variants
MahaAlBERT       CLS       0.27                   0.68              0.83       0.81
                 AVG       0.49                   0.77              0.92       0.86
                 MAX       0.48                   0.73              0.93       0.82
MahaTweetBERT    CLS       0.26                   0.72              0.99       0.92
                 AVG       0.53                   0.79              0.99       0.9
                 MAX       0.5                    0.76              0.95       0.77
MahaRoBERTa      CLS       0.29                   0.66              0.98       0.9
                 AVG       0.55                   0.78              0.99       0.88
                 MAX       0.51                   0.72              0.96       0.83
MahaBERT         CLS       0.27                   0.74              0.99       0.9
                 AVG       0.55                   0.78              0.98       0.91
                 MAX       0.52                   0.76              0.95       0.84

Table 3. Results for Monolingual and Multilingual Hindi Models

Model                      Pooling   Embedding Similarity   News Articles   IITP- Products   IITP- Movies
Fast text model
Facebook FB-FT             AVG       0.45                   0.67            0.62             0.45
Multilingual variants
IndicBERT                  CLS       0.15                   0.43            0.59             0.47
                           AVG       0.34                   0.48            0.63             0.52
                           MAX       0.37                   0.47            0.61             0.53
mBERT                      CLS       0.16                   0.55            0.63             0.47
                           AVG       0.48                   0.61            0.65             0.46
                           MAX       0.48                   0.5             0.62             0.5
xlm-RoBERTa                CLS       0.34                   0.64            0.64             0.48
                           AVG       0.46                   0.61            0.64             0.48
                           MAX       0.44                   0.49            0.56             0.45
MuRIL                      CLS       0.29                   0.67            0.6              0.45
                           AVG       0.54                   0.67            0.67             0.55
                           MAX       0.47                   0.45            0.65             0.51
LASER                      -         0.65                   0.54            0.62             0.5
LaBSE                      CLS       0.7                    0.64            0.67             0.5
                           AVG       0.72                   0.66            0.68             0.48
                           MAX       0.71                   0.67            0.67             0.5
Monolingual variants
HindAlBERT                 CLS       0.2                    0.46            0.6              0.48
                           AVG       0.44                   0.56            0.65             0.5
                           MAX       0.47                   0.53            0.66             0.52
HindBERT                   CLS       0.25                   0.67            0.61             0.46
                           AVG       0.52                   0.7             0.69             0.53
                           MAX       0.5                    0.59            0.68             0.51
HindTweetBERT              CLS       0.18                   0.48            0.64             0.49
                           AVG       0.53                   0.66            0.7              0.54
                           MAX       0.51                   0.53            0.61             0.5
HindRoBERTa                CLS       0.22                   0.59            0.6              0.49
                           AVG       0.53                   0.69            0.66             0.55
                           MAX       0.54                   0.59            0.64             0.47
sentence-similarity-hindi  CLS       0.82                   0.62            0.75             0.54
                           AVG       0.82                   0.63            0.72             0.52
                           MAX       0.8                    0.65            0.67             0.58

Table 4. Results of SBERT Models

Marathi SBERT
Training setup       Base model   Embedding Similarity   L3Cube-MahaSent   Articles   Headlines
One-step (NLI)       MuRIL        0.74                   0.8               0.98       0.85
                     LaBSE        0.76                   0.79              0.99       0.82
                     MahaBERT     0.77                   0.8               0.98       0.88
One-step (STS)       MuRIL        0.77                   0.74              0.99       0.89
                     LaBSE        0.83                   0.78              0.99       0.82
                     MahaBERT     0.8                    0.79              0.98       0.92
Two-step (NLI+STS)   MuRIL        0.81                   0.79              0.98       0.88
                     LaBSE        0.83                   0.78              0.99       0.89
                     MahaBERT     0.83                   0.79              0.99       0.9

Hindi SBERT
Training setup       Base model   Embedding Similarity   News Articles   IITP- Products   IITP- Movies
One-step (NLI)       MuRIL        0.74                   0.7             0.7              0.7
                     LaBSE        0.75                   0.64            0.75             0.56
                     HindBERT     0.77                   0.69            0.75             0.53
One-step (STS)       MuRIL        0.79                   0.65            0.65             0.65
                     LaBSE        0.83                   0.65            0.73             0.55
                     HindBERT     0.82                   0.68            0.68             0.48
Two-step (NLI+STS)   MuRIL        0.83                   0.69            0.72             0.49
                     LaBSE        0.84                   0.65            0.74             0.56
                     HindBERT     0.85                   0.68            0.74             0.5

Table 5. Cosine Similarity Scores of Various Marathi Model Embeddings

Table 6. Cosine Similarity Scores of Various Hindi Model Embeddings
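The remark in this section that MuRIL similarity scores need a normalization step before they can be interpreted can be illustrated with min-max rescaling. This is only one possible choice: the paper does not specify a normalization method, the function name is hypothetical, and the scores used here are invented for illustration and are not taken from Table 5 or Table 6.

```python
from typing import List, Sequence


def min_max_normalize(scores: Sequence[float]) -> List[float]:
    """Rescale scores linearly so the smallest maps to 0 and the largest to 1."""
    low, high = min(scores), max(scores)
    if high == low:
        # All scores identical: no spread to recover.
        return [0.0 for _ in scores]
    return [(s - low) / (high - low) for s in scores]


# Hypothetical cosine scores from a model whose similarities all fall in a
# narrow band, ordered from a dissimilar pair to a near-identical pair.
raw_scores = [0.90, 0.95, 1.00]
rescaled = min_max_normalize(raw_scores)  # approximately [0.0, 0.5, 1.0]
```

Rescaling of this kind leaves the ranking of sentence pairs unchanged, so it does not affect a rank-based metric such as the embedding similarity score; it only spreads the values so that similar and dissimilar pairs become easier to tell apart by eye.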
5 Conclusion
In this work, we present a simple approach to training sentence-BERT models for low-resource languages using a synthetic corpus. We have evaluated these models using a KNN-based classification setup and an embedding similarity method on different Hindi and Marathi datasets. Various FastText models and pre-trained multilingual and monolingual BERT models have also been evaluated. FastText models are found to perform competitively with monolingual BERT models, while the monolingual BERT models outperform the multilingual ones. Without any task-specific fine-tuning, the LaBSE model is found to perform the best for both Hindi and Marathi. We highlight the lack of Hindi and Marathi sentence-BERT models in the public domain and hence release MahaSBERT and HindSBERT, the sentence-BERT models created using synthetic datasets. Through a comparative analysis of their performance, we show that these Sentence-BERT models produce higher-quality embeddings than all the BERT and FastText models evaluated. They are highly advantageous for the task of semantic sentence similarity. We conclude that two-step training proves effective for developing MahaSBERT-STS and HindSBERT-STS. Finally, we hope that our work facilitates further study and trials in the Hindi and Marathi NLP domains.
Acknowledgments. This work was done under the L3Cube Pune mentorship program. We would like to express our gratitude towards our mentors at L3Cube for their continuous support and encouragement.
Design and Development of a Mobile Robotics Module with ROS/Gazebo Twin for Flexible, Adaptive Hands-On Learning: In-Campus/Remote/Hybrid
Mario Mata and Ryan M. Gibson
Glasgow Caledonian University, Glasgow G4 0BA, UK
[email protected]
Abstract. Robotics is a strongly interdisciplinary, exciting field, common in Engineering degrees. Setting up a suitable lab environment to accommodate practical work for a number of students is a demanding and costly aspect of any Robotics module. Shifting from classic in-campus learning towards fully remote or hybrid models adds a further challenge. Rigid hybrid models, which require students to take specific parts of the module in campus (usually the practical contents), cannot adapt to students who are unable to come to campus on unpredictable weeks due to illness or work commitments. Ideally, all module contents, including the full practical, should allow a flexible learning modality, while keeping the remote version of the practical as close as possible to the in-campus one. This paper, based on six years of module development experience, discusses ideas that can be helpful for lecturers developing a new Robotics module or adapting an existing one for flexible hybrid learning. The Robot Operating System (ROS) has proven to be a highly flexible and effective tool in this context. NVidia’s “robotics teaching kit” (Jet robots) that we use for the practical presented a number of limitations, but once these were fixed it has proven to be a capable platform; other robotic platforms can be used instead as long as they satisfy certain requirements. Finally, the Gazebo physics simulation engine allows a dual real/simulated environment to be used. Combined with careful matching of theory and practical contents, our approach provides flexibility for in-campus or remote learning on any delivery week to support students (or lecturers) facing unexpected attendance issues.
Keywords: Robotics · Adaptive Hybrid Learning · Module Development · ROS · Gazebo · NVidia Jet
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023
K. Arai (Ed.): SAI 2023, LNNS 739, pp. 1200–1218, 2023. https://doi.org/10.1007/978-3-031-37963-5_83
1 Introduction
Robotics is a vast and multidisciplinary area, based on specialized hardware controlled by software, underpinned by fundamental mathematics, electro-mechanics, instrumentation, control, communications, programming, computer vision and artificial intelligence [1]. Robotics is also an attractive area for a majority of engineering students. In informal surveys done at the start of the modules, 90% of students taking our robotics module
claim to be interested in this particular area, vs a 70% average interest in other modules taught by the same lecturers. Consequently, the use of Robotics in teaching is common at all educational levels, especially in engineering programs [2]. Furthermore, its appeal makes it a suitable student recruitment tool for higher education centres [3]; in this line, we commonly use our robots for open days and similar recruitment events.
This paper describes the design of a Robotics module aimed at enabling hybrid, adaptive learning, meaning that students can work in-campus or via e-learning on any week of the term as needed while keeping the same practical contents, and can swap between working on the real robots or their simulated twin during any in-campus lab session. We have been developing and teaching an undergraduate “Intelligent Robotics and Mechatronics” module (BEng level 4, equivalent to 10 ECTS) for 6 years now.
All education centres have gone through a time when learning was forced to change from in-campus to remote. Even if a return to the traditional, fully in-campus model might be possible again, it is no longer completely adequate given the increased difficulties that students face in attending the campus regularly:
• More frequent Covid-19-related temporary absence. In 2021/22, over half of secondary students in Scotland had their school attendance impacted by Covid-19, with a stronger impact on those from the most disadvantaged backgrounds [4]. Higher education students are similarly affected by Covid-19, and the number of Scots from the most deprived areas enrolling at Scottish universities reached a new record high in 2022 [5]. An undetermined but significant number of higher education students also have carer responsibilities [6]; their chances of attending in-campus sessions are also affected by illness of the people they care for.
• According to the National Student Money Survey 2022 [7], a growing number of higher education students need to take on part-time jobs to help support the cost of their studies; even if the job is usually compatible with the in-campus sessions, work commitments sometimes force students to miss some sessions. Students from disadvantaged backgrounds and deprived areas are affected by this more strongly.
• Travel disruption, more frequent than in previous years due to transport worker illness and frequent strike action.
In this challenging and unpredictable new environment, an adaptive hybrid learning model, designed to support a variable mix of in-campus or e-learning sessions on any given term week, presents clear advantages over more rigid designs. Lessons learned during the development and adaptation of our Robotics module are discussed in this document, in the hope that they are useful for other lecturers developing a new Robotics module or adapting an existing one. In particular, our experience has shown two aspects to be critical for a successful hybrid learning Robotics module:
• Selecting relevant practical work that is feasible in the available lab setup, and intertwining it with the related theoretical contents along the term weeks. This is discussed in Sect. 2.
• A lab setup that i) services a moderate number of students at a time within a limited budget for robotic platforms, and ii) allows taking the practical contents either in-campus or via e-learning in any week of the term, to support students facing
unpredictable attendance issues. We refer to this as adaptive hybrid, in contrast with rigid hybrid approaches that use a fixed split between tasks that need to be done in-campus and tasks that can be done remotely. This is covered in Sect. 4.
The adaptive hybrid approach has the additional advantage of optimizing the use of the hardware robots during in-campus practical sessions, reducing the number of robots needed, as discussed in Sect. 4.4. Using the Robot Operating System (ROS) is a very common approach to teaching robotics [8] and perfectly supports a hybrid approach. Finally, selecting a suitable mobile robot platform is also a crucial decision, discussed in Sect. 3.1. We selected the “Jet Robotics teaching kit” [9], proposed by NVidia at the time when we started developing the module (2016) and intended to work under ROS [10]. The proposed Jet platform presented a number of issues (discussed in Sect. 3.2), but once these were fixed, it has proven to be a suitable choice. Some of those limitations only showed up during lab operation with multiple students, and should be considered when choosing any robotic platform.
2 Module Design
This section summarizes some design tips that have worked well in our module. They are the result of evolving the module over the years based on feedback received from the students along the way.
2.1 Practical Work Organization and Assessment
According to the revised Bloom’s taxonomy [11], comprehensive learning requires students to apply the acquired knowledge, analyse how it works, evaluate its operation and finally use it to create new things. Accordingly, the SCQF10 specification [12] requires students to apply knowledge, analyse critically and demonstrate creativity. Consequently, the module’s practical plays a fundamental role, and special care must be given to its design and implementation. Ensuring coverage of all the levels in the revised Bloom’s hierarchy is easier when the module coursework is organized in two components. The first component is closely guided using lab booklets, helping to set up the foundations and focusing on the lower Bloom’s levels; the second component is open-ended, working towards the higher Bloom’s levels.
Coursework 1 (CW1)-Lab Booklets. CW1 spans the first half of the module. It consists of weekly labs guided by a booklet. The booklet provides directions that can be followed either remotely or in-campus.
• Those labs are carefully matched with the theory units in the weekly planning, and are intended to apply the most relevant concepts covered in each unit (Table 1). This intertwining reinforces fundamental concepts and encourages students to keep up to date with the contents.
• Having the flexibility to take labs either in-campus or, when needed, remotely helps to keep the labs synchronized with the theory contents.
Each lab booklet requires the student to answer a few key questions about the lab work (understand, analyse and evaluate), and submit the answers every two weeks. Those
small student submissions are then marked, and detailed feedback is provided before the next hand-in. As this happens relatively early in the module, students can use this feedback to improve future hand-ins, and to find out which concepts they should review. Relevant, frequent, and close-in-time feedback is a key aspect of student learning [13]. It also improves student engagement and confidence [14]. Additionally, a “Lab reflection” section in the lab booklets collects feedback and improvement suggestions from the students that the module team can use to improve the module.
Coursework 2 (CW2)-Mini-Project. CW2 is done during the last 5 weeks of the module, under a project-based learning approach [15, 16]. Students are required to apply their practical experience and the concepts covered in the weekly units to implement a specific robotic application. They can choose from a list of proposals, or propose their own mini-project (which the lecturer then adapts in difficulty and time scale). It requires students to create a solution to an unseen problem. Guidelines with the minimum targets required for a pass are given, but students are encouraged to develop further functionalities (and are given marks for it). Students work in groups of 2–3 for this mini-project; as Robotics combines different knowledge areas, combining the particular abilities of several students boosts the overall result. Group work is also a proven tool to develop inter-personal skills [17]. For assessment, each group is required to submit a report detailing their solution, submit their code, and perform a live demo in the final week that includes a Q&A part where the lecturer can probe individual student knowledge.
If students are working fully remotely on that component, making the Q&A unsuitable, the mini-project is done in groups but each student is then required to produce their own final version of the mini-project, with an individual report and an individual pre-recorded video demo replacing the live one.
2.2 Theory Contents
The organization of the theory units is synchronized with the practical contents, especially during CW1 (first half of the module). Concepts required for the lab sessions are scheduled earlier in the module planning; more general contents, not strictly required to start working on the practical, are delayed to the last weeks. This facilitates a good match between theory and practice along the module. Students appreciate this, and often bring it up in the qualitative part of the formal end-of-module satisfaction surveys. Table 1 summarizes the module contents; note the intertwined theory and practical contents along CW1, with a small lab-related hand-in every two weeks.
For remote learning, theory units are available as video units and as alternative pdf documents with comments; some students prefer audio-visual input, while others deal better with written materials. For in-campus sessions, students are encouraged to work on the online contents beforehand, so that the live session can focus on answering student questions and going over the fundamental ideas. Live sessions can be streamed or recorded if necessary using the AV resources in the classroom.
Table 1. Module Structure and Contents

Week(s) | Theory | Practical
1 | Module introduction to robotics | Lab introduction. Linux VM, C++ example
2–3 | Computer vision for robotics | CW1 lab: image processing with OpenCV (on images taken from the Jet robots)
4 | Introduction to ROS | CW1 lab (part 1): Working with the Jet robots under ROS
5 | Mobility. Differential drive model. Kinematics | CW1 lab (part 2): Sense and avoid
6 | Poses. Frames of reference. Odometry | CW1 lab (part 1): Odometry from wheel encoders
7 | Closed loop control | CW1 lab (part 2): Moving to relative goals
8 | Path planning. A* | CW2 lab: work on mini-project (continues through week 12)
9 | Localization and mapping |
10 | Kalman filter |
11 | Sensors for robotics |
12 | Actuators for robotics |
3 The Practical: Hardware Platform Considerations for Hybrid Teaching
3.1 General Considerations
The cost of the robots is often the most important limiting factor. The number of robots required depends on how many students will work in the lab simultaneously (plus one or two extra robots as spares). Other aspects to take into account are usability, functionality, and expansion possibilities. Ready-made robotic platforms are usually expensive for the limited budget of most university departments. A common alternative in higher education is to reduce costs by purchasing the required parts and building custom robots instead, shifting the investment to the person-hours required for building and setting up the robots, and for their ongoing maintenance. As a bonus, this custom approach makes it easy to improve or increase functionality by incrementally adding new components as new budget becomes available.
In a hybrid practical, students will be working on the same task either with the real robots or with a remote learning setup and, ideally, swapping between the two as required (adaptive). Further considerations then become relevant to allow this operation. Firstly, the robots in use need to be simulated in a suitable virtual 3D environment (Sect. 4). In a classic, rigid hybrid practical, real robots are used for some parts of the practical (which need to be carried out in campus), and a simulation of the robot (or of a different one) is used for other parts of the practical. A fully flexible, adaptive hybrid learning setup requires using the same real and simulated robot model operating on the same robotic task, to enable swapping from one to the other.
Two additional considerations that become relevant for a hybrid setup are how the robots can be interfaced with, and what data is available outside the robot during its operation.
Robots Interfacing. A robot that needs an attached keyboard, screen and mouse to develop any program is not convenient for a teaching setup, because it will be unavailable to other students most of the time and the peripherals need to be attached and detached for testing. A setup that allows developing the programs detached from the robots is much more efficient. Most robots already work this way: a program is written on a lab desktop, then transferred into the robot for testing. However, using ROS goes one crucial step further. The modularity and distribution provided by ROS remove even the need to transfer the program to the robot, as the newly developed node (or nodes) can run directly on the lab desktop used for development and join the ROS Master running in the robot of choice via Wi-Fi, making testing very convenient. Furthermore, if a twin simulated environment is made available, students can develop their nodes (which takes most of the time) in the simulator and then test the operation on any available robot. A full student group can then work with fewer robots, increasing the efficiency of the lab setup and reducing the cost.
Access to Real-Time Operation Data. From a teaching perspective, accessing the data flowing in the robot in real time allows checking for correct operation and observing behaviours that would be hard to spot otherwise. ROS provides continuous access to all data flowing in the robot’s ROS network while the robot is operating, by manually subscribing a terminal on a desktop to the desired data topic. The lectern’s desktop can also access the robot in operation, allowing relevant data to be displayed on the room’s overhead projector for discussion (and shared online with remote students).
There is a broad range of mobile robotic platforms to select from; please refer to robots.ieee.org for an overview [18]. Lego EV3 robots deserve special mention, as they are possibly the most common choice for schools and colleges; they are relatively cheap, robust, and integrate some common sensors. A limiting factor is the access to real-time data, although Matlab/Simulink integration is now bridging this gap [8, 19]. However, the limited variety of available sensors is an issue. This is also the main issue with other low-cost alternatives [8]. A custom robot platform, on the other hand, offers maximum flexibility at a reasonable cost. In our view, for teaching purposes, the exact robotic platform is ultimately less important than having a convenient software/hardware architecture.
3.2 The Jetson TX1 Robotics Kit (Jet Robot)
The robotic platform we adopted is a custom version of the “Robotics teaching kit” proposed by NVidia in 2016 (at the time when we started developing our module). It is based on the Jetson TX1 as main computer (one of the main reasons to adopt this model), plus an Arduino Mega2560 microcontroller for low-level interfacing with the motors and some sensors. The initial set of sensors in NVidia’s proposal comprises encoders, an inertial unit, ultrasound range sensors and a webcam. Our particular implementation of this platform is shown in Fig. 1.
The Arduino communicates with the Jetson TX1 using the ROS library for Arduino [20], serialized over the USB cable. The Jetson is accessed remotely via Wi-Fi, or by directly attaching a mouse, keyboard and screen to it (when the robot is not going to move). The combination of a microcontroller plus a main computer is a common design pattern used in other robotic platforms as well.
Fig. 1. Mobile Robots based on NVidia Jet Teaching kit (as of 2022).
The NVidia Jet robotics kit (2016) required the following development work before it could be used in our module, fixing issues that arose during practical operation in a typical university lab environment:
• All parts, down to nut-and-bolt level, have to be purchased from different suppliers, as NVidia only sells the Jetson board. A supplier list is soon outdated, and new suppliers need to be found for many parts.
• The Jetson TX1 is not intended to be battery-operated. It will not boot up from a battery due to a built-in power-brownout-check circuit; it has to be ‘fooled’ by soldering a large external capacitor to the Jetson power input connector pins.
• The provided Arduino code is missing the integration with the inertial measurement unit (IMU), and a logic-level adapter needed between the Arduino and the IMU is not listed in the components list.
• Wheel encoders were connected using only one of the two quadrature lines. Therefore, it was not possible to determine the direction of rotation, and encoder counts always increased regardless of the wheel’s direction. The fix requires connecting the second quadrature line to an additional digital input for each encoder and updating the Arduino code.
• The provided controller for the differential-drive motor configuration has no closed-loop wheel speed control. This causes issues in very low-speed operation, as motor asymmetry and dead-bands cause unreliable motion. A PID control loop using the encoders for feedback has been added to ensure that each wheel rotates at the requested speed.
• Under a DHCP Wi-Fi, a screen and mouse need to be attached to the TX1 just to find out the IP address assigned to the robot (needed to join its ROS Master). An LCD screen has been added to display the IP address after booting up (and custom messages during the robot’s operation).
• The TX1 power could not be swapped from battery to wall supply without shutting it down. A buck/boost DC converter has been added to supply all components from either battery or wall supply, allowing hot swapping between them without rebooting the TX1.
Most of those issues have been either fixed or avoided in the new Jetson Nano-based robotic platform currently promoted by NVidia, JetBot [21]. It is also cheaper and more compact, but incorporates fewer sensors (it is mostly focused on image processing), and its expansion possibilities and flexibility are quite limited compared to the TX1-based Jet robot.
In any case, once those issues were fixed (and other details added for convenience), the Jet robot has proven to be a very versatile platform. The microcontroller/Jetson combination provides connectivity for nearly anything. It offers plenty of room to add new on-board devices (for instance, RPLidar A1 units are being added at the time of writing), while not being so heavy or large as to be inconvenient to handle. The motors are powerful enough to carry considerable weight; for instance, a small robotic arm could be added to this platform if desired. Finally, it looks cool, so the Jets are also used for demos on open days and other promotional events.
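The encoder issue listed in Sect. 3.2 can be illustrated with a short example. With a single quadrature line only edges can be counted, whereas with both lines the order of the transitions reveals the direction of rotation. The following Python sketch is an illustration under our own assumptions, not the Arduino code used on the Jet robots; the names `QuadratureCounter` and `single_line_count` are hypothetical, and which direction counts as positive is an arbitrary convention.

```python
# Illustrative quadrature decoder (hypothetical names; not the authors' Arduino code).
# The encoder state is the 2-bit value (A << 1) | B. With both lines connected,
# the order of the transitions tells the direction; which direction counts as
# positive is a convention chosen here for the example.
_STEP = {
    (0b00, 0b01): +1, (0b01, 0b11): +1, (0b11, 0b10): +1, (0b10, 0b00): +1,
    (0b00, 0b10): -1, (0b10, 0b11): -1, (0b11, 0b01): -1, (0b01, 0b00): -1,
}


class QuadratureCounter:
    """Signed tick counter fed with samples of both encoder lines."""

    def __init__(self, a=0, b=0):
        self.state = (a << 1) | b
        self.count = 0

    def update(self, a, b):
        new_state = (a << 1) | b
        # Unchanged states and invalid jumps (both bits flipped) add nothing.
        self.count += _STEP.get((self.state, new_state), 0)
        self.state = new_state
        return self.count


def single_line_count(samples):
    """Count rising edges on line A only; the direction cannot be recovered."""
    count = 0
    previous_a = samples[0][0]
    for a, _b in samples[1:]:
        if a == 1 and previous_a == 0:
            count += 1
        previous_a = a
    return count


# One full electrical cycle in each direction, as (A, B) samples.
forward = [(0, 0), (0, 1), (1, 1), (1, 0), (0, 0)]
backward = list(reversed(forward))
```

Feeding `forward` and `backward` to `single_line_count` gives the same count in both cases, while a `QuadratureCounter` accumulates ticks of opposite sign, which is the information the closed-loop speed control and the odometry need.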
4 The Practical: Software
4.1 Software Environment
There are a number of convenient software solutions for mobile robot simulation, such as ROS + Gazebo [22], Matlab’s Robotics toolbox [23] or Webots [24]. Matlab was used in the first year. Matlab allowed basic concepts to be demonstrated, but it was found inconvenient for two reasons:
• License cost. Unless Matlab is also used for several other modules to spread the cost, it can be prohibitive. Student access is also subject to campus licenses.
• External software dependency. Matlab’s Robotics Toolbox relies on an external Linux virtual machine (VM) running Gazebo as 3D physics simulator and ROS (the Robot Operating System). Once Gazebo and ROS are in play, the need for Matlab becomes marginal.
Our current setup uses Ubuntu with ROS in the Jet robots, while the lab desktops (using Windows) run an Ubuntu VM with Gazebo and ROS. OpenCV is used for computer vision, installed on the robots and on the lab desktops’ VM. Given that all the software used is open source, this VM can be made freely available to students, who can then replicate the lab setup on their own computers. ROS is indeed a very popular choice for teaching robotics [8]. Webots is also an interesting alternative for robotic simulation, since it is also open source, runs natively in Windows, provides a good library of components, and can interact with ROS. However, it lacks the support of a large community that ROS has. For a new development, though, it could be an option to consider as well (instead of Gazebo).
In any case, limiting the student work to a simulated environment can never fully replace a real hardware experience. Some amount of in-person work with real robots is always advisable, as discussed in the results (Sect. 5).
4.2 Exploiting ROS for Adaptive Hybrid Learning
In short, ROS is a middleware providing a communications layer between small programs (nodes) via messages (each following a message type). Nodes interact mainly via a system of “distribution lists” (topics): nodes can publish information to topics, and other nodes can subscribe to them. The ROS Master node keeps the registry that lets publishers and subscribers find each other; every time a node publishes a new message to a certain topic, a callback function is invoked on each node subscribed to that topic, passing on the new message, which creates an event-based operation. Nodes can be programmed in C++ or Python. Topics are mostly used to publish sensor data (each sensor has its associated topic), and to handle velocity commands for the robot (using a topic named /cmd_vel). By design, ROS provides modularity (many small nodes instead of a massive single program) and distribution (nodes can run on any computer connected to the ROS network). Those two concepts are perfectly suited to remote/hybrid teaching, as discussed next.
Modularity, for teaching, allows providing students with a robotic setup that already implements the basic functionalities required for each practical session, so that students can focus on developing their own nodes to implement the target behaviours for that practical. For instance, when learning about odometry, students are required to calculate it by integrating the robot’s kinematics from velocities obtained from the wheel encoders (already available in topics): students can focus just on the odometry concept. In a later lab, odometry is already available in a topic (using either the student implementation or the ROS default one), so students can now focus on a different concept that depends on the odometry (like local navigation to goals).
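The odometry task mentioned above can be outlined in a few lines. The following Python sketch assumes a standard differential-drive model and is not the code used in the module; `WHEEL_BASE` is a placeholder value rather than the Jet robot’s real dimension, and the function names are hypothetical. In a ROS node, the update would typically be called from the callback of the topic carrying the wheel velocities.

```python
import math

WHEEL_BASE = 0.30  # distance between wheels in metres (placeholder value)


def integrate_odometry(pose, v_left, v_right, dt, wheel_base=WHEEL_BASE):
    """Advance pose (x, y, theta) by one time step of a differential-drive robot.

    v_left and v_right are the linear speeds of each wheel in m/s, for example
    derived from encoder counts; dt is the elapsed time in seconds.
    """
    x, y, theta = pose
    v = (v_right + v_left) / 2.0             # linear speed of the robot centre
    omega = (v_right - v_left) / wheel_base  # angular speed (rad/s)
    # Midpoint rule: translate along the heading at the middle of the step.
    heading = theta + omega * dt / 2.0
    x += v * math.cos(heading) * dt
    y += v * math.sin(heading) * dt
    theta += omega * dt
    # Keep the heading within [-pi, pi].
    theta = math.atan2(math.sin(theta), math.cos(theta))
    return (x, y, theta)


def run(pose, v_left, v_right, dt, steps):
    """Apply the same wheel speeds for a number of steps."""
    for _ in range(steps):
        pose = integrate_odometry(pose, v_left, v_right, dt)
    return pose
```

Equal wheel speeds move the pose along a straight line, and opposite speeds rotate it in place; because the pose is obtained by accumulation, any error in the measured wheel speeds also accumulates over time.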
Modularity helps isolate the task at hand, focusing on a small node instead of a huge program. Replacing student nodes with curated ones in later practical sessions prevents errors in a previous task from being carried over to more advanced tasks.
Distribution is especially convenient for hybrid teaching. The robot can run the basic nodes on its on-board computer, including the ROS Master node. Students can then implement new nodes on the lab desktops that interact with the robot’s ROS Master via the common Wi-Fi (using the robot’s IP address), eliminating the need to program while physically connected to the robot, or even to transfer the code into the robot. Distribution is key for adaptive hybrid learning, as it also allows working in exactly the same way with the simulated environment or with the real robot. The lockdown during the Covid-19 pandemic had a major impact on teaching, and the distributed nature of ROS was a blessing in that context. Using Gazebo as 3D-physics simulator, a simulated Jet robot was deployed that reasonably resembles the real robot (Fig. 2; the real robot is shown in more detail in Fig. 1). Students can write their nodes and run them with the simulated robot in Gazebo (for remote learning, or for code development in the campus lab), and exactly the same nodes can run with the real robots in the campus lab as well, as discussed in Sect. 4.3.
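In ROS 1, a node locates its Master through the `ROS_MASTER_URI` environment variable, so the switch between a real robot and a local simulation comes down to changing that variable before the node starts. The following Python sketch illustrates the idea under that assumption; the helper names are hypothetical, 11311 is the default Master port, and the addresses used in the usage note are placeholders.

```python
import os

DEFAULT_MASTER_PORT = 11311  # default port of the ROS 1 Master


def master_uri(host, port=DEFAULT_MASTER_PORT):
    """Build the URI that ROS 1 nodes use to reach the Master."""
    return "http://{}:{}".format(host, port)


def ros_environment(master_host, own_ip=None, base_env=None):
    """Return a copy of the environment pointing ROS nodes at the chosen Master.

    master_host: address of the machine running the ROS Master
                 (127.0.0.1 for a local simulation, or a robot's address).
    own_ip:      optional address of this machine; ROS_IP tells other nodes
                 how to reach the nodes started here.
    base_env:    environment to copy; defaults to the current process environment.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["ROS_MASTER_URI"] = master_uri(master_host)
    if own_ip is not None:
        env["ROS_IP"] = own_ip
    return env
```

For example, `ros_environment("127.0.0.1")` targets a simulation running on the same machine, while `ros_environment("192.0.2.10", own_ip="192.0.2.20")` (placeholder addresses) targets a robot on the lab network. The returned dictionary could be passed as the `env` argument of `subprocess.run` when launching a student node, so that the node itself needs no changes.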
Fig. 2. Virtual Jet Robot in Gazebo 3D-Physics Simulator and Real Jet Robot. The Same Student Code Can Work on Either One by Just Specifying the IP Address of the ROS Master to Interact with (Local or Jet).
Using ROS also makes it easier to upgrade the robots when more budget (and time) becomes available, as new functionality can be integrated by just adding a new node that accesses the new hardware and publishes the new data on an associated topic; future nodes can subscribe to the new topic to use the hardware. For instance, an LCD screen showing the robot’s IP address (needed for remote connections) and custom messages was added before lockdown, together with a custom node that subscribes to a new topic where users write the message to display (available on GitHub [25]). RPLidar A1 units will be added this year, with a node reading from the sensor and publishing the range readings to a new topic, making them available for any purpose.
4.3 Matching Real and Simulated Robot for Adaptive Hybrid Learning
Easy switching between working with the real or the simulated robot is key to setting up an adaptive hybrid practical. It allows students to work with one or the other as needed at any time of the term, and at any time during an in-campus practical session. ROS + Gazebo provide the software infrastructure to do this. The following aspects need to be considered:
Create a Simulated Twin of the Real Robot. Simulated robot models are commonly defined in plain text, using a mark-up language (XML). The mark-up language uses specific tags to define links (3D elements) and joints (joining pairs of links and defining how they can move).
• Links are given physical properties (mass, moments of inertia, collision-detection shapes) required for the physics simulation, plus visual properties (shape, colour/textures) for the 3D rendering.
• Joints emulate the effect of real actuators (rotation/translation), used to create movement.
• The operation of the different sensors needs to be emulated as well, usually by choosing from a library of ready-made simulated sensor plugins.
ROS defines its own mark-up standard, Xacro [26], to allow interoperability between simulated joints and sensors in Gazebo via ROS topics. Note that sensor plugins and actuator joints do not necessarily need a link for their effect to be simulated; however, they should be paired with links “looking like the real thing” to achieve a convincing visual effect, and with realistic physical properties for a correct physics simulation.

Simulate Realistic Effects. Simulated environments are a simplified numerical model of reality. In mobile robotics, for instance, a simulator always knows where the robot and every other object are located in the simulator’s frame of reference; therefore, any simulated range sensor will calculate an exact distance. The simulator also knows the rotation angle or translation of any joint; no encoder is needed to measure the rotation of a wheel. Consequently, an additional effort needs to be made to artificially hide (or degrade) the omniscience of the simulator in order to replicate the limitations of real sensors. With this in mind, we added a “reality node” for the simulated environment that imitates behaviour observed in the real robot. For instance:
• Encoder sensors can be emulated by reading the (exact) rotation angle of the wheel’s joint and, every time it increases or decreases by more than the real encoder resolution, incrementing or decrementing the encoder pulse count in the corresponding ROS topic.
• The real sonar sensors used (HC-SR04) produce readings oscillating by one or two centimetres around the real magnitude, and often output a distance of “0” (indicating a failed range measurement when no strong-enough echo is received), this being more frequent the farther away obstacles are.
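The two effects in the list above can be sketched in a few lines of plain Python, independent of ROS. This is only an illustration, not the actual “reality node”: the function names are hypothetical, the sonar noise uses the Gaussian spread of 2% of the range applied by the reality node, and the linear growth of the failure probability with distance, the 4 m maximum range and the 0.3 maximum failure probability are assumptions made for the example.

```python
import random


def update_encoder_count(count, last_angle, angle, resolution):
    """Convert an exact joint angle (rad) into discrete encoder pulses.

    Returns the updated pulse count and the angle at which the last pulse
    was emitted, so the function can be called again on the next reading.
    """
    # Whole encoder steps travelled since the last pulse; the sign gives the
    # direction, and int() truncates so partial steps are carried over.
    steps = int((angle - last_angle) / resolution)
    return count + steps, last_angle + steps * resolution


def degrade_sonar_range(true_range, max_range=4.0, max_fail_prob=0.3, rng=random):
    """Turn an exact simulated range (m) into a noisy, sonar-like reading."""
    # Failed measurement ("0"): assumed to become linearly more likely with distance.
    p_fail = max_fail_prob * min(max(true_range / max_range, 0.0), 1.0)
    if rng.random() < p_fail:
        return 0.0
    # Gaussian noise centred on the exact range, standard deviation 2% of it.
    return max(rng.gauss(true_range, 0.02 * true_range), 0.0)


if __name__ == "__main__":
    count, last = 0, 0.0
    for angle in (0.3, 0.8, 1.25, 0.9):  # exact wheel angles from the simulator
        count, last = update_encoder_count(count, last, angle, resolution=0.5)
        print("angle", angle, "-> pulses", count)
    print([round(degrade_sonar_range(2.0), 3) for _ in range(5)])
```

Passing a seeded random.Random instance as rng makes the degraded readings reproducible, which is convenient when comparing student results.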
Inside the “reality node”, the sonar behaviour is emulated by degrading the exact calculated range: the output follows a Gaussian distribution with its mean at the calculated range and a standard deviation equal to 2% of it, and random “0” measurements are produced with a likelihood that increases with distance. Other observed realistic effects can be introduced in similar ways.

Match Topic Names. The simulated robot twin should be configured to use the same topic names that the real robot uses. If topic names match, nodes developed by students can work with either the simulated robot or the real one without any change to their code. Students only need to specify the IP address where the desired ROS Master is located: their own machine (127.0.0.1) to access the simulation in Gazebo, or one of the real robots (192.168.137.xxx), as shown in Fig. 2.

During the Covid-19 lockdown period, the only possibility was to use the simulated robot in Gazebo (100% remote learning). Now that labs are back on campus, the availability of both the real robots and the simulated twin on the lab desktops (or on students’ own laptops) results in increased flexibility:
• Working in the lab, students can develop their code on the simulator. Once it behaves as expected, they can then test it on the real robot. This reduces the risk of damage to the robots due to unexpected program behaviour.
• Students can work remotely in the simulator (installing the VM on their computer or accessing remote labs) in their own time, then use the available lab time for testing and fine-tuning on the real robot. This is especially useful during the development of the mini-project in the second half of the module.
• A student who unexpectedly has to miss one lab session can now make up for it remotely in their own time. Code that works in the simulator can be quickly tested on the real robot in the next lab session, or at any time the lab is available.

4.4 A Note on Lab Efficiency

The number of robots available for an in-campus practical session is a limiting factor for lab capacity. Before the pandemic, even with 2 students assigned to work together with each robot, the same lab session had to be repeated several times to accommodate all students in the module, resulting in increased staff and lab space allocation. After introducing the simulated Jet twin, students can now develop and debug their code on the simulator until the program works satisfactorily, which takes most of the lab time; once ready, they try it on the real robot to find and fix any discrepancies. This hybrid approach leaves the robots available for most of the time. In addition, students can test their code on any robot available at the time (just by specifying its IP address), removing the need to allocate a particular robot; students collect any available robot only when they are ready to test, and return it to the common pool when the testing is done. Consequently, a reduced number of robots can now support a larger number of students in the lab, with the corresponding reduction in repeated lab sessions (saving staff time and lab usage). Additionally, if one robot stops functioning during the session, it no longer has a big impact: students can just test their code on another one, and staff can set the faulty robot aside to be fixed later on.
5 Results

Over the last 6 years, the module has evolved, driven by student feedback gathered from CW1 hand-ins (which include a specific question about possible improvements) and from the official end-of-module satisfaction survey, an anonymous survey offered to students at the end of each edition. The lockdown during Covid-19 forced a move from in-campus teaching to fully online teaching (including the practical). During the 21/22 academic year, whilst theory units remained online, labs returned to campus; however, students were very often unable to attend, mostly due to self-isolation requirements (for themselves, or for someone in their care). An adaptive hybrid teaching approach, allowing hot swapping between in-campus and e-learning as needed, was required; the challenging times ahead, with students having to take on longer work commitments to help pay for their education and with unreliable transport, point in that direction as well. This section discusses how the updates in the module design towards the current approach have been reflected in student experience and learning.
5.1 Student Performance

The robotics module has now run 6 times with the same staff, always in trimester B (January start), and module contents have been mostly stable after the 2nd edition (with only minor updates every year), which reduces the number of variables affecting the results. Module assessment consists of two pieces of coursework (CW1 booklet-guided labs, and CW2 open-ended mini-project), as discussed in Sect. 2.2, accounting for 50% of the final mark, plus a 2 h examination that accounts for the remaining 50%. CW1 is assessed through student answers to key questions in the lab booklets; CW2 is assessed with a report and a demonstration of the robot’s operation. Table 2 shows 1st diet pass rates and average marks for each component (CW1, CW2 and the examination, EX) across the different editions; each edition is assigned a “mode” (in-person, online or hybrid) according to the delivery. Edition 19/20 is noted as “transition”, as lockdown happened in week 7 of a 12-week delivery, making it necessary to improvise a replacement for CW2, while CW1 was still carried out in the campus labs.

Table 2. Average Quantitative Results along In-Person, Online and Hybrid Delivery

Edition  Mode        Pass  CW1  CW2  EX   Notes
16/17    In-person   67%   63%  57%  55%  Practical in Matlab
17/18    In-person   89%   83%  65%  68%  Basic Jet robots
18/19    In-person   88%   89%  60%  61%  Improved Jet robots
19/20    Transition  95%   79%  80%  71%  Lockdown week 7 onwards
20/21    Online      90%   65%  59%  60%  Fully online (Gazebo)
21/22    Hybrid      87%   68%  59%  61%  Hybrid Jet/Gazebo
Final module mark averages are within the 55–70% range (classification bands 2.2 – 2.1), in line with or slightly better than those of other modules under the responsibility of the authors. The first thing to point out is a quite consistent pass rate across all editions regardless of the modality, with the exception of the 1st edition, which shows a clearly lower rate; the two main reasons behind this are contents that were not yet consistently organized, and a practical based only on Matlab simulations. Practical marks are also noticeably lower in that edition, resulting in worse preparation for the final exam. Student feedback played a crucial role in the improvements introduced for the second edition.
The second edition (17/18) ran on an improved organization of contents, prioritizing coordination between theoretical contents and lab practices. The basic version of the Jet robots was introduced for the practical (already functional, although still far from convenient to use). Those two actions resulted in a clear improvement in student performance and perception.
The third edition (18/19) introduced additional improvements in the Jet robots. Results for CW1 (guided labs) improved again as a result. In light of this improvement, the mini-projects proposed for CW2 were made more challenging, and the average CW2 result decreased slightly as a consequence.
The fourth edition (19/20) is an outlier due to Covid-19 hitting mid-term, affecting the last lab (CW1 marks slightly lower) and requiring a completely different CW2 just to keep the module going (testing and discussion of an A* path planning algorithm implementation), which turned out to be much easier for the students than an open-ended mini-project. The exam was no longer invigilated and became open-book; to reduce the chances of collusion, recall questions were replaced by open-ended discussion ones, and question order was randomized.
Common to the editions run fully in-person is a clear imbalance between the two coursework components. Students did very well in the guided labs (CW1) but not so well in the open-ended CW2 (mini-project), where they needed to be creative. As a side note, during CW2 students are also under pressure because the deadline for their Hons. Project dissertation is approaching at that time. In any case, CW2 marks are closer to exam marks, which points instead to abnormally high marks in CW1. Action was then taken to make CW1 questions more challenging, trying to prepare students better for CW2 in the hope of increasing CW2 marks and bringing the marks of both components closer together; possibly the only overall effect, however, was lowering the CW1 average mark to levels comparable to the other components, without a significant impact on CW2 performance.
Edition 20/21 was challenging as it ran fully online. The simulated Jet robot in Gazebo discussed in previous sections made it possible to keep mostly the same coursework as in-person, with minimal changes, maintaining consistent CW2 results. The lower average in CW1 can be mostly attributed to the more challenging questions in the lab booklets, as discussed. Further measures were taken in the remote exam to minimize collusion, adding more open-ended questions and producing numerical questions with different values for each student, successfully bringing exam averages back to pre-pandemic levels.
Finally, during the 21/22 edition, the lab delivery intended to be fully in-campus became hybrid by necessity, due to intermittent student attendance caused by frequent self-isolation periods. We think that the adaptive hybrid learning approach we took has worked well, as results appear to be consistent with the online edition, and also with the in-person editions. The next edition (22/23) will be delivered using our adaptive hybrid format as well; student attendance issues will hopefully be reduced, although we are still observing them during the current trimester. In any case, this setup guarantees that students will keep getting a full experience in any circumstance.

5.2 Student Perception. Satisfaction Survey Results.

Student performance shows good and consistent results regardless of the different modalities used across the module editions. Student perception can be measured from the formal end-of-module satisfaction survey, taken anonymously for each module during its final weeks.

Quantitative Analysis. Satisfaction Survey Results. Figure 3 summarizes the module’s satisfaction survey results (on a 0 to 5 scale) in two categories: “Overall module satisfaction” and “Assessment and feedback”. The first one is formally used as the key indicator for any module, while the second one measures student perception about the
assessment used and the feedback received about it (a particularly important point in our School’s improvement plans). The module consistently shows high overall satisfaction numbers, systematically above the School average. Progressive module updates incorporating student feedback and introducing improvements in the lab setup are reflected in the in-person delivery editions, reaching top satisfaction (5.0) in 18/19. Unfortunately, there is no satisfaction survey data for 19/20 (the process was affected by lockdown happening in trimester week 7). The online year, 20/21, shows a clear dip in satisfaction down to 3.9 (a drop experienced across the whole university). Even though the simulations managed to keep the same module contents and obtain consistent results, it is clear that student perception was not as good as in other years. Missing the real robots surely had an impact, and we failed to demonstrate that student code would work on the real Jet robot much as it does on the simulated robot twin. We thought that streaming a live demonstration of someone’s code working on the real robot would have helped, but we had no access to university labs at the time.
Fig. 3. Satisfaction Survey Results in the “Overall Satisfaction” and “Assessment/Feedback” Categories along different Editions (on a 0 to 5 Scale). No Data is Available for 19/20.
In the last edition, 21/22, even though some students barely spent time in the lab (so for them it was nearly a fully online experience) and most students missed more than one lab session, they had the opportunity to work in hybrid fashion on both the simulated and the real Jet robot. They are therefore confident that the robot operation they get in the simulation will match the real one (with some differences clearly identified and discussed). As a result, satisfaction has gone all the way back up to in-person levels. Providing interaction with real hardware appears to be quite important to student perception. Assessment and feedback is also consistently high. As discussed in Sect. 2.1, providing feedback on CW1 (answers to lab booklet questions) adds some work for the lecturer, but provides early feedback when students need it the most. This metric also dipped in online delivery, but to a lesser degree than overall satisfaction.
Qualitative Analysis. The satisfaction surveys, in addition to numerical scales, give students the opportunity to enter comments under “highlights” and “difficulties” sections. For instance, one comment from the 1st year (practical purely in Matlab) describes the module as enjoyable, but regrets not having access to real robots.
Fig. 4. Word Clouds from Satisfaction Survey Student Comments, for In-Person (Left) and Online (Right) Delivery.
Figure 4 presents word clouds from the student comments in satisfaction surveys from the in-person years (left) and the online year (right); there are not enough comments to generate a similar word cloud for the last (hybrid) year. The word “lab” appears as the main one during in-person delivery while it is minor in online delivery: it is hard for the students to perceive the practical work as a “lab” anymore. The role of the lecturer seems to have a bigger relevance in online delivery, but possibly because “lab” is no longer a competing word. Many comments refer to practical work as being fundamental in helping to understand the concepts and in making the module enjoyable. It seems that the remote practical in Gazebo succeeds in keeping the module experience enjoyable for students. For instance:
• I enjoyed the opportunity to explore the ROS and computer vision aspects of the course
• The practical use of an actual robot helps with the understanding of the more abstract topics
• Getting hands on with the Jetson boards helped to greatly improve understanding of topics covered in the lectures
• The hands on practice in the lab was challenging but fun, thoroughly enjoyable module
• The hands on labs, getting to use the actual robots was incredibly engaging
• Using virtual environments to teach the students as opposed to the generic way of using physical items and being there to use them in person was great
Several comments highlight the synchronization of lecture contents with lab practices, recognizing its positive contribution to student learning:
• The practical experience and correlation between the lectures and the lab material was well planned and made relating to the lectures intuitive and easy to follow
• The lectures ran linearly with the labs which helped understanding
• Provides a good linear structure that eases into more complex ideas
• The mini project at the end of the module has been particularly enjoyable as we have all of the knowledge we need to just work away at it on our own and are able
Negative comments over those years have been few, usually highlighting an issue that was then solved for the next edition (such as issues with the early robot platforms, the availability of labs, or difficulties with the installation of the virtual machine). Some example comments from in-person delivery years:
• More timetabled lab time, for completing extra work or catching up.
• Hardware and software availability related to coursework should be extended
For the remote learning years, the availability of the twin simulation has effectively tackled the perception of limited lab time, which will surely be useful for future editions. Remote learning, on the other hand, introduced its own challenges, as reflected in these comments:
• Not being physically in university has been difficult, the distractions of home life are a challenge
• There was no difficulties other than the inconvenience of having to install software
• Near the start of the module the virtual machine used for it broke and needed reinstalled, but this was likely my own fault from accidentally updating the VM
It is expected that adaptive hybridization of the real robots with the Gazebo simulated twins will address those issues, since students will be less dependent on physical lab hours. The possibility of accessing the lab VMs remotely through the university’s Remote Labs setup, for students unable or unwilling to set up the software on their own computers, provides full flexibility in resource access.
6 Conclusions

Robotics is an exciting but demanding area when it comes to teaching, especially the practical. Practical work is clearly a key factor for student learning, and it deserves special consideration if a varying degree of in-person/hybrid/remote delivery is needed. Two good practices that have proven useful in our experience are:
• Using real robots working under ROS, with a simulated twin version in Gazebo designed to be directly compatible with the real robot.
• Intertwining theory and practice during the first half of the module to provide basic practical skills while basic content is reinforced. Early feedback on this practical is valuable for students. In the second half of the module, students work on an open-ended mini-project to apply those skills, plus new concepts covered along the way, towards the implementation of some higher-level robotic behaviour. An adaptive hybrid approach (allowing switching from real robot to simulated twin at any time)
helps keep this intertwined theory/practical structure regardless of unexpected attendance issues during the academic term.
The mix of real robot and Gazebo twin provides a number of advantages for teaching the practical contents:
• Programs (nodes) can be written and run on the lab desktops, then joined to the ROS Master in Gazebo to be tested and fixed; that process takes most of the practical time. Once a node is ready, the student can just join it to the ROS Master on the real robot (via Wi-Fi) for the real tests. There is no need to transfer the program onto the robot. Actual real-robot testing takes a very short time compared to the development time. In addition, students can use any robot available at the time; there is no need to pair students with robots. Therefore, fewer robots can now support a larger lab group for in-campus delivery.
• For hybrid delivery, students can develop nodes with the Gazebo robot twin and then test them on the real robot without even having to recompile the program (just changing the IP address of the ROS Master to connect to). Students can continue working if no robots are available, and test the programs when they are. Any of the lab contents can be undertaken on the simulated or the real robot, accommodating students who unexpectedly have to miss a session; we refer to this as adaptive hybrid, to differentiate it from a rigid hybrid approach where some specific contents need to be done in-campus and some can be done online.
• For fully remote delivery, students can work on their own computers using the supplied VM, or run the VM on the university computers via Remote Labs. A lecturer in the lab can test a student node (already pre-tested on the simulator) with one of the robots and share live video of the resulting operation using the classroom AV setup.
• All data flowing within the robot’s ROS network during the robot operation can be inspected at any time from the lab computers.
This is convenient for testing and analysis, and allows the lecturer to share this data using the lab’s overhead projector for wider discussion. If using remote labs, the lecturer can also stream the real robot data.
In any case, a successful delivery relies heavily on the lecturers, possibly even more so for remote delivery; ROS and Gazebo are tools that facilitate practical learning as close to the real work as possible. Finally, for fully online delivery, students should be made aware that the remote work on the simulated robot twin is largely equivalent to the work with the real robot. This can be done, for instance, by running some samples of student code (developed and sim-tested remotely) on a real robot and sharing live video of it. In our experience, failing to do so can make some students perceive that they are “missing the practical work”.
References
1. Bajd, T., Mihelj, M., Munih, M.: Introduction to Robotics, 1st edn. Springer, Dordrecht (2013)
2. Ozuron, N., Bicen, H.: Does the inclusion of robots affect engineering students achievement in computer programming courses? J. Math. Sci. Technol. Educ. 13, 4779–4787 (2017)
3. Balaguer Alvarez, I.J.: Introduction to robotics: importance of a summer camp as a recruiting tool for future university students. IEEE-RITA 12(2), 71–75 (2017)
4. Covid related pupil absence. https://www.tes.com/magazine/news/secondary/scale-covid-pupil-absences-scotland-revealed. Accessed 3 Oct 2022
5. Gov.Scot. https://www.gov.scot/news/record-number-of-students-from-deprived-areas-at-university-1/. Accessed 3 Oct 2022
6. Runacres, J., et al.: Student carer experiences of higher education and support: a scoping review. Int. J. Incl. Edu., 1–18 (2021)
7. National Student Money Survey. https://www.savethestudent.org/money/surveys/student-money-survey-2022-results.html. Accessed 3 Oct 2022
8. Rosillo, N., Montés, N., Alves, J.P., Ferreira, N.M.F.: A generalized Matlab/ROS/Robotic platform framework for teaching robotics. In: Merdan, M., Lepuschitz, W., Koppensteiner, G., Balogh, R., Obdržálek, D. (eds.) RiE 2019. AISC, vol. 1023, pp. 159–169. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-26945-6_15
9. NVidia Robotics Teaching Kit with ‘Jet’ for Educators (2016). https://on-demand.gputechconf.com/gtc/2016/presentation/s6729-joe-bungo-robotics-teaching-kit.pdf. Accessed 21 June 2022
10. ROS (Robot Operating System) - official pages. https://www.ros.org/. Accessed 21 June 2022
11. Anderson, L.W., et al.: A Taxonomy for Learning, Teaching, and Assessing: A Revision of Bloom’s Taxonomy of Educational Objectives. Pearson, London (2001)
12. SCQF level descriptors. https://www.sqa.org.uk/files_ccc/SCQF-LevelDescriptors.pdf. Accessed 21 June 2022
13. Hattie, J.: Visible Learning: A Synthesis of Over 800 Meta-Analyses Relating to Achievement. SAGE Publications, Thousand Oaks (2009)
14. Gibson, R.M., Morison, G.: Improving student engagement and active learning with embedded automated self-assessment quizzes: case study in computer system architecture design. In: Arai, K. (ed.) Intelligent Computing. LNNS, vol. 285, pp. 327–339. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-80129-8_24
15. Project Based Learning. https://www.edutopia.org/project-based-learning. Accessed 21 June 2022
16. Hernandez-Barrera, A.: Teaching introduction to robotics: using a blend of problem- and project-based learning approaches. In: IEEE SOUTHEASTCON 2014, pp. 1–5. IEEE (2014)
17. What are the benefits of group work? https://www.cmu.edu/teaching/designteach/design/instructionalstrategies/groupprojects/benefits.html. Accessed 21 June 2022
18. ROBOTS – your guide to the world of robotics. https://robots.ieee.org/robots/. Accessed 22 Oct 2022
19. Gonzalez-Garcia, S., et al.: Teaching forward kinematics in a robotics course using simulations: transfer to a real-world context using LEGO Mindstorms™. Int. J. Interact. Des. Manuf. 14(3), 773–787 (2020)
20. ROS library for Arduino. https://www.arduino.cc/reference/en/libraries/rosserial-arduino-library/. Accessed 5 Oct 2022
21. NVidia ‘JetBot’ (2022). https://jetbot.org/master/. Accessed 21 June 2022
22. Gazebo physics simulation – official pages. https://gazebosim.org/home. Accessed 21 June 2022
23. Matlab Robotics toolkit - official pages. https://uk.mathworks.com/help/robotics/. Accessed 21 June 2022
24. Webots. https://cyberbotics.com/. Accessed 21 June 2022
25. LCD IP address node. Github. https://github.com/mmata-gcu/LCD_I2C_ROSnode. Accessed 5 Oct 2022
26. ROS xacro definition. http://wiki.ros.org/xacro. Accessed 5 Oct 2022
Educational System Through Software Robots Vision
Monica-Ioana Vulpe and Stelian Stancu
The Bucharest University of Economic Studies, Bucharest, Romania
Abstract. As nowadays all activities are based on technology, with just a little imagination and desire we can design new ideas aimed at optimizing already existing flows. Looking at the advantages that the evolution of technology makes available to us, we outline its applicability based on already existing solutions in different fields. Among many other technologies whose evolution is indisputable, we also mention robotic process automation. This technology is meant to implement software robots that process large volumes of data, running in monotonous, repetitive workflows. In the educational context, this innovation can be integrated into administrative activities such as management, finance, databases with records of grades, and more. This perspective is meant to optimize both the processing time of these data and the human effort. A software development, once done, requires supervision in the first period of deployment into production, and possibly subsequent extensions of the already existing flows. This paper presents the concept of robotic process automation within the educational system, the possibilities of implementation, and the sources of inspiration in fields where these software robots have already made their place and are increasingly in demand.
Keywords: Robotics Process Automation · Technology · Education
1 The Need of Innovations in the Educational System

1.1 The Current Reality

These days, technology is an essential part of the educational system. From laptops and tablets in the classroom to online resources and apps, technology has become a ubiquitous presence in schools. However, not all technology is created equal. While some technology is designed to be cutting-edge and innovative, other technology is quickly outdated and replaced by newer models. This de facto process of obsolescence is known as “deprecation.”
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 K. Arai (Ed.): SAI 2023, LNNS 739, pp. 1219–1226, 2023. https://doi.org/10.1007/978-3-031-37963-5_84
Deprecated technology can pose a number of problems for educational institutions. First, it can be expensive to constantly upgrade to the latest and greatest technology. Second, deprecated technology may not be compatible with newer platforms and devices, making it difficult to use in a modern educational setting. Finally, deprecated technology often lacks the features and functionality of newer technology, making it less effective as a learning tool. Despite these challenges, educational institutions must find ways to work with deprecated technology. In many cases, this means using older devices and software in new and creative ways. With a little ingenuity, even the most outdated technology can still be used to support learning objectives. As such, deprecated technology does not have to be a detriment to the educational process; instead, it can be seen as an opportunity for creativity and innovation. However, the educational system in many countries is not keeping pace with the times. One major obstacle is the lack of open-mindedness among educators. Too often, teachers and administrators are resistant to change, preferring to stick to traditional methods and curriculum. As a result, students are not being exposed to the latest thinking and research in their fields. Furthermore, another obstacle is the lack of technology in schools. In an age where we can access information from anywhere in the world with a few clicks, it is inexcusable for students to be using outdated textbooks and equipment. A rigid and inflexible educational system has been shown to stifle creativity and innovation among students. In addition, the growing reliance on technology has also led to a number of problems [1]. One of the most significant problems is the digital divide, which refers to the gulf between those who have access to technology and those who do not. This divide has led to a two-tiered system in which some students have access to resources that others do not. 
As a result, the current educational system is failing to meet the needs of all students. We are aware of the obstacles that the system imposes in some places, but we try to find a middle way, with clearly stated objectives. So, looking at the existing problems as opportunities for development and innovation, we will analyze existing business flows in an educational system with the aim of improving the current context with the help of software robots.

1.2 The Contribution of Technology

The field of education is evolving rapidly, and so too are the administrative processes that keep educational institutions running smoothly. One area that has seen significant change in recent years is the introduction of robotic process automation (RPA). RPA is a form of artificial intelligence (AI) that can be used to automate repetitive, time-consuming tasks. This technology has the potential to transform educational administration, making it more efficient and effective.
Educational System Through Software Robots Vision
There are a number of ways in which RPA can be used in an educational setting. For example, it can be used to manage finance administration, timetable scheduling, and student records. RPA can also support teachers with marking and feedback. Additionally, RPA can help to streamline communication between different stakeholders, such as students, staff, teachers, and even parents [2]. For example, if a student's information needs to be entered into multiple systems, an RPA bot can be configured to do this automatically and accurately. The introduction of RPA into educational administration therefore has the potential to benefit everyone involved: it can help to reduce costs and improve efficiency, freeing staff time for more important tasks.
2 Targeted Workflows from Educational System

Within the administrative field of the educational system, there are targeted workflows that could benefit from robotic process automation (RPA) implementations. RPA is an exciting area of development that promises to automate repetitive and rule-based tasks. When evaluating RPA opportunities, it is important to consider not only the potential benefits but also the risks and challenges associated with implementation. In this section, we discuss specific workflows within the administrative field of education that could be ripe for RPA: financial processes such as student financial aid, admissions, and human resources. For each workflow, we identify the potential benefits of automation as well as some of the challenges that must be considered before implementation.

The human resources workflow can be complex and time-consuming, but it often contains simple, repetitive tasks that are ripe for automation, such as onboarding new employees or processing employee expense reports. Automation of this workflow could improve efficiency and accuracy while freeing up staff to focus on more strategic tasks. However, there are also challenges to consider before implementing RPA in human resources. For example, data privacy concerns could arise if sensitive employee information is shared with third-party vendors, and there may be resistance from staff who are worried about job loss. The privacy concern can be mitigated by keeping the data in a local database or on a secure cloud server. Back-up copies can also be used to avoid the possibility of data loss. Realistically, power outages are also possible, so the documents related to the processes can additionally be kept as physical files, and the automated activity can be scheduled at intervals agreed internally.
The student financial aid workflow also presents a good opportunity for RPA due to its high volume of transactions and clear rules governing eligibility. Automation of this workflow could improve accuracy and efficiency while reducing processing time for students [3]. However, there are also some challenges to consider before implementing RPA in financial aid. For example, data privacy concerns could arise if sensitive student information is being shared with third-party vendors, and there may be resistance from staff who are worried about job loss.
M.-I. Vulpe and S. Stancu
The admissions workflow is well suited for RPA due to its high volume of repetitive tasks and well-defined rules. Automation of this workflow could improve accuracy and efficiency while freeing up staff to focus on more strategic tasks. However, there are also challenges to consider before implementing RPA in admissions. For example, RPA may need to interface with legacy systems, and data privacy concerns could arise if sensitive student information is shared with third-party vendors. In the financial area, electronic receipts can be retrieved, the essential data extracted from them, and the results entered into the system automatically. From a technical perspective, these activities fall within RPA development and can use document understanding extensions based on machine learning algorithms and artificial intelligence.

To identify potential opportunities for RPA, administrators can examine their current workflows for tasks that are repetitive, rule-based, and time-consuming. Once potential opportunities have been identified, it is important to assess whether RPA would be a feasible solution. In some cases, it may be more effective to use existing tools or processes rather than implementing RPA. However, when properly implemented, RPA can offer significant benefits in terms of efficiency and cost savings. It is therefore an important consideration for any administrator looking to streamline their workflows.
3 Proof of Concept from the Financial Field

Because the technology is already implemented in other fields of activity, which can serve as proof of concept for our implementation, it is essential to analyze real data, their results, and the possibilities for expanding the targeted developments. From this perspective, integrating RPA into the educational system is flexible enough to support migrating to the online environment as many activities as possible that until now have been carried out physically. Suitable examples are uploading documents, making online payments, and building dedicated platforms for this educational business flow in the virtual environment. Using real data from developments in the banking field, extracting data from physical invoices and entering them into the system, including tax calculations and approval flows, can take a software robot at most 8 min, whereas a human employee needs somewhat longer and does not always complete the task successfully. At the same time, the share of registration activities, document uploads, form filling, and invoice sending carried out online gradually comes to exceed the share carried out physically [4]. Parents, pupils, and students who have these means at their disposal gain confidence once they get used to the new system and come to prefer the online environment over time lost waiting in line, missing documents, travel, and problems encountered on the spot. These complex integrations, which end up covering end-to-end flows in the educational system, become indispensable once they are implemented and accepted by schools.
Technologies suited to such developments include UiPath, Automation Anywhere, Blue Prism, Kofax, Microsoft Power Automate, and many others. The one we chose for the present case is UiPath, not only because of its popularity but also because of the benefits of its integrations. For example, automations can be integrated with the Orchestrator tool, which monitors active processes in production and makes it possible to extract reports and schedule an audit of the entire implemented system. A forecast analysis from the financial field illustrates how flows that customers used to complete physically have moved online, with a gradual increase in interaction with the virtual environment [5]. In the present case, these clients correspond to the students of the institution and their parents. Table 1 shows the number and percentage of executions of some purchase flows in the financial field, split strictly into online and physical channels. From this perspective, we can foresee many opportunities for developing RPA processes that integrate the flows of the educational system that can be moved online.

Table 1. Results of Accessing Processes Online vs Physical in the First Month of Production Phase

No. of process | Title | Total no. of executions | Physical | Online | Physical percentage (%) | Online percentage (%)
1 | Payment of installments | 9,968 | 3,562 | 6,406 | 36% | 64%
2 | Credit payment | 9,258 | 3,648 | 5,637 | 39% | 61%
3 | Confirmation of interest rate change | 9,627 | 4,359 | 5,268 | 45% | 55%
This implementation is meant as an example that can guide the choice to implement a comparable system of online activities. For all three processes, most people chose to complete the recurring activity in the virtual environment. The technology can be customized as usage evolves, especially since these results cover only the first month after the official implementation. The feedback from this integration is therefore positive and guides us towards implementing software robots, integrated with these systems, that automate tasks across the entire business template.
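The percentages in Table 1 can be recomputed from the raw counts. The following is a minimal sketch, assuming that each percentage is the channel count divided by the sum of the physical and online counts and rounded to a whole percent; the names `TABLE_1`, `channel_shares`, and `check_rows` are illustrative and not part of the original work. The check also compares each listed total with the sum of the two channels. For the second process the two differ (9,258 listed versus 9,285 summed), which may be a transcription slip; the listed percentages agree with the summed value.

```python
# Execution counts from Table 1 (first month of production).
# Each entry: (process title, listed total, physical, online).
TABLE_1 = [
    ("Payment of installments", 9968, 3562, 6406),
    ("Credit payment", 9258, 3648, 5637),
    ("Confirmation of interest rate change", 9627, 4359, 5268),
]


def channel_shares(physical, online):
    """Return (physical %, online %) rounded to whole percent."""
    total = physical + online
    return round(100 * physical / total), round(100 * online / total)


def check_rows(rows):
    """Recompute the shares and flag rows whose listed total differs
    from physical + online."""
    report = []
    for title, listed_total, physical, online in rows:
        shares = channel_shares(physical, online)
        consistent = (physical + online == listed_total)
        report.append((title, shares, consistent))
    return report


if __name__ == "__main__":
    for title, (phys_pct, online_pct), ok in check_rows(TABLE_1):
        note = "" if ok else " (listed total differs from physical + online)"
        print(f"{title}: physical {phys_pct}%, online {online_pct}%{note}")
```

Under these assumptions the recomputed shares match the table for all three processes, with the online channel ahead in each case.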
The example used as a guide for choosing such an implementation has as clients a large sample of people whose average age is comparable to that of the parents of students in educational institutions. From this point of view, we do not expect reluctance toward knowledge, innovation, and the use of technology. We consider this association relevant for comparing both the implementation possibilities and the targeted business needs. Since openness to the online environment was considerable from the beginning, we hope that, as time passes and trust increases, the whole process will migrate in this direction. Figure 1 illustrates the major differences between the two environments; their totals are easily compared visually.

From a first, non-technical analysis, there are many different types of integrations common to the finance system and the educational one. The most common and important type is the extraction of data needed for analysis, which can be performed manually or through an automated process. Once the data is extracted, it can be stored in a variety of formats, including text files, spreadsheets, and databases. Another type of technical integration is the transformation of data into a format that is more suitable for analysis, which may involve aggregation, filtering, sorting, or other processes. The results of the analysis can be presented in various ways, including tables, graphs, and charts. In general, technical integrations in finance systems, like those in educational systems, play a vital role in ensuring that data is properly processed and analyzed so that decision-makers can make informed decisions. From a more experienced developer's perspective, we can also identify differences between the example treated here and the environment into which we want to introduce the software robots, and these differences result in considerable advantages.
A finance system integration is the process of connecting financial software applications to share data and transactions. This usually requires the use of an application programming interface (API). An educational system integration, on the other hand, is the process of connecting educational software applications to share data and transactions, which also usually requires an API. There are several key differences between these two types of integration. First, a finance system integration is typically much more complex than an educational one, because there is more data to exchange and the transaction volume is much higher. Second, it is typically much more expensive, because the software applications involved are usually very expensive. Finally, it is typically much more time-consuming, because there are usually many more stakeholders involved in a finance system integration project. Taking all of this into account, we consider the model treated as proof of concept valid for the current project.

As the world becomes increasingly automated, RPA is likely to play an increasingly important role in the future of work. Migration and openness to technology support those of us who analyze business situations and implement software solutions in choosing the best technologies in terms of development, maintenance, and evolution costs, which grow in proportion to demand and to flows whose volume can often be unpredictable. Even if fluctuating volume is not a problem in the educational system, it is good to be cautious and prepared for situations that can be foreseen.

Fig. 1. The Graphic Representation of Table 1 (series: Process 1, Process 2, Process 3; categories: Total, Physical, Online; vertical axis from 0 to 12,000)
4 Conclusions

As these integrations of software robots are based on flows that are oriented towards the virtual environment, it is essential to know the targeted flows and development opportunities, both from a technical point of view and from the point of view of detailed business analysis. Institutions that are able to gain the necessary skills now will be well-positioned to take advantage of this growing industry. Technology has drastically changed the way we live and work. In recent years, we have noticed a growing trend of automation in various industries. Robotic process automation (RPA) is a type of automation that has the potential to have a significant impact on the administrative side of educational institutions. There are many potential benefits of RPA in education, including improved efficiency, accuracy, and freed-up staff time. As the technology continues to develop, we can expect to see even more ways in which RPA can contribute to the business flows of these institutions.

In a rapidly globalizing and increasingly digital world, the ability to effectively use and create software has become an essential skill set. However, the current educational system often fails to adequately prepare students for this reality. As a result, many individuals never develop the confidence or competence to take full advantage of technology. One potential solution is to integrate software robots into the educational system. Students of institutions that adopt such technologies may themselves be interested in getting involved in the implementation process. By giving students the opportunity to learn by doing, software robots could help them develop the skills and knowledge necessary to be successful in a digital world.
As a result, integrating software robots into the educational system could have a significant impact not only on the administrative field or the educational management system but also on preparing students for the future.
We are aware of the high costs that the technology entails, but we consider it worth at least trying a model implementation. The costs of implementing such a system would likely be prohibitive for many schools. Despite these challenges, the vision of educational reform through software robots remains an intriguing possibility. The desire for knowledge, development, and the well-being of employees in as many fields of activity as possible, including education, outweighs the obstacles we are aware of, which we try to prevent whenever we encounter them. Technology has always been a controversial topic in education. Some believe that new technologies are necessary to keep pace with the ever-changing world, while others believe that technology alters the existing system. However, there is no denying that new technologies, RPA among them, have the potential to change the landscape of education. Educational research plays an essential role in identifying and implementing new technologies designed to make the current system more efficient. Education is considered by governments to be one of the strategic areas and thus enjoys special attention. Sponsorship in the first phase, with the aim of gradually making the whole system more efficient, would make a significant difference. In addition, the automation of existing business flows in education can contribute to improving transparency and accountability. All of these factors make automating business flows within the education system a worthwhile endeavor. However, it is important to carefully consider which processes should be automated, to avoid interruptions in training or other negative consequences. When used judiciously, automation can be a powerful tool for streamlining operations within the education system.

Acknowledgements. This paper was co-financed by The Bucharest University of Economic Studies during the PhD program.
References

1. TechTarget. https://www.techtarget.com/whatis/definition/STEAM-science-technology-engineering-arts-and-mathematics. Accessed 8 Jan 2023
2. Sealfon, R.: Automation in Education: Streamlining Your Educational Processes. Age of Awareness (2021). https://medium.com/age-of-awareness/automation-in-education-streamlining-your-educational-processes-c5d1305a0f8d. Accessed 8 Jan 2023
3. AirSlate. https://blog.airslate.com/automation-education-streamline-educational-processes/. Accessed 8 Jan 2023
4. CustomerThink. https://customerthink.com/rpa-the-key-to-an-automated-streamlined-and-cost-effective-supply-chain-management. Accessed 8 Jan 2023
5. PWC. https://www.pwc.in/assets/pdfs/publications/2018/robotic-process-automation-in-a-virtual-environment.pdf. Accessed 8 Jan 2023
Synchronized Colored Petri Net Based Multimodal Modeling and Real-Time Recognition of Conversational Spatial Deictic Gestures

Aditi Singh(✉) and Arvind K. Bansal

Department of Computer Science, Kent State University, Kent, OH 44242, USA
{asingh37,akbansal}@kent.edu
Abstract. Gestures are an important part of intelligent human-robot interactions. Co-speech gestures are a subclass of gestures that integrate speech and dialogs with synchronous combinations of various postures, haptics (touch), and motions such as head, hand, index finger or palm, and gaze. Deictic gestures are a subclass of co-speech gestures that provide spatio-temporal reference to entities in the field-of-vision by pointing at an individual entity or collection of entities and referring to them using pronouns in spoken phrases. Deictic gestures are important for human-robot interaction due to their property of attention seeking and providing a common frame-of-reference by object localization. In this research, we identify different subclasses of deictic gestures and extend the Synchronized Colored Petri net (SCP) model to recognize deictic gestures. The proposed extension integrates synchronized motions of head, hand, index-finger, palm, and gaze (eye-motion tracking and focus) with pronoun reference in speech. An implementation using video-frame analysis and gesture-signatures representing meta-level attributes of SCP is described, and an algorithm is presented. Performance analysis shows that the recall is approximately 85 percent for deictic gestures, and conversational head-gestures are separated from deictic gestures 95 percent of the time. Results show that mislabeling in deictic gestures occurs due to missing frames, feature points, undetectable motions, and the choice of thresholds during video analysis.

Keywords: Artificial Intelligence · Conversational Gestures · Deictic Gestures · Gesture Recognition · Petri Nets · Social Robotics · Synchronization
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 K. Arai (Ed.): SAI 2023, LNNS 739, pp. 1227–1246, 2023. https://doi.org/10.1007/978-3-031-37963-5_85

1 Introduction

It is anticipated that between 2030 and 2050, the world population will start declining and the ratio of the aged population will increase [1]. This change will accentuate the need for labor in activities such as child education, nursing, home care, elderly care, entertainment, and hospitality, as well as catastrophe relief, mining, and space exploration, where human labor will be in short supply and will need to be augmented with social human-like robots [2–6]. This will require effortless human-like understanding and generation of facial expression (including pain), speech, gesture, social awareness, situation awareness, haptics, and difficult body-postures and posture-transitions while balancing [7–17].

There has been considerable progress in motion and balancing, speech generation, and natural language understanding [6, 7]. However, progress in the analysis and generation of gestures and facial expressions, especially real-time in-the-wild recognition, is limited [8]. For example, facial expression analysis is limited to the six basic facial expressions proposed by Ekman and others [7, 8], whereas humans exhibit multiple emotions and pain with different intensities using facial expressions [7–12]. Similarly, the work on gesture generation and analysis is limited mostly to a subset of hand-gesture analysis and lacks synchronized integration of head-motion, hand-motion, and gaze-analysis [8, 16]. The recently developed Synchronized Colored Petri net (SCP) model and conceptual dependency model are promising as general-purpose techniques for gesture recognition and learning [16, 17]. However, their implementation is currently limited to training a robot to recognize head-motion based gestures [17]. Complete gesture analysis requires an integrated analysis of posture, head-motion, hand-motion, fingers, dialog, gaze, speech-tone, and their synchronization [16–25]. Hand-gestures, especially finger and wrist motions, are also involved in other conversational gestures such as exhibiting frustration, action description, spatial modeling, iconic gestures describing entities and space, enumeration, and attributional gestures, which have limited modeling and implementation [16]. Lack of synchronization between motions and speech causes perception problems. Deixis is the alignment of scene and mental structure for social interaction using common referentiality to an entity or concept in the current spatio-temporal context [13, 14].
Deictic gestures are attention-drawing co-speech conversational gestures used in dialogs, which refer to entities in the scene or entities in the visual range [14, 16]. Deictic gestures are important in human-robot interaction because they provide a way for people to point to specific objects or areas in the scene, or to provide additional context for the shared dialog. Deictic gestures are useful for instructing the robot to perform physical actions on objects in the scene or to navigate to a specific location. Deictic gestures facilitate synchronization of motions and speech in dialogs between two actors and enhance comprehension and localization of the entities with precision [14]. Deictic gestures also require integration of referential words embedded in the speech phrases with synchronized hand-motions, eye-motions (including tracking and focusing), and index-finger pointing. Robust deictic gestures are necessary for human-robot collaboration because they point to a common frame-of-reference [26].

Previous research on deictic gestures is broadly classified as: 1) textual analysis of deixis in text and dialogs by cognitive psychologists; 2) vision-based analysis of posture and hand-motion by computational scientists to identify fingers pointing to an entity in the scene. However, in real-time human-robot interaction, a deictic gesture involves correlating dialog analysis with motion analysis to reduce cognitive distortion and disambiguate the reference to the entity in the dialog. Moreover, motions in deictic gestures are expressed in multiple ways, which have not been analyzed by current image-analysis based computational systems.
Synchronization of finger (or palm) motion with the pronouns at the utterance-level in the spoken phrase is important to reduce cognitive distortion of the reference. For a comprehensive recognition of deictic gestures, these different subclasses of expressing deixis have to be modeled and analyzed.

In this research, we incorporate synchronization and correlation between the hand-gesture that points to an entity and the pronoun-level reference in the spoken phrase using an extension of the SCP model [17]. We also identify various subclasses of deictic gestures based upon the different types of motions involved in expressing deixis. The multimodal modeling includes temporal synchronization of head-motion, hand-motion, eye-motion, and index-finger motion with speech. We believe this framework can be generalized to recognize other linguistically correlated co-speech gestures [16]. The major contributions of this research are:

1. Extending the SCP model to recognize spatial deictic gestures;
2. Classification of different types of motions involved in deixis;
3. Correlation and synchronization of a pronoun in the spoken phrase with the corresponding motions in deictic gestures;
4. Integration and synchronization of head-motion, hand-motion, finger-motion, and eye-tracking for deixis.

The paper is organized as follows. Section 2 describes the background concepts. Section 3 describes deictic gesture subclasses. Section 4 describes the related work. Section 5 describes the SCP based modeling of deictic gestures. Section 6 describes an algorithm to label deictic gestures. Section 7 describes an implementation. Section 8 discusses performance analysis. Section 9 describes the conclusion.
2 Background

2.1 Multimodal Synchronization in Gestures

A subset of temporal synchronization types (sequential, concurrent, start-synchronization, end-synchronization, strict-synchronization, and during-synchronization) has been used to model SCP [17, 27]. Consider two tasks $T_1$ and $T_2$ whose start and end times are denoted as $(\tau_1^{\mathrm{start}}, \tau_1^{\mathrm{end}})$ and $(\tau_2^{\mathrm{start}}, \tau_2^{\mathrm{end}})$. In sequential execution, the task $T_2$ executes after the task $T_1$ ($\tau_2^{\mathrm{start}} > \tau_1^{\mathrm{end}}$). In concurrent execution, the tasks $T_1$ and $T_2$ overlap with no rigid constraint. In start-synchronization, the tasks $T_1$ and $T_2$ start within a short imperceptible delay $\delta$ ($\tau_2^{\mathrm{start}} \le \tau_1^{\mathrm{start}} \pm \delta$). In end-synchronization, the tasks $T_1$ and $T_2$ end within a short imperceptible delay $\delta$ ($\tau_2^{\mathrm{end}} \le \tau_1^{\mathrm{end}} \pm \delta$). Strict synchronization comprises both start-synchronization and end-synchronization. In during-synchronization, the shorter task $T_2$ starts after the longer task $T_1$ starts and ends before the end of the task $T_1$ ($\tau_2^{\mathrm{start}} > \tau_1^{\mathrm{start}}$, $\tau_2^{\mathrm{end}} \le \tau_1^{\mathrm{end}}$, $\tau_1^{\mathrm{end}} > \tau_2^{\mathrm{start}}$). This research also uses overlap synchronization [28]. In overlap synchronization, the task $T_2$ starts after the task $T_1$ starts and ends after the task $T_1$ ends ($\tau_2^{\mathrm{start}} > \tau_1^{\mathrm{start}}$, $\tau_1^{\mathrm{end}} \le \tau_2^{\mathrm{end}}$, $\tau_1^{\mathrm{end}} > \tau_2^{\mathrm{start}}$).
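The synchronization types above can be illustrated with a small classifier over two time intervals. This is only a sketch under stated assumptions, not the authors' implementation: the names `Interval` and `classify` are hypothetical; the condition "within a short imperceptible delay" is interpreted as an absolute difference of at most `delta`; and, because several conditions can hold at once, the function applies an assumed precedence (strict, start, end, sequential, during, overlap, then concurrent as the fallback).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    start: float  # task start time, in seconds
    end: float    # task end time, in seconds


def classify(t1: Interval, t2: Interval, delta: float) -> str:
    """Label the temporal relation of task T2 with respect to task T1.

    Assumption: start/end synchronization means the two start (or end)
    times differ by at most delta. The precedence order is also an
    assumption made for this illustration.
    """
    start_sync = abs(t2.start - t1.start) <= delta
    end_sync = abs(t2.end - t1.end) <= delta
    if start_sync and end_sync:
        return "strict"
    if start_sync:
        return "start"
    if end_sync:
        return "end"
    if t2.start > t1.end:
        return "sequential"
    # During: T2 starts after T1 starts and ends no later than T1.
    if t2.start > t1.start and t2.end <= t1.end:
        return "during"
    # Overlap: T2 starts while T1 is running and ends at or after T1's end.
    if t1.start < t2.start < t1.end <= t2.end:
        return "overlap"
    return "concurrent"


if __name__ == "__main__":
    pointing = Interval(0.0, 3.0)   # hypothetical hand-motion interval
    pronoun = Interval(1.0, 1.4)    # hypothetical utterance of "that"
    print(classify(pointing, pronoun, delta=0.05))
```

In the usage lines, a pronoun uttered entirely inside a longer pointing motion is labeled as during-synchronization; the interval values are made up for the illustration.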
2.2 Synchronized Colored Petri Net (SCP)

A Petri net is a token-based directed graph of concurrent events using two alternating types of nodes: places and transitions. An event fires when the tokens inside a place exceed a pre-defined threshold count. The tokens traverse from the source place-node to the destination place-node via a transition-node. Petri nets are folded by coloring similar parts. In modeling 3D motion, the x, y, and z dimensions are modeled as colors.

SCP combines synchronization with colored Petri nets using delays. It integrates synchronization of speech and head-motions to derive non-emotional conversational head-gestures [17]. The coloring models the motions in different dimensions. Places are endpoints of a motion or the relaxed-head for a duration above the temporal threshold of the gesture-boundary [17]. Synchronization can be between two or more motions or between speech and motions [17]. A delay in a transition-node models the motion-time. SCP uses an additional trigger-node to spawn two or more tasks synchronously. The synchronization delay $\delta$, placed as a weight of the edge connecting a trigger-node to the next place-node, models start-synchronization. An end-synchronization is modeled as multiple transition-nodes connecting to the same place-node; the delay $\delta$ is placed on the edges connecting the transition-nodes to the place-node. Repeated motions such as head-shakes or head-nods are modeled using a cycle.

Example 1. Figure 1 illustrates an SCP model of the head-gesture for the conversational gesture interrogation. The gesture involves tilting the head, modeled by movements using two colors: x-dimension and y-dimension. After tilting, the head nods repeatedly while a question is asked. The question and the head-motion start synchronously. There are six place-nodes for the head-positions $\{p_1, p_2, p_{31}, p_4, p_5, p_6\}$ and one place-node $p_{32}$ to model the start of a spoken phrase.
The set of transition-nodes $\{tr_1, tr_2, tr_{31}, tr_{32}, tr_{41}, tr_5\}$ is associated with the head-motions, and the transition-node $tr_{42}$ is associated with the speech. The repeated head-nods while asking a question are modeled using a cycle. After the question is asked, the head-motion and the spoken phrase end using end-synchronization, and the head returns to the relaxed position.

Fig. 1. An Example of Modeling Conversational Head Gestures using SCP
The trigger-node $tr_2$ spawns head-motion and speech using start-synchronization. The speech and head-motion end using end-synchronization. The delay for the start-synchronization is associated with the edges $(tr_2, p_{31})$ and $(tr_2, p_{32})$ and is calculated using the sample rate of video and speech. Similarly, the delay for the end-synchronization is associated with the edges $(tr_{41}, p_5)$ and $(tr_{42}, p_5)$.

2.3 Implementation Model of SCP for Conversational Head Gestures

Bipartite graphs for SCP are realized using matrices. The nodes and edges have multiple attributes such as synchronization, cycles, trigger-nodes, and head-motions. The SCP implementation comprises six components: 1) a signature representing meta-level attributes; 2) a vector of attributes associated with vertices (place-nodes, transition-nodes, and trigger-nodes); 3) a matrix that models the connectivity between the nodes and the delay associated with edges for synchronization; 4) global attributes such as the video sampling-rate, and temporal and spatial thresholds for identifying relaxed-state, place-nodes, cycles, and synchronization; 5) a stillness-vector for motion analysis; 6) a silence-vector for speech analysis [17]. The meta-level signature for each gesture comprises head-motion and its direction, speech, eye-focus, and Petri net parameters such as cycles, synchronization types, and the number of places and transitions.

The attributes associated with a vertex are: 1) type (place-node, transition-node, trigger-node); 2) membership in a cycle; 3) membership in a synchronization; 4) in-degree; 5) out-degree; 6) tokens-threshold to fire in place-nodes. The edge-attributes are associated with each cell of the corresponding matrix. Edge-attributes include connections from a place-node to a transition-node or trigger-node; from a trigger-node to place-nodes for synchronous tasks; and from a transition-node to place-nodes.
The synchronization delays are associated with edges connecting trigger-nodes to place-nodes for start-synchronization, during-synchronization, and overlap-synchronization. Similarly, delays are associated with edges connecting transition-nodes to end-nodes to model end-synchronization, during-synchronization, and overlap-synchronization. Global attributes for thresholds include a spatial threshold to accommodate random motions in a relaxed-state; temporal thresholds for start-synchronization, end-synchronization, and during-synchronization; a temporal threshold for gesture-termination; temporal and spatial thresholds to identify place-nodes; and a spatial threshold to identify the same place causing a repeat motion to form a cycle. A combination of stillness-vectors and the silence-vector is analyzed at different time-steps to derive place-nodes, the relaxed-state, end-of-gesture, and synchronization between speech and motions. After deriving the SCP matrix, the matrix is analyzed to derive meta-attributes (signatures). These signatures are matched with archived signatures in the gesture database to label a gesture in real-time.
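The six components listed above could be held in a structure along the following lines. This is only an illustrative sketch, not the authors' implementation: the class and field names (`SCPModel`, `Vertex`, `delay`, `video_fps`) are hypothetical, the delay matrix stores `None` where no edge exists, and the one-frame delay on the edge (tr2, p32) is an assumed value chosen for the example.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Optional, Tuple


class NodeType(Enum):
    PLACE = "place-node"
    TRANSITION = "transition-node"
    TRIGGER = "trigger-node"


@dataclass
class Vertex:
    """Per-vertex attributes named in the text."""
    name: str
    node_type: NodeType
    in_cycle: bool = False
    in_synchronization: bool = False
    in_degree: int = 0
    out_degree: int = 0
    tokens_threshold: int = 1  # only meaningful for place-nodes


@dataclass
class SCPModel:
    """Hypothetical container for the six SCP components."""
    signature: Tuple = ()                                   # 1) meta-level signature
    vertices: List[Vertex] = field(default_factory=list)    # 2) vertex attributes
    # 3) delay[i][j] is None when no edge i -> j exists, else the edge delay in seconds
    delay: List[List[Optional[float]]] = field(default_factory=list)
    global_attributes: Dict[str, float] = field(default_factory=dict)  # 4)
    stillness_vector: List[int] = field(default_factory=list)          # 5)
    silence_vector: List[int] = field(default_factory=list)            # 6)

    def index(self, name: str) -> int:
        return [v.name for v in self.vertices].index(name)

    def add_vertex(self, name: str, node_type: NodeType) -> None:
        # Grow the square matrix by one column and one row.
        for row in self.delay:
            row.append(None)
        self.vertices.append(Vertex(name, node_type))
        self.delay.append([None] * len(self.vertices))

    def add_edge(self, src: str, dst: str, delay: float = 0.0) -> None:
        i, j = self.index(src), self.index(dst)
        self.delay[i][j] = delay
        self.vertices[i].out_degree += 1
        self.vertices[j].in_degree += 1


# Fragment of Fig. 1: trigger-node tr2 spawns two synchronized tasks.
scp = SCPModel(global_attributes={"video_fps": 30.0})
scp.add_vertex("tr2", NodeType.TRIGGER)
scp.add_vertex("p31", NodeType.PLACE)
scp.add_vertex("p32", NodeType.PLACE)
one_frame = 1.0 / scp.global_attributes["video_fps"]  # assumed delay unit
scp.add_edge("tr2", "p31", delay=0.0)
scp.add_edge("tr2", "p32", delay=one_frame)
```

Keeping connectivity and delay in the same matrix cell mirrors the statement that edge-attributes are associated with each cell of the matrix; a real implementation would likely store further edge attributes per cell.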
A. Singh and A. K. Bansal
2.4 Conversational Co-Speech Gestures

A conversational gesture uses temporal synchronization of speech with motions such as hand-motion, wrist-motion, head-motion, eye-motions and gazes, lip-motion, finger and/or palm motions, and haptics for touch [16]. Synchronized hand-motions are also used to describe spatial gestures such as describing shapes of symmetrical objects, and exhibiting actions such as push, pull, and close [16]. In this research, we are concerned with the modeling and identification of spatial deictic gestures. As described in Sect. 3, a spatial deictic gesture is modeled by a synchronized integration of head-motion, hand-motion, and left and right eye-motion tracking needed for gaze-following, monitoring, and focusing needed for attention-seeking, along with temporal synchronization with speech.
3 Spatial Deictic Gestures and Subclasses

Deictic gestures refer to an entity (or a group of entities) in the scene [13–16]. An entity could be a place, a concrete object, a concept, or a person, including self, in the current spatio-temporal context [14, 15]. An object is pointed at either using a combination of hand-motion and index-finger motion, or by using eye-motion and eye-focus, or using a combination of head-motion followed by eye-focus [28]. Deictic gestures are useful for maintaining synchronization and alignment in a discourse, localizing an object in the scene, and reducing the cognitive burden of verbalizing and comprehending the spoken sentences, both for the interlocutor and the listener [14]. Infants and deaf children learn and communicate with deictic gestures first because their linguistic and articulation skills are not yet developed [29–31]. Deictic gestures are also used when an interlocutor cannot precisely describe an object. Deictic gestures are characterized by referential spoken words such as here, there, this, and that in a spoken phrase, along with a combination of acyclic head-motion, hand and index-finger motions, or eyes pointing towards an entity [13, 14, 31]. Table 1 describes the use of referential pronouns in a spoken phrase and the associated motions.

Table 1. Examples of Spatial Deictic Gestures

No. | Sentence | Referential words | Involved motions
1 | Put candle there | There | head + hand + eye + index finger
2 | Bring the chair here | Here | head-motion + eye-focus + hand-motion + index finger
3 | Help all the students over there | all (group), there | head-motion + eye-focus + hand-motion + circle with index finger
4 | Take this pot over there | this, there | hand-motion + index finger
The spatial and temporal use of the same referent words is complex; the same referent word can be used for spatial as well as temporal annotations [14]. Deictic gestures combine with haptics (touch) to create a composite deictic gesture referred to as a touching gesture [32]. A deictic gesture combines with a metaphoric gesture to create a composite deictic gesture referred to as an exhibit gesture [32]. For example, a mother may point towards an egg and ask a child to eat breakfast.

3.1 Classification of Spatial Deictic Gestures

Deictic gestures have three major classifications: pointing gestures, presenting gestures, and grouping gestures [32]. Pointing gestures point at an object in a scene with an extended index-finger or an extended palm with the corresponding arm raised. The head and eyes move to align the field-of-vision to the referred object or the listener. Hand-motion + index-finger motions may be substituted by a synchronized combination of head and eye motions. Figure 2 illustrates two instances of pointing gestures: eye-based pointing and index-finger based pointing. A presenting gesture involves facing a group of objects to be presented, followed by raising the hand + index-finger and moving the head and hand (or finger) towards the listener or the entities while keeping the hand raised [32]. A presenting gesture is a variant of the pointing gesture, except that an interlocutor may point to two or more objects in quick succession without lowering the hand or folding the index-finger, for prolonged explanation [32]. A grouping gesture is similar to a presenting gesture except that the index-finger moves in a circle to include a set of entities or a group of persons in the scene. Deictic gestures occur in parent-infant interaction, interaction with hearing-challenged persons, subtle dominant commands for action, instructor-student interaction, human-robot interaction, and telepresence [28–31].
In parent-infant interaction, newborns and young toddlers communicate using deixis because they lack vocabulary, the ability to form sentences, and vocalization [29–31]. Their deictic gestures involve pointing, grasping, and touching. In instructor-student interactions, deixis is used to point to a concept explained by words on the whiteboard or slides, aligning with the questions, answers, and previously articulated explanations associated with word(s) or illustrations. In human-robot interactions, due to the lack of natural language comprehension by the robot, it is easier to point to objects and issue an action command. This requires pointing gestures, and combinations of pointing gestures with haptics for illustration.

Pointing gestures have three major subclasses based upon the combinations of head-motions, eye-motions, hand-motion, and index-finger motions [32].
• DG1: Synchronous head-movement, hand-movement, and index-finger movement occur to point to the referred entity in the scene.
• DG2: Only eye-movement or eye-tracking occurs to refer to the referred objects in the scene.
Fig. 2. Instances of Deictic Gestures: an eye-pointing spatial deictic gesture and a finger-pointing spatial deictic gesture. Pictures have been taken from Wikimedia Commons’ public domain (https://upload.wikimedia.org/wikipedia/commons/b/b4/Pina_Pellicer_publicity_photos_for_One-Eyed_Jacks_%281961%29_%28cropped%29.jpg and https://upload.wikimedia.org/wikipedia/commons/8/89/J_S_Copley_-_Samuel_Adams.jpg)
• DG3: Synchronous head-movement and eye-movement occur to refer to the object, followed by eye-focus and speech to seek the listener's attention.

Presenting gestures have two subclassifications:
• DG4: Objects do not fit in one field-of-vision. Synchronous head-movement, hand-movement, finger-movement, and speech occur to refer to the first object. The remaining objects are referred to by repeated synchronous head-movement, arm-movement, eye-movement, and speech because the remaining objects are in different vision-fields.
• DG5: Objects are in proximity and fit in the same field-of-vision. Synchronous head-movement, hand-movement, finger-movement, and speech occur to refer to the first object. After that, head-movement does not occur. Repeated synchronous arm-movement, eye-movement, and speech occur.

A grouping gesture sweeps a subset of objects within the scene by making a circular motion of the index-finger to include multiple objects, or by making a palm motion [32]. A sweeping gesture shows awareness of the presence of a group instead of using the index-finger, as in the gesture for “Let’s go.”
4 Related Work

The related works on deictic gestures are classified into five major categories: 1) work by cognitive psychologists in defining and identifying the spatio-temporal aspects of deixis in a discourse [13, 14, 32–34]; 2) work in the robotics community on voice and hand-based pointing for robot-human interaction, robot-robot interaction, and gaze-analysis [35–44]; 3) hand-motion analysis for recognizing hand-gestures [15, 19]; 4) conversational head-gesture analysis using synchronization of head-motion analysis with speech [17]; 5) camera and vision-based deictic gesture detection [45–51]. Cognitive psychology research has established the spatio-temporal nature of the pronouns for localization [13, 14]. It also discusses the temporal use of pronouns, such as ‘this year’ to refer to a period [14]. However, the discourse analysis by cognitive scientists has not been computationally implemented. It is limited by the lack of computational dialog analysis, pronoun disambiguation (anaphora resolution) and the lack of
visual cues. The discourse analysis does not include analysis of synchronized motions and speech. The research on deictic gestures by the robotics community is limited to image analysis and detection of visual pointing by the index-finger, using depth-based and optical sensors for precision and accuracy of pointing, and trajectory tracking using Kalman filters and other prediction techniques [21]. Vision-based hand-gesture recognition for human-robot interaction relies on cameras or depth-sensors and depends upon recognition techniques that vary for static and dynamic gestures [19]. Lai et al. used vector calculation to detect pointing gestures by obtaining the coordinates of the human arm, and used geometric computation to find the estimated hand [50]. Their technique requires users to extend their arm completely to extract depth data and imposes a restriction on the natural body pose of the users. Correa et al. proposed a Bayesian classifier to recognize different hand-gestures on static images; it lacks the motion analysis needed for detecting different subclasses of deictic gestures [15]. Nickel and Stiefelhagen proposed a skin color-based mapping and an HMM-based classifier to capture head and hand motions to detect pointing gestures [40]. However, their approach does not include synchronization between various motions and speech, which is necessary for subclassification of pointing gestures. They also do not analyze other subclassifications of deictic gestures and conversational head-gestures. Singh and Bansal do not include the synchronization of multiple motions and speech and have not implemented deictic gestures [17]. They also limit the model to deriving conversational head-gestures by analyzing head-motion only. In contrast, this research extends their research and integrates multiple synchronized motions and speech to detect different subclasses of deictic gestures.
Current approaches do not include synchronization between multiple motions and speech needed for a general-purpose model. Current approaches also do not correlate motions in deictic gestures with the pronouns in spoken phrases for improved disambiguation of pronouns in dialogs. We integrate the synchronized motions and speech and extend the cognitive psychology perspective. Our research takes a major step in establishing enhanced SCP as a general-purpose framework for identifying different subclasses of co-speech gestures.
5 Modeling Spatial Deictic Gestures Using SCP

Deixis requires start-synchronization of motions of the head, eyes, hand, index-finger, or palm. Speech may have start-synchronization, during-synchronization, or overlap-synchronization. The start of a spoken phrase can be delayed because localization precedes utterance. Speech can also end later, such as in a presenting gesture, due to a longer explanation related to the referred object after the pointing. Deictic gestures differ from conversational head-gestures due to the acyclic nature of head-motion and the use of the index-finger or palm to point at an object, in addition to the use of referential pronouns in the spoken phrases and the use of action words or words indicating the presence (or absence) of an object in the specified location. The synchronization of acyclic head-motion, eye-motion, and index-finger pointing, along with the use of referential words in the speech, characterizes deictic gestures and separates them from conversational head-gestures that contain head-cycles.
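One way to make the synchronization types concrete is to compare the time interval of a motion with the time interval of the speech. The sketch below is illustrative only: the function name `classify_sync` is hypothetical, and the interval definitions are assumptions (start and end points are treated as simultaneous when they differ by at most a delay threshold; "during" means the speech lies strictly inside the motion interval; "overlap" means the speech starts after the motion starts and ends after the motion ends). The paper's own definitions may differ in detail.

```python
def classify_sync(motion, speech, threshold):
    """Classify the temporal relation between a motion and a speech interval.

    motion, speech: (start, end) times in seconds.
    threshold: largest delay (seconds) still treated as simultaneous.
    The definitions below are assumptions made for illustration.
    """
    m_start, m_end = motion
    s_start, s_end = speech
    same_start = abs(s_start - m_start) <= threshold
    same_end = abs(s_end - m_end) <= threshold
    if same_start and same_end:
        return "strict"
    if same_start:
        return "start"
    if same_end:
        return "end"
    if m_start < s_start and s_end < m_end:
        return "during"      # speech lies inside the motion interval
    if m_start < s_start < m_end < s_end:
        return "overlap"     # speech starts late and outlasts the motion
    return "none"


# Speech starts 0.5 s after the arm is raised and stops before it is lowered.
tight = classify_sync((0.0, 2.0), (0.5, 1.0), threshold=0.1)
loose = classify_sync((0.0, 2.0), (0.5, 1.0), threshold=0.6)
late_end = classify_sync((0.0, 2.0), (0.5, 3.0), threshold=0.1)
```

With the small threshold the example is classified as during-synchronization, while the larger threshold absorbs the 0.5 s delay and yields start-synchronization, which shows how sensitive the classification is to the chosen delay threshold.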
5.1 Modeling Subclasses of Spatial Deictic Gestures

The SCP-based modeling requires analysis of synchronized hand-motion (raising an arm), acyclic head-motions, index-finger movements, and their synchronization. The deictic gestures are mixed with other co-speech gestures. The separation is achieved by identifying combinations of acyclic head-motion (or eye-motion), use of referential words in spoken phrases, and the motion of the index-finger.

Modeling Pointing Gestures. Figure 3 describes three pointing gestures: DG1, DG2, and DG3. In DG1 (see Fig. 3a), an arm is raised, and a finger or palm is stretched using start-synchronization. The speech and other motions (head-rotation + arm raising + index-finger stretching) are connected through during-synchronization because localization (using hand-motion) precedes utterance. After an utterance is complete, the arm and the index-finger keep the last position, seeking attention. Figure 3(b) illustrates the pointing gesture DG2 using the synchronized motions of just the left eye and right eye pointing at or tracking a desired object. Figure 3(c) illustrates the pointing gesture DG3, where the head is moved along with the tracking eyes, followed by eye-focus. The speech is optional. In Fig. 3(a), place-node p1 denotes the relaxed-state; the trigger-node tr1 denotes the start-synchronization between head-motion, speech, hand-motion, and index-finger motion. The place p21 denotes the start of the head-motion; the place p22 denotes the start of the hand-motion; the place p23 denotes the start of the index-finger extension; the place p24 denotes the start of the speech. Transitions tr21 to tr24 denote the corresponding synchronous activities. The place p3 denotes the end of the activities. There is no immediate return to the relaxed-state; the index-finger remains pointed for a longer period for seeking attention. In Fig. 3(b), synchronized motions of the left and right eyes are used to point at or track an object.
Trigger-node tr1 denotes start-synchronization. The places p21 and p22 denote the start of the eye-motions; the transition-nodes tr21 and tr22 model the eye-motions. The place p3 denotes the end of the motions using an end-synchronization. In Fig. 3(c), synchronization of head-motion, left-eye motion, and right-eye motion is used to track a moving object or to point to the object. The place p1 denotes the relaxed-state. The trigger-node tr1 spawns three synchronous tasks: head-motion, left-eye movement, and right-eye movement, starting from the places p21, p22, and p23. The three motions end with an end-synchronization at the place p3.
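The token flow of the DG1 net in Fig. 3(a) can be illustrated with a plain (uncolored, untimed) Petri net simulation. This is a simplified sketch under stated assumptions rather than the SCP implementation itself: delays and colors are ignored, the helper names `enabled`, `fire`, and `run` are hypothetical, and end-synchronization is approximated by requiring four tokens at p3, in the spirit of the tokens-threshold attribute of place-nodes.

```python
# Each transition maps to (input places, output places), following Fig. 3(a).
DG1_NET = {
    "tr1": (["p1"], ["p21", "p22", "p23", "p24"]),  # trigger-node: start-synchronization
    "tr21": (["p21"], ["p3"]),  # head-motion
    "tr22": (["p22"], ["p3"]),  # hand-motion
    "tr23": (["p23"], ["p3"]),  # index-finger extension
    "tr24": (["p24"], ["p3"]),  # speech
}


def enabled(marking, inputs):
    """A transition is enabled when every input place holds a token."""
    return all(marking[p] >= 1 for p in inputs)


def fire(marking, inputs, outputs):
    """Return the marking after consuming input tokens and producing output tokens."""
    new_marking = dict(marking)
    for p in inputs:
        new_marking[p] -= 1
    for p in outputs:
        new_marking[p] += 1
    return new_marking


def run(net, marking):
    """Fire enabled transitions until none is enabled; return (marking, firing order)."""
    fired = []
    progress = True
    while progress:
        progress = False
        for name, (inputs, outputs) in net.items():
            if enabled(marking, inputs):
                marking = fire(marking, inputs, outputs)
                fired.append(name)
                progress = True
    return marking, fired


start = {"p1": 1, "p21": 0, "p22": 0, "p23": 0, "p24": 0, "p3": 0}
final, fired = run(DG1_NET, start)
END_THRESHOLD = 4  # assumed tokens-threshold at p3: all four activities ended
gesture_ended = final["p3"] >= END_THRESHOLD
```

The token in p1 (relaxed-state) is consumed by tr1, which places one token in each of p21 to p24; the four activity transitions then move their tokens to p3. The nets for DG2 and DG3 would differ only in the number of parallel branches.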
Fig. 3. SCP Modeling of Pointing Gestures: (3a) DG1, (3b) DG2, (3c) DG3
Modeling Presenting Gestures. Figure 4 illustrates the modeling of the presenting gestures using SCP. A presenting gesture has two parts: 1) pointing to the first object; 2) pointing to the remaining objects using only hand-motion with the index-finger in the extended position. Figure 4(a) models the presenting gesture DG4. The referred objects are further apart, necessitating multiple separate head-motions due to changes in the vision-fields. Figure 4(b) models the presenting gesture DG5. The referred objects are in the same field-of-vision after pointing to the first object. Hence, the head is moved once. In Fig. 4(a) and Fig. 4(b), the first part is the same as the pointing gesture and involves synchronized motions with a delayed speech. The second part of Fig. 4(a) contains head-motion, hand-motion, and speech; the index-finger remains in the extended position. The speech starts delayed as localization precedes utterance. The edges (tr1, p24) and (tr3, p43) have delays δ14, δ33 ≥ 0. Speech ends later than hand-motion due to additional explanations associated with the pointed object. Hence, the delays δ24, δ43 ≥ 0. The second part of Fig. 4(b) comprises arm-motion and speech; the head does not move because the objects are in the same field-of-vision. The edges (tr1, p24) and (tr3, p42) have delays δ14, δ33 ≥ 0, and the edges (tr24, p3) and (tr42, p5) have delays δ24, δ42 ≥ 0.

Modeling Grouping Gestures. Figure 5 models the grouping gesture DG6. It is similar to DG4 except for the circular motion of the index-finger after raising the hand, and the speech is start-synchronized with the circular motion. Head-motion, hand-motion, and index-finger motion start in a synchronous manner and end synchronously. After this, the index-finger and/or hand move to simulate a closed curve encompassing the group of objects, along with the spoken phrase including all the entities within the group.
5.2 Signatures for Spatial Deictic Gestures

The signature tuple was extended to include Boolean overlap-synchronization, hand-motion/arm-motion, index-finger/palm motion, eye-tracking, cyclic motion of the index-finger, and Petri net cycles to identify spatial deictic gestures and separate them from conversational head-gestures [17].
Fig. 4. SCP Modeling of Presenting Gestures: (4a) and (4b). Each net has a first part and a second part that contains a cycle for repeats; the edge delays shown are δ14 > 0, δ24 > 0, δ33 > 0, and δ43 > 0 in 4a (δ42 > 0 in 4b).
Fig. 5. SCP Modeling of Grouping Gesture. Labels in the figure: head-motion + hand-motion + index-finger motion; speech + hand-motion + index-finger motion; delays δ1 > 0 and δ2 > 0.
All six deictic gestures are disambiguated using the signatures. For example, DG1 and DG3 have head-motions while DG2 lacks head-motion. Similarly, DG4 and DG5 have overlap-synchronization between speech and other motions, while pointing gestures (DG1 to DG3) have during-synchronization. The tuple ((head-nod, direction), (head-shake, direction), (head-tilt, direction), (hand-motion, direction), (index-finger/palm, direction), eye-tracking, eye-focus, number of places, number of transitions, number of start-synchronizations, number of end-synchronizations, number of strict-synchronizations, number of during-synchronizations, number of overlap-synchronizations, number of concurrent asynchronous actions, head-motion cycle, hand-motion cycle, Petri net cycle, speech) describes the signatures that separate the six deictic gestures from each other (see Table 2).

Table 2. Signature of the Non-Emotional Conversational Head-Gestures

Gesture | Signature
Pointing (DG1) | ((*, *), (*, *), (*, *), (1, 1), (1, 1), *, 1, 6, 5, 1, 1, 1, 1, 0, 0, 0, 0, 0, *)
Pointing (DG2) | ((0, 0), (0, 0), (0, 0), (0, 0), (0, 0), 1, 1, 4, 3, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0)
Pointing (DG3) | ((*, *), (*, *), (*, *), (0, 0), (0, 0), 1, 1, 5, 4, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0)
Presenting (DG4) | ((*, *), (*, *), (*, *), (1, 1), (1, 1), 1, 1, 10, 9, 2, 2, 2, 0, 2, 0, 0, *, *, 1)
Presenting (DG5) | ((*, *), (*, *), (*, *), (1, 1), (1, 1), 1, 1, 9, 8, 2, 2, 2, 0, 2, 0, 0, *, *, 1)
Grouping (DG6) | ((*, *), (*, *), (*, *), (1, 1), (1, 1), 1, 1, 9, 8, 2, 2, 2, 1, 0, 0, 0, 1, 0, 1)
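Matching an observed signature against the archived signatures of Table 2 can be sketched as a wildcard comparison. The signature values below are copied from Table 2; everything else is an assumption made for illustration: the names `matches` and `label_by_signature` are hypothetical, `*` is treated as "matches any value", and the observed tuples are invented examples rather than measured data.

```python
WILDCARD = "*"
W = WILDCARD
ANY = (W, W)  # a (motion, direction) pair with both entries unconstrained

# Archived signatures, copied from Table 2 (19 fields each).
SIGNATURES = {
    "DG1": (ANY, ANY, ANY, (1, 1), (1, 1), W, 1, 6, 5, 1, 1, 1, 1, 0, 0, 0, 0, 0, W),
    "DG2": ((0, 0), (0, 0), (0, 0), (0, 0), (0, 0), 1, 1, 4, 3, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0),
    "DG3": (ANY, ANY, ANY, (0, 0), (0, 0), 1, 1, 5, 4, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0),
    "DG4": (ANY, ANY, ANY, (1, 1), (1, 1), 1, 1, 10, 9, 2, 2, 2, 0, 2, 0, 0, W, W, 1),
    "DG5": (ANY, ANY, ANY, (1, 1), (1, 1), 1, 1, 9, 8, 2, 2, 2, 0, 2, 0, 0, W, W, 1),
    "DG6": (ANY, ANY, ANY, (1, 1), (1, 1), 1, 1, 9, 8, 2, 2, 2, 1, 0, 0, 0, 1, 0, 1),
}


def matches(pattern, observed):
    """True if observed agrees with pattern; '*' in the pattern matches anything."""
    if pattern == WILDCARD:
        return True
    if isinstance(pattern, tuple):
        return (
            isinstance(observed, tuple)
            and len(pattern) == len(observed)
            and all(matches(p, o) for p, o in zip(pattern, observed))
        )
    return pattern == observed


def label_by_signature(observed):
    """Return every gesture label whose archived signature matches."""
    return [label for label, sig in SIGNATURES.items() if matches(sig, observed)]


# Invented observation: head-nod present, no hand or finger motion, 5 places, 4 transitions.
observed_dg3 = ((1, 1), (0, 0), (0, 0), (0, 0), (0, 0),
                1, 1, 5, 4, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0)
labels = label_by_signature(observed_dg3)
```

Returning a list rather than a single label keeps ambiguous cases visible: an empty list means that no archived signature matched, and more than one label would indicate that the signatures do not separate the observation.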
6 Deictic Gestures Recognition and Algorithms

Analyzing video involves multiple steps: video-frame sampling, facial feature-extraction, eye feature-extraction, wrist feature-extraction, index-finger feature-extraction, and motion analysis.

6.1 Motion Detection and Analysis

Video-frames are sampled, and the region-of-interest (ROI) is determined by the frame height and width. Video frames are analyzed to extract the feature-points of face, eyes,
index-finger, wrist, and shoulder. The centroids of feature-points are derived using the mean of the coordinates of the feature-points. The changes in centroid-coordinates are used to derive motion-vectors. Similarly, sampled speech amplitude is used to derive the silence-vector. An algorithm for identifying conversational head-gestures was extended to include hand-motions, wrist-motions, index-finger motion, eye-motions, and the corresponding spatial and temporal thresholds. Each motion was modeled using a motion-vector: a binary sequence of stillness (value = 0) or motion (value = 1). An added complexity is the increase in the number of spatial thresholds needed to accommodate small random movements while an organ is still. Each threshold depends on the maximum allowable motion of the organ. For example, the eye-threshold is much lower than the index-finger threshold; a slight difference in eye coordinates indicates an eye-motion. An adult human eyeball is about 24 mm in diameter and can rotate about ±50 degrees horizontally and ±42 degrees vertically. The relaxed-state for these motions is in the region (x-origin ± 5, y-origin ± 5). In a relaxed-state, the arm is not raised, and fingers are not detected in the video-frame; the index-finger is detected after a hand is raised and pointed towards the ROI.

6.2 Algorithm

Figure 6 describes a simplified version of the algorithm to derive six spatial deictic gestures (DG1 to DG6). The inputs to the algorithm are: 1) facial coordinates to derive the head-motion; 2) eye-coordinates to derive the eye-motion; 3) index-finger coordinates to derive the motion of the index-finger; 4) wrist coordinates to derive hand-motion; 5) various thresholds such as the relaxed-head threshold and stillness thresholds for different motions; 6) time-index t to derive the sampled sequence of coordinates to derive motion or stillness (or silence, for speech). The output is the label of the spatial deictic gesture.
The variable L denotes the gesture-label; the variable X denotes x-coordinates; the variable Y denotes y-coordinates. The variable S denotes the position coordinates of the organs during a motion sequence. The Boolean variable mot denotes a motion. The variable ms denotes the sequence of deictic gestures. The symbol E denotes an eye. The symbol Fin denotes an index-finger. The symbol W denotes a wrist. The symbol H denotes the head. The symbol k denotes an organ in the set {eye, index-finger, wrist, head}. The symbol C denotes generically the X or Y coordinate. The variables S^k_X and S^k_Y denote the vectors of x- and y-coordinates for each motion, where k ∈ {E, Fin, W, H}. The differences of coordinates between time-indices t and t-1 are denoted as ΔX^k and ΔY^k. The spatial thresholds are denoted as ε^k_X and ε^k_Y. The time-difference between two adjacent time-indices is the sampling time δS. Head-motion, eye-motions, index-finger motion, and wrist motion are detected by analyzing X- and Y-coordinates. Speech is detected by an amplitude greater than 35 dB. The Boolean variable mot^k_X (or mot^k_Y) is set true when the absolute value of ΔX^k (or ΔY^k) is greater than the threshold value ε^k_X (or ε^k_Y). The logical combination of the Boolean variables mot^k_C is used to analyze multimodal deictic gestures. A positive value of |ΔX^k| (or |ΔY^k|) denotes horizontal (or vertical) movement.
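The derivation of motion-vectors from centroids can be sketched as follows. This is a minimal illustration rather than the authors' code: the function names `centroid`, `motion_vector`, and `speech_vector` are hypothetical, the feature-points and amplitudes are made-up numbers, and a frame is marked as motion when either |ΔX| or |ΔY| exceeds its threshold. Only the 35 dB speech threshold is taken from the text.

```python
def centroid(points):
    """Mean of the (x, y) coordinates of the feature-points of one organ in one frame."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (sum(xs) / len(xs), sum(ys) / len(ys))


def motion_vector(centroids, eps_x, eps_y):
    """Binary sequence: 1 when |dX| > eps_x or |dY| > eps_y between t-1 and t, else 0."""
    mot = []
    for (x0, y0), (x1, y1) in zip(centroids, centroids[1:]):
        moved = abs(x1 - x0) > eps_x or abs(y1 - y0) > eps_y
        mot.append(1 if moved else 0)
    return mot


def speech_vector(amplitudes_db, threshold_db=35.0):
    """Binary sequence: 1 when the sampled amplitude exceeds the speech threshold."""
    return [1 if a > threshold_db else 0 for a in amplitudes_db]


# Made-up feature-points of one organ over four sampled frames.
frames = [
    [(0, 0), (2, 2)],
    [(0, 0), (2, 2)],
    [(4, 0), (6, 2)],
    [(4, 1), (6, 3)],
]
centroids = [centroid(f) for f in frames]
mot = motion_vector(centroids, eps_x=2, eps_y=2)
speech = speech_vector([20.0, 40.0, 35.0])
```

In this example the one-pixel vertical shift between the last two frames stays below the threshold and is treated as stillness, which is the role the spatial thresholds play for small random movements. Inverting the speech sequence would give a silence-vector.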
For the DG1 gesture, mot^Fin_C is true and mot^k_C is false, where C ∈ {X, Y} and k ∈ {E, W, H}. For the DG2 gesture, mot^E_C is true and mot^k_C is false, where C ∈ {X, Y} and k ∈ {Fin, W, H}. For the DG3 gesture, mot^H_C is true and mot^E_C is true and mot^k_C is false, where k ∈ {Fin, W}. For the DG4 gesture, mot^Fin_C is true and mot^E_C is true and mot^k_C is false, where k ∈ {H, W}. For the DG5 gesture, mot^Fin_C is true and mot^H_C is true and mot^E_C is true and mot^W_C is false. For the DG6 gesture, mot^H_C is true and mot^E_C is true and mot^Fin_C is true and mot^W_C is true. For all deictic gestures, motion could be in the horizontal (|ΔX^k| > 0) or vertical (|ΔY^k| > 0) direction, and the corresponding coordinates are updated accordingly.
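The rules above amount to a lookup from the set of moving organs to a gesture label, which can be sketched as below. This is an illustrative reading of the stated rules only, not the algorithm of Fig. 6: the names `RULES` and `label_gesture` are hypothetical, an organ counts as moving when its X or Y flag is true, the unnamed organ in the DG6 rule is assumed to be the index-finger, and synchronization, speech, and repetition are not considered.

```python
# Set of moving organs -> label, following the rules stated in the text.
# E = eye, Fin = index-finger, W = wrist, H = head.
RULES = {
    frozenset({"Fin"}): "DG1",
    frozenset({"E"}): "DG2",
    frozenset({"H", "E"}): "DG3",
    frozenset({"Fin", "E"}): "DG4",
    frozenset({"Fin", "H", "E"}): "DG5",
    frozenset({"H", "E", "Fin", "W"}): "DG6",  # assumes the unnamed organ is Fin
}


def label_gesture(mot):
    """mot maps each organ to its (mot_X, mot_Y) Boolean flags.

    Returns the matching label, or None when no rule applies.
    """
    moving = frozenset(k for k, (mot_x, mot_y) in mot.items() if mot_x or mot_y)
    return RULES.get(moving)


example = {
    "E": (True, False),
    "Fin": (False, False),
    "W": (False, False),
    "H": (False, True),
}
label = label_gesture(example)
```

Because the six organ sets are pairwise distinct, each combination of flags maps to at most one label; any combination outside the table is left unlabeled rather than forced into a class.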
Fig. 6. An Algorithm to Detect and Label Spatial Deictic Motions
7 Implementation

The software was implemented in Python and interfaced with Visual Studio Code (1.55.2). The implementation uses the OpenCV library for face-detection, the Mediapipe library to analyze frames and extract feature-points, the PyAudio library for speech analysis, and the Pydub audio library for silence analysis [52–55]. The software was executed on a machine with an Intel(R) Core(TM) 2.60 GHz 64-bit system with 8 GB RAM. Video of deictic gestures was collected at 30 frames per second and analyzed in real-time to develop a matrix-based model of extended SCP [17]. The centroids of feature-points were used to derive the motions: facial feature-points for head-motion; elbow and shoulder feature-points for hand-motion; and the angular change from the base of the palm to the tip of the index-finger for index-finger motion.
8 Performance Analysis and Discussion

Table 3 shows a confusion matrix to derive the recall of conversational head-gestures (CHG) and six spatial deictic gestures: DG1 - index-finger pointing, DG2 - eye pointing, DG3 - head-motion + eye pointing, DG4 - presentation with scattered entities, DG5 - presentation in the same field-of-vision, and DG6 - grouping. The result shows a high percentage of recall for CHG and the six deictic gestures. Recall varies from 81.3% for DG4 to 95.7% for CHG. There are several mislabelings: DG1 is mislabeled as CHG (3.5%); DG2 is mislabeled as CHG (2.1%) and DG3 (11.1%); DG3 is mislabeled as CHG (3.7%) and DG2 (9.7%); DG4 is mislabeled as CHG (7.9%) and DG2 (1.6%); DG5 is mislabeled as CHG (4.5%), DG1 (1.1%) and DG6 (3.7%); DG6 is mislabeled as CHG (2.7%) and DG5 (4.2%). DG3 has head-motion, and DG2 does not have any head-motion. However, DG2 may be mislabeled as DG3 (and vice versa) due to threshold-sensitivity. Head-motions may be treated as random motions in a still head with a larger threshold. Similarly, random motions in a still head may be treated as head-motions with a smaller threshold.

Table 3. Confusion Matrix for Labeling Deictic Gestures and CHG (rows: actual gestures; columns: gesture labeling, in percent)

Actual gestures | CHG | DG1 | DG2 | DG3 | DG4 | DG5 | DG6
CHG | 95.7 | 0.0 | 0 | 0 | 0 | 0 | 0
DG1 | 3.5 | 86.2 | 0 | 0 | 0 | 0 | 0
DG2 | 2.1 | 0.0 | 82.4 | 11.1 | 0 | 0 | 0
DG3 | 3.7 | 0.0 | 9.7 | 85.1 | 0 | 0 | 0
DG4 | 7.9 | 0.0 | 1.6 | 0 | 81.3 | 0 | 0
DG5 | 4.5 | 1.1 | 0 | 0 | 0 | 83.6 | 3.4
DG6 | 2.7 | 0 | 0 | 0 | 0 | 4.2 | 83.1
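As a worked check of the figures in Table 3, the per-class recall is the diagonal of the confusion matrix, and the mean over the seven classes can be computed directly. The matrix values below are copied from Table 3; the variable names are hypothetical, and taking an unweighted mean over classes is an assumption about how an overall recall would be summarized.

```python
LABELS = ["CHG", "DG1", "DG2", "DG3", "DG4", "DG5", "DG6"]

# Rows: actual gesture; columns: assigned label (percent), copied from Table 3.
CONFUSION = [
    [95.7, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    [3.5, 86.2, 0.0, 0.0, 0.0, 0.0, 0.0],
    [2.1, 0.0, 82.4, 11.1, 0.0, 0.0, 0.0],
    [3.7, 0.0, 9.7, 85.1, 0.0, 0.0, 0.0],
    [7.9, 0.0, 1.6, 0.0, 81.3, 0.0, 0.0],
    [4.5, 1.1, 0.0, 0.0, 0.0, 83.6, 3.4],
    [2.7, 0.0, 0.0, 0.0, 0.0, 4.2, 83.1],
]

# Recall of a class is the percentage of its instances that received the correct label.
recall = {label: CONFUSION[i][i] for i, label in enumerate(LABELS)}
mean_recall = sum(recall.values()) / len(recall)   # unweighted mean over classes
lowest = min(recall, key=recall.get)
highest = max(recall, key=recall.get)
```

The unweighted mean of the diagonal is about 85.3%, with DG4 the lowest class and CHG the highest.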
The mislabeling of DG4 as DG2 is caused by: 1) error in identifying a head-cycle due to the choice of spatial threshold to identify visited places; 2) error in detecting feature-points to derive eye-tracking and focused-eye; 3) error in identifying the head-motion; 4) the choice of delay threshold for overlap-synchronization. A larger delay-threshold misses overlap-synchronization and treats it as start-synchronization. The conversational interrogation gesture and DG4 are similar and use repeated head-motions for emphasis during questioning or explanation. Consequently, a conversational head-gesture (CHG) may be mislabeled as DG4 and vice versa. The conversational head-gesture and DG5 also share a similar head-motion attribute, resulting in mislabeling. The mislabeling of DG5 and DG6 is caused by the choice of delay threshold. A larger delay-threshold treats during-synchronization as start-synchronization; a smaller threshold treats start-synchronization as during-synchronization (see Table 1). Mislabeling originates from the following: 1) missing feature-points and frames in video analysis, resulting in missing places and the corresponding transitions; 2) missing small undetectable motions in gestures; 3) imprecise spatial and temporal threshold values. Unfortunately, despite statistical analysis to find an optimum threshold, mislabeling occurs because thresholds are dynamic and context dependent. Signatures also lack the results of speech analysis, dialog-context, motion-speed, facial expression analysis, and modeling of composite gestures [56]. The mislabeling can be reduced by improving the confidence factor using dialog analysis [56].
9 Conclusion

In this paper, we proposed a general synchronous colored Petri net model and subclassifications of deictic gesture-motions to recognize spatial deictic gestures in a real-time dialog that correlates pointing motions with the corresponding pronouns in spoken phrases. Our experimental results show that the SCP model achieves a recall of approximately 85%. However, it is affected by the choices of sampling interval and thresholds, and by sensors’ inaccuracies, resulting in gesture-recognition errors. Certain gestures have ambiguities that can be solved only by comprehending the discourse and knowing the context and mental state of the interlocutor. The current version does not analyze many subclasses of co-speech gestures such as iconic hand-gestures, metaphoric hand-gestures, and the combination of haptics with co-speech gestures [16]. We are currently extending the model to analyze contours and shapes made by hand-gestures and to derive their symbolic meaning to correlate with the attributes of the entities and actions in iconic and metaphoric gestures.
References

1. Yenilmez, M.I.: Economic and social consequences of population aging the dilemmas and opportunities in the twenty-first century. Appl. Res. Qual. Life 10(4), 735–752 (2015). https://doi.org/10.1007/s11482-014-9334-2
2. Agrigoroaie, R.M., Tapus, A.: Developing a healthcare robot with personalized behaviors and social skills for the elderly. In: Proceedings of the 11th ACM/IEEE International Conference on Human-Robot Interaction (HRI), pp. 589–590. Christchurch, New Zealand (2016). https://doi.org/10.1109/HRI.2016.7451870
3. García, D.H., Esteban, P.G., Lee, H.R., Romeo, M., Senft, E., Billing, E.: Social robots in therapy and care. In: Proceedings of the 14th ACM/IEEE International Conference on Human-Robot Interaction (HRI), pp. 669–670. Daegu, Korea (2019). https://doi.org/10.1109/HRI.2019.8673243
4. Rosenberg-Kima, R., Koren, Y., Yachini M., Gordon, G.: Human-Robot collaboration (HRC): social robots as teaching assistants for training activities in small groups. In: Proceedings of the 14th ACM/IEEE International Conference on Human-Robot Interaction (HRI), pp. 522–523. Daegu, South Korea (2019). https://doi.org/10.1109/HRI.2019.8673103
5. Diftler, M.A., Ahlstrom, T.D., Ambrose, R.O., Radford, N.A., Joyce, C.A., De La Pena, N., et al.: Robonaut 2—initial activities on-board the ISS. In: IEEE Aerospace Conference, pp. 1–12, Big Sky, Montana, USA (2012). https://doi.org/10.1109/AERO.2012.6187268
6. Glas, D.F., Minato, T., Ishi, C.T., Kawahara, T., Ishiguro, H.: ERICA: the ERATO intelligent conversational android. In: Proceedings of the 25th IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN), pp. 22–29, New York, NY, USA (2016). https://doi.org/10.1109/ROMAN.2016.7745086
7. Atmeh, G.M., Ranatunga, I., Popa, D.O., Subbarao, K., Lewis, F., Rowe, P.: Implementation of an adaptive, model free, learning controller on the Atlas robot. In: American Control Conference, pp. 2887–2892, Portland, OR, USA (2014). https://doi.org/10.1109/ACC.2014.6859431
8. Bansal, A.K., Ghayoumi, M.: A hybrid model to improve occluded facial expressions prediction in the wild during conversational head movements. Int. J. Adv. Life Sci. 13(1–2), 65–74 (2021). https://www.iariajournals.org/life_sciences/lifsci_v13_n12_2021_paged.pdf
9. Ekman, P., Friesen, W.V.: Nonverbal Behavior. In: Ostwald, P.F. (ed.) Communication and Social Interaction, pp. 37–46, Grune & Stratton, New York, NY (1977)
10. Plutchik, R.: Emotion: A Psychoevolutionary Synthesis. Harper & Row, New York, NY, USA (1980)
11. Craig, K.D., Prkachin, K.M., Grunau, R.V.: The facial expression of pain. In: Turk, D.C., Melzack, R. (eds.) Handbook of Pain Assessment, 3rd edn, pp. 117–133, New York: Guilford, USA (2011). ISBN 978-1-60623-976-6
12. Lucey, P., et al.: Automatically detecting pain in Video through facial action units. IEEE Trans. Syst. Man Cybern. 41(3), 664–674 (2011). https://doi.org/10.1109/TSMCB.2010.208252
13. Kendon, A.: Gesture: Visible Actions as Utterance. Cambridge University Press, Cambridge, UK (2004)
14. Fillmore, C.J.: Towards a descriptive framework for spatial deixis. Speech place and action: Studies in deixis and related topics, pp. 31–59 (1982)
15. Correa, M., Ruiz-del-Solar, J., Verschae, R., Lee-Ferng, J., Castillo, N.: Real-time hand gesture recognition for human robot interaction. In: Baltes, J., Lagoudakis, M.G., Naruse, T., Ghidary, S.S. (eds.) RoboCup 2009. LNCS (LNAI), vol. 5949, pp. 46–57. Springer, Heidelberg (2010). https://doi.org/10.1007/978-3-642-11876-0_5
16. Singh, A., Bansal, A.K.: Towards synchronous model of non-emotional conversational gesture generation in humanoids. In: Arai, K. (ed.) Intelligent Computing. LNCS, vol 283, pp. 737–756. Springer, Cham (2022). https://doi.org/10.1007/978-3-030-80119-9_47
17. Singh, A., Bansal, A.K.: Automated real-time recognition of non-emotional conversational head-gestures for social robots. In: Arai, K. (ed.) Proceedings of the Future Technology Conference (FTC), vol. 3, Vancouver, Canada, LNNS, vol. 561, pp. 432–450 (2022). https://doi.org/10.1007/978-3-031-18344-7_29
1244
A. Singh and A. K. Bansal
18. Yang, M.-H., Tao, J.-H.: Data fusion methods in multimodal human-computer dialog. Virtual Reality Intell. Hardware 1(1), 21–28 (2019). https://doi.org/10.3724/SP.J.2096-5796.2018. 0010 19. Rautaray, S.S., Agrawal, A.: Vision based hand gesture recognition for human computer interaction: a survey. Artif. Intell. Rev. 43(1), 1–54 (2012). https://doi.org/10.1007/s10462012-9356-9 20. Stukenbrock, A.: Deixis, Meta-perceptive gaze practices and the interactional achievement of joint attention. Front. Psychol. 11, Article 1779 (2020). https://doi.org/10.3389/fpsyg.2020. 01779 21. Vrigkas, M., Nikou, C., Kakadiaris, I.A.: A review of human activity recognition methods. Front. Robot. AI 2(28), Article 28 (2015). https://doi.org/10.3389/frobt.2015.00028 22. Beddiar, D.R., Nini, B., Sabokrou, M., Hadid, A.: Vision-based human activity recognition: a survey. Multimedia Tools Appl. 79(41–42), 30509–30555 (2020). https://doi.org/10.1007/ s11042-020-09004-3 23. Morency, L.-P., Christoudias, C.M., Darrell, T.: Recognizing gaze aversion gestures in embodied conversational discourse. In: Proceedings of the 8th International Conference on Multimedia Interfaces, pp. 287–294. Banff, Alberta, Canada (2006). 10.1145/ 1180995.1181051 24. Vertegaal, R., Slagter, R., van der Veer, G., Nijholt, A.: Eye gaze patterns in conversations: there is more to conversational agents than meets the eyes. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI), pp. 301–308. Seattle, WA, USA (2001). https://doi.org/10.1145/365024.365119 25. Pisharady, P.K., Saerbeck, M.: Recent methods in vision-based hand-gesture recognition: a review. Comput. Vis. Image Underst. 141, 152–165 (2015). https://doi.org/10.1016/j.cviu. 2015.08.004 26. Brooks, A.G., Breazeal, C.: Working with robots and objects: revisiting deictic reference for achieving spatial common ground. In: Proceedings of the 1st ACM SIGCHI/SIGART Conference on Human-Robot Interaction (HRI), pp. 297–304. 
Salt Lake City, UT, USA (2006). https://doi.org/10.1145/1121241.1121292 27. Allen, J.F.: Maintaining knowledge about temporal intervals. Commun. ACM 26(11), 832– 843 (1983). https://doi.org/10.1145/182.358434 28. Kita, S. (ed.): Pointing: a foundational building block of human communication. In: Pointing: Where Language Culture and Cognition Meet, pp. 171–215. Lawrence Erlbaum Associates, Mahwah, NJ (2003) 29. Gliga, T., Csibra, G.: One year old infant appreciate the referential nature of deictic gestures and words. Psychol. Sci. 20(3), 347–353 (2009). https://doi.org/10.1111/j.1467-9280.2009. 02295.x 30. Goldin-Meadow, S., Mylander, C., de Villiers, J., Bates, E., Volterra, V.: Gestural communication in deaf children: the effects and non-effects of parental input on early language development. Monogr. Soc. Res. Child Dev. 49(3–4), 1–151 (1984) 31. Bejarano, T.: Becoming Human: From Pointing Gestures to Syntax. John Benjamins Publishing, Amsterdam, The Netherlands (2011) 32. Clark, H.H.: Coordinating with each other in a material world. Discourse Stud. 7(4), 507–525 (2005). https://doi.org/10.1177/1461445605054404 33. Louwerse, M.M., Bangerter, A.: Focusing attention with deictic gestures and linguistic expressions. In: Proceedings of the Annual Conference of Cognitive Science Society, pp. 1331–1336. Stresa, Italy (2005). Available at escolarship.org/uc/item/201422tj. Accessed 6 Nov 2022 34. Qu, S., Chai, J.Y.: Beyond attention: the role of deictic gesture in intention recognition in multimodal conversational interfaces. In: Proceedings of the 13th ACM International Conference on Intelligent User Interfaces (IUI), pp. 237–246. Gran Canaria, Spain (2008). https:// doi.org/10.1145/1378773.1378805
Synchronized Colored Petri Net Based Multimodal Modeling
1245
35. Kang, D., Kwak, S.S., Lee, H., Kim, E.H., Choi, J.: This or that: the effect of robot’s deictic expression on user’s perception. In: Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 11383–11390. Las Vegas, NV, USA (2020). https://doi.org/10.1109/IROS45743.2020.9341067 36. Bolt, R.A.: “Put-That-There”: voice and gesture at the graphic interface. ACM SIGRAPH Comput. Graph. 14(3), 262–270 (1980). https://doi.org/10.1145/965105.807503 37. Breazeal, C., Kidd, C.D., Thomaz, A.L., Hoffman, G., Berlin, M.: Effects of nonverbal communication on efficiency and robustness in human-robot teamwork. In: Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 708–713. Edmonton, Alberta, Canada (2005). https://doi.org/10.1109/IROS.2005.1545011 38. Hato, Y., Satake, S., Kanda, T., Imai, M., Hagita, N.: Pointing to space: modeling of deictic interaction referring to regions. In: Proceedings of the 5th ACM/IEEE International Conference on Human-Robot Interaction (HRI), pp. 301–308. Osaka, Japan (2010). https://doi.org/ 10.1109/HRI.2010.5453180 39. Hu, J., Jiang, Z., Ding, X., Mu, T., Hall, P.: VGPN: voice-guided pointing robot navigation for humans. In: Proceedings of the IEEE International Conference on Robotics and Biomimetic (ROBIO), pp. 1107–1112. Kuala Lumpur, Malaysia (2018). https://doi.org/10.1109/ROBIO. 2018.8664854 40. Nickel, K., Stiefelhagen, R.: Visual recognition of pointing gestures for human-robot interaction. J. Image Vision Comput. 25(12), 1875–1884 (2007). https://doi.org/10.1016//j.imavis. 2005.12.020 41. Nagai, Y.: Learning to comprehend deictic gestures in robots and human Infants. In: Proceedings of the IEEE International Workshop on Robot and Human Interactive Communication (RO-MAN), pp. 217–222. (2005). 10.1109/ ROMAN.2005.1513782 42. Sidner, C.L., Kidd, C.D., Lee, C., Lesh, N.: Where to look: a study of human-robot engagement. 
In: Proceedings of the 9th international conference on Intelligent user interfaces (IUI 2004), pp. 78–84. Association for Computing Machinery, New York, NY, USA (2004). https:// doi.org/10.1145/964442.964458 43. Sprute, D., Rasch, R., Pörtner, A., Battermann, S., König, M.: Gesture-based object localization for robot applications in intelligent environments. In: Proceedings of the 14th International Conference on Intelligent Environments (IE), pp. 48–55 (2018). https://doi.org/10. 1109/IE.2018.00015 44. Sugiyama, O., Kanda, T., Imai, M., Ishiguro, H., Hagita, N.: Natural deictic communication with humanoid robots. In: Proceedings of the IEEE International Conference on Intelligent Robot Systems, pp. 1441–1448. San Diego, CA, USA (2007). https://doi.org/10.1109/IROS. 2007.4399120 45. Azari, B., Lim, A., Vaughan, R.: Commodifying pointing in HRI: simple and fast pointing gesture detection from RGB-d images. In: Proceedings of the 16th Conference on Computer and Robot Vision (CRV), pp. 174–180. Kingston, ON, Canada (2019). https://doi.org/10. 1109/CRV.2019.00031 46. Wong, N., Gutwin, C.: Where are you pointing? the accuracy of deictic pointing in CVEs. In: Proceedings of the 28th ACM Conference on Human Factors in Computing Systems (CHI), pp. 1029–1038 (2010). https://doi.org/10.1145/1753326.1753480 47. Hofemann, N., Fritsch, J., Sagerer, G.: Recognition of deictic gestures with context. In: Rasmussen, C.E., Bülthoff, H.H., Schölkopf, B., Giese, M.A. (eds.) DAGM 2004. LNCS, vol. 3175, pp. 334–341. Springer, Heidelberg (2004). https://doi.org/10.1007/978-3-540-286493_41 48. Kollorz, E., Penne, J., Hornegger, J., Barke, A.: Gesture recognition with a time-of-flight camera. Int. J. Intell. Syst. Technol. Appl. 5(3–4), 334–343 (2008). https://doi.org/10.1504/ IJISTA.2008.021296
1246
A. Singh and A. K. Bansal
49. Kondaxakis, P., Pajarinen, J., Kyrki, V.: Real-time recognition of pointing gestures for robot to robot interaction. In: Proceedings of the IEEE International Conference on Intelligent Robots and Systems (IROS), pp. 2621–2626. Chicago, IL, USA (2014). https://doi.org/10. 1109/IROS.2014.6942920 50. Lai, Y., Wang, C., Li, Y., Ge, S.S., Huang, D.: 3d pointing gesture recognition for humanrobot interaction. In: Proceedings of the Chinese Control and Decision Conference (CCDC), pp. 4959–4964. Yinchuan, China (2016). https://doi.org/10.1109/CCDC.2016.7531881 51. Nowack, T., Lutherdt, S., Jehring, S., Xiong, Y., Wenzel, S., Kurtz, P.: Detecting deictic gestures for control of mobile robots. In: Savage-Knepshield, P., Chen, J. (eds.) Advances in Human Factors in Robots and Unmanned Systems, pp. 87–96. Springer International Publishing, Cham (2017). https://doi.org/10.1007/978-3-319-41959-6_8 52. OpenCV. https://opencv.org. Accessed 13 Nov 2022 53. Mediapipe. https://mediapipe.dev. Accessed 10 Nov 2022 54. PyAudio. https://people.csail.mit.edu/hubert/pyaudio/docs/. Accessed 11 Nov 2022 55. Pydub. https://pypi.org/project/pydub/. Accessed 11 Nov 2022 56. Morency, L.-P., Sidner, C. L., Darrell, T.: Dialog context for visual feedback recognition. Wiley Series in Agent Technology, pp. 117–131. https://doi.org/10.1002/9780470512470.CH7
Maximum Independent Set Formation on a Finite Grid by Myopic Robots
Raja Das (✉), Avisek Sharma, and Buddhadeb Sau
Jadavpur University, Jadavpur, Kolkata, India
{rajad.math.rs,aviseks.math.rs,buddhadebsau}@jadavpuruniversity.in
Abstract. This work deals with the Maximum Independent Set (MAX IS) formation problem on a finite rectangular grid by autonomous robots. Suppose we are given a set of identical robots, each placed on a node of a finite rectangular grid G such that no two robots share a node. The MAX IS formation problem asks for an algorithm that each robot executes autonomously, moving and then terminating at a node, such that after a finite time the set of nodes occupied by the robots is a maximum independent set of G. We assume that the robots are anonymous and silent, and that they execute the same distributed algorithm. Previous works solved this problem using one or several door nodes through which the robots enter the grid or the graph one by one and occupy the required nodes. In this work, we propose a deterministic algorithm that solves the MAX IS formation problem in a more general scenario, namely when the robots required to form a MAX IS are initially placed arbitrarily on the grid. The proposed algorithm works under a semi-synchronous scheduler using robots with only a two-hop visibility range and only three colors.
Keywords: Myopic Robot · Maximum Independent Set · Finite Grid · Autonomous Robots · Robot with Lights · Distributed Algorithms
1 Introduction
Consider a rectangular area R as a bounded region in the two-dimensional Euclidean plane. We embed a rectangular grid graph G in R. Let robots with sensing capability stay on the nodes of G. If the sensing radius of a robot is a, we embed the grid with edge length a, so that a robot can sense its own position and its immediate upward, downward, left, and right neighbour nodes. A robot can move to any of these four neighbour nodes along the edges of G. We want to place a set of robots on some nodes of G such that each node of G is sensed by at least one robot. Cost and resilience are the major parameters to consider. Suppose we have a placement of robots. We take the cost to be the total number of robots required. If some robots are removed from the placement, then some nodes become unsafe, that is, they are no longer sensed by any robot. We define the resilience (denoted by ρ) as the minimum of the ratio

\[ \frac{\text{number of robots removed}}{\text{number of nodes that become unsafe}}. \]

We can accomplish the target in different ways. One way is to put a robot at each node. In this way, the resilience is highest (ρ = 1) but the cost is maximum (see Fig. 1(a)). If we put robots on a minimum dominating set of G, the cost is minimum but the resilience is lowest (ρ = 1/5) (see Fig. 1(b)). If we put robots on a maximal independent set of G, we get a decent resilience (ρ = 1/3) (see Fig. 1(c)). If we put robots on a maximum independent set of G, the rectangular area R is fully covered by sensing. The cost in this case is half of the number of nodes, and this method gives good resilience (ρ = 1/2) (see Fig. 1(d)). So in this work, we consider placing robots on a maximum independent set of G.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023. K. Arai (Ed.): SAI 2023, LNNS 739, pp. 1247–1267, 2023. https://doi.org/10.1007/978-3-031-37963-5_86
Fig. 1. Various Methods of Covering: (a) All nodes, (b) Minimum Dominating Set, (c) Maximal Independent Set, (d) Maximum Independent Set.
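The coverage argument above can be checked on a small grid. The following is a minimal sketch, assuming the checkerboard placement (nodes whose row and column indices have equal parity) that the paper later uses as its maximum independent set; the helper names `grid_nodes`, `neighbours`, `is_independent`, and `covers_grid` are hypothetical and not taken from the paper. It only verifies independence, full sensing coverage, and cost; it does not compute the resilience ρ.

```python
from itertools import product

def grid_nodes(m, n):
    """All nodes (i, j) of an m x n grid, 1-indexed (row, column)."""
    return list(product(range(1, m + 1), range(1, n + 1)))

def neighbours(node, m, n):
    """The up, down, left, and right neighbours that exist in the grid."""
    i, j = node
    candidates = [(i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)]
    return [(a, b) for a, b in candidates if 1 <= a <= m and 1 <= b <= n]

def is_independent(occupied, m, n):
    """True if no two occupied nodes are adjacent."""
    return all(nb not in occupied
               for node in occupied
               for nb in neighbours(node, m, n))

def covers_grid(occupied, m, n):
    """True if every node is occupied or adjacent to an occupied node."""
    return all(
        node in occupied or any(nb in occupied for nb in neighbours(node, m, n))
        for node in grid_nodes(m, n)
    )

# Example: a 4 x 5 grid (assumed sizes, chosen only for illustration).
m, n = 4, 5
checkerboard = {(i, j) for i, j in grid_nodes(m, n) if (i - j) % 2 == 0}
all_nodes = set(grid_nodes(m, n))
```

On this example the checkerboard placement uses half of the 20 nodes, is independent, and still senses every node, whereas occupying all nodes covers the grid at twice the cost.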
In this paper, we give an algorithm for Maximum Independent Set (MAX IS) formation on a finite grid. Let a swarm of autonomous robots be placed initially on distinct nodes of G. The MAX IS formation problem asks the robots to rearrange themselves so that the robot-occupied nodes form a MAX IS of G. The robots work autonomously, that is, without any central control. The robots are homogeneous (they all run the same algorithm), identical (indistinguishable), and anonymous (without any identifier). Such robot swarms are capable of tasks like gathering, dispersion, exploration, pattern formation, filling, etc. In some cases, robots have memory and can communicate with other robots. Based on these capabilities there are four robot models: OBLOT, FSTA, FCOM, and LUMI. In the OBLOT model, robots are silent (no communication) and oblivious (no persistent memory). In the FSTA model, robots are silent and non-oblivious. In the FCOM model, robots can communicate but are oblivious. In the LUMI model, robots can communicate and are non-oblivious. Robots can have a finite number of bits of memory, which is generally interpreted as a finite number of lights that can take finitely many different colors. Seeing its own light is equivalent to a robot having memory, and seeing the lights of other robots is equivalent to communication.

After activation, each robot follows a look-compute-move (LCM) cycle. In the look phase, the robot takes a snapshot of its vicinity and obtains the positions and states of other robots. In the compute phase, it runs the algorithm and gets an output. In the move phase, the robot moves to its destination node or stays at the same node, depending on the output. Activation plays a big role, and it is determined by the scheduler. There are generally three types of schedulers.
These are (1) the fully synchronous (FSYNC) scheduler, where time is divided into global rounds and every robot is activated in each round and executes its LCM cycle simultaneously with the others; (2) the semi-synchronous (SSYNC) scheduler, where time is also divided into global rounds but only some of the robots are activated in a round and execute their LCM cycles simultaneously; and (3) the asynchronous (ASYNC) scheduler, where there is no common notion of time among the robots and all robots execute their LCM cycles independently.

Vision is an important factor in performing these tasks. In [1–4] infinite visibility has been used. But infinite visibility is not practically possible due to hardware limitations, so limited visibility is more practical. Under limited visibility, a robot can see up to a certain distance in the plane and up to a certain number of hops in a discrete space. In our work, we consider LUMI model robots with a two-hop visibility range and three colors under a semi-synchronous scheduler. The robots agree on two directions and their orientations, one parallel to the rows of G and the other parallel to the columns of G. Hence each robot can determine its four directions. In this work, we propose a MAX IS formation algorithm for a robot swarm that is initially placed arbitrarily on the nodes of the grid. We show that the proposed algorithm forms a MAX IS under a semi-synchronous scheduler using robots having only two-hop visibility and a light that can take three distinct colors. The next subsection describes the relevant works and discusses the scope of our work.

1.1 Related Works and Our Contributions
Using swarm robotics, various types of problems have been studied, such as exploration [4,11], gathering [3,6], dispersion [8–10], and pattern formation [1,2,5], under different models. In [1–4] robots are considered to have infinite visibility. But infinite visibility is not practically possible due to hardware limitations, and limited visibility is more practical. Robots with limited visibility are called myopic robots. Myopic robots have been used in [6,7,14]. Many problems [1,3,4,7,9,12,13] have been explored on grid graphs. MAX IS formation on a finite grid can be seen from two perspectives. One perspective is the deployment of robots through a door node. The other is pattern formation. To the best of our knowledge, there is no algorithm for arbitrary pattern formation on a finite grid graph, and there are only two reported works [7,14] that consider the MAX IS formation problem on a graph using an autonomous robot swarm.

The authors of [7] gave a MAX IS filling algorithm for a port-labeled finite grid under an asynchronous scheduler, using robots with two-hop visibility and a light with three colors. This algorithm also assumes that each robot can recognize, at its current node, the node it came from. The authors suggest that this assumption can be implemented by introducing an internal light that can take four different colors. In another algorithm, they solved the same problem on an unoriented grid under an asynchronous scheduler, using robots with seven light colors and three-hop visibility. Let Δ denote the maximum degree of a graph. The authors of [14] gave a MAX IS filling algorithm for an arbitrary graph with one door node using (Δ + 6) light colors, three-hop visibility, and O(log(Δ)) bits of persistent storage under an asynchronous scheduler. In another algorithm, they solved the same problem with k (> 1) door nodes using (Δ + k + 6) light colors, five-hop visibility, and O(log(Δ + k)) bits of persistent storage under a semi-synchronous scheduler. Another set of works, [12,13], is remotely related to the MAX IS formation problem.
The study [13] solves the uniform scattering problem under an asynchronous scheduler, and [12] solves it under a fully synchronous scheduler, both on a finite rectangular grid with myopic robots. However, MAX IS formation cannot be achieved by any special case or slight modification of these works.

Table 1. Comparison Table.

| Work | Visibility range (hop) | Scheduler | Door number | Graph topology | Internal memory | Color number |
|---|---|---|---|---|---|---|
| 1st algorithm in [7] | 2 | ASYNC | 1 | Port-labeled rectangular grid | Each robot can recognize the node it came from | 3 |
| 2nd algorithm in [7] | 3 | ASYNC | 1 | Unoriented rectangular grid | None | 7 |
| 1st algorithm in [14] | 3 | ASYNC | 1 | Arbitrary connected graph | O(log(Δ)) | Δ + 6 |
| 2nd algorithm in [14] | 5 | SSYNC | k > 1 | Arbitrary connected graph | O(log(Δ + 6)) | Δ + k + 6 |
| Our algorithm | 2 | SSYNC | None (arbitrary initial deployment) | Oriented rectangular grid | None | 3 |
In this paper, motivated by the search for a robust but cost-effective coverage of a rectangular region, we give an algorithm to form a MAX IS pattern on a finite rectangular grid by luminous robots under a semi-synchronous scheduler. In contrast to [7,14], our proposed algorithm does not use the door concept and forms a MAX IS starting from any initial configuration. Thus, this work generalizes the initial condition of [7,14] for the rectangular grid topology. There can also be practical scenarios where the door concept is not possible to implement. Suppose the robots are initially placed arbitrarily on the grid. To convert this to a door scenario, all robots would need to gather at a corner, which might not be possible if the robots are not point robots. One might argue that the initial positions of the robots can be considered as different doors, and compare our setting with the multi-door algorithm of [14], which works under a semi-synchronous scheduler. Compared with that algorithm, ours uses only constant memory: the robots have lights that can take only three colors. The multi-door algorithm in [14] uses five-hop visibility, whereas our algorithm uses only two-hop visibility. Table 1 (in which Δ denotes the maximum degree of a graph) is presented to clarify the scope of this work.

Outline of the Paper. Section 2 discusses the model and provides the problem definition. Section 3 presents our proposed algorithm and proves its correctness. Finally, Sect. 4 gives the concluding remarks and discusses the future scope of our work.
2 Model and Problem Definition
Consider a rectangular grid graph G = P_m × P_n, where P_m and P_n are path graphs of size m > 1 and n > 1 respectively. Let G be embedded in ℝ^2 such that the nodes of the graph are placed at the coordinates {(x, y) | x = 1, ..., m and y = 1, ..., n} with respect to a coordinate system. We consider robots equipped with motion actuators and visibility sensors. These robots move on G and can stay on the nodes only. Robots can sense their surrounding nodes and can move along edges. We call the topmost row the 1st row, the second row from the top the 2nd row, and so on. Similarly, we call the leftmost column the 1st column, the second column from the left the 2nd column, and so on. We can think of the grid as an m × n matrix, where the (i, j)th entry represents the node in the ith row and jth column of the grid. We assume that the size of the rectangular grid is unknown to the robots. We refer to the leftmost column, rightmost column, uppermost row, and lowermost row as the left boundary, right boundary, upper boundary, and lower boundary respectively. Each robot, on activation, executes a look-compute-move (L-C-M) cycle. In the look phase, a robot takes a snapshot of its vicinity. In the compute phase, it runs a built-in algorithm that takes the snapshot and its previous state (if the robot is not oblivious) as input, and obtains a color and a position as output. In the move phase, the robot changes its color if needed and moves to its destination node or stays at the same node.
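The look-compute-move cycle described above can be sketched as a single function. This is an illustrative sketch only: the configuration is modelled as a dictionary from occupied nodes to light colors, and the names `lcm_cycle`, `look_everything`, and `drift_left` are hypothetical. In particular, `drift_left` is a toy rule used to exercise the cycle and is not the algorithm proposed in the paper, and `look_everything` ignores any visibility limit.

```python
from typing import Callable, Dict, Tuple

Node = Tuple[int, int]      # (row, column), 1-indexed
Config = Dict[Node, str]    # occupied node -> light color

def lcm_cycle(config: Config, pos: Node,
              look: Callable[[Config, Node], Config],
              compute: Callable[[Config, Node, str], Tuple[str, Node]]) -> Node:
    """Run one look-compute-move cycle for the robot at `pos`; return its new node."""
    snapshot = look(config, pos)                        # Look: take a snapshot
    color, dest = compute(snapshot, pos, config[pos])   # Compute: new color and destination
    if dest != pos and dest in config:
        raise ValueError("destination already occupied")
    del config[pos]                                     # Move: update color and position
    config[dest] = color
    return dest

def look_everything(config: Config, pos: Node) -> Config:
    """Assumed look function: returns the whole configuration (no visibility limit)."""
    return dict(config)

def drift_left(snapshot: Config, pos: Node, color: str) -> Tuple[str, Node]:
    """Toy compute rule (not the paper's): step left if that node exists and is vacant."""
    i, j = pos
    left = (i, j - 1)
    if j > 1 and left not in snapshot:
        return color, left
    return color, pos
```

Passing the robot's current color into `compute` corresponds to the "previous state" input of a non-oblivious robot; an oblivious robot would simply ignore that argument.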
Scheduler: There are generally three types of schedulers: fully synchronous, semi-synchronous, and asynchronous. Under a synchronous scheduler (fully or semi-synchronous), time is divided equally into rounds. The robots activated in a round execute the L-C-M cycle, and each phase of the cycle is executed simultaneously by all of them. That is, all robots active in a round take their snapshots at the same moment, and the Compute and Move phases take the same time for all of them. Under a fully synchronous scheduler, every robot is activated and executes the L-C-M cycle in every round. Under a semi-synchronous scheduler, a nonempty set of robots is activated in each round, and an adversary decides which robots are activated. Under a fair adversarial scheduler, each robot is activated infinitely often. Under an asynchronous scheduler, there is no common notion of time for the robots. Each robot is activated independently and executes its L-C-M cycle. The durations of the L-C-M cycles and of their Compute and Move phases may differ between robots. The duration of an L-C-M cycle is finite but can be unpredictably long. In this work, the robots operate under a fair semi-synchronous scheduler.

Axes Agreement: All robots agree on the directions of the axis parallel to the rows and the axis parallel to the columns. Hence the robots agree on a global notion of up, down, right, and left, and each robot can determine the four directions from a node.

Visibility of Robots: A robot can see all nodes within two hops of its position. Thus a robot can see 13 nodes, including its own position.
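The count of 13 visible nodes can be reproduced directly. The sketch below relies on the fact that in a full rectangular grid graph the hop distance between two nodes equals their Manhattan distance; the function name `visible_nodes` and its parameters are hypothetical. Nodes that would fall outside the grid are dropped, so a robot near a boundary sees fewer than 13 nodes.

```python
def visible_nodes(pos, m, n, hops=2):
    """Nodes of an m x n grid (1-indexed) within `hops` hops of `pos`, including `pos`."""
    i, j = pos
    return {
        (a, b)
        for a in range(i - hops, i + hops + 1)
        for b in range(j - hops, j + hops + 1)
        # hop distance in a grid graph is the Manhattan distance
        if abs(a - i) + abs(b - j) <= hops and 1 <= a <= m and 1 <= b <= n
    }

interior_view = visible_nodes((3, 3), 5, 5)   # robot away from every boundary
corner_view = visible_nodes((1, 1), 5, 5)     # robot at the upper-left corner
```

For the interior robot the view has 13 nodes: itself, four nodes at one hop, and eight nodes at two hops (four straight and four diagonal).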
Fig. 2. View of a Robot with Two Hop Visibility.
Fig. 3. Black Dots Represent the Nodes of the Set S.
The first left, second left, first upward, second upward, first right, second right, first downward, second downward, upward-right, upward-left, downward-right, and downward-left neighbour nodes of a robot are denoted by l1, l2, u1, u2, r1, r2, d1, d2, ur, ul, dr, dl respectively (see Fig. 2).

Lights: Each robot has a light that can take three different colors: red, blue, and green. The initial color of each robot is green. The blue color indicates that the robot wants to move but its desired path is blocked by other robots. The red color indicates that the robot has reached its final position and will not move further. From here onward we call a robot with color red (or blue or green) a red (or blue or green) robot.

Definition 1 (Maximum Independent Set). An independent set I of a graph G is a set of nodes of G such that no two nodes of the set are adjacent. A maximum independent set (MAX IS) of G is an independent set of the largest possible size.

We consider the set S = {(i, j) : i ≡ j (mod 2)} of grid nodes. The nodes of S are depicted in Fig. 3. One can calculate that S contains ⌈mn/2⌉ nodes. In Proposition 1 (stated without proof) we assert that S is a MAX IS of G.

Proposition 1. The set S of nodes described above forms a MAX IS of G.

We define u_i as the number of nodes of S present in the ith row of the grid. If n is even, then u_i = n/2. For odd n, u_i = (n + 1)/2 if i is odd and u_i = (n − 1)/2 if i is even. We assume that initially |S| = ⌈mn/2⌉ robots are placed arbitrarily on distinct nodes of the rectangular grid, so that there is at most one robot on any node. Next, we state the problem formally.

Definition 2 (MAX IS formation problem). Suppose a finite set of robots is placed arbitrarily at distinct nodes of a finite rectangular grid G. The MAX IS formation problem requires the robots to occupy distinct nodes of G and settle down, avoiding collisions, such that the set of occupied nodes is a maximum independent set of G.

The next section provides an algorithm that solves the MAX IS formation problem and is collision free.
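The target set S and the per-row counts u_i defined in this section can be checked numerically. The following is a small sketch under the paper's indexing (rows i = 1, ..., m and columns j = 1, ..., n); the function names `target_set` and `row_count` are hypothetical. It also confirms that the size of S equals mn/2 rounded up, written `(m * n + 1) // 2` in integer arithmetic.

```python
def target_set(m, n):
    """The set S of nodes (i, j) with i congruent to j modulo 2, 1-indexed."""
    return {(i, j)
            for i in range(1, m + 1)
            for j in range(1, n + 1)
            if (i - j) % 2 == 0}

def row_count(i, n):
    """u_i: the number of nodes of S in row i of a grid with n columns."""
    if n % 2 == 0:
        return n // 2
    return (n + 1) // 2 if i % 2 == 1 else (n - 1) // 2

def check(m, n):
    """True if the closed-form counts agree with a direct enumeration of S."""
    s = target_set(m, n)
    rows_ok = all(
        sum(1 for (a, _) in s if a == i) == row_count(i, n)
        for i in range(1, m + 1)
    )
    return rows_ok and len(s) == (m * n + 1) // 2
```

For example, `check(3, 5)` compares the row counts 3, 2, 3 with the enumeration and the total 8 with (15 + 1) // 2.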
3 MAX IS Formation Algorithm
This section provides an algorithm, namely the MAX IS Formation Algorithm, which we claim solves the MAX IS formation problem. Different views of a robot are depicted in the figures of this section. From here onward, green, blue, and red filled circles in the figures represent green, blue, and red robots respectively. A black circle indicates a node that may or may not exist; if it exists, it can be vacant or occupied by a robot. This means a robot can ignore black-circle nodes in the compute phase. A black cross indicates that the node does not exist. A black diamond indicates that the node exists. An empty node in a figure indicates a vacant node of the grid. If a robot has to move according to its view, then the direction of movement and the robot's color before the movement are shown in the corresponding view with a colored arrow. Initially, all robots are colored green.

Definition 3 (Upper-Left Quadrant). Let a robot c1 be at the (i, j)th node of the grid. Then the nodes having coordinates {(x, y) : x ≤ i, y ≤ j} \ {(i, j)} are called the upper-left quadrant of c1.

A green robot moves left, maintaining at least two hops of distance from the robot to its left, until it reaches the left boundary. After reaching the left boundary it moves upward, keeping at least two hops of distance from the robot above it. In this way, a robot reaches the upper-left corner node, and this robot is fixed first. The other green robots move left, maintaining the necessary distance from the robot to their left, until they reach the left boundary or come near a red robot (see Fig. 6). Then they move upward, maintaining the necessary distance from the robot above them, until they reach the upper boundary or come near a red robot (see Fig. 8). Thus a robot reaches a suitable node from which it sees the view necessary to become red (see Fig. 9).

Definition 4 (Fixed robot). When a robot becomes red, it does not move any more according to the MAX IS Formation Algorithm (Algorithm 1). Such a robot is called a fixed robot.

If a green robot ca sees that its l1 neighbour node (if ca is at the upper boundary), its u1 neighbour node (if ca is at the left boundary), or both (if ca is at neither the upper nor the left boundary) are occupied by red robots, and ca cannot move upward or left, then there are two possibilities.
Case-1: If the r1 neighbour node (if ca is not at the right boundary) or the d1 neighbour node (if ca is at the right boundary) of ca is vacant, then ca moves 1 hop right (if it is not at the right boundary) or 1 hop down (if it is at the right boundary) (see Fig. 10 or Fig. 11).
Case-2: If r1 (if ca is not at right boundary) or d1 (if ca is at right boundary) neighbour node of ca is occupied by a robot then ca will turn into blue indicating that it has been stuck and wants to move right (if ca is not at right boundary) or down (if ca is at right boundary) but can not move (See Fig. 13). Now if a green robot cb sees its l1 neighbour node is occupied by a blue robot ca then there are following cases. Case-1: If r1 (if cb is not at right boundary) neighbour node of cb is vacant or d1 (if cb is at right boundary) neighbour node of cb is vacant then it will move 1 hop right (if cb is not at right boundary) or 1 hop down (if cb is at right boundary) (See Fig. 14 or Fig. 7). Case-2: If r1 (if cb is not at right boundary) neighbour node of cb is not vacant or d1 (if cb is at right boundary) neighbour node of cb is not vacant. Case-2.1: If u1 and u2 neighbour node of cb is vacant then cb will move upward by keeping necessary distance from its upward robot until it reaches upper boundary or near a red robot (See Fig. 16). Case-2.2: If u1 or u2 neighbour node of cb is occupied by a nonred robot then cb will do nothing.
Maximum Independent Set Formation on a Finite Grid by Myopic Robots
Case-2.3: If one of the u1 and u2 neighbour nodes of cb is occupied by a red robot and the other is vacant then it will turn blue (See Fig. 17).
If a green robot cd which is on the right boundary sees that its u1 neighbour node is occupied by a blue robot cc then there are the following cases.
Case-1: If the l1 and l2 neighbour nodes of cd are both vacant then cd will move 1 hop left.
Case-2: At least one of the l1 and l2 neighbour nodes of cd is not vacant.
Case-2.1: If the d1 neighbour node of cd is vacant then cd will move 1 hop down (See Fig. 15).
Case-2.2: If the d1 neighbour node of cd is occupied by a green robot then cd will turn blue (See Fig. 18).
Definition 5 (Blue Sequence). Consider a blue robot c such that either (i) its left node (if it exists) is not occupied by a blue robot, if c is not at the east boundary, or (ii) neither its left node nor its upward node (if they exist) is occupied by a blue robot, if c is at the east boundary. For case (i), scan the grid rightwards from c; when the scan hits the east boundary, continue scanning downwards. For case (ii), scan the grid downwards from c. In both cases, stop scanning on reaching a node that is not occupied by a blue robot. The trail of blue robots met in the scan is called a blue sequence.
Fig. 4. Blue Sequence and Adjacent Green Robot.
Fig. 5. 1 Hop Shifting has been Done and Tail.
In Fig. 4 the robots c1 , c2 , c3 , c4 , c5 , c6 , c7 , and c8 form the blue sequence.
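The scan in Definition 5 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the grid is assumed to be a dictionary mapping 1-indexed (row, column) pairs to color strings, with rows numbered from the top and column n taken as the east boundary. The names `is_sequence_head` and `blue_sequence` are hypothetical.

```python
def is_sequence_head(colors, node, n):
    """Check conditions (i)/(ii) of Definition 5 for a blue robot at `node`.

    `colors` maps (row, col) -> "green" | "blue" | "red" (assumed encoding);
    vacant or non-existent nodes are simply absent from the dictionary.
    """
    row, col = node
    if colors.get(node) != "blue":
        return False
    left_is_blue = colors.get((row, col - 1)) == "blue"
    if col < n:                      # case (i): not on the east boundary
        return not left_is_blue
    up_is_blue = colors.get((row - 1, col)) == "blue"
    return not left_is_blue and not up_is_blue   # case (ii)


def blue_sequence(colors, start, m, n):
    """Return the trail of blue robots scanned from `start` (Definition 5)."""
    row, col = start
    trail = []
    # Scan rightwards along the row while the nodes hold blue robots.
    while col <= n and colors.get((row, col)) == "blue":
        trail.append((row, col))
        col += 1
    if col <= n:                     # stopped before passing the east boundary
        return trail
    # The scan hit the east boundary: continue downwards along column n.
    row += 1
    while row <= m and colors.get((row, n)) == "blue":
        trail.append((row, n))
        row += 1
    return trail


# Hypothetical 5 x 6 configuration (not taken from Fig. 4).
colors = {(2, 3): "blue", (2, 4): "blue", (2, 5): "blue", (2, 6): "blue",
          (3, 6): "blue", (4, 6): "blue", (5, 6): "green"}
print(blue_sequence(colors, (2, 3), m=5, n=6))
```

For a head on the east boundary the first loop records only the head itself, so the same function covers case (ii) without a separate branch.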
R. Das et al.
A blue sequence can proceed till the (m − 1, n)th node at most. If the (m, n)th node is occupied by some robot, that robot will never be blue, since its d1 neighbour node does not exist and the existence of the d1 neighbour node is necessary for a robot on the right boundary to become blue.
Definition 6 (Adjacent Green Robot). Consider a blue sequence that will not proceed further. If the blue sequence ends before the right boundary, then the green robot at the r1 neighbour node of the rightmost blue robot of the sequence is called the adjacent green robot of the blue sequence. If the blue sequence continues through the right boundary, then the adjacent green robot is the green robot at the d1 neighbour node of the downmost blue robot of the sequence on the right boundary. In Fig. 4, c9 is the adjacent green robot of the blue sequence.
Definition 7 (Predecessor Blue Robots). Consider a blue robot ck of a blue sequence. All the robots which became blue in that blue sequence before the round in which ck became blue are called the predecessor blue robots of ck in that blue sequence. In Fig. 4, c1, c2, c3, c4, and c5 are the predecessor blue robots of c6.
If a blue robot ce which is not at the right boundary sees that its r1 neighbour node is vacant then it turns green and moves 1 hop right (See Fig. 12). If a blue robot ce which is at the right boundary sees its l1, l2, and d1 neighbour nodes then there are the following cases.
Case-1: If both the l1 and l2 neighbour nodes of ce are vacant then ce turns green and moves 1 hop left (See Fig. 19).
Case-2: If at least one of the l1 and l2 neighbour nodes of ce is not vacant and its d1 neighbour node is vacant then it turns green and moves 1 hop down (See Fig. 20).
Definition 8 (Tail). If the blue sequence continues through the right boundary and a blue robot of the sequence on the right boundary leaves the sequence by moving left, then the remaining blue robots of the sequence below the leaving blue robot are called the tail. In Fig. 5, c7, c8 is the tail after c6 leaves the blue sequence.
Definition 9 (1 Hop Shifting). If the adjacent green robot or any robot of the blue sequence moves from its position, then each of its predecessor blue robots moves 1 hop to fill the vacant node and to make the starting node of the blue sequence vacant. This is called one hop shifting of the blue sequence. In Fig. 5, one hop shifting of the blue sequence of Fig. 4 has been done after the robot c6 moves 1 hop left and makes its position vacant. The shifting happens in the following way. First c5 changes its color to green and moves downwards, and c4 does the same. Then c3 changes its color to green and moves right, and then first c2 and then c1 do the same. Hence, if no other
robot moves, then after finitely many rounds the configuration changes from that of Fig. 4 to that of Fig. 5.
If a blue robot ce which is at the right boundary sees that its u1 neighbour node is vacant and its l1 neighbour node is not occupied by a blue robot then it turns green (See Fig. 21). If a blue robot ce which is at the right boundary sees that its u1 neighbour node is occupied by a red robot and its l1 neighbour node is vacant then it turns green (See Fig. 21). If a blue robot ce which is at the right boundary sees that its u1 neighbour node is occupied by a green robot then it turns green (See Fig. 21).
Fig. 6. Views of Green Robot to Move Left: (a) GL1, (b) GL2, (c) GL3, (d) GL4.
Fig. 7. View of Green Robot to Move Downward when a blue Robot is at Left: (a) GD3.
Now we define some sets of views.
G1 = {GL1, GL2, GL3, GL4}
G2 = {GD1, GD2, GD3, GD4}
G3 = {GR1, GR2, GR3, GR4, GR5, GR6, GR7, GR8}
G4 = {GU1, GU2, GU3, GU4, GU5, GU6, GU7, GU8, GU9, GU10, GU11, GU12, GU13, GU14, GU15, GU16, GU17, GU18, GU19, GU20}
G5 = {GB1, GB2, GB3, GB4, GB5, GB6, GB7, GB8, GB9, GB10, GB11, GB12, GB13, GB14}
G6 = {G-R1, G-R2, G-R3, G-R4, G-R5, G-R6, G-R7}
B1 = {BGR1, BGR2, BGR3}
B2 = {BGL1}
B3 = {BGD1}
B4 = {BG1, BG2, BG3, BG4, BG5}
3.1 Correctness Proofs
Theorem 1. There are no collisions of robots while executing the MAX IS Formation Algorithm. Proof. There can be two types of collisions. Type-1: There is a robot present already in a node and another robot comes to that node. Type-2: More than one robot, each from a different node comes to a particular vacant node.
Fig. 8. Views of Green Robot to Move Upward Due to red Robot: (a) GU1, (b) GU2, (c) GU3, (d) GU4, (e) GU5, (f) GU6, (g) GU7, (h) GU8, (i) GU9, (j) GU10, (k) GU11, (l) GU12.
Fig. 9. Views of Green Robot to become Red: (a) G-R1, (b) G-R2, (c) G-R3, (d) G-R4, (e) G-R5, (f) G-R6, (g) G-R7.
According to the MAX IS Formation Algorithm no robot moves to a node that is already occupied. So there is no collision of Type-1. According to the MAX IS Formation Algorithm, there are four types of movement of a robot i.e. left move(L), right move(R), upward move(U), and downward move(D). Considering all possible combinations there can be six different collisions i.e. (LR), (LU ), (LD), (RU), (RD), (UD).
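The two collision types and the six move pairings considered in the proof can be made concrete with a short Python sketch. It is an illustration only: the encoding of a round as a dictionary of robot positions plus a dictionary of intended moves is an assumption, the names `move_pairs` and `find_collisions` are hypothetical, and the Type-1 test is deliberately conservative (it flags any move into a node that was occupied at the start of the round).

```python
from itertools import combinations

# Row index grows downward, column index grows rightward (assumed convention).
STEP = {"L": (0, -1), "R": (0, 1), "U": (-1, 0), "D": (1, 0)}


def move_pairs():
    """Enumerate the unordered pairs of distinct move types."""
    return ["".join(pair) for pair in combinations("LRUD", 2)]


def find_collisions(positions, moves):
    """Classify collisions for one round.

    positions: robot id -> (row, col); moves: robot id -> "L"/"R"/"U"/"D"
    (a robot missing from `moves` stays where it is).
    Returns (type1, type2): type1 lists (robot, destination) pairs that enter
    a node occupied at the start of the round; type2 maps a vacant
    destination to the robots that try to enter it simultaneously.
    """
    occupied = set(positions.values())
    targets = {}
    type1 = []
    for rid, (row, col) in positions.items():
        direction = moves.get(rid)
        if direction is None:
            continue
        d_row, d_col = STEP[direction]
        dest = (row + d_row, col + d_col)
        if dest in occupied:
            type1.append((rid, dest))
        else:
            targets.setdefault(dest, []).append(rid)
    type2 = {dest: ids for dest, ids in targets.items() if len(ids) > 1}
    return type1, type2


print(move_pairs())
print(find_collisions({"a": (2, 3), "b": (2, 1)}, {"a": "L", "b": "R"}))
```

A checker of this kind could be run after every simulated round of the algorithm; Theorem 1 asserts that both returned collections stay empty.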
Fig. 10. Views of Green Robot to Move Right Due to red Robot.
Fig. 11. Views of Green Robot to Move Downward Due to red Robot.
Fig. 12. Views of Blue Robot to become Green and Move Right.
Fig. 13. Views of Green Robot to become Blue Due to red Robot.
Fig. 14. Views of Green Robot to Move Right Due to blue Robot.
Fig. 15. View of Green Robot to Move Downward when a blue Robot is at Upward.
Fig. 16. Views of Green Robot to Move Upward Due to blue Robot.
Fig. 17. Views of Green Robot to become Blue when a blue Robot is at Left.
Fig. 18. View of Green Robot to become Blue when a blue Robot is at Upward.
Fig. 19. View of Blue Robot to become Green and Move Left.
Fig. 20. View of Blue Robot to become Green and Move Downward.
Fig. 21. Views of Blue Robot to become Green.
Algorithm 1: MAX IS Formation
Data: Positions and colors of robots within 2 hop distance
Result: One color and one destination point
if col(c) is green then
    if view(c) ∈ G1 then Move left
    else if view(c) ∈ G2 then Move downward
    else if view(c) ∈ G3 then Move right
    else if view(c) ∈ G4 then Move upward
    else if view(c) ∈ G5 then Change color to blue
    else if view(c) ∈ G6 then Change color to red
else if col(c) is blue then
    if view(c) ∈ B1 then Change color to green and move right
    else if view(c) ∈ B2 then Change color to green and move left
    else if view(c) ∈ B3 then Change color to green and move downward
    else if view(c) ∈ B4 then Change color to green
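The dispatch structure of Algorithm 1 can be sketched as a table lookup in Python. The sketch assumes that a robot's two hop snapshot has already been matched to one of the view labels (GL1, GU7, BG3, and so on); that matching step depends on the figures and is not reproduced here. The names `labels`, `RULES` and `compute` are hypothetical, and the set sizes are those listed in the definitions of G1 to G6 and B1 to B4.

```python
def labels(prefix, count):
    """Build the label set {prefix1, ..., prefix<count>}."""
    return {f"{prefix}{k}" for k in range(1, count + 1)}


# Each rule: (set of view labels, new color, move or None), tried in order.
RULES = {
    "green": [
        (labels("GL", 4), "green", "left"),     # G1
        (labels("GD", 4), "green", "down"),     # G2
        (labels("GR", 8), "green", "right"),    # G3
        (labels("GU", 20), "green", "up"),      # G4
        (labels("GB", 14), "blue", None),       # G5
        (labels("G-R", 7), "red", None),        # G6
    ],
    "blue": [
        (labels("BGR", 3), "green", "right"),   # B1
        (labels("BGL", 1), "green", "left"),    # B2
        (labels("BGD", 1), "green", "down"),    # B3
        (labels("BG", 5), "green", None),       # B4
    ],
}


def compute(color, view):
    """Return (new_color, move) for a robot with the given color and view label.

    Red robots, and robots whose view matches no rule, keep their color
    and do not move.
    """
    for views, new_color, move in RULES.get(color, []):
        if view in views:
            return new_color, move
    return color, None


print(compute("green", "GL2"))
print(compute("blue", "BGD1"))
```

Keeping the rules in an ordered table mirrors the if/else chain of the pseudocode; since the label sets are disjoint, the order does not affect the result here.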
(LR): A robot will move left if its view belongs to G1 or B2 . From Fig. 6 and Fig. 19 it is clear that for (L) movement of a robot c, l1 neighbour node of c will always remain vacant and l2 neighbour node of c is vacant or occupied by a red robot or does not exist. There is no robot that will move to l1 neighbour node of c by (R) movement. So there is no (LR) collision.
(LU): A robot will move upward if its view belongs to G4. From Fig. 16 and Fig. 8 it is clear that for a (U) movement of a robot c, the ur neighbour node of c is vacant or does not exist. There is no robot that will move to the u1 neighbour node of c by an (L) movement. So there is no (LU) collision.
(LD): A robot will move downward if its view belongs to G2 or B3. From Fig. 20, Fig. 7, Fig. 15 and Fig. 11 it is clear that for a (D) movement of a robot c, the dr neighbour node of c does not exist. There is no robot which will move to the d1 neighbour node of c by an (L) movement. So there is no (LD) collision.
(RU): A robot will move upward if its view belongs to G4. From Fig. 8 and Fig. 16 it is clear that for a (U) movement of a robot c, the ul neighbour node of c is vacant or occupied by a red robot or does not exist. There is no robot which will move to the u1 neighbour node of c by an (R) movement. So there is no (RU) collision.
(RD): A robot will move downward if its view belongs to G2 or B3. From Fig. 20, Fig. 7, Fig. 15 and Fig. 11 it is clear that a (D) movement of a robot is possible along the right boundary only. A robot will move right if its view belongs to G3 or B1. From Fig. 14, Fig. 10, and Fig. 12 it is clear that for an (R) movement of a robot c, either the r1 neighbour node of c is on the right boundary and the ur neighbour node of c is vacant or occupied by a red robot, or the r1 neighbour node of c is not on the right boundary. There is no robot that will move to the r1 neighbour node of c by a (D) movement. So there is no (RD) collision.
(UD): A robot will move upward if its view belongs to G4. From Fig. 16 and Fig. 8 it is clear that for a (U) movement of a robot c, the u1 neighbour node of c is vacant and the u2 neighbour node of c is vacant or does not exist. There is no robot which will move to the u1 neighbour node of c by a (D) movement. So there is no (UD) collision.
Lemma 1. If all robots have turned their color to red then the set of robot occupied grid nodes forms a MAX IS of G.
Proof.
First we show that the set of robot occupied grid nodes is an independent set of G, by showing that no two red robots are adjacent. Suppose, contrary to our claim, that there are two adjacent red robots c1 and c2. If c1 and c2 are on the same column then let c2 be the robot below c1, and if c1 and c2 are on the same row then let c2 be the robot to the right of c1. Let c1 and c2 change their color to red in the t1-th and t2-th round respectively. Now there are two possibilities.
Case-I: (t1 ≤ t2) Since red robots never move, throughout the t2-th round the robot c1 is at the u1 or l1 neighbour node of c2. According to our proposed algorithm, c2 will change its color to red only if it sees a view belonging to the set G6. But no view in G6 allows the u1 or l1 neighbour node of c2 to be occupied by a robot. This is a contradiction.
Case-II: (t1 > t2) In this case c2 becomes red and gets fixed before c1. Hence the l2, u2 and ul neighbours of c2 must be occupied by red robots if these neighbour nodes exist, and this persists in the t1-th round also (since red robots never move). If the robot c1 is at the u1 (or l1) neighbour node of c2, then the u1 (or l1) neighbour node of c2 exists. Hence the view of c2 in the t2-th round must be one of G-R2
(replace G-R2 by G-R4 for the case when c1 is at the l1 neighbour node of c2), G-R3, G-R5, G-R6 and G-R7. In all such views either the l1 or the u1 neighbour node of c1 is occupied by a red robot. Hence c1 would not change its color to red in the t1-th round, which is a contradiction. Hence if all robots turn red then the robot occupied nodes form an independent set. Now the number of robots is ⌈mn/2⌉, which is the maximum possible size of an independent set of G. Since Theorem 1 gives that there is no collision of robots, all the red robots must be at distinct grid nodes. So the number of robot occupied nodes after all robots have turned red is also ⌈mn/2⌉. Thus, the independent set formed by the robot occupied grid nodes is a MAX IS.
Lemma 2. If a row consists of three types of robots, i.e. red, blue, and green, then the red robots will be at the left, the blue robots will be in the middle and the green robots will be at the right of the row.
Proof. A green robot becomes red when it sees that its l2, ul, u2 neighbour nodes (if they exist) are occupied by red robots and its l1, u1 neighbour nodes (if they exist) are vacant. So there cannot be any green or blue robot to the left of a red robot. Thus the red robots are at the left of a row. When a blue sequence starts, the l1 neighbour node (if it exists) of the robot which became blue first is occupied by a red robot. A blue sequence is a sequence of blue robots which are at consecutive nodes. So there is no green robot in the middle of a blue sequence. Thus the red robots will be at the left, the blue robots will be in the middle and the green robots will be at the right of the row.
Lemma 3. If there are ⌈n/2⌉ − 2 robots present in a row consisting of (n − 1) nodes, then after finitely many rounds the (n − 1)th and (n − 2)th nodes will be vacant.
Proof. In a row, the distance of a red robot from its immediate left or immediate right red robot is exactly two hops. The distance of a red robot from its immediate right blue robot is exactly one hop. After finitely many rounds, the distance of a green robot from its immediate left or immediate right green robot will be at most two hops, since all green robots move left keeping two hop distance. The distance of the blue robot which became blue last from its immediate right green robot is exactly one hop. The distance of a blue robot from its immediate left or immediate right blue robot is exactly one hop. Thus by Lemma 2, in a row the distance of a robot from its immediate left or immediate right robot is at most two hops. The maximum possible number of robots is n/2 − 2 (if n is even) or (n + 1)/2 − 2 (if n is odd). Suppose we try to place the robots in such a way that the (n − 1)th and (n − 2)th nodes do not remain empty. If we put the robots on the even positioned nodes then the ith robot will be at the 2ith node. The (n/2 − 2)th robot will be at the (n − 4)th node, and the ((n + 1)/2 − 2)th robot will be at the (n − 3)th node. Thus in both cases the (n − 1)th and (n − 2)th nodes will be vacant.
Lemma 4. Let c1 be the leftmost nonred robot in the topmost nonred robot occupied row. Let c1 be a part of a blue sequence and c1 be the first robot
that turned blue for any one view from Fig. 13. If the blue sequence ends at the (m − 1, n) node then one hop shifting will be done.
Proof. If the blue sequence starts at the f-th row, continues through the right boundary and ends at the (m − 1, n) node, then the 1st, 2nd, ..., ith, ..., (f − 1)th rows contain u1, u2, ..., ui, ..., u(f−1) robots respectively. The f-th row contains more than uf robots. The (f + 1)th, (f + 2)th, ..., mth rows together contain fewer than u(f+1) + u(f+2) + ... + um robots. So there is at least one row (say the h-th row) which contains fewer than uh robots. If more than one such row exists then consider the topmost row (say the l-th row) which contains fewer than ul robots. If any robot comes from the row below and brings the number of robots in the l-th row up to ul, then we consider a lower row, the q-th row, which contains fewer than uq robots. If this continues, since the number of rows is constant, we must reach a row (say the w-th row) where the number of robots is less than uw and no robots enter from below. Without loss of generality, we call such a row the z-th row. If n is odd then uz is ⌈n/2⌉ when z is odd and ⌈n/2⌉ − 1 when z is even. If n is even then uz is n/2. If we consider the z-th row except its right boundary node, which is occupied by a blue robot or a green robot, then there are (n − 1) nodes and at most ⌈n/2⌉ − 2 robots. After finitely many rounds, when all the robots of the z-th row except the rightmost blue or green robot are at most two hops from each other, the (z, n − 2) and (z, n − 1) nodes will be vacant by Lemma 3. Then the rightmost robot of the z-th row will move from the (z, n) node to the (z, n − 1) node and one hop shifting will be done automatically.
Theorem 2. The MAX IS Formation Algorithm forms a maximum independent set after finitely many rounds without any collisions.
Proof. Consider the uppermost row which contains at least one green or blue robot. If there is no such row then every robot present in the grid is red. Therefore by Lemma 1 the proof is done.
Let there exist a row (say the w-th row) that contains at least one green or blue robot. Let c1 be the leftmost nonred robot on that row. c1 can be green or blue. Note that all the robots present in the upper-left quadrant of c1 are red and they are fixed.
Case-1: c1 is green. c1 continues moving left as long as it sees any view from {GL1, GL2, GL3, GL4} (Fig. 6). While c1 is progressing left through the row, if any green robot from the row below moves upwards and comes to the left of c1 then we consider the new robot as c1. If c1 does not see any view from {GL1, GL2, GL3, GL4} (Fig. 6) then it must see some view from {GB1, GB2, GB3, GB4, GB5, GD1, GD2, GR1, GR2, GR3, GR4, GR5, GU1, GU2, GU3, GU4, GU5, GU6, GU7, GU8, GU9, GU10, GU11, GU12, G-R1, G-R2, G-R3, G-R4, G-R5, G-R6, G-R7} (Fig. 13, Fig. 11, Fig. 10, Fig. 8, Fig. 9). If c1 sees any one view from {GB1, GB2, GB3, GB4, GB5} (Fig. 13) then it turns blue and goes to Case-2. If c1 sees any one view from {GD1, GD2} (Fig. 11) then it will go to its d1 neighbour node. Now we may get a new c1, since there may exist some nonred robot at the left in the current row. Now, c1 will not move to its d1 neighbour node and will remain c1 since it will not get any view
from {GD1, GD2} (Fig. 11). If c1 sees any one view from {GR1, GR2, GR3, GR4, GR5} (Fig. 10) then it will go to its r1 neighbour node. Now it will not see any view from {GR1, GR2, GR3, GR4, GR5} (Fig. 10) and {GD1, GD2} (Fig. 11). If c1 sees any one view from {G-R1, G-R2, G-R3, G-R4, G-R5, G-R6, G-R7} (Fig. 9) then it turns red. Else c1 will see some view from {GU1, GU2, GU3, GU4, GU5, GU6, GU7, GU8, GU9, GU10, GU11, GU12} (Fig. 8) and continues moving upward until it sees a view from {G-R1, G-R2, G-R3, G-R4, G-R5, G-R6, G-R7} (Fig. 9). Finally, c1 will see a view from {G-R1, G-R2, G-R3, G-R4, G-R5, G-R6, G-R7} (Fig. 9) and will turn red.
Case-2: c1 is blue. A blue robot became blue as a part of a blue sequence. Now it is either a part of a blue sequence or a part of a tail. As c1 is the leftmost nonred robot in the topmost nonred robot occupied row, there can be two cases.
Case-2.1: If c1 is a part of a tail then c1 is at the right boundary and is the topmost blue robot of the tail. If we consider the l1 and u1 neighbour nodes of c1 then there can be four types of configurations. In three of these types, i.e. {BG1, BG3, BG5} (Fig. 21), at least one of the l1 and u1 neighbour nodes of c1 is not occupied by a red robot and c1 will turn green. The robot c1 then goes to Case-1, and this c1 will never become blue again, as it was the upmost robot of a tail, all the robots in the upper-left quadrant of c1 are red, and at least one of the l1 and u1 neighbour nodes of c1 is not occupied by a red robot. If both the l1 and u1 neighbour nodes of c1 are occupied by red robots then c1 goes to Case-2.2 (similar to {GB5} (Fig. 13)).
Case-2.2: If c1 is a part of a blue sequence then c1 is the first robot that turned blue for any one view from Fig. 13. The blue sequence can continue along the row and the right boundary.
Case-2.2.1: If the blue sequence ends before the (m − 1, n) node, 1 hop shifting will be done after the adjacent green robot moves from its node. One hop shifting would be done earlier than that if any blue robot of the blue sequence on the right boundary moves left.
Case-2.2.2: If the blue sequence ends at the (m − 1, n) node, one hop shifting will be done by Lemma 4.
Now, c1 will turn green and go to Case-1. This c1 will never become blue again, as its l1 or u1 neighbour node is vacant and all the robots in the upper-left quadrant of c1 are red.
Thus, each time we select a nonred robot, that nonred robot eventually becomes a red robot. Since the total number of robots is finite, after finitely many rounds all the robots will be red. Therefore by Lemma 1, the proof follows.
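The terminal condition used in Lemma 1 and Theorem 2 is easy to check mechanically. The Python sketch below tests whether a set of occupied nodes of an m × n grid is an independent set of the maximum possible size ⌈mn/2⌉. It is a verification aid under assumed conventions (1-indexed (row, column) pairs, four-neighbour grid adjacency), not part of the robots' algorithm, and the function names are hypothetical.

```python
def is_independent(nodes):
    """True if no two nodes are adjacent in the grid (sharing an edge)."""
    occupied = set(nodes)
    # Checking only the lower and right neighbour covers every edge once.
    return all((r + 1, c) not in occupied and (r, c + 1) not in occupied
               for r, c in occupied)


def max_is_size(m, n):
    """Size of a maximum independent set of an m x n grid: ceil(m*n / 2)."""
    return (m * n + 1) // 2


def is_max_is(nodes, m, n):
    """True if `nodes` lie in the grid and form a maximum independent set."""
    occupied = set(nodes)
    in_grid = all(1 <= r <= m and 1 <= c <= n for r, c in occupied)
    return (in_grid and is_independent(occupied)
            and len(occupied) == max_is_size(m, n))


# Checkerboard pattern on a 3 x 3 grid containing the upper-left corner.
board = {(r, c) for r in range(1, 4) for c in range(1, 4) if (r + c) % 2 == 0}
print(is_max_is(board, 3, 3))
```

The size test mirrors the counting argument in the proof of Lemma 1: with ⌈mn/2⌉ robots on distinct nodes and no two adjacent, the occupied nodes already form a MAX IS.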
4 Conclusion
This work presents an algorithm that forms a Maximum Independent Set (MAX IS) on a finite rectangular grid G by myopic robots. If the size of a maximum independent set of G is p then initially p robots are placed arbitrarily on distinct nodes of G. If there are fewer than p robots, our algorithm simply ends up forming an independent set. The robots are considered to be luminous and have a light that can take three distinct colors. We assume the robots agree on the global notion of north, south, east, and west direction. The robots have two hop visibility. The robots are controlled under an adversarial semi-synchronous scheduler. In contrast to the previous MAX IS formation algorithms, the algorithm proposed in this work does not use the door concept. It allows the robots to form a MAX IS from any arbitrary starting configuration. This generalizes the initial condition of the previous works for rectangular grid topology. In this work, we assumed two hop visibility of robots, so as a future direction one can try to propose a MAX IS formation algorithm which uses only one hop visibility. Further, it would be interesting to provide an algorithm for the same problem under an asynchronous scheduler.
Acknowledgement. The first two authors are supported by CSIR, Govt. of India and UGC, Govt. of India respectively.
References
1. Bose, K., Adhikary, R., Kundu, M.K., Sau, B.: Arbitrary pattern formation on infinite grid by asynchronous oblivious robots. Theor. Comput. Sci. 815, 213–227 (2020)
2. Bose, K., Kundu, M.K., Adhikary, R., Sau, B.: Arbitrary pattern formation by asynchronous opaque robots with lights. Theor. Comput. Sci. 849, 138–158 (2021)
3. d’Angelo, G., Di Stefano, G., Klasing, R., Navarra, A.: Gathering of robots on anonymous grids and trees without multiplicity detection. Theoret. Comput. Sci. 610, 158–168 (2016)
4. Devismes, S., Lamani, A., Petit, F., Raymond, P., Tixeuil, S.: Terminating exploration of a grid by an optimal number of asynchronous oblivious robots. Comput. J. 64(1), 132–154 (2021)
5. Flocchini, P., Prencipe, G., Santoro, N., Widmayer, P.: Arbitrary pattern formation by asynchronous, anonymous, oblivious robots. Theoret. Comput. Sci. 407(1–3), 412–447 (2008)
6. Kamei, S., Lamani, A., Ooshita, F., Tixeuil, S., Wada, K.: Gathering on rings for myopic asynchronous robots with lights. arXiv preprint arXiv:1911.04757 (2019)
7. Kamei, S., Tixeuil, S.: An asynchronous maximum independent set algorithm by myopic luminous robots on grids. arXiv preprint arXiv:2012.03399 (2020)
8. Kshemkalyani, A.D., Molla, A.R., Sharma, G.: Fast dispersion of mobile robots on arbitrary graphs. In: Dressler, F., Scheideler, C. (eds.) ALGOSENSORS 2019. LNCS, vol. 11931, pp. 23–40. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-34405-4_2
9. Kshemkalyani, A.D., Molla, A.R., Sharma, G.: Dispersion of mobile robots on grids. In: Rahman, M.S., Sadakane, K., Sung, W.-K. (eds.) WALCOM 2020. LNCS, vol. 12049, pp. 183–197. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-39881-1_16
10. Kshemkalyani, A.D., Molla, A.R., Sharma, G.: Efficient dispersion of mobile robots on dynamic graphs. In: 2020 IEEE 40th International Conference on Distributed Computing Systems (ICDCS), pp. 732–742. IEEE (2020)
11. Nagahama, S., Ooshita, F., Inoue, M.: Ring exploration of myopic luminous robots with visibility more than one. In: Ghaffari, M., Nesterenko, M., Tixeuil, S., Tucci, S., Yamauchi, Y. (eds.) SSS 2019. LNCS, vol. 11914, pp. 256–271. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-34992-9_20
12. Poudel, P., Sharma, G.: Time-optimal uniform scattering in a grid. In: Proceedings of the 20th International Conference on Distributed Computing and Networking, pp. 228–237 (2019)
13. Poudel, P., Sharma, G.: Fast uniform scattering on a grid for asynchronous oblivious robots. In: Devismes, S., Mittal, N. (eds.) SSS 2020. LNCS, vol. 12514, pp. 211–228. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-64348-5_17
14. Pramanick, S., Samala, S.V., Pattanayak, D., Mandal, P.S.: Filling MIS vertices by myopic luminous robots. CoRR, arXiv:abs/2107.04885 (2021)
Generation of Time-Varying Feedback-Based Wheel Lock Attack Policies with Minimal Knowledge of the Traction Dynamics
Alireza Mohammadi and Hafiz Malik
University of Michigan-Dearborn, Dearborn, MI 48128, USA
[email protected]
https://www-personal.umd.umich.edu/~amohmmad/index.html
Abstract. There are a variety of ways, ranging from reflashing of targeted electronic control units (ECUs) to hijacking the control of a fleet of wheeled mobile robots, through which adversaries can execute attacks on the actuators of mobile robots and autonomous vehicles. Independent of the source of cyber-physical infiltration, assessing the physical capabilities of an adversary who has made it to the last stage and is directly controlling the cyber-physical system actuators is of crucial importance. This paper investigates the potential of an adversary who can directly manipulate the traction dynamics of wheeled mobile robots and autonomous vehicles but has very limited knowledge of the physical parameters of the traction dynamics. It is shown that the adversary can exploit a new class of closed-loop attack policies that can be executed against the traction dynamics, leading to wheel lock conditions. In comparison with a previously proposed wheel lock closed-loop attack policy, the attack policy in this paper relies on fewer computations and less knowledge of the traction dynamics. Furthermore, the proposed attack policy generates smooth actuator input signals and is thus harder to detect. Simulation results using various tire-ground interaction conditions demonstrate the effectiveness of the proposed wheel lock attack policy.
Keywords: Cyber-physical Systems · Robotics
1 Introduction
The past decade has witnessed the proliferation of mobility-related cyber-physical systems [10], ranging from vehicles with autonomous and connected features [29] to small UAVs for inspecting critical infrastructures and agricultural robotics [20]. The interconnected and autonomous features associated with mobility-related cyber-physical systems have been demonstrated to accompany serious security threats, as evidenced by several recent successful attacks such as the wireless hack of a Tesla vehicle CAN bus [36] or successful adversarial hijacking of drones [39]. These risks span a plethora of scenarios such as exploitation of the critical vulnerabilities of in-vehicle networks by adversaries that try to take over the self-driving features of target vehicles [21], jamming/spoofing the GPS signals used by a fleet of service UAVs [41], and attacks against onboard charging systems of electric vehicles [9], to name a few.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023. K. Arai (Ed.): SAI 2023, LNNS 739, pp. 1268–1281, 2023. https://doi.org/10.1007/978-3-031-37963-5_87
Impact of cyber-attacks, as assessed in information security risk management [19,46], is often concerned with the information/cyber system damage, ranging from denying critical service functionality to sensitive information disclosure. In a cyber-physical system, on the other hand, attacks on the system constituents can also induce potential damage extending beyond the cyber-realm and impacting the physical components of the system [3,47], as documented by infamous malware such as Triton and Industroyer (see, e.g., [23]). Accordingly, physics-based impact assessment of cyber-attacks on physical processes has become one of the emerging aspects of security analysis in the cyber-physical systems literature [14,33].
An important aspect of being able to assess the risk of cyberattacks on smart mobility applications is to search the space of attack policies through which adversaries can induce physical damage on their target systems [46]. In other words, the physical implications of cyberattacks against smart mobility systems, such as remotely steering a vehicle into a ditch as discussed by Miller [30], lead to the natural question posed by Fröschle and Stühring [11]: “Once an attacker has made it to the last stage, what exactly are his capabilities?”
The traction dynamics of vehicles and mobile robots, through which the wheels move with respect to the tangential ground surface (see Fig. 1), can be attacked in a variety of ways such as spoofing attacks against the anti-lock braking system, reflashing the brake ECUs, and sending malicious brake commands through the CAN bus of the vehicle, amongst others (see, e.g., [13,17,18,21,22,28,40,43]). Independent of the source of cyber-physical vulnerability, a frightening feature of vehicular cyberattacks on steering and braking actuators (see, e.g., [30–32]) is that they are forensically scentless (i.e., leaving no forensic evidence behind) and are almost invisible to the driver. Accordingly, it is of crucial importance to search and assess the space of attack policies through which an adversary can induce maximum physical damage on their target system.
Fig. 1. Various Classes of Wheeled Mobile Platforms where Traction, through the Longitudinal Friction Force between the Ground and the Wheels, Slows or Accelerates the Motion: (Left) a Two-Wheeled Segway Robot [16]; (Middle) a 3-Wheeled Mobile Platform with Rear Wheel Torque Vectoring [1]; and (Right) an n-Wheeled Vehicle on Uneven Terrain [15]. In Attacks against the Traction Dynamics, the Adversary Seeks to Induce Wheel Lock Conditions with a Minimal Knowledge of the Tire-Ground Physical Interaction Characteristics.
The first steps in assessing the physical capabilities of an adversary manipulating the vehicular traction dynamics have been taken in [34,35], where the authors model the cyber-physical threat of an adversary as a closed-loop attack policy design problem, which can be executed on the vehicle braking actuators. To demonstrate the physical capabilities of an adversary who has limited knowledge of the vehicle traction dynamics and the tire-ground interaction characteristics, the authors utilized a predefined-time controller [42] and a nonlinear disturbance observer [26] to design a brake attack policy that induces wheel lockup conditions in a finite time interval.
A drawback of the attack policies proposed in [34,35] is their reliance on the nonlinear disturbance observer feedforward computations, which compensate for the lack of adversarial knowledge about the physical parameters of the traction dynamics. Furthermore, the attack signals generated in [34,35] might be non-smooth and therefore easier to detect through anomaly detection algorithms [12]. Finally, the effect of such wheel lock attacks on the overall motion of the vehicle under attack was not investigated.
This paper demonstrates that an adversary with very limited knowledge of the tire-ground interaction characteristics can induce wheel lock conditions through a properly designed closed-loop attack policy against the traction dynamics.
Unlike the previously designed wheel lock attack policy in [34,35], the new attack policy does not rely on the computation of nonlinear disturbance observer-based feedforward terms. Furthermore, the new feedback control input is generated through a time-varying controller with prescribed convergence time [44] that can cause wheel lockups even when the physical parameters of the traction dynamics are not known a priori (see Fig. 2). Finally, the attack signals generated by the policy in this paper are guaranteed to be smooth and hence harder to detect through anomaly detection algorithms for monitoring the actuator signals [12].
Contributions of the Paper. This paper proposes a new class of traction dynamics attack policies that can be executed against mobile robots and autonomous vehicles. In comparison with the existing results in [34,35], the attack policy proposed in this article, which is based on time-varying feedback control schemes with prescribed convergence time, requires fewer computations. Furthermore, the proposed attack policy is guaranteed to generate smooth actuator input signals and thus is harder to detect. Moreover, the effectiveness of the proposed attack policy is demonstrated in terms of its impact on the overall motion of the mobile platform under attack through various simulation scenarios.
The rest of this paper is organized as follows. First, we present the vehicle traction dynamics and formulate the wheel lock attack policy objective in terms of these dynamics in Sect. 2. Thereafter, in Sect. 3, we present our attack policy, which is based on time-varying feedback controllers with prescribed convergence time. Next, we validate the effectiveness of the proposed attack policy using various ground conditions and demonstrate the destabilizing effect of such wheel lock attacks on the overall motion of a 4-wheeled vehicle through simulations in Sect. 4. Finally, we conclude the paper with future research directions and final remarks in Sect. 5.
2
Traction Dynamics
In this section, we briefly present the single-wheel model of traction dynamics. This dynamical model can effectively capture the steady and transient tractive performance while demonstrating how a vehicle or wheeled mobile robot can end up in a wheel lock condition (see, e.g., [45,48] for the wheel slip dynamics of wheeled mobile robots and [6,27] for that of the vehicles). Often, the states of the traction dynamical system are selected to be the forward mobile robot/vehicle speed and tire/wheel rate of rotation. The dynamics that govern the states of the traction dynamical system are given by (see, e.g., [6])
\[ \dot{v} = -g_\alpha \mu(\lambda) - \frac{\Delta_v(t, v)}{M}, \tag{1a} \]
\[ \dot{\omega} = \frac{M g_\alpha r}{J} \mu(\lambda) - \frac{T_a}{J} - \frac{\Delta_w(t, \omega)}{J}, \tag{1b} \]
where the parameters $M$, $r$, and $J$ are the vehicle/mobile robot mass, wheel radius, and wheel inertia, respectively. Additionally, during deceleration, the mobile robot/vehicle speed $v$ and the wheel rotational speed $\omega$ vary within the set
\[ D_b := \{(v, \omega) \mid v > 0,\ 0 \le r\omega \le v\}. \tag{2} \]
In the dynamics given by (1), the torque $T_a$, resulting from either the electric motors of the mobile robot or the vehicle brakes, is the input to the dynamical system. Furthermore, the longitudinal slip $\lambda$ that determines whether the wheel is locked is given by
\[ \lambda := \frac{v - r\omega}{\max(v, r\omega)}. \tag{3} \]
While the traction input actuators are engaged, we have $\lambda = (v - r\omega)/v$ and $(v, \omega) \in D_b$. Accordingly, the longitudinal slip value $\lambda$ belongs to the closed interval $[0, 1]$ during deceleration. Given the ground slope $\alpha$, we denote the tangential acceleration $g\cos(\alpha)$ by $g_\alpha$. Finally, $\mu(\lambda)$, $\Delta_v(t, v)$, and $\Delta_w(t, \omega)$ denote the uncertain nonlinear friction coefficient, the force disturbance, and the torque disturbance resulting from tractive unmodeled dynamics, respectively. There are numerous ways to represent the nonlinear friction coefficient function $\mu(\cdot)$, including the Burckhardt equation (see, e.g., [5]). For instance, equations like the Burckhardt model (see, e.g., [7]), where
\[ \mu(\lambda) = c_1 \left(1 - \exp(-c_2 \lambda)\right) - c_3 \lambda, \tag{4} \]
are empirical equations, which are based on coefficient curve fitting and are widely employed in modeling the tire/ground interaction. The longitudinal force on the tire arising from this interaction is computed by $-\mu(\lambda) g_\alpha$. In this paper, no particular closed-form representation is assumed for the function $\mu(\cdot)$. In accordance with the traction dynamics control literature (see, e.g., [6]), we assume that the unknown disturbance acting on the speed dynamics, i.e., $\Delta_v(t, v)$, and the unknown disturbance acting on the wheel angular speed dynamics, i.e., $\Delta_w(t, \omega)$, respect the following inequalities
\[ |\Delta_v(t, v)| \le \bar{\Delta}_v, \quad |\Delta_w(t, \omega)| \le \bar{\Delta}_\omega, \quad \text{for all } (t, v, \omega) \in [0, \infty) \times D_b. \tag{5} \]
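The slip definition in (3) and the three-parameter Burckhardt model in (4) can be sketched in a few lines of Python. The coefficient triples below are values commonly quoted in the tire-modeling literature and are included only for illustration; they are an assumption of this sketch and are not claimed to be the values used in this paper or in its references. The helper `peak_slip` is likewise hypothetical and simply solves $d\mu/d\lambda = 0$ for the model in (4).

```python
import math

# Illustrative Burckhardt coefficients (c1, c2, c3) per surface.
# ASSUMPTION: commonly quoted literature values, not taken from this paper.
BURCKHARDT_COEFFS = {
    "dry_asphalt": (1.2801, 23.99, 0.52),
    "wet_asphalt": (0.857, 33.822, 0.347),
    "dry_cobblestone": (1.3713, 6.4565, 0.6691),
    "wet_cobblestone": (0.4004, 33.708, 0.1204),
}


def longitudinal_slip(v, omega, r):
    """Longitudinal slip as defined in (3)."""
    return (v - r * omega) / max(v, r * omega)


def burckhardt_mu(slip, c1, c2, c3):
    """Friction coefficient mu(lambda) of the three-parameter model (4)."""
    return c1 * (1.0 - math.exp(-c2 * slip)) - c3 * slip


def peak_slip(c1, c2, c3):
    """Slip at which (4) is maximal, from c1*c2*exp(-c2*slip) = c3."""
    return math.log(c1 * c2 / c3) / c2


if __name__ == "__main__":
    for surface, coeffs in BURCKHARDT_COEFFS.items():
        s = peak_slip(*coeffs)
        print(f"{surface}: peak slip {s:.3f}, peak mu {burckhardt_mu(s, *coeffs):.3f}")
```

Beyond the peak slip the friction coefficient decreases with slip, which is the regime a wheel lock attack drives the wheel into.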
As noted by Olson et al. in [37], it is more beneficial to change the coordinates of the traction dynamics in (1) from the pair of longitudinal speed and wheel angular speed, i.e., $(v, \omega)$, to the pair of longitudinal speed and longitudinal slip, i.e., $(v, \lambda)$. After the change of coordinates, the longitudinal dynamics read as
\[ \dot{v} = -g_\alpha \mu(\lambda) - \frac{\Delta_v(t, v)}{M}, \tag{6a} \]
\[ \dot{\lambda} = \frac{g_\alpha}{v} \left[ (\lambda - 1 - \nu)\mu(\lambda) + \Upsilon_a + \Upsilon_{\Delta,w} + (\lambda - 1)\Upsilon_{\Delta,v} \right], \tag{6b} \]
where $\nu := \frac{M r^2}{J}$ denotes a dimensionless ratio, $\Upsilon_a := \frac{r}{J g_\alpha} T_a$ is the dimensionless traction dynamics control input, and $\Upsilon_{\Delta,v} := \frac{\Delta_v(t, v)}{M g_\alpha}$ and $\Upsilon_{\Delta,w} := \frac{r}{J g_\alpha} \Delta_w(t, \omega)$ are the dimensionless force and torque disturbances affecting the speed and the longitudinal slip dynamics, respectively. The dynamical systems given by (1) and (6) take into account the intercoupling between the wheel slip $\lambda$ and the mobile platform speed $v$ in (6), or between the wheel angular speed $\omega$ and the mobile platform speed $v$ in (1). In the coordinates given by $(v, \lambda)$, the set $D_b$ in (2), which is the state space of the traction dynamics, can be written as
\[ D_b = \{(v, \lambda) \mid v > 0,\ \lambda \in \Lambda := [0, 1]\}. \tag{7} \]
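As a consistency check on the change of coordinates, the slip dynamics can be recovered from the speed and wheel dynamics in (1) by direct differentiation. The derivation below is a sketch under the deceleration assumption $\lambda = 1 - r\omega/v$, with $\nu = M r^2 / J$, $\Upsilon_a = r T_a / (J g_\alpha)$, $\Upsilon_{\Delta,w} = r \Delta_w / (J g_\alpha)$, and $\Upsilon_{\Delta,v} = \Delta_v / (M g_\alpha)$; the intermediate steps are not spelled out in the source.

```latex
\begin{aligned}
\dot{\lambda}
  &= -\frac{r}{v}\,\dot{\omega} + \frac{r\omega}{v^{2}}\,\dot{v}
   = -\frac{r}{v}\,\dot{\omega} + \frac{1-\lambda}{v}\,\dot{v} \\
  &= -\frac{r}{v}\left(\frac{M g_\alpha r}{J}\,\mu(\lambda) - \frac{T_a}{J} - \frac{\Delta_w}{J}\right)
     + \frac{1-\lambda}{v}\left(-g_\alpha\,\mu(\lambda) - \frac{\Delta_v}{M}\right) \\
  &= \frac{g_\alpha}{v}\left[(\lambda - 1 - \nu)\,\mu(\lambda) + \Upsilon_a
     + \Upsilon_{\Delta,w} + (\lambda - 1)\,\Upsilon_{\Delta,v}\right].
\end{aligned}
```

The second equality uses $r\omega/v = 1 - \lambda$, and the last line collects the common factor $g_\alpha / v$.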
Both the literature of automotive cybersecurity (see, e.g., [11,30]) and mobile robotics cybersecurity (see, e.g., [2,25]) outline a plethora of threats through which an adversary can manipulate the traction dynamics. This cyber-physical threat capability of an adversary can be formulated as a closed-loop attack policy design for the vehicle/mobile robot traction dynamics actuators (see Fig. 2). To assess the physical capabilities of the adversary who can manipulate the traction dynamics of the mobile platform by utilizing the tractive control input $\Upsilon_a$, we consider the case where the adversary desires to induce unstable tractive behavior by wheel locks. To consider the most severe case of wheel lock, i.e., when the longitudinal slip satisfies $\lambda = 1$, we define the lockup manifold in the following way
\[ W_b^L := \{(v, \lambda) \mid v > 0,\ \lambda = 1\}. \tag{8} \]
Fig. 2. This Paper Assesses the Physical Capabilities of an Adversary who has Infiltrated the Control System Associated with the Traction Dynamics. Despite having a Limited Knowledge of the Underlying Physical Parameters, the Adversary is Trying to Induce Wheel Lock using the Input Actuators, e.g., Electric Motors of Mobile Robot Wheels or Vehicle Brake Actuators.
It is remarked that the wheel lockup manifold was originally defined by Olson et al. in [37] to study the stability of vehicular traction dynamics. Furthermore, we remark that the adversary can set the slip reference value $\lambda_r$, belonging to the closed interval $[0, 1]$, a priori. The closer the reference slip value $\lambda_r$ is to one, the closer the wheel is to the lock condition. We assume, without loss of generality, that $\lambda_r = 1$.
3
Design of Traction Dynamics Wheel Lock Attack Policy
In this section, we present a wheel lock closed-loop attack policy that can be executed against the traction dynamics of vehicles and various wheeled mobile robots (see Fig. 1). Our closed-loop attack policy merely relies on a feedback control action. In contrast to the previous line of work in [34,35], no additional feedforward control action computation is required in this proposed attack policy. The attack input is designed based on a time-varying feedback control framework with prescribed convergence in finite time [44]. The proposed attack can induce wheel lock conditions even if the wheel-ground interaction characteristics and other relevant parameters in the vehicle traction dynamics are not known. Following the control design framework in [44], we consider the mapping $\mu_K : t \mapsto \mu_K(t)$, where
\[ \mu_K(t - t_0) = \frac{T^{1+m_0}}{\left(T + t_0 - w\!\left(\frac{t - t_0}{T}\right)\right)^{1+m_0}}, \quad t \in [t_0, t_0 + T). \tag{9} \]
In (9), $m_0$ is a positive integer and the real numbers $t_0$ and $T$ are non-negative and positive, respectively. Furthermore, the function $w : \tau \mapsto w(\tau)$ is any smooth and monotonically increasing function such that $w(0) = 0$ and $w(1) = t_0 + T$. In the context of the wheel lock attack policy, $t_0$ in (9) represents the time of the onset of the attack. Furthermore, $T$ is the finite settling time associated with the attack, by which the wheel longitudinal slip will converge to the adversary's desired slip value. The time-warping function $w(\cdot)$ controls the transient convergence behavior of the states of the traction dynamics to the wheel lockup manifold $W_b^L$ given by (8). In its simplest form, the monotonic function $w(\cdot)$ can be chosen to be $w(\tau) = \tau$ as in [44]. As shown later, we choose this function to be a Bézier polynomial of order two. Finally, we define the wheel lockup error as
\[ e_L = \lambda - 1. \tag{10} \]
Hence, if the wheel lockup error satisfies $e_L = 0$ while the wheeled mobile platform speed is positive, i.e., $v > 0$, the wheel is in a locked state. Using the closed-loop attack policy in this paper, the wheel will be locked in finite time. Therefore, the wheeled robot speed will satisfy
\[ v \in [v_{\min}, v_{\max}], \tag{11} \]
during a successful attack, for some positive $v_{\min}$ and $v_{\max}$.
Closed-Loop Attack Policy for Inducing Wheel Lock in Finite Time. Consider the wheeled mobile platform traction dynamics in (6). We propose using the following wheel lock attack policy
\[ \Upsilon_a = \frac{v}{g_\alpha}\, u^a_{np}(e_L, t), \tag{12} \]
where
\[ u^a_{np}(e_L, t) = -\left(k_0 + \frac{1 + m_0}{T}\right) \mu_K(t)\, e_L. \tag{13} \]
In (13), the time-varying feedback gain function $\mu_K(\cdot)$ is given by (9). Furthermore, the time-warping function $w(\cdot)$ used in (9) is given by the second-order Bézier polynomial
\[ w(\tau) = p_1 + (1 - \tau)^2 (p_0 - p_1) + \tau^2 (p_2 - p_1), \tag{14} \]
where $p_0 = 0$ and $p_2 = t_0 + T$ are constant parameters. Furthermore, the constant parameter $p_1$ is chosen at the adversary's discretion to control the transient behavior of the traction dynamics state trajectories during the wheel lock attack. It can be shown that the wheel slip error dynamics take the form
\[ \dot{e}_L = u^a_{np}(e_L, t) + \Delta_e(t, e_L), \tag{15} \]
where $\Delta_e(t, e_L)$ denotes the lumped disturbance that lumps the effect of all unknown parameters, unknown disturbances such as $\Delta_v(\cdot)$ and $\Delta_w(\cdot)$, and the unknown wheel-ground friction coefficient function $\mu(\cdot)$ in the traction dynamics given by (6). The time-varying feedback control input $u^a_{np}(e_L, t)$, which is adopted from [44], ensures the rejection of these unknown disturbances and convergence to the lockup manifold $W_b^L$ given by (8) by time $t_0 + T$, i.e., within the settling time $T$ after the onset of the attack at time $t_0$. Indeed, it is because of the superior disturbance rejection capabilities of the time-varying feedback input $u^a_{np}(e_L, t)$ that there is no need for additional real-time computations of feedforward disturbance compensation terms as in [34,35]. In other words, the time-varying feedback control input $u^a_{np}(e_L, t)$ removes the need for additional estimation computations.
In the proposed attack policy in (12), we assume that the adversary knows and/or can estimate the wheeled mobile platform speed as well as the longitudinal slip of the attacked wheel. Clark et al. [4] and Lacava et al. [24] enumerate several ways through which the firmware/OS on the microprocessor of the robotic devices can be infiltrated and exploited later for performing attacks on the actuation system of the robot. Furthermore, as demonstrated in experimental wireless attacks against Tesla electric vehicles [36], reprogramming the firmware of ECUs through the Unified Diagnostic Services (UDS) enables the adversary to read live data, such as speed or engine rpm, from the in-vehicle network. Finally, as demonstrated by Palanca et al. in [38], it is possible to craft an inexpensive attacking device consisting of an SAE J1962 male connector, a Microchip MCP2551 E/P, and an Arduino Uno Rev 3, which can be powered by a simple 12 V battery. This device, which was experimentally tested on a 2012 Alfa Romeo Giulietta, could be physically plugged into the OBD-II port of the target vehicle and access the various ECUs in the vehicle through the CAN bus (see Fig. 3).
Fig. 3. The Inexpensive Attacking Device Proposed by Palanca et al. [38] for Accessing the CAN Bus through the OBD-II Port.
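The attack policy of this section, i.e., the gain in (9), the feedback term in (13), the Bézier time-warping function in (14), and the attack input in (12), can be sketched in Python as follows. This is a minimal illustration rather than the authors' implementation: the container `AttackParams` and all function names are hypothetical, and the default `k0 = 1.0` is an assumed value because the text does not state one. The formulas are transcribed as written, so the gain starts at exactly one only when `t0 = 0`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AttackParams:
    """Hypothetical parameter container for the policy in (9) and (12)-(14)."""
    t0: float = 0.0   # attack onset time
    T: float = 2.5    # prescribed settling time
    m0: int = 1       # positive integer exponent in (9)
    p1: float = 2.38  # middle Bezier control point in (14)
    k0: float = 1.0   # feedback gain in (13); ASSUMED value, not given in the text


def bezier_warp(tau, p0, p1, p2):
    """Second-order Bezier time-warping function w(tau) in (14)."""
    return p1 + (1.0 - tau) ** 2 * (p0 - p1) + tau ** 2 * (p2 - p1)


def gain_mu_k(t, p):
    """Time-varying gain (9), defined for t in [t0, t0 + T)."""
    if not (p.t0 <= t < p.t0 + p.T):
        raise ValueError("the gain is only defined on [t0, t0 + T)")
    w = bezier_warp((t - p.t0) / p.T, 0.0, p.p1, p.t0 + p.T)
    return (p.T / (p.T + p.t0 - w)) ** (1 + p.m0)


def feedback_u(e_lock, t, p):
    """Feedback term (13) acting on the wheel lockup error e_L = lambda - 1."""
    return -(p.k0 + (1 + p.m0) / p.T) * gain_mu_k(t, p) * e_lock


def attack_input(v, slip, t, g_alpha, p):
    """Dimensionless attack input Upsilon_a in (12)."""
    return (v / g_alpha) * feedback_u(slip - 1.0, t, p)


if __name__ == "__main__":
    params = AttackParams()
    for t in (0.0, 1.0, 2.0, 2.4):
        print(f"t = {t:.1f} s, gain = {gain_mu_k(t, params):.2f}")
```

The gain grows without bound as `t` approaches `t0 + T`, which is what forces the lockup error to zero by the prescribed time; any numerical use therefore has to stop slightly before `t0 + T`.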
4
Simulation Results
In this section, we first present numerical simulation results associated with the wheel lock attack policy in (12) using various wheel-ground interaction conditions. Next, we present numerical simulation results demonstrating the impact of the presented wheel lock attack policy on the overall stability of the motion of a 4-wheeled vehicle. In the wheel lock attack numerical simulations, we consider four different wheel-ground interaction conditions, namely, interaction with dry asphalt, wet asphalt, dry cobblestone, and wet cobblestone. The nonlinear friction coefficient function is modeled using the three-parameter Burckhardt model in (4), with the associated parameters taken from [8]. In the simulations, the adversary has no knowledge of the nonlinear friction coefficient function, as is evident from the closed-loop attack policy given by (12). The wheeled vehicle parameters are taken from [6]. The parameters of the wheel lock attack policy in (13) are chosen to be $m_0 = 1$, $T = 2.5$, $t_0 = 0$, and $p_1 = 2.38$.
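A closed-loop run of this kind can be approximated with a short self-contained Python sketch. It is not a reproduction of the paper's simulations: the vehicle parameters (`mass`, `radius`, `inertia`), the initial speed, the gain `k0`, and the dry-asphalt Burckhardt coefficients are all assumed illustrative values, the ground is taken to be flat, and the disturbances $\Delta_v$ and $\Delta_w$ are set to zero. Because the gain in (9) grows without bound, the slip error is advanced with an exact exponential step of the linear error dynamics (a numerical choice of this sketch, not of the authors), and the run stops shortly before the settling time.

```python
import math

G = 9.81  # m/s^2; flat ground assumed, so g_alpha equals g


def burckhardt_mu(slip, c1=1.2801, c2=23.99, c3=0.52):
    """Model (4) with ASSUMED illustrative dry-asphalt coefficients."""
    return c1 * (1.0 - math.exp(-c2 * slip)) - c3 * slip


def simulate(v0=30.0, mass=400.0, radius=0.3, inertia=1.0,
             m0=1, T=2.5, p1=2.38, k0=1.0, dt=1e-4, stop_fraction=0.999):
    """Integrate (6) under the policy (12)-(14) with t0 = 0.

    All vehicle parameters and k0 are hypothetical. Returns a list of
    (time, speed, slip) samples.
    """
    nu = mass * radius ** 2 / inertia        # dimensionless ratio in (6b)
    t, v, lam = 0.0, v0, 0.0
    history = [(t, v, lam)]
    while t < stop_fraction * T:
        tau = t / T
        w = p1 + (1.0 - tau) ** 2 * (0.0 - p1) + tau ** 2 * (T - p1)  # (14)
        gain = (k0 + (1 + m0) / T) * (T / (T - w)) ** (1 + m0)        # (9), (13)
        mu = burckhardt_mu(lam)
        # Lumped term of (15) with zero disturbances, from (6b) and (12).
        delta_e = (G / v) * (lam - 1.0 - nu) * mu
        # Exact step of e' = -gain * e + delta_e with both held constant.
        decay = math.exp(-gain * dt)
        e_lock = (lam - 1.0) * decay + (delta_e / gain) * (1.0 - decay)
        lam = 1.0 + e_lock
        v += dt * (-G * mu)                  # (6a) with zero disturbance
        t += dt
        history.append((t, v, lam))
    return history


if __name__ == "__main__":
    t_end, v_end, lam_end = simulate()[-1]
    print(f"t = {t_end:.3f} s, v = {v_end:.2f} m/s, slip = {lam_end:.6f}")
```

With these assumed values the slip stays small while the gain is low and then approaches one as the gain grows, while the platform is still moving; the exact trajectory depends on the hypothetical parameters and should not be compared quantitatively with Fig. 4.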
Fig. 4. Time Profiles of the Simulation Results: (Top) Speed Time Profile on Various Ground Conditions; (Middle) Wheel Slip Time Profile on Various Ground Conditions; and (Bottom) State Space Trajectories of the Traction Dynamics in (6). In All Four Scenarios, Finite-Time Convergence to the Wheel Lockup Manifold $W_b^L := \{(v, \lambda) \mid v > 0,\ \lambda = 1\}$ Takes Place without the Need for Estimating the Lumped Disturbance Time Profiles $\Delta_e(t, e_L)$.
Figure 4 presents the speed, wheel slip, and traction dynamics state space trajectories from the simulations. As can be seen from the figure, the time-varying feedback-based attack policy manages to induce wheel lock conditions in all four scenarios. Figure 5 depicts the time profile of the lumped disturbance $\Delta_e(t, e_L)$ associated with the wheel lock attack numerical simulations. Although this disturbance is nonzero and time-varying, the closed-loop attack policy given by (12) manages to reject its effect on the wheel slip tracking dynamics without the need for additional computations to estimate this unknown lumped disturbance term.
Fig. 5. The Lumped Disturbance $\Delta_e(t, e_L)$ Time Profile Associated with the Wheel Lock Attack Numerical Simulations. The Lumped Disturbance Captures the Effect of All Unknown Parameters, Unknown Disturbances such as $\Delta_v(\cdot)$, $\Delta_w(\cdot)$, and the Unknown Wheel-Ground Friction Coefficient Function $\mu(\cdot)$ in the Traction Dynamics given by (6) and Manifests itself in the Tracking Error Dynamics in (15). The Time-Varying Feedback Control Input $u^a_{np}(e_L, t)$ given by (13) Guarantees the Convergence of Trajectories of the Traction Dynamics without the Need for Estimation of $\Delta_e(t, e_L)$.
To study the effect of the proposed closed-loop attack policy on the overall motion and stability of mobile platforms (including mobile robots and autonomous vehicles), one needs to study the attack impact on an individual basis. For instance, a wheel lock attack on a 3W mobile robot [1] or a 4-wheeled vehicle [49,50] might result in lateral motion instability. The same attack on a Segway robot [16] might result in loss of balance. In this paper, we study the overall impact of the wheel lock attacks by using the dynamical model developed by Yi, Tseng, and collaborators (see, e.g., [49,50]). The model in [49,50], which is based on a hybrid physical/dynamic tire/road friction model, captures the coupling effect between longitudinal and lateral vehicle motions. As demonstrated by Fig. 6, after the wheel lock attack policy in (12) is executed on the front wheels of the vehicle interacting with dry asphalt, wet asphalt, dry cobblestone, and wet cobblestone, the vehicle loses its lateral stability in all four scenarios.
Fig. 6. The Overall Impact of the Wheel Lock Attacks on the Stability of a 4-Wheeled Vehicle Modeled using the Approach by Yi, Tseng, and Collaborators [49, 50].
5
Concluding Remarks and Future Research Directions
In this paper, the capabilities of an adversary who can directly manipulate the traction dynamics of wheeled mobile robots and autonomous vehicles were investigated. It was assumed that the adversary has very limited knowledge of the physical parameters of the traction dynamics. Using a class of time-varying feedback control inputs with prescribed finite-time convergence, this paper showed that the adversary can exploit this class of attack policies against the traction dynamics to induce wheel lock conditions. Simulation results using various tire-ground interaction conditions demonstrated the effectiveness of the proposed wheel lock attack policy.
Acknowledgments. This work is supported by NSF Award CNS-2035770 (Division of Computer and Network Systems).
References
1. Ataei, M., Khajepour, A., Jeon, S.: Reconfigurable integrated stability control for four- and three-wheeled urban vehicles with flexible combinations of actuation systems. IEEE/ASME Trans. Mechatron. 23(5), 2031–2041 (2018)
2. Balsa-Comerón, J., Guerrero-Higueras, Á.M., Rodríguez-Lera, F.J., Fernández-Llamas, C., Matellán-Olivera, V.: Cybersecurity in autonomous systems: hardening ROS using encrypted communications and semantic rules. In: Ollero, A., Sanfeliu, A., Montano, L., Lau, N., Cardeira, C. (eds.) ROBOT 2017. AISC, vol. 694, pp. 67–78. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-70836-2_6
3. Chong, M.S., Sandberg, H., Teixeira, A.M.: A tutorial introduction to security and privacy for cyber-physical systems. In: 2019 18th European Control Conference (ECC), pp. 968–978. IEEE (2019)
4. Clark, G.W., Doran, M.V., Andel, T.R.: Cybersecurity issues in robotics. In: 2017 IEEE Conference on Cognitive and Computational Aspects of Situation Management (CogSIMA), pp. 1–5. IEEE (2017)
5. De Castro, R., Araujo, R., Freitas, D.: Optimal linear parameterization for on-line estimation of tire-road friction. IFAC Proc. 1, 8409–8414 (2011)
6. De Castro, R., Araújo, R.E., Tanelli, M., Savaresi, S.M., Freitas, D.: Torque blending and wheel slip control in EVs with in-wheel motors. Veh. Syst. Dyn. 50(sup1), 71–94 (2012)
7. Canudas de Wit, C., Horowitz, R., Tsiotras, P.: Model-based observers for tire/road contact friction prediction. In: Nijmeijer, H., Fossen, T. (eds.) New Directions in Nonlinear Observer Design. Lecture Notes in Control and Information Sciences, vol. 244, pp. 23–42. Springer, London (1999). https://doi.org/10.1007/BFb0109919
8. Dousti, M., Baslamısli, S.C., Onder, E.T., Solmaz, S.: Design of a multiple-model switching controller for ABS braking dynamics. Trans. Inst. Measur. Control 37(5), 582–595 (2015)
9. ElHussini, H., Assi, C., Moussa, B., Atallah, R., Ghrayeb, A.: A tale of two entities: contextualizing the security of electric vehicle charging stations on the power grid. ACM Trans. Internet Things 2(2), 1–21 (2021)
10. Elshenawy, M., Abdulhai, B., El-Darieby, M.: Towards a service-oriented cyberphysical systems of systems for smart city mobility applications. Future Gener. Comput. Syst. 79, 575–587 (2018)
11. Fröschle, S., Stühring, A.: Analyzing the capabilities of the CAN attacker. In: Foley, S.N., Gollmann, D., Snekkenes, E. (eds.) ESORICS 2017. LNCS, vol. 10492, pp. 464–482. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-66402-6_27
12. Giraldo, J., et al.: A survey of physics-based attack detection in cyber-physical systems. ACM Comput. Surv. (CSUR) 51(4), 1–36 (2018)
13. Hodge, C., Hauck, K., Gupta, S., Bennett, J.C.: Vehicle cybersecurity threats and mitigation approaches. Technical report, National Renewable Energy Lab. (NREL), Golden, CO (United States) (2019)
14. Huang, K., Zhou, C., Tian, Y.-C., Yang, S., Qin, Y.: Assessing the physical impact of cyberattacks on industrial cyber-physical systems. IEEE Trans. Ind. Electron. 65(10), 8153–8162 (2018)
15. Iagnemma, K., Dubowsky, S.: Traction control of wheeled robotic vehicles in rough terrain with application to planetary rovers. Int. J. Robot. Res. 23(10–11), 1029–1040 (2004)
16. Jones, D.R., Stol, K.A.: Modelling and stability control of two-wheeled robots in low-traction environments. In: Australasian Conference on Robotics and Automation, Brisbane, Australia (2010)
17. Kang, L., Shen, H.: Attack detection and mitigation for sensor and CAN bus attacks in vehicle anti-lock braking systems. In: 2020 29th International Conference on Computer Communications and Networks (ICCCN), pp. 1–9. IEEE (2020)
18. Kang, L., Shen, H.: Detection and mitigation of sensor and CAN bus attacks in vehicle anti-lock braking systems. ACM Trans. Cyber-Phys. Syst. (TCPS) 6(1), 1–24 (2022)
19. Kaplan, S., Garrick, B.J.: On the quantitative definition of risk. Risk Anal. 1(1), 11–27 (1981)
20. Kim, J., Kim, S., Ju, C., Il Son, H.: Unmanned aerial vehicles in agriculture: a review of perspective of platform, control, and applications. IEEE Access 7, 105100–105115 (2019)
21. Kim, K., Kim, J.S., Jeong, S., Park, J.-H., Kim, H.K.: Cybersecurity for autonomous vehicles: review of attacks and defense. Comput. Secur. 103, 102150 (2021)
22. Koscher, K., et al.: Experimental security analysis of a modern automobile. In: The Ethics of Information Technologies, pp. 119–134. Routledge (2020)
23. Kshetri, N., Voas, J.: Hacking power grids: a current problem. Computer 50(12), 91–95 (2017)
24. Lacava, G., et al.: Cybsersecurity issues in robotics. J. Wirel. Mob. Networks Ubiquitous Comput. Dependable Appl. 12(3), 1–28 (2021)
25. Lee, S., Min, B.-C.: Distributed direction of arrival estimation-aided cyberattack detection in networked multi-robot systems. In: 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 1–9. IEEE (2018)
26. Li, S., Yang, J., Chen, W.-H., Chen, X.: Disturbance Observer-Based Control: Methods and Applications. CRC Press, Boca Raton (2014)
27. Li, W., Zhu, X., Ji, J.: Hierarchical braking torque control of in-wheel-motor-driven electric vehicles over CAN. IEEE Access 6, 65189–65198 (2018)
28. Liu, J., et al.: Secure autonomous cyber-physical systems through verifiable information flow control. In: Proceedings of the 2018 Workshop on Cyber-Physical Systems Security and PrivaCy, pp. 48–59 (2018)
29. Lu, N., Cheng, N., Zhang, N., Shen, X., Mark, J.W.: Connected vehicles: solutions and challenges. IEEE Internet Things J. 1(4), 289–299 (2014)
30. Miller, C.: Lessons learned from hacking a car. IEEE Des. Test 36(6), 7–9 (2019)
31. Miller, C., Valasek, C.: Adventures in automotive networks and control units. Def Con 21(260–264), 15–31 (2013)
32. Miller, C., Valasek, C.: Remote exploitation of an unaltered passenger vehicle. Black Hat USA 2015(S 91), (2015)
33. Mohammadi, A., Malik, H.: Vehicle lateral motion stability under wheel lockup attacks. In: Workshop on Automotive and Autonomous Vehicle Security (AutoSec) 2022, San Diego, CA (2022). https://doi.org/10.14722/autosec.2022.23010
34. Mohammadi, A., Malik, H., Abbaszadeh, M.: Generation of CAN-based wheel lockup attacks on the dynamics of vehicle traction. In: Workshop on Automotive and Autonomous Vehicle Security (AutoSec) 2022, San Diego, CA (2022). https://doi.org/10.14722/autosec.2022.23025
35. Mohammadi, A., Malik, H., Abbaszadeh, M.: Generation of wheel lockup attacks on nonlinear dynamics of vehicle traction. In: 2022 American Control Conference (ACC), pp. 1994–1999 (2022)
36. Nie, S., Liu, L., Yuefeng, D.: Free-fall: Hacking Tesla from wireless to CAN bus. Briefing, Black Hat USA 25, 1–16 (2017)
37. Olson, B.J., Shaw, S.W., Stépán, G.: Nonlinear dynamics of vehicle traction. Veh. Syst. Dyn. 40(6), 377–399 (2003)
38. Palanca, A., Evenchick, E., Maggi, F., Zanero, S.: A stealth, selective, link-layer denial-of-service attack against automotive networks. In: Polychronakis, M., Meier, M. (eds.) DIMVA 2017. LNCS, vol. 10327, pp. 185–206. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-60876-1_9
39. Petnga, L., Xu, H.: Security of unmanned aerial vehicles: dynamic state estimation under cyber-physical attacks. In: 2016 International Conference on Unmanned Aircraft Systems (ICUAS), pp. 811–819. IEEE (2016)
40. Pollicino, F., Stabili, D., Bella, G., Marchetti, M.: SixPack: abusing ABS to avoid misbehavior detection in VANETs. In: 2021 IEEE 93rd Vehicular Technology Conference, pp. 1–6. IEEE (2021)
41. Salamh, F.E., Karabiyik, U., Rogers, M.K., Matson, E.T.: A comparative UAV forensic analysis: Static and live digital evidence traceability challenges. Drones 5(2), 42 (2021)
42. Sánchez-Torres, J.D., Sanchez, E.N., Loukianov, A.G.: Predefined-time stability of dynamical systems with sliding modes. In: 2015 American Control Conference (ACC), pp. 5842–5846 (2015)
43. Shoukry, Y., Martin, P., Tabuada, P., Srivastava, M.: Non-invasive spoofing attacks for anti-lock braking systems. In: Bertoni, G., Coron, J.-S. (eds.) CHES 2013. LNCS, vol. 8086, pp. 55–72. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-40349-1_4
44. Song, Y., Wang, Y., Krstic, M.: Time-varying feedback for stabilization in prescribed finite time. Int. J. Robust Nonlin. Contr. 29(3), 618–633 (2019)
45. Stonier, D., Cho, S.-H., Choi, S.-L., Kuppuswamy, N.S., Kim, J.-H.: Nonlinear slip dynamics for an omniwheel mobile robot platform. In: Proceedings 2007 IEEE International Conference on Robotics and Automation, pp. 2367–2372. IEEE (2007)
46. Teixeira, A., Sou, K.C., Sandberg, H., Johansson, K.H.: Secure control systems: a quantitative risk management approach. IEEE Control Syst. Mag. 35(1), 24–45 (2015)
47. Teixeira, A.M.H.: Optimal stealthy attacks on actuators for strictly proper systems. In: 2019 IEEE 58th Conference on Decision and Control (CDC), pp. 4385–4390. IEEE (2019)
48. Tian, Y., Sidek, N., Sarkar, N.: Modeling and control of a nonholonomic wheeled mobile robot with wheel slip dynamics. In: 2009 IEEE Symposium on Computational Intelligence in Control and Automation, pp. 7–14. IEEE (2009)
49. Yi, J., Li, J., Jianbo, L., Liu, Z.: On the stability and agility of aggressive vehicle maneuvers: a pendulum-turn maneuver example. IEEE Trans. Contr. Syst. Technol. 20(3), 663–676 (2011)
50. Yi, J., Tseng, E.H.: Nonlinear stability analysis of vehicle lateral motion with a hybrid physical/dynamic tire/road friction model. In: Dynamic Systems and Control Conference (DSCD), vol. 48920, pp. 509–516 (2009)
SCAHunter: Scalable Threat Hunting Through Decentralized Hierarchical Monitoring Agent Architecture
Mohiuddin Ahmed1(B), Jinpeng Wei1, and Ehab Al-Shaer2
1 University of North Carolina at Charlotte, Charlotte, NC, USA {mahmed27,jwei8}@uncc.edu
2 Carnegie Mellon University, Pittsburgh, PA, USA [email protected]
Abstract. This paper presents a scalable, dynamic, flexible, and nonintrusive monitoring architecture for threat hunting. The agent architecture detects attack techniques at the agent level, classifies composite and primitive events, and disseminates seen attack techniques or subscribed event information to the upper-level agent or manager. The proposed solution offers improvement over existing approaches for threat hunting by supporting hierarchical event filtering-based monitoring, which improves monitoring scalability. It reduces memory requirement and communication overhead while maintaining the same accuracy of threat hunting in state-of-the-art centralized approaches. We provide a distributed hierarchical agent architecture and an approximation algorithm for near-optimal agent hierarchy generation. We also evaluated the proposed system across three simulated attack use cases built using the MITRE ATT&CK framework and DARPA OpTC attack dataset. The evaluation shows that our proposed approach reduces communication overhead by 43% to 64% and memory usage by 45% to 60% compared with centralized threat hunting approaches.
Keywords: Threat Hunting · Intrusion Detection · Hierarchical Event Monitoring
1
Introduction
In recent years, there has been an increase in cyber attacks including advanced persistence threats (APTs) and ransomware [8], and the techniques used by the attacker have reached an unprecedented sophistication [7]. According to Sophos threat report [7], APT and ransomware attacks increased from 37% in 2020 to 78% in 2021. These attacks evade signature-based intrusion detection systems by exploiting the zero-day vulnerability, whitelisted applications and threat emulation tools (Metasploit, Cobalt Strike, Mimikatz). They use a low and slow approach to avoid triggering anomaly detection while working on the attack goals such as exfiltration and encryption. Due to the diverse and sprawling nature of organizational network, and time-consuming nature of attack investigation, c The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 K. Arai (Ed.): SAI 2023, LNNS 739, pp. 1282–1307, 2023. https://doi.org/10.1007/978-3-031-37963-5_88
SCAHunter
1283
attackers can dwell in the system for extended periods. Mandiant reports that the global average dwell time of the adversary is 24 days [6]. The damage incurred by the adversary on an organization increases exponentially with increasing dwell time. According to the IBM security threat report [4], data breach damage from ransomware attacks increased from $3.86 million in 2020 to $4.24 million in 2021, and the time to identify and contain the data breach is, on average, 287 days. The high threat detection time indicates that traditional IDS does not make the breakthrough in real-time threat hunting. Considering the shift in threat actors and unprecedented sophistication in adversary activities, the organization deploys Endpoint Detection and Response (EDR) solutions and System Information and Event Management (SIEM) solutions to record, monitor continuously, and analyze low-level system logs in endhost devices. The EDR solutions detect threats by matching low-level system events against a knowledge base of adversarial TTPs (Tactics, Techniques, and procedures). MITRE ATT&CK framework [1] provides a knowledge base of TTPs developed by domain experts by analyzing real-world APTs. The SIEMs collect low-level system logs or alerts through a collector, sensor, or EDR agent installed in the end-host devices to the manager (central server). A threat hunter uses SIEM to analyze and correlate collected logs to detect adversary activities during the threat hunting process proactively. While such centralized event correlation facilitates causality analysis of attacker activities, it presents the following challenges in threat hunting at a large-scale distributed system: – On-demand monitoring: Existing researches [17,21,25] try to monitor everything to give data visibility as much as possible, which is not necessary for detecting a TTP. 
For example, while an EDR solution tries to detect PowerShell execution of malware, it is not necessary to monitor other data sources (registries, processes, file operations); instead, monitoring the PowerShell command is enough to detect PowerShell execution TTPs [1]. – Event storage and communication overhead: The centralized threat hunting process continuously collects monitored logs to the central server, which incurs high memory usage and communication overhead to transfer events to the central server. This approach introduces scalability issues on the monitored network. – Efficient event correlation: To detect attacker TTP, existing solutions and research [20] use single event matching on the end-host devices and generate alerts. Unfortunately, such a single event matching approach generates many false alerts, causing the alert fatigue problem [19] in threat hunting. For example, adversary and benign users can use Windows command shell execution TTP to execute an executable on the system. Detection based on the single event matching will generate many false alerts. Recent works on threat hunting use causality analysis [17,21,24,26], cyber threat intelligence [25], and MITRE ATT&CK technique detector [25] to reduce mean-time-to-know during the post-breach threat hunting process. These causality analysis approaches incrementally parse low-level audit logs generated by system-level logging tools (e.g., Sysmon, Event tracing for Windows, and auditd)
1284
M. Ahmed et al.
into causal graphs (provenance graphs). The causal graph encodes the dependency between subjects (processes, threads) and objects (files, registries, sockets) to provide the historical context the threat hunter needs to correlate attacker activities and understand alerts. In [5,26], the authors developed detectors or rules for attack techniques and mapped each detector/rule to the MITRE technique. According to a recent survey about EDR solutions by Gartner, all top 10 EDR solutions use MITRE ATT&CK framework to detect adversary behaviors [3]. Such causality analysis is promising for network-wide alert correlation and cyber intelligence. However, the performance of causality analysis is a limiting factor for real-time threat hunting because of the significant graph construction time ranging from hours to days [21] and the large size of the audit logs (terabytes of logs generated within week [21]). To improve the graph construction time, prior works [24,26] applied different graph reduction and compression techniques. In [21], the author reduces memory usage during causality analysis by storing the most recent part of the causal graph on the main memory and the unused casual graph on disk. In [17], the authors generate a host-specific causal graph and network-specific causal graph and perform multi-host analysis only if any host-based sub-graph crosses a predefined risk score. Though graph reduction and compression and hierarchical storage reduce memory usage to store monitored events, the causality analysis on the compressed graph has the following limitations. Prior works perform centralized analysis and want to give data visibility as much as possible. Thus, they try to monitor all data sources in the end-host devices, which raises the issue of monitoring scalability and communication overhead (collecting logs on a central server) in threat hunting. Provenance graph expansion for multi-host analysis in Holmes [26] and Steinerlog [17] creates dependency expansion. 
Graph alignment across multiple hosts [25] increases the threat detection time exponentially as the number of event logs and the network size grow. To overcome the limitations of existing tools and research works (monitoring everything, memory requirements, communication overhead, and many false alerts), we propose a monitoring architecture (Fig. 1) that uses hierarchical event filtering to reduce monitoring load and communication overhead and to provide efficient event correlation that can potentially reduce false alerts. Adversary activities follow precedence, meaning that most attack techniques have preconditions [11], which are other attack techniques. For example, without performing an initial compromise or execution technique, an attacker cannot perform a discovery or command and control technique. Similarly, an attacker cannot perform a collection or exfiltration technique without performing a discovery technique. Following this attack technique association, SCAHunter provides on-demand monitoring of the data sources corresponding to the monitored attack signature. We provide on-demand monitoring by instrumenting ETW (Event Tracing for Windows) through the ETW API so that the lower-level agents only log signature-specific events. Similarly, auditd for Linux and Endpoint Security for macOS can be used to provide signature-specific on-demand monitoring.
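The precedence idea above can be illustrated with a minimal sketch. It assumes a hypothetical precondition map keyed by the tactic names used in the examples, and it assumes that a technique becomes worth monitoring once at least one of its preconditions has been detected; neither the map nor the function name comes from SCAHunter itself.

```python
# Hypothetical precondition map (illustrative only): technique -> techniques
# of which at least one must have been observed before it can occur.
PRECONDITIONS = {
    "initial_compromise": set(),
    "execution": {"initial_compromise"},
    "discovery": {"initial_compromise", "execution"},
    "command_and_control": {"initial_compromise", "execution"},
    "collection": {"discovery"},
    "exfiltration": {"discovery"},
}


def active_monitoring_tasks(detected):
    """Return the techniques worth monitoring next.

    A technique is selected if it has not been detected yet and either has
    no preconditions or has at least one precondition already detected.
    """
    return sorted(
        technique
        for technique, pre in PRECONDITIONS.items()
        if technique not in detected and (not pre or pre & detected)
    )


# Nothing detected yet: only the entry stage is monitored.
print(active_monitoring_tasks(set()))
# After the initial compromise, the next stages are activated on demand.
print(active_monitoring_tasks({"initial_compromise"}))
```

Under these assumptions, data sources for collection or exfiltration are not monitored until a discovery technique has been observed, which is the load reduction the on-demand approach aims for.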
SCAHunter
Additionally, events or logs can be correlated on the end-host devices themselves when the events to be correlated are all observed on the same host. If events from several different hosts are required to correlate a monitored signature, a middle-level host close (in terms of hop count) to the corresponding monitored hosts can be used for the correlation task. Event correlation at the intermediate host reduces the memory requirement and communication overhead, since only the correlated events are forwarded to the upper-level agents or the manager. Hierarchical monitoring and distributed correlation thus improve scalability and performance while reducing monitoring intrusiveness. Finally, hierarchical event filtering reduces the number of false alerts generated, which is a problem in current research works. Single-event matching generates many false alarms because single events resemble benign user activities. However, it is highly unlikely that a sequence of adversary TTPs will match benign user activities. For example, execution of a payload through services.exe or sc.exe (T1569.002) [1] can be performed by a benign user; however, remote execution of a payload through sc.exe is highly suspicious behavior. Our proposed hierarchical filtering correlates service creation and remote execution in a middle host, thus generating fewer alerts through distributed event correlation. However, generating the agent hierarchy is an NP-hard problem, which we solve using an approximation algorithm (Algorithm 1) based on the geographical host distribution and the predefined monitoring capacity of agents. Our proposed approach does not monitor every data source; instead, it monitors what is required to detect the current attack stage and adds new monitoring tasks on demand based on the threat hunting progress. Contribution. 
Our first contribution is a distributed hierarchical monitoring agent architecture that optimizes monitoring tasks to reduce resource usage and communication overhead. Our second contribution is an approximation algorithm that generates a near-optimal agent hierarchy, so that event correlation tasks are distributed among the hosts. Our third contribution is an ETW-based agent that monitors signature-specific events so that on-demand monitoring is supported. Our last contribution is a demonstration of the threat hunting process using our proposed agent architecture. We evaluated our proposed architecture using log data generated by running three test scripts provided by Red Canary Atomic Red Team [2], and we created attack signatures for the test scripts following the MITRE ATT&CK technique descriptions during the evaluation. We also evaluated our proposed approach using the DARPA OpTC attack dataset [15]. To compare our approach with existing centralized event monitoring approaches for threat hunting, we also implemented centralized event monitoring using Splunk. This paper is organized as follows: Sect. 2 surveys existing research work on threat hunting and intrusion detection systems, Sect. 3 formalizes signature generation and scalable threat hunting with SCAHunter, Sect. 4 explains SCAHunter by describing each component of the system, the attack signature decomposition algorithm, and an approximation algorithm to generate the near-optimal agent hierarchy used by the agent architecture for publish-subscribe based event monitoring
M. Ahmed et al.
and correlation, and also provides a threat hunting demonstration using the SCAHunter, Sect. 5 provides implementation details and evaluation with simulated attack use cases and OpTC attack dataset, and Sect. 6 summarizes our contributions and future research tasks. The remainder of this paper will use logs, alarms, and events interchangeably. It will also interchangeably use monitoring tasks, composite events, and subscribed events.
2
Related Works
Causality Analysis. Sleuth [24], DeepHunter [28], Nodoze [20], OmegaLog [22] and CoNAN [31] use provenance graph generation and centralized analysis of the aggregated logs for attack detection and investigation. Holmes [26] uses correlation among information flows to detect an APT, which is possible only if the threat hunter aggregates events before performing correlation. Domino [32] combines alerts from different NIDS to detect attacks globally, using a single hierarchy level, i.e., a manager-agent architecture. Kelifa et al. [16] proposed a misbehavior detection mechanism for wireless sensor networks (WSN) based on a clustered architecture in which a cluster head is selected based on static metrics monitored by the monitoring nodes. Collaborative IDS (CIDS) [30] aggregates alerts from lower-level IDS to a manager IDS, and the manager performs graph-based and network-based analysis to detect intrusions. All of these works aggregate logs in a central server and perform the corresponding analysis there, which requires monitoring all events and incurs communication overhead to transfer the generated events to the manager. In Swift [21], the author reduces memory usage during causality analysis by keeping the most recent part of the causal graph in main memory and the unused part of the causal graph on disk. In [17], the authors generate a host-specific causal graph and a network-specific causal graph and perform multi-host analysis only if a host-based sub-graph exceeds a predefined risk score. Although graph reduction, compression, and hierarchical storage reduce the memory needed to store monitored events, causality analysis on the compressed graph has the following limitations. These works try to monitor all data sources on the end-host devices, which raises the issues of monitoring scalability and communication overhead (collecting logs on a central server) in threat hunting. 
Provenance graph expansion for multi-host analysis in Holmes [26] and Steinerlog [17] creates dependency expansion. Graph alignment across multiple hosts [25] increases the threat detection time exponentially as the number of event logs and the network size grow. Event Monitoring. Several centralized and distributed monitoring approaches and tools have been proposed (e.g., [9,10,12–14,23]) in other domains. Although they have various design-specific goals and objectives, they are not scalable for distributed monitoring, lack the flexibility to express monitoring demands, and require monitoring every data source. Those approaches either monitor only network traffic [27], apply only to network fault diagnosis [10,18], or maintain a static agent hierarchy [27]. Though hierarchical monitoring systems exist in other domains (fault detection, malicious sensor node detection), they
are not suitable for threat hunting since they are designed for a specific use case. Adversarial Tactics, Techniques and Procedures. The MITRE ATT&CK framework [1] is a public knowledge base of TTPs consisting of 14 tactics, 188 techniques, and 379 sub-techniques used by APTs. Every MITRE ATT&CK tactic is a high-level attacker goal in a specific kill chain phase. Every MITRE ATT&CK technique consists of one or more procedures the attacker can use to achieve a specific goal, where each procedure is a unique way to achieve the corresponding goal. The framework also lists 129 APT groups and a subset of the publicly reported techniques used by each group. MITRE also specifies the data sources to monitor in order to detect a specific attack technique. Holmes [26] uses MITRE ATT&CK TTPs to build a set of rules and performs rule matching on the collected logs. This approach fails if the initial compromise is not detected, and it uses centralized analysis, which incurs high memory and communication overhead. Additionally, it analyzes alerts after they are generated, whereas SCAHunter reduces the number of alerts generated through hierarchical filtering.
3
Problem Formalization
To formulate scalable threat hunting with a hierarchical agent architecture, we formalize the attack signature and agent hierarchy generation in this section. We formalize event attributes and values, events, event subscription predicates, and event subscription rules (attack signatures). Basic Notation for Events, Attributes, and Subscriptions. In distributed event monitoring systems, event producers (applications collecting event logs from a specific data source) frequently generate events to report aspects of the monitored system state. We represent the events reported by producers as $E = \{e_1, e_2, \ldots, e_n\}$. Each event $e_i$ consists of a set of attributes $a_{i,j}$ with assigned values $v_{i,j}$, where $i$ and $j$ are the event and attribute indices, respectively. For example, an event $e_i$ with $M$ attributes is defined as follows: $e_i = \{(a_{i,1}, v_{i,1}), (a_{i,2}, v_{i,2}), \ldots, (a_{i,M}, v_{i,M})\}$. Event consumers may submit multiple subscriptions $s_i$ to request monitoring and reporting of the occurrence of specific event instances or of a correlation of event instances. The set of consumers' subscriptions can be represented as $S = \{s_1, s_2, \ldots, s_n\}$. Each subscription $s_i$ consists of a logical expression on event attributes. For example, if a CA subscribes with $S_i = \{(pName, \text{"cmd.exe"}) \wedge (isChild, true)\}$, the CA is asking for a subscription to all cmd.exe processes that are children of another process. Formulation of Event Fragmentation. Following the formulation of events and attributes, an occurrence of event $i$ at time $t$ is defined as a conjunction of attribute values as follows:

$$e_i^t = (a_{i,1} = v_{i,1}^t) \wedge (a_{i,2} = v_{i,2}^t) \wedge \cdots \wedge (a_{i,M} = v_{i,M}^t) = \bigwedge_{j=1}^{M} \left(a_{i,j} = v_{i,j}^t\right) \qquad (1)$$
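As a minimal sketch of this notation, an event can be held as a mapping from attribute names to values, and a conjunctive subscription as a mapping of required attribute values. The `pid` attribute and the sample events are made up for illustration; only `pName` and `isChild` appear in the text.

```python
def matches(event, subscription):
    """True if every (attribute, value) pair of the subscription holds for the event."""
    return all(event.get(attr) == value for attr, value in subscription.items())


# Subscription from the example: all cmd.exe processes that are children
# of another process.
subscription = {"pName": "cmd.exe", "isChild": True}

# Hypothetical events; "pid" is an illustrative extra attribute.
events = [
    {"pName": "cmd.exe", "isChild": True, "pid": 4242},
    {"pName": "cmd.exe", "isChild": False, "pid": 4243},
    {"pName": "notepad.exe", "isChild": True, "pid": 4244},
]

matched_pids = [e["pid"] for e in events if matches(e, subscription)]
print(matched_pids)
```

Only the first event satisfies both conjuncts, so it is the only one reported to the subscriber.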
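If we receive a sequence of events of the same event type $c$ within a specified interval $T$, the event history $h_l^c$ of event type $c$ during the interval $T$ can be formally represented as follows:

$$h_l^c = \bigwedge_{l \in L} \; \bigwedge_{t,k \in T} e_{l-1}^{t} \wedge e_{l}^{k}, \quad \text{where } l \neq 1,\ t < k \qquad (2)$$

which can be further formalized using the event attributes and values as follows:

$$h_l^c = \bigwedge_{l \in L} \; \bigwedge_{t,k \in T} \; \bigwedge_{j \in M} \left(a_{l-1,j}^{t} = v_{l-1,j}^{t}\right) \wedge \left(a_{l,j}^{k} = v_{l,j}^{k}\right), \quad \text{where } l \neq 0,\ t < k \qquad (3)$$

Here $L$ is the number of events in the requested event sequence and $M$ is the number of attributes in the event type $c$. For $l = 1$, Eq. 2 and Eq. 3 reduce to the following:

$$h_1^c = \bigwedge_{t \in T} e_1^t \qquad (4)$$

$$h_1^c = \bigwedge_{t \in T} \; \bigwedge_{j \in M} \left(a_{1,j}^{t} = v_{1,j}^{t}\right) \qquad (5)$$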
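Read operationally, Eqs. 2 and 3 amount to an ordered-sequence check over the reporting times. The sketch below assumes each event is reported at most once and that reports are given as a mapping from event name to reporting time; both the function name and this representation are illustrative.

```python
def history_holds(sequence, reports, interval):
    """Check an event history over an interval.

    Every event in `sequence` must have been reported inside `interval`
    (inclusive bounds), and the reporting times must be strictly
    increasing in sequence order.
    """
    start, end = interval
    times = []
    for name in sequence:
        t = reports.get(name)
        if t is None or not (start <= t <= end):
            return False
        times.append(t)
    return all(earlier < later for earlier, later in zip(times, times[1:]))


sequence = ["e1", "e2", "e3"]
print(history_holds(sequence, {"e1": 2, "e2": 6, "e3": 9}, (1, 10)))  # in order
print(history_holds(sequence, {"e1": 2, "e3": 9}, (1, 10)))           # e2 missing
print(history_holds(sequence, {"e1": 6, "e2": 2, "e3": 9}, (1, 10)))  # out of order
```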
For instance, if a consumer requests the type-$c$ event sequence $\{e_1, e_2, e_3\}$ in the time interval $T = [1, 10]$ and the producer reports $e_1$ at $t = 2$, $e_2$ at $t = 6$, and $e_3$ at $t = 9$, then Eqs. 2 and 3 evaluate to true. However, if any of the three events is not reported, or if their reporting times do not follow the order of the sequence, Eqs. 2 and 3 evaluate to false. Formulation of Event Subscription Predicate. An event subscription predicate is a set of logical predicates that matches the attribute values of one or more event instance occurrences. Therefore, we define the event predicate as a user-defined logical expression that is evaluated on the attribute values of the event occurrences. Each predicate evaluates to true if and only if the event attribute values satisfy the predicate's logical expression. For example, the predicate $p_1: (pName, =, \text{"cmd.exe"})$ evaluates to false if the generated event does not correspond to the cmd.exe process, whereas the predicate $p_2: (memAllocated, >, 65)$ evaluates to true if the event corresponds to a memory allocation of more than 65 B. Therefore, we can formally define the event predicate as follows: $p_{ijk} = (a_{ij}, op, v_k)$, where $a_{ij}$ is attribute $j$ of event $i$, $v_k$ is any value of type integer or string that can match the value of the attribute, and $op$ is a logical operator (such as $=$, $\leq$, $\geq$) or a set operator (such as $\supset$, $\subset$, $\subseteq$, etc.). Semantically, if $op$ is $=$, then $p_{ijk} \Leftrightarrow a_{ij} = v_k$. A predicate can also specify a relationship between (the same or different) attributes of the same or different events to detect the occurrence of an event correlation. Formally, a predicate that defines a relationship between attribute $k$ in two different events $i$ and $j$ can be specified as follows: $p_{ijk} = (a_{ik}, op, a_{jk})$, where $a_{ik}$ and $a_{jk}$ are attribute $k$ of events $i$ and $j$, respectively.
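The predicate forms $(a_{ij}, op, v_k)$ and $(a_{ik}, op, a_{jk})$ can be sketched as small evaluators. The operator table, the function names, and the `host` attribute below are assumptions made for illustration; a missing attribute is treated as a predicate that does not hold.

```python
import operator

# Illustrative operator table for the "op" field of a predicate.
OPS = {
    "=": operator.eq,
    "!=": operator.ne,
    "<": operator.lt,
    ">": operator.gt,
    "<=": operator.le,
    ">=": operator.ge,
}


def eval_predicate(event, predicate):
    """Evaluate (attribute, op, value) against one event."""
    attr, op, value = predicate
    return attr in event and OPS[op](event[attr], value)


def eval_relation(event_i, event_j, attr, op):
    """Evaluate (a_ik, op, a_jk): the same attribute compared across two events."""
    if attr not in event_i or attr not in event_j:
        return False
    return OPS[op](event_i[attr], event_j[attr])


p1 = ("pName", "=", "cmd.exe")
p2 = ("memAllocated", ">", 65)

print(eval_predicate({"pName": "powershell.exe"}, p1))  # not cmd.exe
print(eval_predicate({"memAllocated": 128}, p2))        # more than 65 B
# Hypothetical correlation: two events observed on the same host.
print(eval_relation({"host": "ws-01"}, {"host": "ws-01"}, "host", "="))
```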
Formulation of Event Subscription Rule. Following the formulation of event subscription predicates, a user can define an Event Subscription Rule (ESR) as a Boolean expression in conjunctive normal form (CNF) over multiple predicates to match an occurrence of an attack signature (subscription) $S_i$, as follows:

$$ESR(e_1, \ldots, e_n) = \big(p(e_i) \vee \cdots \vee p(e_j)\big) \wedge \big(p(e_i) \vee \cdots \vee p(e_j)\big) \wedge \cdots \wedge \big(p(e_i) \vee \cdots \vee p(e_j)\big) \qquad (6)$$

which can be written concisely as follows:

$$ESR(e_1, \ldots, e_n) = \bigwedge_{i \in \mathbb{N}:\, i \leq n} \; \bigvee_{j \in \mathbb{N}:\, j \leq n} p(e_j) \qquad (7)$$
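A minimal sketch of evaluating an ESR in CNF follows: the rule holds when every clause contains at least one predicate that holds. The representation of a literal as an (event name, predicate) pair, the `isRemote` attribute, and the example rule for service execution are assumptions made for illustration and are not taken from the SCAHunter signatures.

```python
import operator

OPS = {"=": operator.eq, ">": operator.gt}


def holds(event, predicate):
    """Evaluate one (attribute, op, value) predicate against an event."""
    attr, op, value = predicate
    return attr in event and OPS[op](event[attr], value)


def eval_esr(clauses, events):
    """CNF evaluation: a conjunction of clauses, each a disjunction of literals.

    Each literal is (event_name, predicate); `events` maps event names to
    attribute dictionaries. A literal on an unobserved event does not hold.
    """
    return all(
        any(name in events and holds(events[name], pred) for name, pred in clause)
        for clause in clauses
    )


# Hypothetical rule: (e1 is sc.exe OR e1 is services.exe) AND (e2 is remote).
esr = [
    [("e1", ("pName", "=", "sc.exe")), ("e1", ("pName", "=", "services.exe"))],
    [("e2", ("isRemote", "=", True))],
]

remote_case = {"e1": {"pName": "sc.exe"}, "e2": {"isRemote": True}}
local_case = {"e1": {"pName": "sc.exe"}, "e2": {"isRemote": False}}
print(eval_esr(esr, remote_case), eval_esr(esr, local_case))
```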
Therefore, given the attack signature or subscription request $S_i$ as an ESR and a network topology, the goal of the hierarchical monitoring architecture is to generate an optimal agent hierarchy such that communication cost and memory usage are reduced and tasks are distributed among the agents. The optimal agent hierarchy generation problem can be formalized as follows:

$$\text{minimize } \text{agent count}(S_i), \quad \text{minimize } \text{Data Source Count}(S_i), \quad \text{subject to } \text{monitoring cost} < \text{threshold} \qquad (8)$$

To develop the hierarchical monitoring architecture, we have to address the following problems: 1) given the subscription request and network topology, generate the optimal number of middle-level agents while maintaining the monitoring capacity constraints of each agent; 2) given the subscription request, determine the optimal number of data sources to monitor; 3) develop a monitoring agent for each specific data source. The following section describes the agent architecture, optimal agent hierarchy, and data source monitor generation.
4
Distributed Hierarchical Monitoring Agent Architecture Overview
In this section, we describe each component of SCAHunter as shown in Fig. 1. SCAHunter consists of three types of agents: the console agent or manager (CA), the composite event detector agent (CEDA), and the event filtering agent (EFA), together with the data sources to monitor. The user of this agent architecture (a cyber threat hunter) submits subscription requests for a single event, a group of events, or correlated events (i.e., attack signatures) through the CA. The CA decomposes the subscription request based on the formalization provided in Sect. 3. The CA also generates the required number of CEDAs using the decomposed subscription request.
Fig. 1. Distributed Hierarchical Monitoring Agent Architecture.
It determines the appropriate EFAs and the agent hierarchy such that communication overhead is minimal and subscription task monitoring is distributed across the hosts in the network. Since optimal agent hierarchy generation is an NP-hard problem, our CA uses an approximation algorithm to generate a near-optimal agent hierarchy. Then, the CA sends the decomposed event subscription requests to the corresponding CEDAs and EFAs through a dedicated configuration channel. Upon receiving a subscription request from the CA, an EFA starts monitoring the corresponding data sources. Whenever the EFA detects a subscribed event, it publishes the detected event to the upper-level (parent) CEDA through task-specific channels as an alert containing the task id and event details. The CEDA or EFA also replies to the CA through a dedicated configuration channel so that the CA can activate the next monitoring task and deactivate the detected one, which supports on-demand monitoring while detecting a subscription request.
4.1
Console Agent (CA) or Manager
The Console Agent takes an attack signature (ESR) as input from the cyber threat hunter at the beginning of the threat hunting process. This agent's first task is to decompose the subscription request received from the threat hunter using the formalization described in Sect. 3. The CA's second task is to determine how many CEDA levels to generate, how many CEDAs to create, and where to place them, based on the decomposed event subscription request, the data sources to monitor, and the location of the end-host devices. It also needs to determine the EFA(s) to which the event subscription request will be sent. After determining the CEDAs and EFAs, the CA generates the corresponding CEDAs and configures them. Once a decision is available as a reply from a CEDA, it informs the threat hunter about the detected TTPs or attack techniques. Since the number of generated CEDAs affects resource usage and communication overhead within the CA and the network, an appropriate CEDA number, level, and placement are required. We can formulate the determination of the proper number of CEDAs as follows: generate a minimum number of CEDAs that cover all EFAs required to serve the event subscription request. This is a set-cover problem, which is NP-complete. One way to solve this problem is to use heuristics based on each CEDA's or EFA's predefined monitoring capacity and the geographical distribution of end-host devices.
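To illustrate the set-cover view of CEDA placement, the sketch below applies the standard greedy set-cover heuristic, which repeatedly picks the candidate host that serves the most still-uncovered EFAs. This is the textbook heuristic and not the AHG algorithm of this paper; the host and EFA names are hypothetical, and each candidate's set is assumed to already respect that host's monitoring capacity and proximity constraints.

```python
def greedy_cover(efas, candidates):
    """Pick candidate CEDA hosts until all EFAs are covered.

    candidates: dict mapping a host name to the set of EFAs it could serve.
    Ties are broken by host name so the result is deterministic.
    """
    uncovered = set(efas)
    chosen = []
    while uncovered:
        best = max(sorted(candidates), key=lambda h: len(candidates[h] & uncovered))
        gain = candidates[best] & uncovered
        if not gain:
            raise ValueError("remaining EFAs cannot be covered by any candidate")
        chosen.append(best)
        uncovered -= gain
    return chosen


efas = {"efa1", "efa2", "efa3", "efa4", "efa5"}
candidates = {
    "hostA": {"efa1", "efa2", "efa3"},
    "hostB": {"efa3", "efa4"},
    "hostC": {"efa4", "efa5"},
    "hostD": {"efa1"},
}
print(greedy_cover(efas, candidates))
```

In this example two CEDA hosts suffice: hostA covers three EFAs in the first round, and hostC covers the remaining two in the second.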
Specifically, we propose an approximation algorithm (AHG in Algorithm 1) based on these heuristics.
4.2
Composite Event Detector Agent (CEDA)
The composite event detector agent reduces network communication overhead (traffic flow) between the CA and EFAs by replying to the upper-level CEDA or the CA only if a monitoring task or subscribed event is detected. The hierarchical agent architecture can have multiple levels of CEDAs. If the CA creates one level of CEDAs, each CEDA's child is an EFA, and its parent is the CA. On the other hand, if the CA creates two levels of CEDAs, the child of a lower-level CEDA is an EFA, its parent is a higher-level CEDA, and the parent of the higher-level CEDA is the CA.
4.3
Event Filtering Agent (EFA)
The event filtering agent is the lowest level of agent in SCAHunter. It monitors different data sources for the events requested in the received event subscription request. These agents are static: we generate them initially, and they continue to work until they are closed or their subscription requests are deleted. The MITRE ATT&CK framework [1] provides around 38 different data sources to monitor for detecting different attack TTPs. Thus, to detect all attack techniques and sub-techniques provided by the MITRE ATT&CK framework, we have to develop fewer than 38 different event filtering agents. We developed EFAs to monitor the following data sources: ETW trace, Netmon, and Sysmon.
4.4
Agent Communication Protocol
Since every agent in SCAHunter may consume event logs or alerts from multiple other agents or data sources, SCAHunter uses a publish-subscribe communication pattern for group communication among the agents. The CA, CEDA, and EFA agents may work as both producers and consumers. The CA configures CEDA and EFA agents by publishing configuration information to the corresponding agents through the configuration channel. It also consumes alerts from lower-level CEDA or EFA agents through task-specific channels. CEDAs and EFAs publish detected monitoring tasks as alerts to the upper-level CEDA or the CA through task-specific channels. Though the proposed agent architecture employs a hierarchical structure for event monitoring and detection, it is a virtual hierarchy constructed by the group communication of CEDAs and EFAs over publish-subscribe communication protocols. Using publish-subscribe communication protocols for agent communication improves the robustness of the agent hierarchy against agent failures and network partitioning.
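The channel structure described here can be sketched with a tiny in-process broker. This is only an illustration of the publish-subscribe pattern: the `Broker` class, the channel names, the task id, and the alert fields are hypothetical, and a deployment would use a networked message broker rather than direct callbacks.

```python
from collections import defaultdict


class Broker:
    """Minimal in-process publish-subscribe broker (illustrative only)."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, channel, callback):
        self._subscribers[channel].append(callback)

    def publish(self, channel, message):
        for callback in list(self._subscribers[channel]):
            callback(message)


broker = Broker()
ca_inbox = []    # alerts that reach the CA
efa_tasks = []   # configuration received by the EFA
seen = set()     # sub-events the CEDA has observed for task 7


def ceda_on_alert(alert):
    """CEDA: forward one correlated alert upward once both sub-events were seen."""
    seen.add(alert["event"])
    if seen >= {"service_created", "remote_execution"}:
        broker.publish("task/7/ceda", {"task_id": 7, "detected": sorted(seen)})


# Wiring: configuration channel for the EFA, task-specific channels for alerts.
broker.subscribe("config/efa-1", efa_tasks.append)
broker.subscribe("task/7/efa", ceda_on_alert)
broker.subscribe("task/7/ceda", ca_inbox.append)

# The CA configures the EFA; the EFA then publishes the events it detects.
broker.publish("config/efa-1", {"task_id": 7, "watch": ["service_created", "remote_execution"]})
broker.publish("task/7/efa", {"task_id": 7, "event": "service_created"})
broker.publish("task/7/efa", {"task_id": 7, "event": "remote_execution"})
print(ca_inbox)
```

Only the correlated alert travels from the CEDA to the CA; the two raw events stay on the lower-level channel, which is the source of the communication savings.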
Algorithm 1. Agent Hierarchy Generation Algorithm AHG(S, MT, A)
Input: attack signature S, monitoring task list MT, agent hierarchy A
Output: agent hierarchy A
  if size(MT) == 1 then
    return A ∪ {MT}
  end if
  for all pair (m_i, m_j) where m_i ∈ MT, m_j ∈ MT, m_i ≠ m_j do
    score, correlationScore ← correlation between m_i and m_j, (score, m_i, m_j)
  end for
  sort correlationScore based on score
  covered ← ∅, capacity ← ∅, ind ← 0, cluster ← ∅, index ← 0
  for all (score, m_i, m_j) ∈ correlationScore do
    if m_i ∈ covered and m_j ∈ covered then
      continue
    else if m_i ∈ covered then
      index, m_i ← find cluster index containing m_i, None
    else if m_j ∈ covered then
      index, m_j ← find cluster index containing m_j, None
    end if
    if m_i ≠ None and m_j ≠ None then
      cluster_ind, capacity_ind, covered ← m_i ∪ m_j, 2, covered ∪ m_i ∪ m_j
    else if m_i ≠ None and capacity_ind + 1 ...
4.5
Fig. 2. An Overview of Permission Requests Analysis (Similar to the Analysis of [17] for Contact Tracing Apps)
accounted for 40% of the top 10 requested permissions, namely CAMERA (28 apps), WRITE EXTERNAL STORAGE (27 apps), RECORD AUDIO (26 apps), and READ EXTERNAL STORAGE (21 apps). Moreover, when it comes to dangerous permission requests, surveillance camera apps tend to request the following permissions: CAMERA (28 apps), WRITE EXTERNAL STORAGE (27 apps), RECORD AUDIO (26 apps), READ EXTERNAL STORAGE (21 apps), ACCESS FINE LOCATION (17 apps), ACCESS COARSE LOCATION (16 apps), READ PHONE STATE (12 apps), GET ACCOUNTS (9 apps), ACCESS BACKGROUND LOCATION (3 apps), READ CONTACTS (3 apps), CALL PHONE (2 apps), ACTIVITY RECOGNITION (1 app), WRITE CONTACTS (1 app), READ CALENDAR (1 app), WRITE CALENDAR (1 app), USE SIP (1 app), and BODY SENSORS (1 app).
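For illustration, permission requests of this kind can be read from an app manifest as sketched below. The sketch assumes the manifest has already been decoded to plain XML (manifests inside an APK are stored in a binary format), uses a made-up manifest, and checks against a small illustrative subset of dangerous permissions; it is not the tooling used for the analysis reported here.

```python
import xml.etree.ElementTree as ET

# Namespace prefix used by android:name attributes in a decoded manifest.
ANDROID_NS = "{http://schemas.android.com/apk/res/android}"

# Illustrative subset of dangerous permissions, not a complete list.
DANGEROUS = {
    "CAMERA",
    "RECORD_AUDIO",
    "READ_EXTERNAL_STORAGE",
    "WRITE_EXTERNAL_STORAGE",
    "ACCESS_FINE_LOCATION",
}


def requested_permissions(manifest_xml):
    """Return the short names of all <uses-permission> entries in a manifest."""
    root = ET.fromstring(manifest_xml)
    names = []
    for node in root.iter("uses-permission"):
        full_name = node.get(ANDROID_NS + "name", "")
        names.append(full_name.rsplit(".", 1)[-1])
    return names


# Made-up manifest for a hypothetical app.
MANIFEST = """<manifest xmlns:android="http://schemas.android.com/apk/res/android"
    package="com.example.camera">
  <uses-permission android:name="android.permission.CAMERA"/>
  <uses-permission android:name="android.permission.INTERNET"/>
  <uses-permission android:name="android.permission.RECORD_AUDIO"/>
</manifest>"""

requested = requested_permissions(MANIFEST)
dangerous = [p for p in requested if p in DANGEROUS]
print(requested, dangerous)
```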
V. Schmitt et al.
Fig. 3. Permissions Found from the Manifest Analysis: The Most Sought after Dangerous Permissions are CAMERA, STORAGE, and RECORD AUDIO Found in 28, 27, and 26 Apps (Similar to the Analysis of [17] for Contact Tracing Apps)
5
PHASE II - Privacy Policy Analysis
In this phase, we explore the compliance of surveillance camera apps with fundamental legal requirements. For this, we rely on the EU GDPR benchmarking conducted in [12] that resulted in the identification of 12 privacy policy principles. The privacy policy of an app is a statement or a legal document that gives information about the ways an app provider collects, uses, discloses, and manages users’ data. By law, data collectors (including app providers) are required to be transparent about their data collection, sharing, and processing practices and specify how they comply with legal principles [12]. Based on keyword- and semantic-based search techniques, a data protection expert went through each privacy policy to analyze the compliance of these apps with regard to the following principles which are summarized and used similarly in [14,17]. Data Collection: The legal foundation is defined in Article 5(1) GDPR, which states the general principles of processing personal data. Also, Art. 6 in the GDPR indicates when processing is lawful, which includes when consent is given by a user of a service or application. Moreover, both articles address the question of when consent is necessary for the performance of a contract or compliance with legal obligations when the vital interests of the user or another natural person
Privacy of Surveillance Camera Apps
need to be protected, and when a task is carried out for the public or legitimate interest pursued by the controller or by a third party. Nevertheless, this applies only if such interests do not conflict with the fundamental rights and freedoms of a user. Advertising, for example, is not classified as a necessary interest and thus needs to be analyzed based on other legal foundations [2,14,17]. Children Protection: Personal data related to children needs to be treated with special attention. Rec. 38 of the GDPR states that children "may be less aware of the risks, consequences, and safeguards concerned and their rights in relation to the processing of personal data". Service providers need to provide information in very clear and comprehensible language so that children are also able to understand it easily (Rec. 58 GDPR). Moreover, the processing of children's data is strictly regulated, and data can only be processed on a lawful basis if the child is at least 16 years old (Art. 8 GDPR). If the child is younger, processing of the child's data is only lawful when a parent or legal guardian has given consent [14,17]. Third-Party Sharing: Third-party tracking is one of the most common approaches to collecting personal information through various apps. It is legally regulated by Art. 13 of the GDPR, which defines that the recipients or categories of recipients of personal information must be declared to the users [14,17]. Third-Country Sharing: The legal requirements for third-country sharing are described in Chapter 5 of the GDPR. Personal data can only be transferred to other countries when a similar level of protection is enforced. This means that the protection of personal data also travels across borders when personal data is transferred to servers outside of the EU. Furthermore, the privacy policy must state its procedures when personal data is shared with countries outside of the EU [14,17]. 
Data Protection: Technical and organizational measures to ensure the appropriate security of personal information must be ensured by the data controller as stated in Art. 32 in the GDPR. Especially in the smartphone ecosystem, this has major implications, as they are usually linked to huge amounts of data transfer. Moreover, the components of data protection are closely interrelated with privacy-by-design principles [7,14,17]. Data Retention: The principle of data minimization and storage limitation is described in Art. 13 (2), and 14 (2) in the GDPR. Hereby, the data controller has the obligation to inform users how long personal data is retained. Especially for “the right to be forgotten” (Art. 17) this is crucial as personal data can only be stored for a limited time [14,17]. User’s Control: Further user rights are defined in Chapter 3 of the GDPR, which contains the right to information and access to personal data; the right to rectification; the right to erasure; the right to restriction of processing; the right to data portability; and the right to object and automated individual
decision-making. Art. 13 (2) and 14 (2) define that service or app providers are required to provide these rights to users to ensure fair and transparent data processing [14,17]. Privacy Policy Changes: Under Art. 12 of the GDPR, app or service providers have the obligation to inform users about privacy policy changes in a transparent and comprehensible way. This should further ensure lawful, fair, and transparent processing of personal information [14,17]. Privacy Breach Notification: Art. 34 of the GDPR defines that if a data breach occurs that might result in a risk to the rights and freedoms of users, the data controller or service provider must inform the users as soon as possible. The information that needs to be provided in the data breach notification is also regulated by this article. Thus, a data breach notification must name the data protection officer and mention the likely consequences of the data breach. Furthermore, it must mention measures to mitigate the effects of the data breach. Moreover, the supervisory authority must be informed no later than 72 h after the detection of the data breach [14,17]. App-Focused: Often, the privacy policy is not formulated exclusively for one application but is shared among multiple services provided by the same data controller or app developer [25]. This principle is incorporated in the principle of lawfulness, fairness, and transparency [14,17]. Purpose Specification: The purposes of data collection must be specified by service providers or data controllers according to Art. 13 (1c) and 14 (1c) of the GDPR. The principle of purpose limitation is relevant to preventing the exploitation of personal data for other use cases. It is closely related to the data collection principle but refers rather to a clear statement and explanation of data collection purposes [14,17]. 
Contact Information: Users have the right to be informed about the identity of service providers and data controllers, which includes the name of the service provider, its legal representation, legal status, and postal address (Art. 13(1a) and 14(1a) of the GDPR). The principle of contact information is closely interrelated with the principle of lawfulness, fairness, and transparency. Providing such information is relevant to give users the option to file a formal complaint [14,17].
5.1
Privacy Policy Completeness Analysis
Figure 4 shows the results of the compliance analysis of the privacy policies of surveillance camera apps. The results show that Ring fulfills the largest number of principles (8 principles). Surprisingly, our findings also revealed that more than one-third of these apps (11 apps) do not fulfill any privacy policy principle, either because they contain very generic text that does not discuss the apps' data collection and sharing practices but rather irrelevant information (such as Honeywell Home or TP-Link Tapo), or because they do not discuss how they comply with legal requirements (such as tinyCam).
Fig. 4. Compliance Analysis of Privacy Policies of Surveillance Camera Apps: Compliance is Shown in Green and Non-Compliance is Shown in Red
5.2
Coverage of Privacy Policy Principles
Figure 5 presents the coverage of privacy policy principles by surveillance camera apps. The results show that data collection and app-focused are the most covered principles (19 apps). We also found that no app fulfilled the privacy breach notification principle, which can be extremely problematic, as data collectors need to ensure that appropriate remedies are in place in case a privacy breach happens that results in a high risk to individuals' fundamental rights and freedoms. This might indicate that poor arrangements are in place if individuals' personal data fall into the wrong hands due to a privacy breach [14,17]. The same also holds for contact information (2 apps), data retention (3 apps), and data protection (3 apps), where surveillance camera apps do not provide transparent information to demonstrate compliance with these principles.
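The two views reported here (principles fulfilled per app, and apps covering each principle) are two aggregations of the same compliance matrix. The sketch below shows both counts on made-up data; the app names, the principle subset, and the entries are purely illustrative and do not reproduce the study's results.

```python
# Illustrative subset of principles and a made-up compliance record per app.
PRINCIPLES = ["data_collection", "app_focused", "data_retention", "breach_notification"]

compliance = {
    "app_a": {"data_collection", "app_focused", "data_retention"},
    "app_b": {"data_collection", "app_focused"},
    "app_c": set(),
}

# Completeness view: how many principles each app fulfills.
per_app = {app: len(met) for app, met in compliance.items()}

# Coverage view: how many apps fulfill each principle.
coverage = {p: sum(p in met for met in compliance.values()) for p in PRINCIPLES}

# Principles that no app fulfills.
uncovered = [p for p, count in coverage.items() if count == 0]

print(per_app)
print(coverage)
print(uncovered)
```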
V. Schmitt et al.
Fig. 5. Privacy Policy Principles Coverage
6 PHASE III - Apps’ Behavior Analysis
In this phase, we installed the surveillance camera apps on an Android device (Samsung Galaxy S6 Tablet) and monitored their behavior, i.e., their permission access patterns. For this, we used the tool proposed in [13,16], which enables run-time permission access analysis of Android apps. While monitoring, we opened each app once to trigger and activate its intended functionality. To make sure that we did not miss any functionality, we deliberately granted permissions whenever the apps asked for them. Afterward, we let the apps run in the background without any further interaction, while the device remained connected to a WiFi network and a power source. This was intentional, as our goal was to inspect whether surveillance camera apps access sensitive resources when there is no legitimate reason to do so. After the one-week experiment, the data generated by the monitoring tool was collected and analyzed. Our objective was to determine what is being accessed by the apps, at what time, and how frequently. Figures 6 and 7 (the numbers corresponding to each app show the frequency of permission accesses) present the permission access patterns per analyzed app. Almost all apps (29 out of 30) were found to access the device’s storage. Beyond storage, CAMERA (20 apps), LOCATION (16 apps), ACCESS WIFI (10 apps), RECORD AUDIO (8 apps), READ PHONE STATE (4 apps), USE BIOMETRICS (3 apps), and READ CONTACTS (2 apps) are the most accessed
Fig. 6. The Most Accessed Permissions
permissions. It is worth mentioning that many apps accessed highly sensitive permissions (camera, microphone, location, etc.) while the user was not interacting with them. We believe this is problematic from a privacy perspective because the apps were only used passively. We were also interested in analyzing the severity of trackers, i.e., servers deployed to exchange users’ data, integrated into the apps’ code. We used the tool proposed in [3] to extract a list of widely embedded trackers within the surveillance camera apps. Figure 8 shows the results regarding the integration of trackers into the analyzed apps’ code. Our analysis shows that almost one-third (31.5%) of these trackers belong to analytics and advertising parties (shown in light red). This confirms that these apps not only access sensitive data but are also capable of transferring these data to third parties ranging from advertising to analytics networks. For instance, using the tool in [22], we found that the device’s unique identifiers, such as the IMEI and MAC address, are the pieces of personal data most often transferred to third parties (21 apps). Contacting a third party is not by itself an obvious sign of a privacy breach. However, our results confirm the high integration of third-party trackers in the analyzed apps, some of which are associated with third-party advertisement and tracking services.
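The permission access figures reported in this phase amount to two aggregations over the monitoring log: accesses per app and permission (as in Fig. 7), and the number of distinct apps per permission (as in Fig. 6). The sketch below illustrates this under stated assumptions: the event format `(timestamp, app, permission)` and the sample events are hypothetical and do not reflect the output format of the tool in [13,16].

```python
from collections import Counter, defaultdict

# Hypothetical monitoring log: one (timestamp, app, permission) tuple per access.
events = [
    ("2022-01-01T10:00:00", "app_a", "CAMERA"),
    ("2022-01-01T10:05:00", "app_a", "CAMERA"),
    ("2022-01-01T10:06:00", "app_a", "LOCATION"),
    ("2022-01-01T11:00:00", "app_b", "CAMERA"),
    ("2022-01-01T11:30:00", "app_b", "RECORD_AUDIO"),
]


def access_frequencies(log):
    """Number of accesses per app and permission."""
    freq = defaultdict(Counter)
    for _timestamp, app, permission in log:
        freq[app][permission] += 1
    return freq


def apps_per_permission(log):
    """Number of distinct apps that accessed each permission."""
    apps = defaultdict(set)
    for _timestamp, app, permission in log:
        apps[permission].add(app)
    return {permission: len(names) for permission, names in apps.items()}
```

Keeping the timestamp in each event also allows accesses to be filtered by time window, for example to isolate accesses that occurred while the user was not interacting with the app.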
Fig. 7. Apps’ Permission Access Frequencies: Numbers Show the Frequency of Permission Accesses
7 Discussion of Observations
We observed a couple of critical privacy issues resulting from our multidimensional analysis of surveillance camera Android apps.
7.1 Over-Privileged Issue
Our permission request analysis phase (Phase I) revealed that the majority of studied apps tend to request permissions that are not related to their proper functionality. As opposed to other app categories such as social networking, music or communication, which have already been reported [19] to suffer from
Fig. 8. Detected Trackers
this issue, we believe the existence of such an issue in surveillance camera apps can potentially have a much more severe impact on users’ privacy. This is because these apps deal with special categories of personal data, i.e., biometric data used for identification purposes (e.g., surveillance camera footage), which, as defined by the GDPR, require stronger protection measures than ordinary personal data to ensure that the fundamental rights and freedoms of individuals are not violated. Thus, we argue that the processing of multimedia content captured through surveillance cameras by these apps should undergo a Data Protection Impact Assessment (DPIA). According to Article 35 of the GDPR, “where a type of processing, in particular, using new technologies, and taking into account the nature, scope, context, and purposes of the processing, is likely to result in a high risk to the rights and freedoms of natural persons, the controller shall, prior to the processing, carry out an assessment of the impact of the envisaged processing operations on the protection of personal data”.
7.2 Transparency Issue
The results obtained from the second phase (privacy policy) of our analysis show that the majority of studied apps do not fully fulfill fundamental principles of the GDPR, i.e., Article 5(1)(a), which has a special focus on transparency. This becomes even more critical when combined with the importance of users’ consent: according to Article 7 of the GDPR, consent must be freely given, specific, informed, and unambiguous for the data processing activities to be considered lawful. However, the results of our analysis show that the majority of surveillance camera apps fail to provide users with a transparent privacy policy text. Therefore, their data processing activities are based on invalid user consent, which may render these activities unlawful and subject to Article 83(5)(a) of the GDPR. Under this provision, infringements of the basic principles for processing can lead to administrative fines of up to 20 million euros, or up to 4% of the total worldwide annual turnover of the preceding financial year of a data collector, whichever is higher.
7.3 Inappropriate Data Access and Sharing Practices
Our findings from the third phase (apps’ behavior analysis) revealed that the studied apps have an excessive tendency to access sensitive data while the user is not interacting with them. We believe this violates not only data protection requirements such as data minimization (apps should limit their data access and sharing practices to those data types that are necessary for the core functionality of the app) and purpose specification (apps should only collect those personal data types that are relevant to the purposes specified in the apps’ privacy policies), but also best software development practices such as the principle of least privilege, which demands that software developers make their applications functional with the minimum required privileges [23]. In addition, our results indicated that a considerable number of trackers are embedded in the apps’ code and that the device’s unique identifiers (such as the IMEI) are transferred to external servers, which is not aligned with the GDPR requirements (Chapter V, transfers of personal data to third countries or international organizations). The results obtained from our analysis can serve as a call for action to revamp the current status with home globalization and the used technologies that are essentially targeting citizens’ activities using surveillance camera tools and apps. From a compliance perspective, we believe stronger enforcement mechanisms are needed to minimize the potential privacy and security risks associated with the
excessive proliferation and use of such monitoring and surveillance tools. We also highlight that future research should further explore the users’ privacy awareness aspects with regard to the integration of these cameras and their respective mobile apps into people’s daily lives and activities as users may not be fully aware of the negative consequences that such cameras and apps could potentially have on their privacy. We also note that the developers and providers of these surveillance camera apps should carefully address privacy threats discussed in this paper and make sure their app design and the development life cycle respect privacy-by-design.
8 Conclusion
In this paper, a multidimensional analysis has been presented to showcase potential GDPR compliance issues of Android surveillance camera apps. In particular, we focused on the system permission requests of Android surveillance camera apps, their privacy policies, and their adherence to existing regulations defined in the GDPR. Finally, we analyzed their run-time permission requests to identify potential privacy and security issues associated with these applications. Based on the multidimensional analysis, we can show that many of these apps request permissions unrelated to their core functionality, without a clear connection or clarification in the respective privacy policies, which have often been found not to conform to data privacy legislation (i.e., the EU’s GDPR). Finally, we found that these apps access sensitive data from the users’ devices while also embedding trackers to transfer this sensitive data to external servers. Some of these alarming behaviours have previously been observed in other contexts as well [14,17]. The findings show that further mechanisms are necessary to enforce data protection regulations, such as the GDPR. Procedures need to be developed to monitor applications more closely, not only in the legal domain but also through technical analysis, e.g., of permission requests and embedded trackers. Thus, an approach to automate the analysis of technical dimensions is necessary to enable the enforcement of data protection regulations on a technical level as well, and to detect possible pitfalls and areas where adjustment or further clarification of the regulation is necessary.
Acknowledgment. We would like to thank Majid Hatamian for his great support and guidance throughout all the different steps of the experiments.
References
1. Regulation (EU) 2016/679 of the European Parliament and of the Council of 27 April 2016 on the protection of natural persons with regard to the processing of personal data and on the free movement of such data, and repealing Directive 95/46/EC (General Data Protection Regulation) (2016)
2. Privacy and data protection in mobile applications. A study on the app development ecosystem and the technical implementation of GDPR. ENISA (2017)
3. Mobile Security Framework (MobSF) (2020)
4. Barrera, D., Kayacik, H., Van Oorschot, P.C., Somayaji, A.: A methodology for empirical analysis of permission-based security models and its application to android. In: Proceedings of the 17th ACM Conference on Computer and Communications Security, pp. 73–84 (2010)
5. Bugeja, J., Jacobsson, A., Davidsson, P.: Smart connected homes. Internet of Things A to Z: Technologies and Applications, pp. 359–384 (2018)
6. Bugeja, J., Jacobsson, A., Davidsson, P.: PRASH: a framework for privacy risk analysis of smart homes. Sensors 21(19), 6399 (2021)
7. Cavoukian, A., et al.: Privacy by design: the 7 foundational principles. In: Information and Privacy Commissioner of Ontario, Canada, 5 (2009)
8. Enck, W., Octeau, D., McDaniel, P.D., Chaudhuri, S.: A study of android application security. In: USENIX Security Symposium, vol. 2 (2011)
9. Enck, W., Ongtang, M., McDaniel, P.: On lightweight mobile phone application certification. In: Proceedings of the 16th ACM Conference on Computer and Communications Security, pp. 235–245 (2009)
10. Fritsch, L., Abie, H.: Towards a research road map for the management of privacy risks in information systems. In: SICHERHEIT 2008–Sicherheit, Schutz und Zuverlässigkeit. Beiträge der 4. Jahrestagung des Fachbereichs Sicherheit der Gesellschaft für Informatik eV (GI) (2008)
11. Mahbub Habib, S., Alexopoulos, N., Monirul Islam, Md., Heider, J., Marsh, S., Mühlhäuser, M.: Trust4app: automating trustworthiness assessment of mobile applications. In: 2018 17th IEEE International Conference On Trust, Security and Privacy In Computing and Communications/12th IEEE International Conference on Big Data Science And Engineering (TrustCom/BigDataSE), pp. 124–135. IEEE (2018)
12. Hatamian, M.: Engineering privacy in smartphone apps: a technical guideline catalog for app developers. IEEE Access 8, 35429–35445 (2020)
13. Hatamian, M., Kitkowska, A., Korunovska, J., Kirrane, S.: “It’s shocking!”: analysing the impact and reactions to the A3: android apps behaviour analyser. In: Kerschbaum, F., Paraboschi, S. (eds.) DBSec 2018. LNCS, vol. 10980, pp. 198–215. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-95729-6_13
14. Hatamian, M., Momen, N., Fritsch, L., Rannenberg, K.: A multilateral privacy impact analysis method for android apps. In: Naldi, M., Italiano, G.F., Rannenberg, K., Medina, M., Bourka, A. (eds.) APF 2019. LNCS, vol. 11498, pp. 87–106. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-21752-5_7
15. Hatamian, M., Serna, J., Rannenberg, K.: Revealing the unrevealed: mining smartphone users privacy perception on app markets. Comput. Secur. 83, 332–353 (2019)
16. Hatamian, M., Serna, J., Rannenberg, K., Igler, B.: FAIR: fuzzy alarming index rule for privacy analysis in smartphone apps. In: Lopez, J., Fischer-Hübner, S., Lambrinoudakis, C. (eds.) TrustBus 2017. LNCS, vol. 10442, pp. 3–18. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-64483-7_1
17. Hatamian, M., Wairimu, S., Momen, N., Fritsch, L.: A privacy and security analysis of early-deployed Covid-19 contact tracing android apps. Empir. Softw. Eng. 26(3), 1–51 (2021)
18. Human, S., Cech, F.: A human-centric perspective on digital consenting: the case of GAFAM. In: Zimmermann, A., Howlett, R.J., Jain, L.C. (eds.) Human Centred Intelligent Systems. SIST, vol. 189, pp. 139–159. Springer, Singapore (2021). https://doi.org/10.1007/978-981-15-5784-2_12
19. Momen, N., Hatamian, M., Fritsch, L.: Did App privacy improve after the GDPR? IEEE Secur. Privacy 17(6), 10–20 (2019)
20. Montgomery, B.: Future shock: IOT benefits beyond traffic and lighting energy optimization. IEEE Consum. Electr. Mag. 4(4), 98–100 (2015)
21. Pierce, J.: Smart home security cameras and shifting lines of creepiness: a design-led inquiry. In: Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems, pp. 1–14 (2019)
22. Razaghpanah, A., et al.: Haystack: In situ mobile traffic analysis in user space. CoRR, abs/1510.01419 (2015)
23. Saltzer, J.H., Schroeder, M.D.: The protection of information in computer systems. Proc. IEEE 63(9), 1278–1308 (1975)
24. Stach, C., Steimle, F.: Recommender-based privacy requirements elicitation - EPICUREAN: an approach to simplify privacy settings in IoT applications with respect to the GDPR. In: Proceedings of the 34th ACM/SIGAPP Symposium on Applied Computing, pp. 1500–1507 (2019)
25. Sunyaev, A., Dehling, T., Taylor, P.L., Mandl, K.D.: Availability and quality of mobile health app privacy policies. In: American Medical Informatics Association, pp. 288–33 (2015)
Extension for ASPICE and Cybersecurity Process Assessment Model
Christian Schlager1(B), Georg Macher1, Richard Messnarz2, and Eugen Brenner1
1 Technical University of Graz, Rechbauerstrasse 12, 8010 Graz, Austria
[email protected], {georg.macher,brenner}@tugraz.at
2 ISCN, Schiessstattgasse 4, 8010 Graz, Austria
[email protected]
http://www.iti.tugraz.at, http://www.iscn.com
Abstract. ASPICE was introduced to evaluate the capability of the processes used to build embedded systems at automotive suppliers. It covers software and system engineering, quality assurance, configuration management, issue resolution, change management, project management, supplier monitoring, and other processes. At its foundation, the PAM specifies a method for determining process capability. Hardware SPICE is taken into account as well because it is essential for cybersecurity; Mechanical SPICE, however, is not considered. When the ASPICE assessment is expanded by cybersecurity processes to meet the requirements of the standard ISO/SAE 21434, this involves more work for the assessors and the technical team. This study introduces a concept for reducing assessment time by extending the ER (entity relationship) model that ASPICE already describes and by identifying WPs (work products) that are used by at least one ASPICE process and one cybersecurity process.
Keywords: ASPICE · Cybersecurity · Assessment · V-Model
1 Introduction
To evaluate the capability of processes for developing embedded systems (e.g., eDrive systems), ASPICE (Automotive Software Process Improvement and Capability Determination) [25] has been introduced. It covers systems development (SYS.2-SYS.5), software development (SWE.1-SWE.6), project management (MAN.3), quality assurance (SUP.1), configuration management (SUP.8), issue resolution management (SUP.9), change management (SUP.10), and supplier monitoring (ACQ.4). The VDA defines additional assessment processes in order to meet the requirements for cybersecurity system development specified in ISO 21434 [10,27,28]. The processes for cybersecurity assessment are requirement elicitation (SEC.1), cybersecurity implementation (SEC.2), risk treatment verification (SEC.3), risk treatment validation (SEC.4), supplier request and selection (ACQ.2), and cybersecurity risk management (MAN.7).
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 K. Arai (Ed.): SAI 2023, LNNS 739, pp. 1394–1408, 2023. https://doi.org/10.1007/978-3-031-37963-5_94
A 5- to 6-day assessment of ASPICE processes also takes into account capability levels [25] and dependencies between the practices of one process and those of other processes [18,26]. If the cybersecurity processes are included in the assessment, the time frame is extended. The cybersecurity processes also define dependencies [27,28]. Tool support for checking dependencies, such as Capability Advisor [18], can be used to shorten the assessment of a project. If the project involves both ASPICE and cybersecurity processes, further strategies to shorten the assessment are extending the predefined ER model [25] and identifying shared WPs used by both ASPICE and Cybersecurity processes. The research question for this paper is: How can the time for an assessment covering processes for both ASPICE and Cybersecurity be reduced? A concept (explained in Sect. 3) is created to answer this research question. This paper is structured as follows: Sect. 2 presents the prior work in this field, Sect. 3 describes the concept, Sect. 4 illustrates the time savings resulting from the concept, Sect. 5 validates the concept presented in Sect. 3, and Sect. 6 provides a summary.
2 Related Work
2.1 Automotive SPICE (ASPICE)
Fig. 1. V-Model of ASPICE [25]
The AUTOSIG (Automotive Special Interest Group) was founded in 2001 by different OEMs (original equipment manufacturers). Members of this group include most of the German OEMs, such as BMW, Audi, VW, Porsche and Daimler, as well as non-German members such as Land Rover, Ford, Volvo and Fiat. ASPICE (Automotive Software Process Improvement and Capability Determination) [25] was introduced by this group to assess embedded systems. ASPICE defines different process groups to meet the requirements for the evaluation of embedded systems. The VDA (Verband der Automobilindustrie) defines the assessment’s most critical processes. The processes for software (SWE.1-SWE.6), systems (e.g., SYS.1-SYS.5), support (e.g., SUP.9), management (e.g., MAN.3), and acquisition (e.g., ACQ.4) are the disciplines currently considered by ASPICE and are marked in blue in Fig. 1. The PAM (Process Assessment Model) shown in Fig. 1 also defines processes for Hardware SPICE [7] (HWE.1-HWE.6) and Mechanical SPICE [8] (MEE.1-MEE.6) as placeholders. These two disciplines are not yet considered in detail by ASPICE [25].
2.2 Cybersecurity
Fig. 2. ASPICE and Cybersecurity [27, 28]
In 2021 ISO introduced the standard ISO/SAE 21434 [10]. It addresses the development of embedded systems that take cybersecurity into account [2,5,11,12,20,23]. The standard ISO/SAE 21434 considers different aspects, which are described in its individual chapters. In addition to what ASPICE covers, Cybersecurity considers further aspects of developing a system. These include key exchange
[6], signal and message spoofing [1,3,21], and handling of cybersecurity requirements [13]. The VDA [27,28] provides additional processes that Automotive SPICE has not yet described in order to satisfy the requirements of ISO 21434. Figure 2 depicts the additional cybersecurity processes. The newly formed Cybersecurity Engineering Process Group contains the engineering processes SEC.1-SEC.4. Additionally, the Management Process Group and the Acquisition Process Group are expanded by the processes MAN.7 and ACQ.2, respectively.
2.3 Rules and Rating Consistency
In addition to the PAM of ASPICE, the VDA also defines rules and recommendations [26,27]. A rule shall be applied when rating a practice, whereas a recommendation should be applied. Furthermore, the VDA distinguishes between Rating of Practices and Rating Consistency.
1. Rating of Practices
The Rating of Practices according to N(ot), P(artially), L(argely), F(ully) [25] has an impact on the practice or process attribute only. The following recommendation is defined for process SEC.1 [27]: [SEC.1.RC1] “If unclear or inconsistent requirements are not clarified with the individual stakeholders, indicator BP1 will be downrated.” [27].
2. Rating Consistency
Fig. 3. Consistency for SEC.1[27]
Unlike the Rating of Practices, Rating Consistency does not consider only a single practice or process attribute. It considers the consistency of ratings between practices (base practices for level 1 and generic practices for levels 2–5) and/or
process attributes (PA 1.1–5.2) within one single process and even between different processes. Figure 3 shows the dependencies for process SEC.1. One recommendation for Rating Consistency for process SEC.1 is [27]: [SEC.1.RC.5] “If BP1 for SWE.1 is downrated, this should be in line with the rating of the indicator BP1”. [27].
2.4 Mapping of Base Practices
An assessment spanning the VDA scope of ASPICE takes six days to complete. Adding the cybersecurity-specific processes increases the required time by 2.5 days. For both ASPICE and Cybersecurity processes, Base Practices (BPs) are defined at level 1. The BPs of ASPICE and of Cybersecurity frequently use the same work products [22]. The mapping for the processes SEC.1, SYS.2, and SWE.1 is shown in Fig. 4.
Fig. 4. Mapping of Base Practices for SEC.1 [22]
The Work Product 13-22 Traceability Record is an output of SWE.1.BP6,7, SYS.2.BP6,7 and SEC.1.BP2,3.
2.5 Assessment ASPICE with Cybersecurity Processes
The tool Capability Advisor [18,19] has already integrated the additional processes defined for cybersecurity [27,28]. Figure 5 shows how the practices are rated according to the NPLF scheme of ASPICE [25]. During the assessment, the following aspects were observed:
– management and support processes do not support the assessment well [19];
– a team of experts is necessary, because the work products require detailed knowledge [4];
– to gain deep knowledge of the work products, questions on ASPICE, functional safety, and cybersecurity shall be asked during the assessment [15,16];
– the system also interacts with its environment via its interfaces [17].
Fig. 5. Rating for SEC.1 [18, 19]
2.6 Analysis of Cybersecurity Processes
As indicated in Sect. 2.2, the VDA defines extra processes [27,28] in order to satisfy the requirements of ISO 21434 [10]. In [14], a thorough analysis of the processes defined by the VDA for Cybersecurity [27,28] is provided.
2.7 Summary of Related Work
ASPICE. The V-Model of ASPICE is the basis for performing assessments in an automotive project. It is used to determine the capability of processes according to the NPLF scheme on levels 0–5.
Cybersecurity. Cybersecurity is currently not taken into account by ASPICE in version 3.1. To cover the aspects of ISO 21434, Cybersecurity defines additional processes. In order to answer the research question, Cybersecurity is used as a foundation in the same way as ASPICE.
Rules and Rating Consistency. Rules and Rating Consistency are defined by the VDA. The rating consistency can be utilized to speed up the assessment.
Mapping of Base Practices. The same work products are frequently used in ASPICE and cybersecurity processes. This fact makes it possible to shorten the time required for an assessment. The mapping itself, however, only identifies common work products and does not provide any techniques for speeding up the assessment.
Assessment ASPICE with Cybersecurity Processes. Cybersecurity processes must be thoroughly understood in order to be assessed. A shorter assessment is achievable since ASPICE and Cybersecurity processes frequently use the same work products.
Analysis of Cybersecurity Processes. A deep analysis of the processes defined by Cybersecurity (e.g., SEC.1) shows the additional effort required to perform assessments. This makes it necessary to find methods that reduce the time needed for an assessment.
3 Concept
Fig. 6. Entity Relationship Model
The Entity Relationship Model shown in Fig. 6 is defined in [27,28]. It consists of the entities Process (e.g., SWE.1), Outcome (e.g., the software requirements ... are defined;), Practice (e.g., SWE.1.BP1: Specify software requirements.) and Work Product (WP) (e.g., 17-11 Software requirements specification). The outcome of a process is valid for assessing level 1 of a process. For levels 2 to 5, the practice (Generic Practice) is directly connected to the process and the WP is directly connected to the practice. During the assessment of a certain process, the outcome is not considered. The Base Practices are scored directly with the NPLF scheme [27,28], which leads to the NPLF score of PA1.1. The Generic Practices are also scored according to the NPLF scheme [27], which leads to the NPLF scores of PA2.1 to PA5.2. Therefore, the assessment does not use the model shown in Fig. 6 but the model shown in Fig. 7. The proposal extends this model by the entities Attribute and Question.
Fig. 7. Used Relationship Model
3.1 Entities Defined by ASPICE and Cybersecurity PAM
The entities shown in Fig. 7 marked with a white background are already defined in [25,27].
Process.
– Process ID (Key): ID of the process, e.g., SWE.1
– Process Name: Name of the process, e.g., Software Requirements Analysis
– Process Type: Type of the process, ASPICE/Cybersecurity
– Process Purpose: Purpose of the process
– Process Outcomes: Outcomes of the process, e.g., the software requirements to be allocated to the software elements of the system and their interfaces are defined.
Practice.
– ID (Key): ID of the practice, e.g., SWE.1.BP1
– Name: Name of the practice, e.g., Specify software requirements for SWE.1.BP1
– Type: Type of the practice, Base Practice/Generic Practice
– Note: Information for the practice, e.g., Use the system requirements and the system architecture... for SWE.1.BP1
Work Product.
– ID (Key): ID of the Work Product, e.g., 17-11
– Name: Name of the Work Product, e.g., Software requirements specification for 17-11
– Note: Additional information for the Work Product
3.2 Newly-Introduced Entities
The entities shown in Fig. 7 marked with a red background are newly defined.
Attribute.
– Name (Key): Name of the Attribute, e.g., 17-11 ASIL
– Area: Topic the Attribute belongs to, ASPICE/FuSa/Cybersecurity
– Type: Type of the Attribute, Enum/Link/Person/Text
– Description: Description of the Attribute, e.g., 17-11 ASIL
Question.
– ID (Key): ID of the Question, e.g., Q17-11 ASIL
– Name: Name of the Question, e.g., 17-11 ASIL
– Question: Description of the Question, e.g., Has each requirement an ASIL rating?
3.3 Relationship
The entities shown in Fig. 7 have relationships to certain other entities that are also shown in Fig. 7. All relationships between the entities are n to m relations (marked with *), with one exception: the relationship between Process and Practice. Level 1 of each process defines Base Practices, each of which can be assigned to one process only. The resulting relationship is 1 to n. Levels 2 to 5 of each process define Generic Practices, which can be assigned to more than one process. The resulting relationship is n to m. The entities and relationships for 17-11 Software Requirement Specification are shown in Fig. 8, which displays only the key attributes listed in Sects. 3.1 and 3.2. For 17-11 Software Requirement Specification the following entities and objects are used:
1. Entity Process: e.g., SWE.1, SEC.1
2. Entity Practice: e.g., SWE.1.BP1
3. Entity Work Product: e.g., 17-11
4. Entity Attribute: e.g., ASIL, CAL
5. Entity Question: e.g., Has each requirement an ASIL rating? Does each requirement have a CAL rating? (1-4 if necessary)
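The extended model for 17-11 can be sketched as a small set of linked data classes. This is a minimal illustration under stated assumptions, not an implementation of the PAM: the class and field names follow the entity lists in Sects. 3.1 and 3.2 only loosely, the area and type values assigned to ASIL and CAL are assumptions, the attribute name "17-11 CAL" is formed by analogy with "17-11 ASIL", and the SEC.1 practice is a placeholder because the text does not name the SEC.1 practice that uses 17-11.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Question:
    id: str
    name: str
    text: str


@dataclass
class Attribute:
    name: str
    area: str          # ASPICE / FuSa / Cybersecurity
    kind: str          # Enum / Link / Person / Text
    description: str
    questions: List[Question] = field(default_factory=list)


@dataclass
class WorkProduct:
    id: str
    name: str
    attributes: List[Attribute] = field(default_factory=list)


@dataclass
class Practice:
    id: str
    name: str
    kind: str          # Base Practice / Generic Practice
    work_products: List[WorkProduct] = field(default_factory=list)


@dataclass
class Process:
    id: str
    name: str
    kind: str          # ASPICE / Cybersecurity
    practices: List[Practice] = field(default_factory=list)


def questions_of(process: Process) -> List[str]:
    """Collect the question texts reachable from a process, without duplicates."""
    seen, result = set(), []
    for practice in process.practices:
        for work_product in practice.work_products:
            for attribute in work_product.attributes:
                for question in attribute.questions:
                    if question.id not in seen:
                        seen.add(question.id)
                        result.append(question.text)
    return result


# Assumption: area "FuSa" and type "Enum" for ASIL, "Cybersecurity"/"Enum" for CAL.
asil = Attribute("17-11 ASIL", "FuSa", "Enum", "17-11 ASIL",
                 [Question("Q17-11 ASIL", "17-11 ASIL",
                           "Has each requirement an ASIL rating?")])
cal = Attribute("17-11 CAL", "Cybersecurity", "Enum", "17-11 CAL",
                [Question("Q17-11 CAL", "17-11 CAL",
                          "Does each requirement have a CAL rating?")])
srs = WorkProduct("17-11", "Software requirements specification", [asil, cal])

swe1 = Process("SWE.1", "Software Requirements Analysis", "ASPICE",
               [Practice("SWE.1.BP1", "Specify software requirements",
                         "Base Practice", [srs])])
# Assumption: practice id and name are placeholders, not taken from the PAM.
sec1 = Process("SEC.1", "Cybersecurity Requirements Elicitation", "Cybersecurity",
               [Practice("SEC.1.BPx", "placeholder practice",
                         "Base Practice", [srs])])
```

Because both processes reference the same `WorkProduct` object, the n to m relation between practices and work products is represented by sharing rather than by copying.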
3.4 Advantage of Concept
The entities Attribute and Question in the proposed concept extend the ER model of ASPICE [25]. This extension is necessary because the work products frequently used by processes of both ASPICE and Cybersecurity (such as 17-11 Software requirement specification for the ASPICE process SWE.1 and the Cybersecurity process SEC.1) do not take into account the specific attributes required for ASPICE and Cybersecurity. For instance, for process SEC.1: Requirement Elicitation, the 17-11 Software Requirement Specification only requires the additional attribute CAL (Cybersecurity Assurance Level). The questions for the CAL attribute are:
Fig. 8. Relationship Model for 17-11 Software Requirement Specification
– Is the CAL assigned to a single requirement or to the whole feature the requirement is assigned to?
– Is the classification correct (CAL 1-4)?
– Does the CAL fit the security goal defined in the TARA (Threat Analysis and Risk Assessment)?
The concept presented in this section takes into account the differences in the attributes of work products that are used by both ASPICE and Cybersecurity processes. The majority of a work product’s attributes are already questioned during the assessment of the ASPICE process. During the assessment of the Cybersecurity processes, only the attributes needed for Cybersecurity are questioned. Thus, following this approach saves time when performing an assessment.
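The selection of questions described above can be sketched as a simple filter: questions already asked while assessing the ASPICE process are skipped when the Cybersecurity process is assessed. The data layout, the function names, and the assumption that the SWE.1 assessment covers the areas ASPICE and FuSa are illustrative only; the CAL questions are shortened versions of the ones listed above.

```python
# Attributes of work product 17-11 with their area and questions (illustrative).
ATTRIBUTES_17_11 = [
    {"name": "ASIL", "area": "FuSa",
     "questions": ["Has each requirement an ASIL rating?"]},
    {"name": "CAL", "area": "Cybersecurity",
     "questions": [
         "Is the CAL assigned to a single requirement or to the whole feature?",
         "Is the classification correct (CAL 1-4)?",
         "Does the CAL fit the security goal defined in the TARA?",
     ]},
]


def questions_for(attributes, areas):
    """All questions of the attributes that belong to one of the given areas."""
    return [q for a in attributes if a["area"] in areas for q in a["questions"]]


def remaining_questions(attributes, areas, already_asked):
    """Questions for the given areas that have not been asked before."""
    asked = set(already_asked)
    return [q for q in questions_for(attributes, areas) if q not in asked]


# Assumption: the SWE.1 assessment covers the areas ASPICE and FuSa.
asked_in_swe1 = questions_for(ATTRIBUTES_17_11, {"ASPICE", "FuSa"})
# During the SEC.1 assessment only the questions not yet asked remain.
open_for_sec1 = remaining_questions(
    ATTRIBUTES_17_11, {"ASPICE", "FuSa", "Cybersecurity"}, asked_in_swe1
)
```

In this sketch the time saving comes from `open_for_sec1` containing only the CAL questions, so the shared work product is not examined from scratch a second time.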
4 Time Reduction
Without taking into account numerous processes employing the same Work Product, the time schedule for an evaluation, which is a crucial component of the planning phase [9], may resemble the schedule shown in Fig. 9. For instance, both SWE.1 - Software Requirements Analysis and SEC.1 - Cybersecurity Requirements Elicitation employ Work Product 17-11 Software Requirement Specification. The following are all the work products that SWE.1 and SEC.1 share: 1. 13-04 SWE.1 - Communication record 2. 13-19 SWE.1 - Review record
1404
C. Schlager et al.
Fig. 9. Schedule
3. 13-22 SWE.1 - Traceability record
4. 15-01 - Analysis report
5. 17-11 - Software requirements specification
When the attributes of 17-11 Software Requirement Specification are considered during the assessment of SWE.1 - Software Requirements Analysis and SEC.1 - Cybersecurity Requirements Elicitation, most of the attributes are asked during the assessment of SWE.1, and only the attributes needed for SEC.1 (attribute CAL) are asked during the assessment of SEC.1. This means the duration of the assessment for SEC.1 becomes shorter. The time reduction for each Work Product that SWE.1 and SEC.1 use in common is estimated to be 12 min. There are 5 Work Products that SWE.1 and SEC.1 use in common (see the list above). Therefore, the total time reduction for assessing SEC.1 is estimated to be 5 × 12 min = 1 h. Finding common Work Products can also be extended to other processes, e.g., SWE.2 - Software Architectural Design and SEC.2 - Cybersecurity Implementation. The common Work Products used by both SWE.2 and SEC.2 are the following:
1. 04-04 - Software architectural design
2. 13-04 SWE.2 - Communication record
3. 13-19 SWE.2 - Review record
4. 13-22 SWE.2 - Traceability record
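The arithmetic behind this estimate can be reproduced with a minimal sketch. The 12 min per shared Work Product is the estimate stated in the text; applying the same figure to the SWE.2/SEC.2 pair is an assumption made here purely for illustration, and the helper name is hypothetical.

```python
# Estimated saving per Work Product shared by two processes (value from the text).
MINUTES_SAVED_PER_SHARED_WP = 12

# Work Products shared by SWE.1 and SEC.1, as listed in the text.
SHARED_SWE1_SEC1 = [
    "13-04 Communication record",
    "13-19 Review record",
    "13-22 Traceability record",
    "15-01 Analysis report",
    "17-11 Software requirements specification",
]

# Work Products shared by SWE.2 and SEC.2, as listed in the text.
SHARED_SWE2_SEC2 = [
    "04-04 Software architectural design",
    "13-04 Communication record",
    "13-19 Review record",
    "13-22 Traceability record",
]


def estimated_saving_minutes(shared_work_products,
                             per_wp=MINUTES_SAVED_PER_SHARED_WP):
    """Hypothetical helper: total estimated saving for one pair of processes."""
    return len(shared_work_products) * per_wp


# 5 shared Work Products * 12 min = 60 min, i.e. the 1 h stated in the text.
saving_sec1 = estimated_saving_minutes(SHARED_SWE1_SEC1)

# Assumption (not stated in the text): the same 12 min also applies to SEC.2.
saving_sec2 = estimated_saving_minutes(SHARED_SWE2_SEC2)
```

Whether the per-Work-Product saving is really the same for every process pair would have to be confirmed in actual assessments.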
5 Validation by Review
Leading automotive suppliers, who will eventually have to conduct combined assessments that include security, reviewed the method to validate it. The SOQRATES members were given an introduction to the idea discussed in Sect. 3. SOQRATES [24] is a working group led by Richard Messnarz from the company ISCN that includes Tier 1 suppliers such as Bosch, Hella, Conti, and ZF. The working group's suggestions have been used to enhance the current idea. For instance, the initial strategy was to add attributes and questions to the PAM of ASPICE. This extension is suggested because it will assist assessors in conducting an assessment and thereby shorten the time required. The working group concluded that, while this is a novel approach that extends the PAM, it has to be confirmed in a number of actual assessments. Over the course of nine months, the current structure was presented and discussed three times at SOQRATES. The working group is driven by a shared desire to cut down on the 8.5 days it currently takes to complete an assessment that covers both ASPICE and Cybersecurity processes. Separate questions about the aspects of time reduction introduced in Sect. 3 were posed, and the responses are outlined in Sects. 5.1 and 5.2 below.
5.1 Attributes
A novel way to reduce the time required for an assessment is to concentrate on common work products and, consequently, on work product attributes. Numerous work products are required by the processes of both ASPICE and Cybersecurity. The length of the assessment may be cut in half if common work products and attributes are taken into account.
5.2 Questions
According to Sect. 3, the questions and attributes are closely related. Questions are automatically taken into consideration when the attributes of common work products that fulfill practices for ASPICE and Cybersecurity are taken into account.
5.3 Summary
The proposals discussed in Sect. 3 may result in a shorter amount of time needed to complete an assessment, according to the working group SOQRATES’s general response. It is required to carry out a number of assessments spanning both ASPICE and Cybersecurity processes in order to confirm that the suggestions result in a time reduction.
6 Conclusion and Further Research
ASPICE [25] defines processes whose capability level is determined in the range 0–5. To evaluate the process attributes of a process, the NPLF scheme is used. The ASPICE processes cover system development, software development, support, management, and acquisition. Additionally, VDA [27,28] defines an acquisition process (ACQ.2), Cybersecurity risk management (MAN.7), and security engineering processes (SEC.1-SEC.4). An assessment takes a lot of time if the processes of both ASPICE and Cybersecurity are included in its scope. This paper responds to the research question: How can the time for an assessment covering processes for both ASPICE and Cybersecurity be reduced? The idea for shortening the duration of an assessment, described in detail in Sect. 3, comprises the following steps:
1. Finding WPs that are used by processes of both ASPICE and Cybersecurity.
2. Extending the predefined ER model of ASPICE [25] by the entities Attribute and Question.
3. Defining attributes that apply to WPs for processes of both ASPICE and Cybersecurity.
4. Defining attributes that apply to WPs for processes of either ASPICE or Cybersecurity.
5. Deriving Questions from the defined attributes.
Future research will examine how much time can be saved when doing assessments with the suggested model. Actual assessments will then be used to measure the time savings achieved by following the suggestions stated in Sect. 3.
References
1. Ahmad F., Adnane A., Franqueira V., Kurugollu F., Liu L. (2018) Man-In-The-Middle Attacks in Vehicular Ad-Hoc Networks: Evaluating the Impact of Attackers Strategies. In: Sensors, Volume 18, Month 11, https://doi.org/10.3390/s18114040
2. Brennich T., Moser M. (2020) Automotive Security auf dem Pruefstand. In: ATZelectronics, Month 1+2, Page 48 to 53
3. Cheng B., Doherty B., Polanco N., Pasco M. (2021) Security Patterns for Connected and Automated Automotive Systems. In: Automotive Software Engineering, Volume 1, Issue 1, Page 51 to 77, https://doi.org/10.2991/jase.d.200826.001
4. Dobaj J., Ekert D., Stolfa J., Stolfa S., Macher G., Messnarz R. (2021) Cybersecurity Threat Analysis and Risk Assessment and Design Patterns for Automotive Networked Embedded Systems: A Case Study. In: JUCS - Universal Computer Science, Volume 27, Number 8, Page 830 to 849, https://lib.jucs.org/article/72367/
5. Ebert C. (2020) Efficient Implementation of Standards for Security, Safety and UNECE. In: ATZelectronics worldwide, Month 9, Page 40 to 43
6. Groza B., Murvay P. (2019) Identity-Based Key Exchange on In-Vehicle Networks: CAN-FD and FlexRay. In: Sensors, Volume 19, Number 22, https://doi.org/10.3390/s19224919
7. intacs (2019) HW Spice, intacs Working Group HW Engineering Processes
8. intacs (2020) Process Assessment Model SPICE for Mechanical Engineering, intacs Working Group MECH Engineering Processes
9. ISO/IEC (2019) ISO/IEC 33002 Information technology - Process assessment - Process measurement framework for assessment of process capability
10. ISO/SAE (2021) ISO/SAE DIS 21434, Road vehicles, Cybersecurity Engineering
11. Jadhav A. (2021) Automotive Cybersecurity. In: Kathiresh M., Neelaveni R. (eds) Automotive Embedded Systems. EAI/Springer Innovations in Communication and Computing. Springer, Cham. https://doi.org/10.1007/978-3-030-59897-6_6
12. Kim S., Shrestha R. (2020) Introduction to Automotive Cybersecurity. In: Automotive Cyber Security. Springer, Singapore. https://doi.org/10.1007/978-981-15-8053-6_1
13. Laborde R., Bulusu S., Wazan A., Oglaza A., Benzekri A. (2021) A Methodological Approach to Evaluate Security Requirements Engineering Methodologies: Application to the IREHDO2 Project Context. In: Cybersecurity and Privacy, Volume 1, Number 3, Page 422 to 452, https://doi.org/10.3390/jcp1030022
14. Magdy E. (2021) A-SPICE for Cybersecurity: Analysis and Enriched Practices. In: Yilmaz M., Clarke P., Messnarz R., Reiner M. (eds) Systems, Software and Services Process Improvement. EuroSPI 2021. Communications in Computer and Information Science, vol 1442. Springer, Cham. https://doi.org/10.1007/978-3-030-85521-5_37
15. Macher G., Much A., Riel A., Messnarz R., Kreiner C. (2017) Automotive SPICE, Safety and Cybersecurity Integration. In: Tonetta S., Schoitsch E., Bitsch F. (eds) Computer Safety, Reliability, and Security. SAFECOMP 2017. Lecture Notes in Computer Science, vol 10489. Springer, Cham. https://doi.org/10.1007/978-3-319-66284-8_23
16. Macher G., Schmittner C., Dobaj J., Armengaud E. (2020) An Integrated View on Automotive SPICE and Functional Safety and Cyber-Security (SAE Technical Paper). https://doi.org/10.4271/2020-01-0145
17. MacGregor J., Burton S. (2018) Challenges in Assuring Highly Complex, High Volume Safety-Critical Software. In: Gallina B., Skavhaug A., Schoitsch E., Bitsch F. (eds) Computer Safety, Reliability, and Security. SAFECOMP 2018. Lecture Notes in Computer Science, vol 11094. Springer, Cham. https://doi.org/10.1007/978-3-319-99229-7_22
18. Messnarz R., Ekert D., Zehetner T., Aschbacher L. (2019) Experiences with ASPICE 3.1 and the VDA Automotive SPICE Guidelines, Using Advanced Assessment Systems. In: Walker A., O'Connor R., Messnarz R. (eds) Systems, Software and Services Process Improvement. EuroSPI 2019. Communications in Computer and Information Science, vol 1060. Springer, Cham. https://doi.org/10.1007/978-3-030-28005-5_42
19. Messnarz R. et al. (2021) First Experiences with the Automotive SPICE for Cybersecurity Assessment Model. In: Yilmaz M., Clarke P., Messnarz R., Reiner M. (eds) Systems, Software and Services Process Improvement. EuroSPI 2021. Communications in Computer and Information Science, vol 1442. Springer, Cham. https://doi.org/10.1007/978-3-030-85521-5_35
20. Moselhy N., Ali Y. (2021) Impact of the New A-SPICE Appendix for Cybersecurity on the Implementation of ISO26262 for Functional Safety. In: Yilmaz M., Clarke P., Messnarz R., Reiner M. (eds) Systems, Software and Services Process Improvement. EuroSPI 2021. Communications in Computer and Information Science, vol 1442. Springer, Cham. https://doi.org/10.1007/978-3-030-85521-5_9
21. Petho Z., Intiyaz K., Torok A., Pasco M. (2021) Analysis of Security Vulnerability Levels of In-Vehicle Network Topologies Applying Graph Representations. In: Electronic Testing, Volume 37, Page 613 to 621, https://doi.org/10.1007/s10836-021-05973-x
22. Schlager C., Macher G. (2021) The Cybersecurity Extension for ASPICE - A View from ASPICE Assessors. In: Yilmaz M., Clarke P., Messnarz R., Reiner M. (eds) EuroSPI 2021. CCIS, vol 1442, pp. 409–422. Springer, Cham. https://doi.org/10.1007/978-3-030-85521-5_27
23. Singh M. (2021) Cybersecurity in Automotive Technology. In: Information Security of Intelligent Vehicles Communication. SCI, vol 978, pp. 29–50. Springer, Singapore. https://doi.org/10.1007/978-981-16-2217-5_3
24. SOQRATES, Task Forces Developing Integration of Automotive SPICE, ISO 26262, ISO21434 and SAE J3061. http://soqrates.eurospi.net/
25. VDA QMC (2015) Automotive Spice V Model
26. VDA QMC (2017) Automotive Spice Guidelines, 2nd edn
27. VDA QMC (2021) Automotive SPICE for Cybersecurity, 1st edn
28. VDA QMC (2021) Automotive SPICE for Cybersecurity Process Reference and Assessment Model
New Results on Algebraic Constructions of Extremal Graph Theory and Implementations of New Algorithms of Postquantum Cryptography
Vasyl Ustimenko (1,2,3) and Tymoteusz Chojecki (4) (B)
1 University of London (Royal Holloway), London, UK
2 University of Maria Curie-Sklodowska, Lublin, Poland
3 Institute of Telecommunications and Global Information Space, Kyiv, Ukraine
4 Institute of Mathematics, University of Maria Curie-Sklodowska, Lublin, Poland
Abstract. A recently established lower bound on the girth of the known small world graphs $A(n, q)$ motivates new constructions of $A(n, q)$ based stream ciphers. We suggest a graph based algorithm of symmetric encryption that is resistant to linearisation attacks. It allows us to introduce a new protocol based cryptosystem of Noncommutative Cryptography based on the platform of cubic automorphisms of $\mathbb{F}_q[x_1, x_2, \ldots, x_n]$.
Keywords: Extremal Algebraic Graphs · Graph based Security · Key Exchange Protocols · Cryptosystems · Multivariate Cryptography · Noncommutative Cryptography
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023. K. Arai (Ed.): SAI 2023, LNNS 739, pp. 1409–1418, 2023. https://doi.org/10.1007/978-3-031-37963-5_95
1 Introduction
The NIST tender of 2017 started the standardization process for possible post-quantum public-key algorithms intended to serve as (i) encryption tools and (ii) tools for digital signatures. In July 2020 the third round of the competition started. In the category of Multivariate Cryptography (MC) the remaining candidates formed a short list. For task (i) multivariate algorithms were not selected at all; a single multivariate candidate, "Rainbow Like Unbalanced Oil and Vinegar digital signatures" (RUOV), remains in category (ii) with a good chance for the final selection (see [1]). It is noteworthy that all multivariate NIST candidates were presented by multivariate rules of degree bounded by a small constant (2 or 3). In particular, RUOV is given by a system of quadratic polynomial equations. We think that the NIST outcomes motivate investigations of alternative options in MC which are oriented on encryption tools:
a. to work with encryption transformations of the plaintext space $(\mathbb{F}_q)^n$ of linear degree $cn$, where $c > 0$ is a constant, as instruments of stream ciphers or public keys,
b. to use protocols of Noncommutative Cryptography with platforms of multivariate transformations.
Both approaches, as well as a combination of (b) and (a), are used in this paper. We use special extremal graphs to generate highly nonlinear automorphisms of $\mathbb{F}_q[x_1, x_2, \ldots, x_n]$. They are connected with the problem of approximating the $k$-regular tree $T_k$, $k > 2$, by elements of a family of $k$-regular graphs of increasing order and increasing girth (the minimal length of a cycle in the graph). This task is very important for Extremal Graph Theory. Its solutions can be used in many applications, such as computer implementations of branching processes, sorting algorithms, constructions of Low Density Parity Check codes, and the creation of optimal networks for various tasks. Some valuable results in this area were contributed by L. Lovasz and A. Wigderson, who shared the prestigious Abel Prize in 2021 (see [2,3] and further references). Several interesting applications of tree approximations to Symmetric Cryptography were proposed in the period 1998–2022 (see [4] and further references). Some of them use the family of graphs $A(n, q)$, $n = 2, 3, \ldots$, defined for each prime power $q$, $q > 2$. The projective limit of these graphs when $n$ tends to infinity is a $q$-regular tree. It means that these graphs form an approximation of $T_q$. The long standing conjecture that these graphs form a family of large girth was proved recently [5,6]. It turned out that the girth of the graph, i.e. the length of its minimal cycle, is at least $[n/2]$. This result will appear in [7]. Our computer simulations show that the theoretical bound can be essentially improved in the future. For instance, we conjecture that the girth of the graphs $A(n, 3)$ can be written as $2n - 2$ for $n \ge 7$, see Table 1.
The theoretical lower bound on the girth already obtained has an impact on the properties of $A(n, q)$ based stream ciphers. In fact, if each path in the graph inducing the encryption map has length less than $[n/4]$, then different paths produce distinct ciphertexts. The authors of [8] consider an $A(n, q)$ based flexible symmetric cipher which uses a password tuple of fixed even length $t$ to form a cubic transformation $E$ induced by the path in the graph of length $t$. Two other parts of the password encode two linear transformations $T_1$ and $T_2$ of the plaintext space $V = (\mathbb{F}_q)^n$, which is the point set of a $q$-regular bipartite graph. The encryption map is a composition of the kind $T_1 E T_2$. Execution of $E$ takes $O(nt)$. If $t$ is less than half of the girth and $T_1$, $T_2$ are fixed linear transformations, then different paths produce distinct ciphertexts. The cubical nature of the encryption/decryption transformations means that an adversary can conduct costly linearization attacks via the interception of $n^3$ pairs of the kind plaintext - corresponding ciphertext. It allows him/her to restore $E$ in time $O(n^{10})$.
In this paper we modify this symmetric cipher to make it resistant to linearization attacks by replacing $E$ with the bijective polynomial map $\tilde{E}$ of degree $\ge 2n^{\gamma}$, $0 \le \gamma \le 1$, induced by the path starting in the entrance point and moving this point to the destination point of the path of even length $t = [n^{\gamma}]$. The transformation $\tilde{E}$ maps a point $p$ to another point $v$ of $A(n, \mathbb{F}_q)$ connected with $p$ through the path of length $t$. If we assume that $T_1$ and $T_2$ are sparse (only $O(nt)$ matrix entries are nonzero), then encryption also takes time $O(nt)$.
The transformation $\tilde{E}$ belongs to the group $EA(n, q)$ of polynomial transformations of the vector space $(\mathbb{F}_q)^n$ constructed in terms of the graphs $A(n, q)$ and their projective limit $A(q)$. The cubical transformations of the kind $E$ form the known subgroup $GA(n, \mathbb{F}_q)$ of the group $EA_n(\mathbb{F}_q)$ of bijective transformations of degree bounded by 3, which can serve as a tool of Postquantum Cryptography. It is noteworthy that the composition of two elements of degree 3 from $\mathrm{Aut}\,K[x_1, x_2, \ldots, x_n]$ has degree 9 in the majority of cases. If two cubical transformations are taken from $GA(n, \mathbb{F}_q)$, then their composition is also a cubical map. This unusual property indicates that $GA(n, \mathbb{F}_q)$ can be used as a platform of Noncommutative Cryptography (see [9]). Noncommutative Cryptography is a rapidly growing part of Post Quantum Cryptography (see [10]–[23]). We use the simplest protocols of Noncommutative Cryptography, implement them with the platforms $GA(n, \mathbb{F}_q)$ and combine them with the symmetric ciphers presented above. In the case of the symbiotic combination of the direct twisted Diffie-Hellman protocol with the symmetric cipher we obtain a cryptosystem with the prospect of being used in the postquantum era. All presented algorithms can be modified just by replacing the finite field $\mathbb{F}_q$ with a general finite commutative ring $K$ with unity. The description of the graphs and corresponding groups is given in Sect. 2 together with the description of the symmetric graph based encryption algorithm. Implementations of the direct and inverse twisted protocols with the platform $GA(n, K)$ and their application to the construction of an encryption algorithm are presented in Sect. 3. Section 4 is dedicated to a bridge between the protocol and the symmetric cipher. It uses an extraction procedure of the password of the cipher from the output data of the protocol. Section 5 contains concluding remarks.
2 Graphs and Transformation Groups of Affine Spaces as Cryptographical Tools
The homogeneous algebraic graph $A(n, q) = A(n, \mathbb{F}_q)$ was introduced in [24] as a homomorphic image of the previously known graph $D(n, q)$. In fact, we can consider more general graphs $A(n, K)$ defined over an arbitrary commutative ring $K$. This graph is a bipartite graph with the point set $P = K^n$ and the line set $L = K^n$ (two copies of a Cartesian power of $K$ are used). It is convenient to use parentheses and brackets to distinguish tuples from $P$ and $L$. So $(p) = (p_1, p_2, \ldots, p_n) \in P_n$ and $[l] = [l_1, l_2, \ldots, l_n] \in L_n$. The incidence relation $I = A(n, K)$ (or the corresponding bipartite graph $I$) can be given by the condition that $(p) I [l]$ if and only if the following equations hold:
$$
\begin{aligned}
p_2 - l_2 &= l_1 p_1, && (1)\\
p_3 - l_3 &= p_1 l_2, && (2)\\
p_4 - l_4 &= l_1 p_3, && (3)\\
p_5 - l_5 &= p_1 l_4, && (4)\\
&\;\;\vdots && (5)\\
p_n - l_n &= p_1 l_{n-1} && (6)
\end{aligned}
$$
for odd $n$ and $p_n - l_n = l_1 p_{n-1}$ for even $n$ (see [18]). These graphs were intensively used for the construction of LDPC codes for satellite communications and of cryptographic algorithms (see [25,26]). In the case of $K = \mathbb{F}_q$, $q > 2$, of odd characteristic, the graphs $A(n, \mathbb{F}_q)$, $n \ge 2$, form a family of small world graphs [25] because their diameter is bounded by a linear function in the variable $n$. In fact, we conjecture that the diameter of the graph $A(n, \mathbb{F}_q)$ is bounded by $2n + 2$. Various applications of small world graphs are widely known. The recently discovered bound $\mathrm{girth}(A(n, K)) \ge [(n + 2)/2]$ was obtained in the case of a general integral domain $K$.
We can consider an infinite bipartite graph $A(K)$ with points $(p_1, p_2, \ldots, p_n, \ldots)$ and lines $[l_1, l_2, \ldots, l_n, \ldots]$. If $K$, $|K| > 2$, is an integral domain, then $A(K)$ is a tree and $A(n, K)$, $n = 2, 3, \ldots$, is its algebraic approximation of large girth. We refer to the first coordinates $p_1 = \rho((p))$ and $l_1 = \rho([l])$ as the colors of the vertices of $A(K)$ (or $A(n, K)$). It is easy to check that each vertex $v$ of the graph has a unique neighbor $N_a(v)$ of a selected color $a$. So a walk of length $2k$ from the vertex $(0, 0, \ldots)$ is given by the sequence of colors of its elements $b_1, a_1, b_2, a_2, \ldots, b_k, a_k$. It is a path if $0 \ne a_1$, $a_i \ne a_{i+1}$ and $b_i \ne b_{i+1}$ for $i = 1, 2, \ldots, k - 1$. So we can identify walks of even length from the zero point with sequences of the kind $w = (b_1, a_1, b_2, a_2, \ldots, b_k, a_k)$. Let $w' = (b'_1, a'_1, b'_2, a'_2, \ldots, b'_s, a'_s)$. We define the composition of $w$ and $w'$ as the sequence $u = (b_1, a_1, b_2, a_2, \ldots, b_k, a_k, b'_1 + a_k, a'_1 + a_k, b'_2 + a_k, \ldots, b'_s + a_k, a'_s + a_k)$. If $w$ and $w'$ are paths and $b'_1 + a_k \ne b_k$, then $u$ is also a path. Let $BP(K)$ be the semigroup of all walks with this operation. One can identify the empty string with the unity of $BP(K)$. We use the term branching semigroup for $BP(K)$.
Let us take the graph $A(n, K)$ together with $A(n, K[x_1, x_2, \ldots, x_n])$. For each element $w$ from $BP(K)$ we consider a walk $\Delta(w)$ in $A(n, K[x_1, x_2, \ldots, x_n])$ with the starting point $(x_1, x_2, \ldots, x_n)$, where $x_i$ are generic elements of $K[x_1, x_2, \ldots, x_n]$, and with the special colors of vertices $x_1 + b_1, x_1 + a_1, \ldots, x_1 + b_k, x_1 + a_k$. Let $p = \mathrm{dest}(\Delta(w))$ be the destination, i.e. the final point of this walk. The destination has coordinates $(x_1 + a_k, f_1(x_1, x_2), f_2(x_1, x_2, x_3), \ldots, f_{n-1}(x_1, x_2, \ldots, x_n))$, where $f_i$ are elements of $K[x_1, x_2, \ldots, x_n]$. We consider the transformation ${}^n\eta(w)$ of $P = K^n$ defined by the rule $x_1 \to x_1 + a_k$, $x_2 \to f_1(x_1, x_2)$, $x_3 \to f_2(x_1, x_2, x_3)$, $\ldots$, $x_n \to f_{n-1}(x_1, x_2, \ldots, x_n)$. This transformation is a bijective map of $K^n$ to itself. It is an element of the affine Cremona group $CG(K^n) = \mathrm{Aut}(K[x_1, x_2, \ldots, x_n])$ acting naturally on $K^n$. The inverse of this map is ${}^n\eta(w)^{-1}$, which coincides with ${}^n\eta(w')$ for $w' = \mathrm{Rev}(w) = (b_k - a_k, a_{k-1} - a_k, b_{k-1} - a_k, \ldots, b_1 - a_k, -a_k)$. We refer to $\mathrm{Rev}(w)$ as the reverse string for $w$ from $BP(K)$. The statement below follows directly from the definitions.
Proposition 1 (see [25]). The map ${}^n\eta$ from $BP(K)$ to $CG(K^n)$ is a homomorphism of the semigroup into the group.
We refer to ${}^n\eta$ as the compression map and denote ${}^n\eta(BP(K))$ by $GA(n, K)$. The degree of an element $g$ of the Cremona group $CG(K^n)$ of the kind $x_i \to g_i(x_1, x_2, \ldots, x_n)$ is the maximal degree of the polynomials $g_i$.
Theorem 1 (see [27]). The maximal degree of a multivariate transformation $g$ from $GA(n, K)$ equals 3.
It means that a subgroup $G$ of the kind $T\,GA(n, K)\,T^{-1}$, where $T$ is an element of $AGL_n(K)$, can be used efficiently as a platform for the implementation of protocols of Noncommutative Cryptography. The nonlinear cubical encryption map $E$ of the stream cipher presented above is in fact an element of $GA(n, \mathbb{F}_q)$. So the inverse of the encryption map $G = T_1 E T_2$ is also a cubical map. It means that an adversary is able to conduct costly linearization attacks. This fact means that the standard form of $G$ of the kind $x_i \to f_i(x_1, x_2, \ldots, x_n)$, $i = 1, 2, \ldots, n$, where the cubical $f_i$ are given via their lists of monomial terms, could not serve as a public rule. Alternatively, groups of the kind $G = T\,GA(n, \mathbb{F}_q)\,T^{-1}$ can be used as platforms of protocols of Noncommutative Cryptography. Details are given in the next section. To construct symmetric ciphers of polynomial nature, the generation of transformations of unbounded degree has to be given.
For this purpose we replace the alphabet $K$ in the construction of $BP(K)$ with the commutative ring $K[x]$. The product of elements $(f_1, f_2, \ldots, f_k)$ and $(g_1, g_2, \ldots, g_s)$ of $BP(K[x])$ is the element $(f_1, f_2, \ldots, f_k, g_1(f_k), g_2(f_k), \ldots, g_s(f_k))$. Let $CS_n(K)$ be the semigroup of endomorphisms of $K[x_1, x_2, \ldots, x_n]$. We define a homomorphism ${}^n\eta$ of $BP(K[x])$ into $CS_n(K)$ moving $(f_1, f_2, \ldots, f_k)$ into the transformation which sends the point $(x_1, x_2, \ldots, x_n)$ of $I = A(n, K[x_1, x_2, \ldots, x_n])$ to the destination point $v_k$ of the walk $(x_1, x_2, \ldots, x_n) I v_1 I v_2 I \ldots I v_k$ of the graph $I$, where $\rho(v_i) = f_i$, $i = 1, 2, \ldots, k$. We denote ${}^n\eta(BP(K[x]))$ by $EA(n, K)$.
Lemma 1 (see [24]). Let $n > k$ and $u = (f_1, f_2, \ldots, f_k)$ be an element of $BP(K[x])$. Assume that $\deg(f_1) \ge 1$, $\deg(f_2) \ge 1$, $\deg(f_i - f_{i+2}) \ge 1$ for $i = 1, 2, \ldots, k - 2$. Then the degree of ${}^n\eta(u)$ is at least $k$.
For encryption we use a bijective map of the kind $T_1\,{}^n\eta(u)\,T_2$ with $k = cn^{\alpha}$, $c > 0$, $0 < \alpha < 1$, and select the density of each $f_i$ as $O(1)$.
We use sparse matrices $T_1$ and $T_2$, i.e. matrices with $O(nt)$ nonzero entries. Encryption takes time $O(n^{1+\alpha})$. The method is resistant to linearization attacks because the encryption map is an element of $CS_n(K)$ of degree $\ge cn^{\alpha}$.
Remark 1. Let us estimate the size of the input data of the algorithm. Standard forms of $T_1$ and $T_2$ require $2(n + 1)n$ elements from $K$. We can assume that the degree of each $f_i$ is bounded by $n$. So the string of coefficients of $u$ can be given by a tuple of length $(n + 1)n$. Thus the array of input data can be thought of as a tuple $D$ of length $3(n + 1)n$.
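The constructions of this section can be illustrated numerically. The following is a minimal sketch over a prime field $\mathbb{F}_p$, not the authors' implementation: it evaluates the maps on concrete points instead of computing standard forms, it omits $T_1$ and $T_2$, and all function names are hypothetical. It assumes that the colors $\rho(v_i) = f_i$ are evaluated at $x = x_1$ (which agrees with the colors $x_1 + b_i$ used for $BP(K)$) and, for decryption, that the last color polynomial is linear with an invertible leading coefficient.

```python
P = 127  # prime field F_127, the field used in the reported experiments


def neighbor(v, color, v_is_point, p=P):
    """Unique neighbor of the vertex v whose first coordinate equals `color`."""
    new = [color % p] + [0] * (len(v) - 1)
    for i in range(1, len(v)):
        even_coordinate = (i + 1) % 2 == 0   # index i holds coordinate i + 1
        if v_is_point:   # v = (p), new = [l]: solve the incidence equation for l_i
            prod = new[0] * v[i - 1] if even_coordinate else v[0] * new[i - 1]
            new[i] = (v[i] - prod) % p
        else:            # v = [l], new = (p): solve the incidence equation for p_i
            prod = v[0] * new[i - 1] if even_coordinate else new[0] * v[i - 1]
            new[i] = (v[i] + prod) % p
    return tuple(new)


def walk(colors, x, p=P):
    """Follow the walk that starts at the point x and visits the given colors."""
    v, is_point = tuple(x), True
    for c in colors:
        v = neighbor(v, c, is_point, p)
        is_point = not is_point
    return v


def eta(w, x, p=P):
    """Compression map for w = (b1, a1, ..., bk, ak) in BP(K): colors x1 + c."""
    return walk([x[0] + c for c in w], x, p)


def compose(w, w2):
    """Composition in BP(K): the second string is shifted by the last color of w."""
    shift = w[-1] if w else 0
    return tuple(w) + tuple(c + shift for c in w2)


def rev(w):
    """Reverse string Rev(w) = (bk - ak, a_{k-1} - ak, ..., b1 - ak, -ak)."""
    ak = w[-1]
    return tuple(c - ak for c in reversed(w[:-1])) + (-ak,)


def poly_eval(coeffs, x, p=P):
    """Evaluate c0 + c1*x + c2*x^2 + ... (coefficients low to high) by Horner."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % p
    return acc


def eta_poly(u, x, p=P):
    """Map for u = (f1, ..., fk) in BP(K[x]), k even: colors f_i(x1)."""
    return walk([poly_eval(f, x[0], p) for f in u], x, p)


def eta_poly_inverse(u, y, p=P):
    """Invert eta_poly, assuming the last polynomial is a*x + b with a != 0."""
    b, a = u[-1]
    x1 = (y[0] - b) * pow(a, -1, p) % p       # recover x1 from the last color
    colors = [poly_eval(f, x1, p) for f in reversed(u[:-1])] + [x1]
    return walk(colors, y, p)


x = (1, 2, 3, 4, 5, 6)
w, w2 = (1, 2, 3, 4), (5, 6, 2, 3)
# Hypothetical color polynomials: linear, linear, quadratic, quadratic, linear, linear.
u = ([3, 5], [7, 2], [1, 0, 4], [9, 0, 6], [11, 8], [2, 3])
```

With these definitions, `eta(rev(w), eta(w, x))` returns `x`, `eta(compose(w, w2), x)` agrees with `eta(w2, eta(w, x))` (the homomorphism property of Proposition 1, checked at one point), and `eta_poly_inverse(u, eta_poly(u, x))` returns `x`. Each step of a walk costs $O(n)$ operations, so a walk of length $k$ costs $O(nk)$.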
3 On Some Graph Based Protocols of Noncommutative Cryptography
Let us consider the use of the platform $GA(n, q)$ in Noncommutative Cryptography. We start with two abstract schemes of Noncommutative Cryptography.
3.1 Twisted Diffie-Hellman Protocol
Let $S$ be an abstract semigroup which has some invertible elements. Alice and Bob share an element $g \in S$ and a pair of invertible elements $h$, $h^{-1}$ from this semigroup. Alice takes positive integers $k_A$ and $r_A$ and forms $g_A = h^{-r_A} g^{k_A} h^{r_A}$. Bob takes $k_B$ and $r_B$ and forms $g_B = h^{-r_B} g^{k_B} h^{r_B}$. They exchange $g_A$ and $g_B$ and compute the collision element $X$ as ${}^A g = h^{-r_A} g_B^{k_A} h^{r_A}$ and ${}^B g = h^{-r_B} g_A^{k_B} h^{r_B}$, respectively. Both elements equal $h^{-(r_A + r_B)} g^{k_A k_B} h^{r_A + r_B}$.
3.2 Inverse Twisted Diffie-Hellman Protocol
Let $S$ be a group. The correspondents follow scheme 3.1 with an invertible element $g \in S$: Alice sends $g_A = h^{-r_A} g^{-k_A} h^{r_A}$ to Bob and gets $g_B = h^{-r_B} g^{k_B} h^{r_B}$ from him. They use the same formulae for ${}^A g$ and ${}^B g$, but in the new version these elements are mutual inverses: Alice has $X$ and Bob possesses $X^{-1}$. Both schemes can be implemented with the multivariate platform $S = T\,GA(n, q)\,T^{-1}$. Because the degrees of multivariate transformations $g_1$ and $g_2$ from $G = T\,GA(n, \mathbb{F}_q)\,T^{-1}$ are bounded by 3, their composition costs $O(n^{13})$. It means that algorithms 3.1 and 3.2 can be implemented with the platform $G$. The correspondents can execute 3.1 and share the tuple $T$ of $t = n\binom{n}{3}$ coefficients of $X$ in front of the monomial terms of the kind $x_i x_j x_k$, where $|\{i, j, k\}| = 3$, ordered in lexicographical order. So they can use an additive one time pad with the alphabet $K$ and the space of plaintexts $K^t$. They can present $T$ as a concatenation of tuples of length $n$, work with the space of texts $K^n$ and use $\binom{n}{3}$ different passwords. Let us consider the expansion of protocol 3.1 to a cryptosystem with the trust interval.
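The algebra of schemes 3.1 and 3.2 can be checked in any noncommutative group. The sketch below uses permutations of seven symbols as a toy stand-in for the platform $T\,GA(n, q)\,T^{-1}$; this choice, the concrete elements $g$ and $h$, the exponents and all function names are assumptions made for illustration only and offer no security.

```python
def mul(a, b):
    """Product a*b of permutations given as tuples: apply a first, then b."""
    return tuple(b[i] for i in a)


def inverse(a):
    """Inverse permutation."""
    inv = [0] * len(a)
    for i, image in enumerate(a):
        inv[image] = i
    return tuple(inv)


def power(a, k):
    """k-th power of a for a non-negative integer k (repeated multiplication)."""
    result = tuple(range(len(a)))
    for _ in range(k):
        result = mul(result, a)
    return result


def twist(x, h, r):
    """Return h^{-r} * x * h^{r}."""
    return mul(mul(power(inverse(h), r), x), power(h, r))


# Shared data (toy values): g and an invertible h.
g = (1, 2, 0, 4, 3, 6, 5)
h = (2, 0, 3, 1, 5, 6, 4)
identity = tuple(range(len(g)))

# Private exponents of Alice and Bob (toy values).
k_a, r_a = 5, 3
k_b, r_b = 7, 4

# Scheme 3.1: both sides arrive at the same collision element X.
g_a = twist(power(g, k_a), h, r_a)
g_b = twist(power(g, k_b), h, r_b)
x_alice = twist(power(g_b, k_a), h, r_a)
x_bob = twist(power(g_a, k_b), h, r_b)

# Scheme 3.2: Alice starts from g^{-1}; the two outputs are mutual inverses.
g_a_inv = twist(power(inverse(g), k_a), h, r_a)
y_alice = twist(power(g_b, k_a), h, r_a)       # equals X
y_bob = twist(power(g_a_inv, k_b), h, r_b)     # equals X^{-1}
```

Here `x_alice == x_bob` holds for scheme 3.1, and `mul(y_alice, y_bob)` is the identity permutation for scheme 3.2, as the formulas of the text predict.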
Alice selects transformations $T_1$ and $T_2$ from $AGL_n(K)$ and an element $u$ from $BP(K)$ corresponding to some path in the tree $A(K)$. She computes the element ${}^n\eta(u) = E$ and the standard form $G$ of $T_1 E T_2$, and sends $G + X$ to Bob. He uses $G$ as an encryption tool. Alice decrypts in time $O(n^2)$ because of her knowledge of $T_1$, $E$, $T_2$ and their inverses. The natural question is the following one: how long can the correspondents use the encryption map $G$? Cryptanalytical studies give the following answer. Alice and Bob have to keep the TRUST INTERVAL of size $n^3/2$. It is justified by the investigation of linearization attacks, where the adversary has to intercept $n^3$ pairs of the kind plaintext/ciphertext. So Alice and Bob have to count exchanged messages up to $n^3/2$. If the counter indicates $[n^3/2]$, they have to start a new session of the protocol (see [28]). Protocol 3.2 implemented with the cubical platform $G = T\,GA(n, K)\,T^{-1}$ is in fact a cryptosystem of El Gamal type with the same trust interval as the previous one. We present the computer execution time of the encryption $T_1 E T_2$ in the case of $K = \mathbb{F}_p$, $p = 127$, $k = 50, 100, 1000$, for plaintext sizes of 10 Kb, 20 Kb and 40 Kb in Table 2.
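The bookkeeping required by the trust interval is simple; a minimal sketch follows. The class and method names are hypothetical, and the only rule taken from the text is that a new protocol session is due once $[n^3/2]$ messages have been exchanged.

```python
class TrustInterval:
    """Counts messages encrypted with one map G and signals when to renew it."""

    def __init__(self, n):
        self.limit = n ** 3 // 2   # integer part of n^3 / 2
        self.count = 0

    def record_message(self):
        """Count one exchanged message; return True if a new session is due."""
        self.count += 1
        return self.count >= self.limit

    def start_new_session(self):
        """Reset the counter after the protocol has been run again."""
        self.count = 0


interval = TrustInterval(n=4)                      # limit is 4^3 // 2 = 32
flags = [interval.record_message() for _ in range(32)]
```

In this toy run the first 31 calls return `False` and the 32nd returns `True`, at which point the correspondents would rerun the protocol and call `start_new_session()`.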
4 Extraction of the Key from the Output of Protocol 3.1
Let $K$ be an arbitrary commutative ring and $A = (a_{i,j})$ be a matrix of size $n \times n$ with entries from $K$. For an arbitrary element $a \in K$ we consider $\bar{a}$ such that $\bar{a} = a$ if $a \ne 0$ and $\bar{a} = 1$ in the opposite case. We also introduce $\hat{a}$ such that $\hat{a} = a$ in the case $a \in K^*$ and $\hat{a} = 1$ if $a$ is not an invertible element. For the matrix $A$ we consider the upper triangular matrix
$$
A_1 = \begin{pmatrix}
\hat{a}_{1,1} & a_{1,2} & a_{1,3} & \cdots & a_{1,n} \\
0 & \hat{a}_{2,2} & a_{2,3} & \cdots & a_{2,n} \\
0 & 0 & \hat{a}_{3,3} & \cdots & a_{3,n} \\
\vdots & \vdots & \ddots & \ddots & \vdots \\
0 & 0 & \cdots & 0 & \hat{a}_{n,n}
\end{pmatrix} \qquad (7)
$$
and the lower unitriangular matrix
$$
A_2 = \begin{pmatrix}
1 & 0 & \cdots & 0 & 0 \\
a_{2,1} & 1 & \cdots & 0 & 0 \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
a_{n-1,1} & \cdots & a_{n-1,n-2} & 1 & 0 \\
a_{n,1} & a_{n,2} & \cdots & a_{n,n-1} & 1
\end{pmatrix} \qquad (8)
$$
Let $\tilde{A} = A_1 A_2$. It is noteworthy that $\tilde{A}$ is an invertible matrix. Let us assume that Alice and Bob executed a session of the twisted Diffie-Hellman protocol with the platform $T\,GA(n, K)\,T^{-1}$, where $T \in AGL_n(K)$. Assume that they get the collision map $X$ of the kind $x_i \to f_i(x_1, x_2, \ldots, x_n)$, $i = 1, 2, \ldots, n$, where $f_i$ are cubic elements of $K[x_1, x_2, \ldots, x_n]$.
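The passage from an arbitrary matrix $A$ to an invertible matrix $\tilde{A}$ can be sketched for the special case $K = \mathbb{F}_p$, where every nonzero element is invertible and the two modifications of an element coincide. The code below is an illustration under that assumption; the function names are hypothetical. It builds $A_1$ and $A_2$, multiplies them, and computes the determinant modulo $p$, which should equal the product of the (modified) diagonal entries of $A_1$.

```python
P = 127  # prime modulus; over F_p "nonzero" and "invertible" mean the same


def modified(a, p=P):
    """Keep a if it is nonzero modulo p, otherwise replace it by 1."""
    return a % p if a % p != 0 else 1


def split(A, p=P):
    """Return (A1, A2): upper triangular with modified diagonal, lower unitriangular."""
    n = len(A)
    A1 = [[0] * n for _ in range(n)]
    A2 = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if j > i:
                A1[i][j] = A[i][j] % p
            elif j == i:
                A1[i][j] = modified(A[i][j], p)
                A2[i][j] = 1
            else:
                A2[i][j] = A[i][j] % p
    return A1, A2


def matmul(X, Y, p=P):
    """Matrix product modulo p."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) % p for j in range(n)]
            for i in range(n)]


def det_mod_p(M, p=P):
    """Determinant modulo a prime p by Gaussian elimination."""
    M = [row[:] for row in M]
    n = len(M)
    det = 1
    for c in range(n):
        pivot = next((r for r in range(c, n) if M[r][c] % p), None)
        if pivot is None:
            return 0
        if pivot != c:
            M[c], M[pivot] = M[pivot], M[c]
            det = -det
        det = det * M[c][c] % p
        inv = pow(M[c][c], -1, p)
        for r in range(c + 1, n):
            factor = M[r][c] * inv % p
            for k in range(c, n):
                M[r][k] = (M[r][k] - factor * M[c][k]) % p
    return det % p


A = [[0, 5, 3],
     [2, 0, 7],
     [4, 1, 0]]          # toy matrix with a zero diagonal
A1, A2 = split(A)
A_tilde = matmul(A1, A2)
```

For this toy matrix all diagonal entries are replaced by 1, so `det_mod_p(A_tilde)` is 1 and $\tilde{A}$ is invertible even though the diagonal of $A$ vanishes.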
Assume that $f_i$ contains a linear form ${}^i a = {}^i a_1 x_1 + {}^i a_2 x_2 + \cdots + {}^i a_n x_n$ and a quadratic form ${}^i b = {}^i b_1 x_1^2 + {}^i b_2 x_2^2 + \cdots + {}^i b_n x_n^2$. We identify ${}^i a$ and ${}^i b$ with the tuples $({}^i a_1, {}^i a_2, \ldots, {}^i a_n)$ and $({}^i b_1, {}^i b_2, \ldots, {}^i b_n)$, respectively. So we form the matrices $A = (a_{i,j})$, $a_{i,j} = f_i({}^j a_1, {}^j a_2, \ldots, {}^j a_n)$, and $B = (b_{i,j})$, $b_{i,j} = f_i({}^j b_1, {}^j b_2, \ldots, {}^j b_n)$, together with the tuples $c = ({}^1 a_1, {}^2 a_1, \ldots, {}^n a_n)$ and $d = ({}^1 b_1, {}^2 b_1, \ldots, {}^n b_n)$. It allows us to form the affine transformations $T_1: (x_1, x_2, \ldots, x_n) \to (x_1, x_2, \ldots, x_n)\tilde{A} + c$ and $T_2: (x_1, x_2, \ldots, x_n) \to (x_1, x_2, \ldots, x_n)\tilde{B} + d$. Let $a = {}^1 a = (a_1, a_2, \ldots, a_n)$ and $b = {}^1 b = (b_1, b_2, \ldots, b_n)$. Assume that the parameter $n \equiv 2 \pmod 4$. We introduce $u = (g_1(x), g_2(x), \ldots, g_n(x))$ in the following way:
$$
\begin{aligned}
g_n(x) &= a_n x + b_n, && (9)\\
g_{n-1}(x) &= a_{n-1} x + b_{n-1}, && (10)\\
g_{n-2}(x) &= a_{n-2} x^2 + b_{n-2}, && (11)\\
g_{n-3}(x) &= a_{n-3} x^2 + b_{n-3}, && (12)\\
&\;\;\vdots && (13)\\
g_2(x) &= a_2 x + b_2, && (14)\\
g_1(x) &= a_1 x + b_1. && (15)
\end{aligned}
$$
Alice and Bob work with the triple $T_1$, $\tilde{E} = {}^n\eta(u)$ and $T_2$. Encryption of a plaintext from $K^n$ is a consecutive application of $T_1$, $\tilde{E}$ and $T_2$ to the tuple. It takes time $O(n^2)$. The encryption process has a multivariate nature. It is given by a polynomial transformation of degree $2n$. The security rests on the security of protocol 3.1 of Noncommutative Cryptography.
Table 1. Girth and Diam for $A(n, \mathbb{F}_3)$
n      4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27
Girth  8 12 12 12 12 16 16 16 16 20 20 20 20 24 24 24 26 30 30 32 32 36 36 36
Diam   8 12 12 16 16 20 20  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -
Table 2. The Algorithm's Runtime
File size   10 Kb     20 Kb     40 Kb
k=50        1.75 s    3.41 s    6.72 s
k=100       3.31 s    6.74 s    12.8 s
k=1000      32.72 s   65.21 s   125.1 s
5 Conclusion
So the symbiotic combination of the twisted Diffie-Hellman algorithm with the proposed new symmetric cipher forms a cryptosystem whose security rests on the Conjugacy Power Problem for a group of multivariate transformations written in their standard forms. In this paper we presented illustrations of the potential of tree approximations in Post Quantum Cryptography. The presented new symmetric cipher can be combined with other protocols. One of them [9] also uses tree approximation, but its security rests on the word problem in the group $T\,GA(n, K)\,T^{-1}$ presented by generators written in the standard form of Multivariate Cryptography. The usage of other platforms is also possible.
Funding Information. This research is partially supported by British Academy Fellowship for Researchers under Risk 2022.
References
1. Post-Quantum Cryptography: Call for Proposals. https://csrc.nist.gov/Project; Post-Quantum-Cryptography-Standardization/Call-for-Proposals. Post-Quantum Cryptography: Round 2 and Round 3 Submissions
2. Grzesik, A., Král', D., Lovász, L.M.: Elusive extremal graphs, preprint. arXiv:1807.01141 (2018)
3. Hoory, S., Linial, N., Wigderson, A.: Expander graphs and their applications. Bull. Amer. Math. Soc. 43, 439–561 (2006)
4. Ustimenko, V., Romanczuk-Polubiec, U., Wroblewska, A., Polak, M., Zhupa, E.: On the constructions of new symmetric ciphers based on non-bijective multivariate maps of prescribed degree. Secur. Commun. Netw. 2019, Article ID 2137561, 15 p. https://doi.org/10.1155/2019/2137561
5. Ustimenko, V.: On new results of extremal graph theory and postquantum cryptography. In: International Algebraic Conference "At the End of the Year 2021", 27–28 December 2021, Kyiv, Ukraine, ABSTRACTS, p. 29 (2021)
6. Ustimenko, V.: On new results on extremal graph theory, theory of algebraic graphs and their applications in cryptography and coding theory. Cryptology ePrint Archive, 2022/296
7. Ustimenko, V.: On new results on extremal graph theory, theory of algebraic graphs and their applications in cryptography and coding theory, Dopovidi Nath Acad of Ukraine, 2022, N4 (to appear)
8. Klisowski, M., Ustimenko, V.A.: On the comparison of cryptographical properties of two different families of graphs with large cycle indicator. Math. Comput. Sci. 6(2), 181–198 (2012)
9. Ustimenko, V., Klisowski, M.: On Noncommutative cryptography with cubical multivariate maps of predictable density. In: "Intelligent Computing", Proceedings of the 2019 Computing Conference, Vol. 2, Part of Advances in Intelligent Systems and Computing (AISC, volume 998), pp. 654–674 (2019)
10. Moldovyan, D.N., Moldovyan, N.A.: A new hard problem over non-commutative finite groups for cryptographic protocols. In: International Conference on Mathematical Methods, Models, and Architectures for Computer Network Security, MMM-ACNS 2010: Computer Network Security, pp. 183–194
11. Sakalauskas, L., Tvarijonas, P., Raulynaitis, A.: Key agreement protocol (KAP) using conjugacy and discrete logarithm problem in group representation level. Informatica 18(1), 115–124 (2007)
12. Shpilrain, V., Ushakov, A.: The conjugacy search problem in public key cryptography: unnecessary and insufficient. Appl. Algebra Eng. Commun. Comput. 17(3–4), 285–289 (2006)
13. Kahrobaei, D., Khan, B.: A non-commutative generalization of El Gamal key exchange using polycyclic groups. In: IEEE GLOBECOM 2006 - 2006 Global Telecommunications Conference [4150920]. https://doi.org/10.1109/GLOCOM.2006
14. Myasnikov, A., Shpilrain, V., Ushakov, A.: Group-Based Cryptography. Birkhäuser Verlag, Berlin (2008). https://doi.org/10.1007/978-3-7643-8827-0
15. Myasnikov, A.G., Shpilrain, V., Ushakov, A.: Noncommutative Cryptography and Complexity of Group-theoretic Problems. American Mathematical Society (2011)
16. Cao, Z.: New Directions of Modern Cryptography. CRC Press, Taylor & Francis Group, Boca Raton (2012). ISBN: 978-1-4665-0140-9
17. Maze, G., Monico, C., Rosenthal, J.: Public key cryptography based on semigroup actions. Adv. Math. Commun. 1(4), 489–507 (2007)
18. Kropholler, P.H., Pride, S.J., Othman, W.A.M., Wong, K.B., Wong, P.C.: Properties of certain semigroups and their potential as platforms for cryptosystems. Semigroup Forum 81, 172–186 (2010)
19. Lopez Ramos, J.A., Rosenthal, J., Schipani, D., Schnyder, R.: Group key management based on semigroup actions. J. Algebra Appl. 16(08), 1750148 (2017)
20. Kumar, G., Saini, H.: Novel noncommutative cryptography scheme using extra special group. Secur. Commun. Netw. 2017, 9036382, 21 p. https://doi.org/10.1155/2017/9036382
21. Roman’kov, V.: An improved version of the AAG cryptographic protocol. Groups Complex. Cryptol. 11(1), 35–42 (2019)
22. Ben-Zvi, A., Kalka, A., Tsaban, B.: Cryptanalysis via algebraic span. In: Shacham, H., Boldyreva, A. (eds.) Advances in Cryptology - CRYPTO 2018: 38th Annual International Cryptology Conference, Santa Barbara, CA, USA, August 19–23, 2018, Proceedings, Part I, vol. 10991, pp. 255–274. Springer, Cham (2018)
23. Tsaban, B.: Polynomial-time solutions of computational problems in noncommutative-algebraic cryptography. J. Cryptol. 28(3), 601–622 (2015)
24. Ustimenko, V.: Linguistic dynamical systems. Graphs of large girth and cryptography. J. Math. Sci. 140(3), 412–434 (2007). https://doi.org/10.1007/s10958-007-0453-2
25. Ustimenko, V.: On the extremal graph theory and symbolic computations. Dopovidi Natl. Acad. Sci. Ukr. 2, 42–49 (2013)
26. Polak, M., Ustimenko, V.A.: On LDPC codes corresponding to infinite family of graphs A(k, K). In: Proceedings of the Federated Conference on Computer Science and Information Systems (FedCSIS), 2012, CANA, Wroclaw, pp. 11–23 (2012)
27. Ustimenko, V., Wroblewska, A.: On the key exchange with nonlinear polynomial maps of stable degree. Annales UMCS Informatica AI XI(2), 81–93 (2011)
28. Ustimenko, V.: On new symbolic key exchange protocols and cryptosystems based on hidden tame homomorphism. Dopovidi NAS of Ukraine 10, 26–36 (2018)
Improving AFL++ CmpLog: Tackling the Bottlenecks

Sander J. Wiebing (1, 2), Thomas Rooijakkers (1, corresponding author), and Sebastiaan Tesink (1)

(1) The Netherlands Organisation for Applied Scientific Research (TNO), Cyber Security Technologies, The Hague, The Netherlands
[email protected]
(2) Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
Abstract. The performance of the AFL++ CmpLog feature varies considerably between programs under test (PUTs). This paper demonstrates that the main causes of the poor performance are low seed entropy and a lack of deduplication of magic byte candidates. An improvement is proposed that maps comparisons to input bytes, in order to track which comparisons are controlled by which input bytes. This mapping is then used to fuzz only the comparison values that are magic byte candidates for that input part. Second, a caching mechanism is introduced to reduce the number of redundant executions. The evaluation of the improved versions shows a significant coverage gain compared to the original AFL++ implementation of CmpLog for all PUTs, without breaking functionality. The solution proposed in this paper provides a solid basis for a redesign of CmpLog.
Keywords: AFL++ · Fuzzing · CmpLog · RedQueen

1 Introduction
Fuzzing, or fuzz-testing, is a software testing technique that finds bugs and vulnerabilities by feeding a program under test (PUT) with (randomly) mutated input. Over the past decades, many different fuzzers [1] and fuzzing techniques have been developed to fuzz faster, increase code coverage, or bypass specific challenges in a PUT. Fioraldi et al. took several fuzzing features from previous research and combined them into AFL++ [2], an open source community-driven fuzzer based on AFL [3]. AFL++ enabled researchers to easily integrate their research into a single fuzzer, while making it possible to fuzz binaries with a customized configuration by switching fuzzing features on or off. As a result, however, the introduction of some of these features has led to strongly varying performance. The best performing configuration differs per PUT [4], since certain characteristics of the PUT may require different features to be enabled for the optimal fuzzer configuration. This paper focuses on the CmpLog feature of AFL++: benchmarks indicate that this feature is promising for most binaries, but it will be demonstrated that it can result in a dramatic performance drop for others. Based on this research, either the implementation of CmpLog could be improved, or CmpLog could be disabled when poor performance is predicted based on characteristics of the PUT.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023. K. Arai (Ed.): SAI 2023, LNNS 739, pp. 1419–1437, 2023. https://doi.org/10.1007/978-3-031-37963-5_96

1.1 Structure
This paper starts by introducing the relevant background in Sect. 2, followed by a first exploration of fuzzing challenges in Sect. 3. Section 4 provides an in-depth analysis of the issues observed in Sect. 3. In Sect. 5, improvements that address the discovered limitations and drawbacks are proposed, followed by an evaluation of these improvements in Sect. 6. Finally, the paper gives suggestions for future work (Sect. 7) and a conclusion (Sect. 8).
2 Background: Bypassing Magic Bytes
In order for the fuzzer to find and trigger a bug, the vulnerable code needs to be reached. Software binaries often contain checks validating whether parts of the input are equal to preset byte ranges. Those heavily constrained input bytes are often referred to as magic bytes. The probability of discovering these magic bytes via random mutations is very low. Therefore, magic bytes form a strong roadblock for fuzzers attempting to gain deeper code coverage, and are one of the primary reasons why fuzzers fail to reach and discover deeply nested bugs and vulnerabilities.

The authors of RedQueen [5] observed that many of these ‘magic byte’ checks are directly correlated with the input. Based on this observation, they developed an inexpensive technique to detect magic bytes and move them to the right location in the input, a technique referred to as Input-To-State (I2S). Based on the improvements demonstrated by RedQueen, CmpLog was implemented in AFL++. CmpLog further boosts the performance of RedQueen by introducing a shared table containing the operands of the last 256 executions of every comparison. Subsequently, during the colourization stage, CmpLog colourizes the input: it replaces random bytes with random values. This colourization process enables CmpLog to identify I2S comparisons via the shared table and to replace the corresponding bytes in the input. Additionally, CmpLog contains transformation and arithmetic features. When these are enabled, different transformations and arithmetic operations are also applied before the replacement is made. This method tries to bypass I2S comparisons in case the input is slightly modified before the comparison takes place.

A more accurate, but very expensive, method for bypassing magic bytes is taint analysis. Taint analysis tracks the input through the entire program and therefore knows which operations are performed on the input bytes and whether they are compared to other bytes. VUzzer and Angora use this approach [6,7]. Although this technique works well, especially for binaries with a lot of hard comparisons, it also causes a significant drop in executions per second compared to mutation-based fuzzers.

Symbolic execution is a third approach for tackling the magic byte challenge. With symbolic execution, the program is executed abstractly by building constraints on the input to reach a certain part of the binary. Symbolic execution is a good way to bypass magic bytes when the application is not too complex. However, state explosion is a big problem in larger binaries, because all possible paths have to be tracked. Hence, it is not feasible to fuzz purely with symbolic execution. Concolic execution is a variant that scales better, since it only follows one path and can concretize some variables in the process. Several fuzzing approaches use this method to compute magic bytes [8–10]. While Driller [8] switches to concolic execution when the mutation-based fuzzer gets ‘stuck’, more recent research proposes to run the concolic execution engine in parallel with the fuzzer [9,10]. Since taint analysis and symbolic execution are expensive fuzzing techniques with regard to the number of executions being performed and memory usage, respectively, the focus was drawn to the implementation of CmpLog, which will be explored in more detail.
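As an illustration of the input-to-state idea described in this section, the following C sketch overwrites every occurrence of an observed comparison operand in the input with the value it was compared against. It is a minimal sketch under stated assumptions, not the RedQueen or AFL++ implementation: the function name `i2s_substitute` and its parameters are hypothetical, operands are assumed to appear verbatim in the input (no encodings or arithmetic), and all occurrences are replaced at once, whereas a fuzzer would typically try candidate positions separately.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper illustrating input-to-state (I2S) substitution.
 *   observed: comparison operand seen at run time, assumed to be copied
 *             unmodified from the input
 *   wanted:   the other operand of the comparison (the magic bytes)
 * Every non-overlapping occurrence of `observed` in `input` is overwritten
 * with `wanted`. Returns the number of substitutions performed.
 */
size_t i2s_substitute(unsigned char *input, size_t input_len,
                      const unsigned char *observed,
                      const unsigned char *wanted, size_t operand_len) {
  assert(input != NULL && observed != NULL && wanted != NULL);

  size_t hits = 0;
  if (operand_len == 0 || operand_len > input_len) return 0;

  size_t pos = 0;
  while (pos + operand_len <= input_len) {
    if (memcmp(input + pos, observed, operand_len) == 0) {
      /* The input bytes at `pos` flow directly into the comparison,
         so placing the expected value here should satisfy the check. */
      memcpy(input + pos, wanted, operand_len);
      hits++;
      pos += operand_len; /* skip past the bytes just written */
    } else {
      pos++;
    }
  }
  return hits;
}
```

For example, if the target compares four input bytes `ABCD` against the constant `WXYZ`, calling the function with those two operands rewrites each `ABCD` in the input to `WXYZ`, so that the comparison can succeed on the next execution.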
3 Exploration
The authors of [4] demonstrated that different fuzzer configurations are optimal for different binaries. For benchmarking, FuzzBench, an open fuzzer benchmarking platform and service [11], is used throughout this paper. The binaries with the most deviating performance (footnote 1) are included in Table 1. While analysing these results, two binaries stood out, namely the bloaty_fuzz_target and libpcap_fuzz_both target binaries. AFL++ performs relatively well on the libpcap_fuzz_both target, just like libFuzzer [12]. libpcap_fuzz_both probably contains a lot of ‘magic byte’ checks, as the default AFL++ configuration has CmpLog enabled and libFuzzer contains a similar method to bypass those bytes. The AFL++ paper [4] confirms this observation; the AFL++ configuration where CmpLog is enabled outperforms all others. The test with the bloaty_fuzz_target demonstrates inverted results; AFL++ with CmpLog enabled shows a significant drop in performance.

The FuzzBench repository also contains an AFL++ optimal configuration for the various benchmark targets (footnote 2). These hand-crafted configurations are, however, determined manually on a trial-and-error basis. During the experiments it was observed that, although the total coverage was comparable at the end of the benchmarks (24 h), there was a significant difference in the pace of reaching this coverage, as illustrated by the graphs (footnote 3). For the bloaty_fuzz_target, however, there was a large difference; the optimal configuration, with MOpt enabled but without CmpLog (see Table 2), strongly outperformed the AFL++ default configuration. These results are in line with the earlier experiment of the original AFL++ paper [2], where both AFL++ with CmpLog and AFL++ with CmpLog and MOpt performed poorly, whereas the configurations without CmpLog performed relatively well. MOpt [13] is a mutation scheduling technique which assigns probabilities to the mutation operators for finding a new path. Therefore, it is likely that the performance hit on bloaty_fuzz_target is caused by the CmpLog functionality.

Footnote 1: https://www.fuzzbench.com/reports/2022-04-19/index.html
Footnote 2: https://github.com/google/fuzzbench/blob/93d4182f1d5f420cd07224d727288ea4107beeee/fuzzers/aflplusplus_optimal/fuzzer.py
Footnote 3: https://www.fuzzbench.com/reports/2022-04-19/index.html

Table 1. FuzzBench 2022-04-19 code-coverage results. (bold text = relative high code coverage or fastest gain, red = relative low code coverage)

Benchmark               AFL [3]   AFLFast [14]   libFuzzer [12]   AFL++ [2]
openthread-2018-02-27   6208      5549           5957             7145
libpng-1-2.56           1944      1942           1995             2089
vorbis 2017-12-11       2167      2161           1928             2172
bloaty_fuzz_target      8278      8008           7051             6466
harfbuzz                8407      8361           8334             8715
libpcap_fuzz_both       98        95             3461             4331
zlib_zlib               961       944            975              963

Table 2. AFL++ optimal configuration compared to the default configuration [https://github.com/google/fuzzbench/blob/93d4182f1d5f420cd07224d727288ea4107beeee/fuzzers/aflplusplus_optimal/fuzzer.py].

Benchmark               AFL++   AFL++ Opt   Build config Opt             Run config Opt
openthread-2018-02-27   6849    6736        Default (LTO, CmpLog)        CmpLog arithmetic; Keep timeouts
libpng-1-2.56           2091    2122        Default (LTO, CmpLog)        CmpLog transformations & arithmetic; Keep timeouts
vorbis 2017-12-11       2173    2180        LTO, LAF                     Test cache size 50
bloaty_fuzz             6553    8767        LTO                          MOpt immediately
harfbuzz                8721    8654        CmpLog, Dict2File, Tracepc   Keep timeouts
libpcap                 4444    4559        LTO, CmpLog
zlib_zlib               967     967         CmpLog, Dict2File, Tracepc   CmpLog transformations, Test cache size 50
4 Analysis
Based on the previous section, it was hypothesized that the CmpLog feature of AFL++ was the cause of the lower code coverage on the bloaty_fuzz_target, compared to other AFL++ configurations. In this section, a root cause analysis of this performance drop is performed using the fuzzing results of ‘AFLplusplus’ (default configuration) and ‘aflplusplus_optimal’ from the FuzzBench 2022-04-22aflpp experiment (see footnote 3).

Bloaty. The bloaty binary is a size profiler for binaries. It is able to compute the size of different data sources (segments, sections, symbols, compile-units, inlines and ar-members) and supports three binary formats (ELF, Mach-O, and WebAssembly).

4.1 Code Coverage
As a starting point, the coverage mapping of the two configurations is compared. The branch coverage, excluding shared libraries, is included in Table 3. At first sight, AFL++ performs just as well as the optimal variant for all ELF binaries; the biggest coverage difference is observed for binaries in the Mach-O file format and, to a lesser extent, for WebAssembly binaries. The demangle code is used by these three binary handlers, so its low coverage could be a result of low coverage on the binary handlers. An investigation of the individual uncovered branches of the Mach-O handler revealed no comparisons that could be disruptive or impossible to solve for CmpLog. Furthermore, the coverage report demonstrated that the coverage difference between the ELF handler and the Mach-O handler was mainly caused by the set of initial seeds. Out of the 94 initial seeds, only a single seed was a Mach-O formatted file. Thus, it cannot be concluded that CmpLog performs properly on ELF, since the initial seeds already bypass the magic bytes.

Table 3. Branch coverage reached by AFL++, AFL++ Optimal and the coverage with the initial seeds (red/yellow colored: below/above 80%). Shared libraries are excluded from the table, but included in the total calculation.

File             AFL++    AFL++ Optimal   Initial Seeds
bloaty.cc        33.23%   33.23%          31.17%
bloaty.h         87.50%   87.50%          62.50%
demangle.cc      47.01%   90.92%          1.37%
disassemble.cc   17.54%   17.54%          16.67%
dwarf.cc         92.54%   94.96%          71.02%
elf.cc           91.09%   91.09%          87.74%
macho.cc         62.15%   88.55%          29.38%
range_map.cc     86.89%   86.89%          80.33%
range_map.h      54.55%   54.55%          77.42%
webassembly.cc   77.91%   82.56%          34.62%
Total:           7.22%    8.76%           4.73%

4.2 Statistics
Since the coverage mapping did not provide a clear indication of why CmpLog reaches a relatively low coverage, the poor performance may be caused by ‘slow’ performance of the CmpLog feature. To validate this hypothesis, some statistics of the trials were produced. Looking at the average statistics of the trials (Table 4), it stands out that AFL++ did not complete any cycle, while the optimal version completed about 31. AFL++ completes a cycle when it reaches the end of the seed queue. Seeds newly found during the fuzzing process are added at the back of the queue, so they can be fuzzed during the same round. Note that not all queue items are always fuzzed; more information about the scheduling can be found in the AFL++ documentation [15]. While the execution speed of the optimal configuration is almost double, this cannot be the sole cause; in that scenario one would expect AFL++’s default configuration to have completed at least a few cycles. It seems that the CmpLog feature is generating more executions per seed.

Table 5 displays the maximum and average time and number of executions spent per seed, ranked by time spent. This confirms that CmpLog generates a lot more executions per seed; on average, the maximum time spent on a single seed exceeds 3 h. The average time for the top 15 seeds with CmpLog enabled is 26 times higher. This explains why AFL++’s default configuration did not complete any cycle in 24 h. It is important to note that not all seeds take that long; some seeds are processed in less than a minute.

To investigate the cause of this ‘execution explosion’, one of the slowest seeds was analysed manually, in order to determine which part of the bloaty target contributed to this explosion. It turned out that the function cs_disasm_iter(), from the shared library Capstone [16], which disassembles the binary, had a large share in the creation of many of the executions. In particular, the print function (printInstruction()) was a large contributor to the execution explosion, as it contains a huge switch statement to find the corresponding ASCII character for the current instruction. The seeds that took at least 1 h to be processed are all ELF binaries, since Bloaty currently only supports disassembling for ELF, and are at least 10 kB in size.

Table 4. Average statistics for the trials of AFL++ (trials = 18) and AFL++ Optimal (trials = 20).

Statistic        AFL++         AFL++ Optimal   Ratio
execs_done       198376360.6   388016450.6     1.96
execs_per_sec    2396.4        4687.55         1.96
cycles_done      0             31.45           -
corpus_count     1581.85       6708            4.24
corpus_favored   379.7         1033.6          2.72

Table 5. Average time and number of executions per seed for the top 15 seeds, information retrieved from the plot_data files of AFL++ (trials = 18) and AFL++ Optimal (trials = 20).

                 AFL++     AFL++ Optimal
Avg time         53 min    2 min
Max time         199 min   13 min
Avg Executions   7633 k    425 k
Max Executions   22742 k   3395 k

4.3 Implementation
To gain better insight into the execution explosion, the code of the CmpLog implementation was analysed in more detail. Listing 1.1 shows a simplification of the CmpLog code with the loops as a basis.

The first step of the CmpLog stage is the colourization phase. During colourization, every byte of the input is replaced by a different byte of the same type; a series of replaced bytes is called a taint region. Initially, the complete input is taken as one taint region, and this colourized input is passed to the target program. When the resulting hash of the execution path differs from the hash of the original execution path, the taint region is cut in half and both halves are processed separately; otherwise the region is saved. The result of the colourization is a colourized input in which the bytes in every taint region are replaced by a different byte of the same type, while the hash of the execution path is equal to that of the original input. For the construction of the hash, the ‘trace_bits’ and ‘map_size’ are used.

Two types of comparisons exist: immediate values (INS) and values referenced by a pointer (RTN). Since it is not possible to know the length of the latter type, a length of 31 is copied to the comparison log by default. For INS the last 32 hits are logged, for RTN only the last 8 hits. The handling of the types INS and RTN is similar; for simplicity, only the RTN type is discussed here. CmpLog runs the target with the original input and the colourized input and saves both shared tables (containing the logged comparisons). The next step is to loop over every logged hit in each comparison (see Listing 1.1, line 16), and within that loop over every taint byte. In this nested loop, the function rtn_extend_encoding() is called twice (lines 24 and 28). During the first evaluation it is assumed that the right operand is the value controlled by the input (pattern) and the left operand the magic bytes (replication). During the second evaluation, the roles of the operands are switched. The rtn_extend_encoding() function writes the first byte of the replication value into the input and executes the target with this modified input. This continues byte by byte until the end of the taint length is reached, and only as long as the pattern (the value on the other side of the comparison) is equal to the input byte that is about to be replaced. The idea behind this is that if the operand value has an I2S relation with that input byte, the values should be equal.
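The halving procedure of the colourization phase described above can be sketched in C as follows. This is a simplified illustration under stated assumptions rather than the AFL++ code: the names `colourize`, `path_hash_fn`, `taint_region` and `toy_path_hash` are hypothetical, the "different byte of the same type" replacement is approximated by an XOR with a constant (which always yields a different byte and is trivially reversible), and regions are processed recursively from left to right.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback: runs the target on `buf` and returns a hash of the
   execution path (AFL++ derives this from trace_bits and map_size). */
typedef uint64_t (*path_hash_fn)(const unsigned char *buf, size_t len);

typedef struct {
  size_t start; /* first colourized byte */
  size_t len;   /* number of colourized bytes */
} taint_region;

/* Stand-in for "replace by a different byte of the same type". */
static unsigned char recolour(unsigned char b) {
  return (unsigned char)(b ^ 0x55u);
}

static void recolour_range(unsigned char *buf, size_t start, size_t n) {
  for (size_t i = start; i < start + n; i++) buf[i] = recolour(buf[i]);
}

/* Colourize buf[start, start + n). If the path hash changes, undo the change
   and retry on both halves. Returns the updated number of regions found. */
static size_t colourize_range(unsigned char *buf, size_t len, size_t start,
                              size_t n, uint64_t want, path_hash_fn run,
                              taint_region *out, size_t out_cap,
                              size_t found) {
  if (n == 0) return found;

  recolour_range(buf, start, n);
  if (run(buf, len) == want) { /* same path: keep region colourized */
    if (found < out_cap) {
      out[found].start = start;
      out[found].len = n;
    }
    return found + 1;
  }

  recolour_range(buf, start, n); /* XOR is its own inverse: undo */
  if (n == 1) return found;      /* this single byte steers the path */

  size_t half = n / 2;
  found = colourize_range(buf, len, start, half, want, run, out, out_cap,
                          found);
  return colourize_range(buf, len, start + half, n - half, want, run, out,
                         out_cap, found);
}

/* Colourizes `buf` in place. Returns the number of taint regions found;
   only the first `out_cap` of them are stored in `out`. */
size_t colourize(unsigned char *buf, size_t len, path_hash_fn run,
                 taint_region *out, size_t out_cap) {
  assert(buf != NULL && run != NULL && out != NULL);
  uint64_t want = run(buf, len); /* hash of the original execution path */
  return colourize_range(buf, len, 0, len, want, run, out, out_cap, 0);
}

/* Toy target for demonstration: the path only depends on a 2-byte magic. */
uint64_t toy_path_hash(const unsigned char *buf, size_t len) {
  return (len >= 2 && buf[0] == 'M' && buf[1] == 'Z') ? 1u : 0u;
}
```

With the toy target, the two magic bytes cannot be colourized without changing the path, so they stay untouched, while the remaining bytes end up in taint regions. This mirrors the observation made for the disassembler code in this section: bytes that do not influence the path hash remain taint bytes, however many of them there are.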
For the INS comparisons, this has to hold both for the original input and the colourized input. For the RTN type, this only has to hold for either of them. Note that this I2S check is only performed when transformations are disabled, since the check no longer holds when transformations are assumed between the input and the comparisons. With transformations enabled, the target is always executed for the complete taint_len. To summarize, the equation below gives the number of executions for one seed:

$$\text{Executions per seed} = \text{total\_loggeds} \times \text{total\_taint\_bytes} \times (\text{taint\_len where } \text{pattern}[i] = \text{buf}[i])$$

Seed 27. A more in-depth analysis is performed of seed 27, one of the seeds susceptible to the execution explosion problem. Comparing runs with the disassemble function disabled and enabled, the number of INS comparisons increases by almost a factor of 3 (282 → 743) when it is enabled, while the number of taint bytes stays roughly equal (around 10.1k out of the 12.6k input bytes). These two factors seem to have a huge effect on the number of executions, since for every logged comparison hit CmpLog loops through every taint byte. The observation that the number of taint bytes does not decrease can be explained by the characteristics of the disassemble function. It contains a huge switch statement in which every byte is translated to an ASCII character to represent assembly. Changing these bytes will result in different characters but will not change the execution path. However, since transformations are not enabled for bloaty_fuzz_target, the condition that the pattern has to be equal to the input before it is executed should reduce the number of executions, provided the entropy of the pattern is high enough.

Table 6. Examples of operand values of RTN comparisons with the number of produced executions; shown is either the left or the right operand.

#   Type   Length   Executions   Operand
1   RTN    31       28           Colorized o0: 388400c4127f0000388400c4127f0000308400c4127f0000202d00c4127f00
                                 Original o0:  786800c4127f0000786800c4127f0000706800c4127f0000803500c4127f00
2   RTN    31       42421        Colorized o1: 000000000000000018760048ec7f0000900d0048ec7f0000f84e4988655500
                                 Original o1:  0000000000000000687a0048ec7f0000c0590048ec7f000038dc4988655500
3   RTN    31       84487        Colorized o0: 000000000000000000000000000000000000000000000000800f0048ec7f00
                                 Original o0:  000000000000000000000000000000000000000000000000900d0048ec7f00

4.4 Patterns Entropy
The number of executions is influenced by the number of logged comparisons multiplied by the number of taint bytes. Nevertheless, unless transformations are enabled, an input in which the pattern is replaced by the replication is only executed when the pattern was equal to the input at that location. Thus, as long as the input or the pattern has a high entropy, there are only a few places where the replication can be inserted. To understand the cause of the high number of executions, the number of times the target is executed for each logged comparison was tracked. Table 6 shows three examples of RTN (pointer) operands; only the left or right operand is included, since the replicated value has no influence on whether an input is executed. The rtn_extend_encoding() function replaces the input byte by byte with the replication value and executes the result each time, until the condition is no longer met, that is, until the byte in the colourized pattern and the byte in the original pattern both differ from the byte to be replaced in the colourized input and the original input, respectively. The two operands of example #1 produced only 28 executions: there were only a few places where one or more bytes could be inserted. This is not unexpected, since the starting bytes have a lot of entropy. It is different for examples #2 and #3. Both operands of #2 start with 8 zero bytes, resulting in more than 42k executions. This is more than the input length (12.6k) because, for each input location, the replicated value is inserted byte by byte until it is completely inserted or until the condition no longer holds. Note that since it is an OR condition, it suffices that either the original input or the colourized input contains a 0. The operands of example #3 start with 24 zero bytes and produce over 84k executions; this is only one-half of one logged hit of a comparison. If there are too many of these cases, this leads to an execution explosion.
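The effect of low-entropy patterns on the number of executions can be made concrete with a small C sketch that counts how often the loop of rtn_extend_encoding() would execute the target for one operand pair. It is an illustrative model, not the AFL++ code: the function name `count_rtn_executions` is hypothetical, a single taint region covering the whole input is assumed, and it is assumed that the input is restored after each start position, so the bytes compared are always the unmodified ones.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Counts the target executions triggered for one logged RTN operand.
 *   buf / orig_buf:      colourized and original input (both input_len long)
 *   pattern / o_pattern: operand logged for the colourized and the original
 *                        run (both pattern_len long)
 * For every start position the replication would be written byte by byte,
 * with one execution per byte, until the pattern byte matches neither the
 * colourized nor the original input byte (the break condition of the loop
 * in Listing 1.1).
 */
size_t count_rtn_executions(const unsigned char *buf,
                            const unsigned char *orig_buf, size_t input_len,
                            const unsigned char *pattern,
                            const unsigned char *o_pattern,
                            size_t pattern_len) {
  assert(buf != NULL && orig_buf != NULL);
  assert(pattern != NULL && o_pattern != NULL);

  size_t execs = 0;
  for (size_t idx = 0; idx < input_len; idx++) {
    for (size_t i = 0; i < pattern_len && idx + i < input_len; i++) {
      if (pattern[i] != buf[idx + i] && o_pattern[i] != orig_buf[idx + i])
        break;
      execs++; /* stands in for: buf[idx + i] = repl[i]; its_fuzz(buf); */
    }
  }
  return execs;
}
```

A pattern whose first byte rarely occurs in the input stops at `i = 0` for almost every position and therefore costs close to nothing. A pattern with a long run of leading zero bytes, evaluated against an input that itself contains many zero bytes, passes the check several times at many positions, which is the behaviour seen in examples #2 and #3 of Table 6.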
The other type of comparison is the immediate value (INS); four examples of operands with the number of produced executions are shown in Table 7. The direct values of the immediate operands do not cause that many executions; it is the swapped variants of the operands that do. The variants are
Listing 1.1. Simplified CmpLog code.
 1  void rtn_extend_encoding(idx) {
 2    for (i = 0; i < taint_len; ++i) {
 3      if ((pattern[i] != buf[idx + i] &&
 4           o_pattern[i] != orig_buf[idx + i])) {
 5        break;
 6      }
 7      buf[idx + i] = repl[i];
 8      its_fuzz(buf);
 9    }
10  }
11 12
void
r t n f u z z (CMP) { ( l o g g e d s i n CMP) { for ( taint in taints ) { f o r ( i d x = t a i n t . s t a r t ; i d x ++; i d x