Lecture Notes in Networks and Systems 593
Salim Chikhi · Gregorio Diaz-Descalzo · Abdelmalek Amine · Allaoua Chaoui · Djamel Eddine Saidouni · Mohamed Khireddine Kholladi Editors
Modelling and Implementation of Complex Systems Proceedings of the 7th International Symposium, MISC 2022, Mostaganem, Algeria, October 30–31, 2022
Lecture Notes in Networks and Systems Volume 593
Series Editor Janusz Kacprzyk, Systems Research Institute, Polish Academy of Sciences, Warsaw, Poland Advisory Editors Fernando Gomide, Department of Computer Engineering and Automation—DCA, School of Electrical and Computer Engineering—FEEC, University of Campinas— UNICAMP, São Paulo, Brazil Okyay Kaynak, Department of Electrical and Electronic Engineering, Bogazici University, Istanbul, Turkey Derong Liu, Department of Electrical and Computer Engineering, University of Illinois at Chicago, Chicago, USA Institute of Automation, Chinese Academy of Sciences, Beijing, China Witold Pedrycz, Department of Electrical and Computer Engineering, University of Alberta, Alberta, Canada Systems Research Institute, Polish Academy of Sciences, Warsaw, Poland Marios M. Polycarpou, Department of Electrical and Computer Engineering, KIOS Research Center for Intelligent Systems and Networks, University of Cyprus, Nicosia, Cyprus Imre J. Rudas, Óbuda University, Budapest, Hungary Jun Wang, Department of Computer Science, City University of Hong Kong, Kowloon, Hong Kong
The series “Lecture Notes in Networks and Systems” publishes the latest developments in Networks and Systems—quickly, informally and with high quality. Original research reported in proceedings and post-proceedings represents the core of LNNS. Volumes published in LNNS embrace all aspects and subfields of, as well as new challenges in, Networks and Systems. The series contains proceedings and edited volumes in systems and networks, spanning the areas of Cyber-Physical Systems, Autonomous Systems, Sensor Networks, Control Systems, Energy Systems, Automotive Systems, Biological Systems, Vehicular Networking and Connected Vehicles, Aerospace Systems, Automation, Manufacturing, Smart Grids, Nonlinear Systems, Power Systems, Robotics, Social Systems, Economic Systems and others. Of particular value to both the contributors and the readership are the short publication timeframe and the world-wide distribution and exposure which enable both a wide and rapid dissemination of research output. The series covers the theory, applications, and perspectives on the state of the art and future developments relevant to systems and networks, decision making, control, complex processes and related areas, as embedded in the fields of interdisciplinary and applied sciences, engineering, computer science, physics, economics, social, and life sciences, as well as the paradigms and methodologies behind them. Indexed by SCOPUS, INSPEC, WTI Frankfurt eG, zbMATH, SCImago. All books published in the series are submitted for consideration in Web of Science. For proposals from Asia please contact Aninda Bose.
More information about this series at https://link.springer.com/bookseries/15179
Editors Salim Chikhi Faculty of New Information and Communication Technologies University of Constantine2 Constantine, Algeria
Gregorio Diaz-Descalzo Sistemas Informáticos Universidad de Castilla - La Mancha Albacete, Spain
Abdelmalek Amine Faculty of Technology University of Saida Saida, Algeria
Allaoua Chaoui Faculty of Information and Communication Technology University of Constantine2 Constantine, Algeria
Djamel Eddine Saidouni Faculty of New Information and Communication Technologies Université Constantine 2 Constantine, Algeria
Mohamed Khireddine Kholladi Department of Mathematics and Computer Science University of El Oued El-Oued, Algeria
ISSN 2367-3370 ISSN 2367-3389 (electronic) Lecture Notes in Networks and Systems ISBN 978-3-031-18515-1 ISBN 978-3-031-18516-8 (eBook) https://doi.org/10.1007/978-3-031-18516-8 © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Springer imprint is published by the registered company Springer Nature Switzerland AG The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Preface
This volume contains research papers accepted and presented at the 7th International Symposium on Modelling and Implementation of Complex Systems (MISC 2022), held on 30–31 October 2022, in Mostaganem, Algeria. Like the previous editions (MISC 2010, MISC 2012, MISC 2014, MISC 2016, MISC 2018 and MISC 2020), this symposium is intended as a tradition offering an open forum and meeting space for researchers working in the field of complex systems science.
This edition of the MISC symposium received 80 submissions from five countries: Algeria, France, United Kingdom (UK), Spain and United Arab Emirates. In a rigorous reviewing process, the Programme Committee selected 26 papers, which represents an acceptance rate of 30.76%. The PC included 95 researchers and 21 additional reviewers from nine countries. However, the authors of two papers did not send their copyright transfer agreements, and these papers were therefore removed from the proceedings. The present volume therefore contains 24 contributions. These contributions were organized into three sessions: design and implementation of IoT systems and cloud computing, artificial intelligence and its applications, and data science and its applications.
We would like to thank the co-chairs of the Programme Committee and all its members for their effort in the review process and the selection of the papers. We are grateful to the Organizing Committee members from the University of Mostaganem and the University of Constantine 2 for their contribution to the success of the symposium. Our thanks also go to the authors who submitted papers for their interest in our symposium. We cannot thank enough Prof. Nabil Belala and Dr. Ahmed-Chaouki Chaouche for managing, respectively, the EasyChair system and the website for MISC 2022, from submissions to the preparation of the proceedings.
August 2022
Salim Chikhi
Allaoua Chaoui
Organization
The 7th International Symposium on Modelling and Implementation of Complex Systems (MISC 2022) was co-organized by University of Constantine 2–Abdelhamid Mehri and University of Mostaganem–Abdelhamid Ibn Badis and took place in Mostaganem, Algeria (30–31 October 2022).
Honorary Chairs
Belabbes Yagoubi (University of Mostaganem, Algeria)
Abdelouahab Chemam (University of Constantine 2, Algeria)

General Chairs
Allaoua Chaoui (University of Constantine 2, Algeria)
Salim Chikhi (University of Constantine 2, Algeria)
Abdelmalek Amine (University of Saïda, Algeria)

Steering Committee
Allaoua Chaoui (University of Constantine 2, Algeria)
Salim Chikhi (University of Constantine 2, Algeria)
Mohamed-Khireddine Kholladi (University of El Oued, Algeria)
Djamel Eddine Saïdouni (University of Constantine 2, Algeria)
Mohamed Yagoubi (University of Laghouat, Algeria)
Hamouma Moumen (University of Batna 2, Algeria)
Belabbes Yagoubi (University of Mostaganem, Algeria)
Organizing Committee Chairs
Amir Abdessamad (University of Mostaganem, Algeria)
Saïd Labed (University of Constantine 2, Algeria)

Local Organizing Committee (all members: University of Mostaganem, Algeria)
Fouad Henni Charef Abdellah Bensalloua Abdelkader Benameur Karim Sehaba Mohamed Rédha Djebbara Fatima Zohra Filali Meriem Abid Moulay Driss Mechaoui Zineb Kaisserli Mohammed Elamine Moumene Si Mohamed Bekai Menad Mohamed Moussa Mohamed Adnane Laredj Mohammed El Mustapha Miroud Abdellah Belmehdi
Publication Chairs
Nabil Belala (University of Constantine 2, Algeria)
Ahmed-Chawki Chaouche (University of Constantine 2, Algeria)
Badreddine Miles (University of Constantine 1, Algeria)

Programme Committee Chairs
Djamel Eddine Saïdouni (University of Constantine 2, Algeria)
Karim Sehaba (University of Mostaganem, Algeria)
Baghdad Atmani (University of Oran 1, Algeria)

Programme Committee
Sihem Abbassen (University of Constantine 2, Algeria)
Takoua Abdellatif (Ecole Polytechnique of Tunis, Tunisia)
Abdelkrim Abdelli (USTHB of Algiers, Algeria)
Abdelmalek Amine (University of Saida, Algeria)
Abdelkrim Amirat (University of Souk Ahras, Algeria)
Chafik Arar (University of Batna 2, Algeria)
Baghdad Atmani (University of Oran 1, Algeria)
Mohamed Chaouki Babahenini (University of Biskra, Algeria)
Abdelmalik Bachir (University of Biskra, Algeria)
Amel Behaz (University of Batna 2, Algeria)
Ali Behloul (University of Batna 2, Algeria)
Faïza Belala (University of Constantine 2, Algeria)
Nabil Belala (University of Constantine 2, Algeria)
Meriem Belguidoum (University of Constantine 2, Algeria)
Djamel Bellala (University of Batna 2, Algeria)
Ghalem Belalam (University of Oran 1, Algeria)
Hacène Belhadef (University of Constantine 2, Algeria)
Fouzia Benchikha (University of Constantine 2, Algeria)
Djamel Benmerzoug (University of Constantine 2, Algeria)
Mohamed Benmohammed (University of Constantine 2, Algeria)
Charef Abdellah Bensalloua (University of Mostaganem, Algeria)
Hammadi Bennoui (University of Biskra, Algeria)
Azeddine Bilami (University of Batna 2, Algeria)
Salim Bitam (University of Biskra, Algeria)
Karim Bouamrane (University of Oran 1, Algeria)
Samia Boucherkha (University of Constantine 2, Algeria)
Mahmoud Boufaida (University of Souk Ahras, Algeria)
Zizette Boufaida (University of Constantine 2, Algeria)
Kamel Boukhalfa (USTHB Algiers, Algeria)
Boukharrou Radja (University of Constantine 2, Algeria)
Abdelkrim Bouramoul (University of Constantine 2, Algeria)
Elbey Bourennane (University of Bourgogne, France)
Mourad Bouzenada (University of Constantine 2, Algeria)
Rachid Chalal (ESI of Algiers, Algeria)
Ahmed-Chawki Chaouche (University of Constantine 2, Algeria)
Foudil Cherif (University of Biskra, Algeria)
Saloua Chettibi (University of Jijel, Algeria)
Mohamed Skander Daas (University of Constantine 1, Algeria)
Gregorio Diaz-Descalzo (Castilla La Mancha University, Spain)
Karim Djemame (University of Leeds, United Kingdom)
Cedric Eichler (INSA Bourges, France)
Raïda El Mansouri (University of Constantine 2, Algeria)
Nadir Farah (University of Annaba, Algeria)
Mohamed Amine Ferrag (University of Guelma, Algeria)
Mohamed Gharzouli (University of Constantine 2, Algeria)
Nacira Ghoualmi (University of Annaba, Algeria)
Said Ghoul (Philadelphia University, Jordan)
Larbi Guezouli (University of Batna 2, Algeria)
Lyamine Guezouli (University of Batna 2, Algeria)
Djamila Hamdadou (University of Oran, Algeria)
Saad Harous (El Ain University, United Arab Emirates)
Abdelfetah Hentout (CDTA, Algeria)
Nadia Hocine (University of Mostaganem, Algeria)
Ouahab Kadri (University of Batna 2, Algeria)
Bouchra Kaid Slimane (University of Mostaganem, Algeria)
Okba Kazar (University of Biskra, Algeria)
Tahar Kechadi (UCD School Dublin, Ireland)
El Hilali Kerkouche (University of Jijel, Algeria)
Khaled Khalfaoui (University of Jijel, Algeria)
Mohamed Lamine Kherfi (MESRS, Algeria)
Mohamed Nadjib Kouahla (University of Guelma, Algeria)
Mohamed Tahar Kimour (University of Annaba, Algeria)
Ilham Kitouni (University of Constantine 2, Algeria)
Saïd Labed (University of Constantine 2, Algeria)
Zakaria Laboudi (University of Oum El Bouagui, Algeria)
Yacine Lafifi (University of Guelma, Algeria)
Abdesslem Layeb (University of Constantine 2, Algeria)
Ali Lemouari (University of Jijel, Algeria)
Ramdane Maamri (University of Constantine 2, Algeria)
Mimoun Malki (University of Sidi Bel Abbes, Algeria)
Smaine Mazouzi (University of Skikda, Algeria)
Moulay Driss Mechaoui (University of Mostaganem, Algeria)
Djamila Mechta (University of Setif 1, Algeria)
Kamel Eddine Melkemi (University of Batna 2, Algeria)
Salah Merniz (University of Constantine 2, Algeria)
Hayet-Farida Merouani (University of Annaba, Algeria)
Mohammed A. Merzoug (University of Batna 2, Algeria)
Chaker Mezioud (University of Constantine 2, Algeria)
Abdelouaheb Moussaoui (University of Setif, Algeria)
Mohamed Elhadi Rahmani (University of Saida, Algeria)
Mathieu Roche (CIRAD of Montpellier, France)
Rouba Baroudi (University of Mostaganem, Algeria)
Maamar Sedrati (University of Batna 2, Algeria)
Rachid Seghir (University of Batna 2, Algeria)
Karim Sehaba (University of Mostaganem, Algeria)
Larbi Sekhri (University of Oran 1, Algeria)
Hamid Seridi (University of Guelma, Algeria)
Hichem Talbi (University of Constantine 2, Algeria)
Noria Taghezout (University of Oran 1, Algeria)
Chouki Tibermacine (University of Montpellier 2, France)
Faiza Titouna (University of Batna 2, Algeria)
Mohamed Traiche (CDTA, Algeria)
Belabbas Yagoubi (University of Mostaganem, Algeria)
Mohamed Habib Zahmani (University of Mostaganem, Algeria)
Nacereddine Zarour (University of Constantine 2, Algeria)
Nadia Zeghib (University of Constantine 2, Algeria)
Amer Zerek (Zawia University, Libya)
Abdelhafid Zitouni (University of Constantine 2, Algeria)
Co-editors
Salim Chikhi (University of Constantine 2, Algeria)
Abdelmalek Amine (University of Saïda, Algeria)
Allaoua Chaoui (University of Constantine 2, Algeria)
Mohamed-Khireddine Kholladi (University of El Oued, Algeria)
Djamel Eddine Saïdouni (University of Constantine 2, Algeria)
Additional Reviewers Abdelhak Mansoul Abdelhalim Saadi Ali Mansoul Badr-Eddine Miles Bouchera Maati Chafia Bouanaka Chaima Derouiche Imene Bensalem Khaled Necibi Kouider Ahmed Mahdi Khenour
Mohammed Mounir Bouhamed Nadir Henni Omar Kermia Oussama Kamel Radja Boukharrou Saliha Benkerdagh Salim Benayoune Siham Amrouch Soumia Zellagui Zakaria Benmounah
Contents
Design and Implementation of IoT Systems and Cloud Computing

An Overview of Health Monitoring Systems for Arrhythmia Patients
Saoueb Kerdoudi, Larbi Guezouli, and Tahar Dilekh

Probabilistic Forwarding in Named Data Networks for Internet of Things
Adel Salah Ould Khaoua, Abdelmadjid Boukra, and Fella Bey

Multi-layer Perceptron for Intrusion Detection Using Simulated Annealing
Sarra Cherfi, Ammar Boulaiche, and Ali Lemouari

Evaluation Metrics in DoS Attacks Detection Approaches in IoT: A Survey and a Taxonomy
Mohamed Riadh Kadri, Abdelkrim Abdelli, and Lynda Mokdad

Combined Use of PBMN and Rewriting Logic for Specification and Analysis of IoT Applications
Sofia Abbas, El Hillali Kerkouche, Khaled Khalfaoui, and Allaoua Chaoui

The IoT Ecosystem: Components, Architecture, Communication Technologies, and Protocols
Seloua Haddaoui, Salim Chikhi, and Badreddine Miles

A Volunteered Simulation Environment Applied to 2D-NCCA Enumeration
Nabil Kadache and Rachid Seghir

Platforms Cooperation Based on CIoTAS Protocol
Bouchera Maati, Djamel Eddine Saidouni, and Mohammed Mounir Bouhamed

Artificial Intelligence and its Applications

Hybrid Approach Based on Grey Wolf Optimizer for Dropout Regularization in Deep Learning
Selma Kali Ali and Dalila Boughaci

An Illumination-Robust Face Recognition Approach Based on Convolutional Neural Network
Abdessalam Hattab and Ali Behloul

Trajectory Tracking of an Ev3 Robot Based on Optical Flow
Ghania Zidani, Djalal Djarah, and Abdeslam Benmakhlouf

Rumor Detection in Algerian Arabizi Based on Deep Learning and Associations
Mohamed Charafeddine Bousri, Riad Bensalem, Samah Bessa, Zineb Lamri, Chahnez Zakaria, and Nabila Bousbia

Auto-Diversified Ameliorated MultiPopulation-Based Ensemble Differential Evolution
Hezili Besma and Talbi Hichem

Plant Recognition Using Data Augmentation and Convolutional Neural Network
Said Labed, Hamza Touati, and Rougaia Dif

Survey of the Arabic Machine Translation Corpora
Baligh Babaali and Mohammed Salem

An Example of a Dynamic CPN Model to Obtain Routes in the Presence of Obstacles Detected Using Machine Learning Techniques
Ahmed Bouzenada, Mohammed Mounir Bouhamed, Oussama Kamel, Hermenegilda Macià, Gregorio Díaz, and Allaoua Chaoui

Diabetic Retinopathy Detection Using Deep Learning
Kaouthar Manar Fellah, Samir Tigane, and Laid Kahloul

Data Science and its Applications

Impact of Normalization and Data Augmentation in NER for Algerian Arabic Dialect
Abdelhalim Hafedh Dahou and Mohamed Amine Cheragui

A Comparative Study of Metaheuristics Based Task Scheduling in Cloud Computing
Arslan Nedhir Malti, Badr Benmammar, and Mourad Hakem

Autoencoders and Ensemble-Based Solution for COVID-19 Diagnosis from Cough Sound
Skander Hamdi, Abdelouahab Moussaoui, Mourad Oussalah, and Mohamed Saidi

Efficient Coronavirus Herd Immunity Optimizer for the UAV Base Stations Placement Problem
Sylia Mekhmoukh Taleb, Yassine Meraihi, Selma Yahia, Amar Ramdane-Cherif, Asma Benmessaoud Gabis, and Dalila Acheli

Modified Fisher Discriminant Analysis, An Application on Facial Detection
Korichi Mokhtar El’Amine and Benyoucef Nabila

Communities Detection in Epidemiology: Evolutionary Algorithms Based Approaches Visualization
Mostefa Mokaddem, Ilhem Idris Khodja, Hamza Amar Setti, Baghdad Atmani, and Chihab Eddine Mokaddem

Cardiovascular Diseases Prediction Based on Dense-DNN and Feature Selection Techniques
Abderzak Manaa, Farida Brahimi, Zahira Chouiref, Mohamed Kessouri, and Mourad Amad

Author Index
Design and Implementation of IoT Systems and Cloud Computing
An Overview of Health Monitoring Systems for Arrhythmia Patients
Saoueb Kerdoudi1(B), Larbi Guezouli1,2, and Tahar Dilekh1
1 LaSTIC Laboratory, Department of Computer Science, University of Batna 2, Batna, Algeria
2 Higher National School of Renewable Energies, Environment & Sustainable Development, Batna, Algeria
Abstract. Cardiac rhythm disorders (arrhythmias) are one of the leading causes of death worldwide. Therefore, the detection and classification of arrhythmias are essential for diagnosing patients with cardiac abnormalities. With new technologies, medical institutions are opening up to health information technology systems. To give researchers an overview of existing work on health monitoring systems for heart patients, we have established a comparative study of recent and well-known methods based on their results. The comparison focuses on the features used, the signal length, the datasets used, the feature extraction methods, the feature selection methods, the classification methods, and the performance of each method. Furthermore, we classified these works by disease type: Paroxysmal Atrial Fibrillation (PAF), Atrial Fibrillation (AF), Ventricular Tachyarrhythmia (VTA) such as Ventricular Tachycardia (VT) and Ventricular Fibrillation (VF), Sudden Cardiac Death (SCD), and Obstructive Sleep Apnea (OSA). According to this comparative study, many studies obtained promising results. However, the classification rate achieved remains moderate. Moreover, heart rate monitoring devices are not within reach of the average citizen in terms of price and prediction time, and more studies are needed using more extensive databases. This study gives a comprehensive view of what is currently being done to monitor heart patients’ health. After discussing the achievements and limitations of existing approaches to monitoring the status of cardiac patients, we conclude by providing several potential research directions for the future.
Keywords: Overview · Health monitoring systems · Health care · Cardio-respiratory monitoring systems

1 Introduction
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023. S. Chikhi et al. (Eds.): MISC 2022, LNNS 593, pp. 3–16, 2023. https://doi.org/10.1007/978-3-031-18516-8_1

According to the World Health Organization (WHO) [25], cardiovascular diseases have been the leading cause of death worldwide over the past 20 years. Obstructive sleep apnea (OSA) is also a source of fatigue and of cardiovascular diseases, and its incidence increases almost linearly with age in adults [15]. Early detection of cardiac abnormalities is an active field that has become the focus of many researchers because of the seriousness of the consequences, such as sudden death or subsequent physical or psychological sequelae. There are many types of heart disease; one of them is arrhythmia, that is, heart rhythm abnormalities such as Atrial Fibrillation (AF), Ventricular Tachycardia (VT), and Ventricular Fibrillation (VF). Many companies have shown interest in detecting heart diseases such as cardiac arrhythmia via smart devices, namely Android Wear, Apple Watch, and CRONOVO. Millions of people wear heart rate sensors on their wrists, but these devices do not yet support machine learning on the collected data to detect people with arrhythmias, both because of the quality of their results and because of their high price. Furthermore, a single ECG acquisition is insufficient because many heart rhythm abnormalities require continuous recording of this signal. The absence of the patient’s medical history also forces the doctor to prescribe maximum treatment to cope with the uncertainty of the diagnosis, which can tire the patient through medication with adverse effects or exhausting daily tests. Various methods have been proposed to monitor the health status of patients with chronic heart disease through ECG monitoring, both for early detection of cardiac abnormalities and for detecting untreated sleep apnea associated with cardiovascular disease [38], and several techniques have been proposed to derive respiratory information from the ECG signal.
To increase predictability, each of these methods uses a combination of features extracted by linear, time-frequency, and nonlinear analysis of heart rate variability (HRV), which allows researchers to extract new features and compare them with conventional ones. In addition, some studies have extracted features automatically using a convolutional neural network (CNN) to predict heart disease. Other methods extract respiratory information from the HRV signal by computing the ECG-derived respiration (EDR) signal and the apnea-hypopnea index (AHI). The effectiveness of these features varies depending on the period to be predicted and the disease to be detected, since each disease has its own identifiable features on the electrocardiogram (ECG). On this basis, we compare 12 methods from recent research on the detection of heart disease and sleep apnea, two conditions that are related. We group the studies by type of disease and compare them in terms of learning methods, features used, and results obtained, while mentioning the shortcomings of each method. We also compare the methods developed in studies of cardio-respiratory monitoring. The rest of this paper is organized as follows: Sect. 2 classifies ECG-based studies by disease type, and Sect. 3 concludes the paper.
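The linear HRV analysis mentioned above can be illustrated with a minimal sketch. It assumes that R-peak times have already been detected in the ECG, and it computes three common time-domain measures (mean R-R interval, SDNN, and RMSSD). These particular measures and the function names `rr_intervals_ms` and `time_domain_hrv` are illustrative assumptions; they are not necessarily the features used by the surveyed studies.

```python
import math


def rr_intervals_ms(r_peak_times_s):
    """Convert R-peak timestamps (seconds) into successive R-R intervals (milliseconds)."""
    return [(b - a) * 1000.0 for a, b in zip(r_peak_times_s, r_peak_times_s[1:])]


def time_domain_hrv(rr_ms):
    """Return mean RR, SDNN, and RMSSD for a list of R-R intervals in milliseconds.

    Hypothetical helper for illustration only; the surveyed papers may use
    different or additional features.
    """
    if len(rr_ms) < 2:
        raise ValueError("at least two R-R intervals are required")
    n = len(rr_ms)
    mean_rr = sum(rr_ms) / n
    # SDNN: sample standard deviation of the R-R intervals.
    sdnn = math.sqrt(sum((x - mean_rr) ** 2 for x in rr_ms) / (n - 1))
    # RMSSD: root mean square of successive R-R differences.
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    rmssd = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    return {"mean_rr": mean_rr, "sdnn": sdnn, "rmssd": rmssd}


if __name__ == "__main__":
    # Made-up R-peak times for a short segment, not data from any cited database.
    peaks = [0.00, 0.80, 1.61, 2.40, 3.20]
    print(time_domain_hrv(rr_intervals_ms(peaks)))
```

In a full pipeline, such values would be computed per segment (for example per 5-min window) and concatenated with time-frequency and nonlinear features before feature selection.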

2 Classification of ECG-Based Studies by Disease Types
This paper covers three main classes of the most common and dangerous heart diseases by categorizing recent studies that have used deep learning and machine learning techniques to predict these diseases, and it examines the impact of the extracted features on the prediction results. In addition, a fourth class covers a disease related to cardiovascular disease. The four classes are: (1) Paroxysmal Atrial Fibrillation (PAF) and Atrial Fibrillation (AF), (2) Ventricular Tachyarrhythmia (VTA) such as Ventricular Tachycardia (VT) and Ventricular Fibrillation (VF), (3) Sudden Cardiac Death (SCD), and (4) Obstructive Sleep Apnea (OSA).

2.1 Paroxysmal Atrial Fibrillation (PAF) and Atrial Fibrillation (AF)
One of the most widely used applications of machine learning in cardiology is the prediction of cardiac arrhythmias. Paroxysmal atrial fibrillation (PAF) and atrial fibrillation (AF) are the most common significant cardiac arrhythmias. Doctors emphasize that if PAF is not treated promptly, it may progress to persistent AF, resulting in a high risk of morbidity and mortality. As a result, more attention is being paid to the prediction of PAF to enable early detection and halt the disease’s progression. Despite the availability of pharmacological and electrical therapies, a validated approach to forecasting PAF has not yet been developed. Numerous studies have focused on predictive machine learning (ML) systems composed of several sub-processes: signal preprocessing, feature extraction, feature selection, and classification. Ebrahimzadeh et al. [6] extracted the HRV signal from the ECG and segmented it into 5-min intervals. They retrieved a variety of features from the HRV signal: linear features, time-frequency features obtained with the Wigner-Ville transform, and nonlinear features. The best feature combination was then selected, based on its capacity to differentiate the two classes, using Local Feature Subset Selection. To distinguish between signals before and after PAF, they used four classifiers: K-nearest neighbor (KNN), support vector machine (SVM), Multilayer Perceptron (MLP), and Mixture of Experts (ME). Ten-fold cross-validation was used to assess the performance of the classifiers. They employed the standard Atrial Fibrillation Prediction Database (AFPDB) [10], which contains 53 patients (106 regular and abnormal recordings), and obtained 100% sensitivity, 95.55% specificity, and 98.21% accuracy.
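Sensitivity, specificity, and accuracy are the figures of merit reported throughout this section. As a reminder of how they follow from a classifier's predictions, here is a minimal sketch. The function names and the sample labels are hypothetical and chosen for illustration; they do not reproduce the data or code of any cited study.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true positives, false negatives, true negatives, and false positives."""
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        else:
            if pred == positive:
                fp += 1
            else:
                tn += 1
    return tp, fn, tn, fp


def sensitivity_specificity_accuracy(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) as fractions in [0, 1]."""
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present")
    sensitivity = tp / (tp + fn)   # share of diseased cases that are detected
    specificity = tn / (tn + fp)   # share of healthy cases that are cleared
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


if __name__ == "__main__":
    # Made-up labels: 1 = segment preceding PAF, 0 = segment far from PAF.
    y_true = [1, 1, 1, 0, 0, 0, 0, 1]
    y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
    print(sensitivity_specificity_accuracy(*confusion_counts(y_true, y_pred)))
```

With k-fold cross-validation, these counts are typically accumulated over the held-out folds before the three ratios are computed.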
Although implantable defibrillators have limited computational capacities, they can detect and treat PAF early. To help lower the computational overhead of initiating PAF detection, Parsi et al. [27] propose seven novel features for predicting the existence of PAF with high accuracy. They extracted features from a Poincaré representation of R-R interval signals recovered from patient ECG data in 5-min segments and prioritized them based on feature rankings. The method is evaluated using the same standard database (AFPDB) [10] that [6] worked with. To compare the classic and proposed feature sets, their paper [27] uses the features with four standard classification techniques for PAF prediction: SVM, MLP, KNN, and Random Forest (RF). The performance of the classifiers is evaluated using ten-fold cross-validation with different feature sets. Their findings confirm that the results improve further when the proposed features are combined with several classic features, reaching 98.8% sensitivity, 96.7% specificity, and 97.7% accuracy.
Atrial fibrillation (AF) can develop from PAF. Some patients with AF have no apparent clinical symptoms, exposing them to the risk of several severe illnesses while they are unaware of it. As a result, early detection and prediction of AF, together with appropriate treatment to reduce its occurrence, are clinically and socially significant. The study of Shen et al. [33] therefore aims to detect AF with a method that combines manually extracted features and features extracted by a neural network. They first split the ECG signals into 4-s segments, then manually extracted 180-dimensional time-frequency-domain features and combined them with the improved 128-dimensional features extracted by the neural network. To discriminate between signals that indicate AF and those that do not, they also employed an integrated model to enhance the machine learning results, with Decision Tree, RF, GBDT, XGBoost, and LightGBM as sub-models and a stacking model in the final experiment. They used five-fold cross-validation on the initial training set to assess the performance of the stacking model.
Furthermore, they performed their experiments using the standard MIT-BIH Atrial Fibrillation Database [22], which contains 25 records (50 signals); each record is about ten hours long, with a sampling frequency of 250 Hz. They achieved an accuracy of 99.1%. Marinucci et al. [21] aim to offer a novel artificial neural network (ANN) for reliably detecting AF in ECGs recorded via portable devices. To achieve this goal, they created a supervised, fully connected artificial neural network (RSL ANN) using repeated structuring and learning procedures [30]. The two most common signs of AF on an ECG are irregular R-R intervals and the disappearance of the P wave, which is replaced by a continuous f wave [33]. Their research [21] feeds 19 features (11 morphological features, four F-wave features, and four HRV features) into the RSL ANN to distinguish between AF and non-AF classes. The RSL ANN was developed and evaluated on the “AF Classification from a Short Single Lead ECG Recording” database [4], acquired by AliveCor with the portable KARDIA device [4]. It contains 8244 short single-lead ECG recordings with durations ranging from 9 s to 61 s (average: 33 s) and a sampling rate of 300 Hz. For the scope of their paper, they used 8028 short single-lead ECG recordings (training: 4493; validation: 1125; testing: 2410). They achieved an area under the curve (AUC) of 91.1% (CI: 89.1–93.0%) for the training, 90.2% (CI: 86.2–94.3%) for the validation, and 90.8% (CI: 88.1–93.5%) for the testing datasets.
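The area under the ROC curve (AUC) reported above can be read as the probability that a randomly chosen AF recording receives a higher classifier score than a randomly chosen non-AF recording. The following minimal sketch computes it directly from that pairwise definition. The function name `roc_auc` and the sample scores are assumptions made for illustration; confidence intervals, as reported in [21], are not computed here.

```python
def roc_auc(labels, scores):
    """AUC from its pairwise definition: fraction of (positive, negative) pairs
    in which the positive example has the higher score; ties count as one half.

    labels: iterable of 0/1 class labels (1 = AF, 0 = non-AF in this example).
    scores: classifier outputs, higher meaning "more likely AF".
    """
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    # Made-up network outputs for four recordings, not data from the cited study.
    print(roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))
```

The double loop is quadratic in the number of recordings, which is acceptable for a sketch; a rank-based formulation gives the same value more efficiently on large test sets.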
An Overview of Health Monitoring Systems for Arrhythmia Patients
The comparison between methods based on PAF and AF is summarized in Table 1.

Table 1. Performance comparison of AF and PAF classification models in recent years

| Author and year | Dataset | Signal length | Features extraction | VM | Classifier | AC | SN | SP | NF |
|---|---|---|---|---|---|---|---|---|---|
| Ebrahimzadeh et al. [6] 2018 | AFPDB [10] | 5 min | Time, frequency, and nonlinear HRV features | 10-fold CV | SVM | 94.64 | 96.29 | 93.10 | 12 |
| | | | | | K-NN | 89.28 | 92.30 | 86.66 | |
| | | | | | MLP | 91.07 | 92.59 | 89.65 | |
| | | | | | ME | 98.21 | 100 | 96.55 | |
| Parsi et al. [27] 2021 | AFPDB [10] | 5 min | Time, frequency, bispectrum, and nonlinear HRV features (7 features) along with 7 proposed features | 10-fold CV | SVM | 97.7 | 98.8 | 96.7 | 14 |
| | | | | | K-NN | 91.7 | 87.8 | 95.6 | |
| | | | | | MLP | 97.2 | 96.7 | 97.8 | |
| | | | | | RF | 95 | 97.8 | 91.1 | |
| Shen et al. [33] 2020 | The MIT-BIH Atrial Fibrillation Database [22] | 4 s | 180-dimensional time-frequency domain features extracted manually, combined with the improved 128-dimensional features extracted by CNN | 5-fold CV | SVM | 91.3 | – | – | 308 |
| | | | | | RF | 73.8 | – | – | |
| | | | | | CNN | 96.2 | – | – | |
| | | | | | DT | 96.2 | – | – | |
| | | | | | Stacking | 99.1 | – | – | |
| Marinucci et al. [21] 2020 | AF Classification from a Short Single Lead ECG Recording [4] | 250 ms before and 450 ms after each R peak | 11 morphological, 4 F-wave, and 4 HRV features | Split (55%, 15%, 30%) | RSL ANN | AUC 90.8 | Case 1: 81.2; Case 2: 88.7 | Case 1: 81.2; Case 2: 75.0 | 19 |

Note: HRV: heart-rate variability, CV: cross-validation, VM: validation method, AC: accuracy, SN: sensitivity, SP: specificity, NF: number of features, ME: mixture of experts.
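The AC, SN, and SP columns follow the usual definitions based on the confusion matrix of a binary classifier. As a minimal sketch, assuming labels encoded as 1 (arrhythmia) and 0 (normal), the three figures can be derived as follows; the function name and example labels are illustrative and do not come from the reviewed papers.

```python
def binary_metrics(y_true, y_pred):
    """Return accuracy (AC), sensitivity (SN) and specificity (SP) as fractions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    total = tp + tn + fp + fn
    return {
        "AC": (tp + tn) / total if total else float("nan"),
        # Sensitivity: share of true arrhythmia cases that were detected.
        "SN": tp / (tp + fn) if (tp + fn) else float("nan"),
        # Specificity: share of normal cases correctly left unflagged.
        "SP": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


if __name__ == "__main__":
    # Toy labels only, chosen to make the arithmetic easy to follow.
    truth = [1, 1, 1, 0, 0, 0, 0, 0]
    preds = [1, 1, 0, 0, 0, 0, 0, 1]
    print(binary_metrics(truth, preds))
```

A high accuracy can hide a low sensitivity when normal segments dominate the dataset, which is why the tables report the three values separately.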
Discussion of the Comparison Between Methods Based on ‘PAF and AF’
The reviewed studies applied SVM, KNN, MLP, RF, and CNN methods to detect AF and PAF, with accuracy varying between 73.8% and 99.1% and the number of features ranging from 12 to 308. They used atrial fibrillation datasets from PhysioNet [11], namely AFPDB [10], the MIT-BIH Atrial Fibrillation Database [22], and the AF Classification from a Short Single Lead ECG Recording database [4]. The signal length of their ECG records ranges between 4 s and 5 min. The researchers in [6,27] extracted HRV-related features from the AFPDB [10] database as input to the model and reported remarkable performance. However, the number of subjects in their training and testing sets was small. Ebrahimzadeh et al. [6] obtained the best results, with 100% sensitivity and 95.5% specificity. Although their sensitivity cannot be improved, they are
S. Kerdoudi et al.
based on a mixture of experts (ME) classifier, a more complex approach than an individual model. In [33], deep learning methods are effective in terms of classification accuracy but less so for a comprehensive analysis of HRV signals. In [21], the authors chose the features based on the changes that occur in the ECG recording during AF, namely the disappearance of the P wave, its replacement by the continuous f wave, and a possible HRV increase. This choice was confirmed by their statistical analysis of the feature distributions, which agreed with known clinical observations. However, RR-interval irregularity can also be seen in other types of arrhythmia. Moreover, the P wave or f wave is a weak signal, which makes it difficult to detect its feature points and capture its morphology.

2.2 Ventricular Tachyarrhythmia (VTA)
Since the ventricles are fundamental to the heart’s capacity to pump blood, any disturbance in their regular rhythm may be devastating. Among the life-threatening ventricular abnormalities is ventricular tachyarrhythmia (VTA), which includes ventricular tachycardia (VT) and ventricular fibrillation (VF). Some studies have shown encouraging results in predicting the occurrence of VT and VF using classic HRV features and machine learning algorithms. The goal of Taye et al. [36] was to determine whether QRS complex features can predict VF better than classic HRV measures. To this end, they extracted two features from 120 s ECG signals, the QRS complex signed area and the R-peak amplitude, together with traditional HRV features for comparison, to predict the onset of VF 30 s before its occurrence. They trained and tested two ANN classifiers with 10-fold cross-validation and two different sets of input parameters. The first, with 11 features extracted from HRV, reached an accuracy of 72%, and the second, with 4 QRS complex shape features, reached a high accuracy of 98.6%. Taye et al. [36] worked on three datasets from PhysioNet [11]: the CU Ventricular Tachyarrhythmia Database (CUVTDB) [9], AFPDB [10], and the MIT-BIH Normal Sinus Rhythm Database (NSRDB) [11]. The primary purpose of another study by Taye et al. [35] was to validate the efficacy of the CNN algorithm in feature extraction and VTA prediction. To predict the onset of VT and VF within 1 min, they used a one-dimensional CNN to extract features from 5 min HRV signals. They used 10-fold cross-validation to assess the accuracy of the prediction and achieved an accuracy of 84.6%. 
In addition, the method was evaluated on the spontaneous ventricular tachyarrhythmia database (MVTDB) [12] from PhysioNet [11], which contains 135 pairs of RR-interval records acquired by implantable cardioverter defibrillators (ICDs) with a sampling rate of 10000 Hz. Because ICD devices have limited computing capabilities, and classifying the VTA events that lead to SCD is essential, reducing the number of features helps reduce the load on the device. Thus, to predict VT and VF on devices with limited computational capabilities, such as ICDs, with a reduced set of features, Parsi et
al. [26] ranked HRV features using feature selection based on the minimal redundancy-maximal relevance (mRMR) method, combining mRMR with statistical machine learning classifiers such as SVM, kNN, and RF. They obtained 91.5% accuracy with a kNN classifier on a 5 min window of R-R interval signals, and 90.1% accuracy with an SVM classifier on a 1 min window of R-R interval signals using only 6 HRV features. The authors used leave-one-out cross-validation on the evaluation database, MVTDB [12], which was also used in [35]. Because of the enormous differences in ECG morphology, clinicians find it challenging to diagnose a patient’s heart state from an ECG signal, and manual interpretation is an error-prone task [18,20,31]. Therefore, a computer-aided diagnostic (CAD) system for classifying cardiac anomalies might help identify the severity of the disease [20]. The objective of Mandal et al. [20] is to create a CAD system that can classify ECG signals from patients with VT and VF. They used thirty feature extraction methods to compute features from HRV signals and ECG beat images. To reduce the number of features, they employed the Cardiac-score selection algorithm. Furthermore, to distinguish between healthy persons and persons with ventricular arrhythmia (VA), they used an ensemble of classifiers comprising KNN, SVM, RF, and Probabilistic Neural Network (PNN). They achieved an accuracy of 99.99%, evaluated the classifiers using five-fold cross-validation, and worked on three datasets: CUVTDB [9], NSRDB [11], and the MIT-BIH Malignant Ventricular Arrhythmia Database (VFDB) [32]. The comparison between methods based on ventricular tachycardia and ventricular fibrillation is summarized in Table 2.

Table 2. Performance comparison of VT and VF classification models in recent years

| Author and year | Dataset | Signal length | Features extraction | VM | Classifier | AC | SN | SP | NF |
|---|---|---|---|---|---|---|---|---|---|
| Taye et al. [36] 2019 | CUDB [9], AFPDB [10], NSRDB [11] | 2 min | QRS complex shape-based features | 10-fold CV | ANN | 98.6 | 98.4 | 99.04 | 4 |
| Taye et al. [35] 2020 | MVTDB [12] | 6 min | One-dimensional CNN to extract features from HRV | 10-fold CV | CNN | 84.6 | 83.2 | 86.4 | – |
| Parsi et al. [26] 2021 | MVTDB [12] | 1 min | Time, frequency, bispectrum, and nonlinear HRV features | Leave-one-out CV | SVM | 90.2 | 86.9 | 98.5 | 6 |
| | | 5 min | | | K-NN | 91.5 | 88.8 | 94.2 | 7 |
| Mandal et al. [20] 2021 | CUDB [9], VFDB [32], NSRDB [11] | 30 min | Combined features (HRV and imaged beat), 30 features | 5-fold CV | Ensemble | 99.99 | – | – | 30 |
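The mRMR ranking used in [26] greedily picks features that are relevant to the class label while being weakly redundant with the features already chosen. The sketch below illustrates that greedy loop under a simplifying assumption: absolute Pearson correlation stands in for the mutual-information terms of the original mRMR criterion, so it is not the procedure of [26]. The feature names and values are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def mrmr_rank(features, labels, k):
    """Greedy relevance-minus-redundancy ranking.

    features: dict mapping a feature name to its list of values.
    Correlation is used as a proxy for mutual information (an assumption).
    """
    relevance = {name: abs(pearson(col, labels)) for name, col in features.items()}
    selected = []
    remaining = list(features)
    while remaining and len(selected) < k:
        def score(name):
            if not selected:
                return relevance[name]
            redundancy = sum(
                abs(pearson(features[name], features[s])) for s in selected
            ) / len(selected)
            return relevance[name] - redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


if __name__ == "__main__":
    labels = [0, 0, 1, 1, 0, 1, 1, 1]
    toy = {
        "feat_a": [0, 0, 1, 1, 0, 0, 1, 1],
        "feat_a_copy": [1, 1, 4, 4, 1, 1, 4, 4],  # rescaled duplicate of feat_a
        "feat_b": [0, 1, 0, 1, 0, 1, 0, 1],
    }
    print(mrmr_rank(toy, labels, 2))
```

In this toy case the rescaled duplicate is as relevant as the original feature, yet the second pick is the less relevant but non-redundant feature, which is the behaviour that lets a small feature set run on a constrained device.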
Discussion of the Comparison Between Methods Based on ‘VT and VF’
The presented works use ANN, CNN, SVM, KNN, and ensemble methods to detect VTA, including ventricular tachycardia (VT) and ventricular fibrillation (VF), giving an accuracy between 84.6% and 99.99% using 4 to 7 features. They used datasets from PhysioNet [11], such as CUDB [9], AFPDB [10], NSRDB [11], MVTDB [12], and VFDB [32], with a signal length between 1 min and 30 min. Although the results of Taye et al. [36] show that the ANN performed excellently with QRS shape features compared to traditional HRV features, the datasets used were small. Therefore, confirming the clinical feasibility of this approach requires more research on larger datasets. Moreover, a prediction horizon of 30 s before VF onset is very short, and the same may be said of the other paper by Taye et al. [35], which implies that clinical applicability will require more research on larger datasets and a longer forecast time. To improve the findings of [26], future research should focus on determining the ideal signal length (which could be dynamic) based on clinical knowledge about heart problems. The reported detection accuracy is also limited to the samples studied. Although the study of [20] performed well, the database employed was insufficient, and future research should concentrate on developing a VTA detection system evaluated on a much larger number of datasets.

2.3 Sudden Cardiac Death (SCD)
Sudden Cardiac Death (SCD) is an unexpected death caused by a loss of heart function that claims millions of lives worldwide each year [7]. SCD happens within an hour of the onset of symptoms, and there is currently no reliable approach for early detection [29]. As a result, several researchers have turned their attention to predicting SCD. To develop a novel approach to local feature subset selection for predicting sudden cardiac death from ECG signals, Ebrahimzadeh et al. [7] used methodologies developed in their previous studies for extracting nonlinear, time-frequency, and classical features. This lets them select features that differ from one another in each 1-min interval before the incident. According to this team [7], SCD can be predicted 12 min before it occurs using their proposed algorithm. After selecting the best combination of features, they used a multilayer perceptron (MLP) to differentiate the ECG signals of normal subjects from those of subjects susceptible to SCD. They applied leave-one-out cross-validation on the Sudden Cardiac Death Holter Database (SDDB) [32] and NSRDB [11], resulting in an accuracy of 88.29%. In another work [5], Ebrahimzadeh et al. used an innovative and automated strategy for local feature subset selection. They utilized time-domain, frequency-domain, time-frequency domain, and nonlinear HRV features. These suggested approaches allow them to find
features that vary from one another every minute before an event by selecting the best features found at each 1 min interval of the signal, allowing them to anticipate SCD 13 min before it occurs. They used four classifiers (KNN, SVM, MLP, and ME) and assessed their performance using leave-one-out cross-validation. They employed the same databases [32] [11] as in their previous work, which improved the accuracy to 90.18% for 13 min instead of 88.29% for 12 min. Rohila et al. [29] obtained one-hour ECG signals before VF onset, divided them into twelve 5 min segments, and converted these segments into HRV signals. They extracted features from the HRV signals using nonlinear techniques and five new S-transform-based time-frequency-domain features, in a comparative analysis of HRV for four subject groups: normal sinus rhythm (NSR), coronary artery disease (CAD), congestive heart failure (CHF), and SCD. They used Kruskal-Wallis one-way analysis of variance and multiple comparisons to assess the clinical relevance of the extracted features. They then performed the classification with SVM and DT classifiers to discriminate between the four classes: NSR, CAD, CHF, and SCD. They employed NSRDB [11], the Long-Term ST Database (LTSTDB) [16], the BIDMC Congestive Heart Failure Database (CHFDB) [3], and SDDB [32], which led to 91.67% accuracy, 83.33% sensitivity, and 94.64% specificity. The comparison between methods based on SCD is summarized in Table 3.

Table 3. Performance comparison of SCD classification models in recent years

| Author and year | Dataset | Signal length | Features extraction | VM | Classifier | AC | SN | SP | NF |
|---|---|---|---|---|---|---|---|---|---|
| Ebrahimzadeh et al., 2018 [7] | SDDB [32], NSRDB [11] | 12 min | Time-domain, frequency-domain, time-frequency domain, and nonlinear HRV features | Leave-one-out CV | MLP | 88.29 | – | – | 23 |
| Ebrahimzadeh et al., 2019 [5] | SDDB [32], NSRDB [11] | 13 min | Time-domain, frequency-domain, time-frequency domain, and nonlinear HRV features | Leave-one-out CV | MLP | 90.18 | – | – | 23 |
| Rohila et al. 2020 [29] | NSRDB [11], CHFDB [3], LTSTDB [16], SDDB [32] | 1 h HRV profile before VF onset, divided into 12 segments of 5 min each | Classical nonlinear features, wavelet transform entropy, Poincaré plot, S-transform | – | DT (SCD class) | 91.67 | 83.3 | 94.44 | 15 |
| | | | | | DT (average for all classes) | 92.00 | 83.83 | 94.63 | 15 |
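Leave-one-out cross-validation, used in [5,7], holds out each record in turn, trains on all the others, and averages the outcomes. The sketch below shows the procedure with a 1-nearest-neighbour classifier as a stand-in, since the MLP of the cited works is not reproduced here; the function names and the toy feature vectors are illustrative assumptions.

```python
def nearest_neighbour_predict(train_x, train_y, query):
    """Predict the label of the training point closest to `query` (1-NN)."""
    best = min(
        range(len(train_x)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_x[i], query)),
    )
    return train_y[best]


def leave_one_out_accuracy(xs, ys):
    """Hold out each sample once, train on the rest, return the hit rate."""
    correct = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        if nearest_neighbour_predict(train_x, train_y, xs[i]) == ys[i]:
            correct += 1
    return correct / len(xs)


if __name__ == "__main__":
    # Toy one-dimensional feature vectors; not data from the cited databases.
    xs = [[0.0], [0.1], [0.9], [1.0]]
    ys = [0, 0, 0, 1]
    print(leave_one_out_accuracy(xs, ys))
```

Because every record serves as a test case exactly once, the scheme suits the small cohorts of these databases, at the cost of one training run per record.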
Discussion of the Comparison Between Methods Based on SCD
MLP, KNN, SVM, ME, and DT methods were applied to the SDDB [32], NSRDB [11], CHFDB [3], and LTSTDB [16] datasets from PhysioNet [11] to predict Sudden Cardiac Death (SCD). The presented works report accuracies between 88.29% and 91.67% using 15 to 23 HRV features. The signal length of their ECG records ranges between 5 min and 13 min. HRV is less established in predicting SCD risk in patients with coronary heart disease [8,19], and it cannot be assessed in patients with atrial
fibrillation or recurrent arrhythmias [19]. Furthermore, the reported advance prediction time of 5 to 13 min would limit the use of HRV for SCD risk prediction in a clinical setting [5,7,29]. The experimental findings obtained through retrospective data analysis are promising. However, to generalize the results, the approaches need to be evaluated on larger datasets [5,7,29].

2.4 Obstructive Sleep Apnea (OSA)
Untreated sleep apnea has been linked to a variety of conditions, including high blood pressure, cardiovascular disease, and neurovascular disease [14,17,24]. Because of this relation between OSA and cardiovascular disease, people with heart problems should have their breathing monitored. The gold standard for apnea diagnosis is polysomnography (PSG), but it is very expensive and time-consuming [13]. As a result, researchers in recent years have concentrated on creating low-cost, low-complexity methodologies as an alternative to PSG for OSA detection. The authors of [1,37] employed a variety of methods to detect OSA automatically using the SpO2 signal, and several approaches based on single-lead ECG readings have also been developed [2,14,34]. Zarei et al. [39] proposed a new approach to extract features based on the spectral autocorrelation function and autoregressive (AR) models using single-lead ECG signals for automated sleep apnea detection. They used the sequential forward feature selection (SFFS) technique to choose the most effective features and an RF classifier to discriminate between apnea and normal events, evaluated with 10-fold cross-validation. They obtained an accuracy of 93.90%, a sensitivity of 92.26%, and a specificity of 94.92% in per-segment classification, and an accuracy of 97.14% in per-recording classification. The method was evaluated on the Apnea-ECG dataset [28] from PhysioNet [11]. In another work by Zarei et al. [38], single-lead ECG recordings are again used to detect OSA. The authors extracted the ECG-derived respiration (EDR) signal from a single-lead ECG using six distinct techniques, and then extracted features from the EDR and HRV signals. To choose the most effective features, they used a sequential feature selection approach. 
Their method for detecting OSA is divided into two steps (per-segment classification and per-recording classification). To determine the best classifier for per-segment classification, six classifiers were compared: KNN, RUSBoost, GentleBoost, Subspace KNN, ANN, and SVM. GentleBoost performed best, with 93.26% accuracy, 91.52% sensitivity, and 94.36% specificity, and the method reached 100% accuracy in per-recording classification. The performance of the classifiers was assessed using ten-fold cross-validation. They used the Physionet Apnea-ECG [28] and Fantasia [23] datasets to evaluate the OSA detection technique and the suggested EDR extraction methods. The comparison between methods based on OSA is summarized in Table 4.
Table 4. Comparison between methods based on OSA

| Author and year | Dataset | Signal length | Features extraction | Features selection | VM | Classification level | Classifier | AC | SN | SP | NF |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Zarei et al. [39] 2020 | Physionet Apnea-ECG database [28] | 1 min | By means of an autoregressive analysis | Sequential forward feature selection (SFFS) | 10-fold CV | Per-segment | RF | 93.90 | 92.26 | 94.92 | 14 |
| | | | | | | Per-recording | AHI | 97.14 | 95.65 | 100 | 14 |
| Zarei et al. [38] 2020 | Physionet Apnea-ECG database [28], Fantasia database [23] | 1 min | Time-domain, frequency-domain, and nonlinear features | Sequential feature selection (SFS) | 10-fold CV | Per-segment | GentleBoost | 93.26 | 91.52 | 94.36 | |
| | | | | | | Per-recording | AHI | 100 | 100 | 100 | |
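The per-recording rows in Table 4 rely on the apnea-hypopnea index (AHI), which aggregates the per-segment decisions of a whole night into one label. The following is a minimal sketch under two assumptions that are not taken from [38,39]: each 1-min segment labelled as apnea is counted as one event, and a recording is called apnea when the estimated AHI reaches 5 events per hour. The function names and the threshold parameter are illustrative.

```python
def estimate_ahi(segment_labels, segment_minutes=1.0):
    """Estimate AHI (events per hour) from per-segment labels (1 = apnea).

    Assumes one apnea-labelled segment counts as one event, which is a
    simplification of clinical scoring.
    """
    if not segment_labels:
        raise ValueError("recording has no segments")
    apnea_segments = sum(1 for label in segment_labels if label == 1)
    hours = len(segment_labels) * segment_minutes / 60.0
    return apnea_segments / hours


def classify_recording(segment_labels, ahi_threshold=5.0):
    """Label a whole recording from its per-segment predictions."""
    ahi = estimate_ahi(segment_labels)
    return "apnea" if ahi >= ahi_threshold else "normal"


if __name__ == "__main__":
    # Toy 8-hour recording: 100 apnea minutes out of 480.
    night = [1] * 100 + [0] * 380
    print(estimate_ahi(night), classify_recording(night))
```

Since a few misclassified minutes rarely move the AHI across the threshold, per-recording accuracy can exceed per-segment accuracy, which is consistent with the pattern in Table 4.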
Discussion of the Comparison Between Methods Based on OSA
Current HRV research focuses on determining how sympathovagal balance varies throughout sleep and in response to OSA. Future studies might use experimental designs to evaluate the relative contributions of hypoxia, arousal, and intrathoracic pressure to the altered HRV indices of individuals with OSA and cardiovascular disease.
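The HRV indices referred to throughout this review are computed from the series of RR intervals. As a minimal sketch, assuming RR intervals given in milliseconds, the code below derives three standard time-domain indices (mean RR, SDNN, and RMSSD) plus the mean heart rate. The choice of indices is an illustrative assumption; the reviewed studies use larger feature sets that also include frequency-domain and nonlinear measures.

```python
import math


def hrv_time_domain(rr_ms):
    """Compute basic time-domain HRV indices from RR intervals in milliseconds."""
    n = len(rr_ms)
    if n < 2:
        raise ValueError("need at least two RR intervals")
    mean_rr = sum(rr_ms) / n
    # SDNN: sample standard deviation of the RR intervals.
    sdnn = math.sqrt(sum((x - mean_rr) ** 2 for x in rr_ms) / (n - 1))
    # RMSSD: root mean square of successive RR differences.
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    rmssd = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    return {
        "mean_rr": mean_rr,
        "sdnn": sdnn,
        "rmssd": rmssd,
        "mean_hr_bpm": 60000.0 / mean_rr,
    }


if __name__ == "__main__":
    # Toy RR series only; real inputs would come from R-peak detection.
    print(hrv_time_domain([800, 810, 790, 800]))
```

SDNN reflects overall variability within the window, while RMSSD is dominated by beat-to-beat changes, so the two can diverge on the same recording.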
3 Conclusion
Early diagnosis of cardiovascular diseases such as cardiac arrhythmia can save lives and enable prompt treatment that avoids serious physical, psychological, and financial consequences. The ECG is the primary tool for assessing the electrical activity of the heart because any abnormality in cardiac activity is reflected in the ECG signal. However, visual assessment of ECG signals is a difficult and time-consuming task. Therefore, a system that continuously monitors the patient can provide an objective and rapid diagnosis of cardiac arrhythmia. To give researchers a comprehensive view of current work, we have presented a comparative study of several existing works. We classify ECG-based studies by disease type into four classes: Paroxysmal Atrial Fibrillation (PAF) and Atrial Fibrillation (AF); Ventricular Tachyarrhythmia (VTA), comprising Ventricular Tachycardia (VT) and Ventricular Fibrillation (VF); Sudden Cardiac Death (SCD); and Obstructive Sleep Apnea (OSA). Going forward, we will focus on increasing the anomaly prediction horizon and on using a larger, unified dataset to make a more accurate comparison.
References
1. Al-Angari, H.M., Sahakian, A.V.: Automated recognition of obstructive sleep apnea syndrome using support vector machine classifier. IEEE Trans. Inf Technol. Biomed. 16(3), 463–468 (2012)
2. Atri, R., Mohebbi, M.: Obstructive sleep apnea detection using spectrum and bispectrum analysis of single-lead ECG signal. Physiol. Meas. 36(9), 1963 (2015)
3. Baim, D.S., et al.: Survival of patients with severe congestive heart failure treated with oral milrinone. J. Am. Coll. Cardiol. 7(3), 661–670 (1986). https://doi.org/10.1016/S0735-1097(86)80478-8
4. Clifford, G., et al.: AF classification from a short single lead ECG recording - the PhysioNet computing in cardiology challenge 2017 (2017). https://physionet.org/content/challenge-2017/1.0.0/
5. Ebrahimzadeh, E., et al.: An optimal strategy for prediction of sudden cardiac death through a pioneering feature-selection approach from HRV signal. Comput. Methods Programs Biomed. 169, 19–36 (2019)
6. Ebrahimzadeh, E., Kalantari, M., Joulani, M., Shahraki, R.S., Fayaz, F., Ahmadi, F.: Prediction of paroxysmal atrial fibrillation: a machine learning based approach using combined feature vector and mixture of expert classification on HRV signal. Comput. Methods Programs Biomed. 165, 53–67 (2018)
7. Ebrahimzadeh, E., Manuchehri, M.S., Amoozegar, S., Araabi, B.N., Soltanian-Zadeh, H.: A time local subset feature selection for prediction of sudden cardiac death from ECG signal. Med. Biol. Eng. Comput. 56(7), 1253–1270 (2018)
8. Evrengul, H., et al.: The relationship between heart rate recovery and heart rate variability in coronary artery disease. Ann. Noninvasive Electrocardiol. 11(2), 154–162 (2006)
9. Nolle, F.M., Badura, F.K., Catlett, J.M., Bowser, R.W., Sketch, M.H.: CREI-GARD, a new concept in computerized arrhythmia monitoring systems (1986)
10. Moody, G., Goldberger, A., McClennen, S., Swiryn, S.: Predicting the onset of paroxysmal atrial fibrillation: the computers in cardiology challenge 2001 (2001). https://physionet.org/content/afpdb/1.0.0/
11. Goldberger, A., et al.: PhysioBank, PhysioToolkit, and PhysioNet: Components of a new research resource for complex physiologic signals. Circulation 101(23), e215–e220 (2000)
12. Goldberger, A., et al.: Spontaneous ventricular tachyarrhythmia database (2007). https://physionet.org/content/mvtdb/1.0/
13. Gutiérrez-Tobal, G.C., Álvarez, D., Del Campo, F., Hornero, R.: Utility of AdaBoost to detect sleep apnea-hypopnea syndrome from single-channel airflow. IEEE Trans. Biomed. Eng. 63(3), 636–646 (2015)
14. Hwang, S.H., Lee, Y.J., Jeong, D.U., Park, K.S., et al.: Apnea-hypopnea index prediction using electrocardiogram acquired during the sleep-onset period. IEEE Trans. Biomed. Eng. 64(2), 295–301 (2016)
15. Inserm: Apnée du sommeil. une source de fatigue, mais aussi de maladies cardiovasculaires. Website page (2017). https://www.inserm.fr/dossier/apnee-sommeil
16. Jager, F., et al.: Long-term ST database: a reference for the development and evaluation of automated ischaemia detectors and for the study of the dynamics of myocardial ischaemia. Med. Biol. Eng. Comput. 41(2), 172–183 (2003)
17. Khandoker, A.H., Gubbi, J., Palaniswami, M.: Automated scoring of obstructive sleep apnea and hypopnea events using short-term electrocardiogram recordings. IEEE Trans. Inf Technol. Biomed. 13(6), 1057–1067 (2009)
18. Krasteva, V., Jekova, I.: Assessment of ECG frequency and morphology parameters for automatic classification of life-threatening cardiac arrhythmias. Physiol. Meas. 26(5), 707 (2005)
19. Liew, R.: Electrocardiogram-based predictors of sudden cardiac death in patients with coronary artery disease. Clin. Cardiol. 34(8), 466–473 (2011)
20. Mandal, S., Mondal, P., Roy, A.H.: Detection of ventricular arrhythmia by using heart rate variability signal and ECG beat image. Biomed. Signal Process. Control 68, 102692 (2021)
21. Marinucci, D., Sbrollini, A., Marcantoni, I., Morettini, M., Swenne, C.A., Burattini, L.: Artificial neural network for atrial fibrillation identification in portable devices. Sensors 20(12), 3570 (2020)
22. Moody, G., Mark, R.: A new method for detecting atrial fibrillation using R-R intervals. Comput. Cardiol. 10, 227–230 (1983)
23. Iyengar, N., Peng, C.K., Morin, R., Goldberger, A.L., Lipsitz, L.A.: Age-related alterations in the fractal scaling of cardiac interbeat interval dynamics. Am. J. Physiol. 271, 1078–1084 (1996)
24. Nguyen, H.D., Wilkins, B.A., Cheng, Q., Benjamin, B.A.: An online sleep apnea detection method based on recurrence quantification analysis. IEEE J. Biomed. Health Inform. 18(4), 1285–1293 (2013)
25. World Health Organization: L’oms lève le voile sur les principales causes de mortalité et d’incapacité dans le monde: 2000–2019. Website page (2020). https://www.who.int/fr/news/item/09-12-2020-who-reveals-leading-causes-of-death-and-disability-worldwide-2000-2019
26. Parsi, A., Byrne, D., Glavin, M., Jones, E.: Heart rate variability feature selection method for automated prediction of sudden cardiac death. Biomed. Signal Process. Control 65, 102310 (2021)
27. Parsi, A., Glavin, M., Jones, E., Byrne, D.: Prediction of paroxysmal atrial fibrillation using new heart rate variability features. Comput. Biol. Med. 133, 104367 (2021)
28. Penzel, T., Moody, G., Mark, R., Goldberger, A., Peter, J.: The apnea-ECG database. In: Computers in Cardiology 2000 (Cat. 00CH37163), vol. 27, pp. 255–258. IEEE (2000). https://doi.org/10.1109/CIC.2000.898505
29. Rohila, A., Sharma, A.: Detection of sudden cardiac death by a comparative study of heart rate variability in normal and abnormal heart conditions. Biocybern. Biomed. Eng. 40(3), 1140–1154 (2020)
30. Sbrollini, A., et al.: Serial electrocardiography to detect newly emerging or aggravating cardiac pathology: a deep-learning approach. Biomed. Eng. Online 18(1), 1–17 (2019)
31. Schuch, S., Tipper, S.P.: On observing another person’s actions: influences of observed inhibition and errors. Percept. Psychophys. 69(5), 828–837 (2007)
32. Greenwald, S.D.: Development and analysis of a ventricular fibrillation detector. Master’s thesis, MIT Dept. of Electrical Engineering and Computer Science, Cambridge (1986)
33. Shen, M., Zhang, L., Luo, X., Xu, J.: Atrial fibrillation detection algorithm based on manual extraction features and automatic extraction features. In: IOP Conference Series: Earth and Environmental Science, vol. 428, p. 012050. IOP Publishing (2020)
34. Smruthy, A., Suchetha, M.: Real-time classification of healthy and apnea subjects using ECG signals with variational mode decomposition. IEEE Sens. J. 17(10), 3092–3099 (2017)
35. Taye, G.T., Hwang, H.J., Lim, K.M.: Application of a convolutional neural network for predicting the occurrence of ventricular tachyarrhythmia using heart rate variability features. Sci. Rep. 10(1), 1–7 (2020)
36. Taye, G.T., Shim, E.B., Hwang, H.J., Lim, K.M.: Machine learning approach to predict ventricular fibrillation based on QRS complex shape. Front. Physiol. 10, 1193 (2019)
37. Xie, B., Minn, H.: Real-time sleep apnea detection by classifier combination. IEEE Trans. Inf Technol. Biomed. 16(3), 469–477 (2012)
38. Zarei, A., Asl, B.M.: Automatic classification of apnea and normal subjects using new features extracted from HRV and ECG-derived respiration signals. Biomed. Signal Process. Control 59, 101927 (2020)
39. Zarei, A., Asl, B.M.: Performance evaluation of the spectral autocorrelation function and autoregressive models for automated sleep apnea detection using single-lead ECG signal. Comput. Methods Programs Biomed. 195, 105626 (2020)
Probabilistic Forwarding in Named Data Networks for Internet of Things
Adel Salah Ould Khaoua1(B), Abdelmadjid Boukra2, and Fella Bey1
1 Department of Informatics, LRDSI, University of Blida 1, Blida, Algeria
[email protected]
2 Department of Computer Science, LSI, USTHB, Algiers, Algeria
[email protected]
Abstract. Named Data Networking (NDN) is a promising networking architecture for the emergent Internet of Things (IoT). Nevertheless, NDN mechanisms, such as interest forwarding, need to be optimized in order to accommodate the constraints of IoT devices based on Low-power and Lossy Networks (LLNs). This paper suggests that adopting probabilistic broadcast for interest forwarding enables NDN to fit naturally on LLNs without introducing any additional requirement on their constrained capabilities in terms of processing, storage, battery power, and lossy links. To the best of our knowledge, our study is among the first to explore the merits of probabilistic broadcast in the context of NDN over LLNs which are expected to be the backbone of numerous practical IoT applications. Our preliminary performance results reveal that the optimal forwarding probability for our considered scenarios can be noticeably higher than that reported in existing studies conducted in the context of IP-based wireless networks including MANETs and WSNs.
Keywords: IoT · IEEE 802.15.4 · Broadcast · ndnSIM · Performance evaluation
1 Introduction
The widespread use of low-cost sensing and actuating devices along with their inevitable connection to the Internet has led to the emergence of the Internet of Things (IoT). IoT brings together a wide range of tiny smart devices, called “things”, that have sensing, communication and computing capabilities but are often limited in processing, memory and energy, and use low-power communication technologies such as IEEE 802.15.4 [1] and 6LoWPAN [2]. Such systems are known as Low Power & Lossy Networks (LLNs). Data communication in constrained LLNs poses a number of challenging issues. For instance, in large-scale deployment of IoT systems, careful consideration should be devoted to ensuring reliable communication given the limited capabilities of LLNs. Moreover, applications must be able to adapt to highly dynamic network topologies which may be produced by disconnections caused by lossy links of LLNs. This is in addition to the mobility of “things”, which is a prevalent feature of numerous IoT applications found in e-health, smart grids, smart cities, transportation, and agriculture [3].
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 S. Chikhi et al. (Eds.): MISC 2022, LNNS 593, pp. 17–30, 2023. https://doi.org/10.1007/978-3-031-18516-8_2
A. S. Ould Khaoua et al.
A number of researchers [4–8] have advocated the adoption of the Information-Centric Networking (ICN) paradigm [9] as a clean-slate alternative to IP for satisfying the communication requirements of IoT applications. ICN is a new networking approach where “data” is fetched by “name” instead of host IP address. Since ICN is a data-driven model, it can naturally support user mobility and data sharing, which are important features prevalent in IoT environments. One of the most prominent implementations of ICN is Named-Data Networking (NDN) [10, 11]. NDN is a promising networking architecture for the emergent IoT due to its inherent salient features such as caching, naming and stateful forwarding. In NDN, pieces of content are named independently of their location, following a URL-like naming structure. Applications request content from the network by sending interest packets that pull back data packets. Native features come along with this principle, such as communication without establishing end-to-end connections and a name-to-address resolution. In addition, it is not necessary to maintain consumer-producer paths, thus providing native support of connection disruption resulting from mobility. Most NDN strategies employ a broadcasting (or flooding) mechanism for interest forwarding [4]. However, in wireless NDN where nodes possess one communication interface, the broadcast storm problem [12] could cause serious degradation to system performance due to excessive channel contention, redundant packet retransmissions and collisions. As a consequence, much research effort has been devoted to developing interest forwarding strategies that can mitigate the effects of this problem. For instance, Deferred Blind Flooding (DBF) [7] has been suggested for NDN over IEEE 802.11 [13]. Furthermore, a number of similar lightweight mechanisms have been proposed for wireless devices [5–7]. 
However, such mechanisms cannot be used directly over IEEE 802.15.4 due to its data-rate and resource limitations. In this paper, we argue that probabilistic broadcast, owing to its simplicity and ease of implementation, is an attractive alternative to blind flooding as well as to existing forwarding strategies, since the rebroadcast probability can be adjusted to reduce the number of retransmissions and consequently mitigate the effects of the broadcast storm problem. Moreover, it suits the limited capabilities of LLNs as far as computation, communication, and energy are concerned. Our research has been motivated by the fact that probabilistic solutions have been widely explored in the context of mobile ad hoc networks (MANETs) [14], vehicular networks (VANETs) [15], sensor networks (WSNs) [16], and more recently IoT networks [17], assuming a traditional IP setting. However, there has been hardly any research that has investigated the merits of probabilistic interest forwarding for NDN over LLNs. As an effort towards filling this gap in the current literature, the objective of the present study is to explore the performance merits of probabilistic interest forwarding for NDN over LLNs using extensive simulations that consider different network configurations for static as well as mobile scenarios. The rest of this paper is organized as follows. Section 2 briefly presents the NDN paradigm. Section 3 describes some of the related research work on interest forwarding strategies that have been suggested for wireless NDN. Section 4 describes probabilistic broadcast, which has been discussed mainly for conventional wireless IP networks. Section 5 describes the simulation model along with the system parameters used in our performance evaluation study. Section 5 also presents and discusses the obtained
performance results. Finally, Sect. 6 concludes this paper and offers possible extensions of this research in the future.
2 Named Data Networking (NDN) In NDN consumers acquire data from producers by means of data names instead of IP addresses, without any need for establishing a prior connection between consumers and producers. To do so, NDN uses two types of packets: interest and data (please refer to Fig. 1 for the format of such packets) [11]. To request data, a consumer sends out interest packets denoted with a specified name prefix. Data packets, however, are only transmitted in response to received interest packets. Data packets can be transmitted by any network node which holds the corresponding data item, whether it is the original producer holding it in memory or an intermediate node holding it in its cache.
Fig. 1. Packet types in NDN.
The structure of a node in wireless NDN is shown in Fig. 2. The network layer traditionally occupied by IP is replaced with an NDN layer [10]. A given node employs three data structures, notably the Content Store (CS), Pending Interest Table (PIT), and Forwarding Information Base (FIB). A node uses the CS to cache data packets temporarily, allowing them to be stored closer to consumers, and thus enabling them to satisfy interest packets quickly and with fewer transmissions to reach a consumer. A node uses the PIT to keep track of forwarded interest packets whose corresponding data packets have not yet reached the node. As a consequence, the PIT permits data packets to follow the reverse path to consumers. Moreover, the PIT permits nodes to identify and drop duplicate interest packets (i.e., interests with the same name prefix) in order to avoid routing loops that might be present in the network. Lastly, the FIB is populated by a routing protocol, as is the case with traditional routing tables, and contains name prefixes with the corresponding output faces leading towards potential data producers. Data in NDN is identified by names possessing a hierarchical sub-name structure, with each sub-name component separated by a slash similar to web addresses [10, 11]. For instance, country/city/town1/temperature would identify the temperature of town1 whereas country/city/town1/wind would correspond to the wind speed in town1. Data retrieval in NDN occurs as follows. A consumer sends out an interest packet with a specified name prefix in order to request data. Each node through which the interest packet passes first checks its CS for the corresponding data. If a match is found the data is sent back to the consumer without transmitting the interest any further. Otherwise, the node verifies if an interest corresponding to this name has already been registered
Fig. 2. The typical node architecture in NDN.
in the PIT. If a match is found, the interest is dropped. Otherwise, a new entry is created in the PIT for this name prefix along with the incoming interface. The FIB is then used to determine the interface through which the interest is forwarded in the network. When the interest arrives at a node that possesses the data (be it the original producer or an intermediate node), that node sends back a data packet containing the requested content, which follows the reverse path of the interest using the “breadcrumb trail” left in the PITs of the intermediate nodes. It is worth noting that if the consumer does not receive a data packet within a given time interval, it retransmits the interest. A consumer may have to perform several retransmissions before a data packet is received.
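The interest and data handling steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the ndnSIM implementation: the class `NdnNode`, its method names, the returned tuples, and the `BROADCAST` fallback face are all hypothetical names introduced here, and the PIT keeps a single incoming face per name because duplicates are dropped.

```python
from dataclasses import dataclass, field

BROADCAST = "broadcast"  # hypothetical fallback face for a single-radio wireless node


@dataclass
class NdnNode:
    """Toy NDN node with the three tables described above (illustrative only)."""
    content_store: dict = field(default_factory=dict)  # name -> cached data
    pit: dict = field(default_factory=dict)            # name -> incoming face
    fib: dict = field(default_factory=dict)            # name prefix -> outgoing face

    def lookup_fib(self, name):
        """Longest-prefix match on slash-separated name components."""
        parts = name.split("/")
        for end in range(len(parts), 0, -1):
            prefix = "/".join(parts[:end])
            if prefix in self.fib:
                return self.fib[prefix]
        return BROADCAST  # assumption: no FIB entry means broadcast

    def on_interest(self, name, in_face):
        if name in self.content_store:            # 1. CS hit: answer locally
            return ("data", in_face, self.content_store[name])
        if name in self.pit:                      # 2. PIT hit: duplicate, drop
            return ("drop", None, None)
        self.pit[name] = in_face                  # 3. record the breadcrumb
        return ("forward", self.lookup_fib(name), None)  # 4. consult the FIB

    def on_data(self, name, payload):
        in_face = self.pit.pop(name, None)
        if in_face is None:                       # assumption: unsolicited data is ignored
            return ("drop", None, None)
        self.content_store[name] = payload        # cache for later interests
        return ("data", in_face, payload)         # follow the reverse path
```

For example, a node whose FIB maps `country/city` to some face would forward a first interest for `country/city/town1/temperature`, drop a second one for the same name while the first is pending, and answer later interests from its CS once the data packet has passed through.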
3 Related Work It is crucial to develop forwarding strategies that can help alleviate the broadcast storm problem while preserving the scarce resources (e.g., battery power) of wireless NDN nodes. This section reviews some existing forwarding methods suggested for wireless NDN. We primarily assess the suitability of these methods in meeting the constraints of LLNs, most notably their communication, memory, and energy requirements. Deferred Blind Flooding (DBF) [7] uses a simple method to reduce broadcasts by means of delays. Prior to transmitting an interest, a node waits for a fixed period of time to listen to surrounding broadcasts. If the same name prefix is heard more often than a specific threshold, the transmission is cancelled. Due to the extensive use of time delays, this scheme may not be suitable for time-critical applications. The authors of [8] have described a Geographic Interest Forwarding (GIF) approach. GIF involves a neighbor discovery phase using HELLO control packets, and a producer discovery phase which is initiated by data producers in order to announce their presence to consumers. GIF uses the geo-coordinates of nodes in addition to HELLO control packets, which may result in a large number of transmissions. As a consequence, GIF may put too much strain on the scarce resources of constrained wireless networks such as LLNs. The study of [5] has proposed Reactive Optimistic Name-based Routing (RONR). Initially, interest packets are blindly flooded through the network. Once a consumer receives a data packet all subsequent interests are sent using the reverse path followed
by the initial data packet. This method does little to support mobility or failure of nodes, as any path breakage causes the scheme to default to blind flooding. In Neighborhood-Aware Interest Forwarding (NAIF) [6], the forwarding rate of a given node is adjusted for each name prefix based on the number of overheard interest packets with the same name prefix. If the number of packets heard is above a certain threshold, the forwarding rate for that name is lowered. Similar to DBF, this method may not be suitable for time-critical applications due to its listening periods. It also requires a large memory capacity, since nodes have to store information for each name prefix. Dual Mode Interest Forwarding (DMIF) [18] relies on two alternating interest forwarding modes: flooding and directed. If the name prefix of the interest to be forwarded is present in the FIB, the interest is forwarded in the directed mode to the next hop. Otherwise, this method defaults to blind flooding. DMIF requires memory, as information in the FIB has to be stored for each name prefix. The work of [19] has recently described a Learning-based Adaptive Forwarding Strategy (LAFS) for NDN-based IoT systems. Initially, interest packets are forwarded using a mechanism similar to DBF. The next phase in LAFS starts when producers respond with data packets. Upon receiving a data packet, each intermediate node stores information about the previous node, namely its ID and distance (in hop count) to the data producer, for each name prefix. The name prefix is then considered as marked, and every subsequently received interest packet for it is transmitted without delay, while unmarked interests are transmitted after a delay period as in DBF. While this method does succeed in reducing the number of flooded interest packets, it requires storing information related to each name prefix.
In Reinforced Learning (R-LF) [20] nodes use a reinforcement learning technique to adjust the amount of delay prior to transmitting an interest packet. Initially, interest packets are flooded until a producer node receives a copy of the flooded interest. The producer then responds with a data packet containing an initial cost based on the distance to the data producer and each subsequent node on the path then updates the cost until reaching the consumer. The nearby nodes that overhear the packet can also perform an update to learn their eligibility to the data producer. The cost is then constantly updated in the reinforcement phase and is used to compute the delay at each node. This method has been shown to exhibit a good interest satisfaction rate and low control overhead. However, nodes have to keep cost values for each name prefix which may result in high memory requirements. Listen First Broadcast Later (LFBL) [21] is an improvement of DBF whereby instead of having nodes delay for a random period of time, each node determines whether it is an eligible forwarder or not based on its distance to the data source. If a node is an eligible forwarder, it delays for a time period that is proportional to its distance to the producer. LFBL requires holding distance tables which can be a problem in a setting such as LLNs due to their memory limitations.
4 Probabilistic Broadcast Probabilistic broadcast schemes have been widely reported in the literature [14] due to their attractive features as they have been shown to manage to strike balanced energy
usage, reduced communication overhead, and resilience to failures and network dynamism. The survey study of [14] has classified the existing probabilistic broadcast schemes into fixed schemes, where all the network nodes have the same fixed forwarding probability, and adaptive schemes, where the forwarding probability is adaptively adjusted depending on some system parameter(s) such as network density, node energy, or the number of received packets. Most existing probabilistic broadcast schemes have been developed for the IEEE 802.11 technology [13], and as a result they may prove inadequate for LLN-based IoT systems. Moreover, broadcast schemes proposed for 6LoWPAN-RPL, such as those in [22, 23], suffer from a number of shortcomings as they rely on the RPL DODAG structure; they thus have little support for network mobility and can be affected by the “energy hole” problem [24], since the nodes closest to the root, where traffic is heavy, are prone to wasting much of their energy. In the fixed probabilistic scheme, a node retransmits a packet with probability p the first time the packet is received. In the present study, we assume that all nodes have the same probability of retransmitting a packet. This scheme turns into blind flooding when p = 1. Although probabilistic broadcast schemes have been discussed for wireless networks including MANETs [14], VANETs [15], and WSNs [16], their performance properties have not yet been explored for interest forwarding in the recently emerging IoT systems based on NDN over LLNs. The main challenge in probabilistic broadcast is the mechanism by which the optimal forwarding probability is decided. It has been demonstrated that a probability between p = 0.59 and p = 0.65 [14] can be considered as ideal for fixed probabilistic schemes in MANETs and WSNs.
However, it is not obvious whether such values are still valid for IoT environments, and in particular for those based on NDN over LLNs, since the operating principle of NDN is entirely different from that of traditional IP networks. Furthermore, the optimal forwarding probability may depend not only on the graph-theoretic properties of network topologies, including node density and distance, but also on the transmission technologies used. We say this because most existing research studies have investigated the performance of probabilistic broadcast schemes assuming IEEE 802.11 technology [14–16]. In contrast, LLNs are based on IEEE 802.15.4, and thus have different characteristics in terms of communication and battery power. More importantly, in order to prolong their battery lifetime, IEEE 802.15.4 devices employ a duty cycle mechanism that allows them to sleep by completely switching off their radio transceivers for predefined time periods and waking up for short time intervals to check for pending communication. Such a mechanism undoubtedly affects broadcast communication [24].
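The fixed probabilistic scheme described in this section can be illustrated with a small Monte Carlo sketch. It rests on strong simplifying assumptions that are ours and not the simulation model of Sect. 5: links exist only between horizontally or vertically adjacent grid nodes, there are no collisions, losses, duty cycling or consumer retransmissions, and the function names (`should_rebroadcast`, `interest_reaches_producer`, `reach_fraction`) are hypothetical.

```python
import random
from collections import deque


def should_rebroadcast(p, rng):
    """Fixed scheme: forward with probability p; p = 1 is blind flooding."""
    return rng.random() < p


def interest_reaches_producer(n, p, rng):
    """One idealized trial on an n x n grid (4-neighbour links, no loss)."""
    consumer, producer = (0, 0), (n - 1, n - 1)
    seen = {consumer}             # nodes that have already received the interest
    senders = deque([consumer])   # the consumer always transmits its own interest
    while senders:
        x, y = senders.popleft()
        for node in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if not (0 <= node[0] < n and 0 <= node[1] < n) or node in seen:
                continue          # off-grid, or a duplicate reception
            seen.add(node)        # the decision is taken once, on first reception
            if node == producer:
                return True
            if should_rebroadcast(p, rng):
                senders.append(node)
    return False


def reach_fraction(n, p, trials=1000, seed=0):
    """Fraction of trials in which the interest reaches the opposite corner."""
    rng = random.Random(seed)
    hits = sum(interest_reaches_producer(n, p, rng) for _ in range(trials))
    return hits / trials
```

Calling `reach_fraction(8, p)` for several values of p gives a feel for how quickly reachability degrades as p is lowered in this idealized setting. Because contention and collisions are ignored, the sketch should not be expected to reproduce the probability values reported in the literature or in Sect. 5.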
5 Performance Evaluation We have performed an extensive performance evaluation in order to determine the optimal forwarding probability in the context of NDN over LLNs. To do so, we have implemented probabilistic broadcast in ndnSIM 2.8 [25]. The IEEE 802.15.4 communication standard has been employed providing a data link service to the NDN layer which is responsible for forwarding interest and data packets. We have considered square grid
topologies where the distance between adjacent nodes is fixed at 50 m. Table 1 provides a summary of the main parameters which have been used in our present study. It is worth noting that such parameters have also been widely adopted in existing research work [4–7, 17, 18].

Table 1. Summary of the simulation parameters.

| Parameter | Value |
|---|---|
| MAC protocol | IEEE 802.15.4 |
| Transmission range (m) | 50 |
| Topology | Square grid |
| Network size | 6 × 6, 8 × 8, and 10 × 10 nodes |
| Simulation time | 400 |
| Interest transmission rate (packets/s) | 1 |
| Number of consumers | 1 |
| Number of producers | 1 |
| CS size (number of packets) | 8 |
| PIT size (number of entries) | 8 |
| Interest packet size (bytes) | 5 |
| Data packet size (bytes) | 10 |

We have used the following performance metrics, which have been widely adopted in similar evaluation studies [4–8].
• Interest satisfaction rate: the total number of data packets successfully received by the consumer over the total number of interest packets transmitted by the consumer.
• Number of sent interest packets: the total number of interest packets that have been forwarded in the network.
• Interest satisfaction latency: the time taken by an interest packet to reach the producer, plus the processing time of the interest at the producer, plus the time taken by the data packet to reach the consumer.
It is worth noting that the number of sent data packets is in general lower than the total number of sent interest packets, as the former are not flooded but rather follow the reverse path of interest packets to reach consumers. We have collected statistics for the number of data packets; however, we have not included them in the discussion below due to space constraints. In what follows, we present simulation results gathered while varying the forwarding probability p in order to assess the impact of this crucial factor on system performance. After that, we have selected the values of p which produce the best performance outcome for the probabilistic forwarding and compared it against
that of blind flooding, DBF proposed in [7], and LAFS, which has been recently proposed in [19].
5.1 Static Scenario We have considered three network sizes of 10 × 10, 8 × 8, and 6 × 6 nodes arranged in a square grid topology. In our experiments, the consumer is located at the corner whose coordinates are (0, 0) while the producer is at the opposite corner along the diagonal. The consumer issues interest packets, each with a different name prefix, at a rate of 1 packet/second. We have varied the forwarding probability p from 0.3 to 1; probabilities below 0.3 have been found not to permit any interests to reach the producer. As depicted in Fig. 3, system performance in terms of the interest satisfaction rate improves as the forwarding probability increases. A similar performance trend for probabilistic broadcast has been reported in other contexts such as MANETs and WSNs [14]. However, whereas existing studies have found that probability values between p = 0.59 and p = 0.65 provide the best performance and that any further increase in the forwarding probability yields diminishing returns (i.e., very little improvement in performance), in our case the best probability value is p ≈ 0.85, which is higher than that reported in [14]. When p ≈ 0.85, the interest satisfaction rate reaches 75%, which is comparable to that achieved by higher probability values including p = 0.90 and p = 1 (i.e., blind flooding). To justify why the best forwarding probability is higher in our scenario than in MANETs or WSNs, we should first mention that communication between adjacent nodes has been found to always occur along the X or Y dimension and not along the diagonal, as the signal loses much of its power over the longer diagonal distance. This results in the consumer and producer being separated by the longest distance in the network (i.e., the network diameter).
The higher p value also results from a higher chance of packet collisions due to the presence of the hidden and exposed node problems in the square grid topology. Moreover, the consumer and producer are located at opposite corners of the grid, and thus packets encounter fewer alternative paths as they approach these corners than in other network regions such as the center of the network. Figure 4 presents the results for the interest satisfaction latency (in µs). In contrast to the results above, latency drops sharply once the forwarding probability reaches p = 0.70, and further increases in p continue to reduce it; latency is found to always decrease with p. This is because, as the forwarding probability increases, an interest packet is more likely to reach the producer quickly, since most intermediate nodes transmit interest packets instead of dropping them, which avoids the need for interest retransmissions by the consumer. Figure 5 shows that the number of sent interest packets increases with the forwarding probability. This results in more traffic load on the network and therefore increased competition for network resources (i.e., channel access). Such competition increases the chance of packet collisions, which explains why the interest satisfaction rate reaches a plateau at 75% and does not increase even if p is increased further.
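The observation that links exist only along the X and Y dimensions can be checked with a short geometric sketch. It assumes an idealized unit-disc radio model (a link exists whenever two nodes are at most the transmission range apart), which is our simplification; the function name `corner_to_corner_hops` is hypothetical. The spacing and range default to the 50 m values of Table 1.

```python
import math
from collections import deque


def corner_to_corner_hops(n, spacing=50.0, tx_range=50.0):
    """Minimum hop count between opposite corners of an n x n grid."""
    nodes = [(col * spacing, row * spacing) for row in range(n) for col in range(n)]
    start, goal = nodes[0], nodes[-1]
    hops = {start: 0}
    queue = deque([start])
    while queue:                      # breadth-first search over radio links
        current = queue.popleft()
        if current == goal:
            return hops[current]
        for other in nodes:
            # Diagonal neighbours are about 70.7 m apart, beyond the 50 m range.
            if other not in hops and math.dist(current, other) <= tx_range + 1e-9:
                hops[other] = hops[current] + 1
                queue.append(other)
    return None                       # goal unreachable (e.g. range < spacing)


for size in (6, 8, 10):
    print(size, corner_to_corner_hops(size))  # prints 10, 14 and 18 hops
```

Under this model the corner-to-corner distance is 2(n − 1) hops, so every interest in the static scenario must survive at least 10, 14 or 18 consecutive forwarding decisions and transmissions, depending on the network size.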
Fig. 3. Interest satisfaction rate versus forwarding probability for different network sizes (10 × 10, 8 × 8, and 6 × 6).
Fig. 4. Interest satisfaction latency (in µs) versus forwarding probability for different network sizes (10 × 10, 8 × 8, and 6 × 6).
Fig. 5. Number of sent interest packets versus forwarding probability for different network sizes (10 × 10, 8 × 8, and 6 × 6).
5.2 Mobile Scenario In this set of experiments, the location of the consumer is fixed at the coordinates (0, 0), whereas the producer is allowed to move freely across the square grid according to the random waypoint model with a speed of 20 m/s and no pause time. Figure 6, which depicts the interest satisfaction rate, indicates that this rate increases with the forwarding probability, as in the static scenario. However, we have noticed that the interest satisfaction rate can reach up to 85%, which is higher than that achieved in the static case. This is because, due to producer mobility, interest packets often have to travel shorter distances in order to reach the producer. Similarly, data packets have to travel a shorter distance to reach the consumer. Figure 7 presents the interest satisfaction latency as a function of the forwarding probability. The forwarding probability p = 0.60 as opposed to p = 0.70 (for the static case) results in a considerable decrease in latency, and again further increase in p always
Fig. 6. Interest satisfaction rate versus forwarding probability for different network sizes (10 × 10, 8 × 8, and 6 × 6).
Fig. 7. Interest satisfaction latency (in µs) versus forwarding probability for different network sizes (10 × 10, 8 × 8, and 6 × 6).
decreases latency. Examining Fig. 8 for the number of sent interest packets reveals performance trends similar to those observed in the static case.
5.3 Performance Comparison We have compared the performance of probabilistic interest forwarding against that of blind flooding, DBF and LAFS. We limit our discussion to the static scenario due to space limitations and also because the relative performance merits of these competing schemes have been found not to change much in the mobile scenario. Again, the consumer and producer are located at opposite corners of the square grid. The consumer generates interest packets at a rate of 1 packet/second. Figure 9 depicts the interest satisfaction rates. We notice that setting p = 0.85 enables probabilistic forwarding to achieve a performance similar to that of blind flooding and LAFS. DBF has been found to exhibit the lowest interest satisfaction rate. This is because in DBF intermediate nodes may drop interest packets if a certain number of duplicate packets are heard, regardless of whether these packets happen to be close to the consumer or producer nodes.
Fig. 8. Number of sent interest packets versus forwarding probability for different network sizes (10 × 10, 8 × 8, and 6 × 6).
Fig. 9. Interest satisfaction rate in different forwarding schemes. Network size = 8 × 8.
Fig. 10. Interest satisfaction latency (in µs) in different forwarding schemes. Network size = 8 × 8.
Fig. 11. Number of sent interest packets in different forwarding schemes. Network size = 8 × 8.
We notice in Fig. 10, which presents the interest satisfaction latency, that whereas latency in the probabilistic scheme is comparable to that of blind flooding, it is slightly higher than that of DBF and LAFS. On the other hand, examining Fig. 11 reveals that in probabilistic forwarding with p = 0.85 the number of sent interest packets is lower by 15% than in the other competing forwarding methods. DBF manages to lower the number of sent interest packets but at the expense of severely degrading the interest satisfaction rate (see Fig. 9).
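For readers who wish to reproduce the comparison, the consumer-side metrics defined at the start of Sect. 5 can be computed from a simple per-interest log, as sketched below. The record layout (one `(sent_time, received_time)` pair per interest, with `None` for unsatisfied interests) and the function names are assumptions made for illustration; they do not correspond to any ndnSIM tracer format.

```python
def satisfaction_rate(records):
    """Data packets received by the consumer over interests it transmitted."""
    if not records:
        return 0.0
    satisfied = sum(1 for _, received in records if received is not None)
    return satisfied / len(records)


def mean_satisfaction_latency(records):
    """Average time between sending an interest and receiving its data."""
    latencies = [received - sent for sent, received in records if received is not None]
    if not latencies:
        return None               # no interest was satisfied
    return sum(latencies) / len(latencies)


# Hypothetical log: four interests, one of which was never satisfied.
log = [(0.0, 0.25), (1.0, None), (2.0, 2.75), (3.0, 3.5)]
print(satisfaction_rate(log))          # 0.75
print(mean_satisfaction_latency(log))  # 0.5
```

The number of sent interest packets is a network-wide count and would have to be accumulated separately at every forwarding node.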
6 Conclusions Probabilistic broadcast has been widely explored for various classes of wireless networks including ad hoc, vehicular, and sensor networks. However, there has been hardly any research that has investigated the performance merits of such a scheme in the context of Named Data Networking (NDN) over low-power and lossy networks (LLNs), which are based on the IEEE 802.15.4 standard. In an attempt to fill this gap, this paper has suggested the adoption of probabilistic broadcast for interest forwarding in NDN over LLNs. The simplicity and ease of implementation of probabilistic broadcast schemes make them suitable for NDN over LLNs, as they do not impose any additional requirement in terms of computation, communication, or storage on such constrained devices.
Furthermore, probabilistic broadcast schemes do not need any topology specifications, and consequently enable NDN to operate alongside any routing protocol; they can also easily adapt to mobile scenarios and help mitigate network dynamics. Our simulation results have revealed that, for the scenarios examined, setting the forwarding probability at around p ≈ 0.85 can yield good system performance in terms of the interest satisfaction rate. Moreover, such a forwarding probability has been shown to achieve an interest satisfaction rate comparable to that of blind flooding and LAFS, and higher than that of DBF, while requiring a lower number of sent interest packets than blind flooding and LAFS, and consequently resulting in lower energy consumption. In the future, we plan to introduce enhancements to probabilistic forwarding by allowing nodes to dynamically adjust the forwarding probability in response to varying network conditions such as traffic congestion and node mobility. Moreover, we plan to extend our performance comparison to include other forwarding schemes such as R-LF and DMIF.
References
1. IEEE: IEEE Standard for local and metropolitan area networks - Part 15.4: Low-rate wireless personal area networks (LR-WPANs). New York, USA (2016)
2. Hui, J., Thubert, P.: Compression format for IPv6 datagrams over IEEE 802.15.4-based networks. Internet Engineering Task Force (IETF). Fremont, CA, USA (2011)
3. Majid, M., et al.: Applications of wireless sensor networks and internet of things frameworks in the industry revolution 4.0: a systematic literature review. Sensors 22(6), 2087 (2022)
4. Djama, A., Djamaa, B., Senouci, M.R.: Information-centric networking solutions for the internet of things: a systematic mapping review. Comput. Commun. 159, 37–59 (2020)
5. Baccelli, E., Mehlis, C., Hahm, O., Schmidt, T.C., Wählisch, M.: Information centric networking in the IoT: experiments with NDN in the wild. In: Proceedings of 1st ACM Conference on Information-Centric Networking (ICN 2014), pp. 77–86. New York, NY, USA (2014)
6. Yu, Y.T., Dilmaghani, R.B., Calo, S., Sanadidi, M.Y., Gerla, M.: Interest propagation in named data MANETs. In: Proceedings of 2013 International Conference on Computing, Networking & Communications (ICNC), pp. 1118–1122 (2013). https://doi.org/10.1109/ICCNC.2013.6504249
7. Amadeo, M., Molinaro, A., Ruggeri, G.: E-CHANET: routing, forwarding and transport in information-centric multihop wireless networks. Comput. Commun. 36(7), 792–803 (2013)
8. Aboud, A., Touati, H., Hnich, B.: Efficient forwarding strategy in an NDN-based internet of things. Clust. Comput. 22(3), 805–818 (2019)
9. Ahlgren, B., et al.: A survey of information-centric networking. IEEE Commun. Mag. 50(7), 26–36 (2012)
10. Jacobson, V., Smetters, D.K., Thornton, J.D., Plass, M.F., Briggs, N.H., Braynard, R.L.: Networking named content. In: Proceedings of 5th International Conference on Emerging Networking Experiments and Technologies, pp. 1–12. ACM (2009)
11. Wang, L., Afanasyev, A., Kuntz, R., Vuyyuru, R., Wakikawa, R., Zhang, L.: Rapid traffic information dissemination using named data. In: Proceedings of 1st ACM Workshop on Emerging Name-Oriented Mobile Networking Design - Architecture, Algorithms, and Applications, vol. 12, pp. 7–12. ACM, New York, NY, USA (2012)
12. Tseng, Y.-C., et al.: The broadcast storm problem in a mobile ad hoc network. Wireless Netw. 8(2), 153–167 (2002)
13. IEEE Std. 802.11-1999: Part 11: Wireless LAN Medium Access Control (MAC) and Physical Layer (PHY) Specifications. Reference number ISO/IEC 8802-11:1999(E), IEEE Std. 802.11 (1999)
14. Reina, D.G., et al.: A survey on probabilistic broadcast schemes for wireless ad hoc networks. Ad Hoc Netw. 25, 263–292 (2014)
15. Zeng, X., Yu, M., Wang, D.: A new probabilistic multi-hop broadcast protocol for vehicular networks. IEEE Trans. Veh. Technol. 67(12), 12165–12176 (2018)
16. Galarza, C.E., et al.: A novel theoretical probabilistic model for opportunistic routing with applications in energy consumption for WSNs. Sensors 21, 8058 (2021)
17. Ali-Fedila, D., Ould-Khaoua, M.: Performance evaluation of probabilistic broadcast in low-power and lossy networks. In: Proceedings of 20th International Conference on Ubiquitous Computing and Communications (IUCC), IEEE Computer Society, London, pp. 247–254 (2021)
18. Zhang, H., Gao, S., Zhang, B.: Energy efficient interest forwarding in NDN-based wireless sensor networks. Mob. Inf. Syst. 2016, 1–15 (2016)
19. Djama, A., Djamaa, B., Senouci, M.R., Khemache, N.: LAFS: a learning-based adaptive forwarding strategy for NDN-based IoT networks. Ann. Telecommun. 77(5), 1–20 (2021). https://doi.org/10.1007/s12243-021-00850-2
20. Abane, A., Daoui, M., Bouzefrane, S., Mühlethaler, P.: A lightweight forwarding strategy for Named Data Networking in low-end IoT. J. Netw. Comput. Appl. (JNCA) 148, 102445 (2019)
21. Meisel, M., Pappas, V., Zhang, L.: Listen first, broadcast later: topology-agnostic forwarding under high dynamics. In: Proceedings of Annual Conference of International Technology Alliance in Network and Information Science (2010)
22. Oikonomou, G., Phillips, I.: Stateless multicast forwarding with RPL in 6LowPAN sensor networks. In: 2012 IEEE International Conference on Pervasive Computing and Communications Workshops, pp. 272–277. IEEE (2012)
23. Levis, P., Clausen, T., Hui, J., Gnawali, O., Ko, J.: The trickle algorithm. Internet Eng. Task Force, RFC6206 (2011)
24. Guclu, S.S., Ozcelebi, T., Lukkien, J.J.: Improving broadcast performance of radio duty-cycled internet-of-things devices. In: Proceedings of 2016 IEEE Global Communications Conference (GLOBECOM 2016), Washington, DC, USA, pp. 4–8 (2016)
25. ndnSIM Simulator, 08 (2020). https://ndnsim.net/current/
Multi-layer Perceptron for Intrusion Detection Using Simulated Annealing
Sarra Cherfi(B), Ammar Boulaiche, and Ali Lemouari
LaRIA Laboratory, Computer Sciences Department, University of Jijel, Jijel, Algeria
[email protected]
Abstract. Today, due to the evolution of technology and the large-scale use of the Internet, securing everything is becoming an unavoidable necessity and a challenge for most companies. Since the traditional means of security have become insufficient due to the increase in the number and types of computer attacks that appear almost daily, researchers in the field of computer security are developing security tools based on artificial intelligence concepts to detect new attacks. In this work, we propose a binary classification method for intrusion detection that achieves high accuracy, precision and recall rates. This approach is based on a multi-layer perceptron and uses both the Pearson correlation coefficient and simulated annealing to select attributes from the three datasets used for generating and evaluating the model, namely NSL-KDD, UNSW-NB15 and CICIDS2017. We obtained 97.02% accuracy for NSL-KDD, 92.32% accuracy for UNSW-NB15 and 97.70% for CICIDS2017.
Keywords: Intrusion detection · Multi-layer perceptron · Pearson correlation coefficient · Simulated annealing
1 Introduction
The increase in connectivity between computer networks, and the corresponding increase in reliance on networked information systems, has led to a dramatic increase in the need for robust security to enforce access restrictions and prevent intrusion on secure systems by external as well as internal users. A network intrusion occurs when an intruder launches one or more attacks to gain unauthorised access to system resources by exploiting their vulnerabilities. Computer security has therefore become a challenge for network administrators as well as for users. In recent years, researchers have proposed several security techniques to ensure a high level of security and to provide functionalities that classical methods (firewall, anti-virus, etc.) could not provide [1,2]. Intrusion detection systems (IDS) are one such technique. They examine network traffic to detect any violation of the security policy and warn network managers via alerts. Depending on their operating principle, IDSs can be classified into two broad categories: those that seek to detect attack signatures [3,4] (the signature or scenario approach) and those that seek to detect anomalies [5] (the behavioural approach). The first approach compares the system's usage behaviour with previously known attack signatures, while the second learns what is considered normal traffic and detects anything that deviates from this behaviour.

Recently, researchers have turned to machine learning techniques for anomaly detection. These techniques have proven effective in detecting many attacks, even new ones, thanks to the learning phase that precedes deployment. Among these methods are decision trees, Bayesian networks and neural networks [6]. Neural networks are inspired by the way the human brain works, which is totally different from that of a computer. The human brain is based on a very complicated parallel and non-linear information processing system, which allows it to organise its components to process very complicated problems, such as pattern recognition, in a very efficient and fast way [7]. The multilayer perceptron (MLP) is one of the most widely used neural networks for approximation, classification and prediction problems. This type of network belongs to the general family of forward propagation networks, i.e. in normal use, information propagates in a single direction from inputs to outputs without any feedback. It is based on supervised learning by error correction: only the error signal is back-propagated towards the inputs to update the weights of the neurons. It usually consists of two or three layers of fully connected neurons.

Our work consists in finding a high-performance architecture that detects attacks with a high detection rate, using a multilayer neural network for binary classification as the model generation tool. We validated our approach on the NSL-KDD, UNSW-NB15 and CICIDS2017 datasets, which are used for both training and testing. Attribute selection is performed by applying the Pearson correlation coefficient and simulated annealing to the attributes of these datasets.

The rest of this paper is organized as follows. Section 2 summarizes the related works. Section 3 presents our intrusion detection model by showing the generation process, the techniques used in pre-processing, and the methods and tools used for classification and model validation. The experimental results are analyzed in Sect. 4. Finally, we draw conclusions in Sect. 5.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023. S. Chikhi et al. (Eds.): MISC 2022, LNNS 593, pp. 31–45, 2023. https://doi.org/10.1007/978-3-031-18516-8_3
2 Related Work

Given the importance of the subject, several research works in the field of intrusion detection have been developed in recent years. Depending on the method used, this work can be divided into three areas: studies that focus on attribute selection as an optimisation tool, studies that focus on the technique used to generate the model, and studies that combine the two to obtain better results.

We start with Tang et al., who introduced a DNN for network intrusion detection in software defined networking [8]. They used only 6 attributes from NSL-KDD by applying a PCA transformation on the set of attributes, then they successively applied two algorithms on the resulting set, namely the genetic algorithm and PSO, before moving to the learning phase, where they used a modular MNN. In [9], Wahba et al. proposed a hybrid method for attribute selection using the correlation coefficient combined with information gain. This technique was applied to the attributes of the NSL-KDD dataset before passing the dataset to the training process, where they used a naive Bayesian classifier with the Adaptive Boosting (AdaBoost) technique. Ustebay et al. [10] developed an intrusion detection system using the CICIDS2017 dataset and recursive feature elimination based on random forest. They then used a deep multilayer perceptron (DMLP) to train their model. In [11], Kasongo et al. focused on IDS developed through machine learning. They used the UNSW-NB15 dataset for training and testing, after applying the XGBoost algorithm for attribute selection. They then implemented several machine learning approaches: Support Vector Machine (SVM), K-Nearest Neighbour (KNN), Logistic Regression (LR), Artificial Neural Network (ANN) and Decision Tree (DT). Vinayakumar et al. [12] also proposed a deep learning based intrusion detection system that uses a deep neural network to classify and predict attacks. They applied this technique to several datasets (NSL-KDD, UNSW-NB15, Kyoto, CICIDS2017, etc.) to demonstrate the efficiency and flexibility of the algorithm. In [13], the authors proposed a new algorithm for network intrusion detection systems (NIDS) that uses the genetic algorithm and fuzzy c-means (FCM) for attribute selection, together with convolutional neural networks (CNN). Using a double Particle Swarm Optimization (PSO)-based algorithm, Elmasry et al. [14] selected both the feature subset from two datasets (NSL-KDD and CICIDS2017) and the hyperparameters in a single process. To investigate the performance differences, they used three deep learning models, namely Deep Neural Networks (DNN), Long Short-Term Memory Recurrent Neural Networks (LSTM-RNN), and Deep Belief Networks (DBN).
3 Proposed Approach

This section briefly describes the datasets used and our proposed approach to detecting attacks. We detail the methods used in pre-processing the data, in particular attribute selection, where two methods are applied successively to obtain the best results. The next step splits the data into two parts, one for learning and the other for testing. Both tasks are carried out with a multilayer neural network.

3.1 Classification Model

To develop an intrusion detection model capable of classifying TCP/IP traffic into two categories, normal or attack, we follow the main steps of any classification model. These steps are grouped into the three phases illustrated in Fig. 1: pre-processing, learning and testing.
Fig. 1. Flow chart of our intrusion detection model
3.2 Data Pre-processing

Pre-processing modifies the dataset so that it can be read by algorithms such as neural networks. This phase is therefore essential for generating a reliable and coherent classification model. Three steps were carried out on the datasets before using them: encoding of categorical data, normalisation and attribute selection.

3.2.1 Encoding Categorical Data

Neural network models only accept numeric attributes, so all categorical data must be transformed into numeric data through a precise encoding. In our case, we applied one-hot encoding on top of the ordinal (integer) representation: the integer-encoded variable is removed and one new binary variable is added for each unique integer value [15]. For instance, the NSL-KDD dataset has three nominal features, namely "protocol_type", "service" and "flag". The "protocol_type" feature takes three values, "tcp", "udp" and "icmp", which are encoded as the binary vectors (1, 0, 0), (0, 1, 0) and (0, 0, 1).
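The encoding described above can be sketched in a few lines of plain Python. This is an illustrative sketch only, not the authors' implementation: the function name `one_hot_encode` and the constant `PROTOCOL_TYPES` are assumptions introduced here, and the category order simply follows the order given in the text.

```python
def one_hot_encode(values, categories):
    """Return one binary vector per value, with a single 1 at the category's position."""
    position = {category: i for i, category in enumerate(categories)}
    encoded = []
    for value in values:
        vector = [0] * len(categories)
        vector[position[value]] = 1  # raises KeyError for an unseen category
        encoded.append(vector)
    return encoded


# Category order as listed in the text for the NSL-KDD protocol_type feature.
PROTOCOL_TYPES = ["tcp", "udp", "icmp"]

if __name__ == "__main__":
    for name, vector in zip(PROTOCOL_TYPES, one_hot_encode(PROTOCOL_TYPES, PROTOCOL_TYPES)):
        print(name, vector)
```

Each categorical column is replaced by as many binary columns as it has distinct values, so a feature with many values, such as "service", widens the input layer considerably.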
3.2.2 Normalization

To improve the performance of the generated model, the numerical values obtained after the encoding phase must be rescaled, since they span very different ranges. For example, some attributes of the NSL-KDD dataset, such as src_bytes and dst_bytes, take large values, while others, such as serror_rate and same_srv_rate, take only small values. To avoid this problem, a transformation is applied to the data. In this work, we standardized the features by removing the mean and scaling to unit variance. The standard score Z of a sample X is calculated as:

$$Z = \frac{X - U}{S} \tag{1}$$

where:
– U is the mean of the training samples.
– S is the standard deviation of the training samples.
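As a minimal sketch of Eq. (1), the mean and standard deviation can be estimated on the training samples only and then reused for any other data. The function names below are hypothetical, and the use of the population standard deviation (division by n) and the handling of constant columns are assumptions, since the text does not specify them.

```python
import math


def fit_standardizer(train_column):
    """Estimate U (mean) and S (standard deviation) from the training samples."""
    n = len(train_column)
    mean = sum(train_column) / n
    variance = sum((x - mean) ** 2 for x in train_column) / n  # population variance (assumed)
    return mean, math.sqrt(variance)


def standardize(column, mean, std):
    """Apply Z = (X - U) / S; a constant column (S = 0) is mapped to zeros."""
    if std == 0:
        return [0.0 for _ in column]
    return [(x - mean) / std for x in column]


if __name__ == "__main__":
    train = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # made-up values
    u, s = fit_standardizer(train)
    print(u, s, standardize([9.0, 5.0], u, s))
```

Fitting U and S on the training samples and reusing them for validation and test data keeps information about the held-out records from leaking into the model.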
3.2.3 Pearson Correlation Coefficient Based Attributes Selection

The datasets used in our work contain many attributes that would normally all serve as inputs to our neural model. Using all of them, however, risks degrading the intrusion detection model in terms of both resource usage and response time. We therefore select the most significant and relevant attributes, using two successive methods: the Pearson correlation coefficient and simulated annealing. The Pearson coefficient is an index reflecting a linear relationship between two continuous variables. It varies between −1 and +1: a value of 0 reflects no linear relationship between the two variables, a negative value (negative correlation) means that when one variable increases the other decreases, and a positive value (positive correlation) indicates that the two variables vary together in the same direction.
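The Pearson correlation coefficient is calculated as follows [16]:

$$r(X, Y) = \frac{\operatorname{cov}(X, Y)}{\sigma_X \sigma_Y} \tag{2}$$

So:

$$r(X, Y) = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_{i=1}^{n} (X_i - \bar{X})^2}\,\sqrt{\sum_{i=1}^{n} (Y_i - \bar{Y})^2}} \tag{3}$$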
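Eq. (3) translates directly into code. The text does not state how the coefficient is turned into a selection rule, so the filter below is only one plausible reading: it drops an attribute when it is highly correlated with an attribute that has already been kept. The function names and the threshold of 0.9 are assumptions made for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences (Eq. 3)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = math.sqrt(sum((x - mean_x) ** 2 for x in xs)) * math.sqrt(
        sum((y - mean_y) ** 2 for y in ys)
    )
    # A constant column has zero variance; its correlation is treated as 0 (assumption).
    return numerator / denominator if denominator else 0.0


def drop_correlated(columns, threshold=0.9):
    """Keep a column only if |r| with every already kept column is at most threshold."""
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[other])) <= threshold for other in kept):
            kept.append(name)
    return kept


if __name__ == "__main__":
    toy = {"a": [1, 2, 3, 4], "b": [2, 4, 6, 8], "c": [1, 0, 1, 0]}  # made-up columns
    print(drop_correlated(toy))
```

In the toy data, column "b" is an exact multiple of "a" and carries no additional linear information, which is why it is the one discarded.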
The resulting value r is an estimate of the correlation between the two continuous variables in the population, where X̄ and Ȳ are the sample means and n is the number of samples.

3.2.4 Simulated Annealing Based Attributes Selection

The simulated annealing method [17] is an optimisation algorithm often used when computing the exact optimal solution would require too much computation time. Historically, this technique takes its name and inspiration from thermodynamics, and more specifically from the way metals are heated and then cooled. This process, used in metallurgy to improve the quality of a solid, seeks a state of minimum energy that corresponds to a stable structure of the solid. Starting from a high temperature at which the solid has become liquid, the cooling phase leads the liquid material to regain its solid form through a progressive decrease in temperature. Each temperature is maintained until the material reaches thermodynamic equilibrium. When the temperature tends towards zero, only transitions from one state to a lower energy state are possible.

Metropolis et al. [18] were the first to implement this type of principle in numerical computation, as early as 1953. They used a stochastic method to generate a sequence of successive states of the system starting from a given initial state. Each new state is obtained by subjecting an atom to a random displacement (perturbation). Let ΔE be the energy difference caused by such a perturbation. The new state is accepted if the energy of the system decreases (ΔE ≤ 0). Otherwise, it is accepted with a probability defined by:

$$p(\Delta E, T) = e^{-\Delta E / (C_b T)}$$

where T is the temperature of the system and C_b is a physical constant known as Boltzmann's constant. Simulated annealing iteratively applies the Metropolis algorithm¹ to generate a sequence of configurations that tend towards thermodynamic equilibrium:

Algorithm 1: Simulated Annealing
Data: Set of all attributes
Result: Subset of the best attributes
t ← temperature;    /* initially a high number */
S ← some initial candidate solution;
Best ← S;
while Best is not the ideal solution do
    R ← Neighbour(S);
    ΔE ← Quality(R) − Quality(S);
    if ΔE > 0 or u < e^(ΔE/t), with u drawn at random from [0, 1], then
        S ← R;
    end
    if Quality(S) > Quality(Best) then
        Best ← S;
    end
    Decrease t;
end
return Best;

¹ The algorithm was named after Nicholas Metropolis, who along with Arianna W. Rosenbluth, Marshall N. Rosenbluth, Augusta H. Teller and Edward Teller wrote the seminal 1953 paper, Equation of State Calculations by Fast Computing Machines, proposing the algorithm for the specific case of the Boltzmann distribution. Keith W. Hastings extended it to the more general case in 1970.
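Algorithm 1 can be sketched for attribute selection as follows, with a solution represented as a list of booleans (one per attribute). This is a hedged illustration rather than the authors' code: the single-bit-flip neighbour, the fixed random seed and the toy quality function are assumptions. In the paper the quality of a subset is the accuracy of the trained model, which is replaced here by a cheap stand-in so that the example runs on its own. The temperature, cooling rate and stop condition are the values the paper lists in Table 1.

```python
import math
import random


def simulated_annealing(n_features, quality, t=100.0, cooling=0.997, t_min=1e-4, seed=0):
    """Maximise quality(mask) over boolean masks; returns (best_mask, best_quality, steps)."""
    rng = random.Random(seed)
    current = [rng.random() < 0.5 for _ in range(n_features)]
    current_q = quality(current)
    best, best_q = current[:], current_q
    steps = 0
    while t >= t_min:  # stop condition: T < 0.0001
        neighbour = current[:]
        flip = rng.randrange(n_features)  # assumed neighbourhood: toggle one attribute
        neighbour[flip] = not neighbour[flip]
        neighbour_q = quality(neighbour)
        delta = neighbour_q - current_q
        # Always accept improvements; accept a worse neighbour with probability e^(delta/t).
        if delta > 0 or rng.random() < math.exp(delta / t):
            current, current_q = neighbour, neighbour_q
        if current_q > best_q:
            best, best_q = current[:], current_q
        t *= cooling  # geometric cooling
        steps += 1
    return best, best_q, steps


USEFUL = {0, 2, 5}  # hypothetical "relevant" attributes for the toy objective


def toy_quality(mask):
    """Stand-in for model accuracy: reward useful attributes, penalise the others."""
    hits = sum(1 for i, keep in enumerate(mask) if keep and i in USEFUL)
    extras = sum(1 for i, keep in enumerate(mask) if keep and i not in USEFUL)
    return hits - 0.5 * extras


if __name__ == "__main__":
    mask, score, iterations = simulated_annealing(8, toy_quality)
    print([i for i, keep in enumerate(mask) if keep], score, iterations)
```

With a starting temperature of 100 and a cooling rate of 0.997, the temperature needs roughly 4,600 multiplications to fall below 0.0001, so the loop evaluates about that many candidate subsets. When each evaluation means training a network, this count dominates the total running time.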
3.3 Learning and Generating the Classification Model

The objective of this work is to establish a behavioural intrusion detection model based on the multilayer perceptron (MLP). This type of neural network, which consists of neurons distributed in layers, is trained here with the Adam optimization algorithm, an extension of stochastic gradient descent that is widely used for training deep learning models. To classify the data into only two classes, attack or normal, the output layer has a single neuron that can take two values:
– 0: normal traffic.
– 1: attack, which covers all types of attacks: DoS, U2R, Probe, etc.

The learning phase consists of training the MLP on a training dataset. This process is repeated until the squared error falls below a given threshold. The weights obtained from the updates made during training are then used to compute the output for each record in the test dataset. Table 1 shows the hyper-parameters used for this task; the dataset is split into three separate parts (60% for training, 20% for validation and 20% for testing).
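The 60/20/20 partition can be sketched as a shuffled split. The paper does not say whether the split is random, stratified or seeded, so the shuffling, the seed and the function name below are assumptions.

```python
import random


def split_dataset(records, train=0.6, validation=0.2, seed=42):
    """Shuffle a copy of the records and cut it into train, validation and test parts."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train)
    n_validation = round(len(shuffled) * validation)
    return (
        shuffled[:n_train],
        shuffled[n_train:n_train + n_validation],
        shuffled[n_train + n_validation:],  # whatever remains is the test part
    )


if __name__ == "__main__":
    training, validating, testing = split_dataset(range(100))
    print(len(training), len(validating), len(testing))
```

For imbalanced intrusion datasets, a stratified split that preserves the normal/attack ratio in each part may be preferable; the sketch above does not do this.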
Table 1. Defined hyperparameters

Simulated annealing
  Temperature                    100
  Cooling rate                   0.997
  Stop condition                 T < 0.0001

MLP
  Number of hidden layers        1
  Number of neurons per layer    50
  Learning rate                  0.001
  Epochs                         100
  Activation function            Sigmoid, 1 / (1 + e^(−x))
  Loss function                  Binary cross-entropy
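A forward pass through a network with the shape listed in Table 1 (one hidden layer of 50 sigmoid units, a single output neuron, binary cross-entropy loss) might look as follows. This is a plain-Python sketch, not the authors' Keras implementation: the sigmoid on the output neuron, the weight initialisation, the four-feature input and all function names are assumptions, and training with Adam is omitted.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def init_layer(n_inputs, n_units, rng):
    """Small random weights and zero biases (an assumed initialisation)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_inputs)] for _ in range(n_units)]
    biases = [0.0] * n_units
    return weights, biases


def dense_sigmoid(inputs, weights, biases):
    """Fully connected layer followed by the sigmoid activation."""
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def predict_proba(features, hidden, output):
    """Probability that a record is an attack (class 1)."""
    hidden_activations = dense_sigmoid(features, *hidden)
    return dense_sigmoid(hidden_activations, *output)[0]


def binary_cross_entropy(y_true, y_prob, eps=1e-12):
    """Loss for one record; y_prob is clipped to avoid log(0)."""
    y_prob = min(max(y_prob, eps), 1.0 - eps)
    return -(y_true * math.log(y_prob) + (1 - y_true) * math.log(1.0 - y_prob))


if __name__ == "__main__":
    rng = random.Random(0)
    hidden = init_layer(4, 50, rng)   # 4 input features is an arbitrary choice
    output = init_layer(50, 1, rng)
    p = predict_proba([0.5, -1.0, 0.0, 2.0], hidden, output)
    print(p, binary_cross_entropy(1, p))
```

A record would be labelled as an attack when the predicted probability exceeds a decision threshold; 0.5 is the usual default, although the paper does not state which threshold was used.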
3.4 Testing and Evaluation

In this step, we estimate the quality of our intrusion detection model relative to existing models, using a test dataset in which all instances are already labelled. To compare our model with the others, we compute several metrics during the test phase, such as accuracy, precision and recall. These metrics are defined as follows:

• The confusion matrix: a matrix that gathers the observations in rows and the predictions in columns. Each element is the number of examples corresponding to that case.

                   Predicted class
  Actual class     Normal    Attack
  Normal           TN        FP
  Attack           FN        TP

– A true negative (TN) is normal activity correctly classified as normal.
– A false positive (FP) is normal activity misclassified as attack.
– A false negative (FN) is an attack misclassified as normal activity.
– A true positive (TP) is an attack correctly classified as attack.

• Accuracy (or success rate): the ratio between the correctly classified records and the total number of test records. It indicates how correct the detection technique is.

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \times 100\% \tag{4}$$

• Recall: the rate of correctly detected intrusions relative to the total number of intrusions.

$$\text{Recall} = \frac{TP}{TP + FN} \times 100\% \tag{5}$$

• Precision: also called the recognition rate, the proportion of positive predictions that are in fact positives.

$$\text{Precision} = \frac{TP}{TP + FP} \times 100\% \tag{6}$$

• F1-Score (harmonic mean): a metric that combines precision and recall into a single number between 0 and 1 (reported here as a percentage). It gives a summary evaluation of the classification.

$$\text{F1-Score} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \tag{7}$$
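Eqs. (4) to (7) can be computed directly from the four cells of the confusion matrix. The sketch below uses made-up counts purely for illustration; the function name and the choice to return 0 when a denominator is zero are assumptions.

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, recall, precision and F1-score, all expressed as percentages."""
    def ratio(numerator, denominator):
        # Return 0 instead of dividing by zero (assumed convention).
        return 100.0 * numerator / denominator if denominator else 0.0

    accuracy = ratio(tp + tn, tp + tn + fp + fn)
    recall = ratio(tp, tp + fn)
    precision = ratio(tp, tp + fp)
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "recall": recall, "precision": precision, "f1": f1}


if __name__ == "__main__":
    # Hypothetical counts: 90 attacks caught, 10 missed, 85 normal kept, 15 false alarms.
    for name, value in classification_metrics(tp=90, tn=85, fp=15, fn=10).items():
        print(f"{name}: {value:.2f}%")
```

Because recall counts missed attacks and precision counts false alarms, the two can diverge noticeably even when accuracy looks high, which is why all four figures are worth reporting together.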
3.5 Global Algorithm

The proposed algorithm (Algorithm 2) takes into account both the Pearson correlation coefficient and simulated annealing for attribute selection. It also applies the Adam optimisation algorithm to train and test our model.

Algorithm 2: Global Algorithm
Data acquisition;
Encoding categorical data;
Normalization;
Selection attributes;                     /* using correlation coefficient */
initial solution S0 ← A0 A1 A2 ... An;    /* n is the number of attributes & Ai is a boolean value, i ∈ [0..n] */
best solution ← S0;
f ← Accuracy(S0);
T ← 100;                                  /* high value initialized temperature */
α ← 0.997;                                /* α cooling rate */
while best solution is not the ideal solution do
    S′ ← shuffle(best solution);
    ΔE ← f(best solution) − f(S′);
    if ΔE < 0 then
        best solution ← S′;
    else
        generate a random number u ∈ [0, 1];
        if u < e^(−ΔE/T) then
            best solution ← S′;
        end
    end
    T ← T ∗ α;                            /* cooling */
    train model(training dataset);
    testing(testing dataset);
    extraction results();
end
4 Experimentation, Results and Discussion

4.1 Benchmark Datasets

Since current research relies on machine learning methods that require huge amounts of data to analyse network traffic, several benchmark datasets have been generated over the years. However, due to privacy and security concerns, not all datasets are publicly available [19]. In this work, we use three public datasets: NSL-KDD, UNSW-NB15 and CICIDS2017.

4.1.1 NSL-KDD

The NSL-KDD dataset is based on another popular dataset, KDD Cup 99, which is in turn extracted from another intrusion detection system evaluation database, DARPA². KDD99 was created in 1999 for a machine learning competition. The purpose of this competition was to correctly classify network connections into 5 categories: normal, denial of service (DoS), network probe, remote to local (R2L) and user to root (U2R). NSL-KDD was created in 2009 to solve some problems inherent in KDD Cup 99 [21]. It uses the same data as the latter, but modifies it substantially to apply its corrections. For example, redundant or duplicate connections, which made up 75% to 78% of the dataset, have been removed.

4.1.2 UNSW-NB15

The UNSW-NB15 dataset was released in 2015 by the Australian Center for Cyber Security (ACCS). It was generated from a combination of real benign network activity and a synthetic attack environment [22]. The dataset contains nine types of attacks: Generic, Exploits, Fuzzers, DoS, Reconnaissance, Analysis, Backdoor, Shellcode, and Worms [23]. It contains 2,540,044 vectors with 49 features. In addition, Moustafa et al. [24] published a partition of this dataset that contains a training set (175,341 vectors) and a testing set (82,332 vectors).

4.1.3 CICIDS2017

The CICIDS2017 dataset was developed in 2017 by Sharafaldin et al. [25]. It contains complex features that are not present in previous datasets. This dataset includes Brute Force SSH, DoS, Heartbleed, Web Attack, Infiltration, Botnet, DDoS and Brute Force FTP attacks. It is based on network traffic captured over a five-day period, with Monday's data containing benign traffic and the remaining days containing different modern-day attacks [26].

² The first DARPA-sponsored IDS event was conducted by the MIT Lincoln Lab in 1998 [20]. In this DARPA event, an attack scenario at an Air Force base was simulated.
4.2 Results and Discussion

The MLP algorithm was implemented in Python 3.9.12 with the NumPy library. All deep learning models were deployed using the Keras open-source neural network library on top of the TensorFlow machine learning framework, version 2.9.0. The hardware environment is as follows: Intel Core i5 CPU (2.6 GHz), 8 GB of RAM, and the Windows 10 operating system (64-bit mode). The results obtained for the three datasets are shown in Table 2 and in the figures that follow.

Table 2. The classification results

  Metrics      NSL-KDD    UNSW-NB15    CICIDS2017
  Accuracy     97.02%     92.32%       97.70%
  Recall       97.98%     88.68%       94.85%
  Precision    96.29%     89.94%       99.79%
  F1-Score     97.13%     89.31%       97.26%
The results in this table show the efficiency of our approach, which reached an accuracy above 97% on the NSL-KDD dataset, above 92% on the UNSW-NB15 dataset and above 97% on the CICIDS2017 dataset. Accuracy was higher on CICIDS2017 than on NSL-KDD or UNSW-NB15. This may be because CICIDS2017 is a more recent IDS dataset and, as such, more reliable than the other benchmarks. As described in Sects. 3.2.3 and 3.2.4, the Pearson correlation coefficient and the simulated annealing algorithm produce a reduced dataset with a near-optimal feature subset, which simplifies the structure of the model. In addition, the Adam optimisation algorithm adjusts the weights of the model to maximise accuracy on the given dataset. The test results are shown in Figs. 2, 3 and 4:
Fig. 2. Results obtained for the NSL-KDD database
Fig. 3. Results obtained for the UNSW-NB15 database
Fig. 4. Results obtained for the CICIDS2017 database
As shown in the first curve of each of the previous figures, the accuracy in the learning phase is very close to that in the test phase, which suggests that overfitting has been avoided thanks to the validation method used during the learning phase. To assess the efficiency of our approach, Table 3 gives a brief comparison between the results obtained by our model and those of some previous works on binary classification.

Table 3. Comparison between our approach and some related works.

  Article              Dataset       Technique                                           Accuracy
  [27], 2015           KDD99         DBN and BP for fine tuning                          92.1%
  [12], 2019           KDD99         DNN                                                 93.0%
  [8], 2016            NSL-KDD       DNN                                                 75.5%
  [28], 2018           NSL-KDD       Negative selection & NN                             95.88%
  [12], 2019           NSL-KDD       DNN                                                 80.1%
  [29], 2018           UNSW-NB15     DFEL DT                                             92.29%
  [29], 2018           UNSW-NB15     DFEL SVM                                            92.32%
  [12], 2019           UNSW-NB15     DNN                                                 78.4%
  [11], 2020           UNSW-NB15     Decision tree & XGBoost-based selection features    90.85%
  [12], 2019           CICIDS2017    DNN                                                 96.5%
  [30], 2019           CICIDS2017    CNN & LSTM                                          97.16%
  Proposed approach    NSL-KDD       MLP & Simulated annealing                           97.02%
  Proposed approach    UNSW-NB15     MLP & Simulated annealing                           92.32%
  Proposed approach    CICIDS2017    MLP & Simulated annealing                           97.7%
The results obtained show the effectiveness of our intrusion detection model based on neural networks, and more precisely the multilayer perceptron, especially when combined with attribute selection techniques, here simulated annealing and the Pearson correlation coefficient.

5 Conclusion

Computer attacks today represent a real risk to computer systems and company networks. This led us, in this work, to develop a security model capable of confronting this threat by detecting any malicious attempt, whether known or recent. To achieve this goal, we used neural networks, and more precisely the multilayer perceptron, which is well suited to data that is not linearly separable, to classify TCP/IP traffic into two categories (normal or attack) based on three benchmarks: NSL-KDD, UNSW-NB15 and CICIDS2017. The choice of the neural network's parameters, as well as of the attributes involved in the learning phase, has a great influence on its performance. For this reason, we used two optimisation techniques, namely the Pearson correlation coefficient and simulated annealing, to choose the most relevant attributes that give us the best possible success rate. The number of layers, the number of neurons per layer and the learning rate were chosen manually, by changing these values in each experiment until the best performing model was obtained. We also carried out a comparative study between our model and some previous works in the field of intrusion detection. The results obtained show the positive influence of optimization algorithms on the performance of the neural network. Although we obtained good results, the model can still be improved, for example by using metaheuristics to choose the other parameters of the network and by performing a multiclass classification that also gives the type of attack detected.
References

1. Manimurugan, S., Majdi, A.Q., Mohmmed, M., Narmatha, C., Varatharajan, R.: Intrusion detection in networks using crow search optimization algorithm with adaptive neuro-fuzzy inference system. Microprocess. Microsyst. 79, 103261 (2020)
2. Hamdi, M., Meddeb-Makhlouf, A., Boudriga, N.: Multilayer statistical intrusion detection in wireless networks. EURASIP J. Adv. Sig. Process. 2009(1), 368589 (2008). https://doi.org/10.1155/2009/368589.pdf
3. Boulaiche, A., Adi, K.: An auto-learning approach for network intrusion detection. Telecommun. Syst. 68(2), 277–294 (2017). https://doi.org/10.1007/s11235-017-0395-z
4. Boulaiche, A., Bouzayani, H., Adi, K.: A quantitative approach for intrusions detection and prevention based on statistical n-gram models. Procedia Comput. Sci. 10, 450–457 (2012)
5. Estevez-Tapiador, J.M., Garcia-Teodoro, P., Diaz-Verdejo, J.E.: Anomaly detection methods in wired networks: a survey and taxonomy. Comput. Commun. 27(16), 1569–1584 (2004)
6. Buczak, A.L., Guven, E.: A survey of data mining and machine learning methods for cyber security intrusion detection. IEEE Commun. Surv. Tutorials 18(2), 1153–1176 (2015)
7. Djeffal, A.: Utilisation des méthodes Support Vector Machine (SVM) dans l'analyse des bases de données. Ph.D. thesis, Université Mohamed Khider-Biskra (2012)
8. Tang, T.A., Mhamdi, L., McLernon, D., Zaidi, S.A.R., Ghogho, M.: Deep learning approach for network intrusion detection in software defined networking. In: 2016 International Conference on Wireless Networks and Mobile Communications (WINCOM), pp. 258–263. IEEE (2016)
9. Wahba, Y., ElSalamouny, E., ElTaweel, G.: Improving the performance of multiclass intrusion detection systems using feature reduction. Int. J. Comput. Sci. Issues (2015)
10. Ustebay, S., Turgut, Z., Aydin, M.A.: Intrusion detection system with recursive feature elimination by using random forest and deep learning classifier. In: 2018 International Congress on Big Data, Deep Learning and Fighting Cyber Terrorism (IBIGDELFT), pp. 71–76. IEEE (2018)
11. Kasongo, S.M., Sun, Y.: Performance analysis of intrusion detection systems using a feature selection method on the UNSW-NB15 dataset. J. Big Data 7(1), 1–20 (2020). https://doi.org/10.1186/s40537-020-00379-6
12. Vinayakumar, R., Alazab, M., Soman, K.P., Poornachandran, P., Al-Nemrat, A., Venkatraman, S.: Deep learning approach for intelligent intrusion detection system. IEEE Access 7, 41525–41550 (2019)
13. Nguyen, M.T., Kim, K.: Genetic convolutional neural network for intrusion detection systems. Futur. Gener. Comput. Syst. 113, 418–427 (2020)
14. Elmasry, W., Akbulut, A., Zaim, A.H.: Evolving deep learning architectures for network intrusion detection using a double PSO metaheuristic. Comput. Netw. 168, 107042 (2020)
15. Zheng, A., Casari, A.: Feature Engineering for Machine Learning: Principles and Techniques for Data Scientists. O'Reilly Media, Inc. (2018)
16. Benesty, J., Chen, J., Huang, Y., Cohen, I.: Pearson correlation coefficient. In: Noise Reduction in Speech Processing, pp. 1–4. Springer (2009). https://doi.org/10.1007/978-3-642-00296-0_5
17. Kirkpatrick, S., Gelatt, C.D., Vecchi, M.P.: Optimization by simulated annealing. Science 220(4598), 671–680 (1983)
18. Metropolis, N., Rosenbluth, A.W., Rosenbluth, M.N., Teller, A.H., Teller, E.: Equation of state calculations by fast computing machines. J. Chem. Phys. 21(6), 1087–1092 (1953)
19. Alsamiri, J., Alsubhi, K.: Internet of things cyber attacks detection using machine learning. Int. J. Adv. Comput. Sci. Appl. 10(12) (2019)
20. Cunningham, R.K., et al.: Evaluating intrusion detection systems without attacking your friends: the DARPA intrusion detection evaluation, pp. 1999. Technical report, Massachusetts Institute of Technology Lexington Lincoln Lab (1999)
21. Tavallaee, M., Bagheri, E., Lu, W., Ghorbani, A.A.: A detailed analysis of the KDD Cup 99 data set. In: 2009 IEEE Symposium on Computational Intelligence for Security and Defense Applications, pp. 1–6. IEEE (2009)
22. Koroniotis, N., Moustafa, N., Sitnikova, E., Turnbull, B.: Towards the development of realistic botnet dataset in the internet of things for network forensic analytics: Bot-IoT dataset. Futur. Gener. Comput. Syst. 100, 779–796 (2019)
23. Moustafa, N., Slay, J.: UNSW-NB15: a comprehensive data set for network intrusion detection systems (UNSW-NB15 network data set). In: 2015 Military Communications and Information Systems Conference (MilCIS), pp. 1–6. IEEE (2015)
24. Moustafa, N., Slay, J., Creech, G.: Novel geometric area analysis technique for anomaly detection using trapezoidal area estimation on large-scale networks. IEEE Trans. Big Data 5(4), 481–494 (2017)
25. Sharafaldin, I., Lashkari, A.H., Ghorbani, A.A.: Toward generating a new intrusion detection dataset and intrusion traffic characterization. ICISSP 1, 108–116 (2018)
26. Binbusayyis, A., Vaiyapuri, T.: Identifying and benchmarking key features for cyber intrusion detection: an ensemble approach. IEEE Access 7, 106495–106513 (2019)
27. Li, Y., Ma, R., Jiao, R.: A hybrid malicious code detection method based on deep learning. Int. J. Secur. Appl. 9(5), 205–216 (2015)
28. Pamukov, M.E., Poulkov, V.K., Shterev, V.A.: Negative selection and neural network based algorithm for intrusion detection in IoT. In: 2018 41st International Conference on Telecommunications and Signal Processing (TSP), pp. 1–5. IEEE (2018)
29. Zhou, Y., Han, M., Liu, L., He, J.S., Wang, Y.: Deep learning approach for cyberattack detection. In: IEEE INFOCOM 2018-IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS), pp. 262–267. IEEE (2018)
30. Roopak, M., Tian, G.Y., Chambers, J.: Deep learning models for cyber security in IoT networks. In: 2019 IEEE 9th Annual Computing and Communication Workshop and Conference (CCWC), pp. 0452–0457. IEEE (2019)
Evaluation Metrics in DoS Attacks Detection Approaches in IoT: A Survey and a Taxonomy

Mohamed Riadh Kadri¹, Abdelkrim Abdelli¹(B), and Lynda Mokdad²
¹ University of Science and Technology Houari Boumediene, 16111 Bab Ezzouar Algiers, Algeria
[email protected]
² Univ Paris Est Creteil, LACL, F-94010 Creteil Paris, France

Abstract. Like any other emerging domain in computer science, the Internet of Things (IoT) faces a variety of security challenges that need to be addressed. Denial of Service (DoS) and Distributed Denial of Service (DDoS) attacks are two of the most threatening attacks in the IoT domain, mainly because they affect the service availability of the connected objects, such as their ability to collect, process, and transfer data. While many surveys in the literature have proposed classifications of these attacks, none of them has reviewed the metrics and evaluation methods used in the context of DoS/DDoS attack detection within IoT environments. In this paper, we investigate the relationship between DoS/DDoS attack types and both the evaluation metrics and the validation methods used in the surveyed detection approaches. To this end, the metrics used in the evaluation part of the most recent DoS/DDoS detection solutions are studied, and a taxonomy is proposed to classify them according to their utility. Then, different research questions are addressed and discussed, exploring the correlation between the type of DoS attack, the metrics used and the validation method.

Keywords: IoT · DoS attack · Taxonomy · Metrics · Detection

1 Introduction
In a fast-paced changing world due to major advances in computing and networking technologies, the global pandemic and its repercussions (Social distancing, Lockdowns...) made humanity rely more than ever on the Internet to achieve all kinds of daily tasks (shopping, remote work etc.). It was a natural consequence to see technologies that help automating tasks being quickly adopted; in the image of Internet of things, one of the most benefiting areas from those advances [6]. Indeed, the increasing computing power and storage capacity, meanwhile with the reduction of semiconductors node size allowed the creation of tiny MCUs (Micro Control Unit) which can be embarked in almost anything. Such a technological feature enables communication through heterogeneous networks, for c The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 S. Chikhi et al. (Eds.): MISC 2022, LNNS 593, pp. 46–61, 2023. https://doi.org/10.1007/978-3-031-18516-8_4
the purpose of collecting, storing and processing data. This resulted in the rapid appearance of many kinds of smart connected devices in almost every area, from the industrial sector, known as the Industrial Internet of Things (IIoT), to transportation and all the way to smart homes, to build our future smart cities [2]. It is expected that, in the medium term, smart connected devices will be further integrated into every aspect of people's everyday life, not only because those devices are becoming cheaper and more accessible to a large number of people worldwide, but also because individuals are becoming more aware of the potential of this technology to make their day-to-day tasks easier to complete in the most efficient way possible. Studies show that the number of IoT devices is expected to exceed 70 billion in the next three years, which highlights the quick adoption of this domain worldwide [12]. As with any other emerging domain in computer science, questions are raised about IoT security and its resilience to the many threats that can compromise one or more of the pillars of cybersecurity, namely the confidentiality, integrity and availability of data. A substantial effort has been made to answer questions on each of those aspects, but much work remains to be done, mainly due to the novelty of this domain and the invasive and pervasive nature of the IoT itself [11]. Indeed, classical security mechanisms like authentication and cryptography seem inadequate, as they do not provide efficient solutions for the IoT, especially considering that IoT appliances are still resource-constrained. Therefore, with both the rapid increase of IoT adoption in different sectors and the huge amount of data generated by IoT appliances, securing IoT networks has proven to be one of the most challenging tasks. Within this context, IoT can be subject to many attacks of different kinds and levels of harmfulness.
One of the prominent threats facing IoT is the DoS attack, which consists of denying access to information technology resources, such as computing resources, servers, and data. A DDoS attack is a DoS attack launched on a target from many different sources at the same time. So far, many types of DoS/DDoS attacks have been identified and studied in the context of IoT. Accordingly, many approaches and solutions have been proposed to detect and counteract such attacks, and different types of parameters, metrics and validation methods have been considered to evaluate the effectiveness of these methods. In this paper, we investigate the relationship between DoS/DDoS attack types in the IoT domain and the way the dedicated solutions are evaluated and validated. This makes it possible to identify which metrics and which validation method are the most suitable to evaluate solutions targeting a given attack. The main contributions of our paper can be summarized as follows:
– First, we inventory all the metrics and variables used in the evaluation of DoS/DDoS detection approaches. Then, we propose a taxonomy to classify them according to their utility.
– We provide different statistics for each metric class and sub-class of our taxonomy. We then compare the obtained figures to discuss research questions that endeavor to correlate the usability of specific metrics with a validation method in the context of a particular attack.
M. R. Kadri et al.
The remainder of this paper is organized as follows. Section 2 discusses the related works. Section 3 recalls the basic concepts. Section 4 presents our taxonomy. Section 5 discusses some research questions. Section 6 concludes this work.
2 Related Work
In the literature, many surveys have been conducted to disseminate IoT concepts and security, but few of them have focused entirely and exclusively on DoS and DDoS detection in IoT. In this section, we review the most interesting and recent surveys dedicated to this subject. One of the first surveys was released by Mosenia et al. [10], who introduced a reference model of IoT and summarized the threats in the edge-side layer of that model. The authors also listed and described the countermeasures to these threats found in the literature. They concluded their work by presenting some of the emerging security challenges not covered by the literature. In [7], Kouicem et al. discussed some of the most challenging IoT security issues in different fields of application using a top-down approach. They also listed the recent countermeasures to these issues and classified them into two groups: classical approaches and new emerging approaches. They emphasized the benefits of using new technologies like SDN and blockchain in the process of securing IoT and presented a taxonomy of IoT security solutions that consider these technologies. Bahaa et al. presented a systematic literature review in [4], in which they analyzed diverse papers discussing real-time IoT security attacks and their monitoring using DevSecOps. They enumerated the datasets used in those papers as well as the machine learning techniques adopted to detect IoT attacks. They also concluded that very few studies implemented DevSecOps pipelines in their proposed security solutions. One of the first surveys that focused exclusively on DDoS attacks in IoT is the one conducted by Lohachab et al. [9], in which they highlighted the impact of those attacks and studied their working mechanisms, as well as the countermeasures to defend against such attacks. In [5], Dantas et al.
classified DDoS attack types in IoT, depending on the vulnerabilities and techniques used, into three categories: application layer attacks, resource exhaustion attacks and volumetric attacks. Then, they presented SDN-based mitigation strategies and studied mitigation solutions in SDN-based IoT environments, for which they introduced a taxonomy categorizing them into collaborative and non-collaborative solutions. In 2020, Vishwakarma et al. produced a more complete survey focusing on DDoS attacks on IoT [14]. They analyzed several attack types and highlighted the role of IoT botnets in DDoS attacks, taking as examples some of the best-known attacks, such as the Mirai malware and 3ve-2018. They also proposed a taxonomy of DDoS attacks in IoT networks and a taxonomy of DDoS defence mechanisms. In addition, they compared some of the recent defence mechanisms, bringing out the key points, vulnerabilities and main target attack types of each studied mechanism. In a journal article published in 2021 [3], Al-Hadhrami et al. conducted a literature review of DoS/DDoS attacks in IoT. They classified the surveyed solutions into four categories: IDS based solutions, protocol-based
solutions, trust-based solutions, and other solutions. Then, they analyzed and discussed each category's strengths and weaknesses and how each solution limits IoT devices. To the best of our knowledge, no survey in the literature has focused on the relationship between DoS/DDoS attack types and the metrics and methods used in the evaluation of the proposed detection and countermeasure approaches. This is the subject of this paper.
3 Basic Concepts
In this section, we recall some basic concepts related to IoT and DoS attacks.

3.1 IoT Architecture
Since IoT is still a recent field in computer science, there are not many standards agreed upon among researchers; most papers that we gathered for this survey based their work on the three-layer IoT architecture. The latter is the most widely used architecture so far and is composed of three layers: the perception, network, and application layers.
– Perception layer: This is the lowest layer of the architecture, composed of the physical connected smart objects and their communication medium. It is also known as the sensing layer, as it involves sensors that gather information about the environment. This layer is responsible for collecting all useful data from the environment and transferring them using communication technologies such as RFID, Zigbee, WiFi, etc.
– Network layer: It provides abstraction to the objects, as its main function is to connect the smart objects to other devices and nodes using one (or more) networking protocols such as RPL; it can also preprocess the data collected at the perception layer.
– Application layer: Its main function is to provide application-specific services to the end user.

3.2 DoS/DDoS Attacks in IoT
Like other computer science domains, IoT is vulnerable to DoS and DDoS attacks, especially given the constrained nature of IoT objects: they are limited not only in processing power but also in the energy available to operate, which makes them even more susceptible to new types of attacks. In what follows, we identify, at each layer, the attacks that can cause DoS/DDoS in IoT.

A-Perception Layer Attacks: In this layer, an attacker tries to prevent the physical objects from gathering information from the environment and/or transmitting the collected data to the upper layers. We define next the main attacks that occur at this layer.
– Jamming attacks (JA): JA is a type of DoS attack in which an attacker emits a powerful signal on the physical medium (which can be wireless or wired) to prevent communication on that medium. It is particularly used in IoT to block wirelessly connected objects from transmitting data and communicating with each other. We can identify two variants of the jamming attack: constant jamming, where the attacker emits the jamming signal continuously, and reactive jamming, in which the jammer emits the jamming signal only when it detects communication on the medium.
– Battery exhaustion (BX): In a BX attack, an attacker tries to drain a connected node's power by making the node perform repetitive power-consuming tasks until its battery runs out, or by preventing it from switching to a very low power consumption mode (also known as sleep mode) [1].

Additional attacks can be identified at this layer, including Disconnection Attacks (DA), Impersonation Attacks (IMP) and DeSynchronization Attacks (DS).

B-Network Layer Attacks: Most attacks in this layer are protocol-based attacks, which means that an attacker endeavors to exploit weaknesses in the routing protocol to achieve the DoS. Below, we identify all the attacks considered at this layer in the surveyed papers.
– Clone Node (CN): In this attack, a malicious node tries to copy the identity of one or more legitimate nodes in the network in order to redirect the traffic intended for those legitimate nodes, thus disturbing data transmission on the network.
– Sybil attack (SY): This attack is quite similar to the clone node attack, as the attacker (called the sybil node) presents itself as multiple nodes with different identities (it does not copy the identities of legitimate nodes but presents new ones instead), thus gaining an advantage in some network events, such as the choice of the optimal path to a destination. If the sybil node succeeds in routing a large volume of traffic through itself, it can induce a DoS in the network by dropping that traffic.
– Hello flood attack (HF): HF targets routing protocols that require nodes to exchange hello packets to report existing neighbors. A malicious node broadcasts a massive number of hello packets to nodes that are not its neighbors, thus affecting the global state of the network, especially the nodes' ability to choose an optimal path [13].
– Selective forwarding attack (SF): In this attack, malicious nodes deliberately choose not to forward some packets on the network, thus corrupting its communication.
– Greedy attack (GR): A greedy node is a misbehaving node that tries to consume more than its share of the network's global throughput, thus affecting the ability of other nodes to transfer data on the network [8].

In addition to the attacks described above, we inventory the following attacks: Sinkhole Attack (SH), BlackHole attack (BH), WormHole attack (WH), Link Flooding (LF), Rank Attack (RK), Local Repair Attack (LR), Neighbor Attack (NGH), Spam DIS Attack (DIS), Fragmentation-Based Network attacks (also called Buffer exhaustion attacks) (FBA), Version Number attack (VN), Worst Parent Attack (WP), Replay Attack (RY), Adversarial attacks (ADV) and Packet Flooding attacks (PF).

C-Application Layer Attacks: In this layer, the attacker tries to exploit known weaknesses to prevent the IoT interface from providing services to the user. Although we did not cover any paper focusing on this type of attack, we can identify in this context the application-specific DoS attack and the HTTP flood attack.

Fig. 1. Attacks types count in the surveyed papers.
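Since the attack abbreviations introduced in this section are reused throughout the tables of the paper, a small lookup structure can help when reading them. The following Python sketch groups the abbreviations by layer exactly as they are enumerated above; the dictionary layout and the helper names (layer_of, expand) are illustrative assumptions, and not every listed attack is necessarily covered by the surveyed papers.

```python
# Attack abbreviations as enumerated in Sect. 3.2, grouped by IoT layer.
# The data layout and the helper names are illustrative assumptions.
ATTACKS_BY_LAYER = {
    "perception": {
        "JA": "Jamming",
        "BX": "Battery exhaustion",
        "DA": "Disconnection",
        "IMP": "Impersonation",
        "DS": "DeSynchronization",
    },
    "network": {
        "CN": "Clone Node", "SY": "Sybil", "HF": "Hello flood",
        "SF": "Selective forwarding", "GR": "Greedy", "SH": "Sinkhole",
        "BH": "BlackHole", "WH": "WormHole", "LF": "Link Flooding",
        "RK": "Rank", "LR": "Local Repair", "NGH": "Neighbor",
        "DIS": "Spam DIS", "FBA": "Fragmentation-Based Network",
        "VN": "Version Number", "WP": "Worst Parent", "RY": "Replay",
        "ADV": "Adversarial", "PF": "Packet Flooding",
    },
}


def layer_of(abbrev):
    """Return the layer under which Sect. 3.2 lists the abbreviation."""
    for layer, attacks in ATTACKS_BY_LAYER.items():
        if abbrev in attacks:
            return layer
    raise KeyError(f"unknown attack abbreviation: {abbrev!r}")


def expand(cell):
    """Expand a slash-separated list such as 'PF /SY/CN' into attack names."""
    names = []
    for token in cell.split("/"):
        code = token.strip()
        names.append(ATTACKS_BY_LAYER[layer_of(code)][code])
    return names


print(layer_of("SF"))       # network
print(expand("PF /SY/CN"))  # ['Packet Flooding', 'Sybil', 'Clone Node']
```

The helper only handles plain abbreviations; annotated table entries such as "SY in RPL" would need extra parsing.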
4 Proposed Taxonomy
In this section, we first describe the material used in our survey. Then, we propose a taxonomy that classifies all the metrics identified in the evaluation of the DoS/DDoS detection approaches that we have surveyed.

4.1 Used Materials
In order to respond to our research questions, we started by surveying all the papers released during the period 2017–2022 that dealt with DoS/DDoS attacks in the context of IoT. An initial search was conducted on different databases and indexes, such as Google Scholar, Scopus, ResearchGate and DBLP, to collect papers that include the keywords (IoT, DoS, Attack) in either their title or abstract. After a rigorous selection, only papers that proposed detection solutions with a clear validation and evaluation approach were considered. We thus obtained a total of 40 papers: 9 of them cover perception layer attacks and the remaining 31 cover different types of network layer attacks; unfortunately, we did not find any papers covering application layer attacks. Note that one paper can address several attacks, as in IDS (Intrusion Detection System) based solutions, for example. More particularly, 8 out of the 40 papers proposed only a detection approach, 5 papers proposed a detection as well as an identification approach, 8 papers put forward a detection and a remediation (prevention or mitigation) approach, 8 papers designed only a remediation solution, and 6 papers proposed a complete framework (detection, identification and remediation). Finally, the remaining 5 papers proposed a new DoS/DDoS attack approach or highlighted protocol-specific vulnerabilities to existing DoS/DDoS attacks. In total, 24 attack types are addressed in the 40 surveyed papers, 4 of which are perception layer attacks, while the rest are network layer attacks. Fig. 1 presents the number of papers that covered each type of attack. As one can notice, the packet flooding attack is the most addressed, followed by the selective forwarding attack.

Fig. 2. Taxonomy of metrics used in DDoS attacks detection approaches.

4.2 Metrics Classification
To evaluate the effectiveness of the proposed DoS/DDoS approaches, researchers use different types of parameters, variables and metrics. These differ according to various criteria, such as the attack type studied and the validation method used to evaluate the proposed approach, as well as additional aspects that we aim to investigate. First of all, we inventoried all the variables, parameters and metrics used in the evaluation of the surveyed approaches. Then, we classified them according to their utility. Figure 2 depicts the different classes of our taxonomy. Two main categories of evaluation metrics were identified during our study. The first one, called Detection metrics and indicators, gathers all the metrics and variables used as tools of the detection process or to evaluate the effectiveness of the surveyed detection approaches; this category is further subdivided into two classes:
1. Network indicators: This class relates to variables introduced to measure network performance as indicators of attack occurrence. Five types of indicators were recognized during our study:
(a) Time indicators: These parameters deal with the time taken to perform specific tasks, for instance, the packet transmission delay.
(b) Data rates: These parameters are used to assess data transmission and processing performance in the network, for example, the throughput, the data dropping rate, the workload, etc.
(c) Routing performances: This class groups indices used to evaluate how optimal the routing of packets in the network is.
(d) Energy indicators: Such parameters are used to track energy consumption behaviour in the network in order to detect abnormal scenarios.
(e) Signal based indicators: This class refers to indicators measuring signal performance aspects in wireless IoT to detect malicious behaviours.
2. Detection performance metrics: This class covers all the metrics used to evaluate the efficiency of the surveyed solutions in terms of their detection capabilities. We further divide the metrics of this class into three subcategories:
(a) Confusion matrix: The confusion matrix is a table grouping well-known, correlated metrics often used in machine learning to visualize the performance of a classification algorithm, the type of algorithm from which most detection and identification approaches in the security domain are derived.
(b) Utility functions: A utility function assigns values to the actions that an AI system can take. An AI agent's preferences over possible outcomes can be captured by a function that maps these outcomes to a utility value; the higher the value, the more the agent prefers that outcome.
(c) Others: This class groups specific metrics that do not fit into the previously described classes, for instance, the detection time.
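As an illustration of the confusion matrix sub-class, the following minimal Python sketch derives the usual rate metrics (TPR, FPR, TNR, FNR, accuracy, precision and F1-score) from the four raw counts, using their standard definitions. The class and function names, as well as the example counts, are assumptions made for illustration and are not taken from the surveyed papers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    tp: int  # attack samples correctly flagged as attacks
    fp: int  # benign samples wrongly flagged as attacks
    tn: int  # benign samples correctly left unflagged
    fn: int  # attack samples that were missed


def _ratio(numerator, denominator):
    """Safe division: returns 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def detection_metrics(c):
    """Derive the standard rate metrics from raw confusion-matrix counts."""
    tpr = _ratio(c.tp, c.tp + c.fn)        # true positive rate (recall)
    precision = _ratio(c.tp, c.tp + c.fp)
    return {
        "TPR": tpr,
        "FPR": _ratio(c.fp, c.fp + c.tn),  # false alarm rate, fall-out
        "TNR": _ratio(c.tn, c.tn + c.fp),
        "FNR": _ratio(c.fn, c.fn + c.tp),  # miss rate
        "accuracy": _ratio(c.tp + c.tn, c.tp + c.fp + c.tn + c.fn),
        "precision": precision,
        "F1": _ratio(2 * precision * tpr, precision + tpr),
    }


# Hypothetical counts for a detector evaluated on 300 traffic samples.
metrics = detection_metrics(ConfusionCounts(tp=90, fp=10, tn=190, fn=10))
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Reporting the rates rather than the raw counts makes results comparable across datasets of different sizes, which may explain why both forms appear in the surveyed evaluations.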
The second main category of our taxonomy, called Complexity & overhead measurement metrics, brings together all the metrics considered in the surveyed papers to evaluate and compare the proposed approaches in terms of their complexity, computational effort and resource consumption when performing the detection task. Four types of metrics have been identified:
1. Memory & Storage: Metrics used to evaluate the detection approach in terms of memory usage during its execution.
2. Computational effort: Metrics used to evaluate the computational complexity of the approach.
3. Energy overhead: Metrics introduced to assess the energy consumption needed to run the approach.
4. Communication: Metrics that evaluate the communication overhead (exchanged messages) induced by running the approach.
In the sequel, Tables 1, 2 and 3 provide, for each class of our taxonomy, the list of metrics identified during our study. Each metric is associated with the types of attacks for which it is used, as well as the validation method (Simulation, Analytic, Empiric):
1. Simulation: Approaches validated using a simulation tool, such as Cooja, NS, etc.
2. Empiric: Approaches validated by empirical experimentation in a real environment.
3. Analytic: Approaches validated using formal approaches, considering mathematical models.

Table 1. Detection performance metrics
Class | Metric | Attack types studied in papers using the metric | Validation approaches
Confusion matrix | True positive | BH/SF/SH/HF/CN/SY/WH/VN/RK/RY/WP/DIS/LR/NGH/ADV/GR/PF | Simulation/Empiric
Confusion matrix | False negative | BH/SF/SH/HF/CN/SY/WH/VN/RK/RY/WP/DIS/LR/NGH/ADV/GR/PF | Simulation/Empiric
Confusion matrix | True negative | BH/SF/SH/HF/CN/SY/WH/VN/RK/RY/WP/DIS/LR/NGH/ADV/GR/PF | Simulation/Empiric
Confusion matrix | False positive | BH/SF/SH/HF/CN/SY/WH/VN/RK/RY/WP/DIS/LR/NGH/ADV/GR/PF | Simulation/Empiric
Confusion matrix | False positive rate (FPR), probability of false alarm, fall-out | PF/BX/SH/RK/LR/NGH/DIS/GR/BX* | Simulation
Confusion matrix | False negative rate (FNR), miss rate | PF/GR/SH/SY | Simulation
Confusion matrix | True negative rate (TNR) | PF | Simulation & Empiric
Confusion matrix | True positive rate (TPR) | BH/SF/SH/HF/CN/SY/WH/VN/RK/RY/WP/DIS/LR/NGH/ADV/GR/PF | Simulation & Empiric
Confusion matrix | Accuracy | PF/SF/BH/SH/LF/ADV | Simulation & Empiric
Confusion matrix | Efficiency (Eff)* | GR | Simulation
Confusion matrix | Precision | PF | Simulation & Empiric
Confusion matrix | F1-Score | PF | Simulation & Empiric
Confusion matrix | Area under receiver operating curve (ROC) | PF | Simulation
Utility function | Entropy calculation (%) | PF | Simulation
Utility function | Achieved utility of the legitimate user against the price coefficients | JA | Analytical
Utility function | Cumulative distribution function of location error | BX (proposed scenario) | Analytical
Utility function | Achieved utility of the jammer against the price coefficients | JA | Analytical
Utility function | Achieved utility at the game equilibrium with respect to the price coefficients of legitimate transmit power and jamming power | JA | Analytical
Utility function | Average payoff of the access point | JA | Analytical
Other | Jain's fairness index | SF | Analytical
Other | Identification ratio | PF | Simulation
Other | Detection delay | PF/SY/CN | Simulation
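Table 1 lists Jain's fairness index among the other detection metrics without spelling out its formula. As a hedged illustration, the sketch below implements the standard definition, J = (sum of x_i)^2 / (n * sum of x_i^2), where x_i is the share (for example, the throughput) obtained by node i. The function name and the sample throughput values are hypothetical, and the surveyed papers may apply the index to other quantities.

```python
def jain_fairness_index(allocations):
    """Jain's fairness index: 1.0 for equal shares, 1/n when one node takes all."""
    shares = list(allocations)
    if not shares:
        raise ValueError("at least one allocation is required")
    sum_of_squares = sum(x * x for x in shares)
    if sum_of_squares == 0:
        raise ValueError("the index is undefined when all allocations are zero")
    return sum(shares) ** 2 / (len(shares) * sum_of_squares)


# Hypothetical per-node throughputs (packets/s) in a four-node network.
print(jain_fairness_index([10, 10, 10, 10]))  # 1.0
print(jain_fairness_index([40, 0, 0, 0]))     # 0.25
```

A drop of the index towards 1/n indicates that a few nodes capture most of the shared resource, which is the kind of imbalance a misbehaving node would produce.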
Table 2. Network indicators
Class | Metric | Attack types studied in papers using the metric | Validation approaches
Time indicators | Trickle timer | SY in RPL | Simulation
Time indicators | End-to-end transmission delay | SH/FBA/BX/SF | Simulation
Data rates | Throughput, packet delivery ratio | DA in LoRa/JA in LoRa/PF/SF/SY/SH/BX | Simulation & Empiric
Data rates | Number of packets sent or forwarded | CN/SF/SY/GR/DA/SH/JA | Simulation/Analytic/Empiric
Data rates | Data drop rate (PDR, BER) | JA/SF/PF/SH | Simulation/Analytic
Data rates | Workload | PF | Simulation
Routing performance | Path length | SF/HF | Simulation
Energy indicators | Normalized energy | SF/HF | Simulation
Energy indicators | Alive nodes/dead nodes | SY | Simulation
Energy indicators | Power allocation at the equilibrium against the position of the jammer | JA | Analytic
Energy indicators | Energy consumption | SF/SY/BX/RK/SH/LR/NGH/DIS/GR | Simulation
Signal based indicators | Jamming to signal ratio | JA | Analytical, simulation
Signal based indicators | Received signal strength indicator (RSSI) | BX* | Simulation
Signal based indicators | Average signal-to-noise ratio (SNR) per bit | JA | Analytical
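The signal based indicators of Table 2 are power ratios that are commonly reported in decibels. The sketch below shows this conversion for the jamming to signal ratio and for a plain SNR, assuming linear powers expressed in watts; the function names and the power values are hypothetical, and the surveyed papers may use variants (for instance, the per-bit SNR listed in the table) that require additional normalization.

```python
import math


def power_ratio_db(numerator_w, denominator_w):
    """Express the ratio of two linear powers (in watts) in decibels."""
    if numerator_w <= 0 or denominator_w <= 0:
        raise ValueError("powers must be strictly positive")
    return 10.0 * math.log10(numerator_w / denominator_w)


def jamming_to_signal_ratio_db(jammer_power_w, signal_power_w):
    """Jamming power relative to legitimate signal power at the receiver."""
    return power_ratio_db(jammer_power_w, signal_power_w)


def snr_db(signal_power_w, noise_power_w):
    """Legitimate signal power relative to noise power at the receiver."""
    return power_ratio_db(signal_power_w, noise_power_w)


# Hypothetical received powers: a 2 mW jammer against a 1 mW signal.
print(round(jamming_to_signal_ratio_db(0.002, 0.001), 2))  # about 3.01 dB
```

A positive jamming to signal ratio means that the jammer is received more strongly than the legitimate transmitter.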
Table 3. Complexity & overhead measurement metrics
Class | Metric | Attack types studied in papers using the metric | Validation approaches
Memory & storage | Memory utilization | PF/SF/BH | Simulation & Empiric
Memory & storage | Storage overhead | CN | Simulation
Computational effort | CPU utilization (%), computational overhead | PF/SF/BH/CN | Simulation & Empiric
Computational effort | Execution time, runtime, response time | PF/SF/BX/FBA/SF | Simulation
Energy overhead | Conserved energy | BX | Simulation
Communication | Communication overhead, packet size | CN/BX | Simulation

5 Discussion
To respond to our research questions, we computed some statistics to investigate whether there is a correlation between the use of some classes of metrics and the type of attack, on the one hand, and the mode of validation, on the other hand. First of all, we wanted to determine the distribution, in terms of occurrence frequency, of each metric, each class and each sub-class among the 40 surveyed papers. From Fig. 3, it appears that, overall, the confusion matrix metrics are the most used, mainly in the context of machine learning based approaches. When abstracting the statistics to classes (see Fig. 4), we find that the network indicators and the detection performance metrics are the most measured (25 papers each). Within the network indicators class, the data rate parameters are the most often computed to detect abnormal behaviours. Regarding the detection performance metrics class, the confusion matrix metrics are the most used to evaluate the efficiency of the detection process. More globally, the metrics belonging to the category Detection metrics and indicators were considered in almost all the papers (36 out of 40), whereas only 11 papers adopted Complexity & overhead measurement metrics to evaluate the complexity of their approaches. When correlating the figures on metrics usage with the type of attack, we obtain some interesting statistics. Figures 5 and 6 show the top three attacks considered by each class and each sub-class of metrics, respectively; several interesting observations can be deduced from them:
– Among the 25 papers that considered network indicators, 16 different attacks are evaluated with these metrics, with a clear trend for the SF, SH, PF and JA attacks (5 times each). More particularly, energy and data rate indicators have been widely used, with 13 and 11 different attacks respectively, whereas signal based indicators are exclusively measured in the context of JA.
Fig. 3. Metrics occurrence frequencies in the surveyed papers.
– The detection performance metrics class has a wide range of use, as it has been considered in the evaluation of 20 different attacks. The confusion matrix, which has been used in the context of 18 different attacks, is mainly considered to assess detection performance for network layer attacks, whereas the utility functions (used in the evaluation of 5 attacks) are mainly adopted in the context of PF and JA.

Moreover, our investigations allowed us to correlate the usage of some sub-classes of metrics with specific attacks:
– Data rate and energy indicators, as well as the confusion matrix, are the most used in evaluating detection approaches for the SF, SY and SH attacks.
– Approaches addressing the PF attack mainly consider the confusion matrix and the data rate indicators.
– Solutions dedicated to the JA attack are evaluated only by considering data rate, energy, and signal indicators, in addition to utility functions.

Fig. 4. Classes and sub-classes occurrence frequencies in the surveyed papers.

Fig. 5. Correlation between attacks and classes of metrics.

Fig. 6. Correlation between attacks and sub-classes of metrics.

The second research question that we discuss concerns how the mode of validation can impact the type of metrics to consider, and whether it is predefined by the type of attack addressed. To this end, Fig. 7 plots, for each attack, the number of times each validation mode was considered. Moreover, Table 4 presents statistics on the usage of metric classes and sub-classes for each of the three validation types. Each percentage denotes the number of papers that adopted a metrics class relative to the total number of papers using a given validation method. First of all, from the figure we notice that simulation is the most used validation method among the surveyed papers, whereas empirical validation is the least used. Analytic (formal) validation has been exclusively considered to address perception layer attacks (JA, DS and IMP). Besides, the latter mainly favour this type of validation. Moreover, from Table 4,
we notice that two metrics classes have more than 50% representation in simulation papers: the first is the Detection performance metrics class, which is the most used class in simulations, closely followed by the Network indicators class. The remaining classes are moderately represented. In terms of sub-class representation in simulation papers, two sub-classes of the Network indicators class stand out, namely Data rates and Energy indicators, with 60% and 40% representation, respectively. Finally, the confusion matrix is the only Detection performance metrics sub-class that stands out, with a 53.34% representation in simulation papers. As regards empirical validation, we notice that, similarly to the simulation papers, the Network indicators and Detection performance metrics classes are the most considered, with a 50% representation each, with Data rates and the Confusion matrix being the only represented sub-class of each of them, respectively. The Computational effort and Memory & Storage classes were each present in 25% of the papers using empirical validation, while we could not find any such paper using metrics of the Communication or Energy overhead classes. Among papers using the analytical validation method, the Network indicators and Detection performance metrics classes were also the most represented, with 60% and 40% representation, respectively. Three of the former's sub-classes (data rates, energy indicators, and signal based indicators) are represented in analytical papers, with 40% for Data rates and 20% for each of the other two. As concerns the Detection performance metrics sub-classes, utility functions are the only tool considered in analytic validation to evaluate detection performance.
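To make the definition of the percentages in Table 4 concrete, the following Python sketch computes, for one validation method, the share of papers that used each metrics class. The function name and the paper counts are hypothetical and serve only to illustrate the computation; they are not the counts behind Table 4.

```python
def usage_share(papers_using_class, papers_with_method):
    """Percentage of the papers validated with a given method that use a class."""
    if papers_with_method <= 0:
        raise ValueError("the validation method must cover at least one paper")
    if not 0 <= papers_using_class <= papers_with_method:
        raise ValueError("class count must lie between 0 and the method total")
    return round(100.0 * papers_using_class / papers_with_method, 2)


# Hypothetical tally for a single validation method covering 10 papers.
papers_with_method = 10
papers_per_class = {
    "Network indicators": 6,
    "Detection performance metrics": 4,
    "Computational effort": 0,
}
shares = {
    name: usage_share(count, papers_with_method)
    for name, count in papers_per_class.items()
}
print(shares)
```

Because a single paper can use metrics from several classes, the percentages of one column need not sum to 100.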
Fig. 7. Correlation between attacks and the type of validation.
Table 4. Metric classes usage per validation type
Metrics class/subclass | Simulation (%) | Empiric (%) | Analytic (%)
Metric classes
Network indicators (NI) | 66.67 | 50 | 60
Detection performance metrics (DPM) | 70 | 50 | 40
Computational effort (CE) | 20 | 25 | 0
Memory, Storage (MS) | 13.34 | 25 | 0
Communication (C) | 10 | 0 | 0
Energy overhead (EO) | 3.34 | 0 | 0
Network indicators subclasses
Data rates (DR) | 60 | 50 | 40
Energy indicators (EI) | 40 | 0 | 20
Time indicators (TI) | 10 | 0 | 0
Routing performances (RP) | 6.67 | 0 | 0
Signal based indicators (SBI) | 3.34 | 0 | 20
Detection performance metrics subclasses
Confusion matrix (CM) | 53.34 | 50 | 0
Utility function (UF) | 16.67 | 0 | 40
Others (O) | 13.34 | 0 | 0

6 Conclusion
In this paper, we surveyed DoS/DDoS attack detection approaches in the context of IoT in order to answer research questions about the way these approaches are evaluated and validated. After identifying all the metrics and variables used in the surveyed papers, we elaborated a taxonomy to classify them according to their utility. To correlate the usability of each class of metrics with both attack types and validation methods, we provided different statistics that clearly highlight the relationship between some attacks and the way their dedicated solutions should be evaluated. In future work, we will explore and study additional aspects that could be correlated with the validation process of DoS attack detection approaches in IoT environments.
References
1. Abdelli, A., Mokdad, L., Ben-Othman, J., Hammal, Y.: Dealing with a non green behaviour in wsn. Simul. Model. Pract. Theory 84, 124–142 (2018)
2. Achir, M., Abdelli, A., Mokdad, L., Benothman, J.: Service discovery and selection in iot: a survey and a taxonomy. JNCA 200, 103331 (2022)
3. Al-Hadhrami, Y., Hussain, F.K.: DDoS attacks in IoT networks: a comprehensive systematic literature review. World Wide Web 24(3), 971–1001 (2021). https://doi.org/10.1007/s11280-020-00855-2
4. Bahaa, A., Abdelaziz, A., Sayed, A., Elfangary, L., Fahmy, H.: Monitoring real time security attacks for iot systems using devsecops: a systematic literature review. Information 12(4), 154 (2021)
5. Dantas Silva, F.S., Silva, E., Neto, E.P., Lemos, M., Venancio Neto, A.J., Esposito, F.: A taxonomy of ddos attack mitigation approaches featured by sdn technologies in iot scenarios. Sensors 20(11), 3078 (2020)
6. Iot signals report (2021). https://azure.microsoft.com/en-us/resources/iotsignals/
7. Kouicem, D.E., Bouabdallah, A., Lakhlef, H.: Internet of things security: a top-down survey. Comput. Netw. 141, 199–221 (2018)
8. Mokdad, L., Abdelli, A., Ben-Othman, J.: Detection of greedy behavior in wsn using ieee 802.15 protocol. In: IEEE Mascots, pp. 106–111 (2014)
9. Lohachab, A., Karambir, B.: Critical analysis of ddos-an emerging security threat over iot networks. JCIN 3(3), 57–78 (2018). https://doi.org/10.1007/s41650-018-0022-5
10. Mosenia, A., Jha, N.K.: A comprehensive study of security of internet-of-things. IEEE Trans. Emerg. Top. Comput. 5(4), 586–602 (2016)
11. Bouakouk, M.R., Abdelli, A., Mokdad, L.: Survey on the cloud-iot paradigms: taxonomy and architectures. In: IEEE ISCC, pp. 1–6 (2020)
12. Porkodi, S., Kesavaraja, D.: Chapter 11 - blockchain for green smart cities. In: Blockchain for Smart Cities, pp. 211–231. Elsevier (2021)
13. Srinivas, T.A.S., Manivannan, S.: Prevention of hello flood attack in iot using combination of deep learning with improved rider optimization algorithm. Comput. Commun. 163, 162–175 (2020)
14. Vishwakarma, R., Jain, A.K.: A survey of ddos attacking techniques and defence mechanisms in the iot network. Telecommun. Syst. 73(1), 3–25 (2020). https://doi.org/10.1007/s11235-019-00599-z
Combined Use of PBMN and Rewriting Logic for Specification and Analysis of IoT Applications
Sofia Abbas1, El Hillali Kerkouche1,2(B), Khaled Khalfaoui1,2, and Allaoua Chaoui2
1 Department of Computer Science, Mohamed Seddik Ben Yahia University, Jijel, Algeria
[email protected], [email protected]
2 MISC Laboratory, Abdelhamid Mehri University, Constantine 2, El Khroub, Algeria
Abstract. An Internet of Things (IoT) application is an environment that interconnects the real world with the digital world using smart devices (sensors and actuators). Modeling IoT applications has become a requirement, and many researchers are interested in integrating business processes into the design of IoT applications. BPMN 2.0 is the most popular standard for modeling business processes. In this paper, we propose an extension of BPMN 2.0 to model IoT applications and a transformation from the extended BPMN 2.0 diagrams to equivalent Maude specifications for analysis purposes. The approach is based on metamodeling the BPMN diagram for IoT applications with the Eclipse Modeling Framework and on the Acceleo language to automate Maude code generation. The approach is illustrated with an example.
Keywords: IoT · BPMN 2.0 · Maude language · Rewriting logic · Meta-modeling · Model transformation · Code generation · EMF · Sirius
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023. S. Chikhi et al. (Eds.): MISC 2022, LNNS 593, pp. 62–75, 2023. https://doi.org/10.1007/978-3-031-18516-8_5
1 Introduction
In recent years, interest in IoT applications has increased due to developments in electronics and Internet technology. The Internet of Things is defined as a “global network and service infrastructure of variable density and connectivity with self-configuring capabilities based on standard and interoperable protocols and formats [which] consists of heterogeneous things that have identities, physical and virtual attributes, and are seamlessly and securely integrated into the Internet” [1,2]. These things are connected to the Internet by devices known as sensors and actuators. Sensors are responsible for capturing real-world data and sending it to the system for processing and analysis, for example capturing temperature, measuring heart rate, or reading rainfall. Actuators, for their part, change the physical state of things, for example by lowering the temperature or activating a camera.
IoT applications are used in several areas of human life. According to Parvaneh Asghari [3], IoT can be used in:
– Health care: smart wearables and health care monitoring systems.
– Smart city: smart home, building computing, security, and traffic monitoring.
– Industry: smart grid and scheduling systems.
Recently, organizations have used business processes to model and manage their operations. The BPMN standard for modeling business processes, developed by the OMG (Object Management Group), is the most widely used among the many business process modeling languages. In addition, its subsequent acceptance as an ISO standard emphasizes its ability to enhance the process standards market [4,7]. Consequently, many researchers are interested in integrating business processes into the modeling of IoT applications by providing a uniform standard for modeling this kind of application. However, the BPMN notation lacks a solid semantic foundation for formal analysis and verification of IoT applications. Since its introduction, rewriting logic (RL) and its implementation in the Maude language have attracted the attention of both theorists and practitioners, who have contributed to exhibiting its generality as a programming paradigm and as a logical and semantic framework.
In this paper, we propose an IoT application modeling approach that uses an extended version of the BPMN notation to define the behavior of IoT applications, and a translation of the resulting behavior models into Maude specifications to analyze application dynamics. To achieve this goal, we first define a metamodel for an extended BPMN. More precisely, we use a subset of the BPMN 2.0 notation and add the elements required for modeling IoT applications.
After that, we use the Eclipse Modeling Framework (EMF) to implement the metamodel and the Sirius framework to generate a visual tool for the proposed BPMN4IoT diagrams. Finally, we define an Acceleo template to translate the BPMN4IoT models created in the visual tool into equivalent Maude specifications. The resulting Maude specifications can be used to simulate and verify the modeled IoT applications. The rest of this paper is organized as follows. Section 2 highlights the most closely related work. Section 3 presents the basic concepts of our study. Section 4 describes our approach, which is based on a metamodel for IoT applications using BPMN and on generating an equivalent Maude specification from the BPMN4IoT diagram. Section 5 illustrates the approach with an example. Section 6 concludes the paper and gives some perspectives.
2 Related Work
To model IoT applications, research suggests that extending existing modeling languages with IoT elements is sufficient. The current tendency is to add new objects that represent IoT elements to the modeling languages [8], which means that there is no need to create a new modeling language for IoT applications. In the following, we look at the works most concerned with modeling IoT applications using the BPMN standard. In the literature, many studies have extended the BPMN standard to adapt it to the IoT environment. Most of them select a subset of BPMN, or all BPMN elements, and add further elements for modeling IoT applications. The works in [9,10,13,14,16] agree on adding the concepts of sensors, actuators, smart devices, and things to the BPMN metamodel. The authors in [9–13,15] represent the IoT process and the smart device as a pool or a lane. Meyer [12] adds a special symbol to distinguish smart devices from the rest of the IoT process. The concept of a physical entity is added in [11,13]; a physical entity is an identifiable part of the real environment that is of interest to the user or the application, such as a business process [11]. To model sensors and actuators, the authors in [9,11–13,16] add sensing and actuating tasks that extend the BPMN activity. In [13], the authors add a resource role class to the sensing task to define the use of the sensor. The work in [9] also adds four new tasks: image task, reader task, collector task, and audio task. In [10], a sensor task is represented as a pool. Furthermore, a set of events has been added in extensions of the BPMN notation. Five types of events (sensor event definition, image event definition, reader event definition, audio event definition, and collector event definition) are added in [9]. Mobility and location events are added in [11–13]. In addition, some authors have added the concept of real-world data to separate real-world data from digital data.
Besides BPMN, other approaches have used other modeling languages to model and analyze IoT applications. Some of these approaches [22,23] use ThingML (Thing Modeling Language) to model the distribution and heterogeneity of IoT systems, including things and their interconnections. In [24], the researchers use UML to model IoT applications; they use UML class diagrams and Java source code to model and specify the system. The authors in [25,26] propose a Visual Domain-Specific Modeling Language (VDSML) for modeling IoT, including its virtual and real components.
3 Background
3.1 Business Process Model and Notation (BPMN)
Business Process Model and Notation is a business process modeling standard that supplies a graphical notation for specifying business process elements in a Business Process Diagram. The first version of BPMN (BPMN 1.0) was released in May 2004 by BPMI (Business Process Management Initiative) [17]. The primary goal of BPMN is to provide a unified notation understandable by all business users, from the analysts who create the initial drafts, to the technical developers responsible for implementing the technology that will execute those processes, and finally to the business people who will manage those processes. BPMN is therefore considered a standardized bridge across the gap between business process design and process implementation [4]. According to BPMN 2.0, there are four BPMN process types: public process, collaboration process, choreography process, and conversation process. BPMN also groups the graphical process elements into five categories: flow objects, connecting objects, swimlanes, data, and artifacts. Figure 1 shows the representation of the essential BPMN elements.
Fig. 1. Basic elements of BPMN 2.0
3.2 Internet of Things (IoT)
Kevin Ashton introduced the term Internet of Things (IoT) in 1999. The IoT is a global infrastructure that interconnects physical and virtual things [18]. An IoT application is based on smart devices, which are connected to the system to allow data exchange. A smart device can be a sensor or an actuator. Sensors capture data from the real world and send it to the system, which analyzes and processes it. Actuators, in turn, carry out the system's reaction toward the real world. Today, IoT is used in many domains, such as smart homes, smart cities, smart environments, agriculture, healthcare, and transport.
3.3 The Theory of Rewriting Logic and Maude
Rewriting logic and the Maude language were both proposed by Meseguer. Rewriting logic (RL) is a logic of action for modeling concurrent systems. It combines equational logic with rewrite rules to model the dynamics of concurrent systems. A rewrite theory is a 4-tuple ℛ = (Σ, E, L, R), where (Σ, E) is an equational theory, L is a set of labels, and R is a set of labeled rewrite rules that are applied modulo the equations E [19].
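To make the definition concrete, the following is a minimal Python sketch of a rewrite theory. It is an analogy, not Maude code. It assumes that terms are flat tuples of constant symbols and that E contains only commutativity, so the equations can be simulated by sorting a term into a canonical form. The names RewriteTheory, commutative, and step, and the rule label merge, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, List, Tuple

Term = Tuple[str, ...]             # a flat term: a tuple of constant symbols
Rule = Tuple[str, Term, Term]      # (label, left-hand side, right-hand side)


@dataclass(frozen=True)
class RewriteTheory:
    sigma: FrozenSet[str]              # signature: the allowed symbols
    normalize: Callable[[Term], Term]  # stands in for the equations E
    labels: FrozenSet[str]             # L: the rule labels
    rules: Tuple[Rule, ...]            # R: the labeled rewrite rules

    def step(self, term: Term) -> List[Tuple[str, Term]]:
        """Return (label, result) for every rule whose left-hand side
        equals the whole term modulo E."""
        canonical = self.normalize(term)
        return [
            (label, self.normalize(rhs))
            for label, lhs, rhs in self.rules
            if self.normalize(lhs) == canonical
        ]


def commutative(term: Term) -> Term:
    """Assumed equations E: order does not matter, so sort the symbols."""
    return tuple(sorted(term))


theory = RewriteTheory(
    sigma=frozenset({"a", "b", "c"}),
    normalize=commutative,
    labels=frozenset({"merge"}),
    rules=(("merge", ("a", "b"), ("c",)),),
)

# ("b", "a") is not syntactically equal to ("a", "b"), but it is equal
# modulo commutativity, so the rule still applies.
print(theory.step(("b", "a")))
print(theory.step(("c",)))
```

The sketch matches only whole terms, which keeps it short; a real rewriting engine such as Maude also matches rules inside larger terms.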
Maude is a high-performance rewriting logic language that supports executable specification and programming of distributed systems. Maude's basic modules are functional modules and system modules [20]. Maude and its theorem-proving tools support the following:
1. Formal specification: the result of this process is a formal model of the system that resolves the ambiguities of the informal specification using the rewriting logic formalism.
2. Execution of the specification: for simulation and debugging purposes.
3. Model-checking analysis: to find errors in highly distributed and nondeterministic systems that a particular execution does not reveal. Model checking considers all system behaviors from an initial state up to some depth or condition.
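The difference between executing a specification and model checking it can be illustrated with a small Python sketch. It is an analogy, not Maude: the transition table STEPS and its state names are hypothetical, execute follows one single behavior, and search explores every reachable state breadth-first until it finds one that satisfies a condition.

```python
from collections import deque

# Hypothetical transition system: each state maps to its possible successors.
# "choose" is nondeterministic, and one of its branches ends in "stuck",
# a state without successors that is not the intended final state "done".
STEPS = {
    "start": ["choose"],
    "choose": ["work", "stuck"],
    "work": ["done"],
    "done": [],
    "stuck": [],
}


def execute(state):
    """One particular execution: always take the first available successor."""
    trace = [state]
    while STEPS[state]:
        state = STEPS[state][0]
        trace.append(state)
    return trace


def search(initial, condition):
    """Explore all reachable states breadth-first and return a shortest
    path to a state satisfying `condition`, or None if there is none."""
    parent = {initial: None}
    queue = deque([initial])
    while queue:
        state = queue.popleft()
        if condition(state):
            path = []
            while state is not None:
                path.append(state)
                state = parent[state]
            return path[::-1]
        for successor in STEPS[state]:
            if successor not in parent:
                parent[successor] = state
                queue.append(successor)
    return None


def is_deadlock(state):
    """A state with no successors that is not the intended final state."""
    return not STEPS[state] and state != "done"


print(execute("start"))              # the single run never visits "stuck"
print(search("start", is_deadlock))  # the exhaustive search does
```

The single execution ends in done and gives no hint of a problem, whereas the exhaustive search returns the path start, choose, stuck.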
4 Our Approach
In this section, we present our approach, which uses metamodeling to define an extended BPMN language adapted to modeling IoT applications. The proposed metamodel (BPMN4IoT) is then used to create a modeling environment. To facilitate the analysis of the behavior of the modeled IoT application, we use the rewriting logic language Maude, and for this purpose we define a mapping to Maude. Our approach is divided into three phases. The first phase defines a metamodel for BPMN4IoT using the Eclipse metamodeling tool EMF (Eclipse Modeling Framework). In the second phase, we use the Sirius framework to create a visual environment for BPMN4IoT diagrams based on the proposed metamodel. Finally, we define and automate the generation of the equivalent Maude specification using the Acceleo language. The details of the approach are described in the following.
4.1 Eclipse Modeling Tool
The Eclipse Modeling Project focuses on the evolution of model-based development technologies by providing a set of modeling frameworks, tooling, and standards implementations. In our work we use the following tools: EMF, Sirius, and Acceleo [21].
1. EMF (Eclipse Modeling Framework): a modeling framework and code generation facility for building tools based on a metamodel. EMF allows a metamodel to be created and the corresponding code to be generated automatically. EMF (Core) includes three fundamental pieces:
– EMF: a metamodel (Ecore) for describing models, and runtime support for the models with XMI serialization.
– EMF.Edit: generic reusable classes for building editors for EMF models.
– EMF.Codegen: generates everything needed to build a complete editor for an EMF model. Three levels of code generation are supported:
• Model: provides Java interfaces and implementation classes for all the classes in the model.
• Adapters: generates implementation classes that adapt the model classes for editing and display.
• Editor: produces a properly structured editor that conforms to the recommended style for Eclipse EMF model editors [21].
2. Sirius: an Eclipse project for creating graphical modeling tools using Eclipse Modeling technologies such as EMF. Sirius is based on a viewpoint approach and was created by the Obeo company [21].
3. Acceleo: a template-based technology that includes tools for creating custom code generators. Acceleo allows us to automatically generate any code from a data source available in an EMF format [21].
4.2 Meta-modelling BPMN4IOT Diagram
To model IoT behavior, we select a subset of BPMN elements (see Table 1). To better describe the IoT environment, we add four concepts:
– IoT device: an IoT device is a part of the IoT environment. It ensures the connection between the physical world and the digital world. According to [11], the IoT device acts as a process participant. In our metamodel, an IoT device is an extension of the process class that represents a participant.
– Thing: Meyer defines a thing as a physical entity, which can be connected to multiple IoT devices [11]. In our metamodel, we separate the thing from the IoT device. Meyer represents the physical entity as an empty rectangle [11] (see Fig. 3).
– Sensing task: a sensing task is a task executed by a sensor. It extracts data from a thing through a message flow. In our metamodel, a sensing task is an extension of the BPMN task. To visualize the sensing task, we use the representation of [11,13] (see Fig. 3).
– Actuator task: an actuator task is a task that triggers an action on the physical thing. It sends a message to a thing, and the message can change the state of the thing. In our metamodel, an actuator task is an extension of the BPMN task. To visualize the actuator task, we use the representation of [11,13] (see Fig. 3).
In our metamodel, an IoT application is a collaboration diagram. It consists of two types of participants: the central process and IoT devices. We used EMF to visualize our metamodel (see Fig. 2).
Table 1. BPMN subset elements
Task: user task, manual task, service task, script task, send task, receive task, human task
Event: send message, receive message, timer, signal, error
Gateways: exclusive gateways, parallel gateways
Connection: sequence flow, message flow
Fig. 2. Simplified BPMN4IOT diagram meta-model
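The four added concepts and their relation to the BPMN classes can be sketched as plain Python data classes. This is only an illustration of the inheritance structure described above, not the authors' Ecore metamodel: the attribute thing, the class Collaboration, and the helper unknown_things are assumptions, as are the task name irrigate and the pairing of tasks with thing1 and thing2.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Thing:
    name: str                       # physical entity, kept separate from devices


@dataclass
class Task:
    name: str                       # plain BPMN task


@dataclass
class SensingTask(Task):
    thing: str = ""                 # thing the sensor reads from (assumed attribute)


@dataclass
class ActuatorTask(Task):
    thing: str = ""                 # thing the actuator acts on (assumed attribute)


@dataclass
class Process:
    name: str                       # a participant of the collaboration
    tasks: List[Task] = field(default_factory=list)


@dataclass
class IoTDevice(Process):
    """An IoT device extends the process class, as described in the text."""


@dataclass
class Collaboration:
    central: Process
    devices: List[IoTDevice] = field(default_factory=list)
    things: List[Thing] = field(default_factory=list)

    def unknown_things(self) -> List[str]:
        """Names used by sensing or actuator tasks that match no declared thing."""
        declared = {t.name for t in self.things}
        return [
            task.thing
            for device in self.devices
            for task in device.tasks
            if isinstance(task, (SensingTask, ActuatorTask))
            and task.thing not in declared
        ]


rain = IoTDevice("read rainfall device", [SensingTask("read rainfall", "thing1")])
pump = IoTDevice("irrigation device", [ActuatorTask("irrigate", "thing2")])
app = Collaboration(Process("central process"), [rain, pump],
                    [Thing("thing1"), Thing("thing2")])
print(app.unknown_things())
```

A check such as unknown_things is the kind of well-formedness constraint a metamodel-based editor could enforce before any Maude code is generated.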
Fig. 3. IoT element
Fig. 4. Basic BPMN4IoT functional module
4.3 Formalization of BPMN4IOT Diagram Using Maude
In this section, we explain how to express a BPMN4IOT diagram in Maude. To formalize our diagram in the Maude language, we first define a basic BPMN4IoT functional module that describes the basic operations of BPMN4IOT diagrams. This module is shown in Fig. 4. Table 2 shows the Maude equivalents of the principal BPMN structures.
Table 2. Representation of principal structures in Maude
Columns: BPMN4IOT control structures; corresponding Maude rewriting rules
rl [sequence flow]: [task1: Activity] => [task2: Activity]
rl [End]: [task1: Activity] => [End: EndEvent]
rl [flowToEvent]: [task1: Activity] => [event: InterEvent]
rl [flowToTask]: [event: InterEvent] => [task1: Activity]
rl [C1]: [task1: Activity] => [task2: Activity]
rl [C2]: [task1: Activity] => [task3: Activity]
rl [C1]: [task1: Activity] => [task2: Activity]
rl [C2]: [task1: Activity] => [event: EndEvent]
rl [parallel]: [task1: Activity] => [task2: Activity] [task3: Activity]
rl [flowMessageToEvent]: [task1: Activity] => [event: StartEvent]
rl [flowMessageToTask]: [thing: Thing] => [task1: Activity]
rl [flowMessageToThing]: [task1: Activity] => [thing: Thing]
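The rules of Table 2 treat branching in two different ways. The two rules labeled C1 and C2 share the same left-hand side, which reads as a choice in which exactly one branch is taken, while the rule labeled parallel has a single left-hand side and a right-hand side containing both targets. The Python sketch below illustrates this difference. It is an analogy, not Maude: it assumes a state is a multiset of active node names, and successors is a hypothetical helper.

```python
from collections import Counter

# Each rule is (label, consumed nodes, produced nodes), mirroring Table 2.
EXCLUSIVE = [
    ("C1", ["task1"], ["task2"]),
    ("C2", ["task1"], ["task3"]),
]
PARALLEL = [
    ("parallel", ["task1"], ["task2", "task3"]),
]


def successors(active, rules):
    """Return (label, next state) for every rule enabled in `active`.
    A state is a multiset of active node names, returned as a sorted list."""
    have = Counter(active)
    result = []
    for label, consumed, produced in rules:
        need = Counter(consumed)
        if not (need - have):  # every consumed node is currently active
            after = have - need + Counter(produced)
            result.append((label, sorted(after.elements())))
    return result


# Two enabled rules, each leading to a state with a single active task.
print(successors(["task1"], EXCLUSIVE))
# One enabled rule, leading to a state in which both tasks are active.
print(successors(["task1"], PARALLEL))
```

With the exclusive rules the state has two possible successors, [task2] or [task3]; with the parallel rule it has one successor in which task2 and task3 are both active.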
4.4 Maude Code Generation for BPMN4IOT Diagram
This step translates the BPMN diagram of an IoT application into its equivalent Maude specification using the Acceleo transformation language. The translation is implemented as an Acceleo template (see Fig. 5) that traverses the elements of the source model (instances of the metamodel) and generates the corresponding Maude code.
Fig. 5. Acceleo template for Maude code generation
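The idea of a model-to-text template, which walks over the model elements and emits one piece of text per element, can be sketched in a few lines of Python. This is not the Acceleo template of Fig. 5, whose content is not available in the text. The model representation (a list of labeled flows between named, typed nodes) and the functions node_text and generate_rules are assumptions; the emitted lines only copy the shape of the rules in Table 2.

```python
from typing import Iterable, Tuple

Node = Tuple[str, str]             # (name, kind), e.g. ("task1", "Activity")
Flow = Tuple[str, Node, Node]      # (rule label, source node, target node)


def node_text(node: Node) -> str:
    """Render a node in the bracketed form used in Table 2."""
    name, kind = node
    return f"[{name}: {kind}]"


def generate_rules(flows: Iterable[Flow]) -> str:
    """Traverse the flows of a model and emit one rule line per flow."""
    lines = []
    for label, source, target in flows:
        lines.append(f"rl [{label}]: {node_text(source)} => {node_text(target)}")
    return "\n".join(lines)


# A hypothetical two-flow model: an intermediate event, a task, an end event.
model = [
    ("flowToTask", ("event", "InterEvent"), ("task1", "Activity")),
    ("End", ("task1", "Activity"), ("End", "EndEvent")),
]
print(generate_rules(model))
```

Whether the emitted lines load in Maude depends on the operator declarations of the basic functional module (Fig. 4), which the sketch does not reproduce.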
5 Case Study
To evaluate the practical usefulness of the proposed approach, we consider the example of an automatic irrigation system [18]. Figure 6 presents the diagram of the application created in our tool. The process includes three participants (a central process and two IoT devices, namely the irrigation device and the read rainfall device) and two things (named thing1 and thing2).
Fig. 6. Irrigation Diagram created in our tool.
Fig. 7. Generated Maude specification for the irrigation diagram
Fig. 8. Execution of the resulted Maude specification under Maude system
The read rainfall device is woken up periodically each day and starts the process by executing the read rainfall sensing task. The rest of the process is illustrated in Fig. 6. To analyze the behavioral specification of the automatic irrigation system, we transform it into its equivalent Maude specification. This transformation is automated by executing the proposed Acceleo template, and the resulting Maude specification is shown in Fig. 7. To analyze the resulting Maude specification by simulation, we invoked the Maude system. Figure 8 shows the result of the simulation beginning from the start node.
6 Conclusion
In this paper, we have presented an approach and a tool environment for modeling IoT applications using BPMN 2.0. The approach is based on metamodeling and the Acceleo language for modeling BPMN4IOT diagrams and analyzing them with the Maude language. We have proposed a metamodel of an extended BPMN for modeling IoT applications and built a visual editor for it with the Sirius framework. By translating the extended BPMN models into equivalent Maude specifications, we enable the verification and analysis of IoT applications. In future work, we plan to extend the metamodel to cover all aspects of IoT applications and to verify some constraints specific to IoT applications.
References
1. Whitmore, A., Agarwal, A., Da Xu, L.: The internet of things—a survey of topics and trends. Inf. Syst. Front. 17(2), 261–274 (2014). https://doi.org/10.1007/s10796-014-9489-2
2. Theodore, G.L., Gibson, B., Patricia, T.E., Pierangelo, R.: The Cloud-to-Thing Continuum July 2020, License CC BY 4.0
3. Parvaneh, A., Amir Masoud, R., Hamid Haj, S.J.: Internet of things applications: a systematic review. In: Computer Networks 148, December 2018. https://doi.org/10.1016/j.comnet.2018.12.008
4. Information technology - Object Management Group Business Process Model and Notation, v2.0.2 (2013). Published by ISO as the 2013 edition standard: ISO/IEC 19510
5. Freddy, J., Romina, T.: Building an IoT-aware healthcare monitoring system. In: Conference: 2015 34th International Conference of the Chilean Computer Science Society (SCCC), November 2015. https://doi.org/10.1109/SCCC.2015.7416592
6. Mohammed, O., Kechar, B., Samia, B.: Towards a new generation of internet of things. In: 2018 ISTE OpenScience Published by ISTE Ltd, February 2018. https://doi.org/10.21494/ISTE.OP.2018.0215
7. Matthias, G., Simon, H., Jörg, L., Guido, W.: BPMN 2.0: the state of support and implementation. Future Gener. Comput. Syst. 80, 250–262 (2017). https://doi.org/10.1016/j.future.2017.01.006
8. Nadja, B., et al.: Modeling IoT-aware business processes - a state of the art report. In: RJ10540 (ALM1808-004) Computer Science, November 2018
9. Alaaeddine, Y., Christine, B., Rajaa, S., Anind, K.D.: uBPMN: a BPMN extension for modeling ubiquitous business processes. Inf. Softw. Technol. 74, 55–68 (2016). https://doi.org/10.1016/j.infsof.2016.02.002
10. Timurhan, S., Patrik, S., Nina, O., Oliver, K.: Extending BPMN for wireless sensor networks. In: Conference: Business Informatics (CBI) 2013 IEEE 15th Conference on, July 2013. https://doi.org/10.1109/CBI.2013.24
11. Meyer, S., Ruppen, A., Magerkurth, C.: Internet of things-aware process modeling: integrating IoT devices as business process resources. In: Conference: Proceedings of the 25th International Conference on Advanced Information Systems Engineering, June 2013. https://doi.org/10.1007/978-3-642-38709-8_6
12. Meyer, S., Ruppen, A., Hilty, L.M.: Internet of things-aware process modeling: integrating IoT devices as business process resources. In: Conference: International Conference on Advanced Information Systems Engineering, June 2015. https://doi.org/10.1007/978-3-319-19243-7_27
13. Chen, Y., Wang, M.: A study of extending BPMN to integrate IoT applications. In: Conference: 2017 International Conference on Applied System Innovation (ICASI), May 2017. https://doi.org/10.1109/ICASI.2017.7988292
14. Jung, J., Kong, J., Park, J.: Service integration toward ubiquitous business process management (2008). https://doi.org/10.1109/IEEM.2008.4738121
15. Tranquillini, S., et al.: Process-based design and integration of wireless sensor network applications. In: Barros, A., Gal, A., Kindler, E. (eds.) BPM 2012. LNCS, vol. 7481, pp. 134–149. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-32885-5_10
16. Yousfi, A., Freitas, A., Dey, A.K., Saidi, R.: The Use of Ubiquitous Computing for Business Process Improvement. IEEE Trans. Services Comput. 9(4), 621–632 (2015). https://doi.org/10.1109/TSC.2015.2406694
17. Rosing, M.V., White, S., Cummins, F., Man, H.: Business process model and notation (BPMN). In: In book: The Complete Business Process Handbook, Volume 1 Publisher: Elsevier - Morgan Kaufmann, March 2015. https://doi.org/10.1016/B978-0-12-799959-3.00021-54
18. Domingos, D., Martins, F.: Using BPMN to model internet of things behavior within business process. Int. J. Project Manag. 5(4), 39–51 (2017). https://doi.org/10.12821/ijispm050403
19. Bruni, R., Meseguer, J.: Semantic foundations for generalized rewrite theories. Theor. Comput. Sci. 360(1–3), 386–414 (2006). https://doi.org/10.1016/j.tcs.2006.04.012
20. Meseguer, J.: Rewriting logic and maude: a wide-spectrum semantic framework for object-based distributed systems. In: Smith, S.F., Talcott, C.L. (eds.) FMOODS 2000. IAICT, vol. 49, pp. 89–117. Springer, Boston, MA (2000). https://doi.org/10.1007/978-0-387-35520-7_5
21. eclipse.org. https://www.eclipse.org/. Accessed 4 June 2022
22. Moin, A., et al.: From things modeling language (ThingML) to things machine learning (ThingML2). In: International Conference on Model Driven Engineering Languages and Systems (MODELS) Poster and Extended Abstract, September 2020. https://doi.org/10.13140/RG.2.2.16121.29284
23. Morin, B., et al.: Model-based software engineering to tame the IoT jungle. IEEE Softw. 34(1), 30–36 (2017). https://doi.org/10.1109/MS.2017.11
24. Geller, M., Meneses, A.: Modelling IoT systems with UML: a case study for monitoring and predicting power consumption. Am. J. Eng. Appl. Sci. 14(1), 81–93 (2021). https://doi.org/10.3844/ajeassp.2021.81.93
25. Eterovic, A., et al.: Modelling IoT systems with UML: an internet of things visual domain specific modeling language based on UML. In: Information, Communication and Automation Technologies (ICAT), 2015 XXV International Conference October 2015. https://doi.org/10.1109/ICAT.2015.7340537
26. Salihbegovic, A., et al.: Design of a domain specific language and IDE for internet of things applications. In: Information and Communication Technology, Electronics and Microelectronics (MIPRO), 2015 38th International Convention on May 2015. https://doi.org/10.1109/MIPRO.2015.7160420
The IoT Ecosystem: Components, Architecture, Communication Technologies, and Protocols
Seloua Haddaoui1(B), Salim Chikhi1, and Badreddine Miles2
1 MISC Laboratory, University of Constantine 2, 25000 Constantine, Algeria
[email protected]
2 MISC Laboratory, University of Constantine 1, 25000 Constantine, Algeria
https://www.misc-lab.org/en/
Abstract. The Internet of Things (IoT) is an innovative internet paradigm that connects billions of smart devices all over the world. It seeks to integrate modern technology into practically all aspects of life to make them easier, smarter, and accessible at any time and from anywhere. In this paper, we provide a quick overview of the IoT ecosystem, its components, and its architecture. We also show how IoT-based solutions are founded on two essential pillars: information and technologies. Accordingly, we give an overview of the devices used to gather information in all domains, identify those used in Agriculture 5.0, and list the most widely used IoT technologies and protocols, classifying them into two main categories: long-range and short-range technologies. We conclude by highlighting some of their benefits and drawbacks to help in choosing among them according to system requirements.
Keywords: Internet Of Things (IoT) · IoT ecosystem · Sensors · Gateways · Reference architecture · Communication technologies & protocols
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023. S. Chikhi et al. (Eds.): MISC 2022, LNNS 593, pp. 76–90, 2023. https://doi.org/10.1007/978-3-031-18516-8_6
1 Introduction
The Internet of Things (IoT) is the interconnection of computing devices embedded in all kinds of things that we use in our daily life, enabling them to sense a large amount of data in our environment. The term IoT was first coined in 1999 during a presentation on supply chain management at MIT by the British technology pioneer Kevin Ashton, and it refers to the interoperability between a network of things using Radio-Frequency Identification (RFID), with the help of AI [2,20,30]. The IoT is a new paradigm that aims to integrate modern technology into all aspects of life, making them easier, smarter, and accessible at any time and from anywhere. The IoT ecosystem comprises many components, which are organized in an architecture that includes three layers: the Cloud, the Device, and the Application. The latter two are connected through information transfer technologies and communication protocols. These technologies and protocols can be classified into short-range and long-range categories according to their range. The Cloud layer is responsible for storing information collected from devices and controlling them remotely through an application programming interface (API). It also allows users to access this data using mobile apps or web browsers. The Device layer consists of sensors that gather information from the environment and send it back to the cloud, where it can be analyzed by an application developer who uses it to create new products or services. The Application layer is made up of devices that use APIs to communicate with each other so that they can share data relevant to their tasks within specific contexts, such as Agriculture 5.0 systems, where they may need to exchange information on soil fertility levels between fields or farms at different locations (e.g., one field needs more fertilizer than another).
In this work, Sect. 1 gives a brief introduction to the Internet of Things (IoT) ecosystem, and Sect. 2 gives a brief overview of related IoT work. Furthermore, we illustrate that IoT-based systems are built on two central pillars: information and technologies. Accordingly, Sect. 3 gives an overview of the devices and elements used to gather information in all domains and identifies those used in Agriculture 5.0. Section 4 presents the IoT reference architecture, and Sects. 5 and 6 list the most widely used IoT technologies and protocols, classifying them into two main categories, long-range and short-range technologies, and conclude by pointing out some of their benefits and drawbacks to help in choosing among them according to the system's needs.
2 Literature Background
Since its conception, the IoT has been used in almost all domains of life. In this section, we briefly review works from the literature. Hauet considered the IoT as a federation of local networks capable of communicating with the Internet [17] in order to collect, process, and analyze data at the local level (smart devices) or in the cloud. In its first part, the article proposes three classes of networks for the IoT:
– Edge networks, which guarantee the connection between end devices and edge routers or gateways,
– Infrastructure networks or backbone networks (intermediaries), which bring together sets of networks,
– Supply networks or backhaul, which pass the information to the Internet.
It also considers the IoT as a concept with three stages:
1. Data is gathered, sent, exchanged, and routed to Internet servers;
2. Data sorting, storage, and processing;
3. Information recovery and use.
The article's second section considers novel protocols used in IoT, particularly radio communications. The 6LoWPAN and TSCH protocols are described, starting with the Internet Protocol, IPv6, and moving on to newer protocols such as CoAP, MQTT, and OPC UA. Finally, the paper underlines the need to use secure Internet protocol versions.
Lee et al. [19] used the term IoT to describe a networked computer system that is connected to a physical device. Their paper explores the state of the art of IoT networks and their components; its main goal is to analyze and discuss the characteristics and technical details of IoT networks. Network topology and protocol stack are two key components of an IoT network. The topology is the pattern formed by the nodes and their connections, and the protocol stack defines what information may move through the network. The authors also give information on upcoming IoT networks and their issues.
Farooq et al. [11] presented a survey on the role of IoT in agricultural fields such as smart farming, livestock, and greenhouse monitoring. The survey considers the use of different IoT devices and technologies, such as sensors (WSN), cameras, drones, network protocols, and topologies, which generate a large amount of data. These data are used in systems that help farmers make decisions about field monitoring in order to enhance productivity and minimize the farmer's interventions.
In smart agriculture, Miles et al. [24] studied the performance of the Long Range Wide Area Network (LoRaWAN) protocol in an IoT application for a pilot farm. They simulated it with NS3, testing several scenarios, and then analyzed the results by predicting the successful packet delivery rate for different parameters, such as the number of nodes and the transmission time. They proposed a mathematical model and validated it by comparison with other simulation results.
The work in [8] compares LoRaWAN, which is favored for its long range and low power consumption, with other communication protocols over several network parameters; the results showed LoRaWAN to be the most suitable protocol.
It also discussed the new challenges facing LoRaWAN, such as bandwidth, data rate, battery life, range, latency, and throughput. Finally, it classified IoT devices into three classes according to the application requirements. An integrative decision model for smart agriculture based on IoT and Machine Learning is proposed in [26]. It is based on four layers and receives a huge volume of data gathered by interconnected IoT devices, covering soil moisture, temperature, humidity, fire detection, irrigation, and water level. Those data are transported to the cloud and then analyzed with ML algorithms such as SVM, SVR, and Random Forest Regression to make predictions and decisions for farm fields. The authors also proposed an implementation of those algorithms in Python using a Google Colab notebook. The authors of [22] studied the influence of ADR (see footnote 1) and other LoRaWAN parameters on LoRaWAN performance on a testbed over 17 h, for each of 18 nodes
Footnote 1: LoRaWAN uses an Adaptive Data Rate (ADR) strategy to optimize data rate, airtime, and energy usage in real time. The fundamental issue with LoRaWAN is that the standard does not specify how the network server should command end nodes to adapt their data rate.
The IoT Ecosystem
dispersed around one gateway and divided into three groups, “Near”, “Far”, and “Furthest”, at different fixed distances. The outcomes showed that enabling the ADR scheme negatively impacts the Packet Delivery Ratio (PDR), while enabling acknowledgements (ACKs) improved it for all groups, although ACKs raise scalability problems (see footnote 2); the influence of payload length was not reported. The evaluation of the link check commands showed that they had no impact on the PDR. The survey in [20] considered IoT as a part of the Internet of the future, with physical things and virtual components. It systematically reviewed distinct definitions of IoT, its architecture, crucial technologies, developing techniques for its implementation, and its applications. It also discussed some open issues related to IoT applications, the key issues faced by the research community, and potential solutions. With the large increase in IoT features since its appearance, it has been used to solve several problems in the agricultural field, such as livestock keeping, crop yield enhancement, greenhouse monitoring, irrigation processes, soil conditions, and fertilizer levels. In our next work, we aim to enhance beekeeping, motivated by the decrease in pollination, which hinders a critical step in agriculture. This decrease is directly linked to bees, which currently suffer from a high mortality rate and from many other problems, such as the decreased quantity of bee products. As a result, we have included the following brief works on this topic. Kontogiannis [18] proposed a new beekeeping system that monitors environmental conditions such as internal and external temperature and cell humidity, and that takes the safety aspect into consideration.
A new architecture of suitable beehive cells was proposed together with the new system’s parameters and conditions; it consists of a box in which sensors and communication equipment are installed. In [12], Gil-Lebrero et al. proposed a new remote beehive monitoring system based on WSN to study the relationship between honeybee colonies and their environment by sensing temperature, humidity, and hive weight, using three sensors placed in different locations on the hive. The proposed system is composed of three levels: (1) nodes or beehives on the lowest level, connected via IEEE 802.15.4 to the intermediate level for communicating the gathered data; (2) local database computers storing the sensed data, connected to the highest level via long-range connections such as 3G/WiFi or WiMAX; and (3) the global server, which may be located far from the end nodes. This article highlighted the experiment of monitoring the hive’s weight to predict the evolution of honeybee colonies during blooms and the level of honey production.
Footnote 2: A LoRaWAN gateway may receive several transmissions at the same time (if they use different SFs), but it can only send on one channel at a time and cannot receive while doing so. As a result, if the gateway must often enter transmission mode while servicing a high number of nodes, acknowledgment requests will significantly degrade the network’s overall PDR.
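The packet delivery ratio used to compare the “Near”, “Far”, and “Furthest” groups can be sketched as the fraction of transmitted frames that reach the network server. The snippet below is only an illustration: the `GroupStats` structure and all counter values are assumptions made for the example and are not measurements from [22].

```python
from dataclasses import dataclass


@dataclass
class GroupStats:
    """Uplink counters for one node group (hypothetical structure)."""
    sent: int      # frames transmitted by the nodes of the group
    received: int  # frames that reached the network server


def packet_delivery_ratio(stats: GroupStats) -> float:
    """PDR = received / sent; defined here as 0.0 when nothing was sent."""
    if stats.sent == 0:
        return 0.0
    return stats.received / stats.sent


# Illustrative counters only; these are NOT results from the cited testbed.
groups = {
    "Near": GroupStats(sent=1000, received=950),
    "Far": GroupStats(sent=1000, received=800),
    "Furthest": GroupStats(sent=1000, received=600),
}
pdr_by_group = {name: packet_delivery_ratio(s) for name, s in groups.items()}
```

Computing the ratio per group, rather than over the whole network, is what allows the effect of distance to be separated from the effect of ADR or ACK settings.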
Gocheva et al. [14] highlighted the benefits and opportunities of IoT technologies in agriculture, for example in greenhouses, closed farms, pest control, livestock monitoring, and the management of agricultural machines. They considered precision farming as a combination of three major elements: information, as real-time data gathered from sensors; technologies such as M2M, WSN, RFID, RTS, and Cloud Computing; and management, which aims to improve production efficiency and product quality, reduce cost, save energy, measure the use of chemicals, and protect soil and groundwater, the major topics of precision agriculture. Furthermore, they cited a few well-known IoT agricultural applications around the world, such as Rowbot in the USA for correcting nitrogen levels, the Korean system for eel management, Intelkia for managing the Spanish smart garden in real time, the Russian Dairy Production Analytics (RDPA), and the Spanish vineyard control system. Finally, they recommended developing Bulgarian agriculture through IoT technologies.
3 IoT Components
In order to understand the IoT paradigm, we need to understand its architecture and its components, with an emphasis on the scalability, extensibility, and interoperability of the various ubiquitous IoT elements, which fall into three key parts: IoT devices, IoT gateways, and platforms [2]. Those elements form what we call the IoT ecosystem, which is based on two major components, hardware and software, that are interconnected through a wide variety of networking technologies and protocols. The hardware refers to the physical devices and gateways used to collect and transport data.
3.1 Devices
In terms of the IoT ecosystem, a device is a piece of smart equipment with essential communication capabilities as well as optional capabilities for sensing, actuating, data gathering, data reception, data storage, and, if needed, data processing. It can execute commands with or without human intervention.
Sensors. Also known as “motes”, they are interconnected things composed of four main elements: a micro-controller, a transceiver, a power source, and an external memory. Their main function is to sense the environment’s conditions and gather real-time information from different entities without human interaction. We can distinguish two main classes, passive and active sensors [2]:
– Passive sensors: These sensors explore the environment in a passive way; their major function is to sense, detect, and collect data. Examples illustrated in Fig. 1 include humidity, smoke, motion, and temperature sensors.
– Active sensors: These sensors investigate the environment in an active way; they both act on the environment and perceive the resulting data. They are used to operate, monitor, and control IoT application services. Instances shown in Fig. 1 include sonar, radar, and earthquake sensors.
Fig. 1. Classification and instances of IoT sensors
Taking the beekeeping domain as a case study, we can see that the most used sensors are passive ones, deployed to monitor indoor and outdoor beehive conditions: temperature, humidity, gas, and sound sensors [18,33]. Moreover, weight sensors are used to monitor honey production, and weather information, soil parameters, and bloom data can be received from satellites to help apiarists make better decisions and extend the scope of beekeeping.
Actuators. This class groups the mechanical or electromechanical devices and electronic objects, recently called mechatronics, that are capable of converting electrical signals into mechanical energy [7]. Examples include the opening and closing of electrohydraulic valves in tractors, robots, or automated irrigation systems, biomedical prosthesis devices for artificial hearts and ears, and the Peltier cell actuators used to control beehives [18].
3.2 Gateways
The gateway constitutes the medium of an IoT ecosystem, but it may itself be regarded as a significant connected thing because of its important role in the exchange of data flows between IoT devices and end-users, and because of its capacity to manage several networks and achieve interoperability between devices that use heterogeneous communication protocols [19]. Typically, IoT gateways can operate on
Linux, Android, or even a microcontroller, and can perform tasks such as protocol translation, data filtering, data storage, and, if necessary, data computing or processing. They convert the gathered data from analog to digital format and guarantee bidirectional data transfer by employing their local knowledge of the network map and executing optimization algorithms for Internet-based communication purposes [20]. The major issue facing IoT gateway systems is the variety of protocols and sensor devices within the WSN [27]; thus, we may cluster IoT gateways into two types based on their architecture:
– Gateway without sensor devices: as shown in Fig. 2, this ecosystem is composed of three levels: (1) the end nodes, which are typically small, low-powered devices that gather data from the surroundings, as explained before, and communicate with the gateway via low-power network protocols such as LoRaWAN; (2) the gateway, whose main function is to provide a communication bridge between low-power and high-power networking devices; and (3) the cloud, which is responsible for data storage, processing, and analysis.
Fig. 2. IoT ecosystem (Gateway without sensors)
– Gateway with sensor devices onboard: in this ecosystem the sensors are embedded in the gateway as we illustrated in Fig. 3.
Fig. 3. IoT ecosystem (Gateway with sensors onboard)
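The gateway tasks listed above (protocol translation, data filtering, and local storage before forwarding to the cloud) can be illustrated with a minimal sketch. The 5-byte frame layout, the plausibility thresholds, and all function names below are assumptions chosen for the example; they do not correspond to any particular gateway product or protocol.

```python
import json
from typing import Optional


def decode_sensor_frame(frame: bytes) -> dict:
    """Translate a hypothetical 5-byte binary frame into a dictionary.

    Assumed layout: 1 byte node id, 2 bytes temperature in 0.01 degC
    (signed, big-endian), 2 bytes relative humidity in 0.01 %.
    """
    if len(frame) != 5:
        raise ValueError("unexpected frame length")
    node_id = frame[0]
    temperature = int.from_bytes(frame[1:3], "big", signed=True) / 100
    humidity = int.from_bytes(frame[3:5], "big") / 100
    return {"node": node_id, "temperature_c": temperature, "humidity_pct": humidity}


def plausible(reading: dict) -> bool:
    """Data filter: keep only readings inside an assumed physical range."""
    return (-40 <= reading["temperature_c"] <= 85
            and 0 <= reading["humidity_pct"] <= 100)


def gateway_forward(frame: bytes, store: list) -> Optional[str]:
    """Decode, filter, store locally, and return the JSON payload for the cloud."""
    reading = decode_sensor_frame(frame)
    if not plausible(reading):
        return None  # filtered out, nothing is forwarded
    store.append(reading)
    return json.dumps(reading)
```

In this sketch the compact binary frame plays the role of the low-power side and the JSON string the role of the Internet-facing side, which is the bridging function attributed to the gateway in the text.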
3.3 Networks
From a network point of view, we can regard the IoT as a network of compatible heterogeneous networks such as wireless sensor networks (WSNs), mobile networks, Low Power and Lossy Networks (LLNs), Wireless Local Area Networks (WLANs), and Low Power Wide Area Networks (LPWANs) [17]. They are
an essential piece of the IoT ecosystem that plays a vital role in computation, data interchange, and communication between IoT devices and the Internet, to aid smart decision making. IoT networks guarantee the interconnection between various kinds of networks through several communication technologies: either long-range networks such as LoRa, NB-IoT, SigFox, LTE-M, Telensa, Ingenu [28], and cellular networks (3G, 4G, and 5G), or short-range networks such as WiFi, Z-Wave, ZigBee, Bluetooth, RFID, BLE, Thread, and LiFi. We could consider the LPWANs (LoRa, NB-IoT, and SigFox) as the most reliable networks for IoT systems, as discussed in Sect. 5.
4 IoT Architecture
As mentioned before, IoT was devised to interconnect a gigantic number of things that have various requirements and are capable of observing their surroundings ubiquitously, in order to gather real-time data in almost all fields of life, improve their quality, and make people’s lives more comfortable and smarter. To meet these needs, researchers have proposed many IoT architectures that differ in the number of layers, domain requirements, required computing paradigms, and other specifications [6,9,15,20,27]. Most IoT research builds on layered architectures [31] based on the OSI model, in which data flow in both directions, from the physical to the application layer and vice versa, as depicted in Fig. 4. Researchers take into consideration the generic three-layer model, which the IEEE P2413 standard considers the fundamental IoT architecture and which can be used as a reference model for any actual IoT system, helping software engineers understand, compare, and evaluate IoT solutions through a uniform procedure [21]. This reference architecture can be extended to suit application and domain requirements by adding new layers such as a service management layer and a business layer [28]. A previous study by our research team [24] also focused on the three-layer architecture to fit the LoRaWAN structure. Here we consider it as a pyramid, as depicted in Fig. 5.
4.1 The Physical Layer
This layer contains the hardware part of the ecosystem: smart objects, sensors, actuators, and all the IoT equipment responsible for relating the IoT ecosystem to the ubiquitous environment in order to sense and gather real-time data.
Fig. 4. Generic IoT architecture and a few variants based on OSI model as we have concluded
4.2 The Network Layer
This layer acts as a bridge between the physical and application layers and is responsible for sending and receiving the data flow. It is formed of the communication technologies and networking protocols presented in Sects. 5 and 6, and aims to transport the huge amount of sensed data from the hardware part to the top of the architectural pyramid in a sustainable way.
4.3 The Application Layer
The top layer in the architectural pyramid constructs an interface between the IoT devices, the server, and the end-users. This layer is responsible for data processing, clustering, and analysis, to offer solutions and predictions by the use of AI. At this level, users are provided with several services, which make it an extendable layer to include other middle layers such as a MAC layer, a monitoring layer, and a business layer.
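The upward data flow through the three layers described above can be sketched as a chain of three functions. This is only an illustrative toy: the function names, the fixed sensor values, and the `link` label are assumptions made for the example, and a real system would replace each function with hardware drivers, a network stack, and analytics services respectively.

```python
from statistics import mean


def physical_layer() -> list:
    """Physical layer: sensors produce raw readings (fixed values here)."""
    return [
        {"sensor": "temperature", "value": 34.1},
        {"sensor": "temperature", "value": 35.3},
        {"sensor": "humidity", "value": 61.0},
    ]


def network_layer(readings: list) -> list:
    """Network layer: transport readings upward; modelled here as tagging
    each message with the link it travelled over (an assumed label)."""
    return [dict(reading, link="lorawan") for reading in readings]


def application_layer(messages: list) -> dict:
    """Application layer: aggregate values per sensor type for the end user."""
    grouped = {}
    for message in messages:
        grouped.setdefault(message["sensor"], []).append(message["value"])
    return {sensor: mean(values) for sensor, values in grouped.items()}


# Data flows from the bottom of the pyramid to the top.
summary = application_layer(network_layer(physical_layer()))
```

Keeping each layer behind its own function mirrors the extensibility argument in the text: an additional layer can be inserted into the chain without changing the others.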
5 IoT Communication Technologies
Since the IoT has been designed for and applied to almost every field of life, the technologies it relies on have undergone a long evolution to improve the quality and quantity of service, under several communication models such as Edge to Edge, Edge to Gateway, Edge to Cloud, and Backend Data Sharing [32]. As detailed in Table 2, IoT communication standards can be divided into the two major classes below:
Fig. 5. IoT architectural pyramid
5.1 Short Range Communication Technologies
Technologies in this class, such as RFID, BLE, WiFi, and LiFi, are commonly used to share data between sensors and gateways, where communication distances are typically short and most devices are battery-powered, which makes low energy consumption necessary to extend battery lifetime [2,16,23,29,31].
5.2 Long Range Communication Technologies
Generally, long-range communication standards such as LoRa, NB-IoT, SigFox, and 5G are used to transfer data from gateways to the Internet over long distances. Those standards provide users with low power consumption, flexibility, and good service quality [3–5,8,10,17,25]. Among these technologies, LoRa has piqued researchers’ interest since 2015, owing to the dependable qualities outlined in Table 1.
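The lower end of the LoRaWAN data-rate range listed in Table 1 can be related to the nominal LoRa bit-rate formula, Rb = SF × (BW / 2^SF) × CR, where SF is the spreading factor, BW the bandwidth, and CR the coding rate. The sketch below assumes a coding rate of 4/5 and spreading factors 7 to 12; these parameter choices are assumptions made for illustration. The 50 kbps upper bound is commonly associated with the FSK mode rather than with LoRa chirp modulation, so it is not reproduced by this formula.

```python
def lora_bit_rate(sf: int, bandwidth_hz: float, coding_rate_denominator: int = 5) -> float:
    """Nominal LoRa bit rate in bit/s: SF * (BW / 2**SF) * CR, with CR = 4 / denominator."""
    if not 7 <= sf <= 12:
        raise ValueError("spreading factor is assumed to lie in 7..12")
    if coding_rate_denominator not in (5, 6, 7, 8):
        raise ValueError("coding rate is assumed to be 4/5, 4/6, 4/7 or 4/8")
    coding_rate = 4 / coding_rate_denominator
    return sf * (bandwidth_hz / 2 ** sf) * coding_rate


slowest = lora_bit_rate(12, 125_000)  # SF12 at 125 kHz: roughly 293 bit/s
fastest = lora_bit_rate(7, 250_000)   # SF7 at 250 kHz: roughly 10.9 kbit/s
```

The formula makes the range versus rate trade-off explicit: each increment of SF roughly halves the bit rate in exchange for a more robust, longer-range link.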
Table 1. The main characteristics of LoRaWAN [8, 24].
Characteristic | Value
Topology | Star on star
Allow private networks | Yes
Frequency bands | Unlicensed
Modulation | SS Chirp
Max messages per day | Unlimited
Localisation/mobility | Yes
Encryption of sent messages | AES 128 bit
Interference immunity | Very high
Data rate | 290 bps–50 kbps
Packet size | 154 dB
Payload length | 243 bytes
Bandwidth | 250 kHz/125 kHz
Battery lifetime | 8–10 years
Power efficiency | Very high
Security/Authentication | Yes (32 bits)
Scalability | Yes
Standardisation | LoRa Alliance
6 IoT Protocols
The choice of protocols depends on the technologies used and on the design of the IoT ecosystem. We can thus distinguish two types of protocols: low-power wireless protocols and high-power communication protocols. The differences in their features give rise to benefits as well as drawbacks, as outlined in Table 3.
Table 2. The qualities of the most popular short- and long-range IoT communication technologies as we deduced.
Technology | Type | Ref. | Applications | Communication distance | Power consumption | Advantages | Weaknesses
RFID | Short range | [2] | Production monitoring, supply chain management, livestock tracking | 10 cm (LF); 10 cm–1 m (HF); 100 m (active) | Low | Unique ID, waterproof, fast data rate, high mobility, time saving, use of low, high, and ultrahigh frequencies | Need of contact between tag and reader; cannot directly or indirectly interact with the Internet
BLE | Short range | [10, 28] | Smart home, smart energy, smart shopping | 5–10 m | Very low | High data rate, improves energy consumption, robust to interference and multi-path fading | Restricted communication
WiFi | Short range | [10, 28] | Parking metering, autonomous lighting, smart security, smart farming, smart home thermostats, etc. | 100 m/outdoors; 20 m/indoors | High | Variety of WiFi classes for use, provides high-throughput connectivity, speeds of over 1 Gbit/s, provides direct connection to the Internet | High cost, increased energy consumption
LiFi | Short range | [16, 28] | Smart home, Industry 4.0, virtual and augmented reality, car-to-car communication, security and defense | 10 m | Low | Greater mobility, multiuser access, fast data rate (3 Gb/s), good energy efficiency, compatibility with radio technologies | Very limited range, limited by obstacles such as furniture or walls, natural interference from sunlight and other light sources
LoRa | Long range | [4, 8, 10, 25] | Health care, smart city, smart farming, smart grid, environment monitoring, smart homes, connected vehicles, surveillance, industrial applications | 45 km/rural; 15 km/suburban; 2–5 km/urban | Low | Strong immunity, stable operation, low energy and cost-effective, scalable, optimized battery lifetime (up to 8 years), high sensitivity | Low transmission rate
SigFox | Long range | [8, 28] | Smart home, healthcare, smart farming, intelligent transportation, smart retail, Industry 4.0, smart city, environment monitoring | 30–50 km/rural; 3–10 km/urban | Low | Very high power efficiency, flexible, reduces the energy consumption and increases the receiver’s sensitivity | Low interference immunity, no mobility, low data rate (only 140 messages/day)
NB-IoT | Long range | [5, 8] | Smart home, city, retail, agriculture, industry, environment, healthcare | 20–40 km/rural; 1.5 km/urban | Low | Very high power efficiency, very flexible and scalable, low-powered and cost-effective, eliminates the need for gateways | No security/authentication, low interference immunity
5G | Long range | / | Smart farming, smart power, water grid, factory automation, vehicle-to-infrastructure, high-speed motion, vehicle-to-vehicle, and process control systems | 1000 ft | High | Large coverage, good service quality, high data rate, short latency time, Short Messaging Service (SMS), greater mobility, low consumption allowing operation for 10 years on a battery, supports indoor, rural, urban, and suburban scenarios | Medium interference immunity, affected by weather factors like wind and rain
Table 3. Features, pros, and cons for the most commonly used IoT protocols as we deduced.
Protocols | Features & benefits | Drawbacks
MQTT [13] | Based on the publish/subscribe model, uses the TCP protocol at the transport level, simple, easy to use, small header size, low energy, high data rate, suitable for wireless communications | MQTT client must support TCP and hold a connection open to the broker at all times, topic names are often long, uses a little CPU and memory
CoAP [1] | One of the most common application-layer messaging protocols used by IoT devices, based on the client/server model, similar to HTTP, supports request/response functions, small packets that are simple to generate, can be analyzed in place with no need for extra memory | Reduced reliability due to the use of UDP
IPv6 [27] | A large-scale addressing protocol with simple headers, capable of accommodating a large number of IoT devices, with adequate features for the IoT paradigm such as scalability, mobility, greater connection integrity and performance, high security, and auto-configuration | IPv4 BROADCAST disappears in IPv6
6LowPAN [17] | IPv6 Low power Wireless Personal Area Networks; an “adaptation” layer for connecting IPv6 and IEEE 802.15.4 networks that do not allow Internet frames. It can turn the 802.15.4 local network into a mini-local Internet with subscribers that can be addressed from a remote client via a proxy. Routing, in particular, must be assured inside each network as well as between the local network and the IPv6 Internet | Must deal with issues such as header compression and fragmentation methods in order to keep 802.15.4 protocol efficiency over 50%
LoRaWAN [10] | Network and end devices are expected to use low power, requires a robust security mechanism | Deploying networks in dense urban areas leads to radio interference in identical or adjacent channels
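The publish/subscribe model listed for MQTT in Table 3 can be illustrated with a small in-memory sketch. It is not an MQTT client or broker: there is no TCP connection, no QoS, and no session handling, and the class and function names are hypothetical. Only the topic-filter wildcards `+` (one level) and `#` (all remaining levels) follow the MQTT convention.

```python
from collections import defaultdict
from typing import Callable


def topic_matches(topic_filter: str, topic: str) -> bool:
    """MQTT-style matching: '+' matches one level, '#' matches the remainder."""
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")
    for i, level in enumerate(filter_levels):
        if level == "#":
            return True  # multi-level wildcard also covers the parent level
        if i >= len(topic_levels):
            return False
        if level != "+" and level != topic_levels[i]:
            return False
    return len(filter_levels) == len(topic_levels)


class TinyBroker:
    """In-memory stand-in for a broker; no networking, TCP, or QoS."""

    def __init__(self) -> None:
        self._subscriptions = defaultdict(list)

    def subscribe(self, topic_filter: str, callback: Callable[[str, str], None]) -> None:
        """Register a callback for every topic matching the filter."""
        self._subscriptions[topic_filter].append(callback)

    def publish(self, topic: str, payload: str) -> int:
        """Deliver the payload to matching subscribers; return the delivery count."""
        delivered = 0
        for topic_filter, callbacks in self._subscriptions.items():
            if topic_matches(topic_filter, topic):
                for callback in callbacks:
                    callback(topic, payload)
                    delivered += 1
        return delivered
```

The publisher never addresses a subscriber directly, which is what decouples sensors from consumers; the long topic names noted as a drawback in Table 3 are the price of this hierarchical addressing.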
7 Conclusion
This article presents an overview of the IoT ecosystem; its main objective is to shed light on the technologies underlying an Internet of Things based system. First, we presented some conclusions from the state of the art. Then, we examined the different communication technologies that can be used in IoT systems and the ways in which they can be integrated into a single ecosystem. We presented physical communication technologies by individually analyzing their types, components, and functionalities, and highlighted some issues for the purpose of finding solutions or alternatives. We posed a list of key questions that may guide future research on present or past projects. Consequently, the present study showed that:
– The 3-layer architecture is the basis of the architectures proposed in the literature for IoT solutions.
– The main issue facing IoT gateway systems is the variety of protocols and sensing devices within the WSN; IoT gateways are grouped into two categories: gateways without sensor devices and gateways with sensor devices onboard.
– Communication technologies can be classified into two main categories: long-range and short-range technologies.
– Each technology has its advantages and disadvantages; the choice of technology depends on the specifics of the field of application. For example, in agriculture the 3-layer architecture can be used, and LoRa is the most suitable technology for applications dedicated to smart agriculture. LoRa offers the possibility of deploying private networks, which is particularly useful for farms that do not always have network coverage.
This study encourages us to carry out further research by specifying the field of application and choosing the fundamentals of the system based on the recommendations resulting from this work.
Acknowledgements. The authors would like to thank the Directorate General for Scientific Research and Technological Development (DGRSDT), under the authority of the Algerian Minister of Scientific Research (MESRS), for the financial support of the project leading to this publication.
References
1. Agyemang, J.O., Kponyo, J.J., Gadze, J.D., Nunoo-Mensah, H., Yu, D.: A lightweight messaging protocol for internet of things devices. Technologies 10(1), 21 (2022)
2. Ahmad, L., Nabi, F.: Agriculture 5.0: Artificial Intelligence, IoT and Machine Learning. CRC Press, Boca Raton (2021)
3. Akpakwu, G.A., Silva, B.J., Hancke, G.P., Abu-Mahfouz, A.M.: A survey on 5G networks for the internet of things: Communication technologies and challenges. IEEE Access 6, 3619–3647 (2017)
4. Azzedin, F., Ghaleb, M.: Internet-of-things and information fusion: Trust perspective survey. Sensors 19(8), 1929 (2019)
5. Balaji, S., Nathani, K., Santhakumar, R.: IoT technology, applications and challenges: a contemporary survey. Wireless Pers. Commun. 108(1), 363–388 (2019). https://doi.org/10.1007/s11277-019-06407-w
6. Banu, N.M., Sujatha, C.: IoT architecture a comparative study. Int. J. Pur. Appl. Math. 117(8), 45–49 (2017)
7. Brauer, J.R.: Magnetic Actuators and Sensors. Wiley, Hoboken (2006)
8. de Carvalho Silva, J., Rodrigues, J.J., Alberti, A.M., Solic, P., Aquino, A.L.: LoRaWAN - a low power wan protocol for internet of things: a review and opportunities. In: 2017 2nd International Multidisciplinary Conference on Computer and Energy Science (SpliTech), pp. 1–6. IEEE (2017)
9. El-Basioni, B.M.M., Abd El-Kader, S.M.: Laying the foundations for an IoT reference architecture for agricultural application domain. IEEE Access 8, 190194–190230 (2020)
10. Ertürk, M.A., Aydın, M.A., Büyükakkaşlar, M.T., Evirgen, H.: A survey on LoRaWAN architecture, protocol and technologies. Future Internet 11(10), 216 (2019)
11. Farooq, M.S., Riaz, S., Abid, A., Abid, K., Naeem, M.A.: A survey on the role of IoT in agriculture for the implementation of smart farming. IEEE Access 7, 156237–156271 (2019)
12. Gil-Lebrero, S., Quiles-Latorre, F.J., Ortiz-López, M., Sánchez-Ruiz, V., Gámiz-López, V., Luna-Rodríguez, J.J.: Honey bee colonies remote monitoring system. Sensors 17(1), 55 (2017)
13. Glaroudis, D., Iossifides, A., Chatzimisios, P.: Survey, comparison and research challenges of IoT application protocols for smart farming. Comput. Netw. 168, 107037 (2020)
14. Gocheva, M., Kuneva, V., Gochev, G.: The internet of things in agriculture-the advantages and opportunities, 53–63 (2020)
15. Gubbi, J., Buyya, R., Marusic, S., Palaniswami, M.: Internet of things (IOT): a vision, architectural elements, and future directions. Futur. Gener. Comput. Syst. 29(7), 1645–1660 (2013)
16. Haas, H., Yin, L., Wang, Y., Chen, C.: What is lifi? J. Lightwave Technol. 34(6), 1533–1544 (2015)
17. Hauet, J.P.: L’internet des objets deux technologies clés : les reseaux de communication et les protocoles. Revue de l’Électricité et de l’Électronique (2016)
18. Kontogiannis, S.: An internet of things-based low-power integrated beekeeping safety and conditions monitoring system. Inventions 4(3), 52 (2019)
19. Lee, S.K., Bae, M., Kim, H.: Future of IoT networks: a survey. Appl. Sci. 7(10), 1072 (2017)
20. Li, S., Xu, L.D., Zhao, S.: The internet of things: a survey. Inf. Syst. Front. 17(2), 243–259 (2014). https://doi.org/10.1007/s10796-014-9492-7
21. Lynn, T., Mooney, J.G., Lee, B., Endo, P.T.: The cloud-to-thing continuum: opportunities and challenges in cloud, fog and edge computing (2020)
22. Marais, J.M., Malekian, R., Abu-Mahfouz, A.M.: Evaluating the LoRaWAN protocol using a permanent outdoor testbed. IEEE Sens. J. 19(12), 4726–4733 (2019)
23. Aqeel-ur-Rehman, Mehmood, K., Baksh, A.: Communication Technology That Suits IoT - A Critical Review. In: Shaikh, F.K., Chowdhry, B.S., Ammari, H.M., Uqaili, M.A., Shah, A. (eds.) Wireless Sensor Networks for Developing Countries. Communications in Computer and Information Science, vol. 366, pp. 14–25. Springer, Berlin, Heidelberg (2013). https://doi.org/10.1007/978-3-642-41054-3_2
24. Miles, B., Bourennane, E.B., Boucherkha, S., Chikhi, S.: A study of LoRaWAN protocol performance for IoT applications in smart agriculture. Comput. Commun. 164, 148–157 (2020)
25. Ravidas, S., Lekidis, A., Paci, F., Zannone, N.: Access control in internet-of-things: a survey. J. Netw. Comput. Appl. 144, 79–101 (2019)
26. Saqib, S., Ahmad, F.: An integrative decision support model for smart agriculture based on internet of things and machine learning (2021)
27. Sobin, C.: A survey on architecture, protocols and challenges in IoT. Wireless Pers. Commun. 112(3), 1383–1429 (2020)
28. Swamy, S.N., Kota, S.R.: An empirical study on system level aspects of internet of things (IoT). IEEE Access 8, 188082–188134 (2020)
29. Whitmore, A., Agarwal, A., Da Xu, L.: The internet of things—a survey of topics and trends. Inf. Syst. Front. 17(2), 261–274 (2014). https://doi.org/10.1007/s10796-014-9489-2
30. Xia, F., Yang, L.T., Wang, L., Vinel, A.: Internet of things. Int. J. Commun Syst 25(9), 1101 (2012)
31. Xu, J., Gu, B., Tian, G.: Review of agricultural IoT technology. Artif. Intell. Agric. (2022)
32. Yu, W., et al.: A survey on the edge computing for the internet of things. IEEE Access 6, 6900–6919 (2017)
33. Yusof, Z.M., Billah, M.M., Kadir, K., Ali, A.M.M., Ahmad, I.: Improvement of honey production: a smart honey bee health monitoring system. In: 2019 IEEE International Conference on Smart Instrumentation, Measurement and Application (ICSIMA), pp. 1–5. IEEE (2019)
A Volunteered Simulation Environment Applied to 2D-NCCA Enumeration
Nabil Kadache and Rachid Seghir
LaSTIC Laboratory, University of Batna 2, Batna, Algeria {nabil.kadache,r.seghir}@univ-batna2.dz
Abstract. In this paper we introduce the concept of Volunteer Simulation (VS), which uses the volunteer computing technique to perform simulations that require exorbitant computing time and very expensive resources. First, we look at where this concept can be applied effectively. We then introduce a simulation environment based on volunteer computing (VolSIM). We also introduce and examine four scheduling policies in VolSIM to optimize the use of the participants’ valuable and volatile resources. Finally, the environment is validated by a simulation of cellular automata, in particular on the problem of enumerating number-conserving rules, together with an evaluation of the performance of the four scheduling algorithms introduced.
Keywords: Volunteer computing · Distributed simulation · Cellular automata · Scheduling policies
1 Introduction
One of the main objectives of distributed simulation (DS) is to use parallel and distributed programming capabilities over multiple computing nodes to speed up the execution of simulation programs. In some cases, the goal is simply to run multiple experiments or scenarios of the same simulation in parallel. Several authors have identified the arguments for using DS [3], [7]:
– Execution time: A large simulation can use DS to split its components across multiple computing nodes to exploit parallelism and speed up execution. DS can also speed up a simulation study simply by running multiple experiments or scenarios on different nodes.
– Composition of simulation models and reuse: Reusing existing simulations as part of a new DS is very suitable when the cost of development is significant. In addition, it is better to tie two simulations together than to merge them and face all the software engineering problems that accompany development from scratch.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023. S. Chikhi et al. (Eds.): MISC 2022, LNNS 593, pp. 91–104, 2023. https://doi.org/10.1007/978-3-031-18516-8_7
N. Kadache and R. Seghir
– Owner maintenance: DS allows developers to maintain and/or update their individual simulations independently of each other.
– Confidentiality: DS preserves the technological secrets of the individually owned simulations that compose it (such as the internal functioning of a factory or a military system) and allows only interactions between them. In addition, individual simulations exchange only the relevant data and keep the rest in a “black box”.
– Interoperability: DS allows various simulations to exchange their data; for example, simulations of cars, pedestrians, and buses can be combined to obtain a simulation of urban traffic.
Volunteer computing (VC) is a “free-cost” computation technique in which volunteers voluntarily offer their computation resources (individual machines and clusters, for example) to perform tasks that are part of projects [6]. The best-known VC environment is surely BOINC [1]. The essential features that any VC system should ensure [8] are, among others:
1. Effective distribution of the tasks to the volunteers,
2. Robustness, i.e. detecting failing volunteers and resuming the aborted tasks,
3. Trust in the middleware,
4. Security.
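The first two features in the list above, distributing tasks and resuming tasks aborted by failing volunteers, can be sketched with a simple queue that puts a task back when its volunteer fails. This is a generic illustration and not the scheduling policy of VolSIM or BOINC: the round-robin assignment, the `max_attempts` limit, and the modelling of volunteers as local Python callables are all assumptions made for the example.

```python
from collections import deque


def run_with_retries(tasks, volunteers, max_attempts=3):
    """Assign tasks round-robin; requeue a task when its volunteer fails.

    `volunteers` is a list of callables (hypothetical stand-ins for remote
    machines) that either return a result or raise an exception.
    """
    queue = deque((task, 1) for task in tasks)
    results, abandoned = {}, []
    turn = 0
    while queue:
        task, attempt = queue.popleft()
        volunteer = volunteers[turn % len(volunteers)]
        turn += 1
        try:
            results[task] = volunteer(task)
        except Exception:
            # Robustness: the aborted task is resumed later by another turn.
            if attempt < max_attempts:
                queue.append((task, attempt + 1))
            else:
                abandoned.append(task)
    return results, abandoned


def reliable(task):
    """A volunteer that always completes its task."""
    return task * task


def flaky(task):
    """A volunteer that always disconnects before finishing."""
    raise RuntimeError("volunteer disconnected")


results, abandoned = run_with_retries([1, 2, 3], [flaky, reliable])
```

Bounding the number of attempts keeps the loop from running forever when every volunteer fails; a real middleware would typically rely on timeouts and result validation instead of exceptions.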
Current VC environments raise many challenges, such as the growing number of participants, the effectiveness of the task scheduling policy in use and the prevention of malicious attacks. Nevertheless, volunteer computing may play a role in the future with the popularization of blockchain and IoT technologies.
A cellular automaton (CA) is a long-established mathematical tool for modeling and simulating discrete-time dynamic systems. CA simulation is used in a variety of disciplines involving complex systems. Among the types of cellular automata, we focus on number-conserving cellular automata (NCCA): CA which, during their temporal evolution, preserve some measure of the state of all their cells. Conservation is a recurring universal law in several fields, such as the conservation of mass-energy or electric charge in physics and the conservation of atoms in chemical reactions. It is interesting, or even essential, to design or find CA with this conservation property, but the search can be very long and consume colossal computing resources.
This work is an attempt to run such demanding CA simulations in a VC environment. We first present VolSIM, a web-based VC environment for running simulations, and then validate it with a CA simulation. Related work is presented in Sect. 2, followed by the computational model in Sect. 3.1. The server and client sides of VolSIM are presented in Sects. 3.2 and 3.3. The scheduling policies used in our environment are detailed in Sect. 4. A CA simulation project is presented in Sect. 5, together with an evaluation of the scheduling policies. Finally, we conclude with future research directions.
A Volunteered Simulation Environment Applied to 2D-NCCA Enumeration
2 Related Work
According to [9], there are three modes of distributed simulation. In the first mode, shown in Fig. 1(a), a simulation that would otherwise run on a single node is divided into several parts or sub-models that run on different nodes and interact through a communication network. In the second mode, illustrated in Fig. 1(b), several simulations running on different nodes are allowed to interact with one another. This mode allows the reuse of existing simulations and saves the effort of developing new ones. In the third mode, shown in Fig. 1(c), a set of experiments and/or scenarios of the same simulation, which would otherwise be executed sequentially on a single node, is performed by several nodes. Our proposed volunteer-computing-based simulation environment (VolSIM) supports applications of this last category of distributed simulation.
(a) The same model is distributed to speed up the simulation.
(b) Several models can interact.
(c) Several experiments can be performed on different nodes.
Fig. 1. Modes of Distributed Simulation [9]
The most widely used volunteer computing platform is BOINC [1]. It consists of an infrastructure (servers) that runs the BOINC server software and can host many projects, which are divided into several small work units or tasks. In addition to several data and database servers, the BOINC platform consists of a set of components: scheduler, validator, purger, transitioner and other statistical utility modules. Volunteers must download and install on their machines a BOINC middleware which handles interaction with the BOINC server (downloading work units, reporting progress and sending results). Figure 2 shows the main components of the BOINC architecture.
The BOINC client needs to communicate with the server only to request tasks or to send results; the tasks are performed independently on the volunteer's machine. BOINC associates a deadline with each task; if the client returns neither a result nor an error before the deadline, the task is assumed to be aborted and is restarted by another volunteer. In addition, any task result must be validated by at least two volunteers to ensure the validity of the performed computations.
Another web-based VC system, which uses recent browser features including Web Workers, was presented in [4]. The authors tested their environment with an application that analyzes 2665 Wikipedia articles using volunteers' resources, subdividing the analysis into 64, 128, 256 and 512 tasks distributed to volunteers. The results reported by the authors illustrate the benefits of the volunteer computing paradigm in terms of speedup (e.g. 27 for 32 volunteers). The advantage of the system is that it requires only a browser on the volunteer's machine and no middleware to download, which volunteers appreciate more, especially from the security point of view. Another work, which uses social networks to increase volunteer participation in VC projects, is reported in [5].
We have implemented a volunteer computing project that searches for number-conserving rules in cellular automata (CA). The conservation property in CA has been studied in detail for one-dimensional CA [2]. For higher dimensions, the body of work is smaller because of the theoretical and computational complexity. To this end, we attempt in this work to use the VC concept to enumerate number-conserving rules in the complex case of two-dimensional CA.
Fig. 2. BOINC Architecture.
3 VolSIM Environment
3.1 Computational Model
We consider the volunteer computing model commonly used in VC systems [8], where a project is subdivided into a set of tasks which must be large enough to take better advantage of the volunteers' resources. Each task can be thought of as a function that operates on input data and produces a result.
$$\mathit{Project} = \{\mathit{Task}_i \mid i = 1, \dots, n\} \qquad (1)$$
In our case, a VC project is a single simulation, and the tasks to be performed by the volunteers correspond to the simulation experiments. Scheduling is performed periodically in successive rounds. In each round, a set of experiments is generated and assigned to all available resources (connected volunteers), as shown in Fig. 3. Two values are associated with each experiment: the first is an approximation of its execution time, which depends on the nature of the simulation, and the second is a deadline beyond which the participant running the experiment is considered faulty and the experiment is reassigned to another volunteer.
Fig. 3. Computational model for volunteered simulation.
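The bookkeeping implied by this model (an estimated execution time and a deadline per experiment, with reassignment once the deadline has passed) can be sketched as follows. This is only an illustrative sketch: the class `Experiment`, its field names and the helper functions are hypothetical and are not part of VolSIM.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Experiment:
    """One simulation experiment, i.e. one task of the project (hypothetical type)."""
    exp_id: int
    estimated_time: float             # approximate execution time, in seconds
    deadline: float                   # maximum allowed running time, in seconds
    volunteer: Optional[str] = None   # volunteer currently in charge, if any
    started_at: Optional[float] = None
    done: bool = False


def assign(exp: Experiment, volunteer: str, now: float) -> None:
    """Record that `volunteer` starts running `exp` at time `now`."""
    exp.volunteer = volunteer
    exp.started_at = now


def overdue(experiments: List[Experiment], now: float) -> List[Experiment]:
    """Return assigned, unfinished experiments whose deadline has passed.

    These are the experiments that would be reassigned to another volunteer.
    """
    return [e for e in experiments
            if e.volunteer is not None and not e.done
            and now - e.started_at > e.deadline]
```

In a scheduling round, the experiments returned by `overdue` would simply be handed to `assign` again with a different volunteer.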
3.2 Server Side
VolSIM is a web-based application that provides the essential features of any volunteer computing system, such as scheduling and validation, as illustrated in Fig. 4. The added visualization module is very important in the simulation field. Simulation data and task information are stored separately in two different databases. The WebSocket protocol, characterized by its efficiency and speed, is used for the interaction between VolSIM and the participants. The task scheduling policies are discussed in Sect. 4.
3.3 Client Side
The client consists of a main web page, which is displayed in the volunteer's browser when the volunteer requests the server URL. The front-end component is then loaded automatically; this component is the execution and communication container of the simulation task. The communication scheme between client, server and other volunteers is shown in Fig. 5.
Fig. 4. VolSIM Components.
Fig. 5. VolSIM-Volunteer communication scheme.
4 Scheduling Simulation Tasks in VolSIM
The performance of volunteer computing systems depends closely on the task scheduling policy. An effective scheduling policy derives maximum benefit from the valuable time offered by volunteers. Two classes of policies are used in VC [8]: (i) blind (or naive) policies such as FCFS, buffered and random; (ii) knowledge-based policies. The evident drawback of blind scheduling policies is that complex tasks may be assigned to participants with low or unreliable computational capabilities; however, they are still in use because of their simplicity and very short response time. Knowledge-based (KB) policies take into consideration the volunteers' performance and the tasks' complexity. In order to evaluate which policy to choose for large simulation experiments, we first define the FCFS policy, detailed in Algorithm 1.
Algorithm 1. FCFS
Require:
  QR: a queue of available resources
  SC: simulation code
  QC: a queue of initial configurations
  Status status: a variable which holds the status of the simulation execution by the volunteer
  Result result: a variable which holds the result of the simulation
Ensure: assignment of simulation experiments, updating QR
  for (each new resource rk) do
    send SC to rk
    put(rk, QR)
  end for
  while (QR is not empty and QC is not empty) do
    take rm from QR
    take cn from QC
    assign(rm, cn)
    callback ReceiveResults(rm, result, status)
  end while

Procedure ReceiveResults(Resource ri, Result res, Status st)
  if (st != error) then
    check(res) // check the validity of the results; to be specified in the VC project
    put(ri, QR)
  end if
End Procedure
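To make the FCFS policy concrete, the following minimal Python sketch mirrors the two parts of Algorithm 1: pairing queued resources with queued configurations, and re-queuing a resource once it reports a valid result. The function names, the string status `"error"` and the `check` callback are assumptions made for illustration; sending the simulation code and the network layer are left out.

```python
from collections import deque


def fcfs_round(resources: deque, configurations: deque) -> list:
    """Pair waiting volunteers with waiting configurations in arrival order."""
    assignments = []
    while resources and configurations:
        volunteer = resources.popleft()            # take rm from QR
        configuration = configurations.popleft()   # take cn from QC
        assignments.append((volunteer, configuration))
    return assignments


def receive_results(resources: deque, volunteer, result, status, check) -> bool:
    """Handle a volunteer's answer; re-queue the volunteer unless it failed."""
    if status == "error":
        return False               # a failed volunteer is not put back in QR
    check(result)                  # project-specific validation (assumed callback)
    resources.append(volunteer)    # put(ri, QR)
    return True
```

A volunteer that reports an error is simply not re-queued here, which matches the `if (st != error)` guard of the procedure.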
We also define two knowledge-based algorithms called KB max and KB min. Both algorithms require prior knowledge of the computing capabilities of the resources, for example based on a ranking of processor and available RAM. Each resource is thus weighted by a value that represents its capacity and makes it comparable to the others. Tasks, which correspond to several grouped experiments, are likewise weighted by the number of experiments to be performed, which represents the complexity of the task. KB max assigns tasks to volunteers in descending order of task and resource weights, while KB min assigns them in ascending order. The two algorithms seem equivalent, but between two scheduling rounds KB min favors small tasks, possibly leaving large resources unallocated so that they can be assigned to heavier tasks in the next round, and vice versa for KB max. The behavior of the two techniques can only be assessed experimentally.
Algorithm 2. KB max
Require:
  R: a set of resources
  SC: simulation code
  C: a set of weighted experiments
  Status status: a variable which holds the status of the simulation execution by the volunteer
  Result result: a variable which holds the result of the simulation
Ensure: assignment of simulation experiments, updating R
  for (each new resource rk in R) do
    send SC to rk
    put(rk, R)
  end for
  while (R is not empty and C is not empty) do
    take rmax from R, R = R − rmax
    take cmax from C, C = C − cmax
    assign(cmax, rmax)
    callback ReceiveResults(rmax, result, status)
  end while

Procedure ReceiveResults(Resource ri, Result res, Status st)
  if (st != error) then
    check(res) // check the validity of the results; to be specified in the VC project
    update(ri) // update the resource's weight: wi = ti / ci
  end if
End Procedure
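The pairing step shared by the two knowledge-based policies can be sketched in a few lines of Python: both sort tasks and resources by weight and pair them rank by rank, in descending order for KB max and in ascending order for KB min. The function `kb_round`, its dictionary arguments and the `largest_first` flag are hypothetical names; the weight update after each result is not modeled here.

```python
def kb_round(resource_weights: dict, task_weights: dict,
             largest_first: bool = True) -> list:
    """Pair tasks with resources by rank of weight.

    largest_first=True  -> KB max (heaviest task to strongest resource)
    largest_first=False -> KB min (lightest task to weakest resource)
    """
    resources = sorted(resource_weights, key=resource_weights.get,
                       reverse=largest_first)
    tasks = sorted(task_weights, key=task_weights.get, reverse=largest_first)
    # zip stops at the shorter list: leftovers wait for the next round
    return list(zip(tasks, resources))


# Illustrative weights only (not measured values)
volunteers = {"v1": 1.0, "v2": 3.0}
tasks = {"t1": 5, "t2": 9, "t3": 2}
kb_max_plan = kb_round(volunteers, tasks, largest_first=True)
kb_min_plan = kb_round(volunteers, tasks, largest_first=False)
```

With these illustrative weights, KB max leaves the smallest task `t3` for the next round while KB min leaves the largest task `t2`, which is the difference between the two policies discussed in the text.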
We also examine a fourth algorithm, proposed in [6], which is a dynamic variant of KB-type algorithms called TASP, where the authors take into account the momentary computational capacity of resources during the execution of tasks instead of their hardware characteristics. Algorithm 4 is a slight adaptation of TASP to the VolSIM environment.
Algorithm 3. KB min
Require:
  R: a set of resources
  SC: simulation code
  C: a set of weighted experiments
  Status status: a variable which holds the status of the simulation execution by the volunteer
  Result result: a variable which holds the result of the simulation
Ensure: assignment of simulation experiments, updating R
  for (each new resource rk in R) do
    send SC to rk
    put(rk, R)
  end for
  while (R is not empty and C is not empty) do
    take rmin from R, R = R − rmin
    take cmin from C, C = C − cmin
    assign(cmin, rmin)
    callback ReceiveResults(rmin, result, status)
  end while

Procedure ReceiveResults(Resource ri, Result res, Status st)
  if (st != error) then
    check(res) // check the validity of the results; to be specified in the VC project
    update(ri) // update the resource's weight: wi = ti / ci
  end if
End Procedure
5 CA Simulation on VolSIM
Cellular automata (CA), introduced by J. von Neumann and Ulam for the purpose of modeling self-reproducing systems, are dynamic systems which evolve discretely in space and time. The concept of CA quickly gained an important place as a computational and simulation model used in various fields covering complex systems with discrete state evolution. We briefly introduce the mathematical formalism below. A cellular automaton (CA) is a 4-tuple $(Q, d, V, \delta)$ where:
– $Q$ is the set of states,
– $d \in \mathbb{N}^*$ is the dimension,
– $V = \{v_i \mid i \in [1, |V|]\}$ is a finite set of vectors of $\mathbb{Z}^d$, called the neighborhood,
– $\delta : Q^{|V|} \to Q$ is the transition rule.
The set of cells $U = \mathbb{Z}^d$ is generally taken to be finite. A configuration $C$ is an element of the set $Q^{\mathbb{Z}^d}$. The cellular automaton evolves the configuration $C_t$ over time according to the global transition function $F$, which corresponds to the application of the local function $\delta$ to all cells of $C_t$. This evolution can be written $C_{t+1} = F(C_t)$.
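As an illustration of the formalism, the global transition function F can be sketched in Python for d = 2 on a finite grid. The periodic (wrap-around) boundary, the list-of-lists representation and the names `global_step` and `VON_NEUMANN` are assumptions made for this sketch only; the paper does not specify how VolSIM represents configurations.

```python
from typing import Callable, List, Sequence, Tuple

Grid = List[List[int]]

# Von Neumann neighborhood as (row, column) offsets; the ordering is arbitrary here
VON_NEUMANN: List[Tuple[int, int]] = [(-1, 0), (0, -1), (0, 0), (0, 1), (1, 0)]


def global_step(config: Grid,
                neighborhood: Sequence[Tuple[int, int]],
                delta: Callable[[Tuple[int, ...]], int]) -> Grid:
    """Apply the local rule delta to every cell (global function F).

    The grid is finite with periodic boundaries (an assumption of this sketch).
    """
    rows, cols = len(config), len(config[0])
    return [
        [delta(tuple(config[(i + di) % rows][(j + dj) % cols]
                     for di, dj in neighborhood))
         for j in range(cols)]
        for i in range(rows)
    ]


def identity(states: Tuple[int, ...]) -> int:
    """Each cell keeps its own state (offset (0, 0) is at index 2)."""
    return states[2]


def copy_west(states: Tuple[int, ...]) -> int:
    """Each cell takes the state of its left neighbor (offset (0, -1))."""
    return states[1]
```

For example, applying `copy_west` moves every occupied cell one column to the right, while `identity` leaves the configuration unchanged.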
Algorithm 4. TASP-Sched
Require:
  R = {r1, r2, ..., rn}: a set of available resources
  SC: simulation code
Ensure: assignment of simulation experiments, updating R
  for (each new resource rk) do
    send SC to rk
    R = R + rk
  end for
  for (each rj in R) do
    generate a set Se of initial configurations for resource rj according to its weight
    send Se to rj
    CALL ReceiveResults(rj, Results r, Status s)
  end for

Procedure ReceiveResults(Resource ri, Result res, Status st)
  if (st = error) then
    R = R − ri
  else
    check(res) // check the validity of the results; to be specified in the VC project
    update(ri) // update the resource's weight: wi = ti / ci
  end if
End Procedure
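The step "generate a set Se of initial configurations for resource rj according to its weight" is not detailed in Algorithm 4. One plausible reading, sketched below under that assumption, is to split a contiguous block of experiments among the connected volunteers in proportion to their current weights. The function `split_by_weight` and its arguments are hypothetical, and the sketch assumes that a larger weight means a faster volunteer.

```python
def split_by_weight(start: int, count: int, weights: dict) -> dict:
    """Split `count` consecutive experiments, beginning at `start`,
    into one contiguous range per volunteer, proportional to its weight.

    Assumption of this sketch: a larger weight means more capacity.
    The last volunteer receives the rounding remainder.
    """
    total = sum(weights.values())
    shares = {}
    cursor = start
    names = list(weights)
    for position, name in enumerate(names):
        if position == len(names) - 1:
            size = start + count - cursor          # whatever is left
        else:
            size = int(count * weights[name] / total)
        shares[name] = range(cursor, cursor + size)
        cursor += size
    return shares


# Illustrative weights only: v2 is assumed three times as fast as v1
round_plan = split_by_weight(0, 100, {"v1": 1, "v2": 3})
```

After each round, the weights would be refreshed from the observed execution times before the next call, which is what makes the policy dynamic.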
Formally, if we denote by $C^{Q_i}$ the number of cells in state $Q_i$ in the configuration $C$, a CA is number-conserving if:
$$\forall C \in Q^{\mathbb{Z}^d},\ \forall t \ge 0,\ \forall Q_i \in Q:\quad C_t^{Q_i} = C_{t+1}^{Q_i}$$
Without loss of generality, let us consider the case of binary-state automata, $Q = \{0, 1\}$, and the two most commonly used neighborhoods, Von Neumann and Moore, shown in Fig. 6. We used the Wolfram notation to encode the local transition rules [10]: there are $2^{32}$ possible rules for the Von Neumann neighborhood and $2^{512}$ rules for the Moore neighborhood. We implemented two simulation projects in VolSIM whose goal is to find the number-conserving rules in both cases.
(a) 2D Von Neumann neighborhood
(b) 2D Moore neighborhood
Fig. 6. Neighborhood types in CA
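The rule counts follow from the neighborhood sizes: a binary rule over 5 cells is a table with 2^5 = 32 entries, hence 2^32 rules, and over 9 cells a table with 2^9 = 512 entries, hence 2^512 rules. The sketch below decodes a 32-bit rule number and tests number conservation by brute force on a small periodic grid. It is not the condition of [11] used in the paper: passing this test on one grid size is only a necessary condition. The bit ordering (south, east, center, west, north, from most to least significant bit) is an assumption of this sketch.

```python
from itertools import product


def local_rule(rule: int, s: int, e: int, c: int, w: int, n: int) -> int:
    """Look up the new state in the 32-entry table encoded by `rule`.

    Assumed bit order of the table index: south, east, center, west, north.
    """
    index = (s << 4) | (e << 3) | (c << 2) | (w << 1) | n
    return (rule >> index) & 1


def step(grid, rule):
    """One synchronous update on a periodic (toroidal) grid."""
    rows, cols = len(grid), len(grid[0])
    return [[local_rule(rule,
                        grid[(i + 1) % rows][j],    # south
                        grid[i][(j + 1) % cols],    # east
                        grid[i][j],                 # center
                        grid[i][(j - 1) % cols],    # west
                        grid[(i - 1) % rows][j])    # north
             for j in range(cols)] for i in range(rows)]


def conserves_on_torus(rule: int, rows: int = 3, cols: int = 3) -> bool:
    """Check that one step preserves the number of 1s for every configuration."""
    for bits in product((0, 1), repeat=rows * cols):
        grid = [list(bits[r * cols:(r + 1) * cols]) for r in range(rows)]
        if sum(map(sum, step(grid, rule))) != sum(bits):
            return False
    return True
```

Such an exhaustive test is practical only for tiny grids, which is one reason an algebraic condition on the rule table is preferable when scanning the whole rule space.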
5.1 Results
For the simpler project, concerning the Von Neumann neighborhood, we examined the $2^{32}$ rules using a necessary and sufficient condition defined in [11] for a rule to be conservative. The interval of rules is distributed over 6 tasks of varying complexity, as shown in Fig. 7(a). The 9 conservative rules in this case were successfully enumerated by VolSIM and are listed in Table 1.

Table 1. Number-conserving 2D Von Neumann neighborhood rules

Binary code                       Hexa code  Behavior
--------------------------------  ---------  -------------------------------------
11001100110011001100110011001100  CCCC CCCC  left-right shift
10101111101011111010000010100000  AFAF A0A0  bottom-up shift if up-cell is free
11111111111111110000000000000000  FFFF 0000  bottom-up shift
11111010111110100000101000001010  FAFA 0A0A  top-down shift if down-cell is free
11111111000000001111111100000000  FF00 FF00  right-left shift
11110000111100001111000011110000  F0F0 F0F0  the identity rule
10101010101010101010101010101010  AAAA AAAA  top-down shift
11111100000011001111110000001100  FC0C FC0C  left-right shift if right-cell is free
11001111110000001100111111000000  CFC0 CFC0  right-left shift if left-cell is free
We also assessed the presented scheduling policies. For this purpose, four virtual volunteers V1, V2, V3 and V4 were used for the test on an HPC cluster (http://hpc.univ-batna2.dz/). For the first three policies, the search interval is subdivided into six pieces corresponding to the six tasks in Fig. 7(a). For TASP, the algorithm itself creates its own tasks at each round of scheduling. The best time, 182.74 s, is achieved by TASP, which dynamically created 8 tasks in two scheduling rounds (Fig. 7(e)); the sizes of the tasks created are based on the instantaneous computing capacity of the available volunteers. KB Max comes next with 221 s, followed by FCFS with 233 s and finally KB Min with 329 s. The schedules observed during the test are presented in Figs. 7(b), 7(c), 7(d) and 7(f). The three policies FCFS, KB Max and KB Min recorded completion times of the same order. When scaling to a very large number of volunteers and tasks, the behavior of these three policies varies greatly because the tasks to be performed are of arbitrary complexity.
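To put the four reported completion times on a common scale, the short computation below expresses each one relative to the best time. It uses only the figures quoted above; the variable names are illustrative.

```python
# Completion times reported for the test, in seconds
times = {"TASP": 182.74, "KB Max": 221.0, "FCFS": 233.0, "KB Min": 329.0}

best = min(times.values())

# Ratio of each policy's time to the best time, rounded to two decimals
relative = {name: round(t / best, 2) for name, t in times.items()}

for name, ratio in sorted(relative.items(), key=lambda item: item[1]):
    print(f"{name:7s} {ratio:.2f}x")
```

On this single four-volunteer test, KB Max and FCFS take roughly 1.2 to 1.3 times as long as TASP, and KB Min about 1.8 times as long.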
(a) Task complexity
(b) FCFS scheduling
(c) KB Max scheduling
(d) KB Min scheduling
(e) Tasks generated in TASP
(f) TASP scheduling
Fig. 7. Scheduling policies performance comparison.
6 Conclusion and Future Work
In this paper, we proposed a volunteer computing environment for the execution of distributed simulations in which the experiments are distributed among volunteers. We also examined four possible scheduling policies in our environment. Two research projects on conservative rules in cellular automata were implemented, with a comparison of the different scheduling policies. The first project successfully enumerated such rules.
The second project (Moore neighborhood NCCA) is too complex in terms of computation time to be completed. Despite going through hundreds of billions of rules, no conservative rule has yet been identified. In this sense, one of the perspectives of our work is the search for mathematical conditions that reduce the colossal search space ($2^{512}$ rules). Effective exploitation of VolSIM has raised the problem of fault tolerance for tasks that are assigned to volunteers but not executed because of a loss of network connectivity or explicit abandonment by the volunteer. In this regard, a technique for hot restart of these tasks is very desirable, especially for large tasks. Finally, we plan to provide users of our environment with a utility that simplifies the description and implementation of their computation projects independently of the internal details of VolSIM. This will encourage users to adopt volunteer computing as a form of free green computing that significantly decreases energy consumption.
Acknowledgment. We would like to sincerely thank the IT team at the University of Batna 2, who provided us with all the necessary information and facilities for developing the implementation part of our work, and for the precious time that allowed us to run our various long and tedious tests on the UB2-HPC cluster (http://hpc.univ-batna2.dz/).
References
1. Anderson, D.P.: BOINC: a system for public-resource computing and storage. In: Fifth IEEE/ACM International Workshop on Grid Computing, pp. 4–10. Pittsburgh, PA, USA (2004). https://doi.org/10.1109/GRID.2004.14
2. Boccara, N., Fukś, H.: Number-conserving cellular automaton rules. Fundam. Inf. 52(1–3), 1–13 (2002). https://doi.org/10.5555/639405.639407
3. Boer, C.A., de Bruin, A., Verbraeck, A.: A survey on distributed simulation in industry. J. Simul. 3(1), 3–16 (2009). https://doi.org/10.1057/jos.2008.9
4. Chorazyk, P., Godzik, M., Pietak, K., Turek, W., Kisiel-Dorohinicki, M., Byrski, A.: Lightweight volunteer computing platform using web workers. Procedia Comput. Sci. 108, 948–957 (2017). https://doi.org/10.1016/j.procs.2017.05.091, http://www.sciencedirect.com/science/article/pii/S1877050917306348
5. Dinde, S.H., Dixit, A.M.: On sharing infrastructure resources using online social networks. Procedia Comput. Sci. 4, 948–957 (2015). https://doi.org/10.1016/j.procs.2017.05.091
6. Kadache, N., Seghir, R.: A new social volunteer computing environment with task-adapted scheduling policy (TASP). Int. J. Grid High Perform. Comput. (IJGHPC) 13(2), 39–55 (2021). https://doi.org/10.4018/IJGHPC. to be published
7. Mustafee, N., Taylor, S., Katsaliaki, K., Dwivedi, Y., Williams, M.: Motivations and barriers in using distributed supply chain simulation. Int. Trans. Oper. Res. 19(5), 733–751 (2012). https://doi.org/10.1111/j.1475-3995.2011.00838.x, https://onlinelibrary.wiley.com/doi/abs/10.1111/j.1475-3995.2011.00838.x
8. Nouman Durrani, M., Shamsi, J.A.: Volunteer computing: requirements, challenges, and solutions. J. Netw. Comput. Appl. 39(1), 369–380 (2014). https://doi.org/10.1016/j.jnca.2013.07.006, https://linkinghub.elsevier.com/retrieve/pii/S1084804513001665
9. Robinson, S.: Modes of simulation practice in business and the military. In: Proceeding of the 2001 Winter Simulation Conference (Cat. No. 01CH37304), vol. 1, pp. 805–811 (2001). https://doi.org/10.1109/WSC.2001.977370
10. Wolfram, S.: Theory and applications of cellular automata, tables of cellular automaton properties. In: Advanced Series on Complex Systems, pp. 485–557. World Scientific, Singapore (1986)
11. Wolnik, B., Dzedzej, A., Baetens, J.M., De Baets, B.: Number-conserving cellular automata with a von Neumann neighborhood of range one. J. Phys. A Math. Theor. 50(43), 435101 (2017). https://doi.org/10.1088/1751-8121/aa89cf
Platforms Cooperation Based on CIoTAS Protocol
Bouchera Maati, Djamel Eddine Saidouni, and Mohammed Mounir Bouhamed
Department of NTIC, MISC Laboratory, University Constantine 2 Abdelhamid Mehri, Constantine, Algeria
{bouchera.maati,djamel.saidouni,mohammed.bouhamed}@univ-constantine2.dz
Abstract. Integrating IoT and Cloud computing technologies yields efficient solutions for quality assurance. Combining both technologies with autonomic computing creates powerful mechanisms to deal with abnormal situations. State-of-the-art solutions combine them with unbalanced functionalities, using top-down updating for device control and bottom-up updating to inform the IoT-Cloud platform about the contextual information captured by the device. This reduces the benefits of horizontal collaboration on both sides. We therefore propose a novel platform cooperation based on the CIoTAS protocol to enhance the contextual decisions of IoT devices. The proposed approach enriches the knowledge of Cloud platforms and exploits the capacity of IoT devices (things) based on their awareness, updating their data according to changes in the behavior and context of IoT devices. We prove the protocol's correctness with respect to the liveness and closure properties, and we conduct a simulation to illustrate the efficiency of our proposal.
Keywords: IoT-cloud platforms · Context-awareness · Context discovery · Service collaboration · Data synchronization
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023. S. Chikhi et al. (Eds.): MISC 2022, LNNS 593, pp. 105–117, 2023. https://doi.org/10.1007/978-3-031-18516-8_8
1 Introduction
The Internet of Things (IoT) plays a critical role in our daily life. Taking full advantage of this technology remains an open challenge because of the complexity of integrating other technologies, such as autonomic computing and cloud computing. The latter is designed to provide configurable resources to end-users (or applications) through a specific model; it includes the three-layer model: IaaS, PaaS, and AaaS. Extending this model to cover IoT solutions is required to establish correct service functionality in the face of unpredictable events. In fact, contextual dynamicity is crucial in IoT systems, where the context is shaped and reshaped according to the current condition and situation [1–3] and to time [4]. The characteristics of those systems create several challenges in ensuring that they deliver high-quality services according to their requirements [5–7] in a healthy way [8]. Therefore, the new service paradigm requires the integration of not only the cloud and the IoT but also other technologies, such as autonomic computing, web services [9], and big data, to name a few. The key element is the service itself. The service should integrate several things: it should be autonomic, self-defining its current functionality based on the pre-defined ones, self-procedure, and establishing tasks that flow under the initial structure [10]. Based on the service paradigm and under abnormal situations, the system will be fail-free [11] against any failure or degraded system state, making it more flexible and elastic [12]. Yet, few works have been established in this area of research, owing to the enormous number of existing platforms (commercial and academic) with no collaboration or cooperation aspect. Moreover, several platforms use the top-down Cloud-IoT architecture for decision-making, which ignores contextualization ability. In addition, decision-making is done on the cloud side, which eliminates autonomic reaction on the IoT side.
For those reasons, we propose a data synchronization protocol for the cloud-IoT platform, first to synchronize the contextual events on the cloud side, and second to synchronize data between the platforms through collaboration. Both types of synchronization are complementary. The contextual one uses platform collaboration to decide locally about the interactions with neighbors in two phases. The first phase integrates the data updates between the platforms and their devices, following the bottom-up model, and the second uses the top-down model to obtain the shared knowledge from other platforms. The platforms collaborate by exchanging messages, i.e., tokens, to gather as much information as possible about both the contexts accessible on their devices and the inaccessible ones.
The rest of this paper is structured as follows. Section 2 presents state-of-the-art solutions for platform updating and data synchronization. In Sect. 3, we proceed by presenting the basics of the proposal.
Section 4 gives the proposed solution, illustrating the basis of the CIoTAS protocol's functionality and formalization. In Sect. 5, we prove the correctness of our protocol. In Sect. 6, we describe our approach through illustrative examples. In Sect. 7, we implement a part of the proposed protocol. Section 8 concludes the paper.
2 Overview on Related Solutions
In the literature, Cloud-IoT platforms communicate with devices (i.e., things) for data synchronization in different ways; some optimize the process and others do not. The related works selected here concern data synchronization rather than clock synchronization, for both top-down and bottom-up solutions. In [13], the authors used resync, where the client files are updated with those on the server through bidirectional communication. The authors used hashing to detect file updates locally; the modifications are then sent to the server. However, the synchronization is launched by the cloud, which generates lists of reference signatures. The IoT device later uses these lists to define its reference signature for future synchronization. The authors also addressed traffic issues at the application layer. On the other hand, Lam et al. [14] proposed a decentralized approach for orchestrating and configuring services in industrial IoT based on the concept of a local automation cloud. The approach enables real-time monitoring of systems.
3 Solution Background
In this section, we present the basis of the proposed solution by recalling the CIoTAS protocol functionality and formalization used to define the collaboration protocol while integrating the data synchronization solution.
3.1 Shadow Functionality
The CIoTAS protocol [15] is a diagnosable protocol that uses the basis of distributed algorithms to guarantee the IoT system's reliability and security. It equips IoT devices with an autonomic ability to heal from abnormal situations through two control loops: the self-healing and self-protection loops. Three main phases are defined: discovery, collaboration, and healing. The IoT object launches the first phase (i.e., device discovery) for service finding and invocation. The last one is ensured either contextually (i.e., by a substitute device) or using the cloud; the cloud is introduced through the S²aaS model (shadows) if the first recovery plan cannot be achieved. Taking advantage of this healing solution, the cloud platform extends its knowledge to gain more context-awareness of the dynamic environments of IoT objects. We explain this part in the sequel.
The discovery phase is performed periodically. The object (agent) recognizes the surrounding contextual agents and their provided services by sending a broadcast message. Any device that responds becomes a neighboring node. Discovered services may be requested in the future, depending on the requester's behavior, which is unexpected but may be predictable from the captured information. Furthermore, this contextual discovery helps detect the level of dynamicity of the current context; it reshapes the service request in the collaboration phase. The latter may lead to abnormal object behaviors due to suspected attacks. To confirm that, the protocol introduces the cloud (as the S²aaS model) to diagnose the system states and recover from their degradation.
3.2 CIoTAS-Protocol Formulation
Based on the problem formulation in [15], each cloud (shadows) integrates the S²aaS model that reflects the context of IoT devices (nodes), where each physical device (agent) has its own context defined by time, space, and situation. Besides, it communicates with its supervisor (shadow) through message exchange to give an overview of the current device context. The formulation defined for the physical device includes the following points:
– A tuple whose components include SDA, SCA, State, LibAct and LastCx. SDA is the set of provided services; those services are contextually provided by agents in the SCA set. This set is initialized to ∅ in the discovery phase and is then filled. It is worth mentioning that the agent's SDA includes the services currently provided by the device, both internal and external. In addition, the State variable holds the current state, State ∈ {normal, collaborated, degraded, broken¹}, with transitions driven by the self-healing control loop in part (a) of Fig. 1. LibAct is the library of actions for data management based on its shadow's decisions, and the communication with its shadow is stored in the LastCx variable for traceability and updating.
– A tuple whose components include CAg, CSDA, CState, CDep and LibReact. CAg is the shadow identifier, and CSDA reflects the set of the physical agent's available contextual services (a global one). According to protocol hypotheses 2 and 3: “the cloud services are reliable and available all the time” and “the set of available services SDA of the agent Ag is available in its shadow CAg”. CState refers to the shadow state. CDep is the set of contextually discovered devices of its physical agent; it is used for failure notification and as a reference database for the future collaboration process. Finally, LibReact is the set of alternative actions for state management in an abnormal contextual situation (failure or attack).
Fig. 1. State diagram of autonomic abilities under collaboration process
The shadow is invoked in two cases. First, it is called for regular updating according to the context-discovery phase (routine). Second, it is called in an abnormal situation, after receiving a diagnostic message from its physical agent (see Procedure 1²). The diagnostic message includes either the identity of the previously requested contextual agent (Ag ≠ NULL), in order to contact that agent's shadow, or an empty one (Ag = NULL), which invokes the function Select(a); this function returns a provider shadow from the S²aaS model that may execute a specific action.
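The branching on the diagnostic message described above can be sketched as a small Python handler. This is an illustrative sketch, not the CIoTAS implementation: the function name `on_get_diagnostic` and the three callbacks (`shadow_of`, `select` and `send`) are hypothetical stand-ins for the shadow lookup, the Select(a) function and the message transport.

```python
from typing import Callable, Optional


def on_get_diagnostic(own_shadow: str, action: str, agent: Optional[str],
                      shadow_of: Callable[[str], str],
                      select: Callable[[str], str],
                      send: Callable[[tuple, str], None]) -> str:
    """Handle a Get-Diagnostic message received by a shadow.

    If the message names the previously requested agent, contact that
    agent's shadow; otherwise ask select() for a provider shadow able to
    execute the action. Returns the shadow that was contacted.
    """
    if agent is not None:
        target = shadow_of(agent)      # Ag != NULL: use the named agent's shadow
    else:
        target = select(action)        # Ag = NULL: pick a provider shadow
    send(("Get-Shadow-State", action, own_shadow), target)
    return target
```

Passing the transport and lookup functions as parameters keeps the sketch independent of any particular messaging layer.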
Procedure 1. When CAgi receives (Get-Diagnostic, a: action, Ag: Agent) from Agi
1: if (Ag ≠ NULL) then
2:   Send (Get-Shadow-State, a, CAgi) to CAg;
3: else
4:   CAgj = Select(a);
5:   Send (Get-Shadow-State, a, CAgi) to CAgj;
6: end if

¹ This state is presented under the absence of cloud healing.
² It reflects event 11 in CIoTAS protocol.

4 The Proposed Solution
The CIoTAS protocol [15] is a diagnosable protocol defined to recover from abnormal contextual events, such as potential DDoS attacks that manifest in contextual form. A DDoS attack is detected through the absence of requested results on the requester's
side, and recovery is also achieved by selecting other contextual providers (each device is associated with an agent). However, the healing takes place on the provider side, as a protection and mitigation approach, after the positive detection is confirmed by its cloud-platform shadows. In this work, we focus on defining the cooperation events, which include the updating ones: first, to collect information about other unattached contextual agents; second, to inform the cloud platform about contextual events such as a failed service, the current context, and a potential attack. In the following, we explain the proposal.
4.1 Cooperation Model and Assumptions
Assume a countable number of Cloud-IoT platforms with a set of mobile shadows. The platforms are virtually organized in rings, and each platform is assigned a unique identification number (see Fig. 2). Each ring is constructed using a clustering method, where the ring elements (i.e., platforms) are grouped according to the set of criteria mentioned in [16]. Platforms communicate through token transmission [11], and the token is handled under mutual exclusion at the platform level. The shadows are distributed randomly and follow the S2aaS model. They are managed in a separate sub-layer. We assume that data synchronization is established periodically and used by the platform for updating the object context. The latter decides the appropriate time for this phase according to its current parameters.
Assumptions. According to [10], we consider the following assumptions about the platform collaboration model P:
– Let C = {c1, ..., cn} be a countable set of cloud platforms offered as PaaS with different capabilities, such as device management, integration level, etc.
– Let Θ = {θ1, ..., θl} be a countable set of cloud-platform clusters that are constructed based on their capabilities.
– Let A = {Ag1, Ag2, ..., Agi, ...} be a countable set of context agents, stored in SCA, that reflects each associated shadow.
– Let CA = {Ag2₁, Ag2₂, ..., Ag2ᵢ, ...} be a countable set of cloud agents that reflects each associated shadow.
– Let Ä = {Ag1, Ag2, ..., Agi, ...} be a countable set of contextual-failed agents, stored in SFA³, that have been detected by a contextual agent.

³ A set of contextual failed agents.
Fig. 2. Cooperation model based on virtual rings of distributed platforms on the network
– Let S = {s1, ..., sn} be a set of agent services (physical-agent services), where n is a countable number of agent-provided services detected in the discovery phase.
– Let S = {s1, ..., sm} be a set of platform services (shadow services), where m is a countable number of cloud services provided by the shadow.
– Let ℘ be a finite set of state names ranging over {normal, collaborated, degraded, evolved}, where the system state is based on the self-X attributes.
– Let D = {d1, d2, ..., dk} be a set of application domains used as platform features.
– Let B = {b1, b2, ..., bl} be a set of shadow beliefs.
– Let ℑ be a finite ordered set of autonomous reactions.
– Let ρ = {good, medioc, low} be a set of internal system quality levels.
– Every agent Agi prepares a packet packetAgi with the agent-context data xi = {Ai, Äi, Si}:
    packetAgi(xi). (1)
– Every cloud agent Ag2i prepares a packet packetAg2i with the shadow-updated data x̂i = {Äi, Si, LibReact}:
    packetAg2i(x̂i). (2)
– Every platform ci prepares a token tokenci with its platform identifier IDci, the platform data yi = {CAi, Äi, Si}, and the associated token ring θi:
    tokenci(yi, θi). (3)
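The three messages in Eqs. (1)–(3) can be pictured as plain records. The following is only a minimal sketch: the class names, the field names, and the `build_token` helper are hypothetical, and merging the shadow packets by set union is an assumption, since the text does not state how a platform assembles its data yi.

```python
from dataclasses import dataclass, field


@dataclass
class AgentPacket:
    """Agent-context data x_i carried by packet_Agi in Eq. (1)."""
    context_agents: set = field(default_factory=set)  # A_i, kept in SCA
    failed_agents: set = field(default_factory=set)   # failed agents, kept in SFA
    services: set = field(default_factory=set)        # S_i, agent services


@dataclass
class ShadowPacket:
    """Shadow-updated data carried by packet_Ag2i in Eq. (2)."""
    failed_agents: set = field(default_factory=set)
    services: set = field(default_factory=set)
    lib_react: list = field(default_factory=list)     # LibReact


@dataclass
class Token:
    """Platform token of Eq. (3): platform id, ring id and platform data y_i."""
    platform_id: str
    ring_id: str
    cloud_agents: set = field(default_factory=set)
    failed_agents: set = field(default_factory=set)
    services: set = field(default_factory=set)


def build_token(platform_id, ring_id, shadow_packets):
    """Hypothetical helper: merge the packets of a platform's shadows into y_i.

    shadow_packets maps a cloud-agent identifier to its ShadowPacket.
    """
    token = Token(platform_id, ring_id)
    for cloud_agent, packet in shadow_packets.items():
        token.cloud_agents.add(cloud_agent)
        token.failed_agents |= packet.failed_agents  # assumed: union of failed agents
        token.services |= packet.services            # assumed: union of services
    return token
```

Keeping the three records separate mirrors the layering in the text: agents talk to shadows with packets, and only platforms exchange tokens.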
Protocol Functions. In the following, we introduce the updating and cooperation functions that are used for both sides of the object and its shadow:
– ϒ : A → {TRUE, FALSE}, a function that uses the stored object context (A) in SCA to test the compatibility and the evolvability of its context over a period.
– : ℘ → {TRUE, FALSE}, a function that determines the system performance level according to the current system state.
– ≅ : A × A → {TRUE, FALSE}, a function that returns a Boolean value according to the object's awareness of its context (e.g., an average of the functional contextual agents and the failed ones).
– ≡ : B × B → {TRUE, FALSE}, a function that returns a Boolean value according to the shadow beliefs, based on its physical agent's awareness of its context.
– getBDI : A × Ä → ℑ, a function that returns a set of global actions from the system library based on its initial objectives and the current situation.
Data Structure. In this section, we list the additional data structures for the data-synchronization part of the CIoTAS protocol:
– matching: a Boolean variable, initially FALSE.
– functionality: a variable that contains the system quality level, taking its value from ρ.
– Ag.reaction: a Boolean variable, initially FALSE.
– P.successor: the identity of the successor platform according to the defined cycle.
– P.state: a variable that contains the platform state, taking its value from {normal, collaborated, evolved}.
– exist: a Boolean variable, initially FALSE.
– UniQ: a Boolean variable, initially TRUE, that reflects the uniqueness of the token in the cyclic virtual topology of platforms. This variable is updated to TRUE in case the token information is duplicated on other tokens.
– LiB: a list of instructions that includes the agent's perception and attention.
4.2 The Proposed-Protocol Events
The proposed platform cooperation includes two main phases: the updating events, which take place between the physical device and its shadow, and the collaboration events between Cloud-IoT platforms. In the following, we enumerate the events of each phase.
Updating Events. The updating events between the object and its shadow on the related platform are the following:
– Event1 (see Procedure 2): The contextual agent (Agi) keeps updating its shadow according to the contextual events, in addition to the one mentioned in Sect. 3.2. A time δ after the discovery phase, Agi launches this event and first tests the evolvability of its context (line 1) using the function ϒ. If the latter returns TRUE, the agent checks the last communication with the platform (line 3) and sends the update message that includes the packet (line 5); it also sets the communication variable LastCx to 1 (line 6). However, if Agi has already sent the update, it may send the message again depending on its performance level (lines 8–9), which the function returns as the current level (i.e., the system-functional quality). Because the agent tests whether its context has evolved before sending anything, a context that is stable in the long term generates no update, which limits the exchange of control messages to the cases where they are needed.
Procedure 2. When Agi sends (Updating, packet) to Ag2i
1: matching = ϒ(SCA);
2: if (matching = TRUE) then
3:   if (LastCx = 0) then
4:     Create an instance for updating;
5:     Send (Updating, packet) to Ag2i;
6:     LastCx = 1;
7:   else
8:     functionality = (State);
9:     if (functionality = low) then
10:      Create an instance for updating;
11:      Send (Updating, packet) to Ag2i;
12:      LastCx = 0;
13:    end if
14:  end if
15: end if
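The branching of Procedure 2 can be sketched as a small pure function. This is only an illustration: the name `update_decision` and its arguments are hypothetical, and the results of the matching test and of the performance function are passed in as plain values because the text leaves both functions undefined.

```python
def update_decision(context_evolved, last_cx, functionality):
    """Return (send_update, new_last_cx) following the branches of Procedure 2.

    context_evolved: result of the matching test (line 1)
    last_cx:         value of LastCx before the event (0 or 1)
    functionality:   current quality level, one of "good", "medioc", "low"
    """
    if not context_evolved:          # line 2: nothing changed, stay silent
        return False, last_cx
    if last_cx == 0:                 # lines 3-6: first update since discovery
        return True, 1
    if functionality == "low":       # lines 8-12: repeat only when quality is low
        return True, 0
    return False, last_cx            # update already sent and quality acceptable
```

For instance, `update_decision(True, 0, "good")` yields `(True, 1)`, whereas a second call with `last_cx = 1` and a "good" level sends nothing.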
– Event2 (see Procedure 3): On reception, the shadow updates its state to “collaborated” (line 1). It then retrieves its focus of attention and goals to verify their compatibility with the system reactions (line 2). Those reactions are constructed over the object's SCA and SFA. Depending on the output of the function ≅ (line 3), the shadow redefines its library of reactions LibReact (line 5), decides on the SDA set of its physical agent (line 6), and sends the result to the agent (line 7).
Procedure 3. When Ag2i receives (Updating, packet) from Agi
1: state = collaborated;
2: LiB = getBDI(SCA, SFA);
3: Ag.reaction = ((SCA ≅ SFA) & (LiB ≡ LibReact));
4: if (Ag.reaction = TRUE) then
5:   Redefine LibReact;
6:   Set(SDA);
7:   Send (Update-Result, packet) to Agi;
8: end if
– Event3 (see Procedure 4): When Agi receives the update from its shadow, it sets its SDA and SFA lists (lines 1–2) for future requests, which also prevents it from selecting a suspicious agent found in its current context during the collaboration phase. In addition, it changes its LibReact (line 3) to pursue its objective with maximum resilience.
Procedure 4. When Agi receives (Update-Result, packet) from Ag2i
1: Set(SDA);
2: SetF(SFA);
3: SetBDI(LibReact);
4: Apply the changes and delete the instance.
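The exchange of Procedures 3 and 4 can be sketched as two handlers. This is a hedged illustration rather than the protocol's implementation: the dictionary layout and the function names are hypothetical, the functions ≅, ≡ and getBDI are injected as callables (`aware`, `same_beliefs`, `get_bdi`) because their definitions are left open, and "Redefine LibReact" is assumed to mean replacing the library with the list returned by getBDI.

```python
def shadow_on_updating(shadow, packet, aware, same_beliefs, get_bdi):
    """Shadow side (Procedure 3). Returns the Update-Result payload or None."""
    shadow["state"] = "collaborated"                        # line 1
    lib = get_bdi(packet["SCA"], packet["SFA"])             # line 2
    reaction = (aware(packet["SCA"], packet["SFA"])         # line 3
                and same_beliefs(lib, shadow["LibReact"]))
    if not reaction:                                        # line 4
        return None
    shadow["LibReact"] = list(lib)                          # line 5 (assumed meaning)
    shadow["SDA"] = set(packet["SDA"])                      # line 6
    return {"SDA": set(shadow["SDA"]),                      # line 7
            "SFA": set(packet["SFA"]),
            "LibReact": list(shadow["LibReact"])}


def agent_on_update_result(agent, result):
    """Agent side (Procedure 4): adopt the sets decided by the shadow."""
    agent["SDA"] = set(result["SDA"])              # line 1
    agent["SFA"] = set(result["SFA"])              # line 2
    agent["LibReact"] = list(result["LibReact"])   # line 3
    return agent                                   # line 4: changes applied
```

Passing the three functions as arguments keeps the sketch usable with any later definition of the matching and belief tests.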
Collaboration Events. The platform-collaboration events are enumerated below.
– Event4 (see Procedure 5): Using the virtual network formed by the constructed ring, a platform c becomes a data requester based on the contextual information received from its related shadows. Using the integrated autonomic abilities, Fig. 1(b) defines the system state diagram of self-description for platform collaboration. It includes three system states: normal, collaborated, and evolved. When c requires an extension of its knowledge, it changes its state from normal to collaborated (line 1) and then waits for the token reception (line 2). Once the token is received, it updates its state to evolved (line 3) to consume the token information.
Procedure 5. When c requires an evolving
1: P.state = "Collaborated";
2: Wait(exist == TRUE);
3: P.state = "Evolved";
– Event5 (see Procedure 6): After consuming the token information, ci returns to the “normal” state (line 2) and sets exist to FALSE (line 3) before sending the token to its successor (line 4).
Procedure 6. When ci returns to normal state
1: if (UniQ = FALSE & P.state = "Evolved") then
2:   P.state = "normal";
3:   exist = FALSE;
4:   Send (Cooperate, token) to successor(ci);
5: end if
– Event6 (see Procedure 7): On receiving the cooperation message, ci tests whether it is in the normal state and, at the same time, checks the ring value. It either simply forwards the token to its successor or, if it is a mono-color node, first duplicates the token information for local storage, so that it can be attached to the token of the other ring (from the other platform cluster) when that second token is received, and then forwards it (lines 2–7). In the alternative case, when ci is in a collaboration moment, it sets exist to TRUE and constructs the token information by adding its data (lines 10–11).
Procedure 7. When ci receives (Cooperate, token) message from its precedent platform
1: if (P.state = normal) then
2:   P.successor = getS(θi, ci);
3:   if (P.successor = successor(ci)) then
4:     Send (Cooperate, token) to P.successor;
5:   else
6:     Duplicate the token information;
7:     Send (Cooperate, token) to P.successor;
8:   end if
9: else
10:  exist = TRUE;
11:  Construct the token information;
12: end if
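To make the interplay of Procedures 5–7 concrete, the sketch below circulates one token once around a single ring. It is a deliberately simplified model under stated assumptions: the function name `run_round` and the dictionaries are hypothetical, the token is a plain dictionary, the UniQ test and the duplication at ring intersections are left out, and "consuming" the token is assumed to mean copying its content into the platform's local knowledge.

```python
def run_round(ring, state, local, knowledge):
    """Circulate one token once around `ring`, a list of platform ids given
    in successor order. Returns the token content after the round."""
    token = {}
    for pid in ring:
        if state[pid] == "collaborated":
            token.update(local[pid])        # Procedure 7, lines 10-11: add own data
            state[pid] = "evolved"          # Procedure 5, line 3
            knowledge[pid].update(token)    # consume the token information
            state[pid] = "normal"           # Procedure 6, lines 2-3
        # A platform in the normal state only forwards the token (Procedure 7, line 4).
    return token
```

With a ring `["c0", "c1", "c2"]` in which c0 and c2 are in the collaborated state, c2 ends the round knowing the data contributed by both platforms, while c1 merely forwards the token.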
5 Protocol-Correctness Proof
Theorem 1 (Vivacity). ∀ ci, cj attached to the constructed rings, ci will eventually receive the updated information from cj in a limited amount of time.
Proof. Due to the token circulation in each related ring, at each reception ci either passes the token to its successor or consumes it and re-sends it (see Procedures 6 and 7). In addition, the mono-color platform c found at the intersection of two rings hands the information from one ring to the other. As a result, each c will receive the updated information in a finite amount of time.
Theorem 2 (Safety). ∀ tokenci, tokenci traverses each ring in a finite amount of time, so that every piece of updating information is taken into account.
Proof. According to Procedure 2, the physical agent updates its shadow by sending an update message a time δ after the discovery phase. This phase is executed periodically, every Δ units of time. Assume that Δ + δ > τ, where τ is the time the token takes to travel around the ring. As a result, the data obtained from the contexts are taken into account in every case. After one round of travel, the information stored by ci is updated.
6 An Illustrative Example
Updating Example. Suppose that the discovery is launched at ti and takes Δ time to receive the contextual responses. At ti + Δ < ti+1, the update is started by invoking Event1 (see Procedure 2). Suppose, for example, that the previous requests yielded SCA = {Agi, Agj, Agk}, SDA = {s1, s2, s3}, and SFA = {Agl}, and that the agent did not inform the cloud about the missed collaboration with the agent Agi because of the contextual recovery. At ti+1, the agent rediscovers SCA = {Agi, Agj, Agk, Agm} and SDA = {s1, s2, s3, s4, s5}. Then matching = TRUE and, for the best performance, the agent informs its shadow about this modification of its context. After receiving the update message, the shadow may define new rules for requesting, for example by suspending Agl from any future requests. In another case, if the agent context is stable with respect to the contextual agents, SCA = {Agi, Agj, Agk}, but their available services are restricted, as in SDA = {s1, s2}, the agent informs its shadow, and a possible reaction is to select the service provider for s3 from the cloud shadows and to suspend the contextual agent that provided it.
Collaboration Example. Suppose that cj constructs the set of contextual agents and their provided services during the above phase. As a result, the detected failed agent is Agm. The latter becomes a questionable agent whose behavior needs to be clarified. To that end, cj uses the local platform update about the shadows and pre-requests the agent to confirm the presence of abnormal behavior. Assuming a positive detection, cj informs its successor by adding the agent to the blacklist, so that it is not requested by other shadows on other platforms. Moreover, thanks to the effect of information propagation, the platforms will detect the source of potential DDoS attacks.
7 Simulation
7.1 Simulation Settings
To show the importance of the proposed cooperation protocol, we use the above example as the foundation for the simulation. We use the JADE platform (https://jade.tilab.com/), version 4.5, in a Java environment with JDK support; it supports communication between nodes through agents by emulating their behaviors in the attached network using asynchronous messages. Based on JADE, we can simulate the interaction among contextual agents and estimate the platform updates for the proposed solution. The hardware is a 2nd-generation Intel i3 processor with 2 cores and 4 threads, a 2.2 GHz clock speed, and 4 GB of RAM, running a 64-bit Windows 7 OS. The events defined in Sect. 4.2 are either routines or sending or receiving events. The JADE agent behaviors are configured as follows. The physical agent and its platform are implemented in separate containers; it is worth mentioning that each shadow has a responsible agent, and the platform itself also has an agent to manage it. Sending an updating message by the physical agent is a ticker behavior that follows the discovery phase after a pre-defined amount of time. The other sending events are one-shot behaviors, whereas the reception blocks while waiting for a response, either an update or a token acquisition. In addition, since the evolving event is a generic behavior, its done() function depends on the reception of the token message.
7.2 Simulation Results
In this section, we give the results of an experiment that examines the efficacy of the proposal, based on the mentioned example and the model illustrated in Fig. 2.
Starting with the updating phase, we assume that the update is launched 3 s after getting the response to the discovery message, where 5 agents send the captured information to their shadows: three of them are related to the same ring θ1 (two on the same platform c0 and the third on c2), one is attached to c8 (θ2), and the last one is attached to c17 (θ3). Note that we block the local treatment for 0.2 s to slow down the execution and present the results in an appropriate way. The total number of control messages, covering the updating and the collaboration phases, is 36 messages for one round, where the five added messages serve to communicate between the shadows and their platforms. The response time is estimated at 0.2 s between the θ elements, and the total response time for the information to circulate, based on three tokens travelling between the three rings of platforms, is 6.34 s. Figure 3 presents the simulation results.
Fig. 3. A simulation of platforms cooperation using JADE
8 Conclusion
In this paper, we introduced a platform-cooperation approach based on contextual events, using the CIoTAS protocol, to enhance the decision-making process in critical situations. The proposed solution includes two phases, updating and collaboration, through which the Cloud-IoT platforms cooperate using the information-propagation effect. The first phase covers the message exchange between the thing and its shadow, where the updating is established based on the thing's behaviors. The second phase transfers the captured information to other platforms, using the token over the virtual structure of platform communication. For future work, we aim to define the matching, beliefs, and performance-testing functions based on dynamic agent activities. We also wish to implement the solution in a real context of use.
References
1. Mahmood, Z.: Guide to Ambient Intelligence in the IoT Environment. CCN, Springer, Cham (2019). https://doi.org/10.1007/978-3-030-04173-1
2. Clarizia, F., et al.: An approach based on context and situation awareness to improve functional safety in complex scenarios. In: Yang, X.-S., Sherratt, S., Dey, N., Joshi, A. (eds.) Proceedings of Sixth International Congress on Information and Communication Technology. LNNS, vol. 217, pp. 121–129. Springer, Singapore (2022). https://doi.org/10.1007/978-981-16-2102-4_11
3. Alavizadeh, H., et al.: A survey on cyber situation awareness systems: framework, techniques, and insights. ACM Comput. Surv. (CSUR) (2022). https://doi.org/10.1145/3530809
4. Kyriazis, D.P., Varvarigou, T.A., Konstanteli, K.: Achieving real-time in distributed computing: from grids to clouds. Information Science Reference (2012)
5. Pallewatta, S., Kostakos, V., Buyya, R.: QoS-aware placement of microservices-based IoT applications in Fog computing environments. Futur. Gener. Comput. Syst. (2022). https://doi.org/10.1016/j.future.2022.01.012
6. Herrera, J.L., Galán-Jiménez, J., Foschini, L., Bellavista, P., Berrocal, J., Murillo, J.M.: QoS-aware fog node placement for intensive IoT applications in SDN-fog scenarios. IEEE Internet Things J. (2022)
7. Adil, M., Alshahrani, H., Rajab, A., Shaikh, A., Song, H., Farouk, A.: QoS review: smart sensing in wake of COVID-19, current trends and specifications with future research directions. IEEE Sens. J. (2022)
8. Kühn, F., Hellbrück, H., Fischer, S.: A model-based approach for self-healing IoT systems. In: Proceedings of the 7th International Conference on Sensor Networks, pp. 135–140. SCITEPRESS-Science and Technology Publications, Lda (2018)
9. Safaei, A., Nassiri, R., Rahmani, A.M.: Enterprise service composition models in IoT context: solutions comparison. J. Supercomput. 78(2), 2015–2042 (2021). https://doi.org/10.1007/s11227-021-03873-7
10. Maati, B.: An elaboration of a diagnostic approach that operates in a connected objects environment. Ph.D. thesis, Constantine 2 University (2022)
11. Laprie, J.C., Avizienis, A., Kopetz, H.: Dependable computing and fault-tolerant systems. In: Dependability: Basic Concepts and Terminology in English, French, German, Italian and Japanese, vol. 5 (1992)
12. Li, Y., Pandis, I., Guo, Y.: Enabling virtual sensing as a service. Informatics 3(2), 3 (2016)
13. Petroni, A., Cuomo, F., Schepis, L., Biagi, M., Listanti, M., Scarano, G.: Adaptive data synchronization algorithm for IoT-oriented low-power wide-area networks. Sensors 18, 4053 (2018)
14. Lam, A.N., Haugen, Ø., Delsing, J.: Dynamical orchestration and configuration services in industrial IoT systems: an autonomic approach. IEEE Open J. Ind. Electron. Soc. 3, 128–145 (2022)
15. Maati, B., Saidouni, D.E.: CIoTAS protocol: CloudIoT available services protocol through autonomic computing against distributed denial of services attacks. J. Ambient Intell. Hum. Comput. 1–30 (2020)
16. Ray, P.P.: A survey of IoT cloud platforms. Future Comput. Inform. J. 1(1–2), 35–46 (2016)
Artificial Intelligence and its Applications
Hybrid Approach Based on Grey Wolf Optimizer for Dropout Regularization in Deep Learning
Selma Kali Ali and Dalila Boughaci
Faculty of Computer Science, LRIA-USTHB, Algiers, Algeria {skaliali,dboughaci}@usthb.dz, [email protected], dalila [email protected]
Abstract. One of the advanced concepts in artificial intelligence is deep learning, which is a subfield of Machine Learning. Training deep neural networks requires setting optimal hyperparameters. Dropout is a regularization parameter that avoids overfitting when training deep neural networks. Despite the success of this technique, finding the optimal value of the dropout probability with a manual search is a time-consuming process. Therefore, metaheuristic algorithms are the best choice to find this optimal value. In this paper, we propose a hybrid search method based on Gray Wolf Optimizer (GWO) and Multi-Verse Optimizer (MVO) to select the dropout probability rate. The results obtained on the image classification task show clearly the good performance of the proposed method.
Keywords: Deep learning · Dropout · Regularization · Swarm intelligence · Gray Wolf Optimizer · Hybridized gray wolf optimizer · Multi-Verse Optimizer
1 Introduction
Several areas of artificial intelligence have seen remarkable progress in the last decade with the introduction of deep learning. Deep learning-based approaches have become paramount in many complex application domains such as computer vision, natural language processing, and speech recognition. High variance, also called overfitting, is a common problem in deep neural network models. Overfitting occurs when the model fits the training data well but fails to generalize to new data. Several techniques have been proposed to prevent a deep learning model from overfitting, among them early stopping, data augmentation [8], L1 regularization, L2 regularization (weight decay) [16], and dropout layers [20].
This work focuses on the dropout regularization technique. Dropout consists in ignoring randomly selected neurons during training. Despite the popularity of this technique, finding the optimal value of the dropout rate is an NP-hard optimization problem [2]. Metaheuristics have proven successful on such problems.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023. S. Chikhi et al. (Eds.): MISC 2022, LNNS 593, pp. 121–134, 2023. https://doi.org/10.1007/978-3-031-18516-8_9
Few works have attempted to select the dropout regularization parameter using metaheuristic techniques on the image classification task. In [3], several metaheuristics were investigated, such as Particle Swarm Optimization (PSO), Bat Algorithm (BA), Cuckoo Search (CS), and Firefly Algorithm (FA). The results showed that metaheuristics are able to obtain an appropriate dropout rate. Nevertheless, the metaheuristics were not studied in sufficient depth in that work. The authors of [2] proposed a hybridized bat algorithm for fine-tuning the dropout probability: in one iteration the BA exploitation was used, while in the next iteration the Artificial Bee Colony (ABC) exploitation mechanism was performed. However, in these two studies, each iteration of the optimization process requires full training of the neural network, which turns out to be an expensive process. Recently, a new attempt was made to find an appropriate value for the dropout parameter [1]. An improved version of the firefly algorithm was proposed, based on an explicit exploration mechanism and a chaotic local search strategy. Although early stopping is set to 5% of the total number of training epochs to reduce the high computational burden, the evaluation during the optimization process depends on the classification error rate on the test set, whereas this set should remain unseen.
In this paper, we propose a hybrid metaheuristic for estimating the optimal value of the dropout rate. The proposed algorithm is based on the Gray Wolf Optimizer and the Multi-Verse Optimizer (GWO-MVO). Gray Wolf Optimizer (GWO) is a recent metaheuristic inspired by the social behavior of wolves in nature [14]. It has successfully solved many optimization problems and has given better solutions than other metaheuristics. One of its main advantages is that it requires few parameters. However, due to its low exploration capacity, it easily falls into local optima.
In the literature, many attempts have been made to prevent stagnation in local optima and to establish an appropriate balance between exploration and exploitation [4,5,17–19]. Our proposal addresses this issue using the Multi-Verse Optimizer (MVO) algorithm, which is inspired by the multi-verse theory in physics [13]. The MVO is based on mathematical models developed to perform exploration, exploitation, and local search, respectively. Our method is evaluated on the image classification task with three commonly used datasets and compared to other proposed methods. The numerical results are promising and show the effectiveness of the proposed method (GWO-MVO).
The rest of the paper is organized as follows. In Sect. 2, we briefly present the main concepts used in this work. Section 3 details the proposed approach for dropout rate estimation, applied to the image classification task. We describe our experimental setup and discuss our results in Sect. 4. Finally, in Sect. 5, we summarize the results of this work and draw conclusions.
2 Background
In this section, we briefly describe the dropout regularization technique and the two metaheuristics used in this work.
2.1 Dropout Regularization
Dropout [20] is the method most used in practice for network regularization. It represents a technique that improves neural networks by reducing overfitting. The main idea is to train a different network in each training iteration. Consider a neural network with L hidden layers. Let $z^{(l)}$ be the vector of inputs to layer l and $y^{(l)}$ the vector of outputs from layer l ($y^{(0)} = x$ is the input). $W^{(l)}$ and $b^{(l)}$ are the weights and biases of layer l. For any l and any hidden unit i, the feed-forward operation of a standard neural network can be described as follows [20]:

$$z_i^{(l+1)} = w_i^{(l+1)} y^{(l)} + b_i^{(l+1)} \qquad (1)$$
$$y_i^{(l+1)} = f\left(z_i^{(l+1)}\right) \qquad (2)$$

With dropout, some neurons will be deactivated randomly during the training phase. The feed-forward operation becomes [20]:

$$r_i^{(l)} \sim \mathrm{Bernoulli}(p) \qquad (3)$$
$$\tilde{y}^{(l)} = r^{(l)} \times y^{(l)} \qquad (4)$$
$$z_i^{(l+1)} = w_i^{(l+1)} \tilde{y}^{(l)} + b_i^{(l+1)} \qquad (5)$$
$$y_i^{(l+1)} = f\left(z_i^{(l+1)}\right) \qquad (6)$$

where, for any layer l, $r^{(l)}$ is a vector of independent Bernoulli random variables, each of which has probability p of being 1. At the testing phase, we can scale the weights of each neuron by p [20]:

$$W_{test}^{(l)} = p\,W^{(l)} \qquad (7)$$
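Equations (3)–(7) can be traced in a few lines of plain Python. The following is a minimal sketch for a single layer, under assumptions that are not part of the text: the activation f is taken to be ReLU, the product in Eq. (4) is applied element by element, and the function names are illustrative.

```python
import random


def relu(z):
    """Assumed activation f; the text leaves f unspecified."""
    return max(0.0, z)


def train_forward(y, weights, biases, p, rng, f=relu):
    """Eqs. (3)-(6): p is the probability that a unit is kept."""
    r = [1.0 if rng.random() < p else 0.0 for _ in y]        # Eq. (3)
    y_tilde = [ri * yi for ri, yi in zip(r, y)]              # Eq. (4)
    z = [sum(w * v for w, v in zip(row, y_tilde)) + b        # Eq. (5)
         for row, b in zip(weights, biases)]
    return [f(zi) for zi in z]                               # Eq. (6)


def inference_forward(y, weights, biases, p, f=relu):
    """Eq. (7): no unit is dropped, the weights are scaled by p instead."""
    z = [sum(p * w * v for w, v in zip(row, y)) + b
         for row, b in zip(weights, biases)]
    return [f(zi) for zi in z]
```

With p = 1 no unit is dropped and both functions return the same output, which is a quick sanity check of the weight scaling in Eq. (7).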
2.2 Gray Wolf Optimizer
Gray Wolf Optimizer (GWO) is a recent swarm intelligence algorithm, proposed in 2014 by Mirjalili et al. [14]. The GWO is inspired by the leadership hierarchy and hunting behavior of gray wolves. In the following, the inspiration of GWO is described, as well as its mathematical model.
Inspiration. Most gray wolves choose to live in a pack where the social hierarchy is strictly respected by the group. To ensure discipline within the pack, the wolves define four hierarchical levels. The important decisions are made by the leader wolves, called alpha (α). The second-level wolves are called beta (β). The betas assist the alphas in decision-making and transmit important decisions to the other wolves. The lowest level consists of the wolves that are allowed to eat the prey last; a wolf of this level is known as an omega (ω). The remaining wolves are known as delta (δ). This category includes scouts, sentinels, elders, hunters, and caretakers. Another interesting social behavior of gray wolves is group hunting. According to [15], the main phases of gray wolf hunting are grouped into three steps: tracking the prey, encircling the prey, and attacking the prey.
Mathematical Model. In this subsection, the mathematical models of social hierarchy, tracking, encircling, and attacking prey introduced in [14] are provided.
Social Hierarchy. Based on the fitness evaluation of the wolves in the population, we consider the first, second, and third-best solutions as alpha, beta, and delta, respectively. The prey search is guided by these three group leaders. The remaining wolves are considered as omega.
Encircling Prey. The encircling strategy of wolves around prey can be mathematically modeled by the following equations:

$$X_{t+1} = X_{p,t} - A \cdot D \qquad (8)$$
$$D = |C \cdot X_{p,t} - X_t| \qquad (9)$$
$$A = 2 \cdot a \cdot r_1 - a \qquad (10)$$
$$C = 2 \cdot r_2 \qquad (11)$$

where $X_t$ and $X_{t+1}$ are the positions of the wolf at the $t$-th and $(t+1)$-th iterations, $X_{p,t}$ represents the location of the prey at the $t$-th iteration, $D$ is the difference vector, $A$ and $C$ are coefficient vectors, $r_1$ and $r_2$ are random vectors in [0, 1] taken from a uniform distribution, and $a$ is a vector that decreases linearly from 2 to 0 as the iterations proceed and can be expressed as:

$$a = 2 - 2 \times \left(\frac{t}{T}\right) \qquad (12)$$

where t is the current iteration and T indicates the maximum number of iterations.
Hunting. Assuming that the alpha, beta, and delta leaders have sufficient knowledge of the prey, each wolf can update its position using the following equations:

$$X'_\alpha = X_\alpha - A_\alpha \cdot D_\alpha \qquad (13)$$
$$X'_\beta = X_\beta - A_\beta \cdot D_\beta \qquad (14)$$
$$X'_\delta = X_\delta - A_\delta \cdot D_\delta \qquad (15)$$
$$X_{t+1} = \frac{X'_\alpha + X'_\beta + X'_\delta}{3} \qquad (16)$$
Exploration (search for prey) and Exploitation (attacking prey). To attack the prey, exploitation takes place when the random values of A lie in the interval [−1, 1], i.e., |A| < 1, whereas |A| > 1 makes the GWO algorithm search globally. Another component of GWO that promotes exploration is C, which ensures random behavior. The GWO is described in Algorithm 1.
Algorithm 1. GWO pseudocode
Initialization
  Initialize the grey wolf population Xi (i = 1, 2, ..., n)
  Initialize parameters: Max_iter
Evaluation
  Evaluate the fitness of each wolf
  Select the leaders Xα, Xβ and Xδ of the wolf pack
while t < Max_iter do
  Update a by equation (12)
  for each search agent do
    Update the position of the current search agent by equation (16)
  end for
  Update the leaders Xα, Xβ and Xδ of the wolf pack
  t = t + 1
end while
return Xα
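Algorithm 1 can be turned into a short, runnable sketch. The version below is an illustration under stated assumptions rather than the authors' implementation: it minimises a fitness function over a box, draws fresh random numbers per dimension and per leader, clips positions to the bounds, and keeps the previous leaders in the ranking so that the returned alpha never gets worse. The names `gwo` and `sphere` and the default parameter values are hypothetical.

```python
import random


def gwo(fitness, dim, lb, ub, n_wolves=10, max_iter=50, seed=0):
    """Minimise `fitness` over [lb, ub]^dim with a basic Grey Wolf Optimizer."""
    rng = random.Random(seed)
    wolves = [[rng.uniform(lb, ub) for _ in range(dim)] for _ in range(n_wolves)]
    # The three best wolves are the leaders alpha, beta and delta.
    leaders = [list(w) for w in sorted(wolves, key=fitness)[:3]]
    for t in range(max_iter):
        a = 2 - 2 * (t / max_iter)                        # Eq. (12)
        for w in wolves:
            for j in range(dim):
                moves = []
                for leader in leaders:
                    A = 2 * a * rng.random() - a          # Eq. (10)
                    C = 2 * rng.random()                  # Eq. (11)
                    D = abs(C * leader[j] - w[j])         # Eq. (9), leader as prey
                    moves.append(leader[j] - A * D)       # Eqs. (13)-(15)
                w[j] = min(max(sum(moves) / 3, lb), ub)   # Eq. (16), clipped to bounds
        # Assumption: previous leaders stay in the ranking, so alpha never gets worse.
        pool = wolves + leaders
        leaders = [list(w) for w in sorted(pool, key=fitness)[:3]]
    return leaders[0]


def sphere(x):
    """Toy objective used only for illustration."""
    return sum(v * v for v in x)
```

A call such as `gwo(sphere, dim=2, lb=-5.0, ub=5.0)` returns the best position found. In the dropout setting, the position would be one-dimensional and the fitness would come from training the network, which is far more expensive than this toy objective.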
2.3 Multi-Verse Optimizer
The multi-verse optimization (MVO) [13] algorithm is one of the most recent metaheuristics, proposed by Mirjalili et al. The algorithm is inspired by the multiverse theory in astrophysics. The multi-verse theory describes how big bangs create multiple universes and how these universes interact with each other. Three concepts have been considered, white holes, black holes and wormholes. Black holes and white holes exchange objects through a tunnel. The principle is that black holes attract everything and white holes emit everything. On the other hand, a wormhole generates a tunnel through time and connects different parts of a universe. Each universe is characterized by an inflation rate that causes its expansion. In MVO, each solution is represented by a universe where each attribute of the solution represents an object in that universe. The variable values of universes with high fitness values are moved to universes with low fitness values via white and black holes. In addition, each universe can encounter a random theoretical transfer in its attributes towards the best universe via wormholes. MVO is defined by two main functions (17) and (18) . j Xk if r1 < N I(Ui )) (17) Xij = Xij else where xji is the j th parameter of the ith universe, Ui represents the ith universe, N I(Ui ) denotes the normalized inflation rate of the ith universe, r1 is a random number in [0, 1], and xjk denotes the j th parameter of the k th universe chosen via a roulette selection mechanism. ⎧ j ⎨ X + T DR × ((ubj − lbj ) × r4 + lbj ) X j − T DR × ((ubj − lbj ) × r4 + lbj ) Xij = ⎩ j Xi
if r3 < 0.5 else
if r2 < W EP
(18)
else
where X j is the j th parameter of the best universe trained so far, lbj and ubj are respectively the lower and upper bounds of the j th variable, xji is the j th
126
S. Kali Ali and D. Boughaci
parameter of the ith universe, and r2 , r3 , r4 are random numbers in [0, 1]. T DR and W EP are coefficients that are calculated as follows: W EP = min + t ×
max − min T
(19)
1
T DR = 1 −
tp
(20) 1 Tp where p denotes the exploitation factor, min and max are constants, t indicates the current iteration, and T is the maximum iteration.
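The wormhole step of Eqs. (18)-(20) can be illustrated with a short Python sketch. It is a minimal illustration rather than the authors' implementation: the default values `wep_min=0.2`, `wep_max=1.0` and `p=6.0` are assumed here for the example, and the final clamping to the bounds is an added safeguard that Eq. (18) does not state.

```python
import random


def wep(t, T, wep_min=0.2, wep_max=1.0):
    """Wormhole existence probability, Eq. (19); min/max defaults are assumed."""
    return wep_min + t * (wep_max - wep_min) / T


def tdr(t, T, p=6.0):
    """Travelling distance rate, Eq. (20); the value of p is assumed."""
    return 1.0 - t ** (1.0 / p) / T ** (1.0 / p)


def wormhole_update(x, best, lb, ub, t, T, rng=random):
    """Apply Eq. (18) to every coordinate of one universe x."""
    w, d = wep(t, T), tdr(t, T)
    new_x = []
    for j, xj in enumerate(x):
        r2, r3, r4 = rng.random(), rng.random(), rng.random()
        if r2 < w:
            step = d * ((ub[j] - lb[j]) * r4 + lb[j])
            xj = best[j] + step if r3 < 0.5 else best[j] - step
            # Clamping is an assumption of this sketch, not part of Eq. (18).
            xj = min(max(xj, lb[j]), ub[j])
        new_x.append(xj)
    return new_x
```

As t approaches T, WEP grows towards its maximum while TDR shrinks to 0, so late iterations move universes almost exactly onto the best universe.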
3 Proposed Approach

In this section we introduce the proposed method for selecting the dropout probability in convolutional neural networks.

3.1 Deep Learning Architecture
CNN models are among the most popular deep learning architectures. They are applied in different areas, such as computer vision and natural language processing, and are a well-established choice for the image classification task. To allow a comparative analysis, we use the architectures provided with Caffe [7]. As in previous works, we add a dropout layer to each architecture. More details follow in the next section.

3.2 GWO-MVO
Although GWO has proven useful in solving many optimization problems, the algorithm is not able to escape stagnation in local optima. This is due to the search mechanism, in which all wolves are guided by the wolves α, β, and δ. As a result, the wolves converge quickly towards these leaders. To establish an appropriate balance between exploration and exploitation, we propose two modifications:

1. We introduce a nonlinear decay strategy for parameter a to slow the speed of convergence to the leader wolves, and allow better exploitation. Several strategies have been tested in this context to improve the whale optimization algorithm (WOA) [22]. The strategy used in this contribution is inspired by the calculation of the TDR (Travelling Distance Rate) parameter in the MVO algorithm:

$$a = 2 \times \left(1 - \frac{t^{1/p_1}}{T^{1/p_1}}\right) \tag{21}$$

– $t$ indicates the current iteration, and $T$ is the maximum number of iterations
– $p_1$ is the accuracy of the exploitation over iterations

2. The mathematical models developed in MVO make it possible to perform exploration, exploitation, and local search. In order to improve the performance of GWO, the worst agent is replaced using a modified version of Eq. (18) employed in MVO. Eq. (18) is based on the TDR parameter, which, by Eq. (24), decreases nonlinearly over the iterations in order to obtain a more accurate exploitation/local search around the best individual. The replacement mechanism is described by the following equations:

$$X_{worst} = \begin{cases} X_{new} & \text{if } f^*(X_{new}) < f^*(X_{worst}) \\ X_{worst} & \text{otherwise} \end{cases} \tag{22}$$

$$X_{new} = X_\alpha + TDR \times ((ub - lb) \times r_3 + lb) \tag{23}$$

$$TDR = 1 - \frac{t^{1/p_2}}{T^{1/p_2}} \tag{24}$$

– $f^*$ is the fitness function
– $t$ and $T$ indicate the current iteration and the maximum number of iterations, respectively
– $p_2$ is the accuracy of the exploitation over iterations
– $X_\alpha$ indicates the position of the alpha wolf
– $X_{worst}$ represents the position of the worst wolf
– $lb$ and $ub$ are the lower and upper bounds, respectively
– $r_3 \in [-1, 1]$ is a random number drawn from a uniform distribution

We note that the higher $p_1$ and $p_2$ are, the faster and more accurate the local exploitation/search is. Thus, to introduce a good balance between exploitation and exploration, $p_1$ is kept very small in contrast to $p_2$. In short, the first modification aims at improving exploration and the second one enhances exploitation. GWO-MVO is sketched in Algorithm 2.

3.3 Search Mechanism
The main goal is to improve the accuracy rate in the image classification task by finding the optimal value of the dropout probability. The optimal dropout probability dp is therefore estimated by the GWO-MVO metaheuristic, where each wolf position represents a possible dp value. Each position is evaluated based on the output of the CNN. Once the neural network is trained with an early stopping condition adjusted to 5% of the total number of training epochs, the accuracy is evaluated on the validation set. We aim to minimize the error rate, so the fitness is calculated using the following formulas:

$$Fitness = Error\ Rate = 1 - Accuracy \tag{25}$$

$$Accuracy = \frac{tp + tn}{tp + fn + fp + tn} \tag{26}$$

where $tp$ is the number of true positives, $tn$ the number of true negatives, $fp$ the number of false positives and $fn$ the number of false negatives.
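Eqs. (25) and (26) translate directly into code. The sketch below is only an illustration of the two formulas; the function names are hypothetical and the counts would come from evaluating the trained CNN on the validation set.

```python
def accuracy(tp, tn, fp, fn):
    """Eq. (26): share of correctly classified validation samples."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("at least one sample is required")
    return (tp + tn) / total


def fitness(tp, tn, fp, fn):
    """Eq. (25): error rate, the quantity the metaheuristic minimizes."""
    return 1.0 - accuracy(tp, tn, fp, fn)
```

For example, 90 correct predictions out of 100 validation samples give an accuracy of 0.9 and a fitness of 0.1.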
Algorithm 2. GWO-MVO
Initialization
  Initialize the grey wolf population Xi (i = 1, 2, ..., n)
  Initialize parameters: Max_iter, p1 and p2
Evaluation
  Evaluate the fitness of each wolf
  Select the leaders Xα, Xβ and Xδ of the wolf pack
while t < Max_iter do
  Update a by equation (21)
  for each search agent do
    Update the position of the current search agent by equation (16)
  end for
  Update the leaders Xα, Xβ and Xδ of the wolf pack
  Update TDR by equation (24)
  Replace the worst wolf by equation (22)
  Update the leaders Xα, Xβ and Xδ of the wolf pack
  t = t + 1
end while
return Xα
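The steps of Algorithm 2 can be sketched in plain Python as follows. This is a hedged illustration, not the authors' Numba/Caffe implementation: the position update assumes the standard GWO rule (the average of three leader-guided moves) for Eq. (16), which lies outside this section; leaders are kept as the three best positions seen so far; and clipping to [lb, ub] is an added assumption. The objective `f` stands in for the CNN training procedure.

```python
import random


def decay_a(t, T, p1):
    """Eq. (21): nonlinear decay of a from 2 towards 0."""
    return 2.0 * (1.0 - t ** (1.0 / p1) / T ** (1.0 / p1))


def tdr(t, T, p2):
    """Eq. (24): travelling distance rate."""
    return 1.0 - t ** (1.0 / p2) / T ** (1.0 / p2)


def clip(v, lb, ub):
    return min(max(v, lb), ub)


def top3(candidates):
    """Keep the three (fitness, position) pairs with the lowest fitness."""
    return sorted(candidates, key=lambda c: c[0])[:3]


def gwo_mvo(f, dim, lb, ub, n_agents=6, max_iter=10, p1=2.0, p2=10.0, seed=0):
    rng = random.Random(seed)
    wolves = [[rng.uniform(lb, ub) for _ in range(dim)] for _ in range(n_agents)]
    fit = [f(w) for w in wolves]
    leaders = top3(list(zip(fit, wolves)))          # alpha, beta, delta
    for t in range(max_iter):
        a = decay_a(t, max_iter, p1)
        for i in range(n_agents):
            new_pos = []
            for j in range(dim):
                guided = 0.0
                for _, leader in leaders:           # assumed standard GWO move
                    A = 2.0 * a * rng.random() - a
                    C = 2.0 * rng.random()
                    D = abs(C * leader[j] - wolves[i][j])
                    guided += leader[j] - A * D
                new_pos.append(clip(guided / len(leaders), lb, ub))
            wolves[i], fit[i] = new_pos, f(new_pos)
        leaders = top3(leaders + list(zip(fit, wolves)))
        # Replacement of the worst wolf, Eqs. (22)-(24).
        worst = max(range(n_agents), key=fit.__getitem__)
        alpha = leaders[0][1]
        d = tdr(t, max_iter, p2)
        cand = [clip(alpha[j] + d * ((ub - lb) * rng.uniform(-1.0, 1.0) + lb), lb, ub)
                for j in range(dim)]
        f_cand = f(cand)
        if f_cand < fit[worst]:
            wolves[worst], fit[worst] = cand, f_cand
        leaders = top3(leaders + [(f_cand, cand)])
    return leaders[0][1], leaders[0][0]
```

With a one-dimensional search space bounded by 0 and 1, each wolf position plays the role of a candidate dropout probability, and every call to `f` corresponds to one training run.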
4 Experiments

To reduce the execution time, we implemented a parallel exploitation/exploration scheme using Numba [10], which provides an NVIDIA CUDA back end for GPU computing in Python. In addition, population fitness evaluation is implemented with the Caffe library [7], which is developed for General-Purpose computing on Graphics Processing Units (GPGPU) and offers efficient implementations of convolutional architectures. The experiments were run on a single GPU (NVIDIA RTX 2060 Super) with an AMD Ryzen 7 3700X CPU, 16 GB of RAM, and Ubuntu 18.04.

4.1 Datasets
The proposed approach is tested on three image classification datasets: MNIST [11], USPS [6] and CIFAR-10 [9]. These datasets are widely used to train machine learning and computer vision algorithms.

– MNIST is a database of handwritten digits that contains images of digits between 0 and 9. MNIST is divided into training and testing observations. All images are 28 × 28 pixels, gray-scale.
– USPS is a dataset for handwritten digit recognition. The dataset is divided into training and testing sets, and the images are 16 × 16 pixels, gray-scale.
– CIFAR-10 is a dataset composed of color images divided into 10 classes. The dataset is split into training and testing sets. The images are 32 × 32 pixels, in color.

Fig. 1. Examples from the datasets used: (a) MNIST, (b) USPS, (c) CIFAR-10
Figure 1 shows a training example from each dataset. As in [3], the training set of each dataset was divided into training and validation sets. The dataset splits and the batch sizes are shown in Table 1.

Table 1. Dataset configuration (batch sizes in parentheses).

Dataset     #Training set    #Validation set    #Testing set
MNIST       20,000 (64)      40,000 (100)       10,000 (100)
USPS        2,406 (32)       4,885 (977)        2,007 (2,007)
CIFAR-10    20,000 (100)     30,000 (100)       10,000 (100)

4.2 CNN Architecture
Two architectures from the Caffe examples were implemented with an additional dropout layer: one for the MNIST and USPS datasets and the other for the CIFAR-10 dataset. For the USPS dataset, the kernel size of the convolution layers was set to 3 instead of 5 because of the lower resolution of its images. Figure 2 illustrates the architectures used in this study.

4.3 Parameter Tuning
The results obtained are compared to those of [3], [2] and [1]. To perform a fair comparative analysis, we use a parameter configuration similar to that of [3]. Table 2 shows the configuration of the CNN parameters for each dataset. The configuration of the metaheuristic control parameters is shown in Table 3. As indicated, we decreased the number of GWO-MVO agents so that GWO-MVO does not make more calls to the training procedure than GWO, while respecting the configuration of [3]. The input parameters p1 and p2 were fixed after several tests.

Fig. 2. CNN architectures for: (a) MNIST and USPS, and (b) CIFAR-10 [3]

Table 2. CNN parameter configuration.

Dataset     Learning rate η    Momentum α    Weight decay λ    Dropout ratio p    #Iterations
MNIST       0.01               0.9           0.0005            [0, 1]             10,000
USPS        0.01               0.9           0.0005            [0, 1]             10,000
CIFAR-10    0.001              0.9           0.004             [0, 1]             4,000

Table 3. Metaheuristic parameter configuration.

Metaheuristic    Parameters                                         #Training procedure calls
GWO              #iterations = 10, #agents = 7                      77
GWO-MVO          #iterations = 10, #agents = 6, p1 = 2, p2 = 10     76
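The call counts in Table 3 are consistent with a simple budget: one evaluation per agent at initialization, one per agent in every iteration, plus, for GWO-MVO, one extra evaluation per iteration for the replacement candidate of Eq. (22). This reading is an assumption, since the text does not spell out the arithmetic; the helper below is a hypothetical illustration of it.

```python
def training_calls(n_agents, n_iterations, extra_per_iteration=0):
    """Number of fitness evaluations, i.e. calls to the training procedure."""
    initial = n_agents
    per_iteration = n_agents + extra_per_iteration
    return initial + n_iterations * per_iteration


gwo_calls = training_calls(n_agents=7, n_iterations=10)
gwo_mvo_calls = training_calls(n_agents=6, n_iterations=10, extra_per_iteration=1)
```

Under this assumption GWO needs 7 + 10 × 7 = 77 calls and GWO-MVO needs 6 + 10 × (6 + 1) = 76 calls, which matches Table 3.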
4.4 Evaluation Measure

For a fair comparative study, we followed the same test procedure as in [3]: each experiment was run 20 separate times, and the average accuracy on the test set was used as the comparison metric. We also performed a statistical analysis of the results to confirm the performance of the proposed GWO-MVO algorithm compared to the basic GWO algorithm (Sect. 4.6).

4.5 Numerical Results
Table 4 presents the comparative analysis: the average accuracies and the average values of the hyperparameter p found on the three datasets, MNIST, USPS and CIFAR-10. The results indicate that the proposed GWO-MVO approach improves on the original GWO algorithm: 99.3% versus 99.22% on the MNIST dataset, 96.76% versus 96.37% on the USPS dataset, and an improvement of about 1.09 percentage points on the CIFAR-10 dataset (73.19% versus 72.10%). We can also see that the two algorithms converge to similar search areas on the three datasets, as indicated by the average values of p. This supports the effectiveness of our proposal in escaping local optima. The results also indicate the superior performance of the GWO-MVO method compared to the other methods. On the MNIST dataset, the accuracy rate achieved by GWO-MVO was slightly better than the values from previous works, with an average accuracy of 99.3%. Likewise, the experiments on the CIFAR-10 dataset show that GWO-MVO outperforms the other works with an accuracy rate of 73.19%. On the USPS dataset, GWO-MVO failed to outperform the rate of 96.88% achieved by CFAEE. However, GWO-MVO achieved the second-best rate, 96.76%.

Table 4. Average accuracies on MNIST, USPS and CIFAR-10 datasets

                     MNIST                    USPS                     CIFAR-10
Method               Accuracy (%)   p         Accuracy (%)   p         Accuracy (%)   p
Caffe [3]            99.07          0         95.80          0         71.47          0
Dropout Caffe [3]    99.18          0.5       96.21          0.5       72.08          0.5
BA-OM [2]            99.19          0.5216    –              –         71.76          0.6710
CFAEE [1]            99.26          0.529     96.88          0.845     72.32          0.388
GWO                  99.22          0.3839    96.37          0.5026    72.10          0.814
GWO-MVO              99.3           0.343     96.76          0.6099    73.19          0.8086

4.6 Statistical Results
To assess the impact of the improvement on GWO's search for the global optimum, a non-parametric Wilcoxon rank-sum test over the 20 runs is performed at the 5% significance level (α = 0.05). Table 5 shows the p-values obtained when comparing GWO-MVO and GWO on all datasets. Under the criterion that an algorithm is significantly better when the p-value is below 0.05, we conclude that our hybrid algorithm is significantly better than the original algorithm on all datasets.
Table 5. p-value of GWO-MVO against GWO for MNIST, USPS and CIFAR-10 datasets.
Dataset MNIST USPS p-value
5
CIFAR-10
θ then y ←Class label 1. end if Return the class of the classified vector y.
It is now clear that the optimal value of w is directly linked to the singularity of the Sw matrix. As is well known in the literature, facial detection problems suffer from the "Small Sample Size" (SSS) problem, and some studies [9] have proposed solutions to this under-sampling problem for LDA specifically. Instead of changing the dimension of the feature space, we opted to keep FDA linear and to focus on a proper evaluation of the pseudo-inverse matrix [19].
3 Facial Detection System Using Gabor Features

In any facial detection system, the feature selection method is critical [5]: it allows the machine-learning algorithm to pinpoint the features that distinguish the object of interest from anything else, and it must therefore be chosen carefully to suit the object type.

In this context, the Gabor feature extraction technique, which consists mainly of a convolution between a bank of filters and the image from which we wish to extract features, is well suited to the face detection problem. It has also been shown that this feature extraction method resembles the human visual system [18].
Fig. 1. Typical bank of Gabor Filters used in facial detection.
The process of feature extraction using Gabor filters is straightforward. First, we generate the "bank" of filters, each of which is a Gaussian function modulated by a sine wave. The main advantage of Gabor filters is their invariance to face modifications, including rotation, scale and translation [11]. For facial detection, it is recommended to use multiple filters, hence the bank. In the literature on facial detection, it is common to combine 8 orientations and 5 different sizes, giving a bank of 40 filters, as shown in Fig. 1. To generate a single 2-D filter, both the real and the imaginary part of the Gabor filter must be evaluated with the following formulas [12]:

$$g_{\gamma,\eta,\phi,\lambda} = \exp\left(-\frac{x'^2 + \gamma^2 y'^2}{2\sigma^2}\right) \cos\left(\frac{2\pi x'}{\lambda} + \phi\right), \tag{10}$$

$$g_{\gamma,\eta,\phi,\lambda} = \exp\left(-\frac{x'^2 + \gamma^2 y'^2}{2\sigma^2}\right) \sin\left(\frac{2\pi x'}{\lambda} + \phi\right), \tag{11}$$

with the following coordinates:

$$x' = x\cos(\theta) + y\sin(\theta), \qquad y' = -x\sin(\theta) + y\cos(\theta), \tag{12}$$

where Eq. (10) gives the real part of the Gabor filter and Eq. (11) the imaginary part. Here x and y are the initial coordinates in the image, θ is the orientation parameter (between 0° and 360°), λ is the wavelength of the sine wave, ϕ is the phase offset, σ is the standard deviation of the Gaussian envelope and γ is the spatial aspect ratio. After generating the Gabor filter bank, the next step is to extract the features from a face candidate. This requires a convolution between the candidate and each filter in the bank, which produces 40 different convolved images. See Fig. 2 for clarification.
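The filter bank described above can be sketched in a few lines of Python. This is an illustrative sketch of Eqs. (10)-(12), not the Matlab implementation used in the study: the kernel size, the five wavelengths, the eight orientations spaced by π/8 and the rule `sigma = 0.56 * lam` are all assumed values chosen for the example.

```python
import math


def gabor_kernel(size, lam, theta, phi=0.0, sigma=None, gamma=1.0):
    """Real and imaginary parts of one Gabor filter, Eqs. (10)-(12)."""
    if sigma is None:
        sigma = 0.56 * lam            # assumed link between sigma and lambda
    half = size // 2                  # size is expected to be odd
    real, imag = [], []
    for y in range(-half, half + 1):
        row_re, row_im = [], []
        for x in range(-half, half + 1):
            xp = x * math.cos(theta) + y * math.sin(theta)     # x' in Eq. (12)
            yp = -x * math.sin(theta) + y * math.cos(theta)    # y' in Eq. (12)
            envelope = math.exp(-(xp ** 2 + gamma ** 2 * yp ** 2) / (2 * sigma ** 2))
            arg = 2 * math.pi * xp / lam + phi
            row_re.append(envelope * math.cos(arg))            # Eq. (10)
            row_im.append(envelope * math.sin(arg))            # Eq. (11)
        real.append(row_re)
        imag.append(row_im)
    return real, imag


def gabor_bank(size=9, n_orientations=8, wavelengths=(4.0, 6.0, 8.0, 12.0, 16.0)):
    """Bank of len(wavelengths) x n_orientations filters (40 by default)."""
    return [gabor_kernel(size, lam, k * math.pi / n_orientations)
            for lam in wavelengths
            for k in range(n_orientations)]


bank = gabor_bank()
```

Convolving a face candidate with every (real, imaginary) pair of the bank then yields the 40 responses mentioned in the text.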
Fig. 2. Convolution of image to Gabor Filters Bank.
Lastly, the bank of convolved images has to be transformed into a single vector that can be passed directly to a classifier. In our system, we rewrite the convolved images as a single vector of 19440 dimensions using the Matlab function "reshape", which turns a matrix into a vector by stacking its columns one after the other. Figure 3 [1] explains the process. Our facial detection system is summarized in Fig. 4. The Matlab implementation of the system was taken from our previous experiments [1], which allowed us to plug in both the Fisher Discriminant Analysis method and its modified version directly.
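The column-wise flattening described above can be mimicked in Python as follows. The sketch assumes that each of the 40 filter responses is a 27 × 18 matrix, which is only one shape consistent with 40 × 27 × 18 = 19440; the actual window size is not stated in this section, and the function names are hypothetical.

```python
def flatten_column_major(matrix):
    """Stack the columns of a matrix top to bottom, as Matlab's reshape does."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    return [matrix[r][c] for c in range(n_cols) for r in range(n_rows)]


def feature_vector(responses):
    """Concatenate the flattened filter responses into one feature vector."""
    vec = []
    for response in responses:
        vec.extend(flatten_column_major(response))
    return vec


# 40 placeholder responses of assumed size 27 x 18.
responses = [[[0.0] * 18 for _ in range(27)] for _ in range(40)]
features = feature_vector(responses)
```

Stacking the 130 feature vectors obtained this way as columns gives the 19440 × 130 data matrix used for training.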
Fig. 3. Feature extraction using Gabor Filters bank. FFT and IFFT stand respectively for the Matlab functions of the Fast Fourier Transform and the Inverse Fast Fourier Transform.
Fig. 4. Scheme of the facial detection system used in our study.
4 Modified Fisher Discriminant Analysis

4.1 Introduction

After an attempt to apply the facial detection system using the classical FDA method, we noticed that the performance was lacking: the facial detection did not work and no faces were detected. An example of FDA execution is presented in Fig. 5. The issue stems from the $S_w^{-1}$ matrix being inaccurate. In the following we explain the reason behind this inversion problem.

4.2 FDA Within-Class Scatter Problem
It was shown in [2] that having far fewer observations than features makes the within-class scatter matrix $S_w$ singular. Some studies suggest other techniques, including a generalized FDA [2] (GDA), which employs kernels to increase the feature dimension for better separability. While this method worked in a direct test, the heavy resources needed to evaluate the kernel at each iteration made it inefficient in our study because of its long execution times (see Table 2). At the beginning of our tests, we fed the FDA training algorithm (Algorithm 1) 70 observations of faces and 60 observations of non-faces [17] (labeled classes 1 and 0, respectively). The training required an inversion of the matrix $S_w$, for which we used the Matlab function "inv", which offers several inversion techniques. The results were unsatisfactory: the number of observations was n = 130, while the feature dimension of each observation was d = 19440, making our data matrix $X \in \mathbb{R}^{19440 \times 130}$. Since $S_w$ is then a 19440 × 19440 matrix whose rank cannot exceed n − 2 = 128, it is singular. Neither remedy on the data side is attractive: reducing the number of features gave poor results, because a smaller Gabor filter bank reduces the accuracy of feature extraction, and increasing the number of observations would be too costly and would require too much processing to bring the data matrix closer to square.
Since the problem stemmed from the matrix inversion process, we decided to use a pseudo-inverse [19] to avoid computing the inverse directly. The concept was introduced by E. H. Moore in 1920 [15] and by R. Penrose in 1955 [19]. It is considered one of the most efficient ways of generalizing the inverse of a matrix. Multiple methods of computing the pseudo-inverse exist; we focus on and use only the simplest one, based on the Singular Value Decomposition (SVD) [13].

4.3 The Moore-Penrose Pseudo-Inverse with SVD
Let $A \in \mathbb{R}^{m \times n}$ be of rank r, with $r \le \min(n, m)$. We only consider the case where the matrix A is real. Golub and Van Loan [8] summarize the generalized inverse as the unique n × m matrix B, named the pseudo-inverse, that satisfies the four Moore-Penrose conditions:

$$ABA = A, \qquad BAB = B, \tag{13}$$

$$(AB)^T = AB, \qquad (BA)^T = BA. \tag{14}$$

It was also demonstrated [8] that, given the Singular Value Decomposition $A = U \Sigma V^T$ of the matrix A, the matrix $B = V \Sigma^+ U^T$ satisfies the Moore-Penrose conditions, with:

$$\Sigma = \mathrm{diag}(\sigma_1, \sigma_2, \ldots, \sigma_r, 0, \ldots, 0), \tag{15}$$

and:

$$\Sigma^+ = \mathrm{diag}\left(\frac{1}{\sigma_1}, \frac{1}{\sigma_2}, \ldots, \frac{1}{\sigma_r}, 0, \ldots, 0\right), \tag{16}$$

which gives a clear expression of the pseudo-inverse in terms of the SVD of A. Here Σ is an m × n matrix and $\Sigma^+$ is n × m. Each $\sigma_i$ is a singular value of A, and $U \in \mathbb{R}^{m \times m}$ and $V \in \mathbb{R}^{n \times n}$ are orthogonal matrices whose columns are, respectively, the left and right singular vectors of A. This allows us to compute the pseudo-inverse from the SVD of the initial matrix. Algorithm 3 showcases the modification. In the following section, we compare the Modified Fisher Discriminant Analysis method (shortened to MFDA) with the existing FDA, a classic SVM classifier and the GDA method, in terms of speed, performance and face detection in test images. Details about the experiments are reviewed as well.
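The construction of Eqs. (15)-(16) can be checked numerically on a small singular matrix. The sketch below is a pure-Python illustration with hypothetical helper names: instead of computing an SVD, it builds A from chosen factors (two rotation matrices as U and V and the singular values 3 and 0), forms $B = V \Sigma^+ U^T$, and verifies the four Moore-Penrose conditions.

```python
import math


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def transpose(A):
    return [list(row) for row in zip(*A)]


def close(A, B, tol=1e-9):
    return all(abs(a - b) <= tol for ra, rb in zip(A, B) for a, b in zip(ra, rb))


def rotation(angle):
    """2 x 2 rotation matrix, used here as a simple orthogonal factor."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s], [s, c]]


def sigma(singular_values, m, n):
    """Eq. (15): m x n matrix with the singular values on the diagonal."""
    S = [[0.0] * n for _ in range(m)]
    for i, s in enumerate(singular_values):
        S[i][i] = s
    return S


def sigma_plus(singular_values, m, n, tol=1e-12):
    """Eq. (16): n x m matrix, reciprocals of the nonzero singular values."""
    S = [[0.0] * m for _ in range(n)]
    for i, s in enumerate(singular_values):
        if s > tol:
            S[i][i] = 1.0 / s
    return S


def penrose_conditions(A, B):
    """Return the truth value of the four conditions (13)-(14)."""
    AB, BA = matmul(A, B), matmul(B, A)
    return (close(matmul(AB, A), A), close(matmul(BA, B), B),
            close(transpose(AB), AB), close(transpose(BA), BA))


U, V = rotation(0.3), rotation(1.1)
svals = [3.0, 0.0]                                   # rank 1, so A is singular
A = matmul(matmul(U, sigma(svals, 2, 2)), transpose(V))
B = matmul(matmul(V, sigma_plus(svals, 2, 2)), transpose(U))
det_A = A[0][0] * A[1][1] - A[0][1] * A[1][0]
```

Here `det_A` is zero up to rounding, so A has no ordinary inverse, yet B still satisfies all four conditions. This is the situation of the singular within-class scatter matrix.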
5 Experiments and Results

5.1 Test Tools and Objectives

Each algorithm requires its own specific parameters. Table 1 summarizes the parameters used for each algorithm in the facial detection system.
Algorithm 3. Modified Fisher Discriminant Analysis Training.
Require: A dataset of the form X ∈ R^(d×n), separated into two subsets X1 and X2 with their respective labels Y1 and Y2.
  Evaluate the means μ1 and μ2.
  Evaluate the variances Σ1 and Σ2.
  Sw ← Σ1 + Σ2.
  Evaluate the SVD of Sw: Sw ← U Σ V^T.
  Evaluate Σ+ using Σ.
  Evaluate the pseudo-inverse matrix Sw+: Sw+ ← V Σ+ U^T.
  Evaluate the optimal projection vector w using the pseudo-inverse: w ← Sw+ (μ1 − μ2).
Return w.
The implementation of the facial detection system from [1] includes the Gabor feature extraction; the original training images come from [17]. We focus on comparing speed during the test phase and face detection in an image. In this study, we do not focus on training time as is often the case in many studies of this type. For testing, we use Matlab R2021a, including the Matlab Optimization Toolbox for SVM classification. The test machine is equipped with a Ryzen 5 5600X processor and 32 GB of RAM. The testing images were also provided with the Matlab implementation [17].

Table 1. Parameters of the algorithms used in testing, where RBF stands for radial basis function kernel and SMO stands for the Sequential Minimal Optimization method of SVM.

Algorithm    Related parameters
FDA          θ = 2.4476e+15
GDA          θ = −0.02, Kernel: RBF
SVM          C = 24, Method: SMO
MFDA         θ = 0.0296
The following functions were used for the different classifiers:

– GDA.m: the implementation of the GDA method in Matlab, provided by the MATLAB Exchange Center and obtainable at [14].
– fitcsvm.m: this function resides inside the MATLAB Optimization toolbox and serves to train the SVM classifier using the parameters described earlier.
– MFDAT.m: our implementation of MFDA training in Matlab, which relies on Algorithm 3 to train the modified FDA.
– FDAC.m: our implementation of FDA classification from Algorithm 2.
– FDAT.m: our implementation of FDA training using the standard FDA approach from Algorithm 1.
– pinv.m: function included in MATLAB that computes the pseudo-inverse using SVD.
5.2 Results
The facial detection system was evaluated for speed and detection performance on two practical examples. CPU time was measured as the average over 5 executions, and the timing results are given in Table 2. The detection results are summarized in Fig. 5 for example 1 and in Fig. 6 for example 2.

Table 2. CPU time of each algorithm for the test images in Fig. 5 (Example 1) and Fig. 6 (Example 2).

Algorithm    Total time for detection (Example 1) in seconds    Total time for detection (Example 2) in seconds
FDA          38.25                                              25.46
GDA          135.65                                             86.67
SVM          53.74                                              26.83
MFDA         37.24                                              24.87
Fig. 5. Example of facial detection system using different classifiers with parameters of Table 1 (Example 1).
Fig. 6. Example of facial detection system using different classifiers with parameters of Table 1 (Example 2).
6 Conclusion

In this paper, we proposed a new way of addressing the SSS under-sampling problem in FDA, as an alternative to the conventional approaches of increasing dimensionality or kernelization. The pseudo-inverse computed via SVD proved highly efficient in solving that problem. MFDA gave positive results when compared to other classification methods and was successfully applied in a facial detection system, where it achieved the shortest detection times among the tested classifiers (Table 2) together with good precision.
References

1. Amir, A., El Amine, K.M.: A deeper Newton descent direction with generalised Hessian matrix for SVMs: an application to face detection. Int. J. Math. Model. Numer. Optim. 11(2), 196–208 (2021)
2. Baudat, G., Anouar, F.: Generalized discriminant analysis using a kernel approach. Neural Comput. 12(10), 2385–404 (2000). https://doi.org/10.1162/089976600300014980
3. Bhattacharyya, S.K., Rahul, K.: Face recognition by linear discriminant analysis. Int. J. Commun. Netw. Secur. 2(3), 1 (2014). https://doi.org/10.47893/IJCNS.2014.1087
4. Chaabane, S.B., Hijji, M., Harrabi, R., Seddik, H.: Face recognition based on statistical features and SVM classifier. Multimed. Tools Appl. 81(6), 8767–8784 (2022). https://doi.org/10.1007/s11042-021-11816-w
5. Ding, S., Zhu, H., Jia, W., et al.: A survey on feature extraction for pattern recognition. Artif. Intell. Rev. 37, 169–180 (2012). https://doi.org/10.1007/s10462-011-9225-y
6. Fisher, R.A.: The use of multiple measurements in taxonomic problems. Ann. Eugen. 7(2), 179–188 (1936). https://doi.org/10.1111/j.1469-1809.1936.tb02137.x
7. Chelali, F.Z., Djeradi, A., Djeradi, R.: Linear discriminant analysis for face recognition. In: International Conference on Multimedia Computing and Systems 2009, pp. 1–10 (2009). https://doi.org/10.1109/MMCS.2009.5256630
8. Golub, G.H., Van Loan, C.F.: Matrix Computations, 3rd edn., pp. 257–258. Johns Hopkins, Baltimore (1996)
9. Hegde, G.P., Seetha, M., Hegde, N.: Study of singularity problems in discriminative based subspace of facial expression class features. Int. J. Trend Res. Dev. 3(5), 96–98 (2016)
10. Rowley, H.A., Baluja, S., Kanade, T.: Neural network-based face detection. IEEE Trans. Pattern Anal. Mach. Intell. 20(1), 23–38 (1998). https://doi.org/10.1109/34.655647
11. Kamarainen, J.K., Kyrki, V., Kalviainen, H.: Invariance properties of Gabor filter-based features-overview and applications. IEEE Trans. Image Process. 15(5), 1088–1099 (2006). https://doi.org/10.1109/TIP.2005.864174
12. Henriksen, J.J.: 3D surface tracking and approximation using Gabor filters. South Denmark University, 28 March 2007 (2007)
13. Linear Systems and Pseudo-Inverse. http://websites.uwlax.edu/twill/svd/systems/index.html
14. Haghighat, M.: Dimensionality reduction using generalized discriminant analysis (GDA) (2022). GitHub. https://github.com/mhaghighat/gda. Accessed 5 June 2022
15. Moore, E.H.: On the reciprocal of the general algebraic matrix. Bull. Am. Math. Soc. 26(9), 394–95 (1920). https://doi.org/10.1090/S0002-9904-1920-03322-7
16. Nocedal, J., Wright, S.J.: Numerical Optimization. Springer Series in Operations Research, Springer, New York (1999)
17. Sakhi, O.: Face detection using support vector machine (SVM) (2022). MATLAB Central File Exchange. https://www.mathworks.com/matlabcentral/fileexchange/29834-face-detection-using-support-vector-machine-svm. Accessed 5 May 2022
18. Olshausen, B.A., Field, D.J.: Emergence of simple-cell receptive-field properties by learning a sparse code for natural images. Nature 381(6583), 607–609 (1996). https://doi.org/10.1038/381607a0
19. Penrose, R.: A generalized inverse for matrices. Math. Proc. Cambridge Philos. Soc. 51(3), 406–413 (1955). https://doi.org/10.1017/S0305004100030401
20. Abhishree, T.M., Latha, J., Manikantan, K., Ramachandran, S.: Face recognition using Gabor filter based feature extraction with anisotropic diffusion as a preprocessing technique. Procedia Comput. Sci. 45, 312–321 (2015)
21. Thanh Do, T., Hoang Le, T.: Facial feature extraction using geometric feature and independent component analysis. In: Richards, D., Kang, B.-H. (eds.) PKAW 2008. LNCS (LNAI), vol. 5465, pp. 231–241. Springer, Heidelberg (2009). https://doi.org/10.1007/978-3-642-01715-5_20
22. Li, W., Ruan, Q.: Generalized discriminant analysis model and its extension for facial expression recognition. In: 2014 12th International Conference on Signal Processing (ICSP), pp. 790–795 (2014). https://doi.org/10.1109/ICOSP.2014.7015112
Communities Detection in Epidemiology: Evolutionary Algorithms Based Approaches Visualization

Mostefa Mokaddem1, Ilhem Idris Khodja2, Hamza Amar Setti2, Baghdad Atmani1, and Chihab Eddine Mokaddem3

1 Laboratory of Informatics of Oran, University of Oran 1 Ahmed Benbella, 31000 Oran, Algeria, [email protected]
2 Computer Science Department, University of Oran 1 Ahmed Benbella, 31000 Oran, Algeria, [email protected]
3 Laboratory of GEODES at DIRO, University of Montreal, Montreal, Canada, [email protected]
Abstract. Complex networks are large-scale networks with complicated topologies that have attracted the attention of research scientists. Many systems can be represented as complex networks, which makes them a fundamental tool for studying the structural properties of a large number of dynamic systems. These networks are modeled by graphs that can be visualized dynamically to show their changes over time. The visualization of these networks has become increasingly important in many fields, because identifying and understanding these changes are complex tasks in high demand. In this article, we focus on the dynamic graph visualization of community detection in epidemiological diseases. Many studies have been carried out to analyze the spread of epidemics and how to proceed to the immunization phase, including a work that relies on a temporal RDF graph and applies a genetic algorithm and a bee colony algorithm as an immunization strategy for a community detection problem. Dynamic graph visualization addresses the challenge of representing the evolution of relationships between entities in readable, scalable, and effective diagrams, which allows us to extract the characteristics of the different nodes and edges and to understand the dynamics that arise when such events occur in a population.

Keywords: Temporal networks · RDF graph · Dynamic graph visualization · Epidemic modeling · Evolutionary algorithms
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023. S. Chikhi et al. (Eds.): MISC 2022, LNNS 593, pp. 319–332, 2023. https://doi.org/10.1007/978-3-031-18516-8_23

1 Introduction

Many complex systems across the sciences can be modeled as networks of vertices joined in pairs by edges. Examples include the internet and the world-wide web, biological networks, and many others [1]. These complex systems are found to be naturally partitioned into multiple modules or communities. In the network representation, these modules are usually described as groups of densely connected nodes with sparse connections to the nodes of other groups. When a node can belong to only one community, the community structure is said to be non-overlapping, while in overlapping communities a node can belong to multiple communities [2].

In healthcare, the dynamics of infectious diseases that spread via direct person-to-person transmission depend on the underlying contact network. Human contact networks exhibit strong community structure. Changes in such structures need to be visualized to help understand the effect of certain properties on the evolution of the network. This is the focus of this paper: we visualize the changes of an epidemiological network while the disease propagates from one individual to another, and afterwards during the immunization of those individuals. These two phases are complicated enough that we need to visualize them to show clearly and understand how certain properties affect the individuals during the dynamic processes.

The rest of the paper is structured as follows: Sect. 2 presents the related work. Section 3 describes our proposed approach. Section 4 consists of the conclusion and future works.
2 Related Work

Dynamic visualization is a discipline based on representing the changes and the evolution of the entities that make a network grow. The many dimensions of dynamic graph data, including the time dimension, make these graphs difficult to visualize. This type of visualization can be done in two ways [3]: with a node-link representation or with a matrix representation. In the following, we concentrate on node-link representations.

The visualization of dynamic node-link graphs can rely on two techniques for showing the evolution of graphs through time: a time-to-space mapping based on multiple small static graphs that follow a chronological order, and a time-to-time mapping based on animated graphics. The two have been compared for several years in order to determine which one suits each problem. The authors of [4] compare an animated cursor solution to an approach with small visualizations of time series in nodes. They observe better participant performance with the animated approach when only one or two time points need to be studied for tasks with multiple time steps. The authors of [5] compare an animated node-link diagram to a static approach that shows node-link diagrams in a grid (timeline) based on the same node layout; for time-related tasks, the static approach tends to offer better performance in terms of error rates and response time. The authors of [6] also report generally faster response times for chronological conditions, although for some tasks related to entity appearance the animation produces lower error rates. The authors of [7] show, in addition, that animation tends to reveal more findings on adjacent time steps, while small multiples promote the discovery of patterns that last longer.
Communities Detection in Epidemiology
To conclude from these studies, chronological (timeline-based) approaches appear preferable for tasks with more than two time steps. As [8] shows, hybrid approaches that combine animation and chronology can, under certain conditions, produce better results. Several models have been proposed to emulate different kinds of epidemic spreading [9], for example the susceptible-infectious-recovered (SIR) and susceptible-infectious-susceptible (SIS) dynamics. The authors of [10] proposed a model for the COVID-19 pandemic that considers social distance, the duration of contact with the patient, and individual information such as age, gender, nationality and the presence of chronic diseases. The authors of [11] focused on an SEIR meta-population model on a network in order to characterize the epidemic dynamics and to predict possible contagion scenarios. The authors of [12] combine the strengths of three different approaches (compartmental models, complicated agent-based models and time series) by including stochasticity, contact heterogeneity and even individual characteristics, in order to study only the final state of epidemics. Finally, the solution in [13] addresses community detection in epidemiology: it uses a temporal network in the form of an RDF graph and applies a genetic algorithm (GA) and an artificial bee colony algorithm (ABC) together for the propagation and immunization strategies. Modeling and simulation were used in [14, 15] to show the performance of SIR and temporal networks. Temporal networks are a useful framework to represent and analyze time-dependent changes and the underlying dynamics of complex systems. Many phenomena, ranging from disease spread and human communication to financial transactions and the human brain, can generate large-scale temporal network data. This type of network can be represented by an RDF graph. RDF stands for “Resource Description Framework”; it is a graph-based data exchange standard used to represent highly interconnected data.
Each RDF statement is a three-part structure made up of resources, where each resource is identified by a URI. Representing data in RDF allows information to be easily identified, disambiguated and interconnected by AI systems. RDF allows us to make statements about resources, and the format of these statements is simple: a statement always has the structure subject, predicate, object, and it expresses a relationship between two resources. The subject and the object represent the two related resources; the predicate represents the nature of their relationship. The relationship is formulated in a directional way (from subject to object) and is referred to in RDF as a property. Since RDF statements consist of three elements, they are called triples. To apply the propagation and immunization strategies to the population of this RDF graph, [13] used the GA and ABC algorithms. The genetic algorithm is one of the most widely used optimization algorithms for community detection. It simulates the evolution of a population until a stop criterion is met. The evaluation is performed with a fitness function associated with each individual, which reflects the quality of its solution and guides the genetic operations that create a new generation of individuals. A new generation is then derived from the current one using the three genetic operators: selection, crossover and mutation. The new generation goes through the same evaluation, and the process is repeated until the termination condition is satisfied.
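As an illustration of the triple structure described in this section, the following minimal sketch stores a few statements as (subject, predicate, object) tuples in plain Python. The namespace `http://example.org/epi#`, the predicate names `hasState` and `hadContactWith`, and the individual identifiers are hypothetical; they are not taken from [13].

```python
# Hypothetical namespace; the vocabulary used in [13] is not given in the text.
EX = "http://example.org/epi#"

# Each statement is a (subject, predicate, object) triple.
triples = [
    (EX + "Individual_1", EX + "hasState", "I"),
    (EX + "Individual_2", EX + "hasState", "S"),
    (EX + "Individual_1", EX + "hadContactWith", EX + "Individual_2"),
]

def objects(subject, predicate):
    """Return every object linked to `subject` through `predicate`."""
    return [o for s, p, o in triples if s == subject and p == predicate]

contacts = objects(EX + "Individual_1", EX + "hadContactWith")
print(contacts)
```

The directional reading (from subject to object) is visible in the last triple: the contact is stated from Individual_1 towards Individual_2 only.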
M. Mokaddem et al.
In the genetic algorithm (denoted GP in the following) [16–18], each solution has its own chromosome. A chromosome is a set of parameters (characteristics) that defines the individual; each chromosome has a set of genes, and each gene is represented by a string. The steps of the genetic algorithm can be summarized as follows. The first step is to initialize the population, which consists of choosing the number of individuals. The second is the evaluation through the fitness function. The third is the selection, which consists of selecting the infected individuals and then, for each infected individual, searching for and selecting the susceptible individuals with whom he has had contact. The fourth step is the crossover, applied between the chromosome of the infected individual and the chromosome of the susceptible individual chosen by the selection operator, by exchanging parts of their genes. The fifth is the mutation, which is applied in both the propagation and immunization phases by changing the state of individuals from susceptible to infected and from infected to recovered. The final two steps are the replacement, in which the old population is replaced by the new solutions, and the termination, which occurs when the termination condition is satisfied. The second algorithm is ABC. It is also one of the optimization algorithms used for community detection, and it simulates the intelligent behavior of real bees. The artificial bee colony consists of three groups of bees: workers (employed bees), onlookers and scouts. The algorithm can be summarized in four steps. The first is the initialization and generation of solutions. In this step all the necessary parameters are defined, such as the number of bees in the hive, the number of iterations to improve the solution, the total number of cycles of the algorithm, the number of employed bees, onlookers and scouts, and the maximum number of communities. The second step is the workers' movement, in which the employed bees calculate the quality of their solutions.
The worker bees therefore look for infected and susceptible individuals around the solutions held in their memory and assess their quality. The third step is selection and improvement through the onlookers' movement: the onlooker bees select individuals from those found by the worker bees according to their fitness. Each onlooker bee modifies the chosen solution, checks its quality, and then remembers the better solution. The fourth step is the scouts' movement: scout bees are recruited from a few worker bees that abandon their solutions and determine a potential new individual solution to replace the abandoned one.
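The selection and mutation steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the contact list, the immunity values and the rule that a susceptible contact becomes infected when a random draw exceeds its immunity degree are hypothetical and are not taken from [13], and the crossover step is omitted.

```python
import random

def propagate(states, contacts, immunity, rng):
    """One propagation pass: infected individuals may infect susceptible contacts."""
    new_states = dict(states)
    infected = [i for i, s in states.items() if s == "I"]   # selection step
    for i in infected:
        for j in contacts.get(i, []):
            # Mutation S -> I; the threshold rule is an assumption of this sketch.
            if states[j] == "S" and rng.random() > immunity[j]:
                new_states[j] = "I"
    return new_states

def immunize(states):
    """Immunization pass: mutation I -> R for every infected individual."""
    return {i: ("R" if s == "I" else s) for i, s in states.items()}

# Hypothetical population: one infected, two susceptible, one recovered.
states = {1: "I", 2: "S", 3: "S", 4: "R"}
contacts = {1: [2, 3, 4]}
immunity = {1: 0.2, 2: 0.0, 3: 1.0, 4: 0.7}

after = propagate(states, contacts, immunity, random.Random(0))
print(after, immunize(after))
```

With these illustrative values, the contact with immunity 0.0 is infected and the contact with immunity 1.0 is not, since the random draw lies in [0, 1).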
The solution of [13], based on a temporal RDF graph and the GP and ABC algorithms, follows the architecture shown in Fig. 1. As specified in that paper, the authors used a NoSQL store deployed on virtual machines, in which the RDF graphs are queried with SPARQL using the Jena API. Interactions with the application are managed through a REST API served by a Flask RESTful web server. A gateway is used as a channel to establish communication between the Flask server (Python implementation) and the store (Java implementation), since the authors found no way to access the Oracle NoSQL store directly from a Python implementation. Finally, the communication between the client and the Flask server is done through HTTP requests.
Fig. 1. The architecture of the application.
As a sample, an initial population of 15 individuals was used (Fig. 3), containing 3 infected individuals shown in red, 11 susceptible individuals shown in purple and 1 recovered individual shown in green (Fig. 2). Figure 4 shows the invocation of an individual on the graph using his index within an HTTP link and the Flask REST API. Finally, Figs. 5 and 6 show, respectively, the locations of a contact and the details of his visit to a given location. The authors state that propagation with GP and with ABC gives the same results in terms of epidemic propagation, but not in terms of execution speed. Figure 7 shows the population after propagation with both GP and ABC. Specifically, when applying the GP algorithm in the propagation phase, they calculated the fitness of all individuals and selected the infected ones. For each infected individual they launched the propagation, extracted the susceptible contacts to be infected, and applied crossover and/or mutation between the infected individual and the susceptible one. Figure 8 shows the calculated fitness result.
Fig. 2. Temporal network in RDF graph of the initial population
In the ABC propagation phase, the worker bees calculated the fitness of each individual (the number of worker bees is set according to the number of individuals), and the propagation was launched for each infected individual. The susceptible contacts to be infected were then extracted, and the onlooker bees retrieved all the contacts at a position (the number of onlooker bees is set according to the number of positions of the infected individual). Figures 9 and 10 below show the calculated fitness result and the infection result between an infected person (Individual_31000005) and his susceptible contacts.
Fig. 3. Generation of the initial population
Fig. 4. Information of an individual
Fig. 5. The details of an individual’s visit to a given location.
In the GP/ABC immunization phase, Figs. 11 and 12 show, respectively, the immunization results after a single propagation iteration with GP and with ABC.
Fig. 6. Information of a location visited by an individual
Fig. 7. Temporal network in RDF graph of the population after propagation.
3 The Visualization of the Epidemiological Dynamic Network
The visualization of the epidemiological dynamic network is based on the results obtained by applying the genetic algorithm (GP) and the bee colony algorithm (ABC), which are detailed in [13], and on the fitness function, in particular on the “immunity degree”, which we emphasize as the characteristic driving our visualization.
Index | State | Immunity degree
Individual 310000002 | S | 0.7
Individual 310000008 | S | 0.7
Individual 310000015 | S | 0.7
Individual 310000003 | S | 0.7
Individual 310000006 | S | 0.7
Individual 310000009 | S | 0.7
Individual 310000001 | I | 0.2
Individual 310000004 | S | 0.7
Individual 310000007 | S | 0.7
Individual 310000014 | I | 0.4
Individual 310000012 | S | 0.7
Individual 310000005 | I | 0.4
Fig. 8. Result of the fitness function calculated for all individuals.
Index | State | Immunity degree
Individual 310000002 | S | 0.7
Individual 310000008 | S | 0.7
Individual 310000015 | S | 0.7
Individual 310000003 | S | 0.7
Individual 310000006 | S | 0.7
Individual 310000009 | S | 0.7
Individual 310000001 | I | 0.2
Individual 310000004 | S | 0.7
Individual 310000007 | S | 0.7
Individual 310000014 | I | 0.4
Individual 310000012 | S | 0.7
Individual 310000005 | I | 0.4
Fig. 9. Result of the fitness function.
The propagation and immunization phases are both applied at the level of the two algorithms; Figs. 13 and 14 explain their pseudo-code and Fig. 15 the fitness function.
Fig. 10. Result of infection between an infected and his susceptible contacts.
Fig. 11. Immunization results GP.
Here, P(t) denotes the population of candidate solutions for a given problem at iteration t. Our proposed solution therefore concerns the visualization of the propagation and immunization phases, taking into account the immunity degree of individuals, which has already been calculated with the respective fitness function. We first visualize the graph of the 15 individuals: infected individuals are shown in red, susceptible individuals in orange and recovered individuals in green, as illustrated in Fig. 16. The visualization was done in Python, using libraries dedicated to the representation of graphs and their interactions. We used NetworkX and Matplotlib to visualize our graph as a first attempt.
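A minimal sketch of such a visualization is given below. It assumes that NetworkX and Matplotlib are installed; the helper names `node_colors` and `draw_population`, the sample states and the contact edges are hypothetical, and only the color convention (red for infected, orange for susceptible, green for recovered) comes from the text.

```python
# Color convention from the text: infected = red, susceptible = orange, recovered = green.
STATE_COLORS = {"I": "red", "S": "orange", "R": "green"}

def node_colors(states, nodes):
    """Return one color per node, in the order of `nodes`."""
    return [STATE_COLORS[states[n]] for n in nodes]

def draw_population(states, edges, path="population.png"):
    """Draw the contact graph with NetworkX and Matplotlib and save it to `path`."""
    import matplotlib
    matplotlib.use("Agg")                    # render without a display
    import matplotlib.pyplot as plt
    import networkx as nx

    graph = nx.Graph()
    graph.add_nodes_from(states)             # one node per individual
    graph.add_edges_from(edges)              # one edge per contact
    pos = nx.spring_layout(graph, seed=1)    # fixed seed gives a stable layout
    nx.draw(graph, pos, node_color=node_colors(states, list(graph.nodes)),
            with_labels=True)
    plt.savefig(path)
    plt.close()

# Hypothetical sample: three individuals and two contacts.
states = {"Individual_1": "I", "Individual_2": "S", "Individual_3": "R"}
edges = [("Individual_1", "Individual_2"), ("Individual_2", "Individual_3")]
print(node_colors(states, list(states)))
```

Calling `draw_population(states, edges)` writes the figure to disk; the color mapping is kept in a separate function so that it can be checked without any plotting library.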
Fig. 12. Immunization results ABC.
Fig. 13. The GP algorithm
Algorithm 02: Bee colony algorithm
1. Initialization.
2. While the stop criterion is not met:
   Employed bees: send the employed bees to the food sources and update each solution.
   Onlooker bees: make a selection based on the fitness function and update each solution.
   Scout bees: identify the most inactive solution and replace it with a new randomly generated solution.
Fig. 14. The ABC algorithm
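The loop above can be sketched as a generic artificial bee colony routine. The version below is a minimal illustration on a toy numerical problem, not the community-detection variant of [13]: the function name `abc_minimize`, the parameters `n_sources`, `limit` and `cycles`, the neighbor rule and the sphere objective are all assumptions made for this example.

```python
import random

def abc_minimize(f, dim, bounds, n_sources=10, limit=5, cycles=50, seed=0):
    """Minimal artificial bee colony loop minimizing f (f >= 0) over a box."""
    rng = random.Random(seed)
    lo, hi = bounds
    new_source = lambda: [rng.uniform(lo, hi) for _ in range(dim)]
    sources = [new_source() for _ in range(n_sources)]    # initialization
    trials = [0] * n_sources       # consecutive failed improvements per source
    best = min(sources, key=f)

    def try_neighbor(i):
        # Perturb one coordinate of source i relative to another source k.
        k = rng.choice([j for j in range(n_sources) if j != i])
        d = rng.randrange(dim)
        cand = list(sources[i])
        cand[d] += rng.uniform(-1, 1) * (sources[i][d] - sources[k][d])
        cand[d] = min(hi, max(lo, cand[d]))
        if f(cand) < f(sources[i]):                       # greedy replacement
            sources[i], trials[i] = cand, 0
        else:
            trials[i] += 1

    for _ in range(cycles):
        for i in range(n_sources):                        # employed bees
            try_neighbor(i)
        weights = [1.0 / (1.0 + f(s)) for s in sources]   # fitness of each source
        for _ in range(n_sources):                        # onlooker bees
            try_neighbor(rng.choices(range(n_sources), weights=weights)[0])
        worst = max(range(n_sources), key=lambda i: trials[i])
        if trials[worst] > limit:                         # scout bee
            sources[worst], trials[worst] = new_source(), 0
        best = min(sources + [best], key=f)               # remember best so far
    return best

sphere = lambda x: sum(v * v for v in x)
best = abc_minimize(sphere, dim=2, bounds=(-5.0, 5.0))
print(best, sphere(best))
```

The scout step mirrors the "most inactive solution" rule: the source with the most consecutive failed improvements is replaced once it exceeds `limit`.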
Fig. 15. The fitness function
We focused mainly on the infection of individuals according to their immunity degree. The individuals who were infected generally have degrees in the ranges 0–0.3 and 0.6–0.9. Figures 17 and 18 show the infected individuals within the immunity ranges 0.6–0.9 and 0–0.3, respectively, highlighted in blue.
Fig. 16. Visualization of the fitness function results.
From the visualization of the graphs according to these degrees, it can be seen that even individuals with a high immunity degree, in the range 0.6–0.9, were rapidly infected. These individuals share the characteristic of having been in the same location within a short period, which explains why they were infected so quickly.
Fig. 17. Propagation of individuals with 0.6–0.9 immunity degree.
Fig. 18. Propagation of individuals with 0–0.3 immunity degree.
4 Conclusion
In this work, graph-based visualization was used to represent data graphically in order to identify trends and correlations that cannot easily be perceived in raw or textual data. We focused on the visualization of dynamic complex networks according to the characteristics of their nodes in epidemic diseases, taking into account their immunity degree during the spread of the disease. The visualization allowed us to show the impact of this degree on the evolution of the network dynamics. As future work, an extended version of this paper will consider more individual characteristics and map-based visualization. We are also looking into further visualization technologies to represent the dynamics of the graph.
References
1. Stegehuis, C., Van Der Hofstad, R., Van Leeuwaarden, J.: Epidemic spreading on complex networks with community structures. Sci. Rep. 6, 1–7 (2016)
2. Cherifi, H., Palla, G., Szymanski, B.K., Lu, X.: On community structure in complex networks: challenges and opportunities. Appl. Netw. Sci. 4, 1–13 (2019). https://doi.org/10.1007/s41109-019-0238-9
3. Burch, M., Ten Brinke, K.B., Castella, A., Peters, G.K.S., Shteriyanov, V., Vlasvinkel, R.: Dynamic graph exploration by interactively linked node-link diagrams and matrix visualizations. Vis. Comput. Ind. Biomed. Art 4(1), 1–14 (2021)
4. Saraiya, P., Lee, P., North, C.: Visualization of graphs with associated timeseries data (2005)
5. Farrugia, M., Quigley, A.: Effective temporal graph layout: a comparative study of animation versus static display methods. Inf. Vis. 10(1), 47–64 (2011)
6. Archambault, D., Purchase, H.C., Pinaud, B.: Animation, small multiples, and the effect of mental map preservation in dynamic graphs (2011)
7. Boyandin, I., Bertini, E., Lalanne, D.: A qualitative study on the exploration of temporal changes in flow maps with animation and small-multiples (2012)
8. Rufiange, S., McGuffin, M.J.: DiffAni: visualizing dynamic graphs with a hybrid of difference maps and animation. IEEE Trans. Vis. Comput. Graph. 19(12), 2556–2565 (2013)
9. Wang, S., Gong, M., Liu, W., Wu, Y.: Preventing epidemic spreading in networks by community detection and memetic algorithm (2020)
10. Alguliyev, R., Aliguliyev, R., Yusifov, F.: Graph modelling for tracking the COVID-19 pandemic spread. Infect. Dis. Model. 6, 112–122 (2021)
11. Aletti, G., Benfenati, A., Naldi, G.: Graph, spectra, control and epidemics: an example with a SEIR model. Mathematics 9, 2987 (2021)
12. Allen, A.J., Boudreau, M.C., Roberts, N.J., Allard, A., Hébert-Dufresne, L.: Predicting the diversity of early epidemic spread on networks. Phys. Rev. Res. 4(1), 013123 (2020)
13. Mokaddem, M., Atmani, B., Setti, H.A., Ali, T.: Data mining tools for community detection in epidemiology. In: International Conference on Computer and Information Sciences (2021 ICCIS) (2021)
14. Mokaddem, M., Atmani, B., Boularas, A., Mokaddem, C.E.: DEVSServer: an ambient intelligence and DEVS modelling based simulation server for epidemic modelling. Int. J. Simul. Epidemic Model. 16(6), 557–581 (2018). http://www.inderscience.com/jhome.php?jcode=ijspm
15. Mokaddem, M., Atmani, B., Boularas, A.: DEVSServer: an ambient intelligence and DEVS based modeling and simulation server. In: 2016 Spring Simulation Multi-conference (SpringSim 2016), Society for Computer Simulation International San Diego, CA, USA, 2016, Pasadena, California, USA (2016)
16. Mokaddem, M., Atmani, B., Setti, H.A., Tobal, A.: Data mining tools for community detection in epidemiology. In: International Conference on Computer and Information Sciences (2021 ICCIS) (2021)
17. Mokeddem, S., Atmani, B., Mokaddem, M.: A New Approach for Coronary Artery Diseases Diagnosis Based on Genetic Algorithm, Revue of Cardiothoracic Critical Care: Breakthroughs in Research and Practice. Hershey, PA: IGI Global (2019). https://doi.org/10.4018/978-1-5225-8185-7
18. Mokeddem, S., Atmani, B., Mokaddem, M.: An effective feature selection approach driven genetic algorithm wrapped Bayes naïve. Int. J. Data Anal. Tech. Strat. 8, 220–243 (2016). http://www.inderscience.com/jhome.php?jcode=ijdats
Cardiovascular Diseases Prediction Based on Dense-DNN and Feature Selection Techniques
Abderzak Manaa, Farida Brahimi, Zahira Chouiref(B), Mohamed Kessouri, and Mourad Amad
LIMPAF Laboratory, Department of Computer Sciences, Faculty of Sciences and Applied Sciences, University of Bouira, Bouira, Algeria
[email protected]
Abstract. Cardiovascular Diseases (CVDs) are a group of disorders affecting the heart and blood vessels. They have been considered in recent years as one of the main causes of death in the world. Patients with heart disease do not feel sick until the very last stage of the disease, and most heart patients die before receiving any treatment. Machine Learning and Deep Learning techniques play an important role in the early prediction of heart disease, improving the quality of healthcare and helping individuals avoid health complications such as coronary artery infection and decreased function of blood vessels. Nowadays, the field of health care produces a large amount of data, and efficient techniques for processing this data have become necessary. In this paper, a model for cardiovascular disease prediction based on Dense Deep Neural Networks (Dense-DNN) is developed, and attribute selection is performed via a Genetic Algorithm (GA). The GA is used to identify the best subset of attributes among all the features in the dataset, in order to improve the performance and reduce the training time of the classification model. Our prediction model is compared to several traditional Machine Learning techniques. The performance of our system has been evaluated based on six parameters: (1) accuracy, (2) sensitivity, (3) specificity, (4) F-measure, (5) RMSE, and (6) MAE. Experimental results show that our proposed model outperforms state-of-the-art methods in terms of these evaluation metrics. The achieved accuracy of the proposed model is 91.7% without feature selection and 95% with feature selection.
Keywords: Cardiovascular diseases · Classification model · Deep neural networks · Feature selection · Genetic algorithm
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 S. Chikhi et al. (Eds.): MISC 2022, LNNS 593, pp. 333–347, 2023. https://doi.org/10.1007/978-3-031-18516-8_24
1 Introduction
Cardiovascular diseases are a group of disorders affecting the heart and blood vessels. They are considered one of the main causes of death worldwide. Today, these diseases affect several million people, and the number continues to increase each year. According to a study published by the WHO in June 2021, an estimated 17.9 million people died from CVDs in 2019, representing 32% of all global deaths. Of these deaths, 85% were due to heart attack and stroke [1]. The mortality rate from cardiovascular diseases in Algeria is estimated at 34% per year according to figures from the National Institute of Public Health in Algeria (NIPH) [2]. The main risk factors for heart disease are poor diet, lack of physical activity, smoking, harmful use of alcohol, high blood pressure, diabetes and obesity. People at high risk of CVDs require early detection and management, including psychological support and medication as needed. Given the lack of financial and human resources in African regions, and particularly in Algeria, these diseases have become a real public health problem. Many complications, such as coronary artery infection and decreased blood vessel function, arise if heart disease remains unrecognized and untreated. Machine Learning (ML) and Deep Learning (DL) techniques play an important role in the early prediction of heart disease, improving the quality of healthcare and helping individuals avoid dangerous health situations. An ML technique analyzes a set of data in order to derive rules that allow conclusions to be drawn about new data. DL is an ML technique capable of analyzing structured and unstructured data such as images, videos and text. DL has made remarkable progress in recent years, in particular due to the increase in computing power and the development of large databases (big data). DL is based on artificial neural networks made up of tens or even hundreds of layers of neurons, each of which performs small and simple operations. The results of a first layer of neurons serve as inputs for the calculations of a second layer, and so on. The performance of a learning algorithm strongly depends on the quality of the attributes used in the learning phase. The presence of redundant or irrelevant attributes reduces the performance of the algorithm; to solve this problem, we use feature selection.
Feature selection is a data pre-processing step that consists of selecting the most relevant attributes from the set of variables describing the phenomenon under study [3]. The main motivations for feature selection are [4, 5]:
• Eliminating the attributes that are a source of noise allows a better understanding of the studied phenomenon.
• Learning and execution time are reduced.
• The best selected attributes allow a better generalization of the data by avoiding overfitting.
The objective of this work is to select a subset of the best attributes to accurately predict cardiovascular disease using Dense-DNN and supervised ML algorithms such as Support Vector Machine (SVM) [16], Random Forest (RF) [17], K-Nearest-Neighbor (KNN) [18], Gaussian Naïve Bayes (GNB) [16], Logistic Regression (LR) [17], eXtreme Gradient Boosting (XGBoost) [19] and Decision Tree (DT) [17]. The remainder of this paper proceeds as follows: Sect. 2 presents studies related to cardiovascular disease prediction; Sect. 3 describes the proposed approach; the experimental setup and results are presented in Sect. 4; Sect. 5 gives the conclusions and discusses future work.
2 Related Work
This section provides an overview of research work already done in the prediction of heart disease using ML techniques.
The study in [6] aims to develop an ML-based model to detect heart disease. The algorithms used are KNN, RF, DT, SVM and Naïve Bayes (NB). KNN showed its effectiveness in detecting heart disease. The authors developed a prototype, consisting of a set of sensors that monitor a person's state of health, to validate the results. The proposed model achieves an accuracy of 88.52%. S. Mohan et al. [7] proposed a novel system with a hybrid RF based on a linear model. This study applies different combinations of features with many classification approaches, and the proposed approach reaches an accuracy of 88.7%. In Tulasi et al. [8], the authors evaluate the performance of traditional approaches such as LR, KNN, NB, SVM and Neural Networks (NN) against Convolutional Neural Networks (CNN), the proposed prediction model. The Cleveland dataset from the UCI Machine Learning Repository is cleaned and then split into 80% for training and 20% for testing. The proposed CNN predicts whether a patient has cardiac disease or not with an accuracy of 94%. D. Dahiwade et al. [9] proposed a heart disease prediction system based on CNN and KNN. The results show that CNN performs better than KNN, with an accuracy of 84.5%, and that the time required for classification is lower for CNN than for KNN. H. El Hamdaoui et al. [10] designed a prediction system for heart diseases based on several ML techniques such as NB, KNN, SVM, RF and DT. The results showed that NB achieved the highest accuracy among these algorithms: 82.17% with cross-validation and 84.28% with a train-test split of the data. A. Baccouche et al. [20] proposed an ensemble learning framework of different network models.
The authors collected the data set from Medica Norte Hospital in Mexico; it includes 800 records and 141 indicators such as age, glucose and blood pressure spleen. The results showed that the ensemble learning framework based on a BiLSTM or BiGRU model combined with a CNN had the best classification performance, with accuracy and F1-score between 91% and 96%.
3 Proposed Approach
The overall architecture of the proposed cardiovascular disease prediction approach is shown in Fig. 1. The architecture has four main modules: a data preprocessing module, a feature selection module, a data classification module and a performance evaluation module. The first module performs four phases: data set collecting, addressing the missing values, selection and management of correlated attributes, and data normalization, as detailed in Sect. 3.1. The second module selects the best features for our system, as detailed in Sect. 3.2. The third module is the data classification, which consists of building a prediction model based on different algorithms: SVM, LR, RF, DT, GNB, KNN, XGBoost and the proposed Dense-DNN. The fourth module evaluates the performance of the models built in the previous phase in order to choose the best model for predicting cardiovascular diseases.
[Figure: flow of the proposed system. Data pre-processing module (data set collecting, addressing the missing values, selecting the correlated attributes, data normalization); feature selection module (genetic algorithm) on the branch with feature selection, bypassed on the branch without feature selection; data classification (SVM/LR/RF/DT/GNB/KNN/XGBoost/Dense-DNN); performance evaluation (Accuracy/Recall/Precision/F-measure/MAE/RMSE).]
Fig. 1. Architecture of the proposed system
3.1 Data Preprocessing
The goal of the data pre-processing step is to transform the data set into a format suitable for the machine learning model, because raw data is often distorted and unreliable and can contain missing and redundant values.
Data Set Collecting. The Cleveland dataset is one of the heart disease datasets available in the UCI repository [11]. It contains 303 tuples and 76 attributes, but researchers use only 14 of these attributes for the diagnosis of heart disease. The class label takes two values: 1 for a normal patient and 0 for an abnormal patient. We use this database because it is the one most commonly used by the research community. Table 1 describes the attributes of the Cleveland data set [10].
Addressing the Missing Values. There are several ways to solve the problem of missing values; one of the easiest is simply to delete the rows that contain them. In our database, the attributes concerned by missing values are “ca” and “thal”, with the values 4 and 0 respectively, as shown in Fig. 2. After deleting the missing data, our database decreases by 7 instances and contains 296 instances.
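The row-deletion strategy can be sketched as follows in plain Python. The sample records and the use of `None` to mark a missing value are assumptions made for illustration; they are not the actual Cleveland records.

```python
def drop_incomplete(rows):
    """Keep only the rows in which no attribute is missing (None)."""
    return [row for row in rows if all(v is not None for v in row.values())]

# Hypothetical records with the two attributes that contain missing values.
rows = [
    {"age": 63, "ca": 0, "thal": 1},
    {"age": 52, "ca": None, "thal": 3},   # missing "ca"
    {"age": 41, "ca": 1, "thal": None},   # missing "thal"
    {"age": 57, "ca": 2, "thal": 2},
]

clean = drop_incomplete(rows)
print(len(rows) - len(clean), "rows removed,", len(clean), "rows kept")
```

Deleting rows is the simplest option but shrinks the data set; it is reasonable here only because few instances are affected.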
Table 1. Feature information of the Cleveland data set
Attribute | Description | Possible values
Age | Age (in years) | Valid numbers
Sex | Sex of the patient | 1: male; 0: female
Cp | Type of chest pain (anginal pain) | 0: typical; 1: atypical; 2: non-anginal; 3: asymptomatic
trestbps | Blood pressure while resting in mm Hg | 90–200
Chol | Cholesterol level in mg/dl | 125–565
Fbs | Fasting blood sugar: is fbs > 120 mg/dl | 1: true; 0: false
restecg | Results of electrocardiography while resting | 0: normal; 1: with ST-T; 2: shows left ventricular hypertrophy
thalach | Max heart rate reached | 71–202
exang | Angina induced by exercise | 1: true; 0: false
oldpeak | ST depression induced by exercise relative to rest | 0–7
Slope | The slope of the peak exercise ST segment | 0: upsloping; 1: flat; 2: down-sloping
Ca | Number of major vessels (0–3) colored by fluoroscopy | 0–3
Thal | A blood disorder called thalassemia | 1: normal; 2: fixed defect; 3: reversible defect
Target | Diagnosis of heart diseases | 0: Disease; 1: Nondisease
Selecting the Correlated Attributes. One of the important steps in improving data quality is to determine the correlation between variables. Correlation can be defined as a measure of dependence between two different variables. To calculate the correlation between two variables x and y, we used the Pearson correlation coefficient (P), defined by the formula below, where $\bar{x}$ and $\bar{y}$ denote the means of x and y.

$$P = \frac{\sum_i (x_i - \bar{x}) \times (y_i - \bar{y})}{\sqrt{\sum_i (x_i - \bar{x})^2 \times \sum_i (y_i - \bar{y})^2}} \tag{1}$$

Correlated features are not useful for predicting the target variable because they add computational complexity. We used the correlation matrix shown in Fig. 3 to check whether the attributes are correlated. We observe that the correlation between two variables never exceeds 0.45, which is acceptable. A correlation greater than 0.7 between two attributes can cause a problem, and such attributes must be eliminated. The values of the correlation coefficient are bounded between −1 and 1.
Fig. 2. Missing values
• If the value is −1, there is a perfect negative correlation between the two variables: when one variable increases, the other decreases.
• If the value is 0, there is no correlation between the two variables: they change randomly relative to each other.
• If the value is 1, there is a perfect positive correlation between the two variables: when one variable increases, the other also increases.
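The Pearson coefficient and its two extreme values can be checked with a short, self-contained Python sketch. The function name `pearson` and the sample vectors are illustrative and are not part of the authors' implementation.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Numerator: sum of products of deviations from the means.
    num = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Denominator: square root of the product of the summed squared deviations.
    den = math.sqrt(sum((a - mean_x) ** 2 for a in x) *
                    sum((b - mean_y) ** 2 for b in y))
    return num / den

x = [1.0, 2.0, 3.0, 4.0]
print(pearson(x, [2.0, 4.0, 6.0, 8.0]))   # both variables increase together
print(pearson(x, [8.0, 6.0, 4.0, 2.0]))   # one increases while the other decreases
```

The sketch assumes that neither sequence is constant; a constant sequence would make the denominator zero.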
Fig. 3. Correlation matrix
From the correlation matrix (Fig. 3), we find that:
• The characteristics “chol” and “fbs” are not correlated with the target.
• “age”, “sex”, “trestbps” and “restecg” show a weak correlation with the target.
• “cp”, “thalach” and “slope” show a positive correlation with the target.
• “exang”, “oldpeak”, “ca” and “thal” have a negative correlation with our target.
Data Normalization. Normalization puts all the quantitative variables on the same scale, which greatly facilitates ML. In our work, we used the Min-Max normalization method, which scales the data range to [0, 1]. Each instance of our data set is transformed by the following relation:

$$x_{scaled} = \frac{x - x_{min}}{x_{max} - x_{min}} \tag{2}$$
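A minimal Python sketch of Min-Max normalization for one attribute is given below. The function name `min_max_scale` and the sample values are hypothetical, and mapping a constant column to 0.0 is an assumption added to avoid a division by zero.

```python
def min_max_scale(values):
    """Scale a list of numbers to the range [0, 1] with Min-Max normalization."""
    lo, hi = min(values), max(values)
    if hi == lo:                  # constant column: avoid division by zero (assumption)
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical values for one quantitative attribute (e.g. resting blood pressure).
scaled = min_max_scale([90, 120, 145, 200])
print(scaled)
```

The smallest value of the attribute is mapped to 0 and the largest to 1, so every attribute ends up on the same scale.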
3.2 Feature Selection
Feature selection is a very active research topic and a pre-processing step that plays an important role in ML. It consists of choosing a subset of relevant variables from a large set of attributes, eliminating redundant, irrelevant or noisy variables that have little or no influence on the information we want to predict. In this work, we select the relevant and non-redundant attributes using the genetic algorithm.
Genetic Algorithm. The GA is an optimization algorithm based on the principle of natural evolution. The algorithm mimics evolution by modifying a set of individuals called a population, followed by a random selection of parents from that population, which reproduce through mutation and crossover [12]. An individual is represented by a binary string in which the i-th bit indicates the presence or absence of the i-th attribute in the individual (solution). The length of an individual is the number of attributes of the data set (13). The size of the population and the number of selected attributes are determined experimentally; in this work, the number of selected attributes is 9 and the population size is 20. After the population is initialized, the fitness value of each individual is calculated. We chose the F-measure (defined by formula 8) of the Multi-Layer Perceptron (MLP) classifier as the fitness function. Parents are selected by roulette wheel selection, in which an individual with a higher fitness value has a higher probability of being selected. A new generation is produced by mating the selected parents, through crossover with probability pc = 0.78 and mutation with probability pm ≤ 0.001. After each new generation is produced, the stopping criterion is checked. In this work, we use an upper limit (0.00005) on the variance of the fitness values in the population as the stopping criterion.
The steps for implementing the GA to select the relevant attributes are as follows:
Pseudo code for GA
Input: cleaned and normalized attributes
Output: relevant and non-redundant attributes
Begin
  Initialization
    Set number of selected attributes = 9
    Set number of individuals in the population N = 20
  Calculate the fitness value Fitness_Val_i of each individual i by the F-measure (see formula 8)
  Calculate the population variance by the following relationships:

  $$\text{variance} = \frac{\sum_{i=1}^{20} (\text{Fitness\_Val}_i - \text{Avg\_Fitness})^2}{20} \qquad (3)$$

  $$\text{Avg\_Fitness} = \frac{\sum_{i=1}^{20} \text{Fitness\_Val}_i}{20} \qquad (4)$$

  While (variance > 0.00005)
    Select two parents
    Crossover at a probability of pc = 0.78
    Mutation at a probability of pm ≤ 0.001
  End while
  Return the best attributes
End

The results displayed by the GA are shown in Table 2.

Table 2. The results of running the genetic algorithm
| Quantity | Value |
|---|---|
| Variance | 2.4978478825584272e-05 |
| Number of generations | 48 |
| AVG Fitness | 0.7916631035136182 |
| Selected attributes | [age, sex, cp, trestbps, fbs, thalach, exang, slope, thal] |
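The GA loop described above can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the authors' implementation: the fitness function is a toy stand-in built from hypothetical per-attribute scores (`SCORES`) rather than the F-measure of an MLP, the crossover is assumed to be one-point, the `repair` step that keeps exactly 9 attributes selected is an assumption, and `max_generations` is a safety cap added for the sketch.

```python
import random
from statistics import pvariance

N_ATTRS, N_SELECTED, POP_SIZE = 13, 9, 20
P_CROSS, P_MUT, VAR_LIMIT = 0.78, 0.001, 0.00005

# Hypothetical per-attribute usefulness scores: a stand-in for the paper's
# fitness (F-measure of an MLP trained on the selected attributes).
SCORES = [0.9, 0.8, 0.7, 0.6, 0.1, 0.5, 0.2, 0.75, 0.65, 0.15, 0.55, 0.3, 0.85]


def fitness(ind):
    return sum(s for s, bit in zip(SCORES, ind) if bit) / N_SELECTED


def repair(ind, rng):
    """Flip random bits until exactly N_SELECTED attributes are switched on."""
    ind = ind[:]
    while sum(ind) != N_SELECTED:
        want = 0 if sum(ind) > N_SELECTED else 1  # bit value we need more of
        idx = rng.choice([i for i, b in enumerate(ind) if b != want])
        ind[idx] = want
    return ind


def roulette(pop, fits, rng):
    """Pick one parent with probability proportional to its fitness."""
    return rng.choices(pop, weights=fits, k=1)[0]


def run_ga(seed=0, max_generations=500):
    rng = random.Random(seed)
    pop = [repair([rng.randint(0, 1) for _ in range(N_ATTRS)], rng)
           for _ in range(POP_SIZE)]
    fits = [fitness(ind) for ind in pop]
    generations = 0
    # Stop when the fitness variance no longer exceeds the limit (formula 3).
    while pvariance(fits) > VAR_LIMIT and generations < max_generations:
        children = []
        while len(children) < POP_SIZE:
            a, b = roulette(pop, fits, rng), roulette(pop, fits, rng)
            if rng.random() < P_CROSS:  # one-point crossover (assumed)
                cut = rng.randrange(1, N_ATTRS)
                a, b = a[:cut] + b[cut:], b[:cut] + a[cut:]
            for child in (a, b):
                child = [bit ^ 1 if rng.random() < P_MUT else bit for bit in child]
                children.append(repair(child, rng))
        pop = children[:POP_SIZE]
        fits = [fitness(ind) for ind in pop]
        generations += 1
    best = max(pop, key=fitness)
    return best, fitness(best), generations


best, best_fit, gens = run_ga()
```

Replacing `fitness` with a function that trains a classifier on the selected columns and returns its F-measure would bring the sketch closer to the procedure in the text, at the cost of a much longer run time per generation.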
3.3 Data Classification
This phase consists of building a heart disease prediction model, in which an evaluation score is calculated for each patient. To assess the performance of our system, a split validation approach has been used. Split validation is a technique for evaluating the performance of an ML model in which a given dataset is divided into two subsets: a training set and a test set. In this approach, the cleaned and normalized data is randomly divided into 80% training data and 20% test data to train and test several supervised learning classifiers: SVM, LR, RF, DT, GNB, KNN, XGBoost and Dense-DNN. The classification is done in two ways: one option is to classify the data without selecting the relevant attributes, and the
other option is to use the genetic algorithm to select the best attributes. In this work, we propose a Dense-DNN model to accurately predict whether or not a patient has cardiovascular disease.
The Proposed Classification Algorithm (Dense-DNN). The proposed architecture contains a dense input layer with 90 units and a Rectified Linear Unit (ReLU) activation function [13]; this layer receives the attributes of our data set. The following three dense layers are the hidden layers of our architecture, each with its own number of units and a ReLU activation function. The numbers of units in the three hidden layers are 194, 138, and 6 respectively. Finally, we add a dense output layer which contains a single unit activated by the sigmoid function. The output of this layer is the predicted probability that an individual has a cardiovascular disease. After each layer, we apply batch normalization [14], which normalizes the outputs of the previous layer. We use the sequential model to chain the layers of the Dense-DNN, then we use the AdaBelief optimizer [15] to minimize the loss function of the proposed model. Binary cross-entropy ("binary_crossentropy") is the standard loss function for a two-class classification task, so this is what we used. The proposed architecture is illustrated in Fig. 4. The steps for implementing Dense-DNN are as follows:

Pseudo code for Dense-DNN
Input: AttributesData, TargetData
Output: Model, Accuracy, Precision, Recall, F-measure, MAE, RMSE
Function Dense-DNN (AttributesData, TargetData)
Begin
  Function train_test (AttributesData, TargetData)  # Split data into training data and test data
  Function model_sequential (AttributesData)  # Implement the sequential model
  Function optimization ()  # Implement the AdaBelief optimizer
  Function compilation (Model, Optimizer)  # Compile the model
  Function training (Model, TrainAttData, TrainTargData)  # Train the model
  Function prediction (Model, TestAttData)  # Test the model
  Function results (TestTargData, PredTargData)  # Calculate metrics
  TrainAttData, TestAttData, TrainTargData, TestTargData = train_test (AttributesData, TargetData)
  Model = model_sequential (AttributesData)
  Optimizer = optimization ()
  compilation (Model, Optimizer)
  History = training (Model, TrainAttData, TrainTargData)
  PredTargData = prediction (Model, TestAttData)
  Accuracy, Precision, Recall, F-measure, MAE, RMSE = results (TestTargData, PredTargData)
  Return Model, Accuracy, Precision, Recall, F-measure, MAE, RMSE
End Function
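To make the layer sizes concrete, the following sketch runs a forward pass through a network with the unit counts given in the text (90, 194, 138, 6, 1), using ReLU on all layers except the sigmoid output. It is written in plain Python with random, untrained weights and is only an illustration: batch normalization, the AdaBelief optimizer and training are deliberately left out, and all function names are hypothetical.

```python
import math
import random

LAYER_UNITS = [90, 194, 138, 6, 1]  # input layer, H1, H2, H3, output (from the text)


def count_parameters(n_features):
    """Weights plus biases of the dense layers (batch-norm parameters excluded)."""
    sizes = [n_features] + LAYER_UNITS
    return sum((n_in + 1) * n_out for n_in, n_out in zip(sizes[:-1], sizes[1:]))


def init_layers(n_features, seed=0):
    """Random (untrained) weights, only to make the forward pass runnable."""
    rng = random.Random(seed)
    sizes = [n_features] + LAYER_UNITS
    return [([[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)],
             [0.0] * n_out)
            for n_in, n_out in zip(sizes[:-1], sizes[1:])]


def forward(x, layers):
    """Apply ReLU on every layer except the last, which uses a sigmoid."""
    for k, (weights, biases) in enumerate(layers):
        z = [sum(w * v for w, v in zip(row, x)) + b
             for row, b in zip(weights, biases)]
        if k < len(layers) - 1:
            x = [max(0.0, v) for v in z]
        else:
            x = [1.0 / (1.0 + math.exp(-v)) for v in z]
    return x[0]


layers = init_layers(n_features=9)   # 9 attributes selected by the GA
p = forward([0.5] * 9, layers)       # probability-like output in (0, 1)
n_params_13 = count_parameters(13)   # all 13 attributes
n_params_9 = count_parameters(9)     # reduced attribute set
```

Because only the first layer depends on the number of input attributes, reducing the input from 13 to 9 attributes removes 4 × 90 = 360 weights; the rest of the network is unchanged.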
Fig. 4. Proposed algorithm architecture: dense input layer (90 units), dense hidden layers H1 (194 units), H2 (138 units) and H3 (6 units), and dense output layer (1 unit)
3.4 Performance Evaluation
In this paper, we compare the performance of the prediction model based on Dense-DNN with other ML models such as SVM, LR, RF, DT, GNB, KNN and XGBoost. We first evaluate the performance of the models on clean and unreduced data (data set with 13 attributes), then on clean and reduced data (data set with 9 attributes selected by the GA). To evaluate the validity of the predictive model, we calculate several measures: Accuracy, Precision, Recall, F-measure, Mean Absolute Error (MAE) and Root Mean Squared Error (RMSE).
Performance Metrics. Different performance measures were used to determine the effectiveness of the predictive models. These measures are defined by the following formulas.
Accuracy. It measures the rate of correct predictions over all individuals.

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \qquad (5)$$
where TP, TN, FP and FN stand for True Positive (number of correctly classified positive data), True Negative (number of correctly classified negative data), False Positive (number of negative data incorrectly classified as positive) and False Negative (number of positive data incorrectly classified as negative) respectively.
Precision. It measures the capacity of the model not to make an error when it makes a positive prediction.

$$\text{Precision} = \frac{TP}{TP + FP} \qquad (6)$$
Recall. It measures the ability of the model to detect all positive individuals.

$$\text{Recall} = \frac{TP}{TP + FN} \qquad (7)$$

F-measure. It corresponds to the harmonic mean of the precision and the recall.

$$\text{F-measure} = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \qquad (8)$$
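Formulas (5) to (8) can be checked with a short worked example. The confusion-matrix counts below are hypothetical: they describe a 60-sample test set and were chosen so that the rounded values coincide with the Dense-DNN row of Table 4. The paper does not report the actual confusion matrix, so these counts are an assumption for illustration only.

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, precision, recall and F-measure from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)             # formula (5)
    precision = tp / (tp + fp)                             # formula (6)
    recall = tp / (tp + fn)                                # formula (7)
    f_measure = 2 * precision * recall / (precision + recall)  # formula (8)
    return accuracy, precision, recall, f_measure


# Hypothetical counts (not reported in the paper): 36 positives, 24 negatives.
acc, prec, rec, f1 = classification_metrics(tp=35, tn=22, fp=2, fn=1)
print(f"{acc:.3f} {prec:.3f} {rec:.3f} {f1:.3f}")  # prints: 0.950 0.946 0.972 0.959
```

Note that the F-measure can also be written as 2TP / (2TP + FP + FN), here 70/73, which makes clear that it ignores the true negatives.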
Error Metrics. These are among the most commonly used measures in the literature. They evaluate the quality of the predictions generated by the prediction system by measuring the average "distance" between the forecasts and the corresponding observations. Thus, a value close to 0 indicates near-perfect predictions and a value close to 1 indicates bad predictions.
Mean Absolute Error (MAE). Measures the average absolute difference between the actual value and the prediction.

$$\text{MAE} = \frac{\sum_{i=1}^{n} \left| y_i - \hat{y}_i \right|}{n} \qquad (9)$$

where $y_i$ is the value of the ith observation from the validation dataset and $\hat{y}_i$ is the predicted value for the ith observation.
Root Mean Squared Error (RMSE). Measures the difference between the values predicted by the model and the observed (actual) values. This measure gives an indication of the dispersion or variability of the quality of the prediction. The RMSE can be related to the variance of the model.

$$\text{RMSE} = \sqrt{\frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{n}} \qquad (10)$$
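A small sketch of formulas (9) and (10) follows. The label vectors are made up for illustration (60 test samples, 3 of them misclassified) and are not the paper's data. When both the actual and predicted values are hard 0/1 labels, every error contributes exactly 1 to both sums, so MAE equals the misclassification rate and RMSE equals the square root of MAE.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error, formula (9)."""
    return sum(abs(y - p) for y, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error, formula (10)."""
    return math.sqrt(sum((y - p) ** 2 for y, p in zip(y_true, y_pred)) / len(y_true))


# Made-up labels: 36 positives followed by 24 negatives, with 3 errors in total.
y_true = [1] * 36 + [0] * 24
y_pred = [1] * 35 + [0] * 1 + [1] * 2 + [0] * 22

error_mae = mae(y_true, y_pred)    # 3 / 60
error_rmse = rmse(y_true, y_pred)  # sqrt(3 / 60)
```

If the model's predicted probabilities were used instead of hard labels, the two metrics would no longer be tied together in this way.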
4 Experimental Results and Discussion
This section presents the results of the performance evaluation of our prediction model and the other ML models.
4.1 Performance Metrics Results
Tables 3 and 4 compare the evaluation metrics of the proposed model with those of existing ML models, without and with data reduction respectively, using the split-validation technique. The results in Table 3 show that, without data reduction, the proposed model reaches an accuracy of 91.7%, which equals that of SVM and is higher than that of the other ML models used in this research. According to the results shown in Table 4, the performance of our proposed model improves after the selection of attributes by the GA, and its accuracy of 95% is higher than that of every other ML model used in this work.
Table 3. Performance overview of prediction models without data reduction
| Algorithm | Accuracy | Precision | Recall | F-measure |
|---|---|---|---|---|
| SVM | 0.917 | 0.897 | 0.972 | 0.933 |
| LR | 0.883 | 0.872 | 0.944 | 0.907 |
| RF | 0.833 | 0.861 | 0.861 | 0.861 |
| DT | 0.750 | 0.818 | 0.750 | 0.783 |
| GNB | 0.817 | 0.838 | 0.861 | 0.849 |
| KNN | 0.867 | 0.889 | 0.889 | 0.889 |
| XGB | 0.800 | 0.816 | 0.861 | 0.838 |
| Dense-DNN | 0.917 | 0.919 | 0.944 | 0.932 |
Table 4. Performance overview of prediction models with data reduction
| Algorithm | Accuracy | Precision | Recall | F-measure |
|---|---|---|---|---|
| SVM | 0.867 | 0.912 | 0.861 | 0.886 |
| LR | 0.867 | 0.889 | 0.889 | 0.889 |
| RF | 0.850 | 0.886 | 0.861 | 0.873 |
| DT | 0.717 | 0.771 | 0.750 | 0.761 |
| GNB | 0.833 | 0.861 | 0.861 | 0.861 |
| KNN | 0.850 | 0.909 | 0.833 | 0.870 |
| XGB | 0.883 | 0.914 | 0.889 | 0.901 |
| Dense-DNN | 0.950 | 0.946 | 0.972 | 0.959 |
Figure 5 shows the evaluation metrics of our proposed model with and without data reduction, using split-validation. As we can see, the proposed DL model achieved its best results with the feature selection method. Compared to the results without data reduction, the accuracy and F-measure of SVM, LR, DT and KNN decreased with data reduction, while those of RF, GNB, XGBoost and the proposed model increased. The results obtained indicate that the feature selection method improves the performance of the proposed algorithm.
4.2 Error Metrics Results
The error metrics of the proposed model with and without data reduction, using split-validation, are compared in Table 5 and Fig. 6. As we can see, the proposed DL model achieved the best results using the attribute selection method. Compared to the results of the method without data reduction, the MAE and RMSE of SVM, LR, DT and KNN with data reduction increased, and the
Fig. 5. Performance metrics results
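Table 5. Error metrics results (each cell gives the value with 13 attributes / with 9 attributes)
| Algorithm | MAE (13/9) | RMSE (13/9) |
|---|---|---|
| SVM | 0.083/0.133 | 0.289/0.365 |
| LR | 0.117/0.133 | 0.342/0.365 |
| RF | 0.167/0.150 | 0.408/0.387 |
| DT | 0.250/0.283 | 0.500/0.532 |
| GNB | 0.183/0.167 | 0.428/0.408 |
| KNN | 0.133/0.150 | 0.365/0.387 |
| XGBoost | 0.200/0.117 | 0.447/0.342 |
| Dense-DNN | 0.083/0.050 | 0.289/0.224 |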
Fig. 6. Error metrics results
MAE and the RMSE of RF, GNB, XGBoost and the proposed model decreased. The results obtained indicate that the feature selection method improves and optimizes the performance of the proposed algorithm.
5 Conclusion
The primary goal of this paper is to propose a heart disease prediction model based on Dense-DNN and to improve its performance. To achieve this objective, we compared the performance of the prediction model based on Dense-DNN with other ML models such as SVM, LR, RF, DT, GNB, KNN and XGBoost. This performance evaluation was done in two ways: without feature selection, and with a genetic algorithm for feature selection. The genetic algorithm helped us identify relevant and non-redundant attributes for this study. The proposed system identifies the best hyperparameters for Dense-DNN, achieving an accuracy of 91.7% without feature selection and 95% with feature selection. We also compared our model with the other models mentioned previously using the MAE and RMSE error metrics. The system showed small MAE/RMSE errors of 0.083/0.289 without attribute selection and 0.050/0.224 with attribute selection. This study shows the value of attribute selection techniques for improving the performance of prediction models. Future work may focus on using cross-validation to validate the proposed model, using other feature selection methods such as the ant colony and bee colony algorithms to select the best attributes, and applying the proposed model to predict other diseases such as diabetes.
References
1. WHO: World Health Organization, Media Centre, cardiovascular diseases fact sheet webpage. https://www.who.int/news-room/fact-sheets/detail/cardiovascular-diseases-(cvds). Accessed 11 June 2021
2. APS: Algeria Press Services webpage. https://www.aps.dz/en/health-science-technology. Accessed 24 Mar 2021
3. Dash, M., Liu, H.: Feature selection for classification. Intell. Data Anal. 1(1–4), 131–156 (1997)
4. Jain, A., Zongker, D.: Feature selection: evaluation, application, and small sample performance. IEEE Trans. Pattern Anal. Mach. Intell. 19(2), 153–157 (1997)
5. Yang, J., Honavar, V.: Feature subset selection using a genetic algorithm. IEEE Intell. Syst. 13, 44–49 (1998)
6. Gupta, A., et al.: HeartCare: IoT based heart disease prediction system. In: International Conference on Information Technology (ICIT) (2019)
7. Mohan, S., et al.: Effective heart disease prediction using hybrid machine learning techniques. IEEE Access (2019). https://doi.org/10.1109/ACCESS.2019.2923707
8. Sajja, T.K., et al.: A deep learning model for prediction of cardiovascular disease using convolutional neural network. Revue d'Intelligence Artificielle 34(5), 601–606 (2020). http://iieta.org/journals/ria
9. Dahiwade, D., et al.: Designing disease prediction model using machine learning approach. In: Proceedings of the Third International Conference on Computing Methodologies and Communication (ICCMC 2019). IEEE Xplore Part Number: CFP19K25-ART; ISBN: 978-1-5386-7808-4
10. El Hamadaoui, H., et al.: A clinical support system for prediction of heart disease using machine learning techniques. In: 5th International Conference on Advanced Technologies for Signal and Image Processing, ATSIP'2020, Sfax, Tunisia
11. Heart Disease Dataset. https://archive.ics.uci.edu/ml/datasets/heart+disease
12. Oluleye, B., et al.: A genetic algorithm-based feature selection. Int. J. Electron. Commun. Comput. Eng. (2014)
13. Nair, V., Hinton, G.E.: Rectified linear units improve restricted Boltzmann machines. In: Proceedings of the 27th International Conference on Machine Learning, pp. 807–814 (2010)
14. Ioffe, S., Szegedy, C.: Batch normalization: accelerating deep network training by reducing internal covariate shift. In: International Conference on Machine Learning, pp. 448–456 (2015)
15. Zhuang, J., et al.: Adabelief optimizer: adapting stepsizes by the belief in observed gradients. Adv. Neural Inf. Process. Syst. 33, 18795–18806 (2020)
16. Ramalingam, V.V., et al.: Heart disease prediction using machine learning techniques: a survey. Int. J. Eng. Technol. 7(2.8), 684–687 (2018)
17. Katarya, R., Kumar Meena, S.: Machine learning techniques for heart disease prediction: a comparative study and analysis. IUPESM and Springer-Verlag GmbH Germany, part of Springer Nature (2020)
18. Sateesh Kumar, R., Sameen Fatima, S.: Heart disease prediction using extended KNN (EKNN). In: Satapathy, S.C., Bhateja, V., Favorskaya, M.N., Adilakshmi, T. (eds.) Smart Computing Techniques and Applications. SIST, vol. 224, pp. 565–572. Springer, Singapore (2021). https://doi.org/10.1007/978-981-16-1502-3_56
19. Donga, W., et al.: XGBoost algorithm-based prediction of concrete electrical resistivity for structural health monitoring. Automation in Construction, Elsevier (2021)
20. Baccouche, et al.: Ensemble deep learning models for heart disease classification: a case study from Mexico. Information 11, 207 (2020). https://doi.org/10.3390/info11040207
Author Index
A Abbas, Sofia, 62 Abdelli, Abdelkrim, 46 Acheli, Dalila, 292 Amad, Mourad, 333 Amar Setti, Hamza, 319 Atmani, Baghdad, 319 B Babaali, Baligh, 205 Behloul, Ali, 135 Benmakhlouf, Abdeslam, 150 Benmammar, Badr, 263 Bensalem, Riad, 165 Besma, Hezili, 177 Bessa, Samah, 165 Bey, Fella, 17 Boughaci, Dalila, 121 Bouhamed, Mohammed Mounir, 105, 220 Boukra, Abdelmadjid, 17 Boulaiche, Ammar, 31 Bousbia, Nabila, 165 Bousri, Mohamed Charafeddine, 165 Bouzenada, Ahmed, 220 Brahimi, Farida, 333 C Chaoui, Allaoua, 62, 220 Cheragui, Mohamed Amine, 249 Cherfi, Sarra, 31 Chikhi, Salim, 76 Chouiref, Zahira, 333
D Dahou, Abdelhalim Hafedh, 249 Díaz, Gregorio, 220 Dif, Rougaia, 192 Dilekh, Tahar, 3 Djarah, Djalal, 150 E El’Amine, Korichi Mokhtar, 306 F Fellah, Kaouthar Manar, 234 G Gabis, Asma Benmessaoud, 292 Guezouli, Larbi, 3 H Haddaoui, Seloua, 76 Hakem, Mourad, 263 Hamdi, Skander, 279 Hattab, Abdessalam, 135 Hichem, Talbi, 177 I Idris Khodja, Ilhem, 319 K Kadache, Nabil, 91 Kadri, Mohamed Riadh, 46 Kahloul, Laid, 234 Kali Ali, Selma, 121 Kamel, Oussama, 220
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 S. Chikhi et al. (Eds.): MISC 2022, LNNS 593, pp. 349–350, 2023. https://doi.org/10.1007/978-3-031-18516-8
Kerdoudi, Saoueb, 3 Kerkouche, El Hillali, 62 Kessouri, Mohamed, 333 Khalfaoui, Khaled, 62 L Labed, Said, 192 Lamri, Zineb, 165 Lemouari, Ali, 31 M Maati, Bouchera, 105 Macià, Hermenegilda, 220 Malti, Arslan Nedhir, 263 Manaa, Abderzak, 333 Meraihi, Yassine, 292 Miles, Badreddine, 76 Mokaddem, Chihab Eddine, 319 Mokaddem, Mostefa, 319 Mokdad, Lynda, 46 Moussaoui, Abdelouahab, 279 N Nabila, Benyoucef, 306
O Ould Khaoua, Adel Salah, 17 Oussalah, Mourad, 279 R Ramdane-Cherif, Amar, 292 S Saidi, Mohamed, 279 Saidouni, Djamel Eddine, 105 Salem, Mohammed, 205 Seghir, Rachid, 91 T Taleb, Sylia Mekhmoukh, 292 Tigane, Samir, 234 Touati, Hamza, 192 Y Yahia, Selma, 292 Z Zakaria, Chahnez, 165 Zidani, Ghania, 150