Proceedings of International Conference on Information Technology and Applications: ICITA 2022 (ISBN 9811993300, 9789811993305)

This book includes high-quality papers presented at the 16th International Conference on Information Technology and Applications (ICITA 2022).

Language: English. Pages: 719 [720]. Year: 2023.

Table of contents:
Conference Organization
Preface
Contents
Editors and Contributors
Machine Learning and Data Science
Intelligent Sepsis Detector Using Vital Signs Through Long Short-Term Memory Network
1 Introduction
2 Proposed Methodology
2.1 Pre-processing
2.2 Intelligent Sepsis Detector
3 Experimental Results and Discussions
3.1 Dataset
3.2 Results and Discussions
3.3 Performance Comparison with Other Techniques
4 Conclusion
References
Implementation of Big Data and Blockchain for Health Data Management in Patient Health Records
1 Introduction
2 Methodology
2.1 Architecture
2.2 PHR User Interface
3 Conclusion
References
Ambient PM2.5 Prediction Based on Prophet Forecasting Model in Anhui Province, China
1 Introduction
2 Proposed Methodology
2.1 Study Area
2.2 Data
2.3 Proposed Model
2.4 Statistical Analysis
3 Results and Discussion
4 Conclusion
References
Potato Leaf Disease Classification Using K-means Cluster Segmentation and Effective Deep Learning Networks
1 Introduction
2 Literature Review
3 Data Preparation
3.1 Data Collection
3.2 Augmentation
3.3 Segmentation
3.4 Proposed Network
4 Experimental Result Analysis
4.1 Performance Comparison
5 Conclusion and Future Work
References
Diagnosis of Polycystic Ovarian Syndrome (PCOS) Using Deep Learning
1 Introduction
2 Background
3 Methodology and Implementation
3.1 Implementing the Deep Learning Models
3.2 Methodology
4 Results and Evaluation
4.1 CNN Model’s Performance
4.2 Custom VGG-16 Model’s Performance
4.3 ResNet-50 Model
4.4 Custom ResNet-50 Model
4.5 Discussion
5 Conclusion
References
CataractEyeNet: A Novel Deep Learning Approach to Detect Eye Cataract Disorder
1 Introduction
2 Literature Review
2.1 Convolutional Neural Network
2.2 Pre-trained Models
2.3 VGG-19
3 Proposed Methodology
3.1 Customization
3.2 Proposed CataractEyeNet Model
3.3 Dataset
4 Experimental Setup and Results
4.1 Performance of the CataractEyeNet
4.2 Confusion Matrix
4.3 Performance Comparison Against Other Methods
5 Conclusion
References
DarkSiL Detector for Facial Emotion Recognition
1 Introduction
2 Proposed Methodology
2.1 Datasets for Emotion Detection
2.2 Data Processing
2.3 DarkSiL Architecture
3 Experimental Setup and Results
3.1 Performance Evaluation of the Proposed Method
3.2 Comparison with Contemporary Methods
3.3 Cross-Corpora Evaluation
4 Discussion
5 Conclusion
References
Review and Enhancement of Discrete Cosine Transform (DCT) for Medical Image Fusion
1 Introduction
2 Image Fusion Objective
3 Fusion Classification
3.1 Pixel Level
3.2 Feature Level
3.3 Decision Level
4 Acquired Input Image
5 Research Methodology
5.1 Discrete Cosine Transform (DCT)
5.2 DCT for Image Fusion
6 Discussion
7 Conclusion
References
Early Courier Behavior and Churn Prediction Using Machine Learning in E-Commerce Logistics
1 Introduction
2 Proposed Approach
2.1 Data Preparation
2.2 Feature Extraction and Selection
2.3 Generation Behavior and Churn Prediction Models
2.4 Early Courier Churn Prediction
3 Experiments
3.1 Experimental Setup
3.2 Evaluation Metrics
3.3 Results of Courier Behavior Prediction Experiment
3.4 Results of Churn Prediction Experiment
4 Conclusion and Discussion
References
Combining Different Data Sources for IIoT-Based Process Monitoring
1 Introduction
2 Related Work
2.1 Intrusive Load Monitoring (ILM)
2.2 Indoor Location Systems
3 Proposed System Overview
4 Results and Discussion
5 Conclusion and Future Work
References
Comparative Analysis of Machine Learning Algorithms for Author Age and Gender Identification
1 Introduction
2 Related Work
3 Methodology
3.1 Preprocessing Phases
3.2 Selecting Strategy
3.3 Applying Algorithms
3.4 Datasets
4 Results and Discussions
4.1 Age Analysis
4.2 Age Gender Combine Analysis
5 Conclusion
References
Prioritizing Educational Website Resource Adaptations: Data Analysis Supported by the k-Means Algorithm
1 Introduction
2 Proposed Methodology
3 Experimental Results
3.1 Likert Evaluation
3.2 K-Means Clustering
4 Final Considerations
References
Voice Operated Fall Detection System Through Novel Acoustic Std-LTP Features and Support Vector Machine
1 Introduction
2 Proposed Methodology
2.1 Feature Extraction
2.2 Std-LTP Computation
2.3 Classification
3 Experimental Setup and Results Discussion
3.1 Dataset
3.2 Performance Evaluation of the Proposed System
3.3 Performance Comparison of Std-LTP Features on Multiple Classifiers
4 Conclusion
References
Impact of COVID-19 on Predicting 2020 US Presidential Elections on Social Media
1 Introduction
2 Related Studies
3 Proposed Methodology
3.1 Tweets
3.2 Data Pre-processing
3.3 Sentiment Analysis
3.4 Predicting the President
4 Experimental Results and Discussion
4.1 Sentiment Analysis
4.2 Predicting Vote Share
5 Conclusion and Future Work
References
Health Mention Classification from User-Generated Reviews Using Machine Learning Techniques
1 Introduction
2 Related Studies
3 Proposed Approach
4 Experiment
4.1 Dataset
4.2 Experimental Setup
5 Results and Discussions
5.1 Health Mention Classification Using Shallow Machine Learning Algorithms
5.2 Health Mention Classification Using Bi-Directional Long Short-Term Memory (BiLSTM) and Convolutional Neural Network (CNN)
5.3 Health Mention Classification Using Bidirectional Encoder Representations from Transformers
5.4 Health Mention Classification Using Pre-Trained BERT-Based Language Model for Scientific Text
6 Conclusions
References
Using Standard Machine Learning Language for Efficient Construction of Machine Learning Pipelines
1 Introduction
2 Related Work
3 Grammar
3.1 Grammar Structure
3.2 Keywords
4 SML’s Architecture
5 Interface
6 Use Cases
6.1 Iris Dataset
6.2 Auto-Mpg Dataset
7 Discussion
8 Conclusion
References
Machine Learning Approaches for Detecting Signs of Depression from Social Media
1 Introduction
2 Related Studies
3 Proposed Approach
4 Experiment
4.1 Dataset
4.2 Experimental Setup
5 Results and Discussions
6 Conclusions
References
Extremist Views Detection: Definition, Annotated Corpus, and Baseline Results
1 Introduction
2 Conceptualizing Extremism
2.1 Extremism in Dictionaries
2.2 Extremism in the Scientific Literature
2.3 Extremism in Practice
3 Extremism Detection Corpus
3.1 Extremism Detection Corpora
3.2 Development of XtremeView-22
4 Effectiveness of ML Techniques
5 Conclusion
References
Chicken Disease Multiclass Classification Using Deep Learning
1 Introduction
2 Related Work
3 Data Preparation
3.1 Dataset Used
3.2 Dataset Preparation
3.3 Splitting Dataset, Hardware and Software Used
4 Technology Used
4.1 Convolutional Neural Network
4.2 Transfer Learning
5 Implementation and Results
5.1 Experimentation and Analysis
5.2 Comparison of Selected Models’ Results
6 Conclusion and Future Work
References
Deepfakes Catcher: A Novel Fused Truncated DenseNet Model for Deepfakes Detection
1 Introduction
2 Proposed Methodology
2.1 Facial Frames Extraction
2.2 Fused Truncated DenseNet
3 Experiment Setup and Results
3.1 Dataset
3.2 Performance Evaluation of the Proposed Method
3.3 Ablation Study
3.4 Performance Evaluation of Proposed Method on Cross-Set
3.5 Comparative Analysis with Contemporary Methods
4 Conclusion
References
Benchmarking Innovation in Countries: A Multimethodology Approach Using K-Means and DEA
1 Introduction
2 Background
2.1 The Global Innovation Index (GII)
2.2 K-Means Method
2.3 Data Envelopment Analysis (DEA)
2.4 The CCR Model Inverted Frontier
3 Methods and Results
3.1 K-Means Cluster Analysis of GII
3.2 DEA CCR Model Output-Oriented
3.3 DEA CCR Model Output-Oriented
4 Conclusion
References
Line of Work on Visible and Near-Infrared Spectrum Imaging for Vegetation Index Calculation
1 Introduction
2 Literature Review
2.1 Calibration
2.2 Image Registration and Vegetation Index Calculation
3 Methodology
3.1 Calibration
3.2 Image Registration
3.3 Vegetation Index Calculation
4 Results
5 Discussion
References
Modeling and Predicting Daily COVID-19 (SARS-CoV-2) Mortality in Portugal
1 Introduction
2 Background
3 Method
4 Results
5 Conclusions
References
Software Engineering
Digital Policies and Innovation: Contributions to Redefining Online Learning of Health Professionals
1 Introduction
2 Proposed Methodology
2.1 Methodological Approach
3 Conclusion
References
Reference Framework for the Enterprise Architecture for National Organizations for Official Statistics: Literature Review
1 Introduction
2 The Official Statistical Information
3 Standards for Supporting Official Statistics Production
4 Enterprise Architecture Frameworks
5 Framework for NOOS
6 Conclusions
References
Assessing the Impact of Process Awareness in Industry 4.0
1 Introduction
2 Related Work
2.1 Car Workshop Systems
2.2 Business Process Models in Industry 4.0
3 Proposed Platform
3.1 Requirements
3.2 Architecture
3.3 Technologies
4 Validation
4.1 Model Validation
4.2 GUI Validation
5 Conclusion and Future Work
References
An Overview on the Identification of Software Birthmarks for Software Protection
1 Introduction
2 Identification of Software Birthmark to Prevent Piracy
3 Analysis of the Existing Approaches
4 Conclusions
References
The Mayan Culture Video Game—“La Casa Maya”
1 Introduction
2 Development of the Mayan Culture Video Game
2.1 Preliminary Investigation
2.2 General Idea About the Subject of the Video Game
2.3 3D Meshes and Models
3 Experimentation and Results
3.1 Usability Analysis of the Video Game
4 Conclusion
References
Impact of Decentralized and Agile Digital Transformational Programs on the Pharmaceutical Industry, Including an Assessment of Digital Activity Metrics and Commercial Digital Activities
1 Introduction
2 Literature Review
3 Methodological Approach
3.1 Questionnaire Design and Variables Selection
4 Results and Analysis
4.1 Descriptive Analysis
4.2 Statistical Analysis
5 Discussions
6 Conclusions
References
Sprinting from Waterfall: The Transformation of University Teaching of Project Management
1 Introduction
2 Enrolment and Teaching Approaches
3 Subject Design Rationale
4 Implementation and Stakeholder Responses
5 Future Work and Direction
6 Conclusion
References
Versioning: Representing Cultural Heritage Evidences on CIDOC-CRM via a Case Study
1 Introduction
2 Representing Data from Heterogeneous Sources
2.1 CIDOC-CRM
2.2 Definition—Schema Versioning
2.3 Data Modelling Issues: Event-Version
3 Recording Cultural Heritage: A Study Case Using Archaeological Monuments
3.1 Object-Based Record
3.2 Versioning-Based Record
3.3 Event-Version-Based Record
4 Conclusion
References
Toward a Route Optimization Modular System
1 Introduction
2 Background
3 Material and Methods
4 CRISP-DM
4.1 Business and Data Understanding
4.2 Data Preparation
4.3 Modeling
5 Results
6 Conclusion and Future Work
References
Intellectual Capital and Information Systems (Technology): What Does Some Literature Review Say?
1 Introduction
2 Literature Review
3 Conclusions
References
REST, GraphQL, and GraphQL Wrapper APIs Evaluation. A Computational Laboratory Experiment
1 Introduction
2 REST and GraphQL
3 Experimental Setting
3.1 Goal Definition
3.2 Factors and Treatments
3.3 Variables
3.4 Hypothesis
3.5 Design
3.6 Use Cases
3.7 Experimental Tasks
3.8 Instrumentation
3.9 Data Collection and Analysis
4 Experiment Execution
4.1 Preparation
4.2 Data Collection
5 Results
5.1 Statistical Analysis
6 Threats to Validity
7 Discussion
8 Conclusions and Future Work
References
Design Science in Information Systems and Computing
1 Introduction
2 Main Research Topics in Literature
3 What Design Science is and What It Is Not
4 Proposing a Tool to Help Researchers
5 Conclusions and Implications
References
Organizational e-Learning Systems’ Success in Industry
1 Introduction
2 Theoretical Background and Model Assumptions
3 Empirical Study in Industry
4 Focus Group Study Results and Discussion
5 Conclusions
References
Network Security
Smart Pilot Decontamination Strategy for High and Low Contaminated Users in Massive MIMO-5G Network
1 Introduction
2 System Model
3 Problem Formulation
4 The Proposed Strategy
4.1 Interpretation of Pilot Assignment for High Contaminate Users
4.2 Interpretation of Pilot Assignment for Low Contaminate Users
5 Simulation Result
6 Conclusion
References
Cluster-Based Interference-Aware TDMA Scheduling in Wireless Sensor Networks
1 Introduction
2 Proposed Methodology
2.1 Clustering Algorithm
2.2 Interference-Aware Intra-cluster Transmission Scheduling
2.3 Interference-Aware Inter-cluster Transmission Scheduling
3 Performance Evaluation
4 Conclusion
References
Messaging Application Using Bluetooth Low Energy
1 Introduction
2 Literature Survey
3 Methodology
3.1 Development Environment
3.2 Functionality of the Messaging Interface
3.3 Event of the Messaging Application
3.4 Phases of BLE Device
4 Results and Analysis
4.1 Limitations
5 Conclusions and Future Work
References
Scalable and Reliable Orchestration for Balancing the Workload Among SDN Controllers
1 Introduction
2 Related Work
3 Design
3.1 Orchestration Protocol
3.2 Controller Orchestration Function
4 Deployment
4.1 Orchestration Protocol
4.2 Controller Orchestration Function
5 Experimental Results
5.1 Functional Test of the Orchestration Protocol
5.2 Functional Test Comparing Two Alternative Orchestration Functions
5.3 Functional Test Comparing Centralized versus Distributed Orchestration
6 Discussion
7 Conclusion
References
A Digital Steganography Technique Using Hybrid Encryption Methods for Secure Communication
1 Introduction
2 Proposed Methodology
3 Proposed Model for Medical Image Transmission of Covid Patients
3.1 Embedding Process
3.2 Encryption Algorithm (Based on Confusion and Scrambling Operation)
3.3 Extraction Process
4 Results and Analysis
5 Conclusion
References
Utilizing Blockchain Technology to Enhance Smart Home Security and Privacy
1 Introduction
2 Literature Review
3 System Design
4 Experimental Results
5 Conclusion
References
Quality of Service Improvement of 2D-OCDMA Network Based on Two Half of ZCC Code Matrix
1 Introduction
2 2D-HSSZCC Code Construction
3 System Performance and Analysis
4 Results and Discussion
5 Conclusion
References
The Impact of 5G Networks on Organizations
1 Introduction
2 Materials and Methods
3 5G Networks
4 5G Network in Ecuador
5 Strategy for Ecuador
6 Conclusions
References
Internet of Things and Smart Technology
The Fast Health Interoperability Resources (FHIR) and Integrated Care, a Scoping Review
1 Introduction
2 Methods
3 Results
3.1 Selection of the Studies
3.2 Purposes of Included Studies
3.3 FHIR Facilitators and Barriers
4 Discussion
5 Conclusion
References
Blockchain Based Secure Interoperable Framework for the Internet of Medical Things
1 Introduction
2 State of the Art
3 BSIIoMT Framework, Working Principle, and Design Considerations
3.1 System Model
3.2 Working Principle of the BSIIoMT Framework
3.3 Design Consideration
4 Discussion
5 Conclusion
References
WaterCrypt: Joint Watermarking and Encryption Scheme for Secure Privacy-Preserving Data Aggregation in Smart Metering Systems
1 Introduction
2 Related Works
3 Overview of the Proposed Scheme
3.1 System Model
3.2 Threat Model
3.3 Proposed Scheme
4 Experimental Results and Performance Evaluation
4.1 Computational Time
4.2 Security and Privacy Analysis
4.3 Comparative Threat Analysis
5 Conclusion and Future Work
References
Technological Applications for Smart Cities: Mapping Solutions
1 Introduction
2 Proposed Methodology
3 Evolution of Cities to Intelligent Cities
4 Technological Applications for Smart Cities
5 Use of Technology for Mapping Priority Drives in Cities
6 Innovation Quadruple Helix Engagement
6.1 Propeller of Governments
6.2 Academies Propeller
6.3 Private Sector Propeller
6.4 Civil Society Propeller
7 Sustainable Technology in Cities
8 Discussion
9 Conclusion
References
Duty-Cycling Based Energy-Efficient Framework for Smart Healthcare System (SHS)
1 Introduction
2 Literature Survey
3 Smart Healthcare: Energy-Efficient Techniques
4 Proposed Methodology: Energy-Efficient Data Transmission in SHS
4.1 On-Demand Wake-Up Radio Mechanism on Sensor Node
4.2 On-Demand Wake-Up Radio Mechanism on Receiver Node
5 SHS: Energy-Efficient Duty-Cycling and Data-Driven Based Framework
5.1 Full-Functional and Reduced-Functional Device
5.2 Duty Cycling: On-Demand Wake-Up Radio
5.3 Workflow of the Proposed Framework
5.4 Energy-Efficiency: Edge, Fog, and Cloud
6 Conclusion
References
Quality 4.0 and Smart Product Development
1 Introduction
2 Research Methodology
3 Literature Review
3.1 Industry 4.0 and Digitalization
3.2 Smart Components and Products
3.3 Quality 4.0 and Quality 5.0
3.4 New Smart Product Development
4 Discussion
5 Conclusions
References
Embedded Vision System Controlled by Dual Multi-frequency Tones
1 Introduction
1.1 OpenCv and Android Studio
1.2 MT8870 DTMF Module
1.3 Four-Bit Relay Module
2 Materials and Methods
2.1 Electrical Connection Between Modules
2.2 Control Algorithm
2.3 Experimental Setup
3 Results
3.1 Time Analysis
4 Conclusions
References
Determinants of City Mobile Applications Usage and Success
1 Introduction
2 Theoretical Overview
3 Model Proposal
4 Methodology
5 Discussion
6 Conclusions
References
Sustainable Digital Transformation Canvas: Design Science Approach
1 Introduction
2 The Current State of the Art
3 Research Design and Results
4 Conclusion
References
The Role of Community Pharmacies in Smart Cities: A Brief Systematic Review and a Conceptual Framework
1 Introduction
2 Proposed Methodology
2.1 Databases and Keywords
2.2 Inclusion and Exclusion Criteria
3 Results
3.1 PRISMA 2020 Flow Diagram for New Systematic Reviews
3.2 Studies Related to Community Pharmacies in Smart Cities
4 Discussion
4.1 Community Pharmacies in Smart Cities
4.2 Reply to the Research Question
4.3 National Health Databases: Possible Application in Smart Cities
4.4 Information and Communication Technologies (ICT): 5G, Internet of Things, Blockchain, and Artificial Intelligence as Resources for Community Pharmacies in the Scope of Smart Cities
4.5 Implications to Practice and Future Research
5 Conclusion
References
Digital Media and Education
Education in the Post-covid Era: Educational Strategies in Smart and Sustainable Cities
1 Introduction
2 Smart and Sustainable Cities
3 Educational Strategy
4 Methodological Approach
5 Final Considerations
References
Digital Health and Wellbeing: The Case for Broadening the EU DigComp Framework
1 Introduction
2 Background
2.1 Digital Health and Security
3 Methodology
3.1 European DigComp Revision
4 Results and Discussion
4.1 Results Protecting Devices
4.2 Results Protecting Personal Data and Privacy
4.3 Results Protecting Health and Wellbeing
5 Conclusion
References
An Information Systems Architecture Proposal for the Thermalism Sector
1 Introduction
2 Conceptual Background
2.1 Management Information Systems (MIS)
2.2 The Information Systems Within Organizations
2.3 Health and Wellness Information Systems
3 The Portuguese Thermalism Sector—A Case Study
4 Proposed Information System Architecture for the Thermalism Sector
4.1 Operational Management of Thermal SPAs
4.2 Monitoring the Therapeutic Effects of the Thermal Activity
4.3 Access Control
4.4 Integrations
4.5 Business Analytics
4.6 Reporting Service
4.7 Information Service Support
4.8 User Interface
5 Conclusions and Future Work
References
Technologies for Inclusive Wellbeing: IT and Transdisciplinary Applications
1 Introduction
2 Virtual Reality and All That
3 SoundScapes/Personics
4 ZOOM (Zone of Optimized Motivation)
5 Workforce of the Future
6 (Re)habilitation (By Any Other Name)
7 Discussion Segueing to Conclusion
References
Should the Colors Used in the Popular Products and Promotional Products Be Integrated?
1 Introduction
2 Related Works and Hypotheses Development
2.1 Effects of Color on Consumer Behavior
2.2 Difference Between Promotional Color and Popular Color
3 Methodology
4 Results and Discussions
4.1 Results
4.2 Implications
4.3 Limitations and Future Work
References
Impact of Teacher Training on Student Achievement
1 Introduction
2 Literature Review
3 Data
4 Empirical Strategy
5 Main Results
6 Conclusions
References
Educational Data Mining: A Predictive Model to Reduce Student Dropout
1 Introduction
2 Methodology
3 Background
3.1 Data Mining
3.2 Data Mining and Students
4 Analysis of Results
5 Conclusions
References
Author Index

Lecture Notes in Networks and Systems 614

Sajid Anwar · Abrar Ullah · Álvaro Rocha · Maria José Sousa, Editors

Proceedings of International Conference on Information Technology and Applications ICITA 2022

Lecture Notes in Networks and Systems Volume 614

Series Editor
Janusz Kacprzyk, Systems Research Institute, Polish Academy of Sciences, Warsaw, Poland

Advisory Editors
Fernando Gomide, Department of Computer Engineering and Automation—DCA, School of Electrical and Computer Engineering—FEEC, University of Campinas—UNICAMP, São Paulo, Brazil
Okyay Kaynak, Department of Electrical and Electronic Engineering, Bogazici University, Istanbul, Türkiye
Derong Liu, Department of Electrical and Computer Engineering, University of Illinois at Chicago, Chicago, USA; Institute of Automation, Chinese Academy of Sciences, Beijing, China
Witold Pedrycz, Department of Electrical and Computer Engineering, University of Alberta, Alberta, Canada; Systems Research Institute, Polish Academy of Sciences, Warsaw, Poland
Marios M. Polycarpou, Department of Electrical and Computer Engineering, KIOS Research Center for Intelligent Systems and Networks, University of Cyprus, Nicosia, Cyprus
Imre J. Rudas, Óbuda University, Budapest, Hungary
Jun Wang, Department of Computer Science, City University of Hong Kong, Kowloon, Hong Kong

The series “Lecture Notes in Networks and Systems” publishes the latest developments in Networks and Systems—quickly, informally and with high quality. Original research reported in proceedings and post-proceedings represents the core of LNNS. Volumes published in LNNS embrace all aspects and subfields of, as well as new challenges in, Networks and Systems. The series contains proceedings and edited volumes in systems and networks, spanning the areas of Cyber-Physical Systems, Autonomous Systems, Sensor Networks, Control Systems, Energy Systems, Automotive Systems, Biological Systems, Vehicular Networking and Connected Vehicles, Aerospace Systems, Automation, Manufacturing, Smart Grids, Nonlinear Systems, Power Systems, Robotics, Social Systems, Economic Systems and other. Of particular value to both the contributors and the readership are the short publication timeframe and the world-wide distribution and exposure which enable both a wide and rapid dissemination of research output. The series covers the theory, applications, and perspectives on the state of the art and future developments relevant to systems and networks, decision making, control, complex processes and related areas, as embedded in the fields of interdisciplinary and applied sciences, engineering, computer science, physics, economics, social, and life sciences, as well as the paradigms and methodologies behind them. Indexed by SCOPUS, INSPEC, WTI Frankfurt eG, zbMATH, SCImago. All books published in the series are submitted for consideration in Web of Science. For proposals from Asia please contact Aninda Bose ([email protected]).

Sajid Anwar · Abrar Ullah · Álvaro Rocha · Maria José Sousa Editors

Proceedings of International Conference on Information Technology and Applications ICITA 2022

Editors
Sajid Anwar, Institute of Management Sciences, Peshawar, Pakistan
Abrar Ullah, School of Mathematical and Computer Science, Heriot-Watt University, Dubai, United Arab Emirates
Álvaro Rocha, University of Lisbon, Lisbon, Portugal
Maria José Sousa, University Institute of Lisbon (ISCTE), Lisbon, Portugal

ISSN 2367-3370 ISSN 2367-3389 (electronic) Lecture Notes in Networks and Systems ISBN 978-981-19-9330-5 ISBN 978-981-19-9331-2 (eBook) https://doi.org/10.1007/978-981-19-9331-2 © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd. The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721, Singapore

Conference Organization

Honorary Chair
David Tien, Senior Lecturer, Charles Sturt University and Vice Chairman, IEEE Computer Chapter, NSW, Australia
Prof. Álvaro Rocha, Professor, University of Lisbon, Portugal, President of AISTI (Iberian Association for Information Systems and Technologies), Chair of IEEE SMC Portugal Section Society Chapter

General Chair
Dr. Abrar Ullah, Associate Professor, School of Mathematical and Computer Sciences, Heriot-Watt University, Dubai, United Arab Emirates
Dr. Maria José Sousa, Pro-Rector for Distance Education, University Institute of Lisbon, Portugal

General Co-chair
Dr. Ryad Soobhany, Assistant Professor, School of Mathematical and Computer Sciences, Heriot-Watt University, Dubai, United Arab Emirates
Dr. Imran Razzak, Senior Lecturer, School of Information Technology, Deakin University, Victoria, Australia
Dr. Pedro Sebastião, Assistant Professor, University Institute of Lisbon, Portugal

International Chair
Dr. Sajid Anwar, Associate Professor, Institute of Management Sciences, Peshawar, Pakistan
Dr. Anthony Chan, Charles Sturt University, Australia

Workshop Chair
Dr. Teresa Guarda, Director of the CIST Research and Innovation Center, Faculty of Systems and Telecommunications, UPSE, Ecuador
Dr. B. B. Gupta, Assistant Professor, National Institute of Technology Kurukshetra, India

Special Session Chair
Prof. Fernando Moreira, Full Professor and Director of the Department of Science and Technology, Universidade Portucalense, Porto, Portugal
Dr. Shah Nazir, Assistant Professor, University of Swabi, Pakistan

Poster Chair
Dr. Isabel Alexandre, Assistant Professor, University Institute of Lisbon, Portugal
Joana Martinho da Costa, Invited Assistant Professor, University Institute of Lisbon, Portugal

Program Committee Chair
Dr. Sérgio Moro, Associate Professor, University Institute of Lisbon, Portugal
Dr. Babar Shah, Associate Professor, Zayed University, Abu Dhabi, UAE

Preface

This conference addresses the importance of IT professionals, academics, and researchers reaching across narrowly defined subject areas and continually acquiring a global technical and social perspective. ICITA 2022 offers such an opportunity by facilitating cross-disciplinary and social gatherings. Given the breadth and depth of the topics, it is difficult to place them into rigid categories; for the convenience of readers, however, the contributions are broadly grouped into Software Engineering, Machine Learning, Network Security, and Digital Media and Education.

The need for novel software engineering (SE) tools and techniques that are highly reliable and robust is the order of the day. There is a growing understanding that the design and evolution of software systems and tools must be "smart" if they are to remain efficient and effective. The artifacts produced during the construction of software systems, from specifications through to delivery, can be convoluted and difficult to manage, and a software engineer cannot uncover all of their intricacies by examining them manually. Automated tools and techniques are required to reason over business knowledge and identify what is missing, or what could usefully be changed, while these artifacts are produced and evolved. Researchers widely agree that SE provides an ideal platform for applying and testing recent advances in artificial intelligence (AI) tools and techniques, and an increasing number of SE problems are now addressed through AI, for example through tool automation and machine learning algorithms.

Machine learning is a broad subfield of computational intelligence concerned with developing techniques that allow computers to "learn". With the increased and effective use of machine learning techniques, demand for this approach has risen in many areas of life. Machine learning is applied across domains of computer science including e-commerce, software engineering, robotics, digital media and education, and computer security. Given the opportunities and challenges of emerging machine learning applications, this area holds great potential for further investigation.

The growth of data has revolutionized the production of knowledge within and beyond science by creating efficient ways to plan, conduct, disseminate, and assess high-quality novel research. The past decade has witnessed innovative approaches to producing, storing, and analyzing data, culminating in the emergence of the field of data science, which brings together computational, algorithmic, statistical, and mathematical techniques for extracting knowledge from ever-growing data sources. This area of research continues to grow and attracts considerable interest.

Computer security is the process of protecting computer software, hardware, and networks against harm, and its application spans hardware, software, and network security. In the wake of rising security threats, it is imperative to improve security postures. This remains an ongoing and active research area that attracts strong interest from researchers and practitioners.

With the advent of the Internet and related technology, traditional teaching and learning have largely transformed into digital education. Teachers and students rely heavily on digital media in face-to-face classrooms and in remote online learning. The adoption of digital media profoundly changes the landscape of education, particularly with regard to online learning, e-learning, blended learning, and digitally assisted face-to-face learning, offering new possibilities but also challenges that need to be explored and assessed.

The International Conference on Information Technology and Applications (ICITA) is an initiative to address the considerations and challenges outlined above. Alongside these topics, the International Workshop on Information and Knowledge in the Internet of Things (IKIT) 2022 was run in conjunction with ICITA 2022 with a focus on the Internet of Things (IoT), and the 1st Workshop on Exponential Technologies in Business and Economics (Wetbe) was run with a focus on exponential technologies. ICITA 2022 attracted 138 submissions from 28 different countries across the world. Of these 138 submissions, 62 were accepted, which represents an acceptance rate of 44%. IKIT 2022 received 22 submissions, of which 10 papers were accepted, and Wetbe 2022 received 11 submissions, of which six were accepted. In all, 61 papers were selected for publication in this volume. The accepted papers are organized under four themes: Software Engineering; Machine Learning and Data Science; Network Security, Internet of Things, and Smart Technology; and Digital Media and Education. Each submission was reviewed by two to three reviewers considered experts on the topic of the submitted paper. The evaluation criteria covered correctness, originality, technical strength, significance, quality of presentation, interest, and relevance to the conference scope. This volume is published in the Lecture Notes in Networks and Systems series by Springer, which has a high SJR impact.

We would like to thank all Program Committee members as well as the additional reviewers for their effort in reviewing the papers. We hope that the topics covered in the ICITA proceedings will help readers to understand the intricacies of the methods and tools of software engineering, which have become an important element of nearly every branch of computer science.

We would like to extend our special thanks to the keynote speakers: Helga Hambrock, Senior Instructional Designer and Adjunct Professor in Educational Technology and Instructional Design at Concordia University, Chicago, USA; Anthony Lewis Brooks, Associate Professor, Department of Architecture, Design and Media Technology, Aalborg University, Denmark; José Manuel Machado, Director of Centro ALGORITMI and Director of the Doctoral Program in Biomedical Engineering, Department of Informatics/Centro ALGORITMI, School of Engineering, University of Minho, Portugal; and Ronnie Figueiredo, School of Technology and Management, Centre of Applied Research in Management and Economics (CARME), Polytechnic of Leiria, Portugal.

Dubai, United Arab Emirates: Abrar Ullah, Ph.D.
Peshawar, Pakistan: Sajid Anwar, Ph.D.

Contents

Machine Learning and Data Science

Intelligent Sepsis Detector Using Vital Signs Through Long Short-Term Memory Network (p. 3)
Farman Hassan, Auliya Ur Rahman, and Muhammad Hamza Mehmood

Implementation of Big Data and Blockchain for Health Data Management in Patient Health Records (p. 17)
António Pesqueira, Maria José Sousa, and Sama Bolog

Ambient PM2.5 Prediction Based on Prophet Forecasting Model in Anhui Province, China (p. 27)
Ahmad Hasnain, Muhammad Zaffar Hashmi, Basit Nadeem, Mir Muhammad Nizamani, and Sibghat Ullah Bazai

Potato Leaf Disease Classification Using K-means Cluster Segmentation and Effective Deep Learning Networks (p. 35)
Md. Ashiqur Rahaman Nishad, Meherabin Akter Mitu, and Nusrat Jahan

Diagnosis of Polycystic Ovarian Syndrome (PCOS) Using Deep Learning (p. 47)
Banuki Nagodavithana and Abrar Ullah

CataractEyeNet: A Novel Deep Learning Approach to Detect Eye Cataract Disorder (p. 63)
Amir Sohail, Huma Qayyum, Farman Hassan, and Auliya Ur Rahman

DarkSiL Detector for Facial Emotion Recognition (p. 77)
Tarim Dar and Ali Javed

Review and Enhancement of Discrete Cosine Transform (DCT) for Medical Image Fusion (p. 89)
Emadalden Alhatami, Uzair Aslam Bhatti, MengXing Huang, and SiLing Feng

Early Courier Behavior and Churn Prediction Using Machine Learning in E-Commerce Logistics (p. 99)
Barış Bayram, Eyüp Tolunay Küp, Coşkun Özenç Bilgili, and Nergiz Coşkun

Combining Different Data Sources for IIoT-Based Process Monitoring (p. 111)
Rodrigo Gomes, Vasco Amaral, and Fernando Brito e Abreu

Comparative Analysis of Machine Learning Algorithms for Author Age and Gender Identification (p. 123)
Zarah Zainab, Feras Al-Obeidat, Fernando Moreira, Haji Gul, and Adnan Amin

Prioritizing Educational Website Resource Adaptations: Data Analysis Supported by the k-Means Algorithm (p. 139)
Luciano Azevedo de Souza, Michelle Merlino Lins Campos Ramos, and Helder Gomes Costa

Voice Operated Fall Detection System Through Novel Acoustic Std-LTP Features and Support Vector Machine (p. 151)
Usama Zafar, Farman Hassan, Muhammad Hamza Mehmood, Abdul Wahab, and Ali Javed

Impact of COVID-19 on Predicting 2020 US Presidential Elections on Social Media (p. 163)
Asif Khan, Huaping Zhang, Nada Boudjellal, Bashir Hayat, Lin Dai, Arshad Ahmad, and Ahmed Al-Hamed

Health Mention Classification from User-Generated Reviews Using Machine Learning Techniques (p. 175)
Romieo John, V. S. Anoop, and S. Asharaf

Using Standard Machine Learning Language for Efficient Construction of Machine Learning Pipelines (p. 189)
Srinath Chiranjeevi and Bharat Reddy

Machine Learning Approaches for Detecting Signs of Depression from Social Media (p. 201)
Sarin Jickson, V. S. Anoop, and S. Asharaf

Extremist Views Detection: Definition, Annotated Corpus, and Baseline Results (p. 215)
Muhammad Anwar Hussain, Khurram Shahzad, and Sarina Sulaiman

Chicken Disease Multiclass Classification Using Deep Learning (p. 225)
Mahendra Kumar Gourisaria, Aakarsh Arora, Saurabh Bilgaiyan, and Manoj Sahni

Deepfakes Catcher: A Novel Fused Truncated DenseNet Model for Deepfakes Detection (p. 239)
Fatima Khalid, Ali Javed, Aun Irtaza, and Khalid Mahmood Malik

Benchmarking Innovation in Countries: A Multimethodology Approach Using K-Means and DEA (p. 251)
Edilvando Pereira Eufrazio and Helder Gomes Costa

Line of Work on Visible and Near-Infrared Spectrum Imaging for Vegetation Index Calculation (p. 263)
Shendry Rosero

Modeling and Predicting Daily COVID-19 (SARS-CoV-2) Mortality in Portugal (p. 275)
Alexandre Arriaga and Carlos J. Costa

Software Engineering

Digital Policies and Innovation: Contributions to Redefining Online Learning of Health Professionals (p. 289)
Andreia de Bem Machado, Maria José Sousa, and Gertrudes Aparecida Dandolini

Reference Framework for the Enterprise Architecture for National Organizations for Official Statistics: Literature Review (p. 299)
Arlindo Nhabomba, Bráulio Alturas, and Isabel Alexandre

Assessing the Impact of Process Awareness in Industry 4.0 (p. 311)
Pedro Estrela de Moura, Vasco Amaral, and Fernando Brito e Abreu

An Overview on the Identification of Software Birthmarks for Software Protection (p. 323)
Shah Nazir and Habib Ullah Khan

The Mayan Culture Video Game—“La Casa Maya” (p. 331)
Daniel Rodríguez-Orozco, Amílcar Pérez-Canto, Francisco Madera-Ramírez, and Víctor H. Menéndez-Domínguez

Impact of Decentralized and Agile Digital Transformational Programs on the Pharmaceutical Industry, Including an Assessment of Digital Activity Metrics and Commercial Digital Activities (p. 341)
António Pesqueira, Sama Bolog, Maria José Sousa, and Dora Almeida

Sprinting from Waterfall: The Transformation of University Teaching of Project Management (p. 353)
Anthony Chan, David Miller, Gopi Akella, and David Tien

Versioning: Representing Cultural Heritage Evidences on CIDOC-CRM via a Case Study (p. 363)
Ariele Câmara, Ana de Almeida, and João Oliveira

Toward a Route Optimization Modular System (p. 373)
José Pinto, Manuel Filipe Santos, and Filipe Portela

Intellectual Capital and Information Systems (Technology): What Does Some Literature Review Say? (p. 385)
Óscar Teixeira Ramada

REST, GraphQL, and GraphQL Wrapper APIs Evaluation. A Computational Laboratory Experiment (p. 397)
Antonio Quiña-Mera, Cathy Guevara-Vega, José Caiza, José Mise, and Pablo Landeta

Design Science in Information Systems and Computing (p. 409)
Joao Tiago Aparicio, Manuela Aparicio, and Carlos J. Costa

Organizational e-Learning Systems’ Success in Industry (p. 421)
Clemens Julius Hannen and Manuela Aparicio

Network Security

Smart Pilot Decontamination Strategy for High and Low Contaminated Users in Massive MIMO-5G Network (p. 435)
Khalid Khan, Farhad Banoori, Muhammad Adnan, Rizwan Zahoor, Tarique Khan, Felix Obite, Nobel John William, Arshad Ahmad, Fawad Qayum, and Shah Nazir

Cluster-Based Interference-Aware TDMA Scheduling in Wireless Sensor Networks (p. 449)
Gohar Ali

Messaging Application Using Bluetooth Low Energy (p. 459)
Nikhil Venkat Kumsetty, Sarvesh V. Sawant, and Bhawana Rudra

Scalable and Reliable Orchestration for Balancing the Workload Among SDN Controllers (p. 469)
José Moura

A Digital Steganography Technique Using Hybrid Encryption Methods for Secure Communication (p. 481)
Sharan Preet Kaur and Surender Singh

Utilizing Blockchain Technology to Enhance Smart Home Security and Privacy (p. 491)
Rehmat Ullah, Sibghat Ullah Bazai, Uzair Aslam, and Syed Ali Asghar Shah

Quality of Service Improvement of 2D-OCDMA Network Based on Two Half of ZCC Code Matrix (p. 499)
Mohanad Alayedi

The Impact of 5G Networks on Organizations (p. 511)
Anthony Caiche, Teresa Guarda, Isidro Salinas, and Cindy Suarez

Internet of Things and Smart Technology

The Fast Health Interoperability Resources (FHIR) and Integrated Care, a Scoping Review (p. 521)
João Pavão, Rute Bastardo, and Nelson Pacheco Rocha

Blockchain Based Secure Interoperable Framework for the Internet of Medical Things (p. 533)
Wajid Rafique, Babar Shah, Saqib Hakak, Maqbool Khan, and Sajid Anwar

WaterCrypt: Joint Watermarking and Encryption Scheme for Secure Privacy-Preserving Data Aggregation in Smart Metering Systems (p. 547)
Farzana Kabir, David Megías, and Tanya Koohpayeh Araghi

Technological Applications for Smart Cities: Mapping Solutions (p. 557)
Bruno Santos Cezario and André Luis Azevedo Guedes

Duty-Cycling Based Energy-Efficient Framework for Smart Healthcare System (SHS) (p. 567)
Bharti Rana and Yashwant Singh

Quality 4.0 and Smart Product Development (p. 581)
Sergio Salimbeni and Andrés Redchuk

Embedded Vision System Controlled by Dual Multi-frequency Tones (p. 593)
I. J. Orlando Guerrero, Ulises Ruiz, Loeza Corte, and Z. J. Hernadez Paxtian

Determinants of City Mobile Applications Usage and Success (p. 605)
Rita d’Orey Pape, Carlos J. Costa, Manuela Aparicio, and Miguel de Castro Neto

Sustainable Digital Transformation Canvas: Design Science Approach (p. 615)
Reihaneh Hajishirzi

The Role of Community Pharmacies in Smart Cities: A Brief Systematic Review and a Conceptual Framework (p. 629)
Carla Pires and Maria José Sousa

Digital Media and Education

Education in the Post-covid Era: Educational Strategies in Smart and Sustainable Cities (p. 645)
Andreia de Bem Machado, João Rodrigues dos Santos, António Sacavém, Marc François Richter, and Maria José Sousa

Digital Health and Wellbeing: The Case for Broadening the EU DigComp Framework (p. 655)
Anícia Rebelo Trindade, Debbie Holley, and Célio Gonçalo Marques

An Information Systems Architecture Proposal for the Thermalism Sector (p. 671)
Frederico Branco, Catarina Gonçalves, Ramiro Gonçalves, Fernando Moreira, Manuel Au-Yong-Oliveira, and José Martins

Technologies for Inclusive Wellbeing: IT and Transdisciplinary Applications (p. 683)
Anthony L. Brooks

Should the Colors Used in the Popular Products and Promotional Products Be Integrated? (p. 693)
Takumi Kato

Impact of Teacher Training on Student Achievement (p. 703)
Miguel Sangurima

Educational Data Mining: A Predictive Model to Reduce Student Dropout (p. 713)
Carlos Redroban, Jorge Saavedra, Marcelo Leon, Sergio Nuñez, and Fabricio Echeverria

Author Index (p. 723)

Editors and Contributors

About the Editors

Dr. Sajid Anwar is an Associate Professor at the Center of Excellence in Information Technology, Institute of Management Sciences (IMSciences), Peshawar, Pakistan. He received his M.S. (Computer Science, 2007) and Ph.D. (Software Engineering, 2011) degrees from NUCES-FAST, Islamabad. Previously, he was head of the Undergraduate Program in Software Engineering at IMSciences. Dr. Sajid Anwar is a leading expert in software architecture engineering and software maintenance prediction. His research interests are cross-disciplinary and industry focused and include search-based software engineering; prudent-based expert systems; customer analytics; active learning; and applying data mining and machine learning techniques to solve real-world problems. Dr. Sajid Anwar is an Associate Editor of Expert Systems (Wiley). He has been a Guest Editor of numerous journals, such as Neural Computing and Applications, Cluster Computing (Springer), Grid Computing (Springer), Expert Systems (Wiley), Transactions on Emerging Telecommunications Technologies (Wiley), and Computational and Mathematical Organization Theory (Springer). He is also a Board Committee Member of the Institute of Creative Advanced Technologies, Science and Engineering, Korea (iCatse.org). He has supervised many M.S. research students to completion, has conducted and led collaborative research with government organizations and academia, and has published over 50 research articles in prestigious conferences and journals.

Dr. Abrar Ullah is an Associate Professor and Director of Postgraduate Studies at the School of Mathematical and Computer Science, Heriot-Watt University, Dubai Campus. Abrar received his M.Sc. (Computer Science, 2000) from the University of Peshawar and his Ph.D. (Security and Usability) from the University of Hertfordshire, UK. Abrar has been working in industry and academia for over 20 years and has extensive experience in teaching and in the development of enterprise systems. Abrar started his teaching career in 2002 as a lecturer at the University of Peshawar and the Provincial Health Services Academy Peshawar. In 2008, Abrar joined the ABMU NHS UK as Lead Developer and contributed to a number of key systems in the NHS.

In 2011, Abrar joined professional services at Cardiff University as "Team Lead and Senior Systems Analyst" and led a number of successful strategic and national-level projects. In the same period, besides his professional role, he also worked as a lecturer in "Digital Media Design" for the School of Medicine, Cardiff University. In 2017, Abrar held the role of lecturer at the School of Management and Computer Science, Cardiff Metropolitan University. He also held the role of "Lead Developer" at the NHS—Health Education and Improvement Wales (HEIW) until 2019. Abrar is General Chair of the 16th ICITA conference to be held in Lisbon, Portugal, on 20-22 October 2022. His research interests are cross-disciplinary and industry focused, covering security engineering, information security, usability, usable security, online examinations and collusion detection, and applying machine learning techniques to solve real-world security problems. Abrar has published over 16 research articles in prestigious conferences and journals.

Dr. Álvaro Rocha holds the title of Honorary Professor and holds a D.Sc. in Information Science, a Ph.D. in Information Systems and Technologies, an M.Sc. in Information Management, and a B.Sc. in Computer Science. He is a Professor of Information Systems at ISEG—Lisbon School of Economics and Management, University of Lisbon. He is also President of AISTI (the Iberian Association for Information Systems and Technologies), Chair of the IEEE Portugal Section Systems, Man, and Cybernetics Society Chapter, and Editor-in-Chief of both JISEM (Journal of Information Systems Engineering and Management) and RISTI (Iberian Journal of Information Systems and Technologies). Moreover, he has served as Vice-Chair of Experts for the European Commission's Horizon 2020 program, and as an Expert at COST—Intergovernmental Framework for European Cooperation in Science and Technology, at the Government of Italy's Ministry of Education, Universities and Research, at the Government of Latvia's Ministry of Finance, at the Government of Mexico's National Council of Science and Technology, and at the Government of Poland's National Science Centre.

Dr. Maria José Sousa (Ph.D. in Industrial Management) is Pro-Rector for Distance Learning Development and a University Professor at ISCTE. She is also a research fellow at the Business Research Unit and held a Post-Doc position from 2016 to 2018 in digital learning and digital skills, researching those fields, with several publications in journals with high impact factors (Journal of Business Research, Journal of Grid Computing, Future Generation Computer Systems, and others). She collaborates as an expert in digital skills with Deloitte (Brussels), at the request of the European Commission, on the creation of a new category of digital skills to be integrated into the European Innovation Scoreboard (EIS). She was a member of the Coordinating Committee of the Ph.D. in Management at Universidade Europeia. She was also a Senior Researcher at GEE (Research Office) in the Portuguese Ministry of Economy, responsible for Innovation, Research, and Entrepreneurship Policies, and a Knowledge and Competencies Manager at AMA, IP, the Public Reform Agency (Ministry of the Presidency and the Council of Ministers). She was also a Project Manager at the Ministry of Labor and Employment, responsible for Innovation and for the Evaluation and Development of the Qualifications Projects.

Her research interests currently are public policies, health policies, innovation, and information science. She has conducted major research on innovation policies, with articles published in high-level journals (such as European Planning Studies, Information Systems Frontiers, Systems Research and Behavioral Science, Computational and Mathematical Organization Theory, Future Generation Computer Systems, and others). She is also the guest editor of more than five special issues from Springer and Elsevier. She has participated in European innovation-transfer projects (for example, as an Ambassador of EUWIN and as co-coordinator of an Erasmus+ project on entrepreneurship with ATO, the Chamber of Commerce of Ankara), is an External Expert of the COST Association (European Cooperation in Science and Technology), and is a former President of ISO/TC 260 (Human Resources Management), representing Portugal in the International Organization for Standardization.

Contributors Fernando Brito e Abreu ISTAR-IUL & ISCTE-Instituto Universitário de Lisboa, Lisboa, Portugal Muhammad Adnan Department of Computer Science, University of Swabi, Swabi, Pakistan Arshad Ahmad Institute of Software Systems Engineering, Johannes Kepler University, Linz, Austria; Department of IT and Computer Science, Pak-Austria Fachhochschule: Institute of Applied Sciences and Technology, Haripur, Pakistan Gopi Akella Charles Sturt University, Wagga Wagga, New South Wales, Australia Ahmed Al-Hamed School of Computer Science and Technology, Beijing Institute of Technology, Beijing, China Feras Al-Obeidat College of Technological Innovation, Zayed University, Abu Dhabi, UAE Mohanad Alayedi Department of Electronics, Ferhat Abbas University of Setif 1, Setif, Algeria Isabel Alexandre Iscte – Instituto Universitário de Lisboa, Lisboa, Portugal Emadalden Alhatami School of Information and Communication Engineering, Hainan University, Haikou, China Gohar Ali Department of Information Systems and Technology, Sur University College, Sur, Oman Dora Almeida Independent Researcher, Lisboa, Portugal

Bráulio Alturas Iscte – Instituto Universitário de Lisboa, Lisboa, Portugal Vasco Amaral NOVA LINCS & NOVA School of Science and Technology, Caparica, Portugal Adnan Amin Center for Excellence in Information Technology, Institute of Management Sciences, Peshawar, Pakistan V. S. Anoop Kerala Blockchain Academy, Kerala University of Digital Sciences, Innovation and Technology, Thiruvananthapuram, India; School of Digital Sciences, Kerala University of Digital Sciences, Innovation and Technology, Thiruvananthapuram, India Sajid Anwar College of Information Technology, Zayed University, Academic, UAE Joao Tiago Aparicio INESC-ID, Instituto, Superior Técnico, University of Lisbon, Lisbon, Portugal Manuela Aparicio NOVA Information Management School (NOVA IMS), Universidade Nova de Lisboa, Lisbon, Portugal Tanya Koohpayeh Araghi Internet Interdiscipinary Institute (IN3), Center for Cybersecurity Research of Catalonia (CYBERCAT), Universitat Oberta de Catalunya, Barcelona, Spain Aakarsh Arora School of Computer Engineering, KIIT Deemed to Be University, Bhubaneswar, Odisha, India Alexandre Arriaga ISEG (Lisbon School of Economics and Management), Universidade de Lisboa, Lisbon, Portugal S. Asharaf Kerala University of Digital Sciences, Innovation and Technology, Thiruvananthapuram, India Uzair Aslam People’s Primary Healthcare Initiative (PPHI) Sindh, Karachi, Pakistan Manuel Au-Yong-Oliveira INESC TEC, GOVCOPP, Department of Economics, Management, Industrial Engineering and Tourism, University of Aveiro, Aveiro, Portugal Farhad Banoori South China University of Technology (SCUT), Guangzhou, China Rute Bastardo UNIDCOM, Science and Technology School, University of Trásos-Montes and Alto Douro, Vila Real, Portugal Barı¸s Bayram HepsiJET, ˙Istanbul, Turkey Sibghat Ullah Bazai College of Information and Communication Technology, BUITEMS, Quetta, Pakistan; Department of Computer Engineering, BUITEMS, Quetta, Pakistan


Uzair Aslam Bhatti School of Information and Communication Engineering, Hainan University, Haikou, China Saurabh Bilgaiyan School of Computer Engineering, KIIT Deemed to Be University, Bhubaneswar, Odisha, India Coşkun Özenç Bilgili HepsiJET, İstanbul, Turkey Sama Bolog University of Basel, Basel, Switzerland Nada Boudjellal School of Computer Science and Technology, Beijing Institute of Technology, Beijing, China; The Faculty of New Information and Communication Technologies, University Abdelhamid Mehri Constantine 2, Constantine, Algeria Frederico Branco Universidade de Trás-os-Montes e Alto Douro, Vila Real, Portugal; INESC TEC, Porto, Portugal Anthony L. Brooks CREATE, Aalborg University, Aalborg, Denmark Anthony Caiche Universidad Estatal Peninsula de Santa Elena, La Libertad, Ecuador; CIST—Centro de Investigación en Sistemas y Telecomunicaciones, La Libertad, Ecuador José Caiza Universidad de Las Fuerzas Armadas ESPE, Latacunga, Ecuador Ariele Câmara Centro de Investigação em Ciências da Informação, Tecnologias e Arquitetura, Instituto Universitário de Lisboa (ISCTE-IUL), Lisboa, Portugal Bruno Santos Cezario Centro Universitario Augusto Motta, Rio de Janeiro, Brasil Anthony Chan Charles Sturt University, Wagga Wagga, New South Wales, Australia Srinath Chiranjeevi Vellore Institute of Technology, Bhopal, India Loeza Corte Universidad de la Cañada, Teotitlán de Flores Magón, Oax, México Carlos J. Costa Advance/ISEG—Lisbon School of Economics and Management, Universidade de Lisboa, Lisbon, Portugal Helder Gomes Costa Universidade Federal Fluminense, Niterói, RJ, Brazil Nergiz Coşkun HepsiJET, İstanbul, Turkey Lin Dai School of Computer Science and Technology, Beijing Institute of Technology, Beijing, China Gertrudes Aparecida Dandolini Engineering and Knowledge Management Department, Federal University of Santa Catarina, Santa Catarina, Brazil


Tarim Dar University of Engineering and Technology-Taxila, Department of Software Engineering, Taxila, Pakistan Ana de Almeida Centro de Investigaçao em Ciências da Informaçao, Tecnologias e Arquitetura, Instituto Universitário de Lisboa (ISCTE-IUL), Lisboa, Portugal; Centre for Informatics and Systems of the University of Coimbra (CISUC), Coimbra, Portugal Miguel de Castro Neto NOVA Information Management School (NOVA IMS), Universidade Nova de Lisboa, Lisbon, Portugal Pedro Estrela de Moura NOVA School of Science and Technology, Caparica, Portugal Luciano Azevedo de Souza Universidade Federal Fluminense, Niterói, RJ, Brazil João Rodrigues dos Santos Economics and Business Department, Universidade Europeia/IADE, Lisbon, Portugal Rita d’Orey Pape EIT InnoEnergy SE, Eindhoven, The Netherlands; ISCTE-IUL, Lisboa, Portugal Fabricio Echeverria Universidad ECOTEC, Samborondón, Ecuador Edilvando Pereira Eufrazio Universidade Federal Fluminense, Niterói, Brazil SiLing Feng School of Information and Communication Engineering, Hainan University, Haikou, China Rodrigo Gomes NOVA School of Science and Technology, Caparica, Portugal Catarina Gonçalves AquaValor – Centro de Valorização e Transferência de Tecnologia da Água, Chaves, Portugal Ramiro Gonçalves Universidade de Trás-os-Montes e Alto Douro, Vila Real, Portugal; INESC TEC, Porto, Portugal; AquaValor – Centro de Valorização e Transferência de Tecnologia da Água, Chaves, Portugal Mahendra Kumar Gourisaria School of Computer Engineering, KIIT Deemed to Be University, Bhubaneswar, Odisha, India Teresa Guarda Universidad Estatal Peninsula de Santa Elena, La Libertad, Ecuador; CIST—Centro de Investigación en Sistemas y Telecomunicaciones, La Libertad, Ecuador André Luis Azevedo Guedes Centro Universitario Augusto Motta, Rio de Janeiro, Brasil Cathy Guevara-Vega Universidad Técnica del Norte, Ibarra, Ecuador; eCIER Research Group, Universidad Técnica del Norte, Ibarra, Ecuador


Haji Gul Center for Excellence in Information Technology, Institute of Management Sciences, Peshawar, Pakistan Reihaneh Hajishirzi Advance/ISEG (Lisbon School of Economics & Management), Universidade de Lisboa, Lisbon, Portugal Saqib Hakak Faculty of Computer Science, Canadian Institute for Cybersecurity, University of New Brunswick, Fredericton, Canada Clemens Julius Hannen NOVA Information Management School (NOVA IMS), Universidade Nova de Lisboa, Lisbon, Portugal; aboDeinauto, Berlin, Germany Muhammad Zaffar Hashmi Department of Chemistry, COMSATS University Islamabad, Islamabad, Pakistan Ahmad Hasnain Key Laboratory of Virtual Geographic Environment, Ministry of Education, Nanjing Normal University, Nanjing, China; School of Geography, Nanjing Normal University, Nanjing, China; Jiangsu Center for Collaborative Innovation in Geographical Information, Resource Development and Application, Nanjing, China Farman Hassan University of Engineering and Technology, Taxila, Punjab, Pakistan Bashir Hayat Institute of Management Sciences Peshawar, Peshawar, Pakistan Z. J. Hernadez Paxtian Universidad de la Cañada, Teotitlán de Flores Magón, Oax, México Debbie Holley Department of Nursing Sciences, Bournemouth University, Poole, England MengXing Huang School of Information and Communication Engineering, Hainan University, Haikou, China Muhammad Anwar Hussain Department of Computer Science, University of Technology Malaysia, Johor Bahru, Malaysia Aun Irtaza Department of Computer Science, University of Engineering and Technology, Taxila, Pakistan Nusrat Jahan Department of CSE, Daffodil International University, Dhaka, Bangladesh Ali Javed Department of Software Engineering, University of Engineering and Technology, Taxila, Pakistan Sarin Jickson Kerala Blockchain Academy, Kerala University of Digital Sciences, Innovation and Technology, Thiruvananthapuram, India Romieo John Kerala Blockchain Academy, Kerala University of Digital Sciences, Innovation and Technology, Thiruvananthapuram, India


Farzana Kabir Internet Interdisciplinary Institute (IN3), Center for Cybersecurity Research of Catalonia (CYBERCAT), Universitat Oberta de Catalunya, Barcelona, Spain Takumi Kato Meiji University, Tokyo, Japan Sharan Preet Kaur Chandigarh University, Chandigarh, India Fatima Khalid Department of Computer Science, University of Engineering and Technology, Taxila, Pakistan Asif Khan School of Computer Science and Technology, Beijing Institute of Technology, Beijing, China Habib Ullah Khan Department of Accounting & Information Systems, College of Business & Economics, Qatar University, Doha, Qatar Khalid Khan Beijing University of Posts and Telecommunications, Beijing, China Maqbool Khan Pak-Austria Fachhochschule Institute of Applied Sciences and Technology, Haripur, Pakistan; Software Competence Center Hagenberg, Vienna, Austria Tarique Khan University of Politechnico Delle Marche, Ancona, Italy Nikhil Venkat Kumsetty National Institute of Technology Karnataka, Surathkal, India Eyüp Tolunay Küp HepsiJET, İstanbul, Turkey Pablo Landeta Universidad Técnica del Norte, Ibarra, Ecuador Marcelo Leon Universidad ECOTEC, Samborondón, Ecuador Andreia de Bem Machado Engineering and Knowledge Management Department, Federal University of Santa Catarina, Santa Catarina, Brazil Francisco Madera-Ramírez Universidad Autónoma de Yucatán, Mérida, México Khalid Mahmood Malik Department of Computer Science and Engineering, Oakland University, Rochester, MI, USA Célio Gonçalo Marques Polytechnic Institute of Tomar, Tomar, Portugal; Laboratory of Pedagogical, Innovation and Distance Learning (LIED.IPT), Tomar, Portugal José Martins INESC TEC, Porto, Portugal; AquaValor – Centro de Valorização e Transferência de Tecnologia da Água, Chaves, Portugal David Megías Internet Interdisciplinary Institute (IN3), Center for Cybersecurity Research of Catalonia (CYBERCAT), Universitat Oberta de Catalunya, Barcelona, Spain


Muhammad Hamza Mehmood University of Engineering and Technology, Taxila, Pakistan Víctor H. Menéndez-Domínguez Universidad Autónoma de Yucatán, Mérida, México David Miller Charles Sturt University, Wagga Wagga, New South Wales, Australia José Mise Universidad de Las Fuerzas Armadas ESPE, Latacunga, Ecuador Meherabin Akter Mitu Department of CSE, Daffodil International University, Dhaka, Bangladesh Fernando Moreira REMIT, IJP, Universidade Portucalense, Porto, Portugal; IEETA, Universidade de Aveiro, Aveiro, Portugal José Moura Instituto de Telecomunicações (IT), Instituto Universitário de Lisboa (ISCTE-IUL), Lisboa, Portugal Basit Nadeem Department of Geography, Bahauddin Zakariya University, Multan, Pakistan Banuki Nagodavithana Heriot-Watt University, Dubai, UAE Shah Nazir Department of Computer Science, University of Swabi, Swabi, Pakistan Arlindo Nhabomba Iscte – Instituto Universitário de Lisboa, Lisboa, Portugal Md. Ashiqur Rahaman Nishad Department of CSE, Daffodil International University, Dhaka, Bangladesh Mir Muhammad Nizamani School of Ecology, Hainan University, Haikou, China Sergio Nuñez Universidad del Pacifico, Guayaquil, Ecuador Felix Obite Department of Physics, Ahmadu Bello University, Zaria, Nigeria João Oliveira Centro de Investigação em Ciências da Informação, Tecnologias e Arquitetura, Instituto Universitário de Lisboa (ISCTE-IUL), Lisboa, Portugal; Instituto de Telecomunicações, Lisboa, Portugal I. J. Orlando Guerrero Universidad de la Cañada, Teotitlán de Flores Magón, Oax, México João Pavão INESC-TEC, Science and Technology School, University of Trás-os-Montes and Alto Douro, Vila Real, Portugal Amílcar Pérez-Canto Universidad Autónoma de Yucatán, Mérida, México António Pesqueira ISCTE-Instituto Universitário de Lisboa, Lisbon, Portugal José Pinto Algoritmi Research Centre, University of Minho, Guimarães, Portugal


Carla Pires CBIOS - Universidade Lusófona’s Research Center for Biosciences and Health Technologies, Lisbon, Portugal Filipe Portela Algoritmi Research Centre, University of Minho, Guimarães, Portugal; IOTECH—Innovation on Technology, Trofa, Portugal Fawad Qayum Department of Computer Science, IT University of Malakand, Totakan, Pakistan Huma Qayyum UET Taxila, Punjab, Pakistan Antonio Quiña-Mera Universidad Técnica del Norte, Ibarra, Ecuador; eCIER Research Group, Universidad Técnica del Norte, Ibarra, Ecuador Wajid Rafique Department of Computer Science and Operations Research, University of Montreal, Quebec, Canada Auliya Ur Rahman University of Engineering and Technology, Taxila, Punjab, Pakistan Óscar Teixeira Ramada ISCE - Douro - Instituto Superior de Ciências Educativas do Douro, Porto, Portugal Michelle Merlino Lins Campos Ramos Universidade Federal Fluminense, Niterói, RJ, Brazil

Bharti Rana Department of Computer Science and Information Technology, Central University of Jammu, Samba, Jammu & Kashmir, India Andrés Redchuk Universidad Rey Juan Carlos, Madrid, España Bharat Reddy National Institute of Technology, Calicut, India Carlos Redroban Universidad ECOTEC, Samborondón, Ecuador Marc François Richter Postgraduate Program in Environment and Sustainability (PPGAS), Universidade Estadual do Rio Grande do Sul, Porto Alegre, Brazil Nelson Pacheco Rocha IEETA, Department of Medical Sciences, University of Aveiro, Aveiro, Portugal Daniel Rodríguez-Orozco Universidad Autónoma de Yucatán, Mérida, México Shendry Rosero Universidad Estatal Península de Santa Elena, La Libertad, Ecuador Bhawana Rudra National Institute of Technology Karnataka, Surathkal, India Ulises Ruiz Instituto nacional de astrofísica óptica y electrónica. Sta María Tonantzintla, San Andrés Cholula, Pue, México Jorge Saavedra Universidad Estatal Peninsula de Santa Elena, La Libertad, Ecuador


António Sacavém Economics and Business Department, Universidade Europeia/IADE, Lisbon, Portugal

Manoj Sahni Department of Mathematics, Pandit Deendayal Energy University, Gandhinagar, Gujarat, India Sergio Salimbeni Universidad del Salvador, Buenos Aires, Argentina Isidro Salinas Universidad Estatal Peninsula de Santa Elena, La Libertad, Ecuador; CIST—Centro de Investigación en Sistemas y Telecomunicaciones, La Libertad, Ecuador Miguel Sangurima Universidad Católica Andrés Bello, Caracas, Venezuela Manuel Filipe Santos Algoritmi Research Centre, University of Minho, Guimarães, Portugal

Sarvesh V. Sawant National Institute of Technology Karnataka, Surathkal, India Babar Shah Center of Excellence in IT, Institute of Management Sciences, Peshawar, Pakistan Syed Ali Asghar Shah Department of Computer Engineering, BUITEMS, Quetta, Pakistan Khurram Shahzad Department of Data Science, University of the Punjab, Lahore, Pakistan Surender Singh Chandigarh University, Chandigarh, India Yashwant Singh Department of Computer Science and Information Technology, Central University of Jammu, Samba, Jammu & Kashmir, India Amir Sohail UET Taxila, Punjab, Pakistan Maria José Sousa University Institute of Lisbon (ISCTE), Lisbon, Portugal Cindy Suarez Universidad Estatal Peninsula de Santa Elena, La Libertad, Ecuador; CIST—Centro de Investigación en Sistemas y Telecomunicaciones, La Libertad, Ecuador Sarina Sulaiman Department of Computer Science, University of Technology Malaysia, Johor Bahru, Malaysia David Tien Charles Sturt University, Wagga Wagga, New South Wales, Australia Anícia Rebelo Trindade Polytechnic Institute of Tomar, Tomar, Portugal; Educational Technology Laboratory (LabTE), University of Coimbra, Coimbra, Portugal Abrar Ullah Heriot-Watt University, Dubai, UAE Rehmat Ullah Department of Computer Engineering, BUITEMS, Quetta, Pakistan Abdul Wahab University of Engineering and Technology, Taxila, Pakistan


Nobel John William University of Valencia, Valencia, Spain Usama Zafar University of Engineering and Technology, Taxila, Pakistan Rizwan Zahoor University of Campania Luigi Vanvitelli, Caserta, Italy Zarah Zainab City University of Science and Information Technology, Peshawar, Pakistan Huaping Zhang School of Computer Science and Technology, Beijing Institute of Technology, Beijing, China

Machine Learning and Data Science

Intelligent Sepsis Detector Using Vital Signs Through Long Short-Term Memory Network Farman Hassan, Auliya Ur Rahman, and Muhammad Hamza Mehmood

Abstract Sepsis has become a primary source of mortality among patients treated in intensive care units. Timely detection of sepsis helps to decrease the mortality rate, as the patient becomes difficult to treat once the symptoms worsen. The primary objective of this work is the early detection of sepsis patients using a deep learning model, followed by a comparative analysis of the proposed system with other modern techniques to assess its performance. In this work, we applied a long short-term memory model to a sepsis patient dataset. Three performance metrics are used to evaluate the proposed system, i.e., accuracy, specificity, and AUROC. The results were obtained for three different windows after the patient was admitted to the intensive care unit, namely, 4, 8, and 12 h window sizes. The proposed system achieved accuracy, specificity, and AUROC of 77, 75, and 91%, respectively. The comparison of the proposed system with other state-of-the-art techniques on the basis of these performance metrics demonstrates the significance of the proposed system and shows that it is reliable enough to be deployed in real-time environments. Keywords Sepsis · Deep learning · Long short-term memory · ICUs

1 Introduction
Sepsis is a major topic in medical research, and different definitions have been used for it: sepsis is a disease involving internal organ disorder of the human body [1], and it is a critical condition created by the overwhelming immune response of a patient to an infection [2]. Sepsis has also been defined as a syndrome without a criterion standard diagnosis [3]. During a sepsis infection, the patient's immune system becomes unbalanced; in the case of an infection, the human body releases a special fluid against the disease to minimize its effect.

F. Hassan · A. U. Rahman (B) · M. H. Mehmood University of Engineering and Technology, Taxila, Pakistan e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 S. Anwar et al. (eds.), Proceedings of International Conference on Information Technology and Applications, Lecture Notes in Networks and Systems 614, https://doi.org/10.1007/978-981-19-9331-2_1

The infection enters the human body through the lungs, which receive deoxygenated blood from the heart and provide oxygenated blood back to it; after being infected, the lungs pass the sepsis infection to the blood vessels, which causes disorder of multiple human organs and ultimately results in the death of the patient [3]. Figure 1 shows the sepsis life cycle affecting human organs. According to a World Health Organization (WHO) report, annually about 1.7 million people are affected by sepsis and about 0.27 million patients die of the disease in the USA, while about 30 million people are affected by sepsis worldwide and about 20% of them, i.e., 6 million patients, die every year [4]. The diagnosis and treatment of sepsis are costly, and a large portion of the annual health budget, about 24 billion U.S. dollars, is consumed for this purpose [4]. Early sepsis detection through artificial intelligence techniques would reduce the extra budget required for sepsis diagnosis, and the chances of patients surviving would increase, as patients far from diagnostic centers would also be able to check for the presence of sepsis using sepsis detection devices and start their treatment as soon as possible [4]. Sepsis diagnosis is a major challenge even for sepsis experts and the doctors who deal with sepsis patients on a daily basis. SIRS criteria are mostly used as the sepsis definition to predict and detect sepsis; they are based on body temperature and several other common symptoms.

Fig. 1 Life cycle of sepsis


These symptoms include cold, cough, etc. [5, 6]. Sepsis organ failure assessment (SOFA), qSOFA, and MEWS are also used as rule-based criteria for sepsis [7]. The SIRS criteria mentioned by [7] are listed as follows:
• PACO2 – 12 × 10⁹ cell/L
• Heart Rate – >92 beats/min
• Breathing Rate – 22 breaths/min
Recent research in sepsis detection has focused on patients with positive sepsis conditions, while some methods focused on patients in the ICU so that their health records could be utilized for early detection and prediction of sepsis [7]. K-nearest neighbors (KNN), recurrent neural networks (RNN), gated recurrent units (GRU), and long short-term memory (LSTM) units have been utilized for sepsis detection with remarkable accuracy and other evaluation parameters [8]. Among all the algorithms for the detection of sepsis, neural networks are the leading ones in almost every respect. Specifically, researchers have employed neural network-based methods using the standard dataset available at physionet.org. The dataset contains tabular data with hourly records of about 40,000 patients, including vital signs as well as other clinical values. Researchers have applied pre-processing techniques such as forward filling, backward filling, and mean imputation; because the datasets are unbalanced, random forests, SVMs, and neural networks of various types have been used to handle such datasets and avoid biased results [7–9]. Some researchers have evaluated their algorithms on the basis of accuracy, some on the basis of sensitivity and specificity, and some on the basis of accuracy, sensitivity, specificity, area under the ROC curve, precision, and recall [4, 10]. Various techniques have been applied to the sepsis detection problem, and researchers have reported different evaluation results for the different techniques, using both machine learning and deep learning algorithms. Barton et al. [11] trained their algorithm with the k-fold cross-validation technique and selected 80% of the data for training, 10% for validation, and the remaining 10% for testing. The authors applied various classifiers and compared their accuracy: the convolutional neural network (CNN)-LSTM was at the top of the list with 95%, CNN-LSTM + transfer was second with 90%, extreme gradient boosting (XGB) was third, and KNN-DTW gave the lowest accuracy of 46%. Biglarbeigi et al. [12] used only eight variables out of 40, namely, temperature, heart rate, systolic blood pressure, respiration rate, platelets, WBC, glucose, and creatinine, obtained by applying a feature extraction technique. They divided the dataset into four groups, namely, sets A, B, C, and D, and further divided the data into an 80% training set and a 20% testing set. For classification, the KNN classifier was utilized, and for a better evaluation, they repeated the same method by selecting set A, with 20,336 patient records, as the training set and set B, with 20,000 patient records, as the testing set. A training accuracy of 99.7% and a testing accuracy of 99.6% were attained using the 80/20% training and testing split. Yao et al. [13] applied three classifiers, namely, decision tree, random forest, and logistic regression (LR). Furthermore, the authors used different autoencoders, namely, a spatial autoencoder, a temporal autoencoder, a spatial–temporal autoencoder, and a temporal plus spatial autoencoder, and compared the results of all three classifiers with them. The decision tree obtained an accuracy of 67% using the temporal autoencoder, the random forest obtained an accuracy of 72.2% using the temporal plus spatial autoencoder, and the logistic regression obtained an accuracy of 60.4% using the temporal autoencoder. Eskandari et al. [14] used biomarkers with machine learning algorithms for sepsis detection and used the physiological data of [6] for training and testing. Pre-processing was applied by sorting the data; after that, several algorithms were applied to the dataset and their results were compared: the accuracy for KNN, SOFA, qSOFA, random forest, and multi-layer perceptron was 99, 64.5, 88.5, 98, and 98%, respectively. Rodríguez et al. [15] applied supervised machine learning algorithms for sepsis detection and used data collected from the ICUs of three high-level university hospitals, named HPTU, HPTU, and IPS University, all located in Colombia. The data were from patients above 18 years of age; incomplete entries were skipped and SIRS criteria were utilized for sepsis identification. Several classifiers were applied for sepsis detection and their results compared: the accuracy for random forest, support vector machine (SVM-ANOVA), SVM-dot, and neural network (NN) was 62.4, 61.7, 61.4, and 62.5%, respectively. Chen and Hernández [16] designed a model that performed data pre-processing, feature engineering, model tuning, and analysis before the final stage of implementation. The dataset was imbalanced, and to resolve this issue a random forest, which is highly suitable for imbalanced datasets, was applied; the accuracy for the full model was 81.88% and for the compact model 78.62%. Chicco et al. [17] used the clinical records of about 364 patients of the Methodist Medical Center and Proctor Hospital; the dataset contained 189 men and 175 women aged between 20 and 86 years, with records of patients who stayed between 1 and 48 days at the hospital. Various algorithms were applied and their results compared: the accuracy of random forest, MLP, LR, DL, NB, SVM (linear), KNN, SVM (kernel), and DT was 32, 31, 31, 30, 27, 26, 23, 22, and 18%, respectively. Researchers have also utilized deep learning approaches for the detection of sepsis. Al-Mualemi et al. [7] used electronic health records to detect septic shock and severe sepsis conditions. For training, patient records with severe sepsis conditions were utilized, and the eight initial vital signs were used for sepsis prediction; among the vital signs used were H.R., Temp, S.B.P., and M.A.P. SIRS criteria were used for the definition of septic shock, and a deep learning algorithm was used for classification.

The results of RNN-LSTM, an SVM with quadratic kernel, and adaptive-CNN were compared: the training accuracy of RNN-LSTM, SVM-quadratic kernel, and adaptive-CNN was 92.72, 78.00, and 93.84%, respectively, while the testing accuracy was 91.10, 68.00, and 93.18%, respectively. Alvi et al. [18] used deep neural networks for the early detection of neonatal sepsis, a sepsis condition concerning the mother and the newborn baby [4, 19]. Two datasets were used, namely, Lopez Martinez and AJ Masino; both were obtained from different fields of study and thus gave a variety of data for training and testing. The Lopez Martinez dataset contained about 33% sepsis cases and 66% non-sepsis cases, i.e., roughly a 1:2 imbalance, and it contained 50 columns including labels; if the labels are removed, a 7 × 7 matrix can easily be formed that resembles the handwritten-digit dataset MNIST. Artificial neural networks (ANN), CNN, and LSTM-RNN were applied and their results compared: the LSTM-RNN gave the highest accuracy of 99.40%, while the accuracy of ANN and CNN was 98.20 and 97.21%, respectively. Kok et al. [9, 20] applied a temporal convolutional network for sepsis detection and used Gaussian process regression (GPR) to predict the distribution of possible values for the missing entries. The temporal convolutional network was trained on a training dataset and obtained a training accuracy of 95.5% and a testing accuracy of 80.0%, while on the basis of time-step metrics the accuracy was 98.8%. Fu et al. [21] applied a convolutional neural network for sepsis detection; missing values were replaced with 0 for the CNN and −1 for the RNN. To remove the effects of missing values, a feature selection technique was applied that selected 11 features out of 40, and both CNN and RNN-LSTM were bagged. By averaging the outcomes of the CNN and RNN-LSTM, an ensemble model was formed and applied to the testing dataset. The performances of the CNN, the RNN, and the ensemble were compared: the accuracy was 89.4, 87.5, and 92.7%, respectively. Wickramaratne and Shaad Mahmud [8] shifted the labels of the dataset 6 h ahead for early prediction of sepsis and used the initial eight vital signs. Labels were encoded to 1 and 0 for sepsis and normal cases using one-hot encoding before being fed into the network. They applied a GRU, first using only vital signs and then vital signs together with laboratory values; the overall accuracy of the GRU was 99.8%, the accuracy with only vital signs was 96.7%, and with vital signs plus laboratory values it was 98.1%. Baral et al. [22] used a bi-directional LSTM for sepsis prediction. They applied a feature extraction technique using MLP and LSTM; because the dataset was unbalanced, they used the Synthetic Minority Over-sampling Technique (SMOTE), and to handle the irregular time series they applied a bucketing technique before applying the bi-directional LSTM for classification. They compared the results of a state-of-the-art algorithm and the proposed one: the accuracy of the state of the art was 85.7%, while the accuracy of the proposed solution was 92.5%. Rafiei et al. [23] applied a fully connected LSTM-CNN model for sepsis detection, utilizing the dataset available at Physionet.org (Barton et al. 2021). They used two modes: the first with vital signs and demographic values of the patient, and the second using clinical values of the patient; in the first mode they used an LSTM, while in the second mode they used a GRU for sepsis prediction.

In the first mode, the measured accuracy for the 4, 8, and 12 h windows was 75, 69, and 72%, respectively, while in the second mode the accuracy for 4, 8, and 12 h was 68, 66, and 62%, respectively. Van Steenkiste et al. [24] applied a BiLSTM neural network for sepsis detection, using the health records of the ICU department of Ghent University Hospital in Belgium, which contained the records of 2177 patients, and utilized SVM, ANN, KNN, and LR for comparison with their algorithm; the accuracy of BiLSTM, ANN, SVM, KNN, and LR was 0.83, 0.56, 0.55, 0.35, and 0.54, respectively. Liu et al. [25] applied a bi-directional long short-term memory together with a medical-grade wearable multisensor system for sepsis prediction. They used electronic health records from the MIMIC-III database for training and testing of their model, selected the records of 5699 patients aged over 18 years, applied a forward filling approach to fill up the 30% empty values, and obtained 2748 non-sepsis cases and 2130 sepsis cases. They used fivefold cross-validation for training and compared their algorithm with CNN, LSTM, XGBOOST, MLP, and random forest; the accuracy of TOP-NET, CNN, LSTM, XGBOOST, MLP, and random forest was 90.1, 89.3, 89.8, 88.3, 87.9, and 87.3%, respectively. Silva et al. [26] used DeepSigns for sepsis prediction, applying a deep-learning-based LSTM algorithm to time-series electronic health records of patients; the baseline algorithm APACHE II uses 16 attributes including age, whereas DeepSigns used 15 attributes, excluding age. They trained for 10, 15, 20, and 25 epochs and evaluated their algorithm using accuracy, reaching a value of 81.50%.
The above literature has shown improvement in the detection of sepsis; however, none of these works has focused on smaller windows, for example, 4, 8, and 12 h. There is a need for a sepsis detection system that better monitors the condition of patients with sepsis shortly after they are admitted to the ICU. Therefore, we propose an LSTM-based sepsis detector for the early detection of sepsis in smaller windows, namely, 4, 8, and 12 h. Our major contributions in this work are as follows:
• We designed a novel LSTM-based approach for the early detection of sepsis.
• We evaluated the performance of our approach over brief periods of time, particularly 4, 8, and 12 h.
• The proposed technique successfully detects sepsis and non-sepsis cases with a better detection rate.
• For validation of our approach, rigorous experiments were conducted using clinical data.
• A comparative assessment with existing approaches indicates that our approach is capable of detecting sepsis and can be employed in emergency centers.
We organized the remaining manuscript as follows: Sect. 2 discusses the proposed working mechanism, Sect. 3 gives the details of the experimental evaluation, and finally, Sect. 4 concludes this work.


2 Proposed Methodology
The main purpose of this work is to detect sepsis patients using the clinical data available at PhysioNet (Barton et al. 2021). Our approach comprises three main stages, namely, pre-processing, feature extraction, and finally, classification. In the initial stage, we employed two techniques, namely, forward filling and backward filling of the MAP and DBP attributes. In the second stage, we selected those patients that have more than one observation and whose clinical data are available at regular hourly intervals, and we additionally filtered for patients whose prediction falls in the range of 3–24 h. In the classification stage, we selected only the vital signs from both the training and testing sets; the LSTM was trained on the training set, while the testing set was used for evaluation. Finally, our approach detects whether a person has sepsis or is a normal (non-sepsis) case. The detailed working procedure of our approach is shown in Fig. 2.
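A rough sketch of the patient selection and vital-sign filtering described above is given below. It is an illustrative interpretation rather than the authors' exact code: the patient_id column, the use of ICULOS as the hourly index, and the reading of the 3–24 h filter as a constraint on the record length are all assumptions based on the PhysioNet challenge file format.

```python
import pandas as pd

# Vital-sign columns of the PhysioNet sepsis challenge files (assumed names).
VITAL_SIGNS = ["HR", "O2Sat", "Temp", "SBP", "MAP", "DBP", "Resp", "EtCO2"]

def select_patients(records: pd.DataFrame) -> pd.DataFrame:
    """Keep patients with more than one hourly observation and a record
    length compatible with predictions in the 3-24 h range."""
    n_obs = records.groupby("patient_id")["ICULOS"].transform("count")
    records = records[n_obs > 1]
    stay_len = records.groupby("patient_id")["ICULOS"].transform("max")
    return records[(stay_len >= 3) & (stay_len <= 24)]

def vital_sign_view(records: pd.DataFrame) -> pd.DataFrame:
    """Restrict the feature set to the eight vital signs plus the label."""
    return records[["patient_id", "ICULOS"] + VITAL_SIGNS + ["SepsisLabel"]]
```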

2.1 Pre-processing
Pre-processing was performed on the PhysioNet dataset in order to remove its flaws and make it usable. The dataset was obtained from Barton et al. (2021) and contains two classes of data, named A and B. These classes consist of PSV files that incorporate the hourly records of patients.

Fig. 2 Working system

More than 40,000 hourly records are present across these two classes, obtained from both sepsis and non-sepsis patients. The dataset has about 31% missing values, which were filled using two common data filling techniques, i.e., backward filling and forward filling [27]. Some MAP and DBP values were still missing afterwards; these were filled using the D.B.P. and M.A.P. formula. Forward filling replaces a missing value with its preceding value in the CSV or Excel file, whereas backward filling was applied to initial rows that could not be filled by forward filling; those missing values were filled with the values of the next row.
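As a rough illustration of this pre-processing step, the sketch below forward-fills and then backward-fills one patient's hourly records and estimates any remaining MAP values from SBP and DBP. The exact formula used by the authors is not stated; the common clinical approximation MAP ≈ DBP + (SBP − DBP)/3 is assumed here, as are the file name and the PhysioNet column names.

```python
import pandas as pd

def fill_missing(vitals: pd.DataFrame) -> pd.DataFrame:
    """Forward-fill, then backward-fill the hourly records of one patient."""
    filled = vitals.ffill().bfill()
    # Estimate still-missing MAP values from systolic and diastolic pressure
    # (assumed approximation: MAP = DBP + (SBP - DBP) / 3).
    est_map = filled["DBP"] + (filled["SBP"] - filled["DBP"]) / 3.0
    filled["MAP"] = filled["MAP"].fillna(est_map)
    return filled

# The challenge files are pipe-separated (.psv); they can be re-saved as CSV.
patient = pd.read_csv("p000001.psv", sep="|")   # hypothetical file name
fill_missing(patient).to_csv("p000001.csv", index=False)
```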

2.2 Intelligent Sepsis Detector
We propose a novel intelligent sepsis detector based on the LSTM network [28–30]. LSTM is an important type of recurrent neural network (RNN) that is capable of storing information over long periods of time; this particular RNN variant resolves the vanishing gradient problem of standard RNNs. The memory cells and gates present in the hidden recurrent layer of the LSTM enable it to store dependencies over long intervals of time. A single LSTM memory cell is shown in Fig. 3. Each memory cell maintains a cell state vector c_t, and at each time step the next memory cell can read, write, or reset the cell using an explicit gating mechanism. Each memory cell has four gates, i.e., the input gate i_t, modulation gate g_t, output gate o_t, and forget gate f_t. The x_t, h_t, and t in Fig. 3 represent the input, hidden state, and time, respectively. The input gate i_t controls whether the memory cell is updated, the modulation gate g_t modulates the candidate information written to the internal cell state c_t, the forget gate f_t controls whether the memory cell is reset to zero, and the output gate o_t controls whether the information of the current cell state is made visible. The gates i_t, f_t, and o_t have a sigmoid activation ranging from 0 to 1. The gates are calculated using the following formulas.
Fig. 3 LSTM memory cell

$i_t = \sigma(W_i x_t + V_i h_{t-1} + b_i)$  (1)
$f_t = \sigma(W_f x_t + V_f h_{t-1} + b_f)$  (2)
$o_t = \sigma(W_o x_t + V_o h_{t-1} + b_o)$  (3)
$g_t = \tanh(W_g x_t + V_g h_{t-1} + b_g)$  (4)
$c_t = f_t \odot c_{t-1} + i_t \odot g_t$  (5)
$h_t = o_t \odot \tanh(c_t)$  (6)

The tanh activation of the modulation gate g_t keeps the gradient well distributed and prevents it from vanishing, so information can flow through the cell over long intervals and long-range dependencies are preserved. In this study, the LSTM model is implemented with the TensorFlow library in Python. The following configuration was used for sepsis detection: 5 LSTM layers with 200 hidden units each, ReLU activation, "same" padding, a mini-batch size of 64, the Adam optimizer, and a maximum of 600 epochs, with the 5 LSTM layers followed by a fully connected layer and a SoftMax layer. Several other configurations were also evaluated, but the above settings gave the best detection performance.
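The configuration above can be sketched in Keras roughly as follows. This is an illustrative reconstruction, not the authors' code: the loss function, the placement of the ReLU-activated dense layer, and the input shape (window length × eight vital signs) are assumptions, and the "same" padding setting mentioned in the text has no direct equivalent for LSTM layers, so it is omitted.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_sepsis_lstm(window_len: int, n_features: int = 8) -> tf.keras.Model:
    """Five stacked LSTM layers (200 units each) followed by a fully
    connected layer and a softmax over the sepsis / non-sepsis classes."""
    model = models.Sequential()
    model.add(layers.LSTM(200, return_sequences=True,
                          input_shape=(window_len, n_features)))
    for _ in range(3):
        model.add(layers.LSTM(200, return_sequences=True))
    model.add(layers.LSTM(200))                      # keep only the last state
    model.add(layers.Dense(200, activation="relu"))
    model.add(layers.Dense(2, activation="softmax"))
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

model = build_sepsis_lstm(window_len=4)              # e.g. the 4 h window
model.summary()
# model.fit(x_train, y_train, batch_size=64, epochs=600)  # x_train: (N, 4, 8)
```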

3 Experimental Results and Discussions
This section presents the experimental details. Our approach is evaluated using three performance metrics, namely, accuracy, specificity, and AUROC. The details are discussed in the subsequent sections.

3.1 Dataset
We used the dataset that is publicly available at PhysioNet (Barton et al. 2021). PhysioNet gives a complete research guide in the form of related research papers, including conference papers as well as journal papers (Barton et al. 2021), and provides a platform for research teams to work on its annual challenge, with teams from all over the world researching the topic and submitting their papers on this platform. The dataset is provided as PSV files, which we converted to CSV format for further processing. The dataset comprises two training sets.

Training Set A and Training Set B each contain the clinical values of approximately 20,000 patients, recorded hourly over a 24 h hospital stay, with about 31.63% NaN values. The clinical values include the attributes Resp, EtCO2, H.R., O2Sat, M.A.P., D.B.P., Temp, S.B.P., BaseExcess, Gender, ICULOS, Unit1, Unit2, and SepsisLabel. The initial eight are known as the vital signs of sepsis, the last five are demographic values of a patient, and the attributes between them are the clinical (laboratory) values of the patient. The NaN values correspond to measurements that were not taken at the time the dataset was organized (Barton et al. 2021).

3.2 Results and Discussions
To analyze the performance of our approach, we split the dataset into training and test sets, randomly allocating 90 and 10% of the records to the training and testing sets, respectively. To make the results less biased toward the selected sets, we further applied stratified tenfold cross-validation, in which the data are randomly partitioned into ten equal-sized folds (sets) with approximately the same percentage of each sepsis label. A single fold acts as the test set, while the remaining nine folds are used as the training set. The cross-validation process is repeated ten times, with each of the ten folds used precisely once as the test set, and the results are then averaged to produce a single estimate. Furthermore, we used window slicing and noise injection data augmentation techniques to handle class imbalance issues: the window slicing method randomly extracts continuous slices from the original EHRs, and for noise injection we randomly applied Gaussian noise to the measured vital signs. Deep architectures are prone to overfitting; therefore, two regularization techniques were used in our model: l2 weight decay, which penalizes the l2 norm of the model's weights, and dropout, which stochastically omits units in each layer during training. In the training phase, the network weights are updated through mini-batch Stochastic Gradient Descent (SGD) over shuffled batches of size 64, with the Nesterov acceleration technique used to optimize the Mean Squared Error (MSE) loss function. We trained our model for 600 epochs. The model is implemented in Python using the Keras framework 2.2.4 with TensorFlow 1.14.0 as the backend. As an example of a typical sepsis trajectory in the data, unusual changes in the heart and respiratory rates occurred during the thirty-seventh hour; the body temperature then rose slightly, and within just a few hours the patient met the Sepsis-3 definition. This research work developed a system for the timely identification of sepsis in the human body. The proposed system utilizes the LSTM model, which is trained and evaluated on the PhysioNet dataset (Barton et al. 2021), comprised of computerized health reports of multiple patients admitted to ICUs. In this work, we utilized the performance metrics AUROC, specificity, and accuracy to evaluate the performance of the proposed system, calculated for three different window sizes, i.e., a 4-h window (4 h), an 8-h window (8 h), and a 12-h window (12 h), after the patient is admitted to the ICU.
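A minimal sketch of the stratified tenfold protocol together with the two augmentation steps is given below. The noise level, the slice length, the random seed, and the dummy array shapes are illustrative assumptions, not values stated in the paper.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

rng = np.random.default_rng(42)

def add_gaussian_noise(x: np.ndarray, sigma: float = 0.01) -> np.ndarray:
    """Noise injection: jitter the measured vital signs."""
    return x + rng.normal(0.0, sigma, size=x.shape)

def random_slice(x: np.ndarray, slice_len: int) -> np.ndarray:
    """Window slicing: crop a random contiguous slice from each record."""
    start = rng.integers(0, x.shape[1] - slice_len + 1)
    return x[:, start:start + slice_len, :]

# Dummy data standing in for the real vital-sign windows and sepsis labels.
windows = rng.random((200, 12, 8)).astype("float32")
labels = rng.integers(0, 2, size=200)

skf = StratifiedKFold(n_splits=10, shuffle=True, random_state=42)
for fold, (tr, te) in enumerate(skf.split(windows, labels)):
    x_tr = add_gaussian_noise(random_slice(windows[tr], slice_len=4))
    x_te = windows[te][:, :4, :]          # e.g. the first 4 h of each window
    y_tr, y_te = labels[tr], labels[te]
    # ...train the LSTM on (x_tr, y_tr) and evaluate on (x_te, y_te)...
```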

Table 1 Performance using LSTM

Performance metrics | 4 h           | 8 h           | 12 h
AUROC%              | 91.19 ± 0.005 | 89.21 ± 0.007 | 87.34 ± 0.021
Specificity%        | 75.24 ± 0.014 | 73.45 ± 0.015 | 70.32 ± 0.018
Accuracy%           | 76.32 ± 0.012 | 77.67 ± 0.014 | 71.42 ± 0.016

The AUROC is a metric used to evaluate the classification performance of the model across threshold values (GreatLearning, Understanding ROC, 2020). The AUROC results calculated for the 4, 8, and 12 h windows are 91.19, 89.21, and 87.34%, respectively, as given in Table 1. The specificity metric reflects the false positive rate of a classifier: it is the ratio of true negatives to the sum of true negatives and false positives [31]. The specificity achieved by the proposed system is 75.24% for the 4 h window, 73.45% for the 8 h window, and 70.32% for the 12 h window, as shown in Table 1. The accuracy metric is widely used for measuring the performance of deep learning models; it is the ratio of correctly determined predictions to the total number of predictions made by the model [32]. The accuracies achieved by the proposed system are 76.32, 77.67, and 71.42% for the 4, 8, and 12 h windows, respectively, as given in Table 1. These results demonstrate that the proposed system can efficiently identify sepsis patients from the patient data provided to the system.
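The three reported metrics can be computed from predicted sepsis probabilities as in the short sketch below; the 0.5 decision threshold and the toy values are assumptions for illustration only.

```python
import numpy as np
from sklearn.metrics import roc_auc_score, confusion_matrix, accuracy_score

def evaluate(y_true: np.ndarray, y_score: np.ndarray, threshold: float = 0.5) -> dict:
    """AUROC from the scores; specificity and accuracy from thresholded labels."""
    y_pred = (y_score >= threshold).astype(int)
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
    return {
        "AUROC": roc_auc_score(y_true, y_score),
        "Specificity": tn / (tn + fp),        # TN / (TN + FP)
        "Accuracy": accuracy_score(y_true, y_pred),
    }

# Example with toy values.
print(evaluate(np.array([0, 0, 1, 1]), np.array([0.2, 0.6, 0.4, 0.9])))
```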

3.3 Performance Comparison with Other Techniques
In this section, a comparative analysis with other state-of-the-art approaches is performed to verify the significance of the proposed system; for this comparison, the results reported in other studies are compared with the results of the proposed system. Nemati et al. [33] developed an artificial intelligence sepsis expert (AISE) system for the detection of sepsis in patients, in which various features obtained from patient records were processed by a machine learning model, i.e., a modified Weibull-Cox proportional hazards model. That study calculated results in 4, 6, 8, and 12 h windows after the patient was admitted to the ICU, achieving an AUROC of 85%, a specificity of 67%, and an accuracy of 67%. In [34], multiple features from EMR, entropy, and socio-demographic patient history were merged to develop a model for the detection of sepsis; blood pressure (BP) and heart rate (HR) features proved to be significant predictors in that study, and the results were calculated for a 4 h window size, giving an AUROC of 78%, a specificity of 55%, and an accuracy of 61%. The detailed results in terms of AUROC, accuracy, and specificity of the proposed and other models are given in Table 2.

Table 2 Performance comparison with other techniques

Reference paper         | Method                                            | Accuracy% | Specificity% | AUROC%
Nemati et al. [33]      | Modified Weibull-Cox proportional hazards model   | 67        | 67           | 85
Shashikumar et al. [34] | Entropy + EMR + socio-demographic patient history | 61        | 55           | 78
Proposed model          | LSTM model                                        | 77        | 75           | 91

These results demonstrate that our proposed LSTM-based system provides better results than the other techniques. Therefore, based on this comparison, we conclude that our proposed system can be utilized in real-time environments for the detection of sepsis.

4 Conclusion
This study proposed an intelligent sepsis detection system for the early detection of sepsis. Sepsis is a life-threatening disease, and millions of people die from it every year due to negligence in timely detection. It is therefore necessary to develop an automated detection system for sepsis to prevent the loss of precious lives. In this work, we proposed a system that employs a deep learning model, the LSTM network, trained on the PhysioNet dataset. The results show that the proposed LSTM-based system can detect sepsis patients at an early stage with a very low false rate. Additionally, the proposed system can be implemented in real-time scenarios such as the ICU. In future work, we aim to apply further state-of-the-art deep learning frameworks to enhance the performance of timely sepsis detection.

References 1. Nemati S et al (2018) An interpretable machine learning model for accurate prediction of sepsis in the ICU. Crit Care Med 46(4):547–553 2. Li X, Kang Y, Jia X, Wang J, Xie G (2019) TASP: a time-phased model for sepsis prediction. In: 2019 computing in cardiology (CinC). IEEE, p 1 3. Delahanty RJ, Alvarez JoAnn, Flynn LM, Sherwin RL, Jones SS (2019) Development and evaluation of a machine learning model for the early identification of patients at risk for sepsis. Ann Emerg Med 73(4):334–344 4. Reyna M, Shashikumar SP, Moody B, Gu P, Sharma A, Nemati S, Clifford G (2019) Early prediction of sepsis from clinical data: the PhysioNet/computing in cardiology challenge 2019. In: 2019 computing in cardiology conference (CinC), vol 45, pp 10–13. https://doi.org/10.22489/cinc.2019.412 5. Dellinger RP et al (2013) Surviving sepsis campaign: international guidelines for management of severe sepsis and septic shock, 2012. Intensive Care Med 39(2):165–228


6. Giannini HM et al (2019) A machine learning algorithm to predict severe sepsis and septic shock: development, implementation, and impact on clinical practice. Crit Care Med 47(11):1485–1492 7. Al-Mualemi BY, Lu L (2020) A deep learning-based sepsis estimation scheme. IEEE Access 9:5442–5452 8. Wickramaratne SD, Shaad Mahmud MD (2020) Bi-directional gated recurrent unit based ensemble model for the early detection of sepsis. In: 2020 42nd annual international conference of the ieee engineering in medicine & biology society (EMBC). IEEE, pp 70–73 9. Kok C, Jahmunah V, Oh SL, Zhou X, Guruajan R (2020) Automated prediction of sepsis using temporal convolutional network. J Comput Biol Med 127 10. Li X, André Ng G, Schlindwein FS (2019) Convolutional and recurrent neural networks for early detection of sepsis using hourly physiological data from patients in intensive care unit. In: 2019 computing in cardiology (CinC). IEEE, p 1 11. Barton C et al (2019) Evaluation of a machine learning algorithm for up to 48-hour advance prediction of sepsis using six vital signs. Comput Biol Med 109:79–84. https://physionet.org 12. Biglarbeigi P, McLaughlin D, Rjoob K, Abdullah A, McCallan N, Jasinska-Piadlo A, Bond R et al (2019) Early prediction of sepsis considering early warning scoring systems. In: 2019 computing in cardiology (CinC). IEEE, p 1 13. Yao J, Ong ML, Mun KK, Liu S, Motani M (2019) Hybrid feature learning using autoencoders for early prediction of sepsis. In: 2019 computing in cardiology (CinC). IEEE, p 1 14. Eskandari MA, Moridani MK, Mohammadi S (2021) Detection of sepsis patients using biomarkers based on machine learning 15. Rodríguez A, Mendoza D, Ascuntar J, Jaimes F (2021) Supervised classification techniques for prediction of mortality in adult patients with sepsis. Am J Emerg Med 45:392–397 16. Chen M, Hernández A (2021) Towards an explainable model for Sepsis detection based on sensitivity analysis. IRBM 17. Chicco D, Oneto L (2021) Data analytics and clinical feature ranking of medical records of patients with sepsis. BioData Mining 14(1):1–22 18. Alvi RH, Rahman MH, Khan AAS, Rahman RM (2020) Deep learning approach on tabular data to predict early-onset neonatal sepsis. J Inf Telecommun 1–21 19. Reyna MA, Josef C, Seyedi S, Jeter R, Shashikumar SP, Brandon Westover M, Sharma A, Nemati S, Clifford GD (2019) Early prediction of sepsis from clinical data: the PhysioNet/computing in cardiology challenge 2019. In: 2019 computing in cardiology (CinC). IEEE, p 1 20. Kok C, Jahmunah V, Oh SL, Zhou X, Gururajan R, Tao X, Cheong KH, Gururajan R, Molinari F, Rajendra Acharya U (2020) Automated prediction of sepsis using temporal convolutional network. Comput Biol Med 127:103957 21. Fu J, Li W, Jiao Du, Xiao B (2020) Multimodal medical image fusion via Laplacian pyramid and convolutional neural network reconstruction with local gradient energy strategy. Comput Biol Med 126:104048 22. Baral S, Alsadoon A, Prasad PWC, Al Aloussi S, Alsadoon OH (2021) A novel solution of using deep learning for early prediction cardiac arrest in Sepsis patient: enhanced bidirectional long short-term memory (LSTM). Multimed Tools Appl 1–26 23. Rafiei A, Rezaee A, Hajati F, Gheisari S, Golzan M (2021) SSP: early prediction of sepsis using fully connected LSTM-CNN model. Comput Biol Med 128:104110 24. Van Steenkiste T, Ruyssinck J, De Baets L, Decruyenaere J, De Turck F, Ongenae F, Dhaene T (2019) Accurate prediction of blood culture outcome in the intensive care unit using long short-term memory neural networks. 
Artif Intell Med 97:38–43 25. Liu X, Liu T, Zhang Z, Kuo P-C, Xu H, Yang Z, Lan K et al (2021) TOP-net prediction model using bidirectional long short-term memory and medical-grade wearable multisensor system for tachycardia onset: algorithm development study. JMIR Med Informatics 9(4):e18803 26. da Silva DB, Schmidt D, da Costa CA, da Rosa Righi R, Eskofier B (2021) DeepSigns: a predictive model based on Deep Learning for the early detection of patient health deterioration. Exp Syst Appl 165:113905


27. Ullah A et al (2022) Comparison of machine learning algorithms for sepsis detection. Sepsis 28. Qadir G et al (2022) Voice spoofing countermeasure based on spectral features to detect synthetic attacks through LSTM. Int J Innov Sci Technol 3:153–165 29. Dawood H et al (2022) A robust voice spoofing detection system using novel CLS-LBP features and LSTM. J King Saud Univ Comput Inf Sci 30. Hassan F, Javed A (2021) Voice spoofing countermeasure for synthetic speech detection. In: 2021 international conference on artificial intelligence (ICAI). IEEE 31. Kumar A (2018) ML metrics: sensitivity vs. specificity - dzone ai, dzone.com. [Online]. https://dzone.com/articles/mlmetricssensitivityvsspecificitydifference#:~:text=What%20Is% 20Specificity%3F,be%20termed%20as%20false%20positives. Accessed: 13-Mar-2022 32. How to check the accuracy of your machine learning model, Deepchecks, 09Feb-2022. [Online]. https://deepchecks.com/how-to-check-the-accuracy-of-your-machine-lea rning-model/. Accessed 13-Mar-2022 33. Nemati S, Holder A, Razmi F, Stanley MD, Clifford GD, Buchman TG (2018) An interpretable machine learning model for accurate prediction of sepsis in the ICU. Crit Care Med 46(4):547– 553. https://doi.org/10.1097/CCM.0000000000002936 34. Shashikumar SP, Stanley MD, Sadiq I, Li Q, Holder A, Clifford GD, Nemati S (2017) Early sepsis detection in critical care patients using multiscale blood pressure and heart rate dynamics. J Electrocardiol 50(6):739–743

Implementation of Big Data and Blockchain for Health Data Management in Patient Health Records António Pesqueira , Maria José Sousa , and Sama Bolog

Abstract Blockchain Technology (BT) and Big Data (BD)-based data management solutions can be used for storing and processing sensitive patient data efficiently in the healthcare field. While many institutions and industries have recognized the significance of both technologies, few have implemented them in the health sector for the management of patients' medical records. By leveraging Patients' Health Records (PHR) data, the purpose of this paper is to develop a practical application with an architecture built on BT and BD technologies that helps organizations manage data requirements and enhance data security, provenance, traceability, availability, and effective identity management. For that purpose, a case study was developed that covers the key BT and BD considerations, as well as key issues such as policies, smart contracts, consent, and the provision of secure identities, so that records are properly managed and controlled. Hence, this study summarizes the key characteristics of a practical EHR implementation, emphasizing security measures and technologies, in order to assess the effectiveness of the included technological components, such as decentralized identification, consent management, and private BT management. According to the results of the case study research, the presented solution has high accuracy and is capable of managing PHRs effectively. Additionally, it has been shown to have high practical value when it comes to meeting the accuracy and real-time requirements of BT and BD applications. Keywords Blockchain Technology · Big Data · Patients' Health Records

A. Pesqueira · M. J. Sousa (B) ISCTE-Instituto Universitário de Lisboa, Lisbon, Portugal e-mail: [email protected] A. Pesqueira e-mail: [email protected] S. Bolog (B) University of Basel, Basel, Switzerland e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 S. Anwar et al. (eds.), Proceedings of International Conference on Information Technology and Applications, Lecture Notes in Networks and Systems 614, https://doi.org/10.1007/978-981-19-9331-2_2


1 Introduction
Global healthcare organizations (HCOs) manage vast amounts of clinical, medical, and administrative data, from pharmaceutical supply chains to Patient Health Records (PHRs) and claims management. As different HCO data management security procedures become more common, an entirely new ecosystem of information is becoming available, increasing the volume of collected data exponentially [1]. The ability to link currently siloed information and serve as the "single source of truth" makes Blockchain Technology (BT) and Big Data (BD) extremely valuable for improving healthcare-related clinical and operational data management solutions. This work proposes an immutable, secure, scalable, and interoperable architecture that enables patients and hospitals to be more transparent and secure while collecting sensitive patient data from a variety of integrated, connected, but independently managed healthcare systems. Hence, a BT and BD-based architecture is applied as the framework for a future real-world PHR system that ensures controlled data access and integrity and provides a transparent and trustable framework for patient information for the various stakeholders in the healthcare sector. By incorporating more advanced encryption methods into an immutable audit-trail design, this study shows how the presented architecture enhances the privacy, confidentiality, and secrecy of patient data compared with existing solutions, resulting in a platform that empowers patients and hospitals with greater transparency, privacy, and security.

2 Methodology
By leveraging Patients' Health Records (PHRs), the purpose of this paper is to develop a practical application with an architecture built on BT and BD technologies that helps HCOs and patients manage data requirements and enhances data control, provenance, traceability, availability, and effective identity management. Within the scope of the project, PHRs are characterized by the possibility of connecting wearable and sensor devices, allowing patient self-monitoring, and connecting to different HCO providers, in order to achieve a fully secure and trustworthy healthcare data management system. The designed solution, running in a private and protected Hyperledger Fabric (HF) environment, is based on transactions in a proposed private BT, which represent exchanges of information and documents, as well as cryptographic hash files that represent single words used for Master Data Management (MDM) purposes and high-resolution medical images.
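As a simple illustration of how such hash files can be produced, the snippet below computes the SHA-256 fingerprint that could anchor an off-chain document or high-resolution medical image to an on-chain transaction. The record fields and the pseudonymous identifier are hypothetical; the paper does not prescribe this exact payload.

```python
import hashlib

def fingerprint(data: bytes) -> str:
    """SHA-256 digest used to reference an off-chain document on the ledger."""
    return hashlib.sha256(data).hexdigest()

# Only the fingerprint and minimal metadata would go on-chain; the image
# itself stays in off-chain storage (illustrative payload).
image_bytes = b"...high-resolution MRI bytes..."
transaction = {
    "patient": "pseudonymous-id-0042",   # hypothetical pseudonymous credential
    "doc_type": "MRI",
    "sha256": fingerprint(image_bytes),
}
print(transaction)
```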


One key objective of the proposed architecture was the design of a permissionless mode in which patients can be anonymous or their identities pseudonymous, and every patient can add new blocks to the ledger. On the other hand, in the developed permissioned BT, the identity of each patient is controlled by an identity provider or by hospital administrative access. With decentralized identity and other privacy mechanisms, blockchain and distributed ledger technologies offer users novel possibilities for protecting their data; by enabling users to own and control their data, these systems provide them with greater sovereignty. This case study involved an exhaustive consultation with eight hospital management and operating staff members from three hospitals located in Germany, Portugal, and Spain, which requested that the identity aspects of the research paper be anonymized for reasons of confidentiality and data protection. This consultation allowed a better understanding of all involved requirements and of the necessary technical system architecture. One of the main concerns when interviewing the hospital professionals was to address several trust issues, such as patient identification, patient consent, and hospital–patient user authentication. The involved hospital staff also stressed the importance of allowing patients to add consent statements at any stage of their inpatient care journey or medical consultations, with a trust mechanism ensuring that the BT holds them securely. During requirements gathering with the healthcare professionals, a further requirement arose: the system must be able to act upon the directives and restrictions of the patients, interpret them as access control decisions, and give assurance that it is adhering to those patient directives. The ability of healthcare providers to use a consistent, rules-based system for accessing patient data that can be permissioned to selected health organizations was essential. In addition, having different interconnected systems makes it easier to integrate PHR systems by utilizing a combination of on-chain and off-chain data storage, where the designed architecture was needed to ensure full compliance. Access to the on-chain resources must be available immediately to anyone who has permission to view the BT, while the off-chain data are stored in a designed SAP HANA configuration whose access is controlled by consent, based on patient data from the EHR/PHR system. Different healthcare companies have tested SAP HANA for predictive analysis, spatial data analysis, graph data analysis, and other advanced analytical operations, which led to the selection of this architectural approach. Personalized care is provided for all patients based on their biological, clinical, and lifestyle information [2].
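A minimal sketch of how stored consent statements could be interpreted as access-control decisions is shown below. The data structure and field names are hypothetical illustrations of the requirement described above rather than the system's actual implementation; in the deployed architecture, the consent records would be read from the permissioned ledger instead of in-memory objects.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Consent:
    patient_id: str
    organization: str      # health organization granted access
    scope: str             # e.g. "vital-signs", "imaging", "full-record"
    revoked: bool = False

def access_allowed(consents: List[Consent], org: str, scope: str) -> bool:
    """Allow a request only if a non-revoked consent covers the requesting
    organization and the requested data scope."""
    return any(c.organization == org and not c.revoked
               and c.scope in (scope, "full-record")
               for c in consents)

# Example: a patient consents to share vital signs with Hospital A only.
consents = [Consent("patient-001", "hospital-a", "vital-signs")]
print(access_allowed(consents, "hospital-a", "vital-signs"))   # True
print(access_allowed(consents, "hospital-b", "vital-signs"))   # False
```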

2.1 Architecture

One initial consideration was the possibility of collecting data through web-based and mobile applications in the future, in addition to the existing well-being and care


sensor technologies in the involved hospitals in different settings, and integrating them using REST (representational state transfer) application programming interfaces (APIs). The scalability option is critical to ensuring that the designed architecture can become the backbone for future PHRs. It should incorporate data from both patient-based and EHR-based technologies to provide a robust and comprehensive pool of data that can be accessed by authorized users such as healthcare providers and patients.

Among the practical resolutions was the integration of Ethereum smart contracts, written in Solidity, embedded in the distributed BT network. A smart contract is a self-enforcing, immutable, autonomous, and accurate contract written in code, which is the building block of Ethereum applications. Furthermore, there is the security assurance that once smart contracts are deployed and transactions are completed, the code becomes immutable, and as a result the transactions and information become irreversible [3]. As a result of HF's connection with the Ethereum platform, smart contracts were developed to deploy complex business logic into the network validation nodes, as well as to test future scenarios for exchanging medical images between Externally Owned Accounts and Contract Accounts (CAs).

By combining Public Key Infrastructure (PKI) and decentralization/consensus, identity authorization processes can transform non-permissioned BTs into permissioned BTs, where entities may register for long-term credentials or enrollment certificates commensurate with their types. Credentials issued by the Transaction Certificate Authority (TCA) to patients and medical doctors are used to issue pseudonymous credentials, which in turn authorize submitted transactions. Thus, certificates for healthcare transactions persist on the BT and can be used by authorized auditors to group otherwise unconnected transactions. With HF, the architected design allowed modularity, speed, smart contract integration, privacy, and security, among other benefits.

With the following code lines, it is possible to understand how the EHR and PHR APIs were separated, as well as how the patient data was retrieved from the ledger component [4]. Furthermore, in the below examples, the role of the hospital data administrator is validated from the request header so that the fabric gateway connection can be initiated with a subsequent smart contract invoking function, which in turn deploys the chaincode package containing the smart contract by using the chaincode lifecycle, then queries and approves the installed chaincode for all three hospitals, and then commits it. In addition to the API checks, it was also necessary to verify from an end-user interface perspective that the BT calls initiating transactions were subsequently implemented in the smart contract.

Medical transactions, for example, are sensitive information outside the circle of the patient and the doctor who receives authorization, and the HF core solution offers the opportunity to create private channels for members of the network to exchange sensitive information [5]. Furthermore, from a security perspective, it also has the underlying security principles from HF and a key feature for providing additional hardware-based digital


signature security, as well as the ability to manage identities through a Hardware Security Module (HSM). In the designed architecture, the Fabric SDK runs in JavaScript, Node.js is used in the backend nodes, and these are used in conjunction with an interface defined in Angular, within an original sandbox test environment. A smart contract is used primarily in this paper to automate the logic of the medical record and to store all the data in a dedicated ledger that can be viewed by patients and doctors based on the defined access rules. The use of smart contracts enabled the architecture to move medical records from one hospital to another while maintaining the necessary security and encryption. An important component of the architecture is based on HF chaincode with JavaScript and Node.js, to which Ethereum is connected and in which smart contracts are created using Solidity [6]. As can be seen in Fig. 1, the architecture ensures that scalability and data security principles are followed, as well as interoperability with another critical component, SAP HANA, which performs the necessary business intelligence analysis and data curation for a value-managed architecture with the data privacy and security procedures fully considered. Before granting access to the authorized EHR platform, identity management validations were critical to verify the credentials of the doctors and prove that the physicians held valid medical licenses.
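Since the code listing referenced above is not reproduced in this text, the sketch below only illustrates, in a hypothetical way, the request flow described: validating the hospital data administrator role from the request header, querying the ledger, and gating off-chain access on patient consent. All function and field names (e.g. connect_gateway, invoke_chaincode, x-user-role) are placeholders, not Hyperledger Fabric SDK calls or the authors' actual code.

```python
# Hypothetical sketch of the described access flow; the helpers below are stubs,
# not real Hyperledger Fabric or SAP HANA API calls.

def connect_gateway(identity: str):
    """Placeholder for a Fabric gateway connection; returns a dummy handle."""
    return {"identity": identity}

def invoke_chaincode(gateway, function: str, *args):
    """Placeholder for a chaincode invocation; returns a dummy on-chain hash."""
    return f"hash({function}:{','.join(args)})"

def fetch_offchain_record(record_hash: str):
    """Placeholder for an off-chain lookup (e.g. SAP HANA) keyed by the on-chain hash."""
    return {"record_hash": record_hash, "data": "..."}

def validate_admin_role(request_headers: dict) -> bool:
    # The hospital data administrator role is read from the request header.
    return request_headers.get("x-user-role") == "hospital-data-admin"

def handle_phr_request(request_headers: dict, patient_id: str, consent_registry: dict):
    """Illustrative on-chain/off-chain access decision based on patient consent."""
    if not validate_admin_role(request_headers):
        raise PermissionError("Caller is not a hospital data administrator")

    # On-chain: query the ledger through the (placeholder) gateway connection.
    gateway = connect_gateway(identity=request_headers.get("x-user-id", "unknown"))
    record_hash = invoke_chaincode(gateway, "queryPatientRecord", patient_id)

    # Off-chain: the full record is only released if the patient has consented.
    if consent_registry.get(patient_id, {}).get("share_with_hcos", False):
        return fetch_offchain_record(record_hash)
    return {"status": "consent-missing", "on_chain_hash": record_hash}
```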

Fig. 1 Architecture overview with all involved components


Fig. 2 Actions and activities from the different process and system profiles

A crucial step was the creation of the hospital group channel, with access control rules governing access for existing hospitals, new hospitals, future HCOs, and future hospital departments or organizations. Access control was also granted over network administrative changes, including network access control, where HF was fundamental to enabling patient and health data policies to be associated with different record data management protocols. Thus, by studying the picture below (Fig. 2), we can see how the hospital data management users or system administrators provide information and actions to the medical doctors and parts of the EHR module, and then how the medical doctors and patients are involved through the PHRs. As a key component of the overall architecture, AngularJS was used to connect with the Fabric docker cloud through an SDK node, which then connected to the entire HF ecosystem architecture. In this case study, SAP HANA was further integrated as an additional connection enabling the BD link. The decision to use SAP HANA for MDM and BD analytical purposes was based mainly on the capability of combining additional tools and services with HANA, such as data intelligence, and on using HANA Cloud to collect and analyze the unstructured, high-frequency data expected from the designed EHR and PHR platform. Connecting SAP HANA was part of an effort to collect BD from wearables, fitness trackers, and other sources of quantifiable lifestyle information that can be used to better understand behavior patterns or create baselines for understanding health concerns without requiring patients to sign up for studies or focus groups.


As part of the defined data schema, participants (such as MDs, patients, or hospital administrators), assets (e.g., patients' medical records), transactions (such as prescriptions, diagnoses, and medical consultations), as well as events (such as capturing symptoms) were defined to efficiently drive the necessary decisions in terms of the database, orders, and certificates. In terms of credentialing physicians, verification of primary sources, privacy-preserving authentication, and the management of digital identities were crucial to the security of the established architecture, which grants access to resources and to the different stakeholders in the information system. In this architecture, one of the key mechanisms was the Primary Source Verification (PSV) required to verify a medical doctor's license, certification, or registration to practice; the system itself completes the PSV rather than relying on the licensed individual.
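To make the participant/asset/transaction/event schema described above more concrete, the following is a minimal illustrative data-model sketch. The class and field names are assumptions for illustration only, not the authors' actual schema definition.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List

# Illustrative sketch of the schema elements named above; names are assumptions.

@dataclass
class Participant:
    participant_id: str
    role: str                        # e.g. "patient", "medical_doctor", "hospital_admin"

@dataclass
class MedicalRecordAsset:
    record_id: str
    owner: Participant               # the patient who owns the record
    document_hashes: List[str] = field(default_factory=list)  # hashes kept on-chain

@dataclass
class Transaction:
    tx_id: str
    tx_type: str                     # e.g. "prescription", "diagnosis", "consultation"
    issued_by: Participant
    asset: MedicalRecordAsset
    timestamp: datetime = field(default_factory=datetime.utcnow)

@dataclass
class Event:
    event_type: str                  # e.g. "symptom_captured"
    payload: dict
    timestamp: datetime = field(default_factory=datetime.utcnow)
```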

2.2 PHR User Interface

The last part of the case study was the development and implementation of the user interface (UI) system for the electronic health record and personal health records, where the primary goal was to develop a simple, yet trustworthy design. We show below the dashboard for the PHR UI, where the following areas were developed: medical records belonging to the patient, personal data, treatments, schedules, laboratory results, diagnosis documents, payments, and other settings, as shown in Figs. 3 and 4.

Fig. 3 Patient health records dashboard from the user interface


Fig. 4 Patient registration form with patient personal information, treatment, and payment information

As part of the payment process of the administrative system, the corresponding table and clinical notes are illustrated in Fig. 5, with a representative view of the hospital department, case number, and payment information.

Fig. 5 Representative table for hospital department, case number, and payment information


3 Conclusion

According to the results of this case study research, the presented solution is highly accurate and is capable of managing PHR data effectively. An increased number of patients and healthcare organizations should be involved in future research, where specific tests of the technology and API connections can also be leveraged and maximized with respect to pressure tests. Additionally, the solution has been shown to have high practical value when it comes to meeting the accuracy and real-time requirements of BT and BD applications. One of the key objectives of this proposed architecture was the creation of a permissionless system in which patients could be anonymous or pseudonymous and in which every patient could add new blocks to the ledger. In this work, we propose an immutable, secure, scalable, and interoperable architecture that can support a variety of connected, integrated, and independently managed healthcare systems to be more transparent and secure in the collection, management, and analysis of sensitive patient data.

Due to the immaturity of Hyperledger, there were a few disadvantages, but fortunately, partway through the project, a new HF version and Composer were released, and the system was upgraded to take advantage of the numerous bug fixes and enhancements included in these releases. HF is a promising BT framework that comes with policies, smart contracts, and secure identities, allowing possible access to additional add-ins like SAP HANA or even future advanced decentralized identity management systems via different connections such as docker. Interoperability between multiple hospital organizations provided a framework for developing a private and closed blockchain scenario. This approach provides reliable and secure solutions for managing medical records. Yet the most important remaining task is to resolve security challenges and improve the source code to provide a scalable and pluggable solution, with effective implementation of a powerful ordering service on a large-scale fabric network, updated consortium policies, and implementation of the patient's module functionality.

References 1. Abdelhak M, Grostick S, Hanken MA (2014) Health information-e-book: management of a strategic resource. Elsevier Health Sciences 2. Mathew PS, Pillai AS (2015) Big data solutions in healthcare: problems and perspectives. In: 2015 International conference on innovations in information, embedded and communication systems (ICIIECS). IEEE, pp 1–6 3. Pierro GA (2021) A user-centered perspective for blockchain development 4. Miglani A, Kumar N, Chamola V, Zeadally S (2020) Blockchain for the internet of energy management: review, solutions, and challenges. Comput Commun 151:395–418


5. Yuchao W, Ying Z, Liao Z (2021) Health privacy information self-disclosure in the online health community. Front Public Health 8:602792 6. Bai P, Kumar S, Aggarwal G, Mahmud M, Kaiwartya O, Lloret J (2022) Self-sovereignty identity management model for smart healthcare system. Sensors 22(13):4714

Ambient PM2.5 Prediction Based on Prophet Forecasting Model in Anhui Province, China Ahmad Hasnain, Muhammad Zaffar Hashmi, Basit Nadeem, Mir Muhammad Nizamani, and Sibghat Ullah Bazai

Abstract Due to recent development in different sectors such as industrialization, transportation, and the global economy, air pollution is one of the major issues in the twenty-first century. In this work, we aimed to predict ambient PM2.5 concentration using the prophet forecasting model (PFM) in Anhui Province, China. The data were collected from 68 air quality monitoring stations to forecast both short-term and longterm PM2.5 concentrations. The determination coefficient (R2 ), root mean squared error (RMSE), and mean absolute error (MAE) were used to determine the accuracy of the model. According to the obtained results, the predicted R, RMSE, and MAE values by PFM for PM2.5 were 0.63, 15.52 μg/m3 , and 10.62 μg/m3 , respectively. The results indicate that the actual and predicted values were significantly fitted and PFM accurately predict PM2.5 concentration. These findings are supportive and helpful for local bodies and policymakers to deal and mitigate air pollution problems in the future. Keywords Prophet forecasting model · Time series analysis · PM2.5 · Anhui province · China A. Hasnain Key Laboratory of Virtual Geographic Environment, Ministry of Education, Nanjing Normal University, Nanjing 210023, China School of Geography, Nanjing Normal University, Nanjing 210023, China Jiangsu Center for Collaborative Innovation in Geographical Information, Resource Development and Application, Nanjing 210023, China M. Z. Hashmi (B) Department of Chemistry, COMSATS University Islamabad, Islamabad, Pakistan e-mail: [email protected] B. Nadeem Department of Geography, Bahauddin Zakariya University, Multan, Pakistan M. M. Nizamani School of Ecology, Hainan University, Haikou, China S. U. Bazai College of Information and Communication Technology, BUITEMS, Quetta, Pakistan © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 S. Anwar et al. (eds.), Proceedings of International Conference on Information Technology and Applications, Lecture Notes in Networks and Systems 614, https://doi.org/10.1007/978-981-19-9331-2_3


1 Introduction

Due to recent development in different sectors such as industrialization, transportation, and the global economy, air pollution is one of the major widespread environmental issues of the twenty-first century. It has previously been reported by the WHO that air pollution levels have increased in many Asian countries such as China, Bangladesh, Pakistan, and India [1, 8]. China is the largest emerging country in the world, with a large population, industries, and transportation. In the last three decades, many cities and areas of the country have experienced serious air pollution issues (Zhao et al. 2020). In the last few years, the Government of China has taken serious steps to control the level of air pollution in the country, which resulted in a slight decline in air pollution, but there is still a need to adopt strict and preventive measures to protect the environment at a significant level [13, 14].

Particulate matter with a diameter of 2.5 μm or less is called PM2.5, which has been proven to have harmful health impacts [7]. PM2.5 has a more significant impact on human health than PM10. PM2.5 contains inflammation-inducing materials such as lipopolysaccharide and polycyclic aromatic hydrocarbons, which severely degrade the human respiratory system [4]. Due to strict restrictions and the Air Pollution Prevention and Control Action Plan implemented by the government in September 2013, a slight drop in the concentration of PM2.5 has been observed in China. However, heavy haze events still occur occasionally in many cities and regions of the country [11, 16].

Associated with harmful effects and an impact on the environment, air pollution has attracted widespread attention from researchers and scholars [13]. In recent years, many scholars have used time series analysis to predict the concentrations of air pollutants (Zhao et al. 2020); [13]. Bhatti et al. [3] used a SARIMA model and a factor analysis approach to forecast air pollution. Kaminska [10] used the random forest model, a popular approach due to its ability to capture non-linear patterns, to study the short-term effects of air pollution. Garcia et al. [6] developed generalized linear models (GLMs) to predict the concentration of PM10 and to find the relationship between PM10 and meteorological variables. He et al. [9] presented linear and non-linear methods to predict the PM2.5 concentration in their study.

Against this background, in our work, we used the prophet forecasting model (PFM), developed by Facebook, to predict both short-term and long-term PM2.5 concentrations in Anhui province, China. The model has a unique ability to forecast accurately and has been successfully applied even when the data have numerous outliers and missing values. Compared with other models, such as the autoregressive integrated moving average (ARIMA) and seasonal autoregressive integrated moving average (SARIMA), PFM takes approximately 10 times less time and has been successfully established [12]. In this research, we aimed to predict one of the more critical air pollutants (PM2.5) using PFM in Anhui province, China. The results of this research will be supportive and helpful for local bodies and policymakers to address and mitigate air pollution problems in the future.


Fig. 1 The geographical location and the air quality monitoring stations in Anhui Province

2 Proposed Methodology

2.1 Study Area

Anhui province is located in eastern China and is one of the core areas of the Yangtze River Delta (YRD). Anhui province crosses the Yangtze River, Huai River, and Xin'an River, which makes it a more significant region of the country. As of 2020, the province has 16 prefecture-level cities and 9 county-level cities. Anhui is rich in several major economic sectors such as industry and transportation. Figure 1 shows the geographical location and the air quality monitoring stations of Anhui province.

2.2 Data

The daily average concentration of PM2.5 was used in this research, and the data were collected between 1 January 2018 and 31 December 2021. The data were downloaded from the website of historical air quality data in China and originate from the China National Environmental Monitoring Centre (CNEMC 2019). In Anhui Province,


68 monitoring stations are working to collect and record air pollution data; their locations are shown in Fig. 1.

2.3 Proposed Model

For time series analysis and prediction, PFM is a powerful tool that takes very little time to fit. The following formula was used for the model:

y(t) = g(t) + s(t) + h(t) + e_t    (1)

Equation (1) describes the model, where y(t) represents the forecast value; g(t) represents the trend; s(t) represents seasonality; h(t) captures holiday effects; and e_t is the error term. The model has a number of parameters, and the trend can be specified as linear or logistic. The model adopts a Bayesian-based curve fitting technique to forecast and predict time series data, which is one of its significant features and makes it more attractive compared with other forecasting methods. Change points are significant features in the PFM and the fitting scale can be quantified; the model showed better results with a higher number of change points. To determine the number of change points, the model first places a large number of candidates and then uses L1 regularization to pick out the few points to actually use. L1 regularization was used to avoid overfitting:

L(x, y) ≡ Σ_{i=1}^{n} (y_i − h_θ(x_i))² + λ Σ_{i=1}^{n} |θ_i|    (2)

Equation (2) represents the L1 regularization, where x and y are the coordinates of the n change points. The term Σ_{i=1}^{n} (y_i − h_θ(x_i))² measures the squared difference between the observed and predicted values. The purpose of λ Σ_{i=1}^{n} |θ_i| is to keep the weights stable in order to avoid overfitting, where λ determines how strongly the weights are penalized; the value of λ is set based on the number of estimators. To determine the model performance, the actual and predicted values were compared in different time frames.
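A minimal sketch of fitting a model along these lines with the open-source Prophet package is shown below. The file name, column names, and the specific changepoint settings are assumptions for illustration; they are not the authors' exact configuration.

```python
import pandas as pd
from prophet import Prophet  # pip install prophet (formerly fbprophet)

# Assumed input: a CSV of daily average PM2.5 with columns "date" and "pm25".
df = pd.read_csv("anhui_pm25_daily.csv")
df = df.rename(columns={"date": "ds", "pm25": "y"})  # Prophet expects 'ds' and 'y'

model = Prophet(
    growth="linear",                 # the linear trend option mentioned above
    n_changepoints=25,               # many candidate change points over the history
    changepoint_prior_scale=0.05,    # strength of the sparse (Laplace/L1-like) penalty
    yearly_seasonality=True,
    weekly_seasonality=True,
)
model.fit(df)

# Forecast the next 1.5 years of daily PM2.5 values (cf. Fig. 3).
future = model.make_future_dataframe(periods=int(365 * 1.5))
forecast = model.predict(future)
print(forecast[["ds", "yhat", "yhat_lower", "yhat_upper"]].tail())
```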

2.4 Statistical Analysis

In this work, we used the determination coefficient (R²), root mean squared error (RMSE), and mean absolute error (MAE) to evaluate the model's performance. The following formulas are used for these metrics:

R² = Σ_{i=1}^{n} (y_i − x̄)² / Σ_{i=1}^{n} (x_i − x̄)²    (3)

RMSE = √[ (1/n) Σ_{i=1}^{n} |x_i − y_i|² ]    (4)

MAE = (1/n) Σ_{i=1}^{n} |x_i − y_i|    (5)

where x_i and y_i denote the actual and predicted values, respectively, and n is the number of samples.
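A minimal sketch of computing these metrics is given below. The arrays are placeholders for the actual and predicted daily PM2.5 series, and the determination coefficient is computed here as the squared Pearson correlation, which is one common convention and may differ slightly from the exact form of Eq. (3).

```python
import numpy as np

def evaluate(actual: np.ndarray, predicted: np.ndarray) -> dict:
    """Compute R2, RMSE, and MAE for a forecast against observations."""
    rmse = np.sqrt(np.mean(np.abs(actual - predicted) ** 2))   # Eq. (4)
    mae = np.mean(np.abs(actual - predicted))                   # Eq. (5)
    r = np.corrcoef(actual, predicted)[0, 1]                    # squared correlation as R2
    return {"R2": r ** 2, "RMSE": rmse, "MAE": mae}

actual = np.array([42.0, 55.0, 38.0, 61.0])      # placeholder observations (ug/m3)
predicted = np.array([45.0, 50.0, 40.0, 58.0])   # placeholder forecasts (ug/m3)
print(evaluate(actual, predicted))
```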

3 Results and Discussion

To specify the features of the model, a linear trend was used and the L1 regularization technique was applied for the error and change points. The actual and predicted values were compared for both short-term and long-term prediction. During the entire period, the PFM showed superior performance. Figures 2 and 3 show the predicted results of ambient PM2.5 in Anhui Province. The results indicate that during the entire period the actual and predicted values were significantly fitted, and the predicted R, RMSE, and MAE values for the PM2.5 concentration by PFM were 0.63, 15.52 μg/m3, and 10.62 μg/m3, respectively (Fig. 2). Deters et al. [5] predicted PM2.5 concentration using a machine learning approach; the performance of the PFM in the current work was better than in that study.

In the 1-year prediction, the predicted R, RMSE, and MAE values for PM2.5 were 0.58, 13.38 μg/m3, and 9.38 μg/m3, respectively. The performance of the model over the entire period, as measured by the R value, was higher than for the yearly prediction, while the RMSE and MAE values were lower in the yearly prediction compared with the entire-period forecasting of ambient PM2.5 in Anhui province. The actual and predicted values showed good agreement during both periods with small differences. Previously, Ye [15] used ARIMA and prophet methods to forecast air pollutants; that study revealed lower accuracy compared with the current work. Moreover, for the 6-month duration, the PFM provided superior performance as shown by all statistical indicators (Fig. 2). The predicted R, RMSE, and MAE values by PFM for ambient PM2.5 were 0.66, 12.43 μg/m3, and 8.64 μg/m3, respectively. It should be noted that during this window of time, the predicted R, RMSE, and MAE values were improved. A significant relation was observed between the actual and predicted values in the 6-month prediction. With a 3-month prediction, the model predicted the concentration of PM2.5 with R = 0.48, RMSE = 16.70, and MAE = 12.69 in Anhui province. This suggests that the model provides better performance for long-term prediction than for short-term. Figure 3 shows the predicted ambient PM2.5 for the upcoming 1.5 years.


Fig. 2 Scatterplots of ambient PM2.5 results; a entire dataset, b yearly prediction, c 6-month prediction, and d 3-month prediction

4 Conclusion

In the current study, the PFM was used to predict both short-term and long-term ambient PM2.5 concentrations, using daily average data in Anhui Province. According to the obtained results, the model is well able to accurately predict the concentration of PM2.5, and the actual and predicted values were significantly fitted during different windows of time. The model can be used in other regions and fields as a prediction method to obtain new findings. The results of the current research will be supportive and helpful for local bodies and policymakers to control and mitigate air pollution problems in the upcoming years.


Fig. 3 PM2.5 (μg/m3 ) forecasting in Anhui Province

References 1. Air Visual (2019) Airvisual–air quality monitor and information you can trust. Available at: https://www.airvisual.com/. Accessed 26 Aug 2019 2. Bhatti UA, Wu G, Bazai SU, Nawaz SA, Baryalai M, Bhatti MA, Nizamani MM (2022) A pre-to post-COVID-19 change of air quality patterns in anhui province using path analysis and regression. Pol J Environ Stud. https://doi.org/10.1007/s11356-020-08948-1 3. Bhatti UA, Yan Y, Zhou M, Ali S, Hussain A, Qingsong H et al (2021) Time series analysis and forecasting of air pollution particulate matter (PM2.5): an SARIMA and factor analysis approach. IEEE Access 9:41019–41031. https://doi.org/10.1109/access.2021.3060744 4. Bilal M, Mhawish A, Nichol JE, Qiu Z, Nazeer M, Ali MA et al (2021) Air pollution scenario over pakistan: characterization and ranking of extremely polluted cities using long-term concentrations of aerosols and trace gases. Remote Sens Environ 264:112617. https://doi.org/10.1016/ j.rse.2021.112617 5. Deters JK, Zalakeviciute R, Gonzalez M, Rybarczyk Y (2017) Modeling PM2.5 urban pollution using machine learning and selected meteorological parameters. J Electr Comput Eng 1–14. https://doi.org/10.1155/2017/5106045 6. Garcia JM, Teodoro F, Cerdeira R, Coelho LMR, Kumar P, Carvalho MG (2016) Developing a methodology to predict Pm10 concentrations in urban areas using generalized linear models. Environ Technol 37(18):2316–2325. https://doi.org/10.1080/09593330.2016.1149228 7. Hasnain A, Hashmi MZ, Bhatti UA, Nadeem B, Wei G, Zha Y, Sheng Y (2021) Assessment of air pollution before, during and after the COVID-19 Pandemic Lockdown in Nanjing, China. Atmosphere 12:743. https://doi.org/10.3390/atmos12060743 8. Hasnain A, Sheng Y, Hashmi MZ, Bhatti UA, Hussain A, Hameed M, Marjan S, Bazai SU, Hossain MA, Sahabuddin M, Wagan RA, Zha Y (2022) Time series analysis and forecasting of air pollutants based on prophet forecasting model in Jiangsu Province, China. Front Environ Sci 10:945628. https://doi.org/10.3389/fenvs.2022.945628 9. He B, Heal MR, Reis S (2018) Land-use regression modelling of intraurban air pollution variation in China: current status and future needs. Atmosphere 9(4):134 10. Kami´nska JA (2018) The use of random forests in modelling short-term air pollution effects based on traffic and meteorological conditions: a case study in wrocław. J Environ Manage 217:164–174. https://doi.org/10.1016/j.jenvman.2018.03.094 11. Liu N, Zhou S, Liu C, Guo J (2019) Synoptic circulation pattern and boundary layer structure associated with PM2.5 during wintertime haze pollution episodes in Shanghai. Atmos Res 228:186–195. https://doi.org/10.1016/j.atmosres.2019.06.001 12. Taylor SJ, Letham B (2017) Forecasting at scale. Am Stat 72(1):37–45. https://doi.org/10.1080/ 00031305.2017.1380080


13. Wang J, He L, Lu X, Zhou L, Tang H, Yan Y et al (2022) A full-coverage estimation of PM2.5 concentrations using a hybrid XGBoost-WD model and WRF-simulated meteorological fields in the Yangtze River Delta Urban agglomeration, China. Environ Res 203:111799. https://doi. org/10.1016/j.envres.2021.111799 14. Wu X, Guo J, Wei G, Zou Y (2020) Economic losses and willingness to pay for haze: the data analysis based on 1123 residential families in Jiangsu Province, China. Environ Sci Pollut Res 27:17864–17877. https://doi.org/10.1007/s11356-020-08301-6 15. Ye Z (2019) Air pollutants prediction in shenzhen based on arima and prophet method. E3S Web Conf 136:05001. https://doi.org/10.1051/e3sconf/201913605001 16. Zhai S, Jacob DJ, Wang X, Shen L, Li K, Zhang Y et al (2019) Fine particulate matter (PM2.5) trends in China, 2013-2018: separating contributions from anthropogenic emissions and meteorology. Atmos Chem Phys 19:11031–11041. https://doi.org/10.5194/acp-19-110312019

Potato Leaf Disease Classification Using K-means Cluster Segmentation and Effective Deep Learning Networks Md. Ashiqur Rahaman Nishad, Meherabin Akter Mitu, and Nusrat Jahan

Abstract Potatoes are the most often consumed vegetable in many countries throughout the year, and Bangladesh is one of them. Plant diseases and venomous insects pose a significant agricultural hazard and now substantially impact Bangladesh's economy. This paper proposes a real-time technique for detecting potato leaf disease based on a deep convolutional neural network. Segmentation is the partitioning of a picture into several regions or categories; we have used the K-means clustering algorithm for segmentation. In addition, to increase the model's efficacy, numerous data augmentation procedures have been applied to the training data. A convolutional neural network is a deep learning neural network suited to processing structured grids of data, such as images. We have used a novel CNN approach, VGG16, and ResNet50. Using VGG16, the novel CNN, and ResNet50, the suggested technique was able to classify potato leaves into three groups with 96, 93, and 67% accuracy, respectively. The recommended method outperforms current methodologies, as we compared the performances of the models according to relevant parameters. Keywords Potato disease · Deep learning · VGG16 · Image segmentation · K-means clustering · Data augmentation

1 Introduction

Agriculture is commonly known as soil culture. It is considered the backbone of the economic framework of developing nations. In Bangladesh, agriculture is imperative

Md. A. R. Nishad · M. A. Mitu · N. Jahan (B) Department of CSE, Daffodil International University, Dhaka, Bangladesh e-mail: [email protected] Md. A. R. Nishad e-mail: [email protected] M. A. Mitu e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 S. Anwar et al. (eds.), Proceedings of International Conference on Information Technology and Applications, Lecture Notes in Networks and Systems 614, https://doi.org/10.1007/978-981-19-9331-2_4


for people's subsistence and its contribution to GDP. In 2020, agriculture accounted for 12.92% of Bangladesh's GDP [1]. Recently, potato has become the third most consumed food in Bangladesh. On the other hand, 56 diseases have been recorded in potato fields in Bangladesh [2], where the loss of annual potato yield due to late blight is estimated at 25–57% [3]. Late blight is the most common and highly detrimental parasitic disease in potatoes. Therefore, it can be beneficial to the agriculture and economy of Bangladesh if we can reduce the potato production losses due to these diseases.

K-means clustering is an unsupervised learning algorithm used to solve clustering problems in machine learning and data science. However, other approaches, such as contour detection and edge detection, are also helpful for segmentation. For example, image contour detection is crucial to numerous image analysis applications, including picture segmentation, object recognition, and classification [4]. Deep learning has been an effective tool over the past few decades for handling large amounts of data. Interest in utilizing hidden layers has surpassed traditional methods, particularly in pattern recognition. One of the most well-known deep neural networks is the Convolutional Neural Network (CNN). Remarkable progress has been made in image recognition, primarily due to the availability of large-scale annotated datasets and the revival of deep convolutional neural networks (CNN) [5]. CNN is the dominant method of deep learning [6].

As previously mentioned, the loss of potatoes is 25–57% yearly due to late blight. If we can reduce this loss rate to 10%, it will have a huge impact on the economy of the country. For this reason, we think that more research needs to be done in this field and that it has good research scope. Finally, a deep learning-based system was proposed to predict potato leaf disease in our study and is illustrated in Fig. 1. It is time to motivate ourselves for agricultural development because this could be a way to protect our world from various disasters. The contributions of this study are listed as follows:

• We proposed a preprocessing step on the PlantVillage potato leaf dataset.
• The processed images are segmented by K-means clustering.
• Finally, the dataset is classified into its respective classes, such as early blight, late blight, and healthy leaf, using different networks including VGG16, ResNet50, and 2D-CNN.

2 Literature Review

Our main aim is to provide a better solution for potato leaf disease detection and classification. However, researchers have already proposed different techniques for detecting potato leaf diseases. A summary of those approaches is highlighted in this section. A CNN model was proposed by Mohit et al. [7] in which three max-pooling layers were followed by two fully connected layers, which gave them efficiency over the


Fig. 1 Deep learning-based smart system to predict potato leaf disease

pre-trained models. Overall, they got 91.2% accuracy on the PlantVillage dataset of 10 classes (9 disease classes and one healthy class). Chlorosis, often known as yellowing disease, is a plant disease that affects black gram plants. Changjian et al. [8] proposed a restructured residual dense network, a hybrid deep learning model that combines the advantages of deep residual and dense networks while reducing the training effort, and evaluated it on the tomato leaf dataset from AI Challenger. Vaibhav et al. [9] used a hybrid dataset collected from four different sources. They followed five-fold cross-validation and testing on unseen data for a rigorous evaluation. The model gained a cross-validation accuracy of 99.585% and an average test accuracy of 99.199% on the unseen images. Divyansh et al. [11] used a pre-trained model to extract significant features from the potato PlantVillage dataset, and logistic regression provided 97.8% accuracy. Amreen et al. [12] and Anam et al. [13] suggested deep learning methods: they segmented the images and trained CNN models with those images, achieving the highest accuracy on the GitHub dataset by utilizing DenseNet121 with 10-fold cross-validation. Md. Tarek et al. [14] applied the k-means clustering segmentation method on fruit images, and SVM provided 94.9% accuracy. Yonghua et al. [15] designed an AISA-based GrabCut algorithm to remove the background information. On the other hand, using the same dataset, Sumita et al. [16] presented a CNN for recognizing corn leaf disease and got 98.88% accuracy. Huiqun et al. [17] applied transfer learning to reduce the size of the training data, the computational time, and the model complexity. Five deep network structures were used, and Densenet_Xception offered the highest accuracy. Rangarajan et al. [18] proposed a pre-trained VGG16 algorithm for identifying eggplant disease. The highest accuracy for datasets created with RGB and YCbCr images in field conditions was 99.4%. Parul et al. [19] have utilized a CNN method


to identify diseases in plants. They created a dataset combining the open-source PlantVillage Dataset and images from the field and the Internet and got 93% accuracy. After observing several previous research works, we have summarized a few recent papers and illustrated them in Table 1.

Table 1 Summary of recent papers for potato disease prediction

Author | Algorithm | Dataset | Classes | Accuracy (%)
Zhou et al. (2021) [26] | Restructured residual dense network, Deep-CNN, ResNet50, DenseNet121 | Tomato AI Challenger dataset (13,185 images) | 9 classes of tomato leaf diseases | 95
Tiwari et al. (2021) [24] | SVM, ANN, KNN, DenseNet 121, DenseNet 201, MobileNet-v2 | Hybrid dataset (25,493 images) | 27 classes of 6 different crops' diseases | 99.58
Tiwari et al. (2020) [25] | VGG19, Inception V3, Logistic Regression, VGG16 | Potato PlantVillage (2152 images) | 3 classes of potato leaf diseases | 97.8
Umamageswari et al. (2021) [10] | FCM, CSA, Fast GLSM model, PNAS-Progressive Neural Architecture | Mendeley's leaf disease dataset (61,485 images) | 8 classes of 7 different crops' diseases | 97.43
Abbas et al. (2021) [27] | C-GAN, DenseNet21 | Tomato PlantVillage (16,012 images) | 10 classes of tomato leaf diseases | 99.51

3 Data Preparation

3.1 Data Collection

Data is one of the major parts of any machine learning algorithm. In this study, infected and healthy potato leaf images were collected from the PlantVillage potato leaf disease dataset. We observed two common potato diseases, early and late blight; however, we consider a total of three classes including the healthy leaf. To train and test our proposed networks' performance, the dataset has been divided in an 80:20 ratio. Table 2 presents the exact data volume for each class, and Fig. 2 shows sample data.


Table 2 Dataset

Serial No. | Class | Number of samples | Training sample | Test sample
1 | Healthy | 152 | 122 | 30
2 | Late blight | 1000 | 800 | 200
3 | Early blight | 1000 | 800 | 200
Total | | 2152 | 1722 | 430

Fig. 2 Example of PlantVillage dataset. a Potato early blight, b Potato late blight, and c Potato healthy

3.2 Augmentation

Different data augmentation techniques have been applied to the training data to enhance the model's efficiency. Using smaller pixel values reduces the computation cost considerably, so we used a scale transformation that maps pixel intensities to the range 0 to 1 (1/255). A shear angle of 0.2 is applied to the training images, in which one axis is fixed while the other is stretched to a specific angle. We applied a zoom range of 0.2 to zoom in on the images and a horizontal flip to mirror the image horizontally. The augmentation techniques applied in this study are listed as follows (a minimal pipeline sketch is shown after this list):

• Rescale (1/255)
• shear_range = 0.2
• zoom_range = 0.2
• Horizontal flip
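The sketch below illustrates these settings with a Keras ImageDataGenerator, assuming the images are arranged in one folder per class; the directory path, image size, and batch size are assumptions, not the authors' exact configuration.

```python
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Augmentation settings listed above, expressed as a Keras generator.
train_datagen = ImageDataGenerator(
    rescale=1.0 / 255,      # scale pixel values to the 0-1 range
    shear_range=0.2,        # shear angle of 0.2
    zoom_range=0.2,         # zoom in/out by up to 20%
    horizontal_flip=True,   # mirror images horizontally
)

train_generator = train_datagen.flow_from_directory(
    "data/train",              # assumed folder with one sub-folder per class
    target_size=(224, 224),    # assumed input size for VGG16/ResNet50
    batch_size=32,
    class_mode="categorical",  # three classes: early blight, late blight, healthy
)
```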

3.3 Segmentation

The primary purpose of segmentation is to normalize and alter the visualization of an image so that it is easier to analyze. We chose the k-means clustering method and evaluated multiple values of K (3, 5, and 7); among these, we observed that K = 3 produces the best output, which is why we finally chose K = 3. Figure 3 presents the pseudo code of k-means clustering.

Fig. 3 Pseudo code to describe K-means clustering

K-means clustering aims to minimize the sum of squared distances between all points and their cluster centers, as shown in Eq. (1):

J = Σ_{j=1}^{k} Σ_{i=1}^{n} ||x_i^{(j)} − c_j||²    (1)

where J = objective function, k = number of clusters, n = number of cases, x_i = case i, c_j = centroid for cluster j, and ||x_i^{(j)} − c_j|| is the distance function. After applying the k-means clustering algorithm on our dataset, we obtained segmented data. The output of k-means is presented in Fig. 4.
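A minimal sketch of this kind of K = 3 colour segmentation with OpenCV is shown below; the file names are assumptions, and the exact pre-processing used in the study may differ.

```python
import cv2
import numpy as np

# K-means colour segmentation of a leaf image with K = 3 (file names are placeholders).
image = cv2.imread("leaf.jpg")
pixels = image.reshape(-1, 3).astype(np.float32)   # one row per pixel (B, G, R)

# Stop after 100 iterations or when centres move less than 0.2.
criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 100, 0.2)
k = 3
_, labels, centers = cv2.kmeans(pixels, k, None, criteria, 10, cv2.KMEANS_RANDOM_CENTERS)

# Replace every pixel with its cluster centre to obtain the segmented image (cf. Fig. 4).
segmented = centers[labels.flatten()].astype(np.uint8).reshape(image.shape)
cv2.imwrite("leaf_segmented.jpg", segmented)
```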

3.4 Proposed Network

In this section, we discuss three different network models. Our prepared dataset performed better with VGG16. Figure 5 illustrates the basic block diagram of our study.
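Since VGG16 and ResNet50 are used here as pre-trained models, the following is a minimal transfer-learning sketch for the three potato leaf classes. The head layers, freezing strategy, and hyper-parameters are assumptions for illustration, not the authors' exact architecture.

```python
from tensorflow.keras.applications import VGG16
from tensorflow.keras import layers, models

# Pre-trained VGG16 backbone with a small classification head for 3 classes.
base = VGG16(weights="imagenet", include_top=False, input_shape=(224, 224, 3))
base.trainable = False  # keep the pre-trained convolutional features frozen

model = models.Sequential([
    base,
    layers.Flatten(),
    layers.Dense(256, activation="relu"),
    layers.Dropout(0.5),
    layers.Dense(3, activation="softmax"),  # early blight, late blight, healthy
])

model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
model.summary()
# model.fit(train_generator, epochs=50)  # 50 epochs, as reported in Sect. 4
```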

4 Experimental Result Analysis

Several sets of experiments have been carried out for this plant leaf disease classification and detection research. We used k-means clustering, a common image segmentation approach, to segment the images [20].


Fig. 4 Dataset after segmentation

Fig. 5 Block diagram of our study

To anticipate the classes of the leaf photos, we used three classification approaches: CNN, ResNet50, and VGG16. VGG16 and ResNet50 are pre-trained models. Each model was trained for 50 epochs on the training set. Table 3 denotes the performance measures of our models. We employed performance measures such as accuracy, precision, recall, F1-score, and the confusion matrix to evaluate the suggested approach's performance:

Accuracy = (TP + TN) / (TP + TN + FP + FN)
Precision = TP / (TP + FP)
Recall = TP / (TP + FN)
F1-score = 2 · TP / (2 · TP + FP + FN)

where TP = True Positive, TN = True Negative, FP = False Positive, and FN = False Negative.
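A minimal sketch of computing these metrics with scikit-learn is shown below; the label vectors are placeholders for the predictions of a trained model on the three classes (0, 1, 2).

```python
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

# Placeholder ground-truth and predicted labels for the three leaf classes.
y_true = [0, 0, 1, 1, 2, 2, 2, 1]
y_pred = [0, 1, 1, 1, 2, 2, 0, 1]

accuracy = accuracy_score(y_true, y_pred)
precision, recall, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average="macro"   # macro-average over the three classes
)
print(f"ACC={accuracy:.3f} PR={precision:.3f} Recall={recall:.3f} F1={f1:.3f}")
```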


Table 3 Performances of different approaches

Approach | Algorithm | ACC | PR | Recall | F1-score
Before augmentation | VGG16 | 0.954 | 0.954 | 0.957 | 0.955
Before augmentation | Novel 2D-CNN | 0.776 | 0.774 | 0.775 | 0.775
Before augmentation | ResNet50 | 0.643 | 0.635 | 0.655 | 0.645
After augmentation | VGG16 | 0.959 | 0.959 | 0.945 | 0.952
After augmentation | Novel 2D-CNN | 0.815 | 0.814 | 0.804 | 0.808
After augmentation | ResNet50 | 0.63 | 0.63 | 0.63 | 0.63
After segmentation (K-means) + augmentation | VGG16 | 0.963 | 0.963 | 0.965 | 0.964
After segmentation (K-means) + augmentation | Novel 2D-CNN | 0.93 | 0.93 | 0.91 | 0.92
After segmentation (K-means) + augmentation | ResNet50 | 0.67 | 0.66 | 0.68 | 0.67

(Evaluation metrics: ACC = accuracy, PR = precision.)

After augmentation and segmentation, we found VGG16 to be the best model for our dataset as it generated 96% accuracy. The other two models generated 93 and 67% accuracy for our dataset. We present the ROC curve for VGG16 in Fig. 6.

4.1 Performance Comparison

The proposed VGG16 model is compared to previously proposed networks such as VGG19, Novel CNN, PDDCNN, MCD (minimum–maximum distance), and SVM. All of the models were trained on the original PlantVillage dataset before being applied to the augmented dataset; some of the models included segmentation. Table 4 shows that the presented VGG16 model outperformed all other proposed models on the "augmented + segmented" dataset, with 96% accuracy.

5 Conclusion and Future Work

Deep learning-based approaches have emerged as a great solution that produces promising outcomes in plant disease detection and recognition. This study has


Fig. 6 Training and validation results VGG16

proposed a deep learning-based method to classify potato leaf disease; here, we also used the k-means segmentation approach to generate better results. In this paper, we used three different deep learning-based algorithms. After completing our experiment on the original Kaggle PlantVillage potato leaf dataset, the Convolutional Neural Network (CNN) achieved 93% accuracy, ResNet50 provided 67% accuracy, and finally from VGG16 we obtained 96% accuracy. We used k-means clustering for image segmentation followed by four types of data augmentation on the training set. Therefore, we can summarize the study as follows:

• We applied a k-means clustering segmentation approach.
• We prepared the dataset using different augmentation methods.
• VGG16 was proposed as the best model for our experiment.

In future work, we will develop an application to predict the class of a leaf disease and apply other algorithms to enrich the model performance. As a result, farmers in the agricultural field will be able to identify specific diseases at an early stage, which will be helpful for them to take the necessary steps. However, we have a few limitations:


Table 4 Comparison of previous work with our proposed model

Reference | CNN model | Segmentation | Augmentation | Dataset | Accuracy (%)
Rizqi et al. [21] | VGG16, VGG19 | N/A | Yes (translations, rotation, shearing, vertical and horizontal flips) | PlantVillage | 91
Javed et al. [22] | Novel CNN, PDDCNN | YOLOv5 | Yes (scale transformation, rotation, shearing, vertical flips, zoom) | PlantVillage, PLD | 48.9
Ungsumalee and Aekapop [23] | MCD | K-means clustering | N/A | PlantVillage | 91.7
Proposed | VGG16, Novel 2D-CNN, ResNet50 | K-means clustering | Yes (rescale, horizontal flip, shear, zoom) | PlantVillage | 96

• A larger amount of data may improve the results.
• It is possible to experiment with other segmentation methods.
• Finally, a better application for crop fields should be provided.

References 1. O’Neill A (2022) Share of economic sectors in the GDP in Bangladesh 2020. https://www.sta tista.com/statistics/438359/share-of-economic-sectors-in-the-gdp-in-bangladesh/. Accessed 21 June 2022 2. Naher N, Mohammad H, Bashar MA (2013) Survey on the incidence and severity of common scab of potato in Bangladesh. J Asiatic Soc Bangladesh, Sci 39(1):35–41 3. Huib H, Joost Van U (2017) Geodata to control potato late blight in Bangladesh (GEOPOTATO). https://www.fao.org/e-agriculture/news/geodata-control-potato-late-blightbangladesh-geopotato. Accessed 3 June 2022 4. Catanzaro B et al (2009) Efficient, high-quality image contour detection. In: 12th international conference on computer vision. IEEE 5. Shin H et al (2020) Deep convolutional neural networks for computer-aided detection: CNN architectures, dataset characteristics and transfer learning. IEEE Trans Med Imaging 35(5):1285–1298 6. Sun Y et al (2020) Automatically designing CNN architectures using the genetic algorithm for image classification. IEEE Trans Cybernet 50(9):3840–3854 7. Mohit A et al (2020) ToLeD: tomato leaf disease detection using convolution neural network. Procedia Comput Sci 167:293–301 8. Changjian Z et al (2021) Tomato leaf disease identification by restructured deep residual dense network. IEEE Access 9:28822–28831


9. Vaibhav T et al (2021) Dense convolutional neural networks based multiclass plant disease detection and classification using leaf images. Eco Inform 63:101289 10. Umamageswari A et al (2021) A novel fuzzy C-means based chameleon swarm algorithm for segmentation and progressive neural architecture search for plant disease classification. ICT Express 11. Divyansh T et al (2020) Potato leaf disease detection using deep learning. In: 4th international conference on intelligent computing and control systems (ICICCS) 12. Amreen A et al (2021) Tomato plant disease detection using transfer learning with C-GAN synthetic images. Comput Electron Agric 187:106279 13. Anam I et al (2021) Rice leaf disease recognition using local threshold based segmentation and deep CNN. Int J Intell Syst Appl 13(5) 14. Md Tarek H et al (2021) An explorative analysis on the machine-vision-based disease recognition of three available fruits of Bangladesh. Vietnam J Comput Sci 1–20 15. Yonghua X et al (2020) Identification of cash crop diseases using automatic image segmentation algorithm and deep learning with expanded dataset”. Comput Electron Agric 177:105712 16. Sumita M et al (2020) Deep convolutional neural network based detection system for real-time corn plant disease recognition. Procedia Comput Sci 167:2003–2010 17. Huiqun H et al (2020) Tomato disease detection and classification by deep learning. In: International conference on big data, artificial intelligence and internet of things engineering (ICBAIE) 18. Rangarajan K et al (2020) Disease classification in eggplant using pre-trained VGG16 and MSVM. Sci Rep 10(1):1–11 19. Parul S et al (2018) KrishiMitr (Farmer’s Friend): using machine learning to identify diseases in plants. In: IEEE international conference on internet of things and intelligence system (IOTAIS) 20. Nameirakpam D et al (2015) Image segmentation using K-means clustering algorithm and subtractive clustering algorithm. Procedia Comput Sci 54:764–771 21. Rizqi AS et al (2020) Potato leaf disease classification using deep learning approach. In: International electronics symposium (IES). IEEE 22. Javed R et al (2021) Multi-level deep learning model for potato leaf disease recognition. Electronics 10(17):2064 23. Ungsumalee S, Aekapop B (2019) Potato leaf disease classification based on distinct color and texture feature extraction. In: International symposium on communications and information technologies (ISCIT). IEEE 24. Tiwari V et al (2021). Dense convolutional neural networks based multiclass plant disease detection and classification using leaf images. Ecol Inf 63(2021): 101289. https://doi.org/10. 1016/j.ecoinf.2021.101289 25. Tiwari D et al (2020) Potato leaf diseases detection using deep learning. In: 4th International Conference on Intelligent Computing and Control Systems (ICICCS) (pp 41–466). IEEE 26. Zhou C et al (2021) Tomato leaf disease identification by restructured deep residual dense network. IEEE Access 9(2021): 28822–28831 27. Abbas, Amreen, et al. (2021) Tomato plant disease detection using transfer learning with C-GAN synthetic images. Comput Electron Agric 187(2021):106279

Diagnosis of Polycystic Ovarian Syndrome (PCOS) Using Deep Learning Banuki Nagodavithana and Abrar Ullah

Abstract Polycystic Ovarian Syndrome (PCOS) is a silent disorder that causes women to have weight gain, infertility, hair loss, and irregular menstrual cycles. It is a complex health issue, and one of the methods to diagnose patients with PCOS is to count the number of follicles in the ovaries. The issue with the traditional method is that it is time-consuming and prone to human error, as it can be challenging for medical professionals to distinguish between healthy ovaries and polycystic ovaries. Using deep learning, the concept was to create and use various deep learning models, such as a CNN, Custom VGG-16, ResNet-50, and Custom ResNet-50, to obtain a high-accuracy result that distinguishes between healthy and polycystic ovaries. From the results and evaluation obtained, the CNN model achieved 99% accuracy, the VGG-16 model 58%, the ResNet-50 model 58%, and the Custom ResNet-50 model 96.7%.

1 Introduction

Polycystic ovary syndrome (PCOS) is a silent disorder with serious side effects that has affected women globally, causing them to suffer from different types of health issues such as irregular menstrual cycles, weight gain, infertility, hair loss, and diabetes (Fig. 1). Since it is a complex health issue, the traditional method of diagnosing a patient with PCOS is that a medical professional would have to confirm two of the three criteria: high androgen levels (male sex hormones), irregular menstrual cycles, and a high number of follicles in the ovaries. The regular process of detecting polycystic ovaries is to use a transabdominal scan of the ovaries. After the medical professional receives the scan, they would have to count the number

B. Nagodavithana · A. Ullah (B) Heriot-Watt University, Dubai, UAE e-mail: [email protected] B. Nagodavithana e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 S. Anwar et al. (eds.), Proceedings of International Conference on Information Technology and Applications, Lecture Notes in Networks and Systems 614, https://doi.org/10.1007/978-981-19-9331-2_5


Fig. 1 Difference between a normal ovary and a polycystic ovary. Source [21]

of follicles (cysts) in the ovaries. If there are more than twelve follicles within the ovary with a diameter of 2–10 mm and an ovarian volume of more than 10 cm3, the patient is most likely to have polycystic ovaries [14]. However, the traditional method is prone to human error and can be time-consuming. It is quite difficult to distinguish between a normal ovary and a polycystic ovary as sometimes the characteristics can be similar. In a study called "Delayed Diagnosis and a Lack of Information Associated With Dissatisfaction in Women With Polycystic Ovary Syndrome" by Gibson et al., a large number of women outlined delayed diagnosis and vague information given by doctors [8]. Since this can be an underlying problem, there is a need to create an apparatus that will detect the disease quickly, provide high-accuracy results, and most importantly provide a platform where women can have a better patient experience. The goal was to create, implement, and train various deep learning models that would tackle and observe the disease's identity, patterns, and characteristics to give prime results. The outcome would be beneficial to healthcare professionals and patients as this would reduce the time of the diagnosis and shed light on a complex disease by providing accurate results. This is a challenge as there is a lack of research done on detecting polycystic ovaries through modern technology.

2 Background

PCOS is a hormonal disorder that affects women of reproductive age, in which the ovaries deliver aberrant amounts of androgens (male sex hormones) [13], causing women to have irregular menstrual cycles, hair loss, weight gain, and infertility [26]. During ovulation, a mature egg is released from the ovary so that it waits to be fertilized by a male sperm; however, if the egg is not fertilized, it will be sent out from the body during menstruation. Occasionally, a woman does not develop


the right number of hormones that are needed to ovulate; therefore, when ovulation does not take place, the ovaries can start to develop small follicles. These tiny follicles make hormones called androgens. Women with high levels of androgens often have PCOS, and this is an issue because it can affect a woman's menstrual cycle [32]. Although studies suggest that women of different ages may experience different effects of PCOS, a study conducted by [30] indicated that adolescents may experience other symptoms of PCOS in relation to their living habits, experiencing a change in weight, acne, hirsutism, and irregular menstrual cycles [30]. Hailes [9] further states that PCOS also has an impact on the mentality and physicality of women, such as hair growth, and psychological disorders such as depression, anxiety, and bipolar disorder [2].

Different diagnosis methods have emerged over the years among medical professionals. This importantly highlights the difficulty and struggle of diagnosing women with PCOS, as the methods have kept changing. In 1990, the National Institutes of Health (NIH) criteria defined the features of PCOS diagnosis based on the existence of clinical or biochemical hyperandrogenism and oligo/amenorrhea anovulation [16]. Biochemical hyperandrogenism is when the levels of androgens in the blood reach a higher level [16]. For clinical hyperandrogenism, medical professionals will search for physical signs such as acne, hair loss, and increased body hair that indicate a boosted amount of androgen levels. A woman that does not have PCOS usually has 3-8 follicles per ovary [5]. In unusual cases where women have a larger number of follicles in their ovary, polycystic ovarian morphology (PCOM) would be used as a test. PCOM was established by the Rotterdam Criteria in 2003 to diagnose patients with PCOS using polycystic ovarian morphology (PCOM) on the ultrasound along with the clinical or biochemical hyperandrogenism and oligo/amenorrhea anovulation included by the NIH [2]. In brief, PCOM is used to test the follicles per ovary, to see if the number of follicles is equal to or greater than 12 and/or the ovarian volume is greater than 10 cm3 in at least one ovary, and this can be detected with the help of ultrasound scanning [24]. Hence, the European Society of Human Reproduction and Embryology/American Society for Reproductive Medicine Rotterdam consensus (ESHRE/ASRM) expanded the diagnosis of PCOS to one that meets two of the following standards: anovulation or oligo-ovulation must be present, clinical or biochemical hyperandrogenism must be present, and polycystic ovarian morphology (PCOM) must be seen on the ultrasound [20]. Lastly, the Androgen Excess Society evaluated PCOS as hyperandrogenism with polycystic ovaries or ovarian dysfunction. The Androgen Excess Society (AES) maintained that increased levels of androgen are the cause of PCOS; therefore, androgen excess must be present, and oligomenorrhea or polycystic cysts must be visible in the ultrasound images [3].

Medical imaging is a modern solution to diagnose, monitor, or treat diseases using different types of technology. Ultrasound imaging is a type of medical imaging method that is used to capture live images of tissues, organs, or vessels in the body without having to make an incision [12]. Regarding diagnosing PCOS, medical


professionals use a procedure called a transvaginal ultrasound scan, which is a type of pelvic ultrasound used to analyse a female's reproductive organs such as the ovaries, uterus, or cervix [10]. It is one of the recommended methods as it shows the internal structure of the ovary, which can be visible especially in obese patients. Another method, the transabdominal ultrasound, can also be used [23]; it is a method to visualize the organs in the abdomen. However, transvaginal ultrasound imaging is more reliable for detecting the appearance of polycystic ovaries. Since the transvaginal ultrasound includes a 3D ultrasound, it is easily accessible for medical professionals to view and analyse the image that is needed to diagnose the patient with PCOS. The medical expert can count the number of cysts and calculate the ovarian volume using the simplified formula 0.5 × length × height × width [4]; for example, an ovary measuring 4 cm × 3 cm × 2 cm would have an estimated volume of 0.5 × 4 × 3 × 2 = 12 cm3, above the 10 cm3 threshold mentioned earlier. These precautions are taken to reduce the likelihood of an error in the ultrasound image. However, it is still important to understand that the cysts can appear in large or small sizes and the ovarian volume can be miscalculated due to human error.

A study conducted by [17] examined the levels of agreement between observers using ultrasonographic features of polycystic ovaries. The focus was to identify and quantify polycystic ovaries, and the method was to investigate transvaginal ultrasound scans of 30 women with PCOS by observers trained in Radiology and Reproductive Endocrinology. The scans recorded the number of follicles greater than or equal to 2 mm, the ovarian volume, the largest follicle diameter, the follicle distribution pattern, and the presence of a corpus luteum [17]. The research concluded that agreement among the observers when evaluating the ultrasonographic features of polycystic ovaries was "moderate to poor". Therefore, further training has been recommended for medical experts in the industry to analyse PCOM on ultrasonography [17].

A study about the "Pitfalls and Controversies" of the diagnostic criteria for Polycystic Ovary Syndrome suggests that the judgement of ultrasound images of polycystic cysts in the ovaries can be subjective. An investigation was conducted on 54 scans of polycystic ovaries that were duplicated and randomized for assessment by four observers [1]. In the results, the observers agreed on a diagnosis of PCOS 51% of the time and agreed with themselves 69% of the time. In this study, a polycystic ovary was defined as having greater than or equal to 10 follicles of 2–8 mm and an ovarian volume greater than or equal to 12 cm3. During the discussion by the observers, the criteria were judged either "too subjective" or the measurements "too insensitive" for an agreement [18]. Therefore, it is important to develop an automated system that gives accurate results distinguishing polycystic from normal ovaries, helping medical professionals detect and diagnose PCOS easily, and this can be done by using deep learning.

Deep learning is a machine learning technique that allows models or systems to perform certain tasks to give an outcome. The model feeds on a large amount of data and has a unique architecture that contains different features and layers that perform different duties to give a better result; this showcases that the models can achieve results beyond human-level performance [15].


Vikas et al. [31] used deep learning to detect polycystic ovaries. The idea was to compare different deep learning techniques such as convolutional neural networks, data augmentation, and transfer learning. The ultrasound images were collected and divided into training, validation, and test sets. In this study, data augmentation was applied to the training set to boost performance. In addition, transfer learning was implemented so that knowledge learned on one task could be re-used in a similar task to enhance the performance of the model [31], whereas a convolutional neural network (CNN) performs image recognition on various types of images; it is essentially used for classifying images, making comparisons, and achieving object recognition. The transfer learning fine-tuning model with data augmentation achieved the highest accuracy of 98%. In a study on the classification of polycystic ovaries from ultrasound images using a competitive neural network architecture, ultrasound images were used as the data and were first pre-processed. The team then used segmentation to separate the object from the background; the objects in this study are the follicles of the polycystic ovaries, which are detected, labelled, and cropped for the next step, feature extraction. Feature extraction takes information from the newly cropped follicle image to differentiate it from other objects, and the classification process then assigns these images to classes according to whether the patient has PCOS or not [7]. The training process trains on the dataset with randomly initialized weights, and the testing process uses a hyperplane to decide whether the follicles indicate PCOS or not; the weights transform the input data in the hidden layer [6]. Using this machine learning approach, a competitive neural network with a 32-feature vector, which took 60.64 seconds to process, gave the best accuracy of 80.84% [7]. Kokila et al. [11] developed brain tumour detection using deep learning techniques. The brain tumour is detected and identified by a CNN model, which is commonly used to provide a high accuracy rate for image data [28]. The model achieved an accuracy of 92%, and the tumour identification model was analysed using the Dice coefficient, with the average Dice score reported in [11].

3 Methodology and Implementation

The requirements focused on models that accept ultrasound images of ovaries and predict PCOS with high accuracy. Over the course of the project, different models were trained and tested on the data, and many changes were made to enhance model performance.


3.1 Implementing the Deep Learning Models

The following section describes the process of collecting and pre-processing the data and of developing and training the deep learning models.

3.1.1 Collecting Data

The models were trained and tested on a Kaggle dataset of ultrasound images of normal and polycystic ovaries. The data contains about 1697 polycystic ovary images and 2361 normal images. Additionally, the ultrasound scans were validated with the help of a medical expert to avoid any conflict of interest. Link to the Kaggle dataset: https://www.kaggle.com/datasets/anaghachoudhari/pcos-detection-using-ultrasound-images.

3.1.2 Splitting the Data

The data was split into training, test, and validation sets with a 60:20:20 ratio. Splitting the data in this way makes it possible to analyse the performance of the deep learning models on unseen images. During training, the polycystic ovary images are labelled 0 and the normal ovary images are labelled 1.
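A minimal sketch of such a split, assuming the images are organised in per-class folders and using scikit-learn's train_test_split; the folder names are illustrative and not taken from the paper:

```python
import glob
from sklearn.model_selection import train_test_split

# Hypothetical folder layout: data/infected/*.jpg and data/not_infected/*.jpg
infected = glob.glob("data/infected/*.jpg")      # polycystic ovaries -> label 0
normal = glob.glob("data/not_infected/*.jpg")    # normal ovaries     -> label 1

paths = infected + normal
labels = [0] * len(infected) + [1] * len(normal)

# Carve out 60% for training, then split the remaining 40% in half
# to obtain a 60:20:20 train/validation/test ratio.
train_p, rest_p, train_y, rest_y = train_test_split(
    paths, labels, test_size=0.4, stratify=labels, random_state=42)
val_p, test_p, val_y, test_y = train_test_split(
    rest_p, rest_y, test_size=0.5, stratify=rest_y, random_state=42)
```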

3.1.3 Data Pre-processing

Before training the models, the data is pre-processed to eliminate anything that could hinder the models. Data normalization and resizing are applied so that the models can consume the data easily.
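A minimal sketch of this step with OpenCV and NumPy; the 224 × 224 target size is an assumption borrowed from the VGG-16 description later in this paper, not a value stated here:

```python
import cv2
import numpy as np

def preprocess(path, size=(224, 224)):
    """Load an ultrasound image, resize it, and scale pixel values to [0, 1]."""
    img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    img = cv2.resize(img, size)                 # resizing
    img = img.astype(np.float32) / 255.0        # normalization
    return img[..., np.newaxis]                 # add a channel axis for Keras
```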

3.1.4 Data Augmentation

Data augmentation is performed to generate additional training samples, which boosts the model's performance. It helps the model avoid overfitting and mitigates the effect of imbalanced data, both of which would affect the models negatively. The augmentation methods used are rotation, zoom, width shift, height shift, and horizontal flip. The Keras ImageDataGenerator produces images with these varied characteristics, and the augmented data is then fed to the deep learning models.
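A minimal sketch of this augmentation pipeline with the Keras ImageDataGenerator; the specific parameter values and directory layout are illustrative assumptions, since the paper lists only the types of transformations used:

```python
from tensorflow.keras.preprocessing.image import ImageDataGenerator

train_gen = ImageDataGenerator(
    rescale=1.0 / 255,        # normalization
    rotation_range=15,        # rotation
    zoom_range=0.1,           # zoom range
    width_shift_range=0.1,    # width shift range
    height_shift_range=0.1,   # height shift range
    horizontal_flip=True)     # horizontal flip

# Hypothetical directory layout with one sub-folder per class.
train_flow = train_gen.flow_from_directory(
    "data/train", target_size=(224, 224), color_mode="grayscale",
    class_mode="binary", batch_size=32)
```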


3.2 Methodology

With the data prepared, this section presents the different models that were implemented and examines each model's architecture and implementation. For each model, an in-depth analysis is provided to demonstrate the complexity of the model. The proposed models are the CNN model, the VGG-16 model, the Custom VGG-16 model, the ResNet-50 model, and the Custom ResNet-50 model.

3.2.1 CNN Model

The CNN model that was implemented is a simple architecture that contains five Conv2D layers, five MaxPool layers, batch normalization layers between the convolution layers, and four dropout layers. The last layer is a Dense layer with a sigmoid activation. Additionally, the RMSprop optimizer was used with a learning rate of 2.7e−05. Because the dataset contains only just over 2000 training images, the model picks up small details of the polycystic ovaries in the first few layers, and the deeper layers detect more precise details of the disease [29].
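A minimal sketch of a CNN of this shape in Keras; the filter counts, dropout rate, and input size are assumptions, since the paper specifies only the layer types, the optimizer, and the learning rate:

```python
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([keras.Input(shape=(224, 224, 1))])
for filters in (32, 64, 128, 128, 256):             # five Conv2D / MaxPool blocks
    model.add(layers.Conv2D(filters, 3, padding="same", activation="relu"))
    model.add(layers.BatchNormalization())
    model.add(layers.MaxPooling2D())
    model.add(layers.Dropout(0.2))                   # dropout between blocks
model.add(layers.Flatten())
model.add(layers.Dense(1, activation="sigmoid"))     # binary PCOS / normal output

model.compile(optimizer=keras.optimizers.RMSprop(learning_rate=2.7e-5),
              loss="binary_crossentropy", metrics=["accuracy"])
```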

3.2.2 VGG-16 Model

The VGG-16 is a convolutional neural network architecture with 16 weight layers (13 convolutional layers and 3 fully connected layers). Its hyperparameter choices are consistent throughout, as it uses only 3 × 3 convolution layers with a large number of filters (Keras n.d.). This model is one of the most popular deep learning architectures and a common choice for extracting features from images. The layout is uniform throughout the structure: convolution layers with 3 × 3 filters, stride 1, and 'same' padding, each group followed by a 2 × 2 max-pooling layer with stride 2 [22]. As the data passes through the model, the number of filters increases from 64 to 512. The final stage of the VGG-16 model consists of three Dense layers. Implementing the model was straightforward as it follows a chain of repeated layers. After importing the necessary libraries, a Sequential model object is defined using Keras. The next step is to add the stack of layers. The first block contains two consecutive convolution layers with 64 filters of size 3 × 3, followed by a 2 × 2 max-pooling layer with stride 2. Additionally, the input image size is 224 × 224 × 1. The remaining layers are then added following the architecture. After the convolutional stacks, the last step is to add the fully connected layers; a Flatten layer must be added before the first fully connected layer. Lastly, the final layer is the output layer with a SoftMax activation [22]. A sketch of this layer stack is given below.
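A minimal sketch of that stack in Keras; the two-unit SoftMax head is an assumption for the binary PCOS task (the original VGG-16 head has 1000 output units for ImageNet):

```python
from tensorflow import keras
from tensorflow.keras import layers

vgg = keras.Sequential([keras.Input(shape=(224, 224, 1))])
# (filters, number of conv layers) per block, as in the standard VGG-16 layout.
for filters, convs in ((64, 2), (128, 2), (256, 3), (512, 3), (512, 3)):
    for _ in range(convs):
        vgg.add(layers.Conv2D(filters, 3, padding="same", activation="relu"))
    vgg.add(layers.MaxPooling2D(pool_size=2, strides=2))
vgg.add(layers.Flatten())
vgg.add(layers.Dense(4096, activation="relu"))
vgg.add(layers.Dense(4096, activation="relu"))
vgg.add(layers.Dense(2, activation="softmax"))       # SoftMax output layer

vgg.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
            metrics=["accuracy"])
```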

3.2.3 ResNet-50 Model

The ResNet-50 architecture is a deep learning model known for image recognition, object detection, and image segmentation. Thanks to its design, the network can be trained on more than a million images, resulting in strong performance [21]. The architecture uses skip connections, which let the gradient be back-propagated directly to earlier layers and thus make it possible to train a very deep network. For the implementation of the model, a pretrained model from Keras was used. Training uses early stopping, since it can be challenging for developers to decide how many epochs a model should be trained for: too many epochs can cause overfitting, while too few can result in underfitting [21]. Early stopping trains for a large number of epochs and stops once the model's performance no longer improves on the validation dataset [21]. In addition, a ModelCheckpoint is used to save the best-performing model: because the model at the early-stopping point is not necessarily the best one, the checkpoint saves the best model seen during training according to the given parameter.
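A minimal sketch of this setup with the Keras ResNet50 application plus EarlyStopping and ModelCheckpoint callbacks; note that the ImageNet weights require three-channel input, so grayscale ultrasound images would need to be stacked to three channels, and the patience values and monitored metrics below are illustrative:

```python
from tensorflow import keras
from tensorflow.keras import layers

base = keras.applications.ResNet50(weights="imagenet", include_top=False,
                                   input_shape=(224, 224, 3), pooling="avg")
model = keras.Sequential([base, layers.Dense(1, activation="sigmoid")])
model.compile(optimizer="rmsprop", loss="binary_crossentropy",
              metrics=["accuracy"])

callbacks = [
    keras.callbacks.EarlyStopping(monitor="val_loss", patience=5,
                                  restore_best_weights=True),
    keras.callbacks.ModelCheckpoint("best_resnet50.h5", monitor="val_accuracy",
                                    save_best_only=True),
]
# model.fit(train_flow, validation_data=val_flow, epochs=100, callbacks=callbacks)
```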

3.2.4 Custom ResNet-50 Model

The main goal of designing this model was to add features that strengthen the model's performance. The model is similar to the previous ResNet-50 model; however, it contains more filters, and batch normalization layers are added between the convolution layers. A separable convolution layer is also added; it is similar to a convolution layer but can be considered a hybrid version, splitting a single convolution into two or more convolutions that produce the same output. This is an advantage because the model uses fewer parameters and therefore needs less training time, which makes the process faster. The number of filters was changed to make the architecture less complex: since the regular ResNet-50 has many filters and was trained on only about two thousand images, the trainable parameters were reduced to a value that suits the architecture, in order to avoid overfitting and obtain better validation accuracy. Additionally, batch normalization was used to improve the training time and accuracy of the neural network. The activation functions used are ReLU, which runs between the layers, and Softmax, which separates the classes at the output. Furthermore, a Cyclical Learning Rate (CLR) was implemented, as it varies the global learning rate during training and removes the need for numerous experiments to find good values without additional computation [25]. Additionally, a learning rate finder function is implemented to compare a series of learning rates over one epoch. The optimizer used for model training is Root Mean Squared Propagation (RMSProp); it accelerates the optimization process by reducing the number of function evaluations required to reach the optimum (Brownlee 2021). During implementation, the optimizer was alternated between RMSProp and SGD, and different learning rates were used to evaluate the changes in the model's performance.
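A minimal sketch of the separable-convolution building block described above, in Keras; the filter counts, input size, and head are illustrative assumptions rather than the exact Custom ResNet-50 configuration:

```python
from tensorflow import keras
from tensorflow.keras import layers

def separable_block(x, filters):
    """Depthwise-separable convolution + batch normalization + ReLU."""
    x = layers.SeparableConv2D(filters, 3, padding="same", use_bias=False)(x)
    x = layers.BatchNormalization()(x)
    return layers.Activation("relu")(x)

inputs = keras.Input(shape=(224, 224, 3))
x = separable_block(inputs, 64)
x = layers.MaxPooling2D()(x)
x = layers.GlobalAveragePooling2D()(x)
outputs = layers.Dense(2, activation="softmax")(x)   # softmax over the two classes

model = keras.Model(inputs, outputs)
model.compile(optimizer=keras.optimizers.RMSprop(learning_rate=1e-4),
              loss="sparse_categorical_crossentropy", metrics=["accuracy"])
```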

4 Results and Evaluation

This section examines the performance evaluation carried out on all the deep learning models introduced above. Each model's results include the accuracy, precision, recall, F-1 score, confusion matrix, and ROC curve.
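As a minimal, hedged sketch (not taken from the paper), metrics of this kind can be computed from a trained model's predicted probabilities with scikit-learn; the small arrays below are purely illustrative:

```python
import numpy as np
from sklearn.metrics import (accuracy_score, classification_report,
                             confusion_matrix, roc_curve, auc)

y_true = np.array([0, 0, 1, 1, 1])               # illustrative ground-truth labels
y_prob = np.array([0.1, 0.4, 0.35, 0.8, 0.9])    # illustrative model outputs
y_pred = (y_prob >= 0.5).astype(int)

print(accuracy_score(y_true, y_pred))
print(classification_report(y_true, y_pred,
                            target_names=["Infected", "Not infected"]))
print(confusion_matrix(y_true, y_pred))

fpr, tpr, _ = roc_curve(y_true, y_prob)          # points for the ROC curve
print("AUC:", auc(fpr, tpr))
```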

4.1 CNN Model’s Performance

The first experiment was done on the CNN model and the results are as follows:

• Accuracy: 99%
• Precision: Infected: 100% and Not Infected: 100%
• Recall: Infected: 100% and Not Infected: 100%
• F-1 Score: Infected: 100% and Not Infected: 100%
• Confusion Matrix:

                 Infected    Not infected
Infected         340         0
Not infected     0           473

• ROC Curve (Fig. 2).

Fig. 2 VGG-16 model ROC curve

4.2 Custom VGG-16 Model’s Performance

The second model that was implemented is the VGG-16 Model and the results are as follows:

• Accuracy: 58%
• Precision: Infected: 0% and Not Infected: 58%
• Recall: Infected: 0% and Not Infected: 100%
• F-1 Score: Infected: 0% and Not Infected: 74%
• Confusion Matrix:

                 Infected    Not infected
Infected         0           340
Not infected     0           473

• ROC Curve (Fig. 3).

Fig. 3 Custom VGG-16 model ROC curve


Fig. 4 ResNet-50 model ROC curve

4.3 ResNet-50 Model

The third model that was implemented is the ResNet-50 model and obtained the same results as the VGG-16 model that was implemented earlier:

• Accuracy: 58%
• Precision: Infected: 0% and Not Infected: 58%
• Recall: Infected: 0% and Not Infected: 100%
• F-1 Score: Infected: 0% and Not Infected: 74%
• Confusion Matrix:

                 Infected    Not infected
Infected         0           340
Not infected     0           473

• ROC Curve (Fig. 4).

4.4 Custom ResNet-50 Model

The last model that was implemented is the Custom ResNet-50 Model and after running the experiment, the results are as follows:

Fig. 5 Custom ResNet-50 model ROC curve

• Accuracy: 96.7%
• Precision: Infected: 100% and Not Infected: 91%
• Recall: Infected: 86% and Not Infected: 100%
• F-1 Score: Infected: 92% and Not Infected: 95%
• Confusion Matrix:

                 Infected    Not infected
Infected         292         48
Not infected     0           473

• ROC Curve (Fig. 5).

4.5 Discussion

The models that obtained high accuracy are the CNN model and the Custom ResNet-50 model. The CNN model obtained an accuracy of 99% and the Custom ResNet-50 obtained an accuracy of 96.7%, while both the Custom VGG-16 model and the ResNet-50 model obtained an accuracy of 58%. The CNN model achieved precision, recall, and F-1 score of 100% for both polycystic and normal ovaries, showing that the algorithm returns relevant results; its confusion matrix likewise contains no misclassified samples. In the runner-up position, the Custom ResNet-50 model achieved a precision of 100% for polycystic ovaries and 91% for normal ovaries, a recall of 86% for polycystic ovaries and 100% for normal ovaries, and an F-1 score of 92% for polycystic ovaries and 95% for normal ovaries. In its confusion matrix most samples fall on the diagonal; only 48 polycystic samples were misclassified as normal. The Custom VGG-16 model and the ResNet-50 model did not produce comparable results. The Custom VGG-16 model's precision is 0% for polycystic ovaries and 58% for normal ovaries, its recall is 0% for polycystic ovaries and 100% for normal ovaries, and its F-1 score is 0% for polycystic ovaries and 74% for normal ovaries. The confusion matrix also highlights that the model


did not perform well: all 340 polycystic ovary images were classified as normal, and only the 473 normal images were classified correctly. Similarly, the ResNet-50 model's precision is 0% for polycystic ovaries and 58% for normal ovaries, its recall is 0% for polycystic ovaries and 100% for normal ovaries, and its F-1 score is 0% for polycystic ovaries and 74% for normal ovaries. Its confusion matrix shows the same pattern, with all 340 polycystic images predicted as normal and the 473 normal images predicted correctly. The CNN model and the Custom ResNet-50 model successfully provided high-accuracy results, making them reliable. However, it would be ideal to experiment further on the CNN model to rule out signs of overfitting. The Custom VGG-16 model and the ResNet-50 model did not perform well, which could be due to several reasons, including lack of data and class imbalance. A limitation during the process was that it was difficult to find a variety of ultrasound scans of normal and polycystic ovaries; therefore, data augmentation was important to enhance the models' performance. An additional limitation was that the scripts ran on Google Colab; since it is a free service, there are restrictions on running multiple scripts or executing the models simultaneously.

5 Conclusion

This research was conducted to highlight the importance of women's health. PCOS is a silent syndrome that many women face, and it should be taken seriously in the medical industry as it leads to the many health issues stated in this paper. Women are diagnosed with PCOS through several extensive methods, which points to how complex this health issue is. The main goal was to find deep learning methods that help detect polycystic ovaries from ultrasound scans with high accuracy, as this would speed up the process and make it easier for medical professionals to diagnose patients with PCOS. During the project there was a lack of prior research connecting polycystic ovaries and this technology; therefore, it was difficult to find information to guide the procedure. Previous work conducted by other researchers provided information that helped with implementing the deep learning models and with refining the experiments to obtain the best results. From the results obtained, the most reliable models are the CNN model and the Custom ResNet-50 model, which achieved accuracies of 99% and 96.7%, respectively. The ResNet-50 model, in contrast, obtained an accuracy of 58%, showing that it is not suitable for distinguishing normal and polycystic ovaries. As a future direction, additional deep learning models will be implemented and evaluated to further improve the accuracy of detecting the disorder. With a deep


learning model that accurately predicts the disease while avoiding overfitting, an automated interface program can be built that takes ultrasound scans and automatically returns results to help doctors diagnose patients with PCOS. We believe this will help reduce the exhaustion that women go through during their PCOS journey and provide a better patient experience.

References 1. Amer S, Li T, Bygrave C, Sprigg A, Saravelos H, Cooke I (2002) An evaluation of the interobserver and intra-observer variability of the ultrasound diagnosis of polycystic ovaries. Hum Reprod 17(6):1616–1622 2. Azizi M, Elyasi F (2017) Psychosomatic aspects of polycystic ovarian syndrome: a review. Iran J Psychiat Behav Sci 11(2) 3. Azziz R (2006) Diagnosis of polycystic ovarian syndrome: the rotterdam criteria are premature. J Clin Endocrinol Metab 91(3):781–785 4. Chen Y, Li L, Chen X, Zhang Q, Wang W, Li Y, Yang D (2008) Ovarian volume and follicle number in the diagnosis of polycystic ovary syndrome in Chinese women. Ultrasound Obstet Gynecol off J Int Soc Ultrasound Obstet Gynecol 32(5):700–703 5. Çelik HG, C¸ elik E, Polat I (2018) Evaluation of biochemical hyperandrogenism in adolescent girls with menstrual irregularities. J Med Biochem 37(1):7 6. Deep AI (2019) Weight (Artificial Neural Network) [online] Available at: https://deepai.org/ machinelearning-glossary-andterms/weight-artificial-neural-network. 7. Dewi R, Wisesty U et al (2018) Classification of polycystic ovary based on ultrasound images using competitive neural network. J Phys Conf Ser 971:012005. IOP Publishing 8. Gibson-Helm M, Teede H, Dunaif A, Dokras A (2017) Delayed diagnosis and a lack of information associated with dissatisfaction in women with polycystic ovary syndrome. J Clin Endocrinol Metab 102(2):604–612 9. Hailes J (2019) Depression and anxiety are common in women with PCOS. Learn how PCOS might affect your mental and emotional health, including mood, stress and body image. There is also information on what you can do if you find your mental and emotional health is affected by PCOS. TOPICS. [online] Available at: https://www.jeanhailes.org.au/health-a-z/pcos/emo tions Accessed 6 Nov 2021 10. Higuera V (2015) Ultrasound: purpose, procedure, and preparation. Keras (n.d.). Vgg16 and vgg19 11. Kokila B, Devadharshini M, Anitha A, Sankar SA (2021) Brain tumor detection and classification using deep learning techniques based on MRI images. J Phys Conf Ser 1916:012226. IOP Publishing 12. Krans B (2006) Ultrasound: purpose, procedure, and preparation 13. Kumar A, Woods KS, Bartolucci AA, Azziz R (2005) Prevalence of adrenal androgen excess in patients with the polycystic ovary syndrome (pcos). Clin Endocrinol 62(6):644–649 14. Lai Q, Chen C, Zhang Z, Zhang S, Yu Q, Yang P, Hu J, Wang C-Y (2013) The significance of antral follicle size prior to stimulation in predicting ovarian response in a multiple dose GNRH antagonist protocol. Int J Clin Exp Pathol 6(2):258 15. LeCun Y, Bengio Y, Hinton G (2015) Deep learning. Nature 521(7553):436–444 16. Legro RS, Spielman R, Urbanek M, Driscoll D, Strauss JF III, Dunaif A (1998) Phenotype and genotype in polycystic ovary syndrome. Recent Prog Horm Res 53:217–256 17. Lujan ME, Chizen DR, Peppin AK, Dhir A, Pierson RA (2009) Assessment of ultrasono-graphic features of polycystic ovaries is associated with modest levels of inter-observer agreement. J Ovarian Res 2:6–6


18. Lujan ME, Chizen DR, Pierson RA (2008) Diagnostic criteria for polycystic ovary syndrome: pitfalls and controversies. J Obstet Gynaecol Can 30(8):671–679 19. Lujan ME, Chizen DR, Pierson RA (2008) Diagnostic criteria for polycystic ovary syndrome: pitfalls and controversies. J Obstet Gynaecol Canada 30(8):671–679 20. Mohammad MB, Seghinsara AM (2017) Polycystic ovary syndrome (PCOS), diagnostic criteria, and AMH. Asian Pac J Cancer Prev APJCP 18(1):17 21. Mohan S (2020) Keras implementation of resnet-50 (residual networks) architecture from scratch 22. Mohan S (2020b) Keras implementation of vgg16 architecture from scratch with dogs vs cat data set 23. National Cancer Institute (2011) NCI dictionary of cancer terms. [online] Available at: https:// www.cancer.gov/publications/dictionaries/cancer-terms/def/transabdominal-ultrasound 24. Reid SP, Kao C-N, Pasch L, Shinkai K, Cedars MI, Huddleston HG (2017) Ovarian morphology is associated with insulin resistance in women with polycystic ovary syndrome: a cross sectional study. Fertil Res Pract 3(1):1–7 25. Rosebrock A (2019) Cyclical learning rates with Keras and deep learning 26. Setji TL, Brown AJ (2007) Polycystic ovary syndrome: diagnosis and treatment. Am J Med 120(2):128–132 27. Simplyremedies (2020) Kenali Penyakit Hormon Wanita, Sindrom Ovari Polisistik (PCOS) available at: https://simplyremedies.com/steadfast/kenali-penyakit-hormon-wanita-sindromov ari-polisistik-pcos/ Accessed 10 Jun 2022 28. Tatan V (2019) Understanding CNN (convolutional neural network) 29. Thakur R (2019) Step by step vgg16 implementation in keras for beginners 30. Trent M, Austin SB, Rich M, Gordon CM (2005) Overweight status of adolescent girls with polycystic ovary syndrome: body mass index as mediator of quality of life. Ambul Pediatr 5(2):107–111 31. Vikas B, Radhika Y, Vineesha K (2021) Detection of polycystic ovarian syndrome using convolutional neural networks. Int J Cur Res Rev 13(06):156 32. Weiner CL, Primeau M, Ehrmann DA (2004) Androgens and mood dysfunction in women: comparison of women with polycystic ovarian syndrome to healthy controls. Psychoso-Matic Medicine 66(3):356–362

CataractEyeNet: A Novel Deep Learning Approach to Detect Eye Cataract Disorder Amir Sohail, Huma Qayyum, Farman Hassan, and Auliya Ur Rahman

Abstract Humans see the happenings around them through their eyes. Currently, visual impairment and blindness have become significantly dangerous health problems. Even though advanced technologies are emerging rapidly, blindness and visual impairment still remain significant problems for healthcare systems around the globe. Specifically, cataract is among the problems that result in poor vision and may also cause falls as well as depression. In the past, mostly older people were affected; however, childhood cataracts are also common and result in severe blindness as well as visual impairment in children. Therefore, it is essential to develop an automated system for the detection of cataracts. To this end, this research presents a novel deep learning-based approach, CataractEyeNet, to detect cataract disorder using lens images. More specifically, we customized the pre-trained VGG-19 model and added 20 more layers to enhance the detection performance. The CataractEyeNet has obtained an accuracy of 96.78% and precision, recall, and F1-score of 97%, 97%, and 97%, respectively. The experimental outcomes of the CataractEyeNet show that our system has the capability to accurately detect cataract disorders. Keywords Eye cataract · Deep learning · VGG-19 · Medical imaging

1 Introduction

The eye is the organ of the human body through which we observe happenings around us, and eye-related diseases are increasing day by day. Cataract is an eye-related problem which can cause weak sightedness and blurriness. A cataract is the formation of clouding around the lens of the eye, which results in decreased vision. Cataracts are of different types, such as nuclear cataracts, cortical cataracts, posterior cataracts, and congenital cataracts; these types are classified based on how and where they develop in the eye. A cataract develops gradually while the vision of one or both eyes decreases. There are numerous symptoms of the


cataract, namely double vision in the affected eye, halos surrounding lights, dim colors, etc. [1]. It develops with age, which gives rise to blurry vision and sensitivity to brightness. Additionally, certain conditions, namely diabetes, ultraviolet rays, and trauma, cause cataract disorder. Other factors can act as catalysts, namely heavy usage of alcohol, smoking, high blood pressure, and exposure to radiation from X-rays [2]. The problem of visual impairment is increasing worldwide, and nearly 62.5 million cases of visual impairment and blindness are reported around the globe [3]. A cataract is considered one of the main reasons for these visual impairments, yet a significant number of cataract disorders remain undiagnosed [4]. The research community has investigated how many people have undetected eye diseases (UED), and a considerable number of UED cases were found [5, 6]. Classification is the process of assigning data to different categories; in this work, we have two classes, namely cataract and non-cataract. Initially, for the purposes of classification, pre-processing is performed, followed by feature extraction, and finally images are classified based on the features given as input [7, 8]. Earlier, cataract disorder was detected through fundus image analysis in which a fundus camera was used. Numerous feature extraction methods have been developed, namely wavelet, acoustical, texture, sketch, color, spectral-parameter, and deep learning-based methods [9, 10]. Pre-trained models, namely AlexNet, GoogleNet, ResNet, etc., are also employed for cataract detection and classification. These models are built on convolutional neural networks and trained on the ImageNet dataset. Employing pre-trained models for new problems is known as transfer learning [11]. The early detection of cataract patients is necessary to avoid blindness. Therefore, the pre-trained models play a significant role in saving time and providing better classification performance [9].

2 Literature Review There have been efforts by the research community to employ machine learningbased methods [12–21] for the detection of eye cataract disorders. In [12], support vector machine (SVM) and back propagation neural networks have been utilized for the detection of cataract disorders using numerous images, namely, fundus images and ultrasound images. In [13], various image features, namely, edge pixel count, big ring area, and small ring area were fed into the SVM classifier for the classification of normal, cataract, and post-cataract images. The method obtained an accuracy, sensitivity, and specificity of 94%, 90%, and 93.75%, respectively. In [21] an automated system based on the retro-illumination images to grade cortical and posterior subcapsular cataracts (PSC) has been developed. Numerous features, namely, intensity, homogeneity, and texture were utilized to specify the geometric structure as well as the photometric appearance of cortical cataracts. For the purposes to classify the cortical cataract and PSC cataract, support vector regression was employed. The


system has the benefits to avoid the under-detection as well as the over-detection for clear lenses and high opacity lenses, respectively. In [14], nuclear cataracts were detected and graded through the regression model. The system comprised four steps, namely, features selection, parameters selection, training, and validation, respectively. In [15], texture features based on the retro-illumination image characteristics and grade expertise of cataracts were used to train the linear discriminant analysis (LDA) for the classification of cataracts and normal. The method obtained an accuracy of 84.8%. In [16], two types of cataracts, namely, nuclear cataracts and cortical cataracts were detected sequentially by the two different grading systems. In one system, the lens’s structure was used for the feature extraction process followed by the SVM regression for the classification. Opacity in cortical cataract grading was detected with the region growing [18]. In [17], two tasks, namely cataract detection and grading were performed using fundus image analysis. Both the temporal and spatial domain features were extracted while SVM was employed to classify the images as cataract or normal. The radial basis function network was used to grade the cataracts such as mild cataract or severe cataract. The method obtained sensitivity and specificity of 90% and 93.33%, respectively. Similarly, in [16, 18], an active shape model was developed for the two tasks, namely cataract detection and grading. Moreover, the SVM regression classifier was utilized for the classification purposes of nuclear cataracts and normal eyes. The method obtained an accuracy of 95%. In [19], an automated cataract detection system was designed using a gray level co-occurrence matrix (GLCM) for the feature extraction. K-nearest neighbor (KNN) was utilized to classify the normal eyes vs. cataracts. GLCM was employed to obtain the values of uniformity, dissimilarity, and contract in the pupil of the eyes. The method obtained an accuracy of 94.5%. However, this method has utilized a very small number of images for training and testing purposes. In [20], two methods, namely, wavelet-transform and sketch-based were used for the feature extraction. Additionally, multi-class fisher discriminant analysis was also performed using the above two methods. The research community has also worked on deep learning-based techniques [9, 22–29] for the detection and grading of the eye cataracts. Mostly, deep learningbased methods are based on convolutional neural networks (CNNs). In [9], a deep convolutional neural network (DCNN) was used for the detection of cataract disorder using a cross-validation approach. The method obtained an accuracy of 93.52% for cataract detection while 86.69% accuracy for the grading of cataracts. However, this method has a problem of vanishing gradient. In [22], a DCNN-based system was designed for the detection of cataract disease using fundus images. The model obtained an accuracy of 97.04%. In [23], a system based on the discrete state transition and ResNet was developed to detect cataract disorder. The residual connection technique was used to avoid the vanishing gradient problem. In [24], a CNN-based deconvolutional network was used to design a cataract disorder detection system. It was investigated that vascular information lost after computation of multi-layer convolutional has a significant role in the grading of eye cataract. 
The cataract detection performance was enhanced by designing a hybrid global–local features representation model. In [25], a transfer learning-based approach was designed for the


detection of eye cataract disorder. Similarly, this [26] work has developed a cataract detection and grading system using a hybrid model based on the combination of two neural networks, namely, recurrent neural network (RNN) and CNN. The model was capable to learn relationships among the inner feature maps [27]. In [28], a cataract disorder detection system was designed by employing the Caffe to extract features from the fundus images while maximum entropy for the pre-processing. For classification purposes, SVM followed by SoftMax was employed. In [29], a three stages-based system, namely, pre-processing, features extraction, and classification was designed for the detection of cataract disorder. For improving the quality of images, top–bottom hat transformation and trilateral filters were used. Moreover, two-layered back propagation neural network model was employed.

2.1 Convolutional Neural Network

The CNN is a deep learning algorithm that takes an image as input, assigns learnable weights and biases to the various objects present in the image, and thereby becomes able to distinguish them from each other [30, 31]. Furthermore, the pre-processing required by CNN-based models is minimal compared with other classification algorithms. During training, the filters of a CNN learn the relevant characteristics. CNN-based algorithms resemble the connectivity pattern of neurons in the human brain and were inspired by the organization of the visual cortex. Neurons respond to stimuli only in a limited region of the visual field called the receptive field, and a set of these fields overlaps to cover the whole visual area. CNN-based algorithms have become extremely popular because of their improved performance on image classification tasks. CNNs consist of blocks of layers, namely convolutional, pooling (maximum, minimum, or average pooling), flatten, dense, and dropout layers. Most importantly, temporal and spatial features are extracted from the images by the convolutional layers and their filters. Additionally, the computational effort is significantly reduced through the weight-sharing technique [32, 33]. CNNs can be viewed as feedforward artificial neural networks with two distinguishing characteristics: weights are shared between filters, and each neuron is connected only to its surrounding patch of the previous layer. A typical CNN model consists of three building blocks, namely a convolutional layer, a max-pooling layer, and a fully connected layer, enabling the network to classify images [34].

2.2 Pre-trained Models

CNNs have superior performance on big datasets; however, these models suffer from overfitting on small datasets [35, 36]. Transfer learning is utilized


to save training time and is beneficial for image classification problems. In transfer learning, models pre-trained on a large dataset such as ImageNet [37, 38] can be utilized for applications that have relatively small datasets. CNNs have been employed in numerous applications, namely manufacturing, the medical field, and baggage screening [39–41]. Transfer learning is favored because it reduces the lengthy training time and the need for a big dataset; designing a deep learning-based model from scratch requires both [42]. Hence, in this work, we also used an existing pre-trained model, namely VGG-19 [43], but added 20 additional layers for better cataract disorder detection performance. The details are given in the subsequent sections.
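A minimal sketch of this style of transfer learning in Keras, loading the ImageNet-pretrained VGG-19 as a frozen feature extractor and attaching a new classification head; the head shown here is illustrative and not the exact 20-layer extension proposed in this paper:

```python
from tensorflow import keras
from tensorflow.keras import layers

base = keras.applications.VGG19(weights="imagenet", include_top=False,
                                input_shape=(224, 224, 3))
base.trainable = False                       # reuse the pre-trained features as-is

model = keras.Sequential([
    base,
    layers.Flatten(),
    layers.Dense(256, activation="relu"),
    layers.Dropout(0.5),
    layers.Dense(1, activation="sigmoid"),   # cataract vs. normal
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
```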

2.3 VGG-19

VGG was originally designed by the Visual Geometry Group at Oxford, after which it is named. VGG builds on ideas from its predecessors such as AlexNet and uses deeper stacks of convolutional layers to improve accuracy; since AlexNet [44] enhanced conventional CNN models, VGG is considered a successor of AlexNet. VGG-19 is a variant of the original VGG model comprising 19 weight layers (16 convolutional layers and 3 fully connected layers), together with 5 max-pooling layers and a final SoftMax layer; the detailed parameter configuration can be found in [44]. VGG has other variants as well; however, we have employed VGG-19 in this work.
The above literature shows significant contributions to the detection of cataract disorders; however, the existing methods still have limitations that need to be addressed. Therefore, we developed a novel deep learning-based approach, CataractEyeNet, for the detection of cataract disorder. The main contributions of this research work are as follows.
• We developed a novel deep learning-based approach named CataractEyeNet by customizing VGG-19 to detect the cataract disorder.
• The CataractEyeNet is capable of distinguishing cataract disorder images from normal ones.
• We observed that the non-customized VGG-19 performs worse than our proposed CataractEyeNet method.
• For the validation of our approach, we performed extensive experimentation on the ODIR-5K dataset.
The remainder of the manuscript is organized as follows: Sect. 3 discusses the proposed methodology in detail, Sect. 4 gives the experimental results, and Sect. 5 concludes the research work.


3 Proposed Methodology

This section provides details of the proposed methodology for cataract detection. The CataractEyeNet is based on the pre-trained VGG-19 model. We modified the VGG-19 model by adding 20 layers, including convolutional layers, max-pooling layers, a flatten layer, and a dense layer. For experimentation purposes, we used the ODIR-5K dataset, with 80% of the data for training the CataractEyeNet and 20% for evaluating it. The detailed working mechanism is illustrated in Fig. 1.

3.1 Customization

In this work, we have customized the pre-trained VGG-19 model; its details are discussed in Sect. 2.3. The model has 19 deep layers, and we added 20 more layers to enhance the performance for accurate detection of the cataract disorder. We observed from the experimental findings that the customized network has superior performance for the detection of cataract disorder. The benefit of using a pre-trained model is that it saves considerable training time and improves classification performance. Moreover, the model is capable of capturing both spatial and temporal dependencies in the image using the relevant filters. The customized architecture performs well due to the decrease in parameters and the reuse of the learnable weights. Specifically, the proposed network can be trained to learn the complexity of the images better.

Fig. 1 Proposed working mechanism


3.2 Proposed CataractEyeNet Model

In this research work, we have proposed a novel architecture, CataractEyeNet, based on the VGG-19 model (described in Sect. 2.3) for the detection of eye cataract disorder. We added 20 further layers, including convolutional layers, max-pooling layers, a flatten layer, and a dense layer at the end. The proposed CataractEyeNet has five blocks of convolutional and max-pooling layers, followed by one flatten layer and one dense layer. The first block has two convolutional layers with 64 filters, a kernel size of 3 × 3, 'same' padding, and a ReLU activation function. The second block has two convolutional layers with 128 filters, a kernel size of 3 × 3, 'same' padding, and a ReLU activation. The third block comprises three convolutional layers with 256 filters, a kernel size of 3 × 3, 'same' padding, and a ReLU activation. The fourth and fifth blocks each consist of three convolutional layers with 512 filters, a kernel size of 3 × 3, 'same' padding, and ReLU activation. A max-pooling layer with a pool size of 2 × 2 and a stride of 2 × 2 follows each block. Finally, we added a flatten layer and a dense layer with a sigmoid activation function. More specifically, the proposed CataractEyeNet comprises 39 layers. We performed experiments on the standard ODIR-5K dataset for the detection of eye cataract disorder, and the proposed system detected patients with cataracts with high accuracy.
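A minimal sketch of a VGG-style block layout of this kind in Keras, assuming one max-pooling layer per block and a single sigmoid output unit; this illustrates the described structure and is not the authors' exact 39-layer network:

```python
from tensorflow import keras
from tensorflow.keras import layers

net = keras.Sequential([keras.Input(shape=(224, 224, 3))])
# (filters, number of conv layers) for the five blocks described in the text.
for filters, convs in ((64, 2), (128, 2), (256, 3), (512, 3), (512, 3)):
    for _ in range(convs):
        net.add(layers.Conv2D(filters, 3, padding="same", activation="relu"))
    net.add(layers.MaxPooling2D(pool_size=2, strides=2))
net.add(layers.Flatten())
net.add(layers.Dense(1, activation="sigmoid"))   # cataract vs. normal

net.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
```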

3.3 Dataset

In this research work, a publicly available dataset, Ocular Disease Intelligent Recognition (ODIR-5K), is utilized for experimentation. The dataset contains 5000 multi-labeled color fundus images of both the right and left eyes. Additionally, each image is accompanied by the doctor's descriptive diagnosis for the individual eye as well as for the patient. The data was originally collected from numerous hospitals in China and compiled by Shanggong Medical Technology Co., Ltd. The images captured by the high-resolution cameras initially include unnecessary features, namely eyelashes, freckles, etc. Most importantly, the diseases were annotated by expert ophthalmologists. The details of the dataset are given in [45].


4 Experimental Setup and Results

This section provides the detailed experimental setup and discussion. In this work, we used the following performance parameters: accuracy, precision, recall, and F1-score. The detailed experimental findings of the proposed CataractEyeNet, the confusion matrix analysis, and the performance comparison are discussed in the subsequent sections.

4.1 Performance of the CataractEyeNet

This experiment aims to evaluate the performance of the proposed CataractEyeNet in detecting cataract disorder. To achieve this aim, we split the ODIR-5K dataset into two sets, a training set and a testing set, and used two classes, cataract and normal, as our purpose was to detect the cataract disorder. Moreover, we customized the pre-trained VGG-19 by introducing numerous convolutional and pooling layers; the best classification results were obtained by adding 20 additional layers to the VGG-19 model, and we named the resulting network CataractEyeNet. We report the detailed results of the CataractEyeNet in Fig. 2. As shown in Fig. 2, CataractEyeNet obtained an accuracy of 96.78%, a precision of 97%, a recall of 97%, and an F1-score of 97%. For the normal class, the precision is 98%, the recall is 95%, and the F1-score is 96%, while for the cataract class the precision is 96%, the recall is 98%, and the F1-score is 97%. As discussed, we introduced different combinations of layers to enhance the performance of the VGG-19; all configurations performed poorly except the addition of 20 layers. The loss of the proposed CataractEyeNet was 63.36%. The experimental outcomes obtained by the CataractEyeNet are promising, and based on the outcomes shown in Fig. 2, we can claim that this technique can be implemented by ophthalmologists for the accurate detection of cataract disorder to avoid blindness.

4.2 Confusion Matrix

In this section, we explain the detailed classification performance of CataractEyeNet. Accuracy alone is not enough to check the performance of an algorithm on a classification problem; other performance parameters, namely precision, recall, and F1-score, cannot be ignored and are significant for checking the complete performance of the algorithm. Therefore, we have also carried out an error (confusion) matrix analysis to compute these metrics and assess the detailed performance of the CataractEyeNet. The resulting confusion matrix is given in Table 1. It can be observed that CataractEyeNet has correctly classified 93 normal eyes and 118

Fig. 2 Performance of the CataractEyeNet

Table 1 Confusion matrix

                     Predicted class
Actual class         Normal      Cataract
Normal               93          5
Cataract             2           118

cataract disorders, respectively. Moreover, the CataractEyeNet has misclassified 5 cataract disorders as normal and 2 normal eyes as cataract disorders. The low misclassification rate of the CataractEyeNet indicates that our method is capable of detecting the cataract disorder accurately, and the proposed system can be adopted in hospitals to save time and accurately detect patients, thereby helping to avoid blindness.

4.3 Performance Comparison Against Other Methods

This experiment is conducted to compare the performance of the proposed CataractEyeNet against the existing techniques [21, 23, 29, 46–48]. To achieve this goal, we take the experimental results directly from the respective papers without re-implementing their methods. The detailed accuracies of the other methods [21, 23, 29, 46–48] are given in Table 2. The reported results in Table 2 indicate that [47] obtained the lowest accuracy of 61.9%, the worst performance for the detection of cataract disorder among the techniques [21, 23, 29, 46–48]. Furthermore, [48] obtained 94.83%, the second-best performing technique, while our method, CataractEyeNet, has superior performance with an accuracy of 96.78%. We have also reported the improvements of our method over the other approaches: the proposed CataractEyeNet obtained improvements of 3.98%, 5.92%, 34.88%, 4.78%, 2.78%, and 1.95% over [46], [29], [47], [21], [23], and [48], respectively. Moreover, pre-processing is not required, spatial padding is used for preserving the spatial resolution

72

A. Sohail et al.

Table 2 Performance comparison with other approaches

Authors                       Accuracy%    Improvement of our method w.r.t. other technique
Xiong et al. [46]             92.8         3.98
Yang et al. [29]              90.86        5.92
Abdul-Rahman et al. [47]      61.9         34.88
Cao et al. [21]               92           4.78
Zhou et al. [23]              94           2.78
Lvchen Cao [48]               94.83        1.95
Proposed                      96.78        -

of the images, the ReLU activation function is used to introduce non-linearity and help the model classify the images better, and the computational time is improved, since employing tanh degraded the detection performance. Based on the above reasons, the experimental findings, and the comparative analysis, we conclude that CataractEyeNet is well suited for the detection of cataract disorder. The experimental outcomes and the comparative assessment against the existing techniques illustrate that CataractEyeNet has the capability to accurately detect cataract disorder patients.

5 Conclusion

In this work, we addressed the problem of detecting the cataract disorder, a challenging task that needs to be addressed, since people of all ages suffer from blindness as well as visual impairment. To achieve this goal, we designed a novel cataract disorder detection system, CataractEyeNet, that has better classification performance. The proposed CataractEyeNet obtained a good accuracy of 96.78% and precision, recall, and F1-score of 97%, 97%, and 97%, respectively. From the experimental findings, we conclude that CataractEyeNet has superior performance and can be adopted by medical experts in hospitals for the detection of cataract disorders. In the near future, we aim to apply the same method for the detection of different grades of cataracts.


References 1. Access on 8-20-2021. https://www.healthline.com/health/cataract 2. Liu YC, Wilkins M, Kim T, Malyugin B, Mehta JS (2017) Cataracts. Lancet 390(10094):600– 612 3. Flaxman SR, Bourne RRA, Resnikoff S et al (2017) Global causes of blindness and distance vision impairment 1990–2020: a systematic review and meta-analysis. Lancet Global Health 5:e1221–e1234 4. Chua J, Lim B, Fenwick EK et al (2017) Prevalence, risk factors, and impact of undiagnosed visually significant cataract: the Singapore epidemiology of eye diseases study. PLoS One 12:e0170804 5. Varma R, Mohanty SA, Deneen J, Wu J, Azen SP (2008) Burden and predictors of undetected eye disease in Mexican Americans: the Los Angeles latino eye study. Med Care 46:497–506 6. Keel S, McGuiness MB, Foreman J, Taylor HR, Dirani M (2019) The prevalence of visually significant cataract in the Australian national eye health survey. Eye (Lond) 33:957–964 7. Sahana G (2019) Identification and classification of cataract stages in maturity individuals’ victimization deep learning formula 2770. Int J Innov Technol Explor Eng (IJITEE) 8(10) 8. Soares JVB, Leandro JJG, Cesar RM, Jr, Jelinek HF, Cree MJ (2006) Retinal vessel segmentation using the 2-D Gabor wavelet and supervised classification. IEEE Trans Med Imaging 25(9):1214–1222 9. Zhang L, et al (2017) Automatic cataract detection and grading victimization deep convolutional neural network. In: IEEE Ordinal International Conference on Networking, Sensing and Management (ICNSC), Calabria 10. Zhang Q, Qiao Z, Dong Y, Yang J-J (2017) Classification of cataract structure pictures supported deep learning. In: IEEE International Conference on Imaging Systems and Techniques, Beijing, China, pp 1–5 11. Patton EW, Qian X, Xing Q, Swaney J, Zeng TH (2018) Machine learning on cataracts classification using SqueezeNet. In: 4th International Conference on Universal Village, Boston, USA, pp 1–3, ISBN-978-1-5386-5197-1 12. Yang JJ, Li J, Shen R, Zeng Y, He J, Bi J, Li Y, Zhang Q, Peng L, Wang Q (2016) Exploiting ensemble learning for automatic cataract detection and grading. Comput Methods Programs Biomed 124:45–57 13. Nayak J (2013) Automated classification of normal, cataract and post cataract optical eye images using SVM classifier. In: Proceedings of the world congress on engineering and computer science, vol 1, pp 23–25 14. Xu Y, Gao X, Lin S, Wong DWK, Liu J, Xu D, Cheng CY, Cheung CY, Wong TY (2013) Automatic grading of nuclear cataracts from slit-lamp lens images using group sparsity regression. In: International conference on medical image computing and computer-assisted intervention. Springer, Berlin, Heidelberg, pp 468–475 15. Gao X, Li H, Lim JH, Wong TY (2011) Computer-aided cataract detection using enhanced texture features on retro-illumination lens images. In: 2011 18th IEEE international conference on image processing. IEEE, pp 1565–1568 16. Li H, Lim JH, Liu J, Wong DWK, Tan NM, Lu S, Zhang Z, Wong TY (2009b) Computerized systems for cataract grading. In: 2009 2nd international conference on biomedical engineering and informatics. IEEE, pp 1–4 17. Harini V, Bhanumathi V (2016) Automatic cataract classification system. In: 2016 international conference on communication and signal processing (ICCSP). IEEE, pp 0815–0819 18. Li, H., Lim, J.H., Liu, J., Wong, D.W.K., Tan, N.M., Lu, S., Zhang, Z., Wong, T.Y., 2009a. An automatic diagnosis system of nuclear cataract using slit-lamp images, in: 2009 Annual International Conference of the IEEE Engineering in Medicine and Biology Society, EEE. 
pp. 3693–3696. 19. Fuadah YN, Setiawan AW, Mengko T (2015) Performing high accuracy of the system for cataract detection using statistical texture analysis and k-nearest neighbor. In: 2015 international seminar on intelligent technology and its applications (ISITIA). IEEE, pp 85–88


20. Li T, Zhu S, Ogihara M (2006) Using discriminant analysis for multi-class classification: an experimental investigation. Knowl Inf Syst 10(4):453–472 21. Cao L, Li H, Zhang Y, Zhang L, Xu L (2020) Hierarchical method for cataract grading based on retinal images using improved Haar wavelet. Information Fusion 53:196–208 22. Ran J, Niu K, He Z, Zhang H, Song H (2018) Cataract detection and grading based on combination of deep convolutional neural network and random forests. In: 2018 international conference on network infrastructure and digital content (IC-NIDC). IEEE, pp. 155–159 23. Zhou Y, Li G, Li H (2019) Automatic cataract classification using deep neural network with discrete state transition. IEEE Trans Med Imaging 39(2):436–446 24. Xu X, Zhang L, Li J, Guan Y, Zhang L (2019) A hybrid global-local representation CNN model for automatic cataract grading. IEEE J Biomed Health Inform 24(2):556–567 25. Yusuf M, Theophilous S, Adejoke J, Hassan AB (2019) Web-based cataract detection system using deep convolutional neural network. In: 2019 2nd international conference of the IEEE Nigeria computer chapter (NigeriaComputConf). IEEE, pp 1–7 26. Jiang J, Liu X, Liu L, Wang S, Long E, Yang H, Yuan F, Yu D, Zhang K, Wang L, Liu Z (2018) Predicting the progression of ophthalmic disease based on slit-lamp images using a deep temporal sequence network. PLoS ONE 13(7):e0201142 27. Gao X, Lin S, Wong TY (2015) Automatic feature learning to grade nuclear cataracts based on deep learning. IEEE Trans Biomed Eng 62:2693–2701 28. Qiao Z, Zhang Q, Dong Y, Yang JJ (2017) Application of SVM based on genetic algorithm in classification of cataract fundus images. In: 2017 IEEE international conference on imaging systems and techniques (IST). IEEE, pp 1–5 29. Yang M, Yang JJ, Zhang Q, Niu Y, Li J (2013) Classification of retinal image for automatic cataract detection. In: 2013 IEEE 15th international conference on e-health networking, applications and services (Healthcom 2013). IEEE, pp 674–679 30. Albahli S, et al (2022) Pandemic analysis and prediction of COVID-19 using gaussian doubling times. Comput Mater Contin 833–849 31. Hassan F et al (2022) A robust framework for epidemic analysis, prediction and detection of COVID-19. Front Public Health 10 32. Albawi S, Mohammed TA, Al-Zawi S (2017) Understanding of a convolutional neural network. In: 2017 international conference on engineering and technology (ICET). IEEE, pp 1–6 33. Goyal M, Goyal R, Lall B (2019) Learning activation functions: a new paradigm for understanding neural networks. arXiv:1906.09529 34. Bailer C, Habtegebrial T, Stricker D (2018) Fast feature extraction with CNNs with pooling layers. arXiv:1805.03096 35. Yaqoob M, Qayoom H, Hassan F (2021) Covid-19 detection based on the fine-tuned MobileNetv2 through lung X-rays. In: 2021 4th international symposium on advanced electrical and communication technologies (ISAECT). IEEE 36. Ullah, MS, Qayoom H, Hassan F (2021) Viral pneumonia detection using modified GoogleNet through lung X-rays. In: 2021 4th international symposium on advanced electrical and communication technologies (ISAECT). IEEE 37. Deng J, Dong W, Socher R, Li LJ, Li K, Fei-Fei L (2009) Imagenet: a large-scale hierarchical image database. In: 2009 IEEE conference on computer vision and pattern recognition. IEEE, pp 248–255 38. Wang SH, Xie S, Chen X, Guttery DS, Tang C, Sun J, Zhang YD (2019) Alcoholism identification based on an AlexNet transfer learning model. Front Psych 10:205 39. 
Christodoulidis S, Anthimopoulos M, Ebner L, Christe A, Mougiakakou S (2016) Multisource transfer learning with convolutional neural networks for lung pattern analysis. IEEE J Biomed Health Inform 21(1):76–84 40. Yang H, Mei S, Song K, Tao B, Yin Z (2017) Transfer-learning-based online Mura defect classification. IEEE Trans Semicond Manuf 31(1):116–123 41. Akçay S, Kundegorski ME, Devereux M, Breckon TP (2016) Transfer learning using convolutional neural networks for object classification within X-ray baggage security imagery. In: 2016 IEEE international conference on image processing (ICIP). IEEE, pp 1057–1061


42. Pan SJ, Yang Q (2009) A survey on transfer learning. IEEE Trans Knowl Data Eng 22(10):1345– 1359 43. Manzoor S, et al, Melanoma detection using a deep learning approach 44. Access on 6.6.2022, https://iq.opengenus.org/vgg19-architecture/ 45. Access on 6 May 2022. https://academictorrents.com/details/cf3b8d5ecdd4284eb9b3a80fcfe 9b1d621548f72 46. Xiong L, Li H, Xu L (2017) An approach to evaluate blurriness in retinal images with vitreous opacity for cataract diagnosis. J Healthc Eng 47. Abdul-Rahman AM, Molteno T, Molteno AC (2008) Fourier analysis of digital retinal images in estimation of cataract severity. Clin Experiment Ophthalmol 36(7):637–645 48. Gao X, Wong DWK, Ng TT, Cheung CYL, Cheng CY, Wong TY (2012) Automatic grading of cortical and PSC cataracts using retroillumination lens images. In: Asian conference on computer vision. Springer, Berlin, Heidelberg, pp 256–267 49. Lvchen Cao LZ, Li H, Zhang Y, Xu L (2019) Hierarchical method for cataract grading based on retinal images using improved Haar wavelet. arXiv:1904.01261 50. Tajbakhsh N, Shin JY, Gurudu SR, Hurst RT, Kendall CB, Gotway MB, Liang J (2016) Convolutional neural networks for medical image analysis: full training or fine tuning? IEEE Trans Med Imaging 35(5):1299–1312

DarkSiL Detector for Facial Emotion Recognition Tarim Dar and Ali Javed

Abstract Facial emotion recognition (FER) is a significant research domain in computer vision. FER is considered a challenging task due to emotion-related differences such as the heterogeneity of human faces, variations in lighting conditions, angled faces, head poses, different background settings, etc. Moreover, there is also a need for a generalized and efficient model for emotion identification. This paper therefore presents a novel, efficient, and generalized DarkSiL (DS) detector for FER that is robust to variations in illumination conditions, face orientation, gender, ethnicity, and background settings. We introduce a low-cost, smooth, bounded-below, and unbounded-above Sigmoid-weighted linear unit activation function in our model to improve efficiency as well as accuracy. The performance of the proposed model is evaluated on four diverse datasets, CK+, FER-2013, JAFFE, and KDEF, achieving accuracies of 99.6%, 64.9%, 92.9%, and 91%, respectively. We also performed a cross-dataset evaluation to show the generalizability of our DS detector. Experimental results prove the effectiveness of the proposed framework for the reliable identification of seven different classes of emotions. Keywords DarkSiL (DS) emotion detector · Deep learning · Facial emotion recognition · SiLU activation

T. Dar · A. Javed (B) University of Engineering and Technology-Taxila, Department of Software Engineering, Taxila 47050, Pakistan e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 S. Anwar et al. (eds.), Proceedings of International Conference on Information Technology and Applications, Lecture Notes in Networks and Systems 614, https://doi.org/10.1007/978-981-19-9331-2_7

1 Introduction Automatic facial emotion recognition is an important research area in the field of artificial intelligence (AI) and human psychological emotion analysis. Facial emotion recognition (FER) is described as the technology of analysing the facial expression of a person from images and videos to get information about the emotional state of that individual. FER is a challenging research domain because everyone expresses their
emotions differently. Furthermore, several challenges and obstacles exist in this area which make emotion analysis quite difficult. Nowadays, researchers are focusing on improving the interaction between humans and computers. One way of doing that is to make computers intelligent so they can understand the emotions of humans and interact with them in a better way. Automatic FER systems have the ability to improve our quality of life. FER systems can help in the rehabilitation of patients with facial paralysis, they aid in gathering customers' feedback on products [1], and robotic teachers that understand students' feelings can offer an improved learning experience. In short, FER systems have extensive applications in various domains, e.g., medical, deepfake detection, e-learning, identification of drivers' emotions while driving, entertainment, cyber security, image processing, virtual reality applications [2], face authentication systems, etc. Early research in the field of facial emotion identification focused on appearance- and geometry-based feature extraction methods. For example, the local binary pattern (LBP)-based model presented in [3] introduced the concept of an adaptive window for feature extraction. The approach [3] was validated on the Cohn-Kanade (CK) and Japanese Female Facial Expression (JAFFE) datasets against six and seven emotions. Also, Niu et al. [4] proposed a fused feature extraction method from LBP and Oriented FAST and Rotated BRIEF (ORB) descriptors. After that, a support vector machine (SVM) classifier was used to identify the emotions. This method [4] was evaluated on three datasets, i.e., CK+, MMI, and JAFFE. The LBP approaches have the limitation of producing long histograms, which slows down model performance on large datasets. Many Convolutional Neural Network (CNN)-based methods have been developed in the past few decades that have achieved good classification results for FER. For instance, Liu et al. [5] developed a CNN-based approach by concatenating three different subnets. Each subnet was a CNN model which was trained separately. A fully connected layer was used to concatenate the features extracted from these subnets, and after that a softmax layer was used to classify the emotion. The approach [5] was only validated on one dataset, the Facial Expression Recognition (FER-2013) dataset, and obtained an overall accuracy of 65.03%. Similarly, Ramdhani et al. [1] presented a facial emotion recognition system based on CNN. The purpose of this approach [1] was to gather customer satisfaction with a product. This approach was tested on a custom dataset and the FER-2013 dataset. This method [1] has a limited evaluation against four emotions on these datasets. Moreover, Jain et al. [6] proposed a deep network (DNN) consisting of convolution layers and deep residual modules for emotion identification and tested the method on the JAFFE and Extended Cohn-Kanade (CK+) datasets. However, there still exist many limitations of these methods: existing models are not generalized or only perform well under certain conditions, i.e., they struggle with variations in face angles, people belonging to different ethnic groups, high computational complexity, variations in lighting conditions and background settings, gender, skin diseases, heterogeneity in faces, and differences in the expression of emotion which vary from person to person. In this paper, we present a robust and effective deep learning model that can automatically detect and classify seven types of facial emotions


(happy, surprise, disgust, fear, sad, anger, and neutral) from frontal and oriented static face images more accurately. In the proposed work, we customize the basic block of the Darknet-53 architecture and introduce the Sigmoid-weighted Linear Unit (SiLU) activation function (a special form of the swish function) for the classification of facial emotions. SiLU is a simple multiplication of the input value with a sigmoid function. This activation function allows a narrow range of negative values, which helps it recognize patterns in the data more easily. As a result of this activation function, a smooth curve is obtained, which aids in optimizing the model in terms of convergence with minimum loss. Furthermore, using SiLU activation in the Darknet-53 architecture optimizes the model performance and makes it computationally efficient. The main contributions of this research work are as follows: • We propose an effective and efficient DarkSiL (DS) emotion detector with a SiLU activation function to automatically detect seven diverse facial emotions. • The proposed model is robust to variations in gender and race, lighting conditions, background settings, and orientation of the face at five different angles. • We also performed extensive experimentation on four diverse datasets containing images of spontaneous as well as non-spontaneous facial emotions and performed a cross-corpora evaluation to show the generalizability of the proposed model.

2 Proposed Methodology A CNN is a network composed of a number of layers that supports feature extraction from images better than other feature extraction methods [7]. Deep convolutional neural networks are continually being developed to improve image recognition accuracy. In this study, we present a customized Darknet-53 model, which is an improved and deeper version of the Darknet-19 architecture. The input size requirement of Darknet-53 is 256 × 256 × 3. The overall architecture of our customized proposed model is shown in Fig. 1.

Fig. 1 Architecture of the proposed method


2.1 Datasets for Emotion Detection To evaluate the performance of our model, we have selected four diverse datasets, i.e., Extended Cohn-Kanade (CK+) [8], Japanese Female Facial Expression (JAFFE) [9], Karolinska Directed Emotional Faces (KDEF) [11], and Facial Expression Recognition 2013 (FER-2013) [10]. JAFFE [9] consists of 213 posed images of ten Japanese models with 256 × 256 resolution. All of the facial images were taken under strictly controlled conditions with similar lighting and no occlusions like hair or glasses. The CK+ [8] database is generally considered to be the most frequently used laboratory-controlled face expression classification dataset. Both non-spontaneous (posed) and spontaneous (non-posed) expressions of individuals belonging to different ethnicities (Asians or Latinos, African Americans, etc.) were captured under various lighting conditions in this dataset. The resolution of images in the CK+ dataset is 640 × 490. KDEF [11] is a publicly accessible dataset of 4900 images of resolution 562 × 762 taken from five different angles: straight, half left, full left, half right, and full right. This dataset is difficult to analyze because only one eye and one ear of the face are visible in the full right and full left profile views, making FER more challenging. FER-2013 [10] contains 35,685 real-world grayscale images of 48 × 48 resolution. As this dataset contains occlusion, images with text, non-face images, very low contrast, and half-face images, the FER-2013 dataset is more diversified and complex than other existing datasets. A few sample images of all four datasets are presented in Fig. 2.

Fig. 2 Sample images of datasets


2.2 Data Processing In the pre-processing step, images of each dataset are resized to our model requirement of 256 × 256 resolution with three channels. After pre-processing, images are sent to our customized proposed model to extract the reliable features and later classify the emotions of seven different categories as shown in Fig. 1.
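As a concrete illustration of this pre-processing step, the sketch below resizes an arbitrary input image to the 256 × 256 × 3 model input. It assumes OpenCV and a simple scaling to the range [0, 1]; these are illustrative choices, not the authors' MATLAB pipeline.

```python
import cv2
import numpy as np

def preprocess_image(path, size=(256, 256)):
    """Load an image, force three channels, and resize to the 256 x 256 x 3 model input."""
    img = cv2.imread(path, cv2.IMREAD_COLOR)           # grayscale sources are replicated to 3 channels
    if img is None:
        raise FileNotFoundError(path)
    img = cv2.resize(img, size, interpolation=cv2.INTER_LINEAR)
    return img.astype(np.float32) / 255.0               # [0, 1] scaling is an assumed, illustrative choice
```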

2.3 DarkSiL Architecture The smallest component of our customized DarkSiL architecture is composed of a convolutional layer, a Batch Normalization (BN) layer, and a SiLU activation layer, which are described as follows: (1) Convolutional layers are the main components of convolutional neural networks. A CNN applies filters (kernels) of varied sizes to the input to generate feature maps that summarize the presence of detected features. The Darknet-53 architecture contains 53 convolution layers. (2) Batch Normalization layer—BN normalizes the output to the same distribution based on the statistics of the same batch. Placed after the convolutional layer, it can accelerate network convergence and prevent over-fitting. (3) SiLU activation layer—SiLU is a special case of the Swish activation function which occurs at β = 1. Unlike ReLU (and other commonly used activation units such as sigmoid and tanh), the SiLU activation does not increase monotonically. This non-monotonicity improves gradient flow and provides robustness to varying learning rates. One excellent property of the SiLU is its ability to self-stabilize [19]. Moreover, SiLU is a smooth activation function that is unbounded above and bounded below. Unboundedness above helps avoid saturation, while the bounded-below property produces strong regularization effects. Furthermore, smoothness helps in obtaining a generalized and optimal model. The SiLU activation can be computed as

f(x) = x × sigmoid(βx)    (1)

where x is the input value and β = 1. The smallest component of the Darknet model is repeated 53 times, which means its architecture contains 53 convolution and 53 batch normalization layers. So, 53 SiLU layers are introduced in our customized architecture. We also used a transfer learning approach to train our model on seven output classes of emotions. The feature extraction layers are initialized from the pre-trained Darknet-53 architecture, whereas the last three layers after global average pooling, i.e., fc8 (a convolution layer with output size 1000), the softmax layer, and the classification layer, are replaced to improve the model.
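The convolution–batch normalization–SiLU unit described above can be sketched as follows. This is a hedged PyTorch illustration (the original model was built in MATLAB), and the channel sizes are placeholders rather than the exact Darknet-53 configuration.

```python
import torch
import torch.nn as nn

class ConvBNSiLU(nn.Module):
    """Smallest unit of the customized architecture: convolution -> batch normalization -> SiLU."""
    def __init__(self, in_ch, out_ch, kernel_size=3, stride=1):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, kernel_size, stride=stride,
                              padding=kernel_size // 2, bias=False)
        self.bn = nn.BatchNorm2d(out_ch)
        self.act = nn.SiLU()   # f(x) = x * sigmoid(x), i.e. Swish with beta = 1 (Eq. 1)

    def forward(self, x):
        return self.act(self.bn(self.conv(x)))

# A 256 x 256 x 3 input mapped to 32 feature channels (channel sizes are illustrative)
block = ConvBNSiLU(3, 32)
out = block(torch.randn(1, 3, 256, 256))
```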


In the Darknet-53 model, a global average pooling (GAP) layer is used instead of a fully connected layer. The GAP layer computes the average of each feature map and feeds the obtained vector into the next convolution layer. The GAP layer has numerous advantages over a fully connected layer. One of them is that it imposes a correspondence between extracted features and categories, which helps in interpreting the feature maps as confidence maps for the classes. Second, over-fitting can be prevented in this layer as there is no parameter optimization required in the GAP layer. Moreover, the GAP sums up the spatial information and makes the model more robust to spatial translation. In the softmax layer, the numbers in the input vector are converted into values in the range of 0 to 1, which are further interpreted as probabilities by the model. The softmax function in this layer is a generalized case of logistic regression and is applied for the classification of multiple classes. A classification layer calculates the cross-entropy loss for classification with mutually exclusive categories. The output size of the preceding layer determines the number of categories. In our case, the output size is seven different classes of emotions, and the input image is classified into one of these categories.
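The following sketch illustrates the idea of replacing the final layers with a seven-class head on top of global average pooling. The backbone, its number of output channels, and the use of a linear layer here are assumptions for illustration only; the original work replaces the fc8 convolution, softmax, and classification layers of Darknet-53 in MATLAB.

```python
import torch
import torch.nn as nn

NUM_EMOTIONS = 7   # happy, surprise, disgust, fear, sad, anger, neutral

class EmotionHead(nn.Module):
    """Global average pooling followed by a 7-way classifier replacing the original 1000-class head."""
    def __init__(self, feature_channels=1024):      # 1024 is a placeholder channel count
        super().__init__()
        self.gap = nn.AdaptiveAvgPool2d(1)           # global average pooling over each feature map
        self.fc = nn.Linear(feature_channels, NUM_EMOTIONS)

    def forward(self, feature_maps):
        x = self.gap(feature_maps).flatten(1)
        return self.fc(x)                            # softmax / cross-entropy handled by the loss

head = EmotionHead()
logits = head(torch.randn(2, 1024, 8, 8))            # feature maps from a pre-trained backbone
probs = torch.softmax(logits, dim=1)                 # probabilities for the seven emotion classes
```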

3 Experimental Setup and Results For all experiments, the dataset is split into training (60%), validation (20%), and testing (20%) sets. The parameters used for model training in each experiment are: epochs: 20, shuffle: every epoch, learning rate: 4 × 10⁻⁴, batch size: 32, validation frequency: every epoch, and optimizer: Adam. All experiments are carried out in MATLAB 2021a on a machine with the following specifications: AMD Ryzen 9 5900X 12-core 3.70 GHz processor, 32 GB RAM, 4.5 TB hard disk, and Windows 10 Pro. We employed the standard metrics of accuracy, precision, and recall for the evaluation of our model, as these metrics are also used by the contemporary FER methods.
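A minimal sketch of the stated experimental setup, i.e., the 60/20/20 split, batch size 32, shuffling every epoch, and Adam with a learning rate of 4 × 10⁻⁴ for 20 epochs. The original experiments were run in MATLAB 2021a, so this PyTorch-style outline is only indicative.

```python
import torch
from torch.utils.data import DataLoader, random_split

def make_loaders(dataset, batch_size=32, seed=0):
    """60/20/20 train/validation/test split with the training set shuffled every epoch."""
    n = len(dataset)
    n_train, n_val = int(0.6 * n), int(0.2 * n)
    splits = [n_train, n_val, n - n_train - n_val]
    train_set, val_set, test_set = random_split(dataset, splits,
                                                generator=torch.Generator().manual_seed(seed))
    return (DataLoader(train_set, batch_size=batch_size, shuffle=True),
            DataLoader(val_set, batch_size=batch_size),
            DataLoader(test_set, batch_size=batch_size))

# Reported optimizer settings: Adam with a learning rate of 4e-4, trained for 20 epochs
# optimizer = torch.optim.Adam(model.parameters(), lr=4e-4)
```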

3.1 Performance Evaluation of the Proposed Method We designed four-stage experiments to show the effectiveness of the proposed model on KDEF [11], JAFFE [9], FER-2013 [10], and CK + [8] datasets. In the first stage, we performed an experiment on the JAFFE dataset to investigate the performance of the proposed model on a small posed dataset. After training and validation, the proposed model is tested on the test set and the results are mentioned in Table 1. It is worth noticing that our model has achieved an accuracy of 92.9% on the JAFFE dataset, a mean precision of 93.5%, and a mean recall of 92.8%. Results above 90% on the biased JAFFE dataset with mislabeled class problems show the effectiveness of the proposed model for FER.
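The mean precision and mean recall reported in Table 1 are naturally read as macro averages over the seven emotion classes; the following scikit-learn sketch computes them under that assumption.

```python
from sklearn.metrics import accuracy_score, precision_score, recall_score

def summarize(y_true, y_pred):
    """Accuracy plus class-averaged (macro) precision and recall over the seven emotions."""
    return {
        "accuracy": accuracy_score(y_true, y_pred),
        "mean_precision": precision_score(y_true, y_pred, average="macro", zero_division=0),
        "mean_recall": recall_score(y_true, y_pred, average="macro", zero_division=0),
    }
```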

Table 1 Results of the proposed model on different datasets

Dataset     Accuracy (%)   Mean precision (%)   Mean recall (%)
JAFFE       92.9           93.5                 92.8
CK+         99.6           99.1                 99.2
KDEF        91.0           93.4                 93.0
FER-2013    64.9           65.3                 61.1

In the second stage, we conducted an experiment to show the efficacy of the proposed model on a dataset having individuals who belong to different regions, races, and genders. For this purpose, we choose a lab-controlled CK + dataset that contains spontaneous and non-spontaneous facial expressions of people with varying lighting conditions. Table 1 demonstrates the remarkable performance of the proposed model on the CK + dataset. Results of accuracy, precision, and recall close to 100% show that our model can accurately distinguish seven different types of facial expressions in frontal face images of people belonging to different geographical regions of the world. In the third stage, to check the robustness of the proposed model on varied angular facial images, we designed an experiment on the KDEF dataset as it comprises facial images taken from five different viewpoints. Our proposed model obtained an overall accuracy of 91%, mean precision, and mean recall of 93.4% and 93%, respectively, as shown in Table 1. Obtained results demonstrate that the proposed model not only identifies emotions from frontal face images with higher accuracy but also performs well in the predictions of the facial emotions in images with faces tilted at some angle. In the fourth stage, we implemented an experiment to examine the effectiveness of the proposed method on a real-world FER-2013 dataset that covers challenging scenarios of intra-class variations and class imbalance. This dataset is originally split into training, validation or public test, and private test sets. Furthermore, the FER2013 dataset has non-face, low contrast, occlusion, different illumination conditions, variation in face pose, images with text, half-rotated, tilted, and varied ages and gender images which make the classification process more difficult. As reported in Table 1, our model achieved an accuracy of 64.9% which is good in presence of such variation on this challenging dataset. Moreover, the accuracy achieved on this dataset, i.e., 64.9% ≈ 65% is very close to the human-level accuracy of 65 ± 5% on this dataset [10].

3.2 Comparison with Contemporary Methods To show the effectiveness of our model for facial emotion recognition on multiple diverse datasets, we compared the performance of our method against the existing


state-of-the-art (SOTA) FER methods. In the first stage, we compared the performance of our method with the contemporary methods [12–14] on the JAFFE dataset, and the results are provided in Table 2. From Table 2, it is clearly observed that our model achieved an average gain of 12.2% over the existing SOTA. Our proposed model also has a higher discriminative ability than existing works. In the second stage, we compared the results of our method on the CK+ dataset with existing methods [6] and [17]. The results in Table 2 show that our model has a 9–10% better recognition rate in FER classification and performs better than the comparative methods on the CK+ dataset. In the third stage, we compared the performance of our method with state-of-the-art methods [12, 15], and [16] on the KDEF dataset. As shown in Table 2, the accuracy of our model is higher than all of the existing works [12, 15, 16] on the KDEF dataset. The second best-performing method [15] obtained an accuracy of 88%, which is 3% lower than our proposed model. The results indicate that the proposed method can recognize emotions in images taken from five angles (0°, −45°, 45°, −90°, and 90°) more accurately than SOTA methods. In the last stage, we compared our model's performance with the contemporary approaches of [1, 5], and [18] on the FER-2013 dataset, and the results in terms of accuracy are provided in Table 2. It can be seen that the accuracy of the proposed model on the FER-2013 dataset is very close to that of the best-performing model [5], with a slight difference of 0.13%. It means that our proposed model can detect facial emotions with high accuracy in challenging real-world scenarios.

3.3 Cross-Corpora Evaluation The previous works on FER gave less attention to the aspect of model generalizability for seven classes of emotions. So, to overcome this limitation, we conducted a cross-corpora evaluation in which four different datasets are used to demonstrate the generalizability of our model. Previous studies have used one or two datasets for training and performed testing on other datasets and also used a few types of emotions when performing cross-corpora experiments. In this study, we include a wide range of datasets from small posed and lab-controlled ones to real-world and spontaneous expression datasets and straight face to varied angled face image datasets in our cross-dataset experiments. The results of the cross-corpora evaluation are displayed in Table 3. Despite the very good performance of the proposed model on the individual datasets, it could not perform as well on cross-dataset experiments. A possible reason for the degradation of the accuracy of these experiments is that there exist many dissimilarities among these datasets. These datasets are collected under distinct illumination conditions, with varying background settings in different environments. Types of equipment used in capturing images are different and images are taken from varying distances from the camera. Furthermore, subjects involved in the preparation of these datasets do not belong to the same geographical regions and are of

Table 2 Comparison of DS detector (proposed model) with SOTA

Model                                  Dataset     Accuracy (%)
Sun et al. [12]                        JAFFE       61.68
Kola et al. [3]                        JAFFE       88.3
LBP + ORB [4]                          JAFFE       92.4
Proposed Model                         JAFFE       92.9
DTAN [17]                              CK+         91.44
DTGN [17]                              CK+         92.35
DTAGN (Weighted Sum) [17]              CK+         96.94
DTAGN (Joint) [17]                     CK+         97.25
Jain et al. [6]                        CK+         93.24
Proposed Model                         CK+         99.6
Williams et al. [16]                   KDEF        76.5
Sun et al. [12]                        KDEF        77.9
VGG-16 Face [15]                       KDEF        88.0
Proposed Model                         KDEF        91.0
Talegaonkar et al. [18]                FER-2013    60.12
Ramdhani et al. [1] (batch size 8)     FER-2013    58.20
Ramdhani et al. [1] (batch size 128)   FER-2013    62.33
Liu et al. [5]                         FER-2013    65.03
Proposed Model                         FER-2013    64.9

Table 3 Results of the cross-corpora evaluation

Training dataset   Testing dataset   Accuracy (%)
FER-2013           JAFFE             31.0
FER-2013           KDEF              25.8
FER-2013           CK+               67.0
CK+                JAFFE             21.4
CK+                KDEF              12.2
CK+                FER-2013          28.7
KDEF               JAFFE             35.7
KDEF               CK+               40.2
JAFFE              KDEF              15.9
JAFFE              FER-2013          14.3
JAFFE              CK+               24.9


different genders, ages, and races. There is also dissimilarity among the morphological characteristics of the individuals involved in the making of these datasets. Moreover, people belonging to different ethnicities differ in how they express their emotions. Eastern populations, in contrast to Western ones, show lower-arousal emotions; Japanese (eastern) subjects, in contrast to European and American (western) subjects, tend to show fewer physiological emotions [13]. Datasets available in the FER domain are also biased: KDEF is ethnicity-biased (only European people), and JAFFE is a lab-controlled and highly biased dataset concerning gender (only females) and ethnicity (only Japanese models), with ambiguous expression annotations [14]. Images in the original datasets also differ from each other in resolution (FER-2013: 48 × 48, JAFFE: 256 × 256, etc.) and image type (grayscale and RGB). Although we upscale or downscale them to the same resolution according to our customized model requirement, this may also affect the results of the cross-corpora evaluation. Despite all these factors, it can be observed from the results in Table 3 that our proposed model, when trained on the FER-2013 dataset and tested on the CK+ dataset, obtained an accuracy of 67%, which is good in the presence of such diversity. Also, the model trained on the KDEF dataset is able to achieve an accuracy of 40.2%. In Table 3, results above 30% are shown in bold.

4 Discussion In this study, we conducted different experiments on four diverse datasets covering scenarios of straight and varied angled face images, people belonging to different cultures having different skin tones and gender (males, females, and children), variations in lighting conditions, different background settings, races, and a real-world challenging dataset. Our proposed model obtained accuracies greater than 90% except for the FER-2013 dataset. By closely observing the FER-2013 dataset, we found these possible reasons for the degradation of accuracy on this dataset. There exists a similarity in the face morphology of anger, surprise, and disgust classes of emotions in this dataset. Additionally, there exist more images of happy emotions as compared to other classes of emotions, which leads to insufficient learning of traits for these classes. Moreover, the FER-2013 dataset contains images with nonfaces, occlusions, half-rotated and tilted faces, and variations in facial pose, age, and gender, which affect the recognition rate of the model. However, in presence of such challenges, our proposed model is still able to achieve human-level accuracy of approximately 65% for this dataset [10]. Table 1 shows the summarized performance of the proposed model on all these datasets. The outperforming results of our model on varied and diverse datasets including challenging scenarios show that our model is effective and robust in recognizing facial emotions. Moreover, the addition of SiLU activation in Darknet architecture not only increases the model’s efficiency but also improves accuracy. We also performed cross-corpora experiments to show the generalizability of our approach. From the results, we can say that our model


has addressed most of the limitations of existing methods and performed better than the comparative approaches.

5 Conclusion In this research, we have introduced a novel model for facial emotion recognition that is efficient, cost-effective, and robust to variations in gender, people belonging to different races, lighting conditions, background settings, and orientation of the face at five different angles. The presented model was tested on four different datasets and achieved remarkable performance on all of them. The proposed model not only effectively classified emotions from frontal face pictures but also outperformed existing methods on face images with five distinct orientations. We also performed a cross-corpora evaluation of the proposed model to demonstrate its generalizability. In future work, we plan to create a custom FER dataset to test the performance of our method in real time and to further improve the cross-corpora evaluation performance. Acknowledgements This work was supported by the Multimedia Signal Processing Research Lab at the University of Engineering and Technology, Taxila, Pakistan.

References 1. Ramdhani B, Djamal EC, Ilyas R (2018, August). Convolutional neural networks models for facial expression recognition. In 2018 International Symposium on Advanced Intelligent Informatics (SAIN). IEEE, pp 96–101 2. Mehta D, Siddiqui MFH, Javaid AY (2018) Facial emotion recognition: A survey and real-world user experiences in mixed reality. Sensors 18(2):416 3. Kola DGR, Samayamantula SK (2021) A novel approach for facial expression recognition using local binary pattern with adaptive window. Multimed Tools Appl 80(2):2243–2262 4. Niu B, Gao Z, Guo B (2021). Facial expression recognition with LBP and ORB features. Comput Intell Neurosci 5. Liu K, Zhang M, Pan Z (2016, September). Facial expression recognition with CNN ensemble. In 2016 International Conference on Cyberworlds (CW), IEEE. pp 163–166 6. Jain DK, Shamsolmoali P, Sehdev P (2019) Extended deep neural network for facial emotion recognition. Pattern Recogn Lett 120:69–74 7. Wang H, Zhang F, Wang L (2020, January) Fruit classification model based on improved Darknet53 convolutional neural network. In 2020 International Conference on Intelligent Transportation, Big Data & Smart City (ICITBS), IEEE. pp 881–884 8. Lucey P, Cohn JF, Kanade T, Saragih J, Ambadar Z, Matthews I (2010, June). The extended cohn-kanade dataset (ck+): A complete dataset for action unit and emotion-specified expression. In 2010 IEEE Computer Society Conference on Computer Vision And Pattern Recognition-Workshops, IEEE. pp 94–101 9. Lyons MJ, Kamachi M, Gyoba J (2020) Coding facial expressions with Gabor wavelets (IVC special issue). arXiv preprint arXiv:2009.05938 10. Goodfellow IJ, Erhan D, Carrier PL, Courville A, Mirza M, Hamner B, Bengio Y (2013, November) Challenges in representation learning: A report on three machine learning contests.

In: International Conference on Neural Information Processing, pp 117–124. Springer, Berlin, Heidelberg
11. Lundqvist D, Flykt A, Öhman A (1998) Karolinska directed emotional faces. Cogn Emot
12. Sun Z, Hu ZP, Wang M, Zhao SH (2017) Individual-free representation-based classification for facial expression recognition. SIViP 11(4):597–604
13. Lim N (2016) Cultural differences in emotion: differences in emotional arousal level between the East and the West. Integr Med Res 5(2):105–109
14. Liew CF, Yairi T (2015) Facial expression recognition and analysis: a comparison study of feature descriptors. IPSJ Trans Comput Vis Appl 7:104–120
15. Hussain SA, Al Balushi ASA (2020) A real time face emotion classification and recognition using deep learning model. J Phys: Conf Ser 1432(1):012087. IOP Publishing
16. Williams T, Li R (2018, February) Wavelet pooling for convolutional neural networks. In: International Conference on Learning Representations
17. Jung H, Lee S, Yim J, Park S, Kim J (2015) Joint fine-tuning in deep neural networks for facial expression recognition. In: Proceedings of the IEEE International Conference on Computer Vision, pp 2983–2991
18. Talegaonkar I, Joshi K, Valunj S, Kohok R, Kulkarni A (2019, May) Real time facial expression recognition using deep learning. In: Proceedings of International Conference on Communication and Information Processing (ICCIP)
19. Elfwing S, Uchibe E, Doya K (2018) Sigmoid-weighted linear units for neural network function approximation in reinforcement learning. Neural Netw 107:3–11

Review and Enhancement of Discrete Cosine Transform (DCT) for Medical Image Fusion Emadalden Alhatami, Uzair Aslam Bhatti, MengXing Huang, and SiLing Feng

Abstract Image fusion is a process which combines, with a specific algorithm, the necessary or useful information from a set of different or similar input images into a single output image that is more accurate, informative, and complete than any of the input images. Image enhancement is a process used to improve the quality of an image and increases the applicability of these input images, which is helpful in different fields of science such as medical imaging, microscopic imaging, remote sensing, computer vision, robotics, etc. In this paper, we describe the primary function of image fusion, improving image quality by evaluating sharpness. We then attempt to give an overview of multi-modal medical image fusion methods, emphasizing how the DCT method can be used for medical image fusion. The fused image provides more accurate information about the real world, which is helpful for human vision and machine perception or any further image processing tasks in the future. Keywords Image fusion · Discrete Cosine Transform (DCT) · Medical image fusion · CT image · MR image

E. Alhatami (B) · U. A. Bhatti · M. Huang · S. Feng School of Information and Communication Engineering, Hainan University, Haikou 570100, China e-mail: [email protected] U. A. Bhatti e-mail: [email protected] M. Huang e-mail: [email protected] S. Feng e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 S. Anwar et al. (eds.), Proceedings of International Conference on Information Technology and Applications, Lecture Notes in Networks and Systems 614, https://doi.org/10.1007/978-981-19-9331-2_8


1 Introduction Image fusion is one of the essential techniques used in the field of digital image processing. The image fusion process combines the necessary information from two or more images and produces a single output image that contains more of the required information than any individual input image [1]. It can also be described as the process of joining two or more similar images to form a new image, for example using wavelet theory [2]. It is applied in various fields of science such as medical imaging, remote sensing, ocean surveillance, artificial neural networks, etc. This process acquires the relevant features from the different input images and puts them into a single image that is more accurate, complete, and informative than any input image [1]. The input images can be of many types, such as multi-sensor, multi-modal, multi-focal, and multi-temporal. Image enhancement is the process of improving the quality of an image, in which different images are registered to form a single output image that has good quality and is appropriate for human and machine interpretation [3]. Image fusion methods can be of two types: spatial domain fusion methods and transform domain fusion methods [4]. A spatial domain fusion method deals directly with the pixels of the input images, where the pixel is the smallest unit of graphics. In a transform domain fusion method, the images are first transformed into the frequency domain, which also helps in evaluating and sharpening the image. In the recent decade, image fusion algorithms based on wavelet transform theory have become faster. The Discrete Cosine Transform has good time–frequency characteristics and can be applied successfully in the image processing field [5]. The process of image fusion can be performed at three levels: pixel, feature, and decision level. The image fusion technique is used to obtain an informative, accurate, complete, and high-quality image from two or more pictures. The objectives of image fusion are to reduce the data lost during the fusion process, since physical parameters such as pixel intensity, echo, and repetition time increase the complexity of the pictures, and to enhance the quality of an image in terms of sharpness, as shown in Fig. 1. Medical image fusion is the process of fusing multiple images from multiple imaging modalities to obtain a fused image with a large amount of information for

Fig. 1 a Input image 1 b Input image 2 c Fused image


increasing the clinical applicability of medical images [6]. With the advancements in the field of medical science and technology, medical imaging can provide various modes of imagery information, and different medical images have some specific characteristics which require simultaneous monitoring for clinical diagnosis [6]. Hence multimodality image fusion is performed to combine the attributes of various image sensors into a single image. The medical images obtained from different sensors are fused to enhance the diagnostic quality of the imaging modality [7].

2 Image Fusion Objective • Image fusion techniques have broad applications in image processing and computer vision areas to improve the visual ability of human and machine vision. • The fused image is more suitable for human perception than any individual input images. • General Properties of medical image fusion (Image overlay for displays- Image sharpening for operator clarity- Image enhancement through noise reduction). • Extracting representative salient features from source images of the same scene, and then the salient features are integrated into a single image by a proper fusion method. • The fused image should not change, affect, or damage the quality of the original images. A fused image should be imperceptible that humans cannot find the difference between the original and the fused image [7].

3 Fusion Classification Image fusion technology is widely used in pattern recognition and image processing. Image fusion technology can involve various stages of image processing. Therefore, according to the image processing and analysis, the integration technology is divided into three levels: image fusion algorithm based on a pixel level, image fusion algorithm based on feature level method, and image fusion algorithm based on decision level [7].

3.1 Pixel Level In pixel-level fusion, image fusion is implemented directly on individual pixel values. At this level, resolution is commonly measured in dots or pixels per inch, although the meaning differs for printer devices, where dpi measures the density of printed dots, while the number of pixels in an input image defines its resolution.


The benefits of image fusion at the pixel level are that the actual quantities are directly included in the image fusion process [8].
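A minimal sketch of the simplest pixel-level rule, a weighted average of two co-registered source images; the equal weighting is an illustrative choice and not a specific method from this paper.

```python
import numpy as np

def pixel_level_fusion(img_a, img_b, w=0.5):
    """Weighted average of two co-registered images of identical shape."""
    fused = w * img_a.astype(np.float32) + (1.0 - w) * img_b.astype(np.float32)
    return np.clip(fused, 0, 255).astype(np.uint8)
```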

3.2 Feature Level In feature-level classification, image fusion is implemented between the segmented portions of the input images by examining the properties of the pictures. The feature level works with various features such as edges, lines, and texture parameters [8]. This level is also used in image pre-processing, for image splitting or to change the perception.

3.3 Decision Level In decision-level classification, image fusion is implemented between the segmented portions of the input images by examining the initial object perception and their grouping. At the decision level, when the results calculated from different algorithms are expressed as confidences rather than decisions, the fusion is called soft fusion; otherwise, it is called hard fusion. The input images can be processed individually, which helps in information extraction [7, 8]. Decision-level methods can be categorized as voting methods, statistical methods, and fuzzy logic-based methods. Decision-level fusion methods, e.g., Bayesian inference and the Dempster–Shafer method [9], are also used in the field of artificial intelligence.

4 Acquired Input Image Images are acquired and fused in different ways, such as multi-sensor, multitemporal, multi-focal, multi-modal and multi-view [10]. • Multi-sensor image fusion: fusion fuses source images captured by various sensors. • Multi-temporal image fusion: fusion combines images taken under various conditions with a specific end goal to fuse accurate images of articles that were not taken within the expected time. • Multi-focal image fusion: image fusion is combined with image scenes of different center lengths brought about by repetition, where complementary information from the source image is fused. • Multi-modal image fusion: the fusion fuses supplementary and complementary information from the source image. • Multi-view image fusion: this combines images of a similar method taken from different angles simultaneously.


5 Research Methodology 5.1 Discrete Cosine Transform (DCT) The Discrete Cosine Transform (DCT) plays an essential role in image and video compression standards such as those of the Moving Picture Experts Group (MPEG) and the Joint Video Team (JVT). The DCT is used to transform an image from the spatial domain into the frequency domain [10]. The coefficients of the image are represented by the alternating current (AC) values and the direct current (DC) value. A red, green, blue (RGB) image can be divided into blocks of 8 × 8 pixels; the image is separated into its red, green, and blue matrices and transformed to a grayscale image [10]. The DCT plays a crucial role in digital image processing. In DCT-based fusion, the images are divided into non-overlapping blocks of size N × N, the DCT coefficients are calculated for each block, and then the fusion rules are applied to obtain a higher quality fused image. These techniques do not perform well when the block size is smaller than 8 × 8 or when the block size is equivalent to the image size itself. The advantage of the DCT is that it is a straightforward algorithm and can be used for real-time applications [11] (Fig. 2).
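The block-wise transform described above can be sketched as follows, assuming a grayscale input and SciPy's orthonormal DCT-II; block handling at image borders is simplified by cropping.

```python
import numpy as np
from scipy.fft import dctn

def blockwise_dct(gray, block=8):
    """Apply a 2-D DCT-II to each non-overlapping 8 x 8 block of a grayscale image."""
    h = gray.shape[0] - gray.shape[0] % block      # crop to a multiple of the block size
    w = gray.shape[1] - gray.shape[1] % block
    coeffs = np.zeros((h, w), dtype=np.float64)
    for i in range(0, h, block):
        for j in range(0, w, block):
            coeffs[i:i + block, j:j + block] = dctn(gray[i:i + block, j:j + block], norm="ortho")
    return coeffs   # per block: the DC value sits at the top-left, AC values elsewhere
```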

Fig. 2 Image fusion diagram using DCT for medical image


Fig. 3 Image fusion flowchart using DCT

5.2 DCT for Image Fusion For fusing multimodality images such as CT and MRI, first, the input images are decomposed into base and detail images using the fourth-order differential equations method. The final detail image is obtained by a weighted average of the principal components of the detail images. Next, the base images are given as input for the CT and MRI decomposition. The corresponding four sub-band coefficients are processed using the DCT, which is used to extract significant details of the sub-band coefficients. The spatial frequency of each coefficient is calculated to improve the extracted features. Finally, the fusion rule is used to fuse the DCT coefficients based on the spatial frequency value. The final base image is obtained by applying the inverse DCT (IDCT), as shown in Fig. 3. A final fused image is generated by combining the above final detail and base images linearly [12].
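The following is a deliberately simplified sketch of a DCT-domain fusion rule in the spirit of the pipeline above: for each 8 × 8 block, the coefficients of the source with the higher AC-coefficient energy (used here as a crude stand-in for the spatial-frequency measure) are kept and inverted with the IDCT. The activity measure, block size, and selection rule are assumptions, not the exact algorithm of Fig. 3.

```python
import numpy as np
from scipy.fft import dctn, idctn

def fuse_blocks_dct(img1, img2, block=8):
    """Block-wise DCT fusion: per block, keep the source whose AC-coefficient energy is higher."""
    h = min(img1.shape[0], img2.shape[0]) // block * block
    w = min(img1.shape[1], img2.shape[1]) // block * block
    fused = np.zeros((h, w), dtype=np.float64)
    for i in range(0, h, block):
        for j in range(0, w, block):
            c1 = dctn(img1[i:i + block, j:j + block].astype(np.float64), norm="ortho")
            c2 = dctn(img2[i:i + block, j:j + block].astype(np.float64), norm="ortho")
            e1 = np.sum(c1 ** 2) - c1[0, 0] ** 2     # AC energy as a crude activity measure
            e2 = np.sum(c2 ** 2) - c2[0, 0] ** 2
            fused[i:i + block, j:j + block] = idctn(c1 if e1 >= e2 else c2, norm="ortho")
    return fused
```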

6 Discussion MRI, also known as Magnetic Resonance Imaging, provides information on the soft tissue structure of the brain without functional information. The density of protons in the nervous system, fat, soft tissue, and articular cartilage lesions is large, so the image is apparent and does not produce artifacts. It has a high spatial resolution and no radiation damage to the human body, and the advantage of rich information makes it an essential position in clinical diagnosis [13]. The density of protons in the bone is very low, so the bone image of MRI is not clear. The CT image is called Computed Tomography imaging. X-ray is used to scan the human body. The highdensity absorption rate of bone tissue relative to soft tissue makes the bone tissue of the CT image particularly clear. The low permeability of X-rays in soft tissue leads to a low absorption rate, so CT images show less cartilage information, representing anatomical information. MRI image and CT image as shown in (Fig. 4). Figure 5 shows an example of Image Fusion use of DCT in medical diagnosis by fusing CT and MRI. The CT is used for capturing the bone structures with high spatial resolutions and MRI is used to capture the soft tissue structures like the heart,


eyes, and brain. CT and MRI can be used collectively with image fusion techniques to enhance accuracy and practical medical applicability [14]. MRI–CT fusion combines the advantages of the clear bone information in CT images and the clear soft tissue of MRI images to compensate for the lack of information in a single imaging modality [15].

Fig. 4 a CT source images, b MRI source images

Fig. 5 MRI-CT medical image fusion

Figures 4 and 5 illustrate the source images and the fused MRI–CT image, respectively. In this example,


the fusion of images is achieved by the guided filtering-based technique with image statistics.

7 Conclusion The fusion of medical images from various modalities is examined as a topic of study for researchers due to its importance and usefulness for the health sector and for better diagnosis with merged images containing quality information. Merged images should contain more comprehensive information than any input image, even if redundant information is present. For typical MRI and CT images, the number of decomposition levels affects the image fusion result. The DCT method has real potential to compress and decompress images and to perform the transformation process at the pixel level; the DCT method is suitable for real-time applications as it can also obtain a reasonable compression ratio, which is beneficial for transmitting and storing data.

References 1. Bhatti UA, Yu Z, Chanussot J, Zeeshan Z, Yuan L, Luo W, Mehmood A (2021) Local similaritybased spatial-spectral fusion hyperspectral image classification with deep CNN and Gabor filtering. IEEE Trans Geosci Remote Sens 60:1–15 2. Shahdoosti HR, Mehrabi A (2017) MRI and PET image fusion using structure tensor and dual ripplet-II transform. Multimedia Tools Appl 77:22649–22670 3. Jing W, Li X, Zhang Y, Zhang X (2018) Adaptive decomposition method for multi-modal medical image fusion. IET Image Process 12(8):1403–1412 4. Ravi P, Krishnan J (2018) Image enhancement with medical image fusion using multiresolution discrete cosine transform. In: International conference on processing of materials, minerals and energy, vol 5, pp 1936–1942 5. Kumar S (2015) Image fusion based on pixel significance using cross bilateral filter. SIViP 9(5):1193–1204 6. Du, Li W, Lu K, Xiao B (2016) An overview of multi-modal medical image fusion. Neurocomputing 215:3–20 7. Li T, Li J, Liu J, Huang M, Chen YW, Bhatti UA (2022) Robust watermarking algorithm for medical images based on log-polar transform. EURASIP J Wireless Commun Netw 1–11 8. Kaur, Saini KS, Singh D, Kaur M (2021) A comprehensive study on computational pansharpening techniques for remote sensing images. Arch Comput Methods Eng 1–18 9. Balakrishnan A, Zhao MR, Sabuncu JG, Dalca AV (2020) VoxelMorph a learning framework for deformable medical image registration. IEEE Trans Med Imaging 38(8):1788–1800 10. Bhatti UA, Huang M, Wu D, Zhang Y, Mehmood A, Han H (2019) Recommendation system using feature extraction and pattern recognition in clinical care systems. Enterprise Inform Syst 13(3):329–351 11. Amiri E, Roozbakhsh Z, Amiri S, Asadi MH (2020) Detection of topographic images of keratoconus disease using machine vision. Int J Eng Sci Appl 4(4):145–150 12. Bavirisetti DP, Kollu V, Gang X, Dhuli R (2017) Fusion of MRI and CT images using guided image filter and image statistics. Int J Imaging Syst Technol 27(3):227–237 13. Bhatti UA, Yu Z, Li J, Nawaz SA, Mehmood A, Zhang K, Yuan L (2020) Hybrid watermarking algorithm using Clifford algebra with Arnold scrambling and chaotic encryption. IEEE Access 8:76386–76398


14. Yang C, Li J, Bhatti UA, Liu J, Ma J, Huang M (2021) Robust zero watermarking algorithm for medical images based on Zernike-DCT. Secur Commun Netw 15. Zeng C, Liu J, Li J, Cheng J, Zhou J, Nawaz SA, Bhatti UA (2022) Multi-watermarking algorithm for medical image based on KAZE-DCT. J Ambient Intell Human Comput 1–9

Early Courier Behavior and Churn Prediction Using Machine Learning in E-Commerce Logistics Barı¸s Bayram, Eyüp Tolunay Küp, Co¸skun Özenç Bilgili, and Nergiz Co¸skun

Abstract With the surge in competitive e-commerce demands occurring mainly due to the COVID-19 outbreak, most logistics companies have been compelled to create more efficient and successful delivery organizations, and new logistics companies which provide different opportunities to employees have entered the market and led to a boost in competition. In this work, an approach to early employee churn prediction is developed for couriers of a private logistics company using real delivery behaviors and demographic information of the courier. The churn scores of the couriers are computed regarding the delivery performances of the couriers for each day. Also, using the historical delivery data of the couriers, a regression model is employed for the prediction of the delivery behaviors for the next week to be utilized for churn prediction. Based on the churn scores, the couriers are clustered into a number of groups in a weekly manner. In the experiments, the Gradient Boosting Trees (GBTs) based binary classification and regression algorithms achieved the best performances in courier behavior prediction in terms of R2 -scores (up to 86.2%) and error values, and churn prediction in terms of ROC curves with AUC scores (up to 85.6%) and F1-scores (up to 68.4%). Keywords Transportation · e-commerce logistics · Early employee churn prediction · Behavior prediction · Gradient boosting trees

B. Bayram (B) · E. T. Küp · C. Ö. Bilgili · N. Co¸skun HepsiJET, ˙Istanbul, Turkey e-mail: [email protected] E. T. Küp e-mail: [email protected] C. Ö. Bilgili e-mail: [email protected] N. Co¸skun e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 S. Anwar et al. (eds.), Proceedings of International Conference on Information Technology and Applications, Lecture Notes in Networks and Systems 614, https://doi.org/10.1007/978-981-19-9331-2_9


1 Introduction The global COVID-19 pandemic has accelerated the expanding usage of e-commerce sites which also led to a rapid and noticeable increase in online shopping [12]. In the first half of 2020, the e-commerce sales in Turkey reached 91.7 billion Turkish Lira climbing 64% over the previous year, which significantly affected the e-commerce logistics industry. Since this increase has been accelerating further with the pandemic, logistics companies are constantly fostering competition both in online marketing management and delivery service. Therefore, the companies need to adopt a reliable crowd-sourced delivery system by ensuring a high-quality courier service in this competitive environment. In various sectors, the knowledge, capabilities, skills, and experiences of employees become the growth factor of the companies. It is a crucial and challenging process to retain valuable employees, since training and recruiting new employees to fill vacant positions requires a lot of time and resources [3]. In the competitive environment of the e-commerce logistics industry especially during the COVID19 pandemic period, several companies have emerged that offer relatively different opportunities for payment per delivery, location, or workload. The logistics industry has a significant role in creating employment since the need for additional couriers substantially increased due to the attrition and the boosted delivery workload. Also, this competitive environment causes an increase in the rate of voluntary turnover, which negatively impacts sustaining the workforce and risks losing the competitive advantage of the companies. Retaining hard-working, communicative, financially and morally satisfied couriers who have been an employee of the company for a long time, but are not motivated, is also an essential strategy for logistics companies. However, in this period, the attrition in a working place that means a decrease in the labor force is an important issue. Therefore, the companies need to detect the attrition intend to retain the employees and prevent turnovers to enhance the efficiency of human resources and competitiveness of the logistics companies. In the logistics sector, courier turnover is an important problem, which causes noticeable gaps in the operations of shipments during the pandemic period. To retain human resources in logistics, the churn prediction problem has not been investigated yet. Due to financial reasons, tremendous distrust arises between couriers and managers while fairly distributing the packages and resources among the couriers. In the real world, human resources data have many problems such as missing, inconsistent and noisy information, etc. which deteriorate the development of a churn prediction ability. For various prediction and analysis tasks, the use of machine learning algorithms is a key also to human resources problems, because the department has extensive data on the employees about wages, benefits, career evaluations, annual evaluations of performance, recruitment, exit interviews, opinions about other employees and seniors, etc. The prediction of customer churn has widely been studied, and many machine learning-based approaches have been proposed in recent years. The attrition of the employees has been investigated in several studies, but in the logistics area, the churn prediction has not been addressed yet.


Several state-of-the-art machine learning algorithms have been employed for the prediction of churners. Most of the works have focused on churn prediction of customers in different industries like cargo and logistics [4, 10], banking [8], telecommunication [5], and aviation [7]. For customer turnover prediction, various machine learning algorithms have been employed such as Gradient Boosting Trees (GBT) [11], Decision Tree (DT) [11], etc. Abiad and Ionescu present a logistic regression-based approach for the analysis of customer churn behaviors [1]. In the cargo and logistics industry, only a few works have focused on the churn prediction of customers, but for employees, there are no studies to detect the attrition of couriers. In the study [2], several state-of-the-art machine learning algorithms for the prediction of employee turnover were investigated and compared, and it is observed that Extreme Gradient Boosting Trees (XGBoost) presented the best prediction performance. Besides dynamic and behavioral features, for employee churn prediction, the static features which are personal information of the couriers such as age, the distance between cross-dock and courier’s living place, mandatory military service status, gender, absenteeism, etc. are discussed to analyze the impacts of the information on the attrition [9]. The use of machine learning algorithms for churn prediction of employees is difficult for researchers due to the confidentiality and the lack of human resources and real churn data of the employees, which affect the deep analysis of the problem and the generalization of a churn prediction solution. In this work, a machine learning-based churn prediction approach is presented using real delivery data of a private company in the logistics sector to predict potential churner couriers using static demographic information and dynamic delivery performances. Also, courier behaviors in the future are predicted to be used for churn prediction. The approach is composed of a binary classification-based churn prediction model and a clustering method to categorize the couriers into different types of working profiles using the combination of the churn scores of the binary model with the delivery rate of the couriers. The early churn prediction model computes the churn scores for all the existing couriers on each day of the previous and next weeks using various features of the couriers. The features include the daily aggregated delivery data, calendar features (e.g. day of week, day of month, month of year), and demographic information (e.g. age, education, gender). Moreover, a cross-dock-based analysis can be conducted according to the predictions of churners in the same cross-docks. The proposed approach is evaluated using various algorithms with 24-month long real delivery data of hepsiJET company. The main aim of the proposed approach is to provide information on possible churners to the cross-dock managers to take an action to retain the couriers. The problems due to the manager, working districts, the other employees, etc. may be detected regarding the prediction of churners in the same cross-dock. 
The main contributions of this study are: (i) developing an advanced feature engineering process for courier churn prediction on real logistics data, (ii) conducting the first national work in the logistics industry for future performance prediction and churn prediction of the couriers, (iii) developing one of the first machine learning-based methods for churn prediction on predicted delivery performances


of the couriers, and (iv) the weekly update of the regression and churn prediction algorithms to adapt to abnormal conditions in special situations.

2 Proposed Approach The proposed approach (Fig. 1) for early courier churn prediction on the streaming data is composed of the following steps: (1) data preparation, (2) extraction of behavioral aggregated and demographic features for the behavior and churn prediction models, (3) the construction of the training sets for both models, (4) generation of the models, (5) prediction of the couriers' behaviors for the next week, (6) daily churn prediction to compute churn scores for each courier's feature vector in each day of the previous and next weeks, and (7) clustering of the daily scores and delivery number rates of the couriers to categorize the couriers into four courier profiles of performances regarding the churn. The churn prediction model computes the probability of churn for each courier C in each day D of the last and next weeks, which is used as the daily churn score:

score(X_C) = P(class = churner | X_C)

where X_C ∈ {X_C^(D_prev), X_C^(D_next)} represents the features of the courier C, in which X_C^(D_prev) is the feature vector from the delivery data of the courier C in the day D_prev of the previous week, and X_C^(D_next) is the vector of the courier's predicted behaviors, including the delivery statistics, in the day D_next of the next week. The courier behavior prediction and churn prediction using various features of the couriers in the previous and next weeks may help to carry out layoffs to reduce the workforce due to the dramatic decreases in the number of deliveries, the rate of

Fig. 1 The overview of the proposed churn prediction approach

Early Courier Behavior and Churn Prediction Using Machine Learning …

103

delayed delivery, and working hours. The reliable and efficient prediction capability will give prior information about the couriers’ future performance which can be used to take related actions to improve the performance, or to predict the churn possibility in advance.
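As an illustration of step (6), the sketch below shows how daily churn scores could be computed with a gradient-boosted classifier. The input table, its column names, and the use of XGBoost's predict_proba are illustrative assumptions, not the authors' exact implementation.

```python
# Minimal sketch of daily churn scoring (column names and file are assumed).
import pandas as pd
from xgboost import XGBClassifier

# daily_features: one row per (courier_id, date) with aggregated delivery,
# calendar and demographic features plus a binary "churn" label.
daily_features = pd.read_csv("daily_courier_features.csv", parse_dates=["date"])

feature_cols = [c for c in daily_features.columns
                if c not in ("courier_id", "date", "churn")]

train = daily_features[daily_features["date"] < "2022-01-01"]
test = daily_features[daily_features["date"] >= "2022-01-01"].copy()

model = XGBClassifier(n_estimators=300, max_depth=6, eval_metric="logloss")
model.fit(train[feature_cols], train["churn"])

# The churn score of courier C on day D is P(class = churner | X_C).
test["churn_score"] = model.predict_proba(test[feature_cols])[:, 1]
print(test[["courier_id", "date", "churn_score"]].head())
```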

2.1 Data Preparation

In the data preparation step, pre-processing methods are applied, such as the removal of irrelevant and redundant data and the imputation of missing values with the most frequent values; missing values occur mostly in the demographic information, such as birth date, marital status, and mandatory military service status. In addition, for the training set preparation, the last 5, 15, 30, or 60 days of the churners are annotated as the "churn" class, and the remaining days of the churners and all days of the non-churners are annotated as "non-churn". These different annotation schemes are evaluated in terms of churn prediction performance to find the most discriminative labeling of the churners.
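A minimal sketch of the annotation scheme described above, assuming a table with one row per courier-day and a known last working date per churner; the window length and column names are illustrative.

```python
# Label the last N days of each churner as "churn" (illustrative sketch).
import pandas as pd

def annotate_churn(df: pd.DataFrame, churn_dates: dict, window_days: int = 30) -> pd.DataFrame:
    """df has columns courier_id and date; churn_dates maps a churner's id to
    their last working date. Rows within `window_days` before that date are
    labeled 1 ("churn"), everything else 0 ("non-churn")."""
    df = df.copy()
    df["date"] = pd.to_datetime(df["date"])
    df["churn"] = 0
    for courier_id, last_day in churn_dates.items():
        last_day = pd.Timestamp(last_day)
        mask = (
            (df["courier_id"] == courier_id)
            & (df["date"] > last_day - pd.Timedelta(days=window_days))
            & (df["date"] <= last_day)
        )
        df.loc[mask, "churn"] = 1
    return df

# Example: evaluate the 5/15/30/60-day variants by re-labeling the same data.
# labeled = annotate_churn(daily_rows, {"c17": "2022-03-14"}, window_days=30)
```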

2.2 Feature Extraction and Selection

The raw delivery data and demographic information of the couriers are the input of the proposed churn prediction approach. The delivery details cover, for each delivery, the ids of the delivery, courier, cross-dock, district, address, and city, the number of attempts, the payload, and the promised and delivered dates. The courier-based attributes are, for each courier, the total working hours of a day, the total absent days in a week, and the total working days since the first day; the demographic details are age, education, military service status, gender, and marital status.

The feature extraction step is performed in both the training and prediction stages of courier behavior prediction and churn prediction using the raw data. The raw delivery performance data is aggregated daily to extract the dynamic features of the couriers' delivery behaviors. In addition, the delivery counts of all working couriers are predicted for each day of the next week, and the same aggregated features are extracted from these predicted future behaviors. This step is performed separately for the behavior prediction and churn prediction tasks. Calendar time features, namely day of week, day of month, day of year, week of year, month of year, and year, are also extracted. The extracted features are listed in Table 1, and the aggregated features are combined with the demographic and time features to be used in the feature selection step (a pandas aggregation sketch is given after Table 1).

It is important to estimate the features that are most distinctive of the couriers' turnover behavior to improve the churn prediction performance. Likewise, the most useful features are selected for the regression model to efficiently predict future behaviors. Predicting the behaviors that lead to turnover, which depend on the delivery performance and on the working time and area, can be useful for retaining the couriers.

Table 1 The aggregated features

Feature                          Description
delivery_count_today/yest        Delivery made today/yesterday
delivery_rate_today/yest         Cross-dock wise rate of delivery made today/yesterday
mean_delv_num_3d/1w/2w/1m        Mean delivery made in the last 3 days/week/two weeks/month
std_delv_num_3d/1w/2w/1m         Std delivery made in the last 3 days/week/two weeks/month
max_delv_num_3d/1w/2w/1m         Maximum delivery made in the last 3 days/week/two weeks/month
min_delv_num_3d/1w/2w/1m         Minimum delivery made in the last 3 days/week/two weeks/month
mean_delv_rate_3d/1w/2w/1m       Mean cross-dock wise rate of delivery made in the last 3 days/week/two weeks/month
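As referenced above, a minimal sketch of the daily aggregation in Table 1 using pandas rolling windows; the input layout (one row per courier per day with delivery_count and delivery_rate columns) and the exact window lengths are assumptions.

```python
# Illustrative daily aggregation of delivery counts per courier (pandas rolling windows).
import pandas as pd

def aggregate_features(daily: pd.DataFrame) -> pd.DataFrame:
    """daily: columns courier_id, date, delivery_count, delivery_rate (one row per courier-day)."""
    daily = daily.sort_values(["courier_id", "date"]).copy()
    grp = daily.groupby("courier_id")
    daily["delivery_count_yest"] = grp["delivery_count"].shift(1)
    for label, days in {"3d": 3, "1w": 7, "2w": 14, "1m": 30}.items():
        roll = grp["delivery_count"].rolling(days, min_periods=1)
        daily[f"mean_delv_num_{label}"] = roll.mean().reset_index(level=0, drop=True)
        daily[f"std_delv_num_{label}"] = roll.std().reset_index(level=0, drop=True)
        daily[f"max_delv_num_{label}"] = roll.max().reset_index(level=0, drop=True)
        daily[f"min_delv_num_{label}"] = roll.min().reset_index(level=0, drop=True)
        rate_roll = grp["delivery_rate"].rolling(days, min_periods=1)
        daily[f"mean_delv_rate_{label}"] = rate_roll.mean().reset_index(level=0, drop=True)
    # Calendar features extracted from the date column.
    dates = pd.to_datetime(daily["date"])
    daily["day_of_week"] = dates.dt.dayofweek
    daily["month_of_year"] = dates.dt.month
    return daily
```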

2.3 Generation Behavior and Churn Prediction Models

For the courier behavior prediction and churn prediction tasks, the training set construction, model selection, and model training steps are examined separately.

Churn prediction model. To find the best churn prediction model, several state-of-the-art machine learning algorithms are evaluated in a binary classification setting: XGBoost, Light Gradient Boosting Machine (LightGBM), Random Forest (RF), and Multilayer Perceptron (MLP).

Delivery number prediction model. The most suitable regression model is selected using the selected features. For the prediction of delivery capacity, several regression models investigated in other problems [6] are employed in the model selection process: XGBoost regressor, LightGBM regressor, Linear Regression (LR), RF Regressor (RFR), MLP regressor, and Support Vector Regressor (SVR). Each algorithm is used with its best set of hyperparameters, which is also estimated on the selected feature set.
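The sketch below illustrates the kind of side-by-side model comparison described in this step; the candidate models, the feature matrix X, the targets y_cls and y_reg, and the use of scikit-learn cross-validation are assumptions for illustration.

```python
# Illustrative model selection for the churn classifier and the delivery regressor.
from sklearn.model_selection import cross_val_score
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor
from sklearn.neural_network import MLPClassifier, MLPRegressor
from sklearn.linear_model import LinearRegression
from sklearn.svm import SVR
from xgboost import XGBClassifier, XGBRegressor
from lightgbm import LGBMClassifier, LGBMRegressor

classifiers = {
    "XGBoost": XGBClassifier(eval_metric="logloss"),
    "LightGBM": LGBMClassifier(),
    "RandomForest": RandomForestClassifier(),
    "MLP": MLPClassifier(max_iter=500),
}
regressors = {
    "XGBoost": XGBRegressor(),
    "LightGBM": LGBMRegressor(),
    "LinearRegression": LinearRegression(),
    "RandomForest": RandomForestRegressor(),
    "MLP": MLPRegressor(max_iter=500),
    "SVR": SVR(),
}

def select_best(models, X, y, scoring):
    """Return the model name with the best mean cross-validated score."""
    scores = {name: cross_val_score(m, X, y, cv=5, scoring=scoring).mean()
              for name, m in models.items()}
    return max(scores, key=scores.get), scores

# best_clf, _ = select_best(classifiers, X, y_cls, scoring="roc_auc")
# best_reg, _ = select_best(regressors, X, y_reg, scoring="r2")
```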


2.4 Early Courier Churn Prediction

Based on the features extracted from the delivery behaviors of the previous week and the predicted behaviors for the next week, the binary classification-based churn prediction model produces churn scores. Using these scores, the couriers are clustered into four profiles ((i) screening group, (ii) open-for-improvement group, (iii) average-performance group, (iv) high-performance group) to monitor changes in courier performance, and churn prediction is carried out based on the couriers falling into the two clusters with higher scores than the others.
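A minimal sketch of the profile clustering described above, assuming each courier-day is represented by a churn score and a delivery rate; the use of k-means with k = 4 is an illustrative choice, not necessarily the authors' exact clustering method.

```python
# Illustrative clustering of couriers into four working profiles.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

def cluster_profiles(churn_scores: np.ndarray, delivery_rates: np.ndarray) -> np.ndarray:
    """churn_scores and delivery_rates: one value per courier-day.
    Returns a cluster label in {0, 1, 2, 3} for each row."""
    features = np.column_stack([churn_scores, delivery_rates])
    features = StandardScaler().fit_transform(features)
    kmeans = KMeans(n_clusters=4, n_init=10, random_state=42)
    return kmeans.fit_predict(features)

# Couriers in the clusters with the highest mean churn score would be flagged
# for the screening / open-for-improvement groups.
```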

3 Experiments

The delivery behaviors of the couriers are used for future behavior and churn prediction. The binary classification and regression models are evaluated on real delivery data from hepsiJET, a logistics company in Turkey.

3.1 Experimental Setup

The courier dataset covers the delivery behaviors over a two-year period, including the pandemic era, from February 2020 to May 2022. The dataset contains 37 categorical and numeric attributes: 7 raw features, including the demographic information and the ids of the couriers and of the cross-docks in which they work, 6 calendar features, and 24 features extracted from the raw delivery counts. The initial training set covers the delivery data of the couriers until 2022, and the test set is composed of the data from 2022. Using the training and test sets, features are selected from the built-in feature importance outputs of the XGBoost, LightGBM, and Random Forest models together with the features whose p-values are significant (less than 0.05). For churn prediction, the delivery counts and cross-dock-wise delivery rates of the current day, the mean values of these counts and rates over 3 days/1 week/2 weeks, and age, education, and working times are selected. For behavior prediction, all the aggregated features and the working times are selected. In the experiments, every Sunday, the models generated with the selected features of the historical data are used to perform churn prediction on the data of the previous week, and the behavior and churn prediction models are applied for the next week.

Table 2 The average of overall R2 values and RMSE of the regression algorithms

Algorithms                        Avg. R2 values   Avg. RMSE
XGBoost Regression                0.853            27.60
LightGBM Regression               0.862            26.71
Random Forest Regressor           0.830            27.98
Linear Regression                 0.836            27.21
Multilayer Perceptron Regressor   0.814            29.06

3.2 Evaluation Metrics

The prediction of future delivery behaviors is evaluated in terms of the R2-score and the Root Mean Square Error (RMSE). For churn prediction, the empirical performance of the traditional machine learning algorithms is evaluated using F1-scores and ROC curves with AUC scores. Accuracy is not a reliable metric for binary classification problems with imbalanced data; therefore, F1-scores are computed, taking into account that the couriers in the screening group are the possible churners in the coming weeks.
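A small sketch of the evaluation metrics named above using scikit-learn; the prediction arrays are placeholders.

```python
# Illustrative computation of the evaluation metrics used in this study.
import numpy as np
from sklearn.metrics import f1_score, roc_auc_score, r2_score, mean_squared_error

# Regression metrics for the behavior (delivery count) predictions.
y_true_reg = np.array([120.0, 80.0, 95.0])
y_pred_reg = np.array([110.0, 85.0, 100.0])
r2 = r2_score(y_true_reg, y_pred_reg)
rmse = mean_squared_error(y_true_reg, y_pred_reg, squared=False)

# Classification metrics for churn prediction (scores from predict_proba).
y_true_cls = np.array([0, 1, 1, 0])
churn_scores = np.array([0.2, 0.8, 0.6, 0.3])
auc = roc_auc_score(y_true_cls, churn_scores)
f1 = f1_score(y_true_cls, churn_scores >= 0.5)

print(f"R2={r2:.3f} RMSE={rmse:.2f} AUC={auc:.3f} F1={f1:.3f}")
```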

3.3 Results of Courier Behavior Prediction Experiment

The average R2 values and RMSEs of the regression algorithms are given in Table 2. The best performance was obtained by the LightGBM algorithm for each month of the test set, and the XGBoost regression model presented comparably good performance, whereas the MLP and Random Forest regressors provided the worst predictions. The total RMSE of the predicted future behaviors of the couriers for each day is shown in Fig. 2. Therefore, for churn prediction using the delivery behaviors predicted for the next weeks, the predictions of the XGBoost and LightGBM regression models are combined.

Fig. 2 Daily total RMSE values of the predicted behaviors of the couriers by the algorithms

3.4 Results of Churn Prediction Experiment

In the early churn prediction experiments, the best number of days for annotating the churner couriers is first estimated among 3, 7, 10, 14, 30, and 60 days. Figure 3 shows, for each candidate number of days, the ROC curve of the algorithm with the highest AUC score; the best churn prediction performance was obtained by XGBoost when the daily records of the churners in their last month were annotated as the "churn" class.

Fig. 3 The number of days used for annotation in terms of the ROC curves with AUC scores of the best algorithms

Table 3a and b list the churn prediction performances obtained using the previous and the next weeks, respectively. According to the results, XGBoost presented the best early churn prediction performance, with an average F1-score of 0.684 and an AUC score of 0.856 using the data of the previous weeks, and an average F1-score of 0.662 and an AUC score of 0.805 using the data of the next weeks. The ROC curves with AUC scores in Fig. 4a, b were obtained using the previous weeks and the next weeks, respectively; the best performances were again attained by XGBoost, while LightGBM and Random Forest also provided satisfactory performances. The binary MLP model performed reasonably for early churn prediction with the delivery data of the previous weeks, whereas the Decision Tree algorithm had the worst performance in every week of the test set. The best monthly performance was achieved in April by the XGBoost algorithm, and the worst in February (Fig. 4).

Table 3 The F1-score and AUC score of the churn prediction models with (a) the features of couriers in the previous weeks, and (b) the features of couriers in the next weeks

(a) Previous weeks
Algorithm        F1-score   AUC-score
XGBoost          0.684      0.856
LightGBM         0.671      0.827
Random Forest    0.657      0.831
MLP              0.660      0.803
Decision Tree    0.614      0.741

(b) Next weeks
Algorithm        F1-score   AUC-score
XGBoost          0.662      0.805
LightGBM         0.640      0.791
Random Forest    0.644      0.790
MLP              0.624      0.777
Decision Tree    0.541      0.707

Fig. 4 The ROC curves with AUC scores of the algorithms for churn prediction with a the data of the previous weeks, and b the data of the next weeks

4 Conclusion and Discussion

In the logistics sector, hiring a new courier to replace one who resigned is costly and time-consuming, particularly when the new courier is not familiar with the delivery intensity of the districts and with the area- and company-specific requirements and circumstances. To reduce the costs associated with courier churn, an early churn prediction approach is deployed that can reliably predict the couriers who are about to leave; for the predicted possible churners, strategies should be adopted to retain as many valuable couriers as possible. To improve the performance of churn prediction and delivery number prediction, various features aggregated from the delivery performances are employed and analyzed in the experiments. The features, including behavioral features aggregated daily from the deliveries and demographic features, were evaluated with XGBoost, LightGBM, and Random Forest, and the most useful features were selected for the regression model of courier behavior prediction and for the binary classification-based churn prediction. The experiments demonstrated that XGBoost and LightGBM provided the best performances, achieving higher R2 values for behavior prediction and higher F1 and AUC scores for churn prediction than the other algorithms.


References

1. Abiad M, Ionescu S (2020) Customer churn analysis using binary logistic regression model. BAU J Sci Technol 1(2):7
2. Ajit P (2016) Prediction of employee turnover in organizations using machine learning algorithms. Algorithms 4(5):C5
3. Boushey H, Glynn S (2012) There are significant business costs to replacing employees. Center Am Progress 16:1–9
4. Chen K, Hu YH, Hsieh YC (2015) Predicting customer churn from valuable B2B customers in the logistics industry: a case study. IseB 13(3):475–494
5. Dahiya K, Bhatia S (2015) Customer churn analysis in telecom industry. In: 2015 4th international conference on reliability, infocom technologies and optimization (ICRITO) (trends and future directions), pp 1–6
6. Le L, Nguyen H, Zhou J, Dou J, Moayedi H et al (2019) Estimating the heating load of buildings for smart city planning using a novel artificial intelligence technique PSO-XGBoost. Appl Sci 9(13):2714
7. Li Y, Wei J, Kang K, Wu Z (2019) An efficient noise-filtered ensemble model for customer churn analysis in aviation industry. J Intell Fuzzy Syst 37(2):2575–2585
8. Karvana K, Yazid S, Syalim A, Mursanto P (2019) Customer churn analysis and prediction using data mining models in banking industry. In: 2019 international workshop on big data and information security (IWBIS), pp 33–38
9. Nagadevara V, Srinivasan V, Valk R (2008) Establishing a link between employee turnover and withdrawal behaviours: application of data mining techniques
10. Sahinkaya G, Erek D, Yaman H, Aktas M (2021) On the data analysis workflow for predicting customer churn behavior in cargo and logistics sectors: case study. In: 2021 international conference on electrical, communication, and computer engineering (ICECCE), pp 1–6
11. Sharma T, Gupta P, Nigam V, Goel M (2020) Customer churn prediction in telecommunications using gradient boosted trees. In: International conference on innovative computing and communications, pp 235–246
12. Viu-Roig M, Alvarez-Palau E (2020) The impact of E-commerce-related last-mile logistics on cities: a systematic literature review. Sustainability 12(16):6492

Combining Different Data Sources for IIoT-Based Process Monitoring Rodrigo Gomes, Vasco Amaral, and Fernando Brito e Abreu

Abstract Motivation—Industrial internet of things (IIoT) refers to interconnected sensors, instruments, and other devices networked together with computers' industrial applications, including manufacturing and energy management. This connectivity allows for data collection, exchange, and analysis, potentially facilitating improvements in productivity and efficiency, as well as other economic benefits. IIoT provides more automation by using cloud computing to refine and optimize process controls. Problem—Detection and classification of events inside industrial settings for process monitoring often rely on input channels of various types (e.g. energy consumption, occupation data or noise) that are typically imprecise. However, the proper identification of events is fundamental for automatic monitoring processes in the industrial setting, allowing simulation and forecasting for decision support. Methods—We have built a framework in which process events are collected in a classic car restoration shop to detect the usage of equipment such as paint booths, sanders and polishers, using energy monitoring, temperature, humidity and vibration IoT sensors connected to a Wifi network. In addition, BLE beacons are used to locate the cars being repaired within the shop floor plan. InfluxDB is used to store and monitor the sensor data, and a server is used to perform operations on it, as well as to run machine learning algorithms. Results—By combining location data and the equipment being used, we are able to infer, using ML algorithms, some steps of the restoration process each classic car is going through. This detection contributes to the ability of car owners to remotely follow the restoration process, thus reducing the carbon footprint and making the whole process more transparent.

R. Gomes (B) NOVA School of Science and Technology, Caparica, Portugal e-mail: [email protected] V. Amaral NOVA LINCS & NOVA School of Science and Technology, Caparica, Portugal F. B. Abreu ISTAR-IUL & ISCTE-Instituto Universitário de Lisboa, Lisboa, Portugal © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 S. Anwar et al. (eds.), Proceedings of International Conference on Information Technology and Applications, Lecture Notes in Networks and Systems 614, https://doi.org/10.1007/978-981-19-9331-2_10


Keywords Process activity recognition · IIoT · IoT sensors · Intrusive load monitoring · Machine learning · Indoor location · Classic cars restoration · Charter of Turin

1 Introduction

The historical importance, the aesthetics, the build quality, and the rarity are characteristic features that, individually or collectively, define a car as a classic. Because of their many admirers, classic cars are highly valued, both sentimentally and monetarily. Keeping the authenticity of those masterpieces, i.e., maintaining them as close as possible to when they left the factory, requires expert restoration services. Guidelines for the restoration of classic cars were proposed by FIVA (Fédération Internationale des Véhicules Anciens, https://fiva.org). They may be used for the certification of classic cars by accredited certification bodies such as the ACP (Automóvel Club de Portugal, https://www.acp.pt/classicos).

Monitoring the classic car restoration process, so that pieces of evidence are recorded, is important for managing the shop floor plan, for certification purposes, and for allowing classic car owners to follow the restoration process remotely, which reduces the carbon footprint and makes the whole process more transparent. Our work aims to create and implement an IoT monitoring system that recognizes the tools used and infers the restoration tasks (e.g., mineral blasting, bodywork restoration, painting, paint drying, bodywork finishing) that a classic car is going through. We intend to use Intrusive Load Monitoring (ILM) techniques by installing energy meters in the workshop outlets and feeding their data into a supervised Machine Learning (ML) model that detects the various tools used by the workers in the restoration of each classic car; combined with the car's location, this allows the ongoing restoration task to be recognized automatically. This presents some challenges, as classic car restoration is a complex process [1]. In the same workshop, many cars may be under restoration, often each at a different stage of that process and with different tools being applied. To make detection even more challenging, the same tool may be shared across adjacent cars without unplugging, making power consumption-based detection imprecise.

The current work is a continuation of the one reported in [2], where a Raspberry Pi-based edge computer equipped with several sensors was attached (using magnets) to each car body on the plant shop floor, allowing the capture of data on the vibrations produced by the different restoration tools, as well as on the temperature and humidity conditions that the cars went through. Estimote BLE (Bluetooth Low Energy) beacons attached to the walls of the plant shop floor were also detected by the edge computer, to allow indoor location of the car body under restoration. The raw data captured by the edge computers was then sent to an AWS (Amazon Web Services) cloud-based platform where a ML algorithm, combining the detected tools and the detected position, allowed us to identify some of the tasks of the restoration process. A web application was also built to monitor the state of operation of all edge computers and beacons. Although this work presented significant progress in using IoT techniques in an industrial context (aka IIoT) for process monitoring purposes, the developed system lacked precision both in detecting the operation of some tools and in indoor locating.

In Sect. 2, we present previously developed projects with objectives similar to ours. The description and architecture of the proposed system are detailed in Sect. 3. In Sect. 4, we evaluate and discuss the obtained results. Finally, in Sect. 5, we present some conclusions.

2 Related Work

2.1 Intrusive Load Monitoring (ILM)

Most works on this topic were developed in smart-home contexts. One hundred power consumption samples from three houses and the same appliances were used in [3] for feature extraction, serving as input to an Artificial Neural Network (ANN) classification algorithm; results showed a positive overall accuracy of 95%. A second test, using the ANN trained with the previous data, was executed with data from a new house, but worse results emerged, even after reducing the number of features.

An attempt to identify the different states of each appliance is described in [4]. Pre-processing with z-normalization was used for feature selection, and the classification step applied a Hidden Markov Model (HMM) algorithm. Positive results were achieved, with an accuracy of 94% for the test with appliances in the training set and 74% for other appliances. An app to visualize the recognized appliance characteristics in real time is also reported.

A prototype for collecting load data with an Arduino and an energy sensor is described in [5]. For classification purposes, different algorithms were tested, namely K-Nearest Neighbour (KNN), Random Forest (RF), Decision Tree (DT), and the Long Short-Term Memory neural network; RF presented the best results. An experimental test was also carried out to determine the best sliding window size, i.e. the one to be used in the feature extraction and classification algorithms.

An IoT architecture for ILM is presented in [6]. The features were the same as those described in [3], and three supervised learning (SL) algorithms were tried for classification. The Feed Forward Neural Network (FNNN) obtained the best accuracy for seen data (90%).


2.2 Indoor Location Systems

Several techniques are used for this purpose. In some cases, the reader is attached to the object to track and many tags are dispersed through the space [7], while in others a tag is attached to the moving object and the reader(s) are fixed. In both cases, distances are calculated based on the RSSI (Received Signal Strength Indicator, a measurement of the power present in a received radio signal), and trilateration can then be used to detect the position of the moving object. For instance, Wifi-based indoor location can be performed through trilateration of the RSSI corresponding to the access points (APs) detected by a mobile phone [8]. BLE-based location technology is similar, but beacon transmitters are used instead of APs, such as in [9], where RSSI trilateration and fingerprinting are used. ML algorithms can be used for improved fingerprinting, such as in [10], where an average estimation error of 50 cm is reported.

3 Proposed System Overview

System description. The detection of the tools used by the workers includes the three main steps of ILM, i.e. data collection, feature extraction, and classification. Regarding data collection, smart energy meters are installed between the tools and the workshop outlets to capture the plugged tools' energy loads. Two smart meter types (the Nedis Smart Plug and the Shelly Plug S) were tested in the workshop with many of the available tools. They capture the electrical power in Watts (W) in real time with a frequency of one measurement per second, and both have an API to get the measured data and use Wifi to send it to the internet. We chose the Shelly plug because its API is more straightforward, its plug is smaller (i.e. physically less intrusive), and its power range is enough for the tools used in the workshop (up to 2500 W, compared to Nedis' 3650 W).

For feature extraction, we used the technique described in [3, 6]. A script is always running, taking the energy sensor data as input; when a non-zero power value arrives, the algorithm takes the next 100 data entries (the sliding window size) and calculates all features regarding power levels and power variations. Nine features were chosen based on previous works: maximum power value; minimum power value; mean power for non-zero values; number of samples with power less than or equal to 30 W; number of samples with power between 30 and 400 W; number of samples with power between 400 and 1000 W; number of samples with power greater than 1000 W; number of power transitions between 10 and 100 W; and number of power transitions greater than 1000 W. The group of features for each data window serves as input to a supervised ML model (a sliding-window feature sketch is given at the end of this overview). In the ML training phase, these features are labeled with the ground truth, i.e. the tool (target) being measured. Different algorithms are then trained with the same labelled data to find the one with the best predictions when provided with unlabeled data (in the ML estimation stage). After the three ILM phases, the electrical tools used by the workers at any time of the day in the workshop are registered and available.

To complete this tool recognition process, the remaining part we need to tackle is to define which car was under intervention by those tools. A literature review about indoor location systems suggested some possible solutions. In [2], each sensor box has an associated car, and in a real system each vehicle would have a sensor box attached that goes with it throughout the whole workshop process. One solution is therefore to use a location system to track each energy sensor and, as the sensor box location is available, link the sensor to the closest box. Another solution is to use the timestamps of the energy sensor data and match them with the restoration steps detected by the sensor boxes. Once the tools used on each car are known, this information is combined with that provided by the sensor boxes; in addition to the Process Identification Algorithm developed in [2], more robustness and reliability is obtained by combining all data.

A web application is needed so that system users get feedback about the developed system and can make simple changes. Some features that should be available are, for example, the list of all activated and deactivated smart plugs in the workshop, the list of all electrical tools belonging to the workshop, and the registration of new smart plugs in the system.

We decided to implement an indoor localization system for the sensor box using a ML-based BLE fingerprinting technique. The latter encompasses two phases. The first is a training phase in which RSSI samples are captured throughout the entire area and the corresponding locations are used to train several ML location estimation models. The second is a validation phase in which a target moves around and the estimates produced by the models (a pair of x, y coordinates) are compared with ground truth measurements to assess their accuracy and choose the best one. In our experimental setup, several BLE beacons were distributed throughout the workshop. Measurement points were distributed across the workshop floor plan, about 3 m apart, and reference points were defined, whose coordinates inside the workshop were obtained. All these points were determined on the workshop plan, as can be seen in detail in the Fig. 1 diagram. To obtain the coordinates of each point, a cartesian plot was placed over the floor plan of the workshop, with the axes in the same measurement scale as the available floor plan. Then, at each measurement point, the distance to three visible reference points was recorded with a laser distance meter, together with the RSSI values of the beacons detected at that point and their ids. For each measurement point, a trilateration algorithm was then used to obtain its coordinates based on the distances to the reference points and their coordinates. A ML model is then trained using the coordinates of each measurement point as the target and its detected beacons' RSSI values as input; an example is shown in Table 1. During normal operation, the trained ML model, running in each sensor box/car, takes the detected RSSI values as input and predicts the most likely location within the workshop.
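As referenced above, a minimal sketch of the nine sliding-window features computed from the per-second power readings; the window handling and the interpretation of "power transitions" as differences between consecutive samples are illustrative assumptions, not the authors' exact script.

```python
# Illustrative extraction of the nine ILM features from a 100-sample power window (W).
import numpy as np

def extract_features(window: np.ndarray) -> dict:
    """window: 100 consecutive per-second power readings in Watts."""
    nonzero = window[window > 0]
    transitions = np.abs(np.diff(window))
    return {
        "max_power": float(window.max()),
        "min_power": float(window.min()),
        "mean_nonzero_power": float(nonzero.mean()) if nonzero.size else 0.0,
        "n_samples_le_30w": int((window <= 30).sum()),
        "n_samples_30_400w": int(((window > 30) & (window <= 400)).sum()),
        "n_samples_400_1000w": int(((window > 400) & (window <= 1000)).sum()),
        "n_samples_gt_1000w": int((window > 1000).sum()),
        "n_transitions_10_100w": int(((transitions >= 10) & (transitions <= 100)).sum()),
        "n_transitions_gt_1000w": int((transitions > 1000).sum()),
    }

# Example: features = extract_features(np.array(power_samples[:100]))
```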


Fig. 1 Workshop floor plan with the identification of beacon’s locations, reference points, and measurement points

Table 1 Example of a row of the data acquired in the sensor box location method to serve as input to the ML model

Beacon id1 (RSSI)   Beacon id2 (RSSI)   Beacon id3 (RSSI)   Beacon id4 (RSSI)   (x, y)
90.5                80.6                30.5                70.0                (9.95, 2.56)
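A minimal sketch of the fingerprinting model trained on rows like the one in Table 1; the sample values and the choice of a scikit-learn KNN regressor over (x, y) targets are illustrative assumptions rather than the project's exact model.

```python
# Illustrative BLE fingerprinting: RSSI vector -> (x, y) position via KNN.
import numpy as np
from sklearn.neighbors import KNeighborsRegressor

# Each row: RSSI of beacons id1..id4 at a known measurement point (values are made up).
rssi_samples = np.array([
    [90.5, 80.6, 30.5, 70.0],
    [85.0, 82.1, 40.2, 66.3],
    [70.4, 90.8, 55.1, 60.0],
])
positions = np.array([
    [9.95, 2.56],
    [5.50, 2.30],
    [14.73, 2.53],
])

model = KNeighborsRegressor(n_neighbors=1)
model.fit(rssi_samples, positions)

# At run time, the sensor box feeds its currently detected RSSI values to the model.
predicted_xy = model.predict([[88.0, 81.0, 35.0, 69.0]])
print(predicted_xy)
```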

System architecture. Since we receive power consumption data from the sensors every second, we chose the open-source InfluxDB time series database, installed in a virtual machine hosted on an OpenStack platform operated by INCD. In the ILM part of the work, a bucket receives the electric sensors' data, as shown in Fig. 2. The ingestion uses a Telegraf agent that polls the energy sensor API every second for its measurements. A Python script performs feature extraction on a 100-second sliding window of power consumption values retrieved from the InfluxDB database. The results serve as input to the ML model, which returns the tool predictions (i.e. which tools were most likely in use during the sliding window). Every tool prediction is then saved with its timestamp in another bucket.
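A sketch of how such a script could pull the latest window from InfluxDB with the official Python client and score it; the bucket and measurement names, the Flux query, the connection settings, and the reduced feature set are assumptions, not the project's actual configuration.

```python
# Illustrative read of the last 100 s of power data from InfluxDB and tool prediction.
import pickle
import numpy as np
from influxdb_client import InfluxDBClient

client = InfluxDBClient(url="http://localhost:8086", token="MY_TOKEN", org="my-org")
flux = '''
from(bucket: "energy_sensors")
  |> range(start: -100s)
  |> filter(fn: (r) => r._measurement == "power" and r._field == "watts")
'''
df = client.query_api().query_data_frame(flux)
window = df["_value"].to_numpy()

# A couple of window features for illustration (the real script computes the nine ILM features).
features = [window.max(), window.min(), window[window > 0].mean()]

with open("tool_model.pkl", "rb") as f:    # tool classifier previously saved with pickle
    model = pickle.load(f)
print("Predicted tool:", model.predict([features])[0])
```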



Fig. 2 Architecture of the system’s electrical data and feature extraction

To deploy the ML model after it has been manually trained, we save on our server a Pickle file (Pickle is a Python tool that allows saving trained ML models for later use) with the trained model, which is accessed every time a prediction is required (a short pickle sketch is given below). For the new location system, the beacons data are also saved in an InfluxDB bucket, and the location ML model, after being trained, is also deployed on our server as a pickle file that is used by the Process Identifier Algorithm.

The Process Identifier Algorithm was developed as an AWS Lambda function. However, as we want to reduce as much as possible the use of proprietary services that may later be charged, we decided to transfer the function to a Python script running on our server and to move the sensor box data of [2] to InfluxDB. This way, the script queries InfluxDB for all the data needed to run the algorithm that identifies restoration processes, now with the help of the tools identified by the ILM module.

The Web Application front-end was implemented in [2] with React and communicates with the back-end via the Amazon API Gateway, so it is necessary to update and expand the front-end so that it can give feedback to the users about the new system features related to the predictions and smart plugs.

As this is an IoT system, its architecture layers can be defined. We use a five-layer structure for our workshop problem and define the layers as follows: Physical Things: the electrical tools available in the workshop; Perception: the electrical sensors plugged into the workshop outlets; Communication Network: Wifi, as this is the way the sensors communicate with the cloud; Middleware: all the data storage services and algorithms implemented over InfluxDB that interact with the sensor data; Application: the Web Application where the interaction with users happens.
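A small sketch of the pickle-based deployment referenced above; the file names, toy data, and the scikit-learn model are placeholders.

```python
# Illustrative save/load of a trained model with pickle for server-side predictions.
import pickle
from sklearn.ensemble import RandomForestClassifier

# Training side (run once, offline).
model = RandomForestClassifier().fit(
    [[2400, 0, 800], [30, 0, 25], [1500, 10, 900]],   # toy feature rows
    ["paint_booth", "led_lamp", "sander"],             # toy tool labels
)
with open("tool_model.pkl", "wb") as f:
    pickle.dump(model, f)

# Serving side (run on every prediction request).
with open("tool_model.pkl", "rb") as f:
    deployed = pickle.load(f)
print(deployed.predict([[2300, 5, 780]]))
```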


4 Results and Discussion

To verify whether predictions can be made from the electrical data and to choose the best ML model, we considered the most accurate models used in the previous works detailed in Sect. 2. Six different supervised ML algorithms were implemented, tested, and compared: Random Forest (RF), K-Nearest Neighbour (KNN), Decision Tree (DT), Gaussian Naive Bayes (GNB), Gradient Boosting (GB), and the Feed Forward Neural Network (FNNN).

The energy data used to test the ML models was recorded in the workshop. The electrical tools at work in the workshop, a drill, two electrical sanders, two polishers, an angle grinder, and a hot air blower, were measured for an entire afternoon with the Shelly plug. After the feature data was available, we divided it into a training set and a test set, randomly choosing 70% of the entries related to each tool for training and 30% for testing. The split was done per tool so that both the training set and the test set contain data from every available tool.

In the implementation, we manually ran every algorithm on a local machine, taking advantage of the open-source ML libraries available online. For the RF, KNN, DT, GNB, and GB algorithms, we used the scikit-learn library; for the FNNN, the Keras library was used. For all the algorithms, the results with and without data normalization were compared, and different parameters and hyper-parameters were tried to get the best out of each algorithm for a more meaningful comparison. However, none of the algorithms presented better results with data normalization. A feature reduction was also tested on the models; the most important features were the maximum and the minimum power values, but testing with just these two features did not improve any of the algorithms, so all the features were used in the comparison. For the FNNN model, nine input nodes were used (the number of features), two hidden layers, and an output layer with six nodes, equal to the number of different tools to predict; different numbers of nodes in the hidden layers were tested to reach the best results.

As can be seen in Table 2, the results are very positive: we achieved 100% accuracy and the maximum F1-score with two algorithms, Random Forest and Gradient Boosting. The minimum accuracy was 63% for Gaussian Naive Bayes, which still correctly predicted more than half of the tools given to the model. Given these results, the algorithm to deploy to the cloud is either Random Forest or Gradient Boosting.

Table 2 Accuracy and F1-score of the tested ML algorithms

Algorithm   Accuracy (%)   F1-score (%)
RF          100.0          100.0
KNN         81.8           78.8
DT          81.8           81.8
GNB         63.6           59.1
GB          100.0          100.0
FNN         72.73          82.0

To test the feasibility of the sensor box location system, we took a small zone of the workshop to measure some data and test an ML algorithm. As described in Sect. 3, to get the coordinates of the measurement points (Ms), we first needed to define the cartesian coordinates, relative to the workshop floor plan, of the reference points (RPs) 8, 6, and 7. The result is the coordinates shown in Fig. 3.

Then all the distances from each measurement point to the RPs were recorded; the distances of M1 are also shown in Fig. 3. Having the distances from each measurement point, a trilateration algorithm was used to obtain their coordinates: M1 (5.5, 2.3), M2 (9.95, 2.56), and M3 (14.73, 2.53). The RSSI values detected at each M were also recorded. As this location test is only the first, superficial phase of the testing procedure that must be done, just one algorithm was chosen to verify whether predictions could be made with the acquired data. The ML algorithm chosen to predict the sensor box spot was K-Nearest Neighbour (KNN), as it is one of the most frequently considered in fingerprinting-related works. The set of RSSI values and the coordinates of each measurement point were used to train the ML model. Some RSSI values obtained around measurement points M1, M2, and M3 were given to the model as a test set so that the spots could be predicted. Despite the small amount of training and test data, the KNN achieved an accuracy of 90%. From these results we can only conclude that the tracking system is feasible and could be expanded to the entire workshop, since even with the few data received in a small zone the ML model showed positive results. However, we must obtain more data regarding the beacons' RSSI values, run tests in all the spaces of the workshop, and compare different ML algorithms; only then can we guarantee the correct functioning of the box location system.
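A minimal sketch of recovering a measurement point's coordinates from its distances to three reference points, as described above; the least-squares linearization is one common way to implement trilateration and is an illustrative choice, with made-up reference coordinates and distances.

```python
# Illustrative 2D trilateration: position from distances to three known reference points.
import numpy as np

def trilaterate(ref_points: np.ndarray, distances: np.ndarray) -> np.ndarray:
    """ref_points: 3x2 array of (x, y) reference coordinates; distances: their measured ranges."""
    (x1, y1), (x2, y2), (x3, y3) = ref_points
    d1, d2, d3 = distances
    # Subtracting the first circle equation from the other two gives a linear system A p = b.
    A = np.array([[2 * (x2 - x1), 2 * (y2 - y1)],
                  [2 * (x3 - x1), 2 * (y3 - y1)]])
    b = np.array([d1**2 - d2**2 + x2**2 - x1**2 + y2**2 - y1**2,
                  d1**2 - d3**2 + x3**2 - x1**2 + y3**2 - y1**2])
    return np.linalg.lstsq(A, b, rcond=None)[0]

# Hypothetical reference points and laser-measured distances for one measurement point.
refs = np.array([[0.0, 0.0], [12.0, 0.0], [6.0, 8.0]])
print(trilaterate(refs, np.array([10.2, 3.4, 6.7])))
```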

Fig. 3 Partial plan of the workshop, identifying beacons location (blue), reference points (red) and measurement points (green)


5 Conclusion and Future Work

This work presented an ILM approach for tool recognition in a workshop context and a location solution for the cars being restored. With the ML algorithms tested, the results of the ILM approach demonstrate that the power tools in use can be clearly predicted; what is still missing is the identification of the car on which they were used. However, the data acquired to train and test the model was only for testing purposes, and more data should be acquired over several days for a completely reliable model. The new sensor box location method should also be expanded to the whole workshop so that every sensor box can be located precisely anywhere on the floor plan. In addition, the merge with the work done in [2] should be finished by completing and testing the web application, so that the restoration processes can be identified and made available in the application.

After completing the system, future work will be to create a real-time view of the workshop floor plan where all the detected events are marked in the exact place where they happened, so that all the events detected by the developed IoT system can be inspected interactively.

Acknowledgements This work was produced with the support of INCD funded by FCT and FEDER under the project 01/SAICT/2016 nº 022153, and partially supported by NOVA LINCS (FCT UIDB/04516/2020).

References

1. Gibbins K (2018) Charter of Turin handbook. Tech. rep., Fédération Internationale des Véhicules Anciens (FIVA). https://fiva.org/download/turin-charter-handbook-updated-2019english-version/
2. Pereira D (2022) An automated system for monitoring and control classic cars' restorations: an IoT-based approach. Master's thesis, Monte da Caparica, Portugal. http://hdl.handle.net/10362/138798
3. Paradiso F, Paganelli F, Luchetta A, Giuli D, Castrogiovanni P (2013) ANN-based appliance recognition from low-frequency energy monitoring data. In: Proceedings of the 14th international symposium on a world of wireless, mobile and multimedia networks, WoWMoM 2013. IEEE. https://doi.org/10.1109/WoWMoM.2013.6583496
4. Ridi A, Gisler C, Hennebert J (2014) Appliance and state recognition using Hidden Markov Models. In: Proceedings of the 2014 international conference on data science and advanced analytics (DSAA 2014), pp 270–276. IEEE. https://doi.org/10.1109/DSAA.2014.7058084
5. Mihailescu RC, Hurtig D, Olsson C (2020) End-to-end anytime solution for appliance recognition based on high-resolution current sensing with few-shot learning. Internet of Things (Netherlands) 11. https://doi.org/10.1016/j.iot.2020.100263
6. Franco P, Martinez J, Kim YC, Ahmed M (2021) IoT based approach for load monitoring and activity recognition in smart homes. IEEE Access 9:45325–45339. https://doi.org/10.1109/ACCESS.2021.3067029
7. Saab S, Nakad Z (2011) A standalone RFID indoor positioning system using passive tags. IEEE Trans Ind Electron 58(5):1961–1970. https://doi.org/10.1109/TIE.2010.2055774


8. Khelifi F, Bradai A, Benslimane A, Rawat P, Atri M (2019) A survey of localization systems in internet of things. Mobile Netw Appl 24(3):761–785. https://doi.org/10.1007/s11036-018-1090-3
9. Cabarkapa D, Grujic I, Pavlovic P (2015) Comparative analysis of the Bluetooth low-energy indoor positioning systems. In: 2015 12th international conference on telecommunications in modern satellite, cable and broadcasting services, TEL-SIKS 2015, pp 76–79. https://doi.org/10.1109/TELSKS.2015.7357741
10. Sthapit P, Gang HS, Pyun JY (2018) Bluetooth based indoor positioning using machine learning algorithms. In: 2018 IEEE international conference on consumer electronics - Asia (ICCE-Asia), pp 206–212. https://doi.org/10.1109/ICCE-ASIA.2018.8552138

Comparative Analysis of Machine Learning Algorithms for Author Age and Gender Identification Zarah Zainab, Feras Al-Obeidat, Fernando Moreira, Haji Gul, and Adnan Amin

Abstract Author profiling is a part of information retrieval in which different perspectives on an author are observed by considering characteristics such as native language, gender, and age. Different techniques are used to extract the required information through text analysis, for example author identification on social media and for Short Text Message Service texts. Author profiling helps in security and in blogs for identification purposes by capturing authors' writing behaviors through messages, posts, comments, blogs, and chat logs. Most of the work in this area has been done in English and other native languages. Roman Urdu is also getting attention for the author profiling task, but it requires converting Roman Urdu to English to extract important features like Named Entity Recognition (NER) and other linguistic features; this conversion may lose important information, given the limitations of converting one language to another. This research explores machine learning techniques that can be used for all languages to overcome the conversion limitation. The Vector Space Model (VSM) and Query Likelihood (Q.L.) are used to identify the author's age and gender. Experimental results reveal that Q.L. produces better results in terms of accuracy.

Z. Zainab City University of Science and Information Technology, Peshawar, Pakistan F. Al-Obeidat College of Technological Innovation, Zayed University, Abu Dhabi, UAE e-mail: [email protected] F. Moreira (B) REMIT, IJP, Universidade Portucalense, Porto, Portugal e-mail: [email protected] IEETA, Universidade de Aveiro, Aveiro, Portugal H. Gul · A. Amin Center for Excellence in Information Technology, Institute of Management Sciences, Peshawar, Pakistan e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 S. Anwar et al. (eds.), Proceedings of International Conference on Information Technology and Applications, Lecture Notes in Networks and Systems 614, https://doi.org/10.1007/978-981-19-9331-2_11


Keywords Vector space model · Query likelihood model · Information retrieval (I.R.) · Text mining · Author profiling

1 Introduction

In recent years, social media platforms like Facebook, Twitter, Myspace, Hyves, Bebo, and Net-log have expanded impressively and have enabled millions of users of all ages to develop and support personal and professional relationships [24]. Social media can also be used as a tool for advertising, marketing, online business, and social activities where users keep their personal information. According to [7], social media platforms have evolved massive user traffic; Facebook, for example, had 1.65 billion monthly active users in the first quarter of 2016. Many people provide fake names, ages, genders, and locations to conceal their real identities [4]. To catch internet predators, law enforcement agencies and social network moderators confront two important issues:

• Investigating the substantial number of profiles and communications on social networks is quite challenging.
• Internet predators typically provide fake identities and act like young people to establish a link with their victims.

Furthermore, identifying the gender and age of customers from their social media comments helps companies recognize who their customers are, so that they can make decisions to improve their services in the future [22, 30]. This indicates the need to develop automatic tools and techniques for detecting fake author profiles in different types of texts, like Facebook posts/comments, Twitter comments, blog posts, and other analytical perspectives [17]. Author profiling can be defined as follows: given a set of texts, identify the author's age group, gender, profession, education, native language, personality traits, etc. This is a challenging problem for researchers [4]. Machine learning (ML) techniques play an important role in automatic detection, prediction, and forecasting [14]. Therefore, this research focuses on ML techniques for predicting the author's age and gender using the Query Likelihood (Q.L.) and Vector Space Model (VSM) approaches. These techniques are applied using Stylistic Features (S.F.s) [12, 35] to identify the author's age and gender.

For gender analysis, the accuracy is 70% for Q.L. and 66% for VSM when considering S.F.; 70% for Q.L. and 44% for VSM when considering S.F. with spaces inserted between tokens; and 70% for Q.L. and 46% for VSM when removing S.F. For age analysis, the accuracy is 62% for Q.L. and 66% for VSM when considering S.F.; 64% for Q.L. and 56% for VSM when considering S.F. with spaces between tokens; and 66% for Q.L. and 64% for VSM without considering S.F. When age and gender are combined, the accuracy is 76% for Q.L. and 66% for VSM when considering S.F., 67% for Q.L. and 50% for VSM with spaces between tokens, and 68% for Q.L. and 55% for VSM without considering S.F.


Researchers have only recently started working on Roman Urdu analysis. The important factors in Roman Urdu author profiling and analysis are the author's name and affiliation, and only very limited machine learning methodology has been proposed for author profiling in Roman Urdu. Due to the limited literature on Roman Urdu profiling systems, a few machine learning algorithms are used in this paper to identify the author's gender and age. People use Roman Urdu to comment and express their opinions, trying to convey their messages using shorthand, emojis, and so on, as newer generations adopt slang variants of the language in order to type easily and freely. As the world becomes more automated, machines must learn such languages to make different decisions. Work on the Roman Urdu language lags behind other languages such as English, so it is now a hot topic to introduce state-of-the-art technology for learning Roman Urdu. We chose these models to compare the results, which yielded better results and demonstrated a new path toward improvement.

In this paper, Sect. 1 contains the introduction, while Sect. 2 covers the related work. Section 3 contains a detailed discussion of the methodology and the step-by-step procedure of how the framework works. Finally, Sect. 4 presents the conclusion of the work, followed by the references.

2 Related Work

In recent years, researchers have made progress in this area by developing benchmark corpora and techniques for author profiling tasks. The most prominent effort in this regard is the series of PAN competitions on author profiling [16, 23, 26–29]. Corpora in the literature have been developed for various genres, for example fiction and non-fiction texts [28], chat logs [36], customer reviews [27], emails [10], blogs and social media [12, 25, 34], and comments [13]. Author identification is one of the methods of authorship analysis whose objective is to identify the traits of an author (age, gender, language, etc.) by analyzing their writing behavior [27]. It also helps to reduce the misuse of social media and gain the trust of users. Most of these corpora are in the English language; however, some work has been done in European languages as well, like Dutch, Italian, and Spanish [38].

One task that currently attracts the attention of researchers is predicting the age and gender of the author through an analytical and critical analysis of the author's writing behavior. The authors of [15] investigated the problem of gender and genre identification on an English corpus of 604 documents taken from the BNC corpus, tagged with fiction vs. non-fiction genre and with gender; the corpus has an equal number of female and male authors in each genre (123 in fiction and 179 in non-fiction). Blog data has also been highly targeted for author profiling experimentation. J. Schler et al. [34] developed a corpus of 71,493 English blogs to analyze stylistic and content-based features for identifying the author's age and gender. S. Rosenthal et al. [31] have a corpus of 24,500 English

blogs for age prediction using three different feature sets: lexical stylistics, lexical content, and online behavior. G. K. Mikros et al. [20] investigated author profiling in the Greek language using blogs; the GBC (Greek Blog Corpus) was built by taking 50 blog entries from 20 bloggers. L. Wanner et al. [38] contributed blog corpora in the Spanish, Dutch, French, German, and Catalan languages for gender and language identification using stylistic features. S. Mechti et al. [18] developed a corpus of health forums for age and gender identification; it contains 84,518 profiles, categorized into the age groups 12–17, 18–29, 30–49, 50–64, and 65+, in which the female gender class was dominant.

These days, social networks like Twitter and Facebook have grabbed the attention of data analysts and researchers, who apply different machine learning and text mining techniques to perform such tasks with improved accuracy. W. Zhang et al. [39] collected 40,000 posts from Chinese social media (Sina Weibo) users for author age prediction, considering four different age groups. G. Guglielmi et al. [13] explored the author profiling task by collecting comments from Twitter in 13 different languages for gender identification; the corpus contained 4,102,434 comments from 184,000 authors, with a division of 55% female and 45% male authors. Similarly, (Nguyen et al. 2013) used a corpus of Twitter comments in the Dutch language for age prediction. J. S. Alowibdi et al. [3] also performed an analysis on a Twitter comments corpus in the Dutch language with 53,326 profiles, of which 30,898 were male and 22,428 were female. F. Rangel et al. [27] developed a Spanish corpus of 1200 Facebook comments to investigate how human emotions correlate with gender. Schler et al. (2015) have a Facebook corpus of 75 million words from 75,000 users (with consent) to predict gender, age, and personality traits as a function of the words used in Facebook statuses. B. Plank et al. [37] experimented on the personality assessment of authors using a corpus of 66,000 Facebook users of the same applications. Another corpus, in Vietnamese, consisting of 6831 forum posts collected from 104 authors, was developed by [8] for the identification of the same traits as in (Pham et al. 2009) by employing stylistic and content features. M. Ciot et al. [8] built a corpus of 8618 Twitter users in four languages, French, Indonesian, Turkish, and Japanese, for gender prediction. M. Sap et al. [33] developed an age and gender prediction lexicon from a corpus of 75,394 Facebook users of My Personality, a third-party Facebook application. Verhoeven et al. [37] developed a Twitter-based corpus covering six different languages, namely Dutch, German, Italian, French, Portuguese, and Spanish, for gender and personality identification.

The corpora based on social media texts are mostly generated for English and other European languages using publicly available data, and the profiles in these corpora contain text in one single language. This research contributes multilingual (Roman, Urdu, and English) simple SMS text messages, which contain both public and private messages typed by the users.


The Roman Urdu script is also gaining attention in research. S. Mukund et al. [21] performed sentiment analysis on Urdu blog data using structural correspondence learning for Roman Urdu. M. Fatima et al. [11] extended this work by adding bilingual (Roman Urdu and English) lexicons. M. Bilal et al. [5] investigated the behavior of multilingual opinions (Roman Urdu and English) extracted from a blog, and M. Daud et al. [9] also worked on multilingual opinion mining (Roman Urdu and English). K. Mehmood et al. [19] performed spam classification on comment data in English and Roman Urdu. According to Safdar [32], Multilingual Information Retrieval (MLIR) accepts queries in numerous languages and retrieves the results in the language requested by the user. A questionnaire-based web survey was designed and distributed to internet users through a survey link; 110 participants responded, and the majority of them were found to use the internet daily. English was identified as the most popular language for searching for information: participants can understand English but use Roman Urdu for socializing and for retrieving information such as audio, video, etc.

Author area identification is a component of author profiling that aims to pinpoint the author's location based on the text [1]. It may enhance content recommendation, security, and the reduction of cybercrime, owing to its numerous uses in fake profile detection, content recommendation, sales and marketing, and forensic linguistics. Numerous author profiling tasks have received much attention in English, Arabic, and other European languages, but author region identification has received less attention. Urdu is a morphologically rich language used by over 100 million people worldwide [2], yet the dearth of corpora is a major reason for the lack of attention and advancement in research. Roman-Urdu-Parl, proposed in [6], is the first-ever large-scale publicly available Urdu parallel corpus, with 6.37 million sentence pairs; it was built to ensure that it captures the morphological and linguistic features of the language. In this study, we present a user study conducted on students at a local university in Pakistan, from which a corpus of Roman Urdu text messages was collected. We could quantitatively show that several words are written with more than one spelling, and most participants of our study were not comfortable with English and hence chose to write their text messages in Roman Urdu.

3 Methodology

This section discusses the main framework of the selected algorithms and the strategies used to compare the performance of the algorithms under different criteria. It consists of seven phases, as shown in Fig. 1. The details of each phase of the framework are given below.


Fig. 1 Proposed model for author profiling using machine learning algorithms

3.1 Preprocessing Phases

First, all the text messages of the authors, in the form of .txt files, are loaded into the system. Preprocessing is then performed in three phases, enumerated below:

• Separating files: in this phase, the system reads the training and testing files and separates the individual files inside the collection.
• Separating sentences: in this phase, the text messages inside the individual training and testing files are split into sentences.
• Tokenization: in this phase, the sentences are tokenized using the split function in Python, which separates the tokens by identifying the spaces between them inside each sentence.
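A minimal sketch of the three preprocessing phases above; the directory layout, file naming, and the sentence-splitting regex are assumptions.

```python
# Illustrative preprocessing: read .txt files, split into sentences, then tokenize.
import re
from pathlib import Path

def preprocess(corpus_dir: str) -> dict:
    """Returns {file name: list of token lists, one list per sentence}."""
    processed = {}
    for path in Path(corpus_dir).glob("*.txt"):        # phase 1: separating files
        text = path.read_text(encoding="utf-8")
        sentences = re.split(r"[.!?\n]+", text)         # phase 2: separating sentences
        processed[path.name] = [
            sentence.split() for sentence in sentences  # phase 3: whitespace tokenization
            if sentence.strip()
        ]
    return processed

# Example: tokens = preprocess("training_files/")
```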

3.2 Selecting Strategy In the first experiment, three different strategies were compared to predict the author’s age and gender by analyzing the author’s writing behavior using two learning models, i.e., the Query Likelihood Model and Vector Space Model. The strategies are as follows: • Considering Stylistic Features: In the first strategy, the results are generated without removing stylistic features, i.e., emojis, digits, punctuations, special characters, and abbreviations from the author’s text messages, as shown in Fig. 2.


Fig. 2 Strategy 1—considering stylistic expressions

Fig. 3 Strategy 2—adding extra information, i.e., whitespaces between words having attached stylistic expressions

• Considering Stylistic Features While Making Spaces Between Tokens: In the second strategy, extra information, i.e., white spaces, is generated between the word tokens and the stylistic expressions in the author's messages. For example, messages between friends sometimes include emojis or other stylistic expressions attached to tokens, e.g., Hi!:) (token1). This was considered a single token in strategy 1, as shown in Fig. 2, and acts as a single search parameter when finding similar tokens in text messages. In this strategy, such writing expressions are separated from the word tokens by inserting white spaces between them, as shown in Fig. 3. Making spaces increases the number of search parameters as compared to strategy 1. • Removing Stylistic Features: In the third strategy, results are generated by removing all the stylistic expressions, i.e., emojis, digits, punctuation, special characters, and abbreviations, from the author's text messages. All text messages are filtered, and noise is completely removed from them. A sketch of strategies 2 and 3 is given below.
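Below is a minimal illustration of strategies 2 and 3 under stated assumptions: the paper does not define exactly which characters count as stylistic, so here anything that is not a letter or whitespace is treated as stylistic, and multi-character emoticons such as ":)" are consequently split character by character.

```python
def is_stylistic(ch):
    """Assumption: any non-letter, non-space character (digit, punctuation,
    special character, emoji) is treated as a stylistic expression."""
    return not ch.isalpha() and not ch.isspace()

def add_spaces(sentence):
    """Strategy 2: separate stylistic expressions from word tokens with spaces."""
    return "".join(f" {c} " if is_stylistic(c) else c for c in sentence).split()

def remove_stylistic(sentence):
    """Strategy 3: drop stylistic expressions entirely."""
    return "".join(" " if is_stylistic(c) else c for c in sentence).split()

print(add_spaces("Hi!:) kya haal hai"))       # ['Hi', '!', ':', ')', 'kya', 'haal', 'hai']
print(remove_stylistic("Hi!:) kya haal hai"))  # ['Hi', 'kya', 'haal', 'hai']
```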

3.3 Applying Algorithms In this part, the two algorithms used for the comparative analysis based on the strategies, i.e., the Vector Space Model and the Query Likelihood Model, are applied (see Fig. 1). The algorithmic steps of both algorithms are as follows: Algorithm 1: Vector Space Model This model is used for finding the similarity angle between the testing files and the training files of the authors' text messages. For each


document, a vector is derived. The set of documents in a collection is then viewed as a set of vectors in a vector space, with each term having its own axis. The formula is shown in Eq. 1.

$$\cos(q, d) = \frac{q \cdot d}{|q|\,|d|} = \frac{\sum_{i=1}^{|V|} q_i d_i}{\sqrt{\sum_{i=1}^{|V|} q_i^2}\;\sqrt{\sum_{i=1}^{|V|} d_i^2}} \qquad (1)$$

In Eq. 1, $q_i$ is the TF-IDF weightage of term i in the test files, and $d_i$ is the TF-IDF weightage of term i in the training files. The steps to compute the formula of VSM are as follows: Term Frequency (T.F.): TF measures the number of times a term (word) occurs in a document. Normalization of Document: Documents are of different sizes; in a large document, term frequencies will be much higher than in smaller ones. Hence, we normalize the document based on its size by dividing the term frequency by the total number of terms. Inverse Document Frequency (IDF): The main purpose of searching is to find relevant documents matching the query. IDF weighs down the effect of terms that occur too frequently and weighs up the effect of less frequently occurring terms, since terms that occur in fewer documents can be more relevant. The formula is shown in Eq. 2.

$$\mathrm{IDF}(x) = \log \frac{N}{df_x} \qquad (2)$$

where, in Eq. 2, N is the total number of training files and df_x is the number of training documents containing term x of the testing files. Algorithm 2: Query Likelihood Model: In the Query Likelihood Model, the documents are ranked by P(d | q), i.e., the probability of a document is interpreted as the likelihood that a training document (d) is relevant to the test file (q). The formula, with Jelinek–Mercer smoothing, is shown in Eq. 3.

$$f_{JM}(q, d) = \sum_{w \in q \cap d} c(w, q)\, \log\!\left(1 + \frac{(1 - \lambda)\, c(w, d)}{\lambda\, |d|\, p(w \mid C)}\right) \qquad (3)$$

Term Frequency in testing files (q) (c (w, q)): In this step, find the frequency of words in test files individually and how many times the particular word occurs in the test file individually.


Term Frequency in training files (d) (c(w, d)): Find the frequency of each test-file word in the training files individually, i.e., how many times the particular test-file word occurs in each training file. Length of Training files (|d|): Find the length of each training file. Frequency of Term in the full collection: Find the frequency of the test-file (q) words in the full collection, i.e., how many times the particular test-file word occurs across all training files. Ranking score of test files: Rank the training documents from higher to lower scores generated against each test file by both algorithms, as shown in Fig. 1. Voting Against the Ranked Test Files: Select the top 5 ranked files and predict the class for the testing file; the class with the higher number of votes among the ranked files is assigned to the test file. Output from System: The output from the system is the predicted class of gender and age for the test files. A small end-to-end sketch of this ranking and voting step is given below.
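The following sketch illustrates the ranking-and-voting step using TF-IDF cosine similarity as the scoring function (the Query Likelihood variant differs only in how the score is computed). The function names, the precomputed `idf` dictionary, and the label dictionary are illustrative assumptions rather than the paper's actual implementation.

```python
from collections import Counter
import math

def tfidf_vector(tokens, idf):
    """Length-normalized term frequency weighted by a precomputed IDF dictionary."""
    tf = Counter(tokens)
    n = len(tokens) or 1
    return {w: (c / n) * idf.get(w, 0.0) for w, c in tf.items()}

def cosine(v1, v2):
    """Cosine similarity between two sparse vectors (Eq. 1)."""
    num = sum(v1[w] * v2.get(w, 0.0) for w in v1)
    den = math.sqrt(sum(x * x for x in v1.values())) * math.sqrt(sum(x * x for x in v2.values()))
    return num / den if den else 0.0

def predict(test_tokens, train_docs, train_labels, idf, top_k=5):
    """Rank the training files against one test file and vote among the top 5."""
    q = tfidf_vector(test_tokens, idf)
    scores = {name: cosine(q, tfidf_vector(toks, idf)) for name, toks in train_docs.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)[:top_k]
    votes = Counter(train_labels[name] for name in ranked)   # e.g. "male" / "female"
    return votes.most_common(1)[0][0]
```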

3.4 Dataset The corpus for this experiment was used with a student's permission at COMSATS University Lahore and contains 350 files in the Roman Urdu language. The names and gender identifications of the authors of all 350 files were examined for this study. With the passage of time, people have started writing in informal ways that ignore the structure and rules of a specific language; they use Roman Urdu to comment and express their opinions, conveying their messages using shorthand, emojis, and so on. We therefore focused especially on Roman Urdu, observing the structures and rules that make it easier for machines to learn, and tried to test, train, and evaluate on Roman Urdu models that already work on other languages such as English. Hence, developing state-of-the-art techniques for learning Roman Urdu is now a highly relevant topic. We selected these models to compare which gives better results and to show a new direction for improvement. 300 of the 350 files are used as the training dataset, which includes the text messages of the students; the remaining 50 files are considered the testing dataset. Scores will be generated against each test file individually. The experiment was conducted to detect an author's gender (male and female) and age (15–19, 20–24, 25–xx) by analyzing their writing behavior. Due to the importance of Roman Urdu, in this work we have only worked on a single language. In Experimental Setup 1, two learning models were compared, i.e., the Vector Space Model and the Query Likelihood Model. The vector space model shows the angle


between the test and the training document, and query likelihood gives the likelihood score between the test and training document. In Experimental Setup 2, three different techniques for predicting the author’s age and gender were compared using two learning models, i.e., the Vector space model and Query likelihood.

4 Results and Discussions In the first part, three different strategies are analyzed using the Vector Space and Query Likelihood models. In the second part, the performance of the algorithms is analyzed based on the strategies. Table 1 shows the average accuracy of the first strategy, which is 69%. It has been discovered that some writing styles (emojis, digits, punctuation, special characters, and abbreviations) provide unique information that identifies the writing behavior of males and females, as shown in Table 2, which lists the top-ranked training message documents against testing file no. 1; the ranked files carry a greater number of male labels, so this testing file is classified as male, indicating that it shares more of this unique information. In the second strategy, the extra information, i.e., white space, is added between the tokens and stylistic expressions, as shown in Fig. 3. The addition of white space as extra information gives an average accuracy of 57%, as shown in Table 1, a decrease of 12% compared to strategy 1. This reveals that adding extra information to the author's text messages can disturb the accuracy and the semantic structure of the text messages by changing the writing behavior of the authors (Table 2). In the third strategy, the stylistic features of the author's writing behavior are removed from the text messages without adding extra information. The average accuracy of this strategy is 58%, as shown in Table 1. The result reveals that not adding extra information may keep the accuracy stable, but removing stylistic expressions may disturb that stability: as shown in Table 1, removing stylistic expressions decreases the accuracy by 11% compared to strategy 1.

Table 1 Average accuracy of gender identification of the author based on various strategies

Strategies | Query likelihood model (%) | Vector space model (%) | Average accuracy (%)
Considering stylistic features | 72 | 66 | 69
Considering stylistic features making spaces between tokens | 70 | 44 | 57
Removing stylistic features | 70 | 45 | 58

Table 2 Results of strategy 1

Testing file | Ranked files | Score | Author's gender
Test file 1 | File no 62 | 222.25193705047192 | Male
Test file 1 | File no 8 | 219.17079365448933 | Male
Test file 1 | File no 121 | 212.6895614378466 | Female
Test file 1 | File no 220 | 211.38446480688157 | Male
Test file 1 | File no 160 | 210.76168901099146 | Male

Table 3 Result of strategy 2

Testing file | Ranked files | Score | Author's gender
Test file 1 | File no 8 | 271.22134336941154 | Male
Test file 1 | File no 121 | 267.43985328126604 | Female
Test file 1 | File no 214 | 265.3586276760159 | Male
Test file 1 | File no 267 | 262.44556127925523 | Male
Test file 1 | File no 160 | 262.42980757929917 | Male

This means that by removing the stylistic expressions, one may lose an important piece of distinguishing information from the author's text messages (Table 3).

4.1 Age Analysis Three different age groups are analyzed in the experiment to identify the author's age group from the writing behavior in the text messages. The age groups are as follows: Group 1: 15–19, Group 2: 20–24, Group 3: 25 onwards. In strategy 1, the age group of the authors is analyzed by considering stylistic features, i.e., emojis, digits, punctuation, special characters, and abbreviations, in the author's text messages. First, the system is trained on emojis by using a list of predefined emojis. Analyzing strategy 1, as shown in Table 4, gives an average accuracy of 64%, which reveals that some common stylistic expressions may be used by people of the same age group; this can disturb the accuracy of predicting the correct age group for the author from the writing behavior in the author's text messages. In the second strategy, the extra information, i.e., white space, is added between the tokens and the stylistic expressions.


Table 4 Result of strategy 2

Testing file | Ranked files | Score | Author's gender
Test file 1 | File no 8 | 212.59566984485463 | Male
Test file 1 | File no 62 | 212.32997557500443 | Male
Test file 1 | File no 121 | 207.80014623237886 | Female
Test file 1 | File no 160 | 204.5628158047853 | Male
Test file 1 | File no 214 | 202.06059226005428 | Male

Table 5 Average accuracy of age identification based on different strategies

Features | Query likelihood model (%) | Vector space model (%) | KNN (%) | Average accuracy (%)
Considering stylistic features | 62 | 66 | 22 | 50
With stylistic features, spaces between tokens | 64 | 56 | 26 | 48
Without stylistic features | 66 | 64 | 24 | 51.33

The addition of white space as extra information gives an average accuracy of 60%, a decrease of 4% compared to strategy 1. As shown in Table 5, this reveals that adding extra information to the author's text messages can disturb the accuracy and the semantic structure of the text messages by changing the writing behavior of authors of the same age group. In the third strategy, the stylistic features of the author's writing behavior are removed from the text messages without adding extra information. The average accuracy of this strategy is 65%, as shown in Table 5; removing the stylistic features increases the accuracy by 1%. The result reveals that not adding extra information may keep the accuracy stable, while removing stylistic expressions may even improve its stability, as shown in Table 5. This indicates that some common stylistic expressions were used across different age groups, which means stylistic expressions may not carry important information in the case of the author's age (Tables 6, 7 and 8).

4.2 Combined Age and Gender Analysis The average accuracy for gender and age for the algorithms, i.e., the Vector Space Model and the Query Likelihood Model, together with KNN, is discussed based on all three considered strategies as listed in Table 9.


Table 6 Result of age strategy 1

Testing file | Ranked files | Score | Author's age group
Test file 1 | File no 62 | 219.17079365448933 | 20–24
Test file 1 | File no 8 | 212.6895614378466 | 20–24
Test file 1 | File no 121 | 211.38446480688157 | 20–24
Test file 1 | File no 220 | 211.38446480688157 | 20–24
Test file 1 | File no 160 | 210.76168901099146 | 25–xx

Table 7 Result of age strategy 2

Testing file | Ranked files | Score | Author's age group
Test file 1 | File no 8 | 271.22134336941154 | 20–24
Test file 1 | File no 121 | 267.43985328126604 | 20–24
Test file 1 | File no 214 | 265.3586276760159 | 25–xx
Test file 1 | File no 267 | 262.44556127925523 | 15–19
Test file 1 | File no 160 | 262.42980757929917 | 25–xx

Table 8 Result of age strategy 3

Testing file | Ranked files | Score | Author's age group
Test file 1 | File no 8 | 271.22134336941154 | 20–24
Test file 1 | File no 121 | 267.43985328126604 | 20–24
Test file 1 | File no 214 | 265.3586276760159 | 20–24
Test file 1 | File no 267 | 262.44556127925523 | 25–xx
Test file 1 | File no 160 | 262.42980757929917 | 25–xx

The average results of the listed strategies in Table 9 show that the Query Likelihood Model is better, i.e., 67.33%, as compared to the Vector Space Model, i.e., 57%, and KNN, i.e., 29%. The accuracy of the Query Likelihood Model is 10.33% better than that of the Vector Space Model. The reason behind the limitation of the Vector Space Model is the zero-multiplication problem: when the term frequency (T.F.) score of a search token is zero because no matching word is found in the author's text messages, that token contributes nothing to the similarity score. In the Query Likelihood Model, the smoothing technique overcomes this limitation. Due to the smoothing technique, the accuracy of the Query Likelihood Model is considerably better than that of the Vector Space Model in all the strategies listed in Table 9.

5 Conclusion The Query Likelihood Model outperformed the other approaches, with an average accuracy of 67.33%, as compared to VSM and KNN. VSM performed poorly because of the zero-multiplication limitation of the Vector Space Model discussed in Sect. 3, where the term frequency (T.F.) score of a search token is zero whenever a matching word is not found in the author's text messages.


Table 9 Average accuracy for gender and age identification using the query likelihood model and vector space model

Features | Query likelihood model (%) | Vector space model (%) | KNN (%)
Considering stylistic features | 67 | 66 | 21
With stylistic features, spaces between tokens | 67 | 50 | 31
Without stylistic features | 68 | 55 | 36
Average accuracy | 67.33 | 57 | 29

The KNN algorithm, in turn, is a lazy learner: it does not learn anything from the training data and simply uses the training data itself for classification. To predict the label of a new instance, the KNN algorithm finds the K closest neighbors to the new instance in the training data, and the predicted class label is set to the most common label among those K closest neighboring points. Further, changing K can change the resulting predicted class label, as illustrated in the sketch below.
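A minimal sketch of the lazy-learner prediction step described above; the Euclidean distance metric, the toy data, and the function name are illustrative assumptions.

```python
from collections import Counter
import math

def knn_predict(x, train_X, train_y, k=3):
    """Predict the label of x by majority vote among its k nearest training points."""
    order = sorted(range(len(train_X)), key=lambda i: math.dist(x, train_X[i]))
    top_labels = [train_y[i] for i in order[:k]]
    return Counter(top_labels).most_common(1)[0][0]

# Changing k can change the prediction:
X = [[1, 0], [0, 0], [0, 1], [0, 2], [5, 5]]
y = ["20-24", "15-19", "15-19", "15-19", "25-xx"]
print(knn_predict([1.2, 0], X, y, k=1))  # "20-24" (single nearest neighbor)
print(knn_predict([1.2, 0], X, y, k=3))  # "15-19" (majority of three neighbors)
```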

References 1. Akram Chughtai R (2021) Author region identification for the Urdu language (Doc. dissertation, Dep. of Computer science, COMSATS University Lahore) 2. Alam M, Hussain SU (2022) Roman-Urdu-Parl: Roman-Urdu and Urdu parallel corpus for Urdu language understanding. Trans Asian Low-Resour Lang Inf Process 21(1):1–20 3. Alowibdi JS, Buy UA, Yu P (2013) Language independent gender classification on Twitter. In: Proceedings of the 2013 IEEE/ACM international conference on advances in social networks analysis and mining, pp 739–743 4. Ameer I, Sidorov G, Nawab RMA (2019) Author profiling for age and gender using combinations of features of various types. J Intell Fuzzy Syst 36:4833–4843 5. Bilal M, Israr H, Shahid M, Khan A (2016) Sentiment classification of roman-Urdu opinions using näıve Bayesian, decision tree, and KNN classification techniques. J King Saud UnivComput Inf Sci 28:330–344 6. Bilal A, Rextin A, Kakakhel A, Nasim M (2017) Roman-txt: forms and functions of roman Urdu texting. In: Proceedings of the 19th international conference on HCI with mobile devices and services, pp 1–9 7. Biswas B, Bhadra S, Sanyal MK, Das S (2018) Cloud adoption: a future road map for Indian SMEs. In: Intelligent engineering informatics. Springer, pp 513–521 8. Ciot M, Sonderegger M, Ruths D (2013) Gender inference of Twitter users in non-English contexts. In: Proceedings of the 2013 conference on empirical methods in natural language processing, pp 1136–1145 9. Daud M, Khan R, Daud A et al (2015) Roman Urdu opinion mining system (rooms). arXiv preprint arXiv:1501.01386 10. Estival D, Gaustad T, Pham SB, Radford W, Hutchinson B (2007) Author profiling for English emails. In: Proceedings of the 10th conference of the Pacific Association for computational linguistics, pp 263–272


11. Fatima M, Anwar S, Naveed A, Arshad W, Nawab RMA, Iqbal M, Masood A (2018) Multilingual SMS-based author profiling: data and methods. Nat Lang Eng 24:695–724 12. Fatima M, Hasan K, Anwar S, Nawab RMA (2017) Multilingual author profiling on Facebook. Inform Process Manag 53:886–904 13. Guglielmi G, De Terlizzi F, Torrente I, Mingarelli R, Dallapiccola B (2005) Quantitative ultrasound of the hand phalanges in a cohort of monozygotic twins: influence of genetic and environmental factors. Skele-Tal Radiol 34:727–735 14. Khan S, Ullah R, Khan A, Wahab N, Bilal M, Ahmed M (2016) Analysis of dengue infection based on Raman spectroscopy and support vector machine (SVM). Biomed Opt Express 7:2249–2256 15. Koppel M, Argamon S, Shimoni AR (2002) Automatically categorizing written texts by author gender. Lit Linguist Comput 17:401–412 16. Krenek J, Kuca K, Blazek P, Krejcar O, Jun D (2016) Application of artificial neural networks in condition-based predictive maintenance. Recent developments in intelligent information and database systems, pp 75–86 17. Kurochkin I, Saevskiy A (2016) Boinc forks, issues, and directions of de-development. Procedia Comput Sci 101:369–378 18. Mechti S, Jaoua M, Faiz R, Bouhamed H, Belguith LH (2016) Author profiling: age prediction based on advanced Bayesian networks. Res Comput Sci 110:129–137 19. Mehmood K, Afzal H, Majeed A, Latif H (2015) Contributions to the study of bi-lingual roman Urdu SMS spam filtering. In: 2015 National software engineering conference (NSEC). IEEE, pp 42–47 20. Mikros GK (2012) Authorship attribution and gender identification in Greek blogs. Methods Appl Quant Linguist 21:21–32 21. Mukund S, Srihari RK (2012) Analyzing urdu social media for sentiments using transfer learning with controlled translations. In: Proceedings of the second workshop on language in social media, pp 1–8 22. Nemati A (2018) Gender and age prediction multilingual author profiles based on comments. In: FIRE (Working Notes), pp 232–239 23. Ogaltsov A, Romanov A (2017) Language variety and gender classification for author profiling in pan 2017. In: CLEF (Working notes) 24. Peersman C, Daelemans W, Van Vaerenbergh L (2011) Predicting age and gender in online social networks. In: Proceedings of the 3rd international workshop on search and mining user-generated contents, pp 37–44 25. Plank B, Hovy D (2015) Personality traits on Twitter—or—how to get 1,500 personality tests in a week. In: Proceedings of the 6th workshop on computational approaches to subjectivity, sentiment, and social media analysis, pp 92–98 26. Quirk GJ, Mueller D (2008) Neural mechanisms of extinction learning and retrieval. Neuropsychopharmacology 33:56–72 27. Rangel F, Herna´ndez I, Rosso P, Reyes A (2014) Emotions and irony per gender in Facebook. In: Proceedings of workshop ES3LOD, LREC, pp 1–6 28. Rangel F, Rosso P, Koppel M, Stamatatos E, Inches G (2013) Overview of the author profiling task at pan 2013. In: CLEF conference on multilingual and multimodal information access evaluation, CELCT, pp 352–365 29. Rangel F, Rosso P, Potthast M, Stein B, Daelemans W (2015) Overview of the 3rd author profiling task at pan. In: Poceedings of CLEF, sn. p. 30. Rao D, Yarowsky D, Shreevats A, Gupta M (2010) Classifying latent user attributes in Twitter. In: Proceedings of the 2nd international workshop on search and mining user-generated contents, pp 37–44 31. Rosenthal S, McKeown K (2011) Age prediction in blogs: a study of style, content, and online behavior in pre-and post-social media generations. 
In: Proceedings of the 49th annual meeting of the association for computational linguistics: human language technologies, pp 763–772 32. Safdar Z, Bajwa RS, Hussain S, Abdullah HB, Safdar K, Draz U (2020) The role of Roman Urdu in multilingual information retrieval: a regional study. J Acad Librariansh 46(6):102258


33. Sap M, Park G, Eichstaedt J, Kern M, Stillwell D, Kosinski M, Un- gar L, Schwartz HA (2014) Developing age and gender predictive lexica over social media. In: Proceedings of the 2014 conference on empirical methods in natural language processing, pp 1146–1151 34. Schler J, Koppel M, Argamon S, Pennebaker JW (2006) Effects of age and gender on blogging. In: AAAI spring symposium: computational approaches to analyzing weblogs, pp 199–205 35. Sittar A, Ameer I (2018) Multilingual author profiling using stylistic features. In: FIRE (Working Notes), pp 240–246 36. Tudisca S, Di Trapani AM, Sgroi F, Testa R (2013) Marketing strategies for Mediterranean wineries competitiveness in the case of Pantelleria. Calitatea 14:101 37. Verhoeven B, Plank B, Daelemans W (2016) Multilingual personality profiling on twitter. In: To be presented at DHBenelux 2016 38. Wanner L et al (2017) On the relevance of syntactic and discourse features for author profiling and identification. In: Proceedings of the 15th conference of the European chapter of the association for computational linguistics: volume 2, short papers, pp 681–687 39. Zhang W, Caines A, Alikaniotis D, Buttery P (2016) Predicting author age from Weibo microblog posts. In: Proceedings of the tenth international conference on language resources and evaluation, pp 2990–2997

Prioritizing Educational Website Resource Adaptations: Data Analysis Supported by the k-Means Algorithm Luciano Azevedo de Souza , Michelle Merlino Lins Campos Ramos , and Helder Gomes Costa

Abstract As part of the COVID-19 pandemic control measures, with the rapid shift from face-to-face classroom systems to remote models, virtual learning environments and academic administration websites have become crucial. The difficulty of changing them in a smart and nimble manner arises when assessing the major needs of the users. Our article therefore attempted to reveal these needs using a survey of 36 of the 80 students enrolled in a specific MBA course. The data was examined using clustering methods and statistical analysis. The primary findings were that iOS had worse performance than Android, and users who chose desktop computers reported greater usability than those who preferred mobile devices. The suggested prioritization of adaptation activities treats responsiveness on iOS as a priority, following the declared relevance order and the inverted usability order. Keywords k-means algorithm · Covid-19 · Education · Responsivity · Clustering

1 Introduction The COVID-19 pandemic changed higher education in a variety of ways, ranging from the learning tools and models that institutions are adopting to the needs and expectations of the current and future workforce. Therefore, higher education will most likely never be the same [13]. One of the changes that most affected the learning process was the wholesale adoption of distance learning in place of face-to-face learning. Distance learning is thus a unique solution to keep the education system from being paralyzed during critical times [1]. L. A. de Souza (B) · M. M. L. C. Ramos · H. G. Costa Universidade Federal Fluminense, Niterói, RJ 24210-240, Brazil e-mail: [email protected] M. M. L. C. Ramos e-mail: [email protected] H. G. Costa e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 S. Anwar et al. (eds.), Proceedings of International Conference on Information Technology and Applications, Lecture Notes in Networks and Systems 614, https://doi.org/10.1007/978-981-19-9331-2_12 Modern education prepares students for


effective activities by emphasizing knowledge and the ability to apply it [12]. Because students are no longer restricted to the traditional classroom, mobile technology in education has an impact on learning [3]. The use of mobile devices in education provides both opportunities and challenges [7]. The accessibility and opportunities provided by this technology demonstrate the benefits of m-learning, while also exposing its main issues. The main pedagogical challenge is determining what works better in the classroom, what should be learned outside of it, and how both can coexist [2, 4, 8, 11, 15]. Students' use of advanced mobile platforms with operating systems such as iOS or Android in the educational system has generated new challenges and increased the opportunities to exploit these devices in education, due to the characteristics and features that a mobile phone offers, providing a new experience to students in the classroom environment [14]. This study focused on mapping the features of the institution's website and on surveying the usability and relevance of such features, considering access both via notebook and via mobile devices, comprising both iOS and Android operating systems. K-means was used to support the identification of the critical features for improving the website's usability.

2 Proposed Methodology The methodological procedures used in this work are shown in Fig. 1. As an initial step, the existing site was consulted to map its active resources. The mapped functionalities are described in Table 1, with the respective acronyms adopted for their representation in this work. The data collection instrument was organized as a Google form. First, the respondent was asked about the general usability of the website and the usability of each feature, with the Likert scale options 1-terrible, 2-poor, 3-regular, 4-good, 5-excellent. Next, the respondent was asked about the relevance of each feature; the alternatives offered for this question were: 1-very low, 2-low, 3-average, 4-high, and 5-very high. Next, questions were asked about how often each resource was used from mobile devices such as tablets and smartphones.

Fig. 1 Methodological procedures (flowchart steps: mapping of existing features; construction of the data collection instrument; data collection; general data analysis; cluster analysis (k-means); clustered data analysis; final considerations)

Table 1 Mapped functionalities

Description | Acronym
Calendar of Classes | CALEND
Test scores in the subjects studied | TSCORES
Communication with course coordination | COORD
Communication with classmates | CLASSMATES
Communication with MBA professors | PROFESSORS
Delivering homework and evaluative activities | WORKDELIVERY
Participation in subject discussion forums | FORUMS
Information about other courses | OCOURSES
Read Texts | TEXTS
Get learning materials (Download) | LMATERIALS

The response options followed the scale 1-never, 2-rarely, 3-sometimes, 4-always, 5-often. The participant was also asked which operating system they use on their mobile device. The survey was conducted to capture the opinion of the 80 students of an active specialization course in a Brazilian educational organization. The Google questionnaire was made available in the WhatsApp group in which those students were participating, between May 1st and June 4th, 2020, a period when restrictive measures to control COVID-19 implied an abrupt migration to the remote learning system. We obtained 36 voluntary contributions. For data analysis, Minitab (version 17.1.0) and R (R-4.2.1) with RStudio (version 2022.02.3 Build 492) were used. In R, the packages "ggplot2", "likert", "cluster", and "factoextra" were used, and, for loading and saving data with MS Excel, the packages "openxlsx" and "writexl". The procedures were carried out on a PC with a 64-bit Windows 10 operating system, 8 GB RAM, a 2.80 GHz Intel(R) Core(TM) i5-8400 CPU, and the RStudio 2022.02.3 Build 492 (R 4.1.3) environment.

3 Experimental Results Figure 2 shows the respondents’ evaluations of the overall usability of the existing website. We observed that the general evaluation did not show any score “terrible” and had values concentrated between “regular” and “good”. As for the mobile operating system: Android prevails with 66.7% and only 22% prefer to access through mobile devices.


Fig. 2 Preferred device, and mobile Op. System

3.1 Likert Evaluation The features of the existing website were evaluated by the respondents, and the result of the survey is shown in Fig. 3. The analyzed variables are ordered by decreasing balance between the higher scale values (4, 5) and the lower values (1, 2), with the value 3 centered. The resources with the worst usability results among the consulted users were: "access to teaching materials", "homework delivery", and "contact with the teacher". Figure 4 shows the results for the relevance of each feature.

Fig. 3 Usability of features in existing website, obtained by running R package Likert

Fig. 4 Relevance of features in existing website, obtained by running R package Likert


The ranking of relevance shows more pronounced differences between features than the prior usability evaluation, which was balanced. "Access to learning materials" and "homework delivery" were two of the most important aspects that also had the lowest usability, followed by access to evaluation scores and reading texts. Table 2 was organized so that the features appear in the order of usability balance, then in the inverted order of usability (since we must prioritize developing the features with the worst perception by users), and then in the direct order of relevance. The product of the relevance order and the inverted usability order gives a prioritization weighting over both dimensions, and the results are ordered in the last column; a short computational sketch is given after Table 2. From this tabulation we obtain an order in which the functionalities should be adapted as a priority. However, it does not take into consideration other information provided by survey participants, such as the preferred device for access and the operating system of the mobile devices they use. The respondents also answered about the frequency of use of each feature, and a comparison of the frequencies declared on mobile devices and on desktops produced the variable "preference of device". The stratification of the overall evaluation by preferred device type is shown in Fig. 5. Table 2 Usability x relevance

Feature | Usability of existing website | Inverted usability | Relevance of features | Inverted usability x Relevance | Prioritizing order
Contact coordinator | 1 | 10 | 5 | 50 | 3
Contact classmates | 2 | 9 | 10 | 90 | 1
Text reading | 3 | 8 | 4 | 36 | 5
Participate in discussion forums | 4 | 7 | 9 | 63 | 2
Consult evaluation grades | 5 | 6 | 3 | 18 | 7
Information about other courses | 6 | 5 | 8 | 40 | 4
Consult events calendar | 7 | 4 | 7 | 28 | 6
Access learning materials | 8 | 3 | 1 | 3 | 10
Delivery homework | 9 | 2 | 2 | 4 | 9
Contact with professor | 10 | 1 | 6 | 6 | 8
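The sketch below reproduces the prioritization mechanics of Table 2 under stated assumptions: the rank values are copied from the table, the helper is illustrative rather than the authors' spreadsheet computation, and products recomputed from the rank columns may differ slightly from the printed values (e.g., for "Text reading") without changing the resulting order.

```python
# Usability rank and relevance rank as listed in Table 2 (values copied from the table).
features = {
    "Contact coordinator":              {"usability": 1,  "relevance": 5},
    "Contact classmates":               {"usability": 2,  "relevance": 10},
    "Text reading":                     {"usability": 3,  "relevance": 4},
    "Participate in discussion forums": {"usability": 4,  "relevance": 9},
    "Consult evaluation grades":        {"usability": 5,  "relevance": 3},
    "Information about other courses":  {"usability": 6,  "relevance": 8},
    "Consult events calendar":          {"usability": 7,  "relevance": 7},
    "Access learning materials":        {"usability": 8,  "relevance": 1},
    "Delivery homework":                {"usability": 9,  "relevance": 2},
    "Contact with professor":           {"usability": 10, "relevance": 6},
}

def prioritize(feats, n_features=10):
    """Weight each feature by inverted usability rank times relevance rank."""
    scored = {}
    for name, r in feats.items():
        inverted_usability = n_features + 1 - r["usability"]   # 11 - rank
        scored[name] = inverted_usability * r["relevance"]
    return sorted(scored, key=scored.get, reverse=True)        # higher product = higher priority

for rank, name in enumerate(prioritize(features), start=1):
    print(rank, name)
```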


Fig. 5 General evaluation of usability by preference of device

To go further in prioritizing features, we have broken down the evaluation of each feature by the device most often used for access. This result is represented in Fig. 6. Except for viewing the calendar of activities, participating in discussion forums, and contacting classmates and coordination, it is noticeable that the usability of the features is evaluated worse on mobile devices. To better understand the reasons, we analyzed the overall evaluation stratified by the operating system used for access. Figures 7 and 8 show the general evaluation of these aspects. The overall information is ambiguous: on the one hand there is a concentration of responses rating the usability of the existing website as 2 (poor) among iOS users, with a higher concentration of responses of 3 (regular) and 4 (good) among Android users, while on the other hand the rating 5 (excellent) accounts for 25% of the responses from iOS users. To refine the prioritization, we then applied clustering techniques for a more robust analysis of the data.

3.2 K-Means Clustering As shown in Figs. 3 and 4, the relevance ratings were more differentiated than the usability ratings, which were balanced across features. Thus, we took the set of answers that indicated relevance as the database for cluster identification. The purpose of the k-means method is to classify data by structuring the set into subsets whose features show intra-group similarities and inter-group differences [5, 9] (Wu et al., 2008; de Souza & Costa, 2022). We utilized three ways to determine the number of groups needed to segregate the data: Elbow (Fig. 9a), Gap Stat (Fig. 9b), and Silhouette (Fig. 9c).


Fig. 6 Individual feature usability evaluation by preference of device


Fig. 7 General evaluation of usability by Mobile Op. System

The Elbow method [5] indicated k = 3, the Gap Stat method [16] suggested k = 7, and the Silhouette method [10] indicated k = 10. To define the analysis, we did a visual comparison of the data with k ranging from 2 to 5, as shown in Fig. 10. We chose to classify the data into three clusters because of the proximity within groups and the separation between groups. Figure 11 depicts a visualization of the clustering of the observations. We separated clusters with the following numbers of observations: Cluster 1 with 10 elements, Cluster 2 with 16 elements, and Cluster 3 with 10 elements. Again, we plot the overall usability evaluation by cluster, which can be seen in Fig. 12. We verify that cluster 1 (with 10 respondents) assigns less relevance to all features. Cluster 2, with 16 members, considers the features "Visualization of the activity calendar", "Access to evaluation results", "Access to teaching materials", and "Homework delivery" as highly relevant. To investigate the possible relationship with the preferred device and operating system, we organized the data by cluster in Table 3. We expected greater differentiation in device and operating system preferences in cluster 2, particularly for the iOS system. However, among the 12 iOS users, only 1 in cluster 2 preferred a mobile device. Thus, our suggested priority list, Table 4, focuses on relevance vs. inverted usability. The focus should be on responsiveness, making it possible to use the mobile device with better adaptability (Fig. 13).
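A minimal Python sketch of the cluster-count selection and k-means step described above. The authors used the R packages "cluster" and "factoextra", so this scikit-learn version is an illustrative translation, and the relevance-response matrix is an assumed input (36 respondents × 10 features on a 1–5 scale); placeholder data is generated here only so the snippet runs.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

rng = np.random.default_rng(0)
relevance = rng.integers(1, 6, size=(36, 10)).astype(float)  # placeholder 1-5 answers

inertias, silhouettes = {}, {}
for k in range(2, 11):
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(relevance)
    inertias[k] = km.inertia_                       # Elbow: look for the bend
    silhouettes[k] = silhouette_score(relevance, km.labels_)

# Final model with k = 3, as chosen in the paper
km3 = KMeans(n_clusters=3, n_init=10, random_state=0).fit(relevance)
sizes = np.bincount(km3.labels_)                    # cluster sizes (10/16/10 in the paper)
print(sizes, max(silhouettes, key=silhouettes.get))
```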


Fig. 8 Individual feature usability evaluation by Mobile Op. System

Fig. 9 Optimal number of clusters

Fig. 10 Visual representations of clustering with k = 2 to k = 5


Fig. 11 Clustering data using k-means (k = 3)
Fig. 12 General evaluation of usability by cluster

Table 3 Preferred device and Op. System by cluster

Preferred device and Op. System | Cluster 1 | Cluster 2 | Cluster 3 | Total
DESKTOP | 4 | 9 | 7 | 20
– Android | 3 | 6 | 5 | 14
– iOS | 1 | 3 | 2 | 6
EQUAL | 5 | 2 | 1 | 8
– Android | 2 | 1 | – | 3
– iOS | 3 | 1 | 1 | 5
MOBILE | 1 | 5 | 2 | 8
– Android | 1 | 4 | 2 | 7
– iOS | – | 1 | – | 1

Table 4 Prioritizing order

Feature | Prioritizing order of development
Contact classmates | 1st
Participate in discussion forums | 2nd
Contact coordinator | 3rd
Information about other courses | 4th
Text reading | 5th
Consult events calendar | 6th
Consult evaluation grades | 7th
Contact with professor | 8th
Delivery homework | 9th
Access learning materials | 10th


Fig. 13 Individual usability evaluation of feature by cluster

4 Final considerations The goal of this work was to prioritize the features of an existing educational website in view of the rapidly changing access profile during the COVID-19 pandemic, when users were forced to shift from the traditional face-to-face environment to remote work and study. Statistical analysis approaches such as Likert analysis with the "likert" R package and clustering with the k-means algorithm were used. Usability under iOS was the worst for all features, the cluster analysis showed that desktop users were the most satisfied with the features, and the development priority list was built considering relevance and inverted usability. We suggest extending this research to mass education sites. Acknowledgements This research was partially supported by: Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES, Brazil) Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq, Brazil)


Fundação Carlos Chagas Filho de Amparo à Pesquisa do Estado do Rio de Janeiro (FAPERJ, Brazil)

References 1. Azhari B, Fajri I (2021) Distance learning during the COVID-19 pandemic: School closure in Indonesia. Int J Math Educ Sci Technol. https://doi.org/10.1080/0020739X.2021.1875072 2. Belle LJ (2019) An evaluation of a key innovation: mobile learning. Acad J Interdiscip Stud 8(2):39–45. https://doi.org/10.2478/ajis-2019-0014 3. Bleustein-Blanchet M (2016) Lead the change. Train. Ind. Mag., 16–41 4. Criollo-C S, Guerrero-Arias A, Jaramillo-Alcázar Á, Luján-Mora S (2021) Mobile learning technologies for education: Benefits and pending issues. Appl Sci (Switz), 11(9). https://doi. org/10.3390/app11094111 5. Cuevas A, Febrero M, Fraiman R (2000) Estimating the number of clusters. In Can J Stat 28:2 6. de Souza LA, Costa HG (2022) Managing the conditions for project success: an approach using k-means clustering. In Lect Notes Netw Syst: 420 LNNS. https://doi.org/10.1007/978-3-03096305-7_37 7. de Oliveira CF, Sobral SR, Ferreira MJ, Moreira F (2021) How does learning analytics contribute to prevent students’ dropout in higher education: A systematic literature review. Big Data Cogn Comput 5(4):64. https://doi.org/10.3390/bdcc5040064 8. Ramos MMLC, Costa HG, Azevedo G, da C (2021). information and communication technologies in the educational process. In https://services.igiglobal.com/resolvedoi/resolve.aspx?doi=https://doi.org/10.4018/978-1-7998-8816-1.ch016 pp 329–363. IGI Global. https://doi.org/10.4018/978-1-7998-8816-1.ch016 9. Jain AK (2009). Data clustering: 50 years beyond K-means q. https://doi.org/10.1016/j.patrec. 2009.09.011 10. Kaufman Leonard, Rousseeuw PJ (2005) Finding groups in data : an introduction to cluster analysis. 342 11. Mierlus-Mazilu I (2010). M-learning objects. In: ICEIE 2010 – 2010 International Conference on Electronics and Information Engineering, Proceedings, 1. https://doi.org/10.1109/ICEIE. 2010.5559908 12. Noskova T, Pavlova T, Yakovleva O (2021) A study of students’ preferences in the information resources of the digital learning environment. J Effic Responsib Educ Sci 14(1):53–65. https:// doi.org/10.7160/eriesj.2021.140105 13. Pelletier K, McCormack M, Reeves J, Robert J, Arbino N, Maha Al-Freih, Dickson-Deane C, Guevara C, Koster L, Sánchez-Mendiola M, Skallerup Bessette L, Stine J (2022). 2022 EDUCAUSE Horizon Report® Teaching and Learning Edition. https://www.educause.edu/hor izon-report-teaching-and-learning-2022 14. Salinas-Sagbay P, Sarango-Lapo CP, Barba, R. (2020) Design of a mobile application for access to the remote laboratory. Commun Comput Inf Sci, 1195 CCIS, 391–402. https://doi.org/10. 1007/978-3-030-42531-9_31/COVER/ 15. Shuja A, Qureshi IA, Schaeffer DM, Zareen M, (2019) Effect of m-learning on students’ academic performance mediated by facilitation discourse and flexibility. Knowl Manag ELearn, 11(2), 158–200. https://doi.org/10.34105/J.KMEL.2019.11.009 16. Tibshirani R, Walther G, Hastie T (2001) Estimating the number of clusters in a data set via the gap statistic. J R Stat Society Ser B: Stat Methodol, 63(2), 411–423. https://doi.org/10.1111/ 1467-9868.00293

Voice Operated Fall Detection System Through Novel Acoustic Std-LTP Features and Support Vector Machine Usama Zafar, Farman Hassan, Muhammad Hamza Mehmood, Abdul Wahab, and Ali Javed

Abstract The ever-growing old age population in the last two decades has introduced new challenges for elderly people such as accidental falls. An accidental fall in elderly persons results in lifelong injury, which has extremely severe consequences for the remaining life. Furthermore, continued delay in the treatment of elderly persons after accidental fall increases the chances of death. Therefore, early detection of fall incidents is crucial to provide first aid and avoid the expenses of hospitalization. The major aim of this research work is to provide a better solution for the detection of accidental fall incidents. Most automatic fall event detection systems are designed for specific devices that decrease the flexibility of the systems. In this paper, we propose an automated framework that detects in-door fall events of elderly people in the real-time environment using a novel standard deviation local ternary pattern (Std-LTP). The proposed Std-LTP features are able to capture the most discriminatory characteristics from the sounds of fall events. For classification purposes, we employed the support vector machine (SVM) to distinguish the indoor fall occurrences from the non-fall occurrences. Moreover, we have developed our fall detection dataset that is diverse in terms of speakers, gender, environments, sample length, etc. Our method achieved an accuracy of 93%, precision of 95.74%, recall of 90%, and F1-score of 91.78%. The experimental results demonstrate that the proposed system successfully identified both the fall and non-fall events in various indoor environments. Subsequently, the proposed system can be implemented on various devices that can efficiently be used to monitor a large group of people. Moreover, the proposed system can be deployed in daycare centers, old homes, and for patients in hospitals to get immediate assistance after a fall incident occurs. Keywords Fall event · Machine learning · Non-fall event · Std-LTP · SVM

U. Zafar · F. Hassan · M. H. Mehmood · A. Wahab · A. Javed (B) University of Engineering and Technology, Taxila, Pakistan e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 S. Anwar et al. (eds.), Proceedings of International Conference on Information Technology and Applications, Lecture Notes in Networks and Systems 614, https://doi.org/10.1007/978-981-19-9331-2_13


1 Introduction The population of aged people around the world is increasing at a rapid pace because of the advancement that has been made in the medical field. As reported by United Nations World Aging Population Survey (World Population Ageing, United Nations, 2020), there were around 727 million people aged 65 years or more in 2020. It is expected that this figure will be doubled by 2050. One of the most prevalent causes of injuries is an accidental fall. Old age people are mostly affected by these accidental falls that happen due to various reasons such as unstable furniture, slippery floors, poor lighting, obstacles, etc. It is very common in most countries that elderly persons live alone in their homes without the presence of kids and nurses. The most devastating effect of the fall incident on elderly people is that they may lay on the floor unattended for a long interval of time. As a result, they develop long-lasting injuries that in some cases can even lead to death. Research study finds that these fall incidents of old people cost millions of Euros to the UK government [1]. Risk factors of accidental falling increase with old aged people and also cost more for their treatment and care [2]. According to one report [3], people aged 65 years or older are more likely to fall once a year and some of them may fall more than once as well. These statistics demand to develop reliable automated fall detection systems using modern-day technology to help reduce the after-effects of fall incidents and to provide immediate first-aid support to the concerned person. The research community also explored motion sensors and acoustic sensors-based fall detection systems for elderly people using several techniques implemented in wearable devices i.e., smart watches, smart shoes, smart built, smart bracelets, and smart rings. In Yacchirema et al. [4], LowPAN, sensor networks, and cloud computing were used for the detection of fall events. Four machine learning classifiers i.e., logistic regression, ensemble, deepnets, and decision trees were employed for classification purposes, but the ensemble performed well using SisFall, sliding windows along with Signal magnitude area (SMA), and Motion-DT features. The healthcare professional receives the notification through a secure and lightweight protocol. In Giansanti et al. [5], a mono-axial accelerometer was utilized to calculate the acceleration of different parts of the body intending to detect any mishap. Acceleration is one of the important parameters that can be used to observe the motion of the body. A mono-axial accelerometer measures the vertical acceleration of a person’s body to detect the fall event. However, elderly people need to wear the accelerometer every time which greatly affects their daily activities and routine lives. In a study by Muheidat [6], sensor pads were used and placed under the carpet, which monitors the aged persons. Computational intelligence techniques i.e., convex hull and heuristic were used to detect fall events. In a study by [7], instead of wearable devices, fall was detected using smart textiles with the help of a non-linear support vector machine. To analyze an audio signal, the Gabor transform was applied in the time and frequency domain to derive new features known as Wavelet energy. In the study by [8], a fall detection system was developed using an acoustic sensor that was placed on the z-axis to detect the pitch of the audio. However, this method has a


limitation as only a single person is allowed in the locality. Moreover, elderly people are unable to carry the sensors all the time. This concern was addressed in [9] and two types of sensors were used i.e., body sensor and fixed sensor at home. Both the body sensor and fixed sensor were used at the same time. At home, fixed sensors can also work independently if a person is unable to carry body sensors. A mixed positioning algorithm was used to determine the position of the person that is used to decide the fall event. The research community has also explored vision-based techniques for fall detection. In visual surveillance applying background, subtraction is a quite common approach to discriminate moving objects. In the study by Yu et al. [10], visionbased fall detection was proposed in which background subtraction was applied for extraction of the human body silhouette. The extracted silhouettes were fed into CNN for detecting both fall occurrences and non-fall occurrences. In the study by [11], the Gaussian mixture model was utilized for observing the appearance of a person during the video sequence. This approach detects the fall event in case of any deformation found in the shape of the concerned person. In the study by Cai et al. [12], a vision-based fall detection system was proposed in which hourglass residual units were introduced to extract multiscale features. SoftMax classifier was used for the categorization of both fall and non-fall events. In the study by Zhang et al. [13], the YOLACT network was applied to the video stream to distinguish different human activities and postures. A convolutional neural network (CNN) was designed for the classification of fall vs non-fall events. Vision-based fall detection systems are widely used; however, these fall detection systems have certain limitations i.e., privacy issues, real-life falls are difficult to detect because of the data set, high-cost because high-resolution cameras are required to cover the entire room, computational complexity due to the processing of millions of video frames. The research community has also explored various machine learning and spectral features-based fall detection systems for the detection of fall occurrences and nonfall occurrences [14–17]. In the study by [14], the Hidden Markov model was used to determine the fall event. The Mel frequency cepstral coefficients (MFCC) features are capable to extract prominent information from the audio signals and are used for different research works [15, 18–22], respectively. In the study by [22], MFCC features were used to train Nearest Neighbor (NN) for the categorization of fall occurrences and non-fall occurrences. In the study by [23], MFCC, Gammatone cepstral coefficients (GTCC), and skew-spectral features were used for extracting features, and a decision tree was used for the classification of fall and non-fall events. In the study by Shaukat et al. [21], MFCC and Linear Predictive coding (LPC) were utilized for the voice recognition of elderly persons. An ensemble classifier was employed for classification purposes on daily sound recognition (AudioSet 2021) and RWCP (Open SLR 2021) datasets. In the study by [15], MFCC features were used with one class support vector machine method (OCSVM) for the classification of the fall and non-fall sounds. In our prior work [15], we proposed an acousticLTP features-based approach with the SVM classifier for fall event detection. 
This method was more effective than MFCC in terms of computational cost and also rotationally invariant. Although the above-mentioned sensors-based, acoustic-based,


and computer vision-based fall detection systems achieve good detection of fall events, different restrictions are still present in modern methods: some fall detection systems can be implemented merely in wearable devices, some frameworks are only sensor-based, which makes it difficult for elderly people to carry the body sensors all the time, and computer vision-based fall detection systems have privacy concerns, high computational costs, failure to detect falls when the server fails in a client–server architecture, etc. So, there is a need to develop automated fall event detection systems that are robust to the above-mentioned limitations. The major contributions of our study are as under: • We present a novel audio feature, i.e., Std-LTP, that is capable of extracting the most discriminative characteristics from the input audio. • We present an effective voice-operated fall detection system that can reliably be utilized for determining fall occurrences. • We created our own in-house audio fall event dataset that is distinct in terms of speakers, speaker gender, environment, etc. The remaining paper is organized as follows. In Sect. 2, we discuss the proposed methodology. In Sect. 3, experimental results are discussed, whereas we conclude our work in Sect. 4.

2 Proposed Methodology The main goal of the designed system is to identify fall occurrences and non-fall occurrences from audio clips. Feature extraction and audio classification are the two steps involved in the proposed system. Initially, we extract the 20-dimensional Std-LTP features from the audio input and then use all 20 dimensions to classify the fall occurrences and non-fall occurrences. For classification purposes, we employed the SVM. The flow diagram of the designed system is given in Fig. 1.

2.1 Feature Extraction The extraction of features is critical for designing an efficient categorization system. The process of feature extraction of the proposed method is explained in the following section.


Fig. 1 Proposed system

2.2 Std-LTP Computation In the proposed work, we presented the Std-LTP feature descriptor to extract the characteristics of fall occurrences and non-fall occurrences from the audio. We obtained the 20-dimensional Std-LTP features from the audio signal y[n]. To extract features from the voice using Std-LTP, the audio signal is divided into multiple windows (Wc). We computed the Std-LTP by encoding each Wc of an audio signal y[n]. The total number of windows is obtained by dividing the number of samples by 9. Each Wc comprises nine samples, which are used to generate the ternary codes. Initially, we computed the threshold value of each window Wc. In the prior study [22], acoustic-LTP utilized a static threshold value for each Wc, which does not take into account the local statistics of the samples of each Wc. In this paper, we computed the value of the threshold using the local statistics of the samples around the central sample c in each Wc. We computed the threshold value by calculating the standard deviation of each Wc and multiplying it by a scaling factor α, so the threshold for each Wc varies. The standard deviation of each Wc is computed using the following equation:

$$\sigma = \sqrt{\frac{\sum_{i=0}^{8} (q_i - \mu)^2}{N}} \qquad (1)$$


where σ is the standard deviation, qi is the value of each sample in the Wc, μ is the mean of the nine values of that Wc, and N is the number of samples, which is nine. The threshold th is calculated as follows:

$$th = \sigma \times \alpha \qquad (2)$$

where α is the scaling factor and 0 < α ≤ 1. We used α = 0.5 in our work because we achieved the best results with this setting. We compared c with the corresponding neighboring values. To achieve this purpose, we quantified the magnitude difference between c and the neighboring samples. Values of samples greater than c + th are set to 1, those smaller than c − th are set to −1, and values between c − th and c + th are set to 0. Hence, we obtained the ternary codes as follows:

$$f(q_i, c, t) = \begin{cases} +1, & q_i \ge c + (\sigma \times \alpha) \\ 0, & c - (\sigma \times \alpha) < q_i < c + (\sigma \times \alpha) \\ -1, & q_i \le c - (\sigma \times \alpha) \end{cases} \qquad (3)$$

where f(q_i, c, t) is the function representing the ternary codes. For instance, consider a frame having 9 samples as shown in Fig. 2. We computed the standard deviation of the Wc, which is σ ≈ 6 in this case. Next, we multiply the standard deviation value by the scaling factor of 0.5 to get the threshold value, that is, σ × α = 6 × 0.5 = 3. So, values greater than 33 are set to +1, values less than 27 are set to −1, and values between 27 and 33 are set to 0. In this way, the ternary code of the vector having nine values is generated. Next, we compute the upper and lower binary codes. For the upper codes, we set the value to 1 where the ternary code is +1, and values of 0 and −1 are set to zero:

$$f_u(q_i, c, t) = \begin{cases} 1, & f(q_i, c, t) = +1 \\ 0, & \text{otherwise} \end{cases} \qquad (4)$$

Fig. 2 Feature extraction

For the lower codes, we set all values of −1 to 1, and values of 0 and +1 to 0:

$$f_l(q_i, c, t) = \begin{cases} 1, & f(q_i, c, t) = -1 \\ 0, & \text{otherwise} \end{cases} \qquad (5)$$

We transform these upper and lower codes into decimal values as:

T_{up} = \sum_{i=0}^{7} f_u(q_i, c, th) \cdot 2^i   (6)

T_{lp} = \sum_{i=0}^{7} f_l(q_i, c, th) \cdot 2^i   (7)

Histograms are then calculated over the upper and lower codes of all windows as:

h_u(k) = \sum_{w=1}^{W} \delta(T_{w(u)}, k)   (8)

h_l(k) = \sum_{w=1}^{W} \delta(T_{w(l)}, k)   (9)

where k denotes the histogram bins. We used ten patterns (bins) for both the upper and lower binary codes to capture the characteristics of the sounds involving fall and non-fall events, as our experiments provided the best results with ten patterns from both groups. We combine the ten upper and ten lower bins to form the 20-dimensional Std-LTP descriptor:

Std\text{-}LTP = [h_u \,\|\, h_l]   (10)
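To make the computation above concrete, the following Python sketch (not the authors' code) outlines one way to implement the Std-LTP descriptor with NumPy. The choice of the fifth sample as the window centre, the 0–255 code range, and the ten uniform histogram bins are assumptions, since the paper does not fix these details.

import numpy as np

def std_ltp(signal, alpha=0.5, n_bins=10):
    """Illustrative Std-LTP descriptor: ternary-code each 9-sample window with an
    adaptive threshold (Eqs. 1-3), binarize into upper/lower codes (Eqs. 4-7),
    and concatenate their histograms (Eqs. 8-10)."""
    upper, lower = [], []
    n_windows = len(signal) // 9
    for w in range(n_windows):
        win = np.asarray(signal[w * 9:(w + 1) * 9], dtype=float)
        c = win[4]                          # assumed central sample of the window
        th = alpha * win.std()              # threshold = alpha * std of the window
        ternary = np.where(win >= c + th, 1, np.where(win <= c - th, -1, 0))
        neigh = np.delete(ternary, 4)       # the eight neighbours of c
        weights = 2 ** np.arange(8)
        upper.append(int(((neigh == 1) * weights).sum()))   # decimal upper code
        lower.append(int(((neigh == -1) * weights).sum()))  # decimal lower code
    h_u, _ = np.histogram(upper, bins=n_bins, range=(0, 256))
    h_l, _ = np.histogram(lower, bins=n_bins, range=(0, 256))
    return np.concatenate([h_u, h_l])       # 20-dimensional feature vector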

2.3 Classification

Binary classification problems can be handled effectively by an SVM, so we utilized an SVM for classification in this work. The Std-LTP features are used to train the SVM to categorize fall occurrences and non-fall occurrences. We tuned the SVM parameters and set the following values: a box constraint of 100, a kernel scale of 1, a Gaussian kernel, and an outlier fraction of 0.05.
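The reported settings resemble MATLAB-style SVM options; the snippet below is a rough scikit-learn analogue for training such a classifier on the 20-dimensional Std-LTP vectors. Mapping the box constraint to C, the Gaussian kernel to RBF, and a kernel scale of 1 to gamma = 1 is our assumption, and the 0.05 outlier fraction has no direct counterpart in SVC, so it is omitted.

import numpy as np
from sklearn.svm import SVC

# Placeholder data standing in for the 20-dim Std-LTP features of 408 training
# and 100 test clips (labels: 1 = fall, 0 = non-fall).
rng = np.random.default_rng(0)
X_train, y_train = rng.random((408, 20)), rng.integers(0, 2, 408)
X_test = rng.random((100, 20))

# Box constraint -> C, Gaussian kernel -> RBF, kernel scale 1 -> gamma = 1
# (assumed mapping of the reported options).
svm = SVC(C=100, kernel="rbf", gamma=1.0)
svm.fit(X_train, y_train)
predictions = svm.predict(X_test)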


Table 1 Details of fall and non-fall dataset

No of samples   No of fall samples   No of non-fall samples   Training samples   Testing samples
508             234                  274                      408                100

3 Experimental Setup and Results Discussion

3.1 Dataset

We developed our own fall detection dataset comprising audio clips of fall occurrences and non-fall occurrences recorded with two devices, i.e., a Lenovo K6 Note and an Infinix Note 10 Pro. The dataset is specifically designed for detecting fall events. We recorded the voices of different speakers for fall and non-fall incidents in various environments and locations, i.e., home, guest room, washroom, etc. The duration of the sound clips varies from 3 to 7 s. Sound clips of fall occurrences contain intense, painful audio, while the clips of non-fall occurrences consist of inaudible audio, conversations, a TV playing in the background, etc. The dataset has 508 audio samples, comprising 234 samples of fall events and 274 samples of non-fall events. The composition of the dataset is reported in Table 1.

3.2 Performance Evaluation of the Proposed System

This experiment evaluates the efficacy of the developed system for detecting possible fall occurrences on our in-house fall detection dataset. For this experiment, we used 80% of the data (408 samples) to train the model and 20% of the data (100 samples) for testing. In total, the dataset contains 234 fall event audios and 274 non-fall audios. We obtained the 20-dimensional Std-LTP features of all the sound clips and trained the SVM on them to categorize fall occurrences and non-fall occurrences. We obtained an accuracy of 93%, precision of 95.74%, recall of 90%, and F1-score of 91.78%. These results demonstrate the effectiveness of the developed system for fall detection.
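As a hedged illustration of how such scores can be computed from the test-set predictions, the snippet below uses scikit-learn's metric functions; the label vectors are placeholders, not the study's data.

from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

y_true = [1, 1, 0, 0, 1, 0]   # placeholder ground-truth labels (1 = fall)
y_pred = [1, 0, 0, 0, 1, 0]   # placeholder SVM predictions

print("Accuracy :", accuracy_score(y_true, y_pred))
print("Precision:", precision_score(y_true, y_pred))
print("Recall   :", recall_score(y_true, y_pred))
print("F1-score :", f1_score(y_true, y_pred))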

3.3 Performance Comparison of Std-LTP Features on Multiple Classifiers

We conducted an experiment to measure the significance of the SVM with our Std-LTP features for fall detection. For this, we selected different machine learning classifiers, i.e., Logistic Regression (LR), Naïve Bayes (NB), K-nearest neighbour (KNN), an ensemble classifier, and a Decision Tree (DT), along with the SVM, trained them using the proposed features, and report the results in Table 2.


Table 2 Performance comparison of multiple classifiers

Method               Kernel          Accuracy%   Precision%   Recall%   F1-score%
Std-LTP + LDA        Linear          90          90           91.83     90.90
Std-LTP + NB         Kernel NB       82          88           78.57     83.01
Std-LTP + KNN        Fine            92          92           92        92
Std-LTP + Ensemble   Subspace KNN    92          94           90.38     92.15
Std-LTP + DT         Coarse          78          58           95.45     72.15
Std-LTP + SVM        Fine Gaussian   93          95.74        90        91.78

We can observe that Std-LTP performed best with the SVM, achieving an accuracy of 93%. Std-LTP with KNN and with the ensemble subspace KNN achieved the second-best results with an accuracy of 92%. Std-LTP with DT performed worst, with an accuracy of 78%. We conclude from this experiment that the proposed Std-LTP features with the SVM outperform all comparative classifiers for fall event detection.

4 Conclusion

In this research work, we have presented an improved approach for detecting accidental falls of elderly persons so that first aid can be provided promptly. Many elderly people live at home alone and need continuous monitoring and special care. Therefore, we designed a novel approach based on the proposed acoustic Std-LTP features to capture the prominent attributes of the screams associated with accidental falls. Moreover, we developed a diverse in-door fall incident dataset containing voice samples of screams and pained voices. We used this dataset for experimentation, extracted the proposed 20-dimensional Std-LTP features from the voice clips, and fed them into an SVM to distinguish between fall occurrences and non-fall occurrences. Experimental results show that the proposed method identifies fall occurrences efficiently with 93% accuracy and the lowest false alarm rate. Furthermore, the system can be deployed in real environments such as hospitals, old-age homes, and nursing homes. In the future, we aim to evaluate the proposed Std-LTP features on other fall event datasets to check the effectiveness and generalizability of the proposed system. We also aim to send the location of the monitored person to caretakers when a fall is detected.


References 1. Scuffham P, Chaplin S, Legood R (2003) Incidence and costs of unintentional falls in older people in the United Kingdom. J Epidemiol Community Health 57(9):740–744 2. Tinetti ME, Speechley M, Ginter SF (1988) Risk factors for falls among elderly persons living in the community 319(26):1701–1707 3. Voermans NC, Snijders AH, Schoon Y, Bloem BR (2007) Why old people fall (and how to stop them). Pract Neurol 7(3):158–171 4. Yacchirema D, de Puga JS, Palau C, Esteve M (2019) Fall detection system for elderly people using IoT and ensemble machine learning algorithm. Pers Ubiquit Comput 23(5):801–817 5. Giansanti D, Maccioni G, Macellari V (2005) The development and test of a device for the reconstruction of 3-D position and orientation by means of a kinematic sensor assembly with rate gyroscopes and accelerometers 52(7):1271–1277 6. Muheidat F, Tawalbeh L, Tyrer H (2018) Context-aware, accurate, and real time fall detection system for elderly people. In: 2018 IEEE 12th international conference on semantic computing (ICSC). IEEE 7. Mezghani N, Ouakrim Y, Islam MR, Yared R, Abdulrazak B (2017) Context aware adaptable approach for fall detection bases on smart textile. In: 2017 IEEE EMBS international conference on biomedical & health informatics (BHI). IEEE, pp 473–476 8. Popescu M, Li Y, Skubic M, Rantz M (2008) An acoustic fall detector system that uses sound height information to reduce the false alarm rate. In: 2008 30th annual international conference of the IEEE engineering in medicine and biology society. IEEE, pp 4628–4631 9. Yan H, Huo H, Xu Y, Gidlund M (2010) Wireless sensor network based E-health systemimplementation and experimental results. IEEE Trans Consum Electron 56(4):2288–2295 10. Yu M, Gong L, Kollias S (2017) Computer vision based fall detection by a convolutional neural network. In: Proceedings of the 19th ACM international conference on multimodal interaction 11. Rougier C, Meunier J, St-Arnaud A, Rousseau J (2011) Robust video surveillance for fall detection based on human shape deformation. IEEE Trans Circuits Syst Video Technol 21(5):611–622 12. Cai X, Li S, Liu X, Han G (2020) Vision-based fall detection with multi-task hourglass convolutional auto-encoder. IEEE Access 8:44493–44502 13. Zhang L, Fang C, Zhu M (2020) A computer vision-based dual network approach for indoor fall detection. Int J Innov Sci Res Technol 5:939–943 14. Tong L, Song Q, Ge Y, Liu M (2013) HMM-based human fall detection and prediction method using tri-axial accelerometer. IEEE Sens J 13(5):1849–1856 15. Khan MS, Yu M, Feng P, Wang L, Chambers J (2015) An unsupervised acoustic fall detection system using source separation for sound interference suppression. Signal Process 110:199–210 16. Younis B, Javed A, Hassan F (2021) Fall detection system using novel median deviated ternary patterns and SVM. In: 2021 4th international symposium on advanced electrical and communication technologies (ISAECT). IEEE 17. Banjar A et al (2022) Fall event detection using the mean absolute deviated local ternary patterns and BiLSTM. Appl Acoust 192:108725 18. Qadir G et al (2022) Voice spoofing countermeasure based on spectral features to detect synthetic attacks through LSTM. Int J Innov Sci Technol 3:153–165 19. Hassan F, Javed A (2021) Voice spoofing countermeasure for synthetic speech detection. In: 2021 International conference on artificial intelligence (ICAI). IEEE, pp 209–212 20. 
Zeeshan M, Qayoom H, Hassan F (2021) Robust speech emotion recognition system through novel ER-CNN and spectral features. In: 2021 4th international symposium on advanced electrical and communication technologies (ISAECT). IEEE 21. Shaukat A, Ahsan M, Hassan A, Riaz F (2014) Daily sound recognition for elderly people using ensemble methods. In 2014 11th international conference on fuzzy systems and knowledge discovery (FSKD). IEEE, pp 418–423


22. Li Y, Ho KC, Popescu M (2012) A microphone array system for automatic fall detection. IEEE Trans Biomed Eng 59(5):1291–1301 23. Hassan F, Mehmood MH, Younis B, Mehmood N, Imran T, Zafar U (2022) Comparative analysis of machine learning algorithms for classification of environmental sounds and fall detection. Int J Innov Sci Technol 4(1):163–174

Impact of COVID-19 on Predicting 2020 US Presidential Elections on Social Media Asif Khan , Huaping Zhang , Nada Boudjellal , Bashir Hayat , Lin Dai, Arshad Ahmad , and Ahmed Al-Hamed

Abstract By the beginning of 2020, the world woke up to a global pandemic that changed people's everyday lives and restrained their physical contact. During those times, Social Media Platforms (SMPs) were almost the only means of individual-to-individual and government-to-individual communication. Therefore, people's opinions were expressed more on SM. On the other hand, election candidates used SM to promote themselves and engage with voters. In this study, we investigate how COVID-19 affected voters' opinions through the months of the US presidential campaign and eventually predict the 2020 US Presidential Election results using Twitter data. Mainly two types of experiments were conducted and compared: (i) transformer-based, and (ii) rule-based sentiment analysis (SA). In addition, vote shares for the presidential candidates were predicted using both approaches. The results show that the rule-based approach nearly predicts the right winner, Joe Biden, with an MAE of 2.1, outperforming the predictions from CNBC, Economist/YouGov, and the transformer-based (BERTweet) approach, and trailing only RCP (MAE 1.55).

Keywords Twitter · Sentiment Analysis · Rule-based · Transformers · COVID-19 · Election Prediction · USA Presidential Election

A. Khan · H. Zhang (B) · N. Boudjellal · L. Dai · A. Al-Hamed School of Computer Science and Technology, Beijing Institute of Technology, Beijing 100081, China e-mail: [email protected] N. Boudjellal The Faculty of New Information and Communication Technologies, University Abdelhamid Mehri Constantine 2, 25000 Constantine, Algeria B. Hayat Institute of Management Sciences Peshawar, Peshawar, Pakistan A. Ahmad Institute of Software Systems Engineering, Johannes Kepler University, 4040 Linz, Austria Department of IT and Computer Science, Pak-Austria Fachhochschule: Institute of Applied Sciences and Technology, Mang Khanpur Road, Haripur 22620, Pakistan © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 S. Anwar et al. (eds.), Proceedings of International Conference on Information Technology and Applications, Lecture Notes in Networks and Systems 614, https://doi.org/10.1007/978-981-19-9331-2_14


1 Introduction

The outbreak of the COVID-19 pandemic shook the whole world and disturbed people's daily lives by locking them in their dwellings. During those quarantine times, the role of social media platforms was much appreciated. All sectors of life were affected, including politics, with at least 80 countries and territories postponing their elections. On the other hand, at least 160 countries decided to hold their elections despite the risks posed by the pandemic [2]. Among these was the US election. The 2020 US Presidential Election campaign changed fundamentally amid the coronavirus pandemic, with candidates straining to reach voters virtually. Twitter—with the US as the top country by number of users—is widely used by politicians to interact with their audience, and it is a platform where voters can freely express their opinions about candidates and their programs. Therefore, Twitter data is a mine that, when exploited well, can lead to meaningful predictions and insights, including election result prediction. Several studies have analyzed elections on SMPs [1, 3, 4, 11, 12, 18]. Many researchers investigated elections on SMPs using SA approaches [4, 10, 13, 16], and very few studies employed BERT-based models for SA to predict elections on SMPs [6, 20]. The impact of COVID-19 on elections has been investigated by [5, 8, 15, 19]; nevertheless, these studies used data from different surveys. To the best of our knowledge, no study has investigated the impact of a pandemic on an election using social media data. In this study, we analyze the effect of COVID-19 on the US election by mining people's opinions about the candidates and eventually predicting the 2020 US election results. We studied the tweets about COVID-19 and the two final US presidential candidates: Donald Trump and Joe Biden. The main contributions of this study are:
1. Twitter mining—tweets related to COVID-19 and Joe Biden and Donald Trump.
2. Predicting the 2020 US Presidential Election using COVID-19 data.
3. Analyzing and comparing two SA approaches—VADER for the rule-based approach and BERTweet for the transformer-based approach.
4. Comparing our predictions with three famous polls' results as well as the 2020 US Presidential Election outcome.
The rest of the paper is organized as follows: Sect. 2 provides an overview of the related literature, followed by the proposed methodology in Sect. 3. Section 4 discusses the experimental results. Afterwards, the study is concluded in Sect. 5.

2 Related Studies Social media data have consistently attracted countless researchers to explore diverse events including election predictions [1, 3, 4, 11, 12, 18]. These researchers endeavoured to predict elections on SMPs by utilizing different features, factors,


and approaches. There are mainly three types of approaches for predicting elections on SMPs: (i) sentiment analysis, (ii) social network analysis, and (iii) volumetric/counting [12]. A majority of the studies showed the effectiveness of SA approaches for election prediction using Twitter data [4, 10, 13, 16]. The authors of [14] analyzed and forecasted the 2020 US Presidential Election using lexicon-based SA. They analyzed tweets in each state for the candidates and classified the results into Solid Democratic, Solid Republican, Tossup, Lean Democratic, and Lean Republican. The authors of [21] conducted SA experiments on the Sanders Twitter benchmark dataset and concluded that the Multi-layer Perceptron classifier performs best. The authors of [16] investigated the Japanese House of Councilors election (held in 2019) using the replies to candidates and the sentiment values of these replies. The paper [3] investigated the 2018 Brazilian Presidential Election and the 2016 US Presidential Election by employing social media data and machine learning techniques. The authors collected posts from candidates' profiles, including traditional posts, and used an artificial neural network to predict the vote share for the presidential candidates. Few studies have used BERT-based SA to predict elections. The study [6] analyzes the 2020 US Presidential Election using LSTM and BERT-based models for Twitter SA. The authors concluded that the BERT model indicated Biden as the winner. Likewise, the study [20] analyzed the 2020 US Presidential Election using SA. The authors compared four machine learning and deep learning methods: Naïve Bayes, TextBlob, BERT, and Support Vector Machine. They found that BERT performs better than the other three methods. Some studies investigated the impact of COVID-19 on elections. For instance, the authors of [9] analyzed and predicted coronavirus cases in the USA and examined the presidential election in relation to confirmed cases and deaths, using ARIMA, a time-series algorithm. In another study [7], the authors investigated the impact of COVID-19 on Trump during the 2020 US Presidential Election. They used multivariate statistical analyses on national survey data gathered before and after the election, and showed that the pandemic harmed Trump's image, which left him a very narrow path to winning the election. Other studies investigated the impact of a pandemic on elections, such as [5, 8, 15, 19]; however, these studies analyzed data such as surveys and questionnaires. All of these works analyzed elections either through sentiment analysis of the public's general opinions about candidates, elections, and parties on social media, or through the impact of COVID-19 on elections using surveys. This led us to investigate the impact of a pandemic on elections using social media. We studied the impact of COVID-19 on the 2020 US Presidential Election using rule-based and transformer-based SA.

3 Proposed Methodology

This section presents our proposed methodology in detail. Figure 1 demonstrates the proposed methodology for predicting elections using tweets related to COVID-19.


Fig. 1 Framework

Table 1 Data collection (keywords)

Joe Biden         @JoeBiden
Donald J. Trump   @realDonaldTrump
COVID-19          Coronavirus, covid-19, covid, corona, pandemic, epidemic, virus

3.1 Tweets

In this study, we use Twitter data to predict the 2020 US Presidential Election. We employ the Python library Tweepy to mine tweets from Twitter. Tweets mentioning the two candidates running for President (see Table 1) were collected between 1st August 2020 and 30th November 2020. Finally, the collected tweets that include keywords related to coronavirus (see Table 1) were selected for this study.
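The paper does not show the collection code; the sketch below is one plausible way to gather such tweets with Tweepy's v2 full-archive search, which requires elevated (academic) API access. The bearer token, query string, and endpoint choice are assumptions, not details taken from the study.

import tweepy

client = tweepy.Client(bearer_token="YOUR_BEARER_TOKEN")
query = ("(@JoeBiden OR @realDonaldTrump) "
         "(coronavirus OR covid OR corona OR pandemic OR epidemic OR virus) lang:en")

tweets = []
# Paginate over the full-archive search between the two campaign dates.
for page in tweepy.Paginator(client.search_all_tweets, query=query,
                             start_time="2020-08-01T00:00:00Z",
                             end_time="2020-11-30T23:59:59Z",
                             max_results=500):
    tweets.extend(t.text for t in (page.data or []))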

3.2 Data Pre-processing

The collected raw tweets contain a large amount of meaningless data and superfluous noise. We preprocessed all the tweets to clean the data. In this study, we use tweets written in English only. Further, we remove unnecessary noise such as stopwords, hashtags (#), mentions (@), IPs, URLs, and emoticons. The tweets are then converted to lower case and tokenized.
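A minimal cleaning function along the lines described (URLs, mentions, hashtags, stopwords, lower-casing, tokenization) might look as follows; it is an illustration, not the authors' exact pipeline.

import re
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize

# Requires: nltk.download("stopwords"); nltk.download("punkt")
STOPWORDS = set(stopwords.words("english"))

def clean_tweet(text):
    text = re.sub(r"http\S+|www\.\S+", " ", text)      # URLs
    text = re.sub(r"[@#]\w+", " ", text)               # mentions and hashtags
    text = re.sub(r"[^A-Za-z\s]", " ", text)           # digits, emoticons, punctuation
    tokens = word_tokenize(text.lower())               # lower-case and tokenize
    return [t for t in tokens if t not in STOPWORDS]   # drop stopwords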


3.3 Sentiment Analysis

Sentiment analysis (SA), also referred to as opinion mining, analyses the subjective information in a statement. It uses NLP techniques to classify tweets (statements) into positive, negative, and neutral. SA plays a vital role in the domain of election prediction, as it portrays the intentions and attitudes of voters toward political entities such as politicians and political parties. In this study, we employed two SA approaches to predict the winner of the 2020 US Presidential Election: (i) a rule-based SA approach, and (ii) a transformer-based SA approach. For the first approach we employed the Valence Aware Dictionary and sEntiment Reasoner (VADER), which is particularly attuned to opinions expressed on SM and is used extensively in domains such as Twitter analysis. The latter approach uses BERTweet, a language model pre-trained for English tweets following the RoBERTa pre-training procedure. The corpus used for BERTweet comprises 850 million tweets, including 5 million COVID-19-related tweets.
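The snippet below shows how the two analyzers can be invoked in practice — VADER via NLTK and BERTweet via the pysentimiento toolkit [17]. The exact arguments follow the libraries' public interfaces rather than anything stated in the paper, so treat them as assumptions.

from nltk.sentiment.vader import SentimentIntensityAnalyzer   # rule-based SA
from pysentimiento import create_analyzer                     # transformer-based SA

# Requires: nltk.download("vader_lexicon")
vader = SentimentIntensityAnalyzer()
bertweet = create_analyzer(task="sentiment", lang="en")       # BERTweet-backed analyzer

tweet = "The pandemic response has been a disaster"
print(vader.polarity_scores(tweet))      # {'neg': ..., 'neu': ..., 'pos': ..., 'compound': ...}
print(bertweet.predict(tweet).output)    # one of 'POS', 'NEU', 'NEG'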

3.4 Predicting the President

We employed Eq. (1) (Donald Trump) and Eq. (2) (Joe Biden) to predict the winner of the 2020 US Presidential Election. We discard the neutral sentiments, focusing on positive and negative sentiments. Furthermore, we calculate the Mean Absolute Error (MAE) using Eq. (3) to evaluate our two methods and observe how far our predictions deviate from the actual election outcomes.

Vote\text{-}share_{Trump} = \frac{(Pos._{Trump} + Neg._{Biden}) \times 100}{Pos._{Trump} + Neg._{Trump} + Pos._{Biden} + Neg._{Biden}}   (1)

Vote\text{-}share_{Biden} = \frac{(Pos._{Biden} + Neg._{Trump}) \times 100}{Pos._{Trump} + Neg._{Trump} + Pos._{Biden} + Neg._{Biden}}   (2)

MAE = \frac{1}{N} \sum_{i=1}^{N} |Predicted_i - Actual_i|   (3)
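Equations (1)–(3) translate directly into a few lines of Python; the example check below reuses the averaged VADER shares and the official result reported later in Table 3, and is purely illustrative.

def vote_shares(pos_trump, neg_trump, pos_biden, neg_biden):
    """Eqs. (1)-(2): vote shares from positive/negative sentiment counts."""
    total = pos_trump + neg_trump + pos_biden + neg_biden
    trump = (pos_trump + neg_biden) * 100 / total
    biden = (pos_biden + neg_trump) * 100 / total
    return trump, biden

def mae(predicted, actual):
    """Eq. (3): mean absolute error between predicted and actual vote shares."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)

# VADER averages vs. the final result (51.40 / 46.90) give an MAE of about 2.1.
print(round(mae([50.15, 49.85], [51.40, 46.90]), 2))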


4 Experimental Results and Discussion

This section presents the experimental results and discussion of our study. We performed extensive experiments to forecast the winner of the election using COVID-19 tweets. We used the Python library Tweepy to mine tweets mentioning Donald Trump and Joe Biden between 1 Aug 2020 and 30 Nov 2020. Next, we selected the tweets that contain keywords related to COVID-19 (see Table 1) and considered these tweets for this study. Table 2 shows the dataset used in this research, and Fig. 2 shows the distribution of collected tweets over time. The number of tweets related to COVID-19 dropped as the elections drew closer, especially for Joe Biden in September 2020 and for Donald Trump in November 2020. The experiments are based on two approaches: rule-based (VADER) and transformer-based (BERTweet). The tweets are preprocessed using the Natural Language Toolkit (NLTK), which also ships with the built-in VADER sentiment analyzer used in this study. For the transformer-based SA approach (BERTweet), we used the library "pysentimiento (A Python Toolkit for Sentiment Analysis and SocialNLP tasks)" [17]. All experiments were conducted in a Jupyter Notebook (Python 3.7.4) environment on a PC with 64-bit Windows 11, an Intel(R) Core(TM) i7-8750H CPU, and 16 GB RAM.

4.1 Sentiment Analysis

We have conducted extensive experiments to analyze the sentiments of people towards Donald Trump and Joe Biden during the pandemic (and election campaigns).

Table 2 Number of tweets

                  Joe Biden   Donald Trump
Data collection   1,385,065   681,408
COVID-19 only     42,642      21,702

Fig. 2 Timeline—Tweets collection


Fig. 3 Sentiment analysis using BERTweet

We have applied two different methods and compared them. The results of both methods (rule-based and transformer-based SA) are shown and discussed below.

4.1.1 Results of BERTweet

Figure 3 shows the sentiment analysis using BERTweet (transformer-based) for Donald Trump and Joe Biden. It combines the sentiment percentages for the two political leaders on a 0–200% scale: the first slot from left to right (0–100%) shows the sentiments for Joe Biden, and the second slot (100–200%) shows the sentiments for Donald Trump. It can be seen in Fig. 3 that people's sentiments towards both leaders contain a small percentage of positive compared to neutral and negative. It is interesting to notice the sentiment shift for both leaders. The percentage of negative sentiment towards Joe Biden decreased as the elections drew closer, with negative sentiment shifting slightly towards positive and more towards neutral. The sentiments towards Donald Trump remained nearly the same, with slight shifts during the months before the elections. Nonetheless, we believe that the negative percentage dropped from 51% in October to 39% in November because of the sharp decrease in the number of tweets. The results show that, on average, the attitude of voters (Twitter users) toward Joe Biden was more supportive than toward Donald Trump.

4.1.2 Results of VADER

Figure 4 illustrates the sentiment analysis using VADER (the rule-based SA approach) for Donald Trump and Joe Biden during the election campaigns using COVID-19 tweets. Figure 4 combines the sentiment percentages for the two political leaders on a 0–200% scale: the first slot from left to right (0–100%) shows the sentiments for Joe Biden, and the second slot (100–200%) shows the sentiments for Donald Trump. It is interesting to notice the sentiment shift for both leaders, especially for Joe Biden (see Fig. 4).

Fig. 4 Sentiment analysis using VADER

The positive sentiment percentage for Biden increased from 18 to 44%, and the negative sentiment percentage decreased from 57 to 17%, as the elections drew closer. The neutral percentage remained almost the same, with only trivial changes. This shows that the attitude of people towards Joe Biden became more positive over time, which can be considered a leading factor towards winning the election. On the other hand, there was no substantial shift in the positive sentiment toward Donald Trump, except in October 2020, when it increased by nearly 9%. Moreover, the negative percentage decreased with time.

4.2 Predicting Vote Share

Figure 5 presents the vote shares for Donald Trump and Joe Biden using the VADER (RB) and BERTweet (TB) approaches. We employed Eqs. (1) and (2) to predict the vote shares for Donald Trump and Joe Biden, respectively. The vote shares are presented as monthly percentages (from August 2020 to November 2020), followed by the average vote shares. In Fig. 5, "TB" denotes the transformer-based (BERTweet) approach and "RB" the rule-based (VADER) approach. Surprisingly, the results from TB show Donald Trump as a clear winner (average vote share of 62.18% for Trump and 37.82% for Joe Biden). In contrast, the results from VADER show Biden as the winner with a negligible lead (50.15% for Biden and 49.85% for Trump).


Fig. 5 Vote shares for Biden and Trump using BERTweet and VADER


Table 3 Predicted results and the MAE

               Predicted results                                               Final results
               RCP    Economist/YouGov   CNBC   BERTweet   VADER   2020 US Presidential Election
Joe Biden      51.2   53                 52     37.82      50.15   51.40
Donald Trump   44     43                 42     62.18      49.85   46.90
MAE            1.55   2.75               2.75   14.43      2.1     –

Table 3 shows our predicted vote shares using BERTweet and VADER for Donald Trump and Joe Biden, along with the actual 2020 US Presidential Election results as well as the predicted values of the three polls (CNBC, RCP, and Economist/YouGov). In addition, Table 3 reports the MAE for the polls and for our predictions. The results using VADER are quite impressive, as it outperformed BERTweet and the other polls' predictions except that of RCP. In contrast, BERTweet has the highest error (MAE = 14.43). The results show that a pandemic can affect events and help us in predicting an event such as an election.

5 Conclusion and Future Work

In this study, we investigated the effects of a pandemic on an election. We analyzed the 2020 US Presidential Election during COVID-19 using Twitter data (from 1st August 2020 to 30th November 2020). We studied tweets mentioning Donald Trump and Joe Biden that contain COVID-19 keywords. Conspicuously, the tweets for Joe Biden (66.3%) were greater in number than those for Donald Trump (33.7%). Mainly two types of experiments were conducted and compared—transformer-based SA and rule-based SA. In addition, vote shares for the presidential candidates were predicted using both approaches. The results are quite interesting: the rule-based approach led us to predict the right winner (Joe Biden) with an MAE of 2.1, outperforming the predictions from CNBC, Economist/YouGov, and the transformer-based (BERTweet) approach, except for RCP (MAE 1.55). This study has some limitations, which need to be investigated in future. The first COVID-19 cases in the USA were diagnosed in January 2020, whereas the tweets considered in this study cover only a short period (Aug–Nov), which affects the outcomes of the predictions. In addition, only tweets mentioning the candidates were considered, while tweets using hashtags were ignored. Investigating a larger number of tweets would improve the predictions and give better insight into the elections. Moreover, analyzing correlations with the real COVID-19 figures—the numbers of infected, recovered/discharged, and deceased—is needed.


References 1. Ali H, Farman H, Yar H et al (2021) Deep learning-based election results prediction using Twitter activity. Soft Comput. https://doi.org/10.1007/s00500-021-06569-5 2. Asplund E (2022) Global overview of COVID-19: impact on elections | International IDEA. In: Int. IDEA. https://www.idea.int/news-media/multimedia-reports/global-overview-covid19-impact-elections 3. Brito KDS, Adeodato PJL (2020) Predicting Brazilian and U.S. elections with machine learning and social media data. In: Proceedings of the international joint conference on neural networks 4. Budiharto W, Meiliana M (2018) Prediction and analysis of Indonesia Presidential election from Twitter using sentiment analysis. J Big Data 5:1–10. https://doi.org/10.1186/s40537-0180164-1 5. Cassan G, Sangnier M (2022) The impact of 2020 French municipal elections on the spread of COVID-19. J Popul Econ 35:963–988. https://doi.org/10.1007/s00148-022-00887-0 6. Chandra R, Saini R (2021) Biden vs Trump: modeling US general elections using BERT language model. In: IEEE access, pp 128494–128505 7. Clarke H, Stewart MC, Ho K (2021) Did covid-19 kill Trump politically? The pandemic and voting in the 2020 presidential election. Soc Sci Q 102:2194–2209. https://doi.org/10.1111/ ssqu.12992 8. Dauda M (2020) The impact of covid-19 on election campaign in selected states of Nigeria 14–15 9. Dhanya MG, Megha M, Kannath M et al (2021) Explorative predictive analysis of Covid-19 in US and its impact on US Presidential Election. In: 2021 4th international conference on signal processing and information security, ICSPIS 2021, pp 61–64 10. Ibrahim M, Abdillah O, Wicaksono AF, Adriani M (2016) Buzzer detection and sentiment analysis for predicting presidential election results in a Twitter nation. In: Proceedings of the 15th IEEE international conference on data mining workshop (ICDMW), pp 1348–1353. https://doi.org/10.1109/ICDMW.2015.113 11. Jaidka K, Ahmed S, Skoric M, Hilbert M (2019) Predicting elections from social media: a three-country, three-method comparative study. Asian J Commun 29:252–273. https://doi.org/ 10.1080/01292986.2018.1453849 12. Khan A, Zhang H, Boudjellal N et al (2021) Election prediction on twitter: a systematic mapping study. Complexity 2021:1–27. https://doi.org/10.1155/2021/5565434 13. Khan A, Zhang H, Shang J et al (2020) Predicting politician’s supporters’ network on twitter using social network analysis and semantic analysis. Sci Program. https://doi.org/10.1155/ 2020/9353120 14. Nugroho DK (2021) US presidential election 2020 prediction based on Twitter data using lexicon-based sentiment analysis. In: Proceedings of the confluence 2021: 11th international conference on cloud computing, data science and engineering, pp 136–141 15. Nurjaman A, Hertanto H (2022) Social media and election under covid-19 pandemic in Malang regency Indonesia. Int J Commun 4:1–11 16. Okimoto Y, Hosokawa Y, Zhang J, Li L (2021) Japanese election prediction based on sentiment analysis of twitter replies to candidates. In: 2021 international conference on asian language processing, IALP 2021, pp 322–327 17. Pérez JM, Giudici JC, Luque F (2021) pysentimiento: a python toolkit for sentiment analysis and SocialNLP tasks. http://arxiv.org/abs/2106.09462 18. Salem H, Stephany F (2021) Wikipedia: a challenger’s best friend? Utilizing informationseeking behaviour patterns to predict US congressional elections. Inf Commun Soc. https:// doi.org/10.1080/1369118X.2021.1942953 19. 
Shino E, Smith DA (2021) Pandemic politics: COVID-19, health concerns, and vote choice in the 2020 general election. J Elections, Public Opin Parties 31:191–205. https://doi.org/10. 1080/17457289.2021.1924734


20. Singh A, Kumar A, Dua N et al (2021) Predicting elections results using social media activity a case study: USA presidential election 2020. In: 2021 7th international conference on advanced computing and communication systems, ICACCS 2021, pp 314–319 21. Xia E, Yue H, Liu H (2021) Tweet sentiment analysis of the 2020 U.S. presidential election. In: The web conference 2021—companion of the world wide web conference, WWW 2021, pp 367–371

Health Mention Classification from User-Generated Reviews Using Machine Learning Techniques Romieo John, V. S. Anoop, and S. Asharaf

Abstract The advancements in information and communication technologies contributed greatly to the development of social media and other platforms where people express their opinions and experiences. There are several platforms such as drugs.com where people rate pharmaceutical drugs and also give comments and reviews on the drugs they use and their side effects. It is important to analyze such reviews to find out the sentiment, opinions, drug efficacy, and most importantly, adverse drug reactions. Health mention classification deals with classifying such user-generated text into different classes of health mentions such as obesity, anxiety, and more. This work uses machine learning approaches for classifying health mentions from the publicly available health-mention dataset. Both the shallow machine learning algorithms and deep learning approaches with pre-trained embeddings have been implemented and the performances were compared with respect to the precision, recall, and f1-score. The experimental results show that machine learning approaches will be a good choice for automatically classifying health mentions from the large amount of user-generated drug reviews that may help different stakeholders of the healthcare industry to better understand the market and consumers. Keywords Health mention classification · Machine learning · Transformers · Deep learning · Natural language processing · Computational social sciences R. John · V. S. Anoop (B) Kerala Blockchain Academy, Kerala University of Digital Sciences, Innovation and Technology, Thiruvananthapuram, India e-mail: [email protected] R. John e-mail: [email protected] V. S. Anoop School of Digital Sciences, Kerala University of Digital Sciences, Innovation and Technology, Thiruvananthapuram, India S. Asharaf Kerala University of Digital Sciences, Innovation and Technology, Thiruvananthapuram, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 S. Anwar et al. (eds.), Proceedings of International Conference on Information Technology and Applications, Lecture Notes in Networks and Systems 614, https://doi.org/10.1007/978-981-19-9331-2_15


1 Introduction

The digital revolution has caused internet services to proliferate at an exponential rate. Recent innovations in information and communication technologies fueled the growth of internet-based applications and services such as social networks and online forums. People use these services to share their reviews and opinions on different products and services and to deliberate in discussions (Zhang et al. 2022) [13, 15]. As these reviews and opinions contain useful but latent information for various stakeholders, it is important to analyze them [5, 10, 11, 16, 19]. Online forums such as drugs.com (https://www.drugs.com) are also used for expressing opinions, but specific to drugs and medications. Such platforms contain reviews mentioning important information such as the names of drugs, the health condition for which the drug was used, user experiences, and adverse drug reactions. Identifying this information and classifying a user post into one of the health mention classes is of great importance for healthcare researchers and other stakeholders [1, 6, 17, 18]. Manually analyzing such platforms to find useful pieces of information in user-generated content would be a time-consuming task and may often result in poor quality. Machine learning algorithms that can handle large quantities of data and classify them into different categories may find application in this context. In the recent past, many such approaches have been reported in the machine learning and natural language processing literature with varying degrees of success. With the introduction of sophisticated deep learning algorithms such as Convolutional Neural Networks (CNN) and Bi-directional Long Short-term Memory (BiLSTM), state-of-the-art results are being obtained in classification tasks, specifically health mention classification from user-generated posts. The proposed work employs different machine learning algorithms to classify health mentions from publicly available user-generated health reviews. The major contributions of this paper are summarized as follows:
(a) Discusses the relevance of health mention classification from user-generated health reviews, which are unstructured text.
(b) Implements different machine learning algorithms, both shallow learning and deep learning, to classify health mentions.
(c) Reports and discusses the classification performance of the different algorithms used in the proposed approach for classifying health mentions.
The remainder of this manuscript is organized as follows—Sect. 2 briefly discusses some of the very recent and prominent works on health mention classification using machine learning techniques. Section 3 presents the proposed approach, and Sect. 4 discusses the experiment conducted. The results are presented in Sect. 5 along with a detailed discussion, and in Sect. 6 the authors present the conclusions.


2 Related Studies

The recent advancements in natural language processing techniques, such as the development of large language models, have led to several works on health mention classification being reported in the machine learning literature. This section discusses some of the very recent and prominent works that use machine learning approaches for health mention classification. Pervaiz et al. [9] have done a performance comparison of transformer-based models on Twitter health mention classification. They chose nine widely used transformer methods for comparison and reported that RoBERTa outperformed all other models by achieving an f1-score of 93%. Usman et al. proposed an approach for identifying disease and symptom terms from Reddit to improve the health mention classification problem [14]. The authors released a new dataset that manually classifies Reddit posts into four labels, namely personal health mentions, non-personal health mentions, figurative health mentions, and hyperbolic health mentions. Experimental results demonstrated that their approach outperformed state-of-the-art methods with an F1-score of 0.75. A recent approach that attempted to identify COVID-19 personal health mentions from tweets using a masked attention model was reported in [12]. The authors built a COVID-19 personal health mention dataset containing tweets annotated with four types of health conditions—self-mention, other mention, awareness, and non-health. This approach obtained promising results when compared with some state-of-the-art approaches. Fries et al. proposed an ontology-driven weak supervision approach for clinical entity classification from electronic health records [3]. Their model, named Trove, used medical ontologies and expert-generated rules for the classification task, and its performance was evaluated on six benchmark tasks and real-life experiments. Kayastha et al. proposed a BERT-based adverse drug effect tweet classification [7]. The authors reported that their best-performing model utilizes BERTweet followed by a single BiLSTM layer. The system achieved an F-score of 0.45 on the test set without the use of any auxiliary resources such as Part-of-Speech tags, dependency tags, or knowledge from medical dictionaries [7]. Biddle et al. developed an approach that leverages sentiment distributions to distinguish figurative from literal health reports on Twitter [2]. For the experiments, the authors modified a benchmark dataset and added nearly 14,000 manually annotated tweets. The proposed classifier outperformed state-of-the-art approaches in detecting health-related and figurative tweets. Khan et al. incorporated a permutation-based contextual word representation for health mention classification [8]. The performance of the classifier is improved by capturing the context of disease words efficiently, and experiments conducted with the benchmark dataset showed better accuracy for the proposed approach.


The proposed approach uses different machine learning techniques (both shallow learning and deep learning) for classifying health mentions from user-generated social media text. This work employs machine learning algorithms such as Random Forest, Logistic Regression, Naive Bayes, Support Vector Machine, Light Gradient Boosting Machine, Bi-directional Long Short-Term Memory, Convolutional Neural Network, and Transformers for building classifier models. We also use pre-trained embeddings such as BERT and SciBERT for better feature extraction and classification. Section 3 discusses the proposed approach for health mention classification in detail.

3 Proposed Approach

This section discusses the details of the proposed approach. The overall workflow of the proposed method is shown in Fig. 1.
Random Forest (RF): The Random Forest approach has been used to examine drug datasets in several studies, owing to its ability to analyze data and make informed predictions. The dataset employed in this study is balanced in nature; because random forest separates data into branches to form a tree, we infer that it is not well suited to providing prognostic options for unbalanced problems.

Fig. 1 Overall workflow of the proposed approach


Logistic Regression (LR): Logistic regression is a technique that employs a set of continuous, discrete, or mixed characteristics together with a binary target. This approach is popular since it is simple to apply and produces decent results.
Naive Bayes (NB): Bayesian classifiers are well known for their computational efficiency and their natural and efficient handling of missing data. Past studies have shown that this model can achieve high prediction accuracy.
Support Vector Machines (SVM): The SVM has a distinct edge when it comes to tackling classification tasks that require high generalization. The strategy aims to reduce error by focusing on structural risk. This technique is widely utilized in medical diagnostics.
Light Gradient Boosting Machine (LGBM): LGBM is a gradient boosting framework built on decision trees that can be applied to a range of machine learning tasks, including classification and ranking.
The proposed approach also implements the following deep learning classifiers for health mention classification.
Convolutional Neural Networks (CNN): The CNN architecture for classification includes convolutional layers, max-pooling layers, and fully connected layers. Convolution and max-pooling layers are used for feature extraction: convolution layers perform feature detection, while max-pooling layers perform feature selection.
Long Short-Term Memory (LSTM): The LSTM is a kind of recurrent neural network that effectively stores past information in memory and can also tackle the vanishing gradient problem of RNNs. Like RNNs, LSTMs work with sequential data, even when the temporal gaps are long.
Transformers: Transformers are a class of deep neural network models extensively used in natural language processing. A transformer adopts the mechanism of self-attention, differentially weighting the significance of each part of the input data.
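To illustrate how the shallow classifiers listed above can be assembled into one comparable pipeline, the sketch below trains several of them on TF-IDF features with scikit-learn and LightGBM. The toy reviews and labels are placeholders, not the study's data, and real runs would need the full review corpus for meaningful results.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import LinearSVC
from lightgbm import LGBMClassifier

reviews = ["could not sleep at all", "cleared my skin in weeks",
           "helped my panic attacks", "no effect on my acne"]      # placeholder reviews
labels = ["insomnia", "acne", "anxiety", "acne"]                   # placeholder conditions

X = TfidfVectorizer().fit_transform(reviews)                        # TF-IDF features
models = {"LR": LogisticRegression(max_iter=1000), "NB": MultinomialNB(),
          "RF": RandomForestClassifier(), "SVM": LinearSVC(),
          "LGBM": LGBMClassifier()}
for name, model in models.items():
    model.fit(X, labels)                                            # toy fit; warnings expected on tiny data
    print(name, model.predict(X))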

4 Experiment

4.1 Dataset

For this experiment, we use the publicly available drug review dataset from the UCI Machine Learning repository. The dataset is available at https://archive.ics.uci.edu/ml/machine-learning-databases/00462/ [4]. The dataset contains a total of 215,063 patient reviews on specific drugs along with related conditions. There are a total of six attributes in the dataset, namely drugName—the name of the drug, condition—the name of the condition, review—the patient review, rating—the numerical rating, date—the date of the review, and usefulCount—the number of users who found the corresponding review useful. A snapshot of the data is shown in Table 1.

Table 1 A snapshot of the dataset used

Unique ID   Drug name   Condition           Review                                                                                                                               Rating   Date         Useful count
4907        Belviq      Weight Loss         This is a waste of money. Did not curb my appetite nor did it make me feel full                                                      1        23-Sep-14    57
151,674     Chantix     Smoking Cessation   Took it for one week and that was it. I didn't think it was possible for me to quit. It has been 6 years now. Great product          10       14-Feb-15    26
30,401      Klonopin    Bipolar Disorder    This medication helped me sleep. But eventually it became ineffective as a sleep aid. It also helps me calm down when in severe      6        14-July-09   24
                                            stress, anxiety, or panic
103,401     Celecoxib   Osteoarthritis      Celebrex did nothing for my pain                                                                                                      1        12-Feb-09    35

4.2 Experimental Setup

This section describes the experimental setup used for our proposed approach. All the methods described in this paper were implemented in Python 3.8. The experiments were run on a server configured with an Intel(R) Core(TM) i5-10300H CPU @ 2.50 GHz and 8 GB of main memory. Firstly, the dataset was pre-processed to remove stopwords, URLs, and other special characters.


The demoji library, available at https://pypi.org/project/demoji/, was used for converting emojis into textual form. We also used an English contractions list for better text normalization. During the analysis, we found that the dataset is unbalanced; to balance it, we performed down-sampling or up-sampling. The feature engineering stage computes different features such as count vectorization, word embeddings, and TF-IDF after word tokenization. This work used the Keras texts-to-sequences tokenizer (https://www.tensorflow.org/api_docs/python/tf/keras/preprocessing/text/Tokenizer), and BERT tokenization from https://huggingface.co/docs/transformers/main_classes/tokenizer is used for the transformers. Once the experiment-ready dataset was obtained, we split it into train and test sets and used it with the algorithms listed in Sect. 3.
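As a hedged illustration of the setup described above, the snippet below loads a UCI drug review file, converts emojis with demoji, balances by down-sampling, and produces a stratified train/test split. The TSV file name and column names follow the public dataset but are assumptions and should be verified locally; older demoji versions may additionally require demoji.download_codes().

import pandas as pd
import demoji
from sklearn.model_selection import train_test_split

# Assumed file name from the UCI archive.
df = pd.read_csv("drugsComTrain_raw.tsv", sep="\t")

df["review"] = df["review"].astype(str).map(demoji.replace_with_desc)  # emojis -> text
top = df["condition"].value_counts().head(14).index                    # keep frequent conditions
df = df[df["condition"].isin(top)]

# Simple down-sampling so every condition has the same number of reviews.
n_min = df["condition"].value_counts().min()
df = df.groupby("condition", group_keys=False).apply(lambda g: g.sample(n_min, random_state=42))

train, test = train_test_split(df, test_size=0.2, stratify=df["condition"], random_state=42)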

5 Results and Discussions

This section details the results obtained from the experiment conducted with the dataset described in Sect. 4.1. The results obtained for the shallow machine learning algorithms and related discussions are given in Sect. 5.1, and Sect. 5.2 discusses the results of the Bi-directional Long Short-term Memory and Convolutional Neural Network classifiers. The results and discussion for health mention classification using BERT are given in Sect. 5.3, and the results for SciBERT are presented in Sect. 5.4.

5.1 Health Mention Classification Using Shallow Machine Learning Algorithms

Five shallow machine learning algorithms were implemented as discussed in the proposed approach, namely Logistic Regression (LR), Light Gradient Boosting Machine (LGBM), Naive Bayes (NB), Random Forest (RF), and Support Vector Machine (SVM). The precision, recall, and f1-score for these algorithms are shown in Table 2. LR scored a precision of 88%, recall of 79%, and an f1-score of 83%. The LGBM classifier recorded 69%, 90%, and 78% for precision, recall, and f1-score, respectively. While NB scored 92% precision, 73% recall, and 81% f1-score, the RF classifier obtained 68%, 61%, and 65% for precision, recall, and f1-score. The SVM algorithm recorded 73% precision, 63% recall, and 66% f1-score for the experiment conducted.

Table 2 Classification report for the Logistic Regression, LGBM, Naive Bayes, Random Forest, and SVM classifiers

Algorithm                         Precision   Recall   F1-score
Logistic Regression               0.88        0.79     0.83
Light Gradient Boosting Machine   0.69        0.90     0.78
Naïve Bayes                       0.92        0.73     0.81
Random Forest                     0.68        0.61     0.65
Support Vector Machine            0.73        0.63     0.66

5.2 Health Mention Classification Using Bi-Directional Long Short-Term Memory (BiLSTM) and Convolutional Neural Network (CNN)

We have implemented CNN and BiLSTM algorithms for the classification of health mentions, and the results obtained are shown in Tables 3 and 4, respectively. Table 3 presents the precision, recall, and f1-score comparison for the 14 selected diseases. The Convolutional Neural Network recorded a weighted average of 89% for precision, 88% for recall, and 88% for f1-score (Fig. 2).

Table 3 Classification report for the convolutional neural network model

Health condition          Precision   Recall   F1-score
ADHD                      0.92        0.89     0.91
Acne                      0.95        0.88     0.92
Anxiety                   0.83        0.72     0.77
Bipolar disorder          0.83        0.76     0.79
Birth control             0.96        0.98     0.97
Depression                0.74        0.85     0.79
Diabetes (type 2)         0.91        0.89     0.90
Emergency contraception   0.99        0.93     0.96
High blood pressure       0.93        0.83     0.88
Insomnia                  0.88        0.85     0.87
Obesity                   0.70        0.61     0.66
Pain                      0.87        0.95     0.91
Vaginal yeast infection   0.97        0.94     0.95
Weight loss               0.68        0.74     0.71
Macro average             0.87        0.84     0.86
Weighted average          0.89        0.88     0.88
Accuracy                                       0.88

Table 4 Classification report for BiLSTM model

Health condition          Precision   Recall   F1-score
ADHD                      0.92        0.87     0.89
Acne                      0.94        0.89     0.91
Anxiety                   0.80        0.74     0.77
Bipolar disorder          0.78        0.75     0.76
Birth control             0.97        0.98     0.97
Depression                0.74        0.79     0.76
Diabetes (type 2)         0.85        0.88     0.86
Emergency contraception   0.97        0.95     0.96
High blood pressure       0.82        0.83     0.82
Insomnia                  0.84        0.87     0.85
Obesity                   0.55        0.75     0.63
Pain                      0.93        0.91     0.92
Vaginal yeast infection   0.93        0.96     0.94
Weight loss               0.70        0.44     0.54
Macro average             0.84        0.83     0.83
Weighted average          0.87        0.87     0.87
Accuracy                                       0.87

Fig. 2 Classification report for the Logistic Regression, LGBM, Naive Bayes, Random Forest, and SVM classifiers


The classification report for the BiLSTM model is shown in Table 4 with 14 diseases. This model has recorded a weighted average of 87% for precision, recall, and f1-score (Figs. 3, 4).

Fig. 3 Precision, recall, F1-score comparisons for different health conditions for convolutional neural network

Fig. 4 Precision, recall, f1-score comparisons for different health conditions for bidirectional long short-term memory


5.3 Health Mention Classification Using Bidirectional Encoder Representations from Transformers

For the BERT implementation, the classification report is shown in Table 5 for the top six health conditions, namely birth control, depression, pain, anxiety, acne, and bipolar disorder. BERT recorded 91% for precision, recall, and f1-score. The comparison shows that BERT outperformed the other models in terms of f1-score (Fig. 5).

Table 5 Classification report for BERT model

Health condition   Precision   Recall   F1-score
Birth control      0.98        0.98     0.98
Depression         0.80        0.82     0.81
Pain               0.92        0.95     0.94
Anxiety            0.75        0.81     0.78
Acne               0.93        0.98     0.91
Bipolar disorder   0.87        0.69     0.77
Accuracy                                0.91
Macro average      0.88        0.86     0.87
Weighted average   0.91        0.91     0.91

Fig. 5 Precision, recall, F1-score comparisons for different health conditions for bidirectional encoder representations from transformers


5.4 Health Mention Classification Using Pre-Trained BERT-Based Language Model for Scientific Text

SciBERT works similarly to the BERT model, but it was pre-trained on a corpus of publicly accessible scientific data from PubMed and PMC. For implementing SciBERT on our dataset, we manually labelled 2000 training samples and 1000 test samples using only the top 40 health conditions. The classification report of the SciBERT model for the top 17 health conditions is shown in Table 6; the weighted average precision is 87%, recall is 89%, and f1-score is 86%. On closer inspection, the precision, recall, and accuracy for individual health conditions are not satisfactory; we believe this is due to the limited number of data samples used for training, and it needs to be investigated further (Fig. 6).

Table 6 Classification report for Sci-BERT model

Health condition            Precision   Recall   F1-score
ADHD                        0.42        0.39     0.41
GERD                        0.70        0.45     0.41
Abnormal uterine bleeding   0.50        0.05     0.09
Acne                        0.45        0.37     0.40
Birth control               0.56        0.43     0.49
Depression                  0.50        0.36     0.42
Emergency contraception     0.47        0.37     0.42
Fibromyalgia                0.77        0.19     0.31
Insomnia                    0.45        0.30     0.36
Irritable bowel syndrome    0.50        0.20     0.29
Migraine                    0.49        0.63     0.55
Muscle spasm                0.57        0.22     0.32
Sinusitis                   0.41        0.23     0.29
Smoking cessation           0.74        0.51     0.60
Urinary tract infection     0.43        0.29     0.35
Vaginal yeast infection     0.50        0.37     0.43
Weight loss                 0.49        0.23     0.31
Macro average               0.44        0.16     0.20
Weighted average            0.87        0.89     0.86
Accuracy                                         0.89


Fig. 6 Precision, recall, F1-score comparisons for different health conditions for SciBERT—a pre-trained BERT-based language model for scientific text (SciBERT)

6 Conclusions

Analyzing user-generated text from social media for health mentions has several use cases, such as understanding user reviews and sentiments on medications, their efficacy, and adverse drug reactions. As manual analysis is very cumbersome, machine learning approaches can be handy and effective for automating it. This work proposed machine learning-based approaches for health mention classification from social media posts. Shallow learning algorithms such as Logistic Regression, Light Gradient Boosting Machine, Naive Bayes, Random Forest, and Support Vector Machine were implemented along with deep learning algorithms such as BiLSTM, CNN, and Transformers. The results are promising and show that machine learning is a good choice for automating health mention classification from user-generated content.

References 1. Abualigah, L., Alfar, H. E., Shehab, M., & Hussein, A. M. A. (2020). Sentiment analysis in healthcare: a brief review. Recent Advances in NLP: The Case of Arabic Language, 129–141. 2. Biddle, R., Joshi, A., Liu, S., Paris, C., & Xu, G. (2020, April). Leveraging sentiment distributions to distinguish figurative from literal health reports on Twitter. In Proceedings of The Web Conference 2020 (pp. 1217–1227).

188

R. John et al.

3. Fries JA, Steinberg E, Khattar S, Fleming SL, Posada J, Callahan A, Shah NH (2021) Ontologydriven weak supervision for clinical entity classification in electronic health records. Nat Commun 12(1):1–11 4. Gräßer, F., Kallumadi, S., Malberg, H., & Zaunseder, S. (2018). Aspect-based sentiment analysis of drug reviews applying cross-domain and cross-data learning. In Proceedings of the 2018 International Conference on Digital Health (pp. 121–125). 5. Hajibabaee, P., Malekzadeh, M., Ahmadi, M., Heidari, M., Esmaeilzadeh, A., Abdolazimi, R., & James Jr, H. (2022). Offensive language detection on social media based on text classification. In 2022 IEEE 12th Annual Computing and Communication Workshop and Conference (CCWC) (pp. 0092–0098). IEEE. 6. Jothi N, Husain W (2015) Data mining in healthcare–a review. Procedia computer science 72:306–313 7. Kayastha, T., Gupta, P., & Bhattacharyya, P. (2021). BERT based Adverse Drug Effect Tweet Classification. In Proceedings of the Sixth Social Media Mining for Health (\#SMM4H) Workshop and Shared Task (pp. 88–90). 8. Khan, P. I., Razzak, I., Dengel, A., & Ahmed, S. (2020). Improving personal health mention detection on twitter using permutation based word representation learning. In International Conference on Neural Information Processing (pp. 776–785). Springer, Cham. 9. Khan, P. I., Razzak, I., Dengel, A., & Ahmed, S. (2022). Performance comparison of transformer-based models on twitter health mention classification. IEEE Transactions on Computational Social Systems. 10. Lekshmi, S., & Anoop, V. S. (2022). Sentiment Analysis on COVID-19 News Videos Using Machine Learning Techniques. In Proceedings of International Conference on Frontiers in Computing and Systems (pp. 551–560). Springer, Singapore. 11. Liu J, Wang X, Tan Y, Huang L, Wang Y (2022) An Attention-Based Multi-Representational Fusion Method for Social-Media-Based Text Classification. Information 13(4):171 12. Luo, L., Wang, Y., & Mo, D. Y. (2022). Identifying COVID-19 Personal Health Mentions from Tweets Using Masked Attention Model. IEEE Access. 13. Messaoudi, C., Guessoum, Z., & Ben Romdhane, L. (2022). Opinion mining in online social media: a survey. Social Network Analysis and Mining, 12(1), 1-18 14. Naseem, U., Kim, J., Khushi, M., & Dunn, A. G. (2022). Identification of disease or symptom terms in reddit to improve health mention classification. In Proceedings of the ACM Web Conference 2022 (pp. 2573–2581). 15. Reveilhac, M., Steinmetz, S., & Morselli, D. (2022). A systematic literature review of how and whether social media data can complement traditional survey data to study public opinion. Multimedia Tools and Applications, 1–36. 16. Salas-Zárate, R., Alor-Hernández, G., Salas-Zárate, M. D. P., Paredes-Valverde, M. A., BustosLópez, M., & Sánchez-Cervantes, J. L. (2022). Detecting depression signs on social media: a systematic literature review. In Healthcare (Vol. 10, No. 2, p. 291). MDPI. 17. Shiju, A., & He, Z. (2021). Classifying Drug Ratings Using User Reviews with TransformerBased Language Models. MedRxiv. 18. Thoomkuzhy, A. M. (2020). Drug Reviews: Cross-condition and Cross-source Analysis by Review Quantification Using Regional CNN-LSTM Models. 19. Varghese, M., & Anoop, V. S. (2022). Deep Learning-Based Sentiment Analysis on COVID19 News Videos. In Proceedings of International Conference on Information Technology and Applications (pp. 229–238). Springer, Singapore.

Using Standard Machine Learning Language for Efficient Construction of Machine Learning Pipelines Srinath Chiranjeevi and Bharat Reddy

Abstract We use Standard Machine Learning Language (SML) to streamline the synthesis of machine learning pipelines in this research. The overarching goal of SML is to ease the production of machine learning pipelines by providing a level of abstraction which makes it possible for individuals in industry and academia to use machine learning to tackle challenges across a variety of fields without having to deal with the low-level details involved in creating a machine learning pipeline. We further probe into how a wide range of interfaces can be instrumental in interacting with SML. Lines of comparison are further drawn to analyze the efficiency of SML in practical use cases versus traditional approaches. As an outcome, we developed SML, a query-like language that serves as an abstraction over writing a lot of code. Our findings show how SML is competent in solving problems that utilize machine learning. Keywords Machine learning pipelines · Standard machine learning language · Problem solving using machine learning

1 Introduction Machine Learning has simplified the process of solving a vast number of problems in a variety of fields by learning from data. In most cases, machine learning has become more attractive than manually creating programs to address these same issues. However, there are a multitude of nuances involved when developing machine learning pipelines [2]. If these nuances are not taken into consideration, one may not receive satisfactory results. A domain expert utilizing machine learning to solve

S. Chiranjeevi (B) Vellore Institute of Technology, Bhopal, India e-mail: [email protected] B. Reddy National Institute of Technology, Calicut, India © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 S. Anwar et al. (eds.), Proceedings of International Conference on Information Technology and Applications, Lecture Notes in Networks and Systems 614, https://doi.org/10.1007/978-981-19-9331-2_16


Fig. 1 Example of a SML query performing classification

problems may not want or have the time to deal with these complexities. To combat these issues, we introduce Standard Machine Learning Language (SML). The overall objective of SML is to provide a level of abstraction which simplifies the development process of machine learning pipelines [8]. Consequently, this enables students, researchers, and industry professionals without a background in machine learning to solve problems in different domains with machine learning. We developed SML, a query-like language that serves as an abstraction over writing a lot of code (see Fig. 1 for an example). In the subsequent sections, related work is discussed, followed by the grammar used to create queries for SML [3]. The architecture of SML is then described, and lastly SML is applied to use cases to demonstrate how it reduces the complexity of solving problems that utilize machine learning.

2 Related Work There are related works that also attempt to provide a level of abstraction for writing machine learning code. TPOT [5] is a tool implemented in Python that creates and optimizes machine learning pipelines using genetic programming. Given cleaned data, TPOT performs feature selection, preprocessing, and construction. Given the task (classification, regression, or clustering), it uses the best features to determine the most suitable model. Lastly, it optimizes the parameters of the selected model. What differentiates SML from TPOT is that, in addition to feature, model, and parameter selection/optimization, a framework is in place to apply these models to different datasets and construct visualizations for different metrics with each algorithm. LBJava, used for the rapid development of NLP systems [6], is a tool based on a programming paradigm called Learning Based Programming [7], which is an extension of conventional programming that creates functions using data-driven approaches. LBJava follows the principles of Learning Based Programming by abstracting the details of common machine learning processes. What separates SML from LBJava and TPOT is that it offers a higher level of abstraction by providing a query-like language, which allows people who are not experienced programmers to use SML.


3 Grammar The SML language is a domain specific language with grammar implemented in Backus-Naur form (BNF). Each expression has a rule and can be expanded into other terms. Figure 1 is an example of how one would perform classification on a dataset using SML. The query in Fig. 1 reads from a dataset, performs an 80/20 split of training and testing data respectively, and performs classification on the 5th column of the hypothetical dataset using columns 1, 2, 3, and 4 as predictors. In the subsequent subsections SML’s grammar in BNF form is defined in addition to the keywords [1].

3.1 Grammar Structure
This subsection is dedicated to defining the grammar of SML in terms of BNF. A Query can be defined by a delimited list of Actions where the delimiter is an AND statement; with BNF syntax this is defined as:

  <Query> ::= <Action> | <Action> AND <Query>    (1)

An Action in (1) follows one of the structures defined in (2), where a Keyword is required followed by an Argument and/or an OptionList:

  <Action> ::= <Keyword> "\"" <Argument> "\""
             | <Keyword> "\"" <Argument> "\"" "(" <OptionList> ")"
             | <Keyword> "(" <OptionList> ")"    (2)

A Keyword is a predefined term associating an Action with a particular string. An Argument is generally a single string surrounded by quotes that specifies a path to a file. Lastly, an Action can have a multitude of Options (3), where an Option consists of an OptionName with either an OptionValue or an OptionValueList. An OptionName and an OptionValue each consist of a single string, an OptionList (4) consists of a comma-delimited list of Options, and an OptionValueList (5) consists of a comma-delimited list of OptionValues:

  <Option> ::= <OptionName> "=" <OptionValue>
             | <OptionName> "=" "\"" <OptionValueList> "\""    (3)

  <OptionList> ::= <Option> | <Option> "," <OptionList>    (4)

  <OptionValueList> ::= <OptionValue> | <OptionValue> "," <OptionValueList>    (5)


Fig. 2 Here the example Query on the top was defined in Fig. 1 and the bottom Query is in BNF format. For the example Query the first Keyword is READ followed by an Argument that specifies the path to the dataset, next an OptionValueList containing information about the delimiter of the dataset and the header. We then include the AND delimiter to specify an additional Keyword SPLIT with an OptionValueList that tells us the size of the training and testing partitions for the dataset specified with the READ Keyword. Lastly, the AND delimiter is used to specify another Keyword CLASSIFY which performs classification using the training and testing data from the result of the SPLIT Keyword followed by an OptionValueList which provides information to SML about the features to use (columns 1–4), the label we want to predict (column 5), and the algorithm to use for classification

To put the grammar into perspective the example Query in Fig. 1 has been transcribed into BNF format and can be found in Fig. 2. The next subsection describes the functionality for all Keywords of SML.
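Figures 1 and 2 themselves are not reproduced in this text. As a rough illustration of the shape such a Query takes, a classification Query of the kind described above might be written as follows; the option names (sep, header, train, predictors, label, algorithm) are assumptions for illustration rather than SML's exact syntax:

  READ "/path/to/dataset" (sep = ",", header = 0) AND
  SPLIT (train = 0.8, test = 0.2) AND
  CLASSIFY (predictors = [1,2,3,4], label = 5, algorithm = svm)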

3.2 Keywords Currently there are 8 Keywords in SML. These Keywords can be chained together to perform a variety of actions. In the subsequent subsections we describe the functionality of each Keyword.

3.2.1 Reading Datasets

When reading data with SML one must use the READ Keyword followed by an Argument containing a path to the dataset. READ also accepts a variety of Options. The first Query in Fig. 3 consists of only a Keyword and an Argument. This Query reads in data from "/path/to/dataset". The second Query includes an OptionValueList in addition to reading data from the specified path; the OptionValueList specifies that the dataset is delimited with semicolons and does not include a header row.

Fig. 3 Example using the READ Keyword in SML

Using Standard Machine Learning Language for Efficient Construction …

193

Fig. 4 An example utilizing the REPLACE Keyword in SML

Fig. 5 Example using the SPLIT Keyword in SML

3.2.2 Cleaning Data

When NaNs, NAs, and/or other troublesome values are present in the dataset we clean these values in SML by using the REPLACE Keyword. Figure 4 shows an example of the REPLACE Keyword being used. In this Query we use the REPLACE Keyword in conjunction with the READ Keyword. SML reads from a comma-delimited dataset with no header from the path "/path/to/dataset". Then we replace any instance of "NaN" with the mode of that column in the dataset.
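Figure 4 is not reproduced here; a Query of the kind it describes, reading a comma-delimited, headerless dataset and replacing "NaN" values with the column mode, might look like the following sketch (the option and argument names are assumptions for illustration):

  READ "/path/to/dataset" (sep = ",", header = None) AND
  REPLACE ("NaN", "mode")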

3.2.3 Partitioning Datasets

It is often useful to split a dataset into training and testing datasets for most tasks involving machine learning. This can be achieved in SML by using the SPLIT Keyword. Figure 5 shows an example of a SML Query performing an 80/20 split for training and testing data respectively by utilizing the SPLIT Keyword after reading in data.

3.2.4 Using Classification Algorithms

To use a classification algorithm in SML one would use the CLASSIFY Keyword.

3.2.5 Algorithms Implemented

SML currently has the following classification algorithms implemented: Support Vector Machines, Naive Bayes, Random Forest, Logistic Regression, and K-Nearest Neighbors. Figure 6 demonstrates how to use the CLASSIFY Keyword in a Query.


Fig. 6 Example using the CLASSIFY Keyword in SML. Here we read in data and create training and testing datasets using the READ and SPLIT Keywords respectively. We then use CLASSIFY Keyword with the first 4 columns as features and the 5th column to perform classification using a support vector machine

Fig. 7 Example using the CLUSTER Keyword in SML. Here we read in data and create training and testing datasets using the READ and SPLIT Keywords respectively. We then use CLUSTER Keyword with the first 7 columns as features and perform unsupervised clustering with the K-Means algorithm

Fig. 8 Example using the REGRESS Keyword in SML. Here we read in data and create training and testing datasets using the READ and SPLIT Keywords respectively. We then use REGRESS Keyword with the first 9 columns as features and the 10th column to perform regression on using ridge regression

3.2.6 Using Clustering Algorithms

Clustering algorithms can be invoked by using the CLUSTER Keyword. SML currently has K-Means clustering implemented. Figure 7 demonstrates how to use the CLUSTER Keyword in a Query.

3.2.7 Using Regression Algorithms

Regression algorithms use the REGRESS Keyword. SML currently has the following regression algorithms implemented: Simple Linear Regression, Ridge Regression, Lasso Regression, and Elastic Net Regression. Figure 8 demonstrates how to use the REGRESS Keyword in a Query.
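Figures 6, 7, and 8 are not reproduced in this text. Based on their captions, the three Queries would follow roughly the pattern below, one per task; the option names and exact syntax are assumptions for illustration:

  READ "/path/to/dataset" AND SPLIT (train = 0.8, test = 0.2) AND
      CLASSIFY (predictors = [1,2,3,4], label = 5, algorithm = svm)

  READ "/path/to/dataset" AND SPLIT (train = 0.8, test = 0.2) AND
      CLUSTER (predictors = [1,2,3,4,5,6,7], algorithm = kmeans)

  READ "/path/to/dataset" AND SPLIT (train = 0.8, test = 0.2) AND
      REGRESS (predictors = [1,2,3,4,5,6,7,8,9], label = 10, algorithm = ridge)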

3.2.8 Saving/Loading Models

It is possible to save models and reuse them later. To save a model in SML one would use the SAVE Keyword in a Query. To load an existing model from SML one would use the LOAD Keyword in a Query. Figure 9 shows the syntax required to save and load a model using SML. With any of the existing Queries using the REGRESS, CLUSTER, or CLASSIFY Keywords, attaching SAVE to the Query will save the model.

Fig. 9 Example using the LOAD and SAVE Keywords in SML

Fig. 10 Example using the PLOT Keyword in SML

3.2.9 Visualizing Datasets and Metrics of Algorithms

When using SML it is possible to visualize datasets or metrics of algorithms (such as learning curves or ROC curves). To do this the PLOT Keyword must be specified in a Query. Figure 10 shows an example of how to use the PLOT Keyword in a Query. We apply the same operations used to perform clustering in Fig. 7; however, we also utilize the PLOT Keyword.
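Figures 9 and 10 are not reproduced here. Combining the SAVE and PLOT Keywords with the clustering Query of Fig. 7 would look roughly as follows; again, the option names and the model path are assumptions for illustration:

  READ "/path/to/dataset" AND SPLIT (train = 0.8, test = 0.2) AND
      CLUSTER (predictors = [1,2,3,4,5,6,7], algorithm = kmeans) AND
      SAVE "/path/to/model" AND PLOT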

4 SML's Architecture With SML's grammar defined, enough information has been presented to dive into SML's architecture. When SML is given a Query in the form of a string, it is passed to the parser. The high-level implementation of the grammar is then used to parse through the string to determine the actions to perform. The actions are stored in a dictionary and given to one of the following phases of SML: Model Phase, Apply Phase, or Metrics Phase. Figure 11 shows a block diagram of this process. The model phase is generally for constructing a model. The Keywords that generally invoke the model phase are READ, REPLACE, CLASSIFY, REGRESS, CLUSTER, and SAVE. The apply phase is generally for applying a preexisting model to new data. The Keyword that generally invokes the apply phase is LOAD. It is often useful to visualize the data that one works with and beneficial to see performance metrics of a machine learning model. By default, if you specify the PLOT Keyword in a Query, SML will execute the metrics phase. The last significant component of SML's architecture is the connector. The connector connects drivers from different libraries and languages to achieve an action a user wants during a particular phase (see Fig. 12). If one considers applying linear regression to a dataset, during the model phase SML calls the connector to retrieve the linear regression library; in this case SML uses scikit-learn's implementation. However, if we wanted to use an algorithm not available in scikit-learn, such as a Hidden Markov Model (HMM), SML would use the connector to call another library that supports HMMs.

Fig. 11 Block diagram of SML's architecture

Fig. 12 Block diagram of SML's connector

5 Interface There are multiple interfaces available for working with SML. We have developed a web tool that is publicly available which allows users to write queries and get results back from SML through a web interface (see Fig. 13). There is also a REPL environment available that allows the user to interactively write queries and displays results from the appropriate phases of SML. Lastly, users have the option to import SML into an existing pipeline to simplify the development process of applying machine learning to problems.


Fig. 13 Interface of SML's website. Instructions and examples of how to use SML are on the left pane. In the middle pane users can type an SML Query and then hit the execute button. The results of running the Query through SML are then displayed on the right pane

6 Use Cases We tested SML's framework against ten popular machine learning problems with publicly available datasets. We applied SML to the following datasets: the Iris Dataset,1 Auto-MPG Dataset,2 Seeds Dataset,3 Computer Hardware Dataset,4 Boston Housing Dataset,5 Wine Recognition Dataset,6 US Census Dataset,7 Chronic Kidney Disease Dataset,8 and Spam Detection Dataset,9 which were taken from the UCI Machine Learning Repository [4]. We also applied SML to the Titanic Dataset.10 In this paper we discuss in detail the process of applying SML to the Iris Dataset and the Auto-MPG Dataset.

1 https://archive.ics.uci.edu/ml/datasets/Iris.
2 https://archive.ics.uci.edu/ml/datasets/Auto+MPG.
3 https://archive.ics.uci.edu/ml/datasets/seeds.
4 https://archive.ics.uci.edu/ml/datasets/Computer+Hardware.
5 https://archive.ics.uci.edu/ml/datasets/Housing.
6 https://archive.ics.uci.edu/ml/datasets/wine.
7 https://archive.ics.uci.edu/ml/datasets/US+Census+Data+(1990).
8 https://archive.ics.uci.edu/ml/datasets/ChronicKidneyDisease.
9 https://archive.ics.uci.edu/ml/datasets/Spambase.
10 https://www.kaggle.com/c/titanic.

6.1 Iris Dataset Figure 14 shows all the code required to perform classification on the Iris dataset using SML in Python. The Query in Fig. 14 reads data from a specified path to a file called "iris.csv" in a subdirectory called "data" of the parent directory, performs an 80/20 split, uses the first 4 columns to predict the 5th column, uses support vector machines as the algorithm to perform classification, and finally plots distributions of our dataset and metrics of our algorithm. The Query in Fig. 14 uses the same third-party libraries implicitly or explicitly. The complexities required to produce such results with and without SML are outlined. The result for both snippets of code is the same and can be seen in Fig. 15.

Fig. 14 SML Query that performs classification on the iris dataset using support vector machines. The purpose of this figure is to highlight the level of complexity relative to an SML query

Fig. 15 The SML Query in Fig. 14 produces these results. The subgraph on the left is a lattice plot showing the density estimates of each feature used. The graph on the right shows the ROC curves for each class of the iris dataset
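Figure 14 itself is not reproduced in this text; based on the description above, a Query of roughly this shape would be involved, with the option names and exact syntax being assumptions for illustration:

  READ "../data/iris.csv" AND
  SPLIT (train = 0.8, test = 0.2) AND
  CLASSIFY (predictors = [1,2,3,4], label = 5, algorithm = svm) AND
  PLOT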

6.2 Auto-MPG Dataset Figure 16 shows the SML Query required to perform regression on the Auto-MPG dataset in Python. In Fig. 16 we read data from a specified path; the dataset is separated by fixed-width spaces, and we choose not to provide a header for the dataset. Next, we perform an 80/20 split and replace all occurrences of "?" with the mode of the column. We then perform linear regression using columns 2–8 to predict the label. Lastly, we visualize distributions of our dataset and metrics of our algorithm. The outcome of both processes is the same and can be seen in Fig. 17.
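Figure 16 is likewise not reproduced here; from the description above, the corresponding Query would combine READ (with a fixed-width separator and no header), REPLACE, SPLIT, REGRESS, and PLOT along the following lines, with the option names, file path, and label column being assumptions for illustration:

  READ "/path/to/auto-mpg.data" (sep = "\s+", header = None) AND
  REPLACE ("?", "mode") AND
  SPLIT (train = 0.8, test = 0.2) AND
  REGRESS (predictors = [2,3,4,5,6,7,8], label = 1, algorithm = simple) AND
  PLOT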


Fig. 16 SML Query that performs regression on the Auto-MPG dataset using linear regression

Fig. 17 The SML Query in Fig. 16 produces these results. The subgraph on the left is a lattice plot showing the density estimates of each feature used. The top right graph shows the learning curve of the model and the graph on the lower right shows the validation curve

7 Discussion For the Iris and Auto-MPG use cases the same libraries and programming language were used to perform regression and classification. The amount of work required to perform a task and produce the results in Figs. 15 and 17 significantly decreases when SML is utilized. Constructing each SML Query used less than 10 lines of code; however, implementing the same procedures without SML using the same programming language and libraries needed 70+ lines of code [9]. This demonstrates that SML simplifies the development process of solving problems with machine learning and opens up the possibility of rapidly developing machine learning pipelines, which would be an attractive aspect for researchers [10].

8 Conclusion To summarize, we introduced an agnostic framework that integrates a query-like language to simplify the development of machine learning pipelines. We provided a high-level overview of its architecture and grammar. We then applied SML to machine learning problems and demonstrated how the complexity of the code one must write significantly decreases when SML is used. In the future we plan to extend the connector to support more machine learning libraries and additional languages. We also plan to expand the web application to make SML easier to use for a lay user. If we want researchers from other domain areas to utilize machine learning without understanding the complexities it involves, a tool like SML is needed. The concepts presented in this paper are sound. The details may change but the core principles will remain the same. Abstracting the complexities of machine learning from users is appealing because this will increase the use of machine learning by researchers in different disciplines.

References
1. AlBadani B, Shi R, Dong J (2022) A novel machine learning approach for sentiment analysis on twitter incorporating the universal language model fine-tuning and svm. Appl Syst Innov 5(1):13
2. Domingos P (2012) A few useful things to know about machine learning. 55:78–87, New York, NY, USA, ACM
3. Kaczmarek I, Iwaniak A, Swietlicka A, Piwowarczyk M, Nadolny A (2022) A machine learning approach for integration of spatial development plans based on natural language processing. Sustain Cities Soc 76:103479
4. Lichman M (2013) UCI machine learning repository
5. Olson RS, Bartley N, Urbanowicz RJ, Moore JH (2016) Evaluation of a tree-based pipeline optimization tool for automating data science. CoRR, abs/1603.06212
6. Rizzolo N, Roth D (2010) Learning based java for rapid development of NLP systems. In: LREC, Valletta, Malta, p 5
7. Roth D (2005) Learning based programming. Innovations in machine learning. Theory and applications, pp 73–95
8. Stoleru C-A, Dulf EH, Ciobanu L (2022) Automated detection of celiac disease using machine learning algorithms. Sci Rep 12(1):1–19
9. Wu X, Chen C, Li P, Zhong M, Wang J, Qian Q, Ding P, Yao J, Guo Y (2022) FTAP: feature transferring autonomous machine learning pipeline. Inf Sci 593:385–397
10. Yang Y, Li S, Zhang P (2022) Data-driven accident consequence assessment on urban gas pipeline network based on machine learning. Reliab Eng Syst Saf 219:108216

Machine Learning Approaches for Detecting Signs of Depression from Social Media Sarin Jickson, V. S. Anoop, and S. Asharaf

Abstract Depression is considered to be one of the most severe mental health issues globally; in many cases, depression may lead to suicide. According to a recent report by the World Health Organization (WHO), depression is a common illness worldwide and approximately 280 million people in the world are depressed. Timely identification of depression would be helpful to avoid suicides and save the life of an individual. Due to the widespread adoption of social network applications, people often express their mental state and concerns on such platforms. The COVID-19 pandemic has been a catalyst to this situation where the mobility and physical social connections of individuals have been limited. This caused more and more people to express their mental health concerns with such platforms. This work attempts to detect signs of depression from unstructured social media posts using machine learning techniques. Advanced deep learning approaches such as transformers are used for classifying social media posts that will help in the early detection of any signs of depression in individuals. The experimental results show that machine learning approaches may be efficiently used for detecting depression from user-generated unstructured social media posts. Keywords Depression detection · Machine learning · Social media · Transformers · Deep learning · Natural language processing · Computational social sciences

S. Jickson · V. S. Anoop Kerala Blockchain Academy, Kerala University of Digital Sciences, Innovation and Technology, Thiruvananthapuram, India e-mail: [email protected] S. Asharaf Kerala University of Digital Sciences, Innovation and Technology, Thiruvananthapuram, India e-mail: [email protected] V. S. Anoop (B) School of Digital Sciences, Kerala University of Digital Sciences, Innovation and Technology, Thiruvananthapuram, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 S. Anwar et al. (eds.), Proceedings of International Conference on Information Technology and Applications, Lecture Notes in Networks and Systems 614, https://doi.org/10.1007/978-981-19-9331-2_17


1 Introduction Depression is one of the most dreadful mental disorders that affect millions of people around the world. Recent statistics by the World Health Organization (WHO) report that the number of individuals with depression is increasing day by day [7]. To be more specific, it is estimated that approximately 280 million people around the world are affected by depression, 5.0% among adults and 5.7% among adults older than 60 years [7]. These rates are highly alarming considering the fact that in many cases, individuals commit suicide even though there are effective treatments available for mild, moderate, and severe depression [12]. The reduced mobility and limited or no social interactions imposed by the COVID-19 pandemic have fueled the rate of depression-related disorders around the globe [16]. People tended to stay at home due to the lockdown and other travel restrictions imposed by the government and other administrations, which affected the mental states of individuals in a negative manner that often led to depression in the vast majority. Depression usually does not last long, but for it to become a disorder, the symptoms of depression must persist for two weeks or more. So, the timely identification of depression is very important to bring the individual back to their normal life. The advancements in the internet and the competition between various internet service providers caused low-cost internet services to be a standard in many countries. This has not only increased the rate of internet penetration, but also accelerated the growth of internet-based applications and services such as social networks [1, 21]. According to some recent statistics (January 2022), more than half of the world's population uses social media, which amounts to approximately 4.62 billion people around the world [9]. It is also estimated that internet users worldwide spend an average of 2 h and 27 min per day on social media, and in the future, the amount of time spent on social networks will likely stay steady. So, connecting the statistics of the global depression reports and internet penetration, it is highly relevant that people use social media as a platform for sharing their opinions, anxieties, and other mental states. This has become a new normal due to the COVID-19 pandemic where physical meetings and social networking opportunities were limited. Social network analysis deals with collecting, organizing, and processing social media posts generated by users to unearth latent themes from them. This technique is proven to be efficient in understanding user intentions and patterns that may help the key stakeholders to take proactive decisions [12, 20]. As manual analysis of social media posts will be inefficient, considering the large number of messages, recent approaches use machine learning that can process a large amount of data with near-human accuracy. People with depression often post about the same along with the indications of several symptoms related to depression, and early detection will be possible by analyzing the same. Very recently, several studies incorporating machine learning approaches for depression detection from social media have been reported in the literature [13, 14, 23, 25] with varying degrees of success. The proposed work uses different machine learning algorithms (both shallow learning and deep learning) for identifying the severity of depression from social media posts represented as


unstructured text. This work also makes use of transformers for severity classification of posts into no depression, moderate, and severe. The main contributions of this paper are summarized as follows: (a) Discusses depression—one of the most challenging mental disorders and the role of social media in depression detection. (b) Implements different machine learning algorithms in both shallow and deep learning categories and compares the performance. (c) Reports the classification performance of different algorithms in classifying the severity of depression from social media posts. The remainder of this paper is organized as follows. Section 2 discusses some of the recent related works that are reported in the literature that uses machine learning approaches for depression detection. In Sect. 3, the authors present the proposed approach for classifying the severity of depression-related social media posts. Section 4 details the experiment conducted and the dataset used, and in Sect. 5, the results are presented and discussed in a detailed fashion. Section 6 concludes this work.

2 Related Studies Depression detection has garnered a lot of interest among social network and healthcare researchers in recent times. There are several approaches reported in the recent past that attempt to detect depression from social media posts, specifically from unstructured text. This section discusses some of the recent and prominent approaches reported in the machine learning and social network analysis literature that is highly related to the proposed approach. Xiaohui Tao et al. developed a prototype to illustrate the approach’s mechanism and any potential social effects using a depressive sentiment vocabulary [19]. They compared the data with this vocabulary and classified the social media posts. Mandar Deshpande et al. uses natural language processing to classify Twitter data and used SVM and Naive Bayes algorithms for the classification of depression-related posts [6]. Guangyao Shen et al. proposed a method that uses a multi-modal dictionary learning solution [18]. Faisal Muhammad Shah et al. developed a method that uses a hybrid model that can detect depression by analyzing user’s textual posts [17]. Deep learning algorithms were trained using the training data and then performance has been evaluated on the Reddit data which was published for the pilot piece of work. In particular, the authors have proposed a Bidirectional Long Short-Term Memory (BiLSTM) with different word embedding techniques and metadata features that gave comparatively better performance. Chiong et al. proposed a textual-based featuring approach for depression detection from social media using machine learning classifiers [3]. They have used two publicly available labeled datasets to train and test the machine learning models and other three non-twitter datasets for evaluating the performance of their proposed model. The experimental results showed that their proposed approach effectively


detected depression from social media data. Zogan et al. developed DepressionNet, a depression detection model for social media with user post summarization and multi-modalities [24]. They have proposed a novel framework for extractive and abstractive post summarization and used deep learning algorithms such as CNN and GRU for classification. Titla-Tlatelpa et al. proposed an approach for depression detection from social media using a profile-based sentiment-aware approach [5]. This approach explored the use of the user's characteristics and the expressed sentiments in the messages as context insights. The authors have proposed a new approach for the classification of the profiles of the users, and their experiment on the benchmark datasets showed better results. Another approach that uses sentiment lexicons and content-based features was reported by Chiong et al. They have proposed 90 unique features as input to the machine learning classifier framework for depression detection from social media. Their approach resulted in more than 96% accuracy for all the classifiers, with the highest being 98% with the gradient boosting algorithm [4]. Lara et al. presented DeepBoSE, a deep bag of sub-emotions for depression detection from social media [11]. The proposed approach computed a bag-of-features representation that uses emotional information and is further trained on the transfer learning paradigm. The authors have performed their experiments on the eRisk17 and eRisk18 datasets for the depression detection task and it could score a better f1-score for both. An approach for early detection of stress and depression from social media using a mental state knowledge-aware and contrastive network was reported by Yang et al. The authors have tested the proposed methods on a depression detection dataset Depression-Mixed with 3165 Reddit and blog posts, a stress detection dataset Dreaddit with 3553 Reddit posts, and a stress factors recognition dataset SAD with 6850 SMS-like messages. Their proposed approach achieved new state-of-the-art results on all the datasets used [23]. Angskun et al. presented a big data analytics approach for the real-time detection of depression on social media [2]. They have used Twitter data collected for a period of two months and implemented machine learning algorithms including deep learning approaches. They have reported that the Random Forest classifier algorithm showcased better results and their model could capture depressive moods of depression sufferers. This proposed work implements different machine learning algorithms including transformers to classify the severity of depression from online social media posts. The comparison results for all the machine learning approaches are also reported on publicly available depression severity labeled datasets.

3 Proposed Approach This section discusses the proposed approach for depression severity classification from social media text using machine learning approaches. The overall workflow of the proposed approach is shown in Fig. 1. The first step deals with the preprocessing of social media posts such as data cleaning and normalization. As social media posts are user-generated, they may contain noise and unwanted content such as URLs, special characters, and emojis. As these elements may not convey any useful features, they should be removed from the dataset. Then the normalization techniques are applied that will change the values of numerical columns in the dataset to a common scale without losing information. This step is crucial for improving the performance and training stability of the model. In the feature extraction stage, various features relevant to training the machine learning classifiers, such as counts of words/phrases, term frequency-inverse document frequency (TF-IDF), and representations from pre-trained embedding models such as BERT (Bidirectional Encoder Representations from Transformers), will be extracted. The features collected will be used for training machine learning algorithms such as shallow learning (SVM, Logistic Regression, Naive Bayes, etc.) and deep learning (ANN, CNN, and Transformers). After the feature extraction stage, the dataset was split into train and test sets, which are used for training and testing the model, respectively. In our case, the dataset contains three labels: not depression, moderate, and severe depression, and the datapoints were imbalanced. Techniques such as down-sampling and up-sampling were performed to create a balanced, experiment-ready copy of the final dataset. The proposed approach implemented the following shallow-learning algorithms: Logistic Regression (LR): Logistic regression (LR) is a technique that employs a set of continuous, discrete, or a combination of both types of characteristics, as well as a binary goal. This approach is popular since it is simple to apply and produces decent results.

Fig. 1 Overall workflow of the proposed approach


Support Vector Machines (SVM): The SVM has a distinct edge when it comes to tackling classification jobs that need high generalization. The strategy aims to reduce mistakes by focusing on structural risk. This technique is widely utilized in medical diagnostics. Multinomial Naïve Bayes (NB): This is a popular classification algorithm used for the analysis of categorical text data. The algorithm is based on the Bayes theorem and predicts the tag of a text by computing the probability of each tag for a given sample and then gives the tag with the highest probability as output. Random Forest (RF): The Random Forest approach has been used to examine the drug dataset in several studies. Having the ability to analyze facts and make an educated guess, the dataset employed in this study is balanced in nature. Because random forest separates data into branches to form a tree, we infer that random forest cannot be utilized to provide prognostic options to address unbalanced issues. The proposed approach implements the following neural network/deep learning algorithms (with pre-trained embeddings) on the social network dataset. Artificial Neural Network (ANN): An artificial neural network is a group of nodes that are interconnected and inspired by how the human brain works. ANN tries to find the relationship between features in a dataset and classifies them according to a specific architecture. Convolutional Neural Networks (CNN): The CNN architecture for classification includes convolutional layers, max-pooling layers, and fully connected layers. Convolution and max-pooling layers are used for feature extraction. While convolution layers are meant for feature detection, max-pooling layers are meant for feature selection. Transformers: They are a class of deep neural network models. A transformer is a deep learning model extensively used in natural language processing applications that adopts the mechanism of self-attention, differentially weighting the significance of each part of the input data.
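To make the shallow-learning setup described above concrete, the following is a minimal sketch (not the authors' exact code) of training and evaluating the four classifiers on TF-IDF features with scikit-learn; the toy posts and labels are placeholders standing in for the dataset described in Sect. 4.1:

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.svm import LinearSVC
from sklearn.naive_bayes import MultinomialNB
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

# Placeholder posts and labels; in the actual study these come from the
# LT-EDI-2022 shared-task dataset (Sect. 4.1).
texts = [
    "i feel empty and i cannot sleep anymore",
    "everything feels pointless these days",
    "i just want this pain to stop for good",
    "had a lovely walk in the park with my dog",
    "excited about starting my new job next week",
    "made pancakes for breakfast, great morning",
    "some days are harder than others but i manage",
    "feeling a bit low after a stressful week",
    "not sure i can keep doing this much longer",
]
labels = [
    "moderate", "moderate", "severe",
    "not depression", "not depression", "not depression",
    "moderate", "moderate", "severe",
]

X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=3, stratify=labels, random_state=42)

# TF-IDF features, one of the feature types listed in Sect. 3
vectorizer = TfidfVectorizer(lowercase=True, stop_words="english")
X_train_vec = vectorizer.fit_transform(X_train)
X_test_vec = vectorizer.transform(X_test)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Support Vector Machine": LinearSVC(),
    "Multinomial Naive Bayes": MultinomialNB(),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=42),
}
for name, model in models.items():
    model.fit(X_train_vec, y_train)
    print(name)
    print(classification_report(y_test, model.predict(X_test_vec), zero_division=0))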

4 Experiment This section discusses the experiment conducted using the proposed approach discussed in Sect. 3. A detailed explanation of the dataset used and the experimental testbeds is provided in this section.


Table 1 A snapshot of the dataset used

Posting ID       Text                                                                                                             Label
train_pid_8231   Words can't describe how bad I feel right now: I just want to fall asleep forever                                Severe
train_pid_1675   I just tried to cut myself and couldn't do it. I need someone to talk to                                         Moderate
train_pid_6982   Didn't think I would have lived this long to see 2020: Don't even know if this is considered an accomplishment   Not depression

4.1 Dataset This experiment uses a publicly available dataset as part of the shared task on Detecting Signs of Depression from Social Media Text as part of the Second Workshop on Language Technology for Equality, Diversity, and Inclusion (LT-EDI-2022) at ACL 2022 [10]. The dataset consists of training, development, and test sets, and the files are in tab-separated format with three columns: Posting ID, Text, and Label. A snapshot of the dataset is shown in Table 1. In the dataset, Not Depression represents that the user doesn't show a sign of depression in his social media texts, the Moderate label denotes that the user shows some signs of depression, and Severe represents that the user shows clear signs of depression through his social media texts. The dataset contains a total of 13,387 data points, out of which 3801 belong to Not Depression, 8325 belong to Moderate, and 1261 belong to Severe classes.
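Because the class distribution is skewed toward the Moderate class, the down-sampling step mentioned in Sects. 3 and 4.2 can be sketched as follows; the file name and column names here are assumptions, chosen to match the tab-separated Posting ID, Text, and Label layout of Table 1:

import pandas as pd

# File name assumed; the shared-task training file is tab-separated with
# Posting ID, Text, and Label columns (Table 1).
df = pd.read_csv("train.tsv", sep="\t", names=["pid", "text", "label"])

# Down-sample every class to the size of the smallest one (Severe, 1261 posts)
min_count = df["label"].value_counts().min()
balanced = (
    df.groupby("label", group_keys=False)
      .apply(lambda g: g.sample(n=min_count, random_state=42))
      .reset_index(drop=True)
)
print(balanced["label"].value_counts())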

4.2 Experimental Setup This section describes the experimental setup we have used for our proposed approach. All the methods described in this paper were implemented in Python 3.8. The experiments were run on a server configured with an Intel Core i5-10300H CPU @ 2.50 GHz processor and 8 GB of main memory. Firstly, the dataset was pre-processed to remove the stop words, URLs, and other special characters. The emoji library available at https://pypi.org/project/demoji/ was used for converting emojis into textual forms. We have also used an English contractions list for better text enhancement. During the analysis, we found that the dataset is unbalanced, and to balance the dataset, we performed down-sampling or up-sampling. The feature engineering stage deals with computing different features such as count vectorization, word embeddings, and TF-IDF, after word tokenization. This work used texts to sequences in Keras (https://www.tensorflow.org/api_docs/python/tf/keras/preprocessing/text/Tokenizer) and BERT tokenization from https://huggingface.co/docs/transformers/main_classes/tokenizer for the transformers. Once the experiment-ready dataset is obtained, we split the dataset into train and test splits and then used the algorithms listed in Sect. 3.
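The two tokenization routes referenced above can be sketched as follows; the vocabulary size, sequence length, and the bert-base-cased checkpoint are illustrative choices rather than the exact settings used in the experiments:

from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences
from transformers import AutoTokenizer

posts = [
    "words can't describe how bad i feel right now",
    "had a lovely walk in the park today",
]

# Keras route: integer sequences for the ANN/CNN models
keras_tok = Tokenizer(num_words=20000, oov_token="<unk>")
keras_tok.fit_on_texts(posts)
padded = pad_sequences(keras_tok.texts_to_sequences(posts), maxlen=64)

# Hugging Face route: WordPiece encodings for the BERT-based transformer models
bert_tok = AutoTokenizer.from_pretrained("bert-base-cased")
encoded = bert_tok(posts, padding=True, truncation=True, max_length=64, return_tensors="np")

print(padded.shape, encoded["input_ids"].shape)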

5 Results and Discussions This section details the results obtained from the experimental setup explained in Sect. 4 that implemented the proposed approach discussed in Sect. 3. Different machine learning algorithms were implemented on the dataset mentioned in Sect. 4.1 and the results are compared. Tables 2, 3, 4, and 5 show the precision, recall, and F1-score for the Logistic Regression, Support Vector Machine, Naive Bayes, and Random Forest classification algorithms. For the Logistic Regression algorithm, the precision, recall, and F1-score for the Not Depression class are found to be 87%, 89%, and 88%, respectively, and for the Moderate class, the values were found to be 90%, 84%, and 87%. The Severe class has recorded a precision of 95%, a recall of 99%, and an F1-score of 97%. For the Support Vector Machine (SVM), the Not Depression class has attained a precision of 85%, a recall of 89%, and an F1-score of 87%, and for the Moderate class, the values are 89%, 82%, and 85%. The Severe class has recorded 95%, 98%, and 97% for precision, recall, and F1-score, and the SVM algorithm has shown a weighted average of 90% for all three classes. The Naive Bayes classifier has recorded 88% precision, 81% recall, and 85% F1-score for the Not Depression class, and 86%, 83%, and 84% for the precision, recall, and F1-score of the Moderate class. For the Severe class, the recorded values were 89%, 99%, and 93% for precision, recall, and F1-score, respectively. For the Not Depression class, the Random Forest classification algorithm has scored 82%, 86%, and 84% for precision, recall, and F1-score, respectively, and for the Moderate class, the values were found to be 84%, 82%, and 83%, respectively. The Severe class has recorded a precision of 96%, a recall of 93%, and an F1-score of 95%. Graphs representing the precision, recall, and F1-score comparison for Logistic Regression, Support Vector Machine, Naive Bayes, and Random Forest are shown in Fig. 2. The classification reports for the Transformer with BERT-Base-Cased and BERT-Base-Uncased models are shown in Table 6 and Table 7, respectively. For BERT-Base-Cased, the Not Depression class recorded a precision of 87%, a recall of 83%, and an F1-score of 85%, and for the Moderate class, the corresponding values were 77%, 87%, and 81%, respectively. For the Severe class, the precision, recall, and F1-score were all 87%. On the other hand, the BERT-Base-Uncased model performed poorly and recorded a precision of 73%, a recall of 67%, and an F1-score of 70% for the Not Depression class. The Moderate and Severe classes attained precision, recall, and F1-score values of 77%, 87%, and 81%, and 80%, 71%, and 76%, respectively. Figure 3 shows the precision, recall, and accuracy comparison for the transformer model with BERT-Base-Cased and BERT-Base-Uncased pre-trained embeddings.

Table 2 Classification report for the logistic regression algorithm

                   Precision   Recall   F1-score
Not Depression     0.87        0.89     0.88
Moderate           0.90        0.84     0.87
Severe             0.95        0.99     0.97
Accuracy                                0.91
Macro average      0.91        0.91     0.91
Weighted average   0.91        0.91     0.91

Table 3 Classification report for the support vector machine algorithm

                   Precision   Recall   F1-score
Not Depression     0.85        0.89     0.87
Moderate           0.89        0.82     0.85
Severe             0.95        0.98     0.97
Accuracy                                0.90
Macro average      0.90        0.90     0.90
Weighted average   0.90        0.90     0.90

Table 4 Classification report for the naïve bayes algorithm

                   Precision   Recall   F1-score
Not Depression     0.88        0.81     0.85
Moderate           0.86        0.83     0.84
Severe             0.89        0.99     0.93
Accuracy                                0.87
Macro average      0.87        0.88     0.87
Weighted average   0.87        0.87     0.87

Table 5 Classification report for the random forest algorithm

                   Precision   Recall   F1-score
Not Depression     0.82        0.86     0.84
Moderate           0.84        0.82     0.83
Severe             0.96        0.93     0.95
Accuracy                                0.87
Macro average      0.87        0.87     0.87
Weighted average   0.87        0.87     0.87
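For reference, the per-class precision, recall, and F1-score values and the accuracy, macro-average, and weighted-average rows reported in Tables 2-9 follow the standard definitions and can be reproduced from a model's predictions with scikit-learn, as in this small sketch (the label vectors here are placeholders, not the study's outputs):

from sklearn.metrics import classification_report

y_true = ["not depression", "moderate", "severe", "moderate", "severe", "not depression"]
y_pred = ["not depression", "moderate", "moderate", "moderate", "severe", "not depression"]

# Prints per-class precision/recall/F1 plus accuracy, macro and weighted averages
print(classification_report(y_true, y_pred, digits=2))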

Fig. 2 The precision, recall, and F1-score comparison for the shallow-learning algorithms: (a) Logistic Regression, (b) Support Vector Machine, (c) Naive Bayes, (d) Random Forest

Table 6 Classification report for transformer with BERT-Base-Cased model

                   Precision   Recall   F1-score
Not Depression     0.87        0.83     0.85
Moderate           0.77        0.87     0.81
Severe             0.87        0.87     0.87
Accuracy                                0.84
Macro average      0.85        0.84     0.85
Weighted average   0.85        0.84     0.85

Table 7 Classification report for transformer with BERT-Base-Uncased model

                   Precision   Recall   F1-score
Not Depression     0.73        0.67     0.70
Moderate           0.77        0.87     0.81
Severe             0.80        0.71     0.76
Accuracy                                0.81
Macro average      0.79        0.76     0.77
Weighted average   0.80        0.81     0.80

Fig. 3 Precision, Recall, and Accuracy comparison for the transformer with (a) BERT-Base-Cased and (b) BERT-Base-Uncased models

The classification reports for the Artificial Neural Network (ANN) and the Convolutional Neural Network (CNN) are shown in Table 8 and Table 9, respectively. The ANN has scored a precision of 82%, a recall of 80%, and an F1-score of 81% for the Not Depression class, and a precision of 74%, a recall of 81%, and an F1-score of 77% for the Moderate class. The Severe class has scored 89% for precision, 82% for recall, and 86% for F1-score with the ANN. The Convolutional Neural Network has attained 88% precision, 79% recall, and 83% F1-score for the Not Depression class, 69% precision, 90% recall, and 78% F1-score for the Moderate class, and 92% precision, 73% recall, and 81% F1-score for the Severe class. Figure 4 shows the precision, recall, and accuracy comparison of the ANN and CNN models.

Table 8 Classification report for artificial neural network model

                   Precision   Recall   F1-score
Not Depression     0.82        0.80     0.81
Moderate           0.74        0.81     0.77
Severe             0.89        0.82     0.86
Accuracy                                0.81
Macro average      0.82        0.81     0.81
Weighted average   0.82        0.81     0.81

Table 9 Classification report for convolutional neural network

                   Precision   Recall   F1-score
Not Depression     0.88        0.79     0.83
Moderate           0.69        0.90     0.78
Severe             0.92        0.73     0.81
Accuracy                                0.80
Macro average      0.83        0.80     0.81
Weighted average   0.83        0.80     0.81

Fig. 4 Precision, Recall, and Accuracy comparison for the (a) ANN and (b) CNN models

Table 10 Summary of the precision, recall, and accuracy of all the models

Model                              Precision   Recall   F1-score   Accuracy
Logistic Regression                0.91        0.91     0.91       0.91
Support Vector Machine             0.90        0.90     0.90       0.90
Naïve Bayes                        0.87        0.87     0.87       0.87
Random Forest                      0.87        0.87     0.87       0.87
Transformer (BERT-Base-Cased)      0.85        0.84     0.85       0.84
Transformer (BERT-Base-Uncased)    0.80        0.81     0.80       0.81
Artificial Neural Network          0.82        0.81     0.81       0.81
Convolutional Neural Network       0.83        0.80     0.81       0.80

The precision, recall, and f1-score value comparison for all the models considered in the proposed approach is given in Table 10 and the corresponding graphical comparison is shown in Fig. 5. From Table 10 and Fig. 5, it is evident that for the considered dataset, shallow machine learning approaches showcased better precision, recall, and f1-score, but the results for the Transformer models also look promising. This indicates that more analysis should be done using transformers with other pre-trained models to confirm the potential for better classification.

Fig. 5 The precision, recall, f1-score, and accuracy summary for all the models

6 Conclusions Depression, one of the most severe mental disorders, should be identified during its initial stages to give proper medical attention to any individual. The number of people who share their mental states on online social media has grown exponentially due to several factors such as limited mobility and social activities during recent times. So, it is highly evident that machine learning approaches need to be developed and implemented for the early detection of depression-related information. This work attempted to implement different machine learning algorithms to classify the severity of depression-related social media posts. The experimental results show that machine learning may be highly useful in identifying the signs of depression from social media. As the initial results look promising, the authors may continue implementing more machine learning algorithms for depression detection and analysis in the future.

References 1. Aggarwal K, Singh SK, Chopra M, Kumar S (2022) Role of social media in the COVID-19 pandemic: A literature review. Data Min Approaches Big Data Sentim Anal Soc Media, 91–115 2. Angskun J, Tipprasert S, Angskun T (2022) Big data analytics on social networks for real-time depression detection. J Big Data 9(1):1–15 3. Chiong R, Budhi GS, Dhakal S (2021) Combining sentiment lexicons and content-based features for depression detection. IEEE Intell Syst 36(6):99–105 4. Chiong R, Budhi GS, Dhakal S, Chiong F (2021) A textual-based featuring approach for depression detection using machine learning classifiers and social media texts. Comput Biol Med 135:104499 5. de Jesús Titla-Tlatelpa J, Ortega-Mendoza RM, Montes-y-Gómez M, Villaseñor-Pineda L (2021) A profile-based sentiment-aware approach for depression detection in social media. EPJ Data Sci 10(1):54 6. Deshpande M, Rao V (2017) Depression detection using emotion artificial intelligence. In 2017 International Conference on Intelligent Sustainable Systems (ICISS), IEEE, pp 858–862 7. Evans-Lacko S, Aguilar-Gaxiola S, Al-Hamzawi A, Alonso J, Benjet C, Bruffaerts R, Thornicroft G (2018) Socio-economic variations in the mental health treatment gap for people with anxiety, mood, and substance use disorders: results from the WHO World Mental Health (WMH) surveys. Psychol Med 48(9):1560–1571


8. Funk M (2012) Global burden of mental disorders and the need for a comprehensive, coordinated response from health and social sectors at the country level 9. Hall JA, Liu D (2022) Social media use, social displacement, and well-being. Curr Opin Psychol, 101339 10. Kayalvizhi S, Durairaj T, Chakravarthi BR (2022) Findings of the shared task on detecting signs of depression from social media. In Proceedings of the Second Workshop on Language Technology for Equality, Diversity and Inclusion, pp 331–338 11. Lara JS, Aragón ME, González FA, Montes-y-Gómez M (2021) Deep bag-of-sub-emotions for depression detection in social media. In International Conference on Text, Speech, and Dialogue, pp 60–72. Springer, Cham 12. Lekshmi S, Anoop VS (2022) Sentiment analysis on COVID-19 news videos using machine learning techniques. In Proceedings of International Conference on Frontiers in Computing and Systems, pp. 551–560. Springer, Singapore 13. Liu D, Feng XL, Ahmed F, Shahid M, Guo J (2022) Detecting and measuring depression on social media using a machine learning approach: systematic review. JMIR Ment Health, 9(3), e27244 14. Ortega-Mendoza RM, Hernández-Farías DI, Montes-y-Gómez M, Villaseñor-Pineda L (2022) Revealing traces of depression through personal statements analysis in social media. Artif Intell Med 123:102202 15. Ren L, Lin H, Xu B, Zhang S, Yang L, Sun S (2021) Depression detection on reddit with an emotion-based attention network: algorithm development and validation. JMIR Med Inform 9(7):e28754 16. Renaud-Charest O, Lui LM, Eskander S, Ceban F, Ho R, Di Vincenzo JD, McIntyre RS (2021) Onset and frequency of depression in post-COVID-19 syndrome: A systematic review. J Psychiatr Res 144:129–137 17. Shah FM, Ahmed F, Joy SKS, Ahmed S, Sadek S, Shil R, Kabir MH (2020) Early depression detection from social network using deep learning techniques. In 2020 IEEE Region 10 Symposium (TENSYMP), IEEE. pp 823–826 18. Shen G, Jia J, Nie L, Feng F, Zhang C, Hu T, Zhu W (2017). Depression detection via harvesting social media: A multimodal dictionary learning solution. In IJCAI (pp.3838–3844) 19. Tao X, Zhou X, Zhang J, Yong J (2016) Sentiment analysis for depression detection on social networks. In International Conference on Advanced Data Mining and Applications, pp 807– 810. Springer, Cham 20. Varghese M, Anoop VS (2022). Deep learning-based sentiment analysis on COVID-19 News Videos. In Proceedings of International Conference on Information Technology and Applications, pp 229–238. Springer, Singapore 21. Xiong F, Zang L, Gao Y (2022) Internet penetration as national innovation capacity: worldwide evidence on the impact of ICTs on innovation development. Inf Technol Dev 28(1):39–55 22. Yang K, Zhang T, Ananiadou S (2022) A mental state Knowledge–aware and Contrastive Network for early stress and depression detection on social media. Inf Process Manage 59(4):102961 23. Yang K, Zhang T, Ananiadou S (2022) A mental state Knowledge–aware and Contrastive Network for early stress and depression detection on social media. Inf Process Manag 59(4):102961 24. Zogan H, Razzak I, Jameel S, Xu G (2021) Depressionnet: learning multi-modalities with user post summarization for depression detection on social media. In Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp 133–142 25. Zogan H, Razzak I, Wang X, Jameel S, Xu G (2022) Explainable depression detection with multi-aspect features using a hybrid deep learning model on social media. 
World Wide Web 25(1):281–304

Extremist Views Detection: Definition, Annotated Corpus, and Baseline Results Muhammad Anwar Hussain, Khurram Shahzad, and Sarina Sulaiman

Abstract Extremist view detection in social networks is an emerging area of research. Several attempts have been made at extremist view detection on social media. However, there is a scarcity of publicly available annotated corpora that can be used for learning and prediction. Also, there is no consensus on what should be recognized as an extremist view. In the absence of such a description, the accurate annotation of extremist views becomes a formidable task. To that end, this study has made three key contributions. Firstly, we have developed a clear understanding of extremist views by synthesizing their definitions and descriptions in the academic literature, as well as in practice. Secondly, a benchmark extremist view detection corpus (XtremeView-22) is developed. Finally, baseline experiments are performed using six machine learning techniques to evaluate their effectiveness for extremist view detection. The results show that bigrams are the most effective feature and Naive Bayes is the most effective technique to identify extremist views in social media text. Keywords Extremism · Extremist view detection · Machine learning · Classification · Social media listening · Twitter

1 Introduction Rising evidence has revealed that social media play a crucial role in unrest creation activities [24]. Researchers and policymakers have also reached a broad agreement on the link between social media use and the active role of extremist organizations, M. A. Hussain (B) · S. Sulaiman Department of Computer Science, University of Technology Malaysia, Johor Bahru, Malaysia e-mail: [email protected] S. Sulaiman e-mail: [email protected] K. Shahzad Department of Data Science, University of the Punjab, Lahore, Pakistan e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 S. Anwar et al. (eds.), Proceedings of International Conference on Information Technology and Applications, Lecture Notes in Networks and Systems 614, https://doi.org/10.1007/978-981-19-9331-2_18


such as the Islamic State of Iraq and Al-Sham (ISIS) [9, 10, 27]. A key reason is that social media platforms, such as Twitter, provide unrestricted access where individuals, interest groups, and organizations can engage in the discussions of their choice, including extremist discussions and recruitment, without fear of repercussions. Furthermore, social media content has the potential to reach millions of users in a short span of time. It is widely recognized that extremist groups use social media for spreading their ideology, fundraising, recruiting, attracting innocent young people, and using them for their cynical causes. For instance, the growth of the Islamic State in Iraq and Syria (ISIS) to tens of thousands of people has been partly attributed to its increased use of social media for propaganda and recruiting purposes. Recognizing the challenge, online extremism, propaganda proliferation, and radicalization detection in social media have received attention from researchers during the last decade [8, 13, 18]. Developing automated techniques for the detection of extremist viewpoints is a challenging undertaking because there are differences in understanding how the notion of extremism should be described. This implies that, depending on the definition of extremist views, some communication may be judged as extremist by one segment of society and not by others. To that end, this study has made the following key contributions. Conceptualized extremism. We have gathered the existing definitions and descriptions of extremism from diverse sources, including popular dictionaries, academic literature, as well as the descriptions of regulatory bodies. Subsequently, these details are synthesized to clearly conceptualize the notion of extremism. To the best of our knowledge, this is the first-ever attempt to develop a clear understanding of the notion of extremism before developing any corpus. Development of extremism detection corpus. A literature search is performed to identify existing detection benchmark corpora. The identified corpora are examined and the research gap is established. Subsequently, we have developed an extremist views detection (XtremeView-22) corpus based on the developed understanding. The corpus is readily available for extremist views detection in social media. Evaluation of supervised learning techniques. Finally, baseline experiments are performed to evaluate the effectiveness of machine learning techniques for extremist view detection in social media. The baseline results and the generated corpus will be useful for fostering research on extremism detection in social media. The rest of this study is outlined as follows. Section 2 discusses the definitions and descriptions of extremist views in the literature. Section 3 presents an overview of the existing corpora and the details of our newly developed XtremeView-22 corpus. Section 4 presents the experimental setup and the baseline results of the experiments. Finally, Sect. 5 concludes the paper.


2 Conceptualizing Extremism

There are multiple definitions and descriptions of the term extremism. However, there is no widely accepted academic definition, nor is there a global description of the term [23]. Therefore, to conceptualize the term extremism, this study has used three types of sources for collecting descriptions of the term: glossaries or dictionaries, scientific literature, and real-world practice as presented in the policies and regulations of governments. The details of all three types of sources are presented below.

2.1 Extremism in Dictionaries

As a starting point, we have identified the definitions of extremism as presented in established dictionaries, including printed dictionaries of the English language, online dictionaries, and encyclopedias. In particular, the notable glossaries used in the study are the Oxford Advanced American Dictionary, the Oxford English Dictionary, The Oxford Essential Dictionary of the U.S. Military, and the Oxford Learner's Dictionary of Academic English. Table 1 presents the definitions of extremism as given in these sources. It can be observed from the table that most dictionaries define it as a noun and that a majority of the definitions focus on political and religious views to refer to extremism. In contrast, other key facets, such as economic or social views, are not mentioned in any dictionary definition. Furthermore, some dictionaries present a brief and high-level definition of the term extremism, whereas others are more specific. Besides being specific, these dictionaries present a broader scope of the notion of extremism by including views, conspiracies, actions, and measures of an extreme nature.

Table 1 Definitions and descriptions of extremism in dictionaries

Refs. | Definition
[6] | "Extremism as a noun is the political, religious, etc., ideas or actions that are extreme and not normal, reasonable, or acceptable to most people"
[5] | Oxford Learner's Dictionary of Academic English defines extremism as a noun that is "the holding of extreme political or religious views; fanaticism"
[19] | "A person who holds extreme political or religious views, especially one who advocates illegal, violent, or other extreme action"
[7] | "Supporting beliefs that are extreme"
[15] | Chiefly derogatory: "a person who holds extreme or fanatical political or religious views, especially one who resorts to or advocates extreme action: political extremists and extremist conspiracy"


Table 2 Descriptions of extremism in the scientific literature

Refs. | Description
[25] | "Extremism in religion is studied extensively and has led to associate it with a particular religion"
[11] | "Extremism usually refers to the ideology that may be religious or political, that is unacceptable to the general perception of the society"
[28] | The ideology of extremism is an ideology of intolerance toward enemies, justifying their suppression, assuming the existence of dissident citizens, and recognizing only its own monopoly on the truth, regardless of legal attitudes (therefore, extremist activity is almost always an unconstitutional activity)
[20] | "An ideological movement, contrary to the democratic and ethical values of a society, that uses different methods, including violence (physical or verbal) to achieve its objectives"
[4] | Extremism is "the quality or state of being extreme"
[14] | "Violent extremism refers to the action through which one adopts political, social, and religious ideation that leads to the initiation of violent acts"
[3] | "Extremism is also defined as a set of activities (beliefs, attitudes, feelings, actions, strategies) of a character far removed from the ordinary"
[16] | "Language which attacks or demeans a group based on race, ethnic origin, religion, disability, gender, age, disability, or sexual orientation/gender identity"
[26] | Online extremism "as Internet activism that is related to, engaged in, or perpetrated by groups or individuals that hold views considered to be doctrinally extremist"

2.2 Extremism in the Scientific Literature

This study has performed a comprehensive search of academic literature in the quest for understanding extremism from a scientific perspective. Table 2 presents the notable studies that have attempted to describe extremism. It can be observed that, similar to the dictionary definitions, most of the scientific literature has associated extremism with religious and political ideology. However, in contrast to the dictionary definitions, a few scientific studies have also included the ethical and social values of society in the scope of extremism. These studies have also emphasized intolerance and the use of violence in defining the notion of extremism. A few other studies have defined extremism as language that attacks or demeans a group based on its characteristics, or as statements that convey a message of intolerant ideology toward an out-group, immigrants or enemies.

2.3 Extremism in Practice

The third type of source considered for conceptualizing extremism is the descriptions used by government agencies and regulatory bodies to combat extremism. Table 3 presents a summary of the descriptions found in these sources. It can be observed from the table that, in essence, the constituents


Table 3 Descriptions of extremism in practice

Refs. | Description
[17] | The EU Code of Conduct (agreed between the EU and IT companies) covers "all conduct publicly inciting to violence or hatred directed against a group of persons or a member of such a group defined by reference to race, color, religion, descent or national or ethnic origin"
[12] | The UK Government characterizes extremism as "opposition to fundamental values, including democracy, the rule of law, individual liberty, and respect and tolerance for different faiths and beliefs"

of extremism are inciting violence or hatred against an individual or community on the basis of race, color, religion, and national or ethnic affiliation. Another notable observation is that the concept of extremism is mostly discussed in association with liberalism, freedom, and the fundamental values of society. More specifically, extremism is active, vocal opposition to fundamental values, tolerance, and respect for different beliefs and faiths. It is hostile to liberal norms such as democracy, freedom, gender equality, human rights, and freedom of expression, and it promotes discrimination, sectarianism, and the segregation of individuals or groups. The EU, UK, and US have their own counter-extremism strategies to combat this threat and ensure the security of their citizens, and the UN has also developed global counter-terrorism policies for its member states.

In summary, although there are several differences between the three types of sources discussed above, there are also some commonalities in defining extremism. For instance, extremism encompasses political, social or religious views. That is, all stakeholders agree that hateful behavior targeted at an individual on the basis of religion, race, color, nationality, freedom, or gender equality, as well as violence against social values, political beliefs, and religious views, can be recognized as extremism. Furthermore, extremism promotes an ideology of asymmetric social groups, defined by race, ethnicity or nationality, as well as an authoritarian concept of society.

3 Extremism Detection Corpus

This section focuses on the second contribution, the development of the extremism detection (XtremeView-22) corpus. First, an overview of the existing datasets and their limitations is given. Subsequently, the process of developing the proposed corpus and the specifications of the XtremeView-22 corpus are presented.


Table 4 Summary of the extremism detection datasets

Refs. | Size | Annotations | Extremism
[21] | 17,000 | Not available | Religious
[2] | 122,000 | Not available | Religious
[2] | 122,619 | Not available | Religious
[2] | 17,391 | Not available | Religious
[1] | 10,000 | Extreme 3001, non-extreme 6999 | Religious
[22] | 2684 | Support 788, refute 46, empty 1850 | Religious

3.1 Extremism Detection Corpora

A literature search is performed to identify the studies that focus on an NLP-based approach for extremism detection. An overview of the identified studies is presented in Table 4. It can be observed from the table that six extremism detection corpora are available. The second observation is that benchmark annotations that define whether a given sentence is extremist or not are available for merely two datasets. Consequently, the remaining four datasets can neither be used to reproduce the existing results, nor are they readily available for generating new results. A further examination of the two datasets revealed that the benchmark annotations of one ISIS-Religious dataset [22] are only partially available. That is, out of the 2684 sentences, the annotations of merely 834 sentences are available, whereas the annotations of the remaining 1850 are not. Therefore, the ISIS-Religious datasets are not readily usable. Finally, it can be observed that most of the extremism detection datasets focus on the religious perspective, which is contrary to our understanding of the notion of extremism. That is, the synthesis of the various definitions and descriptions presented in the preceding section concluded that extreme political and social views should also be considered extremist views.

3.2 Development of XtremeView-22

This study has developed an extremism detection (XtremeView-22) corpus by using a seed corpus, ISIS-Religious. As a starting point for the development, the raw tweets were examined. It was observed that the tweets included residue and garbage values that do not play any role in the identification of extremism, including hashtags, images, URLs, emoticons, smileys, etc. The tweets were cleaned by removing these contents using a Python script. Also, prior to the data annotation, duplicate tweets were omitted. Furthermore, text samples that comprised multiple tweets were eliminated to ensure that message replies are not interpreted without the context of the original message.
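The cleaning script itself is not reproduced in the paper; the following is a minimal sketch of the kind of pre-processing described above (removing URLs, hashtags, mentions, and emoji-like symbols, then dropping duplicates), with the function names and regular expressions being illustrative assumptions.

import re

def clean_tweet(text):
    """Remove URLs, hashtags, mentions and non-ASCII symbols (emojis/smileys)."""
    text = re.sub(r"http\S+|www\.\S+", " ", text)    # URLs
    text = re.sub(r"[@#]\w+", " ", text)             # mentions and hashtags
    text = text.encode("ascii", "ignore").decode()   # emojis and other symbols
    return re.sub(r"\s+", " ", text).strip()         # collapse whitespace

def clean_corpus(raw_tweets):
    """Clean tweets and drop duplicates while preserving order."""
    seen, cleaned = set(), []
    for tweet in raw_tweets:
        t = clean_tweet(tweet)
        if t and t not in seen:
            seen.add(t)
            cleaned.append(t)
    return cleaned

print(clean_corpus(["sample #tag tweet http://example.com @user", "sample #tag tweet http://example.com @user"]))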

Table 5 Specifications of the XtremeView-22 corpus

Item | No. of tweets
Extremist views | 2413
Non-extremist views | 215
Total | 2629

For the data annotation, two researchers reviewed a random sample of the tweets and annotated each as an extremist view or a non-extremist view. Note that both researchers relied on the understanding of the concept of extremism developed in the preceding section. The results were merged and the conflicts were resolved. The process was repeated a few times to develop a consistent understanding of the concept of extremism. Finally, one researcher performed all the annotations and the other researcher verified them. Accordingly, we developed the XtremeView-22 corpus, which is composed of 2684 tweets, where every tweet is marked as either an Extremist or a Non-extremist view. A key feature of the corpus is that all the annotations are complete and they are freely and publicly available for use by the research community. The specification of the established corpus is presented in Table 5. We contend that this substantial amount of extremist views represents the existence of a threat that needs to be detected and eradicated. On the other hand, the imbalance in the developed corpus presents a challenge for machine learning techniques to learn and predict extremist views. We contend that this imbalance provides an opportunity for the interested research community to develop techniques for the detection of extremist views and to enhance the corpus to handle the imbalance problem in the context of extremism detection.

4 Effectiveness of ML Techniques

This section presents the baseline experiments that are performed to evaluate the effectiveness of supervised machine learning techniques for extremist view detection. Experiments are performed using six classical techniques, chosen for the diversity of their underlying mechanisms for the text classification task: Support Vector Machine (SVM), Decision Tree (DT), K-Nearest Neighbor (KNN), Naive Bayes (NB), Random Forest (RF), and Logistic Regression (LR). These techniques are fed with two types of features, unigrams and bigrams. Note that there are other state-of-the-art deep learning techniques that have been found to be more effective for various NLP tasks. However, these techniques require a large amount of annotated data for learning and prediction, which is not available for the task of extremism detection. Therefore, in this study, experiments are not performed using deep learning techniques.


Table 6 Summary results of experiments

Technique | Unigram P | Unigram R | Unigram F1 | Bigram P | Bigram R | Bigram F1
Naïve Bayes | 0.863 | 0.738 | 0.743 | 0.868 | 0.853 | 0.859
Random forest | 0.862 | 0.738 | 0.790 | 0.868 | 0.850 | 0.857
Decision tree | 0.862 | 0.737 | 0.789 | 0.868 | 0.853 | 0.858
K-nearest neighbor | 0.862 | 0.738 | 0.789 | 0.865 | 0.852 | 0.857
Logistic regression | 0.865 | 0.742 | 0.793 | 0.870 | 0.850 | 0.858
Support vector machine | 0.860 | 0.736 | 0.787 | 0.862 | 0.849 | 0.853

For the reliability of results, tenfold cross-validation is performed, and Precision, Recall, and F1 scores are calculated; the macro average of the tenfold results is then computed. For the experiments, TensorFlow and Scikit-learn are used. Prior to the experimentation, pre-processing, including tokenization, punctuation removal, and lemmatization, is also performed. Table 6 presents the Precision, Recall, and F1 scores of the machine learning techniques. It can be observed from the table that all the techniques achieved a reasonable F1 score of at least 0.790, which indicates that all the techniques can, to some extent, detect extremist views in text. It can also be observed that Naive Bayes achieved the highest F1 score of 0.859 using bigram features, although all the other techniques achieved comparable F1 scores. A similar observation can be made about the effectiveness of all the techniques when unigram features are used. These results indicate that all the techniques are roughly equally effective for the detection of extremist views. It can further be observed that the Precision scores are higher than the Recall scores, which indicates that most of the sentences predicted as extremist are actually extremist, whereas some extremist views are missed by the techniques. From the comparison of the unigram and bigram results, it can be observed that all the techniques achieved a higher F1 score when bigram features are used, which indicates that bigram features have a higher ability to discriminate between extremist and non-extremist views.
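To illustrate the baseline setup described above, a minimal scikit-learn sketch is shown below. The file name, column names, and the use of raw count features are assumptions for illustration rather than details reported in the paper.

import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import cross_validate
from sklearn.naive_bayes import MultinomialNB
from sklearn.linear_model import LogisticRegression
from sklearn.svm import LinearSVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical file/column names: one tweet per row with a binary label.
data = pd.read_csv("xtremeview22.csv")
texts, labels = data["tweet"], data["label"]

classifiers = {
    "NB": MultinomialNB(),
    "LR": LogisticRegression(max_iter=1000),
    "SVM": LinearSVC(),
    "DT": DecisionTreeClassifier(random_state=0),
    "RF": RandomForestClassifier(random_state=0),
    "KNN": KNeighborsClassifier(),
}

for ngrams, tag in [((1, 1), "unigram"), ((2, 2), "bigram")]:
    X = CountVectorizer(ngram_range=ngrams).fit_transform(texts)
    for name, clf in classifiers.items():
        # Tenfold cross-validation with macro-averaged precision, recall and F1.
        scores = cross_validate(clf, X, labels, cv=10,
                                scoring=["precision_macro", "recall_macro", "f1_macro"])
        print(tag, name,
              round(scores["test_precision_macro"].mean(), 3),
              round(scores["test_recall_macro"].mean(), 3),
              round(scores["test_f1_macro"].mean(), 3))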

5 Conclusion

Extremists use social outlets to reach enormous audiences, distribute propaganda, and recruit members for their cynical causes. Several attempts have been made at online extremism detection; however, there is a scarcity of publicly accessible extremism detection datasets. Also, the existing datasets are confined to religious extremism, whereas no attempt has been made to detect socially and politically extreme views. Furthermore, there is no consensus on what should be recognized


as an extremist view. To that end, this is the first-ever study that has synthesized the definitions and descriptions of extremist views from dictionaries, academic literature, and practice, and used them to conceptualize the notion of extremism. Subsequently, the developed understanding is used to manually develop a corpus of 2640 English tweets. Finally, experiments are performed to evaluate the effectiveness of machine learning techniques. The results conclude that bigrams are the most effective features for extremist views detection and Naive Bayes is the most effective technique. In the future, we aim to scale up the size of the dataset so that it can be used by deep learning techniques. Also, the effectiveness of various other types of features will be evaluated.

References

1. Aaied A (2020) ISIS Twitter. https://www.kaggle.com/datasets/aliaaied/isis-twitter
2. Activegalaxy (2019) Tweets targeting Isis. https://www.kaggle.com/datasets/activegalaxy/isis-related-tweets
3. Asif M, Ishtiaq A, Ahmad H, Aljuaid H, Shah JJT, Informatics (2020) Sentiment analysis of extremism in social media from textual information 48:101345
4. Berger JM (2018) Extremism. MIT Press
5. Dictionary O (2000) Oxford advanced learner's dictionary. Oxford University Press, Oxford
6. Dictionary OAA (2022) Oxford Advanced American Dictionary. https://www.oxfordlearnersdictionaries.com/definition/american_english/
7. Dictionary TJRA (2012) The free dictionary 17
8. Frissen T (2021) Internet, the great radicalizer? Exploring relationships between seeking for online extremist materials and cognitive radicalization in young adults. Comput Hum Behav 114:106549
9. Hassan G, Brouillette-Alarie S, Alava S, Frau-Meigs D, Lavoie L, Fetiu A, … Rousseau C (2018) Exposure to extremist online content could lead to violent radicalization: a systematic review of empirical evidence. Int J Dev Sci 12(1–2):71–88
10. Hollewell GF, Longpre N (2022) Radicalization in the social media era: understanding the relationship between self-radicalization and the internet. Int J Offender Ther Comp Criminol 66(8):896–913. https://doi.org/10.1177/0306624X211028771
11. Lipset SMJTBJoS (1959) Social stratification and 'right-wing extremism' 10(4):346–382
12. Lowe DJSiC, Terrorism (2017) Prevent strategies: the problems associated in defining extremism: the case of the United Kingdom 40(11):917–933
13. Matusitz JJCSQ (2022) Islamic radicalization: a conceptual examination (38)
14. Misiak B, Samochowiec J, Bhui K, Schouler-Ocak M, Demunter H, Kuey L, … Dom GJEP (2019) A systematic review on the relationship between mental health, radicalization and mass violence 56(1):51–59
15. Nicholson O (2018) The Oxford dictionary of late Antiquity. Oxford University Press
16. Nobata C, Tetreault J, Thomas A, Mehdad Y, Chang Y (2016) Abusive language detection in online user content. In: Proceedings of the 25th international conference on world wide web
17. Quintel T, Ullrich C (2020) Self-regulation of fundamental rights? The EU Code of Conduct on Hate Speech, related initiatives and beyond. In: Fundamental rights protection online. Edward Elgar Publishing, pp 197–229
18. Rea SC (2022) Teaching and confronting digital extremism: contexts, challenges and opportunities. Inf Learn Sci
19. Stevenson A (2010) Oxford dictionary of English. Oxford University Press, USA
20. Torregrosa J, Bello-Orgaz G, Martínez-Cámara E, Ser JD, Camacho DJJoAI, Computing H (2022) A survey on extremism analysis using natural language processing: definitions, literature review, trends and challenges 1–37
21. Tribe F (2019) How ISIS uses Twitter. https://www.kaggle.com/datasets/fifthtribe/how-isis-uses-twitter
22. Tribe F (2019) Religious texts used by ISIS. https://www.kaggle.com/datasets/fifthtribe/isis-religious-texts
23. Trip S, Bora CH, Marian M, Halmajan A, Drugas MI (2019) Psychological mechanisms involved in radicalization and extremism. A rational emotive behavioral conceptualization. Front Psychol 10:437
24. Whittaker J (2022) Online radicalisation: the use of the internet by Islamic State terrorists in the US (2012–2018). Leiden University
25. Wibisono S, Louis WR, Jetten JJFip (2019) A multidimensional analysis of religious extremism 10:2560
26. Winter C, Neumann P, Meleagrou-Hitchens A, Ranstorp M, Vidino L, Fürst JJIJoC, Violence (2020) Online extremism: research trends in internet activism, radicalization, and counterstrategies 14:1–20
27. Youngblood M (2020) Extremist ideology as a complex contagion: the spread of far-right radicalization in the United States between 2005 and 2017. Humanit Soc Sci Commun 7(1):1–10
28. Zhaksylyk K, Batyrkhan O, Shynar M (2021) Review of violent extremism detection techniques on social media. In: 2021 16th international conference on electronics computer and computation (ICECCO)

Chicken Disease Multiclass Classification Using Deep Learning

Mahendra Kumar Gourisaria, Aakarsh Arora, Saurabh Bilgaiyan, and Manoj Sahni

Abstract The consumption of poultry, especially chicken, has risen into the hundreds of billions around the globe. With such large consumption, a high percentage of humans are affected by diseases carried by chickens, such as bird flu, which can cause serious illness or death. Mortality among chickens also adversely affects poultry farmers, as disease spreads to other batches of chickens. The poultry market is huge and, due to the rise in demand for human consumption, an intelligent system is needed for the early identification of various diseases in chickens. The aim of this paper is to detect diseases in chickens at an early stage using deep learning techniques, preventing mortality in chickens and the resulting losses to farmers, and ultimately keeping consumers healthy too. In this paper, various types of CNN models were implemented for the categorical classification of "Salmonella", "Coccidiosis", "Healthy" and "New Castle Disease", and the best model was selected on the basis of an efficiency score derived from the Maximum Validation Accuracy (MVA) and the Least Validation Loss (LVL). A total of 7 CNN models and 5 transfer learning models were used for the detection of chicken disease, and the proposed ChicNetV6 model showed the best results, gaining an efficiency score of 2.8198 and an accuracy score of 0.9449 with a total training time of 1125 s.

Keywords Chicken disease · Deep learning · Poultry market · Multiclass classification · Chicken mortality

M. K. Gourisaria (B) · A. Arora · S. Bilgaiyan School of Computer Engineering, KIIT Deemed to Be University, Bhubaneswar, Odisha 751024, India e-mail: [email protected] M. Sahni Department of Mathematics, Pandit Deendayal Energy University, Gandhinagar, Gujarat 382426, India © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 S. Anwar et al. (eds.), Proceedings of International Conference on Information Technology and Applications, Lecture Notes in Networks and Systems 614, https://doi.org/10.1007/978-981-19-9331-2_19


1 Introduction

Poultry and poultry products are among the most popular items in the food industry. As the industry grows, the development of diseases among chickens and other animals poses potential harm to humans and the environment, and widespread disease causes large economic and environmental damage. A rapid rise in common poultry diseases such as colibacillosis, salmonellosis, Newcastle disease, chronic respiratory disorder and coccidiosis, followed by several bursal diseases, fowl cholera, nutritional deficiency and fowlpox, would result in even more responsibilities that developing nations are unprepared to handle. Hence, the welfare of broiler animals, especially chickens, is critical not just for human consumption but also for productivity and economic benefit. An early detection technique is therefore required to prevent the further spread of disease by treating the animals.

Salmonella is a bacterial pathogen belonging to the genus Salmonella, which resides in the intestines and causes disease in both poultry and humans. Salmonella typhimurium (ST) and Salmonella enteritidis (SE) strains are linked to human illnesses spread through the poultry and broiler product food chain [1]. Salmonellosis can worsen mortality and performance losses in young birds due to overcrowding, starvation, and other stressful situations, as well as filthy surroundings [2]. Coccidiosis is caused by the apicomplexan protozoan Eimeria and is the most severe parasitic disease in chickens. Infected animals' development and feed consumption are severely hampered by coccidiosis, resulting in a loss of productivity [3]. Two popular diagnostic approaches are counting the number of oocysts (oocysts per gram [opg]) in the droppings or checking the digestive system to determine lesion scores. Although measures like management and biosecurity could prevent Eimeria from breaching the farms, in practice they are insufficient to prevent coccidiosis outbreaks [4]. Newcastle disease is spread worldwide by virulent Newcastle disease virus (NDV) strains that infect avian species. Because of the low contact rate, NDV spreads relatively slowly within and across village poultry populations. The faecal-oral route seems to be the most common method of transmission [5].

Deep learning and machine learning are becoming the epicentre of technology, advancing many fields such as health care, engineering and medicine. Some of the contributions include diabetes mellitus diagnosis [6], where the K-Nearest Neighbors algorithm performed best, liver disease detection [7] and maize leaf disease detection [8]. In this research article, we have implemented 12 state-of-the-art architectures, where 7 proposed CNNs and 5 transfer learning models were trained and evaluated on various performance metrics such as F1-score, precision, efficiency ratio, AUC and recall.

The rest of the paper is organized as follows: Sect. 2 Related Work, Sect. 3 Data Preparation, Sect. 4 Technology Used, Sect. 5 Implementation and Results, and Sect. 6 Conclusion and Future Work.


2 Related Work

As mentioned, poultry farming, especially chicken farming, is one of the fastest-growing industries, and serious measures need to be taken to protect flocks from various hazards and diseases. A feasible approach is the early detection of disease in chickens using deep learning. Classical machine learning (ML) and deep learning (DL) approaches have been applied by many researchers for the diagnosis of diseases in chickens. SVM was used by [9] to detect unhealthy broilers infected with avian flu; their research developed an algorithm for classifying isolated infected broilers based on the examined structures and attributes, which was validated on test data and found to be 99% accurate. In another paper [10], a deep learning approach was used for the detection of sick broilers, proposing a Feature Fusion Single Shot MultiBox Detector (FSSD) that enhances the Single Shot MultiBox Detector (SSD) model using InceptionV3 as a base; it achieved a mean average precision (mAP) of 99.7%. Yoo et al. [11] proposed a continuous risk prediction framework for highly pathogenic avian influenza (HPAI) and used ML algorithms like eXtreme Gradient Boosting Machine (GBM) and Random Forest; the model's predictions for high risk were 8–10 out of 19, and the gradient boosting algorithm performed well with an AUC of 0.88. Using deep learning techniques, Akomolafe and Medeiros [12] performed classification of Newcastle disease and avian flu; the CNN models used gained accuracies of 95% and 98%, respectively. Wang et al. [13] proposed an automated broiler digestive disorder detector that categorizes fine-grained aberrant broiler dropping photos as abnormal or normal using a deep convolutional neural network model. For comparison, Faster R-CNN and YOLO-V3 were also constructed; Faster R-CNN gained recall and mAP of 99.1% and 93.3%, whereas YOLO-V3 attained 88.7% and 84.3%, respectively. In the study of [14], a machine vision-based monitoring system was presented for the detection of the Newcastle disease virus. The data was collected from live broilers as they walked, and features were extracted using 2D posture shape descriptors and walking speed; among the ML models used, the RBF-Support Vector Machine (SVM) gave the best results with accuracies of 0.975 and 0.978. Cuan et al. [15] presented a Deep Chicken Vocalization Network (DPVN) based on broiler vocals for the early diagnosis of Newcastle disease; they used sound technology to extract poultry vocalizations and fed them to DL models, and the best model achieved accuracy, F1-score and recall of 98.50%, 97.33% and 96.60%, respectively.

All of the above-mentioned implementations for the detection of chicken disease performed well, but there were a few drawbacks: some papers concentrated only on transfer learning models, while others focused on identifying a single type of disease. A specific sickness cannot be identified via sound observation and chicken posture, and using the sound discrimination method in a group setting is very difficult. Any variation in chicken droppings, such as colour, shape and texture, can be detected in real time, as birds defecate around 12 times a day. Hence, disease detection through faecal images is the most efficient way.

Table 1 Class distribution

Class name | Number of images
Salmonella | 2625
Coccidiosis | 2476
New Castle Disease | 562
Healthy | 2404

3 Data Preparation

3.1 Dataset Used

The dataset used was taken from Kaggle, where it had been retrieved from UCI and uploaded by Alland Clive [16]. The dataset contained 8067 image files along with a ".csv" file covering the four classes shown in Table 1.

3.2 Dataset Preparation

Feature engineering and data augmentation were critical in balancing the unbalanced dataset during dataset creation. In our approach, we considered various data augmentation techniques such as zoom range, horizontal flip, rescale, shear, height shift range and width shift range. In this paper, shear, zoom, rescale, horizontal flip and rotation were applied to the training images, while only rescaling was applied to the test and validation sets, using the Keras ImageDataGenerator (Fig. 1).

Fig. 1 Sample images of chicken faeces
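A minimal sketch of such an augmentation pipeline with the Keras ImageDataGenerator is shown below; the specific parameter values and directory names are illustrative assumptions rather than values reported in the paper.

from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Augmentation for the training images (parameter values are illustrative).
train_gen = ImageDataGenerator(
    rescale=1.0 / 255,
    shear_range=0.2,
    zoom_range=0.2,
    rotation_range=20,
    horizontal_flip=True,
)

# Test and validation images are only rescaled.
eval_gen = ImageDataGenerator(rescale=1.0 / 255)

train_data = train_gen.flow_from_directory(
    "data/train", target_size=(224, 224), batch_size=32, class_mode="categorical")
val_data = eval_gen.flow_from_directory(
    "data/val", target_size=(224, 224), batch_size=32, class_mode="categorical")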


3.3 Splitting Dataset, Hardware and Software Used

The dataset was first split into two, with 70% as a training set and 30% as a testing set. The testing set was later divided into two equal halves (50% each) to form the test set and the validation set. All machine learning algorithms were implemented and analyzed using Python 3.7 and libraries such as scikit-learn, TensorFlow and Keras on a Google Colaboratory notebook. The workstation is equipped with an Intel i7 9th generation processor and 8 GB of RAM.
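A sketch of this 70/15/15 split, assuming the image file paths and labels are held in plain Python lists (placeholder data shown here), could be:

from sklearn.model_selection import train_test_split

# Placeholder paths and labels; in practice these come from the downloaded dataset.
image_paths = [f"data/all/img_{i}.jpg" for i in range(100)]
labels = ["Salmonella", "Coccidiosis", "Healthy", "New Castle Disease"] * 25

train_x, rest_x, train_y, rest_y = train_test_split(
    image_paths, labels, test_size=0.30, stratify=labels, random_state=42)
test_x, val_x, test_y, val_y = train_test_split(
    rest_x, rest_y, test_size=0.50, stratify=rest_y, random_state=42)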

4 Technology Used

4.1 Convolutional Neural Network

A Convolutional Neural Network (CNN/ConvNet) is a deep learning algorithm that plays a major role in the field of computer vision. To distinguish one feature from another and build a spatial relationship between them, the algorithm assigns weights and biases to distinct characteristics of the input image. A ConvNet requires much less pre-processing than other classification algorithms. A CNN consists of convolutional layers; these layers operate on the principle of the convolution theorem and go through the same forward- and backward-propagation procedure as other layers. In the human brain, the response to stimuli in the visual cortex is produced by individual neurons, each of which responds only to stimuli within its receptive field, a limited portion of the visual field. ConvNets are basically constructed of four types of layers: convolutional, max-pooling, flattening, and fully connected. The translation-invariance property of a convolutional neural network can be expressed as in Eq. (1):

x(y(n)) = y(x(n))    (1)

A ConvNet accurately captures the spatial and temporal interactions in an image by using appropriate filters. The architecture achieves superior fitting to the image dataset due to the reduced number of parameters and the reusability of weights. In the new function, the properties of the old function may be readily described and changed. When images are treated as discrete objects, the convolution operation may be represented as shown in Eq. (2).


$(f * g)[n] = \sum_{m=-M}^{+M} f[n-m]\, g[m]$    (2)

where f and g represent the input image and the kernel function, respectively. The kernel g is convolved over f, and the result is passed through the Rectified Linear Unit (ReLU) activation function to obtain the output feature maps.
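As a small worked illustration of Eq. (2) (a sketch, not code from the paper), the discrete convolution followed by ReLU can be computed directly with NumPy:

import numpy as np

f = np.array([1.0, 2.0, 3.0, 4.0])   # toy "image" signal
g = np.array([0.5, 1.0, 0.5])        # toy kernel

# np.convolve slides the flipped kernel over f, i.e. sum_m f[n-m] * g[m].
conv = np.convolve(f, g, mode="same")
relu = np.maximum(conv, 0.0)          # ReLU applied to the convolution output
print(conv, relu)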

4.2 Transfer Learning

Transfer learning is a method in which a model already trained on one dataset is reused to solve a new problem. In transfer learning, a computer leverages information from a previous dataset to improve predictions on a new task. Neural networks in computer vision identify edges in the first layer, shapes in the second layer and task-specific properties in later layers. The early and core layers are reused in transfer learning, whereas the following layers are simply retrained. Because the model has already been trained, transfer learning can help one develop an effective machine learning model with less training data.
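A minimal Keras sketch of this idea, using one of the pre-trained backbones evaluated later in the paper, is given below; the head layers, dropout rate, and the choice to freeze the entire backbone are illustrative assumptions.

from tensorflow.keras.applications import Xception
from tensorflow.keras import layers, models

base = Xception(weights="imagenet", include_top=False, input_shape=(224, 224, 3))
base.trainable = False  # keep the ImageNet features; only the new head is trained

model = models.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dense(128, activation="relu"),
    layers.Dropout(0.3),
    layers.Dense(4, activation="softmax"),  # four chicken-disease classes
])
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])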

5 Implementation and Results

In this section, we focus on all the CNN architectures implemented and the performance metrics. For the different architectures, we have used several parameters and layers, such as different numbers of convolutional and artificial (dense) layers, kernel sizes, activation functions and optimizers. The input image size was set to 224 × 224. In this paper, we have implemented 7 CNN architectures from scratch and 5 transfer learning models. Each model was trained for 15 epochs. For proposing an efficient architecture, the efficiency ratio is considered the most important factor, so the best CNN architecture was selected after analyzing and comparing the efficiency score, training time and metrics like AUC, F1-score and recall. Equation (3) shows the formula for the efficiency score:

Efficiency = Maximum Validation Accuracy / (Least Validation Loss + Normalised Training Time)    (3)
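As a quick check of Eq. (3), plugging in the values reported for the proposed ChicNetV6 model (MVA 0.9424, LVL 0.3342, and a normalized training time of 0, since it has the shortest training time) reproduces its efficiency score:

mva, lvl, nt = 0.9424, 0.3342, 0.0
efficiency = mva / (lvl + nt)
print(round(efficiency, 4))   # ~2.8198, matching the reported score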


5.1 Experimentation and Analysis

The creation of all CNN models was based on the input image size and the number of convolutional, max-pooling and dense layers. The input image size was set to 224 × 224 × 3 by default for all the architectures. This was done to obtain precise and accurate results. The abbreviations used in Tables 3, 4, 5, 6 and 7 are defined in Table 2. After data exploration, we implemented all the CNN models with the different parameters mentioned above. All the models were executed with a random state set to 42 and were trained on the training and validation data with a batch size of 32. The best model was selected using metrics calculated from the elements of the confusion matrix (TP, FP, TN, FN), namely Precision, F1-score, Recall and Accuracy. The structure of the various CNN models used in this paper, such as the number of convolutional, max-pooling and dense layers, the filters, the kernel initializers and the optimizer used to reduce the cost function, is given in Tables 3 and 4. From Table 3, we can see that the ChicNetV6 model gained the maximum validation accuracy (MVA) of 0.9424 and the least training time of 1125 s, whereas ChicNetV1 gained the minimum least validation loss (LVL) of 0.3270. As we can see from Table 4, the transfer learning model Xception performed best compared to the other models, obtaining the maximum MVA of 0.9608 and the least LVL of 0.2767. On the other hand, VGG16 showed the lowest scores, with the least MVA of 0.4040 and the highest LVL of 1.6513.

Table 2 Abbreviations used

Notation | Meaning
CL | Convolutional layer
AL | Artificial layer
ML | MaxPool layer
FD | Feature detection
KS | Kernel size
KI | Kernel initializer
PS | Pool size
LVL | Least validation loss
MVA | Maximum validation accuracy
OP | Optimizer
TP | True positive
FP | False positive
TN | True negative
FN | False negative
TT | Training time (in seconds)
NT | Normalized time

Table 3 Structure and performance of CNN models

Model | CL | AL | ML | FD | KS | KI | PS | LVL | MVA | OP | TT
ChicNetV1 | 4 | 5 | 4 | {128,64,32,32} | 3,3 | Uniform | 2 | 0.3270 | 0.9383 | Adam | 1520
ChicNetV2 | 2 | 2 | 2 | {32,64} | 3,9 | Uniform | 2,4 | 0.3718 | 0.9364 | RMSProp | 1248
ChicNetV3 | 2 | 2 | 2 | {32,64} | 3,9 | Uniform | 2,4 | 0.3471 | 0.9310 | Adam | 1258
ChicNetV4 | 4 | 1 | 2 | {128,64,64,32} | 3 | Glorot uniform | 2,2 | 0.4044 | 0.9285 | Adam | 1621
ChicNetV5 | 4 | 2 | 2 | {128,64,64,32} | 3 | Glorot uniform | 2,2 | 0.3865 | 0.9323 | Adam | 1391
ChicNetV6 | 4 | 5 | 5 | {128,64,32,32} | 3 | Glorot uniform | 2,2 | 0.3342 | 0.9424 | Adam | 1125
ChicNetV7 | 4 | 2 | 4 | {128,128,64,32} | 3,9,9,3 | Uniform | 2,4,4,2 | 0.3808 | 0.9358 | Adam | 1508


Table 4 Structure and performance of transfer learning models

Model | AL | KI | LVL | MVA | OP | TT
InceptionResNetV2 | 2 | Glorot uniform | 0.2885 | 0.9587 | Adam | 1553
VGG19 | 4 | Glorot uniform | 0.4114 | 0.8911 | RMSProp | 1607
VGG16 | 4 | Glorot uniform | 1.6513 | 0.4040 | RMSProp | 1526
Xception | 2 | Glorot uniform | 0.2767 | 0.9608 | Adam | 1476
InceptionV3 | 2 | Glorot uniform | 1.3719 | 0.9499 | Adam | 1234

From the 7 CNN and 5 transfer learning models, we selected the best of each and compare them in the following section. Table 5 shows the performance of the CNN models on various metrics, and it can be noticed that the selected models performed well across all of them (Fig. 2 and Table 6).

5.2 Comparison of Selected Models' Results

From all the implemented models, we selected the ChicNetV6 CNN and the transfer learning model Xception as the best models compared to the others for an input image of 224 × 224. This section focuses on the comparison of the two selected models and on finding the best model after evaluation on metrics like Efficiency, Precision, AUC, Accuracy, F1-score, Recall and Training Time; the standard formulae for these metrics are given below. From Table 7, we can observe that the proposed model ChicNetV6 performed much better than the transfer learning model Xception, with a higher number of true positives, a higher efficiency ratio and AUC, and the least training time. However, the Xception model outperformed our model in terms of accuracy, recall and F1-score, with scores of 0.9523, 0.9041 and 0.8775, respectively.
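For reference, the standard definitions of these metrics in terms of the confusion-matrix elements are:

\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}, \qquad
\mathrm{Precision} = \frac{TP}{TP + FP}, \qquad
\mathrm{Recall} = \frac{TP}{TP + FN}, \qquad
F_1 = \frac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}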

6 Conclusion and Future Work

Poultry farming is a huge industry, and any anomaly in this sector, whether in breeding, disease or feed, affects us both economically and in terms of well-being. The effort of this paper is to use deep learning and transfer learning techniques to find an optimal CNN architecture for the detection of diseases in chickens such as "Salmonella", "Coccidiosis" and "New Castle Disease", and also to detect whether a chicken is healthy or not. Based on different metric evaluations like Efficiency, Accuracy, F1-score, Recall and Training Time for an input image size of 224 × 224, we can conclude that our proposed model ChicNetV6 has performed outstandingly well on all the above-mentioned metrics, with the highest

Table 5 Performance of CNN models on various metrics

Model | Accuracy | Precision | Recall | AUC | F1-score | TP | FP | FN | TN | Efficiency | LVL | NT
ChicNetV1 | 0.9486 | 0.9147 | 0.8762 | 0.9863 | 0.7768 | 354 | 33 | 50 | 1179 | 0.8353 | 0.3270 | 0.7963
ChicNetV2 | 0.9127 | 0.8363 | 0.8094 | 0.9633 | 0.7608 | 327 | 64 | 77 | 1148 | 1.5110 | 0.3718 | 0.2479
ChicNetV3 | 0.9270 | 0.8865 | 0.8119 | 0.9691 | 0.6964 | 328 | 42 | 76 | 1170 | 1.5133 | 0.3471 | 0.2681
ChicNetV4 | 0.9301 | 0.8880 | 0.8243 | 0.9777 | 0.7411 | 333 | 42 | 71 | 1170 | 0.6611 | 0.4044 | 1.0000
ChicNetV5 | 0.9375 | 0.8915 | 0.8540 | 0.9837 | 0.7248 | 345 | 42 | 59 | 1170 | 1.0104 | 0.3865 | 0.5362
ChicNetV6 | 0.9449 | 0.9156 | 0.8589 | 0.9861 | 0.7993 | 347 | 32 | 57 | 1180 | 2.8198 | 0.3342 | 0.0000
ChicNetV7 | 0.9319 | 0.8889 | 0.8317 | 0.9766 | 0.6962 | 336 | 42 | 68 | 1170 | 0.8116 | 0.3808 | 0.7721


Fig. 2 Metrics curve of ChicNetV6 model

efficiency ratio of 2.8198 and the least training time of 1125 s, reducing the computational cost. Future work on the classification of faecal images could be made more efficient by using the Generative Adversarial Network (GAN) technique: instead of relying on data augmentation, more data could be produced with GANs. The Batch Normalization method could also be applied to all the CNN architectures for more precise results.

Table 6 Performance of transfer learning models on various metrics

Model | Accuracy | Precision | Recall | AUC | F1-score | Efficiency | LVL | NT
InceptionResNetV2 | 0.9684 | 0.9514 | 0.9208 | 0.9928 | 0.9241 | 0.8326 | 0.2885 | 0.8629
VGG19 | 0.9022 | 0.7971 | 0.8168 | 0.9108 | 0.7810 | 0.6442 | 0.4114 | 0.9717
VGG16 | 0.3746 | 0.2144 | 0.5636 | 0.3562 | 0.0409 | 0.1642 | 1.6513 | 0.8084
Xception | 0.9589 | 0.9273 | 0.9066 | 0.9859 | 0.8776 | 0.9761 | 0.2767 | 0.7076
InceptionV3 | 0.9523 | 0.9049 | 0.9041 | 0.9533 | 0.8775 | 0.5968 | 1.3719 | 0.2197


Table 7 Selected CNN model metric comparison

Model | Accuracy | Precision | Recall | AUC | F1-score | TP | FP | FN | TN | Efficiency | LVL | TT
ChicNetV6 | 0.9449 | 0.9156 | 0.8589 | 0.9861 | 0.7993 | 347 | 32 | 57 | 1180 | 2.8198 | 0.3342 | 1125
Xception | 0.9523 | 0.9049 | 0.9041 | 0.9533 | 0.8775 | 365 | 34 | 39 | 1178 | 0.9761 | 0.2767 | 1476


References

1. Desin T, Koster W, Potter A (2013) Salmonella vaccines: past, present and future. Expert Rev Vaccines 12:87–96
2. Waltman WD, Gast RK, Mallinson ET (2008) Salmonellosis. Isolation and identification of avian pathogens, 5th edn. American Association of Avian Pathologists, Jacksonville, FL, pp 3–9
3. Dalloul RA, Lillehoj HS (2006) Poultry coccidiosis: recent advancements in control measures and vaccine development. Expert Rev Vaccines 5(1):143–163
4. Grilli G, Borgonovo F, Tullo E, Fontana I, Guarino M, Ferrante V (2018) A pilot study to detect coccidiosis in poultry farms at early stage from air analysis. Biosyst Eng 2
5. Awan MA, Otte MJ, James AD (1994) The epidemiology of Newcastle disease in rural poultry: a review. Avian Pathol 23(3):405–423
6. Gourisaria MK, Jee G, Harshvardhan GM, Singh V, Singh PK, Workneh TC (2022) Data science appositeness in diabetes mellitus diagnosis for healthcare systems of developing nations. IET Commun
7. Singh V, Gourisaria MK, Das H (2021) Performance analysis of machine learning algorithms for prediction of liver disease. In: 2021 IEEE 4th international conference on computing, power and communication technologies (GUCON). IEEE, pp 1–7
8. Panigrahi KP, Das H, Sahoo AK, Moharana SC (2021) Maize leaf disease detection and classification using machine learning algorithms. In: Progress in computing, analytics, and networking. Springer, Singapore, pp 659–669
9. Zhuang X, Bi M, Guo J, Wu S, Zhang T (2018) Development of an early warning algorithm to detect sick broilers. Comput Electron Agric 144:102–113
10. Zhuang X, Zhang T (2019) Detection of sick broilers by digital image processing and deep learning. Biosyst Eng 179:106–116
11. Yoo DS, Song YH, Choi DW, Lim JS, Lee K, Kang T (2021) Machine learning-driven dynamic risk prediction for highly pathogenic avian influenza at poultry farms in Republic of Korea: daily risk estimation for individual premises. Transbound Emerg Dis
12. Akomolafe OP, Medeiros FB (2021) Image detection and classification of newcastle and avian flu diseases infected poultry using machine learning techniques. Univ Ibadan J Sci Logics ICT Res 6(1 and 2):121–131
13. Wang J, Shen M, Liu L, Xu Y, Okinda C (2019) Recognition and classification of broiler droppings based on deep convolutional neural network. J Sens 2019:10. https://doi.org/10.1155/2019/3823515
14. Okinda C, Lu M, Liu L, Nyalala I, Muneri C, Wang J, Shen M (2019) A machine vision system for early detection and prediction of sick birds: a broiler chicken model. Biosyst Eng 188:229–242
15. Cuan K, Zhang T, Li Z, Huang J, Ding Y, Fang C (2022) Automatic Newcastle disease detection using sound technology and deep learning method. Comput Electron Agric 194:106740
16. Clive A (2022) Chicken disease image classification, Version 3. Retrieved from https://www.kaggle.com/datasets/allandclive/chicken-disease-1

Deepfakes Catcher: A Novel Fused Truncated DenseNet Model for Deepfakes Detection

Fatima Khalid, Ali Javed, Aun Irtaza, and Khalid Mahmood Malik

Abstract In recent years, we have witnessed a tremendous evolution in generative adversarial networks, resulting in the creation of highly realistic fake multimedia content termed deepfakes. Deepfakes are created by superimposing one person's real facial features, expressions, or lip movements onto another person. Apart from their benefits, deepfakes have been largely misused to propagate disinformation about influential persons such as celebrities and politicians. Since deepfakes are created using different generative algorithms and involve much realism, detecting them is a challenging task. Existing deepfakes detection methods have shown lower performance on forged videos generated using different algorithms as well as on low-resolution or compressed videos, and they are often computationally complex. To counter these issues, we propose a novel fused truncated DenseNet121 model for deepfake video detection. We employ transfer learning to reduce the resources required and improve effectiveness, truncation to reduce the parameters and model size, and feature fusion to strengthen the representation by capturing more distinct traits of the input video. Our fused truncated DenseNet model lowers the DenseNet121 parameter count from 8.5 to 0.5 million. This makes our model effective and lightweight enough to be deployed in portable devices for real-time deepfakes detection. Our proposed model can reliably detect various types of deepfakes as well as deepfakes produced by different generative methods. We evaluated our model on two diverse datasets: the large-scale FaceForensics (FF)++ dataset and the World Leaders (WL) dataset. Our model achieves a remarkable accuracy of 99.03% on the WL dataset and 87.76% on FF++, which shows the effectiveness of our method for deepfakes detection.

F. Khalid · A. Irtaza Department of Computer Science, University of Engineering and Technology, Taxila, Pakistan A. Javed (B) Department of Software Engineering, University of Engineering and Technology, Taxila, Pakistan e-mail: [email protected] K. M. Malik Department of Computer Science and Engineering, Oakland University, Rochester, MI, USA © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 S. Anwar et al. (eds.), Proceedings of International Conference on Information Technology and Applications, Lecture Notes in Networks and Systems 614, https://doi.org/10.1007/978-981-19-9331-2_20


Keywords Deepfakes detection · DenseNet121 · FaceForensics++ · Fused truncated DenseNet · World leaders dataset

1 Introduction

The evolution of deep learning-based algorithms such as autoencoders [12] and Generative Adversarial Networks (GANs) [9] has led to the generation of many realistic image- and video-based deepfakes. Deepfakes represent synthesized multimedia content based on artificial intelligence, which mainly falls into the categories of FaceSwap, Lip-Sync, and Puppet mastery. FaceSwap deepfakes are centered on identity manipulation, where the original identity is swapped with the targeted one. Lip-syncing is a technique for modifying a video so that the mouth area fits arbitrary audio, whereas the puppet-mastery approach is concerned with the modification of facial expressions, including the head and eye movements of the person. Deepfake videos have some useful applications, such as creating videos of a deceased person from a single photo or changing the aging and de-aging of people; both can be used to create realistic videos of live and deceased actors in the entertainment industry. However, deepfakes have the potential not only to influence our view of reality but also to be used for retaliation and deception by targeting politicians and famous leaders and spreading disinformation to take political revenge.

Existing literature on face swapping and puppet mastery has explored different end-to-end deep learning (DL)-based approaches. Various studies [3, 5, 10, 11] have focused on the application of DL-based methods for face swap deepfakes detection. In Bonettini et al. [3], an ensemble of EfficientNet models with average voting was proposed; the model was evaluated only in intra-dataset settings, so its generalization capability in an inter-dataset setup cannot be guaranteed. In Rossler et al. [11], a CNN was used in conjunction with an SVM for real and face swap detection; this approach was unable to perform well on compressed videos. In Nirkin et al. [10], a confidence score was computed from cropped faces and later fed into the deep learner to identify identity manipulation; this model does not generalize well on unseen data. In de Lima et al. [5], VGG-11 was used to determine frame-level features, which were then fed to various models like ResNet, R3D, and I3D to detect real and forged videos; this technique is computationally more costly.

Research approaches [1, 4, 6, 14] have also been presented for puppet mastery deepfakes detection employing DL-based methods. In Guo et al. [6], feature maps generated from convolutional layers were subtracted from the original images. The method removes unnecessary details from the image, allowing the RNN to concentrate on the important details, but it requires more samples for training to obtain satisfactory performance. In Zhao et al. [14], pairwise learning was used to extract source features from a CNN, which were later used for classification. However, the performance of the model decreases on images that have consistent


features. In Chintha et al. [4], temporal discrepancies in deepfake videos were identified by combining an XceptionNet CNN, which extracted the facial features, with a bidirectional LSTM. The architecture performed well on multiple datasets; however, its performance degrades on compressed samples. In Agarwal et al. [1], a combination of VGG-16 and an encoder-decoder network was applied for detection by computing facial and behavioral attributes; this method does not apply well to unseen deepfake videos.

According to the literature, existing approaches, notably [1, 10], do not generalize well to unseen data. Rossler et al. [11] and Chintha et al. [4] perform well on high-quality videos, but their performance degrades on compressed videos. Although [5] outperforms other state-of-the-art (SOTA) techniques, it is computationally complex. To better address these challenges, we present a novel fused truncated DenseNet framework that works effectively on unseen data and introduces modifications to further reduce the computational cost and optimization effort while achieving higher accuracy. Specifically, this paper makes the following contributions:

1. We present a novel fused truncated DenseNet model that is robust to different types of deepfakes (face swap, puppet mastery, imposter, and lip-sync) and to different generative methods (Face2Face, NeuralTextures, Deepfakes, and FaceShifter).
2. We present an efficient deepfakes detection method by employing the GeLu activation function in our proposed method to reduce the complexity of the model.
3. We introduce a series of layers, including global average pooling and dense layers combined with a regularization technique, to prevent overfitting.
4. To evaluate the generalizability of our proposed model, we performed extensive experiments on two different deepfakes datasets, including a cross-set examination.

2 Proposed Methodology

This section explains the proposed workflow employed for deepfakes detection. The architecture of our proposed framework is depicted in Fig. 1.

2.1 Facial Frames Extraction

The initial stage is to identify and extract faces from the video frames, since the facial landmarks are the most manipulated part of deepfake videos. For this purpose, we used the Multi-task Cascaded Convolutional Networks (MTCNN) [15] face detector to extract a facial region of 300 × 300 from the input video during pre-processing. This approach recognizes the facial landmarks, such as the eyes, nose, and mouth,


Fig. 1 Architecture of proposed method

from coarse to fine details. We chose this method as it detects faces accurately even in the presence of occlusion and variable lighting, unlike other face detectors such as the Haar Cascade and the Viola–Jones framework [13].
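A minimal sketch of this pre-processing step is given below, using the mtcnn package and OpenCV for frame reading; the library choice, sampling step, and function name are assumptions, since the paper does not name specific tools.

import cv2
from mtcnn import MTCNN

def extract_face_frames(video_path, size=(300, 300), step=10):
    """Read every `step`-th frame, detect the most confident face and crop it."""
    detector = MTCNN()
    capture = cv2.VideoCapture(video_path)
    faces, index = [], 0
    while True:
        ok, frame = capture.read()
        if not ok:
            break
        if index % step == 0:
            rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
            detections = detector.detect_faces(rgb)
            if detections:
                x, y, w, h = max(detections, key=lambda d: d["confidence"])["box"]
                crop = rgb[max(y, 0):y + h, max(x, 0):x + w]
                faces.append(cv2.resize(crop, size))
        index += 1
    capture.release()
    return faces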

2.2 Fused Truncated DenseNet

After detecting faces in the input video, we extract the frames with frontal face exposure. The frames are then resized to 224 × 224 resolution and fed to our fused truncated DenseNet121. We introduce truncation modifications that help in parameter and model size minimization, as well as feature fusion, which merges the correlated feature values produced by the two branches. As a result, an effective and lightweight model for the detection of real and deepfake videos is created. The use of pre-trained frameworks is motivated by the fact that these models have been trained on enormous publicly available datasets like ImageNet, and hence can learn the essential feature points. DenseNet121 is an extension of the ResNet architecture. The training procedure faces vanishing-gradient issues as the network's depth grows; both the ResNet and DenseNet models are intended to address this issue. The DenseNet design is built on all-layer connectivity, with each layer receiving input from all previous layers and passing its output to all subsequent layers. As a result, the connections are dense, which enhances efficiency with fewer parameters. The goal of the DenseNet121 model is to give a perfect transmission of features throughout the whole network without performance degradation, even at considerable depth. DenseNet also handles parameter inflation by using concatenation instead of layer additions. Our proposed method includes two DenseNet121 architectures: model A is partially trained on our dataset, with its early layers frozen to preserve the ImageNet features and the remaining layers retrained on our data, whereas model B is entirely retrained on our dataset. Figure 1 shows the proposed fused truncated DenseNet model, which is composed of a 7 × 7 Convolution layer, followed by Batch Normalization (BN), a Gaussian Error Linear Unit (GeLu), and a 3 × 3 Max


Pooling layer. Next, a pair of dense blocks with a BN, GeLu, and 1 × 1 Convolution layer is followed by another BN, GeLu, and 1 × 1 Convolution layer. Unlike ResNet and other deep networks that rely on feature summation and generate large numbers of parameters, the DenseNet model employs dense blocks with growth rate 'n' whose outputs are appended to all subsequent network layers. This approach gives an efficient end-to-end transfer of features from preceding layers to succeeding layers. The proposed design produces a rich gradient quality even at deeper depths while lowering the parameter count, which makes it very useful for detection purposes. To avoid depletion of resources during feature extraction, the DenseNet model needs a transition layer that down-samples the feature maps by using a 1 × 1 Convolution layer and a 2 × 2 Average Pooling layer.

Layer Truncation. Although DenseNet has far fewer parameters than other DL-based models, the proposed approach aims to further minimize the parameters without compromising effectiveness. DenseNet121 has around 8.5 million parameters. The base DenseNet model is suited to large datasets such as ImageNet, which has over 14 million images and 1000 categories, and training or replicating this model can be time-consuming. Furthermore, with a small dataset, employing the complete architecture merely adds complexity and uses enormous resources. As a result, most of the model's layers are eliminated through the proposed truncation, lowering the number of parameters and shortening the end-to-end flow of features. The proposed fused truncated DenseNet, with only six dense blocks followed by a transition layer connecting to another set of three dense blocks, is shown in Fig. 1. The proposed truncation reduces the DenseNet121 model's parameter count by a significant factor of 93.5%; more specifically, it decreases the parameters from the initial 8.5 million to only around half a million.

Activation Function. An activation function is used in a multilayer neural network to express the connection between the output values of neurons in the preceding layer and the input values of those in the following layer; it determines whether a neuron should be activated or not. We used the Gaussian Error Linear Unit (GeLu) [7] in our method. Sigmoid and ReLu face the vanishing-gradient issue, and ReLu additionally creates the dead-ReLu problem. To address these issues, probabilistic regularization techniques such as dropout are widely used after the activation functions to improve accuracy. GeLu was presented to combine stochastic regularization with an activation function: it weights a neuron's input by the standard Gaussian cumulative distribution function of its value, applying nonlinearity depending on the value itself rather than simply gating on the sign of the input as in ReLu.

Model Concatenation and Prediction. The smaller size of the truncated DenseNet network results in a lower parameter count; on the other hand, adding more depth to the layers would make the truncation approach useless. To overcome this problem, we employed a model concatenation method, which improves the accuracy of our model with fewer parameters. Model concatenation and feature fusion broaden the model instead of increasing its depth, enabling the fast end-to-end feature extraction required for training and validation. To better process the features produced by the fusion of both models, the proposed method incorporates a new set of layers

244

F. Khalid et al.

consisting of Global Average Pooling, a dense layer, and the dropout connected to another dense layer activated by the classifier. These additional layers attempt to increase efficiency and regularization, hence preventing overfitting problems.
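To make the architecture described above more concrete, the following is a minimal, hypothetical Keras sketch of a fused truncated DenseNet of this kind. It is not the authors' exact model: the number of layers per block, the growth rate, the input resolution, the width of the dense layer, and the use of two identical parallel branches are assumptions made only for illustration.

```python
# Minimal sketch (not the authors' exact model): two truncated DenseNet-style
# branches with GeLu activations are fused by concatenation, then classified.
import tensorflow as tf
from tensorflow.keras import layers, Model

def dense_block(x, num_layers, growth_rate):
    # Each layer: BN -> GeLu -> 1x1 conv (bottleneck) -> BN -> GeLu -> 3x3 conv,
    # with its output concatenated to all preceding feature maps.
    for _ in range(num_layers):
        y = layers.BatchNormalization()(x)
        y = layers.Activation("gelu")(y)
        y = layers.Conv2D(4 * growth_rate, 1, padding="same", use_bias=False)(y)
        y = layers.BatchNormalization()(y)
        y = layers.Activation("gelu")(y)
        y = layers.Conv2D(growth_rate, 3, padding="same", use_bias=False)(y)
        x = layers.Concatenate()([x, y])
    return x

def transition(x):
    # Transition layer: 1x1 convolution followed by 2x2 average pooling.
    x = layers.BatchNormalization()(x)
    x = layers.Activation("gelu")(x)
    x = layers.Conv2D(x.shape[-1] // 2, 1, padding="same", use_bias=False)(x)
    return layers.AveragePooling2D(2)(x)

def truncated_branch(inputs, growth_rate=12):
    # Truncated branch: six dense layers, one transition, then three more layers.
    x = layers.Conv2D(2 * growth_rate, 7, strides=2, padding="same")(inputs)
    x = layers.MaxPooling2D(3, strides=2, padding="same")(x)
    x = dense_block(x, 6, growth_rate)
    x = transition(x)
    return dense_block(x, 3, growth_rate)

inputs = tf.keras.Input(shape=(224, 224, 3))               # assumed input size
fused = layers.Concatenate()([truncated_branch(inputs),
                              truncated_branch(inputs)])   # feature fusion
x = layers.GlobalAveragePooling2D()(fused)
x = layers.Dense(128, activation="gelu")(x)                # assumed width
x = layers.Dropout(0.5)(x)
outputs = layers.Dense(2, activation="softmax")(x)         # real vs fake
model = Model(inputs, outputs)
```

In this sketch the two fused branches are identical; in practice the branches could differ, for example in truncation depth, which is one plausible reading of the "fused" design.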

3 Experiment Setup and Results

3.1 Dataset

We evaluated the performance of the proposed method using two datasets: FaceForensics++ [11] and the World Leaders Dataset [2]. FF++ is an extensive face-manipulation dataset created with automated, modern video-editing techniques. Two traditional computer-graphics methods, Face2Face (F2F) and FaceSwap (FS), are used in conjunction with two learning-based methods, DeepFakes (DF) and NeuralTextures (NT). Each video contains an individual with an unoccluded face, yet detection remains difficult due to differences in skin tone across people, lighting conditions, the presence of facial accessories, and the loss of information caused by low video resolution. The WL dataset is made up of YouTube videos of world-famous politicians (Clinton, Obama, Warren, and others) together with their originals and the comical imposter (Imp), face-swap (FS), lip-sync (LS), and puppet-master subsets. Politicians are speaking throughout the videos; each video contains only one person's face, and the camera is static with minimal variation in zoom. We divided both datasets into 80:20 splits, with 80% of the videos used for training and the remaining 20% for testing.

3.2 Performance Evaluation of the Proposed Method

We designed an experiment to analyze the performance of our method on the original and fake sets of the FF++ and WL datasets to demonstrate its effectiveness for deepfakes detection. For this purpose, we employed our model to classify the real and fake videos of each subset of FF++ separately. On FF++, we tested the real samples against the fake samples from the FS, DF, F2F, NT, and FaceShifter (FSh) sets, and the results are presented in Table 1. It can be noticed that the FF++-FS set has the highest accuracy of 95.73% and an AUC of 0.99 among all the sets. FS videos are generated using a 3D blending method, and these remarkable results on the FS set indicate that our model can better capture the resulting traits to identify identity changes and static textures. FSh, in contrast, achieved an accuracy of only 60.90% and an AUC of 0.67, because the generative method of this set is very complex owing to the fusion of two complex GAN architectures [8]. This makes it extremely challenging to reliably capture the distinctive texture traits used in FSh, which limits the accuracy of our model.
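As a side note, metrics of this kind can be reproduced with standard tooling; the snippet below is a generic illustration only (the paper does not state which library was used, and "PR" is read here as precision, which is an assumption).

```python
# Generic evaluation sketch: accuracy, precision, and ROC AUC for a binary
# real-vs-fake classifier, given ground-truth labels and predicted scores.
from sklearn.metrics import accuracy_score, precision_score, roc_auc_score

y_true = [0, 0, 1, 1, 1]                 # 0 = real, 1 = fake (toy example)
y_score = [0.1, 0.4, 0.35, 0.8, 0.9]     # model's predicted fake probability
y_pred = [1 if s >= 0.5 else 0 for s in y_score]

print("Accuracy:", accuracy_score(y_true, y_pred))
print("PR (precision, assumed):", precision_score(y_true, y_pred))
print("AUC:", roc_auc_score(y_true, y_score))
```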

Table 1 Performance evaluation of proposed method on FF++ dataset

             FS      DF     F2F    NT     FSh
Accuracy     95.73   93.9   92.6   83.5   60.90
PR           0.99    0.97   0.97   0.90   0.63
AUC          0.99    0.98   0.97   0.92   0.67

Table 2 Performance evaluation of proposed method on WL dataset

Leaders   Subsets   Accuracy   PR     AUC
Obama     FS        94.57      0.96   0.97
          Imp       58.57      0.60   0.63
          LS        62.36      0.65   0.68
JB        FS        89.68      0.91   0.94
          Imp       95.65      0.97   0.96
Clinton   FS        84.13      0.87   0.86
          Imp       91.43      0.92   0.94
Warren    FS        93.14      0.93   0.95
          Imp       93.12      0.93   0.95
Sander    FS        89.59      0.91   0.90
          Imp       78.88      0.80   0.82
Trump     Imp       99.70      1.00   1.00

For WL, each leader's deepfakes type (FS, Imp, and LS) is tested against the original samples. Table 2 shows that the FS set of Obama achieved the best accuracy of 94.57% with an AUC of 0.97, whereas the Imp set of Trump showed the highest accuracy, 99.70%, among all the leaders. The results of this experiment reveal that our proposed model performed remarkably well on both datasets. These results are due to GeLu's nonlinear behavior and the fact that it combines properties of dropout, zoneout, and ReLu. GeLu solves the dying-ReLu problem by providing a gradient on the negative axis, preventing neurons from dying, and it is also capable of differentiating each datapoint of the input image.

3.3 Ablation Study

In this experiment, an ablation study is conducted to demonstrate the performance of various activation functions on the FaceSwap set of the FaceForensics++ dataset. Table 3 illustrates the performance of the different activation functions. The results show that our method with the GeLu activation provided the best performance compared with the other activation functions.

Table 3 Performance evaluation on different activation functions

Activation functions     ReLu   SeLu   TRelu   ELU     GeLu
Testing on FF++ (FS)     94.5   90.6   92.3    95.09   95.73

The disparity in the findings is mainly due to GeLu's combination of dropout- and zoneout-like behavior as well as its non-convex, non-monotonic, and nonlinear nature, with curvature present in all directions. On the other hand, convex and monotonic activations such as ReLu, ELU, and SeLu are linear on the positive axis and lack curvature. As a result, GeLu outperforms the other activation functions.
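For reference, GeLu weights its input by the standard Gaussian cumulative distribution function, as defined in [7]; the tanh-based approximation is the form commonly used in implementations:

GELU(x) = x·Φ(x) ≈ 0.5·x·(1 + tanh[√(2/π)·(x + 0.044715·x³)])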

3.4 Performance Evaluation of the Proposed Method on Cross-Sets

In this experiment, we designed a cross-set evaluation to inspect the generalizability of the proposed method among the intra-sets of the datasets. For the FF++ dataset, each trained set is tested on all the other sets; for example, the FS-trained set is tested on all the remaining sets. Similarly, for the WL dataset, we conducted the same experiment within each leader's intra-set; for example, Obama's FS-trained set is tested on the Imp and LS sets. The results displayed in Table 4 are slightly encouraging, as the two datasets contain different deepfakes types and generative methods, but our proposed method can still differentiate the modifications of identity change, expression change, and neural rendering. Table 4 shows that, on the FF++ dataset, the sets sharing the same generative method achieved better results than the others. In comparison with the FF++ dataset, our proposed model showed better results on the WL dataset; it easily detected the FS and Imp sets of most of the leaders with good accuracies, since both types share the same generative methods, so our model generalizes well across the same generative methods. The LS set of Obama showed the lowest accuracies of all because this set contains spatiotemporal glitches. DL-based models (CNNs combined with RNNs) can extract features in both the spatial and temporal domains, but in our method we used a fused truncated DenseNet-based CNN model that identifies artifacts in the spatial domain only, which reduces the accuracy on this set. We conducted another cross-set evaluation experiment for the WL dataset, where the FS- and Imp-trained models of one leader are tested with the FS and Imp sets of another leader, respectively. The motive behind this experiment was to check the robustness against the same forgery type applied to different leaders. The results shown in Table 5 are relatively good, which shows that the proposed model can distinguish the same forgery on different individuals even in the presence of challenging conditions such as variations in skin tone, facial occlusions, lighting conditions, and facial artifacts.

Table 4 Performance evaluation on cross-sets of FF++ and WL dataset

FF++ (accuracy of each trained set tested on the remaining FF++ sets; "–" marks the training set itself):

Train \ Test   FS     DF     F2F    FSh    NT
FS             –      52.9   68.9   56.0   55.2
DF             51.9   –      54.7   56.0   55.2
F2F            51.4   49.2   –      51.8   50.2
FSh            48.6   54.8   50.2   –      48.3
NT             67.0   58.1   57.0   48.1   –

WL (accuracy of each leader's trained set tested on the leader's other sets):

Obama:   FS-trained → Imp 62.1, LS 46.9;  Imp-trained → FS 48.0, LS 32.2;  LS-trained → FS 35.3, Imp 41.2
JB:      FS-trained → Imp 76.0;  Imp-trained → FS 79.2
Clinton: FS-trained → Imp 84.8;  Imp-trained → FS 83.2
Warren:  FS-trained → Imp 82.1;  Imp-trained → FS 92.0
Sander:  FS-trained → Imp 76.0;  Imp-trained → FS 91.0

Table 5 Performance evaluation on cross-set of WL dataset (accuracy of each leader's FS- and Imp-trained models tested on the FS and Imp sets of the other leaders)

Train set Obama:   66.3, 60.3, 71.3, 69.3, 53.6
Train set JB:      55.1, 53.8, 50.1, 50.3, 79.3, 71.1, 84.1, 69.4, 71.2, 87.3, 75.2, 69.3, 37.6
Train set Clinton: 65.1, 59.1, 81.0, 70.2, 60.4, 62.8, 81.1, 61.2, 55.2
Train set Warren:  78.4, 48.3, 83.1, 61.1, 71.3, 80.1
Train set Sander:  65.1, 42.2, 79.6, 65.0, 84.3, 61.2, 51.4, 49.1
Train set Trump:   68.2, 82.1, 51.3, 75.2, 91.2, 70.1, 69.4, 55.5, 79.2

3.5 Comparative Analysis with Contemporary Methods

The key purpose of this experiment is to validate the efficacy of the proposed model against existing methods. The performance of our method on FF++ compared with existing methods is shown in Table 6. The accuracy of our model for FS and NT increased by 5.44% and 2.9%, respectively, while for F2F and DF our method achieved higher accuracies than most of the methods.

Table 6 Performance comparison against existing methods on FF++ dataset

Model            FS      DF      F2F     NT      FSh    Combined
XeceptionNet     70.87   74.5    75.9    73.3    –      62.40
Steg. Features   68.93   73.6    73.7    63.3    –      51.80
ResidualNet      73.79   85.4    67.8    78.0    –      55.20
CNN              56.31   85.4    64.2    60.0    –      58.10
MesoNet          61.17   87.2    56.2    40.6    –      66.00
XeceptionNet     90.29   96.3    86.8    80.6    –      70.10
Classification   54.07   52.3    92.77   –       –      83.71
Segmentation     34.04   70.37   90.27   –       –      93.01
Meso-4           –       96.9    95.3    –       –      –
MesoInception    –       98.4    95.3    –       –      –
Proposed         95.73   93.9    92.6    83.5    60.9   87.76

Table 7 Performance comparison against existing methods on WL dataset (AUC)

Paper                Subset   Obama   Clinton   Warren   Sander   Trump   JB     Combined
Agarwal et al. [2]   FS       0.95    0.98      0.94     1.00     –       0.93   –
                     Imp      0.95    0.96      0.93     0.94     –       0.94   –
                     LS       0.83    –         –        –        –       –      –
Agarwal et al. [1]   –        –       –         –        –        –       –      0.94
Proposed             FS       0.97    0.86      0.95     0.90     –       0.94   0.97
                     Imp      0.63    0.94      0.95     0.82     1.00    0.96   –
                     LS       0.68    –         –        –        –       –      –

It is difficult to obtain good detection results on all subsets of the FF++ dataset, especially in the presence of challenging conditions such as non-facial frames, varying illumination conditions, people of different races, and facial accessories. Our method outperforms most methods since it achieves good identification results across all subsets and can discriminate between real and fake videos generated with different manipulation techniques. We also compared the performance of our method on the WL dataset with existing methods using the AUC score. Table 7 shows that, when all the dataset's leaders are combined, our method outperforms the existing techniques.

4 Conclusion

In this paper, we have presented a fused truncated DenseNet model to better distinguish between real and deepfake videos. Our proposed system is lightweight and resilient, with a shorter end-to-end architecture and a smaller parameter count. In comparison with other SOTA models with larger parameter counts, our truncated model trains more quickly and performs well on a large and diverse dataset. Our model performed well regardless of the distinct occlusion settings, variations in people's skin tones, and the presence of facial artifacts in both datasets. We performed an intra-set evaluation on both datasets and obtained better results on the sets sharing the same type of generative method. This shows that our model can detect deepfakes on unseen samples of any dataset that uses similar generative methods for deepfake creation. In the future, we intend to increase the generalizability of our methodology to improve the cross-corpus assessment.

Acknowledgements This work was supported by the grant of the Punjab Higher Education Commission of Pakistan with Award No. (PHEC/ARA/PIRCA/20527/21).

References

1. Agarwal S, Farid H, El-Gaaly T, Lim S-N (2020) Detecting deep-fake videos from appearance and behavior. In: 2020 IEEE international workshop on information forensics and security (WIFS)
2. Agarwal S, Farid H, Gu Y, He M, Nagano K, Li H (2019) Protecting world leaders against deep fakes. In: CVPR workshops
3. Bonettini N, Cannas ED, Mandelli S, Bondi L, Bestagini P, Tubaro S (2021) Video face manipulation detection through ensemble of CNNs. In: 2020 25th international conference on pattern recognition (ICPR)
4. Chintha A, Thai B, Sohrawardi SJ, Bhatt K, Hickerson A, Wright M, Ptucha R (2020) Recurrent convolutional structures for audio spoof and video deepfake detection. IEEE J Sel Top Signal Process 14(5):1024–1037
5. de Lima O, Franklin S, Basu S, Karwoski B, George A (2020) Deepfake detection using spatiotemporal convolutional networks. arXiv:2006.14749
6. Guo Z, Yang G, Chen J, Sun X (2021) Fake face detection via adaptive manipulation traces extraction network. Comput Vis Image Underst 204:103170
7. Hendrycks D, Gimpel K (2016) Gaussian error linear units (GELUs). arXiv:1606.08415
8. Li L, Bao J, Yang H, Chen D, Wen F (2019) FaceShifter: towards high fidelity and occlusion aware face swapping. arXiv:1912.13457
9. Liu M-Y, Huang X, Yu J, Wang T-C, Mallya A (2021) Generative adversarial networks for image and video synthesis: algorithms and applications. Proc IEEE 109(5):839–862
10. Nirkin Y, Wolf L, Keller Y, Hassner T (2021) DeepFake detection based on discrepancies between faces and their context. IEEE Trans Pattern Anal Mach Intell
11. Rossler A, Cozzolino D, Verdoliva L, Riess C, Thies J, Nießner M (2019) FaceForensics++: learning to detect manipulated facial images. In: Proceedings of the IEEE/CVF international conference on computer vision
12. Tewari A, Zollhoefer M, Bernard F, Garrido P, Kim H, Perez P, Theobalt C (2018) High-fidelity monocular face reconstruction based on an unsupervised model-based face autoencoder. IEEE Trans Pattern Anal Mach Intell 42(2):357–370
13. Viola P, Jones M (2001) Rapid object detection using a boosted cascade of simple features. In: Proceedings of the 2001 IEEE computer society conference on computer vision and pattern recognition (CVPR 2001)
14. Zhao T, Xu X, Xu M, Ding H, Xiong Y, Xia W (2021) Learning self-consistency for deepfake detection. In: Proceedings of the IEEE/CVF international conference on computer vision, pp 15023–15033
15. Xiang J, Zhu G (2017) Joint face detection and facial expression recognition with MTCNN. In: 2017 4th international conference on information science and control engineering (ICISCE). IEEE, pp 424–427

Benchmarking Innovation in Countries: A Multimethodology Approach Using K-Means and DEA Edilvando Pereira Eufrazio and Helder Gomes Costa

Abstract This article addresses the comparison of innovation between countries using data from the Global Innovation Index (GII) and a Data Envelopment Analysis (DEA) based approach. A problem that occurs when using DEA is the distortion caused by heterogeneity in the data. In this proposal, the problem is avoided by using two-stage modelling. The first stage consists of grouping the countries into clusters using K-means; in the second stage, the input and output data are taken from the GII. This stage is followed by an analysis of the benchmarks of each of these clusters using the classic DEA model with constant returns to scale and the identification of anti-benchmarks through the inverted frontier. As an innovation with respect to the GII report, this article brings a two-stage analysis that makes the comparison between countries belonging to the same cluster fairer, mitigating the distortions that would otherwise appear because of heterogeneity in the data.

Keywords Innovation · DEA · K-means · Benchmarking · Clustering

1 Introduction

This article compares innovation between countries using data from the Global Innovation Index (GII) and a Data Envelopment Analysis (DEA) based approach. Innovation permeates various social sectors and is present at various levels of the global production system, so it is often a difficult task to obtain metrics capable of measuring innovation and justifying the investments made in its promotion. Studying innovation at the country level, in terms of a National Innovation System (NIS), is very important for the development of countries, because through innovation it is possible to reduce the unemployment rate [1], and innovation works as a driver of economic development [2]. Even though various approaches for measuring innovation efficiency have been proposed, two important elements are often missing, at least in combination: (1) accounting for the diversity of national innovation systems (NIS), which makes benchmarking or ranking countries a hard task, and (2) evaluating the responsiveness of innovation outputs to innovation-related investments [3]. Among the possible indices to measure innovation, the Global Innovation Index (GII) stands out. The index was created in 2007 and, in 2017, it encompassed 127 countries, representing 92.5% of the world population and 97.6% of GDP. The GII seeks to establish metrics capable of better capturing the multiple facets of innovation and revealing its advantages to society [4], and it is a world reference on innovation. The GII analyzes not only traditional measures of innovation but also evaluates unconventional factors encompassed in innovation. Envisioned to capture as complete a picture of innovation as possible, the index comprises 80 indicators for 2020. This work therefore seeks to identify, within the group of countries for which the GII is calculated, countries that are similar to each other in terms of investment in and results from innovation. To search for these similar groups, K-means [5] was used, and five groups of countries were found. After the separation into groups, a classical Data Envelopment Analysis (DEA) model (CCR) [6] was applied to identify the relative efficiencies of each country within its group, and the study of the standard and inverted frontiers identified the benchmarks and anti-benchmarks of each cluster. Considering this proposal, a search was made in the literature for works dealing with the use of DEA and efficiency in technological innovation at the national level. Some works were found that consider other indices but with different approaches [7–9]; these differences range from methodological choices to the geographic scope of the analyses [10–13]. In our work, we apply this idea to fill a gap with a two-stage analysis that seeks to compare economies that share similar levels of investment in innovation and to understand, within each of these groups, what leads certain countries to stand out. We believe that this approach can be expanded and used in other public development policies, which is quite in line with the theme of GII 2020: Who Will Finance Innovation? [14].

2 Background

2.1 The Global Innovation Index (GII)

The GII is co-published by Cornell University, INSEAD, and the World Intellectual Property Organization (WIPO). The GII is composed of three indices: the overall GII, the Innovation Input Sub-Index, and the Innovation Output Sub-Index. The overall GII score is the average of the scores of the Input and Output Sub-Indices. The Innovation Input Sub-Index is comprised of five pillars that capture elements of the national economy that enable innovative activities: (1) Institutions, (2) Human capital and research, (3) Infrastructure, (4) Market sophistication, and (5) Business sophistication. The Innovation Output Sub-Index provides information about the outputs that result from the innovative activities of economies. There are two output pillars: (6) Knowledge and technology outputs and (7) Creative outputs. Each pillar has three sub-pillars, and each sub-pillar is composed of individual indicators, totaling 80 for 2020 (Global Innovation n.d.). In this article, we use the data from the 2020 GII edition, as it was the newest one available at the time the research was done.

2.2 K-Means Method

The K-means method basically allocates each sample element to the cluster whose centroid (sample mean vector) is closest to the observed value vector for the respective element [5]. This method consists of four steps [15], and a minimal sketch of its use together with the elbow method is given after the list:

1. First choose k centroids, called "seeds" or "prototypes", to start the partition process.
2. Each element of the dataset is then compared to each initial centroid using a distance measure, usually the Euclidean distance, and is allocated to the group whose centroid is closest.
3. After applying step 2 to each of the sample elements, recalculate the centroid values for each new group formed, and repeat step 2 considering the centroids of these new groups.
4. Steps 2 and 3 are repeated until all sample elements are "well allocated" in their groups, i.e., until no element reallocation is required.

To decide on the number of clusters, we use the "elbow method", a cluster-analysis consistency interpretation and validation method designed to help find the appropriate number of clusters in a dataset [16].
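The following is a minimal sketch of this clustering step; the file name, the column names, and the candidate range of k are assumptions made only for illustration.

```python
# Minimal K-means + elbow-method sketch (illustrative; file and column names
# are assumed, not taken from the paper).
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

df = pd.read_csv("gii_2020.csv")                    # hypothetical GII export
X = StandardScaler().fit_transform(df[["input_subindex", "output_subindex"]])

# Elbow method: inspect the within-cluster sum of squares (inertia) against k
# and look for the point where the decrease levels off.
inertia = {k: KMeans(n_clusters=k, n_init=10, random_state=0).fit(X).inertia_
           for k in range(2, 11)}
print(inertia)

# The paper settles on five clusters.
labels = KMeans(n_clusters=5, n_init=10, random_state=0).fit_predict(X)
df["cluster"] = labels
```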

2.3 Data Envelopment Analysis (DEA)

Data Envelopment Analysis is a methodology based on mathematical programming whose aim is to measure the efficiency of a set of productive units, called Decision Making Units (DMUs), which consume multiple inputs to produce multiple outputs [6]. In the original model of Charnes et al. [6], efficiency is represented by the ratio of weighted outputs to weighted inputs, which generalizes Farrell's single-input, single-output efficiency. An important feature of DEA is its ability to provide efficiency scores while taking into account both multiple inputs and multiple outputs [17].

The DEA procedure optimizes the measured performance of each DMU in relation to all other DMUs in a production system that transforms multiple inputs into multiple outputs, using Linear Programming (LP): a set of interrelated linear programming problems, one per DMU, is solved in order to determine the relative efficiency of each of them [18]. In this work, we use the CCR model, originally presented by Charnes et al. [6], which builds a piecewise-linear, non-parametric surface enveloping the data. It works with constant returns to scale, that is, any variation in the inputs produces a proportional variation in the outputs; this model is also known as the CRS (Constant Returns to Scale) model. In terms of orientation, the CCR model can be oriented towards outputs or inputs; in this work, the orientation adopted is towards outputs, and a minimal sketch of such a model is given below. The mathematical structure of these models allows a DMU to be considered efficient with multiple sets of weights, and zero weights can be assigned to some input or output, which means that the corresponding variable is disregarded in the assessment [19].
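As a sketch of what an output-oriented CCR evaluation involves, the function below solves the standard envelopment form of the model with scipy. It is a generic illustration, not the authors' implementation, and the data layout (inputs and outputs as matrices with one column per DMU) is an assumption.

```python
# Output-oriented CCR (CRS) DEA sketch using the envelopment form:
#   max phi  s.t.  X @ lam <= x0,  Y @ lam >= phi * y0,  lam >= 0,
# where X (m x n) holds the inputs and Y (s x n) the outputs of the n DMUs.
# The efficiency of the evaluated DMU is 1 / phi (1.0 means efficient).
import numpy as np
from scipy.optimize import linprog

def ccr_output_oriented(X, Y):
    m, n = X.shape
    s, _ = Y.shape
    scores = []
    for j0 in range(n):
        # Decision variables: [phi, lam_1, ..., lam_n]; linprog minimizes,
        # so we minimize -phi in order to maximize phi.
        c = np.concatenate(([-1.0], np.zeros(n)))
        # Input constraints:  X @ lam - x0 <= 0
        A_in = np.hstack([np.zeros((m, 1)), X])
        b_in = X[:, j0]
        # Output constraints: phi * y0 - Y @ lam <= 0
        A_out = np.hstack([Y[:, [j0]], -Y])
        b_out = np.zeros(s)
        res = linprog(c, A_ub=np.vstack([A_in, A_out]),
                      b_ub=np.concatenate([b_in, b_out]),
                      bounds=[(0, None)] * (n + 1), method="highs")
        scores.append(1.0 / res.x[0])
    return np.array(scores)
```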

2.4 The CCR Model Inverted Frontier

The inverted frontier can be seen as a pessimistic assessment of the DMUs. This method assesses the inefficiency of a DMU by building a frontier consisting of the units with the worst management practices, called the inefficient frontier. Projections of DMUs onto the inverted frontier indicate an anti-target, which is a linear combination of anti-benchmarks. To calculate the inefficiency frontier, the inputs are exchanged with the outputs of the original DEA model [20]. The inverted-frontier assessment can be used to avoid the problem of low discrimination in DEA and to order the DMUs. For that, we use the aggregated efficiency index (composite efficiency), which is the arithmetic mean of the efficiency in relation to the original frontier and the inefficiency in relation to the inverted frontier. Thus, to reach maximum composite efficiency, a DMU needs to perform well at the standard frontier and not perform well at the inverted frontier; this implies that a DMU is good at the characteristics where it performs well and not so bad at those where its performance is not the best [21].
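Following that definition, and reusing the hypothetical ccr_output_oriented function sketched above with X and Y as the input and output matrices, the inverted and composite scores could be computed as follows; this assumes the same envelopment model is simply re-run with inputs and outputs swapped.

```python
# Standard frontier, inverted frontier (inputs and outputs swapped), and the
# composite efficiency: the mean of standard efficiency and (1 - inverted).
standard = ccr_output_oriented(X, Y)
inverted = ccr_output_oriented(Y, X)      # swap the roles of inputs and outputs
composite = (standard + (1.0 - inverted)) / 2.0
```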

3 Methods and Results

First, the flow of the methodology is presented to familiarize the reader; in the following subsections, each topic is briefly detailed and the relevant results are shown. The adopted methodology is grounded on a sequence of three steps:

1. Use the GII composite index to perform a K-means cluster analysis.
2. Apply the output-oriented DEA CCR model to each cluster from step 1, using the sub-pillars provided in the GII report as inputs and outputs to calculate a composite efficiency.
3. Identify benchmarks and anti-benchmarks considering a standard frontier and an inverted frontier.

Fig. 1 Final cluster structure

3.1 K-Means Cluster Analysis of GII

Here we use the GII aggregated input and output scores to divide the data into country clusters. We also used a sensitivity analysis of the number of clusters, in conjunction with the elbow method, to support the choice of the number of clusters. Figure 1 shows the final structure of the five clusters obtained through K-means. These are the data that served as the basis for the rest of the analysis; from here on, each cluster is examined in isolation.

3.2 DEA CCR Model Output-Oriented

In this subsection, we use the DEA CCR model (constant returns to scale). This was the fitting approach given the combined use with K-means, which generates homogeneity inside the groups and heterogeneity between groups. Therefore, it was decided to work with constant returns to scale rather than variable returns to scale (as in the BCC model), a choice justified by the preprocessing obtained by clustering the data. In our analysis context, each country is considered a DMU, and for the set of inputs and outputs we consider the sub-pillars described in the GII methodology; a sketch of this per-cluster evaluation is given below.
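A minimal sketch of the resulting two-stage pipeline is shown below, reusing the hypothetical cluster labels and the ccr_output_oriented function from the earlier sketches; the column names (including "country") are assumptions for illustration only.

```python
# Two-stage sketch: within each K-means cluster, run the output-oriented CCR
# model on the five input sub-pillars and two output sub-pillars of the GII.
INPUTS = ["institutions", "human_capital_research", "infrastructure",
          "market_sophistication", "business_sophistication"]   # assumed names
OUTPUTS = ["knowledge_technology_outputs", "creative_outputs"]  # assumed names

for cluster_id, group in df.groupby("cluster"):
    X = group[INPUTS].to_numpy().T     # inputs:  m x n (DMUs in columns)
    Y = group[OUTPUTS].to_numpy().T    # outputs: s x n
    std = ccr_output_oriented(X, Y)
    inv = ccr_output_oriented(Y, X)
    comp = (std + (1.0 - inv)) / 2.0
    ranking = group.assign(standard=std, inverted=inv, composite=comp) \
                   .sort_values("composite", ascending=False)
    print(f"Cluster {cluster_id}:")
    print(ranking[["country", "standard", "inverted", "composite"]].head(3))
```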

Thus, the model had five input variables (Institutions, Human capital and research, Infrastructure, Market sophistication, and Business sophistication) and two output variables (Knowledge and technology outputs and Creative outputs). Despite the use of sub-pillars, which are aggregations of other indicators, we understand that the fact that the scores are standardized and that the same rules are applied to all DMUs alleviates the possible unwanted effect of using indices instead of original variables, as described in Dyson et al. [22]. Output orientation was chosen, as one of our goals is to see which countries are benchmarks in terms of optimizing spending on investments linked to innovation; in other words, by orienting the model to outputs we consider that the investments remain constant, which allows us to observe the DMUs that are more efficient. An analysis of the inverted frontier is also carried out, which seeks to identify the DMUs that have produced little in terms of outputs even with high inputs. Finally, the composite efficiency index is calculated to identify DMUs that present a balance between the two approaches. Table 1 shows the results compiled from the five clusters in terms of the calculated efficiencies; the table shows the first three and the last three countries of each cluster, ordered by composite efficiency. Analyzing the clusters according to efficiencies, it can be seen that cluster 1 includes some countries of Latin and Central America, with Jamaica and Colombia standing out; this cluster also includes countries of Africa and Central Asia. With respect to the countries with the worst performance, Oman (in the Arabian Peninsula), Brunei, and Peru stand out. Cluster 2 includes countries that have a higher level of investment due to their high GDP; in this cluster we have countries such as the USA, Singapore, China, and the United Kingdom. In the analysis, we see that Switzerland achieved the greatest composite efficiency and is also efficient in terms of standard efficiency, together with Ireland and the United Kingdom. At the other extreme of the cluster, Singapore, Canada, and China are less efficient; it should be understood that these numbers are limited to the analysis within the cluster, which in absolute terms has the highest level of inputs and outputs. Cluster 3 includes countries from Eastern Europe, countries from Latin America (Brazil, Mexico, and Chile), and some Asian countries. It can be said that the cluster in general brings together developing economies, containing practically all the BRICS except China. Bulgaria, Vietnam, and Slovakia stand out positively. On the other end, we have South Africa, Brazil, and Russia, countries that, compared to the others in the cluster, invest considerable resources but do not have outputs consistent with the investment. Cluster 4 has 36 countries, mostly poor countries of the African continent, with a low index of inputs, which translates into a low rate of investment and consequently a low rate of outputs. Côte d'Ivoire, Madagascar, and Pakistan are worth highlighting positively in this cluster; these countries are not necessarily those with the highest output rates but have a high relative efficiency, balancing outputs with inputs.

Table 1 Efficiency compilation

Countries                      Standard efficiency   Inverted efficiency   Composite efficiency

Cluster 1
Jamaica                        1.00                  0.49                  0.75
Colombia                       1.00                  0.49                  0.75
Morocco                        1.00                  0.52                  0.74
Last countries in cluster 1
Oman                           0.57                  1.00                  0.29
Peru                           0.48                  0.95                  0.27
Brunei                         0.46                  1.00                  0.23
Cluster 2
Switzerland                    1.00                  0.63                  0.68
Ireland                        1.00                  0.65                  0.67
United Kingdom (the)           1.00                  0.68                  0.66
Last countries in cluster 2
Singapore                      0.75                  0.86                  0.44
Canada                         0.68                  1.00                  0.34
China                          0.67                  1.00                  0.33
Cluster 3
Bulgaria                       1.00                  0.63                  0.69
Viet Nam                       1.00                  0.65                  0.67
Slovakia                       1.00                  0.67                  0.67
Last countries in cluster 3
Brazil                         0.66                  1.00                  0.33
South Africa                   0.64                  1.00                  0.32
Costa Rica                     0.59                  1.00                  0.29
Cluster 4
Côte d'Ivoire                  1.00                  0.48                  0.76
Madagascar                     1.00                  0.52                  0.74
Pakistan                       1.00                  0.54                  0.73
Last countries in cluster 4
Mozambique                     0.58                  1.00                  0.29
Zambia                         0.50                  1.00                  0.25
Benin                          0.43                  1.00                  0.21
Cluster 5
Malta                          1.00                  0.73                  0.63
Iceland                        1.00                  0.75                  0.62
Luxembourg                     1.00                  0.78                  0.61
Estonia                        1.00                  0.84                  0.58
Last countries in cluster 5
Australia                      0.84                  1.00                  0.42
Slovenia                       0.84                  1.00                  0.42
United Arab Emirates (the)     0.71                  1.00                  0.35

A negative highlight should be placed on the last three countries, Mozambique, Zambia, and Benin, which are poor countries that consequently invest more but still have not reaped proportional outputs. Cluster 5 includes countries of the Iberian Peninsula, some countries of southern Europe, and Oceania. It brings together countries with consolidated economies that are just below cluster 2, which contains the countries with the highest levels of inputs and outputs. Within cluster 5, Malta, Iceland, and Luxembourg are worth mentioning; considering the inverted frontier, Australia, Slovenia, and the United Arab Emirates stand out.

3.3 DEA CCR Model Output-Oriented

In cluster 1, we identified Colombia, Jamaica, Panama, and Morocco as benchmarks for most DMUs. This cluster is made up of economies that are, in general, developing and whose countries do not have a high level of investment in innovation; however, within the paradigm of this cluster, these DMUs can present practices to be observed by the other members. At the other pole, we identified Brunei Darussalam, Uzbekistan, and Rwanda as the most frequent anti-benchmarks, countries with little efficiency in terms of balancing inputs and outputs. For cluster 2, we see that the most frequent benchmarks are Hong Kong, Germany, and the Netherlands. As previously mentioned, this is the cluster with the highest output values; the countries belonging to this group have high values in terms of both outputs and inputs. Considering the anti-benchmarks, Canada and China are the most frequent countries, but it is worth noting that some countries, such as China and Hong Kong, are benchmarks for Canada, for example, while being anti-benchmarks of other countries such as the Netherlands. In cluster 3, the benchmarks and anti-benchmarks concentrate on developing countries that already have slightly higher values in terms of inputs and outputs. The most frequent benchmarks are Bulgaria, Armenia, and Iran (Islamic Republic of). Considering the anti-benchmarks, Brazil, Costa Rica, and Greece are the most frequent; these countries have relatively high inputs within the cluster but do not have a proportional output, which ends up compromising their efficiency. Cluster 4 contains the countries with the lowest indexes in terms of inputs and outputs, mostly countries of the African continent that do not invest much in technological innovation. Among the benchmarks, Egypt stands out with a good balance between inputs and outputs, while other benchmarks, such as Zimbabwe, have lower score levels compared with Egypt. Regarding anti-benchmarks, we highlight Mozambique, Guinea, and Benin. Cluster 5 contains the countries of Eastern Europe, Oceania, and the Iberian Peninsula, in general consolidated economies with high levels of outputs and inputs. Estonia and the Czech Republic stand out as cluster 5 benchmarks; it should be noted that this cluster has the highest number of efficient DMUs, showing a balanced level between inputs and outputs in comparative terms. Regarding anti-benchmarks, we highlight Belgium, Slovenia, and the United Arab Emirates.

4 Conclusion

The article presented a hybrid application of K-means and DEA applied to GII data. We understand that, in methodological terms, the approach can be extended to other fields, providing an effective way to separate the DMUs before applying the DEA models. Considering the results, we understand that the objectives of the article were achieved, since clusters, benchmarks, and anti-benchmarks were identified for each of the five clusters found. The applied methodology and the results found have the potential to aid decision-making in terms of public policies and are in line with the 2020 theme of the GII ("Who will Finance Innovation?"). As limitations of the work, we point out that we work with index numbers for the input and output variables, and that the input and output data refer to the same year rather than to a panel. We understand that, in future work, using panel data with different years for the inputs in relation to the outputs could provide valuable insights, since investments in innovation take time to take effect. Another proposal would be to investigate how much the use of index numbers affects the analysis of benchmarks.

Acknowledgements This research was partially supported by:
• Coordenação de Aperfeiçoamento de Pessoal de Nível Superior—Brasil (CAPES).
• Conselho Nacional de Desenvolvimento Científico e Tecnológico.
• Fundação Carlos Chagas Filho de Amparo à Pesquisa do Estado do Rio de Janeiro.

References

1. Richardson A, Audretsch DB, Aldridge T, Nadella VK (2016) Radical and incremental innovation and the role of university scientist. Springer, Cham, pp 131–207. https://doi.org/10.1007/978-3-319-26677-0_5
2. Rinne T, Steel GD, Fairweather J (2012) Hofstede and Shane revisited. Cross-Cult Res 46(2):91–108. https://doi.org/10.1177/1069397111423898
3. Tziogkidis P, Philippas D, Leontitsis A, Sickles RC (2020) A data envelopment analysis and local partial least squares approach for identifying the optimal innovation policy direction. Eur J Oper Res 285(3):1011–1024. https://doi.org/10.1016/j.ejor.2020.02.02
4. Lapa MSS, Ximenes E (2020) Ensaio Sobre a Relação de Pernambuco com o Indicador Produtos Criativos Adotado no Índice Global de Inovação. Braz J Dev 6(11):92639–92650. https://doi.org/10.34117/bjdv6n11-613
5. MacQueen J (1967) Some methods for classification and analysis of multivariate observations. The Regents of the University of California. https://projecteuclid.org/euclid.bsmsp/1200512992
6. Charnes A, Cooper WW, Rhodes E (1978) Measuring the efficiency of decision making units. Eur J Oper Res 2(6):429–444. https://doi.org/10.1016/0377-2217(78)90138-8
7. Guan J, Chen K (2012) Modeling the relative efficiency of national innovation systems. Res Policy 41(1):102–115. https://doi.org/10.1016/j.respol.2011.07.001
8. Matei MM, Aldea A (2012) Ranking national innovation systems according to their technical efficiency. Procedia Soc Behav Sci 62:968–974. https://doi.org/10.1016/j.sbspro.2012.09.165
9. Min S, Kim J, Sawng YW (2020) The effect of innovation network size and public R&D investment on regional innovation efficiency. Technol Forecast Soc Chang 155:119998. https://doi.org/10.1016/j.techfore.2020.119998
10. Chen K, Guan J (2012) Measuring the efficiency of China's regional innovation systems: application of network data envelopment analysis (DEA). Reg Stud 46(3):355–377. https://doi.org/10.1080/00343404.2010.497479
11. Crespo NF, Crespo CF (2016) Global innovation index: moving beyond the absolute value of ranking with a fuzzy-set analysis. J Bus Res 69(11):5265–5271. https://doi.org/10.1016/j.jbusres.2016.04.123
12. Pan TW, Hung SW, Lu WM (2010) DEA performance measurement of the national innovation system in Asia and Europe. Asia-Pac J Oper Res 27(3):369–392. https://doi.org/10.1142/S0217595910002752
13. Salas-Velasco M (2019) Competitiveness and production efficiency across OECD countries. Compet Rev 29(2):160–180. https://doi.org/10.1108/CR-07-2017-0043
14. Cornell University, INSEAD, and WIPO (2020) The Global Innovation Index 2020: Who Will Finance Innovation?
15. Mingoti SA (2005) Análise de dados através de métodos de Estatística Multivariada: Uma abordagem aplicada, p 297
16. Ketchen DJ, Shook CL (1996) The application of cluster analysis in strategic management research: an analysis and critique. Strateg Manag J 17(6):441–458. https://doi.org/10.1002/(sici)1097-0266(199606)17:6%3c441::aid-smj819%3e3.0.co;2-g
17. Farrell MJ (1957) The measurement of productive efficiency. J Roy Stat Soc 120(3):253–281
18. Sueyoshi T, Goto M (2018) Environmental assessment on energy and sustainability by data envelopment analysis. John Wiley & Sons, Ltd. https://doi.org/10.1002/9781118979259
19. Cooper WW, Seiford LM, Tone K (2007) Data envelopment analysis: a comprehensive text with models, applications, references and DEA-solver software, 2nd edn. Springer US. https://doi.org/10.1007/978-0-387-45283-8
20. da Silveira JQ, Meza LA, de Mello JCCBS (2012) Use of DEA and inverted frontier for airlines benchmarking and anti-benchmarking identification. Produção 22(4):788–795. https://doi.org/10.1590/S0103-65132011005000004
21. Mello JCCBS, Gomes EG, Meza LA, Leta FR (2008) DEA advanced models for geometric evaluation of used lathes. WSEAS Trans Syst 7(5):510–520
22. Dyson RG, Allen R, Camanho AS, Podinovski VV, Sarrico CS, Shale EA (2001) Pitfalls and protocols in DEA. Eur J Oper Res 132(2):245–259. https://doi.org/10.1016/S0377-2217(00)00149-1

Line of Work on Visible and Near-Infrared Spectrum Imaging for Vegetation Index Calculation Shendry Rosero

Abstract This study proposes a basic workflow for calculating vegetation indexes from image bands coming from different sources for the same scene. To this end, we worked with captures from a NIR (near-infrared) sensor and captures from an RGB sensor as if they were a single image. Although there are vegetation-index applications that could serve as a commercial alternative for this type of work, this proposal is intended to provide a basic image-processing model for academic purposes while also becoming a low-cost alternative for farmers whose plantations require solutions on a limited budget. Hence, we worked with two techniques: geometric transformations and processes based on correlation-enhancement techniques.

Keywords Image registration · K-means · NIR · Rectification · Multispectral

1 Introduction

Obtaining 3D information from aerial images depends on the quality and quantity of input information (2D) that can be preprocessed; in practice, it is common to follow a flow composed of calibration, rectification, and image registration. Calibration consists of obtaining as much intrinsic and extrinsic information as possible from the camera, which allows eliminating distortions, in some cases typical of the lens and its configuration and in other cases defects of the particular lens used; together this allows eliminating common distortions such as radial and barrel distortions, to mention the most common ones. In the case of aerial photography, and due to the constant movements of the vehicle caused by stabilization effects, it is not enough to correct the camera's distortions: since it is not always possible to obtain a homothetic photograph of the terrain, it is also necessary to rectify the images, which consists of transforming an image into a scale projection of the terrain; rectification then corrects the displacements present in the aerial capture due to constant stabilization movements [1, 2]. The result is a rectified image that must have the characteristics of an orthogonal projection of the photographed object onto the study plane at a given scale. The conditions of this technique require that the initial captures do not exceed an angle of inclination of 3° in any direction of the captured plane and that the terrain can be considered flat; this last condition can be obtained by varying the capture height of the images. The third element of the process is image registration, which consists of transforming the data set obtained from each photograph into a common coordinate system that allows the data to be compared or integrated. The present study proposes to simplify this process, reducing the number of steps for image calibration and image registration, because the low altitude of the drones would eliminate the need for image rectification over terrain with a certain amount of deformation. The calibration and registration methods used are presented in Sect. 2, the comparative methods in Sect. 3, and the results obtained in Sect. 4, which includes a grouping process for the comparison of the registration techniques used. In this context, the overall result is a short academic treatise on image processing whose direct beneficiaries could be local farmers, by obtaining a simple method of plant health assessment.

2 Literature Review

2.1 Calibration

One of the important factors to consider when correcting camera distortion is, in general, the radial and tangential distortion. The mathematical treatment of the distortion factors usually yields the parameters needed to correct the distortion; for reasons of simplification and didactics, this procedure is beyond the scope of this study. In general, the calibration process used [3] is based on obtaining images or videos of calibration patterns with the camera to be calibrated; each new pattern found represents a new equation, and the solution depends on the system of equations formed by the N captured patterns.

2.2 Image Registration and Vegetation Index Calculation

Image registration and alignment is a process that, under a common coordinate system, transforms image data in order to obtain different parameters to be evaluated or simply to improve the characteristics of the processed images. Registration methods can be classified into two major groups that are equally effective depending on the quality of the images used and the purposes of the subsequent measurement: methods based on intensity measurement and feature-based methods. The present study gives a brief analysis of the two.

Image registration based on improved correlation coefficients. The ECC (Enhanced Correlation Coefficient) was proposed in 2008 by Georgios D. Evangelidis and Emmanouil Z. Psarakis [4]. Its purpose was to use a new similarity measure, the enhanced correlation coefficient, estimating the alignment parameters through a motion model classified into four types: translation, Euclidean, affine, and homography. The models differ in how many parameters must be estimated for the variation of the image to be aligned with respect to the fixed image; this geometric variation makes it possible to distinguish a change in angles, in line parallelism, or a complete distortion of the images [4–6].

Image registration based on feature detection. Feature-based methods search for points of interest between each pair of images; the best-known techniques employ descriptors such as SIFT (Scale Invariant Feature Transform) (Lowe, 1999) or Speeded-Up Robust Features (SURF), proposed by Bay et al. in 2008, to determine the points of interest between images. One of the advantages of using descriptors is that they do not require user intervention; on the other hand, in the tests performed it was determined that both SIFT and SURF have problems when comparing images that lack salient features, making it difficult to find points of interest to compare. Alternatives are shown in "Image registration by local approximation methods" (Goshtasby, 1988) [7]; a similar proposal by Goshtasby was made some years before [8].

Vegetation index calculation. One of the most common applications of photographs captured in the near-infrared spectrum is the calculation of vegetation indices, the best known being the Normalized Difference Vegetation Index or NDVI (Rouse et al., 1974), which, among other indices, reflects the particular radiometric behavior of vegetation in relation to its photosynthetic activity and the structure of the plants themselves (leaf structure), allowing in this specific case the plant health of the element under examination to be determined. This is due to the amount of energy that plants absorb or reflect in the various portions of the electromagnetic spectrum to which they are exposed, especially in the red and near-infrared bands. Thus, the spectral response of healthy vegetation contrasts between the visible spectrum and the near infrared (NIR), which is related to the amount of water absorbed by the plant: while in the visible spectrum the plant pigments absorb most of the received energy, in the NIR they reflect most of it; on the contrary, in diseased vegetation (for various reasons) the amount of reflection in the NIR spectrum is severely reduced. In general, the calculation of the indices reduces to operations on the bands of the visible spectrum with respect to the NIR spectrum; Table 1 shows a brief example of the different calculations that can be obtained.


Table 1 Simplified vegetation indices obtained by processing visible-spectrum bands with the near-infrared band

Index: Normalized difference vegetation index (NDVI)
Calculation: NDVI = (NIR − RED) / (NIR + RED)
Feature: Scale from −1 to 1, with zero representing the absence of plants or the presence of other elements; non-zero values indicate different levels of plant health.

Index: Transformed vegetation index (TVI)
Calculation: TVI = √(NDVI + 0.5)
Feature: The 0.5 is a correction factor that avoids negative results, while the square root tries to correct values that approximate a Poisson distribution.

An example of how to interpret the values obtained depends on the fluctuation of the calculation made, for example for the NDVI case (−1 to 1), the studies determine that those negative values correspond mainly to cloud-like formations, water, and snow. Values close to zero correspond mainly to rocks and soils or simply areas devoid of vegetation. Values below 0.1 correspond to rocky areas, sand, or snow. Moderate values (0.2–0.3) represent shrub and grasslands. High values indicate temperate and tropical forests (0.6–0.8). Of course, this interpretation will be subject to the type of place where the capture is made, considering the different shades of the photographed elements.

3 Methodology

The proposal consists of the geometric calibration of the cameras, an optical calibration, and the registration of the images. For the calibration of the cameras used to generate the registration, two digital cameras were employed: a Ricoh GR Digital 4 and a MAPIR Survey 2 (near-infrared spectrum). Both cameras were subjected to a calibration process using the method proposed by Zhang [3], based on a chessboard-like pattern. In the case of the MAPIR Survey 2, an additional optical calibration method proposed by the manufacturer was also used.

3.1 Calibration

Zhang's method uses a checkerboard-like pattern of squares; in this case, an asymmetric pattern of 10 × 7 black/white squares was used. The process begins with the capture of images with the camera to be calibrated; tests were made for the MAPIR and the Ricoh GR Digital cameras. The photos were captured at distances of 30 and 60 cm in order to verify the best results both visually and mathematically, that is, to obtain a distortion coefficient between 0 and 1, or at least values for this coefficient not far from 1. The number of photographs captured was 20 per camera, and the implementation of the algorithm is an adaptation of the algorithm proposed at http://docs.opencv.org/3.1.0/d4/d94/tutorial_camera_calibration.html, using Visual Studio with C++ and OpenCV. From the results obtained, the author believes that, given the way the captures were obtained (sensor types and brands), the calibration values did not affect the results obtained (Fig. 1); however, this section is maintained for academic purposes related to the normal data flow in image processing.

Fig. 1 Result of image calibration using the checkerboard pattern (Zhang procedure). Image a shows an uncorrected photograph, while image b shows the rectified photograph (slight black rectification border)

Image correction. The adaptation of the algorithm starts by reading the general settings from an XML file, which holds the initial parameters such as the number of internal corners widthwise, the number of internal corners heightwise, and the size in millimeters of the checkerboard squares. In parallel, it is necessary to define another XML file containing the path to each of the images captured by the camera to be processed; this file, its name, and its location must also be declared in the "default.xml" file under the corresponding tags. With these mathematical results, we proceeded to rectify the Ricoh images. As a precaution, it is recommended to insert a function that calculates the size of the image and thus verify that this size corresponds to the size of the calibration patterns used; otherwise the effect may be counterproductive, and the resulting images may show greater distortion than that added by the camera lens. An example of the process is shown in Fig. 1. The same process was applied to the MAPIR Survey 2 camera.
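As a minimal Python/OpenCV counterpart of that C++ adaptation (a sketch only: the pattern size matches the 10 × 7 board mentioned above, but the file paths and the use of image files instead of XML settings are assumptions):

```python
# Chessboard calibration sketch with OpenCV (Zhang's method): detect inner
# corners in several captures, then estimate the camera matrix and distortion
# coefficients and undistort an image.
import glob
import cv2
import numpy as np

pattern = (9, 6)   # inner corners of a 10 x 7 squares board
objp = np.zeros((pattern[0] * pattern[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:pattern[0], 0:pattern[1]].T.reshape(-1, 2)

obj_points, img_points = [], []
for path in glob.glob("calib/*.jpg"):              # hypothetical capture folder
    gray = cv2.cvtColor(cv2.imread(path), cv2.COLOR_BGR2GRAY)
    found, corners = cv2.findChessboardCorners(gray, pattern)
    if found:
        obj_points.append(objp)
        img_points.append(corners)

ret, K, dist, _, _ = cv2.calibrateCamera(obj_points, img_points,
                                         gray.shape[::-1], None, None)
undistorted = cv2.undistort(cv2.imread("calib/test.jpg"), K, dist)
cv2.imwrite("rectified.jpg", undistorted)
```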


MAPIR Survey calibration. The photographs coming from the MAPIR Survey 2 camera were subjected to the same calibration process as the Ricoh, plus an additional optical calibration process proposed by the manufacturer. Obtaining the distortion matrix of the MAPIR camera does not strictly require a previous optical correction, but it is recommended in order to avoid problems in the detection of the control points used to compute the distortion coefficients. For cases where the images had very short exposure times and did not allow control-point detection, an additional preprocessing step was performed, consisting of channel separation (R, G, B). Channel separation was necessary to obtain an image with a higher contrast difference so that the calibration algorithm could determine the control points. From the separation of channels, it was determined that the blue channel had the highest contrast and therefore the greatest opportunity to detect the necessary control points for calibration; after this, a contrast improvement through its histogram was applied, and finally the calibration process was run. A small sketch of this preprocessing is given below.
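A minimal sketch of that preprocessing, assuming histogram equalization as the contrast-enhancement step (the text only says the contrast was improved through the histogram) and a hypothetical file name:

```python
# Preprocessing sketch for low-contrast NIR captures: keep the blue channel
# (highest contrast in the tests described above) and equalize its histogram
# before running the chessboard detection again.
import cv2

nir_bgr = cv2.imread("mapir_capture.jpg")          # hypothetical file name
blue, green, red = cv2.split(nir_bgr)              # OpenCV stores B, G, R
enhanced = cv2.equalizeHist(blue)                  # contrast enhancement
found, corners = cv2.findChessboardCorners(enhanced, (9, 6))
```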

3.2 Image Registration

Two forms of registration were performed: the first by means of geometric transformation techniques and the second through improved correlation coefficients [4].

Image registration through geometric transformations. The process consists of reading two previously calibrated images, an RGB image and a NIR (near-infrared) image. The NIR image from the MAPIR Survey 2 camera has three channels (red, green, and blue), of which the green and blue channels do not contain significant information; these two channels can therefore be eliminated to speed up the calculations, since it was observed that keeping them brings no major benefit to the results. The geometric transformation technique is included in the Matlab Computer Vision Toolbox and allows control points to be created manually, as shown in Fig. 2. The technique recommends at least four points, but satisfactory results were achieved with at least 11 geometrically distant control points. It is worth pointing out that what is sought is a coordinate correspondence; the positional accuracy is therefore not dictated by intensity values or scene similarity but only by coordinates, which makes the technique robust for images with a diversity of objects as reference and control points and interesting for GPS control points. Once the control points have been loaded, a transformation function based on geometric transformations [5, 6] must be invoked, and the size of the aligned image is set with respect to the reference image size. The result of this process is an aligned image, shown in Fig. 3; a minimal Python sketch of this control-point workflow is given after Fig. 3.


Fig. 2 Control point selection process, the technique recommends at least four geometrically distant control points

Fig. 3 Image resulting from the alignment using geometric transformation techniques, the figure shows a fusion between the NIR image and the RGB image
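The control-point step above was done with the Matlab Computer Vision Toolbox; the snippet below is a rough OpenCV equivalent, assuming the control-point coordinates have already been selected manually (the point values shown are placeholders, not those used in the study, and only four pairs are shown although more points improve the fit).

```python
# Control-point registration sketch: estimate a projective (homography)
# transform from manually selected point pairs and warp the NIR image onto
# the RGB image's coordinate system.
import cv2
import numpy as np

rgb = cv2.imread("scene_rgb.jpg")                   # hypothetical file names
nir = cv2.imread("scene_nir.jpg")

# Matching control points (x, y) picked by hand in each image.
pts_nir = np.float32([[102, 85], [940, 74], [958, 600], [120, 622]])
pts_rgb = np.float32([[96, 90], [932, 80], [949, 608], [114, 630]])

H, _ = cv2.findHomography(pts_nir, pts_rgb, method=0)   # least-squares fit
nir_aligned = cv2.warpPerspective(nir, H, (rgb.shape[1], rgb.shape[0]))

# Parallel concatenation of the four bands (R, G, B plus the NIR red channel).
four_band = np.dstack([rgb, nir_aligned[:, :, 2]])
```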

After this process, the two images can be concatenated to obtain the four bands needed for the measurement of the various vegetation indices; the concatenation is a parallel (band-wise) concatenation.

Image registration using improved correlation coefficients. The improved-correlation-coefficient image alignment algorithm is based on the proposal of Georgios et al. [4], which consists of estimating the correlation coefficients of a motion model. The advantage of this technique is that it does not need control points and, additionally, unlike other similarity-measurement methods, it is invariant to photometric distortions in contrast and brightness.


Fig. 4 Image resulting from the alignment using ECC techniques, the figure shows the aligned image without channel separation

For the application of this technique, we used Python with OpenCV 3; part of the algorithm can be found at https://www.learnopencv.com/image-alignment-ecc-in-opencv-c-python/ (Mallick 2015). The result can be seen in Fig. 4.
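A minimal sketch of ECC-based alignment with OpenCV (the convergence parameters are illustrative; the five-argument call matches the OpenCV 3 interface mentioned above). Following the observation that only the red channel of the MAPIR image carries the NIR information, a single channel of each image is aligned.

```python
# Minimal ECC sketch: align the NIR band with the RGB image by maximizing the
# enhanced correlation coefficient under an affine motion model.
import cv2
import numpy as np

rgb_gray = cv2.cvtColor(cv2.imread("rgb_calibrated.jpg"), cv2.COLOR_BGR2GRAY)
nir_band = cv2.imread("nir_calibrated.jpg")[:, :, 2]      # red channel holds the NIR signal

warp = np.eye(2, 3, dtype=np.float32)                     # initial affine warp
criteria = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 1000, 1e-6)
_, warp = cv2.findTransformECC(rgb_gray, nir_band, warp, cv2.MOTION_AFFINE, criteria)

h, w = rgb_gray.shape
nir_aligned = cv2.warpAffine(nir_band, warp, (w, h),
                             flags=cv2.INTER_LINEAR + cv2.WARP_INVERSE_MAP)
```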

3.3 Vegetation Index Calculation Once the images are registered, the next step is to calculate the various vegetation indices based on the band transformation, as shown in Table 1. The NDVI index was selected for experimentation purposes of the proposed method. The calculation of the vegetation index depends on the operations performed with the near-infrared spectrum and the red channel of the images to be analyzed, as shown in Eq. 1.

NDVI = (NIR − RED) / (NIR + RED)    (1)

The results of the application of geometric transformations for NIR image registration and NDVI calculation are shown in Figs. 5 and 6.
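A short sketch of the NDVI computation in Eq. (1), assuming the aligned NIR band and the red band of the RGB image are available as arrays of the same size (for instance, the hypothetical nir_aligned and rgb arrays from the registration sketches above).

```python
# Minimal NDVI sketch following Eq. (1).
import numpy as np

nir = nir_aligned.astype(np.float32)                 # aligned NIR band (previous step)
red = rgb[:, :, 2].astype(np.float32)                # red channel (OpenCV stores BGR)

ndvi = (nir - red) / (nir + red + 1e-6)              # small constant avoids division by zero
print("NDVI range:", ndvi.min(), ndvi.max())         # values fall within [-1, 1]
```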

4 Results According to the images presented, the distribution of vegetation index values in the NDVI image coincides with the healthy areas visible in the RGB image; however, in order to estimate a grouping value, it is necessary to determine around which values the index fluctuates. Therefore, a grouping


Fig. 5 Comparison of the index result: the left image shows the original RGB image, in which intense green areas can be seen that could be considered healthier, and the right image shows the result of the index, whose values range from −1 to 1; accordingly, the light areas (high values) correspond to areas of better plant health

Fig. 6 The right image shows the result of obtaining the vegetation index using ECC, slight changes can be observed with respect to Fig. 5

based on K-means of three groups and five iterations was performed, whose result is shown for the stadium image in Figs. 7 and 8.
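A minimal sketch of this grouping step: K-means with three groups and five iterations applied to the NDVI values (scikit-learn is used here purely for illustration; the study does not state which implementation was used).

```python
# Cluster the per-pixel NDVI values into three groups with at most five iterations.
from sklearn.cluster import KMeans

values = ndvi.reshape(-1, 1)                          # one NDVI value per pixel
kmeans = KMeans(n_clusters=3, max_iter=5, n_init=1, random_state=0).fit(values)
labels = kmeans.labels_.reshape(ndvi.shape)           # cluster map of the scene
print("cluster centres:", kmeans.cluster_centers_.ravel())
```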

5 Discussion Much of the current work on biomass calculation [9, 10], water body quality [11–13], and vegetation is based on the processing of hyperspectral images [14, 15] obtained from satellites, but one of the many problems of this type of image is that the study area is not always covered, or, even when the specific study region is available, the amount of information recovered from processing these images is minimal and corresponds to large areas of land. One could consider commissioning images of specific areas, but this would increase the cost of the study, in addition to weather-related constraints. One of the low-cost alternatives available today is precision agriculture [1, 16], in which the captured images come from unmanned


Fig. 7 Grouping for stadium image values with ECC registration

Fig. 8 Clustering for stadium image values with registration through geometric transformations

vehicles that fly over the study terrain at low altitude, which allows obtaining a greater amount of information per pixel and better resolutions and, depending on the altitude, may avoid radiometric and reflectivity calibrations. The processing of this type of image is reduced to the analysis of multispectral images [17–19] coming from one or more cameras with the appropriate filters. When two or more cameras are used, an image registration process must be included, given the difficulty of capturing images of the same area and controlling factors such as focal length and exposure times so that they fall within the same projection area. The present study proposes a simplified method for calculating vegetation indices that goes from camera calibration [3] to image registration, and whose results can be compared with more complex techniques. The most complex part lies in image registration, for which two techniques were evaluated: geometric transformations [7, 8] and those based on correlation coefficients [4]. From the tests carried out, it was shown


that, apart from the execution times, which are longer for the ECC technique because of its higher computational cost, the results of the geometric transformation technique do not differ significantly from those of the ECC technique. Although we leave open the possibility of comparing the results of this study with commercial solutions in future work, we were able to establish an academic working model for image processing and vegetation index calculation on a solid theoretical basis and at low cost, which in the short term can readily be used by local farmers as an analysis tool.

References

1. Marcovecchio DG, Costa LF, Delrieux CA (2014) Ortomosaicos utilizando Imágenes Aéreas tomadas por Drones y su aplicación en la Agricultura de Precisión, pp 1–7
2. Igamberdiev RM, Grenzdoerffer G, Bill R, Schubert H, Bachmann M, Lennartz B (2011) Determination of chlorophyll content of small water bodies (kettle holes) using hyperspectral airborne data. Int J Appl Earth Obs Geoinf 13(6):912–921
3. Zhang Z (2000) A flexible new technique for camera calibration. IEEE Trans Pattern Anal Mach Intell 22(11):1330–1334
4. Evangelidis GD, Psarakis EZ (2008) Parametric image alignment using enhanced correlation coefficient maximization. 30(10):1–8
5. Szeliski R (2006) Image alignment and stitching, pp 273–292
6. Baker S, Matthews I (2004) Lucas-Kanade 20 years on: a unifying framework. 56(3):221–255
7. Goshtasby A (1988) Image registration by local approximation methods. Image Vis Comput 6(4):255–261
8. Goshtasby A (1986) Piecewise linear mapping functions for image registration. Pattern Recogn 19(6):459–466
9. Garcia A (2009) Estimación de biomasa residual mediante imágenes de satélite y trabajo de campo. Modelización del potencial energético de los bosques turolenses, p 519
10. Peña P (2007) Estimación de biomasa en viñedos mediante imágenes satelitales y aéreas en Mendoza, Argentina, pp 51–58
11. Gao B (1996) NDWI—a normalized difference water index for remote sensing of vegetation liquid water from space. 266(April):257–266
12. De E (2010) Evaluación de imágenes WorldView2 para el estudio de la calidad del agua, p 2009
13. Ledesma C (1980) Calidad del agua en el embalse Río Tercero (Argentina) utilizando sistemas de información geográfica y modelos lineales de regresión. Controle da qualidade da água no reservatório de Rio Terceiro (Argentina) usando sistemas de informação geográfica e m, no 12
14. Koponen S, Pulliainen J, Kallio K, Hallikainen M (2002) Lake water quality classification with airborne hyperspectral spectrometer and simulated MERIS data. 79:51–59
15. District ML, Thiemann S, Kaufmann H (2002) Lake water quality monitoring using hyperspectral airborne data—a semiempirical multisensor and multitemporal approach for the Mecklenburg Lake District, Germany. 81:228–237
16. García-Cervigón D, José J (2015) Estudio de Índices de vegetación a partir de imágenes aéreas tomadas desde UAS/RPAS y aplicaciones de estos a la agricultura de precisión


17. Firmenich D, Brown M, Susstrunk S (2011) Multispectral interest points for RGB-NIR image registration, pp 4–7
18. Valencia UPDE (2010) Análisis de la clorofila a en el agua a partir de una imagen multiespectral Quickbird en la zona costera de Gandia
19. Lillo-Saavedra MF, Gonzalo C (2008) Aplicación de la Metodología de Fusión de Imágenes Multidirección-Multiresolución (MDMR) a la Estimación de la Turbidez en Lagos. 19(5):137–146

Modeling and Predicting Daily COVID-19 (SARS-CoV-2) Mortality in Portugal: The Impact of the Daily Cases, Vaccination, and Daily Temperatures

Alexandre Arriaga and Carlos J. Costa

Abstract The COVID-19 pandemic is one of the biggest health crises of the twenty-first century; it has completely affected society's daily life and has impacted populations worldwide, both economically and socially. The use of machine learning algorithms to study data from the COVID-19 pandemic has been quite frequent in the most varied articles published in recent times. In this paper, we analyze the impact of several variables (number of cases, temperature, people vaccinated, people fully vaccinated, number of vaccinations, and boosters) on the number of deaths caused by COVID-19 or SARS-CoV-2 in Portugal and find the most appropriate predictive model. Various algorithms were used, such as OLS, Ridge, LASSO, MLP, Gradient Boosting, and Random Forest. The method used for data processing was the Cross-Industry Standard Process for Data Mining (CRISP-DM). The data were obtained from an open-access database.

Keywords COVID-19 · Deaths · Cases · Vaccination · Temperature · Machine learning · Portugal · Python

A. Arriaga (B) ISEG (Lisbon School of Economics and Management), Universidade de Lisboa, 1200-109 Lisbon, Portugal e-mail: [email protected]
C. J. Costa Advance/ISEG (Lisbon School of Economics and Management), Universidade de Lisboa, 1200-109 Lisbon, Portugal e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 S. Anwar et al. (eds.), Proceedings of International Conference on Information Technology and Applications, Lecture Notes in Networks and Systems 614, https://doi.org/10.1007/978-981-19-9331-2_23

1 Introduction An outbreak of a disease caused by a virus is considered a pandemic when it affects a wide geographic area and has a high level of infection, which can lead to many deaths [1]. Throughout the history of humanity, there have been several pandemics, some with higher mortality rates than others, such as the Spanish flu (1918), the Asian flu


(1957), the Hong Kong flu (1968), and the Swine flu (2009) [2]. The most impactful pandemic of this century is the COVID-19 pandemic. COVID-19 is a respiratory disease caused by the SARS-CoV-2 virus [1] that affects all age groups but has more serious consequences in older individuals and/or people with pre-existing medical conditions [3]. The first recorded cases date back to December 31, 2019, in Wuhan City, China [4]. The disease spread very fast all over the world; in Portugal, the first case was registered on March 2, 2020 [5]. Anyone who tests positive for this disease may be symptomatic or asymptomatic. Symptoms of COVID-19 may include fever, tiredness, cough and, in more severe cases, shortness of breath and lung problems [2]. Studies on the impact of vaccination, number of registered cases, and temperature on the number of deaths caused by the SARS-CoV-2 virus have been quite frequent in recent times. The objective of this paper is to find an appropriate model to estimate the number of daily deaths from SARS-CoV-2 and then to identify the algorithm with the best predictive power. For this purpose, several variables were used, both for vaccination and for the number of cases. The main goal is to use several machine learning algorithms to predict daily mortality.

2 Background COVID-19 mortality data can be predicted by various methods, such as machine learning or statistical forecasting algorithms [1]. Besides machine learning algorithms, several studies used ARIMA and SARIMA models, considering the seasonal behavior present in mortality [6]. In this paper, only machine learning algorithms were used for data modeling and prediction. According to [7], "The premise of machine learning is that a computer program can learn and adapt to new data without the need for human intervention". In machine learning, there is no single algorithm that can predict all types of data with minimal error [8]; that is, for each type of data, some algorithms are more suitable than others for predicting future data. Choosing the best algorithm also depends on the problem we are facing and on the number of variables used in the model [8]. There are several types of machine learning algorithms, such as unsupervised, supervised, semi-supervised, and reinforcement learning. The supervised type performs a mapping between the dependent and independent variables in order to predict unknown future values of the dependent variable [9]. The semi-supervised type combines unlabeled data (needing no human intervention) with labeled data (needing human intervention) to predict future data. These algorithms can be more efficient, as they need much less human intervention in building the models [10]. Reinforcement learning algorithms produce a series of actions considering the environment where they are inserted to maximize "the future rewards it receives (or minimizes the punishments) over its lifetime" [11]. Last but not least, in unsupervised learning algorithms, the input data is entered "but obtains neither supervised target outputs nor rewards from its environment" [11]. An example of this type of algorithm is K-means.


Fig. 1 Machine learning algorithms (supervised type)

In this paper, only supervised learning algorithms were discussed, which are OLS, LASSO, Ridge, Gradient Boosting, MLP, and Random Forest (Fig. 1). OLS or Linear Regression is one of the simplest machine learning algorithms to understand. Linear Regression can be simple (when only one independent variable is used in the model) or multiple (when two or more variables are used to predict the dependent variable) [12]. The structural model of Linear Regression is

Y = β0 + β1X1 + ··· + βmXm + ε    (1)

where Y represents the dependent variable and X represents the independent variable(s). The β parameters are the coefficients estimated by the regression model, and the ε parameter is the error associated with the estimated model. Ridge Regression is an algorithm that is used when there are multicollinearity problems between the predictor variables of the model [12]. Multicollinearity is a condition that exists when one or more independent variables of the model can predict another independent variable relatively well. The Least Absolute Shrinkage and Selection Operator (LASSO) is an algorithm that improves the accuracy of the model through variable selection and regularization. This process is called variable shrinkage, and its objective is to reduce the number of predictive variables present in the model [12]. Gradient Boosting (GB) can be used for both classification and regression purposes. It is an ensemble algorithm, originally formulated as the optimization of a cost function, and has been used in various areas, such as the detection of energy theft [13]. This method has not been used much in studies concerning the COVID-19 pandemic [14]. Over several iterations, GB combines a series of models with a learning rate to minimize prediction errors; at each iteration, it discards the weakest predictors and keeps the strongest ones [14]. The GB additive model can be represented as follows:

Fm(x) = Fm−1(x) + ρm hm(x)    (2)

where Fm−1 is the model from the previous iteration and hm is the model fitted at iteration m to reduce the prediction errors [13]. ρm is a multiplier that can be represented as follows:

ρm = argmin_ρ Σ(i=1..n) L(yi, Fm−1(xi) + ρ hm(xi))    (3)

where yi is the target class label [13]. These machine learning algorithms have already been used in several articles on the topic discussed in this paper, such as [13, 14]. Multilayer Perceptron (MLP) is a machine learning method that uses artificial neural networks. As stated in [15], "The experience of the network is stored by the synaptic weights between neurons and its performance is evaluated, for example, by the ability to generalize behaviors, recognize patterns, fix errors or execute predictions". This algorithm connects several neurons, forming neural networks that perform various functions to improve the prediction [15]. MLP can use supervised or unsupervised learning; in this paper, we only focus on supervised learning. Random Forest (RF) is another ensemble algorithm, like Gradient Boosting, that uses decision trees in the background. The decision trees are created based on random samples of the training data [16]. The difference between RF and GB is that RF does not use a learning rate; it uses the average of the results of all generated trees [17]. The choice of these six algorithms was made to understand how linear algorithms, ensemble algorithms, and neural network algorithms behave in predicting the referenced data, measuring the accuracy of each one of them through the various measures described further down in the paper.
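A minimal sketch (not the authors' code; the data and the fixed multiplier are illustrative) of the additive update in Eq. (2): at each iteration a small regression tree is fitted to the current residuals and added to the ensemble scaled by a constant ρ, used here as a simple stand-in for the line search of Eq. (3).

```python
# From-scratch illustration of gradient boosting for regression with squared loss.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(300, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.2, size=300)   # illustrative data only

M, rho = 100, 0.1                 # number of boosting iterations and fixed multiplier
F = np.full_like(y, y.mean())     # F_0: constant initial model
trees = []
for m in range(M):
    residuals = y - F                         # negative gradient of the squared loss
    h = DecisionTreeRegressor(max_depth=2).fit(X, residuals)
    F = F + rho * h.predict(X)                # Eq. (2): F_m = F_{m-1} + rho * h_m
    trees.append(h)

print("training MSE:", np.mean((y - F) ** 2))
```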

3 Method To predict COVID-19 mortality data, we used the number of daily cases and all vaccination data present in the database called "Our World in Data" [18]. The temperature data were obtained from the database of the National Centers for Environmental Information and refer to the mean temperature registered at the LISBOA GEOFISICA station [19]. The following figures show the graphs of all variables from March 2, 2020 to February 28, 2022 (Figs. 2, 3, 4 and 5). The process described below was designed using the CRISP-DM method [20]. We used vaccination data to predict mortality given the impact that vaccination has had, since its inception, on the number of both deaths and cases of COVID-19 infection [21]. We also used the daily number of new cases to study the impact that this variable had on the number of deaths before and after the start of the vaccination process, and lastly we also used temperature because of the seasonal pattern present in the data. For the daily new cases, with the help of Python [22], we created two dummy variables, the first named before_vaccination and the second named after_vaccination. The


Fig. 2 Daily number of deaths from SARS-CoV-2 in Portugal

Fig. 3 Total people vaccinated and fully vaccinated in Portugal

Fig. 4 Total people with vaccine boosters and daily number of vaccinations in Portugal


Fig. 5 Daily number of deaths from SARS-CoV-2 and the daily average temperature in Portugal

before_vaccination variable is equal to 1 if the day in question is before the start of the vaccination process and 0 if it is after; the second dummy variable works the other way around. Next, we created two new variables: new_cases_before_vaccination (new_cases × before_vaccination) and new_cases_after_vaccination (new_cases × after_vaccination). The next step was to decide how to use the vaccination data. As we know from previous studies, the impact of vaccination is not immediate [23], so we decided to create lags for all vaccination variables, the first set of lags being one month and the second two weeks. For the temperature data, the average values were used. As there were missing data in the database, we decided to fill them in two ways: the initial missing values were replaced with zero, and the remaining missing values were filled in with Python's interpolate method [22]. Then, as there were sometimes no data on weekends, we removed all weekends from the database. Finally, so that all variables are on the same scale and to measure which variables have the most impact on the model, the data were standardized with the StandardScaler function of the scikit-learn module [24]. After all the data were prepared, we started to build the model that would be used in the regressions. The first step was to insert all the variables present in the database into a linear regression model (OLS) through the statsmodels module [25]; then the Variance Inflation Factor (VIF) of the model was tested. If there were one or more variables with a VIF greater than 5, the variable with the lowest correlation with the dependent variable was removed. Finally, the p-values of the t-test statistics [26] were examined and the non-significant variables were candidates for removal, but as the adjusted R2 decreased when they were dropped, it was decided to keep them, leaving the following variables: new_cases_before_vaccination, temperature, people_vaccinated, new_vaccinations, new_vaccinations_lag1M, new_cases_after_vaccination, and boosters_lag_1M.
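A minimal pandas/scikit-learn sketch of the preparation steps just described (the file name, column names, and vaccination start date are assumptions, not the authors' code):

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler

df = pd.read_csv("owid_portugal.csv", parse_dates=["date"])   # hypothetical input file
vaccination_start = pd.Timestamp("2020-12-27")                 # assumed start of vaccination

# Dummy variables and interaction terms for cases before/after vaccination
df["before_vaccination"] = (df["date"] < vaccination_start).astype(int)
df["after_vaccination"] = 1 - df["before_vaccination"]
df["new_cases_before_vaccination"] = df["new_cases"] * df["before_vaccination"]
df["new_cases_after_vaccination"] = df["new_cases"] * df["after_vaccination"]

# One-month and two-week lags of the vaccination variables
for col in ["people_vaccinated", "people_fully_vaccinated", "new_vaccinations", "boosters"]:
    df[col + "_lag1M"] = df[col].shift(30)
    df[col + "_lag2W"] = df[col].shift(14)

# Interpolate interior gaps, replace the remaining (initial) gaps with zero, drop weekends
num_cols = df.columns.drop("date")
df[num_cols] = df[num_cols].interpolate().fillna(0)
df = df[df["date"].dt.dayofweek < 5]

# Standardize the predictors so that they are all on the same scale
features = df.drop(columns=["date", "new_deaths"])
y = df["new_deaths"].values
X = StandardScaler().fit_transform(features)
```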


The next step was to divide the data into training and test samples and to parameterize the algorithms. The parameterization of Ridge, LASSO, Gradient Boosting, MLP, and Random Forest was done by sampling random parameter values over many iterations, using various cross-validation methods [24], to find the optimal parameters. After all predictions were made, a Durbin–Watson test [27] was performed on the residuals of each algorithm to test whether there is autocorrelation between the residuals. We also calculated the average of the residuals to check whether it was close to 0 [28]. Finally, measures such as the Mean Absolute Error (MAE), Mean Squared Error (MSE), Median Absolute Error, Explained Variance Score (EVS), and the predicted R2 were compared across all algorithms [24].
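A minimal sketch of the training and evaluation loop (continuing the hypothetical X and y from the previous sketch; the search space shown for Gradient Boosting is illustrative, not the authors' settings): train/test split, a randomized cross-validated search, the error measures of Table 1, and the Durbin–Watson statistic of Table 2.

```python
from sklearn.model_selection import train_test_split, RandomizedSearchCV
from sklearn.linear_model import LinearRegression, Ridge, Lasso
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.neural_network import MLPRegressor
from sklearn.metrics import (mean_absolute_error, mean_squared_error,
                             median_absolute_error, explained_variance_score, r2_score)
from statsmodels.stats.stattools import durbin_watson

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, shuffle=False)

models = {
    "OLS": LinearRegression(),
    "Ridge": Ridge(),
    "LASSO": Lasso(),
    "Gradient Boosting": RandomizedSearchCV(
        GradientBoostingRegressor(),
        {"n_estimators": [100, 300, 500],
         "learning_rate": [0.01, 0.05, 0.1],
         "max_depth": [2, 3, 4]},
        n_iter=10, cv=5),
    "MLP": MLPRegressor(max_iter=2000),
    "Random Forest": RandomForestRegressor(),
}

for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    resid = y_test - pred
    print(name,
          "MAE", round(mean_absolute_error(y_test, pred), 3),
          "MSE", round(mean_squared_error(y_test, pred), 3),
          "MdAE", round(median_absolute_error(y_test, pred), 3),
          "EVS", round(explained_variance_score(y_test, pred), 3),
          "R2", round(r2_score(y_test, pred), 3),
          "DW", round(durbin_watson(resid), 3),
          "mean resid", round(resid.mean(), 3))
```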

4 Results As mentioned above, the first step was the estimation of the linear regression model (OLS) as shown in Fig. 6. From the output in Fig. 6, the daily number of cases of COVID-19 has a positive impact, both before and after the vaccination process, with a greater impact before vaccination (higher coefficient in the model). The vaccination variables all have a positive impact, except for the one-month lag of the daily number of administered vaccines, a result that goes against what would be expected. Finally, it can be inferred

Fig. 6 OLS Regression Model


Table 1 Algorithm quality measures

Model              MAE    MSE    MdAE   EVS    R2
OLS                0.386  0.388  0.288  0.504  0.503
Ridge              0.386  0.387  0.289  0.504  0.503
LASSO              0.385  0.387  0.287  0.505  0.504
Gradient boosting  0.136  0.055  0.070  0.930  0.930
MLP                0.190  0.133  0.087  0.829  0.830
Random forest      0.150  0.075  0.075  0.904  0.904

that temperatures have a negative impact on the number of deaths, which is in line with what can be observed in the graphs of both variables. For the estimation of the remaining models, we used hyperparameter tuning. Moving on to the identification of the model with the best predictive power, we can observe the table below with the information referring to each model. By observing Table 1, we can infer that Gradient Boosting was the best predictive algorithm, obtaining the best scores in all measures. Random Forest and MLP also obtained good results, with RF being superior to MLP in all score measures. This indicates that these three algorithms may be candidates for making good future predictions of daily COVID-19 mortality data. In Fig. 7, we can observe the importance of each of the predictors, as given by the Gradient Boosting algorithm. The people vaccinated and the temperature are the most important variables for predicting COVID-19 deaths, contrary to what happened in the OLS, in which the number of daily cases was the variable with the highest coefficient. Strangely, GB assigns less weight to the number of cases before vaccination than to the number of cases after vaccination. It is also noteworthy that the variables that were not significant in the OLS are the two with the least importance in the GB. Bearing in mind that this was the algorithm with the greatest predictive power, and considering the coefficients given by the OLS, we can say that the average temperature and the vaccinated people played a leading role in reducing deaths from SARS-CoV-2. Finally, we can observe in Table 2 the results of the Durbin–Watson test and the average of the residuals, used to assess their quality. The values in Table 2 show that the residuals are not correlated (test statistic close to 2 ± 0.5) and that their average is close to 0 in all algorithms, with the worst results being in the first three [27]. We can say that all models adequately capture the information present in the data [28, 29].
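A short sketch (continuing the hypothetical objects from the previous sketches) of how predictor importances such as those in Fig. 7 can be read from a fitted scikit-learn Gradient Boosting model:

```python
# Extract and sort the feature importances of the tuned Gradient Boosting model.
gb = models["Gradient Boosting"].best_estimator_      # fitted in the previous sketch
importances = sorted(zip(features.columns, gb.feature_importances_),
                     key=lambda t: -t[1])
for name, importance in importances:
    print(f"{name:35s} {importance:.3f}")
```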


Fig. 7 Gradient boosting—predictors importance

Table 2 Durbin–Watson test results and mean value of residuals

Model              Durbin–Watson test   Mean of residuals
OLS                2.379                −0.030
Ridge              2.379                −0.030
LASSO              2.379                −0.030
Gradient Boosting  2.134                0.017
MLP                2.260                0.001
Random Forest      1.861                0.016

5 Conclusions The objective of this paper was to infer the impact of vaccination, temperature, and the number of cases on SARS-CoV-2 mortality in Portugal. Various vaccination data and their lags were used, as well as a "division" of the number of daily cases registered before and after vaccination and the daily average temperature. The initial model was built using the OLS method and then replicated with the other algorithms. There was a positive correlation between the dependent variable and the number of cases, as expected, and the difference in the coefficient before and after vaccination was very clear, while almost all the vaccination data present in the model had a negative coefficient, as was already expected, except for the daily number of vaccinations lagged by one month. The results of Gradient Boosting, MLP, and Random Forest were satisfactory, while in OLS, Ridge, and LASSO, the model fit values were below expectations, which may mean that the relation between the predictors and the dependent variable is not linear. The objectives of the paper were achieved, as the algorithm with the greatest predictive power was identified, which consists of


an ensemble algorithm, Gradient Boosting, and it was shown that vaccination is a good preventive measure against deaths from SARS-CoV-2 and that temperature has a negative impact on the number of deaths. Acknowledgements We gratefully acknowledge financial support from FCT—Fundação para a Ciência e a Tecnologia (Portugal), national funding through research grant UIDB/04521/2020.

References

1. Almalki A, Gokaraju B, Acquaah Y, Turlapaty A (2022) Regression analysis for COVID-19 infections and deaths based on food access and health issues. Healthcare 10(2):324. https://doi.org/10.3390/healthcare10020324
2. Rustagi V, Bajaj M, Tanvi, Singh P, Aggarwal R, AlAjmi MF, Hussain A, Hassan MdI, Singh A, Singh IK (2022) Analyzing the effect of vaccination over COVID cases and deaths in Asian countries using machine learning models. Front Cell Infect Microbiol 11. https://doi.org/10.3389/fcimb.2021.806265
3. Sarirete A (2021) A bibliometric analysis of COVID-19 vaccines and sentiment analysis. Proc Comput Sci 194:280–287. https://doi.org/10.1016/j.procs.2021.10.083
4. Sohrabi C, Alsafi Z, O'Neill N, Khan M, Kerwan A, Al-Jabir A, Iosifidis C, Agha R (2020) World Health Organization declares global emergency: a review of the 2019 novel coronavirus (COVID-19). Int J Surg 76:71–76. https://doi.org/10.1016/j.ijsu.2020.02.034
5. Milhinhos A, Costa PM (2020) On the progression of COVID-19 in Portugal: a comparative analysis of active cases using non-linear regression. Front Public Health 8. https://doi.org/10.3389/fpubh.2020.00495
6. Perone G (2022) Using the SARIMA model to forecast the fourth global wave of cumulative deaths from COVID-19: evidence from 12 hard-hit big countries. Econometrics 10:18. https://doi.org/10.3390/econometrics10020018
7. Aparicio JT, Romao M, Costa CJ (2022) Predicting bitcoin prices: the effect of interest rate, search on the internet, and energy prices. 17th Iberian conference on information systems and technologies (CISTI), Madrid, Spain, pp 1–5. https://doi.org/10.23919/CISTI54924.2022.9820085
8. Aparicio JT, Salema de Sequeira JT, Costa CJ (2021) Emotion analysis of Portuguese political parties communication over the COVID-19 pandemic. 16th Iberian conference on information systems and technologies (CISTI), Chaves, Portugal, pp 1–6. https://doi.org/10.23919/CISTI52073.2021.9476557
9. Cord M, Cunningham P (2008) Machine learning techniques for multimedia: case studies on organization and retrieval. Springer Science & Business Media
10. Zhu X (Jerry) (2005) Semi-supervised learning literature survey. University of Wisconsin-Madison, Department of Computer Sciences
11. Mendelson S, Smola AJ (eds) (2003) Advanced lectures on machine learning: machine learning summer school 2002, Canberra, Australia, February 11–22, 2002: revised lectures. Springer, Berlin, New York
12. Saleh H, Layous J (2022) Machine learning—regression. Thesis, 4th year seminar, Higher Institute for Applied Sciences and Technology
13. Gumaei A, Al-Rakhami M, Mahmoud Al Rahhal M, Raddah H, Albogamy F, Al Maghayreh E, AlSalman H (2020) Prediction of COVID-19 confirmed cases using gradient boosting regression method. Computers, Materials & Continua 66(1):315–329. https://doi.org/10.32604/cmc.2020.012045


14. Shrivastav LK, Jha SK (2021) A gradient boosting machine learning approach in modeling the impact of temperature and humidity on the transmission rate of COVID-19 in India. Appl Intell 51:2727–2739. https://doi.org/10.1007/s10489-020-01997-6
15. Borghi PH, Zakordonets O, Teixeira JP (2021) A COVID-19 time series forecasting model based on MLP ANN. Proc Comput Sci 181:940–947. https://doi.org/10.1016/j.procs.2021.01.250
16. Gupta KV, Gupta A, Kumar D, Sardana A (2021) Prediction of COVID-19 confirmed, death, and cured cases in India using random forest model. Big Data Mining and Analytics 4(2):116–123. https://doi.org/10.26599/BDMA.2020.9020016
17. Yeşilkanat CM (2020) Spatio-temporal estimation of the daily cases of COVID-19 in worldwide using random forest machine learning algorithm. Chaos Solitons Fractals 140:110210. https://doi.org/10.1016/j.chaos.2020.110210
18. COVID-19 Data Explorer. https://ourworldindata.org/coronavirus-data-explorer. Accessed 2022/07/05
19. Menne MJ, Durre I, Korzeniewski B, McNeill S, Thomas K, Yin X, Anthony S, Ray R, Vose RS, Gleason BE, Houston TG (2012) Global historical climatology network—daily (GHCN-Daily), Version 3. https://www.ncei.noaa.gov/metadata/geoportal/rest/metadata/item/gov.noaa.ncdc:C00861/html
20. Costa C, Aparício JT (2020) POST-DS: a methodology to boost data science. 15th Iberian conference on information systems and technologies (CISTI), Seville, Spain, pp 1–6. https://doi.org/10.23919/CISTI49556.2020.9140932
21. Haas EJ, McLaughlin JM, Khan F, Angulo FJ, Anis E, Lipsitch M, Singer SR, Mircus G, Brooks N, Smaja M, Pan K, Southern J, Swerdlow DL, Jodar L, Levy Y, Alroy-Preis S (2022) Infections, hospitalisations, and deaths averted via a nationwide vaccination campaign using the Pfizer–BioNTech BNT162b2 mRNA COVID-19 vaccine in Israel: a retrospective surveillance study. Lancet Infect Dis 22:357–366. https://doi.org/10.1016/S1473-3099(21)00566-1
22. Albon C (2018) Machine learning with Python cookbook: practical solutions from preprocessing to deep learning. O'Reilly Media, Inc
23. Dyer O (2021) Covid-19: Moderna and Pfizer vaccines prevent infections as well as symptoms, CDC study finds. BMJ n888. https://doi.org/10.1136/bmj.n888
24. Avila J, Hauck T (2017) Scikit-learn cookbook: over 80 recipes for machine learning in Python with scikit-learn. Packt Publishing Ltd
25. Seabold S, Perktold J (2010) Statsmodels: econometric and statistical modeling with python. Proceedings of the 9th Python in science conference (SciPy 2010), Austin, Texas. https://doi.org/10.25080/Majora-92bf1922-011
26. Kim TK (2015) T test as a parametric statistic. Korean J Anesthesiol 68:540–546. https://doi.org/10.4097/kjae.2015.68.6.540
27. Mckinney W, Perktold J, Seabold S (2011) Time series analysis in Python with statsmodels. Proceedings of the 10th Python in science conference (SciPy 2011). https://doi.org/10.25080/Majora-ebaa42b7-012
28. Hyndman RJ, Athanasopoulos G (2018) Forecasting: principles and practice, 2nd edn. OTexts, Melbourne, Australia
29. Akossou A, Palm R (2013) Impact of data structure on the estimators R-square and adjusted R-square in linear regression. Int J Math Comput 20:84–93

Software Engineering

Digital Policies and Innovation: Contributions to Redefining Online Learning of Health Professionals Andreia de Bem Machado, Maria José Sousa, and Gertrudes Aparecida Dandolini

Abstract Social, economic, and cultural transformations are interconnected by the need for changes in the educational scenario regarding the teaching–learning process. With this in mind, this study analyzes, in the light of a bibliometric review, which policies allied to digital innovation can contribute to redefining the online learning of health professionals. It also presents the results of a bibliometric review that was conducted in the Web of Science (WoS) database. The results point to a pedagogy that provides interaction between teachers and students with the effective use of technology in order to promote knowledge for the formation of future professionals' competencies. The purpose of this study is to see whether technologies and technological practices can help health professionals learn more effectively online. Furthermore, it was found that digital education technologies and instructional methodologies are critical tools for facilitating fair and inclusive access to education, removing barriers to learning, and broadening teachers' perspectives in order to improve the learning process of health students.

Keywords Digital innovation · Online learning · Health professionals

A. de B. Machado (B) · G. A. Dandolini Engineering and Knowledge Management Department, Federal University of Santa Catarina, Santa Catarina, Brazil e-mail: [email protected]
G. A. Dandolini e-mail: [email protected]
M. J. Sousa University Institute of Lisbon (ISCTE), Lisbon, Portugal e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 S. Anwar et al. (eds.), Proceedings of International Conference on Information Technology and Applications, Lecture Notes in Networks and Systems 614, https://doi.org/10.1007/978-981-19-9331-2_24

1 Introduction The socioeconomic, political, cultural, scientific, and technological changes that have occurred throughout the twenty-first century, such as the globalization of the


economy and information, have driven the digital revolution [1]. As a result, there has been a substantial increase in the usage of ICT in society, resulting in major and rapid changes in the way humans relate and communicate. In this context, most students use the internet as a primary learning resource [2], so these data have policy implications for education during the digital transformation that is occurring in the fourth industrial revolution (Industry 4.0) in this millennium [6]. In this technological scenario of changes in the way of receiving information and communicating, there have been changes in the educational space, especially regarding the teaching and learning process. These modifications were intended to provide students with critical thinking, focused on a collaborative construction of knowledge, in order to make it meaningful [8]. Proper policies allied with digital innovation [9] can contribute to redefining the online learning of health professionals through active learning methods. The use of these methods and the early insertion of students in the daily life of services favor meaningful learning and the construction of knowledge, in addition to developing skills and attitudes, with autonomy and responsibility. Changing the learning process from face-to-face to distance learning is a decision that must be made by educational institutions so that the educational objectives can be implemented effectively and efficiently. The usage of internet networks in the learning process is known as online learning, and it provides students with the flexibility to learn whenever and wherever they want [7]. Learning can be understood as a path to transformation of the person and the reality, through which the student and the teacher become subjects of the teaching–learning process, transforming their pedagogical and professional practices, and building freedom with responsibility. Currently, with the changes in educational methods and tools, it becomes possible for those subjects to reflect critically on their practice and their learning mediated by digital innovation. Thus, this research aims to analyze, in the light of a bibliometric review, which policies allied to digital innovation can contribute to redefining the online learning of health professionals. The study was developed by conducting a bibliometric search in the Web of Science database. In addition to this introductory section, the next section describes the methodology, the results, and the analysis of the resulting bibliometric scenario of scientific publications. The third section brings the final considerations.

2 Proposed Methodology In order to measure, analyze, and increase knowledge about how policies combined with digital innovation can contribute to redefining online learning for health professionals, as presented in scientific literature publications, a bibliometric analysis was performed, starting with a search in the Web of Science, a database currently maintained by Clarivate Analytics. The study was developed using a strategy consisting of three phases: execution plan, data collection, and bibliometrics. The Bibliometrix program was used to evaluate the bibliometric data because it is the most compatible with the Web of Science database. Biblioshiny,


the web interface of the Bibliometrix R package, has the most comprehensive and appropriate collection of techniques among the tools investigated for bibliometric analysis [5]. These data allowed the organization of relevant information for a bibliometric analysis, such as the temporal distribution; main authors, institutions, and countries; types of publication in the area; main keywords; and the most referenced papers. Scientific mapping allows one to investigate and draw a global picture of scientific knowledge from a statistical perspective. It mainly uses the three knowledge structures to present the structural and dynamic aspects of scientific research.

2.1 Methodological Approach This study is characterized as exploratory-descriptive, since it aims to describe the subject and increase the researchers' familiarity with it. A systematic search in an online database was employed for the literature search, followed by a bibliometric analysis of the results. Bibliometrics is a method used in the information sciences to map documents from bibliographic records contained in databases using mathematical and statistical methodologies [3]. Bibliometrics allows for relevant findings such as the number of publications by region; the temporality of publications; the organization of research by area of knowledge; citation counts of the studies found in the researched documents; and the identification of the impact factor of a scientific publication, among others, which contribute to the systematization of research results and the minimization of biases when analyzing data. The study was divided into three parts for the bibliometric analysis: planning, collecting, and findings. These steps all came together to answer the study's guiding question: How can digital policies and innovation contribute to redefining the online learning of healthcare professionals? Planning began in November and ended in December 2021, when the research was carried out. During planning, some criteria were defined, such as limiting the search to electronic databases and not considering physical library catalogs, since the number of documents in the web search bases was considered sufficient. Within the planning scope, the Web of Science database was stipulated as the most suitable for the domain of this research, due to its relevance in the academic environment and its interdisciplinary character, which is the focus of research in this area, and also because it is one of the largest databases of abstracts and bibliographic references of peer-reviewed scientific literature and undergoes constant updating. Considering the research problem, the search terms were defined in the planning phase, namely: "policy" and "online learning" or "online education" and "digital innovation" and "health professionals". The use of the Boolean operator OR aimed to include the largest possible number of studies that address the subject of interest of this research. The truncator "*" was used to enhance the results by covering the expression "policies coupled with digital innovation can help redefine online learning for health professionals" and its writing variations presented in the literature. It is considered that the variations of the expressions used in the search are presented, in

Table 1 Bibliometric data

Main information about the data collected
Description                                 Results
Timespan                                    1998–2021
Sources (journals, books, etc.)             608
Documents                                   897
Document types
  Article                                   566
  Article; data paper                       4
  Article; early access                     30
  Article; paper from proceedings           10
  Editorial material                        5
  Paper from proceedings                    261
  Review                                    21
  Review; early access                      1
Document contents
  Keywords Plus® (ID)                       986
  Author's Keywords (DE)                    2635
Authors
  Authors                                   3739
  Author appearances                        4431
  Authors of single-authored documents      87
  Authors of multi-authored documents       3652
Collaboration between authors
  Single-authored documents                 87
  Documents per author                      0.241
  Authors per document                      4.15

a larger context, within the same purpose, because a concept depends on the context to which it is related. Finally, when planning the search, it was decided to apply the defined terms to the title, abstract, and keyword fields, without restricting the time, language, or any other criterion that might limit the results. The data collection retrieved a total of 897 indexed papers, from 1998, the year of the first publication, until 2021. The collection revealed that these papers were written by 3,739 authors and linked to 427 institutions from 103 different countries. A total of 986 keywords were used. Table 1 shows the results of this data collection in a general bibliometric analysis. The eligible articles in the Web of Science database were published between 1998 and 2021. Among the 897 papers, there is a varied list of authors, institutions, and countries that stand out in the research on policies allied to digital innovation that can contribute to redefining the online learning of health professionals.


When analyzing the country that has the most publications in the area, one can see that the USA stands out with 43% of total publications, a total of 4,026 papers. In second place is China with 24% of the publications, as shown in Fig. 1, which shows the 20 most cited countries. Another analysis performed is related to the identification of authors. The most relevant authors on digital policies and innovation, online learning, and health professionals are Derong Liu, with 14 publications, and Yu Zhang, with 12 published documents, as shown in Fig. 2.

Fig. 1 Proposed research methodology

Fig. 2 Most relevant authors


Fig. 3 Most globally cited documents

The twenty most globally cited documents are shown in Fig. 3. The paper that gained the most prominence, with 774 citations, is Policy Iteration Adaptive Dynamic Programming Algorithm for Discrete-Time Nonlinear Systems, by Derong Liu and Qinglai Wei, published in 2017, as shown in Table 2. From the general survey data, it was also possible to analyze the hierarchy of the sub-branches of this research on how policies allied to digital innovation can contribute to redefining the online learning of health professionals. The set of rectangles represented in the TreeMap shown in Fig. 4 shows the hierarchy of the sub-branches of the research in a proportional way. It can be seen that themes such as education, model, performance, online, and technologies appear with some relevance and are related to policy, digital innovation, and online learning of health professionals. Also, from the bibliometric analysis, 986 keywords chosen by the authors were retrieved. The tag cloud shown in Fig. 5 was thus built from those retrieved words. The highlight was "education", with a total of 42 occurrences, and, in second place, "model". When looking at which country has the most publications in the area, the United States comes out on top with 43% of all publications (4,026). China is in second place, accounting for 24% of all publications. Derong Liu, who has 14 publications, is the most relevant author on the theme of policies and digital innovation, online learning, and health professionals, followed by Yu Zhang, who has 12 publications. Policy Iteration Adaptive Dynamic Programming Algorithm for Discrete-Time Nonlinear Systems by Derong Liu and Qinglai Wei, published in 2017, was the work that stood out with 774 citations.


Table 2 Articles and total citations

Paper                                                                          Total citations
Liu D, 2014, IEEE Trans Neural Netw Learn Syst-A                               322
Xiang Y, 2015, 2015 IEEE International Conference On Computer Vision (ICCV)    248
Shea P, 2009, Comput Educ                                                      237
Jaksch T, 2010, J Mach Learn Res                                               193
Liu D, 2014, IEEE Trans Neural Netw Learn Syst                                 192
Modares H, 2014, IEEE Trans Autom Control                                      185
Xu J, 2017, IEEE Trans Cogn Commun Netw                                        182
Gai Y, 2012, IEEE-ACM Trans Netw                                               150
Wang S, 2018, IEEE Trans Cogn Commun Netw                                      148
Xu X, 2007, IEEE Trans Neural Netw                                             144
Aristovnik A, 2020, Sustainability                                             142
Zhang W, 2020, J Risk Financ Manag                                             125
Endo G, 2008, Int J Robot Res                                                  122
Jiang Y, 2015, IEEE Trans Autom Control                                        116
Kelly M, 2009, Nurse Educ Today                                                113
Ivankova NV, 2007, Res High Educ                                               101
Geng T, 2006, Int J Robot Res                                                  98
Dinh Thai Hoang Dth, 2014, IEEE J Sel Areas Commun                             95
Jiang Y, 2012, IEEE Trans Circuits Syst II-Express Briefs                      94
Sundarasen S, 2020, Int J Environ Res Public Health                            86

Fig. 4 Tree map


Fig. 5 Tag cloud

3 Conclusion It was found that the policies allied to digital innovation that can contribute to redefining the online learning of health professionals are those that tend to stimulate the development of a teaching–learning process that is creative, meaningful for the student, and committed to local and regional health needs, encouraging autonomy and the self-management of one's own learning. The health system is part of the practice scenario, providing the health field with an opportunity to learn in a real, dynamic situation, in action. Thus, the organization of health services, their practices, their management, and the formulation and implementation of policies are fundamental to the education process of health professionals. With increasing globalization and the emergence of digital education, policies for the online learning of health professionals have to enable educational strategies based on active methodologies carried out through research projects that provide open and direct feedback. As a limitation, the method presented here is not able to qualitatively identify the themes of the policies allied to digital innovation that redefine the online learning of health professionals; therefore, integrative literature reviews are recommended to broaden and deepen the analysis performed here.


References

1. de Bem Machado A, Sousa MJ, Dandolini GA (2022) Digital learning technologies in higher education: a bibliometric study. In: Lect Notes Netw Syst, pp 697–705. Springer Nature Singapore, Singapore
2. de Bem Machado A, Secinaro S, Calandra D, Lanzalonga F (2021) Knowledge management and digital transformation for Industry 4.0: a structured literature review. Knowl Manag Res Pract 1–19. https://doi.org/10.1080/14778238.2021.2015261
3. Linnenluecke MK, Marrone M, Singh AK (2019) Conducting systematic literature reviews and bibliometric analyses. Aust J Manag, 031289621987767
4. Liu D, Wei Q (2014) Policy iteration adaptive dynamic programming algorithm for discrete-time nonlinear systems. IEEE Trans Neural Netw Learn Syst 25(3):621–634. https://doi.org/10.1109/TNNLS.2013.2281663
5. Moral-Muñoz JA, Herrera-Viedma E, Santisteban-Espejo A, Cobo MJ (2020) Software tools for conducting bibliometric analysis in science: an up-to-date review. El profesional de la información 29(1):e290103
6. Rusli R, Rahman A, Abdullah H (2020) Student perception data on online learning using heutagogy approach in the Faculty of Mathematics and Natural Sciences of Universitas Negeri Makassar, Indonesia. Data in Brief 29:105152. https://doi.org/10.1016/j.dib.2020.105152
7. Sari MP, Sipayung YR, Wibawa KCS, Wijaya WS (2021) The effect of online learning policy toward Indonesian students' mental health during Covid-19 pandemic. Pak J Med Health Sci 15(6):1575–1577. https://doi.org/10.53350/pjmhs211561575
8. Sousa MJ, Marôco AL, Gonçalves SP, Machado AdB (2022) Digital learning is an educational format towards sustainable education. Sustainability 14:1140. https://doi.org/10.3390/su14031140
9. Ueno T, Maeda S-I, Kawanabe M, Ishii S (2009) Optimal online learning procedures for model-free policy evaluation. In: Mach Learn Knowl Discov Databases, pp 473–488. Springer, Berlin, Heidelberg

Reference Framework for the Enterprise Architecture for National Organizations for Official Statistics: Literature Review Arlindo Nhabomba , Bráulio Alturas , and Isabel Alexandre

Abstract Enterprise Architecture Frameworks (EAF) play a crucial role in organizations by providing a means to ensure that the standards for creating the information environment exist and are properly integrated, thus enabling the creation of Enterprise Architectures (EA) that represent the structure of components, their relationships, principles, and guidelines with the main purpose of supporting business. The increase in the variety and number of Information Technology Systems (ITS) in organizations, which increases their complexity and cost while decreasing the chances of obtaining real value from these systems, makes the need for an EA even greater. This issue is very critical in organizations whose final product is information, such as the National Organizations for Official Statistics (NOOS), whose mission is to produce and disseminate the official statistical information of their respective countries. Historically, NOOS have individually developed similar business processes and products using ITS that are not similar, thus making it difficult to produce consistent statistics in all areas of information. In addition, over the years, the NOOS adopted a business and technological structure and model that entails high maintenance costs that are becoming increasingly impractical, while the delivery model has become unsustainable, and the current EAF are not properly optimized to deal with these problems. NOOS are being increasingly challenged to respond quickly to these emerging information needs. We carried out this research through a literature review, and a body of information pertinent to the topic was collected, which allowed us to demonstrate that, in order to respond to these challenges, it is necessary to have a holistic view of ITS through the definition of an EA using a reference EAF among the current ones or a new one, built from scratch. Keywords Enterprise architecture · IS architecture · IT systems · Official statistics

A. Nhabomba (B) · B. Alturas · I. Alexandre Iscte – Instituto Universitário de Lisboa, Avenida das Forças Armadas, 1649-026 Lisboa, Portugal e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 S. Anwar et al. (eds.), Proceedings of International Conference on Information Technology and Applications, Lecture Notes in Networks and Systems 614, https://doi.org/10.1007/978-981-19-9331-2_25



1 Introduction Undoubtedly, the cost and complexity of ITS have increased exponentially in recent times, while the chances of getting real value from these systems have drastically decreased, requiring EA (also more complex) to satisfy the information needs of organizations. This situation introduces an additional degree of complexity into the practice of managing and maintaining ITS to ensure their alignment with the organizations' business, a factor that continues to be seen as vitally important by IT professionals and business managers in maximizing the contribution of Information Systems (IS) investments. The NOOS, the governing bodies of national statistical systems, are not immune to this problem. Over the years, through many iterations and technological changes, they built their organizational structure and production processes, and consequently their statistical and technological infrastructure. Meanwhile, the cost of maintaining this business model and the associated asset bases (process, statistics, technology) is becoming insurmountable and the delivery model unsustainable [1]. For most NOOS, the underlying model for statistical production is based on sample surveys, but increasingly, organizations need to use administrative data or data from alternative sources to deliver efficiencies, reduce provider burden, and make richer use of existing information sources [1]. This requires significant new EA features that are not available in the vast majority of NOOS. The absence of these resources makes it difficult to produce consistent statistics in all domains of information. NOOS are being increasingly challenged to respond quickly to these emerging information needs. The advent of EAF over the past few decades has given rise to a view that business value and agility can best be realized through a holistic approach to EA that explicitly examines every important issue from every important perspective [2]. Similarly, Zachman, early in the EA field, stated that the costs involved and the success of the business, which increasingly depend on its IS, require a disciplined approach to managing these systems [3]. The need for an architectural vision for the IS of the NOOS, one that allows a holistic conceptualization of their reality and that allows each particular situation to be dealt with regardless of the IS solutions implemented, is thus justified by the need for tools that allow not only the representation of their reality, in order to understand the whole, but also the examination of how its constituent parts interact to form that whole. It is from this evidence that the need for a new EAF for the NOOS can be understood.

2 The Official Statistical Information Official statistical information (or official statistics) provides the quantitative basis for the development and monitoring of government social and economic policies [4]. This information is essential for economic, demographic, social, and environmental development and for mutual knowledge and trade between states and peoples of the world [5]. For this purpose, official statistics that pass the practical utility test


must be compiled and made available impartially by NOOS to honor citizens' right to public information [6]. There are many examples where good quality data are essential for decision-making, such as participation and performance in education, health statistics (morbidity, mortality rates, etc.), crime and incarceration rates, and tax information. Statistical data are almost invariably representative at the national level, because they are obtained from complete censuses or large-scale national sample surveys, and they generally seek to present definitive information in accordance with international definitions and classifications or other well-established conventions [6]. However, building the capacity to systematically produce relevant, current, reliable, comprehensive, and internationally comparable statistics is a challenge for all countries. In this context, institutions involved in the production of statistics must rely on the use of international standards, without which the comparability of data produced by different NOOS, within a country and between countries, would be impossible. Their practical implementation is strongly aligned with and supported by the Generic Statistical Business Process Model (GSBPM) [7].

3 Standards for Supporting Official Statistics Production During the last decades, official statistical production has been undergoing a process of modernization and industrialization conducted internationally. In this regard, the most distinctive initiative is the activity of the High-Level Group for the Modernization of Official Statistics (HLG-MOS) of the United Nations Economic Commission for Europe [8], which is responsible for the development of the following reference models: GSBPM [9], the Generic Statistical Information Model (GSIM) [10], and the Common Statistical Production Architecture (CSPA) [11]. To these is also added the Generic Activity Model for Statistical Organizations (GAMSO) [12]. The GSBPM describes and defines the set of business processes required to produce official statistics and provides a standard framework and harmonized terminology to help statistical organizations modernize their production processes as well as share methods and components [9]. Figure 1 shows the phases of the GSBPM. In addition to the processes, there is also the information that flows between them (data, metadata, rules, parameters, etc.). The GSIM aims to define and describe these information objects in a harmonized way [13]. GSIM and GSBPM are complementary models for the production and management of statistical information. The GSBPM models statistical business processes and identifies the activities carried out by the producers of official statistics, resulting in information outputs [10]. These activities are divided into sub-processes, such as "edit and impute" and "calculate aggregates". As shown in Fig. 2, GSIM helps to describe the GSBPM sub-processes by defining the information objects that flow between them, that are created in them, and that are used by them to produce official statistics [10]. The CSPA is a set of design principles that allow NOOS to develop components and services for the statistical production process, in a way that allows these


Fig. 1 Phases of the GSBPM

Fig. 2 Relationship between GSIM and GSBPM [10]

components and services to be easily combined and shared between organizations, regardless of the underlying technology platforms [14]. In this way, CSPA aims to provide “industry architecture” for official statistics. In addition to the achievements made with the GSBPM, GSIM, and CSPA standards, to support greater collaboration between NOOS, it is equally important to cover all typical activities of a statistical organization to improve communication within and between these organizations, introducing a common and standard terminology. This is the task of GAMSO [15]. The following diagram shows the position of GAMSO in relation to other standards for international collaboration (Fig. 3). All of these are measures to industrialize the statistical production process by proposing standard tools for the many aspects of this process. In general, they all follow a top-down approach, through which generic proposals are made that do not take into account specific methodological details of production [16]. As an immediate positive consequence, NOOS can find a direct adaptation of these standards to their particular processes and the statistical production is more easily comparable in the international domain and, therefore, susceptible to standardization to a certain extent. However, NOOS have been developing their own business processes and ITS to create statistical products [1]. While products and processes are conceptually very similar, individual solutions are not; each technical solution was built for a very specific


Fig. 3 Relationship between GAMSO, GSBPM, GSIM, and CSPA [15]

Fig. 4 NOOS status quo

purpose, with little regard for the ability to share information with other adjacent applications in the statistical cycle, and with limited ability to handle similar but slightly different processes and tasks [1]. Gjaltema [1] calls this an "accidental architecture", since the processes and solutions were not conceived from a holistic view (Fig. 4). In Fig. 4, two entities producing official statistics (NOOS, local delegations, or delegated bodies) have totally different technological concepts at the same stages of the statistical production process (e.g., using the GSBPM). As a result, outputs 1 and 2 will never be comparable, which jeopardizes the quality of the information produced and, consequently, the decisions taken, not only nationally but also internationally. In terms of cost, columnist Bob Lewis has shown that, in these situations, during initial IT implementations the managed architecture is slightly more expensive than the ad hoc one, but over time the cost of the ad hoc architecture increases exponentially compared with the managed one [17] (see Fig. 5). This same idea is shared by Sjöström et al. [18] (see Fig. 6). This means that NOOS find it difficult to produce and share data across systems in line with modern standards (e.g., Data Documentation Initiative (DDI) and Statistical Data and Metadata eXchange (SDMX)), even with new production support


Fig. 5 Total IT functionality delivered to enterprise [17]

Fig. 6 Architecture cost [18]

standards (GSBPM, GSIM, CSPA, and GAMSO). In short, the status quo of NOOS is characterized by:
• complex and costly systems;
• difficulty in keeping those increasingly expensive systems aligned with NOOS's needs;
• rigid processes and methods;
• inflexible, aging technology environments.

4 Enterprise Architecture Frameworks EAF define how to organize the structure and perspective associated with EA [19]. EA, in turn, represents the architecture in which the system in question is the entire company, especially the company’s business processes, technologies, and IS [20]. These components are EA artifacts that can be defined as specific documents, reports, analyses, models, or other tangible items that contribute to an architectural description [20], i.e., providing a holistic view for developing solutions. Thus, an EAF collects tools, techniques, artifact descriptions, process models, reference models, and guidance used in the production of specific artifacts. This includes innovations in an organization’s structure, the centralization of business processes, the quality and timeliness of business information, or ensuring that the money spent on IT investments can be justified [19]. Over the past three decades, many EAF have emerged (and others have disappeared) to deal with two major problems: the increasing complexity


of IT systems and the increasing difficulty in getting real value out of these systems [20]. As we can imagine, these problems are related. The more complex a system, the less likely it is to deliver the expected value to the business. Better managing complexity increases the chances of adding real value to the business. Current literature highlights the following EAF: Zachman's Framework, The Open Group Architecture Framework (TOGAF), Federal Enterprise Architecture (FEA), Value Realization Framework (VRF) along with Simple Iterative Partitions (SIP) or VRF/SIP, Department of Defense Architecture Framework (DoDAF), Integrated Architecture Framework (IAF), and two techniques developed in the academic context, namely Enterprise Knowledge Development (EKD) and Resources, Events, and Agents (REA) [20, 21]. Nowadays, the criteria for the selection of the main EAF are based on two perspectives:
• widely used and highly rated EAF;
• EAF that support mobile IT/cloud computing and web service elements, which are crucial requirements of current IS.
According to research published in the Journal of Enterprise Architecture by Cameron and McMillan [22], from the "widely used" perspective, TOGAF, Zachman, Gartner, FEA, and DoDAF are the most widely used, and TOGAF, FEA, and DoDAF were judged to be "highly rated". Sessions [2] likewise states in his study that Zachman, TOGAF, FEA, and Gartner are the most commonly used EAF. From this last list, Moscoso-Zea et al. [23] replace Gartner with DoDAF for the same perspective (widely used). Regarding the second selection criterion, "integration with the basic structure of mobile IT/cloud computing and services", Gill et al. [24] argued that FEA, TOGAF, Zachman, and the Adaptive Enterprise Architecture Framework are adequate. Given these facts, we found that only the Zachman, TOGAF, and FEA frameworks stand out in both perspectives considered. For this reason, they are of interest to our study. The Zachman Framework provides a means of ensuring that the standards for creating the information environment exist and are properly integrated [25]. It is a taxonomy for organizing architectural artifacts that takes into account both who the artifact is aimed at and the specific problem being addressed [20]. These two dimensions allow the classification of the resulting artifacts, allowing any organization to obtain all types of possible artifacts. However, Zachman alone is not a complete solution. There are many issues critical to the company's success that Zachman doesn't address. For example, Zachman doesn't give us a step-by-step process for creating a new architecture, and it doesn't even help us decide if the future architecture we're creating is the best possible one [20]. Further, Zachman doesn't give us an approach to show the need for a future architecture [20]. For these and other questions, we'll need to look at other frameworks. TOGAF describes itself as a "framework", but its most important part is the Architecture Development Method (ADM), which is a process for creating an architecture [20]. Since the ADM is the most visible part of TOGAF, we categorized it as an architectural process rather than an architectural framework like Zachman. Viewed as an architectural process, TOGAF complements Zachman, which is a taxonomy. It should be noted, however, that TOGAF is not linked to government organizations


[26, 27]. As for the FEA, it was developed specifically for the federal government and offers a comprehensive approach to the development and use of architectural endeavors in federal government agencies [28], and it is recognized as a standard for state institutions [29], unlike Zachman and TOGAF, which are generic. FEA is the most complete of the three frameworks under discussion, i.e., it has a comprehensive taxonomy, like Zachman, and it also allows for the development of these artifacts, providing a process for doing so, like TOGAF [20]. There is, however, an important criticism of the FEA. In 2010, the Federal CIO Council (Chief Information Officers Council) raised several problems in relation to FEA, such as [30]:
• lack of a common and shared understanding;
• confusion about what EA is;
• issues associated with FEA compliance reports.
Participants recognized that it was time for a change. And, in general, a series of constraints in the implementation of EA are pointed out, such as the lack of clarity of its functions, ineffective communication, and low maturity and commitment of EA and its tools [31]. These challenges were attributed to three root causes: the ambiguity of the EA concept, the difficult terminology, and the complexity of EA frameworks.

5 Framework for NOOS Having briefly described the three most important frameworks and presented their limitations, we now discuss the essential characteristics of the solution proposed for NOOS. From the description of the EAF above, we concluded that with any of the three frameworks (Zachman, TOGAF, and FEA) it is possible to define how to create and implement an EA, providing the principles and practices for its description. Recognition of this reality is important because it allows any organization, public or private, including NOOS, to be aware of the specific needs that EAs must support, as well as to alert to the need for a sustained development of ITS. However, for NOOS, special attention is required, taking into account their public nature, which at the same time demands great rigor in the execution of statistical surveys, respecting all phases of the GSBPM. For this type of organization, it is crucial to define a global EA, integrating all the entities involved in the statistical processes, and, for that, it is necessary to adopt an EAF that supports this business model, which includes the implementation of ITS solutions in multiple statistical cycles while the process is performed by multiple entities. As we saw earlier, Zachman and TOGAF have some limitations for building an EA (although they can be used together in a blended approach), especially in NOOS, since they are not linked to government organizations. In a comparative study carried out by Sessions and DeVadoss [20] of Zachman, TOGAF, FEA, and VRF/SIP (considered the most important in that study), using criteria such as information availability, business focus, governance orientation, reference model orientation, prescriptive catalog, and maturity model, among others, and in particular under the maturity model criterion,


FEA was considered the best [20]. The maturity model refers to how much guidance the framework provides to assess the effectiveness and maturity of different organizations in the use of EA [20]. This feature is important for NOOS since different entities are involved in the statistical production process and it is useful to assess their effectiveness and maturity in the use of EA [32]. Furthermore, FEA is the most complete of the three most important EAF (it has mechanisms not only to classify artifacts, but also to develop them), as we saw earlier. It was also seen that FEA is a standard framework for state organizations, under which NOOS fall. By presenting all these characteristics, which are favorable to NOOS, FEA seems suitable for NOOS, despite being tainted by the problems raised by the Federal CIO Council [30]. Therefore, to take advantage of this potential, we recommend, as the first option, the creation of a reference EAF based on FEA, with better approaches in the following fields: common and shared understanding of the EA, compliance reports, clarity of the EA concept, and simplification of EA terminology and structures. This approach, for official statistics, is also supported by Nhabomba, Alexandre, and Alturas [32]. The second option that we propose is the creation of a new EAF, from scratch, specific to the official statistics industry. This option would use a blended approach, consisting of fragments of each of the EAF that provide the most value in their specific areas. These fragments would be obtained by rating the EAF against the criteria considered important on a case-by-case basis. This approach is illustrated in Fig. 7. Here, it is recommended that the criterion "maturity model" and the three frameworks (Zachman, TOGAF, and FEA) are always present in the evaluation: the maturity model because it is characteristic of NOOS, and the three frameworks because they are the most important. At the end of this exercise, the result will be a new EAF that consists of fragments from each of the top-rated frameworks. This will be the most suitable framework for NOOS, and its implementation will require a broad perspective on all selected frameworks and expertise in helping organizations create a framework that works better, given the specific needs and political realities of that organization.
Fig. 7 Criteria and ratings for each framework
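To make the blended, criteria-based rating concrete, the sketch below shows one way such an evaluation could be automated; the criteria, weights, and scores are illustrative placeholders only and do not reproduce the ratings of Fig. 7 or of Sessions and DeVadoss [20].

```typescript
// Illustrative only: criteria, weights, and scores are placeholders, not the
// paper's actual ratings. Idea: rate each EAF per criterion, weight the
// criteria for a given NOOS, and rank the candidates.
type Criterion = "maturityModel" | "taxonomyCompleteness" | "processGuidance" | "governmentFit";

interface FrameworkRating {
  name: string;
  scores: Record<Criterion, number>; // e.g. 1 (poor) to 4 (very good)
}

const weights: Record<Criterion, number> = {
  maturityModel: 0.4, // always present, per the recommendation above
  taxonomyCompleteness: 0.2,
  processGuidance: 0.2,
  governmentFit: 0.2,
};

const frameworks: FrameworkRating[] = [
  { name: "Zachman", scores: { maturityModel: 1, taxonomyCompleteness: 4, processGuidance: 1, governmentFit: 2 } },
  { name: "TOGAF",   scores: { maturityModel: 2, taxonomyCompleteness: 2, processGuidance: 4, governmentFit: 2 } },
  { name: "FEA",     scores: { maturityModel: 4, taxonomyCompleteness: 3, processGuidance: 3, governmentFit: 4 } },
];

// Weighted sum of the per-criterion scores of one framework.
function weightedScore(f: FrameworkRating): number {
  return (Object.keys(weights) as Criterion[]).reduce((sum, c) => sum + weights[c] * f.scores[c], 0);
}

// Rank the candidate frameworks for this particular NOOS.
const ranked = [...frameworks].sort((a, b) => weightedScore(b) - weightedScore(a));
ranked.forEach(f => console.log(`${f.name}: ${weightedScore(f).toFixed(2)}`));
```

In practice, each NOOS would set the weights to reflect its own priorities, keeping the maturity model criterion and the three main frameworks always present in the evaluation, as recommended above.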


6 Conclusions In this article, we demonstrated the need for a new enterprise architecture framework for NOOS, as these organizations have historically developed technical solutions without any holistic perspective, i.e., solutions developed individually for very specific purposes, with little consideration for the ability to share information. This results in an accidental architecture that leads to complex and costly systems, difficulty in keeping those systems aligned with NOOS's needs, rigid processes and methods, and inflexible, aging technology environments. To address these problems, two possible solutions were proposed. Before conceiving these solutions, we first selected the best frameworks, based on two criteria, namely "widely used and highly rated EAF" and "EAF that support mobile IT/cloud computing and web service elements", and three of them (Zachman, TOGAF, and FEA) proved to be the best. We then presented the first solution, a reference framework based on FEA, to take advantage of its potential: it is more complete than the other two frameworks, it is a standard for state organizations, and it works better with the maturity model, an important feature for NOOS. We recommended that this first option should have better approaches in the fields related to common and shared understanding of EA, compliance reporting, clarity of the EA concept, and simplification of EA terminology and structures. The second solution is a new EAF resulting from a blended approach, consisting of fragments of each of the EAF that provide the most value in their specific areas. In this second option, we recommended that the criterion "maturity model" and the three most important frameworks must always be present in the evaluation; the first because it is peculiar to NOOS and the three frameworks because they are the most important. In the future, we will continue this research, providing a concrete proposal for a new EAF for NOOS, following one of the suggested solutions.

References
1. Gjaltema T (2021) Common statistical production architecture [Online]. https://statswiki.unece.org/display/CSPA/I.++CSPA+2.0+The+Problem+Statement
2. Sessions R (2007) A comparison of the top four methodologies, pp 1–34 [Online]. http://www.citeulike.org/group/4795/article/4619058
3. Zachman J (1987) A framework for information systems architecture. IBM Syst J 26(3)
4. Janssen T, Forbes S (2014) The use of official statistics in evidence based policy making in New Zealand. Sustain Stat Educ Proc Ninth Int Conf Teach Stat ICOTS9
5. Divisão Estatística das Nações Unidas (2003) Handbook of statistical organization, third edition: the operation and organization of a statistical agency
6. Feijó C, Valente E (2005) As estatísticas oficiais e o interesse público. Bahia Análise & Dados 15:43–54
7. UNECE (2012) Mapping the generic statistical business process model (GSBPM) to the fundamental principles of official statistics
8. High-Level Group for the Modernisation of Statistical Production (2011) Strategic vision of the high-level group for strategic developments in business architecture in statistics. In: Conference of European Statisticians (24) [Online]
9. Choi I (2020) Generic statistical business process model [Online]. https://statswiki.unece.org/display/GSBPM/I.+Introduction#footnote2
10. Choi I (2021) Generic statistical information model (GSIM): communication paper for a general statistical audience [Online]. https://statswiki.unece.org/display/gsim/GSIM+v1.2+Communication+Paper
11. Gjaltema T (2021) CSPA 2.0 common statistical production architecture. https://statswiki.unece.org/pages/viewpage.action?pageId=247302723
12. Gjaltema T (2021) Generic activity model for statistical organizations [Online]. https://statswiki.unece.org/display/GAMSO/I.+Introduction
13. Lalor T, Vale S, Gregory A (2013) Generic statistical information model (GSIM). North American Data Documentation Initiative Conference (NADDI 2013), University of Kansas, Lawrence, Kansas (December 2013)
14. Nações Unidas (2015) Implementation guidelines. United Nations Fundamental Principles of Official Statistics, pp 1–117
15. UNECE (2015) Generic activity model for statistical organisations, pp 1–11 (March)
16. Salgado D, de la Castellana P (2016) A modern vision of official statistical production, pp 1–40 [Online]. https://ine.es/ss/Satellite?L=es_ES&c=INEDocTrabajo_C&cid=1259949865043&p=1254735839320&pagename=MetodologiaYEstandares%2FINELayout
17. Lewis B (2021) Technical architecture: what IT does for a living. https://www.cio.com/article/189320/technical-architecture-what-it-does-for-a-living.html. Accessed 10 June 2022
18. Sjöström H, Lönnström H, Engdahl J, Ahlén P (2018) Architecture recommendations (566)
19. Galinec D, Luic L (2011) The impact of organisational structure on enterprise architecture deployment. In: Proceedings of the 22nd Central European conference on information and intelligent systems, Varaždin, Croatia, 21–23 September 2011, vol 16, no 1, pp 2–19. https://doi.org/10.1108/JSIT-04-2013-0010
20. Sessions R, DeVadoss J (2014) A comparison of the top four enterprise architecture approaches in 2014. Microsoft Developer Network Architecture Center, p 57
21. Bernaert M, Poels G, Snoeck M, De Backer M (2014) Enterprise architecture for small and medium-sized enterprises: a starting point for bringing EA to SMEs, based on adoption models, pp 67–96. https://doi.org/10.1007/978-3-642-38244-4_4
22. Cameron B, McMillan E (2013) Analyzing the current trends in enterprise architecture frameworks. J Enterp Archit 60–71
23. Moscoso-Zea O, Paredes-Gualtor J, Luján-Mora S (2019) Enterprise architecture, an enabler of change and knowledge management. Enfoque UTE 10(1):247–257. https://doi.org/10.29019/enfoqueute.v10n1.459
24. Gill AQ, Smith S, Beydoun G, Sugumaran V (2014) Agile enterprise architecture: a case of a cloud technology-enabled government enterprise transformation. In: Proceedings of the Pacific Asia conference on information systems (PACIS 2014), pp 1–11
25. Rocha Á, Santos P (2010) Introdução ao Framework de Zachman (January 2010), p 19
26. Masuda Y, Viswanathan M (2019) Enterprise architecture for global companies in a digital IT era: adaptive integrated digital architecture framework (AIDAF). Springer, Tokyo
27. Gill AQ (2015) Adaptive cloud enterprise architecture. Intelligent information systems. University of Technology, Australia
28. Defriani M, Resmi MG (2019) E-government architectural planning using federal enterprise architecture framework in Purwakarta districts government. In: Proceedings of the 2019 4th international conference on informatics and computing (ICIC 2019). https://doi.org/10.1109/ICIC47613.2019.8985819
29. LeanIX (2022) FEAF—Federal Enterprise Architecture Framework. https://www.leanix.net/en/wiki/ea/feaf-federal-enterprise-architecture-framework
30. Gaver SB (2010) Why doesn't the federal enterprise architecture work?, p 114 [Online]. https://www.ech-bpm.ch/sites/default/files/articles/why_doesnt_the_federal_enterprise_architecture_work.pdf


31. Olsen DH (2017) Enterprise architecture management challenges in the Norwegian health sector. Procedia Comput Sci 121:637–645. https://doi.org/10.1016/j.procs.2017.11.084
32. Nhabomba ABP, Alexandre IM, Alturas B (2021) Framework de Arquitetura de Sistemas de Informação para as Organizações Nacionais de Estatísticas Oficiais. In: 16ª Conferência Ibérica de Sistemas e Tecnologias de Informação, pp 1–5. https://doi.org/10.23919/CISTI52073.2021.9476481

Assessing the Impact of Process Awareness in Industry 4.0 Pedro Estrela de Moura, Vasco Amaral, and Fernando Brito e Abreu

Abstract The historical (and market) value of classic cars depends on their authenticity, which can be ruined by careless restoration processes. This paper reports on our ongoing research on monitoring the progress of such processes. We developed a process monitoring platform that combines data gathered from IoT sensors with input provided by a plant shop manager, using a process-aware GUI. The underlying process complies with the best practices expressed in FIVA's Charter of Turin. Evidence (e.g., photos, documents, and short movies) can be attached to each task during process instantiation. Furthermore, car owners can remotely control cameras and a car rotisserie to monitor critical steps of the restoration process. The benefits are manifold for all involved stakeholders. Restoration workshops increase their transparency and credibility while getting a better grasp on work assignments. Car owners can better assure the authenticity of their cars to third parties (potential buyers and certification bodies) while reducing their financial and scheduling overhead and carbon footprint. Keywords Classic car documentation · Auto repair shop software · Business process · BPMN · DMN · Internet of things · Industry 4.0 · Process monitoring · GUI · Process awareness · Process mining

P. E. de Moura (B) NOVA School of Science and Technology, Caparica, Portugal e-mail: [email protected] V. Amaral NOVA School of Science and Technology & NOVA LINCS, Caparica, Portugal e-mail: [email protected] F. B. e Abreu ISCTE - Instituto Universitário de Lisboa & ISTAR-IUL, Lisboa, Portugal e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 S. Anwar et al. (eds.), Proceedings of International Conference on Information Technology and Applications, Lecture Notes in Networks and Systems 614, https://doi.org/10.1007/978-981-19-9331-2_26


1 Introduction Classic cars are collectible items, sometimes worth millions of euros [1], closer to pieces of art than to regular vehicles. To have the historic status required to reach these price-tag levels recognized, classic cars should go through a rigorous certification process. This means that, during preservation or restoration procedures, strict guidelines should be followed to preserve their status; otherwise authenticity can be jeopardized, hindering the chances for certification. Such guidelines are published in the "Charter of Turin Handbook" [3] of the Fédération Internationale des Véhicules Anciens (FIVA, https://fiva.org). Since the expertise required for matching those guidelines is scarce and very expensive, car owners often choose restoration workshops far away from their residences, sometimes even overseas. This means that, to follow the work done, long-range travels are required, with corresponding cost and time overheads and an increase in carbon footprint. We are tackling this issue by creating a platform that allows classic car owners to follow the work being done while reducing the need for manual input by workshop workers. This is accomplished by creating a digital twin that mirrors the work done at the workshop. In this paper, we describe a process-aware platform with a model-based GUI that is used by both workshop workers and car owners. We use a BPMN + DMN model described in [7] that is inspired by the "Charter of Turin Handbook" guidelines. This is, to our knowledge, the first attempt at modeling this charter, and it was a great starting point for this research. Other sources, including local experts, provided the information required to fill in the blanks during the modeling process, because the charter is vague in certain procedures or open to subjective interpretation due to being written in natural language. During execution, process instances (cars under preservation or restoration) progress from task to task, either due to automatic detection with ML algorithms that take as input IoT sensors' data collected by an edge computer attached to each car, or due to manual intervention by the workshop manager. During the preservation or restoration process, the latter can attach evidence (photos, scanned documents, and short videos) to each task instance (task performed upon a given car). That evidence is used to automatically generate, using a LaTeX template, a report for car owners, for them to warrant the authenticity of the restoration and/or preservation their classic cars went through, to certification bodies and/or potential buyers. Our platform also allows holding meetings remotely with car owners, granting them complete control of a set of pan, tilt, and zoom operations upon a set of IP cameras at the workshop pointed at their car in a specific showroom. This feature reduces car owners' financial and scheduling overhead and their carbon footprint. Both features (evidence collection and online interaction) increase the transparency of the restoration and preservation processes. We adopted an Action Research methodology, as interpreted by Susman and Evered in [12], where five stages of work are continuously iterated: Diagnosing, Action Planning, Action Taking, Evaluating, and Specifying Learning.



By choosing this methodology, we aim to constantly receive feedback from platform users on the features being implemented, allowing an agile and quality-in-use development roadmap [4]. We claim two major contributions of this ongoing applied research work: (i) the positive impact of the proposed digital transformation in this Industry 4.0 context, and (ii) the assessment of the feasibility of process-aware/model-based GUIs, a topic we could not find addressed in the literature. This paper is organized as follows: Sect. 2 presents related work along three axes that intersect our work, Sect. 3 describes the proposed platform, and Sect. 4 presents the corresponding validation efforts; finally, in Sect. 5, we draw our conclusions and prospects for future work.

2 Related Work 2.1 Car Workshop Systems Several commercial systems can be found under the general designation of "Auto Repair Shop Software". Besides documenting the history of ongoing repairs, they are usually concerned with financial management (invoicing), scheduling, workforce management, inventory, and management of interactions with customers (with some features found in CRM systems) and suppliers (e.g., paints and spare parts). An example that covers these aspects is Shopmonkey (https://www.shopmonkey.io), advertised as a "Smart & simple repair shop management software that does it all". Software systems specially designed for classic cars are scarce. One such example is Collector Car Companion (https://collectorcarcompanion.com/). It is a platform targeting classic car owners and restoration shops that allows documenting cars and their restoration processes, including photographic evidence. Additionally, it can be used to catalog parts and track costs and suppliers. We could not find any model-based solution for managing classic car restoration and preservation processes. For examples of such systems, we had to look at other industries.

2.2 Business Process Models in Industry 4.0 Kady et al. [5] created a platform aimed at beekeepers to help them manage their beehives. This was achieved by using sensors to continuously measure the weight of beehives and other discrete measurements at regular intervals. Additionally, they 2 3

https://www.shopmonkey.io. https://collectorcarcompanion.com/.


built BPMN models with the help of beekeepers, based on apicultural business rules. The patterns of the measurements collected are identified for data labeling and BPMN events association. These events trigger automated business rules on the workflow model. The process monitoring realized in this work is executed in a very similar way to ours. The differences occur in the way it is presented in the GUI. Instead of offering the visualization directly on the BPMN model itself, they added trigger events to the model that send notifications to the beekeeper’s mobile phone. Schinle et al. [10] proposed a solution to monitor processes within a chaincode by translating them into BPMN 2.0 models using process mining techniques. These models could then be used as graphical representations of the business processes. The authors claim to use process monitoring and process mining techniques, but it is unclear how the model-based GUI includes the monitoring aspects, as the only representation of a model shown is the one obtained after process mining, without process monitoring elements. Makke and Gusikhin [8] developed an adaptive parking information system that used cameras and sensors to track parking space occupancy, by implementing Petri Nets as digital twins for the parking space. In their representation, tokens were used to represent vehicles, places to represent areas or parking spots, and transitions to represent entrances and exits of the parking areas. Petri nets were also used as a way to represent the routes that individual vehicles took while in the parking space. The authors used a model-based GUI monitoring approach, but the models are hidden from the final users. This differs from our solution, as we present BPMN models in the GUI used by final users. Pinna et al. [9] developed a graphical run-time visualization interface to monitor Smart-Documents scenarios. Their solution consisted of an interface with a workflow abstraction of the BPEL models that highlighted the services already performed and the ones being performed. The decision to use BPEL abstraction models over the BPEL models themselves was because the BPEL workflow contained too many components, which made the scenario unreadable for human users, such as control activities and variables updating. Their abstraction used an icon-based GUI, instead of the usual text-based, for representing activities. It is unclear why this decision was made, as it seems that this annotation makes it harder to follow the process for an unaccustomed user. To mitigate this problem, by mousing over the icons, some additional information about the activity can be obtained. This publication does not describe the validation of the proposed approach. Most of the articles that use BPMs in Industry 4.0 contexts adopted BPMN, as confirmed by the secondary study titled “IoT-aware Business Process: a comprehensive survey, discussion and challenges” [2]. Our choice of using BPMN is then aligned with current practice. However, the main conclusion we draw from our literature review is that using a process-aware model-based GUI in Industry 4.0 is still an unexplored niche. The closest example we found of using this untapped combination of technologies is [8], but still, it seemed to only be used as an intermediary analysis tool.


3 Proposed Platform 3.1 Requirements Our platform can be divided into two separate subsystems, each with its own set of use cases. The first is the Plant Shop Floor Subsystem. This is the main part of our system where the Charter of Turin-based models can be viewed and interacted with. The operations that the different users can do in this subsystem are identified in the use case diagram in Fig. 1. The Experimental Hub Subsystem manages the live camera feeds to be used during scheduled meetings with car owners. The possible operations done in this subsystem are identified in the use case diagram in Fig. 2. In the diagrams, the Plant Shop Manager actor represents the workshop staff members that will control the day-to-day updates done to each vehicle and update the system accordingly. The Administrative Manager actor represents the workshop staff members who will have the control to create and delete restoration and preservation

Fig. 1 Use case diagram of the plant shop subsystem


Fig. 2 Use case diagram of the experimental hub subsystem

processes, as well as some administrative tasks, like registering new users to the system and sending credentials to be used to access the Experimental Hub Subsystem. Lastly, the Classic Car Owner actor stands for the owners themselves.

3.2 Architecture In the original architecture proposed in [7], Camunda's Workflow Engine was used (and still is) to execute the Charter of Turin-based process, i.e., allowing its instances to be created, to progress through the existing activities, and to be deleted. The data stored in this platform were obtained through REST calls by a component designated as Connector. This component used BPMN.io to display the BPMN models on a web page to be interacted with by the workshop manager, indicating the path taken during the restoration process. This component also included a REST API that allows the retrieval of information about each instance. This API was used by a component developed with the ERPNext platform to allow owners to see the progress applied to their car as a list of completed tasks, while also providing some CRM functionalities for the workshop manager. We decided to discard the use of the ERPNext platform because, although it is open-source, implementing new features within this platform was laborious and inefficient, due to scarce documentation and lack of feedback from its development team. An overview of the current system's architecture is depicted in the component diagram in Fig. 3. The Workflow Editor component is used to design the BPMN and DMN diagrams, while the Workflow Engine component stores the latter and allows for the execution of their workflows.
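As a rough illustration of the kind of interaction the Connector performs, the sketch below queries a Camunda 7-style REST API for process instances and completes a user task; the base URL, process definition key, and variable name are assumptions made for illustration and are not taken from the project itself.

```typescript
// Minimal sketch of engine interaction, assuming a Camunda 7-style REST API.
// Base URL, definition key, and variable names are illustrative assumptions.
const ENGINE = "http://localhost:8080/engine-rest";

interface CamundaTask { id: string; name: string; processInstanceId: string; }

// List the running restoration process instances for a given definition key.
async function listInstances(definitionKey: string): Promise<{ id: string }[]> {
  const res = await fetch(`${ENGINE}/process-instance?processDefinitionKey=${definitionKey}`);
  return res.json();
}

// Fetch the open user tasks of one instance (i.e. the car's current activities).
async function openTasks(instanceId: string): Promise<CamundaTask[]> {
  const res = await fetch(`${ENGINE}/task?processInstanceId=${instanceId}`);
  return res.json();
}

// Complete a task on behalf of the plant shop manager, attaching a variable
// (here a hypothetical link to the evidence board section for that activity).
async function completeTask(taskId: string, evidenceUrl: string): Promise<void> {
  await fetch(`${ENGINE}/task/${taskId}/complete`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ variables: { evidenceUrl: { value: evidenceUrl, type: "String" } } }),
  });
}

// Example flow: advance the first open task of the first running instance.
async function advanceFirstTask(): Promise<void> {
  const [instance] = await listInstances("charter_of_turin"); // hypothetical key
  if (!instance) return;
  const [task] = await openTasks(instance.id);
  if (task) await completeTask(task.id, "https://example.org/board/section");
}
```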


Fig. 3 Component diagram of the system

The Charter of Turin Monitor is the component that integrates the features formerly existing in the Connector component with some CRM features equivalent to those reused from ERPNext. It serves as the GUI that workshop employees use to interact with the BPMN process instances and use the CRM features to convey information to the owners. It also serves as the GUI used by classic car owners to check the progress and details of the restoration/preservation processes. One of these details is a direct link to the secret Pinterest board that holds the photographic evidence taken by the workshop staff members. These boards are divided into sections, each representing an activity with photos attached. Lastly, there are several IP cameras mounted in what we called the Experimental Hub, a dedicated room on the workshop premises. Classic car owners can remotely access and control these cameras through their web browsers, using the Camera Hub component. During their meeting, this access will only be available for a limited time, assigned by the workshop manager within the Charter of Turin Monitor component. The Camera Hub component also calls an API implemented in the Charter of Turin Monitor component to upload photos and videos taken during the meetings directly to the corresponding Pinterest board.

3.3 Technologies To model and deploy the BPMN and DMN models, we chose two Camunda products: Camunda's Modeler for process modeling and Camunda's Workflow Engine for process execution. Camunda software is widely used by household-name companies, which speaks for its reliability. The choice was also due to the two products being freeware and offering good tutorials and manuals.


For our back-end, we chose the ASP.NET framework, primarily due to the plethora of integration alternatives it offers, matching our envisaged current and future needs. The back-end was deployed on a Docker container in a Linux server running in a virtual machine hosted by an OpenStack platform operated by INCD (see the acknowledgment section). The database software we chose was MongoDB, as there is plenty of documentation on integrating it with .NET applications and deploying it with Docker. For our front-end, we decided not to use the default .NET framework Razor, but instead to use Angular. Even though this framework does not offer integration as simple as Razor, .NET provides a template that integrates the two, while providing highly dynamic and functional pages with many libraries and extensions. Within our front-end, we integrated BPMN.io's viewer bpmn-js. This viewer was developed with the exact purpose of working with Camunda and offers a simple way to embed a BPMN viewer within any web page. Finally, we chose Pinterest to store the photographic evidence collected. The option of storing these directly in the database or another platform was considered, but Pinterest was ultimately chosen because it offers an API that allows all needed functionality to be handled automatically, without the need for manual effort. Also, it provides good support, in the form of widgets and add-ons, for integrating its GUI within other web pages, in case there is a later need for this feature. All the code and models used in this project can be found on GitHub (https://github.com/PedroMMoura).
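As an illustration of the process-aware GUI idea, the following sketch embeds bpmn-js and highlights the activities already performed on a given car; the diagram URL, element IDs, and CSS class are hypothetical and only meant to show the general approach, not the project's actual code.

```typescript
// Minimal sketch of a process-aware view with bpmn-js (viewer only).
// The diagram URL, element IDs, and CSS class are illustrative assumptions.
import BpmnViewer from "bpmn-js/lib/NavigatedViewer";

async function showProgress(containerId: string, completedActivityIds: string[]) {
  const viewer = new BpmnViewer({ container: `#${containerId}` });

  // Load the BPMN 2.0 XML of the Charter of Turin-inspired model.
  const xml = await (await fetch("/models/charter-of-turin.bpmn")).text(); // assumed endpoint
  await viewer.importXML(xml);

  // Mark the activities already performed on this car, e.g. styled elsewhere with
  // a CSS rule such as `.done .djs-visual > rect { fill: #c8e6c9; }`.
  const canvas = viewer.get("canvas") as any;
  for (const id of completedActivityIds) {
    canvas.addMarker(id, "done");
  }
  canvas.zoom("fit-viewport");
}

// Example: highlight two hypothetical task IDs of the running instance.
showProgress("diagram", ["Task_Disassembly", "Task_BodyworkAssessment"]);
```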

4 Validation This work has two main parts requiring validation: the DMN and BPMN models based on the process described in the "Charter of Turin Handbook", and the GUI used to represent them.

4.1 Model Validation To validate the models, we asked for feedback from the classic car workshop experts before deployment. This allowed for the more abstract parts of the “Charter of Turin Handbook”, which is fully described in natural language, to be complemented with the actual process followed in the workshop. A continuous improvement is now in place since the platform was already deployed in the workshop. Whenever any inconsistencies are found, the appropriate changes are swiftly made to allow for a fast redeployment.



4.2 GUI Validation For GUI validation, we required analysis from the viewpoint of both the workshop workers that directly interact with the Charter of Turin-inspired model and the car owners, who use the platform to follow the process. To validate the workshop workers' interaction, we resorted to using an expert panel [6]. The selected members for this panel needed prior knowledge of the models, or at least of the general process, being used. This meant that we were limited to people that work directly for workshop companies that do restoration and preservation of classic cars, and to engineers of certification companies. Once the experts had been chosen (see Table 1), we conducted meetings with them, showing the platform and requesting feedback with a small interview. In the interim between interviews, we kept updating the platform based on the feedback received, checking how satisfaction with it evolved. Upon completion of all the interviews, all data are aggregated and used to evaluate the results. This is still an ongoing task. From the interviews done so far, the feedback received has been mostly positive, with great interest in the project being developed. Among the suggested features that were already implemented are coloring the tasks that require evidence gathering according to FIVA requirements, a text field for each task that allows any additional information to be added when necessary, and a few other usability improvements. To validate the owners' interaction with the system, we decided to use an interview approach [11]. These interviews will be performed with any classic car owner willing to participate, not requiring prior knowledge. As a result, we should get a good idea of new users' overall satisfaction levels while using our platform. After being informed of our work, several classic car owners and longtime customers of the workshop showed great interest in working with us to test the platform. As of the writing of this document, these interviews have not yet been conducted, because priority was given to finishing the validation of the workshop workers' interaction before starting the validation of the owners' interaction. This choice was made because, while the workers' interaction directly affects the owners' GUI, the owners' interaction barely affects the workers' experience.
Table 1 Expert panel characterization

Profession  Expertise                   Field of work          Years of experience
Manager     Plant shop floor works      Car body restore shop  20
Secretary   CRM                         Car body restore shop  15
Manager     HR management               Car body restore shop  15
Engineer    Classic cars certification  Certification body     25
Researcher  BPM modeling R&D            (not stated)           25


5 Conclusion and Future Work In this paper, we described our ongoing effort to develop and validate a platform for monitoring the progress of classic cars’ restoration process and recording evidence to allow documenting it in future certification initiatives. We took FIVA’s Charter of Turin’s guidelines as inspiration for producing a BPMN process model that is used as the backbone of our process-aware graphical user interface. The validation feedback received until now has been very positive. This work has gathered interest from several players in the classic car restoration industry, from classic car owners to workshops and certification bodies, which will be very helpful in improving the developed platform and in future validation steps. As future work, we plan to use process mining techniques to validate the models, based on data that is already being collected. Since each classic vehicle is just a process instance, we will have to wait until a considerable number of them complete the restoration process, since process mining ideally requires a large amount of data to produce adequate results. Acknowledgements This work was produced with the support of INCD funded by FCT and FEDER under the project 01/SAICT/2016 nº 022153, and partially supported by NOVA LINCS (FCT UIDB/04516/2020).

References
1. Autocar (2018) The 13 most expensive cars ever sold. https://www.autocar.co.uk/car-news/industry/12-most-expensive-cars-sold-auction
2. Fattouch N, Lahmar IB, Boukadi K (2020) IoT-aware business process: comprehensive survey, discussion and challenges. In: 29th international conference on enabling technologies: infrastructure for collaborative enterprises (WETICE). IEEE, pp 100–105
3. Gibbins K (2018) Charter of Turin handbook. Tech. rep., Fédération Internationale des Véhicules Anciens (FIVA). https://fiva.org/download/turin-charter-handbook-updated-2019english-version/
4. ISO Central Secretary (2011) Systems and software engineering—Systems and software Quality Requirements and Evaluation (SQuaRE)—system and software quality models. Standard ISO/IEC 25010:2011, International Organization for Standardization, Geneva, CH. https://www.iso.org/standard/35733.html
5. Kady C, Chedid AM, Kortbawi I, Yaacoub C, Akl A, Daclin N, Trousset F, Pfister F, Zacharewicz G (2021) IoT-driven workflows for risk management and control of beehives. Diversity 13(7):296
6. Li M, Smidts CS (2003) A ranking of software engineering measures based on expert opinion. IEEE Trans Software Eng 29(9):811–824
7. Lívio D (2022) Process-based monitoring in industrial context: the case of classic cars restoration. Master's thesis, Monte da Caparica, Portugal. http://hdl.handle.net/10362/141079
8. Makke O, Gusikhin O (2020) Robust IoT based parking information system. In: Smart cities, green technologies, and intelligent transport systems. Springer, pp 204–227
9. Pinna D (2008) Real-time business processes visualization in document processing systems. Master's thesis, Torino, Italia


10. Schinle M, Erler C, Andris PN, Stork W (2020) Integration, execution and monitoring of business processes with chaincode. In: 2nd conference on blockchain research & applications for innovative networks and services (BRAINS). IEEE, pp 63–70
11. Seidman I (2006) Interviewing as qualitative research: a guide for researchers in education and the social sciences. Teachers College Press
12. Susman GI, Evered RD (1978) An assessment of the scientific merits of action research. Adm Sci Q 582–603

An Overview on the Identification of Software Birthmarks for Software Protection Shah Nazir and Habib Ullah Khan

Abstract Software birthmarks were created in order to identify instances of software piracy. The concept of a software birthmark was established in response to the limitations of watermarks, fingerprints, and digital signatures, which make it challenging to determine the identity of software. Software programs can be compared based on their extracted properties and birthmarks to determine who owns the software. Birthmarks are used to identify a specific programming language's executable and source code. By analyzing software birthmarks from various viewpoints, researchers and practitioners can create new methods and processes for software birthmarks on the basis of which piracy is effectively identified. The goal of the current study is to comprehend the specifics of software birthmarks in order to gather and evaluate the information provided in the existing literature and to facilitate the advancement of future studies in the field. Numerous notable software birthmarks and current techniques have been uncovered by the study. Various sorts of analyses were conducted in accordance with the stated study topics. According to the study, more research needs to be done on software birthmarks to build accurate and reliable systems that can quickly and accurately find stolen software and stop software piracy. Keywords Software security · Birthmark · Software birthmark · Software measurements

S. Nazir (B) Department of Computer Science, University of Swabi, Swabi, Pakistan e-mail: [email protected] H. U. Khan Department of Accounting & Information Systems, College of Business & Economics, Qatar University, Doha, Qatar © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 S. Anwar et al. (eds.), Proceedings of International Conference on Information Technology and Applications, Lecture Notes in Networks and Systems 614, https://doi.org/10.1007/978-981-19-9331-2_27

1 Introduction Software piracy is a major issue for the software business, which suffers severe business losses as a result of this infringement. Software piracy is the unlicensed, illegal use of software. The prevention of software piracy is crucial for the expanding


software industry's economy. Researchers are attempting to develop methods and tools to stop software piracy and outlaw the use of illegally obtained software. Pirated software has a number of drawbacks: its users are cut off from the advantages of software upgrades, constant technical support, assurance of virus-free software, thorough program documentation, and quality assurance. Different strategies are in use to stop software piracy. Such techniques include fingerprints [1], watermarks [2–6], and software birthmarks [7–16]. The disadvantage of a watermark is that it can be removed using code obfuscation and semantics-preserving transformation techniques. Software fingerprints have the same problem. To get around these restrictions, the birthmark, a well-known and widely acknowledged technique for preventing software piracy, was created. Software birthmarks are fundamental characteristics of software that can be utilized to determine the distinct identity of software and later be used as proof of software theft. The concept of a software birthmark was developed in response to the limitations of watermarks, fingerprints, and digital signatures, which make it challenging to determine the identity of the software. Birthmarks were first thought about long before they were legally recognized in 2004. Software birthmarks are typically used for Windows API and software theft. If a piece of software has more intrinsic features, it should be regarded as having a strong birthmark. In the end, the birthmark's durability will enable quick and accurate software uniqueness identification. The birthmark of software is based on the reliability and toughness of the software [7]. The suggested research makes a contribution by presenting a comprehensive, in-depth investigation of software birthmarks, which are employed for a variety of applications but primarily for the detection of software piracy.

2 Identification of Software Birthmark to Prevent Piracy A software program is, in general, a pool of several software features of a particular kind. If a birthmark has additional characteristics, it is referred to as a strong birthmark. For example, the birthmark designed by Nazir et al. [17] is considered a strong birthmark, as it consists of more software features. This birthmark mostly has four characteristics. The pre-condition feature category was omitted after doing the initial analysis because it is included in almost all types of software. The remaining three feature categories were then utilized. Sub-categories were then created for each of the three main categories. Program contents, program flow, internal data structure, configurable terminologies, control flow, interface description, program size, program responses, restriction, naming, functions, thorough documentation, limitation and constraints, user interface, statements in the program, internal quality, and global data structure are the 17 features that were taken into consideration for the input category. Automation, scalability, ease of use, applicability, friendliness, robustness, portability, scope, interface connections, standard, reliance, and external

Table 1 Different forms of software birthmarks
S. no  Refs.  Technique of birthmarks
1      [9]    DKISB
2      [11]   JSBiRTH
3      [15]   Dynamic K-Gram
4      [16]   K-gram
5      [19]   Dynamic key instruction sequences
6      [18]   Birthmark-based features
7      [20]   System call dependence graph
8      [21]   Optimized grouping value
9      [22]   Static major-path birthmarks
10     [23]   Thread-aware birthmarks
11     [24]   System Call-Based Birthmarks
12     [25]   Method-based static software birthmarks
13     [26]   Static Object Trace Birthmark
14     [27]   Static instruction trace birthmark
15     [28]   Static API Trace Birthmark
16     [29]   A dynamic birthmark for Java
17     [30]   Dynamic Opcode n-gram
18     [31]   API call structure
19     [32]   CVFV, IS, SMC, and UC
20     [33]   Whole Program Path Birthmarks

quality are the 12 subfeatures that make up the non-functional components. Functional specification, data and control process, behavior, and functionality are the four subfeatures that make up the functional components. Understanding these aspects and how they are logically organized makes it easier to understand the code [18]. Table 1 identifies the different forms of software birthmarks. Table 2 shows the techniques used for software birthmark.
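As a concrete illustration of one family of techniques listed in Table 1, the sketch below extracts a k-gram style birthmark, i.e., the set of all contiguous length-k opcode subsequences of a program trace; the opcode sequences are made-up examples, whereas real k-gram birthmarks are computed from actual byte code or execution traces.

```typescript
// Minimal sketch of a k-gram birthmark: the set of all contiguous
// length-k opcode subsequences of a program trace. Input is illustrative.
function kGramBirthmark(opcodes: string[], k: number): Set<string> {
  const grams = new Set<string>();
  for (let i = 0; i + k <= opcodes.length; i++) {
    grams.add(opcodes.slice(i, i + k).join(" "));
  }
  return grams;
}

// Hypothetical opcode traces of an original and a suspect program.
const original = ["iload", "iload", "iadd", "istore", "goto", "iload", "ireturn"];
const suspect  = ["iload", "iload", "iadd", "istore", "iload", "ireturn"];

const bOriginal = kGramBirthmark(original, 3);
const bSuspect  = kGramBirthmark(suspect, 3);
console.log(bOriginal.size, bSuspect.size); // 5 and 4 distinct 3-grams
```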

3 Analysis of the Existing Approaches Software birthmarking is regarded as a reliable and effective method for detecting software theft and preventing piracy. Resilience and believability are two factors that can be used to gauge how comparable two birthmarks are. For comparing software based on birthmarks, various metrics are employed. Most frequently, two birthmarks are compared using the cosine distance. Other common set-based metrics, including the Dice coefficient [14, 27] and the Jaccard index [29], are used for assessing the similarity of dynamic birthmarks. Diverse approaches are used to work with byte code-based and source code-based birthmarks.
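The set-based and vector-based metrics mentioned above can be sketched as follows; this is a generic illustration of the Jaccard, Dice, and cosine formulas (for instance over the k-gram sets from the previous sketch), not the implementation of any specific tool from Table 2, and the detection threshold shown is an arbitrary illustrative value.

```typescript
// Generic similarity metrics over birthmarks, represented as sets
// (Jaccard, Dice) or as element-frequency maps (cosine). Illustration only.
function jaccard(a: Set<string>, b: Set<string>): number {
  const inter = [...a].filter(x => b.has(x)).length;
  const union = new Set([...a, ...b]).size;
  return union === 0 ? 1 : inter / union;
}

function dice(a: Set<string>, b: Set<string>): number {
  const inter = [...a].filter(x => b.has(x)).length;
  return a.size + b.size === 0 ? 1 : (2 * inter) / (a.size + b.size);
}

function cosine(a: Map<string, number>, b: Map<string, number>): number {
  let dot = 0, normA = 0, normB = 0;
  for (const [k, v] of a) {
    dot += v * (b.get(k) ?? 0);
    normA += v * v;
  }
  for (const v of b.values()) normB += v * v;
  return normA === 0 || normB === 0 ? 0 : dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// A simple detection rule: flag a likely copy when similarity exceeds a
// chosen threshold (0.8 here is only an example value).
function isLikelyCopy(a: Set<string>, b: Set<string>, threshold = 0.8): boolean {
  return jaccard(a, b) >= threshold;
}
```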


Table 2 Approaches of software birthmark
R. no  Technique
[34]   Class invocation graph and state flow chart-based analysis of repackaging detection of mobile apps
[35]   State diagram and call tree-based comparison
[19]   Jaccard index, dice coefficient, cosine distance, and containment similarity metrics
[18]   Comparison through mining of semantic and syntactic software features
[36]   Estimating birthmark of software based on fuzzy rules
[37]   Dynamic birthmark based on API
[38]   Cosine similarity metrics
[39]   k-gram comparisons
[40]   Control flow graphs
[41]   Cosine similarity metric

Several well-known digital libraries were searched for literature on software birthmarks. Figure 1 shows these libraries and the number of publications retrieved from each, indicating that most articles were published by Springer, followed by ScienceDirect, and so on.


Fig. 1 Libraries and publication

Fig. 2 Article type and number of publications

Fig. 3 Filtering process for identification of related papers

Figure 2 represents the article types and the number of publications of each, showing that most articles were published as conference papers, followed by journal articles, and so on. The purpose of this representation is to show the increase or decrease of research work in the area. Figure 3 shows the filtering process for the identification of related papers: papers were first screened by title, then by abstract, and finally by full contents.

4 Conclusions

Software birthmarks are inherent qualities of a program that can be used to spot software theft. Software birthmarks were created with the intention of identifying software piracy, which may be total, partial, or minor in degree. The owners of


the companies that develop software suffer enormous losses as a result of software piracy. The traits extracted from software programs, generally referred to as birthmarks, can be compared to assess the ownership of software. Birthmarks are used to identify both the executable and the source code of a specific programming language. By analyzing software birthmarks from various angles, researchers and practitioners can create new methods and processes through which piracy can be effectively identified. The goal of the proposed study is to advance future research in the field by providing an understanding of the specifics of software birthmarks, based on evidence gathered from the literature and the expertise supplied within.

Conflict of Interest The authors declare no conflict of interest.

References 1. Gottschlich C (2012) Curved-region-based ridge frequency estimation and curved Gabor filters for fingerprint image enhancement. IEEE Trans Image Process 21(4):220–227 2. Thabit R, Khoo BE (2014) Robust reversible watermarking scheme using Slantlet transform matrix. J Syst Softw 88:74–86 3. Venkatesan R, Vazirani V, Sinha S (2001) A graph theoretic approach to software watermarking. In: 4th international information hiding workshop, Pittsburgh, PA, pp 157–168 4. Stern JP, Hachez GE, Koeune FC, Quisquater J-J (2000) Robust object watermarking: application to code. In: Information hiding, vol 1768, lecture notes in computer science. Springer Berlin Heidelberg, pp 368–378 5. Monden A, Iida H, Matsumoto K-I, Inoue K, Torii K (2000) A practical method for watermarking java programs. In: Compsac2000, 24th computer software and applications conference, pp 191–197 6. Cousot P, Cousot R (2004) An abstract interpretation-based framework for software watermarking. In: Proceedings of the 31st ACM SIGPLAN-SIGACT symposium on principles of programming languages, vol 39, no 1, pp 173–185 7. Nazir S et al (2019) Birthmark based identification of software piracy using Haar wavelet. Math Comput Simul 166:144–154 8. Kim D et al (2014) A birthmark-based method for intellectual software asset management. In: Presented at the 8th international conference on ubiquitous information management and communication, Siem Reap, Cambodia 9. Tian Z, Zheng Q, Liu T, Fan M (2013) DKISB: dynamic key instruction sequence birthmark for software plagiarism detection. In: High performance computing and communications & 2013 IEEE international conference on embedded and ubiquitous computing (HPCC_EUC), IEEE 10th international conference on 2013, pp 619–627 10. Ma L, Wang Y, Liu F, Chen L (2012) Instruction-words based software birthmark. Presented at the proceedings of the 2012 fourth international conference on multimedia information networking and security 11. Chan PPF, Hui LCK, You SM (2011) JSBiRTH: dynamic javascript birthmark based on the run-time heap. Presented at the proceedings of the 2011 IEEE 35th annual computer software and applications conference 12. Lim H, Park H, Choi S, Han T (2009) A static java birthmark based on control flow edges. Presented at the proceedings of the 2009 33rd annual IEEE international computer software and applications conference, vol 01


13. Zhou X, Sun X, Sun G, Yang Y (2008) A combined static and dynamic software birthmark based on component dependence graph. Presented at the international conference on intelligent information hiding and multimedia signal processing 14. Park H, Choi S, Lim H-I, Han T (2008) Detecting Java class theft using static API trace birthmark (in Korean). J KIISE: Comput Practices Lett 14(9):911–915 15. Bai Y, Sun X, Sun G, Deng X, Zhou X (2008) Dynamic K-gram based software birthmark. Presented at the proceedings of the 19th Australian conference on software engineering 16. Myles G, Collberg C (2005) K-gram based software birthmarks. Presented at the proceedings of the 2005 ACM symposium on applied computing, Santa Fe, New Mexico 17. Nazir S, Shahzad S, Nizamani QUA, Amin R, Shah MA, Keerio A (2015) Identifying software features as birthmark. Sindh Univ Res J (Sci Ser) 47(3):535–540 18. Nazir S, Shahzad S, Nizamani QUA, Amin R, Shah MA, Keerio A (2015) Identifying software features as birthmark. Sindh Univ Res J (Sci Ser) 47(3):535–540 19. Tian Z, Zheng Q, Liu T, Fan M, Zhuang E, Yang Z (2015) Software Plagiarism detection with birthmarks based on dynamic key instruction sequences. IEEE Trans Softw Eng 41(12):1217– 1235 20. Liu K, Zheng T, Wei L (2014) A software birthmark based on system call and program data dependence. Presented at the proceedings of the 2014 11th web information system and application conference 21. Park D, Park Y, Kim J, Hong J (2014) The optimized grouping value for precise similarity comparison of dynamic birthmark. Presented at the proceedings of the 2014 conference on research in adaptive and convergent systems, Towson, Maryland 22. Park S, Kim H, Kim J, Han H (2014) Detecting binary theft via static major-path birthmarks. Presented at the proceedings of the 2014 conference on research in adaptive and convergent systems, Towson, Maryland 23. Tian Z, Zheng Q, Liu T, Fan M, Zhang X, Yang Z (2014) Plagiarism detection for multithreaded software based on thread-aware software birthmarks. Presented at the proceedings of the 22nd international conference on program comprehension, Hyderabad, India 24. Wang X, Jhi Y-C, Zhu S, Liu P (2009) Detecting software theft via system call based birthmarks. In: Computer security applications conference. ACSAC ’09. Annual, pp 149–158 25. Mahmood Y, Sarwar S, Pervez Z, Ahmed HF (2009) Method based static software birthmarks: a new approach to derogate software piracy. In: Computer, control and communication. IC4 2009. 2nd international conference on 2009, pp 1–6 26. Park H, Lim H-I, Choi S, Han T (2009) Detecting common modules in java packages based on static object trace birthmark (in English). Comput J 54(1):108–124 27. Park H, Choi S, Lim H, Han T (2008) Detecting code theft via a static instruction trace birthmark for Java methods. In: 2008 6th IEEE international conference on industrial informatics, pp 551–556 28. Park H, Choi S, Lim H, Han T (2008) Detecting java theft based on static API trace birthmark. In: Advances in information and computer security Kagawa, Japan. Springer-Verlag 29. Schuler D, Dallmeier V, Lindig C (2007) A dynamic birthmark for java. Presented at the proceedings of the twenty-second IEEE/ACM international conference on automated software engineering, Atlanta, Georgia, USA 30. Lu B, Liu F, Ge X, Liu B, Luo X (2007) A software birthmark based on dynamic opcode n-gram. Presented at the proceedings of the international conference on semantic computing 31. 
Choi S, Park H, Lim H-I, Han T (2007) A static birthmark of binary executables based on API call structure. Presented at the proceedings of the 12th Asian computing science conference on advances in computer science: computer and network security, Doha, Qatar 32. Tamada H, Nakamura M, Monden A (2004) Design and evaluation of birthmarks for detecting theft of java programs. In IASTED international conference on software engineering, pp 17–19 33. Myles G, Collberg C (2004) Detecting software theft via whole program path birthmarks. In: Zhang K, Zheng Y (eds) Information security: 7th international conference, ISC 2004, Palo Alto, CA, USA, 27–29 Sept 2004. Proceedings. Springer Berlin Heidelberg, Berlin, Heidelberg, pp 404–415


34. Guan Q, Huang H, Luo W, Zhu S (2016) Semantics-based repackaging detection for mobile apps. In: Caballero J, Bodden E, Athanasopoulos E (eds) Engineering secure software and systems: 8th international symposium, ESSoS 2016, London, UK, April 6–8, 2016. Proceedings. Springer International Publishing, Cham, pp 89–105 35. Anjali V, Swapna TR, Jayaramanb B (2015) Plagiarism detection for java programs without source codes. Procedia Comput Sci 46:749–758 36. Nazir S, Shahzad S, Khan SA, Ilya NB, Anwar S (2015) A novel rules based approach for estimating software birthmark. Sci World J 2015 37. Daeshin P, Hyunho J, Youngsu P, JiMan H (2014) Efficient similarity measurement technique of windows software using dynamic birthmark based on API (in Korean). Smart Media J 4(2):34–45 38. Kim D et al (2013) Measuring similarity of windows applications using static and dynamic birthmarks. Presented at the proceedings of the 28th annual ACM symposium on applied computing, Coimbra, Portugal 39. Jang M, Kim D (2013) Filtering illegal Android application based on feature information. Presented at the proceedings of the 2013 research in adaptive and convergent systems, Montreal, Quebec, Canada 40. Jang J, Jung J, Kim B, Cho Y, Hong J (2013) Resilient structural comparison scheme for executable objects. In: Communication and computing (ARTCom 2013), fifth international conference on advances in recent technologies in 2013, pp 1–5 41. Chae D-K, Ha J, Kim S-W, Kang B, Im EG (2013) Software plagiarism detection: a graphbased approach. Presented at the proceedings of the 22nd ACM international conference on information & knowledge management, San Francisco, California, USA

The Mayan Culture Video Game—“La Casa Maya” Daniel Rodríguez-Orozco, Amílcar Pérez-Canto, Francisco Madera-Ramírez, and Víctor H. Menéndez-Domínguez

Abstract One of the most important cultures in Central America is the Mayan Culture, whose preservation is essential to keep its traditions alive. In this work, a video game set in a Mayan environment is proposed. The objective is for the player to acquire the knowledge needed to deepen their understanding of this beautiful culture and to gain new experiences as a Mayan individual. The video game consists of a tour of several places with Mayan traditions, where the user can collect coins to earn points and learn important information about the civilization. We want the user to learn through entertainment technology and feel engaged by the Mayan culture. Keywords Mayan culture · Educational video game · Computer graphics · Gamification

D. Rodríguez-Orozco · A. Pérez-Canto · F. Madera-Ramírez · V. H. Menéndez-Domínguez (B) Universidad Autónoma de Yucatán, Mérida, México e-mail: [email protected] D. Rodríguez-Orozco e-mail: [email protected] A. Pérez-Canto e-mail: [email protected] F. Madera-Ramírez e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 S. Anwar et al. (eds.), Proceedings of International Conference on Information Technology and Applications, Lecture Notes in Networks and Systems 614, https://doi.org/10.1007/978-981-19-9331-2_28

1 Introduction

Currently, the Mayan Culture in the Yucatan Peninsula (Mexico) has been losing much of its presence over the years, mainly in the cities, which has prompted several studies to identify the possible causes [20]. In this sense, the role that education plays in promoting and preserving this culture is undeniable. Teaching methods proposed in many schools can be improved, since reading, the use of books, and extensive research tasks are not a habit in Mexican society, so students are losing interest in knowing their roots and cultural aspects and are not passing on the customs and traditions that


the Mayan culture has passed down to us. Nevertheless, technological innovations, such as virtual scenarios, wireless mobile devices, digital teaching platforms, and virtual and augmented reality, increase students' interest and motivation, as well as enriching their learning experience [17, 21]. In this sense, due to technological innovations in recent years, video games have gained great importance among the young population, as can be seen at conferences such as GDC and SIGGRAPH (https://gdconf.com, https://www.siggraph.org/). We decided to pursue a playful and engaging strategy, using latest-generation software to make a cultural video game, as we believe it is an effective way to connect with people interested in living new experiences through our culture, and it allows the transmission of the wonderful customs and traditions offered in Yucatan, a Mexican state. So why is it important to promote and preserve the Mayan culture? Because it represents the link and the teachings that our ancestors left us to maintain our identity; many aspects related to the Mayan language, traditions, and customs explain much of our personality and allow us to share new ideas with other cultures. For this purpose, the steps of the construction of the video game "La Casa Maya" are described, covering important aspects such as the architecture of Mayan buildings, the geographic location, the distribution of the Mayan Solar, and some important utensils created by the culture.

2 Development of the Mayan Culture Video Game

2.1 Preliminary Investigation

The Mayan culture contains many important aspects that reflect its multiculturalism, and for many people this will be their first contact with the Mayan culture; thus, it was decided to explain many of the aspects that make up the Mayan culture in a simple way. We placed small posters containing the necessary information so that the player can understand the meaning of each element and enjoy the video game without feeling pressured to learn everything immediately. Some of the main concepts considered are the following: "The Huano Houses", which are structures where the Mayas used to take refuge; "The Pib Oven", where food is prepared; and "The Hamacas", which are used to rest. This project revolves around how the Maya managed to survive the threatening nature found in Yucatan, so it was decided to focus on the portions of land called "solares", each associated with a house where the Mayas feel safe (Fig. 1).

2.1.1 Background Information

We collected information about existing video game titles that tell stories about the Mayan culture, as follows.


Fig. 1 Screenshots of the gameplay: the house's interior (left) and an information poster on the ground (right)

• Age of Empires: This is a classic video game that lets you rebuild the power of the Mexica from scratch: you will be the power behind the throne in charge of building the city, and you will also define the speed of its development as a society [26]. The video game deals only slightly with the Mayan culture, so it does not touch on important topics such as the traditions and experiences of the native Mayans in Yucatan, and its focus is not purely educational.
• Mictlan: An Ancient Mythical Tale: Mictlan is set in a fantastic world strongly influenced by the pre-Columbian Mesoamerican cultures. Users immerse themselves in a dark and varied world, exploring detailed locations with incredible narrative depth while experiencing a rich atmosphere of a hidden past [16]. The game is not entirely based on real facts, and its audience is reduced by the amount of violence it depicts, owing to its themes of conquest. The main purposes of our video game are to form an educational environment that reaches all types of audiences, is intuitive, and offers truthful and useful information that serves as a supporting resource to preserve the Mayan culture.
• Naktan: A 3D adventure game that aims to recreate scenarios, characters, and part of the Mayan culture. Akbal, the main character, is a 12-year-old boy who must search for his family and, along the way, learn to become a warrior, understanding all the mystery that surrounds him [13]. Naktan would probably have been a great video game reflecting the Mayan culture, but unfortunately it could not be finished due to a lack of economic resources.

2.1.2 The Mayan Solar

Traditionally, the Mayan Solar is the property or plot of land, between 250 and 1,000 m² (Fig. 2), where most of the activities of the Maya family in Yucatan take place, separated from the outside by an albarrada (a wall of stones). It consists of the Casa Maya, some small buildings, and an open area delimited by the albarrada [4]. The Mayan Solar on the Yucatan peninsula can be divided into two zones: intensive use and extensive use [10].


Fig. 2 Photo of a Mayan Solar [3]

The intensive use zone includes the space near the house, where the laundry room, kitchen, water tanks, and animal housing (e.g., chicken coops) are located. Fruit trees, vegetables, and ornamental and medicinal plants are also cultivated [10]. The extensive use area includes secondary vegetation that is utilized for firewood, construction material, and organic fertilizers [5, 10, 18]. There are many buildings on each Solar; the main one (traditionally located in the center of the Solar) is the dwelling house (naj, in the Maya language), which includes the Maya house and kitchen (k'oben, in the Maya language) and whose roof is built from guano palms. Surrounding the main building are a seedbed (k'anché), an elevated structure for storing corn cobs and other medicinal plants, a well (ch'e'en) for storing water in places where there are no cenotes, and a batea (nukulíp'o'o) [4].

2.2 General Idea About the Subject of the Video Game

The video game is classified in the serious game genre, since teachers can engage their students with educational content so that students learn while having fun [9]. The video game is aimed at all audiences who wish to learn about the Mayan Culture, but especially at students between 8 and 25 years old. It is developed with a first-person view so that the player feels the close experience of being a Mayan traveler looking for knowledge. The protagonist is Itzamná, a wise and all-powerful Mayan god who created the world and all that inhabits it. He has promised to share his knowledge of the culture through a tour he has planned for the player. At the beginning of the game, Itzamná creates coins across the map to guide the player to interesting places with valuable information that will complete the knowledge needed to finish the tour. The video game is developed to be entertaining and eye-catching, so that players stay as long as possible learning about the Mayan culture; immersion is necessary, as it incites the player to know the game, commits them to play it constantly, gives them a fun experience, and gives them the ability to concentrate on the game [1].


Fig. 3 Two map perspective visualization, from a top-view camera

One way to achieve this immersion is through "presence", defined as the feeling of "being there", of experiencing a virtual space as if it were real [14]. To achieve this feeling, the player can explore the entire map freely and observe the details of each element, so that these become familiar and the player stops perceiving the game as a virtual space and feels that they are in an authentic Mayan house (Fig. 3).

2.3 3D Meshes and Models

3D models were created in Blender (https://www.blender.org/) and Cinema 4D (https://www.maxon.net/es/cinema-4d), using tools that facilitate the texturing and deformation of the models. The creation workload was divided into two segments: the modeling of small meshes that we call "props", and the elaboration of larger meshes (from approximately 538 to 28,128 polygons) that form the architectural base of the map.

2.3.1 Props Models

Small 3D models in video games are known as "props" and contain a small number of polygons compared to the architectural models. Their function is generally aesthetic: they provide a more natural and splendid environment for the map and draw the player into an immersion that enhances the gaming experience. Models were made at the original scale of each object, using different visual references to carry out their modeling (Fig. 4).
• Dishes and Ceramics: For the Mayan culture, utensils are more than containers for food; some are used to measure the amount of food, while others hold side dishes [15].
• Metate: A rectangular-shaped stone tool for grinding food, especially harvested grains such as maize [24]. The symbolic value of this element is great, since maize is important for the Mayans.


Fig. 4 Some of the 12 props employed in the game

• Portable Furnace: Also called a "comal", it is a small oven in which firewood is introduced through a hole and on top of which foods such as tortillas are cooked. It is useful to take into the rainforest for outdoor cooking [15].
• Tables: There is a great variety of tables, and large stones are employed to support them.
• Wooden Chair: At mealtime, Mayans would gather in the kitchen, which also served as a social area, and sit on wooden logs. These wooden trunks were also used by the women who prepared the maize dough, as it was a very long and tiring process.

2.3.2 Architectural Models

The architecture of the video game map is the spatial or cartographic distribution of the buildings, trees, and all kinds of objects that captivate the player. The project focuses mainly on the ecosystem that the state of Yucatan provides, so through many images and references we were able to create a similar environment. For the setting of the map, 10 structures were modeled to make the game experience more realistic (Fig. 5).
• The albarrada is a wall made of stones that delimits a plot of land, usually used in house lots to determine their size. It serves as the limit of the map.
• The huano house is one of the indispensable structures of the Mayas, since it is here where one sleeps, cooks, washes, lives, and worships.

Fig. 5 Some architectural models utilized in the video game


called “vara” that allowed the Mayas to measure the head of the family [6]. The sensation of living in a Mayan house is of relief, because of the nature of its location, this type of structure provides a cool and shade that counteract the heat of the Yucatan Peninsula, suitable for the vegetation which allows these houses can be built because the materials for construction of this type are found in the region [23]. • Hamaca: Of Caribbean origin, the “hamaca” (from hamack which means tree in Haitian) is the place where the Mayas rested and slept in their homes. Originally made from the bark of the hamack tree, they were later made from mescal or henequen as they provided greater comfort [22]. • The Mayan “Comal” consists of three stones (tenamascles) and a clay disc on which the food is placed, and a fire is lit below. This is where the tortillas are cooked [8]. 2.3.3

Level Map Design

Many articles on the Yucatan ecosystem were investigated, including the topography and vegetation that exist in the area. No animals were included, but there are plans to incorporate endemic species. Yucatan has a dry and semi-dry climate, with temperatures of approximately 36 °C (96.8 °F) [11]. This type of climate allows the planting and harvesting of various crops, such as beans, corn, oranges, and henequen. It is no coincidence that the Mayan culture was dedicated to harvesting such crops to include in its daily diet.

3 Experimentation and Results

3.1 Usability Analysis of the Video Game

In this section, the first usability test of the video game is presented. It is important to note that the video game is still in the development phase and that this is the first contact with a real audience, who tested the initial demo version of the video game. By usability, we refer to a quality attribute that establishes how easy and useful a system is to use, assessing whether users interact with the system in the easiest, most comfortable, and most intuitive way [7]. The aim of the experiment was to record the opinion of a group of people about the video game interface in relation to its usefulness and ease of use, employing the SUS (System Usability Scale) survey [2]. The SUS tool has been used in system and application usability studies in both industry and academia to compare systems [25], ensuring its effectiveness as a tool for measuring the perceived usability of a system. According to Tullis and Stetson, with a small sample group it is possible to obtain reliable results on the perceived ease of use of a system [25], and it is possible to find


85% of the usability problems of a product with at least 5 people, so feedback from a test with a sample of 5 is enough to fix software problems [12]. The participant group consisted of a representative sample of 5 male Mexican students from the Faculty of Mathematics at UADY (Universidad Autonoma de Yucatan), whose training area is Engineering and Technology and whose ages ranged between 19 and 23 years old. These students have maintained basic contact with the Mayan culture in the state of Yucatan, and they have between 5 and 10 years of experience in the use of video games. The experiment was conducted online. First, each participant responded to a survey (https://forms.gle/f7ow3VpbTm33PCXY9) with general information about their background and experience with the Mayan culture and video games. Next, a link was sent to each participant to download and install the video game demo, and finally, a brief opinion was requested about improvements to the game and the experience obtained. The SUS survey is a questionnaire with 10 items that users score according to their level of acceptance, using a Likert scale from 1 (strongly disagree) to 5 (strongly agree). The algorithm described by Brooke [2] in SUS, A quick and dirty usability scale, was used to obtain a total score from 0 to 100. A score above 70 on the test categorizes the usability of the system as Acceptable, above 85 as Excellent, and equal to 100 as the best imaginable. A graph with the scores obtained is shown in Fig. 6. Averaging the evaluation results gives a score of 78.5, an acceptable score because it is above 70, which the SUS classifies as "good" since it is above the midpoint of 50 [2]. From the results obtained, we can conclude that it is necessary to improve aspects related to the gameplay and user interface, as well as certain technical details for the proper functioning of the game. Some of the participants' comments highlight that the video game is quite good and that after playing they can distinguish the essential aspects of the Mayan Culture. However, they would like to see more mechanics implemented to make the game more challenging. We consider that the results are good for the video game demo.

Fig. 6 SUS test results of the experiment carried out
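As a purely illustrative sketch of the SUS scoring procedure applied above (odd-numbered items contribute their score minus 1, even-numbered items contribute 5 minus their score, and the sum is multiplied by 2.5), the fragment below computes the score for one hypothetical set of responses; the answers shown are invented and are not data from this experiment.

# Minimal sketch of SUS scoring following Brooke's algorithm; the responses
# below are hypothetical, not the answers collected in this study.
def sus_score(responses):
    # responses: the 10 Likert answers (1-5), in questionnaire order
    if len(responses) != 10:
        raise ValueError("SUS requires exactly 10 items")
    contributions = [
        (r - 1) if i % 2 == 0 else (5 - r)  # odd-numbered items: r - 1; even-numbered: 5 - r
        for i, r in enumerate(responses)
    ]
    return sum(contributions) * 2.5  # scales the 0-40 sum to a 0-100 score

hypothetical_participant = [4, 2, 5, 1, 4, 2, 5, 2, 4, 1]
print(sus_score(hypothetical_participant))  # 85.0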


Even so, we will continue working to increase the video game quality and to improve the project by implementing more missions and map variations to cover more themes of the Mayan culture. In this section, we share the demo of the video game; readers who wish to play the final product are cordially invited to download the video game "The Mayan Solar—(El Solar Maya)" from the following link: https://bit.ly/3mi2gTR, where the minimum hardware requirements are listed. Readers whose computers do not meet the minimum requirements are also invited to watch the video presentation of the final product, "The Solar Maya Gameplay Walkthrough", at https://bit.ly/3xoVUZp.

4 Conclusion

The purpose of this article was to present the creation of the video game "The Mayan Solar" to attract people who want to learn about the customs and traditions of the Mayan culture. The Mayan culture plays a very important role in the personality of many Mexicans due to the great expansion and importance that this culture had in the south of the Mexican Republic and in neighboring countries of Central America. Great vestiges and traditions can still be found among the populations that remain in contact with the native people. We will continue working on updates to the video game so that it can compete with other products on the market and thus support the transmission of culture and the preservation of traditions and customs. Further experiments must also be conducted with more participants from different backgrounds.

References 1. Armenteros M, Fernández M (2011) Inmersión, presencia y fow. Contratexto 0(019):165–177 2. Bangor A, Kortum PT, Miller JT (2009) Determining what individual SUS scores mean: adding an adjective rating scale. J Usability Stud 4(3):114–123; Brooke J (2004) SUS—a quick and dirty usability scale. Usability Eval Ind 3. Brown A (2009) Flickr, Solar Maya (recovered on june 6, 2022). https://www.flickr.com/pho tos/28203667@N03/4225464260 4. Cabrera Pacheco AJ (2014) Estrategias de sustentabilidad en el solar maya Yucateco en Mérida, México. https://rua.ua.es/dspace/bitstream/10045/34792/1/ana-cabrera.pdf 5. Castaneda Navarrete J, Lope Alzina D, Ordoñez MJ (2018). Los huertos familiares en la península Yucatán. Atlas biocultural de huertos familiares México (recovered on june 6, 2022). https://www.researchgate.net/publication/328103004_Los_huertos_familiares_en_la_ peninsula_de_Yucatan 6. Chavez ONC, Vázquez AR (2014) Modelo Praxeológico Extendido una Herramienta para Analizar las Matemáticas en la Práctica: el caso de la vivienda Maya y levantamiento y trazo topográfico. Bolema: Boletim de Educação Matemática 28(48):128–148 7. Dumas JS, Reddish JC (1999) A practical guide to usability testing. Intellect Rev(1) 8. Escobar Davalos I (2004) Propuesta para mejorar el nivel de aceptación de las preparaciones culinarias tradicionales de la sierra ecuatoriana, aplicadas a la nueva cocina profesional. Universidad Tecnológica Equinoccial


9. Fuerte K (2018) ¿Qué son los Serious Games? Instituto para el Futuro de la Educación, Tecnológico de Monterrey (recovered on august 27, 2022). https://observatorio.tec.mx/edunews/que-son-los-serious-games 10. Herrera Castro ND (1994) Los huertos familiares mayas en el oriente de Yucatán. Etnoflora yucatanense, fascículo 9. Universidad Autónoma de Yucatán 11. INEGI (2018) Información por entidad, Yucatán, Territorio, Relieve (recovered on june 6, 2022). https://cuentame.inegi.org.mx/default.aspx 12. Lewis JR (2014) Usability: lessons learned … and yet to be learned. Int J Hum-Comput Interact 30(9):663–684 13. MartinPixel (2017) Naktan, un videojuego desarrollado en México que busca difundir la cultura maya, (recovered on june 6, 2022) https://www.xataka.com.mx/videojuegos/naktan-un-videoj uego-desarrollado-en-mexico-que-busca-difundir-la-cultura-maya 14. Mcmahan A (2003) Immersion, engagement, and presence: A method for analyzing 3-D video games. Video Game Theory Reader 67–86 15. Mexico Documents (2015). Utencilios Mayas. vdocuments.mx (recovered on june 6, 2022). https://vdocuments.mx/utensilios-mayas1.html 16. Mictlan: An Ancient Mythical Tale (2022). Steam, indie videogames, mictlan: an ancient mythical tale (recovered on june 6, 2022). https://store.steampowered.com/app/1411900/Mic tlan_An_Ancient_Mythical_Tale/?l=spanish 17. Nincarean D, Alia MB, Halim NDA, Rahman MHA (2013) Mobile augmented reality: the potential for education. Procedia Soc Behav Sci 103:657–664 18. Ordóñez Díaz MDJE (2018) Atlas biocultural de huertos familiares en México: Chiapas, Hidalgo, Oaxaca, Veracruz y península de Yucatán 19. Osalde A (2022) Las Albarradas: el Legado de Apilar Piedra Sobre Piedra, Yucatán Today, (recovered on march 15, 2023). https://yucatantoday.com/las-albarradas-el-legado-de-apilarpiedra-sobre-piedra/ 20. Ramírez Carrillo LA (2006) Impacto de la globalización en los mayas yucatecos. Estudios de cultura maya 27:73–97 21. Roussos M, Johnson A, Moher T, Leigh J, Vasilakis C, Barnes C (1999) Learning and building together in an immersive virtual world. Presence 8(3):247–263 22. Sánchez ARP, Contreras PT (2018) Hamacas artesanales como producto de exportación. Jóvenes En La Ciencia 4(1):1272–1277 23. Sánchez Suárez A (2006) La casa maya contemporánea: Usos, costumbres y configuración espacial. Península 1(2):81–105 24. Searcy MT (2011) The life-giving stone. University of Arizona Press, Ethnoarchaeology of Maya metates 25. Tullis T, Albert W (2016) Measuring the user experience: collecting, analyzing, and presenting usability metrics. Morgan Kaufmann, Amsterdam 26. Xiu (2020). Matador network, 6 videojuegos basados en la época prehispánica (recovered on june 6, 2022). https://matadornetwork.com/es/videojuegos-sobre-la-epoca-prehispanica/

Impact of Decentralized and Agile Digital Transformational Programs on the Pharmaceutical Industry, Including an Assessment of Digital Activity Metrics and Commercial Digital Activities António Pesqueira , Sama Bolog, Maria José Sousa , and Dora Almeida Abstract Developing digital transformational and measurement capabilities within the pharmaceutical industry is considered one of the most important factors for delivering commercial excellence and business innovation. Digital transformational programs (DTP) are criticized and evaluated from different perspectives, including how they are used and how they generate value for pharmaceutical companies. From March 2nd through April 18th, 2022, 315 pharmaceutical professionals and leaders were surveyed on the impact of decentralized and agile digital transformational programs on the pharmaceutical industry via an online structured questionnaire with closed questions, including questions assessing digital activity metrics and commercial digital activities. This paper conducted assessments with various assumptions about innovation, the relevance of decentralized and agile initiatives, and the impact on commercial excellence to gain insight into the complexity of these assumptions and to evaluate the overall value of digital empowerment and knowledge increase. These results and comparable questionnaire analyses show the importance of using new decentralized digital technologies, understanding metrics, and adopting new DTP that enhance the ability of industry professionals to make more effective diagnoses, perform better digital procedures, and access appropriate information. Statistical analysis indicates that the findings relate to the impact and innovation created by DTP on product launch strategies, but also to the overall impact on innovation generation within companies.

A. Pesqueira (B) · M. J. Sousa University Institute of Lisbon (ISCTE), Lisbon, Portugal e-mail: [email protected] M. J. Sousa e-mail: [email protected] S. Bolog University of Basel, Basel, Switzerland e-mail: [email protected] D. Almeida Independent Researcher, Lisboa, Portugal © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 S. Anwar et al. (eds.), Proceedings of International Conference on Information Technology and Applications, Lecture Notes in Networks and Systems 614, https://doi.org/10.1007/978-981-19-9331-2_29


Keywords Digital research findings · Digital metrics · Commercial excellence · Metrics insights · Innovation

1 Introduction

Despite not being a new topic in the pharmaceutical context, digital transformational programs (DTP) can bring benefits to different stakeholders in the commercial field, although not all internal operations and functions benefit from them directly. The pharmaceutical industry is constantly creating new applications for analyzing and displaying the big data available to all stakeholders in the health system in a powerful way, which automatically creates opportunities for DTP. These applications can, moreover, be a driving force of change in the sector, particularly with the use of digital mobile data [2, 10]. Different stakeholders within pharma are responsible for different data components: healthcare providers with respect to providing better healthcare services, researchers and developers of new products aimed at improving quality of life, as well as other stakeholders involved in health-related processes and in sharing digital healthcare data [7]. Personal health data are sensitive; therefore, ethical and legal questions must be considered when analyzing this data, especially with the use of DTP. The use of decentralized technologies like blockchain and of agile digital strategies helps to optimize decision-making and business strategy execution processes, but the most effective and efficient methods are not always utilized [1]. In this study, we examine how digital transformation is impacting and influencing the pharmaceutical industry, with a key focus on commercial functions. Part of the selected methodology is also a better understanding of DTP innovations and of the factors that influence digital adoption by pharmaceutical companies, in an attempt to answer the research questions below. This study introduces new research areas, such as assessing the impact and influence of decentralized and agile DTP on the pharmaceutical industry, but also provides a better understanding of the key metrics, learning initiatives, and digital activities that are deemed relevant. The primary research questions are as follows: Question 1—Do new decentralized and agile digital transformational programs impact brand strategy, commercial execution, and new product launches? Question 2—Which are the most important factors that facilitate digital transformations? Question 3—What are the relevant metrics and digital activities that form part of digital transformation?

2 Literature Review

Managing and optimizing digital channels is now a prerequisite for pharmaceutical companies, as is focusing on substantial investments clearly connected with


DTP. Additionally, COVID-19 has caused significant changes in how pharmaceutical companies interact with their market and stakeholders, as well as altering internal team dynamics and forcing several companies to change their customer experience strategies [6]. The pharmaceutical industry is developing different DTP to implement decentralized or web-based solutions, as well as applying agile models and working concepts to engage with different stakeholders [3]. Pharmaceutical research and development is utilizing artificial intelligence to discover new drugs, aid clinical trials, and improve supply chain management. The use of digital communications to disseminate educational materials and wellness advice is increasing among pharmaceutical companies seeking to better engage their patients [11]. Nevertheless, compliance oversight continues to be a time-consuming process for most large pharmaceutical companies. It is further complicated by the fact that sales teams must be trained in an engaging and informative manner, reports must be produced for the board of directors and senior management, interactions with healthcare professionals (HCPs) must be monitored and audited, and, when necessary, violations must be investigated and remedied [9]. The coordination of compliance across legal, human resources, sales, and marketing departments is a critical aspect of this process [8]. Technology solutions such as knowledge management software, content management, workflow tools, and other tools can help with this digital transformation effort. The purpose of DTP is to modernize different operational programs, such as commercial programs, using technology to streamline operations, improve efficiency, automate repetitive tasks, and engage relevant stakeholders more proactively. Among the areas for which automated workflows and timely completion will prove beneficial are employee onboarding, sales training, and the monitoring of product launches [5]. However, in most cases the pharmaceutical industry is still missing the wider picture, considering that digital transformation is still being redefined across different companies. Healthcare providers, key opinion leaders, regulators, and product or public decision-makers all have unique needs, biases, and preferences. A successful DTP content strategy, analytics, and digital experiences can only be accomplished when there is a focus on managing and optimizing digital channels and on concrete investments with a clear return on investment [3]. Thus, DTP are often regarded as having vast benefits, yet they are sometimes not fully understood by a broader audience. Among their benefits are automation and the improvement of the quality levels of various operations. These programs should also be a means of increasing agility, scaling different processes and results, and reducing costs, without forgetting the integration of systems and business processes [4]. As part of the implementation process, pharmaceutical companies are looking for every opportunity to blend processes and data consistently, avoiding information silos and manual intervention [4]. The ability of pharmaceutical sales and marketing teams to identify the target audience and address unmet needs by building local networks and relationships with HCPs is quite important. To meet this need, different companies


are forced to develop a more effective marketing mix that exploits all the features and benefits of each new channel and go-to-market strategy [9]. Technology has driven rapid changes over the last few years, and taking advantage of these advancements, together with the ability to promote shared networks and external partnerships, has become critical. Decentralized digital technology is an early-stage technology that allows the storage and exchange of information in a decentralized and secure manner. It can minimize friction, reduce corruption, increase trust, and empower the users of different systems by providing an effective tool for tracking and transactions. Many decentralized digital technologies like blockchain are still nascent, but they can potentially transform the healthcare and life sciences industries, creating new wealth and disrupting established models. However, key challenges include the lack of interoperability, security threats, centralization of power, and a reluctance to experiment due to recent overhype [9].

3 Methodological Approach

Part of this research work was designing an online questionnaire that contained questions concerning all proposed topics. A survey sent to 315 pharmaceutical leaders and professionals at different seniority levels asked which digital activities they thought were relevant and which influence and impact factors should be considered for enhancing the value of an overall DTP and for implementing decentralized and agile digital initiatives. In this section, we go over the methodology used, in addition to other key metrics that helped in drawing meaningful conclusions and better understanding the relationships between key variables during the questionnaire data analysis. Since the purpose of the study is to assess the impact and influence of decentralized and agile DTP on brand strategy, innovation, and commercial excellence, a quantitative method was deemed appropriate. A further aim of the selected methodology is a better understanding of digital innovations and of the factors that influence digital adoption by pharmaceutical companies, in an attempt to answer the described research questions. The selected methodology was the most appropriate strategy for the investigation, mainly due to its ability to assess and understand different digital transformation characteristics, as well as to provide a more detailed understanding of the different influencing factors of digital transformation in the pharmaceutical industry.

3.1 Questionnaire Design and Variables Selection

After the literature review, we created a structured questionnaire formulated according to the methodological approach explained above.


From March 2nd to April 18th, 2022, an online survey was conducted; the total number of respondents across all global regions was 315, all working in the pharmaceutical sector. In addition, a survey was administered to external consultants and experts to gauge their views on the designed questions and research methodology. Based on previous studies that were translated and validated by an expert committee consisting of two specialists, a statistician, and the authors, the study used a self-administered survey developed by the authors using Google Forms (Google LLC, Mountain View, CA, USA). The questionnaire was administered in English and included demographic questions as well as questions about digital transformation impact, investments, current applications, and future applications. The research strategy was to distribute a link to the questionnaire by e-mail and phone messages, describing the purpose of the study and inviting additional respondents to participate based on the initially identified respondents and their network of contacts. Respondents were asked to complete the questionnaire anonymously, to respect all respondents' right to privacy, and their affiliated organizations were not identified in the database. To validate the consistency of the questionnaire, responses from ten respondents, a representative sample of the study population, were analyzed before the distribution of the survey. We selected individuals from companies around the globe, with a confidence level of 95% (and p = q = 0.5), who have an interest in digital transformation, relevant experience or knowledge of transformative technologies such as blockchain or artificial intelligence, and a focus on the life sciences industry.
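The 95% confidence level and p = q = 0.5 mentioned above correspond to the standard Cochran sample-size formula n = z^2·p·q / e^2. The sketch below applies this formula for a hypothetical margin of error e, which the study does not report and which is assumed here only to illustrate the calculation.

# Cochran's sample-size formula for proportions: n = z^2 * p * q / e^2.
# The margin of error used below is a hypothetical value, not one reported in the paper.
import math

def cochran_sample_size(z, p, margin_of_error):
    q = 1 - p
    return math.ceil(z ** 2 * p * q / margin_of_error ** 2)

# z = 1.96 for a 95% confidence level; p = q = 0.5 assumes maximum variability.
print(cochran_sample_size(1.96, 0.5, 0.05))  # 385 respondents for an assumed 5% margin of error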

4 Results and Analysis

4.1 Descriptive Analysis

The following figures summarize the descriptive information regarding the data collection and the corresponding characteristics of the study sample. The sample includes pharmaceutical professionals from all over the world, as seen in Fig. 1. According to the respondents' organizations, vaccines are the most prevalent therapeutic sector, with 66 (21%) respondents, followed by oncology with 19% of respondents. The other therapeutic business areas are likewise represented as percentages in Fig. 2. The seniority level of the sample is quite representative: 40% of the respondents hold the job title of vice-president or senior vice-president, and 31% hold the job title of associate director, director, or another senior management position, as shown in Fig. 3. Respondents were also asked about the driving factors behind implementing decentralized and agile programs.


Fig. 1 Sample region or market where the affiliated organization is primarily located

Fig. 2 Therapeutical areas from the respondents working organizations

decentralized and agile solutions to support excellent product launches (34%), accelerate the engagement of key opinion leaders (17%), and enable faster time to market (15% of all responses) as shown in Fig. 4. In terms of sales and marketing effectiveness, the following digital metrics were highlighted in terms of their importance and business relevance: 18% are related to the tracking and success of digital initiatives against initial planning and budget, while 17% are related to digital success by segment and customer group, pricing strategy, and competitor positioning against pricing strategies, as shown in Fig. 5.


Fig. 3 Level of seniority and job title

Fig. 4 Driving forces from the organizational implementation of decentralized and agile digital programs

The final figures present a graphical analysis combining seniority and the digital factors considered when deciding on new skills or training needs in new digital development plans. Here we can see that for VPs/SVPs the most important factors are new trends in the field and competitive intelligence or marketing research, followed by the influence of senior management or other organizational leaders, which is an interesting factor given that the feedback comes from the VP/SVP level itself.


Fig. 5 Most relevant metrics for effectiveness and performance

For the middle level of organizational decision-making, we can see that the digital vision, or an understanding of the company's vision and mission to achieve digital success, is one of the most critical factors in deciding on new capabilities or training areas in new digital programs; we also see that for the executive level, competitive intelligence and marketing research are very important (as shown in Fig. 6). In terms of the percentage of time spent on digital-related activities and the factors that decide which digital skills or training are included in digital development plans, it is clear that professionals who spend more than 60% of their time on DTP and projects believe that competitive intelligence or market research, new trends in the field, executive influence, and then understanding of the company's vision and mission are the deciding factors (Fig. 7). The professionals who have spent more time on digital issues believe that the activities and training formats most useful for learning about digital innovation in the pharmaceutical industry are intra-organizational master classes, followed by online training programs from academies or universities, and finally live training or certificate programs.

4.2 Statistical Analysis

To answer the defined research questions, this paper introduces the key principles of the comparative and relational approaches employed as the basis for the statistical hypotheses, in order to better understand the relationships between variables and to analyze the most relevant correlations and connections.


Fig. 6 Level of seniority and important factors in deciding new skills or training needs in new digital development plans

To determine whether there is a significant difference between our key variables and the controls across different treatments, we first conducted several univariate calculations using univariate statistics. In a classical hypothesis test, only one effect from the treatment group is considered to be responsible for the observed effects. By analyzing the statistical data from the dependent variables, the analysis showed that both decentralized/agile digital influence (INFL) and digital business impact on innovation (IMPCT) are connected with the independent variables. Analyzing INFL in connection with the grouping variables of organizational innovation capacity (INNO) and the alignment of the digital transformation to brand strategy and product launch strategies (BRANDLAUNCH), the independent-samples t-test indicates clear levels of influence, and Student's t-test indicates that DTP can influence INNO. To extract as much information as possible from a compressed analysis, Welch's t-test was also applied; it standardizes the difference in means using the square root of the sum of the per-group variance-to-sample-size ratios. As a final step in defining the correct t-test, the tests were performed on all cases with valid data. When analyzing the p-value, it was necessary to examine the difference in means, i.e., the difference in the sample means. Upon testing scale reliability with Cronbach's alpha, both INFL and IMPCT exhibited positive reliability coefficients (alpha = 0.721 and 0.915, respectively).


Fig. 7 Percentage of time in digital-related activities and most decisive factors in deciding for digital skills or training to be included in digital development plans

As evidenced by the high scores obtained for both dependent variables, the internal consistency scale evaluates them positively. When this indicator is equal to or greater than 0.80, which was the case here for the IMPCT variable, it is generally considered a good measure of internal consistency, being higher than 0.60, the value most commonly accepted in exploratory studies. A two-way table (also called a contingency table) was also used to analyze categorical data, as more than one variable was involved. The purpose of this type of test was to determine whether there were significant relationships between the categorical variables. The test examined whether the COVID-19 pandemic impacted the organization's digital strategy (COVID) and the digital presence within the organization, with clear implications for innovation (DPRE—Digital Presence Indicator). To test the hypothesis, and in accordance with the previously defined research hypothesis, the critical value for the chi-square statistic determined by the level of significance was calculated and compared. Based on the results, it was possible to accept our hypothesis, leading us to conclude that COVID and DPRE are related (p = 0.569), meaning that COVID has not only positively impacted the organizational digital strategy but has also brought clear results in terms of increased business innovation.
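As a rough illustration of the procedures described in this section, the sketch below runs a Welch's t-test, computes Cronbach's alpha, and performs a chi-square test of independence on a contingency table. It uses synthetic data whose column names merely mirror the variable labels used in the text (INFL, IMPCT, INNO, COVID, DPRE); it is not the authors' analysis code and does not reproduce any of the reported values.

# Illustrative sketch of the tests named above, on synthetic data only.
import numpy as np
import pandas as pd
from scipy import stats

rng = np.random.default_rng(0)
n = 315
df = pd.DataFrame({
    "INNO": rng.integers(0, 2, n),    # grouping variable (e.g., lower/higher innovation capacity)
    "INFL": rng.integers(1, 6, n),    # 5-point Likert item
    "IMPCT": rng.integers(1, 6, n),   # 5-point Likert item
    "COVID": rng.integers(0, 2, n),   # did COVID-19 impact the digital strategy?
    "DPRE": rng.integers(0, 2, n),    # digital presence indicator
})

# Welch's t-test: difference in mean INFL between the two INNO groups,
# without assuming equal variances.
group0 = df.loc[df["INNO"] == 0, "INFL"]
group1 = df.loc[df["INNO"] == 1, "INFL"]
t_stat, p_val = stats.ttest_ind(group0, group1, equal_var=False)
print(f"Welch t = {t_stat:.3f}, p = {p_val:.3f}")

# Cronbach's alpha for a set of items treated as one scale.
def cronbach_alpha(items):
    item_variances = items.var(axis=0, ddof=1)
    total_variance = items.sum(axis=1).var(ddof=1)
    k = items.shape[1]
    return k / (k - 1) * (1 - item_variances.sum() / total_variance)

print("alpha =", round(cronbach_alpha(df[["INFL", "IMPCT"]]), 3))

# Chi-square test of independence on the COVID x DPRE contingency table.
table = pd.crosstab(df["COVID"], df["DPRE"])
chi2, p, dof, _ = stats.chi2_contingency(table)
print(f"chi2 = {chi2:.3f}, dof = {dof}, p = {p:.3f}")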


5 Discussions

These results mean that the two subsets of variables (COVID and DPRE) provide statistically significant evidence of an association with the dependent variable (IMPCT). Another analysis examined the difference between the means of multiple groups through ANOVA, where our dependent continuous variable (IMPCT) allowed us to test the INNO variable group and answer the second research question; the levels of the independent variables were included in the analysis, together with the descriptive statistics for each combination of levels of the independent variables. The results showed that there is an association among the variables of interest, specifically between the variables of interest and the independent variable. The final step of our analysis was to apply a linear mixed model, allowing us to explore the relationship between the variables of interest, including their interaction. We found that projects related to DTP have an impact on commercial success. Here, there was a clear association (df = 1, 0.26, F = 0.114, and p = 0.864) between INFL and IMPCT on commercial excellence outcomes in innovation and performance (COMEX). The analysis used sum contrast coding for categorical predictors, which allowed better interpretability of models with interactions, and the conclusions resulted mainly from the shape of the p-value distribution. To understand the complexity of these assumptions and evaluate the overall value of digital empowerment and knowledge increase, this paper conducted assessments utilizing varied assumptions about innovation, decentralized and agile DTP initiatives, and the impact on commercial excellence.
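For readers unfamiliar with the last two procedures mentioned above, the following sketch fits a one-way ANOVA with sum contrast coding and a simple linear mixed model using statsmodels on synthetic data. The variable names and the grouping column used for the random intercept are assumptions made purely for illustration, not a reconstruction of the authors' model.

# Illustrative one-way ANOVA and linear mixed model on synthetic data
# (not the study's data); the "region" grouping column is hypothetical.
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n = 315
df = pd.DataFrame({
    "IMPCT": rng.normal(3.5, 0.8, n),               # continuous dependent variable
    "INNO": rng.integers(0, 3, n),                  # grouping factor with three levels
    "INFL": rng.normal(3.0, 1.0, n),                # continuous predictor
    "region": rng.choice(["EU", "NA", "APAC"], n),  # hypothetical grouping for the random intercept
})

# One-way ANOVA with sum contrast coding: does mean IMPCT differ across INNO levels?
anova_fit = smf.ols("IMPCT ~ C(INNO, Sum)", data=df).fit()
print(sm.stats.anova_lm(anova_fit, typ=2))

# Linear mixed model: IMPCT explained by INFL, with a random intercept per region.
mixed_fit = smf.mixedlm("IMPCT ~ INFL", data=df, groups=df["region"]).fit()
print(mixed_fit.summary())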

6 Conclusions

In light of these results and comparable questionnaire analyses, it is important to use new decentralized digital technologies, gain a better understanding of metric information, and develop new transformative programs to help industry professionals make more effective diagnoses, perform better digital procedures, and access relevant information. The focus of this study was on the way DTP are impacting and influencing the pharmaceutical industry's commercial functions. To answer the research questions, a better understanding of digital innovations and of the factors that influence digital adoption by pharmaceutical companies was considered part of the selected methodology. Consequently, this study not only provided a better understanding of the key metrics, learning initiatives, and digital activities that are considered relevant within the pharmaceutical industry, but also introduced new research areas for the industry.


Descriptive analysis revealed that the majority of respondents were based in Europe or North America, held commercial leadership positions, and came primarily from the vaccines and oncology sectors. Conclusions were also drawn regarding the level of seniority and the most important factors in determining new skills or training requirements for new digital development plans, as well as the amount of time spent interacting with digital technology. There was a positive and clear answer to the question of whether DTP impact brand strategy, commercial execution, and new product launches. In addition to providing the necessary information, specific metrics and digital activities were also demonstrated as part of DTP. In terms of commercial strategy, the findings clearly showed that DTP influence key areas such as product launch excellence, the involvement of key opinion leaders, and quicker time to market. Furthermore, it is possible to identify the skills organizations need to create interdisciplinary teams of quantitative and technical talent to solve strategic business challenges. The statistical analysis indicates that the findings are related to the impact and innovation created by DTP on product launch strategies, but also to the overall impact on innovation generation within companies.


Sprinting from Waterfall: The Transformation of University Teaching of Project Management Anthony Chan, David Miller, Gopi Akella, and David Tien

Abstract Project Management is taught as a compulsory core unit in the undergraduate Information Technology degree. The subject is based on the Project Management Body of Knowledge (PMBOK) and was taught using case studies and a teamwork environment. However, many students find the content overwhelming for a 12-week session of study. The subject was transformed after a series of consultations with past students and members of the Information and Communication Technology industry. Using some of the best practices from the industry, this subject was transformed into an active participatory format using Agile principles. By adopting the style of medical education and casebooks, student participation and interest increased tremendously, and gains in both student satisfaction and performance were noted. This paper outlines the dynamic strategies employed, the preliminary benefits received from such changes, and how the students responded to these changes. Keywords Project management · Teamwork · Authentic learning · Subject development · Agile

1 Introduction

Project Management is a subject taught to all undergraduates of the Bachelor of Information Technology at Charles Sturt University, Australia. It is a core subject in the degree and is placed at Level 7 of the Australian Qualifications Framework (TEQSA). The subject is accorded 8 points and is to be completed in one session. The number of points measures the size of the subject's contribution to the degree the student is undertaking. A core subject means that the student must successfully complete the subject to be eligible for graduation. There are a few core subjects that must be completed over the period of enrolment. The Course Accreditation Policy of the university sets a total of 192 points to be acquired by the student. The Subject Policy states that an eight-point subject should require a student to spend up to 160 h engaged in the learning and teaching activities and in preparation for the subject's assessment (CSU). A postgraduate version of the subject is also available.

2 Enrolment and Teaching Approaches

Student enrolment before the COVID pandemic totalled about 1,000 students per calendar year. The teaching workforce can stretch to as many as 12 tutors teaching up to three sessions per year. The allocation of three contact hours per week over a space of twelve teaching weeks puts a lot of pressure on covering the contents of the subject. The subject's curriculum is based on the Project Management Institute's Project Management Body of Knowledge (PMBOK). In the earlier versions of PMBOK, the standard knowledge areas covered these ten topics: Project Integration Management, Project Scope Management, Project Schedule Management, Project Cost Management, Project Quality Management, Project Resource Management, Project Communication Management, Project Risk Management, Project Procurement Management, and Project Stakeholder Management [14]. The university allocates three hours of lectures and tutorials in each of the twelve teaching weeks. Each week's teaching is focused on a lecture aided by a set of PowerPoint slides, followed by a tutorial working on a few questions. The Project Management Institute has since published the 7th edition of PMBOK, which reflects the full range of development approaches and expands a section called "models, methods and artefacts" [28]. There have been a few notable teaching approaches to Project Management recorded in the literature over the past twenty years. An Information Technology (IT)-based method was found to be more effective than written case study methods, as it employed higher cognitive skills and also triggered interest in learning about project management [15]. IT has since been recognized as a new academic discipline, and project management is one of the five core technology areas cited by the Association for Computing Machinery (ACM) curriculum guidelines for the discipline. An experiential approach to teaching the subject was described [1]. Following that, consideration of future pedagogy that will impact the student experience was highlighted, focusing on two key components: students' perceptions of what is significant and the component of virtual learning [25]. A call for a blended learning approach then appeared, emphasizing the role of learners as contributors to the learning process rather than recipients of learning; this approach also addressed different student learning styles [16]. The importance and flexibility of software tools were highlighted to ensure that they aligned with PMBOK, together with the need for students to be instructed in their use [12]. The difficulty of teaching undergraduate students this topic has been considered globally across multiple academic disciplines, since the students have no prior knowledge [26], and a move to a flipped teaching methodology was made [2]. The move to Agile methodology meant a change in teaching approach was required, and a framework was presented to Information Systems educators [32].


3 Subject Design Rationale

The Waterfall methodology was presented to undergraduates under a previous subject convenor (also known as subject leader) in a lecture-tutorial model. It moved from standard PowerPoint lessons to case studies and then, finally, to two websites of fictitious companies—one acting as the employer and the other acting as a client. These sites were created under the advice of educational designers to give a limited sense of reality. The lecturers wore three hats at some stage, trying to deliver theory, present the case problem as an employer, and then advise on the steps to complete the task. The failure rate in the subject was high as students with little or no project management experience grappled with theory, issues, and approaches in the Waterfall methodology. Subject design began in 2019 with consultations with alumni, industry contacts, and teaching partners. The difficulty in teaching Project Management narrowed down to three areas: the amount of content that must be covered in the teaching session, the absence of experience in the project management field, and the differing levels of project management experience among tutors. There was no teaching model available in higher education for teaching the Agile approach in 2019. An accidental early teaching effort of transitioning was recorded [35]. Commentators have also mentioned the conflict of [Waterfall] methodology for tutors trying to teach Agile, and research in this area is ongoing [32]. As academic tutors were the bridge to the knowledge base and the Agile methodology, the new subject design centred on student teamwork. Teamwork among undergraduate (and postgraduate) students has always been a challenge in higher education [17, 18, 21, 27], yet active learning is more engaging than the lecture [34]. The tutor's role also had to be re-adjusted to focus on team discovery and development. And with the outbreak of SARS-CoV-2, further adjustments had to be made quickly to accommodate an online delivery model [3, 20].

The author's volunteering in hospital and admiration of medical staff at work provided the impetus to understand how medical students are taught on the hospital ward. Interest in the teamwork concept in healthcare stems from preventable medical errors, many of which are the result of dysfunctional or nonexistent teamwork [19]. The increased specialization of tasks in effective patient management echoes the increased interest of students in the different streams of IT: networking, programming, management, databases, cybersecurity, and the like. The need to ensure appropriate healthcare outcomes and patient safety is seen as analogous to comprehensive coverage of skills within the IT professional practice areas. Reporting and accountability were also seen as important to promote transparency in the process [5]. Undergraduate students have another major challenge—engagement in class and teamwork. Students are often distracted by their mobile devices or laptops, with decreasing interest in the discussion or issues at hand [9, 10, 13]. A survey of workplace behaviour and workers with mobile devices gave rise to the idea of a team reward-and-punishment system. A comprehensive teamwork mark system would give the team the power to recognize individual as well as team effort and success without the assessment being in the way [6]. At the end of each meeting, the team would judge whether everyone came to the scrum on time, prepared, and not distracted. A system of points would be taken off the final mark for the assessment. These deductions formed part of the report provided in the form of minutes of the meeting by a system of rotating team leaders. The reward system provided explicit incentives for teamwork; this was explained to students with reference to the industry practice of providing salary and bonus rewards for successful teamwork [23]. At the beginning of each teaching session, every student had to take a compulsory quiz to ensure every student understood their responsibility. The work of "train the trainer" began by acknowledging that this subject would put the tutor in the role of facilitator of student learning rather than teacher [24, 29, 30]. A team re-training was organized, and close contact was then maintained with first-time tutors of the subject to ensure that the "conflict of methodologies" was attended to as a matter of priority [11]. Tutors only referred to theory on a needs basis or when questions were raised by students. Most of the time, the student teams discussed among themselves in small groups.

The backdrop to good curriculum implementation is the learning material. The solution to covering the concepts listed in PMBOK was to approach it in the style of medical education. The concept is modelled after the role of the first-year medical school student who keeps in contact with the patient as much as possible, seeks help from the clinician, and consults other experts and sources to develop a complete picture of the patient's life. This medical student works on a casebook that includes, but is not limited to, the patient's entire history. This approach allows the student to develop a deeper and more diverse understanding of what comprises the healthcare life of the person [4, 36, 38]. After university studies, IT graduates work on client projects either in house or off site. A similar understanding of the client and the project is required, with access to other experts such as resourcing, financing, and others. Agile implementation is therefore a natural fit for this project management team environment. The ability to communicate well with other team members was practiced throughout the session. Each team was given the choice to pursue the path it chose and was not constrained to a standard path. This helps greatly as teams are pushed to investigate, research, and be creative with the solutions they offer. A casebook approach also provides an affective feel towards a project, as opposed to a "case study" approach [38]. As in the approach and purpose of casebooks, communication with end-users and stakeholders is prioritized. The ability to deliver their project to a non-technical, high-level audience is also a skillset to be developed, as opposed to classroom time spent delivering oral presentations [8].
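As a rough illustration of the points-based teamwork marking described in this section, the snippet below computes a member's assessment mark after deductions recorded in meeting minutes; the deduction values, behaviour categories, and data layout are hypothetical and not the marking scheme actually used in the subject.

```python
# Hypothetical sketch of a team reward-and-punishment mark calculation.
DEDUCTIONS = {"late": 1.0, "unprepared": 1.5, "distracted": 0.5}  # assumed point values

def member_mark(base_mark: float, meeting_logs: list) -> float:
    """Subtract points recorded by the rotating team leader in meeting minutes."""
    penalty = sum(DEDUCTIONS[flag]
                  for log in meeting_logs
                  for flag, raised in log.items()
                  if raised)
    return max(0.0, base_mark - penalty)

# Example: two meetings; one late arrival and one distraction were recorded.
logs = [{"late": True}, {"distracted": True}]
print(member_mark(20.0, logs))  # -> 18.5
```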


4 Implementation and Stakeholder Responses

The first implementation of the casebook concept in Project Management was carried out in 2019 using PMBOK Sixth Edition. It was a major change in class logistics management and a period for tutors to settle into their new roles. The postgraduate cohort was picked as the first group as it is more resilient to change. The undergraduate group joined a year later. The passing rate of the subject pre-Agile implementation (2019) was 66% for domestic students and about 72.8% for the international cohort. The table below shows the two years before implementation and the two years after, with the number of students.

Passing rate     2018               2019               2020               2021
Domestic         66.0% (n = 81)     66.0% (n = 69)     96.0% (n = 54)     93.5% (n = 49)
International    75.2% (n = 2242)   72.8% (n = 1914)   90.1% (n = 1260)   87.8% (n = 333)

The teamwork and casebook implementation produced better results, as students were able to engage with their learning and become active participants in the teaching–learning space. Qualitative comments received from students were as follows:

The experience was major learning experience in terms of our learning. We were able to make huge strides in improving our existing knowledge of the matter and it turned out to be an amazing opportunity.

I loved that [lecturer name] tried something different in this subject, something that wasn't textbook and PowerPoint. I really liked the group work elements. It was great to meet other students and talk through ideas.

This proved to be a whole new experience for us as the interaction with the professional workspaces and implementing the theoretical knowledge was something we had not worked on before. Working as a team and collaborating towards the common goals taught us the lifelong lessons of teamworking and partnering. We achieved a lot together and a lot of it was that we shared our knowledge, ideas, expertise with each other during the whole project. The major success factors were communication, sharing of ideas, responsibility, and participation. No one got distracted during meetings.

The student subject evaluation reports through the four years also showed positive development in a few areas. The percentage of students agreeing with the statements presented rose significantly [2019 was the year of implementation].

Statement                                                 2018 (%)   2019 (%)   2020 (%)   2021 (%)
The subject incorporates the study of current content     56.5       59.0       94.5       84.5
The teaching in this subject motivated me to learn        66.0       57.0       83.5       93.0
Created opportunities for me to learn from my peers       66.0       68.0       83.5       88.0

Students acknowledged that Agile is the current approach used in industry. The way the subject is taught and organized around the casebook has motivated them to learn more about Agile practices and tools. The teamwork component, with its penalty points for poor attendance or non-participation, has helped groups function well and is the keystone of the student team contribution. Some unexpected benefits were also realized, as these student comments indicate:

I am a mature student and the take-away skills from this subject could be implemented in my own workplace. It was incredible to see how it works at my job. Even my boss complimented me on the broad approach and utilizing skills from every staff.

My weakness is my hatred for teamwork. I like to work individually most of the time but working on this project with the team, I improved on my team-work skills.

I found out that some of the team members are hesitant in sharing their ideas. They don't share it clearly and it was very difficult to understand what they were trying to say. I realize good communication is the backbone of successful projects.

5 Future Work and Direction

The casebook concept has delivered benefits in this initial period of review. Future work will be driven by the following points:
• IT students are used to notation and brevity; they need to understand that working in project management requires constructing online discourse, and to learn how to construe this for positive participation [40].
• Project management requires access to multiple information sources and inter-disciplinary research, which calls for an alternative approach to information literacy and delivery [33].
• The work of many young IT professionals is rooted in processes of vocational education and a "hands-on" approach versus a "discussing-and-thinking" approach [22, 31, 41].

6 Conclusion

Subject development in project management studies can contribute positively to new experiences and enable educators to present an introductory experience in the principles of PMBOK effectively. It is important to move away from the teacher-centric style of lectures and the assumption that students will not learn if they are not fed theory. It would be impossible to force-feed all the elements of PMBOK in a university semester anyway. In this curriculum revision, many students have been driven to look for more information on their own, and this is exhibited by the work they have delivered. None have expressed difficulty in understanding what they are reading. In many cases, this subject has also delivered a largely unpredicted outcome of bringing students together and bridging the loneliness of struggling with concepts that are alien to them.

Acknowledgements The authors acknowledge the contribution of co-author David Miller, who was a member of the Project Management Institute, for his assistance and insights into the development of this subject. David passed away on 21 August 2022 while this manuscript was in its final draft.

References 1. Abernethy K, Piegari G, Reichgelt H (2007) Teaching project management: an experiential approach, vol 22. Consortium for Computing Sciences in Colleges. https://doi.org/10.5555/ 1181849.1181888 2. Abushammala MFM (2019) The effect of using flipped teaching in project management class for undergraduate students. J Technol Sci Educ 9(1):41–50. https://doi.org/10.3926/jotse.539 3. Basilaia G, Kvavadze D (2020) Transition to online education in schools during a SARS-CoV-2 coronavirus (COVID-19) pandemic in Georgia. Pedagogical Res 5(4) 4. Beier LM (2018) Seventeenth-century English surgery: the casebook of Joseph Binns. In: Medical theory, surgical practice. Routledge, pp 48–84 5. Bell SK, White AA, Yi JC, Yi-Frazier JP, Gallagher TH (2017) Transparency when things go wrong: physician attitudes about reporting medical errors to patients, peers, and institutions. J Patient Saf 13(4). https://journals.lww.com/journalpatientsafety/Fulltext/2017/12000/Transp arency_When_Things_Go_Wrong__Physician.11.aspx 6. Bravo R, Catalán S, Pina JM (2019) Analysing teamwork in higher education: an empirical study on the antecedents and consequences of team cohesiveness. Stud High Educ (Dorchesteron-Thames) 44(7):1153–1165. https://doi.org/10.1080/03075079.2017.1420049 7. CSU. Recommended Student Time Commitment. https://www.csu.edu.au/division/learningand-teaching/subject-outline/subject-schedule-and-delivery/recommended-student-time-com mitment 8. Daniel M, Rougas S, Warrier S, Kwan B, Anselin E, Walker A, Taylor J (2015) Teaching oral presentation skills to second-year medical students. MedEdPORTAL 11. https://doi.org/ 10.15766/mep_2374-8265.10017 9. Dontre AJ (2021) The influence of technology on academic distraction: a review. Hum Behav Emerg Technol 3(3):379–390 10. Flanigan AE, Babchuk WA (2022, 2022/04/03) Digital distraction in the classroom: exploring instructor perceptions and reactions. Teach Higher Educ 27(3):352–370. https://doi.org/10. 1080/13562517.2020.1724937 11. Frydenberg M, Yates D, Kukesh J (2018) Sprint, then fly: teaching agile methodologies with paper airplanes. Inf Syst Educ J 16(5). http://isedj.org/2018-16/n5/ISEDJv16n5p22.html 12. Goncalves RQ, von Wangenheim CAG, Hauck JCR, Zanella A (2018) An instructional feedback technique for teaching project management tools aligned with PMBOK. IEEE Trans Educ 61(2):143–150. https://doi.org/10.1109/TE.2017.2774766


13. Goundar S (2014) The distraction of technology in the classroom. J Educ Hum Dev 3(1):211– 229 14. A Guide to the Project Management Book of Knowledge (2014) Project Management Institute, 5th ed. 15. Hingorani K, Sankar CS, Kramer SW (1998) Teaching project management through an information technology-based method. Proj Manag J 29(1):10–21. https://doi.org/10.1177/875697 289802900105 16. Hussein BA (2015) A blended learning approach to teaching project management: a model for active participation and involvement: insights from Norway. Educ Sci 5(2):104–125. https:// www.mdpi.com/2227-7102/5/2/104 17. Iacob C, Faily S (2019, 2019/11/01) Exploring the gap between the student expectations and the reality of teamwork in undergraduate software engineering group projects. J Syst Softw 157:110393. https://doi.org/10.1016/j.jss.2019.110393 18. Joanna W, Elizabeth AP, Seth S, Alexandra K (2016) Teamwork in engineering undergraduate classes: what problems do students experience? In: 2016 ASEE annual conference & exposition, Atlanta 19. Lerner S, Magrane D, Friedman E (2009) Teaching teamwork in medical education. Mt Sinai J Med 76(4):318–329. https://doi.org/10.1002/msj.20129 20. Lindsjørn Y, Almås S, Stray V (2021) A case study of teamwork and project success in a comprehensive capstone course. Norsk IKT-konferanse for forskning og utdanning 21. McCorkle DE, Reardon J, Alexander JF, Kling ND, Harris RC, Iyer RV (1999, 1999/08/01) Undergraduate marketing students, group projects, and teamwork: the good, the bad, and the ugly? J Mark Educ 21(2):106–117. https://doi.org/10.1177/0273475399212004 22. McKenzie S, Coldwell-Neilson J, Palmer S (2018) Understanding the career development and employability of information technology students. J Appl Res Higher Educ 10(4):456–468. https://doi.org/10.1108/JARHE-03-2018-0033 23. Mower JC, Wilemon D (1989, 1989/09/01) Rewarding technical teamwork. Res-Technol Manage 32(5):24–29.https://doi.org/10.1080/08956308.1989.11670609 24. Nuñez Enriquez O, Oliver KL (2021) ‘The collision of two worlds’: when a teacher-centered facilitator meets a student-centered pedagogy. Sport Educ Soc 26(5):459–470 25. Ojiako U, Ashleigh M, Chipulu M, Maguire S (2011) Learning and teaching challenges in project management. Int J Project Manage 29(3):268–278. https://doi.org/10.1016/j.ijproman. 2010.03.008 26. Pan CCS (2013, Oct 2013 2015-12-07) Integrating project management into project-based learning: mixing oil and water? In: IEEE conferences, pp 1–2. https://doi.org/10.1109/CICEM. 2013.6820187 27. Pfaff E, Huddleston P (2003, 2003/04/01) Does it matter if i hate teamwork? What impacts student attitudes toward teamwork. J Mark Educ 25(1):37–45. https://doi.org/10.1177/027347 5302250571 28. PMBOK Guide (2022) Project Management Institute.https://www.pmi.org/pmbok-guide-sta ndards/foundational/PMBOK 29. Putri AAF, Putri AF, Andriningrum H, Rofiah SK, Gunawan I (2019) Teacher function in class: a literature review. In: 5th international conference on education and technology (ICET 2019) 30. Reeve J (2006) Teachers as facilitators: What autonomy-supportive teachers do and why their students benefit. Elem Sch J 106(3):225–236 31. Rosenbloom JL, Ash RA, Dupont B, Coder L (2008, 2008/08/01/) Why are there so few women in information technology? Assessing the role of personality in career choices. J Econ Psychol 29(4):543–554. https://doi.org/10.1016/j.joep.2007.09.005 32. Rush DE, Connolly AJ (2020) An agile framework for teaching with scrum in the IT project management classroom. 
J Inf Syst Educ 31(3):196–207. http://jise.org/Volume31/n3/JISEv31n3p196.html 33. Scheepers MD, De Boer A-L, Bothma TJ, Du Toit PH (2011) A mental model for successful inter-disciplinary collaboration in curriculum innovation for information literacy. South Afr J Libr Inf Sci 77(1):75–84


34. Sibona C, Pourrezajourshari S (2018) The impact of teaching approaches and ordering on IT project management: active learning vs. lecturing. Inf Syst Edu J 16(5). https://isedj.org/201816/n5/ISEDJv16n5p66.html 35. Snapp MB, Dagefoerde D (2008) The Accidental agilists: one teams journey from waterfall to Agile. In: Agile 2008 conference 36. Stanton RC, Mayer LD, Oriol NE, Treadway KK, Tosteson DC (2007) The mentored clinical casebook project at Harvard Medical School. Acad Med 82(5). https://journals.lww.com/ academicmedicine/Fulltext/2007/05000/The_Mentored_Clinical_Casebook_Project_at_Harv ard.15.aspx 37. TESQA. Australian quality framework. Tertiary Education Quality and Standards Agency. https://www.teqsa.gov.au/australian-qualifications-framework 38. Thompson CE (2022) Beyond imperturbability: the nineteenth-century medical casebook as affective genre. Bull Hist Med 96(2):182–210 39. Understanding the project management knowledge areas. https://www.workfront.com/projectmanagement/knowledge-areas 40. Ware P (2005) “Missed” communication in online communication: tensions in a GermanAmerican telecollaboration. Lang Learn Technol 9(2):64–89 41. Zarrett NR, Malanchuk O (2005) Who’s computing? Gender and race differences in young adults’ decisions to pursue an information technology career. New Dir Child Adolesc Dev 2005(110):65–84. https://doi.org/10.1002/cd.150

Versioning: Representing Cultural Heritage Evidences on CIDOC-CRM via a Case Study Ariele Câmara, Ana de Almeida, and João Oliveira

Abstract Understanding the elements that allow the recognition of archaeological structures is an essential task for the identification of cultural heritage. At the same time, recording these elements is necessary for the historical study, evolution, and recognition of these types of structures. One of the challenges in the digital representation of this information and knowledge relates to the fact that there are results from different surveys and records on the status of the same monument, which can be considered separate versions of knowledge. In this paper, we describe a schema to represent versioning data about archaeological heritage dolmens using the CIDOC-CRM model as a basis. The versioning schema will work as a database model for the development of a knowledge graph to aid automatized dolmen recognition in images. The intended model efficiently stores and retrieves event-driven data, exposing how each update creates a new "version" via a new event. An event-driven model based on versioning data makes it possible to compare versions produced at different times or by different people and allows for the creation of complex version chains and trees. Keywords Archaeological structures · Versioning · CIDOC-CRM · Knowledge graph · Event-driven model


1 Introduction

Knowledge graphs have emerged as a technology that aims to provide semantic information about real-world entities that can be understood and interpreted by machines [14]. To represent knowledge about entities and processes, it is necessary to differentiate real things (phenomenal) from those described through information (declarative) [9]. Mapping both the phenomenal and declarative knowledge of cultural heritage according to a common standard model, such as the one provided by the International Committee for Documentation—Conceptual Reference Model (CIDOC-CRM), is key to supporting interoperability. The representation of historical, cultural, and archaeological data has traditionally been carried out by different specialists and maintained by institutions such as libraries, archives, and museums [9]. The multiple sources and different researchers' backgrounds have led, through the years, to a disparity between data sources and formats and to different historical versions of declarative information, for example, data derived from interpretations of the same object [6]. Handling these metadata as a unique set is vital for different purposes such as information retrieval. This paper explores the development of a schema for a graph-based data model to represent the different versions of the knowledge acquired about dolmens, using information about the structural elements that may help their recognition in satellite images. In order to achieve this goal, we adopted the CIDOC-CRM.

2 Representing Data from Heterogeneous Sources

2.1 CIDOC-CRM

CIDOC-CRM is a formal ontology for the integration, mediation, and exchange of cultural heritage information with multi-domain knowledge. Its development started in 1996, and in 2006 it became the ISO 21127:2006 standard [1]. Although it started as an ontology for museums, it is not limited to this usage and has been used for different purposes [7, 9, 13, 15]. The CIDOC ontology (version 7.2.1) consists of 81 taxonomically organized classes and 160 unique properties to model knowledge. The most general classes are E77 Persistent Item, E2 Temporal Entity, E52 Time Span, E53 Place, E54 Dimension, and E92 Spacetime Volume. As an event-centric model that supports the historical discourse, CIDOC-CRM allows for the description of entities that are themselves processes or evolutions in time. Using E2 Temporal Entity and its subclasses enables the description of entities that occurred at some point (or points) over time. One of these entities focuses on the description of a temporal duration and documents the historical relationships between objects described using subclasses of E77 Persistent Item. CIDOC-CRM enables the creation of a target schema to join all the varied knowledge about a domain, since CRM provides a definition and a formal structure to represent the concepts and relationships of cultural heritage.

2.2 Definition—Schema Versioning

Schema evolution requires keeping the complete change history for the schema, so it is necessary to retain all previous definitions. Versioning mechanisms can be useful to support scholars in making research more transparent and discursive. The schema versioning idea was introduced in the context of object-oriented database system development, as systems are implemented to deal with multiple schemas and the evolution of information [12]. We found a few examples of versioning using the CIDOC-CRM to track different versions. Velios and Pickwoad [15] use an event-centric approach, where each entry represents a different version of a bookbinding, with all temporal classes directly connected to the same entity, equivalent to each of the records on the cover of a given binding. Despite making it easier to understand from a human point of view, from a computational point of view having a unique entity that relates to several temporal instances adds cycles to the information graph. Carriero et al. [5] present ArCo and show how to develop and validate a cultural heritage knowledge graph, discussing an approach to represent dynamic concepts which may evolve or change over time. Every change generates a new record version of the same persistent entity, which is represented by the catalogue record, and its versions are related to it. Each version is associated with a time interval and has temporal validity. Another work that we can mention is that of [2], which shows how Version Control Systems bring different benefits to the field of digital humanities, proposing an implementation of versioning together with a version history. However, it is not shown how to work with this model when using CRM to structure heritage data. In a database, several versions of information can coexist, even more so in temporal databases, which is the case when we deal with information about cultural heritage derived from several investigations of a monument or from the analysis data generated about it.

2.3 Data Modelling Issues: Event-Version

Archaeological reasoning is supported by the multiple interpretations and theories stated, published, re-examined, and discussed over the years. Archaeology contains a rich and complicated example of argumentation used in a scientific community, showing how different fact-based theories were developed and changed over time [6]. The standard inferences, the sequence of factual observations, and the change of belief occurring over time can be represented using knowledge graphs. Despite this, there are some limitations to this representation of the data, such as (i) correctly grouping components from different periods/versions and (ii) scalability [13]. An event represents a single episode in the data collection or recording. This single event may consist of only one investigative technique and is therefore a unique entity in time and space. Different events may produce new results about the same object. Thus, the cultural property can be interpreted by various agents (e.g., researchers) at different moments in time, resulting in different interpretations. Event-based models are already well established [1, 8, 10, 15]. However, models such as CIDOC were not designed to directly support different pieces of information representing different perspectives interpreted by different agents, or new pieces of information generated through them [5]. Using an event-centric model based on versions to structure data, we can (1) describe any number of component versions and (2) identify the components belonging to different versions [13]. When we take into account the knowledge about the various phases of the thing being analyzed, we can link what we observe with the related events.

3 Recording Cultural Heritage: A Study Case Using Archaeological Monuments

The case study presented here deals with the representation of information about megalithic monuments classified as Dolmens in Pavia (Portugal), built in the Neolithic-Chalcolithic. These structures are one of the most representative and ubiquitous cultural features of prehistoric landscapes in Western Europe. The first systematic works about dolmens in Pavia were carried out by Vergílio Correia, who published his research in 1921 [4]. His work is considered a benchmark for the knowledge of megalithic monuments [11]. There are different records concerning research carried out in the area and on the same monuments. For the development of the schema, we collected and analyzed all data available through the DGPC Digital Repository,1 the regional archaeological map [3], the information provided by experts, and data obtained through photo interpretation. Unlike CIDOC-CRM, which is an event-based model, most archaeological records on cultural heritage speak about events only implicitly. In this sense, we can use the CRM model to capture non-existent semantics and make explicit what is implicit. It helps to capture what lies between the existing semantic lines and structure it into a formal ontology. At the same time, it serves as a link between heritage data, allowing all this knowledge to be represented in a way that can be understood by people but also processed by machines, and thus allows the exchange, integration, research, and analysis of data. By making what is implicit explicit, we are able to use existing data to ask new questions and consequently obtain new results.

1 The DGPC is the State department responsible for managing archaeological activity in Portugal. Management of heritage is achieved through preventive archaeology and research, and records are provided via the Archaeologist's Portal: https://arqueologia.patrimoniocultural.pt/.

3.1 Object-Based Record

Dolmens are persistent physical structures built by man, with a relatively stable form, which occupy a geometrically limited space and establish a trajectory in space–time throughout their existence. Following the hierarchy of classes defined by CIDOC, the E22 Human-Made Object is the most specialized class within the hierarchy of human-made persistent items. The dolmen, as a general term, is characterized here as a CIDOC-CRM entity E22 Human-Made Object, which, according to the CIDOC-CRM class specification, "… comprises all persistent physical items of any size that are purposely created by human activity" [1].

3.2 Versioning-Based Record

Since we work with data produced by different specialists at different times about the same object, the question of how to deal with such a rich and diverse set of primary sources is not simple, especially as the authorship and origin are not always clear [13]. In order to do so, we must focus on the content, with each record being seen as a unique version of the same monument. As a result, we consider abstracts, records, or metadata that represent knowledge about the same entity as documents expressing a unique version of the monument. First, a single instance is created representing where the knowledge was obtained. A new related entity is created to represent the information about the dolmen found in the document and, finally, these separate instances are connected through a new entity, as shown in Fig. 1. We assign an ID to represent several E22 Human-Made Object instances that contain knowledge about the same human-made object. An ID is characterized as an instance of the CIDOC-CRM E42 Identifier and is used to group E22 Human-Made Object instances, each representing data about the same item but obtained from different documents. This model records whatever activity over the object generated the first record, whenever it is generated and acquired, while maintaining all previous knowledge and keeping it easily accessible through the same class, creating a simple and non-recursive model. A branch in this context means that N parallel versions can be developed separately at a certain point in the development model. Since the goal is the representation of relevant information for posterior analysis, interpretation, and classification of images, in this case for recognizing dolmens, the focus is on representing the dolmen structure and all the related elements that may assist in its recognition. To represent the structural information, E22 Human-Made Object instances can be used as the output for new entities that allow the characterization of the elements they represent.


Fig. 1 Schema representing the relationships in our versioning-based record, connecting the item with its features by document
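A minimal sketch of the versioning-based record in Fig. 1, written with the rdflib library, may help make the schema concrete. The namespace URIs, the choice of properties (P1 is identified by, P70 documents), and all identifiers and document names are illustrative assumptions and do not reproduce the project's actual data or code.

```python
# Sketch: grouping two record "versions" of the same dolmen under a shared identifier.
from rdflib import Graph, Namespace
from rdflib.namespace import RDF

CRM = Namespace("http://www.cidoc-crm.org/cidoc-crm/")   # assumed CRM namespace
EX = Namespace("http://example.org/dolmens/")             # hypothetical local namespace

g = Graph()
dolmen_id = EX["dolmen-042"]                               # shared E42 Identifier
g.add((dolmen_id, RDF.type, CRM["E42_Identifier"]))

# One E22 instance per record, each linked to the document it came from.
for version, doc in [("v1", EX["correia-1921"]), ("v2", EX["carta-arqueologica-2012"])]:
    obj = EX[f"dolmen-042-{version}"]
    g.add((obj, RDF.type, CRM["E22_Human-Made_Object"]))
    g.add((obj, CRM["P1_is_identified_by"], dolmen_id))    # grouped by the shared ID
    g.add((doc, RDF.type, CRM["E31_Document"]))
    g.add((doc, CRM["P70_documents"], obj))                # source of the knowledge

print(g.serialize(format="turtle"))
```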

3.3 Event-Version-Based Record

Using E2 Temporal Entity and its subclasses allows for the description of entities that occurred at different point(s) over time. These entities focus on describing a temporal duration and on recording the chronological relationships between objects—representing the information as an activity or as the beginning or end of something. However, this constancy is not always respected. For example, when we talk about the beginning of the existence of a dolmen, we are talking about a phase of time described as Neolithic-Chalcolithic; semantically, the existing connection properties would lead us to infer that the object description refers to its structure during that event and not to its structure at the time it was analyzed and recorded, as is actually the case. Moreover, the initial structure is mostly unknown, since the structures may have been created, modified, and reused, and there are no records of these activities. In any case, this information would not help to identify structures in images. For our use case, we need a class that captures the actions of making claims about an object property and that allows us to access the date and place where the knowledge was obtained—or at least to know all the characteristics of the object at the time of data collection. The E13 Attribute Assignment class comprises the actions of making assertions about a property of an object or any unique relationship between two items or concepts, allowing us to describe people's actions in making propositions and statements during scientific procedures, for example, by whom or when a condition statement was made. Note that any instance of a property described in a knowledge base such as this is someone's opinion—which in turn should not be recorded individually for all instances, in order to avoid an endless chain in which one opinion is the description of another opinion [1]. However, for the present case, as the descriptions obtained by different entities sometimes contain contradictory data, a model that works with different views is necessary. Thus, these fragmented reports can be seen as versions that enrich and complement our knowledge of the monuments and their relations, but they can also present conflicting information and narratives, and multiple E13 instances can potentially lead to a collection of contradictory values. This redundant modelling of alternative views is preferred here because, when talking about structural features, they all become relevant for a better perception of the object and of how it may have affected, and been affected by, its surrounding environment—which can help recognition. In this sense, we use the E13 Attribute Assignment entity to record the action of describing the dolmen and to connect the event to the object with its descriptions. Using records as events to deal with different pieces of information about a dolmen's status, and unique IDs to group instances concerning the same dolmen, made it possible to overcome this issue. To associate the action of describing the dolmen with where the information was obtained, we use the E31 Document entity. This class allows for the representation of information on identifiable material items that originated propositions about the object under analysis. Thus, the relationship with the E31 Document entity is described based on the type of document that records the information. In addition to the document with the object description, we record the date of the information using the E52 Time Span entity. This information is relevant to prioritize the most current knowledge and to enable analysis of the chronological order of events that led to the description of the dolmen as represented at that time. The schema model described is shown in Fig. 2.

By using records as events and by considering each new record on the same monument as a unique version of it, we create a model capable of dealing with the fact that different research works were, are being, or can be performed on the same monument, resulting in different outcomes, since they can be made by different researchers, with diverse approaches and at different periods in time. Therefore, we manage to keep all the information that can later be relevant for the recognition of these or similar structures.
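To illustrate how the event side of the model could be expressed, the self-contained sketch below records a description activity as an E13 Attribute Assignment linked to its source document and time-span. Class and property names follow the CIDOC-CRM RDFS naming convention but are assumptions here, as are all identifiers and literal values; this is not the project's actual implementation.

```python
# Sketch: one description event (E13) tied to the object version, its document and date.
from rdflib import Graph, Namespace, Literal
from rdflib.namespace import RDF

CRM = Namespace("http://www.cidoc-crm.org/cidoc-crm/")   # assumed CRM namespace
EX = Namespace("http://example.org/dolmens/")             # hypothetical local namespace

g = Graph()
assignment = EX["record-activity-1921"]                    # the describing event
g.add((assignment, RDF.type, CRM["E13_Attribute_Assignment"]))
g.add((assignment, CRM["P140_assigned_attribute_to"], EX["dolmen-042-v1"]))
g.add((assignment, CRM["P141_assigned"], Literal("chamber partially collapsed")))

g.add((EX["correia-1921"], RDF.type, CRM["E31_Document"]))
g.add((EX["correia-1921"], CRM["P70_documents"], assignment))   # where it was recorded

timespan = EX["timespan-1921"]
g.add((timespan, RDF.type, CRM["E52_Time-Span"]))
g.add((assignment, CRM["P4_has_time-span"], timespan))          # when the claim was made

print(g.serialize(format="turtle"))
```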


Fig. 2 The archaeological monuments described in a record are represented as instances of Human-Made Object. When different documents report the same object, they are represented as distinct entities (Human-Made Object) related by a local ID. The local ID relates entities about the same monument, and each Human-Made Object is related to the record activity and the document where the represented knowledge was acquired

4 Conclusion

This article proposes the implementation of versioning in a model defined by CIDOC-CRM. We defined a new schema model to represent different versions of information about the same monument, keeping all previous and new knowledge without the need for merging, which could lead to incongruent information about the same entity due to different approaches in time and methodology. Thus, we developed an interoperable model capable of storing, analyzing, and retrieving data quickly and effortlessly, allowing information to be cross-referenced and patterns to be identified, and assisting automated classification and recognition methods for these or similar structures in images. The next phases of the project involve the development of the schema model to represent the physical and geographical characteristics of the structure and the surrounding landscape in order to generate a knowledge graph capable of contextualizing all the elements that allow the identification of dolmens in the territory.

Acknowledgements This work was partially supported by the Fundação para a Ciência e a Tecnologia, I.P. (FCT) through the ISTAR-Iscte project UIDB/04466/2020 and UIDP/04466/2020, through the scholarship UI/BD/151495/2021.

References 1. Bekiari C, Bruseker G, Doerr M, Ore CE, Stead S, Velios A (2021) Volume A: definition of the CIDOC conceptual reference model


2. Bürgermeister M (2020) Extending versioning in collaborative research. In: Versioning cultural objects digital approaches, pp 171–190. http://dnb.d-nb.de/ 3. Calado M, Rocha L, Alvim P (2012) O Tempo das Pedras. Carta Arqueológica de Mora. Câmara Municipal de Mora. https://dspace.uevora.pt/rdpc/handle/10174/7051 4. Câmara A (2017) A fotointerpretação como recurso de prospeção arqueológica. Chaves para a identificação e interpretação de monumentos megalíticos no Alentejo: aplicação nos concelhos de Mora e Arraiolos. Universidade de Évora. https://dspace.uevora.pt/rdpc/handle/ 10174/22054 5. Carriero VA, Gangemi A, Mancinelli ML, Nuzzolese AG, Presutti V, Veninata C (2021) Patternbased design applied to cultural heritage knowledge graphs. Semantic Web 12(2):313–357. https://w3id.org/arco 6. Doerr M, Kritsotaki A, Boutsika K (2011) Factual argumentation—a core model for assertions making. ACM J Comput Cult Herit 3(8). https://doi.org/10.1145/1921614.1921615 7. Faraj G, Micsik A (2021) Representing and validating cultural heritage knowledge graphs in CIDOC-CRM ontology. Future Internet 13(11):277. https://doi.org/10.3390/FI13110277 8. Guan S, Cheng X, Bai L, Zhang F, Li Z, Zeng Y, Jin X, Guo J (2022) What is event knowledge graph: a survey. IEEE Trans Knowl Data Eng 1–20. https://doi.org/10.1109/TKDE.2022.318 0362 9. Hiebel G, Doerr M, Eide Ø (2017) CRMgeo: a spatiotemporal extension of CIDOC-CRM. Int J Digit Libr 18(4):271–279. https://doi.org/10.1007/S00799-016-0192-4/FIGURES/6 10. McKeague P, Corns A, Larsson Å, Moreau A, Posluschny A, Daele K van, Evans T (2020) One archaeology: a manifesto for the systematic and effective use of mapped data from archaeological fieldwork and research. Information 11(4):222. https://doi.org/10.3390/INFO11 040222 11. Rocha L (1999) Aspectos do Megalitismo da área de Pavia, Mora (Portugal). Revista Portuguesa de Arqueologia 2(1). https://dspace.uevora.pt/rdpc/handle/10174/2248 12. Roddick JF (1995) A survey of schema versioning issues for database systems. Inf Softw Technol 37(7):383–393. https://doi.org/10.1016/0950-5849(95)91494-K 13. Roman Bleier SMW (ed) (2019) Versioning cultural objects: digital approaches 14. de Souza Alves T, de Oliveira CS, Sanin C, Szczerbicki E (2018) From knowledge based vision systems to cognitive vision systems: a review. Procedia Comput Sci 126:1855–1864. https:// doi.org/10.1016/J.PROCS.2018.08.077 15. Velios A, Pickwoad N (2016) Versioning materiality: documenting evidence of past binding structures. Versioning cultural objects digital approaches, pp 103–126. http://dnb.d-nb.de/

Toward a Route Optimization Modular System José Pinto, Manuel Filipe Santos, and Filipe Portela

Abstract Urban mobility and route planning are among the biggest problems of cities. In the context of smart cities, researchers want to help overcome this issue and help citizens decide on the best transportation method, individual or collective. This work intends to research a modular solution to optimize the route planning process, i.e., a model capable of adapting and optimizing its predictions even when given different source data. Through artificial intelligence and machine learning, it is possible to develop algorithms that help citizens choose the best route to take to complete a trip. This work helps to understand how NetworkX can help transportation companies to optimize their routes. This article presents an algorithm able to optimize routes using only three variables: starting point, destination, and distance traveled. This algorithm was tested using open data collected from Cascais, a Portuguese city, following the General Transit Feed Specification (GTFS), and achieved density scores of 0.00786 and 0.00217 for the two scenarios explored. Keywords Artificial intelligence · Machine learning · Route planning · Smart cities · Urban mobility · GTFS

1 Introduction

The pace at which cities are progressively growing in population is a reality that has caused urban mobility complications. The gap between this growth and investment in infrastructure and solutions to meet the mobility needs of populations in urban environments causes disruption to each citizen's personal life. One of the most common problems in cities, not only in Europe but worldwide, is traffic congestion. Whether it is the choice to use one's own motorized transport to the detriment of collective transport or the lack of viable options for collective transport, this is a problem that, in addition to the inherent traffic congestion, raises some environmental issues. More and more, cities are seeking to apply the concepts and achieve the status of Smart City to respond to the challenges they face nowadays, among them traffic congestion. The optimization of a city's processes depends on the development of information technologies, particularly in areas that offer the city intelligent, dynamic, and, if possible, interoperable systems, such as artificial intelligence. This paper's work is framed within ioCity, a project by the startup IOTech that proposes the development of an innovative solution to a problem that is quite difficult to solve in the urban mobility area: the recurrent traffic jams on the roadways, often caused by the difficulty in finding a parking spot. ioCity proposes the development of an intelligent Progressive Web App (PWA) that can provide the user with a transport route to a given location, taking into account several factors such as traffic and transportation (location, occupancy rate, etc.). All of this is possible through data collected and analyzed in real time. This paper aims to explain how the process of route planning can be optimized within the context of urban mobility. Through artificial intelligence and machine learning, it is possible to develop algorithms that help companies by optimizing their predictions on new datasets to reduce travel time, always considering the influencing factors of urban mobility and the user's preferences. For this particular study, data from a Portuguese city, Cascais, was used. The first section of this document provides a framework for the subject of this work, briefly describing the environment, the themes explored, and the concrete work carried out. Section 2 reviews the concepts and existing literature that served as a basis for all the practical work. Section 3 indicates the materials, methods, and methodologies used throughout the project. Section 4 presents the work carried out following the CRISP-DM methodology; this includes business understanding, data understanding, data preparation, and modeling. Sections 5 and 6 describe the results obtained and the discussion they provided. Finally, Section 7 concludes all the work carried out during this project and defines the next steps to be taken.

2 Background

The field of artificial intelligence (AI) has developed from humble beginnings, in which curiosity about something new stimulated research, to a global impact presented in projects of high relevance to society. With it, datasets can be used for optimization through the development of algorithms and machine learning. According to Bartneck et al. [1], the definition of AI and what should and should not be included has changed over time and, to this day, its definition continues to be the subject of debate [2]. Since the emergence of the Smart Cities concept in the late 1990s [3], several definitions have emerged and been published, resulting from different analyses and approaches by various researchers in the application domain of the concept.


Hall [4] defines the term as a city that monitors and integrates all critical infrastructures (roads, bridges, tunnels, railways, subways, airports, communications, energy, security, among others), optimizes its resources, and maximizes the services provided to its citizens. Harrison et al. [5] underline the need for an Information Systems infrastructure interconnected with the city's physical, social, and business infrastructures in order to leverage the city's collective intelligence. The vision for building a Smart City has been progressively developed by researchers, engineers, and other stakeholders in the field. Samih [6] presents an architecture model of a Smart City based on six components: Smart Economy, Smart People, Smart Governance, Smart Mobility, Smart Environment, and Smart Living. Smart City mobility addresses the following:
• Urban mobility (definition): It refers to all aspects of movement in urban settings. It can include modes of transport, such as walking, cycling, and public transit, as well as the spatial arrangement of these modes in a built environment.
• Route planning: The process of computing an effective method of transportation or transfers through several stops.
• Route optimization: The process of determining the most cost-efficient route. It needs to include all relevant factors, such as the number and location of all the required stops on the route, as well as time windows for deliveries.
Machine Learning comprises four types of learning. Supervised learning: the machine receives a set of labeled examples as training data and makes predictions for all unseen points. Unsupervised learning: the machine studies data to identify patterns. Semi-supervised learning: the machine receives a training sample consisting of labeled and unlabeled data and makes predictions for all unknown points. Reinforcement learning: the machine is provided with a set of allowed actions, rules, and potential end states [7]. AI operations and optimization involve the application of Artificial Intelligence (AI) technologies, such as machine learning and advanced analytics. This is done to automate problem-solving and processes in network and IT operations and to enhance network design and optimization capabilities.

3 Material and Methods This section describes the methodologies used in the development of this paper. To guide the writing and development, the chosen methodology is the Design Science Research (DSR) methodology. The SCRUM methodology is used for managing the work, and the Cross-Industry Standard Process for Data Mining (CRISP-DM) methodology is used to guide the research and development using Machine Learning techniques. To analyze the dataset, the Talend Open Studio for Data Quality tool was used: statistical analyses were performed on the various columns of the dataset to provide a better understanding of its content.


A simple statistical analysis was performed on each column, with indicators that enable verification of the number of lines, null values, distinct values, unique values, duplicate values, and blank values, together with a value frequency analysis, whose indicators enable verification of the most common values in a column. Python scripts were used to extract and transform the data, namely with the NumPy and Pandas libraries. To develop the model and algorithm, the NetworkX library in the Python environment was used.
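For illustration only, the column indicators described above can be reproduced with a short Pandas script. The file and column names below follow the GTFS convention used later in the paper and are assumptions for this sketch, not the authors' original code:

```python
import pandas as pd

# Hypothetical GTFS file name; any tabular dataset can be profiled the same way.
df = pd.read_csv("gtfs/stop_times.txt")

def profile(column):
    col = df[column]
    counts = col.value_counts(dropna=True)
    return {
        "rows": len(col),
        "nulls": int(col.isna().sum()),
        "blanks": int((col.astype(str).str.strip() == "").sum()),
        "distinct": int(col.nunique(dropna=True)),
        "unique": int((counts == 1).sum()),       # values occurring exactly once
        "duplicate": int((counts > 1).sum()),     # distinct values occurring more than once
        "most_common": counts.head(3).to_dict(),  # value frequency analysis
    }

for column in ["trip_id", "stop_id", "shape_dist_traveled"]:
    print(column, profile(column))
```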

4 CRISP-DM The next sections are divided according to the methodology chosen to guide the work, the Cross-Industry Standard Process for Data Mining (CRISP-DM). It is also important to point out that the first phase of CRISP-DM, the business understanding, can be found in the introduction of the project.

4.1 Business and Data Understanding This work intends to explore the development of a modular solution to optimize the route planning process: if the origin dataset changes, the model can adapt and optimize its predictions on the new dataset. The first phase explores optimization algorithms that can receive routes and optimize them. In a second phase (future work), the team will use another dataset to test the model. The team then defined which data will be needed for the next phase of the project. This data should support the Machine Learning models that will provide the best available route to travel between two locations. The data needed for the next phase focus on three crucial points that define a route: starting point, destination, and distance traveled. At this stage, the focus is on collecting an initial dataset that allows building a foundation for the project. The dataset idealized at the launch of the product was composed of data related to public transportation that would allow building a network of several interconnected paths/lines on which some functionalities could be developed. To obtain the necessary dataset, the initial plan was to contact companies and municipal services in order to get a dataset whose information corresponds to a real situation of planning the operation of a public transportation network. However, due to the COVID-19 pandemic, this idea was soon discarded, and it was decided to use datasets available on Open Data platforms. After the research and analysis of the selected datasets, it was possible to conclude that the datasets that follow the General Transit Feed Specification (GTFS) have sufficient data for the construction of the model representing the lines and intersections of a public transport network. The dataset from a Portuguese city (Cascais) was chosen as the basis for this project because it is the one with the most useful information to support its development.


Table 1 Simple statistics of the attributes selected

| Column | Distinct | Unique | Duplicate |
|---|---|---|---|
| stop_times.trip_id | 3576 (3.3%) | 0 | 3576 (3.3%) |
| stop_times.stop_id | 1022 (1.0%) | 0 | 1022 (1.0%) |
| stop_times.shape_dist_traveled | 3370 (3.1%) | 225 (0.01%) | 3145 (2.9%) |

After analyzing all the columns of this dataset, it was found that only three of them have relevant information for model building. Table 1 shows the results of the statistical analysis performed on these three columns of the dataset, which comprises 110,505 rows without null or blank values.

4.2 Data Preparation In order to prepare the modeling, data were extracted from the selected dataset and a filtering of the information considered relevant for the project was performed. Python scripts (NumPy and Pandas libraries) were used in this procedure to make the necessary changes. The processing of the dataset and the respective changes made are summarized below:
• Discard all attributes except those listed in Sect. 4.1;
• Select the bus line to be transformed using the attribute "stop_times.trip_id";
• Rename the attributes "stop_times.trip_id" to "Route" and "stop_id" to "Start";
• Create the attribute "Stop" by transforming the information of the attribute "Start";
• Create the attribute "Distance" by transforming the information of the attribute "stop_times.shape_dist_traveled".

This process is repeated for each new bus line that is to be added to the dataset. It results in a dataset describing a bus line, where each record indicates the bus line to which it belongs (attribute "Route"), the origin and destination stops (attributes "Start" and "Stop", respectively), and the distance traveled between the two stops (attribute "Distance"). For each bus line, a document in ".csv" format is generated containing the information produced for the respective line. In order to centralize the information required to build the model, a Python script was used to integrate all the information generated in this process. At an early stage of the project, only four bus lines were selected; the remaining lines were added to the model base progressively, so as to increase its complexity without abruptly causing problems/errors in the model. The result of this integration is a ".csv" file with four columns: the columns "Start" and "Stop" form the vertices of an oriented graph, the column "Distance" is the weight of the edge connecting the vertices, and the "Route" column indicates to which bus line the connection between two vertices belongs.


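A minimal Pandas sketch of one possible reading of the preparation steps above is given below; the GTFS file name, trip identifiers, and output file names are hypothetical, and the real transformation may differ in detail:

```python
import pandas as pd

# Hypothetical GTFS input and trip identifiers; column names follow the GTFS specification.
stop_times = pd.read_csv("gtfs/stop_times.txt")

def build_line(trip_id):
    line = (stop_times[stop_times["trip_id"] == trip_id]
            .sort_values("stop_sequence")[["trip_id", "stop_id", "shape_dist_traveled"]]
            .rename(columns={"trip_id": "Route", "stop_id": "Start"}))
    line["Stop"] = line["Start"].shift(-1)                            # destination stop of each segment
    line["Distance"] = line["shape_dist_traveled"].diff().shift(-1)   # distance between consecutive stops
    line = line.dropna(subset=["Stop"]).drop(columns="shape_dist_traveled")
    line.to_csv(f"{trip_id}.csv", index=False)                        # one ".csv" document per bus line
    return line

# Repeat the process for each selected bus line and aggregate everything into a single edge list.
selected_trips = ["trip_a", "trip_b", "trip_c", "trip_d"]             # hypothetical identifiers (four lines)
network = pd.concat([build_line(t) for t in selected_trips], ignore_index=True)
network.to_csv("network.csv", index=False)                            # columns: Route, Start, Stop, Distance
```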

4.3 Modeling In this step, the team selected and optimized the route optimization model. This section describes the tasks performed to achieve that objective. To begin the modeling process, it was initially necessary to define which techniques to adopt. Given the structure of the data that served as the basis for the modeling, two variables were defined: the source variable, which represents the starting point and corresponds to the origin vertex of the oriented graph; and the target variable, which represents the next stop and corresponds to the destination vertex of the oriented graph. Once these two variables were defined, the challenge was approached as a regression problem, since the goal is to obtain the shortest path between two (or more) vertices of the oriented graph that represents a transportation network. To answer this problem over an oriented graph, it was necessary to select an algorithm capable of processing data structured in the form of a graph, which limited the possible approaches. Two possible approaches emerged: Dijkstra's algorithm and the Bellman-Ford algorithm. Dijkstra's algorithm was selected, since more resources are available for this algorithm. After selecting the modeling techniques, it is important to define the scenarios upon which the model will be built. Two scenarios have been defined: • Scenario 1: the first four bus lines of the dataset are inserted. It is a small dataset with only a few intersections that allows testing the intended functionality of the model. • Scenario 2: the first eighteen bus lines of the dataset are inserted. This scenario includes more intersections than the first scenario and allows verifying the model's performance on a larger dataset. Figure 1 provides a graphical visualization of the graph built with NetworkX for Scenario 2. The next step was to build the model and write the respective code. The Jupyter Notebook platform was used to write the code entirely in the Python language. To import the data into the platform, the Pandas library was used. The imported data is the result of the process described in Sect. 4.2 of this document.


Fig. 1 Representation of a graph with 18 bus lines from the dataset according to the node density score

Once the data is imported, the NetworkX library is used to create the oriented graph. The imported dataset has four columns: Start, Stop, Distance, and Route. The function nx.from_pandas_edgelist allows creating the mentioned graph from a dataset with this structure. Vertex pairs are created by associating the Start column to the source variable defining the source vertex, the Stop column to the target variable defining the destination vertex, and the Distance and Route columns as attributes of the edge between the two vertices. The Distance column indicates the actual distance between the two represented vertices, and the Route column indicates the route to which this pair of vertices belongs. After building the oriented graph, some functions supporting the route planning model were written. In this work, it was decided to develop a model that predicts the best route between two vertices of the graph, with the possibility of adding up to three stopping points along the path. The algorithm's function receives five arguments: start, stop, stop1, stop2, and stop3. The vertices represented by the start and stop arguments are fixed, since they indicate the starting point and the final destination of the route idealized by the user. The intermediate points (stop1, stop2, and stop3) can have their order changed if the computation of the best (shortest) path so indicates. To obtain the best path, a permutation of the arguments is performed in which all travel possibilities are explored, considering that the starting and ending points do not change, only the intermediate points. Then, for each permutation, the best path is computed using Dijkstra's algorithm through the functions nx.shortest_path and nx.shortest_path_length. The algorithm explores all connections between two vertices of the graph and returns the shortest path. This process is repeated for all the connections of the permutation, and the function returns the optimized (shortest) path and the total distance traveled on it.


During the execution of the algorithm, the number of bus lines that the route contains is captured in parallel in order to return a value indicating the cost of the route to the user.
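The route-planning function described above can be sketched as follows with NetworkX; the function and file names are assumptions, and the cost derived from the number of bus lines is only a proxy, as in the text:

```python
from itertools import permutations

import networkx as nx
import pandas as pd

# Integrated edge list produced in Sect. 4.2 (hypothetical file name).
edges = pd.read_csv("network.csv")
G = nx.from_pandas_edgelist(edges, source="Start", target="Stop",
                            edge_attr=["Distance", "Route"], create_using=nx.DiGraph())

def best_route(start, stop, *intermediate):
    """Return (path, distance, number of bus lines) for the best ordering of the intermediate stops."""
    best = None
    for order in permutations(intermediate):
        points = [start, *order, stop]
        path, dist = [start], 0.0
        try:
            for a, b in zip(points, points[1:]):
                # weighted shortest path; NetworkX applies Dijkstra's algorithm here
                dist += nx.shortest_path_length(G, a, b, weight="Distance")
                path += nx.shortest_path(G, a, b, weight="Distance")[1:]
        except nx.NetworkXNoPath:
            continue                                   # this ordering is not feasible
        if best is None or dist < best[1]:
            best = (path, dist)
    if best is None:
        return None
    # Number of distinct bus lines traversed, used as a proxy for the monetary cost of the route.
    lines = {G.edges[a, b]["Route"] for a, b in zip(best[0], best[0][1:])}
    return best[0], best[1], len(lines)
```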

5 Results In order to evaluate and obtain a descriptive perspective of the produced graph, some metrics were computed using the Python library used in the construction of the graph, the NetworkX library. The selected metrics are presented and described in Table 2, and the corresponding results in Table 3. It is possible to verify that, with the increase in the number of bus lines, the graph becomes less dense, a consequence of the real context the graphs represent: as a rule, the lines of an operator's bus network originate from a set of common stops and have only a few intersections along their various routes. From Scenario 1 to Scenario 2, the density score drops from 0.00786 to 0.00217, justified by the increase in the number of bus lines that constitute Scenario 2. The lower average centrality score in Scenario 2 (from 0.01572 in Scenario 1 to 0.00434) is also natural, since the number of "isolated" nodes that constitute the path of most routes is inevitably higher.

Table 2 Metrics selected for evaluation

| Metric | Description |
|---|---|
| Density | Returns the density of the graph, $d = \frac{m}{n(n-1)}$, where $n$ is the number of nodes and $m$ is the number of edges in the graph |
| Connectivity | Returns the average degree of the neighborhood of each node. For directed graphs, $N(i)$ is defined according to the parameter "source": $k^{w}_{nn,i} = \frac{1}{s_i} \sum_{j \in N(i)} w_{ij} k_j$, where $s_i$ is the weighted degree of node $i$, $w_{ij}$ is the weight of the edge connecting $i$ and $j$, and $N(i)$ are the neighbors of node $i$ |
| Centrality | Calculates the degree centrality of the nodes. The degree centrality of a node $v$ is the fraction of nodes it is connected to |
| Intermediation | Calculates the shortest-path betweenness of the nodes. The betweenness of node $v$ is the sum of the fractions of all-pairs shortest paths that pass through $v$: $c_B(v) = \sum_{s,t \in V} \frac{\sigma(s,t \mid v)}{\sigma(s,t)}$, where $V$ is the set of nodes, $\sigma(s,t)$ is the number of shortest paths between $s$ and $t$, and $\sigma(s,t \mid v)$ is the number of those paths that pass through some node $v$ other than $s$ and $t$. If $s = t$, $\sigma(s,t) = 1$, and if $v \in \{s,t\}$, $\sigma(s,t \mid v) = 0$ |
| In-degree | The in-degree of a node is the number of edges that point to the node |
| Out-degree | The out-degree of a node is the number of edges that point away from the node |


Table 3 Results for each scenario

| Metric | Scenario 1: Nodes | Scenario 1: Score | Scenario 2: Nodes | Scenario 2: Score |
|---|---|---|---|---|
| Density | | 0.00786 | | 0.00217 |
| Average connectivity of nodes | | 1.00017 | | 0.96968 |
| Average of centrality | | 0.01572 | | 0.00434 |
| Greater centrality | [156257] | 0.04580 | [156257] | 0.04580 |
| Less centrality | 127 distinct nodes | 0.01526 | [156112] | 0.00186 |
| Average of intermediation | | 0.30723 | | 0.04746 |
| Larger intermediation | [155633, 155735, 155636] | 0.69125 | [156257] | 0.48793 |
| Minor intermediation | [156066, 156067, 156068, 156305, 156306] | 0.03018 | 6 distinct nodes | 0.0 |
| In-degree average | | 1.03030 | | 1.16231 |
| Average out-degree | | 1.03030 | | 1.16231 |

The node [156257] is, in both scenarios, the node that presents the highest score, indicating that it is a central point of the whole network. Even so, its centrality score decreases from 0.04580 in Scenario 1 to 0.02429 in Scenario 2, since in Scenario 2 this node is connected to a much smaller fraction of the total number of nodes (approximately half in relation to Scenario 1). It is also possible to verify in Scenario 2 that node [156112] has the lowest score (0.00186), revealing itself as the least relevant node for network connectivity. The intermediation scores allow us to ascertain which nodes are most traversed in the set of all shortest paths between all possible pairs of nodes. In Scenario 2, as with the centrality score, node [156257] presents the highest score, with a value of 0.48793. This means that about half of the shortest paths between all possible pairs of nodes traverse this node, reinforcing its importance in the network. Another detail to take from this metric is that this node only became the node with the highest score in Scenario 2, after the network was expanded with more bus lines; in Scenario 1, nodes [155633, 155735, 155636] present the highest score, with a value of 0.69125. It was verified that the model produced and presented in Sect. 4.3 of this document allows obtaining, in both test scenarios and for the levels of complexity introduced in this model (selection of intermediate points to cover in the planning of a route), the best possible route over the transport system used as a basis. For each run, all the possibilities to travel the selected points are analyzed and, at the end of the run, the best route, the distance traveled on it, and its monetary cost are returned. It can be seen in Fig. 2, a run with three intermediate points, that the route returned as the best to traverse the inserted points differs from the order in which the points were inserted. Once the model is run, it is possible to confirm that its basic goal is achieved, i.e., to provide the best route through a set of points.


Fig. 2 Execution results (part 1)

However, the inclusion of more variables that influence this decision would have resulted in a more complex and interesting analysis, and in richer decision-making for a real context.
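As a reference, the descriptive metrics reported in Tables 2 and 3 can be approximated with NetworkX as in the sketch below; the file name and parameter choices (for instance, unweighted betweenness) are assumptions rather than the exact configuration used by the authors:

```python
from statistics import mean

import networkx as nx
import pandas as pd

# Integrated edge list from Sect. 4.2 (hypothetical file name).
edges = pd.read_csv("network.csv")
G = nx.from_pandas_edgelist(edges, source="Start", target="Stop",
                            edge_attr=["Distance", "Route"], create_using=nx.DiGraph())

centrality = nx.degree_centrality(G)
metrics = {
    "density": nx.density(G),
    "average connectivity": mean(nx.average_neighbor_degree(G).values()),
    "average centrality": mean(centrality.values()),
    "most central node": max(centrality, key=centrality.get),
    "average intermediation": mean(nx.betweenness_centrality(G).values()),
    "average in-degree": mean(d for _, d in G.in_degree()),
    "average out-degree": mean(d for _, d in G.out_degree()),
}
print(metrics)
```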

6 Conclusion and Future Work In Sect. 5, the produced algorithm was validated in order to verify whether it is able to return the shortest route between a set of points to be traversed in a graph, together with its associated cost. The model produced allows, by inserting a start point, an end point, and up to three intermediate points, obtaining the shortest path between the inserted points and the monetary cost associated with the route. The descriptive metrics of the graph presented in the same section indicate that the network built is not very dense (density score of 0.00217 in Scenario 2), with a low average centrality of nodes (average centrality score of 0.00434 in Scenario 2), which limits the possible routes to follow. To expand these possibilities, it would be necessary to increase the number of links in most nodes, allowing a larger number of new possible paths. Once the results were analyzed and the process was reviewed, some opportunities to


improve the model were identified. The next steps to be taken in the project include the following: • Obtain a more complete dataset capable of representing a more comprehensive transport network: since the dataset used in the development of this project only refers to buses in the municipality of Cascais, it will be interesting to analyze how the model adapts to a higher degree of complexity when contact points with other transport networks are introduced, whether buses from other operators or even other types of transport (trains, subway, among others). • Complement the model with other variables: the introduction of more variables that impact decision-making and the result the model returns will increase its value and usefulness, especially real-time variables such as weather information or cultural events along the routes that can alter decision-making. • Plan an implementation: in order to apply and use this model in a real context, it is necessary to develop a means of doing so. The ideal solution would be a mobile application that allows a user to interact with the model and extract useful information from it. • Explore other algorithms, such as Graph Neural Networks (GNNs), neuroevolution techniques, Grammatical Evolution, or Reinforcement Learning. Acknowledgements This work has also been developed under the scope of the project NORTE-01-0247-FEDER-045397, supported by the Northern Portugal Regional Operational Programme (NORTE 2020), under the Portugal 2020 Partnership Agreement, through the European Regional Development Fund (FEDER).

References
1. Bartneck C, Lütge C, Wagner A, Welsh S (2021) What is AI? In: An introduction to ethics in robotics and AI. SpringerBriefs in ethics. Springer, Cham. https://doi.org/10.1007/978-3-030-51110-4_2
2. Poole D, Mackworth A (2017) Artificial intelligence: foundations of computational agents. Cambridge University Press
3. Albino V, Berardi U, Dangelico RM (2015) Smart cities: definitions, dimensions, performance, and initiatives. J Urban Technol 22(1):3–21. https://doi.org/10.1080/10630732.2014.942092
4. Hall P (2000) Creative cities and economic development. Urban Stud 37(4):200
5. Harrison C et al (2010) Foundations for smarter cities. IBM J Res Dev 54(4):1–16. https://doi.org/10.1147/JRD.2010.2048257
6. Samih H (2019) Smart cities and internet of things. J Inf Technol Case Appl Res 21(1):3–12
7. Mohri M, Rostamizadeh A, Talwalkar A (2012) Foundations of machine learning. MIT Press

Intellectual Capital and Information Systems (Technology): What Does Some Literature Review Say? Óscar Teixeira Ramada

Abstract This research aims to make known what some of the literature review says about the binomial of intellectual capital and information systems (technology). From the scarce set of existing research on this subject, five papers were selected that met the criterion of addressing these two topics together. In terms of substance, it can be concluded that the scientific contribution to broadening knowledge is very tenuous, not to say null. The selected research is based on secondary and also primary sources, the former not being suitable for this specific purpose, and even the primary ones suffer from a technicality that proves to be of little practical use. In short, it can be said that these two topics, conditioned by the selection made, did not add any contribution to the expansion of scientific knowledge. West, east, north, and south, nothing new. Keywords Intellectual capital · Intangible assets · Information systems · Technology

1 Introduction The topic of intellectual capital, alone, has become increasingly important as it is recognized that other topics, such as business performance, competitive advantages, innovation, and the well-being of citizens, countries, and the world in general, are getting better and better. The literature on this topic appears predominantly associated with these and other topics and can be interpreted as part of an integrated perspective. This is what happens with [1–6], which cover the most diverse years (from the most distant [7, 8] to the closest [9, 10]). These, as well as other authors, represent the set of research that encompasses two, three, four, or even more topics in an interconnected way and that, in a deeper analysis, intends to expand the
Ó. T. Ramada (B) ISCE - Douro - Instituto Superior de Ciências Educativas do Douro, Porto, Portugal e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 S. Anwar et al. (eds.), Proceedings of International Conference on Information Technology and Applications, Lecture Notes in Networks and Systems 614, https://doi.org/10.1007/978-981-19-9331-2_33


knowledge being it more complete. But what is observed is that it sins for being less complete, less clear, less understandable and, above all, less applicable in practical reality. One of the most suitable scientific methods to know the value of intellectual capital is Andriessen’s method [11], in his work entitled, “Making Sense of Intellectual Capital—Designing a Method for the Valuation of Intangibles”, 2004, p. 137. He is not the only author who has published research focused on the final value of intellectual capital. Other authors such as Fedotova et al. [12], Goebel [13] and Gogan and Draghici [14] also have it. Specifically, regarding to the intellectual capital and information systems (technology), these related topics present a set of characteristics. In order to gather information that is relevant for various business purposes, information systems, that is, the way in which companies have specifically recorded their assets and liabilities (commonly known as property and the like), which tends to be shaped with the help of technology, it makes it possible for them to know their activities, how they are structured, to quantify them, to see how they vary (increase, decrease, or remain constant). An information system (technology), developed and extended, in order to allow the exercise of control over the company, preferably in real time, allows it to be more efficient and effective. Intellectual capital is the necessary tool to design a good information system (technology), adapted to a specific company and that makes it possible to manage it better. In this way, the consideration, together, of the two topics, intellectual capital and information systems (technology), are so important that, if considered individually, exhibit the obstacle of lacking something that complements them, mutually, and makes them exhibit greater business knowledge, especially the value of intangible assets, in addition to tangible ones. The databases consulted were “B-on”, “Elsevier” (Science Direct), “Taylor and Francis”, “Emerald Collections”, “Web of Science”, and “Wiley Online Library”, predominantly. In particular, “International Journal of Accounting and Financial Reporting”, “International Journal of Learning and Intellectual Capital”, “Information and Management”, “Journal of Strategic Information Systems”, “Journal of Intellectual Capital”, and “ResearchGate” (Conference Paper). The consideration of only five papers object of research was the result, scarce, of a deep selection in these databases, and not an insufficient effort by the researcher. It was the consideration of the binomial, together, intellectual capital and information systems (technology), which served as criteria for inclusion and exclusion, in the selection of papers, and in the formulation of the research question. Therefore, this consists of knowing: what does the selected literature review say with these two combined topics? Is there any specificity to show that results from it? For this reason, one of the contributions of this research is to make known what exists in the literature and, in particular, in the review carried out.


2 Literature Review Al-Dalahmeh et al. [9] are three authors who carried out research on the impact of intellectual capital on the development and efficiency of accounting information systems applied to companies in the industrial sector in the Kingdom of Jordan, according to accounting auditors. These information systems, both in their accounting and technological aspects, play a crucial role in business success, insofar as information is a valuable resource and, therefore, a source of effectiveness and efficiency. To this end, companies lack intellectual capital to direct resources and increase the aforementioned information efficiency, which involves the development of accounting systems, that cannot be achieved until the intellectual capital is developed. Thus, the research aims to underline how important this development is to increase the efficiency of accounting information applicable to companies in the industrial sector. Thus, the research goals are the presentation of a theoretical framework on the concept of intellectual capital and the development of its different dimensions, in addition to demonstrating their effect on the efficiency of accounting information systems. With regard to the research method used, it is an analytical-descriptive approach in which the researchers collected information, both from primary and secondary sources. In the first, the necessary information came from a questionnaire prepared and distributed to a group of external auditors who constituted a sample of this research, after which the answers were analyzed using SPSS to test the compliance of the same. In the second, the information consisted of books, researchers’ theses, papers in specialty journals, in order to build a theoretical framework and, thus, achieve the research goals. With regard to the study population, seventy-five companies listed on the Amman Stock Exchange and belonging to the industrial sector were selected. In the sample, the corresponding seventy-five auditors from the same companies with high qualifications and professional efficiency, responsible for the audit of the aforementioned companies, were selected, also. The implicit concept of intellectual capital, adopted by these authors, consists of four components: human capital (skills and competences), creativity capital (development of new products and/or services), operational capital (work systems and expenses), and customer capital (customer relationships and answers to customer needs). As main conclusions obtained by the authors in the research, seven stand out. First, the efficiency of accounting information systems is measured by the benefits achieved through the use of the outputs of these systems, compared with the costs incurred with their construction, design, and operation. Second, there is an urgency for companies to develop intellectual capital to improve the application of accounting information systems, which involves mechanisms that promote this, making the resource conceived and maintained in any company.


Third, industrial companies must work to determine the level of knowledge and skills in order to guarantee the quality and efficiency of accounting information systems. Fourth, these, applied in companies, increase the efficiency of workers and the skills to develop and achieve progress. Fifth, industrial companies provide the possibility of progress in the work and development of workers to guarantee industrial and information quality. Sixth, companies participate in initiatives to increase the level of industrial performance and the efficiency of accounting information systems. Finally, and seventh, industrial companies use practical means to find new ideas and the quality of accounting information systems. As recommendations, the authors suggest the need to develop the intellectual capital of industrial companies as the main focus of management due to the pioneering effect on companies in the long run, increasing investment. Directing companies, they must be managed in the sense of adopting clear and transparent policies, in order to bring together the competent members in such a way, that they raise the level and quality of the accounting information systems. The need in the management of industrial companies in the Kingdom of Jordan should be such that it promotes the development of intellectual capital because of its effects on improvement accounting information systems, stimulates an intellectual culture that increases its importance. Finally, the need to increase elements of the creative capital of workers is emphasized, via accounting information systems in industrial companies, by virtue of its pioneering role, nationally and internationally. Zeinali et al. [10] are also three authors who carried out a research about an assessment of the impact of information technology capital and intellectual capital (organizational capital, relational capital, and innovation capital) on the future returns of companies in the securities markets. With regard to the research method used, it was of the quasi-experimental type, based on the present information and the Financial Statements of the companies. It should be noted that this was a correlative and descriptive study, with regard to data collection. It was a post-hoc study. The population includes all investment, banking and telecommunications, electronic payments and insurance companies, listed in the securities markets of the Tehran Stock Exchange (Iran). The sample, therefore, consisted of fifty companies, selected in the years 2009–2013. The considered hypotheses were stipulated in an econometric regression model, with variables constituted from panel data. As a dependent variable, the future stock returns of company i in year t + 1 (Ri, t+1 ) were used. As independent variables, the technological information of company i in year t + 1 (IT Capitali, t+1 ), the organization of the capital of company i in year t + 1 (Organizational Capitali, t+1 ), the relational capital of company i were used in year t + 1 (Relational Capitali, t+1 ), the R&D Capital of company i in year t + 1 times the investment of company i in year t + 1 (R&DCapitali, t+1 × Investi, t+1 ), financial leverage of company i in year t + 1 (LEVi, t+1 ), age of company i in year t + 1 (Agei, t+1 ), size of company i in year t + 1 (SIZEi, t+1 ), and investment of company i in year t + 1 times R&D Capital of company i in year t + 1 (Investi, t+1 × R&DCapitali, t+1 ).


Regarding the most evident conclusions, the researchers concluded that, the planning of investments in the area of technological information, according to the business goals, without forgetting the dimension and structure, facilitates their activities between the different sectors, reducing time and costs. This leads to higher returns, which depend on information technologies. The intellectual capital appears in this context as a hidden value that causes benefits in the Financial Statements. It guides companies towards achieving competitive advantages and higher returns and reveals that the economic value of business resources is more the result of intellectual capital and less the production of goods and/or services. With regard to information technologies, the authors claim that, increasingly, it has assumed a greater role in all aspects, from production, distribution and sales methods, being the factor that advances the perspectives of future returns. If companies have available, correct, accurate and timely information, they can attract competitive advantages, with information being an important strategic source. Investing in information technologies is of all importance to improve the skills and competences of companies. Hsu et al. [7] are also three other authors who research on the boundaries of knowledge between information systems and business disciplines, from a perspective centered on intellectual capital. Indeed, the authors state that the development of information systems can be considered as a kind of collaboration between users and those responsible for their development. Having few skills to leverage localized knowledge, embedded in these two types of stakeholders, can serve as an obstacle to software development in order to achieve high performances. Therefore, exploring directions to efficiently bridge the frontiers of knowledge in order to facilitate access to it, is essential. From the point of view of the research method used, the authors resorted to a survey in order to carry out the empirical test. This approach has its origins in previous literature on the topics. Respondents were professionals who, in some way, dealt with the development of information systems. Thus, they performed a two-step approach to data collection. In the first, they contacted the 251 managers of the information systems departments of the “Taiwan Information Managers Association”. Via telephone, they informed the purpose of the research and verified their availability to participate. For those who accepted, they were asked to nominate project managers, group leaders, senior members within organizations, among others. For companies with two or more completed projects, each contact’s information was recorded. In all, a total of 750 projects were identified. In the second, the aforementioned survey was carried out, and a survey was delivered to the 750 managers of the mentioned groups, identified in the first step. A total of 279 answers were obtained, corresponding to a answer rate of 35.6%. As omitted answers were obtained, only a total of 267 answers from 113 companies were considered. To ensure sample representativeness, two analyzes were carried out by the researchers: first, companies that were able to participate in the study, were compared with those that were not. No differences were found between the two groups in terms


of size and business sector. It was ensured that there were no significant differences, between those that were chosen. From a socio-demographic point of view, 73% were male and 27% were female. Among males, 58% had a Bachelor’s degree, and 35% had a Master’s degree. Among them, 43% were programmers, 18% systems analysts and 19% project leaders. With regard to age, 28% were between 21 and 30 years old, 60% between 31 and 40 years old, 10% between 41 and 50 years old, and over 51 years old only 1.5%. As the main conclusions drawn by the authors, it is emphasized that the frontiers of knowledge played an important role in forecasting systems and in the quality of projects, as well as having a mediation role, between intellectual capital and the performance of information systems. The three components of intellectual capital (human capital, relational capital, and structural capital) have been shown to have a significant impact on knowledge efficiency. The magnitude of the impact of the human capital component on knowledge proved to be moderated by the relationship between users and those who develop information systems. Generally speaking, higher (lower) levels of relational capital held by the two types of stakeholders minimize (maximize) the negative impacts of insufficient understanding of effective knowledge. As main limitations, the researchers mention that, a cross-sectional sample used may have inversely affected intellectual capital. So, future research recommends the use of a temporal sample. On the other hand, in the sample used, only one side was consulted in understanding the efficiency of knowledge. Indeed, this level should be more detailed and not limited to the two types of stakeholders. Reich and Kaarst-Brown [8] are two authors whose focus, in their research, refers to the creation, of intellectual and social capital, through information technologies, in the context of career transition. Certain organizations must continuously innovate with information technologies in order to maintain their competitive advantages. The idea is to illustrate, using a case study, how “Clarica Life Insurance Company” created, from 1993 to 1996 (sample period), the channel that allowed business within the innovations in information technologies. This company is a financial institution that provided financial services to customers in Canada and the United States at that time, including life insurance, investment products, employee benefits, management services for people with reduced mobility, financial planning, mortgage loans, and pension plans. It should also be noted that, in 2002, this company was acquired by “Sun Life”. Theoretically speaking, its foundations lie in the works of [15], in which theories of the co-creation of intellectual and social capital were created. With regard to the research method, it should be noted that the authors made use of individual interviews, with those who occupied the most important positions in the aforementioned sample period. In order to overcome problems arising from the aggregation of individuals’ answers, the authors collected them through various data sources, such as surveys, interviews and published documents originating from the company and the media. They also carried out the triangulation between three


different organizational groups, which allowed insights into former workers, professionals in general and business professionals, who exercised activity in information technologies and even those who still exercise activity in the same domain. In this way, the intention is understanding the evidence demonstrated in such a way that one can know the career transitions and the results obtained. With regard to the conclusions, the authors divide them into two types of categories: enrichment of knowledge of career transitions, based on the case study, “Clarica Life Insurance Company”. Thus, from the point of view of these conclusions, they are more identified by the authors as implications for research. Another conclusion is related to the fact that the approach taken shows little ability to see, in a comprehensive way, political or structural issues. From the point of view of the implications for management, there is no doubt for the authors that the managers of “Clarica Life Insurance Company” recognize the value of intellectual and social capital in the information technologies involved over several years. First, an assessment of the initial social capital between the business and information technology areas should be initiated. Second, each company can face similar or different impediments to the beginning of this spiral or its continuity. Cunha et al. [16], finally, there are three authors, who related the intellectual capital and information technologies, carrying out a systematic review of the results they have arrived at. In fact, according to these authors, the world is experiencing such an evolution that the economy is increasingly based on knowledge, information technologies, innovation, and telecommunications. This rise of the economy based on the knowledge has increased interest in the theory of intellectual capital, which aims to manage the intangible assets of organizations. Companies that belong to these activity sectors recognize, in intellectual capital, the key based on knowledge that contributes to creating competitive advantages in them. The research seeks to answer the following question: How do intellectual capital and information technology relate to each other? by resorting to the aforementioned systematic review, based on four steps: conducting the researcher, selecting papers based on their titles and abstracts, content analysis and, finally, mapping evidence and discussions. Thus, with regard to the research method, the approach is that of a systematic review in order to make it understandable and unbiased research, distinguished from the traditional review. The process covers some stages, culminating in a thematic map with their respective syntheses. Regarding the first step, conducting the research, the process was carried out through an automatic search in the bibliographic database engines: “Elsevier” (Science Direct), “Wiley Online Library”, and “Emerald Collections”. In the second step, selection of papers based on their titles and abstracts, these were read in order to exclude those that were not related to the scope of the research. Thus, the result was forty-nine papers selected in total, of which twenty-eight from “Elsevier” (Science Direct) (25% of 113), three from “Wiley Online Library” (4% of 74 papers), and eighteen from “Emerald Collections” (17% of 103 papers). In the third step, content analysis, all papers were read and analyzed according to inclusion and exclusion criteria. According to the inclusion criteria, there are those


that are only considered related to journals, with relevance being given to the topics of intellectual capital and information technologies and, according to the exclusion criteria, the fact that the papers do not focus only on the two topics mentioned. Finally, in mapping evidence and discussions, all papers were analyzed and grouped by five themes, in order to provide an answer to the aforementioned research question: statistical analysis, information technologies, technological knowledge, intellectual capital assets and theory of intellectual capital (understanding and sharing knowledge). As main conclusions to be highlighted, it should be noted that human capital was the most studied component and relational capital was the least, which can serve as a basis for future research. Some topics can be highlighted, such as the relationship between intellectual capital and information technologies, which identify them with knowledge management, learning in organizations, human resources management, innovation and the creation of new knowledge, absorption capacities and competitive advantages. In short, the authors conclude that the adoption of intellectual capital management and information technologies confirm that the needs of the new economy have been met. Knowledge generates new knowledge that can be achieved from the creation of knowledge assets generated by stakeholders. A research of this nature helps to clarify the procedures for managing intellectual capital in information technology projects, which opens new horizons for future research themes.

3 Conclusions This research refers to the binomial intellectual capital and information systems (technology), with regard to the literature review. Its goal is to dissect, within the selected literature review, that would satisfy the two topics, what it refers to about it. Given the small number, it may suggest to the most demanding reader that it is nothing more than a mere collection of papers, the result of a selection that should deserve more development and care. It turns out, however, that this is not true. The papers obtained, in which the two criteria were present, were few and that is why the result was meager, far below the desired. Just as possible. Throughout this research, essentially, the geographical contexts were the Kingdom of Jordan, Tehran (Iran), Taiwan, and Canada. Thus, it can be concluded these geographic locations where there would be more scientific interest are not included. In addition to the scarcity of research, it is noted that, on top of that, its content is devoid of relevance, insofar as it does not contribute to the expansion of scientific knowledge in the field, making it possible to learn more. One of the possible explanations for the occurrence of this content devoid of relevance is due to the fact that the combination of intellectual capital and information systems (technology), is difficult to treat, together. Moreover, as explained in


the introduction, the junction of two or more topics, regarding, namely, intellectual capital, is less applicable in practical reality. In aggregate terms, it can be synthetically inferred that the selection of the five papers is related with the impact of intellectual capital on the development and efficiency of accounting information systems (within the scope of companies in the industrial sectors), with the evaluation of the impact of technological information and intellectual capital, on future business returns in capital markets, with the frontiers of knowledge between information systems and the disciplines associated with business focused on intellectual capital, with the creation of intellectual capital through technologies of information, inserted in career transition and, finally, with a systematic review of the results found regarding intellectual capital and information technologies. In comparative terms, some ideas can be mentioned. Indeed, for [9], the development of accounting systems cannot be achieved without the intellectual capital and, for [10], this is the basis of competitive advantages along with high returns in capital markets. Adequate investment in information technologies, for [10], is important to improve business skills and these, too, underlie high rates of return in the same markets. With regard to the relationship between information systems and business disciplines, from the point of view of intellectual capital, studied by [7], if a company has few qualifications to leverage knowledge, this constitutes an obstacle to developing software. in order to explore new directions that go beyond the frontiers of knowledge, facilitating access to it. Reich and Kaarst-Brown [8] relates the creation of intellectual capital with information technologies, in the context of career transition. In a case study, the authors concluded that, in this same context, there was enrichment in the career transition and suggest new theoretical developments. Relating information technologies with intellectual capital, in [16], it appears that there has been an increase in the knowledge-based economy, namely, which has had an impact on theories about intellectual capital and other intangible assets. Thus, the management of intellectual capital, together with information technologies, confirms the idea that knowledge generates new knowledge. It can therefore be said that there is still a long way to go in the binomial in question. In the contributions of this research, it can be confirmed that what exists in the literature on this binomial, is scarce in number, little relevant in content, and in terms of practical utility, little or nothing was obtained. However, without a joint, integrated arrangement, based on a brief literature review alluding to the binomial, intellectual capital and information systems (technology), nothing could be known, based on scientific papers. Thus, this work constitutes the missing demonstration of the content of this binomial. With regard to the implications, it is possible to underline what refers to the very large area that remains to be filled, for instance, more developed economies. Perhaps, this filling in will prove to be more fruitful and will provide better results. Regarding the limitations, they basically have to do with the fact that the primary and secondary sources of data are based on what has already passed, as opposed to being on what is yet to come.


Finally, as far as future avenues of research, they are multifaceted. For instance, considering different activity sectors, more or less technological, finding an answer to the question of whether in sectors intensive in the labor factor, the adequacy is satisfactory in the employment of the binomial or not [11]. On the other hand, other future avenues of research are the geographic core being more incident in countries with more developed economies (Europe, America, …), with the help of the intellectual capital and information systems (technologies). Regarding the answer to the research question, it can be said that the aforementioned review is very small, having an almost null content, scientifically. From a concrete point of view, regarding the specificity to be highlighted, it is limited to the absence of specific contributions that can be used scientifically. In a word: nothing.

References 1. Gómez-González J, Rincon C, Rodriguez K (2012) Does the use of foreign currency derivatives affect firm’s market value? Evidence from Colombia. Emerg Mark Financ Trade 48(4):50–66 2. Akman G, Yilmaz C (2008) Innovative capability, innovation strategy and market orientation: an empirical analysis in Turkish software industry. Int J Innov Manag 12(1):69–111 3. Arvan M, Omidvar A, Ghodsi R (2016) Intellectual capital evaluation using fuzzy cognitive maps: a scenario-based development planning. Expert Syst Appl 55:21–36 4. Boekestein B (2006) The relation between intellectual capital and intangible assets of pharmaceutical companies. J Cap Intellect 7(2):241–253 5. Chen Y, Lin M, Chang C (2009) The positive effect of relationship learning and absorptive capacity on innovation performance and competitive advantage in industrial markets. Ind Mark Manag 38(2):152–158 6. Croteau A, Bergeron F (2001) An information technology trilogy: business strategy, technological development and organizational performance. J Strat Inf Syst 10(2):77–99 7. Hsu J, Lin T, Chu T, Lo C (2014) Coping knowledge boundaries between information system and business disciplines: an intellectual capital perspective. Inf Manag 1–39 8. Reich B, Kaarst-Brown M (2003) Creating social and intellectual capital through IT career transitions. J Strat Inf Syst 12:91–109 9. Al-Dalahmeh S, Abdilmuném U, Al-Dulaimi K (2016) The impact of intellectual capital on the development of efficient accounting information systems applied in the contributing Jordanian industrial companies—viewpoint of Jordanian accountant auditors. Int J Acc Financ Rep 6(2):356–378 10. Zeinali K, Zadeh F, Hosseini S (2019) Evaluation of the impact of information technology capital and intellectual capital on future returns of companies in the capital market. Int J Learn Intellect Cap 16(3):239–253 11. Andriessen D (2004) Making sense of intellectual capital—designing a method for the valuation of intangibles. Elsevier, Butterworth-Heinemann, p 2004 12. Fedotova M, Loseva O, Fedosova R (2014) Development of a methodology for evaluation of the intellectual capital of a region. Life Sci J 11(8):739–746 13. Goebel V (2015) Estimating a measure of intellectual capital value to tests its determinants. J Intellect Cap 16(1):101–120 14. Gogan L, Draghici A (2013) A model to evaluate the intellectual capital. Procedia Technol 9:867–875


15. Nahapiet J, Ghoshal S (1998) Social capital, intellectual capital, and the organizational advantage. Acad Manag Rev 23(3):242–266 16. Cunha A, Matos F, Thomaz J (2015) The relationship between intellectual capital and information technology findings based on a systematic review. In: 7th European conference on intellectual capital ECIC, pp 1–11

REST, GraphQL, and GraphQL Wrapper APIs Evaluation. A Computational Laboratory Experiment Antonio Quiña-Mera, Cathy Guevara-Vega, José Caiza, José Mise, and Pablo Landeta Abstract This research studies the effects of development architectures on the quality of APIs by conducting a computational laboratory experiment comparing the performance efficiency of a GraphQL API, a REST API, and a GraphQL API that wraps a REST API. Open data from the Electronic Chamber of Commerce of Ecuador, part of a national e-commerce research project, was used. To characterize quality, we used ISO/IEC 25023 metrics in different use cases of e-commerce data consumption and insertion. Finally, we statistically analyzed the experiment results, which indicate a difference in quality between the REST API, the GraphQL API, and the GraphQL API (wrapper), with the GraphQL API performing most efficiently. Keywords REST API · GraphQL API · Wrapper · Computational laboratory experiment · ISO/IEC 25023 · e-commerce

A. Quiña-Mera · C. Guevara-Vega (B) · P. Landeta Universidad Técnica del Norte, 17 de Julio Avenue, Ibarra, Ecuador e-mail: [email protected] A. Quiña-Mera e-mail: [email protected] P. Landeta e-mail: [email protected] A. Quiña-Mera · C. Guevara-Vega eCIER Research Group, Universidad Técnica del Norte, Ibarra, Ecuador J. Caiza · J. Mise Universidad de Las Fuerzas Armadas ESPE, Quijano y Ordoñez Street, Latacunga, Ecuador e-mail: [email protected] J. Mise e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 S. Anwar et al. (eds.), Proceedings of International Conference on Information Technology and Applications, Lecture Notes in Networks and Systems 614, https://doi.org/10.1007/978-981-19-9331-2_34


1 Introduction GraphQL is a query language for web-based application programming interfaces (APIs) proposed by Facebook in 2015 that represents an alternative to traditional REST APIs [6, 7]. The work presented by Brito et al. cites an example of the simplicity of GraphQL APIs compared to REST APIs: a GraphQL query indicates only the required fields, whereas REST returns a very long JSON document with many fields, of which clients use only a few [9, 11]. The evolution of computer systems has motivated the creation of more efficient technologies for developing a software technology architecture. In this sense, we propose the construction of a GraphQL API that consumes data from a database and a GraphQL API that wraps an existing REST API, called a GraphQL wrapper. The consumption efficiency of these APIs is evaluated using software product quality metrics based on ISO/IEC 25023 [4]. We apply this project to the existing REST API of the Electronic Commerce Chamber of Ecuador that exposes open e-commerce data, with two purposes: (i) to provide new open data consumption functionality to the technology community, and (ii) to identify which API is the most efficient (the focus of our study). Therefore, we define the following research question: What is the effect of the architecture used in API development on the external quality of the software? We answer the research question with a computational laboratory experiment [3] to compare the efficiency of three APIs built with different architectures: REST, GraphQL, and wrapper (REST + GraphQL), using the open e-commerce data of the Ecuadorian Chamber of Commerce. The rest of the document is organized as follows: Sect. 2 briefly describes REST and GraphQL. Section 3 provides the experimental setup to compare the performance efficiency between the GraphQL, REST, and GraphQL (wrapper) APIs. Section 4 shows the execution of the experimental design proposed in Sect. 3. Section 5 shows the results of the experiment execution. Section 6 shows the threats to the validity, execution, and results of the experiment. Section 7 discusses the results. Finally, Sect. 8 shows the conclusions of the study and future work.

2 REST and GraphQL

REST (REpresentational State Transfer) is an architectural style developed as an abstract model for web architecture and used to guide the definition of the Hypertext Transfer Protocol and Uniform Resource Identifiers [9]. At its creation in 2000, it offered a way to reduce the complexity of web services and to transform service-oriented architectures [10]. Specifically, at its creation it was a lighter-weight solution than the Web Services Description Language (WSDL) and SOAP, which exchange verbose XML-based messages [8]. GraphQL is a language that uses a graph pattern as its basic operational unit [13]. A graph pattern consists of a graph structure and a set of graph attributes [17]. GraphQL is a framework that includes a graph-based query language [15]. There are REST APIs whose maintainers wish to obtain the advantages of GraphQL, for which additional interfaces are required. This is where GraphQL wrappers for existing REST APIs come in: once it receives a GraphQL query, a wrapper forwards the corresponding requests to the target REST API [12].
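A minimal sketch of this wrapping pattern is shown below. It assumes Apollo Server on Node.js 18+; the schema, endpoint, and field names are illustrative. The study's instrumentation (Sect. 3.8) names Express.js and Apollo Client but does not state which GraphQL server library was used, so this is a hedged example rather than the authors' implementation.

```javascript
// Hedged sketch: a GraphQL wrapper whose resolver forwards each query to an
// existing REST endpoint and returns the parsed JSON. ES module; requires
// Node.js 18+ (global fetch) and @apollo/server. Names are hypothetical.
import { ApolloServer } from '@apollo/server';
import { startStandaloneServer } from '@apollo/server/standalone';

const typeDefs = `#graphql
  type Question { id: ID!, text: String, year: Int }
  type Query { questions(year: Int): [Question] }
`;

const resolvers = {
  Query: {
    // The wrapper adds no storage of its own: it translates the GraphQL
    // arguments into a REST call against the target API.
    questions: async (_parent, { year }) => {
      const res = await fetch(`https://rest.example.org/questions?year=${year}`);
      return res.json();
    },
  },
};

const server = new ApolloServer({ typeDefs, resolvers });
const { url } = await startStandaloneServer(server, { listen: { port: 4000 } });
console.log(`GraphQL wrapper ready at ${url}`);
```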

3 Experimental Setting

3.1 Goal Definition

The objective of the computational laboratory experiment is to compare the efficiency of the GraphQL, REST, and GraphQL (wrapper) architectures applied to the open e-commerce data of the Ecuadorian Chamber of Electronic Commerce.

3.2 Factors and Treatments

The investigated factor is the external software quality of the GraphQL API, the REST API, and the GraphQL (wrapper) API, operationalized with performance efficiency metrics. The treatments applied to this factor are:
– GraphQL architecture for API development.
– REST architecture for API development.
– GraphQL (wrapper) architecture for API development.

3.3 Variables

To measure the performance efficiency of the GraphQL, REST, and GraphQL (wrapper) architectures, we conducted a computational laboratory experiment. We relied on the following metrics from the ISO/IEC 25023 standard [4]:

– Average response time: the time it takes to complete a job or an asynchronous process. The measurement function is

  X = \frac{\sum_{i=1}^{n} (B_i - A_i)}{n}    (1)

where A_i is the time at which job i starts, B_i is the time at which job i completes, and n is the number of measurements.

– Average system response time: the average time taken by the system to respond to a user task or system task.

  X = \frac{\sum_{i=1}^{n} A_i}{n}    (2)

where A_i is the time taken by the system to respond to a specific user task or system task at the i-th measurement, and n is the number of measured responses.
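As an illustration of how these two metrics can be computed from raw timing samples, a short sketch follows; the sample values are invented and this is not the study's measurement code.

```javascript
// Illustrative helpers, not the study's instrumentation: given start and
// completion timestamps (milliseconds) for n executed jobs, compute the
// ISO/IEC 25023 average response time, Eq. (1), and the average system
// response time, Eq. (2).
function averageResponseTime(jobs) {
  const total = jobs.reduce((sum, { start, end }) => sum + (end - start), 0);
  return total / jobs.length;
}

function averageSystemResponseTime(responseTimes) {
  return responseTimes.reduce((sum, t) => sum + t, 0) / responseTimes.length;
}

// Hypothetical measurements for three requests.
const jobs = [
  { start: 0, end: 212 },
  { start: 0, end: 187 },
  { start: 0, end: 241 },
];
console.log(averageResponseTime(jobs));                  // ~213.33
console.log(averageSystemResponseTime([212, 187, 241])); // ~213.33
```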

3.4 Hypotheses

We propose the following hypotheses for the experiment:
– H0 (null hypothesis): There is no difference in the external quality of APIs developed with GraphQL, REST, or GraphQL (wrapper).
– H1 (alternative hypothesis): There is a difference in the external quality of APIs developed with GraphQL, REST, or GraphQL (wrapper); the GraphQL API performs more efficiently.
– H2 (alternative hypothesis): There is a difference in the external quality of APIs developed with GraphQL, REST, or GraphQL (wrapper); the REST API performs more efficiently.
– H3 (alternative hypothesis): There is a difference in the external quality of APIs developed with GraphQL, REST, or GraphQL (wrapper); the GraphQL (wrapper) API performs more efficiently.

3.5 Design

We based the design of the experiment on the execution of four use cases: two data query use cases, with complexity of up to two levels of relationship in the data structure, and two data insertion use cases. We execute the query use cases against each API (REST, GraphQL, and GraphQL (wrapper)) with the following numbers of records: 10, 100, 1,000, 10,000, 100,000, and 300,000.

3.6 Use Cases

Below, we present four use cases adapted to the reporting structure, from 2017 to 2020, of the annual e-commerce survey conducted by the Ecuadorian Chamber of Electronic Commerce:
– Use case UC-01: query the global data of the questions asked in the e-commerce surveys of 2017, 2018, 2019, and 2020.
– Use case UC-02: query the data on frequency of internet usage, reasons for purchase, and reasons for non-purchase for 2017, 2018, 2019, and 2020.
– Use case UC-03: insert the questions of the 2020 e-commerce survey.
– Use case UC-04: insert the frequency of internet usage, reasons for purchase, and reasons for non-purchase for 2020.

Fig. 1 Computational laboratory experiment architecture (the experimental laboratory on a local PC runs a client application with the experimental tasks; the GraphQL API (wrapper) forwards data requests to the e-commerce REST API, while the GraphQL API, through its type system and resolvers, sends data requests to the e-commerce database; responses are collected as JSON response files)

3.7 Experimental Tasks

In this section, we design a computational laboratory to execute the experiment defined in Sect. 3.5 (see Fig. 1). The requirements of the experimental tasks are detailed below:
– Experimental task 1 (REST API queries and inserts): executes the data query and insertion use cases against the e-commerce REST API using a client application that automates the process.
– Experimental task 2 (GraphQL API queries and inserts): executes the data query and insertion use cases against the e-commerce GraphQL API using a client application that automates the process.
– Experimental task 3 (GraphQL API (wrapper) queries and inserts): executes the data query and insertion use cases against the e-commerce GraphQL API (wrapper) using the same client application that automates task 1.
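A minimal sketch of how such a client application could automate one query use case and record response times is shown below. The endpoint, query, and field names are hypothetical, and the record counts are taken from Sect. 3.5; this is an illustration rather than the study's actual client code.

```javascript
// Hedged sketch of the automation loop: for each record count in the design,
// send a UC-01-style query to one API endpoint and record the elapsed time
// in milliseconds. Requires Node.js 18+ (global fetch); names are assumed.
const RECORD_COUNTS = [10, 100, 1000, 10000, 100000, 300000];
const ENDPOINT = 'http://localhost:4000/graphql';

async function runUseCase01() {
  const results = [];
  for (const limit of RECORD_COUNTS) {
    const query = `query { questions(limit: ${limit}) { id text year } }`;
    const start = performance.now();
    const res = await fetch(ENDPOINT, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ query }),
    });
    await res.json(); // wait for the full payload before stopping the timer
    results.push({ records: limit, ms: performance.now() - start });
  }
  console.table(results); // results are later tabulated in a spreadsheet
}

runUseCase01();
```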

3.8 Instrumentation

The instruments used in the experiment, i.e., the infrastructure, technology, and libraries that compose the computational laboratory, are described below.

Specification of the local computer where the APIs are implemented:
– Linux Ubuntu 3.14.0 operating system.
– 1 vCPU core.
– 2 GB RAM.
– 30 GB hard disk.

Development environment:
– Back end (REST API, GraphQL API, and GraphQL API (wrapper)): Visual Studio Code (code editor), Node.js (JavaScript runtime environment), npm (package management system), and Express.js (web application framework).
– Front end (client application): Visual Studio Code (code editor), React.js (JavaScript library for creating user interfaces), and Apollo Client (queries to GraphQL APIs).

The IBM SPSS [18] application was used for data collection and analysis.
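For illustration, a minimal Express.js endpoint of the kind the back end describes might look as follows; the routes, payload shape, and in-memory data are hypothetical and are not taken from the study's source code.

```javascript
// Hedged sketch of an Express.js REST back end with one query endpoint and
// one insertion endpoint. ES module; requires the express package.
import express from 'express';

const app = express();
app.use(express.json());

// In-memory stand-in for the e-commerce database.
const questions = [
  { id: 1, text: 'How often do you shop online?', year: 2020 },
  { id: 2, text: 'Main reason for not purchasing online?', year: 2020 },
];

// UC-01-style query: returns the full JSON representation of each record.
app.get('/questions', (req, res) => {
  const year = Number(req.query.year);
  res.json(year ? questions.filter((q) => q.year === year) : questions);
});

// UC-03-style insertion.
app.post('/questions', (req, res) => {
  const question = { id: questions.length + 1, ...req.body };
  questions.push(question);
  res.status(201).json(question);
});

app.listen(3000, () => console.log('REST API listening on port 3000'));
```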

3.9 Data Collection and Analysis

The steps to collect the data from the use case executions established in the experiment are: (i) execute the use case in the client application; (ii) copy the result of the use case execution from the Visual Studio Code console and paste it into a Microsoft Excel 365 file; (iii) tabulate the collected data. We statistically analyze the experiment's results using IBM SPSS Statistics: Pearson correlation matrices to observe the degree of linear relationship between the variables, and discriminant analysis, using Wilks' Lambda, to observe significant differences between the architectures applied in the experiment.

4 Experiment Execution

The experiment was run in October 2021, following the provisions of Sect. 3.

4.1 Preparation

We start by checking that the components of the experimental laboratory are ready to execute the use cases in the client application that consumes the REST API, the GraphQL API, and the GraphQL API (wrapper). We then execute the following steps:
– On the single GraphQL API (wrapper) endpoint, we run the data query use cases for each amount of data (10, 100, 1,000, 10,000, 100,000, and 300,000 records).
– On the single GraphQL API (wrapper) endpoint, we execute the data insertion use cases for one minute.
– On the different REST API endpoints, we execute the data query use cases for each amount of data (10, 100, 1,000, 10,000, 100,000, and 300,000 records).
– On the different REST API endpoints, we execute the data insertion use cases for one minute.
– After each execution, we copy the results to the Microsoft Excel file.

Table 1 Use case UC-01 results (response time in milliseconds)

# Records   REST API       GraphQL API    GraphQL API (wrapper)
10          4,981.4331     2,012.6965     7,118.0000
100         6,937.0527     10,351.2771    16,537.8169
1,000       8,401.2710     12,262.4505    21,675.0374
10,000      11,399.8534    8,710.2970     20,110.1503
100,000     18,483.0179    3,380.3218     21,863.3397
300,000     39,962.5077    5,966.3039     45,928.8116
Average     15,027.5226    7,113.8911     22,205.5259

4.2 Data Collection

Table 1 shows an example of the collected data for use case UC-01: the response time of the UC-01 execution for each number of records and each architecture. Times are in milliseconds.

5 Results

Table 2 shows the average response time and average system response time obtained from the execution of the use cases, as external quality metrics of the APIs. We observe that the response time of the GraphQL API is up to two times faster than that of the REST API and three times faster than that of the GraphQL API (wrapper); note that the result of the wrapper is approximately the sum of the results of the REST and GraphQL APIs.

5.1 Statistical Analysis

After obtaining the results, we performed the statistical analysis by calculating the Pearson correlation for the two variables (response time and system response time), as well as the test of equality of group means using Wilks' Lambda.


Table 2 Experiment results

Architecture            UC-01 Response time   UC-02 Response time   UC-03 Performance   UC-04 Performance   Efficiency order
GraphQL API             7,113.9               897.8                 1.456               4.28                1
REST API                15,027.52             6,595.72              1.06                0.7245              2
GraphQL API (wrapper)   22,205.52             7,493.58              0.8454              0.6071              3

Table 3 shows the Pearson correlation matrix between the response time and system response time variables of the REST API, GraphQL API, and GraphQL API (wrapper) architectures. We observe that the execution times of the GraphQL API and the GraphQL API (wrapper) have a positive linear correlation of 0.373. On the other hand, the performance variable of the GraphQL API and the REST API have a positive linear correlation of 0.78. Table 4 shows the result of the test of equality of group means, indicating Wilks' Lambda values (the value closest to zero being the most favorable indicator). We observe that the GraphQL API (wrapper) has the lowest significance level, 0.003, which indicates that there are differences between the groups; therefore, the GraphQL API has a significant advantage over the others. Likewise, the significance value of 0.011 for the REST API is lower than the p-value threshold (0.05). For this reason, we accept the alternative hypothesis H1, which indicates a difference in external quality between the REST API, the GraphQL API, and the GraphQL API (wrapper); accordingly, the GraphQL API performs more efficiently.

6 Threats to Validity

Threats to validity are the situations, factors, weaknesses, and limitations that could interfere with the validity of the results of the present empirical study; the relevant potential threats are analyzed below based on the classification proposed by Wohlin et al. [16].

Internal validity. We managed the development process of the GraphQL wrapper that exposes the open e-commerce data by applying the agile Scrum methodology. We then ran the experiment by performing four data query and insertion use cases with a scope of up to two levels of relationship in the data query. Next, we measured the response time from the submission of each request to the receipt of its response. Finally, we evaluated resource utilization by considering the CPU processing speed, following the ISO/IEC 25023 quality metrics.


Table 3 Pearson correlation matrix

            R-TIME     R-PERF     W-TIME     W-PERF     G-TIME     G-PERF
R-TIME PEA  1          −0.906*    −0.167     0.320      −0.286     0.703
R-TIME SIG             0.013      0.752      0.537      0.583      0.119
R-PERF PEA  −0.906*    1          −0.248     0.082      0.078      −0.793
R-PERF SIG  0.013                 0.635      0.877      0.883      0.060
W-TIME PEA  −0.167     −0.248     1          −0.980**   0.373      0.386
W-TIME SIG  0.752      0.635                 0.001      0.466      0.450
W-PERF PEA  0.320      0.082      −0.980**   1          −0.411     −0.276
W-PERF SIG  0.537      0.877      0.001                 0.419      0.597
G-TIME PEA  −0.286     0.078      0.373      −0.411     1          −0.419
G-TIME SIG  0.583      0.833      0.466      0.419                 0.409
G-PERF PEA  0.703      −0.793     0.386      −0.276     −0.419     1
G-PERF SIG  0.119      0.060      0.450      0.597      0.409
N = 6 for all pairs

R: REST API; G: GraphQL API; W: GraphQL (wrapper) API; PERF: performance; PEA: Pearson correlation; SIG: significance (bilateral)

Table 4 Test of equality of group means

Effect                                  Value   F         Hypothesis df   Error df   Sig.
API GraphQL—Wilks' Lambda               0.265   13.860b   1.000           5.000      0.014
API REST—Wilks' Lambda                  0.244   15.465b   1.000           5.000      0.011
API GraphQL (wrapper)—Wilks' Lambda     0.156   27.087b   1.000           5.000      0.003

a. Design: intercept. b. Exact statistic. Within-subjects design: API REST, API GraphQL, API GraphQL (wrapper)

External validity. We experimented in a computational laboratory context where we ran the use cases on the REST, GraphQL, and GraphQL (wrapper) APIs in the same execution environment on the local PC.

Construct validity. To minimize measurement bias, we developed the experiment execution constructs to automatically measure the response time and the number of tasks executed per unit of time on the REST and GraphQL APIs. In addition, the constructs were defined and validated in consensus with two expert software engineers, and we used four use cases to minimize data manipulation bias in the established treatments.

Conclusion validity. We mitigated threats to the conclusions by performing statistical analyses to accept one of the hypotheses raised in the experiment and thus support the study's conclusions.

7 Discussion

To corroborate the results obtained, we analyzed other studies that compare APIs, which led to the following observations. Vogel et al. [14] present a study in which they migrate part of the API of a smart home management system to GraphQL and report the performance of two endpoints after the migration; they conclude that GraphQL required 46% of the time of the original REST API. Wittern et al. [12] show how to automatically generate GraphQL wrappers for existing REST APIs; they propose a tool based on the OpenAPI Specification (OAS), with which they evaluated 959 publicly available REST APIs and generated wrappers for 89.5% of them. In addition, Seabra et al. [2] studied three applications using REST and GraphQL architecture models and observed that migration to GraphQL increased performance in two of the three applications with respect to the number of requests per second and the data transfer rate. In relation to these studies, our results indicate that requests made with the GraphQL API have advantages with respect to under-fetching and over-fetching, and that the GraphQL API handles memory resources more efficiently than the REST API. As a limitation, studies of this type should be extended to measure the wrapper in more detail, since no major impact of the wrapper was observed in the present study.

8 Conclusions and Future Work

In this paper, we posed the research question (RQ): What is the effect of the architecture used in API development on the external quality of the software? We answered the RQ by conducting a computational laboratory experiment comparing the performance efficiency (characterized with ISO/IEC 25023) of three APIs developed with the GraphQL, REST, and GraphQL (wrapper) architectures, the last of which wraps a REST API. The experiment consisted of consuming the three APIs from a client application that executes four common use cases (two data queries and two data insertions). We characterized the external quality of the software using the average response time and average system response time metrics. After running, tabulating, and statistically analyzing the experiment results, we accept the alternative hypothesis H1, which indicates a difference in external quality between the REST API, the GraphQL API, and the GraphQL API (wrapper); accordingly, the GraphQL API performs more efficiently.

Acknowledgements Electronic Commerce Chamber of Ecuador.

References

1. Hartig O, Pérez J (2017) An initial analysis of Facebook's GraphQL language. In: CEUR workshop proceedings, Montevideo
2. Seabra M, Nazário MF, Pinto GH (2019) REST or GraphQL? A performance comparative study. In: SBCARS '19
3. Guevara-Vega C, Bernárdez B, Durán A, Quiña-Mera A, Cruz M, Ruiz-Cortés A (2021) Empirical strategies in software engineering research: a literature survey. In: Second international conference on information systems and software technologies (ICI2ST), Ecuador, pp 120–127
4. ISO/IEC 25023:2016—Systems and software engineering, ISO: The International Organization for Standardization. https://www.iso.org/standard/35747.html. Accessed 28 March 2021
5. ISO/IEC 25000 Systems and software engineering, ISO: The International Organization for Standardization. https://bit.ly/3xhut3j. Accessed 12 Feb 2021
6. Fielding RT (2000) Architectural styles and the design of network-based software architecture. PhD dissertation, University of California
7. Fielding RT, Taylor RN (2002) Principled design of the modern web architecture. ACM Trans Internet Technol 2(5):115–150
8. Sheth A, Gomadam K, Lathem J (2007) SA-REST: semantically interoperable and easier-to-use services and mashups. IEEE Internet Comput 6(11):91–94
9. Brito G, Mombach T, Valente M (2019) Migrating to GraphQL: a practical assessment. In: SANER 2019—proceedings of the 2019 IEEE 26th international conference on software analysis, evolution, and reengineering
10. Pautasso C, Zimmermann O, Leymann F (2008) RESTful web services versus "big" web services. In: WWW '08: proceedings of the 17th international conference on World Wide Web, Beijing, China
11. Brito G, Valente M (2020) REST vs GraphQL: a controlled experiment. In: Proceedings—IEEE 17th international conference on software architecture, ICSA 2020, Salvador, Brazil
12. Wittern E, Cha A, Laredo J (2018) Generating GraphQL-wrappers for REST(-like) APIs. In: 18th international conference, ICWE 2018. Springer International Publishing, Spain
13. Quiña-Mera A, Fernández-Montes P, García J, Bastidas E, Ruiz-Cortés A (2022) Quality in use evaluation of a GraphQL implementation. Lecture Notes in Networks and Systems, vol 405, pp 15–27
14. Vogel M, Weber S, Zirpins C (2018) Experiences on migrating RESTful web services to GraphQL. In: ASOCA, ISyCC, WESOACS, and satellite events. Springer International Publishing, Spain
15. Hartig O, Pérez J (2018) Semantics and complexity of GraphQL. In: WWW '18: proceedings of the 2018 World Wide Web conference, Switzerland
16. Wohlin C, Runeson P, Höst M, Ohlsson M, Regnell B, Wesslén A (2012) Experimentation in software engineering, 1st edn. Springer, Berlin, Heidelberg
17. He H, Singh A (2008) Graphs-at-a-time: query language and access methods for graph databases. In: Proceedings of the ACM SIGMOD international conference on management of data, New York, United States, pp 405–417
18. IBM SPSS software. https://www.ibm.com/analytics/spss-statistics-software. Accessed 10 April 2021
19. Author (2022) Laboratory package: efficient consumption between GraphQL API wrappers and REST API. Zenodo. https://doi.org/10.5281/zenodo.6614351

Design Science in Information Systems and Computing

Joao Tiago Aparicio, Manuela Aparicio, and Carlos J. Costa

Abstract Design science is a term commonly used to refer to the field of study that focuses on the research of artifacts, constructs, and other artificial concepts. The purpose of this article is to define this domain of knowledge with respect to information systems and computing, to differentiate between what design science is and what it is not, and to provide examples from ongoing research. To accomplish this goal, we conduct a bibliometric analysis of design science and its interaction with information systems and computing. The study identifies the primary aggregations of publications pertaining to design science and analyzes them chronologically. In addition, we clarify some common misconceptions about this field of study by defining what does and does not constitute design science, and we describe the primary stages of the design science methodological approach.

Keywords Design science · Bibliometric study · Research methodology

J. T. Aparicio (B)
INESC-ID, Instituto Superior Técnico, University of Lisbon, Lisbon, Portugal
e-mail: [email protected]
M. Aparicio
NOVA Information Management School (NOVA IMS), Universidade Nova de Lisboa, Lisbon, Portugal
e-mail: [email protected]
C. J. Costa
Advance/ISEG—Lisbon School of Economics and Management, Universidade de Lisboa, Lisbon, Portugal
e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
S. Anwar et al. (eds.), Proceedings of International Conference on Information Technology and Applications, Lecture Notes in Networks and Systems 614, https://doi.org/10.1007/978-981-19-9331-2_35

1 Introduction

Research can be defined as an activity contributing to a better understanding of a phenomenon. A phenomenon is a set of behaviors of some entity that researchers find interesting, and understanding is the knowledge that allows prediction of some aspects of that behavior. Typically, research begins with a question or a problem, and it is essential to describe the study's objective in detail. This task is undertaken according to a predetermined plan of action, and the core problem is divided into subproblems that are easier to manage. Research is guided by a specific research problem, question, or hypothesis; it accepts certain critical assumptions and requires the collection and interpretation of data or the creation of artifacts. Research is naturally cyclical, iterative, or, more precisely, helical. The natural sciences include traditional research in domains like physics, biology, the social sciences, and the behavioral sciences; such research aims at understanding reality. Research that focuses on artifacts, constructs, and other artificial concepts is commonly referred to as design science [16]. The goal of natural science is to explain how and why things are the way they are; the behavioral sciences are therefore seen as belonging to the natural science branch, since they also aim to understand why individuals and society behave in certain ways when exposed to a phenomenon. The creation of objects with the purpose of achieving a goal is the focus of design science [16]. March and Smith [16] state that the practice of natural science is typically seen as comprising two distinct activities: discovery and justification. Discovery is the activity of developing or putting forward scientific hypotheses (e.g., theories, laws); justification involves the tasks carried out to determine whether or not those assertions are accurate. The discovery process is not well understood: while some have suggested that there is a "logic" to scientific discovery, the mainstream philosophy of science has traditionally seen it as a creative process that psychologists may or may not be able to explain. In the present study, we have three main objectives: first, to identify the main research topics within information systems and computing that use design science research; second, to confront the concept of design science with other concepts; and third, to propose a tool that helps academia and industry in the design science research process. To achieve the first objective, we performed a bibliometric study. A bibliometric study attempts to quantify the processes involved in written communication; because it is a quantitative approach, it is used to discover trends in publications, so its prime purpose is to identify patterns rather than to analyze the content of the articles (such analysis is more the objective of a scoping review or a systematic review) [1, 4, 10, 20]. We reviewed the fundamental methodological literature related to design science research to achieve the second and third objectives.

2 Main Research Topics in Literature

In this study, we review the scientific publications where design science is present in works related to information systems, computer science, machine learning, artificial intelligence, or data science. All these areas are traditionally related to artifact building and evaluation; examples of such artifacts are new machine learning models, platforms, and programming libraries. We did not constrain the results by publication date, and we conducted the following query on Scopus: TITLE-ABS-KEY ("Design Science" AND ("Information System" OR "Computer Science" OR "Machine Learning" OR "Artificial Intelligence" OR "Data Science")). We exported the .csv file from Scopus and selected in VOSviewer the keywords that co-occurred more than ten times. Figure 1 presents the resulting co-occurrence graph for the retrieved design science publications and shows that information systems is the most frequent and the most central node among the publication keywords. We removed design science itself from the graph to prevent a strong bias in the reading, as all the papers have design science in their keywords. The keyword co-occurrence graph indicates seven clusters among the publications (Table 1). The results in Table 1 indicate that design science publications are related mainly to information systems studies (cluster 1), followed by artificial intelligence (cluster 2) and security applied to industry (cluster 3). There is also a group of publications about methodologies in science (cluster 4), a cluster related to engineering and software development (cluster 5), a cluster related to the sustainability dimension (cluster 6), and a group of publications related to business (cluster 7).

Fig. 1 The co-occurrence graph resulting from the design science publications
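As an illustration of the keyword co-occurrence counting behind a graph like Fig. 1 (the study itself used VOSviewer), a minimal sketch follows; the file name, input format, and threshold are assumptions, and this is not the tooling used by the authors.

```javascript
// Hedged sketch: count keyword co-occurrences of the kind visualized with
// VOSviewer. Assumes the author keywords exported from Scopus were saved to
// 'keywords.txt', one publication per line, keywords separated by semicolons.
import { readFileSync } from 'node:fs';

const MIN_OCCURRENCES = 10; // threshold used in the analysis above

const lines = readFileSync('keywords.txt', 'utf8').split('\n').filter(Boolean);
const keywordCounts = new Map();
const pairCounts = new Map();

for (const line of lines) {
  const keywords = [...new Set(
    line.split(';').map((k) => k.trim().toLowerCase()).filter(Boolean)
  )];
  for (const k of keywords) keywordCounts.set(k, (keywordCounts.get(k) ?? 0) + 1);
  for (let i = 0; i < keywords.length; i++) {
    for (let j = i + 1; j < keywords.length; j++) {
      const pair = [keywords[i], keywords[j]].sort().join(' | ');
      pairCounts.set(pair, (pairCounts.get(pair) ?? 0) + 1);
    }
  }
}

// Keep only keywords occurring at least MIN_OCCURRENCES times, mirroring the
// VOSviewer filter, and list the strongest co-occurrence links.
const frequent = new Set(
  [...keywordCounts].filter(([, n]) => n >= MIN_OCCURRENCES).map(([k]) => k)
);
const links = [...pairCounts]
  .filter(([pair]) => pair.split(' | ').every((k) => frequent.has(k)))
  .sort((a, b) => b[1] - a[1]);
console.log(links.slice(0, 20));
```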

Table 1 Clusters of the design science publications

Cluster #   Color       More frequent keywords           Cluster name
Cluster 1   Red         Information systems              IS
Cluster 2   Green       AI, DSS, machine learning        AI
Cluster 3   Dark blue   Security                         Industry applications
Cluster 4   Yellow      Action research, research        Methodologies
Cluster 5   Purple      Agile, software development      Engineering
Cluster 6   Cyan        Sustainability, environment      Sustainability
Cluster 7   Orange      Business model, service design   Business

After the cluster analysis, we assessed how design science publications evolved over time (Fig. 2). We can observe that topics like methodologies, business orientation, business models, and industry are among the first topics in the design science literature (from the late nineties until 2010). Studies about information systems and design science became more frequent after 2010, and design science and information systems publications are among the most frequent compared to other topics. Only after 2018 does design science appear in the literature related to blockchain, machine learning, Industry 4.0, chatbots, and digital transformation. When we analyze studies cited more than 500 times in the literature, the results indicate that Herbert Simon (1969) does not appear among the most cited authors related to design science, even though he was one of the first to discuss artifacts in the literature related to computation and artificial intelligence (Fig. 3). The results indicate a strong presence, in a central position, of authors related to information systems (cluster 1) [11, 16]; other authors are more frequent in studies related to artificial intelligence and machine learning (cluster 2) [9, 18]. Countries' contributions to design science studies are very uneven: there is a clear gap between German, American, Australian, and Swiss publications on design science and those of the rest of the world, although our results indicate that these countries were not among the first to publish design science studies (Fig. 4). From the bibliometric study, it can be inferred that design science research is used very frequently in research on information systems and computing. The thorough publication of design science research in key IS outlets, retrievable on Scopus, is an essential step toward merging the design science and behavioral science communities within IS. The majority of studies in the field of information systems are based on either the behavioral science or the design science paradigm; both paradigms are essential to the information systems (IS) field because of its unique position at the intersection of people, organizations, and technology [11]. In the realm of technology, design science is quite active, since it participates in producing technical artifacts that affect individuals and organizations.


Fig. 2 Design science studies trends in time

Fig. 3 Authors with more than 500 citations related to design science and authors’ trend citations


Fig. 4 Countries’ contribution to design science publications over time

Its primary focus is on finding solutions to problems, but it frequently adopts a one-dimensional understanding of the people and organizational environments in which the created artifacts need to operate. As noted previously, the design of an artifact, its formal specification, and an evaluation of its utility, frequently accomplished through comparison with other artifacts, are all fundamental components of design science research. The following section focuses on design science research itself: it disambiguates the concept and describes the phases of its methodological approach.

3 What Design Science Is and What It Is Not

Design science attempts to create things that serve individual or organizational purposes. It is technology-oriented, evaluating its products based on characteristics such as value or utility and on building and development criteria. This methodological approach is mainly focused on questions such as "How can we develop this?", "Does it work?", and "Is it an improvement?". The fields of management science, computer science, information systems, architecture, engineering, urban planning, and operations research all emphasize design as an essential activity, the more traditional branches of science cannot function without including design activities, and certain scientists conduct research related to both the natural and the design sciences [5, 12, 16, 21]. Complementary research cycles between design science and behavioral science are in place to address fundamental challenges inherent in successfully applying information systems. The fields of behavioral science and design science are interdependent and cannot be separated with regard to IS research. The pragmatic philosophy on which these arguments are based maintains that truth (justified theory) and utility (effective artifacts) are two sides of the same coin and that scientific study ought to be judged in light of its practical implications; in other words, the practical applicability of a study's results should be valued as highly as the rigor of the research undertaken to reach them [11].

Design science is not an artifact. Artifacts are the outputs of design science. Artifacts can be constructs, models, methods, and instantiations [16]. Constructs can be vocabulary, concepts, and symbols. Models can be thought of as collections of propositions or statements that express the relationships between constructs (abstractions and representations). Methods are the procedures (including algorithms, principles, and practices) followed to complete a job. Instantiations are the implementation of an artifact in its context (both the implemented system and the prototype system), which verifies the system's practicability and performance.

Design science is not defined solely by the artifact. The term "artifact" refers to anything artificial, produced by humans rather than occurring naturally [23]. Such artifacts must either improve current solutions to a problem or provide a first solution to a significant challenge. Design science encompasses the artifact's documentation and a rigorous assessment procedure that compares it to other artifacts, which enables the generation of knowledge. In routine design practice, by contrast, no new knowledge is developed; existing best practices are simply applied.

Design science is not a methodology; it is better understood as a research paradigm. Design science research (DSR) is a research paradigm in which a designer answers questions relevant to human problems through the creation of innovative artifacts, thereby contributing new knowledge to the body of scientific evidence. The designed artifacts are both useful and fundamental in understanding the problem [11]. It is important to distinguish design science from action research, as these two concepts lead to different outcomes [14]. Action research is centered on discovery through action, and it is a methodology; design science is focused on problem-solving by creating and positioning an artifact in a natural setting, is centered on discovery through design, and is a paradigm. Participatory design is not design science either: participatory design entails individual participation throughout the whole design process and has its own philosophical and methodological orientation and procedures, much like participatory action research, the methodology on which it is built [8, 15, 24] (Glesne, 1998).


Design science is not design thinking. Design thinking is a non-linear, iterative process that consists of understanding users, challenging assumptions, redefining problems, and creating innovative solutions to prototype and test. The literature [12, 14] proposes a taxonomy of research methods focused on research outputs. Conceptual-analytical approaches either derive a theory, model, or framework from assumptions, premises, and axioms, or first analyze the basic assumptions behind the constructs used in previous empirical studies. Empirical approaches may be theory-testing approaches, using methods such as laboratory experiments, surveys, field studies, and field experiments; the research question could be: Do observations confirm or falsify the theory? Empirical approaches may also be theory-creating approaches, in which the theory, model, or framework is either taken from the literature or developed or refined for the study; such approaches include case studies, ethnographic methods, grounded theory, phenomenography, contextualism, discourse analysis, and longitudinal studies. Research stressing the utility of artifacts may include artifact-building approaches; the research question could be: Is it possible to build a specific artifact? The general output is a specific abstract or concrete artifact (e.g., a new information system, prescriptive model, normative method, or measurement instrument).

4 Proposing a Tool to Help Researchers

Research stressing the utility of artifacts may also focus on artifact-evaluation approaches; the research question could be: How effective is the artifact? The general output is an evaluation, against some criteria, of the utility (efficiency, effectiveness) of a specific artifact (or prescriptive model or normative method). An output example is: "The largest corporations and multinationals are making little use of the Internet, treating it simply as a publishing medium". The primary question asked across both the natural and the social sciences is, "What, or of what kind, is the world?". Considering the process of constructing an artifact, the questions we pose are instead: Why do we make an artifact, and how do we do it? [12, 13]. A problem can be identified as any scenario where a gap exists between the real conditions and the intended ideal conditions [23]. It therefore matters to depict the methodological approach of design science; according to [11, 18], it has six phases.

Phase 1 corresponds to problem identification and motivation: researchers define the specific research problem and justify the value of a solution (Table 2). A well-defined problem can be used to develop an effective artifact, and justifying the value motivates the researcher to pursue the solution and accept the result. Resources required: knowledge of the state of the problem and of the importance of its solution. Phase 2 consists of defining the objectives for a solution: researchers infer the objectives of a solution from the problem definition and from knowledge of what is possible and feasible (metric or non-metric objectives). The resources required are knowledge of the state of the problem and of current solutions and their efficacy. Phase 3 is design and development: in this activity, the researcher creates the artifact, such as constructs, models, methods, or instantiations, or "new properties of a technical, social or informational resource". The resource required is knowledge of theory that can be brought to bear on a solution. Phase 4 is the demonstration: the use of the artifact is demonstrated to solve one or more instances of the problem. The resource required is knowledge of how to use the artifact to solve the problem. Phase 5 is the evaluation: the researcher observes and measures how well the artifact supports a solution to the problem, comparing the objectives of a solution to the actual results observed from the use of the artifact in the demonstration. The resources required are metrics and analysis techniques; the process can iterate back to phase 3 to improve the artifact. Phase 6 is communication: researchers communicate the problem and its importance, the artifact, its utility and novelty, the rigor of its design, and its effectiveness to researchers and other relevant audiences when appropriate. Knowledge of the disciplinary culture is the required resource. This approach has been used to develop several artifacts, such as methodological approaches [1, 3, 6, 7, 19].

Table 2 Design science tool

Phase 1. Objective: problem and motivation identification. Approach and methods: observation of reality; literature review. Next phase: 2. Outcome: problem and motivation statement.
Phase 2. Objective: objectives of a solution. Approach and methods: inference from the problem definition. Next phase: 3. Outcome: objectives statement.
Phase 3. Objective: propose an artifact. Approach and methods: literature review; observation of reality; analysis and design. Next phase: 4. Outcome: artifact description.
Phase 4. Objective: demonstration, finding a suitable context. Approach and methods: use the artifact to solve the problem. Next phase: 5. Outcome: artifact implemented in a specific context.
Phase 5. Objective: evaluate. Approach and methods: observe how effective and efficient the artifact is, using qualitative methods (e.g., ethnography, focus groups) or quantitative methods (e.g., laboratory experimentation, statistical analysis of survey data). Next phase: 3, 6. Outcome: tested and evaluated artifact.
Phase 6. Objective: communication. Approach and methods: publish. Outcome: academic and professional publications.


5 Conclusions and Implications

This study presents a bibliometric analysis with the objective of understanding the presence and influence of design science in the information systems and computing fields. The empirical part of the study indicated that design science is related mainly to information systems and to six other areas: artificial intelligence, industry applications, research methodologies, engineering, sustainability, and new business models. From the analysis of the authors of design science publications, it could be assessed that design science is not only part of the evolution of the body of knowledge about technology but also influences new theoretical approaches to natural science itself. No study goes without limitations, and this one is no exception: only data extracted from Scopus was used. A similar bibliometric study could be conducted on other digital libraries, for example WoS, ACM, IEEE, and AIS, to understand whether these patterns are consistent across libraries. For future studies, it would be essential to understand the contributions of the various libraries to design science and to what extent design science influences natural science studies related to the most important technological phenomena.

Acknowledgements We gratefully acknowledge financial support from FCT—Fundação para a Ciência e Tecnologia, I.P., Portugal, through national funding research grants: INESC-ID pluriannual (UIDB/50021/2020); JTA PhD grant supported by FCT UI/BD/153587/2022; MA—national funding through research grant UIDB/04152/2020—Centro de Investigação em Gestão de Informação (MagIC), NOVA Information Management School (NOVA IMS), Universidade Nova de Lisboa, Portugal; and CJC—Advance/CSG, ISEG, Universidade de Lisboa, UIDB/04521/20.

References

1. Aparicio M, Bação F, Oliveira T (2014) Trends in the e-learning ecosystem: a bibliometric study. In: AMCIS 2014 proceedings, Twentieth Americas Conference on Information Systems, Savannah, 2014. https://aisel.aisnet.org/amcis2014/Posters/ISEducation/7
2. Aparicio M, Costa CJ, Braga AS (2012) Proposing a system to support crowdsourcing. In: Proceedings of the workshop on open source and design of communication, pp 13–17. https://doi.org/10.1145/2316936.2316940
3. Batista M, Costa CJ, Aparicio M (2013) ERP OS localization framework. In: Proceedings of the workshop on open source and design of communication, pp 1–8. https://doi.org/10.1145/2503848.2503849
4. Bernardino C, Costa CJ, Aparício M (2022) Digital evolution: blockchain field research. In: 2022 17th Iberian Conference on Information Systems and Technologies (CISTI), pp 1–6. https://doi.org/10.23919/CISTI54924.2022.9820035
5. Berndtsson M, Hansson J, Olsson B, Lundell B (2007) Thesis projects: a guide for students in computer science and information systems, 2nd edn. Springer
6. Costa CJ, Aparicio JT (2020) POST-DS: a methodology to boost data science. In: 2020 15th Iberian Conference on Information Systems and Technologies (CISTI). IEEE, pp 1–6. https://doi.org/10.23919/CISTI49556.2020.9140932
7. Costa CJ, Aparicio M, Figueiredo JP (2012) Health portal: an alternative using open source technology. Int J Web Portals (IJWP) 4(4):1–18. https://doi.org/10.4018/jwp.2012100101
8. Dessler D (1999) Constructivism within a positivist social science. Rev Int Stud 25(1):123–137. https://doi.org/10.1017/S0260210599001230
9. Gregor S (2006) The nature of theory in information systems. MIS Quarterly, pp 611–642. https://doi.org/10.2307/25148742
10. Hajishirzi R, Costa CJ, Aparicio M, Romão M (2022) Digital transformation framework: a bibliometric approach. In: Rocha A, Adeli H, Dzemyda G, Moreira F (eds) Information systems and technologies. WorldCIST 2022. Lecture Notes in Networks and Systems, vol 470. Springer, Cham. https://doi.org/10.1007/978-3-031-04829-6_38
11. Hevner A, Chatterjee S (2010) Design research in information systems: theory and practice, vol 22. Springer. http://www.springer.com/business+%26+management/business+information+systems/book/978-1-4419-5652-1
12. Jarvinen P (2000a) Research questions guiding selection of an appropriate research method, pp 124–131. https://aisel.aisnet.org/ecis2000/26
13. Jarvinen P (2000b) On a variety of research output types. In: Svensson L, Snis U, Sorensen C, Fägerlind H, Lindroth T, Magnusson M, Östlund C (eds) Proceedings of IRIS23. Laboratorium for Interaction, University of Trollhättan, Uddevalla, pp 251–265
14. Järvinen P (2007) Action research is similar to design science. Qual Quant 41(1):37–54. https://doi.org/10.1007/s11135-005-5427-1
15. Kuhn S, Muller MJ (1993) Participatory design. Commun ACM 36(6):24–29. https://doi.org/10.1145/153571.255960
16. March ST, Smith GF (1995) Design and natural science research on information technology. Decis Support Syst 15(4):251–266. https://doi.org/10.1016/0167-9236(94)00041-2
17. Nunamaker JF, Chen M, Purdin TDM (1990) Systems development in information systems research. J Manag Inf Syst 7(3):89–106. https://doi.org/10.1080/07421222.1990.11517898
18. Peffers K, Tuunanen T, Rothenberger MA, Chatterjee S (2007) A design science research methodology for information systems research. J Manag Inf Syst 24(3):45–77. https://doi.org/10.2753/MIS0742-1222240302
19. Piteira M, Costa CJ, Aparicio M (2017) A conceptual framework to implement gamification on online courses of computer programming learning: implementation. In: 10th International Conference of Education, Research and Innovation (ICERI2017). IATED Academy, pp 7022–7031. https://doi.org/10.21125/iceri.2017.1865
20. Pritchard A, Wittig G (1981) Bibliometrics: a bibliography and index, vol 1. ALLM Books
21. Saunders MNK, Lewis P, Thornhill A (2019) Research methods for business students, 8th edn. Pearson, New York
22. Sekaran U (2003) Research methods for business: a skill-building approach. John Wiley & Sons
23. Simon HA (1969) The sciences of the artificial. MIT Press
24. Spinuzzi C (2005) The methodology of participatory design. Tech Commun 52(2):163–174

Organizational e-Learning Systems' Success in Industry

Clemens Julius Hannen and Manuela Aparicio

Abstract E-learning is growing: many firms have adopted e-learning since it is individualized and cost-effective. Research on its success criteria, however, is fragmented: most studies explored e-learning systems in higher-education settings, while others focused on specialized features, with conclusions of limited transferability. This qualitative study examines e-learning success in commercial enterprises. It first identifies the most prevalent organizational e-learning success factors and classifies them into four dimensions, from which eight assumptions are modeled. Six e-learning professionals validate the model in a focus group study. All four success dimensions and their variables affect the perception of e-learning success. The resulting customized success model helps e-learning decision-makers maximize corporate e-learning.

Keywords Organizational e-learning systems success · Qualitative study · Industry · Focus group study

1 Introduction In the 1980s, e-learning developed as a subset of remote learning and grew with the internet in the 1990s [17]. It is more personalized and flexible than traditional methods. Businesses have implemented e-learning to meet new expectations of constant information availability and connection. Businesses are big e-learning adopters. The modern workplace includes lifelong learning and training programs. Digital teaching and learning methods offer personalized education in business. C. J. Hannen · M. Aparicio (B) NOVA Information Management School (NOVA IMS), Universidade Nova de Lisboa, Lisbon, Portugal e-mail: [email protected] C. J. Hannen e-mail: [email protected] C. J. Hannen aboDeinauto, Berlin, Germany © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 S. Anwar et al. (eds.), Proceedings of International Conference on Information Technology and Applications, Lecture Notes in Networks and Systems 614, https://doi.org/10.1007/978-981-19-9331-2_36


E-learning became the norm for colleges, institutions, and businesses during the Covid-19 epidemic. It expands past research to answer the following question. How successful is corporate e-learning? Since the research problem is behavioral sciences [21], we used a qualitative method to analyze data and explain e-learning effectiveness in businesses. This model is tested with workplace e-learning users in a focus group study.

2 Theoretical Background and Model Assumptions

Sridharan et al. [25] identified pedagogies, associated technologies, and learning resource management as important. In this domain, the present study considers pedagogical strategy, management factors, technology, individual impact, and management impact. Participants in prior studies also unveiled hurdles to be overcome: technological implementation without a pedagogical strategy, lack of awareness or recognition of e-learning, and inadequate learning strategies [1, 4, 5]. Course educational quality [8] refers to structure, content quality, depth, and flexibility. The pedagogical strategy considers different learning styles, the professor's role as a learning facilitator, and multimedia means. Course interaction and the learning community cover social and cultural interaction among students and teachers [6]. Evaluation [15] includes tracking student successes, giving constructive feedback, and letting students reflect on their learning performance. The most common IT system success factors are system quality, user satisfaction, content, service, and ease of use; other than usability/interface design, all of these can be traced to DeLone and McLean's [11] model. System quality includes reliability and response time, information quality includes accuracy, meaningfulness, and timeliness, and service quality includes receptiveness and consistency [12]. Learner satisfaction depicts the perception of the system, while ease of use depicts the user's engagement level. Human resource factors include giving learners adequate equipment, environment, space, and time, and providing support staff. Creating a learning culture [24] means prioritizing employee competencies, promoting learner interaction, and explaining e-learning to all employees.

The literature review informs the research assumptions, which are then combined into a theoretical success model. The model assumes:
A1. Pedagogical success factors (PSF) positively affect the success of e-learning initiatives. PSF includes instructor traits, course quality, pedagogy, social interaction, and evaluation; Bhuasiri et al. [22], among others, found these to be important for e-learning success.
A2. IT system success factors (ITSF) positively affect the success of e-learning initiatives. ITSF includes system, service, and content quality, learner satisfaction, and usability; Aparicio et al. [2] and Hassanzadeh et al. [18] found these to influence e-learning success.
A3. Learner success factors (LSF) positively affect the success of e-learning initiatives. LSF comprises system loyalty, use, learner characteristics, motivation, and desire to use [7, 10, 13].
A4. Organizational success factors (OSF) positively affect the success of e-learning initiatives. OSF includes institutional and management support, organizational impact, availability of staff, and a learning organizational culture [6, 14].
Furthermore, IT system and pedagogical success factors appear to be linked, and the learner and organizational success domains also appear to be linked. Consequently:
A5a. Pedagogical factors positively influence IT system success factors.
A5b. IT system factors positively influence pedagogical success factors.
Assumptions (5a) and (5b) can be derived from research such as that conducted by Chaikina et al. [9] or by Harper et al. [16].
A6a. Organizational factors positively influence learner success factors.
A6b. Learner factors positively influence organizational success factors.
Assumptions (6a) and (6b) are taken from Procter (2006).

3 Empirical Study in Industry

A focus group study (FGS) validated the model. Table 1 lists the discussion questions. Because a focus group is a form of semi-structured interview, the conversation may move beyond the prepared questions when a topic is found to be pertinent to the research issue. The framework for the exploration was established by the prior assumptions and the research model. The focus group participants (Table 2) all had more than two years of organizational e-learning experience, and our participant selection strategy ensured that they belonged to different organizations and economic activities. We do not report ages or genders because we are not conducting a quantitative study and do not infer their influence on e-learning success. The focus group proceeded as follows: first, participants were phoned and informed about the focus group; after they expressed interest, an invitation and scheduling e-mail with the study details was sent. In a 90-minute session, the interviewees were given a question guideline. Krueger and Casey [20] highlight several characteristics of FGS participant groups: first, groups should be kept between six and eight people; second, participants from various backgrounds make for more interesting conversations; finally, power differentials should be avoided so that no one feels intimidated. To ensure diversity, participants came from different industries; to avoid power imbalances, participants had similar seniority. All participants had prior e-learning experience in the workplace, albeit in different formats. Table 2 describes the focus group participants and the e-learning initiatives they took part in. The focus group interview was held on August 8, 2021, from 17:00 to 18:30 CET, with participants from German companies. As the Covid-19 pandemic limited large group meetings and the participants were spread across countries, it was held virtually via Google Meet. Before the focus group began, everyone gave verbal consent to be interviewed and to the recording of the audio transcript. Everyone was assured that participation was voluntary and that they could leave at any time. Nobody was identified in the interview, which was encrypted with a password. This complied with ethical requirements and protected the confidentiality of the participants.


Table 1 Questions conducted via the focus group (with associated assumptions)

Q1: Do you consider e-learning to be effective in the workplace? (Overarching)
Q2: Do you think the underlying pedagogy of the e-learning initiative affects the learning outcome? (Assumption 1)
Q3: Do you think the e-learning IT system affects the learning outcome? (Assumption 2)
Q4: Do you think the individual characteristics of the learner affect the learning outcome? (Assumption 3)
Q5: Do you think the characteristics of the organization affect the learning outcome? (Assumption 4)
Q6: Do you see a relationship between the underlying pedagogy and the IT system of the e-learning initiative? (Assumptions 5a and 5b)
Q7: Do you see a relationship between the characteristics of the learner and the characteristics of the organization? (Assumptions 6a and 6b)

Table 2 Focus group participants' characterization

Participant   Industry         Exposure to e-learning   Context of e-learning initiatives participated in
P1            Consulting       3 years                  Online soft-skills training for client management
P2            Pharmaceutical   2 years                  Online coding training for data scientists
P3            Technology       4 years                  Virtual classroom to understand competitors' business models
P4            Engineering      2 years                  Online educational training for IoT platforms
P5            Healthcare       4 years                  Virtual classroom to understand legislation
P6            Financial        3 years                  Online financial analysis training

A slide presentation was used to help the moderator keep the focus group on topic and on track. It laid out the theoretical model with its underlying assumptions and detailed how the data gathered from participants would be analyzed. When further investigation or explanation was required, the moderator posed additional questions. The slides helped keep the conversation on track by emphasizing the question at hand, and the "animation" function sequentially blended in the five subdimensions whenever the debate stalled or the participants were unsure how to interpret the success factors. The next section presents the key results and findings.



4 Focus Group Study Results and Discussion

The following are the FGS observations and conclusions. The initial research model is then revised in light of the new data. Results are analyzed and compared with findings from the literature review. For readability, the FGS data are clustered by assumption. As 5a, 5b, 6a, and 6b were covered in a single question, their results have been compiled together. A1: “Pedagogical success factors” (PSF) positively affect the e-learning initiatives’ success. Depending on the complexity, e-learning PSF were valuable or detrimental to learning success. Participants found e-learning effective for technical training. One interviewee declared: We always record the training, so I can rewatch them and rewind the parts until I fully understand the lines of code. This has helped me greatly in the last year. In presence training, however, I often did not feel confident enough to speak up and raise questions as I was one of the youngest of the group.

Pertinence of learning materials was key. When docents connected learning materials to real-world applications, all participants reported enhanced learning and fulfillment. This may involve a case study or the resolution of past employee difficulties using new information or abilities. There was no difference in learning success between online and offline docents. Additionally, the interviewees reported difficulties with more complicated joint work. All respondents agreed that group collaboration led to greater learning results, but that it was more challenging in e-learning than in person. The two most crucial obstacles to overcome were challenging communication and the perception of a lack of accountability due to the absence of face-to-face interaction. Social contact is another PSF that contributes to the success of e-learning programs. Compared to in-person formats, this component was frequently neglected, leading to inferior learning experiences. An interviewee said: When a training session via Zoom is over, you simply hang up and open your mail or another browser tab. After classroom training, on the other hand, you talk to your colleagues about what you have learned and how you can implement it. This has helped my learning progress a lot in the past.

All participants agreed that PSF boost organizational e-learning success. The pedagogy must be adapted to online formats to be effective. Transferring a working in-person pedagogical strategy to e-learning has not worked well for participants. A2: “IT system” success factors positively affect the e-learning initiatives’ success. How well instructors developed and adapted learning materials for each platform was crucial to the IT system success factors (ITSF). During the Covid-19 outbreak, Zoom sessions with someone presenting slides or similar materials became increasingly common, despite being inappropriate for the subject matter. One interviewee shared his point of view: We had one financial modeling training in Excel, where the docent just scrolled through his Excel file via the screen sharing function. While simply presenting this file would have worked well on a big screen, it was merely visible on my laptop screen. The docent should have at least visually accentuated key aspects and formulas, as it was fairly difficult to follow along.



Some participants agreed and emphasized incorporating various technologies that accommodate varied exercise types and content. A respondent said: Depending on whether you want to teach complex topics or simply memorize the new GDPR rules, you should integrate different systems. However, this would come at the expense of scalability, which should also be taken into account.

The docent’s IT proficiency was also key to learning success. Ease of use and interface design was a popular ITSF. Several respondents said a smooth experience with the e-learning interface leads to higher learning success and longer engagement. Website response time and a structured layout promote a good user experience. The quality of support was not acknowledged as a critical aspect of learning; compared to other information systems, e-learning systems make less use of it. All respondents concurred that ITSF impact e-learning success. They advocated selecting the learning method based on the content being taught and tailoring the visualization to the format. When selecting an e-learning IT system, businesses must strike a balance between individual optimization and scalability to avoid incurring exponential expenditures. A3: Learner success factors (LSF) positively affect the success of e-learning initiatives. All participants believe that e-learning benefits and engagement depend on individual attitudes. Motivation, grit, and perseverance are needed. While grit and discipline are learner-specific, a company can affect intrinsic motivation by customizing e-learning courses. A participant said: If I see the value for myself, the barrier to dropping out is significantly higher. In contrast, if the benefits are not aligned with my personal goals, my motivation drops. Sometimes, this might even be offset by simply communicating and rewarding the completion of the e-learning course properly, it does not always have to encompass massive personalization of the latter.

Social influence also affects motivation. Respondents were motivated to continue e-learning when they saw their colleagues making progress. Turning on the camera during videoconferencing training sessions also increased responsibility. Participants found e-learning formats harder to follow than classroom training. One interviewee said: While resilience and intrinsic motivation are of course also important for in-presence learning, they become even more important for e-learning as you can more easily be diverted. It is way easier to lose your focus in the digital context and, therefore, having these characteristics is key to sticking to the course. Although, tailoring the course more precisely to the learner should also help to address this issue.

Participants mentioned that weekly check-ins and group sessions were provided to ensure they finished the online courses offered by the company. In conclusion, the LSF influenced overall learning success. Intrinsic motivation, grit, and discipline are key to e-learning success. As the respondents’ examples show, most of these factors are beyond the organization’s control, but some can be influenced. A4: Organizational factors positively affect e-learning initiatives’ success. Organizational support is an important aspect of the organizational success factors (OSF). One participant said he could not finish an e-learning course due to insufficient manager support.



Despite my interest in pursuing a course from our training library, my supervisor did not free up the required resources for it. In the end, I could not take the time off that would have been required for it. At work, this support from management is even more critical to provide people with the time and freedom from other duties required to successfully participate in e-learning.

Another participant agreed, saying: This has to be promoted top-down and everyone needs to be provided with sufficient time, regardless of whether it is a coding, soft skills, or industry training. If you are supplied with x hours every week, e-learning becomes way more impactful. In the end, I think every person wants to learn more and get better at his or her job. The employer has to provide the time for it.

The participants saw a supportive organizational culture for e-learning as key to long-term e-learning success. As a result, management must allocate sufficient equipment, communicate the benefits, and align e-learning initiatives with measurable organizational objectives. A5a: Pedagogical factors positively influence IT system success factors; A5b: IT system factors positively influence pedagogical success factors. Participants perceived a one-sided connection between the two factors in their work. PSF had a positive impact on ITSF, but ITSF had no effect on PSF. A participant said: A good pedagogy can make up for a poorly visualized presentation or a poor IT system, but unfortunately it does not work the other way around.

Participants advocated for a threshold of IT system quality; otherwise, PSF is harmed. One participant said: About two weeks ago, we had a training where the docent simply could not connect and got kicked out of the meeting every thirty seconds. In the end, we did not learn anything and eight people blocked an hour of their schedule for nothing.

Others agreed but said that, once a minimum level of system quality is given, ITSF can be disregarded. Only PSF positively influenced the participants; ITSF did not influence participants’ perspectives, provided certain system non-functionalities were excluded. A6a: Organizational factors positively influence learner success factors. Participants perceived a strong association between LSF and OSF in their professional experience. This was due to the participants’ perception that personal and organizational goals overlapped. One interviewee pointed out the following viewpoint: There is a strong supporting element between your personal and your organizational goals. This should be leveraged by always having a reference for the e-learning initiatives. If you have applications of the learnings in your daily work, you can drastically increase your learning curve. To provide an example for this, it does not help to simply look at an Excel formula, but it does help to look at an Excel formula and apply it to a problem you are facing in your daily work. Following, you have a success that helps you in your daily work, but also helps your employer in the organizational or project level context.



Bonuses were also mentioned. If learners see that online learning courses improve their job performance and their assessments for future promotions or variable pay, this generates strong encouragement to participate. One interviewee argued that the two factors are flexible and can be influenced: Even if the tasks do not quite match the daily work, you can put them in context as an employer. In my case, as a management consultant, for example, I have to meet certain technical and personal qualifications to move up to the next level. If my employer then tells me, ‘These are the training you need to complete for the next career step,’ then, of course, I’m more motivated to complete them.

Everyone agreed that LSF and OSF should be leveraged jointly in e-learning initiatives. Which elements contribute to the success of corporate e-learning initiatives? A review of the literature identified the most widely accepted e-learning success factors. Key success factors were then categorized into four domains. Through the FGS, organizational e-learning users validated and enriched this research model. This fills two important gaps in e-learning research for organizations. First, it gives decision-makers an actionable overview of success factors. Scholars have taken many approaches to studying e-learning success. Insights from the many university-based studies had to be carefully transferred to the business context [17]. Many scholars only studied specific factors [2, 15, 19]. These studies’ applicability to organizational e-learning was therefore limited. Second, there were too many e-learning studies and success factors to read in a reasonable time. Decision-makers interested in e-learning were overwhelmed with optimization information. The study validated the research model in an FGS. This qualitative validation approach is rare in e-learning academia; the studies [8, 14, 22] used surveys sent to university or business respondents. The FGS results generally match the literature review findings. Participants agreed on the success domains and factors, and all major literature review assumptions 1–4 were approved. This result was expected, as these assumptions were derived from numerous important studies. The FGS validated the findings’ applicability. Decision-makers gain a clear understanding of corporate e-learning success thanks to the success domains and components. Assumptions 5a, 6a, and 6b from the literature were also validated. This indicates cross-relationships and details how to leverage them based on the qualitative results. Surprisingly, participants could not validate assumption 5b. Except for edge cases, interviewees did not see a significant influence. This helps decision-makers design e-learning initiatives: while IT system quality must be ensured, e-learning pedagogy is more important to optimize. By combining this updated model with the literature review’s success factors, the tested model for organizational e-learning success was proposed (Fig. 1). The authors gathered and analyzed the literature and focus group data on identifying e-learning factors. Several barriers to e-learning success were identified, such as matching supply and demand, organizational coordination, Internet technologies, motives, and sustainability. Format and content determine e-learning success. Because that study focused more on hurdles than on success factors, the results must be compared with caution. Both barriers and facilitators are reflected in the research model. Supply, demand, training format, and content are considered by the PSF. The OSF covers organizational coordination and sustainability (see “available (human) resources”). Both ITSF and LSF



Fig. 1 Test results of organizational e-learning assumptions

consider internet technologies. In the long term, this model will enhance knowledge diffusion and provide a research base. This field-tested model helps practitioners improve e-learning. This contribution is especially valuable because the Covid-19 pandemic is expected to increase e-learning adoption and importance.

5 Conclusions

Despite its expanding importance and utilization, academic research has not concentrated much on organizational e-learning. This study helps to close that gap. First, a literature review identified four domains of e-learning success factors. A model for organizational e-learning success was constructed using eight assumptions. Six e-learning professionals tested this model’s assumptions and validated seven of the eight. The success domains and factors all have a substantial impact on e-learning success. Three of the four domain relationships were meaningful; only the IT system’s influence on pedagogy was negligible. PSF, ITSF, LSF, and OSF impact e-learning success, and LSF, OSF, PSF, and ITSF had positive relationships. This is the first unified e-learning



success guide. The proposed methodology helps e-learning decision-makers maximize organizational e-learning projects. The model was well received in the focus group, and some interviewees asked for the study results [3]. Acknowledgements We gratefully acknowledge financial support from FCT—Fundação para a Ciência e Tecnologia, I.P., Portugal, through national funding via research grant UIDB/04152/2020—Centro de Investigação em Gestão de Informação (MagIC), NOVA Information Management School (NOVA IMS), Universidade Nova de Lisboa, Portugal.

References 1. Aparicio M, Bacao F, Oliveira T (2016) Cultural impacts on e-learning systems’ success. Internet Higher Educ 31:58–70 2. Aparicio M, Bacao F, Oliveira T (2017) Grit in the path to e-learning success. Comput Hum Behav 66:388–399 3. Aparicio M, Oliveira T, Bacao F, Painho M (2019) Gamification: a key determinant of massive open online course (MOOC) success. Inf Manag 56(1):39–54 4. Aparicio M, Bacao F (2013) E-learning concept trends. In: Proceedings of the 2013 international conference on information systems and design of communication, pp 81–86. https://doi.org/ 10.1145/2503859.2503872 5. Aparicio M, Bação F, Oliveira T (2014) Trends in the e-learning ecosystem: a bibliometric study. In: AMCIS 2014 proceedings, twentieth Americas conference on information systems, Savannah 6. Basak SK, Wotto M, Bélanger P (2016) A framework on the critical success factors of elearning implementation in higher education: a review of the literature. Int J Educ Pedagog Sci 10(7):2409–2414 7. Beinicke A, Kyndt E (2020) Evidence-based actions for maximising training effectiveness in corporate e-learning and classroom training. Stud Contin Educ 42(2):256–276 8. Bhuasiri W, Xaymoungkhoun O, Zo H, Rho JJ, Ciganek AP (2012) Critical success factors for e-learning in developing countries: a comparative analysis between ICT experts and faculty. Comput Educ 58(2):843–855 9. Chaikina ZV, Shevchenko SM, Mukhina MV, Katkova OV, Kutepova LI (2018) Electronic testing as a tool for optimizing the process of control over the results of educational training activities. Adv Intell Syst Comput 622:194–200 10. Cidral WA, Oliveira T, Di Felice M, Aparicio M (2018) e-learning success determinants: Brazilian empirical study. Comput Educ 122:273–290 11. DeLone WH, McLean ER (1992) Information systems success: the quest for the dependent variable. Inf Syst Res 3(1):60–95 12. Delone WH, McLean ER (2003) The DeLone and McLean model of information systems success: a ten-year update. J Manag Inf Syst 19(4):9–30 13. Eom SB, Ashill NJ (2018) A system’s view of e-learning success model. Decis Sci J Innov Educ 16(1):42–76 14. Freeze RD, Alshare KA, Lane PL, Wen HJ (2019) IS success model in e-learning context based on students’ perceptions. J Inf Syst Educ 21(2):4 15. Govindasamy T (2001) Successful implementation of e-learning: pedagogical considerations. Internet Higher Educ 4(3–4):287–299 16. Harper KC, Chen K, Yen DC (2004) Distance learning, virtual classrooms, and teaching pedagogy in the Internet environment. Technol Soc 26(4):585–598 17. Hassanzadeh A, Kanaani F, Elahi S (2012) A model for measuring e-learning systems success in universities. Expert Syst Appl 39(12):10959–10966



18. Hassanzadeh A, Kanaani F, Elahi S (2012) A model for measuring elearning systems success in universities. Expert Syst Appl 39(12):10959–10966 19. Keramati A, Afshari-Mofrad M, Kamrani A (2011) The role of readiness factors in e-learning outcomes: an empirical study. Comput Educ 57(3):1919–1929 20. Krueger RA, Casey MA (2015) Designing and conducting focus group interviews 21. March ST, Smith GF (1995) Design and natural science research on information technology. Decis Support Syst 15(4):251–266 22. McGill TJ, Klobas JE, Renzi S (2014) Critical success factors for the continuation of e-learning initiatives. Internet Higher Educ 22:24–36 23. Ozkan S, Koseler R (2009) Multi-dimensional students’ evaluation of e-learning systems in the higher education context: an empirical investigation. Comput Educ 53(4):1285–1296 24. Sela E, Sivan Y (2009) Enterprise e-learning success factors: an analysis of practitioners’ perspective (with a downturn addendum). Interdiscip J E-Learn Learn Objects 5(1):335–343 25. Sridharan B, Deng, Corbitt B (2010) Critical success factors in e-learning ecosystems: qualitative study. J Syst Inf Technol 12(4):263–288

Network Security

Smart Pilot Decontamination Strategy for High and Low Contaminated Users in Massive MIMO-5G Network

Khalid Khan, Farhad Banoori, Muhammad Adnan, Rizwan Zahoor, Tarique Khan, Felix Obite, Nobel John William, Arshad Ahmad, Fawad Qayum, and Shah Nazir

K. Khan (B) Beijing University of Posts and Telecommunications, Beijing, China; e-mail: [email protected]
F. Banoori (B) South China University of Technology (SCUT), Guangzhou, China; e-mail: [email protected]
A. Ahmad Institute of Software Systems Engineering, Johannes Kepler University Linz, Linz, Austria
M. Adnan · S. Nazir Department of Computer Science, University of Swabi, Swabi, Pakistan
R. Zahoor University of Campania Luigi Vanvitelli, Caserta, Italy
T. Khan University of Politechnico Delle Marche, Ancona, Italy
F. Qayum Department of Computer Science, IT University of Malakand, Totakan, Pakistan
F. Obite Department of Physics, Ahmadu Bello University, Zaria, Nigeria
N. J. William University of Valencia, Valencia, Spain
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 S. Anwar et al. (eds.), Proceedings of International Conference on Information Technology and Applications, Lecture Notes in Networks and Systems 614, https://doi.org/10.1007/978-981-19-9331-2_37

Abstract Massive multiple-input multiple-output (m-MIMO), also recognized as large-scale MIMO (LS-MIMO), is a crucial podium for fifth-generation communication networks (5G) to reinforce the spectral efficiency (SE) and capacity of the system. Pilot contamination (PC), which arises because non-orthogonal pilot sequences transmitted by users in a cell are reused in nearby cells, is one of the blockages preventing m-MIMO from attaining these essential goals. In the current paper, we propose an intelligent pilot assignment policy to substantially extenuate the potency of PC in the multi-cell m-MIMO system. In the proposed strategy, the entire set of users is divided, based on their large-scale fading, into two groups, i.e., high contaminate KH and low contaminate KL users. Likewise, KH is assigned orthogonal pilots due




to high inter-cell interference (ICI) and to extenuate the extreme PC, whereas non-orthogonal sets of pilots are dedicated to KL. Afterward, severe PC among KH almost vanishes, whereas a slight PC problem still exists among KL, which deteriorates the system efficiency. Thus, a competent approach based on hyper-graph-coloring-based pilot allocation (HCPA) is projected to overcome the PC of KL. The simulation results reveal that the proposed strategy extremely stimulates both the attainable sum-rate and the signal-to-interference-plus-noise ratios (SINRs). Furthermore, the proposed strategy is extremely efficient in suppressing PC and optimizing the entire system efficiency.

Keywords Fifth generation · Massive MIMO · Inter-cell-interference · Pilot contamination · Pilot reuse · Hyper-graph-coloring

1 Introduction

Massive MIMO is an impressive technology platform for fifth-generation networks (5G), as it can significantly improve channel capacity, spectral efficiency, and connection density [1]. m-MIMO is a state-of-the-art multi-user technique in which single-antenna users are served concurrently by a base station (BS) consisting of several antennas within the same time–frequency resource. With a small coherence epoch, identical pilot sequences are mostly allotted to users within various cells to preserve spectrum, which leads to the dilemma of PC [2]. As the number of BS antennas grows large, the system capacity is substantially restricted by PC [3], as depicted in Fig. 1. Consequently, PC has become the core cause of efficiency loss in the m-MIMO system. PC is caused by reusing identical pilot sequences because of limited resources, which restricts the performance of the m-MIMO system for both downlink and uplink. It cannot be diminished by raising the transmit power or the number of BS antennas. Thus, numerous pilot decontamination schemes have been suggested to eradicate the effect of PC. In [4], a smart pilot scheduling scheme is discussed that can noticeably reduce PC, yet it faces large computational complications as the number of terminals

Fig. 1 PC scenario in multi-cell m-MIMO system (legend: effective signal, inter-cell interference)



gets larger. On the other hand, the approach in [5] utilizes the angle-of-arrival (AoA) to discriminate between pilots coming from various users. Nevertheless, it is not always true that every user exhibits a unique AoA in a practical deployment. Moreover, cell sorting and time-shifted pilot forwarding [6] can subdue PC remarkably; however, it suffers from interference among identical pilots from multiple cells in a single group. A time-shifted rule [7] was put forward to send out pilot signals by shifting the pilot position in frames to non-overlapping epochs to overcome PC. Additionally, joint pre-coding among BSs is recommended in [8], which is capable of easing PC; still, further control overheads are required for the interchange between BSs, which leads to lower time–frequency efficiency. Banoori et al. [9, 10] exploited smart antenna methods to adjust the spectral ability of m-MIMO within the TDD architecture. Further, the Hungarian approach [11] and the pilot sequence distribution technique [12] are imposed to tackle the matter of pilot allocation to eliminate PC. Besides, coloring and weighted-graph strategies [13] were also intended to severely extenuate PC. Hence, all the above-mentioned techniques emphasize assigning orthogonal pilots to highly contaminated users. On the other hand, low contaminate users can also create intense interference and degrade the overall system efficacy; for this reason, it is essential to put forward an effective and efficient pilot assignment strategy to remarkably stimulate the entire system performance. The core contribution of this paper is an effective pilot assignment strategy proposed to extenuate PC, which arises due to the reuse of non-orthogonal pilots among the users of different cells. According to the proposed strategy, users are initially sorted into two groups, i.e., high and low contaminate users, on the basis of their large-scale fading. Moreover, the splitting is performed via a clear threshold level. Afterward, orthogonal pilots are assigned to KH, whereas for KL the pilot sequences are allotted on a priority basis through the use of hyper-graph-coloring-based pilot allocation (HCPA) [14]. Hence, the current strategy substantially stimulates the achievable data rate and remarkably boosts the entire system's efficiency. The rest of the paper is structured as follows. The system model is introduced and elaborated in Sect. 2. Section 3 presents the mathematical problem formulation. An efficient pilot assignment strategy is proposed in Sect. 4. The simulation results are addressed in Sect. 5. Finally, Sect. 6 reflects the conclusion of the current work and future directions.

2 System Model

We consider the scenario of an uplink TDD-based m-MIMO system with F cells. Moreover, each cell is served by a central base station (BS) having an M-antenna array and K users such that $M \gg K$, as illustrated in Fig. 2. The channel vector $X_{(l,k),j} \in \mathbb{C}^{M\times 1}$ from the $k$th user in the $l$th cell to the BS in the $j$th cell is modeled as



Fig. 2 Illustration of our proposed system model (uplink/downlink between BS_01 in the lth cell and BS_02 in the jth cell, each with M antennas, and users UE_01, UE_02)

$$X_{jmlk} = \partial_{jmlk}\,\sqrt{\alpha_{jlk}} \qquad (1)$$

where $\partial_{jmlk}$ exhibits the corresponding small-scale fading factor, which is independent and identically distributed (i.i.d.) $\mathcal{CN}(0,1)$ with zero mean and unit variance, whereas $\alpha_{jlk}$ signifies the large-scale fading value, which is considered to change gradually throughout many coherence epochs and is known to all. As a result of the limited pilot resources, $\delta$ pilot sequences $\Psi = [\psi_1, \psi_2, \cdots, \psi_\delta] \in \mathbb{C}^{\tau\times\delta}$, $\psi_i \in \mathbb{C}^{\tau\times 1}$, are re-utilized within neighboring cells, whereas distinct users within an individual cell use unique pilots to overcome severe intra-cell interference, thus assuming that $\Psi^{H}\Psi = I_\delta$. Here, $\tau$ and $(\cdot)^{H}$ indicate the pilot length and the conjugate transpose, respectively. Consequently, the received uplink pilot signal is given as

$$R_j = \sqrt{\rho_p}\,\sum_{l=1}^{F} X_{jl}\,\Psi_l^{T} + N_j \qquad (2)$$

Here, $R_j \in \mathbb{C}^{M\times\tau}$ is the received pilot signal at the BS of the $j$th cell; $X_{j,l} = \left[x_{(l,1),j}, x_{(l,2),j}, \cdots, x_{(l,K),j}\right]$, $1 \le j \le F$, $1 \le l \le F$, illustrates the channel matrix between the $K$ users in the $l$th cell and the $M$ antennas of the BS in the $j$th cell; besides, $\rho_p$ indicates the transmit power and $N_j$ the noise matrix. Thus, the uplink SINR of the $k$th user within the $l$th cell after random pilot assignment is estimated as (3) and (4), where $a_{lk}$ denotes the linear detection vector of the $k$th user within the $l$th cell and $\sum_{j\ne l}\alpha_{ljk}^{2}$ exhibits the pilot contamination that occurs due to re-utilizing identical pilots. Thus, the uplink sum-rate of the $k$th user within the $l$th cell is evaluated as (5):

$$\mathrm{SINR}_{lk}^{up} = \frac{\left|a_{lk}^{H} x_{llk}\right|^{2}}{\sum\limits_{j\ne k}\left|a_{lk}^{H} x_{llj}\right|^{2} + \sum\limits_{i\ne l}\sum\limits_{m=1}^{K}\left|a_{lk}^{H} x_{ilm}\right|^{2} + \left\|a_{lk}\right\|^{2}/p_{ul}} \qquad (3)$$

$$\mathrm{SINR}_{lk}^{up} \;\xrightarrow[M\to\infty]{}\; \frac{\alpha_{llk}^{2}}{\sum\limits_{j\ne l}\alpha_{ljk}^{2}} \qquad (4)$$

$$\xi_{lk}^{up} = (1-\eta)\,\mathbb{E}\!\left[\log_2\!\left(1+\mathrm{SINR}_{lk}^{up}\right)\right] \qquad (5)$$

Here, $\eta$ accounts for the spectral efficiency loss due to uplink pilot transmission; it is defined as the ratio of the pilot sequence length $\tau$ to the channel coherence epoch $T$, i.e., $\eta = \tau/T$.
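To make the relationships in (1), (4), and (5) concrete, the following is a minimal numerical sketch (not the authors' code) that evaluates the asymptotic uplink SINR and the resulting sum-rate under full pilot reuse; the random large-scale fading values, the cell and user counts, and the use of eta for the spectral-efficiency loss are illustrative assumptions.

# Minimal sketch of the asymptotic SINR (4) and sum-rate (5) under full pilot reuse.
import numpy as np

rng = np.random.default_rng(0)
F, K = 7, 6                      # cells and users per cell (Table 1 values)
eta = 0.1                        # spectral-efficiency loss, eta = tau / T

# alpha[j, l, k]: placeholder large-scale fading of user k in cell l seen at BS j
alpha = rng.uniform(0.05, 1.0, size=(F, F, K))

def asymptotic_sinr(l, k):
    """SINR of user k in cell l as M -> infinity when every cell reuses pilot k."""
    signal = alpha[l, l, k] ** 2
    interference = sum(alpha[l, j, k] ** 2 for j in range(F) if j != l)
    return signal / interference

def uplink_rate(l, k):
    """Achievable uplink rate per (5) for user k in cell l."""
    return (1.0 - eta) * np.log2(1.0 + asymptotic_sinr(l, k))

total = sum(uplink_rate(l, k) for l in range(F) for k in range(K))
print(f"total uplink sum-rate under full pilot reuse: {total:.2f} bit/s/Hz")

Under full pilot reuse, every user's SINR is bounded by the ratio in (4) no matter how many antennas the BS deploys, which is exactly the limitation that the proposed grouping strategy targets.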

3 Problem Formulation

The major part of previous research works aims to stimulate the overall uplink sum-rate of the entire KF users in the F cells. Thus, the key objective of this work is to enhance the total uplink sum-rate, with a secure communication feature, of the entire users in the F cells, which can be expressed as

$$\max_{Z_\delta}\;\sum_{l=1}^{F}\sum_{k=1}^{K}\log_2\!\left(1+\frac{\alpha_{llk}^{2}}{\sum\limits_{j\ne l,\,p_{jk}=p_{lk}}\alpha_{ljk}^{2}}\right) \qquad (6)$$

$$\text{s.t.}\quad \log_2\!\left(1+\frac{\alpha_{llk}^{2}}{\sum\limits_{j\ne l,\,p_{jk}=p_{lk}}\alpha_{ljk}^{2}}\right) \ge \gamma_{th},$$

where $\{Z_\delta : \delta = 1, 2, \cdots, K!\}$ implies all feasible $K!$ sorts of pilot assignment approaches, while $\gamma_{th}$ is projected to confine the user's minimum uplink achievable rate. Hence, it is clear from (6) that the overall uplink achievable rate $\xi$ is deeply based on the large-scale fading values and is controlled by PC. Moreover, as also depicted in Figs. 5 and 6, when pilots are allocated traditionally, the surrounded users are exposed to severe PC; hence, neither their SINRs nor their uplink sum-rate improve.
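As a complement, the sketch below (again illustrative only; the helper names and the value of gamma_th are our own assumptions) shows how the objective of (6) can be evaluated for one candidate pilot assignment, counting only co-pilot users as interferers and checking the minimum-rate constraint.

import numpy as np

gamma_th = 0.5                     # assumed minimum-rate threshold (bit/s/Hz)

def rate_with_assignment(alpha, p, l, k):
    """Asymptotic rate of user (l, k): only users re-using pilot p[l][k] interfere."""
    F, _, K = alpha.shape
    signal = alpha[l, l, k] ** 2
    interference = sum(alpha[l, j, kk] ** 2
                       for j in range(F) if j != l
                       for kk in range(K) if p[j][kk] == p[l][k])
    if interference == 0.0:        # orthogonal pilot: no contamination term
        return float("inf")
    return np.log2(1.0 + signal / interference)

def objective(alpha, p):
    """Sum-rate of (6) and feasibility of the per-user constraint."""
    F, _, K = alpha.shape
    rates = [rate_with_assignment(alpha, p, l, k) for l in range(F) for k in range(K)]
    return sum(rates), all(r >= gamma_th for r in rates)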

4 The Proposed Strategy

As declared in [15], the SINR of a particular user is proportional to its large-scale fading value. This noteworthy fact motivated us to split the users into two groups, namely, high and low contaminate users. Thus, the large-scale fading of the K users roaming within the jth cell can be monitored quickly. To begin with the grouping assortment, we initially assumed that all large-scale fading values of the entire users were



known to a centralized control unit (CNU) that manages the pilot allocation procedure [15]. The proposed strategy initiates the grouping process; for this goal we introduce a factor $\omega_{jk}$ which depends on the large-scale fading from the $k$th user in the $j$th cell to the BSs of the other cells. Hence, $\omega_{jk}$ addresses the overall large-scale fading value and is calculated as

$$\omega_{jk} = \sum_{l=1,\,l\ne j}^{F}\alpha_{ljk}\,, \quad \text{for } j = \{1\cdots F\},\; k = \{1\cdots K\} \qquad (7)$$

Then the group assortment is implemented based on a particular criterion: (8) and (9) define the high and low contaminate user groups, respectively. Moreover, the grouping threshold is estimated as in (10), where $\vartheta$ indicates a system configuration parameter that can be managed according to the practical setup. Hence, the users within the $j$th cell after the splitting procedure are given by

$$K_T^{j} = K_H^{j} + K_L^{j} \qquad (11)$$

Here, $K_T^{j}$ represents the overall number of users, $K_H^{j}$ indicates the users facing severe ICI, while $K_L^{j}$ signifies the number of low contaminate users within the $j$th cell. After implementing the grouping within the entire cells, the set of unique pilot sequences is classified into two sub-classes, i.e.,

$$\Psi_T = \Psi_H + \Psi_L \qquad (12)$$

where $\Psi_H$ is the set of distinct pilots that are allotted to the users facing severe ICI. The distinct pilot sequences $\Psi_H$ are assigned to the users that are creating intense interference for the $j$th cell and can be expressed as

$$\Psi_H = \sum_{l=1}^{F} K_H^{l} \qquad (13)$$



4.1 Interpretation of Pilot Assignment for High Contaminate Users

Here, for the F cells, the set of distinct pilots $\Psi_H$ is further classified into F sets for the high contaminate users $K_H$, given as

$$\Psi_H = \left\{\Psi_H^{1}, \Psi_H^{2}, \cdots, \Psi_H^{F}\right\}, \quad \text{for } j = \{1\cdots F\} \qquad (14)$$

$$K_H = \left\{K_H^{1}, K_H^{2}, \cdots, K_H^{F}\right\}, \quad \text{for } j = \{1\cdots F\} \qquad (15)$$

Thus, the BS is now more proficient in estimating the channel for its users facing greater ICI, owing to the distinct pilot allocation. Moreover, the group of intense-ICI users is now released from severe PC.
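The grouping and the orthogonal assignment for the high contaminate users can be sketched as follows. Since the exact forms of (8)-(10) are not reproduced above, the threshold used here (vartheta times the average omega_jk) is only an assumed placeholder; the array layout alpha[l, j, k] follows the earlier sketches.

import numpy as np

def group_users(alpha, vartheta=0.75):
    """Split the users of every cell into high (K_H) and low (K_L) contamination sets."""
    F, _, K = alpha.shape
    # omega[j, k]: total large-scale fading from user k of cell j towards other BSs, Eq. (7)
    omega = np.array([[sum(alpha[l, j, k] for l in range(F) if l != j)
                       for k in range(K)] for j in range(F)])
    threshold = vartheta * omega.mean()            # assumed form of the grouping threshold
    K_H = [[k for k in range(K) if omega[j, k] >= threshold] for j in range(F)]
    K_L = [[k for k in range(K) if omega[j, k] < threshold] for j in range(F)]
    return K_H, K_L

def assign_orthogonal_pilots(K_H):
    """Give every high-contamination user a distinct pilot index, in the spirit of (13)-(15)."""
    pilot, assignment = 0, {}
    for j, users in enumerate(K_H):
        for k in users:
            assignment[(j, k)] = pilot
            pilot += 1
    return assignment

Once every K_H user holds a distinct pilot, only the K_L users can still contaminate one another, which is what the HCPA step in Sect. 4.2 addresses.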

4.2 Interpretation of Pilot Assignment for Low Contaminate Users

To overcome the contamination of the low contaminate users, we put forward the hyper-graph-coloring-based pilot allocation (HCPA) approach to allocate pilots to the users in the low contaminate group. However, if two users in the low contaminate groups of different cells possess identical pilots, then the uplink sum-rate of the $k$th user in the $l$th cell and the $k'$th user in the $j$th cell is evaluated as

$$\xi_{lk}^{up} \propto \log_2\!\left[\left(1+\frac{\alpha_{llk}^{2}}{\alpha_{ljk'}^{2}}\right)\left(1+\frac{\alpha_{jjk'}^{2}}{\alpha_{jlk}^{2}}\right)\right] \qquad (16)$$

From (16) it is clear that the greater $\xi_{lk}^{up}$ is, the less contamination there is between the two users. Hence, we describe $\mu_{lk,jk'}$ to compute the interference potency between the $k$th user in the $l$th cell and the $k'$th user in the $j$th cell, and it is given as

$$\mu_{lk,jk'} = \frac{1}{\left(1+\dfrac{\alpha_{llk}^{2}}{\alpha_{ljk'}^{2}}\right)\left(1+\dfrac{\alpha_{jjk'}^{2}}{\alpha_{jlk}^{2}}\right)} \qquad (17)$$

However, a greater $\mu_{lk,jk'}$ indicates that the two users, i.e., $(l,k)$ and $(j,k')$, should be assigned unique pilot sequences. In addition to that, the least uplink attainable rate among three users having the same pilot in the low interference group in various nearby cells can be estimated as



Fig. 3 Illustration of our proposed strategy flowchart (Input → Initialization → Loop Condition → User Preference → Ascertainment of Accessible Pilot Set → Pilot Selection and Assignment → Upgrade of Assigned User Set → Output)



$$\mu = \min\left\{\log_2\!\left(1+\frac{\alpha_{llk}^{2}}{\alpha_{ljk'}^{2}+\alpha_{lik''}^{2}}\right),\; \log_2\!\left(1+\frac{\alpha_{jjk'}^{2}}{\alpha_{jlk}^{2}+\alpha_{jik''}^{2}}\right),\; \log_2\!\left(1+\frac{\alpha_{iik''}^{2}}{\alpha_{ilk}^{2}+\alpha_{ijk'}^{2}}\right)\right\} \qquad (18)$$

According to this mathematical interpretation, the interference graph is $HG = (K_L, E, W)$, where $K_L$, $E$, and $W$ denote the low interference users, the hyper edges, and the weights of the PC among these users, respectively. Moreover, the hyper graph is further separated into $HG_1 = (K_L, E_1, W)$ and $HG_2 = (K_L, E_2, W)$, where $E_1$ and $E_2$ signify the hyper edges with two and three users, respectively. Thus, the hyper-graph-coloring technique we propose for $K_L$ is elaborated as in Fig. 3. Input: the system parameters $K_L$, $\Psi_L$, $F$, and the built hyper-graph. Initialization: initially, the two users, i.e., $(l_1, k_1)$ and $(i_2, k_2)$, having the largest weighted edge in different cells are picked and allotted unique pilots. Here, we describe $\Omega$ as the set of allotted users and add these two users to the set $\Omega$. Loop Condition: afterwards, all the remaining users are assigned their corresponding pilots consecutively. Furthermore, the loop does not end as long as there are users that have not been allotted pilots. User Preference: in this phase, a priority parameter $\theta_{lk} = \sum_{(l_1,k_1)\in\Omega}\mu_{lk,l_1k_1}$ is imposed, which sums the weights of the edges in $E_1$ associating user $(l, k)$ with users of neighboring cells in the set $\Omega$. Moreover, the user having the largest interference potency with respect to the set $\Omega$ is preferred, i.e., user $(l_o, k_o)$. Ascertainment of Accessible Pilot Set: to overcome intra-cell interference, a pilot is not allowed to be reused within the same cell. Here, we introduce $\Phi_1$ as the set of pilots not yet allocated in the $l_o$th cell. Because of the limitation of the hyper edges in the set $E_2$, the three users of a hyper edge are, as far as possible, not allowed to be assigned identical pilots. For this reason, we go across the entire hyper edges including the nominated user $(l_o, k_o)$.



In case the other two users in a hyper edge have been allocated an identical pilot, that pilot is added to the set $\Phi_2$. Eventually, the set $\Phi_3$ is described as the present pilot set whose pilots belong to $\Phi_1$ and do not belong to $\Phi_2$. Pilot Selection and Assignment: in this phase, the existing pilot set $\Phi_3$ has been decided. We introduce $\gamma_\psi = \sum_{(l,k)\in\Omega,\,p_{lk}=\psi}\mu_{l_ok_o,lk}$ to describe the interference potency between the users holding pilot $\psi$ in the set $\Omega$ and the nominated user $(l_o, k_o)$, considering that this user is allotted pilot $\psi$. Allowing for the particular case that $\Phi_3$ may be an empty set, we treat the two cases correspondingly: in case $\Phi_3$ is not a vacant set, the pilot in $\Phi_3$ with the lowest interference weight or potency $\gamma_\psi$ is chosen to be assigned to the user $(l_o, k_o)$; in case $\Phi_3$ is a vacant set, the pilot in $\Phi_1$ with the lowest $\gamma_\psi$ is nominated to be assigned to the user $(l_o, k_o)$. Upgrade of Assigned User Set: subsequently, once any user is assigned a pilot, it is added to the set $\Omega$. Output: finally, the loop ends once the entire users have been assigned within their respective groups.
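The loop described above can be condensed into the following sketch. It keeps the priority rule theta, the per-cell pilot exclusion (Phi_1) and the lowest-gamma pilot choice, but omits the initialization with the two largest-edge users and the three-user hyper-edges of E_2; the function and set names are placeholders, not the authors' notation, and alpha is assumed to be a strictly positive NumPy array with the layout of the earlier sketches.

def mu(alpha, l, k, j, kp):
    """Interference potency between users (l, k) and (j, kp), Eq. (17)."""
    a = 1.0 + alpha[l, l, k] ** 2 / alpha[l, j, kp] ** 2
    b = 1.0 + alpha[j, j, kp] ** 2 / alpha[j, l, k] ** 2
    return 1.0 / (a * b)

def hcpa_assign(alpha, K_L, pilots_L):
    """Greedy pilot assignment for the low-contamination users of every cell.
    pilots_L is assumed to contain at least as many pilots as the largest per-cell K_L."""
    F = alpha.shape[0]
    users = [(j, k) for j in range(F) for k in K_L[j]]
    assigned = {}                                   # the set Omega of assigned users
    while len(assigned) < len(users):
        # priority theta: accumulated potency towards assigned users in other cells
        def theta(u):
            return sum(mu(alpha, u[0], u[1], v[0], v[1])
                       for v in assigned if v[0] != u[0])
        cand = max((u for u in users if u not in assigned), key=theta)
        j0, k0 = cand
        used_in_cell = {assigned[v] for v in assigned if v[0] == j0}
        free = [p for p in pilots_L if p not in used_in_cell]       # the set Phi_1
        # gamma: potency towards assigned users that already hold a given pilot
        def gamma(p):
            return sum(mu(alpha, j0, k0, v[0], v[1])
                       for v in assigned if assigned[v] == p)
        assigned[cand] = min(free, key=gamma)
    return assigned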

5 Simulation Results

We interpret the efficacy of our proposed strategy using the MATLAB platform, assuming the scenario of a typical hexagonal cellular network of 7 cells. The large-scale fading value is modeled as

$$\beta_{(l,k),j} = \frac{z_{(l,k),j}}{\left(r_{(l,k),j}/R\right)^{\alpha}} \qquad (19)$$

Here, $z_{(l,k),j}$, $r_{(l,k),j}$, $R$, and $\alpha$ indicate the shadow fading, the distance between the $k$th user in the $l$th cell and the BS of the $j$th cell, the cell radius, and the path loss exponent, respectively. Some key and basic simulation parameters are itemized in Table 1. Here, we compare the performance of our proposed strategy with the performance of the conventional, WGC-PD [16], and edge-weighted-interference-graph (EWIG) [17] schemes, as depicted in Fig. 4. It can be clearly observed that the simulation curve of our proposed strategy rises quickly and provides the best achievable data rate level with respect to the conventional, WGC-PD, and EWIG schemes. The simulation assessment of our proposed strategy with respect to the conventional, WGC-PD, and EWIG schemes is presented in Fig. 5. The proposed strategy exhibits better achievable data rate results than the other existing schemes. In addition, our proposed strategy fully concentrates on assigning orthogonal pilots to high and also low contaminate users in an efficient manner so that PC and ICI almost vanish.
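A short sketch of how the large-scale fading coefficients of (19) can be generated with the Table 1 values is given below; the user-drop range and the log-normal form of the shadow fading are our assumptions about the usual setup, not details stated in the paper.

import numpy as np

rng = np.random.default_rng(1)
R, alpha_pl, sigma_sh = 1000.0, 3.0, 8.0   # cell radius (m), path-loss exponent, shadowing (dB)

def large_scale_fading(distance_m):
    """beta = z / (r / R)^alpha with log-normal shadow fading z of 8 dB standard deviation."""
    z = 10.0 ** (sigma_sh * rng.standard_normal() / 10.0)
    return z / (distance_m / R) ** alpha_pl

# example: a user dropped uniformly between 100 m and R from its serving BS
r = rng.uniform(100.0, R)
print(f"r = {r:7.1f} m  ->  beta = {large_scale_fading(r):.4f}")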

Table 1 Simulation parameters
Number of cells (F): 7
Number of users per cell (K): 6
Number of BS antennas (M): 32–512
Number of pilots (δ): 8
System configuration parameter (ϑ): 0.75
Radius of the cell (R): 1000 m
Tx power (ρ): −2 to 20 dB
Shadow fading (σ): 8 dB
Path loss exponent (α): 3
Loss of spectral efficiency (η): 0.1

Fig. 4 Achievable rate against various numbers of BS antennas

The CDF of the uplink SINR of the K users is described in Fig. 6. The proposed strategy efficiently assigns pilots to the high and low contaminate users and effectively provides better uplink SINR values than the other schemes.


Fig. 5 CDF of uplink achievable rate of proposed strategy and existing schemes

Fig. 6 Uplink SINR comparison of proposed strategy with existing schemes




6 Conclusion

We proposed an efficient pilot allocation strategy that aims to remarkably overcome PC and ICI in the m-MIMO system. The proposed strategy initially divides the entire users into two groups, i.e., high and low contaminate users, based on their large-scale fading. Afterwards, the high contaminate users are assigned unique pilot sequences, while an efficient approach based on the HCPA algorithm is applied to KL to effectively overcome the PC and ICI. The simulation results reveal that the proposed strategy remarkably stimulates both the achievable sum-rate and the SINR with respect to other existing schemes such as the conventional, WGC-PD, and EWIG schemes. In the end, this work can be extended by investigating the effects of PC in a multi-cell network and providing a performance analysis based on more advanced statistical and probabilistic concepts. Moreover, from a future work perspective, PC can also be mitigated by employing an irregular cellular network along with heterogeneous UEs with m-MIMO systems.

References 1. Larsson EG, Edfors O, Tufvesson F, Marzetta TL (2014) Massive MIMO for next generation wireless systems. IEEE Commun Mag 52(2):186–195 2. Marzetta TL (2010) Noncooperative cellular wireless with unlimited numbers of base station antennas. IEEE Trans Wirel Commun 9(11):3590–3600 3. Hoydis J, Ten Brink S, Debbah M (2013) Massive MIMO in the UL/DL of cellular networks: how many antennas do we need? IEEE J Sel Areas Commun 31(2):160–171 4. Shi J, Li M, Huang Y et al (2015) Pilot scheduling schemes for multi-cell massive multiple-input multiple-output transmission. IET Commun 9(5):689–700 5. Yin H et al (2013) A coordinated approach to channel estimation in large-scale multiple-antenna systems. IEEE J Sel Areas Commun 31(2):264–273 6. Fernandes F, Ashikhmin A, Marzetta T (2013) Inter-cell interference in noncooperative TDD large scale antenna systems. IEEE J Sel Areas Commun 31(2):192–201 7. Appaiah K, Ashikhmin A, Marzetta TL (2010) Pilot contamination reduction in multi-user TDD systems. In: 2010 IEEE international conference on communications. IEEE 8. Ashikhmin A, Marzetta T (2012) Pilot contamination precoding in multi-cell large scale antenna systems. In: Proceedings of IEEE international symposium on information theory proceedings, pp 1137–1141 9. Banoori F, Shi J, Khan K, Han R, Irfan M (2021) Pilot contamination mitigation under smart pilot allocation strategies within massive MIMO-5G system. Phys Commun 47:101344 10. Khan K, Sun S, Irfan M, Fu M, Banoori F, Alam S, Khan I (2020) An efficient pilot allocation scheme for pilot contamination alleviation in multi-cell massive MIMO systems. In: Signal and information processing, networking and computers. Springer, Singapore, pp 27–36 11. Liy Y, Cheny Y, Huang H, Jing X (2016) On massive MIMO performance with a pilot assignment approach based on Hungarian method. In: 2016 16th international symposium on communications and information technologies (ISCIT), Qingdao, pp 560–564 12. Yan X, Yin H, Xia M, Wei G (2015) Pilot sequences allocation in TDD massive MIMO systems. In: 2015 IEEE wireless communications and networking conference (WCNC), New Orleans, LA, pp 1488–1493 13. Xudong Z, Dai L, Chang Z (2015) Graph coloring based pilot allocation to mitigate pilot contamination for multi-cell massive MIMO systems. IEEE Commun Lett 19(10)



14. Lian Y, Zhang T, Wang Y (2019) Hypergraph-coloring-based pilot allocation algorithm for massive MIMO systems. In: Liang Q, Liu X, Na Z, Wang W, Mu J, Zhang B (eds) Communications, signal processing, and systems, CSPS 2018. Lecture notes in electrical engineering, vol 515. Springer 15. Zhu X, Wang Z, Dai L, Qian C (2015) Smart pilot assignment for massive MIMO. IEEE Commun Lett 19(9):1644–1647 16. Zhu X, Dai L, Wang Z, Wang X (2017) Weighted-graph-coloring-based pilot decontamination for multicell massive MIMO systems. IEEE Trans Veh Technol 66(3):2829–2834 17. Khan A, Irfan M, Ullah Y, Ahmad S, Ullah S, Hayat B (2019) Pilot contamination mitigation for high and low interference users in multi-cell massive MIMO systems. In: 2019 15th international conference on emerging technologies (ICET), Peshawar, Pakistan, pp 1–5

Cluster-Based Interference-Aware TDMA Scheduling in Wireless Sensor Networks Gohar Ali

Abstract Existing work on Time Division Multiple Access (TDMA) scheduling in Wireless Sensor Networks (WSNs) tries to minimize the schedule length. The problem with previous scheduling algorithms is that they assume that if a time slot is not reused in a 2- or 3-hop neighborhood, then a transmission in this time slot will be received correctly. However, they ignore the Signal-to-Noise-plus-Interference Ratio (SNIR) criterion for evaluating the reception of a transmission. Thus, in this paper, we provide an interference-aware TDMA scheduling algorithm for reducing the schedule length. The objective is to improve the delay and throughput by scheduling intra-cluster and inter-cluster non-interfering transmissions in the same slot with two TDMA scheduling algorithms. Simulation results show that the performance of our proposed scheme is better than the previous scheme. Keywords Cluster-based · Interference-aware · TDMA · Scheduling · Wireless sensor networks

G. Ali (B) Department of Information Systems and Technology, Sur University College, Sur, Oman; e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 S. Anwar et al. (eds.), Proceedings of International Conference on Information Technology and Applications, Lecture Notes in Networks and Systems 614, https://doi.org/10.1007/978-981-19-9331-2_38

1 Introduction

Wireless Sensor Networks (WSNs) are an emerging communication infrastructure for many applications, such as emergency response, industrial process monitoring and control, patient monitoring, fire monitoring, and structural health monitoring and control. In such systems, the sensed data must be sent to the monitoring station without delay so that the appropriate action can be taken [1–5]. For this purpose, the TDMA (Time Division Multiple Access) protocol in WSNs reduces data retransmission, as this protocol avoids collisions by allowing different nodes to access the shared medium without interfering with each other. In wireless networks, MAC designs can be divided into contention-based schemes and time-division multiple access (TDMA) schemes. In contention-based MAC, the distributed and random nature of back-off makes it difficult to provide deterministic channel




access guarantees and no guarantees on delay, while in TDMA-based MAC a bounded and predictable medium access delay can be determined through time slot scheduling [6, 7]. To transfer data over multiple hops from source to destination, a number of TDMA scheduling methods have been presented [8–10, 12]. Two centralized TDMA scheduling algorithms, one node-based and the other level-based, were proposed in [8]. Base stations assign slots to nodes using both algorithms while taking interference into account. These algorithms try to discover the fewest possible time slots in a TDMA schedule. For multi-hop intra- and inter-cluster TDMA scheduling, a conflict-free approach is presented in [9, 10]. The technique distributes each node among 3-hop neighbors in order to prevent interference and hence reduce delay, increase throughput, and save energy. Similarly, some TDMA scheduling work assumes knowledge of 2-hop neighbors to prevent interference. In order to prevent interference, the described strategies assign slots to nodes that are 2 or 3 hops away, under the assumption that time slots may be reused outside the 2-hop or 3-hop neighborhood if they are not utilized there. These schemes, however, do not take into account the concurrent transmissions of all network nodes. This is because cumulative interference causes simultaneous transmissions from nodes that are n hops apart to interfere with each other. Therefore, instead of using a 2-hop or 3-hop criterion, we use the SNIR (signal-to-noise-plus-interference ratio) interference model [5, 6]. In this model, a set of transmissions is interference-free if the SNIR at all receivers exceeds a threshold. In this paper, we propose an interference-aware TDMA scheduling algorithm for reducing the TDMA schedule length in WSNs. The aim is to improve delay and throughput by scheduling intra-cluster and inter-cluster non-interfering transmissions in the same slot with two TDMA scheduling algorithms. The rest of the paper is organized as follows. This section defines the system model. In Sect. 2, we explain the proposed scheme. Section 3 provides the performance evaluation and Sect. 4 concludes the paper.
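The SINR-based feasibility test that replaces the 2-/3-hop rule can be sketched as follows; the link gains, noise power, and threshold in the toy example are purely illustrative assumptions.

def sinr_feasible(links, gain, noise, threshold):
    """links: list of (tx, rx) pairs scheduled in the same slot.
    gain[(tx, rx)]: received power at rx from tx; noise: noise power at a receiver."""
    for tx, rx in links:
        signal = gain[(tx, rx)]
        # cumulative interference from every other concurrent sender
        interference = sum(gain[(other, rx)] for other, _ in links if other != tx)
        if signal / (noise + interference) < threshold:
            return False
    return True

# toy usage: two concurrent links
gain = {("a", "b"): 1.0, ("c", "d"): 1.0, ("a", "d"): 0.02, ("c", "b"): 0.03}
print(sinr_feasible([("a", "b"), ("c", "d")], gain, noise=0.01, threshold=10.0))

A slot is accepted for a set of transmissions only when this test passes at every receiver, so cumulative interference from any number of concurrent senders is accounted for.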

2 Proposed Methodology

The proposed scheme consists of three parts. In the first part, a clustering algorithm forms non-overlapping clusters to avoid interference. The details of the interference-aware intra-cluster scheduling algorithm are given in the second part. The interference-aware inter-cluster scheduling algorithm is detailed in the third part. We divide the TDMA time slots into two types: intra-cluster and inter-cluster slots. In intra-cluster slots, member nodes send transmissions to their cluster heads. Cluster heads send their transmissions in inter-cluster slots. As a cluster head uses high transmission power for transmission with neighboring cluster heads, the inter-cluster communication interferes with other nodes in different clusters. Therefore, we use separate time slots for inter-cluster and intra-cluster data transmission to avoid both kinds of interference (Fig. 1).



Fig. 1 Proposed scheme

2.1 Clustering Algorithm

In this part, sensor nodes create two-hop clusters autonomously and the cluster head (CH) is selected in a distributed fashion. Two-hop clustering means that the members of a CH are its one-hop and two-hop neighbor nodes. Our scheme uses a distributed clustering algorithm, named the interference-aware clustering algorithm, to create non-overlapping clusters of up to 2 hops. The weight on which the CH is selected is defined as

weight(u) < weight(v) if Iu < Iv, or Iu = Iv and id(u) < id(v),

where Iu and Iv denote the total interference within two hops of sensor nodes u and v, respectively, and id(u) and id(v) denote the ID numbers of sensor nodes u and v, respectively. Thus, the sensor node with the minimum total interference within 2 hops should be selected as the cluster head. After CH selection, each cluster head broadcasts a message to its 2-hop nodes and those nodes join that cluster. The interference-aware clustering algorithm is described below (Fig. 2).

Fig. 2 The interference-aware clustering

Interference-Aware Clustering Algorithm
/* Input: G(V, E) */
/* Output: interference-aware clusters */
if (Vi = non-candidate)
    broadcast HELLO
    record interfering nodes
    broadcast recorded interference within 2-hop
endif
calculate Vi.value            /* total interference within 2-hop */
broadcast Vi.value within 2-hop
if Vi.value < Vj.value for all Vj within 2-hop
    Vi = candidate
else if Vi.value = Vj.value and id(Vi) < id(Vj)
    Vi = candidate
endif
if candidate Vi receives votes from all nodes within 2-hop
    Vi = cluster-head (CH)
    Vi broadcasts a join message within 2-hop
endif
exit



The above algorithm runs independently at each node Vi. Each node broadcasts a HELLO message, which contains the sensor node ID and its weight. Upon receiving the HELLO messages, the node with the minimum weight within 2 hops becomes a candidate for CH. When a candidate CH receives the votes from the non-candidate nodes, it is selected as a real CH. Finally, every CH broadcasts a CH notification message within 2 hops, and the non-CH nodes within 2 hops join that CH. In essence, the design guidelines of our algorithm are: (i) provide non-overlapping clusters to reduce secondary interference; (ii) create the clusters within 2 hops to reduce primary interference. Moreover, all the work is done in a fully distributed manner.
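For illustration, the CH election rule (smaller total 2-hop interference wins, ties broken by smaller node id) amounts to taking a minimum under the following ordering; the dictionary layout is an assumed representation, not part of the protocol.

def weight_key(node):
    """Ordering key: a node is a better CH candidate if its total interference
    within 2 hops is smaller, with the node id as tie-breaker."""
    return (node["interference_2hop"], node["id"])

def elect_cluster_head(two_hop_neighbourhood):
    """Return the node that every node in the 2-hop neighbourhood would vote for."""
    return min(two_hop_neighbourhood, key=weight_key)

nodes = [{"id": 1, "interference_2hop": 7},
         {"id": 2, "interference_2hop": 4},
         {"id": 3, "interference_2hop": 4}]
print(elect_cluster_head(nodes))   # id 2: same interference as id 3, but smaller id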

2.2 Interference-Aware Intra-cluster Transmission Scheduling

In this part, each CH schedules the transmissions in non-conflicting time slots. The algorithm is divided into two parts. In the first part, we color the nodes by using a greedy coloring algorithm. In the second part, we schedule the transmissions based on this coloring.

2.2.1 Coloring Algorithm

The coloring algorithm is run on each node. The node with the highest weight among its neighbors is colored first. The weight on which a node is colored is defined as (Fig. 3)

weight(u) > weight(v) if Iu > Iv, or Iu = Iv and id(u) > id(v)

2.2.2 Scheduling Algorithm

In the second part, the intra-cluster transmissions are scheduled based on the colors. Since two or more transmissions with the same color are scheduled in the same slot, the total number of slots used for intra-cluster transmission is at most equal to the total number of colors used.



Distributed coloring algorithm
/* Input: Ni coloring */
/* Output: assigning different colors to interfering nodes */
begin
    broadcast (hello, id)
    receive acknowledgements from neighbor nodes j ∈ Ni
    calculate weight(Ni)
    neighbor nodes send their weight and the highest weight of their neighbors to Ni
    if weight(Ni) > weight(Nj)
        Ni is a candidate to be colored
    else if weight(Nk) > weight(Nj) ∀ k, j ∈ Ni
        Nk is a candidate to be colored
    endif
    if all neighbors vote for the candidate node
        the candidate node is colored
        the colored node broadcasts its color to its neighbor nodes
        neighbor nodes update their info
    endif
exit

Fig. 3 Coloring algorithm
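A compact sketch of how the coloring and the slot mapping of Sect. 2.2 fit together is shown below: nodes are colored in decreasing weight order, each takes the smallest color not used by a conflicting neighbor, and every color then corresponds to one intra-cluster slot. The conflict graph and weights are illustrative inputs, not values from the paper.

def greedy_coloring(nodes, conflicts, weight):
    """nodes: iterable of node ids; conflicts: dict id -> set of conflicting ids;
    weight: dict id -> (interference, id) ordering as used by the coloring rule."""
    color = {}
    for n in sorted(nodes, key=lambda x: weight[x], reverse=True):
        used = {color[m] for m in conflicts[n] if m in color}
        c = 0
        while c in used:          # smallest color not used by a conflicting neighbor
            c += 1
        color[n] = c
    return color                  # the intra-cluster slot of node n is simply color[n]

conflicts = {1: {2}, 2: {1, 3}, 3: {2}}
weight = {1: (5, 1), 2: (9, 2), 3: (4, 3)}
print(greedy_coloring([1, 2, 3], conflicts, weight))   # {2: 0, 1: 1, 3: 1}

Nodes 1 and 3 receive the same color because they do not conflict, so they share one slot, while node 2 gets its own slot.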

2.3 Interference-Aware Inter-cluster Transmission Scheduling

In this part, a cluster head transmits data to another cluster head and finally to the sink using a shortest-path algorithm. Since there is no direct communication link between nodes in different clusters, the routing path consists of inter-cluster communication among cluster heads. As a cluster head uses high transmission power for transmission with neighboring cluster heads, the inter-cluster communication interferes with other nodes in different clusters. Therefore, separate time slots are used for inter-cluster and intra-cluster data transmission to avoid both kinds of interference. As the routing path to the sink contains the cluster heads, we use the sub-graph which consists of the cluster heads and the interference edges among them. Thus, to avoid inter-cluster interference, each cluster head is colored using the above coloring algorithm and then the cluster heads are scheduled using the following inter-cluster scheduling algorithm (Figs. 4 and 5).

3 Performance Evaluation

We use the GENSEN [13] tool to perform the simulation of the algorithm. The nodes are placed randomly in an area of 100 m × 100 m. We compare the delay and throughput of our scheme with those of the GCF scheme [10]. Both the intra-cluster and inter-cluster delay, and the

Fig. 4 Intra-cluster scheduling


Intra-cluster scheduling algorithm
/* Input: (GC, F); GC = (VC, EC): color graph of cluster c */
/* Output: interference-aware intra-cluster schedule */
begin
    for each ci ∈ GC do
        for each (Fj, πj) ∈ ci do
            Fk ← Fj using the ci header as destination
            πk ← πj, a path from the source to the header
            rk ← 0
            Fc ← Fc ∪ {(rk, Fk, πk)}
        endfor
    endfor
    sort Fc by flow number in increasing order
    for i from 1 to |Fc| do
        for each (Fk, πk, rk) ∈ Fc do
            frame ← rk
            for j from 1 to |πk| do
                do
                    schedulable ← true
                    if the transmission interferes with transmissions already scheduled in this frame then
                        schedulable ← false
                    if schedulable = false then frame ← frame + 1
                while (schedulable = false)
                frame ← frame + 1
            endfor
        endfor
    endfor
return T

throughput will be analyzed, respectively. The GCF scheme is divided into two parts. In the first part, the intra-cluster transmissions find conflict-free slots across 3-hop neighbors. In the second part, for inter-cluster transmission, each cluster is considered as one node and is assigned a conflict-free slot across 3-hop neighbors using the same algorithm as in the first part. Figure 6 shows a comparison of the average delay of the two schemes. Our scheme achieves a lower delay than the GCF scheme. This is because in GCF scheduling, each node within a 3-hop neighborhood is assigned a separate slot, while in our scheme, most of the time, nodes that are 2 hops apart can be scheduled in the same slot, which reduces the delay. Figure 7 shows the average throughput of both schemes. We define throughput as the amount of data transmitted from a source node to a destination node per time slot. Our scheduler improves throughput because more transmissions are scheduled per time slot across 2 hops, while the GCF scheduler assigns time slots to all nodes across 3 hops, hence reducing throughput. Figure 8 shows the comparative result of the number of nodes versus the number of colors. In our proposed scheme, when a node is selected for



Inter-cluster scheduling algorithm
/* Input: (GCH, F); GCH = (V, E), where V consists of all CHs and the sink, and E are the edges connecting the CHs to each other and to the sink */
/* Output: interference-aware inter-cluster schedule */
FC ← ∅
for each flow Fi ∈ F do
    Fk ← Fi
    srck ← the source cluster head of Fi
    πk ← a path from srck to the sink node
    rk ← frame
    FC ← FC ∪ {(rk, Fk, πk)}
endfor
sort FC by flow in increasing order
for i from 1 to |FC| do
    for each (Fk, πk, rk) ∈ FC do
        frame ← rk
        for j from 1 to |πk| do
            ωuv ← the j-th edge in πk
            do
                schedulable ← true
                if ωuv interferes with transmissions already scheduled in this frame then
                    schedulable ← false
                if schedulable = false then frame ← frame + 1
            while (schedulable = false)
            frame ← frame + 1
        endfor
    endfor
endfor
return T

Fig. 5 Inter-cluster scheduling

a color, then all non-interfering uncolored nodes within 2 hops get the same color, while in the GCF scheme, after the selection of a node for a color, only nodes beyond 3 hops can get the same color, so it requires more colors than our scheme, which affects both the average delay and the throughput, respectively.



Fig. 6 The average delay

Fig. 7 Average throughput per timeslot

4 Conclusion

In this paper, we consider the interference in cluster-based wireless sensor networks. The previous scheme considers only 2-hop and 3-hop interference, ignoring the SNIR criterion. Compared to the previous scheme, the proposed scheme considers the



Fig. 8 The average color used

generic interference instead of using 2-hop or 3-hop interference. Simulation results show that the proposed scheme performs better than the previous scheme. Acknowledgements The authors would like to thank Sur University College (Sur, Sultanate of Oman) for their support and sponsorship of this research.

References
1. Saifullah A, Xu Y, Lu C, Chen Y (2011) End-to-end delay analysis for fixed priority scheduling in WirelessHART networks. In: Proceedings of the 2011 17th IEEE real-time and embedded technology and applications symposium, pp 13–22
2. Saifullah A, Xu Y, Lu C, Chen Y (2010) Real-time scheduling for WirelessHART networks. In: Proceedings of real-time systems symposium (RTSS)
3. Chipara O, Lu C, Roman G (2007) Real-time query scheduling for wireless sensor networks. In: Proceedings of real-time systems symposium
4. Suriyachai P, Brown J, Roedig U (2010) Time-critical data delivery in wireless sensor networks. In: Proceedings of the 6th IEEE international conference on distributed computing in sensor systems, Santa Barbara, USA, pp 216–229
5. Zhang H, Soldati P, Johansson M (2009) Optimal link scheduling and channel assignment for convergecast in linear WirelessHART networks. In: Proceedings of WiOPT
6. Francesco MD, Pinotti CM, Das SK (2012) Interference-free scheduling with bounded delay in cluster-tree wireless sensor networks. In: Proceedings of MSWiM'12, Paphos, Cyprus
7. Chipara O, Wu C, Lu C, Griswold W (2011) Interference-aware real-time flow scheduling for wireless sensor networks. In: Proceedings of 23rd Euromicro conference on real-time systems (ECRTS)
8. Kanodia V, Li C, Sabharwal A, Sadeghi B, Knightly E (2001) Distributed multi-hop scheduling and medium access with delay and throughput constraints. In: Proceedings of the 7th annual international conference on mobile computing and networking, Rome, Italy, pp 200–209
9. Bui BD, Pellizzoni R, Caccamo M, Cheah CF, Tzakis A (2007) Soft real-time chains for multi-hop wireless ad-hoc networks. In: Proceedings of 13th IEEE real time and embedded technology and applications symposium, pp 69–80
10. He T, Stankovic JA, Lu C, Abdelzaher T (2003) SPEED: a stateless protocol for real-time communication in sensor networks. In: Proceedings of 23rd international conference on distributed computing systems, pp 46–55
11. Felemban E, Lee C-G, Ekici E (2006) MMSPEED: multipath multi-SPEED protocol for QoS guarantee of reliability and timeliness in wireless sensor networks. IEEE Trans Mob Comput 5(6):738–754
12. Jurcik P, Koubaa A, Severino R, Alves M, Tovar E (2010) Dimensioning and worst-case analysis of cluster-tree sensor networks. ACM Trans Sensor Netw 7(2):14:1–14:47
13. Camilo T, Silva JS, Rodrigues A, Boavida F (2007) GENSEN: a topology generator for real wireless sensor networks deployment. In: Obermaisser R, Nah Y, Puschner P, Rammig FJ (eds) Software technologies for embedded and ubiquitous systems. SEUS 2007. Lecture notes in computer science, vol 4761. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-75664-4_46

Messaging Application Using Bluetooth Low Energy Nikhil Venkat Kumsetty, Sarvesh V. Sawant, and Bhawana Rudra

Abstract Bluetooth Low Energy is a cutting-edge technology that consumes far less energy than Classical Bluetooth, especially for low-data transmission tasks. It is of interest to various IoT applications due to its low maintenance and high self-operability. Most messaging apps in the industry consume a lot of energy and storage on their devices and require a diverse set of resources to operate. This can be a problem when dealing with constant data transmission in smart devices and machines, especially in the present age, where energy and storage are valuable commodities. We propose a method to build a messaging interface that can be used to communicate among various IoT devices with the help of Bluetooth Low Energy, which consumes significantly less energy and storage space. The messaging interface is designed and developed as per the official Bluetooth Low Energy documentation and guidelines. It connects and communicates with an IoT device possessing the Bluetooth Low Energy feature through Chromium-based web browsers. This approach only takes up a fraction of the resources required by traditional data communication techniques. However, there is a minimum resource requirement regarding RAM, storage, and CPU usage that affects the performance of the proposed scheme. Keywords Bluetooth low energy · Web bluetooth · Classical bluetooth

N. V. Kumsetty (B) · S. V. Sawant · B. Rudra National Institute of Technology Karnataka, Surathkal, India e-mail: [email protected] S. V. Sawant e-mail: [email protected] B. Rudra e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 S. Anwar et al. (eds.), Proceedings of International Conference on Information Technology and Applications, Lecture Notes in Networks and Systems 614, https://doi.org/10.1007/978-981-19-9331-2_39




1 Introduction In recent times, there has been a surge in the number of smart devices and applications that leverage the technical capabilities of the Internet of Things, specifically focused on data gathering, filtering, and analysis. Users are provided with the observations made by the devices they use to either better facilitate or automate a particular task. This also created a unique opportunity to improve upon existing methods by reducing energy consumption, and Bluetooth Low Energy provides one such opportunity. Traditional Bluetooth technology, commonly known as "Classical Bluetooth", consumes a lot of energy when used for connecting and communicating with another Bluetooth device over a long period. To prevent this waste of energy, a novel concept called "Bluetooth Low Energy" was introduced, which aims to reduce energy consumption while maintaining connections and communicating with a diverse set of Bluetooth devices. The existing methods that utilize Ethernet, IP payload, and Classical Bluetooth communication techniques tend to consume a lot of energy and storage space. When dealing with smart devices and machines that require constant data processing and updating, the traditional methods will either be economically unfeasible or lead to limitations in storage space. This can be remedied by introducing Bluetooth Low Energy technology to develop a robust messaging interface that transmits data seamlessly. Bluetooth Low Energy, commonly referred to as "BLE", is a wireless mode of communication used in a personal area network (PAN). BLE has similar range and frequency features to Classical Bluetooth technology. A BLE device is defined by the profiles that describe the functionality and features of the device or application, and it can implement multiple profiles depending on its hardware and software capabilities. The profiles are constructed mainly with the help of GATT (Generic Attribute Profile) and GAP (Generic Access Profile), which are responsible for transmitting data to and from the BLE device in the form of small data packets and for deciding the structure or format of the data that is sent to or received from the BLE device. This paper deals with developing a system interface for the BLE messaging interface, which can maintain a wireless connection and communicate data to other IoT devices. We have utilized Bluetooth technology, specifically Bluetooth Low Energy (BLE), to communicate wirelessly, as it consumes significantly less energy to maintain regular communication between two or more devices. The work implemented in the proposed method is limited to Chromium-based browser IoT devices only; however, it can be extended to non-Chromium-based browser IoT devices in the future.



2 Literature Survey For this model, we started by looking at the documentation and recent developments in Bluetooth Low Energy and Web Bluetooth. We also refer to the well-known texts [1–4] detailing the development of Bluetooth in IoT, and we analyzed various papers published on Bluetooth Low Energy, the Web Bluetooth API, and their applications in the field of IoT in particular. In [5], Pramukantoro et al. implemented a wireless communication device as part of an IoT-based heart monitoring system with the help of BLE. In this particular application, they used the BLE frameworks Core Bluetooth and PyGATT to receive continuous, uninterrupted data from the heart sensors administered to the patient. In [6], Bardoutsos et al. implemented a web-based application to analyze human motion and gestures using an IoT motion sensor. They store the data in a storage platform and annotate the collected data to create a dataset; further, they exploit the Web Bluetooth API capabilities to develop a browser-based real-time data collection, storage, and annotation tool. In [7], Molina et al. experimented with the capture effect, which involves the collision of data packets during communication between multiple BLE devices, both peripheral and central. This paper helped us understand the various methods and precautions required to avoid data packet loss and the consequent loss of information; they propose a mechanism to verify the presence of all data packets received from the sender. In [8], Zhang et al. designed and implemented a BLE security scan framework to identify BLE devices that do not implement encryption or authentication at the application layer. Taint analysis is used to track whether BLE devices use nonces and cryptographic keys, which are critical to cryptographic protocols. In [9], Pang et al. experimented with the interaction of a varying number of BLE devices with their interface and performed various investigations of its channel hopping features. The authors highlighted the deficiencies of their interface, which were mainly observed in its reliability; even though they proposed a mechanism to deal with this issue using a channel selection algorithm, it was not very effective at mitigating the reliability problem. In [10], Madana et al. discuss the advancements in IoT technologies and their applications in smart cities and corporate environments. They proposed an innovative boarding pass system to track passengers in airports with the help of Bluetooth Low Energy, performed a detailed analysis of the security vulnerabilities of their system, and proposed some improvements to diminish these vulnerabilities. In [11], Bai et al. presented an indoor positioning system to monitor the activity of a disabled or debilitated person. This was done by installing sensors at various sites in the home environment; the system tracks, records, and analyzes the activity of that person with the help of a BLE beacon worn by the disabled person. The sensors track the raw Received Signal Strength Indicator of the BLE beacon and determine the location of the person. The authors proposed two methods, a fingerprinting-based method and a trilateration-based method, to accurately determine



the indoor location of the user. In [12], Bulić et al. performed a detailed study on the energy consumption of BLE devices and presented their observations on the data throughput of various BLE versions and on the effect of parameters such as the connection interval and power consumption on the throughput. The analysis provides insight into how connection intervals can be modified to increase the throughput of a BLE device when performing various transactions such as reading, writing, and notification. It also helps in understanding the effect of data size on the amount of power consumed by a BLE device.

3 Methodology This section explains the design of the messaging interface that allows users to communicate at close range without needing a mobile network or an internet connection. It was implemented using Bluetooth Low Energy (BLE). To assess its feasibility, the following set of operations was performed:
• Establish a secure connection between a BLE device and a web application running on a Chromium-based web browser.
• Send data from the web application over the secured BLE connection to the BLE device.
• Show the status of the connected device in the web application (e.g., discovering, connected, pairing, disconnected).
• Perform the connection and data communication operations from the web application to BLE devices in parallel for many devices at once (a reasonable limit on the maximum number of devices is acceptable given the device and web application constraints).
• Devise test strategies and methods to test the engineering-level communication performance of multiple connections, data integrity, and the maximum payload of exchanged data.
Figure 1 represents the sequence of actions in the proposed BLE messaging application. The user initially scans for a BLE device with the help of a service, name, or MAC address. If the scanner finds the device, the application connects, pairs, and subscribes to notifications from the BLE device to access the services it provides. The BLE central device can utilize the service it needs and either disconnect from the connected device or connect to another BLE device by initiating the scanner again. A detailed discussion of the system development, covering the development environment, the functional and non-functional requirements, and the use case scenario, follows.

3.1 Development Environment The messaging interface was developed using the Visual Studio Professional 2019 IDE along with ASP.NET, web development workload, and JavaScript extensions.



Fig. 1 Flowchart of BLE Messaging interface

The application was tried and tested on Windows, Linux, and macOS operating systems for desktops and the Android operating system for mobile phones, and the observed performance was good. The application extensively relies on asynchronous operation handling.

3.2 Functionality of the Messaging Interface The user should be able to establish a secure connection between a BLE device and a web application running on a Chromium-based web browser, and to send data from the web application to the BLE device over that secured connection. The application should show the status of the connected devices, such as discovering, connected, pairing, and disconnected. It should also be able to establish connections and perform data communication operations from the messaging interface to several paired peripheral BLE devices simultaneously, and it should ensure secure transmission and reception of data packets using various cryptographic algorithms. Finally, test strategies and methods should be devised to assess the engineering-level communication performance of multiple connections, data integrity, and the maximum payload of exchanged data.



3.3 Events of the Messaging Application The following are the event types for the messaging application:
• Connection—Connection of the BLE devices.
• Device Management—Device service management, device characteristic management, parallel connection of BLE devices, device status management.
• Communication Initiation—Transmit and receive data by encrypting and decrypting it while communicating with a BLE device.
• Security Configuration Management—Data packet tracking and management, retrieval and management of BLE device Bluetooth credentials.

3.4 Phases of the BLE Device In this section, we discuss the different phases of the BLE device.

3.4.1 BLE Device Connection Phase

The application is a central device that scans for advertisements broadcast by peripheral devices using BLE technology. If the central happens to be listening on an advertising channel that the peripheral is advertising on, the central device discovers the peripheral. It is then able to read the advertisement packet and all the necessary information to establish a connection. The central then sends a CONNECT_IND packet, also known as a connection request packet. The peripheral always listens for a short interval on the same advertising channel after it sends out the advertising packet. This allows it to receive the connection request packet from the central device, which initiates a connection between the two devices. A connection is considered established once the device receives a packet from its peer device, making the central device the master and the peripheral device the slave. The master is responsible for managing, controlling, and handling events in a connection.
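The authors' interface drives this sequence through the Web Bluetooth API inside a Chromium-based browser; purely as an illustration of the central-side scan, connect, and notify steps described above, the sketch below uses the Python bleak library instead, and the characteristic UUID is a hypothetical placeholder rather than the real service layout of the authors' peripheral.

    import asyncio
    from bleak import BleakScanner, BleakClient

    # Hypothetical messaging characteristic UUID (not taken from the paper).
    MSG_CHAR_UUID = "0000ffe1-0000-1000-8000-00805f9b34fb"

    async def main():
        # GAP discovery: listen for advertisements broadcast by peripherals.
        devices = await BleakScanner.discover(timeout=5.0)
        target = next((d for d in devices if d.name), None)
        if target is None:
            return

        # Connecting makes this side the central (master); the peripheral is the slave.
        async with BleakClient(target.address) as client:
            def on_notify(_, data: bytearray):
                print("notification from peripheral:", data)

            # Subscribe to notifications, then push a short message to the peripheral.
            await client.start_notify(MSG_CHAR_UUID, on_notify)
            await client.write_gatt_char(MSG_CHAR_UUID, b"hello from central")
            await asyncio.sleep(5)
            await client.stop_notify(MSG_CHAR_UUID)

    asyncio.run(main())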

3.4.2 Device Management Phase

After the successful connection with a required BLE device, the application stores the details of the connected device for device management. During this phase, details about the device, such as device info, device profile, services, and characteristics, are stored as objects. When the data is requested, it is retrieved from the local storage of the device running the messaging interface. The data about each device can be accessed from the messaging interface, and the services are utilized to perform a task specific to a BLE profile. Once a device is paired and connected,



the device management information is stored in a stack, which helps retrieve the most recently accessed device data and services when the user requests them.
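A minimal sketch of this device-management idea is shown below; it is illustrative only (the class and field names are assumptions, not the authors' code) and mimics the behaviour in which the most recently accessed device is retrieved first.

    class DeviceRegistry:
        def __init__(self):
            self._stack = []  # most recently accessed device sits at the top

        def remember(self, address, name, services):
            # Store the connected device's details as an object (here, a dict).
            record = {"address": address, "name": name, "services": services}
            # Drop any older entry for the same device, then push it on top.
            self._stack = [d for d in self._stack if d["address"] != address]
            self._stack.append(record)

        def latest(self):
            # The most recently paired/accessed device, shown first to the user.
            return self._stack[-1] if self._stack else None

    registry = DeviceRegistry()
    registry.remember("AA:BB:CC:DD:EE:FF", "demo-peripheral", ["battery_service"])
    print(registry.latest()["name"])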

4 Results and Analysis The application was tested on multiple operating systems. The parallel connection was tested with 5 BLE peripheral devices, and the application (central device) was able to maintain stable communication with all of the connected devices, successfully transmitting and receiving data. The application was also able to distinguish the data packets received from different BLE peripheral devices and label them accordingly so that the user could follow the communication more easily. Figure 2 shows the communication between the central and peripheral BLE devices and also displays details about the connection, such as the type of service and characteristics the central device wants to utilize and the status of the device communication. The messaging interface has the novel functionality of maintaining BLE peripheral device information for future connections and easy access for the user to communicate. This unique capability of our application allows users to view, change, and modify the communication channels without needing to pair and connect to a BLE peripheral device every time they want to send or receive data. The conventional Web Bluetooth API did not have this device-management mechanism. Figure 3 depicts the multiple-device connection functionality: the user can select the BLE device to communicate with. The security of the application was tested by analyzing the data packets registered in each BLE peripheral device during the communication. The application also

Fig. 2 BLE messaging interface—data from a central device to a peripheral device



Fig. 3 BLE messaging interface—connection with multiple BLE peripheral devices

successfully enabled the user to configure the security features for the data transmission, thereby making the application more adaptable and efficient in performing its intended tasks. We utilized SysGauge [13] to monitor the performance of our messaging interface. While developing this application, we observed that the number of devices that can be connected to the application in parallel to communicate data is limited by the browser's memory. We observed that each BLE peripheral device consumes around 5–8 MB of storage space, depending on the type of profile and service that the BLE device performs. The lowest amount of storage space is taken by battery-profile BLE devices, and the highest amount by messaging-profile BLE devices. Table 1 depicts the memory occupied by the various profiles implemented in our application. Table 2 shows a detailed analysis of the data size that can be transmitted at a time to a BLE peripheral device from our application; we compared our observations with various communication types available in the industry. Because of the small data size, BLE applications have the advantage of consuming very little energy to transmit short bursts of data over a very long period of time, whereas the existing communication methods require significantly larger packets to transmit the same amount of data from one device to another.

Table 1 Observed storage space consumption (in MB) by each profile in browsers

Profile             Google Chrome   Safari    Mozilla   Opera
Battery             5.02 MB         5.21 MB   5.11 MB   5.13 MB
Wi-Fi Connection    5.52 MB         5.76 MB   5.55 MB   5.51 MB
Device Info         6.17 MB         6.66 MB   6.16 MB   6.33 MB
Messaging           7.66 MB         8.01 MB   7.55 MB   7.84 MB

Table 2 Comparison of data size over various communication types

Communication type               Data size
Ethernet                         1.6 KB
IP payload                       64 KB
Classical Bluetooth              251 bytes
Proposed application using BLE   23 bytes

4.1 Limitations The main limitations of our proposed application concern RAM, storage, and CPU usage.
• RAM—The application (central device) needs at least 10 MB of memory while in the idle state, i.e., while not connected to any BLE peripheral device, and at least 10 MB for communicating with a single BLE peripheral device.
• Storage—The details of the various BLE peripheral devices are stored as objects in the local storage of the browser in which the application runs, given that the browser is Chromium-based. On average, the application needs at least 8 MB of storage space to maintain a stable connection with a single BLE peripheral device.
• CPU usage—A minimum of 15 s of CPU time in any 30-s period should be allocated to the application.

5 Conclusions and Future Work In this paper, we propose a novel messaging application to communicate with nearby devices using Bluetooth Low Energy (BLE). This application enables users to connect and communicate with a diverse set of devices offering a variety of services and characteristics. The messaging application has the novel functionality of maintaining BLE peripheral device information for future connections and easy access for the user to communicate. This unique capability allows users to view, change, and modify the communication channels without needing to pair and connect to a BLE peripheral device every time they want to send or receive data. We also implemented various test cases to estimate the functional limits of our proposed messaging application. This enabled us to introduce device management functions and to analyze the security features of the communication during the connection, data transfer, and disconnection phases. We also performed a detailed analysis of the messaging application based on its usage over various browsers and compared various communication methods; in our analysis, our method performed better than the other methods. Further, we also explored the limitations of



this approach and the resources required to streamline the application and discussed them in detail. Implementing the application on non-Chromium-based browsers and developing a BLE peripheral application to improve the performance of the central application can be considered for further research. Also, testing the connection with more BLE devices can help us understand the payload limitations and challenges. Furthermore, simulating various security attacks such as Denial-of-Service, Man-in-the-Middle, phishing, and eavesdropping can help us improve the application's security features.

References
1. Bhargava M (2017) IoT projects with bluetooth low energy. Packt Publishing Ltd.
2. Heydon R, Hunn N (2012) Bluetooth low energy. CSR presentation, bluetooth SIG. https://www.bluetooth.org/DocMan/handlers/-DownloadDoc.Ashx
3. Aftab B, Usama M (2017) Building bluetooth low energy systems. Packt Publishing Ltd.
4. Pandey S (2018) Hacking internet of things: bluetooth low energy. Cytheon Ltd.
5. Pramukantoro ES, Gofuku A (2021) A study of bluetooth low energy (BLE) frameworks on the IoT based heart monitoring system. In: 2021 IEEE 3rd global conference on life sciences and technologies (LifeTech). IEEE
6. Bardoutsos A et al (2021) A human-centered Web-based tool for the effective real-time motion data collection and annotation from BLE IoT devices. In: 2021 17th international conference on distributed computing in sensor systems (DCOSS). IEEE
7. Molina L et al (2021) Be aware of the capture effect: a measure of its contribution to BLE advertisements reception. In: 2021 16th annual conference on wireless on-demand network systems and services conference (WONS). IEEE
8. Zhang Y et al (2020) BLESS: a BLE application security scanning framework. In: IEEE INFOCOM 2020-IEEE conference on computer communications. IEEE
9. Pang B et al (2020) A study on the impact of the number of devices on communication interference in bluetooth low energy. In: 2020 XXIX international scientific conference electronics (ET). IEEE
10. Madana AL et al (2021) IoT enabled smart boarding pass for passenger tracking through bluetooth low energy. In: 2021 international conference on advance computing and innovative technologies in engineering (ICACITE). IEEE
11. Bai L et al (2020) A low cost indoor positioning system using bluetooth low energy. IEEE Access 8:136858–136871
12. Bulić P, Kojek G, Biasizzo A (2019) Data transmission efficiency in bluetooth low energy versions. Sensors 19(17):3746
13. System monitor. https://www.sysgauge.com/index.html

Scalable and Reliable Orchestration for Balancing the Workload Among SDN Controllers José Moura

Abstract There is a high demand for Software Defined Networking (SDN) solutions to control emerging networking scenarios. Due to the centralized design of SDN, a single SDN controller can have its performance severely degraded by workload congestion, or it can even become out of service after failures or cyber-attacks. To enhance the robustness and scalability of the control level, it is fundamental to deploy multiple redundant controllers, which need to be correctly orchestrated to ensure that they efficiently control the data plane. This work designs, deploys, and evaluates a new East/Westbound distributed light protocol, which supports leaderless orchestration among controllers. This protocol is based on group communication over UDP. The evaluation results show the merits of the proposed orchestration design in terms of system scalability, workload balancing, and failure robustness. Keywords Scalable · Reliability · Control · Leaderless orchestration · Workload

1 Introduction Emerging networking scenarios require novel and agile solutions to control the available resources of networked systems. SDN-based solutions are very appealing for controlling the network infrastructures used in upcoming use cases. There is also a well-defined standard for the Southbound API of SDN, i.e., the OpenFlow protocol. Nevertheless, further investigation is needed towards a light, scalable, and efficient East/Westbound protocol for supporting the orchestration among a group of SDN controllers, which share the redundant control among them [1, 2]. Aligned with this, the current work proposes a new multicast East/Westbound protocol that dynamically orchestrates any number of SDN controllers in a completely distributed way, without a top-level centralized orchestrator as suggested in [5].

J. Moura (B) Instituto de Telecomunicações (IT), Instituto Universitário de Lisboa (ISCTE-IUL), Av. das Forças Armadas, 1649-026, Lisboa, Portugal e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 S. Anwar et al. (eds.), Proceedings of International Conference on Information Technology and Applications, Lecture Notes in Networks and Systems 614, https://doi.org/10.1007/978-981-19-9331-2_40




Avoiding this top-level orchestrator, the system enhances its scalability and its robustness against, respectively, high control workloads and any eventual failure of the single orchestrator. Using the novel solution, when a controller fails, the still-available controllers detect that event and correctly orchestrate among themselves the network devices previously under the control of the failed controller. The paper structure is as follows. After the introduction, Sect. 2 analyzes related work, highlighting the novel aspects of the current publication. Section 3 discusses the design of the proposed orchestration solution and the associated protocol. Its deployment is discussed in Sect. 4, and Sect. 5 evaluates the proposed system. Section 6 discusses the main evaluation results. Finally, Sect. 7 concludes the paper with some future research directions.

2 Related Work This section analyzes some related work to highlight the novel aspects of the current investigation. The work in [1, 3] analyzed several controller architectures considering several performance parameters: scalability, consistency, reliability, load balancing, and security. The current work addresses all of these parameters except security. Oktian et al. [6] state that typical services provided by an East/Westbound API are the controllers' coordination toward a control leader election, the distribution of network information, and the mitigation of controller failovers. The current East/Westbound protocol proposal can be classified as a leaderless solution, which avoids the usage of additional network resources for the extra signaling messages associated with successive election rounds among controllers. As a novelty, our leaderless proposal uses multicast communication to distribute the ID of each controller among the others, announcing the availability of that controller to the remaining ones. In this way, each controller can build the same ordered list of IDs, and the position of each controller in that list can then be used by its orchestration function to autonomously decide which switches will be under its control. After a controller failure, the network devices previously under the control of the failed controller are redistributed, without collisions, among the available controllers. Hoang et al. [2] propose a new East/West interface for sharing network state between heterogeneous SDN domains, whereas the current work investigates the control of a single SDN domain by a set of redundant and homogeneous controllers. Lyu et al. [4] developed an analytic study that stochastically optimizes, at different time scales, the on-demand activation of controllers, the adaptive association of controllers and switches, and the real-time processing and dispatching of flow requests. The currently deployed proposal covers the previous aspects, except the online controller activation following the system load dynamics, which is left for future work; our work assumes a static configuration of redundant controllers.



3 Design This section presents the diverse parts of our distributed solution to support flexible and dynamic leaderless orchestration of the workload among any number of SDN controllers. Section 3.1 discusses the orchestration protocol, and Sect. 3.2 discusses some interesting controller orchestration functions.

3.1 Orchestration Protocol This sub-section presents the group communication protocol used to enable the automatic discovery of all the active controllers by each individual controller. Then, each controller can use a local orchestration function to select which switch or Packet In message the controller exclusively controls. The orchestration functions are discussed in the next sub-section. Figure 1 visualizes the communication protocol, as an example, among three controllers, but this protocol supports any number of controllers. Each controller has two light processes responsible for the transmission and reception of the orchestration East/Westbound protocol messages. The first light process of a controller sends an orchestration message to a multicast group announcing its own id, which was randomly generated at its startup. The second light process of the same controller only receives messages from that group. Each time the receiving thread of a controller receives an orchestration message, the controller checks whether the id of the announced controller is already known. When the received id is not known, the receiving thread updates a list of discovered ids. As shown in Fig. 1, this protocol requires a minimum number of multicast messages equal to the number of active controllers to give each controller full awareness of all the other active controllers.

3.2 Controller Orchestration Function After the receiving thread of a specific controller has collected the ids of all the active controllers, this thread stores the collected ids in a list, which is shared with the main process of the controller. Then, the controller processes this list of ids, obtaining the total number of active controllers (i.e., the list length) and the individual order of the current controller in that list. These two parameters are fundamental to the evaluation of the controller orchestration function shown in Expression (1), where dpid is the unique datapath id associated with each switch.

dpid mod num_server == order    (1)



Fig. 1 Proposed orchestration protocol

When the equality in (1) becomes True, this occurs exclusively at a single controller among any set of controllers. Therefore, there is always a unique controller that decides how the message within the Packet In received from switch dpid should be analyzed and processed. In this way, there is a distributed decision or consensus mechanism among the controllers. This solution offers the significant advantage of avoiding the exchange of OpenFlow or other extra synchronization messages among controllers to decide which controller should perform the control decision. In use cases where distinct amounts of data flows traverse the forwarding switches, the previous orchestration function may not be totally fair in terms of balancing the load among the controllers. To mitigate this problem, an alternative function is now discussed, which could be fairer than the previous one. This alternative orchestration function, with which each controller decides whether or not it processes any received Packet-In message, is summarized in (2). The subtle difference in relation to (1) is the replacement of dpid by packet_in_counter, which is the aggregated count of all Packet In messages received by each controller. Assuming every switch is simultaneously connected via OpenFlow to every available controller, all the controllers share the same trend in the packet_in_counter parameter. The decision algorithm in (2) enables a fairer control load distribution among the diverse controllers, but it has a potential drawback: it can increase the number of times each switch must change controllers. Each time a new controller assumes control of a switch, the controller must delete the old rules and install new ones on that switch. All this extra OpenFlow traffic increases the overload on the control channel.



Alternatively, using the first function, in (1), each switch is always controlled by the same controller all the time. In this way, the control channel will not become as congested as in the case of function (2). There is clearly a tradeoff here between orchestration fairness and control channel overload.

packet_in_counter mod num_server == order    (2)
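To make the two rules concrete, the sketch below evaluates them for a hypothetical set of three controllers (the IDs and the zero-based order are assumptions consistent with Sect. 3.1, not values from the paper).

    # Sorted list of discovered controller IDs; the position gives each controller its order.
    controller_ids = sorted(["052", "417", "883"])      # hypothetical IDs
    num_server = len(controller_ids)

    def owns_switch(order, dpid):
        # Function (1): switch `dpid` is always handled by the same controller.
        return dpid % num_server == order

    def owns_packet_in(order, packet_in_counter):
        # Function (2): ownership rotates with the shared Packet-In counter,
        # spreading the load more evenly at the cost of more controller changes.
        return packet_in_counter % num_server == order

    for order, cid in enumerate(controller_ids):
        switches = [dpid for dpid in range(1, 11) if owns_switch(order, dpid)]
        print(f"controller {cid} (order {order}) controls switches {switches}")

With ten switches and three controllers, function (1) gives each controller three or four fixed switches, whereas function (2) would instead rotate the ownership of successive Packet-In messages across the controllers.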

4 Deployment The current section details, in the next sub-sections, the deployment of the orchestration protocol and the controller orchestration function presented in Sect. 3.

4.1 Orchestration Protocol As already explained in Sect. 3.1, each controller has two threads responsible for managing the orchestration protocol. As shown in Algorithm 1, these two threads are started in steps 7 and 9, inside the constructor of the controller code. Thread t1 is responsible for the periodic transmission of orchestration messages announcing the controller id, which is randomly generated in step 4. The second thread, t2, is responsible for listening for the multicast orchestration messages sent by other controllers. Steps 19–23 are where thread t1 tries, every 2 s, to send a multicast message announcing its ID. Then, each controller has a listening thread (t2), steps 26–41, which receives all the multicast messages sent by the other controllers. It also decides, at a convenient time, when the list of IDs needs to be updated and announced to the local controller. Every second, the controller invokes the function in steps 46–48 to get a fresh version of the list containing the IDs of all active controllers. When a controller sees a list with only its own ID, it concludes that it is alone and changes its role from EQUAL to MASTER. Otherwise, the controller, using a local coordination function (see Algorithm 2 below), can select the network devices or data plane messages that are exclusively controlled by that controller.



Algorithm 1: Each controller initially starts two threads which manage the multicast communication with remaining controllers.
1: from transmitter_multicast import Controller_Multicast
2: import threading
3: def __init__(self, *args, **kwargs):
4:   self.cont_id = str(random.randint(0, 1000))
5:   self.tx_mult = Controller_Multicast(self.cont_id)
6:   self.t2 = threading.Thread(target=self.tx_mult.receive, args=(self.tx_mult.get_id(),))
7:   self.t2.start()
8:   self.t1 = threading.Thread(target=self.tx_mult.send, args=(self.tx_mult.get_id(),))
9:   self.t1.start()
10: end function
11: Class Controller_Multicast(object):
12:   def __init__(self, id):
13:     self.MY_ID = id
14:     self.list_ids = []
15:   end function
16:   def send(self, id):
17:     multicast_addr = '224.0.0.1'
18:     port = 3000
19:     while True do
20:       sock.sendto(json.dumps([id]).encode('utf-8'), (multicast_addr, port))
21:       time.sleep(2)
22:     end while
23:   end function
24:   def receive(self, id):
25:     while True do
26:       cnt = cnt + 1
27:       data, address = sock1.recvfrom(256)
28:       l_rx = json.loads(data.decode('utf-8'))
29:       for i in range(len(l_rx)) do
30:         if l_rx[i] not in l_tmp:
31:           l_tmp.append(l_rx[i])
32:         end if
33:       end for
34:       if cnt > (len(l_tmp) + 2)
35:         l = l_tmp
36:         l_tmp = [id]
37:         cnt = 0
38:       end if
39:       l.sort(reverse=False)
40:       self.list_ids = l
41:     end while
42:   end function
43:   def get_id(self):
44:     return self.MY_ID
45:   end function
46:   def get_list_ids(self):
47:     return self.list_ids
48:   end function
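Algorithm 1 uses the sockets sock (sender) and sock1 (receiver) without showing how they are created; the sketch below is a typical UDP multicast socket setup that would be consistent with the listing, using standard Python socket calls (it is an assumption, not code from the paper).

    import socket
    import struct

    MULTICAST_ADDR = '224.0.0.1'   # group address used in Algorithm 1
    PORT = 3000                    # port used in Algorithm 1

    # Sender socket (the `sock` of Algorithm 1): plain UDP socket with a small TTL
    # so the orchestration traffic stays inside the local network.
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, 1)

    # Receiver socket (the `sock1` of Algorithm 1): bound to the group port and
    # joined to the multicast group so it receives every controller announcement.
    sock1 = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock1.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock1.bind(('', PORT))
    membership = struct.pack('4sl', socket.inet_aton(MULTICAST_ADDR), socket.INADDR_ANY)
    sock1.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, membership)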

4.2 Controller Orchestration Function Algorithm 2 shows the network function running in each controller, which enables a distributed coordination among all the SDN controllers operating in the role EQUAL.



This algorithm avoids potential conflicts among the controllers (steps 5–6). Diverse orchestration functions were discussed in Sect. 3.2.

Algorithm 2: The controller assumes the role EQUAL avoiding conflicts with other controllers
1: for each Packet-In Event with pkt do
2:   datapath = Event.msg.datapath
3:   dpid = datapath.id
4:   if self.mode == 'EQUAL':
5:     if not (dpid % int(self.num_serv) == int(self.order)):
6:       return
7:     else:
8:       Analyses, processes and controls the current message
9:     end if
10:   end if
11: end for

5 Experimental Results This section presents the results of all the experiments conducted in this study. The results are organized into three sets of experiments, which are listed in Table 1, using the experimental setup visualized in Fig. 2 and further detailed in Table 2. The system under evaluation is formed by m redundant controllers, n switches, n hosts, and 2n-1 data plane links. The data plane was emulated by Mininet. For all the tests of the current paper, ten controllers, ten switches, and nineteen data plane links were used. The controller logic, including the auxiliary class that implements the behavior of the orchestration communication threads, was implemented in Python3 using the Ryu library. The next sub-sections discuss the diverse obtained results.

Table 1 Evaluation tests and their main goals

Sections   Main aspect(s) under analysis
5.1        It verifies the correct behavior of the newly proposed protocol to orchestrate any number of controllers in a completely distributed way
5.2        It compares two possible controller orchestration functions for the distributed orchestration design; it also analyzes their impact on the system performance and the fairness in how each function balances the control workload among available redundant controllers
5.3        It compares two distinct designs (centralized versus distributed) for the orchestration part of the system; it can help to reach some conclusions in terms of how each solution enables both system scalability and system resiliency



Fig. 2 The SDN-based system under evaluation

Table 2 Hardware and software tools used during the evaluation tests

ASUS Intel® Core™ i7-3517U CPU @ 1.90 GHz 2.40 GHz, 12 GB RAM, Windows10 Education × 64
VirtualBox Ubuntu 22.04                        https://www.virtualbox.org/; https://releases.ubuntu.com/22.04/
Ryu SDN Controller (v4.34)                     https://ryu-sdn.org/
OpenvSwitch (v2.16.90, DB Schema 8.3.0)        https://www.openvswitch.org/
Python 3.9.12                                  https://www.python.org/downloads/release/python-3912/
Mininet (v2.3.0)                               https://github.com/mininet/mininet
Wireshark (v3.4.9)                             https://www.wireshark.org/

5.1 Functional Test of the Orchestration Protocol This functional test was made using the experimental setup presented at the beginning of the current section. The goal of this initial test was to verify whether the orchestration protocol messages were periodically sent from the same controller, announcing its ID to the other controllers. Figure 3a visualizes some samples of the time interval between two consecutive transmissions from the same transmission thread. The time interval is roughly around 2 s. In addition, we measured the multicast transmission delay between each multicast transmission thread and each receiver. The obtained samples are visualized in Fig. 3b. Except for some sporadic peak delays within the


Fig. 3 Diverse characteristics of the multicast orchestration protocol: a multicast periodic transmission; b multicast transmission delay

range of [15, 85] ms, most of the delay samples are below 10 ms. In this way, the orchestration protocol was successfully verified.

5.2 Functional Test Comparing Two Alternative Orchestration Functions In this experiment, the two alternative orchestration functions discussed in Sect. 3.2 were compared, using the same network traffic load, in terms of the total number of Packet In (PI) messages and the PI messages processed by each of the ten controllers under test. The orchestration protocol among controllers was always the multicast-based one discussed in Sects. 3.1 and 4.1. The obtained results are summarized in Table 3. Analyzing and comparing the results for the two orchestration functions, the first conclusion is that the ID orchestration function, when compared with the PI orchestration one, significantly reduces (from 306 to 198) the total number of PI messages during the test. Nevertheless, the former function has a lower Jain Fairness Index, i.e., 0.929, against the impressive 0.999 of the latter function. In this way, we have experimentally confirmed the tradeoff between orchestration fairness and control channel overload, which was previously discussed in Sect. 3.2.
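For reference, the Jain Fairness Index used here is J(x) = (sum of xi)^2 / (n * sum of xi^2); the short check below recomputes both quoted values from the processed-message columns of Table 3.

    # Jain Fairness Index: J(x) = (sum of samples)^2 / (n * sum of squared samples).
    def jain_index(samples):
        n = len(samples)
        return sum(samples) ** 2 / (n * sum(v * v for v in samples))

    pi_processed = [31, 30, 31, 31, 30, 31, 30, 31, 31, 30]   # PI orchestration function
    id_processed = [15, 21, 27, 23, 25, 19, 17, 11, 13, 27]   # ID orchestration function

    print(jain_index(pi_processed))   # ~0.9997, quoted as 0.999 in the text
    print(jain_index(id_processed))   # ~0.9294, quoted as 0.929 in the text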

5.3 Functional Test Comparing Centralized versus Distributed Orchestration After the fairest orchestration function among controllers was found (i.e., PI, Sect. 5.2), this function was applied to two distinct orchestration designs. The two designs under comparison are the distributed orchestration design discussed in Sects. 3.1 and 4.1, and the centralized orchestration design investigated in [5]. The comparison results are visualized in Fig. 4. Analyzing these results, the distributed



Table 3 Considering the two orchestration functions under comparison, the total and processed PI messages are listed for each controller (PI orchestration function: Jain Fairness Index = 0.999; ID orchestration function: Jain Fairness Index = 0.929)

Controller   PI_Total   PI_Processed   ID_Total   ID_Processed
1            306        31             198        15
2            306        30             198        21
3            306        31             198        27
4            306        31             198        23
5            306        30             198        25
6            306        31             198        19
7            306        30             198        17
8            306        31             198        11
9            306        31             198        13
10           306        30             198        27

design based on multicast communication among the controllers has a lower load rate (1.6 Kb/s) than the centralized design based on TCP communication (18 Kb/s). Consequently, the distributed design is more scalable than the centralized one. In addition, the distributed design does not have the issue of the single point of failure that could easily occur in the centralized design.

Fig. 4 Centralized versus distributed orchestration design (vertical axis in logarithmic scale)



6 Discussion This section discusses the main conclusions and lessons learned during the current investigation. From the obtained results, it was initially demonstrated that a multicast communication protocol can be used to coordinate a set of redundant controllers so that the control workload is correctly balanced among them. This proposed distributed design for the orchestration part of the system clearly offers operational advantages in comparison with an existing centralized alternative from the literature [5]. The advantages provided by the new leaderless orchestration design are high system scalability (e.g., in both the number of controllers and the number of controlled network devices) and increased robustness against system threats, such as faults or cyber-attacks. The higher scalability of the distributed design in relation to the centralized option is visible in Fig. 4, with a significant reduction (from 18 to 1.6 Kb/s) in the system overhead induced by the controllers' orchestration. The current paper has also compared two possible controller orchestration functions for the distributed orchestration design. Both orchestration functions have been studied in terms of their impact on the system performance and the fairness with which each function balances the control workload among the available redundant controllers. Each orchestration function has a strong and a weak performance aspect, which are reversed for the other function under study. As an example, if the owner of a network infrastructure is concerned with the amount of system resources used by the extra control/orchestration messages, trying to diminish the system energy consumption, our work indicates that the more suitable orchestration function is the one that associates a dedicated controller to each switch, considering the unique datapath id of that switch. In the case of a controller failure, the network devices previously under the control of the failed controller are redistributed, without any collision, among the still-available controllers. The system may require some seconds (i.e., around 3 or 4 s) to reach a coherent state among all the controllers, but after that instant, the system operates without any issue.

7 Conclusion In this study, a new East/Westbound leaderless protocol is proposed so that the controllers can coordinate among themselves the control workload of a common network infrastructure that they redundantly control. As shown in the results section, this proposal is viable and scalable, and it offers increased resilience against system threats due to its distributed nature. For future work, the on-demand (de)activation of controllers will be investigated [4]. Acknowledgements Jose Moura acknowledges the support given by Fundação para a Ciência e Tecnologia/Ministério da Ciência, Tecnologia e Ensino Superior (FCT/MCTES) through National



Funds and, when applicable, co-funded by European Union (EU) Funds under Project UIDB/50008/2020, and in part by Instituto de Telecomunicações, Lisbon, Portugal.

References
1. Ahmad S, Mir AH (2020) Scalability, consistency, reliability and security in SDN controllers: a survey of diverse SDN controllers. J Netw Syst Manage 29(1):9. https://doi.org/10.1007/s10922-020-09575-4
2. Hoang N-T, Nguyen H-N, Tran H-A, Souihi S (2022) A novel adaptive east-west interface for a heterogeneous and distributed SDN network. https://doi.org/10.3390/electronics11070975
3. Hu T, Guo Z, Yi P, Baker T, Lan J (2018) Multi-controller based software-defined networking: a survey. IEEE Access 6:15980–15996. https://doi.org/10.1109/ACCESS.2018.2814738
4. Lyu X, Ren C, Ni W, Tian H, Liu RP, Guo YJ (2018) Multi-timescale decentralized online orchestration of software-defined networks. IEEE J Sel Areas Commun 36(12):2716–2730. https://doi.org/10.1109/JSAC.2018.2871310
5. Moura J, Hutchison D (2022) Resilience enhancement at edge cloud systems. IEEE Access 10:45190–45206. https://doi.org/10.1109/ACCESS.2022.3165744
6. Oktian YE, Lee SG, Lee HJ, Lam JH (2017) Distributed SDN controller system: a survey on design choice. Comput Netw 121:100–111. https://doi.org/10.1016/J.COMNET.2017.04.038

A Digital Steganography Technique Using Hybrid Encryption Methods for Secure Communication Sharan Preet Kaur and Surender Singh

Abstract As the era moves forward, so do advancement and progress. The advancement of medical science has been accompanied by some challenges, one of which is sharing data over public networks. This requires security measures so that the data can be sent securely. One way to do this is with steganography. Steganography is the process of hiding secret information from hackers by placing it inside other publicly available data. A second approach to adding security is hybridizing steganography with encryption techniques. In this implementation, the four main pillars of steganography are considered: payload capacity, imperceptibility, robustness, and security. For maintaining all these pillars, different algorithms have been hybridized and one novel algorithm has been constructed. First of all, the DWT (Discrete Wavelet Transformation) technique and the BES (Bald Eagle Search) algorithm have been used together for the embedding and pixel selection processes. After that, the Huffman compression technique and the logistic chaotic technique have been applied together to the secret image before embedding it into the cover image. Here, the database of Covid-19 patients is used as the confidential data, and the results show that this model has performed better than the existing ones. Keywords COVID-19 · Steganography · Image security · Image transmission · DWT · Ebola · Huffman · 2d logistic chaotic map

1 Introduction Multimedia data is now moved quickly and widely to destinations via the internet in a variety of formats, including image, audio, video, and text. Everything is visible and accessible to every user through digital communication over the internet. As a result, information security is a vital and important duty. Everything is visible and accessible to every user through digital communication over the internet. As a result, information security is a vital and important duty. Confidentiality, integrity, and S. P. Kaur (B) · S. Singh Chandigarh University, Chandigarh, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 S. Anwar et al. (eds.), Proceedings of International Conference on Information Technology and Applications, Lecture Notes in Networks and Systems 614, https://doi.org/10.1007/978-981-19-9331-2_41




availability are three goals of network or information security (CIA). Confidentiality refers to the protection of information from unauthorized access. Integrity relates to the accuracy of data, whereas availability means that data is available to authorized people at all times. Network security is not sufficient for reliable communication of information like text, audio, video, and digital images. Encryption, watermarking, digital watermarking, reversible watermarking, cryptography, steganography, and other techniques are used to safeguard photographs. This paper provides an overview of encryption, steganography, and watermarking. We suggested a hybrid security solution in this paper that combines encryption, steganography, and watermarking. In the next sections, we’ll go through a quick overview of each technique. The plain text is turned into cypher text using a secret key in encryption. The secret key can also be used to convert the image to an encrypted format. The encrypted image is subsequently transferred to the destination over an insecure channel. At receiving end, the encrypted image is decrypted using the same key of sender side. Following are the basic notations of the cryptography: • P refers to the plain text, Original message. • C refers to the cipher text. Output produced by encryption technique. Humans are unable to read this. • E refers to the function of encryption, i.e., E(P) = C. • D refers to the function of decryption, i.e., D(C) = P. Steganography has made it feasible to communicate invisibly. The original image is hidden in the cover image in steganography to disguise the intruder/hacker, and the resulting image is called a stego image. The sender’s secret key may be used in this procedure, and the same key may be used at the destination to acquire an original image from a stego image. Steganography and cryptography are not the same thing. While cryptography focuses on keeping the contents of a message secret, steganography focuses on keeping the existence of a message hidden. The number of hackers or online data thieves has increased substantially in recent years, according to [1–3]. Hackers are mostly interested in stealing sensitive information such as credit card numbers and company secrets. As a result, businesses are continually concerned about the security of data transmission methods. Steganography is the art of masking the fact that data is sent by hiding information in another one whenever communication occurs. Image, audio, and video files can all be used as carrier files [4]. In picture steganography, sensitive data that needs to be conveyed securely is concealed behind any image while ensuring that the quality of the image beneath which the secret data is hidden is not harmed or noticed by any unauthorized user. The sensitive data is hidden behind the audio file instead of the image in audio steganography, and the audio file quality is retained. In the another one i.e., video steganography, in this case the video is utilized for hiding the confidential data by taking care of the quality of video. In addition to this one more type was also seen in the past times that was text steganography [5]. In this the text file is used for hiding any type of secret data by seeing that there is no change in the meaning of the text file after hiding the data.



Cryptography is used to secure data transferred online in programmes that run in a network context [6, 7]. Cryptography helps people to conduct business electronically without fear of fraud or deception, as well as verifying the message's integrity and the sender's identity [8]. Because thousands of people communicate electronically every day through e-mail, e-commerce, ATM machines, cellular phones, and other means, it has become more important in our daily lives. Users have become more reliant on cryptography and authentication as the amount of data exchanged electronically has grown exponentially [9]. The major flaw of cryptography is that hackers can discover encrypted messages and attempt to decrypt them using a variety of methods, including automatic counters and random tests based on mathematical formulas [10]. As a result, the cryptography method contains security flaws, such as relying solely on decryption keys. The key question is how the cryptographic security of texts transferred online can be improved. According to [11], cryptography could be improved by combining it with another security technology, such as steganography. The steganography method is one of the most useful approaches for increasing the security of cryptographic systems. Both of these strategies can be effectively combined to provide a high level of data security online [12]. Although both encryption and steganography technologies provide security, combining the two into a single system improves security and confidentiality [18].

2 Proposed Methodology

Step-1 Selection of cover and secret image: the process starts by selecting the cover and secret images that will go through the process; the result will be transferred as a stego image.
Step-2 Encrypting the secret image: in the second step the secret image is encrypted using the hyper-chaotic technique, and the encrypted image is then processed further.
Step-3 Compressing the encrypted secret image: the secret image encrypted in the previous step is compressed using a lossless compression technique, and the compressed image is embedded into the cover image.
Step-4 Selection of pixel locations: next, pixels of the cover image are selected using the optimized chaotic technique, and the embedding process operates on these selected pixels.
Step-5 Embedding the compressed, encrypted secret image: finally, the pixels selected in the previous step are used to embed the compressed, encrypted secret image. The embedding technique produces the stego image, which is then transferred over the public channel.


Fig. 1 Proposed research methodology

Step-6 Removal of the secret image from the stego image: after the stego image arrives at the destination, the embedded data is extracted from the stego image and processed further to recover the actual secret image.
Step-7 Decompressing the secret image: the secret image obtained in the previous step is decompressed to restore its original quality.
Step-8 Decrypting the secret image: the last step is to decrypt the image obtained in the previous step using the key that was used in the encryption process.
After going through all of these steps, the secret image is received by the receiver with its original quality and with full security over any public channel (Fig. 1).

3 Proposed Model for Medical Image Transmission of Covid Patients

The proposed model is based on four algorithms: DWT, BES optimisation, Huffman encoding, and the 2D logistic chaotic map. It contains three main mechanisms:
1. Embedding process


2. Encryption process
3. Extraction process

3.1 Embedding Process

In the initial phase, the DWT is applied to the cover image to extract the high- and low-frequency bands. After the DWT, BES optimisation is applied to find the best positions for hiding the bits of the secret image. Before embedding, the secret image is first compressed and then inserted into the cover image that has already been decomposed by the DWT. The inverse DWT is then applied to obtain the stego image.

To insert the secret image, consider a cover image I of dimensions A × B. The image is in the spatial domain, so the DWT is applied to convert it into the frequency domain. To improve robustness, the cover image is decomposed, giving the coefficients I1, I2, I3, and I4:

[I1, I2, I3, I4] = DWT(I)   (1)

The coefficient I1 is the low-frequency band containing the important information, while the remaining high-frequency bands carry information such as image edges. The band nominated for further processing is decomposed again, and the extracted coefficients are ILL1, ILH1, IHL1, and IHH1:

[ILL1, ILH1, IHL1, IHH1] = DWT(I1)   (2)

Of these, only CLL1 is selected for embedding the secret data. Using the BES technique, the optimal position Oopt in the band CLL1 is selected for embedding the secret image Z:

O*1j = O1j + (Zj × Oopt)   (3)

where j = LL, LH, HL, HH and Zj is the optimal position for embedding the data. The inverse DWT is then applied to hide the secret data:

C1** = IDWT(C1*LL, C1*LH, C1*HL, C1*HH)   (4)

The modified band containing the secret data is

C1** = IDWT(C1**, C2, C3, C4)   (5)

The algorithm for the above process, the embedding algorithm, is given below.
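For orientation, the following is a small, hedged sketch of the two-level DWT decomposition step described above, using the PyWavelets library. The wavelet family ('haar'), the random stand-in image, and the toy coefficient perturbation are assumptions; the paper does not specify them, and the real scheme modifies only the BES-selected coefficients.

```python
# Illustrative two-level DWT decomposition and reconstruction (Eqs. (1)-(5) flavour).
import numpy as np
import pywt

cover = np.random.rand(256, 256)                   # stand-in for the cover image I
I1, (I2, I3, I4) = pywt.dwt2(cover, "haar")        # Eq. (1): first-level decomposition
ILL1, (ILH1, IHL1, IHH1) = pywt.dwt2(I1, "haar")   # Eq. (2): decompose the LL band again

# Toy "embedding": perturb the selected low-frequency band, then rebuild the image.
ILL1_marked = ILL1 + 0.01
I1_marked = pywt.idwt2((ILL1_marked, (ILH1, IHL1, IHH1)), "haar")
stego = pywt.idwt2((I1_marked, (I2, I3, I4)), "haar")
print(stego.shape)                                 # same size as the cover image
```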


Embedding algorithm

Input: cover image IM, secret key SK, secret image IS
Output: output image OM, stego image SI

(1) Resize IM to M × N.
(2) Apply 2-level DWT to decompose IM into four bands.
(3) Divide the LL band into blocks of dimension [2 6 4] using quad-tree decomposition.
(4) Find the mean of the decomposed image (ID) by creating blocks of the division.
(5) Apply Huffman coding to each block and find the embedding bits for embedding the secret image.
(6) Initialize BES for optimisation of the embedding bits.
(7) Calculate the fitness function of BES (F):
    fit_fitness_BES = true if fs > ft, otherwise false
    Apply BES to the embedding bits.
(8) Calculate the size of the embedding bits (S, T):
    For x = 1 to S
      For y = 1 to T
        FS = embedding bits(x, y)
        FT = threshold(x, y)
        F = coall F(FS, FT)
        N = 1
        Fitdata = BES(F, N, EF)
      End of y loop
    End of x loop
    Check the sizes of IM and IS.
(9) If IM > IS
      apply the encryption algorithm to the secret image using the double logistic chaotic map (Algorithm 2)
      apply embedding
      apply inverse DWT to get SI: SI = Embedding(IM, IS, SK)
    else
      apply Huffman encoding to compress IS and apply the embedding algorithm
    end
(10) Return SI.
(11) Exit.


3.2 Encryption Algorithm (Based on Confusion and Scrambling Operations)

Before embedding, the 2-dimensional logistic chaotic map is applied to the secret image in order to encrypt it; the encrypted image is then embedded inside the cover image. The algorithm is explained below:
(1) At the initial stage, the gray-scale pixels of the secret image are processed, which results in a vector V consisting of the gray-scale sequence.
(2) Apply the XOR operation: Is'(k) = X'(k) ⊕ {[X'(k) + VR] mod N} ⊕ I's(Km), where k indicates the kth pixel of the secret image.
(3) A count of the pseudo-random value x is then taken for the image pixel matrix.
(4) In this step, the pseudo-random data generated by the logistic chaotic map is given as input for computing the scrambling operation; Is'' is the final encrypted image.
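The sketch below illustrates the confusion (XOR) step above in its simplest possible form. For brevity it uses a 1-D logistic map keystream rather than the paper's 2-D logistic map, and the map parameter r, the seed x0, and the stand-in secret image are assumptions.

```python
# Minimal chaotic-map XOR confusion sketch (1-D logistic map stand-in).
import numpy as np

def logistic_keystream(n: int, x0: float = 0.3141, r: float = 3.99) -> np.ndarray:
    # Iterate x_{k+1} = r * x_k * (1 - x_k) and quantize each state to a byte.
    x, out = x0, np.empty(n, dtype=np.uint8)
    for k in range(n):
        x = r * x * (1.0 - x)
        out[k] = int(x * 256) % 256
    return out

def chaotic_xor(image: np.ndarray, x0: float = 0.3141) -> np.ndarray:
    # XOR confusion step; applying the same function again decrypts the image.
    ks = logistic_keystream(image.size, x0).reshape(image.shape)
    return image ^ ks

if __name__ == "__main__":
    secret = np.random.default_rng(1).integers(0, 256, (32, 32), dtype=np.uint8)
    encrypted = chaotic_xor(secret)
    assert np.array_equal(chaotic_xor(encrypted), secret)  # XOR is its own inverse
```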

3.3 Extraction Process

The extraction process is used to recover the secret image from the stego image. The input of the extraction algorithm is the stego image. In the next step the inverse DWT is applied to the stego image to obtain its frequency-domain representation from the spatial representation.

Extraction algorithm
Input: stego image and encryption key.
Output: secret image.
(1) Read the stego image.
(2) Apply the inverse DWT to the stego image.
(3) Test the key:
    If the key is given correctly
      extraction is performed on the stego image: secret image = embedding(stego image)
    Else
      encrypted image = embedding(stego image)
    End
(4) End.


Fig. 2 Histogram of cover and stego images

Table 1 Evaluation parametric values of various images

Image     Secret image  MSE     PSNR   BPP     SSIM
House     Covid-70      0.0060  74.33  0.0625  0.99986
Mandrill  Covid-52      0.0059  75.58  0.0623  0.99885
Peppers   Covid-4       0.0061  74.01  0.0598  0.98695
Splash    Covid-2       0.0057  75.59  0.0635  0.99536
Lena      Covid-1       0.0056  74.72  0.0587  0.999785

4 Results and Analysis

Medical images of Covid patients have been embedded in cover images selected from the USC-SIPI database. Figure 2 shows the cover images along with the stego images obtained after the secret image has been embedded. Table 1 shows the parametric values for the various images.
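The sketch below shows how the quality metrics reported in Table 1 are typically computed between a cover image and its stego version. These are the standard MSE and PSNR definitions rather than the authors' own evaluation code; SSIM would normally come from a library such as scikit-image, and the random images are placeholders.

```python
# Standard MSE/PSNR computation between a cover and a stego image (illustrative only).
import numpy as np

def mse(cover: np.ndarray, stego: np.ndarray) -> float:
    return float(np.mean((cover.astype(np.float64) - stego.astype(np.float64)) ** 2))

def psnr(cover: np.ndarray, stego: np.ndarray, peak: float = 255.0) -> float:
    m = mse(cover, stego)
    return float("inf") if m == 0 else 10.0 * np.log10(peak ** 2 / m)

if __name__ == "__main__":
    rng = np.random.default_rng(2)
    cover = rng.integers(0, 256, (512, 512), dtype=np.uint8)
    stego = cover.copy()
    stego[::200, ::200] ^= 1            # flip a few least-significant bits
    print(f"MSE = {mse(cover, stego):.6f}, PSNR = {psnr(cover, stego):.2f} dB")
```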

5 Conclusion

Steganography can be used effectively to hide confidential information. The objective of any steganography method is to hide as much secret information as possible, in a way that is immune to external attacks and does not convey the fact that the cover medium


is carrying secret information. It can therefore be concluded that steganography enables covert communication and that, in addition, the cryptographic way of embedding provides an even higher level of security.

References

1. Abikoye OC et al (2020) A safe and secured iris template using steganography and cryptography. Multimed Tools Appl 1–24
2. Zainal Abidin Z et al (2019) Development of iris template protection using LSBRN for biometrics security. IJCSNS Int J Comput Sci Netw Secur 19(7)
3. Ghazia GH, Kaniana G (2018) An enhanced biometric information security system using orthogonal codes and LSB steganography. In: 2018 second international conference on inventive communication and computational technologies (ICICCT). IEEE
4. Mohsin AH et al (2019) Based medical systems for patient's authentication: towards a new verification secure framework using CIA standard. J Med Syst 43(7):192
5. Delmi A, Suryadi S, Satria Y (2020) Digital image steganography by using edge adaptive based chaos cryptography. In: Journal of physics: conference series 1442(1). IOP Publishing
6. Arunkumar S et al (2019) SVD-based robust image steganographic scheme using RIWT and DCT for secure transmission of medical images. Measurement 139:426–437
7. Islam MA, Riad MAAK, Pias TS (2021) Enhancing security of image steganography using visual cryptography. In: 2021 2nd international conference on robotics, electrical and signal processing techniques (ICREST). IEEE
8. The USC-SIPI Image Database. http://sipi.usc.edu/database/
9. COVID-19 Radiography Database. https://www.kaggle.com/datasets/tawsifurrahman/covid19-radiography-database

Utilizing Blockchain Technology to Enhance Smart Home Security and Privacy Rehmat Ullah, Sibghat Ullah Bazai, Uzair Aslam, and Syed Ali Asghar Shah

Abstract In recent years, the Internet of Things (IoT) has emerged rapidly in the field of smart home automation. However, the short battery life, low processing power, and limited memory of IoT technology create significant fault lines: according to the Open Web Application Security Project (OWASP) report, approximately 70% of IoT devices are vulnerable and can be easily hacked and exploited. By utilizing blockchain technology, the fragile structure of IoT can be made more robust and reliable by tackling its vulnerable nature. The study aims to assess the security of blockchain technology by launching a predetermined man-in-the-middle (MITM) attack in a smart home environment. This test attack is intended to prevent various threats and make the system more reliable and secure. Several tests have been conducted to determine whether the indigenous IoT protocol (MQTT) is more secure than blockchain technology; as a result, the MQTT protocol was replaced with the blockchain protocol in this study. Among the various blockchain platforms, this study examined Hyperledger, which is associated with the Chaincode. Three IoT systems have been assessed for security by simulating man-in-the-middle attacks and examining their security features. Our results show that the blockchain is more secure for IoT systems compared with MQTT protocols.

Keywords Hyperledger fabric · MITM attack · Blockchain · IoT

R. Ullah · S. U. Bazai (B) · S. A. A. Shah Department of Computer Engineering, BUITEMS, Quetta, Pakistan e-mail: [email protected] R. Ullah e-mail: [email protected] S. A. A. Shah e-mail: [email protected] U. Aslam People’s Primary Healthcare Initiative (PPHI) Sindh, Karachi, Pakistan © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 S. Anwar et al. (eds.), Proceedings of International Conference on Information Technology and Applications, Lecture Notes in Networks and Systems 614, https://doi.org/10.1007/978-981-19-9331-2_42


1 Introduction

The Internet of Things (IoT) is a fast-growing technology that connects smart devices through the internet. It refers to annexing various interconnected objects, devices, and even humans, and it can be used in several areas, including transportation, agriculture, healthcare, energy production, smart homes, and the industrial sector [19]. The growing popularity of IoT devices and their interconnectivity create security and privacy issues for users. Bazai et al. [8] discussed the privacy issues arising from the use of big data processing platforms; that research also includes data regarding the privacy of home and IoT devices. According to Ericsson's 2019 estimates, over 40 billion IoT devices will be created, generating 75 billion zettabytes (ZB) of data, by 2025 (Yakut et al. [20]). The smart home, which covers the automation of everything from lighting and heating to electronic devices, is deemed one of IoT's most conspicuous features. Despite the great importance of IoT, it also has drawbacks owing to its limited battery, poor processing power, weak conventional protocols, and little memory. For instance, as per the Open Web Application Security Project (OWASP) report, about 70% of IoT devices are vulnerable. At the same time, according to a detector survey, a Denial-of-Service (DDoS) attack on the Domain Name System (DNS) exploited and hacked approximately 60% of IoT devices in the United States of America in 2018 [13]. The fragile structure of IoT devices therefore needs improvement, and these devices can be made stronger and more secure using blockchain technology. Blockchain technology improves on state-of-the-art encryption-decryption techniques to protect the smart home from man-in-the-middle (MITM) attacks (Bazai et al. [9]), and it has considerable influence on security-related technology, replacing conventional communication protocols. Blockchain technology has great potential and strength to address IoT flaws and make IoT more protected and robust [17]; such technology does not allow any threat from unauthorized individuals. Blockchain is a decentralized, immutable ledger with several benefits such as cost savings, resilience, high speed, and security.

This research aims to use blockchain technology to improve the capabilities of encryption-decryption algorithms and establish secure communication between IoT devices. In this study, Hyperledger Fabric was used as the blockchain and Chaincode as the smart contract for simulating the communication of three IoT devices. The ultimate purpose of this research is to test the security capacity of the Message Queuing Telemetry Transport (MQTT) protocol and of blockchain by launching a predetermined attack, known as the MITM attack, in a smart home environment. Among several blockchain platforms, this research focuses on Hyperledger Fabric, which is a private blockchain associated with the Chaincode. By protecting against MITM attacks and examining their security features, three IoT systems have been tested for their security. The conclusive result shows that a blockchain-based IoT system is more secure than the MQTT protocol.


This paper is organized into several sections and sub-sections. Section 1 covers the introduction, Sect. 2 discusses the literature review, Sect. 3 presents the system design showing how the three IoT devices are connected, and the last two sections present the results and the conclusions.

2 Literature Review

The IoT is based on two different terms: the "Internet", which connects various networks, and the "Thing", which refers to gadgets or appliances (mobile phones, personal computers, and commercial equipment) that are transformed into digital objects [2, 15]. Many research papers have been published on the importance of IoT technology for applications such as smart homes, including Al-Kuwari et al. [1]. IoT devices for smart homes face two conspicuous problems: integrity and confidentiality (Yakut et al. [20]; Cavalieri et al. [10]). Hence, a comprehensive Internet-based system was created for the smart home [16]. A centralized model is deemed beneficial for various reasons [18], but certain problems can arise when it connects multiple devices. These problems can only be excluded by focusing on decentralized blockchain technology [3]. A related conceptual model is meant to help businesses figure out how to use IoT technology in their production and supply chains, with the goal of creating a closed-loop system that is better for the environment and the economy [21].

Dorri et al. [11] presented a decentralized, blockchain-based approach for the smart home, commonly called the most well-known IoT application, aimed at imparting security and privacy. This approach helps to strengthen capacity and counter vulnerabilities, and it fulfills the security requirements of smart homes: confidentiality, integrity, authorization, and availability. It also provides the obvious feature of transparency [14]. Countering IoT security issues is possible by using Hyperledger Fabric and Hyperledger Composer, as per the data mentioned previously. The highlighted design helps solve security limitations by building on the blockchain approach; however, Fabric has a separate order-based methodology for dealing with the situation. The principles of smart home mapping can meet the security requirements of an IoT smart home, as stated in the study by [22]. Another study focused on explaining the structure and on a critical analysis of its decision-making in the prescribed designs [4]. In [12], the authors proposed solutions for these limitations: they presented a trusted model, consensus, smart contracts, sequential and deterministic transaction execution, and confidentiality ensured by allowing peers to run every contract. Some managing and processing issues must still be addressed when using blockchain in IoT, including consensus, scalability, and insufficient storage capacity.


3 System Design

The system design elaborates the system's architecture shown in Fig. 1. In this research, three Cloud Virtual Machines (CVMs) were created on the DigitalOcean platform: RASPBERRY PI, SMART GATEWAY, and SMART HOME. These CVMs are connected through the Hyperledger Fabric blockchain by using Chaincode, as shown in Fig. 1. The Raspberry Pi CVM receives data from sensors and sends it to the gateway CVM, and the data is finally displayed on the Smart Home via Hyperledger Explorer (the user interface). The prime goal of this implementation is to replace the conventional IoT protocol, MQTT, with Hyperledger Fabric in order to secure the smart home from intrusions. Owing to the feeble MQTT encryption and decryption techniques, the study replaced it with blockchain technology (Fig. 2).

The Advanced Encryption Standard (AES) is the encryption technique used in MQTT, and the encryption key is secured using the SHA-256 hash function. Chaincode acts as a smart contract linking the three IoT devices to store and retrieve data. The gateway CVM uses Chaincode to receive information from the blockchain network and deliver it to the Hyperledger Explorer, while the Raspberry Pi CVM uses it to retain, update, and keep records on the Fabric. The three virtual machine containers are created in Docker Swarm, and the Chaincode is written in the Go programming language. In Fabric cryptography, signing is performed using the Elliptic Curve Digital Signature Algorithm (ECDSA), and hash functions are used in Hyperledger Fabric for encoding transaction processing; SHA-3 is the hash function used in this system. Firstly, the data from the Raspberry Pi CVM is encrypted before being transferred to the smart gateway CVM, as observed in Fig. 1. Secondly,

Fig. 1 Smart home implementation in Hyperledger Fabric with Chaincode

Fig. 2 Data sent from Raspberry Pi to smart gateway


the smart home CVM can access the data on the Hyperledger Fabric network. The smart gateway CVM can gather data from the private ledger as required. Finally, the Hyperledger Explorer receives the transaction data from the smart gateway, as shown in Fig. 1. The transaction details include the blockchain size, the Chaincode name (BASIC), and the data (BLOCK).
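As a hedged illustration of the kind of payload protection described above for the MQTT path (AES encryption of a sensor reading plus a SHA-256 digest of the key material), the sketch below uses AES-GCM from the third-party "cryptography" package. The key handling, nonce choice, and payload format are assumptions and this is not the authors' implementation.

```python
# Illustrative AES + SHA-256 protection of a sensor payload (assumed message format).
import hashlib
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

key = AESGCM.generate_key(bit_length=256)            # shared AES key
key_digest = hashlib.sha256(key).hexdigest()         # SHA-256 digest of the key material
aesgcm = AESGCM(key)

reading = b'{"sensor": "temperature", "value": 22.5}'
nonce = os.urandom(12)                               # 96-bit nonce, unique per message
ciphertext = aesgcm.encrypt(nonce, reading, associated_data=b"smart-home")

# The receiving side (e.g., the gateway CVM) decrypts with the same key and nonce.
plaintext = aesgcm.decrypt(nonce, ciphertext, associated_data=b"smart-home")
assert plaintext == reading
print("key digest:", key_digest[:16], "...")
```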

4 Experimental Results

The purpose of this study is to provide a comparative analysis of SHA-3, Keccak-256, and SHA-256. The comparison between SHA-3 and Keccak-256 demonstrates that the encryption algorithms based on these hashes are robust and show higher graph values. The findings of the avalanche effect for SHA-3 and Keccak-256 are displayed in Fig. 4, where the number of input changes is shown on the x-axis and the number of output changes on the y-axis. A comparison between SHA-3 and SHA-256 is shown in Fig. 3. The study compares the hashes with the improved method on both a fixed key and plain-text variation; the MQTT encryption algorithm (SHA-256) is unable to provide the maximum avalanche effect. The comparison in Fig. 3 is therefore one of the most essential parts of the study. The MQTT protocol uses SHA-256, as discussed previously; its graph shows only a slight change when the input bits are changed, and the difference between these two hashes is about 5%.

Fig. 3 Avalanche effect on encryption algorithm


Fig. 4 Avalanche effect on SHA-3 and Keccak-256

In Fig. 4, all the hashes are compared to assess their capacities and strength. Among the three hashes, SHA-3 and Keccak-256 performed better than SHA-256; statistically, the difference between Keccak-256 and SHA-256 is 5%. The Hyperledger Fabric blockchain (SHA-3), with a specified key, utilizes a better encryption algorithm than MQTT. Therefore, whether the plain text and key are fixed or changed, the blockchain performed outstandingly in both cases compared to MQTT.
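A minimal sketch of how an avalanche-effect comparison like the one above can be computed is shown below: flip one input bit at a time and count how many output bits change. SHA-256 and SHA3-256 are taken from the Python standard library; Keccak-256 (which differs from SHA3-256 only in padding) would need a third-party library such as pycryptodome, so it is omitted here. The test message and number of trials are arbitrary choices, not the authors' setup.

```python
# Avalanche-effect comparison of SHA-256 vs SHA3-256 (illustrative).
import hashlib

def bit_diff(a: bytes, b: bytes) -> int:
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

def avalanche(hash_name: str, message: bytes, flips: int = 64) -> float:
    h = lambda m: hashlib.new(hash_name, m).digest()
    base = h(message)
    total = 0
    for i in range(flips):                      # flip bit i of the message
        flipped = bytearray(message)
        flipped[i // 8] ^= 1 << (i % 8)
        total += bit_diff(base, h(bytes(flipped)))
    return total / flips                        # average output bits changed per flip

msg = b"smart home sensor reading 0001"
for name in ("sha256", "sha3_256"):
    print(f"{name}: {avalanche(name, msg):.1f} of 256 output bits change on average")
```

A strong hash should change roughly half of the 256 output bits (about 128) for a single-bit input change, which is the behaviour the avalanche plots in Figs. 3 and 4 visualize.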

5 Conclusion

In a nutshell, IoT is a fast-growing technology that links numerous devices to the internet. It has many applications in areas such as healthcare, industry, smart homes, and smart cities, and its prime purpose is to make our work easier and more accessible. A conspicuous example of IoT today is the smart home and its features, such as smart lighting systems, smart curtains, and even smart security systems. However, IoT devices have drawbacks: low processing power and little memory. Due to their limited capacity, well-known protocols cannot be run on IoT devices, and IoT protocols (MQTT, CoAP) have weak encryption and decryption capabilities; they can easily be hacked because of their fragile nature and can be made more secure using blockchain. The purpose of this paper is to protect smart homes, smart health, and safe cities from MITM attacks by utilizing blockchain. The study has compared the conventional IoT protocol with the blockchain, and the latter is more secure in its encryption and decryption capacities. Among the various blockchain platforms, the study focused on Hyperledger Fabric associated with the Chaincode. The hash results are compared via avalanche effects,


demonstrating that a Hyperledger Fabric IoT solution is more secure than the MQTT protocol. In the future, we intend to extend our proposed model to protect the privacy of Spark RDDs and blocks in conjunction with existing approaches [5–7].

References

1. Al-Kuwari M, Ramadan A, Ismael Y, Al-Sughair L, Gastli A, Benammar M (2018) Smart-home automation using IoT-based sensing and monitoring platform. In: 2018 IEEE 12th international conference on compatibility, power electronics, and power engineering
2. Al Sadawi A, Hassan MS, Ndiaye M (2021) A survey on the integration of blockchain with IoT to enhance performance and eliminate challenges. IEEE Access 9:54478–54497
3. Ammi M, Alarabi S, Benkhelifa E (2021) Customized blockchain-based architecture for a secure smart home for lightweight IoT. Inf Process Manage 58(3):102482
4. Androulaki E, Barger A, Bortnikov V, Cachin C, Christidis K, De Caro A et al (2018) Hyperledger fabric: a distributed operating system for permissioned blockchains. In: Proceedings of the thirteenth EuroSys conference, pp 1–15
5. Bazai SU, Jang-Jaccard J (2019) SparkDA: RDD-based high-performance data anonymization technique for spark platform. In: International conference on network and system security, pp 646–662
6. Bazai SU, Jang-Jaccard J (2020) In-memory data anonymization using scalable and high-performance RDD design. Electronics 9(10):1732
7. Bazai SU, Jang-Jaccard J, Alavizadeh H (2021) A novel hybrid approach for multi-dimensional data anonymization for Apache Spark. ACM Trans Privacy Secur 25(1):1–25
8. Bazai SU, Jang-Jaccard J, Wang R (2017) Anonymizing KNN classification on MapReduce. In: International conference on mobile networks and management, pp 364–377
9. Bazai SU, Jang-Jaccard J, Zhang X (2017) A privacy-preserving platform for MapReduce. In: International conference on applications and techniques in information security, pp 88–99
10. Cavalieri A, Reis J, Amorim M (2022) A conceptual model proposal to assess the effectiveness of IoT in sustainability orientation in manufacturing industry: an environmental and social focus. Appl Sci 12(11):5661
11. Dorri A, Kanhere SS, Jurdak R, Gauravaram P (2017) Blockchain for IoT security and privacy: the case study of a smart home. In: 2017 IEEE international conference on pervasive computing and communications workshops (PerCom workshops), pp 618–623
12. Gupta V, Khera S, Turk N (2021) MQTT protocol employs IoT based home safety system with ABE encryption. Multimedia Tools Appl 80:2931–2949
13. Hassija V, Chamola V, Saxena V, Jain D, Goyal P, Sikdar B (2019) A survey on IoT security: application areas, security threats, and solution architectures. IEEE Access 7:82721–82743
14. Rejeb A, Keogh JG, Treiblmaier H (2019) Leveraging the internet of things and blockchain technology in supply chain management. Future Internet 11(7):161
15. Saini S, Maithani A, Dhiman D, Rohilla A, Chaube N, Bisht A (2021) Blockchain technology: a smart and efficient way for securing IoT communication. In: 2021 2nd international conference on intelligent engineering and management (ICIEM), pp 567–571
16. Samanta S, Mohanta BK, Patnaik D, Patnaik S (2021) Introduction to blockchain evolution, architecture, and application with use cases. In: Blockchain technology and innovations in business processes. Springer, pp 1–16
17. Singh M, Singh A, Kim S (2018) Blockchain: a game changer for securing IoT data. In: 2018 IEEE 4th world forum on internet of things (WF-IoT), pp 51–55
18. Urien P (2018) Blockchain IoT (BIoT): a new direction for solving internet of things security and trust issues. In: 2018 3rd cloudification of the internet of things (CIoT), pp 1–4


19. Urmila M, Hariharan B, Prabha R (2019) A comparative study of blockchain applications for enhancing internet of things security. In: 2019 10th international conference on computing, communication and networking technologies (ICCCNT), pp 1–7
20. Yakut S, Şeker Ö, Batur E, Dalkılıç G (2019) Blockchain platform for the internet of things. In: 2019 innovations in intelligent systems and applications conference (ASYU), pp 1–6
21. Zheng Z, Xie S, Dai H-N, Chen X, Wang H (2018) Blockchain challenges and opportunities: a survey. Int J Web Grid Serv 14(4):352–375
22. Zhong H, Zhou Y, Zhang Q, Xu Y, Cui J (2021) An efficient and outsourcing-supported attribute-based access control scheme for edge-enabled smart healthcare. Futur Gener Comput Syst 115:486–496

Quality of Service Improvement of 2D-OCDMA Network Based on Two Half of ZCC Code Matrix Mohanad Alayedi

Abstract In this paper, a novel two-dimensional spectral/spatial code construction based on half code matrices, named the two-dimensional half spectral/spatial zero cross correlation (2D-HSSZCC) code, is developed for implementation in non-coherent spectral amplitude coding optical code division multiple access (SAC-OCDMA) systems. The proposed 2D-HSSZCC code is characterized by a high capacity and a zero cross correlation property that completely removes the influence of multiple access interference (MAI), which is considered the main drawback of OCDMA systems. The numerical results indicate that the SAC-OCDMA system is improved with the proposed code in terms of capacity, data rate, and optical bandwidth. The proposed 2D-HSSZCC code is able to save around 2.4 and 6 THz of optical bandwidth compared to the 2D dynamic cyclic shift (2D-DCS) and 2D perfect difference (2D-PD) codes, respectively. In addition, it can increase the cardinality by up to 63.2%, 70.8%, and 19.4% compared to the one-dimensional ZCC (1D-ZCC), 2D-PD, and 2D-DCS codes, respectively. Moreover, the data rate has been tested in SAC-OCDMA systems, where the greatest share belonged to the 2D-HSSZCC code. The system performance has also been studied using Optisystem software, where the simulation results show that the proposed code produces a low bit error rate (BER) and a high Q-factor of approximately 1.8e-16 and 8.1 dB, respectively, for four users with a low source power of −115 dBm.

Keywords 2D-HSSZCC code · PIIN · OCDMA · Q-factor · Cross correlation

M. Alayedi (B)
Department of Electronics, Ferhat Abbas University of Setif 1, 19000 Setif, Algeria
e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
S. Anwar et al. (eds.), Proceedings of International Conference on Information Technology and Applications, Lecture Notes in Networks and Systems 614, https://doi.org/10.1007/978-981-19-9331-2_43

1 Introduction

Optical networks have become important parts of telecommunication systems because they achieve higher information transmission speeds than switches and routers. They are distinguished by their transparency to protocols and data formats, which permits increasing the flexibility and functionality of the


network to meet its requirements [1]. The optical code division multiple access (OCDMA) technique has attracted researchers' attention in view of its advantages: an enormous quantity of information issued by a large number of users can be transmitted rapidly in the same frequency and time domains over a single channel (an optical fiber), with a specific code for each user, high security, and spread-spectrum operation [2, 3]. Multiple access interference (MAI) is considered the main factor limiting system performance [4]. Within the OCDMA framework, the SAC technique, i.e. spectral amplitude coding, has gained more interest because it employs codes with a low cross-correlation value and a high auto-correlation value and deals with MAI efficaciously [5]. Moreover, simple encoder/decoder devices composed of fiber Bragg gratings (FBGs) facilitate the spectral encoding/decoding process compared to the high-speed electronic circuitry employed in spectral phase decoding [6]. Initially, research on encoding began with one-dimensional (1D) codes, but these impede increases in cardinality because they require an increase in code length. To overcome this problem of 1D codes, several two-dimensional (2D) encoding schemes have been proposed thanks to the continuous efforts of researchers, for example spectral-spatial and spectral-time schemes that eliminate the limitation on the number of users [7, 8]. Jellali et al. [9] proposed the 2D dynamic cyclic shift (2D-DCS) code, which uses the MAI cancellation property to eliminate MAI totally; the system performance is enhanced by decreasing the spectral code length and increasing the spatial code length. Lin et al. [10] proposed the 2D perfect difference (2D-PD) code, which uses the MAI cancellation property, as in the above approach, to eliminate MAI, while the PIIN influence is also greatly reduced. Accordingly, this investigation proposes a 2D half spectral/spatial zero cross correlation (2D-HSSZCC) code based on the 1D-ZCC code, with its corresponding system structure created by selecting the spectral and spatial encoding domains as the first and second dimensions, respectively.

The paper is organized as follows: the second section presents the 2D-HSSZCC code construction; the third shows the evaluation of the system performance; the fourth section explains the numerical and simulation results; and the last section presents a conclusion.

2 2D-HSSZCC Code Construction

The 2D-HSSZCC code is designed, as stated above, based on the 1D-ZCC code. The 1D-ZCC code is characterized by the parameters (Nu, w, λc, L), which refer to the number of active users, the code weight, the in-phase cross correlation, and the code length, respectively. The last parameter can be expressed as [3]:

L = Nu ∗ g   (1)


Furthermore, it is able to minimize the overlapping of signals from different users thanks to the ZCC property. The 1D-ZCC code has multiple advantages, such as a flexible code length and code weight, easy design stages, support for a large number of simultaneous users, a high data rate, and a low-power light source. The 2D-HSSZCC code design can be summarized in the four stages below [3]; a small numerical sketch of the construction is given after Eq. (6).

A. Stage 1
At the beginning, the first half of the 1D-ZCC code is defined through the following formula:

\text{1st half (ZCC)} = \begin{bmatrix} C_1 \\ C_2 \\ \vdots \\ C_{N_u/2} \end{bmatrix}_{\frac{N_u}{2} \times L}   (2)

B. Stage 2
Each ZCC code matrix consists entirely of binary values. The positions of the ones are assigned using the rule in Eq. (3), while the remaining positions are filled with zeros:

C_{j,k} = j + \frac{N_u}{2}\, k   (3)

where j = 1, 2, 3, ..., Nu/2 and k = 0, 1, 2, ..., g − 1. As an example for the first half of the ZCC code, choosing j = 1, 2, 3 and k = 0, 1 gives the positions of the ones presented in Table 1. Based on the above, the first half of the ZCC code matrix can be written as follows:

\text{1st half ZCC} = \begin{bmatrix} 1&0&0&1&0&0&0&0&0&0&0&0 \\ 0&1&0&0&1&0&0&0&0&0&0&0 \\ 0&0&1&0&0&1&0&0&0&0&0&0 \end{bmatrix}   (4)

C. Stage 3
Rotating the first half of the ZCC code matrix by 180° produces the second half of the ZCC code matrix, as shown in Eq. (5).

Table 1 The positions of the ones for the first half of the 1D-ZCC code matrix

j    k = 0    k = 1
1    1        4
2    2        5
3    3        6


\text{2nd half ZCC} = \begin{bmatrix} 0&0&0&0&0&0&1&0&0&1&0&0 \\ 0&0&0&0&0&0&0&1&0&0&1&0 \\ 0&0&0&0&0&0&0&0&1&0&0&1 \end{bmatrix}   (5)

D. Stage 4
As mentioned above, we now have the two halves of the 1D-ZCC matrix. The full 1D-ZCC code matrix is obtained by combining the first and second halves in a single matrix, as elaborated below:

\text{ZCC} = \begin{bmatrix} \text{1st half (ZCC)} \\ \text{2nd half (ZCC)} \end{bmatrix} = \begin{bmatrix} 1&0&0&1&0&0&0&0&0&0&0&0 \\ 0&1&0&0&1&0&0&0&0&0&0&0 \\ 0&0&1&0&0&1&0&0&0&0&0&0 \\ 0&0&0&0&0&0&1&0&0&1&0&0 \\ 0&0&0&0&0&0&0&1&0&0&1&0 \\ 0&0&0&0&0&0&0&0&1&0&0&1 \end{bmatrix}   (6)
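The sketch referenced above illustrates Stages 1-4 for the example values Nu = 6 and g = 2 (so L = Nu ∗ g = 12): it fills the one-positions with the rule of Eq. (3), rotates the half matrix by 180° for the second half, and stacks the two halves as in Eq. (6). The variable names are illustrative only.

```python
# Numerical sketch of the ZCC construction (Stages 1-4) for Nu = 6, g = 2.
import numpy as np

def zcc_matrix(nu: int, g: int) -> np.ndarray:
    L = nu * g
    half = np.zeros((nu // 2, L), dtype=int)
    for j in range(1, nu // 2 + 1):                  # rows j = 1 .. Nu/2
        for k in range(g):                           # one '1' per weight unit
            half[j - 1, j + (nu // 2) * k - 1] = 1   # Eq. (3): C_{j,k} = j + (Nu/2) k
    second_half = np.rot90(half, 2)                  # Stage 3: 180-degree rotation
    return np.vstack([half, second_half])            # Stage 4: stack the two halves

zcc = zcc_matrix(6, 2)
print(zcc)                                           # reproduces the matrix of Eq. (6)
assert (zcc.sum(axis=1) == 2).all()                  # every code word has weight g = 2
assert (zcc @ zcc.T == 2 * np.eye(6)).all()          # zero cross correlation between users
```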

The 2D-HSSZCC code matrix can be generated with the aid of two 1D-ZCC code sequences, "E" and "F", used for spectral and spatial encoding, respectively. E and F have the code lengths L1 = Nu1 ∗ w1 and L2 = Nu2 ∗ w2, where their code sizes are Nu1 and Nu2 and their code weights are w1 and w2, respectively. Let Me,f denote the 2D-HSSZCC code, expressed as:

M_{e,f} = E_e \, F_f^{T} = \begin{bmatrix} m_{0,0} & m_{0,1} & \cdots & m_{0,L_1-1} \\ m_{1,0} & m_{1,1} & \cdots & m_{1,L_1-1} \\ m_{2,0} & m_{2,1} & \cdots & m_{2,L_1-1} \\ \vdots & \vdots & & \vdots \\ m_{L_2-1,0} & m_{L_2-1,1} & \cdots & m_{L_2-1,L_1-1} \end{bmatrix}   (7)

The total capacity of the 2D-HSSZCC code is:

K = N_{u1} ∗ N_{u2}   (8)

Let m_{i,j} represent the elements of M_{e,f}, where i = 0, 1, 2, ..., L1 − 1 and j = 0, 1, 2, ..., L2 − 1. An example of the 2D-HSSZCC code sequences is presented in Table 2 for (Nu1 = 2, w1 = 2, Nu2 = 2, w2 = 2). In order to explain the 2D-HSSZCC code cross correlation, the characteristic matrices M(d) (d = 1, 2, 3, 4) are written as [11]:

M^{(1)} = E \, F^{T}, \quad M^{(2)} = E \, \bar{F}^{T}, \quad M^{(3)} = \bar{E} \, F^{T}, \quad M^{(4)} = \bar{E} \, \bar{F}^{T}   (9)

Table 2 2D-HSSZCC code with Nu1 = Nu2 = 2 and g1 = g2 = 2

E_0 = (1\ 0\ 1\ 0), \quad E_1 = (0\ 1\ 0\ 1), \quad F_0^{T} = (1\ 0\ 1\ 0)^{T}, \quad F_1^{T} = (0\ 1\ 0\ 1)^{T}

m_{0,0} = \begin{bmatrix} 1&0&1&0 \\ 0&0&0&0 \\ 1&0&1&0 \\ 0&0&0&0 \end{bmatrix}, \quad m_{0,1} = \begin{bmatrix} 0&1&0&1 \\ 0&0&0&0 \\ 0&1&0&1 \\ 0&0&0&0 \end{bmatrix}, \quad m_{1,0} = \begin{bmatrix} 0&0&0&0 \\ 1&0&1&0 \\ 0&0&0&0 \\ 1&0&1&0 \end{bmatrix}, \quad m_{1,1} = \begin{bmatrix} 0&0&0&0 \\ 0&1&0&1 \\ 0&0&0&0 \\ 0&1&0&1 \end{bmatrix}

e=0 f=0

e = 0 f = 0

e = 0 f = 0

e = 0 f = 0

R(0)

R(1)

R(2)

R(3)

g1 ∗ g2

0

0

0

0

g1 ∗ g2

0

0

0

0

g1 ∗ g2

0

0

0

0

g1 ∗ g2

where E and F represents the complement of code sequences E and F, consecutively. Moreover, Table 3 is added for more clarification the 2D-HSSZCC code cross correlation in four different cases. The cross correlation of 2D-HSSZCC can be expressed as: R(e, f) =

L 1 −1 L 2 −1

(0) mi,j .mij (e, f)

=

i=0 j=0

g1 .g2 ; e = 0, f = 0 0 else

(10)

3 System Performance and Analysis In order to simplify the system analysis, four assumptions should be taken into account [12]. Firstly, un-polarized broadband light source (BBS) has a flat spectrum over the interval [v0 − v/2, v0 + v/2] in the domain. (v0 ) is the central optical frequency and (v) is the optical source bandwidth. Secondly, all transmitters are configured with equal power. Thirdly, all components of the power spectral have an

504

M. Alayedi

identical spectrum. Finally, the fourth assumption is represented by achieving the synchronization property for each bit flux. Depending on these assumptions and the Gaussian approximation, the bit error rate (BER) estimation can be realized by considering the thermal noise, PIIN and shot noise. The photo-current noise is given by [13]: 2 2 2 2 = σshot + σPIIN + σthermal = 2eBe Ioutput + Be Ioutput 2 τc + σnoise

4Kb Tn Be Rl

(11)

where “e” refers to the electron charge, “Be ” refers to the electrical bandwidth, “Iout put ” refers to the average photo current, Kb refers to Boltzmann’s constant, Tn refers to the absolute temperature, Rl refers to the load resistance and finally τc refers to the coherence time of the light which can be expressed as [14]:



p2 (v)dv τc = 0∞ ( 0 p(v)dv)2

(12)

The power spectral density (PSD) at level of receiver can be expressed as [3]: p(v) =

L K 1 −1 L 2 −1  Psr dK mi,j (v, n) g2 v k=1 i=0 j=1

(13)

 where Psr is the effective source power, dK is the data of Kth user and (v, n) is the ith of the broadband source can be defined as:     v v (v, n) = u v − v0 − (−L1 + 2n) − [v − v0 − u (−L1 + 2n + 2)] 2M 2M (14) where u(v) is the unit step function which can be written as: u(v) =

1 0

≥0 otherwise

(15)

0 and Me,f , we can write the Depending on the cross correlation between M0,0 output currents of PD as: ∞

Ioutput = R ∫ p(v)dv = 0

RPsr g1 M

(16)

η.e where R represents the PD responsivity and expressed as: R = h.v . η,e,h,v0 : repre0 sent the quantum efficiency, electron’s charge, Plank’s constant and central frequency of broad-band optical pulse, respectively. Although L 1 = Nu1 w1 and Nu1 = K/Nu2 , we will obtain:

I_{output} = \frac{R \, P_{sr} \, N_{u2}}{K}

(17)

The PIIN variance can be expressed as followed: 2 σPIIN = Be I2output τc =

Be R2 P2sr 2 g vL 1 1

(18)

Substituting Eqs. (17) and (18) into Eq. (11) we obtain: 2 σnoise =

2eBr Psr K2 4Kb Tn Br Br 2 P2sr K2 + g1 + K vK Rl

(19)

Since, the probability of transmitting bit “0” and “1” is the same and equal to (0.5) therefore Eq. (19) will become: 2 σnoise =

Be R2 P2sr Nu2 eBe RPsr Nu2 4Kb Tn Be + g1 + K 2vK Rl

(20)

Finally, depending on the consequences of Eqs. (16) and (19), we can write the signal to noise ratio (SNR) expression as following form: SNR =

2  Ioutput 2 σnoise

 RPsr Nu2 2 =

eBe RPsr Nu2 K

+

K Br R2 P2sr Nu2 g1 2vK

+

4Kb Tn Br Rl

(21)

Then, we can calculate the BER using the Gaussian approximation as [5, 14]: BER =

  1 erfc SNR 8 2

(22)

4 Results and Discussion According to parameters in Table 4, they are used to estimate the system performance with our proposed code which compared with 1D-ZCC, 2D-PD and 2D-DCS codes for the same code lengths (L 1 = 57 and L 2 = 3) in term of BER as function of number of active users and effective source power. Figure 1 offers the BER variation in front of number of active users when 1 Gbps and −10 dBm of data rate and received power, respectively. It is plainly that the OCDMA system with our code has our performed than others code 2D-PD and 2DDCS. It can supply up to 111 users whereas the others can supply up to 93, 68 and 68 users for 2D-DCS, 2D-PD and 1D-ZCC codes, respectively. Thus, the increased percent is calculated as;


Table 4 Used parameters in numerical analysis

Parameter

Value

Parameter

Value

Photo diode responsivity (R)

0.75

Receiver noise temperature (Tn )

300 K

Effective source power (Psr )

−10 dBm

Receiver load resistor (Rl )

1030 

Electron charge (e)

1.6 × 10−19

Spectral width (v)

5 THz

111 − 68 111 − 93 111 − 65 = 70.8 % , = 63.2 % , and = 19.4 % 65 68 93 Figure 2 offers BER variation in front of data rate when active users number is fixed at 50. As seemed in this figure that our code enables each user in OCDMA system based on our proposed code to exploit data rate reaches 2.1 Gbps whereas 1D-ZCC, 2D-PD and 2D-DCS codes provide minor data rate the 2D-HSSZCC, reach 0.5, 0.68 and 1.2 Gbps, respectively. Finally, we say that our system can increase around 4.2, 3.1 and 1.75 times in comparison with 1D-ZCC, 2D-PD and 2D-DCS codes, respectively. As appeared in Fig. 3, the BER variation in front of spectral width for 50 and 500 MHz of number of active users and electrical bandwidth, respectively. Firstly, 1D-ZCC code is not introduced in this study due to ZCC property that annuls PIIN. Further, “v” variable does not existed in SNR equation for ZCC code. Therefore, the study of Fig. 3 confines to 2D codes. It is observed that the 2D-HSSZCC code needs minor optical bandwidth up to 1.6 THz. Regarding to remaining 2D codes: DCS and PD, they need just 4 and 7.6 THz. As a result, our proposed code can save around 2.4 and 6 THz comparing to 2D-DCS and 2D-PD codes, respectively. 0

Fig. 1 BER versus number of active users for (L 1 = 57 and L 2 = 3)

10

2D-PD (L1=57, L2=3) 2D-DCS (L1=57, L2=3) 1D-ZCC(w=4) 2D-HSSZCC (L1=57, L2=3)

-5

BER

10

-10

10

-15

10

-20

10

20

30

40

50

60

70

80

90

Number of active users

100

110

120


0

Fig. 2 BER versus data rate for (K = 50)

10

-5

BER

10

-10

10

-15

10

2D-PD (L1=57, L2=3) 2D-DCS (L1=57, L2=3) 1D-ZCC(w=2) 2D-HSSZCC (L1=57, L2=3)

-20

10

2.5

2

1.5

1

0.5

0

Data Rate (Gbps)

0

Fig. 3 BER versus spectral width for (K = 50)

10

2D-PD (L1=57, L2=3) 2D-DPD (L1=57, L2=3) 2D-HSSZCC (L1=57, L2=3) -5

BER

10

-10

10

-15

10

-20

10

0

1

2

3

4

5

6

7

8

spictral width (Thz)

Additionally, Optisystem software ver. 7.0 has been used to study the OCDMA system performance based on 2D-HSSZCC code whence BER and Q-factor for transmitted power and data rate up to respectively −115 dBm and 1000 Mbps of set data rate for each user. Moreover, the thermal noise, fiber Bragg gratting bandwidth and dark current are set at 1.8 × 10−23 W/H z, 0.3 nm and 10 nA, respectively. According to the above, Fig. 4 offers the eye diagram of 2D-HSSZCC code at distance up to 20 km. It is plainly that 2D-HSSZCC code grants OCDMA system a good performance represented respectively by 1.8×10−16 and 8.1 dB of BER and Q-factor when the number of concurrent users are four.


Fig. 4 Eye diagram of three users utilizing 2D-HSSZCC code

5 Conclusion

This paper introduced a novel code called 2D-HSSZCC, based on the 1D-ZCC code and built from two half ZCC code matrices with the aid of the rotation property. Numerical results confirmed that the 2D-HSSZCC system outperforms the 1D-ZCC, 2D-PD, and 2D-DCS systems by different proportions in terms of number of active users, data rate, and spectral bandwidth. Furthermore, the simulation results also proved the efficiency of the proposed code for optical communication requirements, represented by a high Q-factor and a low BER in spite of the long distance and high data rate. These optimizations are due to the ZCC feature, which is able to completely restrict the influence of MAI. Finally, this work can be extended in the future to look for improvements beyond those presented in this paper, for instance by modifying the encoding schemes or adding another dimension.


References

1. Alayedi M, Cherifi A, Hamida AF, Bouazza BS, Aljunid SA (2021) Performance improvement of optical multiple access CDMA network using a new three-dimensional (spectral/time/spatial) code. Wirel Pers Commun 118:2675–2698. https://doi.org/10.1007/s11277-021-08149-0
2. Alayedi M, Cherifi A, Ferhat Hamida A, Bouazza BS, Rashidi CBM (2022) Performance enhancement of SAC-OCDMA system using an identity row shifting matrix code. In: Proceedings of international conference on information technology and applications (ICITA), pp 547–559. https://doi.org/10.1007/978-981-16-7618-5_48
3. Alayedi M, Cherifi A, Hamida AF, Rahmani M, Attalah Y, Bouazza BS (2020) Design improvement to reduce noise effect in CDMA multiple access optical systems based on new (2-D) code using spectral/spatial half-matrix technique. J Opt Commun. https://doi.org/10.1515/joc-2020-0069
4. Bhanja U, Singhdeo S (2020) Novel encryption technique for security enhancement in optical code division multiple access. Photon Netw Commun 39:195–222. https://doi.org/10.1007/s11107-020-00883-y
5. Abd El-Mottaleb SA, Fayed HA, Aly MH (2019) An efficient SAC-OCDMA system using three different codes with two different detection techniques for maximum allowable users. Opt Quant Electron 51:1–18. https://doi.org/10.1007/s11082-019-2065-8
6. Sharma T, Ravi Kumar M (2022) Analytical comparison of various detection techniques for SAC-based OCDMA systems: a comparative review. In: Proceedings of optical and wireless technologies (OWT), pp 63–75
7. Alayedi M, Cherifi A, Ferhat Hamida A, Mrabet H (2021) A fair comparison of SAC-OCDMA system configurations based on two dimensional cyclic shift code and spectral direct detection. Telecommun Syst. https://doi.org/10.1007/s11235-021-00840-8
8. Kadhim RA, Fadhil HA, Aljunid SA, Razalli MS (2014) A new two dimensional spectral/spatial multi-diagonal code for noncoherent optical code division multiple access (OCDMA) systems. Opt Commun 329:28–33. https://doi.org/10.1016/j.optcom.2014.04.082
9. Jellali N, Najjar M, Ferchichi M, Rezig H (2017) Development of new two-dimensional spectral/spatial code based on dynamic cyclic shift code for OCDMA system. Opt Fiber Technol 36:26–32. https://doi.org/10.1016/j.yofte.2017.02.002
10. Lin C, Wu J, Yang C (2005) Noncoherent spatial/spectral optical CDMA system with two-dimensional perfect difference codes. J Lightwave Technol 23:3966–3980
11. Alayedi M, Cherifi A, Hamida AF, Matem R, El-Mottaleb SAA (2023) Performance improvement of SAC-OCDMA network utilizing an identity column shifting matrix (ICSM) code. In: Proceedings of advances in cybersecurity, cybercrimes, and smart emerging technologies (CCSET), pp 263–278. https://doi.org/10.1007/978-3-031-21101-0_21
12. Kumawat S, Maddila RK (2017) Development of ZCCC for multimedia service using SAC-OCDMA systems. Opt Fiber Technol 39:12–20. https://doi.org/10.1016/j.yofte.2017.09.015
13. Nisar KS, Sarangal H, Thapar SS (2019) Performance evaluation of newly constructed NZCC for SAC-OCDMA using direct detection technique. Photon Netw Commun 37:75–82. https://doi.org/10.1007/s11107-018-0794-4
14. Meftah K, Cherifi A, Dahani A, Alayedi M, Mrabet H (2021) A performance investigation of SAC-OCDMA system based on a spectral efficient 2D cyclic shift code for next generation passive optical network. Opt Quant Electron 53:1–28. https://doi.org/10.1007/s11082-021-03073-w

The Impact of 5G Networks on Organizations Anthony Caiche, Teresa Guarda , Isidro Salinas, and Cindy Suarez

Abstract The purpose of this work is to publicize the impact of the 5G mobile phone network on organizations, and to clarify why 5G is not a danger but rather an important technological advance within society. It explains that non-ionizing signals do not have a harmful impact on human health, since radio signals, microwaves, and infrared fall into this classification and are signals that are routinely used by people and by companies. In Ecuador and other Latin American countries, the implementation of 5G is not yet a reality, so within this study the perspective of developed countries that already make use of this new generation of mobile telephony is analyzed in order to achieve the objectives of the investigation. This work builds on other scientific works where this network is discussed, such as "5G Technology in Ecuador" and the "Technical analysis for the deployment of a 5G network in Ecuador".

Keywords 5G · Digital economy · Mobile connections

A. Caiche · T. Guarda (B) · I. Salinas · C. Suarez
Universidad Estatal Peninsula de Santa Elena, La Libertad, Ecuador
e-mail: [email protected]
CIST—Centro de Investigación en Sistemas y Telecomunicaciones, La Libertad, Ecuador
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
S. Anwar et al. (eds.), Proceedings of International Conference on Information Technology and Applications, Lecture Notes in Networks and Systems 614, https://doi.org/10.1007/978-981-19-9331-2_44

1 Introduction

The world has developed various ways of obtaining an economic livelihood, one of these being the formation of the digital economy; fast and reliable connections are therefore of the utmost importance in the workplace, much as the electricity connection or the means of transportation used to reach a destination once were [1]. In the telecommunications sector, technological evolution has been incessant, offering better benefits and services to subscribers in fixed and mobile networks. In the field of land mobile communications, this effect is very clear, as each decade a new generation emerges that improves on and provides new services with respect to the


previous one, now called 5G. It must be taken into account that the impact of this new technological paradigm will not be limited to the field of electronic communications, but will facilitate the introduction of innovative applications for companies, citizens, and public administrations. 5G technology aims to become a pillar of the digital transformation processes of society and the economy; the main enabling solutions for digital transformation, such as the Internet of Things, big data, robotics, virtual reality, and ultra-high definition, will be supported by this technology [2].

A brief analysis of the evolution of mobile networks makes the great advances achieved to date evident, but it is important to mention that the technology raising the most expectations is the arrival of 5G, since it promises to make possible a hyper-connected world in which most everyday objects will be connected to each other as well as to people. To achieve this, 5G technology will have greater bandwidth, lower latency, and greater capacity to connect many devices; something else that should be highlighted due to its importance is energy efficiency, which is undoubtedly essential for this technological development to be possible [3].

This paper is organized into six sections. Section 2 explains the type of research carried out to collect information on 5G networks. Section 3 presents a comparative analysis, mainly of European countries with specific mentions of China, as they are the main protagonists of the important worldwide technological expansion. Section 4 exposes the existing problems in Ecuador and the barriers that prevent the pilot plan undertaken by the telecommunications companies from making successful progress. Section 5 shows the strategies that other countries use to benefit from the implementation of 5G networks, so that these strategies can be replicated within the region to achieve important technological progress. Finally, the conclusions are presented.

2 Materials and Methods

The 5G mobile phone network is a new technology that is already a reality in certain parts of the world and, with difficulty, could become one in others. This article is therefore focused on identifying the obstacles that delay this technology in the Ecuadorian region and the advantages of implementing and using 5G networks in existing companies or organizations in other countries. To obtain information about the problem, data is collected from other published articles that cover the main reasons why the benefits of this new technology are delayed in Ecuador, and at the same time from articles describing the gains in productivity and efficiency of companies and organizations that use 5G daily for the management and administration of their processes. As a method to achieve this purpose, a basic descriptive investigation is used, which, as Esteban Nieto mentions, consists of a compilation of information


that includes characteristics, properties, aspects or dimensions of what is investigated [4].

3 5G Networks With the arrival of 5G technology, mobile telephony innovation takes on a new importance within organizations, presenting less latency in sending data. We refer to 5G networks as the fifth generation of mobile networks, which have been evolving over the decades, starting from the first generation (1G) that allowed voice service; followed by the second generation (2G) that implemented SMS and helped smartphones become a communication tool; In the third generation (3G), an internet connection was achieved, which improved with the arrival of the fourth generation (4G), in which the bandwidth was implemented, thus achieving real-time video playback [5]. Currently, there is the fifth generation of mobile networks or (5G), which has allowed the speed of the connection to be increased up to 10 times the speed offered by the main optical fibers on the market [6]. 5G has the potential to offer universal connectivity to industrial systems, and it proposes increased flexibility within production processes. Wireless communication, in general, allows new, simpler layouts of machines, production modules, or transportation of materials with AGV. 5G specifically adds the reliability and determinism necessary for such flexibility on an industrial scale [7]. There is a great problem for which the implementation of the 5G network has not been able to expand with total success in Latin American countries, and this is the belief that this type of signal can affect human health, by the amount of radiation that an antenna can emit. For this reason, it is important to emphasize the meaning of non-ionizing radiation, since by not altering the molecules or atoms it does not present a danger to human health, since these radiations range from 0 Hz to 1,660 THz [8]. In Ecuador, like the rest of the world, there is a tendency towards demanding better forms of communication. These demands can be covered with the implementation of 5G technology that promises users speed, coverage, lower energy consumption, among others [6, 9, 10]. Due to the new normality that society is facing where we have been forced to virtualize tasks such as studying and on many occasions even work, the growing demand for services that guarantee connectivity almost twenty-four hours a day and seven days a the week, have been a great challenge for the telecommunications sector within the country, which in a certain way is trying to adapt, join, and expand in response to the requirements of the new way of life [11]. The 5G mobile phone network is a new technology that is a reality in certain parts of the world and in others with difficulty it could be, therefore, this article is focused on finding the obstacle that delays this technology in the Ecuadorian region and the advantage that presents implementing and using 5G networks in existing companies or organizations in international regions.


In this paper, an extensive search was carried out to establish the importance of 5G networks at the international level, including a survey of the new technologies that are being applied today with the implementation of fifth-generation networks. At the international level, highly developed networks pave the way for the worldwide expansion of this technology, particularly given the situations created by the emergence of the pandemic. The world will continue to be a global market, with the European Union (EU) and China as prominent examples, since they benefit most from mutual exchanges thanks to their technological advances. In the technological sector in particular, telecommunications have become global supplies that benefit consumers who follow technological progress on all continents. Currently, increased investment in 5G is well received by the large providers, which are global companies that take advantage of diversity and technological capacity. Research carried out in the European Union is subject to the laws of each territory, which shows that the leaders and manufacturers of 5G networks depend on current international supply chains [12]. The international implementation of 5G networks also generates controversy: there are complaints about Chinese technology, and the US government questions whether security could be manipulated by the creators of the Huawei company, so countries that choose to acquire certain services impose contractual clauses that must be fulfilled to avoid complications in the future [13]. We analyse the current situation of three European countries (United Kingdom, Spain, Italy) and one Asian country (South Korea). The United Kingdom granted 150 MHz and plans to auction 120 MHz; detailed migration planning of all the services that exist in the 3.5 GHz band was taken into account through public consultations, together with the importance of defragmentation, which enables users to ensure continuous spectrum tenure by reducing radio spectrum fragmentation [14]. Spain included specific measures for 5G in its plans, which will be useful for the deployment of this generation and reserve key spectrum for the development of new technologies. In Italy, the bands have not been auctioned in their entirety, since part of the spectrum is used by the government, specifically the armed forces, and there are WIMAX licenses that are about to expire. South Korea held the world's first auction of 5G spectrum in 2018 and wants to become one of the countries with the highest availability of spectrum; full deployment was expected in 2022, reducing the cost of its implementation. It is worth emphasizing how clearly companies benefit, and that the global crisis positioned technology as a tool of great importance through the efficient and effective integration of this current technology, 5G networks [12].


4 5G Network in Ecuador The GSM Association (GSMA) expects the implementation of mobile technologies to improve the Latin American economy by more than 300 billion dollars, provided that the countries adopt mobile services and the associated improvements in productivity and efficiency; Latin America currently shows a large base of mobile subscribers, which is expected to continue growing over the next 5 years, improving the outlook for the integration of 5G technologies [15]. The use of 5G technologies provides a series of advantages, enabling artificial intelligence, smart cities, and other technologies that rely on the deployment of 5G networks and communications. But is Ecuador ready for this deployment? According to studies carried out by VIAVI Solutions, this year a total of 72 countries already have this technology, with China, the USA, and the Philippines leading the ranking; Ecuador, on the other hand, is still in an initial phase of this development [16]. The implementation of this new class of networks could promote digital innovation, which would allow, in a first phase, new ways of commercializing products and services, offering innovation for customers and for all personnel who use the services of a company that has deployed 5G technology. In 2019, the main telecommunications operators in the country, CNT, Claro, and Movistar, carried out a pilot plan for the implementation of this technology in the city of Guayaquil. This pilot test is still under development this year; for 3 years the progress of the tests has stagnated, and some of the factors that prevent this deployment are the economic, technical, and administrative barriers of the infrastructure [17]. One of the main drawbacks is the high cost of the technology: according to the telecommunications company Claro, the largest cost is the use of the network spectrum, compounded by the excessive fees charged within the country for this use, up to 5 times higher than the Latin American average. This point is being improved within the country: ARCOTEL, together with the Telecommunications Authority, is updating and correcting the rates for the use of the radio spectrum so that actions regarding 5G technology are carried out correctly, in line with the recommendations of the International Telecommunication Union (ITU) [18]. The ITU is the specialized agency for information and communication technologies, responsible for developing the global standards that underpin international mobile telecommunications. Its regulations, together with international development activities, are intended to improve 5G networks so that they provide better service without generating interference, deploying low-latency networks to expand 5G broadband mobile service.


The changes that are being made will allow both public and private operators to carry out expansions of their services, thus facilitating the deployment of technology, so that 5G networks within the country can be a reality in the short term.

5 Strategy for Ecuador It is necessary to study the current situation of the country to achieve the deployment of 5G technologies. The main points discussed in this section are: strategies of other countries; 5G and the industries; the 5G ecosystem; and 5G and its effect on health. In Ecuador, the use of mobile technology creates a high demand for connectivity. Industries demand greater capacity, lower latency, and security, yet Ecuador has one of the lowest rates of adoption of technology and innovation [19], which is one of the main causes of the delay in the implementation of 5G technology. It should be noted that the deployment of this technology is linked to the digital strategy of each country, which considers the potential that this technology brings, along with the development of artificial intelligence, to achieve a new industrial revolution. Mobile connectivity within the country is at one of the lowest levels: if the figures held by the Ministry of Telecommunications and Information Society of Ecuador and those obtained by the OECD (Organisation for Economic Co-operation and Development) are analysed, up to the year 2020 Ecuador was below the other countries in terms of fixed internet service [20]. If we analyse the consumption of mobile data per inhabitant in Ecuador, we see that on average a person consumes around 1025 MB, compared with an average of around 4.7 GB for the rest of the countries, which shows that mobile networks within the country are not fully used [20]. The EU seeks to create a shared vision for a fully connected 5G society, and EU countries have several initiatives, most of which are already implemented [21]. Spain considers 5G a fundamental part of its digital transformation, in which 5 main points are discussed [22]: – Digital transformation of production, aimed at the development of IoT, 5G, Big Data, and processes; – Multimedia applications, aimed at the fields of health and education; – Smart agriculture with the use of precision sensors; – Intelligent transport, with the development of global connectivity for buses, airports, ports, and logistics; – Smart territories for the development of tourism and massive events. As already mentioned, the development of this new technology will always be linked to the plan that each country has; many more countries can be named, such as
China, Malaysia, or South Korea, which see this technology as an opportunity for development and globalization, as do the US and China, which to date lead the race for global technological dominance [23]. A wide variety of industries are deploying this technology, from the automotive and sports sectors to industrial and health applications. For years there have been countless demonstrations and claims against the implementation of this technology, based on the belief that the signals emitted by the antennas have a negative impact on people's health. In Ecuador in 2020, in the El Guabo canton, the inhabitants prevented the installation of one of these antennas, intended for mobile connectivity, claiming that such antennas were responsible for the spread of COVID-19. It should be mentioned that internationally renowned entities have issued comments about the possible health damage that this technology could cause, but there are also many studies showing that no negative effect has been found on people exposed to this type of signal; these studies have been endorsed by the International Agency for Research on Cancer (IARC).

6 Conclusions This work reviews the evolution of 5G networks at the national (Ecuador) and international level and how they have been implemented in companies. Fifth-generation networks underpin the digital transformation and, with it, the automated industries that help to create an agile and flexible environment, allowing companies to adapt to the changes generated by new technological resources, which in turn produce economic sustainability and adapted strategies. Currently, more and more industries are benefiting from this technology. Following the example of other countries in the EU and the US, it will be possible to address the country's problems and, with the necessary support and study of the advancement of this technology, turn its deployment into a reality.

References 1. Anchundia JW, Anchundia JC, Chere BF (2019) La tecnología 5G en el Ecuador. Un análisis desde los requerimientos 5G. Quito 2. Campos CM, Guerra MR (2020) Adopción de tecnologias 4.0. Santiago de Chile 3. Castillo AC, León JB (2020) Tecnología 5G y su monetización empresarial. Cuenca. Retrieved from https://dialnet.unirioja.es/servlet/articulo?codigo=7659356 4. Cedeño Gómez MJ (2019) Análisis y medición de las señales emitidas por las radiaciones no ionizantes en la provincia de Santa Elena en el espacio circundante a las antenas sectoriales y estaciones de radio base. Univ Estatal Península StA Elena, Santa Elena 5. Digital Md (2020). España digital 2025. Madrid


6. Emilio ZM, Lady MV (2021) Análisis técnico para el despliegue de una red 5g en el Ecuador. Quevedo 7. Esteban Nieto N (2018) Tipos de investigación. Univ St Domingo Guzman, Lima 8. Garcia AC (2022) Digital innovation hubs. Concepto, evolucion y perspectivas. Valencia, España. http://einddayvlc.upv.es/wp-content/uploads/2019/02/2m_AnaCruz.pdf 9. García DS (2019) Redes 5G V2X Multi-modo y Escalables. Rev Dr UMH 4(2):1–11 10. GSMA (2020) Estado de 4G y pronóstico para 5G. Londres. Retrieved from https://www.gsma. com/spectrum/wp-content/uploads/2020/11/5G-and-3.5-GHz-Range-in-Latam-Spanish.pdf 11. GSMA (2020) Resumen experiencia internacional. Londres. Retrieved from https://www.gsma. com/spectrum/wp-content/uploads/2020/11/5G-and-3.5-GHz-Range-in-Latam-Spanish.pdf 12. Guarda T, Augusto MF, Lopes I, Victor JA, Rocha Á, Molina L (2020) Mobile communication systems: Evolution and security. Developments and Advances in Defense and Security. Springer, Singapore, pp 87–94 13. IT C (2022, Mayo 12) Coporate IT noticias de tecnologia y negocios . Retrieved from https://corporateit.cl/index.php/2022/05/12/viavi-635-nuevas-ciudades-en-el-mundo-rec ibieron-la-red-5g-el-2021/ 14. Marisol MT (2021) Evaluación De Una Arquitectura De Big Data Para La Red Móvil 5g A Nivel De La Capa Ingestión Utilizando Aplicaciones De Recolección De Datos. Ibarra. Retrieved from http://repositorio.utn.edu.ec/handle/123456789/11664 15. Martin JS (2020) Impacto en la Produyctividad por el uso de tecnologias 5G en Ecuador 16. Martínez JE, Vidal JT (2019) Retos De Competencia En El Despliegue Del 5G 17. Millás VM (2019) El despliegue de las redes 5G, o la geopolítica digital. Madrid. Retrieved from https://media.realinstitutoelcano.org/wp-content/uploads/2021/11/ari31-2019-moret-des pliegue-de-redes-5g-geopolitica-digital.pdf 18. Millás VM (2019, Junio 17) News Mundo . Retrieved from https://www.bbc.com/mundo/not icias-48663470 19. Proaño NS (2022, Mayo 16) Tecnologia 5G en Ecuador, Que tan cerca estamos de implementarla? Ecuador : Revista Vistazo 20. Schulz D (2021) 5G para industrias digitales. Revista ABB, 1(2021), 31–36. http://159.65.240. 138/bitstream/handle/uvscl/3218/Bib5G-56.pdf?sequence=1&isAllowed=y 21. Telecomunicaciones Ad (2022, Agosto 17) Agencia de regulacion y control de las telecomunicaciones. https://www.arcotel.gob.ec/ecuador-recibe-la-valoracion-de-las-bandas-2-5-ghzy-700-mhz-y-abre-el-camino-para-la-renegociacion-de-contratos-con-las-operadoras-de-ser vicio-movil-avanzado-y-nuevos-servicios/ 22. UNIVERSO E (2020) Qué es la red 5G y en qué países puede funcionar, ECUADOR. Retrieved from https://www.eluniverso.com/noticias/2020/10/14/nota/8013736/que-es-red5g-que-paises-puede-funcionar/ 23. Wang W (2020) Tecnologia 5G y comercio internacional, Europa. Retrieved from https://www. itreseller.es/opinion/2020/06/tecnologia-5g-y-comercio-internacional

Internet of Things and Smart Technology

The Fast Health Interoperability Resources (FHIR) and Integrated Care, a Scoping Review João Pavão, Rute Bastardo, and Nelson Pacheco Rocha

Abstract The scoping review reported by this article aimed to analyse the state of the art of the use of Fast Health Interoperability Resources (FHIR) in the development of integrated care applications. An electronic search was conducted, and 17 studies were included after the selection process. The results show a current interest in using FHIR to implement applications to support integrated care with different purposes: (i) oncological conditions, (ii) chronic conditions, (iii) complex paediatric conditions, and (iv) healthcare organization. From the results a set of potential facilitators and barriers were identified. Specifically, the identified barriers may condition the adequacy of using FHIR with certain purposes and, therefore, demand the attention of the FHIR research community as well as the FHIR promotors. Keywords Integrated care · FHIR · Fast health interoperability resources · Scoping review

1 Introduction Considering the challenges related to current demographic trends, particularly the increasing burden of chronic conditions, there is the need to shift from organization-centred care (i.e., process-controlled, or shared care) to a paradigm focused on the needs of the patients [1].

J. Pavão INESC-TEC, Science and Technology School, University of Trás-os-Montes and Alto Douro, Vila Real, Portugal e-mail: [email protected] R. Bastardo UNIDCOM, Science and Technology School, University of Trás-os-Montes and Alto Douro, Vila Real, Portugal N. P. Rocha (B) IEETA, Department of Medical Sciences, University of Aveiro, Aveiro, Portugal e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 S. Anwar et al. (eds.), Proceedings of International Conference on Information Technology and Applications, Lecture Notes in Networks and Systems 614, https://doi.org/10.1007/978-981-19-9331-2_45



The integrated care concept [2] is associated with the patients' perspective, as well as the implications of organizing and managing the different health and social care services to maximize quality, access, efficiency, effectiveness, and satisfaction of the patients. Various interacting organizational structures may co-exist within integrated care networks [3]. First, there are the formal care networks, both healthcare and social care networks regulated by collaboration contracts or agreements. Secondly, there are informal care networks resulting from a diversity of community-oriented structures including relatives, friends, voluntary groups, churches, time banks or non-governmental organizations [3]. Although Electronic Health Records (EHR) are adequate for the management of the patients' information, collected and aggregated in local healthcare information systems, the reality is that the provision of integrated care is not restricted to an institution or even to a single care provision system. All caregivers need comprehensive, up-to-date, safe, and congruent information from the patients, immediately accessible at the place of care, to ensure the highest levels of care provision. For instance, when considering home monitoring of a patient with a chronic disease (e.g., diabetes, heart failure or chronic obstructive pulmonary disease), the resulting monitoring information should be distributed within an information network ranging from clinicians, social workers, and family members to the patients themselves. However, the implementation of this vision requires overcoming interoperability issues [4]. In this respect, Health Level Seven International (HL7) developed the Fast Healthcare Interoperability Resources (FHIR) as the next-generation healthcare interoperability standard. FHIR presents improvements when compared to other interoperability standards, namely HL7 Clinical Document Architecture (CDA) or HL7 Reference Information Model (RIM). In fact, FHIR was designed to allow healthcare data exchange at the level of discrete data elements and allows concise and easy-to-understand specifications, as well as implementations based on Representational State Transfer (REST) and well-established data representation standards (e.g., Extensible Markup Language (XML) or JavaScript Object Notation (JSON)). Although there are some systematic literature reviews available in the FHIR domain [5, 6], none of them focused on integrated care applications. For instance, Lehne et al. [5] reviewed articles related to a general introduction of FHIR and Ayaz et al. [6] analysed EHR based on FHIR. Since the FHIR standard is very rich, and the related research is quite diverse and focused on various topics, this review complements other reviews addressing other aspects of the FHIR implementation.
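To make the exchange style concrete, the following minimal sketch (not taken from any of the included studies) shows how a client could retrieve and create FHIR resources over REST as JSON; the server base URL and resource identifiers are illustrative assumptions.

```python
# Minimal illustration of FHIR's RESTful, resource-oriented exchange.
# Assumptions: the "requests" package is installed and BASE points to a
# reachable FHIR R4 server (the URL below is only a placeholder).
import requests

BASE = "https://example.org/fhir"  # hypothetical FHIR server base URL
HEADERS = {"Accept": "application/fhir+json",
           "Content-Type": "application/fhir+json"}

# Read a single, discrete resource (a Patient with an illustrative id).
patient = requests.get(f"{BASE}/Patient/example", headers=HEADERS).json()
print(patient.get("resourceType"), patient.get("id"))

# Create an Observation (e.g., a home-monitoring heart rate) linked to that patient.
observation = {
    "resourceType": "Observation",
    "status": "final",
    "code": {"coding": [{"system": "http://loinc.org", "code": "8867-4",
                         "display": "Heart rate"}]},
    "subject": {"reference": "Patient/example"},
    "valueQuantity": {"value": 72, "unit": "beats/minute"},
}
response = requests.post(f"{BASE}/Observation", json=observation, headers=HEADERS)
print(response.status_code)  # 201 Created on success
```

Because each resource is a small, self-describing JSON document, a caregiver application can request exactly the elements it needs instead of exchanging a full clinical document.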

2 Methods A protocol was prepared to explicitly describe the objectives and research questions of this scoping review, as well as the steps of the reviewing process: (i) search strategies; (ii) inclusion and exclusion criteria; (iii) screening procedures; and (iv) synthesis and reporting.


There were two objectives for this study. The first objective was to investigate the literature related to FHIR and integrated care to explore the trends of using FHIR in this healthcare domain and give a comprehensive summary of the integrated care applications that benefit from the use of FHIR. The second objective was to systematize the researchers' opinions about potential advantages (i.e., facilitators), as well as potential drawbacks and difficulties (i.e., barriers) of adopting FHIR. This aimed to provide the readers with up-to-date information about the different types of facilitators and barriers identified during the implementation of integrated care applications supported on FHIR. After the identification of the research objectives, they were decomposed into the following research questions: • RQ1—what type of integrated care applications benefit from the use of FHIR? • RQ2—what are the facilitators and barriers when using FHIR to support integrated care applications? PubMed, Web of Science, and Scopus were the three databases selected to retrieve the references for this review. PubMed was selected considering its importance among clinical researchers. In turn, Web of Science and Scopus are the two major existing multidisciplinary databases and contain a significant number of references indexed by other databases (e.g., ACM Digital Library or IEEE Xplore). Boolean queries were prepared to include all the articles that have in their titles, abstracts, or keywords the expressions “Fast Healthcare Interoperability Resources” or “FHIR”. The inclusion criteria were full articles dealing with FHIR, published in English in peer-reviewed conferences or journals before 31st March 2022. Articles that, although addressing FHIR-related issues, did not report evidence of the use of FHIR to implement applications to support integrated care were excluded. Moreover, the following exclusion criteria were also considered: (i) articles without abstracts or authors' identification; (ii) articles not written in English; (iii) articles whose full text was not available; (iv) articles reporting on reviews or surveys; and (v) books, tutorials, editorials, special issues announcements, extended abstracts, posters, panels, transcripts of lectures, workshops, and demonstration materials. Additionally, articles reporting on studies already covered by other included references were also excluded: when two references reported on the same study in different venues, such as a scientific journal and a conference, the less mature one was excluded. The selection of the studies included in this scoping review was performed according to the following steps: (i) first step, the authors removed the duplicates, the references without abstract or authors, references not written in English, and references reporting on reviews or surveys; (ii) second step, the authors assessed all titles and abstracts for relevance and those clearly not meeting the inclusion and exclusion criteria were removed; and (iii) third step, the authors assessed the full text of the remaining articles against the outlined inclusion and exclusion criteria to achieve the list of the articles to be included in this scoping review (i.e., the included studies).
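As an illustration only, the title/abstract/keyword query described above could be expressed as follows; the field codes are assumptions and must be adapted to each database's own query language.

```python
# Illustrative search strings for the two expressions used in this review;
# field codes (TITLE-ABS-KEY, TS) are examples of the respective database syntaxes.
TERMS = '"Fast Healthcare Interoperability Resources" OR "FHIR"'

scopus_query = f"TITLE-ABS-KEY({TERMS})"   # Scopus title/abstract/keyword search
wos_query = f"TS=({TERMS})"                # Web of Science topic search
pubmed_query = f"({TERMS})"                # PubMed default fields include title/abstract

for name, query in [("Scopus", scopus_query),
                    ("Web of Science", wos_query),
                    ("PubMed", pubmed_query)]:
    print(f"{name}: {query}")
```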


Finally, concerning the synthesis and report of the results, the included studies were analysed considering (i) purposes of the applications being developed, and (ii) potential facilitators and barriers for the adoption of FHIR. Since the objective of this review was to analyse the state of the art of the use of FHIR in the context of integrated care applications, systematization of the purposes of the studies was prepared, which included the aims of the respective applications. Moreover, a synthesis was performed to analyse the motivations of using FHIR in the included studies, as well as the facilitators and barriers of the FHIR adoption identified by the authors of these studies.

3 Results 3.1 Selection of the Studies Figure 1 presents the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) flowchart of the screening procedures. The search of the studies to be included in this scoping review was conducted in April 2022. A total of 1343 references was retrieved from the initial search: (i) 350 references from PubMed; (ii) 394 references from Web of Science; and (iii) 599 references from Scopus. The initial step of the screening phase yielded 594 references by removing duplicates, references without abstracts or authors, and references reporting on reviews or surveys. Based on titles and abstracts, 562 references were removed since they reported on studies not relevant for the specific objective of this review (e.g., EHR based on FHIR, or applications based on FHIR to support clinical research). Finally, the full texts of the remaining 32 references were screened and 15 were excluded because they did not meet the outlined inclusion and exclusion criteria. Specifically, six articles were excluded because they reported on studies also reported by more recent articles that were considered for inclusion. Therefore, 17 studies were considered eligible for this review [7–23].

3.2 Purposes of Included Studies Although all the applications of the included studies aimed to support integrated care, different purposes were identified: (i) oncological conditions, four studies [7, 13, 14, 20]; (ii) chronic conditions (i.e., multimorbidity chronic conditions, rheumatoid arthritis, and chronic obstructive pulmonary disease—COPD), four studies [10, 15, 17, 19]; (iii) complex paediatric conditions, two studies [8, 21]; and (iv) healthcare organization (i.e., primary care, different healthcare levels, and social determinants of health), seven studies [9, 11, 12, 16, 18, 22, 23].

Fig. 1 PRISMA flowchart of the selection process: identification (n = 1343 references retrieved through database searching); screening (749 references excluded as duplicates, reviews or surveys, or references without abstract or authors; 562 further references excluded after assessment of title and abstract); eligibility (32 full-text articles assessed, 15 excluded); included (n = 17 studies)

In the context of oncological conditions, four articles [7, 13, 14, 20] reported on the use of FHIR to guarantee the interoperability of the clinical information being shared by multidisciplinary practitioners through: (i) a web-based software ecosystem for the personalized and collaborative management of primary breast cancer by multidisciplinary units [7]; (ii) a patient centric EHR data sharing and management mechanism supported on blockchain technology [13]; (iii) a minimum set of clinical, self-reported health status and lifestyle information relevant to care provision and research on cancer survivorship [14]; and (iv) a FHIR-based ontology to allow standardizing the knowledge related to digital interventions for behavioural changes [20]. In what concerns chronic conditions, four articles [10, 15, 17, 19] were identified. Collaborative environments focused in multimorbidity patients (i.e., patients with two or more chronic conditions at the same time) and involving various stakeholders to guarantee drug safety were proposed in [17, 19]. In turn, two other articles were


related to patients with COPD and rheumatoid arthritis: (i) a data management framework aiming to support integrated care services namely in terms of data management of a home monitoring system for patients with COPD [10]; and (ii) a platform to integrate care services across providers and to support patients’ management along the continuum of care, which was evaluated by rheumatoid arthritis patients also affected by cardiovascular comorbidities [15]. In the two studies [8, 21] addressing the care coordination of complex paediatric patients, (i) the data elements associated to a patients’ care team were identified and mapped to FHIR resources [8]; and (ii) an application was developed to support multi-site paediatric scoliosis rehabilitation [21]. In turn, the improvement of the efficiency of primary care was considered in three articles [9, 22, 23]: (i) the mHealth4Afrika aiming at supporting a holistic, patient-centric, standards-based approach by replacing paper-based registries and program-specific electronic solutions used in African countries [9]; (ii) a patientcentric interoperable application for sharing clinical data supported on blockchain technology to preserve privacy [22]; and (iii) an application to support COVID-19 patients in residential care units [23]. Moreover, three studies were related to the integration of different levels of healthcare provision: (i) a collaborative platform supporting the InterRAI instruments, which were designed to be compatible across health sectors to improve continuity of care and organizations’ capacity to measure clinical outcomes [11]; (ii) an integrated services platform to establish a coordinated healthcare system between primary care facilities in townships/villages and county-level hospitals [16]; and (iii) semantic mechanisms to increase interoperability between patient data recorded in prehospital settings and the EHR of the emergency rooms [18]. Finally, article [12] reported on the management of social determinants of health (i.e., social factors such as unemployment situation, or potentially hazardous relationship with family members, which would allow providers to tailor treatment to specific needs of the patients) to increase their use in the clinical practice.

3.3 FHIR Facilitators and Barriers The included studies used FHIR with different motivations: (i) data exchange [7, 16, 18]; (ii) data storage [10, 17, 21]; (iii) clinical data aggregation and integration from multiple health information systems [9, 13–15]; (iv) standardization of information elements [8, 11, 12, 20, 23]; (v) output interface for data delivery [19]; and (vi) management of data access control [22]. FHIR was used to guarantee the syntactic interoperability of input and output data exchange of a guideline-based decision support system [7], as well as to support data exchange from various healthcare information systems, including regional laboratory information systems, regional picture archiving and communication systems, and personal health records [16]. Moreover, it was also used to connect prehospital EHR and the EHR of emergency rooms [18]. The three studies confirmed that FHIR is an effective approach to support syntactic interoperability and to build care communities supported by heterogeneous healthcare information systems [7, 16, 18].


In terms of data storage, three articles [10, 17, 21] were related to the implementation of clinical information repositories based on FHIR. The fact that FHIR is considered a best practice in terms of clinical information interoperability was the motivation for its use to support the storage and representation of semantically enriched EHR [10]. In addition to the flexibility of the FHIR information model and its wide coverage of data definition, the explicit semantics of FHIR and the availability of open-source development tools were also considered potential facilitators [17]. In terms of the four articles [9, 13–15] related to the use of FHIR to aggregate and integrate clinical data from multiple health information systems, one study [15] aimed to integrate monitoring devices (e.g., lifestyle sensors or environment sensors) and actuators [15]. The remaining three articles [9, 13, 14] are related to the aggregation of clinical data from independent EHR to support: (i) primary care provision [9]; (ii) cancer care [13]; and (iii) cancer survivors' follow-up [14]. The flexibility of FHIR resources and their extensions was referred to by [9, 14], namely to support multilingual data [9]. Moreover, according to [14], with FHIR it is possible to share only the data that are needed rather than a large collection of data elements, as happens with other established standards. In turn, possible barriers include losses of granularity (i.e., when mapping existing data sources to FHIR, the FHIR resources may not allow the same data detail as the one available at the data sources) and the multiple ways to map the same concept using the available resources (i.e., even a well-defined concept such as a member of the patient care team can be mapped either to a Practitioner resource or to a RelatedPerson resource) [14]. The use of FHIR by the research teams of five studies [8, 11, 12, 20, 23] was motivated by the need to standardize information elements related to specific clinical concepts, including (i) characterization of care teams [8], (ii) electronic forms to ensure accurate data collection [11], (iii) social determinants of health [12], (iv) clinical decision support system knowledge interoperable with EHR [20], and (v) patients' observations [23]. In terms of facilitators, in addition to the flexibility of FHIR resources and their extensions [11, 12], the consistency and rigor of the FHIR interoperability mechanisms and the adequacy of the Observation resource to capture patients' observations were referred to in [14], while FHIR being seen as a requirement for meaningful use certification in the United States was referred to in [8]. Moreover, according to [8], it is possible to use the XML and JSON formats of the FHIR resources to define the procedural logic of electronic data entry forms (e.g., hiding, disabling, or showing parts of a form based on the response received for a previous field, calculating scores based on responses, and displaying alerts).
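The form-logic facilitator mentioned above can be illustrated with the FHIR Questionnaire resource, whose enableWhen element makes an item appear only when an earlier answer meets a condition; the sketch below is a minimal, assumed example rather than one taken from the included studies.

```python
import json

# Minimal FHIR R4 Questionnaire showing conditional display logic:
# the follow-up item is enabled only when the screening question is answered "true".
questionnaire = {
    "resourceType": "Questionnaire",
    "status": "draft",
    "item": [
        {
            "linkId": "smoker",
            "text": "Do you currently smoke?",
            "type": "boolean",
        },
        {
            "linkId": "cigarettes-per-day",
            "text": "How many cigarettes do you smoke per day?",
            "type": "integer",
            # The JSON itself encodes the procedural logic of the form:
            "enableWhen": [
                {"question": "smoker", "operator": "=", "answerBoolean": True}
            ],
        },
    ],
}

print(json.dumps(questionnaire, indent=2))
```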
In turn, in terms of barriers, the existence of optional specifications to support patient care team management can pose challenges during implementation, and although non-clinical events (e.g., first day of school) and care team actions (e.g., calling the family to check on missed appointments) may influence clinical and caregiver decision making, they do not map directly to the FHIR resources [8].
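To make the care-team mapping issues above concrete, the sketch below shows how the same care-team member could be represented in two structurally valid ways; the identifiers are illustrative assumptions, not examples taken from [8] or [14].

```python
# Two structurally valid ways to record the same care-team member in a FHIR R4
# CareTeam resource: as a Practitioner reference or as a RelatedPerson reference.
# Receiving systems must handle both, which complicates mapping between sources.
care_team_as_practitioner = {
    "resourceType": "CareTeam",
    "status": "active",
    "subject": {"reference": "Patient/child-example"},
    "participant": [
        {"member": {"reference": "Practitioner/dr-lee", "display": "Dr. Lee"}}
    ],
}

care_team_as_related_person = {
    "resourceType": "CareTeam",
    "status": "active",
    "subject": {"reference": "Patient/child-example"},
    "participant": [
        {"member": {"reference": "RelatedPerson/aunt-lee", "display": "Dr. Lee (aunt)"}}
    ],
}

# Same person, two different reference types:
print(care_team_as_practitioner["participant"][0]["member"]["reference"])
print(care_team_as_related_person["participant"][0]["member"]["reference"])
```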


Table 1 FHIR facilitators and barriers
Facilitators:
– Best practice in terms of clinical information interoperability [10]
– Consistent and rigorous interoperability mechanism [14]
– Effective approach to support interoperability [7, 16, 18]
– A requirement for meaningful use certification in the United States [8]
– The resources and their extensions constitute a flexible information model [9, 11, 12, 14, 17]
– Observation resource is adequate to capture patients' observations [12]
– Ability to share only the data that are needed [14]
– Explicit semantics [17]
– Easily pluggable [22]
– Supported by open-source development tools [17]
– The XML and JSON formats of its resources can support the definition of procedural logics of electronic data entry forms [8]
Barriers:
– Granularity losses when mapping legacy data sources [14]
– Multiple ways to map the same concept using available resources [14]
– Non-compulsory specifications [8]
– Difficulties in mapping non-clinical events [8]
– Patients are not considered as the owners of their data [22]

SMART on FHIR was used in one study [19] to integrate prediction models into a web-based dashboard and, according to the authors [19], the approach was efficient and may be applied to different chronic conditions. Finally, according to George and Chacko [22], FHIR is easily pluggable and addresses interoperability with human-readable messages but does not consider the patient as the owner of the data. Therefore, the authors propose a solution based on blockchain to manage the access to the data being shared [22]. Table 1 summarizes the facilitators and barriers identified by the included studies.

4 Discussion Seventeen studies were identified for this scoping review. One might think that this small number resulted from an incorrect selection process (e.g., a narrow choice of the databases and keywords that were used). However, two of the databases being used, Scopus and Web of Science, are well-reputed scientific databases that index a huge number of scientific journals and conference proceedings, partially overlap publishers' databases such as ACM Digital Library or IEEE Xplore, and allow a significant coverage of academic publications.


Moreover, a broad search query was used, which allowed the initial retrieval of a total of 1343 references. Considering the literature reviews related to FHIR that were identified [5, 6], the number of included articles varied from 80 [6] to 131 [5]. Since these reviews target the application of FHIR in all healthcare domains and include overview articles or articles presenting general information about FHIR, the integrated care domain should have a smaller number of articles. Therefore, the number of included studies shows a current interest in using FHIR in the development of integrated care applications but seems to indicate that integrated care is not the main topic of the FHIR research community. Considering the first research question (i.e., what type of integrated care applications benefit from the use of FHIR?), the included articles aimed to develop applications to support integrated care provision related to (i) oncological conditions, (ii) chronic conditions (i.e., multimorbidity chronic conditions, rheumatoid arthritis, and COPD), and (iii) complex paediatric conditions. Moreover, some of the included studies also proposed applications to improve the integration of primary care services and the integration of different levels of healthcare provision. The integration of FHIR in the aforementioned applications was justified by the need to support: (i) data exchange [7, 16, 18]; (ii) data storage [10, 17, 21]; (iii) clinical data aggregation and integration from multiple health information systems [9, 13–15]; (iv) standardization of information elements [8, 11, 12, 20, 23]; (v) an output interface for data delivery [19]; and (vi) management of data access control [22]. A significant percentage of the included articles (i.e., almost 60% of the articles) are related to data exchange, data storage, and clinical data integration involving the use of FHIR to achieve interoperability between heterogeneous healthcare information systems, including EHR. This result is a natural consequence of FHIR being designed to guarantee the interoperability and cross-institutional sharing of clinical data in the clinical environment. Interoperability and cross-institutional sharing of clinical data also require (i) the mapping of information elements stored in existing systems in accordance with a wide range of standards, and (ii) the inclusion of novel concepts. In this respect, five articles (i.e., almost 30% of the included articles) were focused on the standardization of information elements. Each of the two other identified concerns (i.e., (i) an output interface for data delivery and (ii) management of data access control) received the attention of only one study. Analysing the already mentioned reviews [5, 6], the communication between EHR and the mapping to FHIR of data from existing legacy systems are relevant issues. Moreover, topics such as mobile and web applications and data protection, security, and reliability were also identified [5, 6]. In turn, the integration of medical devices to support patient monitoring is an active topic within the FHIR research community. However, only one of the included studies (i.e., [15]) referred to the integration of remote monitoring systems.
In what concerns the second research question (i.e., what are the facilitators and barriers when using FHIR to support integrated care applications?), the included studies pointed out a set of facilitators: (i) it is a best practice in terms of clinical information interoperability [10]; (ii) it presents a consistent and rigorous interoperability mechanism [14];


(iii) it is an effective approach to support interoperability, allowing the creation of cross-institutional communities supported by heterogeneous healthcare information systems [7, 16, 18]; (iv) it is a requirement for meaningful use certification in the United States [8]; (v) contrary to its predecessors, including the remaining solutions of the HL7 framework, its resources and their extensions constitute a flexible information model [9, 11, 12, 14, 17] with a wide coverage of data definition [17], including multilingual options [9]; (vi) its Observation resource is adequate to capture patients' observations [12]; (vii) it has the ability to share only what is needed for a specific purpose rather than a large collection of data elements [14]; (viii) it presents explicit semantics [17]; (ix) it is easily pluggable [22]; (x) it is supported by a large number of open-source development tools [17]; and (xi) the XML and JSON formats of its resources can support the definition of the procedural logic of electronic data entry forms [8]. In turn, several barriers were identified: (i) the possibility of granularity losses when mapping legacy data sources [14]; (ii) the possibility of the same meaning being encoded in different forms due to the multiple ways to map the same concept using the available resources [14]; (iii) the non-compulsory nature of some specifications, which means that a given source has the option not to provide values for optional data elements [8]; (iv) difficulties in mapping non-clinical events [8]; and (v) the fact that the patients are not considered the owners of their data [22]. In a previous study [6], several challenges for the implementation of FHIR were reported: (i) implementation of FHIR in an application; (ii) standard complexity; (iii) adoption difficulties; (iv) FHIR maintenance and specification; (v) the RESTful approach; and (vi) mapping/migration challenges. In this respect, this review identified additional challenges that may also condition FHIR implementations. Moreover, and even more important, these additional challenges may condition the adequacy of using FHIR for specific purposes. A limitation of this scoping review is related to the fact that the search procedure only considered references indexed by scientific databases. This strategy has the drawback of excluding potentially interesting industrial studies that were not published in indexed journals or conferences.

5 Conclusion A scoping review was performed to systematize the state of the art of the use of FHIR in the development of integrated care applications and, therefore, to complement other reviews addressing other aspects of the implementation of FHIR. From the 1343 articles retrieved from the initial database search, 17 articles were identified as being related to the use of FHIR in the development of integrated care applications. Considering the number of articles identified, it is possible to conclude that there is a current research interest in the application of FHIR in the context of integrated care, although this topic is not one of the most significant for the FHIR research community.


Nevertheless, it is expected that the number of studies will increase in the future, since FHIR is a relatively recent interoperability standard (i.e., the FHIR Draft Standard for Trial Use 1 was published in 2013). Integrated care has wide-scope requirements and intervention areas, due to the diversity of pathologies, including chronic conditions, and types of healthcare delivery. In this respect, several pathologies were identified (i.e., cancer, multimorbidity chronic conditions, rheumatoid arthritis, COPD, and complex paediatric conditions), as well as different purposes of using FHIR (i.e., data exchange, data storage, clinical data aggregation and integration, standardization of information elements, an output interface for data delivery, and management of data access control). An important finding of this review was the synthesis of the FHIR facilitators and barriers identified by the authors of the included studies. Specifically, the identified barriers (i.e., granularity losses when mapping legacy data sources, multiple ways to map the same concept using available resources, the non-compulsory character of some specifications, difficulties in mapping non-clinical events, and the fact that the patients are not the owners of their data) may condition the adequacy of using FHIR for certain purposes. Therefore, these barriers demand the attention of the FHIR research community as well as the FHIR promotors. Acknowledgements This work was supported by Programa Operacional Competitividade e Internacionalização (COMPETE 2020), Portugal 2020 and Lisboa 2020 of the Fundo Europeu de Desenvolvimento Regional (FEDER)/European Regional Development Fund (ERDF), under project ACTIVAS—Ambientes Construídos para uma Vida Ativa, Segura e Saudável, POCI-01-0247FEDER-046101.

References 1. Baxter S, Johnson M, Chambers D, Sutton A, Goyder E, Booth A (2018) The effects of integrated care: a systematic review of UK and international evidence. BMC Health Serv Res 18(1):1–13 2. Briggs AM, Valentijn PP, Thiyagarajan JA et al (2018) Elements of integrated care approaches for older people: a review of reviews. BMJ Open 8(4):e021194 3. Sousa M, Arieira L, Queirós A et al (2018) Social platform. In: World conference on information systems and technologies. Springer, Cham, pp 1162–1168 4. Dias A, Martins AI, Queirós A, Rocha NP (2018) Interoperability in pervasive health: a systematic review. In: International joint conference on biomedical engineering systems and technologies. Springer, Cham, pp 279–297 5. Lehne M, Luijten S, Imbusch PVFG, Thun S (2019) The use of FHIR in digital health—a review of the scientific literature. Stud Health Technol Inform 267:52–58 6. Ayaz M, Pasha MF, Alzahrani MY, Budiarto R, Stiawan D (2021) The fast health interoperability resources (FHIR) standard: systematic literature review of implementations, applications, challenges and opportunities. JMIR Med Inform 9(7):e21929 7. Séroussi B, Guézennec G, Lamy J-B, Muro N et al (2017) Reconciliation of multiple guidelines for decision support: a case study on the multidisciplinary management of breast cancer within the DESIREE project. In: AMIA annual symposium proceedings, vol 2017. American Medical Informatics Association, Bethesda, MD, p 1527


8. Ranade-Kharkar P, Narus SP, Anderson GL, Conway T, Del Fiol G (2018) Data standards for interoperability of care team information to support care coordination of complex pediatric patients. J Biomed Inform 85:1–9 9. Cunningham PM, Cunningham M (2019) mHealth4Afrika—co-designing a standards based solution for use in resource constrained primary healthcare facilities. In: 41st annual international conference of the IEEE engineering in medicine and biology society (EMBC). IEEE, Piscataway, NJ, pp 4289–4292 10. Kilintzis V, Chouvarda I, Beredimas N, Natsiavas P, Maglaveras N (2019) Supporting integrated care with a flexible data management framework built upon Linked Data, HL7 FHIR and ontologies. J Biomed Inform 94:103179 11. Eapen BR, Costa A, Archer N, Sartipi K (2019) FHIRForm: an open-source framework for the management of electronic forms in healthcare. In: Lau F et al (eds) Improving usability, safety and patient outcomes with health information technology. IOS Press, Amsterdam, pp 80–85 12. Watkins M, Viernes B, Viet Nguyen LRM, Valencia JS, Borbolla D (2020) Translating social determinants of health into standardized clinical entities. Studies Health Technol Inform 270:474 13. Dubovitskaya A, Baig F, Xu Z, Shukla R, Zambani PS, Swaminathan A et al (2020) ACTIONEHR: patient-centric blockchain-based electronic health record data management for cancer care. J Med Internet Res 22(8):e13598 14. González-Castro L, Cal-González VM, Del Fiol G, López-Nores M (2021) CASIDE: a data model for interoperable cancer survivorship information based on FHIR. J Biomed Inform 124:103953 15. Richter JG, Chehab G, Schwartz C, Ricken E, Tomczak M, Acar H et al (2021) The PICASO cloud platform for improved holistic care in rheumatoid arthritis treatment-experiences of patients and clinicians. Arthritis Res Ther 23(1):1–13 16. Nan J, Xu LQ, Wang Q, Bu C, Ma J, Qiao F (2021) Enabling tiered and coordinated services in a health community of primary care facilities and county hospitals based on HL7 FHIR. In: IEEE international conference on digital health (ICDH). IEEE, Piscataway, NJ, pp 254–259 17. Despotou G, Arvanitis TN (2021) An electronic health record approach to understanding drug to drug interactions and associated knowledge gaps in integrated care of multimorbidity. In: Public health and informatics. IOS Press, Amsterdam, pp 580–584 18. Andersen SNL, Brandsborg CM, Pape-Haugaard L (2021) Use of semantic interoperability to improve the urgent continuity of care in Danish ERs. In: Public health and informatics. IOS Press, Amsterdam, pp 203–207 19. Tarumi S, Takeuchi W, Chalkidis G, Rodriguez-Loya S, Kuwata J, Flynn M, Turner KM et al (2021) Leveraging artificial intelligence to improve chronic disease care: methods and application to pharmacotherapy decision support for type-2 diabetes mellitus. Methods Inf Med 60(S01):e32–e43 20. Veggiotti N, Sacchi L, Peleg M (2021) Enhancing the IDEAS framework with ontology: designing digital interventions for improving cancer patients’ wellbeing. In: AMIA annual symposium proceedings, amia symposium, vol 2021. American Medical Informatics Association, Bethesda, MD, pp 1186–1195 21. Shi W, Giuste FO, Zhu Y, Carpenter AM, Iwinski HJ, Hilton C, Wattenbarger JM, Wang MD (2021) A FHIR-compliant application for multi-site and multi-modality pediatric scoliosis patient rehabilitation. In: IEEE international conference on bioinformatics and biomedicine (BIBM). IEEE, Piscataway, NJ, pp 1524–1527 22. 
George M, Chacko AM (2022) A patient-centric interoperable, quorum-based healthcare system for sharing clinical data. In: International conference for advancement in technology (ICONAT). IEEE, Piscataway, NJ, pp 1–6 23. Das S, Hussey P (2022) Development of an interoperable-integrated care service architecture for intellectual disability services: an Irish case study. Frontiers of data and knowledge management for convergence of ICT, healthcare, and telecommunication services. Springer, Cham, pp 1–24

Blockchain Based Secure Interoperable Framework for the Internet of Medical Things Wajid Rafique, Babar Shah, Saqib Hakak, Maqbool Khan, and Sajid Anwar

Abstract The Internet of Medical Things (IoMT) has revolutionized the way medical infrastructure is managed. Multiple platforms in IoMT have disparate communication standards, data format requirements, and access policies, which produce immense overhead during data transfer among these platforms. In order to provide seamless healthcare services using IoMT, interoperability concerns of heterogeneous devices need to be addressed. Smart contracts using blockchain allow distributed objects to interact in a secure way. We propose a Blockchain-based Secure Interoperable Framework (BSIIoMT) using smart contracts for secure communication in IoMT. We present the components, workflow, and design considerations of the BSIIoMT framework to show the feasibility of using edge-enabled blockchain for secure interoperability in IoMT. The BSIIoMT framework is an ongoing project; in this paper we present the framework and its components, while further results and evaluation will be presented in future work. Keywords Blockchain · Healthcare Services · IoMT · Interoperability · Security

W. Rafique (B) Department of Computer Science and Operations Research, University of Montreal, Quebec, Canada e-mail: [email protected] B. Shah Center of Excellence in IT, Institute of Management Sciences, Peshawar, Pakistan S. Hakak Faculty of Computer Science, Canadian Institute for Cybersecurity, University of New Brunswick, Fredericton, Canada M. Khan Pak-Austria Fachhochschule Institute of Applied Sciences and Technology, Haripur, Pakistan Software Competence Center Hagenberg, Vienna, Austria S. Anwar College of Information Technology, Zayed University, Academic, UAE © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 S. Anwar et al. (eds.), Proceedings of International Conference on Information Technology and Applications, Lecture Notes in Networks and Systems 614, https://doi.org/10.1007/978-981-19-9331-2_46

533

534

W. Rafique et al.

1 Introduction The Internet of Medical Things (IoMT) has revolutionized the traditional way of medical services provisioning by connecting smart objects to the Internet [1]. IoMT is based on miniature sensing equipment that captures real-world data and helps in autonomous services provisioning in healthcare [2]. There is a recent trend towards blockchain-based systems to provide secure services in the medical paradigm, since blockchain provides a decentralized platform to offer security to the attached infrastructure [3]. Blockchain offers a sustainable solution for fault-tolerant security where no central authority controls the network traffic [4]. In this regard, the adoption of blockchain in the IoMT paradigm provides promising solutions to deliver healthcare services in a secure and efficient way. As IoMT devices lack resources, they require cloud and edge computing support to process resource-intensive tasks [5]. Central cloud computing infrastructure resides far from the IoMT; therefore, moving communication and resources to and from the central cloud becomes a challenging task [6]. As IoMT applications have low-latency requirements, centralized cloud computing becomes inefficient [7]. Edge computing brings computational resources to the network edge, thereby offering solutions for the low-latency and compute-intensive tasks that IoMT devices are unable to handle [8]. Edge computing-based IoMT solutions support patients, for example through remote patient monitoring, hence reducing cost and providing healthcare services at distant locations [9]. IoMT devices can measure patients' personal healthcare-related information and help them track their health efficiently, thereby limiting the impact of chronic diseases [10]. During the current COVID-19 outbreak, remote patient monitoring has become one of the important solutions to provide contactless medical services [1]. Canada's Switch Health portal is one such example, which offers remote COVID-19 testing facilities using a video calling strategy moderated by an online nurse [11]. Furthermore, IoMT devices measure patients' physical parameters and help them in autonomous services provisioning [12]. This makes it possible to monitor patients remotely and offer medical support at their homes, where edge and cloud computing infrastructures interact continuously to provide services to the patients. Figure 1 shows secure services provisioning using blockchain. The figure shows IoMT devices sending captured data to the blockchain, which contains a distributed database, smart contracts, and transactions that authenticate IoMT devices and provide services to the healthcare services users. Besides the immense advantages provided by edge and cloud computing solutions, immense security, privacy, and data interoperability challenges arise, which become a hurdle for the seamless implementation of healthcare services [13]. One of the concerns in this paradigm is the secure interoperability of IoMT-generated data among the patients and healthcare services providers [14–16]. Malicious users may get access to the IoMT data without patients' consent, bringing detrimental impacts on the security and privacy of IoMT data [17]. Furthermore, it is challenging for the patients to monitor their health-relevant information on the applications hosted on the cloud [18].


Fig. 1 Components of the blockchain-based secure interoperability framework comprising medical sensors blockchain and healthcare services. Data generated by heterogeneous IoMT devices passes through blockchain structure for healthcare services provisioning

Therefore, it is necessary to provide a solution for the access control, interoperability, and security of the IoMT data [19]. Based on the advantages provided by blockchain, this paper proposes a secure interoperability framework for IoMT devices. Blockchain ensures that the data are more secure against tampering and integrity-related attacks. Blockchain autonomously executes smart contracts among the IoMT entities without any third party, and these contracts cannot be changed once deployed. We propose an access control mechanism for the IoMT data, which provides granular control over data access by different network entities (a simplified sketch of this idea is given at the end of this section). The access control mechanism is capable of providing efficient data access and restricting malicious adversaries from accessing IoMT data. It is to be noted here that this research is part of an ongoing project in which we present the framework, its components, and its workflow; results and evaluation will be provided in a separate research paper in the future. Keeping these considerations in mind, this paper provides the following key contributions. • We develop an interoperability framework for the privacy and security of IoMT data. • We elaborate on the key components, working principles, and design considerations of the BSIIoMT framework in the IoMT paradigm. • We discuss the efficacy of the BSIIoMT framework in providing secure interoperability in healthcare services provisioning. The rest of the paper is organized as follows. Section II presents the state-of-the-art research in secure and interoperable services provisioning in the healthcare paradigm. Section III discusses the key enabling technologies, components, working principles, and design considerations of the BSIIoMT framework. Section IV presents a discussion of the BSIIoMT framework, while Section V concludes and provides future insights.
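The following sketch only illustrates the kind of granular, append-only access-control logic that a smart contract could encode; it is not the BSIIoMT implementation (which the authors leave to future evaluation), and all identifiers in it are assumptions.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class AccessControlContract:
    """Simplified stand-in for a smart contract: the patient (owner) grants or
    revokes per-entity permissions, and every decision is kept in an
    append-only log, mimicking the immutability of on-chain transactions."""
    owner: str
    grants: Dict[str, Set[str]] = field(default_factory=dict)  # entity -> data types
    log: List[str] = field(default_factory=list)                # append-only audit trail

    def grant(self, caller: str, entity: str, data_type: str) -> None:
        if caller != self.owner:
            self.log.append(f"DENY grant attempt by {caller}")
            raise PermissionError("only the data owner can grant access")
        self.grants.setdefault(entity, set()).add(data_type)
        self.log.append(f"GRANT {entity} -> {data_type}")

    def revoke(self, caller: str, entity: str, data_type: str) -> None:
        if caller != self.owner:
            self.log.append(f"DENY revoke attempt by {caller}")
            raise PermissionError("only the data owner can revoke access")
        self.grants.get(entity, set()).discard(data_type)
        self.log.append(f"REVOKE {entity} -> {data_type}")

    def can_read(self, entity: str, data_type: str) -> bool:
        allowed = data_type in self.grants.get(entity, set())
        self.log.append(f"CHECK {entity} -> {data_type}: {'allow' if allowed else 'deny'}")
        return allowed


# Usage: the patient allows a clinician to read heart-rate data but nothing else.
contract = AccessControlContract(owner="patient-01")
contract.grant("patient-01", "clinician-07", "heart_rate")
print(contract.can_read("clinician-07", "heart_rate"))   # True
print(contract.can_read("insurer-99", "heart_rate"))     # False: never granted
```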


2 State of the Art Due to the manufacturing of a wide variety of smart sensors in the medical sphere, autonomous healthcare services provisioning has become an important area of research. Researchers from academia and industry are putting their efforts into the development of novel solutions for seamless services provisioning in healthcare. However, due to the wide variety of healthcare infrastructures, providing secure interoperability has become a challenging task. In this section, we elaborate on state-of-the-art research in providing interoperability in the IoMT paradigm and discuss a comparative analysis of the available approaches and the proposed research. ISO and IEEE have been working on the ISO/IEEE 11073 Personal Healthcare Device (PHD) standard since 1984 [20]. The ISO/IEEE 11073 PHD standard enables communication among medical, healthcare, and wellness devices and external computing infrastructures. This standard provides autonomous healthcare data collection from sensors. The ISO/IEEE 11073 standard deals with the upper transport layer and defines the domain information, service, and communication models. Another standard, IEEE 11073-20601, defines generic data types, message types, and a communication model, and provides an interface for healthcare devices to communicate [21]. This standard uses ACSE and CMDISE for biometric information exchange. However, it is challenging to implement ACSE and CMDISE on resource-constrained IoMT devices. Therefore, this standard is not suitable for constrained IoMT devices. The IETF Constrained RESTful Environments (CoRE) working group standardized the Constrained Application Protocol (CoAP) for constrained IP networks [22]. CoAP supports resource-limited IoMT devices to communicate over the Internet owing to its low signaling overhead, support for multicasting, and simplistic communication model. However, it does not consider the requirements of healthcare applications, which require the high reliability, seamless connectivity, and data integrity proposed by international standards-based healthcare systems. Abdellatif et al. [8] propose a blockchain-based security solution for IoMT devices, which authenticates electronic health records among different entities in the network using a chaining mechanism. However, this technique secures the electronic health records and does not consider the authentication of resource-constrained IoMT devices. Egala et al. [15] proposed distributed data storage using smart contracts to overcome the challenges of higher cost, latency, and single-point-of-failure issues. However, the security perspective has not been elaborated in this research, which is a challenging problem in IoMT-enabled healthcare systems. Gopikrishnan et al. [12] proposed the EWPS system, which provides on-demand emergency services in the healthcare paradigm. The proposed system overcomes the challenges of network congestion and delay and increases network throughput for emergency data by tagging emergency packets. However, anomalies injected by adversaries could impact the performance of packet delivery in emergency conditions, which has not been discussed by the authors. Khan et al. [13] proposed an SDN-enabled malware detection system to detect anomalies in the IoMT infrastructure. A deep-learning-based model is deployed for the classification of malware and benign traffic. However, this approach


exploits the centralized characteristics of SDN, which suffer from single-point-of-failure issues when distributed controllers are not employed. Kumar et al. [18] developed MedHypChain to overcome interoperability issues in healthcare using Hyperledger Fabric. The proposed system uses a blockchain-based framework for secure information access; however, it suffers from scalability issues, which may raise significant challenges in the large-scale IoMT paradigm. Lin et al. [17] proposed a blockchain-based offloading approach for the efficient and secure migration of tasks from resource-constrained IoMT devices to the edge infrastructure. Blockchain is used to reach a consensus on the global task offloading strategy, where task offloading and resource allocation are modeled as a Markov decision problem. However, this approach uses a domain-specific strategy for task offloading and security; therefore, it suffers from a lack of interoperability. Rahman et al. [1] propose a symptomatic COVID-19 patient tracking system using reinforcement learning; however, this technique lacks the security and interoperability perspective needed to efficiently deliver healthcare services. Zeng et al. [16] proposed an attribute-based encryption scheme to provide granular access control to preserve the privacy and confidentiality of user data. However, this technique strives to secure the confidentiality of electronic health records and does not consider the sensory data produced by IoMT devices, and therefore has limited applicability. The discussed approaches lack a comprehensive perspective on security, privacy, and interoperability. The above literature review suggests that there is a need for a secure interoperability framework that provides access control to handle malicious users and efficiently delivers end-to-end IoMT services. Therefore, we propose a secure interoperability framework to provide seamless services in IoMT-enabled healthcare systems. In the next section, we will discuss the enabling technologies and key components of the BSIIoMT framework.

3 BSIIoMT Framework, Working Principle, and Design Considerations We propose a blockchain-based security framework to protect IoMT data against security and privacy attacks, which enables efficient interoperability among heterogeneous devices. The Ethereum blockchain provides an efficient solution for the development of interoperable platforms for IoMT [23]. BSIIoMT is a novel blockchain solution, which is similar to most of the available blockchain paradigms such as Bitcoin [15]. BSIIoMT provides a flexible solution for IoMT applications by exploiting smart contracts and transactions, which are authenticated and added to the blockchain using the proof-of-work algorithm. The proof-of-work algorithm is run by miners and provides a tamper-resistant, security-aware consensus among the connected blockchain nodes of the network. In this scenario, every block stores a hash value of the preceding block in a sequential disposition, which ensures that the linked blockchain blocks cannot be modified without invalidating the chain.
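As a brief, hedged illustration of the consensus step just described (a minimal sketch, not the authors' implementation), the following Python snippet shows how a proof-of-work miner searches for a nonce that yields a block hash meeting a difficulty target, with each block committing to the hash of its predecessor; the payloads, difficulty, and names are illustrative only.

```python
# Minimal proof-of-work illustration; payloads, difficulty, and names are made up.
import hashlib

def block_hash(prev_hash: str, payload: str, nonce: int) -> str:
    return hashlib.sha256(f"{prev_hash}|{payload}|{nonce}".encode()).hexdigest()

def mine(prev_hash: str, payload: str, difficulty: int = 4):
    """Search for a nonce whose hash starts with `difficulty` zero hex digits."""
    nonce = 0
    while True:
        h = block_hash(prev_hash, payload, nonce)
        if h.startswith("0" * difficulty):
            return nonce, h
        nonce += 1

nonce, h = mine("0" * 64, "IoMT transaction batch #1")
# Changing the payload afterwards invalidates h, so tampering becomes evident.
assert block_hash("0" * 64, "IoMT transaction batch #1", nonce) == h
```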


Ethereum uses external accounts and contract accounts, which are indexed using twenty-byte addresses and identified by private and public keys [24]. A smart contract is a program that executes upon the fulfillment of certain conditions. Smart contracts consist of code and data with disparate programmable functions. The Application Binary Interface (ABI) is used by users to interact with smart contracts through their Ethereum accounts. A transaction in Ethereum transfers data or ether from one account to another. The structure of a transaction is composed of the account nonce, recipient address, ether value, gas price, sender's signature, and gas limit. We discuss the system model in the following.
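To make the transaction fields listed above concrete, here is a plain-Python sketch of that structure; it is illustrative only, real Ethereum clients use their own types and encodings, and the field values shown are made up.

```python
# Illustrative view of the transaction fields described in the text.
from dataclasses import dataclass

@dataclass
class Transaction:
    nonce: int          # account nonce: transactions already sent by this account
    to: str             # recipient address (twenty-byte address, hex-encoded)
    value: int          # ether value transferred, expressed in wei
    gas_price: int      # gas price the sender is willing to pay
    gas_limit: int      # maximum gas allowed for executing the transaction
    signature: bytes    # sender's signature over the transaction contents

tx = Transaction(nonce=7, to="0x" + "ab" * 20, value=10**18,
                 gas_price=20 * 10**9, gas_limit=21_000, signature=b"\x00" * 65)
```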

3.1 System Model In this section, we discuss the security framework that provides security, privacy, and interoperability among IoMT devices. Further, the design goals and components are discussed in the following. In the system model, we consider a healthcare paradigm where services are provided using smart sensors attached to the patient's body, the external environment, and the patient's vicinity. The collected data, which includes the personal information of the patients, is initially transferred to the edge cloudlets in the vicinity of the IoMT infrastructure. In this case, every patient has a unique identity maintained as id = ⟨patient address⟩. The data in the vicinity of the IoMT devices is collected at the edge cloudlets and is finally transferred to the central cloud data center for long-term storage. The patient ID information is stored on the blockchain, which is managed at the central cloud. Records of the IoMT devices such as wearable sensors, gadgets, and mobile applications, together with other health-related data, are stored with the patient ID on the blockchain. The data is collected and processed based on continuous parameters provided by sensors, which can be accessed by medical practitioners from the cloud. The components of the BSIIoMT framework are given in the following. IoMT Data Collector The IoMT Data Collector (IDC) collects data from the medical sensors and distributes it over the central cloud in an efficient way. The IDC has the responsibility to control transactions all over the blockchain network, which includes data management, storage, and providing access to mobile users. The IDC administers users through a strict policy governance strategy. The edge cloudlets initially collect data and then transfer it to the cloud for secure storage and interoperability management among different platforms. Data Governor Transactions on the cloud are managed by the data governor by allowing, revoking, and modifying access policies. The data governor has the capability to place smart contracts and is the sole governing unit able to monitor and update rules in smart contracts.


Smart Contracts Smart contracts maintain the security and access control in the IoMT paradigm. The interaction of users with smart contracts is controlled by the Application Binary Interface. In this way, smart contracts analyze and verify requests and grant privileges to users by initiating transactions and messages. Smart contracts govern all blockchain entities. Decentralized Storage We develop a decentralized storage system to store data on the blockchain and intend to use the InterPlanetary File System (IPFS) to build a file-sharing platform. The proposal is to store IoMT data on the IPFS nodes, with the corresponding hash values stored in the distributed hash table. Block Structure The blockchain consists of data blocks containing transactions that are represented using a Merkle tree. The path to a leaf node from the Merkle root represents the data access path of a transaction. Service requests are generated at the service access platforms that require data from IoMT sensors. A service user (e.g., a doctor requiring blood pressure sensor data) establishes a transaction that is signed with the private key of the sensor along with a timestamp, which aims at creating trust between the IoMT and the cloud server. Every block contains the hash of the current block, the previous block's hash, the Merkle root, a nonce, and a timestamp. We have discussed the components of the BSIIoMT framework and their responsibilities. The next section describes the overall working principle of the BSIIoMT framework.
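Before moving to the working principle, the block layout just described can be sketched in a few lines of Python; this is a hedged, self-contained illustration rather than the authors' implementation, and the example transactions are invented.

```python
# Sketch of a block header holding the previous hash, Merkle root, nonce, timestamp.
import hashlib, json, time

def sha256(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

def merkle_root(tx_hashes):
    """Hash pairs of transaction hashes level by level until one root remains."""
    level = tx_hashes or [sha256(b"")]
    while len(level) > 1:
        if len(level) % 2:                       # duplicate the last hash on odd levels
            level.append(level[-1])
        level = [sha256((a + b).encode()) for a, b in zip(level[0::2], level[1::2])]
    return level[0]

def make_block(prev_hash, transactions, nonce=0):
    header = {
        "prev_hash": prev_hash,
        "merkle_root": merkle_root([sha256(json.dumps(t).encode()) for t in transactions]),
        "nonce": nonce,
        "timestamp": time.time(),
    }
    header["hash"] = sha256(json.dumps(header, sort_keys=True).encode())
    return header

genesis = make_block("0" * 64, [{"sensor": "bp-42", "data_hash": "placeholder"}])
block_1 = make_block(genesis["hash"], [{"requester": "doctor-7", "resource": "bp-data"}])
```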

3.2 Working Principle of the BSIIoMT Framework In this section, we discuss the overall working principle of the BSIIoMT framework, including data collection, uploading, and secure downloading for efficient healthcare services provisioning. Transactions in the system are automatically managed using smart contracts and the blockchain, which provides greater efficiency and reliability. Every blockchain device takes part in the authentication process using a blockchain-based consensus mechanism. Figure 2 shows the overall components and working principle of the BSIIoMT framework, starting from data collection by IoMT sensors to secure, interoperable medical services provisioning to end-users. The numbered steps of the BSIIoMT framework are given in Fig. 2, starting from data collection to the provisioning of services to the healthcare service users. The working principle of the proposed BSIIoMT framework is elaborated in the following in chronological order. • An IoMT gateway identifies a request, in the form of a transaction from an IoMT device, to upload data to the edge cloudlets that is sent to the service receiver shown in the figure. • This request is processed by the blockchain client and forwarded to the IDC manager. The IDC manager communicates with the smart contract for the validation of the transaction received from the IoMT device.


Fig. 2 Components of the BSIIoMT framework showing numbered steps performed from data gathering to the provisioning of the services to the healthcare service users. Blockchain has continuous interaction in the workflow supporting security and interoperability throughout the process

• The IDC manager validates the transaction using the smart contract. Upon validation of the request, a response is forwarded back to the IoMT gateway to upload the data. • At this step, the IoMT gateway selects the required data from the sensors and encrypts it using the public key of the IDC manager. The gateway uploads the data to the IPFS storage on the central cloud and returns the hash of the data, which is stored in the hash table. • All the transactions are represented by data blocks and are first added to the transaction pool. The miners perform proof-of-work and add the transaction to the blockchain (number 7 in the figure). • This transaction is recorded at the IoMT gateway. The IoMT device connects with the blockchain, forwards the transaction, and signs it with the private key of the IDC manager on the cloud. • The IDC manager validates the access rights of the IoMT device using the smart contract and analyzes the service request from the IoMT device. The analysis involves confirming the ID of the IoMT device, after which the request is forwarded to IPFS to retrieve the data stored by the IoMT device. • The IDC manager decrypts the requested data and forwards it to the service requester. • The transaction is updated at the interface of the service requester for tracking using the blockchain client. • Once the transaction is successfully added, the data is forwarded to the IoMT medical service users. The medical service users perform computational processing on the data to accomplish the relevant service associated with the data. The sequence of steps shown above is used to verify a request and assign access rights to users; hence, secure interoperability of the IoMT data is maintained, as sketched in the example below.
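The following hedged sketch illustrates only the upload step of this workflow (the gateway encrypting sensor data for the IDC manager and recording a content hash). It uses RSA-OAEP from the cryptography package purely as an example of public-key encryption, and store_on_ipfs is a stand-in for a real IPFS client, so the names and data are illustrative rather than part of the BSIIoMT implementation.

```python
# Upload-step sketch: encrypt for the IDC manager, store ciphertext, keep its hash.
import hashlib
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import rsa, padding

_fake_ipfs = {}                               # placeholder for the distributed store

def store_on_ipfs(blob: bytes) -> str:
    cid = hashlib.sha256(blob).hexdigest()    # stand-in for a real IPFS content ID
    _fake_ipfs[cid] = blob
    return cid

idc_private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
idc_public_key = idc_private_key.public_key()

reading = b'{"sensor": "blood-pressure", "value": "120/80"}'
ciphertext = idc_public_key.encrypt(
    reading,
    padding.OAEP(mgf=padding.MGF1(algorithm=hashes.SHA256()),
                 algorithm=hashes.SHA256(), label=None))
data_hash = store_on_ipfs(ciphertext)         # recorded in the distributed hash table
```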


3.3 Design Considerations For the secure interoperability of IoMT in the healthcare paradigm on edge-cloud storage, the framework should achieve the following design considerations. • The framework should be able to ensure the identity of the IoMT device, its authentication, and its trustworthiness. Only authorized service users should be allowed to access IDC data, and potential attacks from adversaries should be prevented. • Designing a highly secure environment with reduced communication and computation overhead is a challenging issue. Although secure interoperability is required, the system should not add extra overhead in accessing IoMT content. The system should be lightweight and able to provide data access at low latency. • The framework should be able to provide a high level of interoperability, granular security, and privacy to the IoMT users, which would make it feasible for healthcare services provisioning. • To achieve the above design goals, we use trustworthy access control of IoMT data with data sharing using smart contracts and peer-to-peer storage using IPFS, which is designed to achieve decentralization.

4 Discussion The BSIIoMT framework will be able to provide secure, interoperable access to the IoMT data for the requesting users. We have identified specific design considerations based on the metrics discussed in [25, 26] that should be followed to measure the efficacy of the BSIIoMT framework. We are currently implementing the BSIIoMT framework, and the implementation will be presented in our later research. In this paper, we provide a qualitative analysis of the BSIIoMT framework and analyze whether it fulfills the required functionality. The most important function of the BSIIoMT framework is its conformance with the identified goals, where the core importance is given to secure interoperability. The proposed system preserves user identity and authentication information by strictly employing access verification from the IoMT devices to guarantee trustworthiness. The proposed system not only authorizes underlying users to access the IoMT data but also effectively prevents threats to the IDC. • The proposed system will be able to effectively employ a fast data retrieval process, which follows a strict authorization policy based on smart contracts. Furthermore, the proposed access control mechanism is lightweight, which avoids high network latency. Moreover, it overcomes the challenge of extra resource consumption and provides enhanced functionality. • The system will be able to achieve a higher level of accuracy and provides enhanced security and privacy. Therefore, it offers flexibility to mobile users, making it feasible for general-purpose healthcare scenarios.


• The BSIIoMT framework exploits the interaction of edge and cloud computing and can therefore accommodate an increasing number of IoMT devices to provide scalable services to users. Therefore, it could be deployed in a large-scale healthcare services provisioning paradigm. • The proposed system provides the flexibility of accessing various information metrics using the data generated by IoMT devices. Healthcare professionals can use this data for long-term decision-making and for performing analytics in a flexible way. All the above-mentioned goals are efficiently fulfilled by the proposed system, which effectively provides an access control and data sharing scheme by exploiting smart contracts to access IoMT data. The BSIIoMT framework overcomes the limitations of centralized techniques by providing stronger access control and reducing latency. We have performed a comparative analysis of the BSIIoMT framework against the state-of-the-art approaches using qualitative metrics in Table 1. The key comparison metrics, based on [25, 26], include interoperability, security, privacy, scalability, data integrity, and accuracy enhancement. We selected the latest research published in high-quality journals on secure interoperability in the IoMT paradigm. We can observe in Table 1 that the proposed system covers all the comparison metrics and provides a comprehensive solution for secure interoperability for large-scale healthcare services provisioning. We will present further evaluation and comparison of the BSIIoMT framework in our upcoming research.

5 Conclusion In this paper, we proposed a secure interoperability framework for IoMT using an edge computing-based blockchain. Critical challenges of current IoMT have been identified and a solution framework has been proposed to overcome those challenges. The BSIIoMT framework uses a smart contract system to ensure efficient and secure IoMT data sharing. We have proposed to use Ethereum blockchain on the central cloud where IoMT data is collected by the edge cloudlets in the vicinity of IoMT devices. We deploy a peer-to-peer storage system to achieve a decentralized data storage and sharing mechanism. A theoretical comparison with the state-of-the-art research shows that the proposed system overcomes challenges of secure interoperability and provides flexible healthcare services while ensuring higher scalability. We intend to perform implementation and experimental analysis in the future to prove the actual reliability of the BSIIoMT framework. The implementation of the BSIIoMT framework will help healthcare practitioners to access medical data over a mobile cloud environment in a rapid and reliable manner as compared to available techniques. We intend to provide a security analysis of the BSIIoMT framework from different technical aspects to show the efficacy of the BSIIoMT framework as compared to the other available systems. We are currently developing a testbed to evaluate the BSIIoMT framework where the results will be presented in our upcoming

Table 1 A comparison of the aspects covered in this paper with the state-of-the-art available research. We highlight the key components discussed in this research paper and compare them with the related research. The compared works and their research scope are: Abdellatif et al. [8] (data exchange), Egala et al. [15] (access control), Gopikrishnan et al. [12] (data exchange), Khan et al. [13] (malware detection), Lin et al. [17] (authentication), Kumar et al. [18] (interoperability), Rahman et al. [1] (disease detection), and the BSIIoMT framework (secure interoperability); each is assessed against interoperability, security, privacy, scalability, data integration, and accuracy.


research paper. The BSIIoMT framework will provide an efficient solution for secure interoperability management in the current heterogeneous healthcare paradigm.

References 1. Rahman MA, Hossain MS (2021) An internet-of-medical-things-enabled edge computing framework for tackling COVID-19. IEEE Internet Things J 8(21):15847–15854 2. Ibaida A, Abuadbba A, Chilamkurti N (2021) Privacy-preserving compression model for efficient IoMT ECG sharing. Comput Commun 166:1–8 3. Rafique W, Qi L, Yaqoob I, Imran M, Rasool RU, Dou W (2020) Complementing IoT services through software defined networking and edge computing: A comprehensive survey. IEEE Communications Surveys & Tutorials 22(3):1761–1804 4. Rafique W, Khan M, Zhao X, Sarwar N, Dou W (2019) A blockchain-based framework for information security in intelligent transportation systems. International Conference on Intelligent Technologies and Applications. Springer, pp 53–66 5. Rafique W, Khan M, Sarwar N, Dou W (2019) A security framework to protect edge supported software defined Internet of Things infrastructure. International Conference on Collaborative Computing: Networking, Applications and Worksharing. Springer, pp 71–88 6. Lv Z, Chen D, Lou R, Wang Q (2021) Intelligent edge computing based on machine learning for smart city. Futur Gener Comput Syst 115:90–99 7. Huang S, Wang S, Wang R, Wen M, Huang K (2021) Reconfigurable intelligent surface assisted mobile edge computing with heterogeneous learning tasks. IEEE Trans Cogn Commun Netw 7(2):369–382 8. Abdellatif AA et al (2021) Medge-chain: Leveraging edge computing and blockchain for efficient medical data exchange. IEEE Internet Things J 8(21):15762–15775 9. Oniani S, Marques G, Barnovi S, Pires IM, Bhoi AK (2021) Artificial intelligence for internet of things and enhanced medical systems, in Bio-inspired neurocomputing: Springer, pp 43–59 10. Hosseinzadeh M et al (2021) A multiple multilayer perceptron neural network with an adaptive learning algorithm for thyroid disease diagnosis in the internet of medical things. J Supercomput 77(4):3616–3637 11. (August 08) Switch Health provides COVID-19 testing and at-home healthcare solutions. https://www.switchhealth.ca/en/ 12. Gopikrishnan S, Priakanth P, Srivastava G, Fortino G (2021) EWPS: emergency data communication in the internet of medical things. IEEE Internet Things J 8(14):11345–11356 13. Khan S, Akhunzada A (2021) A hybrid DL-driven intelligent SDN-enabled malware detection framework for Internet of Medical Things (IoMT). Comput Commun 170:209–216 14. Malamas V, Chantzis F, Dasaklis TK, Stergiopoulos G, Kotzanikolaou P, Douligeris C (2021) Risk assessment methodologies for the internet of medical things: A survey and comparative appraisal. IEEE Access 9:40049–40075 15. Egala BS, Pradhan AK, Badarla V, Mohanty SP (2021) Fortified-chain: a blockchain-based framework for security and privacy-assured internet of medical things with effective access control. IEEE Internet Things J 8(14):11717–11731 16. Zeng P, Zhang Z, Lu R, Choo K-KR (2021) Efficient policy-hiding and large universe attributebased encryption with public traceability for internet of medical things. IEEE Internet Things J 8(13):10963–10972 17. Lin P, Song Q, Yu FR, Wang D, Guo L (2021) Task offloading for wireless VR-enabled medical treatment with blockchain security using collective reinforcement learning. IEEE Internet Things J 8(21):15749–15761 18. Kumar M, Chand S (2021) MedHypChain: A patient-centered interoperability hyperledgerbased medical healthcare system: Regulation in COVID-19 pandemic. J Netw Comput Appl 179:102975


19. Lee E, Seo Y-D, Oh S-R, Kim Y-G (2021) A survey on standards for interoperability and security in the internet of things. IEEE Commun Surv & Tutor 23(2):1020–1047 20. Park C-Y, Lim J-H, Park S (2011) ISO/IEEE 11073 PHD standardization of legacy healthcare devices for home healthcare services. In: 2011 IEEE International Conference on Consumer Electronics (ICCE), IEEE, pp 547–548 21. Frohner M, Urbauer P, Bauer M, Gerbovics F, Mense A, Sauermann S (2009) Design and realisation of a framework for device endcommunication according to the IEEE 11073–20601 standard In: Proceedings Tagungsband der eHealth, Citeseer, pp 135–139 22. Khattak HA, Ruta M, Di Sciascio EE (2014) CoAP-based healthcare sensor networks: A survey. In: Proceedings of 2014 11th International Bhurban Conference on Applied Sciences & Technology (IBCAST) Islamabad, Pakistan, 14th-18th January, 2014, IEEE, pp 499–503 23. Hu T et al (2021) Transaction-based classification and detection approach for Ethereum smart contract. Inf Process Manage 58(2):102462 24. Vivar AL, Orozco ALS, Villalba LJG (2021) A security framework for Ethereum smart contracts. Comput Commun 172:119–129 25. Noura M, Atiquzzaman M, Gaedke M (2019) Interoperability in internet of things: Taxonomies and open challenges. Mob Netw Appl 24(3):796–809 26. Abou-Nassar EM, Iliyasu AM, El-Kafrawy PM, Song O-Y, Bashir AK, Abd El-Latif AA (2020) DITrust chain: towards blockchain-based trust models for sustainable healthcare IoT systems, IEEE Access, 8, pp 111223–111238

WaterCrypt: Joint Watermarking and Encryption Scheme for Secure Privacy-Preserving Data Aggregation in Smart Metering Systems Farzana Kabir, David Megías, and Tanya Koohpayeh Araghi

Abstract As the world embraces the new era of smart technologies and the development of IoT equipment, such as smart grid and metering systems, a significant concern related to the privacy and security of users' confidential information is growing rapidly. Many of the existing solutions still suffer from high time and power consumption and from security vulnerabilities. Secure data aggregation in smart metering systems is still a challenging task due to a plethora of possible cyber and physical attacks. This paper presents the novel WaterCrypt technique based on reversible watermarking and Paillier encryption, which significantly reduces the battery consumption of resource-constrained smart meters by introducing a dedicated encryption server and by making use of homomorphic encryption properties to secure real-time data transmission. In addition, the reversible watermarking technique used in the protocol guarantees the integrity and authenticity of data. The experimental results show that the proposed scheme offers a privacy-friendly, secure data transmission and aggregation solution for smart metering systems in a cost-effective manner. Keywords IoT · Smart home · Smart meter · Paillier cryptography · Reversible watermarking · Data aggregation

F. Kabir (B) · D. Megías · T. K. Araghi Internet Interdisciplinary Institute (IN3), Center for Cybersecurity Research of Catalonia (CYBERCAT), Universitat Oberta de Catalunya, Rambla del Poblenou, 154, 08018 Barcelona, Spain e-mail: [email protected] D. Megías e-mail: [email protected] T. K. Araghi e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 S. Anwar et al. (eds.), Proceedings of International Conference on Information Technology and Applications, Lecture Notes in Networks and Systems 614, https://doi.org/10.1007/978-981-19-9331-2_47


1 Introduction A smart meter is automated equipment installed in smart homes for measuring the consumption of various commodities such as gas, water, and electricity [12]. It receives pricing information and load forecasting from the utility company while displaying all the information on the in-home display (IHD) [1]. A smart metering system can be defined as the infrastructure between smart meters and different remote entities, such as users, utility service providers, and data aggregators, taking advantage of different communication technologies like ZigBee or Wi-Fi [2, 10], as shown in Fig. 1. Metering data is collected from each smart meter very frequently and aggregated before being sent to the service provider/control center. Data exchange between endpoints is vulnerable to security and safety threats, as well as to the risk of malicious attacks. Data can be transferred from the smart home appliances to the SM using the home area network and to the utility center through a wide area network. The large amount of information generated by SMs provides the opportunity for service providers to monitor and control power utilities in real time [7]. Illegally compromising a smart meter can cause huge financial or personal damage [9]. To deal with all possible threats related to smart metering systems, several security technologies and different privacy-preserving schemes are being developed. However, due to the limited computing power of smart meters, heavy computational tasks required by cryptographic algorithms, such as Homomorphic Encryption (HE), cannot be executed on them. Data hiding methods, like digital watermarking, ensure the integrity and authenticity of the content with lightweight computational operations. Watermarking-based aggregation schemes, on the other hand, cannot provide a high level of security and are vulnerable to various attacks [5]. Considering the limitations of existing watermarking and cryptographic schemes, we present a novel privacy-preserving data aggregation (P2DA) scheme, WaterCrypt, to mitigate the risk of confidential

Fig. 1 Smart metering system


information leakage based on reversible digital watermarking and Paillier homomorphic encryption. In the proposed scheme, a dedicated encryption server is introduced to perform HE and preserve data confidentiality against security breaches, while reducing the energy consumption of smart meters. The rest of the paper is organized as follows. Sections 2 and 3 present the related works and the overview of the proposed scheme, respectively. System performance evaluation and experimental results are described in Sect. 4. Finally, the conclusion and future work are discussed in Sect. 5.

2 Related Works This section aims to provide a general overview of smart meter security measures, existing issues, possible attacks, and some approaches proposed to face the challenges in this area. In 2021, Mohammadali and Haghighi [8] proposed a novel homomorphic privacy-preserving protocol (called NHP3) for data aggregation to support multi-category aggregations. It supports batch verifications as well as multi-dimensional aggregations, and the Paillier cryptosystem is used. This scheme is focused on cyberattacks, but possible physical attacks are not investigated by the authors. Also, applying Paillier encryption in smart meters may cause high computational complexity. In 2022, Ming et al. [6] proposed a P2DA scheme that applies lightweight symmetric homomorphic technology and an elliptic curve signature to achieve efficiency. It resists common attacks, such as collision, modification, and replay attacks. However, different physical attacks were not proven to be fully prevented. In 2017, Ni et al. [10] introduced a new security model to formally define the misbehavior of hacked collectors and proposed a P2DA scheme to achieve end-to-end security and highly efficient communication in smart grids using HE. In their proposal, the authors did not consider an operation center to detect dishonest behaviors, which is necessary for a strong privacy-preserving protocol. Chen and Xiong [3] proposed a privacy protection method for WSNs using wavelet dual watermarking to prevent tampering or packet loss and significantly improve data integrity in SMs. Since the proposed scheme uses two different watermarking techniques simultaneously, its time delay and cost effectiveness need to be further analyzed. Recently, in 2022, Wang et al. [13] proposed the idea of combining digital watermarking and asymmetric encryption for a privacy-preserving scheme in smart grids. The sensitive data is encrypted using the public key and is hidden in the collected readings using digital watermarks. The proposed method ensures secure end-to-end confidentiality. However, in their scheme, the data are distorted, since the watermarking scheme is not reversible.


3 Overview of the Proposed Scheme 3.1 System Model In the proposed scheme, a residential area with multiple smart homes is considered, where multiple SMs are used. The usage data is recorded in the SMs periodically. The entire system consists of four entities: (1) Smart Meters (SMs), (2) Encryption server (ES), (3) Data Aggregator (DA), and (4) Control Center (CC). SM is responsible for embedding the watermark in the original data and sending to the ES. ES encrypts the watermarked data and sends it back to SM. Before sending the data to DA, SMs preprocess it (e.g. pseudorandom number-based encryption). After receiving data from all the SMs, DA aggregates the watermarked encrypted data and sends it to CC eventually. CC decrypts and extracts the total consumption data after verification and stores it for future analysis.

3.2 Threat Model An attacker can be an internal or external entity [8]. An attacker can try to gain unauthorized access to SM data for various dishonest purposes. We denote the attacker as A, where he/she may attempt the following attacks: • Eavesdropping: A can eavesdrop on the transmission channel between the system entities to access daily lifestyle and behavior of the user for crime purposes. • Man-in-the middle attack (MitM): A can intercept the messages sent between the system entities to retrieve, discard, or modify the SM data. • Replay attack: A can perform replay attacks on the network to resend, replay or delay an SM’s report maliciously. • Masquerading attack: An external A can pretend to be an internal entity to gain unauthorized access to the system data. • Impersonation: A may impersonate one or a group of SMs to send fake data on behalf of non-compromised SMs to the CC. • False data injection (FDI): There might be physical attacks with the purpose of inserting a malicious node or injecting malicious codes for compromising the SM.

3.3 Proposed Scheme The proposed scheme consists of several parts which take place in the four components of the system model.

3.3.1 Initialization

It is assumed that the entire residential area has n SMs. The number of SMs (n) is assumed to be odd. If n is not odd, it is always possible to break it into two odd numbers. The reason for considering n to be odd is explained at the end of this section. Considering m time frames, each smart meter SM_i generates data d_ij at each time period t_j, where i = 1, 2, 3, ..., n and j = 1, 2, 3, ..., m. In the proposed scheme, we consider the consumption data to be generated every half an hour (30 min). At each t_j, each SM generates three pseudorandom numbers using three different seeds called Seed_i1, Seed_i2, and Seed_i3. These pseudorandom numbers are denoted as R_ij1, R_ij2, and R_ij3, respectively. We employ a pseudorandom number generator (PRNG) to generate them using a key (K_r). The SM shares R_ij1, R_ij2, and R_ij3 only with the ES, DA, and CC, respectively. The CC generates a public and private key pair (PK, SK) for Paillier encryption and decryption [11]. SK is completely unknown to any entity other than the CC. The watermark W is generated based on a cryptographic hash function, more precisely the Secure Hash Algorithm-2 (SHA-2). This hash function uses a secret key (K_w) generated by the SM and the timestamp t_j as: H_j ← HASH(K_w, t_j). The watermark is generated using the following formula: W = H_1 ⊕ H_2 ⊕ H_3 ⊕ ... ⊕ H_m, where W is converted to binary, and m bits of the binary watermark (W_j) are used for embedding into the SM data in the m time frames.
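A minimal sketch of this watermark generation is given below; it assumes SHA-256 as the SHA-2 variant and simple concatenation of the key and timestamp inside the hash, and the key and timestamps are made-up values, so it should be read as an illustration rather than the exact WaterCrypt implementation.

```python
# Watermark generation sketch: W = H_1 xor H_2 xor ... xor H_m, one bit W_j per frame.
import hashlib

def frame_hash(k_w: bytes, t_j: str) -> int:
    # H_j = HASH(K_w, t_j), interpreted here as an integer so that XOR can be applied
    return int.from_bytes(hashlib.sha256(k_w + t_j.encode()).digest(), "big")

def watermark_bits(k_w: bytes, timestamps):
    w = 0
    for t_j in timestamps:                    # accumulate the XOR of all frame hashes
        w ^= frame_hash(k_w, t_j)
    bits = bin(w)[2:].zfill(256)              # 256-bit binary representation of W
    return [int(b) for b in bits[:len(timestamps)]]   # first m bits give the W_j

k_w = b"per-meter-secret-key"                 # illustrative secret key K_w
timestamps = [f"2014-01-01T{h:02d}:{mi:02d}" for h in range(24) for mi in (0, 30)]
w_bits = watermark_bits(k_w, timestamps)      # 48 half-hour frames in one day
```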

3.3.2 Protocol Phases

The WaterCrypt protocol consists of five phases that take place in the different entities of the system. The framework is shown in Fig. 2. • Phase 1 in SM: The least significant bit (LSB) is used for embedding one bit of the watermark into the original data generated at each SM_i. We obtain the watermarked data d'_ij by multiplying d_ij by 2 and then adding W_j as the LSB. A pseudorandom number, R_ij3, is added to d'_ij, and then R_ij1 encrypts the data using XOR. The SM sends the watermarked encrypted data, E_ij, to the ES. • Phase 2 in ES: The ES decrypts E_ij using R_ij1. Then it generates the ciphertext P_ij by applying the Paillier encryption. For secure communication, P_ij is again encrypted with R_ij1 using XOR, giving P'_ij. The ES sends the result P'_ij back to the SM. • Phase 3 in SM: The SM performs an XOR operation using R_ij1 to decrypt the data and obtain P_ij. Finally, it encrypts P_ij again using R_ij2 before sending it to the DA. • Phase 4 in DA: The DA receives the watermarked encrypted data from all the n SMs and performs the decryption using R_ij2 to recover P_ij. Then it aggregates all P_ij together using a simple addition in the encrypted domain. This aggregated data Q_j (Σ P_ij) is transferred to the CC. • Phase 5 in CC: The CC decrypts Q_j using the private key (SK), obtaining V'_j (Σ (d'_ij + R_ij3)). The CC then subtracts Σ R_ij3 from V'_j to obtain V_j (Σ d'_ij). The watermark is then validated by comparing the LSB of V_j with W_j. If they are not equal, it is assumed


Fig. 2 Protocol of the WaterCrypt scheme

that the data has been tampered with and is declined immediately. If the LSB is correct, the CC extracts the watermark and retrieves the sum of the real usage data, D_j (Σ d_ij). Note that the LSB of d'_ij after data embedding is equal to W_j. Therefore, d'_ij is even if W_j = 0 and odd if W_j = 1. When aggregating n different d'_ij, if all of them are odd and n is also odd, the final result will be odd as well. If W_j = 0, all d'_ij are even and their sum will also be even. Hence, the LSB of the aggregated data will be equal to the watermark bit as long as n is odd. Therefore, without loss of generality, we assume that either n is odd, or we split the SMs into two groups with an odd number¹ of SMs.
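A small worked sketch of this embedding and parity check follows; it is our own illustrative reading of Phases 1 and 5 (the masking, XOR, and Paillier steps are deliberately omitted), and the meter readings are invented.

```python
# LSB embedding d'_ij = 2*d_ij + W_j and the parity check on the aggregate (n odd).
def embed(d_ij: int, w_j: int) -> int:
    return 2 * d_ij + w_j                  # shift left, place watermark bit in the LSB

def verify_and_extract(total: int, w_j: int, n: int) -> int:
    assert n % 2 == 1, "the parity argument requires an odd number of meters"
    if total & 1 != w_j:                   # LSB of the aggregate must equal W_j
        raise ValueError("watermark mismatch: data tampered")
    return (total - n * w_j) // 2          # undo the reversible embedding

w_j = 1
readings = [13, 40, 27]                    # n = 3 smart meters in one time frame
aggregate = sum(embed(d, w_j) for d in readings)
assert verify_and_extract(aggregate, w_j, n=3) == sum(readings)
```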

4 Experimental Results and Performance Evaluation We assess the performance of the proposed WaterCrypt scheme by calculating the total computational cost of the initialization process and the time of data processing at each phase described in Sect. 3.

1 It is always possible to write an even n as the sum of two odd numbers, e.g. n = 82 = 41 + 41, or n = 100 = 51 + 49.

Table 1 Computational time of different phases

Operations                    Computational time (best of 5) (s)
Key generation                1.6900
Random number generation      0.5150
Watermark generation          0.0028
Phase-1 in SM                 0.7050
Phase-2 in ES                 0.4400
Phase-3 in SM                 0.4430
Phase-4 in DA                 1.1000
Phase-5 in CC                 2.0000

4.1 Computational Time For this experiment, we have used real data sets of energy consumption readings for a sample of 5,567 London households that took part in the UK Power Networks-led Low Carbon London project in 2014 [4]. The experiments have been carried out on a platform with an 11th Gen Intel(R) Core(TM) i3-1125G4 @ 2.00 GHz running Microsoft Windows 10 Pro with 8 GB of memory. The Python language is used for the implementation. For Paillier cryptography, we have used the python-paillier library (phe 1.5.0). Table 1 shows the computational times of the various phases for each timestamp. It can be clearly seen that, even though the initialization takes somewhat longer, the operations in the various phases are very fast. The overall computational overhead is significantly lower in our scheme compared with other existing schemes. In addition, WaterCrypt outperforms the existing works in terms of accuracy. Due to the use of reversible watermarking, the summation of the energy consumption data of an area is obtained accurately at the CC. Comparing the proposed WaterCrypt scheme with Wang's joint scheme [13], WaterCrypt clearly shows better performance in terms of availability of data by using reversible watermarking. The issue of the resource limitation of SM devices is also resolved by employing an encryption server. In [13], the authors used direct watermarking, which led the scheme to have data distortion, whereas the proposed reversible watermarking scheme shows no distortion at all.
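Since the phe library is named above, a minimal usage sketch of the additive homomorphism that Phase 4 relies on is shown here; the key size and readings are illustrative and not taken from the paper's experiments.

```python
# python-paillier (phe): ciphertexts can be added without knowing the secret key.
from phe import paillier

public_key, private_key = paillier.generate_paillier_keypair(n_length=2048)

readings = [27, 81, 55]                               # watermarked meter values
ciphertexts = [public_key.encrypt(r) for r in readings]

aggregate_ct = sum(ciphertexts[1:], ciphertexts[0])   # addition in the encrypted domain
assert private_key.decrypt(aggregate_ct) == sum(readings)
```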

4.2 Security and Privacy Analysis In the proposed protocol, the SM embeds the watermark and, later on, encrypts the data using a pseudorandom number to maintain proper confidentiality. The ES cannot retrieve the watermarked data as it does not have access to R_ij3. The DA has access neither to the private key (SK) nor to R_ij3. The CC has no access to an individual user's fine-grained data.


Table 2 Security and privacy comparison of the proposed scheme with Wang et al. [13], Mohammadali et al. [8], Ni et al. [10], and Ming et al. [6], evaluated against confidentiality (C), authenticity (A1), integrity (I1), anonymity (A2), and indistinguishability (I2). Signs: √ yes, – not mentioned

To ensure the integrity of the user data, we use reversible watermarking, which can successfully recover the original data. The watermarked data is encrypted, so that the original data cannot be manipulated by an adversary, not even by a compromised ES, DA, or CC. The seeds are different for every SM and act as an authentication parameter of each SM, making it easy to verify the authenticity of data at every phase. The user identity remains completely anonymous since individual users' data cannot be singled out. Therefore, an attacker will not be able to determine any information about the user's behavior or lifestyle. Table 2 shows the comparison of our proposed protocol with other related work in terms of security and privacy.

4.3 Comparative Threat Analysis As shown in Table 3, all six types of attacks mentioned in Sect. 3.2 are prevented by the proposed WaterCrypt scheme. Comparing it with the related works mentioned in Sect. 2, it can be clearly seen that WaterCrypt is superior in terms of attack detection and prevention. Since the data is encrypted with pseudorandom numbers, an eavesdropper can never access the plaintext data.

Table 3 Threat analysis
Methods                   E    I    M    R    MitM   FDI
Wang et al. [13]          √    –    –    √    –      –
Mohammadali et al. [8]    √    √    –    –    –      –
Ni et al. [10]            √    –    –    √    –      –
Ming et al. [6]           –    –    –    √    √      –
Proposed scheme           √    √    √    √    √      √
E Eavesdropping, I Impersonation, M Masquerading, R Replay attack, MitM Man-in-the-middle, FDI False data injection. Signs: √ yes, – not mentioned


In the proposed protocol, the timestamp prevents replay attacks, since the pseudorandom numbers R_ij3 change at every timestamp. This means that the decryption of replayed data will be wrong and the decrypted data will typically have an impossible value. Masquerading is also prevented in the proposed scheme, since the data in every communication channel is encrypted with different seeds. Only the sender and receiver have the right seeds for decrypting the real data. If an attacker tries to send fake data on behalf of an internal entity, the correct pseudorandom number must be used and the watermark bit must be verified. Man-in-the-middle attacks can be detected and avoided by preventing the attacker's access to the private key of the CC and to the seeds used to encrypt plaintext data on each path. Furthermore, the proposed scheme adds an extra degree of protection by watermarking the data while transferring it, to prevent false data injection attacks. As can be seen, the proposed scheme can counter all the threats mentioned in Table 3, while the model of Wang et al. [13] can only resist eavesdropping (E) and replay (R) attacks. Mohammadali et al. [8] counter E and I, while the model of Ni et al. [10] is robust against E and R. Ming et al. [6] prevent only MitM and R, whereas the proposed scheme is robust against all the investigated threats mentioned in Table 3. This, in addition to the speed and accuracy of the proposed scheme, makes it a good candidate to replace the related works mentioned in Table 3.

5 Conclusion and Future Work This paper presents the novel WaterCrypt scheme for privacy-preserving data aggregation based on reversible watermarking and Paillier encryption. An LSB-based reversible watermarking approach is used to preserve the integrity of the users' fine-grained data. Considering that HE can provide well-grounded security to the data on the one hand, and the limited computational capability of SMs on the other, we introduced another entity into the system: the encryption server. As a substitute for digital signatures for authentication, a simple PRNG is employed to make the protocol more lightweight. The WaterCrypt protocol provably protects the system from various attacks such as eavesdropping, replay, man-in-the-middle, impersonation, and masquerading. Physical attacks like false data injection are also taken into consideration. Moreover, the computational time is significantly low. Using reversible watermarking, the WaterCrypt protocol performs better than other recent P2DA schemes. After analyzing the security and privacy requirements and evaluating the performance, it can be concluded that we have successfully implemented a suitable and secure P2DA scheme for resource-constrained smart metering systems. For future work, the focus will be on high-frequency smart meters where the usage data are generated very frequently (e.g., every few seconds or once a minute). We


aim to employ reversible watermarking using difference expansion for this kind of smart metering systems. Acknowledgements The authors acknowledge the funding obtained by the RTI2018-095094-BC22 “CONSENT” and PID2021-125962OB-C31 “SECURING” projects granted by the Spanish Ministry of Science and Innovation. The first author also acknowledges the predoctoral grant PRE2019-091465 by the Spanish Ministry of Science and Innovation.

References 1. Asghar MR, Dán G, Miorandi D, Chlamtac I (2017) Smart meter data privacy: a survey. IEEE Commun Surv Tutor 19(4):2820–2835 2. Burunkaya M, Pars T (2017) A smart meter design and implementation using ZigBee based wireless sensor network in smart grid. In: Paper presented at the 2017 4th international conference on electrical and electronic engineering (ICEEE), pp 158–162 3. Chen Q, Xiong M (2016) Dual watermarking based on wavelet transform for data protection in smart grid. In: Paper presented at the 2016 3rd international conference on information science and control engineering (ICISCE), pp 1313–1316 4. DataStore L (2018) Smartmeter energy consumption data in london households 5. Kabir F, Qureshi A, Megıas D (2021) A study on privacy-preserving data aggregation techniques for secure smart metering system 6. Ming Y, Li Y, Zhao Y, Yang P (2022) Efficient privacy-preserving data aggregation scheme with fault tolerance in smart grid. Secur Commun Netw 7. Moghaddass R, Wang J (2017) A hierarchical framework for smart grid anomaly detection using large-scale smart meter data. IEEE Trans Smart Grid 9(6):5820–5830 8. Mohammadali A, Haghighi MS (2021) A privacy-preserving homomorphic scheme with multiple dimensions and fault tolerance for metering data aggregation in smart grid. IEEE Trans Smart Grid 12(6):5212–5220 9. Nateghizad M, Erkin Z, Lagendijk RL (2016) An efficient privacy-preserving comparison protocol in smart metering systems. EURASIP J Inf Secur 2016(1):1–8 10. Ni J, Zhang K, Lin X, Shen XS (2017) Balancing security and efficiency for smart metering against misbehaving collectors. IEEE Trans Smart Grid 10(2):1225–1236 11. Paillier P (1999) Public-key cryptosystems based on composite degree residuosity classes. In: Paper presented at the international conference on the theory and applications of cryptographic techniques, pp 223–238 12. Sun Y, Lampe L, Wong VW (2017) Smart meter privacy: exploiting the potential of household energy storage units. IEEE Internet Things J 5(1):69–78 13. Wang S-X, Chen H-W, Zhao Q-Y, Guo L-Y, Deng X-Y, Si W-G et al (2022) Preserving scheme for user’s confidential information in smart grid based on digital watermark and asymmetric encryption. J Central South Univ 29(2):726–740

Technological Applications for Smart Cities: Mapping Solutions Bruno Santos Cezario and André Luis Azevedo Guedes

Abstract This study investigates, through the analysis of literature review articles, the topics of technological applications and smart cities. There is a problem in evaluating how such applications should be mapped onto investments in a smart city. There is a gap in city governance concerning priority investment solutions that can assist decision making in cities to make them smart and sustainable. Therefore, the objective of this article is to gather information on studies of technological applications combined with dimension/driver mapping solutions to transform cities into smarter ones. The study used a bibliographic analysis based on the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) methodology, applying smart cities and technological applications as basic search parameters. As a result, it is expected that the use of technology will make it possible to define priority investment drivers. Keywords Smart Cities · Technological Applications · Governance · Sustainability

1 Introduction Articles related to smart cities and technological applications were used as a basic foundation in the construction of this research. This research seeks to verify how technological applications can be used in the mapping of solutions defined as priorities that help local or regional development, with direct assistance from citizens. According to a UN study, there will be a substantial increase in people in urban areas in the coming decades, which will influence the entire social ecosystem. Given this, it is imperative that governments invest in the use of new technological tools to increase quality of life [4]. There is a deficiency in the governance of the referenced B. S. Cezario (B) · A. L. A. Guedes Centro Universitario Augusto Motta, Rio de Janeiro, Brasil A. L. A. Guedes e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 S. Anwar et al. (eds.), Proceedings of International Conference on Information Technology and Applications, Lecture Notes in Networks and Systems 614, https://doi.org/10.1007/978-981-19-9331-2_48




cities, this deficiency being referenced in studies as the main problem faced by cities [11]. More than ever, with the use of information technology, it is essential to map possible problems faced by citizens and detect which ones are a priority, helping to direct investment to the public power, which in turn can help other sectors of society in making decisions. Decision-making and policy creation [15]. The use of technological applications to map investment priorities was used as an interrogative taken into account when choosing the articles that will be used in this study. Therefore, this review article will describe how a smart city can make use of technological tools that can help improve living conditions by boosting urban systems efficiently. There is a great concern with the development of cities so that they have a sustainable and environmental alignment [22]. In this way the content is aligned with the United Nations (UN) Sustainable Development Goals and make up the 2030 Agenda.

2 Proposed Methodology The methodology used in this review article was the systematic bibliographic research given the plurality of the study involving smart cities. In this way, different search vehicles were used aiming at greater assertiveness in the process. They are: Portal de Periódicos Capes and Scielo. These scientific search vectors contain information about various academic texts. To carry out the research, the term “smart cities” was used as keywords, as well as its translation into English “smart cities”. The research carried out for the prism was carried out during the month of April 2021, in the Portal de Periódicos Capes, 33.630 allusions were found. The Scielo portal returned 130 results. Based on the above considerations, we have a research base of 33.760 references. However, for the process to be executed as successfully as possible, the term “smart cities” was combined with the words “technological applications”, “governance” and “sustainability”, as in the initial research, terms translated into English were also used. In addition, only articles published in the last 5 years, peer-reviewed, were inserted as search filters so that our information collection was as recent and reliable as possible. In this way we arrive at 48 items. These were checked so that if duplicate articles were found, they would be excluded from the research. However, there was no need to remove any article because it was duplicated. Following the screening logic pointed out by [24] for systematic reviews and meta-analyses (PRISMA). In this study we followed a basic procedure that consists of exploratory reading of the abstracts of the articles to verify better adherence to the study. And as exclusion criteria, we saw the lack of confluence with the theme proposed for this review and the verification of subjects outside the context of the project. This phase resulted in the exclusion of 28 articles. Then, the texts of the 20 remaining articles were selectively read and evaluated in their entirety so that only texts with high relevance to the investigated topic and that could bring new elements or corroborate the proposal of this project resulted. This analysis did not result in any deletion of texts. In this study, 20 articles were used as a reference for the creation of a model that would guide the understanding of technological applications as mapping



solutions for smart cities. To attest to the veracity of the research and comply with the methodological purposes that aim at transparency and trust in the process, the worksheet will be made available in annex. In a time without new relevant texts found, the bibliographic review research is completed (Fig. 1).

Fig. 1 Summary of bibliographic research using the flowchart PRISMA



3 Evolution of Cities to Intelligent Cities The concept of smart cities grew out of a fashionable political ideal that was widely diffused in many countries and widely accepted as a vision of greatness and development [8]. It has an association with the concept of smart growth, which has been directly linked to the structural context of urban planning in cities since the 1990s as a long-term strategy [3]. As described above, the concepts are based on an ideological premise that combines sustainability and technology-driven development. Therefore, such an understanding of smart cities is still diffuse, because it mixes the services offered in the city and the technologies used to provide them in a frenetic and conscious interaction that aims to help and improve life for its citizens in all fields of urban life, be they infrastructure, mobility, governance, social development, educational development, good practice in public policies, health, tourism, green management, the economy, among others. There is a cyber-sociotechnical system of a smart city in which humans and systems interact with the objective of improving the quality of urban life (Antônio 2021). This inserts a disciplinary notion of heritage for smart cities, a kind of heritage in the construction of the sustainable processes that are part of a smart city. This heritage, which can also be called a legacy that remains after the creation or evolution of the city, is understood as a learning process. We should note that all this structural baggage must come with a view toward sustainability, with the help of innovative solutions. We should also highlight that citizen participation plays a fundamental role in managing the evolutionary process of cities, mainly by helping in decision making [22]. Starting from a measured evolution towards a smart city, we should note that there must be a plan for reaching this new level, as all evolution demands time and investment: intellectual investment, which could come through studies and conclusions that compile guiding ideas; monetary investment, with the funds necessary to acquire the means for the expected end; legislative investment, creating laws that keep the process in legal harmony with society and the place where the process takes place; and social investment, with the help of the population integrated into the whole process, whether demanding transparency or monitoring it. It is necessary to have a comprehensive understanding of a smart city; only in this way is it possible to merge knowledge and experiences with innovation [12].

4 Technological Applications for Smart Cities According to the literature review, the growth of smart cities involves an entire process of interconnection between sectors and brings an immense load of accessible data, hence the need for analysis at all levels to support decision-making that is sustainable and that uses technological tools to manage this process [8]. Technological applications are used for monitoring services within a smart city, whether in mobility,

infrastructure, waste management, water management, health, governance, among others. These applications have grown very quickly and developed exponentially in order to facilitate urban development and bring smart cities closer to sustainable standards. According to [9], the UN's 2030 Agenda is categorical in saying that technological tools will help in the transformation towards development in accordance with sustainability. We see today that technologies are transforming environments and increasingly renewing the paradigm of cities; within the urban perimeter it is essential to use technology as a creative force to effect the necessary changes [21]. It is also through the use of technological tools in smart cities that citizens perceive and become aware of the evolutionary process of cities and of their individual participation in relation to sustainability, since these tools are able to fully engage citizens in the use of, and daily contact with, urban services [16]. There are studies describing how the use of technological tools, the collection of data via big data, and the application of smart concepts and techniques to cities constitute what has been called "smartification" [20]. We must take a "techno-optimistic" view [17] of the use of technologies, as this helps critical reflection on how to make better use of the tools.

5 Use of Technology for Mapping Priority Drives in Cities Over time, the use of technology in cities has been extremely important as a strategic means for an evolutionary path and constant development towards smart cities, but there is a gap in the correct use of these technologies if we are to meet the goals set for sustainable development in line with the United Nations (UN) 2030 Agenda. There is a tendency for public managers to strengthen their relationships with citizens, companies and the federative bodies themselves through the use of information technology; this new approach serves as a platform that is called electronic government [7], a type of government premised on the use of technological platforms to strengthen relations with all sectors. It is with this orientation towards supporting the governance and management of cities that we can direct technologies to map priorities, here called "drives" for a better understanding of the project. Such drives are of fundamental help in the creation of public and private policies for sustainable development and the evolution of smart cities, because they make it possible to measure where investments should be allocated, whether in health, education, governance, mobility or environmental management, according to the result of the priority mapping and a holistic view of the process [12]. In the case of urban governance, mapping the drives is of fundamental help in monitoring cities, identifying problems and even generating solutions, since cities and technologies develop in record time and this process is very fast and intense. The mapping of investment priorities is also of fundamental importance for sustainable urban development, as it is in this environment that the greatest number of undirected problems resides. Only in this way is it possible to carry out so-called urban innovation, which consists of the adoption of technological tools to improve

the infrastructure of cities through an intelligent fusion of the actors that make up society [18].

6 Innovation Quadruple Helix Engagement The quadruple helix of innovation is the interconnection of civil society, academia, governments and the private sector, involved and focused on a common strategic objective for the development of innovation, in which the environment plays a fundamental role and each actor has a fundamental role in the dynamic. This "team", we might say, defines goals and actions together, but each actor acts through its own means so that, in the end, the delimited objective can be reached. As described above, in this ecosystem each actor plays its role in the dynamics of innovation while pursuing the same objective. This interconnection of entities allows the entire creation process to be seen and monitored regardless of which class of actors one belongs to, since the participation of civil society broadens this vision, being a junction of all actors. For this reason, in this helix concept, civil society is very important, as it has a role in monitoring the objectives and targets set [23] (Fig. 2).

Fig. 2 Quadruple helix of innovation

6.1 Government Helix The government helix includes the engagement of the federal, state and municipal governments in the demands. Government participation thus comes from supporting the development of cities with policies aimed at transforming them. Public managers are needed who follow the development and can understand the process of change in smart cities, so that the resources they manage are put to good use. It is therefore essential to follow the precepts established in the statute of cities, so that development is based not on inconsistency but on legislation planned to obtain solid development plans [13].

6.2 Academia Helix The academia helix has a learning and training role in the innovation ecosystem, and for this reason there is a tendency for universities to make increasing use of the vocabulary linked to smart cities [2], since they are the ones who prepare and integrate innovative knowledge and its precursors in society and also help in the development of sustainable technologies, which in the view of this article is of crucial importance for alignment with the UN 2030 Agenda. Academia intertwines with the private sector at the other end of the quadruple helix of innovation. As described above, the combination of the two helices (academia and the private sector) allows for knowledge management and control of the improvement policy aimed at the productive market, in addition to providing the required workforce.

6.3 Private Sector Helix The private sector helix is a driver of this ecosystem, as it adds market values to the proposals by contributing the amounts to be invested in the projects and stipulating deadlines for the presentation and conclusion of the proposals, assisting in the management and governance carried out by governments, in addition to the supervisory role that all the helices represented here share. It is also this sector that aims to connect innovation ideas with society. Every productive sector fits into this end of the system and will mostly be the supplier of technology and of the substantial financial investments that will be applied to mapping investments in smart cities [15].

6.4 Civil Society Helix The civil society helix is particularly important in this context because it encompasses all citizens regardless of social class, so there is a complete and active inclusion in innovation: the entire innovation ecosystem is contained in civil society, making the process self-aware and literally creating a body in which interactions are mutual and recurring, since at times the citizen may also belong to the academia or private sector helices, whether in the creation of the project, in setting the objectives or in managing the goals.

7 Sustainable Technology in Cities This article contributes to consolidating sustainability in cities, as it describes how the correct use of technological tools combined with the participation of society helps urban development in a way that respects the space achieved, people and the environment. Smart cities must, as a premise if not an obligation, base their development on sustainability, always meeting the sustainable objectives of the UN 2030 Agenda. This document also highlights that the use of technology for mapping solutions is fundamental for the best applicability of investments: it goes through a process of analysis of the whole of society, thus enabling a macro view to meet the expected outcomes. Sustainable technological use has been crucial for cities around the world and has been used strategically in the innovation and evolution of smart cities. There are projects such as the 2020 European Smart and Sustainable Mobility Strategy, which assesses and identifies possible technologies financed by the EU in the years 2007 to 2020 [14]. As described in [10], technology must, in respect of the environment and sustainability, play an aggregating and supporting role in the urban awareness of cities. Cities can become more sustainable with the use of digital applications, for example by making the use of transport more efficient, thus resulting in lower emissions of gases that affect the ozone layer. Another example of the sustainable use of technology in cities, given the population growth forecast for the coming decades, is the food issue, hence the use of technology for sustainable food systems [1]. In this proposal, with ICT assisting in food management, technological tools can be used in food traceability. Corroborating the information above, there is the case of technologies such as 5G that contribute to the reduction of greenhouse gas emissions in European ports, the so-called Ports of the Future, meeting the 2030 Agenda [6]. This whole process of sustainable technological use has unleashed a range of revolutionary solutions for cities and urban development that can be felt in the short and medium term, and that will be remembered in the records of humanity, because the change that sustainable technology makes in our urban ecosystem will remain as a kind of sustainable technological legacy in the annals of society; with this we will have the so-called Digital Heritage [5].

8 Discussion The review shows that it is possible, with the help of technology, to map where investments for smart cities should be directed, and the literature review indicates that managing resources, sometimes even scarce ones, with the support of technologies is a basis for a city to become more intelligent and sustainable. Another factor found in all the articles reviewed is the importance of the commitment of public and private authorities, and of society, regarding where investments will be made.

9 Conclusion In this project we sought to review how technological applications play a fundamental role in, and complement, the actions that aim to make a city smarter. The referenced articles played an important role in verifying the use of various technologies to improve the governance of cities around the world, some of them assisting in food management, ports, mobility and many other areas. It is also seen here that these technologies depend on governments and society in order to provide a greater good to all citizens and give direction to investments. In this regard, we mention the quadruple helix, which allows this connection between all the actors of society with the objective of improving the quality of life in cities. As mentioned above, investments need to be well applied, and the mapping of drives, as the studies described in this article indicate, can be a way of ensuring that resources are applied responsibly. It became evident from the study that the use of technologies for mapping dimensions in cities represents a guiding factor for improving people's quality of life and for managing investment resources. With the use of appropriate tools, sustainable development can be achieved in partnership with the environment, thus contributing even more to the effort to achieve the sustainable goals stipulated by the UN.

References 1. Abideen ZA et al (2021) Food supply chain transformation through technology and future research directions—a systematic review. Logistics 2. Alonso MSM, Antonio P et al (2020) Smart mobility: the main drivers for increasing the intelligence of urban mobility. Sustainability 12(24):10675 3. Bibri SE (2018) Backcasting in futures studies: a synthesized scholarly and planning approach to strategic smart sustainable city development. Eur J Futures Res 6:13 4. Bernal NW, Espileta GLK (2021) Framework for developing an information technology maturity model for smart city services in emerging economies: (FSCE2). Appl Sci 11(22):10712 5. Batchelor D et al (2021) Smart heritage: defining the discourse. Heritage

6. Cavalli L et al (2021) Addressing efficiency and sustainability in the port of the future with 5G: the experience of the livorno port. A methodological insight to measure innovation technologies’ benefits on port operations 7. Churin K, Kyung-Ah K (2021) The institutional change from e-government toward smarter city; comparative analysis between royal borough of Greenwich, UK, and Seongdong-gu, South Korea 7(1):42 8. De Nicola A, Villani LM (2021) Smart city ontologies and their applications: a systematic literature review 9. D’amico G et al (2020) Understanding sensor cities: insights from technology giant company driven smart urbanism practices 10. Fraske T, Bienzeisler B (2020) Toward smart and sustainable traffic solutions: a case study of the geography of transitions in urban logistics 11. Guedes AA (2018) Principal drivers das cidades inteligentes. Tese doutorado, Universidade federal Fluminense, Niterói 12. Guedes AA et al (2018) Smart cities: the main drivers for increasing the intelligence of cities. Sustainability 10(9):3121 13. Gonzalez LE et al (2020) Smart and sustainable cities: the main guidelines of city statute for increasing the intelligence of brazilian cities. Sustainability 12(3):1025 14. Gkoumas K et al (2021) Research and innovation supporting the european sustainable and smart mobility strategy: a technology perspective from recent european union projects 15. Huaxiong J, Geertman S, Witte P (2020) Smartening urban governance: an evidence-based perspective 16. Hasan TF et al (2020) Urban sustainability and smartness understanding (USSU)—identifying influencing factors: a systematic review. Sustainability 12(11):4682 17. Inclesan D et al (2017) Viewpoint: a critical view on smart cities and AI 18. Nguyen NUP, Moehrle MG (2019) Technological drivers of urban innovation: a T-DNA analysis based on US patent data. Sustainability 11:6966. https://doi.org/10.3390/su11246966 19. Objetivos de Desenvolvimento Sustentáveis (ONU) Available in: https://brasil.un.org/pt-br/ sdgs. Accessed on: 10 May 2022 20. Pantazis DN et al (2017) Smart sustainable islands vs smart sustainable cities 21. Sousa JM et al (2020) Technology, governance, and a sustainability model for small and medium-sized towns in Europe. Sustainability 12(3):884 22. Tan YS, Taeihagh A (2020) Smart city governance in developing countries: a systematic literature review 23. Vieira Carvalho K et al (2019) Da hélice tríplice a quíntupla:uma revisão sistemática. 18(51) 24. Webster J, Watson RT (2002) Analyzing the past to prepare for the future: writing a literature review. MIS Q 26:8–23

Duty—Cycling Based Energy-Efficient Framework for Smart Healthcare System (SHS) Bharti Rana

and Yashwant Singh

Abstract The Smart Healthcare System (SHS) facilitates personalized health-related services on demand and in real time to healthcare physicians, doctors, and patients. SHS involves body sensors that sense different health conditions continuously. It is therefore necessary to manage the energy of the sensor node to extend the overall network lifetime. A sensor node in a Smart Healthcare System is expected to work for a long time, transmitting data continuously to remote servers. One approach on which the energy consumption heavily depends is the efficient management of the duty cycling of sensor nodes. Motivated by this, we propose an energy-efficient on-demand duty-cycling-based framework for SHS. The on-demand wake-up radio uses low-power signals to activate the components of a sensor node. Our framework is distinguished in that an adaptive duty-cycled approach is employed on the communication unit of the sensor node, based on full-functional and reduced-functional devices. The proposed framework facilitates energy-efficient communication by managing the duty cycling of the individual components of the sensor node and is supported by algorithms. Keywords Internet of things (IoT) · Smart healthcare system · Energy-efficiency · Duty-cycling

1 Introduction IoT represents the interconnectivity among tangible objects to aggregate data from surrounding events. Smart objects exchange information with each other without human intervention, in a machine-to-machine fashion. The ever-growing ubiquitous network of things facilitates people-centric services on demand at any time. On the downside, the

deployment of ubiquitous networks falls short of energy supply in sensor-embedded devices. That is why Green IoT focuses on conserving energy in IoT applications [20]. Green IoT aims to save energy while sensing, processing, transmitting, receiving, aggregating, and fusing data. All domains of IoT aim to achieve the principle of green communication to prolong network lifetime and reduce transmission delay. Smart Healthcare Systems (SHS) monitor the severe health conditions of patients in real time. Smart Healthcare Systems are also driven by commands, signals, and queries [18]. Healthcare medical appliances are equipped with sensors, transceivers, a power source, and microcontrollers. The individual components are powered by a battery source, but these devices are constrained in battery power. Therefore, the available power must be utilized efficiently and effectively. In the case of a limited battery, expanding the size of the battery is not a viable solution because of cost, weight, mobility [9], and deployment issues. In a nutshell, these devices must be cost-efficient for wider deployment. The monitoring devices in a Smart Healthcare System (SHS) expend energy in different phases such as sleep, active, transmit, receive, and idle [17]. Therefore, energy preservation and optimisation are the utmost need of Body Sensor Networks. The prerequisite of body sensor networks is energy conservation because of miniaturization, short lifespan, and limited battery capacity [21]. From a physician's point of view, data aggregation services must be leveraged for a long duration without any obstruction. To mitigate these issues, a state-based model is required that can handle the duty cycling of nodes in different states. Our state-based framework relies on on-demand duty cycling, and the technique considered under on-demand duty cycling is the wake-up radio. Duty-cycling techniques operate on the MAC layer. Duty cycling is managed on top of the IEEE 802.15.4 superframe structure by adjusting the superframe order and beacon order (a sketch illustrating this relation is given at the end of this section). The activation and inactivation of individual nodes in our framework are conceptualized using full-functional and reduced-functional devices. The adaptive duty-cycling approach on the sender and receiver nodes optimises energy consumption to extend the overall network lifetime. The organization of the rest of the paper is as follows. Section 2 discusses the literature work. The energy-efficient techniques in Smart Healthcare are conferred in Sect. 3. The methodology to transmit data in SHS is described in Sect. 4. The on-demand duty-cycling-based energy-efficient framework in SHS is proposed in Sect. 5. Section 6 presents the conclusion and future work. The research contributions of the study are: • We summarize the energy-efficient techniques of the Smart Healthcare System. • A Smart Healthcare framework based on on-demand duty cycling is proposed for energy-efficient transmission. • The energy-efficient framework employs the concept of wake-up radio on the communication unit of the sensor node, which is lacking in existing frameworks. • The proposed framework is supported by algorithms, i.e. on-demand wake-up radio on the sender and receiver nodes, to manage the duty cycling adaptively.
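The following is a minimal Python sketch of how the IEEE 802.15.4 superframe parameters mentioned above translate into a duty cycle; the constant and the relation between beacon order (BO) and superframe order (SO) follow the standard's beacon-enabled mode, while the helper function itself is our own illustration and not part of the proposed framework.

```python
# Minimal sketch: duty cycle of a beacon-enabled IEEE 802.15.4 node.
# Beacon Interval           BI = aBaseSuperframeDuration * 2**BO  (full period)
# Superframe (active) time  SD = aBaseSuperframeDuration * 2**SO
# with 0 <= SO <= BO <= 14, so the duty cycle SD/BI equals 2**(SO - BO).

A_BASE_SUPERFRAME_DURATION = 960  # symbols

def duty_cycle(beacon_order: int, superframe_order: int) -> float:
    """Fraction of time the radio is active for the given BO and SO."""
    if not (0 <= superframe_order <= beacon_order <= 14):
        raise ValueError("expected 0 <= SO <= BO <= 14")
    active = A_BASE_SUPERFRAME_DURATION * 2 ** superframe_order
    period = A_BASE_SUPERFRAME_DURATION * 2 ** beacon_order
    return active / period

# Example: SO = 3 and BO = 7 keep the radio active 1/16th of the time.
print(duty_cycle(beacon_order=7, superframe_order=3))  # 0.0625
```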

2 Literature Survey In [2], an improved duty-cycling algorithm is devised to minimise energy consumption in specific scenarios like cloudy weather. The improved duty-cycling technique efficiently overcomes the classical challenge of data aggregation in IoT networks. In addition, a path-searching approach based on residual energy is given for extended network lifetime and reliability. The algorithm is compared, using the ns2 simulator, with no duty cycling and with plain duty cycling, and shows a significant improvement in network lifespan when duty cycling is used [2]. Monti et al. [11] present an energy-efficient architecture to combine many sensors that consume much power. The architecture targets extended battery life by keeping the energy-consuming modules in separate energy supply domains to overcome quiescent power and achieve minimum energy consumption. Experimental results in that paper showed a significant reduction in current consumption during sleep mode when using a discrete power-gating topology. A high collision rate increases energy consumption and delay in traditional duty-cycled MAC protocols. This is because the same duty-cycle ratio at the transmitting and receiving ends makes it difficult to process data effectively. Therefore, the authors of [7] adjust the duty cycle at the receiving end and the contention window size via Early Acknowledge according to traffic patterns in IoT. In [8], the authors identify and model power-critical spots in synchronous and asynchronous schemes, and a synchronization scheme is developed to decrease the power loss caused by clock drift [7]. The competence of wake-up radio (WuR) and synchronization techniques is also evaluated in [8] to determine the maximum energy intake of WuR and make it suitable for complex IoT networks. In [3], different energy-efficient techniques like duty cycling, wake-up radio, and topology control are discussed. In duty cycling, idle listening is avoided and sleep mode is preferred; the adaptation and maintenance of active periods form the basis of duty cycling. Duty-cycling mechanisms save the most energy but suffer from delay problems [16]. To overcome this issue, the wake-up radio uses a low-power receiver to activate the node whenever communication is needed, although wake-up radio suffers from an overhearing issue. In order to reduce transmission delay and extend network lifetime in WSNs, a novel technique is presented in [5] to determine the duty cycling and packet relaying effectively by employing reinforcement learning and an event-based process. This reduces the latency and the possibility of transmission collisions on the path. A Monte Carlo approach is employed to reduce the computation overhead of node decision-making. The approach shows significant improvements in total delay, PDR, waiting time, and energy efficiency as compared to the existing S-MAC and adaptive event-based duty cycling.

After the extensive literature study, it has been found that duty cycling is applied to sensor nodes as a whole, without adjusting the duty cycling of the individual components of the sensor node. Also, in existing studies the sensor nodes follow the same duty-cycling percentage. Based on this, we propose an energy-efficient framework for Smart Healthcare that employs on-demand duty cycling on the sender node as well as on the receiver node. The devised framework uses the concepts of reduced-functional and full-functional devices in an IoT node.

3 Smart Healthcare: Energy-Efficient Techniques The energy-efficient techniques are primarily categorized as duty-cycling, data-driven, and mobility-based, as shown in Fig. 1. Duty-cycling techniques are of two types: scheduled and on-demand. Data-driven techniques are classified into data reduction and data aggregation. Based on the mobility of nodes, mobility-based techniques can be classified as mobility sink and mobility relay [2]. The energy-efficient techniques illustrated in Fig. 1 are discussed subsequently. (i) Duty Cycling IoT nodes with limited power and resources are expected to remain working for a long duration of time. Therefore, nodes periodically turn their radios off and on to minimize energy consumption and maximize the total network lifetime. This mechanism is known as duty cycling. The most commonly used technique is scheduled duty cycling, which activates the node at a scheduled time [9]; data transmission only happens when the scheduled activation time arrives. Scheduled duty cycling suffers from the problem of delay in data transmission to the target node. The other duty-cycling approach is on-demand, in which nodes only get activated when necessary. In this study, we consider the on-demand wake-up radio scheme in our proposed framework. The wake-up receiver constantly monitors the channel for incoming requests to activate the receiver.

Fig. 1 Energy-efficient techniques: duty cycling (scheduled, on-demand), data-driven (data reduction, data aggregation), and mobility-based (mobility sink, mobility relay)

Fig. 2 Wake-up duty cycling: Node 1 transmits a wake-up signal, the wake-up receiver (WuR) on Node 2 monitors the channel and activates the main receiver, and the data and acknowledgements are then exchanged over time

Because the WuR listens to the channel continuously using a low-power receiver, the latency is also substantially reduced. Figure 2 illustrates the concept of the wake-up radio scheme. (ii) Data-Driven The data generated by IoT sensors remarkably affects the energy consumed by nodes: the longer the distance covered, the higher the energy consumption. Therefore, it becomes necessary to reduce the number of bits transmitted per packet to minimize the transmission energy consumed by the radios. The data aggregated from nodes is reduced by eliminating redundant bits and noisy data. The data aggregation techniques include cluster-based, tree-based, and chain-based aggregation. Various data reduction techniques, such as lossy and lossless compression, can be utilized as energy-saving mechanisms in an IoT network. In lossless compression, all data bits remain the same after decompression, e.g. Huffman coding. In lossy compression, data bits, especially redundant data, are eliminated to reduce the size; for instance, identical temperature sensor values are eliminated by matching them with existing ones (a small sketch of this idea is given after this subsection). In Arduino, the Shox96 library is used to compress short strings and messages; Shox96 uses a hybrid encoding technique. (iii) Mobility Based Mobility in an IoT network is introduced by providing mobilizers or by attaching the sensors to mobile objects such as electronic or digital items. Mobility-based energy conservation methods are based on the location of the sink and relay nodes, whose positions play an important role in realizing energy efficiency and increasing the network lifetime [9]. Energy balancing can be achieved by increasing or decreasing the number of relay and sink nodes; increasing the number of relay nodes helps in balancing node energy, which in turn reduces data transmission energy. The diagrammatic representation of the mobile sink and mobile relay node is shown in Fig. 3.
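To give a concrete flavour of the data-reduction step described above, the following minimal Python sketch drops consecutive sensor readings that barely change before they are handed to the radio; the function name and the delta value are illustrative assumptions and are not taken from the Shox96 library or from the proposed framework.

```python
# Minimal sketch: suppress near-duplicate temperature readings before
# transmission, so the radio sends fewer packets.

def reduce_readings(readings, delta=0.2):
    """Keep only readings that differ from the last kept value by >= delta."""
    kept = []
    last_sent = None
    for value in readings:
        if last_sent is None or abs(value - last_sent) >= delta:
            kept.append(value)
            last_sent = value
    return kept

raw = [98.1, 98.1, 98.1, 98.3, 98.3, 99.0, 99.0, 98.9]
print(reduce_readings(raw))  # [98.1, 98.3, 99.0] -> fewer transmissions
```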

Fig. 3 Depiction of mobile sink and relay node: data flows from sensor nodes through relay nodes to a mobile sink

4 Proposed Methodology: Energy-Efficient Data Transmission in SHS The proposed methodology for energy-efficient communication in Smart Healthcare is illustrated in Fig. 4. Body sensors are essential entities in Smart Healthcare that sense body conditions like temperature, blood pressure, calories, weight, disease severity, and sleep patterns [6]. Body sensors are also the most energy-consuming components in Smart Healthcare. Therefore, duty-cycling techniques are applied to the communication component of the sensor node. The sensed data is sent to the edge devices and the cloud to offer healthcare services in real time; in general, the sensed data is sent to the edge or fog nodes for processing. Based on this idea, the sensor node operates on the sender side and the edge node operates on the receiver side, as it receives data from the sensor nodes.

Fig. 4 On-demand duty cycling on sender and receiver side: on the sender side, the sensor node's sensing component (with ADC) and computational component are FFDs while its transceiver is an RFD; on the receiver side, the edge node exchanges RTS/CTS signals with the sensor node and receives the transmitted data

4.1 On-Demand Wake-Up Radio Mechanism on Sensor Node The sensing unit, computational unit, and communication unit are the components of the sensor node. Data flows from the sensing component to the computational unit and from the computational unit to the communication unit. To depict the use of duty cycling on the sensor components, we classify the devices as full-functional and reduced-functional [13]. As the sensing unit and the computational unit are always in working mode, they must be kept as Full-Functional Devices (FFDs). The communication component, in contrast, only gets activated when it receives data, so it can initially act as a Reduced Functional Device (RFD). As per the logic depicted in Algorithm 1, the data is first fetched from the sensing part and then converted from analog to digital signals. The digital data is sent to the computational unit for processing. To transmit the data from the computational unit to the communication unit, the communication unit is set as a Full-Functional Device (FFD) if the computed data value is greater than or equal to the threshold of 98.4, the value taken as normal body temperature. Otherwise, if the computational unit does not have data to send, the communication unit remains a Reduced Functional Device. The abbreviations used in the algorithms are defined in Table 1.

Table 1 Acronyms used in algorithms

Acronym               Description
SNode_sens            Sensing component
ADC                   Analog to digital converter
SNode_comm            Communication component
SNode_comp            Computing component
FFD                   Full-functional device
RFD                   Reduced functional device
RTS                   Request to send
CTS                   Clear to send
R_Transceiver         Transceiver of edge node
SDigi_data            Digital data
SNode_computedData    Computed data
SNode_commData        Communicated data

Algorithm 1: Sensor_Node (On-Demand Wake-Up)
Initialization: SNode_sens = FFD, SNode_comp = FFD, SNode_comm = RFD, Threshold = 98.4
Output: Send_Data(SNode_comm)
Steps:
Start
 1: while (true):
 2:     Data ← Fetch_Data(SNode_sens)
 3:     SDigi_Data ← convert_ADC(Data)
 4:     SNode_computedData ← compute(SDigi_Data)
 5:     If ((SNode_computedData >= Threshold) AND (SNode_PrevData != SNode_NewData)) Then
 6:         SNode_comm ← FFD    // Set to Active = 1
 7:     Else
 8:         SNode_comm ← RFD    // Set to Inactive = 0
 9:     End If
10:     SNode_commData ← Transmit(SNode_computedData)
11: end while
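For readers who prefer an executable form, the following minimal Python sketch mirrors the logic of Algorithm 1. The read_sensor, adc_convert, compute and radio arguments are hypothetical placeholders for platform-specific calls, and transmission is performed only when the radio has been switched to the active (FFD) state, which is our reading of steps 5 to 10.

```python
# Minimal sketch of Algorithm 1: on-demand wake-up on the sender (sensor node).
# Sensing and computing always run (FFD); the radio stays an RFD (inactive)
# and is only switched to FFD (active) when the computed value reaches the
# threshold and differs from the previously transmitted value.

THRESHOLD = 98.4  # normal body temperature, as used in the paper

def sensor_node_loop(read_sensor, adc_convert, compute, radio):
    prev_data = None
    while True:
        raw = read_sensor()              # Fetch_Data(SNode_sens)
        digital = adc_convert(raw)       # convert_ADC(Data)
        computed = compute(digital)      # compute(SDigi_Data)
        if computed >= THRESHOLD and computed != prev_data:
            radio.set_active(True)       # SNode_comm <- FFD
            radio.transmit(computed)     # Transmit(SNode_computedData)
            prev_data = computed
        else:
            radio.set_active(False)      # SNode_comm <- RFD
```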

4.2 On-Demand Wake-Up Radio Mechanism on Receiver Node On the receiver side, edge nodes act as the receivers and receive data from the sensors. At the receiver end, the transceiver listens to the channel for data arrival, and the received data is passed to the edge node. Initially, the transceiver and the edge node are set as Reduced Functional Devices, as they only get activated when data arrives. If the communicated data from the sender side is greater than or equal to the threshold, i.e. the normal body temperature, a Request-to-Send (RTS) signal is sent to the receiver side. If the receiver node is ready to receive the data, a Clear-to-Send (CTS) signal is sent back to the sender side. If this condition holds, the edge node is set as a Full-Functional Device (FFD) to receive the data, and the communicated data is transmitted to the edge node. If the condition does not hold, the edge node remains a Reduced Functional Device (RFD).

Algorithm 2: Receiver Node (On-Demand Wake-Up)
Initialization: R_Transceiver = RFD, SNode_comm = FFD, Edge_Node = RFD, Threshold = 98.4
Output: Receive_Data(Edge_Node)
Steps:
Start
 1: while (True):
 2:     If (SNode_commData >= Threshold) Then
 3:         R_Transceiver ← RTS(SNode_comm)
 4:         SNode_comm ← CTS(R_Transceiver)
 5:         Edge_Node ← FFD    // Set to Active = 1, invoked by R_Transceiver
 6:         Edge_Node ← Receive(SNode_commData)
 7:     Else
 8:         Edge_Node ← RFD    // Set to Inactive = 0
 9:     End If
10: end while
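A corresponding sketch of Algorithm 2 for the receiver side is given below, again in Python; the transceiver and edge_node objects and their methods are hypothetical abstractions of the edge hardware, and the RTS/CTS exchange is modelled as simple method calls.

```python
# Minimal sketch of Algorithm 2: on-demand wake-up on the receiver (edge node).
# The edge node stays an RFD until the low-power wake-up receiver reports,
# via an RTS/CTS handshake, that data at or above the threshold will arrive.

THRESHOLD = 98.4

def receiver_loop(transceiver, edge_node):
    while True:
        rts = transceiver.wait_for_rts()      # WuR monitors the channel
        if rts.value >= THRESHOLD:
            transceiver.send_cts(rts.sender)  # CTS back to the sensor node
            edge_node.set_active(True)        # Edge_Node <- FFD
            data = transceiver.receive()      # Receive(SNode_commData)
            edge_node.store(data)
        else:
            edge_node.set_active(False)       # Edge_Node <- RFD
```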

5 SHS: Energy-Efficient Duty-Cycling and Data-Driven Based Framework The framework intends to perform energy-efficient communication by applying various mechanisms, as illustrated in Fig. 5. In our proposed framework, we employ three techniques for energy-efficient communication in Smart Healthcare: (1) wake-up radio on the sender node, (2) data reduction, and (3) wake-up radio on the receiver node.

5.1 Full-Functional and Reduced-Functional Devices The bottom layer of the proposed methodology represents the components of body sensor nodes in a Smart Healthcare System. The body sensor nodes are biomedical devices functioning within or outside, close to the human body. Different body sensors like temperature sensors, proximity sensors, accelerometers, motion sensors, and ultrasonic sensors are used to trace the medical conditions of patients. The sensing unit, computational unit, and communication unit are the components of the sensor node. The sensing unit senses and converts the analog signals to digital signals via an ADC. The computational unit includes the processing and memory elements. The communication unit includes the transceiver for transmitting and receiving data. The elements of the sensor node can act as Full-Functional Devices (FFDs) or Reduced Functional Devices (RFDs) based on the scheduled functionality of each unit in the sensor node.

Fig. 5 Proposed energy-efficient smart healthcare framework: body sensor nodes with a sensing unit (sensor, ADC) and a computational unit (processor, memory) as full-functional devices and a communication unit (transceiver) as a reduced-functional device under on-demand duty cycling with wake-up radio; data reduction at the edge devices (sensors, actuators and microcontrollers); fog devices (switches, routers, and gateways); web servers and data centers; serving health monitoring, hospitals, routine checkups, emergency services, and patients

(i) Full-Functional Device (FFD) A Full-Functional Device always remains in working mode and transmits data continuously to the Reduced Functional Device, which then sends the data to the sink node. The sensing unit and computational unit act as the Full-Functional Devices in the proposed framework.

(ii) Reduced-Functional Device (RFD) A Reduced-Functional Device only wakes up after a certain duration to aggregate the data from the Full-Functional Device and then goes back to sleep mode. Thereafter, the Reduced Functional Device sends the data to the sink node. The sink node requires high processing energy to coordinate with the whole network. The communication unit in a sensor node acts as the Reduced Functional Device in the proposed framework.

5.2 Duty Cycling: On-Demand Wake-Up Radio Duty cycling is the mechanism of keeping devices in ON/OFF states periodically. When a node has data to send, it goes to the ON state; it remains in the OFF state when it has nothing to send. Duty cycling of nodes can be done in a scheduled manner or on demand. In our proposed methodology, we employ on-demand duty cycling on the Reduced Functional Device and on the edge/fog devices. On-demand duty cycling wakes up the nodes only when a node has data to send. • Wake-Up Radio Scheme The wake-up radio scheme is an on-demand duty-cycling technique. Wake-up radio reduces the delay in transmitting a frame to the desired node, which makes the scheme extremely beneficial in delay-sensitive applications. Wake-up radio is based on a low-power receiver that is kept ON all the time; the main receiver only gets activated, on demand, to sense the channel when data arrives. The wake-up radio conserves energy by harvesting, frequency modulation, and the use of lower frequencies for wake-up triggering [15]. The hardware design of wake-up radios follows active or passive approaches. An active wake-up radio gets energy continuously from the battery as an external source, whereas a passive wake-up radio gets power from the radio frequency signals themselves [4] and does not require a battery as a power source. Passive wake-up radios are more energy-efficient than active ones, but this approach is only feasible over a few meters.

5.3 Workflow of the Proposed Framework The sensing unit and the computational unit act as full-functional devices that remain activated to sense, store, and process the data generated by the body sensors. The aggregated data from the full-functional devices is sent to the reduced-functional device, i.e. the transceiver. The reduced-functional device uses a duty-cycling approach and only wakes up when data arrives. Then, the aggregated data from the reduced-functional device is sent to the edge or fog nodes. The redundant data and the noisy

data are refined by using data reduction techniques like lossless and lossy compression. The on-demand wake-up radio scheme is used by edge devices like the Raspberry Pi and Arduino, which get activated only on demand; the on-demand wake-up radio uses a low-power receiver to receive the packets. The data received by the edge nodes is transmitted to the fog nodes for further processing. The fog nodes send data to the web servers for storage [10]. The data stored in the web servers is utilized for regular monitoring of patients' health conditions, providing on-demand health services, and delivering emergency medical services to patients and physicians.

5.4 Energy-Efficiency: Edge, Fog, and Cloud Cloud computing offers various computing and storage services on demand via software as a service, infrastructure as a service, and platform as a service. At the same time, cloud computing encounters several challenges, such as high bandwidth demand, latency, node outages, security, and data breaches [1]. For this reason, the concept of fog computing was introduced by Cisco. Fog computing shifts computational overhead and storage services near the edge of the network and is often used for low-latency and real-time analysis of data in smart healthcare [12]. Routers, gateways, switches, and data management units form the infrastructure of the fog layer. Fog devices offer storage, computation, and networking services. Fog computing performs local computation on devices rather than sending all the data to the cloud for processing. On receiving data from the edge nodes, the fog nodes perform several operations like data uploading, data integration, data filtering, data storage, and compression [19]. Similarly, edge computing performs computation on isolated edge nodes that are in proximity to where the data originated. Edge computing addresses real-time processing of data, resource management, and agile connectivity challenges [14]. The manifold benefits provided by fog and edge computing bring enormous opportunities for Smart Healthcare.

6 Conclusion The Internet of Things (IoT) has brought immense technological advancement to Smart Healthcare Systems over the last decade, and the prospects of Smart Healthcare are therefore widespread. The technological boom in healthcare services also brings the energy consumption challenge to the forefront because of the limited capabilities of IoT nodes. The proposed framework considers an on-demand wake-up duty-cycling mechanism rather than fixed duty cycling. The on-demand duty-cycling mechanism in the proposed framework intends to mitigate the issues of overhearing, over-emitting, collision, and latency, which in turn reduces the overall energy consumption. Also, the framework attempts to manage the duty cycling adaptively as

per the data traffic on the components of the sensor nodes, i.e. the sensing unit, computation unit, and communication unit. Moreover, the concept of full-functional and reduced-functional devices also plays a significant role in energy preservation in SHS. The proposed framework is supported by algorithms employing a wake-up radio scheme on the sender node and the receiver node. In the future, we will evaluate the performance of the proposed framework in a real-time environment. We will also combine a metaheuristic approach with on-demand duty cycling to transmit the data via the shortest path in an IoT network. In the proposed framework, the threshold value depends upon the sensor type, i.e. temperature sensors, pressure sensors, etc.; future work intends to improve the algorithms to adaptively adjust the thresholds as per the sensor values.

References 1. Azar J, Makhoul A, Barhamgi M, Couturier R (2019) An energy efficient IoT data compression approach for edge machine learning. Futur Gener Comput Syst 96:168–175. https://doi.org/ 10.1016/j.future.2019.02.005 2. Dhall R, Agrawal H (2018) An improved energy efficient duty cycling algorithm for IoT based precision agriculture. Procedia Comput Sci 141:135–142. https://doi.org/10.1016/j.procs.2018. 10.159 3. Haimour J (2019) Energy efficient sleep/wake-up techniques for IOT : a survey. 459–464 4. Hameed G, Singh Y, Haq S, Rana B (2022) Blockchain-based model for secure IoT communication in smart healthcare. 715–730. https://doi.org/10.1007/978-981-19-0284-0_52 5. Huang HY, Lee T, Youn HY (2021) Event driven duty cycling with reinforcement learning and Monte Carlo technique for wireless network 6. Jeon C, Koo J, Lee K, Lee M, Kim SK, Shin S, Sim JY et al (2020) A smart contact lens controller IC supporting dual-mode telemetry with wireless-powered backscattering LSK and EM-radiated RF transmission using a single-loop antenna. IEEE J Solid-State Circuits 55(4):856–867. https://doi.org/10.1109/JSSC.2019.2959493 7. Kim G, Kang JG, Rim M (2019) Dynamic duty-cycle MAC protocol for IoT environments and wireless sensor networks. Energies 12(21). https://doi.org/10.3390/en12214069 8. Kozłowski A, Sosnowski J (2019) Energy efficiency trade-off between duty-cycling and wakeup radio techniques in IoT networks. Wireless Pers Commun 107(4):1951–1971. https://doi. org/10.1007/s11277-019-06368-0 9. Lazarevskal M, Farahbakhsh R, Manshakya N, Crespi N (2018) Mobility supported energy efficient routing protocol for IoT based healthcare applications. In: 2018 IEEE conference on standards for communications and networking, CSCN 2018. https://doi.org/10.1109/CSCN. 2018.8581828 10. Majumdar A, Debnath T, Biswas A, Sood SK, Baishnab KL (2020) An energy efficient ehealthcare framework supported by novel EO- µ GA (Extremal Optimization Tuned MicroGenetic Algorithm) 11. Monti A, Alata E, Dragomirescu D, Takacs A (2018) Power supply duty cycling for highly constrained IoT devices. In: Proceedings of the international semiconductor conference, CAS, 2018-Oct, pp 215–218. https://doi.org/10.1109/SMICND.2018.8539832 12. Rana B (2020) A systematic survey on internet of things: energy efficiency and interoperability perspective (August), pp 1–41. https://doi.org/10.1002/ett.4166 13. Rana B, Singh Y (2021) Internet of things and UAV: an interoperability perspective. Unmanned Aer Veh Internet Things (IoT) 105–127. https://doi.org/10.1002/9781119769170.ch6

14. Rana B, Singh Y, Singh H (2021) Metaheuristic routing: a taxonomy and energy-efficient framework for internet of things. IEEE Access 9:155673–155698. https://doi.org/10.1109/ ACCESS.2021.3128814 15. Rana B, Yashwant S (2021) Duty-cycling techniques in IoT: energy-efficiency perspective. In: International conference on recent innovations in computing (ICRIC-2021) 16. Sinde R, Begum F, Njau K, Kaijage S (2020) Refining network lifetime of wireless sensor network using energy-efficient clustering and DRL-based sleep scheduling. Sensors (Switzerland) 20(5). https://doi.org/10.3390/s20051540 17. Sodhro AH, Pirbhulal S, Sodhro GH, Gurtov A, Muzammal M, Luo Z (2019) A joint transmission power control and duty-cycle approach for smart healthcare system. IEEE Sens J 19(19):8479–8486. https://doi.org/10.1109/JSEN.2018.2881611 18. Sodhro AH, Pirbhulal S, Sodhro GH, Gurtov A, Sodhro AH, Pirbhulal S, Luo Z et al (2018) A joint transmission power control and duty-cycle approach for smart healthcare system a joint transmission power control and duty-cycle approach for smart healthcare system. https://doi. org/10.1109/JSEN.2018.2881611 19. Wang Z, Liu R, Liu Q, Thompson JS, Kadoch M (2020) Energy-efficient data collection and device positioning in UAV-assisted IoT. IEEE Internet Things J 7(2):1122–1139. https://doi. org/10.1109/JIOT.2019.2952364 20. Yang G, Jan MA, Menon VG, Shynu PG, Aimal MM, Alshehri MD (2020) A centralized cluster-based hierarchical approach for green communication in a smart healthcare system. IEEE Access 8:101464–101475. https://doi.org/10.1109/ACCESS.2020.2998452 21. Zahid N, Sodhro AH, Al-Rakhami MS, Wang L, Gumaei A, Pirbhulal S (2021) An adaptive energy optimization mechanism for decentralized smart healthcare applications. In: IEEE vehicular technology conference, 2021-April. https://doi.org/10.1109/VTC2021-Spring51267. 2021.9448673

Quality 4.0 and Smart Product Development Sergio Salimbeni

and Andrés Redchuk

Abstract The purpose of this article is to present the effects of Industry 4.0 enabling technologies on product development and on quality assurance management, the so-called Quality 4.0. The methodology was mapping and coding, and a systematic literature review was carried out. The search was performed in the Title, Abstract and Author's Keywords fields. The findings were that enabling technologies open an innovative stage in both product development and quality management. The digitization of components and products allows monitoring and control throughout the entire value stream in real time, available to all stakeholders. These digital changes reach the entire society. The digitization of components must be considered from the moment of their research and development. One of the most significant features of Industry 4.0 is the possibility of interconnecting the successive phases of the entire product life cycle, from its conception to its disuse. To this end, all objects must be, at a minimum, identifiable for traceability, that is, they must have a minimum of intelligence. Many companies are searching for the best adaptation of their business in order to improve their competitive positioning in the face of these new challenges, providing additional value to customers in order to gain advantages. Data-driven quality management contributes to this goal. The original contribution of this work is to link the concepts of quality management with smart products and Industry 4.0. Keywords Industry 4.0 · Quality 4.0 · Quality 5.0 · Smart product development

1 Introduction The term Industry 4.0 (I4.0) was mentioned for the first time in Germany in 2011 as a proposal for the development of a new concept of economic policy based on high-tech strategies [1]. New enabling technologies are changing the role of employees and the work they do, allowing interactions between the different elements of industrial companies throughout the value chain, from suppliers to end-users. The enterprise's digitalisation is the beginning of the organizational evolution; it is the main result of the introduction of digital systems. The use of digital and collaborative tools and virtual information will profoundly change the work profile and the requirements of employees. The application of digital technologies to all aspects of organizations is what is called Digital Transformation (DT): a profound change of business and organizational activities, processes, competencies and models to fully leverage the changes and opportunities of a mix of digital technologies and their impact across society. These digital changes reach the entire society; that is the case of Japan, which has its own particular challenges and, like I4.0, aims to face challenges going well beyond the digitalization of the economy, towards digitalization at all levels of Japanese society; this is the so-called Society 5.0 (S5.0) [2]. S5.0 is depicted as an evolution through five societal stages: (1) the hunting society, (2) the agrarian society, (3) the industrial society, (4) the information society and (5) the super-smart society. It does not mean having to change the production processes completely, replace machinery or completely overturn the way of working; it only requires modifying the point of view on how production and quality have been controlled so far [3]. The research question posed is how the enabling technologies and the new human-centered approach impact new product development (NPD) and, in particular, quality management. Since the purpose of this article is to present the effects of I4.0 on the development of smart products and the consequent evolution of quality management, the literature review was organized as follows: first, an introduction to I4.0 and digitization; smart components and products are then introduced, followed by the concepts of Quality 4.0 and Quality 5.0 and finally their impacts on the development of new products.

2 Research Methodology The literature review was performed from January to December 2021. Open coding was used for data review and organization, and a mapping method was applied. A screening of 68 I4.0 and 82 Q4.0 articles was carried out, 41 of which were selected for deeper analysis. The steps were as follows: (1) references that met the selected criteria were collected from academic databases using a search string combining keywords with the operator "or"; (2) only works were considered that were published in conference

proceedings, journal articles, magazines, book chapters and books between 2015 and 2021; (3) and that contained at least one of the search terms in the title, abstract, and/or keywords; (4) works that did not have full texts available were discarded; (5) articles that defined I4.0, NPD and Q4.0 outside the scope of this research work were excluded; (6) the remaining works were classified according to the research question; (7) the data of interest for the research question was collected.

3 Literature Review 3.1 Industry 4.0 and Digitalization According to Schlechtendahl et al. [4], the three pillars of I4.0 are (1) the digitization of production, (2) automation and (3) the automatic exchange of data. I4.0 is the integration and interaction of technologies across the physical and digital domains, which makes it stand out from the other industrial revolutions [5]. It is a German government initiative to gain a stronghold in global manufacturing through the advanced application of information and communication systems in manufacturing [6]. The Federal Ministry of Economics and Technology of Germany examined the topics "digitization" and "Industrie 4.0" in detail in a study and concluded that there was considerable value-added potential in digitization. The central topic of digitization in the production process is the sensible and efficient collection and evaluation of production life cycle data records, starting with the evaluation of suppliers and raw materials, the monitoring of materials used, the quality of the products developed, deliveries, and the product in use by the customer. Digital technology is increasingly important in achieving business goals, and its pervasive effects have resulted in the radical restructuring of entire industries. Given digital technology's central role in the restructuring of several industries, managers' interest in handling digital product and service innovation is not surprising [7]. The basis of DT is the targeted collection and evaluation of data. The entire machinery park does not have to be completely changed; in many cases it is enough to capture the data which was previously recorded on paper and to immediately evaluate the data already provided by systems and sensors so that it can be analysed. The interconnection between different technologies and devices through the Internet, and the use of cyber-physical systems (CPS) and artificial intelligence (AI), are perhaps the most salient features of the present industrial era. As players involved in it, we do not yet have enough perspective to observe the strong evolution that is currently taking place in industry. Many manufacturing companies are in search of the best adaptation of their facilities so that they can position themselves better in the face of these new challenges. One of the biggest challenges that organizations are facing today is finding the proper way to shape competitive advantages in the age of I4.0.

Fig. 1 RAMI4.0. Source Adapted from Plattform Industrie 4.0 and ZVEI

This is a condition of their long-term survival in the market [8]. I4.0 already has its own framework, although specialists are still completing it; it is called the "Reference Architectural Model for Industrie 4.0" (RAMI4.0) [9]. RAMI 4.0 is an interesting and useful guide to normalize all the central actors of this new industrial revolution [10]. The integration of various technologies is really the challenge of I4.0. RAMI 4.0 (Fig. 1) is made up of three axes: it is a three-dimensional scheme that shows the most important aspects of I4.0 and ensures that everyone involved in the industry shares a common perspective and understanding [11]. It should be remembered that the I4.0 core is the interconnection of innumerable devices from innumerable manufacturers, and it must work. The three dimensions that are standardized are: (1) the Life Cycle Value Stream, (2) the Hierarchy Levels and (3) the Layers (Fig. 1). Focusing on the first axis, the Value Stream Life Cycle, it can be observed that it is possible to monitor and control the product, and even its constituent parts, from the concept design to the "death" or disposal of the product itself. This value stream life cycle, according to the framework, goes through two major phases: (a) Type and (b) Instance (Fig. 3). The Type phase covers development and its maintenance, while the Instance phase contains the product's production, usage and disposal. During the first phase, usage data and user behaviour can be received, so that this information can be used to make improvements in new versions of the product or even to offer additional services related to it. As a strategic tool, product lifecycle management (PLM) enables companies to provide additional customer value to gain competitive advantage.


Fig. 2 Value Stream Life Cycle. Source Adapted from Plattform Industrie 4.0 and ZVEI

Fig. 3 Value Stream Life Cycle. Source Author‘s own

strategy used by manufacturers to support the full product life cycle (PLC) and accelerate business performance through a combination of process, organization, methodology, and technology [12]. PLM manages all product-related information during the whole product life cycle. A closed-loop PLM system enables all stakeholders to track, manage and control product information throughout the life cycle of a product. A basic system architecture for closed-loop PLM consists of communication channels with the product during its operation and a platform [13].

3.2 Smart Components and Products

The digitalisation of "things", including objects used by people, consists of equipping them with sufficient technology to be identified individually, receive data, deliver information, and even make autonomous decisions. They can


Fig. 4 Industry 4.0 assets. Source Adapted from Plattform Industrie 4.0 and ZVEI

be physical objects, but also intangible things such as software, concepts, methods or processes: a simple component such as a ball bearing, or a set of objects such as an engine [10]. In I4.0 these digitalised objects are called "assets" (Fig. 4). Such assets are, in the end, "smart objects", and they are the pillar of the Internet of Things (IoT) and of innovative end-products. Smart objects not only support the cooperation of industrial and business processes but can also interact with people. As mentioned earlier, connectivity and interoperability between smart devices are fundamental characteristics of the DT, and its applications are becoming part of the complete life cycle of a product, mainly with the help of the IoT [14]. A detailed study by Raff et al. [15] proposes a classification of smart objects according to their level of intelligence: (a) Digital, (b) Connected, (c) Responsive and (d) Intelligent.
a. Digital Objects. Components equipped with information technologies (IT): sensors that collect data and actuators that transmit it. Other characteristics are data retention and storage on the asset itself, data processing, analysis, diagnosis and use, provision and transmission of data, and provision of product identification information.
b. Connected Objects. They have a unique, human-readable identity and interact with the environment and with other objects; they form a system of connected and uniquely identifiable constituents that exchange information with their environment.
c. Responsive Objects. They are characterized by sensing, data collection and real-time awareness of their context. They can adapt and react to changes in their environment in a stimulus–response fashion, influence the condition or state of the environment, and adapt themselves to the needs and preferences of the user.


d. Intelligent Objects. They can make decisions about themselves and about their interactions with external entities. They have autonomy and self-management and act intelligently and independently. They are proactive, that is, they act in anticipation of a future situation: an intelligent object is an entity that acts beforehand after analysing the surrounding situation.

3.3 Quality 4.0 and Quality 5.0

Total Quality Management (TQM) has not changed in essence. However, the quantity, reliability and speed with which data are now collected and analysed offer far more precise information, and it is this information that supplies the knowledge needed for decision making. This matters because the quality of products and processes throughout the entire product life cycle is a prerequisite for achieving company goals [5]. Aldag and Eker [16] state that the combination of new technologies with traditional quality methods to improve operational excellence and performance is called Quality 5.0 (Q5.0). Q5.0 strategies offer the capacity to align traditional quality management with I4.0 capabilities, helping enterprises achieve operational excellence [17]. It can be characterized as the digitalisation of TQM and its effect on quality technology, processes and people, that is, as the application of I4.0 technologies to quality [18]. The set of activities that coordinate, manage and track the functions that measure and report quality in I4.0 is defined in the hierarchy-level axis of RAMI 4.0, which is based on the IEC 62264 standard. This includes the evaluation of raw materials, intermediate and finished products, the collection and maintenance of data records, the use of analytics, real-time decision making, classification and certification tests, validation of measures, and the maintenance of statistics for quality management [19]. RAMI 4.0 therefore makes clear that quality is strongly impacted by I4.0 and is changing under digitalisation and data-driven decision making, and this is where the new Q5.0 offers potential value creation. Among the many new possibilities for quality management, three key subjects stand out: (1) reaction times, (2) relationships and (3) predictions. Thanks to these characteristics, among others, new after-sales services can also be offered, such as prescriptive maintenance (Fig. 5). Through a successful application of the enabling technologies in the value stream system, enterprises can gather relevant information about the ongoing level of production quality, regarding both the product and the process. Improving quality management increases the performance of the company and acts as a key strategic factor in the market [20].


Fig. 5 Analytics framework. Source Adapted from LNS research

3.4 New Smart Product Development

As Wang et al. [21] assert, changes in customer requirements based on field data can be incorporated even during a product's manufacturing process, because the company has the agility to adapt to the new situation. For example, by closing the product's information loop, digitalisation makes closed-loop lifecycle management possible: collecting information across the entire product life cycle helps to continually improve the design, manufacturing, use and end-of-life handling of products. As a result, product quality can be improved and business opportunities enhanced. Smart Products (SP) can collect data from their sensors and actuators and thereby obtain information about manufacturing cycles, quality requirements and waste production [5]. As explained above, SP can store and process large amounts of data and communicate with industrial systems; they can also collect information and interact with their environment, without human intervention, throughout their life cycle [22]. Smart Product Development (SPD) approaches must be adopted to ensure innovation in products and processes. The introduction of advanced technologies such as additive manufacturing (AM), augmented reality (AR) and virtual reality (VR) for prototyping and NPD represents huge potential for SPD, enabling the creation of highly flexible products at an affordable cost. The creation of value, competitiveness, growth and sustainability is closely related to the adoption and development of new technologies that answer rapidly changing world-economy and market requirements [23]. The increasing demand for complexity and technology has led to the development of more complex products. To remain competitive and meet market requirements in a changing environment, companies must constantly introduce new SP, processes and technologies that allow them to be more effective in NPD. The most important factors for successful product development are iteration, integration and innovation. The cost associated with SPD is one of the most important variables in decision making: a comprehensive analysis should cover the cost of the resources used, the costs for both the producer and the user, and the environmental impact of the products during their life cycle.


On the other hand, with virtual prototyping, system planners can respond quickly to changes in the manufacturing process, improve the flexibility and efficiency of tooling and process design iteratively, and consider human–machine interaction with regard to usability, comfort and safety in the beginning-of-life phase [12]. In addition, Florén et al. [24] identify the foundational success factors in NPD as: (1) senior management involvement, (2) early customer involvement, (3) external cooperation beyond customers (e.g., suppliers), (4) alignment between NPD and strategy, (5) an adequate degree of formalization, (6) cross-functional cooperation among functions and departments, (7) a creative organizational culture and (8) project management capabilities. It can easily be observed that the digitalization of products throughout the entire life cycle covers and surpasses the factors Florén defined, which is why the practice is now called SPD. As already noted, the sensorization of components, parts and products throughout the entire life cycle allows information to be shared between all departments of the company. Knowledge sharing (KS), the dissemination of information and knowledge within a community, plays a crucial role in knowledge management initiatives within the organisation. KS is regarded by most manufacturing companies as one of the most important issues in knowledge management, with the purpose of improving efficiency, quality and time to market in NPD [25]. Another important concept in NPD is consumer co-creation. The area of consumer co-creation is in its infancy and many aspects are not yet well understood [26]. Again, digitalisation allows companies to exchange all types of information in real time, including with customers and suppliers, which facilitates the co-creation initiative and quality improvement.

4 Discussion

Companies often focus only on the innovation of their products, forgetting that they could also be innovative by reinventing processes, business strategies or entire business models. New technologies now greatly facilitate networks in value chains and systems. In turn, the formation of multifunctional work teams, exploiting the advantages that data-driven work offers, produces truly disruptive ideas that can be brought to reality. The use of knowledge discovery in databases (KDD) for analysis and decision making is perceived as a continuous process aimed at executing and solving complex tasks; such optimization proposals can only be realized through the systematic integration of new technologies. Dynamic Knowledge Discovery (DKD) is a convergence of people and machines working together to generate profitable income for the company. As noted above, PLM manages all product-related information during the whole product life cycle, a closed-loop PLM system enables all stakeholders to track, manage and control product information, and a basic architecture for


closed-loop PLM consists of communication channels with the product during its operation and a platform. Nowadays there are different applications for SP. What is not yet widely observed is their use throughout the value flow of the supplier–company–customer system. For instance, in the food and beverage industry, a mineral-water bottler verifies the healthiness of the water before it is bottled, controls the bottling process, and monitors customer orders so as to produce on demand; small appliances connected to the Internet give the manufacturer the opportunity to analyse user behaviour as input for its marketing area; and digitalised bicycles connected to the cloud allow the manufacturer to carry out prescriptive maintenance, offering the customer new after-sales services. The new technological capacities and, consequently, the new opportunities they entail in the design, development, production and use of intelligent products offer an inestimable amount of data and information. This information, integrated horizontally throughout the entire value chain, allows monitoring and control so that the specifications of components, subassemblies and finished products are met. All these features, integrated and available to all stakeholders in real time, are what is really revolutionising total quality management in every kind of company.

5 Conclusions

The aim of this work was to determine, through a systematic literature review, whether the enabling technologies of I4.0 and the development of intelligent products are the pillars of a new era of quality management, the so-called Quality 4.0. It was concluded that the enabling technologies open a new stage in quality management, which becomes "data-driven". It was also found that considering the effects of the digital transformation throughout society, with a focus on the human factor, would trigger a further evolution of quality management, which is beginning to be known as Quality 5.0. The digitization of all components and final products, generically known as smart objects, throughout the life cycle of the value stream can be a fundamental analysis and development tool for this new era of quality management, the latter being represented in one of the dimensions of the I4.0 reference architecture model, RAMI 4.0. It is worth clarifying that, based on the literature review carried out, there is still no definitive definition of Quality 4.0 or Quality 5.0. That is why both academics and standardization organizations must continue working towards minimum agreements of understanding in this discipline, which is so important for both service and manufacturing companies.


References

1. Roblek V, Meško M, Krapež A (2016) A complex view of Industry 4.0. SAGE Open 6(2)
2. Javaid M, Haleem A (2020) Critical components of Industry 5.0 towards a successful adoption in the field of manufacturing
3. Xu X, Lu Y, Vogel-Heuser B, Wang L (2021) Industry 4.0 and Industry 5.0—Inception, conception and perception. J Manuf Syst 61(October):530–535. https://doi.org/10.1016/j.jmsy.2021.10.006
4. Schlechtendahl J, Keinert M, Kretschmer F, Lechler A, Verl A (2015) Making existing production systems Industry 4.0-ready: holistic approach to the integration of existing production systems in Industry 4.0 environments. Prod Eng 9(1):143–148
5. Demartini M, Tonelli F (2018) Quality management in the industry 4.0 era. Proc Summer Sch Fr Turco 2018:8–14
6. Sanders A, Elangeswaran C, Wulfsberg J (2016) Industry 4.0 implies lean manufacturing: research activities in industry 4.0 function as enablers for lean manufacturing. J Ind Eng Manag 9(3):811–833
7. Nylén D, Holmström J (2015) Digital innovation strategy: a framework for diagnosing and improving digital product and service innovation. Bus Horiz 58(1):57–67
8. Adamik A, Nowicki M (2018) Preparedness of companies for digital transformation and creating a competitive advantage in the age of Industry 4.0. Proc Int Conf Bus Excell 12(1):10–24. https://www.mendeley.com/catalogue/preparedness-companies-digital-transformation-creating-competitive-advantage-age-industry-40/
9. Cotrino A, Sebastián MA, González-Gaya C (2020) Industry 4.0 roadmap: implementation for small and medium-sized enterprises. Appl Sci 10(23):1–17
10. Standardization-Council-Industry (2020) DIN and DKE roadmap. German standardization roadmap Industrie 4.0, version 4, 4th edn. DIN e. V., Berlin, 136 p. www.din.de
11. DKE Deutsche Kommission Elektrotechnik Elektronik Informationstechnik in DIN und VDE (2018) German Standardization Roadmap: Industrie 4.0. DIN e. V., 146 p. www.din.de
12. Xin Y, Ojanen V (2018) The impact of digitalization on product lifecycle management: how to deal with it? IEEE Int Conf Ind Eng Eng Manag 2017-Dec:1098–1102
13. Lenz J, MacDonald E, Harik R, Wuest T (2020) Optimizing smart manufacturing systems by extending the smart products paradigm to the beginning of life. J Manuf Syst 57:274–286. https://doi.org/10.1016/j.jmsy.2020.10.001
14. Mourtzis D, Gargallis A, Zogopoulos V (2019) Modelling of customer oriented applications in product lifecycle using RAMI 4.0. Procedia Manuf 28(January):31–36. https://doi.org/10.1016/j.promfg.2018.12.006
15. Raff S, Wentzel D, Obwegeser N (2020) Smart products: conceptual review, synthesis, and research directions. J Prod Innov Manag 37(5):379–404
16. Aldag MC, Eker B (2018) What is quality 4.0 in the era of Industry 4.0? Int Conf Qual Life (November):31–34. https://www.researchgate.net/publication/329442755_WHAT_IS_QUALITY_40_IN_THE_ERA_OF_INDUSTRY_40
17. Ralea C, Dobrin O-C, Barbu C, Tănase C (2017) Looking to the future. Self Manag Learn Action Putt SML into Pract 255–266
18. Carvalho AV, Enrique DV, Chouchene A, Charrua-Santos F (2019) Quality 4.0: an overview. Procedia Comput Sci 181:341–346
19. Batchkova IA, Gocheva DG, Georgiev D (2017) IEC-62264 based quality operations management according the principles of industrial internet of things. Sci Proc XIV Int Congr Mach Technol Mater VI(3):431–434. http://mtmcongress.com/proceedngs/2017/Summer/6/09.IEC-62264
20. Hrehova S (2021) 4.0 Concept, 193–202
21. Wang Y, Towara T, Anderl R (2017) Topological approach for mapping technologies in reference architectural model industrie 4.0 (RAMI 4.0). Lect Notes Eng Comput Sci 2:982–990


22. Schmidt J, Adler S (2019) Die digitale Lebenslaufakte—Stand der Normung
23. Nunes ML, Pereira AC, Alves AC (2017) Smart products development approaches for Industry 4.0. Procedia Manuf 13:1215–1222. https://doi.org/10.1016/j.promfg.2017.09.035
24. Florén H, Frishammar J, Parida V, Wincent J (2018) Critical success factors in early new product development: a review and a conceptual model. Int Entrep Manag J 14(2):411–427
25. Gao J, Bernard A (2018) An overview of knowledge sharing in new product development. Int J Adv Manuf Technol 94(5–8):1545–1550
26. Hoyer WD, Chandy R, Dorotic M, Krafft M, Singh SS (2010) Consumer cocreation in new product development. J Serv Res 13(3):283–296

Embedded Vision System Controlled by Dual Multi-frequency Tones
I. J. Orlando Guerrero, Ulises Ruiz, Loeza Corte, and Z. J. Hernadez Paxtian

Abstract An embedded vision system based on the conjunction of a mobile phone, a DTMF (dual-tone multi-frequency) module, and a four-bit relay module is presented in this paper. The mobile camera is employed to distinguish color characteristics of the analysed objects by means of digital image processing. Each time a feature is recognised, the mobile phone generates a different tone through its audio port, which is sent to the DTMF module to activate one of four available digital outputs. A relay module then allows this digital signal to drive a power stage. The linear velocity of the system, from the moment the image is acquired until the power signal is activated, was evaluated using an oscilloscope to perform a timing analysis. The results show that the system cannot distinguish color characteristics of objects at speeds greater than 48 cm/s. This system is intended to be used in the food industry as a low-cost alternative to a vision sorting machine.
Keywords Embedded vision system · Machine selector · DTMF control · OpenCV · Android studio

1 Introduction

Embedded vision systems are increasingly used to automate industrial processes. In these systems, the recognition and selection of objects on a conveyor belt are two of the main objectives of this sub-area of automation. For object recognition, characteristics such as color, size, and texture are considered. A further objective is that, every time the system makes a recognition, it generates a digital output in order to activate an actuator connected to a power stage [1, 2].

I. J. Orlando Guerrero · L. Corte · Z. J. Hernadez Paxtian
Universidad de la Cañada, Teotitlán de Flores Magón, Oax, México
U. Ruiz (B)
Instituto Nacional de Astrofísica, Óptica y Electrónica, Sta María Tonantzintla, San Andrés Cholula, Pue, México
e-mail: [email protected]


The conjunction of pattern recognition and the activation of a digital output are the two main features of so-called vision sorting machines, whose main objective is to remove unwanted products from the production line at high linear speed (conveyor-belt speed). Selector vision machines and embedded vision systems can be considered synonymous in the automation area, since they perform the same task [3]. Some vision systems can perform more than one selection; in this research we limit ourselves to selector vision machines. On the other hand, a field-programmable gate array (FPGA) and a high-speed camera are usually the core of such vision systems, which increases their cost significantly [4]. Therefore, an alternative that replaces these two components is proposed in this research. The proposed embedded vision system does not intend to replace commercial vision systems; it is only a low-cost alternative for automation tasks that require a low recognition speed and no more than four digital outputs for selection on the production line. The proposed vision system consists of three modules: a mobile phone, which replaces the FPGA and camera; an MT8870 DTMF module to generate digital outputs; and a four-bit relay module, which links the mobile phone to the power stage of the actuators that take the product out of the production line. The control algorithm of the proposed vision system (object recognition and activation of the DTMF module) was implemented in the Android Studio integrated development environment using the image processing functions of OpenCV (Open Source Computer Vision). The following sections explain in detail how these modules are combined.

1.1 OpenCV and Android Studio

Real-time digital image processing has gained importance in recent years due to advances in machine learning, a sub-area of artificial intelligence. In this context, each captured frame is processed individually, applying digital image processing operations such as sharpening, blurring, erosion, dilation, adaptive thresholding, histogram computation, and RGB component extraction, among many others. The color of an object can be detected by analyzing its spectral components: for example, to detect whether an object is blue, the three two-dimensional matrices corresponding to the RGB channels will present different values, and the average value of the pixels in the blue channel will be greater than the averages of the red and green channels. With this simple process it is possible to distinguish blue objects. Real-time digital image processing requires efficient, dedicated algorithms. One alternative is OpenCV, a library of programming functions for real-time computer vision; linked with a cell phone and Android Studio, these functions bring this processing capability to mobile devices. The library is cross-platform, free to use under an open-source BSD license, and maintained by Itseez, Inc., an expert in computer vision algorithms and implementations for embedded hardware [5, 6]. OpenCV was designed for high computational efficiency with a strong focus on real-time applications and is written in optimized C code. OpenCV functions can be executed in Android Studio by adding the OpenCV Android software development kit (SDK). The control algorithm of the developed vision system employs four elements: 1. cv2.VideoCapture, to capture frames for processing. 2. cv2.cvtColor, to obtain the matrices corresponding to the RGB channels of the captured image. 3. The NumPy library, to perform mathematical operations on the image matrices. 4. ToneGenerator and AudioTrack, which belong to Android, used to generate dual multi-frequency (DTMF) tones [7].
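As an illustration of the channel-averaging idea described above, the following minimal Python/OpenCV sketch (not taken from the paper; function and variable names are illustrative) computes the mean of each channel of a frame and flags it as "blue" when the blue average dominates:

```python
import cv2

def dominant_channel_is_blue(frame_bgr):
    """Return True if the blue-channel mean exceeds the red and green means.

    frame_bgr is an image as returned by cv2.VideoCapture.read(), i.e. in
    BGR channel order (OpenCV's default).
    """
    b, g, r = cv2.split(frame_bgr)           # three 2-D matrices
    mean_b, mean_g, mean_r = b.mean(), g.mean(), r.mean()
    return mean_b > mean_g and mean_b > mean_r

cap = cv2.VideoCapture(0)                     # default camera
ok, frame = cap.read()
if ok:
    print("blue object detected" if dominant_channel_is_blue(frame) else "not blue")
cap.release()
```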

1.2 MT8870 DTMF Module

When a key is pressed on a cell phone, a DTMF tone is generated and emitted through the phone's audio port or speaker. These tones are not only used to signal user actions (such as dialing a character); they have also been used as control commands, since they travel over the same voice communication channel. Different authors have used this extensively in home automation and robotics, showing a highly reliable link between control actuators and the MT8870 tone decoder chip, which is designed to capture dual tones and generate a four-bit binary code. The name "dual tone" derives from the fact that each tone is the sum of two sinusoidal signals of different frequencies (one from a high-frequency group and one from a low-frequency group). The correspondence between characters and frequencies is shown in Table 1: for example, the character # is linked to the frequencies 1477 and 941 Hz, which added together produce the tone corresponding to #. There is also a correspondence between each character/tone and the four-bit binary code generated by the MT8870 chip; Table 2 shows this relationship for the characters used in this research, and the remaining relationships can be found in the chip's data sheet. The MT8870 chip is embedded in a generic module that contains its typical electronic configuration, clock, and control signals on a single board (see its specification sheet); the module also provides a 5 mm female input connector to introduce the tones, which in our case are generated by the cell phone.
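To make the dual-tone idea concrete, the short sketch below (illustrative only; the sample rate and duration are assumptions, not values from the paper) synthesizes the '#' tone as the sum of a 941 Hz and a 1477 Hz sinusoid, the frequency pair listed for '#' in Table 1:

```python
import numpy as np

def dtmf_tone(low_hz, high_hz, duration_s=0.2, sample_rate=8000):
    """Return a DTMF tone as the sum of two sinusoids, normalized to [-1, 1]."""
    t = np.arange(int(duration_s * sample_rate)) / sample_rate
    tone = np.sin(2 * np.pi * low_hz * t) + np.sin(2 * np.pi * high_hz * t)
    return tone / 2.0  # scale the sum of two unit sinusoids back into [-1, 1]

# The '#' character uses the 941 Hz (low group) and 1477 Hz (high group) pair.
hash_tone = dtmf_tone(941, 1477)
print(hash_tone.shape)  # (1600,) samples: 0.2 s at 8 kHz
```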

Table 1 Correspondence between characters and the two sine-wave frequencies (Hz)

Low \ High   1209   1336   1477   1633
697          1      2      3      A
770          4      5      6      B
852          7      8      9      C
941          *      0      #      D


Table 2 Character to four-bit binary code mapping for the MT8870 chip

Character   Q4   Q3   Q2   Q1
1           0    0    0    1
2           0    0    1    0
4           0    1    0    0
8           1    0    0    0

Table 3 EST, TOE, and INH states used to generate a binary code (H = 5 V, L = 0 V)

Character   TOE   INH   EST   Q4   Q3   Q2   Q1
1           H     L     H     0    0    0    1
2           H     L     H     0    0    1    0
4           H     L     H     0    1    0    0
8           H     L     H     1    0    0    0

This module facilitates the connection with other modules and is inexpensive and easy to obtain. Its pins are described as follows [8, 9]:
1. Q1 to Q4 are the digital outputs (with indicator LEDs) corresponding to the decoded tones.
2. VCC is the power supply, set to 5 V.
3. GND is the ground connection.
4. EST goes high when the chip starts to detect a tone.
5. TOE: when TOE is grounded, the data outputs are in high impedance.
6. INH: a high logic level inhibits the detection of the tones representing characters A, B, C, and D; this pin is tied to ground in our case.
Table 3 shows the states of EST, TOE, and INH used to generate the binary codes of the characters 1, 2, 4, and 8, which the system uses to produce a digital sequence [10, 11].

1.3 Four-Bit Relay Module

The link between the binary code and the power stage is made by means of a four-channel relay module supplied with 5 V. Each channel needs a control current of 15–20 mA to energize the internal electromagnet of its relay. The module can switch high power, withstanding 250 V AC at 10 A and 30 V DC at 10 A, which allows it to drive industrial actuators. The module is optically isolated from the high-voltage side to avoid a ground loop when connected to the MT8870 module. The relay module has four outputs, each with three connectors: the middle one is the common (COM), while the two at the ends are the normally closed (NC) and normally open (NO) contacts. IN1 to IN4 are the digital inputs driven by the DTMF module, while VCC and GND are


the connections for the external source that powers the relays; for this, the jumper on the module must be removed, thus isolating that supply from the DTMF module.

2 Materials and Methods

As mentioned, the vision system consists of the connection of three modules: the mobile phone, the MT8870 module, and the relay module; the control algorithm is a fundamental part of the system. This section explains the physical connections between the modules and the control algorithm, as well as the methodology used to measure the linear velocity of the system.

2.1 Electrical Connection Between Modules

The electrical connections between the modules are shown in Fig. 1. For the description of the system, three stages are distinguished in this article:
1. Tone coupling. A male-to-male audio cable is used. One end is connected to the MT8870 module by means of the 5 mm female connector included on the module; the other end is connected to the audio connector of a Motorola E6s cell phone, whose camera has 13 × 10⁶ pixels of 1.12 μm size and an f/2.2 aperture. This connection between the module and the mobile phone is very stable and avoids introducing noise into the system.
2. Digital outputs. As can be seen in Fig. 1, the digital outputs Q1 to Q4 of the MT8870 module are connected directly to the inputs IN1 to IN4 of the relay module. This connection order is important because it is used in the control algorithm.

Fig. 1 Electrical connection of the modules, which are part of the embedded vision system


It can also be seen in Fig. 1 that the MT8870 and relay modules are powered from a 5 V supply, and that an LED connected to the EST pin indicates when tone processing starts.
3. Time analysis. An oscilloscope is used to analyze the system response from the moment the image is acquired until the power signals NO1 to NO4 are activated. The timing analysis uses the normally open contact configuration provided by the relay module, i.e., COM and NO are employed. As shown in Fig. 1, the COM ports are connected to the ground of a 30 V DC power supply, while NO1 to NO4 are connected to the ground of the oscilloscope, yielding four independent contacts. The positive terminal of the power supply is connected to the CH1 to CH4 inputs of the oscilloscope, so that the circuit of each contact closes every time a digital signal arrives from the DTMF module. This configuration lets the contacts actuate independently each time the system detects a color; these actions are governed by the control algorithm.

2.2 Control Algorithm

In a vision machine, the control algorithm is fundamental, since the analyzed objects are taken off the production line under conditions preset by the algorithm. In our case, the control algorithm was developed in the Android Studio environment, version 2021.2.1, and the conditions were set by the color of the analyzed object; four colors are considered: red, green, blue, and black. In general, the proposed system works as follows: every time an object of one of these colors crosses the field of view of the camera, the algorithm generates a tone at the audio port of the mobile phone, which is connected to the MT8870 module that produces the corresponding binary code. Finally, this code drives the relay module, activating the power signals NO1 to NO4. To do this, the control algorithm performs the following steps (a sketch of this logic is given after the list):
1. Using the OpenCV function cv2.VideoCapture, images are captured in real time at a resolution of 480 × 640 pixels. This resolution is chosen because fine detail is not required and it speeds up processing.
2. The function cv2.cvtColor separates the RGB components of the captured image, and by means of the NumPy library these three images are handled as matrices of size 480 × 640 pixels.
3. For each RGB matrix the average of its pixels is calculated, that is, the average intensities IR, IG, and IB.
4. The values of IR, IG, and IB are compared in order to distinguish between colors. For example, to detect red the condition is IR > IG and IR > IB, while for green and blue the conditions are IG > IR and IG > IB, and IB > IR and IB > IG, respectively. A color is detected if its condition is true. For black, in practice, the intensities IR, IG, and IB are not exactly zero; they take values close to the camera noise in darkness, so black is detected by verifying that IR, IG, and IB all lie within the range 0–20 on a grayscale, a threshold obtained experimentally.


5. Each time one of the conditions described in the previous step is true, the algorithm accesses one of four arrays containing the values needed to generate the tone corresponding to character 1, 2, 4, or 8. These characters/tones produce the binary outputs Q1–Q4 in the MT8870 module, which in turn activate the NO1–NO4 switches.
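The following Python sketch illustrates steps 3–5 (it is not the authors' code; the color-to-character assignment and the point where the tone would be emitted are assumptions used only for illustration):

```python
import numpy as np

BLACK_MAX = 20  # experimentally obtained grayscale threshold from step 4

def classify_color(mean_r, mean_g, mean_b):
    """Return 'red', 'green', 'blue', 'black', or None from channel averages."""
    if mean_r <= BLACK_MAX and mean_g <= BLACK_MAX and mean_b <= BLACK_MAX:
        return "black"
    if mean_r > mean_g and mean_r > mean_b:
        return "red"
    if mean_g > mean_r and mean_g > mean_b:
        return "green"
    if mean_b > mean_r and mean_b > mean_g:
        return "blue"
    return None

# Hypothetical assignment of colors to DTMF characters 1, 2, 4, 8 (the paper
# does not state which color maps to which character).
COLOR_TO_CHARACTER = {"red": "1", "green": "2", "blue": "4", "black": "8"}

def process_frame(rgb_frame):
    """rgb_frame: H x W x 3 NumPy array. Return the DTMF character to emit, or None."""
    mean_r, mean_g, mean_b = rgb_frame.reshape(-1, 3).mean(axis=0)
    color = classify_color(mean_r, mean_g, mean_b)
    if color is None:
        return None
    return COLOR_TO_CHARACTER[color]  # the character whose tone is then played
```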

2.3 Experimental Setup

A vision machine uses a conveyor belt to bring the objects to be analyzed into the field of view of the camera; if an object meets certain characteristics (color, texture, etc.), it can be selected or removed from the production line. In our case, this selection is made by the preset conditions of the control algorithm explained in the previous section. The combination of the image-capture rate (frames per second), the algorithm, and the electronics involved determines the linear recognition speed of the vision machine, which varies with the components used. The linear recognition speed of the proposed vision machine was quantified using the experimental setup shown in Fig. 2a. The conveyor belt was replaced by a disk of radius 5.1 cm driven by a DC motor, and the objects were replaced by 0.2 cm wide lines of red, green, blue, and black printed on the disk at angles of 90°, 180°, 270°, and 360°; see Fig. 2b. The separation between stripes is approximately 8 cm, equivalent to a linear band of 32 cm, as illustrated in Fig. 3a. Each stripe, when detected by the system, closes a contact in the relay module (NO1–NO4), which opens again when the stripe is no longer detected, as shown in Fig. 3b. The experimental setup was fixed to an optical test plate, and a post supports the

Fig. 2 Conditions used in the control algorithm to distinguish colored objects


Fig. 3 a Linear band equivalence. b Joint plot of the four activation signals

mobile phone, to which a monocular with 4× magnification was attached to focus the lines on the disk; the focusing distance of the monocular is 10 cm from the lens mount to the disk surface. The linear velocity in our case can be calculated from

V = s/t = ωr    (1)

where s and t are the distance and time of separation between the objects, ω is the angular velocity in radians per second, and r is the radius of the disk. The angular velocity can be written as 2π times the frequency f (Hz), so the linear velocity becomes

V = 2πfr    (2)

and, since the period P is the inverse of the frequency, the equation can also be written as

V = 2πr/P    (3)

Equation (3) allows V to be calculated in terms of the disk radius and the period, i.e., the time it takes for the four objects to be distinguished. To quantify V, the NO1–NO4 activation signals were observed through the CH1–CH4 channels of the oscilloscope, and the period was obtained by analyzing joint plots of the four activation signals acquired from the oscilloscope by a computer. An example of a joint activation graph is shown in Fig. 3, where the period is given by the time from the activation of the NO1 signal until the activation of the NO4 signal, that is, a 360° turn. In the same figure, a square pulse represents the detection of a line, the yellow segment is the time it takes for the next line to pass through the field of view of the camera, and the union of a yellow segment and a dotted segment represents the time during which there is no detection in the system.


From Eq. (3) and Fig. 3b we can see that, for example, if the period equals 1 s, the linear velocity of the band is approximately 32 cm/s; in other words, in one second the vision machine recognizes four objects and activates the NO1–NO4 power signals in turn.
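As a quick numerical check of Eq. (3) (a sketch only; the disk radius of 5.1 cm is the value given in Sect. 2.3):

```python
import math

def linear_velocity_cm_s(period_s, radius_cm=5.1):
    """Linear velocity of the simulated band, V = 2*pi*r / P (Eq. 3)."""
    return 2 * math.pi * radius_cm / period_s

print(round(linear_velocity_cm_s(1.0), 1))   # ~32.0 cm/s, as stated above
print(round(linear_velocity_cm_s(0.65), 1))  # ~49.3 cm/s, near the 48.70 cm/s limit reported later
```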

3 Results

The linear velocity of the proposed vision machine was estimated experimentally using the methodology described in Sect. 2.3. Time-analysis graphs are used to calculate the period, and therefore the limiting linear velocity at which the system can recognize an object, by applying different voltages to the motor that rotates the disk.

3.1 Time Analysis

By applying different voltages to the DC motor, the conveyor belt can be simulated; voltages from 2 to 6.5 V in increments of 0.5 V were applied. With the help of the oscilloscope, the joint graphs for the different voltages were obtained and analyzed. An example of a joint graph of NO1–NO4 against time is shown in Fig. 4; in this case the voltage applied to the motor was 2 V DC, and the graph shows that the period is 6.6 s (0.15 Hz), i.e., four objects are recognized in this time. Using Eq. (3), the linear velocity in this case is 4.8 cm/s. The same calculation was performed for the other voltages applied to the motor, and each of the resulting joint graphs was analyzed independently. It was observed that for voltages higher than 6.5 V the NO1–NO4 signals remain at zero, so the limiting linear recognition velocity was 48.70 cm/s.

Fig. 4 Joint time plot of the NO1–NO4 control signals, for a voltage of 2 V applied to the motor


Table 4 Frequency, period, and linear velocity for the different motor voltages

Voltage (V)   Frequency (Hz)   Period (s)   Linear velocity (cm/s)
2             0.15             6.66         4.80
2.5           0.21             4.76         6.72
3             0.32             3.12         10.25
3.5           0.52             1.92         16.66
4             0.71             1.40         22.75
4.5           0.81             1.23         25.95
5             1.02             0.98         32.68
5.5           1.18             0.84         37.81
6             1.32             0.75         42.29
6.5           1.52             0.65         48.70

In other words, the system can detect four objects in 0.65 s. The experimental data for the different voltages are shown in Table 4, which relates the applied voltage to the period: the period decreases rapidly for voltages from 2 to 4 V, while from 4 to 6 V it remains almost constant, which means the system is reaching the point where it cannot process more images.

4 Conclusions

A vision system based on a mobile phone, a DTMF module, and a relay module was presented; its control algorithm uses OpenCV image processing functions and was implemented in the Android Studio 2021.2.1 environment. The linear velocity of the system was estimated through a time analysis of joint NO1–NO4 graphs obtained with an oscilloscope and a rotating disk carrying red, green, blue, and black stripes. The disk emulates a conveyor belt with objects passing through the field of view of the camera; each time the system detects a color it generates a tone and activates the DTMF and relay modules. In this way, an operating bandwidth of 0.15–1.52 Hz was estimated, corresponding to a linear velocity range of 4.8–48.70 cm/s. It was observed that, as the motor voltage increases, the recognition period decays approximately exponentially, and that the linear dependence between frequency and linear velocity indicates the stability of the system. This vision system is intended to be used in industry as an object selection machine in cases where a low speed is sufficient.


References

1. Javaid M, Haleem A, Singh RP, Rab S, Suman R (2022) Exploring impact and features of machine vision for progressive industry 4.0 culture. Sens Int 3:100132
2. Silva RL, Canciglieri Junior O, Rudek M (2022) A road map for planning-deploying machine vision artifacts in the context of industry 4.0. J Ind Prod Eng 39(3):167–180
3. Bahadirov G, Umarov B, Obidov N, Tashpulatov S, Tashpulatov D (2021) Justification of the geometric dimensions of drum sorting machine. IOP Publishing 937(3):032043
4. Mohamed AR, El Masry GM, Radwan SA, ElGamal RA (2021) Development of a real-time machine vision prototype to detect external defects in some agricultural products. Mansoura University, Faculty of Agriculture 12(5):317–325
5. Madona E, Yulastri Y, Nasution A, Irmansyah M (2022) Design and implementation of portable and prospective embedded system and IoT laboratory kit modules. Indones J Electron Electromed Eng Med Inform 4(1):28–35
6. Shubiksha TV, Karthick S, Mohammad Sharukh M, Naveen M, Shanthi K (2021) Smart irrigation using embedded system. Advances in automation, signal processing, instrumentation, and control, pp 725–734
7. Grigore (2021) Considerations regarding the importance of the sorting operation of fruits and vegetables. Ann Univ Craiova-Agric Mont Cadastre Ser 50(2):316–321
8. Sharada N (2020) Automated home power system generated by thermoelectric generators in the DTMF module. Int Res J Innov Eng Technol 4(2):84
9. Nie X, Lou C, Yin R (2021) Research on embedded machine vision inspection system based on FPGA. For Chem Rev 545–556
10. Mannan MdS, Sakib MdN (2014) GSM based remote device controller using SIM548C. In: Fifth international conference on computing, communications and networking technologies (ICCCNT), IEEE, pp 1–4. https://doi.org/10.1109/ICCCNT.2014.6963023
11. Cho YC, Jeon JW (2008) Remote robot control system based on DTMF of mobile phone. In: 6th IEEE international conference on industrial informatics, pp 1441–1446. https://doi.org/10.1109/INDIN.2008.4618331

Determinants of City Mobile Applications Usage and Success
Rita d'Orey Pape, Carlos J. Costa, Manuela Aparicio, and Miguel de Castro Neto

Abstract Smart cities are gaining popularity among local governments as urban regions evolve. The new city paradigm places the citizen at the center of an organic, efficient, interconnected structure. Information and communication technology (ICT) facilitates this change and citizen engagement, and city services and apps are one conduit. The main goal of this study is to identify the key determinants of city mobile app adoption and success. We present a research model empirically tested in a European city, using an online survey to elicit public feedback on city apps. The results show that perceived usefulness and perceived ease of use are significant for city app adoption, and that these apps also deliver some net benefits. This study contributes a new model of city app adoption and supports implementation and development.
Keywords Smart cities · Technology adoption · Innovation · Citizen engagement · Gamification · Mobile apps · E-government

R. d'Orey Pape
EIT InnoEnergy SE, Eindhoven, The Netherlands; ISCTE-IUL, Lisboa, Portugal
e-mail: [email protected]
C. J. Costa
Advance/ISEG - Lisbon School of Economics and Management, Universidade de Lisboa, Lisbon, Portugal
e-mail: [email protected]
M. Aparicio (B) · M. de Castro Neto
NOVA Information Management School (NOVA IMS), Universidade Nova de Lisboa, Lisbon, Portugal
e-mail: [email protected]
e-mail: [email protected]


1 Introduction

The global population is increasingly concentrated in large cities, which offer higher life expectancy thanks to their strong economies and urban design. The resulting influx of new citizens [9] causes unpredictability and stress in urban areas, and rapid urbanization complicates the planning and management of transportation, traffic, food and energy supply, and crowdfunding [5]. Numerous studies have examined the adoption of new information systems and the impact of smart cities (SC) on society. This study examines the influence of municipal applications on Lisbon's transformation into a smart city by measuring awareness, enjoyment, usage, and net benefits (city and individual impact). We surveyed the available literature, from Everett Rogers' Innovation Diffusion Theory (1969) on technology adoption to Rob Kitchin's review of the smart city paradigm (2019), as research on smart cities turned to public involvement, e-government, and gamification. Using the literature review results, a research model was built based on Davis' [16] technology acceptance model and DeLone and McLean's information systems success model [26]. The model was operationalized through an online survey with 237 respondents. To approach the research purpose more comprehensively and to structure the survey, a smart city inventory and interviews with city app users were also carried out. After data collection, structural equation modeling with partial least squares was used for analysis, and after model validation the results were compared with the supporting hypotheses.

2 Theoretical Overview

The Smart City (SC) has been conceptualized in many ways: a city where smart technology promotes sustainability, safety, comfort, and citizen control, a livable and sustainable city. The SC concept attracts attention from academia, business, and governments in a quest to describe the new trend of cities composed and monitored by technology [22, 24], where innovation is the leading agent for both governance and the economy, and where a new "smart role" of the citizen is the foundation for creativity and entrepreneurship. An SC uses digital technologies to improve performance, reduce expenses, increase resource efficiency, and actively involve residents. The concept of citizen involvement has evolved [7], since the transfer of power to engage individuals in political and economic processes is a cornerstone of democracy and a widely supported principle. The argument that individuals should participate in their governance [8] appeals to democratic values and is universally acclaimed, yet there is little agreement on how to attain meaningful involvement. Public organizations and government entities use ICTs to open up and integrate external stakeholders into their processes [2]. We can thus base the SC on a distinct idea of communication, aligning old infrastructures (telephone, mobile, web access) with new, fast data-collecting mechanisms (sensors and IoT) that link personal assets to urban architecture [17, 25]. We are moving toward integrated device-and-user communication to


establish "citizen communication," a transparent, computerized channel that allows information to flow [13]. Citizen communication is crucial for development goals [15] and a way for government and business to gain knowledge: new information sources and frequent exchange affect development potential and access to new options for improving daily life, and "e-citizenship" helps shape citizen communication and involvement. The literature urges more research on participatory and collaborative governance [19], because it is unclear what motivates citizen participation in new governance structures. As discussed, ICT has shifted top-down management toward bottom-up decision-making: using web-based IT solutions, public administration [21] has enabled citizen participation in city services and planning. E-government services include renewing citizen IDs, passports, or driver's licenses, filing for commercial registration, and controlling parking meters. Despite improvements and innovations in ICTs intended to boost the provision of tools and services that allow citizen engagement [19], adoption remains low, and researchers have therefore studied which parameters can optimize utilization [1, 4]. Gamification is used in education, health, and human–computer interaction, among other areas [10, 11, 27]; like the SC, the idea has numerous definitions that constrain how it is applied. Gamification tools [18] include goal setting, real-time feedback, clarity, proficiency, challenges, and teamwork. The introduction of a new technology may be beyond our control, but its success depends on adoption: why do people adopt specific technologies? Regulatory contexts, social pressures, and curiosity have been cited as factors. Diffusion of innovations theory aimed to explain how an idea, habit, or product spreads through a population or social system [30]; diffusion occurs in four stages, and Rogers and York [30] consider word-of-mouth the best way to spread a new idea. Age also affects the use of technology in today's fast-paced, complicated, and changing work environment [14]: information processing affects older workers' computer-based performance, and studies of the correlation between age differences and individual adoption and sustained use of technology in the workplace [33] suggest that technology usage decisions are strongly affected by attitude, more prominently in younger workers than in their older peers.

3 Model Proposal

The proposed research model has three essential components: technology, services, and gamification. The components are all related, since municipal applications involve technical adaptation to new concepts and services, and we also want to analyze whether gamification influences usage frequency and satisfaction. The three components are supported by the following constructs: perceived usefulness, perceived ease of use, perceived satisfaction of use, behavioral intention, use, system quality, information quality, service quality, user satisfaction, behavioral attitude, individual impact, organizational/city impact, and gamification. Predicting the use of city apps is interesting since they provide communication and services between citizens and the government. Davis' [16] technology acceptance model proposes analyzing individuals' beliefs,


behaviors, and intentions to determine "how" and "when" new technology is adopted. Perceived satisfaction is based on an individual's experiences and beliefs and reflects the enjoyment of using an application regardless of its performance [32]. Information quality and gamification may affect perceived satisfaction, and adoption follows pleasure, happiness, and fun. Gamification adds fun and engagement features [1, 4, 6, 11, 27, 28] to otherwise less appealing activities, and we expect this layer of enticing and distinctive components to boost app satisfaction and use. Even though studies imply that access to relevant information has a limited influence [31], we expect to gain further insights that support or suggest a different path thanks to ICTs. Our research focuses on the impact of gamification on city app satisfaction and use. Based on the studies by Davis [16] and DeLone and McLean [26], the following hypotheses apply to both constructs:
H1a. Information quality positively influences satisfaction with city apps.
H1b. Information quality positively influences the use of city apps.
H2a. Gamification positively influences satisfaction with city apps.
H2b. Gamification positively influences the use of city apps.
Individuals intuitively assess [16] whether a new solution (technology/service/product) can help them perform better (in work, routines, or searching for and accessing information, among others). Therefore, we hypothesize:
H3. Perceived usefulness influences the intention to use city apps.
After encountering a new solution, a person evaluates both its usefulness and its complexity [16]. An individual may consider a solution to be of the utmost importance (usefulness), but if its use is regarded as difficult and complicated, non-use may result. We expect user-friendliness to affect the adoption of new apps. Hence:
H4. Perceived ease of use influences the intention to use city apps.
H5. Intention to use influences the use of city apps.
According to Urbach et al. [31], satisfaction and use are interdependent, and satisfaction with a solution also influences the individual impact [31]. Hence:
H6. Perceived satisfaction positively influences the use of city apps.
H7. Perceived satisfaction positively influences the net benefits of city apps.
H8. Use positively influences the net benefits of city apps.
DeLone and McLean's approach to IS success seems suitable for assessing user satisfaction with city apps [26]. System and information quality indirectly affected individual and organizational impact through user satisfaction and use [26]. A later revision of the model considered system quality, information quality, service quality, system use, user satisfaction, and net benefits. This multifaceted, interconnected model is a solid framework for measuring IS success.


4 Methodology This research seeks to develop and validate an adoption model based on the technology acceptance model [16] and the information systems success model [26] in order to identify the success features of city apps. Utilizing information systems success models, research on the effectiveness of employee portals [31] explored the limited effect of information quality on reported levels of satisfaction. Numerous studies, including “Enterprise Resource Planning” (ERP) adoption [12] and “Online Programming Course” adoption [27, 28], have highlighted the importance of these two elements. There is no association between app usage and perceived satisfaction (−2.2%). Significant focus is placed on usage intentions in research on information system adoption [12, 27, 28]. In earlier adoption and modeling research [3, 6, 26], the low influence of information quality was identified [6, 26]. However, the effect of gamification components on usage is currently being researched, with promising findings [1, 4, 6, 23]. The organizational and individual impacts were integrated into the net benefits construct because it reflects the performance of city apps, a choice backed by Aparicio et al. [3] and Urbach et al. [31]. After confirming the model’s convergent validity, we evaluate the reliability of the collected data. The data are deemed reliable if both the composite reliability and Cronbach’s alpha exceed 0.60 [20] (a computational sketch of these reliability measures follows Table 1). As shown in Table 1, composite reliability ranges from 0.871 to 0.935 and Cronbach’s alpha spans from 0.717 to 0.916; thus, we can assume the data’s reliability. We then examined the coefficients of determination (R²) and the significance of the path coefficients to verify the structural model (Fig. 1). The R² values of the dependent variables are 0.438 for Intention to Use, 0.282 for Perceived Satisfaction, 0.642 for Use, and 0.462 for Net Benefits. Regarding the direct effects of the path coefficients [29], the results suggest that H6 should be rejected (p > 0.10). Perceived satisfaction increases net benefits (H7) (p < 0.05). Gamification increases satisfaction and usage (p < 0.10). Information quality increases satisfaction and use (p < 0.05).

Table 1 Evaluation of the measurement model

Construct        Indicator reliability   Comp. reliability   AVE     Cronbach's alpha
Gam              0.925                   0.935               0.706   0.916
IQ               0.863                   0.906               0.708   0.861
Intention        0.752                   0.884               0.792   0.738
NB               0.914                   0.930               0.657   0.911
PEOU             0.835                   0.874               0.636   0.814
PU               0.801                   0.875               0.702   0.784
PercSatisfact    0.889                   0.926               0.807   0.881
UseA             0.799                   0.871               0.773   0.717
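As a minimal sketch of the reliability criteria mentioned above (my own illustration, not the authors' analysis code; variable names and example loadings are hypothetical), Cronbach's alpha, composite reliability, and AVE can be computed as follows:

```python
# Illustrative reliability checks for one construct of a measurement model.
import numpy as np
import pandas as pd

def cronbach_alpha(items: pd.DataFrame) -> float:
    """Cronbach's alpha for one construct (threshold >= 0.60 in this study)."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_vars.sum() / total_var)

def composite_reliability(loadings) -> float:
    """Composite reliability from standardized outer loadings."""
    loadings = np.asarray(loadings)
    error_vars = 1 - loadings ** 2            # indicator error variances
    return loadings.sum() ** 2 / (loadings.sum() ** 2 + error_vars.sum())

def average_variance_extracted(loadings) -> float:
    """AVE: mean of squared standardized loadings (convergent validity)."""
    loadings = np.asarray(loadings)
    return float((loadings ** 2).mean())

# Hypothetical four-item construct:
example_loadings = [0.84, 0.81, 0.86, 0.85]
print(composite_reliability(example_loadings))
print(average_variance_extracted(example_loadings))
```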


Fig. 1 Structural model results of smart city’s mobile applications success

Perceived usefulness and ease of use (p < 0.01) influence the intention to use. The intention to use city apps, in turn, is a strong driver of use. Finally, as predicted by H8, use significantly increases net benefits (Table 2).

Table 2 Structural model evaluation

Path                     Original sample (O)   Sample mean (M)   STDEV   T statistics (O/STDEV)   P value   Statistical significance
Gam → PercSatisfact      0.232                 0.238             0.123   1.886                    0.059     Positive*
Gam → UseA               0.159                 0.159             0.088   1.801                    0.072     Positive*
IQ → PercSatisfact       0.471                 0.469             0.077   6.085                    0.000     Positive***
IQ → UseA                0.220                 0.212             0.090   2.433                    0.015     Positive*
Intention → UseA         0.655                 0.654             0.086   7.634                    0.000     Positive***
PEOU → Intention         0.325                 0.328             0.109   2.979                    0.003     Positive**
PU → Intention           0.404                 0.411             0.109   3.701                    0.000     Positive***
PercSatisfact → NB       0.245                 0.243             0.109   2.240                    0.025     Positive*
PercSatisfact → UseA     −0.022                −0.015            0.066   0.337                    0.736     Nonsignificant
UseA → NB                0.556                 0.562             0.100   5.565                    0.000     Positive*

Note * Significance at p < 0.10; ** Significance at p < 0.05; *** Significance at p < 0.01
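As a hedged illustration of how the T statistics and p values in Table 2 follow from the bootstrap estimates (a sketch assuming a two-tailed test and a large number of bootstrap samples, not the authors' code):

```python
# Relate a path's original estimate and bootstrap standard deviation
# to the t statistic and two-tailed p value reported in Table 2.
from scipy import stats

def path_significance(original: float, stdev: float, n_boot: int = 5000):
    t = abs(original) / stdev                      # T statistics (O/STDEV)
    p = 2 * (1 - stats.t.cdf(t, df=n_boot - 1))    # two-tailed p value
    return t, p

# e.g., Gam -> PercSatisfact: O = 0.232, STDEV = 0.123
print(path_significance(0.232, 0.123))  # approximately (1.886, 0.059)
```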


5 Discussion This study proposes and validates an adoption model based on the technology acceptance model [16] and the information systems success model [26] in order to determine the success characteristics of city applications. Research on the success of employee portals [31] using information systems success models addressed the limited impact that information quality has on reported levels of satisfaction. Numerous studies, including “Enterprise Resource Planning” (ERP) adoption [12] and “Online Programming Course” adoption [27, 28], have highlighted the significance of these two components. There is no correlation between perceived satisfaction (−2.2%) and use of the city app. Research on information system adoption places a significant emphasis on usage intentions [12, 27, 28]. In previous adoption and modeling studies [3, 6, 26], the low impact of information quality was identified. However, the impact of gamification elements on use is still being studied [6, 23], with positive results. The organizational and individual impacts were merged into the net benefits construct because it reflects the success of city apps; this choice is supported by DeLone and McLean [26], who report strong correlations in their measurement of IS success, by Urbach et al. [31] on employee portal success, and by Aparicio et al. [3] on e-learning success.

6 Conclusions The literature review highlights Smart Cities’ (SC) impact on metropolitan governance. It also emphasizes how innovative ICTs drive this paradigm shift. The shift to a decentralized governance model in the SC, where the citizen is at the center and resources and infrastructure are efficiently managed, requires new communication and engagement channels, including mobile and web services and applications. This study developed a model to predict the technology adoption and the individual and organizational advantages of city services and applications. The model includes information quality, gamification, perceived usefulness, perceived ease of use, intention to use, perceived satisfaction, use, and net benefits. The online survey reflected Lisbon’s demographics and app users’ insights, and the collected data validated the measurement and structural models. Perceived usefulness influences intention and use, contributing to city apps’ net benefits. The inclusion of gamification components had a minor impact on city app usage and satisfaction, suggesting it may not be decisive when building and communicating a new city service. These findings help promote city apps. The model explains 46% of city apps’ success (net benefits) in smart cities. App usage and user satisfaction determine success. Quality information and gamified features make citizens more satisfied, and information quality and gamification boost app usage. Perceived usefulness should be a differentiator to ensure user pleasure and uptake.


The intention to use arises when a user clearly understands how an app can help him/her with a certain job. The clearer the message, the more use is encouraged. Acknowledgements We gratefully acknowledge financial support from FCT - Fundação para a Ciência e Tecnologia, I.P., Portugal: MA and MCN through national funding via research grant UIDB/04152/2020 - Centro de Investigação em Gestão de Informação (MagIC), NOVA Information Management School (NOVA IMS), Universidade Nova de Lisboa, Portugal; and CJC through Advance/CSG, ISEG, Universidade de Lisboa, under UIDB/04521/20.

References 1. Aparicio JT, Trinca M, Castro D, Henriques R (2021) Vehicle smart grid allocation using multiagent systems sensitive to irrational behavior and unstable power supply. In: 2021 16th Iberian conference on information systems and technologies (CISTI). IEEE, pp 1–6 2. Aparicio JT, Arsenio E, Santos FC, Henriques R (2022) LINES: multimodal transportation resilience analysis. Sustainability 14(13):7891 3. Aparicio M, Bacao F, Oliveira T (2017) Grit in the path to e-learning success. Comput Hum Behav 66:388–399 4. Aparicio M, Costa CJ, Moises R (2021) Gamification and reputation: key determinants of e-commerce usage and repurchase intention. Heliyon 7(3):e06383 5. Aparicio M, Costa CJ, Braga AS (2012) Proposing a system to support crowdsourcing. In: Proceedings of the workshop on open source and design of communication (OSDOC 2012), pp 13–17 https://doi.org/10.1145/2316936.2316940 6. Aparicio M, Oliveira T, Bacao F, Painho M (2019) Gamification: a key determinant of massive open online course (MOOC) success. Inf Manage 7. Arnstein SR (1969) A ladder of citizen participation. J Am Plann Assoc 35(4):216–224 8. Callahan K (2007) Citizen participation: models and methods. Int J Public Adm 30(11):1179– 1196 9. Caragliu A, del Bo C, Nijkamp P (2011) Smart cities in Europe. J Urban Technol 18(2):65–82. https://doi.org/10.1080/10630732.2011.601117 10. Costa CJ, Aparicio M (2018) Gamification: software usage ecology. Online J Sci Technol 8(1) 11. Costa CJ, Aparicio M, Aparicio S, Aparicio JT (2017) Gamification usage ecology. In: The 35th ACM international conference on the design of communication. ACM Press 12. Costa CJ, Ferreira E, Bento F, Aparicio M (2016) Enterprise resource planning adoption and satisfaction determinants. Comput Hum Behav 63:659–671 13. Costa CJ, Silva J, Aparício M (2007) Evaluating web usability using small display devices. In: Proceedings of the 25th annual ACM international conference on design of communication, pp 263–268. https://doi.org/10.1145/1297144.1297202 14. Czaja SJ, Sharit J (1993) Age differences in the performance of computer-based work. Psychol Aging 8(1):59–67 15. D’Asaro FA, Di Gangi MA, Perticone V, Tabacchi ME (2017) Computational intelligence and citizen communication in the smart city. Informatik-Spektrum 40(1):25–34 16. Davis FD (1989) Perceived usefulness, perceived ease of use, and user acceptance of information technology. MIS Q 13(3):319 17. de Castro Neto M, Rego JS, Neves FT, Cartaxo TM (2017) Smart & open cities: Portuguese municipalities open data policies evaluation. In: 2017 12th Iberian conference on information systems and technologies (CISTI), pp 1–6. https://doi.org/10.23919/CISTI.2017.7975912 18. Deterding S (2012) Gamification: designing for motivation. Interactions 14–17:1072–5220


19. Gustafson P, Hertting N (2017) Understanding participatory governance: an analysis of participants’ motives for participation. Am Rev Public Adm 47(5):538–549 20. Hair Jr J, Hult GT, Ringle C, Sarstedt M (2014) A primer on partial least squares structural equation modeling (PLS-SEM) 21. Khan Z, Dambruch J, Peters-Anders J, Sackl A, Strasser A, Fröhlich P, Soomro K (2017) Developing knowledge-based citizen participation platform to support smart city decision making: the smarticipate case study. Information (Switzerland) 8(2):1–24 22. Kitchin R (2014) The real-time city? Big data and smart urbanism. GeoJournal 23. Looyestyn J, Kernot J, Boshoff K, Ryan J, Edney S, Maher C (2017) Does gamification increase engagement with online programs? A systematic review. PLoS ONE 12(3):1–19 24. Mergulhao M, Palma M, Costa CJ (2022) A machine learning approach for shared bicycle demand forecasting. In: 2022 17th Iberian conference on information systems and technologies (CISTI). IEEE, pp 1–6 25. Neves FT, de Castro Neto M, Aparicio M (2020) The impacts of open data initiatives on smart cities: a framework for evaluation and monitoring. Cities 106:102860. https://doi.org/10.1016/ j.cities.2020.102860 26. Petter S, DeLone W, McLean E (2008) Measuring information systems success: models, dimensions, measures, and interrelationships. Eur J Inf Syst 17(3):236–263. https://doi.org/10.1057/ ejis.2008.15 27. Piteira M, Costa CJ, Aparicio M (2017) CANOE e Fluxo: determinants in the adoption of online course programming. CANOE e Fluxo: Determinantes na adoção de curso de programação online gamificado. RISTI—Revista Iberica de Sistemas e Tecnologias de Informacao 2017(25):34–53. https://doi.org/10.17013/risti.25.34-53 28. Piteira M, Costa CJ, Aparicio M (2017) A conceptual framework to implement gamification on online courses of computer programming learning: implementation. In: 10th international conference of education, research and innovation (ICERI2017). IATED Academy, pp 7022– 7031 29. Puklavec B, Oliveira T, Popoviˇc A (2018) Understanding the determinants of business intelligence system adoption stages an empirical study of SMEs. Ind Manag Data Syst 118(1):236–261 30. Rogers EM, York N (1995) Diffusion of innovations, 4th edn. Iffil The Free Press 31. Urbach N, Smolnik S, Riempp G (2010) An empirical investigation of employee portal success. J Strat Inf Syst 19(3):184–206. https://doi.org/10.1016/j.jsis.2010.06.002 32. Venkatesh V, Davis FD, Morris MG (2007) Dead or alive? The development, trajectory and future of technology adoption research. J Assoc Inf Syst 8(4):267–286 33. Venkatesh V, Morris MG, Ackerman PL (2000) A longitudinal field investigation of gender differences in individual technology adoption decision-making processes. Organ Behav Hum Decis Process 83(1):33–60

Sustainable Digital Transformation Canvas: Design Science Approach Reihaneh Hajishirzi

Abstract Today, digital transformation (DT) and sustainability are two vital criteria for businesses. These concepts are presented in different theories and research fields, which brings substantial complexity for companies. The purpose of this paper is to present a strategic tool for companies to create sustainable digitally transformed business models. Therefore, by applying a design science methodology, this research proposes a canvas that supports the technological, environmental, economic, social, and organizational dimensions for creating sustainable and digital companies. Keywords Digital transformation · Sustainability · Organizational resilience · Design science

1 Introduction Applying innovative technologies and digitally transformed business processes to create sustainable solutions is necessary for companies to adapt and survive in an ever-changing business world [7]. However, digital transformation (DT) and sustainability involve a wide range of concepts and research areas [4]. DT includes technology, external pressure, people, and organization [15, 31, 32, 38], and sustainability includes environmental, social, and economic factors [40]. This study identifies the main concepts for sustainable digital transformation, including environment, technology, organizational resilience, management, customer engagement, competitive advantage, value proposition innovation, and sustainability. Previous studies present the concepts and research areas through which DT impacts sustainability, including customer engagement [22, 23], business process innovation [1, 45], competitive advantage [2, 11], management support [19, 24], and value proposition innovation [3, 30, 42].

R. Hajishirzi (B) Advance/ISEG (Lisbon School of Economics & Management), Universidade de Lisboa, Lisbon, Portugal e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 S. Anwar et al. (eds.), Proceedings of International Conference on Information Technology and Applications, Lecture Notes in Networks and Systems 614, https://doi.org/10.1007/978-981-19-9331-2_53


But this breadth of topics in DT and sustainability brings a great deal of complexity for companies which are looking for sustainable digitally transformed business models. Therefore, this research proposes a new canvas based on three empirically significant models to provide a new strategic tool for DT and sustainability. Accordingly, the research question is: what are the concepts involved in DT and sustainability, and how are they related? This article has two main contributions: (1) it proposes a canvas for sustainable digital transformation by defining seven key elements; (2) it adds to the body of existing knowledge in DT and sustainability. This paper is structured as follows: Sect. 2 provides the current state of the art; Sect. 3 introduces the research method and proposes the designed canvas; finally, the conclusion is presented in Sect. 4.

2 The Current State of the Art Digital technologies are essential to creating disruption in industries [36]. Companies need to establish strategies for using these technologies in response to the disruption [25]. By using digital technologies, companies must remove obstacles and modify their internal structures so that their value propositions can successfully change [38]. Indeed, DT is about applying digital technologies to deliver things differently to customers [9]. However, due to the global climate crisis and social sustainability concerns, merely considering how business is conducted seems to be insufficient [4, 8]. It is important to deliver things differently, in a more proper and responsible way, to be sustainable and resilient [12, 26]. However, this can never happen without top management support [19, 35]. Table 1 shows the previous research on DT and sustainability by categorizing the main research constructs.

3 Research Design and Results This study is based on the design science approach, a problem-solving paradigm used to develop new concepts, methods, tools, and artifacts that improve existing solutions [17, 39]. Figure 1 shows the design science research process. The first step is about identifying the research problems and defining the research objectives. As mentioned before, DT and sustainability comprise a variety of concepts and areas of research, and there is no general model or artefact for businesses looking for sustainable digitally transformed business models. The research goal is to create a canvas for DT and sustainability to help managers deliver more innovative value propositions in a sustainable way. In the second step, I determine the constructs that affect DT and sustainability based on a bibliometric analysis and systematic literature review [15].

Table 1 Previous research on DT and sustainability (constructs considered per article: Technology, Environment, Resilience, Management, Customer, Competition, Value proposition)

Hess et al. [16]: This paper generates guidelines for senior executives to handle the challenges of formulating a digital transformation strategy.
Zhu et al. [44]: This paper develops an integrative model to study the determinants of post-adoption stages of innovation diffusion, using enterprise digital transformation.
Vial [38]: This paper reviews a comprehensive body of IS literature on digital transformation and introduces the DT process. The author goes on to build a conceptual definition of digital transformation and then proposes a research agenda for future research on digital transformation.
Morakanyane et al. [27]: This paper finds the drivers of DT and shows the impacts of DT. It then proposes a definition of DT by using a systematic literature review.
Matt et al. [24]: In this paper, a digital transformation strategy is formulated which serves as a central concept to integrate the entire coordination, prioritization, and implementation of digital transformations within a firm.
Sambamurthy et al. [33]: In this paper, digital options, IT competence, agility, and competitive actions are investigated to understand how IT impacts performance.
Chandola [4]: This study highlights DT elements that have an impact on sustainability.
Katsamakas [20]: This study offers a business model for sustainability-conscious consumers and suppliers of digital assets.
Pasqualino et al. [29]: This study looks at how sustainable business models may be impacted by optimizing the impact of DT.
Hajishirzi et al. [14]: This study proposes a framework for explaining sustainability through DT.
Li and Liaw [23]: This study presents a data analytics framework for a long-term client base. It expands the insight into consumer behavior in the big data era, which might be crucial in creating a sustainable consumer market.
Irimiás and Mitev [19]: This study examines how the Strategic Action Field Theory relates to change management, digitization, company performance, and green development.
Zhang and Tan [43]: This study builds a model to predict how big data will affect consumer behavior. It shows that data analysis increases customer engagement and purchasing attitude.
Kunz et al. [22]: This study develops a paradigm for understanding how customer engagement impacts the firm's value. One of the main challenges in customer engagement is gathering customer data from different channels.
Schneider et al. [34]: The drivers of business model innovation in the aviation industry are examined in this study.
Alvarez-Milan et al. [2]: They offer a paradigm that demonstrates the significance of customer involvement for long-term competitive advantage.
Kumar and Pansari [21]: They offer a paradigm that illustrates how consumer interaction affects company performance and competitive advantage.
Miceli et al. [26]: In order to illustrate the connection between resilience and sustainability, this study builds a conceptual model. They argue that the concept of organizational resilience should be understood in a subordinate manner, much like the holistic approach to sustainability.


Fig. 1 Design science research process: (1) identify problem and define objectives; (2) define constructs; (3) define hypotheses and theoretical models; (4) validate theoretical models by empirical study; (5) design the canvas; (6) evaluate the canvas

These constructs are Technology Enabled Assets [28], Compatibility [13], Complexity [13], Data Driven [31], Digital Platform [32], External Developer Framework [32], Operational Backbone [32], Business Process Innovation [31], Organizational Resilience [41], Customer Engagement [31, 32], Top Management Support [13], Change Management [19], Industry Pressure [18], Government Regulation [18], Competitive Advantage [31], Accountability Framework [32], Value Proposition Innovation [6], and Sustainability [10, 37, 40]. In the third step, to understand whether the constructs are substantial in this context, I define hypotheses and propose three theoretical models to understand how DT influences sustainability. The purpose of proposing three different theoretical models is to check the validity and reliability of the constructs. In the first model, I integrate the DT process introduced by Vial [38] with the TOE framework [5]. Vial [38] argues that digital technologies create disruption in industries and highlights the role of leaders in managing change and creating the value network. In the second model, I apply the DT domains introduced by Rogers [31] together with organizational resilience and sustainability. In the third model, I propose a model based on the building blocks of DT introduced by Ross et al. [32] and sustainability. Next, I conduct an empirical study to test these models (fourth step). I create a research instrument corresponding to the measurement model and use validated scales to operationalize the constructs and to enhance validity. The results of the first model show that the environmental dimension of the organization impacts the technological dimension and the change management process; in addition, the technological dimension of the organization and change management affect value proposition innovation. The results of the second model reveal that DT impacts sustainability through organizational resilience, customer engagement, and competitive advantage [14]; moreover, data driven and business process innovation impact customer engagement. The results of the third model show that DT affects sustainability through management support, technology, organizational resilience, and value proposition innovation; furthermore, technology impacts value proposition innovation. In the fifth step, I categorize the constructs of the three significant models into a sustainable digital transformation canvas (Fig. 2). This canvas consists of seven key elements: environment, technology, sustainability, value proposition innovation, management, customer engagement, and resilience. The constructs related to the Technology dimension in the canvas are Technology Enabled Assets, Compatibility, Complexity, Data Driven, Digital Platform, External Developer Platform, and Operational Backbone. The constructs related to the Environment dimension in the canvas are Industry Pressure, Government Regulation, and Competitive Advantages.


Fig. 2 Sustainable digital transformation canvas

The constructs related to the Management dimension in the canvas are Business Process Innovation, Top Management Support, Change Management, and Accountability Framework. In addition, the questions for each dimension come from the measurement model used in the fourth step. In the last step, I apply the sustainable digital transformation canvas to a company to evaluate whether this canvas describes a company’s sustainable digital transformation process. I choose an eco-friendly company that brings on-demand waterless car wash services to their customers’ location in Portugal. Their business model is B2C and B2B. In B2C, customers can book the service through a mobile application or website platform; one of the car washers then goes to the address and provides the service. In B2B, the car washer goes to the company’s office and washes the company’s fleet or its employees’ cars. I interview its founders, ask questions related to each dimension, and fill in the canvas (Fig. 3). There is no order to fill this canvas and it is possible to start from any dimension.
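To make the canvas structure concrete, the sketch below encodes the seven dimensions as a simple dictionary. It is only my reading of the text, not the author's artefact: the construct lists for dimensions other than Technology, Environment, and Management are not enumerated in this section, so they are left empty here.

```python
# Illustrative encoding of the sustainable digital transformation canvas.
CANVAS = {
    "Technology": ["Technology Enabled Assets", "Compatibility", "Complexity",
                   "Data Driven", "Digital Platform",
                   "External Developer Platform", "Operational Backbone"],
    "Environment": ["Industry Pressure", "Government Regulation",
                    "Competitive Advantages"],
    "Management": ["Business Process Innovation", "Top Management Support",
                   "Change Management", "Accountability Framework"],
    "Customer engagement": [],          # items not listed in this section
    "Resilience": [],
    "Value proposition innovation": [],
    "Sustainability": [],
}

for dimension, constructs in CANVAS.items():
    print(dimension, "->", ", ".join(constructs) or "(see measurement model)")
```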


Fig. 3 Sustainable digital transformation canvas—case study of an on-demand eco-friendly car wash service

4 Conclusion In this study, the main goal is to propose a canvas that properly explains the sustainable digital transformation process of a company. To fulfil this, first, I have identified the main constructs that assist managers in delivering more digital and innovative value propositions in a sustainable way. Second, I have created three theoretical models which were empirically evaluated. Third, I have categorized these constructs into seven complementary components that enable the creation of a sustainable digitally transformed business model. Fourth, to evaluate this canvas, I have applied it to a specific case and found that the proposed canvas is relevant to sustainable digitally transformed business models. This paper has theoretical implications. It adds to the expanding body of research on digital transformation and sustainability. It proposes a new canvas for the sustainable digital transformation process of organizations. As a practical implication, this study provides a clear picture for managers to understand how they should apply technologies and bring sustainable solutions. It supports all aspects that c-suite leaders need to pay attention to for creating sustainable digitally transformed business models. It illustrates the companies’ need to


respond to environmental changes and industry pressure. It highlights the importance of addressing new markets, new customers, and new channels. It depicts the need for top management support and a change management process. It emphasizes having a marketing strategy for customer engagement. Finally, it underlines the necessity of responding to crises and unexpected changes.

References 1. Ahmed T, Van Looy A (2020) Business process management and digital innovations: a systematic literature review. Sustainability 12(17):6827. https://doi.org/10.3390/su12176827 2. Alvarez-MilanReto A, Felix R, Rauschnabel PhA, Hinsch Ch (2018) Strategic customer engagement marketing: a decision making framework. J Bus Res 92:61–70. https://doi.org/ 10.1016/j.jbusres.2018.07.017 3. Carbalho JMS, Jonker J (2015) Creating a balanced value proposition exploring the advanced business creation model. J Appl Manag & Entrep 20(2):49–64. https://doi.org/10.9774/GLEAF. 3709.2015.ap.00006 4. Chandola V (2016) Digital transformation and sustainability 5. Chiu C-Y, Chen S, Chen C-L (2017) An integrated perspective of toe framework and innovation diffusion in broadband mobile applications adoption by enterprises. Int J Manag Econ Soc Sci 6(1):14–39 6. Clauss Th (2017) Measuring business model innovation: conceptualization, scale development and proof of performance. R&D Manag 47(3):385–403. https://doi.org/10.1111/radm.12186 7. Close K, Faure N, Hutchinson R (2021) How tech offers a faster path to sustainability. Boston Consulting Group 8. Feroz AK, Zo H, Chiravuri A (2021) Digital transformation and environmental sustainability: a review and research agenda. Sustainability 13(3):1530. https://doi.org/10.3390/su13031530 9. Fitzgerald M, Kruschwitz N, Bonnet D, Welch M (2014) Embracing digital technology: a new strategic imperative. MIT Sloan Manage 55(2):1–12 10. Goodland R (1995) The concept of environmental sustainability. Annu Rev Ecol Syst 26:1–24 11. Grimstad S, Burgess J (2014) Environmental sustainability and competitive advantage in a wine tourism micro-cluster. Manag Res Rev 37(6):553–573. https://doi.org/10.1108/MRR-012013-0019 12. Guandalini I (2022) Sustainability through digital transformation: a systematic literature review for research guidance. J Bus Res 148:456–471. https://doi.org/10.1016/j.jbusres.2022.05.003 13. Gutierrez A, Boukrami E, Lumsden R (2015) Technological, organisational and environmental factors influencing managers’ decision to adopt cloud computing in the UK. J Enterp Inf Manag 28(6):788–807. https://doi.org/10.1108/JEIM-01-2015-0001 14. Hajishirzi R, Costa CJ, Aparicio M (2022a) Boosting sustainability through digital transformation’s domains and resilience. Sustainability 14(3):1822. https://doi.org/10.3390/su1403 1822 15. Hajishirzi R, Costa CJ, Aparicio M, Romao M (2022b) Digital transformation framework: a bibliometric approach. In: information systems and technologies. SpringerLink 16. Hess T, Matt C, Benlian A, Wiesboeck F (2016) Options for formulating a digital transformation strategy. MIS Q Exec 15(2):123–139 17. Hevner AR, March ST, Park J, Ram S (2004) Design science in information systems research. MIS Q 28(1):75–105 18. Ilin V, Iveti´c J, Simi´c D (2017) Understanding the determinants of e-business adoption in ERP-enabled firms and non-ERP-enabled firms: a case study of the Western Balkan Peninsula. Technol Forecast Soc Chang 125:206–223


19. Irimiás A, Mitev A (2020) Change management, digital maturity, and green development: are successful firms leveraging on sustainability? Sustainability 12(10):4019. https://doi.org/10. 3390/su12104019 20. Katsamakas E (2022) Digital transformation and sustainable business models. Sustainability 14(11):6414. https://doi.org/10.3390/su14116414 21. Kumar V, Pansari A (2016) Competitive advantage through engagement. J Mark Res 53(4):497– 514 22. Kunz W, Aksoy L, Bart Y, Heinonen K, Kabadayi S, Villaroel Ordenes F, Sigala M, Diaz D, Theodoulidis B (2017) Customer engagement in a big data world. J Serv Mark 31(2):161–171. https://doi.org/10.1108/JSM-10-2016-0352 23. Le TM, Liaw SY (2017) Effects of pros and cons of applying big data analytics to consumers’ responses in an e-commerce context. Sustainability 9(5):798. https://doi.org/10.3390/su9 050798 24. Matt C, Hess T, Benlian A (2015) Digital transformation strategies. Bus Inf Syst Eng 57(5):339– 343 25. Matzler K, von den Eichen SF, Anschober M, Kohler Th (2018) The crusade of digital disruption. J Bus Startegy 39(6):13–20. https://doi.org/10.1108/JBS-12-2017-0187 26. Miceli A, Hagen B, Riccardi MP, Sotti F, Settembre-Blundo D (2021) Thriving, not just surviving in changing times: how sustainability, agility and digitalization intertwine with organizational resilience. Sustainability 13(4):2052. https://doi.org/10.3390/su13042052 27. Morakanyane R, Grace A, O’Reilly Ph (2017) Conceptualizing digital transformation in business organizations: a systematic review of literature. Bled, Slovenia 28. Nwankpa JK, Roumani Y (2016) IT capability and digital transformation: a firm performance perspective 29. Pasqualino R, Demartini M, Bagheri F (2021) Digital transformation and sustainable oriented innovation: a system transition model for socio-economic scenario analysis. Sustainability 13(21):11564. https://doi.org/10.3390/su132111564 30. Patala S, Jalkala A, Keränen J, Väisänen S, Tuominen V, Soukka R (2016) Sustainable value propositions: framework and implications for technology suppliers. Ind Mark Manag 59:144– 156. https://doi.org/10.1016/j.indmarman.2016.03.001 31. Rogers D (2016) The digital transformation playbook: rethink your business for the digital age. Columbia Business School Publishing 32. Ross JW, Beath C, Mocker M (2019) Designed for digital, how to architect your business for sustained success. Management on the Cutting Edge 33. Sambamurthy V, Bharadwaj AS, Grover V (2003) Shaping agility through digital options: reconceptualizing the role of information technology in contemporary firms. MIS Q 27(2):237– 263. https://doi.org/10.2307/30036530 34. Schneider S, Spieth P, Clauß T (2013) Business model innovation in the aviation industry. Int J Prod Dev 18:286–310 35. Sebastian IM, Ross JW, Beath C, Mocker M, Moloney KG, Fonstad NO (2017) How big old companies navigate digital transformation. Mis Quarterly Executive 16(3) 36. Skog DA, Wimelius H, Sandberg J (2018) Digital disruption. Bus Inf Syst Eng 60(4):431–437. https://doi.org/10.1007/s12599-018-0550-4 37. Spangenberg JH (2005) Economic sustainability of the economy: concepts and indicators. Int J Sustain Dev 8(1–2). https://doi.org/10.1504/IJSD.2005.007374 38. Vial G (2019) Understanding digital transformation: a review and a research agenda. J Strat Inf Syst 28:118–144. https://doi.org/10.1016/j.jsis.2019.01.003 39. Weigand H, Johannesson P, Andersson B (2021) An artifact ontology for design science research. Data Knowl Eng 133:101878. 
https://doi.org/10.1016/j.datak.2021.101878 40. Woodcraft S (2015) Understanding and measuring social sustainability. J Urban Regen Renew 8(2):133–144 41. Xiao L, Cao H (2017) Organizational resilience: the theoretical model and research implication 42. Yang M, Vladimirova D, Evans S (2017) Creating and capturing value through sustainability. Res-Technol Manag 60(3):30–39. https://doi.org/10.1080/08956308.2017.1301001


43. Zhang C, Tan T (2020) The impact of big data analysis on consumer behavior. J Phys Conf Series 1544(1). https://doi.org/10.1088/1742-6596/1544/1/012165 44. Zhu K, Dong S, Xin Xu S, Kraemer KL (2006) Innovation diffusion in global contexts: determinants of post-adoption digital transformation of European companies. European J Inf Syst 15:601–616. https://doi.org/10.1057/palgrave.ejis.3000650 45. Ziemba E, Eisenbardt M (2019) Consumer engagement in business process innovation: cases of the firms operating in the ICT Sector. Manag Issues 17:24–39. https://doi.org/10.7172/16449584.85.2

The Role of Community Pharmacies in Smart Cities: A Brief Systematic Review and a Conceptual Framework Carla Pires and Maria José Sousa

Abstract Community pharmacies are responsible for buying, stocking, and dispensing medicines. These businesses play a central role in healthcare systems. Study aim: To define possible contributions of community pharmacies in smart cities. Research question: How can community pharmacies contribute to smart cities? Methods: A brief systematic review was carried out. Three databases were browsed: PubMed, SciELO and Google Scholar. Databases were conveniently selected. Inclusion criteria: any study specifically about the involvement of community pharmacies in smart cities. Results: Only 5 papers were selected (5 out of 104). Selected papers were about the safe transport of medicines, e-prescribing, electronic product information, use of Information and Communication Technology (ICT) and pharmaceutical services. Discussion: Community pharmacies can provide diverse health services in smart cities, such as collecting patients’ clinical data, re-dispensing of reused medicines, applying diagnostic tools, generating automatic alerts about precautions/contraindications of medicines, following up patients, and/or alerting patients about the necessity of routine exams. Conclusion: The number of studies on the present topic was very limited. The massive and coordinated integration of community pharmacies and pharmaceutical services in smart cities is lacking. ICTs, artificial intelligence, and the internet of things, among others, seem to be crucial for the integration of community pharmacies in smart cities. Further studies are recommended. Keywords Smart city · Community pharmacy · Pharmacy · Pharmacists · Health · Big data · Systematic reviews · Information and communication technology · ICT

C. Pires (B) CBIOS - Universidade Lusófona’s Research Center for Biosciences and Health Technologies, Lisbon, Portugal e-mail: [email protected] M. J. Sousa ISCTE-IUL - Instituto Universitário de Lisboa, Lisbon, Portugal e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 S. Anwar et al. (eds.), Proceedings of International Conference on Information Technology and Applications, Lecture Notes in Networks and Systems 614, https://doi.org/10.1007/978-981-19-9331-2_54


1 Introduction Smart sustainable cities can be defined as “innovative cities that uses information and communications technology (ICT) and other means to improve quality of life, efficiency of urban operation and services, and competitiveness, while ensuring that it meets the needs of present and future generations with respect to economic, social, environmental as well as cultural aspects” [10]. Currently, the topic of the integration of community pharmacies in smart cities is arising in the literature. For instance, according to the agenda of the United Arab Emirates (UAE) 2020, “UAE aims to be a successful smart city by harnessing digital innovation in all its endeavors”, while an online system for prescribing and buying pharmaceutical products is desired by retail pharmacists and patients; e-pharmacy dispensing services will reduce the waiting times of patients in pharmacies [28]. Another study proposes an innovative approach aiming at optimizing medicines delivery with cost savings; in the proposed model, pharmacists can be engaged with another pharmacy shop in the process of distributing medicines [7]. Community pharmacies are businesses within the health sector, where medicine- and patient-oriented activities are carried out. Globally, community pharmacies are widespread in cities. Pharmacists need adequate facilities, equipment, infrastructures, and sources of information to develop, manage, and implement these activities [24]. Community pharmacists are responsible for dispensing, reviewing, and prescribing medicines, and for administering some pharmaceutical preparations, such as injectables. These healthcare professionals are essential pillars of pharmacy care. The dispensation of medicines covers a significant and complex number of acts/procedures, such as “procuring, storing, preparing, compounding, reviewing, recording, counselling and giving medicine to a named person on the basis of a prescription”. The professional act of reviewing therapeutics is associated with the procedure of checking the appropriateness of prescribed medicines for a certain patient. In general, the review of therapeutics should be based on research and evidence, collegiality, digital health, sustainability, and the effect of health policy. The prescription of medicines by community pharmacists is mainly centered on minor ailments covered by over-the-counter medicines (e.g., obstipation, colds, musculoskeletal pain, etc.) and, more recently, other types of ailments, such as emergency contraception or urinary tract infections in some countries [13]. Community pharmacy services are very diversified, encompassing a wide range of pharmaceutical interventions, such as medication review, patient education (e.g., adherence to treatments), lifestyle advice, physical assessment, and the monitoring, prescribing, adjusting and/or administering of medicines. Importantly, community pharmacy services produce positive clinical outcomes. For instance, significant reductions in systolic blood pressure, diastolic blood pressure or HbA1c were achieved in pharmaceutical care programs/consultations [35]. Health records are fundamental to ensure appropriate consultations and patients’ counselling by healthcare professionals [11, 14, 17]. Centralized e-Health Records were developed in national databases in diverse countries, such as Portugal, Estonia, or the UK.


These databases comprise clinical information, which is available online. For instance, clinical data can be consulted in an e-Patient website by patients or physicians in Portugal [33]. Online databases are powerful tools for healthcare professionals since all patient records can be consulted in a single access point [5, 33]. However, access to patients’ clinical data by pharmacists is only possible in some countries. For instance, in the UK pharmacists have access to the Summary Care Record (SCR), which is “an electronic record of important patient information, created from general practitioner medical records”. The SCR can be seen and used by authorized staff in other areas (e.g., 26,670 views of the SCR in UK community pharmacies on 6 December 2021) [21]. However, the use of the SCR by pharmacies is limited to some situations, such as emergency supply (e.g., the pharmacist can check the name of a medicine, its dosage, and frequency), adverse reactions (e.g., if the patient does not remember the name of allergies), or self-care, such as providing vaccinations and other services (e.g., to check if the patient is eligible for free flu vaccination) [11, 22]. Thus, the study aim was to define possible contributions of community pharmacies in smart cities. Research question: How can community pharmacies contribute to smart cities?

2 Proposed Methodology A systematic review was carried out. The Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) checklist and flow diagram were followed [25, 26]. The search was carried out on 1-1-2021, without time restrictions.

2.1 Databases and Keywords Three databases were browsed: PubMed, SciELO and Google Scholar. These databases were conveniently selected since they mainly comprise peer-reviewed publications and a large number of papers. In particular, Google Scholar is estimated to include about 160 million documents [23], followed by PubMed and SciELO, comprising only 33 million citations and 48,987 articles, respectively [27, 30]. The selected keywords were as follows: (“Community Pharmacy(cies)” and “Smart city(ies)”), i.e., [“community pharmacy” and “smart city”], [“community pharmacy” and “smart cities”], [“community pharmacies” and “smart city”], and [“community pharmacies” and “smart cities”]. These keywords were conveniently selected to accurately identify studies related to the present topic.
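As an illustration of the search strategy (a sketch under my own assumptions, not the authors' script), the four keyword combinations can be generated as follows:

```python
# Build the four database query strings used for the review.
from itertools import product

pharmacy_terms = ['"community pharmacy"', '"community pharmacies"']
city_terms = ['"smart city"', '"smart cities"']

queries = [f"{a} AND {b}" for a, b in product(pharmacy_terms, city_terms)]
for query in queries:
    print(query)   # e.g., "community pharmacy" AND "smart city"
```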


2.2 Inclusion and Exclusion Criteria Inclusion criteria: any study (qualitative, quantitative, review, or other) specifically about the involvement (or potential involvement) of community pharmacists in smart cities. Studies related to other topics were excluded.

3 Results 3.1 PRISMA 2020 Flow Diagram for New Systematic Reviews The identification of studies per database (SciELO, PubMed, and Google Scholar) is presented in Fig. 1. Almost all selected studies were identified in Google Scholar (n = 4). Only one study was selected in PubMed, and no study was selected in SciELO.

3.2 Studies Related to Community Pharmacies in Smart Cities The selected studies (n = 5) on the integration of community pharmacies in smart cities are presented in Table 1. All studies are organized by author, year of publication, geographic origin, study aims, methods, findings and discussion and conclusion.

4 Discussion 4.1 Community Pharmacies in Smart Cities Smart cities, including their community pharmacies, need advanced telecommunications, ICT and/or digital services to efficiently operate all communications, such as digital enterprise infrastructures, digitization of the supply chain, digitization of operations and processing, and/or digitization of the distribution channels [19]. For instance, diverse European countries have approved a regulatory framework and implemented health policies to ensure the adoption of e-prescription. E-prescription will increase in the EU in the next years because of an increasing adoption of telehealth services and the adoption of artificial intelligence to eliminate potential prescription errors and/or to optimize prescriptions [29]. Recently, it became possible for EU citizens to use e-Prescriptions in another EU country: “Finnish patients are now able to go to a pharmacy in Estonia and retrieve medicine prescribed electronically by their doctor in Finland” [6].


Fig. 1 PRISMA 2020 flow diagram for new systematic reviews: studies related to community pharmacies in smart cities. Identification: records identified from databases: SciELO (n = 0), PubMed (n = 2), Google Scholar (n = 102); duplicate records removed before screening (n = 35). Screening: records screened (n = 69), records excluded (n = 0); reports sought for retrieval (n = 69), reports not retrieved (n = 1); reports assessed for eligibility (n = 68), reports excluded for other topics (n = 63). Included: studies included in review (n = 5): SciELO (n = 0), PubMed (n = 1), Google Scholar (n = 4)

On the contrary, these services have only recently started to be implemented in other countries, such as the UAE [28]. Surprisingly, the number of selected studies on the integration of community pharmacies in smart cities (or related to this topic) was very limited in the present brief systematic review. However, pertinent issues have been addressed in the selected studies, such as the safe transportation of medicines [4], electronic package inserts of medicines [8], e-prescribing, electronic health records, mobile apps, and online pharmacies [16, 28], and pharmaceutical care services [3]. The selected studies covered a limited number of countries, convenient samples of participants, or convenient settings; that is, representative studies involving the active participation of community pharmacies or pharmaceutical services in smart cities have not been identified.

Table 1 Studies related to the integration of community pharmacies in smart cities (n = 5)

Edoh, 2017 [4], Germany. Aims: To safely transport medicines. Methods: An IoT-based smart medicine transportation and medication monitoring system was applied, for instance, in developing countries, where medicine supply issues are significant. Findings: After an email order, medicines are safely transported and monitored. Discussion and conclusion: The present system improved safe access to pharmaceutical products, which can contribute to smarter cities.

Rahaman et al., 2019 [28], United Arab Emirates. Aims: To gain insight into how traditional ways of prescribing medicines are preferred over e-Prescription to reduce errors in dispensing medication. Methods: Two questionnaires (community pharmacists and patients) to evaluate the challenges faced with the current operational retail pharmacies (142 patients and 58 pharmacists). Findings: Both patients and pharmacists declared to trust the process of dispensing health technologies online; reported issues were doctors’ handwriting and a significant wait time to buy medicines at pharmacies. Discussion and conclusion: Both patients and pharmacists recognize advantages in online pharmacies, namely the reduction of waiting time at pharmacies or the clarity of e-prescriptions.

Fung et al., 2020 [8], Hong Kong. Aims: To explore working pharmacists’ overall perception of electronic product information (ePI) and to identify potential challenges to the implementation of an ePI system. Methods: Mixed method: a structured survey and interview. Participants: 16 pharmacists (4 community pharmacists, 5 hospital pharmacists, and 7 industrial pharmacists). Findings: Issues related to the consultation of paper package inserts (PI) can be mitigated with an ePI system. Advantages: ePI platforms can be quickly updated and retrieved. Disadvantages: less e-literate citizens can have more difficulties. Discussion and conclusion: ePI systems should be optimized to ensure their usability. ePI systems can be integrated in smart cities, which is aligned with the vision of “The Smart City Blueprint for Hong Kong” [34] (https://www.smartcity.gov.hk/).

Kc et al., 2020 [16], Malaysia. Aims: To determine the positioning and roles of Information and Communication Technology (ICT) in community pharmacies in the state of Selangor, Malaysia. Methods: Questionnaire on ICT use in community pharmacies (CF) and the health sector (60 community pharmacies). Findings: CF: 77% electronic health records, 50% social media platforms to promote their services, 92% electronic payment systems, 68% software/programs for accounting and logistics, 78% a barcode reading system, and 27% e-commerce. Mobile apps were applied to provide health services. Discussion and conclusion: CF are using ICT in the state of Selangor, Malaysia. Pharmacy services are adapting to be suitable for modern smart cities.

Chen et al., 2021 [3], Taiwan. Aims: To assess the feasibility and performance of pharmaceutical services for geriatric patients. Methods: Questionnaires pre and post intervention (e.g., satisfaction with services). Services: dispensed prescriptions, medication adherence, cognitive, and home and institutional medical care. Participants: 264 females and 253 males. Findings: More than 90% of participants believed that pharmaceutical services were helpful to improve cognition and behaviour related to medication. Patients’ adherence, behaviours, and knowledge on medication significantly improved. Discussion and conclusion: Pharmaceutical care services for geriatric patients in community pharmacies are possible. Future research: Internet of Things and artificial intelligence will be used in a model of care to satisfy cost-benefit and the needs of elderly people.


It is important to notice that the covered topics (i.e., transportation of medicines, electronic package inserts, etc.) can be found in other studies, but not specifically in the scope of smart cities. These findings seem to confirm the existence of a gap in the actual state of the art regarding the integration of community pharmacies in smart cities. Furthermore, the search was repeated with some of the selected keywords on 24-8-2022, with similar findings being achieved (i.e., representative research was not found and the number of identified studies on the present topic was also very limited).

4.2 Reply to the Research Question The relevance of using clinical data and ICTs is recognized in diverse works, although the need to integrate the services of community pharmacies in the scope of smart cities was only identified in a limited number of studies. Hospital and community pharmacies can contribute to smart cities through the following (item (iv) is illustrated with a sketch after this list):
(i) receiving prescriptions and/or medicines from hospitals, which can be dispensed in community pharmacies (i.e., avoiding unnecessary hospital visits by patients);
(ii) automatically controlling the number of taken medicines and the patients’ adherence to therapeutics;
(iii) sending automatic messages to patients about the rational use of medicines (e.g., hour, number of pills, etc.);
(iv) generating automatic alerts to healthcare professionals (e.g., allergies, pregnancy, precautions or contraindications of medicines);
(v) producing automatic alerts to patients if medicines or medical devices present quality or safety problems (e.g., anomalies in batch production of medicines) or new precautions/contraindications of medicines/medical devices;
(vi) updating patients’ clinical data, such as glycemic control, arterial pressure, height, body mass index, lipid profile, or others;
(vii) providing virtual pharmaceutical consultations at any point of the city;
(viii) applying the technology of the internet of things (e.g., automatic management of pharmaceutical waste, stocks, or the provision of pharmaceutical services);
(ix) waste management of medicines out of validity;
(x) coordination and management of the stocks of medicines and medical devices at a national level, which can contribute to reducing the stocks in hospitals and, consequently, national health costs;
(xi) eventual reuse of returned medicines, after quality checking;
(xii) automatic generation of an alert to patients or healthcare professionals if medicines are outside their conservation conditions (e.g., temperature or light);
(xiii) providing the application of validated tools to carry out differential diagnosis, followed by the introduction of the collected information in a national database;
(xiv) a more reliable follow-up of patients, with the help of artificial intelligence tools;
(xv) alerting patients and physicians if a patient needs to repeat a certain routine exam, e.g., colonoscopy or mammography; and
(xvi) signalizing citizens with significant health or social problems, for instance by informing the national health system or other social institutions through an integrated national database.
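As referenced in item (iv) above, the sketch below shows, purely as an assumption of how such a rule could look (the medicines, conditions, and table entries are hypothetical and not from any cited system), a minimal contraindication check that a connected pharmacy system could run before dispensing:

```python
# Hypothetical rule-based dispensing alert for item (iv).
from dataclasses import dataclass, field

@dataclass
class PatientRecord:
    allergies: set = field(default_factory=set)
    conditions: set = field(default_factory=set)   # e.g., {"pregnancy"}

# Illustrative contraindication table (medicine -> conditions/allergies to flag).
CONTRAINDICATIONS = {
    "ibuprofen": {"pregnancy", "peptic ulcer"},
    "amoxicillin": {"penicillin allergy"},
}

def dispensing_alerts(medicine: str, patient: PatientRecord) -> list:
    """Return alert messages for the pharmacist before dispensing."""
    flags = CONTRAINDICATIONS.get(medicine.lower(), set())
    risks = flags & (patient.allergies | patient.conditions)
    return [f"ALERT: {medicine} is contraindicated with '{risk}'" for risk in risks]

print(dispensing_alerts("Ibuprofen", PatientRecord(conditions={"pregnancy"})))
```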


4.3 National Health Databases: Possible Application in Smart Cities Besides ensuring confidentiality of data and cybersecurity, e-health services provided by community pharmacies must be implemented under a well-defined regulatory framework. The linkage between national databases comprising citizens’ clinical data and community pharmacies seems to be fundamental for the provision of safer and more rational pharmaceutical services. However, these connections only exist in some countries, such as the UK [22]. Diverse studies recognize the relevance of using digital clinical data, namely for improved care services in community or hospital pharmacies [11, 14, 17]. Potential advantages of web-based pharmacy services are described by Goundrey-Smith [11], as follows: patients’ discharge information from hospitals is available to physicians and pharmacists; patients’ questions about medicines are registered; community pharmacists can audit the quality of prescriptions or other health-related information generated by general practitioners; better quality of care; prevention of dispensing unnecessary medicines (e.g., during patient hospitalization); integrative use of the available information (e.g., medicines review), etc. [11]. Globally, the connectivity between community pharmacies and national databases that ensure access to patients’ clinical data is widely heterogeneous. For instance, some countries like the UK, Portugal, or Estonia already maintain and use national health registers and databases, with a disseminated use of e-prescriptions [5, 21, 33]. Australian community pharmacists also have access to the health records of citizens through “My Health Record” (MHR), which is a digital health record system accessible by patients and different healthcare professionals. Overall, around 5700 Australian pharmacies (83%) were registered with access to this system as of April 2019. Positively, diverse advantages of the MHR system are recognized by Australian community pharmacists, such as improved medication safety, continuity of care, and quality of care [17]. Besides the access to key health information (medicines, allergies and adverse reactions, summaries of a patient’s medical history, conditions and treatments, hospital discharge summaries, immunizations, pathology reports and specialist letters), access to the MHR ensures diverse benefits for pharmacists, such as: a more complete picture of patients’ health through timely access to their key health information; medication management services, including medicine reviews; improved efficiency of services, since information is concentrated in just one file; provision of tailored advice; continuity of patient care; and better interprofessional collaboration between different healthcare professionals [1].


4.4 Information and Communication Technologies (ICT): 5G, Internet of Things, Blockchain, and Artificial Intelligence as Resources for Community Pharmacies in the Scope of Smart Cities In smart cities, ICTs (e.g., apps) are applied in business for the effective delivery and functioning of services, consequently ensuring a better quality of life for citizens [2, 9]. Thus, the introduction of ICT in community pharmacies seems to be determinant for their successful integration in smart cities (e.g., e-commerce, use of electronic health records, electronic payments, or alerts) [16]. In general, ICTs are crucial to support the implementation of pharmaceutical services, such as the virtual provision of pharmaceutical services (e.g., for diabetic or geriatric patients) or the supply of information through hot spots (e.g., product information about medicines or medical devices, such as package inserts or other types of patient information) [3, 8]. Additionally, ICTs can be used to collect information about the number and type of medicines out of validity in community pharmacies, since waste management is essential to control pollution and to ensure sustainability in smart cities, eco-cities, and low-carbon cities [31]. At least some of these medicines can be recycled, which supports a circular economy. For instance, intelligent packaging can be applied to ensure a safe reuse of medicines, with the application of the internet of things to monitor the flow of medicines in society. Community pharmacists can accept, examine and re-dispense some of the returned medicines with the help of intelligent packaging technologies [12]. In particular, the use of 5G, artificial intelligence, blockchain or the internet of things in community services is expected to improve services [2, 15]. For instance, blockchain can be used for securing patient and provider identities, managing pharmaceutical supply chains, and clinical research, among many others, while homes can be equipped with environmental and biological sensors through 5G networks to monitor patient health: clinical signs/parameters can be sent to clinicians and pharmacists (when applicable) [18]. There are diverse artificial intelligence tools that can be integrated in the services offered by community pharmacies, such as ViPRx, which quickly analyzes electronic medical records of patients aiming at ensuring better patient outcomes, or mPulse™, which contributes to improving patient adherence [20]. The internet of things can be applied to reduce the waiting time at pharmacies [32] or to control the transportation and conservation of medicines (in transit, in the pharmacy, or in patients’ homes) [4].
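As a small, hypothetical illustration of the last point (monitoring the conservation of medicines), the sketch below flags a sensor reading that falls outside an assumed 2-8 °C storage range; the medicines and ranges are illustrative assumptions, not clinical guidance and not taken from any cited system:

```python
# Illustrative IoT-style cold-chain check for medicines in transit or storage.
from datetime import datetime
from typing import Optional

STORAGE_RANGES_C = {"insulin": (2.0, 8.0), "amoxicillin suspension": (2.0, 8.0)}

def check_reading(medicine: str, temperature_c: float, when: datetime) -> Optional[str]:
    """Return an alert message if the reading is outside the allowed range."""
    low, high = STORAGE_RANGES_C[medicine]
    if not (low <= temperature_c <= high):
        return (f"{when.isoformat()}: {medicine} at {temperature_c:.1f} C is outside "
                f"the {low}-{high} C range - alert the pharmacist and the patient")
    return None

print(check_reading("insulin", 11.3, datetime(2022, 8, 24, 10, 30)))
```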

4.5 Implications to Practice and Future Research

Regulators, health institutions, medicines authorities, and governments should be involved in the active incorporation of community pharmacies in smart cities, which can contribute to reducing the economic burden on the health sector. Research and pilot studies involving the participation of community pharmacies in smart cities are recommended, since the number of studies in this area is very limited. Ideally, pilot studies need to be scaled up to ensure an intelligible integration and coordination of all community pharmacies in smart cities (e.g., standardized virtual pharmaceutical consultations, or communication between community pharmacists and citizens or between community pharmacists and other healthcare professionals). Further studies are recommended to evaluate the economic implications or the impact of using 5G, the internet of things, blockchain, and artificial intelligence in community pharmacies. The integration of community pharmacies in smart cities is highly recommended at a global level, since both developed and developing countries can benefit from it. Finally, the quality and scientific soundness of the information processed should be supervised (e.g., through data mining methodologies) to avoid imprecisions or the propagation of fake news [36, 37].

5 Conclusion

Studies specifically addressing the integration of community pharmacies in smart cities are almost nonexistent. Representative studies on the present topic are lacking (e.g., involving the integration and coordination of all the community pharmacies of a given smart city). Besides access to patients’ clinical data, ICTs, 5G, artificial intelligence, blockchain, and the internet of things seem to be crucial for the services of community pharmacies in smart cities. New technologies can be applied to monitor and transport medicines, to manage drug waste, to provide virtual pharmaceutical consultations at any point in the city, to generate automatic alerts on interactions and precautions of medicines (e.g., by email, mobile phone, or other alerts), to carry out diverse types of pharmaceutical services (e.g., application of diagnostic tools), or to monitor patient adherence, etc. Positively, health information or patient counselling can be provided through more comprehensible virtual or real-time messages. Overall, it is expected that the present brief review raises public and private awareness of this topic, for instance regarding the massive and coordinated integration of community pharmacies in future smart cities or the development of new regulations on the matter.

References

1. Australian Government (2021) Pharmacists—better access to healthcare information for you and your patients. https://www.myhealthrecord.gov.au/sites/default/files/csr-478_-_pharmacists_factsheet.pdf?v=1578279349. Accessed on 1 Jan 2022
2. Belli L, Cilfone A, Davoli L, Ferrari G, Adorni P, Di Nocera F, Dall’Olio A, Pellegrini C, Mordacci M, Bertolotti E (2020) IoT-enabled smart sustainable cities: challenges and approaches. Smart Cities 3(3):1039–1071


3. Chen S-C, Lee K-H, Horng D-J, Huang P-J (2021) Integrating the public health services model into age-friendly pharmacies: a case study on the pharmacies in Taiwan. Healthcare 9(11):1589 4. Edoh T (2017) Smart medicine transportation and medication monitoring system in EPharmacyNet. In: International rural and elderly health informatics conference (IREHI), pp 1–9 5. e-estonia (2021) Healthcare. https://e-estonia.com/solutions/healthcare/e-health-records/. Accessed on 1 Jan 2022 6. European Commission (2019) First EU citizens using e-Prescriptions in other EU country. https://ec.europa.eu/commission/presscorner/detail/en/IP_18_6808. Accessed on 1 Jan 2022 7. Fanti MP, Mangini AM, Roccotelli M, Silvestri B (2020) Drugs cross-distribution management in urban areas through an incentives scheme. IFAC-Papers OnLine 53(2):17053–17058 8. Fung EWT, Au-Yeung GTF, Tsoi LM, Qu L, Cheng TKW, Chong DW, Lam TTN, Cheung YT (2020) Pharmacists’ perceptions of the benefits and challenges of electronic product information system implementation in Hong Kong: mixed-method study. J Med Internet Res 22(11):e20765 9. Froehlich A, Siebrits A, Kotze C (2021) e-Health: how evolving space technology is driving remote healthcare in support of SDGs. In: Space supporting Africa. Studies in space policy, 27. https://doi.org/10.1007/978-3-030-61780-6_2 10. Gil-Garcia JR, Zhang J, Puron-Cid G (2016) Conceptualizing smartness in government: an integrative and multi-dimensional view. Gov Inf Q 33:524–534 11. Goundrey-Smith S (2018) The connected community pharmacy: benefits for healthcare and implications for health policy. Front Pharmacol 9:1352 12. Hui T, Mohammed B, Donyai P, McCrindle R, Sherratt RS (2020) Enhancing pharmaceutical packaging through a technology ecosystem to facilitate the reuse of medicines and reduce medicinal waste. Pharmacy (Basel, Switzerland) 8(2):58 13. International Pharmaceutical Federation (2020) Vision 2020–2025—Pharmacists at the heart of our communities. Community pharmacy section. International Pharmaceutical Federation (FIP), The Hague, Netherlands. https://www.fip.org/files/CPS_vision_FINAL.pdf. Accessed 1 Jan 2022 14. Jackson S, Peterson G (2019) My health record: a community pharmacy perspective. Aust Prescriber 42(2):46–47. https://doi.org/10.18773/austprescr.2019.009 15. Kamel Boulos MN, Wilson JT, Clauson KA (2018) Geospatial blockchain: promises, challenges, and scenarios in health and healthcare. Int J Health Geogr 17(1):25 16. Kc B, Lim D, Low CC, Chew C, Blebil AQ, Dujaili JA, Alrasheedy AA (2020) Positioning and utilization of information and communication technology in community pharmacies of Selangor, Malaysia: cross-sectional study. JMIR Med Inf 8(7) 17. Kosari S, Yee KC, Mulhall S, Thomas J, Jackson SL, Peterson GM, Rudgley A, Walker I, Naunton M (2020) Pharmacists’ perspectives on the use of my health record. Pharmacy (Basel, Switzerland) 8(4):190 18. Kua KP, Lee S (2021) The coping strategies of community pharmacists and pharmaceutical services provided during COVID-19 in Malaysia. Int J Clin Pract 75(12):e14992 19. Lynn T, Rosati P, Fox G, O’Gorman C, Conway E, Curran D (2020) Addressing the urban-townrural divide: the digital town readiness assessment framework. In: ICDS 2020: The fourteenth international conference on digital society. ISBN: 978-1-61208-760-3 20. Nadeem MF, Matti N, Parveen S, Rafiq S (2021) Incessant threat of COVID-19 variants: highlighting need for a mix of FDA-approved artificial intelligence tools and community pharmacy services. 
Res Soc Adm Pharm S1551–7411(21)00276-X. https://doi.org/10.1016/j.sapharm. 2021.07.018 21. NHS (2021) Summary care records data. https://digital.nhs.uk/data-and-information/datatools-and-services/tools-for-accessing-data/deployment-and-utilisation-hub/summary-carerecords-deployment-and-utilisation. Accessed on 1 Jan 2022 22. NHS digital (2021) Summary care record in community pharmacy. https://digital.nhs. uk/services/summary-care-records-scr/summary-care-record-scr-in-community-pharmacy. Accessed on 1 Jan 2022


23. Orduña-Malea E, Ayllón JM, Martín-Martín A, Delgado López-Cózar E (2014) About the size of Google Scholar: playing the numbers. Granada: EC3 Working Papers 18:23 24. Ordem dos Farmacêuticos (2009) Good pharmaceutical practices for community pharmacy [Boas Práticas Farmacêuticas para a farmácia comunitária (BPF)]. https://www.ordemfarm aceuticos.pt/fotos/documentos/boas_praticas_farmaceuticas_para_a_farmacia_comunitaria_ 2009_20853220715ab14785a01e8.pdf. Accessed on 11 Dec 2021 25. Page MJ, McKenzie JE, Bossuyt PM, Boutron I, Hoffmann TC, Mulrow CD et al (2020) The PRISMA 2020 statement: an updated guideline for reporting systematic reviews. BMJ 372:n71 26. PRISMA (2021) Preferred reporting items for systematic reviews and meta-analyses: checklist and flow diagram. http://www.prisma-statement.org/. Accessed 1 Jan 2022 27. PubMed (2021) National Library of Medicine. https://pubmed.ncbi.nlm.nih.gov/. Accessed on 1 Jan 2022 28. Rahaman S, Mohammed S, Manchanda T, Mahadik R (2019) e-Pharm assist: the future approach for dispensing medicines in smart cities. Int Conf Digitizat (ICD) 2019:263–267 29. ReportLinker (2021) Europe E-prescribing market—industry outlook and forecast 2021– 2026. https://www.reportlinker.com/p06074892/Europe-E-prescribing-Market-Industry-Out look-and-Forecast.html?utm_source=GNW. Accessed on 1 Jan 2022 30. SciELO (2021) Scientific electronic library online. https://www.scielosp.org/. Accessed on 1 Jan 2022 31. Sharma K, Kaushik G (2021) Urbanization and pharmaceutical waste: an upcoming environmental challenge. In: Kateja A, Jain R (eds) Urban growth and environmental issues in India. Springer, Singapore 32. Shauty A, Dachyar M (2020) Outpatient pharmacy improvement using internet of things based on BPR approach. Int J Adv Sci Technol 29(7s):3597–3604 33. SNS (2021) Portuguese health system: citizens area. https://www.sns.gov.pt/cidadao/. Accessed on 1 Jan 2022 34. The Government of the Hong Kong SAR (Office of the Government Chief Information Office) Hong Kong Smart City Blueprint [2020-06-17]. https://www.smartcity.gov.hk/ 35. Yuan C, Ding Y, Zhou K, Huang Y, Xi X (2019) Clinical outcomes of community pharmacy services: a systematic review and meta-analysis. Health Soc Care Community 27(5):e567–e587 36. Shu K, Sliva A, Wang S, Tang J, Liu H (2017) Fake news detection on social media: a data mining perspective. ACM SIGKDD Explorations Newsl 19(1):22–36 37. Vijayarani S, Janani R (2016) Text mining: open source tokenization tools-an analysis. Adv Comput Intell Int J (ACII) 3(1):37–47

Digital Media and Education

Education in the Post-covid Era: Educational Strategies in Smart and Sustainable Cities Andreia de Bem Machado, João Rodrigues dos Santos, António Sacavém, Marc François Richter, and Maria José Sousa

Abstract With a global population of 7.8 billion people and limited natural resources, humanity and society as a whole must learn to live in harmony. We must always act responsibly, with the realization that whatever we do today may have long-term consequences for people and the environment. ESD (Education for Sustainable Development) encourages people to think and act in ways that lead to a more sustainable future. As a result, the following issue arises: what are the educational practices for schooling in the Covid-19 period in sustainable cities? To address this issue, the following goal was set: to map the educational strategies of sustainable cities for education in the Covid-19 era, based on a bibliometric review. A bibliometric search of the Web of Science database was conducted to this end. According to the findings, the educational strategy for the Covid-19 epidemic is linked to the interdependencies between business, nature, society, economy, and education. These interconnectors may be found in the 17 SDGs, as well as in a deeper environmental awareness, which can be accomplished by active methodology, such as establishing an issue that can

A. de Bem Machado (B) Engineering and Knowledge Management Department, Federal University of Santa Catarina, Florianópolis, Brazil e-mail: [email protected] J. R. dos Santos · A. Sacavém Economics and Business Department, Universidade Europeia/IADE, Lisbon, Portugal e-mail: [email protected] A. Sacavém e-mail: [email protected] M. F. Richter Postgraduate Program in Environment and Sustainability (PPGAS), Universidade Estadual do Rio Grande do Sul, Porto Alegre, Brazil e-mail: [email protected] M. J. Sousa Political Science and Public Policy Department, ISCTE - University Institute of Lisbon, Lisbon, Portugal e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 S. Anwar et al. (eds.), Proceedings of International Conference on Information Technology and Applications, Lecture Notes in Networks and Systems 614, https://doi.org/10.1007/978-981-19-9331-2_55


be solved in the classroom, and employing problem solving as active and dynamic education. Keywords Education · Sustainable · Smart cities · COVID-19

1 Introduction

Cities are the principal hubs of human and economic activity. Cities have the ability to create synergies that help the growth of their residents. Growing in size and complexity, however, cities cause a slew of problems that can be difficult to solve. The smart city is a relatively new paradigm that uses information and communication technology to solve problems in modern cities and improve quality of life (QoL). Much of the research on smart and sustainable cities has been undertaken to clarify definitions, activities, and other criteria. The coronavirus epidemic, on the other hand, has only added to the urgency by worsening existing imbalances and posing significant challenges to how we live, work, and educate future generations. Several cities are now implementing the smart and sustainable city concept to improve society’s quality of life, namely in the area of education. The lessons learned from this outbreak point to more coordinated government and public-sector responses, as well as increased levels of urban digital connectedness [3], especially in health and education [8]. Education encourages creativity and invention, both of which are critical for a city’s long-term viability and development. In order to accomplish this, a bibliometric analysis was conducted with the goal of identifying the diverse list of authors, organizations, and nations that stand out in the research on educational strategies in smart and sustainable cities for education in the post-covid future. The article is divided into five sections: the first is the introduction; the second is about smart and sustainable cities; the third is about educational strategies; the fourth is about the methodology; and the fifth presents the final considerations.

2 Smart and Sustainable Cities

Whether smart or not, the concept of “city” already takes distinct forms in different countries, making direct comparisons difficult. Characteristics such as population, population density, types of employment, infrastructure, and the presence of education or health facilities are used to categorize a region as urban [2]. In truth, the concept of “city” has long been associated with human density in an environment characterized by contact and by material and immaterial exchanges. If well planned, such a process can result in high productivity, competitiveness, and innovation in the urban environment, as well as for its residents. However, these characteristics alone do not make a city “smart”.


Currently, there are numerous definitions available for the term “smart city”. The primary approaches, however, can be divided into two groups. The first has a technocentric perspective, emphasizing ICTs as the primary factor in city intelligence [9]. The second stream takes a holistic, citizen-centric approach to improving city quality of life by combining human and social capital with natural and economic resources through ICT-based solutions. It is vital to use ICT, especially in urban spaces, to improve the quality of life of the inhabitants and, at the same time, maximize the use of our planet’s resources at the lowest possible cost (it is all about efficiency). As a result, technology is no longer the major focus of the debate about smart and sustainable communities, but rather a tool for accomplishing a larger aim. Residents in smart and sustainable communities should be seen as co-creators of processes that will improve the community’s quality of life, not merely as users of services or even as customers. In other words, the administration of a human smart city must be participatory, seeking to enhance the specific geographic characteristics of a community or city. This humanized smart city model was dubbed version 3.0 of the evolution of this type of city [10]. The initial version was when companies holding the technology to offer some smart city service convinced governments that it would be good to buy their product to offer such a service. Version 2.0 refers to a situation where governments decide, on behalf of citizens, in which smart city services resources should be invested to solve existing problems. Some of the most important technologies to provide the platforms for the development of a smart and sustainable city are mentioned below: (1) sensor technology and its connections in networks (Internet of Things—IoT), perhaps the most important; (2) radio communication technology with low delay and high capacity; (3) robotics and civil applications of drones; (4) transport logistics, and data aggregation and analysis, so that governments and communities can benefit from the amount of data validated and obtained from their sensors; (5) security in the processing and transmission of citizens’ data, because it is impossible for a public service to be successful without citizens having confidence in it; and (6) energy efficiency in general. A topic that has generated intense debate recently involves urban poverty and inequalities of income and access to policies, as well as social and economic opportunities (due to various social cleavages, mainly gender, race, and social group). Finally, another important issue is linked to the area of education: ensuring access to culture and offering quality education are the main aspects used to measure human capital. It is clear that managing the number of universities, schools, museums, and other places that contribute to education and culture is essential in smart cities [5]. In other words, cities must focus on strengthening the circular economy, investing more in education, lowering school dropout rates, which is strongly related to the issue of security in cities, and seeking balance in all sectors of society based on these measures [6].


3 Educational Strategies in Sustainable Cities

Mobile learning, smartphones, and computers were used to construct educational tactics for the Covid-19 period, which permitted improvements in teaching approaches across all subjects. Teaching is no longer constrained by time, place, psychological state, or geographical boundaries thanks to technological advancements. In this way, one can learn anywhere and develop a lifelong habit of learning [13]. Many national and foreign universities have built their own online teaching platforms in recent years, using the resources of the internet and digitalization to offer students an interactive and personalized learning channel that is not limited by the time and space of learning [4]. Such learning can be supported by mobile technologies and applications for tablets and smartphones [12]. Technologies applied in the digital world, such as gamification [1], MOOCs [7], and SPOCs (Fu 2019), among others, demonstrate that students learn new knowledge through instructional videos that include auditory and visual content [7]. In this way, the limited time in the classroom can be used primarily for teaching activities that employ two-way interaction or communication, such as practice, problem solving, and discussions, to enhance learning effects and realize the idea of student-centered education [11]. Moreover, with digital materials, students can learn repeatedly, anytime and anywhere.

4 Methodological Approach

A bibliometric analysis was conducted to expand understanding of, measure, and analyze the scientific literature on the issue of educational strategies of sustainable cities for education in the Covid-19 age, beginning with a search in Clarivate Analytics’ Web of Science (WoS) database. The research was carried out using a three-phase strategy: execution plan, data collection, and bibliometry. The bibliometrix program was used to evaluate the bibliometric data because it is the most compatible with the Web of Science database. Planning began in September and ended in May 2022, when the research was conducted. In this phase, some criteria were defined, such as limiting the search to electronic databases and not contemplating physical catalogues in libraries, given that the number of documents available in the chosen database was deemed sufficient for the present research. Within the planning scope, the WoS database was stipulated as relevant to the research domain due to its relevance in academia, its interdisciplinary character and focus on research in this area, and the fact that it is one of the largest, constantly updated databases of abstracts and bibliographic references of peer-reviewed scientific literature. Considering the research problem, the search terms, namely “education” and “covid-19” and “sustainable cities”, were defined during the planning phase.


Considering the research problem, the search terms were delimited, still in the planning phase. First, the following descriptor was chosen: what are the educational strategies of sustainable cities for education in the Covid-19 era? To refine the search in accordance with the research problem, another search was conducted with the terms “education” and “covid-19” and “sustainable cities”, which resulted in 27 documents. As a result of this search, it was concluded that the 27 scientific articles were written by 97 authors, linked to 121 institutions from 33 different countries. Also, a total of 143 keywords were used. Table 1 shows the results of this data collection in a general bibliometric analysis. The retrieved papers found in the Web of Science database were published in the period from 2021 to 2022: twenty-four publications were found for the year 2021 and three for the year 2022.

Table 1 Bibliometric data

Description | Results
Main information about data
  Timespan | 2021:2022
  Sources (Journals, Books, etc.) | 4
  Documents | 27
  Average years from publication | 0.815
  Average citations per documents | 12.67
  Average citations per year per doc | 6.407
  References | 1705
Document types
  Article | 27
Document contents
  Keywords Plus (ID) | 103
  Author’s Keywords (DE) | 143
Authors
  Authors | 97
  Author Appearances | 101
  Authors of single-authored documents | 2
  Authors of multi-authored documents | 95
Authors collaboration
  Single-authored documents | 2
  Documents per Author | 0.278
  Authors per Document | 3.59
  Co-Authors per Documents | 3.74
  Collaboration Index | 3.8
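The collaboration indicators in Table 1 follow standard bibliometric definitions, and the figures reported above can be reproduced from the raw counts. The short Python sketch below shows this arithmetic; it is illustrative only, since bibliometrix derives these measures directly from the Web of Science export, and the variable names are not part of its output.

```python
# Sketch: recomputing the collaboration indicators of Table 1 from the raw
# counts reported in the text (illustrative only).

documents = 27                 # articles retrieved from Web of Science
authors = 97                   # distinct authors
author_appearances = 101       # total authorship slots across all papers
single_authored_docs = 2       # documents written by a single author
authors_of_multi_docs = 95     # distinct authors of multi-authored documents

multi_authored_docs = documents - single_authored_docs          # 25

documents_per_author = documents / authors                      # ~0.278
authors_per_document = authors / documents                       # ~3.59
co_authors_per_document = author_appearances / documents         # ~3.74
collaboration_index = authors_of_multi_docs / multi_authored_docs  # 95 / 25 = 3.8

print(f"Documents per author:    {documents_per_author:.3f}")
print(f"Authors per document:    {authors_per_document:.2f}")
print(f"Co-authors per document: {co_authors_per_document:.2f}")
print(f"Collaboration index:     {collaboration_index:.1f}")
```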


Fig. 1 Distribution by country of works

From the 27 papers, a varied list of authors, institutions, and countries that stand out in the research on educational strategies in smart and sustainable cities for education in the post-covid era can be observed. When analyzing the 20 countries with the highest number of citations in the area, China stands out with 96 citations (28% of the total). In second place, Turkey stands out with 56 citations (16% of the total), followed by Serbia with 47 (13%), as shown in Fig. 1. Figure 2 presents the publication intensity per country and the relationships established between countries through citations between published papers. Another analysis carried out relates to the identification of authors. The most productive authors on the subject are Li, Shuangjin; Pan, Yue; Zhang, Limao; and Wang, Xi, each with 2 publications in the area, according to Fig. 3. The top research areas publishing on this topic of educational strategies in smart and sustainable cities for education in the post-covid era are: Green Sustainable Science Technology; Construction Building Technology; Energy Fuels; Environmental Sciences; Environmental Studies; Urban Studies; Computer Science Interdisciplinary Applications; Computer Science Theory Methods; Engineering Civil; Management; Applied Mathematics; and Medicine General Internal, as shown in Fig. 4. Green Sustainable Science Technology stands out with 30% of the publications, followed by Construction Building Technology with 24% and Energy Fuels with 23%. The documents analyzed were published in four different journals, and among the total of twenty-seven, twenty articles (74%) were published in a single journal, “Sustainable Cities and Society”, according to Fig. 5.


Fig. 2 Spatial distribution and relationships of the publications

Fig. 3 Authors with the highest number of publications in the search topic

The second most relevant source among the four shown in Fig. 5 is “Frontiers in Sustainable Cities”, with 4 documents, followed, with two publications, by the journal “Sustainability”. From the bibliometric analysis, based on the group of works retrieved and on the 143 keywords indicated by the authors, “cities” and “health” stood out with 3 occurrences each, according to Fig. 6. The word “curriculum” also stands out, from which we conclude


Fig. 4 Research areas

Fig. 5 Most cited sources

it is an application area, i.e., part of the educational strategy adopted in smart and sustainable cities for education in the post-covid era, that has been much explored by the literature and therefore assumes a preponderant role in that strategy.


Fig. 6 Tag cloud

5 Final Considerations

The transition from educational “traditionalism” with regard to the use of ICT in the teaching–learning process, in its broadest sense, has been hastened by the educational techniques of the Covid-19 era. Thus, during the Covid-19 period, the transition from traditional face-to-face pedagogical strategies to online virtual relationship strategies (synchronous or asynchronous) highlights the importance of “innovations not only in strategies, but also in technology” in serving students and apprentices in general. As a result, the answer to this chapter’s research problem is that the use of ICT in smart cities and in post-covid/future educational initiatives is critical for three reasons:
1. Only via the use of ICT will it be feasible to develop the level of critical thinking required to comprehend the vast and increasingly complex layers of current cultures while also incorporating more data;
2. Only via the use of ICT can a smart city’s data be processed in real time and transformed into higher efficiency in the management of educational activities, responding to each unique situation, including pandemic emergencies like Covid-19;
3. Only by comprehending and employing ICT will it be possible to gain the “meta-knowledge” (particularly digital literacy) required for self-regulated learning, which must occur continuously throughout one’s life. Otherwise, the citizen loses sight of the rights and responsibilities that properly define him, becoming information-deprived and thus maladapted.


References 1. Actaxova HA, Melnikov CL, Tonkix AP, Kamynin BL (2020) Texnologiqeckie pecypcy covpemennogo vycxego obpazovani. Obpazovanie i nayka 22(6):74–101 2. Allam Z, Newman P (2018) Redefining the smart city: Culture, metabolism, and governance. Smart Cities 1(1):4–25 3. Costa DG, Peixoto JPJ (2020) COVID-19 pandemic: a review of smart cities initiatives to face new outbreaks. IET Smart Cities 2(2):64–73 4. Cornali F, Cavaletto GM (2021) Emerging platform education: what are the implications of education processes’ digitization? In: Handbook of research on determining the reliability of online assessment and distance learning. Hershey, PA, Estados Unidos de América: IGI Global, pp 359–378 5. De Bem Machado A, Richter MF (2021) As estratégias educacionais nas cidades inteligentes e sustentáveis para educação na era pós-covid. In: De Bem Machado A (Comp.), Desafios da educação: Abordagens e tendências pedagógicas para futuro pós-covid. Editora BAGAI, pp 66–77 6. De Bem Machado A, Rodrigues dos Santos J, Richter MF, Sousa MJ (2021) Smart cities: building sustainable cities. In: Chinmay C (eds) Green technological innovation for sustainable smart societies. Springer, Cham, pp 1–19 7. Lehmann A (2019) Problem tagging and solution-based video recommendations in learning video environments. In: 2019 IEEE global engineering education conference (EDUCON). IEEE, pp 365–373 8. Kunzmann KR (2020) Smart cities after COVID-19: Ten narratives. disP-The Plan Rev 56(2):20–31 9. Mora L, Bolici R, Deakin M (2017) The first two decades of smart-city research: a bibliometric analysis. J Urban Technol 24(1):3–27 10. Shamsuzzoha A, Nieminen J, Piya S, Rutledge K (2021) Smart city for sustainable environment: a comparison of participatory strategies from Helsinki, Singapore and London. Cities 114:103194 11. Shen KM, Wu CL, Lee MH (2017) A study on Taiwanese undergraduates’ conceptions of Internet-based learning. Int J Digit Learn Technol 9(3):1–22 12. Sousa MJ, Rocha Á (2020) Learning analytics measuring impacts on organisational performance. J Grid Comput 18(3):563–571 13. Xu D (2019) Research on new English mobile teaching mode under the impact of mobile Internet age. Open J Soc Sci 7(5):109–117

Digital Health and Wellbeing: The Case for Broadening the EU DigComp Framework Anícia Rebelo Trindade, Debbie Holley, and Célio Gonçalo Marques

Abstract Digital health and wellbeing are highly contested terms and range from carefully costed and evaluated software systems designed for patients to access their doctor; evidence-based mobile applications for supporting those living with long-term health conditions such as diabetes; to the Coronavirus travel applications (apps) developed to enable societies to come together post-pandemic. By way of contrast, numerous mental health ‘apps’ with tracking algorithms enabling individual personal data to be commercialized and sold on to third parties lack a robust evidence base and are problematic. Against a fast-changing backdrop, the European Union (EU) launched the revision of their Digital Competence Framework (DigComp 2.2) in February 2022. This paper reports on the findings of the ‘Safety and Security’ working group and their recommendations for the digital knowledge, skills, and attitudes (KSA) required for EU citizens negotiating a complex and constantly changing health sector. Keywords EU DigComp · Digital health · Wellbeing · Long term health

A. R. Trindade · C. G. Marques (B) Polytechnic Institute of Tomar, Tomar, Portugal e-mail: [email protected] A. R. Trindade e-mail: [email protected] A. R. Trindade Educational Technology Laboratory (LabTE), University of Coimbra, Coimbra, Portugal D. Holley Department of Nursing Sciences, Bournemouth University, Poole, England e-mail: [email protected] C. G. Marques Laboratory of Pedagogical, Innovation and Distance Learning (LIED.IPT), Tomar, Portugal © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 S. Anwar et al. (eds.), Proceedings of International Conference on Information Technology and Applications, Lecture Notes in Networks and Systems 614, https://doi.org/10.1007/978-981-19-9331-2_56


1 Introduction

The expansion of Artificial Intelligence (AI) health systems and applications brings a new focus, identifying the necessity to empower and educate EU citizens. The critical knowledge, skills, and attitudes (KSA) required to make informed choices when interacting with, using, and installing those applications on their smart devices are a requirement for those living in the digital age. Advanced techniques related to machine learning and deep learning enable sophisticated data analysis in order to facilitate new knowledge in different areas of the Health Sciences [1], yet Murdoch [8] reports that many such advances in healthcare artificial intelligence technologies end up owned and controlled by private entities, raising issues and discussion about privacy, ethics, and the protection of patient health information. The competencies Murdoch recommends, protecting devices, protecting personal data and privacy, and protecting health and well-being, are prerequisites essential to avoiding threats to physical, emotional, and personal health. Regarding these AI health concerns, the revised EU framework sets out a pathway for a comprehensive understanding of what AI systems do and do not do; how AI systems work; and how to interact with AI systems in different daily routines (looking for information, using AI systems and apps, focusing on privacy and personal data). The challenges and ethics of AI, and the attitudes regarding human agency and control, are set out by Vuorikari et al. [13]. As part of the update of the DigComp Framework, a group of experts in the areas of safety, security, and wellbeing contributed to the articulation and enhancement of appropriate KSA. Their work advocated the promotion of confident, critical, and responsible use of digital technology, whenever and wherever the use of AI health systems and apps takes place. The study submitted a series of recommendations about the KSA, framed by the following research question: RQ: Do European Union (EU) citizens need to engage with the digital environment in a confident, critical, and responsible way for participation in society in the safety area of the DigComp framework? To address this challenge, qualitative research was designed and conducted by a group of experts, working as part of the wider Community of Practice (CoP), using a Design-Based Research (DBR) approach [7, 9], to propose the new KSA encompassed in the revised European DigComp 2.2 framework for the safety, security, and well-being competencies required for EU citizens negotiating a complex and constantly changing health sector.


2 Background

2.1 Digital Health and Security

The General Data Protection Regulation (GDPR) framework is no longer adequate nor sufficient to cover the complex problems emerging from the capture and treatment of sensitive data [5]. Empowering and educating citizens to take responsibility for protecting their own personal data is a significant challenge, one which requires new policy implementation approaches. The currency of companies and organizations is now often the information/data they harvest from individuals, about their personal data and their health and medical conditions. For Seh et al. [10], a data breach means the illegal disclosure or use of data without the authorization of the information owner. Data breaches are not just a concern and complication for security experts but are now a growing and regular threat to the ordinary citizen [10]. EU citizens need the skills to embrace the ‘digital’ to secure employment, to communicate, to shop, and increasingly to access health apps and systems regarding their own physical and mental health needs. As Goldsmith et al. [6] explain, “human learning, underpinned by technological tools, needs to be partnered by a focus on lifelong learning and continuous professional development.” Chang [1] advocates for the sharing of outcomes from predictions of human health conditions, and considers health data through an open access lens. Data should be released and shared without obstacles, and in this way, the researcher argues, AI will flourish and lead to the discovery of new knowledge. This will, in turn, assist healthcare professionals in making optimal decisions for their patients. Fornasier [5] anticipates it will become the new ‘normal’ to expect patients to engage with their digital health tools, a quantum leap from the current standard user profile. However, there are more dystopian views about the use and value of data capture and analysis. Regarding protecting devices, it is important that all citizens can “weigh the benefits and risks of using biometric identification techniques (e.g., fingerprint, face images) as they can affect safety in unintended ways. If biometric information is leaked or hacked, it becomes compromised and can lead to identity fraud” ([13], p. 36). Today, our personal information is constantly at risk, as Singh et al. ([11], p. 1) comment: “Smart healthcare uses advanced technologies to transform the traditional medical system in an all-round way, making healthcare more efficient, more convenient, and more personalized. Unfortunately, medical data security is a serious issue in the smart healthcare systems”. Providing guidance for citizens to identify their sensitive data and know how to protect it is no longer a problem only for third-party organizations [2]. Protecting private health data involves understanding that extended health applications and systems utilize codes that make it possible to link data about an individual without revealing the person’s identity [2]. In this regard, it is still important that citizens be “aware that for many digital health applications, there are no official licensing procedures as is the case in mainstream medicine” ([13], p. 38). The researchers go on to comment that the internet is awash with false and potentially dangerous information about health, with medical and ‘pseudo’-medical claims being made with little underpinning evidence and without the results of robust health trials. EU citizens need new skills to navigate and evaluate the claims being made. As Singh et al. (op. cit.) explain, comprehensive state-of-the-art techniques are required, and they forecast solutions from cryptography, biometrics, watermarking, and blockchain-based security techniques for healthcare applications.
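One concrete reading of the point above about applications that “link data about an individual without revealing the person’s identity” is pseudonymization: replacing a direct identifier with a keyed code so records can be linked across systems without exposing the identity itself. The sketch below, assuming an HMAC-based linkage code and a key held by the data controller, is offered purely as an illustration of the idea and not as a description of any particular health system.

```python
# Illustrative sketch of pseudonymization: a keyed hash (HMAC) turns a direct
# identifier into a stable linkage code, so datasets can be joined on the code
# without storing the identifier itself. Key management is out of scope here.
import hashlib
import hmac

SECRET_KEY = b"replace-with-a-key-held-by-the-data-controller"  # assumption

def pseudonym(national_id: str) -> str:
    """Derive a stable, non-reversible linkage code from an identifier."""
    return hmac.new(SECRET_KEY, national_id.encode("utf-8"),
                    hashlib.sha256).hexdigest()[:16]

# Two records from different health applications referring to the same person
# can be linked on the pseudonym instead of the raw identifier.
record_pharmacy = {"pid": pseudonym("PT-12345678"), "dispensed": "metformin"}
record_hospital = {"pid": pseudonym("PT-12345678"), "discharge": "2022-01-05"}

assert record_pharmacy["pid"] == record_hospital["pid"]
print(record_pharmacy["pid"])  # a 16-hex-character code, not the identifier
```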

3 Methodology

3.1 European DigComp Revision

The European Commission started early discussions about digital competences, and how to empower and engage citizens in the digital era, a decade ago [3]. The work done created the first European Digital Framework, DigComp [4], proposing five main areas that the EU identified as core and essential for any citizen. The safety area included competences such as: 4.1—protecting devices; 4.2—protecting personal data and privacy; 4.3—protecting health and well-being; and protecting the environment [13]. The European Commission (2022) recently launched the newest version of the digital competence framework, which includes the KSA gathered by different expert groups in each field area of the framework. The research question that underpinned the one-year body of work leading to the launch was: what knowledge, skills and attitudes do citizens need to engage with the digital environment in a confident, critical, and responsible way for learning, at work, and for participation in society in the safety area of the European DigComp 2.2 revision framework? The research was commissioned drawing upon a design-based research (DBR) protocol, which combines theory with practice and is used when complex decisions and multiple voices are to be collated and represented. In this case, the stakeholders comprised experts, volunteers, and the Joint Research Centre (JRC) leadership. The DBR approach was implemented through four cycles, as proposed by Plomp [9], which involve: (i) analyzing existing practical problems; (ii) developing innovative solutions based on existing design principles; (iii) creating iterative cycles of tests for the improvement of the solutions in practice; and (iv) reflecting on the principles of improvement of the implemented solutions. Following the DBR approach, the DigComp revision model was organized into eight phases, or iterative cycles. In the first phase the European Commission and the JRC set out the scope and scale of the challenge, and the different working groups addressed the emergent themes in the digital world, namely digital health, digital safety and security, AI applications, and well-being. In this regard, the analysis of existing practical problems was conducted in phases one and two, where different tasks were undertaken:


(i) the identification of the new digital competence requirements for citizens which stem from the digital world, based on literature, expert brainstorming, and focus group sessions;
(ii) proposing and selecting requirements from a safety, security, and well-being perspective, linked to the different competences of the Framework 2.1;
(iii) the organization of three strands of discussion: e-health/well-being; opportunities and limits to digital protection; and how to build safety and security step by step in the development of users (cf. younger students, active workers, and elderly people).

The development of innovative solutions based on existing design principles took place in phases three and four:

(i) conducting a literature review about the themes that inform the scope of safety, security, and wellbeing;
(ii) applying the underpinning literature and values, triangulated back through to the DigComp 2.1 framework;
(iii) initial suggestions for relevant knowledge, skills and attitudes (KSA) statements related to the requirements previously identified, along with suggestions about where they might fit into the safety area of the DigComp framework 2.1 (digital competence 4.1 protecting devices; 4.2 protecting personal data and privacy; and 4.3 protecting health and wellbeing).

The cyclic creation of iterative sequences of tests for the improvement of the solutions in practice was accomplished in phases five and six, where an iterative peer review/reflective cycle of work was undertaken. Methods included online questionnaires, expert brainstorming meetings, and focus group discussions. Involving 373 stakeholders/experts and 31 experts in the field of safety, security, and wellbeing across Europe, the findings were validated through consultations with experts, stakeholders, and civil society. In these phases the group of experts collected more than one hundred statements (n = 133) for the safety area: 51 statements (KSA) related to the digital competence “protecting devices” and its components; 41 statements (KSA) related to “protecting personal data and privacy”; and 41 statements related to “protecting health and well-being”. Considering the definition of KSA presented in the theoretical framework, a knowledge statement starts with “knows/aware/understands that…” or “aware of…”. A skill statement begins with “knows how/can apply…”, etc. Finally, an attitude statement starts with “inclined to/assumes responsibility/wary of/confident in…”, etc. Closing the DBR protocol approach, the reflection on the principles of improvement of the implemented solutions, developed by the JRC in phases seven and eight, resulted in a proposed list of KSA, a number of which were directly applied and incorporated into the DigComp 2.2 version. For data collection, direct techniques and indirect documentation techniques were used [12]. The direct data collection techniques comprise three questionnaire surveys, used to validate the KSA for each of the three digital competences of the safety area (4.1—protecting devices; 4.2—protecting personal data and privacy; and 4.3—protecting health and wellbeing).
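The phrasing convention for knowledge, skill, and attitude statements described above lends itself to a simple rule-based check. The sketch below applies those opening-phrase rules, extended with a few openings that appear in Tables 2–4, to suggest a KSA type for a statement; it is illustrative only, since the actual coding was performed by the experts, not automatically.

```python
# Illustrative sketch of the KSA phrasing convention described in the text:
# knowledge statements open with "knows/aware/understands that...",
# skill statements with "knows how/can/able to...", and attitude statements
# with "inclined to/assumes responsibility/wary of/confident in...".
KSA_PREFIXES = {
    "knowledge": ("knows that", "aware", "understands that"),
    "skill": ("knows how", "can ", "able to", "uses "),
    "attitude": ("inclined to", "assumes responsibility", "wary of",
                 "confident in", "weighs ", "vigilant", "keen to"),
}

def classify_statement(statement: str) -> str:
    """Return the KSA type suggested by the statement's opening phrase."""
    text = statement.lower().strip()
    for ksa_type, prefixes in KSA_PREFIXES.items():
        if text.startswith(prefixes):
            return ksa_type
    return "unclassified"  # left for expert review

examples = [
    "Aware of the risk of identity theft on the internet",
    "Knows how to activate two-factor authentication for important services",
    "Inclined to focus on physical and mental wellbeing",
]
for s in examples:
    print(classify_statement(s), "-", s)
```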


Table 1 Process of selection of KSA for DigComp 2.2 update

Digital competence | No. KSA proposed by experts (phases 3 and 4) | No. KSA selected by JRC for survey validation (phases 5 and 6) | No. KSA selected by experts related with digital health and AI | No. of KSA selected through public validation for DigComp 2.2 (phases 7 and 8)
4.1—Protecting devices | 51 | 20 | 17 | 14
4.2—Protecting personal data and privacy | 41 | 20 | 16 | 9
4.3—Protecting health and wellbeing | 41 | 21 | 12 | 14
Total | 133 | 61 | 45 | 37

Source Expert working group statements (KSA) results regarding digital competences 4.1, 4.2 and 4.3, survey validation results and final statements of DigComp 2.2 at [13], pp. 35–40

The first part of the surveys collects data related to the characterization of the respondent. The second part measures the clarity of each KSA statement and its level of relevance on a five-point Likert scale. Each survey includes only 20 statements, chosen by the JRC from all the statements collected by the experts in phases three and four (see Table 1). Of the statements submitted to online public validation through the surveys in phases five and six, only a few were selected in phases seven and eight for inclusion in the DigComp 2.2 update (see Table 1). The surveys also collected additional comments about the DigComp 2.2 update, which were useful for rephrasing some proposed statements. The surveys were completed online by a broad range of stakeholders from different countries and organizations across Europe. Despite a limited response rate, useful information was collected.
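As a hedged illustration of how the clarity and relevance ratings could be aggregated, the sketch below averages five-point Likert responses per statement and ranks statements by mean relevance. The statement identifiers and ratings are invented for the example and do not reflect the JRC’s actual selection procedure.

```python
# Illustrative sketch: aggregating five-point Likert ratings of clarity and
# relevance per KSA statement, as collected in the validation surveys.
# The ratings below are invented; the real selection was made by the JRC.
from statistics import mean

responses = {  # statement id -> list of (clarity, relevance) ratings, 1-5
    "4.1-K3": [(5, 5), (4, 5), (5, 4)],
    "4.1-S2": [(4, 3), (3, 3), (4, 4)],
    "4.1-A1": [(5, 4), (5, 5), (4, 5)],
}

summary = []
for stmt, ratings in responses.items():
    clarity = mean(c for c, _ in ratings)
    relevance = mean(r for _, r in ratings)
    summary.append((stmt, round(clarity, 2), round(relevance, 2)))

# Rank statements by mean relevance, most relevant first.
for stmt, clarity, relevance in sorted(summary, key=lambda x: -x[2]):
    print(f"{stmt}: clarity={clarity}, relevance={relevance}")
```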

4 Results and Discussion

This section presents the results of the methodological procedure used to collect the KSA statements in the safety area, presenting only those that the experts considered related to digital health safety, security, and wellbeing, among the 20 statements per competence submitted to public validation (see Table 2).

4.1 Results Protecting Devices

The competence of ‘protecting devices’ now clearly articulates previously invisible types of KSA such as identity theft, psychological manipulation, cyber-attacks,

Aware of “social engineering” that uses psychological manipulation to obtain confidential information (passwords, pin-codes) from victims or convince them to take a harmful action (execute malicious software)

Understands that IoT applications can be vulnerable to cyber-attacks as they require the exchange of data via wireless networks

Knows that cybercriminals might have several motivations to conduct their unlawful activity (motivated by financial gain, protest, information gathering for spying)

Knows about the importance of keeping the operating system and applications (browser) up to date to fix security vulnerabilities and protect against malicious software (malware)

Knows that a firewall blocks certain kinds of network traffic aiming to prevent a number of different security risks (spam, denial of service, remote logins)

4

5

6

8

9

Knowledge

Statement

Aware of the risk of identity theft on the internet, someone commits fraud or other crimes using another person’s personal data (digital identity, username) without their permission

Nr. Stat

3

Type

Dimension 4

X

X

Included

Decision

X

X

X

Not included X

(continued)

Included with arrangements

Table 2 Examples of final KSA related with digital health for digital competence 4.1—protecting devices presented in the survey validation and were reframed or excluded of the DigComp framework 2.2


Knows how to activate two-factor authentication for important services

Acquires digital tools that do not process unnecessarily personal data, check the type of data and features an app access on one’s mobile phone

Able to encrypt sensitive data stored on personal devices or in a cloud storage

Can identify the affordances of different data hosting/storing services, file versioning features of cloud storage to revert to previous files in case of corruption or deletion, to compare file versions to one another)

Knows how to install and activate protection software and X services (antivirus, anti-malware, firewall) to keep digital content and personal data safe

2

3

4

5

6

X

X

X

Knows how to adopt a proper cyber-hygiene regarding passwords (selecting strong ones difficult to guess) and managing them securely (password manager)

1

Decision Included

Skills

Statement

Nr. Stat

Type

Dimension 4

Table 2 (continued)

X

Not included

X

(continued)

Included with arrangements


Vigilant not to leave computers or mobile devices unattended, for example in public places (in a restaurant, train, car)

Weighs the risks and benefits of using biometric identification techniques (fingerprint, face images) as they can affect safety in unintended ways (biometric information can be leaked or hacked and therefore become compromised)

Vigilant towards practices to protect devices and digital content as security risks are always evolving

Keen to consider some self-protective behaviours such as not using open wi-fi networks to make financial transactions or online banking

2

3

4

Can respond to a security breach (an incident that results in unauthorized access to digital data, applications, networks or devices), a personal data breach (leakage of their login and passwords) or a malware attack (contain viruses) or a malware attack (ransomware)

7

1

Statement

Dimension 4

Nr. Stat

Decision

X

Included

X

Not included

X

X

X

Included with arrangements

Source Survey validation statements (KSA) results regarding protecting devices and final statements of DigComp 2.2 at [13], p. 36

Attitudes

Type

Table 2 (continued)



vulnerabilities, and protection against malicious software. The experts believe that those KSA (n = 5) that were not covered by the DigComp framework are still very important, given the threats and risks that all citizens face in their daily routine (see Table 3). This is supported by the work of Seh et al. [10], who clearly identify data breaches as one of the major concerns of digital health data protection.

4.2 Results Protecting Personal Data and Privacy

Regarding protecting personal data and privacy, the group of experts selected 16 of the 20 statements submitted to public validation, which were related to digital health behaviours that all citizens need to acquire in order to protect themselves and others against risks and cyber-attacks on health data. Nevertheless, half of these KSA could not be included in DigComp 2.2 (n = 8).

4.3 Results Protecting Health and Wellbeing

Regarding health and wellbeing, among the 20 statements submitted to public validation, the experts found 12 statements relating to digital health behaviour (KSA) which were crucial in preventing citizens from neglecting aspects of their digital health data. Nevertheless, even though all 12 statements were selected by the experts, five of them (n = 5) were not selected for inclusion in DigComp 2.2 (see Table 4).

5 Conclusion

This paper critically considers the implications for health, security, and wellbeing of a fast-moving artificial intelligence body of knowledge. Drawing on the work developed by the digital safety and security working group, and underpinned by DBR as advocated by McKenney and Reeves [7], the researchers refined and distilled the essential components of Knowledge, Skills and Attitudes (KSA) as required by the EU DigComp team. The key benefit of the process was the gathering of an extensive evidence base to inform the final recommendations, and although not all were included in the revised version, the gaps in knowledge have been articulated and identified for further work. Those embedded within the new framework form the basis for educating and safeguarding EU citizens as they start to take advantage of huge changes in the way health services will be offered in future, and promote a more critical engagement with digital health provision. Future work on the themes developed can now be undertaken, through the replication of the research base model and through the framework developed for analysis. The themes that were not included in the final DigComp update remain

Aware that a security or privacy incident can result in loss of control, compromise, unauthorized disclosure, acquisition, or access to personal data, in physical or electronic form

Knows that, in terms of the EU’s GDPR, even voice interactions with a virtual assistant are personal data and can expose users to certain data protection, privacy and security risks

Knows that processing of personal data encompasses the collection, recording, organisation, storage, and modifications of the data. When an AI system links different pieces of apparently anonymous information together, it can lead to de-anonymisation, the identification of a particular person

Recognize that voice assistants, chatbots, smart devices and other AI technologies that rely on users’ biometric and other personal data might process such data more than is necessary (it is considered disproportionate and violates the principle of proportionality specified by GDPR)

5

6

7

Included

4

Statement

Aware that secure electronic identification is a key feature X to enable the safe sharing of personal data with third parties when conducting public sector and private transactions

Nr. Stat

1

Knowledge

Decision

Type

Dimension 4

X

X

X

Not included

X

(continued)

Included with arrangements

Table 3 Examples of final KSA related with digital health for digital competence 4.2—protecting personal data and privacy presented in the survey validation and were reframed or excluded from the DigComp framework 2.2


Skills

Type

Uses digital certificates acquired from certifying authorities (digital certificates for authentication and digital signing stored on national identity cards)

If informed by data controllers that there has been a data breach affecting users, act accordingly to take actions to mitigate the impact (change all passwords immediately, not just the one known to be compromised)

Can help mitigate the risks of personal data breaches by expressing concerns to relevant authorities relating to the usage of AI tools that collect data, especially if there is a suspicion that there is a violation of the GDPR or when the company does not make the information available

4

5

6

Knows how to identify suspicious email messages that try to obtain sensitive information (personal data, banking identification) or might contain malware

9

Knows how to modify privacy settings to keep safe from unwanted contacts (spam texts, emails)

Knows that reading a “privacy policy” of an app or service explains what personal data it collects and whether data is shared with third parties possibly including information about the device used (brand of the phone) and geolocation of the user

8

1

Statement

Nr. Stat

Dimension 4

Table 3 (continued) Decision

X

Included

X

X

Not included

X

X

X

(continued)

Included with arrangements


Weighs the benefits and risks before activating a virtual assistant (Siri, Alexa, Cortana, Google assistant) or smart IoT devices as they can expose personal daily routines

Weighs the benefits and risks before engaging with software that uses biometric data (voice, face images), checking that it complies with GDPR

Weighs the benefits and risks before allowing third parties to process personal data, recognizes that voice assistants that are connected to smart home devices can give access to the data to third parties (companies, governments, cybercriminals)

Confident in carrying out online transactions after taking appropriate safety and security measures

2

3

4

5

Decision Included

X

X

X

Not included

Emerged through survey analises

X

Included with arrangements

Source Survey validation statements (KSA) results regarding protecting personal data and privacy and final statements of DigComp 2.2 at [13], p. 38

Emphasizes the importance of taking a conscious decision whether to share information about private life publicly, considering the risks involved (especially for children) while keeping control of the personal data

1

Attitudes

Statement

Nr. Stat

Type

Dimension 4

Table 3 (continued)


Table 4 Examples of final knowledge and attitudes related with digital health for digital competence 4.3 (protecting health and wellbeing), presented in the survey validation and reframed or excluded from the DigComp framework 2.2. Each statement (Dimension 4) was classified in the Decision column as Included, Included with arrangements, or Not included.

Knowledge
- Aware of the importance of a healthy personal digital balance regarding the use of digital technologies, including non-use as an option; many different factors in digital life can impact personal health, wellbeing and life satisfaction
- Knows that some AI-driven applications on digital devices (sensors, wearables, smartphones) can support the adoption of healthy behaviours through monitoring and alerting about health conditions (physical, emotional, psychological); however, the decisions proposed could also have potential negative impacts on physical or mental health
- Knows that for many digital health applications there are no official licensing procedures such as exist in classical medicine
- Aware that digital upskilling can create access to education and training as well as to job opportunities, thus promoting social inclusion
- Able to gather information about digital self-help health applications for improving physical and/or mental wellbeing (positive and negative effects) before deciding whether to use them or not
- Knows how to recognize embedded user experience techniques designed to be manipulative and/or to weaken one's ability to be in control of decisions (making users spend more time on online activities, encouraging consumerism)

Attitudes
- Wary of the reliability of recommendations (are they by a reputable source in healthcare/wellbeing) and of their intentions (do they really help the user vs. encourage using the device more to be exposed to advertising)
- Being willing not to harm others online
- Inclined to focus on physical and mental wellbeing and to avoid negative impacts of digital media such as overuse, addiction and compulsive behavior
- Can select digital content and solutions that enhance usability and user engagement, chooses culturally relevant content in local languages and easy-to-access material for low-literate users, and applies captions for videos
- Assumes responsibility for protecting personal and collective health and safety when evaluating the effects of medical products and services online, as there are dangers in trusting and sharing false information on health
- Able to decide whether to deal with an online problem situation alone or to recruit professional or informal help

Source: Survey validation statements (knowledge and attitudes) results regarding protecting health and wellbeing and final statements of DigComp 2.2 at [13], p. 40

crucial in helping and supporting EU citizens to take informed decisions regarding their security, health and wellbeing when using AI apps, as the torrent of 'new' evidence-based and non-evidence-based digital health apps becomes increasingly prevalent and health services move increasingly online. Recommendations: a. The important data collected (both for the KSA included and for the KSA that could not be included) needs wide dissemination throughout academic, social, and professional networks. b. To empower and educate citizens of the future to have the confidence to interact, evaluate and protect their privacy whilst using medical apps, building these topics into educational curricula is essential at all levels of schooling and further study. c. An EU-funded education package for all citizens should be developed as part of the EU life-long learning movement.

References
1. Chang A (2020) The role of artificial intelligence in digital health. In: Wulfovich S, Meyers A (eds) Digital health entrepreneurship. Springer, Cham, pp 71–81
2. Center for Open Data Enterprise (2019) CODE, Sharing and utilizing health data for AI applications. Center for Open Data Enterprise, Washington, DC
3. Ferrari A (2012) Digital competence in practice: an analysis of frameworks. European Commission
4. Ferrari A (2013) DigComp: a framework for developing and understanding digital competence in Europe. European Commission
5. de Fornasier MO (2021) The use of AI in digital health services and privacy regulation in GDPR and LGPD between revolution and (dis)respect. RIL Brasília 59(233):201–220
6. Goldsmith B, Holley D, Quinney A (2020) The best way of promoting digital wellbeing in HE? https://lmutake5.wordpress.com/2020/09/24/take5-47-the-best-way-of-promoting-digital-wellbeing-in-he/
7. McKenney S, Reeves TC (2014) Educational design research. In: Spector J, Merrill M, Elen J, Bishop M (eds) Handbook of research on educational communications and technology. Springer, New York, NY
8. Murdoch B (2021) Privacy and artificial intelligence: challenges for protecting health information in a new era. BMC Med Ethics 22(122):1–5
9. Plomp T (2013) Educational design research: an introduction. In: Plomp T, Nieveen N (eds) Educational design research: Part A: an introduction. SLO—Netherlands Institute for Curriculum Development, Enschede, pp 10–51
10. Seh AH, Zarour M, Alenezi M, Sarkar AK, Agrawal A, Kumar R, Ahmad KR (2020) Healthcare data breaches: insights and implications. Healthcare 8(2):2227–9032
11. Singh AK, Anand A, Lv Z, Ko H, Mohan A (2021) A survey on healthcare data: a security perspective. ACM Trans Multimed Comput Commun Appl 17(2):1–26
12. Tuckman BW (2012) Manual de investigação em educação: metodologia para conceber e realizar o processo de investigação científica. Fundação Calouste Gulbenkian, Lisboa
13. Vuorikari R, Kluzer S, Punie Y (2022) DigComp 2.2: the digital competence framework for citizens with new examples of knowledge, skills and attitudes. European Commission

An Information Systems Architecture Proposal for the Thermalism Sector Frederico Branco, Catarina Gonçalves, Ramiro Gonçalves, Fernando Moreira, Manuel Au-Yong-Oliveira, and José Martins

Abstract The thermal SPA sector is currently experiencing a stable growth trend, which according to the World Tourism Organization (WTO) is expected to continue over the upcoming years. In Portugal, the sector has a very significant profile, with the existence of almost a hundred SPAs and thermal SPAs that generate a business volume (direct and indirect) of over 30 M€ per year. Although the beginning of the process of digital transformation of the sector is already visible, there is no holistic view of the sector which means that the currently existing information systems (IS) do not present a useful response to the needs faced by the sector. Therefore, an architecture proposal was conceived and described for an IS that provides a useful, efficient, and agile response to the needs of the entire thermalism sector and its stakeholders. Keywords Thermalism · Thermal SPA · Information systems architecture · Thermalism observatory · Digital transformation F. Branco (B) · R. Gonçalves Universidade de Trás-os-Montes e Alto Douro, Vila Real, Portugal e-mail: [email protected] R. Gonçalves e-mail: [email protected] F. Branco · R. Gonçalves · J. Martins INESC TEC, Porto, Portugal e-mail: [email protected] C. Gonçalves · R. Gonçalves · J. Martins AquaValor – Centro de Valorização e Transferência de Tecnologia da Água, Chaves, Portugal e-mail: [email protected] F. Moreira REMIT, IJP, University Portucalense & IEETA, University of Aveiro, Aveiro, Portugal e-mail: [email protected] M. Au-Yong-Oliveira INESC TEC, GOVCOPP, Department of Economics, Management, Industrial Engineering and Tourism, University of Aveiro, Aveiro, Portugal e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 S. Anwar et al. (eds.), Proceedings of International Conference on Information Technology and Applications, Lecture Notes in Networks and Systems 614, https://doi.org/10.1007/978-981-19-9331-2_57


1 Introduction The use of information systems (IS) and information and communication technologies (ICT) in the health and wellness sector has been steadily increasing over the last few years, as a result of a need to improve both the available knowledge and the decision-making processes [11]. Even assuming an incredible proximity to the health and wellness areas, the thermalism sector has failed to transition to a more technology-based reality [1]. Although it is an activity and sector in full development, widely perceived as essential for the prevention of disease and the promotion of health, the existing literature does not highlight a transversal incorporation of innovation (essential to the continuous improvement of both treatment techniques and thermal SPAs). Alongside this, it is also extremely difficult to identify a set of IS designed specifically to address the full scope of the sector's needs, or even a fool-proof architecture that supports the implementation of such IS, which impairs the efficiency and effectiveness of these organizations' managerial operations. In the specific context of this article, that of the Portuguese thermal SPAs (in conjunction with the Portuguese Thermal SPAs Association), the lack of an efficient and effective digital solution to support decision-making, capable of providing detailed, real-time quality data, was decisive for the idealization of an IS architecture capable of supporting not only the specific technical (and medical) processes, but also the vertical management of the organization. Thus, a project was developed whose goal was to conceptualize an IS architecture capable of responding to the challenges inherent to the digital transition of thermalism and to the sector's need for integration with its various stakeholders (public and private). The proposed architecture was developed with a commitment to being agile and flexible enough to respond quickly to potential needs for modification, and assumes a modular nature in order to ensure that each of its functional groupings is well encapsulated and can be worked on individually. We are, as such, referring to an IS architecture that not only ensures data interoperability but also fosters it. This article begins with a brief description of the context in which the entire research and development project was carried out. In the second section, we present the conceptual framework that served as a basis for the R&D activities. The sector that served as a case study, thermal tourism, is briefly presented in the third section. The fourth section presents in detail the IS architecture for the thermalism sector that we propose. The article ends with a series of final considerations.


2 Conceptual Background 2.1 Management Information Systems (MIS) IS have undergone several modifications over the years due to constant technological evolution. Despite this technological evolution, and more and more new business models emerging, the fundamental role of information technology (IT) in organizations remains [14]. The basic functions of an IS are to collect, process, store and analyse information considering a certain purpose, i.e., all IS having a purpose and a social context associated [27]. Some authors characterize IS as a composite concept, which combines both the technical component and the human activities within an organization [3, 12]. From a functional point of view, IS allow companies to achieve excellence in operational management, improve decision-making processes, increase employee productivity, and develop new products and services [18]. According to Branco et al. [8], IS affect not only the decisions of managers in organizations but also the way in which they plan and manage the available resources. Hence, and drawing on Laudon and Laudon [18], a management information system (MIS) is then characterized as a system whose purposes allow to collect information that assists organizations in decision-making processes, and that allows to monitor and control the business and predict future performance.

2.2 The Information Systems Within Organizations IT has drastically changed the business landscape. With the rapid growth of the computer industry and the increase in the functionality of computers, organizations have at their disposal a considerable volume of information [5, 22]. In terms of the management and decision-making inherent to an organization, the results achieved through the use of IS have already fully surpassed those common to manual processes supported by non-digital tools [23]. According to Laudon and Laudon [18], the success of a business is directly related to the quality of the IS that the organization implements. Thus, realizing its framework, the goal of all IS is to improve the performance and efficiency of an organisation and trigger the continuous use of innovative digital technologies to support the existing business processes [4, 15]. Thus, it is then crucial that IS are easy to use, that they maintain an information flow with quality and efficiency and that they are in accordance with the requirements and goals that the organisation wants to achieve [6].


2.3 Health and Wellness Information Systems As organizations in the health and wellness sectors are modernizing and adapting to the new challenges of a society that, due to its lifestyle (which tends to be the opposite of what can be called as healthy), requires almost permanently specialized support (either medical or technical) [10, 28], it is also easily perceivable the investment, that is already being made, in the referred organisations’ empowerment and in the incorporation of both innovation and technological improvements. Thus, and as argued by Birken et al. [7], health care and wellness organisations tend to be more aligned with international best practices and current standards and regulations. From a business point of view, health care and wellness organisations are extremely focused entities that try, as much as possible, to base their business decisions and their products/services on a highly structured information base, achieved through highly reliable and timely data [17]. Thus, and as argued by Magalhães et al. [20], even though the previously mentioned organizations are already taking on the challenge inherent to the digital transition and to the transversal incorporation of IS and technologies that allow, in a very effective way, to improve their services, optimise resources and adapt business models more easily to the challenges they face in order to improve the relationship with their customers/patients, there is still a long way to go, mainly due to the inexistence of formal support for the successful implementation of such IS.

3 The Portuguese Thermalism Sector—A Case Study The concepts of health and wellness have been evolving over the last few years, being currently perceived not as a state in which the person is facing the proven absence of illness [26], but rather as a state of total physical, mental and social harmony that allows the creation of dynamics fostering a global vision based on the “One Health” paradigm [29]. According to the Global Wellness Institute, our health status tends to reflect the interactions between our genetic background and a series of external variables, including the environmental conditions in which we live, the socioeconomic factors underlying our daily life, and also the ease of access to quality health and well-being services [30]. We are thus facing a turning point with regards to the pursuit of a healthy and happier life, in which there is a more significant preoccupation with individual and permanent well-being, with the search for constant happiness and a desire to fulfil ourselves personally [13]. One of the areas that has grown due to this incessant search for behaviours that promote health and prevent diseases is thermalism. As the Portuguese Thermal SPAs Association [2] tells us, the concept of thermalism is closely linked to services to improve the quality of life that are based on premises focused on the prevention of disease and that involve the application of thermal techniques and treatments in which natural mineral water is the main element. The


sustained growth observed in the SPA sector over the last few years is, according to the World Tourism Organization, something that will continue to happen over the coming decades, mainly due to the emergence of the previously mentioned consumption patterns focused on individual wellness [9]. Now, this (positive) forecast is highly relevant, for example, for the territories that have the endogenous resource “Natural Mineral Water” as an asset of great relevance both from an economic and a regional development perspective [25], as is the case of low-density rural territories, among which we can highlight the Alto Tâmega region in Portugal [24]. From a functional and operational perspective, the thermalism sector is mostly composed of thermal (hot springs) SPAs of both private and public natures. We are then dealing with small and medium enterprises, mostly composed of technical staff specialized in thermal treatments, who are assisted by an administrative and operational support body composed of non-specialized employees. The management of the thermal SPAs is typically defined by appointment and tends to be altered as the municipal executives undergo changes. With regards to the managerial actions implemented (operational, management, and strategic), these are typically maintained by a process still considerably supported by unstructured physical formats, from which it is very difficult to collect in a timely, structured and consistent manner the information and knowledge not only to ensure efficient management of the thermal SPAs, but also to ensure that the services are provided in the best way and in line with customer needs. In view of the above, the need for the implementation of IS based on a holistic view not only of individual thermal SPAs, but of the whole sector (including its stakeholders) is quite evident. The implementation of this type of IS would allow, in parallel, a significant improvement in the management, administration and internal decision-making of each SPA, and an increase in the clarity of perception that the various sector stakeholders (ATP, Government, etc.), have about the role of thermal tourism in health promotion and disease prevention.

4 Proposed Information System Architecture for the Thermalism Sector In order for it to be possible to achieve the implementation of an IS capable of responding efficiently, with high quality and in a useful manner to the current (and future) needs of the thermalism sector, particularly in terms of control of its operation, a proposal has been developed for a modular architecture that includes all contexts of the sector and that foresees (and stimulates) interaction and interoperability between all of the parties (Fig. 1). The proposed architecture, which envisions the future implementation of a “Thermal Activity Observatory” (an IS focused on supporting the management and control of thermal SPAs and the thermalism sector itself), is composed of 12 major components, ranging from the technological infrastructure responsible for supporting


Fig. 1 Proposed architecture of an observatory of thermal activity that will serve as a basis for the entire sector of thermalism in Portugal

all the components of the IS, to a component responsible for encompassing a set of integration mechanisms with external systems and solutions. In functional terms, it was possible to establish that the main actors that will interact with the various features of the IS should be the Portuguese Thermal SPAs Association (and its associated thermal SPAs), the Directorate-General of Health, the Directorate-General of Energy and Geology, the Ministry of Social Security and the Ministry of Finance.
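To make the modular decomposition more tangible, the sketch below (in Python) registers the functional groupings detailed in the subsections that follow and the actors just listed; all class, module and actor names are illustrative assumptions for this text and do not reproduce the authors' actual design (only the groupings described in Sects. 4.1 onwards are included, not all twelve components).

from dataclasses import dataclass, field
from enum import Enum, auto


class Actor(Enum):
    # External actors named above (illustrative identifiers only).
    THERMAL_SPAS_ASSOCIATION = auto()
    DIRECTORATE_GENERAL_OF_HEALTH = auto()
    DIRECTORATE_GENERAL_OF_ENERGY_AND_GEOLOGY = auto()
    MINISTRY_OF_SOCIAL_SECURITY = auto()
    MINISTRY_OF_FINANCE = auto()
    GENERAL_PUBLIC = auto()


@dataclass
class Module:
    # One encapsulated functional grouping of the observatory.
    name: str
    consumers: set = field(default_factory=set)


# Hypothetical registry of the groupings detailed in the subsections below.
OBSERVATORY_MODULES = [
    Module("operational_management"),
    Module("therapeutic_effects_monitoring"),
    Module("access_control"),
    Module("external_integrations",
           {Actor.MINISTRY_OF_SOCIAL_SECURITY, Actor.MINISTRY_OF_FINANCE}),
    Module("business_analytics"),
    Module("reporting_service"),
    Module("information_support"),
    Module("intranet"),
    Module("user_interface", set(Actor)),
]

Keeping each grouping behind such an explicit registry is one simple way of preserving the encapsulation and interoperability goals stated for the architecture.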

4.1 Operational Management of Thermal SPAs The operational management module is one of the most relevant components of the proposed IS architecture since it will be here that the management and monitoring functionalities (human resources, customers, available and applied thermal treatments, sales, purchases, suppliers, resources, etc.) will be implemented. This module will be one of the most relevant in terms of generating properly structured and quality data so that it is possible to feed the decision-making support modules.
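As a hedged illustration of the kind of properly structured data this module could emit to the decision-support modules, the following Python sketch uses a hypothetical AppliedTreatment record and a hypothetical export function; the actual schema of the observatory is not specified in the text.

from dataclasses import dataclass, asdict
from datetime import date
import json


@dataclass
class AppliedTreatment:
    # One structured operational record (hypothetical schema).
    spa_id: str
    customer_id: str
    treatment_code: str
    therapist_id: str
    applied_on: date
    price_eur: float


def export_for_decision_support(records):
    # Serialize operational records so that downstream modules
    # (analytics, reporting) receive properly structured data.
    payload = []
    for r in records:
        row = asdict(r)
        row["applied_on"] = r.applied_on.isoformat()
        payload.append(row)
    return json.dumps(payload)


print(export_for_decision_support(
    [AppliedTreatment("chaves-01", "c-123", "hydromassage", "t-07",
                      date(2022, 5, 10), 35.0)]))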


4.2 Monitoring the Therapeutic Effects of the Thermal Activity The monitoring of the therapeutic effects of thermalism is one of the most critical activities, not only for the thermal SPAs, but mainly for the stakeholders of the sector, among which we can highlight the Government, for the regulatory and incentivizing role it may have. As indicated by Martins et al. [21], one can monitor the effects of thermal SPAs by using a dynamic and digital system that generates effective data from a set of physiological indicators of each of the thermal SPA patients. Therefore, the IS to be implemented must be able to interact continuously with digital solutions developed to monitor the therapeutic effects of thermalism.
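A minimal sketch of such continuous interaction is given below, assuming a hypothetical PhysiologicalReading record and indicator names; the concrete monitoring solutions referenced by Martins et al. [21] are not reproduced here.

from dataclasses import dataclass
from datetime import datetime
from statistics import mean
from typing import Optional


@dataclass
class PhysiologicalReading:
    # One indicator sample delivered by an external monitoring solution.
    patient_id: str
    indicator: str          # e.g. "heart_rate" (hypothetical name)
    value: float
    measured_at: datetime


def summarize_indicator(readings, indicator) -> Optional[float]:
    # Average one indicator over a treatment programme so that it can be
    # stored alongside the patient's thermal-treatment history.
    values = [r.value for r in readings if r.indicator == indicator]
    return mean(values) if values else None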

4.3 Access Control The correct operation of an IS relies on the existence of a control mechanism that establishes the rules of access to the various functionalities of the IS and to the very data it generates [8, 16]. In this sense, and in order to ensure that all the identified actors can access the IS in a straightforward way, an independent module was defined that will be responsible for user management, the management of access profiles, the assignment of permissions to individual users and profiles, as well as the effective control of compliance with the established access rules.
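A profile-based permission check of the kind described could look like the following sketch (illustrative profile and permission names only; this is not the module's actual specification).

from dataclasses import dataclass, field


@dataclass(frozen=True)
class Profile:
    # An access profile grouping permissions (illustrative names).
    name: str
    permissions: frozenset


@dataclass
class User:
    username: str
    profiles: list = field(default_factory=list)
    extra_permissions: set = field(default_factory=set)

    def can(self, permission: str) -> bool:
        # Effective check combining profile-level and individual grants.
        return (permission in self.extra_permissions
                or any(permission in p.permissions for p in self.profiles))


REGULATOR = Profile("regulator", frozenset({"reports:read", "analytics:read"}))
SPA_MANAGER = Profile("spa_manager",
                      frozenset({"operational:read", "operational:write",
                                 "reports:read"}))

inspector = User("dgs_inspector", [REGULATOR])
assert inspector.can("reports:read") and not inspector.can("operational:write")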

4.4 Integrations Realizing the potential social and economic impact of the implementation of an IS that not only allows monitoring and supporting the individual operation of thermal SPAs, but also of the entire thermal sector, the architecture proposal presented contemplates a functional module whose focus is the implementation and availability of mechanisms of integration of the IS itself with external systems belonging to the Ministry of Social Security and the Ministry of Finance. These integrations are necessary in order to extrapolate the monitoring of both social and economic effects of thermal SPAs, to the point where it is possible to explore the effects in terms of, for example, absenteeism, sick leave, medication consumption and even recourse to the National Health System itself.
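The sketch below illustrates one possible shape for such integrations, assuming hypothetical adapter objects and made-up indicator names; the real interfaces of the Ministry of Social Security and the Ministry of Finance are not described in the text.

def social_economic_effects(social_security_client, finance_client,
                            region: str, year: int) -> dict:
    # Pull externally held indicators (absenteeism, sick leave, declared
    # revenue) so they can be correlated with the observatory's own data.
    # Both clients are assumed to expose a fetch_indicator(name, region,
    # year) call; the real ministry interfaces are not specified here.
    return {
        "sick_leave_days": social_security_client.fetch_indicator(
            "sick_leave_days", region, year),
        "absenteeism_rate": social_security_client.fetch_indicator(
            "absenteeism_rate", region, year),
        "declared_spa_revenue": finance_client.fetch_indicator(
            "spa_sector_revenue", region, year),
    }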


4.5 Business Analytics The IS that is recommended in the proposed architecture provides the generation of added value for the SPA sector through the creation of large volumes of properly structured and recoverable data. Therefore, in order to be able to implement detailed and useful analyses for all stakeholders, the existence of a business analytics module was conceived, which will not only incorporate the functionalities of intelligent (and automated) data analysis, but also the functionalities of customizable visualization of the same, in a logic of self-service business analytics.
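As a small illustration of the automated, self-service analysis this module is meant to offer, the sketch below aggregates hypothetical operational records into monthly indicators; the real analytics functionality would of course be far richer.

from collections import defaultdict


def monthly_kpis(applied_treatments) -> dict:
    # Aggregate hypothetical operational records, e.g.
    # {"applied_on": date(2022, 5, 10), "price_eur": 35.0},
    # into per-month counts and revenue for self-service dashboards.
    kpis = defaultdict(lambda: {"treatments": 0, "revenue_eur": 0.0})
    for record in applied_treatments:
        month = record["applied_on"].strftime("%Y-%m")
        kpis[month]["treatments"] += 1
        kpis[month]["revenue_eur"] += record["price_eur"]
    return dict(kpis)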

4.6 Reporting Service The thermal SPA sector is, by its own regulation, required to implement reporting routines whose focus is the control of the operation of thermal SPAs and compliance with the legal obligations that apply to them (e.g., quality control of natural mineral waters). Currently, these routines are based on manual procedures, highly prone to human error, and occupy the scarce resources of the thermal SPAs for a significant period of time. Thus, it was established in this IS architecture proposal, that a module is responsible for the analysis and summary of the existing data and, consequently, for the automated creation of all the necessary reports so that the SPAs can achieve the necessary compliance.
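A minimal sketch of such automated report generation is shown below; the sample schema and the pH limits are assumptions made for illustration and do not correspond to the actual legal reporting requirements.

from datetime import date
from statistics import mean


def water_quality_report(spa_name, samples, ph_limits=(6.0, 9.0)) -> str:
    # Draft a periodic compliance report from stored analyses. Each sample
    # is assumed to look like {"taken_on": date(...), "ph": 7.2}; the
    # schema and the pH limits are illustrative, not the legal thresholds.
    average_ph = mean(s["ph"] for s in samples)
    status = "within" if ph_limits[0] <= average_ph <= ph_limits[1] else "outside"
    return (f"Natural mineral water quality report - {spa_name}\n"
            f"Generated on {date.today().isoformat()} from {len(samples)} samples\n"
            f"Average pH: {average_ph:.2f} ({status} the illustrative limits)")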

4.7 Information Service Support Although thermalism is typically associated with non-invasive treatments (of a therapeutic nature), there is an immense technical complexity associated [19]. In this way, and in order to streamline access to technical and scientific information about thermalism, natural mineral water and thermal treatments, a functional module was devised that will serve as a repository of knowledge that should be maintained by the thermal SPAs and by the Portuguese Thermal SPAs Association itself, and that can be consulted by the public. Intranet The capacity of the IS, adjacent to the proposed architecture, to meet the needs of the various thermal SPAs is dependent on how it ensures watertight isolation of the operational data that are unique to each of the organisations. Therefore, a functional module has been devised which implements a conditional access interface and which is available only internally for each of the SPAs. In this module, all the functionalities that allow thermal SPA operational management and control will be available and, conditionally, it will provide global data for the rest of the IS.
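The following sketch illustrates the intended isolation in a simplified way, with a hypothetical per-SPA store that only releases aggregated figures to the global modules; it is not the authors' implementation.

from collections import defaultdict


class SpaDataStore:
    # Keeps each SPA's operational records isolated from one another;
    # only aggregated, non-identifying figures leave the intranet boundary.

    def __init__(self):
        self._records = defaultdict(list)

    def add_record(self, spa_id, record):
        self._records[spa_id].append(record)

    def own_records(self, spa_id):
        # Intranet view: full detail, but only for the requesting SPA.
        return list(self._records[spa_id])

    def global_summary(self):
        # What the rest of the observatory may see: record counts only.
        return {spa_id: len(rows) for spa_id, rows in self._records.items()}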


4.8 User Interface Global access to the IS will be made through a single interface that will allow access to the functionalities of the business analytics module, all the reports that the reporting module produces, as well as to the repository of technical and specialized knowledge inherent to the IS support module. It will be through this interface that the various actors identified, and even the general public itself, will be able to have access to the evolution of the SPA sector over time and its impacts on health promotion, disease prevention and ensuring the overall well-being of the SPA population.

5 Conclusions and Future Work The thermal SPA sector is a growing sector whose focus is on health promotion and disease prevention centred on the application of therapeutic and wellness techniques based on natural mineral waters. Although the sector is quite significant and even structurally and operationally organized in international terms, this is clearly not yet the case in Portugal, since most thermal SPAs support their operation with generic IS that do not provide a complete response of sufficient quality. Considering the present context, we idealized and proposed an IS architecture for the thermal sector in Portugal, called the "Thermal Activity Observatory", whose main challenge is to represent an asset of added value through the generation of structured, real-time quality data, aligned with the development strategy of the sector and of the SPAs themselves, and able to solidly support the decision-making process. In addition to this, and relying on the paradigms of the continuous need for integration and interoperability, the proposed architecture has a modular nature that foresees the interaction of all the stakeholders of the SPA sector. In light of the existing technical and scientific knowledge, we consider that this proposal is a valid conceptual contribution that will provide stable support for the implementation of a highly reliable information system for the SPA sector.

References 1. Anaya-Aguilar R, Gemar G, Anaya-Aguilar C (2021) Challenges of spa tourism in Andalusia: experts’ proposed solutions. Int J Environ Res Public Health 18(4):1829 2. ATP (2019) O que são serviços de bem-estar termal? Associação de Termas de Portugal. https:// termasdeportugal.pt/faqs 3. Avgerou C, McGrath K (2007) Power, rationality, and the art of living through socio-technical change. MIS Q Manag Inf Syst. https://doi.org/10.2307/25148792 4. Baiyere A, Salmela H, Tapanainen T (2020) Digital transformation and the new logics of business process management. Eur J Inf Syst 29(3):238–259


5. Bani-Hani JS, Al-Ahmad NMM, Alnajjar FJ (2009) The impact of management information systems on organizations performance: field study at Jordanian Universities. Rev Bus Res 6. Bessa J, Branco F, Costa A, Martins J, Gonçalves R (2016) A multidimensional information system architecture proposal for management support in Portuguese higher education: the university of Tras-os-Montes and Alto Douro case study. In: 2016 11th Iberian conference on information systems and technologies (CISTI), pp 1–7. https://doi.org/10.1109/CISTI.2016. 7521508 7. Birken SA, Lee S-YD, Weiner BJ (2012) Uncovering middle managers’ role in healthcare innovation implementation. Implement Sci 7(1):1–12 8. Branco F, Gonçalves R, Moreira F, Au-Yong-Oliveira M, Martins J (2020) An integrated information systems architecture for the agri-food industry. Expert Syst 1–15. https://doi.org/ 10.1111/exsy.12599 9. Carvalho C (2017) Health tourism & Estoril resort’s rebirth: from thermal springs to the contemporary wellness centre. Tour Hosp Int J 9:42–58 10. Casey MM, Call KT, Klingner JM (2001) Are rural residents less likely to obtain recommended preventive healthcare services? Am J Prev Med 21(3):182–188 11. Dahl AJ, Milne GR, Peltier JW (2019) Digital health information seeking in an omni-channel environment: a shared decision-making and service-dominant logic perspective. J Bus Res 125:840–850 12. Davis GB (2000) Information systems conceptual foundations: looking backward and forward. 61–82. https://doi.org/10.1007/978-0-387-35505-4_5 13. del Río-Rama M, Maldonado-Erase C, Álvarez-García J (2018) State of the art of research in the sector of thermalism, thalassotherapy and spa: a bibliometric analysis. Eur J Tour Res 19:56–70 14. DeLone WH, McLean ER (2004) Measuring e-commerce success: applying the DeLone and McLean information systems success model. Int J Electron Commer 9(1):31–47. https://doi. org/10.1080/10864415.2004.11044317 15. Eroshkin SY, Kameneva NA, Kovkov DV, Sukhorukov AI (2017) Conceptual system in the modern information management. Procedia Comput Sci 103:609–612 16. Evered M, Bögeholz S (2004) A case study in access control requirements for a health information system. In: Proceedings of the second workshop on australasian information security, data mining and web intelligence, and software internationalisation, vol 32, pp 53–61 17. Langell JT (2019) Evidence-based medicine: a data-driven approach to lean healthcare operations. Int J Healthc Manag 1–4. https://doi.org/10.1080/20479700.2019.1641650 18. Laudon KC, Laudon JP (2007) Management information system: managing digital firm. Int J Comput Commun Control 19. Leandro ME, da Silva Leandro AS (2015) Da saúde e bem-estar/mal-estar ao termalismo. Sociologia 30:75–96 20. Magalhães D, Martins J, Branco F, Au-Yong-Oliveira M, Gonçalves R, Moreira F (2020) A proposal for a 360° information system model for private health care organizations. Expert Syst 37(5):e12420. https://doi.org/10.1111/exsy.12420 21. Martins J, Moreira F, Au-Yong-Oliveira M, Gonçalves R, Branco F (2021) Digitally monitoring thermalism health and wellness effects—a conceptual model proposal. In: World conference on information systems and technologies 22. Mithas S, Tafti A, Mitchell W (2013) How a firm’s competitive environment and digital strategic posture influence digital business strategy. MIS Q 37(2):511–536 23. Peppard J (2018) Rethinking the concept of the IS organization. Inf Syst J 28(1):76–103 24. 
Pinos Navarrete A, Shaw G (2020) Spa tourism opportunities as strategic sector in aiding recovery from Covid-19: the Spanish model. Tour Hosp Res 1–6. https://doi.org/10.1177/146 7358420970626 25. Silvério ACB (2020) Determinantes da satisfação e perfil do cliente no balneário termal de Chaves. Instituto Politécnico de Bragança 26. Strout K, Ahmed F, Sporer K, Howard EP, Sassatelli E, Mcfadden K (2018) What are older adults wellness priorities? A qualitative analysis of priorities within multiple domains of wellness. Healthy Aging Res 7(2):e21

27. Turban E, McLean ER, Wetherbe JC (2007) Information technology for management: transforming business in the digital economy, p 20 28. Vandenbosch J, Van den Broucke S, Vancorenland S, Avalosse H, Verniest R, Callens M (2016) Health literacy and the use of healthcare services in Belgium. J Epidemiol Community Health 70(10):1032–1038 29. Xie T, Liu W, Anderson BD, Liu X, Gray GC (2017) A system dynamics approach to understanding the One Health concept. PLoS One 12(9):e0184430 30. Yeung O, Johnston K (2018) Global wellness economy monitor: thermal/mineral springs. https://globalwellnessinstitute.org/wp-content/uploads/2019/05/ThermalMineralSpri ngs_WellnessEconomyMonitor2018revfinal.pdf

Technologies for Inclusive Wellbeing: IT and Transdisciplinary Applications Anthony L. Brooks

Abstract This text positions the research of the 2021 and 2022 ICITA keynote speaker, namely, Dr Professor Anthony Brooks of Aalborg University, Denmark. It presents an introduction to his nearly four-decade body of mature research targeting and working towards "Probably the best rehabilitation complex in the world". The text includes aspects on eXtended Reality (XR); his emergent model ZOOM (the Zone of Optimized Motivation) and Artificial Intelligence (AI) within SoundScapes and Virtual Interactive Space (VIS), the author's research vehicles (method and apparatus, as patented); a workforce of the future in (Re)habilitation (by any other name); closing with a Discussion segueing to a Conclusion. Interested readers may peruse the author's approximately 250 related publications, which include a trilogy of books titled Technologies for Inclusive Wellbeing (i.e., Brooks et al. in "Recent advances in technologies for inclusive well-being: from worn to off-body sensing, virtual worlds, and games for serious applications", see Springer, 2017; Brooks et al. in "Recent advances in technologies for inclusive well-being: virtual patients, gamification and simulation", see Springer, 2021; Brooks et al. in "Technologies of inclusive well-being serious games, alternative realities, and play therapy", see Springer, 2014) that align with this text and with both his 2021 Dubai Knowledge Park (UAE) ICITA keynote and his 2022 Lisbon University (Portugal) ICITA keynote. Keywords Third Culture · IT Applications art and health · Entertainment Computing · Healthcare · (Re)habilitation · Wellbeing

1 Introduction I am employed at Aalborg University under the CREATE department under which approximately twenty years ago I was a founder of the Medialogy education. I am a senior researcher having been an Associate Professor for those twenty years since A. L. Brooks (B) CREATE, Aalborg University, Aalborg, Denmark e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 S. Anwar et al. (eds.), Proceedings of International Conference on Information Technology and Applications, Lecture Notes in Networks and Systems 614, https://doi.org/10.1007/978-981-19-9331-2_58


my employment began at Aalborg university. In that period I established, designed and acquired funding for a research and education complex titled SensoramaLab. My focus is upon my own body of work that has developed over many years as a hybrid entity contributing to and across the fields of healthcare, (re)habilitation, and wellbeing and as well within the field of the arts in the form of interactive installations, performances and more. Development has included building upon my prior research where for most of the 1990s I developed a bespoke system for creative expression, video game play, and robotic device manipulation. Manipulations were via gesture-control movements within invisible sensor-based interactive spaces. Generated signals were mapped to trigger and manipulate digital media via interim software that could be programmed for a tailoring of the created interactive environments. Applications of the IT-based (modular) system cover the arts and health. However, to be clear—I am not a therapist or medical professional—but over many years—my bespoke research outputs have been applied and researched alongside physiotherapists and other healthcare professionals such as psychologists, neuropsychologists, and doctors as well as care workers, occupational therapists, and others. Family members have also been involved. Within the arts the work has been showcased at Museums of Modern Art as interactive installations and exhibitions (the first in 1978 at the Institute of Contemporary Arts in London), as stage performances, and at major events such as the cultural event supporting the Olympics and Paralympics (1996 and 2000). The positioning of the work across disciplines aligns to CP Snow’s thesis [13] on two cultures where he posited that transdisciplinary bridges needed building to advance both science and the humanities. Thus, this positioning has led to the author being referred to as a third culture thinker where the work sits on the cusp between applications. The research in CREATE is conducted in close collaboration with a wide range of companies, universities and local authorities where researchers collaborate in both national and international projects where they are typically investigating the interplay between creativity and technology and where this interplay is used to fulfil our ambition of delivering excellent human design centered technology research. CREATE is considered unique in the Danish context providing the foundation for a powerful new field of research and development of competence within the field of technology and design with a human focus. The author’s research, besides influencing the initial design of the Medialogy education (CREATE’s flagship study program), has thus been an ambassador for this profile and includes numerous activities interfacing the author’s system to Virtual/Augmented/Mixed Reality, Interactive digital media, and Video Games and other associated content mappings.


2 Virtual Reality and All That When the term "Virtual Reality" (or VR) is heard or read, it can conjure up visions of different technologies and intervention strategies. Typically, these days many will think of a person wearing a Head Mounted Display (HMD); a thread of such technology goes back to 1968 and the pioneering work of Ivan Sutherland, who published "A head-mounted three-dimensional display". Contemporary HMD apparatus can track a wearer's head orientation to match viewing of a surrounding 360-degree environment model. Hands can be tracked to enable interaction within the computer-generated virtual environment. Similarly, movement of the eyes can be tracked, and video recorded, to enable analysis of what the user looks at within the 360-degree environment. Additionally, pupil dilation can be recorded within the HMD device to analyze emotional effects associated with what the person experiences. Different peripheral devices augment the HMD experience, and many studies are ongoing within (re)habilitation using such technologies. VR has other acknowledged technology threads, beyond HMDs, dated to around the same time. For example, a device titled Sensorama from 1962, by Morton Heilig, has been referred to as the world's first Virtual Reality device. Patented in 1969 as an "experience theatre", the device offered multi-sensorial impressions of a virtual, ten-minute-long motorcycle ride through New York City. Apart from seeing the film by placing one's head into a viewing port (thus no wearable HMD), the Sensorama user would simultaneously experience the corresponding vibrations, head movements, sounds, and rushes of wind. In 1971 the patent was extended and renamed as a Sensorama Simulator for creating the illusion of reality. From more of an art perspective, computer scientist Myron Krueger is credited with what was titled "Artificial Reality": participants experienced a projected display of computer-manipulated camera images, sourced from a participant and a facilitator located in different rooms, scaled and mixed, and viewed with the naked eye (so again without any HMD) in real time as an interactive immersive environment. Krueger's books entitled Artificial Reality from 1983 and a second edition from 1991 are recommended reading. Simply stated, Virtual Reality is a simulated experience that can be similar to, or completely different from, the real world; in other words, figurative or abstract. In the CREATE Medialogy education where I am based, students study different VR technologies and both figurative and abstract content in their research with external partners for their projects. In an example case, abstract game play and experiencing a virtual Snoezelen (or controlled multisensory environment) was the created content, but with tailored patient control programmed to match the patient actions that the therapists wished to improve. Close collaborations between staff and our students and the support of such centres are important to develop and evolve such work. Other cases by students are where figurative environments have been created and patients are challenged with tasks they would encounter in the real world, interacting


with virtual objects to achieve for example making a virtual cup of coffee in a virtual kitchen. A prediction after observing our students collaborating with industry specialists, is that one day in the future (re)habilitation teams will include a Medialogist as the IT specialist, that is—someone capable of creating the technology, establishing optimal system set-up and system change parameters, and providing data from interventions for the medical professionals to analyze. The term Virtual Reality is an oxymoron—that is “a figure of speech in which apparently contradictory terms appear in conjunction”—and—as well as different preconceived ideas of what the term represents, and associated confusions the term offers, (such as presented in the preceding three history threads)—this can also lead to challenges in reporting interventions in sufficient detail and characteristic such that one can reflect that such studies can fall into what is referred to as the “Black Box of rehabilitation” that has been pointed to as preventing research progress, hindering patient outcomes, and making it harder to justify spending on rehabilitation services. In addition, there are many health and safety considerations of Virtual Reality— especially regarding prolonged use. Also, notably, eye testing by a qualified professional optometrist before HMD use is not typically carried out. I would suggest that many therapists/facilitators etc., would also prefer to directly see their patient’s full faces and especially their eyes to determine emotional changes or identify discomfort. I’d like to now move from the general to the specific and briefly introduce my own body of research titled SoundScapes that focuses upon Virtual Interactive Space (VIS).

3 SoundScapes/Personics1 Typically, in Virtual Environments (be they figurative or abstract and titled 'Virtual Reality' or otherwise) there are five distinct design specification areas that can be discussed. These are (1) the human at the center of the session/design, here labelled as "Patient", including given tasks/challenges; (2) the selected interface or interfaces to suit the participant and the intervention's targeted output, here labelled "Interface(s)"; (3) the manipulation of the data downstream of the interface to suit the designed activities, labelled here as "Software"; (4) the "Content", which can be software programs, robotics or similar; and (5) the "Presentation means", so for example, speakers for stereo audio, projection of visuals on a screen or monitor, or another device such as an HMD. Each area offers an opportunity for tailoring experiences for everyone within (re)habilitation sessions within a dedicated treatment program. Thus, a therapist/facilitator can be prepared with change elements to determine within a session to

1 Personics (Personal Interactive Communication Systems) was the title of the commercial company based upon the author's research.


alter differing incremental challenges according to patient progress, or state, to tailor the experience for optimal impact within the created 'SoundScapes' or VIS. Both terms relate to a commercial entity titled Personics (acronym for Personal Communication System) resulting from the author's research, which was registered around the millennium by a third party in Denmark. A successful tailoring of experience results in closure of the human afferent-efferent neural feedback loop, whereby a state referred to as 'Aesthetic Resonance' (AR) is achieved, that is, a situation in which the response to intent is so immediate and aesthetically pleasing as to make one forget the physical movement (and often effort) involved in the conveying of the intention. The term AR was coined within a European Commission IST funded project titled CARE HERE (Creating Aesthetic Resonant Environments for Handicapped, Elderly, and REhabilitation), IST-20001-32729, see http://www.bris.ac.uk/carehere/Postprojectreflections.html. In most cases, the feedback stimuli can be matched to the patient profile and input, which motivates his/her engagement and participation. Each SoundScapes set-up ideally involves two designs associated with a single set-up. The first design is for optimizing the patient's experience (where fun, playful, enjoyable and creative interactions are targeted such that an effort is made not to "prime" the intervention as being therapy or training etc.); in other words, the patient just enters, and it works (this involving preparatory role-playing by staff familiar with the patient). The second design targets the therapeutic outcome, which is hidden under the fun and creative interactions; thus, purposeful intervention targets patient progression. This involves data logging archives (e.g., session video recordings, and system data including interface/software/content/presentation means, each as definable parts of the whole). Analysis is also of sub-parts, where the previous parts become the whole, in line with the hybrid action research/hermeneutic methodology central to my PhD dissertation titled SoundScapes (see https://vbn.aau.dk/files/55871718/PhD.pdf). Patient and facilitator actions, responses, etc., are also archived for correlation analysis under this methodology.
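As an illustration only, the five design areas and the in-session change parameters could be captured in a structure such as the following Python sketch; the field names and example values are hypothetical and do not reproduce the patented method and apparatus.

from dataclasses import dataclass, field


@dataclass
class VISSessionSetup:
    # The five design areas of one session; names and values are illustrative.
    patient_profile: str          # abilities and targeted tasks/challenges
    interfaces: list              # e.g. sensor-based interactive spaces
    mapping_rules: dict           # software: input signal -> media parameter
    content: str                  # e.g. abstract soundscape, game, robotics
    presentation: list            # e.g. speakers, projection, monitor
    change_parameters: dict = field(default_factory=dict)

    def adjust(self, parameter, value):
        # In-session tailoring by the facilitator.
        self.change_parameters[parameter] = value


session = VISSessionSetup(
    patient_profile="upper-limb reach, low effort tolerance",
    interfaces=["invisible infrared sensor space"],
    mapping_rules={"hand height": "pitch", "hand speed": "volume"},
    content="abstract sound painting",
    presentation=["stereo speakers", "wall projection"],
)
session.adjust("sensor_zone_width_cm", 40.0)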

4 ZOOM (Zone of Optimized Motivation) From my many years of field work it became clear that a model or framework was required to support the therapists' systematic intervention and evaluation with/of digital media technologies. Thus, the model titled 'Zone of Optimized Motivation' (or ZOOM) was detailed, reflecting my own experiences, to be used with the created system but also for others to explore and refine. The model was first published at the IEEE Healthcom event in South Korea [5]. Subsequently, its most recent iteration was reported in a related book chapter [1]. Publications on this action research/hermeneutic methodology and the ZOOM model have aimed to optimize intervention with digital media tools such as the author's


system, as well as to encourage adoption and uptake so as to enable research therapists to critique and improve the model. But this has not happened; therefore ZOOM has undergone an uplift so that, speculatively, it may contribute in future in an updated form proposed as a grounding for building Artificial Intelligence deep learning algorithms, where input is garnered automatically and systematically from the sessions with therapists, facilitators and associated health professionals. Briefly, the emergent model ZOOM arose from my field work, wherein I acted as facilitator because the therapists did not understand the technology well enough to operate it. Operation requires knowledge of the system, including the designed in-session change parameters that are integral to optimizing patient engagement and motivation. The model is built upon a synthesis of theoretical perspectives associated with Vygotsky ([14], the ZPD and the masterful other), Leontyev ([11], activity), Csikszentmihalyi ([6], flow), Schön ([12], the reflective practitioner, in-action/on-action), Fischer ([8], microdevelopment and scalloped neuro-learning), and other influences. There are two phases: in-action (session) and on-action (post/pre-session analysis and refinement). To support the facilitator, system pre-sets are supplied for the in-action phase. At a certain intervention point, the facilitator detects a deviation of participant engagement; a change of pre-set targets is then needed to re-engage. Re-engagement takes time, such that successful re-engagement establishes a new challenge profile according to ability. This is a cumulative, recursive reflection process in which each action research part is analyzed in line with a hermeneutic model.
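A highly simplified sketch of the in-action phase, under the assumption that engagement can be read as a single score and that pre-sets can be applied programmatically (neither of which is claimed by the model itself), might look as follows.

def zoom_in_action(presets, read_engagement, apply_preset,
                   steps=100, engagement_floor=0.5):
    # Start from prepared pre-sets; whenever observed engagement drops
    # below a floor, move to the next pre-set (a new challenge profile).
    # read_engagement and apply_preset stand in for the facilitator's
    # observation and the system change; the log feeds on-action reflection.
    log = []
    current = 0
    apply_preset(presets[current])
    for step in range(steps):
        engagement = read_engagement()
        log.append({"step": step, "preset": current, "engagement": engagement})
        if engagement < engagement_floor and current + 1 < len(presets):
            current += 1
            apply_preset(presets[current])
    return log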

5 Workforce of the Future Relative to the above, I posit that future medical rehabilitation teams will need to be supplemented with employed personnel who have combined technology and human expertise, such as graduate Medialogists, alongside a next generation of healthcare professionals who are aware of the potentials of such systems as the author's. This is to overcome intervention weaknesses, and to manage and oversee the AI deep learning aspect which, through its contribution to a taxonomy, is anticipated to improve shared learning by optimizing the characterization of interventions and, importantly, to enable the inclusion of knowledge generated from 'black box' rehabilitation activities in such rehabilitation treatment and behavior change specification systems. Additionally, the burden of therapist/facilitator reporting after every session, and the current lack of documentation detail (as illustrated and discussed in the literature, e.g., [7]), is posited as being reduced and improved via such automated AI deep learning supplementing the field. Given my limited background, this research is argued solely from an artistic/scientific research position. Thus, it includes the documentation of an artistic process, including documentation of new results and new knowledge gained during


Fig. 1 A continuum of different ways of generating knowledge and recognition2

the process. This is in line with the Royal Danish Academy who posited a continuum of different ways of generating knowledge and recognition under their position aligned with the Organization for Economic Co-operation and Development (OECD) and their definitions of research. In my case investigations and processes are informed across my background in the fields of art and health and as an inventor (thus respectfully abridged/inspired as in Fig. 1). Practice-based research across disciplines (such as in the author’s SoundScapes/VIS), as stated in the OECD model, is academic research involving actions and events, which are not systematic or exactly reproducible, but which nevertheless can help to produce academically recognized results. A publication on Transdisciplinary Applications is offered by OECD relating societal challenges and research strategies at https://one.oecd.org/document/DSTI/STP/GSF(2020)4/FINAL/En/pdf. Similarly, artistic research is posited in tune with artistic practice but combined with a set of definition-determined criteria for explicit reflection, documentation and dissemination, this is seen as a potential for helping to maintain, develop and disseminate knowledge related to processes, events and results. This potential is analogized in (re)habilitation under SoundScapes/VIS, especially when art and associated practices are used within interventions (as in the case for many ’people of determination’ [PoD]).

6 (Re)habilitation (By Any Other Name) Analogized to (re)habilitation, I posit inclusion of black box rehabilitation (suggested as continuum right and center in the figure) alongside the more valid and reliable rehabilitation intervention studies (for example on the left side of the continuum).

2 Figure abridged/inspired from an original Royal Danish Academy/OECD model posted at their previous website that seems no longer available at time of this edit—see https://royaldanishacademy.com/


“Habilitation” refers to a process aimed at helping persons of determination to attain, keep, or improve skills and functioning for daily living. For pediatric patients, habilitative therapy often aims to help a child develop motor skills that they have yet to accomplish. For example, a child with cerebral palsy may require the assistance of a physical therapist to learn how to sit. Because this is a skill that the child has yet to accomplish, the aim of the therapy is habilitation (see https://napacenter.org/difference-betweenhabilitation-and-rehabilitation/). In my body of research titled SoundScapes, I target to be able to empower as wide a range of participants as possible—especially people of determination—where an individual’s functional ability is sourced as system input, …..thus to include even locked-in patients where breath and eye-movement data can be input, …..through to higher functional abilities of people of determination with for example Cerebral Palsy, Down Syndrome, Acquired Brain Injury, PMLD (Profound and Multiple Learning Disabilities), etc. To do this individual tailoring, a range of interfaces, content, and activities can be selected, so, for example, game playing, music making, painting, robotic control can be selected in focus optionally with tangible output results, respectively, a recording of a game or composition, a printout of a painting, or video of robotic control. Virtual Reality can also be used. Thus, SoundScapes is conceptually considered as a flexible, tailorable and modular Virtual Interactive Space (VIS)—or scalable environment—(especially as typically affordable invisible sensors with different profiles are used). This rather than an unadaptable fixed system that may limit participation through not being widely inclusive and accessible for all, which is often a misconception of the SoundScapes system by peers. Relatedly, as I am sure the audience know, the term Rehabilitation typically refers to regaining skills, abilities, or knowledge that may have been lost or compromised as a result of illness, injury, or acquiring a disability. For example, a 30-year-old man who is an active runner trips over a rock and injures his ankle. Due to his injury, this man is unable to walk or run without limping and seeks the help of a physical therapist to be able to walk and run as he did before. The aim of this therapy is considered rehabilitation, helping this man regain a lost skill (see https://napacenter.org/difference-between-habilitation-and-rehabilit ation/). Additionally, The World Health Organization defines rehabilitation as “a set of interventions designed to optimize functioning and reduce disability in individuals with health conditions in interaction with their environment” (see https://www.who. int/).


7 Discussion Segueing to Conclusion So, in conclusion, from this keynote it can be questioned whether researchers should continue to use the term Virtual Reality in their intervention reports and in the literature, or whether the lack of a specific universal definition positions it in the 'black box of rehabilitation' rather than contributing to specification systems/taxonomies. From my experiences I would recommend that when the term Virtual Reality or its counterparts Augmented Reality or Mixed Reality (i.e., contemporised as eXtended Reality, XR) are used, a detailed description of the technology involved is included alongside specifics of use, to assist reliability and understanding across studies. A further question is whether new vocations in future healthcare teams, such as Medialogists (the author's term for an academic produced by the Medialogy education), are credible to propose as supporting digital technology adoption in (re)habilitation. From my many years of experience in the field I believe that (re)habilitation leaders should look at this option whilst ensuring that such vocational Medialogy graduate employees embody a deep understanding and comprehension of the human, not just the technology, especially with regard to perception and cognitive attributes and possible differences associated with functional abilities aligned to impairments, be they acquired or born-with challenges and disabilities. In the long run, it is proposed, such personnel would also train therapists in the use of the technology. Another closing question concerns the feasibility of the emergent model ZOOM as a grounding for formulating AI deep learning algorithms, where it would automatically support the optimization of specification systems and taxonomies while at the same time supporting therapists facing increased workloads. This is speculatively posited in line with the proposal of healthcare teams employing technology- and human-savvy Medialogists who, alongside immediate technology tasks such as system creation and adjustment and training in system aspects, would additionally facilitate the transfer of knowledge from the therapists, facilitators and medical team into the deep learning mechanisms (for example via session video analysis, using post-session reflections while viewing video, thereafter correlated to collected session data, as in my ZOOM model). Finally, to reflect and question, from this keynote and text, how realistically my vision of "Probably the best rehabilitation complex in the world" will be taken up utilising such methods and apparatus as succinctly presented herein and in my six published patents on Communication Method and Apparatus (e.g., US6893407B1, with all six listed in the PhD, see https://vbn.aau.dk/files/55871718/PhD.pdf), I close by pointing the reader to my profile text created for my keynote for the 16th International Conference on Information Technology and Applications (ICITA), https://2022.icita.world/#/speaker2: this reflection is asked in closure as this is my (imagined) swan song project (alongside the beach bar in Bali), whereby both (shared with a smile) are awaiting funders, such that any interested parties are welcome to contact me. Thanks to the International Conference on Information Technology



and Applications (ICITA) for the interest in my body of work over the two conference events where I was honoured to share with conference attendees in Dubai 2021 and Lisbon 2022 as their keynote speaker; special thanks to the general chair of both events, Dr. Abrar Ullah, Assistant Professor, Heriot-Watt University (Edinburgh|Dubai|Malaysia), and his organising teams. I also acknowledge and thank those attending both of my ICITA talks, i.e., my keynote listeners, and further herein the readers of my supplemental text relating to the keynotes. I hope to meet some of you one day in the future to discuss further, as appropriate to your interest in my work, maybe even over a shared coffee in the Bali beach bar. A happy and healthy life to all.

References 1. Brooks AL (2021) The zone of optimised motivation (ZOOM). Digital learning and collaborative practices. Routledge, pp 106–116 2. Brooks AL, Brahnam S, Kapralos B, Jain LC (2017) Recent advances in technologies for inclusive well-being: from worn to off-body sensing, virtual worlds, and games for serious applications. Springer. https://doi.org/10.1007/978-3-319-49879-9 3. Brooks AL, Brahnam S, Kapralos B, Jain LC, Nakajima A, Tyerman J, Jain LC (2021) Recent advances in technologies for inclusive well-being: virtual patients, gamification and simulation. Springer. https://doi.org/10.1007/978-3-030-59608-8 4. Brooks AL, Brahnam S, Jain LC (2014) Technologies of inclusive well-being serious games, alternative realities, and play therapy. Springer. https://doi.org/10.1007/978-3-642-45432-5 5. Brooks AL, Petersson E (2005) Recursive reflection and learning in raw data video analysis of interactive ‘play’ environments for special needs health care. In: IEEE proceedings of 7th international workshop on enterprise networking and computing in healthcare industry, Busan, South Korea, HEALTHCOM 2005. https://doi.org/10.1109/HEALTH.2005.1500399 6. Csikszentmihályi M (1990) Flow: the psychology of optimal experience. Harper & Row 7. DeJong G, Horn SD, Gassaway JA, Slavin MD, Dijkers MP (2004) Toward a taxonomy of rehabilitation interventions: using an inductive approach to examine the “Black Box” of rehabilitation. Arch Phys Med Rehabil85:678–686 8. Fischer KW (2008) Dynamic cycles of cognitive and brain development: measuring growth in mind, brain, and education. In: Battro A, Fischer K, Léna P (eds) The educated brain. Cambridge University Press, pp 127–150 9. Krueger M (1983) Artificial reality. Addison-Wesley. ISBN 0-201-04765-9 10. Krueger M (1991) Artificial reality 2. Addison-Wesley Professional. ISBN 0-201-52260-8 11. Leontyev AN (1978) Activity, consciousness, and personality. Pergamon Press 12. Schön D (1983) The reflective practitioner. Basic Books 13. Snow CP (2001/1963/1959) The two cultures. Cambridge University Press 14. Vygotsky L (1978) Mind and society. Harvard University

Should the Colors Used in the Popular Products and Promotional Products Be Integrated? Takumi Kato

Abstract In the automotive industry, marketers often promote a product in the color they consider most attractive. In many cases, chromatic colors such as red and blue are adopted. However, there is a divergence from actual consumer behavior. In the automobile market, conservative colors tend to sell better: achromatic colors (white, black, gray, and silver) account for approximately 80% of automobile exterior sales by color. In other words, the colors marketers use for promotion and the colors that consumers decide to purchase are different. Meanwhile, consistency and empathy based on realism are emphasized in marketing communication research. This effect increases when the world of the communication is understood to be similar to the real world, or less fictitious. Therefore, the model used for communication should be a person whose situation is similar to that of the consumer, and it can be inferred that the same logic applies to the product shown. This study evaluated the following hypothesis in the Japanese and American automobile markets: “Consumers who own cars with an achromatic exterior body evaluate achromatic colors as more attractive for marketing communications.” As a result of a randomized controlled trial in an online survey environment, the hypothesis was supported. As a practical implication, this study shows that popular and promotional colors should be unified. Traditionally, in business, it has been customary to show products for marketing communication in the colors that practitioners consider most attractive. However, it is more effective to adopt product colors that match the consumers’ usage situation. Keywords Automobile Industry · CMF · Marketing Communication · Product Design · Promotional Color · Sensory Marketing

T. Kato (B) Meiji University, 1-1, Kanda Surugadai, Chiyoda-Ku, Tokyo 101-8301, Japan e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 S. Anwar et al. (eds.), Proceedings of International Conference on Information Technology and Applications, Lecture Notes in Networks and Systems 614, https://doi.org/10.1007/978-981-19-9331-2_59




1 Introduction In the past, durability was a key factor in improving product quality. Today, however, durability is a prerequisite, and an improvement in perceived quality is required for attractiveness [1]. In industrial design, perceived quality consists of CMF, comprising color, material, and finishing [2, 3]. Above all, color research has accumulated abundantly because of its significant influence. Color influences all purchasing behaviors, including memory [4], emotional arousal [5], purchase intent [6], and willingness to pay [7]. This influence may even exceed that of the objective content of the product [8–11]. Therefore, in many industries, color is actively used to improve the perceived quality of products. Three dimensions are commonly used to define color: hue (e.g., red, green, or blue), saturation, and lightness [12, 13]. As methods of improving the perceived quality of color, saturation and lightness have been the main focus [14–17]. From another point of view, the perceived value produced by the amount of light reflected has also been reported [18]. However, while understanding the effects of such objective values is important, it is insufficient, because a consumer’s value judgment is subjective [19, 20]. As an application in business, a color (the promotion color) is used for products in marketing communication materials. When appealing to consumers, marketers use the color for communication in which they think the product looks most attractive. In the automobile industry, chromatic colors such as red and blue are often used. For example, Mazda consistently promotes the red color based on the concept of KODO design (soul of motion design) [21, 22]. However, there is a divergence from actual consumer behavior: conservative colors sell better in the automotive market [23, 24]. In other words, the colors used by marketers for advertisements and those that consumers decide to purchase are different. Surprisingly, only a few academic studies have examined this relationship. Therefore, this study clarifies the effect of marketing communication that adopts promotional colors different from popular colors in the Japanese and American automobile markets. In doing so, it also verifies whether the effect differs between consumers who own achromatic-color cars and those who own chromatic-color cars. This study thus addresses the lack of discussion regarding the critical themes of consumer behavior and color.
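As a brief aside on these three dimensions (this illustration is not part of the original study), a minimal Python sketch using only the standard-library colorsys module shows how a chromatic red and an achromatic gray differ mainly in saturation; the RGB values are arbitrary assumptions chosen for the example:

```python
import colorsys

# Illustrative RGB triples in the 0-1 range: a saturated red and a neutral gray.
chromatic_red = (0.80, 0.10, 0.10)
achromatic_gray = (0.50, 0.50, 0.50)

for name, rgb in [("red", chromatic_red), ("gray", achromatic_gray)]:
    # colorsys returns (hue, lightness, saturation) for the HLS model.
    h, l, s = colorsys.rgb_to_hls(*rgb)
    print(f"{name}: hue={h:.2f}, lightness={l:.2f}, saturation={s:.2f}")
# The gray has saturation 0.0, i.e., it is achromatic by definition.
```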

2 Related Works and Hypotheses Development 2.1 Effects of Color on Consumer Behavior Since the beginning of the twenty-first century, interest in color research in sensory marketing has grown rapidly. Color has a significant meaning, and its influence may exceed the objective content of the product. This effect has been actively discussed,



especially for foods. Regarding the effect of color on plastic cups filled with hot chocolate, orange and dark cream-colored cups were reported to enhance flavor [8]. For coffee mugs, white has been shown to produce a stronger perceived coffee flavor than clear or blue mugs [9]. In wine experiments, consumers’ sense of smell is more strongly influenced by color than by the raw materials [10]. In an orange juice experiment, a comparison of the effects of taste and label color showed that color had the stronger effect on perceived taste [11]. Thus, tableware, packaging, ambient colors, and lighting all affect consumer perceptions and appreciation [25]. Of course, similar effects have been observed in other industries. Among pharmaceuticals, warm colors are perceived as having stronger efficacy than cold colors [26]. The impact of color on brand image is also well known. In a verification using fictitious product packages, cold colors (e.g., blue) and dark colors (e.g., black) gave an elegant and luxurious impression, while bright colors (e.g., white) gave an affordable impression [27]. A verification of brand personality [28] using fictitious logos reported that white gave an impression of honesty, red was stimulating, blue conveyed competence, black appeared refined, and brown appeared sturdy [29]. Thus, color influences the competitiveness of products in various industries and is an asset that serves as a competitive advantage for businesses.

2.2 Difference Between Promotional Color and Popular Color Based on this background knowledge, many companies are strengthening their brands by consistently appealing to specific colors. The most typical example is Tiffany Blue, which emotionally connects customers and brands and contributes to their differentiation from competitors [30]. Apple conveys a cool impression to consumers through gray and long-term, consistent communication [31]. In the automotive industry, Mazda has consistently promoted the red color [21]. However, there is a divergence from actual consumer behavior: in the automobile market, conservative colors tend to sell better. For example, in the color-coded sales of automobile exterior designs, white occupies approximately 40% of the world market, followed by the other achromatic colors (black, gray, and silver), which have a total share of approximately 40%. These four colors occupy approximately 80% of the market [23]. Blue, a typical chromatic color, accounts for only 9% of the cars produced worldwide, while red accounts for only 7% [24]. Hence, the colors used by marketers for advertisements and those that consumers decide to purchase are different. Consistency and empathy based on realism are emphasized in marketing communication research [32]. This effect increases when the world of the communication is understood to be similar to the real world, or less fictitious. Therefore, the model used for communication should be a person whose situation is similar to that of the consumer [33]. This makes understanding and imagining the usage situation easier. If



consumers cannot imagine using the target product when they come into contact with product promotions in marketing channels, such promotions cannot evoke a positive attitude [34]. It can be inferred that what holds for the person appointed for communication also holds for the product shown. Therefore, the following hypotheses were derived: H1: Consumers who own cars with achromatic exterior bodies evaluate achromatic colors as more attractive for marketing communications. H2: Consumers who own cars with chromatic exterior bodies evaluate chromatic colors as more attractive for marketing communications.

3 Methodology A randomized controlled trial (RCT) was conducted in an online research environment. To test these hypotheses, it is necessary to produce stimulus materials of the same brand and car but with different exterior colors. Therefore, the Tesla car lineup was adopted in consideration of the following conditions: (i) its sales volume is small, so that the brand experience of car ownership does not affect the evaluation; (ii) a chromatic color is used for the exterior design as the promotion color; and (iii) there is no difference in brand image between Japan and the United States (US), allowing the effects to be compared. As shown in Fig. 1, the stimulus materials were created for Tesla’s Model S, Model 3, and Model Y; Model X was excluded because its promotional product color was white. The chromatic color on the right side of the figure is the original photo used by the brand for marketing communication. The photo on the left is the same image with the red corrected to gray using Adobe Photoshop. This study used gray and red as the achromatic and chromatic colors, respectively. An online survey was conducted in Japan and the United States between May 29 and 31, 2022. There were two conditions for extracting respondents: (a) being in their 20s to 60s and (b) owning at least one car. The sample size was 300 individuals from each country. The questions in this survey were: (1) gender, (2) age, (3) number of cars owned, (4) body type of the owned car, (5) brand of the owned car, (6) exterior color of the owned car, and (7) attractiveness of the car in the picture presented (5-point Likert scale). Questions (2) and (3) were used to exclude respondents who did not meet conditions (a) and (b). If respondents owned multiple cars, they were instructed to answer questions (4)–(6) regarding the car they mainly used. Before Question (7), one of the photographs in Fig. 1 was randomly presented. The distribution of respondent attributes is presented in Table 1. There were two Tesla owners in Japan and five in the US; thus, the bias introduced by the brand of the currently owned car was small. For verification, the chi-square test was applied to the contingency matrix of presented color × attractiveness. The null hypothesis is that there is no difference in attractiveness between groups. The significance level was set at 5%. The analysis environment was the statistical software R.


Fig. 1 Product designs used for RCT (left: achromatic color; right: chromatic color)




Table 1 Distribution of respondent attributes

| Item | Content | Japan: Number of Respondents | Japan: Ratio (%) | US: Number of Respondents | US: Ratio (%) |
|---|---|---|---|---|---|
| Gender | Male | 175 | 58.3 | 141 | 47.0 |
| | Female | 125 | 41.7 | 155 | 51.7 |
| | Prefer not to say | 0 | 0.0 | 4 | 1.3 |
| Age | 20s | 49 | 16.3 | 57 | 19.0 |
| | 30s | 64 | 21.3 | 72 | 24.0 |
| | 40s | 56 | 18.7 | 64 | 21.3 |
| | 50s | 68 | 22.7 | 51 | 17.0 |
| | 60s | 63 | 21.0 | 56 | 18.7 |
| Number of owned cars | One | 246 | 82.0 | 210 | 70.0 |
| | Two or more | 54 | 18.0 | 90 | 30.0 |
| Body type of the owned car | Micro | 95 | 31.7 | 11 | 3.7 |
| | Compact | 60 | 20.0 | 37 | 12.3 |
| | Minivan | 75 | 25.0 | 21 | 7.0 |
| | SUV | 30 | 10.0 | 104 | 34.7 |
| | Sedan | 40 | 13.3 | 127 | 42.3 |
| Brand of the owned car | Toyota | 113 | 37.7 | 34 | 11.3 |
| | Honda | 45 | 15.0 | 42 | 14.0 |
| | Nissan | 37 | 12.3 | 27 | 9.0 |
| | Suzuki | 41 | 13.7 | 0 | 0.0 |
| | Daihatsu | 20 | 6.7 | 0 | 0.0 |
| | GM | 0 | 0.0 | 35 | 11.7 |
| | Ford | 0 | 0.0 | 41 | 13.7 |
| | Others | 44 | 14.7 | 121 | 40.3 |

Note SUV means sport utility vehicle

4 Results and Discussions 4.1 Results Table 2 shows that white and black are the most popular exterior colors of currently owned cars in Japan, and that black is the most popular in the US. The total percentage of achromatic colors is about the same in both countries, at approximately 65%. As shown in Table 3, among consumers who owned achromatic-color cars, the group presented with gray responded with a higher score than the group presented with red. As a result of the chi-square test, a significant difference was detected (p = 0.007). Cramer’s V (small: 0.1–0.29, medium: 0.3–0.49, large: ≥0.5) was 0.189,



Table 2 Exterior color of the owned car

| Item | Content | Japan: Number | Japan: Ratio (%) | Japan: Subtotal (%) | US: Number | US: Ratio (%) | US: Subtotal (%) |
|---|---|---|---|---|---|---|---|
| Achromatic Color | White | 72 | 24.0 | 65.7 | 49 | 16.3 | 64.7 |
| | Black | 72 | 24.0 | | 72 | 24.0 | |
| | Gray | 20 | 6.7 | | 34 | 11.3 | |
| | Silver | 33 | 11.0 | | 39 | 13.0 | |
| Chromatic Color | Blue | 29 | 9.7 | 34.3 | 41 | 13.7 | 35.3 |
| | Red | 24 | 8.0 | | 31 | 10.3 | |
| | Brown | 6 | 2.0 | | 11 | 3.7 | |
| | Green | 6 | 2.0 | | 11 | 3.7 | |
| | Others | 38 | 12.7 | | 12 | 4.0 | |
| Total | | | 100.0 | | | 100.0 | |

Table 3 Chi-square test result in total of two countries

| Color of owned car | Color of testing car | 1 | 2 | 3 | 4 | 5 | Total | Mean | p-value | Cramer’s V |
|---|---|---|---|---|---|---|---|---|---|---|
| Achromatic Color | Red | 14 | 41 | 54 | 47 | 41 | 197 | 3.305 | 0.007** | 0.189 |
| | Gray | 16 | 15 | 58 | 59 | 46 | 194 | 3.536 | | |
| Chromatic Color | Red | 13 | 15 | 33 | 32 | 19 | 112 | 3.259 | 0.985 | 0.042 |
| | Gray | 11 | 14 | 31 | 27 | 14 | 97 | 3.196 | | |

Note *** p < 0.001; ** p < 0.01; * p < 0.05 (the columns 1–5 show the counts of attractiveness ratings)

so the effect size was small. For consumers who owned chromatic-color cars, the group presented with red rated the car as slightly more attractive, but the difference from gray was small, and the chi-square test (p = 0.985) detected no difference. Accordingly, H1 was supported and H2 was not supported. Finally, differences in results by country were examined. In Japan, similar to the above results, a significant difference was detected only for consumers who owned achromatic-color cars. In the US, on the other hand, no significant difference was detected in either group.
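For readers who want to retrace the arithmetic, the following minimal sketch reproduces the reported chi-square statistic and Cramér’s V from the achromatic-owner counts in Table 3. The original analysis was run in R; Python with numpy/scipy is used here purely as an illustrative substitute:

```python
import numpy as np
from scipy.stats import chi2_contingency

# Attractiveness counts (scores 1-5) for achromatic-color owners, taken from Table 3.
observed = np.array([
    [14, 41, 54, 47, 41],   # group shown the red (chromatic) stimulus
    [16, 15, 58, 59, 46],   # group shown the gray (achromatic) stimulus
])

chi2, p, dof, expected = chi2_contingency(observed)
n = observed.sum()
cramers_v = np.sqrt(chi2 / (n * (min(observed.shape) - 1)))
print(f"chi2={chi2:.2f}, p={p:.3f}, Cramer's V={cramers_v:.3f}")
# Output is approximately p = 0.007 and V = 0.189, matching the values reported above.
```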

4.2 Implications The color factors influencing consumer behavior have long been an important issue in marketing research. In the existing research, it is common to deal with hue, saturation, and lightness [12, 13]. There are reports on the effects of these indicators



in various industries, such as personal computers [14], printed advertisements [15], lighting [16], and architecture [17]. However, objectively measurable factors alone cannot capture all the factors that affect consumers; psychological factors need to be examined to understand consumers, who make subjective decisions. Since there has been a lack of discussion focusing on popular and promotional colors, this study addresses that gap. As a practical implication, this study shows that popular and promotional colors should be unified. Traditionally, in business, it has been customary to show products for communication in the colors that practitioners consider most attractive. However, it is more effective in marketing communications to adopt product colors that match the consumers’ usage situation.

4.3 Limitations and Future Work This study had three limitations. First, only two colors were used in the verification: red and gray. Hence, it should be noted that the results do not cover achromatic and chromatic colors in general. Second, the generalizability of the conclusions is limited because the target product is limited to Tesla. Third, the difference in the effect between Japan and the US was mentioned above, but differences in survey response tendencies may play a role here. Japanese respondents tend to give intermediate responses and avoid extreme answers, and the more positive the item, the stronger this suppressing tendency. As shown in Fig. 2, the Japanese responses in this study are concentrated in the middle three points of the scale, a tendency that differs significantly from that of the US. Despite evaluating the same stimuli, the average attractiveness was 2.973 in Japan and 3.733 in the US. These are topics for future research.

Fig. 2 Distribution of attractiveness responses by country



Funding This work was supported by JSPS KAKENHI Grant Number 21K13381.

References 1. Stylidis K, Wickman C, Söderberg R (2020) Perceived quality of products: a framework and attributes ranking method. J Eng Des 31(1):37–67. https://doi.org/10.1080/09544828.2019.166 9769 2. Piselli A, Baxter W, Simonato M, Del Curto B, Aurisicchio M (2018) Development and evaluation of a methodology to integrate technical and sensorial properties in materials selection. Mater Des 153:259–272. https://doi.org/10.1016/j.matdes.2018.04.081 3. Becerra L (2016) CMF design: the fundamental principles of colour, material and finish design. Frame Publishers, Amsterdam 4. Wichmann FA, Sharpe LT, Gegenfurtner KR (2002) The contributions of color to recognition memory for natural scenes. J Exp Psychol Learn Mem Cogn 28(3):509–520. https://doi.org/ 10.1037/0278-7393.28.3.509 5. Bagchi R, Cheema A (2013) The effect of red background color on willingness-to-pay: the moderating role of selling mechanism. J Consum Res 39(5):947–960. https://doi.org/10.1086/ 666466 6. Huang L, Lu J (2016) The impact of package color and the nutrition content labels on the perception of food healthiness and purchase intention. J Food Prod Marketing 22(2):191–218. https://doi.org/10.1080/10454446.2014.1000434 7. Marozzo V, Raimondo MA, Miceli GN, Scopelliti I (2020) Effects of au naturel packaging colors on willingness to pay for healthy food. Psychol Mark 37(7):913–927. https://doi.org/10. 1002/mar.21294 8. Piqueras-Fiszman B, Spence C (2012) The influence of the color of the cup on consumers’ perception of a hot beverage. J Sens Stud 27(5):324–331. https://doi.org/10.1111/j.1745-459X. 2012.00397.x 9. Van Doorn GH, Wuillemin D, Spence C (2014) Does the colour of the mug influence the taste of the coffee? Flavour 3(1):1–7. https://doi.org/10.1186/2044-7248-3-10 10. Morrot G, Brochet F, Dubourdieu D (2001) The color of odors. Brain Lang 79(2):309–320. https://doi.org/10.1006/brln.2001.2493 11. Hoegg J, Alba JW (2007) Taste perception: more than meets the tongue. J Consum Res 33(4):490–498. https://doi.org/10.1086/510222 12. Levkowitz H, Herman GT (1993) GLHS: a generalized lightness, hue, and saturation color model. CVGIP: Graph Models Image Process 55(4):271–285. https://doi.org/10.1006/cgip. 1993.1019 13. Stuart GW, Barsdell WN, Day RH (2014) The role of lightness, hue and saturation in featurebased visual attention. Vision Res 96:25–32. https://doi.org/10.1016/j.visres.2013.12.013 14. Camgöz N, Yener C, Güvenç D (2002) Effects of hue, saturation, and brightness on preference. Color Res Appl 27(3):199–207. https://doi.org/10.1002/col.10051 15. Lichtlé MC (2007) The effect of an advertisement’s colour on emotions evoked by attitude towards the ad: the moderating role of the optimal stimulation level. Int J Advert 26(1):37–62. https://doi.org/10.1080/02650487.2007.11072995 16. Barli Ö, Bilgili B, Dane S¸ (2006) Association of consumers’ sex and eyedness and lighting and wall color of a store with price attraction and perceived quality of goods and inside visual appeal. Percept Mot Skills 103(2):447–450. https://doi.org/10.2466/pms.103.2.447-450 17. Cubukcu E, Kahraman I (2008) Hue, saturation, lightness, and building exterior preference: an empirical study in Turkey comparing architects’ and nonarchitects’ evaluative and cognitive judgments. Color Res Appl 33(5):395–405. https://doi.org/10.1002/col.20436



18. Kato T (2022) Perceived quality created by the light reflection on a car’s exterior design. In: Proceedings of the 5th international conference on computers in management and business, pp 156–160. https://doi.org/10.1145/3512676.3512702 19. Ham J, Lee K, Kim T, Koo C (2019) Subjective perception patterns of online reviews: a comparison of utilitarian and hedonic values. Inf Process Manage 56(4):1439–1456. https:// doi.org/10.1016/j.ipm.2019.03.011 20. Xin L, Seo SS (2019) The role of consumer ethnocentrism, country image, and subjective knowledge in predicting intention to purchase imported functional foods. Br Food J 122(2):448– 464. https://doi.org/10.1108/BFJ-05-2019-0326 21. Mazda (2016) Mazda’s new Soul Red Crystal paint. Mazda, November 16. http://www.inside mazda.co.uk/2016/11/16/mazdas-new-soul-red-crystal-paint/ 22. Kato T (2021) Does the impression of the manufacturer brand color increase repurchase intention? In: Proceedings of the 2021 8th international conference on behavioral and social computing, pp 1–4. https://doi.org/10.1109/BESC53957.2021.9635454 23. Ferris R (2020) Most cars are painted one of these four colors—here’s why. CNBC, September 22. https://www.cnbc.com/2020/09/22/most-cars-are-painted-one-of-these-fourcolorsheres-why.html 24. Zumkley J (2019) White still dominates BASF’s analysis of the 2019 automotive color distribution. BASF, January 15. https://www.basf.com/global/en/media/news-releases/2020/01/p-20112.html 25. Baptista I, Valentin D, Saldaña E, Behrens J (2021) Effects of packaging color on expected flavor, texture, and liking of chocolate in Brazil and France. Int J Gastron Food Sci 24:100340. https://doi.org/10.1016/j.ijgfs.2021.100340 26. Roullet B, Droulers O (2005) Pharmaceutical packaging color and drug expectancy. ACR North Am Adv 32:164–171 27. Ampuero O, Vila N (2006) Consumer perceptions of product packaging. J Consum Mark 23(2):100–112. https://doi.org/10.1108/07363760610655032 28. Aaker JL (1997) Dimensions of brand personality. J Mark Res 34(3):347–356. https://doi.org/ 10.1177/002224379703400304 29. Labrecque LI, Milne GR (2012) Exciting red and competent blue: the importance of color in marketing. J Acad Mark Sci 40(5):711–727. https://doi.org/10.1007/s11747-010-0245-y 30. Biswas D (2016) Sensory aspects of branding. In: The Routledge companion to contemporary brand management. Routledge, London, pp 250–259 31. Baxter SM, Ilicic J, Kulczynski A (2018) Roses are red, violets are blue, sophisticated brands have a Tiffany Hue: the effect of iconic brand color priming on brand personality judgments. J Brand Manag 25(4):384–394. https://doi.org/10.1057/s41262-017-0086-9 32. Argo JJ, Zhu R, Dahl DW (2008) Fact or fiction: an investigation of empathy differences in response to emotional melodramatic entertainment. J Consum Res 34(5):614–623. https://doi. org/10.1086/521907 33. Komeda H, Tsunemi K, Inohara K, Kusumi T, Rapp DN (2013) Beyond disposition: the processing consequences of explicit and implicit invocations of empathy. Acta Physiol (Oxf) 142(3):349–355. https://doi.org/10.1016/j.actpsy.2013.01.002 34. Nielsen JH, Escalas JE (2010) Easier is not always better: the moderating role of processing type on preference fluency. J Consum Psychol 20(3):295–305. https://doi.org/10.1016/j.jcps. 2010.06.016

Impact of Teacher Training on Student Achievement Miguel Sangurima

Abstract Using a differences-in-differences strategy, I evaluate the impact of teacher training driven by the application of the Ser Maestro 2016 test on student performance in the Ser Bachiller test, which is the compulsory education completion test in Ecuador. I find that the time invested in education by teachers to take the test does not contribute to the performance of students in the areas of language and social studies and causes a 1% reduction in math scores. Keywords Human capital · Ecuador · Latin America

1 Introduction The policy to improve education in Ecuador, initiated in 2008, led to an increase in evaluations of students and teachers in the educational system, in order to monitor the impact of public policies implemented by the state. For this, the Ecuadorian constitution of 2008 created the National Institute for Educational Evaluation (INEVAL) who would be in charge of this activity. This organism creates several evaluations among them; two important tests that were applied to the educational system were: the high school graduation test called Ser Bachiller and the teacher evaluation test called Ser Maestro 2016. These two evaluations show information on the educational system that was not available before the creation of the Ineval. Thus, using these resources, I try to determine what is the impact of the preparation made by teachers to take the Ser Maestro 2016 test on the results of the Ser Bachiller test taken by students in the last year of compulsory education? A first idea about the impact of teacher training would be that student results would improve because teachers studied to take standardized tests. This is reasonable because the teacher training lasted approximately seven months and part of this training contained items similar to those of the Ser Bachiller test. Evidence of this M. Sangurima (B) Universidad Católica Andrés Bello, Caracas, Venezuela e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 S. Anwar et al. (eds.), Proceedings of International Conference on Information Technology and Applications, Lecture Notes in Networks and Systems 614, https://doi.org/10.1007/978-981-19-9331-2_60




idea can be found in [1], who find that training is one contributing factor to student performance. Research in the United States shows that teachers vary in their ability to contribute to their students’ performance, depending on experience, motivation, training, and [1] titles; these not being the only factors that determine performance since, other unobservable attributes of the students and the classroom could contribute to some extent. Few studies of teaching quality and the impact on student performance have been done in Ecuador. However, work is being done to improve educational quality by increasing teachers’ salaries, improving infrastructure, creating training courses short-term and motivate teachers to pursue postgraduate studies. Two important evaluations that reveal the state of Ecuadorian education are the Ser Bachiller test and the Ser Maestro 2016 test. The Ser Bachiller tests are the most representative evaluations of the progress in education in Ecuador; these are applied to all students in the last year of high school, providing information to all districts and high school educational institutions, public and private, about the state of education in your area and institution. This consists of multiple-choice questions in six areas: mathematics, language and literature, social studies, natural sciences, abstract reasoning, and scientific knowledge, as well as including a education survey about factors associated. The Ser Maestro 2016 test is the first evaluation carried out on all teachers in the Ecuadorian educational system who work in the public sector and who have permanent appointments and provisional appointments after the approval of the Ecuadorian constitution of 2008. According to INEVAL, this test measures specific knowledge of the area in which the teacher works, pedagogical knowledge, planning skills, research skills, collaboration, and emotional intelligence.1 The interpretations made of the results of these tests by educational institutions are used as working instruments for the improvement of teachers and students. I am not aware of relevant studies that study the impact of teacher training on student achievements in Ecuador. With this work I try to shed some light on the effectiveness of state policies in this matter, by evaluating the impact that the results of the Ser Maestro 2016 test have on the results of the Ser Bachiller test, considering that teachers invested time to study and thus carry out the activities that required the test. The work provides information on the results that can come from a compulsory training program that is not taken by the teachers’ own motivation and that is taken to fulfill a requirement. The results may have implications for other public policies in Ecuadorian education, such as the requirement to pass a master’s program to obtain salary improvements from which a question arises that is not resolved in this work. How much do the postgraduate programs that Ecuadorian teachers carry out to achieve a salary improvement contribute to the students? A reference on this topic can be found in [2]. The study is guided by the questions: Does the time invested in education to take the Ser Maestro 2016 test contribute positively to student performance? And 1

Detailed information on the Ser Maestro 2016 test such as: knowledge evaluated, schedule, statistics, test design, can be found at: http://evaluaciones.evaluacion.gob.ec/BI/ser-maestro/.



What is the impact of the results of the Ser Maestro tests on the results of the Ser Bachiller tests? I answer both questions using cross-sectional panel data obtained from the National Institute for Educational Evaluation (INEVAL). To do this, I use a differences-in-differences model to generate indicators. I build the databases by joining the students’ data at an individual level with the individual data of the teaching staff of the last three years of compulsory education, making three bases, one for each subject. Thus, if there are one or more high school mathematics teachers in the school who have taken the Ser Maestro test, they will be linked to the student or students of the school where they work; these students will be the treatment group. Those students who do not have teachers in their center who have taken the exam will be the control group. This is done for each subject. I perform a difference-in-differences analysis taking student grades as the dependent variables, using fixed effects per year. My estimates show (i) that the time invested in education to take the Ser Maestro 2016 test did not contribute positively to student performance; and (ii) that the impact on the test was negative in mathematics and null in the areas of language and social sciences. These results are important because public policy decisions can be made regarding teacher training and the improvement of the quality of education through them. The paper is structured as follows. Section 2 describes the literature review. Section 3 describes the data. Section 4 discusses my empirical specification. Section 5 presents the main results, and Section 6 concludes.

2 Literature Review There is evidence that the effectiveness of a teacher is related to the levels of formal education, experience, and social skills that they develop throughout their teaching life [3–5], there is also evidence of; The more effective the teacher is, the better the academic results of their students [6, 7]. Under these premises, educational institutions and governments invest in the training and development of their teachers to raise the level of effectiveness of education. However, the process of teacher training does not have the same effect on each of them, because of their different personal characteristics and the environment in which they work and live [8–11]. Economically, you could study teacher training in three directions: accumulation of human capital, signaling or null effects [12–14]. If education achieves a positive causal effect, this would be reflected in an improvement in the skills of teachers and consequently in the performance of students; otherwise, teachers would probably have created a signal by achieving more education that would not significantly improve their performance abilities nor those of their students. However, this behavior could only be studied as signaling under the condition that the studies were carried out voluntarily, but this is not the case. Thus, the absence of effects can be evaluated due to an ineffective public policy.



In general, research on the return on investment in education presents mixed results [15–18] due to the difficulty of identifying the endogenous factors that characterize students and educational programs. One factor influencing investing in education is the salary [19]. The better the market wages for professionals, the longer the students will stay in school and even return to it. However, teachers face a labor market often controlled by the state, since education is a good factor that contributes significantly to their development. Thus, depending on the teacher’s salary structure and skill level, the state establishes motivational strategies to get teachers to increase their effort [20, 21] and for professionals with other backgrounds to enter the teaching profession. Commonly, the strategies linked to teacher training are usually related to maintaining job stability and the salary improvements [20–22] as a reward for effort; however, a greater investment in education is not a guarantee that the teacher manages to increase the human capital or that this is reflected in the performance of the students, since, as has been seen, the teacher can create signals to confuse the market. Thus, the return on investment may not be significant when the teacher fails to improve their human capital. This aspect is relevant since society invests in enhancing education hoping that its investment will achieve better results. Research related to this topic has yielded mixed results [4], which are in line with the results of educational returns in other sectors of the labor market.

3 Data The study combines three databases obtained from the National Institute of Educational Evaluation INEVAL; these are: the results of the Ser Bachiller Exam, the factors associated with the education of students who register for the Ser Bachiller exam in the periods 2014–2015, 2015–2016, 2016–2017 and the Ser Maestro 2016 exam results. With these I build three cross-sectional data panels. The Ser Bachiller database contains information about the test grades. The associated factors database has personal information about the students. The Ser Maestro database has the results of the grades plus personal information about the teachers who took the test. The Ser Bachiller test is a mandatory evaluation instrument for students who finish secondary education and measures skills in mathematics, language, natural sciences, social studies, abstract reasoning and scientific knowledge, reporting scores of 10 points for each of the areas of expertise, and with these results an average general grade called Ineval grade is obtained. To access this test, students must pass the last year of secondary education. In Ecuador compulsory education is 13 years old; a student enters at six years old and usually finishes school at 18 or 19 years. Basic education is established in 10 years of study and the last three years are called high school because students specialize in some knowledge. The Ser Maestro 2016 test was a mandatory evaluation for all teachers in the Ecuadorian fiscal educational system that was carried out in three phases between



April and October 2016. The first phase evaluates disciplinary knowledge that constitutes specific knowledge of the area in which the teacher worked, the second phase is self-assessment; in which teachers evaluate themselves through hypothetical cases related to teaching work and the third co-evaluation phase, where the teacher creates a teaching portfolio and in most cases rates two or three portfolios of teachers unknown to him, that is, double blind. The database contains the global qualifications, out of 1000 points, of the teachers identifying them by schools and performance areas, it includes personal information related to the teaching performance. In Ecuador, the educational system divides schools into four classes based on the type of financing: fiscal, municipal, fiscomisional and private. The government finances the first two. Still, they differ in that the first is controlled by national government organizations and the second by a municipal government organization; a fiscomisional school has mixed financing, that is, private and state. And a private school has private financing. There are at least four forms of hiring in fiscal colleges: permanent appointment, occasional assignments, contracts for professional services, temporary contracts. To build the first data panel, I joined the Ser Bachiller bases with the associated factors, identifying them by students. Then in the Ser Maestro 2016 base I identify the teachers of the three years of high school in the area of mathematics and exclude the rest of the teachers. This new base is one with the Ser Bachiller base previously created, identifying them by the school; in this way each student is related to a math teacher from the high school years of their school, this is from the last three years of Ecuadorian compulsory education. I repeat the same process for the language and social studies areas. I do not carry out a treatment for natural sciences because the identification of the professors on this area is more difficult since in it the area of physics, chemistry and biology are evaluated, that is, unlike the other subjects, more professors influence this note, I don’t do it for abstract reasoning either since this is not a subject in school. On the three bases, educational institutions are classified by the type of financing into four classes: fiscal, fiscomisional, municipal and private. I eliminate municipal, private and fiscomisional institutions. The investigation was carried out with the fiscal financing institutions. On the bases there are public schools that have teachers who took the Ser Maestro test; these constitute the treatment group and schools that do not have teachers who took this test, are identified as a control group. There are also people who want to enter a higher education institution and who no longer study in the treated or control schools but who were assigned to certain schools to take the exam, these students do not cause an endogeneity problem because the samples are homogeneous. With the data, we seek to study the effects of the preparation of teachers to take the Ser Maestro tests on the performance of students in the test. Ser Bachiller. The data have the advantage of being population-based for students completing their third year of high school and contain the results of the Ser Maestro 2016 evaluation of the majority of teachers from educational institutions in the country. 
Another important characteristic of the data is the information on the factors associated with the education of students and teachers, these are used to create a model that considers students’ and teachers’ endogenous characteristics.
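As a rough illustration of this panel construction, the sketch below links each student to the treated schools in the way described above; it is not the author’s actual code, and the file and column names are hypothetical placeholders rather than the real INEVAL field names:

```python
import pandas as pd

# Hypothetical stand-ins for the INEVAL extracts (column names are invented).
students = pd.DataFrame({
    "student_id": [1, 2, 3, 4],
    "school_id":  ["A", "A", "B", "C"],
    "math_score": [7.2, 6.5, 8.0, 5.9],
})
teachers = pd.DataFrame({
    "teacher_id": [10, 11, 12],
    "school_id":  ["A", "C", "C"],
    "subject":    ["mathematics", "language", "mathematics"],
    "level":      ["high_school", "high_school", "primary"],
})

# Schools with at least one high-school mathematics teacher who took Ser Maestro.
math_teachers = teachers[(teachers["subject"] == "mathematics")
                         & (teachers["level"] == "high_school")]
treated_schools = set(math_teachers["school_id"])

panel = students.copy()
panel["treated"] = panel["school_id"].isin(treated_schools).astype(int)
print(panel)  # students in school A are treated; schools B and C form the control group
```

Repeating the same linkage for language and social studies would yield the three subject-specific panels described in the text.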



4 Empirical Strategy In the year 2016, teachers with definitive appointments and provisional appointments in the fiscal and fiscomisional educational institutions of Ecuador entered a mandatory evaluation process that would allow them to update and improve their knowledge; this evaluation was not taken by teachers on temporary contracts and other contract forms. Thus, 140,917 teachers of the 214,917 that existed in the Ecuadorian educational system in 2016 presented themselves. I take advantage of this event to carry out a quasi-experimental investigation, knowing that the evaluation process required prior preparation by the teachers to achieve a satisfactory grade; failure to achieve a minimum grade implied the probability of losing the position or entering a training process to reinforce knowledge for the exercise of the profession. This test was carried out between April and October 2016; the test results were classified into achievement levels, and the results of the teachers were: 6.8% of them achieved a grade of “in training”, 70.5% “fundamental”, 22.4% “favorable” and 0.3% “excellent”. The minimum level accepted was “in training”, so all teachers passed the test. For the main investigation, I take the data of the 131,116 fiscal teachers who took the Ser Maestro test, out of a total of 148,391; of these I keep the data of the teachers of the last three years of secondary education in the areas of mathematics, social studies, and language. I create three data panels with which I carry out a differences-in-differences treatment, having students from schools whose teachers took the Ser Maestro test as the treatment group and students whose fiscal teachers did not take the test as the control group. I make robust estimates and use year fixed effects for the coefficient estimates (Table 1). Descriptive statistics show that the groups are quite homogeneous in the main indices that determine student achievement and that the grades and socioeconomic indices of the treated group are higher than those of the control group; that is, higher incomes achieve better results in education, which is in line with the literature. The mathematical model of the main estimation of differences in differences is:

y_it = δ × StudentsTreatment_it + φ_t + ε_it    (1)

where the parameter of interest is δ, which captures the effect of the time invested by teachers in training, in developing the required activities, and in answering the Ser Maestro 2016 test questionnaires; φ_t are year fixed effects and ε_it is the error term. The dependent variable is indexed by individual i and time t. To apply the differences-in-differences method, it is necessary to have treatment and control groups that maintain parallel behavior over time. I verify the parallelism of the treated and control data using the averages of the two groups per year, and I show in Figs. 1, 2 and 3 the validity of the method. In Fig. 1 I show the average scores of the Ser Bachiller exam in mathematics for the treated and control groups in the years 2015, 2016 and 2017, corresponding to

https://educacion.gob.ec/datos-abiertos/.



Table 1 Summary statistics

| | (1) Treated: Mean | sd | (2) Non-treated: Mean | sd | (3) Difference: b |
|---|---|---|---|---|---|
| Grade obtained in mathematics | 7.1 | 1.2 | 6.8 | 1.1 | 0.3*** |
| Age | 19.9 | 3.9 | 19.6 | 3.4 | 0.4*** |
| Socioeconomic index | 2.9 | 1.4 | 2.2 | 1.3 | 0.7*** |
| Observations | 641,655 | | 58,487 | | 700,142 |
| Grade obtained in literature | 7.7 | 1.1 | 7.4 | 1.1 | 0.3*** |
| Age | 19.0 | 2.7 | 21.8 | 5.0 | −2.8*** |
| Socioeconomic index | 2.9 | 1.4 | 2.6 | 1.4 | 0.3*** |
| Observations | 473,733 | | 226,412 | | 700,145 |
| Grade obtained in social studies | 7.9 | 1.3 | 7.6 | 1.3 | 0.3*** |
| Age | 19.0 | 2.7 | 21.6 | 4.9 | −2.6*** |
| Socioeconomic index | 2.9 | 1.4 | 2.5 | 1.4 | 0.4*** |
| Observations | 459,053 | | 241,091 | | 700,144 |

Note The table shows the statistical summary of the three data panels by subject

Fig. 1 Parallelism in mathematics

the school periods 2014–2015, 2015–2016 and 2016–2017, respectively, with 2016 taken as the year of treatment. The results for the year 2017 increase compared to previous years, and it can be seen that the increase in the control group is greater than that of the treated group.



Fig. 2 Parallelism in language

Fig. 3 Parallelism in social studies

5 Main Results Table 2 presents the results of the three differences-in-differences model treatments: ungrouped, clustered by canton, and clustered by parish. The impact on math grades is negative and significant in all three treatments. In the subjects of language and social sciences, the results are significant only in the first model. The effect of teacher preparation for the Ser Maestro test on student results in the Ser Bachiller test is −1% in mathematics, −0.2% in language and −0.3% in social sciences. The negative and significant effects in mathematics in the three models suggest that the time invested by teachers in studying for and taking the Ser Maestro exam reduced the effectiveness of their work with students, causing a modest reduction in the results of standardized tests. This assumption makes sense, given that the time invested in learning mathematics is, in general, greater than that of other subjects. Therefore, class preparation and the teacher’s dedication can be affected by adding activities to their daily work.

Table 2 Impact of the Ser Maestro test

| | (1) imat | (2) ilyl | (3) ies |
|---|---|---|---|
| Main δ | −0.109*** (−11.03) | −0.0255** (−2.75) | −0.0322*** (−3.33) |
| Cluster canton δ | −0.109** (−2.62) | −0.0255 (−0.94) | −0.0322 (−0.74) |
| Cluster parish δ | −0.109** (−2.79) | −0.0255 (−0.76) | −0.0322 (−0.75) |
| N | 503,696 | 503,696 | 503,893 |

t statistics in parentheses; * p < 0.05, ** p < 0.01, *** p < 0.001
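A minimal sketch of how estimates of this kind could be produced is shown below, using Python’s statsmodels on synthetic data purely as an illustration; the variable names (imat, treated, year, canton) are placeholders for the study’s actual fields, and this is not the author’s own estimation code:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic stand-in for the student panel: math score, treatment flag,
# school year, and a canton identifier used for clustered standard errors.
rng = np.random.default_rng(0)
n = 5_000
df = pd.DataFrame({
    "imat": rng.normal(7.0, 1.2, n),
    "treated": rng.integers(0, 2, n),
    "year": rng.choice([2015, 2016, 2017], n),
    "canton": rng.integers(0, 50, n),
})

# Grade regressed on the treatment indicator with year fixed effects, as in Eq. (1).
model = smf.ols("imat ~ treated + C(year)", data=df)
robust = model.fit(cov_type="HC1")                                    # ungrouped, robust SEs
clustered = model.fit(cov_type="cluster",
                      cov_kwds={"groups": df["canton"]})              # SEs clustered by canton
print(robust.params["treated"], clustered.bse["treated"])
```

The parish-clustered variant simply swaps the grouping variable passed to cov_kwds.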

The results in language and social studies are negative and significant in the first model but not significant in the two clustered models, which suggests no effect on the scores in these two subjects. Given that the coefficients are −0.2% and −0.3%, respectively, it is likely that there were no effects on student results caused by the preparation of teachers to take the Ser Maestro test. Given that the preparation process for the Ser Maestro 2016 test required time dedicated to studying the field in which each teacher works and the pedagogical processes necessary to exercise the profession, it could be expected, following the literature, that this would be reflected in an improvement in the students’ results. However, the research results do not show evidence of this improvement, which would imply that the teachers studied and took the test to emit a signal that they meet the requirements to be teachers, not as an opportunity to accumulate human capital. However, the results cannot be interpreted as signaling, since the test was mandatory. Apparently, the test was taken by teachers as an additional compulsory activity that would not contribute significantly to their performance, even though the results could affect their job stability. This leads me to wonder whether the policy of improving education through economic incentives for teachers who obtain postgraduate academic degrees is effective, or whether it is generating a signaling process in Ecuador, since obtaining a postgraduate degree is voluntary.

6 Conclusions This paper studies the effects of the Ser Maestro test on the results of the Ser Bachiller test. I find modest but significant negative effects on math scores: students in the treated group reduce their scores by 1% compared to the control group. I find effects of −0.2% in language and −0.3% in social sciences, with these last two indicators suggesting that the effect on language and social sciences was null. Probably the teachers who studied to pass the test did not accumulate human capital because the test was mandatory; this implies that a policy that forces teachers to study to keep their jobs probably does not contribute to student



achievement. It also leaves open the question of the effectiveness of a teacher who studies motivated by the state policy of higher training in exchange for a higher salary.

References 1. Bruns B, Luque J (2014) Great teachers: how to raise student learning in Latin America and the Caribbean. World Bank Publications 2. Doran K, Gelber A, Isen A (2014) The effects of high-skilled immigration policy on firms: evidence from h-1b visa lotteries. Technical report, National Bureau of Economic Research 3. Anfara Jr VA, Schmid JB (2007) Defining the effectiveness of middle grades teachers. Middle Sch J 38(5):54–62 4. Harris DN, Ingle WK, Rutledge SA (2014) How teacher evaluation methods matter for accountability: a comparative analysis of teacher effectiveness ratings by principals and teacher value-added measures. Am Educ Res J 51(1):73–112 5. Wayne AJ, Youngs P (2003) Teacher characteristics and student achievement gains: a review. Rev Educ Res 73(1):89–122 6. Aaronson D, Barrow L, Sander W (2007) Teachers and student achievement in the Chicago public high schools. J Law Econ 25(1):95–135 7. Kane TJ, Rockoff JE, Staiger DO (2008) What does certification tell us about teacher effectiveness? Evidence from New York city. Econ Educ Rev 27(6):615–631 8. Ashton P (1984) Teacher efficacy: a motivational paradigm for effective teacher education. J Teach Educ 35(5):28–32 9. Newton XA, Darling-Hammond L, Haertel E, Thomas E (2010) Value-added modeling of teacher effectiveness: an exploration of stability across models and contexts. Educ Policy Anal Archiv 18(23):n23 10. Opdenakker M-C, Van Damme J (2006) Teacher characteristics and teaching styles as effectiveness enhancing factors of classroom practice. Teach Teach Educ 22(1):1–21 11. Clotfelter CT, Ladd HF, Vigdor JL (2006) Teacher-student matching and the assessment of teacher effectiveness. J Hum Res 41(4):778–820 12. Schultz TW (1963) The economic value of education. Columbia University Press 13. Rees A (1965) Human capital: a theoretical and empirical analysis with special reference to education 14. Spence M (1978) Job market signaling. In: Uncertainty in economics. Elsevier, pp 281–306 15. Goldhaber D, Anthony E (2007) Can teacher quality be effectively assessed? National board certification as a signal of effective teaching. Rev Econ Stat 89(1):134–150 16. Kroch EA, Sjoblom K (1994) Schooling as human capital or a signal: some evidence. J Hum Res 156–180 17. Bedard K (2001) Human capital versus signaling models: university access and high school dropouts. J Polit Econ 109(4):749–775 18. Hussey A (2012) Human capital augmentation versus the signaling value of mba education. Econ Educ Rev 31(4):442–451 19. McMullen S (2011) How do students respond to labor market and education incentives? An analysis of homework time. J Lab Res 32(3):199–209 20. Vegas E (2007) Teacher labor markets in developing countries. Future Child 219–232 21. Loeb S, Myung J (2020) Economic approaches to teacher recruitment and retention. In: The economics of education. Elsevier, pp 403–414 22. Ladd HF (2007) Teacher labor markets in developed countries. Future Children 201–217

Educational Data Mining: A Predictive Model to Reduce Student Dropout Carlos Redroban, Jorge Saavedra, Marcelo Leon, Sergio Nuñez, and Fabricio Echeverria

Abstract Data mining (DM) is one of many tools that exploit a large amount of information for a specific purpose; the shift from physical to digitized information has made that information much more malleable for whatever it is needed for. Applied within the educational area, data mining allows information to be collected and classified to support an institution’s decision making, through the use of classification algorithms and modeling techniques, with the aim of improving students’ academic achievement. This article identifies patterns that influence students dropping out of their studies in order to create indicators that help improve student performance, based on the study and analysis of academic data. Various variables that influence school performance are identified, because this is an important issue for educational institutions, and it is crucial to determine the reasons that prompt students to drop out in order to generate indicators that prevent the problem, supporting the study with the analysis of cases where data mining techniques produced favorable results. In this way, the work helps enhance student learning while contributing to teaching activities so that students achieve a better average. It will also serve as a guide for similar future models that make predictions about the number of subjects that students pass and fail. Keywords Educational Data Mining · Academic Performance · Academic Dropout · Data Mining Tools

C. Redroban · M. Leon (B) · F. Echeverria Universidad ECOTEC, Samborondón, Ecuador e-mail: [email protected]; [email protected] J. Saavedra Universidad Estatal Peninsula de Santa Elena, La Libertad, Ecuador S. Nuñez Universidad del Pacifico, Guayaquil, Ecuador © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 S. Anwar et al. (eds.), Proceedings of International Conference on Information Technology and Applications, Lecture Notes in Networks and Systems 614, https://doi.org/10.1007/978-981-19-9331-2_61




1 Introduction Technology is a set of techniques that have undergone changes at all levels. All these advances have been very significant over time, and throughout this innovation process a number of challenges have arisen that have driven the refinement of data collection techniques and produced more and more solutions. The creation of participatory and composite environments is a fundamental tool in data mining; these environments are just one of the great advances in this important field of data study. Academic performance is one of the qualities and capacities of the student that can be measured, showing what has been learned throughout the student’s educational training. This measure is relevant for any person being studied; therefore, learning outcomes are related to the aptitudes and attitudes of these students [2]. When a student does not meet academic expectations, this may be related to behavioural variations shown in the student’s behaviour that may affect their daily activities [3]. To know the cause of the problem and make decisions accordingly, it is necessary to investigate the factors that determine whether a student has good or bad academic performance, which is also reflected in their grades. For that reason, data mining tools help us define behaviour patterns that influence students’ academic performance. We are currently living through a decisive period in human history; educational data mining is one of the application areas of data mining that helps predict dropout and learning habits and, in this way, help students in a timely manner. However, it has not been widely adopted in university education [4]. For this reason, data mining plays a fundamental role here, since it offers tools to understand the complexities that characterize the different systems, analyse technological, economic and environmental impacts, and evaluate production strategies, with a general approach aimed at better understanding the actual problems and predictions under investigation.

2 Methodology

This article examines the behaviour patterns that influence students' academic performance using data mining tools. The tools are applied to find patterns and relationships in student learning data, to identify what affects a student's academic performance, retention and passing of subjects, and to make recommendations on how to use the results obtained and the characteristics of the information sources as an innovative and appropriate learning management strategy. The methodology applied in this article is based on carrying out the data mining process in a systematic and non-trivial way; that is, we follow the knowledge discovery process as a guide for project planning and execution.

3 Background

3.1 Data Mining

In this section, topics related to the study are reviewed. The data mining process consists of collecting relevant information on previously defined topics; it is carried out over large amounts of information in order to find models in the data [1]. This state of the art shows how different data mining techniques and tools can help predict student dropout. For the process to be effective, it must run largely automatically, producing patterns that support better decisions. The main benefit is the transformation of data into knowledge, which is the most valuable input for feedback and improvement in the application area [1]. Within the broad field of data mining, education in particular offers advantages compared with more traditional research paradigms in education, whether experiments, sociological studies or design research.

First, supervised techniques use variables and fields of a database to predict unknown values, in order to obtain knowledge for a specific purpose. Unsupervised, or descriptive, techniques are used to extract important and relevant information from large databases in order to discover patterns that can then be interpreted. The main mining techniques [2] are as follows. Data clustering can be performed in several ways; one of them joins a data set based on the similarity of the entered data while searching for patterns [3]. Decision trees allow a more efficient organization of previous evaluations, with levels of importance defined from their branches down to their leaves [4]. Finally, association rules are used to discover situations that co-occur within a given data set: by searching for them, information about the entities belonging to or associated with the sought-after information is revealed. They are commonly applied in marketing, for example in cross-selling e-mail campaigns, customer segmentation and catalogue design, all in support of decision making [5].

DM has evolved and needs to be understood step by step. Originally, the purpose of information systems was to collect information on specific assets to support decision making. Data mining tools make it possible to extract patterns and trends, to explain and better understand the data, and to predict future behaviour; data mining complements the other tools for accessing information and makes analysis more effective [6].

DM uses two kinds of processes. In the supervised data mining process, an algorithm is trained on a previously known reference so that the whole process can be supervised; predictive models generally result from supervised learning. This is the main difference from the other subdivision, whose goal is the discovery of patterns. The use of a supervised model involves training, a process in which the software reviews similar cases where the target value is already known. In unsupervised data mining, the algorithm learns by itself: there is no distinction between dependent and independent attributes and no results are known prior to the investigation. For this reason, unsupervised mining is mainly used for descriptive purposes, although it can also be used to make predictions for a specific task. Obtaining this information allows us to take action and benefit from it [7]. There are successful cases of implementing data mining: thanks to its capacity to process large amounts of data, it helps identify patterns and behaviours based on variables such as the type of course, particular difficulties, grades, and so on.
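As an illustration of the supervised approach described above, the following minimal sketch trains a decision tree to flag dropout from a small, entirely synthetic table of student records. The feature names, values and the scikit-learn setup are assumptions chosen for illustration only; they are not taken from the studies reviewed here.

```python
# Minimal sketch of a supervised dropout classifier on synthetic data;
# feature names and values are hypothetical.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

# Hypothetical student records: average grade, failed subjects, attendance rate.
data = pd.DataFrame({
    "average_grade":   [8.5, 6.1, 9.0, 5.4, 7.2, 4.8, 8.9, 5.9],
    "failed_subjects": [0, 3, 0, 4, 1, 5, 0, 3],
    "attendance_rate": [0.95, 0.60, 0.98, 0.55, 0.80, 0.40, 0.97, 0.62],
    "dropped_out":     [0, 1, 0, 1, 0, 1, 0, 1],   # known outcome (label)
})

X = data.drop(columns="dropped_out")
y = data["dropped_out"]

# Train on known cases, then check predictions on held-out cases.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)
model = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_train, y_train)

print("held-out accuracy:", accuracy_score(y_test, model.predict(X_test)))
```

The held-out evaluation mirrors the supervised setting described above: the target value is known for the training cases, and the trained tree is then used to predict it for unseen students.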

3.2 Data Mining and Students

When we consider the problems of students' academic achievement, decisions about their itinerary are made on the basis of the available information, such as courses, schedules, sections and teachers. Recommendation systems based on data mining techniques have therefore been proposed to support students in these decisions and in their student life. In the course of such studies, several algorithms have been identified, each with different results, that helped students adjust the input data and improve the level of prediction. Data mining tools thus effectively helped students make better decisions when enrolling in an academic cycle, and studies in the education scenario show that decision tree algorithms and related techniques reach high levels of prediction [8].

The KDD process establishes a series of stages whose priority is the extraction of information from databases. It is also called a non-trivial process that identifies valid, novel and useful patterns in the data. This process of discovering information in a database begins with the identification of goals, which establish the targets to be achieved at the end of the process. The academic databases are then selected, covering the personal, academic and socio-economic characteristics of those enrolled in the programme.
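A compact way to read the KDD process sketched above is as a short pipeline: select the relevant attributes, clean the records, transform them into model-ready form, mine a model, and evaluate it. The sketch below is only an assumed illustration of that sequence on a synthetic stand-in for an institutional extract; all column names and values are hypothetical.

```python
# Sketch of the KDD stages (selection, cleaning, transformation, mining,
# evaluation) on a small synthetic table of academic records.
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for an institutional database extract (hypothetical).
raw = pd.DataFrame({
    "student_id":          [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    "age":                 [18, 19, 21, 18, 22, 20, 19, 23, 20, 18],
    "avg_grade":           [8.2, 5.1, 6.7, 9.0, None, 4.9, 7.5, 5.8, 8.8, 6.2],
    "socioeconomic_level": ["medium", "low", "low", "high", "medium",
                            "low", "medium", "low", "high", "medium"],
    "dropped_out":         [0, 1, 0, 0, 1, 1, 0, 1, 0, 0],
})

# 1. Selection: keep only the attributes relevant to the question.
selected = raw[["age", "avg_grade", "socioeconomic_level", "dropped_out"]]

# 2. Cleaning: discard incomplete records.
clean = selected.dropna()

# 3. Transformation: encode categorical attributes numerically.
prepared = pd.get_dummies(clean, columns=["socioeconomic_level"])

# 4. Data mining: fit a simple predictive model for dropout.
X = prepared.drop(columns="dropped_out")
y = prepared["dropped_out"]
model = LogisticRegression(max_iter=1000)

# 5. Evaluation and interpretation: cross-validated accuracy as a first check.
scores = cross_val_score(model, X, y, cv=3)
print("mean CV accuracy:", scores.mean())
```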



Once the collected data are in place, the analysis proceeds with the creation of a repository covering the pre-processing stage, which uses different techniques and operations for cleaning and preparing the data. The next stage delivers data that are ready and suitable for the application of the data mining algorithms [9].

These days, online learning has largely been adopted through educational platforms that help students in their curricular activities, and for this reason many educational institutions invest in this area of information technology, focusing on learning management systems (LMS). Interaction with users generates a large amount of information that can be used for multiple processes, but because of the exponential growth of these platforms it is difficult to classify and determine which information should be used to generate the appropriate knowledge. It is therefore important to develop tools that help obtain the desired information, and some of these tools involve data mining [10]. LMS-based platforms manage and distribute published content so that all content intended for students can be stored, organized and delivered. The LMS tools that contribute to management and learning are varied and are applied in the educational environment as follows. First, evaluation and monitoring allows students to be assessed through questionnaires with different types of questions (open, selection, fill-in, among others) in order to determine the degree of knowledge acquired during the course [11]. Second, user administration handles credential validation, the registration of users on the virtual platform, and the display of content such as courses and activities to the different types of students who access the system. Third, communication facilities such as chats, e-mail, student forums and bulletin boards keep teachers and students in touch in real time.

Knowledge discovery in databases includes not only obtaining models but also their evaluation and interpretation; it consists of the following phases: (1) collection of information from several sources; (2) cleaning, selection and transformation, which retains the most relevant attributes; (3) data mining, the phase in which the patterns or models are extracted; (4) evaluation and interpretation, in which the patterns obtained are analysed and assessed; and (5) diffusion, which presents the results obtained.

There are also auxiliary techniques, drawn from artificial intelligence and statistics, that give data mining many ways of achieving the same goal: intelligent algorithms that take a very large body of data, find patterns and support decisions. Among the most important techniques, the following can be mentioned [12]: neural networks, linear regression, statistical models, clustering, and association rules. Neural networks are models loosely inspired by the functions of the nervous system and are widely used in data science. Linear regression models relationships between relevant variables; it is efficient, but its main limitation lies in high-dimensional settings. Statistical models represent the equations used in empirical designs and in regression to describe changes in the response variable. Clustering groups a sequence of vectors according to their distance. Association rules reveal events that commonly occur together within the data. Through these techniques and models, data mining can detect patterns in students' academic performance by analysing their academic records, socio-economic factors and other attributes, in order to understand the complexity that characterizes the systems under study.
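For the clustering technique mentioned above, a minimal sketch with scikit-learn's k-means might look as follows; the input attributes, the synthetic values and the choice of three clusters are assumptions made only to illustrate how students with similar academic profiles can be grouped by distance.

```python
# Minimal k-means sketch: group students with similar academic profiles.
# The attributes and the choice of three clusters are illustrative assumptions.
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans

# Hypothetical matrix: [average grade, failed subjects, weekly study hours].
students = np.array([
    [9.1, 0, 12], [8.7, 1, 10], [6.0, 3, 4],
    [5.5, 4, 3],  [7.8, 1, 8],  [4.9, 5, 2],
])

# Scale so each attribute contributes comparably to the distance measure.
scaled = StandardScaler().fit_transform(students)

kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(scaled)
print("cluster per student:", kmeans.labels_)
```

Scaling before clustering is a common design choice here, since grades, counts of failed subjects and study hours live on very different numeric ranges.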

4 Analysis of Results

A detailed analysis of around 15 documents, including articles and theses, was carried out, which allowed us to observe and learn about the different data mining techniques and tools applied in each case study in the area of education. The resulting analyses are as follows.

The first finding concerns the variables used in each of the articles analysed (gender, age, marital status, race, ratings, number of subjects passed, number of subjects failed, average, campus, retention), from which attributes are obtained to facilitate the data mining phase using the personal and educational data of the students. For this type of research, a period long enough for a student's dropout to become evident should be considered. Suitable algorithms are needed for good results; for this type of project the KDD process (Knowledge Discovery in Databases) can be used, which allows the data to be selected, cleaned, transformed, projected and analysed in order to derive adequate patterns and models to be used as knowledge [13].

To find the characteristics of academic failure in institutions, according to a study carried out by students of the Carlos III University of Madrid, decision trees can be used to identify the students who need the most help in a particular area and, in turn, the index of school failure [14]. CHAID analysis (Chi-squared Automatic Interaction Detection) provides a summary diagram detailing the characteristics most strongly dependent on the study objective; together, these methods perform pairwise comparisons to find the most highly related predictor variables [14].

Another investigation in educational data mining, carried out by the Centre for Research in Science and Technology of the National Polytechnic Institute, gives as its main strategy the use of clustering methods, since the aim is not to model a set of relationships and responses. Clustering is used as an unsupervised model in which data objects are analysed without reference to a known class, as well as for label generalization, that is, finding and constructing groups such that similar patterns fall within the same set [15].

A study on the application of data mining in online education, carried out by students from the University of Francisco de Paula Santander Ocaña, highlights DM techniques, which produce results during the research process that begins once an organization's information is stored on its computers but must be applied according to the type of learning involved; DM tools also answer questions that were previously difficult to address [16]. In a study carried out in the Bachelor's degree in Information Systems at UNSE, the characteristics of the learning style of the enrolled students were identified using the descriptive method of data mining through cluster analysis, and a sensory, visual and active style was found to be predominant among the students [17]. In a successful case in which data mining was applied to a virtual learning environment, the CRISP-DM methodology was used, yielding a data analysis model and showing that 69% of students interacted in the English course; it also helped identify the factors that were decisive for the interaction results, such as exams, homework, economic resources and the employment status of each student [18].

The different data mining techniques are also applied in virtual teaching, which is developing rapidly and attracts researchers' interest because of its benefits. The techniques most used in the virtual teaching modality are classification, rule discovery, clustering and pattern sequencing, and they can be divided into two groups: those applied to e-learning-based teaching systems that do not integrate AI techniques, and Web-based adaptive hypermedia (SHA) systems [19]. Platforms and systems (LMS) in which students interact continuously provide information that can be used to identify behaviour patterns, analyse data, and decide which information should be used to generate solutions, thereby establishing the appropriate resources and activities in systems such as Moodle [19]. In the study "Application of Data Mining techniques to the analysis of the situation and academic behaviour of UGD students", data mining techniques were applied in an educational setting to identify the possible causes that may affect academic performance and the areas directly involved, and to provide recommendations that help develop strategies for academic management [20].

In each analysis, the different variables that can be used to measure academic performance and its consequences were determined. These variables are the following (a brief illustrative sketch follows the list):

• The unfavourable economic conditions of the students;
• The poor cultural level of the family to which the student belongs;

• The student's expectations regarding the importance of education;
• The incompatibility of the time dedicated to studies;
• The personal characteristics of the student;
• Little interest in studies in general and in the institution;
• The previous characteristics of the student.
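To connect variables like these with the CHAID-style dependence analysis mentioned earlier, the short sketch below cross-tabulates one hypothetical categorical factor against a dropout flag and applies a chi-square test of independence; the data, the variable names and the interpretation threshold are assumptions for illustration only.

```python
# Sketch: test whether a categorical factor is related to dropout with a
# chi-square test of independence (the idea behind CHAID-style splits).
# All values below are synthetic and purely illustrative.
import pandas as pd
from scipy.stats import chi2_contingency

records = pd.DataFrame({
    "socioeconomic_level": ["low", "low", "medium", "high", "low",
                            "medium", "high", "low", "medium", "high"],
    "dropped_out":         [1, 1, 0, 0, 1, 0, 0, 1, 1, 0],
})

# Contingency table: counts of dropout outcome per socio-economic level.
table = pd.crosstab(records["socioeconomic_level"], records["dropped_out"])

chi2, p_value, dof, expected = chi2_contingency(table)
print(f"chi2 = {chi2:.2f}, p-value = {p_value:.3f}")
```

A small p-value would suggest the factor is worth keeping as a predictor; with real data, each candidate variable from the list above could be screened this way before building the predictive model.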

5 Conclusions

Today, data mining plays a very important role in education, because it helps uncover the information contained in the data; its analysis makes it possible to identify the patterns of students with a high probability of dropping out. Data mining tools provide key algorithms for collecting information and various aids that facilitate the procedures for creating the models applicable to data analysis. The studies reviewed here support a better analysis of the possible factors affecting students' academic performance; to that end, a prior investigation must be established, using the means and methods needed to obtain all the relevant patterns. The different data mining techniques contribute to improving the educational system; whatever changes are introduced, the results obtained should be applied in the creation of strategic tools that help improve students' academic performance and allow guidelines to be developed, adapting each teacher's methods to the students' learning styles and building academic management models that support the administrative area of each educational institution.

References

1. Bibliography AT (2008) Modelo de Mineria de Datos para identificación de patrones. Guayaquil
2. Alejandro B, Daniel S, Ricardo G (2013) Mineria de datos educativa: Una herramienta para la investigacion de patrones de aprendizaje sobre un contexto educativo. Cent Investig En Cienc Apl Y Tecnol Av I:2–5
3. Alvaro J, Alvarez H (n.d.) Mineria de Datos en la Educacion. Universidad Carlos III, I(30):2–6
4. Alveiro R, Alejandra V (2017) Aplicación de la mineria de datos en la educación en linea. Universidad Francisco de Paula Santander Ocaña, I(29)
5. Angelica J (2015) Aplicación de técnicas de minería de datos para determinar las interacciones de los estudiantes en un entorno virtual de Aprendizaje. RTE 28(1):64–90
6. Apolaya Torres CH (2018) Tecnicas de indiferencias, prediccion y mineria de datos. Lima
7. Calvache Fernadez LC (2018) Aplicacion de tecnicas de mineria de datos para la identificacion de patrones de desercion estudiantil como apoyo a las estrategias de sara. Congresos Clabes, Panama
8. D E, C R (2007) Minería de datos para descubrir. Iberoam Educ I:8–9
9. Eckert K, Suénaga R (2013) Aplicación de técnicas de Minería de Datos al análisis de situación y comportamiento académico de alumnos de la UGD. XV Workshop Investig En Cienc Comput I(1):92–96
10. F LF (2007) La educación más allá del LMS. RUSC 4(1):1–7
11. Galindo ÁJ (2019) Minería de Datos en la Educación
12. García J, Acevedo A (2016) Análisis para predicción de ventas utilizando minería de datos en almacenes de ventas de grandes superficies
13. Lagla GA, Moreano JA, Arequipa EE, Quishpe MW (2019) Minería de datos como herramienta estratégica. ReciMundo 3(1):955–970
14. Mauricio M, Jheser G (2017) Scielo. Retrieved 21 Nov 2021 from https://scielo.conicyt.cl/scielo.php?pid=S0718-50062017000300007&script=sci_arttext&tlng=n
15. Montero J (2008) Mineria de Datos: Tecnicas y Herramientas. Area Universitaria, España
16. Morales C (2019) Estado actual de la aplicación de la minería de datos a los sistemas de enseñanza basada en web. Retrieved from http://www.investigacion.frc.utn.edu.ar/labsis/Publicaciones/congresos_lab
17. Riquelme J (2006) Minería de Datos: Conceptos y Tendencias
18. Sarango M (2019) Aplicación de técnicas de minería de datos para identificar patrones de comportamientos relacionados con las acciones del estudiante con el EVA de la UTPL. Loja
19. Suarez R (2009) Herramientas de mineria de datos. Universidad de las Ciencias Informaticas, Cuba
20. Villegas WE, Mora SL (2016) Análisis de las Herramientas de Minería de Datos para la Mejora del E-Learning en Plataformas LMS. SEECI, I, 1

Author Index

A Abreu, Fernando Brito e, 111, 311 Adnan, Muhammad, 435 Ahmad, Arshad, 163, 435 Akella, Gopi, 353 Alayedi, Mohanad, 499 Alexandre, Isabel, 299 Al-Hamed, Ahmed, 163 Alhatami, Emadalden, 89 Ali, Gohar, 449 Almeida de, Ana, 363 Almeida, Dora, 341 Al-Obeidat, Feras, 123 Alturas, Bráulio, 299 Amaral, Vasco, 111, 311 Amin, Adnan, 123 Anoop, V. S., 175, 201 Anwar, Sajid, 533 Aparicio, Joao Tiago, 409 Aparicio, Manuela, 409, 421, 605 Araghi, Tanya Koohpayeh, 547 Arora, Aakarsh, 225 Arriaga, Alexandre, 275 Asharaf, S., 175, 201 Aslam, Uzair, 491 Au-Yong-Oliveira, Manuel, 671

B Banoori, Farhad, 435 Bastardo, Rute, 521 Bayram, Barış, 99 Bazai, Sibghat Ullah, 27, 491 Bhatti, Uzair Aslam, 89 Bilgaiyan, Saurabh, 225 Bilgili, Coşkun Özenç, 99 Bolog, Sama, 17, 341 Boudjellal, Nada, 163 Branco, Frederico, 671 Brooks, Anthony L., 683

C Caiche, Anthony, 511 Caiza, José, 397 Câmara, Ariele, 363 Castro Neto de, Miguel, 605 Cezario, Bruno Santos, 557 Chan, Anthony, 353 Chiranjeevi, Srinath, 189 Coşkun, Nergiz, 99 Corte, Loeza, 593 Costa, Carlos J., 275, 409, 605 Costa, Helder Gomes, 139, 251

D Dai, Lin, 163 Dandolini, Gertrudes Aparecida, 289 Dar, Tarim, 77 d’Orey Pape, Rita, 605

E Echeverria, Fabricio, 713 Eufrazio, Edilvando Pereira, 251

F Feng, SiLing, 89



G Gomes, Rodrigo, 111 Gonçalves, Catarina, 671 Gonçalves, Ramiro, 671 Gourisaria, Mahendra Kumar, 225 Guarda, Teresa, 511 Guedes, André Luis Azevedo, 557 Guevara-Vega, Cathy, 397 Gul, Haji, 123

H Hajishirzi, Reihaneh, 615 Hakak, Saqib, 533 Hannen, Clemens Julius, 421 Hashmi, Muhammad Zaffar, 27 Hasnain, Ahmad, 27 Hassan, Farman, 3, 63, 151 Hayat, Bashir, 163 Hernadez Paxtian, Z. J., 593 Holley, Debbie, 655 Huang, MengXing, 89 Hussain, Muhammad Anwar, 215

I Irtaza, Aun, 239

J Jahan, Nusrat, 35 Javed, Ali, 77, 151, 239 Jickson, Sarin, 201 John, Romieo, 175

K Kabir, Farzana, 547 Kato, Takumi, 693 Kaur, Sharan Preet, 481 Khalid, Fatima, 239 Khan, Asif, 163 Khan, Habib Ullah, 323 Khan, Khalid, 435 Khan, Maqbool, 533 Khan, Tarique, 435 Kumsetty, Nikhil Venkat, 459 Küp, Eyüp Tolunay, 99

L Landeta, Pablo, 397 Leon, Marcelo, 713

M Machado de Bem, Andreia, 289, 645 Madera-Ramírez, Francisco, 331 Malik, Khalid Mahmood, 239 Marques, Célio Gonçalo, 655 Martins, José, 671 Megías, David, 547 Mehmood, Muhammad Hamza, 3, 151 Menéndez-Domínguez, Víctor H., 331 Miller, David, 353 Mise, José, 397 Mitu, Meherabin Akter, 35 Moreira, Fernando, 123, 671 Moura de, Pedro Estrela, 311 Moura, José, 469

N Nadeem, Basit, 27 Nagodavithana, Banuki, 47 Nazir, Shah, 323, 435 Nhabomba, Arlindo, 299 Nishad, Md. Ashiqur Rahaman, 35 Nizamani, Mir Muhammad, 27 Nuñez, Sergio, 713

O Obite, Felix, 435 Oliveira, João, 363 Orlando Guerrero, I. J., 593

P Pavão, João, 521 Pérez-Canto, Amílcar, 331 Pesqueira, Antonio, 17, 341 Pinto, José, 373 Pires, Carla, 629 Portela, Filipe, 373

Q Qayum, Fawad, 435 Qayyum, Huma, 63 Quiña-Mera, Antonio, 397

R Rafique, Wajid, 533 Rahman, Auliya Ur, 3, 63 Ramada, Óscar Teixeira, 385 Ramos, Michelle Merlino Lins Campos, 139 Rana, Bharti, 567

Redchuk, Andrés, 581 Reddy, Bharat, 189 Redroban, Carlos, 713 Richter, Marc François, 645 Rocha, Nelson Pacheco, 521 Rodríguez-Orozco, Daniel, 331 Rosero, Shendry, 263 Rudra, Bhawana, 459 Ruiz, Ulises, 593

S Saavedra, Jorge, 713 Sacavém, António, 645 Sahni, Manoj, 225 Salimbeni, Sergio, 581 Salinas, Isidro, 511 Sangurima, Miguel, 703 Santos dos, João Rodrigues, 645 Santos, Manuel Filipe, 373 Sawant, Sarvesh V., 459 Shah, Babar, 533 Shah, Syed Ali Asghar, 491 Shahzad, Khurram, 215 Singh, Surender, 481 Singh, Yashwant, 567 Sohail, Amir, 63

Sousa, Maria José, 17, 289, 341, 629, 645 Souza de, Luciano Azevedo, 139 Suarez, Cindy, 511 Sulaiman, Sarina, 215

T Tien, David, 353 Trindade, Anícia Rebelo, 655

U Ullah, Abrar, 47 Ullah, Rehmat, 491

W Wahab, Abdul, 151 William, Nobel John, 435

Z Zafar, Usama, 151 Zahoor, Rizwan, 435 Zainab, Zarah, 123 Zhang, Huaping, 163