Data Analytics in System Engineering: Proceedings of 7th Computational Methods in Systems and Software 2023, Vol. 4 (Lecture Notes in Networks and Systems, 935) 3031548191, 9783031548192

These proceedings offer an insightful exploration of integrating data analytics in system engineering.


English Pages 552 [550] Year 2024


Table of contents:
Preface
Organization
Contents
Evaluating the Reliability of Tests Used in LMS Moodle for E-Learning
1 Introduction
2 Methods
3 Results
4 Conclusion
References
Data Mart in Business Intelligence with Hefesto for Sales Area in a Dental Clinic
1 Introduction
2 Materials and Methods
2.1 Requirements Analysis. [Phase 1-Hefesto]
3 Results and Discussion
4 Conclusion
References
Features of Building Automated Educational Systems with Virtual and Augmented Reality Technologies
1 Introduction
2 Methods
3 Results and Discussion
4 Conclusion
References
Information and Communication Technology Skills for Instruction Performance: Beliefs and Experiences from Public School Educators
1 Introduction
2 Problem Statement
3 Methodology
4 Results
4.1 Information and Communication Technology (ICT) Skills for Instruction Performance
4.2 The Effectiveness ICT Applications on Supporting Instruction Performances
5 Discussion
6 Conclusion
References
Managing Information Quality for Learning Instruction: Insights from Public Administration Officers’ Experiences and Practices
1 Introduction
1.1 Problem Statement and Objectives
2 Literature Review
3 Methodology
3.1 Design of Study
3.2 Respondent Zone
4 Research Results and Discussion
4.1 Practical Implementation of Information Quality Achievement on Preaching Program
4.2 Process of Information Accuracy Enhancement Through Preaching Materials Arrangement
5 Conclusion
References
Development of Multivariate Stock Prediction System Using N-Hits and N-Beats
1 Introduction
2 Literature Review
2.1 Neural Hierarchical Interpolation for Time Series Forecasting (N-HiTS)
2.2 Neural Basis Expansion Analysis for Interpretable Time Series Forecasting (N-BEATS)
3 Methodology
4 Result and Discussion
4.1 Data Preprocessing
4.2 Model Construction
4.3 Model Result and Analysis
4.4 Application Result
5 Conclusion
References
Fractal Method for Assessing the Efficiency of Application of Closed Artificial Agroecosystems
1 Introduction
2 Materials and Methods
3 Discussion
4 Conclusion
References
Application of a Bioinspired Search Algorithm in Assessing Semantic Similarity of Objects from Heterogeneous Ontologies
1 Introduction
2 Analytical Overview and Problem Statement
3 Approach to Semantic Similarity Assessment
4 Modified Bioinspired Algorithm of White Mole Colony
5 Experiment
6 Conclusion
References
Leveraging Deep Object Detection Models for Early Detection of Cancerous Lung Nodules in Chest X-Rays
1 Introduction
2 Related Works
3 Methods and Materials
3.1 Faster-RCNN
3.2 Yolov5
3.3 EfficientDet
3.4 Dataset
3.5 Data Preprocessing and Format Conversion
3.6 Training and Validation
3.7 Training Configurations
3.8 Training Parameters
3.9 Final Architecture of the System
4 Results and Deployment
5 Conclusion
References
Analyzing Data by Applying Neural Networks to Identify Patterns in the Data
1 Introduction
2 Materials and Method
3 Results
4 Discussion
5 Conclusion
References
Intelligent Data Analysis as a Method of Determining the Influence of Various Factors on the Level of Customer Satisfaction of the Company
1 Introduction
2 Methods and Materials
3 Results
3.1 Analysis of All Attributes
3.2 Analyzing Two Groups of Attributes
3.3 Analyzing the Attributes with the Highest Importance
3.4 Analysis of Attributes with the Highest Correlation Coefficient
4 Conclusion
References
Correlation Analysis and Predictive Factors for Building a Mathematical Model
1 Introduction
2 Data Research
3 Conclusion
References
Geoportals in Solving the Problem of Natural Hazards Monitoring
1 Introduction
2 Related Work and Research Methods
3 Analysis of the Distribution of Spectral Channels in the Problem of Land Classification
4 Fire Monitoring by Means of Geoportals
5 Conclusion
References
Implementation of Individual Learning Trajectories in LMS Moodle
1 Introduction
2 Implementation of the Logic of Learning by Individual Trajectory in LMS Moodle on the Basis of the Concept of Reverse Programming
3 Conclusion
References
Application of Fuzzy Logic for Evaluating Student Learning Outcomes in E-Learning
1 Introduction
2 Methods
3 Results and Discussion
4 Conclusion
References
Advancing Recidivism Prediction for Male Juvenile Offenders: A Machine Learning Approach Applied to Prisoners in Hunan Province
1 Introduction
1.1 Research Goal
2 Related Works
3 Methodology
3.1 SAVRY
3.2 Dataset
3.3 Data Pre-processing
3.4 Feature Selection Method
3.5 Hyper Parameter Optimization
3.6 Learning Algorithms
3.7 Performance Metrics
3.8 Exploration with Ensemble Models
4 Results and Discussion
4.1 Performance Metrics Comparison for Various Algorithms
4.2 Performance Analysis of Different Algorithms with Hyperparameter Optimization
4.3 Performance Analysis of Different Algorithms After Using Various Feature Selection Methods
4.4 Error Analysis on Different Algorithms
4.5 Explainable AI
5 Conclusion and Future Scope
References
Development of Automated Essay Scoring System Using DeBERTa as a Transformer-Based Language Model
1 Introduction
2 Related Works
2.1 Traditional Approach
2.2 Deep Learning Approach
2.3 Pre-trained Language Model Approach
3 Methodology
3.1 Dataset Used
3.2 Text Preprocessing
3.3 Score Normalization
3.4 Loss Function
3.5 Evaluation Metric
3.6 Data Flow Diagram (DFD) System Design
4 Experiments
4.1 Automated Essay Scoring Model
4.2 Setup
4.3 Result and Discussions
5 Conclusion and Future Work
References
Prediction of Glycemic Control in Diabetes Mellitus Patients Using Machine Learning
1 Introduction
2 Related Works
3 Methodology
3.1 Dataset
3.2 Exploratory Data Analysis
3.3 Data Preprocessing
3.4 Splitting Dataset
3.5 Feature Selection
3.6 Min-Max Scaling
3.7 Under Sampling Technique
3.8 SMOTE
3.9 Classifiers Algorithms and Hyper-parameter Optimization
4 Experiment and Results
4.1 Performance Metrics
4.2 Experimental Results Without Sampling Technique
4.3 Experimental Results with Under Sampling Technique
4.4 Experimental Results with SMOTE
4.5 Learning Curve
4.6 LIME
4.7 SHAP
5 Discussion
6 Conclusion
References
Efficient Object Detection in Fused Visual and Infrared Spectra for Edge Platforms
1 Introduction
2 Related Work
3 Methodology
3.1 The Proposed Architecture for Spectra Fusion
3.2 Fusion Mechanism
4 Experimental Results
4.1 Dataset
4.2 Training
5 Conclusions
References
Designing AI Components for Diagnostics of Carotid Body Tumors
1 Introduction
2 Research Method
3 Results
3.1 Development of the RADIMALIB Library
4 Discussion and Future Work
References
A Review of the Concept, Applications, Risks and Control Strategies for Digital Twin
1 Introduction
2 Concept of Digital Twins
3 Architectural Context of Digital Twins
4 Application of Digital Twins
4.1 The Technologies Enabling Digital Twins
5 Risks and Challenges Associated with Digital Twins
6 Integration of Interdisciplinary Model
6.1 Risk Control Strategies for Digital Twins
7 Discussions and Recommendations
8 Conclusions
References
Processing of the Time Series of Passenger Railway Transport in EU Countries
1 Introduction
2 Data and Methods
2.1 Source Data from the Eurostat Database
2.2 Data Mining Software Orange and Design of Workflow
3 Results
4 Discussion
5 Conclusion
References
Factors Influencing Performance Evaluation of e-Government Diffusion
1 Introduction
2 Related Work
3 Research Model and Hypothesis
4 Research Methodology
4.1 Data Collection
4.2 Questionnaire Design
5 Results
5.1 Descriptive Statistics
5.2 Reliability Test
5.3 Factor Analysis
5.4 Regression Analysis
6 Data Analysis and Discussion
7 Limitations
8 Conclusion
References
An Analysis of Readiness for the Adoption of Augmented and Virtual Reality in the South African Schooling System
1 Introduction
1.1 Background of the Study
2 Related Studies
3 Methodology
3.1 Research Methodology
3.2 Sampling Techniques
3.3 Hypothesis for the Study
4 Experiments and Results
4.1 Reliability and Validity
4.2 Factor Analysis
4.3 Descriptive Statistical Analysis
4.4 One Sample t-test
4.5 Regression
5 Conclusion
References
Review of Technology Adoption Models and Theories at Organizational Level
1 Introduction
2 Literature Review
2.1 Defining Technology Adoption
2.2 Technology Adoption Theories and Models
3 The Systematic Review of the Empirical Literature on Technology-Organization-Environment (TOE) Framework
3.1 The Studies that Used (TOE) Framework with Other Theories
4 Discussion
5 Conclusion
References
Algorithmic Optimization Techniques for Operations Research Problems
1 Understanding the Landscape of Optimization
1.1 Modeling the Real World
1.2 Crafting the Blueprint of Optimization
2 Applications of Algorithmic Optimization
3 Conclusion and Challenges
References
Complex Comparison of Statistical and Econometrics Methods for Sales Forecasting
1 Introduction
2 Background
3 Data
4 Experiments Design
5 Metrics
6 Models
6.1 Baseline Models
6.2 Exponential Smoothing Models
6.3 Sparse or Intermittent Models
6.4 Multiple Seasonalities
6.5 Theta Family
6.6 ARCH Family
6.7 ARIMA Family
7 Results
8 Discussion and Future Work
9 Conclusion
References
The Meaning of Business and Software Architecture in the Enterprise Architecture
1 Business Architecture is the Bridge Between Software Architecture and Enterprise Architecture
2 Overlap of the Three Architectures
3 Basic Aspects of Building the Business Architecture
4 Conclusion
References
Towards Data-Driven Artificial Intelligence Models for Monitoring, Modelling and Predicting Illicit Substance Use
1 Introduction
1.1 Significance of the Study
2 Materials and Method
2.1 Study Design
2.2 Search Strategy
2.3 Study Selection and Eligibility Criteria
2.4 Data Extraction
2.5 Risk-of-Bias and Quality Assessment of Included Sources
3 Results Analysis and Discussion
3.1 Analysis of Risk-of-Bias
3.2 Artificial Intelligence Models for Predicting Illicit Substance Abuse
3.3 Risk Factors for Modelling Substance Abuse Using Artificial Intelligence Models
4 Barriers, Challenges and Recommendations for Integrating Data-Driven Artificial Intelligence Models for Tackling Illicit Substance Use
5 Conclusion
References
On the Maximal Sets of the Shortest Vertex-Independent Paths
1 Introduction
2 Preliminary Information: Definitions and Notations
3 Problem Statement
4 Algorithm Description
4.1 Algorithm 1
4.2 Algorithm 2
5 Modeling of Test Cases
6 Conclusion
References
Using Special Graph Invariants in Some Applied Network Problems
1 Introduction. The Problem of Isomorphism of Graphs
2 Preliminaries
3 Some Results of Computational Experiments
4 Theorem on the Relation of the Randich Index and the Double Vector
5 Conclusion
References
Analysis of Digital Analog Signal Filters
1 Introduction
2 Noise Synthesis
3 Signal Generation
4 Filters
4.1 Arithmetic Mean Filter
4.2 Median Filter
4.3 Exponential Running Average
4.4 Simple Kalman Filter
5 Comparative Analysis of Filters
6 Conclusion
References
Readiness for Smart City in Municipalities in Mbabane, Eswatini
1 Introduction
2 Related Work
3 Research Model and Hypothesis
4 Research Methodology
4.1 Data Collection
4.2 Questionnaire Design
5 Results
5.1 Reliability Test
5.2 Factor Analysis
6 Limitations
7 Conclusion
References
Assessment of Investment Attractiveness of Small Enterprises in Agriculture Based on Fuzzy Logic
1 Introduction
2 Methods, Models, Data
2.1 Unification of Indicators for Assessing Investment Attractiveness
3 Results
4 Discussions
References
Security in SCADA System: A Technical Report on Cyber Attacks and Risk Assessment Methodologies
1 Introduction
1.1 Background
1.2 Problem Statement
1.3 Objectives
1.4 Scope and Methodology
2 Overview of SCADA System
2.1 SCADA Layers
2.2 SCADA Architecture
2.3 SCADA Generations
2.4 Key Characteristics of SCADA System
3 The Evolving Threat Landscape: Common Attack Vectors and Vulnerabilities Specific to SCADA Systems
3.1 Threats and Attacks on SCADA
3.2 Common Attack Vectors
3.3 Vulnerabilities in SCADA Systems
4 Related Work
4.1 Compatibility Table
5 Methodology
5.1 Data Collection
5.2 Research Design
5.3 Data Analysis
5.4 Ethical Considerations
5.5 Limitations
5.6 Validity and Reliability
6 Risk Assessment Methodologies: Identifying, Analyzing, and Mitigating Risks
6.1 Importance of Risk Assessment in SCADA Systems
6.2 Risk Assessment Frameworks
6.3 Risk Assessment Approaches
6.4 Risk Mitigation Strategies
7 Implications of Effective Risk Assessment in Enhancing the Security of SCADA Systems
7.1 Proactive Identification of Vulnerabilities and Risks
7.2 Prioritization of Security Investments and Resource Allocation
7.3 Compliance with Regulations and Standards
7.4 Improved Incident Response and Resilience
7.5 Adaptation to Evolving Threat Landscape
7.6 Continuous Improvement of Security Measures
8 Future Trends Related to SCADA Systems
8.1 Comprehensive Analysis
9 Results and Discussion
9.1 Vulnerabilities in SCADA Systems
9.2 Common Attack Vectors
9.3 Risk Assessment Methodologies
9.4 Implications of Effective Risk Assessment
9.5 Case Studies
9.6 Limitations and Future Research
10 Summary and Conclusion
10.1 Key Findings
10.2 Conclusion
References
A Shap Interpreter-Based Explainable Decision Support System for COPD Exacerbation Prediction
1 Introduction
2 Previous Work
3 Materials and Methods
3.1 Dataset of COPD Patients
3.2 Classification Models
3.3 Evaluation Metrics
3.4 The Context of SHAP Interpreter
4 Results and Discussion
5 Conclusion
References
A Generic Methodology for Designing Smart Environment Based on Discrete-Event Simulation: A Conceptual Model
1 Introduction
2 Literature
3 Smart Environment, Technologies, and Smartness Features
4 Simulation-Based Assessment of Smart Enabling Technology: Proposed Methodology
4.1 Defining the Model: The Processes Parameters
4.2 Designing a Smart Environment Based on Discrete-Event Simulation the methodology
5 Conclusion
References
From Algorithms to Grants: Leveraging Machine Learning for Research and Innovation Fund Allocation
1 Introduction
2 Literature Review
2.1 Text Classification Algorithms
2.2 Related Works
3 Methodology
3.1 Design and Implementation of the Web Based System
3.2 Model Design and Implementation
4 Results and Discussions
4.1 Text Classification Models’ Performance
4.2 The Support Vector Machine (SVM)
5 Conclusion
6 Recommendations and Future Work
References
Ways and Directions of Development of the Terminal of Increased Security for Biometric Identification When Interacting with the Russian Unified Biometric System
1 Introduction
1.1 General Information
1.2 The Problem of Protecting Biometric Identification Endpoints
1.3 Legal Aspects of Biometrics Use in Russia
1.4 Unified Biometric System. Overview of the System and Interaction Procedures
2 Methodology
2.1 Justification of the Choice of the Way of Development of Biometric Terminal of Increased Security
2.2 The Concept of a Biometric Terminal with Enhanced Security
2.3 Building a Threat Model for a Biometric Authentication System
3 Discussion
4 Conclusion
References
Research of the Correlation Between the Results of Detection the Liveliness of a Face and Its Identification by Facial Recognition Systems
1 Introduction
1.1 General Information
1.2 Literature Review
2 Methodology
2.1 Tasks of Face Recognition and Anti-spoofing
2.2 Data Sets Used
3 Results and Discussion
4 Conclusion
References
Cooperative Cost Sharing of Joint Working Capital in Financial Supply Networks
1 Financial Supply Chain and its Management Tools
1.1 Cooperation in Financial Supply Chains
1.2 Collaborative Cash Conversion Cycle Formalization
2 Reduction and Allocation of Joint Working Capital Costs
2.1 Working Capital Costs for a Single Company
2.2 Game-Theoretical Approach for Joint Working Capital Cost Sharing
2.3 Coalition Payoffs
2.4 Example
3 Conclusion
References
Profit Prediction Model for Steel Enterprises Based on Commodity Prices of Inputs and Outputs
1 Introduction
2 Related Work
3 Approach
3.1 General
3.2 Methodology
3.3 Model Performance
3.4 Specific Conclusions
3.5 Expand and Optimize the Model
3.6 The Continuous Upgrade Model
3.7 Building a Weekly Predictive Model
3.8 Model for Enterprises Using EAF Technology
3.9 Specific Conclusions
4 Conclusions
References
Author Index


Lecture Notes in Networks and Systems 935

Radek Silhavy Petr Silhavy   Editors

Data Analytics in System Engineering Proceedings of 7th Computational Methods in Systems and Software 2023, Vol. 4

Lecture Notes in Networks and Systems

935

Series Editor Janusz Kacprzyk , Systems Research Institute, Polish Academy of Sciences, Warsaw, Poland

Advisory Editors Fernando Gomide, Department of Computer Engineering and Automation—DCA, School of Electrical and Computer Engineering—FEEC, University of Campinas—UNICAMP, São Paulo, Brazil Okyay Kaynak, Department of Electrical and Electronic Engineering, Bogazici University, Istanbul, Türkiye Derong Liu, Department of Electrical and Computer Engineering, University of Illinois at Chicago, Chicago, USA Institute of Automation, Chinese Academy of Sciences, Beijing, China Witold Pedrycz, Department of Electrical and Computer Engineering, University of Alberta, Alberta, Canada Systems Research Institute, Polish Academy of Sciences, Warsaw, Poland Marios M. Polycarpou, Department of Electrical and Computer Engineering, KIOS Research Center for Intelligent Systems and Networks, University of Cyprus, Nicosia, Cyprus Imre J. Rudas, Óbuda University, Budapest, Hungary Jun Wang, Department of Computer Science, City University of Hong Kong, Kowloon, Hong Kong

The series “Lecture Notes in Networks and Systems” publishes the latest developments in Networks and Systems—quickly, informally and with high quality. Original research reported in proceedings and post-proceedings represents the core of LNNS. Volumes published in LNNS embrace all aspects and subfields of, as well as new challenges in, Networks and Systems. The series contains proceedings and edited volumes in systems and networks, spanning the areas of Cyber-Physical Systems, Autonomous Systems, Sensor Networks, Control Systems, Energy Systems, Automotive Systems, Biological Systems, Vehicular Networking and Connected Vehicles, Aerospace Systems, Automation, Manufacturing, Smart Grids, Nonlinear Systems, Power Systems, Robotics, Social Systems, Economic Systems and other. Of particular value to both the contributors and the readership are the short publication timeframe and the world-wide distribution and exposure which enable both a wide and rapid dissemination of research output. The series covers the theory, applications, and perspectives on the state of the art and future developments relevant to systems and networks, decision making, control, complex processes and related areas, as embedded in the fields of interdisciplinary and applied sciences, engineering, computer science, physics, economics, social, and life sciences, as well as the paradigms and methodologies behind them. Indexed by SCOPUS, INSPEC, WTI Frankfurt eG, zbMATH, SCImago. All books published in the series are submitted for consideration in Web of Science. For proposals from Asia please contact Aninda Bose ([email protected]).

Radek Silhavy · Petr Silhavy Editors

Data Analytics in System Engineering Proceedings of 7th Computational Methods in Systems and Software 2023, Vol. 4

Editors Radek Silhavy Faculty of Applied Informatics Tomas Bata University in Zlin Zlin, Czech Republic

Petr Silhavy Faculty of Applied Informatics Tomas Bata University in Zlin Zlin, Czech Republic

ISSN 2367-3370 ISSN 2367-3389 (electronic) Lecture Notes in Networks and Systems ISBN 978-3-031-54819-2 ISBN 978-3-031-54820-8 (eBook) https://doi.org/10.1007/978-3-031-54820-8 © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2024 This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Springer imprint is published by the registered company Springer Nature Switzerland AG The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland Paper in this product is recyclable.

Preface

Welcome to Volume 4 of the conference proceedings for the distinguished Computational Methods in Systems and Software 2023 (CoMeSySo). This volume, titled “Data Analytics in System Engineering,” underscores the role of data in reshaping the landscape of system engineering.

In our data-centric world, integrating data analytics into system engineering is not just a trend but a necessity. The contributions within this volume shed light on the myriad ways data analytics is being leveraged to drive innovation, streamline processes, and address intricate challenges in system engineering.

CoMeSySo has always been a beacon for cutting-edge research and a hub for thought leaders from diverse backgrounds. This year, our focus on data analytics mirrors the global emphasis on harnessing data for actionable insights and the transformative potential it brings to system engineering.

We extend our deepest gratitude to all contributors, reviewers, and the organizing committee. Their dedication, expertise, and tireless efforts have shaped this volume into a repository of knowledge that stands at the intersection of data science and system engineering.

To our esteemed readers, we present this volume as both a resource and an inspiration. Whether you’re an industry expert, a researcher, or someone intrigued by the synergy of data and engineering, these pages promise a journey of discovery and enlightenment. As we reflect on the ground-breaking work showcased here, we also look forward with eagerness to the evolving horizons of system engineering. The adventure of exploration and innovation is ceaseless, and we are thrilled to share this chapter with you.

Radek Silhavy
Petr Silhavy

Organization

Program Committee

Program Committee Chairs

Petr Silhavy: Tomas Bata University in Zlin, Faculty of Applied Informatics
Radek Silhavy: Tomas Bata University in Zlin, Faculty of Applied Informatics
Zdenka Prokopova: Tomas Bata University in Zlin, Faculty of Applied Informatics
Roman Senkerik: Tomas Bata University in Zlin, Faculty of Applied Informatics
Roman Prokop: Tomas Bata University in Zlin, Faculty of Applied Informatics
Viacheslav Zelentsov: Doctor of Engineering Sciences, Chief Researcher of St. Petersburg Institute for Informatics and Automation of Russian Academy of Sciences (SPIIRAS)
Roman Tsarev: Department of Information Technology, International Academy of Science and Technologies, Moscow, Russia
Stefano Cirillo: Department of Computer Science, University of Salerno, Fisciano (SA), Italy

Program Committee Members

Juraj Dudak: Faculty of Materials Science and Technology in Trnava, Slovak University of Technology, Bratislava, Slovak Republic
Gabriel Gaspar: Research Centre, University of Zilina, Zilina, Slovak Republic
Boguslaw Cyganek: Department of Computer Science, University of Science and Technology, Krakow, Poland
Krzysztof Okarma: Faculty of Electrical Engineering, West Pomeranian University of Technology, Szczecin, Poland
Monika Bakosova: Institute of Information Engineering, Automation and Mathematics, Slovak University of Technology, Bratislava, Slovak Republic
Pavel Vaclavek: Faculty of Electrical Engineering and Communication, Brno University of Technology, Brno, Czech Republic
Miroslaw Ochodek: Faculty of Computing, Poznan University of Technology, Poznan, Poland
Olga Brovkina: Global Change Research Centre, Academy of Science of the Czech Republic, Brno, Czech Republic & Mendel University of Brno, Czech Republic
Elarbi Badidi: College of Information Technology, United Arab Emirates University, Al Ain, United Arab Emirates
Luis Alberto Morales Rosales: Head of the Master Program in Computer Science, Superior Technological Institute of Misantla, Mexico
Mariana Lobato Baes (Research-Professor): Superior Technological of Libres, Mexico
Abdessattar Chaâri: Laboratory of Sciences and Techniques of Automatic Control & Computer Engineering, University of Sfax, Tunisian Republic
Gopal Sakarkar: Shri. Ramdeobaba College of Engineering and Management, Republic of India
V. V. Krishna Maddinala: GD Rungta College of Engineering & Technology, Republic of India
Anand N. Khobragade (Scientist): Maharashtra Remote Sensing Applications Centre, Republic of India
Abdallah Handoura: Computer and Communication Laboratory, Telecom Bretagne, France
Almaz Mobil Mehdiyeva: Department of Electronics and Automation, Azerbaijan State Oil and Industry University, Azerbaijan

Technical Program Committee Members

Ivo Bukovsky, Czech Republic
Maciej Majewski, Poland
Miroslaw Ochodek, Poland
Bronislav Chramcov, Czech Republic
Eric Afful Dazie, Ghana
Michal Bliznak, Czech Republic
Donald Davendra, Czech Republic
Radim Farana, Czech Republic
Martin Kotyrba, Czech Republic
Erik Kral, Czech Republic
David Malanik, Czech Republic
Michal Pluhacek, Czech Republic
Zdenka Prokopova, Czech Republic
Martin Sysel, Czech Republic
Roman Senkerik, Czech Republic
Petr Silhavy, Czech Republic
Radek Silhavy, Czech Republic
Jiri Vojtesek, Czech Republic
Eva Volna, Czech Republic
Janez Brest, Slovenia
Ales Zamuda, Slovenia
Roman Prokop, Czech Republic
Boguslaw Cyganek, Poland
Krzysztof Okarma, Poland
Monika Bakosova, Slovak Republic
Pavel Vaclavek, Czech Republic
Olga Brovkina, Czech Republic
Elarbi Badidi, United Arab Emirates

Organizing Committee Chair Radek Silhavy

Tomas Bata University in Zlin, Faculty of Applied Informatics, email: [email protected]

Conference Organizer (Production) Silhavy s.r.o. Web: https://comesyso.openpublish.eu Email: [email protected]

Conference Website, Call for Papers https://comesyso.openpublish.eu

Contents

Evaluating the Reliability of Tests Used in LMS Moodle for E-Learning . . . 1
Rukiya Deetjen-Ruiz, Jorge Alberto Esponda-Pérez, Ikhfan Haris, Darío Salguero García, José Luis Quispe Osorio, and Roman Tsarev

Data Mart in Business Intelligence with Hefesto for Sales Area in a Dental Clinic . . . 9
Maria Caycho Dominguez, Gian Terrones Castrejon, Juan J. Soria, Mercedes Vega Manrique, and Lidia Segura Peña

Features of Building Automated Educational Systems with Virtual and Augmented Reality Technologies . . . 25
Oleg Slavin

Information and Communication Technology Skills for Instruction Performance: Beliefs and Experiences from Public School Educators . . . 34
Fatin Ardani Zamri, Norhisham Muhamad, and Miftachul Huda

Managing Information Quality for Learning Instruction: Insights from Public Administration Officers’ Experiences and Practices . . . 41
Arbaenah Masud, Abd Hadi Borham, Miftachul Huda, Mohamad Marzuqi Abdul Rahim, and Husna Husain

Development of Multivariate Stock Prediction System Using N-Hits and N-Beats . . . 50
Nathanael Jeffrey, Alexander Agung Santoso Gunawan, and Aditya Kurniawan

Fractal Method for Assessing the Efficiency of Application of Closed Artificial Agroecosystems . . . 64
Alexander P. Grishin, Andrey A. Grishin, and Vladimir A. Grishin

Application of a Bioinspired Search Algorithm in Assessing Semantic Similarity of Objects from Heterogeneous Ontologies . . . 69
Vladislav I. Danilchenko, Eugenia V. Danilchenko, and Victor M. Kureychik

Leveraging Deep Object Detection Models for Early Detection of Cancerous Lung Nodules in Chest X-Rays . . . 79
Md. Tareq Mahmud, Shayam Imtiaz Shuvo, Nafis Iqbal, and Sifat Momen

Analyzing Data by Applying Neural Networks to Identify Patterns in the Data . . . 99
A. S. Borodulin, V. V. Kukartsev, Anna R. Glinscaya, A. P. Gantimurov, and A. V. Nizameeva

Intelligent Data Analysis as a Method of Determining the Influence of Various Factors on the Level of Customer Satisfaction of the Company . . . 109
Vladislav Kukartsev, Vladimir Nelyub, Anastasia Kozlova, Aleksey Borodulin, and Anastasia Rukosueva

Correlation Analysis and Predictive Factors for Building a Mathematical Model . . . 129
V. A. Nelyub, V. S. Tynchenko, A. P. Gantimurov, Kseniya V. Degtyareva, and O. I. Kukartseva

Geoportals in Solving the Problem of Natural Hazards Monitoring . . . 142
Stanislav A. Yamashkin, A. A. Yamashkin, A. S. Rotanov, Yu. E. Tepaeva, E. O. Yamashkina, and S. M. Kovalenko

Implementation of Individual Learning Trajectories in LMS Moodle . . . 159
Faycal Bensalah, Marjorie P. Daniel, Indrajit Patra, Darío Salguero García, Shokhida Irgasheva, and Roman Tsarev

Application of Fuzzy Logic for Evaluating Student Learning Outcomes in E-Learning . . . 175
Mikaël A. Mousse, Saman M. Almufti, Darío Salguero García, Ikhlef Jebbor, Ayman Aljarbouh, and Roman Tsarev

Advancing Recidivism Prediction for Male Juvenile Offenders: A Machine Learning Approach Applied to Prisoners in Hunan Province . . . 184
Sadia Sultana, Israka Jahir, Mabeean Suukyi, Md. Mohibur Rahman Nabil, Afsara Waziha, and Sifat Momen

Development of Automated Essay Scoring System Using DeBERTa as a Transformer-Based Language Model . . . 202
Hansel Susanto, Alexander Agung Santoso Gunawan, and Muhammad Fikri Hasani

Prediction of Glycemic Control in Diabetes Mellitus Patients Using Machine Learning . . . 216
Md. Farabi Mahbub, Warsi Omrao Khan Shuvo, and Sifat Momen

Efficient Object Detection in Fused Visual and Infrared Spectra for Edge Platforms . . . 243
Piotr Janyst, Bogusław Cyganek, and Łukasz Przebinda

Designing AI Components for Diagnostics of Carotid Body Tumors . . . 254
Tatyana Maximova and Ekaterina Zhabrovets

A Review of the Concept, Applications, Risks and Control Strategies for Digital Twin . . . 264
Farnaz Farid, Abubakar Bello, Nusrat Jahan, and Razia Sultana

Processing of the Time Series of Passenger Railway Transport in EU Countries . . . 283
Zdena Dobesova

Factors Influencing Performance Evaluation of e-Government Diffusion . . . 294
Mkhonto Mkhonto and Tranos Zuva

An Analysis of Readiness for the Adoption of Augmented and Virtual Reality in the South African Schooling System . . . 304
Nellylyn Moyo, Anneke Harmse, and Tranos Zuva

Review of Technology Adoption Models and Theories at Organizational Level . . . 322
Mkhonto Mkhonto and Tranos Zuva

Algorithmic Optimization Techniques for Operations Research Problems . . . 331
Carla Silva, Ricardo Ribeiro, and Pedro Gomes

Complex Comparison of Statistical and Econometrics Methods for Sales Forecasting . . . 340
Oleksandr Kosovan and Myroslav Datsko

The Meaning of Business and Software Architecture in the Enterprise Architecture . . . 356
Kamelia Shoilekova, Boyana Ivanova, and Magdalena Andreeva

Towards Data-Driven Artificial Intelligence Models for Monitoring, Modelling and Predicting Illicit Substance Use . . . 361
Elliot Mbunge, John Batani, Itai Chitungo, Enos Moyo, Godfrey Musuka, Benhildah Muchemwa, and Tafadzwa Dzinamarira

On the Maximal Sets of the Shortest Vertex-Independent Paths . . . 380
Boris Melnikov and Yulia Terentyeva

Using Special Graph Invariants in Some Applied Network Problems . . . 388
Boris Melnikov, Aleksey Samarin, and Yulia Terentyeva

Analysis of Digital Analog Signal Filters . . . 393
Denis Gruzdkov and Andrey Rachishkin

Readiness for Smart City in Municipalities in Mbabane, Eswatini . . . 402
Mkhonto Mkhonto and Tranos Zuva

Assessment of Investment Attractiveness of Small Enterprises in Agriculture Based on Fuzzy Logic . . . 411
Ulzhan Makhazhanova, Aigerim Omurtayeva, Seyit Kerimkhulle, Akylbek Tokhmetov, Alibek Adalbek, and Roman Taberkhan

Security in SCADA System: A Technical Report on Cyber Attacks and Risk Assessment Methodologies . . . 420
Sadaquat Ali

A Shap Interpreter-Based Explainable Decision Support System for COPD Exacerbation Prediction . . . 447
Claudia Abineza, Valentina Emilia Balas, and Philibert Nsengiyumva

A Generic Methodology for Designing Smart Environment Based on Discrete-Event Simulation: A Conceptual Model . . . 459
Shady Aly, Tomáš Benda, Jan Tyrychtr, and Ivan Vrana

From Algorithms to Grants: Leveraging Machine Learning for Research and Innovation Fund Allocation . . . 469
Rebecca Lupyani and Jackson Phiri

Ways and Directions of Development of the Terminal of Increased Security for Biometric Identification When Interacting with the Russian Unified Biometric System . . . 481
Timur R. Abdullin, Andrey M. Bonch-Bruevich, Sergei A. Kesel, Timur V. Shipunov, Ilya V. Ovsyannikov, Denis A. Konstantinov, and Utkurbek B. Mamatkulov

Research of the Correlation Between the Results of Detection the Liveliness of a Face and Its Identification by Facial Recognition Systems . . . 493
Aleksandr A. Shnyrev, Ramil Zainulin, Daniil Solovyev, Maxim S. Isaev, Timur V. Shipunov, Timur R. Abdullin, Sergei A. Kesel, Denis A. Konstantinov, and Ilya V. Ovsyannikov

Cooperative Cost Sharing of Joint Working Capital in Financial Supply Networks . . . 503
Vladislav Novikov, Nikolay Zenkevich, and Andrey Zyatchin

Profit Prediction Model for Steel Enterprises Based on Commodity Prices of Inputs and Outputs . . . 519
Cuong Bui Van, Trung Do Cao, Hoang Do Minh, Ha Pham Dung, Kien Vu Trung, Le Nguyen Ha An, Hop Do Quang, and Phuc Phan Dinh

Author Index . . . 535

Evaluating the Reliability of Tests Used in LMS Moodle for E-Learning Rukiya Deetjen-Ruiz1 , Jorge Alberto Esponda-Pérez2 , Ikhfan Haris3 , Darío Salguero García4 , José Luis Quispe Osorio5 , and Roman Tsarev6,7(B) 1 Zayed University, Abu Dhabi, United Arab Emirates 2 UNICACH, Tuxtla Gutiérrez, Mexico 3 Universitas Negeri Gorontalo, Gorontalo, Indonesia 4 Almería University, Almería, Spain 5 Universidad Nacional “Toribio Rodríguez de Mendoza”, Amazonas, Peru 6 MIREA - Russian Technological University (RTU MIREA), Moscow, Russia

[email protected] 7 Bauman Moscow State Technical University, Moscow, Russia

Abstract. Outdated educational technologies are being replaced by e-learning technologies, which extend traditional education both in terms of facilities and of new pedagogical methods. An innovative learning tool such as the learning management system Moodle allows objective, automated control of students’ knowledge. With LMS Moodle, teachers can conduct electronic tests, store information about test results, analyze it and build individual tests on this basis. The quality of an electronic test is determined mainly by its reliability, that is, the accuracy of its measurements and the stability of its results against extraneous random factors. Teachers should take into account a number of factors that influence the reliability of a test; special attention should be paid to the content, differentiation and difficulty level of test tasks. To date, several methods have been developed to evaluate test reliability, including the split-half method, which requires only a single administration of the test. On the basis of this method, using the Spearman-Brown correlation coefficient, in this paper we evaluate the reliability of a test created in LMS Moodle. #COMESYSO1120.

Keywords: E-learning · LMS Moodle · Testing · Reliability · Spearman-Brown Correlation Coefficient · Split-Half Method

1 Introduction

Electronic learning (e-learning) is a new model of the educational environment based on the application of information and communication, multimedia and Internet technologies in a virtual, remote form to improve the quality of education [1–6]. In the framework of e-learning, the teacher is no longer the center of the educational process and the main source of information. Independence from the place and time of learning makes education accessible. Learners actively develop critical thinking and the skills of searching for, processing and analyzing knowledge, and can independently organize, regulate, control and evaluate their results [7–13].


E-learning technologies allow educational material to be presented in microdoses, small blocks of specific theoretical material together with its practical application. This facilitates the assimilation of knowledge and promotes motivation. Interactive forms of material presentation (training videos, images, etc.) contribute to visualization [14]. In addition, e-learning makes it possible to implement an individual educational trajectory for each student with the help of e-learning tools, which provide an individual course of study depending on existing and mastered knowledge, preferences and opportunities [15–17]. Even personalized chatbots can be developed in such courses to unlock the student’s potential.
E-learning tools and technologies also fulfill one of the fundamental pedagogical functions: control. Back in 1928, the American professor Sidney L. Pressey invented a device for testing knowledge, the “Machine for intelligence tests”, which reduced the time spent on checking tests and, as a result, opened up possibilities for individual, including practical, work between teacher and student [18]. In appearance, the device resembled a typewriter with a window showing a question and keys for selecting answer options, as well as a counter of correct answers. As answers were selected, the machine moved on to the next question, so teachers could simply insert the test sheet into the device and view the final result. This served as a serious impetus for the development of the various educational platforms and environments for e-testing that exist today. Thus, e-learning offers undeniable advantages not only for students but also for teachers, because it facilitates and optimizes the work of teachers through various innovative methods and technical means.
One such tool is the electronic educational environment implemented by means of LMS Moodle, which is used all over the world [19–22]. LMS Moodle makes it possible to plan, conduct and manage training sessions, virtual classes and courses [23–25]. Its main advantages are the various ways of providing and storing educational material, as well as the possibility of electronic testing and of creating and storing test questions in a question bank [26, 27]. Automatic checking and assessment of tasks saves the teacher’s time and eliminates the subjective factor, while testing places the least psychological load on students and best reflects their real knowledge. LMS Moodle allows different types of questions to be created, including questions with visual support [28–30]. Tests can serve as input control, self-control, intermediate or final control. Any test question can contain a list of answers and comments, questions can have one or several correct answers, and open-ended questions are also supported. Electronic testing in LMS Moodle does not require additional software and can be conducted both face-to-face in the classroom and remotely, depending on how the course is organized. The test itself has various settings, such as the time allowed for the test, the number of attempts, and which questions from the question bank to use. Importantly, all information about the test results is saved, which makes possible a multilateral analysis of the test results, as well as of the test questions themselves, and the formation of individual tests based on them. With LMS Moodle it is also possible to provide feedback on the completion of the test. This is realized in several ways: providing a report on the correctness of the student’s answers, a quantitative evaluation of correct answers, or a demonstration of the correct answer to the student.


Creating test tasks is not an easy methodological task. One of the main quality criteria of an electronic test, and of an adequate assessment of a student’s knowledge, is its reliability. Reliability ensures the accuracy of measurements and the stability of test results against extraneous random factors. The more consistent the results of the same person in repeated testing with the same test or its equivalent form, the more reliable the test is. Accordingly, the evaluation of test reliability is based on calculating the correlation between two sets of results of the same test or of its two equivalent forms: the higher the correlation, the more reliable the test. A reliability coefficient between 0.8 and 1 is considered good.
The factors influencing the reliability of a test include the representativeness of the sample of test items, the degree of homogeneity of the test takers’ level of training, a small number of test tasks, the variety and difficulty of tasks, insufficient time provided for the test, unclear instructions for the test or test tasks, randomly guessed correct answers, students’ well-being during testing, and the test conditions. Thus, the reliability of a test directly depends on the quality, content, difficulty, and number of the test questions created. It is necessary to develop a sufficiently large number of differentiated questions and to avoid questions that are too difficult or too easy. By taking all of the above factors into account and eliminating conditions that reduce measurement accuracy, a high level of test reliability can be achieved.
There are several methods for evaluating test reliability [31–33]: the test-retest method, the parallel-forms method, and the split-half method. The test-retest method uses the same test twice in one group of test takers and is based on calculating the correlation of individual scores between the first and second administrations. However, this method has a serious disadvantage, the time factor, since the interval between administrations should be neither too long (the level of academic achievement will change) nor too short (students may remember the test tasks and their answers). With the parallel-forms method, which also consists in testing the same group of examinees twice, the difficulty lies in creating a new test that is identical in content, structure and difficulty to the first one. The most practical method for evaluating test reliability is therefore the split-half method, since it requires only a single test administration; it is based on the assumption of parallelism of the two halves of the test and involves dividing the test results into two parts: data on odd-numbered test questions and on even-numbered questions. This is the method we use in our study to evaluate the reliability of the test, based on the Spearman-Brown formula.

2 Methods

The Spearman-Brown correlation coefficient [34, 35] is used to evaluate test reliability. Let us introduce the following notations:

m is the number of questions in the test;
n is the number of students who passed the test;
y_i is the score obtained by the i-th student (i = 1, ..., n) when taking the test, y_i = \sum_{j=1}^{m} x_{ij};
y_i^e, y_i^o are the scores obtained by the i-th student when answering even and odd test questions, respectively (we will divide the test questions into even and odd to find the correlation coefficient);
\bar{y}^e, \bar{y}^o are the arithmetic means of the scores obtained by all students when answering even and odd test questions, respectively, \bar{y}^e = \frac{\sum_{i=1}^{n} y_i^e}{n}, \bar{y}^o = \frac{\sum_{i=1}^{n} y_i^o}{n};
s^e, s^o are the sums of squares of the deviations of the i-th student's scores from their mean values, s^e = \sum_{i=1}^{n} (y_i^e - \bar{y}^e)^2, s^o = \sum_{i=1}^{n} (y_i^o - \bar{y}^o)^2;
s^{eo} is the sum of the products of the deviations of the values y_i^e and y_i^o from their mean values, s^{eo} = \sum_{i=1}^{n} (y_i^e - \bar{y}^e)(y_i^o - \bar{y}^o);
D^e, D^o are the estimates of the variance of the test scores y_i^e and y_i^o, D^e = \frac{s^e}{n-1}, D^o = \frac{s^o}{n-1};
K^{eo} is the estimation of the correlation moment of the values y_i^e and y_i^o, K^{eo} = \frac{s^{eo}}{n};
r^{eo} is the estimation of the correlation coefficient of the values y_i^e and y_i^o, r^{eo} = \frac{K^{eo}}{\sqrt{D^e} \sqrt{D^o}}.

The Spearman-Brown correlation coefficient is defined as

r = \frac{2 r^{eo}}{1 + r^{eo}}.  (1)

5

Fig. 1. Test results.

[27]. Thus, if the standard deviation for some question is zero, it means that all students answered it similarly. The calculation of the standard deviation for the test questions used in the experiment is shown in Fig. 2. The standard deviation characterizes the spread of scores obtained for answering each specific question. Figure 2 shows that questions 5, 17 and 23 need to be improved or replaced because they fail to have differentiating ability. The rest of the questions are characterized by acceptable values of the standard deviation and can be used in future testing.

Fig. 2. Standard deviation of the results of test questions.

6

R. Deetjen-Ruiz et al.

4 Conclusion The digital revolution, global social changes and, as a result, transformations in the field of education have contributed to the emergence of the e-learning phenomenon. Although e-learning does not completely replace traditional learning, e-learning technologies have a great number of advantages and expand the boundaries of educational opportunities to improve the quality and accessibility of education in general. E-learning with the help of ICT tools allows to combine different types of information, to provide interaction of the learner with the system and the teacher, to use multimedia components to increase the level of motivation of students, to activate individual learning, to reduce the learning load of students, to ensure the continuity of learning by eliminating spatial and time constraints, to save the resources of teachers, as well as to monitor students progress and outcomes within the framework of an electronic educational environment such as LMS Moodle. LMS Moodle is an innovative opportunity to manage learning, a system focused on the organization of remote interaction between teachers and students. LMS Moodle testing system is flexible, allows to create questions of different types and types, allows you to analyze the test results of an individual student, a group or several groups. Electronic testing is distinguished by the modeling of test tasks based on a given algorithm, efficiency in summarizing the results, the possibility of self-control and feedback from students, taking into account the individual choice of place and time, the absence of personal direct contact with the teacher in the classroom, which often creates psychological stress. Qualitative pedagogical control of obtaining real results of students’ mastered knowledge is one of the main tasks of a teacher, which entails a number of difficulties associated with the development of a reliable, valid test, which is why this issue is one of the topical problems of pedagogy. The evaluation of test reliability is based on the calculation of correlation between two sets of results of the same test or its two equivalent forms. The quality of diagnostic materials is a key point of knowledge control. In order to exclude inadequate assessment of knowledge it is necessary to use only statistically based test materials that have a sufficient level of reliability. In this paper, the existing methods for assessing test reliability were considered. Based on the results of the analysis, the split-half method was chosen and the SpearmanBrown correlation coefficient was used. This method is based on a single test, assumption of parallelism of two halves of the test and assumes division of test results into two parts: data on odd-numbered test items and on even-numbered items. The test results showed that the scores are distributed according to the normal law of distribution, so this bank of test questions has validity and can be used as a tool for testing students’ knowledge. The calculation of the Spearman-Brown correlation coefficient showed that the test has a sufficient level of reliability and allows to objectively assess students’ knowledge. Thanks to the calculation of the standard deviation, the test questions with insufficient differentiating power were identified. Their replacement or refinement improved the quality of the test.

Evaluating the Reliability of Tests Used in LMS Moodle

7

References 1. Aulakh, K., Roul, R.K., Kaushal, M.: E-learning enhancement through educational data mining with Covid-19 outbreak period in backdrop: a review. Int. J. Educ. Dev. 101, 102814 (2023). https://doi.org/10.1016/j.ijedudev.2023.102814 2. Fauzi, M.A.: E-learning in higher education institutions during COVID-19 pandemic: current and future trends through bibliometric analysis. Heliyon 8(5), e09433 (2022). https://doi.org/ 10.1016/j.heliyon.2022.e09433 3. Joy, J., Pillai, R.V.G.: Review and classification of content recommenders in E-learning environment. J. King Saud Univ. Comput. Inf. Sci. 34(9), 7670–7685 (2022). https://doi.org/10. 1016/j.jksuci.2021.06.009 4. Ongor, M., Uslusoy, E.C.: The effect of multimedia-based education in e-learning on nursing students’ academic success and motivation: a randomised controlled study. Nurse Educ. Pract. 71, 103686 (2023). https://doi.org/10.1016/j.nepr.2023.103686 5. Ullah, M.S., Hoque, M., Aziz, M.A., Islam, M.: Analyzing students’ e-learning usage and post-usage outcomes in higher education. Comput. Educ. Open 5, 100146 (2023). https://doi. org/10.1016/j.caeo.2023.100146 6. Khazieva, V.D.: Java library designed to work with elliptic curves. Mod. Innov. Syst. Technol. 3(2), 0225–0233 (2023). https://doi.org/10.47813/2782-2818-2023-3-2-0225-0233 7. Liu, Y.: Matches and mismatches between university teachers’ and students’ perceptions of E-learning: a qualitative study in China. Heliyon 9(6), e17496 (2023).https://doi.org/10.1016/ j.heliyon.2023.e17496 8. Maulana, F.I., Febriantono, M.A., Raharja, D.R.B., Khaeruddin, H.R.: Twenty years of elearning in health science: a bibliometric. Procedia Comput. Sci. 216, 604–612 (2023). https:// doi.org/10.1016/j.procs.2022.12.175 9. Meneses, L.F.S., Pashchenko, T., Mikhailova, A.: Critical thinking in the context of adult learning through PBL and e-learning: a course framework. Thinking Skills Creativity 1, 101358 (2023). https://doi.org/10.1016/j.tsc.2023.101358 10. Nagy, V., Duma, L.: Measuring efficiency and effectiveness of knowledge transfer in elearning. Heliyon 9(7), e17502 (2023). https://doi.org/10.1016/j.heliyon.2023.e17502 11. Tsarev, R., et al.: Gamification of the graph theory course. Finding the shortest path by a greedy algorithm. In: Silhavy, R., Silhavy, P. (eds.) Networks and Systems in Cybernetics. CSOC 2023. LNNS, vol. 723, pp. 209–216. Springer, Cham (2023). https://doi.org/10.1007/ 978-3-031-35317-8_18 12. Veeramanickam, M.R.M., Ramesh, P.: Analysis on quality of learning in e-learning platforms. Adv. Eng. Softw. 172, 103168 (2022). https://doi.org/10.1016/j.advengsoft.2022.103168 13. Eshniyazov, A.I.: Teaching the basics of educational robotics in a distance learning format. Inf. Econ. Manage. 2(2), 0301–0310 (2023). https://doi.org/10.47813/2782-5280-2023-2-20301-0310 14. Kovalev, I.V., Losev, V.V., Kalinin, A.O.: Formalized approach to the design of microprocessor systems with elements of human-machine interaction. Mod. Innov. Syst. Technol. 3(2), 0243– 0253 (2023). https://doi.org/10.47813/2782-2818-2023-3-2-0243-0253 15. Deetjen-Ruiz, R., et al.: Applying ant colony optimisation when choosing an individual learning trajectory. In: Silhavy, R., Silhavy, P. (eds.) Networks and Systems in Cybernetics. CSOC 2023. LNNS, vol. 723, pp. 587–594. Springer, Cham (2023). https://doi.org/10.1007/978-3031-35317-8_53 16. Membrive, A., Silva, N., Rochera, M.J., Merino, I.: Advancing the conceptualization of learning trajectories: a review of learning across contexts. Learn. Cult. Soc. 
Interact. 37, 100658 (2022). https://doi.org/10.1016/j.lcsi.2022.100658

8

R. Deetjen-Ruiz et al.

17. Tsarev, R.Y., et al.: An Approach to developing adaptive electronic educational course. Adv. Intell. Syst. Comput. 986, 332–341 (2019). https://doi.org/10.1007/978-3-030-19813-8_34 18. Pressey, S.L.: Machine for intelligence tests. United States patent US1749226A, 21 Jun 1928 19. David, A., Mihai, D., Mihailescu, M.-E., Carabas, M., Tapus, N.: Scalability through distributed deployment for Moodle learning management system. Procedia Comput. Sci. 214, 34–41 (2022). https://doi.org/10.1016/j.procs.2022.11.145 20. Hachicha, W., Ghorbel, L., Champagnat, R., Zayani, C.A., Amous, I.: Using process mining for learning resource recommendation: a Moodle case study. Procedia Comput. Sci. 192, 853–862 (2021). https://doi.org/10.1016/j.procs.2021.08.088 21. Kim, E., Park, H., Jang, J.U.: Development of a class model for improving creative collaboration based on the online learning system (Moodle) in Korea. J. Open Innov. Technol. Market Complex. 5(3), 67 (2019). https://doi.org/10.3390/joitmc5030067 22. Yamaguchi, S., Kondo, H., Ohnishi, Y., Nishino, K.: Design of question-and-answer interface using Moodle DATABASE function. Procedia Comput. Sci. 207, 976–986 (2022). https://doi. org/10.1016/j.procs.2022.09.153 23. Dascalu, M.-D., et al.: Before and during COVID-19: a cohesion network analysis of students’ online participation in Moodle courses. Comput. Hum. Behav. 121, 106780 (2021). https:// doi.org/10.1016/j.chb.2021.106780 24. Fernando, W.: Moodle quizzes and their usability for formative assessment of academic writing. Assess. Writ. 46, 100485 (2020). https://doi.org/10.1016/j.asw.2020.100485 25. Stojanovi´c, J., et al.: Application of distance learning in mathematics through adaptive neurofuzzy learning method. Comput. Electr. Eng. 93, 107270 (2021). https://doi.org/10.1016/j. compeleceng.2021.107270 26. Aljarbouh, A., et al.: Application of the k-medians clustering algorithm for test analysis in e-learning. In: Silhavy, R., Silhavy, P., Prokopova, Z. (eds.) Software Engineering Application in Systems Design. CoMeSySo 2022, LNNS, vol. 596, pp. 249–256. Springer, Cham (2023). https://doi.org/10.1007/978-3-031-21435-6_21 27. Tsarev, R., et al.: Improving Test Quality in E-Learning Systems. In: Silhavy, R., Silhavy, P. (eds) Networks and Systems in Cybernetics. CSOC 2023. LNNS, vol. 723, pp. 62–68 (2023). Springer, Cham. https://doi.org/10.1007/978-3-031-35317-8_6 28. Dobashi, K., Ho, C.P., Fulford, C.P., Lin, M.-F.G., Higa, C.: Learning pattern classification using moodle logs and the visualization of browsing processes by time-series cross-section. Comput. Educ. Artif. Intell. 3, 100105 (2022). https://doi.org/10.1016/j.caeai.2022.100105 29. Kaur, P., Kumar, H., Kaushal, S.: Affective state and learning environment based analysis of students’ performance in online assessment. Int. J. Cogn. Comput. Eng. 2, 12–20 (2021). https://doi.org/10.1016/j.ijcce.2020.12.003 30. Sychev, O.: Open-answer question with regular expression templates and string completion hinting. Softw. Impact. 17, 100539 (2023). https://doi.org/10.1016/j.simpa.2023.100539 31. Ajayi, B.K.: A comparative analysis of reliability methods. J. Educ. Pract. 8(25), 160–163 (2017) 32. Buelow, M.T.: Reliability and validity. Risky Decision Making in Psychological Disorders, pp. 39–59. Academic Press, Cambridge, MA, US (2020). https://doi.org/10.1016/B978-0-12815002-3.00003-6 33. Kim, V.S. Testing of educational achievements. UGPI, Ussuriysk, Russia (2007) 34. 
Eisinga, R., Te Grotenhuis, M., Pelzer, B.: The reliability of a two-item scale: Pearson, Cronbach or spearman-brown? Int. J. Public Health 58(4), 637–642 (2013). https://doi.org/10. 1007/s00038-012-0416-3 35. Wainer, H., Thissen, D.: True score theory: The traditional method. Test Scoring. Lawrence Erlbaum, Mahwah, NJ, US (2001)

Data Mart in Business Intelligence with Hefesto for Sales Area in a Dental Clinic Maria Caycho Dominguez(B) , Gian Terrones Castrejon , Juan J. Soria , Mercedes Vega Manrique , and Lidia Segura Peña Universidad Tecnológica del Perú, Lima, Perú [email protected]

Abstract. Large volumes of dental patient data become a problem when the data and the available business intelligence tools are not properly exploited. This paper presents the tangible benefits of implementing a Data Mart at the Pinaud Dental Clinic, where the results showed a clear increase in overall profits, and the detailed breakdown of specific treatments and payments supported more accurate data analysis. The use of Tableau for reporting further underscores the clinic's commitment to leveraging technology to improve the business. Sales reports show that clinic appointments increased by 69 in June and by 48 in July, with most payments concentrated in crown treatment, exodontia, prophylaxis, prosthetics, and restoration; the favorable share of total payment per customer reached 44%, with the highest gains in dental services in May and June. The number of appointments per day increased by 21%, the total payment for specific treatments increased by 25%, with orthodontics being the most requested, and total profit grew by 7%. It is concluded that the implementation of the Data Mart with the Hefesto methodology produced these improvements; future studies could examine the long-term benefits and the scalability of the approach to larger clinical settings. Keywords: Business Intelligence · Data Mart · Hefesto Methodology · ETL Processes · Dental Sales · Tableau

1 Introduction
Today, many medical clinics are growing due to the high demand from patients, which translates into an increase in the information and services they store, in many cases managed with Big Data tools that support better decision making. D. Nettleton [1] notes that companies maintain increasingly complex relational databases based on data structures with relationships between entities, using primary keys, secondary indexes and normalized data structures, an observation shared by Q. Zhang [2]. Big Data is currently the most widely used approach, although data storage in enterprises is often disorganized, as S. Suma [3] points out. New technologies help to manage and analyze the large amounts of information that can be generated in a dental practice. Today there are several database managers, and the large amounts of information generated in a dental clinic need to be exploited. A Data Mart collects, analyzes and organizes the information to support good decisions in dental clinics with large volumes of data, increasing stability in the business world and the quality of decision making [4]. A Data Mart is the business intelligence solution of choice because it provides good control over the storage of large volumes of information related to patients, such as consultations and appointments agreed for the performance of a dental treatment. Zeng [5] mentions that business intelligence is the process of collecting, disseminating and processing information, the main objective of which is to reduce uncertainty in decision making. When implementing a Data Mart, S. Ha [6] notes that extracting the required data calls for an analysis of the information in order to subsequently propose new marketing strategies that can benefit the organization. Business intelligence processes have become central to business management because they deliver the most relevant data within a specific organization [7]. Contributing to decision making through agile methodologies, together with extensive research in the dental field and the application of ICT in organizations, is today of great importance for improving processes [8]. Information systems are extremely important in this era for reusing knowledge and information from all the different areas that can be investigated and for taking advantage of new innovations. Jordy [9] mentions that dental clinics face problems because they lack up-to-date technology, owing to a lack of knowledge of tools that help with the analysis and management of information, such as Microsoft Excel and its pivot-table functionality. Business Intelligence (BI) gives access to the extraction and control of considerable amounts of data, with the type of treatment, consultations, payment method, consultation date and budget as its main model [10]. In order to know the performance of all of these, to be aware of the options and to make the best decisions, this implementation must be considered important, since new information technologies have become a fundamental tool for continuous development processes within any venture [11]. Rivalry within the industrial market poses a major challenge for any organization, since it is necessary to renew and update the strategic planning that allows the company to move forward with its service, process or product. Analysis systems provide useful analytical reports, static over time, that support evaluation and decision making [12]. Data Mart designs follow an agile methodology that provides a much more representative and organized perspective, which supports decision making. Decision making is the process of choosing the best option among countless alternatives, with the main objective of achieving good results [13]. The Hefesto methodology specifically supports the development of broad data repositories such as Data Warehouses and Data Marts, because it can be coupled in an agile way to different system life cycles.
This research optimized decision making through agile methodologies, in addition to extensive work in the dental field, where applying ICT for process improvement is of great importance. Aguilar [15] mentions that the analysis of unorganized information in a system is extremely critical for optimizing an organization's processes, and in the area of oral health a great deal of data is obtained about the patients attended in dental clinics. V. Gunaseelan [16] reports that, in research oriented to the healthcare area, a Data Mart also made it possible to provide a standardized price for medical services that adjusts to payment changes driven by local factors, such as labor costs. D. Oniani [17] designed and implemented a Data Mart with an ETL process and obtained good performance for visits and procedures, at 96.27%, demonstrating the effectiveness of the Data Mart for the health sector; the patterns identified from these results helped to improve health care, optimize processes and make good decisions. Shaker H. [18] notes that ETL tools help to extract data from various sources and to clean and customize the information in a data warehouse, which is why building an ETL process is considered potentially one of the biggest tasks in building a warehouse. Chura et al. [19] implemented a Data Mart in the sales processes for tourist tickets and improved time optimization by 85% in the delivery of reports from the sales area, streamlining internal processes and ensuring efficiency in the flow of information; in addition, the satisfaction rate of sales management improved by 90%, reflecting more accurate and detailed delivery of sales reports. Vidal [20] implemented a Data Mart that achieved greater efficiency in its indicators, going from 3.49 min to 16 s, an improvement of 93.1% in availability time. D'Amario et al. [21] generated a Data Mart to exploit large amounts of information from patients with heart failure, managing to generate evidence based on real data. Thanks to the graphical analysis provided by the Tableau tool, many researchers have achieved very good results, as in the case study of J. Hoelscher [22], carried out in a context where approximately 2.5 quintillion bytes of data are generated per day and more than 90% of all existing data has been created in the last two years. Abeer Khan [23] reports that ETL processing helped achieve the objectives of reducing operating costs and improving customer services, making it possible to understand customers' preferences and needs, improve services, offer good recommendations and provide an excellent overall customer experience.

2 Materials and Methods
The Hefesto methodology was used in the research, making it possible to visualize, in an agile way, how to understand and achieve the objectives of the dental clinic. The four stages of the methodology were applied to support the analysis of the information. First, the requirements analysis was carried out, which consists of the correct evaluation of requirements by means of questionnaires and thus underpins the Data Mart infrastructure. Secondly, the analysis of online transaction processing (OLTP) was performed, which allowed the indicators to be quantified and the conceptual model already obtained to be aligned with the new data sources. Thirdly, the logical model of the Data Mart was built on the basis of the previous stage. Finally, the information was integrated, incorporating the data and thus filling the fields within the dimensions of the Data Mart (Figs. 1 and 2).

Fig. 1. Pinaud Dental Clinic Business Model developed in Bizagi.

There are different methodologies in Business Intelligence, each with its own perspective, as mentioned by T. Thivakaran [24], whose neural network method provides reliable results thanks to the selection of milestones, the transformation of the information and its analysis, which leaves room for good decision making. For this article, however, the Hefesto methodology was chosen for its ease of understanding. In addition, it consists of four phases in which a detailed report can be created with management metrics that supported decision making in the sales process of dental services within the Pinaud Dental Clinic.
2.1 Requirements Analysis. [Phase 1-Hefesto]
Question Identification
To collect information, a meeting was requested with the manager of the Pinaud Dental Clinic and the workers who are in contact with the database, with the respective collections and with the issuance of receipts or invoices for the dental services offered. At the meeting, questions were generated through a questionnaire aimed at decision making in the sales process, which is responsible for generating invoices, planning sales of dental services, registering patients and booking appointments, building customer loyalty, determining the total payment of each patient, as well as the number of appointments per day, the total payments per treatment and, finally, the total sales per month.

Fig. 2. Research PDM development methodology.

Conceptual Model
Based on the indicators and perspectives presented above, the processes were collected and a conceptual model was built. It consists of the total payment per patient, calculated from each patient's payment for the complete service offered; the number of appointments per week, calculated from the total number of appointments that patients make per day at the clinic; the payments per treatment, calculated by adding the payments of all clients for each treatment; and, finally, the total sales per month, calculated from the total number of treatments carried out per month.


OLTP Analysis [Phase 2-Hefesto]
Conform Indicators
The Hefesto methodology allowed us to analyze how the indicators will be calculated in order to create the corresponding relationships. The conceptual model is therefore developed in a general way, using the results of the OLTP analysis for data construction, since, as R. Tardío [25] mentions, such systems support online analysis, store a great deal of information in tables and execute analytical queries within an acceptable response time. The facts and indicators are defined as follows: the total payment per patient, with fact Patients and summarization function COUNT; the number of appointments per week, with fact Appointments and summarization function SUM; the payments per treatment, with fact Treatment and summarization function SUM; and, finally, the total sales per month, with fact Total Services and summarization function SUM.
Establish Correspondences/Mapping
For the implementation of the Data Mart at the Pinaud Dental Clinic, the service sales process was mapped and is presented in the diagram in Fig. 3, where the information is represented through entities, relationships, keys, hierarchies, and attributes.

Fig. 3. Conceptual model relationship with SQL tables of the research.
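To make the correspondence between facts and summarization functions more concrete, the following sketch shows one way indicators of this kind are typically computed with pandas. It is only an illustration under assumed table layouts, column names and aggregation choices; it does not reproduce the clinic's actual schema or ETL code.

```python
# Illustrative sketch (not the clinic's actual schema): computing the four
# indicators described above with pandas group-by aggregations.
import pandas as pd

# Hypothetical extract of the charges/budget table after the ETL step.
charges = pd.DataFrame({
    "patient_id": [1, 1, 2, 3],
    "treatment": ["Orthodontics", "Prophylaxis", "Exodontia", "Crown"],
    "date": pd.to_datetime(["2023-05-11", "2023-05-18", "2023-06-03", "2023-06-10"]),
    "amount": [320.0, 80.0, 150.0, 270.0],
})
appointments = pd.DataFrame({
    "appointment_id": [10, 11, 12, 13],
    "date": pd.to_datetime(["2023-05-18", "2023-05-18", "2023-05-23", "2023-06-03"]),
})

# Total payment per patient (sum of amounts grouped by patient).
total_per_patient = charges.groupby("patient_id")["amount"].sum()

# Number of appointments per day (count of appointments grouped by date).
appointments_per_day = appointments.groupby("date")["appointment_id"].count()

# Payments per treatment (sum of amounts grouped by treatment).
payments_per_treatment = charges.groupby("treatment")["amount"].sum()

# Total sales per month (sum of amounts grouped by calendar month).
sales_per_month = charges.groupby(charges["date"].dt.to_period("M"))["amount"].sum()

print(total_per_patient, appointments_per_day, payments_per_treatment, sales_per_month, sep="\n\n")
```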

Dimensional Model
The dimensional model details the patient perspective, the sales perspective, the time perspective, and the service perspective, as shown in Tables 1, 2, 3 and 4 respectively.

Table 1. Patient perspective

Column Name      Column Meaning
CdPatient        Primary key that identifies a single patient
First name       The patient's first name
Last name        The patient's paternal and maternal surnames
Sex              The patient's gender
Date of birth    The patient's date of birth
Age              The patient's age

Table 2. Sales perspective

Column Name      Column Meaning
CodCharge        Primary key of the budget table; refers to a single treatment
CdPatient        Patient code
Patient          Full names of the patient
Date             Date the treatment was performed
Expiration       Budget due date
Status           Payment status
Total            Total amount paid for the entire treatment

Table 3. Time perspective

Column Name      Column Meaning
CodAppointment   Primary key that identifies a single appointment
Date             Date on which the patient scheduled an appointment
Start            Start time of the treatment
Hfin             Time at which the treatment was completed

Table 4. Service perspective

Column Name      Column Meaning
CdTreatment      Primary key of the Treatments table; represents a single treatment
Name             Name of the treatment
Duration         Duration of the treatment
PrBase           Base price of the treatment
CdArea           Area of specialization code

DM Logic Model [Phase 3-Hefesto]
In this phase, the logical model of the Data Mart structure was created following the Hefesto methodology, taking the conceptual model developed earlier as its starting point.
Type of DM Logic Model
The logical model chosen for the research was the star model, as shown in Fig. 4.

Fig. 4. Star Logic Model of the research.


Fact Table
Figure 5 shows the fact table, named Collections, which stores the indicators (KPIs) related to the organization's processes. It combines the primary keys of the dimension tables defined above with the four facts identified: the number of patients (CdPatient), the number of appointments (CdAppointments), the number of treatments (CdTreatment), and the profits obtained (CdBudget).

Fig. 5. Design fact table of the research.
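As a hedged illustration of the star model above, the sketch below assembles a fact table from dimension keys and measures. The table and column names are assumptions made for the example and do not reproduce the Collections table of the clinic's actual Data Mart.

```python
# Illustrative sketch: assembling a fact table from the dimension keys plus
# measures, mirroring the structure of a star schema. Names are hypothetical.
import pandas as pd

dim_patient = pd.DataFrame({"CdPatient": [1, 2], "Name": ["Cindy", "Diego"]})
dim_treatment = pd.DataFrame({"CdTreatment": [100, 101], "Name": ["Crown", "Exodontia"]})

# Source transactions already mapped to dimension keys during the ETL step.
transactions = pd.DataFrame({
    "CdPatient": [1, 2],
    "CdAppointment": [500, 501],
    "CdTreatment": [100, 101],
    "CdBudget": [900, 901],
    "Amount": [320.0, 150.0],
})

# The fact table keeps only the foreign keys plus the measures (KPIs); the
# merges check that every key actually exists in its dimension table.
fact_collections = (
    transactions
    .merge(dim_patient[["CdPatient"]], on="CdPatient", validate="many_to_one")
    .merge(dim_treatment[["CdTreatment"]], on="CdTreatment", validate="many_to_one")
    [["CdPatient", "CdAppointment", "CdTreatment", "CdBudget", "Amount"]]
)
print(fact_collections)
```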

Data Integration [Phase 4-Hefesto]
Architecture
Figure 6 shows the architecture for the collection, analysis and processing of the information implemented at the Pinaud Dental Clinic, which consists of four layers: the data layer, the integration layer, the processing and analysis layer, and finally the information visualization layer, which presents the relevant information to the company.

Fig. 6. Information flow diagram of the Pinaud Dental Clinic.


Initial Load - Main ETL Process
Figure 7 shows the loading process and the implementation of the ETL, the transformation tool that helped to export the data for subsequent analysis. As Shaker H. [18] notes, when building an ETL process it is important to focus on three areas: the source areas from which the information is collected, the destination, that is, the data warehouse into which the data is exported, and the mapping area. The load is organized by the dimensions Patient, Appointments, Treatments and Budget, executing the step container that loads the tables Patient, Appointments, Treatments and Budget respectively, as shown in Fig. 8.

Fig. 7. Design of the ETL loading process of the research.

Load Dimensions

Fig. 8. Design connection of tables in SQL Server of the research.
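A minimal sketch of the loading order described above (dimensions first, then the fact table) is shown below. The connection string, table names and helper functions are placeholders for illustration; the paper's actual load was built with SQL Server tooling (Figs. 7 and 8), so this is not the authors' implementation.

```python
# Sketch of an initial load: the dimension tables are filled before the fact
# table so that its foreign keys can be resolved. The DSN and table names are
# placeholders, not the clinic's real configuration.
import pandas as pd
from sqlalchemy import create_engine

engine = create_engine("mssql+pyodbc://user:password@dental_dw_dsn")  # placeholder connection


def load_table(df: pd.DataFrame, name: str) -> None:
    """Append a cleaned DataFrame into the warehouse table of the same name."""
    df.to_sql(name, engine, if_exists="append", index=False)


def run_initial_load(source: dict[str, pd.DataFrame]) -> None:
    # 1. Load the dimensions extracted and cleaned from the operational system.
    for dim in ("dim_patient", "dim_appointment", "dim_treatment", "dim_budget"):
        load_table(source[dim], dim)
    # 2. Load the fact table last, once every dimension key it references exists.
    load_table(source["fact_collections"], "fact_collections")
```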


3 Results and Discussion
Figure 9 shows, by milestones, the data acquired by the Pinaud Dental Clinic regarding patient payments, in which the highest amount, S/. 320.00, corresponds to the patient Cindy Olivares for the period from May 11 to June 10.

Fig. 9. Graph of total customer payments during May-June of the research.

Figure 10 shows the number of appointments per day; the days with the highest frequency of patients attended are May 18, 23, 24 and 26 and June 3, with a minimum of 6 patients. Figure 11 shows the payments for treatments: the treatment with the highest payments in May was exodontia, with 4 patients attended, and the treatment most requested by day was the consultation. After an exhaustive collection and analysis of the Pinaud Dental Clinic data, effective results were obtained thanks to the implementation of a Data Mart supported by the Hefesto methodology. The graphical reports were obtained from the data cube using the statistical analysis tool Tableau which, as J. Hoelscher [22] notes, generates graphical reports that make the information easier to understand and therefore supports good decision making within the organization; this also made it possible to compare results and verify the effectiveness of the Data Mart. Looking at Fig. 12, the increase in sales of dental services is easy to see when compared with what was analyzed in the last month before the Data Mart was implemented.


Fig. 10. Graph of the number of appointments per day during the month of May-June of the research.

Fig. 11. Total payment for treatments in May-June of the research.

Fig. 12. General Dashboard of the research project.

Figure 13 shows the amount of treatments paid per patient, which increased effectively: compared with the amounts analyzed in Fig. 9, which correspond to the previous month, the maximum now reaches S/. 570.00 per patient. The graph in Fig. 14 shows that the number of appointments per day has also been increasing compared with Fig. 10, with approximately 10 appointments per day, which is reflected in the increase in sales of the clinic's dental services. Regarding the amount paid for treatments according to the number of treatments per day, and taking into account that the clinic offers all dental treatments, Fig. 15 shows that the treatment with the highest payments is orthodontics; compared with Fig. 11, where the exodontia treatment had a minimum amount of S/. 380.00, this comparison indicates that the amount collected after the implementation is S/. 505.00.

Fig. 13. Total payment of clients during June-July of the research, after the implementation.


Fig. 14. Number of appointments per day during June-July of the research, after the implementation.

Fig. 15. Amount of payment for treatments during June-July of the research, after the implementation.

4 Conclusion
At the Pinaud Dental Clinic, the implementation of the Data Mart enabled optimal and timely decision making within the sales processes, with positive results: total earnings per day increased to S/. 820.00, compared with a lower amount of S/. 760.00 on one of the days before the implementation. All the existing milestones increased: the favorable percentage of the total payment per client was 44%, sales of dental services produced higher profits, and the number of appointments per day increased by 21% during May and June. As a third indicator, the total payment for specific treatments increased by 25%, with orthodontics being the most requested, and finally total profits increased by 7% in the month of implementation. The implementation of the Data Mart also showed that the patients Cindy Olivares and Diego Alonso made the highest payment, S/. 320 per treatment, followed by Nelley Cristina with S/. 280 and Sabina Vasquez with S/. 270. The number of appointments increased by 69 in June and by 48 in July, with most payments corresponding to consultations, crown, exodontia, prophylaxis, prosthesis and restoration treatments. Total sales per day also increased, as shown by the reports produced with Tableau, with a daily average of S/. 605.17 in May and S/. 645.00 in June.

References 1. Nettleton, D.: Data mining from relationally structured data, marts, and warehouses. Commer. Data Min. 181–193 (2014). https://doi.org/10.1016/B978-0-12-416602-8.00012-1 2. Zhang, Q., Zhan, H., Yu, J.: Car sales analysis based on the application of big data. Procedia Comput. Sci. 107, 436–441 (2017). https://doi.org/10.1016/j.procs.2017.03.137 3. Suma, S., Mehmood, R., Albugami, N., Katib, I., Albeshri, A.: Enabling next generation logistics and planning for smarter societies. Procedia Comput. Sci. 109, 1122–1127 (2017). https://doi.org/10.1016/j.procs.2017.05.440 4. Vidal Alegría, F.A., Urrutia Garzón, A.F., Urrea Moreno, C.A., Albán Latorre, J.G., Arias Iragorri, C.G.: Inteligencia De Negocios Aplicada a Consultas Y Controles En El Hospital San Antonio De Padua Del Municipio De Totoró Cauca, pp. 1–9 (2022). https://doi.org/10. 26507/ponencia.789 5. Zeng, L., Xu, L., Shi, Z., Wang, M., Wu, W.: Techniques, process, and enterprise solutions of business intelligence. In: IEEE International Conference on Systems, Man and Cybernetics, vol. 6, no. November 2006, pp. 4722–4726 (2006). https://doi.org/10.1109/ICSMC.2006. 385050 6. Ha, S.H., Park, S.C.: Application of data mining tools to hotel data mart on the Intranet for database marketing. Expert Syst. Appl. 15(1), 1–31 (1998). https://doi.org/10.1016/S09574174(98)00008-6 7. Forero Castañeda, D.A., Sanchez Garcia, J.A.: Introducción a la inteligencia de negocios basada en la metodología kimball. 9(1), 5–17 (2021) 8. Antunes, A.L., Cardoso, E., Barateiro, J.: Incorporation of ontologies in data warehouse/business intelligence systems - a systematic literature review. Int. J. Inf. Manage. Data Insights 2(2) (2022). https://doi.org/10.1016/j.jjimei.2022.100131 9. Mathematics, A.: Análisis para el desarrollo de un data Mart, mediante la herramienta Power Bi para mejorar la toma de decisiones en el área consultas odontológicas del consultorio ‘Quimi dental’ ubicada en la ciudad de Babahoyo, pp. 1–23 (2022). http://dspace.utb.edu.ec/ bitstream/handle/49000/12557/E-UTB-FAFI-SIST-000361.pdf?sequence=1&isAllowed=y 10. Ríos, La importancia de la inteligencia de negocios - UEES - Universidad Espíritu Santo,” La importancia de la inteligencia de negocios (2022). https://uees.edu.ec/la-importancia-dela-inteligencia-de-negocios/ 11. Colina Valiente, B., Rosmery, J., Morales Chau, B., Luzmila, V.: Universidad Privada Antenor Orrego Facultad de Ingeniería Escuela Profesional de Ingeniería Industrial (2022). https:// orcid.org/0000-0002-3185-3036 12. Completo, J., Cruz, R.S., Coheur, L., Delgado, M.: Design and implementation of a data warehouse for benchmarking in clinical rehabilitation. Procedia Technol. 5, 885–894 (2012). https://doi.org/10.1016/j.protcy.2012.09.098


13. Alifia, M.: Desarrollo De Un Datamart Para Mejorar La Toma De Decisiones En El Área De Operaciones De La Empresa Mdp Consulting S.A.C, vol. 7, p. 6 (2021) 14. Mamani, J.: Diseño e Implementación de Una Solución de Inteligencia De Negocios Para la Toma de Decisiones Y Gestión De La Información Para Pymes Utilizando La Metodología Hefesto 2.0, Tesis, pp. 1–112 (2019). http://repositorio.unap.edu.pe/bitstream/handle/UNAP/ 7104/Molleapaza_Mamani_Joel_Neftali.pdf?sequence=1&isAllowed=y 15. del Aguila, M.A., Felber, E.D.: Data warehouses and evidence-based dental insurance benefits. J. Evid. Based Dent. Pract. 4(1), 113–119 (2004). https://doi.org/10.1016/j.jebdp.2004.02.007 16. Gunaseelan, V., Kenney, B., Lee, J.S.J., Hu, H.M.: Databases for surgical health services research: Clinformatics Data Mart, Surgery (United States), vol. 165, no. 4. Mosby Inc., pp. 669–671, 01 April 2019. https://doi.org/10.1016/j.surg.2018.02.002 17. Oniani, D., et al.: ReDWINE: a clinical Datamart with Text analytical capabilities to facilitate rehabilitation research. Int. J. Med. Inform. 105144 (2023). https://doi.org/10.1016/j.ijmedinf. 2023.105144 18. El-Sappagh, S.H.A., Hendawi, A.M.A., El Bastawissy, A.H.: A proposed model for data warehouse ETL processes. J. King Saud Univ. - Comput. Inf. Sci. 23(2), 91–104 (2011). https://doi.org/10.1016/j.jksuci.2011.05.005 19. Chura, P.C., Yanavilca, A.V., Soria, J.J., Castillo, S.V.: DataMart of business intelligence for the sales area of a Peruvian tourism company. In: Silhavy, R., Silhavy, P., Prokopova, Z. (eds.) Data Science and Algorithms in Systems. CoMeSySo 2022. LNNS, vol. 597, pp. 415–429. Springer, Cham (2023). https://doi.org/10.1007/978-3-031-21438-7_33 20. Vidal, S.R., Díaz, R.H., Loaiza, O.L.: Efficiency in the availability of indicators of a DataMart based on the Kimball methodology for a baking company. In: Farhaoui, Y., Rocha, A., Brahmia, Z., Bhushab, B. (eds) Artificial Intelligence and Smart Environment. ICAISE 2022. LNNS, vol. 635, pp. 911–919. Springer, Cham (2023). https://doi.org/10.1007/978-3-03126254-8_132 21. D’Amario, D., et al.: Generator heart failure DataMart: an integrated framework for heart failure research. Front. Cardiovasc Med. 10 (2023). https://doi.org/10.3389/fcvm.2023.110 4699 22. Hoelscher, J., Mortimer, A.: Using Tableau to visualize data and drive decision-making. J. Account. Educ. 44, 49–59 (2018). https://doi.org/10.1016/j.jaccedu.2018.05.002 23. Khan, A., Ehsan, N., Mirza, E., Sarwar, S.Z.: Integration between customer relationship management (CRM) and data warehousing. Procedia Technol. 1, 239–249 (2012). https:// doi.org/10.1016/j.protcy.2012.02.050 24. Thivakaran, T.K., Ramesh, M.: Exploratory Data analysis and sales forecasting of bigmart dataset using supervised and ANN algorithms. Meas. Sens. 23 (2022). https://doi.org/10. 1016/j.measen.2022.100388 25. Tardío, R., Maté, A., Trujillo, J.: Beyond TPC-DS, a benchmark for big data OLAP systems (BDOLAP-Bench). Futur. Gener. Comput. Syst. 132, 136–151 (2022). https://doi.org/10. 1016/j.future.2022.02.015

Features of Building Automated Educational Systems with Virtual and Augmented Reality Technologies Oleg Slavin(B) Federal Research Center «Computer Science and Control» of the Russian Academy of Sciences, 44 Building 2 Vavylova Street, Moscow 119333, Russia [email protected]

Abstract. The features of building automated educational systems with virtual and augmented reality technologies are outlined, focusing on the specifics of implementing ongoing control of students' level of knowledge using a test control system. For the automated correction of the information model of the learning environment, in order to form the required learning trajectory while taking into account the specifics of learning and the individual abilities of the student, it is proposed to use automated learning diagrams. Accumulated practical experience convincingly indicates that implementing these approaches reduces the time and improves the quality of trainees' professional training through adaptive, personalized management of training, by optimizing the typical training structure for each trainee in view of the individual characteristics of mastering the educational material. The general tasks that need to be solved to implement automated learning with virtual and augmented reality technologies are identified. Keywords: Automated educational system · Virtual reality · Augmented reality · Test control · Automated learning diagram · Learning trajectory

1 Introduction
At present, there is no doubt about the need for the widespread introduction of technical means, methods and forms of training aimed at developing productive thinking and at individualizing and intensifying the learning process through automated educational systems [1–3]. An automated educational system is commonly understood as an automated information system that combines a teacher, students, a set of educational, methodological and didactic materials, and an automated data processing system designed for the learning process in order to increase its effectiveness [4–6]. The main tasks of an automated learning system are, in the general case, as follows [7–9]: 1. Familiarization of the trainee with the tasks of his activity, study of the material and of the set of actions for its operation in normal and emergency modes.


2. Maintaining and developing skills for assessing the current situation and making decisions based on simulation models. 3. Providing a prompt and visual display of the results of the student's activity through graphs, tables, diagrams and histograms of the distributions of the values of a given diagnostic indicator. 4. The use of an interactive mode, which dramatically increases the throughput of the feedback channel (from the student to the computer) compared to traditional training, makes it possible to implement intensive information flows in a closed learning loop, and allows various references, explanations and recommendations to be obtained through a dialogue with the computer. 5. Implementation of operational control of the specialist's activities, with the issuance of error messages and explanations of their causes, and transfer of information about the trainee's errors to the workplaces of the instructor and the trainee. The learning assessment system should deliver data in time and in a form that allows it to be used to further improve the learning process. 6. Planning the learning process and generating training programs (didactic scenarios) depending on the initial and required levels of specialist training. 7. Providing the instructor with tools that ensure the formation and correction of information models of situations of any degree of nesting and complexity, in order to demonstrate, train and control the quality of students' activities. 8. Game motivation, which increases the attractiveness of the learning process. The priorities for the development of automated educational systems are associated with the implementation of virtual reality technologies. These technologies appeared in the 1960s at the intersection of research in computer graphics and the human-machine interface, with the aim of influencing the user through various devices [10–12]. Virtual reality (VR) replaces the real world by acting on the user based on their reactions. Complementing the real world with digital objects, or replacing real-world objects with digital ones, is the essence of augmented reality (AR) technologies. All types of virtual and augmented reality are commonly referred to as AVR [3, 7]. The vast majority of AVR technologies are based on information and computer technologies, and their interaction with the user is implemented using input components responsible for capturing the user's reactions and output components responsible for acting on the user's senses [13, 14]. Currently, AVR technologies are one of the key aspects of the fourth industrial revolution, supporting the implementation of the Internet of Things concept in many areas of everyday life [15, 16]. The spread of AVR technologies is hindered by a number of significant shortcomings that prevent the full use of the possibilities of augmented and virtual reality. Full immersion in AVR (complete immersiveness) is currently limited by insufficiently high-resolution displays and by the insufficient performance of mobile platforms, or by the low mobility of powerful systems equipped with improved display devices [17–19]. However, these problems of virtual and augmented reality devices are likely to be significantly mitigated as the technologies continue to improve.


2 Methods
The implementation of automated learning technologies involves ongoing control of the trainees' level of knowledge, implemented with a test control system [1, 5, 9]. The test control system provides: management of the adaptive testing process; preliminary (input) control of the student's knowledge; objective control of the trainee's assimilation of the educational material (current, thematic, milestone and final control); analysis of the psychological characteristics of the student before and during training when difficult situations arise (for example, if the student has received only unsatisfactory grades over a long period of time); and analysis of test results, with conclusions and recommendations. The functioning of the test control system is based on adaptive testing, that is, the automated selection of test tasks (control questions) of such a level of difficulty that the accuracy (objectivity) of measuring the student's level of knowledge reaches a maximum [4, 10]. An integral part of the test control system is the base of control tests. The database of the test control system includes a set (system) of tests consisting of a finite number of test tasks ordered by difficulty level and consists of two blocks: a block of psychological tests aimed at identifying the psychological properties and individual characteristics of the student, and a block of didactic tests for controlling the trainee's assimilation of the educational material in the learning process. Database development includes the following steps: 1. Structural system analysis of the object of study, the result of which is the construction of an individual psychological portrait that sets the requirements for the psychological properties of trainees and reveals professionally important qualities, and the formation of a qualification characteristic that determines the level of knowledge, skills and abilities of specialists. This stage is implemented on the basis of the top-down principle of functional decomposition, which consists in sequentially detailing the system into separate fragments, each of which reflects a rather limited stage of the system's functioning. The method of expert assessments is applied effectively at this stage. 2. Development of test tasks based on the compiled individual psychological portrait and qualification characteristic, taking into account the thematic plan and the content of the academic disciplines. As a rule, test tasks are compiled both by teachers and by experts in the subject area, which increases the completeness and reliability of the tests. 3. Creation, study and piloting of a prototype of the test items on a representative sample of students. The results of the survey are summarized in a table of experimental data ("learners - answers to test tasks").


4. Selection (deletion) of non-informative test items, carried out with the help of empirical and statistical data analysis. Empirical analysis is based on the self-informativeness of the experimental data: only the numerical relations of similarity and difference between objects (students) and features (answers to test tasks) are taken into account, while empirical (external) relations of the studied objects are not (it is implemented using the principal component method, factor analysis, cluster analysis, the contrast group method, etc.). Statistical analysis relies on the active use of additional (external) training information based on external criteria (it is implemented using regression analysis, discriminant analysis, artificial neural networks, etc.). 5. Standardization of the test, that is, the transformation of the test scores of a representative sample of students into a standard form based on the analysis of the empirical distribution of test scores. 6. Calculation of the test reliability characteristics. Test reliability is understood as a characteristic that reflects the accuracy (objectivity) of measurements, as well as the stability of test results with respect to the influence of extraneous random factors. 7. Determining the validity of the test. The validity of a test shows to what extent the test measures the quality (property, ability, etc.) it is intended to reveal. There are three main types of validity [8, 12]: content validity, which answers the questions of whether the content of the test covers the entire complex of requirements for knowledge of a particular academic discipline and to what extent the selected test tasks (out of many possible ones) are suitable for assessing knowledge in this discipline; empirical validity, which is determined by testing with another test that measures the same indicator as the test under study, in order to establish the individual predictive value of the test; and constructive (conceptual) validity, which is established by proving the correctness of the theoretical concepts underlying the test, gives information about the degree to which the test measures a constructively distinguished parameter, and requires constant accumulation of information about the variability of estimates. There is no single approach in the literature to determining exact boundaries of reliability and validity that would allow the quality of the test to be assessed against these criteria. Analysis of the reliability and validity of the test allows a "primary cleaning" of the test and gives an initial idea of the approximate difficulty of the tasks and the expected quality characteristics of the test being developed. However, these criteria do not distinguish well-trained trainees from weak ones, since the differentiating ability of such tasks is close to zero [8]. To obtain strictly correct estimates of the test parameters, the test effectiveness criterion is used, which determines the correspondence between the difficulty of a test task and the level of preparedness of the student [4]. 8. Determining the effectiveness of the test, which helps to identify the optimal set of test items that correspond, in difficulty and differentiating ability, to the characteristics of the test sample of students. To determine the effectiveness of the test, Item Response Theory (IRT) is used, which is a psychological and pedagogical version of latent structural analysis [12]; a minimal sketch of the item selection principle behind adaptive testing is given at the end of this section. The IRT mathematical apparatus makes it possible to obtain strictly correct assessments of test tasks, which significantly increases the accuracy of test measurements and the quality of the developed tests.


9. Formation of a database of psychological and didactic tests, which implies filling it with tests that support the construction of an individual psychological portrait and the formation of the trainees' qualification characteristics. An analysis of the subject area shows that the laws of psychology and pedagogy are not fully taken into account when learning is automated. The dialectic of development in this area lies in the fact that, however great the potential of informatics, it will not turn into a truly effective training complex unless the psychological properties and abilities of a person are taken into account. Known automated learning systems practically ignore the individual characteristics of the trainees and their ability to adapt to the learning environment. The problems of individualization of education are quite complex and have not yet been fully resolved [7, 14]. To solve the problem of optimal control of the learning process and to build more advanced experimental and analytical models of this process, it is important to develop a "micro-approach" to the study of learning processes. The "micro-approach" makes it possible to penetrate into the structure of learning processes, highlight the leading psychological factors, and determine learning patterns and the characteristics of learning strategies [5, 18, 19]. Therefore, the further development of automated learning should be associated with the study not only of the processes of evolution of learning strategies, but also of the processes of their transformation, that is, the transition from one strategy to another under the influence of teaching methods, the individual characteristics of the student, and the conditions and means of performing educational tasks [20–22].
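As a hedged illustration of the maximum-information idea behind adaptive testing and the IRT apparatus mentioned in steps 7 and 8 above, the sketch below selects the next test task under a two-parameter logistic (2PL) model. The item parameters are invented, and the authors' actual selection algorithm is not specified in the text.

```python
# Illustrative sketch of adaptive item selection under a 2PL IRT model: the
# next task is the one whose Fisher information is highest at the current
# ability estimate. Item parameters here are invented examples.
import math


def item_information(theta: float, a: float, b: float) -> float:
    """Fisher information of a 2PL item with discrimination a and difficulty b."""
    p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return a * a * p * (1.0 - p)


def next_item(theta: float, item_bank: list[dict], administered: set[int]) -> dict:
    """Pick the not-yet-administered item that is most informative at theta."""
    candidates = [it for it in item_bank if it["id"] not in administered]
    return max(candidates, key=lambda it: item_information(theta, it["a"], it["b"]))


bank = [
    {"id": 1, "a": 1.2, "b": -0.5},
    {"id": 2, "a": 0.8, "b": 0.0},
    {"id": 3, "a": 1.5, "b": 0.7},
]
print(next_item(theta=0.3, item_bank=bank, administered={1}))
```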

3 Results and Discussion
For implementation in automated educational systems that use virtual and augmented reality technologies, and taking into account the theory of transformational learning, a class of models has been built: square diagrams (quadiagrams) of automated learning. The quadiagram allows the information model of the learning environment E(T) to be corrected in order to form the required learning trajectory Q(T), taking into account the mastered strategies Q(F) and the individual abilities of the student [12]. A variant of the quadiagram structure is shown in Fig. 1.

Fig. 1. The structure of the automated learning quad.

The learning trajectory is a dynamic model of the automated learning process that displays changes in learning efficiency Q over time T, including the stages of evolution (convergence and divergence) of a certain structure and the stages of transformation of one structure into another, Q(T).


With the help of the automated learning quadiagram, both the direct problem and the inverse problem can be solved: the direct problem is planning learning trajectories based on E(T) and Q(F), including planning options for transforming a structure Si into a structure Sn; the inverse problem is reconstructing, from the trajectory Q(T), the information model of the learning environment of the automated learning system. To optimize the display of information in an automated learning system, the concept of the student's reflection vector of the learning environment (quadrant II) is used [12]; it determines the student's ability to perceive the educational frame presented to him. In general, the reflection vector depends on the psychological state of the student, the level of his knowledge, skills and abilities, and the characteristics of the technical teaching aids. The integrated use of modern computer multimedia technologies can significantly expand the possibilities for implementing the didactic principles of teaching in an automated learning system by optimizing the complex impact on the students' organs of perception. According to the theory of transformational learning, the transformation of learning strategies is accompanied by a qualitative change in the structure of the student's activity; therefore, when moving from one strategy to another, the reflection vector changes. The quadiagram makes it possible to simulate the processes of computerized learning and to find, individually and adaptively, the optimal trajectory for training each specific specialist. At the first stage, the educational material is structured (the choice of educational topics, sections, blocks and frames) and a set of effective strategies is determined, both educational (intermediate) and special (final). Then the information display means are designed, the procedure for monitoring the student's knowledge is determined, and the characteristics of the automated learning system are selected with individual, operational adaptation to the student. Finally, a learning path is chosen for the purposeful organization of learning, that is, mastering the required set of effective strategies. The dynamics of the learning environment covers the range corresponding to the real conditions of professional activity and is reproduced in the sequence that is optimal for the quality of training. The choice and adjustment of the characteristics of an automated learning system, in order to adapt them individually and promptly to the student, may include many options, techniques and procedures. For example, changing the information content and structural composition of the training frame makes it possible to purposefully adjust the reflection vector, that is, to influence the attention of the student. This possibility is also provided by other engineering and psychological principles for designing and adapting automated learning systems (conciseness, autonomy, structure, etc.) [8, 12, 14]. Thus, in the course of computer learning and the mastering of strategies of different types, it is necessary to optimize the position of the reflection vectors. At the beginning of mastering a new strategy, the student mobilizes his attention as much as possible, trying to use previously accumulated knowledge to understand the received educational material. This period is characterized by hyperreflexia: the student uses additional material and overestimates the real complexity of the tasks offered to him. As the learning curve approaches its plateau, the vector approaches normo-reflexion. The student loses interest in tasks that have become quite easy and understandable for him, and the processes of perception and thinking are curtailed [12]: instead of the element-by-element enumeration of the conditions of tasks, which is characteristic of the first stage of mastering a new strategy, the simultaneous "grasping" of large blocks of information begins to prevail, resembling an instant solution of the problem. A quadiagram of transformational processes in the use of automated learning systems is built on the basis of data obtained with an adequate system simulation design model implemented on the basis of a dynamic model of the functioning of the automated learning system. The study and application of the quadiagram of automated learning in the development of automated learning systems makes it possible to purposefully manage learning activities, significantly reducing the time and improving the quality of trainees' professional training.

4 Conclusion
The accumulated practical experience convincingly indicates that the use of methods from the theory of transformational learning in organizing training with automated systems can reduce the time and improve the quality of trainees' professional training through adaptive, personalized management of staff training (optimization of the typical training structure for each trainee, taking into account the individual characteristics of their assimilation of the educational material). At the same time, the general tasks that need to be solved to implement training using AVR technologies are: increasing the efficiency of theoretical training (development of video materials, implementation of CRM training scenarios, etc.); development of decision-making skills through modeling with the help of AVR (conflict situations, situations with a lack of information, situations with conflicting information); development of monitoring functions (reduction of change blindness, recognition of emergency situations, distribution of attention); development of checking skills for inspectors and instructors (displaying schemes and algorithms for performing checks, forming dynamic images of checking); development of certain professionally important mental qualities (formation of spatial abilities, development of working memory, development of flexibility of thinking (intellectual lability), development of resistance to monotonous activity, etc.); development of individual competencies; formation of self-regulation skills with the help of autogenic training using AVR (short-term recovery, normalization of sleep); and increasing stress resistance through the formation of professional readiness for activities in non-standard situations (development of skills for predicting situations, development of the ability to form an image of the activity). Currently, AVR technologies are being intensively improved, and their limitations and shortcomings are gradually being eliminated. In addition to eliminating the shortcomings of AVR devices, improving software and creating content for the progress of AVR technologies, it is important to progressively develop the market for the use of such devices and technologies in computer and cyber-physical systems.

References 1. McGuinn, I.V.: Application of augmented and virtual reality in education. Cross Cult. Stud. Educ. Sci. 7(2), 126–132 (2022) 2. Muraviov, I., Kovalenko, G., Bogomolov, A.: Method for improving the reliability of an ergatic control system for an automated aircraft. In: Proceedings - 2022 International Russian Automation Conference, pp. 628–632 (2022) 3. Slavin, O.A.: Applied aspects of the use of virtual and augmented reality technologies in education. News Tula State Univ. Tech. Sci. 9, 34–38 (2022) 4. Velitchenko, S.N.: The use of virtual reality tools in education. Interscience 13–1(283), 27–29 (2023) 5. Koltygin, D.S., Anikina, E.M., Koltygin, S.D.: The use of virtual reality technologies in education. Annali d’Italia 281, 48–50 (2022) 6. Karpunina, A.V., Shimanovskaya, Y.V., Kamenskih, V.N., Kudrinskaya, L.A., Bogatov, D.S.: VR in social services for the elderly: opportunities and risks. Revista Turismo Estudos Práticas 1, 26 (2021) 7. Slavin, O., Grin, E.: Features of protection of intellectual property obtained using virtual and augmented reality technologies. In: Kravets, A.G., Bolshakov, A.A., Shcherbakov, M. (eds.) Society 5.0: Human-Centered Society Challenges and Solutions. Studies in Systems, Decision and Control, vol. 416, pp. 103–113. Springer, Cham (2022). https://doi.org/10.1007/ 978-3-030-95112-2_9 8. Ivanov, A., Bogomolov, A.: Heterogeneous information system for the integration of departmental databases on the state and development of human capital. Stud. Syst. Decis. Control 437, 117–127 (2023) 9. Denisenko, V.V., Korablin, M.A., Klimenko, K.S.: Education in virtual reality. Innov. Sci. Educ. 52, 558–563 (2022) 10. Larkin, E.V., Akimenko, T.A., Bogomolov, A.V.: The swarm hierarchical control system. In: Tan, Y., Shi, Y., Luo, W. (eds.) Advances in Swarm Intelligence, ICSI 2023, LNCS, vol. 13968, pp. 30–39. Springer, Cham (2023). https://doi.org/10.1007/978-3-031-36622-2_3 11. Shinzhina, D.M.: Virtual reality in education. Inf. Educ. Bound. Commun. 14(22), 136–137 (2022) 12. Shpudeiko, S.A., Bogomolov, A.V.: Methodological foundations for the organization of nonmonotonic learning processes for complex activities based on the theory of transformational learning. Inf. Technol. 3, 74–79 (2006) 13. Ushakov, I.B., Bogomolov, A.V.: Diagnostics of human functional states in priority studies of domestic physiological schools. Med.-Biol. Soc.-Psychol. Prob. Saf. Emerg. Situations 3, 91–100 (2021) 14. Golosovskiy, M., Bogomolov, A.: Fuzzy inference algorithm using databases. In: Silhavy, R., Silhavy, P. (eds.) Artificial Intelligence Application in Networks and Systems, CSOC 2023, LNNS, vol. 724, pp 444–451. Springer, Cham (2023). https://doi.org/10.1007/978-3031-35314-7_39 15. Tobin, D., Bogomolov, A., Golosovskiy, M.: Model of organization of software testing for cyber-physical systems. Stud. Syst. Decis. Control 418, 51–60 (2022) 16. Grigoryeva, I.V., Fedchenko, R.S.: Application of virtual reality (VR) and augmented reality (AR) technologies in education (literature review). Russ. J. Educ. Psychol. 14(2), 24–30 (2023)

17. Belchenko, V.E., Burykina, S.V., Paladyan, K.A.: The use of virtual reality technologies in education. E-Scio 11(74), 212–217 (2022) 18. Larkin, E.V., Bogomolov, A.V., Privalov, A.N., Dobrovolsky, N.N.: Relay races along a pair of selectable routes. Bull. South Ural State Univ. Ser. Math. Model. Program. Comput. Softw. 11(1), 15–26 (2018) 19. Bychkov, E.V., Bogomolov, A.V., Kotlovanov, K.Y.: Stochastic mathematical model of internal waves. Bull. South Ural State Univ. Ser. Math. Model. Program. Comput. Softw. 13(2), 33–42 (2020) 20. Larkin, E.V., Bogomolov, A.V., Privalov, A.N., Dobrovolsky, N.N.: Discrete model of paired relay-race. Bull. South Ural State Univ. Ser. Math. Model. Program. Comput. Softw. 11(3), 72–84 (2018) 21. Yasnikov, A.I., Safonova, T.V., Russkin, V.D., Loginov, I.S., Moshurov, V.M.: The use of virtual reality technologies in education. Inf. Technol. Syst. Manage. Econ. Transp. Law 1(45), 60–69 (2023) 22. Larkin, E., Privalov, A., Bogomolov, A., Akimenko, T.: Model of digital control system by complex multi-loop objects. AIP Conf. Proc. 2700, 030009 (2023)

Information and Communication Technology Skills for Instruction Performance: Beliefs and Experiences from Public School Educators Fatin Ardani Zamri1 , Norhisham Muhamad2 , and Miftachul Huda2(B) 1 Universiti Malaysia Sarawak, Kota Samarahan, Malaysia 2 Universiti Pendidikan Sultan Idris, Tanjung Malim, Malaysia

[email protected]

Abstract. This study was carried out to examine public educators’ beliefs and practices in managing information and communication technology (ICT) skills and their use in the 21st century learning program. The study was conducted in seven secondary schools under the Sarawak State Education Department, Padawan District, through a questionnaire distributed to 69 respondents among the public educators in those schools. The responses were analyzed using SPSS for Windows (version 23.0) with descriptive statistics to obtain frequencies, percentages, means and standard deviations. Pearson correlation was applied as an inferential test to examine the relationship between the variables. The findings show that the level of public educators’ skill in ICT usage in the 21st century learning program is medium (mean = 3.47, SD = 0.61), while the perceived effectiveness of ICT application in the 21st century learning program is high (mean = 3.82, SD = 0.58). The results also show a moderately high positive relationship between ICT usage skills and the effectiveness of the teaching and learning process in the 21st century learning program. Based on these findings, it is hoped that public educators will enhance their skill in using the latest applications, together with students’ ability to use gadgets, in the teaching and learning process of the 21st century learning program. Keywords: Information and Communication Technology · public secondary school · instruction performance · public educators’ beliefs and practices

1 Introduction The importance of information and communication technology (ICT) to the country, society and students in the context of knowledge generation can no longer be disputed [1]. The use and development of this technology has provided education services in the country that can meet needs in various fields, whether in administrative or pedagogical management. As a result, education has begun to grow rapidly along with the current of technology in a borderless era. This is in line with the changes brought by globalization, which move the paradigm of life towards modernization [2].

Therefore, 21st century learning is an outgrowth of technological advancement and its application in the field of education. It has become one of the teaching and learning methods introduced to teachers to help make teaching and learning sessions more interesting and effective. Furthermore, 21st century learning involves the application of ICT, which requires information, media and technology skills. Educators need to challenge themselves to keep pace with technological development, especially in the field of education, by preparing thoroughly, whether by learning, understanding, mastering or even pioneering the technology [3]. Accordingly, teachers, especially public educators, are introduced to 21st century learning methods that apply ICT to attract students’ interest in learning about Islam while also helping students to understand and appreciate the subject of Islamic Education so that it can be practiced in everyday life.

2 Problem Statement Islamic Education is the main subject that shapes the personality of individuals to have noble character. However, the teaching and learning pattern of Islamic Education is often considered outdated, and this is seen as a factor that causes students to be less interested in studying it. Research conducted in several states in Malaysia shows that the lecture method is the most popular method used by teachers in the classroom [4]. In particular, most Islamic Education teachers are still less skilled in applying ICT, such as projectors, videos, slides and television, during the teaching and learning process. This is due to the lack of interest and sensitivity of Islamic Education teachers in learning how to handle these materials [5]. Some teachers are also not given the opportunity to attend the courses and training on how to use these materials organized by the Malaysian Ministry of Education. In addition, the presence of ICT in the field of education has helped reduce the cost and materials of producing self-made teaching aids and has saved time. However, ICT has come as a surprise to many Islamic Education teachers who lack knowledge of it, let alone those who are not skilled in applying ICT in their teaching and learning process [6]. This coincides with a study which found that the use of technology-based (ICT) teaching aids is at a low mean level [7]. In this view, the failure of teachers to change their teaching methods and techniques through technology is a challenge that needs to be addressed immediately [8]. Teachers should find alternatives that make learning enjoyable for students. To advance students’ inclination towards Islamic Education, it is appropriate that research related to web applications is carried out as a transformation in the teaching of teachers, especially Islamic Education teachers. Internet and web technology play a major role as agents of communication, interaction, education and socialization of the world community. There are three main purposes of this study: identifying the skills of religious teachers through the use of ICT in 21st century Islamic Education; identifying the effectiveness of teaching and learning in 21st century Islamic Education by using ICT; and analyzing the relationship between ICT use skills and the effectiveness of teaching and learning in the 21st century learning program.

3 Methodology This study uses a quantitative design. To obtain the information needed, the researcher used a questionnaire through which the necessary research data could be collected. Quantitative data take the form of numbers such as frequencies, fractions and percentages, produced through methods that involve counting, such as questionnaires and observations, where each response or reaction is counted to determine its frequency [11]. This method is suitable for examining significant relationships between variables. The study population consisted of Islamic teachers from fourteen secondary schools in Padawan District, Kuching, Sarawak. For the sample of respondents, the researcher chose a probability sampling method to obtain quantitative data. A total of 69 respondents consisting of religious teachers in seven secondary schools under the Sarawak State Education Department (PPD Padawan), Kuching, Sarawak were selected randomly. The seven schools involved are:
i. SMK Matang Jaya
ii. SM Sains Kuching Utara
iii. SMK Matang Hilir
iv. SM Sains Kuching
v. Kolej Vokasional Matang
vi. SMKA Sheikh Haji Othman Abdul Wahab
vii. SMKA Matang 2

The chosen research instrument is a structured questionnaire, that is, questions with multiple answer options. The questionnaire is divided into four parts: sections A, B, C and D. Section A contains questions on the respondent’s demographics such as gender, teaching experience, level of education, participation in training or courses related to multimedia in teaching, personal computer ownership at home, and ownership of a projector and printer at home. In this section, respondents are given a choice of answers and are required to answer based on the choices provided. Section B concerns ICT use skills in 21st century Religious Education and has 17 items. Section C concerns the effectiveness of teaching and learning in 21st century Islamic Education through ICT applications and consists of 16 items. Section D concerns the problems encountered in implementing 21st century Islamic Education teaching and learning through ICT applications and has 17 items, making the total number of questions in the three parts 50. The data obtained through this questionnaire are used to answer the research questions. The reliability of the questionnaire was tested through a pilot study conducted on 15 students at Universiti Pendidikan Sultan Idris. The overall analysis of the pilot study found that the questionnaire has a high reliability value of α = 0.905, which exceeds the threshold of α > 0.6. An alpha coefficient above 0.6 indicates that the instrument has high reliability; a low value indicates that the ability of the items in the instrument is low [9].
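Although the analysis was run in SPSS, the reliability check can be reproduced with a short script. The sketch below computes Cronbach’s alpha for a respondents-by-items matrix; the simulated pilot responses (15 respondents, 17 five-point Likert items) are placeholders for illustration only, not the study’s data, so the printed value will not match α = 0.905 exactly.

```python
import numpy as np
import pandas as pd

def cronbach_alpha(items: pd.DataFrame) -> float:
    """Cronbach's alpha for a (respondents x items) matrix of Likert scores."""
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1)      # variance of each item
    total_variance = items.sum(axis=1).var(ddof=1)  # variance of the summed scale
    return (k / (k - 1)) * (1 - item_variances.sum() / total_variance)

# Simulated pilot data: 15 respondents answering 17 five-point Likert items that
# share a common latent factor, so alpha should come out high (placeholder only).
rng = np.random.default_rng(7)
latent = rng.normal(3.5, 0.8, size=(15, 1))
responses = np.clip(np.rint(latent + rng.normal(0, 0.6, size=(15, 17))), 1, 5)
pilot = pd.DataFrame(responses)

print(f"Cronbach's alpha = {cronbach_alpha(pilot):.3f}")  # compare with the 0.6 threshold
```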

4 Results 4.1 Information and Communication Technology (ICT) Skills for Instruction Performance The results show that the overall mean value for ICT use skills in 21st century Islamic Education is at a moderate level of 3.47 (SD = 0.61). Overall, four items are at a low level, most notably the item “I am good at using the Socrative application to give quizzes to students” (mean = 2.43, SD = 0.98). Five items are at a medium level, namely “I am good at building educational software for teaching and learning Islamic Education” (mean = 2.78, SD = 1.11), “I am good at using various multimedia materials during the teaching and learning process” (mean = 3.36, SD = 0.97), “I use graphic or animation materials when explaining the learning topic” (mean = 3.41, SD = 0.99), “I am good at using the Google Drive application to share teaching and learning materials with students” (mean = 2.86, SD = 1.05) and “I am good at using the Frog VLE application to share teaching and learning materials with students” (mean = 2.77, SD = 0.99). Three items are at a high level, namely “I am good at using Microsoft Excel to enter student marks” (mean = 3.83, SD = 1.12), “I am good at preparing teaching and learning materials using PowerPoint” (mean = 3.94, SD = 1.03) and “Able to type Jawi quickly and correctly” (mean = 3.54, SD = 0.99), and five items are at a very high level, most notably the item “I use the computer to type exam questions” (mean = 4.74, SD = 0.63). 4.2 The Effectiveness ICT Applications on Supporting Instruction Performances The findings show the effectiveness of teaching and learning in 21st century Islamic Education through ICT applications. The overall mean value is at a high level, 3.82 (SD = 0.58). Overall, fifteen items are at a high level, most notably the item “ICT application can improve the teaching practice of Islamic Education” (mean = 4.22, SD = 0.73), while only one item is at a moderate level with the lowest mean score, item C7, “Two-way interaction often occurs in class when I apply the use of Prezi slides” (mean = 2.96, SD = 1.06). Inferential data analysis was also carried out to identify the relationship between the two main constructs used in this study, namely ICT use skills and the effectiveness of teaching and learning in 21st century Islamic Education.

Table 1. Correlation between skills and effectiveness

Construct               r        Sig. (2-tailed) (p)
Skills * Effectiveness  0.659**  0.000

** Correlation is significant at the p < 0.01 level (2-tailed), N = 69

The results of Pearson’s correlation analysis to identify the relationship between ICT use skills and the effectiveness of teaching and learning in 21st century Islamic Education, shown in Table 1 above, indicate a moderately high positive relationship (r = 0.659, p < 0.01). Overall, therefore, the relationship between ICT use skills and the effectiveness of teaching and learning is strong.
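The correlation in Table 1 can be verified with any statistics package; the following sketch shows the equivalent computation with SciPy. The two score vectors are random placeholders standing in for the 69 respondents’ composite scores, so the printed values only approximate the reported r = 0.659.

```python
import numpy as np
from scipy import stats

# Placeholder composite scores for N = 69 respondents (not the study's data):
# an ICT-skills score and a correlated teaching-effectiveness score.
rng = np.random.default_rng(0)
skills = rng.normal(3.47, 0.61, size=69)
effectiveness = 0.62 * skills + rng.normal(0, 0.45, size=69) + 1.67

r, p = stats.pearsonr(skills, effectiveness)
print(f"r = {r:.3f}, p = {p:.4f}")  # the study reports r = 0.659, p < 0.01
```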

5 Discussion It can be summarized that the skills of Islamic teachers in using ICT in 21st century Islamic Education are moderate. The majority of respondents are proficient in basic computer skills such as Microsoft Word, Excel and PowerPoint and can type quickly, with these items at a high level; however, they are weaker at building educational software for Islamic Education teaching and learning, at using applications such as Google Drive and Frog VLE, and at using various multimedia, graphic and animation materials [10]. The findings also indicate that Islamic Education teachers are less skilled in using sophisticated, up-to-date applications such as Prezi, Google Classroom, Kahoot and Socrative, which are at a low level. In the researcher’s opinion, this is likely due to several factors. Among them, teachers are not skilled in using increasingly sophisticated and complicated software; the limited use of these applications in teaching and learning in Malaysian schools is due to teachers not having the skills and training in them [11]. A study conducted by MSC on the courseware developed by the Malaysian Ministry of Education also found that the quality of the courseware supplied to schools still has weaknesses, such as the use of English in the software, technical problems, the inflexibility of the courseware and less interesting content [12]. The results show that the effectiveness of the teaching and learning process is at a high level (3.82), with the majority of items at a high level except for one: two-way interaction occurring in class when the Prezi application is used, which is at a moderate level. The 21st century Islamic Education teaching and learning session using ICT makes students interested in learning and helps them concentrate; the results show that these two items are at a high level. This parallels the literature review finding that multimedia has a good effect on students during the teaching and learning of Islamic Education [13]. The effectiveness of teaching and learning using ICT can also be seen in the finding that items on obtaining feedback from students about their level of mastery and understanding of the topic studied are at a high level. Teachers’ skill in applying technology as a teaching aid successfully attracts interest and increases students’ understanding and memory because the teaching content is presented clearly.
In addition, it is also easy for the teacher to control the class because the students easily understand the contents of teaching and learning. Two-way interaction occurs in the class when PowerPoint is used, causing students not to feel sleepy during teaching
and learning sessions [14]. This is proven by the results of the study when all these items are at a high level. The teaching and learning of 21st century Islamic Education using ICT has a positive effect when students can read items correctly according to the reciting examples at a high level. In the researcher’s opinion, topics related to the recitation of the Quran can be adapted to the use of recitation videos either in the form of cassettes, CDs, DVDs or through the YouTube website as one of the teaching aids that can be applied by religious teachers. This is because, it cannot be denied that the use of video recordings is more interesting and effective for student learning [15]. The use of certain videos can be replayed and seen many times by students. This can help improve students’ memory and skills.

6 Conclusion 21st century learning has brought the country to a new dimension. Changes in implementation require Islamic teachers not to fall behind in realizing PAK21 (21st century learning) alongside the appreciation of religious values. The application of ICT in 21st century Islamic Education is therefore part of the implementation of 21st century learning. Islamic teachers need to be proficient in various software and always be aware of the changes happening around them in order to keep up with students who are increasingly absorbed in the world of gadgets and the internet. Teachers should also play a role in guiding students so that they do not drift away and are not influenced by the negative elements present today. It is hoped that this study can be shared with the parties involved, especially in the field of Islamic Education, as a reference on one of the elements that must be present in the implementation of 21st century learning.

References 1. Abdul Hadi, M.D., Lee, S.T., Mohan, P., Jamilah, D.: Penerimaan Alat Web 2.0 Dalam Pelaksanaan Kurikulum Program Berasaskan Pembelajaran Abad Ke-21 di Institut Pendidikan Guru. Jurnal Penyelidikan Dedikasi 10, 79 – 97 (2016) 2. Abdul Halim Ahmad: Penggunaan Aplikasi Rangkaian Sosial Dalam Kalangan Pelajar Poiliteknik Kuala Terengganu Politeknik & Kolej Komuniti. J. Soc. Sci. Hum. 1, 81–90 (2016) 3. Al – Bahrani, A., & D. Patel.: Incorporating twitter, instagram and facebook in economics classrooms. J. Econ. Educ. Econ. Educ. 46(1), 56–57 (2015) 4. Tamuri, A.H., Yusoff, N.M.R.N.: Teaching and Learning Methods of Islamic Education. Bangi National University, Kuala Lumpur (2010) 5. Jamian, A.R., Ismail, H.: Implementation of fun learning in teaching and learning Malay. Malay Lang. Educ. J. 3(2), 49–63 (2013) 6. Ahlqvist, T., Back, A., Halonen, M., Heinonen, S.: Social Media Roadmaps Exploring the Futures Triggered by Social Media. Technical editing Leena Ukskoski (2008) 7. Kaplan, A.M., Michael, H.: Users of the World, Unite! The Challenges and Opportunities of Social Media, Business Horizons. Indian University, Kelly School of Business (2010) 8. Baruah, T.D.: Effectiveness of social media as a tool of communication and its potential for technology enabled connection : a micro level study. Int. J. Sci. Res. Publ. 2(5), 1–10 (2012)

9. Boyd, D., Ellison, N.: Social network sites : definition, history & scholarship. J. Comput.Mediat. Commun. 13, 210–230 (2008) 10. Brian, M.: BYOD is shaping education in the 21st century. Tech Learn. 31(7), 54–57 (2014) 11. Awang, I.: Scientific Research on the Practice of Islamic Studies. Kamil & Syakir Sdn Bhd., Selangor (2009) 12. Abdullah, N.B., Lazim, N.R.J.B.M.L., Zain, R.B.A.: Teaching and Learning Technology. Multimedia Sdn. Bhd., Selangor (2009) 13. Hussin, N., Rasul, M.S., Abd, R., Rauf,: The use of websites as a transformation in teaching and learning islamic education. Online J. Islamic Educ. 1(2), 58–73 (2013) 14. Ahmad, S.F., Tamuri, A.H.: Teachers’ perception of the use of teaching aids based on multimedia technology in teaching j-QAF. J. Islamic and Arabic Educ. 2(2), 53–64 (2010) 15. Embi, Z.B.C.: The Use of Teaching Aids in Teaching Malay. Sultan Idris University of Education, Perak: Bachelor’s Thesis (2001)

Managing Information Quality for Learning Instruction: Insights from Public Administration Officers’ Experiences and Practices Arbaenah Masud1 , Abd Hadi Borham2 , Miftachul Huda2(B) , Mohamad Marzuqi Abdul Rahim2 , and Husna Husain2 1 Sekolah Menengah Agama (SMA) Al-Quran Waddin, Johor Bahru, Malaysia 2 Universiti Pendidikan Sultan Idris, Tanjong Malim, Malaysia

[email protected], [email protected]

Abstract. This study examines the practices and processes involved in implementing and preparing materials for distributing information quality in religious learning instruction (preaching programs). The study is qualitative, using the interview method for data collection. The participants come from public administration offices in the state of Johor and were selected by purposive sampling according to the locality cluster zone of the school district. The results revealed that the strategic practices and processes for information quality and accuracy in the religious learning instruction program have been carried out well and are systematically organized according to the needs and suitability of public schools. The main finding highlights the need for sufficient preparation in planning, mainly concerning materials that are traditionally simple and stereotyped. The implications indicate that the process and practices of implementing information quality and accuracy in religious learning instruction programs require continued support in terms of contemporary knowledge and issues, technical experts and professional development to help move the actualisation from the technical level to real scenarios. Keywords: Information quality · Learning instruction · Public administration officers · Experiences and practices

1 Introduction The history of government-supported religious schools (SABK) in Johor, Malaysia, began with the establishment of Sekolah Agama Rakyat in 1918. At that time the management and running of each school followed the wishes and knowledge of the “mudir”, the school manager and founder. In 2018, SAR schools in the state of Johor were converted to SABK under the management of the Islamic Education Division, Ministry of Education Malaysia (KPM). SABK schools use the religious curriculum and receive various facilities from the ministry. This is seen as a noble effort to develop the people’s religious schools (SAR) within a mainstream education system [1]. The establishment of SABK has succeeded in producing capable students in the religious and academic fields.

SABK is also the choice of parents for their children’s schooling. In addition to the integrated religious curriculum, the implementation of preaching programs at SABK can be organized systematically to help improve students’ development in understanding and appreciating Islam, leadership and vision. On top of this, the Islamic Education Division of the Ministry of Education (KPM) has worked together with SABK school management to organize, coordinate and plan strategies for developing religious learning instruction programs in the aspects of faith, worship and morals [2]. These three aspects are the basis of Islam that must be lived in order to be a balanced Muslim and believer. Therefore, this article aims to identify the implementation of programs and preaching materials in the aspect of faith in SABK schools in the state of Johor. 1.1 Problem Statement and Objectives The Government Aided Religious School (SABK) originates from the People’s Religious School (SAR), which was established to spread preaching continuously. While the schools had SAR status, the implementation of the preaching program did not use a special manual as a guide, unlike schools under the Malaysian Ministry of Education (KPM). Each school had a different approach to implementing the religious learning instruction program, following the approach of the school’s founder. In 2018, SAR schools in the state of Johor agreed to accept school administration under the ministry and became known as Government Aided Religious Schools (SABK). SABK schools have used an integrated religious curriculum under the ministry. However, in implementing religious learning instruction programs that support student development, they still follow the tendencies of their respective schools, as during their SAR status, without specific and consistent guidelines. There are SABK schools that still maintain previous preaching programs because these are seen as relevant to the aspirations held since the establishment of the school [3]. Indirectly, this also has an impact on the implementation of programs and preaching materials in the aspect of faith in SABK schools. When school management is fully responsible for the implementation of preaching programs in SABK schools, there is inconsistency and uneven acceptance among students. The strong earlier influence of SAR shows that the dimensions of belief and faith have a strong influence on students’ moral judgment, and Islamic spiritual tendencies exert a significant influence [4]. Ahmad Faisal Muhamad’s study (2003) also identified less effective delivery of preaching programs and the SABK school environment as contributing factors. The questions are how the preaching program in the aspect of faith is implemented in SABK, and what preaching materials in the aspect of faith are used in Government Aided Religious Schools (SABK) in the state of Johor. Thus, the research objectives are to identify the implementation of the religious learning instruction program in the aspect of faith at Government Aided Religious Schools (SABK) in the state of Johor, and to identify the preaching materials in the aspect of faith used in these schools.

2 Literature Review Program means a plan of policy and how to do things. Religious learning instruction means the activity of voicing and convincing others to accept a religious belief [5]. It is an effort to deliver teachings that is done consciously and in a planned way, using certain approaches to influence others so that they follow the purpose of the religious learning instruction without any coercion. Faith, on the other hand, means believing in Allah s.w.t. and rejecting anything that is associated with Him, which is also called monotheism. Aqidah and tauhid stimulate a person’s conviction through researching, studying and observing in order to develop strong faith in the concepts of divinity, apostolate and sam’iyyat matters [6]. In understanding and appreciating the faith, humans have been given channels through which to prove it. In the context of this study, the preaching program in the aspect of faith means the planning and implementation of efforts to bring people to the teachings of Islam through activities in SABK secondary schools that strengthen students’ understanding of faith. In strengthening faith, every Muslim must understand God’s commands in the Quran and Sunnah by carrying out religious learning instruction, especially in educational institutions. The strategic implementation of religious learning instruction must be based on aspects of faith that cover the development of knowledge and guarantee the perfection of a Muslim’s knowledge [7]. The success of preaching in SABK schools is closely related to the role of administrators and teachers. The strategic implementation of religious learning instruction depends heavily on the wisdom of its drafting and planning; this is important to ensure that the preaching message can be delivered [8]. Neglecting the aspect of belief in the implementation of religious learning instruction programs in schools leaves the program incomplete and unable to fully meet the needs of religious learning instruction. Implementing the faith program merely as an insert alongside favored programs such as worship and morals is certainly not able to solve moral issues among school teenagers, because students are not brought up with the understanding of muraqabatullah. As a result, they excel academically with an A in religious subjects and still commit sins and vices. In order to guarantee the continuity of this Islamic preaching, the management of preaching materials, including faith materials, is very important [9]. Weakness in the management of religious learning instruction materials contributes to the failure of the religious learning instruction goal itself. The financial aspect is very necessary and needs to be managed well and regularly whenever a religious learning instruction program is planned. In addition, there is a lack of skill in delivering religious learning instruction according to class levels and specific targets, including in schools, and a related study noted a lack of preparation in terms of knowledge, understanding and delivery [10]. There is also a lack of appropriate and acceptable preaching materials. Emphasis on aspects of faith in preaching programs can have an impact on individuals’ universal view (weltanschauung) and personality.
The clear indication is that administrative factors, teachers, methods, reference materials, measurements, time allocation and classes all influence the acceptance of preaching programs in schools [11]. There is also a lack of coordination and communication in the programs and practices of religious learning instruction management in schools. The aspect of tolerance in achieving religious learning instruction aspirations among the various parties in the school is very important because it indirectly shapes the acceptance and perception of the program. This shows that there are some gaps in the previous studies above. Among them is the method of implementing religious learning instruction programs in schools, which requires an understanding of religious learning instruction philosophy among teachers. The variety of religious learning instruction methods is good, but organizing and adjusting the religious learning instruction program together requires effective coordination. The same applies to the materials used in the preaching program: they need improvement, updating and accessibility to be well received and effective.

3 Methodology 3.1 Design of Study This study uses a qualitative design with the interview method. A total of six administrators at Government Aided Religious Schools (SABK), covering three zones in the state of Johor, were selected as informants using purposive sampling according to the cluster zones of the SABK schools in the state of Johor, as shown in Table 1.

Table 1. Selection of Informants by District Zone

Zone                                                                    Respondents
North Zone – SABK schools in the Muar and Segamat districts             2 (R1 and R2)
Middle Zone – SABK schools in the Mersing, Kluang and Batu Pahat districts   2 (R3 and R4)
South Zone – SABK schools in Pontian, Kota Tinggi and Johor Bahru       2 (R5 and R6)

3.2 Respondent Zone Religious schools were selected also because their background and schooling system differ slightly from other schools in terms of school climate, which contributes to the level of implementation of religious learning instruction activities in the school. The respondents (R1–R6) were selected from among SABK administrators because of their characteristics and advantages: they have experience managing the schools from the time the school had the status of a People’s Religious School (SAR) until it was registered as a Government Aided Religious School (SABK).

4 Research Results and Discussion 4.1 Practical Implementation of Information Quality Achievement on Preaching Program The results of the study found that all respondents understood the concept of the need for preaching in SABK schools, especially the aspect of faith. An understanding of the need for religious learning instruction in schools needs to exist in every teacher [11]. This need needs to be supported with initial strategies, training and enrichment outside the classroom in order to help Muslim students further strengthen their knowledge and appreciation of Islamic religious values (BPI 2017). This shows that SABK also places a serious focus on the development of Islamic character in addition to excellence in academic achievement [12]. In terms of the implementation of religious learning instruction programs, this study found that SABK schools in the state of Johor have planned by preparing an annual calendar to ensure that religious learning instruction activities are carried out on schedule. SABK in the state of Johor is also continuing the religious learning instruction programs in the faith aspect which has been a practice since SAR. Respondents stated that some SABK still continued the practice of reading such as ratib al-Attas, quran al-fajr, al-ma’thurat, reciting surah Yasin and tahlil, and dhikr as a routine. This shows that the preaching programs during SAR are still continuing such as al-khitobah, Wirid Sakaran and the most routine is the recitation of Yasin in large numbers in the square with all the school children and coincides [13]. The continuing SAR preaching program has had a positive effect on the character building of students planned as the school’s annual preaching calendar. Respondents also stated that the local community greatly admires SABK school students and some invite them to lead tahlil recitations, nazam, especially during khatam ceremonies in weddings, closing ceremonies, funeral management, becoming imams in suraus, pilgrimages and reciting marhaban during Eid al-Fitr to the house of asatizah and nearby residents or participate in festivities in their respective places. In addition, SABK schools in the state of Johor have implemented preaching programs in aspects of faith in study classes, lectures, courses and distribution of printed materials such as papers and others. Among the programs carried out are Ahli Sunnah Wal Jamaah seminar, Scientific Discussion of Faith, Study of the Book of Thurath Faith and the Strengthening Faith course held in collaboration with other parties. These programs have a high scientific value and are rarely implemented in regular high schools. Most of the programs are carried out according to the current needs and there is still less planning of religious learning instruction programs in the aspect of faith on an ongoing basis [14]. However, it is implemented seasonally to increase the appreciation of Islam in schools. The planning made at the beginning of the year only happened according to the current trend and because of the instructions from the Ministry of Finance. From the aspect of religious learning instruction program management, this study found that SABK in the state of Johor has established a school religious learning instruction committee in the form of a teacher’s task handbook. Efforts to arrange positions of religious learning instruction authority and recorded in school handbooks greatly help the smooth implementation of religious learning instruction programs in schools [15]. 
The program does not need to take a long time to implement, and the administration
does not need to struggle for every time the preaching program is implemented. However, if the position of authority or coordinator does not function well, then the objective of religious learning instruction will not be achieved. Coordinators and committees need to live up to the goals of the religious learning instruction program and not just accept instructions [16]. The results of the study found that all respondents agreed that every preaching program at SABK in the state needs an organized monitoring system. Monitoring is done in various ways, whether using monitoring forms, using reports after the program and others [17]. However, based on the results of the observation, the monitoring of the religious learning instruction program carried out is quite loose, there is no specific specification of how a good monitoring should be done and what the next action is after monitoring the program. This deficiency makes monitoring unfocused and inconsistent between one SABK school and SABK schools in other districts [18]. The executive authority has also made a reflection (post mortem) after the implementation of the religious learning instruction program as a step to improve the quality and added value that should be made in future religious learning instruction programs. However, sometimes there is no follow-up to the results of the reflection found. Challenges in implementing religious learning instruction programs in schools depend on the energy capacity or executive authority. Furthermore, it involves the preacher’s specific skills in the implementation of the program. However, it is quite difficult for SABK schools to adjust the time and program coordinators because there are already many tasks to be carried out. In conclusion, the implementation of the preaching program in the aspect of SABK schools has been well implemented according to suitability and is still not organized systematically. The analysis shows that the school has made an effort to compile a dakw’ah calendar and there is overlap in the implementation of the program if there are ad hoc instructions. However, there was a slight lapse, when the preaching program in the aspect of faith was implemented according to the programs that had been practiced during the People’s Religious School (SAR). Next, the culture of planning religious preaching programs has not yet been fully implemented because there are still SABKs that only carry out religious preaching programs when necessary. While the monitoring and management aspect of the preaching program at SABK has been done well through the appointment of authority positions, reflection after the program and monitoring from administrators and division of duties. However, there is still no standard mechanism to ensure positive effects in the implementation of religious learning instruction. 4.2 Process of Information Accuracy Enhancement Through Preaching Materials Arrangement The findings of the study show that almost all SABK use modules including the belief module as preaching material. SABK strives to manage the preaching program in their respective schools by using various methods. Modules are provided either using modules supplied by the Islamic Education Division, Ministry of Education Malaysia (KPM) or with materials prepared by the respective schools. Every SABK uses a module but there is no adaptation of the belief module according to the age of the student. The same module is used for all ages. 
Thirteen-year-old students will use the same faith module used by seventeen- and eighteen-year-old students. This makes it difficult for junior high school
students to understand and participate in the workshop, especially the belief module that is too scientific which is difficult for students to explain let alone give opinions at a young age for topics that are too heavy. In addition, preaching materials have been prepared using school finances, teachers and donations from outside parties. Johor State SABK has been assisted with various financial resources in ensuring that the religious learning instruction program runs well. Similarly, there is no special finance to implement the religious learning instruction program, even though there are instructions to implement the religious learning instruction program. Per capita Grant Assistance (PCG) is very risky to be used without following the procedure, especially for programs that do not follow the spending goals allowed by the ministry. This finding coincides the lack of financial resources affects the development of SABK schools in various aspects [19]. SABK in the state of Johor has provided creative preaching materials in the aspect of faith and is still stereotypical. Activities such as pasting posters, signs, tazkirah, salawat, tips, manners, related to preaching can have an impact on the practice of shari’ah living. The culture of greeting, smiling and the campaign to protect private parts is very effective and can make preaching in schools understood and practiced so that the school becomes a calm location with a harmonious and Islamic atmosphere [20]. A positive culture can also be practiced regularly in schools. This is important in order to produce students to be pious and faithful future leaders. The religious learning instruction program built by KPM includes many elements of Islamic faith and sharia doctrine. The conclusion is that the preparation of preaching materials at SABK based on the concept of faith is still at a level that needs to be improved either from the aspect of preparation or financial resources. The preparation of preaching material in the aspect of belief is still inconsistent. The role of administrators from SABK needs to be empowered. Religious learning instruction supervisor teachers need to be assisted with skills to manage religious learning instruction programs, especially in relation to faith. This is to some extent able to improve the interest and delivery system of the preaching program in the aspect of faith in SABK in the state of Johor.

5 Conclusion Based on the discussion above, it can be stated that the continuation of religious learning instruction programs provides a beneficial, positive value for the transition process and can therefore be adapted and integrated according to demands and needs. This paper investigated the practices and processes involved in implementing and preparing materials for distributing information quality in religious learning instruction (preaching programs). The qualitative approach was based on interviews with public administration officers in the state of Johor, selected by purposive sampling according to the locality cluster zone of the school district. The main point of this study is that the strategic practices and processes for information quality and accuracy in the religious learning instruction program have been carried out well and are systematically organized according to the needs and suitability of public schools. The findings highlight the need for sufficient preparation in planning, mainly concerning materials that are traditionally simple and stereotyped. They further indicate that the process and practices of implementing information quality and accuracy in religious learning instruction programs require continued support in terms of contemporary knowledge and issues, technical experts and professional development, to help move the actualisation from the technical level to real practice.

References 1. Al-Qardhawi, Y.: Hakikat Tauhid, Pengertian Tauhid. Terj. Abdul Majid Abdullah. Batu Caves: Pustaka Salam Sdn Bhd (2014) 2. Din, H.: Manusia dan Islam. Jil. 1. Kuala Lumpur: Dewan Bahasa dan Pustaka (2002) 3. Sharifie, H.M.: Mauduk wa al-khasais religious learning instruction al-Islamiah (2014). https://www.alukah.net/sharia/D/68785 4. Ismail, A.M., Borham, A.H., Rahim, M.M.A.: Good governance in the management of the secondary school Surau. J. Adv. Res. Dyn. Control Syst. 11(8 Special Issue), 2656–2664 (2019) 5. Ismail, M.R.: Amalan Pengurusan Program Akademik Pentadbir Dan Perkembangan Murid Berpencapaian Rendah Akademik Di Sekolah Agama Bantuan Kerajaan. PANRITA J. Sci. Technol. Arts 1(1) (2021). https://Journal.Dedikasi.Org/Pjsta/Article/View/12 6. Ulya, I., Aziz, M.A., Ismail, Z.: Trait Personaliti Penreligious learning instruction Muslim: Satu Sorotan Literature Al-Hikmah. J. Islam. Relig. Learn. Instr. 10(1), 34–54 (2018) 7. Dewan, K.: Kuala Lumpur: Dewan Bahasa dan Pustaka (2003) 8. Osman, M.M., Bachok, S., Nur, S., Thani, A.A.: An Assessment of physical development in religious educational in Malaysia: case study of SAR and SABK in Perak. Procedia Soc. Behav. Sci. 28, 427–432 (2015) 9. Bin Mohamad Kasim, M.A., et al.: Iklim Sekolah dan Komitmen Guru di Sekolah Agama Bantuan Kerajaan (SABK) Negeri Kelantan. In: Proceeding of ICECRS, pp. 543–699 (2016). https://doi.org/10.21070/picecrs.v1i1 10. Mahat, M.A.: Pengetahuan, Kemahiran dan Sikap Guru Pendidikan Islam Terhadap Pelaksanaan Aktiviti Religious learning instruction Sekolah. Bahan tidak diterbit. Tesis Sarjana. Universiti Kebangsaan Malaysia (2019) 11. Omar, M.N.: Pengurusan Religious learning instruction di Negeri Maju: Kajian Tentang Cabaran Dan Masalahnya di Melaka. Bahan tidak diterbit. Tesis Doktor Falsafah. Universiti Malaya (2013) 12. Mohd Yusoff, M.Z.: Faktor-Faktor Penyumbang dalam Pertimbangan Moral Pelajar Sekolah Agama. Malays. J. Learn. Instr. 9 (2019). https://e-journal.uum.edu.my/index.php/mjli/art icle/view/7638 13. bin Muhamad, N., bin Hashim, A., Bin Daud, M.N., Bin Borham, A.H.: Religious activities for life-long learning medium and their impacts on Sabk’s student moral appreciation in State of Perak. Int. J. Acad. Res. Bus. Soc. Sci. 9(11), 1–14 (2019) 14. Mustafa, M.N., Umbak, C.M., Martin, J., Ayudin, A.R.: The influence of principals ‘holistic leadership practices in government aided religious schools (SABK) on teachers’ work commitment in Kuching District, Sarawak. Kqt Ejurnal 1(2), 51–68 (2021). http://Ejurnal.Kqt. Edu.My/Index.Php/Kqt-Ojs/Article/View/34 15. Jaafar, N., et al.: The importance of self-efficacy: a need for Islamic teachers as Murabbi. Procedia Soc. Behav. Sci. 2(69), 359–366 (2012) 16. Alimudin, N.: Konsep Religious learning instruction Dalam Islam. J. Hunafa 4(1), 73–78 (2007)

17. Ahmad, R.: Ahammiyat Dirasat el Akidah Wa Hukmi Taalumiha (2012). https://www.alukah. net/sharia/04733l/4/12/2012 18. Surip, N.A., Razak, K.A., Tamuri, A.H., Fatah, F.A.: The practice of tolerance among islamic education teachers (IETs) through Shura in the management of religious learning instruction activities in schools. Creat. Educ. 10, 2606–2614 (2019). https://doi.org/10.4236/ce.2019.101 2188 19. Arianto, T.: Pengurusan Program Dakw’ah Di Integrated Islamic School Kota Damansara Selangor. Bahan tidak diterbit. Universiti Malaya, Tesis Sarjana (2011) 20. Azha, Z.A.: Kurikulum Pengajian Akidah Di Sekolah Menengah Atas Negeri Kabupaten, Kerinci, Jambi, Sumatera, Indonesia. Bahan tidak diterbit. Tesis Sarjana. Universiti Malaya (2010)

Development of Multivariate Stock Prediction System Using N-Hits and N-Beats Nathanael Jeffrey, Alexander Agung Santoso Gunawan(B) , and Aditya Kurniawan Computer Science Department, School of Computer Science, Bina Nusantara University, Jakarta 11480, Indonesia [email protected], {aagung,adkurniawan}@binus.edu

Abstract. The capital market serves as a pivotal hub within a nation’s financial ecosystem, facilitating the exchange of stocks and securities. It assumes a paramount role in propelling the country’s economic growth and development. Profitable decisions in the capital market are frequently encountered; nevertheless, the intricacy lies in the challenge of accurately forecasting unpredictable fluctuations in stock prices. The analysis and forecasting of stock price movements have emerged as a highly sought-after area of research. The forecasting of stock price movements can be effectively categorized into two primary domains: technical analysis and fundamental analysis. Time series analysis, commonly referred to as technical analysis in the realm of stock market analysis, involves the meticulous examination of a stock’s price movements over a specific period. The forecasting technique employed in this study involves the analysis of Multivariate time series data using the advanced Neural Hierarchical Interpolation for Time Series Forecasting (N-HiTS) methodology. Based on the findings of the conducted research, this methodology exhibits commendable predictive efficacy in both the short and medium time horizons. Moreover, it demonstrates a notable ability to accurately forecast long-term stock patterns. Keywords: Capital Market · Stock · Multivariate · Technical Analysis · Time Series · N-HiTS

1 Introduction The stock exchange, commonly referred to as the capital market, serves as a dynamic marketplace facilitating the trading of shares. The capital market plays a pivotal role in driving a nation’s economy. Shares of publicly traded companies, which have undergone the process of being listed on the stock exchange, represent tradable ownership units in these companies. The possession of company shares serves as a substantiation of ownership, thereby entitling the shareholder to avail themselves of all associated benefits and privileges [4]. The dynamics of stock price movement are primarily determined by the interplay between demand and supply forces. These forces are further influenced by the actions of traders as they engage in the buying and selling of shares [2]. Shares are exchanged in pursuit of financial gains, akin to conventional transactions involving the
purchase and sale of goods and services. The price dynamics of a security on the financial market are inherently stochastic and susceptible to exogenous influences, rendering it arduous to exploit such inefficiencies. Making informed decisions to generate profits in the stock market can pose a significant challenge, primarily attributable to the vast amount of data generated from both present and historical stock buying and selling transactions [11]. The process of determining the optimal course of action to maximize profitability is a complex and challenging endeavor. In order to make an informed decision, it is imperative to employ quantitative techniques to forecast the price dynamics of a given stock within the stock market. Accurately forecasting the trajectory of a stock can yield substantial financial gains, thereby rendering it a highly sought-after domain of study. While it is true that forecasting the trajectory of a stock can yield substantial gains, it is a challenging endeavor owing to various intricate elements that necessitate consideration. These elements encompass volatility, seasonality, temporal interdependencies, and economic factors, all of which must be meticulously factored in during the prediction process [18]. Fundamental and technical analysis are the two predominant methodologies commonly employed in the realm of quantitative finance to formulate accurate forecasts. Fundamental analysis is a robust methodology employed to assess the value of a stock by thoroughly examining its financial statements and conducting a meticulous quantitative analysis of the underlying business. Quantitative analysis involves examining the financial statements of the enterprise and conducting a comprehensive evaluation of its revenues, expenses, assets, and liabilities. This meticulous examination enables us to derive valuable insights into the future performance of the business. Technical analysis, as a stock analysis methodology, incorporates the examination of transaction volumes that arise from buying and selling activities within the stock market [5]. The prediction of stock price movements can be achieved through the utilization of both fundamental and technical analysis methodologies. While these approaches can yield favorable outcomes, it is important to note that predictions in the financial markets can never be entirely precise. However, conducting individual stock analysis can be a laborious and time-intensive process. Due to the exponential growth in computing power, researchers have been able to leverage this progress to devise more efficient and accelerated techniques for forecasting stock prices [12]. Technical indicators, as a valuable source of data [2], offer insights that can be harnessed to anticipate fluctuations in stock prices. The data is subsequently transformed into a dataset in the format of a Time Series, wherein stock price data is organized chronologically within a specific time interval [8]. Financial time series data can be acquired from stock prices and will manifest as data points encompassing the opening price (open), closing price (close), highest price (high), lowest price (low), and trading volume. Price movements can be accurately predicted by employing advanced techniques like machine learning, which leverage the available data [20]. The conventional machine learning approach represents a viable option within the realm of machine learning methodologies for the purpose of forecasting stock prices. 
Random Forest (RF), Support Vector Machine (SVM), and Naive Bayes are just a few examples of the numerous algorithms available in the field of quantitative analysis. Despite the satisfactory predictive outcomes yielded by this approach, experts in the
field express their assurance that alternative methodologies can be devised to generate superior results [19]. Deep learning models, colloquially referred to as the epitome of cutting-edge technology, have emerged as a consequence of the progressive evolution of conventional machine learning techniques. Recurrent Neural Networks (RNNs), Gated Recurrent Units (GRUs), and Long Short Term Memory (LSTM) are prominent deep learning models widely utilized in the field [21]. While this approach may yield favorable predictive outcomes, it is plagued by the issue of long-term dependence, leading to diminished accuracy of predictions over extended periods. In response to these challenges, scholars are endeavoring to address these issues through the exploration of novel approaches like Transformer-Based and Multi Layer Perceptron (MLP). A specific instance of MLP, known as Neural Hierarchical Interpolation for Time Series Forecasting (N-HiTS) [3], will be examined in this study and compared with its precursor, Neural Basis Expansion Analysis for Interpretable Time Series Forecasting (N-BEATS) [13]. The N-HiTS model outcomes will be effectively integrated into a web-based application system specifically designed to facilitate this research. This application system will be constructed using Streamlit [10], a cutting-edge technology.
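As an illustration of the kind of Streamlit front end described above, a minimal page could serve historical prices and a placeholder for the model forecast as sketched below. The file name, widget layout and the ".JK" Yahoo Finance ticker suffixes are assumptions made for this sketch, not details of the authors' actual application.

```python
# app.py -- run with: streamlit run app.py
import streamlit as st
import yfinance as yf

st.title("Multivariate Stock Forecast (N-HiTS / N-BEATS)")

# The study covers BBCA, BBRI, AALI, BUMI and MEGA; on Yahoo Finance these
# Indonesian tickers are assumed to carry the ".JK" (Jakarta) suffix.
ticker = st.selectbox("Stock", ["BBCA.JK", "BBRI.JK", "AALI.JK", "BUMI.JK", "MEGA.JK"])
horizon = st.slider("Forecast horizon (trading days)", 5, 120, 30)

history = yf.Ticker(ticker).history(start="2017-01-01")
st.line_chart(history["Close"])

# In the full application, a pre-trained model would be loaded here
# (e.g. darts.models.NHiTSModel.load(...)) and its forecast drawn on the chart.
st.caption(f"Placeholder: the {horizon}-day N-HiTS forecast would be appended here.")
```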

2 Literature Review 2.1 Neural Hierarchical Interpolation for Time Series Forecasting (N-HiTS) N-HiTS, an acronym for Neural Hierarchical Time Series, represents an advanced iteration of the N-BEATS model, a deep neural network architecture. Its primary objective is to cater to the intricacies of long-term forecasting challenges. This model uncovers a limitation inherent in machine learning models, commonly employed for time series prediction, namely their incapacity to perform long-term predictions (long horizon forecasting). Consequently, this approach presents a comparable structure to N-BEATS, albeit incorporating a multi-layer perceptron (MLP) for executing multi-rate data sampling and hierarchical interpolation. The objective of this implementation is to enhance long-term forecasting by conducting multi-step predictions through hierarchical interpolation, while also minimizing computational resources by employing multi-rate data sampling (Fig. 1). 2.2 Neural Basis Expansion Analysis for Interpretable Time Series Forecasting (N-BEATS) This approach has been developed based on extensive research in the field of constructing deep learning models for time series forecasting. This paper proposes the utilization of double residuals, which are partitioned into backward and forward categories and leverage fully connected deep stack layers. The primary objective of this proposal is to enhance the interpretability of the model and facilitate predictions without necessitating significant modifications to the utilized dataset, all while maintaining the existing level of accuracy [13]. The primary objectives of this approach are twofold: firstly, leveraging a sophisticated deep neural network architecture to forecast time series data, capitalizing on the


Fig. 1. N-HiTS Architecture

advancements in machine learning models. Secondly, designing a deep learning architecture that yields interpretable outputs, enabling its widespread applicability akin to traditional methods [13]. The input block, which is constructed using a multi-layer fully connected (FC) architecture and employs the rectified linear unit (ReLU) as the activation function, encapsulates the essence of the N-BEATS model. Within each layer, a dual residual stacking framework is employed, encompassing both backcast and forecast components. The backcast component generates outcomes for backward predictions, while the forecast component generates outcomes for forward predictions. The residual stacks receive input or output in the form of a multi-step horizon. The backward expansion, also known as backcasting, is a technique utilized in quantitative analysis to enhance the predictive capabilities of models by eliminating irrelevant input components. On the other hand, the forward expansion, or forecasting, strives to yield highly accurate prediction outcomes [13]. The summation of all the stacks yields the ultimate prediction outcome. Subsequently, the cumulative outcome of each forecast derived from every block in the stack will be aggregated to yield the overall prediction outcome for that particular stack (Fig. 2).
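To make the doubly residual stacking idea concrete, the minimal numpy sketch below shows how one stack of blocks could combine backcast and forecast outputs: each block removes its backcast from the running residual and adds its forecast to the running prediction. The random "blocks" stand in for trained fully connected networks and are purely illustrative, not the model implementation used in the paper.

import numpy as np

rng = np.random.default_rng(0)
lookback, horizon, n_blocks = 30, 5, 3

def block(residual):
    """Stand-in for a trained FC block: returns (backcast, forecast) for the residual."""
    W_back = rng.normal(scale=0.1, size=(lookback, lookback))
    W_fore = rng.normal(scale=0.1, size=(lookback, horizon))
    hidden = np.maximum(residual, 0.0)          # ReLU activation, as in the N-BEATS blocks
    return hidden @ W_back, hidden @ W_fore

x = rng.normal(size=lookback)                   # the input (lookback) window
residual, forecast = x.copy(), np.zeros(horizon)

for _ in range(n_blocks):
    backcast, block_forecast = block(residual)
    residual = residual - backcast              # backward residual: remove what was explained
    forecast = forecast + block_forecast        # forward path: accumulate partial forecasts

print("stack forecast:", forecast)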

3 Methodology The researcher employed a prototyping approach [15] as the research methodology for this study. This involved initially identifying pertinent issues from the introduction, followed by an extensive literature review. During this review, the author diligently sought and examined various scholarly papers, scientific articles, and books that provided valuable data to enhance the depth and breadth of insights for this research endeavor. Furthermore, the author employs advanced data mining techniques, leveraging the powerful Python library yfinance, to acquire meticulously curated datasets. This library seamlessly extracts valuable financial information from the esteemed online platform https://finance.yahoo.com, ensuring the highest quality and accuracy of the gathered


Fig. 2. N-BEATS Architecture

data. This study will utilize the stock data of BBCA, BBRI, AALI, BUMI, and MEGA. The researcher leverages a comprehensive dataset encompassing stock price information spanning from January 1, 2017 to January 1, 2023. The specific variables of interest within this dataset are the Close Price and Volume. Subsequently, the crucial step of data preparation will be undertaken, encompassing essential tasks such as meticulous data cleansing and meticulous data scaling [7]. Data cleaning is performed by eliminating missing values, and subsequently transforming the data into a time series object to align with the designated library utilized in this research, specifically Darts [9]. Meanwhile, data scaling is achieved by employing a minmaxscaler from Scikit-learn [14] to guarantee uniformity in data frequency. The model is meticulously crafted by leveraging the powerful Python library Darts. In this endeavor, the authors have judiciously employed key parameters, such as input_chunk_length, output_chunk_length, batch_size, and random_state, which hold significant sway in both the N-HiTS and N-BEATS models. It is worth noting that the authors have chosen to optimize these hyperparameters using the cutting-edge Optuna framework [1]. Once the model has been prepared for deployment, it will undergo training and testing procedures utilizing the meticulously curated dataset. The training phase will encompass a comprehensive five-year span of data, while the subsequent testing phase will be conducted on a distinct one-year subset. The model’s predictive performance is subsequently assessed by employing the Mean Absolute Percentage Error (MAPE) metric. The Mean Absolute Percentage Error (MAPE) is calculated by taking the difference between the predicted value and the actual value, dividing it by the actual value, and then converting it to an absolute value. Finally, this


absolute value is expressed as a percentage [6]:

$$\mathrm{MAPE} = \frac{1}{n}\sum_{i=1}^{n}\left|\frac{P_i - Y_i}{Y_i}\right| \times 100\% $$

where P_i is the predicted value, Y_i is the actual value, and n is the prediction period over which MAPE is computed. In the results and discussion, forecasts are made for the short term (5 working days), the medium term (20 working days), and the long term (130 working days). In addition to these three periods, the two models discussed above are also contrasted on one-day-ahead forecasts against a naive baseline. Unified Modeling Language (UML) [15] and wireframes [16] are then used to design the web-based application system once the model has been obtained. Based on the resulting UML diagrams and wireframes, the author uses the Python library Streamlit to build the application. Black-box testing [15] is used to test the application, after which it is assessed using the five human factors and the eight golden rules [17]. The completed application is then released using Streamlit.
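To make the training and evaluation procedure concrete, the minimal sketch below shows how an N-HiTS model could be fitted on the five-year training portion and scored with MAPE on the one-year test portion using the Darts library. The synthetic stand-in series, the split date, and the hyperparameter values are illustrative assumptions, not the exact configuration used by the authors (their parameters are tuned with Optuna, see Sect. 4.2).

import numpy as np
import pandas as pd
from darts import TimeSeries
from darts.models import NHiTSModel
from darts.metrics import mape

# Synthetic stand-in for the scaled closing-price series (business-day frequency);
# in the study this comes from the yfinance/Darts preprocessing described in Sect. 4.1.
idx = pd.date_range("2017-01-02", "2022-12-30", freq="B")
values = np.cumsum(np.random.default_rng(0).normal(size=len(idx))) + 100.0
series = TimeSeries.from_series(pd.Series(values, index=idx))

# Roughly five years for training, the final year for testing.
train, test = series.split_after(pd.Timestamp("2022-01-01"))

model = NHiTSModel(
    input_chunk_length=90,    # lookback window length
    output_chunk_length=30,   # forecast block length
    batch_size=256,
    random_state=1,
    n_epochs=20,              # kept small for the sketch
)
model.fit(train)

# Forecast the test horizon and report MAPE as defined above.
pred = model.predict(n=len(test))
print(f"MAPE over the test year: {mape(test, pred):.2f}%")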

4 Result and Discussion

4.1 Data Preprocessing

The dataset utilized in this research comprised the aforementioned data points, accompanied by a description of their acquisition method, specifically through a data mining procedure facilitated by the yfinance Python library. The data undergoes processing utilizing the aforementioned methodologies, specifically by executing the data cleansing and data scaling procedures as elucidated earlier. Once the data has undergone processing, it is imperative to convert it into a Time Series Object to enable its utilization within the Darts Python library. The frequency of the data must be specified upon creating the Time Series Object, as it is crucial to account for the fact that the stock market operates solely on weekdays. In this case, the writer opts for the frequency "B" or Business Day to accurately capture this temporal aspect. In order to mitigate the presence of missing data during the training phase, it is imperative for the author to undertake a meticulous reiteration of the data cleansing procedure. This is necessitated by the existence of holidays that compel the stock market to remain closed on weekdays, thereby potentially resulting in vacant data points.

Algorithm 1 Null Data Checking
    For i = 0 To length of ticker_df
        If ticker_df[i] = null Then
            Print ticker_df[i]
        End If
    End For

The subsequent phase entails populating the data for unoccupied workdays, such as national holidays, in order to guarantee that the generated time series objects are devoid of any missing data points subsequent to the verification of data completeness.


The technique of forward imputation will be employed, whereby the missing data on a given day will be filled using the data from the preceding day. This approach aligns with the methodology employed during the data cleaning phase. Data scaling is performed utilizing the MinMaxScaler technique, ensuring the absence of any missing data blocks.

Algorithm 2 Data Scaling
    Initialize temp_df = ticker['Close']
    Initialize mms = MinMaxScaler()
    Initialize temp_array = numpy.array(temp_df)
    temp_array = temp_array.reshape(-1, 1)
    Initialize temp_mms = mms.fit_transform(temp_array)    # apply the min-max scaling
    temp_df = pandas.DataFrame(temp_df)
    temp_df = temp_df.rename(columns = {'TickerName': 'Close'})
    temp_df['Close'] = temp_mms

The last step in processing this data is to change the existing data into the form of Time Series Objects according to what is required by Darts.

Algorithm 3 Converting Data Into Time Series Object
    Initialize series
    series = TimeSeries.from_dataframe(temp_df, freq = 'B')
    Output series
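As a concrete illustration of Algorithms 1-3, the sketch below assembles the same steps in Python: downloading a ticker with yfinance, reindexing to business days with forward imputation, scaling the Close column, and wrapping the result in a Darts TimeSeries. The Yahoo ticker suffix ".JK" and the variable names are assumptions made for illustration; the exact script used by the authors is not shown in the paper.

import pandas as pd
import yfinance as yf
from sklearn.preprocessing import MinMaxScaler
from darts import TimeSeries

# Download the raw price history (".JK" is assumed here as the Yahoo suffix for IDX tickers).
raw = yf.download("BBCA.JK", start="2017-01-01", end="2023-01-01")
close = raw["Close"].squeeze()

# Algorithm 1: report any null data points.
print(close[close.isna()])

# Reindex to the business-day calendar and forward-fill holidays (forward imputation).
close = close.asfreq("B").ffill()

# Algorithm 2: min-max scale the Close series.
mms = MinMaxScaler()
scaled = mms.fit_transform(close.to_numpy().reshape(-1, 1))
temp_df = pd.DataFrame({"Close": scaled.ravel()}, index=close.index)

# Algorithm 3: convert to a Darts TimeSeries with business-day frequency.
series = TimeSeries.from_dataframe(temp_df, freq="B")
print(series)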

4.2 Model Construction The researchers employed the N-HiTS and N-BEATS models, incorporating meticulous hyperparameter tuning, to ascertain the model that exhibits optimal accuracy in this investigation. The researcher employed a set of parameters, namely input_chunk_length, output_chunk_length, batch_size, and random_state, to derive the model utilized in this investigation. Optuna, a powerful hyperparameter optimization framework, was employed to fine-tune the hyperparameters for a total of 100 iterations. Forecasts shall be generated for each model across multiple time horizons, which are segmented into three distinct classifications: near-term (5 days/1 week of business days), intermediateterm (20 days/1 month of business days), and long-term (130 days/6 months of business days). In order to assess the performance of the model, it will be juxtaposed with the naive baseline for one-day-ahead predictions. The parameters employed in both models for each stock are as follows (Table 1): Table 1. Hyperparameter Tuning Results On Each Stock Model

Model    Ticker  Input  Output  Batch_size  Random_state
N-HiTS   BBRI       90      30         256             1
N-HiTS   BBCA      800     190         256             1
N-HiTS   AALI      975     325          32             1
N-HiTS   BUMI      720     240         128             1
N-HiTS   MEGA      240      60         256             1
N-BEATS  BBRI      880      65         256             1
N-BEATS  BBCA      875      25          32             1
N-BEATS  AALI      245      60          64             1
N-BEATS  BUMI      345     110          32             1
N-BEATS  MEGA      600     115         256             1
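Following the tuning setup described above (the four parameters input_chunk_length, output_chunk_length, batch_size, and random_state, tuned over 100 Optuna trials), a minimal sketch of how such a search could be wired up is given below. The search ranges, the synthetic stand-in series, the validation split, and the n_epochs cap are illustrative assumptions, since the paper reports only the tuned parameters and the trial budget.

import numpy as np
import pandas as pd
import optuna
from darts import TimeSeries
from darts.models import NHiTSModel
from darts.metrics import mape

# Synthetic stand-in series; in the study this is the scaled closing-price TimeSeries.
idx = pd.date_range("2017-01-02", "2022-12-30", freq="B")
series = TimeSeries.from_series(
    pd.Series(np.cumsum(np.random.default_rng(1).normal(size=len(idx))) + 100.0, index=idx))
train, val = series.split_after(pd.Timestamp("2022-01-01"))

def objective(trial):
    # Search ranges are illustrative; the paper does not list the exact bounds used.
    model = NHiTSModel(
        input_chunk_length=trial.suggest_int("input_chunk_length", 30, 500),
        output_chunk_length=trial.suggest_int("output_chunk_length", 5, 130),
        batch_size=trial.suggest_categorical("batch_size", [32, 64, 128, 256]),
        random_state=1,
        n_epochs=10,                                   # small cap to keep each trial cheap
    )
    model.fit(train)
    return mape(val, model.predict(n=len(val)))        # minimise validation MAPE

study = optuna.create_study(direction="minimize")
study.optimize(objective, n_trials=100)                # 100 trials, as reported in the paper
print(study.best_params)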

4.3 Model Result and Analysis

The model will then continue to undergo tests that are evaluated using MAPE during the four predetermined periods after having undergone hyperparameter tuning and training (Table 2). The projections in Table 2 are given as MAPE values, calculated with the formula described earlier, starting from January 3, 2022. Each data point is computed individually, and the results are aggregated and divided according to the designated time period. For a prediction horizon of 1 day, the forecasted value P_i and the observed value Y_i on January 3, 2022 are compared and the result is multiplied by 100% to yield a percentage. The identical calculation is applied to each data point over the longer periods, and the resulting percentages are then summed and divided by the period length to obtain an average. Based on the observed prediction outcomes of N-HiTS, N-BEATS, and the Naive Baseline, it can be deduced that there exist shared patterns in stock price forecasting utilizing the N-HiTS and N-BEATS methodologies. Specifically, these models encounter challenges in accurately predicting prices over extended time horizons, despite the beneficial influence of the volume variable. Despite the inherent limitations in the accuracy of prediction results, it is evident that the N-HiTS model exhibits a discerning ability to interpret movement patterns across various stocks. Based on the author's empirical findings, both models exhibit limitations in accurately predicting volatile price movements or the emergence of substantial price spikes, which have the potential to substantially disrupt established price patterns. The occurrence of significant price spikes in the BUMI and MEGA stocks posed a challenge for the model, as it failed to anticipate these abrupt movements. This phenomenon can manifest due to various factors, including exogenous variables, which the author regrettably cannot incorporate as determinants in this study. Examples of exogenous factors encompass the prevailing condition of the Indonesian economy, pertinent news pertaining to companies, and myriad

Table 2. Comparison of N-HiTS, N-BEATS, and Naive Prediction Results (MAPE by prediction period in working days)

Ticker Name  Model     1 Day    5 Days   20 Days  130 Days
BBRI         N-HiTS     2.56%    2.57%    2.16%     3.41%
BBRI         N-BEATS   10.35%   10.49%    5.54%     8.29%
BBRI         Naive      1.67%    -        -         -
BBCA         N-HiTS     3.68%    2%       1.72%     4.27%
BBCA         N-BEATS    1.22%    4.31%    8.48%     6.8%
BBCA         Naive      0.34%    -        -         -
AALI         N-HiTS     2%       1.22%    2.29%     7.27%
AALI         N-BEATS    8.6%     5.6%     5.53%     8.13%
AALI         Naive      0.78%    -        -         -
BUMI         N-HiTS     4.61%    2.74%    5.06%     8.03%
BUMI         N-BEATS   13%       8.53%    9.2%      9.83%
BUMI         Naive      1.52%    -        -         -
MEGA         N-HiTS     0.19%    0.37%    7.21%    53.68%
MEGA         N-BEATS    7.52%    4.4%    10%       26.38%
MEGA         Naive      0.29%    -        -         -

other variables. Furthermore, the author posits that an additional variable that may introduce bias into the stock prediction outcomes derived from the N-HiTS and N-BEATS models is the intricacies of data processing. As mentioned by the authors earlier, both models exclusively accept time series entities. The authors have set the frequency of the objects to “B” or Business Day, implying that if a holiday or a day when the Indonesian stock market is closed occurs, the stock data for that particular day will be absent. However, it is crucial to note that the time series object cannot contain any empty values. Therefore, the authors have employed forward imputation to fill in the missing values. The utilization of historical data by the author to fill in the gaps may potentially undermine the accuracy of the stock price prediction model. Even though N-HiTS falls short in terms of accuracy when compared to the naive baseline, it is evident that N-HiTS outperforms the accuracy results of the naive baseline in the case of MEGA stock. In contrast to the outcomes presented by N-BEATS, the subsequent day’s forecasts demonstrate the capacity of N-HiTS across various stocks to approximate predictions derived from the simplistic baseline approach. Despite the utilization of the N-HiTS model, an advanced iteration of the N-BEATS model, aimed at addressing the challenge of long-term forecasting, the inherent difficulty of long horizon forecasting persists. The N-HiTS model demonstrates superior predictive capabilities compared to N-BEATS across various stocks, with the exception of MEGA, particularly


in the long-term horizon. This observation is derived from a comprehensive analysis of short, medium, and long-term prediction outcomes. The authors posit that, although N-HiTS’s predictive outcomes may not exhibit complete precision, they do manifest an enhancement over N-BEATS when undertaking prognostications for extended time horizons. 4.4 Application Result The author plans to create a straightforward web-based application based on the findings of the aforementioned analysis with the objective of implementing the findings of the aforementioned N-HiTS model to forecast stock prices. The dataset that the author previously mentioned as well as the N-HiTS model that the author trained using the results will both be used in this application. This application was created to allow users to forecast stock prices, particularly for time periods outside those covered by the author’s analysis. This allows users to pick and choose the stocks they want to forecast as well as their own time frames. The application should be used as follows: 1. Users can choose the homepage page to view details about models and time series predictions in general, as well as instructions on how to use the program (Fig. 3).

Fig. 3. Homepage Page of Application

2. Users can select the contact page to view author information and contacts (Fig. 4).
3. Users can make stock predictions by going to the predict page. The following is how to use the page:
   a. The user selects the ticker (stock) that he wishes to forecast. In addition, users can view a price chart of the stock by checking the "display chart" box (Fig. 5).
   b. The user specifies the duration of the desired prediction. In addition, users can check the MA box to see the stock's Moving Average (Fig. 6).


Fig. 4. Contact Page of Application

Fig. 5. Choose Ticker on Predict Page

   c. After the user has determined the ticker and duration, a configuration column will appear, where the user can press the Predict button if it is correct (Fig. 7).
   d. Prediction results will be displayed on the user's main screen (Fig. 8).
This application can be accessed by using the following link https://naeljeff-nhitsstock-web-project-controller-tjth0y.streamlit.app/.
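The predict-page flow just described maps naturally onto a few Streamlit widgets. The minimal sketch below is an assumed reconstruction of that flow, not the authors' actual source code; the widget labels, the ticker list, and the load_close_prices and run_nhits_forecast helpers are hypothetical stand-ins.

import numpy as np
import pandas as pd
import streamlit as st

def load_close_prices(ticker):
    # Hypothetical loader; in the real app this would read the prepared price history.
    return pd.Series(np.cumsum(np.random.randn(250)) + 100, name=ticker)

def run_nhits_forecast(ticker, horizon):
    # Hypothetical wrapper around a trained N-HiTS model for the chosen ticker.
    return pd.Series(np.cumsum(np.random.randn(horizon)) + 100, name=f"{ticker} forecast")

st.title("Multivariate Stock Prediction with N-HiTS")

# Step a: choose the ticker and optionally display its price chart.
ticker = st.selectbox("Ticker", ["BBCA", "BBRI", "AALI", "BUMI", "MEGA"])
if st.checkbox("display chart"):
    st.line_chart(load_close_prices(ticker))

# Step b: choose the prediction duration and optionally a moving-average overlay.
horizon = st.number_input("Prediction duration (business days)", 1, 130, 5)
show_ma = st.checkbox("MA")

# Steps c-d: confirm the configuration and show the prediction results.
st.write({"ticker": ticker, "horizon": int(horizon), "moving average": show_ma})
if st.button("Predict"):
    st.line_chart(run_nhits_forecast(ticker, int(horizon)))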


Fig. 6. Choose Prediction on Predict Page

Fig. 7. Configuration Column and Predict Button


Fig. 8. Prediction Result

5 Conclusion

The N-HiTS model, a cutting-edge approach for time series analysis in stock prices, demonstrates impressive performance across various prediction horizons. Notably, it excels in forecasting the next one-day, short term (5 days), and medium term (20 days) movements, particularly for stocks characterized by limited price fluctuations. Even for highly capitalized stocks, the model exhibits superior performance compared to the simplistic baseline in terms of predictive accuracy. During the ongoing process, the model continues to face challenges in attaining satisfactory accuracy levels for long-term predictions spanning 130 days. Therefore, further efforts are required to enhance the model's performance and enable it to generate more precise predictions. However, it is evident that the N-HiTS model outperforms its predecessor, N-BEATS, in terms of prediction accuracy in the short and medium term. This superiority extends to the long term as well, except for MEGA stocks. Both models were meticulously crafted and refined through the intricate process of hyperparameter tuning, wherein each stock model was fine-tuned to attain optimal outcomes. The application leverages the Neural Hierarchical Interpolation for Time Series Forecasting (N-HiTS) model to empower users in conducting comprehensive time series analysis for predicting stock price movements. By offering valuable insights, it equips users with the necessary tools to analyze stocks and make informed decisions in the dynamic stock market environment. It is expected that the time series model will continue to evolve, particularly in prediction, so that it may be utilized to enhance people's daily lives, particularly if it is used for efforts that benefit people in general. The approaches presently employed in this study may still be studied and improved by doing more hyperparameter tuning and integrating new factors that help these models deliver more exact findings. Furthermore, it is planned that as the program evolves in the future, new stock choices with longer duration options will be available, and explanations may also be added to the prediction results to help regular users comprehend these findings.


References 1. Akiba, T., Sano, S., Yanase, T., Ohta, T., Koyama, M.: Optuna: a next generation hyperparameter optimization framework. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (2019) 2. Beyaz, E., Tekiner, F., Zeng, X., Keane, J.: Comparing technical and fundamental indicators in stock price forecasting. In: 2018 IEEE 20th International Conference on High Performance Computing and Communications; IEEE 16th International Conference on Smart City; IEEE 4th International Conference on Data Science and Systems (2018) 3. Challu, C., Olivares, K.G., Oreshkin, B.N., Garza, F., Mergenthaler, M., Dubrawski, A.: N-hits: Neural hierarchical interpolation for time series forecasting (2022) 4. Chen, J.: What is the stock market, what does it do, and how does it work? https://www.inv estopedia.com/terms/s/stockmarket.asp. Accessed 28 Mar 2023 5. Drakopoulou, V.: A review of fundamental and technical stock analysis techniques. J. Stock Forex Trad. 5, 1–8 (2016) 6. de Myttenaere, A., Golden, B., Le Grand, B., Rossi, F.: Mean absolute percentage error for regression models. Neurocomputing 192, 38–48 (2016). https://doi.org/10.1016/j.neucom. 2015.12.114 7. Fan, C., Chen, M., Wang, X., Wang, J., Huang, B.: A review on data preprocessing techniques toward efficient and reliable knowledge discovery from building operational data. Front. Energy Res. 9, 652801 (2021). https://doi.org/10.3389/fenrg.2021.652801 8. Hayes, A.: What is a time series and how is it used to analyze data? https://www.investope dia.com/terms/t/timeseries.asp. Accessed 30 Mar 2023 9. Herzen, J., et al.: User-friendly modern machine learning for time series. J. Mach. Learn. Res. 23, 5442–5447 (2022) 10. Inc., S. Streamlit documentation. https://docs.streamlit.io/. Accessed 17 May 2023 11. Kalyani, J., Bharathi, H.N., Jyothi, R.: Stock trend prediction using news sentiment analysis (2016) 12. Khairi, T., Mohammed, R., Ali, W.: Stock price prediction using technical, fundamental and news based approach, pp. 177–181 (2019) 13. Oreshkin, B.N., Carpov, D., Chapados, N., Bengio, Y.N-BEATS: neural basis expansion analysis for interpretable time series forecasting (2019) 14. Pedregosa, F., et al.: Scikit-learn: machine learning in python. J. Mach. Learn. Res. 12, 2825– 2830 (2011) 15. Pressman, R., Maxim, B.: Software Engineering: A Practitioner’s Approach. 9th edition (2019) 16. Roth, R., Hart, D., Mead, R., Quinn, C.: Wireframing for interactive web-based geographic visualization: designing the NOAA lake level viewer. Cartogr. Geogr. Inf. Sci. 44, 338–357 (2016) 17. Shneiderman, B.: Designing the user interface strategies for effective human-computer interaction. ACM SIGBIO Newsl. 9(1), 6 (1987). https://doi.org/10.1145/25065.950626 18. Somani, P., Talele, S., Sawant, S.: Stock market prediction using hidden markov model. In: 2014 IEEE 7th Joint International Information Technology and Artificial Intelligence Conference, pp. 89–92 (2014) 19. Soni, P., Tewari, Y., Krishnan, D.: Machine learning approaches in stock price prediction: a systematic review. J. Phys: Conf. Ser. 2161(1), 012065 (2020) 20. Wu, J., Xu, K., Chen, X., Li, S., Zhao, J.: Price graphs: Utilizing the structural information of financial time series for stock prediction. Inf. Sci. 588, 405–424 (2022) 21. Zou, J., et al.: Stock market prediction via deep learning techniques: a survey. 1(1) (2022)

Fractal Method for Assessing the Efficiency of Application of Closed Artificial Agroecosystems Alexander P. Grishin(B) , Andrey A. Grishin, and Vladimir A. Grishin Laboratory of Intelligent Robotic Tools and Climatic Equipment for Closed Ecosystems, Federal Scientific Agroengineering Center VIM, Moscow, Russian Federation [email protected]

Abstract. It is stated that the efficiency of closed artificial agroecosystems in comparison with open artificial agroecosystems is due to the fact that in closed artificial agroecosystems there is an opportunity to provide the required, abiotic factors to ensure effective development and growth of plants. It is shown that two main processes underlie plant productivity: the chemical reaction of photosynthesis and evaporative thermoregulation of this reaction. They have a cooperative, synergetic character - they complement and support the joint action aimed at the formation of plant product. The action of one of such factors consists in obtaining by a plant the energy of light radiation, which has two components: the photosynthetic component of light radiation and the accompanying thermal component of light radiation. Moreover, they are components of one phenomenon - light radiation. Therefore, we have chosen the factor of light radiation as an evaluating factor among all abiotic factors, and the time series of green mass weight growth as a productivity criterion. It is shown that many real time series are characterized by invariance with respect to scale transformations, in connection with which the standard Gaussian statistics turns out to be untenable, and the problem of time series research is reduced to the analysis of stochastic self-similar processes that can be described by fractal sets having their own fractal dimensionality. Potato variety Desiree was chosen as a material for research. The fractal dimension of the series in our research was determined using the method of normalized spread or R/S analysis, which is the basis of fractal analysis. The results of calculations of numerical characteristics of plant weight time series: linear trends of weight changes and fractal dimensionality of weight change processes under different lighting regimes are presented. Under artificial illumination the coefficient at argument is 4.4 times higher than under natural illumination, which proves a greater rate of growth of green mass weight in the first case, while the fractal dimension has a value closer to 1 by 0.07 units, which is an indicator of Effective Productivity under artificial illumination. #COMESYSO1120. Keywords: Fractals · Fractal dimension · Time series · Closed artificial agroecosystems · Light radiation · Linear trend equations


1 Introduction It is known that the efficiency of closed artificial agroecosystems in comparison with open artificial agroecosystems is due to the fact that in closed artificial agroecosystems there is an opportunity to provide the required abiotic factors to ensure effective development and growth of plants. The action of one of such factors consists in obtaining by the plant the energy of light radiation, which has two components: the photosynthetic component of light radiation and the accompanying thermal component of light radiation. The first component directly participates in the chemical reaction of photosynthesis, the second provides the optimal temperature of this reaction for maximum efficiency of photosynthesis, based on the self-organizing process of thermoregulation in the plant, with the help of evaporative cooling and, that is, provide growth of productive mass. These two processes: the chemical reaction of photosynthesis and the evaporative thermoregulation of this reaction have a cooperative, synergistic character - they complement and support the joint action aimed at the formation of a plant product. Besides, they are components of one phenomenon - light radiation. Therefore, we chose the factor of light radiation as an evaluative factor of all abiotic, to assess the effectiveness. And as an indicator of efficiency we considered the weight of green mass in the form of a time series, which due to the impact of the environment will be random in nature. And for closed artificial agro-ecosystems this impact will be weakened due to the exclusion of the influence of clouds, air currents and other limiting phenomena. As characteristics of time series we will apply traditional probabilistic-statistical characteristics, as well as sections of statistics, where time series represent stationary random, diffusion or spot processes. The most common methods use correlation and spectral analysis, data smoothing and filtering, autoregression and forecasting models [1]. However, many real time series are characterized by invariance with respect to scale transformations, in this connection standard Gaussian statistics turns out to be untenable, and the problem of time series research is reduced to the analysis of stochastic self-similar processes, which can be described by fractal sets [2] having their fractal dimension [3]. Fractal dimension, is an indicator of the complexity of a time series curve. By analyzing the alternation of areas with different fractal dimensionality and how the system is affected by external and internal factors, we can learn to predict the behavior of the system and, most importantly, diagnose and predict unstable states. Fractal dimensionality characterizes the filling of space and describes the structure at fractal scaling. For physical fractals, this transformation takes place in space. For time series the scale changes in time, Fig. 1 [4]. The value of fractal dimension can serve as an indicator of the number of factors affecting the system. If the fractal dimension is less than 1.4, the system is influenced by one or more forces that move the system in one direction. If the fractal dimension is about 1.5, the forces acting on the system are multidirectional, but more or less compensate each other. The behavior of the system in this case is stochastic and is well described by classical statistical methods. If the fractal dimension is much higher than 1.6, the system becomes unstable and is ready to move to a new state [5]. 
The fractal dimension of a time series is a function of the change in scale from the time period. The fractal dimension of a time series gives an estimate of the degree


Fig. 1. Fractal dimensions of time series of different filling.

of brokenness of the time series as well as its intrinsic character. A straight line has a fractal dimension of 1, while the fractal dimension of a random time series with normal distribution is 1.5 [6]. The fractal dimension is important because it emphasizes that a process can be between deterministic character (line with fractal dimensionality 1) and random (fractal dimensionality 1.50). In general, the fractal dimension of a series can range from 1 to 2. The closer to one, the more predictable the process becomes. Fractal dimension close to one indicates a combination of deterministic behavior and random properties of the process, which is characteristic of nonlinear phenomena between order (structure) and disorder (chaos). Artificial lighting, characteristic of closed artificial agroecosystems has a stable character, independent of external influences, so it can be assumed that for the variant with artificial lighting time series of weights will have a more deterministic character, with a smaller fractal dimension and close to 1. This qualitative assessment of the efficiency of closed artificial agroecosystems, it requires quantitative confirmation. Such confirmation is the purpose of the present research, where probabilistic characteristics together with fractal analysis will serve as a tool for such confirmation.

2 Materials and Methods

Potato variety Desiree was chosen as a material for the research. The fractal dimension of the series in our research was determined using the method of normalized spread or R/S analysis, which is the basis of fractal analysis. For many time series, the normalized spread R/S, which is the ratio of the difference between the maximum and minimum values of the observed flow rate R for a given time interval (lag n) and the standard deviation S calculated for flow rate values of the same lag, is well described by the empirical expression:

$$R/S = c\,n^{H}, \tag{1}$$


where c is a constant and H is the Hurst exponent, which is related to the fractal dimension D by the relation D = 2 − H. Thus the Hurst exponent H is the measure through which the fractal dimension is determined [7]. Light radiation was provided by a full-spectrum LED light fixture with an intensity of 500 µmol·m⁻²·s⁻¹, providing constant illumination of the culture. All factor values were recorded automatically on the memory card of the data logger. Productivity was monitored by measuring plant weight gain using ML-A01 scales (measurement accuracy 0.01 g) in three replicates from 03.04.2021 to 18.04.2021, and the data were then processed on a computer in MS Excel. The growth of productive mass was determined by weighing the plant together with the cover and subtracting the weight of the cover measured beforehand.
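A minimal sketch of how the Hurst exponent and the resulting fractal dimension could be estimated from such a weight series with R/S analysis is given below; the window sizes and the synthetic example series are illustrative assumptions, not the authors' processing script.

import numpy as np

def hurst_rs(series, lags=(4, 5, 6, 8, 10, 12)):
    """Estimate the Hurst exponent H from R/S = c * n^H and return (H, D = 2 - H)."""
    series = np.asarray(series, dtype=float)
    log_n, log_rs = [], []
    for n in lags:
        rs_values = []
        # Split the series into non-overlapping windows of length n.
        for start in range(0, len(series) - n + 1, n):
            window = series[start:start + n]
            dev = np.cumsum(window - window.mean())   # cumulative deviation from the mean
            R = dev.max() - dev.min()                 # range of the cumulative deviation
            S = window.std()                          # standard deviation of the window
            if S > 0:
                rs_values.append(R / S)
        if rs_values:
            log_n.append(np.log(n))
            log_rs.append(np.log(np.mean(rs_values)))
    H, _ = np.polyfit(log_n, log_rs, 1)               # slope of log(R/S) vs log(n) is H
    return H, 2.0 - H

# Illustrative use on a synthetic normalized weight series (not the measured data).
weights = np.linspace(0.27, 1.0, 15) + 0.02 * np.random.default_rng(0).standard_normal(15)
H, D = hurst_rs(weights)
print(f"H = {H:.2f}, fractal dimension D = {D:.2f}")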

3 Discussion

The results of measurements are shown in Table 1. The values of the measured weights are also given there in the reduced-to-maximum form m, which ensures homogeneity of the obtained values and convenience of their further analysis. The results of calculations of numerical characteristics of time series of plant weights: linear trends of weight changes and fractal dimensions of weight change processes under different lighting regimes are also given here.

Table 1. Results of measurements of time series of average values of plant weight and calculation of their numerical characteristics under artificial and natural lighting.

           Artificial lighting                    Natural lighting
Day, τ     Average plant weight, g   m, r.u.      Average plant weight, g   m, r.u.
 1.00             0.51                0.27               13.10               0.83
 2.00             0.56                0.29               13.11               0.83
 3.00             0.99                0.52               13.21               0.84
 4.00             0.74                0.39               13.79               0.87
 5.00             0.84                0.44               13.94               0.88
 6.00             0.99                0.52               13.87               0.88
 7.00             0.85                0.45               14.16               0.90
 8.00             1.43                0.75               14.33               0.91
 9.00             1.59                0.84               14.61               0.93
10.00             1.69                0.89               14.85               0.94
11.00             1.76                0.93               15.46               0.98
12.00             1.85                0.97               15.26               0.97
13.00             1.88                0.99               15.63               0.99
14.00             1.89                0.99               15.75               1.00
15.00             1.90                1.00               15.78               1.00

R/S = c·n^H                            R/S = 0.6601·n^0.6571                 R/S = 0.5843·n^0.5909
Fractal dimension                      D = 1.34                              D = 1.41
Linear trend of normalized time        m = 0.0595τ + 0.2137                  m = 0.0136τ + 0.8095
series of plant weight

4 Conclusion From the equations of linear trends we can see that under artificial lighting the coefficient of the argument is 4.4 times higher than under natural lighting, which proves the higher growth rate of green mass weight in the first case, while the fractal dimension has a value closer to 1 and is 1.34 instead of 1.41.

References 1. Tyurin, Y.N., Makarov, A.A.: Statistical Analysis of Data on Computer. INFRA-M, Moscow (1998) 2. Olemskoy, A.I., Borisyuk, V.N., Shuda, I.A.: Multifractal analysis of time series. Bull. SumDU. Ser. Phys. Math. Mech. 2, 11 (2008) 3. Barabash, T.K., Maslovskaya, A.G.L.: Computer modeling of fractal time series. Bull. Amur State Univ. Ser. Nat. Econ. Sci. 49, 31–38 (2010) 4. Nikolaou, G., Neocleous, D., Kitta, E., Katsoulas, N.: Estimation of aerodynamic and Canopy resistances in a Mediterranean greenhouse based on instantaneous leaf temperature measurements. Agronomy 10, 1 (2020) 5. Venetsky, I.G., Venetskaya, V.I.: Basic Mathematical and Statistical Concepts and Formulas in Economic Analysis. Statistics, Moscow (1979) 6. Andreev, Y., Makeeva, T., Pukhova, E., Sevryugin, V., Sherstnev, G.: Technical Means of Digital Information Processing Systems. Ivan Fedorov Moscow State University of Printing Arts, Moscow (2015) 7. Kai, X., Liang, G., Hong, Y.: A naturally optimized mass transfer process: the stomatal transpiration of plant leaves. J. Plant Physiol. 234–235, 138–144 (2019)

Application of a Bioinspired Search Algorithm in Assessing Semantic Similarity of Objects from Heterogeneous Ontologies Vladislav I. Danilchenko(B) , Eugenia V. Danilchenko, and Victor M. Kureychik South Federal University, Taganrog, Russia {vdanilchenko,lipkina,vmkureychik}@sfedu.ru

Abstract. This study focuses on addressing theoretical aspects in knowledge search management and interdisciplinary intellectual information architecture initiation, specifically in semantically directed search. The objective is to develop promising approaches in computer science and information retrieval systems, integrating knowledge from chaotic clusters into subject domains for modeling new information systems. The study's significance lies in its proposed solution to the examined problem, applicable to various NP-hard problems and expanding the use of information-intelligent ordered clusters. The article explores enhancing the efficiency of search algorithms for semantic similarity analysis in expert linguistic information, initializing from subject-specific text collections for use in intellectual information systems. Current methods struggle with semantic integration due to complexity. Heuristic methods using resultant ontologies commonly address semantic heterogeneity. This study suggests bioinspired search algorithms to tackle semantic data heterogeneity. The modified white mole colony algorithm is employed, enabling semantic-level data analysis, subject domain identification, and interaction understanding. It facilitates interaction between information systems based on a unified ontology. Semantic search is pivotal for knowledge management technology development. Diverse search architectures enhance search quality and efficiency. Modified algorithms based on this research can improve result quality. Software modules were developed to simulate the proposed algorithm, and results were analyzed and compared with classical algorithms, affirming the proposed solution's effectiveness.

1 Introduction

Textual info processing is vital for tasks like data formalization, knowledge extraction, and integration. Methods like ontologies and clustering enhance efficiency. This study improves semantic similarity search in linguistic expert info using text collections. Rooted in bioinspired search theory, the research develops algorithms for info systems. Ontologies automate extraction, improving analysis for modern systems. A conflict exists between quality retrieval and timely formalization. Relationships in the subject domain encompass hierarchy, equivalence, and association. "Simple ontology" describes a basic form. Ontologies vary based on domain needs. They formalize concepts, relationships,


and functions. Ontologies enhance databases, classification, and retrieval, promoting understanding and coherent knowledge exchange [1–4]:

$$Q = \langle X, K, F \rangle \tag{1}$$

The basic relationships in an ontology typically include “instance - class” and “part - whole,” among others. While the concepts within a subject domain may be specific to each ontology, relationships are more universal. For the sets imposed on components K and F in this definition, a natural limitation is their finiteness and non-emptiness. If the components are finite, degenerate cases associated with their emptiness are possible [4–8]. Bioinspired algorithms have become a revolutionary direction in the field of artificial intelligence. They draw inspiration from the intellectual behavior of insects and animals in nature, including ant colonies, bird flocks, bee colonies, bacteria, moles, and stem cells. These algorithms are attracting increasing attention due to their ability to efficiently solve complex problems where traditional algorithms may fail or struggle to find optimal solutions. Such innovative approaches in artificial intelligence open up new prospects for applications in various research areas and practical implementations. In this work, a modified bioinspired algorithm called the colony of white moles is proposed, which allows harnessing the advantages of bioinspired algorithms within the context of the discussed problem.
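As a small illustration of the formal model Q = ⟨X, K, F⟩, the sketch below represents a toy ontology as a set of concepts, a set of typed relationships such as "instance - class" and "part - whole", and a dictionary of interpretation functions; the concrete concept names are invented for the example and are not taken from the paper.

from dataclasses import dataclass, field

@dataclass
class Ontology:
    """Toy container mirroring Q = <X, K, F>: concepts, relationships, functions."""
    concepts: set = field(default_factory=set)      # X
    relations: set = field(default_factory=set)     # K: (type, source, target) triples
    functions: dict = field(default_factory=dict)   # F: interpretation functions

# Hypothetical example: a fragment of a vehicle ontology.
q = Ontology(
    concepts={"vehicle", "car", "wheel"},
    relations={("instance-class", "car", "vehicle"),
               ("part-whole", "wheel", "car")},
    functions={"label": lambda concept: concept.capitalize()},
)

# The components are required to be finite and non-empty; check non-emptiness here.
assert q.concepts and q.relations and q.functions
print(q.functions["label"]("car"), "is a", "vehicle")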

2 Analytical Overview and Problem Statement In today’s world, vast information is spread across diverse data sources, including traditional databases and new online platforms. Integration methods enable efficient utilization of such varied content, enhancing accessibility and flexibility. Researchers, developers, and users benefit from streamlined work processes, data exchange, and insights from heterogeneous sources. This progress in information technology empowers various tasks and needs [4–6]. For instance, valuable information may be scattered among multiple sources, complicating analysis. Data integration yields comprehensive insights by combining data from various sources, enabling a holistic understanding [6]. Tools and technologies exist to facilitate data integration, such as Extract, Transform, Load (ETL) systems. Data integration is crucial for comprehensive data representation, aiding tasks like business decisions or scientific research. To implement a data integration system successfully, key tasks include: Developing Data Storage Architecture: Designing optimal storage methods considering volume, types, accessibility, and performance. Creating Data Model Mappings and Integrative Methods: Developing mechanisms to link and consolidate data from diverse sources into a centralized model. Metadata Analysis: Examining source data characteristics and properties for accurate integration. Overcoming Data Heterogeneity: Addressing differences in data formats, semantics, and structures. Comparing and Evaluating Integrated Data: Developing metrics and methods to assess data quality and integrity. Utilizing Evolutionary and Bio-Inspired Algorithms: Employing nature-inspired algorithms to optimize


integration parameters. Ensemble Algorithm Approach: Combining interconnected algorithms for quasi-optimal solutions. Ontologies for Comprehensive Data Representation [5–9]: Utilizing ontologies to harmonize and transform heterogeneous data. Accounting for Multiple Extrema and Parameters: Considering stochastic optimization to handle multiple extrema and parameters. In conclusion, data integration from various sources is pivotal for efficient information processing. It enhances understanding, decision-making, and scientific research, driven by advanced technologies and methodologies.

3 Approach to Semantic Similarity Assessment

For initializing the linguistic expert information model, principles of linear algebra are employed. In this approach, the information complex of interconnected linguistic units is represented as a set of vectors. The distance between vectors determines the semantic similarity between linguistic units. The initial matrix is formed from the obtained set of vectors corresponding to the considered context. Different methods can be used to determine the distance between vectors, one of which is the cosine measure, utilized in this work [8–10]:

$$\frac{x \cdot y}{|x|\,|y|} = \frac{\sum_{i=1}^{n} x_i y_i}{\sqrt{\sum_{i=1}^{n} x_i^{2}}\,\sqrt{\sum_{i=1}^{n} y_i^{2}}} \tag{2}$$

Statistical measures like associations and associative coherence [4–9] are employed to determine the connectedness between linked vectors. Additionally, the measure described in [10] is used, which compares the frequencies of related vectors with independent vectors. If the value exceeds a given threshold, the vector linkage is considered a constant. This measure is denoted as M_i and is defined as follows [10]:

$$M_i = \log_2 \frac{f(n,c) \cdot N}{f(n) \cdot f(c)} \tag{3}$$

where n is the first word of the vector, c is the second word of the vector, f(n, c) is the frequency of the two dependent vectors, f(n) and f(c) are the absolute frequencies of each vector, and N is the total number of vectors. The work also employs the measure M_s, which determines the likelihood of similarity between two or more vectors [10–13]:

$$M_s = \frac{f(n,c) - \dfrac{f(n) \cdot f(c)}{N}}{f(n,c)} \tag{4}$$

And the measure M_l, reflecting the logarithmic likelihood function, as considered in [12]:

$$M_l = 2 \sum_{i=1}^{n} f(n,c) \cdot \log_2 \frac{f(n,c) \cdot N}{f(n) \cdot f(c)} \tag{5}$$

The "bag of terms" method [10–13] is also employed in this work. It represents the document d_i as a column vector. The length of the column vector is defined by N_w, where u_ij is the element for term w_j in this vector. The matrix D = (u_ij) is a two-dimensional array of linguistic expert information documents, where u_ij are the elements of this array:

$$D = \begin{bmatrix} u_{11} & \cdots & u_{1i} & \cdots & u_{1N_D} \\ \vdots & \ddots & \vdots & \ddots & \vdots \\ u_{j1} & \cdots & u_{ji} & \cdots & u_{jN_D} \\ \vdots & \ddots & \vdots & \ddots & \vdots \\ u_{N_w 1} & \cdots & u_{N_w i} & \cdots & u_{N_w N_D} \end{bmatrix}, \qquad \dim D = N_w \cdot N_D \tag{6}$$

where j = 1, …, N_w is the row number (w_j is the ordinal number of each term) and i = 1, …, N_D is the column number. The set of documents D is known, with each document d_i ∈ D being a random independent sample of terms, created from a subset of T. The conditional distribution of terms in documents p(w|d) can be found using term frequencies [13–18]. The topic distribution matrix O = (θ_ki) [11–14] is obtained:

$$O = \begin{bmatrix} p(t_1|d_1) & \cdots & p(t_1|d_i) & \cdots & p(t_1|d_{N_D}) \\ \vdots & \ddots & \vdots & \ddots & \vdots \\ p(t_k|d_1) & \cdots & p(t_k|d_i) & \cdots & p(t_k|d_{N_D}) \\ \vdots & \ddots & \vdots & \ddots & \vdots \\ p(t_{N_T}|d_1) & \cdots & p(t_{N_T}|d_i) & \cdots & p(t_{N_T}|d_{N_D}) \end{bmatrix} \tag{7}$$

where dim(O) = N_T · N_D; k = 1, …, N_T is the row number; i = 1, …, N_D is the column number; p(t_k|d_i) is the probability of term k being in section i; O_i is the column vector of section t_k. To transfer the topic matrix into the conceptual space, a term dictionary is used. It is important to note that such matrices in real applications can have millions of columns and rows. However, most of the matrix elements are filled with zeros, making the information manageable. To find associative connections, which may become part of the associative profile, term pairs with the highest cosine similarity coefficients should be selected.

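To ground measures (2)-(5), the sketch below computes the cosine similarity of two term vectors and the association measures M_i and M_s from raw co-occurrence counts; the toy vectors and frequencies are invented for illustration and the helper names are not taken from the paper.

import numpy as np

def cosine_similarity(x, y):
    """Cosine measure (2): dot(x, y) / (|x| * |y|)."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    return float(np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y)))

def mi_measure(f_nc, f_n, f_c, total):
    """Association measure (3): log2(f(n,c) * N / (f(n) * f(c)))."""
    return np.log2(f_nc * total / (f_n * f_c))

def ms_measure(f_nc, f_n, f_c, total):
    """Measure (4): (f(n,c) - f(n)*f(c)/N) / f(n,c)."""
    return (f_nc - f_n * f_c / total) / f_nc

# Toy term vectors (e.g. rows of a term-document matrix) and toy frequencies.
term_a = [3, 0, 1, 2]
term_b = [2, 1, 0, 2]
print("cosine:", round(cosine_similarity(term_a, term_b), 3))
print("Mi:", round(mi_measure(f_nc=8, f_n=20, f_c=15, total=1000), 3))
print("Ms:", round(ms_measure(f_nc=8, f_n=20, f_c=15, total=1000), 3))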
4 Modified Bioinspired Algorithm of White Mole Colony The white mole colony algorithm is inspired by the social behavior of blind naked mole rats in large colonies. It begins from the colony’s center, with worker and soldier mole rats searching for food sources. The algorithm operates chaotically within the problem space, and worker mole rats move from the center towards food sources, digging labyrinthine tunnels to locate them [5]. The algorithm finds applications in optimization tasks such as route planning, scheduling, and production management [4–6]. It leverages moles’ social behavior to efficiently solve complex optimization problems [3–8].


In this algorithm, each food source is represented by centroids, determining its position in space. Centroids can be calculated as mean values of coordinates, simplifying the search process. Worker mole rats, resembling targets, move from the colony's center to locate food sources and neighbors. Their movement is influenced by factors like temperature and humidity, represented as a damping coefficient in the algorithm [3–8]. Higher underground temperature leads to more active movement, influencing food source detection speed and overall colony efficiency. The algorithm's principles, inspired by ant colonies' food source detection, hold potential for developing machine learning and optimization algorithms. For instance, they can optimize complex transportation networks by finding efficient routes. This area of research presents opportunities for new algorithms and advancements in machine learning and optimization [3–8]. Additionally, the underground temperature in the algorithm is determined as follows:

$$H(x) = \rho(x)C(x)\,\frac{\partial T(x,t)}{\partial t}, \qquad (\rho C) = f_s(\rho C)_s + f_a(\rho C)_a + f_w(\rho C)_w, \qquad f_s + f_a + f_w = 1 \tag{8}$$

In this equation, the variable H represents the change in soil temperature at depth x. The values ρ(x) and C(x) represent the thermal properties of the soil, such as its density and specific heat capacity, respectively. While these properties may actually vary with distance, in this equation we treat them as constant values bounded within the range [2, 4]. The expression ∂T(x,t)/∂t shows the rate of temperature change over time and is updated at each step of the algorithm. The symbol f represents the contribution of each worker mole rat in percentage, and the subscripts s, a, w indicate soil components such as sand, air, and water, respectively. The sum of these components equals one, giving the weighted average used for calculating ρC. During the search for food sources, the damping coefficient A is updated at each iteration. When the temperature approaches zero (e.g., in real conditions, around 30 °C), the search for food source neighborhoods is performed with less intensity. When the temperature approaches the average value in the range [0, 1], the process of searching for food source neighborhoods is conducted with greater intensity. The damping coefficient is updated based on a time-dependent factor [16–20]:

$$A_i^t = A_i^t\left(1 - \exp\!\left(\frac{-\alpha t}{T}\right)\right) \tag{9}$$

where T is derived from Eq. (8), α is a random variable in the range [0, 1] with α = 0.95 used as a fixed parameter, and t is the iteration step.


Each food source is inspected by two worker mole rats, constituting half of the total number of workers in the colony. Worker mole rats store information about the initial quality of the food and the path to it in their memory. Upon returning to the colony, worker mole rats share information about the discovered food sources with other workers and the queen. For the classification of found food sources, the queen utilizes a probability P, determined as a combination of product quality and distance from the colony center: P = (α · Q) + (β · D)

(10)

where α and β are coefficients determining the weight of product quality and distance, respectively, Q is the quality of the food source, and D is the distance to the food source from the colony center. Mole rats display efficient leadership and resource utilization. In a University of Louisiana study, they select prime food sources, send worker pairs to collect and explore, and share information. They safeguard their colony by excluding low-cost points as potential intruders. The modified mole rat colony algorithm prevents local minima traps through mutation, replacing excluded points with new ones. This ensures resource efficiency and protection. A further equation describes the updating of eliminated points:

$$B_i^t = \varphi \cdot B_i^{t-1} \tag{11}$$

where ϕ ≥ 1 is a factor specified by the developer, B_i^t is the number of eliminated points for the i-th food source in iteration t. These equations collectively depict the behavior and strategies of mole rats in the context of the modified algorithm, demonstrating their abilities in efficient resource utilization, collaboration, protection, and overcoming optimization challenges. Below is the pseudocode of the modified white mole colony algorithm:


initializeColony();                      // Initialize the mole colony
while (!terminationConditionMet()) {
    evaluateObjectiveFunction();         // Compute the objective function values for each mole in the colony
    sortColony();                        // Sort moles by objective function value
    eliminateWeakestMoles(k);            // Exclude a specified number of worst moles from the colony
    replaceExcludedMoles();              // Replace excluded moles with randomly selected ones
    mutateMoles();                       // Mutate moles to update the colony
    updateColony();
}

// Function to initialize the mole colony
void initializeColony() { /* Initialize with random values */ }

// Function to compute the objective function value for each mole
void evaluateObjectiveFunction() { /* Calculate objective function values */ }

// Function to sort moles by objective function value
void sortColony() { /* Sort moles by objective function value */ }

// Function to exclude a specified number of worst moles from the colony
void eliminateWeakestMoles(int k) { /* Exclude worst k moles from the colony */ }

// Function to replace excluded moles with randomly selected ones
void replaceExcludedMoles() { /* Replace excluded moles with randomly selected ones */ }

// Function for mole mutation
void mutateMoles() { /* Mole mutation */ }

// Function to update the mole colony
void updateColony() { /* Update the mole colony */ }

// Function to check the termination condition of the algorithm
bool terminationConditionMet() { /* Check termination condition */ }

Listing 1. Pseudocode of the behavior of the white mole colony algorithm

The pseudocode continuously updates the mole colony, ensuring exploration and avoiding local minima. This mutation approach enhances resilience and solution quality. The mole colony algorithm includes stages: initialize positions, compute fitness, search with greedy selection. It performs global search with diverse strategies, reducing local optima. The white mole colony algorithm improves accuracy via progressive initial


solutions, using objective evaluations. It evolves solutions iteratively through information exchange, converging towards optimal solutions. This systematic approach, along with similar methods, boosts solution precision and quality.
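The loop above can be made concrete with a short Python sketch that follows the same stages (initialize, evaluate, sort, eliminate the worst, replace, mutate). It is a simplified reconstruction under assumed parameter choices, shown here on the Rosenbrock function that Sect. 5 uses for testing, and is not the authors' implementation.

import numpy as np

rng = np.random.default_rng(0)

def rosenbrock(x):
    return float(np.sum(100.0 * (x[1:] - x[:-1] ** 2) ** 2 + (1.0 - x[:-1]) ** 2))

def mole_colony(objective, dim=2, size=100, iters=5000, k=10, sigma=0.1):
    """Simplified white-mole-colony loop: keep good moles, resample and mutate the rest."""
    colony = rng.uniform(-2.0, 2.0, (size, dim))             # initializeColony()
    for _ in range(iters):
        fitness = np.array([objective(m) for m in colony])   # evaluateObjectiveFunction()
        colony = colony[np.argsort(fitness)]                 # sortColony()
        colony[-k:] = rng.uniform(-2.0, 2.0, (k, dim))        # eliminate worst k, replace randomly
        colony[1:] += sigma * rng.standard_normal((size - 1, dim))  # mutateMoles(), keep the best mole
    best = min(colony, key=objective)
    return best, objective(best)

# Population of 100 and 5000 iterations mirror the test setup reported in Sect. 5.
best_x, best_f = mole_colony(rosenbrock)
print("best point:", best_x, "objective:", best_f)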

5 Experiment

We developed a software module with functions for: (1) Ontology Creation: generating ontologies to link concepts and data attributes; (2) Concept Attribute Generation: creating attributes based on preset probabilities; (3) Semantic Similarity Evaluation: assessing semantic similarity between concepts using our modified white mole colony algorithm. In our study, we evaluated our algorithm's time complexity and compared it with bee colony, particle swarm, simulated annealing, and genetic algorithms (Table 1). Algorithms inspired by particle swarm, bee behavior, and bacterial search excel in design problems [6–22], offering efficiency and adaptability in diverse fields. Our algorithms handle high-dimensional problems efficiently, offering quick performance and flexibility. However, selecting probabilistic coefficients for objective function computation lacks clear guidelines [20–25]. Simulated annealing works for small dimensions (typically n < 100) like graph partitioning. It achieves equilibrium using a probabilistic approach. Genetic algorithms, utilizing natural selection and genetics, are effective search methods. They handle solution populations and overcome local optima. In conclusion, our approach using the modified white mole colony algorithm and ontology usage effectively solves optimization problems across domains.

Table 1. Algorithm Runtime Comparison

Graph Model  WCM Algorithm  Bee Colony Algorithm  Particle Swarm Algorithm  Simulated Annealing Algorithm  Genetic Algorithm
1000              65.7            67.1                   70.4                       70.1                       66.4
2000              68.2            70.4                   74.8                       73.5                       70.8
5000              75.4            76.7                   82.6                       83.7                       77.4
10000             81.6            83.2                   89.7                       90.1                       83.9
20000             92.2            96.8                  102.7                      104.5                       97.0
50000            104.1           109.5                  117.6                      115.4                      111.6

Genetic algorithms (GAs) are versatile stochastic optimization methods, commonly used for intricate optimization problems. GAs involves random exploration among potential solutions, making them adaptable but prone to local optima. The time complexity ranges from O(n^2log(n)) to O(n^3), with potential reduction via heuristics.


In contrast, the proposed modified algorithm is even more stochastic, seeking quasioptimal solutions based on selected criteria vectors, sequentially exploring both ontologies. This approach may outperform other methods in solution quality. Experiments confirm the modified algorithm’s superiority, as its speed results from its stochastic nature while ensuring quality solutions through efficient search procedures. The paper proceeds to test the algorithm on the Rosenbrock function, conducting 50 tests with varying random initial values. For the test experiments, 50 independent runs of the algorithm were executed on the test function. Each run utilized distinct random values to generate initial solutions. Each run spanned 5000 iterations, with a population size of 100 individuals. It becomes evident that when the population size exceeds 50 individuals (size > 50), the optimization algorithm’s performance diminishes as it encounters multiple optimal solutions. In such cases, efficient distribution of obtained solutions within the explored space is necessary, complicating optimal parameter selection. Testing results indicate that the white mole colony algorithm demonstrates similarity or, in some cases, outperforms other optimization algorithms. In essence, this confirms the algorithm’s capacity to escape local minima within the problem space and reach global minima.

6 Conclusion

This article introduces a modified optimization algorithm inspired by white mole colonies for solving semantic similarity in heterogeneous ontologies. It efficiently applies colony principles to optimization tasks. Compared to other methods like bee colony algorithms or genetic algorithms, this mole colony algorithm offers advantages. It has few parameters, making it versatile across tasks. It avoids local optima through innovative quasi-optimal solutions based on criteria vectors. It explores ontologies systematically, yielding higher quality solutions. Tests on the Rosenbrock function show the algorithm's superiority. Its stochastic nature ensures speed and efficient search. This polynomial-time algorithm is fit for complex tasks, including NP-hard problems. It excels in surpassing local minima and reaching global minima. In conclusion, this modified mole colony algorithm offers an efficient approach to optimizing semantic similarity in heterogeneous ontologies. Its versatility, efficiency, and global optimization capability make it appealing. Further research and application may refine and expand its use.

Acknowledgment. The study was supported by the grant from the Russian Science Foundation № 22-21-00316, https://rscf.ru/project/22-21-00316/, at the Southern Federal University.

References 1. Danilchenko, V.I., Kureichik, V.M.: Genetic algorithm for placement planning of VLSI. Izvestie YuFU 2, 75–79 (2019) 2. Lebedev, B.K., Lebedev, V.B.: Planning based on swarm intelligence and genetic evolution. Izvestiya YuFU. Techn. Sci. 4, 25–33 (2009) 3. Kovalev, A.V.: Method for designing high-performance low-power asynchronous digital devices. Izvestiya VUZov. Electron. 1, 48–53 (2009)


4. Bova, V.V., Kureichik, V.V.: Integrated subsystem of hybrid and combined search in design and control tasks. Izvestiya YuFU. Techn. Sci. 12, 37–42 (2010) 5. Lebedev, O.B.: Hybrid partitioning algorithm based on ant colony method and collective adaptation. In: Integrated Models and Soft Computing in Artificial Intelligence, pp. 620–628. Physmatlit, Moscow (2009) 6. Bushin, S.A., Kovalev, A.V.: Models of energy consumption of CMOS VLSI functional blocks. Izvestiya YuFU. Techn. Sci. 12, 198–200 (2009) 7. Bushin, S.A.: Method for reducing energy consumption in asynchronous VLSI blocks. In: Proceedings of X VNTK of students and postgraduates Technical Cybernetics, Radio Electronics and Control, (2), pp. 37–38 (2010) 8. Danilchenko, Y.V., Kureichik, V.M.: Bio-inspired approach to microwave circuit design. IEEE East-West Design & Test Symposium (EWDTS), pp. 362–366 (2020) 9. Kokolov, A.A., Dobush, I.M., Sheerman, F.I., Babak, L.I., et al.: Complex functional blocks of wideband radio frequency amplifiers for single-crystal L- and S-band receivers based on SiGe technology. In: Proceedings of the 3rd International Scientific Conference “EKB and Electronic Modules”, pp. 395–401. TechnoSphere, Moscow (2017) 10. Kureichik, V.M., Lebedev, B.K., Lebedev, O.B.: Hybrid evolutionary algorithm of VLSI planning. In: Proceedings of the 12th Annual Genetic and Evolutionary Computation Conference (GECCO 2010), pp. 821–822. Portland, OR (2010) 11. Bushin, S.A., Kovalev, A.V.: Evolutionary method of placement of heterogeneous blocks in VLSI. Izvestiya YuFU. Techn. Sci. 17, 45–53 (2010) 12. Zhabin, D.A., Garays, D.V., Kalentyev, A.A., Dobush, I.M., Babak, L.I.: Automated synthesis of low noise amplifiers using S-parameter sets of passive elements. In: Asia-Pacific Microwave Conference (APMC 2017), Kuala Lumpur, Malaysia (2017) 13. Bushin, S.A., Kovalev, A.V.: Evolutionary method of task allocation in system-on-chip to reduce energy consumption. In: “Intelligent Systems” and “Intelligent CAD” International Scientific and Technical Conferences, vol. 1, pp. 102–103. Physmatlit, Moscow (2010) 14. Babak, L.I., Kokolov, A.A., Kalentyev, A.A.: A new genetic-algorithm-based technique for low noise amplifier synthesis. In: European Microwave Week 2012, Amsterdam, The Netherlands, pp. 520–523 (2012) 15. Mann, G.K.I., Gosine, R.G.: Three-dimensional min–max-gravity based fuzzy PID inference analysis and tuning. Fuzzy Sets Syst. 156, 300–323 (2005) 16. Bova, V.V., Kuliev, E.V., Shcheglov, S.N.: Evaluation of the efficiency of association rule mining method for big data processing tasks. Izvestiya YuFU. Technical Sciences, Thematic issue “Intelligent CAD” (2020) 17. Furber, S.B., Day, P.: Four-phase micropipeline latch control circuits. IEEE Trans. VLSI Syst. 4, 247–253 (1996) 18. Furber, S.: Computing without clocks: Micropipelining the ARM processor. In: Birtwistle, G., Davis, A. (eds.) Asynchronous Digital Circuit Design, pp. 211–262. Springer-Verlag, New York (1995) 19. Alpert, C.J., Mehta, D.P., Sapatnekar, S.S.: Handbook of Algorithms for Physical Design Automation. CRC Press, New York (2009) 20. Neupokoeva, N.V., Kureichik, V.M.: Quantum and Genetic Algorithms for Placement of VLSI Components. Monograph. TTI SFedU, Taganrog (2010)

Leveraging Deep Object Detection Models for Early Detection of Cancerous Lung Nodules in Chest X-Rays Md. Tareq Mahmud(B) , Shayam Imtiaz Shuvo, Nafis Iqbal, and Sifat Momen Department of Electrical and Computer Engineering, North South University, Plot 15, Block B, Bashundhara, Dhaka 1229, Bangladesh [email protected]

Abstract. Timely identification of lung cancer is of utmost importance due to its high fatality rate, making it imperative for effective treatment strategies. The identification of malignant lung nodules from chest radiographs is a frequently neglected domain due to the limited capability of X-ray imaging in capturing minute entities. This study employed a variety of deep-learning models to identify malignant nodules in chest X-ray images. The study employed three deep-learning models, specifically FasterRCNN, Yolov5, and EfficientDet, for the purpose of training and validating their ability to detect lung nodules in X-ray images. Among the various models examined, YOLOv5 demonstrated the highest efficacy in the detection of malignant nodules. The model attained precision, recall, and mean average precision (mAP) scores of 89%, 84.6%, and 83%, respectively, when evaluated on the NODE21 dataset. When compared to notable works, we have attained the highest recall. In addition, a web application was developed to enable users to access real-time detection outcomes utilizing the trained YOLOv5 model. In conclusion, the findings of our study highlight the potential of utilizing deep learning techniques to enhance the precision and effectiveness of cancer detection in the field of medical imaging.

Keywords: Chest X-Ray (CXR) · You only look once (YOLO) · Computer Aided Diagnosis (CAD)

1 Introduction

Cancer remains a global public health concern, accounting for a significant proportion of mortality worldwide. In 2020 alone, cancer claimed an estimated 10 million lives [1]. Among the various types of cancer, lung cancer has emerged as a particularly formidable adversary, posing substantial challenges in terms of diagnosis, treatment, and prevention. While breast, colon, rectal, and prostate cancer are also significant contributors to the global cancer burden, lung cancer stands out as a particularly grave concern. It ranks as a leading cause of mortality, surpassing these other cancer types in terms of mortality rates [2]. In fact,


within the United States, lung cancer is the primary form of malignancy, posing substantial challenges to public health and healthcare systems [2]. Historically, lung cancer was a rare phenomenon, accounting for only 1% of observed cancer cases during autopsies conducted at the University of Dresden’s Institute of Pathology in Germany back in 1878. However, by 1918, the incidence rate had surged to over 10%, exceeding 14% by 1927 [3]. In the United States, 170,000 new cases of lung cancer were reported to be detected in 2000, and less than 15% of those patients were predicted to survive 5 years after diagnosis. The stage of the disease at the time of diagnosis has a significant impact on the prognosis of lung cancer patients. Patients with clinical stage IA have a 5-year survival rate of about 60%, whereas the 5-year survival rate for clinical stage II-IV disease ranges from 40% to fewer than 5% [4]. The gravity of lung cancer’s impact is evident today, as it continues to be the leading cause of cancer-related mortality among men globally and the second leading cause among women [5]. Recent data from 2020 alone reveals that over 2.2 million new cases of lung cancer were reported, underscoring the pressing need for effective prevention, early detection, and innovative treatment approaches [6]. Lung cancer primarily originates in the pulmonary tissues, affecting the cells lining the respiratory pathways [7]. Preliminary indicators of lung cancer, known as pulmonary nodules, manifest within the lungs of individuals long before any clinical symptoms become apparent [8]. The presence of these nodules can be discerned through diagnostic procedures such as chest X-rays (CXRs) or computed tomography (CT) scans. While CT scans offer enhanced sensitivity and accuracy in detecting lung nodules when compared to X-rays, their drawbacks, including the emission of harmful radiation, limited accessibility, and high cost, undermine their practicality. Conversely, X-rays present a safer, more affordable, and widely accessible alternative to other radiation-based diagnostic methods in nearly all regions across the globe. Consequently, X-rays are frequently employed in the diagnosis of lung cancer [9]. While chest X-rays (CXR) have the capability to identify lung cancer at its early stages, the diverse range of sizes exhibited by lung nodules poses a formidable challenge for radiologists. Furthermore, the density of pulmonary nodules exhibits significant variations across different anatomical regions within the human body. Additionally, the presence of bones and internal organs often obscures the visibility of lung nodules [8]. Concurrently, with the escalating number of new cases, the detection of malignant nodules carries a substantial risk of errors and proves to be an exceedingly arduous task. Consequently, the interpretation of CXRs can swiftly become a daunting undertaking for radiologists [8]. Deep learning models have become a promising solution in the medical field [10–12]. They demonstrate significant potential in detecting objects of various sizes, including complex tasks such as deepfake image detection [13]. These models are widely used for detecting very small objects in images, providing potential solution to the problem. Approaches for object detection can be broadly catego-


rized into two groups: region proposal algorithms, often referred to as two-stage methods, and regression or classification-based methods, also known as one-stage methods or real-time and unified networks [14]. The region proposal algorithm delineates a bounding box around the object of interest and subsequently classifies it based on the specific object category it encapsulates. While previous studies have leveraged deep learning techniques to identify malignant lesions in CT scans [15], the detection of nodules using CXR images proves considerably more challenging than nodule detection from CT scan slices, and thus, it remains an understudied domain. This study utilizes state-of-the-art object detection models for the purpose of detecting cancerous nodules in CXR images. The key contributions of this study are outlined as follows: – In this study, we have utilized three object detection models for the detection of cancerous lung nodules. The models employed are: Faster RCNN [16], You Only Look Once (YOLO) [17], and EfficientDet [18]. The primary objective of our study is to achieve precise detection and localization of cancerous lung nodules within Chest X-Ray (CXR) images. Specifically, our study employs these aforementioned models to effectively identify and delineate bounding boxes around the identified lung nodules. – A comprehensive evaluation and comparison of these three models was conducted using a range of performance metrics including precision, recall, and mean average precision (mAP). – Lastly, the model with the best performance is deployed in a web app using the PyTorch and Flask frameworks for real-time cancerous lung nodule detection from Chest X-rays.

2 Related Works

The early detection and diagnosis of cancer can significantly increase the chances of successful treatment and ultimately save lives. The use of deep learning models to detect cancerous nodules from chest X-ray (CXR) has been tried by a handful of researchers in the last decade. In 2016, [19] conducted a study in which they employed a Convolutional Neural Network (CNN) with a ResNet backbone architecture to classify chest radiographs as non-nodule, benign, or malignant. While they attempted to localize nodules within the radiographs, the localization approach was not as effective as they had hoped. Nonetheless, the study demonstrated the successful classification of radiographs containing nodules versus those without nodules. In another study, the authors of [20] explored the use of DenseNet-121, a 121layer convolutional neural network, along with transfer learning to classify lung cancer using chest X-ray images. The model was trained on a lung nodule dataset before being trained on the lung cancer dataset, yielding a mean sensitivity of 74.6% and providing a heatmap for identifying the location of the lung nodule. In a different study published in Scientific Reports in 2019, [21] investigated the short-term reproducibility of computer-aided detection (CAD) for detecting


pulmonary nodules and masses in consecutive chest radiographs (CXRs) of the same patient. They evaluated the reproducibility of CAD using four different convolutional neural network algorithms. The study showed that the eDenseYOLO algorithm performed better than other algorithms in terms of figure-of-merit analysis and percent positive agreement (PPA) and Chamberlain’s percent positive agreement (CPPA) metrics. The performance of artificial intelligence (AI) as a second reader for identifying lung nodules on chest X-rays (CXR) in comparison to radiologists from two separate institutions is examined in the [22] study. A CXR database containing various kinds of nodules was studied for the study. The AI software analysed the images after radiologist evaluation and identified CXRs with potentially overlooked nodules. Both automated and aided AI modes, which have the potential to increase diagnostic accuracy, showed promise in terms of sensitivity and F1score. A CAD algorithm was created by the authors of [23] using Convolutional Neural Networks (CNNs) to identify pulmonary nodules in CXRs. A retrospective cohort of patients with X-rays from 2008 who hadn’t previously been seen by radiologists underwent the algorithm’s use. The X-rays were divided into groups according on the likelihood of nodules, and later radiological evaluations by radiologists verified the presence of nodules, including cases with undiagnosed LC. A high number of CXRs were effectively sorted, and lung nodules were found with encouraging sensitivity and specificity scores. The model’s sensitivity and specificity were both 0.78, yielding an overall accuracy of 0.79. [24] study explored the impact of a deep learning-based artificial intelligence (AI) algorithm on lung cancer detection from chest X-rays (CXRs). The research involved 173 CXR images from cancer-positive patients and 346 images from cancer-negative patients, selected from the National Lung Screening Trial (NLST) dataset. Eight readers, including three radiology residents and five board-certified radiologists, participated in an observer performance test. Results showed that with AI, the average sensitivity for detecting visible lung cancer increased for radiology residents but remained similar for radiologists. Falsepositive findings per image decreased for radiologists but remained similar for residents. Our work builds upon these studies and detects cancerous nodules from chest X-rays with higher precision, and we have built a complete system that allows users to upload X-ray images and see the predictions in a web app. In addition, our system predicts the location of the nodules and generates a bounding box around them, providing a more detailed analysis of the X-ray images.


Table 1. Summary of related works

Techniques | Methodology | Results
Faster RCNN [19] | CNN with ResNet backbone architecture to classify chest radiographs as non-nodule, benign, or malignant | Successful classification of radiographs containing nodules versus those without nodules
DenseNet-121 [20] | DenseNet-121 with transfer learning to classify lung cancer using chest X-ray images | Mean sensitivity of 74.86% and a heatmap for identifying the location of the lung nodule
CNN [21] | CAD using four different CNN algorithms to detect pulmonary nodules and masses in consecutive chest radiographs (CXRs) | Reproducibility of CAD varied with different algorithms; the eDenseYOLO algorithm performed better than the other algorithms
AI Software [22] | Compares the performance of artificial intelligence (AI) as a second reader for detecting lung nodules on chest X-rays (CXR) with radiologists from two different institutions | Both AI modes, automated and assisted, showed promise in improving sensitivity and F1-score, suggesting their potential to enhance diagnostic accuracy
CNN [23] | The algorithm was applied to a database of past patient X-rays | A sensitivity of 0.78, a specificity of 0.80, and an overall accuracy of 0.79
Resnet34 [24] | Used a commercial AI software based on the Resnet34 CNN architecture with a self-attention mechanism on 519 X-rays | Better average sensitivity was observed with the use of the AI software

3 Methods and Materials

The technique utilized in this study involves detecting cancerous lung nodules from chest radiographs.

3.1 Faster-RCNN

Figure 1 is a streamlined illustration of the Faster-RCNN model for detecting lung cancer nodules. The NODE21 dataset contains CXR pictures. Hence, images must first be transformed to tensors in order to be loaded onto the GPU and then run through a Convolutional Neural Network (CNN) [25]. CNNs such as VGG or ResNet may be selected as the foundational feature extraction model. This study uses a ResNet50 CNN model to extract features from CXR images, yielding feature maps that are then passed to a Region proposal network (RPN). This RPN layer suggests locations where the object could potentially be located. As soon as the location of the malignant lung nodule is identified, the corresponding region is classified as foreground. The region of the image that does not include the malignant lung nodules is identified as a background class. The area designated as foreground class is passed on to the subsequent stage of the algorithm. This can be accomplished by having the RPN layer generate anchor boxes. Anchor boxes are a set of predefined bounding boxes. To detect objects of various sizes, multiple sizes of anchor boxes are constructed. To designate a region as a foreground class, the RPN must compute Intersection over Union (IOU). Here, the cancer nodule is denoted by actual boxes in the images, and the RPN generates some predicted boxes. The IOU is simply the area that overlaps the actual box and the predicted box. If the IOU value is greater than or equal


Fig. 1. FasterRCNN architecture

to 0.5, the area is classified as foreground; otherwise, it is classified as background. The algorithm gains insight from the regions where the IOU is greater than 0.5. The feature maps generated by the ResNet layer are then passed to a CNN layer with a kernel size of 3 × 3 and an output channel of 512. The output of this convolutional layer is transferred to the classification layer and the regression layer. The classification layer separates the foreground class from the background class, while the regression layer generates the bounding boxes containing the object's location. The regions, or anchor boxes, labeled as foreground class are transferred to the subsequent layer, known as the region of interest (ROI) pooling layer. The output of the RPN layer consists of the feature maps of those anchor boxes carrying the foreground class label. The ROI layer shrinks the various feature map sizes produced by the RPN to the same size: using max pooling, the feature map of each region proposal is converted into a fixed-size feature map, so all of the proposed region maps are of the same size. This is done because the proposed region containing malignant nodules must be flattened and passed to a fully connected layer. The ROI layer receives two inputs: the region of interest (region proposal) produced by the RPN, and the feature maps generated by ResNet. The ROI layer's output is flattened before being transmitted to fully connected layers. The output of these layers is then transferred to a classifier layer and a regressor layer. The classifier assigns the object a class, and the regressor creates and adjusts the bounding boxes as the final result. Faster-RCNN employs a loss function known as the RPN loss, which is defined as follows:

L(\{p_i\}, \{t_i\}) = \frac{1}{N_{cls}} \sum_i L_{cls}(p_i, p_i^*) + \lambda \frac{1}{N_{reg}} \sum_i p_i^* L_{reg}(t_i, t_i^*)    (1)

In Eq. 1, the first term is the classification loss over two classes (whether there is an object or not). The second term is the regression loss of the bounding boxes, applied only when there is an object (i.e., p_i^* = 1).
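The IoU test described above (an anchor is labeled foreground when its overlap with a ground-truth nodule box reaches 0.5) can be expressed compactly. The sketch below assumes boxes in (x1, y1, x2, y2) pixel coordinates and is illustrative rather than the exact RPN code.

```python
def iou(box_a, box_b):
    # Boxes are (x1, y1, x2, y2); returns intersection-over-union in [0, 1].
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def label_anchor(anchor, gt_box, threshold=0.5):
    # Anchors overlapping a ground-truth nodule box by IoU >= 0.5 are foreground.
    return "foreground" if iou(anchor, gt_box) >= threshold else "background"
```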


3.2 Yolov5

The YOLO algorithm analyzes an image only once to identify all objects and their locations. In the given dataset, there are coordinates of the image’s bounding boxes and class probabilities such as 0 and 1, where 0 indicates the absence of cancerous lung nodules in the image and 1 indicates the presence of cancerous lung nodules. With the supplied data set, the x and y coordinates establish the coordinate of the bounding box’s center, while the width and height determine the bounding box’s width and height. To determine the width and height coordinates, the values in the dataset must be added to the x and y values, which can be used to design the image’s bounding box. Images without malignant lung nodules have a value of 0 for the x, y, height, and width coordinates. These coordinate values are normalized before being used for training. The normalized coordinates of the center of the bounding boxes, the height and width, and the class probabilities are saved in a separate text file. For each image, there may be several cancerous lung nodules, so all the cancerous lung nodule coordinates are given in one text file, and the text file is utilized to train the model. The huge YOLOV5 model named YOLOv5L was employed to accomplish our objective. YOLOv5L contains significantly more parameters than other small YOLO models [17]. YOLO extracts feature from image data using a Convolutional Neural Network (CNN) [25]. It operates by dividing images into a grid, and each grid cell identifies a distinct object. For each grid cell, a vector including the cell’s coordinates and the class probability of the presence of a malignant lung nodule is generated. If a center coordinate is found in a particular grid cell, that grid cell is used to identify the malignant lung nodule. To determine the accuracy of each prediction, the grid cells forecast all the bounding boxes and award each one a confidence score. For each lung nodule caused by cancer, several bounding boxes can be generated. IOU, which stands for intersection over the union, is utilized in this context. Here, the method generates the IOU using the initial bounding box and the other bounding boxes to determine the overlapping region. IOU is computed by dividing the intersection area of two bound boxes by the union area of the two bound boxes, resulting in a number between 0 and 1. The IOU value of all the bounding boxes generated for a malignant lung nodule is determined using the original bounding box, and only the bounding box with the highest IOU value is retained. Non-max suppression is the process of obtaining these distinct bounding boxes. Following these steps, the algorithm generates a single bounding box encompassing a single lung nodule that is malignant. Figure 2 depicts Yolov5’s network architecture. There are three components to it. The first section depicts the backbone network, CSPDarknet. The second section contains Neck: PANet and the third section has the Yolo Layer. Initially, input data is processed using CSPDarknet to extract features. The data is then passed to PANet for feature fusion. The Yolo Layer provides detection results such as class, score, position, size, etc. PyTorch’s Binary Cross-Entropy with Logits Loss function is used by


Fig. 2. YOLOv5 model architecture

YOLOv5 to calculate the loss of class probability and object score. The binary cross-entropy loss function is as follows:

L_{BCE}(y, \hat{y}) = -\frac{1}{N} \sum_{i=1}^{N} \left[ y_i \log(\hat{y}_i) + (1 - y_i) \log(1 - \hat{y}_i) \right]    (2)

In Eq. 2, log refers to the natural logarithm, N is the number of samples, y_i is the binary indicator (0 or 1) of whether the label is the correct classification for sample i, and \hat{y}_i is the predicted probability that sample i belongs to that class.
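As described at the beginning of this subsection, the CSV annotations are converted into one YOLO-format text file per image, with the box center, width, and height normalized by the image size. The sketch below illustrates such a conversion under the assumption that the CSV stores a top-left corner plus width and height; the column names (img_name, x, y, width, height, label) are placeholders, not the exact NODE21 field names.

```python
import csv
from collections import defaultdict

IMG_SIZE = 1024  # NODE21 images are resized to 1024 x 1024

def csv_to_yolo_txt(csv_path, out_dir):
    # Group rows by image: an image may contain several nodules.
    boxes = defaultdict(list)
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            if int(row["label"]) == 1:  # 1 = nodule present in this row
                x, y = float(row["x"]), float(row["y"])
                w, h = float(row["width"]), float(row["height"])
                # Convert top-left corner + size to normalized center coordinates.
                cx = (x + w / 2) / IMG_SIZE
                cy = (y + h / 2) / IMG_SIZE
                boxes[row["img_name"]].append((0, cx, cy, w / IMG_SIZE, h / IMG_SIZE))
    for name, rows in boxes.items():
        # One text file per image, one line per nodule: class cx cy w h.
        with open(f"{out_dir}/{name.rsplit('.', 1)[0]}.txt", "w") as out:
            for cls, cx, cy, w, h in rows:
                out.write(f"{cls} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}\n")
```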

3.3 EfficientDet

EfficientDet [18] is an object detection algorithm that is based on a combination of EfficientNet [26] and the anchor box approach. The algorithm starts by dividing the input image into a grid of cells, each of which has a set of anchor boxes associated with it. These anchor boxes are used to generate region proposals, which are possible object detections in the image. EfficientDet then passes the image through a convolutional neural network (CNN) backbone to extract features from the image. These features are then used to predict the class probabilities and bounding box coordinates for each region proposal. The class probabilities indicate the likelihood of the region proposal


containing a particular object class, while the bounding box coordinates specify the position and size of the object within the proposal. To refine the region proposals, EfficientDet utilizes a bi-directional feature pyramid network (BiFPN) to combine features from different levels of the CNN backbone. This helps to capture both high-level and low-level features that are useful for object detection. EfficientDet also introduces a compound scaling method that optimizes the trade-off between accuracy and computational efficiency. This scaling method uses a combination of network depth, width, and resolution to achieve high accuracy while keeping the computational cost low. Finally, EfficientDet uses a non-maximum suppression (NMS) algorithm to remove redundant detections. NMS works by removing region proposals that have a high overlap with other proposals that have a higher confidence score. This helps to ensure that only the most accurate detections are retained.

Fig. 3. EfficientDet model Architecture

Figure 3 illustrates the EfficientDet architecture. Unlike FasterRCNN [16] and Retinanet [27] and many other models, which traditionally use the ResnetX (18,32,50) architecture as the backbone network, EfficientDet uses EfficientNetB0-B6 [26] as the backbone network. The backbone network is followed by the introduced BiFPN network and the box/class prediction network. In our work, we use a PyTorch implementation of EfficientDet. We convert our data to coco format and use the default image resolution to feed them into the network. We use the pre-trained D-2 version of weights that are trained on Imagenet for 80 classes. The model tries to learn and detect bounding boxes from the images and predict them when a new image is given as input to the model. EfficientDet uses Focal Loss, which is given by:

FL = -\sum_{i=1}^{C=2} (1 - s_i)^{\gamma} \, t_i \log(s_i)    (3)

In Eq. 3, (1 - s_i)^{\gamma} is a modulating factor, with the focusing parameter γ ≥ 0, that limits the influence of correctly identified samples on the loss.
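A minimal per-sample implementation of Eq. 3 might look as follows; this is an illustrative sketch, not the internal EfficientDet loss code.

```python
import torch

def focal_loss(scores, targets, gamma=2.0, eps=1e-7):
    # scores: predicted class probabilities, shape (N, C); targets: one-hot labels, shape (N, C).
    # Implements FL = -sum_i (1 - s_i)^gamma * t_i * log(s_i), summed over classes.
    scores = scores.clamp(min=eps, max=1.0 - eps)
    loss = -((1.0 - scores) ** gamma) * targets * torch.log(scores)
    return loss.sum(dim=1).mean()

# Example: two classes (nodule / background), batch of three predictions.
probs = torch.tensor([[0.9, 0.1], [0.3, 0.7], [0.6, 0.4]])
onehot = torch.tensor([[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]])
print(focal_loss(probs, onehot))
```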


3.4 Dataset

The NODE21 [8] dataset was used for this work. Chest X-rays from open sources [28–31] were used to build this dataset. The dataset comprises frontal chest radiographs labeled with bounding boxes around cancerous lung nodules. A total of 4882 frontal chest radiographs are provided, of which 1134 CXR images (containing 1476 nodules) are annotated with bounding boxes around nodules by expert radiologists [8], while the remaining 3748 images are negative examples that do not contain any nodules. For our work, we used 2135 of these frontal chest X-rays, of which 1134 images contain nodules and the rest are nodule free. Figure 4a depicts the X-ray of a healthy person's chest with no lung nodules. Figure 4b, on the other hand, shows the X-ray of a person affected by lung cancer; small nodules can be observed on the chest of a lung cancer patient. This observation can be made with either X-rays or CT scans.

Table 2. Total X-rays in the dataset

Total X-rays | X-rays containing nodule | Nodule-free X-rays
2135 | 1134 | 1001

Fig. 4. Images with and without cancerous nodules in the NODE21 dataset: (a) nodule-free chest X-ray; (b) X-ray containing a cancerous nodule


3.5 Data Preprocessing and Format Conversion

The X-ray images in the NODE21 dataset were provided in .mha format. We converted these images into .png format for training. Furthermore, these images were also preprocessed by the authors of the dataset [8]. The preprocessing involved the following:
– Elimination of border regions with similar properties.
– Adjustment of image intensity values through an energy-based normalization technique, based on the method described in [32].
– Dividing the image into separate lung fields and cropping it to only include that area.
– Changing the size of the image to 1024 × 1024 pixels while maintaining the original proportions and adding padding to the shorter side.
For training, we converted the provided CSV data to the different data formats required by each model, as given in Table 3.

Table 3. Dataset format for each model

Model | Dataset Format Provided | Required Dataset Format
FasterRCNN | CSV | PASCAL VOC
Yolov5 | CSV | YOLO TXT
EfficientDet | CSV | COCO
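The .mha-to-.png conversion mentioned above can be done, for example, with SimpleITK and Pillow; the paper does not state which libraries were used, so the sketch below is only one possible approach and the file names are illustrative.

```python
import numpy as np
import SimpleITK as sitk
from PIL import Image

def mha_to_png(mha_path, png_path):
    # Read the .mha image and squeeze away any singleton slice dimension.
    image = sitk.ReadImage(mha_path)
    array = sitk.GetArrayFromImage(image).squeeze().astype(np.float32)
    # Rescale intensities to 8-bit before saving as a grayscale PNG.
    array = (array - array.min()) / max(array.max() - array.min(), 1e-8) * 255.0
    Image.fromarray(array.astype(np.uint8)).save(png_path)

mha_to_png("n0001.mha", "n0001.png")  # file names are illustrative
```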

3.6 Training and Validation

The final preprocessed dataset was divided into training and validation sets using an 85%/15% split. The training set consisted of 1922 chest X-rays and the validation set of 213 images. Each model was trained for the number of epochs that yielded the best possible results; the training parameters are presented in Table 5.

Table 4. Total X-rays for training and validation

Total X-rays | Training | Validation
2135 | 1922 | 213
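The 85%/15% split can be reproduced, for instance, with scikit-learn; the directory layout below is an assumption made for illustration.

```python
import glob
from sklearn.model_selection import train_test_split

# Collect the preprocessed PNG chest X-rays (directory name is illustrative).
images = sorted(glob.glob("node21_png/*.png"))
train_imgs, val_imgs = train_test_split(images, test_size=0.15, random_state=42)
print(f"training: {len(train_imgs)}, validation: {len(val_imgs)}")
```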


3.7 Training Configurations

All models were trained on a machine with the following specifications:
– Operating System: Windows 10
– Processor: Intel Core i5 6500
– RAM: 8 GB
– Training Environment: Google Colab Premium

We utilized Google Colab Premium to access additional hardware resources, including an NVIDIA Tesla P100 GPU and more powerful CPUs, to accelerate the training process. The models were implemented and trained using the PyTorch framework, which was installed on the machine. We also utilized other software dependencies as required by the models.

3.8 Training Parameters

The training parameters used for each model are given in Table 5.

Table 5. Training parameters for all three models

Model | Loss function | Epochs | Learning rate | Optimizer | Weight decay | Momentum
FRCNN | RPN loss | 70 | 1e-5 | SGD | 1e-4 | 0.9
Yolov5 | Binary CE | 62 | 1e-5 | SGD | 1e-3 | 0.85
EfficientDet | Focal loss | 170 | 1e-5 | Adam | 5e-4 | 0.95
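As an example, the Faster R-CNN row of Table 5 corresponds to an SGD optimizer configured as follows in PyTorch; the stand-in module is only a placeholder for the actual detection network.

```python
import torch
from torch import nn

model = nn.Conv2d(3, 16, 3)  # stand-in module; in practice this is the Faster R-CNN network
optimizer = torch.optim.SGD(
    model.parameters(),
    lr=1e-5,            # learning rate (Table 5)
    momentum=0.9,       # momentum (Table 5)
    weight_decay=1e-4,  # weight decay (Table 5)
)
```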

3.9 Final Architecture of the System

The final system architecture, illustrated in Fig. 5, outlines the key steps for detecting the cancerous nodules. The process begins with data preprocessing and converting the dataset into suitable format for each model. The dataset is then split into training (85%) and validation (15%) sets. Following that, the models are trained to identify and localize malignant lung nodules, generating bounding boxes around them. The trained model weights are saved for deployment. The performance of the models is evaluated using the validation set. Finally, the trained models’ weights are deployed in a web application for real-time inference on new images.


Fig. 5. Final architecture of the system

4 Results and Deployment

The main objective of this study was to utilize and evaluate three deep learning-based object detection models for the detection of cancerous lung nodules in chest X-rays. In this section, we present the results of the experiments carried out to evaluate the performance of the three models. To evaluate the performance of the models, we used a validation dataset consisting of 213 chest X-rays with annotated lung nodules. We measured the performance of the models in terms of precision, recall and mean average precision (mAP). Table 6 presents the validation results of all three models. The Yolov5 model outperformed FasterRCNN and EfficientDet across all three evaluation metrics. It achieved the highest scores in


terms of precision (0.89), recall (0.846), and mAP (0.83), while the FasterRCNN and EfficientDet models achieved comparatively lower scores.

Table 6. Validation results of all three models

Model | Precision | Recall | mAP
FasterRCNN | 0.49 | 0.79 | 0.49
EfficientDet | 0.44 | 0.51 | 0.24
Yolov5 | 0.89 | 0.846 | 0.83

Fig. 6. Training Loss and Validation Performance of EfficientDet Model

In Fig. 6a, the EfficientDet model’s training loss is shown. Initially, the model had a modest training loss, which gradually decreased after 40 epochs and consistently remained below 1 until the last epoch. However, Fig. 6b reveals that the model’s performance results were unsatisfactory. Even after more than 100 epochs of training, there was only a modest improvement in recall. Precision and mean Average Precision (mAP) remained consistently poor throughout all epochs. Consequently, the model was trained for a total of 170 epochs, as further training iterations did not yield significant enhancements in performance. Figure 7a illustrates the training and validation loss curves for the YOLOv5 model. Throughout the 63 epochs the training loss gradually decreased and plumetted close to 0. The validation loss followed a similar trend. Additionally, Fig. 7b highlights the exceptional performance of the YOLOv5 model, showcasing high accuracy and effectiveness in terms of recall, precision, and mean Average Precision (mAP) scores. Yolov5 was able to correctly identify the small lung nodules from the chest X-rays and also predicted their locations accurately during validation.


Fig. 7. Train, Validation Loss and Validation Performance of Yolov5 model

Fig. 8. Training loss and Validation result of FasterRCNN model

Lastly, Figs. 8a and 8b display the training loss and validation results of the Faster R-CNN model. In Fig. 8a, the training loss is presented, showing consistently low box, object, and class losses throughout the training period; the losses reached their lowest value of 0.01. Figure 8b demonstrates the validation results, indicating a slightly lower recall of 79% compared to the YOLOv5 model. However, the Faster R-CNN model achieved lower precision and mAP scores, with the highest precision and mAP reaching 0.49. Table 7 compares our findings to related works that detect cancerous lung nodules on X-ray images. We attain 89% precision, 84.6% sensitivity, and an F1 score of 86.8% using our trained YOLOv5 model, which exceeds earlier studies on X-rays. Specifically, our work achieved a slightly better sensitivity compared to previous related studies. We also use the standard precision metric, where our work achieves a significantly better score. The inference results of the Yolov5 model on chest X-rays are showcased in Fig. 9. Figure 9a is an X-ray of a healthy person, and our trained Yolov5 model


Table 7. Comparison with related works

Reference | Algorithm Used | Precision | Sensitivity (Recall) | F1 Score
[19] | Resnet50 | ... | 0.715 | ...
[20] | DenseNet-121 | ... | 0.7486 | ...
[24] | Resnet34 | ... | 0.76 | ...
[23] | CNN | ... | 0.78 | ...
[22] | AI Software | 0.7 | 0.8 | 0.75
[21] | eDense-YOLO | ... | 0.83 | ...
Our work | Yolov5 | 0.89 | 0.846 | 0.868

Fig. 9. Detection results by Yolov5: (a) detection on an X-ray containing no nodules; (b) detection on an X-ray containing lung nodules

correctly identifies the absence of any nodules. In contrast, Fig. 9b represents an X-ray of a person diagnosed with lung cancer; the model precisely identifies the lung nodules and draws a bounding box around them, with a corresponding confidence score indicating how confident the model is about the location of each lung nodule. Additionally, we used Grad-CAM to visualize the YOLOv5 detections and enhance the interpretability of the detection results. Grad-CAM, short for Gradient-weighted Class Activation Mapping, is a technique that visually highlights the regions of interest within the images that influence the model's predictions. By computing gradients and performing global average pooling on the intermediate feature maps, Grad-CAM assigns a weight to each channel, emphasizing its importance for the target class. The weighted feature maps are then combined and visualized as heatmaps, effectively highlighting the important features in the image. Figure 10 illustrates the Grad-CAM results applied to chest X-rays that were first processed by our trained Yolov5 model. Malignant lung nodules are correctly identified, and a heatmap is produced that provides a clear indication of the regions that influenced the model's predictions.
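A generic Grad-CAM sketch using PyTorch hooks is shown below; it follows the steps just described (gradients, global average pooling, weighted feature maps) but is not the exact implementation used in this work, and the choice of target layer and scoring function are left to the caller.

```python
import torch
import torch.nn.functional as F

def grad_cam(model, target_layer, image, score_fn):
    # image: tensor of shape (1, C, H, W); score_fn maps the model output to a scalar score.
    activations, gradients = {}, {}

    def fwd_hook(module, inputs, output):
        activations["value"] = output.detach()

    def bwd_hook(module, grad_input, grad_output):
        gradients["value"] = grad_output[0].detach()

    h1 = target_layer.register_forward_hook(fwd_hook)
    h2 = target_layer.register_full_backward_hook(bwd_hook)
    try:
        score = score_fn(model(image))
        model.zero_grad()
        score.backward()
    finally:
        h1.remove()
        h2.remove()

    # Global-average-pool the gradients to obtain one weight per feature channel.
    weights = gradients["value"].mean(dim=(2, 3), keepdim=True)
    cam = F.relu((weights * activations["value"]).sum(dim=1, keepdim=True))
    # Upsample to the input resolution and normalize to [0, 1] for display as a heatmap.
    cam = F.interpolate(cam, size=image.shape[2:], mode="bilinear", align_corners=False)
    return (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)
```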


Fig. 10. Grad-CAM Heatmaps on YOLOv5 Model’s Detection

Lastly, we deployed the Yolov5 model on a web application to further enhance the usability of our work. The objective of the deployment was to provide an abstraction that allows a user to upload an X-ray image and obtain the detection results in an easy and accessible way. The detection result is obtained by leveraging the weights obtained from training the model, which are used to make predictions on new input images. The web app provides a user-friendly interface for anyone to upload a chest X-ray in common digital imaging formats such as PNG, JPG, JPEG, or TIFF and obtain the corresponding detection results. This can help make the model more accessible to medical professionals and could potentially be used in the field of medical imaging.

Fig. 11. Deploying the model on a web app

Figure 11 shows the workflow involved in deploying the trained YOLOv5 model to a web app for the purpose of medical image detection. The web app was created using HTML, CSS, Flask and JavaScript to provide a user interface for uploading images and viewing detection results. The web app utilizes PyTorch and Flask to run the detection algorithm on uploaded images. Specifically, the Flask app receives the uploaded image from the user, preprocesses it, and then passes it to the PyTorch model for detection. The detection results are then returned to the Flask app, which formats the results for display in the web app. Figure 12 is an example of a medical X-ray image processed by the deployed YOLOv5 model, which has identified the presence of two cancerous nodules. The image is displayed with a red bounding box drawn around each nodule, and a corresponding confidence score indicating the model’s level of certainty in its detection. If the model fails to detect any nodules in an image, it will return the


Fig. 12. Detection results on webapp

original image without any bounding boxes. This helps to avoid false positive detections and provides more accurate results to the end user.
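A minimal sketch of such an endpoint is shown below; it assumes the trained weights are loaded through the public ultralytics/yolov5 torch.hub interface and returns the predicted boxes as JSON, which simplifies the rendered interface shown above.

```python
import io

import torch
from flask import Flask, jsonify, request
from PIL import Image

app = Flask(__name__)
# Load the trained YOLOv5 weights (file name is illustrative).
model = torch.hub.load("ultralytics/yolov5", "custom", path="best.pt")

@app.route("/detect", methods=["POST"])
def detect():
    # The client uploads a chest X-ray as multipart form data under the key "image".
    file = request.files["image"]
    image = Image.open(io.BytesIO(file.read())).convert("RGB")
    results = model(image)
    # Each row: x1, y1, x2, y2, confidence, class.
    boxes = results.xyxy[0].tolist()
    return jsonify({"nodules": boxes})

if __name__ == "__main__":
    app.run(debug=True)
```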

5 Conclusion

Cancer has emerged as a formidable global health concern, instilling fear and uncertainty among populations worldwide. Among the various types of cancer, lung cancer stands out as the most lethal, its incidence has escalated exponentially over the past two decades. In the diagnosis of this treacherous disease, medical imaging techniques such as CT scans and X-rays have proven indispensable. This study contributes significantly to the accurate detection of lung cancer specifically through the utilization of chest X-rays, which represents a more viable diagnostic method for the majority of the world’s population. The chest X-ray data utilized in this study were obtained from the NODE21 dataset, and three object detection models- FasterRCNN, Yolov5, and EfficientDet- were employed to detect pulmonary nodules within these X-ray images. Our work achieved scores that are not only comparable to those of similar studies but also surpassed them in terms of the recall metric. To enhance accessibility for non-technical users, the Yolov5 model was integrated into a user-friendly web application, enabling real-time nodule detection with ease and efficiency. This successful deployment of our model represents a significant step forward in leveraging deep learning techniques for effective detection tasks in the field of medical imaging. Furthermore, the outcomes of this study serve as a powerful testament to the effectiveness of deep learning methodologies in the detection of abnormalities within medical imaging. Future investigations should consider expanding the training dataset to include more malignant chest X-rays, as this holds great potential for achieving even greater accuracy in nodule detection.


Acknowledgements. We would like to thank the organizers of the NODE21 challenge for sharing the high-quality CXR dataset which made this work possible.

Data Availability Statement. The NODE21 dataset used for this work can be accessed using the following link: https://zenodo.org/record/5548363. This dataset is public and can be used for further experimentation.

References 1. Ferlay, J., et al.: Global cancer observatory: cancer today. Lyon: International agency for research on cancer 2020 (2021) 2. Edward, F., Patz, Jr., Goodman, P.C., Bepler, G.: Screening for lung cancer. New England J. Med. 343(22), 1627–1633 (2000) 3. Witschi, H.: A short history of lung cancer. Toxicol. Sci. Official J. Soc. Toxicol. 64(1), 4–6 (2001) 4. Hirsch, F.R., Franklin, W.A., Gazdar, A.F., Bunn Jr, P.A.: Early detection of lung cancer: clinical perspectives of recent advances in biology and radiology. Clinical Cancer Res. 7(1), 5–22 (2001) 5. Lindsey A.T., Rebecca, L.S., Jemal, A.: Lung cancer statistics. Adv. Exp. Med. Biol. 893, 1–19 (2016). ISSN 0065-2598. https://doi.org/10.1007/978-3-319-2422311 6. Lung cancer statistics—world cancer research fund international. https://www. wcrf.org/cancer-trends/lung-cancer-statistics/ 7. Lung cancer—lung cancer symptoms, April 2022. https://medlineplus.gov/ lungcancer.html 8. Node 21 grand challenge. https://node21.grand-challenge.org/ 9. Wang, S., Zimmermann, S., Parikh, K., Mansfield, A.S., Adjei, A.A.: Current diagnosis and management of small-cell lung cancer. In Mayo Clinic Proc. 94, 1599– 1622 (2019). ISSN 00256196 10. Fuhad, K.M., et al.: Deep learning based automatic malaria parasite detection from blood smear and its smartphone based application. Diagnostics. 10(5), 329 (2020) 11. Siddiqua, R., Islam, N., Bolaka, J.F., Khan, R., Momen, S.: Aida: artificial intelligence based depression assessment applied to Bangladeshi students. Array. 18, 100291 (2023) 12. Islam, M.S., Das, S.J., Khan, M.R.A., Momen, S., Mohammed, N.: Detection of COVID-19 and pneumonia using deep convolutional neural network. Comput. Syst. Sci. Eng. 44(1), 1–16 (2023) 13. Shad, H.S., et al.: Comparative analysis of deepfake image detection method using convolutional neural network. In: Computational Intelligence and Neuroscience 2021 (2021) 14. Nguyen, N.-D., Do, T., Ngo, T.D., Le, D.-D.: An evaluation of deep learning methods for small object detection. J. Elect. Comput. Eng. 2020, 1–18 (2020) 15. Fakoor, R., Ladhak, F., Nazi, A., Huber, M.: Using deep learning to enhance cancer diagnosis and classification. In: Proceedings of the International Conference on Machine Learning, vol. 28, pp. 3937–3949. ACM, New York, USA (2013) 16. Ren, S., He, K., Girshick, R., Sun, J.: Faster R-CNN: towards real-time object detection with region proposal networks. Adv. Neural. Inf. Process. Syst. 28, 91– 99 (2015)


17. Redmon, J., Divvala, S., Girshick, R., Farhadi, A.: You only look once: unified, real-time object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 779–788 (2016) 18. Tan, M., Pang, R., Le, Q.V.: Efficientdet: scalable and efficient object detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 10781–10790 (2020) 19. Bush, I.: Lung nodule detection and classification. Rep. Stanf. Comput. Sci 20, 196–209 (2016) 20. Ausawalaithong, W., Thirach, A., Marukatat, S., Wilaiprasitporn, T.: Automatic lung cancer prediction from chest x-ray images using the deep learning approach. In: 2018 11th Biomedical Engineering International Conference (BMEiCON), pp. 1–5. IEEE (2018) 21. Kim, Y.-G., et al.: Short-term reproducibility of pulmonary nodule and mass detection in chest radiographs: comparison among radiologists and four different computer-aided detections with convolutional neural net. Sci. Rep. 9(1), 18738 (2019) 22. Ohlmann-Knafo, S., et al.: Ai-based software for lung nodule detection in chest x-rays–time for a second reader approach? arXiv preprint arXiv:2206.10912 (2022) 23. Juan, J., Mons´ o, E., Lozano, C., Cuf´ı, M., Sub´ıas-Beltr´ an, P., Ruiz-Dern, L., Rafael-Palou, X., Andreu, M., Casta˜ ner, E., Gallardo, X., et al.: Computer-assisted diagnosis for an early identification of lung cancer in chest x rays. Sci. Rep. 13(1), 7720 (2023) 24. Yoo, H., et al.: AI-based improvement in lung cancer detection on chest radiographs: results of a multi-reader study in NLST dataset. Eur. Radiol. 31(12), 9664–9674 (2021) 25. Krizhevsky, A., Sutskever, I., Hinton, G.E.: Imagenet classification with deep convolutional neural networks. In: Advances in Neural Information Processing Systems, vol. 25 (2012) 26. Tan, M., Le, Q.: Efficientnet: rethinking model scaling for convolutional neural networks. In International conference on machine learning, pp. 6105–6114. PMLR (2019) 27. Lin, T.-Y., Goyal, P., Girshick, R., He, K., Doll´ ar, P.: Focal loss for dense object detection. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 2980–2988 (2017) 28. Shiraishi, J., et al.: Development of a digital image database for chest radiographs with and without a lung nodule: receiver operating characteristic analysis of radiologists’ detection of pulmonary nodules. Am. J. Roentgenol. 174(1), 71–74 (2000) 29. Bustos, A., Pertusa, A., Salinas, J.-M., de la Iglesia-Vay´ a, M.: PadChest: a large chest x-ray image dataset with multi-label annotated reports. Med. Image Anal. 66, 101797 (2020) 30. Wang, X., Peng, Y., Lu, L., Lu, Z., Bagheri, M., Summers, R.M.: Chestx-ray8: hospital-scale chest x-ray database and benchmarks on weakly-supervised classification and localization of common thorax diseases. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2097–2106 (2017) 31. Demner-Fushman, D., Antani, S., Simpson, M., Thoma, G.R.: Design and development of a multimodal biomedical information retrieval system. J. Comput. Sci. Eng. 6(2), 168–177 (2012) 32. Philipsen, R.H.H.M., Maduskar, P., Hogeweg, L., Melendez, J., S´ anchez, C.I., van Ginneken, B.: Localized energy-based normalization of medical images: application to chest radiography. IEEE Trans. Med. Imaging. 34(9), 1965–1975 (2015)

Analyzing Data by Applying Neural Networks to Identify Patterns in the Data A. S. Borodulin1 , V. V. Kukartsev1,2,3 , Anna R. Glinscaya3(B) , A. P. Gantimurov1 , and A. V. Nizameeva2 1 Bauman Moscow State Technical University, 105005 Moscow, Russia 2 Siberian Federal University, 660041 Krasnoyarsk, Russia 3 Reshetnev Siberian State University of Science and Technology, 660037 Krasnoyarsk, Russia

[email protected]

Abstract. This paper analyzes a dataset to identify the patterns it contains. A fetal health classification dataset has been taken for analysis. The study touches on fields of medicine such as obstetrics and the mortality associated with it. Currently, there is high maternal and fetal mortality at birth due to limited resources, much of which could have been prevented. One of the most accessible and simple means of assessing fetal health is the cardiotocogram (CTG). It can be used to make a diagnosis while the fetus is still in the womb and prevent death. A dataset containing fetal CTG data was taken as the dataset under study. It consists of 521 records and 23 attributes, among which are: initial fetal heart rate; number of accelerations per second; number of fetal movements per second; number of uterine contractions per second; number of LD per second; number of SD per second; number of PD per second; percentage of time with abnormal short-term variability; mean value of short-term variability; percentage of time with abnormal long-term variability; mean value of long-term variability; width of the histogram constructed using all the values of the histogram; minimum histogram value; maximum histogram value; number of peaks in the study histogram; number of zeros in the histogram; Hist mode; Hist mean; Hist variance; Hist trend; fetal condition: 1 - normal, 2 - suspicious, 3 - pathologic; and an informational attribute - patient ID. Data analysis is performed using a decision tree, Kohonen maps and neural networks.

Keywords: data analysis · neural networks · factor identification

1 Introduction

In today's information society, the amount of data generated and accumulated in various fields is growing exponentially every day [1]. With this amount of data, unique opportunities arise to extract valuable insights and find hidden patterns underlying such data. Neural networks, capable of analyzing and processing complex and unstructured data, have become an indispensable tool for data analysis [2]. This paper discusses the application of neural networks in the task of data analysis to find patterns that may go undetected by traditional analysis methods. For prediction


using neural networks, special attention is paid to data processing to obtain more accurate results [3, 4]. Neural networks themselves are computational models capable of automatically adjusting parameters to solve specific problems, and their deep learning allows them to analyze data at different levels of abstraction [5, 6]. One of the common applications of neural networks is classification. Neural networks are trained based on input data [7]. For example, they can automatically classify images into different categories, or diagnose diseases based on input medical data. Neural networks are also used for regression problems to predict a numerical value based on input data [8]. For example, predicting real estate prices, predicting financial performance, and more. The application of neural networks to perform analysis of large datasets is useful. Its successful application is determined by a good understanding of the technical aspects, selection of a suitable architecture, proper data preprocessing, and evaluation and interpretation of the results obtained [9–11].

2 Materials and Method In order to simplify data manipulation, a data analytics method is used, which in turn automates the building of an analytical model, also called machine learning [12, 13]. This method has the ability to adapt to new data as well as to learn from it. Data processing and analysis is performed using decision tree, Kohonen maps and neural network methods [14]. Kohonen maps are a neural network that uses unsupervised learning and performs the task of visualization and clustering [15]. The raw data is multidimensional, so it needs to be visualized for better perception, which is what Kohonen maps are designed for [16]. Decision Trees (DT) is a non-parametric supervised learning method used for classification and regression. The goal is to create a model that predicts the value of a target variable by learning simple decision rules inferred from the characteristics of the data. Neural networks are computational structures that model simple biological processes similar to those in the human brain [17]. Neural networks are capable of adaptive learning. At the heart of network construction is an elementary transducer called an “artificial neuron”. In general, neural networks are used to model nonlinear systems, which allows the discovery of complex dependencies in data [18, 19].
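For illustration, a Kohonen map of the kind described above can be trained with the third-party minisom package; this sketch is not the tool used by the authors, and the file and column names are assumptions.

```python
import numpy as np
import pandas as pd
from minisom import MiniSom

# Load the fetal CTG records (file and column names are assumptions).
data = pd.read_csv("fetal_health.csv")
features = data.drop(columns=["fetal_condition", "patient_id"]).to_numpy(dtype=float)
# Min-max normalization so that attributes in different units become comparable.
col_min, col_max = features.min(axis=0), features.max(axis=0)
features = (features - col_min) / (col_max - col_min + 1e-9)

# A 10 x 10 Kohonen map trained with unsupervised random-order updates.
som = MiniSom(x=10, y=10, input_len=features.shape[1], sigma=1.0, learning_rate=0.5)
som.random_weights_init(features)
som.train_random(features, num_iteration=10000)

# Map every record to its best-matching unit (its cell on the Kohonen map).
cells = np.array([som.winner(row) for row in features])
```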

3 Results Before you start working with the data, you need to normalize it. This is used to achieve better sampling conditions [20]. The purpose of normalization is to bring data in different units of measurement to a common form that will allow them to be compared with each other [21]. After setting up the necessary parameters to train the neural network, the results were output, represented as Kohonen maps, which are shown in Fig. 1. The shades of the attribute values determine the difference of each of the features. As a result, 3 clusters were formed. The division was based on fetal status. The resulting maps cannot be used to make precise dependencies, as the data in the clusters


Fig. 1. Kohonen maps prior to treatment.

are not homogeneous. It is necessary to analyze the data in more detail. This is achieved by correlation [22, 23]. Correlation allows us to find out which factors are the most significant. Factors whose correlation value is greater than 0.5 are considered significant. The values of the attributes “percentage of time with abnormal short-term variability” and “percentage of time with abnormal long-term variability” show direct and strong dependence, as they have a correlation value greater than 0.5 [24–26]. The significant factor inversely dependent on the output parameter was the attribute “number of accelerations per second”. The correlation analysis is presented in Fig. 2. Let us describe the parameters of the obtained clusters. The algorithm divided the patients into 3 clusters with different numbers of patients assigned to each cluster: the second cluster – 197 patients, the first cluster – 130 patients, the zero cluster – 193 patients. Most of the factors considered are significant for categorizing patients into classes (diagnoses). For the null cluster the most insignificant factor is: Hist regimen. For the first cluster, all but the attributes “time with abnormal short term” and “mean value of long-term variability” are significant. The second cluster is dependent on all input factors.
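The normalization and correlation screening described above can be sketched with pandas as follows; the 0.5 significance threshold matches the text, while the file and column names are assumptions.

```python
import pandas as pd

df = pd.read_csv("fetal_health.csv")   # file name is an assumption
df = df.drop(columns=["patient_id"])   # informational attribute only

# Min-max normalization: bring attributes in different units to a common [0, 1] scale.
norm = (df - df.min()) / (df.max() - df.min())

# Correlation of every attribute with the output factor (fetal condition).
corr = norm.corr()["fetal_condition"].drop("fetal_condition")
significant = corr[corr.abs() > 0.5].sort_values(ascending=False)
print(significant)
```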


Fig. 2. Correlation analysis.

The value of the correlation parameter means the strength of dependence of one factor on another. Such correlation values are classified into weak (less than 0.29), moderate (0.3–0.49), medium (0.5–0.69) and strong (0.7 and more). It is necessary to find the factors with the highest dependence on the output factor. The data charts are presented in Fig. 3.
a) The baseline fetal heart rate is the number of fetal heartbeats;
b) Number of accelerations per second - the number of fetal accelerations per second;
c) The number of fetal movements per second is the number of fetal movements;
d) Number of uterine contractions per second - the number of contractions of the mother's uterus per second;
e) The number of LDs per second is the analysis in the cardiogram;
f) The number of PDs per second is an analysis from the cardiogram;
g) Percentage of time with anomalous short-term variability - short-term anomalies;
h) The mean of short-term variability is the average of the variability in the short term;
i) Percent of time with abnormal short-term variability is the amount of time in percent with abnormal variability;
j) The mean of short-term variability is the average of the variability;
k) Percentage of time with abnormal long-term variability - the amount of time of abnormal variability in the long term;


Fig. 3. Data charts.

l) Long-term variability mean is the average value of variability over the long term;
m) The width of the histogram is the width of the histogram;
n) Minimum histogram value - the minimum value of the histogram analysis;
o) Maximum histogram value - the maximum value of the histogram analysis;
p) Number of peaks on the histogram - number of peaks on the histogram;
q) Number of zeros in the histogram - zero values in the histogram;
r) Hist regimen – blood clotting;
s) Hist is the mean value of blood coagulation;
t) Hist is the average value of blood coagulation;
u) Hist variance is the spread of blood clotting values;
v) Histogram trend - the location of the histogram graph;
w) A fetal condition is a diagnosis given to a fetus;

In the end, the analysis using Kohonen maps did not give us the information we needed about the patients. The neural network is the next method to be analyzed. To construct it, its parameters must be set: we arbitrarily choose two layers of six neurons each and train for 10 000 epochs. At the output we obtain a diagram showing the diagnoses established by doctors (green) and by the neural network (pink). Figures 4 and 5 show the diagram of the neural network and the number of errors, respectively. From the data we see that the error is 178 out of 520, or 34.2%. We can therefore say that this method is not suitable for the analyzed data: it produces a rough division into two classes where three classes exist, which causes the large error. Data analysis using a decision tree is based on searching for patterns in the data and predicting the result. We keep the standard settings and output the decision tree as a diagram together with the number of errors. The decision tree diagram and the number of errors are shown in Figs. 6 and 7, respectively.
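A minimal sketch of a comparable neural-network experiment is given below, assuming scikit-learn and reading the architecture as two hidden layers of six neurons trained for 10 000 epochs; this is an approximation of the setup described above, not the exact configuration used in the study, and the file and column names are placeholders.

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

df = pd.read_csv("ctg_records.csv")                    # hypothetical file name
X, y = df.drop(columns="fetal_condition"), df["fetal_condition"]
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Two hidden layers of six neurons, trained for up to 10 000 epochs.
net = MLPClassifier(hidden_layer_sizes=(6, 6), max_iter=10_000, random_state=0)
net.fit(X_train, y_train)

errors = (net.predict(X_test) != y_test).sum()
print(f"{errors} errors out of {len(y_test)} test records")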


Fig. 4. A diagram of a neural network.

Fig. 5. Number of errors.

From the results we can see that the decision tree method classified the data with 18 errors out of 520 records, which is 3.4%. From the analysis it can be seen that the main significant factors for the decision tree are the percentage of time with abnormal short-term variability (56.9%) and the percentage of time with abnormal long-term variability (21.9%). The conditions on which the decision tree relies when classifying the data are presented in Fig. 8. The figure shows that the tree splits primarily on the percentage of time with abnormal short-term variability and the percentage of time with abnormal long-term variability, and then on the average Hist value and the number of PDs per second. It can be concluded from the experiments that the decision tree method describes the analyzed data most accurately.
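The decision-tree step can be sketched in the same spirit: fit a tree and inspect which attributes drive the splits, analogous to the significance values reported above. scikit-learn again stands in for the analysis package actually used, and the file and column names are placeholders.

import pandas as pd
from sklearn.tree import DecisionTreeClassifier

df = pd.read_csv("ctg_records.csv")                    # hypothetical file name
X, y = df.drop(columns="fetal_condition"), df["fetal_condition"]

tree = DecisionTreeClassifier(random_state=0).fit(X, y)

# Relative importance of each attribute in the fitted tree.
importance = pd.Series(tree.feature_importances_, index=X.columns)
print(importance.sort_values(ascending=False).head())

# Number of misclassified records on the data the tree was fitted on.
print("errors:", int((tree.predict(X) != y).sum()), "of", len(y))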


Fig. 6. Decision Tree Diagram.

Fig. 7. Number of errors.

Fig. 8. Conditions for decision trees.


4 Discussion

Machine learning is a powerful tool for data analysis: it helps to find patterns in complex data sets [27], and the use of neural networks for data analysis can be more accurate than traditional approaches [28, 29]. In this study, the most significant factors were identified using Kohonen maps, but no clear dependence on the clusters is visible. The next method was the neural network: it accurately predicted classes 1 and 3 but none of class 2 (the classes correspond to diagnoses), so this method did not give the necessary results. The last method, the decision tree, predicted all values with a very small error of 3.4%, so this method is the most suitable for the analyzed data model [30–32].

5 Conclusion

The main issue of this research is the possibility of applying neural networks to data analysis and obtaining significant results that can be useful in the future. The development of modern computational technologies and deep learning allows us to look at data more intelligently and obtain more accurate analysis results. The application of neural networks in classification, regression, image processing, text analysis and other areas makes it possible to automatically identify complex dependencies that might go unnoticed with traditional approaches. Analysis using neural networks requires not only technical skills but also creative thinking to extract the most valuable information from the data. In addition, the obtained results must be interpreted correctly for proper decision-making in the future.

References 1. Repinskiy, O.D., et al.: Improving the competitiveness of Russian industry in the production of measuring and analytical equipment. J. Phys. Conf. Ser. 1728(1), 012032 (2021) 2. Rassokhin, A., et al.: Different types of basalt fibers for disperse reinforcing of fine-grained concrete. Mag. Civil Eng. 109(1), 10913 (2022) 3. Shutaleva, A., et al.: Migration potential of students and development of human capital. Educ. Sci. 12(5), 324 (2022) 4. Efremenkov, E.A., et al.: Research on the possibility of lowering the manufacturing accuracy of cycloid transmission wheels with intermediate rolling elements and a free cage. Appl. Sci. 12(1), 5 (2021) 5. Gutarevich, V.O., et al.: Reducing oscillations in suspension of mine monorail track. Appl. Sci. 13(8), 4671 (2023) 6. Malozyomov, B.V., et al.: Overview of methods for enhanced oil recovery from conventional and unconventional reservoirs. Energies 16(13), 4907 (2023) 7. Strateichuk, D.M., et al.: Morphological features of polycrystalline CdS1− xSex films obtained by screen-printing method. Crystals 13(5), 825 (2023) 8. Malozyomov, B.V., et al.: Study of supercapacitors built in the start-up system of the main diesel locomotive. Energies 16(9), 3909 (2023)


9. Malozyomov, B.V., et al.: Substantiation of drilling parameters for undermined drainage boreholes for increasing methane production from unconventional coal-gas collectors. Energies 16(11), 4276 (2023) 10. Masich, I.S., Tyncheko, V.S., Nelyub, V.A., Bukhtoyarov, V.V., Kurashkin, S.O., Borodulin, A.S.: Paired patterns in logical analysis of data for decision support in recognition. Computation 10(10), 185 (2022) 11. Masich, I.S., et al.: Prediction of critical filling of a storage area network by machine learning methods. Electronics 11(24), 4150 (2022) 12. Barantsov, I.A., et al.: Classification of acoustic influences registered with phase-sensitive OTDR using pattern recognition methods. Sensors 23(2), 582 (2023) 13. Bosikov, I.I., et al.: Modeling and complex analysis of the topology parameters of ventilation networks when ensuring fire safety while developing coal and gas deposits. Fire 6(3), 95 (2023) 14. Mikhalev, A.S., et al.: The Orb-weaving spider algorithm for training of recurrent neural networks. Symmetry 14(10), 2036 (2022) 15. Moiseeva, K., et al.: The impact of coal generation on the ecology of city areas. In: 2023 22nd International Symposium INFOTEH-JAHORINA (INFOTEH), pp. 1–6. IEEE (2023) 16. Kukartsev, V., et al.: Analysis of data in solving the problem of reducing the accident rate through the use of special means on public roads. In: 2022 IEEE International IOT, Electronics and Mechatronics Conference (IEMTRONICS), pp. 1–4. IEEE (2022) 17. Kireev, T., et al.: Analysis of the influence of factors on flight delays in the united states using the construction of a mathematical model and regression analysis. In: 2022 IEEE International IOT, Electronics and Mechatronics Conference (IEMTRONICS), pp. 1–5. IEEE (2022) 18. Kukartsev, V., et al.: Prototype technology decision support system for the EBW process. In: Proceedings of the Computational Methods in Systems and Software, vol. 596, pp. 456–466. Springer, Cham (2022). https://doi.org/10.1007/978-3-031-21435-6_39 19. Kukartsev, V., et al.: Methods and tools for developing an organization development strategy . In: 2022 IEEE International IOT, Electronics and Mechatronics Conference (IEMTRONICS), pp. 1–8. IEEE (2022) 20. Malozyomov, B.V.: Improvement of hybrid electrode material synthesis for energy accumulators based on carbon nanotubes and porous structures. Micromachines 14(7), 1288 (2023) 21. Bukhtoyarov, V.V., et al.: A study on a probabilistic method for designing artificial neural networks for the formation of intelligent technology assemblies with high variability. Electronics 12(1), 215 (2023) 22. Shutaleva, A., et al.: Environmental behavior of youth and sustainable development. Sustainability 14(1), 250 (2021) 23. Kosenko, E.A., Nelyub, V.A.: Evaluation of the stress–strain state of a polymer composition material with a hybrid matrix. Polym. Sci. Ser. D 15(2), 240 (2022) 24. Nelyub, V.A., Komarov, I.A.: Technology of treatment of carbon fibers under electromagnetic influences of various origins to produce high-strength carbon fiber reinforced plastics. Russ. Metall. 2021, 1696–1699 (2022) 25. Nelyub, V.A., Fedorov, S.Y., Malysheva, G.V., Berlin, A.A.: Properties of carbon fibers after applying metal coatings on them by magnetron sputtering technology. Fibre Chem. 53, 252– 257 (2022) 26. Nelyub, V.A.: The effect of copper and zinc coatings on the properties of carbon fibers and composites based on them. Polym. Sci. Ser. D 14, 260–264 (2021) 27. 
Nelyub, V.A., Fedorov, S.Y., Malysheva, G.V.: The study of the structure and properties of elementary carbon fibers with metal coatings. Inorg. Mater. Appl. Res. 12, 1037–1041 (2021)


28. Potapenko, I., Kukartsev, V., Tynchenko, V., Mikhalev, A., Ershova, E.: Analysis of the structure of Germany’s energy sector with self-organizing Kohonen maps. In: Abramowicz, W., Auer, S., Stró˙zyna, M. (eds.) Business Information Systems Workshops. LNBIP, vol. 444, pp. 5–13. Springer, Cham (2022). https://doi.org/10.1007/978-3-031-04216-4_1 29. Kukartsev, V.V., et al.: Using digital twins to create an inventory management system. In: E3S Web of Conferences. EDP Sciences (2023) 30. Gladkov, A.A., et al.: Development of an automation system for personnel monitoring and control of ordered products. In: E3S Web of Conferences. EDP Sciences (2023) 31. Borodulin, A.S., et al.: Using machine learning algorithms to solve data classification problems using multi-attribute dataset. In: E3S Web of Conferences. EDP Sciences (2023) 32. Kozlova, A.V., et al.: Finding dependencies in the corporate environment using data mining. In: E3S Web of Conferences. EDP Sciences (2023)

Intelligent Data Analysis as a Method of Determining the Influence of Various Factors on the Level of Customer Satisfaction of the Company Vladislav Kukartsev1,2 , Vladimir Nelyub2,3 , Anastasia Kozlova1(B) , Aleksey Borodulin2 , and Anastasia Rukosueva1 1 Reshetnev Siberian State University of Science and Technology, 660037 Krasnoyarsk, Russia

[email protected]

2 Bauman Moscow State Technical University, 105005 Moscow, Russia 3 Peter the Great St.Petersburg Polytechnic University, 195251 St. Petersburg, Russia

Abstract. This paper analyzes the impact of various factors on the level of satisfaction of passengers using airline services. A dataset of 25 976 airline customers was taken for the work, containing 22 attributes related both to the services provided by the company and to the passengers themselves. Two methods, the Kohonen map method and the decision tree method, were initially used for the analysis. Kohonen maps were used to divide the data into clusters, and the cluster parameters obtained showed the significance of the attributes. With the help of the decision tree, an attribute significance table was obtained which also showed the level of significance of the attributes. After the first iteration, the decision tree method showed the best result, so this method was chosen for further work. Next, different groups of attributes were analyzed and the errors of the algorithms for each group were obtained. Correlation analysis was also performed, showing the degree of association between the attributes and the target variable. The attributes with greater importance, i.e., having a greater impact on the target variable, were identified.

Keywords: data analysis · Kohonen maps · decision trees · factors

1 Introduction

Many factors, both internal and external, can affect the success of a company. External factors include economic conditions such as changes in consumer demand, changes in legislation and inflation; political conditions such as changes in the political situation in the country and international relations; and technological changes driven by new developments in information technology and the Internet. Internal factors that can influence the success of a company include its management style and approach, the organizational structure and its efficiency, the level of qualification and motivation of employees, the marketing strategy and the ability to attract and retain customers, and the company's financial condition and its resistance to external factors.


Companies can utilize various tools and technologies to maintain their success. For example, these may include:

– Database management systems that allow storing and managing the large amounts of data required for making management decisions.
– Analytical platforms that provide tools for processing and analyzing data, as well as for creating reports and forecasts.
– Machine learning, which allows large amounts of data to be analyzed automatically and conclusions to be drawn from them.
– Data visualization, which presents information in an easy-to-understand way, making it easier to interpret and analyze.
– Intelligent data analysis, which helps to identify hidden relationships and patterns in large amounts of data.
– Cloud computing, which provides access to computing resources and large amounts of data over the Internet.
– Statistical tools that are used to perform statistical analysis and data exploration.
– BI systems, which provide up-to-date information about the company's activities and support management decisions in real time based on data analysis [1, 2].

Data analysis is the process of collecting, organizing, storing, and using information to make decisions. It can help a company become more competitive in several ways. First, data analysis can help a company better understand its customers and competitors, through market research, customer behavior analysis, competitor research, and so on. The information obtained can help a company identify its strengths and weaknesses and develop strategies to improve its products and services. Second, data analytics can be used to optimize business processes: it can help companies reduce the time needed to complete tasks, improve product quality, reduce costs, and increase operational efficiency, for example by optimizing manufacturing processes, inventory management and human resource management. Data analytics can also help companies improve their marketing strategies by determining the most effective promotional channels, identifying target audiences and analyzing the results of advertising campaigns; in this way, companies can define their marketing goals more precisely and achieve them [3–5]. Overall, data analytics is a powerful tool that can help companies become more competitive: it allows them to better understand their customers, optimize business processes and improve marketing strategies, which can ultimately increase their competitiveness in the market.

2 Methods and Materials

Data mining is the process of processing large amounts of data in order to identify hidden patterns and make decisions based on the results. It includes various methods and algorithms, such as:

1. Clustering is a technique that divides data into groups (clusters) based on similarities between them.


2. Classification is a technique that is used to predict the class of data based on its characteristics.
3. Regression is a method that is used to predict the value of a dependent variable based on the values of independent variables.
4. Association analysis is a technique used to identify relationships between different variables in data.
5. Decision trees are a method based on the construction of a decision tree that predicts the values of a new variable from the available data.
6. Machine learning is a method that uses machine learning algorithms to automatically build models from data.
7. Neural networks are a method that creates models that can learn from data and make predictions on new data [5–8].

The purpose of data mining is to improve the efficiency of decision making in various fields such as medicine, finance and marketing. It makes it possible to identify hidden relationships between various factors, detect trends and predict future events, which helps companies make more informed decisions and improve their competitiveness. Data mining has a wide range of applications, including:

a) medicine: analyzing medical data such as patient data and test results to identify patterns and predict diseases;
b) finance: analyzing financial data such as corporate revenues and expenses to make better-informed decisions;
c) marketing: analyzing data on user behavior to identify the preferences and interests of target audiences and create more effective marketing campaigns;
d) science: analyzing scientific data, such as the results of experiments and observations, to identify new patterns and make discoveries;
e) manufacturing: analyzing manufacturing data, such as product quality and process data, to optimize production and improve productivity, among others [9–13].

In this paper, the decision tree method and the Kohonen map method are used. Decision trees are one of the machine learning methods used to make decisions based on data. A decision tree is a hierarchical structure consisting of nodes and branches: at each node the data is partitioned into subsets, and at each branch the feature that best separates the data is selected. Decision trees work by partitioning the data into smaller subsets (branches) based on feature values; for each subset, the probability that the data belongs to it is estimated, and the decision that leads to the highest expected value is selected. Decision trees are used in various fields, for example in medicine to diagnose diseases, in finance to predict stock prices, and in marketing to analyze user behavior. For example, a decision tree can classify data based on attributes such as age, gender, education and income; each node then corresponds to a particular combination of attributes, and the leaves contain decisions about which category the object belongs to [14–17]. One of the main advantages of decision trees is that they are simple and easy for people to understand. In addition, they can be easily adapted to different tasks and data, which makes them widely used in fields such as medicine, finance and marketing.


Various algorithms such as ID3, C4.5 and CART are used to build the decision tree. Once the tree is built, it can be used to classify or regress new data [16–18].

The Kohonen map method is a data clustering method that uses neural networks to form groups of similar objects. The algorithm works as follows. The data is partitioned into vectors, each vector corresponding to one object. These vectors are then fed to the input of an artificial neural network, which consists of one hidden layer and one output layer. The output layer of the network generates a map in which each element corresponds to one cluster. The map is updated with each new data vector, and clusters are formed automatically based on the proximity between vectors. The Kohonen map method is widely used in clustering, association analysis, classification and other machine learning tasks. It has several advantages, such as the ability to handle large amounts of data, automatic determination of the number of clusters, and robustness to outliers.

The dataset used in this paper consists of 25 976 airline passenger records and 23 attributes that characterize these records [19]. The attributes have the following meaning:

1. Gender – gender of the passenger (Female, Male);
2. Customer Type – the customer type (Loyal customer, Disloyal customer);
3. Age – the actual age of the passenger;
4. Type of Travel – purpose of the passenger's flight (Personal Travel, Business Travel);
5. Class – travel class in the plane (Business, Eco, Eco Plus);
6. Flight Distance – the flight distance of the journey;
7. Inflight WiFi service – satisfaction level with the inflight WiFi service (0 to 5);
8. Departure/Arrival time convenient – satisfaction level with departure/arrival time convenience (0 to 5);
9. Ease of Online booking – satisfaction level with online booking (0 to 5);
10. Gate location – satisfaction level with the gate location (1 to 5);
11. Food and drink – satisfaction level with food and drink (0 to 5);
12. Online boarding – satisfaction level with online boarding (0 to 5);
13. Seat comfort – satisfaction level with seat comfort (0 to 5);
14. Inflight entertainment – satisfaction level with inflight entertainment (0 to 5);
15. On-board service – satisfaction level with on-board service (0 to 5);
16. Leg room service – satisfaction level with leg room service (0 to 5);
17. Baggage handling – satisfaction level with baggage handling (1 to 5);
18. Checkin service – satisfaction level with check-in service (1 to 5);
19. Inflight service – satisfaction level with inflight service (0 to 5);
20. Cleanliness – satisfaction level with cleanliness (0 to 5);
21. Departure Delay in Minutes – minutes of delay at departure;
22. Arrival Delay in Minutes – minutes of delay at arrival.

Target variable: Satisfaction – airline satisfaction level (Satisfaction, Neutral or dissatisfaction).
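To make the Kohonen clustering step concrete, the sketch below maps the numerically encoded passenger records onto a small self-organizing map; the third-party MiniSom package stands in for Deductor's implementation, the 2 x 2 grid is chosen only because four clusters appear later in the paper, and the file name is a placeholder.

import pandas as pd
from minisom import MiniSom

df = pd.read_csv("airline_satisfaction_encoded.csv")   # hypothetical, already normalized
data = df.drop(columns="Satisfaction").to_numpy(dtype=float)

# A 2 x 2 map gives at most four clusters (one per map cell).
som = MiniSom(2, 2, input_len=data.shape[1], sigma=1.0, learning_rate=0.5,
              random_seed=0)
som.train_random(data, num_iteration=10_000)

# Assign every record to the map cell of its best-matching unit.
clusters = [som.winner(row) for row in data]
print(pd.Series(clusters).value_counts())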


3 Results

3.1 Analysis of All Attributes

The purpose of the paper is to identify the factors that influence the Satisfaction attribute, which reflects the level of passenger satisfaction with the service provided by the airline. The data was normalized for this task. Normalization is the process of changing data values within a certain range. It can improve the quality of machine learning, as it reduces the differences between values and makes them more homogeneous; it can also be used to compare different datasets, as it makes their measurements more comparable [18, 20–23]. In the dataset used, the values Female and Male of the Gender attribute were replaced with 1 and 0, respectively. The values Loyal customer and Disloyal customer of the Customer Type attribute were assigned 0 and 1, respectively. The Personal Travel and Business Travel values of the Type of Travel attribute were likewise replaced with 0 and 1. The remaining attributes that underwent normalization are presented in Table 1.

Deductor data analysis software was used to implement the methods and work with the data. It performs information handling operations such as data processing, analysis and visualization. The data processing toolkit automates data processing, speeds it up and improves the quality of the results. Data analysis tools are useful for operations such as detecting anomalies and outliers, classifying and clustering data, finding associations and dependencies between attributes, approximating and predicting unknown values, analyzing time series and forecasting future values, and extracting knowledge from data and building knowledge representation models. Data visualization tools help to present information visually, which makes it easier to understand and analyze; they can be used to create graphs, charts, maps and other types of visualizations that display data quickly and efficiently.

Table 1. Normalized values

Attribute                    Value                        New value
Class                        Business                     0
                             Eco Plus                     1
                             Eco                          2
Flight Distance              0–500                        0
                             501–1000                     1
                             1001–1500                    2
                             1501–2000                    3
                             2001–2500                    4
                             2501–3000                    5
                             3001–3500                    6
                             3501–4000                    7
                             4001–4500                    8
                             4501–5000                    9
Departure Delay in Minutes   0                            0
                             1–5                          1
                             6–40                         2
                             41–120                       3
                             121–180                      4
                             181–240                      5
                             241–360                      6
                             361–1200                     7
Arrival Delay in Minutes     0                            0
                             1–10                         1
                             11–30                        2
                             31–60                        3
                             61–120                       4
                             >120                         5
Satisfaction                 Neutral or dissatisfaction   0
                             Satisfaction                 1
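A minimal pandas sketch of the normalization summarized in Table 1 is given below; the bin edges follow the table, while the file name and the raw string labels are assumptions about the source data.

import pandas as pd

df = pd.read_csv("airline_satisfaction.csv")           # hypothetical raw file

# Categorical replacements described in the text (string labels are assumed).
df["Gender"] = df["Gender"].map({"Female": 1, "Male": 0})
df["Class"] = df["Class"].map({"Business": 0, "Eco Plus": 1, "Eco": 2})
df["Satisfaction"] = df["Satisfaction"].map(
    {"Neutral or dissatisfaction": 0, "Satisfaction": 1})

# Flight Distance binned as in Table 1: 0-500 -> 0, ..., 4501-5000 -> 9.
df["Flight Distance"] = pd.cut(df["Flight Distance"],
                               bins=range(0, 5001, 500),
                               labels=range(10),
                               include_lowest=True)

# Departure Delay in Minutes binned with the table's uneven edges.
df["Departure Delay in Minutes"] = pd.cut(df["Departure Delay in Minutes"],
                                          bins=[-1, 0, 5, 40, 120, 180, 240, 360, 1200],
                                          labels=range(8))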

Deductor also has the ability to create reports and forecasts based on the results. Data analysis reports help to present the results of a study in a convenient and understandable way; they make it possible to quickly see which data are most important and how they are related to each other, as well as to identify trends and patterns in the data. Forecasts based on the analysis allow future values and trends to be predicted, which can be useful for decision making in business, science and other fields. After normalization, correlation analysis was performed to identify the relationships in the data. Correlation analysis is a statistical technique used to evaluate the relationship between two variables. It is based on the calculation of the correlation coefficient, which shows how strongly changes in one variable are related to changes in another [24–28]. The correlation coefficient takes values from −1 to +1, where −1 means complete negative correlation, 0 means no correlation, and +1 means complete positive correlation. Figure 1 shows the obtained correlation table. It can be seen that the factors Gender and Gate location have almost zero correlation, while the factors Customer Type, Class, Departure/Arrival time convenient, Departure Delay in Minutes and Arrival Delay in Minutes have a negative correlation. The rest of the factors have a positive correlation coefficient, which indicates a positive relationship. Among them, Online boarding, Type of Travel, Inflight entertainment and Seat comfort stand out with the highest positive coefficients, which indicates that these factors are directly related to the target attribute Satisfaction.
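The correlation table in Fig. 1 can be approximated with one pandas call on the encoded data, roughly as in the sketch below (the file name is assumed):

import pandas as pd

df = pd.read_csv("airline_satisfaction_encoded.csv")   # hypothetical, fully numeric
corr = df.drop(columns="Satisfaction").corrwith(df["Satisfaction"])

# Attributes such as Online boarding and Type of Travel should appear near the top.
print(corr.sort_values(ascending=False))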

Fig. 1. Correlation table

Next, a decision tree was created using software tools, a fragment of which is shown in Fig. 2. The tree itself has 696 nodes and 521 rules. The columns of the table have the following meaning: consequence - the decision made, support - the total number of examples classified by this node of the tree, validity - the number of examples correctly classified by this node.


In Figure 2, the following rules correspond to the Type of Travel = 0 node:

a) IF Type of Travel = 0 AND Inflight WiFi service = 0 THEN Satisfaction = 1;
b) IF Type of Travel = 0 AND Inflight WiFi service = 1 THEN Satisfaction = 0;
c) IF Type of Travel = 0 AND Inflight WiFi service = 2 THEN Satisfaction = 0;
d) IF Type of Travel = 0 AND Inflight WiFi service = 3 THEN Satisfaction = 0;
e) IF Type of Travel = 0 AND Inflight WiFi service = 4 THEN Satisfaction = 0;
f) IF Type of Travel = 0 AND Inflight WiFi service = 5 THEN Satisfaction = 1.

The node Type of Travel = 1 demonstrates the following rules:

a) IF Type of Travel = 1 AND Online boarding = 0 AND Inflight WiFi service = 0 THEN Satisfaction = 1;
b) IF Type of Travel = 1 AND Online boarding = 0 AND Inflight WiFi service = 1 THEN Satisfaction = 0;
c) IF Type of Travel = 1 AND Online boarding = 0 AND Inflight WiFi service = 2 THEN Satisfaction = 0;
d) IF Type of Travel = 1 AND Online boarding = 0 AND Inflight WiFi service = 3 THEN Satisfaction = 0;
e) IF Type of Travel = 1 AND Online boarding = 0 AND Inflight WiFi service = 4 THEN Satisfaction = 0;
f) IF Type of Travel = 1 AND Online boarding = 0 AND Inflight WiFi service = 5 THEN Satisfaction = 1;
g) IF Type of Travel = 1 AND Online boarding = 1 AND Inflight WiFi service = 0 THEN Satisfaction = 1;
h) IF Type of Travel = 1 AND Online boarding = 1 AND Inflight WiFi service = 1 AND Customer Type = 0 AND Inflight entertainment = 0 THEN Satisfaction = 0;
i) IF Type of Travel = 1 AND Online boarding = 1 AND Inflight WiFi service = 1 AND Customer Type = 0 AND Inflight entertainment = 1 THEN Satisfaction = 0;
j) IF Type of Travel = 1 AND Online boarding = 1 AND Inflight WiFi service = 1 AND Customer Type = 0 AND Inflight entertainment = 2 THEN Satisfaction = 0;
k) IF Type of Travel = 1 AND Online boarding = 1 AND Inflight WiFi service = 1 AND Customer Type = 0 AND Inflight entertainment = 3 AND Cleanliness = 0 THEN Satisfaction = 0, etc.

Figure 3 shows the attribute importance table, which reflects the importance of all attributes in the dataset and helps in determining the importance of specific factors for the task. The table shows that the attributes Online boarding, Inflight WiFi service and Type of Travel have the highest percentage of importance.

The contiguity table shown in Fig. 4 contains the results of the decision tree algorithm. The total number of Neutral or dissatisfaction values of the Satisfaction attribute in the dataset is 14 573; 14 306 records were recognized correctly and 708 records were assigned to this value incorrectly, i.e. the algorithm placed 15 014 records in the Neutral or dissatisfaction group. The value Satisfaction occurs 11 403 times in the set; the algorithm recognized 10 695 records correctly and assigned 267 records to this group incorrectly. The error of the algorithm is 3.8%.
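Comparable IF ... THEN rules can be read off a fitted tree programmatically; the sketch below uses scikit-learn's export_text in place of Deductor's rule view, with a shallow tree for readability and an assumed file name.

import pandas as pd
from sklearn.tree import DecisionTreeClassifier, export_text

df = pd.read_csv("airline_satisfaction_encoded.csv")   # hypothetical file
X, y = df.drop(columns="Satisfaction"), df["Satisfaction"]

tree = DecisionTreeClassifier(max_depth=4, random_state=0).fit(X, y)

# Textual rule listing: each path from root to leaf is one IF ... THEN rule.
print(export_text(tree, feature_names=list(X.columns)))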


Fig. 2. Decision tree fragment

The Kohonen maps shown in Fig. 5 represent a set of points, where each point is a separate data element and each map corresponds to a specific factor; the distance between points reflects the degree to which they are similar to or different from each other. As a visualization method, Kohonen maps allow complex relationships between features and observations to be displayed quickly and clearly. The contiguity table obtained with this method is presented in Fig. 6. Of the 14 573 records with the Neutral or dissatisfaction value, the algorithm correctly recognized 13 546 and misclassified 1 027, while 2 047 records were incorrectly assigned to this group. Of the 11 403 records with the Satisfaction value, 9 356 were correctly recognized and 2 047 were misclassified, while 1 027 records were incorrectly assigned to this group.
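The reported error rate follows directly from these contiguity-table counts, as the short check below shows:

# Misclassified records = records of each class not recognized as that class.
neutral_total, neutral_correct = 14_573, 13_546
satisfied_total, satisfied_correct = 11_403, 9_356

errors = (neutral_total - neutral_correct) + (satisfied_total - satisfied_correct)
total = neutral_total + satisfied_total
print(errors, f"{errors / total:.1%}")   # 3074 errors, about 11.8%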


Fig. 3. Attribute importance table

Fig. 4. Contingency table, decision tree method

Figures 7 and 8 show the parameters of the clusters obtained by the algorithm. A cluster is a group of records similar to each other in some attributes. A total of four clusters were identified: the zero cluster included 5 905 records, the first cluster 4 025 records, the second cluster 3 905 records, and the most numerous, third cluster contained 12 141 records. The error of the method was 11.8%. Analyzing the parameters of the clusters, we can see that for the zero cluster all factors except Gender are significant. For the first cluster the attributes Checkin service and Arrival Delay in Minutes were insignificant, although the latter still has a noticeable impact. For the second cluster, the factors Arrival Delay in Minutes and Departure Delay in Minutes have the least impact. For the third cluster, the most insignificant


Fig. 5. Kohonen maps

Fig. 6. Contingency table, Kohonen map method

factor is Departure Delay in Minutes. As a result, Departure Delay in Minutes has the least impact on passenger satisfaction. During the experiment, the decision tree method showed a lower error (3.8%) than the Kohonen map method (11.8%), so the decision tree method was chosen for further work.

3.2 Analyzing Two Groups of Attributes

The target variable of this dataset, Satisfaction, measures passenger satisfaction with the services provided by the airline. However, some of the attributes are not related to service: Gender, Customer Type, Age, Type of Travel and Flight Distance. Thus, we can distinguish two groups of attributes – attributes related to service and attributes not related to service. Using the decision tree method on the non-service group of attributes, the algorithm generated a tree with 49 nodes and 33 rules. The most significant attribute was Type of Travel, with a significance of 63.8%. The second


Fig. 7. Parameters of clusters, first part

most significant attribute was Customer Type, with a significance level of 31.9%. The significance table for this experiment is shown in Fig. 9, and the error in this case was 21.8%. The contiguity table showing the number of correctly and incorrectly recognized records is presented in Fig. 10. There were 17 attributes in the group of attributes related to service. In building the decision tree, 688 nodes and 568 rules were created. The three attributes with the highest level of significance were Inflight WiFi service, Class and Online boarding. The attribute significance table is shown in Fig. 11; the error in this case was 5.5%, and the contiguity table is shown in Fig. 12.

3.3 Analyzing the Attributes with the Highest Importance

Attributes Whose Importance Is Higher Than 1%. The first iteration identified several factors whose significance for the Satisfaction attribute is higher than the others. At the next stage, the attributes with the lowest significance were dropped, and only the attributes whose significance is higher than 1% according to Fig. 3 were used. These attributes


Fig. 8. Parameters of clusters, part two

Fig. 9. Table of importance of non-maintenance attributes

are Online boarding, Inflight WiFi service, Type of Travel, Inflight entertainment, Seat comfort, Customer Type, Age, Baggage handling, Gate location, Inflight service. The resulting tree consisted of 538 nodes and contained 397 rules. The most significant attributes were Online boarding, Inflight WiFi service, and Type of Travel. The error of the method was 4.1%. The attribute significance table and contiguity table are shown in Figs. 13 and 14, respectively.
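The subset experiments of this section can be reproduced in outline with a loop of the following kind; scikit-learn replaces Deductor, only a few of the attribute groups are listed, the error is computed on the training data for brevity, and the file name is assumed.

import pandas as pd
from sklearn.tree import DecisionTreeClassifier

df = pd.read_csv("airline_satisfaction_encoded.csv")   # hypothetical file
y = df["Satisfaction"]

groups = {
    "all attributes": [c for c in df.columns if c != "Satisfaction"],
    "non-service": ["Gender", "Customer Type", "Age", "Type of Travel",
                    "Flight Distance"],
    "top three": ["Online boarding", "Inflight WiFi service", "Type of Travel"],
}

for name, cols in groups.items():
    tree = DecisionTreeClassifier(random_state=0).fit(df[cols], y)
    error = (tree.predict(df[cols]) != y).mean()
    print(f"{name}: {error:.1%}")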


Fig. 10. Contiguity table for non-service attributes

Fig. 11. Table of importance of attributes related to maintenance

Fig. 12. Contiguity table for service-related attributes

The Three Attributes with the Greatest Importance. According to the significance table shown in Fig. 3, it can be seen that the attributes Online boarding, Inflight WiFi service, Type of Travel have the highest percentage of significance. The next stage of the work was devoted to these three attributes. The obtained decision tree includes 39 nodes and 32 rules. The importance of attributes was distributed as 42 875, 29 666, 27 459 for Online boarding, Inflight WiFi service, Type of Travel, respectively. The total number of method errors was 2 845 records out of 25 976, i.e., 10.9%. The attribute significance table and contiguity table are presented in Figs. 15 and 16.


Fig. 13. Table of importance of attributes, the importance of which is higher 1%

Fig. 14. Contiguity table for attributes whose importance is greater than 1%

Fig. 15. Table of importance of the attributes with the highest importance

Fig. 16. Contiguity table for the attributes with the highest significance

3.4 Analysis of Attributes with the Highest Correlation Coefficient

According to the results of the correlation analysis (Fig. 1), the attributes with a correlation coefficient greater than 0.3 were selected. This group of attributes included Online boarding, Type of Travel, Inflight entertainment, Seat comfort, On-board service, Cleanliness, Leg room service. The error of the method turned out to be 9.3%. The decision tree contained 334 rules and 404 nodes. The most significant attributes were Online boarding and Type of


Travel, followed by Inflight entertainment and On-board service. The significance table is shown in Fig. 17 and the contiguity table is shown in Fig. 18.

Fig. 17. Table of significance of attributes with the highest correlation coefficient

Fig. 18. Pairing table for attributes with the highest correlation coefficient

Table 2 contains the results of all experiments. The lowest error rate was obtained in the experiment with all 22 attributes. A slightly higher error was obtained with the 10 attributes whose significance in the first experiment exceeded 1%. The next lowest error came from the experiment with the 17 service-related attributes, while the worst result was obtained in the experiment with the attributes not related to service. Using a small number of attributes thus gives a larger error than using many attributes, but the number of attributes also determines the cost of the analysis and the related work: the more attributes, the more time and resources may be required to collect, process and analyze the information. Figure 19 shows a graph of the dependence of the number of errors on the number of attributes used. It can be clearly seen that using ten attributes gives almost the same result as using twenty attributes, which may indicate that some attributes, or sets of attributes, are more important than others. Each experiment identified its most significant attributes, and for the most part these attributes matched. The last iteration of the work used the attributes Online boarding, Inflight WiFi service, Type of Travel, Class and Inflight entertainment. The error in this case was 7.8%, which is not the best result overall, but the best result among the experiments with the smallest numbers of attributes.


Table 2. Experimental results

1. All attributes: 22 attributes, 907 errors, 3.8% error rate; significant attributes: Online boarding, Inflight WiFi service, Type of Travel.
2. Non-service attributes: 5 attributes, 5 674 errors, 21.8% error rate; significant attributes: Type of Travel, Customer Type.
3. Attributes related to service: 17 attributes, 1 434 errors, 5.5% error rate; significant attributes: Inflight WiFi service, Class, Online boarding.
4. Attributes whose significance is greater than 1%: 10 attributes, 1 087 errors, 4.1% error rate; significant attributes: Online boarding, Inflight WiFi service, Type of Travel.
5. The three attributes with the greatest importance: 3 attributes, 2 845 errors, 10.9% error rate; significant attributes: Online boarding, Inflight WiFi service, Type of Travel.
6. Attributes with the highest correlation coefficient: 7 attributes, 2 427 errors, 9.3% error rate; significant attributes: Online boarding, Type of Travel, Inflight entertainment.

Fig. 19. Graph of dependence of the number of errors on the number of used attributes


4 Conclusion

Data mining is used to extract useful information from data to help make informed decisions and improve organizational performance. Intelligent analysis can be used to identify hidden relationships and trends in data, determine the most profitable products and services, and optimize production, logistics and marketing processes [29, 30]. This paper focuses on analyzing airline passenger data. The target variable is passenger satisfaction with the company's services. The dataset consists of 25 976 records with 22 attributes related both to the passengers and to the airline's services. Two methods were used to analyze the data – Kohonen's self-organizing map method and the decision tree method. In the first iteration, the decision tree method performed best: the error of the Kohonen map method was 11.8%, while that of the decision tree method was 3.8%. Therefore, the decision tree method was used in the further work. Next, two groups of attributes were analyzed separately – service-related and non-service-related attributes. The worst result of the whole work was obtained in the experiment with the non-service attributes, where the error amounted to 21.8%; in the service-related group the error was 5.5%. At the beginning of the work, a table of significance of all attributes was obtained (Fig. 3). The next two experiments were conducted with all attributes whose significance is higher than 1% and with the three attributes of the highest significance. When working with three attributes, the algorithm produced an error of 10.9%, while its error with the attributes whose significance is above 1% was 4.1%. Correlation analysis was also performed at the beginning of the work to obtain correlation coefficients showing the degree of association between the attributes and the target variable; for one more experiment, the attributes with the highest correlation coefficients were taken, and in this case the algorithm showed an error of 9.3%. The most significant attributes were identified in each experiment and for the most part overlapped. A final experiment was conducted using the attributes Online boarding, Inflight WiFi service, Type of Travel, Class and Inflight entertainment, which were found to be significant in the other experiments. The error in this case was 7.8%, which is not the best performance but the best result among the smallest number of attributes used.

References 1. Masich, S., Tyncheko, V.S., Nelyub, V.A., Bukhtoyarov, V.V., Kurochkin, S.O., Borodulin, A.S.: Paired patterns in logical analysis of data for decision support in recognition. Computation 10(10), 185 (2022). https://doi.org/10.3390/computation10100185 2. Kukartsev, V., Mikhalev, A., Stashkevich, A., Moiseeva, K.: Analysis of data in solving the problem of reducing the accident rate through the use of special means on public roads. In: 2022 IEEE International IOT, Electronics and Mechatronics Conference (IEMTRONICS), pp. 1–4. IEEE (2022). https://doi.org/10.1109/IEMTRONICS55184.2022.9795842 3. Kireev, T., Kukartsev, V., Pilipenko, A., Rukosueva, A.: Analysis of the influence of factors on flight delays in the United States using the construction of a mathematical model and regression analysis. In: 2022 IEEE International IOT, Electronics and Mechatronics Conference (IEMTRONICS), pp. 1–5. IEEE (2022). https://doi.org/10.1109/IEMTRONICS55184. 2022.9795721


4. Rassokhin, A., Ponomarev, A., Karlina, A.: Nanostructured high-performance concretes based on low-strength aggregates. Mag. Civil Eng. 110(2), 11015 (2022). https://doi.org/10.34910/ MCE.110.15 5. Rassokhin, A., Ponomarev, A., Shambina, S., Karlina, A.: Different types of basalt fibers for disperse reinforcing of fine-grained concrete. Mag. Civil Eng. 109(1), 10913 (2022). https:// doi.org/10.34910/MCE.109.13 6. Shutaleva, A., et al.: Migration potential of students and development of human capital. Educ. Sci. 12(5), 324 (2022). https://doi.org/10.3390/educsci12050324 7. Barantsov, I., Pniov, A., Koshelev, K., Tynchenko, V., Nelyub, V., Borodulin, A.: Classification of acoustic influences registered with phase-sensitive OTDR using pattern recognition methods. Sensors 23(2), 582 (2023). https://doi.org/10.3390/s23020582 8. Kukartsev, V., Saidov, N., Stupin, A., Shagaeva, O.: Prototype technology decision support system for the EBW process. In: Silhavy, R., Silhavy, P., Prokopova, Z. (eds.) Proceedings of the Computational Methods in Systems and Software, vol. 596, pp. 456–466. Springer, Cham (2022). https://doi.org/10.1007/978-3-031-21435-6_39 9. Lomazov, V.A., Petrosov, D.A., Evsyukov, D.Yu.: Intellectual assessment of staff sufficiency for innovative development of the sustainable regional agro-industrial complex. IOP Conf. Ser. Earth Environ. Sci. 981(2) (2022). https://doi.org/10.1088/1755-1315/981/2/022064 10. Antosz, K., Pasko, L., Gola, A.: The use of intelligent systems to support the decisionmaking process in lean maintenance management. IFAC-PapersOnLine 52(10), 148–153 (2019). https://doi.org/10.1016/j.ifacol.2019.10.037 11. Jaleel, R.A., Abbas, T.M.J.: Design and implementation of efficient decision support system using data mart architecture. In: International Conference on Electrical, Communication, and Computer Engineering, ICECCE, 12–13 June 2020, Istanbul (2020). https://doi.org/10.1109/ ICECCE49384.2020.9179313 12. Sharikov, K.M., Sokolov, G.S., Nelyub, V.A.: Research of transversal properties of winding basalt plastics based on basalt fiber with experimental lubricants. J. Phys. Conf. Ser. 1990(1), 012078 (2021). https://doi.org/10.1088/1742-6596/1990/1/012078 13. Nelyub, V.A., Fedorov, S.Y., Malysheva, G.V.: The study of the structure and properties of elementary carbon fibers with metal coatings. Inorg. Mater. Appl. Res. 12(4), 1037–1041 (2021). https://doi.org/10.1134/S2075113321040316 14. Efremenkov, E., Martyushev, N., Skeeba, V., Grechneva, M., Olisov, A., Ens, A.: Research on the possibility of lowering the manufacturing accuracy of cycloid transmission wheels with intermediate rolling elements and a free cage. Appl. Sci. 12(1), 5 (2021). https://doi.org/10. 3390/su14010250 15. Masich, S., et al.: Prediction of critical filling of a storage area network by machine learning methods. Electronics 11(24), 4150 (2022). https://doi.org/10.3390/electronics11244150 16. Tynchenko, V.S., Boyko, A.A., Kukartsev, V.V., Danilchenko, Yu.V., Fedorova, N.V.: Optimization of customer loyalty evaluation algorithm for retail company. In: Proceedings of the International Conference “Economy in the Modern World” (ICEMW 2018), pp. 177–182 (2018). https://doi.org/10.2991/icemw-18.2018.33 17. Milov, A.V., Tynchenko, V.S., Kukartsev, V.V., Tynchenko, V.V., Antamoshkin, O.A.: Classification of non-normative errors in measuring instruments based on data mining. In: Advances in Engineering Research: International Conference “Aviamechanical Engineering and Transport” (AVENT 2018), pp. 
432–437 (2018). https://doi.org/10.2991/avent-18.201 8.83 18. Bukhtoyarov, V.V., Tynchenko, V.S., Petrovsky, E.A., Dokshanin, S.G., Kukartsev, V.V.: Research of methods for design of regression models of oil and gas refinery technological units. IOP Conf. Ser. Mater. Sci. Eng. 537(4), 042078 (2019). https://doi.org/10.1088/ 1757-899X/537/4/042078


19. Employee-Attrition-Rate, Kaggle. https://www.kaggle.com/datasets/prachi13/employeeattr itionrate. Accessed 21 May 2023 20. Kukartsev, V., Shutkina, E., Moiseeva, K., Korpacheva, L.: Methods and tools for developing an organization development strategy. In: 2022 IEEE International IOT, Electronics and Mechatronics Conference (IEMTRONICS), pp. 1–8. IEEE (2022). https://doi.org/10.1109/ IEMTRONICS55184.2022.9795707 21. Bosikov, I.I.: Modeling and complex analysis of the topology parameters of ventilation networks when ensuring fire safety while developing coal and gas deposits. Fire 6(3), 95 (2023). https://doi.org/10.3390/fire6030095 22. Kukartsev, V.V.: Kohonen maps to organize staff recruitment and study of workers’ absenteeism. J. Phys. Conf. Ser. 1399(3), 033108 (2019). https://doi.org/10.1088/1742-6596/1399/ 3/033108 23. González Rodríguez, G., Gonzalez-Cava, J.M., Méndez Pérez, J.A.: An intelligent decision support system for production planning based on machine learning. J. Intell. Manuf. 31, 1257–1273 (2020). https://doi.org/10.1007/s10845-019-01510-y 24. Antosz, K., Jasiulewicz Kaczmarek, M., Pasko, L., Zhang, C., Wang, S.: Application of machine learning and rough set theory in lean maintenance decision support system development. Eksploatacja i Niezawodnosc - Maintenance Reliab. 23(4), 695–708 (2021). https:// doi.org/10.17531/ein.2021.4.12 25. Mboli, J.S., Thakker, D., Mishra, J.L.: An Internet of Things-enabled decision support system for circular economy business model. J. Cleaner Prod. 52(3), 772–787 (2020). https://doi.org/ 10.1002/spe.2825 26. dos Santos, B.S., Steiner, M.T.A., Fenerich, A.T., Lima, R.H.P.: Data mining and machine learning techniques applied to public health problems: a bibliometric analysis from 2009 to 2018. Comput. Ind. Eng. 138(3), 106120 (2019). https://doi.org/10.1016/j.cie.2019.106120 27. Rahman, M.A., Honan, B., Glanville, T., Hough, P., Walker, K.: Using data mining to predict emergency department length of stay greater than 4 hours. In: 36th Annual Scientific Meeting of the Australasian College for Emergency Medicine (ACEM2019), vol. 32, pp. 416–421 (2020). https://doi.org/10.1111/1742-6723.13474 28. Ayyoubzadeh, S.M., et al.: A study of factors related to patients’ length of stay using data mining techniques in a general hospital in southern Iran. Health Inf. Sci. Syst. 8(1), 9 (2020). https://doi.org/10.1007/s13755-020-0099-8 29. Khalyasmaa, A.I.: Data mining applied to decision support systems for power transformers’ health diagnostics. Mathematics 10(14), 2486 (2022). https://doi.org/10.3390/math10142486 30. Dias, D., Silva, J.S., Bernardino, A.: The prediction of road-accident risk through data mining: a case study from Setubal, Portugal. Informatics 10(1), 17 (2023). https://doi.org/10.3390/inf ormatics10010017

Correlation Analysis and Predictive Factors for Building a Mathematical Model V. A. Nelyub1,2 , V. S. Tynchenko1,3,4 , A. P. Gantimurov1 , Kseniya V. Degtyareva5(B) , and O. I. Kukartseva6 1 Artificial Intelligence Technology Scientific and Education Center, Bauman Moscow State

Technical University, 105005 Moscow, Russia 2 Peter the Great St.Petersburg Polytechnic University, Saint Petersburg, Russia 3 Information-Control Systems Department, Institute of Computer Science and

Telecommunications, Reshetnev Siberian State University of Science and Technology, 660037 Krasnoyarsk, Russia 4 Department of Technological Machines and Equipment of Oil and Gas Complex, School of Petroleum and Natural Gas Engineering, Siberian Federal University, 660041 Krasnoyarsk, Russia 5 Department of Information Economic System, Institute of Engineering and Economics, Reshetnev Siberian State University of Science and Technology, 660037 Krasnoyarsk, Russia [email protected] 6 Department of Systems Analysis and Operations Research, Institute of Informatics and Telecommunications, Reshetnev Siberian State University of Science and Technology, 660037 Krasnoyarsk, Russia

Abstract. The study, published in the journal Nature Medicine, looked at data on 1,000 people from China who were tracked over an average period of six years. The participants were divided into two groups: those who lived in areas with high levels of air pollution and those who lived in areas with low levels of air pollution. The study analyzed data on patients with lung cancer, including their age, gender, exposure to air pollution, alcohol consumption, dust allergy, occupational hazards, genetic risk, chronic lung disease, balanced diet, obesity, smoking, passive smoking, chest pain, cough, hemoptysis, fatigue, weight loss, shortness of breath, wheezing, difficulty swallowing, nail thickening and snoring. Keywords: Data set analysis · correlation analysis · neural network prediction · decision tree algorithm

1 Introduction

The study, published in Nature Medicine, looked at data from 1,000 people in China who were followed for an average of six years. The participants were divided into two groups: those who lived in areas with high levels of air pollution and those who lived in areas with low levels of air pollution.



The dataset contains information about the lung cancer patients, including their age, gender, exposure to air pollution, alcohol consumption, dust allergy, occupational hazards, genetic risk, chronic lung disease, balanced diet, obesity, smoking, passive smoking, chest pain, coughing of blood, fatigue, weight loss, shortness of breath, wheezing, difficulty swallowing, nail thickening, and snoring. The initial data were scored on a ten-point scale, where 0 means the symptom is absent and 9 means it is maximally expressed.

Data Science. Data science focuses on extracting value from complex data sets. The complexity of the data lies in the variety of sources from which it has been extracted; such data must in turn be organized in a way that allows it to be properly interpreted and associated. Different approaches and methods establish causality, with proper investigation establishing meaningful connections between the complex interactions in the systems under study. Thus, the more complex the data, the more complex the mathematical approaches required, including computation, machine learning and system-based methods [1, 6–9].

2 Data Research

The data set was loaded from a text file into Deductor for analysis. Correlation analysis of the data was then performed; the results are presented in Fig. 1. Based on the results of the correlation analysis, Kohonen maps were constructed (Figs. 2 and 3).
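A rough pandas equivalent of this loading and correlation step is sketched below; the tab-separated file name, the target column label and the use of pandas instead of Deductor are all assumptions.

import pandas as pd

# Load the scored patient records from a text file (names are placeholders).
df = pd.read_csv("lung_cancer_levels.txt", sep="\t")
target = "Level"

# Correlation of every symptom/exposure score with the cancer severity level.
corr = df.corr(numeric_only=True)[target].drop(target)
print(corr.abs().sort_values(ascending=False))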


Fig. 1. Significant factors according to the results of correlation analysis

The study’s error rate was 0% in Fig. 4. The dataset was then explored using the decision tree method. All available factors were investigated, with passive smoking (including active smokers) identified as the most significant factor [10–15]. The error of the study was 0% in Fig. 5 (Fig. 6).


Fig. 2. Kohonen maps

Fig. 3. Kohonen maps


Fig. 4. Kohonen maps

Fig. 5. Research Error


Fig. 6. Significance of factors

All factors were then analyzed excluding passive smoking, patient fatigue, wheezing, snoring, obesity and weight loss (Fig. 7); this analysis had an error rate of 0% (Fig. 8). The dataset was then examined using only the passive and active smoking factors (Fig. 9), with an error rate of 12.3% (Fig. 10). Next, the data were examined considering only factors independent of health status, with smoking included (Fig. 11); the error rate was 0.2% (Fig. 12). The dataset was then examined considering only factors directly related to the patients' health, without taking smoking into account (Fig. 13) [16–21].
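The factor-exclusion experiments described here amount to dropping column groups and refitting the tree, roughly as sketched below; the file name and column labels are illustrative stand-ins for the dataset's actual names.

import pandas as pd
from sklearn.tree import DecisionTreeClassifier

df = pd.read_csv("lung_cancer_levels.csv")             # hypothetical file
y = df["Level"]
X_full = df.drop(columns="Level")

# One of the tested configurations: drop the listed factors before fitting.
excluded = ["Passive Smoker", "Fatigue", "Wheezing",
            "Snoring", "Obesity", "Weight Loss"]
X_reduced = X_full.drop(columns=excluded, errors="ignore")

for label, X in {"all factors": X_full, "reduced set": X_reduced}.items():
    tree = DecisionTreeClassifier(random_state=0).fit(X, y)
    print(label, "error:", f"{(tree.predict(X) != y).mean():.1%}")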


Fig. 7. Significance of factors

Fig. 8. Research Error

Fig. 9. Significance of factors

Fig. 10. Research Error


Fig. 11. Significance of factors

Fig. 12. Research Error

Fig. 13. Significance of factors

The study’s margin of error was 0% in Fig. 14. The dataset was then examined considering only factors directly related to the health of the study patients, active and passive smoking [22–28] (Fig. 15). The study’s margin of error was 0% in Fig. 16.


Fig. 14. Research Error

Fig. 15. Significance of factors

Fig. 16. Research Error

The data were then analyzed taking into account the factors directly related to the patients' health together with active and passive smoking, but excluding patient fatigue and wheezing (Fig. 17) [19, 30–33].


Fig. 17. Significance of factors

The study’s margin of error was 0% in Fig. 18.

Fig. 18. Research Error

The last analysis of the extent of lung cancer used only the factors describing bad habits – smoking and the level of alcohol consumption (Fig. 19) [35–40].

Fig. 19. Significance of factors


The error rate of the analysis in Fig. 19 was 1.2% (Fig. 20).

Fig. 20. Research Error

3 Conclusion

As a result of the analysis of several mathematical models, it cannot be excluded that passive smoking, fatigue and wheezing are influencing factors. At the same time, the degree of patient fatigue is a subjective assessment and may reflect an already existing disease, regardless of its severity. Thus, both passive and active smoking, as well as wheezing, are important for reliable prediction.

References

1. Miller, G.W.: Data science and the exposome, pp. 181–209 (2020)
2. Nagar, D., Pannerselvam, K., Ramu, P.: A novel data-driven visualization of n-dimensional feasible region using interpretable self-organizing maps (iSOM) 155, 398–412 (2022)
3. Tang, W., Li, W.: Frictional pressure drop during flow boiling in micro-fin tubes: a new general correlation 159, 120049 (2020)
4. Liu, Y., Jiang, Y., Hou, T., Liu, F.: A new robust fuzzy clustering validity index for imbalanced data sets 547, 579–591 (2021)
5. Li, F., Zhang, X., Zhang, X., Du, C., Xu, Y., Tian, Y.-C.: Cost-sensitive and hybrid-attribute measure multi-decision tree over imbalanced data sets 422, 242–256 (2018)
6. Menzies, T., Kocagüneli, E., Minku, L., Peters, F., Turhan, B.: Chapter 6 - Rule #4: data science is cyclic, pp. 35–38 (2015)
7. Comparison of Data Science Algorithms, pp. 523–529 (2019)
8. Zhu, C., Mei, C., Zhou, R.: Weight-based label-unknown multi-view data set generation approach 146, 1–12 (2019)
9. Griffiths, G.W., Płociniczak, Ł., Schiesser, W.E.: Analysis of cornea curvature using radial basis functions – Part II: fitting to data-set 77, 285–296 (2016)
10. Mariño, L.M.P., de Carvalho, F.D.A.T.: Vector batch SOM algorithms for multi-view dissimilarity data 258, 109994 (2022)
11. Mariño, L.M.P., de Carvalho, F.D.A.T.: Two weighted c-medoids batch SOM algorithms for dissimilarity data 607, 603–619 (2022)
12. He, S.-F., Zhou, Q., Wang, F.: Local wavelet packet decomposition of soil hyperspectral for SOM estimation 125, 104285 (2022)
13. Qiang, Z.: Multi-stage design space reduction technology based on SOM and rough sets, and its application to hull form optimization 213(Part C), 119229 (2023)


14. Kang, H.: Findings of influenza A (H1N1) pneumonia in adults: pattern analysis and prognostic correlation 140(4, Supplement), 758A (2011)
15. Rubio-Rivas, M., Corbella, X.: Clinical phenotypes and prediction of chronicity in sarcoidosis using cluster analysis in a prospective cohort of 694 patients 77, 59–65 (2020)
16. Barchitta, M.: Cluster analysis identifies patients at risk of catheter-associated urinary tract infections in intensive care units: findings from the SPIN-UTI network 107, 57–63 (2021)
17. Wang, R., Fung, B.C.M., Zhu, Y.: Heterogeneous data release for cluster analysis with differential privacy 201–202, 106047 (2020)
18. Carollo, A., Capizzi, P., Martorana, R.: Joint interpretation of seismic refraction tomography and electrical resistivity tomography by cluster analysis to detect buried cavities (2020)
19. Bosikov, I.I., et al.: Modeling and complex analysis of the topology parameters of ventilation networks when ensuring fire safety while developing coal and gas deposits. Fire 6(3), 95 (2023)
20. Mikhalev, A.S., et al.: The Orb-weaving spider algorithm for training of recurrent neural networks. Symmetry 14(10), 2036 (2022)
21. Moiseeva, K., et al.: The impact of coal generation on the ecology of city areas. In: 2023 22nd International Symposium INFOTEH-JAHORINA (INFOTEH), pp. 1–6. IEEE (2023)
22. Kukartsev, V., et al.: Analysis of data in solving the problem of reducing the accident rate through the use of special means on public roads. In: 2022 IEEE International IOT, Electronics and Mechatronics Conference (IEMTRONICS), pp. 1–4. IEEE (2022)
23. Kireev, T., et al.: Analysis of the influence of factors on flight delays in the United States using the construction of a mathematical model and regression analysis. In: 2022 IEEE International IOT, Electronics and Mechatronics Conference (IEMTRONICS), pp. 1–5. IEEE (2022)
24. Kukartsev, V., et al.: Prototype technology decision support system for the EBW process. In: Silhavy, R., Silhavy, P., Prokopova, Z. (eds.) Proceedings of the Computational Methods in Systems and Software, vol. 596, pp. 456–466. Springer, Cham (2022). https://doi.org/10.1007/978-3-031-21435-6_39
25. Kukartsev, V., et al.: Methods and tools for developing an organization development strategy. In: 2022 IEEE International IOT, Electronics and Mechatronics Conference (IEMTRONICS), pp. 1–8. IEEE (2022)
26. Malozyomov, B.V.: Improvement of hybrid electrode material synthesis for energy accumulators based on carbon nanotubes and porous structures. Micromachines 14(7), 1288 (2023)
27. Gutarevich, V.O., et al.: Reducing oscillations in suspension of mine monorail track. Appl. Sci. 13(8), 4671 (2023)
28. Malozyomov, B.V., et al.: Overview of methods for enhanced oil recovery from conventional and unconventional reservoirs. Energies 16(13), 4907 (2023)
29. Strateichuk, D.M., et al.: Morphological features of polycrystalline CdS1−xSex films obtained by screen-printing method. Crystals 13(5), 825 (2023)
30. Malozyomov, B.V., et al.: Study of supercapacitors built in the start-up system of the main diesel locomotive. Energies 16(9), 3909 (2023)
31. Malozyomov, B.V., et al.: Substantiation of drilling parameters for undermined drainage boreholes for increasing methane production from unconventional coal-gas collectors. Energies 16(11), 4276 (2023)
32. Masich, I.S., Tyncheko, V.S., Nelyub, V.A., Bukhtoyarov, V.V., Kurashkin, S.O., Borodulin, A.S.: Paired patterns in logical analysis of data for decision support in recognition. Computation 10(10), 185 (2022)
33. Masich, I.S., et al.: Prediction of critical filling of a storage area network by machine learning methods. Electronics 11(24), 4150 (2022)
34. Barantsov, I.A., et al.: Classification of acoustic influences registered with phase-sensitive OTDR using pattern recognition methods. Sensors 23(2), 582 (2023)


35. Bukhtoyarov, V.V., et al.: A study on a probabilistic method for designing artificial neural networks for the formation of intelligent technology assemblies with high variability. Electronics 12(1), 215 (2023)
36. Rassokhin, A., Ponomarev, A., Karlina, A.: Nanostructured high-performance concretes based on low-strength aggregates. Mag. Civil Eng. 110(2), 11015 (2022)
37. Rassokhin, A., et al.: Different types of basalt fibers for disperse reinforcing of fine-grained concrete. Mag. Civil Eng. 109(1), 10913 (2022)
38. Shutaleva, A., et al.: Migration potential of students and development of human capital. Educ. Sci. 12(5), 324 (2022)
39. Efremenkov, E.A., et al.: Research on the possibility of lowering the manufacturing accuracy of cycloid transmission wheels with intermediate rolling elements and a free cage. Appl. Sci. 12(1), 5 (2021)
40. Shutaleva, A., et al.: Environmental behavior of youth and sustainable development. Sustainability 14(1), 250 (2021)
41. Repinskiy, O.D., et al.: Improving the competitiveness of Russian industry in the production of measuring and analytical equipment. J. Phys. Conf. Ser. 1728(1), 012032 (2021)
42. Balanovskiy, A.E., et al.: Determination of rail steel structural elements via the method of atomic force microscopy. CIS Iron Steel Rev. 23, 86–91 (2022)
43. Kondrat'ev, V.V., et al.: Description of the complex of technical means of an automated control system for the technological process of thermal vortex enrichment. J. Phys. Conf. Ser. 1661, 012101 (2020)
44. Potapenko, I., Kukartsev, V., Tynchenko, V., Mikhalev, A., Ershova, E.: Analysis of the structure of Germany's energy sector with self-organizing Kohonen maps. In: Abramowicz, W., Auer, S., Stróżyna, M. (eds.) Business Information Systems Workshops. LNBIP, vol. 444, pp. 5–13. Springer, Cham (2022). https://doi.org/10.1007/978-3-031-04216-4_1
45. Borodulin, A.S., et al.: Using machine learning algorithms to solve data classification problems using multi-attribute dataset. In: E3S Web of Conferences. EDP Sciences (2023)
46. Nelyub, V.A., et al.: Machine learning to identify key success indicators. In: E3S Web of Conferences. EDP Sciences (2023)
47. Kukartsev, V.V., et al.: Using digital twins to create an inventory management system. In: E3S Web of Conferences. EDP Sciences (2023)
48. Gladkov, A.A., et al.: Development of an automation system for personnel monitoring and control of ordered products. In: E3S Web of Conferences. EDP Sciences (2023)
49. Kukartsev, V.V., et al.: Control system for personnel, fuel and boilers in the boiler house. In: E3S Web of Conferences. EDP Sciences (2023)
50. Kozlova, A.V., et al.: Finding dependencies in the corporate environment using data mining. In: E3S Web of Conferences. EDP Sciences (2023)

Geoportals in Solving the Problem of Natural Hazards Monitoring

Stanislav A. Yamashkin1(B), A. A. Yamashkin1, A. S. Rotanov1, Yu. E. Tepaeva1, E. O. Yamashkina2, and S. M. Kovalenko2

1 National Research Mordovia State University, 68 Bolshevistskaya Street, 430005 Saransk, Russia
[email protected]
2 Institute of Information Technology, MIREA—Russian Technological University, 78 Vernadsky Avenue, 119454 Moscow, Russia

Abstract. The article is devoted to the development of models and algorithms for analyzing the distribution of spectral channels based on pixel-by-pixel classification of metageosystems for the detection of natural and natural-technogenic processes using the example of fires, as well as to the design and development of a geoportal system for monitoring and visualizing the fire situation. The strengths and weaknesses of the machine learning models used, which were trained and fine-tuned on multispectral images to solve the problem of fire localization, were revealed: support vector machines, K-neighbors Classifier, Random Forest Classifier, Gaussian NB, Logistic Regression. The advantages and disadvantages of these models are revealed, and the optimal parameters of the models for solving this problem are experimentally selected. The geoportal system operates on the basis of the PostgreSQL database with the PostGIS extension, the Flask framework for building the system architecture, SQLite storage for space images, as well as FastAPI and the object detector itself. Communication between the services occurs through REST requests. Although the project was developed to solve the problem of monitoring fires, the results obtained can be used in solving other problems related to the analysis and management of natural and natural-technogenic processes.

Keywords: geoportal · natural-social-production systems · natural hazards · spatial data infrastructure · spatial data · metageosystems

1 Introduction

Natural hazards, including fires and ignitions, are increasingly becoming a serious threat to social stability, harming human health, nature, animals and the region as a whole [1]. With timely detection of a source of fire and promptly taken actions, one can not only save lives but also significantly reduce the cost of extinguishing a fire by protecting the territory from its spread and preventing irreparable harm to natural and man-made systems [2]. It is important to solve the problem of increasing the safety and awareness of the population and the security of critical facilities.


The problem of fires is relevant at the present time: in this project, the Astrakhan region (Russia) was selected for the analysis of fire areas. The area of the selected metageosystem covered by fire as of March 16, 2023 was approximately 7,000 hectares; dry vegetation and reeds were burning in the central and southern districts of the region. To solve the design problems, data prepared jointly with Astrakhan State University were used. Effective counteraction to possible emergency situations cannot be ensured only within the framework of the main activities of local governments. In order to ensure fire safety in various areas of socio-economic activity, a program-targeted approach is required, and a fundamentally new approach to solving the problem of fires is needed. First of all, this refers to informing the population and creating conditions for extinguishing fires in the initial stage of their development. Of particular importance in this situation is the use and improvement of fire detection algorithms based on satellite images [3]. The purpose of the article is to develop an information cloud system for monitoring and analyzing spatial data. To achieve this goal, the following tasks were set and solved:

• analysis of the application of Landsat 8 multispectral remote sensing images in the problem of semantic segmentation of fire sources;
• collection of initial data based on space multispectral imaging;
• selection of metrics for analyzing the accuracy of the semantic segmentation models of fire sources;
• analysis of algorithms and models for detecting natural and natural-technogenic processes (using the example of fires);
• training of machine learning models based on multispectral images to solve the problem of fire localization;
• development of an information geoportal system for monitoring and visualization of the fire situation.

The article uses the following research methodology:

• design and development of requirements for the fire detection and visualization system;
• methods and means of software engineering, in particular object-oriented analysis, design and programming;
• methods of artificial intelligence and machine learning in the field of semantic segmentation of objects in images.

The work uses space images from the Landsat 8 satellite; a combination of channels 7, 5 and 3 is used to identify fires. The paper considers an algorithm for analyzing the distribution of channels, as well as a pixel-by-pixel classification approach. The following pixel-by-pixel classification models were built, fine-tuned, trained and tested: SVC, Linear SVC, K-neighbors Classifier, Random Forest Classifier, Gaussian NB, Logistic Regression. As a result of the study, it was determined that the Random Forest Classifier gives the best performance. The scientific novelty of the work lies in the fact that the proposed method for processing satellite fire detection data has made it possible to increase the reliability of estimating the areas covered by fire.


Satellite monitoring of fires provides repeated shooting, coverage of the territory, as well as the required set of spectral channels. The use of a geoportal for visualization and monitoring of fires provides visibility of the detected data for their further analysis. Obtaining operational information about fire danger makes it possible to respond in time to a developing natural disaster; up-to-date and reliable data help to evacuate people and logging equipment in advance, as well as to take the necessary measures to stop the spread of fire. Space monitoring has the following practical applications:

• detection and monitoring of forest fires from space in dynamics;
• cost optimization for forest protection measures (including air patrol routes);
• assessment of large burnt areas;
• preliminary assessment of fire damage to plantings (including identification of dead plantings);
• comparison of data from ground, air and space observations, including feedback from ground and air fire services in the regions;
• integration in one GIS interface of complex information (topographic base, remote sensing data and attributive data) in order to support managerial decisions in the field of monitoring the forest fire situation.

Although the project was developed to solve the problem of monitoring fires, the results obtained, subject to revision, can be used in solving other problems related to the analysis and management of natural and natural-technogenic processes occurring in different regions of the world.

2 Related Work and Research Methods

Space images, i.e. Earth remote sensing (ERS) data acquired from space, are images of the earth's surface made using optical or radar equipment installed on artificial Earth satellites. These satellites — space vehicles with remote sensing equipment — are usually launched into low polar orbits with a height of 500 to 1000 km or into geostationary 24-hour orbits with a height of 36 thousand km. The main parameters of satellite images are:

• spatial resolution, which determines the minimum size of objects distinguishable in the image;
• the possibility of shooting in panchromatic (black-and-white) mode or using several spectral zones to form color images in the natural colors that a person sees, or pseudo-natural colors using channels in the infrared region of the spectrum;
• the size of the area covered by the image.

With the help of Earth remote sensing from space, it is possible to observe the situation with forest fires. The most popular Russian satellites are Roskosmos satellites such as Kanopus-V, Meteor-M, Electro-L and Arktika-M; among foreign ones, Landsat 8 and Sentinel 2 are often used. In this work we use satellite images from the Landsat 8 satellite, assembled by the Orbital Sciences Corporation, because its images have a large number of channels and the best resolution, are publicly available, and are often used for scientific research.


To optimize the accuracy of determining the satellite's attitude, three high-precision astro sensors (ar-3, two of which operate in active mode), a scalable inertial reference unit SIRU (Scalable Inertial Reference Unit), GPS receivers and two three-axis magnetometers are used. The satellite was launched on February 11, 2013 into a Sun-synchronous orbit with a revisit period of 16 days. Landsat 8 has 11 different channels that can be combined. For the research work, a manually annotated dataset of Landsat 8 images was taken, divided into tiles of 256 × 256 pixels. Due to the large amount of data and the relatively small ratio of fire pixels to background pixels, the number of fire pixels in each image was calculated. As a result, a threshold of 100 pixels was chosen and the images satisfying this threshold were selected (Fig. 1).

Fig. 1. An example of a space image and a mask for it
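The tile selection step described above can be reproduced with a few lines of NumPy; the mask file layout (one binary PNG per 256 × 256 tile in a masks/ directory) is an assumption made for illustration.

```python
import glob
import numpy as np
from PIL import Image

KEEP_THRESHOLD = 100                      # minimum number of fire pixels per tile, as chosen above

selected = []
for path in glob.glob("masks/*.png"):     # assumed layout: one binary fire mask per tile
    mask = np.array(Image.open(path)) > 0
    if mask.sum() >= KEEP_THRESHOLD:
        selected.append(path)

print(f"{len(selected)} tiles kept with >= {KEEP_THRESHOLD} fire pixels")
```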

Consider the metrics that will be used in the study [4]:

$Precision = \frac{TP}{TP + FP}$  (1)

$Recall = \frac{TP}{TP + FN}$  (2)

$F_1\ score = 2 \cdot \frac{Precision \cdot Recall}{Precision + Recall}$  (3)

where TP denotes the pixels that the model assigned to the "fire" class and that really belong to this class, FP the pixels that the model erroneously classified as "fire", and FN the fire pixels that the model failed to classify as "fire". In this problem, the Precision metric is the proportion of pixels classified as "fire" that really are fire, while Recall is the proportion of true "fire" pixels that were detected. In simple terms, Precision measures how accurately we localize a fire, and Recall how completely we detect its foci.


To combine these indicators, the F1 score metric was chosen; it can be interpreted as the harmonic mean of precision and recall. Since the classes are treated as equivalent, the macro averaging parameter is used. The models are evaluated in two stages. In the first stage, a model is evaluated on a test set of individual pixels. In the second stage, it is evaluated on a test set of images by comparing the predicted and true masks.
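For reference, a minimal sketch of the two evaluation stages using scikit-learn metrics is shown below; the arrays y_true/y_pred and the mask shapes are placeholders, not data from the study.

```python
import numpy as np
from sklearn.metrics import precision_score, recall_score, f1_score

# Stage 1: pixel test set (placeholder labels, 0 = background, 1 = fire)
y_true = np.array([0, 1, 1, 0, 1, 0])
y_pred = np.array([0, 1, 0, 0, 1, 1])
print("precision:", precision_score(y_true, y_pred))
print("recall   :", recall_score(y_true, y_pred))
print("F1 (macro, both classes weighted equally):", f1_score(y_true, y_pred, average="macro"))

# Stage 2: compare a predicted mask with the true mask, pixel by pixel
true_mask = np.zeros((256, 256), dtype=int)
pred_mask = np.zeros((256, 256), dtype=int)
true_mask[100:120, 100:120] = 1
pred_mask[105:125, 100:120] = 1
print("F1 score mask:", f1_score(true_mask.ravel(), pred_mask.ravel(), average="macro"))
```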

3 Analysis of the Distribution of Spectral Channels in the Problem of Land Classification

Satellite cameras are equipped with sensors that register reflected solar radiation in certain ranges of the electromagnetic spectrum, each of which corresponds to a spectral channel. To highlight fires, a combination of channels 7, 5 and 3 is used in this work, where channel 3 is green (Green) with a wavelength of 0.525–0.600 µm and a resolution of 30 m, channel 5 is near infrared (Near Infrared, NIR) with a wavelength of 0.845–0.885 µm and a resolution of 30 m, and channel 7 is shortwave infrared (Short Wavelength Infrared, SWIR) with a wavelength of 2.100–2.300 µm and a resolution of 30 m. An algorithm based on the analysis of channel distributions was chosen as a starting point. When analyzing the distribution of channel 7, it was found that the "tail" of the distribution often corresponds to fires; in other words, the outliers are often fires. Based on this hypothesis, an algorithm was developed for calculating the distribution threshold above which a point is considered a fire. An example of an image and the distribution over its channels are shown in Fig. 2.

Fig. 2. An example image and the distribution of brightness across channels

To determine the threshold, a hypothesis was introduced that the bulk of the red-channel pixels lies within 2.5 standard deviations of the mean, and everything lying to the right of this boundary is treated as an outlier and classified as fire. This approach provided the following accuracy: precision 0.5839, recall 0.8428, F1 score 0.6084. The main problem of this algorithm is that relying on the distribution of a single channel discards most of the information in the image. For example, when a fire occupies a large part of the image, the mean of the


distribution already lies inside the fire area, yet the algorithm still labels only the outliers as fire, which is not entirely correct. In other words, it is necessary to introduce a fixed boundary above which all red-channel pixels are considered fire, regardless of the distribution. The previous approach also ignores another source of information: the relationship between channel values. To take it into account, classical algorithms or a simple perceptron can be used, fed with the pixel values of each RGB channel of the image. For training, a dataset was compiled in which each row contains the characteristics of an image pixel, namely the values of the RGB channels and the class label: 0 if there is no fire, 1 if there is a fire. Pearson's cross-correlation between the features was calculated on the entire data set. The feature correlation matrices are shown in Fig. 3 (a, b).

Fig. 3. Visualization of the feature correlation matrix: a) heat map of features; b) numerical values of the feature heatmap.
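Returning to the channel-distribution baseline described above, the 2.5 σ cut-off can be written as a short NumPy function; the synthetic tile below is only a placeholder for a real Landsat 8 band-7 array.

```python
import numpy as np

def threshold_fire_mask(band7: np.ndarray, k: float = 2.5) -> np.ndarray:
    """Mark as fire every pixel lying more than k standard deviations above the channel mean."""
    mu, sigma = band7.mean(), band7.std()
    return band7 > mu + k * sigma

# Usage on a synthetic tile (a real tile would come from a Landsat 8 scene)
rng = np.random.default_rng(0)
tile = rng.normal(5000, 300, size=(256, 256))
tile[120:130, 40:60] += 4000                 # artificially "hot" patch standing in for a fire
mask = threshold_fire_mask(tile)
print("fire pixels detected:", int(mask.sum()))
```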

From the heat map, there is a strong correlation (greater than 0.79) between the blue and green channels. Based on this, it was decided to remove the green channel, since blue is less correlated with the remaining features. Since Landsat 8 images have pixel values from 0 to 65535, which is a wide range of values, and most models are sensitive to the data distribution, normalization is necessary. For this, a normalization with a mean of 0 and unit variance was chosen, which is calculated as

$z = \frac{x - u}{s}$  (4)


where u is the mean of the training samples and s is the standard deviation of the training samples. The following models were used for training:

• C-classification of support vectors (SVC);
• linear support vector classifier (Linear SVC);
• k nearest neighbors (K-neighbors Classifier);
• random forest (Random Forest Classifier);
• Gaussian Naive Bayes (Gaussian NB);
• logistic regression (Logistic Regression).
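Before turning to the individual models, the standardization of Eq. (4) can be reproduced with scikit-learn's StandardScaler; the pixel values below are placeholders used only to show that the transform matches the formula.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

X = np.array([[12000.0, 4000.0], [30000.0, 9000.0], [65535.0, 20000.0]])  # placeholder pixel values

scaler = StandardScaler().fit(X)
Z = scaler.transform(X)                        # z = (x - u) / s, computed per feature

# The same computation written out explicitly
Z_manual = (X - X.mean(axis=0)) / X.std(axis=0)
print(np.allclose(Z, Z_manual))                # True: StandardScaler implements Eq. (4)
```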

The choice of models is due to their wide range of applications for solving various kinds of problems. For each model, a selection of optimal parameters was made; the iterated parameters and their values are presented in Table 1. Below is a brief description of the models and a demonstration of their work on a sampled dataset consisting of two Landsat 8 channels, red and blue (7 and 3), and fire/no-fire class labels.

The Support Vector Machine (SVM) is a supervised machine learning algorithm used for classification and regression problems [5]. SVM is based on the idea of finding a hyperplane that divides the data into classes in the best possible way. Such a hyperplane is called the optimal hyperplane; it has the greatest distance to the nearest points of the training sample. Such points are called support vectors, which is where the name comes from. The distance between the support vectors and the separating plane is called the margin, and the main goal of the algorithm is to maximize the margin between all support vectors and the hyperplane. The difference between Linear SVC and SVC lies in the implementation: Linear SVC is a faster SVC implementation restricted to a linear kernel and does not involve the use of other kernel types, whereas SVC can be trained with different kernels.

Classification based on nearest neighbors (K-neighbors Classifier) is a type of instance-based, non-generalizing learning [6]. The classification in this method is decided by a simple majority vote of the nearest neighbors of each point: each query point is assigned the dominant class of its neighbors. The main training parameters are the number of nearest neighbors whose votes assign the class and a metric for calculating the distance.

A random forest (Random Forest Classifier) is a meta-estimator that fits a number of decision tree classifiers on different data subsamples and uses averaging to improve prediction accuracy and control overfitting [7]. Each tree in the forest is built from a sample taken with replacement from the training sample. During training, each node is split either over all input features or over a random subset, and the splits are chosen so as to decrease the Gini impurity. In other words, a decision tree tries to maximize the number of pairs of objects of the same class that end up in the same subtree. The main goal of building a random forest is to reduce variance: ordinary trees usually exhibit high variance and a tendency to overfit, and the randomness introduced in the construction yields trees with decorrelated prediction errors, which, when averaged, reduces the variance of the ensemble.


Table 1. Designations and values of the iterated model parameters

Model | Parameters | Values | Description
SVC | kernel | linear, rbf, poly | Kernel type
SVC | C | 0.1, 0.5, 1.0, 10.0, 100.0 | Regularization parameter
Linear SVC | penalty | l1, l2 | Norm of penalty
Linear SVC | C | 0.1, 0.5, 1.0, 10.0, 100.0 | Regularization parameter
K-neighbors Classifier | n_neighbors | 1, 2, 3, 4, 5 | Number of neighbors to assign a class label
K-neighbors Classifier | p | 1, 2, 5 | Distance estimation metric
Random Forest Classifier | n_estimators | 10, 30, 37, 55, 100 | Number of trees in the forest
Random Forest Classifier | criterion | gini, entropy | Function for measuring separation quality
Random Forest Classifier | max_depth | 1, 5, 10, 25 | Maximum tree depth
Random Forest Classifier | min_samples_leaf | 1, 5, 10 | Minimum number of samples required at a leaf node
Random Forest Classifier | min_samples_split | 2, 5, 10 | Minimum number of samples required to split an internal node
Gaussian NB | var_smoothing | 8.4834e−08 | Portion of the largest variance of all features added to variances for calculation stability
Logistic Regression | penalty | l1, l2 | Norm of penalty
Logistic Regression | C | 0.1, 0.5, 1.0, 10.0, 100.0 | The reciprocal of the strength of regularization
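A sketch of how the Table 1 search could be run with scikit-learn is given below, using the Random Forest row of the table and the 3-fold cross-validation mentioned later in the text; the synthetic feature matrix X and labels y are placeholders for the per-pixel dataset assembled above.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

param_grid = {                                   # Random Forest row of Table 1
    "randomforestclassifier__n_estimators": [10, 30, 37, 55, 100],
    "randomforestclassifier__criterion": ["gini", "entropy"],
    "randomforestclassifier__max_depth": [1, 5, 10, 25],
    "randomforestclassifier__min_samples_leaf": [1, 5, 10],
    "randomforestclassifier__min_samples_split": [2, 5, 10],
}

pipe = make_pipeline(StandardScaler(), RandomForestClassifier(random_state=0))
search = GridSearchCV(pipe, param_grid, scoring="f1_macro", cv=3, n_jobs=-1)

# Placeholder data standing in for the real per-pixel features (e.g. bands 7, 5, 3) and labels
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 3))
y = (X[:, 0] > 1.0).astype(int)

search.fit(X, y)
print(search.best_params_, round(search.best_score_, 3))
```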

Naive Bayes (Gaussian NB) is a probabilistic classifier inspired by Bayes' theorem. Its assumption that all predictors are independent of each other is rarely true in real-world scenarios, but the assumption nevertheless works well in most cases. Logistic regression extends linear regression to the case where the dependent variable is categorical, using the log odds as the modelled quantity. The main idea of logistic regression is that the space of the original data can be divided by a linear boundary. The advantages and disadvantages of the models used are presented in Table 2, and the division of the feature space by the different models is shown in Fig. 4.

Table 2. Advantages and disadvantages of the models used

SVC
Advantages: more efficient in high-dimensional spaces; effective when the number of dimensions is greater than the number of samples; relatively memory-efficient; variability due to the use of different kernels.
Disadvantages: not suitable for large datasets; sensitive to noise; does not provide probability estimates directly (an expensive five-fold cross-validation can be used to obtain them).

K-neighbors Classifier
Advantages: easy to implement, intuitive and interpretable; not sensitive to outliers; universal.
Disadvantages: prediction speed depends on the sample size; the number of nearest neighbors always has to be selected.

Random Forest Classifier
Advantages: high reliability and accuracy due to averaging the predictions of multiple decision trees; resistant to overfitting; can be displayed graphically; interpretable; calculates the relative importance of the features, which helps to choose the most significant ones for the classifier.
Disadvantages: slow because of the large number of decision trees; harder to interpret than a single decision tree because of their abundance.

Gaussian NB
Advantages: copes well with multi-class prediction; high prediction speed; works well with categorical features.
Disadvantages: cannot predict a category that is absent in the training sample; limited by the assumption of feature independence, whereas completely independent features are extremely rare in real problems.

Logistic Regression
Advantages: easy to interpret; the model coefficients reflect the importance of the features; learns quickly.
Disadvantages: assumes linearity between the dependent variable and the independent variables; requires a large and balanced sample.


The models were trained with 3-fold cross-validation to improve reliability. The training results are presented in Table 3. The table shows that the Linear SVC pixel-by-pixel classification model gave an increase of about 10% in the F1 score mask relative to the threshold approach, which means that the system determines the fire area in the image better. Since the number of features is small, the class distribution can easily be displayed on a graph; Fig. 5 shows the distribution of the data.

Fig. 4. Separation of the feature space during classification by different models: a) SVC; b) K-neighbors Classifier; c) Random Forest Classifier; d) Gaussian NB; e) Logistic Regression.

The left graph displays all the objects of the "fire" class, the central graph shows the objects of the "not fire" class, and the right graph shows the union of the two. The figure shows that the classes overlap, so it is impossible to draw a dividing line that unambiguously separates them. It is therefore necessary to expand the dataset with additional information.


Table 3. Learning outcomes for pixel-by-pixel classification models

Model | F1 score | F1 score mask | Precision | Recall
SVC | 0.96825 | 0.6560 | 0.5406 | 0.9597
Linear SVC | 0.9553 | 0.7305 | 0.6659 | 0.8651
K-neighbors Classifier | 0.9751 | 0.6342 | 0.5007 | 0.9710
Random Forest Classifier | 0.9762 | 0.6674 | 0.5386 | 0.9856
Gaussian NB | 0.9283 | 0.7121 | 0.7596 | 0.7239
Logistic Regression | 0.9580 | 0.7144 | 0.6389 | 0.8809

Fig. 5. Feature mapping to 2-D plane

To do this, we combine the first and second approaches: for each pixel, we additionally store the standard deviation of every channel over the entire image. The results of model training are presented in Table 4.

Table 4. Learning outcomes for pixel-by-pixel classification models on the extended dataset

Model | F1 score | F1 score mask | Precision | Recall
SVC | 0.9901 | 0.7257 | 0.6120 | 0.9331
Linear SVC | 0.9655 | 0.7617 | 0.7540 | 0.8021
K-neighbors Classifier | 0.9914 | 0.7712 | 0.6639 | 0.9408
Random Forest Classifier | 0.9923 | 0.8009 | 0.6934 | 0.9666
Gaussian NB | 0.9043 | 0.6815 | 0.8058 | 0.6517
Logistic Regression | 0.9736 | 0.7518 | 0.8438 | 0.8434
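The dataset extension described before Table 4 (adding, to every pixel row, the per-image standard deviation of each channel) could look like the NumPy sketch below; the array names and the synthetic image are placeholders.

```python
import numpy as np

def pixel_table(image: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Build rows of [band values..., per-image std of each band, label] for every pixel.

    image: (H, W, C) array of channel values; mask: (H, W) binary fire mask.
    """
    h, w, c = image.shape
    pixels = image.reshape(-1, c).astype(float)
    stds = image.reshape(-1, c).std(axis=0)            # one std per channel, repeated for every pixel
    std_cols = np.tile(stds, (h * w, 1))
    labels = mask.reshape(-1, 1).astype(float)
    return np.hstack([pixels, std_cols, labels])

rng = np.random.default_rng(0)
img = rng.normal(5000, 300, size=(256, 256, 3))        # placeholder tile with 3 channels
msk = (rng.random((256, 256)) < 0.01).astype(int)      # placeholder fire mask
rows = pixel_table(img, msk)
print(rows.shape)                                      # (65536, 7): 3 bands + 3 stds + label
```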


When trained on the extended dataset, the Random Forest Classifier pixel-by-pixel classification model showed an increase in the F1 score mask of about 7% compared to the previous best model, Linear SVC. Attempts were also made to expand the dataset by adding spectral indices; for example, the addition of a fire index was tested, with the index value of each pixel added to the corresponding row of the table. The Pearson correlation of all features was then calculated; Fig. 6 shows the correlation matrix of all features with each other. It can be seen from the figure that the correlation between the fire index and the red channel is −0.8757, i.e. the features are strongly (inversely) correlated, so the fire index carries little additional information. To test this hypothesis, the models were retrained; Table 5 presents the results. The table shows that the hypothesis was confirmed: the fire index does not carry additional information and can be replaced by the 7th image channel.

Fig. 6. Pearson’s correlation of features of all with all

This approach also has disadvantages: the relative position of pixels and their context are not taken into account, and a class label has to be predicted for every pixel, which is computationally expensive. These problems can be eliminated with neural network models for image segmentation; however, they require a large training set, which is difficult to collect and aggregate. As a result of training with cross-validation, the optimal parameters of the models were selected; the segmentation results for the various models are shown in Fig. 7. As can be seen from the figure, the Logistic Regression model sometimes makes mistakes and detects fires where there are none, which is reflected in its lower precision compared to the other models.

4 Fire Monitoring by Means of Geoportals

Geoportals make it possible to monitor spatial processes and make the results of geodata analysis available [8, 9]. A distinctive feature of this project is the ability to view and monitor fires using a web interface linked to the digital map of the geoportal. It is also possible to view crops, water resources and forests on the map for a more detailed analysis of the possible threat and spread of fire. The project implements a geoportal system for recognizing objects on aerial photographs with visualization of the results in a web interface. It has a flexible, scalable architecture that can easily be upgraded to fit particular needs. The system complies with the SOLID principles [10] and has sufficient speed to ensure a comfortable user experience.
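One plausible way the web layer could pull detected fire areas from the PostGIS store for display as a map layer is sketched below; the table and column names (fire_polygons, geom, detected_at) and the connection parameters are assumptions made for illustration, not details taken from the paper.

```python
import json
import psycopg2

# Assumed connection parameters and schema; the paper does not publish them.
conn = psycopg2.connect(dbname="geoportal", user="gis", password="gis", host="localhost")

with conn, conn.cursor() as cur:
    cur.execute(
        """
        SELECT id, ST_AsGeoJSON(geom), detected_at
        FROM fire_polygons                          -- hypothetical table of detected fire areas
        WHERE detected_at >= NOW() - INTERVAL '1 day'
        """
    )
    features = [
        {"type": "Feature",
         "geometry": json.loads(geojson),
         "properties": {"id": fid, "detected_at": str(ts)}}
        for fid, geojson, ts in cur.fetchall()
    ]

layer = {"type": "FeatureCollection", "features": features}   # ready to be drawn as a map layer
print(len(features), "fire polygons detected in the last 24 hours")
```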


Table 5. Learning outcomes for pixel-by-pixel classification models

Model | F1 score | F1 score mask | Precision | Recall
SVC | 0.9913 | 0.7096 | 0.5898 | 0.9372
Linear SVC | 0.9787 | 0.7156 | 0.6323 | 0.8822
K-neighbors Classifier | 0.9922 | 0.7601 | 0.6443 | 0.9410
Random Forest Classifier | 0.9926 | 0.7819 | 0.6753 | 0.9444
Gaussian NB | 0.9566 | 0.7306 | 0.7748 | 0.7403
Logistic Regression | 0.9793 | 0.7075 | 0.6227 | 0.8862

The system is implemented on the basis of the PostgreSQL database with the PostGIS extension, the Flask framework for the web interface, SQLite storage for space images, as well as FastAPI and the object detector itself. Communication between the services occurs through REST requests. Figure 8 shows the architecture of the geoportal system, describing the directions of the data flows and the means of transmitting these data. An example of the implemented system is shown in Fig. 9, which covers a part of the Astrakhan region where frequent fires occurred in March 2023. As can be seen from the figure, the center of the page contains a map that displays information at the request of the user. On the right is a list of objects that can be displayed on the map; checkboxes allow the display of these objects to be enabled or disabled. Detected objects are displayed by superimposing layers on the geomap; an example of such an overlay is shown in the figure. The detected objects are shown in contrasting colors for the convenience of the user. The map and the enabled layers are loaded automatically when the user pans or changes the scale of the map. The implemented system makes it possible to display the aggregated information on a digital map for further monitoring. In the course of the research work, the sources of Earth remote sensing images and their features were analyzed, the most informative types of images were selected, the images were analyzed for the presence of information about a fire, algorithms and methods for segmenting fires in remote sensing images were considered, and a system for recognizing objects on aerial photographs with visualization of the results in a web interface was developed.
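Under the stack named above, the REST interaction between the Flask front end and the detector service could look roughly like the FastAPI sketch below; the endpoint path, payload fields, and model file path are illustrative assumptions, not the authors' published interface.

```python
# detector_service.py - hypothetical FastAPI wrapper around the trained pixel classifier
from fastapi import FastAPI
from pydantic import BaseModel
import numpy as np
import joblib

app = FastAPI()
model = joblib.load("random_forest_fire.joblib")     # assumed path to the trained model

class Tile(BaseModel):
    pixels: list[list[float]]                        # per-pixel feature rows for one 256x256 tile

@app.post("/detect")
def detect(tile: Tile) -> dict:
    X = np.asarray(tile.pixels, dtype=float)
    fire = model.predict(X).astype(int)              # 0 = background, 1 = fire, per pixel
    return {"fire_pixels": int(fire.sum()), "mask": fire.tolist()}

# The Flask front end would then call this service with a plain REST request, e.g.:
#   requests.post("http://detector:8000/detect", json={"pixels": rows}).json()
```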


Fig. 7. Segmentation results for models (original image, true mask, predicted mask): a) SVC; b) LinearSVC; c) K-neighbors Classifier; d) Random Forest Classifier; e) Gaussian NB; f) Logistic Regression.


Fig. 8. Architecture of the geoportal system

Fig. 9. Web service interface for fire monitoring


5 Conclusion

The article is devoted to the development of models and algorithms for analyzing the distribution of spectral channels based on pixel-by-pixel classification of metageosystems for the detection of natural and natural-technogenic processes using the example of fires, as well as to the design and development of a geoportal system for monitoring and visualizing the fire situation. For fire detection, the Landsat 8 satellite was chosen, which offers the best performance in comparison with other Russian and foreign satellites. The number of fire pixels in each image of a manually annotated dataset was calculated. Metrics for analyzing the accuracy of the semantic segmentation models of fire sources were selected and described. Algorithms for analyzing the distribution of channels and a pixel-by-pixel classification approach for detecting natural and natural-technogenic processes, using the example of fires, are proposed, and their strengths and weaknesses are revealed. The following classical machine learning models were trained and fine-tuned on multispectral images to solve the problem of localizing fires: SVC, Linear SVC, K-neighbors Classifier, Random Forest Classifier, Gaussian NB, Logistic Regression. The advantages and disadvantages of these models are revealed, and the optimal parameters of the models for solving this problem are identified. The best of these models was the Random Forest Classifier, with precision 0.6934, recall 0.9666 and F1 score mask 0.8009. A geoportal system for monitoring and visualizing the fire situation was designed and developed on the basis of the PostgreSQL database with the PostGIS extension, the Flask framework for the web interface, SQLite for storing satellite images, as well as FastAPI and the object detector itself. Communication between the services occurs through REST requests. Although the project was developed to solve the problem of monitoring fires, the results obtained can be used in solving other problems related to the analysis and management of natural and natural-technogenic processes.

Acknowledgments. The study was supported by the Russian Science Foundation, grant № 22-27-00651, https://rscf.ru/en/project/22-27-00651/.

References

1. Sim, M.-S., Wee, S.-J., Alcantara, E., Park, E.: Deforestation as the prominent driver of the intensifying wildfire in Cambodia, revealed through geospatial analysis. Remote Sens. (Basel) 15, 3388 (2023). https://doi.org/10.3390/rs15133388
2. Pelletier, N., Millard, K., Darling, S.: Wildfire likelihood in Canadian treed peatlands based on remote-sensing time-series of surface conditions. Remote Sens. Environ. 296, 113747 (2023). https://doi.org/10.1016/j.rse.2023.113747
3. Thangavel, K., et al.: Autonomous satellite wildfire detection using hyperspectral imagery and neural networks: a case study on Australian wildfire. Remote Sens. (Basel) 15, 720 (2023). https://doi.org/10.3390/rs15030720


4. Yacouby, R., Axman, D.: Probabilistic extension of precision, recall, and F1 score for more thorough evaluation of classification models. In: Proceedings of the First Workshop on Evaluation and Comparison of NLP Systems. Association for Computational Linguistics, Stroudsburg, PA, USA (2020)
5. Deng, M., Tanaka, Y., Li, X.: Experimental study on support vector machine-based early detection for sensor faults and operator-based robust fault tolerant control. Machines 10, 123 (2022). https://doi.org/10.3390/machines10020123
6. Müller, P.: Flexible K Nearest Neighbors classifier: derivation and application for ion-mobility spectrometry-based indoor localization (2023). http://arxiv.org/abs/2304.10151
7. Rezaei Barzani, A., Pahlavani, P., Ghorbanzadeh, O.: Ensembling of decision trees, KNN, and logistic regression with soft-voting method for wildfire susceptibility mapping. In: ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences, X-4/W1-2022, pp. 647–652 (2023). https://doi.org/10.5194/isprs-annals-x-4-w1-2022-647-2023
8. Reddy, G.P.O.: Geoportal platforms for sustainable management of natural resources. In: Reddy, G.P.O., Raval, M.S., Adinarayana, J., Chaudhary, S. (eds.) Data Science in Agriculture and Natural Resource Management. SBD, vol. 96, pp. 289–314. Springer, Singapore (2022). https://doi.org/10.1007/978-981-16-5847-1_14
9. Yamashkin, A.A., et al.: Cultural landscapes space-temporal systematization of information in geoportals for the purposes of region tourist and recreational development. GeoJ. Tour. Geosites 29, 440–449 (2020). https://doi.org/10.30892/gtg.29205-480
10. Sarcar, V.: Know SOLID principles. In: Simple and Efficient Programming with C#, pp. 53–107. Apress, Berkeley, CA (2023)

Implementation of Individual Learning Trajectories in LMS Moodle

Faycal Bensalah1, Marjorie P. Daniel2, Indrajit Patra3, Darío Salguero García4, Shokhida Irgasheva5, and Roman Tsarev6,7(B)

1 Lab LERSEM, ENCG, University Chouaib Doukkali, El Jadida, Morocco
2 Zayed University, Dubai, United Arab Emirates
3 Mediterranea International Centre for Human Rights Research, Mediterranea University of Reggio Calabria, Reggio Calabria, Italy
4 Almería University, Almería, Spain
5 Tashkent Institute of Finance, Tashkent, Uzbekistan
6 MIREA - Russian Technological University (RTU MIREA), Moscow, Russia
[email protected]
7 Bauman Moscow State Technical University, Moscow, Russia

Abstract. Informatization of education has entailed significant changes in the organization, content, methods and means of learning. Along with these changes there is a need for personality-oriented, individual learning. The merging of e-learning and individual learning makes it possible to automate the development of individual educational trajectories, i.e. a personal way for each student to realize his or her potential in joint activities with the teacher. Thanks to LMS Moodle and the e-courses created on its basis, the teacher can easily organize learning along individual trajectories and optimize the learning process, taking into account the intellectual, mental and physical characteristics of students. In this paper, the authors formulate the concept of reverse programming, designed to implement the logic of learning along an individual trajectory in LMS Moodle. This concept made it possible to implement in practice the logic of passing through individual learning trajectories using the settings of LMS Moodle. In this study, learning along individual trajectories is driven by test results and by the mode of access to learning material determined by the logic functionality of LMS Moodle. #COMESYSO1120.

Keywords: E-learning · Moodle · Individual Learning Trajectory · Reverse Programming · Activities · Resources

1 Introduction

Innovative trends in the development of electronic learning (e-learning), based on the use of information and communication technologies, open new opportunities for improving the quality of education, primarily due to the appearance of electronic learning environments [1–4]. Such an environment makes it possible to integrate learning tools, resources and subjects of the educational process, as well as to develop electronic courses within the electronic educational environment that provide educational material, ensure its effective assimilation and control the level of acquired skills and knowledge [5–10].


To date, the most popular, accessible and actively developing environment is LMS Moodle [11–14]. This system can be used by students both when studying in classrooms and at home when doing homework or preparing for tests and exams. As for the content of an e-course created in LMS Moodle, its elements can be lessons, assignments and quizzes; in Moodle terms they are called activities [15]. In addition, LMS Moodle provides a wide range of e-course elements such as files (pictures, video files or pdf documents), URLs, or multi-page resources with a book-like format and table of contents [16].

The use of e-courses in LMS Moodle meets students' need for independent work, allows for person-oriented learning, and promotes individualization of education [11, 17–20]. Individualization of education is a form of individual approach: effective attention to each student and to their intellectual, physical and creative abilities [18, 21, 22]. Since individualization of education rests on the students' own responsibility for learning new things, distribution of workload, motivation for a successful result, self-regulation and reflection, the electronic educational environment helps in building an individual trajectory of students' learning.

An individual learning trajectory is a personal path of a student, a certain sequence of elements of learning activity through which each student realizes their own educational goals in accordance with their abilities, interests, level of motivation and type of thinking [23–26]. An individual educational trajectory is built by students together with the teacher, including through alternative forms and methods of learning or mastering the material, alternative methods of control, learning content, the time and pace of mastering the material, and the development of certain competencies to the required level. At the same time, the teacher needs to motivate students to build an individual learning trajectory, designate the expected result, organize constant monitoring of the student's learning along the individual trajectory, control the development of knowledge, and provide assistance and support. This means that the teacher does not withdraw completely to the background in the learning process, despite the high degree of independence and individualization, but performs the role of a mentor and facilitator.

Researchers identify three structural components of an individual learning trajectory [27]: the content component (determining the educational needs of students and, on their basis, the content of education in accordance with the Federal State Standard of Higher Education); the organizational component (technologies, ways, methods, means and forms of education); and the analytical component (continuous monitoring by the teacher and self-analysis by students, as well as adjustment of the individual learning trajectory in accordance with the obtained results). Thus, the principles of implementation of students' individual learning trajectories include result orientation, continuity and unlimited education, self-realization as a driving force, and development of the individuality of a person. The conditions for the realization of an individual learning trajectory are the situation of choice, variability of content, demand for the full potential of the student's personality, and reliance on independent work.


Thus, learning along individual trajectories develops in students such competences as readiness to solve problems; the ability to analyze non-standard problem situations, to set goals and achieve high results, to plan their learning activities, to develop an algorithm for achieving success, to evaluate their activities, to identify gaps in their knowledge and skills, to assess the need for particular information for their learning activities, and to independently search for, analyze and synthesize the necessary information; as well as technological competence, readiness for self-education, readiness to use information resources, readiness for social interaction, and communicative competence.

Within an e-course created with the LMS Moodle functionality, its settings, and logical operations on its elements, we organized learning along an individual trajectory. It includes a placement test, according to the results of which students are assigned to trajectory A, B or C; the first block of educational material (in each block students study the same topics, but the presentation of the material differs depending on the trajectory); the first test, which students take after completing the first block; and the move to the second block of educational material according to the test results, which determines the further individual learning trajectory. We have defined this logic of learning along an individual trajectory as "reverse programming". In contrast to the traditional approach to programming, in which actions occur sequentially, step by step (if a condition is met we choose one action, if it is not met we choose another), the concept of reverse programming is that an action occurs if one of several conditions is met. That is, the situation is reversed: in order to follow some learning trajectory, at least one of the conditions under which a move along this trajectory is allowed must hold.

As an example, consider the move to the first block of educational material corresponding to trajectory B (Fig. 1). To move along individual learning trajectory B, it is necessary either to score from 50 to 75 points on the placement test, or to pass along trajectory A and fail Test 1 (which assesses the knowledge and skills acquired after studying the first block of educational material), or to pass along trajectory C and fail Test 1. Thus, if we consider the classical conditional operator used in most high-level programming languages, if … then…, the key thing is what comes after then rather than after if. In addition, when returning to re-learn the current block of educational material along a different trajectory, the student still has the opportunity to go through the previous learning trajectory. For example, a student was following individual learning trajectory A and did not pass the test. The student now moves to trajectory B, and the corresponding element (activity) in LMS Moodle becomes accessible; however, the LMS Moodle element corresponding to trajectory A is still available. Thus, in contrast to a linear program, in our concept the student can follow either trajectory B or trajectory A to re-learn the material and take the test again.
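The trajectory-B conditions just described (Fig. 1) can be written down as a small predicate. The sketch below is only a pseudocode illustration of the "reverse programming" idea — access opens when at least one of the listed conditions holds — not the syntax that Moodle itself stores; the function name and score arguments are hypothetical.

```python
from typing import Optional

def trajectory_b_available(placement_score: Optional[float],
                           test1_score: Optional[float],
                           current_trajectory: Optional[str]) -> bool:
    """Reverse programming: trajectory B opens if ANY of its entry conditions holds.

    Conditions mirror Fig. 1: 50-75 points on the placement test, OR Test 1 failed
    after studying the first block along trajectory A or trajectory C.
    """
    conditions = [
        placement_score is not None and 50 <= placement_score <= 75,
        current_trajectory == "A" and test1_score is not None and test1_score < 25,
        current_trajectory == "C" and test1_score is not None and test1_score < 25,
    ]
    return any(conditions)

print(trajectory_b_available(60, None, None))    # True: assigned to B by the placement test
print(trajectory_b_available(80, 10, "A"))       # True: returned to B after failing Test 1 on A
print(trajectory_b_available(80, 90, "A"))       # False: stays on trajectory A
```

The 25-point pass threshold for Test 1 is taken from the access-restriction settings discussed later in the paper; any other threshold configured in the Quiz activity would be substituted in the same place.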


Fig. 1. Scheme of taking a course on an individual learning trajectory.

2 Implementation of the Logic of Learning by Individual Trajectory in LMS Moodle on the Basis of the Concept of Reverse Programming

Learning along an individual trajectory in LMS Moodle is organized as follows. After passing the entrance test, the student proceeds to the first block of educational material along the learning trajectory corresponding to the test score obtained. In this block, each trajectory presents material on the same topic, but the presentation differs: for trajectory A it may be more conceptual, while for trajectory C it may contain more graphical material. After completing the first block, each student takes a test on the knowledge and skills acquired (Test 1 in Fig. 1). If the student passes the test successfully, he or she proceeds to the second block of educational material along the learning path corresponding to the score obtained. If the student fails the test, he or she is returned and has the opportunity to learn the same topic along a different learning trajectory.

Below we show how a teacher or course developer can implement learning along an individual trajectory using the logical functionality of LMS Moodle; only the main aspects of e-course setup that relate to the individual learning trajectory are given. At the beginning of the course each student takes a placement test scored from 0 to 100 points. For this purpose, the LMS Moodle element (activity) "Quiz" is used. Note that in Moodle terminology an activity is something students contribute to directly — something a student does that interacts with other students and/or the teacher [15]. To implement the individual learning trajectory we use two types of activities: "Lesson" to deliver content and "Quiz" to design and set quiz tests. These can thus be considered the elements of the electronic course.


Depending on the points scored on the placement test, the student moves along trajectory A (75–100 points), B (50–74 points) or C (less than 50 points). The first block of educational material is implemented with the LMS Moodle activity "Lesson". With this element we implement three trajectories containing Lecture1.A, Lecture1.B and Lecture1.C; the trajectories are implemented by restricting access to these elements. Figure 2 shows the access restriction for Lecture1.A.

Fig. 2. Access restrictions for the activity “Lecture1.A”.

Restrictions imposed on the activity Lecture1.A can be displayed in the e-course (Fig. 3).

Fig. 3. Creation of activity “Lecture1.A” with access restrictions.

In addition to Lecture1.A, it is necessary to create the Lecture1.B and Lecture1.C activities, access to which is determined by the score for the placement test (Figs. 4 and 5). After studying the educational material along the student's individual trajectory, he or she has to take a test. The test can be implemented in LMS Moodle with the "Quiz" activity; for the first block of educational material we call it Test1. This test can be taken only if at least one of the blocks of educational material — Lecture1.A, Lecture1.B or Lecture1.C — has been completed (see Fig. 6). Figure 7 shows the placement test, the first block of educational material that a student can learn by following individual trajectory A, B or C, and the test the student takes after studying it. It can be seen that each of these activities has access


Fig. 4. Access restrictions for the activity “Lecture1.B”.

Fig. 5. Access restrictions for the activity “Lecture1.C”

restrictions, which is what makes it possible to take the course along an individual learning trajectory.

Fig. 6. Access restrictions for the activity “Test1”.

The course set up in this way makes it possible to direct the student, along an individual learning trajectory, to the corresponding block of educational material on the basis


Fig. 7. Initial creation of the elements of individual learning trajectories when going through the first block of educational material.

In addition, after studying this block of educational material and passing a test on it, the further trajectory of learning is determined. However, the most important nuance of e-course customization when learning by individual trajectory is the implementation of repeated learning of the same block of educational material on a different trajectory. Such a need arises if a student fails a test. In this case, the student is forced to return and re-learn the educational material, moving along a different trajectory. For example, having received a score in the range from 50 to 74 points for the placement test, the student is guided along the individual learning trajectory that contains the element Lecture1.B. Assume that after learning this block of educational material, he/she fails Test 1. In this case, he/she is forced to return and re-learn the block of educational material, but along the individual learning trajectory that contains the element Lecture1.C (see Fig. 8). To implement this return, it is necessary to extend the access restrictions of the course element of the Moodle system containing the educational material. An important difference between the implementation of the logic of passing the course on an individual learning trajectory in the LMS Moodle environment and in a high-level programming language is that the access restriction is extended not for the element Lecture1.B, but for Lecture1.C. Its extended restrictions are shown in Fig. 9. This setting of access restrictions (Fig. 9) allows a student to start studying the block of educational material either if he/she scored less than 50 points in the placement test, or if he/she scored less than 25 points when taking Test 1.


Fig. 8. Moving to the trajectory containing the activity “Lecture1.C”

Fig. 9. Access restrictions for the activity “Lecture1.C” taking into account the return due to a failed test.

The second case corresponds to unsuccessful learning of the block of educational material when passing the individual learning trajectory containing Lecture1.B. The new settings are visible on the main page of the course (Fig. 10).


Fig. 10. New access restrictions for the activity “Lecture1.C”.

A similar situation occurs when a student does not pass Test 1 while following an individual learning trajectory containing Lecture1.A. In this case, the student is directed along the trajectory to study the block of educational material Lecture1.B (Fig. 11).

Fig. 11. Moving to the trajectory containing the activity “Lecture1.B”.

The implementation of the proper access restrictions is shown in Fig. 12 and 13.


Fig. 12. Access restrictions for the activity “Lecture1.B” taking into account the return due to a failed test.

Fig. 13. New access restrictions for the activity “Lecture1.B”.

If the student studied on an individual learning trajectory containing Lecture1.C and failed Test1, then he/she moves to Lecture1.B (Fig. 14).


Fig. 14. Moving to the trajectory containing the activity “Lecture1.B” after trajectory C.

The hypothesis of successful completion of this element is based on the fact that Lecture1.B contains the same educational material as Lecture1.C, but presented in a different way. That is, the student re-learns the same educational material viewed from a different perspective. To implement this learning trajectory, it is necessary to extend the access restrictions of Lecture1.B (Fig. 15). After configuring access restrictions for all course elements associated with the first block of educational material, we obtain the scheme shown in Fig. 1, fully functional and implemented by means of LMS Moodle (Fig. 16). Subsequent blocks of educational material are configured in the same way as the first one, with the only exception that, instead of the placement test, the results of the test of the previous block of educational material are used. Thus, the logic of teaching students on an individual learning trajectory in the e-learning environment of LMS Moodle was fully implemented thanks to the functionality of this learning management system.
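The whole "reverse programming" cycle for one block can be summarised in a few lines of code. The sketch below is an illustrative Python model, not a Moodle configuration: the 25-point pass threshold is the one shown for the return from trajectory B (Fig. 9) and is applied uniformly here for simplicity, and the fallback order A to B, B to C, C to B follows Figs. 8, 11 and 14.

```python
# Fallback trajectory used when the block test is failed (Figs. 8, 11, 14).
FALLBACK = {"A": "B", "B": "C", "C": "B"}

# Assumed uniform pass threshold for the block test; the text specifies
# 25 points for the return from trajectory B.
PASS_THRESHOLD = 25


def score_to_trajectory(score: float) -> str:
    """Same grade bands as the placement test: A (75-100), B (50-74), C (<50)."""
    return "A" if score >= 75 else "B" if score >= 50 else "C"


def after_block_test(current: str, test_score: float) -> str:
    """Decide what the student does after taking the test for the current block."""
    if test_score < PASS_THRESHOLD:
        # Failed: re-learn the same block on the fallback trajectory.
        return f"re-learn block 1 via Lecture1.{FALLBACK[current]}"
    # Passed: the next block's trajectory is chosen from this test's score,
    # just as the placement test chose the first one.
    return f"proceed to block 2 on trajectory {score_to_trajectory(test_score)}"


if __name__ == "__main__":
    print(after_block_test("B", 18))  # re-learn block 1 via Lecture1.C
    print(after_block_test("B", 68))  # proceed to block 2 on trajectory B
```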


Fig. 15. Access restrictions for the activity “Lecture1.B” taking into account the return from trajectories A and C.


Fig. 16. Implementation of individual learning trajectory when studying the first block of educational material.

3 Conclusion

A significant increase in the role of information and communication technologies in education makes it easy to realize new, non-traditional ways of achieving results. Currently existing e-learning techniques and technologies, and LMS Moodle in particular,


allow implementing the principle of individualization of education, which is to construct the learning process on the basis of the individual characteristics of students. Each student is a unique personality with his or her own intellectual and psychological features, type of thinking, ability to solve the set learning tasks quickly or at a moderate pace, and to choose his or her own methods and means. The ability to build individual learning trajectories is determined by differences in cognitive styles (i.e., memory, thinking, perception of information). All these differences between students should be taken into account in the process of education and in the choice of the way educational material is presented; the teacher needs to reveal the strengths of students, motivate them and create a situation of success, acting as an assistant, guide and adviser.

An individual learning trajectory is understood as a set of methods, forms, means and techniques of independent activity of students in the process of achieving the educational goal, taking into account their individual abilities. Building individual learning trajectories with the help of LMS Moodle is a unique opportunity for effective learning, achieving high student results and improving the quality of education in general. The process of designing and implementing individual learning trajectories within e-courses in LMS Moodle requires taking all these principles and conditions into account.

The functional capabilities of LMS Moodle and the mechanism of work of the elements of the e-course allowed us to develop the concept of "reverse programming". Its advantage and difference from classic programming is a non-linear logic of actions: the ability not only to learn on the current individual learning trajectory, but also to return to other, already passed paths of effective achievement of results. Within the framework of the developed logic of passing an individual learning trajectory, we set up blocks of educational materials with access restrictions in accordance with the concept of reverse programming for determining individual educational trajectories. The access restrictions were set on the basis of the obtained test scores, and the possibility of re-learning the educational material on another individual trajectory in the case of an unsuccessfully passed test contributed to effective memorization and improvement of learning results. Thus, thanks to the functionality of LMS Moodle, university teachers can carry out an automated learning process on individual learning trajectories, reveal the personal potential of students and activate their learning activities.

References

1. Bivic, R.L.E., Ottavi, S., Saulet, P., Louis, P., Coutu, A.: Designing an interactive environment to share educational materials and resources. Application to the geomatics hub at UniLaSalle Beauvais. Comput. Aided Chem. Eng. 52, 3483–3488 (2023). https://doi.org/10.1016/B978-0-443-15274-0.50556-4
2. Dada, D., Laseinde, O.T., Tartibu, L.: Student-centered learning tool for cognitive enhancement in the learning environment. Procedia Comput. Sci. 217, 507–512 (2023). https://doi.org/10.1016/j.procs.2022.12.246


3. Karlsen, K., Aronsen, C., Bjørnnes, T.D., et al.: Integration of E-learning approaches in a post-pandemic learning environment – Norwegian nursing students' recommendations from an action research study. Heliyon 9(2), e13331 (2023). https://doi.org/10.1016/j.heliyon.2023.e13331
4. Lunev, D., Poletykin, S., Kudryavtsev, D.O.: Brain-computer interfaces: technology overview and modern solutions. Mod. Innov. Syst. Technol. 2(3), 0117–0126 (2022). https://doi.org/10.47813/2782-2818-2022-2-3-0117-0126
5. Aljarbouh, A., et al.: Application of the K-medians clustering algorithm for test analysis in E-learning. In: Silhavy, R., Silhavy, P., Prokopova, Z. (eds.) Software Engineering Application in Systems Design. CoMeSySo 2022. LNNS, vol. 596, pp. 249–256. Springer, Cham (2023). https://doi.org/10.1007/978-3-031-21435-6_21
6. Alfalah, A.A.: Factors influencing students' adoption and use of mobile learning management systems (m-LMSs): a quantitative study of Saudi Arabia. Int. J. Inf. Manag. Data Insights 3(1), 100143 (2023). https://doi.org/10.1016/j.jjimei.2022.100143
7. Eshniyazov, A.I.: Teaching the basics of educational robotics in a distance learning format. Inf. Econ. Manag. 2(2), 0301–0310 (2023). https://doi.org/10.47813/2782-5280-2023-2-2-0301-0310
8. Hanaysha, J.R., Shriedeh, F.B., In'airat, M.: Impact of classroom environment, teacher competency, information and communication technology resources, and university facilities on student engagement and academic performance. Int. J. Inf. Manag. Data Insights 3(2), 100188 (2023). https://doi.org/10.1016/j.jjimei.2023.100188
9. Ishankhodjayev, G., Sultanov, M., Nurmamedov, B.: Issues of development of intelligent information electric power systems. Mod. Innov. Syst. Technol. 2(2), 0251–0263 (2022). https://doi.org/10.47813/2782-2818-2022-2-2-0251-0263
10. Tsarev, R., et al.: Improving test quality in E-learning systems. In: Silhavy, R., Silhavy, P. (eds.) Networks and Systems in Cybernetics. CSOC 2023. LNNS, vol. 723, pp. 62–68. Springer, Cham (2023). https://doi.org/10.1007/978-3-031-35317-8_6
11. Bengueddach, A., Boudia, C., Bouamrane, K.: Interpretive analysis of online teaching labs constructed using Moodle during the pandemic period. Heliyon 9(5), e16007 (2023). https://doi.org/10.1016/j.heliyon.2023.e16007
12. Dascalu, M.-D., Ruseti, S., Dascalu, M., et al.: Before and during COVID-19: a cohesion network analysis of students' online participation in Moodle courses. Comput. Hum. Behav. 121, 106780 (2021). https://doi.org/10.1016/j.chb.2021.106780
13. Dobashi, K., Ho, C.P., Fulford, C.P., Lin, M.-F.G., Higa, C.: Learning pattern classification using Moodle logs and the visualization of browsing processes by time-series cross-section. Comput. Educ. Artif. Intell. 3, 100105 (2022). https://doi.org/10.1016/j.caeai.2022.100105
14. Hachicha, W., Ghorbel, L., Champagnat, R., Zayani, C.A., Amous, I.: Using process mining for learning resource recommendation: a Moodle case study. Procedia Comput. Sci. 192, 853–862 (2021). https://doi.org/10.1016/j.procs.2021.08.088
15. Moodle Documentation, Activities. https://docs.moodle.org/402/en/Activities. Accessed 8 Aug 2023
16. Moodle Documentation, Resources. https://docs.moodle.org/402/en/Resources. Accessed 8 Aug 2023
17. Deetjen-Ruiz, R., et al.: Applying ant colony optimisation when choosing an individual learning trajectory. In: Silhavy, R., Silhavy, P. (eds.) Networks and Systems in Cybernetics. CSOC 2023. LNNS, vol. 723, pp. 587–594. Springer, Cham (2023). https://doi.org/10.1007/978-3-031-35317-8_53
18. Dietrich, J., Greiner, F., Weber-Liel, D., et al.: Does an individualized learning design improve university student online learning? A randomized field experiment. Comput. Hum. Behav. 122, 106819 (2021). https://doi.org/10.1016/j.chb.2021.106819


19. Kim, E., Park, H., Jang, J.: Development of a class model for improving creative collaboration based on the online learning system (Moodle) in Korea. J. Open Innov. Technol. Market Complexity 5(3), 67 (2019). https://doi.org/10.3390/joitmc5030067
20. Tsarev, R., et al.: Gamification of the graph theory course. Finding the shortest path by a greedy algorithm. In: Silhavy, R., Silhavy, P. (eds.) Networks and Systems in Cybernetics. CSOC 2023. LNNS, vol. 723, pp. 209–216. Springer, Cham (2023). https://doi.org/10.1007/978-3-031-35317-8_18
21. Arvidsson, T.S., Kuhn, D.: Realizing the full potential of individualizing learning. Contemp. Educ. Psychol. 65, 101960 (2021). https://doi.org/10.1016/j.cedpsych.2021.101960
22. Tsarev, R.Y., et al.: An approach to developing adaptive electronic educational course. Adv. Intell. Syst. Comput. 986, 332–341 (2019). https://doi.org/10.1007/978-3-030-19813-8_34
23. Bezverhny, E., Dadteev, K., Barykin, L., Nemeshaev, S., Klimov, V.: Use of chat bots in learning management systems. Procedia Comput. Sci. 169, 652–655 (2020). https://doi.org/10.1016/j.procs.2020.02.195
24. Pavlenko, D., Barykin, L., Nemeshaev, S., Bezverhny, E.: Individual approach to knowledge control in learning management system. Procedia Comput. Sci. 169, 259–263 (2020). https://doi.org/10.1016/j.procs.2020.02.162
25. Shavetov, S., Borisov, O., Borisova, E., Zhivitskii, A.: Student advising services in control systems and robotics. IFAC-PapersOnLine 55(17), 13–18 (2022). https://doi.org/10.1016/j.ifacol.2022.09.218
26. Vishtak, O., Zemskov, V., Vishtak, N., et al.: The automated information systems for the education of specialists of the energy industry. Procedia Comput. Sci. 169, 430–434 (2020). https://doi.org/10.1016/j.procs.2020.02.240
27. Vdovina, S., Kungurova, I.: The nature and directions of the individual educational trajectory. Eurasian Sci. J. 6(19), 40PVN613 (2013)

Application of Fuzzy Logic for Evaluating Student Learning Outcomes in E-Learning

Mikaël A. Mousse1(B), Saman M. Almufti2, Darío Salguero García3, Ikhlef Jebbor4, Ayman Aljarbouh5, and Roman Tsarev6,7

1 Institut Universitaire de Technologie, Université de Parakou, Parakou, Benin
[email protected]
2 Computer Science Department, College of Science, Nawroz University, Duhok, Iraq
3 Almería University, Almería, Spain
4 Ibn Tofail University, Kenetra, Morocco
5 University of Central Asia, Naryn, Kyrgyz Republic
6 MIREA - Russian Technological University (RTU MIREA), Moscow, Russia
7 Bauman Moscow State Technical University, Moscow, Russia

Abstract. Electronic education significantly expands the possibilities of traditional education, both in terms of electronic educational environments and new educational technologies. An electronic educational environment allows students to access the materials of the course they are studying; besides, there is an opportunity to evaluate the results of learning. This article considers the application of fuzzy logic in the evaluation of students' results when taking a course. Fuzzy logic makes it possible to take into account the inaccuracies and uncertainties that are inherent in the educational process. Unlike classical assessment methods, which often operate with rigid rules and clear boundaries, fuzzy logic allows different levels of knowledge, skills and other criteria to be taken into account when assessing learning outcomes. This is particularly important in an educational context where students have different abilities, interests and learning needs. The application of fuzzy logic allows for a more objective evaluation of student learning outcomes and contributes to improving the quality of education. #COMESYSO1120.

Keywords: E-learning · Fuzzy Logic · Evaluation Method · Center of Gravity Method

1 Introduction

In today's educational context, e-learning plays an increasingly important role in providing students with flexibility and accessibility to educational resources [1–5]. However, with the increasing number of students who are using e-learning, there is a need for effective methods to evaluate their results [4, 6–10]. Evaluation of students' knowledge and skills plays a crucial role. It motivates and inspires students for further achievements and influences their learning process as well as future employability [11–16].


However, the evaluation process faces various uncertainties that may distort the actual results. These factors include unclear assessment requirements, limited time to check assignments, and other factors. Fuzzy logic can be used to minimize the uncertainty in the evaluation of students' knowledge and skills. Fuzzy logic, based on fuzzy set theory, offers tools for dealing with information containing uncertainty, which is often present in descriptions and modeling of educational data [17–20]. A multitude of systems, including expert systems and decision-making systems, apply fuzzy logic algorithms to account for uncertainty [21–24]. This paper examines the application of fuzzy logic in the evaluation of students' outcomes.

Fuzzy logic is based on the theory of fuzzy sets. A fuzzy set is characterized by a membership function, which determines how much the elements belong to the set. The values of the membership function range from 0 to 1, where 0 means that the element does not belong to the set and 1 means that the element fully belongs to the set. Intermediate values reflect the degree of belonging of elements to the set. In the context of evaluating students' knowledge and skills, fuzzy logic allows for uncertainty and imprecision in evaluations. Instead of rigidly dividing answers into "correct" and "incorrect", fuzzy logic allows expressing the degree of confidence in a student's answer. For example, instead of a grade of "correct" or "incorrect", a grade of "high confidence", "medium confidence", etc. can be used.

The application of fuzzy logic in the evaluation of students' knowledge and skills is of great importance for university teachers. Nowadays, in the educational process, teachers are faced with the task of student evaluation, which is important for assessing students' success, motivation and future employability. However, student evaluation is often subject to uncertainty and various factors that distort results. Traditional evaluation methods based on crisp logic and on numerical and letter scales may be limited in their ability to fully evaluate students' outcomes. In this context, the application of fuzzy logic is an approach that allows teachers to evaluate students more flexibly and accurately.

2 Methods

When evaluating the student's work during the academic semester using fuzzy logic, it is necessary to determine the significant parameters that should be taken into account in the formation of the final grade. In this paper the following parameters were used:

• difficulty of the topics studied during the semester;
• understanding of the subject;
• ability to apply the knowledge obtained during the semester;
• class attendance.

These parameters play an important role in the evaluation process and can be described using fuzzy sets and membership functions. For example, the degree of difficulty of a topic can be represented by a fuzzy set of "very difficult", "difficult" and "easy" with the corresponding membership functions. In order to evaluate students' outcomes, it is necessary to determine the values that are given to the input of the inference mechanism and reflect the degree of completion


for each of the selected parameters. In the process of analyzing the subject area, the following principles for obtaining these values were formulated.

The evaluation of the difficulty of the topics studied during the semester is calculated as the arithmetic mean of the difficulty of each topic studied:

\[ D = \frac{\sum_{i=1}^{n_t} d_i^t}{n_t}, \]

where \(n_t\) is the number of topics and \(d_i^t\) is the difficulty of the i-th studied topic, i = 1, ..., \(n_t\). To evaluate the difficulty of each topic, we use the teacher's value judgment on a scale from 1 to 5. The teacher analyzes the content of the topic, its scope, abstractness, and the required level of understanding by the students. The teacher then assigns a numerical grade to the topic to reflect its difficulty, where 1 indicates low difficulty and 5 indicates high difficulty.

Understanding of the subject matter in this paper equates to successful completion of the student's independent work. The works are graded taking into account the difficulty coefficients assigned to each work. These values play the role of relative weighting coefficients and are intended to take the performance of difficult works into account to a greater extent than easy ones:

\[ H = \frac{\sum_{i=1}^{n_h} h_i \cdot d_i^h}{\sum_{i=1}^{n_h} d_i^h}, \]

where \(n_h\) is the number of independent works, \(h_i\) is the evaluation of the i-th work, and \(d_i^h\) is the difficulty coefficient of the i-th work, i = 1, ..., \(n_h\).

The next parameter describes how successfully the student applies the acquired knowledge when writing test papers. This parameter is calculated similarly to the previous one:

\[ C = \frac{\sum_{i=1}^{n_c} c_i \cdot d_i^c}{\sum_{i=1}^{n_c} d_i^c}, \]

where \(n_c\) is the number of control works, \(c_i\) is the evaluation of the i-th work, and \(d_i^c\) is the difficulty coefficient of the i-th work, i = 1, ..., \(n_c\).

The class attendance grade is determined by averaging all available attendance grades. For this purpose, the number of classes is counted and the attendance grades for each class are summed. The result is the arithmetic mean of all attendance grades:

\[ A = \frac{\sum_{i=1}^{n_a} a_i}{n_a}, \]

where \(n_a\) is the number of classes and \(a_i\) is the attendance grade of the i-th class, i = 1, ..., \(n_a\).

In order to transform the fuzzy output data into crisp numbers and obtain specific student grades, we apply the Center of Gravity method [25]. This method plays an important role in the transition from a fuzzy set to a crisp number so that specific grades can be given to the students. The Center of Gravity method is one of the most common defuzzification methods in fuzzy logic [26–28]. It is based on the principle of determining the center of gravity of a fuzzy set. For each fuzzy value represented by its membership function, we find the center of gravity by calculating the weighted average of all values in the fuzzy set. This value is a numerical grade that represents the degree to which the given value belongs to the fuzzy set. All calculations were carried out in the Fuzzy application package, which is specifically designed for the application of fuzzy logic and is part of the MATLAB software suite. Figures 1, 2, 3 and 4 show the parameters described above.
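The four formulas above are straightforward to compute; the short Python sketch below does so for one student with made-up input values (only the formulas themselves come from the text, the numbers and names are illustrative).

```python
def mean(values):
    """Arithmetic mean, used for D (topic difficulty) and A (attendance)."""
    return sum(values) / len(values)


def difficulty_weighted_mean(grades, difficulties):
    """Difficulty-weighted average of work grades, used for H and C."""
    return sum(g * d for g, d in zip(grades, difficulties)) / sum(difficulties)


# Illustrative inputs for one student.
topic_difficulty = [3, 4, 2, 5]                  # teacher's 1-5 judgments per topic
independent_grades, independent_diff = [80, 65, 90], [1, 2, 3]
control_grades, control_diff = [70, 85], [2, 2]
attendance = [1, 1, 0, 1, 1]                     # per-class attendance grades

D = mean(topic_difficulty)                                           # difficulty of the topics
H = difficulty_weighted_mean(independent_grades, independent_diff)   # understanding
C = difficulty_weighted_mean(control_grades, control_diff)           # applying knowledge
A = mean(attendance)                                                 # class attendance

print(f"D = {D:.2f}, H = {H:.2f}, C = {C:.2f}, A = {A:.2f}")
```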

Fig. 1. Values of the membership function of the parameter “Difficulty of the topics studied during the semester”.

Fig. 2. Values of the membership function of the parameter “Understanding of the subject”.


Fig. 3. Values of the membership function of the parameter “Ability to apply the knowledge obtained during the semester”.

Fig. 4. Values of the membership function of the parameter “Class attendance”.

3 Results and Discussion

In this work, 330 unique rule conditions were specified, covering combinations of all fuzzy sets of all parameters. The result of the research is shown in Fig. 5. It can be seen that the final evaluation directly depends on the given parameters. The evaluation is produced directly as a ready numerical equivalent based on the input fuzzy sets. This experiment confirms that evaluation using fuzzy logic is not only fully feasible, but also gives adequate results with a relatively large number of input data.

The obtained results confirm the effectiveness of applying fuzzy logic in the evaluation of students' knowledge and skills. We found that the student's final grade directly depends on the given parameters. It should be noted that the successful application of fuzzy logic in the evaluation of students' outcomes requires careful selection of parameters and determination of appropriate membership functions. Further research could optimize and improve the defuzzification and aggregation methods used in the evaluation process. The application of fuzzy logic in the evaluation of learning outcomes in e-learning is a promising area.
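The rule base in the paper was built with the Fuzzy package in MATLAB; as an open-source illustration of the same idea, the sketch below defines a drastically reduced system (two inputs and three rules instead of four inputs and 330 rules) with the scikit-fuzzy library. The universes, term names and rules are illustrative assumptions; scikit-fuzzy's default defuzzification is the centroid, which corresponds to the Center of Gravity method used in the paper.

```python
import numpy as np
import skfuzzy as fuzz
from skfuzzy import control as ctrl

# Two of the four inputs, on illustrative scales.
understanding = ctrl.Antecedent(np.arange(0, 101, 1), "understanding")
attendance = ctrl.Antecedent(np.arange(0.0, 1.01, 0.01), "attendance")
grade = ctrl.Consequent(np.arange(0, 101, 1), "grade")

# Automatic three-term partitions ("poor", "average", "good") for the inputs,
# explicit triangular sets for the output grade.
understanding.automf(3)
attendance.automf(3)
grade["low"] = fuzz.trimf(grade.universe, [0, 0, 50])
grade["medium"] = fuzz.trimf(grade.universe, [30, 55, 80])
grade["high"] = fuzz.trimf(grade.universe, [60, 100, 100])

# A tiny rule base standing in for the 330 rule conditions of the paper.
rules = [
    ctrl.Rule(understanding["good"] & attendance["good"], grade["high"]),
    ctrl.Rule(understanding["average"], grade["medium"]),
    ctrl.Rule(understanding["poor"] | attendance["poor"], grade["low"]),
]

simulation = ctrl.ControlSystemSimulation(ctrl.ControlSystem(rules))
simulation.input["understanding"] = 72
simulation.input["attendance"] = 0.9
simulation.compute()                      # centroid (Center of Gravity) defuzzification
print(round(simulation.output["grade"], 1))
```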


Fig. 5. Evaluation result using fuzzy logic.

Our study expands the understanding of the applicability of fuzzy logic, provides specific methods and formulas for evaluating student outcomes, and enables teachers to assess students more objectively. The results show that it is necessary to investigate the possibility of combining fuzzy logic with other evaluation methods and creating flexible evaluation systems that take into account different aspects of student learning outcomes. The application of fuzzy logic in the evaluation of learning outcomes provides a flexible and adaptive approach that helps to take into account the uncertainty and complexity of the subject material. The results obtained can be useful for the development of effective student evaluation systems. Further development and application of fuzzy logic can significantly improve the student evaluation process and contribute to students' academic growth and development.

4 Conclusion

The use of fuzzy logic in the evaluation of students' knowledge and skills is a promising approach to accommodating uncertainty in the evaluation process. Fuzzy logic provides tools for modeling and accounting for fuzzy concepts as well as for working with fuzzy values. This opens up new possibilities for developing more objective student evaluation systems.

The application of fuzzy logic in education has several advantages. It allows different evaluation parameters to be taken into account: based on parameters such as the complexity of topics, subject knowledge, the ability to apply knowledge and class attendance, teachers can use fuzzy logic to evaluate students. Fuzzy logic, based on fuzzy set theory and membership functions, offers a flexible approach to evaluation by allowing teachers to express the degree to which a student belongs to certain categories or performance levels.


The development of a fuzzy logic based evaluation system involves further research and development. More experiments and comparative analyses will help to make recommendations to improve the effectiveness of such a system. It is also important to take into account the peculiarities of a particular educational context and to adapt the system to the needs of specific educational institutions.

In addition, further research could focus on analyzing the impact of different factors on student evaluation. For example, factors such as student motivation, interactive teaching methods or the use of technology could be investigated as influencing evaluation results and student performance. This will allow for a better identification of the key aspects to be considered when evaluating students' results using fuzzy logic.

It is important to note that fuzzy logic has its limitations that also require further research. For example, fuzzy models can be difficult to interpret and require expert knowledge to adjust parameters and membership functions. In addition, fuzzy logic may encounter problems when dealing with large amounts of data or in the presence of ambiguous information. However, despite this, the use of fuzzy logic in the evaluation of students' knowledge and skills is still a promising area that can improve the quality of education and contribute to a fairer evaluation of students. Further research in this direction may lead to the development of new innovative approaches and solutions that will help to improve educational processes and cope with the challenges of the existing education system.

The use of fuzzy logic in evaluating students' knowledge and skills can also influence teaching approaches. Teachers can use fuzzy logic to analyze and evaluate the effectiveness of their teaching methods, determining whether students are mastering the course and which aspects require additional attention. This will allow teachers to adapt their approaches and methods according to the needs of students, improving the quality of education. However, it should be noted that fuzzy logic is not a universal solution and has its limitations. Its effectiveness may depend on the context and specificity of the educational situation. Therefore, it is important to conduct further research to better understand how exactly fuzzy logic can be applied in specific educational scenarios and what factors may influence its results. Further research and development in this area will contribute to the development of new approaches and methods that will help to improve the quality of education and student performance.

References

1. Deetjen-Ruiz, R., et al.: Applying ant colony optimisation when choosing an individual learning trajectory. In: Silhavy, R., Silhavy, P. (eds.) CSOC 2023. LNNS, vol. 723, pp. 587–594. Springer, Cham (2023). https://doi.org/10.1007/978-3-031-35317-8_53
2. Tsarev, R., et al.: Gamification of the graph theory course. Finding the shortest path by a greedy algorithm. In: Silhavy, R., Silhavy, P. (eds.) CSOC 2023. LNNS, vol. 723, pp. 209–216. Springer, Cham (2023). https://doi.org/10.1007/978-3-031-35317-8_18
3. Tsarev, R.Y., et al.: An approach to developing adaptive electronic educational course. In: Silhavy, R. (eds.) CSOC 2019. Advances in Intelligent Systems and Computing, vol. 986, pp. 332–341. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-19813-8_34


4. Ullah, M.S., Hoque, M., Aziz, M.A., Islam, M.: Analyzing students' e-learning usage and post-usage outcomes in higher education. Comput. Educ. Open 5, 100146 (2023). https://doi.org/10.1016/j.caeo.2023.100146
5. Zhang, Z., Cao, T., Shu, J., Liu, H.: Identifying key factors affecting college students' adoption of the e-learning system in mandatory blended learning environments. Interact. Learn. Environ. 30(8), 1388–1401 (2022)
6. Aljarbouh, A., et al.: Application of the K-medians clustering algorithm for test analysis in E-learning. In: Silhavy, R., Silhavy, P., Prokopova, Z. (eds.) CoMeSySo 2022. LNNS, vol. 596, pp. 249–256. Springer, Cham (2023). https://doi.org/10.1007/978-3-031-21435-6_21
7. Baabdullah, A.M., Alsulaimani, A.A., Allamnakhrah, A., Alalwan, A.A., Dwivedi, Y.K., Rana, N.P.: Usage of augmented reality (AR) and development of e-learning outcomes: an empirical evaluation of students' e-learning experience. Comput. Educ. 177, 104383 (2022). https://doi.org/10.1016/j.compedu.2021.104383
8. Tsarev, R., et al.: Improving test quality in E-learning systems. In: Silhavy, R., Silhavy, P. (eds.) CSOC 2023. LNNS, vol. 723, pp. 62–68. Springer, Cham (2023). https://doi.org/10.1007/978-3-031-35317-8_6
9. Williams, E., Del Fernandes, R., Choi, Fasola, K.L., Zevin, B.: Learning outcomes and educational effectiveness of E-learning as a continuing professional development intervention for practicing surgeons and proceduralists: a systematic review. J. Surg. Educ. 80(8), 1139–1149 (2023). https://doi.org/10.1016/j.jsurg.2023.05.017
10. Wu, I.-L., Hsieh, P.-J., Wu, S.-M.: Developing effective e-learning environments through e-learning use mediating technology affordance and constructivist learning aspects for performance impacts: moderator of learner involvement. Internet High. Educ. 55, 100871 (2022). https://doi.org/10.1016/j.iheduc.2022.100871
11. Hassouni, B.E., et al.: Realization of an educational tool dedicated to teaching the fundamental principles of photovoltaic systems. J. Phys. Conf. Ser. 1399(2), 022044 (2019). https://doi.org/10.1088/1742-6596/1399/2/022044
12. Nikolaeva, I., Sleptsov, Y., Gogoleva, I., Mirzagitova, A., Bystrova, N., Tsarev, R.: Statistical hypothesis testing as an instrument of pedagogical experiment. AIP Conf. Proc. 2647, 020037 (2022). https://doi.org/10.1063/5.0104059
13. Ng, D.T.K., Ching, A.C.H., Law, S.W.: Online learning in management education amid the pandemic: a bibliometric and content analysis. Int. J. Manag. Educ. 21(2), 100796 (2023). https://doi.org/10.1016/j.ijme.2023.100796
14. Pokrovskaia, N.N., Leontyeva, V.L., Ababkova, M.Y., Cappelli, L., D'Ascenzo, F.: Digital communication tools and knowledge creation processes for enriched intellectual outcome—experience of short-term e-learning courses during pandemic. Future Internet 13, 43 (2021). https://doi.org/10.3390/fi13020043
15. Taherdoost, H., Madanchian, M.: Employment of technological-based approaches for creative e-learning; teaching management information systems. Procedia Comput. Sci. 215, 802–808 (2022). https://doi.org/10.1016/j.procs.2022.12.082
16. Akhmetjanov, M., Ruziev, R.: Fundamentals of modeling fire safety education. Inform. Econ. Manag. 1(2), 0301–0308 (2022). https://doi.org/10.47813/2782-5280-2022-1-2-0301-0308
17. Beckel, L.S., Semenenko, M.G., Tsarev, R.Y., Yamskikh, T.N., Knyazkov, A.N., Pupkov, A.N.: Application of fuzzy logic methods to modeling of the process of controlling complex technical systems. IOP Conf. Ser. Mater. Sci. Eng. 560(1), 012046 (2019). https://doi.org/10.1088/1757-899X/560/1/012046
18. Joy, J., Pillai, R.V.G.: Review and classification of content recommenders in E-learning environment. J. King Saud Univ. Comput. Inf. Sci. 34(9), 7670–7685 (2022). https://doi.org/10.1016/j.jksuci.2021.06.009


19. Megahed, M., Mohammed, A.: Modeling adaptive E-Learning environment using facial expressions and fuzzy logic. Expert Syst. Appl. 157, 113460 (2020). https://doi.org/10.1016/j.eswa.2020.113460
20. Zenyutkin, N., Kovalev, D., Tuev, E., Tueva, E.: On the ways of forming information structures for modeling objects, environments and processes. Mod. Innov. Syst. Technol. 1(1), 10–22 (2021). https://doi.org/10.47813/2782-2818-2021-1-1-10-22
21. De, S.K., Roy, B., Bhattacharya, K.: Solving an EPQ model with doubt fuzzy set: a robust intelligent decision-making approach. Knowl.-Based Syst. 235, 107666 (2022). https://doi.org/10.1016/j.knosys.2021.107666
22. Nilashi, M., et al.: Knowledge discovery for course choice decision in massive open online courses using machine learning approaches. Expert Syst. Appl. 199, 117092 (2022). https://doi.org/10.1016/j.eswa.2022.117092
23. Tsarev, R.Y., Durmus, M.S., Ustoglu, I., Morozov, V.A., Pupkov, A.N.: Fuzzy voting algorithms for N-version software. J. Phys. Conf. Ser. 1333(3), 032087 (2019). https://doi.org/10.1088/1742-6596/1333/3/032087
24. Lunev, D., Poletykin, S., Kudryavtsev, D.O.: Brain-computer interfaces: technology overview and modern solutions. Mod. Innov. Syst. Technol. 2(3), 0117–0126 (2022). https://doi.org/10.47813/2782-2818-2022-2-3-0117-0126
25. Zimmermann, H.-J.: Fuzzy Set Theory—and Its Applications. Springer, New York (2001). https://doi.org/10.1007/978-94-010-0646-0
26. Chi, S.-Y., Chien, L.-H.: Why defuzzification matters: an empirical study of fresh fruit supply chain management. Eur. J. Oper. Res. 311(2), 648–659 (2023). https://doi.org/10.1016/j.ejor.2023.05.037
27. Borges, R.E.P., Dias, M.A.G., Neto, A.D.D., Meier, A.: Fuzzy pay-off method for real options: the center of gravity approach with application in oilfield abandonment. Fuzzy Sets Syst. 353, 111–123 (2018). https://doi.org/10.1016/j.fss.2018.03.008
28. Sain, D., Mohan, B.M.: Modeling, simulation and experimental realization of a new nonlinear fuzzy PID controller using center of gravity defuzzification. ISA Trans. 110, 319–327 (2021). https://doi.org/10.1016/j.isatra.2020.10.048

Advancing Recidivism Prediction for Male Juvenile Offenders: A Machine Learning Approach Applied to Prisoners in Hunan Province

Sadia Sultana(B), Israka Jahir, Mabeean Suukyi, Md. Mohibur Rahman Nabil, Afsara Waziha, and Sifat Momen

North South University, Plot 15, Block B, Bashundhara, Dhaka 1229, Bangladesh
{sadia.sultana06,israka.jahir,mabeean.suukyi,mohibur.nabil,afsara.waziha,sifat.momen}@northsouth.edu

Abstract. This study uses a machine learning approach to forecast the likelihood of recidivism among male juvenile offenders. The dataset utilized in this study is the Structured Assessment of Violence Risk in Youth (SAVRY) dataset, which was obtained from Hunan Province, China. After a meticulous examination, a variety of machine learning algorithms were evaluated, including Random Forest, Gradient Boosting, K-Nearest Neighbors (KNN), and Support Vector Machine (SVM). These models demonstrated a remarkable accuracy rate above 95% when implemented on the test dataset. The implementation of ensemble approaches, hyperparameter optimizers, and feature selection methods resulted in an enhanced level of predictability. Furthermore, Explainable AI was used to assess the fairness and validity of the models. Our results demonstrate that this proposed approach has successfully improved the performance and interpretability of the ML models in predicting recidivism among young offenders.

Keywords: Recidivism · Juvenile Offender · Machine Learning (ML) · Explainable AI

1 Introduction

Recidivism refers to a person's relapse into criminal behavior, often after receiving sanctions or undergoing intervention for a previous crime. It is a pertinent concept in the field of criminal justice and has garnered considerable attention from criminal justice systems across the globe [9]. Studies by the National Institute of Justice in the United States indicate that over 68% of individuals who have committed offenses are rearrested within a five-year period subsequent to their release. Recidivism affects not only adults but also young individuals, who actively engage in a relapse into criminal conduct. The recurrent nature of criminal behavior among young individuals under the age of 18 is commonly referred to as juvenile recidivism.


The highest juvenile recidivism rates were 76% within three years and 84% within five years. A study by Joseph Doyle found that 40% of juvenile offenders ended up in adult prison for crimes committed by the time they reached the age of 25 [13]. Juvenile recidivism can be attributed to various potential factors, including social interactions during incarceration, limited employment and economic opportunities, depression, inadequate reintegration into society, maintaining an unchanged lifestyle and social circle after release, and the failure to address underlying issues contributing to criminal behavior during incarceration [11]. The involvement of young individuals in repetitive criminal behaviors poses complex and intricate challenges that transcend their personal lives. The consequences of these challenges extend deeply, impacting communities' overall safety, welfare, and social unity on a significant scale. The rates of recidivism among juvenile offenders exhibit substantial variation across different countries. In the context of the United States, empirical research has indicated that a substantial proportion, ranging from 50% to 75%, of juvenile offenders experience rearrest during a three-year period subsequent to their release. Within the jurisdiction of the United Kingdom, it has been shown that around 37% of juvenile offenders experience a subsequent conviction within a span of one year subsequent to their release. In the context of Australia, the prevailing rate is approximately 46%, whereas in Canada, it is projected to be approximately 42%. The Netherlands has demonstrated comparatively lower rates of recidivism, with percentages ranging between 20% and 40% [11,13,16].

1.1 Research Goal

The purpose of this study is to contribute to the field of juvenile recidivism prediction by providing robust ML models. This study also aims to analyze and identify the key predictors that contribute to the likelihood of re-offending among young individuals. The main objective of this research is to develop an accurate and reliable predictive model for juvenile recidivism. We intend to utilize hyperparameter optimization techniques and feature selection methods to elevate the performance of our models. As our dataset is imbalanced, we plan to implement three different approaches to handle this issue. In addition to accurate predictions, the study aims to incorporate Explainable AI (LIME) techniques to provide meaningful insights and explanations for the predictions. Finally, the machine learning model and its comprehensive explainability features are intended to be deployed on a website to help the police make informed and fair decisions. The remaining part of the article is structured in the following manner: Sect. 2 provides an overview of the related works conducted by researchers. This is followed by the methodology explained in Sect. 3. The results obtained from the model are presented in Sect. 4. Section 5 presents the conclusion along with the potential future directions.

2 Related Works

Recidivism prediction plays a crucial role in various aspects of the criminal justice system, including preventing further crimes and dissuading individuals from reoffending [10]. Since the 1920s, risk assessment tools have been progressively used in criminal justice [2]. The risk assessment tools commonly used in recidivism prediction include LSI-R, HCR-20, VRAG, SAVRY, SAPROF-YV, and many more. By accurately predicting recidivism, authorities can focus their attention and resources on those individuals who are at a higher risk of repeating criminal behavior. Researchers have put in a lot of effort to predict recidivism by using these risk assessment tools along with artificial intelligence methods. This section provides a brief overview of some of the significant research carried out in this area.

In recent times, machine learning has gained significant importance as a valuable tool in predicting recidivism [16]. Numerous researchers have explored the application of machine learning in forecasting recidivism, and Turgut Ozkan [12] is a notable example in this field. For that work, data from the Bureau of Justice Statistics (BJS) was used to train the model, which contains information on 38,624 prisoners. The study compared several models, including random forests, support vector machines, XGBoost, and neural networks, to determine which one can enhance predictive accuracy in the criminal justice system for recidivism. The XGBoost model, a tree-based ensemble method, had the highest test accuracy of 0.778.

Moreover, another study, conducted by Aman Singh and S. Mohapatra, utilizes several ML algorithms for the prediction of recidivism behavior in first-time offenders [15]. This study included 204 male prisoners, aged between 18 and 30 years. The study includes statistical feature selection techniques, such as ANOVA, for identifying relevant attributes from the dataset named Historical, Clinical, and Risk Management-20 (HCR-20). For the classification of recidivism risk factors, a Naive Bayes classifier, K-nearest Neighbor, Multilayer perceptron network, Probabilistic Neural Network, and Support Vector Machine were utilized. The study also applied ensemble learning techniques, resulting in a predictive accuracy of 87.72%.

Furthermore, Marzieh Karimi-Haghighi and Carlos Castillo [6] analyzed data from 2,634 offenders with recorded nationality information to understand how well ML models work in predicting violent recidivism risk. They used Logistic Regression, Multi-layer Perceptron, and Support Vector Machines to assess predictive performance. Another work by Karimi-Haghighi and Castillo used RisCanvi, a risk assessment protocol for violence prevention in which professionals conduct interviews with prisoners. These interviews resulted in the creation of a risk score based on various risk elements [5]. The elements belong to 5 risk areas and cover 2,634 cases. Their study achieved an AUC of 0.76 and 0.73 in predicting violent and general recidivism respectively.

Min Kyung and Hyang Sook Kim [8] have performed a discriminant analysis of high-risk recidivism in criminal offenders based on psychopathological factors


from MMPI-2-RF profiles. Their dataset contained 182 violent offenders assessed with psychological testing from prosecutors' offices nationwide in Korea. Based on the Korean Offender Recidivism Assessment System-General (KORAS-G) and the Psychopathy Checklist-Revised (PCL-R), they divided the offenders into low-to-moderate and high-risk recidivism groups. They used a discriminant function analysis to investigate the contributing factors for each recidivism assessment and to identify the "highest recidivistic group", which has high-risk clinical cutoffs on both assessments. They obtained an accuracy of 63.8% to 76.1%.

Jiansong Zhou and his teammates [17] investigated the applicability of the Structured Assessment of Violence Risk in Youth (SAVRY) for predicting violent reoffending among 246 male juvenile offenders in China. Chi-square tests were employed to compare categorical variables, and Mann-Whitney U tests were used for continuous variables not normally distributed. The analyses involved Receiver Operating Characteristic (ROC) analysis to assess the predictive ability of SAVRY scores for violent reoffending, and univariate logistic regression to examine the association of individual SAVRY items. Their analysis identified 7 out of the 30 SAVRY items significantly associated with reoffending.

While other studies heavily rely on a risk-focused approach, Kleeven and his team [7] examine the predictive validity of protective factors in addition to risk factors. For this purpose, they utilized two assessment tools: the SAVRY and the Structured Assessment of Protective Factors for violence risk - Youth Version (SAPROF-YV). For the dataset, they used a total of 354 youth offenders discharged from a Dutch juvenile justice institution. To see how well the SAVRY and SAPROF-YV tools predict future recidivism, the researchers used ROC analysis. They considered the total scores and domain scores from these assessment tools, along with summary risk ratings for violence and nonviolence, as variables in their analysis. According to the paper, both assessment tools show moderate predictive accuracy, with AUC values between 0.63 and 0.75.

Unlike the majority of research, which relies on traditional risk assessment tools, another work focuses on predicting re-offense using Mobile Neurocognitive Assessment Software (NCRA), which supports decision-making through a suite of neurocognitive tests. The author, Gabe Haarshma, prioritizes this assessment over traditional risk assessments, emphasizing dynamic factors over static ones. The study involved 730 participants. The NCRA tests included the Eriksen Flanker, Balloon analog risk task, Go/no-go, Point-subtraction aggression paradigm, Reading the mind through the eyes, Emotional Stroop, and Tower of London, aiming to uncover cognitive traits linked to criminal behavior and their role in predicting recidivism. They measured the NCRA performance using the ROC AUC metric. They built multiple classification models; among these, the Glmnet algorithm achieved a promising 0.70 AUC score, outperforming other models across all feature sets [3]. Table 1 presents a summary comparison of relevant studies.


Table 1. A comparative analysis of related works

Predicting recidivism through machine learning [12]
Size: 38,624. Data collection means: Bureau of Justice Statistics (BJS). Advantages: precise recidivism risk assessments compared to traditional methods. Limitations: there were other factors that matter and were not included.

Development of risk assessment framework for first-time offenders using ensemble learning [15]
Size: 204. Age: 18 to 30. Data collection means: recidivism behavior in first-time offenders. Advantages: the parameters of each classifier were fine-tuned to achieve maximum accuracy. Limitations: software limited to binary/continuous outcomes, optimized for accuracy, unsuitable for survival data analysis.

Enhancing a recidivism prediction tool with machine learning: effectiveness and algorithmic fairness [6]
Size: 2,634. Data collection means: conducting interviews with prisoners. Advantages: this work is applicable beyond a specific field and suitable for assessing and predicting individuals in various contexts.

Efficiency and fairness in recurring data-driven risk assessments of violent recidivism [5]
Size: 2,634. Data collection means: conducting interviews with prisoners. Advantages: able to mitigate the disparate impact and ensure equality in the rate of evaluation.

Discriminant analysis of high-risk recidivism in criminal offenders based on psychopathological factors [8]
Size: 182. Age: 17 to 70. Data collection means: psychological testing from prosecutors' offices nationwide in Korea. Advantages: multi-method risk assessment. Limitations: the static assessment KORAS-G lacks psychopathological dynamic factors.

Predicting reoffending using the Structured Assessment of Violence Risk in Youth (SAVRY) [17]
Size: 246. Age: 15 to 18. Data collection means: Youth Detention Center (YDC) in Changsha, Hunan Province, China. Advantages: SAVRY has good predictive validity for violence and delinquency outcomes. Limitations: it does not examine the impact of the intervention on risk reduction.

Risk assessment in juvenile and young adult offenders: predictive validity of the SAVRY and SAPROF-YV [7]
Size: 354. Age: 16 to 26. Data collection means: Dutch juvenile justice institution. Advantages: SAPROF-YV provided incremental predictive validity over the SAVRY, and predictive validity was stronger for younger offenders. Limitations: additional information could not be retrieved and offenders could not be observed in clinical practice.

Predicting re-offense using Mobile Neurocognitive Assessment Software (NCRA) [3]
Size: 730. Age: average 27 to 38. Data collection means: Harris County Community Supervision and Corrections Department. Advantages: a deeper understanding of deviant decision-making can be factored into sentencing and treatment programs. Limitations: the study is limited by siloed jurisdictional databases, which undercount arrest rates for all participants who may have gone on to commit crimes in other states.

3 Methodology

3.1 SAVRY

The Structured Assessment of Violence Risk in Youth (SAVRY), developed by Randy Borum, Patrick Bartel, and Adelle Forth, is a risk assessment framework. It is a tool used by professionals in the fields of psychology, criminology, and law enforcement to assess the risk of violence and other antisocial behaviors in young individuals, typically aged 12 to 18 years. In the SAVRY protocol, protective factors are specified by 6 elements, while risk factors are characterized by 24 elements. Risk items are divided into three categories: historical, individual, and social/contextual, and each is labeled according to a three-level classification structure (high, moderate, or low). Protective factors are labeled as present or absent. If additional elements are present, they should be documented and weighed in the final risk-estimating decision. Elements in the context of the SAVRY protocol are assigned values based on information that is reliable and accessible. Information is typically obtained in multiple ways, including an interview with the youth and a review of records (e.g., police or probation reports and mental health and social service records) [1] (Table 1).
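As a concrete illustration of how a completed assessment can be fed to a model, the snippet below encodes one fictitious SAVRY record as a flat feature vector, using the coding conventions listed in Table 2 (0-2 for the 24 risk items, 0/1 for the 6 protective items); all of the values shown are invented.

```python
# One fictitious SAVRY assessment (values are invented, coding as in Table 2).
record = {
    "Age": 16,
    "Education": 2,
    "Family income": 1,
    **{f"SAVRY{i}": 1 for i in range(1, 25)},  # 24 risk items: 0 low, 1 moderate, 2 high
    **{f"P{i}": 0 for i in range(1, 7)},       # 6 protective items: 0 absent, 1 present
}

# Fix a column order once and reuse it for every offender, so the resulting
# vectors line up as rows of the feature matrix X.
columns = list(record)
feature_vector = [record[c] for c in columns]
print(len(feature_vector), "features:", feature_vector[:6], "...")
```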

Fig. 1. Methodology of the work


Table 2. Structured Assessment of Violence Risk in Youth (SAVRY)

Age: Offender's age during assessment
Gender: 1 = Male, 0 = Female
Education: Educational qualification
Family income: Monthly family income (Chinese yuan); 0 = 3500
SAVRY1: History of violence
SAVRY2: History of nonviolent offending
SAVRY3: Early initiation of violence (before 14 years old) (critical item for the intervention)
SAVRY4: Past supervision/intervention failures
SAVRY5: History of self-harm or suicide attempts
SAVRY6: Exposure to violence in the home
SAVRY7: Childhood history of physical mistreatment or negligent mistreatment
SAVRY8: Parental/caregiver criminality in adulthood
SAVRY9: Early caregiving disruption in the childhood
SAVRY10: Poor school achievement
SAVRY11: Delinquency in the peer group
SAVRY12: Rejection by the peer group
SAVRY13: Stress and poor coping
SAVRY14: Poor parental management
SAVRY15: Lack of personal/social support
SAVRY16: Community disorganization
SAVRY17: Negative attitudes
SAVRY18: Risk-taking/impulsivity
SAVRY19: Substance abuse difficulties
SAVRY20: Anger management problems
SAVRY21: Empathy
SAVRY22: Attention-deficit/hyperactivity difficulties
SAVRY23: Poor compliance
SAVRY24: Low interest/commitment to school
P1: The juvenile participates in prosocial activities or prosocial peer groups
P2: The juvenile participates in prosocial activities or prosocial peer groups (critical item for the intervention)
P3: The juvenile has strong social support
P4: The juvenile shows a positive attitude in front of the intervention/treatment inmates and authority
P5: The juvenile shows a high interest level, implication, and motivation for success in school or work
P6: The juvenile has positive and resilient personality characteristics
Reoffending: Reoffending tendency; 0 = Yes, 1 = No

Risk items SAVRY1-SAVRY24 are rated 0 = low, 1 = moderate, 2 = high; protective items P1-P6 are rated 0 = absent, 1 = present.

3.2 Dataset

The dataset [17] used in this research comprises 246 male juvenile offenders who were detained in a Youth Detention Center in Hunan province, China, between August and November 2008. Information regarding additional arrests, charges, or convictions related to violent offenses was gathered during the period of October to November 2013. The participants were assessed using SAVRY; more than half of the offenders (66.3%, n = 163) had a history of violent convictions, including various crimes such as robbery (41.1%, n = 101), assault (13.0%, n = 32), sexual violence (6.9%, n = 17), and homicide (5.3%, n = 13). The remaining offenders (33.7%, n = 83) had been convicted of less severe offenses, including theft and cases involving the use or possession of a weapon. All the offenders in the study were sentenced to a maximum of three years' detention. Among this population, 63 juvenile offenders (25.6%) were arrested again for committing further violent offenses in a five-year follow-up period.

3.3 Data Pre-processing

In this section, we have thoroughly described the dataset preprocessing steps, which include data cleaning, organizing, and visualizing.

Data Cleaning: In the data cleaning process, unnecessary columns were removed, and missing values were handled to prepare the dataset for analysis. Firstly, irrelevant columns with constant values were dropped from the dataset as they did not offer valuable insights. Then, to tackle the problem of missing data, null values were substituted with the most recent non-null values present in their corresponding columns. By employing this approach, the team avoided the deletion of rows, thereby preserving the dataset's size and integrity. Furthermore, in order to manage categorical attributes efficiently, a label encoding technique was implemented.

Handling Class Imbalance: The dataset used in this study consisted of 246 records, with an imbalanced distribution of the target variable. Specifically, 74.39% of the data belonged to the "Not reoffending" class, while only 25.61% belonged to the "Offending" class. This class imbalance posed a challenge as it could potentially bias the learning process and affect the predictive performance of the models [4]. To address this class imbalance, three techniques were employed: random oversampling, the Synthetic Minority Over-sampling Technique (SMOTE), and random under-sampling. Random oversampling and SMOTE were both used to create a balanced dataset by increasing the representation of the minority class, but they achieve this objective through different approaches. Random oversampling balances the dataset by duplicating random instances from the minority class until the class distribution is more balanced, whereas SMOTE generates synthetic samples for the minority class by interpolating between existing instances. The other method, random under-sampling, reduces the representation of the majority class by randomly removing instances from it until a more balanced class distribution is achieved. Table 3 illustrates the impact of these techniques on the distribution of "Reoffender" values.
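A compact version of this preprocessing might look as follows. The sketch uses pandas and the imbalanced-learn package; the tiny inline DataFrame is only a stand-in for the real SAVRY table, and SMOTE's k_neighbors is reduced so that it works on such a small example.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder
from imblearn.over_sampling import RandomOverSampler, SMOTE
from imblearn.under_sampling import RandomUnderSampler

# Stand-in for the real data (in practice: pd.read_csv(...) of the SAVRY table).
df = pd.DataFrame({
    "Age":         [15, 16, 17, 16, None, 15, 17, 16],
    "Education":   ["primary", "secondary", "secondary", None,
                    "primary", "secondary", "primary", "secondary"],
    "SAVRY1":      [2, 1, 0, 2, 1, 0, 2, 1],
    "Constant":    [1] * 8,                      # constant column, dropped below
    "Reoffending": [0, 1, 0, 0, 1, 0, 1, 0],
})

df = df.loc[:, df.nunique(dropna=False) > 1]     # 1. drop constant columns
df = df.ffill()                                  # 2. fill nulls with the last non-null value
for col in df.select_dtypes(include="object"):   # 3. label-encode categorical attributes
    df[col] = LabelEncoder().fit_transform(df[col])

X, y = df.drop(columns="Reoffending"), df["Reoffending"]

# 4. Three ways of balancing the class distribution.
X_over, y_over = RandomOverSampler(random_state=42).fit_resample(X, y)
X_smote, y_smote = SMOTE(random_state=42, k_neighbors=2).fit_resample(X, y)
X_under, y_under = RandomUnderSampler(random_state=42).fit_resample(X, y)

print(pd.Series(y).value_counts().to_dict(), pd.Series(y_over).value_counts().to_dict())
```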

Table 3. Target Class Count

Dataset | Reoffender | Not Reoffender | Total
Sample size | 63 | 183 | 246
After random oversampling | 183 | 183 | 366
After random undersampling | 63 | 63 | 126
After SMOTE | 183 | 183 | 366
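The cleaning and resampling steps described above could be sketched as follows with pandas, scikit-learn, and imbalanced-learn. The file name, the "Reoffending" column name, and the forward-fill/label-encoding details are illustrative assumptions, not the authors' exact code; only the three resampling techniques mirror the text.

```python
# Sketch of the data cleaning and class-imbalance handling steps.
# File and column names are assumptions made for illustration.
import pandas as pd
from sklearn.preprocessing import LabelEncoder
from imblearn.over_sampling import RandomOverSampler, SMOTE
from imblearn.under_sampling import RandomUnderSampler

df = pd.read_csv("savry_dataset.csv")              # hypothetical file name
df = df.ffill()                                    # fill nulls with the most recent non-null value
for col in df.select_dtypes(include="object"):     # label-encode categorical attributes
    df[col] = LabelEncoder().fit_transform(df[col])

X, y = df.drop(columns=["Reoffending"]), df["Reoffending"]

resamplers = {
    "random oversampling": RandomOverSampler(random_state=42),
    "SMOTE": SMOTE(random_state=42),
    "random undersampling": RandomUnderSampler(random_state=42),
}
for name, sampler in resamplers.items():
    X_res, y_res = sampler.fit_resample(X, y)
    print(name, y_res.value_counts().to_dict())    # class counts comparable to Table 3
```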

3.4 Feature Selection Method

In this study, various strategies were utilized for feature selection, including Recursive Feature Elimination (RFE), Univariate Statistical Tests, the Fisher Score Chi-Square Test, Pearson Correlation, Mutual Information, Mutual Information Regression, and Variance Threshold.
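A few of these strategies could be applied with scikit-learn as in the sketch below. It continues from the resampled features X_res and y_res of the earlier sketch; the choice of estimator and the value k = 20 (the number of features kept later in the paper) are assumptions for illustration.

```python
# Sketch of some of the listed feature-selection strategies using scikit-learn.
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import (RFE, SelectKBest, chi2,
                                        mutual_info_classif, VarianceThreshold)

k = 20  # keep the top 20 features, as in Sect. 4.3

rfe = RFE(RandomForestClassifier(n_estimators=100), n_features_to_select=k)
X_rfe = rfe.fit_transform(X_res, y_res)                    # Recursive Feature Elimination

X_chi2 = SelectKBest(chi2, k=k).fit_transform(X_res, y_res)             # chi-square scores
X_mi = SelectKBest(mutual_info_classif, k=k).fit_transform(X_res, y_res)  # mutual information
X_vt = VarianceThreshold(threshold=0.1).fit_transform(X_res)            # variance threshold
# Note: chi2 assumes non-negative feature values, which holds for encoded SAVRY items.
```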

3.5 Hyperparameter Optimization

Various commonly employed methods for hyperparameter optimization were utilized, such as Randomized Search CV, Grid Search CV, Hyperopt, TPOT Classifier, and Optuna. The Randomized Search CV method employed a random sampling approach to select hyperparameter combinations and afterwards assessed the performance of the model. This approach effectively reduced the computational costs associated with the search process. The Grid Search Cross-Validation (CV) algorithm systematically investigated all potential combinations within a predetermined grid. Hyperopt is a hybrid optimization approach that integrates random and Bayesian optimization techniques. It leverages the TPE (Tree-structured Parzen Estimator) algorithm to efficiently explore the search space. The TPOT Classifier, functioning as an automated machine learning (AutoML) tool, employs genetic programming to enhance hyperparameters and pipeline layout. The Optuna framework utilized a sequential model-based optimization method rooted in Bayesian optimization principles.
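Two of the listed optimizers could be used as in the sketch below. The parameter ranges, number of iterations/trials, and the Random Forest estimator are illustrative assumptions, not the settings reported in the paper.

```python
# Sketch of RandomizedSearchCV and Optuna for tuning a Random Forest.
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV, cross_val_score
import optuna

param_dist = {"n_estimators": [100, 200, 500],
              "max_depth": [None, 5, 10, 20],
              "min_samples_split": [2, 5, 10]}
search = RandomizedSearchCV(RandomForestClassifier(), param_dist,
                            n_iter=20, cv=5, scoring="accuracy", random_state=42)
search.fit(X_res, y_res)
print("RandomizedSearchCV best:", search.best_params_, search.best_score_)

def objective(trial):
    clf = RandomForestClassifier(
        n_estimators=trial.suggest_int("n_estimators", 100, 500),
        max_depth=trial.suggest_int("max_depth", 3, 20))
    return cross_val_score(clf, X_res, y_res, cv=5, scoring="accuracy").mean()

study = optuna.create_study(direction="maximize")   # uses a TPE sampler by default
study.optimize(objective, n_trials=30)
print("Optuna best:", study.best_params)
```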

3.6 Learning Algorithms

The present study evaluates the performance of various predictive models, namely ZeroR, Decision Tree, Random Forest, Support Vector Machine, K-Nearest Neighbors (KNN), Logistic Regression, Naive Bayes, and Multi-layer Perceptron (MLP).

3.7 Performance Metrics

Accuracy, precision, recall, and F1 score are used to measure the performance of the various machine learning models. Equations (1) to (4) were used to determine the performance metrics.

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \qquad (1)$$

$$\text{Recall} = \frac{TP}{TP + FN} \qquad (2)$$

$$\text{Precision} = \frac{TP}{TP + FP} \qquad (3)$$

$$\text{F1 Score} = \frac{2 \times TP}{2 \times TP + FP + FN} \qquad (4)$$

Here, TP represents True Positive, TN represents True Negative, FP represents False Positive, and FN represents False Negative.
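These four metrics are available directly in scikit-learn; the tiny sketch below uses made-up labels and predictions purely as placeholders.

```python
# Computing the metrics of Eqs. (1)-(4) with scikit-learn.
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

y_test = [0, 1, 1, 0, 1]   # illustrative ground-truth labels
y_pred = [0, 1, 0, 0, 1]   # illustrative model predictions

print("Accuracy :", accuracy_score(y_test, y_pred))
print("Precision:", precision_score(y_test, y_pred))
print("Recall   :", recall_score(y_test, y_pred))
print("F1 score :", f1_score(y_test, y_pred))
```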

3.8 Exploration with Ensemble Models

In this study, we conducted additional analysis utilizing ensemble models, which involve the integration of different models to enhance the accuracy and robustness of the predictive model. Multiple ensemble approaches were employed, including Bagging, Boosting (specifically AdaBoost), and Stacking. The utilization of these ensemble techniques improves the robustness of the resulting predictions.
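The ensemble strategies mentioned above, plus the voting classifier reported in the results, could be assembled with scikit-learn as sketched below. The base estimators, their hyperparameters, and the reuse of the resampled training data X_res, y_res are assumptions for illustration.

```python
# Sketch of bagging, boosting (AdaBoost), stacking, and voting ensembles.
from sklearn.ensemble import (BaggingClassifier, AdaBoostClassifier,
                              StackingClassifier, VotingClassifier,
                              RandomForestClassifier, GradientBoostingClassifier)
from sklearn.linear_model import LogisticRegression

bagging = BaggingClassifier(n_estimators=50)          # bags decision trees by default
boosting = AdaBoostClassifier(n_estimators=100)       # AdaBoost
stacking = StackingClassifier(
    estimators=[("rf", RandomForestClassifier()), ("gb", GradientBoostingClassifier())],
    final_estimator=LogisticRegression())
voting = VotingClassifier(
    estimators=[("rf", RandomForestClassifier()), ("gb", GradientBoostingClassifier()),
                ("lr", LogisticRegression(max_iter=1000))],
    voting="hard")

for model in (bagging, boosting, stacking, voting):
    model.fit(X_res, y_res)   # resampled training data from Sect. 3.3
```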

4 Results and Discussion

4.1 Performance Metrics Comparison for Various Algorithms

Tables 4, 5 and 6 showcase the performance of various ML algorithms which were preprocessed using three imbalance handling techniques: random oversampling, random under-sampling, and SMOTE, respectively. Notably, Gradient Boosting and Random Forest achieve the highest accuracy of 97% with random oversampling. KNN reaches 54% accuracy with random under-sampling. SMOTE preprocessing leads to an 88% accuracy for both Random Forest and Gradient Boosting. An accuracy of 86% was obtained by an ensemble voting classifier. In addition, precision and recall analysis reveals Gradient Boosting's superiority in precision (97%, 62%, and 88% for the oversampled, under-sampled, and SMOTE datasets) and Random Forest's lead in recall (97%, 58%, and 88% for the respective datasets). The proposed ML models perform best on the oversampled dataset, as illustrated in Fig. 2.

4.2 Performance Analysis of Different Algorithms with Hyperparameter Optimization

This section evaluates the classifiers' performance with hyperparameter optimization. In Tables 7, 8 and 9, oversampling led to 98%, 97%, and 96% accuracy for Random Forest, SVM, and Gradient Boosting using RandomizedSearchCV, Tpot, and GridSearchCV, respectively. On random undersampled data, Logistic Regression, Naive Bayes (Bernoulli NB), and Decision Tree achieved 62%, 69%, and 62% accuracy using various optimizers. For SMOTE data, RandomizedSearchCV, GridSearchCV, and Bayesian optimization yielded 88%, 86%, and 95% accuracy for Gradient Boosting, Naive Bayes, and Random Forest.

Fig. 2. Comparing performance metrics different Machine Learning Models

Table 4. Performance Metrics for Different Algorithms (Oversampling Technique)

Algorithm (Oversampling) | Accuracy (%) | Precision (%) | Recall (%) | F1 Score (%)
Random Forest | 97 | 97 | 97 | 97
Logistic Regression | 76 | 76 | 76 | 76
KNN | 76 | 77 | 76 | 76
Decision Tree | 92 | 92 | 93 | 92
Naïve Bayes (Multinomial) | 70 | 70 | 70 | 70
SVM | 88 | 88 | 88 | 88
Voting Classifier | 97 | 97 | 97 | 97
Bagging Classifier | 92 | 92 | 92 | 92
XG-Boosting | 78 | 82 | 78 | 78
Gradient Boosting | 97 | 97 | 97 | 97
ADA Boost Classifier | 95 | 95 | 95 | 95
Stacking Classifier | 93 | 92 | 93 | 93

Tables 7, 8 and 9 demonstrate significant performance improvement through hyperparameter tuning. Notably, Logistic Regression, XG Boosting, and KNN accuracy leaped from 76% to 96%, 78% to 93%, and 76% to 96%, respectively, with hyperparameter optimization on oversampled data. Similarly, the accuracy of Random Forest on SMOTE data increased from 88% to 95%. The top-performing models were Random Forest with randomized search, Naive Bayes (Bernoulli NB) with GridSearchCV, and Random Forest with Bayesian optimization for the oversampled, undersampled, and SMOTE data, respectively.


Table 5. Performance Metrics for Different Algorithms (Undersampling Technique)

Algorithm (Undersampling) | Accuracy (%) | Precision (%) | Recall (%) | F1 Score (%)
Random Forest | 50 | 50 | 50 | 49
Logistic Regression | 54 | 54 | 54 | 54
KNN | 54 | 54 | 54 | 54
Decision Tree | 42 | 42 | 42 | 42
Naïve Bayes (Multinomial) | 54 | 54 | 54 | 54
SVM | 54 | 54 | 54 | 54
Voting Classifier | 54 | 54 | 54 | 54
Bagging Classifier | 54 | 54 | 54 | 54
XG-Boosting | 50 | 25 | 50 | 33
Gradient Boosting | 46 | 46 | 46 | 46
ADA Boost Classifier | 46 | 46 | 46 | 46
Stacking Classifier | 54 | 54 | 54 | 54

Table 6. Performance Metrics for Different Algorithms (SMOTE Technique)

Algorithm (SMOTE) | Accuracy (%) | Precision (%) | Recall (%) | F1 Score (%)
Random Forest | 88 | 88 | 88 | 88
Logistic Regression | 72 | 72 | 72 | 72
KNN | 70 | 74 | 71 | 69
Decision Tree | 81 | 81 | 81 | 81
Naïve Bayes (Multinomial) | 66 | 66 | 66 | 66
SVM | 80 | 80 | 80 | 80
Voting Classifier | 85 | 85 | 85 | 85
Bagging Classifier | 84 | 84 | 84 | 84
XG-Boosting | 69 | 69 | 69 | 69
Gradient Boosting | 88 | 88 | 88 | 88
ADA Boost Classifier | 82 | 83 | 83 | 82
Stacking Classifier | 85 | 86 | 85 | 85

4.3 Performance Analysis of Different Algorithms After Using Various Feature Selection Methods

The study analyzed outcomes both with and without hyperparameter tuning of the classifiers after selecting the top 20 features from a randomly oversampled dataset. Table 11 presents these insights on model performance based solely on selected features, without optimization. Without hyperparameter tuning, the Random Forest and Gradient Boosting algorithms achieved 97% and 93% accuracies, respectively, with the features extracted by the Recursive Feature Elimination and Variance Threshold methods. With hyperparameter tuning (Table 10) on the same oversampled data, Random Forest achieved 99% accuracy using Bayesian optimization, and Decision Tree reached 95% accuracy using grid search. Here the top 20 features were selected by Recursive Feature Elimination and Mutual Information. The accuracy of Random Forest with feature selection is 95%, increasing to 97% with both feature selection and optimization.

Table 7. Hyperparameter Optimization with Random Oversampled Data

Algorithm | Randomized Search | Grid Search | Bayesian Optimization | Hyperopt | Tpot classifier | Optuna
Random Forest | 98 | 92 | 96 | 85 | 96 | 82
Logistic Regression | 59 | 76 | 51 | 77 | 96 | 78
KNN | 81 | 82 | 81 | 78 | 96 | 82
XG-Boosting | 92 | 93 | 92 | 81 | 83 | 51
Gradient Boosting | 96 | 96 | 97 | 85 | 93 | 96
Decision Tree | 85 | 85 | 88 | 83 | 91 | 64
Naïve Bayes (Multinomial) | 72 | 70 | 68 | 70 | 72 | 68
SVM (C = 1000) | 72 | 88 | 97 | 89 | 97 | 78

Table 8. Hyperparameter Optimization with Random Undersampled Data

Algorithm | Randomized Search | Grid Search | Bayesian Optimization | Tpot classifier | Optuna
Random Forest | 50 | 46 | 52 | 58 | 46
Logistic Regression | 69 | 65 | 42 | 50 | 58
KNN | 54 | 58 | 62 | 46 | 58
XG-Boosting | 58 | 42 | 54 | 50 | 50
Gradient Boosting | 46 | 46 | 50 | 50 | 58
Decision Tree | 42 | 54 | 62 | 46 | 50
Naïve Bayes (Multinomial) | 54 | 54 | 50 | 54 | 52
SVM (C = 1000) | 46 | 50 | 58 | 50 | 54

Table 9. Hyperparameter Optimization with Random SMOTE Data

Algorithm | Randomized Search | Grid Search | Bayesian Optimization | Tpot classifier | Optuna
Random Forest | 86 | 81 | 95 | 84 | 66
Logistic Regression | 86 | 73 | 51 | 86 | 74
KNN | 66 | 70 | 70 | 80 | 70
XG-Boosting | 82 | 84 | 84 | 82 | 51
Gradient Boosting | 88 | 85 | 88 | 85 | 86
Decision Tree | 74 | 77 | 82 | 58 | 82
Naïve Bayes (Multinomial) | 66 | 66 | 68 | 66 | 66
SVM (C = 1000) | 80 | 86 | 81 | 51 | 66

The Gradient Boosting algorithm maintained an accuracy of 93% with and without hyperparameter tuning. Decision Tree accuracy improved from 85% to 95%. Figure 3 illustrates the differences in accuracy attained through different machine learning algorithms before and after the process of feature selection.


Fig. 3. Accuracy comparison before and after feature selection

4.4 Error Analysis on Different Algorithms

The Random Forest Classifier demonstrated good predictive power with an AUC score of 96.52%, while Logistic Regression performed well with a score of 93.06%. The Gradient Boosting classifier produced a decent score of 90.42%, showing that complicated data patterns were well captured. With a score of 91.74%, KNN displayed its ability to find data similarities, while the SVM classifier earned a predictive power of 87.50%. Rice and Harris [14] define AUC values of 100% as perfect prediction, values around 50% as poor or chance prediction, and AUC values of 55.6% as small, 63.9% as moderate, and 71.4% as large predictive validity effect sizes. Based on this benchmark, our classifiers can be considered highly effective (Table 12).

4.5 Explainable AI

In Fig. 4 and Fig. 5, the LIME explainable AI framework provides interpretations of reoffending predictions for a positive and negative case, respectively. For the specific sample shown in Fig. 4, the Random Forest model, trained using the randomized search CV technique, predicts a low probability of reoffending with a confidence score of 0.79. LIME identifies factors such as history of nonviolent offending (SAVRY 2), community disorganization (SAVRY 16), negative attitudes (SAVRY 17), and anger management issues (SAVRY 20) as important contributors to this prediction. Given that the values of these factors were low for the sample, LIME highlights them as the primary indicators of a lower likelihood of reoffending.


Fig. 4. Interpretability of LIME for lower class

Fig. 5. Interpretability of LIME for higher class


Table 10. Feature Selections on Hyperparameter Tuned Classifier Algorithms

Algorithm | Recursive Feature Elimination (%) | Fisher Score Chi-square Test (%) | Univariate Statistical Tests (%) | Pearson Correlation (%) | Mutual Info (%) | Mutual Info Regression (%) | Variance Threshold (%)
LR (C = 100) | 80 | 74 | 72 | 75 | 77 | 72 | 70
LR (GridSearch CV) | 79 | 65 | 74 | 74 | 71 | 78 | 78
LR (Randomized Search CV) | 88 | 77 | 76 | 77 | 76 | 69 | 72
GB (Grid Search) | 92 | 92 | 91 | 88 | 84 | 89 | 85
XGB (Randomized Search CV) | 93 | 89 | 88 | 91 | 89 | 85 | 95
XGB (Grid Search) | 93 | 92 | 88 | 95 | 90 | 88 | 93
KNN (Randomized Search CV) | 70 | 76 | 72 | 76 | 84 | 82 | 89
KNN (Grid Search) | 88 | 84 | 89 | 80 | 79 | 86 | 86
DT (Randomized Search) | 81 | 91 | 92 | 81 | 73 | 92 | 84
DT (Grid Search) | 88 | 91 | 85 | 88 | 95 | 89 | 89
RFC (n = 100) | 97 | 95 | 93 | 95 | 92 | 96 | 96
RFC (Randomized Search CV) | 96 | 96 | 95 | 93 | 95 | 96 | 92
RFC (Grid Search CV) | 95 | 93 | 89 | 85 | 86 | 82 | 82
RFC (Hyperopt) | 75 | 85 | 80 | 83 | 83 | 85 | 82
RFC (Bayesian Optimization) | 99 | 1 | 99 | 92 | 85 | 86 | 86
Naive Bayes (Grid Search) | 74 | 68 | 73 | 76 | 69 | 69 | 59
SVM (C = 100) | 92 | 89 | 90 | 86 | 92 | 82 | 80

Additionally, LIME attributes their significance to the prediction.
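A LIME explanation of this kind could be produced with the lime package as sketched below. The variable names (X_train, X_test, feature_names, rf_model) and the class labels are placeholders standing in for the training matrix, the test sample, the SAVRY feature names, and the tuned Random Forest; they are assumptions for illustration only.

```python
# Sketch of generating a LIME explanation for a single test sample.
import numpy as np
from lime.lime_tabular import LimeTabularExplainer

explainer = LimeTabularExplainer(
    training_data=np.asarray(X_train),
    feature_names=feature_names,
    class_names=["Not reoffending", "Reoffending"],
    mode="classification")

explanation = explainer.explain_instance(
    data_row=np.asarray(X_test)[0],        # the sample to interpret
    predict_fn=rf_model.predict_proba,     # tuned Random Forest from Sect. 4.2
    num_features=10)
print(explanation.as_list())               # (feature, weight) contributions as in Figs. 4-5
```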

5 Conclusion and Future Scope

In summary, this research article has illustrated the capacity to enhance recidivism prediction using the SAVRY dataset. There exists a significant lacuna in the body of literature pertaining to the prediction of recidivism among young offenders. We incorporated supervised machine learning algorithms, ensemble approaches, and explainable AI techniques in this study. By conducting an extensive examination of the dataset comprising 246 samples, we utilized 11 distinct algorithms to assess and contrast their efficacy in forecasting recidivism. Furthermore, the use of explainable artificial intelligence (AI) approaches has played a significant role in enhancing the transparency and interpretability of the predictions made by the models. The results of the study indicated that the random forest algorithm exhibited the highest performance, with an accuracy rate of 97% on the test dataset.


Table 11. Accuracies after Feature Selections without Hyperparameter Tuning

Classifier Algorithms | Recursive Feature Elimination (%) | Fisher Score Chi-square Test (%) | Univariate Statistical Tests (%) | Pearson Correlation (%) | Mutual Info (%) | Mutual Info Regression (%) | Variance Threshold (%)
Logistic Regression | 80 | 74 | 72 | 78 | 77 | 73 | 76
XGB | 86 | 77 | 69 | 72 | 73 | 74 | 80
Gradient Boosting Classifier | 89 | 93 | 93 | 92 | 89 | 82 | 93
Decision Tree | 83 | 88 | 86 | 81 | 89 | 74 | 82
Random Forest | 97 | 96 | 93 | 95 | 93 | 93 | 97
Naive Bayes (Multinomial) | 74 | 68 | 73 | 76 | 72 | 66 | 62
Support Vector Machine | 89 | 90 | 86 | 82 | 85 | 82 | 82

Table 12. AUC scores for best models

Classifier Name | AUC score
Random Forest Classifier (randomized search CV) | 96.52%
Logistic Regression using Tpot classifier | 93.06%
KNN using Tpot | 91.74%
Gradient boosting | 90.42%

This work has the potential to make a substantial contribution towards the improvement of public security, particularly in the realm of crime prevention targeted at individuals. In future work, we intend to incorporate advanced deep learning methodologies, such as convolutional neural networks (CNNs) or recurrent neural networks (RNNs), which possess the capability to effectively capture intricate patterns and relationships within datasets.

References

1. Borum, R., Lodewijks, H.P., Bartel, P.A., Forth, A.E.: The structured assessment of violence risk in youth (SAVRY). In: Handbook of Violence Risk Assessment, pp. 438–461. Routledge (2020)
2. Dieterich, W., Mendoza, C., Brennan, T.: Compas risk scales: demonstrating accuracy equity and predictive parity. Northpointe Inc 7(4), 1–36 (2016)
3. Haarsma, G., Davenport, S., White, D.C., Ormachea, P.A., Sheena, E., Eagleman, D.M.: Assessing risk among correctional community probation populations: predicting reoffense with mobile neurocognitive assessment software. Front. Psychol. 10, 2926 (2020)
4. Japkowicz, N., Stephen, S.: The class imbalance problem: a systematic study. Intell. Data Anal. 6(5), 429–449 (2002)
5. Karimi-Haghighi, M., Castillo, C.: Efficiency and fairness in recurring data-driven risk assessments of violent recidivism. In: Proceedings of the 36th Annual ACM Symposium on Applied Computing, pp. 994–1002 (2021)
6. Karimi-Haghighi, M., Castillo, C.: Enhancing a recidivism prediction tool with machine learning: effectiveness and algorithmic fairness. In: Proceedings of the Eighteenth International Conference on Artificial Intelligence and Law, pp. 210–214 (2021)
7. Kleeven, A.T., de Vries Robbé, M., Mulder, E.A., Popma, A.: Risk assessment in juvenile and young adult offenders: predictive validity of the SAVRY and SAPROF-YV. Assessment 29(2), 181–197 (2022)
8. Koh, M.K., Kim, H.S.: Discriminant analysis of high-risk recidivism in criminal offenders based on psychopathological factors from MMPI-2-RF profiles. J. Forensic Psychiatry Psychol. 1–25 (2023)
9. National Institute of Justice: Recidivism (Year of Access). https://nij.ojp.gov/topics/corrections/recidivism
10. Nickerson, C.: Recidivism: definition, causes & examples. Simply Psychology (2022). Accessed 19 Feb 2023
11. Occupy: Prison recidivism: causes and possible treatments (Year of Access). https://www.occupy.com/article/prison-recidivism-causes-and-possibletreatments#sthash.0CP0i7zK.dpbs
12. Ozkan, T.: Predicting recidivism through machine learning. Ph.D. thesis (2017)
13. Point Park University: Understanding juvenile recidivism: prevention and treatment (Year of Access). https://online.pointpark.edu/criminal-justice/juvenilerecidivism/
14. Rice, M.E., Harris, G.T.: Comparing effect sizes in follow-up studies: ROC area, Cohen's d, and r. Law Hum. Behav. 29, 615–620 (2005)
15. Singh, A., Mohapatra, S.: Development of risk assessment framework for first time offenders using ensemble learning. IEEE Access 9, 135024–135033 (2021)
16. Travaini, G.V., Pacchioni, F., Bellumore, S., Bosia, M., De Micco, F.: Machine learning and criminal justice: a systematic review of advanced methodology for recidivism risk prediction. Int. J. Environ. Res. Public Health 19(17), 10594 (2022)
17. Zhou, J., Witt, K., Cao, X., Chen, C., Wang, X.: Predicting reoffending using the structured assessment of violence risk in youth (SAVRY): a 5-year follow-up study of male juvenile offenders in Hunan Province, China. PLoS ONE 12(1), e0169251 (2017)

Development of Automated Essay Scoring System Using DeBERTa as a Transformer-Based Language Model Hansel Susanto(B) , Alexander Agung Santoso Gunawan, and Muhammad Fikri Hasani School of Computer Science, Bina Nusantara University, Jakarta 11480, Indonesia {hansel.susanto001,muhammad.fikri003}@binus.ac.id, [email protected]

Abstract. Giving an essay assignment is an important task every educational institution holds for measuring students' understanding and ability. Teachers have a significant role in this task because they are the ones who assess the assignment. As the number of students grows over time, manual correction becomes increasingly complicated: it takes too much time, is less objective, and so on. In the past few years, the problem of automated essay scoring has become popular, but a major challenge remains: automated essay scoring is still not as good as human assessment, especially in detecting the main idea of an essay, cohesion, and coherence. The large language model (LLM) has also become popular in the past few years; some examples are transformer-based models such as BERT, RoBERTa, and DeBERTa. In this research, we implement those three models as our base layer and compare them using the Quadratic Weighted Kappa (QWK) value as the metric of accuracy. In conclusion, the DeBERTa-based model has the best QWK value compared to the other two. We also implement a system using Python that can retrieve an essay and run the model to do the scoring automatically. We further suggest that future research could use datasets other than the ASAP-AES dataset or try a GPT-based language model as the base layer model.

Keywords: Essay · Correction · Automated Essay Scoring · Large Language Model · BERT · RoBERTa · DeBERTa · Quadratic Weighted Kappa · Python

1 Introduction

An essay is a piece of writing that expresses the author's knowledge and understanding of a certain topic and idea that is being discussed. The ability to write an essay is very important because writing an essay has always been a standard in the formal education of many countries, especially at the secondary and higher levels. By being able to write a good essay, students can demonstrate their understanding and problem-solving ability on a certain topic and idea well. But for students to understand their ability to express


their understanding through writing an essay, they need a teacher to help them assess and give active feedback on the essay that has been written. This means that giving an essay assignment in school will always involve students' and teachers' time actively. Although the essay-based assessment type is very common in every country, when it comes time for teachers to assess the examination, it still poses the same problem for every teacher in the world: assessing and scoring an essay takes a great deal of time. This problem is not new either, but it has become much more relevant nowadays since the student-to-teacher ratio is getting higher and higher [1]. This makes teachers more overwhelmed as the number of essays to be assessed keeps increasing. Moreover, assessing an essay for every student is a very repetitive task, which makes this problem a strong candidate for automation. This is one of the main reasons why developing an automatic essay scoring system is a very prominent yet challenging goal in the educational AI world [2]. To score essays, it is not possible to use only pattern matching or simple programming, because students' answers may vary and have different explanations. So, a system is needed that can objectively assess all the different kinds of answers from these students based on knowledge from Natural Language Processing (NLP).

Currently, several studies related to automatic essay scoring have been published. Most of them still use approaches such as Linear Regression, Latent Semantic Analysis (LSA), or Long Short-Term Memory (LSTM) networks, and these approaches have improved automated essay scoring research significantly. At the same time, there are already some newer Natural Language Processing approaches, such as using transformer-based models. By using a transformer-based model approach, the developed automatic essay scoring model can perform even better, because these transformer-based models can understand the context of a text and are also trained with the latest techniques and with more parameters and data.

The BERT model [3], which was developed by Google in 2018, was the first pre-trained model that became the base for the development of other transformer-based pre-trained models. The BERT model is revolutionary since it uses a transformer, so it can understand the context of a text. With this capability, several studies on automated essay scoring using the BERT model have already been conducted. In the past few years, other companies have taken the initiative to improve the BERT model by introducing derivative models. Examples are the RoBERTa model [4], which was created by the Facebook AI Research team in 2019, and the DeBERTa model [5], which was created by Microsoft in 2021. Research on automated essay scoring using these two derivative models of BERT has not been done yet. That is why, in this research, we implemented automated essay scoring using the BERT, RoBERTa, and DeBERTa models and compared those three models using the Quadratic Weighted Kappa metric [6] to determine which one is the best to be implemented in automated essay scoring. In this research, we also realized that automated essay scoring systems that are open to the public are currently relatively rare. Therefore, the implementation of an automated essay scoring system using the best transformer-based language model will also be carried out.
The system developed in this research is built using the Streamlit framework and the Python programming language.


The model deployed in this system is the best-performing transformer-based language model from our experiments, namely the model that we trained using DeBERTa as the base layer.

2 Related Works

Although research on automated essay scoring has become popular recently, several studies on automated essay scoring have already been published over the past couple of decades. In general, the papers and journals related to automated essay scoring can be divided into the following categories of related works.

2.1 Traditional Approach

The traditional approach takes simple Linear Regression and an understanding of linguistics as the basis for developing an automated essay scoring system. Therefore, even with small datasets, these models can achieve a high level of accuracy, but they cannot generalize essay writing well. Some researchers, such as Larkey in 1998, attempted to perform essay scoring using text categorization [7]. That research proposed to use a simple Linear Regression to determine the score of the assessed essay. It concluded that an automated essay scoring system, in which the computer is asked to judge which essays are good and which are bad, can be built successfully, but an automated essay scoring system with a Linear Regression approach still has a high error compared to human reviewers, so it is not very suitable for implementation. In addition, Chen and He in 2013 proposed automated essay scoring using Human-Machine Agreement. Their proposed automatic essay scoring uses the features of a text to determine its value, for example lexical, syntactical (taken with POS tagging), grammar, and length features of an essay [8].

2.2 Deep Learning Approach

The Deep Learning approach made the development of automated essay scoring systems progress more rapidly compared to the traditional method. In the traditional approaches, determining the features of the essay text to be scored is a must. The Deep Learning approach makes it possible to avoid the need for feature selection with deep linguistic expertise, since CNNs and time-series models such as RNNs and LSTMs can easily find those features in complex essays. However, the downside of using the deep learning approach for an automated essay scoring system is that architectures such as RNN and LSTM cannot store and understand the context of a text. Most of the time, for a deep learning architecture to be able to understand a text, a word embedding is used, which itself is only a representation of words with similar meanings. In the essay case, to be able to score an essay thoroughly, we must view the essay from a text and sentence perspective. Some researchers, such as Kumar and team, proposed to combine traditional methods that still refer to human-knowledge-based features such as grammatical, lexical, etc.


with information obtained from Deep Learning training. They used an LSTM and a Word Embedding Layer as the architecture of their model [9]. Other researchers, Dong and Zhang, also used a similar approach with CNN and word representation architecture models [10].

2.3 Pre-trained Language Model Approach

The Pre-Trained Language Model approach is the most recent in the development of automated essay scoring systems. This approach uses transfer learning from Transformer-Based Language models that have been trained on a large corpus of data with certain training techniques. One of the key features and advantages of using this approach is that the Transformer-Based Language models can understand the context of a text. Therefore, implementing Transformer-Based Language models as the base layer model will improve the essay scoring system's capability. The commonly used model for this approach is the BERT model, but there has been no research using the latest Transformer-Based Language models such as RoBERTa or DeBERTa, which are derivatives of the BERT model. In 2020, Yang and his team proposed using the BERT model as the transfer learning model for automatic essay scoring. Yang also proposed using a combined loss function of regression and ranking, because in general, human reviewers who assess an essay cannot rely purely on regression values alone. Human reviewers tend to score an essay relative to the value of other essays so that the characteristics sought by each reviewer can be better recognized [11]. Research in 2022 by Wang and his team also used the BERT model, but they proposed a multi-scale essay representation where each essay is assessed by two different models, namely the document tokenizer model, which aims to extract all features from all essays on all existing prompts, and the segmentation model, which aims to specifically assess essays based on the prompts and rubrics that match the essay. The model proposed by Wang is one of the best models currently on the ASAP-AES dataset [12].

3 Methodology

3.1 Dataset Used

In this research, we use the Automated Student's Assessment Prize Automated Essay Scoring, or ASAP-AES, dataset [13]. This is an open-source dataset that was provided for a competition held on Kaggle by the William and Flora Hewlett Foundation. The dataset consists of 8 distinct essay sets, with each set containing multiple essays that were written in response to an essay prompt. All the essays contained in the dataset are real human-written essays, written by students from grade 7 to grade 10. The graders for this dataset are also official graders, with at least two persons per essay set. As the ASAP-AES competition's official test data is not publicly available, we, along with those who came before us [9–12], are unable to access it, so we use only the training data in our experiments.


The essays contained in this dataset are also varied: we have argumentative essays, source-dependent response essays, and narrative essays. Argumentative essays require writers to take a position on a topic and provide reasoning to support their stance. Source-dependent response essays require the writer to read a piece of text and answer a question based on what they have read. Narrative essays require writers to narrate a story and present a view of something based on personal opinion, background, or experience (Table 1).

Table 1. Properties of All Essay Sets contained in the ASAP-AES Dataset.

Essay Set | Num. of Essays | Word Count | Score Range | Essay Genre
Prompt 1 | 1783 | 350 | 2–12 | Argumentative
Prompt 2 | 1800 | 350 | 1–6 | Argumentative
Prompt 3 | 1726 | 150 | 0–3 | Source-dependent response
Prompt 4 | 1772 | 150 | 0–3 | Source-dependent response
Prompt 5 | 1805 | 150 | 0–4 | Source-dependent response
Prompt 6 | 1800 | 150 | 0–4 | Source-dependent response
Prompt 7 | 1569 | 250 | 0–30 | Narrative
Prompt 8 | 723 | 650 | 0–60 | Narrative

3.2 Text Preprocessing

In Natural Language Processing (NLP) problems, in general, one of the most important things to do before training a model is to preprocess the text data that will be trained later. In this research, all essay texts in the ASAP-AES dataset are also preprocessed first. There are two stages of preprocessing the essay text in this research: the text cleaning stage and the conversion to base word form. The first step is text cleaning. At this stage, the essay text in the ASAP-AES dataset is cleaned first. Some of the things done to the essay text in this text cleaning stage are changing the text to all lowercase letters, eliminating hyperlinks in the essay text, eliminating tabs (\t) and replacing them with spaces, eliminating newlines (\n) and replacing them with spaces, eliminating tokens in the essay, eliminating quotation marks in the essay, and eliminating all punctuation marks in the essay writing. After the text cleaning stage is complete, each essay text is converted to the base form of every word contained in the dataset. The next step is to change each word in the essay text to its base word form. In this research, we use the Lemmatization technique [14] provided by the WordNetLemmatizer class obtained from the NLTK library. The lemmatization process carried out by WordNetLemmatizer loops over each word contained in the essay and lemmatizes them one by one. This research also experimented with using Stemming to obtain the base words. However, after the experiment, the dataset that uses preprocessing by Stemming


produces a worse evaluation value than if preprocessing is done using Lemmatization. Therefore, Lemmatization was chosen as the preprocessing method to convert the words in the essay into base words. In addition, this research also examined one of the other common NLP text preprocessing techniques, which is removing stop words from the essay text. However, this experiment proved to lose the essence of the student’s essays, resulting in poor evaluation scores. Some of the reasons and considerations why removing the stop words in essay writing results in poor evaluation scores are that human evaluators calculate the cohesion, diction, and writing style, so stop words are an important feature for automated essay scoring. The following Fig. 1 diagram is the summary of the text preprocessing step that is done in this research.

Fig. 1. Diagram of Text Preprocessing Steps in this Automated Essay Scoring Research.
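The cleaning and lemmatization steps summarized in Fig. 1 could be sketched as follows. The specific regular expressions and the example essay text are illustrative assumptions, not the authors' exact rules; only the overall pipeline (lowercasing, link/tab/newline/punctuation removal, then WordNet lemmatization) mirrors the description above.

```python
# Sketch of the text-cleaning and lemmatization preprocessing pipeline.
import re
import string
import nltk
from nltk.stem import WordNetLemmatizer

nltk.download("wordnet", quiet=True)
lemmatizer = WordNetLemmatizer()

def preprocess_essay(text: str) -> str:
    text = text.lower()                                   # lowercase
    text = re.sub(r"https?://\S+", " ", text)             # remove hyperlinks
    text = text.replace("\t", " ").replace("\n", " ")     # tabs/newlines -> spaces
    text = text.translate(str.maketrans("", "", string.punctuation))  # drop punctuation/quotes
    words = text.split()
    return " ".join(lemmatizer.lemmatize(w) for w in words)  # word-by-word lemmatization

print(preprocess_essay("Dear local newspaper,\nComputers are helping people every day!"))
```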

3.3 Score Normalization

Since the problem of automatic essay scoring is a regression problem, each score given by the reviewers to each essay text needs to be normalized so that, when training is carried out, the model can be more accurate and precise in predicting the essay score values. The score ranges in the ASAP-AES dataset are as follows (Table 2):

Table 2. Score Range of each Essay Set in ASAP-AES Dataset.

Essay Set | Score Range
Prompt 1 | 2–12
Prompt 2 | 1–6
Prompt 3 | 0–3
Prompt 4 | 0–3
Prompt 5 | 0–4
Prompt 6 | 0–4
Prompt 7 | 0–30
Prompt 8 | 0–60

In the ASAP-AES dataset, each essay set contained in it already has a predefined score range setting for each prompt based on different scoring rubrics. Therefore, in this study, all scores given by reviewers to each essay text are normalized using min-max normalization [15]. The maximum value is the highest value in the score range, and the minimum value is the lowest value in the score range. The equation for min-max normalization is as follows: 

$$X' = \frac{X - X_{min}}{X_{max} - X_{min}}$$

where $X'$ is the normalized score, $X$ is the previous unnormalized score, $X_{min}$ is the minimum score in the score range of the essay set, and $X_{max}$ is the maximum score in the score range of the essay set. So, after min-max normalization, all score values in the dataset have a range between 0 and 1. The following Fig. 2 diagram is the summary of the score normalization step that is done in this research.

Fig. 2. Diagram of Score Normalization Steps in this Automated Essay Scoring Research.


3.4 Loss Function

For the model to perform optimization well, a loss function is required that shows the difference between the predicted value and the actual value during training. Therefore, in this research, it is also necessary to determine a loss function. This research implements Mean Squared Error (MSE) as the loss function used to calculate the loss at each epoch of model training [16]. MSE was chosen because the automatic essay scoring problem is a regression problem, so MSE is suitable for this problem. In addition, MSE is the loss function used by most journals that discuss automatic essay scoring. The equation for mean squared error is as follows:

$$MSE = \frac{1}{n}\sum_{i=1}^{n}(y_i - \hat{y}_i)^2$$

where $n$ is the total number of essays in the dataset, $y_i$ is the actual score of the $i$th essay, and $\hat{y}_i$ is the predicted score of the $i$th essay.

3.5 Evaluation Metric

An evaluation metric is needed to measure the performance and effectiveness of a model when compared to actual conditions. So, the problem of automatic essay scoring is also inseparable from the need to determine suitable evaluation metrics so that automatic essay scoring systems can be measured for accuracy and effectiveness when compared to human reviewers. The problem of essay scoring itself cannot be separated from the subjective aspects of the graders. Therefore, an evaluation metric is needed that can account for the subjectivity in this problem. The Quadratic Weighted Kappa (QWK) metric is a very suitable evaluation metric for this problem because it is designed to overcome the subjectivity among observers [6]. Quadratic Weighted Kappa (QWK) measures the amount of agreement between two entities or observers in rating a certain thing, which in the case of automated essay scoring means the agreement between the model and the human reviewers in the dataset. In the end, the QWK value is between -1 and 1. This number indicates how much the two observers agree, with a value of 1 being perfect, while a value of 0 is a randomly expected agreement, and a negative value indicates that the agreement is lower than the random agreement. So, the higher the QWK value, the more the two observers agree [17]. QWK is calculated using this formula:

$$\kappa = 1 - \frac{\sum_{i,j} w_{i,j} O_{i,j}}{\sum_{i,j} w_{i,j} E_{i,j}}$$

where matrices $O$ and $E$ represent the observed scores and the expected scores, respectively. The first grader's score is represented by $i$ while $j$ represents the score of the second grader. The weight is calculated using $w_{i,j} = (i - j)^2 / (N - 1)^2$, where $N$ is the total number of scorings. To calculate matrix $E$, one would need to take the outer product of the score matrices of both graders.
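In practice, QWK can be computed directly with scikit-learn's Cohen's kappa using quadratic weights, as in the small sketch below; the human and predicted scores shown are made-up placeholders (e.g., for Prompt 1's 2–12 range).

```python
# Computing Quadratic Weighted Kappa with scikit-learn.
from sklearn.metrics import cohen_kappa_score

human_scores = [8, 10, 7, 9, 6, 10]   # illustrative rater scores
model_scores = [8,  9, 7, 9, 5, 10]   # illustrative predictions rescaled back to the prompt range

qwk = cohen_kappa_score(human_scores, model_scores, weights="quadratic")
print(f"QWK = {qwk:.3f}")
```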


3.6 Data Flow Diagram (DFD) System Design

In this research, we are also going to implement an automated essay scoring system. Therefore, we designed a data flow diagram to explain the flow of the automated essay scoring system that is implemented in this research. The diagrams are as follows:

Context Diagram See Fig. 3.

Fig. 3. Context Diagram of Automated Essay Scoring System

Level-0 Diagram See Fig. 4.

Fig. 4. Level-0 Diagram of Automated Essay Scoring System


Level-1 Diagram See Fig. 5.

Fig. 5. Level-1 Diagram in Automated Essay Scoring System

4 Experiments

4.1 Automated Essay Scoring Model

In this research, we implemented the automated essay scoring model on 3 different base-layer transfer learning transformer-based large language models (LLMs): BERT, RoBERTa, and DeBERTa. The reference model for this research is the BERT-based model; that is why we set BERT as our base model and compare its performance thoroughly with the RoBERTa and DeBERTa models, which are both derivative models of BERT itself. RoBERTa and DeBERTa have the same architecture as BERT, which is transformer-based, so they can also store context. The only difference is that RoBERTa and DeBERTa improved BERT's parameters, training data, and training techniques. For a like-for-like comparison, in this research we compared the base version of each model: bert-base-uncased for the BERT model, which has 110 million parameters; roberta-base for the RoBERTa model, which has 125 million parameters; and deberta-v3-base for the DeBERTa model. For each model developed, this research creates 8 specific models according to the number of essay sets in the ASAP-AES dataset. Each model that we develop has two different layers: the base layer, which is filled with the transformer-based language model that is tested in this research, and the fully connected layer, which is a regular neural network with dense layers and the ReLU activation function [18] for each layer neuron. At the end of the model, there is the output layer, which will output the score of the essay in the range of 0 to 1. The architecture diagram for the model that we developed for automated essay scoring is as follows (Fig. 6), and a minimal code sketch of this design is given after the figure:


Fig. 6. Automated Essay Scoring Model Architecture Design
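A minimal PyTorch sketch of the two-part design in Fig. 6 (a transformer base layer followed by a fully connected head) is given below. The hidden size of the dense layer, pooling via the first token, and the final sigmoid used to keep the output in [0, 1] are assumptions made for illustration, not the authors' exact implementation.

```python
# Sketch of the AES model: deberta-v3-base as the base layer plus a small
# fully connected head that outputs a normalized score in [0, 1].
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

class EssayScorer(nn.Module):
    def __init__(self, base_name: str = "microsoft/deberta-v3-base"):
        super().__init__()
        self.base = AutoModel.from_pretrained(base_name)        # base layer
        hidden = self.base.config.hidden_size
        self.head = nn.Sequential(                               # fully connected layer(s)
            nn.Linear(hidden, 256), nn.ReLU(),
            nn.Linear(256, 1), nn.Sigmoid())                     # score in [0, 1]

    def forward(self, input_ids, attention_mask):
        out = self.base(input_ids=input_ids, attention_mask=attention_mask)
        cls = out.last_hidden_state[:, 0]                        # first-token representation
        return self.head(cls).squeeze(-1)

tokenizer = AutoTokenizer.from_pretrained("microsoft/deberta-v3-base")
model = EssayScorer()
batch = tokenizer(["An example student essay."], padding=True,
                  truncation=True, max_length=512, return_tensors="pt")
with torch.no_grad():
    print(model(batch["input_ids"], batch["attention_mask"]))
```

Swapping the base_name string for bert-base-uncased or roberta-base yields the two comparison models, since all three share the same interface in the transformers library.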

4.2 Setup

For the ASAP-AES dataset, before training we perform a train-validation-test split with a 60/20/20 composition, which means 60% of the dataset is used as training data, 20% as validation data, and the remaining 20% as test data. This is because the official test data of the ASAP-AES dataset is not publicly available. We use the Adam optimizer [19] with the learning rate set to 0.001. The learning rate value is obtained from hyperparameter tuning. We also set a learning rate scheduler that decays the learning rate when training reaches a plateau. For the loss function, we use Mean Squared Error (MSE). We run our training for 20 epochs with a batch size of 64. To accelerate the training process of the model, we used a single NVIDIA P100 GPU accelerator with 16 GB of GPU RAM.

4.3 Results and Discussion

After we trained all the models and tested them with the 20% unseen data of the dataset, we evaluated the models using the QWK metric as our evaluation metric. For each model, the QWK values of the computer-predicted scores compared to the human grader scores are as follows (Table 3):

Table 3. Result of QWK value for each model. The bold score is the best one for each prompt.

Model | Set 1 | Set 2 | Set 3 | Set 4 | Set 5 | Set 6 | Set 7 | Set 8 | Avg.
BERT | 0.828 | 0.659 | 0.656 | 0.766 | 0.780 | 0.811 | 0.811 | 0.626 | 0.741
RoBERTa | 0.835 | 0.686 | 0.678 | 0.802 | 0.789 | 0.793 | 0.819 | 0.587 | 0.749
DeBERTa V3 | 0.853 | 0.692 | 0.680 | 0.779 | 0.804 | 0.821 | 0.832 | 0.712 | 0.771

Using the QWK metric, which indicates the agreement between human and computer observations, the results show that the models trained with RoBERTa and DeBERTa have a higher average QWK value compared to the base BERT model. Although RoBERTa improves the performance of automated essay scoring with a QWK of 0.749, the DeBERTa-based model has the highest average QWK value at 0.771. The DeBERTa-based model improves the QWK value by over 3% compared to the base BERT model, from 0.741 to 0.771. The models that implement DeBERTa V3 also perform the best for almost every essay set, except for the fourth essay set. Therefore, it can be concluded that DeBERTa V3 is the best transformer-based language model to implement for an automated essay scoring system. The development of an automated essay scoring system has also been completed in this research. We developed the system using the Streamlit framework and Python as the programming language. The system that we developed is a web-based application, since this application will mostly be used by educational institutions, especially students, and for writing an essay a mobile-based application is, of course, not suitable for this use case. Here are some screenshots and interfaces of the system that has been implemented in this research (Figs. 7, 8 and 9); a minimal sketch of this front end is shown after the screenshots:

Fig. 7. Home Screen of the implemented AES System


Fig. 8. Essay Question Screen of the implemented AES System

Fig. 9. Answered Essay Question Screen of the implemented AES System
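A Streamlit front end of the kind shown in Figs. 7–9 can be sketched as below. The score_essay function here is only a placeholder; in the actual system it would load the trained DeBERTa-based model and return its prediction, and the page titles and widget labels are illustrative assumptions.

```python
# Minimal sketch of the web front end: an essay text area plus a button that
# calls a scoring function (placeholder for the trained DeBERTa-based model).
import streamlit as st

def score_essay(essay_text: str) -> float:
    # Placeholder: the real system would tokenize the essay and run the
    # trained DeBERTa-based regression model, returning a value in [0, 1].
    return min(1.0, len(essay_text.split()) / 350)

st.title("Automated Essay Scoring System")
st.write("Write your essay in response to the prompt and press Score.")

essay = st.text_area("Your essay", height=300)
if st.button("Score essay"):
    normalized = score_essay(essay)
    st.write(f"Predicted (normalized) score: {normalized:.2f}")
```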

5 Conclusion and Future Work

In this paper, we proposed the large language model DeBERTa V3 to be used as the base layer for implementing an automated essay scoring model. The results show that using the DeBERTa V3 transformer-based large language model can improve the performance of an automated essay scoring system. We established this by comparing the performance on the AES task of the models that we trained using BERT, RoBERTa, and DeBERTa on the ASAP-AES dataset; the DeBERTa-based model was the best-performing model. In addition, we also implemented the automated essay scoring system with the Streamlit framework and the Python programming language. One future direction of this research could be to use datasets other than the ASAP-AES dataset. Another direction might be to try a GPT-based language model as the base layer model.


References 1. Page, E.B.: The imminence of... grading essays by computer. Phi Delta Kappan 47(5), 238– 243 (1996). Phi Delta Kappa International 2. Ramesh, D., Sanampudi, S.K.: An automated essay scoring systems: a systematic literature review. Artif. Intell. Rev. 55, 2495–2527 (2022) 3. Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. arXiv:1810.04805 (2018) 4. Liu, Y., et al.: RoBERTa: A Robustly Optimized BERT Pretraining Approach. arXiv:1907. 11692 (2019) 5. He, P., Liu, X., Gao, J., Chen, W.: DeBERTa: Decoding-enhanced BERT with Disentangled Attention. arXiv:2006.03654 (2021) 6. Cohen, J.: A coefficient of agreement for nominal scales. Educ. Psychol. Measur. 20(1), 37–46 (1960) 7. Larkey, L.S.: Automatic essay grading using text categorization techniques. In: Proceedings of the 21st annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 90–95. ACM (1998) 8. Chen, H., He, B.: Automated essay scoring by maximizing human-machine agreement. In: Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, pp. 1741–1752. Association for Computational Linguistics (2013) 9. Kumar, R., Mathias, S., Saha, S., Bhattacharyya, P.: Many hands make light work: using essay traits to automatically score essays. In: Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 1485–1495. Association for Computational Linguistics (2022) 10. Dong, F., Zhang, Y.: Automatic features for essay scoring – an empirical study. In: Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, pp. 1072– 1077. Association for Computational Linguistics (2016) 11. Yang, R., Cao, J., Wen, Z., Wu, Y., He, X.: Enhancing automated essay scoring performance via fine-tuning pre-trained language models with combination of regression and ranking. In: Findings of the Association for Computational Linguistics: EMNLP 2020, pp. 1560–1569. Association for Computational Linguistics (2020) 12. Wang, Y., Wang, C., Li, R., Lin, H.: On the Use of BERT for Automated Essay Scoring: Joint Learning of Multi-Scale Essay Representation. arXiv:2205.03835 (2022) 13. The Hewlett Foundation: Automated Essay Scoring. https://www.kaggle.com/c/asap-aes. Accessed 19 Aug 2023 14. Manning, C.D., Schütze, H., Raghavan, P.: Introduction to Information Retrieval. Cambridge University Press, Cambridge (2008) 15. Normalization. https://www.codecademy.com/article/normalization. Accessed 19 Aug 2023 16. Bishop, C.: Pattern Recognition and Machine Learning. Springer, New York (2006). https:// doi.org/10.1007/978-0-387-45528-0 17. Cohen, J., Everitt, B.S., Fleiss, J.L.: Large sample standard errors of kappa and weighted kappa. Psychol. Bull. 72(5), 323–327 (1969) 18. Nair, V., Hinton, G.E.: Rectified linear units improve restricted Boltzmann machines. In: Proceedings of the 27th International Conference on Machine Learning (ICML 2010), pp. 807–814. ACM (2010) 19. Kingma, D.P., Ba, J.: Adam: A Method for Stochastic Optimization. arXiv:1412.6980 (2014)

Prediction of Glycemic Control in Diabetes Mellitus Patients Using Machine Learning Md. Farabi Mahbub(B) , Warsi Omrao Khan Shuvo, and Sifat Momen Department of Electrical and Computer Engineering, North South University, Plot 15, Block B, Bashundhara, Dhaka 1229, Bangladesh {farabi.mahbub,warsi.shuvo,sifat.momen}@northsouth.edu

Abstract. While machine learning has made significant strides in diabetes prediction, glycemic control, a crucial aspect of diabetes management, remains understudied and calls for enhanced forecasting techniques. In addition to economic benefits, proper glycemic control also functions as a preventative measure against potential health complications. This paper aims to provide an accurate and reliable approach to predicting glycemic control in individuals afflicted with diabetes mellitus using advanced machine learning techniques. A vast and comprehensive dataset comprising 77,724 recently diagnosed diabetes patients from the Istanbul province of Turkey in the year 2017 has been used in this study. Redundant features were eliminated and class imbalance was mitigated through the implementation of various sampling techniques. A collection of nine machine learning algorithms was utilized in order to predict glycemic control. Among the various trained models, LightGBM and CatBoost demonstrated exceptional performance, outperforming all other models with accuracy and AUC values of 83.36% and 89.49%, and 83.13% and 89.24%, respectively. Explainable AI tools, such as LIME and SHAP, were employed to comprehend the predictions of the models and gain insights into important features.

Keywords: Glycemic control · Diabetes · Machine Learning · Explainable AI · SMOTE

1 Introduction

A comprehensive understanding of proper glycemic control is an essential component in the management of diabetes, as it facilitates the maintenance of blood glucose levels within the desired range. Despite notable advancements in the utilization of machine learning techniques within the domain of diabetes [1,2], there remains a need for further enhancements in the comprehension of glycemic control to mitigate complications and facilitate effective treatment. The objective of this study is to determine whether diabetic patients will exhibit controlled or uncontrolled glycemic control in the future, thereby enabling early identification


and intervention for high-risk patients, potentially leading to cost savings and improved patient outcomes. The concept of glycemic control is closely related to diabetes mellitus, a term employed to delineate a group of metabolic disorders characterized by elevated levels of glucose in the bloodstream. Individuals diagnosed with diabetes are at an increased vulnerability to various severe and potentially fatal health complications, leading to elevated expenses for medical care, diminished quality of life, and increased mortality rates [3]. The prevalence of diabetes is escalating globally due to the rapid rise in obesity rates and shifts in lifestyle patterns [4]. Moreover, it has been found to enhance the probability of experiencing depression [5]. The number of people with diabetes was projected to reach 108 million in 1980, according to a report by the World Health Organization (WHO) [6]. Estimates show that there were 451 million people with diabetes worldwide in 2017, and that number is expected to rise to 693 million by 2045 [7]. Furthermore, it is worth noting that in the year 2012, a total of 2.2 million fatalities were attributed to elevated blood glucose levels [8]. Diabetes affects approximately 422 million people globally, with the majority living in low and middle-income countries, and it is directly responsible for 1.5 million fatalities each year [9]. People with diabetes incur average annual medical expenses of $16,752, of which $9,601 are attributable to diabetes. Medical expenditures for individuals with diabetes are approximately 2.3 times greater than they would be in the absence of diabetes. [10]. The mounting expenses associated with the management of diabetes mellitus and its resulting outcomes highlight the economic benefits of preventive measures, alongside the health advantages. The implementation of effective prevention strategies has the capacity to reduce costs related to hospitalization, emergency care, medical services, pharmaceuticals, and surgical procedures, thus offering a promising opportunity for cost reduction [11]. The maintenance of optimal glycemic control has been shown to not only improve cognitive functioning but also reduce the severity of complications [12]. The diligent management of glycemic control is widely regarded as the fundamental approach to treating diabetes, owing to its significant impact on outcomes and prognoses in affected individuals. Additionally, it is widely acknowledged as the primary goal in mitigating the severe consequences associated with hyperglycemia [13], as well as the prevention of microvascular and macrovascular complications [14]. The likelihood of developing neuropathy, retinopathy, and nephropathy increases as glycemic control worsens [15]. Unregulated glycemic control has been found to be correlated with elevated mortality rates among individuals with diabetes, indicating that enhanced glycemic control may potentially lead to improved prognosis [16]. While the task of predicting inadequate glycemic control in individuals with diabetes has historically posed difficulties, the utilization of machine learning techniques has shown promise in enhancing the assessment of clinical glycemic control when compared to conventional approaches. Machine Learning models take into account various parameters in the prediction of glycemic control, a factor that often goes unnoticed in traditional testing methods.


The dataset employed in this study consists of medical data obtained from individuals who were newly diagnosed with diabetes in Istanbul, Turkey, during the year 2017 [17]. The dataset comprises the primary medical records of a total of more than 77 thousand individuals, encompassing a comprehensive set of 107 attributes. These criteria capture not only the fundamental patient information but also provide documentation of the glycemic control status at a year interval subsequent to the initial diagnosis. The utilization of this dataset is pivotal in the identification and comprehension of the key determinants that impact glycemic control in individuals with diabetes mellitus. The main aim of this study was to construct a reliable machine-learning model with a specific focus on forecasting glycemic control in individuals diagnosed with diabetes. In order to accomplish this objective, we have chosen a dataset that has not been explored in prior research works. The objective of this study was to leverage the dataset in order to offer novel insights and advancements in the domain of glycemic control prediction. In contrast to previous investigations that employed a restricted number of classifiers, our study employed an extensive collection of nine classifiers. The performance was thoroughly assessed using a variety of evaluation criteria. Furthermore, the integration of Explainable AI models signifies a notable progression in elucidating the underlying reasoning behind predicted outcomes. Nevertheless, previous research has not fully utilized the capabilities of Explainable AI models. This study attempts to capitalize on Explainable AI models by emphasizing the important parameters that contribute to the decision-making process. The primary objective of this study was to make a valuable contribution to the field by presenting a robust and all-encompassing methodology for predicting glycemic control. Through the utilization of an untapped dataset, the application of a varied assortment of classifiers, and the utilization of Explainable AI models, our objective was to enhance comprehension of glycemic control prediction and offer significant insights for medical professionals and researchers in the domain of diabetes. The rest of this paper is structured as follows: Sect. 2 provides a brief overview of related works in the field. Section 3 presents the methodology employed in this research, detailing the approach and procedures used. The results of the study are presented in Sect. 4 and discussed in Sect. 5. Finally, Sect. 6 concludes the paper by summarizing the key findings.

2 Related Works

Deberneh et al. [18] conducted a study with the objective of employing machine learning algorithms to forecast the occurrence of type 2 diabetes in the subsequent year, utilizing present-year data among Korean patients. The researchers employed ANOVA tests, chi-squared tests, and recursive feature elimination techniques to identify significant features for the predictive model. In order to address the issue of class imbalance, both majority under-sampling methods and synthetic minority over-sampling methods were employed. The disparity in performance among the models is minimal, with multiple models attaining an accuracy rate of 73% on the test dataset.

Table 1. Comparison of Notable Literature

Ref | Sample Size | Age range (years) | Advantage | Limitation
Deberneh et al. [18] | 535,169 | 18–108 | Dataset with large sample size | Uses FPG level as the only measurement to define normal, prediabetes, and diabetes
Nagaraj et al. [19] | >50,000 | Mean 66 | Predicts both long and short term HbA1c | Limited classifier diversity
Murphree et al. [20] | 12,147 | ≥18 | Data collected over a long time period | Low accuracy
Del Parigi et al. [21] | 1,363 | 53–57 | Uses traditional and novel data analysis methodologies | Limited classifier diversity
Wang et al. [22] | 2,787 | Median 57 | Dataset contains rich feature set | Study includes only outpatient data
Fan et al. [23] | 165 | >45 mostly | Study focuses on non-adherent T2D patients | Low sample size
Güemes et al. [24] | 6 | 40–60 | Employed diverse preprocessing methods | Extremely low sample size
Abegaz et al. [25] | 33,826 | ≥18 | Incorporates data from underrepresented ethnic groups | Focuses only on patients from USA

accuracy rate of 73% on the test dataset. However, when considering additional metrics, it becomes evident that the support vector machine algorithm exhibited slightly better results than other algorithms (Table 1).

Nagaraj et al. [19] sought to evaluate the efficacy of supervised machine learning techniques in identifying clinical variables that can be used to predict the glycated hemoglobin (HbA1c) response following the initiation of insulin treatment in patients diagnosed with type 2 diabetes mellitus (T2DM) in The Netherlands. The researchers employed elastic net regularization as a method for selecting variables from a pool of 24 clinical variables. They then proceeded to utilize three pre-existing machine learning algorithms to classify the response of HbA1c in both short-term and long-term scenarios following the initiation of treatment. Two distinct models were created to address the prediction of short-term and long-term outcomes, due to limitations of the dataset. The limitations were addressed to mitigate potential biases in the estimation process. The AUC scores were 0.80 and 0.81, respectively.

Murphree et al. [20] employed a range of machine learning algorithms to forecast the likelihood of metformin therapy success in achieving and maintaining optimal blood glucose levels, as well as the potential occurrence of primary or secondary treatment failure within a one-year timeframe. The research was primarily conducted within the southern regions of the United States, with the sample population comprising approximately 70% Caucasian adults. The researchers


employed a five-fold cross-validation technique to evaluate the performance of their model; the AUCs ranged from 0.58 to 0.75, where baseline HbA1c, starting metformin dosage, and the presence of diabetes with complications were the most crucial variables.

Del Parigi et al. [21] used machine learning to identify which treatments will work best for T2DM patients by finding the characteristics that achieved and maintained a target HbA1c under 7%. The dataset was gathered from two phase III studies on patients who were treatment naive or receiving background metformin. The objective of this study was to assess the efficacy of random forest and classification tree models in identifying HbA1c reduction. The researchers determined that HbA1c and fasting plasma glucose were the primary factors influencing the achievement of glycemic control. Other variables, such as body weight, waist circumference, and blood pressure, were found to have no significant impact on the outcome. The random forest model demonstrated a prediction accuracy of 81%, while the classification tree exhibited a slightly lower accuracy of approximately 79%.

Wang et al. [22] aimed to study glycemic control among T2DM patients in North China. The researchers integrated elastic net regularization into their machine learning models in order to enhance performance. The study revealed that a majority of the patients exhibited poor glycemic control and central adiposity. Furthermore, it was determined that several factors, including family history, duration of diabetes, blood pressure, and hypertension, played a significant role in the elevated levels of HbA1c. The random forest exhibited sensitivity and accuracy values of 79% and 75%, while the support vector machine achieved values of 84% and 73%. Similarly, the BP-ANN model yielded sensitivity and accuracy of 78% and 73%, respectively.

Fan et al. [23] sought to examine the usefulness of machine learning algorithms in predicting the likelihood of complications and poor glycemic control in non-adherent individuals diagnosed with type 2 diabetes. The researchers conducted their study using a highly specific dataset comprising 165 non-adherent T2D patients from the Sichuan province in China. They found the number of hypoglycemic drugs to be the key risk factor for glycemic control, while the duration of T2D and unadjusted hypoglycemic treatment were the key risk factors for diabetic complications. Among the trained models, the artificial neural network obtained the highest accuracy of 76%.

Güemes et al. [24] introduced a data-driven approach for forecasting the efficacy of overnight glycemic control, as well as determining whether blood glucose levels are maintained within the desired range. The researchers utilized the openly accessible OhioT1DM dataset. The researchers employed SMOTE to address the issue of class imbalance within the dataset. Additionally, the dataset was partitioned into ten folds for the purpose of conducting cross-validation. Various binary classifiers, including random forest, artificial neural networks, support vector machine, linear logistic regression, and extended tree classifiers, were employed to classify the quality of overnight glycemic control. Certain models demonstrated strong predictive capabilities for normoglycemia, while others


exhibited notable proficiency in predicting nocturnal hyperglycemia. The random forest model obtained an AUC score of 0.73.

Abegaz et al. [25] employed a supervised machine learning approach to forecast uncontrolled diabetes mellitus by utilizing various patient characteristics. The dataset comprises 33,826 records of patients diagnosed with diabetes, collected from the AoU Research Program. The issue of class imbalance was effectively managed through the implementation of the random over-sampling examples (ROSE) technique, which was designed to reduce the negative impacts of imbalanced values of the outcome variable on the performance of the model. The random forest model demonstrated the highest level of predictive accuracy, achieving an accuracy rate of 80% with an AUC score of 0.77. The main indicators of uncontrolled diabetes were found to be potassium levels, body weight, aspartate aminotransferase, height, and heart rate.

3 Methodology

This section primarily focuses on the design of the project. A flow chart diagram is used to visually depict all of the steps involved in the project. The graphical representation of our methodology is illustrated in Fig. 1.

3.1 Dataset

The dataset utilized in this study comprises 107 distinct features and includes 77,724 individuals. It was obtained from the residents of Istanbul, Turkey. The data was obtained from individuals who were newly diagnosed with diabetes mellitus in the year 2017. In this study, the criteria for identifying individuals with diabetes included the diagnosis of diabetes based on ICD-10 codes E10-E14, the prescription of antidiabetic medications other than metformin, or having an HbA1c level exceeding 6.5. All individuals diagnosed with diabetes underwent an assessment of their baseline serum creatinine, lipid profile, and a minimum of four measurements of HbA1c, typically conducted annually. Based on the HbA1c profile, the patients were categorized into two distinct groups: those who demonstrated effective glycemic control, characterized by HbA1c values below 7 in their last two measurements, and those who exhibited inadequate glycemic control. The dataset exclusively consists of numerical data, with no categorical variables present. The focus of our analysis is the parameter referred to as "glycemic control", with possible values of 0 and 1. The value of 0 denotes a condition of well-maintained glycemic control, whereas the value of 1 signifies poor glycemic control.

3.2 Exploratory Data Analysis

The feature of interest, glycemic control, comprises a greater number of instances characterized by poor control (Fig. 2). Therefore, it can be observed that our dataset is imbalanced.


Fig. 1. Project Design of our Proposed Approach

The dataset encompasses individuals across various age groups, with a predominant representation of individuals aged 40 years and above (Fig. 3a). There does not seem to be a definitive association between age and glycemic control (Fig. 3b). The dataset consists of a majority of females, accounting for over 66% of the sample (Fig. 4a). Preliminary analysis suggests that females exhibit better glycemic control compared to males (Fig. 4b).


Fig. 2. Distribution of glycemic control

Fig. 3. Distribution of age and relationship with glycemic control: (a) distribution of age; (b) relation between age and glycemic control.

Figure 5a shows the initial levels of HbA1c among the individuals included in our dataset. The data reveals that a majority of the participants exhibit HbA1c values within the 5–10% range. Furthermore, it is evident that individuals with higher initial HbA1c levels tend to demonstrate poor glycemic control overall (Fig. 5b).

3.3 Data Preprocessing

Data preprocessing is a crucial step in the construction of a machine learning model, as it converts raw data into a format that is both useful and efficient.


Fig. 4. Distribution of gender and relationship with glycemic control: (a) distribution of gender; (b) relation between gender and glycemic control.

Fig. 5. Distribution of initial HbA1c levels and relationship with glycemic control: (a) distribution of initial HbA1c levels; (b) relation between HbA1c levels and glycemic control.

Data preprocessing typically involves handling null values, addressing missing data, and converting categorical data into numerical data. None of these were present in the dataset. The values "1" and "2" for the gender attribute represent female and male, respectively; they were transformed into the binary digits "0" and "1". Additionally, the redundant feature "id" was removed from the dataset.
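As a rough illustration of these preprocessing steps, the sketch below recodes the gender attribute and drops the identifier column with pandas. The file name and the exact column labels ("id", "gender", "glycemic control") are assumptions, since the paper does not list the dataset's literal headers.

```python
import pandas as pd

# Minimal preprocessing sketch; column names are illustrative assumptions.
df = pd.read_csv("diabetes_istanbul_2017.csv")  # hypothetical file name

# Recode gender from {1, 2} (female, male) to binary {0, 1}.
df["gender"] = df["gender"].map({1: 0, 2: 1})

# Drop the redundant identifier column.
df = df.drop(columns=["id"])

# Separate the predictors from the binary target
# (0 = under control, 1 = poor control).
X = df.drop(columns=["glycemic control"])
y = df["glycemic control"]
```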

3.4 Splitting Dataset

The division of the dataset into train and test sets is a crucial step in the machine learning process. Through the process of data segmentation, it becomes possible to evaluate the model's ability to effectively handle new data. By partitioning the dataset into separate training and testing sets, the model is precluded from having access to the test data prior to evaluation. This allows for an accurate assessment of the performance of the classifiers when they are evaluated on the test set. Our dataset was divided in an 80:20 ratio using the holdout split method.
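A minimal sketch of the 80:20 holdout split with scikit-learn is shown below; stratifying on the target and the random seed are our own assumptions, not details stated in the paper.

```python
from sklearn.model_selection import train_test_split

# 80:20 holdout split; stratification keeps the class proportions in both
# splits (an assumed choice), and random_state is an arbitrary seed.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)
```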

3.5 Feature Selection

Feature selection is a technique employed to eliminate redundant features from a dataset, thereby mitigating the risk of overfitting. Pearson's correlation coefficient was employed to assess the degree of the relationship between two variables for feature selection. The correlation coefficient is calculated using the formula shown below:

r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \sum_{i=1}^{n} (y_i - \bar{y})^2}}    (1)

After conducting the feature selection process, a total of 8 features were eliminated, resulting in a remaining set of 97 features. The features that were omitted include cardiovascular drugs, respiratory sys drugs, pshycoanaleptics, eye ear drugs, systemic hormones, Cholesterol, lipid modifying, and dermatologic drugs.
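The snippet below sketches one common way to apply Pearson's correlation coefficient for dropping redundant features. The 0.9 correlation threshold and the feature-to-feature (rather than feature-to-target) formulation are assumptions, as the paper does not state the exact criterion used to remove the 8 features.

```python
import numpy as np

# Drop one feature from every highly correlated pair, measured by the
# absolute Pearson correlation on the training split only.
corr = X_train.corr(method="pearson").abs()
upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
to_drop = [col for col in upper.columns if (upper[col] > 0.9).any()]

X_train = X_train.drop(columns=to_drop)
X_test = X_test.drop(columns=to_drop)
```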

3.6 Min-Max Scaling

To achieve data generalization, the technique of scaling is utilized to minimize the disparity between data points and mitigate the presence of noise. The Min-Max scaler is a normalization method employed to standardize the features of a given dataset within a predetermined range, typically spanning from 0 to 1. The process involves subtracting the feature's minimum value and dividing by the range, defined as the difference between the maximum and minimum values. Normalization is advantageous in ensuring that all features are standardized to a comparable scale. Therefore, our dataset was subjected to Min-Max scaling. The formula for Min-Max scaling is presented as follows:

X_{\text{scaled}} = \frac{X - X_{\min}}{X_{\max} - X_{\min}}    (2)
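A minimal sketch of Min-Max scaling with scikit-learn, fitting the scaler on the training split only so that the test split is transformed with the same minima and maxima:

```python
from sklearn.preprocessing import MinMaxScaler

# Fit on the training data and reuse the fitted ranges for the test data,
# so no information from the test set leaks into the [0, 1] normalization.
scaler = MinMaxScaler(feature_range=(0, 1))
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)
```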

3.7 Under Sampling Technique

The technique of under-sampling, which involves retaining all data points from the minority class while reducing the number of instances in the majority class, is commonly employed to address the issue of imbalanced datasets. The dataset exhibited a significant imbalance, with 28,791 under-control cases and 48,933 poor-control cases. Instances from the majority class were randomly dropped in order to achieve an equal representation with the minority class.
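The random dropping of majority-class instances can be sketched with the imbalanced-learn library as below; the library choice and the random seed are assumptions, since the paper only describes the procedure itself.

```python
from imblearn.under_sampling import RandomUnderSampler

# Randomly discard poor-control (majority) instances until both classes
# are equally represented in the training split.
rus = RandomUnderSampler(sampling_strategy=1.0, random_state=42)
X_train_under, y_train_under = rus.fit_resample(X_train_scaled, y_train)
```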

3.8 SMOTE

Synthetic Minority Over-sampling Technique (SMOTE) is a data augmentation method that generates synthetic samples for the minority class [26]. In order to

226

M. F. Mahbub et al.

address the issue of a highly imbalanced dataset, SMOTE was utilized to achieve a balanced distribution of data after the dataset was split. SMOTE is used to address class imbalance and aims to equalize the representation of the minority class with the majority class. This is achieved by generating synthetic instances of the minority class. The SMOTE ratio refers to the desired balance between the minority and majority classes in the dataset after applying SMOTE. It represents the ratio of the number of synthetic minority class instances generated to the number of majority class instances. The k-neighbours parameter determines the number of closest minority class neighbors to be considered when generating synthetic instances. It specifies the number of nearest minority class instances that are used to create the line segments in the feature space. Hyperparameter tuning was employed to ascertain the optimal value of the nearest minority class neighbour (k) parameter and its corresponding ideal ratio. Table 2 illustrates the resulting settings.

Table 2. SMOTE Ratio

Algorithm | SMOTE Ratio | K Neighbours
Logistic Regression | 0.6 | 1
Random Forest | 0.7 | 5
K-Nearest Neighbour | 0.6 | 2
Decision Tree | 0.6 | 1
Extra Trees | 0.6 | 2
XGBoost | 0.6 | 9
Multinomial Naive Bayes | 0.7 | 8
CatBoost | 0.6 | 9
LightGBM | 0.6 | 5
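A sketch of applying SMOTE to the training split with one of the tuned settings from Table 2 (the LightGBM configuration, ratio 0.6 and k = 5); the imbalanced-learn implementation is an assumption.

```python
from imblearn.over_sampling import SMOTE

# sampling_strategy is the desired minority/majority ratio after resampling,
# and k_neighbors is the number of nearest minority neighbours used to
# synthesize new samples (values taken from Table 2 for LightGBM).
smote = SMOTE(sampling_strategy=0.6, k_neighbors=5, random_state=42)
X_train_smote, y_train_smote = smote.fit_resample(X_train_scaled, y_train)
```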

3.9 Classifiers Algorithms and Hyper-parameter Optimization

The data is now ready to be utilized to create a machine learning model following feature selection and data preparation.

Decision Tree Classifier. Decision Tree is a supervised machine learning algorithm that builds a tree-like model for making decisions. It partitions the data into subsets based on the feature values and recursively constructs decision rules in the form of a tree. The splits are made based on the criterion used. In this paper, we used entropy as the criterion and a maximum depth of 6. The entropy criterion measures the impurity or randomness of the data at each node, aiming to minimize it and create more homogeneous subsets. This criterion helps guide the decision tree's split choices. By limiting the maximum depth to 6, we struck a balance between model complexity and overfitting. The formula for entropy is:

\text{Entropy}(S) = -\sum_{i=1}^{n} p(i) \log_2 p(i)    (3)

Here, S represents the set of samples at a particular node, p(i) is the proportion of samples belonging to class i, and the summation is taken over all classes.

Random Forest. Random Forest is an ensemble learning method that combines multiple decision trees to make predictions. It creates a collection of decision trees, each trained on a different subset of the data using a random selection of features. The final prediction is made by aggregating the predictions of all the individual trees. The Gini index was employed to measure the impurity of the data at each node. By setting the number of estimators to 10, we ensured a reasonable number of decision trees in the random forest. Additionally, we limited the maximum depth of each tree to 10, controlling their complexity and preventing overfitting. The Gini index is calculated using the formula:

\text{Gini}(D) = 1 - \sum_{i=1}^{n} p(i)^2    (4)

Here, D represents the dataset at a particular node, n is the number of classes, and p(i) is the proportion of samples belonging to class i.

K-Nearest Neighbour. The K-Nearest Neighbors (KNN) classifier is a non-parametric algorithm that assigns a class label to a data point based on its K nearest neighbors in the feature space. In this study, we utilized the KNN classifier with a value of K set to 350, where the 350 nearest neighbors were considered for predictions. The Minkowski distance metric (see Eq. 5) was employed as the default distance measure for computing the proximity between data points. Additionally, the power parameter was set to 2, corresponding to the Euclidean distance metric.

d(u, v) = \left( \sum_{i=1}^{n} |u_i - v_i|^p \right)^{1/p}    (5)

Logistic Regression. Logistic Regression is the most frequently used binary classification algorithm; it models the relationship between the features and the probability of belonging to a certain class. It is derived from linear regression by applying the sigmoid function to the linear combination of the input features and their corresponding coefficients. The sigmoid function maps the input features to a predicted probability ranging from 0 to 1. In this study, we utilized Logistic Regression with L2 regularization to prevent overfitting. The model was trained using the liblinear solver.

228

M. F. Mahbub et al.

Extra Trees Classifier. Extra Trees is an ensemble learning method that builds a collection of decision trees and combines their predictions. It is similar to a random forest classifier, but differs in the way the decision trees are constructed: it introduces additional randomization during splitting and uses averaging to improve predictive accuracy and counter overfitting.

XGBoost. XGBoost (Extreme Gradient Boosting) is an optimized gradient boosting algorithm known for its high performance in various machine learning tasks. It utilizes the principle of gradient boosting to sequentially build an ensemble of weak prediction models, typically decision trees, that correct the mistakes made by the previous models. In this study, we utilized XGBoost with the objective function set to binary logistic, facilitating binary classification using the logistic loss function. To achieve the best results, we trained the model with 200 boosting rounds or iterations.

Multinomial Naive Bayes. The Naive Bayes classifier is a probabilistic classification algorithm that applies Bayes' theorem under the assumption of feature independence. The process involves the computation of the posterior probability of a class, which is determined by the input features. This computation is based on the prior probabilities and likelihoods that are derived from the training data. Multinomial Naive Bayes (MNB) is a specialized form of the Naive Bayes algorithm that is specifically tailored for data that follows a multinomial distribution. It is commonly employed in text classification tasks, where features are derived from word frequencies or counts.

CatBoost. CatBoost is a machine learning algorithm designed to handle categorical features in data. It is based on gradient boosting and combines methods like ordered boosting, random permutations, and greedy decision tree building to produce high accuracy and quick training times. The capacity of CatBoost to deal with missing values in the data is one of its important characteristics. We used a learning rate of 0.5 and limited the number of iterations to 50. CatBoost allows the usage of custom loss functions; we used AUC and Accuracy to evaluate the model's performance.

LightGBM. LightGBM is a gradient boosting framework that excels in handling large-scale datasets and offers high-speed performance. It employs a novel technique called Gradient-based One-Side Sampling (GOSS) to achieve efficient training and utilizes the Exclusive Feature Bundling (EFB) algorithm for efficient feature grouping. Compared to standard gradient boosting techniques, the methodology employs a histogram-based approach to divide nodes in decision trees, reducing computing costs and enabling quicker training. LightGBM also provides advanced features such as handling of missing values, categorical features, and unbalanced datasets.
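The sketch below instantiates the nine classifiers with the hyper-parameters that are explicitly mentioned in this section; every setting not stated in the text is left at its library default, and mapping CatBoost's AUC and Accuracy to the custom_metric argument is our interpretation of the description above.

```python
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier, ExtraTreesClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB
from xgboost import XGBClassifier
from catboost import CatBoostClassifier
from lightgbm import LGBMClassifier

# The nine classifiers; parameters reflect the values given in the text.
models = {
    "Decision Tree": DecisionTreeClassifier(criterion="entropy", max_depth=6),
    "Random Forest": RandomForestClassifier(criterion="gini",
                                            n_estimators=10, max_depth=10),
    "KNN": KNeighborsClassifier(n_neighbors=350, metric="minkowski", p=2),
    "Logistic Regression": LogisticRegression(penalty="l2", solver="liblinear"),
    "Extra Trees": ExtraTreesClassifier(),
    "XGBoost": XGBClassifier(objective="binary:logistic", n_estimators=200),
    "Multinomial Naive Bayes": MultinomialNB(),
    "CatBoost": CatBoostClassifier(learning_rate=0.5, iterations=50,
                                   custom_metric=["AUC", "Accuracy"],
                                   verbose=0),
    "LightGBM": LGBMClassifier(),
}

for name, model in models.items():
    # Fit on one of the prepared training sets, e.g. the SMOTE-balanced one.
    model.fit(X_train_smote, y_train_smote)
```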

4 Experiment and Results

Three distinct sets of experiments were conducted in this study. Initially, the machine learning algorithms were trained using the processed training set, without using any sampling strategy. In the subsequent experiment, the classifiers were trained using undersampled data. Finally, the classifiers were trained on oversampled data. The same set of nine classifiers, namely Decision Tree, Random Forest, K-Nearest Neighbors, Logistic Regression, Extra Trees, XGBoost, Multinomial Naive Bayes, CatBoost, and LightGBM, was employed across all the experiments.

4.1 Performance Metrics

In addition to accuracy, the evaluation of machine learning models on imbalanced datasets warrants the utilization of a diverse set of metrics to ensure a comprehensive assessment of model performance. Accuracy alone may not provide an accurate representation of a model's effectiveness when the dataset exhibits significant class imbalance. To address this limitation, we incorporated additional evaluation metrics, namely precision, recall, F1-score, and the area under the receiver operating characteristic curve (AUC-ROC), in our analysis.

Accuracy. Accuracy is a performance metric used to evaluate the effectiveness of a model by assessing the ratio of accurate predictions to the total number of predictions made. The calculation involves the division of the sum of true positive (TP) and true negative (TN) predictions by the total number of predictions. The accuracy formula can be mathematically represented as:

\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}    (6)

In the equation above, TP represents the number of true positive predictions, TN represents the number of true negative predictions, FP represents the number of false positive predictions, and FN represents the number of false negative predictions.

Precision. Precision is an evaluation metric that measures the accuracy of positive predictions made by a model. It quantifies the proportion of true positive predictions out of all positive predictions.

\text{Precision} = \frac{TP}{TP + FP}    (7)

Recall. Recall, also known as sensitivity or true positive rate, is an evaluation metric in machine learning that quantifies the ability of a model to correctly identify positive instances. It measures the proportion of true positive predictions (correctly predicted positive instances) out of all actual positive instances.

\text{Recall} = \frac{TP}{TP + FN}    (8)


F1-Score. The F1-score metric combines precision and recall into a single measure. It provides a balanced assessment of a model's performance by taking into account both the ability to correctly identify positive instances (precision) and the ability to capture all positive instances (recall).

\text{F1-score} = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}    (9)

AUC-ROC. AUC-ROC, or AUC, represents the term Area Under the ROC Curve. The ROC curve is a graphical representation that illustrates the performance of a classification model across various classification thresholds. The plotted curve represents two parameters, namely the true positive rate and the false positive rate. AUC quantifies the total area beneath the ROC curve and offers a comprehensive evaluation of performance by considering all potential classification thresholds.
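A minimal sketch of computing these five metrics for a fitted classifier on the held-out test split; the weighted averaging of precision, recall, and F1 is an assumption, since the paper does not state how the per-class scores were aggregated.

```python
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, roc_auc_score)

# Evaluate one fitted classifier ("model" from the earlier sketch).
y_pred = model.predict(X_test_scaled)
y_prob = model.predict_proba(X_test_scaled)[:, 1]

scores = {
    "accuracy": accuracy_score(y_test, y_pred),
    "precision": precision_score(y_test, y_pred, average="weighted"),
    "recall": recall_score(y_test, y_pred, average="weighted"),
    "f1": f1_score(y_test, y_pred, average="weighted"),
    "auc": roc_auc_score(y_test, y_prob),
}
print(scores)
```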

Table 3. Model performance on data with no sampling technique.

Classifier | Train Accuracy | Test Accuracy | AUC | Precision | Recall | F1 Score
Decision Tree | 82.94% | 82.48% | 82.47% | 82.65% | 88.53% | 82.65%
Random Forest | 83.27% | 82.13% | 87.55% | 82.06% | 82.13% | 81.71%
KNN | 66.18% | 66.09% | 66.66% | 65.77% | 66.09% | 59.53%
Logistic Regression | 82.94% | 82.14% | 88.38% | 82.18% | 82.14% | 82.16%
Extra Trees | 80.67% | 76.12% | 83.32% | 76.84% | 76.12% | 74.36%
XGBoost | 89.28% | 82.89% | 88.63% | 82.73% | 82.89% | 82.69%
Multinomial Naive Bayes | 70.20% | 69.96% | 74.54% | 69.08% | 69.96% | 68.03%
CatBoost | 84.13% | 83.04% | 89.21% | 82.89% | 83.04% | 82.80%
LightGBM | 83.10% | 83.14% | 89.33% | 83.00% | 83.14% | 82.89%

4.2 Experimental Results Without Sampling Technique

This section examines the performance of the classifiers without the usage of any sampling technique. The performance metrics for each classifier are summarised in Table 3. The LightGBM algorithm demonstrates superior performance across all metrics compared to other algorithms. The model exhibits accuracy, AUC, precision, recall, and F1 scores of 83.14%, 89.33%, 83.00%, 83.14%, and 82.89%, respectively. CatBoost displayed comparable performance to LightGBM, achieving an accuracy rate of 83.04%. However, it should be noted that both KNN and Multinomial Naive Bayes exhibited the lowest accuracies, falling below the threshold of 70%. Among the other classifiers, Extra Trees performed poorly, with an accuracy of only 76.12%, while the rest of the classifiers all had accuracy rates of at least 82%. A classifier's effectiveness in discriminating between several classes is summarized using the ROC curve. Figure 6 presents the ROCs for each classifier.


Fig. 6. ROC curve of the classifiers without any sampling technique.

4.3 Experimental Results with Under Sampling Technique

In this section, we have analysed the efficacy of classifiers utilising the undersampling technique. This technique involves reducing the number of data instances in the majority class to match that of the minority class. Table 4 presents a comprehensive summary of the performance metrics for each classification model in this experiment. In terms of test accuracy, the LightGBM algorithm outperforms its competitors with a score of 82.34%. Furthermore, it outperforms them in terms of precision, recall, and F1 scores, with values of 82.58%, 82.34%, and 82.42%, respectively. In terms of accuracy, the Decision Tree algorithm demonstrated the second-highest performance, with the CatBoost algorithm just slightly behind. In contrast, the K-Nearest Neighbours (KNN) algorithm exhibited the lowest accuracy rate, amounting to a mere 57.08%. In general, the performance of all classifiers exhibited a decline when utilizing the undersampling technique, as a substantial quantity of data was discarded in this process. However, it is noteworthy that the LightGBM model attained the highest AUC score of 89.42%, surpassing all other scores obtained from experiments conducted without any sampling techniques. The ROC curve for each classifier, utilizing the undersampling strategy, is depicted in Fig. 7.

4.4 Experimental Results with SMOTE

This section examines the performance of classifiers when employing SMOTE, an oversampling technique that generates synthetic samples for the minority class. The third section of this paper delineates the utilization of the method in our study, while the specific ratios employed are expounded upon in Table 2.

Table 4. Model performance on data with under sampling technique.

Classifier | Train Accuracy | Test Accuracy | AUC | Precision | Recall | F1 Score
Decision Tree | 81.12% | 81.84% | 81.81% | 81.78% | 88.34% | 81.78%
Random Forest | 81.42% | 81.09% | 87.45% | 81.29% | 81.09% | 81.17%
KNN | 61.28% | 57.08% | 66.39% | 64.56% | 57.08% | 57.24%
Logistic Regression | 79.69% | 78.37% | 88.25% | 80.64% | 78.37% | 78.72%
Extra Trees | 84.53% | 75.86% | 84.44% | 77.66% | 75.86% | 76.23%
XGBoost | 90.93% | 81.04% | 88.34% | 81.48% | 81.04% | 81.18%
Multinomial Naive Bayes | 68.75% | 68.07% | 74.66% | 70.14% | 68.07% | 68.57%
CatBoost | 82.82% | 81.70% | 89.15% | 82.07% | 81.70% | 81.82%
LightGBM | 82.78% | 82.34% | 89.42% | 82.58% | 82.34% | 82.42%

Fig. 7. ROC curve of the classifiers with undersampling technique.

The performance metrics obtained for each classification model using the oversampling technique are summarised in Table 5. The LightGBM algorithm performs better than other classifiers in terms of test accuracy, precision, recall, and F1 score, achieving values of 83.36%, 83.22%, 83.36%, and 83.14%, respectively. Once again, the CatBoost algorithm demonstrates a similar performance to that of LightGBM, achieving an accuracy rate of 83.13%. Most classifiers showed their highest performance levels in experimental settings that employed the oversampling technique. Figure 8 illustrates the ROC curve for all classifiers using the oversampling technique. LightGBM also achieves the highest AUC score of 89.49%.


Table 5. Model performance on data with oversampling technique.

Classifier | Train Accuracy | Test Accuracy | AUC | Precision | Recall | F1 Score
Decision Tree | 82.87% | 82.53% | 82.47% | 82.70% | 88.55% | 82.70%
Random Forest | 82.27% | 82.01% | 87.68% | 81.85% | 82.01% | 81.68%
KNN | 66.18% | 66.09% | 66.66% | 65.77% | 66.09% | 59.53%
Logistic Regression | 81.96% | 82.07% | 88.40% | 82.16% | 82.07% | 82.11%
Extra Trees | 80.97% | 76.52% | 83.54% | 77.27% | 76.52% | 74.82%
XGBoost | 86.73% | 82.91% | 88.99% | 82.74% | 82.91% | 82.70%
Multinomial Naive Bayes | 69.29% | 70.43% | 74.68% | 69.69% | 70.43% | 69.78%
CatBoost | 84.19% | 83.13% | 89.24% | 82.97% | 83.13% | 82.90%
LightGBM | 84.31% | 83.36% | 89.49% | 83.22% | 83.36% | 83.14%

Fig. 8. ROC curve of the classifiers with oversampling technique.

Fig. 9. Learning curves for LightGBM with different sampling techniques: (a) no sampling; (b) undersampling; (c) oversampling.

4.5 Learning Curve

A learning curve visually represents the change in a model's performance as more training data is provided. Figure 9 shows the learning curves for LightGBM for the three sampling techniques with accuracy as the metric, evaluating performance using 3-fold cross-validation. The training score refers to the model's performance on the training data, while the cross-validation score provides an estimate of how well the model generalizes to unseen data. For all three cases, the training score is the highest and the cross-validation score is the lowest at roughly 5000 training examples. This suggests that the model struggles to generalize well to unseen data at this point, indicating overfitting. Subsequently, the training score declines, while cross-validation scores rise rapidly up to 20,000 examples, suggesting improved generalization. For no sampling (Fig. 9a), the cross-validation score peaks at roughly 34,000 training examples and then stabilizes, whereas the training score remains stable from 20,000 to roughly 34,000 and continues a slight decline for the remainder of the training examples. For undersampling (Fig. 9b) and oversampling (Fig. 9c), the training score steadily decreases, but at a slower rate. Meanwhile, cross-validation scores exhibit a gentle rise before stabilizing. This suggests the models reach a plateau in generalization, and additional training examples provide limited benefit. In summary, the learning curves reveal initial overfitting, followed by improved generalization. However, the training scores gradually decrease, suggesting limitations in the model's capacity to fit the increasing complexity of the dataset. Ultimately, both training and cross-validation scores stabilize, suggesting minimal improvement with additional training examples.
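A sketch of producing such a learning curve with scikit-learn, using 3-fold cross-validation and accuracy as described above; the grid of training-set sizes is an assumed choice.

```python
import numpy as np
from sklearn.model_selection import learning_curve
from lightgbm import LGBMClassifier

# Learning curve for LightGBM: mean training and cross-validation accuracy
# at increasing training-set sizes.
train_sizes, train_scores, val_scores = learning_curve(
    LGBMClassifier(), X_train_scaled, y_train,
    cv=3, scoring="accuracy",
    train_sizes=np.linspace(0.1, 1.0, 10),
)
print(train_scores.mean(axis=1), val_scores.mean(axis=1))
```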

4.6 LIME

The Local Interpretable Model-agnostic Explanations (LIME) algorithm is commonly employed to provide explanations for the predictions made by machine learning models [27]. It provides interpretable explanations at the instance level by approximating the behavior of the model locally around a specific data point. LIME provides interpretable insights into the factors influencing the model's decision-making process, enabling a better understanding of the model's predictions. The LIME figure comprises three sections. The left segment signifies the probability assigned by the model for classifying the instance as either under control or poor control. The middle section of the figure displays the weights or coefficients associated with each feature in the model, indicating their importance in the prediction. Finally, the right part of the plot shows the actual values of each feature for the particular instance under examination. The factors in blue and orange contribute towards under control and poor control decisions, respectively. The best-performing model in this study, LightGBM with the oversampling technique, predicts the glycemic control status of an instance from our dataset to be under control with 95% confidence (Fig. 10).


Fig. 10. LIME interpretation of a specific instance of under control prediction

The key factors influencing this decision were a low initial HbA1c level, immunostimulants, and attributes related to insulin. In another instance, our model predicts the subject to have poor glycemic control with 94% confidence (Fig. 11). The contributing factors in this decision were a high initial HbA1c level, the change in HbA1c after one year, pregnancy, and the levels of calcium homeostasis and glucagon.
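An explanation like the ones in Figs. 10 and 11 can be produced with the LIME library roughly as follows; the feature and class names are taken from the prepared data, and the instance index is arbitrary.

```python
from lime.lime_tabular import LimeTabularExplainer

# Explain a single test instance of the fitted LightGBM model.
explainer = LimeTabularExplainer(
    X_train_scaled,
    feature_names=list(X_train.columns),
    class_names=["under control", "poor control"],
    mode="classification",
)
explanation = explainer.explain_instance(
    X_test_scaled[0], model.predict_proba, num_features=10
)
print(explanation.as_list())  # (feature condition, weight) pairs
```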

4.7 SHAP

Shapley Additive Explanations (SHAP) is a framework used for transparent and understandable explanation of the predictions of machine learning models [28]. The main goal of SHAP is to illuminate the contributions of distinct features to the classifier's output, providing valuable insights into how the model makes decisions. It employs concepts from game theory to assign importance values to each feature in a prediction. These values, which account for all potential feature combinations, quantify the marginal contribution of each feature to the overall prediction. In Fig. 12, the SHAP summary plot presents a comprehensive overview of the feature importance for the best-performing model (LightGBM with oversampling) in our study. The larger the absolute Shapley value for a feature, the more important it is.


Fig. 11. LIME interpretation of a specific instance of poor control prediction

In our study, the initial HbA1c values and the change in HbA1c are the most important attributes, followed by different diabetes medications, age, obesity, sex, and cholesterol levels, among others. The summary plot (Fig. 13) provides a unified depiction that combines both feature importance and feature effects. By generating a comprehensive violin plot that visually illustrates the SHAP values of each feature across the entire dataset, we can ascertain the most influential features for the model. To accomplish this, the plot organizes the features based on their collective SHAP values across all samples, providing a clear representation of how each feature impacts the model's output distribution. By utilizing SHAP explanations, one can attain a more profound comprehension of how the model arrives at its decisions.
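A sketch of how global SHAP plots like Figs. 12 and 13 can be generated for a fitted tree-based model; plot details such as the bar-type importance plot are assumptions about how the figures were produced.

```python
import shap

# TreeExplainer computes per-instance Shapley values for the fitted
# LightGBM model; depending on the SHAP version, shap_values may be a
# list with one array per class for binary classification.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X_test_scaled)

# Violin-style summary plot (feature importance and effects) and a
# bar-style global importance plot.
shap.summary_plot(shap_values, X_test_scaled,
                  feature_names=list(X_train.columns))
shap.summary_plot(shap_values, X_test_scaled,
                  feature_names=list(X_train.columns), plot_type="bar")
```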

5 Discussion

The objective of this study was to predict the future glycemic control status of individuals who have recently been diagnosed with diabetes. We attempted to identify individuals who are at a heightened risk of experiencing poor glycemic control in the future during the initial stages of their diabetes diagnosis. Even in the dataset used for this study, 63% of the patients exhibited poor glycemic control. In order to mitigate this problem of class imbalance, a variety of sampling techniques were utilized. The results shown in Fig. 14 demonstrate an overall


decrease in the accuracy of the models when undersampling was employed. In contrast, the majority of models demonstrated their highest level of accuracy when the oversampling technique was employed. Particularly, LightGBM demonstrated superior performance compared to all other models across all evaluation metrics, attaining an accuracy of 83.36% when employing the oversampling technique.

Fig. 12. SHAP Feature Importance Plot


Fig. 13. SHAP Summary Plot

The model presented in this study has exhibited greater accuracy and reliability when compared to other prominent works documented in the existing literature, as demonstrated in Table 6. The research results demonstrate that machine learning algorithms have exhibited exceptional predictive capabilities in determining glycemic control among individuals with diabetes. Another objective of our study was to identify and document the key factors in glycemic control. To accomplish this, we employed Explainable AI tools: LIME and SHAP. LIME was used in the interpretation of model predictions for specific


Fig. 14. Comparison of accuracy of classifiers for different sampling techniques

cases in the dataset. The LIME illustration provides insights into the significance of each feature in supporting or contradicting the model’s predictions. SHAP was used to understand the contribution of each feature in the interpretation of model predictions by aggregating the SHAP values for all instances. SHAP summary plot orders features based on their importance to predict glycemic control. SHAP provided a global explanation offering insights into the overall behavior of the model. The analysis revealed that the HbA1c feature exhibited the most substantial impact on the model’s predictions, closely followed by the HbA1c change feature. These findings align with intuitive expectations and reinforce our understanding of these factors in glycemic control. Importantly, our study extends beyond these intuitive factors and sheds light on the contributions of other features that may not be easily predictable. By utilizing Explainable AI tools, we enhance the interpretability and transparency of our models, enabling a more comprehensive understanding of the prediction outcomes.


Table 6. Comparison of proposed glycemic control prediction system with similar works

Ref. | Predicts | Best Model | Result
Deberneh et al. [18] | Normal, prediabetes, diabetes | Support Vector Machine | 73%
Nagaraj et al. [19] | Short and long term HbA1c response to insulin | EN based linear model | AUC = 0.81
Murphree et al. [20] | Glycemic control after metformin therapy | GBMStack, GBM | AUC = 0.75
Del Parigi et al. [21] | Glycemic control | Random Forest | 81%
Wang et al. [22] | Glycemic control | Random Forest | 75%
Fan et al. [23] | Glycemic control | Artificial Neural Network | 76.0%
Güemes et al. [24] | Overnight glycemic control | Random Forest | AUC = 0.73
Abegaz et al. [25] | Glycemic control | Random Forest | 80%
This work | Glycemic control | LightGBM | 83.36%, AUC = 89.49%

6 Conclusion

This paper presents a comprehensive study on predicting glycemic control in patients with diabetes mellitus. In addition, we explored the factors that contribute the most to maintaining glycemic control. We provide a fresh approach in this relatively underexplored domain. A very large dataset with a rich feature set that was previously unexplored was used for this study. Feature selection was employed to eliminate redundant features and prevent overfitting, while normalization was performed to minimize disparities between data points. A large part of the experimentation covered various sampling techniques to address the class imbalance of glycemic control in this dataset. For each sampling technique, a diverse set of nine machine learning algorithms was trained to build the models and test them on the dataset. All these steps helped in the improvement of accuracy and reliability. Among the evaluated classifiers, the ensemble classifiers emerged as the top performers. LightGBM achieved the highest accuracy of 83.36% using the oversampling technique, while CatBoost demonstrated exemplary performance closely trailing behind. However, comprehending the decision-making process of these classifiers posed challenges. To address this, we employed explainable AI methods, including LIME and SHAP, to gain insights into their decision-making mechanisms. Our study not only reaffirms the significance of intuitive factors such as HbA1c and HbA1c change in predicting glycemic control but also provides valuable insights into the less intuitive factors. Through the application of Explainable AI tools, we contribute to the comprehensibility and credibility of our models, furthering our understanding of the complex dynamics involved in glycemic control. The findings from this research can contribute to the development of effective strategies and interventions aimed at improving


glycemic control and ultimately enhancing the management and outcomes for individuals with diabetes mellitus.

References

1. Pranto, B., Mehnaz, S.M., Mahid, E.B., Sadman, I.M., Rahman, A., Momen, S.: Evaluating machine learning methods for predicting diabetes among female patients in Bangladesh. Information 11(8), 374 (2020)
2. Pranto, B., Mehnaz, S.M., Momen, S., Huq, S.M.: Prediction of diabetes using cost sensitive learning and oversampling techniques on Bangladeshi and Indian female patients. In: 2020 5th International Conference on Information Technology Research (ICITR), pp. 1–6. IEEE (2020)
3. Baena-Díez, J.M., et al.: Risk of cause-specific death in individuals with diabetes: a competing risks analysis. Diab. Care 39(11), 1987–1995 (2016)
4. Wild, S., Roglic, G., Green, A., Sicree, R., King, H.: Global prevalence of diabetes: estimates for the year 2000 and projections for 2030. Diab. Care 27(5), 1047–1053 (2004)
5. Hassan, K., Loar, R., Anderson, B.J., Heptulla, R.A.: The role of socioeconomic status, depression, quality of life, and glycemic control in type 1 diabetes mellitus. J. Pediatr. 149(4), 526–531 (2006)
6. Zhou, B., et al.: Worldwide trends in diabetes since 1980: a pooled analysis of 751 population-based studies with 4.4 million participants. The Lancet 387(10027), 1513–1530 (2016)
7. Cho, N.H., et al.: IDF diabetes atlas: global estimates of diabetes prevalence for 2017 and projections for 2045. Diabetes Res. Clin. Pract. 138, 271–281 (2018)
8. World Health Organization: Diabetes. https://www.who.int/news-room/fact-sheets/detail/diabetes (April 2023). Accessed 19 Aug 2023
9. World Health Organization: Diabetes. https://www.who.int/health-topics/diabetes. Accessed 19 Aug 2023
10. American Diabetes Association: Economic costs of diabetes in the U.S. in 2017. Diabetes Care 41, 917 (2018)
11. World Health Organization: Prevention of diabetes mellitus: report of a WHO study group [meeting held in Geneva from 16 to 20 November 1992]. World Health Organization (1994)
12. Rizzo, M.R.: Relationships between daily acute glucose fluctuations and cognitive performance among aged type 2 diabetic patients. Diabetes Care 33(10), 2169–2174 (2010)
13. Afroz, A., et al.: Glycaemic control for people with type 2 diabetes mellitus in Bangladesh – an urgent need for optimization of management plan. Sci. Rep. 9(1), 10248 (2019)
14. Rakhis Sr, S.A.B., AlDuwayhis, N.M., Aleid, N., AlBarrak, A.N., Aloraini, A.A.: Glycemic control for type 2 diabetes mellitus patients: a systematic review. Cureus 14(6) (2022). https://www.cureus.com/articles/92743-glycemic-control-for-type2-diabetes-mellitus-patients-a-systematic-review#!/
15. Diabetes Control and Complications Trial Research Group: The effect of intensive treatment of diabetes on the development and progression of long-term complications in insulin-dependent diabetes mellitus. N. Engl. J. Med. 329(14), 977–986 (1993)
16. Yoo, D.E., et al.: Good glycemic control is associated with better survival in diabetic patients on peritoneal dialysis: a prospective observational study. PLoS ONE 7(1), e30072 (2012)


17. Gulkesen, K.H.: Machine learning for prediction of glycemic control in diabetes mellitus (2022)
18. Deberneh, H.M., Kim, I.: Prediction of type 2 diabetes based on machine learning algorithm. Int. J. Environ. Res. Public Health 18(6), 3317 (2021)
19. Nagaraj, S.B., Sidorenkov, G., van Boven, J.F., Denig, P.: Predicting short- and long-term glycated haemoglobin response after insulin initiation in patients with type 2 diabetes mellitus using machine-learning algorithms. Diabetes Obes. Metab. 21(12), 2704–2711 (2019)
20. Murphree, D.H., Arabmakki, E., Ngufor, C., Storlie, C.B., McCoy, R.G.: Stacked classifiers for individualized prediction of glycemic control following initiation of metformin therapy in type 2 diabetes. Comput. Biol. Med. 103, 109–115 (2018)
21. Del Parigi, A., Tang, W., Liu, D., Lee, C., Pratley, R.: Machine learning to identify predictors of glycemic control in type 2 diabetes: an analysis of target HbA1c reduction using empagliflozin/linagliptin data. Pharm. Med. 33, 209–217 (2019)
22. Wang, J., et al.: Status of glycosylated hemoglobin and prediction of glycemic control among patients with insulin-treated type 2 diabetes in North China: a multicenter observational study. Chin. Med. J. 133(01), 17–24 (2020)
23. Fan, Y., Long, E., Cai, L., Cao, Q., Wu, X., Tong, R.: Machine learning approaches to predict risks of diabetic complications and poor glycemic control in nonadherent type 2 diabetes. Front. Pharmacol. 12, 665951 (2021)
24. Güemes, A., et al.: Predicting quality of overnight glycaemic control in type 1 diabetes using binary classifiers. IEEE J. Biomed. Health Inform. 24(5), 1439–1446 (2019)
25. Abegaz, T.M., Ahmed, M., Sherbeny, F., Diaby, V., Chi, H., Ali, A.A.: Application of machine learning algorithms to predict uncontrolled diabetes using the All of Us research program data. In: Healthcare, vol. 11, p. 1138. MDPI (2023)
26. Chawla, N.V., Bowyer, K.W., Hall, L.O., Kegelmeyer, W.P.: SMOTE: synthetic minority over-sampling technique. J. Artif. Intell. Res. 16, 321–357 (2002)
27. Ribeiro, M.T., Singh, S., Guestrin, C.: "Why should I trust you?": Explaining the predictions of any classifier. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1135–1144 (2016)
28. Lundberg, S.M., Lee, S.I.: A unified approach to interpreting model predictions. In: Advances in Neural Information Processing Systems, vol. 30 (2017)

Efficient Object Detection in Fused Visual and Infrared Spectra for Edge Platforms

Piotr Janyst1(B), Bogusław Cyganek1,2, and Łukasz Przebinda1

1 MyLED Inc., Ul. W. Łokietka 14/2, 30-016 Kraków, Poland
[email protected]
2 Department of Electronics, AGH University of Science and Technology, Al. Mickiewicza 30, 30-059 Kraków, Poland
https://iet.agh.edu.pl

Abstract. Image fusion is an important task in computer vision, aiming to combine information from multiple modalities to enhance overall perception and understanding. This paper presents an innovative approach to image fusion, specifically focusing on the fusion of RGB and thermal image embeddings to enhance object recognition and detection performance. To achieve this, we introduce two key components, the FusionConv and SepThermalConv layers, to the YOLO object detection network, as well as a modified FusionC3 layer. The FusionConv layer effectively integrates RGB and thermal image features by leveraging multimodal embeddings with the use of a fusion parameter. Similarly, the SepThermalConv layer optimizes the processing of thermal information by incorporating separate branches for enhanced representation. Extensive experiments conducted on a custom dataset demonstrate significant performance gains achieved by our fusion method compared to using individual modalities in isolation. Our results highlight the potential of multimodal fusion techniques to improve object detection and perception of complex scenes by effectively combining RGB and thermal images.

Keywords: Thermal object detection · Image fusion · Multimodal embeddings · YOLO architecture

1 Introduction

Images in the visible RGB spectrum allow the human eye to distinguish characteristic objects and scenes depending on shape and color, related to clothing (e.g. shirt shade), appearance (e.g. hair color) and types of objects (e.g. backpack worn). The use of infrared images as an additional input to the detection model can improve the results in the context of difficult samples by taking into account the additional information coming from the phenomenon of naturally emitted human heat, as opposed to reflected light in the case of the RGB channels [3,8]. Such difficult samples are, for example, concealed people or poorly lit objects. Here, we assume the use of the correlation of information from the visible and thermal spectrum channels when observing people [7].


Given the remarkable accomplishments of deep learning in the field of computer vision [5,6,9], a convolutional network was employed as a feature extractor. For the task of object detection, the YOLO (You Only Look Once) approach was chosen [13], renowned for its exceptional speed in predictive mode, also in the processing of thermal images [17]. The model was trained to predict the sex and worn-glasses attributes concurrently within a unified training framework. To assess the efficiency of feature extraction, the Average Precision at an IoU threshold of 0.5 (AP@0.5) was selected for each class. This widely used metric in object detection tasks [4,10,11] quantifies the average precision at the 0.5 IoU threshold for a given class.

2 Related Work

In the field of RGB and thermal image fusion, several approaches have been proposed to enhance the visual quality of low-light visible images by leveraging thermal image features. Only few focus on object detection itself, however. We have the opportunity to explore various approaches that specifically emphasize the fusion of RGB and thermal images. One such approach, presented in [1], introduces a deep thermal-guided method for effective low-light visible image enhancement. It employs a Central Difference Convolution-based Multi-Receptive-Field module for feature extraction, comprising shallow feature extraction, thermalguided feature enhancement, and image reconstruction. The method is evaluated on datasets such as MFNet and KAIST, using metrics like PSNR, SSIM, TMQI, LOE, NIQE, PI, and NSS. Another work [15] focuses on semantic segmentation and presents an enhanced U-Net architecture, named EFASPP U-Net, which incorporates visible and thermal image fusion techniques. Various versions of the EFASPP U-Net are introduced, employing different fusion strategies. Evaluation is performed on a custom dataset, and metrics including Global Accuracy, Accuracy for each class, IoU, and BFScore are used to assess the results. Additionally, crowd counting is addressed in [19] by proposing a cross-modal crowd counting method that combines CNN and cross-modal transformer approaches. The method integrates CNN backbone features with transformer fusion features using BAM and CMAM. Evaluation is conducted on datasets such as RGBT-CC and ShanghaiTechRGBD, utilizing metrics like GAME and RMSE. Furthermore, in the domain of person re-identification, a work [14] focuses on cross-modality consistency learning for visible-infrared person re-identification. It presents a Modality Learning Module (MLM) and a Feature Adaptation Network (FANet) based on ResNet50. Evaluation is performed on datasets including SYSU-MM01 and RegDB, using metrics like CMC and mAP. Lastly, TCCFusion [16] proposes an infrared and visible image fusion method based on transformer and cross-correlation techniques. The method utilizes an encoder-decoder architecture with local and global feature extractors. Evaluation is conducted on datasets such as TNO and RoadScene, employing metrics like Mutual Information, Tsalis Entropy, Chen-Varshney, and PSNR. One approach [18] proposes an efficient deep convolutional network for object detection in RGB-thermal images. It highlights the utilization of the KAIST dataset, a well-known benchmark dataset


for evaluating object detection algorithms in RGB-thermal images. The paper showcases the effectiveness of their proposed network architecture through evaluations conducted on a subset of the KAIST dataset. These studies contribute to the advancement of RGB and thermal image fusion techniques, addressing various applications and evaluating their performance on diverse datasets. However, the aforementioned works don’t directly address the problem of efficient object detection in the joint visible RGB and thermal spectra. In this paper we bridge this gap.

3 Methodology

3.1 The Proposed Architecture for Spectra Fusion

In order to explore the potential of fusion-based convolution in the YOLOv5s (Fig. 1(a)) backbone network, we propose to replace the convolutional layers with the custom FusionConv layer, the SepThermalConv layer, and a modified FusionC3 layer. These modifications allow for effective integration of RGB and alpha filters, enhancing the fusion of information from both channels. The SepThermalConv layer, our initial idea and focus, performs a channel-wise split, segregating the input image channels into a ratio of 3 RGB channels to 1 alpha channel, or the filters into a balanced 50%/50% distribution of kernels, parameterized with S. This split operation enables enhanced processing of thermal information in a separate branch. We propose the YOLOv5s_DS architecture (Fig. 1(b)), which uses SepThermalConv in place of the typical Conv layers outside the C3 layers. Another proposed convolution, the FusionConv layer, splits the input tensor into two equal halves and applies separate convolutional operations on each split. The resulting feature maps are then fused by element-wise addition. We utilize its properties in the new YOLOv5s_DF architecture (Fig. 1(c)). Overall, these modifications aim to exploit the benefits of fusion-based convolution and optimize the utilization of the RGB and thermal modalities only within the YOLOv5s backbone network. The last tested architecture, YOLOv5s_Full_DF (Fig. 1(d)), fully integrates custom convolutional modules in each backbone module. Specifically, we introduce FusionBottleneck and FusionC3, which serve as the key differences from the previously mentioned architectures. Given the absence of alternative suitable approaches for our specific dataset, we chose to evaluate our proposed solutions by comparing them against the default YOLOv5s architecture using 3 (YOLOv5s_RGB), 1 (YOLOv5s_GRAY), and 4 (YOLOv5s_RGBA) channels as input configurations.

3.2 Fusion Mechanism

3.2.1 SepThermalConv

The SepThermalConv (Fig. 2(c)) is a module specifically designed to separate thermal convolution within a neural network.


Fig. 1. Differences in architectures between YOLOv5s, YOLOv5s_DS, YOLOv5s_DF and YOLOv5s_Full_DF.


Its purpose is to optimize the processing of thermal information by splitting the input tensor into two branches via the S split parameter. The module consists of two convolutional layers, each accompanied by a batch normalization layer and a SiLU activation function. The first convolutional layer operates on one branch of the input tensor, while the second operates on the other branch. To facilitate the separation of thermal information, the input tensor is split into two branches based on a channel split ratio S. If the layer is the first in the model, S = 0.75; otherwise S = 0.5, as the kernel count is equal for RGB and thermal data deeper in the model. This ensures that each branch receives an appropriate portion of the input channels, allowing for focused processing of thermal features. Finally, the output feature maps of the two branches are concatenated along the channel dimension to produce the final output tensor. This fusion of the processed branches ensures that both sets of information are effectively incorporated into the output representation. In summary, the SepThermalConv module offers a coherent and efficient approach to separately process thermal information in a neural network, promoting the effective utilization of thermal data and enhancing the overall performance of the network architecture.

3.2.2 FusionConv

The FusionConv (Fig. 2(a)) layer is a custom module designed for performing fusion-based convolution in a neural network. It aims to integrate RGB and alpha filters effectively. The layer takes an input tensor representing the extracted feature maps from a previous layer. To achieve fusion, the input tensor is split into two equal halves. This division allows for separate processing of the RGB and alpha information obtained from the input layer. The layer then applies three separate convolutional operations to each split. Following the individual convolutions, the fused feature map is obtained by performing an element-wise addition of the processed RGB and alpha splits. This fusion process facilitates the integration of complementary information from both channels. To further enhance the fusion, the fused feature map is scaled by a custom-set factor e and added back to the processed RGB and alpha splits. This adaptive scaling and addition process allows the network to selectively balance the contribution of the fused information. Lastly, the modified RGB and alpha feature maps are concatenated along the channel dimension to form the final output tensor. This output tensor represents a comprehensive fusion of RGB and alpha information. By incorporating the FusionConv layer into convolutional neural network architectures, effective integration of RGB and alpha filters can be achieved, leading to enhanced representation power and improved performance in object detection.

3.2.3 FusionC3

We also introduce FusionC3 (Fig. 2(b)) with FusionBottleneck as the last component for full customization of the backbone convolutional layers. The implementation of FusionC3 involves the integration of custom FusionConv layers instead of the original Conv layers. The FusionBottleneck module then performs feature extraction separately for the RGB and thermal input branches. The main idea of this layer is to maintain the original expansion and optimization of the

248

P. Janyst et al.

Fig. 2. Details of the FusionConv, FusionC3 and SepThermalConv fusion mechanisms.

Efficient Object Detection in Fused Visual and Infrared Spectra

249

expressive capacity of the feature maps in both RGB and thermal branches, and to allow them to capture intricate patterns for separate input features.
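To make the fusion mechanism more concrete, the following PyTorch sketch illustrates a FusionConv-style layer under simplifying assumptions (a single 3 × 3 convolution per split path and one shared scaling factor e); the authors' actual implementation, including exact kernel sizes and the number of convolutions per split, is available in their published code [2] and may differ.

```python
import torch
import torch.nn as nn

class FusionConvSketch(nn.Module):
    """Illustrative fusion-based convolution: split the channels into an RGB and
    an alpha (thermal) half, convolve each half separately, fuse them by
    element-wise addition, re-inject the scaled fusion, and concatenate."""

    def __init__(self, in_channels: int, out_channels: int, e: float = 0.5):
        super().__init__()
        half_in, half_out = in_channels // 2, out_channels // 2

        def branch(c_in, c_out):
            return nn.Sequential(
                nn.Conv2d(c_in, c_out, kernel_size=3, padding=1, bias=False),
                nn.BatchNorm2d(c_out),
                nn.SiLU(),
            )

        self.rgb_branch = branch(half_in, half_out)
        self.alpha_branch = branch(half_in, half_out)
        self.e = e  # scaling factor for the fused feature map

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        rgb, alpha = torch.chunk(x, 2, dim=1)   # split into two equal halves
        rgb_f = self.rgb_branch(rgb)            # process the RGB split
        alpha_f = self.alpha_branch(alpha)      # process the alpha split
        fused = rgb_f + alpha_f                 # element-wise fusion
        rgb_f = rgb_f + self.e * fused          # scaled re-injection
        alpha_f = alpha_f + self.e * fused
        return torch.cat([rgb_f, alpha_f], dim=1)
```

A SepThermalConv-style layer can be sketched analogously by replacing the equal split with the channel split ratio S and omitting the element-wise fusion step.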

4 Experimental Results

4.1 Dataset

The dataset used in this study comprises data collected from 59 films recorded in a controlled laboratory setting, with the prior consent of the individuals involved. From these films, a total of 6141 samples were extracted and organized into different catalogues. For the purpose of testing, a subset of 629 samples was selected as the test set. The directories in the dataset are named to correspond to the scenarios under which the samples were created, taking into account factors such as the number of people and their mode of movement. Additionally, the directories may include information regarding the camera parameters used to capture the far infrared image, such as “hot,” “cold,” or “normal” settings.

Fig. 3. Example of RGB, thermal and RGBA images within dataset.

The RGB camera used in the data collection process was the ELP CCTV MI5100, which captures images with a resolution of 640 × 480 pixels, 24-bit color depth, a 90-degree field of view, and a frame rate of 30 frames per second. It is worth noting that the RGB camera provided a wider perspective compared to the thermal data. To align the perspectives of the two modalities, fragments were extracted from the RGB images using the pixel parameters (left = 150, top = 71, right = 474, bottom = 331) before merging them with the thermal data (Fig. 3(a)). The far infrared camera employed was the FLIR A35, capable of capturing images with a resolution of 320 × 256 pixels, 14-bit pixel values, a 48° × 39° field of view, and a frame rate of 60 frames per second (Fig. 3(b)). Prior to training, the images were resized to dimensions of (640, 320), as in Fig. 3(c). The dataset is labeled with four classes, representing a combination of gender and glasses-wearing status: “MenNoGlasses”, “MenWithGlasses”, “WomenNoGlasses” and “WomenWithGlasses”. To ensure a more realistic evaluation of the trained model, testing was conducted using images that were not seen by the model during the training phase. This dataset is available for download as a further contribution of this paper [12].
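For illustration, the alignment and channel-stacking step could be reproduced with Pillow roughly as follows; the file names are hypothetical and the authors' preprocessing pipeline [2] may implement this differently.

```python
from PIL import Image

# Hypothetical file names for one synchronized RGB/thermal frame pair.
rgb = Image.open("frame_rgb.png")
thermal = Image.open("frame_thermal.png").convert("L")

# Crop the wider RGB frame to the reported pixel parameters so that its
# perspective matches the thermal image, then match the thermal resolution.
rgb_aligned = rgb.convert("RGB").crop((150, 71, 474, 331)).resize(thermal.size)

# Stack the thermal image as a fourth (alpha) channel to obtain the RGBA input
# and resize to the (640, 320) training dimensions.
r, g, b = rgb_aligned.split()
rgba = Image.merge("RGBA", (r, g, b, thermal)).resize((640, 320))
```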

4.2 Training

The adjusted YOLOv5 framework was used for the training process. Our code is available on the Internet [2]. The training was conducted on a machine equipped with two graphics cards, specifically 2 x RTX 3080 10 GB, along with 64 GB of RAM. The operating system used was Ubuntu 22.04.1 LTS. Python version 3.10.4 in conjunction with the PyTorch framework version 1.13.1 was employed for the training. For the optimization process, the default SGD weight optimizer was utilized, starting with an initial learning rate of 0.01. The training data was divided into batches, with each batch consisting of 16 samples. Throughout the experiments, the accepted image size for the model was set to (640, 320) with 4 channels, except for training on RGB (3 channels) and GRAY (1 channel) images, ensuring consistency across all conducted tests.

Table 1. Object detection performance comparison of YOLO architectures, with and without fusion techniques, using the mean Average Precision measurement.

Architecture                mAP @ 0.5   mAP @ 0.95
YOLOv5s_GRAY                93.46%      85.69%
YOLOv5s_RGB                 92.96%      79.62%
YOLOv5s_RGBA                95.51%      87.44%
YOLOv5s_DS                  96.24%      88.48%
YOLOv5s_DF_e = 0.05         96.12%      88.70%
YOLOv5s_DF_e = 0.20         95.88%      88.02%
YOLOv5s_DF_e = 0.50         96.33%      88.14%
YOLOv5s_DF_e = 1.00         95.58%      88.28%
YOLOv5s_Full_DF_e = 0.05    90.91%      84.39%
YOLOv5s_Full_DF_e = 0.50    91.68%      83.73%

The experimental results presented in Table 1 demonstrate the performance of different YOLO architectures for object detection when incorporating fusion techniques. The models were evaluated based on the mean Average Precision (mAP) at IoU thresholds of 0.5 and 0.95. Among the tested architectures, YOLOv5s_GRAY achieved an mAP of 93.46% at an IoU threshold of 0.5, while YOLOv5s_RGB achieved an mAP of 92.96%. However, the performance significantly improved when RGB and thermal information were combined in YOLOv5s_RGBA, resulting in an mAP of 95.51%. Moreover, the integration of FusionConv in the YOLOv5s_DF models led to even higher mAP values. YOLOv5s_DF_e = 0.05 achieved an mAP of 96.12%, demonstrating the effectiveness of FusionConv with a low enhancement factor. Similarly, YOLOv5s_DS, which utilized SepThermalConv, achieved an mAP of


96.24%, highlighting the advantages of incorporating separate thermal convolutions. In contrast, the YOLOv5s_Full_DF architecture, the last of the tested fusion approaches, performed notably worse than the other compared solutions, achieving an mAP of only 91.68% at best. This outcome underscores the potential challenges associated with custom convolutional modules, as the FusionC3 layers seemed to hinder the fusion of RGB and thermal features, leading to decreased performance. Overall, these results suggest that the fusion of RGB and thermal images using FusionConv and SepThermalConv can significantly enhance the object detection performance of YOLO architectures. The improved mAP scores at both the 0.5 and 0.95 IoU thresholds of YOLOv5s_DS and YOLOv5s_DF compared to YOLOv5s_GRAY and YOLOv5s_RGB demonstrate the potential of multimodal fusion techniques in boosting the accuracy and reliability of object detection systems in various applications. It is worth noting that the simple addition of the thermal input to the RGB data (YOLOv5s_RGBA) already boosts object detection results. Nevertheless, the YOLOv5s_Full_DF results suggest that further optimization of custom convolutional modules is essential to fully harness the benefits of fusion-based convolution in certain scenarios.

5 Conclusions

In summary, this paper successfully investigated the fusion of the RGB and thermal images for object detection using the YOLO architecture. By incorporating the proposed FusionConv and SepThermalConv layers, which leverage the complementary information from different modalities, the performance of the YOLO models was significantly enhanced. Through extensive experimentation and evaluation, it was observed that the fusion of RGB and thermal images resulted in improved object detection accuracy compared to using individual modalities alone. The introduced FusionConv layer effectively enhanced the visual quality of low-light visible images, leading to better recognition and detection performance. The SepThermalConv layer, on the other hand, demonstrated the benefits of separately processing thermal information, contributing to more accurate and reliable object detection. The achieved results, as presented in the evaluation section, showcased the superiority of the proposed fusion techniques. The models incorporating FusionConv and SepThermalConv consistently outperformed the RGB-only and grayscale models in terms of mAP scores at different confidence thresholds. These findings highlight the potential of leveraging multimodal fusion approaches for enhancing object detection systems. Overall, this study contributes to the growing body of research on RGB and thermal image fusion for object detection. The improved detection performance achieved through the integration of FusionConv and SepThermalConv underscores their effectiveness in handling challenging scenarios and improving the understanding and perception of complex scenes. The YOLOv5s_Full_DF architecture, which integrates SepThermalConv and FusionConv layers, exhibited slightly lower performance


compared to other configurations. This unexpected outcome might stem from the FusionC3 module’s operation, where the introduction of custom convolutional layers for the separate RGB and thermal branches could potentially lead to an increased segregation of features. This segregation could hinder the efficient integration of fused information in subsequent stages, impacting the model’s ability to capture cross-modal correlations effectively, especially due to the change of the Bottleneck layer. Further investigation is necessary to fine-tune the parameters of the FusionC3 module and improve the integration of features from both branches, potentially leading to enhanced object detection performance.

Acknowledgements. This work was supported by the National Centre for Research and Development, Poland, under the grant no. POIR.01.01.01-00-1116/20.

References

1. Cao, Y., et al.: A deep thermal-guided approach for effective low-light visible image enhancement. Neurocomputing 522, 129–141 (2023). ISSN 0925-2312. https://doi.org/10.1016/j.neucom.2022.12.007. www.sciencedirect.com/science/article/pii/S0925231222015077
2. Code for Object Detection in Fused Visual and Infrared Spectra. https://gitlab.com/pijany/yolov5. Accessed 22 Aug 2023
3. Cyganek, B., Woźniak, M.: Tensor-based shot boundary detection in video streams. New Gener. Comput. 35(4), 311–340 (2017). ISSN 1882-7055. https://doi.org/10.1007/s00354-017-0024-0
4. Everingham, M., et al.: The Pascal visual object classes (VOC) challenge. Int. J. Comput. Vis. 88(2), 303–338 (2010)
5. Goodfellow, I., Bengio, Y., Courville, A.: Deep Learning. MIT Press (2016)
6. He, K., et al.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778. IEEE (2016)
7. Hwang, J., et al.: Multispectral pedestrian detection: benchmark dataset and baseline. IEEE Trans. Pattern Anal. Mach. Intell. 37(1), 165–178 (2015). https://doi.org/10.1109/TPAMI.2014.2346971
8. Knapik, M., Cyganek, B.: Fast eyes detection in thermal images. Multimedia Tools Appl. 80(3), 3601–3621 (2021). ISSN 1573-7721. https://doi.org/10.1007/s11042-020-09403-6
9. LeCun, Y., Bengio, Y., Hinton, G.: Deep learning. Nature 521(7553), 436–444 (2015)
10. Lin, T.-Y., et al.: Focal loss for dense object detection. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 2980–2988 (2017). https://doi.org/10.1109/ICCV.2017.324
11. Lin, T.-Y., et al.: Microsoft COCO: common objects in context. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8693, pp. 740–755. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-10602-1_48
12. Object Detection data for Fused Visual and Infrared Spectra. https://drive.google.com/file/d/1D9rzzHdyHkDcSzxmllbVEZ9J1TWF4Ose/view?usp=sharing. Accessed 18 Aug 2023


13. Redmon, J., et al.: YOLO: real-time object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 779–788. IEEE (2016). https://doi.org/10.1109/CVPR.2016.91
14. Shao, J., Tang, L.: Cross-modality consistency learning for visible infrared person re-identification. J. Electron. Imaging 31(6), 063054 (2022). https://doi.org/10.1117/1.JEI.31.6.063054
15. Shojaiee, F., Baleghi, Y.: EFASPP U-Net for semantic segmentation of night traffic scenes using fusion of visible and thermal images. Eng. Appl. Artif. Intell. 117, 105627 (2023). ISSN 0952-1976. https://doi.org/10.1016/j.engappai.2022.105627. www.sciencedirect.com/science/article/pii/S0952197622006170
16. Tang, W., He, F., Liu, Y.: TCCFusion: an infrared and visible image fusion method based on transformer and cross correlation. Pattern Recogn. 137, 109295 (2023). ISSN 0031-3203. https://doi.org/10.1016/j.patcog.2022.109295. www.sciencedirect.com/science/article/pii/S0031320322007749
17. Tomasz, B., Mateusz, K., Cyganek, B.: New thermal automotive dataset for object detection. In: IEEE 17th Conference on Computer Science and Intelligence Systems, FedCSIS 2022. Annals of Computer Science and Information Systems, vol. 31, pp. 43–48 (2022). ISSN 2300-5963. https://doi.org/10.15439/2022F283
18. Wagner, J., et al.: Multispectral pedestrian detection using deep fusion convolutional neural networks. In: The European Symposium on Artificial Neural Networks (2016)
19. Zhang, S., et al.: A cross-modal crowd counting method combining CNN and cross-modal transformer. Image Vis. Comput. 129, 104592 (2023). ISSN 0262-8856. https://doi.org/10.1016/j.imavis.2022.104592. www.sciencedirect.com/science/article/pii/S0262885622002219

Designing AI Components for Diagnostics of Carotid Body Tumors

Tatyana Maximova(B) and Ekaterina Zhabrovets

ITMO University, Kronverksky Avenue, 49, 197101 St. Petersburg, Russia
[email protected]

Abstract. Carotid body tumor (chemodectoma) is a rare disease, characterized by a large number of diagnosis errors and a lack of specialized software for its computational detection. Apparently, no datasets representing this kind of tumor exist, so there is a problem of awareness among specialists. In this paper, we provide our own open-access Python library for medical image preprocessing and tumor detection, RadImaLib, which is intended to be useful with regard to the issues mentioned above. Having explored other similar libraries and several studies on the topic, we have formed and integrated into the library a dataset of chemodectoma images, provided distinct methods for dataset creation to encourage users to work with their own medical data, implemented several image preprocessing methods, such as erosion, dilation, Hounsfield Scale transformation and rescaling, and developed our own detection model, based on the U-Net neural network architecture, with a decent resulting accuracy. Keywords: Carotid Body Tumor · Chemodectoma · Medical Images Preprocessing · Open Source Library · Neural Networks

1 Introduction

Carotid chemodectoma [1], also known as carotid body tumor, glomus tumor, or paraganglioma, is a benign tumor, up to 5 mm in size, that grows slowly at the site of a bifurcation of the carotid artery. The tumor is located next to such organs as the lateral wall of the pharynx, trachea, glossopharyngeal nerve and thyroid gland, and is close to the vessels. Detection of tumors of the carotid body is a rather laborious task, and the errors of their primary diagnosis range from 25% to 90% [1, 2]. The proportion of carotid glomus tumors in the total number of neck and head tumors is 0.5% [3]. The tumor occurs more than twice as often in women as in men. Manifestations often occur between the ages of 20–50 years. There is evidence that chemodectoma is associated with some hereditary diseases, as well as with a mutation in the succinate dehydrogenase genes (SDHB, SDHC, SDHD), which can be passed from father to child [4]. Late detection of the disease can lead to serious complications that can be fatal. At the moment, there are no specialized software tools aimed at solving the problem of detecting or predicting the development of a carotid tumor. Most related publications consider treatment methods used in individual cases, with special attention to the rationale, technology and consequences of surgical intervention (for example, [5]).


Specific approaches to the study of the phenomenon of carotid tumors and technical solutions related to determining the potential site of their occurrence are described in articles [6, 7]. The article [6] describes the methodology and results of the analysis of potential drivers of carotid chemodectoma mutations - combinations of genes that are indicators of the presence of the disease, as well as signs that directly affect its course, in particular, the acquisition of a malignant nature or further growth. The total number of observations was 52, and the age of the studied subjects ranged from 32 to 80 years. According to the results of statistical processing, 42 causative genes were identified, and in 27% of cases, the presence of more than one of them was observed. The researchers paid special attention to such a statistical parameter as mutation load - the number of mutations in a gene per megabase - which varied from 2 to 10. The authors of [7] considered the problem of segmentation of the carotid arteries on 3D images obtained using computed angiography. As a recognition model, a neural network was used that combines the U-Net and ResNet architectures. The Dice coefficient was used as a loss function. The final accuracy was 82.26%. An additional difficulty in tumor diagnosis is the lack of a software library for the systematic processing of X-ray images of neck tumors. The emergence of software solutions containing systematized information about the tumor, including functionality for preprocessing and a model for classifying or detecting a tumor, would provide significant assistance in terms of prompt response and the formation of an effective treatment method with the lowest risk of subsequent complications for patients, and would also make a significant contribution to the enrichment of the knowledge base about the disease and to increasing the awareness of physicians. The aim of the study was to expand knowledge about benign neck tumors located in the area of the carotid artery (for example, carotid chemodectomas) and to increase the accuracy of their detection by machine learning methods by creating a software library in the Python language. To achieve this goal, the following tasks were set:
1. to form a dataset of patients with carotid tumor and similar diseases based on materials provided by collaborating medical organizations,
2. to implement functions for preprocessing medical images,
3. to develop a model for tumor differentiation,
4. to integrate the obtained results into the software library.

2 Research Method

The study relies on the analysis of international clinical decision support information systems based on the analysis of radiological images. TorchIO is an open-source library for loading, processing and converting medical images and for sampling them in the form of batches using the PyTorch deep learning library [8]. It includes many image intensity and spatial transformations. These transformations represent both typical computer vision operations and field-specific modeling of intensity artifacts that could arise because of the inhomogeneity (shift) of the MRI magnetic field. Within the framework of the library, a graphical interface is also implemented,


in which all the transformations can be done manually on a three-dimensional image. Several datasets with medical images are integrated into the considered software package, such as:
1. Information eXtraction from Images (IXI) - contains about 600 different images of healthy people,
2. EPISURG - a clinical dataset with normalized images of patients with epilepsy who underwent brain resection,
3. RSNA-MICCAI - the dataset of the Kaggle competition on the classification of brain tumors.
A significant disadvantage of this library is that it works with image archives; that is, it is not possible to work directly with the DICOM format. Another disadvantage from the point of view of the final study is the lack of a dataset with neck tumors.
MedPy is an open-source Python library that provides basic functions for reading, writing, and otherwise manipulating medical images [9]. The available functionality includes many filters for transformations, such as:
1. the Otsu filter - used to find the optimal value for separating an image into a background and a non-background,
2. the average filter - averages the brightness values of surrounding pixels to eliminate noise,
3. the Hough transformation - a transformation used to highlight geometric objects in an image,
4. a standardizer of intensity values (the standardization operation is used to align the pixel brightness scale),
5. anisotropic diffusion - a kind of smoothing filter.
As for metrics, the library offers a good number of them, ranging from the Manhattan distance to the Minkowski distance.
Pydicom is a library that allows you to manipulate DICOM files and their metadata [10]. You can either read a specified file or create your own image [11]. Arrays of pixel brightnesses can also be extracted from the metadata for direct work with a graphic image.
When working with medical images for detecting a neoplasm, the first procedure is segmentation - dividing an image into regions, since tumors are associated with a specific area of localization. Segmentation is of two types: semantic segmentation (identification of all objects of one category) and instance segmentation (identification of individual representatives of one category) [12]. To achieve the goal, both types are used in combination: within the framework of the neck tumor task, vessels and, in particular, neoplasms located on them are semantically segmented.
The most successful models in image segmentation are the so-called fully convolutional networks (FCNs). They consist of several blocks, each of which includes a convolution, an activation layer, and pooling layers, which makes it possible to reveal a semantic representation, while individual convolutions with sub-sampling operations recover the necessary level of detail. The U-Net architecture is a variation of the FCN. It was presented at the international medical conference MICCAI 2015 and was created specifically for the needs related to the segmentation of medical images.


The structure of the neural network includes two paths. The first path is a compression path (also called an “encoder” or analysis path), which is mostly like a convolutional neural network and provides information for classification. The second path is the expanding path (also known as the decoder or synthesis path). It consists of expanding convolutions and contains concatenation elements with features from the narrowing path. The decoder allows the network to relate the classification information to localization. It also increases the resolution of the output images passed to the resulting convolutional layer to create a segmented image. According to [13], in the period from 2020 to 2022, U-Net was a commonly used architecture for solving medical problems. During this period, more than 100 methods based on U-Net segmentation of medical images were discussed in the literature. The architecture is actively used in the problems of finding tumors of the brain and neck. In particular, in [14], for the detection of brain neoplasms, researchers use U-Net with added fully connected blocks in the encoder and decoder (the Dice coefficient is 89.12%), and in [13] a combination of U-Net and ResNet was used to segment the cervical artery (the Dice coefficient is 82.26%). However, none of the considered libraries provides an exhaustive set of tools for preprocessing radiological images in terms of model training, nor the models themselves, which could be useful in solving the problem of detecting or classifying neck tumors.

3 Results

3.1 Development of the RadImaLib Library

Definition of Functional Requirements. As part of the study, work was carried out on tumors of the neck, in particular carotid chemodectomas. Since such images are often DICOM files, it was decided to focus on working with this type. Based on the subject area, the following requirements for the final functionality were formulated:
1. reading files from a directory with conversion to multidimensional arrays (both in the form of single images and samples),
2. formation of a data set,
3. applying standard transformations (Hounsfield Scale, Erosion, Dilation),
4. rescaling images in accordance with the interests of the researcher (selection of fragments of a particular “substance” using the Hounsfield Scale),
5. the presence of a model for detecting neck tumors on DICOM images.
Description of the Data Preparation Algorithm. Medical images are DICOM files. DICOM (Digital Imaging and Communications in Medicine) is a standard for various types of manipulations with MRI and CT images of patients, based on the international standard of the OSI network model. The information in the file is stored as a meta-object with tags and attributes. The most informative characteristics are presented in Table 1.

Table 1. DICOM file attributes examples

Name of the attribute                 Format of the attribute   Description
Study Description                     LO (Long String)          String description of the study
Name of Physician(s) Reading Study    PN (Person Name)          Name of medical specialist conducting the study
Patient’s Name                        PN (Person Name)          Patient’s name
Patient’s Birth Date                  DA (Dicom DateTime)       Date of birth of the patient in YYYYMMDD format
Additional Patient History            LO (Long String)          Additional information about the patient

Within the framework of this research work, the data was provided by the Federal State Budgetary Institution “North-Western District Scientific and Clinical Center named after L.G. Sokolov” of the FMBA of Russia. As a rule, the original archive with the results of an MRI study contains a folder with files that do not have an extension, as shown in Fig. 1. These are the images needed for analysis.

Fig. 1. Initial MRI data for one patient

At the initial stage, we decided to implement the add_dicom_extension function, which gives the files the “.dcm” extension to enable further work with Python tools. The following modules were used to implement the function:
1. os - to be able to work with all files in the directory,
2. pathlib - directly used to add the extension.
The result of the function is shown in Fig. 2.
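A minimal sketch of such a helper is shown below; it only illustrates the approach described here, and the actual RadImaLib implementation may differ in details (for example, in error handling).

```python
import os
from pathlib import Path

def add_dicom_extension(directory: str) -> None:
    """Append the '.dcm' extension to every extensionless file in `directory`."""
    for name in os.listdir(directory):
        path = Path(directory) / name
        if path.is_file() and path.suffix == "":
            path.rename(path.with_suffix(".dcm"))
```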


Fig. 2. Files converted as a result of the work of the add_dicom_extension function

The next step was to directly extract the image information using the pydicom [10] software library to read DICOM files. As a result of reading, a dataframe is returned containing information about particular characteristics of the file in accordance with the standard mentioned above. Specifically, within the framework of the work, the following parameters were of interest: Pixel Data, RescaleIntercept and RescaleSlope. They are parameters that are written by default; the user can also supplement the list with their own parameters. The Pixel Data parameter directly contains a multi-dimensional array of a two-channel greyscale image available for display, with a dimension of 512 × 512. RescaleIntercept and RescaleSlope are the arguments required for linear transformations.

The form_data_row function returns an object of the DataRow class with two attributes: a “parameter: value” dictionary and a multidimensional image array. This function is looped over each file in the directory specified as an argument to the form_dataset function, which returns a default pandas DataFrame object with the columns RescaleIntercept and RescaleSlope and 262144 columns corresponding to pixel brightnesses (if the user has specified custom columns, these are also added to the dataset). All the above functions are combined into one called form_dataset_from_directory, which receives a string representation of the directory as input.

Building a Deep Learning Model. As part of this work, the classical U-Net architecture was used (the dimension of the input layer is 512 × 512):
1. the narrowing path consisted of 3 × 3 convolutional layers with a ReLU activation function and a subsampling operation (MaxPooling, used for resolution manipulation),
2. the expanding path consisted of 2 × 2 convolutions that expand the feature space (concatenation with layers from the narrowing path takes place here).
Then several successive convolutions with ReLU follow, and a convolutional layer with dimensions 1 × 1 completes the network, designed to bring the image to the required number of channels and output it in segmented form.

There were 1764 images in total (588 images per person). The tumor was detected by an expert method on 228 images. Before training the network, a linear transformation to the Hounsfield scale was applied to the data. The Hounsfield scale (HU) is a set of values that are radiation attenuation coefficients relative to distilled water [15]. Based on the values of this scale, it is possible to draw a conclusion about which type of tissue predominates in the image. The correspondences of several substance names to their HU value ranges are shown in Table 2.

Table 2. Correspondence of ranges of the Hounsfield scale and substances

Substance                    HU values
Air                          −1000
Fat                          [−120, −90]
Soft tissue on contrast CT   [100, 300]
Bone (cancellous)            [300, 400]
Bone (cortical)              [1800, 1900]

The conversion formula looks like this:

HU = RescaleSlope * StoredValue + RescaleIntercept    (1)

where HU is the Hounsfield unit, RescaleSlope is the slope, StoredValue is the initial brightness value of the pixel, and RescaleIntercept is the intercept term. After the transformation, for the purposes of analysis, histograms of the distribution of values were constructed. An example of a histogram built for one patient is shown in Fig. 3.
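As an illustration, Eq. (1) can be applied to a DICOM slice with pydicom and NumPy roughly as follows; this is only a sketch of the transformation, not the library’s exact implementation, and the file name is hypothetical.

```python
import numpy as np
import pydicom

def to_hounsfield(ds: pydicom.dataset.FileDataset) -> np.ndarray:
    """Apply the linear rescale from stored pixel values to Hounsfield units, Eq. (1)."""
    stored = ds.pixel_array.astype(np.float64)
    slope = float(getattr(ds, "RescaleSlope", 1.0))
    intercept = float(getattr(ds, "RescaleIntercept", 0.0))
    return slope * stored + intercept

# Example usage on a single slice (hypothetical file name):
# hu = to_hounsfield(pydicom.dcmread("slice_0001.dcm"))
```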

Fig. 3. Chart of distribution of images according to the Hounsfield scale

A significant number of pixels have a value close to −3000. These are “artifacts”: black halos on the sides of the images. Air also occupies a large part (for this patient, the upper projections, on which the lungs are clearly visible, dominate), as do soft tissues.


The training and validation samples were randomly generated according to the 80/20 principle; Fig. 4 shows the network training process. The loss function decreases for both the training and the test dataset. The number of epochs was chosen experimentally and is equal to 12. Accuracy was measured using the accuracy_score metric and reached 75% on the training set and 86% on the test set. Since the data were randomly distributed, the validation set included more tumor-free images than the training set, but most of them were recognized (see Fig. 4).

Fig. 4. Neural network loss function change
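For illustration, a compact PyTorch sketch of the U-Net structure described above is given below, reduced to two encoder/decoder levels for brevity; the channel counts and depth are illustrative assumptions, and the model implemented in RadImaLib may differ.

```python
import torch
import torch.nn as nn

def double_conv(c_in, c_out):
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, 3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(c_out, c_out, 3, padding=1), nn.ReLU(inplace=True),
    )

class MiniUNet(nn.Module):
    """Two-level U-Net: contracting path with max pooling, expanding path with
    2x2 transposed convolutions and skip concatenations, 1x1 output convolution."""

    def __init__(self, in_channels=1, out_channels=1):
        super().__init__()
        self.enc1 = double_conv(in_channels, 32)
        self.enc2 = double_conv(32, 64)
        self.pool = nn.MaxPool2d(2)
        self.bottleneck = double_conv(64, 128)
        self.up2 = nn.ConvTranspose2d(128, 64, kernel_size=2, stride=2)
        self.dec2 = double_conv(128, 64)
        self.up1 = nn.ConvTranspose2d(64, 32, kernel_size=2, stride=2)
        self.dec1 = double_conv(64, 32)
        self.head = nn.Conv2d(32, out_channels, kernel_size=1)

    def forward(self, x):
        e1 = self.enc1(x)                    # 512 x 512
        e2 = self.enc2(self.pool(e1))        # 256 x 256
        b = self.bottleneck(self.pool(e2))   # 128 x 128
        d2 = self.dec2(torch.cat([self.up2(b), e2], dim=1))
        d1 = self.dec1(torch.cat([self.up1(d2), e1], dim=1))
        return torch.sigmoid(self.head(d1))  # segmentation mask

# model = MiniUNet(); mask = model(torch.randn(1, 1, 512, 512))
```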

The main functionality of the Python software library for processing and classifying radiological images has been implemented.

4 Discussion and Future Work

In the modern world, the development of personalized health methods plays a huge role due to the individual specifics and the rarity of some diseases, in particular the one considered in this work. The research confirms that the study of rare diseases using computer methods is justified and creates the background for the automation of medical activities related to diagnosis. Moreover, the use of modern information support creates the basis for raising the awareness of attending physicians and the validity of clinical decisions in the diagnosis and treatment of rare diseases. During the study, we analyzed existing libraries for the preprocessing and analysis of radiological images, and also developed an early version of our own RadImaLib library with an integrated dataset and image processing functions. As a result of the study, functional requirements were substantiated, and a version of the library for working with radiological images of a neck tumor and a dataset was


developed and published, which provides tools for analyzing user data about patients and for classification. A data set of patients with carotid tumor and similar diseases has been formed and integrated into the library (it can be updated by adding information about new patients). Such radiological image processing operations as applying standard transformations (Hounsfield Scale, Erosion, Dilation) and rescaling have been implemented. The resulting software library can be used by IT researchers in the field of medical data, in particular, to borrow and independently refine methods for extracting medical information and recognizing radiological images of a neck tumor. Despite the fact that the resulting classification model does not contain enhanced structure elements, it has demonstrated a decent accuracy score. That encourages the future use of more modern neural network architectures, such as generative adversarial networks or Siamese neural networks, to investigate possible improvements of the tumor detection process. Moreover, the carotid body tumor dataset needs to be enlarged to make the model more robust. Finally, the model should be tested on completely new data, which may consist of images with other neoplasms, to evaluate its universality.

Acknowledgment. The study was financially supported by ITMO University, N 622274.

References

1. Liu, J., Mu, H., Zhang, W.: Diagnosis and treatment of carotid body tumors. Am. J. Transl. Res. 13(12), 14121–14132 (2021)
2. Forbes, J., Menezes, R.G.: Anatomy, head and neck: carotid bodies. [Updated 2022 Jul 25]. In: StatPearls [Internet], Treasure Island (FL). StatPearls Publishing (2023)
3. Druzhinin, D.S., Pizova, N.V.: Carotid chemodectoma: differential diagnosis according to ultrasound data. Head Neck Tumors, 46–50 (2012). (in Russian)
4. Baysal, B.E., Willett-Brozick, J.E., Lawrence, E.C., et al.: Prevalence of SDHB, SDHC, and SDHD germline mutations in clinic patients with head and neck paragangliomas. J. Med. Genet. 39, 178–183 (2002)
5. Amato, B., et al.: Surgical resection of carotid body paragangliomas: 10 years of experience. Am. J. Surg. 207(2), 293–298 (2014)
6. Snezhkina, A.V., et al.: Exome analysis of carotid body tumor. BMC Med. Genom. 11(Suppl. 1), 17 (2018)
7. Zhou, T., Tan, T., Pan, X., Tang, H., Li, J.: Fully automatic deep learning trained on limited data for carotid artery segmentation from large image volumes. Quant. Imaging Med. Surg. 11(1), 67–83 (2020)
8. Pérez-García, F., Sparks, R., Ourselin, S.: TorchIO: a Python library for efficient loading, preprocessing, augmentation and patch-based sampling of medical images in deep learning. Comput. Meth. Program. Biomed. 208, 12 (2021)
9. MedPy. https://pypi.org/project/MedPy/. Accessed 10 Jan 2023
10. Pydicom. https://pydicom.github.io/. Accessed 10 Jan 2023
11. DICOM Processing and Segmentation in Python. https://www.raddq.com/dicom-processing-segmentation-visualization-in-python/. Accessed 12 Apr 2023


12. Shashidhara, S.: Image segmentation and classification using deep learning. https://www.zignite.io/post/image-segmentation-and-classification-using-deep-learning. Accessed 01 Apr 2023
13. Azad, R., et al.: Medical image segmentation review: the success of U-Net (2022). https://arxiv.org/abs/2211.14830. Accessed 01 Apr 2023
14. Ahmad, P., Qamar, S., Shen, L., Saeed, A.: Context aware 3D UNet for brain tumor segmentation (2020). https://arxiv.org/pdf/2010.13082.pdf. Accessed 01 Apr 2023
15. Lev, M., Gonzalez, R.: CT angiography and CT perfusion imaging, 2nd edn. In: Toga, A., Mazziotta, J. (eds.) Brain Mapping: The Methods, pp. 427–484. Academic Press (2002)

A Review of the Concept, Applications, Risks and Control Strategies for Digital Twin

Farnaz Farid1(B), Abubakar Bello1, Nusrat Jahan2, and Razia Sultana2

1 School of Social Sciences, Western Sydney University, Penrith, Australia
[email protected]
2 Department of Computer Science and Engineering, Eastern University, St. Davids, USA

Abstract. The concept and application of the digital twin have been advancing and intersecting various fields. The Internet of Things (IoT), Cyber-Physical Systems (CPS), cloud computing, and big data are examples of emerging technologies being incorporated into Industry 4.0. Effective monitoring and management of physical systems are possible through the utilization of machine learning and deep learning methodologies for the analysis of gathered data. Along with the development of IoT, a number of CPSs, such as smart grids, smart transportation, smart manufacturing, and smart cities, also adopt IoT and data analytic technologies to improve their performance and operations. Yet, several risks exist when directly modifying or updating the live system. As a result, the production of a digital clone of an actual physical system, often known as a “Digital Twin” (DT), has now become an approach to address this issue. This study aims to conduct a review of how digital twins are utilized to improve the efficiency of intelligent automation across various business sectors. The study provides an understanding of the concept and discusses the evolution and development of digital twins. The key technologies that enable digital twins are examined, and the risks and challenges associated with digital twins are analyzed together with potential control strategies. Keywords: Digital Twin · IoT · CPS · Digital Clone · Digital Twin Risks · Digital Twin Security Controls

1 Introduction

Digital twin technology has become a ground-breaking innovation in the fields of engineering, production, and design in recent years [1]. A digital twin is a virtual representation of a real-world system, process, object or item that is used to track, evaluate, and improve performance. This process involves collecting data from sensors, IoT devices, and other sources. Engineers, designers, and operators can utilize the digital twin to imitate the behavior of the physical object or process in real time and monitor and improve its performance [2, 3]. Several different digital twin versions have been used for many years in a number of sectors to reduce or eliminate risks, improve the optimization of crucial choices, and boost operational efficiency. Data, intelligence, and physical system behavior work


together to provide an interface that encourages effective operation monitoring and accurate prediction. The primary enabling technologies for DT (Fig. 1) are Industry 4.0, the Internet of Things (IoT), and Artificial Intelligence (AI) [4, 5]. Digital twins have been adopted by a number of industries, including manufacturing, production, and operations management, as well as healthcare, civil engineering, and critical infrastructure development [1]. The digital twin database enables managers to compare an ideal state to reality, follow the progress through certain monitoring cycles, and generate relevant reports and warnings [6]. It can be used to model and simulate different security dangers and vulnerabilities in a computer system or network in the context of cyber security and assist in identifying possible security concerns before they can be taken advantage of by attackers. The key motivation behind digital twin technology is to create a virtual replica or model that can be used for simulation, analysis, and optimization. This virtual replica is created by collecting real-time data from sensors and other sources and then using advanced analytics, machine learning, and other techniques to generate insights that can be used to improve the performance, efficiency, and effectiveness of the physical system [7]. Section 2 of this paper describes the concept of DT. Sections 3 and 4 explore the architecture of DT and examine the various applications of DT in different fields. Sections 5 and 6 describe some of the major risks and challenges associated with DT, including a brief explanation of potential risk control strategies for DT. Sections 7 and 8 provide discussions, recommendations, and a conclusion on future research directions for digital twin technology.

Fig. 1. Physical Design and Digital Twin

2 Concept of Digital Twins

A digital twin is a computerized replica or simulation of an actual system or process, devised to monitor, evaluate, and improve its efficiency in a digital environment [8]. The phrases “Digital Model,” “Digital Shadow,” and “Digital Twin” are commonly used interchangeably, although they do not refer to the same degree of data integration at the physical, digital, and cyber levels, as shown in Fig. 2 [9]. A digital model is defined as a digital replica of a physical object that either already exists or is projected to exist in the future. A digital shadow is a digital representation


of an object in which data flows only in one direction between the physical and digital objects. The term “Digital Twin” is used to describe the situation in which data flows between an existing physical object and a digital entity, and both the physical and digital objects are fully integrated in both directions.

Fig. 2. Digital Model, Shadow, Thread, Predictive

A strong correlation exists between Cyber-Physical Systems (CPS) and DT. However, DTs are more akin to an engineering category, and CPS is more comparable to a scientific category. In terms of composition, DTs and CPS both integrate the actual and virtual worlds. Because of the interaction and control that may take place between them, both cyberspace and the actual world are able to govern and operate the physical world in a more precise and effective manner. Whereas DTs place more focus on virtual models that enable one-to-one communication, CPS puts more emphasis on three-dimensional capabilities, which lead to one-to-many interaction. It is contended that the fundamental components of a CPS are sensors and actuators, whereas the fundamental components of a DT are models and data [10].

3 Architectural Context of Digital Twins

Fig. 3. Basic Layers of DT

The framework of DT demonstrates how a completely digital system ought to function in order to help humans to achieve scalability, autonomy, and innovation [11]. Figure 3 represents an illustration of a Digital Twin reference model that is composed of four parts: a physical layer, a digital layer, a cyber-layer, and communication for the exchange of data among the three layers. These layers carry real-time data as a mapping between selected physical elements and their digital model and process in cyberspace.


Fig. 4. Hierarchical structure of Digital Twin

• Physical Layer: The physical layer represents any actual system in the real world, such as the smart grid, smart transportation, smart manufacturing, and smart cities. The two most important types of connected things are actuators and sensors. The former receives an electronic signal and acts on the physical condition or chemical substance that is being sensed, while the latter detects and measures the characteristic.
• Digital Layer: The digital layer involves the recording of data in raw or other file forms, such as Computer-aided design (CAD) or Computer-aided manufacturing (CAM), in order to facilitate the creation, modification, analysis, optimization, or prediction of a static, dynamic, or real-time data set. This layer stores files that have been built and produced to reflect results that are required, desired, and anticipated in the physical layer.
The majority of popular DT frameworks have a hierarchical structure. They are founded on DTs as a precondition for the development of a CPS, with the DTs providing services to the CPS in order to make it possible for the CPS to exert control. The DT is depicted in full in Fig. 4 along with its generic hierarchical architecture [14].

4 Application of Digital Twins

There is a growing trend in many different fields to use digital twins for a wide variety of purposes and applications [15, 16]. The following are some examples:
• Smart cities
In the context of smart city projects, digital twin technology can be utilized to model and optimize urban infrastructure. Building Information Modelling (BIM) is an example of an information model in the built environment. The Internet of Things constitutes an essential component in the process of developing BIM to support smart


buildings. When it comes to monitoring, simulating, optimizing, and making predictions about the condition of cyber-physical systems (CPSs), DTs make new potential outcomes possible [10, 17–19].
• Healthcare
In the field of medicine, digital twin technology can be utilized to simulate the human body and develop individualized treatment strategies. DTs in healthcare should dynamically reflect data sources such as electronic health records (EHRs), disease registries, “-omics” data (for example, genomics, proteomics, or metabolomics data), as well as physical markers, demographic, and lifestyle data over the course of an individual’s lifetime [20, 21]. The digital twin enables information to be exchanged in real time between the physical and virtual copies of the patient, object, or environment being studied. DT models need biomarkers that cannot be assessed directly or that require invasive procedures [22, 23]. Table 1 provides an overview of current DT applications for medical treatment [21].
• Manufacturing
Digital twin technology provides high-fidelity cyber models that map the real worlds of SMSs (Smart Manufacturing Systems). IoT technology has enabled a bidirectional cyber-physical mapping in order to facilitate the planning and optimization of re-manufacturing procedures. Table 2 contains information regarding current DT applications in manufacturing [27–29].

Table 1. Digital Twin Application in Healthcare

Disease – Description
Heart [21] – The Living Heart Project is the first DT organ to address all components of the heart’s performance, including blood flow, mechanics, and electrical impulses.
Heart [24] – Chest electrocardiogram (ECG) signals are combined with data from a computed tomography (CT) scan to produce simultaneous DT cardiac maps.
Brain [25] – The Blue Brain Project, which is one of the sub-projects of the Human Brain Project, has been built with biologically detailed digital reconstructions (computer models) and simulations of the mouse brain using DT.
Human airway system [22] – This is a high-resolution DT human respiratory system that covers the complete respiratory zones, lung lobes, and body shells. The purpose of the research is to investigate and improve the efficacy of cancer-killing medications that specifically target tumors.
Brain aneurysm and surrounding blood vessels [23] – Aneurysms, which are swollen blood arteries that can lead to clots or strokes, have been the focus of research and development that led to the creation of a DT. Brain surgeons are able to run simulations on a three-dimensional model.
Diabetes [26] – The DT model keeps tabs on patients’ blood sugar levels, liver function, and weight, and tracks their nutrition, sleep patterns, and step counts.
Wireless Body Area Networks (WBAN) [19] – A WBAN (ZigBee, WSNs, Bluetooth, WLAN, WPAN) is a wireless body area network used in the medical field. It employs biomedical sensors that are positioned in various parts of the body and can be either implanted under the skin or worn on the surface of the body.

Table 2. Digital Twin Application in Manufacturing

Source – Model
[30] – Quad-play CMCO model: Configuration design, Preparing for movement, Control development, and Decoupling
[8] – DT2SA: combines security analytics and digitization of objects
[31] – Robotics model: Human Robot Collaboration (HRC) [32], Gazebo [33], MuJoCo [34], and CoppeliaSim [35]
[36] – Satellite assembly: CAD model, Behavior model, Rule model
[37, 38] – 3D printer (MTConnect, Proprietary ontology)
[39] – ToM (Theory of Mind): Flexible robotized warehouses
[40] – FSMS: Bill-of-Material (BoM), Bill-of-Process (BoP), and Bill-of-Resource (BoR) might be used to define the OA-FSMS
[41] – Automotive body production
[42] – Automation ML
[43] – Parallel controlling of smart workshop
[44] – Personalized production

• Construction
Digital twins for construction projects (DTCP) incorporate established, developing, and upcoming methods used in the construction industry. Architects and builders are able to test a variety of building designs and construction methods by first generating a digital twin of the structure they are working on [45, 46]. Table 3 describes some of the applications of digital twins in construction.

Table 3. Digital Twin Application in Construction

Source – Telegraphic
[47] – Using BIM and Blockchain to Promote Trust and Cooperation
[48] – Smart contracts automate previously human-dependent aspects of traditional contracts
[49] – Using Building Information Modeling (BIM) and sensors like Ultra-Wideband technology to ensure worker safety
[50] – Production, limitations, IoT/BIM advancements
[51] – Integration of GIS and energy management tools to reduce power usage
[52] – Bridge condition monitoring by remote-controlled aircraft
[53] – Bridge monitoring systems utilize Terrestrial Laser Scanning (TLS)

• Supply Chain Management
Digital Supply Chain Twins (DSCTs) make it possible to create a model that is a mirror simulation of all the operations that occur inside supply chains. The essential technology that enables the relatively accurate construction of a DSCT is a combination of simulation, optimization, and data analytics [54–56]. Blockchain technology makes it possible to store DT data in a ledger that cannot be altered and is secure [57]. The ability to provide Manufacturing-as-a-Service (MaaS) to customers via a platform is one of the benefits that can be gained via cloud manufacturing [58].


• Agriculture
Digital twins have supported smart farming to new levels of productivity and sustainability. A DT can act as a central means of farm management, from planning to controlling operations remotely based on real-time digital information instead of relying on direct observation and manual tasks on-site. This allows farmers to act immediately in case of deviations and to simulate effective interventions. Table 4 provides some of the applications of DT in agricultural settings [72].

Table 4. Digital Twin Application in Agriculture

Source – Type
[73, 74] – DT of farm site and crops, as well as monitoring, resource optimization and cultivation support
[75] – DT of livestock; monitoring, management and optimization
[76, 77] – Urban, aquaponic farming
[78, 79] – Product design, smart services and machinery management

• Process Industry
DT processes and practices can also be seen in several large companies. Table 5 contains a list of DTs that are used in the process industry [59, 60].

Table 5. Digital Twin Applications in Process Industry

Source – Model – Description
[61] – HMI – Human Machine Interaction (HMI): the real environment can be generated by virtual models that are driven by data that is updated in real time
[62] – Transport Model – DT of a transportation system with the goal of assessing its current state and identifying viable repair options
[63] – Structural – Development of a technique that optimizes the arrangement of assembly locations in the manufacturing business, which lowers complexity and uncertainty while also optimizing the process
[64] – Data-based – Real-time monitoring that looks for unusual occurrences. Distributed supervised machine learning is employed for data processing and for deciding which decisions should be made between the physical layer, the edge layer, and the cloud layer
[65] – Control Models – Strengthen the microgrids’ resistance to coordinated attacks. This idea adds mathematical protection to the DT, making it resistant to both individual and organized attacks
[66] – Data-based – The author developed a DT in order to strengthen the vital infrastructure’s resistance to disruption
[67] – Data-based – Protect vital infrastructure against denial-of-service attacks. Provide a strategy for the design of DTs to ensure the safety of critical infrastructure
[68] – Hybrid – Locating and analyzing potential problems
[69] – Control Models – Enhance the overall quality of the micro-made gadgets that are being produced
[70] – Control Models – Keep an eye on the wear and tear on the various system components, and try to put a number on it

• Aerospace and Defence
The performance of airplanes and other vehicles can be simulated with the help of digital twin technology, which is widely utilized in the aerospace and defense industries. NASA has been developing and monitoring several aerospace vehicles using the digital twin concept. An aerospace company has created and implemented a quality management system for the assembly process of aerospace products that is based on DT [71].

4.1 The Technologies Enabling Digital Twins

The Internet of Things (IoT), cloud computing, artificial intelligence (AI), simulation, visualization tools, and machine learning models are all essential components for developing DTs. The usage of DTs is becoming more appealing as a result of developments in virtual reality (VR) and supercomputers.
• AI-Machine Learning: this is currently one of the scientific fields that is developing at one of the fastest rates. It is possible to conceptualize it as a computer system that, as more experience is accumulated, improves itself automatically in terms of its overall efficacy [25].
• Internet of Things: this term is used to describe all of the devices that are connected to a network. These objects are typically equipped with pervasive knowledge. IoT sensors and devices are of the utmost significance to the digital twins concept because they offer the information required for a digital twin to conduct an analysis and evaluation of the real condition of its physical object or environment [80].
• Cloud Computing: these platforms are able to supply the enormous computing power and storage space that digital twins need to function properly. Additionally, cloud systems make it possible for various stakeholders to collaborate and share data with one another. When sending data to the cloud, it must first be encrypted for maximum protection against unauthorized access.
• Augmented and Virtual Reality: virtual reality is a technology that recreates the sensation of being in the actual world by simulating the environment in which the user finds themselves. Augmented reality and virtual reality can both be effective tools for viewing and analyzing digital twins, whether they are displayed on a screen (2D) or in a real-world location (3D).
• Application Programming Interface: an API makes it possible for programs like databases, networks, and Internet of Things sensors to communicate with one another.


APIs are reusable building pieces that are designed to save developers time by preventing them from having to start the programming process from scratch again and again.
• Big Data and Analytics: these technologies are used to analyze the copious volumes of data produced by digital twins in order to glean insights and recognize trends.

5 Risks and Challenges Associated with Digital Twins

While digital twin technology offers many benefits, there are also risks and challenges that must be considered [81–83]. Some of these include:
• Data security and privacy: Digital twins require a large amount of data to create and operate, and this data must be protected from cyber threats and attacks. Digital twins retain the ability to perform active monitoring of the physical assets [84]. Although many businesses may find the thought of maintaining suitable data protection processes for networks of this complexity to be intimidating, the hazards involved in protecting the security of digital twins are frequently of the utmost importance.
• Cost: Creating and maintaining a digital twin can be expensive, requiring a significant investment in hardware, software, and personnel. Additionally, the benefits of the digital twin may not be realized for several years, making it a long-term investment.
• Accuracy: The accuracy of a digital twin is only as good as the data it is based on. If the data is incomplete or inaccurate, the digital twin may not accurately model the physical system it represents.
• Ethics: As digital twins become more prevalent, ethical considerations may arise, such as the potential for bias in the data used to create the twin. Indeed, especially in healthcare and medicine, access to high-quality data with high cardinality and containing enough variation will be crucial to train effective AI models. These datasets could be compiled from a variety of sources, including information obtained from clinical studies or through partnerships with hospitals, data from customers, and information that is publicly available.
• Education: Organizations and employees will invariably need to update their skill sets in order to keep up with the effects of technological advancements involving digital twins. The owners and users of digital twins must have access to the necessary resources and knowledge to successfully operate and manage the digital twin platforms and infrastructure [83].


• Field of Robotics: Because robotic movements can frequently be incredibly fast (e.g., in an assembly line), real-time feedback from sensors is important to the digital twin’s ability to make short-term decisions and judgments. For the purpose of developing robots, various simulators, such as Gazebo [85], call for high-performance computing (HPC) platforms.

6 Integration of Interdisciplinary Model

The digital twin is comprised of a number of interconnected and multidisciplinary models (such as CAD, structure, behavior, and function models) in the areas of mechanical engineering, electrical engineering, and software development. For example, during the engineering phase of the Cyber Physical Power System (CPPS), these models are built using a variety of tools, each of which addresses a certain facet of the system [86].
• Data Quality and Reliability: In a digital twin application that relies on data generated by hundreds (or thousands) of IoT sensors, it is of utmost importance to assure that the data is of dependable quality [82]. Because of challenging working conditions and the use of remote networks for communication, businesses will need to devise methods to recognize and remove false information as well as deal with inconsistencies in the information they collect [83].
• IP Protection: Digital twin data is usually shared between several distinct instances. If the data being collected has a direct bearing on the fundamental capabilities of the organization, it is quite likely to contain highly secret information, which raises questions about who owns the data, how identity can be verified, and how users can be restricted from accessing it [82].
• Harmonization of Real Life and Virtual Experiences: Because simulations are unable to accurately portray the real world in its entirety, a great number of alterations and modifications must be made during the commissioning and operation of a production system. As a result, in order to make use of the digital twin, its synchronization with the physical system must begin as soon as possible and must be both automatic and methodical [87].
• Interoperability: Interoperability is essential in order to accomplish both the goal of co-simulation and the interchange of operational data. Data interchanged from operations can be put to use in machine learning applications, in which one intelligent digital twin can optimize the results of its learning by making use of information offered by its adjacent twins. In the case of co-simulation, both the co-simulation and an adjacent twin can attempt to mimic the actual system in the cyber layer [87].


6.1 Risk Control Strategies for Digital Twins The major consideration when developing any digital twin is risk management of the three distinct parts: the physical object or process and its physical environment, the digital representation of the object or process, and the communication channel between the physical and virtual representations. This enables the digital twin to become more dependable and resilient in the potential case of disruptions [61, 88]. To identify and address cyber security risks, such as malware, and other types of cybercrime, DT makes use of data analytics, machine learning, and other cutting-edge technology. Cyber security experts use DT to test and evaluate existing security procedures and techniques to create new technologies and solutions that will better guard against online attacks. DT environments of firewalls, encryption, and intrusion detection systems are used to create effective cybersecurity measures. Some strategies that can be employed by organizations to mitigate digital twin risks involve applying: Security Controls: The control for securing digital twins involves a comprehensive approach that considers every facet of the technology stack, from data generation at the physical level to data storage and analysis at the digital level. To ensure confidentiality and integrity, a robust set of technical security controls must be implemented since digital twins are connected to the internet and vulnerable to cyber-attacks. Organizations should implement firewalls, intrusion detection systems, and encryption, to prevent unauthorized access, data breaches, and other cyber threats to digital twin data and network environment. Data Privacy: Digital twins collect and process large amounts of data, including sensitive information. Companies should establish clear data privacy policies and procedures to ensure that the information collected is used ethically and in accordance with applicable laws and regulations. Encryption can help to protect sensitive data and communications between the physical system and the digital twin. Testing and Validation: Digital twins should be rigorously tested and validated before implementation to ensure that they accurately replicate the physical system and produce reliable results. Implement Access Controls: A foundational pillar for digital twin risk control is access control. Implementing strong authentication mechanisms ensures that only authorized personnel can interact with and manipulate the digital twin. Multi-factor authentication (MFA) can provide an additional layer of defense against unauthorized access, while role-based access control (RBAC) can tailor permissions based on user responsibilities, thereby minimizing any potential risks of misuse. Organizations should implement these controls to limit the number of people who can access the digital twin. There must be no compromise in using strong passwords, two-factor authentication, and other security measures to prevent unauthorized access. Conduct Regular Risk Assessment Audits: Organizations should conduct regular risk assessments to identify potential threats and vulnerabilities in their digital twin systems. This can help them to prioritize security measures and address any issues in the digital twin infrastructure before they can be exploited by cyber criminals. For instance,

performing penetration testing to simulate real-world attacks and identify potential entry points for malicious actors is crucial for identifying weaknesses.
Disaster Recovery Plan: When operating digital twin systems, preparing for worst-case scenarios is essential to ensure that, in the event of a cyber incident or system failure, the digital twin environment can be restored and maintained, minimizing downtime and data loss. Companies should have an effective and well-tested disaster recovery plan in place in case of any system failure or cyber-attack. This plan should include backups of data and systems, and a plan for quick recovery of the digital twin system.
Training and Education: Companies should provide training and education for employees who work with digital twin technology to ensure that they understand the risks and how to use the technology safely and effectively.
Ethical Considerations: Digital twins have the potential to impact society, and companies should consider the ethical implications of their use. For example, companies should consider the impact of digital twins on employment and ensure that they do not lead to job loss.
Regular Updates and Maintenance: Digital twins require regular updates and maintenance to ensure that they remain accurate and reliable. Companies should establish a regular maintenance schedule and ensure that updates are implemented promptly. Cyber threats are constantly evolving, so regular updates and patches of the software ensure that any known vulnerabilities are addressed.
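As an illustration of the access-control strategy described above, the following minimal sketch shows how role-based permissions combined with an MFA check could gate actions on a digital twin platform. The role names, permissions, and functions are hypothetical and are not drawn from the reviewed literature.

```python
# Minimal sketch of RBAC plus an MFA gate for a digital twin platform.
# Role names and permissions are hypothetical, for illustration only.
from dataclasses import dataclass

ROLE_PERMISSIONS = {
    "viewer":   {"read_twin_state"},
    "engineer": {"read_twin_state", "update_model", "run_simulation"},
    "admin":    {"read_twin_state", "update_model", "run_simulation", "manage_users"},
}

@dataclass
class User:
    name: str
    role: str
    mfa_verified: bool = False  # set True only after the second factor is checked

def is_authorized(user: User, action: str) -> bool:
    """Allow an action only if MFA has been verified and the user's role grants it."""
    if not user.mfa_verified:   # MFA as an additional layer of defence
        return False
    return action in ROLE_PERMISSIONS.get(user.role, set())

alice = User("alice", role="engineer", mfa_verified=True)
print(is_authorized(alice, "run_simulation"))  # True
print(is_authorized(alice, "manage_users"))    # False: not granted to the engineer role
```

In a real deployment such a check would sit behind the platform’s authentication service and be combined with the logging, auditing, and regular risk assessments discussed in this section.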

7 Discussions and Recommendations In this work, a summary and presentation of various concepts, definitions, techniques, and characterizations of digital twin technologies that are currently in use are provided. Because of the proliferation of the Industry 4.0 idea, digital twin is becoming increasingly commonplace. The development of cutting-edge technologies such as the IoT, cloud computing, big data analytics, and artificial intelligence has been a crucial enabler of expansion of the technology [54]. Recent developments have shown that digital twins have a wide range of applications and can be implemented in a variety of fields and industries. These applications range from the management of manufacturing and production processes to the administration of engineering and construction projects and healthcare facilities, as well as other complex applications, such as those found in agriculture, retail, aviation, and aircraft maintenance. Studies on digital twins for smart cities, agriculture, and education are notably less common than those on manufacturing and healthcare, revealing research deficits in these fields. The review in this study draws attention to these under-discussed, yet rapidly developing areas. The implementation of solutions based on digital twins is expensive since it necessitates considerable investments in technological platforms (sensors, software), development of communication infrastructure, maintenance, data quality control, and security solutions.


8 Conclusions

In addition to reviewing the concept and architectural context of the digital twin, the aim of this paper was to review the current applications of digital twins in the context of the risks and challenges of cyber physical systems. A digital twin is a virtual version of a physical structure that is employed with the objective of achieving advanced simulation results in order to monitor information by means of signals from sensors, improve decision-making, and forecast problems. The technology might make it possible to share data, which would allow CPS to host additional data-oriented services. The applications in the manufacturing, aviation, and healthcare industries present many other possibilities of how digital twins could be viable. However, reference models for digital twins have been found to be lacking, and a cursory review of the research objectives and challenges pertaining to digital twins indicates that most recent research outputs only demonstrate preliminary application examples. On the basis of the several risks and challenges associated with digital twins, the following should be included in any future research: methods for simulating and modeling to reduce complexity; 5G communication; capabilities for both edge computing and cloud computing; IoT data processing and analysis systems; development of new AI algorithms; interoperability and integration of many types of software and simulators; and protection against man-in-the-middle attacks and other forms of cyber attacks, such as Distributed Denial of Service (DDoS), by securing the application layer.

References 1. Liu, X., et al.: A systematic review of digital twin about physical entities, virtual models, twin data, and applications. Adv. Eng. Inform. 55, 101876 (2023) 2. Iliu¸ta˘ , M., Pop, E., Caramihai, S.I., Moisescu, M.A.: A digital twin generic architecture for data-driven cyber-physical production systems. In: Service Oriented, Holonic and MultiAgent Manufacturing Systems for Industry of the Future: Proceedings of SOHOMA 2022, pp. 71–82 (2023) 3. Juarez, M.G., Botti, V.J., Giret, A.S.: Digital twins: review and challenges. J. Comput. Inf. Sci. Eng. 21(3) (2021) 4. Sharma, A., Kosasih, E., Zhang, J., Brintrup, A., Calinescu, A.: Digital twins: State of the art theory and practice, challenges, and open research questions. J. Ind. Inf. Integr. 100383 (2022) 5. Adjei, P., Montasari, R.: A critical overview of digital twins. In: Research Anthology on BIM and Digital Twins in Smart Cities, pp. 1–12 (2023) 6. Faleiro, R., Pan, L., Pokhrel, S.R., Doss, R.: Digital twin for cybersecurity: towards enhancing cyber resilience. In: Broadband Communications, Networks, and Systems: 12th EAI International Conference, BROADNETS 2021, Virtual Event, 28–29 October 2021, Proceedings 12, pp. 57–76 (2022) 7. Barricelli, B.R., Casiraghi, E., Fogli, D.: A survey on digital twin: Definitions, characteristics, applications, and design implications. IEEE Access 7, 167653–167671 (2019) 8. Empl, P., Pernul, G.: Digital-twin-based security analytics for the internet of things. Information 14(2), 95 (2023) 9. Alshammari, K., Beach, T., Rezgui, Y.: Cybersecurity for digital twins in the built environment: current research and future directions. J. Inf. Technol. Constr. 26, 159–173 (2021)


10. Tao, F., Qi, Q., Wang, L., Nee, A.Y.C.: Digital twins and cyber–physical systems toward smart manufacturing and industry 4.0: correlation and comparison. Engineering 5(4), 653–661 (2019) 11. Aheleroff, S., Xu, X., Zhong, R.Y., Lu, Y.: Digital twin as a service (DTaaS) in industry 4.0: an architecture reference model. Adv. Eng. Inform. 47, 101225 (2021) 12. Botín-Sanabria, D.M., Mihaita, A.S., Peimbert-García, R.E., Ramírez-Moreno, M.A., Ramírez-Mendoza, R.A., Lozoya-Santos, J.D.J.: Digital twin technology challenges and applications: a comprfehensive review. Remote Sensing 14(6), 1335 (2022) 13. Qian, C., Liu, X., Ripley, C., Qian, M., Liang, F., Yu, W.: Digital twin—Cyber replica of physical things: architecture, applications and future research directions. Future Internet 14(2), 64 (2022) 14. da Silva Mendonça, R., de Oliveira Lins, S., de Bessa, I.V., de Carvalho Ayres Jr, F.A., de Medeiros, R.L.P., de Lucena Jr, V.F.: Digital twin applications: a survey of recent advances and challenges. Processes 10(4), 744 (2022) 15. Opoku, D.G.J., Perera, S., Osei-Kyei, R., Rashidi, M.: Digital twin application in the construction industry: a literature review. J. Build. Eng. 40, 102726 (2021) 16. Attaran, M., Celik, B.G.: Digital Twin: benefits, use cases, challenges, and opportunities. Decis. Anal. J. 100165 (2023) 17. Dembski, F., Wossner, U., Letzgus, M., Ruddat, M., Yamu, C.: Urban digital twins for smart cities and citizens: the case study of Herrenberg, Germany. Sustainability 12, 2307 (2020) 18. Ruohomaki, T., Airaksinen, E., Huuska, P., Kesaniemi, O., Martikka, M., Suomisto, J.: Smart city platform enabling digital twin. In: Proceedings International Conference on Intelligent Systems (IS), pp. 155–161 (2018) 19. Jimenez, J.I., Jahankhani, H., Kendzierskyj, S.: Health care in the cyberspace: Medical cyberphysical system and digital twin challenges. In: Farsi, M., Daneshkhah, A., Hosseinian-Far, A., Jahankhani, H. (eds.) Digital Twin echnologies and Smart Cities, pp. 79–92. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-18732-3_6 20. Helgers, H., Hengelbrock, A., Schmidt, A., Rosengarten, J., Stitz, J., Strube, J.: Process design and optimization towards digital twins for HIV-gag VLP production in HEK293 cells, including purification. Processes 10, 419 (2022) 21. Armeni, P., Polat, I., De Rossi, L.M., Diaferia, L., Meregalli, S., Gatti, A.: Digital twins in healthcare: is it the beginning of a new era of evidence-based medicine? A critical review. J. Pers. Med. 12(8), 1255 (2022) 22. Elayan, H., Aloqaily, M., Guizani, M.: Digital twin for intelligent context-aware IoT healthcare systems. IEEE Internet Things J. 8(23), 16749–16757 (2021) 23. Hassani, H., Huang, X., MacFeely, S.: Impactful digital twin in the healthcare revolution. Big Data Cogn. Comput. 6(3), 83 (2022) 24. Corral-Acero, J., et al.: The ‘Digital Twin’ to enable the vision of precision cardiology. Eur. Heart J. 41, 4556–4564 (2020) 25. Erol, T., Mendi, A.F., Dogan, D.: The digital twin revolution in healthcare. In: Proceedings of the 2020 4th International Symposium on Multidisciplinary Studies and Innovative Technologies (ISMSIT), Istanbul, Turkey, 22–24 October 2020, pp. 1–7 (2020). https://ieeexplore. ieee.org/document/9255249. Accessed 22 Dec 2021 26. Shamanna, P., et al.: Reducing HbA1c in type 2 diabetes using digital twin technology-enabled precision nutrition: a retrospective analysis. Diabetes Ther. 11, 2703–2714 (2020) 27. 
Leng, J., Wang, D., Shen, W., Li, X., Liu, Q., Chen, X.: Digital twins-based smart manufacturing system design in industry 4.0: a review. J. Manuf. Syst. 60, 119–137 (2021) 28. Rojek, I., Mikołajewski, D., Dostatni, E.: Digital twins in product lifecycle for sustainability in manufacturing and maintenance. Appl. Sci. 11, 31 (2021)


29. Lu, Y., Liu, C., Wang, K., Huang, H., Xu, X.: Digital twin-driven smart manufacturing: connotation, reference model, applications and research issues. Robot. Comput.-Integr. Manuf. 61, 101837 (2019) 30. Liu, Q., et al.: Digital twin-based designing of the configuration, motion, control, and optimization model of Advanced Robotics 31. Huang, Z., Shen, Y., Li, J., Fey, M., Brecher, C.: A survey on AI-driven digital twins in industry 4.0: smart manufacturing and advanced robotics. Sensors 21(19), 6340 (2021) 32. Joseph, A.J., Kruger, K., Basson, A.H.: An aggregated digital twin solution for human-robot collaboration in industry 4.0 environments. In: Borangiu, T., Trentesaux, D., Leitão, P., Cardin, O., Lamouri, S. (eds.) SOHOMA 2020. SCI, vol. 952, pp. 135–147. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-69373-2_9 33. Agüero, C.E., et al.: Inside the virtual robotics challenge: simulating real-time robotic disaster response. IEEE Trans. Autom. Sci. Eng. 12, 494–506 (2015) 34. Todorov, E., Erez, T., Tassa, Y.: MuJoCo: a physics engine for model-based control. In: IEEE International Conference on Intelligent Robots and Systems, pp. 5026–5033 (2012) 35. Rohmer, E., Singh, S.P., Freese, M.: V-REP: a versatile and scalable robot simulation framework. In: IEEE International Conference on Intelligent Robots and Systems, pp. 1321–1326 (2013) 36. Zhuang, C., Liu, J., Xiong, H.: Digital twin-based smart production management and control framework for the complex product assembly shop-floor. Int. J. Adv. Manuf. Technol. 96, 1149–1163 (2018) 37. Hu, L., et al.: Modeling of cloud-based digital twins for smart manufacturing with MT connect. Procedia Manuf. 26, 1193–1203 (2018) 38. Shahriar, M.R., Sunny, S.M.N.A., Liu, X., Leu, M.C., Hu, L., Nguyen, N.-T.: MTComm based virtualization and integration of physical machine operations with digital-twins in cyberphysical manufacturing cloud. In: Proceedings - 5th IEEE International Conference on Cyber Security and Cloud Computing and 4th IEEE International Conference on Edge Computing and Scalable Cloud, CSCloud/EdgeCom 2018, pp. 46–51 (2018) 39. Petkovi´c, T., Puljiz, D., Markovi´c, I., Hein, B.: Human intention estimation based on hidden Markov model motion validation for safe flexible robotized warehouses. Robot. Comput.Integr. Manuf. 57, 182–196 (2019). https://doi.org/10.1016/j.rcim.2018.11.004 40. Leng, J., Zhou, M., Xiao, Y., Zhang, H., Liu, Q.Q., Li, L.: Digital twins-based remote semiphysical commissioning of flow-type smart manufacturing systems. J. Clean. Prod. 306, 127278 (2021) 41. Son, Y.H., Park, K.T., Lee, D., et al.: Digital twin–based cyber-physical system for automotive body production lines. Int. J. Adv. Manuf. Technol. 115, 291–310 (2021) 42. Zhang, H., Yan, Q., Wen, Z.: Information modeling for cyber-physical production system based on digital twin and AutomationML. Int. J. Adv. Manuf. Technol. 107, 1927–1945 (2020) 43. Leng, J., Zhang, H., Yan, D., et al.: Digital twin-driven manufacturing cyberphysical system for parallel controlling of smart workshop. J. Ambient. Intell. Humaniz. Comput. 10(3), 1155–1166 (2019) 44. Park, K.T., Lee, J., Kim, H.J., et al.: Digital twin-based cyber physical production system architectural framework for personalized production. Int. J. Adv. Manuf. Technol. 106, 1787– 1810 (2020) 45. Salem, T., Dragomir, M.: Options for and challenges of employing digital twins in construction management. Appl. Sci. 12(6), 2928 (2022) 46. 
Sacks, R., Brilakis, I., Pikas, E., Xie, H., Girolami, M.: Construction with digital twin information systems. Data-Cent. Eng. 1, e14 (2020) 47. Hijazi, A.A., Perera, S., Al-Ashwal, A.M., Neves Calheiros, R.: Enabling a Single Source of Truth through BIM and Blockchain Integration, pp. 385–393 (2019)


48. Clack, C.D., Bakshi, V.A., Braine, L.: Smart Contract Templates: Foundations, Design Landscape and Research Directions, arXiv preprint arXiv:1608.00771 (2016) 49. Li, H., Lu, M., Chan, G., Skitmore, M.: Proactive training system for safe and efficient precast installation. Autom. ConStruct. 49, 163–174 (2015) 50. Dave, B., Kubler, S., Främling, K., Koskela, L.: Opportunities for enhanced lean construction management using internet of things standards. Autom. Constr. (2016). https://www.scopus.com/inward/record.uri?eid=2-s2.0-84949680819&doi=10. 1016%2fj.autcon.2015.10.009&partnerID=40&md5=6099e707710515adcda41748e118ab6 51. Kim, S.A., Shin, D., Choe, Y., Seibert, T., Walz, S.P.: Integrated energy monitoring and visualization system for Smart Green City development: designing a spatial information integrated energy monitoring model in the context of massive data management on a web based platform. Autom. ConStruct. 22, 51–59 (2012) 52. Rashidi, M., Samali, B.: Health monitoring of bridges using RPAs. In: Wang, C.M., Dao, V., Kitipornchai, S. (eds.) EASEC16, pp. 209–218. Springer, Cham (2021). https://doi.org/10. 1007/978-981-15-8079-6_20 53. Rashidi, M., Mohammadi, M., Sadeghlou Kivi, S., Abdolvand, M.M., Truong-Hong, L., Samali, B.: A decade of modern bridge monitoring using terrestrial laser scanning: review and future directions. Remote Sens. 12(22), 3796 (2020) 54. Atalay, M., Murat, U., Oksuz, B., Parlaktuna, A., Pisirir, E., Testik, M.: Digital twins in manufacturing: systematic literature review for physical-digital layer categorization and future research directions. Int. J. Comput. Integr. Manuf. 35, 679–705 (2022) 55. Kajba, M., Jereb, B., Obrecht, M.: Considering IT trends for modelling investments in supply chains by prioritising digital twins. Processes 11(1), 262 (2023) 56. Edlund, R.P.B.: Usage of digital twins in supply chain risk management. Bachelor’s thesis, Aalto University School of Business Information and Service Management, Espoo, Finland (2022) 57. Liu, J., Yeoh, W., Qu, Y., Gao, L.: Blockchain-Based Digital Twin for Supply Chain Management: State-of-the-Art Review and Future Research Directions. arXiv (2022). arXiv:2202. 03966 58. Zhang, G., MacCarthy, B.L., Ivanov, D.: The cloud, platforms, and digital twins—enablers of the digital supply chain. In: MacCarthy, B.L., Ivanov, D. (eds.) The Digital Supply Chain, pp. 77–91. Elsevier, Amsterdam (2022). ISBN 978-0-323-91614-1 59. Perno, M., Hvam, L., Haug, A.: Implementation of digital twins in the process industry: a systematic literature review of enablers and barriers. Comput. Ind. 134, 103558 (2022) 60. Segovia, M., Garcia-Alfaro, J.: Design, modeling and implementation of digital twins. Sensors 22(14), 5396 (2022) 61. Ma, X., Tao, F., Zhang, M., Wang, T., Zuo, Y.: Digital twin enhanced human-machine interaction in product lifecycle. Procedia CIRP 83, 789–793 (2019) 62. González, M., Salgado, O., Croes, J., Pluymers, B., Desmet, W.: A digital twin for operational evaluation of vertical transportation systems. IEEE Access 8, 114389–114400 (2020) 63. Guo, D., Zhong, R.Y., Lin, P., Lyu, Z., Rong, Y., Huang, G.Q.: Digital twin-enabled graduation intelligent manufacturing system for fixed-position assembly islands. Robot. Comput.-Integr. Manuf. 63, 101917 (2020) 64. Huang, H., Yang, L., Wang, Y., Xu, X., Lu, Y.: Digital Twin-driven online anomaly detection for an automation system based on edge intelligence. J. Manuf. Syst. 59, 138–150 (2021) 65. 
Saad, A., Faddel, S., Youssef, T., Mohammed, O.A.: On the implementation of iot-based digital twin for networked microgrids resiliency against cyber attacks. IEEE Trans. Smart Grid 11, 5138–5150 (2020) 66. Salvi, A., Spagnoletti, P., Noori, N.S.: Cyber-resilience of critical cyber infrastructures: integrating digital twins in the electric power ecosystem. Comput. Secur. 112, 102507 (2022)


67. Sousa, B., Arieiro, M., Pereira, V., Correia, J., Lourenço, N., Cruz, T.: ELEGANT: security of critical infrastructures with digital twins. IEEE Access 9, 107574–107588 (2021) 68. Bhatti, G., Singh, R.R.: Intelligent fault diagnosis mechanism for industrial robot actuators using digital twin technology. In: Proceedings of the 2021 IEEE International Power and Renewable Energy Conference (IPRECON), Kollam, India, 24–26 September 2021, pp. 1–6 (2021) 69. Modoni, G.E., Stampone, B., Trotta, G.: Application of the digital twin for in process monitoring of the micro injection moulding process quality. Comput. Ind. 135, 103568 (2022) 70. Moghadam, F.K., Nejad, A.R.: Online condition monitoring of floating wind turbines drive train by means of digital twin. Mech. Syst. Signal Process. 162, 108087 (2022) 71. Zhuang, C., Liu, Z., Liu, J., Ma, H., Zhai, S., Wu, Y.: Digital twin-based quality management method for the assembly process of aerospace products with the grey-markov model and apriori algorithm. Chin. J. Mech. Eng. 35, 105 (2022) 72. Purcell, W., Neubauer, T.: Digital twins in agriculture: a state-of-the-art review. Smart Agric. Technol. 3, 100094 (2023) 73. Skobelev, P., Laryukhin, V., Simonova, E., Goryanin, O., Yalovenko, V., Yalovenko, O.: Multi-agent approach for developing a digital twin of wheat. IEEE (2020) 74. Machl, T., Donaubauer, A., Kolbe, T.H.: Planning Agricultural Core Road Networks Based on a Digital Twin of the Cultivated Landscape, Wichmann Verlag (2019). https://doi.org/10. 14627/537663034 75. Jo, S.-K., Park, D.-H., Park, H., Kwak, Y., Kim, S.-H.: Energy planning of pigsty using digital twin. In: 2019 International Conference on Information and Communication Technology Convergence (ICTC), pp. 723–725. IEEE (2019). https://doi.org/10.1109/ICTC46691.2019. 8940032 76. Johannsen, C., Senger, D., Kluss, T.: A digital twin of the social-ecological system urban beekeeping, pp. 193–207 (2020). https://doi.org/10.1007/978-3-030-61969-5_14 77. Ghandar, A., Ahmed, A., Zulfiqar, S., Hua, Z., Hanai, M., Theodoropoulos, G.: A decision support system for urban agriculture using digital twin: a case study with aquaponics. IEEE Access 9, 35691–35708 (2021) 78. Tsolakis, N., Bechtsis, D., Bochtis, D.: AgROS: a robot operating system based emulation tool for agricultural robotics. Agronomy 9(7), 403 (2019). https://doi.org/10.3390/agronomy9 070403 79. Paraforos, D.S., Sharipov, G.M., Griepentrog, H.W.: ISO 11783-compatible industrial sensor and control systems and related research: a review. Comput. Electron. Agric. 163, 104863 (2019). https://doi.org/10.1016/j.compag.2019.104863 80. Boulos, M.K., Zhang, P.: Digital twins: from personalised medicine to precision public health. J. Pers. Med. 11, 745 (2021) 81. Moshood, T., Nawanir, G., Sorooshian, S., Okfalisa, O.: Digital twins driven supply chain visibility within logistics: a new paradigm for future logistics. Appl. Syst. Innov. 4, 29 (2021) 82. Modoni, G.E., Caldarola, E.G., Sacco, M., Terkaj, W.: Synchronizing physical and digital factory: benefits and technical challenges. Procedia CIRP 79, 472–477 (2019) 83. Uhlemann, T.H.-J., Lehmann, C., Steinhilper, R.: The digital twin: Realizing the cyberphysical production system for industry 4.0. Procedia CIRP 61, 335–340 (2017) 84. Birkel, H., Müller, J.M.: Potentials of industry 4.0 for supply chain management within the triple bottom line of sustainability—a systematic literature review. J. Clean. Prod. 289, 125612 (2020) 85. 
Verner, I., Cuperman, D., Gamer, S., Polishuk, A.: Digital twin of the robot baxter for learning practice in spatial manipulation tasks. In: Auer, M.E., Kalyan, R.B. (eds.) REV2019 2019. LNNS, vol. 80, pp. 81–92. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-231 62-0_9


86. Talkhestani, B.A., Jazdi, N., Schloegl, W., Weyrich, M.: Consistency check to synchronize the digital twin of manufacturing automation based on anchor points. Procedia CIRP 72, 159–164 (2018) 87. Ashtari Talkhestani, B., Jung, T., Lindemann, B., Sahlab, N., Jazdi, N., Schloegl, W., Weyrich, M.: An architecture of an intelligent digital twin in a cyber-physical production system. at-Automatisierungstechnik 67(9), 762–782 (2019) 88. Rasheed, A., San, O., Kvamsdal, T.: Digital twin: values, challenges and enablers from a modeling perspective. IEEE Access 8, 21980–22012 (2020)

Processing of the Time Series of Passenger Railway Transport in EU Countries Zdena Dobesova(B) Department of Geoinformatics, Palacký University, 17. Listopadu 50, 779 00 Olomouc, Czech Republic [email protected]

Abstract. This article describes utilising the Eurostat railway passenger transport data as a time series for lecturing in a university Data Mining course. The quarterly time series from 2004 to 2023 shows long-term increases or stable trends in passenger railway transport in European countries. A small decline after the economic crisis in 2008 is detected in the data. The highest decrease in passenger transport was in the second quarter of 2020 in all European countries, caused by the COVID-19 pandemic. The number of transported passengers increased after the pandemic years 2020 and 2021 but had not fully returned to pre-COVID levels by the first quarter of 2023. Calculating the growth rate allows comparison of the countries and of annual changes. The practical example shows that the decomposition of the time series into trend, seasonal and residual parts must be processed separately for the part before the pandemic. The data mining software Orange helps to create the processing and set parameters quickly, such as the size of the sliding window of the moving average used to calculate the trend of the time series, so students can concentrate on variants of processing and the correct interpretation of results. Orange was confirmed as appropriate software for teaching the Data Mining course.
Keywords: Eurostat · Data mining · Education · Orange · Visual workflow · Transport

1 Introduction

One of the topics of the Data Mining course for the Master’s degree in Geoinformatics and Cartography at Palacký University in Olomouc is Time Series Processing. The data forming a time series can be treated as normal input data, so basic statistics and exploratory data analysis (EDA) can be processed. Besides these, other analyses such as decomposition of the time series, calculation of the growth rate or prediction of the time series are specific to this type of data, which expresses evolution in time. Practical training on concrete data is an essential part of the course and helps students to understand the theoretical concepts of the individual methods. It is advantageous if it is possible to practice the discussed topics on real data. In the search for suitable current data for the subject of Data Mining and time series processing, the statistical database of
the European Union, Eurostat [1], was selected. A description of the Eurostat database and other useful sources for lecturing can be found in the book Spationomy [2]. The use of Eurostat data in the mentioned course was presented in previous articles. The Statistical Classification of Economic Activities NACE data was used to demonstrate EDA and apply some clustering methods in practical lecturing [3]. The use of European regional statistics of unit-level NUTS2 is presented in an article about lectures exploring spatial clustering and local correlation [4]. The pandemic of COVID-19 disease influenced the lives of people and their mobility, especially in 2020 and 2021. The reduction of transport concerned all types of transport, not only railway. An example of a personal change in mobility lifestyle is reported by three authors [5]. Some commuters changed their mode of transport from railway to private cars. International passengers dropped more (57.3%) than domestic passengers (33.6%) in Croatia in comparison to 2019 and 2020 years [6]. Xin et al. [7] examined the impact of COVID-19 on daily ridership on urban rail systems in 22 cities in Asia, Europe, and the United States. Chinese cities recorded the outbreak of COVID-19 earlier in January 2020 than European cities, where influence on urban railway systems is reported from March 2020 [7]. Eurostat reported that the dip in rail passenger transport performance in 2020 was particularly significant in the second and fourth quarters (−74% and −54%) compared with the same quarters in 2019 in the whole EU. Despite the slight recovery in 2021, EU rail passenger transport performance was still below the performance levels observed before the pandemic [8]. A detailed evaluation of changes in corridor railway traffic in the Czech Republic revealed that some passenger train connections were cancelled or the length of trains was shortened in 2020 [9]. In the practical examples demonstrated in this paper, the main goal was to present the influence of the COVID-19 pandemic on change and the partial decline of railway transport data in the European Union from 2020 to the first quarter of 2023 (last available data in time of writing article). The detailed description of growth rate, trend, decomposition of time series to trend, seasonal part and residuals bring a deeper understanding of changes in railway transport. The correct processing and interpretation of time series are emphasised. All tasks could be easily automated thanks to the data mining software Orange.

2 Data and Methods

The selection of data and software is crucial for practical lecturing. The next sections briefly explain the data source used, the Eurostat database, and practical examples in the Orange data mining software. Orange is the preferred software in the Data Mining course. It is freely downloadable and was developed for pedagogical purposes by its creators at the University of Ljubljana, Slovenia [10]. Thanks to the optional Time Series add-on, it is also applicable for processing temporal data [11].


2.1 Source Data from the Eurostat Database The Eurostat database [1] provides data on infrastructure, transport performance and transport accidents in the Transport section. The data are divided by mode of transport. In the rail transport section, one can find data on line equipment such as number of locomotives, length of lines by traction and speed. Transport performance is monitored regarding the volume of goods and passengers transported (Fig. 1).

Fig. 1. Transport section and subsection Rail transport in the Eurostat database interface.

In the presented example, only passenger transport data will be used. Data on passenger transport by rail have been available since 2004. The data are provided in quarterly totals (Q) for each country. Data for some periods and countries are missing; e.g. Switzerland provides data only from 2008, and Northern Macedonia from 2009. Two values are missing from 2010 for France. Also, the publication of the latest Eurostat data is delayed compared to the publication in national censuses, which contain more recent data but are often in a preliminary stage. The advantage is that the unique statistics have a fixed code designation in the Eurostat database. Unique code allows repeatedly searching and downloading the actual current data. The code designation of the statistics on passenger transport by rail in the Eurostat database is RAIL_PA_QUARTAL [12]. The Eurostat portal shows data in tabular form, graph form (line, bar) (Fig. 2), or a map of Europe. Line graph helps for quick familiarisation with tabular data. The deep decline in 2020 and 2021 during the COVID-19 pandemic attracts first. The selected countries in Fig. 2 have the most transported passengers in Europe (Germany, United Kingdom, France, Italy and Spain).

The United Kingdom stopped reporting data from Q4 2020. The selected data can be freely downloaded in Excel format for further processing.

Fig. 2. Visualisation of transported passengers for selected countries in line graph at the Eurostat portal.

2.2 Data Mining Software Orange and Design of Workflow

The processing steps are expressed in the data mining software Orange as a graphical workflow (Fig. 3). Each circular node in the workflow represents an operation on the data. The nodes from the Time Series add-on have a blue colour with an intuitive inner icon, and the label under each node describes the type of operation. In terms of comprehension of the processing steps, the visual form offers a high level of cognitive effectiveness [13], which means that the software is easy to use for operators or students. Each circular node opens a dialogue window for setting the operation parameters or displaying the resulting data. The leftmost orange File icon opens the source data in XLSX format. Next, the blue As Timeseries icon converts the input data to the time series format. The base node is a Line Chart displaying a line graph of the selected time series; regarding railway transport, the selection could be one or more countries. The Time Slice node allows selecting only a part of the time series. Next, the trend is calculated by the blue Moving Transform node [14] and the difference by the Difference node, followed by line chart nodes depicting the results. The Seasonal Adjustment node performs additive or multiplicative decomposition.

3 Results

The exploration of the source data about passenger railway transport started by depicting the values in a line chart (Fig. 4). Countries can be added interactively to the graph on the right side by selecting them in the left window. In that step, a user must be careful


Fig. 3. The workflow diagram for processing transport data by time series operations.

with the scale of the data. Germany has around 600 thousand passengers in one quarter of the year. Conversely, Slovenia has around 3 thousand passengers in one quarter of a year. Some countries’ differences are big, and the fluctuation could not be visible with the same scale on the Y-axis. The dialogue window can optionally open the next line chart with automatic Y-axis adjustment according to the data scale. The number of passengers has increased or stabilised in all European countries from 2004 to 2020. Some small declines occurred after the crisis in 2008 and in the next years (Poland, Czech Republic). It is possible to calculate the correlation of the time series of countries. The best way is to calculate correlation only for the part of the time series before the year 2020. There is a high correlation between the number of passengers of Denmark and Germany

(0.93), between Denmark and Norway (0.92), and between Germany and Norway (0.91); they are neighbouring countries. Calculating the correlation separately for 2020 to 2022 is better because the values are very dynamic. The resulting combinations of countries are slightly different, e.g. Germany and Hungary (0.95) and Germany and Poland (0.94) during the pandemic. The exact selection of a part of the time series is supported by the Time Slice node in the Orange software, which allows setting a time window to extract data for further processing easily. In the part before the pandemic, the time series show annual periodicity. It is visible in France or Spain (Fig. 4). The lowest values occur repeatedly in each year’s third quarter (Q3). The yearly decline is due to a drop in pupil, student and employee passengers during the summer holidays. The highest values are in the fourth quarter (Q4), caused by travelling for the Christmas and New Year holidays. However, the values of Q4 are comparable with those of Q1 and Q2.
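A scripted equivalent of the period-wise correlation described above might look like the following minimal sketch in pandas; the CSV file and the country-code columns are hypothetical stand-ins for the Eurostat data, and the split dates follow the pre-COVID and COVID periods used in the lecture.

```python
# Sketch of computing country-to-country correlations separately for the
# pre-pandemic and pandemic periods. File and column names are hypothetical.
import pandas as pd

df = pd.read_csv("rail_pa_quartal.csv", parse_dates=["quarter"], index_col="quarter")

pre_covid = df.loc[:"2019-12-31"]              # regular part of the series (2004-2019)
pandemic  = df.loc["2020-01-01":"2022-12-31"]  # extraordinary COVID-19 part

# Pairwise Pearson correlations for each period separately
print(pre_covid[["DE", "DK", "NO"]].corr().round(2))
print(pandemic[["DE", "HU", "PL"]].corr().round(2))
```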

Fig. 4. Visualisation of selected time series in Line Chart in Orange software.

The basic additive decomposition splits a time series into trend, seasonal part and residuals [15]. The trend can be calculated as a moving average, where the average is computed over a sliding window. The blue Moving Transform node allows the user to set the width of the sliding window. For quarterly data, an odd number such as five or nine is suitable. Figure 5 shows trends with the value nine for the moving average of the number of passengers in France and Spain. Users can explore the result while changing the window width and very quickly select the best setting. The trend before the pandemic years 2020–2022 is calculated correctly. The trend is biased from 2020 to 2021: the sharp decline is reported with a delay after 2020, and the increasing trend is partially visible by the end of 2022. The global minimum transport value was in the second quarter (Q2) of 2020 for all European countries [8]. There were state emergencies, and governments applied lockdowns in European countries, which reduced the mobility of people to a minimum. Only some countries were nearly back to the pre-COVID values of transported passengers in 2022 (Figs. 2, 4 and 5). For a precise analysis of changes, the calculation of the annual growth rate, which is the ratio of the values of the same quarters in two consecutive years, is well suited. The quarterly annual growth rate can be calculated as k = y_t / y_(t-4), where t is the corresponding quarter [16, 17].
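Outside Orange, the two calculations just described, the moving-average trend and the annual growth rate k = y_t / y_(t-4), could be sketched in pandas roughly as follows; the CSV file name and the country column are hypothetical stand-ins for the downloaded RAIL_PA_QUARTAL data.

```python
# Sketch of the Moving Transform (trend) and Difference (growth rate) steps in
# pandas. File and column names are hypothetical; the index is assumed to hold
# parsed quarterly dates.
import pandas as pd

df = pd.read_csv("rail_pa_quartal.csv", parse_dates=["quarter"], index_col="quarter")
s = df["DE"]                                      # quarterly passengers, e.g. Germany

trend = s.rolling(window=9, center=True).mean()   # centred moving average over nine quarters
growth_rate = s / s.shift(4)                      # same quarter of the previous year, k = y_t / y_(t-4)

print(trend.dropna().tail())
print(growth_rate.loc["2020":"2021"].round(2))    # values well below 1 in Q2/2020
```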


Fig. 5. Time series with trend calculated like moving average for Germany and France.

The blue Difference node in the Orange software automatically calculates the growth rate with the option change quotient. The development of the annual growth rate from 2004 to 2021 is shown in Fig. 6 for France and Germany. The growth rate (index) figures are relative values. When the value is greater than one, there is an increase in the number of passengers compared to the corresponding quarter of the previous year. A value of less than one indicates a decrease in passenger numbers or performance. The growth rate is the proper metric for comparing the decline between countries in the pandemic year 2020. Figure 6 compares France and Germany: France shows a deeper decline, dropping to only 23% of transported passengers in Q2/2020 compared to Q2/2019, while Germany declined to 40% in Q2/2020 compared to Q2/2019. The very low values of 2020 caused the high growth rates in Q2/2021 (France 241% and Germany 133%). It is also visible that the growth rates are back near one in 2022.

Fig. 6. Graph with Growth Rate for France and Germany.

The Seasonal Adjustment node sets the decomposition model to additive or multiplicative (Fig. 7). The additive model was selected for the transport data. As a result, the trend, seasonal and residual parts are calculated (Fig. 8).


The residuals are smaller than the values of the periodic part. In the case of these data, it is important to calculate the decomposition only for the time before 2020 (set by the Time Slice node). The data from 2020 and 2021 influence the calculation of the seasonal (periodic) part and the residuals: the periodic pattern is not present in the pandemic years, and the residuals are detected as very high. When the decomposition is calculated on the whole time series including the pandemic years, a wrong decomposition results (Fig. 9); it is visible that the absolute values of the residuals are nearly the same as the values of the seasonal (periodic) part. The calculated values can be shown in the Table node and stored with the Save Data node in Excel format outside the Orange software. The output data can be used for preparing better graphs in MS Excel or graphical software (set colours, custom scales, axis labels) on user demand.
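The additive decomposition performed by the Seasonal Adjustment node can also be sketched outside Orange, for example with statsmodels; the sketch below assumes a hypothetical file and column name and restricts the series to the pre-pandemic part, as recommended above.

```python
# Sketch of an additive decomposition of the pre-pandemic part of the series.
# File and column names are hypothetical; statsmodels is assumed to be installed.
import pandas as pd
from statsmodels.tsa.seasonal import seasonal_decompose

df = pd.read_csv("rail_pa_quartal.csv", parse_dates=["quarter"], index_col="quarter")
pre_covid = df.loc[:"2019-12-31", "FR"]             # France, 2004-2019 only

result = seasonal_decompose(pre_covid, model="additive", period=4)  # 4 quarters per year
print(result.seasonal.head(4))                      # the repeating quarterly pattern
print(result.resid.dropna().abs().mean())           # residuals should stay small pre-2020
```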

Fig. 7. Dialogue Time Slice and dialogue Seasonal Adjustment

4 Discussion

Processing data with the prepared nodes of the Time Series add-on in the Orange software is easy for students. The design or extension of a workflow is quick, and there is no problem in preparing variants and comparing them immediately. The presented example of railway transport is very illustrative: the big influence of the lockdowns on mobility and the use of railway transport is evident. When processing the data, e.g. calculating correlations or decomposing the series, it is better to evaluate the part before the COVID-19 pandemic separately. Students could receive wrong results when considering the whole time series with a regular part (2004–2019) and an extraordinary COVID-19 part (2020–2023). A correlation calculated only for the COVID-19 years can be interpreted as the same changes in transported passengers in two countries; it brings information about the same behaviour of passengers during the pandemic in different countries that may not be neighbours. Such complicated data give students useful experience and train them in interpretation.


Fig. 8. Graph with decomposition into trend, seasonal and residual parts for France from 2004 to 2019

Fig. 9. Graph of seasonal and residual parts for transport in France in the case of decomposition of the whole time series including pandemic data

The topic of time series prediction is not practised on these data in the lectures because the presented data are not suitable for prediction. There is an assumption that passenger railway transport will return to near pre-COVID values in the future, but it is only a hypothesis formulated from knowledge of the domain. An automatic prediction calculated from the trend and periodic components would not be reliable.

5 Conclusion

The practical example highlights the use of the Eurostat database, particularly the time series of passenger railway transport, for lecturing. The part of the time series from 2004 to 2019 can be used for the task of decomposing the time series into trend, periodical part and residual part. The source data also show the strong influence of the pandemic restrictions on the decline in railway transport in 2020 and 2021.

Also, the correlation between the volumes of transported passengers in different countries is better calculated separately for the pre-COVID and COVID periods. Calculating the growth rate between the same quarters of consecutive years is suitable for evaluating the whole time series; the railway transport of European countries can then be compared in percentage terms. Students designed the processing workflow in the Orange software in graphical form. The design was quick and easy for the presented example. Orange software suits education: it has a high cognitive effectiveness of visual notation, and students can concentrate on a correct interpretation of the received results.

Acknowledgement. This article was created with the support of the Erasmus+ Programme of the European Union, Jean Monnet Module (Project No. 620791-EPP-1-2020-1-CZ-EPPJMOMODULE, UrbanDM - Data mining and analysing of urban structures as a contribution to European Union studies).

References 1. European Commission: EUROSTAT. https://ec.europa.eu/eurostat/data/database. Accessed 20 Aug 2023 2. Pászto, V., Redecker, A., Mack˚u, K., Jürgens, C., Moos, N.: Data sources. In: Pászto, V., Jürgens, C., Tominc, P., Burian, J. (eds.) Spationomy, pp. 3–38. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-26626-4_1 3. Masopust, J., Dobesova, Z., Mack˚u, K.: Utilisation of EU employment data in lecturing data mining course. In: Silhavy, R. (ed.) CSOC 2021. LNNS, vol. 229, pp. 601–616. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-77445-5_55 4. Dobešová, Z., Mack˚u, K., Kuˇcera, M.: Výuka geoinformatických pˇredmˇet˚u na pˇríkladech dat Evropské unie. In: Sympozium GIS Ostrava, Ostrava (2022). https://doi.org/10.31490/978 8024846071-153 5. Pászto, V., Burian, J., Mack˚u, K.: Changing mobility lifestyle: a case study on the impact of COVID-19 using personal Google locations data. Int. J. E-Plann. Res. 10, 66–79 (2021). https://doi.org/10.4018/IJEPR.20210401.oa6 6. Jugovi´c, A., Aksentijevi´c, D., Budi´c, T., Oblak, R.: Impact of COVID-19 pandemic on passenger mobility in national and international railway traffic of the Republic of Croatia. Pomorstvo 36, 147–154 (2022). https://doi.org/10.31217/p.36.1.17 7. Xin, M., Shalaby, A., Feng, S., Zhao, H.: Impacts of COVID-19 on urban rail transit ridership using the Synthetic Control Method. Transp. Policy 111, 1–16 (2021). https://doi.org/10.1016/ j.tranpol.2021.07.006 8. Eurostat: EU rail passenger transport: partial recovery in 2021. https://europa.eu/!JNN67v 9. Kucera, M., Dobesova, Z.: Evaluation of changes in corridor railway traffic in the Czech Republic during the pandemic year 2020. Geogr. CASSOVIENSIS 17, 37–51 (2023). https:// doi.org/10.33542/GC2023-1-03 10. University Of Ljubljana: Orange, Data Mining Fruitful and Fun. https://orangedatamining. com/. Accessed 01 July 2023 11. Data Mining Orange: Plotting Covid-19 Data in Time. https://www.youtube.com/watch?v= HVAsG3T3SNI 12. Eurostat: Passengers transported (detailed reporting only) - (quarterly data). https://ec.europa. eu/eurostat/databrowser/product/page/RAIL_PA_QUARTAL. Accessed 10 Dec 2022


13. Moody, D.L.: The “physics” of notations: a scientific approach to designing visual notations in software engineering. In: Proceedings - International Conference on Software Engineering, Cape Town, South Africa, pp. 485–486. ACM (2010). https://doi.org/10.1145/1810295.181 0442 14. Data Mining Orange: Time Series - Moving Transform. https://orangedatamining.com/wid get-catalog/time-series/moving_transform_w/ 15. Hyndman, R.J., Athanasopoulos, G.: Forecasting: Principles and Practice. Monash University, Australia (2018) 16. Hanˇclová, J., Tvrdý, L.: Introduction to the time series analysis. Economic Faculty, VŠB-TU, Ostrava (2003) 17. Kˇrivý, I.: Analysis of time series. University of Ostrava, Ostrava (2012)

Factors Influencing Performance Evaluation of e-Government Diffusion Mkhonto Mkhonto(B) and Tranos Zuva Department of Information and Communication Technology, Vaal University of Technology, Gauteng, South Africa [email protected], [email protected]

Abstract. E-Government has been identified as a tool to allow greater public access to information and to make government more accountable to citizens. The e-Government concept has been in existence for some time, and research on the diffusion of e-Government has largely been limited to industry, with little attention to the organs of government in South Africa. This paper endeavors to investigate the factors that influence the performance evaluation of e-Government diffusion in the organs of government, particularly in municipalities. A quantitative approach was used in this research; a model was proposed, and a survey was administered to a sample of 100 South African citizens in three selected municipalities. Results show that the rate of diffusion of e-Government in these municipalities is seventy-five percent (75%). This percentage suggests that strides have been taken to embrace Information and Communication Technology (ICT) in South Africa. It also indicates that Task Characteristics, Task Technology Fit, Technology Characteristics, Individual Characteristics, Collaboration, Social Norms, Utilization, IT Infrastructure, Viability, Economic, and Organizational Performance all contribute positively to the diffusion of e-Government in South Africa’s municipalities. The e-Government Diffusion Assessment Model was developed for organs of government. This research helps in understanding the diffusion of e-Government in the organs of government in South Africa.
Keywords: e-Government · Diffusion · Organs of Government

1 Introduction

Many organisations have introduced e-Government, and the success of this technology is paramount to fulfilling their objectives. Conversely, citizens’ diffusion of e-Government services has been less than satisfactory in most municipalities in South Africa, while studies by researchers continue to outline the most salient diffusion constructs, as well as various models for understanding diffusion. According to [1], diffusion is “the process by which an innovation is communicated through certain channels over time among the members of a social system”, where the ‘innovation’ can be anything that is seen as new from the perspective of the adopters. This is the most commonly quoted theory in the field of diffusion of innovation and is based on four main elements of diffusion: innovation, communication channels, time, and the social system.
The study is arranged as follows: Sect. 2 provides the related work, Sect. 3 the research model and hypotheses, Sect. 4 the methodology, Sect. 5 the results, Sect. 6 the data analysis and discussion, and Sect. 7 the study limitations, followed by the conclusion.

2 Related Work

Various researchers have offered different definitions to explain the concept of e-Government [2]. However, these definitions differ according to the varying e-Government focus and are usually centred on technology, business, citizen, government, process, or a functional perspective [2–4]. The definition considered most suitable for the purpose of this paper is one that defines e-Government as the “utilisation of ICTs to deliver information and public services to the people” [5]. In addition, this working definition is aligned with the definition provided by [6], who viewed e-Government as the capability of various government agencies to provide government information and services to citizens at any time using electronic means, speedily and properly, resulting in less cost and effort via a single internet site. E-Government aims to provide businesses and citizens with more convenient access to timely and relevant government information and efficient services. It also reduces or eliminates the institutional fragmentation of public administration because citizens and/or businesses are given a chance to access public services from a single point source [7]. Furthermore, e-Government can be described as the use of any type of information and communication technology to improve services and operations provided to different parties such as citizens, businesses, and other government agencies [8]. E-Government system implementations all over the world are employed in an attempt to utilise ICT to improve government services provided to a range of stakeholders. In doing so, governments aim to become more accessible, effective, efficient and accountable to their citizens [9].

3 Research Model and Hypothesis

Notable technology models have been used to study and explain the diffusion of various technologies. By drawing on these technology diffusion models, an e-Government Diffusion Assessment Model (EDAM) is developed. The hypotheses listed in Table 1 were proposed and are translated into the conceptual framework in Fig. 1.


Table 1. Proposed Research Hypothesis

No    Proposed Hypothesis
H1    Task Characteristics has a positive influence on Task Technology Fit
H2    Technology Characteristics has a positive influence on Task Technology Fit
H3    Individual Characteristics has a positive influence on Task Technology Fit
H4    Collaboration has a positive influence on Task Technology Fit
H5    Social Norms has a positive influence on Utilization
H6    Utilization has a positive influence on Organizational Performance
H7    Task-Technology-Fit has a positive influence on Utilization
H8    IT Infrastructure has a positive influence on Viability
H9    Economic has a positive influence on Viability
H10   Organization has a positive influence on Viability
H11   Viability has a positive influence on Organizational Performance
H12   Task-Technology-Fit has a positive influence on Organizational Performance

Fig. 1. Conceptual framework for e-Government Diffusion Assessment Model (EDAM)


4 Research Methodology

A deductive approach using the quantitative research strategy was adopted for the study. Descriptive statistics were used for the data analysis.

4.1 Data Collection

The survey was administered through a questionnaire targeted at South African citizens in the three municipalities. Questionnaires were distributed to 100 respondents, filled in, and returned. The questionnaire allowed all the respondents to respond in their own free time, which could be the reason why all 100 respondents completed the forms, making the sample more representative. The questionnaire also saved the researcher time and money in terms of travelling distance, and the type of data collected using a questionnaire was easy to tabulate and analyze.

4.2 Questionnaire Design

Each item in the model had a corresponding set of questions. The questionnaire was composed of forty-three unambiguous questions that were easy for respondents to complete. Each item on the questionnaire was measured on a seven-point Likert scale whose end points were ‘strongly agree’ (7) and ‘strongly disagree’ (1).

5 Results
The results obtained using SPSS are discussed below.

5.1 Descriptive Statistics
The descriptive statistics in Table 2 present the demographic characteristics of the participants, covering gender, age group and level of education. Most of the respondents were male, accounting for 61 percent of the responses, against 39 percent for females; this unequal response rate suggests that more females need to be encouraged to adopt and use technology. The ages of participants ranged from 17 to 64: 19 percent were between 17 and 25 years, 39 percent between 25 and 34, 25 percent between 35 and 44, 16 percent between 45 and 54, and only 1 percent between 55 and 64. Most of the participants were therefore within the most productive age group. In terms of education, 26 percent of participants held a diploma and 74 percent had completed school at Grade 12. This shows that all the participants were able to read and write, and a significant number had solid knowledge to answer the questionnaire, suggesting they can adopt and use e-Government with ease.


Table 2. Demographic attributes of respondents' characteristics

Demographic | Category | Frequency | Percent
Gender | Male | 61 | 61.0%
Gender | Female | 39 | 39.0%
Age | 17–25 | 19 | 19.0%
Age | 25–34 | 39 | 39.0%
Age | 35–44 | 25 | 25.0%
Age | 45–54 | 16 | 16.0%
Age | 55–64 | 1 | 1.0%
Education Level | Grade 12 | 74 | 74.0%
Education Level | Diploma | 26 | 26.0%

Table 3. Reliability Values

Construct | Cronbach's alpha | Number of items
Task Characteristics (TAS) | 0.705 | 3
Technology Characteristics (TEC) | 0.768 | 4
Individual Characteristics (IND) | 0.774 | 3
Collaboration (COL) | 0.767 | 4
Social Norms (SOCN) | 0.712 | 4
Utilisation (UTIL) | 0.766 | 4
Task Technology Fit (TTF) | 0.728 | 4
IT Infrastructure (ITINFR) | 0.953 | 5
Economic (ECO) | 0.786 | 3
Organization (ORG) | 0.816 | 2
Viability (VIA) | 0.703 | 1
Organizational Performance (ORG PER) | 0.834 | 6


5.2 Reliability Test
A reliability test was done to confirm whether each factor of the model is reliable and valid. Cronbach's alpha (α) was used to assess the reliability of the scales for each of the constructs in this study. All the constructs in the questionnaire had Cronbach's alpha values above 0.7, as shown in Table 3. Generally, an alpha value above 0.7 is considered acceptable [10].

5.3 Factor Analysis
Factor analysis was performed. The Kaiser-Meyer-Olkin (KMO) measure of sampling adequacy was 0.746 and Bartlett's test of sphericity was significant (p < 0.001), as shown in Table 4, indicating that factor analysis is appropriate [11]. The analysis confirmed that all the remaining items loaded within their respective constructs, as shown in Table 5. [12] shows that factor loadings of 0.55 and higher are significant; the loadings found in this research were between 0.6 and 0.8. The final factor analysis produced a structure of twelve factors, which is consistent with our conceptual model. The following section uses the final factor structure to carry out the regression analysis.

5.4 Regression Analysis
The twelve factors from the study were subjected to linear regression analysis to estimate the factors influencing e-Government diffusion. The coefficients and significance values for each individual path obtained from the regression analysis were: TAS-TTF (β = 0.361, p = 0.002); TECH-TTF (β = 0.310, p = 0.041); IND-TTF (β = 0.029, p = 0.000); COL-TTF (β = 0.163, p = 0.023); SON-UTIL (β = 0.745, p = 0.000); UTIL-TTF (β = 0.456, p = 0.001); TTF-UTIL (β = 0.462, p = 0.003); ITINFR-VIA (β = 0.237, p = 0.003); ECO-VIA (β = 0.319, p = 0.004); ORG-VIA (β = 0.387, p = 0.017); VIA-ORGPER (β = 0.280, p = 0.008) and TTF-ORGPER (β = 0.389, p = 0.000). Since p < 0.05 in all cases, all paths are statistically significant (see Table 6).

Table 4. KMO and Bartlett's Test

Kaiser-Meyer-Olkin Measure of Sampling Adequacy | 0.746
Bartlett's Test of Sphericity | Approx. Chi-Square | 3945.645
Bartlett's Test of Sphericity | df | 946
Bartlett's Test of Sphericity | Sig. | 0.000
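As an illustration of how the construct reliabilities in Table 3 could be computed from the raw questionnaire responses, the short Python sketch below applies the standard Cronbach's alpha formula to a hypothetical four-item construct measured on the seven-point scale. The data and column names are placeholders; the study's actual analysis was carried out in SPSS.

```python
import numpy as np
import pandas as pd

def cronbach_alpha(items: pd.DataFrame) -> float:
    """Cronbach's alpha for one construct (rows = respondents, columns = items)."""
    items = items.dropna()
    k = items.shape[1]                               # number of items in the construct
    item_variances = items.var(axis=0, ddof=1)       # variance of each individual item
    total_variance = items.sum(axis=1).var(ddof=1)   # variance of the summed scale score
    return (k / (k - 1)) * (1 - item_variances.sum() / total_variance)

# Placeholder data: 100 respondents answering a 4-item construct on a 7-point scale
rng = np.random.default_rng(0)
responses = pd.DataFrame(rng.integers(1, 8, size=(100, 4)),
                         columns=["item1", "item2", "item3", "item4"])

alpha = cronbach_alpha(responses)
print(f"Cronbach's alpha: {alpha:.3f}")  # values above 0.7 are considered acceptable
```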


Table 5. EDAM Factor Analysis Loadings

Factor Variables | Loadings on Factors 1 onwards (listed in factor order; the final value in each row is the loading on the variable's own factor)
TAS | 0.825
TEC | 0.164, 0.642
IND | 0.105, 0.013, 0.713
COL | 0.310, 0.034, 0.251, 0.862
SOCN | 0.027, 0.048, 0.342, 0.027, 0.746
UTI | 0.008, 0.210, 0.033, 0.103, 0.335, 0.693
TTF | 0.006, 0.035, 0.027, 0.019, 0.420, 0.218, 0.796
IT INFR | 0.138, 0.016, 0.103, 0.136, 0.014, 0.000, 0.321, 0.691
ECO | 0.037, 0.043, 0.049, 0.397, 0.107, 0.274, 0.006, 0.100, 0.804
ORG | 0.038, 0.036, 0.192, 0.040, 0.198, 0.172, 0.016, 0.012, 0.057, 0.825
VIA | 0.337, 0.023, 0.171, 0.024, 0.156, 0.035, 0.127, 0.304, 0.327, 0.039, 0.689
ORG PER | 0.014, 0.017, 0.102, 0.153, 0.323, 0.177, 0.373, 0.041, 0.072, 0.158, 0.382, 0.730

6 Data Analysis and Discussion
The descriptive statistics in Table 2 show that the gender of the citizens surveyed is somewhat skewed towards males, although the male and female groups are reasonably balanced overall. This could be because male citizens tend to take more interest in technology. The age of the citizens who participated in the survey skews towards the 25–34 group, which appears to be the typical age of citizens accepting technology in South African municipalities. The empirical testing of the proposed conceptual EDAM model resulted in the final EDAM presented in Fig. 2. The results show that all the hypothesized relationships between the variables were supported, as presented in Table 6 below.


Table 6. Regression test results

Hypothesis No | Variable | Proposed Hypothesis | Result
H1 | Task Characteristics | Task Characteristics has a positive influence on Task Technology Fit | Supported
H2 | Technology Characteristics | Technology Characteristics has a positive influence on Task Technology Fit | Supported
H3 | Individual Characteristics | Individual Characteristics has a positive influence on Task Technology Fit | Supported
H4 | Collaboration | Collaboration has a positive influence on Task Technology Fit | Supported
H5 | Social Norms | Social Norms has a positive influence on Utilization | Supported
H6 | Utilization | Utilization has a positive influence on Organizational Performance | Supported
H7 | Task-Technology-Fit | Task Technology Fit has a positive influence on Utilization | Supported
H8 | IT Infrastructure | IT Infrastructure has a positive influence on Viability | Supported
H9 | Economic | Economic has a positive influence on Viability | Supported
H10 | Organization | Organization has a positive influence on Viability | Supported
H11 | Viability | Viability has a positive influence on Organizational Performance | Supported
H12 | Task Technology Fit | Task-Technology-Fit has a positive influence on Organizational Performance | Supported


Fig. 2. Final e-Government Diffusion Assessment Model (EDAM)

7 Limitations
The main limitation of this study concerns the size and coverage of the sample: it did not include all the provinces in South Africa, as data was collected from only two provinces. However, this limitation does not have much effect because the data was collected from the biggest municipalities in South Africa.

8 Conclusion
The study set out to determine the factors that influence the diffusion of e-Government for service delivery. Regression analysis was used to test the hypotheses, and the results indicated that TAS, TEC, IND, COL, SOCN, UTI, TTF, IT INFR, ECO, ORG, VIA and ORGPER influence the diffusion of e-Government. The study therefore recommends that when organisations diffuse any technology, these factors should be measured, and that organs of government consider the e-Government Diffusion Assessment Model (EDAM) when introducing e-Government applications.

Acknowledgment. I wish to express my sincere gratitude and appreciation to my co-author, Prof Tranos Zuva, whose input helped us to produce a paper of good standard.


References
1. Rogers, E.M.: Diffusion of Innovations, 4th edn. The Free Press, New York (1995)
2. Seifert, J., Petersen, E.: The promise of all things E? Expectations and challenges of emergent e-Government. Perspect. Global Dev. Technol. 1(2), 193–213 (2002)
3. Irani, Z., Al-Sebie, M., Elliman, T.: Transaction stage of e-government systems: identification of its location & importance. In: Proceedings of the 39th Hawaii International Conference on System Science. IEEE Computer Society, Piscataway, NJ (2006)
4. Weerakkody, V., Dhillon, G.: Moving from e-government to t-government: a study of process reengineering challenges in a UK local authority context. Int. J. Electr. Gov. Res. 4(4), 1–16 (2008)
5. United Nations: UN E-Government Survey 2018. New York (2018)
6. Odat, A.M.: E-Government in developing countries: framework of challenges and opportunities. In: International Conference for the Internet Technology and Secured Transactions, pp. 578–582. IEEE (2012)
7. Shareef, A.M., Kumar, V., Kumar, U., Dwivedi, Y.K.: E-Government adoption model (GAM): differing service maturity levels. Gov. Inf. Q. 28, 17–35 (2011)
8. Al-Jaghoub, S., Al-Yaseen, H., Al-Hourani, M.: Evaluation of awareness and acceptability of using E-Government services in developing countries: the case of Jordan. Electr. J. Inf. Syst. Eval. 13(1), 8 (2010)
9. Shareef, S.: E-government stage model: based on citizen-centric approach in regional government in developing countries. Int. J. Electron. Commer. Stud. 3, 145–164 (2012)
10. Memon, A.H., Rahman, I.A.: SEM-PLS analysis of inhibiting factors of cost performance for large construction projects in Malaysia: perspective of clients and consultants. Sci. World J. 2014, 1–9 (2014). https://doi.org/10.1155/2014/165158
11. Brosius, F.: SPSS 12 (1. Aufl). Mitp, Bonn (2004)
12. Anderson, R.E., Babin, B.J., Black, W.C., Hair, J.F.: Multivariate Data Analysis, 7th edn. Pearson, United States (2010)

An Analysis of Readiness for the Adoption of Augmented and Virtual Reality in the South African Schooling System

Nellylyn Moyo, Anneke Harmse, and Tranos Zuva

ICT Department, Vaal University of Technology, Vanderbijlpark, Gauteng, South Africa
[email protected]

Abstract. Augmented Reality (AR) and Virtual Reality (VR) have emerged as useful tools to enhance learner and instructor performance in the learning process. E-learning researchers have noted that these technologies motivate students and make learning enjoyable. Despite the pronounced benefits of AR and VR in schools, the adoption of these tools in the education sector has been slow. For users to gain the most benefit from these technologies, it is best to measure readiness for adoption before utilizing them. This study investigated the level of readiness of South African schools to adopt AR and VR. A quantitative method was used: a readiness model was proposed and data was collected using questionnaires. Three hundred and fifteen (315) useable questionnaires were used for the analysis, out of a total of 325 received. Reliability and validity requirements were met using Cronbach's alpha and regression. The model extracted a KMO of .723, which is greater than 0.5, the lowest acceptable score, and with a p-value of 0.00 this illustrates that factor analysis is suitable. The regression results show that all ten hypothesis tests indicated a positive influence between the variables tested. Mean averages of all variables are above 3.9, indicating a high level of readiness to adopt AR and VR in schools in Gauteng. The model can be used to test readiness for adoption in other areas in SA to enable a more structured and systematic approach to e-learning implementation, increasing the likelihood of success in leveraging technology for educational purposes.

Keywords: Augmented reality · adoption · virtual reality · readiness · education · learners · technology · e-learning

1 Introduction
1.1 Background of the Study
E-learning has transformed training and education. Digital education platforms have ushered in a new way of learning that assists students in comprehending issues even better [1]. Mosa, Naz'ri bin Mahrin [2] assert that the use of technologies in education enables easy and universal access to education. Bullen [3] asserted that there is no consensual definition of the term e-learning. Since the 1960s, e-learning has been adopted in many sectors including business, education, training and the military [4].


Depending on the industry, it can be defined differently as software-based learning or online learning [5]. Asomah, Agyei [6] indicated that e-learning is all about integrating technology and using it to improve and transform teaching and learning. Augmented reality (AR) is a means to merge the actual and virtual worlds by overlaying digital data on views of the real world [7]. AR can be viewed as a rich illustration of a physical context with computer-simulated elements, for instance sound, video, graphics, or a global positioning system (GPS) [8]. Yilmaz [9] believes AR is an approach that puts together real and simulated objects to create innovative, communicative scenography. Virtual Reality (VR) is described by Choi, Hawkes [10] as the process by which a computer and related peripherals substitute sensations of sight, sound, or touch that mimic the experiences of the physical surroundings, creating a sense of presence within an artificial or virtual world [11]. As the e-learning revolution moves forward, different institutions are adopting two cutting-edge technologies, namely AR and VR [12]. However, the adoption of these technologies in e-learning has faced significant challenges in South Africa and Africa in general. This research seeks to measure the level of readiness for AR and VR in South Africa. Albarrak [13] argues that preparedness is a crucial determinant in the effective integration of technology within educational contexts. With the swift progress of technology, augmented reality (AR) and virtual reality (VR) have surfaced as significant technological advancements in the 21st century [14, 15]. AR combines the actual world with virtual elements, overlaying 3D objects onto the real environment, and provides users with a concurrent means to engage with both the physical and virtual realms [16]. According to Yilmaz [9], AR combines real objects and virtual objects to create a novel communicative scenography. The ability to present virtual objects and the real world simultaneously makes it a good candidate for education and training purposes. In 1997 David Hawk predicted that AR would be embraced and used in many sectors [17]. The rise of mobile devices has produced a surge in mobile AR, which provides a cheaper way to apply AR on mobile devices just as it had been applied on desktops in the preceding decades. This surge in AR applications has been made possible by the increasing processing power of computer systems and advances in computer graphics. Lee [18] asserts that AR for mobile devices shows a lot of promise as far as training and teaching are concerned. Furthermore, Cabero and Barroso [19] assert that AR is set to penetrate educational centres, including universities, by 2021. El Sayed, Zayed [20] point to another technology currently at centre stage: Virtual Reality (VR), which was born in 1965 out of Ivan Sutherland's vision of computer displays as mere windows into a virtual world that is as real as possible. Choi, Hawkes [10] define VR as the process by which a computer and associated peripherals substitute the sensory impetuses of visualisation, sound, or touch that would be provided by the natural setting, to provide an impression of being in a non-natural or virtual world [11].
In the late 90s, researchers asserted that Virtual Reality (VR) was going to play a major part in our lives. They predicted the future from the fast-changing advances in computer graphics and animation, especially in the entertainment industry, which has produced realistic virtual worlds [21].


Virtual Reality (VR) presents an interesting platform to teach and educate concepts, and different industries have embraced it for training and teaching purposes. This research investigated the level of readiness to adopt AR and VR in the education sector in South African schools. The results of the research provide researchers and stakeholders in the education sector with an understanding of AR and VR as an e-learning platform and offer recommendations on how South Africa can fully utilise these cutting-edge technologies. Furthermore, the framework proposed in this study can be used to evaluate other sectors for technological readiness. According to Mdlongwa [22], education faces challenges due to the use of traditional teaching methods that have not changed in many years. As a result, the failure to use ICT to enhance teaching and learning has made it hard to close the digital divide in South Africa [23]. South Africa currently does not produce the required number of teachers, and rural schools struggle to retain qualified teachers; qualified teachers choose to migrate to teach elsewhere, which leads to under-qualified teachers being hired [24]. This is a challenge faced by other developing countries as well [25]. Despite the benefits offered by emerging technologies, institutions have limited themselves to a few familiar technologies such as Twitter and Facebook [26]. It has further been noted that HEIs did not understand the potential and benefits of using emerging technologies, and it is therefore important to research and explore emerging technologies that may transform teaching and learning practice [27]. Large school sizes make it hard to support learners with special needs, which stresses teachers and leads to a lack of proper relationships between teachers and parents [28]. E-learning platforms have transformed the learning process, helping both learners and instructors in South Africa. The South African education ICT policy encourages the adoption of e-learning platforms in South African schools [29]. Adu [30] argues that the use of e-learning has essential potential for educational development in South Africa. As e-learning platforms are being embraced, AR and VR have emerged as useful tools to enhance learner and instructor performance in the learning process. E-learning researchers have mentioned that these technologies motivate students and make learning enjoyable [31]. Even though AR and VR can help a lot with online learning, schools have been slow to adopt these technologies. For users to gain the most benefit from these technologies, it is best to adopt them when they are ready. Therefore, this research assessed, within the South African context, the readiness of schools for implementing AR and VR technology.

2 Related Studies
The model by Akaslan and Law [32] distinguishes technology, content, people, culture and institution as the key readiness requirements. If these requirements are fulfilled, users acknowledge the framework in the sense that they consider it valuable and useful; once acknowledged, users can be prepared for the innovation and utilise it. In this model, technology includes both hardware and software. This framework was also adopted in research evaluating e-learning readiness in Kenya [33]. That research used a survey to assess e-learning readiness and concluded that teachers and students are ready to embrace digital platforms, but that limited technical aptitude hinders them from using the platforms.


Another model for readiness, by Yang, Sun [34], uses a revamped Unified Theory of Acceptance and Use of Technology framework with a tripod readiness model. The framework suggests that being ready as an organization, having the right environment, and having the necessary technology are the main factors that can limit how ready one is to use a technology. Although this framework was used to measure readiness for cloud services, it can be applied to any technological readiness study, including e-learning. Research evaluating readiness for training entrepreneurship using online digital platforms proposed the framework depicted in the Krishnan and Hussin [35] model, which adds cultural and financial readiness to the other mentioned models. Another study evaluated the readiness of nursing students for e-learning at the Durban University of Technology [36]. That study utilised a revamped Chapnick readiness score to evaluate psychological, equipment and technological readiness for e-learning, and found that students are quite prepared when it comes to technology; the main obstacles relate to having the right equipment and technology available. The constraints in measuring technological readiness are technology, people, culture, and institutions. For technology, the main focus is the technical infrastructure required to implement e-learning [37]; it is a key success factor in assuring that an organisation is prepared to undertake technology solutions, and it has two components, viz. hardware and software. Since e-learning is implemented by people, they also form a critical factor in measuring readiness. The integration of e-learning could potentially be influenced by the relevant skills, prior experiences, confidence levels, and attitudes of the individuals involved, including learners, researchers, lecturers, administrators, and policymakers; a positive attitude towards technology is needed to make any education programme a success. The content on the e-learning platform is also an essential factor: if institutions are ready to prepare content for digital platforms, then e-learning can be adopted [32]. Therefore, for successful e-learning, institutions should create environments that are conducive to digital tools in learning. The named constraints mostly depend on the institution's ability to realise the need for learning tools and its desire to implement them by putting in place the necessary legislation and structures to support the concept of e-learning [37]. Good user experience and efficiency are important in the adoption of technological systems: when users believe that a technology will help them do their job efficiently and easily, they tend to adopt it [38]. Perceived ease of use and perceived usefulness focus on these attributes [39]. As the technology revolution takes centre stage, researchers have scrambled to study technology adoption and readiness, creating frameworks and theories for this purpose, and have discovered varying factors and constraints affecting the adoption of e-learning platforms around the world. This section discussed studies that have focused on readiness for the adoption of e-learning generally.


3 Methodology
3.1 Research Methodology
The research utilized the honeycomb model as a framework to develop an effective approach for addressing the problem statement and research question and achieving the research objectives. Within the honeycomb model, six primary elements, namely (1) research philosophy, (2) research approach, (3) research strategy, (4) research design, (5) data collection and (6) data analysis techniques, come together to form the research methodology. This study followed a positivist epistemology, as the researcher intended to be detached from the research, have minimal interaction with the respondents, and objectively use facts to conduct the study. Moreover, because this research is scientific and quantitative in nature, the researcher preferred positivist epistemology to produce results that would be reliable, generalisable and representative [40]. The study followed an objectivist ontology, as the researcher's view is that the true value of knowledge exists and will not be biased. The researcher therefore ignored prior knowledge and social views when conducting this research, focusing only on the control of variables and the interaction between variables in order to report findings on the cause-effect relationships witnessed [41]. The deductive approach provides a systematic and logical method for drawing conclusions from established premises and is widely used in various fields to build coherent arguments and theories [42]. [43] indicate that the deductive approach is known as testing a theory: the researcher develops a theory or hypothesis and designs a research strategy to test the formulated theory. This study selected a quantitative approach since it is good for establishing cause-and-effect relationships, testing hypotheses, and determining a large population's opinions, attitudes, and practices. Furthermore, because quantitative findings involve a bigger sample chosen at random, they are likely to apply to a full population or a sub-population.

3.2 Sampling Techniques
In this study, purposive sampling was used on the population, as it reduces time and costs when selecting the sample population [44] and gives the researcher control over the selected individuals. To identify adoption in the SA context, more schools must be observed to identify organisational readiness; we focused on schools in Gauteng. Mosa, Naz'ri bin Mahrin [2], in their review of the literature on studies of the readiness of e-learning solutions, identified technology, learners and some organisational factors as the key issues regarding readiness to use e-learning technologies. The readiness model for this study was proposed based on this analysis and the model of Akaslan and Law [32].

3.3 Hypothesis for the Study
P-values are used in hypothesis testing to help determine whether the null hypothesis should be rejected [45]. Hypothesis testing plays a major role when the results of the research are discussed [46]. The p-value was used to determine the significance of the observational data.


Whenever researchers notice an apparent relation between two variables, a p-value calculation helps ascertain whether the observed relationship happened as a result of chance [47]. A p-value less than or equal to 0.05 is considered statistically significant [48]: it denotes strong evidence against the null hypothesis, since there is less than a 5% probability of obtaining the observed result if the null hypothesis were correct, so the null hypothesis is rejected and the alternative hypothesis accepted [49]. The hypotheses for this study are listed below, and the proposed readiness model is shown in Fig. 1.

H01: Technology has no positive influence on the Adoption of AR and VR.
H1: Technology has a positive influence on the Adoption of AR and VR.
H02: Confidence has no positive influence on the Adoption of AR and VR.
H2: Confidence has a positive influence on the Adoption of AR and VR.
H03: Experience has no positive influence on the Adoption of AR and VR.
H3: Experience has a positive influence on the Adoption of AR and VR.
H04: Institution has no positive influence on the Adoption of AR and VR.
H4: Institution has a positive influence on the Adoption of AR and VR.
H05: Content has no positive influence on the Adoption of AR and VR.
H5: Content has a positive influence on the Adoption of AR and VR.
H06: Perceived usefulness has no positive influence on the Adoption of AR and VR.
H6: Perceived usefulness has a positive influence on the Adoption of AR and VR.
H07: Attitude has no positive influence on the Adoption of AR and VR.
H7: Attitude has a positive influence on the Adoption of AR and VR.
H08: Perceived usefulness has no positive influence on Attitude towards AR and VR.
H8: Perceived usefulness has a positive influence on Attitude towards AR and VR.
H09: Perceived ease of use has no positive influence on Attitude towards AR and VR.
H9: Perceived ease of use has a positive influence on Attitude towards AR and VR.
H010: Perceived ease of use has no positive influence on the Perceived usefulness of AR and VR.
H10: Perceived ease of use has a positive influence on the Perceived usefulness of AR and VR.
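As a small illustration of the decision rule just described, the Python sketch below marks each alternative hypothesis as supported or not supported at the 5% level. The p-values here are placeholders, not the study's results, which are reported in Section 4.5.

```python
ALPHA = 0.05  # significance threshold used for all hypothesis tests

# Placeholder p-values for the ten alternative hypotheses (H1-H10)
p_values = {"H1": 0.001, "H2": 0.001, "H3": 0.001, "H4": 0.001, "H5": 0.009,
            "H6": 0.001, "H7": 0.010, "H8": 0.001, "H9": 0.001, "H10": 0.001}

for hypothesis, p in p_values.items():
    # Reject the null hypothesis (accept the alternative) when p <= ALPHA
    decision = "supported (reject H0)" if p <= ALPHA else "not supported (retain H0)"
    print(f"{hypothesis}: p = {p:.3f} -> {decision}")
```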

4 Experiments and Results
4.1 Reliability and Validity
Reliability assesses the extent to which a collection of indicators for an underlying construct demonstrates internal consistency in their measurement [50]. When the indicators of the variables being studied are reliable, they strongly relate to each other, showing that they are measuring something similar. To check how trustworthy the research instruments are, a reliability test is used to see whether they consistently give the same results over time: a reliable research instrument will yield consistent responses from participants across multiple instances, both between different administrations of the same instrument and across diverse sets of items. In a Cronbach's alpha analysis, a score of 0.7 or above is considered good, meaning the scale is internally consistent; a score of 0.5 or below means that the questions need to be revised or replaced and, in some cases, that the scale needs to be redesigned. The Cronbach's alpha of all items is .792, which is acceptable, as shown in Table 1.
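For reference, Cronbach's alpha for a k-item scale is defined by the standard formula

\alpha = \frac{k}{k-1}\left(1 - \frac{\sum_{i=1}^{k}\sigma_{i}^{2}}{\sigma_{T}^{2}}\right)

where \sigma_{i}^{2} is the variance of item i and \sigma_{T}^{2} is the variance of the total (summed) score; values of 0.7 or above indicate acceptable internal consistency.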


Fig. 1. Proposed readiness model derived from Akaslan and Law

Table 1. Reliability and validity (Cronbach's alpha)

Variable | Number of items | Cronbach's alpha (standardised) | Cronbach's alpha
Technology | 5 | .658 | .661
Confidence | 4 | .688 | .698
Experience | 6 | .526 | .530
Institution | 4 | .619 | .636
Content | 4 | .787 | .788
Adoption | 4 | .732 | .734
Perceived ease of use | 3 | .656 | .661
Perceived usefulness | 4 | .740 | .732
Attitude | 4 | .718 | .718
All items | 38 | .785 | .792

Validity has been defined as referring to the appropriateness, meaningfulness and usefulness of the specific inferences researchers make based on the data they collect.


In this part, there are three kinds of validity tests to be administered for research instruments. Construct validity refers to the characteristics of the psychological theory or trait being evaluated by the instrument. After constructing the instruments related to the aspects measured, expert judgements from at least three validators were obtained to evaluate whether the components of the instrument could be applied in this research. An important component of construct validity is convergent validity, which is present when the indicators of the same construct share a high proportion of variance.

4.2 Factor Analysis
Factor analysis is a statistical method that can be used to collect an important type of validity evidence [51]. It helps researchers explore or confirm the relationships between survey items and identify the total number of dimensions represented in the survey. The Kaiser-Meyer-Olkin Measure of Sampling Adequacy (the KMO test) reflects the sum of partial correlations relative to the sum of correlations; it varies between 0 and 1, where a value closer to 1 is better, and 0.5 has been suggested as the minimum requirement. The Kolmogorov-Smirnov test was also applied to the data set, and all the variables had a significance value of less than 0.05, which means the hypothesis is rejected [52]. The model extracted a KMO of .723, which is greater than 0.5, the lowest acceptable score, and with a p-value of 0.00 this illustrates that factor analysis is suitable (Table 2).

Table 2. KMO and Bartlett's Test

Kaiser-Meyer-Olkin Measure of Sampling Adequacy | .723
Bartlett's Test of Sphericity | Approx. Chi-Square | 8456.168
Bartlett's Test of Sphericity | df | 861
Bartlett's Test of Sphericity | Sig. | .000
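For readers who want to reproduce this kind of check outside SPSS, the sketch below computes the overall KMO measure and Bartlett's test of sphericity from a response matrix using their standard formulas. The 315 x 42 matrix (42 items matching the degrees of freedom reported in Table 2) is filled with random placeholder responses, not the study's data.

```python
import numpy as np
from scipy import stats

def bartlett_sphericity(data: np.ndarray):
    """Bartlett's test of sphericity for a (respondents x items) response matrix."""
    n, p = data.shape
    corr = np.corrcoef(data, rowvar=False)
    chi_square = -(n - 1 - (2 * p + 5) / 6) * np.log(np.linalg.det(corr))
    dof = p * (p - 1) / 2
    return chi_square, dof, stats.chi2.sf(chi_square, dof)

def kmo(data: np.ndarray) -> float:
    """Overall Kaiser-Meyer-Olkin measure of sampling adequacy."""
    corr = np.corrcoef(data, rowvar=False)
    inv_corr = np.linalg.inv(corr)
    # Partial correlations are obtained from the inverse of the correlation matrix
    partial = -inv_corr / np.sqrt(np.outer(np.diag(inv_corr), np.diag(inv_corr)))
    np.fill_diagonal(corr, 0.0)
    np.fill_diagonal(partial, 0.0)
    return (corr ** 2).sum() / ((corr ** 2).sum() + (partial ** 2).sum())

# Placeholder response matrix: 315 respondents x 42 items
rng = np.random.default_rng(1)
responses = rng.integers(1, 6, size=(315, 42)).astype(float)

chi2, dof, p_value = bartlett_sphericity(responses)
print(f"KMO = {kmo(responses):.3f}")
print(f"Bartlett chi-square = {chi2:.1f}, df = {dof:.0f}, p = {p_value:.4f}")
```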

4.3 Descriptive Statistical Analysis
This section provides a demographic profile of the respondents. Descriptive statistical analysis summarises the data in a way that is easy to understand, looking at averages, spread and the patterns that show up, and using figures and tables to give a clear overall picture of the information collected. The data includes gender, age group, and the respondents' municipality. The gender question sought to establish the gender distribution of the respondents: if only one gender group had participated in this study, the findings would have been biased, so it was important to include both genders (Table 3).


Table 3. Gender demographics

Gender | Frequency | Percent | Valid percent | Cumulative percent
Female | 185 | 58.7 | 58.7 | 58.7
Male | 130 | 41.3 | 41.3 | 100
Total | 315 | 100 | 100 |

The age range of respondents was between 13 and 22: 30 respondents (9.5%) were aged 13–14, 82 (26.0%) were aged 15–16, 78 (24.8%) were aged 17–18, and 125 (39.7%) were aged 19–22 (Table 4).

Table 4. Age of participants

Age | Frequency | Percent | Valid percent | Cumulative percent
13–14 | 30 | 9.5 | 9.5 | 9.5
15–16 | 82 | 26.0 | 26.0 | 35.6
17–18 | 78 | 24.8 | 24.8 | 60.3
19–22 | 125 | 39.7 | 39.7 | 100
Total | 315 | 100 | 100 |

The demographic analysis also looked at where the participants live, which helps us understand where the people in the study come from, how many participants are in each area, and whether there are any differences in results between areas. Exploring how participants are spread across the municipalities of Gauteng province adds detail to the respondent profile and can reveal area-specific trends that affect the overall findings. The biggest proportion of the respondents, 52 (16.5%), were based in the City of Johannesburg; 38 (12.1%) were in West Rand, 32 (10.2%) in Midvaal, 30 (9.5%) in Emfuleni, 28 (8.9%) in Sedibeng, 26 (8.3%) each in Lesedi and Randwest, 24 (7.6%) in Ekurhuleni, 24 (7.6%) in Merafong, 18 (5.7%) in Tshwane and 17 (5.4%) in Mogale City municipality.

4.4 One Sample t-test
The one-sample t-test is applied to assess whether a given sample is derived from a population characterized by a particular mean value. While the exact mean of this population might not be known, it can be postulated in some cases.


This is carried out by subjecting the data to a normality curve evaluation through methods such as histogram examination, while also identifying potential outliers that could impact the validity of the analysis. The mean and standard deviation for each question are given in Table 5.

Table 5. Mean and standard deviation on each question

Item no | Question | Mean | Standard deviation

Technology
T1 | I have the basic skills to use a computer (e.g., using a keyboard, shortcuts, using a mouse, copying and pasting files, creating, editing and saving files, creating folders) | 4.25 | 0.665
T2 | I can troubleshoot most problems associated with using a computer | 4.16 | 0.703
T3 | I can create presentations using PowerPoint, create spreadsheets (e.g., Excel), and word processors for content delivery | 4.11 | 0.77
T4 | I have access to relevant hardware and printers/scanners/overhead projectors | 4.21 | 0.699
T5 | I know how to communicate using email and Skype, and send text/audio/video files using cloud computing | 4.24 | 0.681

Confidence
PC1 | I use computers and office software like Word, Excel, and PowerPoint with confidence | 4.12 | 0.624
PC2 | I can use Augmented Reality social media applications like Candy Crush, Snapchat, Google Maps, and Google Translate with confidence | 3.99 | 0.592
PC3 | I think that I would be able to remain motivated even though the instructor is not always online | 3.96 | 0.627
PC4 | I think that I would be able to complete my work even when there are online distractions (e.g., friends sending emails or websites to surf) | 4.12 | 0.652

Experience
PE1 | I use web browsers and search engines (e.g., Internet Explorer, Google Chrome, MSN Search) confidently | 4.18 | 0.599
PE2 | I have used Virtual Reality based gaming applications like PlayStation FIFA, Minecraft, Wipeout VR, Jurassic Land | 4.03 | 0.583
PE3 | I can send and receive digital study materials | 4.14 | 0.549
PE4 | I frequently use technological innovation (e.g. using digital documents instead of hard copies) in routine/daily tasks | 4.16 | 0.61
PE5 | I use instant messaging (e.g., WhatsApp, Telegram, Instagram) | 3.97 | 0.617
PE6 | I can use the Internet to find and complete my homework | 4.14 | 0.643

Institution
I1 | The top-level administration and teachers understand what e-learning is | 3.99 | 0.709
I2 | The top-level administration and teachers support the use of e-learning | 4.02 | 0.835
I3 | The IT infrastructure in my school can support e-learning | 4.1 | 0.743
I4 | I believe my school is willing to invest in more e-learning tools and infrastructure | 4.1 | 0.554

Content
C1 | My teacher has enough knowledge of e-learning and gives clear instructions when teaching | 4.04 | 0.657
C2 | I feel comfortable with the thought of using technology to deliver instruction | 4.11 | 0.638
C3 | I feel that I am ready to integrate e-learning into my learning | 4.09 | 0.617
C4 | I think quick technical and administrative support is important to my success | 4.05 | 0.673

Adoption
A1 | I am interested in upgrading my academic/professional qualification and/or work performance through e-learning | 3.95 | 0.766
A2 | I think positively about technological interventions in daily/routine tasks | 4.09 | 0.659
A3 | I like to communicate with others using WhatsApp and email to support my learning | 4.18 | 0.618
A4 | I like to use technological innovation (e.g. using digital documents instead of hard copies) in routine/daily tasks | 4.17 | 0.599

Perceived Usefulness
PU1 | I feel comfortable with the thought of using technology to deliver instruction | 3.8 | 0.825
PU2 | E-learning can enhance the theoretical part of my learning and understanding | 3.94 | 0.839
PU3 | I enjoy working on tasks on a computer and it makes it efficient to finish my task | 3.97 | 0.767
PU4 | E-learning is an efficient means of disseminating information | 3.95 | 0.794

Attitude
AT1 | E-learning using Augmented and Virtual Reality can improve the quality of my learning | 4.04 | 0.754
AT2 | I believe that e-learning using Augmented and Virtual Reality can increase my productivity | 4.05 | 0.683
AT3 | E-learning enables me to accomplish my learning more effectively than the traditional classroom-based approach | 4.02 | 0.711
AT4 | E-learning enables learners and instructors to communicate and interact better with one another | 4.1 | 0.69

Perceived Ease of Use
PEOU1 | I learnt most of the things about using e-learning technologies on my own | 3.98 | 0.776
PEOU2 | E-learning using Augmented and Virtual Reality can allow me to learn easily in my own space and time | 4.16 | 0.622
PEOU3 | E-learning using Augmented and Virtual Reality can be applied to the traditional part of my learning | 4.02 | 0.698

Total Mean | | 4.07 |
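As an illustration of the one-sample t-test described in Section 4.4, the sketch below tests whether the mean score on one variable differs from an assumed neutral midpoint of 3. The responses, the 5-point scale and the midpoint are all assumptions made for the example, not the study's data.

```python
import numpy as np
from scipy import stats

# Placeholder Likert scores for one variable (e.g. "Technology") from 315 respondents;
# the 5-point scale and the neutral midpoint of 3 are assumptions for illustration only.
rng = np.random.default_rng(42)
technology_scores = rng.choice([3, 4, 4, 4, 5], size=315).astype(float)

result = stats.ttest_1samp(technology_scores, popmean=3.0)
print(f"mean = {technology_scores.mean():.2f}, "
      f"t = {result.statistic:.2f}, p = {result.pvalue:.4f}")
# A mean well above the midpoint with p < 0.05 would point to a high level of
# readiness on this variable, in line with the means above 3.9 reported in Table 5.
```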

4.5 Regression
Regression analysis is performed to evaluate the strength of the link between one dependent variable and one or more independent variables; it helps in predicting the value of the dependent variable from the independent variables. The model summary gives an R Square of R² = .62, meaning the model accounts for 62% of the variance in readiness. Based on one or more continuous predictor variables, regression is used to forecast a continuous result.


In ANOVA, one or more categorical factors are used to make predictions about continuous outcomes. In this study, ANOVA was applied within a regression framework to forecast continuous outcomes for the dependent variable, Adoption; the independent factors considered as a group were technology, confidence, experience, institution, content, perceived usefulness, and attitude. To measure how strong the linear relationship is between two variables, a correlation analysis is used, and the specific metric is the correlation coefficient, represented by the symbol "r". Unlike the model summary and ANOVA, which measure strength between variables as a group, the coefficients summarise the strength between variables individually [53]. In regression analysis, the beta value (also known as the regression coefficient) measures the strength and direction of the relationship between an independent variable and a dependent variable: a positive beta value indicates a positive relationship, and the magnitude of the beta value indicates the strength of the association. Influence sizes ranging from 0.10 to 0.29 indicate a minor influence, those between 0.30 and 0.49 signify a moderate influence, and sizes of 0.50 or higher indicate a strong influence. It is imperative to recognise that interpreting the beta value entails considering not only its magnitude but also its significance level (p-value) and the overall goodness of fit of the regression model [54]. A significant beta value with a strong coefficient suggests a robust relationship between the variables, while a non-significant beta value indicates that the relationship may not be statistically meaningful.

H1: Technology has a positive influence on Adoption. Technology has a Beta coefficient of (β = .313, p < .001), which indicates that Technology is a significant contributor to Adoption. The results show a positive Beta coefficient that significantly differs from 0 and is statistically significant, indicating a weak influence. The availability and readiness of the necessary IT infrastructure are crucial for the successful adoption of AR and VR in Gauteng schools. Therefore, the hypothesis was accepted.

H2: Confidence has a positive influence on Adoption. Confidence has a Beta coefficient of (β = .489, p < .001), which indicates that Confidence is a significant contributor to Adoption. The results show a positive Beta coefficient that significantly differs from 0 and is statistically significant, indicating a moderate influence. Users' confidence in AR and VR technologies makes the adoption of AR and VR a success. Therefore, the hypothesis was accepted.

H3: Experience has a positive influence on Adoption. Experience has a Beta coefficient of (β = .414, p < .001), which indicates that Experience is a significant contributor to Adoption. The results show a positive Beta coefficient that significantly differs from 0 and is statistically significant, indicating a moderate influence. The findings also show that when learners have experience with the internet and have used AR and VR technologies, it has helped them overcome negative attitudes about the adoption of AR and VR in Gauteng schools. Therefore, the hypothesis was accepted.

H4: Institution has a positive influence on Adoption. Institution has a Beta coefficient of (β = .224, p < .001), which indicates that Institution is a significant contributor to Adoption. The results show a positive Beta coefficient that significantly differs from 0 and is statistically significant, indicating a weak influence. The experience and competencies of external support, which may be institutional or individual expert in nature, are necessary for the successful adoption of AR and VR in Gauteng schools. Therefore, the hypothesis was accepted.


H5: Content has a positive influence on Adoption. Content has a Beta coefficient of (β = .146, p < .009), which indicates that Content is a significant contributor to Adoption. The results show a positive Beta coefficient that significantly differs from 0 and is statistically significant, indicating a very weak influence. Creating interactive, understandable, and user-friendly content is crucial to the successful adoption of AR and VR in Gauteng schools. Therefore, the hypothesis was accepted.

H6: Perceived usefulness has a positive influence on Adoption. Perceived usefulness has a Beta coefficient of (β = .396, p < .001), which indicates that perceived usefulness is a significant contributor to Adoption. The results show a positive Beta coefficient that significantly differs from 0 and is statistically significant, indicating a weak influence. Learners' perception of AR and VR technologies is vital: when users perceive that the technologies will be useful and enjoyable, the adoption of AR and VR in Gauteng schools will be a success. Therefore, the hypothesis was accepted.

H7: Attitude has a positive influence on Adoption. Attitude has a Beta coefficient of (β = .145, p < .010), which indicates that Attitude is a significant contributor to Adoption. The results show a positive Beta coefficient that significantly differs from 0 and is statistically significant, indicating a very weak influence. Many learners believe AR and VR will make their learning efficient and effective, which illustrates a positive attitude towards the adoption of AR and VR in Gauteng schools. Therefore, the hypothesis was accepted.

H8: Perceived usefulness has a positive influence on Attitude. Perceived usefulness has a Beta coefficient of (β = .390, p < .001), which indicates that perceived usefulness is a significant contributor to Attitude. The results show a positive Beta coefficient that significantly differs from 0 and is statistically significant, indicating a weak influence. Learners who perceive AR and VR as useful are more likely to hold a positive attitude towards these technologies. Therefore, the hypothesis was accepted.

H9: Perceived ease of use has a positive influence on Attitude. Perceived ease of use has a Beta coefficient of (β = .844, p < .001), which indicates that perceived ease of use is a significant contributor to Attitude. The results show a positive Beta coefficient that significantly differs from 0 and is statistically significant, indicating a strong influence. There is a strong correlation between PEOU and Attitude: learners' perception of AR and VR contributes significantly to the successful adoption of AR and VR in Gauteng schools. Therefore, the hypothesis was accepted.

H10: Perceived ease of use has a positive influence on Perceived usefulness. Perceived ease of use has a Beta coefficient of (β = .333, p < .001), which indicates that perceived ease of use is a significant contributor to Perceived usefulness. The results show a positive Beta coefficient that significantly differs from 0 and is statistically significant, indicating a weak influence. The perception of ease of use positively influences the perceived usefulness of the technology, as individuals are more likely to find a technology valuable and beneficial if they perceive it to be user-friendly and easy to operate. Therefore, the hypothesis was accepted.
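To make the reported coefficients concrete, the sketch below shows one common way of obtaining standardized beta coefficients, p-values and R² with ordinary least squares in statsmodels. The construct scores are simulated placeholders, not the study's dataset, and only three of the predictors are included for brevity.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

# Simulated construct scores for 315 respondents (placeholders, not the study data)
rng = np.random.default_rng(7)
n = 315
technology = rng.normal(4.2, 0.6, n)
confidence = rng.normal(4.0, 0.6, n)
experience = rng.normal(4.1, 0.6, n)
adoption = 0.3 * technology + 0.4 * confidence + 0.2 * experience + rng.normal(0, 0.4, n)

df = pd.DataFrame({"technology": technology, "confidence": confidence,
                   "experience": experience, "adoption": adoption})

# Standardizing all variables makes the OLS coefficients standardized (beta) weights
z = (df - df.mean()) / df.std(ddof=0)

X = sm.add_constant(z[["technology", "confidence", "experience"]])
model = sm.OLS(z["adoption"], X).fit()

print(model.params.round(3))    # standardized beta coefficients
print(model.pvalues.round(4))   # p-values for each predictor
print(f"R-squared: {model.rsquared:.3f}")
```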


5 Conclusion
This study has provided an analysis of learners' readiness for Augmented and Virtual Reality within South African educational institutions as tools for teaching and learning. The insights gained have the potential to offer valuable guidance to decision-makers, policymakers, and stakeholders who are invested in facilitating a successful and productive integration of AR and VR technologies in South African schools. The findings can also help these entities recognise the essential role of technology in optimizing school operations and enhancing the educational experience for both students and teachers in South Africa. By identifying a comprehensive set of pertinent factors crucial for evaluating readiness for AR and VR in educational settings, this study presents valuable inputs for the development of policies and strategies aimed at fostering effective teaching and learning practices. The evaluation of e-learning readiness in schools can provide a barometer for understanding the critical needs of learners. In addition, it can be useful for improving the effectiveness of schools and can help the Department of Education be aware of the key factors necessary for implementing efficient and effective infrastructure to support the delivery of e-learning in schools. The distinctive contribution of this study lies in the development of an e-learning readiness framework for use in South African schools, with diverse factors suitable for measuring learner readiness. The predictive average above 3.9 indicates that Gauteng schools are ready to integrate AR and VR, and the evaluation results show the evaluation factors to be significant determinants of schools' readiness for AR and VR. By following these recommendations, the Department of Education and school stakeholders can create an enabling environment for the successful integration of AR and VR as powerful teaching and learning tools and accommodate the individual needs of learners to promote equitable learning opportunities. Embracing these immersive technologies will empower learners, enhance learning experiences, and foster a future-ready generation capable of thriving in an increasingly digital world. The framework can also be used in other provinces to establish precise readiness levels for South African schools.

The research contribution of this study was to shed light on the current state of preparedness of educational institutions to effectively integrate AR and VR technologies into the teaching and learning process. Through a comprehensive investigation, this research identified the strengths, weaknesses, opportunities, and challenges associated with the adoption of AR and VR tools in schools. Through an in-depth analysis of learners' perceptions, technical infrastructure, and institutional policies, the study provides valuable insight into the level of readiness for implementing these immersive technologies. The findings offer a comprehensive understanding of the factors that influence the successful integration of AR and VR in educational settings. By highlighting the existing barriers and limitations, such as technical constraints, learners' confidence, experience and perceptions, or financial considerations, this research contributes to the formulation of targeted strategies and interventions to overcome these challenges. Additionally, it identifies the potential benefits and opportunities that AR and VR can bring to the learning environment, including enhanced learner engagement, experiential learning, and the development of critical thinking skills. This research contributes to the academic community by offering a nuanced perspective on readiness for AR and VR adoption, enabling educators, policymakers, and stakeholders to make informed decisions regarding the integration of AR and VR technologies in schools.


Moreover, it serves as a foundation for future research, opening avenues for developing effective implementation models and exploring the long-term impacts of AR and VR on learning outcomes. This research strives to empower educational institutions to embrace the transformative potential of AR and VR as influential teaching and learning tools, preparing learners for the challenges of the digital age and fostering a more engaging and immersive educational experience. The findings offer significant insight for e-learning decision-makers and policymakers regarding the important factors for integrating AR and VR systems in schools; by utilising these findings, decision-makers will be better positioned to understand the impact of their policies and be better prepared for launching AR and VR e-learning systems. This study provides a framework with the primary evaluation areas, or building factors, for integrating AR and VR systems as teaching and learning tools. These factors are important to measure before starting an AR and VR e-learning system, as they help ensure the system is set up correctly, heads in the right direction, and delivers maximum benefit. The variables measured were "technology", "confidence", "experience", "institution", "content", "perceived usefulness", "attitude" and "perceived ease of use". The findings revealed that learners in Gauteng schools are ready. Variables such as "institution", "content", and "attitude" showed a weaker relationship and need more attention; policymakers need to recognise the position of these factors and prioritise them when adopting AR and VR.

References 1. Buckingham, D.: Beyond Technology: Children’s Learning in the Age of Digital Culture. John Wiley & Sons, Hoboken (2013) 2. Mosa, A.A., Mahrin, M.N., Ibrrahim, R.: Technological aspects of e-learning readiness in higher education: a review of the literature. Comput. Inf. Sci. 9(1), 113 (2016) 3. Bullen, M.: Reconsidering the learning management system (2014) 4. Nicholson, P.: A history of e-learning. In: Computers and Education, pp. 1–11. Springer, Heidelberg (2007). https://doi.org/10.1007/978-1-4020-4914-9_1 5. Hariadi, B., et al.: Higher order thinking skills based learning outcomes improvement with blended web mobile learning model. Int. J. Instr. 15(2), 565–578 (2022) 6. Asomah, R.K., Agyei, D.D., Assamah, G.: A SWOT analysis of e-learning integration in University of Cape Coast. Eur. J. Educ. Pedagogy 3(4), 1–8 (2022) 7. Dube, G.K., ˙Ince, G.: An augmented reality interface for choreography generation. ˙Istanbul E˘gitimde Yenilikçilik Dergisi 3, 1–11 (2016) 8. Fernandez, M.: Augmented virtual reality: How to improve education systems. Higher Learn. Res. Commun. 7(1), 1–15 (2017) 9. Yilmaz, R.M.: Educational magic toys developed with augmented reality technology for early childhood education. Comput. Hum. Behav. 54, 240–248 (2016) 10. Choi, I., et al.: Wolverine: a wearable haptic interface for grasping in virtual reality. In: 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE (2016) 11. Lüthold, P.: Investigating Oculomotor Control in Visual Search. Université de Fribourg (2017) 12. De Amicis, R., et al.: Cross-reality environments in smart buildings to advance STEM cyberlearning. Int. J. Interact. Des. Manuf. (IJIDeM) 13, 331–348 (2019) 13. Albarrak, A.I.: Designing e-learning systems in medical education: a case study. Int. J. Excel. Healthcare Manag. 3(1), 1–8 (2010)


14. Leue, M., Jung, T.: A theoretical model of augmented reality acceptance. E-Rev. Tour. Res.5 (2014) 15. Cruz-Benito, J., et al.: Discovering usage behaviors and engagement in an Educational Virtual World. Comput. Hum. Behav. 47, 18–25 (2015) 16. Dube, T.J., Kurt, G., ˙Ince, G.: An augmented reality interface for choreography generation. ˙Istanbul E˘gitimde Yenilikçilik Dergisi 3(1), 1–11 (2016) 17. Lee, G.A., et al.: Freeze-Set-Go interaction method for handheld mobile augmented reality environments. In: Proceedings of the 16th ACM Symposium on Virtual Reality Software and Technology. ACM (2009) 18. Lee, K.: Augmented reality in education and training. TechTrends 56(2), 13–21 (2012) 19. Cabero, J., Barroso, J.: The educational possibilities of Augmented Reality. J. New Approaches Educ. Res. 5(1), 44 (2016) 20. El Sayed, N.A., Zayed, H.H., Sharawy, M.I.: ARSC: augmented reality student card. Comput. Educ. 56(4), 1045–1061 (2011) 21. Ebling, C.: Virtual channels, virtual communities: using activity theory to foster Web 2.0based community and strengthen cultural resource protection at US National Parks, with application to Glen Canyon National Recreation Area and Rainbow Bridge National Monument. Stephen F. Austin State University (2015) 22. Mdlongwa, T.J.P.B., Africa Institute of South Africa. Information and Communication Technology (ICT) as a Means of Enhancing Education in Schools in South Africa (2012) 23. Bezuidenhout, L.M., et al.: Beyond the digital divide: towards a situated approach to open data. Sci. Public Policy 44(4), 464–475 (2017) 24. McKay, T., Mafanya, M., Horn, A.C.: Johannesburg’s inner city private schools: the teacher’s perspective. South Afr. J. Educ. 38(3), 1–11 (2018) 25. Bertram , N.M., Mukeredzi, T.:‘It will make me a real teacher’: learning experiences of part time PGCE students in South Africa (2012) 26. Nagle, J.: Twitter, cyber-violence, and the need for a critical social media literacy in teacher education: a review of the literature. Teach. Teach. Educ. 76, 86–94 (2018) 27. Ng’ambi, D., et al.: Technology enhanced teaching and learning in South African higher education–a rearview of a 20 year journey. Br. J. Edu. Technol. 47(5), 843–858 (2016) 28. Pather, S.: Evidence on inclusion and support for learners with disabilities in mainstream schools in South Africa: off the policy radar? (2011) 29. Cross, M., Adam, F.: ICT policies and strategies in higher education in South Africa: national and institutional pathways. High Educ. Pol. 20(1), 73–95 (2007) 30. Adu, E.O.: E-Learning facilities usage assessment by Economic and Management Science (EMS) teachers in Eastern Cape province, South Africa. In: EdMedia: World Conference on Educational Media and Technology.Association for the Advancement of Computing in Education (AACE) (2016) 31. Alkhattabi, M.: Augmented reality as E-learning Tool in primary schools’ education: barriers to teachers’ adoption. Int. J. Emerg. Technol. Learn. (iJET) 12(02), 91–100 (2017) 32. Akaslan, D., Law, E.L.: Measuring teachers’ readiness for e-learning in higher education institutions associated with the subject of electricity in Turkey. In: Global Engineering Education Conference (EDUCON). IEEE (2011) 33. Ouma, G.O., Awuor, F.M., Kyambo, B.: E-learning readiness in public secondary schools in Kenya. Eur. J. Open Dist. E-Learn. 16(2), 97–110 (2013) 34. Yang, Z., et al.: Understanding SaaS adoption from the perspective of organizational users: a tripod readiness model. Comput. Hum. Behav. 45, 254–264 (2015) 35. 
Krishnan, K.S.T., Hussin, H.: E-Learning readiness on bumiputera SME’s intention for adoption of online entrepreneurship training in Malaysia. Management 7(1), 35–39 (2017) 36. Coopasami, M., Knight, S., Pete, M.: E-Learning readiness amongst nursing students at the Durban University of Technology. Health SA Gesondheid (Online) 22, 300–306 (2017)

An Analysis of Readiness for the Adoption of Augmented and Virtual Reality

321

37. Saekow, A., Samson, D.: E-learning readiness of Thailand’s universities comparing to the USA’s Cases. Int. J. e-Educ. e-Bus. e-Manag. e-Learn. 1(2), 126 (2011) 38. Salleh, N.S.N.M., Amin, W.A.A.W.M., Mamat, I.: Employee readiness, training design and work environment in influencing training transfer among academic staffs of uitm. Int. J. Acad. Res. Bus. Social Sci. 7(10), 275–290 (2017) 39. Tarhini, A., et al.: Extending the UTAUT model to understand the customers’ acceptance and use of internet banking in Lebanon: a structural equation modeling approach. Inf. Technol. People 29(4), 830–849 (2016) 40. Alexandrova, A.: A Philosophy for the Science of Well-Being. Oxford University Press, Oxford (2017) 41. Pandey, P., Pandey, M.M.: Research Methodology Tools and Techniques. Bridge Center (2021) 42. Jaakkola, E.: Designing conceptual articles: four approaches. AMS Rev. 10(1–2), 18–26 (2020). https://doi.org/10.1007/s13162-020-00161-0 43. Al-Ababneh, M.M.: Linking ontology, epistemology and research methodology. Sci. Phil. 8(1), 75–91 (2020) 44. Etikan, I., Musa, S.A., Alkassim, R.S.: Comparison of convenience sampling and purposive sampling. Am. J. Theor. Appl. Stat. 5(1), 1–4 (2016) 45. Bendtsen, M.: A gentle introduction to the comparison between null hypothesis testing and Bayesian analysis: reanalysis of two randomized controlled trials. J. Med. Internet Res. 20(10), e10873 (2018) 46. Greenland, S., et al.: Statistical tests, P values, confidence intervals, and power: a guide to misinterpretations. Eur. J. Epidemiol. 31, 337–350 (2016) 47. Murtaugh, P.A.: In defense of P values. Ecology 95(3), 611–617 (2014) 48. Biau, D.J., Jolles, B.M., Porcher, R.: P value and the theory of hypothesis testing: an explanation for new researchers. Clin. Orthopaedics Related Res®. 468, 885–892 (2010) 49. Betensky, R.A.: The p-value requires context, not a threshold. Am. Stat. 73(sup1), 115–117 (2019) 50. Hayes, A.F., Coutts, J.J.: Use omega rather than Cronbach’s alpha for estimating reliability. But…. Commun. Methods Meas. 14(1), 1–24 (2020) 51. Sürücü, L., Maslakci, A.: Validity and reliability in quantitative research. Bus. Manag. Stud. Int. J. 8(3), 2694–2726 (2020) 52. Sürücü, L., Se¸ ¸ sen, H., Maslakçı, A.: Regression, Mediation/Moderation, and Structural Equation Modeling with SPSS, AMOS, and PROCESS Macro. Livre de Lyon (2023) 53. Shanthi, R.: Multivariate Data Analysis: Using SPSS and AMOS. MJP Publisher (2019) 54. Krishna, S.H., et al.: The technical Role of Regression Analysis in prediction and decision making. In: 2023 3rd International Conference on Advance Computing and Innovative Technologies in Engineering (ICACITE). IEEE (2023)

Review of Technology Adoption Models and Theories at Organizational Level

Mkhonto Mkhonto(B) and Tranos Zuva

Department of Information and Communication Technology, Vaal University of Technology, Vanderbijlpark, Gauteng, South Africa
[email protected], [email protected]

Abstract. There is no doubt that innovative technologies have evolved tremendously in the past few decades, notably in the computing discipline. Organizations adopt technology innovations to create competitive advantage and to sustain their competitive position. There is consensus that technology has significant effects on the productivity of organizations, but these effects will only be realized if, and when, technology is widely spread and used. It is therefore essential to understand the determinants of technology adoption and, consequently, to know the relevant theoretical models. There are numerous reviews in the literature comparing technology adoption models at the individual level, but to the best of our knowledge there are far fewer at the organizational level. This review fills this gap. The paper reviews two prominent technology adoption models relevant to information systems (IS) and information technology (IT) studies applied at the organizational level. The study can assist in analyzing the adoption of new technologies and can also be used by researchers who wish to investigate the adoption of new technologies.

Keywords: Technology adoption models · Diffusion of innovation (DOI) theory · Technology-organization-environment (TOE) framework · organizational level

1 Introduction

In the information systems domain, technology adoption has been one of the most extensively researched areas. Studies on technology adoption have aimed to understand, predict, and explain the variables that influence adoption behavior at both the individual and organizational levels regarding the acceptance and use of technological innovations. As a result, frameworks and conceptual models have been developed that assist in understanding how the adoption factors of technologies relate to one another. This paper therefore conducts a literature review to identify important factors of technology adoption at the organizational level. We consider organization-level findings founded on DOI theory and the TOE framework, so the paper is based on an overview of these two prominent technology adoption models and theories. The paper is arranged as follows: first, a literature study on technology
adoption models at the organizational level is presented. The two models reviewed are the diffusion of innovation (DOI) theory [1] and the technology-organization-environment (TOE) framework [2], since most studies on IT adoption at the organizational level are derived from theories such as these two [3]. This is followed by an extensive analysis of the TOE framework, covering studies that used only this theory as well as studies that combine the TOE framework with other theories such as DOI, institutional theory, and the Iacovou et al. [4] model. Section 4 presents the discussion, and the paper ends with the conclusion.

2 Literature Review

2.1 Defining Technology Adoption

Technology adoption is the acceptance and use of new technology. According to [5], technology adoption is defined as the first use or acceptance of a new technology or new system. Moreover, adopting or rejecting a new technology is based on an individual's perception of the benefits of using the tool to achieve a goal [6, 7].

2.2 Technology Adoption Theories and Models

Technology adoption theories and models have been used over the years by various researchers to explain the behavior associated with technology adoption and usage, helping to understand and predict an individual's behavior with regard to technology acceptance. Theories and models known to describe adoption at the organizational level include the diffusion of innovation (DOI) theory and the technology-organization-environment (TOE) framework. These are discussed in the following sections.

2.2.1 Diffusion of Innovation (DOI)

DOI theory was introduced by Rogers to explain the rate of adoption of various technologies across different channels and stages, since individuals have varying degrees of enthusiasm to adopt an innovation, which changes over time and depends on the stage of the adoption process in which the adopter finds themselves [8]. The DOI theory has mainly focused on the perceived features of technologies and the innovativeness of the organizations adopting them. Rogers [9] mentions five attributes of an innovation that influence its adoption: relative advantage in comparison to existing technologies, compatibility with the organization's workflows and knowledge, complexity to implement, trialability, and observability of the development of the innovation both inside the organization and in competitors. Individuals' perceptions of these five characteristics predict the rate of adoption of innovations. Based on DOI theory at the organizational level [1], innovativeness is related to independent variables such as individual (leader) characteristics, internal characteristics of organizational structure, and external characteristics of the organization (Fig. 1).

(a) Individual characteristics describe the leader's attitude toward change.
(b) Internal characteristics of organizational structure include, according to Rogers [1]: “centralization is the degree to which power and control in a system are concentrated in the hands of a relatively few individuals”; “complexity is the degree to which an organization’s members possess a relatively high level of knowledge and expertise”; “formalization is the degree to which an organization emphasizes its members’ following rules and procedures”; “interconnectedness is the degree to which the units in a social system are linked by interpersonal networks”; “organizational slack is the degree to which uncommitted resources are available to an organization”; “size is the number of employees of the organization”.

(c) External characteristics of the organization refer to system openness.

According to Rogers [9], innovativeness is “the degree to which an individual or other unit of adoption is relatively earlier in adopting new ideas than other members of a social system.” The rate of adoption is the relative speed with which an innovation is adopted by members of a social system. The social and communication structure of a system facilitates or hinders the diffusion of innovations in the system. Rogers distinguishes three main types of innovation-decisions: (i) optional innovation-decisions, choices to adopt or reject an innovation that are made by an individual independently of the decisions of other members of the system; (ii) collective innovation-decisions, choices to adopt or reject an innovation that are made by consensus among the members of a system; and (iii) authority innovation-decisions, choices to adopt or reject an innovation that are made by relatively few individuals in a system who possess power, status, or technical expertise.

Fig. 1. Diffusion of innovations (DOI) [1]

Since the early applications of DOI to IS research, the theory has been applied and adapted in different ways. Table 1 below summarizes previous studies based on DOI theory across different types of industry.

Table 1. Some studies based on DOI theory

Author(s) | Year | Theory Used | Type of Study | Type of Industry
Beatty, Shim & Jones | 2001 | DOI | Website | SMEs
Bradford & Florin | 2003 | DOI | ERP | Firms
Limthongchai & Speece | 2003 | DOI | E-Commerce | SMEs
Hussin & Noor | 2005 | DOI | E-Commerce | Manufacturing Companies
Hsu, Kraemer & Dunkle | 2006 | DOI | E-Business | SMEs
Li | 2008 | DOI | E-Procurement | Firms
Tan, Chong, Lin & Eze | 2009 | DOI | ICTs | SMEs
Ming | 2016 | DOI | E-Business | SMEs

2.2.2 Technology-Organization-Environment (TOE) Framework

The technology-organization-environment (TOE) framework, introduced by [2], is a popular model for investigating the adoption of new technologies in an organization. It classifies the factors that influence an organization's adoption of new technology into three groups: technology, organization, and environment. The technological context refers to the existing as well as new technologies relevant to the firm. These factors play a significant role in the firm's adoption decision, as they determine the firm's ability to benefit from an e-business initiative. The organizational context represents the internal factors in an organization that influence innovation adoption and implementation; it refers to descriptive measures about the organization such as scope, size, organizational structure, financial support, managerial beliefs, and top management support. The environmental context is the arena surrounding a firm, consisting of its industry, technology support infrastructure, and government regulation [2]. In this framework, three key determinants affecting organizational adoption were identified: technology, organization, and environment (see Fig. 2). Hence the framework was named the "TOE" framework, and it has been used successfully in the study of adoption within organizations. According to [2], these three contexts can affect technology acceptance, creativity, and implementation. The three TOE contexts are explained as follows:

(i) The technological context relates to the technologies available to an organization. Its main focus is on how technology characteristics themselves can influence the adoption process.

(ii) The organizational context describes the characteristics of an organization. Common characteristics include firm size, degree of centralization, formalization, complexity of its managerial structure, the quality of its human resources, and the amount of slack resources available internally. Researchers have also added an organizational "readiness" variable to this context: organizational readiness refers to an organization's readiness, in terms of the knowledge of key personnel and of facilities, to adopt the technology.
Fig. 2. Technology, organization, and environment framework [9]

(iii) The external environmental context is the arena in which an organization conducts its business. This includes the industry, competitors, regulations, and relationships with the government. These are factors external to an organization that present constraints and opportunities for technological innovations [10]. The next section analyses the studies that adopted the TOE framework.

3 The Systematic Review of the Empirical Literature on the Technology-Organization-Environment (TOE) Framework

The TOE framework has been utilized in several studies to explain the adoption of various ICTs such as e-commerce, e-business, enterprise resource planning, electronic data interchange, open systems, and knowledge management systems. Table 2 indicates the studies that used the TOE framework from 2007 to 2018.

Table 2. The studies that use the TOE framework

Author(s) | Year | Theory Used | Type of Study | Type of Industry
Chang, Hwang, Hung, Lin & Yen | 2007 | TOE | E-Signature | Firms
Zhang, Cui, Huang & Zhang | 2007 | TOE | IT | Firms
Lin & Lin | 2008 | TOE | E-Business | Firms
Ramdani, Kawalek & Lorenzo | 2009 | TOE | Enterprise Systems | SMEs
Oliveira & Martins | 2010 | TOE | E-Business | Firms
Ifinedo | 2011 | TOE | Internet/E-Business | SMEs
Ghobakhloo, Arias-Aranda & Benitez-Amado | 2011 | TOE | E-Commerce | Firms
Lama, Pradhan, Shrestha & Beirman | 2018 | TOE | E-Tourism | SMTE

3.1 The Studies that Used the TOE Framework with Other Theories

The TOE framework can be combined with other theories to better explain ICT adoption. Thong [11] developed an integrated model combining the TOE framework with DOI theory. The model specifies contextual variables such as decision-maker characteristics, IS characteristics, organizational characteristics, and environmental characteristics as primary determinants of IS adoption in small businesses.

Moreover, [12, 13] combined DOI theory with the TOE framework to better understand IT adoption decisions. Institutional theory, as described by Scott [14], can also be combined with the TOE framework to explain IT adoption within different domains. According to institutional theory, social and cultural factors should be taken into consideration when making organizational decisions, and, in order to survive, organizations must conform to the rules and belief systems prevailing in the environment [14]. Institutional theory adds external pressures to the environmental context of the TOE framework, including pressure from competitors and pressure exerted by trading partners. The study in [15] combines the TOE framework with DOI and institutional theory to explain the organizational adoption of e-procurement. Iacovou et al. [4] developed a model for EDI adoption in small organizations based on three factors: perceived benefits, organizational readiness, and external pressure. External pressure in that model includes two variables: competitive pressure and trading partner power. The authors of [16] use this model together with the TOE framework to explain the adoption of e-business. Table 3 chronologically presents some studies that combine the TOE framework with other theoretical models.

Table 3. The studies that combine the TOE framework with other theoretical models

Author(s) | Year | Theory Used | Type of Study | Type of Industry
Scupola | 2003 | TOE and DOI | E-Commerce | SMEs
Gibbs & Kraemer | 2004 | TOE and Institutional Theory | E-Commerce | Firms
Hsu, Kraemer & Dunkle | 2006 | DOI, TOE and Iacovou et al. (1995) Model | E-Business | Firms
Li | 2008 | TOE, DOI and Institutional Theory | E-Procurement | Firms
Ramdani & Kawalek | 2009 | TOE and DOI | Enterprise Systems | SMEs
Wang, Wang & Yang | 2010 | TOE and DOI | RFID | Firms
Hung, Yang, Yang & Chuang | 2011 | TOE and DOI | E-Commerce | Travel Agencies
Al-Zoubi | 2011 | TOE and DOI | E-Business | Firms
Acilar & Karamasa | 2013 | TOE and DOI | E-Commerce | SMEs
Alrousan | 2014 | TOE and DOI | E-Commerce | Travel Agencies
Chandra & Kumar | 2018 | TOE and DOI | E-Commerce | Firms

4 Discussion

Review of the literature shows that the study of organizational adoption of technology must consider the technological, organizational, and environmental contexts. We recognize that some studies combine the TOE framework with other theories to better explain technology adoption. The TOE framework identifies three aspects of an enterprise's context that influence the process by which it adopts and implements a technological innovation: the technological context, the organizational context, and the environmental context. It is also possible that the TOE framework is better when combined with other theories, depending on the condition of the organization. DOI theory found that individual characteristics, internal characteristics of organizational structure, and external characteristics of the organization are important antecedents to organizational innovativeness. DOI theory is one of the main theories combined with the TOE framework, and institutional theory is another theory combined with the TOE framework to explain technology adoption within different domains.

5 Conclusion

This paper presented a review of the literature on technology adoption models at the organizational level. Two prominent models and theories in the field of information systems were critically reviewed: the diffusion of innovation (DOI) theory and the technology-organization-environment (TOE) framework. Most empirical studies are derived from these two. As the TOE framework includes the environmental context (not included in the DOI theory), it is better able to explain intra-firm innovation adoption; therefore, we consider this model to be more complete. The TOE model has been shown to be useful in the investigation of a wide range of innovations and contexts, has been broadly supported in empirical work, and remains among the most prominent and widely utilized theories of organizational adoption since its development. For this reason, an extensive analysis of the TOE framework was undertaken, analyzing empirical studies that use only the TOE model and empirical studies that combine this model with the DOI theory, institutional theory, and the Iacovou et al. [4] model, and concluding that the same context in a specific theoretical model can have different factors. In terms of further research, we think that for the adoption of more complex new technologies it is important to combine more than one theoretical model to achieve a better understanding of the technology adoption phenomenon.

Acknowledgment. I wish to express my sincere gratitude and appreciation to my co-author, Prof. Tranos Zuva, whose input helped us to produce a paper of good standard.

References

1. Rogers, E.M.: Diffusion of Innovations, 4th edn. Free Press, New York (1995)
2. Tornatzky, L., Fleischer, M.: The Process of Technology Innovation. Lexington Books, Lexington, MA (1990)
3. Chong, A.Y.L., Ooi, K.B., Lin, B.S., Raman, M.: Factors affecting the adoption level of C-commerce: an empirical study. J. Comput. Inf. Syst. 50(2), 13–22 (2009)
4. Iacovou, C.L., Benbasat, I., Dexter, A.S.: Electronic data interchange and small organizations: adoption and impact of technology. MIS Q. 19(4), 465–485 (1995)
5. Khasawneh, A.M.: Concepts and measurements of innovativeness: the case of information and communication technologies. Int. J. Arab Cult. Manag. Sustain. Dev. (2008). https://doi.org/10.1504/ijacmsd.2008.020487
6. Nemoto, M.C.M.O., Vasconcellos, E.P.G., Nelson, R.: The adoption of new technology: conceptual model and application. J. Technol. Manag. Innov. 5, 95–107 (2010). https://doi.org/10.4067/S0718-27242010000400008
7. Plewa, C., Troshani, I., Francis, A., Rampersad, G.: Technology adoption and performance impact in innovation domains. Ind. Manag. Data Syst. 112, 748–165 (2012). https://doi.org/10.1108/02635571211232316
8. Rogers, E.M.: Diffusion of preventive innovations. Addict. Behav. 27(6), 989–993 (2002)
9. Rogers, E.M.: Diffusion of Innovations, 5th edn. Free Press, New York, NY (2003)
10. DePietro, R., Wiarda, E., Fleischer, M.: The context for change: organization, technology, and environmental. In: Tornatzky, L.G., Fleischer, M. (eds.) The Process of Technological Innovation, pp. 151–175. Lexington Books, Lexington, MA (1990)
11. Thong, J.Y.L.: An integrated model of information systems adoption in small businesses. J. Manag. Inf. Syst. 15(4), 187–214 (1999)
12. Zhu, K., Kraemer, K.L., Xu, S.: The process of innovation assimilation by firms in different countries: a technology diffusion perspective on E-business. Manage. Sci. 52(10), 1557–1576 (2006)
13. Wang, Y.M., Wang, Y.S., Yang, Y.F.: Understanding the determinants of RFID adoption in the manufacturing industry. Technol. Forecast. Soc. Change 77, 803–815 (2010)
14. Scott, W.R.: Institutional theory. In: Ritzer, G. (ed.) Encyclopedia of Social Theory, pp. 408–14. Sage, Thousand Oaks, CA (2004)
15. Li, Y.H.: An empirical investigation on the determinants of E-procurement adoption in Chinese manufacturing enterprises. In: 15th International Conference on Management Science & Engineering, California, USA, pp. 32–37 (2008)
16. Oliveira, T., Martins, M.F.: Firms patterns of E-business adoption: evidence for the European Union-27. Electron. J. Inf. Syst. Eval. 13(1), 47–56 (2010)
17. Chandra, S., Kumar, K.N.: Exploring factors influencing organizational adoption of augmented reality in E-commerce: empirical analysis using technology organization-environment model. J. Electron. Commer. Res. 19(3) (2018)
18. Hung, Y., Yang, Y., Yang, H., Chuang, Y.: Factors affecting the adoption of ecommerce for the tourism industry in Taiwan. Asia Pac. J. Tourism Res. 16(1), 105–119 (2011)
19. Al-Zoubi, M., Thi, L.S., Lim, H.E.: E-government adoption and organization performance in the Jordan businesses sector: empirical analysis. Acad. Res. Int. 1(1) (2011)
20. Ghobakhloo, M., Arias-Aranda, D., Benitez-Amado, J.: Adoption of e-commerce applications in SMEs. Ind. Manag. Data Syst. 111(8), 1238–1269 (2011)
21. Scupola, A.: The adoption of internet commerce by SMEs in the south of Italy: an environmental, technological and organizational perspective. J. Glob. Inf. Technol. Manag. 6(1), 52–71 (2003)
22. Ramdani, B., Kawalek, P., Lorenzo, O.: Predicting SMEs’ adoption of enterprise systems. J. Enterp. Inf. Manag. 22(1/2), 10–24 (2009)
23. Alrousan, M.K.: E-commerce adoption by travel agencies in Jordan. Int. J. Bus. Inf. Syst. (2014)
24. Hsu, P.F., Kraemer, K.L., Dunkle, D.: Determinants of E-business use in us firms. Int. J. Electron. Commer. 10(4), 9–45 (2006)
25. Ifinedo, P.: Internet/E-business technologies acceptance in Canada’s SMEs: an exploratory investigation. Internet Res. 21(3), 255–281 (2011)
26. Chang, I.C., Hwang, H.G., Hung, M.C., Lin, M.H., Yen, D.C.: Factors affecting the adoption of electronic signature: executives’ perspective of hospital information department. Decis. Support. Syst. 44(1), 350–359 (2007)
27. Zhang, C., Cui, L., Huang, L., Zhang, C.: Exploring the role of government in information technology diffusion. In: McMaster, T., Wastell, D., Ferneley, E., DeGross, J.I. (eds.) Organizational Dynamics of Technology-Based Innovation: Diversifying the Research Agenda. IIFIP, vol. 235, pp. 393–407. Springer, Boston, MA (2007). https://doi.org/10.1007/978-0-387-72804-9_26
28. Lin, H.F., Lin, S.M.: Determinants of E-business diffusion: a test of the technology diffusion perspective. Technovation 28(3), 135–145 (2008)
29. Tan, K.S., Chong, S.C., Lin, B., Eze, U.C.: Internet-based ICT adoption: evidence from Malaysian SMEs. Ind. Manag. Data Syst. 109(2), 224–244 (2009)
30. Ming, Z.: E-business adoption among SMEs in China: a study perceptions of SMEs in Hunan province. Lancashire Business School, Lancashire (2016)
31. Limthongchai, P., Speece, M.W.: The effect of perceived characteristics of innovation on e-commerce adoption by SMEs in Thailand. In: Proceeding of the Seventh International Conference on Global Business and Economic Development, 8–11 January, pp. 1573–1585 (2003)
32. Bradford, M., Florin, J.: Examining the role of innovation diffusion factors on the implementation success of enterprise resource planning systems. Int. J. Acc. Inf. Syst. 4(3), 205–225 (2003)
33. Beatty, R.C., Shim, J.P., Jones, M.C.: Factors influencing corporate web site adoption: a time-based assessment. Inf. Manag. 38(6), 337–354 (2001)
34. Lama, S., Pradhan, S., Shrestha, A., Beirman, D.: Barriers of E-tourism in developing countries: a case study of Nepal. In: Australasian Conference on Information Systems (2018)
35. Acilar, A., Karamasa, Ç.: Factors affecting E-commerce adoption by small businesses in a developing country. In: ICT Influences on Human Development, Interaction, and Collaboration, pp. 174–184 (2013). https://doi.org/10.4018/978-1-4666-1957-9.ch010
36. Hussin, H., Mohamad Noor, R.: Innovating business though e-commerce: exploring the willingness of Malaysian SMEs. In: The Second International Conference on Innovations in IT (IIT 2005) (2005)

Algorithmic Optimization Techniques for Operations Research Problems

Carla Silva1(B), Ricardo Ribeiro2, and Pedro Gomes3

1 Atlântica University, UNINOVA, Lisbon, Portugal
[email protected]
2 Atlântica University, CESNOVA, Minho, Portugal
[email protected]
3 Atlântica University, Barcarena, Portugal

Abstract. This paper provides an overview of the key concepts and approaches discussed in the field of algorithmic optimization techniques. Operations research plays a central role in addressing complex decision-making challenges across industries. The paper explores a range of algorithmic methods and optimization strategies employed to solve real-world problems efficiently and effectively. It outlines the core themes covered in our research, including the classification of optimization problems, the utilization of mathematical models, and the development of algorithmic solutions, and it highlights the importance of algorithm selection and design in achieving optimal solutions for diverse operations research problems. Furthermore, it underscores the relevance of this research area in enhancing decision-making processes, resource allocation, and overall efficiency in industries such as transportation, supply chain management, finance, and healthcare. The paper aims to provide readers with insights into cutting-edge algorithmic techniques, their applications, and their potential impact on addressing complex optimization challenges in operations research. It serves as a theoretical guide for researchers, practitioners, and students seeking to understand and apply algorithmic optimization methods to tackle a wide range of operations research problems and make informed decisions in various domains.

Keywords: operational research · algorithms · optimization

1 Understanding the Landscape of Optimization

The elucidation of optimization problems, a cardinal pursuit in the domain of operations research, demands a systematic categorization schema. Within this section, we embark on a rigorous examination of the classification paradigm that underpins optimization problem delineation, a cornerstone for the judicious selection of modeling methodologies and algorithmic approaches. The panorama of optimization quandaries confronting practitioners is both multifaceted and intricate. To illuminate this landscape, we commence with an elucidation of categorizations, beginning with the well-established
discipline of linear programming. Here, we traverse a mathematical expanse characterized by linear relationships between variables, a model frequently leveraged in scenarios such as resource allocation and production optimization. Subsequently, our expedition proceeds to integer programming, a realm replete with discrete decision variables. This subset of optimization problems finds its utility in combinatorial optimization, encompassing challenges like the traveling salesman problem. Furthermore, we venture into the domain of nonlinear programming, a heterogeneous terrain embracing a plethora of nonlinearity manifestations in objective functions and constraints. Dynamic programming, as an intricate choreography of sequential decision-making, demands its own classification framework; these problems materialize in diverse contexts, including resource management and project scheduling. Additionally, we acknowledge the existence of optimization challenges that defy facile categorization. Ascribing optimization problems to their respective categories engenders an empirical cognizance of the nature of the problem space, inherently guiding the subsequent selection of apt modeling techniques and algorithmic strategies. Through this framing, readers are primed for a structured exploration of the mathematical modeling and algorithmic strengths that are instrumental in the resolution of complex optimization problems.

Linear programming (LP) is a fundamental optimization technique widely applied in multiple domains. It is characterized by linear relationships between decision variables in both the objective function and constraints [5]. The primary goal of linear programming is to find values for these variables that optimize (minimize or maximize) the linear objective function while satisfying a set of linear constraints [14]. Common applications of linear programming include resource allocation, production planning, and network flow optimization [15]. The Simplex algorithm, developed by George Dantzig [5] in the 1940s, is one of the foundational methods for solving linear programming problems, efficiently exploring vertices of the feasible region to identify the optimal solution. Linear programming offers a mathematically elegant framework for decision-making in scenarios where resources are limited and constraints must be met. It has found extensive use in economics, engineering, logistics, and other fields [16].

Integer programming (IP) extends the scope of linear programming by introducing the requirement that some or all of the decision variables must take on integer values [14]. This discrete aspect distinguishes IP from LP and adds complexity to problem-solving. IP problems are prevalent in combinatorial optimization, where decisions involve selecting or assigning items, routes, or schedules from discrete sets [9]. Classic examples include the traveling salesman problem (TSP), where the goal is to find the shortest tour visiting a set of cities, and the knapsack problem, which deals with selecting a subset of items to maximize their total value while adhering to capacity constraints [15]. Solving integer programming problems is inherently more challenging than linear programming due to the combinatorial nature of discrete variables.
Techniques such as branch and bound [16], branch and cut [18], and cutting plane methods are used to find optimal or near-optimal solutions [18]. Integer programming has diverse applications, from production scheduling and vehicle routing [14] to project management [2] and facility location [5].
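To make the preceding discussion of linear and integer programming concrete, the following is a minimal sketch — not taken from the paper — of a two-product resource-allocation LP solved with SciPy's linprog; the products, profit coefficients, and capacities are illustrative assumptions.

```python
# A minimal linear-programming sketch (illustrative data): maximize profit for
# two products subject to machine-time and labour constraints, solved with
# SciPy's HiGHS-based linprog.
from scipy.optimize import linprog

# Maximize 40*x1 + 30*x2  <=>  minimize -40*x1 - 30*x2
c = [-40, -30]

# Constraints:
#   2*x1 + 1*x2 <= 100   (machine hours)
#   1*x1 + 1*x2 <= 80    (labour hours)
A_ub = [[2, 1],
        [1, 1]]
b_ub = [100, 80]

res = linprog(c, A_ub=A_ub, b_ub=b_ub,
              bounds=[(0, None), (0, None)], method="highs")

print("optimal production plan:", res.x)   # expected: [20. 60.]
print("maximum profit:", -res.fun)         # negate because we minimized
```

Restricting x1 and x2 to integer values would turn the same model into the kind of integer program discussed above, which is typically handled by branch-and-bound-based solvers.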


Nonlinear programming (NLP) addresses optimization problems where either the objective function, the constraints, or both involve nonlinear relationships among the decision variables [17]. These nonlinearity characteristics render NLP problems considerably more complex and computationally demanding than linear programming. NLP problems arise in fields such as engineering design, economics, and physics, where real-world phenomena often exhibit nonlinear behaviors [15]. Examples include optimizing the shape of an aircraft wing for maximum aerodynamic efficiency, determining the optimal investment portfolio with nonlinear utility functions, or fitting nonlinear curves to experimental data [16]. Solving NLP problems typically involves iterative methods like gradient-based optimization [7], where the algorithm iteratively adjusts the decision variables to minimize or maximize the objective function while respecting constraints. Advanced algorithms such as the Levenberg-Marquardt method [17] or genetic algorithms [15] are employed when dealing with highly nonlinear or non-convex problems. Nonlinear programming plays a pivotal role in optimizing complex systems and continues to be a subject of active research in mathematical optimization [18].

Dynamic programming (DP) represents a distinct category of optimization problems that excels in modeling sequential decision-making processes over time or space [16]. DP involves breaking a complex problem into simpler subproblems and solving each subproblem only once, storing the results in a table or memory to avoid redundant calculations. DP has a broad range of applications, including inventory management [12], resource allocation [9], and project scheduling [17]. It is particularly powerful in solving problems with overlapping substructures, where solutions to subproblems are reused [18]. One of the most well-known examples of DP is the Bellman equation, central to solving problems like the shortest path problem in graphs [15] and the knapsack problem [18]. Additionally, DP is instrumental in solving Markov decision processes (MDPs), a framework widely used in reinforcement learning and artificial intelligence for decision-making under uncertainty [16]. Dynamic programming's ability to efficiently tackle complex, sequential optimization problems makes it a valuable tool in operations research and computer science [17]. Its optimal substructure and overlapping subproblems underpin the development of efficient algorithms for solving a wide array of problems.

1.1 Modeling the Real World

The development of mathematical models is the keystone upon which sound decision-making rests. This section delves into the intricate process of constructing these models to represent real-world problems systematically. The variables that define the levers of control are a fundamental component of mathematical models in operations research. They represent the quantities or factors that can be controlled or manipulated to achieve specific objectives within a problem context [17]. For instance, in a production scheduling problem, decision variables may represent the quantities of different products to be manufactured on different days. The selection and definition of decision variables are pivotal steps in model development, as they directly influence the model's complexity and the feasibility of finding solutions. Careful consideration is required to strike a balance between granularity and computational tractability [16].
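Before turning to objectives and constraints, the short sketch below illustrates the dynamic-programming idea introduced above: the 0/1 knapsack problem solved bottom-up, with solutions to overlapping subproblems stored and reused. The item values, weights, and capacity are hypothetical.

```python
# A minimal dynamic-programming sketch (illustrative, not from the paper):
# 0/1 knapsack solved bottom-up. dp[c] holds the best value achievable with
# capacity c; each item is considered once, and subproblem results are reused.
def knapsack(values, weights, capacity):
    dp = [0] * (capacity + 1)
    for value, weight in zip(values, weights):
        # iterate capacities downwards so each item is used at most once
        for c in range(capacity, weight - 1, -1):
            dp[c] = max(dp[c], dp[c - weight] + value)
    return dp[capacity]

# Hypothetical items (value, weight) and a knapsack capacity of 8
print(knapsack(values=[6, 10, 12], weights=[1, 2, 3], capacity=8))  # -> 28
```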


Another consideration in our research is balancing goals and limitations, in which the objectives and constraints define the optimization problem's goals and limitations within the mathematical model. The objective function specifies the quantity to be minimized or maximized, representing the primary aim of the optimization process [17]. In contrast, constraints articulate the restrictions that must be adhered to during the optimization. The objective function and constraints together encapsulate the essence of the problem and enable the identification of feasible solutions. Objectives may involve maximizing profit, minimizing costs, or achieving a balance between conflicting goals. Constraints can encompass resource limits, capacity restrictions, or regulatory requirements. The interplay between objectives and constraints is central to modeling real-world problems, as it defines the trade-offs and complexities inherent in decision-making [17].

Model formulation techniques represent the art of translating real-world problems into mathematical expressions within an optimization framework [9]. This process requires the skillful identification of decision variables, formulation of objectives, and representation of constraints to capture the essence of the problem. Various mathematical tools are employed, such as linear equations for linear programming, binary variables for integer programming, and nonlinear functions for nonlinear programming [7]. Moreover, specialized techniques exist for addressing specific problem types, such as network flow modeling for transportation problems and queuing models for service optimization [5]. Effective model formulation is a cornerstone of operations research, enabling practitioners to represent complex scenarios in a structured, solvable manner [4]. Model formulation is a crucial aspect of operations research, requiring the translation of real-world complexities into mathematical representations. This process involves selecting decision variables, formulating objectives, and defining constraints with precision [9]. The resulting mathematical model serves as the foundation for optimization and decision-making.

1.2 Crafting the Blueprint of Optimization

This essential step [9] in the optimization journey relies on the adept selection of decision variables, the precise formulation of objectives, and the meticulous representation of constraints. The creation of a well-structured mathematical model is akin to crafting a blueprint for optimization, and it is a skill that distinguishes proficient operations research practitioners. These models serve as the foundation upon which optimization algorithms are applied and informed decisions are made. In essence, model formulation bridges the chasm between the intricacies of the real world and the mathematical rigor required for optimization [7]. Mathematical tools are harnessed during this process: linear equations are wielded for linear programming, binary variables are enlisted for integer programming, and nonlinear functions are deployed when addressing nonlinear programming [7]. Furthermore, specialized techniques are often called upon to tackle specific problem types, such as network flow modeling for transportation quandaries or queuing models for optimizing service systems [5].
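As a worked illustration of the formulation steps just described — selecting decision variables, stating an objective, and writing constraints — the following restates, in standard LP notation, the same hypothetical production-planning data used in the earlier SciPy sketch; the coefficients are assumptions for exposition only.

```latex
\begin{align*}
\text{maximize}\quad   & 40x_1 + 30x_2 && \text{(profit objective)}\\
\text{subject to}\quad & 2x_1 + x_2 \le 100 && \text{(machine hours)}\\
                       & x_1 + x_2 \le 80 && \text{(labour hours)}\\
                       & x_1, x_2 \ge 0 && \text{(non-negativity)}
\end{align*}
```

Adding an integrality restriction such as $x_1, x_2 \in \mathbb{Z}_{\ge 0}$ would convert this formulation into the kind of integer program discussed in Sect. 1.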


The efficacy of model formulation reverberates throughout the optimization endeavor. A well-constructed model encapsulates the essence of the problem, enabling practitioners to discern optimal solutions efficiently. Conversely, an inadequately formulated model can introduce ambiguity and hinder the decision-making process. Hence, the artistry of crafting mathematical models serves as a cornerstone of operations research, enabling practitioners to navigate and solve complex problems with precision and insight. Dantzig [4] underscored its pivotal role in defining problem structures, enabling the application of optimization algorithms. Effective formulation, akin to designing a blueprint, ensures that optimization efforts are well-guided and systematically conducted. Various mathematical tools are employed in model formulation, depending on the problem type. Linear equations are fundamental in linear programming [7], while binary variables are key in integer programming. Specialized techniques are also crucial; network flow models, for instance, are invaluable for solving transportation and logistics problems [1]. The proficiency in crafting mathematical models reverberates throughout operations research. A well-structured model encapsulates problem essentials, facilitating efficient solution discovery. Conversely, poorly formulated models can hinder decision-making. Hence, the artistry of model formulation is an integral part of Operations Research, empowering practitioners to address complex problems effectively [5].

2 Applications of Algorithmic Optimization

Algorithmic optimization techniques, rooted in mathematical and computational principles, wield a transformative influence across multifarious industries, improving decision-making, judicious resource allocation, and operational performance. This section elucidates salient applications of algorithmic optimization within distinct industrial milieus, underpinned by academic research.

The sphere of transportation and logistics, streamlining movement and distribution, stands as an exemplar of algorithmic optimization's resounding impact, prominently manifest in the resolution of complex routing and scheduling conundrums [12]. Research endeavors [24] in this domain revolve around the meticulous optimization of routes, astute vehicle scheduling, and judicious inventory management. Algorithms for problems such as the renowned traveling salesman problem (TSP) serve as linchpins in orchestrating efficient resource allocation, curtailing fuel consumption, and shortening delivery timelines [10]. The contemporary e-commerce landscape underscores the paramount importance of timely goods delivery, rendering algorithmic optimization an indispensable asset in sustaining competitiveness [2].

Supply chain management, which balances the flow of goods and is characterized by intricate webs of suppliers, manufacturers, and distributors, finds its lodestar in algorithmic optimization, orchestrating a harmonious equilibrium within this complex ecosystem [11]. This process involves a network of various entities, including suppliers who provide raw materials, manufacturers who transform these materials into finished products, and distributors who ensure these products reach their intended customers. This network is often complex and interconnected, reflecting the multifaceted nature of
modern supply chains. Supply chain management is tasked with the challenge of ensuring that this flow of goods operates smoothly and efficiently. It involves making decisions related to sourcing, production, inventory management, transportation, and distribution, and these decisions impact factors such as cost, lead time, and customer satisfaction. Optimization algorithms consider a multitude of variables, constraints, and objectives to find the best possible solutions. For example, they may determine the most cost-effective routes for transportation, the optimal allocation of inventory across different locations, or the ideal production schedule to meet demand while minimizing costs. By using optimization techniques, supply chain managers can strike a balance between conflicting goals, such as minimizing costs while maintaining high service levels or reducing lead times while optimizing inventory levels. This equilibrium emphasizes that algorithmic optimization aims to create a state of balance and efficiency within the supply chain ecosystem. It seeks to align various elements of the supply chain, such as production, inventory, and transportation, in a way that maximizes overall performance. This orchestration of activities leads to a harmonious equilibrium where resources are utilized optimally, costs are minimized, and customer demands are met effectively [21].

Inventory optimization, demand forecasting, and network design constitute veritable foci of scholarly inquiry. Demand forecasting involves predicting future customer demand for products or services [19]. Accurate forecasting is essential for making informed decisions regarding production, inventory levels, and distribution. Demand forecasting models use historical data, market trends, and other relevant factors to generate forecasts. Researchers in this area explore advanced forecasting techniques, statistical models, and machine learning algorithms to improve the accuracy of predictions [20, 23]. Accurate demand forecasts are crucial for aligning supply chain operations with expected customer demand, reducing stockouts, and avoiding overstock situations [25]. Supply chain network design involves determining the optimal structure and configuration of a supply chain, including the locations of facilities (such as factories, warehouses, and distribution centers) and the relationships between them. This area of research addresses questions [13] such as where to position facilities to minimize transportation costs, how to design distribution networks for global supply chains, and how to account for factors like capacity constraints and demand variability [22]. Optimizing the supply chain network can result in cost savings, improved responsiveness, and enhanced customer service. The judicious optimization of supply chains translates into cost reduction, heightened adaptability to market dynamics, and amplified customer satisfaction [6]. Research endeavors in this sphere frequently delve into multi-objective optimization paradigms, adeptly negotiating the trade-offs between objectives such as cost minimization and service level maximization.

In the world of healthcare, by improving patient care and resource allocation, algorithmic optimization represents a paradigm shift, revolutionizing the orchestration of operations and encompassing facets like hospital resource allocation, nurse scheduling, and patient routing [8].
These optimization techniques usher in operational efficiency within healthcare institutions, mitigating patient waiting times and judiciously allocating resources [3]. Given the high-stakes nature of healthcare decisions, algorithmic optimization continually advances, spanning domains like healthcare supply chain optimization and the design of healthcare delivery networks.
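To ground the routing applications discussed at the start of this section, the sketch below shows a nearest-neighbour construction heuristic for a tiny, hypothetical TSP instance; it is illustrative only, and practical routing systems rely on far more sophisticated exact and metaheuristic methods.

```python
# A toy nearest-neighbour heuristic for the TSP (illustrative only): start at
# city 0 and repeatedly visit the closest unvisited city. The distance matrix
# is hypothetical and symmetric.
def nearest_neighbour_tour(dist):
    n = len(dist)
    unvisited = set(range(1, n))
    tour, current = [0], 0
    while unvisited:
        nxt = min(unvisited, key=lambda city: dist[current][city])
        unvisited.remove(nxt)
        tour.append(nxt)
        current = nxt
    tour.append(0)  # return to the depot
    return tour, sum(dist[a][b] for a, b in zip(tour, tour[1:]))

dist = [[0, 2, 9, 10],
        [2, 0, 6, 4],
        [9, 6, 0, 3],
        [10, 4, 3, 0]]
print(nearest_neighbour_tour(dist))  # -> ([0, 1, 3, 2, 0], 18)
```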

3 Conclusion and Challenges

In overview, the process of creating mathematical models in operations research involves defining decision variables, formulating objectives and constraints, and employing various mathematical techniques to represent real-world problems [29]. These models serve as the foundation for applying optimization algorithms and making informed decisions in diverse fields. Operations research (OR) is a multidisciplinary field that applies mathematical modeling and analytical methods to make informed decisions in complex systems. OR problems encompass a wide range of applications, from logistics and supply chain management to healthcare and finance. One of the fundamental challenges in OR is finding optimal or near-optimal solutions to these intricate problems efficiently. This paper delves into the theoretical framework of algorithmic optimization techniques for operations research problems, shedding light on key concepts and strategies.

The theoretical foundation of this paper begins with a categorization of optimization problems commonly encountered in OR. Problems can be broadly classified into linear programming, integer programming, nonlinear programming, dynamic programming, and more. Understanding the mathematical characteristics of each problem type is essential for selecting appropriate algorithmic techniques. For instance, linear programming problems involve linear relationships between variables and can be solved using methods like the Simplex algorithm. Theoretical models form the core of OR, translating real-world problems into mathematical equations. These models define decision variables, objectives, and constraints, providing a structured representation of the problem. This paper explores the development of mathematical models in depth, emphasizing their role in the theoretical framework. It discusses techniques for formulating models, including decision trees, network models, and queuing models, each tailored to address specific types of OR problems.

Algorithmic techniques are at the heart of solving OR problems efficiently. This paper outlines various algorithmic approaches, including exact algorithms, heuristic methods, and metaheuristic algorithms. Exact algorithms aim to find the global optimal solution, while heuristics and metaheuristics offer near-optimal solutions within reasonable time frames [27]. Theoretical insights into the strengths and weaknesses of these approaches are discussed, helping practitioners make informed choices. Selecting the right algorithm and designing it effectively are critical steps in the optimization process, and theoretical considerations play a crucial role in making these decisions; for instance, when dealing with large-scale optimization problems, decomposition techniques may be employed to break the problem into smaller, more manageable parts. The theoretical framework laid out in this paper serves as a foundation for addressing real-world challenges across diverse domains. It illustrates the practical applications of algorithmic optimization techniques in optimizing transportation routes, inventory management, production scheduling, and more. Theoretical insights enable researchers and practitioners to adapt and customize algorithms to suit specific industry needs.


In conclusion, this theoretical framework stands as a testament to the relentless pursuit of efficiency, informed decision-making, and innovative problem-solving. This paper has journeyed through the theoretical underpinnings of algorithmic optimization techniques, unveiling their profound implications across a spectrum of industries and applications. As we delved into the classification of optimization problems, the significance of delineating problem structures became evident. Linear programming, integer programming, nonlinear programming, and dynamic programming emerged as the cornerstone tools, each tailored to address specific complexities within optimization challenges. Through academic lenses, we scrutinized the art of crafting mathematical models, emphasizing their role as blueprints guiding the optimization journey. The applications of algorithmic optimization, explored in domains as diverse as transportation and logistics, supply chain management, healthcare, finance, manufacturing, and numerous others, unveiled a tapestry of efficiency-enhancing endeavors. The academic research cited herein has showcased the transformative power of optimization algorithms in reshaping industries, streamlining processes, and bolstering competitiveness.

In the ever-evolving landscape of operations research, the role of algorithmic optimization continues to burgeon. As technologies advance and data proliferate, the potential for optimization-driven insights and solutions amplifies. The future beckons with promises of more sophisticated algorithms, real-time decision support, and a deeper integration of artificial intelligence. In this dynamic tapestry of optimization, we find not just a suite of mathematical tools but a testament to human ingenuity and our perpetual quest to harness the full potential of data, computation, and analytics. Algorithmic optimization is not merely a means to an end; it reflects our unwavering commitment to making the world more efficient, sustainable, and well-informed. In doing so, we embark on a journey of discovery, ever closer to unlocking the myriad possibilities that lie at the intersection of mathematics, computation, and the challenges of our complex world.

This article presented an extensive review of the crucial role played by algorithms in operational research. We suggested potential future research directions from both topic and methodology perspectives. From a methodological perspective, researchers could explore and verify various OR and algorithmic methods across multiple studies. Regarding future research topics, efficiency forecasting related to the evaluation of extreme events in climatic conditions could justify further exploration, as could the investigation of risk awareness, which has received very limited attention in the academic literature to date. Future studies might also explore the impacts of alarm insight regulations and managerial behaviors on risk-taking by people who face the challenges of climate change. Finally, future research could also apply other AI methods (e.g., unsupervised machine learning) or fresh combinations of OR and AI techniques to climate research.

References

1. Ahuja, R.K., Magnanti, T.L., Orlin, J.B.: Network Flows: Theory, Algorithms, and Applications. Prentice Hall, Upper Saddle River (1993)
2. Bektas, T.: The multiple traveling salesman problem: an overview of formulations and solution procedures. Omega 34(3), 209–219 (2006)
3. Brandeau, M.L., Sainfort, F., Pierskalla, W.P.: Operations Research and Health Care: A Handbook of Methods and Applications. Kluwer Academic Publishers, Dordrecht (2004)
4. Dantzig, G.B.: Linear Programming and Extensions. Princeton University Press, Princeton (1963)
5. Hillier, F.S., Lieberman, G.J.: Introduction to Operations Research. McGraw-Hill, New York (2014)
6. Jain, A., Meeran, S.: Multi-objective optimization of supply chain networks. Comput. Chem. Eng. 27(8–9), 1155–1174 (2003)
7. Nocedal, J., Wright, S.J.: Numerical Optimization. Springer, Cham (2006)
8. Ozcan-Top, O., McCarthy, T.: A review of optimization models for patient admission scheduling in hospitals. Health Care Manag. Sci. 20(2), 139–162 (2017)
9. Papadimitriou, C.H., Steiglitz, K.: Combinatorial Optimization: Algorithms and Complexity. Dover Publications, New York (1998)
10. Salhi, S.: Heuristic Algorithms for the Vehicle Routing Problem. Wiley, Hoboken (2000)
11. Simchi-Levi, D., Kaminsky, P., Simchi-Levi, E.: Designing and Managing the Supply Chain: Concepts, Strategies, and Case Studies. McGraw-Hill, New York (2019)
12. Toth, P., Vigo, D.: The Vehicle Routing Problem. Society for Industrial and Applied Mathematics, Philadelphia (2002)
13. Ackoff, R.L.: The future of operational research is past. J. Oper. Res. Soc. 30(2), 93–104 (1979)
14. Bakker, H., Fabian, D., Stefan, N.: A structuring review on multi-stage optimization under uncertainty: Aligning concepts from theory and practice. Omega 96, 102080 (2020)
15. Bertsimas, D., Dunn, J.: Optimal classification trees. Mach. Learn. 106, 1039–1082 (2017)
16. Carrizosa, E.: Enhancing interpretability in factor analysis by means of mathematical optimization. Multivar. Behav. Res. 55(5), 748–762 (2020)
17. Cazals, C., Florens, J., Simar, L.: Nonparametric frontier estimation: a robust approach. J. Econometr. 106(1), 1–25 (2002)
18. Jordan, M.I., Mitchell, T.M.: Machine learning: trends, perspectives, and prospects. Science 349(6245), 255–260 (2015)
19. Kumar, A.M., Satyanarayana, S., Wilson, N., Zachariah, R., Harries, A.D.: Operational research capacity building in Asia: innovations, successes and challenges of a training course. Publ. Health Action 3(2), 186–8 (2013). https://doi.org/10.5588/pha.13.0008. PMID: 26393025; PMCID: PMC4463115
20. Pu, G., Wang, Y.: Survey of undergraduate OR courses from IE programme: content, coverage, and gaps. J. Oper. Res. Soc., 1–20 (2023)
21. Vilalta-Perdomo, E., Hingley, M.: Beyond links and chains in food supply: a community OR perspective. J. Oper. Res. Soc. 69(4), 580–588 (2018)
22. Koskela, L.: Why is management research irrelevant? Constr. Manag. Econ. 35(1–2), 4–23 (2017)
23. O’Brien, F.A.: On the roles of OR/MS practitioners in supporting strategy. J. Oper. Res. Soc. 66(2), 202–218 (2015)
24. Ackoff, R.L.: Resurrecting the future of operational research. J. Oper. Res. Soc. 30, 189–199 (1979)
25. EPSRC: Review of the Research Status of Operational Research in the UK. Engineering and Physical Sciences Research Council, Swindon (2004)
26. Rosenhead, J.: Reflections on fifty years of operational research. J. Oper. Res. Soc. 60(sup1), S5–S15 (2009). https://doi.org/10.1057/jors.2009.13
27. Kunc, M.: System Dynamics: Soft and Hard Operational Research. Springer, Cham (2017)

Complex Comparison of Statistical and Econometrics Methods for Sales Forecasting

Oleksandr Kosovan(B) and Myroslav Datsko

Ivan Franko University of Lviv, Lviv, Ukraine
{Oleksandr.Kosovan.AEKE,myroslav.datsko}@lnu.edu.ua

Abstract. Sales forecasting holds substantial significance in shaping decision-making processes in the retail industry. This study investigates the contemporary landscape of sales forecasting methods, aiming to provide empirical insights into the performance of various statistical and econometric models. By rigorously evaluating these models across diverse datasets, we identify stable methods that consistently demonstrate reliable predictive capabilities. Our research contributes to the field by offering baseline models that can furnish trustworthy forecasts, guiding practical applications and future research efforts. The paper details the study’s methodology, results, and discussions, enabling a comprehensive understanding of the strengths, limitations, and implications of the evaluated forecasting methods.

Keywords: sales forecasting · retail · econometrics · time series

1 Introduction

Sales forecasting plays a pivotal role in both theoretical and practical domains, influencing critical decision-making processes in the retail industry. As outlined in Sect. 2, our investigation into the current landscape of the sales forecasting domain has revealed a need for empirical knowledge regarding the performance of various statistical and econometric methods. This study is driven by the central objective of addressing the challenge of selecting robust forecasting models for sales prediction across multifaceted business contexts. Through rigorous assessment of numerous statistical and econometric models across a spectrum of datasets, our research endeavors to pinpoint models that consistently exhibit dependable performance. The resulting insights will furnish recommendations for baseline models capable of providing reliable forecasts, thereby guiding both practical implementations and future research endeavors.
Detailed information regarding the utilized datasets can be found in Sect. 3, followed by comprehensive descriptions of the employed statistical and econometric methods in Sect. 6. Our experimental design is outlined in Sect. 4, and the evaluation metrics employed for assessing model performance are detailed in Sect. 5. The subsequent sections, namely Sects. 7 and 8, present a thorough exposition of the study's findings and subsequent discussions, leading to a holistic understanding of the efficacy and implications of the evaluated forecasting methods. The full code base for our experiments is publicly available on GitHub1 and the results can be reproduced.

2 Background

There are various methods of research in the forecasting area, and one of them is through competitions such as the M competitions. Hyndman highlighted the transformative impact of the M competition series, a collection of major forecasting contests that have driven substantial advancements in theory and practice [15]. Other examples are competitions like the "M5 Forecasting Accuracy," hosted on Kaggle, which have yielded practical insights and methodological innovations, contributing to the field's progress [14,22]. Similar competitions, including the "Corporación Favorita Grocery Sales Forecasting" based on Ecuadorian retail data [8] and "HACK4Retail" focused on Ukrainian retail [19], further enrich the landscape.
Davis and Mentzer conducted a study that explored the interplay of organizational factors in sales forecasting management, shedding light on the significance of integrated approaches [5]. Empirical investigations have expanded the horizon, examining diverse methodologies on specific datasets. For instance, Ensafi et al. compared forecasting methods like Prophet, LSTM, and CNN [7]. Vallés-Pérez et al. extended this exploration to encompass RNNs and transformers [26], contributing to the methodological diversity. Cross-dataset research also plays a role: Haselbeck et al. engaged in a comparative analysis of machine learning solutions within sales forecasting, yielding valuable insights into model performance [12].
The primary objective of this study is to address the challenge of selecting effective forecasting models for sales prediction across diverse business contexts. By rigorously evaluating the performance of various statistical and econometric models across a range of data sets, this study aims to identify models that consistently demonstrate reliable performance. The outcomes will offer recommendations for baseline models that can provide trustworthy forecasts, guiding both practical applications and future research. The central research question is: What is the relative performance of different statistical and econometric models in the context of sales forecasting across a diverse list of data sets? In line with this question, it is hypothesized that discernible patterns will emerge, highlighting models that consistently exhibit stable performance across diverse data sets. These models are expected to serve as valuable benchmarks for future investigations, contributing to the advancement of research in this critical domain.

1 GitHub, https://github.com/OleksandrKosovan/complex-sales-forecasting.

3 Data

In this study, we utilized three distinct and diverse data sets (see Table 1), each comprising a comprehensive retail history encompassing a wide list of products, geographical locations, and shops, all organized within a specific hierarchical structure. By incorporating these distinct data sets, we aimed to capture a comprehensive representation of the complexities and nuances present in retail environments. The summary statistics reveal that the data sets have different scales.

Table 1. The data set features

Feature Name       | M5           | Fozzy Group  | Corporación Favorita
Country            | USA          | Ukraine      | Ecuador
Products Count     | 3049         | 1961         | 4400
Start Date         | Jan 29, 2011 | Jan 1, 2020  | Jan 1, 2013
End Date           | Jun 19, 2016 | Jul 19, 2021 | Aug 15, 2017
Mean               | 1.29         | 0.21         | 8.58
Standard deviation | 4.15         | 1.23         | 21.92
Minimum            | 0.0          | 0.0          | 0.0
Q1 (25%)           | 0.0          | 0.0          | 2.0
Q2 (50%)           | 0.0          | 0.0          | 4.0
Q3 (75%)           | 1.0          | 0.0          | 9.0
Maximum            | 763          | 801          | 44142.0

One of the data sets utilized in our analysis is the M5 data set, made available by Walmart through a Kaggle competition. This data set encompasses the unit sales of 3049 distinct products sold across 10 stores situated in three states within the United States: California (CA), Texas (TX), and Wisconsin (WI). The product offerings are further categorized into three broad product categories, namely Hobbies, Foods, and Household, and subsequently disaggregated into seven more specific product departments. The temporal scope of the data spans from January 29, 2011, to June 19, 2016, a duration of approximately 5.4 years, or 1,969 days [14,22].
The second data set was provided by the Fozzy Group (a Ukrainian retail company) and covers the unit sales of products sold in Ukraine. This data set comprises three distinct components: the time series, geographical, and SKU data sets. The time series data set chronicles the sales history of 1961 unique stock-keeping units (SKUs) spanning from January 1, 2020, to July 19, 2021. The SKU data set augments this information with metadata pertaining to each individual unit, including classifications into commodity groups, categories, types, and brand affiliations. The geographical distribution of sales is encoded across 515 distinct geo clusters, allowing for a comprehensive analysis based on both location and product attributes [19].
Another crucial data set underpinning our study is the Corporación Favorita Grocery Sales Forecasting data set, which encompasses detailed unit sales records. This data set, released by Corporación Favorita, a prominent Ecuadorian company with an extensive network of supermarkets across Latin America, was made available as part of a Kaggle competition in 2017. The data set chronicles daily sales records for a total of 4400 unique items across 54 different Ecuadorian stores, encompassing a temporal span from January 1, 2013, to August 15, 2017. The accompanying data files provide additional pertinent information that holds potential utility in the development of predictive models. The comprehensive nature of this data set, coupled with its geographical specificity, presents an intriguing opportunity to explore and forecast the dynamics of grocery sales within the Ecuadorian retail landscape [8,26].

4 Experiments Design

To explore various statistical and econometric methodologies for sales forecasting, we defined an experimental design that ensures consistency across data sets, models, and time series. The primary objective of our experimental design was to establish a uniform framework, facilitated by the Open Source Time Series Ecosystem "nixtla", which integrates the robust "StatsForecast" package with implemented time series forecasting models [11]. To achieve this uniformity, we harmonized each data set by organizing it into the following structured columns:

– unique-id – a distinctive identifier for each time series.
– ds – the date of observation.
– y – the sales quantity corresponding to each date and time series.

It should be noted that we opted to exclude additional metadata present in the data sets, instead focusing on establishing a consistent data structure across all scenarios. We use a list of econometric and statistical models available within "StatsForecast" (see Sect. 6). To evaluate the performance of each model, we use a set of specialized evaluation metrics (see Sect. 5).
For our experimentation, we standardized the configuration across all data sets and individual time series. Our forecast horizon was set at 14 days, and we used a rigorous cross-validation methodology. The cross-validation process involves sliding windows across historical data and predicting subsequent periods. Leveraging the distributed capabilities of "StatsForecast", we executed cross-validation with 3 windows, ensuring a rigorous assessment of models' performance across diverse temporal contexts. The step size for each window mirrored the forecasting horizon, which, in our case, was also set at 14 days. Refer to Fig. 1 for an illustrative representation.

Fig. 1. Example of cross-validation

Upon completion of the cross-validation step, our data frame was enriched with two additional columns: the model's predictions and the "cutoff" timestamp, denoting the final datestamp or temporal index for each window. This augmentation facilitated an in-depth evaluation of models and individual time series for each window. Such analysis included examining the distribution of model quality metrics in various contexts.
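As a minimal, illustrative sketch of this experimental setup (not the authors' published code base, which is linked in the Introduction), the snippet below shows how such a cross-validation run can be configured with the StatsForecast package. The CSV file name, the exact column spelling unique_id, and the two example models are assumptions made for the illustration.

```python
# Minimal sketch: harmonized long-format data and sliding-window cross-validation.
import pandas as pd
from statsforecast import StatsForecast
from statsforecast.models import CrostonSBA, SeasonalNaive

# StatsForecast expects one row per (series, date) with columns unique_id, ds, y.
df = pd.read_csv("sales_long.csv", parse_dates=["ds"])  # hypothetical input file
df = df[["unique_id", "ds", "y"]]

sf = StatsForecast(
    models=[CrostonSBA(), SeasonalNaive(season_length=7)],  # illustrative subset
    freq="D",    # daily retail observations
    n_jobs=-1,   # run series in parallel
)

# 14-day horizon, 14-day step, 3 sliding windows, as described above.
cv_df = sf.cross_validation(df=df, h=14, step_size=14, n_windows=3)

# The output holds the actuals, one prediction column per model,
# and a `cutoff` column marking the last training date of each window.
print(cv_df.head())
```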

5 Metrics

The metrics for forecasting performance can be classified as scale-dependent and scale-independent. Scale-dependent measures, such as MSE, RMSE, and MAE, are sensitive to the scale of the data. These measures are particularly relevant when comparing forecasts made on data sets of similar scales. However, caution is warranted when comparing forecasts across series with differing scales, as scale-dependent measures can yield misleading results [3,9,18]. Scale independence has been identified as a key characteristic of effective metrics [18,21], especially when dealing with heterogeneous data sets. Consequently, our primary focus lies on the use of scale-independent metrics, namely Mean Absolute Percentage Error (MAPE) (Formula 1) and Symmetric Mean Absolute Percentage Error (sMAPE) (Formula 2). These measures are designed to provide meaningful comparisons across series with varying scales, making them suitable candidates for our evaluation.

\mathrm{MAPE} = \frac{1}{n}\sum_{i=1}^{n}\left|\frac{Y_i - \hat{Y}_i}{Y_i}\right| \times 100\%   (1)

where n is the total number of observations, Y_i is the actual value of the i-th observation, and \hat{Y}_i is the predicted value of the i-th observation.

\mathrm{sMAPE} = \frac{1}{n}\sum_{i=1}^{n}\frac{|Y_i - \hat{Y}_i|}{(|Y_i| + |\hat{Y}_i|)/2} \times 100\%   (2)

where the variables have the same meanings as in Eq. 1.
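To make Eqs. (1) and (2) concrete, the following NumPy sketch (our own illustration, not code from the paper) computes both metrics for a single series of actuals and forecasts:

```python
import numpy as np

def mape(y, y_hat):
    """Mean Absolute Percentage Error, Eq. (1), in percent.
    Note: undefined when y contains zeros, which matters for sparse retail series."""
    y, y_hat = np.asarray(y, dtype=float), np.asarray(y_hat, dtype=float)
    return np.mean(np.abs((y - y_hat) / y)) * 100.0

def smape(y, y_hat):
    """Symmetric MAPE, Eq. (2), in percent; bounded above by 200%."""
    y, y_hat = np.asarray(y, dtype=float), np.asarray(y_hat, dtype=float)
    return np.mean(np.abs(y - y_hat) / ((np.abs(y) + np.abs(y_hat)) / 2.0)) * 100.0

y_true = [3.0, 2.0, 4.0, 7.0]
y_pred = [2.5, 2.4, 4.0, 8.0]
print(round(mape(y_true, y_pred), 2), round(smape(y_true, y_pred), 2))
```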

6 Models

We use different statistical and econometric methods that are available in the StatsForecast package. In this section, we describe models from different families: baseline models, exponential smoothing models, theta models, etc.

6.1 Baseline Models

The Historic Average method (referred to as HistoricAverage), also known as the mean method, operates by setting the forecasts of all future values equal to the average of the historical data [16]. If we denote the historical data by y_1, ..., y_T, the forecasts can be expressed as Formula 3.

\hat{y}_{T+h|T} = \bar{y} = \frac{y_1 + \dots + y_T}{T}   (3)

where \hat{y}_{T+h|T} represents the estimate of y_{T+h} based on the data y_1, ..., y_T.
The Window Average method (referred to as WindowAverage) calculates forecasts by taking the average of the last N observations, where N represents the length of the window [16]. The forecasts can be expressed using Formula 4.

\hat{y}_{T+h|T} = \frac{y_{T-N+1} + \dots + y_T}{N}   (4)

where N is the window length.
The Seasonal Window Average method (referred to as SeasWA) functions similarly to the Window Average method, with the addition of an extra parameter: the season length. This method computes forecasts by averaging the last N observations within each seasonal period, where N corresponds to the length of the window. The inclusion of the season length parameter allows the Seasonal Window Average method to consider the specific periodicity of the time series data, making it suitable for forecasting seasonal patterns [16].
The Naive model (referred to as Naive) employs a straightforward forecasting approach, where all future forecasts are set equal to the value of the most recent observation. This uncomplicated method assumes that the future behavior of the time series will mirror its most recent data point (Formula 5). The Naive model can serve as a baseline for comparison against more sophisticated forecasting techniques [16].

\hat{y}_{T+h|T} = y_T   (5)

The Seasonal Naive model (referred to as SeasonalNaive) is similar to the Naive method and proves especially valuable when dealing with highly seasonal data. In this approach, each forecast is set equal to the last observed value from the corresponding season of the previous year, effectively leveraging the seasonal patterns within the time series [16]. The mathematical representation of the Seasonal Naive model for forecasting at time T + h is defined as:

\hat{y}_{T+h|T} = y_{T+h-m(k+1)}   (6)

where m signifies the length of the seasonal period, and k corresponds to the integer part of (h − 1)/m, representing the number of complete years in the forecast period leading up to time T + h.
The Random Walk With Drift model (referred to as RandomWalkWithDrift) presents a nuanced variant of the naive method, allowing forecasts to exhibit gradual increases or decreases over time. This approach incorporates a concept known as drift, which signifies the average change observed in historical data. Consequently, the forecast for time T + h is expressed as:

\hat{y}_{T+h|T} = y_T + \frac{h}{T-1}\sum_{t=2}^{T}(y_t - y_{t-1}) = y_T + h\left(\frac{y_T - y_1}{T-1}\right)   (7)

In essence, the Random Walk With Drift model is akin to drawing a line connecting the initial and final observations and extending it into the future. By capturing the average change in the historical data, this model accommodates gradual trends in the forecasts, providing a method for forecasting that maintains a level of realism [16].
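For illustration only, Formulas 3–7 can be implemented for a single series in a few lines of NumPy; the function names and the toy series below are our own, not part of the paper's code base:

```python
import numpy as np

def historic_average(y, h):               # Formula (3)
    return np.full(h, y.mean())

def window_average(y, h, window):         # Formula (4)
    return np.full(h, y[-window:].mean())

def naive(y, h):                          # Formula (5)
    return np.full(h, y[-1])

def seasonal_naive(y, h, m):              # Formula (6): repeat the last season
    last_season = y[-m:]
    return np.array([last_season[i % m] for i in range(h)])

def random_walk_with_drift(y, h):         # Formula (7): extend the first-to-last line
    drift = (y[-1] - y[0]) / (len(y) - 1)
    return y[-1] + drift * np.arange(1, h + 1)

y = np.array([3, 4, 2, 5, 6, 4, 3, 5, 7, 4, 2, 6, 8, 5], dtype=float)
print(seasonal_naive(y, h=14, m=7))
print(random_walk_with_drift(y, h=14))
```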

6.2 Exponential Smoothing Models

The Simple Exponential Smoothing method (referred to as SES) is a forecasting approach that employs a weighted average of past observations, with weights diminishing exponentially as they extend into the past. This method is particularly useful for data that lacks discernible trends or seasonality. Given t observations, the one-step forecast is calculated as follows:

\hat{y}_{t+1} = \alpha y_t + (1 - \alpha)\hat{y}_{t-1}   (8)

Here, the parameter 0 ≤ α ≤ 1 represents the smoothing parameter, which determines the rate at which weights decline. When α = 1, SES simplifies to the naive method [13].
The Holt method (referred to as Holt) is an extension of exponential smoothing designed to accommodate time series data with trends. This method takes into account both the level and the trend of the series. By integrating an exponential smoothing factor for both the observed level and trend, the Holt method offers an effective way to forecast series characterized by varying trends [13,16].
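A minimal sketch of the SES recursion in Eq. (8), written for illustration with an arbitrary smoothing value (StatsForecast accepts or optimizes this parameter itself):

```python
import numpy as np

def ses_forecast(y, alpha, h):
    """Simple exponential smoothing: recursively update the smoothed level,
    then repeat the final level over the forecast horizon (Eq. 8)."""
    level = y[0]
    for obs in y[1:]:
        level = alpha * obs + (1.0 - alpha) * level
    return np.full(h, level)

y = np.array([12, 15, 14, 16, 18, 17, 19], dtype=float)
print(ses_forecast(y, alpha=0.3, h=14))
```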

6.3 Sparse or Intermittent Models

The Aggregate-Dissagregate Intermittent Demand Approach (referred to as ADIDA) leverages temporal aggregation to mitigate the impact of zero observations. It applies optimized Simple Exponential Smoothing (SES) at the aggregated level and then disaggregates forecasts using equal weights. ADIDA is tailored for sparse or intermittent time series with minimal non-zero data points, offering a specialized solution for improved forecasting in such challenging scenarios [24].


The Intermittent Multiple Aggregation Prediction Algorithm (referred to as IMAPA) extends the concept of ADIDA by incorporating multiple aggregation levels to account for diverse data dynamics. It employs optimized SES for generating forecasts across these levels and combines them through a straightforward average aggregation [2].
The Croston method (referred to as CrostonClassic) is a technique for forecasting time series characterized by intermittent demand. It involves decomposing the original time series into two components: non-zero demand sizes denoted as z_t, and inter-demand intervals denoted as p_t [4]. The forecast is then given by:

\hat{y}_t = \frac{\hat{z}_t}{\hat{p}_t}   (9)

where \hat{z}_t and \hat{p}_t are forecasted using SES. Both components are smoothed with a common smoothing parameter of 0.1.
The Croston SBA method (referred to as CrostonSBA) combines the classic Croston method with the Syntetos-Boylan Approximation, a bias correction that improves forecast precision for intermittent demand series [4].
The Teunter-Syntetos-Babai method (referred to as TSB) is akin to the Croston method but uses demand probabilities d_t instead of demand intervals to estimate demand sizes. It is particularly suitable for time series data characterized by extended periods of zero demand. Demand probabilities are defined as:

d_t = \begin{cases} 1 & \text{if demand occurs at time } t \\ 0 & \text{otherwise} \end{cases}   (10)

Consequently, the forecast is expressed as:

\hat{y}_t = \hat{d}_t \hat{z}_t

Both \hat{d}_t and \hat{z}_t are estimated using SES. The smoothing parameters for each component may differ, similar to the optimized Croston's method [25].
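The Croston decomposition in Eq. (9) can be sketched as follows; the fixed smoothing parameter of 0.1 follows the description above, while the function itself is our own simplified illustration rather than the StatsForecast implementation:

```python
import numpy as np

def croston_classic(y, h, alpha=0.1):
    """Croston's method: smooth non-zero demand sizes (z) and inter-demand
    intervals (p) with SES, then forecast the ratio z_hat / p_hat (Eq. 9)."""
    nonzero_idx = np.flatnonzero(y > 0)
    if nonzero_idx.size == 0:
        return np.zeros(h)
    z = y[nonzero_idx]                                # non-zero demand sizes
    p = np.diff(np.concatenate(([-1], nonzero_idx)))  # intervals between demands
    z_hat, p_hat = float(z[0]), float(p[0])
    for zi, pi in zip(z[1:], p[1:]):
        z_hat = alpha * zi + (1.0 - alpha) * z_hat
        p_hat = alpha * pi + (1.0 - alpha) * p_hat
    return np.full(h, z_hat / p_hat)

y = np.array([0, 0, 3, 0, 0, 0, 2, 0, 1, 0, 0, 4], dtype=float)
print(croston_classic(y, h=14))
```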

6.4 Multiple Seasonalities

Multiple Seasonal-Trend decomposition using LOESS (referred to as MSTL) is designed for time series with multiple seasonal cycles. MSTL is an automated algorithm that extends the STL decomposition: it iteratively estimates multiple seasonal components, controlling their smoothness and separating variations. For non-seasonal series, MSTL determines trend and remainder components [1].

6.5 Theta Family

The Theta method (referred to as Theta) is utilized for non-seasonal or deseasonalized time series, often achieved through multiplicative classical decomposition. This approach transforms the original time series into two new lines using theta coefficients, maintaining the same mean and slope but adjusting local curvatures based on the coefficient's value [10].

6.6 ARCH Family

The Autoregressive Conditional Heteroskedasticity model (referred to as ARCH) is a statistical model used in time series analysis to characterize the variance of the current innovation based on the magnitudes of past error terms, often considering their squares [6,23]. It assumes that at time t, y_t is expressed as:

y_t = \epsilon_t \sigma_t   (11)

where \epsilon_t is a sequence of random variables with zero mean and unit variance, and \sigma_t^2 is defined by:

\sigma_t^2 = w_0 + \sum_{i=1}^{p} a_i y_{t-i}^2   (12)

Here, w_0 and a_i, i = 1, ..., p, are coefficients that must satisfy nonnegativity conditions, and \sum_{k=1}^{p} a_k < 1.

6.7 ARIMA Family

The Autoregressive model (referred to as AutoRegressive) is a fundamental time series model that predicts future values based on their linear dependence on past observations [17].

y_t = c + \sum_{i=1}^{p} \phi_i y_{t-i} + \epsilon_t   (13)

where \phi_i are the coefficients, c is a constant term, and \epsilon_t represents the white noise error term.
The Autoregressive Integrated Moving Average model (ARIMA) is a widely used time series forecasting method that combines autoregressive (AR), differencing (I), and moving average (MA) components. It captures the dependencies between past observations, applies differencing to make the series stationary, and incorporates the influence of past forecast errors. The model is denoted as ARIMA(p, d, q), where p, d, and q represent the orders of the AR, differencing, and MA components, respectively [17].
The AutoARIMA model employs an automated approach to select the optimal ARIMA model based on the corrected Akaike Information Criterion (AICc), a well-known information criterion [17].
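Assuming the models described in this section are instantiated through StatsForecast, a roughly equivalent model list could look as follows. The season length of 7 and the individual hyperparameter values are illustrative placeholders, since the paper does not report the exact settings used:

```python
from statsforecast.models import (
    ADIDA, ARCH, ARIMA, AutoARIMA, AutoRegressive, CrostonClassic, CrostonSBA,
    HistoricAverage, Holt, IMAPA, MSTL, Naive, RandomWalkWithDrift,
    SeasonalNaive, SeasonalWindowAverage, SimpleExponentialSmoothing,
    Theta, TSB, WindowAverage,
)

season_length = 7  # assumed weekly seasonality for daily retail sales

models = [
    HistoricAverage(),
    WindowAverage(window_size=14),
    SeasonalWindowAverage(season_length=season_length, window_size=2),
    Naive(),
    SeasonalNaive(season_length=season_length),
    RandomWalkWithDrift(),
    SimpleExponentialSmoothing(alpha=0.1),
    Holt(),
    ADIDA(),
    IMAPA(),
    CrostonClassic(),
    CrostonSBA(),
    TSB(alpha_d=0.2, alpha_p=0.2),
    MSTL(season_length=season_length),
    Theta(season_length=season_length),
    ARCH(p=1),
    AutoRegressive(lags=[1, 7, 14]),
    ARIMA(order=(1, 1, 1)),
    AutoARIMA(season_length=season_length),
]
```

A list like this can be passed directly to the StatsForecast constructor shown in the earlier cross-validation sketch.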

7 Results

The error analysis of forecast accuracy for three different data sets and the list of methods using MAPE and sMAPE metrics revealed interesting empirical insights. For the M5 data set, the mean MAPE and sMAPE were 31.48 and 136.83, respectively, showcasing a moderate overall accuracy (see Table 2). Contrasting this, the Fozzy Group data set revealed a notable degree of variability. It displayed a mean MAPE of 8.65, while the sMAPE mean was considerably higher at 157.75. Remarkably, the Fozzy Group data set exhibited substantial standard deviations in both metrics, hinting at the potential presence of outliers or significant variations (see Table 3). Turning to the Corporación Favorita data set, a different pattern emerged. The mean MAPE registered at 191.39, while the sMAPE mean was relatively lower at 59.62 (see Table 4). These findings underscore the diversified performance of forecasting methodologies across distinct data sets, offering valuable insights for optimizing parameters and model selection.

Table 2. Statistical Summary of Forecasting Performance on M5 Data Set

Statistic             | Value (MAPE) | Value (sMAPE)
Number of time-series | 1737930      | 1737930
Mean                  | 31.48        | 136.83
Standard deviation    | 47.48        | 52.85
Minimum               | 0.00         | 0.00
Q1 (25%)              | 12.76        | 94.32
Q2 (50%)              | 23.34        | 147.55
Q3 (75%)              | 39.06        | 185.88
Maximum               | 44204.92     | 200.00

Table 3. Statistical Summary of Forecasting Performance on Fozzy Group Data Set

Statistic             | Value (MAPE) | Value (sMAPE)
Number of time-series | 1656504      | 1656504
Mean                  | 8.65         | 157.75
Standard deviation    | 3020.11      | 75.57
Minimum               | 0.00         | 0.00
Q1 (25%)              | 0.00         | 170.34
Q2 (50%)              | 0.00         | 200.00
Q3 (75%)              | 6.54         | 200.00
Maximum               | 3263154      | 200.00

Table 4. Statistical Summary of Forecasting Performance on Corporación Favorita Data Set

Statistic             | Value (MAPE) | Value (sMAPE)
Number of time-series | 2910458      | 2910458
Mean                  | 191.39       | 59.62
Standard deviation    | 147333.13    | 30.23
Minimum               | 0.00         | 0.00
Q1 (25%)              | 49.00        | 42.29
Q2 (50%)              | 71.79        | 52.85
Q3 (75%)              | 113.93       | 66.79
Maximum               | 251348620    | 200.00

The comparative summary statistics for the M5 data set (see Table 5) show that models such as AutoRegressive, CrostonSBA, and CrostonClassic exhibit similar mean MAPE values (ranging from 25.98 to 26.48), indicating consistent predictive accuracy with relatively moderate variability. However, the ARCH model deviates with a substantially higher mean MAPE of 67.99, accompanied by a notable standard deviation of 158.8, suggesting less stable performance. The quartiles (Q1, Q2, Q3) further illustrate the spread of performance across the models, showcasing variations in accuracy levels.

Table 5. Comparative Summary of Forecasting Model Performance on M5 Data Set

Model           | Mean  | std   | Min | Q1 (25%) | Q2 (50%) | Q3 (75%) | Max
AutoRegressive  | 25.98 | 23.27 | 0.0 | 12.24    | 21.25    | 33.1     | 622.8
CrostonSBA      | 26.18 | 22.46 | 0.0 | 12.58    | 21.8     | 33.33    | 580.25
CrostonClassic  | 26.48 | 23.87 | 0.0 | 12.32    | 21.32    | 33.31    | 612.29
ADIDA           | 26.77 | 25.24 | 0.0 | 12.44    | 21.39    | 33.18    | 826.81
WindowAverage   | 26.77 | 24.35 | 0.0 | 12.24    | 21.43    | 34.25    | 771.94
IMAPA           | 26.85 | 25.46 | 0.0 | 12.37    | 21.36    | 33.37    | 826.81
Holt            | 26.87 | 25.65 | 0.0 | 12.44    | 21.37    | 33.29    | 829.04
HistoricAverage | 27.01 | 25.68 | 0.0 | 12.23    | 22.11    | 34.7     | 654.45
Theta           | 27.01 | 25.97 | 0.0 | 12.23    | 21.37    | 33.8     | 842.8
ARIMA           | 27.01 | 25.68 | 0.0 | 12.23    | 22.11    | 34.7     | 654.47
SeasWA          | 27.35 | 22.97 | 0.0 | 13.27    | 23.31    | 35.8     | 697.37
AutoARIMA       | 27.36 | 23.87 | 0.0 | 12.95    | 22.35    | 35.11    | 669.82
TSB             | 27.46 | 26.48 | 0.0 | 11.94    | 21.43    | 34.93    | 816.72
SES             | 29.25 | 29.53 | 0.0 | 11.97    | 22.32    | 37.19    | 1004.96
MSTL            | 32.45 | 28.26 | 0.0 | 14.31    | 27.04    | 43.11    | 782.8
SeasonalNaive   | 38.06 | 33.1  | 0.0 | 14.29    | 30.95    | 52.38    | 766.19
Naive           | 40.53 | 50.26 | 0.0 | 11.9     | 28.57    | 50.0     | 1953.15
RWD             | 40.77 | 50.44 | 0.0 | 11.9     | 28.73    | 50.97    | 1961.34
ARCH            | 67.99 | 158.8 | 0.0 | 21.88    | 54.77    | 103.62   | 44204.92

Analyzing the results per model for the Fozzy Group data set (see Table 6) offers valuable insights. Notably, several models, such as SeasWA, CrostonClassic, and CrostonSBA, exhibit minimum MAPE values of 0.0%. This suggests instances of near-perfect forecasts, which can be attributed to the prevalence of zero sales in the time series of the Fozzy Group data set. For example, the AutoRegressive model shows a Q1 MAPE of 0.0%, indicating that at least 25% of the time series have forecasts with no error; the same model has a Q3 MAPE of 6.73%, signifying that 75% of the time series have forecasts with a MAPE of 6.73% or lower. This comprehensive evaluation underscores the challenges that zero sales pose for forecasting accuracy and highlights the varying degrees of predictive capability across different models. These findings can aid in selecting appropriate forecasting techniques tailored to scenarios with a significant presence of zero values in time series data.

Table 6. Comparative Summary of Forecasting Model Performance on Fozzy Group Data Set

Model           | Mean  | std   | Min | Q1 (25%) | Q2 (50%) | Q3 (75%) | Max
SeasWA          | 3.91  | 11.85 | 0.0 | 0.0      | 0.0      | 0.00     | 725.0
CrostonClassic  | 4.45  | 10.93 | 0.0 | 0.0      | 0.0      | 5.93     | 372.4
CrostonSBA      | 4.46  | 10.70 | 0.0 | 0.0      | 0.0      | 6.00     | 351.6
IMAPA           | 4.61  | 11.05 | 0.0 | 0.0      | 0.0      | 6.53     | 377.8
ADIDA           | 4.61  | 10.99 | 0.0 | 0.0      | 0.0      | 6.51     | 365.9
Theta           | 4.65  | 11.32 | 0.0 | 0.0      | 0.0      | 6.57     | 385.6
Holt            | 4.65  | 11.75 | 0.0 | 0.0      | 0.0      | 6.50     | 638.3
TSB             | 4.69  | 11.55 | 0.0 | 0.0      | 0.0      | 6.66     | 441.1
WindowAverage   | 4.70  | 11.51 | 0.0 | 0.0      | 0.0      | 6.63     | 416.7
HistoricAverage | 4.74  | 10.75 | 0.0 | 0.0      | 0.0      | 6.47     | 463.1
ARIMA           | 4.74  | 10.75 | 0.0 | 0.0      | 0.0      | 6.47     | 463.1
SES             | 4.84  | 12.24 | 0.0 | 0.0      | 0.0      | 6.88     | 507.6
MSTL            | 5.22  | 13.26 | 0.0 | 0.0      | 0.0      | 6.82     | 620.2
Naive           | 5.92  | 17.72 | 0.0 | 0.0      | 0.0      | 7.14     | 1020.9
SeasonalNaive   | 5.99  | 15.54 | 0.0 | 0.0      | 0.0      | 7.14     | 540.4
RWD             | 6.12  | 18.09 | 0.0 | 0.0      | 0.0      | 7.24     | 1045.1
ARCH(1)         | 8.27  | 63.24 | 0.0 | 0.0      | 0.0      | 7.23     | 17909.4
AutoRegressive  | 69.23 | 12813 | 0.0 | 0.0      | 0.0      | 6.73     | 3263154

In another context, the comparative summary statistics for the Corporación Favorita data set (see Table 7) reveal that models such as CrostonSBA, AutoARIMA, and SeasWA exhibit mean MAPE values in the range of 87.08 to 89.02. This range demonstrates consistent accuracy with relatively moderate variability. Conversely, models like ARCH and AutoRegressive present considerably higher mean MAPE values of 216.93 and 1751.39, respectively. These models are accompanied by notable standard deviations, indicating more diverse and potentially less reliable performance. It is noteworthy that CrostonSBA, AutoARIMA, and SeasWA emerge as the most stable models in the context of the maximum MAPE, showcasing their ability to consistently produce accurate forecasts.

Table 7. Comparative Summary of Forecasting Model Performance on Corporación Favorita Data Set

Model           | Mean    | std       | Min   | Q1 (25%) | Q2 (50%) | Q3 (75%) | Max
CrostonSBA      | 87.08   | 190.11    | 0.07  | 45.13    | 62.69    | 92.28    | 37194.1
AutoARIMA       | 88.94   | 212.74    | 0.00  | 46.93    | 64.72    | 93.81    | 34626.7
SeasWA          | 89.02   | 115.02    | 0.00  | 45.32    | 70.12    | 106.70   | 8156.2
WindowAverage   | 92.78   | 240.42    | 0.00  | 46.43    | 65.02    | 96.52    | 53182.3
CrostonClassic  | 92.82   | 200.46    | 0.02  | 47.63    | 66.85    | 98.76    | 39154.7
IMAPA           | 93.45   | 237.63    | 0.00  | 47.34    | 66.49    | 98.44    | 39154.7
ADIDA           | 93.45   | 237.63    | 0.00  | 47.34    | 66.49    | 98.44    | 39154.7
TSB             | 93.65   | 222.70    | 0.00  | 46.57    | 65.74    | 99.54    | 35972.9
Theta           | 93.79   | 264.91    | 0.07  | 47.27    | 66.31    | 98.15    | 44498.9
SES             | 95.27   | 232.87    | 0.00  | 45.95    | 65.62    | 101.96   | 34841.9
Holt            | 98.88   | 352.02    | 0.04  | 48.51    | 68.23    | 100.97   | 76787.1
SeasonalNaive   | 105.70  | 243.48    | 0.00  | 53.27    | 77.13    | 113.69   | 28471.3
ARIMA           | 107.78  | 139.49    | 3.52  | 54.54    | 79.88    | 122.80   | 19461.0
HistoricAverage | 107.78  | 139.49    | 3.52  | 54.54    | 79.88    | 122.80   | 19461.0
Naive           | 108.47  | 317.03    | 0.00  | 40.83    | 63.81    | 116.83   | 64910.0
MSTL            | 109.15  | 301.54    | 3.55  | 54.41    | 76.73    | 112.23   | 54314.6
RWD             | 110.08  | 337.87    | 0.00  | 42.01    | 64.27    | 116.96   | 71091.0
ARCH(1)         | 216.93  | 471.62    | 89.84 | 128.83   | 155.20   | 203.02   | 40463.2
AutoRegressive  | 1751.39 | 642209.24 | 0.64  | 52.83    | 74.32    | 109.39   | 251348620

In conclusion, our analysis identifies CrostonSBA, CrostonClassic, ADIDA, and IMAPA as the most stable forecasting methods across diverse data sets. These models consistently exhibit accurate predictive capabilities, suggesting their suitability as reliable baseline models for guiding future research endeavors in the field of sales forecasting.
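The per-model summaries reported in Tables 5–7 can be reproduced from the cross-validation output with a short pandas aggregation. The sketch below assumes the cv_df frame and the mape helper from the earlier sketches; the column layout and helper are assumptions for illustration, and zero actuals (common in the Fozzy Group data) produce infinite percentage errors that would need explicit handling.

```python
import pandas as pd

def summarize_mape(cv_df, model_cols):
    """Aggregate cross-validation output into a per-model MAPE summary table."""
    long = cv_df.melt(
        id_vars=["unique_id", "cutoff", "y"],
        value_vars=model_cols,
        var_name="model",
        value_name="y_hat",
    )
    # MAPE per (model, series, window), then distribution statistics per model.
    per_window = (
        long.groupby(["model", "unique_id", "cutoff"])
        .apply(lambda g: (g["y"] - g["y_hat"]).abs().div(g["y"]).mean() * 100.0)
        .rename("mape")
        .reset_index()
    )
    return per_window.groupby("model")["mape"].describe()

# Example call with two of the model columns produced by StatsForecast:
# print(summarize_mape(cv_df, ["CrostonSBA", "SeasonalNaive"]))
```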

8 Discussion and Future Work

In this section, we delve into the insights garnered from the analysis of different forecasting models across multiple datasets, while also addressing the limitations and potential avenues for future research in the realm of sales forecasting.
Our utilization of statistical and econometric methods has led to a heightened level of interpretability compared to machine learning or neural network approaches [20]. However, it is noteworthy that certain models, such as AutoARIMA, exhibit complexities in interpretation, marking a limitation of our study.
Our findings have illuminated the varying performance of forecasting methods across different datasets. Notably, specific models demonstrate superior performance on distinct datasets, and the stability of certain models across varied datasets, like CrostonSBA and CrostonClassic, underscores their potential as baseline models for future research endeavors. However, we acknowledge that the task of identifying universally stable methods remains intricate.
The evaluation of time series forecasting remains a challenging endeavor, and we recognize room for enhancement in the evaluation process. Utilizing real-time data in our research can augment the practical applicability of our findings, offering a more comprehensive understanding of forecasting performance.
The limitations of our study encompass the nature of the dataset used. A future direction could involve incorporating live sales data and accounting for a broader spectrum of factors, both internal and external, that could significantly influence sales patterns. Further investigations into outliers in model performance are warranted to deepen our understanding of their effects. Additionally, the necessity of periodic reevaluation and updating of forecasting models in light of new data cannot be understated, ensuring their ongoing efficacy.
In practical implementation, our study presents the notion of employing the list of stable models as a resource for selecting baseline models. This proposition holds value for both researchers embarking on forecasting research and businesses initiating their sales forecasting processes. These baseline models offer a stepping stone for evaluation and improvement, acting as empirical anchors in the dynamic field of sales forecasting.

9 Conclusion

In conclusion, this study has provided a comprehensive exploration of the current state of the sales forecasting domain through empirical analysis of different statistical and econometrics methods across diverse datasets. The findings underscore the importance of stable and interpretable methods, with models such as CrostonSBA, CrostonClassic, ADIDA, and IMAPA identified as particularly robust choices for baseline sales forecasting. The inherent stability and interpretability of these methods hold great significance for the decision-making processes within the retail industry, offering reliable insights for informed strategies. While this study contributes valuable insights, it is imperative to acknowledge its limitations and avenues for future research, as discussed in Sect. 8. By conducting this comparative analysis, the study sheds light on the strengths, weaknesses, and stability of different forecasting models, enriching our understanding of their applicability and potential. As the field of sales forecasting continues to evolve, we encourage readers to delve into the evolving landscape of forecasting methodologies, building upon the insights provided by this research.


References

1. Bandara, K., Hyndman, R.J., Bergmeir, C.: MSTL: a seasonal-trend decomposition algorithm for time series with multiple seasonal patterns. arXiv (2021). https://doi.org/10.48550/ARXIV.2107.13462
2. Boylan, J.E., Syntetos, A.A.: Intermittent Demand Forecasting: Context, Methods and Applications. Wiley (2021)
3. Chatfield, C.: Apples, oranges and mean square error. Int. J. Forecast. 4(4), 515–518 (1988). https://doi.org/10.1016/0169-2070(88)90127-6
4. Croston, J.D.: Forecasting and stock control for intermittent demands. J. Oper. Res. Soc. 23(3), 289–303 (1972). https://doi.org/10.1057/jors.1972.50
5. Davis, D.F., Mentzer, J.T.: Organizational factors in sales forecasting management. Int. J. Forecast. 23(3), 475–495 (2007). https://doi.org/10.1016/j.ijforecast.2007.02.005
6. Engle, R.F.: Autoregressive conditional heteroscedasticity with estimates of the variance of United Kingdom inflation. Econometrica 50(4), 987–1007 (1982). https://doi.org/10.2307/1912773
7. Ensafi, Y., et al.: Time-series forecasting of seasonal items sales using machine learning - a comparative analysis. Int. J. Inf. Manage. Data Insights 2(1), 100058 (2022). https://doi.org/10.1016/j.jjimei.2022.100058
8. Favorita, C., et al.: Corporación Favorita Grocery Sales Forecasting (2017). https://kaggle.com/competitions/favorita-grocery-sales-forecasting
9. Fildes, R., Makridakis, S.: Forecasting and loss functions. Int. J. Forecast. 4(4), 545–550 (1988). https://doi.org/10.1016/0169-2070(88)90131-8
10. Fiorucci, J.A., et al.: Models for optimising the theta method and their relationship to state space models. Int. J. Forecast. 32(4), 1151–1161 (2016). https://doi.org/10.1016/j.ijforecast.2016.02.005
11. Garza, F., et al.: StatsForecast: lightning fast forecasting with statistical and econometric models. PyCon Salt Lake City, Utah, US (2022). https://github.com/Nixtla/statsforecast
12. Haselbeck, F., et al.: Machine learning outperforms classical forecasting on horticultural sales predictions. Mach. Learn. Appl. 7, 100239 (2022). https://doi.org/10.1016/j.mlwa.2021.100239
13. Holt, C.C.: Forecasting seasonals and trends by exponentially weighted moving averages. Int. J. Forecast. 20(1), 5–10 (2004). https://doi.org/10.1016/j.ijforecast.2003.09.015
14. Howard, A., et al.: M5 Forecasting - Accuracy (2020). https://kaggle.com/competitions/m5-forecasting-accuracy
15. Hyndman, R.J.: A brief history of forecasting competitions. Int. J. Forecast. 36(1), 7–14 (2020). https://doi.org/10.1016/j.ijforecast.2019.03.015
16. Hyndman, R.J., Athanasopoulos, G.: Forecasting: Principles and Practice, 3rd edn. OTexts, Melbourne, Australia (2021). https://OTexts.com/fpp3
17. Hyndman, R.J., Khandakar, Y.: Automatic time series forecasting: the forecast package for R. J. Stat. Softw. 27(3), 1–22 (2008). https://doi.org/10.18637/jss.v027.i03
18. Kim, S., Kim, H.: A new metric of absolute percentage error for intermittent demand forecasts. Int. J. Forecast. 32(3), 669–679 (2016). https://doi.org/10.1016/j.ijforecast.2015.12.003
19. Kosovan, O.: Fozzy Group HACK4Retail competition overview: results, findings, and conclusions. Market Infrastruct. 67 (2022). https://doi.org/10.32843/infrastruct67-42
20. Kosovan, O., Datsko, M.: Interpretation of machine learning algorithms for decision-making in retail. Econ. Soc. 47 (2023). https://doi.org/10.32782/2524-0072/2023-47-47
21. Makridakis, S.: Accuracy measures: theoretical and practical concerns. Int. J. Forecast. 9(4), 527–529 (1993). https://doi.org/10.1016/0169-2070(93)90079-3
22. Makridakis, S., Spiliotis, E., Assimakopoulos, V.: The M5 competition: background, organization, and implementation. Int. J. Forecast. 38(4), 1325–1336 (2022). https://doi.org/10.1016/j.ijforecast.2021.07.007
23. Marquez, J.: Review of: Hamilton, J.D.: Time Series Analysis. Princeton University Press, Princeton, NJ (1994). Int. J. Forecast. 11(3), 494–495 (1995). https://ideas.repec.org/a/eee/intfor/v11y1995i3p494-495.html
24. Nikolopoulos, K., et al.: An aggregate-disaggregate intermittent demand approach (ADIDA) to forecasting: an empirical proposition and analysis. J. Oper. Res. Soc. 62(3), 544–554 (2011). https://doi.org/10.1057/jors.2010.32
25. Teunter, R.H., Syntetos, A.A., Zied Babai, M.: Intermittent demand: linking forecasting to inventory obsolescence. Eur. J. Oper. Res. 214(3), 606–615 (2011). https://doi.org/10.1016/j.ejor.2011.05.018
26. Vallés-Pérez, I., et al.: Approaching sales forecasting using recurrent neural networks and transformers. Expert Syst. Appl. 201, 116993 (2022). https://doi.org/10.1016/j.eswa.2022.116993

The Meaning of Business and Software Architecture in the Enterprise Architecture

Kamelia Shoilekova(B), Boyana Ivanova, and Magdalena Andreeva

Angel Kanchev University of Ruse, 8 Studentska Street, 7000 Ruse, Bulgaria
{kshoylekova,bivanova,mhandreeva}@uni-ruse.bg

Abstract. This research paper aims to provide a clear demarcation between the three architectures. The main idea for its creation is to determine the place of each of the architectures in the overall picture of the organization. Another main aspect of the current paper is to show that each of these architectures is involved in the overall organization's process of implementation, and failure to design or complete omission of one of these architectures can lead to many undesirable situations throughout the organization.

Keywords: Business architecture · Software architecture · Enterprise architecture · Business model · Business processes

These days, it very often happens that people do not see a connection between the concepts of business architecture, enterprise architecture, and software architecture. This, in turn, leads to serious confusion when it comes to building a complete work model. Although these architectures appeared within a short interval of time, they changed the overall view of how work processes are organized within the organization.
The concept of business architecture began to appear in the 1990s, when many organizations tried to optimize their activities. In the last few years, more and more organizations have had to build a business architecture of their operating companies and transform their processes in order to comply with the requirements related to COVID-19. Enterprise architecture dates back to the 1980s. The Institute for Enterprise Architecture Development summarizes the main guiding principles of enterprise architecture as follows: "no strategic predictions - no enterprise architecture", which can be interpreted to mean that today's enterprise architecture is a system of tomorrow's business [2, 3]. From Fig. 1 it becomes clear that the business architecture is part of the architecture of the enterprise.

Fig. 1. Business architecture is a part of the enterprise architecture [1]

1 Business Architecture is the Bridge Between Software Architecture and Enterprise Architecture

The formal definition of software architecture is "a set of structures necessary to describe a software system that contains software elements, relationships between them, and their properties" [4]. A key aspect of software architecture is the correct documentation of a system's architecture so that it can be used during design, during the development process, and during system maintenance. As the business evolves over the years and the software system becomes more complex, one way to create effective documentation is to divide it into three parts, called perspectives, each of which is accompanied by a number of views [4]. These three perspectives are:

• static - concerns the static parts of the system and helps architects to structure the system's implementation units.
• dynamic - describes the behavior of the system during execution, i.e. how the structured set of elements dynamically interact with each other during the execution of the system.
• deployment (implementation) - describes the environment in which the system will be deployed, including the dependencies of the system on the execution environment, showing how the software structures correspond to the environment structures.

Therefore, architectures in a specific organization, according to their scope and relationships with other elements, are divided into three groups: enterprise architecture, business architecture, and software architecture (Fig. 2). In order to build the enterprise architecture, it is necessary to observe a certain hierarchical connectivity between the different types of architectures, and this connectivity, in turn, aims to show the one-way improvement relationship between them. If the architecture of the organization is represented in the form of a cog, then those architectures that are smaller cogs complement and generalize the functionalities of the larger cogs and ensure the achievement of more abstract business operations that are unattainable for the small cogs.


Fig. 2. Architectures of the enterprise

2 Overlap of the Three Architectures

Building a model of the business architecture of an organization goes through certain stages and follows different architectural frameworks. Regardless of which architectural framework will be used, it is necessary to analyze the organization and identify its strengths and weaknesses (SWOT analysis). After that, it is necessary to build a clear development strategy and plan the entire work process. In most cases, the activity of the organization is divided into multiple sub-processes that must work in sync, following the basic rule that the output of one such sub-process will be the input to one or more other sub-processes. All this shows that the three architectures cannot exist independently, because:

1. to build a model of the business architecture, certain architectural frameworks are observed, which are one of the main elements of the enterprise's architecture;
2. strategy building, planning, and defining the organization's development goals are elements of the business architecture;
3. constructing business processes and business services and synchronizing them to work as one complete organism is the work of both business architecture and software architecture.

3 Basic Aspects of Building the Business Architecture

On the other hand, the main aspects that make up the business architecture framework are:

• Organization - the structure of the enterprise. It is composed of business units that interact with each other to represent all the main functions and processes in the organization. This aspect of business architecture is largely covered by both enterprise architecture and several architectural frameworks that directly or indirectly relate to enterprise architecture.
• Capabilities - what an organization is capable of doing to increase its strengths. This aspect is again intertwined with the architecture of the organization, and more precisely with the SWOT analysis.
• Business processes - a business process is a network of actions and connections between them, jointly implementing a certain business task that leads to specific results. For the realization of this business task, people, equipment, applications, information, and other resources are involved, whose main purpose is to convert inputs into outputs (Fig. 3). In other words, a business process is understood as an organized sequence of actions that aims to create a product or service that has a certain value for the end customer. This aspect largely overlaps with the software that must be created to implement the business process. At the business architecture level, business processes, their sequence, and their connections are planned and designed in the organization [5, 6]. From there, things are in the hands of the software specialists, who must:
  ◯ consider what tools to use to build the relevant process;
  ◯ consider how to connect the individual processes or units of the organization if they are developed with different tools.

Fig. 3. Business process

• Information - the engine that drives knowledge, results, and understanding of how a business works. It is the engine of each of these architectures because if there is no information, there is no organization. The information is used both for the analysis of the overall activity of the organization and for the analysis of business processes. Information from the analysis of business processes aims to show whether optimization of one or several processes is needed, which will lead to a reduction of a certain resource (time, money, materials).

4 Conclusion

Business architecture remains an important part of both enterprise architecture and software architecture. On the one hand, enterprise architecture tries to build a framework that matches the specific business, and on the other hand, business architecture tries to create the models of business processes and services in such a way that software specialists can embed them with the least effort and the fewest resources. All this shows that the three architectures cannot be seen as separate parts of the company's architecture.

Acknowledgements. This publication is developed with the support of Project BG05M20P001-1.001-0004 UNITe, funded by the Operational Program "Science and Education for Smart Growth" co-funded by the European Union through the European Structural and Investment Funds.

References

1. Nam, G.B.: Implementing ITIL service strategy through enterprise architecture. In: itSMF Singapore Annual Conference, 16 March 2012
2. Institute for Enterprise Architecture Developments. http://www.enterprise-architecture.info
3. Schekkerman, J.: Enterprise Architecture Good Practices Guide: How to Manage the Enterprise Architecture Practice. ISBN 978-1-42515-687-9
4. Clements, P., et al.: Documenting Software Architectures: Views and Beyond, 2nd edn. Addison-Wesley, USA (2011)
5. Whittle, R., Myrick, C.B.: Enterprise Business Architecture: The Formal Link between Strategy and Results. CRC Press, Boca Raton (2016)
6. Rouhani, B.D., Kharazmi, S.: Presenting new solution based on business architecture for enterprise architecture. IJCSI Int. J. Comput. Sci. Issues 9(3), no. 1 (2012). ISSN (Online) 1694-0814

Towards Data-Driven Artificial Intelligence Models for Monitoring, Modelling and Predicting Illicit Substance Use

Elliot Mbunge1,2(B), John Batani3, Itai Chitungo4, Enos Moyo5, Godfrey Musuka6, Benhildah Muchemwa2, and Tafadzwa Dzinamarira7

1 Department of Information Technology, Faculty of Accounting and Informatics, Durban University of Technology, P O Box 1334, Durban 4000, South Africa
[email protected]
2 Department of Computer Science, Faculty of Science and Engineering, The University of Eswatini (Formerly Swaziland), Kwaluseni, Manzini, Eswatini
3 Faculty of Engineering and Technology, Botho University, Maseru 100, Lesotho
4 Department of Medical Laboratory Sciences, University of Zimbabwe, Harare, Zimbabwe
5 Medical Centre Oshakati, Oshakati, Namibia
6 ICAP at Columbia University, Harare, Zimbabwe
7 School of Health Systems and Public Health, University of Pretoria, Pretoria, South Africa

Abstract. Illicit substance use (ISU) is a major public health problem and a significant cause of morbidity and mortality globally. Early assessment of risk behaviour, predicting, identifying risk factors, and detecting illicit substance use become imperative to reduce the burden. Unfortunately, current digital tools for early detection and modelling ISU are largely ineffective and sometimes inaccessible. Data-driven artificial intelligence (AI) models can assist in alleviating the burden and tackling illicit substance use, but their adoption and use remain nascent. This study applied the PRISMA model to conduct a systematic literature review on the application of artificial intelligence models to tackle illicit substance use. The study revealed that elastic net, artificial neural networks, support vector machines, random forest, logistic regression, KNN, decision trees and deep learning models have been used to predict illicit substance use. These models were applied to tackle different substance classes, including alcohol, cannabis, hallucinogens, tobacco, opioids, sedatives, and hypnotics, among others. The models were trained and tested using various substance use data from social media platforms and risk factors such as socioeconomic and demographic data, behavioural, phenotypic characteristics, and psychopathology data. Understanding the impact of these risk factors can assist policymakers and health workers in effective screening, assessing risk behaviours and, most importantly, predicting illicit substance use. Using AI models and risk factors to develop data-driven intelligent applications for monitoring, modelling, and predicting illicit substance use can expedite the early implementation of interventions to reduce the associated adverse consequences.

Keywords: Artificial Intelligence · Illicit Substance Use · Data-driven · Africa


1 Introduction

Illicit substance use remains a major global public health problem, and the outbreak of coronavirus disease 2019 (COVID-19) exacerbated the burden [1]. Drug use disorders caused 85,984 deaths worldwide in 2019 (55,616 deaths in males and 30,367 deaths in women), accounting for 47% of all global deaths caused by drug-related disorders [1]. In addition, the outbreak of COVID-19 and the imposed movement restrictions increased the burden, which consequently had psychological effects such as stress, anxiety, depression, and other mental health issues exacerbated by socioeconomic challenges. These psychological effects played a significant role in the growth of illicit substance use (ISU) incidences [2]. For instance, people aged between 15 and 64 years were among 284 million illicit drug users globally in 2020, a 26% increase from the previous decade [1]. Most patients in Africa and South America receiving treatment for ISU problems are under 35 years of age [1].
The use of illicit substances has increased substantially among adolescents, and they tend to develop addiction because of the propensity for experimenting, curiosity, low self-esteem and lack of guidance and counselling services. This is also attributed to early mental and behavioural health issues, peer pressure, a bad family structure, inadequate parental supervision and interactions, lack of opportunities, loneliness, and easy access to drugs [3]. Moreover, protective measures such as peer groups that stimulate high self-esteem, religiosity, peer influence, self-control, parental supervision, academic proficiency, rehabilitation, anti-drug use legislation, and a strong sense of community connectedness have been utilized to tackle and reduce the catastrophic impact of ISU [4]. Preventing ISU can reduce its catastrophic impact on health, as ISU hinders youths from properly transitioning into adulthood by impeding the development of critical thinking and learning skills [5]. In addition, the use of illicit substances also affects sexual and reproductive health (SRH) [6] and leads to poor reproductive outcomes, such as unintended pregnancies, preterm deliveries, and maternal and neonatal morbidity and mortality [7]. ISU can also result in chronic illnesses [8] and socioeconomic problems such as violence, dropping out of school, family breakup, unemployment, and living in squalor and poverty [9]. Due to the severity of the consequences associated with ISU, it is imperative to develop AI models that can predict, detect, monitor, and model risk factors associated with ISU to assist healthcare professionals in designing effective interventions to reduce the burden.

1.1 Significance of the Study

ISU can be predicted using interviewer-administered questions or self-administered methods, but these need extra personnel and time. As a result, few health systems routinely screen for ISU [10]. Patients can be screened for ISU using an automated technique that uses information gathered during routine health care. Therefore, artificial intelligence (AI) models can be used to monitor and predict ISU among different populations. AI models can learn to carry out specific tasks by identifying patterns in data [11]. By concentrating on linking input features such as psychological, physical, and environmental factors with ISU, this approach stands as a potent data-driven alternative for identifying ISU vulnerability and predicting illicit substance use [12].
Several studies focused on the social, psychological, and mental consequences of illicit substance use [3, 13–15], while some applied statistical tools to survey data to determine risk factors associated with ISU [7] and recommend protective measures [16]. In addition to survey data on ISU, there is positive progress, though at a slower pace, in utilizing smart wearable technologies such as smartwatches and wristbands to collect data on substance use [17, 18]. Such massive data can be utilized to develop data-driven intelligent models to tackle illicit substance use. Despite this, there is a dearth of literature on the application of AI models for monitoring, modelling and predicting illicit substance use. Such data-driven AI models can tremendously assist healthcare professionals in predicting the possibility of substance abuse, remotely monitoring patients in rehabilitation and, most importantly, modelling risk factors associated with ISU. This can also assist policymakers in developing and implementing evidence-based frameworks and guidelines for integrating AI into health systems to tackle ISU. Such applications are rare, but they can guide the future development of digital health interventions and tools for reducing ISU [19]. Therefore, this study presents a systematic literature review of AI models, risk factors associated with ISU, and barriers and challenges hindering the development and integration of data-driven AI models into health information systems. The study also presents recommendations to address the identified barriers and challenges.
The remainder of this paper is structured as follows. Section 2 presents the methodology used in conducting the systematic review. The results analysis and discussion of AI models and applications are presented in Sect. 3. The barriers, challenges and recommendations for integrating data-driven artificial intelligence models for tackling ISU are presented in Sect. 4, while Sect. 5 presents the conclusion.

2 Materials and Method
2.1 Study Design
The Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) model guided this systematic review, in which primary scientific articles on substance abuse published between January 2015 and June 2023 were retrieved and analysed. The study was conducted between 16 February and 14 June 2023. This study sought to identify the AI models used in substance abuse research, their purposes, performance and limitations, and the risk factors used. Figure 1 summarises the search results, study selection and inclusion process followed in this study.
2.2 Search Strategy
The major databases searched were Web of Science, Google Scholar, Scopus and Science Direct, using the keywords "artificial intelligence", "deep learning", "machine learning", "illicit substance abuse", and "substance use disorders". The publication year was restricted to between 1 January 2015 and 14 June 2023. Additional articles were identified through citation chaining.
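For illustration only, the keyword and date filters above can be combined into a single Boolean query string along the following lines. The field syntax shown is Scopus-style and is an assumption; it is not the authors' recorded search string, and the exact syntax differs between the databases listed above.

```python
# Hypothetical composition of the search query; "PUBYEAR" follows Scopus-style
# syntax and the mid-June 2023 cut-off was applied manually during screening.
keywords = ['"artificial intelligence"', '"deep learning"', '"machine learning"',
            '"illicit substance abuse"', '"substance use disorders"']
query = "(" + " OR ".join(keywords) + ") AND PUBYEAR > 2014 AND PUBYEAR < 2024"
print(query)
```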


Fig. 1. PRISMA model

2.3 Study Selection and Eligibility Criteria
Articles returned from the database searches were initially screened based on their abstracts, and where a conclusive decision could not be made on eligibility, the full articles were retrieved and screened. Articles deemed potentially eligible at screening were then assessed by methodology and article type. Eligible articles were those published between 1 January 2015 and 14 June 2023, when the study was concluded. The eligibility criteria were as follows: articles that had a low perceived risk-of-bias, were written in English or had English translations, were peer-reviewed and empirical, and applied any type of AI algorithm to predict, detect, assess or examine substance abuse, including determining the risk factors. Any articles that did not meet the specified criteria, were deemed to have a high risk of bias, or had poor methodologies were excluded.
2.4 Data Extraction
Two researchers independently screened all articles based on titles, abstracts and quality of methodology. The researchers resolved any differing views on papers by conducting a meeting in which each explained their reasoning in line with the prescribed criteria until a consensus was reached. The two authors involved in article retrieval, screening and assessment continuously engaged each other throughout the data extraction process. Data were extracted into a standard table, and both researchers checked the data for accuracy.


2.5 Risk-of-Bias and Quality Assessment of Included Sources
The included articles were assessed for quality and risk of bias using the Critical Appraisal Skills Programme checklist [20]. The assessment criteria were methodology appropriateness and validity of findings, facilitated by the Critical Appraisal Skills Programme questionnaire, helping to eliminate systematic biases [21]. The process was conducted and its results visualized using the Review Manager software, RevMan 5.4.1 (Table 1).

Table 1. Application of AI models to tackle illicit substance use. Each entry lists the reference, model(s), purpose, risk factors, performance and limitations.

[12] | Model(s): Naïve Bayes, SVM and Random Forest | Purpose: Evaluation of substance use disorder | Risk factors: Phenotypic characteristics and environmental factors | Performance: 74% and 86% accuracy for different age groups | Limitations: ML algorithms delineate the psychological, health, and environmental characteristics associated with the risk for SUD
[22] | Model(s): Random Forest | Purpose: Predicting the severity of substance use | Risk factors: Behavioural characteristics | Performance: 71% and 91% accuracy for different age groups | Limitations: The harmfulness score does not fully account for cumulative exposure to each substance; a longitudinal study is required to measure substance use severity
[23] | Model(s): Logistic regression | Purpose: Predicting ISU | Risk factors: Environmental factors, family substance use | Performance: – | Limitations: Limited sample size; survey data are subject to biases
[24] | Model(s): XGBoost | Purpose: Predicting susceptibility to substance abuse | Risk factors: Behavioural characteristics | Performance: ROC: 99%, PRC: 98% | Limitations: Drug interactions involving opioids are solely determined by how frequently they are prescribed and when side effects occur
[25] | Model(s): Negation detection algorithm (NegEx), NLP | Purpose: Detection of substance-use status | Risk factors: Frequency, type and amount of smoking, alcohol or drugs consumed by the patient, and quit time and period | Performance: F1-score: 99% | Limitations: The dataset used was small; a larger dataset may enhance the capacity to produce additional rules
[26] | Model(s): SVM | Purpose: Predict substance abuse treatment outcomes | Risk factors: Clinical data such as cocaine, methamphetamine or heroin dependence at the time of incarceration, no history of head injury and no history of psychosis | Performance: Accuracy: 80.58%, Sensitivity: 81.31%, Specificity: 78.13% | Limitations: The negative predictive value is lower than desired and misidentifies someone with a higher risk for discontinuation
[27] | Model(s): Elastic net | Purpose: Assess ISU predictors | Risk factors: Sociodemographic characteristics and psychopathology data | Performance: AUC: 83% | Limitations: Instead of conducting a formal clinical interview, the screening test (ASSIST) was utilised to determine whether a person used illicit drugs
[28] | Model(s): Decision trees, SVM and boosted decision trees | Purpose: Detecting ISU recovery problems | Risk factors: Peer-to-peer discussion forums and ISU self-management pages on the forum | Performance: F1-scores: DT: 88%, SVM: 89%, boosted DT: 94% | Limitations: The model did not label subtypes of recovery problems; participants who did not post on the forum were excluded
[29] | Model(s): SVM, Random forest, Naïve Bayes and Logistic regression | Purpose: Substance abuse risk assessment | Risk factors: Social media data | Performance: F1-scores: SVM: 82.6%, RF: 85.9%, Naïve Bayes: 85.3%, LR: 88.2% | Limitations: High data imbalance affects biomarker prediction; data were collected from limited social media platforms
[30] | Model(s): RF, KNN, DT, Linear SVC, Gaussian Naïve Bayes and Logistic Regression | Purpose: Predicting individual substance abuse vulnerability | Risk factors: Demographic and behavioural characteristics | Performance: Accuracy: RF: 95.08%, KNN: 88.52%, DT: 85.24%, Linear SVC: 95.90%, Gaussian Naïve Bayes: 92.62%, LR: 94.26% | Limitations: Data were collected from the same location and the same age group, which might affect the performance of the models when exposed to new data from other locations
[31] | Model(s): Random forest, super learning and artificial neural networks | Purpose: Predict the efficacy of treatment for substance use disorders | Risk factors: Patient characteristics, treatment characteristics and type of problematic substance | Performance: AUC for the models ranged between 79.3% and 82% | Limitations: There are relatively few details about the kind of treatment and how it was delivered
[32] | Model(s): Random forest | Purpose: Determine the socioeconomic causes of patients abandoning substance abuse treatment | Risk factors: Demographic and behavioural characteristics | Performance: AUC of 89% | Limitations: Not specified
[33] | Model(s): Artificial neural networks (ANN) | Purpose: Predict volatile substance abuse for drug risk analysis | Risk factors: Agreeableness, conscientiousness, extraversion, neuroticism, openness to experience, impulsiveness, sensation seeking and demographic details | Performance: Accuracy of 81.1% | Limitations: The study used an open-source dataset; therefore, the model needs to be validated with real data
[34] | Model(s): Natural language processing | Purpose: Identification of substance abuse | Risk factors: Status, type, method, amount, frequency, exposure history and quit history | Performance: F1-score between 80% and 91% | Limitations: Not mentioned
[35] | Model(s): Machine learning regression algorithm (the actual algorithm is not specified in the paper) | Purpose: Prediction of severity of alcohol use disorders | Risk factors: Neural features using imaging data and self-reporting | Performance: The model correctly explained 33% of the variance | Limitations: They used multimodal features hoping their model would perform better than those that used unimodal features, but this hypothesis was not supported by their results
[36] | Model(s): Logistic regression, Naïve Bayes and gradient boost | Purpose: Prediction of alcohol use | Risk factors: Not clear from the paper | Performance: Accuracy of 97.55% | Limitations: Not mentioned
[37] | Model(s): Random forest | Purpose: Predicting comorbid substance use disorders among people with bipolar disorder | Risk factors: Sociodemographic and clinical data | Performance: F1-score: 66%, Accuracy: 65.3%, Sensitivity: 69.6%, Specificity: 61.2% | Limitations: Clinical variables collected retrospectively from electronic clinical records may have affected the accuracy and reliability of the data

Fig. 2. Summary of RoB assessment for all included studies


3 Results Analysis and Discussion
3.1 Analysis of Risk-of-Bias
The researchers analysed the risk-of-bias of the included papers, focusing on reporting, selection, performance, recall and observer biases. Figure 2 presents the summarized analysis for all the included articles.

Fig. 3. Analysis of RoB for individual studies.

Key: low risk / unclear.
Figure 3 shows the analysis of bias for each included paper. The researchers judged the risk-of-bias of the included papers to be low across all five bias types. The following sections present an analysis of the findings regarding AI models and risk factors.
3.2 Artificial Intelligence Models for Predicting Illicit Substance Abuse
Artificial intelligence focuses on developing intelligent models (algorithms) and smart machines that can perform tasks that typically require human intelligence [38]. It has various subsets, including deep learning, natural language processing and machine learning. Machine learning (ML) is a subset of AI that has been used extensively in healthcare to perform various tasks, including diagnosis [39], detection [40], prediction [41, 42] and classification. ML models learn from samples to solve classification, association and clustering problems in healthcare [43]. This study has revealed that several AI models have been used in predicting, modelling and monitoring substance use, all of which fall under machine and deep learning. Deep learning is a subset of ML that uses neural networks with many layers and automates feature extraction, removing the hand-tuning of features that is predominant in traditional ML. There has been significant progress in detecting substance use disorder [44], assessing future risk and predicting treatment success through the use of machine learning models. For instance, a study conducted by [45] applied machine learning models such as decision tree (DT), random forest (RF), logistic regression (LR), K-Nearest Neighbours (KNN), Gaussian Naïve Bayes and Support Vector Machine (SVM) to predict individual substance abuse vulnerability. Figure 4 is a word cloud of the identified AI models and algorithms used in identifying and predicting substance abuse.

Fig. 4. Word cloud for a few identified models for tackling ISU
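To make the comparison of these classifiers concrete, the sketch below trains the six models named in [45] on a small synthetic dataset using scikit-learn. The features, labels and default hyperparameters are invented for illustration; this is not the data or code of the cited study.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(42)
n = 600
# hypothetical demographic/behavioural features (e.g. age, peer-influence score)
X = rng.normal(size=(n, 6))
# synthetic binary "vulnerability" label loosely driven by two of the features
y = (X[:, 0] + 0.8 * X[:, 1] + rng.normal(scale=0.5, size=n) > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

models = {
    "DT": DecisionTreeClassifier(random_state=0),
    "RF": RandomForestClassifier(random_state=0),
    "LR": LogisticRegression(max_iter=1000),
    "KNN": KNeighborsClassifier(),
    "Gaussian NB": GaussianNB(),
    "SVM": SVC(),
}
for name, model in models.items():
    model.fit(X_train, y_train)
    print(name, round(accuracy_score(y_test, model.predict(X_test)), 3))
```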

SVM is a widely used technique that defines an optimal hyperplane to distinguish between items falling in the classes of interest [41]. It is an ML algorithm used to solve classification and regression problems using labelled data [41]. SVM is normally illustrated on a Cartesian plane, where a hyperplane is drawn to separate observations in different classes or categories [41]. Observations are represented as points on the Cartesian plane, and the observations closest to the separating hyperplane are called support vectors. Islam et al. [45] used the SVM algorithm to identify vulnerabilities to substance abuse. Random forest is an ML algorithm that uses labelled data to solve classification or regression problems [32, 42]. The algorithm combines multiple decision trees to determine the output; thus, it is ensemble-based [39, 42, 46]. Each tree in the forest produces an output and votes for it, and the output that gets the majority vote becomes the forest's (model's) overall output [42]. The advantages of this algorithm include high accuracy and efficiency, as well as the ability to ingest datasets that contain both numerical and categorical predictors [42]. Islam et al. [45] created a random forest model to identify vulnerabilities to substance abuse, while [47] used it to predict substance use disorder treatment outcomes. Transfer learning entails "repurposing a trained model on a related task" [47] and is mainly used in deep learning to leverage related pre-trained models where datasets are small. Bailey and DeFulio [47] used transfer learning (and compared it with a random forest model) to predict substance use disorder treatment outcomes. Logistic regression is a machine learning algorithm that uses labelled data to solve classification problems [41, 42, 48]. It uses the sigmoid activation function to return a value (0 or 1) that signifies whether a given observation belongs to a given class or not, using the formula below [41]:

f(z) = (1 + e^(-z))^(-1)    (1)

where z is "the weighted sum of each neuron" [41]. Since the classifier's output must be either 0 or 1, the continuous value produced by the sigmoid is set to 0 or 1 based on which value it is closer to; thus, a threshold value, normally 0.5, is used. Logistic regression was used in creating a predictive classifier for identifying vulnerabilities to substance abuse [45], alcohol use [36] and substance use disorders [31]. Naïve Bayes is a supervised classification machine learning algorithm based on the Bayes theorem [49]; thus, it is probabilistic. Gaussian Naïve Bayes, like Naïve Bayes, is a probabilistic classification algorithm, based on the Gaussian distribution [50]. The algorithm assumes that each feature is independently capable of predicting the target variable [50]. Naïve Bayes has been used to predict alcohol use [36], while Islam et al. [45] used Gaussian Naïve Bayes to identify vulnerabilities to alcohol use. Super learning is an ensemble-based ML algorithm derived from the stacking algorithm, whose output is the weighted mean of the predictions of all the included algorithms [31]. A study conducted by [31] used super learning to predict SUD treatment success, with regression, RF and deep neural networks as the constituent algorithms. They used the area under the curve as an evaluation metric, and it ranged between 79.3% and 82.0%, as reported in their study. Though a decision tree is a simple ML algorithm, it is widely used to solve classification problems [41]. The algorithm uses a rooted-tree data structure with a root, leaves and internal nodes, where internal nodes lie between the root and the terminal (leaf) nodes [51]. Classes or labels in a decision tree are represented by terminal nodes (leaves), while test conditions (attributes/characteristics) are represented as inner and root nodes [42]. A decision tree segments the feature space into several simple regions. Boosted decision trees are decision trees that utilise the ensemble approach, in which each tree learns from the residuals of the previous trees, enhancing the overall performance. Islam et al. [45] used a decision tree to identify vulnerabilities to substance abuse. Gradient boosting was used by Kumari & Swetapadma [36] to predict alcohol use and produced close to 98 per cent accuracy. Moreover, a study conducted by Kornfield et al. [28] applied SVM, decision trees and boosted decision trees to detect illicit substance use, using a natural language processing program called Linguistic Inquiry and Word Count and data from online substance abuse forums. Their study revealed that boosted decision trees outperformed the other models with an F1-score of 94%. However, participants who did not post on the forum were excluded, and the model could not label subtypes of recovery problems. A study by Rakovski et al. [27] applied an elastic net to assess the performance of the selected illicit substance abuse predictors. The model achieved the highest area under the receiver operating characteristic curve (AUC) of 83%. The elastic net approach combines the lasso and ridge regression regularisations linearly to create models that can remove unnecessary variables while keeping coefficients small, improving generalisation. Natural language processing algorithms have been used to detect substance abuse from clinical textual data. For instance, a study conducted by Alzubi et al. [25] applied a negation detection algorithm called NegEx to detect substance use status, and the algorithm achieved an F1-score of 99%.
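The negation-detection idea can be illustrated with a toy rule-based sketch. This is not the NegEx implementation evaluated in [25]; the cue and term lists below are deliberately tiny and invented for illustration.

```python
import re

NEGATION_CUES = ("denies", "no history of", "does not", "negative for")
SUBSTANCE_TERMS = ("alcohol", "tobacco", "cocaine", "heroin", "cannabis")

def substance_status(sentence: str) -> dict:
    """Label each substance mention in a sentence as negated or affirmed."""
    status = {}
    # very rough clause splitting so a negation cue only scopes its own clause
    for clause in re.split(r"\bbut\b|;|,", sentence.lower()):
        negated = any(cue in clause for cue in NEGATION_CUES)
        for term in SUBSTANCE_TERMS:
            if term in clause:
                status[term] = "negated" if negated else "affirmed"
    return status

print(substance_status("Patient denies cocaine use but reports alcohol use on weekends."))
# {'cocaine': 'negated', 'alcohol': 'affirmed'}
```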
Also, Yetisgen & Vanderwende [34] applied natural language processing to automatically identify substance abuse from the social history sections of clinical text, achieving F1-scores ranging between 80% and 91%. Interestingly, social media platforms, together with natural language processing algorithms, have been used to detect, identify and assess substance abuse risk behaviour. For instance, a study by Ovalle et al. [29] utilised social media data and machine learning models such as LR, linear SVM, Naïve Bayes and RF to assess substance abuse risk behaviour among men who have sex with men (MSM). Such models tremendously assist in implementing adaptive interventions to target substance use risk behaviour among hard-to-reach groups such as MSM. These algorithms were used for different purposes in substance abuse research. Figure 5 shows the different purposes for which the described AI algorithms were used, presented in the form of a word cloud.

Fig. 5. Application of AI models to tackle ISU.
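In the same spirit, a minimal text-classification pipeline for forum or social-media posts might look as follows. The posts and labels are invented for illustration; the studies above used richer inputs (LIWC features in [28] and real platform data in [29]), so this is only a sketch of the general approach.

```python
from sklearn.pipeline import make_pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# toy posts and labels: 1 = possible recovery problem, 0 = no problem
posts = [
    "struggling with cravings again this week",
    "celebrated 90 days clean with my support group",
    "relapsed last night and feel hopeless",
    "sharing tips that helped me stay sober",
]
labels = [1, 0, 1, 0]

# TF-IDF features of unigrams and bigrams feeding a logistic regression classifier
clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
clf.fit(posts, labels)
print(clf.predict(["having strong cravings and thinking about using"]))
```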

3.3 Risk Factors for Modelling Substance Abuse Using Artificial Intelligence Models
Illicit substance use is associated with various factors, including psychological and socioeconomic data, sociodemographic data, phenotypic characteristics and environmental factors, family setup, and religion. A study conducted by [12] applied random forest, SVM, MLP and logistic regression to analyse substance use and its outcomes using psychological self-regulation, behaviour control and daily routine data. Additionally, Rakovski et al. [27] utilised socioeconomic and sociodemographic data as predictors for predicting ISU among young adults. Predictors such as age, gender, religion, alcohol dependence, tobacco abuse, and family history of alcohol dependence were significantly associated with illicit substance use. Also, a study by Ruberu et al. [23] utilised sociodemographic data (culture, gender, age), family substance use and general environmental factors such as early life stress, peer friendships, the level of parental monitoring and the involvement of parents in daily activities. These factors were used to predict illicit substance use among adolescents using multivariate covariance generalised linear models. Age, early life stress, lifetime substance use, age of first use, maternal education, parental attachment, family cigarette use, and family history of substance use are some of the risk factors that have been used to predict illicit substance use [27]. Conversely, a systematic review conducted by Nawi et al. [3] classified ISU risk factors into three main groups: community, individual and family factors. Community factors include having friends and peers who abuse substances and the community's culture and beliefs. Individual factors include low religiosity, peer pressure, a negative upbringing, psychiatric disorders, rebelliousness, exposure to hazardous substances and behavioural addiction, and easy access to illicit substances. Family risk factors consist of a family history of substance use, maternal smoking, the family's abusive and addictive behaviour, a poor level of monitoring, and negligence [3]. These factors are paramount in predicting illicit substance use.
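For illustration, risk factors of the kinds listed above could be encoded into a model-ready feature matrix as follows; the column names and values are hypothetical and serve only to show how categorical factors can be prepared for the classifiers discussed in Sect. 3.2.

```python
import pandas as pd

records = pd.DataFrame({
    "age": [16, 19, 23],
    "gender": ["male", "female", "male"],
    "family_history_substance_use": [1, 0, 1],
    "peer_substance_use": [1, 1, 0],
    "parental_monitoring": ["low", "high", "medium"],
})
# one-hot encode the categorical risk factors so any standard classifier can use them
X = pd.get_dummies(records, columns=["gender", "parental_monitoring"])
print(X.columns.tolist())
```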

4 Barriers, Challenges and Recommendations for Integrating Data-Driven Artificial Intelligence Models for Tackling Illicit Substance Use
Applying artificial intelligence techniques such as deep learning and machine learning models for modelling risks and predicting illicit substance use requires massive data for training, testing and, sometimes, validating the performance of the model. However, applying such models encounters impediments because illicit substance use data are generally not readily available [52]. Several studies applied artificial intelligence models to survey data to predict the severity of substance use [22] and to evaluate substance use disorder [12]. A study by Walsh et al. [52] highlighted that the shortage of data is caused by numerous factors, including stigma, patients not seeking treatment, self-discriminating and stereotyping behaviour, reduced independence for patients abusing substances and the criminalisation of ISU. Artificial intelligence algorithm bias has been reported by Walsh et al. [52] as one of the major barriers to integrating data-driven artificial intelligence-based applications for tackling illicit substance use. A biased artificial intelligence algorithm misses key connections between the input features and the output variable, which consequently affects the performance of the application. Reducing bias tends to increase the performance of the model; however, trade-offs between performance, underfitting and overfitting should be carefully considered and evaluated. Bias can occur in data, in model specification [52], and in the training, testing, validation and deployment of artificial intelligence algorithms, especially in machine learning. Therefore, a robust discussion should address the necessity of data reuse and sharing to increase algorithm transparency and improve algorithm correctness and reliability. AI-based applications require consistently measured, timely data to make long-term predictions. However, structural barriers limit access to quality data; constraints include the absence of enabling policies, poor reporting of substance abuse [53] and the lack of electronic health records. Electronic health records (EHR) have been demonstrated to be a crucial tool for enhancing patient information access and the standard of care. EHRs have not, however, been widely used or adopted in sub-Saharan Africa (SSA) [54, 55]. Several countries in SSA lack a clear policy on the implementation of EHRs, as well as any financial incentives to steer the adoption of EHRs. In SSA countries, there are competing needs for already challenged health systems, such that substance abuse is not a priority. This limits the data available for modelling, monitoring and predicting substance abuse due to missing data points. Developing AI models with missing data may lead to bias and loss of precision ('inefficiency') [56]. Secondary data from drug market surveillance, drug testing services and wastewater are also useful in providing the often missing part of the equation. A limiting factor for most of these secondary data sources is restricted access for public health [53]. To improve access to timely data on substance abuse, there is a need to implement epidemiological surveillance infrastructure, which requires the removal of structural barriers. Countries must be urged to invest in or improve health systems that promote the collection and sharing of data on substance abuse. Once the data constraint hurdle is overcome, the next challenge is identifying the most appropriate indicators of illicit substance use severity. There is no clear consensus on a definition of severity, and it is usually inferred from the frequency of consumption or from symptoms resulting from ISU. These metrics of severity are subject to different interpretations, making it difficult to standardise the indicators of ISU severity [22]. The combination of missing data points and the choice of indicators introduces biases into algorithms. These biases include "patients not identified by algorithms, sample size and underestimation, and misclassification and measurement error", and they result from socioeconomic disparities in health care [57]. Several studies reported the lack of rehabilitation facilities in many SSA countries. The shortage of rehabilitation facilities could be alleviated by adopting and implementing digital rehabilitation services. Though this area is in its infancy and more research is needed, digital rehabilitation services are delivered through digital tools such as smart applications focused on improving patient outcomes and relapse prevention [58]. For example, the digital rehabilitation start-up Workit Health offers group therapy, coaching and medication-assisted treatment through a mobile application. A shortage of health workers, especially psychiatrists, has been reported in many studies conducted in SSA. Addressing the shortages in the mental health workforce requires more than scaling up the training of psychiatrists, psychologists and psychiatric nurses; it also requires task shifting, i.e., delegating healthcare tasks from specialists to various non-specialist health professionals and other health workers [59]. Additionally, the integration of mental health services into primary health care is delivered through community-based and task-sharing approaches. The competing needs of health systems, together with the stigma of ISU, make it challenging to provide adequate health care to people who use drugs. Drawing on the emerging digital phenotyping approaches, which leverage ubiquitous sensing technology [60], data-driven AI smart applications can be developed for remote monitoring, tracking and reporting of those with substance abuse tendencies. However, the implementation and adoption of these emerging technologies require appropriate funding of digital infrastructure and m-Health policies. Most countries, especially LMICs, lack political will and digital health governance policies, and this requires urgent redress through crafting policies and frameworks that support the use of AI in the fight against substance abuse [61].
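As a minimal illustration of one way to mitigate the missing-data problem discussed above (noting that multiple imputation is not always the answer [56]), a simple single-imputation baseline might look as follows; the feature matrix is hypothetical.

```python
import numpy as np
from sklearn.impute import SimpleImputer

# hypothetical feature matrix with missing entries (np.nan)
X = np.array([[25.0, 1.0, np.nan],
              [31.0, np.nan, 0.0],
              [np.nan, 0.0, 1.0]])

# median imputation as a simple, transparent baseline before model training
X_imputed = SimpleImputer(strategy="median").fit_transform(X)
print(X_imputed)
```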


5 Conclusion
Illicit substance use has increased substantially globally, and early detection, prediction and identification of risk groups are imperative to reduce its impact. Due to the dearth of literature in this emerging research area, we conducted a systematic review on the application of AI models for monitoring, modelling and predicting illicit substance use. The study revealed that there is positive progress towards the application of artificial intelligence algorithms to tackle illicit substance abuse. AI models have been used to perform various tasks, including predicting, identifying high-risk groups, detecting ISU, and assessing ISU predictors to develop and expedite the early implementation of interventions that reduce the associated adverse consequences. Age, gender, early life stress, lifetime substance use, maternal education, parental attachment, having friends who abuse drugs, culture and religion, family cigarette use, and family history of substance use are some of the risk factors that have been used to predict illicit substance use with artificial intelligence algorithms. Support vector machines, random forest, logistic regression, KNN, decision trees and natural language processing algorithms are among the AI algorithms that have been predominantly used to analyse and predict illicit substance use. Such models can tremendously assist in the identification of ISU risk factors among youths, adolescents and adults. These models can be used further to develop data-driven artificial intelligence-based ISU tools to identify individuals at risk and alert health workers to provide appropriate interventions and prevention measures.

References 1. WHO. World Drug Report 2022. WHO 2022. https://www.unodc.org/unodc/data-and-ana lysis/world-drug-report-2022.html. Accessed 9 Apr 2023 2. Mukwenha, S., Murewanhema, G., Madziva, R., Dzinamarira, T., Herrera, H., Musuka, G.: Increased illicit substance use among Zimbabwean adolescents and youths during the COVID19 era: an impending public health disaster. Addiction 117, 1177–1178 (2022). https://doi. org/10.1111/ADD.15729 3. Nawi, A.M., et al.: Risk and protective factors of drug abuse among adolescents: a systematic review. BMC Publ. Health 21, 1–15 (2021). https://doi.org/10.1186/S12889-021-11906-2/ FIGURES/2 4. Drabble, L., Trocki, K.F., Klinger, J.L.: Religiosity as a protective factor for hazardous drinking and drug use among sexual minority and heterosexual women: findings from the national alcohol survey. Drug Alcohol Depend. 161, 127–134 (2016). https://doi.org/10.1016/J.DRU GALCDEP.2016.01.022 5. Crews, F., He, J., Hodge, C.: Adolescent cortical development: a critical period of vulnerability for addiction. Pharmacol. Biochem. Behav. 86, 189–199 (2007). https://doi.org/10.1016/J. PBB.2006.12.001 6. Hall, K.S., Moreau, C., Trussell, J.: The link between substance use and reproductive health service utilization among young US women. Subst. Abuse 34, 283–91 (2013). https://doi.org/ 10.1080/08897077.2013.772934 7. Reardon, D.C., Coleman, P.K., Cougle, J.R.: Substance use associated with unintended pregnancy outcomes in the national longitudinal survey of youth 30, 369–83 (2004). https://doi. org/10.1081/ADA-120037383


8. Fergie, L., Campbell, K.A., Coleman-Haynes, T., Ussher, M., Cooper, S., Coleman, T.: Identifying effective behavior change techniques for alcohol and illicit substance use during pregnancy: a systematic review. Ann. Behav. Med. 53, 769–781 (2019). https://doi.org/10.1093/ abm/kay085 9. Nyaga, J.: Socio-economic and health consequences of drugs and substance use in Gachie, a Peri-urban town on the outskirts of Nairobi. Afr. J. Alcohol Drug Abuse 6 (2021) 10. Afshar, M., et al.: Development and multimodal validation of a substance misuse algorithm for referral to treatment using artificial intelligence (SMART-AI): a retrospective deep learning study. Lancet Digit. Health 4, e426–e435 (2022). https://doi.org/10.1016/S2589-7500(22)000 41-3 11. Mbunge, E., Batani, J.: Application of deep learning and machine learning models to improve healthcare in sub-Saharan Africa: emerging opportunities, trends and implications. Telemat. Inform. Rep. 11, 100097 (2023). https://doi.org/10.1016/J.TELER.2023.100097 12. Jing, Y., et al.: Analysis of substance use and its outcomes by machine learning I. Childhood evaluation of liability to substance use disorder. Drug Alcohol Depend 206, 107605 (2020). https://doi.org/10.1016/J.DRUGALCDEP.2019.107605 13. Semple, D.M., McIntosh, A.M., Lawrie, S.M.: Cannabis as a risk factor for psychosis: systematic review 19, 187–94 (2005). https://doi.org/10.1177/0269881105049040 14. Guxensa, M., Nebot, M., Ariza, C., Ochoa, D.: Factors associated with the onset of cannabis use: a systematic review of cohort studies. Gac. Sanit. 21, 252–260 (2007). https://doi.org/ 10.1157/13106811 15. Moore, T.H., et al.: Cannabis use and risk of psychotic or affective mental health outcomes: a systematic review. Lancet 370, 319–328 (2007). https://doi.org/10.1016/S0140-6736(07)611 62-3 16. Nargiso, J.E., Ballard, E.L., Skeer, M.R.: A systematic review of risk and protective factors associated with nonmedical use of prescription drugs among youth in the United States: a social ecological perspective 76, 5–20 (2015). https://doi.org/10.15288/JSAD.2015.76.5 17. Mahmud, M.S., Fang, H., Carreiro, S., Wang, H., Boyer, E.W.: Wearables technology for drug abuse detection: a survey of recent advancement. Smart Health 13, 100062 (2019). https:// doi.org/10.1016/J.SMHL.2018.09.002 18. Kunchay, S., Abdullah, S.: WatchOver: using apple watches to assess and predict substance co-use in young adults. UbiComp/ISWC 2020 Adjunct Proceedings of the 2020 ACM International Joint Conference on Pervasive and Ubiquitous Computing and Proceedings of the 2020 ACM International Symposium on Wearable Computers, pp. 488–93 (2020). https:// doi.org/10.1145/3410530.3414373 19. Hamideh, D., Nebeker, C.: The digital health landscape in addiction and substance use research: will digital health exacerbate or mitigate health inequities in vulnerable populations? Curr. Addict. Rep. 7, 317–332 (2020). https://doi.org/10.1007/S40429-020-00325-9/ TABLES/3 20. CASP-UK. CASP CHECKLISTS. CASP-UK Website (2021) 21. Batani, J., Maharaj, M.S.: Towards data-driven models for diverging emerging technologies for maternal, neonatal and child health services in sub-Saharan Africa: a systematic review. Glob. Health J. (2022). https://doi.org/10.1016/J.GLOHJ.2022.11.003 22. Hu, Z., et al.: Analysis of substance use and its outcomes by machine learning: II. Derivation and prediction of the trajectory of substance use severity. Drug Alcohol Depend. 206, 107604 (2020). https://doi.org/10.1016/J.DRUGALCDEP.2019.107604 23. 
Ruberu, T.L.M., et al.: Joint risk prediction for hazardous use of alcohol, cannabis, and tobacco among adolescents: a preliminary study using statistical and machine learning. Prev. Med. Rep. 25, 101674 (2022). https://doi.org/10.1016/J.PMEDR.2021.101674


24. Vunikili, R., Glicksberg, B.S., Johnson, K.W., Dudley, J.T., Subramanian, L., Shameer, K.: Predictive modelling of susceptibility to substance abuse, mortality and drug-drug interactions in opioid patients. Front. Artif. Intell. 4, 172 (2021). https://doi.org/10.3389/FRAI.2021.742 723/BIBTEX 25. Alzubi, R., Alzoubi, H., Katsigiannis, S., West, D., Ramzan, N.: Automated detection of substance-use status and related information from clinical text. Sensors 22, 9609 (2022). https://doi.org/10.3390/S22249609 26. Steele, V.R., et al.: Machine learning of functional magnetic resonance imaging network connectivity predicts substance abuse treatment completion. Biol. Psychiatry Cogn. Neurosci. Neuroimaging 3, 141–149 (2018). https://doi.org/10.1016/J.BPSC.2017.07.003 27. Rakovski, C., et al.: Predictors of illicit substance abuse/dependence during young adulthood: a machine learning approach. J. Psychiatry Res. 157, 168–73 (2023). https://doi.org/10.1016/ J.JPSYCHIRES.2022.11.030 28. Kornfield, R., et al.: Detecting recovery problems just in time: application of automated linguistic analysis and supervised machine learning to an online substance abuse forum. J. Med. Internet Res. 20(6), e10136 (2018). https://doi.org/10.2196/10136 29. Ovalle, A., et al.: Leveraging social media activity and machine learning for HIV and substance abuse risk assessment: development and validation study. J Med Internet Res 23(4), e22042 (2021). https://doi.org/10.2196/22042 30. Islam, U.I., Sarker, I.H., Haque, E., Hoque, M.M.: Predicting individual substance abuse vulnerability using machine learning techniques. Adv. Intell. Syst. Comput. 1375(AIST), 412–421 (2021). https://doi.org/10.1007/978-3-030-73050-5_42/COVER 31. Acion, L., Kelmansky, D., Van, L.M.D., Sahker, E., Jones, D.S., Arndt, S.: Use of a machine learning framework to predict substance use disorder treatment success. PLoS ONE 12, e0175383 (2017). https://doi.org/10.1371/JOURNAL.PONE.0175383 32. Gautam, P., Singh, P.: A machine learning approach to identify socio-economic factors responsible for patients dropping out of substance abuse treatment. Am. J. Publ. Health Res. 8, 140–6 (2020). https://doi.org/10.12691/ajphr-8-5-2 33. Nath, P., Kilam, S., Swetapadma, A.: A machine learning approach to predict volatile substance abuse for drug risk analysis. In: Proc - 2017 3rd IEEE International Conference on Research in Computational Intelligence and Communication Networks, ICRCICN 2017, pp. 255–258 (2017). https://doi.org/10.1109/ICRCICN.2017.8234516 34. Yetisgen, M., Vanderwende, L.: Automatic identification of substance abuse from social history in clinical text. In: ten Teije, A., Popow, C., Holmes, J., Sacchi, L. (eds.) Artificial Intelligence in Medicine. Lecture Notes in Computer Science(), vol. 10259, pp. 171–181. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-59758-4_18/COVER 35. Fede, S.J., Grodin, E.N., Dean, S.F., Diazgranados, N., Momenan, R.: Resting state connectivity best predicts alcohol use severity in moderate to heavy alcohol users. NeuroImage Clin. 22, 101782 (2019). https://doi.org/10.1016/j.nicl.2019.101782 36. Kumari, D., Swetapadma, A.: Analysis of alcohol abuse using improved artificial intelligence methods. In: Journal of Physics: Conference Series, vol. 1950, p. 012003 (2021). https://doi. org/10.1088/1742-6596/1950/1/012003 37. Oliva, V., et al.: Machine learning prediction of comorbid substance use disorders among people with bipolar disorder. J. Clin. Med. 11, 1–13 (2022). https://doi.org/10.3390/jcm111 43935 38. 
Surden, H.: Harry surden, artificial intelligence and law: an overview. Georgia State Univ. Law Rev. 35, 1305–1337 (2019) 39. Macaulay, B.O., Aribisala, B.S., Akande, S.A., Akinnuwesi, B.A., Olabanjo, O.A.: Breast cancer risk prediction in African women using random forest classifier. Cancer Treat. Res. Commun. 28, 100396 (2021). https://doi.org/10.1016/J.CTARC.2021.100396


40. Haas, O., Maier, A., Rothgang, E.: Machine learning-based HIV risk estimation using incidence rate ratios. Front. Reprod. Health 0, 96 (2021). https://doi.org/10.3389/FRPH.2021. 756405 41. Mbunge, E., et al.: Predicting diarrhoea among children under five years using machine learning techniques, 94–109 (2022). https://doi.org/10.1007/978-3-031-09076-9_9 42. Chingombe, I., et al.: Predicting HIV status using machine learning techniques and biobehavioural data from the Zimbabwe population-based HIV impact assessment (ZIMPHIA15–16). Cybern. Perspect. Syst., 247–58 (2022). https://doi.org/10.1007/978-3-03109076-9_24 43. Lee, E.E., et al.: Artificial intelligence for mental health care: clinical applications, barriers, facilitators, and artificial wisdom. Biol. Psychiatry Cogn. Neurosci. Neuroimaging 6, 856–864 (2021). https://doi.org/10.1016/J.BPSC.2021.02.001 44. Barenholtz, E., Fitzgerald, N.D., Hahn, W.E.: Machine-learning approaches to substanceabuse research: emerging trends and their implications. Curr. Opin. Psychiatry 33, 334–342 (2020). https://doi.org/10.1097/YCO.0000000000000611 45. Islam, U.I., Haque, E., Alsalman, D., Islam, M.N., Moni, M.A., Sarker, I.H.: A machine learning model for predicting individual substance abuse with associated risk-factors. Ann. Data Sci., 1–28 (2022). https://doi.org/10.1007/S40745-022-00381-0/METRICS 46. Wray, T.B., Luo, X., Ke, J., Pérez, A.E., Carr, D.J., Monti, P.M.: Using smartphone survey data and machine learning to identify situational and contextual risk factors for HIV risk behavior among men who have sex with men who are not on PrEP. Prev. Sci. 20, 904–913 (2019). https://doi.org/10.1007/S11121-019-01019-Z/FIGURES/1 47. Bailey, J.D., DeFulio, A.: Predicting substance use treatment failure with transfer learning. Subst. Use Misuse 57, 1982–1987 (2022). https://doi.org/10.1080/10826084.2022.2125272 48. Carnegie Mellon University. Introduction to Machine Learning. Carnegie Mellon Univ Website (2020) 49. Griffis, J.C., Allendorfer, J.B., Szaflarski, J.P.: Voxel-based Gaussian naïve Bayes classification of ischemic stroke lesions in individual T1-weighted MRI scans. J. Neurosci. Methods 257, 97–108 (2016). https://doi.org/10.1016/j.jneumeth.2015.09.019 50. Martins, C.: Gaussian naive bayes explained and hands-on with scikit-learn. Tower AI (2022) 51. Tan, P.-N., Steinbach, M., Karpatne, A., Kumar, V.: Introduction to Data Mining. 2nd ed. (2019) 52. Walsh, C.G., et al.: Stigma, biomarkers, and algorithmic bias: recommendations for precision behavioral health with artificial intelligence. JAMIA Open 3, 9–15 (2020). https://doi.org/10. 1093/JAMIAOPEN/OOZ054 53. Marks, C., et al.: Methodological approaches for the prediction of opioid use-related epidemics in the United States: a narrative review and cross-disciplinary call to action. Transl. Res. 234, 88–113 (2021). https://doi.org/10.1016/J.TRSL.2021.03.018 54. Kyei-Nimakoh, M., Carolan-Olah, M., McCann, T.V.: Access barriers to obstetric care at health facilities in sub-Saharan Africa-a systematic review. Syst. Rev. 6 (2017). https://doi. org/10.1186/s13643-017-0503-x 55. Batani, J., Maharaj, M.S.: Towards data-driven pediatrics in Zimbabwe. In: 2022 International Conference on Artificial Intelligence, Big Data, Computing and Data Communication Systems, pp. 1–7 (2022). https://doi.org/10.1109/ICABCD54961.2022.9855907 56. Hughes, R.A., Heron, J., Sterne, J.A.C., Tilling, K.: Accounting for missing data in statistical analyses: multiple imputation is not always the answer. Int. J. Epidemiol. 
48, 1294–1304 (2019). https://doi.org/10.1093/IJE/DYZ032 57. Gianfrancesco, M.A., Tamang, S., Yazdany, J., Schmajuk, G.: Potential biases in machine learning algorithms using electronic health record data. JAMA Intern. Med. 178, 1544–1547 (2018). https://doi.org/10.1001/JAMAINTERNMED.2018.3763


58. Wang, Z., et al.: A community-based addiction rehabilitation electronic system to improve treatment outcomes in drug abusers: protocol for a randomized controlled trial. Front. Psychiatry 9 (2018). https://doi.org/10.3389/FPSYT.2018.00556 59. Baingana, F., Al’Absi, M., Becker, A.E., Pringle, B.: Global research challenges and opportunities for mental health and substance-use disorders. Nature 527, 172–177 (2015). https:// doi.org/10.1038/nature16032 60. Winslow, B., Mills, E.: Future of service member monitoring: the intersection of biology, wearables and artificial intelligence. BMJ Mil Health (2023). https://doi.org/10.1136/MIL ITARY-2022-002306 61. Chitungo, I., Mhango, M., Mbunge, E., Dzobo, M., Musuka, G., Dzinamarira, T.: Utility of telemedicine in sub-Saharan Africa during the COVID-19 pandemic. A rapid review. Hum. Behav. Emerg. Technol. 3, 843–53 (2021). https://doi.org/10.1002/HBE2.297

On the Maximal Sets of the Shortest Vertex-Independent Paths

Boris Melnikov1(B) and Yulia Terentyeva2

1 Shenzhen MSU – BIT University, 1 International University Park Road, Dayun New Town, Longgang District, Shenzhen 518172, China
[email protected]
2 Center for Information Technologies and Systems of Executive Authorities Named After A.V. Starovoytov, Moscow, Russian Federation

Abstract. The paper proposes a method for finding the maximum set of shortest vertex-independent paths between the vertices of a graph. The application area of the method includes algorithms for estimating the stability of a communication network and the bandwidth capacity of a communication direction under channel switching. All these algorithms are extremely important in the process of designing and/or modernizing a communication network when solving the problems of finding redundant paths. The need to develop the proposed method is primarily due to technical and economic factors associated with the organization of backup routes on communication networks. It is also dictated by the low efficiency of existing algorithms for finding reserve routes, which are based on algorithms for finding the shortest paths between the vertices of the graph of the communication network. It is shown that the problem of searching for reserve routes should be solved comprehensively, rather than by sequential application of a shortest-path algorithm, since for a communication network it is, loosely speaking, more important to have two non-shortest routes that reserve each other than one shortest route that "kills" two potentially available routes; moreover, the construction of a reserve route is, in the general case, a resource-intensive activity for high-dimensional communication networks. The proposed method for finding the maximum set of shortest vertex-independent paths has been tested on communication networks of real scale, which showed its high efficiency and the possibility of using it in resource-intensive problems related to the stability of the communication network, as well as in flow distribution and routing problems. #COMESYSO1120.
Keywords: Communication Network · Stability · Vertex-Independent Paths

1 Introduction
The paper considers the mathematical aspect of the problem of reserving simple paths in communication networks and proposes a constructive algorithm for its solution. The relevance of this problem is due to the development of technology for designing communication networks. In particular, the problem arises when it is necessary to calculate the most important technical characteristics of the communication network, which include stability and capacity. An effective solution of such a problem is especially important for communication networks of high dimensions, because it is inextricably linked to the technical and economic parameters of the communication network during its design and/or modernization. The problem of reserving simple paths in communication networks has been considered repeatedly in modern works [1–4], since the scope of application of a solution to this reserving problem in communication networks is quite wide. However, the main drawback in the treatment of this problem is that it has been solved by successive use of shortest-path search methods. For real-world communication networks, especially large-scale ones, this inevitably leads to an inefficient solution. For example, no reserve path may be found when in fact two vertex-independent simple paths between the given vertices exist, each of which is not a shortest path. The lack of a reserve path can be a critical factor in both survivability and information flow issues. The authors have also attempted to solve this problem earlier [5, 6], and as a result of these attempts the main difficulties arising here have been investigated. An example of the "loss" of a vertex-independent path by classical algorithms (Dijkstra [7], etc.) is shown in Fig. 1. Here, there are obviously two independent paths between the vertices u1 and u2, namely u1-6-7-8-9-u2 and u1-1-2-3-4-5-u2. However, when using classical methods of finding the shortest paths between the vertices of the graph, we obtain only one path: u1-6-4-5-u2.


Fig. 1. An example of “loss” of a reserve path when using classical methods of finding shortest paths between given graph vertices.

Theoretical results concerning the so-called reserve paths connecting two non-adjacent vertices appeared in the 1920s in connection with Menger's theorem [8], which shows that the smallest number of vertices separating two non-adjacent vertices s and t is equal to the largest number of pairwise non-intersecting simple (s-t)-chains.


Further, the study of the optimal sets of vertices and edges (arcs) of undirected and directed graphs separating two non-adjacent vertices led to the Ford-Fulkerson maximum-flow minimum-cut theorem [9], which can be used to determine the maximum number of reserve paths. However, the maximum-flow algorithm itself does not provide the reserve vertex-independent paths. To solve this problem, we introduce the definition of the maximal set of shortest vertex-independent paths, which is directly related to the reserve paths (hereinafter, reserve paths are understood as vertex-independent paths). An algorithm for finding this set is developed. A graph generation model for testing the algorithm is constructed, and experimental results of its operation are given.

2 Preliminary Information: Definitions and Notations
Let us introduce the definition of the maximal set of shortest vertex-independent paths. First, let us define the measure (or length) of a path, with respect to which we will order the paths and find the shortest one. The measure (length) of a path is the number of vertices of the path, not counting the initial and final vertices. Consequently, the shortest path is the path containing the smallest number of vertices from the source to the sink. Further, note that we give priority to the largest number of paths from the source to the sink that have no pairwise common vertices (and hence no common edges). Thus, the maximal set of shortest vertex-independent paths connecting two non-adjacent vertices (let us call them the source and the sink) is the set consisting of pairwise non-intersecting paths whose number is maximal and whose total measure is minimal. In particular, in Fig. 1 this set is represented by two paths: p1 = {u1, 6, 7, 8, 9, u2} and p2 = {u1, 1, 2, 3, 4, 5, u2}.
Let G = <V, E> be an undirected communication network graph describing the communication network model. Here V = {vi} (i = 1, 2, ..., n) is the set of nodes of the graph that model the nodes of the communication network, and E = {ei} (i = 1, 2, ..., m) is the set of edges of the graph that model the links of the communication network. Thus ei = (v1(i), v2(i)), where v1(i) ∈ V, v2(i) ∈ V. We denote by u1 the source vertex and by u2 the sink vertex, u1 ∈ V, u2 ∈ V. We denote the i-th path pi (i ∈ N) between the source and the sink as the ordered set {u1, v1(i), v2(i), ..., vn(i), u2}. Note that, within the introduced notation, the length of the i-th path is n. By the function q(pi) we denote the length of the path pi. Let P = {p1, p2, ..., pM} be the set of all possible paths from the source to the sink, including intersecting paths. Suppose there are z pairwise non-intersecting paths between the source and the sink:

p1 = {u1, v1(1), v2(1), ..., vn(1), u2}
p2 = {u1, v1(2), v2(2), ..., vn(2), u2}
...
pz = {u1, v1(z), v2(z), ..., vn(z), u2}    (1)

That is, no two of these paths share a transit vertex: there are no two indices i ≠ j for which some vertex vk(i) of the path pi coincides with a vertex vl(j) of the path pj. Here p1 ∈ P, p2 ∈ P, ..., pz ∈ P.


We introduce the notion of a labeled edge as follows. Let us define a function f on the set E:

f(ei) = 0, if the edge ei is unlabeled; f(ei) = 1, if the edge ei is labeled.    (2)
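As a small worked illustration of the measure q and the path notation introduced above (an illustrative Python sketch only; the vertex names follow Fig. 1):

```python
def q(path):
    """Measure (length) of a path: the number of transit vertices,
    i.e. all vertices except the initial and final ones."""
    return len(path) - 2

p1 = ["u1", 6, 7, 8, 9, "u2"]       # the first path of Fig. 1
p2 = ["u1", 1, 2, 3, 4, 5, "u2"]    # the second path of Fig. 1
print(q(p1), q(p2))                 # 4 5, so the total measure of this set is 9
```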

3 Problem Statement
Within the framework of the introduced notation, we need to solve the following minimax problem. Among the families of paths p1 ∈ P, p2 ∈ P, ..., pz ∈ P that are pairwise vertex-independent, i.e. that satisfy pi ∩ pj = {u1, u2} for all i ≠ j, we first

maximize z,    (3)

and then, among the families of maximal size, we

minimize q(p1) + q(p2) + ... + q(pz).    (4)

In other words, we need to find the maximum number of vertex-non-intersecting shortest paths (as well as the paths themselves) connecting the source and the sink.

4 Algorithm Description
To develop an algorithm for finding the maximum set of shortest vertex-independent paths, we first need to construct an oriented graph G→ on the basis of the undirected graph G by taking the following steps (let us call this Algorithm 1).
4.1 Algorithm 1
Step 1. The set of vertices of the graph G→ coincides with the set of vertices of the graph G.
Step 2. We assume a new set of oriented edges E→ = ∅.
Step 3. For each edge ei = (v1(i), v2(i)), where ei ∈ E, i = 1, 2, ..., m, we make two oriented edges e1(i) = (v1(i), v2(i)) and e2(i) = (v2(i), v1(i)).
Step 4. E→ := E→ ∪ {e1(i)}.
Step 5. E→ := E→ ∪ {e2(i)}.
Step 6. If all edges have been viewed, stop. Otherwise, go to Step 3.
As a result of the above procedure, we obtain a graph in which each edge is replaced by two edges with the same vertices and opposite orientation (see Fig. 2). Next, we directly search for the maximal set of vertex-independent shortest paths between the vertices u1 and u2, already using the intermediate oriented graph G→. Let us call this set of actions Algorithm 2.
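The construction of Algorithm 1 is simple enough to sketch directly. The following is an illustrative Python sketch using the networkx library (not the authors' implementation), assuming the communication network graph G is given as an undirected networkx Graph:

```python
import networkx as nx

def algorithm_1(G: nx.Graph) -> nx.DiGraph:
    """Replace every undirected edge of G by a pair of arcs with the same
    endpoints and opposite orientation (Steps 1-6 of Algorithm 1)."""
    D = nx.DiGraph()
    D.add_nodes_from(G.nodes)        # Step 1: the vertex set is unchanged
    for v1, v2 in G.edges:           # Steps 3-5, repeated for every edge
        D.add_edge(v1, v2)           # arc e1(i) = (v1(i), v2(i))
        D.add_edge(v2, v1)           # arc e2(i) = (v2(i), v1(i))
    return D
```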


Fig. 2. The result of the algorithm of generating the oriented graph G→ on the basis of the graph G.

4.2 Algorithm 2
Step 1. Create the set M of vertex-independent shortest paths between u1 and u2, M := ∅, and the set W of vertices that constitute these paths, except for the vertices u1 and u2, W := ∅.
Step 2. Find by breadth-first search (using the Lee algorithm [9]) the shortest oriented path between u1 and u2 in the oriented graph G→ such that all edges of the path are unlabeled. If the search reaches a vertex q that is already contained in the set W, we continue the path search only along the arc originating from the vertex q.
Step 3. If a path is found, add it to the set M, add all its transit vertices to the set W and go to Step 4. Otherwise, go to Step 6.
Step 4. Label all edges of the path found at Step 2.
Step 5. Look through all edges of the graph. If any edge is labeled at the same time as its antagonist edge (i.e. the vertices of these oppositely directed edges are the same), then remove this edge from the graph G→ together with the antagonist edge and go to Step 1. Otherwise, go to Step 2.
Step 6. Stop. M is the maximal set of found vertex-independent paths between the vertices u1 and u2 in the original graph G.
Thus, we have constructed an algorithm that allows us to find vertex-independent paths more efficiently than methods based on shortest path finding algorithms. Moreover, the proposed algorithm finds the maximum number of vertex-independent shortest paths.
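For experimentation, the same minimax problem can be cross-checked with a standard flow construction (vertex splitting combined with a minimum-cost maximum flow), which rests on the Menger and Ford-Fulkerson results cited in the introduction. The sketch below is explicitly not the authors' Algorithm 2; it is an independent reference computation in Python with networkx, and the edge list at the end is only a guess at the topology of Fig. 1.

```python
import networkx as nx

def max_set_of_shortest_disjoint_paths(edges, u1, u2):
    """Vertex-independent u1-u2 paths, maximal in number and minimal in total
    measure q, via node splitting and min-cost max-flow (a reference check,
    not the paper's Algorithm 2). Graph vertices must not themselves be tuples."""
    def split_in(v):  return v if v in (u1, u2) else (v, "in")
    def split_out(v): return v if v in (u1, u2) else (v, "out")
    D = nx.DiGraph()
    nodes = {v for e in edges for v in e}
    for v in nodes:
        if v not in (u1, u2):
            # a transit vertex may carry at most one path; unit weight makes
            # the min-cost flow minimise the total number of transit vertices
            D.add_edge((v, "in"), (v, "out"), capacity=1, weight=1)
    for a, b in edges:
        # Algorithm 1: every undirected edge becomes two opposite arcs
        D.add_edge(split_out(a), split_in(b), capacity=1, weight=0)
        D.add_edge(split_out(b), split_in(a), capacity=1, weight=0)
    flow = nx.max_flow_min_cost(D, u1, u2)
    paths = []
    while True:                      # decompose the flow into u1-u2 paths
        path, v = [u1], u1
        while v != u2:
            nxt = next((w for w, f in flow[v].items() if f > 0), None)
            if nxt is None:
                break
            flow[v][nxt] -= 1
            v = nxt
            if v == u2 or (isinstance(v, tuple) and v[1] == "out"):
                path.append(v if v == u2 else v[0])
        if v != u2:
            break
        paths.append(path)
    return paths

# A guessed edge list for the example of Fig. 1 (vertex names as in the figure).
edges = [("u1", 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, "u2"),
         ("u1", 6), (6, 7), (7, 8), (8, 9), (9, "u2"), (6, 4)]
print(max_set_of_shortest_disjoint_paths(edges, "u1", "u2"))
# expected: the two vertex-independent paths u1-1-2-3-4-5-u2 and u1-6-7-8-9-u2
```

Because every transit vertex is given unit cost, the minimum-cost maximum flow simultaneously maximizes the number of vertex-independent paths and minimizes their total measure, which matches the objective (3)-(4).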

5 Modeling of Test Cases
To model test cases of the proposed algorithm, as well as of other algorithms used in the analysis of communication networks, we have developed special software with a wide range of functionality, including verification of the correctness of the algorithms. Figure 3 presents screenshots of this software showing the results of applying the proposed algorithm.

Fig. 3. Search for reserve paths by the proposed algorithm. Demonstration with the help of special software.

The graph of the communication network is modeled randomly by varying the specified parameters. At the same time, for the verification of algorithms related to the search for reserve paths, a methodology is proposed that allows one to know a priori the number of possible reserve paths and thus to obtain a reliable indicator of the effectiveness of the proposed algorithm. The efficiency of the algorithm in this case is considered as the average ratio of the number of vertex-independent shortest paths found to the total number of paths. The average value is a statistically correct indicator: it relies on the property of the stability of averages, which holds here by virtue of the generalized law of large numbers for dependent random variables (A.A. Markov's law), according to which, with respect to the considered subject area, the average value of the connectivity probabilities converges to the arithmetic mean of their mathematical expectations as the number of communication directions increases (see [10, p. 294]). The proposed technique for generating vertex-independent paths, which gives an a priori exact maximum number of them between given vertices, is based on the following steps (a sketch of this generation pipeline is given after Table 1): 1) generation of the points (vertices) of the graph on the plane (on the map) with given parameters that determine the character of the point locations; 2) construction of a minimal spanning tree on the generated vertices; 3) construction of k additional vertex-independent paths between the given vertices with optimization of the length of the edges to be completed (the method proposed in [11] is used). Note that, for test cases generated by this technique, we evaluate the efficiency of the algorithm as the ratio of the number of vertex-independent paths found to the total number of available vertex-independent paths. In addition, step 3 of this technique can also be used on real communication networks. The quality check of the proposed algorithm is then performed taking into account that the number of vertex-independent paths on a given communication direction, defined by a pair of vertices, will be at least two under the condition of an a priori connected graph of the communication network.

Experimental Results
In order to increase the representativeness of the results, a module has been developed to automatically generate examples and, accordingly, to search for vertex-independent paths (Algorithm 2 of Sect. 4.2) between a particular source and sink. In addition, it should be noted that the proposed search algorithm has also been tested on real communication networks. The validation on real communication networks showed high search efficiency compared to search methods based on modifications of shortest-path finding algorithms. Different communication information directions were investigated. Unlike the generated cases, in the case of real communication networks the maximum number of vertex-independent paths between given vertices is not known a priori. In this case, we considered the difference between the number of vertex-independent paths found using the developed Algorithm 2 and the number of vertex-independent paths found using successive shortest-path algorithms. Table 1 summarizes the simulation results, which allow a comparative evaluation of the search results obtained with the proposed algorithm and with the algorithm based on shortest-path search methods. For each row of the table, characterizing the dimensionality of the communication network, a series of experiments consisting of 1,000,000 automatic generations of the topological graph of the communication network was conducted.

Table 1. Comparative table of modeling results: ratio of the number of vertex-independent paths found to the total number of vertex-independent paths (average value).

Dimensionality | Sequential application of shortest-path finding methods | Developed algorithm
100 | 0.64 | 1
1000 | 0.54 | 1
10000 | 0.38 | 1

6 Conclusion
The paper describes the developed method for finding the maximum set of vertex-independent shortest paths between given vertices in a communication network graph, conditionally divided into separate algorithms. The method has been tested on real communication networks, as well as on a representative number of simulated communication network graphs. The simulation results show the high performance of the proposed method. Where vertex-independent path-finding algorithms based on classical shortest-path algorithms lose one or more paths, the developed algorithm still performs an efficient search.


This result has practical significance in the design and/or modernization of communication networks, since providing a backup path between communication nodes is extremely important: a high-quality search for vertex-independent paths saves resources by avoiding the construction of additional edges (links) to create a reserve independent path. In other words, the economic effect of a high-quality search is achieved by ensuring the stability factor (as well as the possibly required throughput under static routing) of communication directions without the need to build new links to organize a reserve route. In the following publications we will provide a proof that the proposed method really finds the maximum set of vertex-independent shortest paths, which is also indirectly confirmed by the experiments. In addition, it will be proved that the sought set is invariant with respect to the parameter $\sum_{i=1}^{z} q(p_i)$, i.e., the total number of transit vertices in the maximal set of vertex-independent shortest paths in the graph of the communication network from source to sink will be the same for different path variations.

References 1. Yasinsky, S.A., Sokolov, V.M.: Modification of algorithms for finding shortest paths in the transport network of the telecommunication system for the dissemination of geoinformation. Inf. Space 2, 6–11 (2012) 2. Nazarov, A.N., Sychev, K.I.: Models and Methods for Calculating the Quality Indicators of the Functioning of Node Equipment and Structural and Network Parameters of Next-Generation Communication Networks. Polikom, Krasnoyarsk, Russia (2010) 3. Dyshlenko, S.G.: Routing in transport networks. Inf. Technol. Sci. Educ. Manage. 1, 15–20 (2018) 4. Tsvetkov, K., Makarenko, S.I., Mikhailov, R.L.: Formation of backup paths based on the Dijkstra algorithm in order to increase the stability of information and telecommunication networks. Inf. Control Syst. 2, 71–78 (2014) 5. Terentyeva, Y.: Determination of the maximum set of independent simple paths between the vertices of the graph. Mod. Inf. Technol. IT Educ. 17(2), 308–314 (2021) 6. Bulynin, A.G., Melnikov, B.F., Meshchanin, V., Terentyeva, Y.: Optimization problems arising in the design of high–dimensional communication networks and some heuristic methods for their solution. Inf. Commun. 1, 34–40 (2020) 7. Kormen, T., Leiserson, C., Rivest, R., Stein, K.: Algorithms: Construction and Analysis. Williams Publishing House, Moscow, Russia (2011) 8. Harari, F.: Graph Theory. Mir, Moscow, USSR (1973) 9. Ford, L., Fulkerson, D.: Flows in Networks. Mir, Moscow, USSR (1966) 10. Wentzel, E.S.: Probability Theory. Nauka, Moscow, USSR (1969) 11. Melnikov, B.F., Terentyeva, Y.Y.: Building an optimal spanning tree as a tool for ensuring the stability of a communication network. Tech. Sci. 1, 36–45 (2021). News of higher educational institutions. Volga region

Using Special Graph Invariants in Some Applied Network Problems

Boris Melnikov1(B), Aleksey Samarin1, and Yulia Terentyeva2

1 Shenzhen MSU – BIT University, 1, International University Park Road, Dayun New Town, Longgang District, Shenzhen 518172, China
[email protected]
2 Center for Information Technologies and Systems of Executive Authorities Named After A.V. Starovoytov, Moscow, Russian Federation

Abstract. The problem of checking the possible isomorphism of graphs has wide practical application and is an important problem for theoretical computer science in general and the theory of algorithms in particular. Among the numerous areas of application of algorithms for determining graph isomorphism, we note the problem of syntactic and structural pattern recognition, some problems of mathematical chemistry and chemoinformatics (the study of molecular structures of chemical compounds), and problems related to the study of social networks (for example, linking several accounts of one user on Facebook). In various algorithms for working with graphs, one of the most common invariants is the vector of degrees. However, the use of this invariant alone is apparently not sufficient for constructing most practical algorithms on graphs; a possible generalization of it is the more complex invariant considered by the authors, i.e., the vector of second-order degrees. At the same time, the graphs considered in this paper, together with the generated vector of second-order degrees, can serve as models for many real complex problems. Previously, works were published in which the orders of application of invariants computable in polynomial time were analyzed, together with variants of algorithms that require only small degrees of the applied polynomial. When analyzing such algorithms, the problem arises of comparing the invariants under consideration, i.e., comparing them by some specially selected metric that reflects the "quality" of the invariant on the subset of the set of all graphs under consideration. The article shows that, with respect to any natural metric, the vector of second-order degrees is better than the widely used Randich index. #COMESYSO1120. Keywords: Semigroup · Graph · Isomorphism · Communication Network

1 Introduction. The Problem of Isomorphism of Graphs In the modern world, graph analysis finds application in various disciplines, ranging from computer science to bioinformatics and social research [1]. Graphs are a convenient tool for modeling complex relationships between objects. However, graph analysis and classification remain non-trivial tasks requiring effective tools. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2024 R. Silhavy and P. Silhavy (Eds.): CoMeSySo 2023, LNNS 935, pp. 388–392, 2024. https://doi.org/10.1007/978-3-031-54820-8_31


The notion of invariants that retain their values under certain transformations plays a key role in graph analysis. An invariant is any mapping constant on equivalence classes. Since nonequivalence follows from the inequality of invariants, invariants serve as a means of distinguishing equivalence classes. A system of invariants is called complete if it distinguishes between any two equivalence classes, see [2–7]. However, the calculation of the complete invariant is computationally complex: algorithms that work in polynomial time are currently unknown. Therefore, a common problem is to choose the most suitable invariants for specific tasks. In this paper, we draw attention to an important class of invariants using the vertex degree distribution: Randich index and the double vector of vertex degrees. The purpose of this article is to study the relationship between the Randich index and the double vector of degrees of vertices of graphs. Special attention is paid to the identification of situations in which the dual vector of vertex degrees turns out to be a more effective tool for distinguishing graphs than the Randich index. We demonstrate that if the dual vector of degrees of vertices of two graphs is the same, then the Randich index for them will also be the same. However, the converse is not true. This shows that the dual vector of vertex degrees turns out to be a more reliable way to distinguish graphs, [6, 7]. We begin by describing the calculation of two invariants. Next, we describe examples where a double vector of vertex degrees distinguishes two graphs with the same Randich index. The examples are followed by a theorem in which we prove that the reverse situation is impossible. We describe a program in the Julia language, developed by us to calculate invariants and identify interesting cases.

2 Preliminaries
We assume that a graph G(V, E) is a set of vertices $V = \{v_1, ..., v_m\}$ and a set of edges $E = \{(v_{i_1}, v_{j_1}), ..., (v_{i_n}, v_{j_n})\}$. Two graphs are isomorphic if they can be made identical by renumbering the vertices. A graph invariant is a number or a set of values that does not depend on the way the vertices are numbered. For isomorphic graphs, the value of the invariant is the same. The degree of a vertex of a graph is the number of vertices to which this vertex is connected. Let us consider two invariants that use vertex degrees. The Randich index is an invariant of the graph described by the formula

$$\sum_{(v_i, v_j) \in E} \frac{1}{\sqrt{d(v_i)\, d(v_j)}},$$

where $v_i$, $v_j$ are the two endpoints of an edge and $d(v)$ is the degree of the vertex v; the summation is over all edges of the graph. It is obvious that the double vector of vertex degrees is an invariant of the graph. It can be calculated using the following 4 steps.

1. For each vertex $v_i$, determine the set of neighboring (adjacent) vertices $N(v_i) = \{v_{j_1}, ..., v_{j_k}\}$.


2. Next, replace each vertex number with its degree: $v_i \to d(v_i)$.
3. The resulting set of vertex degree sets needs to be ordered.
4. The result is a vector of vertex degree vectors.
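As an illustration of both invariants, the following Python sketch (our own, not the authors' Julia program) computes the Randich index and the double vector of vertex degrees for a graph given as an adjacency list; each undirected edge is traversed in both directions, which reproduces the values reported in Sect. 3. The two small example graphs correspond to the double vectors of the first example in Sect. 3.

import math
from typing import Dict, List


def randich_index(adj: Dict[int, List[int]]) -> float:
    # Sum 1/sqrt(d(v_i) d(v_j)) over ordered pairs (v_i, v_j) joined by an edge,
    # i.e. every undirected edge is counted in both directions.
    deg = {v: len(nbrs) for v, nbrs in adj.items()}
    return sum(1.0 / math.sqrt(deg[u] * deg[v]) for u in adj for v in adj[u])


def double_vector(adj: Dict[int, List[int]]) -> List[List[int]]:
    # Steps 1-4: neighbor sets -> neighbor degrees -> sort inner vectors -> sort the outer vector.
    deg = {v: len(nbrs) for v, nbrs in adj.items()}
    vectors = [sorted(deg[u] for u in adj[v]) for v in adj]
    return sorted(vectors)


# The star K_{1,4} and the disjoint union of a 4-cycle and an isolated vertex
star = {0: [1, 2, 3, 4], 1: [0], 2: [0], 3: [0], 4: [0]}
c4_plus_vertex = {0: [], 1: [2, 4], 2: [1, 3], 3: [2, 4], 4: [1, 3]}
print(randich_index(star), double_vector(star))                      # 4.0, [[1,1,1,1],[4],[4],[4],[4]]
print(randich_index(c4_plus_vertex), double_vector(c4_plus_vertex))  # 4.0, [[],[2,2],[2,2],[2,2],[2,2]]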

3 Some Results of Computational Experiments
The examples of this section were found using computer programs. We show that there are graphs for which the Randich index is the same, and the double vector of degrees of vertices is different. First, an example with the minimum number of vertices and edges that we managed to find; in this case, the second graph is not connected. (The graph drawings themselves are omitted here.)

Graph   | Randich index | Double vector
Graph 1 | 4.0           | [1,1,1,1] [4] [4] [4] [4]
Graph 2 | 4.0           | [] [2, 2] [2, 2] [2, 2] [2, 2]

In the second example, both graphs are connected. In this case, one graph is obtained from the other by moving one edge. There are 6 vertices and 6 edges in each graph, so this is the minimum number of edges that allows the graph to be connected. For 5 and fewer vertices, we have not found a pair of connected graphs that can be distinguished by the double vector of vertex degrees but cannot be distinguished by the Randich index.

Graph   | Randich index | Double vector
Graph 1 | 5.8637033     | [1, 3] [2] [2, 2] [2, 2, 2] [2, 3] [2, 3]
Graph 2 | 5.8637033     | [1, 2] [2] [2, 2, 2] [2, 3] [2, 3] [2, 3]

4 Theorem on the Relation of the Randich Index and the Double Vector
Theorem 1. Let two graphs have the same double vector of vertex degrees. Then the Randich index is the same for them.
Proof. It is enough to show that the Randich index can be calculated from the double vector alone. Let A be the double vector. Each element vector of the double vector corresponds to a vertex of the graph: $A = [a_1, a_2, \dots, a_m]$, where $a_i$ is the vector of degrees of the vertices adjacent to the vertex $v_i$. Note that the length of the vector $a_i$ is equal to the degree of the vertex $v_i$. The vector $a_i = [d_{i,1}, d_{i,2}, \dots, d_{i,k_i}]$, where $k_i = d(v_i)$ and $d_{i,j}$ is the degree of a vertex adjacent to the vertex $v_i$. Each element of $a_i$ corresponds to the directed edge from the vertex $v_i$ to an adjacent vertex. Let us flatten the vector A: $A = [a_1, \dots, a_m] \Rightarrow [d_{1,1}, d_{1,2}, \dots, d_{1,k_1}, d_{2,1}, \dots, d_{m,k_m}] = A_1$. Here the length of the vector $A_1$ is equal to the number of edges of the graph, and its elements are the degrees of the end vertex of each edge. Next, let us replace the elements of $a_i$ by the degree of the vertex $v_i$; we obtain $a_i = [d_{i,1}, d_{i,2}, \dots, d_{i,k_i}] \Rightarrow [k_i, k_i, \dots, k_i] = b_i$.


Similarly, let us flatten the vector $B = [b_1, \dots, b_m]$: $B_1 = [k_1, \dots, k_1, k_2, \dots, k_m] = [d(v_1), \dots, d(v_1), d(v_2), \dots, d(v_m)]$, where each $d(v_i)$ is repeated $k_i$ times. The vector $B_1$ is built in the same way as the vector $A_1$, but the elements of $B_1$ correspond to the degree of the initial vertex of each edge. Thus, we have obtained two vectors $A_1$, $B_1$ whose elements correspond, for each edge, to the degrees of its initial and final vertices. Obviously, the Randich index can be calculated using the vectors $A_1$ and $B_1$.
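The construction used in the proof can be illustrated with a short Python sketch (an illustration of ours, not part of the paper): it rebuilds the per-edge degree vectors A1 and B1 from the double vector alone and sums 1/sqrt(a·b).

import math
from typing import List


def randich_from_double_vector(double_vector: List[List[int]]) -> float:
    # a_i is the vector of neighbor degrees of vertex v_i; len(a_i) = d(v_i).
    a1 = [d for a_i in double_vector for d in a_i]          # degree of the end vertex of each edge
    b1 = [len(a_i) for a_i in double_vector for _ in a_i]   # degree of the initial vertex of each edge
    return sum(1.0 / math.sqrt(a * b) for a, b in zip(a1, b1))


# The first graph of the second example in Sect. 3
print(randich_from_double_vector([[1, 3], [2], [2, 2], [2, 2, 2], [2, 3], [2, 3]]))  # approx. 5.8637033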

5 Conclusion In this paper, we have shown examples when a double vector of vertex degrees distinguishes graphs that the Randich index does not distinguish. Next, we proved the theorem that there is no example when the Randich index distinguishes two graphs for which the dual vector is the same. Thus, we have shown that the double vector of vertex degrees is an enhancement of the Randich index. The dual vector of vertex degrees allows us to distinguish a larger number of equivalence classes, so it is better suited for analyzing graph isomorphism. But its calculation requires sorting, so it requires large computational resources, which can cause problems for very large graphs. An interesting area of future work is the study of the probability of an invariant (the distribution of invariant values) depending on the distribution of graphs. For example, in some applied problems graphs often have similar properties. Therefore, some invariants can often give the same values, while others, on the contrary, distinguish graphs from the same subject area well.

References 1. Barabási, A.-L., Pósfai, M.: Network Science. Cambridge University Press, Cambridge, United Kingdom (2016) 2. Mumford, D.: Geometric Invariant Theory. Springer, Berlin, Germany (1965) 3. Dieudonne, J., Carrell, J.: Invariant Theory. Old and New. Academic Press, N.Y., US (1971) 4. Spencer, A.: Theory of invariants. In: Eringen, A. (ed.) Continuum Physics, vol. 1, pp. 239–353. Academic Press, N.Y. (1971) 5. Kraft, H.: Geometrische Methoden in der Invariantentheorie [Geometric Methods in Invariant Theory]. Vieweg+Teubner Verlag, Wiesbaden, Germany (1985). (In German.) 6. Melnikov, B., Terentyeva, Y.: Building communication networks: on the application of the Kruskal’s algorithm in the problems of large dimensions. IOP Conf. Ser. Mater. Sci. Eng. 1047(1), 012089 (2021) 7. Melnikov, B., Terentyeva, Y.: Greedy and branches-and-boundaries methods for the optimal choice of a subset of vertices in a large communication network. Cybern. Phys. 12(1), 51–59 (2023)

Analysis of Digital Analog Signal Filters

Denis Gruzdkov and Andrey Rachishkin(B)

Tver State Technical University, Tver, Russia
[email protected]

Abstract. The paper considers several common digital filters for an analog signal and analyzes their effectiveness in eliminating noise and inaccuracies arising from a poor-quality power supply or wire noise. The filters considered are the arithmetic mean, the median, the exponential running average and a simple Kalman filter. To study the operation of the filters, a Python program is created to simulate signals of different shapes (sine, meander, triangular, sawtooth and constant signals) and add pseudo-random noise to them. Experimental data and conclusions on the application of these filters are given. The analysis of various analog signal filters is an important area of research in electronics and signal processing. #COMESYSO1120. Keywords: Analog Signal · Noise Filtering · Software Filters · Python

1 Introduction Analog signal filters are an important component of electronic systems used for signal processing. Many devices that use analog sensors use microcontrollers that additionally filter the values from the analog sensors using software. This eliminates noise and inaccuracies that can occur due to poor power supply or wire noise. Noise filtering improves the accuracy and quality of sensor measurements. The signal that comes from a sensor is a one-dimensional, time-varying signal. Analog signal filters play a critical role in ensuring reliability and accuracy in systems that use analog sensors. By eliminating noise and interference processed by software and microcontrollers, they allow for a cleaner and more reliable signal [1]. The purpose of the work is comparative analysis of common digital filters, analog signal filters and obtaining experimental data. A Python program has been developed for comparative analysis of filtering quality and signal filter parameter calculations, with the help of which the behavior of various common digital filters on generated signals of different shapes with pseudo-random noise addition has been studied.

2 Noise Synthesis The program generates pseudo-random noise with two components: constant white Gaussian noise [2] and impulse noise. Gaussian white noise is generated with standard normal distribution: with mathematical expectation μ = 0 and standard deviation © The Author(s), under exclusive license to Springer Nature Switzerland AG 2024 R. Silhavy and P. Silhavy (Eds.): CoMeSySo 2023, LNNS 935, pp. 393–401, 2024. https://doi.org/10.1007/978-3-031-54820-8_32


σ = 1. Impulse noise is generated as random pulses occurring with probability p = 0.04, whose amplitude is a normally distributed random variable with mean μ = 5 and standard deviation σ = 1, taken with a random sign. The noise is a one-dimensional array of the same size as the signal array containing the sum of these two components. The standard deviation of the resulting noise is σn = 1.485. This value [3] of the standard deviation of the noise σn is calculated by the following formula:

$$\sigma = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \bar{x})^2}{N - 1}},$$

where $x_i$ is the i-th value from the sample, $\bar{x}$ is the arithmetic mean of the sample, and N is the number of sample elements. Visualization of all noise components is presented in Fig. 1.

Fig. 1. Noise visualization.
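A sketch of the noise synthesis described above might look as follows in Python/NumPy (the random seed and the use of NumPy are our assumptions; the paper's own program is not reproduced here):

import numpy as np

rng = np.random.default_rng(0)
n_samples = 400  # 4 s at 100 Hz, as in Sect. 3

# Constant white Gaussian component: mu = 0, sigma = 1.
gaussian = rng.normal(loc=0.0, scale=1.0, size=n_samples)

# Impulse component: pulses with probability p = 0.04, amplitude ~ N(5, 1), random sign.
mask = rng.random(n_samples) < 0.04
amplitude = rng.normal(loc=5.0, scale=1.0, size=n_samples)
sign = rng.choice([-1.0, 1.0], size=n_samples)
impulse = mask * amplitude * sign

noise = gaussian + impulse
print(noise.std(ddof=1))  # sample estimate; the paper reports sigma_n = 1.485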

3 Signal Generation The program generates signals of several different forms (sine, meander, triangular, sawtooth and constant signals) with an oscillation frequency of 0.5 Hz and an amplitude of 5 V (the signal is denoted in Volts), a duration of 4 s and a sampling frequency of


100 Hz. Based on these parameters, arrays are generated with signal values over time with a sample size of 400 values (duration 4 [sec] × sample rate 100 [Hz]). The following signal plots are shown: sinusoidal signal Fig. 2a, rectangular meander signal Fig. 2b and sawtooth signal Fig. 2c, as well as their combinations with noise.

Fig. 2. Examples of signals with noise.
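The signal set of Sect. 3 can be generated, for instance, as in the following sketch (the use of scipy.signal for the square and sawtooth shapes is our convenience choice):

import numpy as np
from scipy import signal

fs, duration, f0, amp = 100, 4.0, 0.5, 5.0    # sample rate [Hz], duration [s], frequency [Hz], amplitude [V]
t = np.arange(0, duration, 1.0 / fs)          # 400 samples

signals = {
    "sine": amp * np.sin(2 * np.pi * f0 * t),
    "meander": amp * signal.square(2 * np.pi * f0 * t),
    "triangular": amp * signal.sawtooth(2 * np.pi * f0 * t, width=0.5),
    "sawtooth": amp * signal.sawtooth(2 * np.pi * f0 * t),
    "constant": np.full_like(t, amp),
}
# The noise array from the previous sketch can then simply be added:
# noisy = {name: s + noise for name, s in signals.items()}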

4 Filters
4.1 Arithmetic Mean Filter
The principle of the filter is that a buffer (window) of several previous values is created. Each time the sensor is polled, the buffer is shifted: the first element is deleted and the new value is appended to the end. The filtered (normalized) value is then calculated as the arithmetic mean of the values in the buffer:

$$\bar{x} = \frac{1}{n}(x_1 + \dots + x_n).$$

The width of the averaging window n is a filter parameter on which the filter speed and filtering quality depend. The filter is simple to implement but has many issues. It requires a large number of floating-point calculations, the amount of which depends on


the width of the averaging window. This filter copes best with filtering of a steady-state static value. In this case, with the averaging window width equal to 10, the standard deviation from the signal falls to σ = 0.448, which in comparison with the initial noise level σn = 1.485 is one of the best results (69.8% noise suppression). However, at sharp signal changes, such as meander or sawtooth signals, the filter has a serious delay equal to the width of the averaging window. These can be clearly seen in the plot of signals (2) and (3) in Fig. 3. Thus, for a sine signal, the standard deviation σ = 0.763 (48.6% noise suppression), which means that even a smooth change in the signal greatly affects the performance of the filter. A large number of values lying far away from the signal at a spike-like change of the signal due to the inertia of the filter strongly distorts the standard deviation, because of which for meander and sawtooth signal they amounted to σ = 1.747 and σ = 1.433, respectively. Figure 3 shows examples of filter operation at averaging window width equal to 10. Also, this filter is strongly affected by the impulse component of the noise, since the value of impulses strongly biases the average value [4, 5].

Fig. 3. Arithmetic mean filter, the width of the averaging window is 10.
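A minimal streaming implementation of the arithmetic mean filter could look like this (the class name and default window width are ours):

from collections import deque


class MovingAverageFilter:
    """Arithmetic mean over a sliding window of the last `width` raw samples."""

    def __init__(self, width: int = 10):
        self.buffer = deque(maxlen=width)

    def update(self, raw_value: float) -> float:
        # The deque drops the oldest sample automatically once it is full.
        self.buffer.append(raw_value)
        return sum(self.buffer) / len(self.buffer)


flt = MovingAverageFilter(width=10)
print([round(flt.update(x), 3) for x in [5.0, 5.3, 4.6, 9.8, 5.1]])  # the 9.8 impulse is smeared out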

4.2 Median Filter The principle of the median filter is to store the buffer of previous values, but the statistical median is taken instead of the arithmetic mean from the buffer. This filter, as a rule, is used in combination with other filters and perfectly copes for initial canceling of the


impulse component of noise. Due to the principles of its operation, if only the impulse component is left in the signal, the filter completely removes such noise, which is clearly shown in Fig. 4a, as well as on the examples of sine and sawtooth signals [4, 5].

Fig. 4. Median filter of the 3rd order.

In the special case of a median of order 3, this filter is computed very quickly, using only comparison operations; higher-order medians can already load the computing device noticeably. In combination with other filters this is usually sufficient. The filter can help a lot in dealing with impulse noise, but it adds a small signal lag of the order of the filter window width. In the experimental results we can see that the median filter qualitatively removes the impulse component and also averages the value slightly: for the static signal, σ = 0.655, i.e., a 55.9% noise reduction. Because of the delay introduced by the filter, its application may be questionable in cases where there is no impulse component [6].
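A third-order median filter over a stream of samples can be sketched as follows, using only comparison operations as noted above (naming is ours):

class Median3Filter:
    """Median of the last three raw samples; the first calls fall back to the raw value."""

    def __init__(self):
        self.prev2 = None
        self.prev1 = None

    def update(self, raw_value: float) -> float:
        if self.prev1 is None or self.prev2 is None:
            self.prev2, self.prev1 = self.prev1, raw_value
            return raw_value
        a, b, c = self.prev2, self.prev1, raw_value
        self.prev2, self.prev1 = self.prev1, raw_value
        # Median of three values via pairwise comparisons only.
        return max(min(a, b), min(max(a, b), c))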


4.3 Exponential Running Average
The exponential running average is a versatile and quite effective filter, which filters well and at the same time is quickly calculated. It is the simplest lag filter with an adjustable coefficient: each new value shifts the current filtered value by a fraction of the difference. The new normalized (filtered) value $V_f$ is calculated using the following formula:

$$V_f = V_p + (V_n - V_p) \cdot k,$$

where $V_n$ is the new raw value received from the sensor, $V_p$ is the previous filtered value, and k is the lag factor set in the interval [0; 1]. The larger the lag factor, the faster the filter responds to changes, but the worse the filtering of the signal. Figure 5 shows an example of the filter operation [6].

Fig. 5. Exponential running average filter, k = 0.3.
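A corresponding sketch of the exponential running average, assuming the update rule reconstructed above:

class ExponentialAverageFilter:
    """V_f = V_p + k * (V_n - V_p); k in (0, 1] trades responsiveness against smoothing."""

    def __init__(self, k: float = 0.3):
        self.k = k
        self.value = None

    def update(self, raw_value: float) -> float:
        if self.value is None:
            self.value = raw_value          # initialize on the first sample
        else:
            self.value += self.k * (raw_value - self.value)
        return self.value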

4.4 Simple Kalman Filter
The Kalman filter is a recursive filtering algorithm used to estimate the state of a dynamic system based on incomplete and noisy measurements. A simplified implementation of the Kalman filter was used in this analysis [7], in which the parameters r and q are set in advance: r is the approximate noise amplitude and q is the covariance value of the process (Fig. 6).


Fig. 6. Kalman filter, q = 0.5 and r = 1.5

The basic principle of the Kalman filter consists of two steps: prediction and correction. In the prediction step, the total measurement error $E_O$ is calculated as the square root of the sum of the squares of the accumulated error $E_A$ and the covariance value q:

$$E_O = \sqrt{E_A^2 + q^2}.$$

The Kalman coefficient H is calculated from $E_O$ and r as follows:

$$H = \frac{E_O^2}{E_O^2 + r^2}.$$

Then, in the correction step, the Kalman filter uses the coefficient H to adjust the predicted value according to the current measurement, and $E_A$ is updated by the following formula to take into account the accuracy of the current measurement and the filter's prediction:

$$E_A = \sqrt{(1 - H) \cdot E_O^2}.$$

The filter takes the input signal and performs prediction and correction for each value, applying the Kalman formulas and updating the error and the previous state value at each step [8].
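The simplified filter can be transcribed into Python roughly as follows; the initial value of the accumulated error and the way the coefficient H is applied to the state are our assumptions based on the description above:

import math


class SimpleKalmanFilter:
    """Simplified Kalman filter with fixed r (noise amplitude) and q (process covariance)."""

    def __init__(self, q: float = 0.5, r: float = 1.5):
        self.q = q
        self.r = r
        self.accumulated_error = 1.0   # E_A, initial guess (assumption)
        self.state = 0.0

    def update(self, measurement: float) -> float:
        # Prediction: total error E_O from the accumulated error and q.
        total_error = math.sqrt(self.accumulated_error ** 2 + self.q ** 2)
        # Kalman coefficient H.
        h = total_error ** 2 / (total_error ** 2 + self.r ** 2)
        # Correction: move the state toward the measurement, then update E_A.
        self.state += h * (measurement - self.state)
        self.accumulated_error = math.sqrt((1.0 - h) * total_error ** 2)
        return self.state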

5 Comparative Analysis of Filters
According to the results of the study, a comparative table was made to evaluate the quality of signal filtering (Table 1). The standard deviation was used as the evaluation parameter; its value was calculated for the filtered signal relative to the ideal signal. Table 1 shows the standard deviation of the signal after filtering and, in parentheses, the relative change in standard deviation as a percentage of the noise standard deviation σn = 1.485. From this evaluation system, we can judge the quality of filtering and the presence of filter response delay.

Table 1. Comparative table of signal standard deviation values after filtering.

Filter                | Constant signal | Sine           | Meander        | Sawtooth       | Triangular
F1 (WAW = 5)          | 0.639 (-57.0%)  | 0.728 (-51.0%) | 1.451 (-2.3%)  | 1.249 (-15.9%) | 0.716 (-51.8%)
F1 (WAW = 10)         | 0.448 (-69.8%)  | 0.763 (-48.6%) | 1.747 (+17.6%) | 1.433 (-3.5%)  | 0.721 (-51.4%)
F1 (WAW = 15)         | 0.371 (-75.0%)  | 0.954 (-35.8%) | 2.032 (+36.8%) | 1.648 (+11.0%) | 0.875 (-41.1%)
Median                | 0.655 (-55.9%)  | 0.714 (-51.9%) | 1.407 (-5.3%)  | 1.198 (-19.3%) | 0.703 (-52.7%)
M + F1 (WAW = 5)      | 0.459 (-69.1%)  | 0.763 (-48.6%) | 1.830 (+23.2%) | 1.529 (+3.0%)  | 0.715 (-51.9%)
M + F1 (WAW = 10)     | 0.346 (-76.7%)  | 0.939 (-36.8%) | 2.101 (+41.5%) | 1.726 (+16.2%) | 0.853 (-42.6%)
M + F1 (WAW = 15)     | 0.279 (-81.2%)  | 1.163 (-21.7%) | 2.371 (+59.7%) | 1.916 (+29.0%) | 1.039 (-30.0%)
F2 (k = 0.2)          | 0.481 (-67.6%)  | 0.647 (-56.4%) | 1.247 (-16.0%) | 1.043 (-29.8%) | 0.625 (-57.9%)
F2 (k = 0.3)          | 0.603 (-59.4%)  | 0.658 (-55.7%) | 1.030 (-30.6%) | 0.934 (-37.1%) | 0.654 (-56.0%)
F2 (k = 0.4)          | 0.720 (-51.5%)  | 0.740 (-50.2%) | 0.952 (-35.9%) | 0.925 (-37.7%) | 0.741 (-50.1%)
F2 (k = 0.5)          | 0.836 (-43.7%)  | 0.844 (-43.2%) | 0.954 (-35.8%) | 0.967 (-34.9%) | 0.845 (-43.1%)
M + F2 (k = 0.2)      | 0.349 (-76.5%)  | 0.781 (-47.4%) | 1.733 (+16.7%) | 1.415 (-4.7%)  | 0.714 (-51.9%)
M + F2 (k = 0.3)      | 0.414 (-72.1%)  | 0.672 (-54.7%) | 1.556 (+4.8%)  | 1.300 (-12.5%) | 0.632 (-57.4%)
M + F2 (k = 0.4)      | 0.464 (-68.8%)  | 0.642 (-56.8%) | 1.470 (-1.0%)  | 1.246 (-16.1%) | 0.613 (-58.7%)
M + F2 (k = 0.5)      | 0.505 (-66.0%)  | 0.639 (-57.0%) | 1.425 (-4.0%)  | 1.219 (-17.9%) | 0.617 (-58.5%)
F3 (q = 1, r = 1.5)   | 0.810 (-45.4%)  | 0.821 (-44.7%) | 0.884 (-40.4%) | 0.953 (-35.8%) | 0.821 (-44.7%)
F3 (q = 0.5, r = 1.5) | 0.577 (-61.2%)  | 0.645 (-56.5%) | 0.918 (-38.2%) | 0.94 (-36.7%)  | 0.638 (-57%)
F3 (q = 0.2, r = 1.5) | 0.37 (-75.1%)   | 0.827 (-44.3%) | 1.227 (-17.3%) | 1.151 (-22.5%) | 0.752 (-49.4%)
F3 (q = 1, r = 1)     | 0.972 (-34.5%)  | 0.975 (-34.3%) | 0.994 (-33%)   | 1.05 (-29.3%)  | 0.975 (-34.3%)
F3 (q = 0.5, r = 1)   | 0.706 (-52.5%)  | 0.73 (-50.9%)  | 0.885 (-40.4%) | 0.921 (-38%)   | 0.729 (-50.9%)
F3 (q = 0.2, r = 1)   | 0.449 (-69.8%)  | 0.667 (-55%)   | 1.039 (-30%)   | 1.083 (-27.1%) | 0.635 (-57.2%)

Note: For comparison, the standard deviation of the noise is σn = 1.485 and the standard deviation of the constant noise component alone is σ = 0.977. F1 - arithmetic mean filter; WAW - width of the averaging window; M - 3rd-order median filter; F2 - exponential running average; k - adaptive coefficient of the exponential filter; F3 - Kalman filter; r - approximate amplitude of the noise; q - covariance value of the process.

6 Conclusion The choice of the method of filtering an analog signal depends on its type. According to the data from the table, we can see that the meander turns out to be the most difficult signal to filter because of the discontinuous change of the signal. The meander is difficult


for most filters, and if they are not properly tuned, the result can be worse than before filtering because of the strong signal lag. The simulations have shown the Kalman filter to be quite effective and to introduce no signal lag. The median filter is effective at suppressing the impulse component of the noise and works well as a pre-filter in a combination of filters. The exponential running average with an adjustable coefficient is a versatile and simple filter for most situations. The arithmetic mean is an efficient algorithm for a static signal, but it does not always provide the fastest execution speed. The Kalman filter is versatile but computationally intensive and suitable for filtering any signal. The data obtained can be used to compare the filtering efficiency of new algorithms, one of which could be a filter based on training a neural network.

References 1. Rech, C.: Digital filters. In: García, J. (ed.) Encyclopedia of Electrical and Electronic Power Engineering, pp. 668–681. Elsevier (2023). ISBN 9780128232118 2. Statistica, B.V.: The art of data analysis on a computer: For professionals/V. Borovikov, 688 p. Peter, St. Petersburg (2003) 3. Patrignani, C., et al.: (Particle Data Group). 39. Statistics. B: Review of Particle Physics. Chin. Phys. C. 40, 100001 (2016) 4. Barbu, T.: Variational image denoising approach with diffusion porous media flow. Abstr. Appl. Anal. 2013, 8 (2013) 5. Sergienko, A.B.: Digital Signal Processing. 3rd edn. BHV-Peterburg, Saint Peterburg, Russia (2011) 6. Kudryakov, S.A., Sobolev, E.V., Rubtsov, E.A.: Theoretical Bases of Signal Filtering. BHVPeterburg, Saint Peterburg, Russia (2018) 7. Bukhtiyarov, M.S.: Signal noise filtering. https://habr.com/ru/articles/588270/. Accessed 9 Sept 2023 8. Sadli, R.: Object Tracking: Simple Implementation of Kalman Filter in Python. Machine Learning Space. https://machinelearningspace.com/object-tracking-python/. Accessed 9 Sept 2023

Readiness for Smart City in Municipalities in Mbabane, Eswatini

Mkhonto Mkhonto(B) and Tranos Zuva

Department of Information and Communication Technology, Vaal University of Technology, Gauteng, South Africa
[email protected], [email protected]

Abstract. Smart city readiness entails the adoption of technological innovation and information and communication technologies by the role players to disseminate information to the citizenry in order to improve performance, quality of services and welfare maximization. This is why it is important to determine people's readiness for technologies before they can accept or adopt them, and also to observe their intention to use them. The study investigated the readiness for smart city in municipalities in Mbabane, Eswatini. A quantifiable approach was taken in this research, the information was collected through online surveys, and finally a model for the research was suggested. One hundred and twenty-two surveys were collected from different respondents in different municipalities. It was found that the study was reliable and valid. Results show that municipalities in Mbabane, Eswatini are ready for smart city. They also indicate that ease-of-use (EOU) has a strong influence on perceived usefulness (PU) with β = 0.570 and sig < 0.001, perceived usefulness (PU) has an influence on intention of use (IOU) with β = 0.382 and sig < 0.001, while ease-of-use (EOU) does not have a significant influence on intention of use with β = 0.046 and sig = 0.475; innovation has a strong influence on intention of use with β = 0.308 and sig < 0.001, and optimism (OPT) has a strong influence on intention of use with β = 0.292 and sig = 0.002. Future researchers can extend the research to different municipalities, as this research was limited to municipalities in Mbabane, Eswatini only. Smart technology will help cities maintain expansion while also enhancing efficiency for the welfare of the citizenry. Keywords: Smart City · Technology Readiness · Technology Readiness Index · Municipalities

1 Introduction Smart city approaches are ways of applying digital and electronic technologies to communities and cities to transform life and working environment. The concept of smart cities originated in the early 1990s with cities starting to label themselves as “smart” upon introducing Information and Communication Technology (ICT) infrastructure, embracing e-governance and attempting to attract high-tech industries to encourage economic growth. The concept of a “Smart City” can be defined as a city that uses ICT as an enabler, to merge dimensions of smart utilities, smart mobility, smart economy, smart © The Author(s), under exclusive license to Springer Nature Switzerland AG 2024 R. Silhavy and P. Silhavy (Eds.): CoMeSySo 2023, LNNS 935, pp. 402–410, 2024. https://doi.org/10.1007/978-3-031-54820-8_33


environment, smart education, smart people, smart living, smart health, smart planning and smart governance” [1]. Therefore, it is important for municipalities to invest in technologies that would enable them to better service their residents. Smart city adoption assists cities and governments in meeting the challenges of urban and rural governance to become more competitive and address sustainability issues. In this regard, [2] affirms that the smart city approach is emerging as a way of solving entangled municipal problems. To improve municipal operations and services, the smart city concept integrates ICT with a large number of physical devices connected to the Internet of Things (IoT) network [3]. This allows the local officials to communicate directly with the people and monitor what is happening in the communities [4]. According to [5] “ICT can be used to increase the quality, performance, and interactivity of urban services while also lowering costs and conserving resources. Housing, transportation, sanitation, medical services, utilities, land use, production, and communication networks all benefit from a smart city adoption”. [6] defined smart city looking at two terms “smart” and “city”. They see ‘smart’ as human attribute that expresses a degree of response to impulses while ‘city’ is a geographical entity inhabited by humans. Hence, smart city is a city that has the ability to sense and respond accordingly to its challenges using intelligence embedded in the city’s information system [6]. In the literature, a smart city is defined as a city which is interconnected through technology [7, 8]. This study adopts a definition by [9], who defines a smart city as “a digital integration of information systems components to collect digital data and analyse it in real-time to monitor and manage the city infrastructure and to allocate resources effectively, thereby improving service delivery and the quality of life of the citizens”. The study is arranged as follows: Sect. 2 Related work, Sect. 3 Research model and Hypothesis, Sect. 4. Research Methodology, Sect. 5 Results, Sect. 6 Limitations and finally Sect. 7 Conclusion.

2 Related Work Technological readiness (TR) is defined as how far the readiness of local government to adopt smart city concept seen from the characteristics of the technology used [10]. In smart city context, smart technology is a technology that can capture information, current situation and activities done by citizens, then integrate the data and information to other systems or devices and analyze the data to support government and citizen needs [11]. [11] added that there are several components that act as technology enablers in smart city adoption. Those technology enablers represent the functions of smart technology should have in order to capture, integrate, and analyze the data and information. According to [12], TR is defined as “people’s propensity to embrace and use new technologies for accomplishing goals in home life and at work.” Extant literature in the technology readiness area suggests that users’ personalities should be considered to predict their perceptions and behaviors [12]. [13] proposed the technology readiness model for online purchase intentions with the following variables: Innovation, optimism, discomfort, and insecurity. According to him, previous literature had acknowledged the effects of technology readiness and its dimensions on technology usage intention [14]. As a result, we can draw the conclusion that the TR principles put forth by [15] could be a useful indicator of online purchase intent. Numerous more research that examined the effect that


TR has on user intention to utilize technology and discovered meaningful correlations between them provide weight to the study. The readiness for technology adoption identifies the degree to which organizations and individuals involved in the adoption are able to execute the specific adoption of a new technology or system [16]. This assessment of readiness for technology adoption is described as an essential prerequisite for organizations in order to keep up with volatile market demands and manage resources efficiently [17]. Organizations who assess their readiness prior to the adoption of new technologies are in a better position to experience a successful technology adoption. Therefore prior to any system implementation it is important to evaluate the state of readiness. [15] developed a technology readiness index (TRI) in order to measure the level of individual readiness with four dimensions namely optimism and innovativeness, described as drivers of technology readiness; and insecurity and discomfort, described as dimensions which inhibit the level of technology readiness. Municipalities therefore needs to be able to assess whether not only the organization as a whole, but also the individuals within the business, is ready to adopt the new technology, in order to ease the process and complete the implementation as effectively and efficiently as possible. TRI has been used in many studies as an explanatory variable or as a moderator of a behavior, intention, or attitude. [18] used TRI factors as differentiating elements between users and non-users of internet banking and found out that technology factors of optimism, security, and discomfort presented significant differences between users and non-users of internet banking. [19] applied TRI to investigate technology acceptance in e-HRIM and found out that optimism and innovativeness positively influenced perceived usefulness and perceived ease of use but discomfort and insecurity did not have a positive effect on adoption of the system. TRI provides alternative perspectives and views on the adoption of and satisfaction with the technologies by identifying: the techno-ready users who champion and can influence adoption; the users who are thrilled about adoption but must be reassured of the benefits of adoption; and users who require strong conviction and proof of concept before they adopt. The challenge of TRI is that it focuses mainly on experiences and demographics and presupposes that for widespread adoption of technology users must be well equipped with the required infrastructure, skills, beliefs and attitude. However, TR applications in information system studies received little attention from scholars [20]. As an illustration, it was stated that there is still a lack of clarity regarding how individual ideas about technology such as TR influence their behavioral intention [21]. This leads to the conclusion that there is still a gap in the research on the correlations between TR and intentions to use technology.

3 Research Model and Hypothesis
The following hypotheses in Table 1 were proposed and are translated into the conceptual model in Fig. 1.


Table 1. Proposed Research Hypotheses

Hypothesis | Proposed Hypothesis
H1 | Ease-of-use has a positive influence on perceived usefulness
H2 | Perceived usefulness has a positive influence on the intent to keep using the smart city system
H3 | Ease-of-use has a positive influence on the intent to use smart city
H4 | Innovation has a positive influence on the intent to use smart city
H5 | Optimism has a positive influence on the intent to use smart city
H6 | Insecurity has a positive influence on the intent to use smart city
H7 | Convenience has a positive influence on perceived usefulness on the intent to use smart city

Fig. 1. Conceptual model

4 Research Methodology
A deductive approach using the quantitative research strategy was used for this study. This method was chosen because it is thought to be capable of producing reliable, valid, and generalizable results. Inferential statistics were used for data analysis.
4.1 Data Collection
The survey was administered through a questionnaire targeted at citizens in the three municipalities. Questionnaires were distributed to 122 respondents, filled in, and returned. The questionnaire allowed all the respondents to respond to it in their own free time, which may be the reason why all 122 respondents completed the forms, thus making the sample more representative.


4.2 Questionnaire Design Each item in the model had a corresponding set of questions. The questionnaire was composed of twenty-one unambiguous questions that were easy for respondents to complete. Each item on the questionnaire was measured on a seven-point Likert scale whose end points were ‘strongly agree’ (7) and ‘strongly disagree’ (1).

5 Results
The results obtained using SPSS are discussed below.
5.1 Reliability Test
A reliability test was done to confirm whether each factor of the model is reliable and valid. Cronbach's alpha (α) was used to assess the reliability of the scales for each of the constructs in this study. All the constructs had a Cronbach value above 0.842, as shown in Table 2.
5.2 Factor Analysis
Factor analysis was performed; the result showed a KMO and Bartlett's test of sphericity value of 0.908, with a significant ρ-value (ρ < 0.000) (Table 3 refers).

Table 2. Reliability results.

Variables            | Cronbach's Alpha | No of items
Convenience          | .910             | 4
Ease-of-use          | .934             | 3
Innovation           | .842             | 3
Intention of use     | .915             | 2
Optimism             | .856             | 4
Perceived Usefulness | .938             | 2
Insecurity           | .862             | 3
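For reference, Cronbach's alpha for one construct can be computed directly from the item responses, as in the sketch below (the item matrix here is randomly generated for illustration only; the study itself used SPSS):

import numpy as np


def cronbach_alpha(items: np.ndarray) -> float:
    """items: respondents x items matrix of Likert scores for one construct."""
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1)
    total_variance = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_variances.sum() / total_variance)


# e.g. 122 respondents answering the 4 convenience items on a 7-point scale
rng = np.random.default_rng(1)
convenience_items = rng.integers(1, 8, size=(122, 4)).astype(float)
print(cronbach_alpha(convenience_items))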

Table 3. KMO and Bartlett's Test

Kaiser-Meyer-Olkin Measure of Sampling Adequacy: 0.908
Bartlett's Test of Sphericity: Approx. Chi-Square 5661.260; df 526; Sig. 0.000


Table 4. Regression analysis and test results

Hypothesis | Description | Remarks
H1 | Ease-of-use (β = 0.570, P < 0.001) has a positive influence on perceived usefulness | Accepted
H2 | Perceived usefulness (β = 0.382, P < 0.001) positively influences the intention of use | Accepted
H3 | Ease-of-use (β = 0.046, P = 0.475) is not statistically significant for the intention to use smart city | Rejected
H4 | Innovation (β = 0.302, P < 0.001) positively influences the intention of use | Accepted
H5 | Optimism (β = 0.292, P = 0.002) positively influences the intention of use | Accepted
H6 | Insecurity (β = 0.047, P = 0.459) is not statistically significant for the intention to use smart city | Rejected
H7 | Convenience (β = 0.107, P = 0.408) is not statistically significant for perceived usefulness on the intention to use smart city | Rejected

After the factor analysis, it was confirmed that all the remaining items loaded within their respective constructs. The following section used the final factor structures to carry out the regression analysis. Table 3 presents the factor analysis results based on the KMO and Bartlett test, which is 0.908 with a P-value of 0.000, indicating that factor analysis is appropriate [22]. [23] shows that factor loadings of 0.55 and higher are significant, and the factor loadings found in this research were between 0.6 and 0.8. The empirical testing of the proposed conceptual model has resulted in the final model presented in Fig. 2. The inferential test indicated which hypotheses were rejected and which were accepted; this is presented in Table 4. Among all variables, only four independent variables proved to be significant in the municipalities' adoption of smart city systems, and three variables did not support the hypothesis.


Fig. 2. Final adoption model

6 Limitations The limitation of this study has to do with the size of the sample chosen. The sample did not include all the regions in Eswatini, as the study collected data from three regions. However, this limitation does not have much effect because the data was collected from the biggest municipalities in Eswatini.

7 Conclusion
The current study explored various backgrounds of perception and attitudes toward the municipalities' intention to use smart city systems to improve municipal operations and services. The model was formulated using a combination of TAM and TRI. Out of seven hypotheses, three were not accepted. Data was collected from the municipalities in Mbabane, Eswatini. Regression analysis was used to test the hypotheses, and the results indicated that innovativeness and optimism have a positive effect, while insecurity and convenience have a negative effect on the municipality's attitude. Besides, perceived usefulness, innovation and optimism strongly influence the intention of use; on the other hand, perceived ease-of-use does not significantly impact the municipality's intention toward smart city systems adoption in the municipalities. This model can be used by future researchers to help them select the factors for their studies that are related to the adoption of new technologies in organs of government. Moreover, the higher the level of organizational readiness, the higher the intention to adopt the smart city concept. Acknowledgment. I wish to express my sincere gratitude and appreciation to my co-author, Prof Tranos Zuva, whose input helped us to produce a paper of good standard.

References 1. City of Johannesburg. Preparing for a smart Joburg (2011). http://www.joburg.org.za/index. php?option=com_conxtent&id=7245&Itemid=266. Accessed 22 March 2022


2. Monzon, A.: Smart cities concept and challenges: Bases for the assessment of smart city projects. In: 2015 International Conference on Smart Cities and Green ICT Systems (SMARTGREENS), pp. 1–11. IEEE (2015) 3. Das, A., Sharma, S.C.M., Ratha, B.K.: The new era of smart cities, from the perspective of the internet of things. In: Smart Cities Cybersecurity and Privacy, pp. 1–9. Elsevier (2019) 4. Caragliu, A., Del Bo, C.F.: Smart innovative cities: The impact of Smart City policies on urban innovation. Technol. Forecast. Soc. Chang. 142, 373–383 (2019) 5. Randhawa, A., Kumar, A.: Exploring sustainability of smart development initiatives in India. Int. J. Sustain. Built Environ. 6(2), 701–710 (2017) 6. Ramaprasad, A., Sanchez-Ortiz, A., Syn. T.: A unified definition of a smart city. In: International Conference on Electronic Government, pp. 13–24. Krems, Austria (2017) 7. Mahesa, R., Yudoko, G., Anggoro, Y.: Dataset on the sustainable smart city development in Indonesia. Data Brief 25, 104098 (2019) 8. Przeybilovicz, E., Cunha, M.A., Macaya, J.F.M., de Albuquerque, J.P.: A Tale of Two ‘Smart Cities’: Investigating the Echoes of New Public Management and Governance Discourses in Smart City Projects in Brazil. In: Hawaii International Conference on System Sciences 2018 (HICSS-51) (2018). https://aisel.aisnet.org/hicss-51/eg/smart_cities_smart_government/2 9. Mashau, N.L., Kroeze, J.H., Howard, G.R.: An integrated conceptual framework to assess small and rural municipalities’ readiness for smart city implementation: A systematic literature review. In: Lecture Notes in Computer Science, p. 13117. Springer, Cham, Switzerland (2021) 10. Yang, Z., Sun, J., Zhang, Y., Wang, Y.: Understanding SaaS adoption from the perspective of organizational users: A tripod readiness model. Comput. Hum. Behav. 45, 254–264 (2015) 11. Berst, J.: The planning manual for building tomorrow’s cities today. Smart City Council, Seattle (2013) 12. Parasuraman, A., Colby, C.L.: An updated and streamlined technology readiness index: TRI 2.0. J. Serv. Res. 18(1), 59–74 (2015) 13. Ismail, K.A., Wahid, N.A.: A review on technology readiness concept to explain consumer’s online purchase intention. Int. J. Indust. Manage. 6, 49–57 (2020). https://doi.org/10.15282/ ijim.6.0.2020.5629 14. Blut, M., Wang, C.: Technology readiness: a meta-analysis of conceptualizations of the construct and its impact on technology usage. J. Acad. Market. Sci. 48(4), 649–669 (2019). https://doi.org/10.1007/s11747-019-00680-8 15. Parasuraman, A.: Technology Readiness Index (TRI) a multiple-item scale to measure readiness to embrace new technologies. J. Serv. Res. 2(4), 307–320 (2000) 16. Holt, D.T., Vardaman, J.M.: Toward a comprehensive understanding of readiness for change: the case for an expanded conceptualization. J. Chang. Manag. 13(1), 9–18 (2013) 17. Aboelmaged, M.G.: Predicting e-readiness at firm level: An analysis of technological, organizational and environmental (TOE) effects on e-maintenance readiness in manufacturing firms. Int. J. Inf. Manage. 34, 639–651 (2014) 18. Pires, P.J., da Costa Filho, B.A., da Cunha, J.C.: Technology Readiness Index (TRI) factors as differentiating elements between users and non users of internet banking, and as antecedents of the Technology Acceptance Model (TAM). In: Cruz-Cunha, M.M., Varajão, J., Powell, P., Martinho, R. (eds.) CENTERIS 2011. CCIS, vol. 220, pp. 215–229. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-24355-4_23 19. 
Nihat, E., Murat, E.: An investigation of the effects of technology readiness on technology acceptance in e-HRM. Proc. Soc. Behav. Sci. 24, 487–495 (2011) 20. Makkonen, M., Frank, L., Koivisto, K.: ISBN 978-961-286-043-1, Age differences in technology readiness and its effects on information system acceptance and use: the case of online electricity services in Finland (2017) 21. Chiu, W., Cho, H.: Asia Pacific Journal of Marketing and Logistics 33(3), 807–825 (2020). https://doi.org/10.1108/APJML-09-2019-0534


22. Brosius, F.: SPSS 12 (1. Aufl). Bonn: Mitp, Verl. Gruppe (2004) 23. Anderson, R.E., Babin, B.J., Black, W.C., Hair, J.F.: Multivar-iate Data Analysis, 7th edn. Pearson, United States (2010) 24. Rahmat, T.E., et al.: Nexus between integrating technology readiness 2.0 index and students’elibrary services adoption amid the COVID-19 challenges: implications based on the theory of planned behavior (2022)

Assessment of Investment Attractiveness of Small Enterprises in Agriculture Based on Fuzzy Logic

Ulzhan Makhazhanova(B), Aigerim Omurtayeva, Seyit Kerimkhulle, Akylbek Tokhmetov, Alibek Adalbek, and Roman Taberkhan

L.N. Gumilyov Eurasian National University, Astana, Kazakhstan
[email protected], [email protected]

Abstract. This article examines the important task of determining the investment attractiveness of small enterprises in agriculture. In the process of assessing investment attractiveness, it is necessary to take into account not only the level of development and specifics of the enterprise, but also various uncertainty factors that may affect the financial result. To solve this problem, a method based on the use of fuzzy set theory is proposed. Within the framework of this method, indicators specific to the industry and region are analyzed, as well as financial and economic indicators characteristic of agriculture. The rules based on which decisions are made, are formulated in the form of logical formulas that include various parameters such as the volume of investment, the investment period and the level of risk. In general, an index of investment attractiveness is predicted, which can take values from 0 to 1 and has a clear interpretation. The proposed scientific approach can be used as a basis for developing support systems for expert decision-making when assessing the investment attractiveness of small enterprises in agriculture. Keywords: Investment attractiveness index · unification of indicators · fuzzy logic · linguistic variable · logical rules · decision making

1 Introduction In market conditions, at the microeconomic level, investment assessment of business subjects, based on financial stability and efficiency of reproduction processes, becomes the most important task. The modern concept of investment attractiveness of small enterprises in agriculture implies the maximum use of resource potential with taking into account financial flows, accounts receivable and property. To achieve this goal, new analysis methods are used that contribute to the development of small businesses in agriculture. Currently, indicators have been developed that determine the efficiency of production activities of small enterprises in agriculture, calculated based on financial reporting and used in the process of investment analysis and forecasting. The development of a methodology for determining future cash flows and the use of accounting engineering © The Author(s), under exclusive license to Springer Nature Switzerland AG 2024 R. Silhavy and P. Silhavy (Eds.): CoMeSySo 2023, LNNS 935, pp. 411–419, 2024. https://doi.org/10.1007/978-3-031-54820-8_34

412

U. Makhazhanova et al.

tools in order to provide investors with objective and accessible information about the investment attractiveness of an economic entity will make it possible to determine the value, which can be a guideline for decisions on investing funds in the property complex of the enterprise. The presence of unresolved issues in the field of studying the investment attractiveness of small enterprises has necessitated a systematic generalization and development of theoretical and methodological provisions for analytical support for assessing the investment attractiveness of small agricultural enterprises within the framework of conceptual accounting approaches to assessing the value of an enterprise, which determines the relevance of the research topic. This article discusses the features of assessing the investment attractiveness of a small agricultural enterprise. The methodology for assessing investment attractiveness should be expanded and supplemented. The simplest financial analysis no longer meets the requirements of investors making decisions. In accordance with this, new methods and approaches are being developed to determine the investment attractiveness of an enterprise and formulate an investment decision. In particular, it is planned to develop accounting and analysis methods, which, in addition to financial analysis, will include a qualitative and quantitative assessment of factors of investment attractiveness and use several approaches to business valuation in order to determine cash flows in the future with the preparation of a derivative balance sheet. One of the main tasks in determining the investment attractiveness of agriculture is to assess the potential for profitability and sustainable development in this industry. The literature discusses various approaches and methods for conducting such assessments. The works of academic economists are devoted to the study of theoretical and practical aspects of determining the investment attractiveness of business subjects [1– 8]. Some studies suggest using financial and economic indicators such as profit margin, profitability and asset turnover to assess investment attractiveness. Other studies suggest taking into account factors such as risks, political stability, availability of markets and resources, and the competitiveness of national and international markets. Some studies suggest using expert methods and specialist opinions to determine investment attractiveness in agriculture. For example, the analytical network method allows one to take into account preferences and the interaction of various factors when making investment decisions. There are also methods based on the theory of fuzzy sets and the use of logical formulas for making expert decisions. While appreciating the contribution of foreign scientists to the solution of the issues under study and without denying the legitimacy of the results obtained, it should be noted that certain aspects of accounting and analysis of the investment attractiveness of an enterprise require improvement, taking into account regional and industry specifics, the influence of internal and external factors of the economy of business entities. An approach is proposed for assessing investment attractiveness using the mathematical apparatus of fuzzy logic, which allows one to take into account approximately qualitative information about the characteristics of a small agricultural enterprise. 
The theory of fuzzy sets and fuzzy logic are effective tools for formalizing qualitative and approximate concepts based on linguistic models and representing knowledge in the


form of production rules “If…, then…”. Knowledge inference in this case is carried out based on fuzzy logical inference [9, 10].

2 Methods, Models, Data

Currently, the most justified approach, in our opinion, is an integrated assessment of the investment attractiveness of a small enterprise. As noted earlier, most methods for assessing investment attractiveness are based on an analysis of financial condition, so it is necessary to consider the main groups of indicators characterizing agricultural production. Analysis of financial condition is one of the effective ways to assess the current situation: it reflects the instantaneous state of the economic situation, allows the most complex problems of managing the available resources to be identified, and thus minimizes the effort needed to align the goals and resources of the organization with the needs and capabilities of the current market. The total number of financial indicators used to analyze the activities of an enterprise is very large; if one set out to list every financial indicator ever used, more than a hundred could be counted. This study uses only the main coefficients and indicators that most fully reflect the production and financial specifics of agricultural enterprises [8, 11].

In assessing the investment attractiveness of enterprises, special attention must be paid to qualitative parameters. An important aspect when determining the investment attractiveness of agriculture is the analysis of the specifics of this industry: agriculture may be associated with seasonal factors, volatility in agricultural prices, and various regional characteristics. Analysis of the direction of industry development allows us to predict the risks of changes in external factors that can significantly affect the profitability of the enterprise, the quality of its assets and, importantly, its ability to fulfill debt obligations. These risks exist in all sectors of the economy, and no company can avoid them [12, 14]. Along with the generally accepted indicators characterizing the capital structure and property position of a small enterprise, important indicators include those characterizing the organizational and management base, the accounting and internal control system and its effectiveness, and the reputation of the organization and its managers; these provide a clear understanding of the policies pursued by management, of its competence and integrity, and of the compliance of the organizational structure with the production, financial and social characteristics of a small enterprise [15, 16].

2.1 Unification of Indicators for Assessing Investment Attractiveness

To assess the investment attractiveness of agriculture, the following financial ratios can be used to determine the effectiveness and feasibility of financial and economic activities.

Return on assets (R1) shows how much profit the assets of an agricultural enterprise generate. The profitability ratio depends directly on the organization's area of activity: in heavy industry the indicator will be lower than in the service sector, since service enterprises need less investment in working capital. In general, return on assets reflects the effectiveness and profitability of asset management, and therefore the higher it is, the better.


Here PBT is profit before tax and B is total assets (balance sheet total). Recommended value: R1 ≥ α.

R1 = (PBT / B) · 100 (1)

Membership function:

μ1(x) = { R1/α, if R1 < α; 1, if R1 ≥ α }

Return on equity (R2) allows one to evaluate the efficiency of using equity capital. NP is net profit, E is equity. Recommended value ≥ 50%.

R2 = (NP / E) · 100 (2)

Membership function:

μ2(x) = { R2/50%, if R2 < 50%; 1, if R2 ≥ 50% }

Asset turnover (R3) shows how quickly assets turn into revenue. R is revenue, B is total assets (balance sheet total). Recommended value ≥ 12.

R3 = R / B (3)

Membership function:

μ3(x) = { R3/12, if R3 < 12; 1, if R3 ≥ 12 }

Current liquidity ratio (R4) reflects the company's ability to pay its current obligations. CA is current assets, CL is current liabilities. Recommended value ≥ 2.

R4 = CA / CL (4)

Membership function:

μ4(x) = { R4/2, if R4 < 2; 1, if R4 ≥ 2 }

Share of revenue from sales of main agricultural products (R5) allows one to estimate the share of revenue received from core activities. RmP is revenue from main products, TR is total revenue. Recommended value ≥ δ.

R5 = (RmP / TR) · 100 (5)

Membership function:

μ5(x) = { R5/δ, if R5 < δ; 1, if R5 ≥ δ }
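The unification in Eqs. (1)-(5) is simple enough to script directly. The sketch below is a minimal Python illustration of these ratio and membership-function definitions; the default thresholds for α and δ and the sample balance-sheet figures are hypothetical placeholders rather than values taken from the paper.

```python
# Minimal sketch of the ratio definitions (1)-(5) and their membership functions.
# The thresholds alpha and delta and the example figures are illustrative assumptions.

def clipped_ratio(value: float, threshold: float) -> float:
    """Membership of the form mu(x) = value/threshold if value < threshold, else 1."""
    return min(value / threshold, 1.0) if threshold > 0 else 0.0

def unify_indicators(pbt, net_profit, revenue, revenue_main, current_assets,
                     current_liabilities, equity, total_assets,
                     alpha=10.0, delta=70.0):
    r1 = pbt / total_assets * 100                # return on assets, Eq. (1)
    r2 = net_profit / equity * 100               # return on equity, Eq. (2)
    r3 = revenue / total_assets                  # asset turnover, Eq. (3)
    r4 = current_assets / current_liabilities    # current liquidity, Eq. (4)
    r5 = revenue_main / revenue * 100            # share of core revenue, Eq. (5)
    return {
        "mu1": clipped_ratio(r1, alpha),
        "mu2": clipped_ratio(r2, 50.0),
        "mu3": clipped_ratio(r3, 12.0),
        "mu4": clipped_ratio(r4, 2.0),
        "mu5": clipped_ratio(r5, delta),
    }

if __name__ == "__main__":
    print(unify_indicators(pbt=12.0, net_profit=9.0, revenue=150.0, revenue_main=120.0,
                           current_assets=40.0, current_liabilities=25.0,
                           equity=60.0, total_assets=110.0))
```

Each returned degree lies in [0,1], so the five financial ratios become directly comparable, which is exactly the purpose of the unification step.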

These ratios help to assess the financial stability, the efficiency of asset use and the profitability of an agricultural enterprise. The qualitative indicators characterizing the industry are the following:

Industry development dynamics (R6) evaluates changes and trends associated with the development of the agricultural sector, including factors such as increasing demand for agricultural products, new technologies, changing consumer preferences and legislation.

Industry development prospects (R7) assesses projected growth and the potential for industry development, including factors such as resource availability, level of competition, innovative capabilities and government support.

Market demand for the products (work, services) of the industry (R8) assesses the demand and market need for the products or services offered by an agricultural enterprise, including analysis of market trends, competition, market saturation and the potential customer base.

These qualitative indicators help assess the prospects for the development of the industry and determine the potential market need for agricultural products. The indicators characterizing the organizational and managerial base are considered in more detail below:

Assessment of the professional level of personnel (R9) reflects the quality and experience of the company's employees. A high professional level of personnel increases the likelihood of successful business and of attracting investment.

Sufficient duration of the enterprise's presence on the market (R10): the length of stay on the market indicates the stability and success of the enterprise. An organization that has a stable position and long-term relationships with clients and partners may be considered more attractive to investors.

Economic policy of the enterprise (R11): correctly conducted financial activities, adequate budget planning and the ability to allocate resources effectively affect investment attractiveness. Investors are interested in companies that manage money wisely.

Technical policy of the enterprise (R12): the availability of modern technologies, equipment upgrades and support for innovation affects the attractiveness of an enterprise for investors. Technical policy should be aimed at developing and improving production processes.

Personnel policy of the enterprise (R13): important factors here include the policy of recruiting and retaining talented personnel, training and development of employees, and the creation of motivational programs. A company that pays attention to its employees and creates a strong corporate culture can attract more investors.

Borrower's credit history (R14): a stable credit history, timely repayment of loans and fulfillment of financial obligations improve the investment attractiveness of an enterprise. Potential investors perceive the absence of debt and a low degree of financial risk positively.


It is important to note that these indicators cover only some of the aspects that influence the investment attractiveness of a small enterprise. These parameters are usually assessed by an expert; alternatively, some or all of them may be left out of the assessment altogether.
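The paper leaves the scoring of the qualitative indicators R6-R14 to an expert. One simple way to plug such expert judgments into the same [0,1] scale is sketched below; the three-level linguistic scale and the numeric values attached to it are illustrative assumptions, not part of the authors' method.

```python
# Hypothetical mapping of expert linguistic ratings for the qualitative indicators
# (R6-R14) onto the unit interval, so they can be combined with the unified
# financial ratios. The scale and the numeric anchors are assumptions.

EXPERT_SCALE = {"low": 0.2, "average": 0.5, "high": 0.9}

def unify_qualitative(ratings: dict[str, str]) -> dict[str, float]:
    """Convert expert ratings such as {'R9': 'high'} into degrees in [0, 1].
    Indicators the expert chose not to rate are simply omitted."""
    return {name: EXPERT_SCALE[label.lower()] for name, label in ratings.items()}

print(unify_qualitative({"R6": "average", "R9": "high", "R14": "low"}))
# -> {'R6': 0.5, 'R9': 0.9, 'R14': 0.2}
```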

3 Results

This section details how the above indicators are mapped to the interval [0,1], i.e. how the indicators are unified. The case is considered in which one linguistic variable, "degree of investment attractiveness of the indicator", is associated with each parameter (indicator), and the initial indicators are mapped onto the interval [0,1]; that is, the set of parameters is represented as a set of linguistic variables. In the general case, we assume that any parameter takes values on a certain interval of real numbers. The simplest case is when each parameter is associated with one linguistic variable, which can be called "the degree of investment attractiveness of the indicator". It is necessary to explicitly specify the mapping of the interval of real numbers that this parameter can take into the interval [0,1]; this is called the unification of indicators [17, 18]. Taking the composition of the "degree of attractiveness" with the functions presented below, we correspondingly obtain new linguistic variables: "low" (L), "average" (M), and "high" (H). Linguistic variables of this kind are often used in technical systems, and an even larger number of gradations is frequently considered: below average, above average, close to zero, and so on. Each linguistic variable has an associated membership function. One possible way to specify membership functions is proposed in [19]. Ultimately, the choice of the type of membership function, both for the alternatives themselves and for the criteria being evaluated, is determined by expert preferences. It is obvious that the expert's uncertainty in the assessment increases as the value of the estimated parameter deviates from the optimal value, and in most cases this uncertainty does not grow linearly. However, the use of nonlinear membership functions entails a significant complication of the mathematical calculations and graphical constructions [20]. For these reasons, triangular and trapezoidal membership functions are used as the initial membership functions in this work, primarily because of the ease of the subsequent calculations and graphical constructions; in other words, the membership functions are piecewise linear. Determining the degree to which the selected set of evaluated criteria corresponds to a particular alternative is a key factor in the subsequent selection of the most suitable investment scheme. It was noted above that the indicators take rather arbitrary values; more precisely, each indicator varies within a certain inherent interval. In what follows we assume that the indicators have been unified, i.e. the corresponding intervals have been mapped onto the segment [0,1]. Each i-th indicator can be associated with one "universal" predicate P^i(x), or with three one-place predicates P^i_L(x), P^i_M(x), P^i_H(x), which arise naturally through composition with the functions above. For simplicity, formula-definable predicates are usually also introduced [21]:

P^i_LM(x) = P^i_L(x) ∪ P^i_M(x),


P^i_HM(x) = P^i_M(x) ∪ P^i_H(x).
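As a concrete illustration of the piecewise-linear membership functions behind the linguistic terms L, M, H and the derived LM and HM predicates, the following sketch uses one triangular and two trapezoidal shapes on the unified scale [0,1], interpreting the union of fuzzy predicates as a pointwise maximum (the usual convention). The breakpoints chosen here are assumptions made for illustration, since the paper leaves their choice to expert preference.

```python
# Illustrative triangular/trapezoidal membership functions for the linguistic
# terms "low", "average", "high" on the unified [0,1] scale. The breakpoints
# are assumptions; the paper leaves their choice to the expert.

def trapezoid(x: float, a: float, b: float, c: float, d: float) -> float:
    """Trapezoidal membership with feet at a, d and plateau between b and c."""
    if x <= a or x >= d:
        return 0.0
    if b <= x <= c:
        return 1.0
    return (x - a) / (b - a) if x < b else (d - x) / (d - c)

def low(x):      return trapezoid(x, -0.01, 0.0, 0.2, 0.4)   # "low" (L), trapezoidal
def average(x):  return trapezoid(x, 0.2, 0.5, 0.5, 0.8)     # "average" (M), triangular
def high(x):     return trapezoid(x, 0.6, 0.8, 1.0, 1.01)    # "high" (H), trapezoidal

# Derived predicates from the paper: P_LM = P_L ∪ P_M and P_HM = P_M ∪ P_H,
# with the union taken as the pointwise maximum of the memberships.
def low_or_average(x):  return max(low(x), average(x))
def average_or_high(x): return max(average(x), high(x))

for x in (0.1, 0.5, 0.9):
    print(x, round(low(x), 2), round(average(x), 2), round(high(x), 2))
```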

Similarly, with each financial indicator, i.e. each characteristic under assessment, we associate a predicate E^i_K, K ∈ {L, LM, M, HM, H}. The rules on the basis of which decisions are made have the form

ϕ(x1, ..., xn) → ρ(y1, ..., ym).

Several rules can be used:

Vi: ϕi(x1, ..., xn) → ρi(y1, ..., ym), i = 1, ..., N.

Usually the premise is a conjunction of unary predicates of the above signature:

ϕ(x1, ..., xn) = E1(x1) ∧ ... ∧ En(xn),

where each Ej is of the form P^j_K, K ∈ {L, LM, M, HM, H}. The indicators x1, ..., xn are obtained as a result of analyzing the activities of the enterprise; these are the indicators described in Sect. 2. The indicators y1, ..., ym are predicted, for example the investment volume, the investment period and the investment risk level. In the most general form, a single parameter is predicted, called the investment attractiveness index, which varies from 0 to 1 and has a natural interpretation.
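To make the inference step concrete, the sketch below evaluates a couple of rules of the form ϕ → ρ with a common simplification: minimum for the conjunction of premises, a singleton consequent scaled by the rule's firing strength, and maximum for aggregating rules, returning a single investment attractiveness index in [0,1]. The specific rules and this aggregation scheme are assumptions chosen for illustration; the paper fixes only the general rule format.

```python
# Hedged sketch of fuzzy rule evaluation producing an "investment attractiveness
# index" in [0, 1]. Rule premises are conjunctions of unified indicator degrees;
# min is used for the conjunction and max for rule aggregation (a common
# convention assumed here, not prescribed by the paper).

RULES = [
    # (premise: degrees that must be "high", consequent index if fully satisfied)
    ({"mu1", "mu2", "mu4"}, 0.9),   # strong profitability and liquidity -> high index
    ({"mu3", "mu5"}, 0.6),          # decent turnover and core-revenue share -> medium index
]

def attractiveness_index(degrees: dict[str, float]) -> float:
    aggregated = 0.0
    for premise, consequent in RULES:
        firing = min(degrees.get(name, 0.0) for name in premise)  # conjunction of premises
        aggregated = max(aggregated, firing * consequent)         # aggregation across rules
    return aggregated

degrees = {"mu1": 0.8, "mu2": 0.7, "mu3": 0.4, "mu4": 1.0, "mu5": 0.9}
print(round(attractiveness_index(degrees), 3))  # -> 0.63
```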

4 Discussions

This paper presents a method for determining the investment attractiveness of small businesses based on the theory of fuzzy sets. Fuzzy set theory makes it possible to take uncertainty and fuzziness in the data into account and to make decisions based on logical formulas. Section 2 describes in detail the process of data unification, that is, the transformation of the initial indicators into values in the interval [0,1]. For each parameter, a linguistic variable "degree of attractiveness of the indicator" is defined, which maps the value of the parameter into the interval [0,1]. This allows various indicators to be compared and their significance for the investment attractiveness of the enterprise to be assessed. The rules on the basis of which decisions are made are formulated as logical formulas over the parameters. The main output parameter is the investment attractiveness index, which varies from 0 to 1 and has a natural interpretation. The article also mentions the use of interval numbers to evaluate expert opinions and interval weights in decision-making problems. The idea of using probabilistic measures to quantify the intensity of preference between two intervals is not new, but the proposed approach can be used in various areas of socio-economic activity to create systems supporting expert decision-making, process monitoring and the analysis of the financial and economic activities of an enterprise.


The described method can be useful for more accurate and objective decision-making in the financial and economic sphere. However, to fully assess the effectiveness of the method, its practical application and comparison with other assessment methods are required.

Acknowledgments. This research was funded by the Committee of Science of the Ministry of Science and Higher Education of the Republic of Kazakhstan (Grant No. AP09259435).

References

1. Pemsl, D.E., et al.: Prioritizing international agricultural research investments: lessons from a global multi-crop assessment. Res. Policy 51(4) (2022)
2. Pardey, P.G., Andrade, R.S., Hurley, T.M., Rao, X., Liebenberg, F.G.: Returns to food and agricultural R&D investments in Sub-Saharan Africa, 1975-2014. Food Policy 65, 1-8 (2016)
3. Jennifer, C.: Responsibility to the rescue? Governing private financial investment in global agriculture. Agric. Hum. Values 34, 223-235 (2017)
4. Lorenzo, C.: The Great African Land Grab? Agricultural Investments and the Global Food System. Bloomsbury Publishing (2013)
5. Tavneet, S., Udry, C.: Agricultural technology in Africa. J. Econ. Perspect. 36(1), 33-56 (2022)
6. Jayne, T.S., Mason, N.M., Burke, W.J., Ariga, J.: Review: Taking stock of Africa's second-generation agricultural input subsidy programs. Food Policy 75, 1-14 (2018)
7. William, T., Hikaru, P.: Risk management in agricultural markets: a review. J. Futures Mark. 21, 953-985 (2001)
8. Davydenko, N., Skryphyk, H.: Evaluation methods of investment attractiveness of Ukrainian agricultural enterprises. Baltic J. Econ. Stud. 3(5), 103-107 (2017)
9. Zadeh, L.A.: The concept of a linguistic variable and its application to approximate reasoning. Inf. Sci. 8, 199-249 (1975)
10. Derhami, S., Smith, A.E.: An integer programming approach for fuzzy rule-based classification systems. Eur. J. Oper. Res. 256(3), 924-934 (2017)
11. Aleskerova, Y., et al.: Modeling the level of investment attractiveness of the agrarian economy sector. Int. J. Indust. Eng. Product. Res. 31(4), 490-496 (2020)
12. Mustafakulov, S.: Investment attractiveness of regions: methodic aspects of the definition and classification of impacting factors. Eur. Sci. J. 13(10), 433-449 (2017)
13. Klychova, G., et al.: Assessment of the efficiency of investing activities of organizations. E3S Web Conf. 110 (2019)
14. Kerimkhulle, S., Saliyeva, A., Makhazhanova, U., Kerimkulov, Zh., Adalbek, A., Taberkhan, R.: The input-output analysis for the wholesale and retail trade industry of the Kazakhstan statistics. E3S Web Conf. 376 (2023)
15. Mukasheva, M., Omirzakova, A.: Computational thinking assessment at primary school in the context of learning programming. World J. Educ. Technol. Curr. Issues 13, 336-353 (2021)
16. Abramov, E.P., Makhazhanova, U.T., Murzin, F.A.: Credit decision making based on Zadeh's fuzzy logic. In: Proceedings of the 12th International Ershov Conference on Informatics (PSI 2019), pp. 20-25 (2019)
17. Tussupov, J.: Isomorphisms and algorithmic properties of structures with two equivalences. Algebra Logic 55(1), 50-55 (2016)
18. Pivkin, V.Ya., Bakulin, E.P., Korenkov, D.I.: The Concept of a Linguistic Variable and Its Application to Making Approximate Decisions. Novosibirsk State University (1997)
19. Makhazhanova, U.T., Murzin, F.A., Mukhanova, A.A., Abramov, E.P.: Fuzzy logic of Zadeh and decision-making in the field of loan. J. Theor. Appl. Inf. Technol. 98(06), 1076-1086 (2020)
20. Keisler, H.J., Chang, C.C.: Continuous Model Theory. Princeton University Press (1966)
21. Makhazhanova, U., et al.: The evaluation of creditworthiness of trade and enterprises of service using the method based on fuzzy logic. Appl. Sci. 12(22) (2022)

Security in SCADA System: A Technical Report on Cyber Attacks and Risk Assessment Methodologies Sadaquat Ali(B) WMG Cyber Security Centre, University of Warwick, Coventry, UK [email protected]

Abstract. Supervisory Control and Data Acquisition (SCADA) systems have become indispensable in a wide range of industries worldwide. These systems facilitate the monitoring and management of complex physical processes by employing field devices and actuators. However, the growing reliance on SCADA systems and the transition to standardized protocols have introduced significant security risks, leading to an alarming rise in reported cyber attacks. This paper addresses the security challenges SCADA systems face by exploring risk assessment methodologies. Organizations can proactively protect their systems from potential threats by comprehending the architectural intricacies of SCADA networks and analyzing existing risk assessment techniques. Moreover, an investigation into recent cyber attacks sheds light on the emerging trends and tactics that pose a considerable risk to SCADA systems. By integrating robust risk assessment methodologies, organizations can effectively enhance the security of SCADA systems and mitigate the potential damage caused by cyber attacks. The paper also highlights the latest trends and tactics reported to be particularly successful against SCADA systems, together with a historical review of attacks and their frequency over the years. #COMESYSO1120.

Keywords: SCADA System · Security Risk · Cyber Attacks in SCADA · Emerging Trends · Risk Assessment Methodologies

1 Introduction

SCADA systems are used in an ever-wider range of industries across the globe. Because of the tremendous rise in information and communication technology, SCADA firms have found it increasingly difficult to maintain their exclusive technology and protocol-based systems and have moved towards internet-based systems; this change has made SCADA systems more vulnerable. SCADA is one of the Industrial Control Systems used to monitor and regulate all such activities [4]. Supervisory control and data acquisition systems maintain and manage physical processes using field devices and actuators that are often complicated to design [2]. Programmable logic controllers and SCADA systems are applied to control the functioning of vital infrastructure in modern times. A CPS is a system that merges physical processes with computer networking [5].


SCADA systems are called cyber-physical systems (CPSs) for different reasons. A cyber architecture is used to gather data, operate the system, provide an operator’s GUI, keep track of data, and issue alarms, among other things [6]. 1.1 Background Supervisory Control and Data Acquisition (SCADA) systems are essential for monitoring and controlling critical infrastructures such as power plants, water distribution networks, transportation systems, and manufacturing processes. These systems integrate software and hardware components to gather and analyze data, enabling operators to monitor and control industrial processes in real-time. However, the increasing interconnectivity of SCADA systems with corporate networks and the internet has exposed them to various cyber threats. 1.2 Problem Statement The security of SCADA systems has become a significant concern due to their potential vulnerability to cyber attacks. Unlike traditional IT systems, SCADA systems have unique characteristics and constraints that pose challenges for their security. The consequences of successful attacks on SCADA systems can be severe, leading to operational disruptions, environmental damage, economic losses, and even threats to human safety. 1.3 Objectives The primary objective is to explore the security challenges SCADA systems face and provide insights into effective risk assessment methodologies. By identifying and understanding the various cyber attack vectors and vulnerabilities specific to SCADA systems, organizations can develop robust defence mechanisms to protect critical infrastructure. Additionally, this paper aims to introduce different risk assessment frameworks and methodologies that can help organizations assess and manage the risks associated with SCADA systems effectively. 1.4 Scope and Methodology This research paper focuses on cyber attacks targeting SCADA systems and their potential consequences. It examines various attack vectors, including malware, network intrusions, social engineering, and insider threats, that can exploit vulnerabilities in SCADA systems. The research paper also delves into risk assessment methodologies, including qualitative and quantitative approaches, vulnerability assessments, and threat modelling, to aid organizations in evaluating and prioritizing security risks. The research paper is structured as follows: Section 2 overviews SCADA systems, highlighting their architecture, components, and key characteristics. Section 3 explores the evolving threat landscape, discussing common attack vectors and vulnerabilities specific to SCADA systems. Section 4 focuses on related work, and methodology is described in Sect. 5. Section 6 describes the risk assessment methodologies, presenting


different frameworks and approaches for identifying, analyzing, and mitigating risks. Section 7 discusses the implications of effective risk assessment in enhancing the security of SCADA systems, and Sect. 8 provides the future trends related to SCADA systems. Section 9 discusses the detailed version of the results and discussions. Finally, Sect. 10 summarizes the key findings and concludes the paper.

2 Overview of SCADA System Supervisory Control and Data Acquisition (SCADA) systems are crucial components of critical infrastructure, enabling the monitoring and control of industrial processes. They integrate software, hardware, and communication networks to gather, analyze, and transmit data from remote devices, allowing operators to make informed decisions and optimize operations in real-time. This section provides a detailed overview of SCADA systems, focusing on their architecture, components, and key characteristics. 2.1 SCADA Layers SCADA systems are typically composed of three main architectural layers: the field layer, the control layer, and the supervisory layer. 1) Field Layer: The field layer consists of sensors, actuators, and other devices responsible for data acquisition and control at the operational level. Sensors collect data on various physical parameters such as temperature, pressure, flow rate, and voltage, while actuators enable control actions, such as opening or closing valves and switches. These devices are often located in remote and physically challenging environments. 2) Control Layer: The control layer consists of programmable logic controllers (PLCs) or remote terminal units (RTUs), which act as intermediaries between the field and supervisory layers. PLCs/RTUs receive data from sensors and transmit control signals to actuators. They also perform local control function like logic execution and data preprocessing. 3) Supervisory Layer: The supervisory layer is responsible for the SCADA system’s overall control, monitoring, and management. It includes a central server or multiple servers that run the SCADA software applications. These applications provide a graphical user interface (GUI) for operators to visualize real-time data, control processes and receive alarms and notifications. The supervisory layer enables remote access and control over the SCADA system from a control room or other authorized locations. 2.2 SCADA Architecture The SCADA system is a combination of different components and programs, including a Remote Terminal Unit (RTU) and Master Terminal Unit (MTU), with actuators, sensors, and a human-machine interface acting as software, there is also one central database. Hardware and software may communicate with each other. Figure 1 shows the detailed architectural diagram of SCADA systems.


Fig. 1. SCADA Architecture.

The functionality of each component is explained below.

1) Sensors and Actuators: sensors and actuators gather information from plant operations and send it to the RTUs for further processing.
2) RTU (Remote Terminal Unit): the RTUs connect SCADA's physical objects to the microprocessors that operate them. RTUs collect sensor data and transmit it to the MTU, which oversees and operates the SCADA system. Every RTU is linked to various sensors and actuators that control localized production [7].
3) MTU (Master Terminal Unit): the MTU monitors and controls the data from the RTUs and is also known as the heart of SCADA. Based on the collected data, it enables users to operate field equipment.
4) HMI (Human-Machine Interface): the interface of SCADA, which displays the results from the MTU; based on the displayed data, final actions and steps are decided and sent to the actuators [8].
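The data flow just described (sensors feeding RTUs, RTUs reporting to the MTU, and the HMI presenting the aggregated picture to operators) can be summarized in a small data model. The sketch below is only an illustrative Python abstraction of that flow; the class and field names are assumptions made for the example, not part of any SCADA product or standard.

```python
# Illustrative abstraction of the SCADA data flow: sensor readings are collected
# by RTUs, forwarded to the MTU, and surfaced to operators through the HMI.
# All names and example values here are assumptions made for the sketch.

from dataclasses import dataclass, field

@dataclass
class SensorReading:
    tag: str        # e.g. "pump_1_pressure"
    value: float
    unit: str

@dataclass
class RTU:
    rtu_id: str
    readings: list[SensorReading] = field(default_factory=list)

    def poll(self) -> list[SensorReading]:
        """Return the readings gathered from the attached field devices."""
        return list(self.readings)

@dataclass
class MTU:
    rtus: list[RTU]

    def collect(self) -> dict[str, list[SensorReading]]:
        """Aggregate readings from every RTU for the supervisory layer."""
        return {rtu.rtu_id: rtu.poll() for rtu in self.rtus}

def hmi_view(snapshot: dict[str, list[SensorReading]]) -> None:
    """Very simple stand-in for the HMI: print the latest values per RTU."""
    for rtu_id, readings in snapshot.items():
        for r in readings:
            print(f"{rtu_id}: {r.tag} = {r.value} {r.unit}")

rtu = RTU("RTU-01", [SensorReading("pump_1_pressure", 4.2, "bar")])
hmi_view(MTU([rtu]).collect())
```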


2.3 SCADA Generations There are four essential SCADA areas, including Monolithic, Distributed, Networked, and IoT-based SCADA, as shown in Fig. 2. The evolution of the communication paradigm is broken down into these four eras [9]. 1) Monolithic: This system operates in an insulated environment and is not connected to any other system. These systems are designed to operate in a solo fashion. 2) Distributed: LANs were used to link and restrict these systems. This generation’s WAN/LAN protocols were utterly distinct from those in WAN/LAN. 3) Networked: Due to the obvious uniformity and cost-effectiveness of large-scale system solutions, it extensively uses networks and the web. New SCADA is another name for this generation. 4) IoT-based SCADA: To develop, monitor, and regulate systems, the sectors have extensively used technological advances. The Internet of Things (IoT) innovation and cheaply available cloud computing with SCADA systems have significantly reduced the cost of architecture and installation. Apart from that, as contrasted with prior eras, the connection and servicing are also straightforward.

2.4 Key Characteristics of SCADA System 1) Real-Time Monitoring and Control: SCADA systems provide real-time monitoring and control capabilities, allowing operators to visualize process data and intervene to ensure optimal operations and response to abnormal conditions. 2) Wide Area Coverage: SCADA systems are designed to monitor and control geographically dispersed assets spanning vast areas. They enable centralized management and control of multiple remote sites, increasing operational efficiency and reducing maintenance costs. 3) Redundancy and Fault Tolerance: SCADA systems often incorporate redundancy and fault-tolerant mechanisms to ensure continuous operation and minimize disruptions. Redundant components, backup power supplies, and failover configurations are implemented to mitigate the impact of hardware or software failures. 4) Security Considerations: Due to their critical nature, SCADA systems require robust security measures. Access control, authentication, encryption, and network segmentation are essential to protect against unauthorized access, tampering, and cyber attacks. 5) Integration with Enterprise Systems: SCADA systems are increasingly interconnected with enterprise systems, such as enterprise resource planning (ERP) and asset management systems. This integration improves decision-making, resource allocation, and coordination between operational and business processes.


Fig. 2. SCADA generation overview.

3 The Evolving Threat Landscape: Common Attack Vectors and Vulnerabilities Specific to SCADA Systems SCADA systems, being critical infrastructure components, face a constantly evolving threat landscape that requires vigilant security measures. Attackers, from individual hackers to sophisticated state-sponsored groups, seek to exploit vulnerabilities in SCADA systems to disrupt operations, cause damage, or gain unauthorized control.


This section explores the common attack vectors and vulnerabilities specific to SCADA systems, shedding light on the unique security challenges they present. 3.1 Threats and Attacks on SCADA In the past decades, SCADA links have introduced numerous versions and the widespread usage of Network Technology and IP-based communications. Currently, SCADA systems rely on standardized protocols rather than the previously used proprietary ones. Due to continuous developments in SCADA systems, it is now significantly more vulnerable to security risks, as seen by the recent increase in cyber attacks. Because of the distinctive needs of SCADA systems in terms of accessibility, authenticity, and secrecy, the adoption of traditional security measures is not always appropriate [10]. Any network can become crucial when the weaknesses become threats, producing many negative consequences on social systems, e.g., energy, security, health, and other aspects of society. The breakdown of an infrastructure or the shortage of services may bring significant destruction and harm to society’s economy and the stability of any country’s political system [7]. The security of the SCADA system plays an essential role in the sustainability of every society. For example, continuous power failures of SCADA systems could have devastating results for important services such as electricity, shipping, health, and other critical infrastructure. However, SCADA systems, used in power generation plants, are responsible for monitoring the generation of 48 per cent of the total electricity generated in the United States. Due to the critical role played by SCADA systems, authorities and scholarly articles have concentrated on the cyber-security and security risks associated with them [2]. SCADA systems are vital to the safety of critical information infrastructures; therefore, security is an important consideration. Risk in a SCADA system is defined as “the probability of a particular threat agent exposing a weakness and the ensuing consequence of a successful operation” [6]. Aside from information leaking, assaults on SCADA networks might cause damage, destruction, or fatality. Therefore, SCADA network security is considered the most important part of the domain. SCADA networks must be risk-assessed if they continue functioning normally [11]. In the event of an assault on a SCADA system, the results might be disastrous. SCADA systems can significantly impact public safety and health if they are constantly operational. As a result, any security breaches on these systems might have serious ramifications for the public’s well-being. An intruder might shut off gas, power, and water supplies using a SCADA system or even devastate vital military facilities [12]. An analysis of several significant assaults on SCADA systems is provided in this brief paper. • In an attack on Ukraine’s SCADA system in December 2015, hackers gained access to the information networks of three energy distribution firms, resulting in rolling power failures between one to six hours, affecting 225,000 end consumers. • In Iran’s enhancement facility, Stuxnet, a malware infection spread through air-gapped networks and caused damage to nuclear centrifuges. In 2010, this cyberweapon was used against SCADA systems.


• At the Maroochy Water Services in Queensland, Australia, a hijacked SCADA controller (wireless radio) was used to dump up to one million litres of wastewater into the large river of Maroochydore in 2000 [8]. 3.2 Common Attack Vectors 1) Malware Attacks: Malware attacks pose a significant threat to SCADA systems. Attackers may deploy malicious software, such as worms, viruses, or trojans, to infiltrate the system and gain unauthorized access. Malware can propagate through infected external devices, removable media, or compromised vendor software updates. Once inside the SCADA system, malware can disrupt operations, steal sensitive data, or manipulate control commands. 2) Network Intrusions: Network intrusions involve unauthorized access to the SCADA system’s network infrastructure. Attackers exploit weaknesses in firewalls, routers, switches, or insecure remote access mechanisms to gain entry. They may use techniques like port scanning, password cracking, or exploiting unpatched vulnerabilities to compromise the system. Once inside, attackers can eavesdrop on communications, manipulate data, or launch further attacks. 3) Social Engineering: Social engineering attacks target human vulnerabilities within the SCADA ecosystem. Attackers use psychological manipulation techniques to deceive system administrators, operators, or employees into divulging sensitive information or granting unauthorized access. Phishing, pretexting, and impersonation are common social engineering tactics employed to gain privileged information or compromise user credentials. 4) Insider Threats: Insider threats, either unintentional or malicious, can pose significant risks to SCADA systems. Disgruntled employees, contractors, or partners may abuse their privileged access to sabotage operations, steal sensitive data, or disrupt critical processes. Insider threats can be challenging to detect, as legitimate users already possess the necessary permissions and privileges. 3.3 Vulnerabilities in SCADA Systems 1) Lack of Authentication and Authorization Mechanisms: Some legacy SCADA systems lack robust authentication and authorization mechanisms, making them vulnerable to unauthorized access. Weak or default passwords, lack of multifactor authentication, and insufficient access control can expose SCADA systems to unauthorized control or manipulation. 2) Insecure Remote Access: Remote access to SCADA systems, often necessary for maintenance and troubleshooting, can introduce vulnerabilities if not properly secured. Weak encryption protocols, unpatched remote access software, or unauthorized third-party connections can provide entry points for attackers to exploit. 3) Inadequate Network Segmentation: Improper network segmentation between corporate networks and SCADA networks can expose SCADA systems to unnecessary risks. Without segregation, a successful compromise of the corporate network can lead to unauthorized access or control of the SCADA system.


4) Lack of Patching and Updates: Failure to apply regular patches and updates to SCADA system components can leave them exposed to known vulnerabilities. SCADA systems often operate for long periods without updates because of concerns about system stability or operational disruption, giving attackers a window of opportunity.
5) Insufficient Monitoring and Logging: Inadequate monitoring and logging capabilities hinder the detection of and response to security incidents in SCADA systems. Without comprehensive logs, it becomes difficult to identify unusual activities, trace attack paths, or conduct forensic analysis.
6) Lack of Security Awareness and Training: Weak security awareness and training among SCADA system operators and administrators contribute to the success of social engineering attacks. Employees may unknowingly click on phishing emails or fall victim to manipulative techniques, compromising the system's security.

Understanding these common attack vectors and vulnerabilities specific to SCADA systems is vital for developing effective security strategies. Robust authentication mechanisms, secure remote access, network segmentation, regular patching, monitoring capabilities, and comprehensive security training can help mitigate these risks and enhance the security posture of SCADA systems.

Ethernet and TCP/IP are often used in classic SCADA systems to link to and interact with web-based applications. Relying on these protocols increases the chances of cyber attacks and allows attackers to carry out further destructive actions from the outside. Beyond cyber attacks against SCADA-based critical infrastructure, social engineering and insider attacks threaten the system's integrity and security [7]. The growing uniformity of the software and hardware used in SCADA systems also makes attacks tailored to SCADA more likely to succeed, so SCADA security can no longer be achieved merely through obscurity or by locking the system down. Such attacks can potentially interrupt and impair vital infrastructure activities, resulting in significant economic losses [12]. The best-known cyber attacks on SCADA systems are listed in Table 1.

Table 1. Cyber-attacks in the SCADA system.

Insider Attack [7]: these attacks are regarded as the most devastating, since the attacker is familiar with the underlying architecture of the network and can easily circumvent the security measures put in place to protect it.

DDoS [7]: DoS/DDoS attacks overburden the system's resources so that the planned tasks cannot be completed.

Man-in-the-middle [7]: the attacker interferes with the network connection and transmits malicious software into the system, thereby compromising it.

Social Engineering [13]: using social engineering, the attacker gains access to a system in order to carry out illegal operations.

Phishing [13]: phishing attacks on SCADA systems are performed to obtain information for surveillance; the intruders want to learn about the system themselves in order to establish a backdoor through which they can cause harm later.

Malicious Software [13]: the security and secrecy of the SCADA system can be compromised by malicious software such as worms.

In summary, SCADA systems consist of multiple layers, including the field, control, and supervisory layers. They incorporate various components, such as HMIs, data acquisition units, communication infrastructure, and databases. SCADA systems are characterized by their real-time monitoring and control capabilities, wide area coverage, redundancy, fault tolerance, security considerations, and integration with enterprise systems. Understanding the architecture, components, and key characteristics of SCADA systems is the foundation for effectively addressing their security challenges.

4 Related Work In recent decades, different organizations and sectors have faced crucial problems regarding cyber assaults against SCADA. The first-ever cyber-attack on the trans-Siberian pipeline on CNI was documented in 1982 and culminated in a publicly apparent detonation. Based on statistics safeguarding, this system is a key priority because of the impending attacks on SCADA systems. Many studies give a generic risk assessment method for securing your SCADA environment based on certain attacks. This article discusses a method for assessing data breaches in SCADA networks. For security investments, the technique employs a cost-benefit assessment [10]. SCADA and CPS networks in the maritime logistics sector are complicated. Thus, this research proposes an innovative risk evaluation technique that can handle the unique security concerns of SCADA systems and CPS networks in the maritime logistics sector. Cyber risks and their cascade effects in the supply chain may be estimated using this technique, introducing a series of future security assessment services [14]. For CNI plants, this paper proposes using a risk assessment algorithm. Besides security standards, the technique considers other security, protection, and reliability metrics. Another element is defining resources by apparent and secret procedures for, assessing possible cyber-attack damage estimates [11]. Risk assessment in SCADA networks may be improved by using an AHP and RSR method presented in this paper to minimize the subjectivity and ambiguity of risk assessment and statistical and qualitative assessment of the security of average risk. After a cyber-attack occurs, this strategy also reduces the detection risk time [15]. System analysis, attack modelling and analysis, and network penetration are all included in this


paper’s risk assessment technique proposed in the study. This will show the systemic influence [16]. The National Vulnerability Database (NVD) has been frequently used to analyze SCADA vulnerabilities until May 2019. This paper is examined the security flaws. Depending on the severity, regularity, accessibility, security, privacy effect, and Common Weaknesses. Studying SCADA assaults concerning publicly disclosed vulnerabilities is said to be the first of its kind [8]. Cyber-security risk analysis in SCADA networks may be assessed using a Bayesian network model proposed in this research. In this study, the suggested risk assessment approach learns the set of parameters using past data and improves assessment accuracy by progressively training from online inspections [11]. An attack tree model using fuzzy set theory and likelihood risk assessment technology is used in this work to estimate cyber risks in a ship control system risk scenario. This technique’s gap likelihood can better portray the unpredictability of an assault [17]. While other researchers have found that CVSS risk metrics aren’t associated with exploits for all software bugs, the authors of this paper found that the risk parameters associated with the software subclass of SCADA systems were highly linked with vulnerabilities, unlike our research colleagues [18]. For SCADA systems, this paper describes an extensive risk identification approach. Factors that are used to identify risks in ISO 31000 risk management principles and standards were incorporated into the design of this model. A multilevel strategy is used to create comprehensive risk scenarios in the model, which define the links between all the elements used in risk identification [4]. An assessment approach is suggested in this work to recover the inherent constraints of traditional RPN-based FMEA while also evaluating, prioritizing, and correcting the threat categories associated with SCADA systems. However, it also seeks to analyze, prioritize, and rectify security risks associated with the threat types of SCADA systems. However, it also seeks to evaluate, prioritize, and rectify security risks associated with the threat modes of SCADA systems [6]. The authors of this study begin by examining the architectural components of current SCADA systems to assess the security positions of these systems. Then it would help if you looked at the security issues and vulnerabilities of these frameworks. This paper also presents a high-level decentralized SCADA system design that can relieve the security difficulties mentioned above by using the benefits of a tamper-resistant ledger [2]. It is proposed in this research that an anomaly detection algorithm in which complex cyberattacks that cannot be recognized by hydraulically-based rules alone may be identified using this model-based method. [5]. The purpose of this article was to give a model for developing a risk map for a loss of water supply to customers and construct a risk map. This article shows how the EPANET 2.0 application was used to simulate the collapse of the primary pipes transporting purified water from the water treatment facility to the city [19]. The protection techniques that end-users and enterprises in the Industry 4.0 domain should employ to combat cyber threats are discussed in this research paper. The sources of potential assaults have been discovered, and defence tactics may now be developed [20]. 
This article includes a review of cyber threats and defensive measures to draw attention to the need to safeguard SCADA-based critical infrastructures and an understanding of the security problems and outstanding issues in this area [7]. In [3], the authors develop a unique intrusion detection approach for spotting SCADA-tailored attacks: the strategy automatically identifies a system's normal and critical states using data-driven clustering of process parameters and then uses proximity-based detection criteria on the discovered states to keep track of them [3]. A proactive approach to risk assessment is presented in [1], which includes a comprehensive cybersecurity risk management framework for evaluating and controlling risks; following established risk management practices and standards, that study considers hazards arising from the stakeholder model, CPS elements, and their interrelationships, among other factors [1]. The authors of [12] present an overview of the overall SCADA infrastructure and a discussion of the SCADA communication protocols; they cover several recent high-impact cybersecurity events, aims, and threats and conduct an in-depth examination of the security suggestions and methods that keep SCADA systems safe and secure [12]. Another article examines the attack surface and inherent weaknesses of the SCADA system together with the features of APT attacks, with the goal of implementing safety-related protective mechanisms for SCADA systems [21]. Finally, one study makes it easier to detect possible security vulnerabilities or system elements where a compromise may happen, outlining many possible SCADA vulnerabilities based on real-world occurrences recorded in common vulnerability databases, as described in the previous section [22].

4.1 Compatibility Table

A compatibility table is used to examine the data. Table 2 summarises all the strategies detailed in our study; it lists the examined publications, their references, and their data sets, giving a detailed summary of all the reviewed procedures.

Table 2. Compatibility table for associated research.

Year 2018, Ref. [3], Domain: generic SCADA. Objective: create a unique IDS cluster that can detect attacks on SCADA networks. Contribution: detection of attacks on SCADA networks. Limitation/Challenges: focuses only on scanning SCADA vulnerabilities.

Year 2019, Ref. [7], Domain: energy sector. Objective: examines the vulnerabilities of SCADA-based critical infrastructures as well as protection solutions. Contribution: SCADA vulnerabilities and their solutions. Limitation/Challenges: contradictory security methods; no simulated methodology.

Year 2021, Ref. [9], Domain: IoT-based architecture. Objective: review SCADA safety dimensions. Contribution: discovers the rate of attacks and performance measurement. Limitation/Challenges: encrypted data cannot be considered; no simulative factors.

Year 2021, Ref. [16], Domain: not specified. Objective: assess the security risk of the SCADA system, entailing attack simulation, system analysis, and penetration. Contribution: risk assessment, system analysis, anomaly detection. Limitation/Challenges: vulnerability assessment of the process control system is used only for specific control variables.

Year 2020, Ref. [19], Domain: CSTR. Objective: identify the most likely system failures. Contribution: assesses risk, generates an attack tree and attack sequences; vulnerability scanning is done. Limitation/Challenges: the paper's assumption may be invalidated if many system units are targeted simultaneously.

Year 2020, Ref. [20], Domain: SCADA. Objective: best practices for preventing cyber attacks. Contribution: analyzes vulnerabilities and risk assessment. Limitation/Challenges: inspecting the system's data and looking for further vulnerabilities remains possible.

Year 2018, Ref. [21], Domain: SCADA system. Objective: analyzes the attack surface for SCADA. Contribution: provides safety protection measures for SCADA. Limitation/Challenges: security measures and anti-risk strategies are general.

Year 2020, Ref. [22], Domain: generic SCADA network. Objective: identify potential system breaches and methods. Contribution: scanning of logs, network traffic analysis, and testing of SCADA security vulnerabilities. Limitation/Challenges: credential theft is explained under weak credentials, but no attack tactics are provided.

Year 2019, Ref. [23], Domain: petroleum products. Objective: ways the SCADA system can be attacked and measures that can reduce the possibility of an attack. Contribution: vulnerabilities are identified in the system. Limitation/Challenges: only an assessment of the risks associated with generically linked PLCs.

Year 2017, Ref. [24], Domain: power generation. Objective: discusses techniques to uncover system vulnerabilities. Contribution: discusses threat vectors, system vulnerabilities, and their impact on SCADA. Limitation/Challenges: only a simple network data protection risk assessment may be performed with a system penetration technique.

Year 2018, Domain: CNI plants. Objective: formula-based risk assessment methodology for industrial control systems. Contribution: delivers genuine cyber-attack sources, intrusion categories, and authorized tools. Limitation/Challenges: no separate examination of the SCADA network's vulnerability.

5 Methodology In this section, we outline the methodology employed in this research to explore security in SCADA systems and examine risk assessment methodologies. The research methodology encompasses the data collection process, research design, and analytical approaches utilized to achieve the objectives of this study. 5.1 Data Collection The primary data sources for this research include scholarly articles, industry reports, books, and relevant publications from reputable sources. These sources were obtained from academic databases, professional journals, and online libraries. The data collected covered various topics, including SCADA system architecture, vulnerabilities, attack vectors, risk assessment frameworks, and best practices. 5.2 Research Design To address the research objectives effectively, a qualitative research design was adopted. Qualitative research allows an in-depth exploration of the complexities and nuances associated with security in SCADA systems and risk assessment methodologies. By employing this approach, we aimed to gather rich and detailed information to comprehensively understand the subject matter. 5.3 Data Analysis The collected data were analyzed using a thematic analysis approach. The thematic analysis involves identifying the data’s patterns, themes, and key concepts to derive


meaningful insights and draw conclusions. The analysis involved coding the data based on relevant themes related to SCADA system security, attack vectors, vulnerabilities, and risk assessment methodologies. Through iterative analysis, patterns and connections within the data were identified, contributing to developing key findings and conclusions. 5.4 Ethical Considerations Throughout the research process, ethical considerations were adhered to. Proper citation and referencing practices were followed to ensure the intellectual property rights of the authors and maintain academic integrity. Confidentiality and anonymity were maintained when referring to specific case studies or incidents to protect the privacy of organizations and individuals involved. 5.5 Limitations It is important to acknowledge the limitations of this research. Due to the vast scope and complexity of the topic, it was not feasible to cover every aspect comprehensively. The findings and conclusions drawn from this study are based on the available literature and data sources. The rapidly evolving nature of cybersecurity threats and SCADA system vulnerabilities may require additional research and analysis in the future to capture emerging trends and developments accurately. 5.6 Validity and Reliability Multiple sources were consulted and cross-referenced to ensure the validity and reliability of the research findings. The credibility and reputation of the selected sources were assessed to ensure accurate and up-to-date information. The research process followed rigorous academic standards, including peer-reviewed sources and recognized frameworks and methodologies for risk assessment. In summary, this section provided an overview of the methodology employed in this research. The data collection involved accessing scholarly articles, industry reports, and relevant publications. A qualitative research design was adopted to explore the complexities of security in SCADA systems, and a thematic analysis approach was utilized for data analysis. Ethical considerations were followed throughout the research process, and the study’s limitations, validity, and reliability were acknowledged.

6 Risk Assessment Methodologies: Identifying, Analyzing, and Mitigating Risks To enhance the security of SCADA systems, organizations need to implement robust risk assessment methodologies. Risk assessment is crucial in identifying vulnerabilities, evaluating potential threats, and prioritizing security measures. This section delves into different frameworks and approaches for conducting risk assessments in SCADA systems, enabling organizations to manage and mitigate risks effectively.


6.1 Importance of Risk Assessment in SCADA Systems Risk assessment is a systematic process that enables organizations to identify and evaluate potential risks and their impact on SCADA systems’ security and operational integrity. It provides a foundation for informed decision-making and helps allocate resources to address critical vulnerabilities. By conducting risk assessments, organizations can identify gaps in security controls, implement necessary safeguards, and prioritize security investments to protect critical infrastructure. 6.2 Risk Assessment Frameworks 1) NIST Risk Management Framework (RMF): The National Institute of Standards and Technology (NIST) Risk Management Framework is a widely used framework that provides a structured approach to risk assessment. The framework consists of six steps: (1) Categorize the system, (2) Select security controls, (3) Implement security controls, (4) Assess security controls, (5) Authorize system operation, and (6) Monitor security controls. The NIST RMF emphasizes continuous monitoring and feedback loops for on-going risk management. 2) ISO 31000: ISO 31000 is an international standard for risk management that provides a comprehensive framework applicable to various industries, including SCADA systems. It emphasizes identifying, assessing, and treating risks through a structured and iterative process. The ISO 31000 framework consists of six steps: (1) Establish the context, (2) Identify risks, (3) Analyze risks, (4) Evaluate risks, (5) Treat risks, and (6) Monitor and review. This framework promotes a proactive and systematic approach to risk management. 3) OCTAVE Allegro: Operational Critical Threat, Asset, and Vulnerability Evaluation (OCTAVE) Allegro is a risk assessment methodology specifically designed for critical infrastructures, making it highly applicable to SCADA systems. It identifies assets, threats, vulnerabilities, impacts, and risk factors. The OCTAVE Allegro process involves six stages: (1) Identify assets and critical services, (2) Identify threats and vulnerabilities, (3) Determine impacts and prioritize risks, (4) Identify risk mitigation strategies, (5) Develop a risk mitigation plan, and (6) Monitor and update the risk assessment. 6.3 Risk Assessment Approaches 1) Qualitative Risk Assessment: Qualitative risk assessment provides a subjective evaluation of risks based on expert judgment and qualitative scales. It involves identifying and categorizing risks, estimating their likelihood and impact, and assigning a risk rating or priority. Qualitative risk assessments are valuable in situations where limited data is available or when assessing risks that are difficult to quantify precisely. They facilitate the identification of high-level risks and support the development of risk mitigation strategies. 2) Quantitative Risk Assessment: Quantitative risk assessment involves a more rigorous and data-driven approach to risk analysis. It utilizes numerical values and statistical methods to assess risks based on factors such as asset value, threats, vulnerabilities,


probabilities, and consequences. Quantitative risk assessments provide a quantitative measure of risk, enabling organizations to prioritize and compare risks effectively. This approach requires more data and expertise but offers higher accuracy in risk analysis. 3) Vulnerability Assessment: Vulnerability assessment focuses on identifying and evaluating specific vulnerabilities in SCADA systems. It involves systematic scanning, testing, and analysis of the system’s infrastructure, network, and components to identify weaknesses that attackers can exploit. Vulnerability assessments may include penetration testing, network scanning, and code review to identify potential attack entry points. Organizations can take proactive steps to remediate weaknesses and reduce the overall risk exposure by identifying vulnerabilities. 4) Threat Modeling: Threat modelling is a structured approach to identify potential threats and attack vectors specific to a SCADA system. It involves analyzing the system’s architecture, components, and operational environment to identify potential threats and their impact on the system. Threat modelling helps organizations understand how attackers might exploit vulnerabilities and guides the selection and implementation of appropriate security controls. Common threat modelling methodologies include STRIDE (Spoofing, Tampering, Repudiation, Information Disclosure, Denial of Service, Elevation of Privilege) and DREAD (Damage, Reproducibility, Exploitability, Affected Users, Discoverability). 6.4 Risk Mitigation Strategies Once risks have been identified and analyzed through the risk assessment process, organizations can develop risk mitigation strategies to minimize the likelihood and impact of potential threats. Risk mitigation strategies can include: 1) Security Controls and Countermeasures: Implementing appropriate security controls and countermeasures is essential to mitigate identified risks. These may include access controls, encryption mechanisms, intrusion detection systems, firewalls, antivirus software, and regular patch management. Security controls should be selected based on the specific vulnerabilities and threats identified during the risk assessment process. 2) Incident Response and Business Continuity Planning: Developing comprehensive incident response and business continuity plans is crucial for minimizing the impact of security incidents and ensuring continuity of operations. These plans outline procedures for detecting, responding to, and recovering from security breaches or disruptions. Incident response plans should include predefined actions, communication protocols, and recovery processes to minimize downtime and mitigate damages. 3) Security Awareness and Training: Promoting security awareness and regular training to system operators, administrators, and employees is vital to prevent and mitigate risks. Training programs should cover topics such as recognizing social engineering tactics, secure system configuration, password hygiene, and incident reporting procedures. By fostering a culture of security awareness, organizations can significantly reduce the likelihood of successful attacks. 4) Regular Audits and Assessments: Regular audits and assessments ensure ongoing monitoring and improvement of security measures. Audits help identify gaps,


measure the effectiveness of security controls, and verify compliance with relevant regulations and standards. Regular assessments enable organizations to detect new vulnerabilities, adapt to evolving threats, and refine risk management strategies continuously. Effective risk assessment methodologies are essential for enhancing the security of SCADA systems. The NIST Risk Management Framework, ISO 31000, and OCTAVE Allegro provide structured frameworks to guide the risk assessment process. Qualitative and quantitative risk assessment approaches, vulnerability assessments and threat modelling help identify and analyze risks effectively. Organizations can mitigate vulnerabilities and reduce the overall risk exposure in SCADA systems by implementing risk mitigation strategies, such as security controls, incident response planning, security awareness training, and regular audits.
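To make the qualitative scoring described in Section 6.3 concrete, the sketch below shows one common way a risk register can be rated on a 5-point likelihood-impact matrix and bucketed by priority. It is an illustration only: the risk names, scores, and bucket thresholds are hypothetical and are not drawn from this report.

```python
# Minimal sketch of a qualitative risk-matrix scoring step (hypothetical values).
# Likelihood and impact are expert-judged on a 1-5 scale; their product gives a
# priority score that is bucketed into Low / Medium / High / Critical.

RISKS = [
    # (risk description, likelihood 1-5, impact 1-5) -- hypothetical entries
    ("Unauthenticated remote access to HMI", 4, 5),
    ("Malware introduced via USB media", 3, 4),
    ("Unpatched RTU firmware", 4, 3),
]

def bucket(score: int) -> str:
    if score >= 20:
        return "Critical"
    if score >= 12:
        return "High"
    if score >= 6:
        return "Medium"
    return "Low"

for name, likelihood, impact in sorted(RISKS, key=lambda r: r[1] * r[2], reverse=True):
    score = likelihood * impact
    print(f"{name}: score={score} ({bucket(score)})")
```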

7 Implications of Effective Risk Assessment in Enhancing the Security of SCADA Systems Effective risk assessment plays a pivotal role in enhancing the security of SCADA (Supervisory Control and Data Acquisition) systems. This section explores the implications of conducting thorough risk assessments and how they contribute to bolstering the security posture of SCADA systems. By identifying vulnerabilities, evaluating risks, and implementing appropriate mitigation strategies, organizations can better protect critical infrastructure and ensure SCADA systems’ integrity, availability, and confidentiality. 7.1 Proactive Identification of Vulnerabilities and Risks One of the key implications of effective risk assessment is the proactive identification of vulnerabilities and risks in SCADA systems. By conducting comprehensive assessments, organizations can better understand the potential weaknesses and attack vectors that adversaries might exploit. This allows for targeted and timely security measures to be implemented, reducing the likelihood of successful attacks and minimizing the potential impact on operations. Identifying vulnerabilities and risks proactively enables organizations to stay one step ahead of potential threats, enhancing the overall security posture of SCADA systems. 7.2 Prioritization of Security Investments and Resource Allocation Risk assessment provides organizations a structured framework for prioritizing security investments and allocating resources effectively. By evaluating identified risks and their potential impact, organizations can determine which vulnerabilities require immediate attention and allocate resources accordingly. Risk assessment helps organizations understand the cost-benefit trade-offs of different security measures and prioritize investments based on the level of risk they address. This ensures that limited resources are deployed to mitigate the most critical risks, optimizing the use of resources and maximizing the overall security of SCADA systems.
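The investment-prioritization idea discussed above is often operationalized with the standard Annualized Loss Expectancy relation (ALE = SLE x ARO) and a cost-benefit ranking of candidate controls. The sketch below illustrates that calculation under a fixed budget; all control names, dollar figures, and the budget itself are hypothetical.

```python
# Illustrative quantitative prioritization using Annualized Loss Expectancy (ALE).
# ALE reduction = single loss expectancy reduced (SLE) * annual rate of occurrence reduced (ARO).
# Controls are ranked by risk reduction per dollar of annual cost; figures are hypothetical.

controls = [
    # (control, SLE reduced ($), ARO reduced (events/yr), annual cost ($))
    ("Network segmentation of SCADA VLAN",   500_000, 0.20, 40_000),
    ("Intrusion detection on field network", 200_000, 0.30, 25_000),
    ("Regular patch management for HMIs",    150_000, 0.25, 15_000),
]

budget = 60_000
spent = 0
for name, sle, aro, cost in sorted(
    controls, key=lambda c: (c[1] * c[2]) / c[3], reverse=True
):
    ale_reduction = sle * aro
    if spent + cost <= budget:
        spent += cost
        print(f"FUND  {name}: reduces ALE by ${ale_reduction:,.0f} for ${cost:,}/yr")
    else:
        print(f"DEFER {name}: reduces ALE by ${ale_reduction:,.0f} but exceeds budget")
```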


7.3 Compliance with Regulations and Standards Effective risk assessment helps organizations achieve compliance with relevant regulations and standards governing SCADA systems. Many industries and jurisdictions have specific requirements related to critical infrastructure security, and risk assessments are often a foundational requirement. By conducting comprehensive risk assessments and implementing appropriate security controls, organizations can demonstrate compliance with regulatory frameworks, industry best practices, and internationally recognized standards. Compliance helps meet legal obligations and serves as a benchmark for ensuring a minimum level of security and reducing the risk of potential penalties or legal consequences. 7.4 Improved Incident Response and Resilience Risk assessment enhances incident response and improves the overall resilience of SCADA systems. Organizations can develop robust incident response plans that address various scenarios by identifying potential threats and vulnerabilities. Effective risk assessment enables organizations to identify critical assets, establish incident-handling procedures, and define roles and responsibilities during security incidents. With proper incident response planning, organizations can effectively detect, respond to, and recover from security breaches or disruptions, minimizing the impact on operations and ensuring the continuity of critical services. Risk assessment also facilitates the identification of recovery strategies and the implementation of resilience measures, enabling SCADA systems to bounce back from security incidents quickly. 7.5 Adaptation to Evolving Threat Landscape Risk assessment allows organizations to adapt to the evolving threat landscape and emerging security risks in SCADA systems. By regularly conducting risk assessments, organizations can stay informed about the latest attack vectors, vulnerabilities, and industry-specific threats. It enables them to update their security controls, adjust mitigation strategies, and implement new measures to address emerging risks effectively. Risk assessment acts as a continuous feedback loop, ensuring that security measures keep pace with the changing threat landscape and providing organizations with the agility to respond to new and emerging threats. 7.6 Continuous Improvement of Security Measures Lastly, effective risk assessment supports the continuous improvement of security measures in SCADA systems. Risk assessment is not a one-time activity but a recurring process that needs to be revisited periodically. By regularly reassessing risks and evaluating the effectiveness of security controls, organizations can identify gaps, measure the impact of implemented measures, and refine their security strategies. This iterative approach enables organizations to continuously enhance their security posture, address new vulnerabilities, and adapt to evolving threats, ensuring that the SCADA systems remain resilient and secure over time.


Effective risk assessment in SCADA systems has several implications that significantly enhance security. Proactive identification of vulnerabilities and risks enables targeted security measures, while prioritization of investments optimizes resource allocation. Compliance with regulations and standards ensures adherence to recognized security practices. Improved incident response and resilience enable swift recovery from security incidents, and adaptation to the evolving threat landscape enhances security readiness. Lastly, continuous improvement ensures that security measures remain effective and aligned with the changing threat landscape. By embracing the implications of effective risk assessment, organizations can enhance the security of SCADA systems, safeguard critical infrastructure, and maintain operational integrity.

8 Future Trends Related to SCADA Systems This paper has examined several approaches and procedures used in SCADA systems, together with various threats and their countermeasures, and has produced a compatibility chart summarizing a balanced distribution of preventative measures against threats. This section discusses the most recent trends and methods with the potential to be highly effective against SCADA systems, along with a historical assessment of reported attacks and the frequency with which they have occurred over several years. It also provides an overview of the cyberattacks and malware anticipated to emerge in the near future. 8.1 Comprehensive Analysis PLCs and other cyber-physical devices are core resources for industrial applications. Unlike office information technology systems, which focus on data management, these systems interact directly with the physical environment and the people who operate them [28]. As a result, attacks on PLCs affect physical entities, whether personnel or production equipment. PLCs typically require an accompanying operating system, usually a Windows variant tailored to the specific requirements of industrial applications. Because many exploits stem from generic operating system flaws, this analysis considers only vulnerabilities that emerge directly from the system's industrial implementation. Remote code execution is the most frequently exploited vulnerability for manipulating PLCs. While the CIA security goals (confidentiality, integrity, availability) are treated as roughly equal in importance in traditional IT environments, availability is the most critical security requirement in industrial settings. Because unavailable manufacturing processes are extremely costly, this burden falls primarily on machine operators. Remote code execution can disable services, rendering them unreachable and causing revenue losses. A recent percentage breakdown of common attacks and faults affecting PLCs in SCADA systems is illustrated in Fig. 3. Field buses and controllers are other crucial components of a SCADA system. A slew of Fieldbus protocols and ports has emerged to address the specialized nature of industrial networks, including EtherCAT, DNP3, and OMRON, with Modbus, Profinet, and others available as alternatives. These protocols are considered inherently untrustworthy because they were designed without authentication from the outset.
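To make the lack of built-in authentication concrete, the sketch below assembles a raw Modbus/TCP "Read Holding Registers" request using only Python's standard library; nothing in the frame identifies or authenticates the sender, so any host that can reach TCP port 502 can issue it. The target IP address, port, unit id, and register range are hypothetical placeholders.

```python
import socket
import struct

# Modbus/TCP "Read Holding Registers" (function code 0x03) request.
# MBAP header: transaction id (2B), protocol id (2B, always 0), length (2B), unit id (1B).
# PDU: function code (1B), starting address (2B), quantity of registers (2B).
# Note: there is no credential or integrity field anywhere in the frame.

transaction_id = 1
protocol_id = 0
unit_id = 1
function_code = 0x03
start_address = 0
quantity = 10

pdu = struct.pack(">BHH", function_code, start_address, quantity)
mbap = struct.pack(">HHHB", transaction_id, protocol_id, len(pdu) + 1, unit_id)
request = mbap + pdu

# Hypothetical PLC address (TEST-NET range) -- shown only to illustrate the protocol.
with socket.create_connection(("192.0.2.10", 502), timeout=2) as sock:
    sock.sendall(request)
    response = sock.recv(256)
    print(response.hex())
```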


As seen in Graph 1 below, a study of all available ports was conducted through the year 2018. In this research, all of the ports that have been targeted are identified, and it is shown that, depending on how they are configured, these field buses may be susceptible to MITM and DoS attacks.

[Figure: breakdown of threat statistics in SCADA systems; categories shown are malicious code; spoofing, MITM, and replay attacks; DoS and DDoS; information gathering; end-user threats; social engineering; and other threats.]

Fig. 3. Statistics about the threat in SCADA System.

Examining the recorded entries over time suggests that their total number will keep increasing by a certain proportion. Over 1100 senior information technology professionals from the United States, Europe, and the Middle East/North Africa were questioned on the likelihood of cyberattacks increasing in the following three years (2020–2022) [29]. Because such systems are abundant and vulnerable, the margin for error in this category is large. Six out of ten professionals expect DoS or DDoS attacks to increase in the near term, implying a rise of roughly 8% for DoS. The IT professionals interviewed for the report also expect a rise in SCADA, nation-state, and cyber-extortion problems. Although only around four out of ten IT professionals are familiar with SCADA, the projected increase in injection attacks, attacks on SCADA systems, cyber-extortion/information disclosure, and related activities is roughly 5%. Graph 2 depicts the cumulative percentage increase in these attacks from 2019 to 2022 and the expected increase by 2025 based on these percentages. The projections rest on several IT expert estimates for particular attack types. As the graph shows, stronger countermeasures will be needed in the future. Demand for SCADA-based industrial control systems is expected to reach $181.6 billion by 2024


[Figure: bar chart of port hit percentages up to 2018; vertical axis 0–80%.]

Graph 1. Ports hit percentage.

[Figure: grouped bar chart comparing attack percentages for Remote Code Execution, DoS, Injection Attacks, Information Disclosure, and SCADA Attacks across the series 2019, 2022, and the 2025 forecast; vertical axis 0–50%.]

Graph 2. Comparison of total Attacks till 2019–2022 with the forecasted value of Attacks till 2025.

[30]. Between 2018 and 2024, experts expect a CAGR of over 11.5 percent. The growing Industrial Internet of Things (IIoT) and the global electricity market will boost the sector. Despite these advantages, SCADA faces several challenges [31]. Due to the high cost of setting up a SCADA system, several organizations (especially small enterprises) do not use it. Uncertainty in oil and gas prices typically limits organizations’


ability to invest in new technology such as SCADA. These risks may increase as the technology moves toward providing new secure, adaptive features.
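As a quick illustration of the CAGR relationship behind the market figures quoted above ($181.6 billion by 2024 at a CAGR of over 11.5 percent between 2018 and 2024), the short calculation below back-computes the implied 2018 market size. It is an arithmetic illustration only, not a figure taken from [30].

```python
# CAGR relationship: end_value = start_value * (1 + rate) ** years
end_value_2024 = 181.6   # USD billions, figure quoted from [30]
cagr = 0.115             # "over 11.5 percent"
years = 6                # 2018 -> 2024

implied_start_2018 = end_value_2024 / (1 + cagr) ** years
print(f"Implied 2018 market size: ~${implied_start_2018:.1f}B")
# ~ $94.5B; any starting figure consistent with the quoted CAGR is of this order.
```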

9 Results and Discussion In this section, we present the results obtained from analyzing data related to security in SCADA systems and risk assessment methodologies. The findings are detailed, providing insights into the vulnerabilities, attack vectors, and risk assessment practices specific to SCADA systems. 9.1 Vulnerabilities in SCADA Systems The analysis revealed several vulnerabilities commonly found in SCADA systems. These vulnerabilities include insufficient authentication and authorization mechanisms, insecure remote access practices, inadequate network segmentation, and a lack of robust monitoring and logging capabilities. These findings highlight the need for organizations to address these vulnerabilities to enhance the security of SCADA systems. 9.2 Common Attack Vectors The examination of the data identified various attack vectors frequently used to exploit vulnerabilities in SCADA systems. Malware attacks, network intrusions, social engineering, and insider threats were found to be the primary attack vectors. The results emphasize the importance of implementing measures such as strong access controls, secure network configurations, user awareness training, and monitoring solutions to mitigate these attack vectors effectively. 9.3 Risk Assessment Methodologies The analysis of risk assessment methodologies provided valuable insights into different frameworks and approaches used in SCADA system security. The NIST Risk Management Framework (RMF), ISO 31000, and OCTAVE Allegro were identified as widely used frameworks for conducting risk assessments. The qualitative and quantitative risk assessment approaches, vulnerability assessments, and threat modelling methodologies were also explored. These findings emphasize the significance of adopting structured risk assessment methodologies to effectively identify, analyze, and prioritize risks. 9.4 Implications of Effective Risk Assessment The discussion of the implications of effective risk assessment underscored its significance in enhancing the security of SCADA systems. Proactive identification of vulnerabilities and risks allows organizations to take targeted security measures. Prioritization of security investments based on risk assessment outcomes ensures optimal resource allocation. Compliance with regulations and standards demonstrates adherence to recognized security practices. Improved incident response and resilience enable organizations to handle security incidents effectively. Adaptation to the evolving threat landscape ensures that security measures stay current. Lastly, continuous improvement enables organizations to enhance security measures and address emerging vulnerabilities.


9.5 Case Studies To provide practical context, this research analyzed real-world case studies highlighting the consequences of security breaches in SCADA systems. These case studies demonstrated the potential impact of cyber attacks, including disruption of critical operations, environmental damage, economic losses, and threats to human safety. By analyzing these incidents, we gain a deeper understanding of the importance of robust security measures and risk assessment practices in mitigating such risks. 9.6 Limitations and Future Research It is important to acknowledge the limitations of this research. The findings and conclusions are based on the available literature and data sources, which may not encompass the entirety of the subject matter. The rapidly evolving nature of cybersecurity threats necessitates ongoing research and analysis. Future research can explore emerging attack vectors, evolving vulnerabilities, and advancements in risk assessment methodologies to enhance further the understanding and implementation of security in SCADA systems. The results obtained from the analysis provide valuable insights into the vulnerabilities, attack vectors, risk assessment methodologies, and implications for security in SCADA systems. The findings emphasize the need for organizations to address vulnerabilities, implement effective security measures, and adopt structured risk assessment methodologies to mitigate risks effectively. By incorporating these insights, organizations can enhance the security of their SCADA systems, safeguard critical infrastructure, and ensure the reliable and secure operation of these vital systems.

10 Summary and Conclusion 10.1 Key Findings This research paper explored the topic of security in SCADA (Supervisory Control and Data Acquisition) systems, focusing on cyber attacks and risk assessment methodologies. The key findings and conclusions drawn from the research paper are summarized below. 1. SCADA systems are critical components of infrastructure, and their security is paramount due to the potential consequences of successful cyber attacks. 2. The evolving threat landscape poses various challenges to the security of SCADA systems, including malware attacks, network intrusions, social engineering, and insider threats. 3. SCADA systems exhibit unique vulnerabilities, such as a lack of authentication mechanisms, insecure remote access, inadequate network segmentation, and insufficient monitoring and logging capabilities. 4. Risk assessment is crucial for identifying, analyzing, and mitigating risks in SCADA systems. Various frameworks and methodologies enable organizations to effectively assess risks, including NIST RMF, ISO 31000, and OCTAVE Allegro, qualitative and quantitative assessments, vulnerability assessments, and threat modelling. 5. Effective risk assessment has implications for enhancing the security of SCADA systems, including proactive vulnerability identification, prioritization of security investments, compliance with regulations, improved incident response and resilience, adaptation to evolving threats, and continuous improvement of security measures.


10.2 Conclusion The future of SCADA systems presents opportunities and challenges, particularly as organizations adopt emerging technologies like IoT and cloud computing. This convergence of physical and digital worlds necessitates robust security measures to protect against cyber-physical threats. The analysis of future trends highlights the need for organizations to remain proactive and adaptive in their security strategies. By recognizing and addressing emerging trends, organizations can stay ahead of potential threats and protect their SCADA systems from evolving risks. Integration of security practices that account for the increasing reliance on IIoT, cloud technologies, and associated vulnerabilities is essential. Compliance with regulatory requirements and proper implementation of machine learning and artificial intelligence technologies further strengthen the security posture of SCADA systems. In today’s interconnected world, the security of SCADA systems is a critical concern. This paper has provided valuable insights into challenges, vulnerabilities, and risk assessment methodologies specific to SCADA systems. By understanding the evolving threat landscape and implementing robust risk assessment strategies, organizations can bolster the security of their SCADA systems. Proactively identifying vulnerabilities and risks, prioritizing security investments based on risk assessment outcomes, and complying with relevant regulations and standards are essential steps. Effective risk assessment enables organizations to improve incident response and resilience, adapt to emerging threats, and continuously enhance security measures. Regular assessments, combined with proper incident response planning, minimize the impact of security incidents and help maintain the continuity of critical operations. As the threat landscape evolves, organizations must remain vigilant and continually update their risk assessment practices and security measures. Comprehensive risk assessment practices and appropriate security measures are vital to safeguard critical infrastructure, protect against cyber threats, and ensure systems’ reliable and secure operation. In conclusion, the findings of this research paper underscore the importance of risk assessment methodologies in enhancing the security of SCADA systems. With the insights gained from risk assessments, organizations can proactively mitigate risks, allocate resources effectively, and maintain a robust security posture. By embracing comprehensive risk assessment practices and implementing appropriate security measures, organizations can safeguard critical infrastructure, protect against cyber threats, and ensure the reliable and secure operation of SCADA systems in the future.

References

1. Kure, H., Islam, S., Razzaque, M.: An integrated cyber security risk management approach for a cyber-physical system. Appl. Sci. 8(6), 898 (2018). https://doi.org/10.3390/app8060898
2. Gomez, R.A.O., Tosh, D.K.: Towards security and privacy of scada systems through decentralized architecture. In: 2019 International Conference on Computational Science and Computational Intelligence (CSCI), pp. 1224–1229. IEEE (2019)
3. Shrivastava, S., Saquib, Z., Shah, S.: Vulnerabilities of scada systems and its impact on cyber security. Int. J. Electr. Electron. Data Commun. 6(6), 26–30 (2018)


4. Elhady, A.M., El-bakry, H.M., Elfetouh, A.A.: Comprehensive risk identification model for SCADA systems. Secur. Commun. Networks 2019, 1–24 (2019). https://doi.org/10.1155/ 2019/3914283 5. Housh, M., Ohar, Z.: Model-based approach for cyber-physical attack detection in water distribution systems. Water Res. 139(August), 132–143 (2018). https://doi.org/10.1016/j.wat res.2018.03.039 6. Lin, K.-S.: A new evaluation model for information security risk management of SCADA systems. IEEE Xplore (2019). https://doi.org/10.1109/ICPHYS.2019.8780280 7. Tariq, N., Asim, M., Khan, F.A.: Securing SCADA-based critical infrastructures: challenges and open issues. Procedia Comput. Sci. 155, 612–617 (2019). https://doi.org/10.1016/j.procs. 2019.08.086 8. Geeta, Y., Paul, K.: Assessment of SCADA System Vulnerabilities. IEEE Xplore. 1 Sept 2019. https://doi.org/10.1109/ETFA.2019.8869541 9. Yadav, G., Paul, K.: Architecture and security of SCADA systems: a review. Int. J. Crit. Infrastruct. Prot. 34(September), 100433 (2021). https://doi.org/10.1016/j.ijcip.2021.100433 10. Markovic-Petrovic, J.D., Stojanovic, M.D., Bostjancic Rakas, S.V.: A fuzzy AHP approach for security risk assessment in SCADA networks. Adv. Electr. Comput. Eng. 19(3), 69–74 (2019). https://doi.org/10.4316/AECE.2019.03008 11. Huang, K., Zhou, C., Tian, Y.C., Tu, W., Peng, Y.: Application of Bayesian network to data-driven cyber-security risk assessment in SCADA networks. In: 2017 27th International Telecommunication Networks and Applications Conference ITNAC 2017, vol. 2017, pp. 1–6 (2017). https://doi.org/10.1109/ATNAC.2017.8215355 12. Pliatsios, D., Sarigiannidis, P., Lagkas, T., Sarigiannidis, A.G.: A survey on SCADA systems: secure protocols, incidents, threats and tactics. IEEE Commun. Surv. Tutorials 22(3), 1942– 1976 (2020). https://doi.org/10.1109/COMST.2020.2987688 13. Coffey, K., et al.: Vulnerability assessment of cyber security for SCADA systems. In: Parkinson, S., Crampton, A., Hill, R. (eds.) Guide to Vulnerability Analysis for Computer Networks and Systems. CCN, pp. 59–80. Springer, Cham (2018). https://doi.org/10.1007/978-3-31992624-7_3 14. Kalogeraki, E.-M., Papastergiou, S., Mouratidis, H., Polemi, N.: A novel risk assessment methodology for SCADA maritime logistics environments. Appl. Sci. 8(9), 1477 (2018). https://doi.org/10.3390/app8091477 15. Lan, J.: Research on cybersecurity risk assessment in scada networks based on AHP-RSR. In: Proceedings - 2020 International Conference on Communications, Information System and Computer Engineering CISCE 2020, pp. 361–364 (2020). https://doi.org/10.1109/CIS CE50729.2020.00079 16. Hossain, N., Das, T., Tariqul Islam, M., Hossain, A.: Cyber security risk assessment method for SCADA system. Inform. Secur. J. Global Perspect. 31(5), 499–510 (2021). https://doi. org/10.1080/19393555.2021.1934196 17. Shang, W., Gong, T., Chen, C., Hou, J., Zeng, P.: Information security risk assessment method for ship control system based on fuzzy sets and attack trees. Secur. Commun. Networks 2019, 1–11 (2019). https://doi.org/10.1155/2019/3574675 18. Falco, G., Caldera, C., Shrobe, H.: IIoT cybersecurity risk modeling for SCADA systems. IEEE Internet Things J. 5(6), 4486–4495 (2018). https://doi.org/10.1109/JIOT.2018.2822842 ˙ 19. Boryczko, K., Piegdo´n, I., Szpak, D., Zywiec, J.: Risk assessment of lack of water supply using the hydraulic model of the water supply. Resources 10(5), 43 (2021). https://doi.org/ 10.3390/resources10050043 20. 
Süzen, A.A.: A risk-assessment of cyber attacks and defense strategies in industry 4.0 ecosystem. Int. J. Comput. Netw. Inf. Secur. 12(1), 1–12 (2020). https://doi.org/10.5815/ijcnis.2020. 01.01


21. Zhou, X., Xu, Z., Wang, L., Chen, K., Chen, C., Zhang, W.: APT attack analysis in SCADA systems. MATEC Web Conf. 173, 2–6 (2018). https://doi.org/10.1051/matecconf/201817 301010 22. Upadhyay, D., Sampalli, S.: SCADA (Supervisory Control and Data Acquisition) systems: Vulnerability assessment and security recommendations. Comput. Secur. 89, 101666 (2020). https://doi.org/10.1016/j.cose.2019.101666 23. Shaw, W.T.: SCADA System Vulnerabilities to Cyber Attack. 2019. Access Date: 01–02– 2022, Access time: 06:41pm 24. Nazir, S., Patel, S., Patel, D.: Assessing and augmenting SCADA cyber security: A survey of techniques. Comput. Secur. (2017). https://doi.org/10.1016/j.cose.2017.06.010 25. Cyber Physical Systems: the need for new models and design paradigms, Carnegie Mellon University, Access Date: 01–02–2022, Access time: 08:01pm 26. Cyber-physical systems, IEEE Control Systems Society, 2011, Access Date: 01–02–2022, Access time: 09:33am 27. Lee, J., Lapira, E., Bagheri, B., Kao, H.: Recent advances and trends in predictive manufacturing systems in big data environment. Manufact. Let. 1(1), 38–41 (2013). https://doi.org/ 10.1016/j.mfglet.2013.09.005 28. Ant’on, S.D., Fraunholz, D., Lipps, C., Pohl, F., Zimmermann, M., Schotte, H.D.: Two decades of SCADA exploitation: a brief history. In: 2017 IEEE Conference on Application, Information and Network Security (AINS) 29. https://resources.infosecinstitute.com/topic/scada-security-of-critical-infrastructures/?utm_ source=feedburner&utm_medium=feed&utm_campaign=Feed%3A%20infosecResour ces%20%28InfoSec%20Resources%29 30. https://www.thomasnet.com/insights/the-future-of-scada-in-2019-iiot-tech/ Access Date: 01–02–2022, Access time: 07:49pm 31. Debouza, M., Al-Durra, A., EL-Fouly, T.H.M., Zeineldin, H.H.: Survey on microgrids with flexible boundaries: Strategies, applications, and future trends. Electric Power Syst. Res. 205, 107765 (2022). https://doi.org/10.1016/j.epsr.2021.107765

A Shap Interpreter-Based Explainable Decision Support System for COPD Exacerbation Prediction

Claudia Abineza1(B), Valentina Emilia Balas2, and Philibert Nsengiyumva1

1 University of Rwanda, KN 67, Kigali 3900, Rwanda
[email protected], [email protected]
2 Aurel Vlaicu University of Arad, 77 Revoluţiei Boulevard Arad, 310130 Arad, Romania
[email protected]

Abstract. COPD is a chronic lung disease that can exhibit exacerbations, and an exacerbation requires additional therapy. When managing COPD with a pulse oximeter, specific SpO2 (blood oxygen saturation) levels are targeted depending on the prognosis of the disease and the current state of symptoms. GOLD (Global Initiative for COPD) 2020 proposed evaluating changes in respiratory symptom severity for assessing exacerbation occurrence and personalizing therapy. We propose a machine learning model that evaluates changes in symptom severity in association with SpO2 for exacerbation prediction. Using a Shap interpreter, the contribution of each exacerbation predictor is computed at the individual level. This can guide clinicians in evaluating whether, and how much, SpO2 contributed to an individual's exacerbation probability, helping them decide on the initiation of oxygen therapy. This personalized approach contrasts with delivering oxygen without aligning it with a patient's actual needs. The pre-trained model was integrated with the Shap interpreter to design the DSS (Decision Support System). The DSS was then validated on 56 patients, attaining a concordance rate of 94.9% when comparing the DSS decisions with medically made decisions about exacerbation events. We report that the proposed DSS improves on the current clinical assessment, which does not interpret the contribution of an individual's current SpO2 to the exacerbation risk. Keywords: COPD exacerbation prediction · worsening symptoms · SPO2 · Shap interpreter · Oxygen therapy

1 Introduction COPD is a chronic respiratory disease characterized by airflow limitation and persistent respiratory worsening symptoms such as shortness of breath that gets worse with exertion, a chronic, productive cough (often bringing up mucus/sputum), wheezing and a tight feeling in the chest, and frequent lung infections [1, 2]. The disease course time is characterized by exacerbation episodes, which consist of symptoms extending beyond normal day-to-day variations. These symptoms worsen acutely at the onset, necessitating a change in medication or hospitalization. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2024 R. Silhavy and P. Silhavy (Eds.): CoMeSySo 2023, LNNS 935, pp. 447–458, 2024. https://doi.org/10.1007/978-3-031-54820-8_36


Exacerbation frequency causes lung function decline and increases the frequency and the risk of future exacerbations. Prediction of COPD exacerbation may promote COPD health. The GOLD 2020 update [3] suggests assessing respiratory symptoms over time by asking targeted questions about changes in baseline symptomatology. This helps capture the evolution of breathlessness, defined as a limitation in performing daily activities due to shortness of breath [3–5]. Moving from one mMRC (modified Medical Research Council) grade to the next requires a substantial change in functional abilities, making it impractical for timely and personalized therapy decisions. When using a pulse oximeter to diagnose COPD [6], there is a challenge in evaluating changes in symptom severity in the context of SPO2 for exacerbation prediction. Specifically, when determining the SPO2 level for an individual that could increase the risk of exacerbation, there is no established method to adapt the SPO2 level or range to the individual’s COPD exacerbation risk [6–11]. While there are recommendations regarding target SPO2 levels based on GOLD or disease stages [6], challenges arise when determining if SPO2 is a predictor of COPD worsening or exacerbation for an individual. Individuals in advanced disease or GOLD stages may have become accustomed to their low oxygen levels, whereas those with likely normal baseline SPO2 levels may not [9, 10]. This research aims to predict COPD exacerbation by evaluating individual changes in symptom severity in association with SPO2 through a Decision Support System (DSS) [12, 13]. By integrating a pre-trained model with a Shap interpreter [14], more details on each predictor’s positive or negative contribution to the predicted probability for an individual label (exacerbation or not) are provided [15]. This information can assist clinicians in making decisions on an individual case regarding the initiation of Oxygen therapy. The DSS interface, resulting from the Shap interpreter, was validated during a prospective study involving the evaluation of 56 patients for exacerbation events, specifically addressing the decision to initiate Oxygen therapy. In this study, we applied three classification models to a COPD dataset containing changes in symptom severity, SPO2, and Heart rate – the vital signs acquired by the Pulse Oximeter. Among these models, Random Forest (RF) outperformed Logistic Regression (LR) and Decision Tree (DT), achieving an accuracy of 95.3%. Consequently, RF was integrated with a Shap interpreter for our model evaluation and clinical usage. The DSS exhibited a high concordance rate of 94.9% during the evaluation on 56 patients for exacerbation events, compared to decisions made by medical professionals. Moreover, the DSS method proves to be a superior approach for clinical practice, offering valuable insights into individual COPD exacerbation predictors and the predicted probability of exacerbation risk.

2 Previous Work While investigating previous studies that utilized symptoms and Pulse Oximeter (P.O) vital signs for COPD exacerbation prediction, it was noted that no study provided an interpretation of the contribution of COPD exacerbation predictors. Specifically, none computed the predicted probability for an individual and determined if SPO2 is among the exacerbation risk factors.


Generally, previous studies offered methods to evaluate COPD exacerbation by determining SPO2 ranges within GOLD-based severity groups. However, these studies lacked a mechanism to adapt or evaluate the SPO2 of an individual in relation to the risk of exacerbation. Furthermore, these models did not provide interpretations about COPD exacerbation risk or the predicted probability of individual exacerbations. This is the case for studies in [16–19] that are conducted to predict COPD by analyzing patients data and distinguishing stable COPD from exacerbation. However, authors didn’t adapt a specific SPO2 value to an individual impairment or exacerbation. The authors in [16], fitted a Finite State Machine to the telemonitored data (as reported) from patients (the reported symptoms patterns were used for defining exacerbation or not “exacerbation time point”), by labeling vital signs data from 7-day stable periods and 7-day prodromal periods (ahead of the exacerbation events). The mean and gradient of SPO2 and HR and breathing rate for each period, were analyzed and vital signs data for each period, were input to a 2 classifier to assign the data to either the stable or prodromal class (exacerbation prediction). Authors reported that their models could not adapt, for an individual, SPO2 values range for exacerbation event. Moreover, COPD predicted exacerbation episodes with 60%–80% sensitivity resulted in 68%–36% specificity. Authors in [17–19] analyzed COPD data and determine a composite Oximetry Score (mean magnitude of SpO2 fall and HR rise) to distinguish symptom variation from COPD exacerbation and in [18], authors by examining symptoms overall score and physiological measurements collected during a pilot COPD telemonitoring, studied the association between FEV1, pulse and SpO2 for the risk of exacerbation. The latter stated that the mean pulse rate increased from 87 to 94 /min and the mean SpO2 fell from 93.6 to 92.4% prior to exacerbation and that physiological variables did not differentiate between exacerbations and isolated bad days. However, the studies didn’t predict individual COPD risk as may depend on individual measurements, especially SPO2 levels. Other studies did not consider individual symptoms but instead applied predetermined ranges of values and thresholds for SpO2 and HR to determine the onset of COPD exacerbation and other worrisome events during the course of COPD. However, these studies neither predicted COPD exacerbation risk nor provided interpretability for an individual’s SPO2 level. They lacked the means to evaluate COPD risk exacerbation over time by adapting individual changes in vital sign values in association with the dynamics of symptom severity. It is the case in [20], where authors labeled (using GOLD likely range values thresholds) SPO2 and HR from patients sensors into mild, moderate, severe and very severe. Extracted features from heart rate and SPO2 data time series signals are the percentages of the total duration of labeled minutes, which were used (through computation of events of interest score) to classify the severity for COPD, Asthma, and healthy control groups). COPD, asthma, and healthy control groups showed the different percentage of time duration at various severity level for both signals. COPD groups are mild, moderate, severe and very severe. Achieved accuracy for COPD is 76%, and for asthma is 60%. The sensitivity is 45% for the model for COPD and 78% for asthma. Specificity is 100% for both diseases. 
Similarly, without predicting COPD risk over time, in [21], authors applied a Finite State machine, by computing and fitting a predetermined sequence of actions according to a sequence of worrisome events based


on SPO2 and HR data thresholds as pre-defined by medical experts. Deduced parameters are computed and adjusted, during training of the model, to guide a personalized training about considered worrisome onsets. The validation was done by comparing the annotations done by the DSS with GOLD standard and medical experts by attaining average performance around 90%. However, there is no way to analyze SPO2 and HR, utilized vital signs for oxygen therapy initiation. Although these studies utilized likely thresholds of SpO2 values and ranges for each GOLD group to determine exacerbation, neither of the studies adapted a specific SpO2 level (as it can influence COPD exacerbation risk [5]) to the individual symptom burden over time nor provided the means to evaluate individuals according to their current symptom patterns in association with their SpO2 level, likely to estimate COPD exacerbation risk over time. Specifically, the studies predicted targeted events: GOLDbased severities and exacerbation [20], exacerbation and other worrisome events [21], by pre-defining thresholds of HR and SpO2 for different worrisome events. The abovestated studies neither adapted nor provided transparent means to compute individual vital signs value changes or contributions vis-à-vis COPD exacerbation risk over time. This might be due to the methods used by generalizing the vital signs value range and thresholds to a group of people based on threshold values like GOLD stages [12] and without considering each individual worsening symptom severity change. Their model lacked the means to evaluate changes in the severity of symptoms in association with SpO2 vital sign changes [6] for exacerbation (or risk) prediction. Closely related to our study, the studies in [9, 16, 22, 23] used a pulse oximeter by considering individual symptoms severity changes to predict COPD patients at high risk or not of an exacerbation event. However, their models lacked interpretability due to their “black-box” nature which could not provide enough interpretations about symptoms and vital signs (positive or negative) contributions to the individual COPD exacerbation (risk) prediction. Contrarily to previous works, this research considered for the exacerbation prediction, individual symptoms severity change in the association of pulse oximeter acquired vital signs by providing a way to analyze SPO2 level of an individual according to his/her symptoms severity change. The latter, when evaluated by analyzing individual predicted exacerbation probability or risk, could help clinician to determine whether a patient is in stable, in exacerbation state, or recovering from exacerbation. Therefore, clinicians could make a more intelligent decision about what SPO2 levels to target, based on individual GOLD stage and exacerbation state [6]. Shap interpreter was integrated with the pre-trained RF model for DSS design which provided more insights on the evaluation of COPD case and predictors of exacerbation during clinical practice. The model was validated during a prospective study, on 56 COPD patients, by attaining a concordance rate of 94.9%, compared to medical doctors made decisions. The evaluation and interpretation of a COPD case on a proposed DSS can be seen in the Sect. 4.


3 Materials and Methods 3.1 Dataset of COPD Patients This study leverages a retrospective dataset (from the paper in [9]) that was labelled by pulmonologists. The dataset contains stable and acute COPD cases with worsening symptom severity changes. The original dataset contains about 25 features. We selected and derived only the predictors and symptom dynamics that can influence COPD exacerbation (risk), as considered and interpreted in a pulse oximetry protocol. Descriptions of the selected features are shown in Table 1; they include predictors of symptom severity and vital-sign severity, such as SPO2 and HR measurements, current worsening symptom severity, the duration of the recent worsening, and COPD controller medication compliance.

Table 1. Descriptions of patient predictors (Predictor — Predictor values — Units/Type).

Symptoms
- Shortness of breath — Same as / Less / More than usual — 1, 2, 3 (categorical)
- Cough — Same as / Less / More than usual — 1, 2, 3 (categorical)
- Wheezing — Same as / Less / More than usual — 1, 2, 3 (categorical)
- Sputum — Both increased sputum production and change in sputum color / Change in sputum color / Increased sputum production / No change in sputum — Yes/No (categorical)

Controller medication compliance — Same as / Less / More than usual — 1, 2, 3 (categorical)

Duration of recent worsening symptoms — No / Within 1 to 3 days / More than 3 days / Last 24 h — categorical

Vital signs
- Oxygen saturation — SPO2 value — % (continuous)
- Heart Rate — HR value — BPM (continuous)


The (retrospective) dataset contained 2409 cases. The categorical variables were transformed with label encoding, and the dataset was then balanced by upsampling the minority class (831 "not exacerbation" cases against 1578 positive cases) [24, 25]; the final subset therefore contains 3240 COPD cases. Each record represents a COPD case, and the prediction outcome is exacerbation or not exacerbation. The dataset was divided such that 75% of the data was used for training and 25% for testing. Three classifiers were applied and tested on the final subset. For the DSS design and model validation, the best-performing classifier was integrated with a Shap interpreter interface. During validation of the designed DSS, 56 patients were assessed through the interface by computing the (risk of) exacerbation of each patient and making a decision on the individual oxygen level. SPO2 and HR were measured in a quiet environment, after resting the patient for almost 10 min, on both arms, and the average of the two values was recorded. SPO2 and HR were each measured for a whole minute, and the value that remained stable for a long period during the measurement was considered.
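A minimal sketch of the preprocessing pipeline described above (label encoding, upsampling the minority class, and a 75/25 split) is given below. It is illustrative only: the file name, column names, and target label are hypothetical placeholders and do not reproduce the authors' code.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder
from sklearn.utils import resample
from sklearn.model_selection import train_test_split

df = pd.read_csv("copd_cases.csv")   # hypothetical file name
categorical_cols = ["shortness_of_breath", "cough", "wheezing", "sputum",
                    "medication_compliance", "worsening_duration"]  # placeholder names

# Label-encode the categorical predictors.
for col in categorical_cols:
    df[col] = LabelEncoder().fit_transform(df[col])

# Upsample the minority class ("not exacerbation") to match the majority class.
majority = df[df["exacerbation"] == 1]
minority = df[df["exacerbation"] == 0]
minority_up = resample(minority, replace=True, n_samples=len(majority), random_state=42)
balanced = pd.concat([majority, minority_up])

# 75/25 train/test split.
X = balanced.drop(columns="exacerbation")
y = balanced["exacerbation"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42, stratify=y
)
```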


Random Forest Model Random Forest [28] is a modification of the bagging ensemble method that aims to reduce variance and improve generalization by introducing additional randomness and reducing the correlation between the trees. By incorporating these randomization techniques, Random Forest minimizes the generalization error by balancing the bias-variance tradeoff. The randomness reduces the correlation between the trees, making the ensemble more diverse and less prone to overfitting. At the same time, the combination of different samples, random feature subsets, and shallow trees allows the model to capture important patterns and relationships in the data. 3.3 Evaluation Metrics The performance of our proposed model was evaluated in the range of 0–100%, according to accuracy, specificity, sensitivity, precision, recall, and F1-score, the metrics based on the confusion matrix [29]. The exacerbation risk prediction evaluation, on a Shap interpreter interface, is computed in the percentage (%). 3.4 The Context of SHAP Interpreter In essence, SHAP provides a way to distribute the credit for a prediction among the features and quantify how each feature influences the predicted label probability. This is valuable for understanding not just which features are important, but also how they collectively contribute to the model’s decisions [30]. One of the powerful aspects of SHAP is that it can compute not just the feature importance scores (SHAP values) but also the predicted label probabilities. In a classification task, Shap interpreter computes the probabilities of each class for a given input instance Here’s how it works: • Background Dataset: Create a background or baseline dataset representing average or baseline feature values. • Calculation of Shapley Values: Compute SHAP values for each feature for each class, representing their contributions to differences in predicted probabilities. These values represent the contribution of each feature to the difference between the predicted probability of a given class and the expected probability (usually the average probability) of that class over all possible input combinations • Predicted Label Probability: Use SHAP values to explain how each feature affects the predicted label probability. Sum the SHAP values and add them to the baseline prediction probability to get the raw prediction value. • Apply Activation Function (if needed): Depending on the model and task, apply an activation function to the raw prediction value (e.g., sigmoid for binary classification) to obtain the predicted probability within the appropriate range.


• Convert to Percentage: Multiply the predicted probability (ranging between 0 and 1) by 100 to express it as a percentage.
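The SHAP workflow listed above maps onto the shap library's tree explainer roughly as sketched below. This is an illustration that assumes the trained Random Forest (rf) and the test set from Section 3.1; return shapes and attribute names can differ between shap versions.

```python
import shap

# rf is the trained RandomForestClassifier, X_test the held-out feature matrix.
explainer = shap.TreeExplainer(rf)
shap_values = explainer.shap_values(X_test)       # one array per class (version-dependent shape)
base_value = explainer.expected_value[1]          # baseline probability for class "exacerbation"

# Contribution of each predictor for one patient (row 0), class "exacerbation":
patient_contribs = dict(zip(X_test.columns, shap_values[1][0]))

# Baseline + sum of SHAP values recovers the model's predicted probability,
# which can then be expressed as a percentage.
predicted_prob = base_value + shap_values[1][0].sum()
print(f"Predicted exacerbation probability: {predicted_prob * 100:.1f}%")
```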

4 Results and Discussion Firstly, the Logistic Regression, Decision Tree, and Random Forest classifiers were applied to the dataset with their default parameters, attaining accuracies of LR: 77.3%, DT: 80.2%, and RF: 93.1%. As RF reported the highest performance, we optimized it using randomized search and grid search cross-validation. We therefore applied the RF model to the following parameter subspace for randomized and grid search via 5-fold cross-validation: {'criterion': ['gini', 'entropy'], 'max_depth': [None, 7, 8, 9, 10, 15, 20, 25], 'max_features': [0.5, 1.0], 'n_estimators': [3, 5, 10, 15, 20]}. The best parameters ('criterion': 'entropy', 'max_depth': None, 'max_features': 0.5, 'n_estimators': 20) were obtained with the grid search and were used to train the RF model. Figure 1 shows the confusion matrix for the predictions of the RF model with the optimal settings on the validation subset. The RF model achieved an accuracy of 95.3%.
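The hyperparameter search described above can be sketched with scikit-learn as follows; the parameter grid mirrors the one reported in the text, while the variable names, random seed, and scoring choice are illustrative assumptions rather than the authors' exact code.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.metrics import accuracy_score

param_grid = {
    "criterion": ["gini", "entropy"],
    "max_depth": [None, 7, 8, 9, 10, 15, 20, 25],
    "max_features": [0.5, 1.0],
    "n_estimators": [3, 5, 10, 15, 20],
}

search = GridSearchCV(RandomForestClassifier(random_state=42),
                      param_grid, cv=5, scoring="accuracy", n_jobs=-1)
search.fit(X_train, y_train)

# Reported best: criterion='entropy', max_depth=None, max_features=0.5, n_estimators=20
rf = search.best_estimator_
print("Best parameters:", search.best_params_)
print("Test accuracy:", accuracy_score(y_test, rf.predict(X_test)))
```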

Fig. 1. Confusion matrix

Secondly, a SHAP interpreter was integrated with the RF model and used to assess COPD cases during a prospective study. The steps involved are: fitting the model, building the dashboard, building the explainer object from the model and the test data, passing this explainer object to the explainer dashboard, and then running it. The interfaces of the designed application are depicted in Fig. 2 below. Figure 2a illustrates the average impact of each feature on the model prediction; Fig. 2b presents the model performance metrics; Fig. 2c shows how the model prediction varies as a predictor value changes (here, SPO2 values); Fig. 2d illustrates entering a patient's clinical signs for exacerbation risk prediction; Fig. 2e presents the predicted exacerbation probability for the patient (measurements) in Fig. 2d; and Fig. 2f shows the contribution of each predictor to the individual classification in Fig. 2d. The values 1, 2, 3, 4 denote categorical values (including symptom severity change).
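The dashboard-building steps listed above (fit the model, build an explainer from the model and test data, pass it to the explainer dashboard, run it) correspond roughly to the explainerdashboard package cited in [15]; the sketch below is an assumption-laden illustration, and the exact API may vary by version.

```python
from explainerdashboard import ClassifierExplainer, ExplainerDashboard

# rf is the tuned Random Forest; X_test / y_test are the held-out COPD cases.
explainer = ClassifierExplainer(rf, X_test, y_test,
                                labels=["not exacerbation", "exacerbation"])

# Builds the interactive interface (feature importances, performance metrics,
# individual predictions, and per-predictor contribution plots).
ExplainerDashboard(explainer, title="COPD exacerbation DSS").run(port=8050)
```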


Fig. 2. DSS interfaces


For the validation of our model, two doctors independently classified each evaluated COPD case as exacerbation or not. They considered other vital signs that might characterize the GOLD stage of an individual, such as (chronic) hypoxia, and used the same parameters as our model to classify patients. When a patient did not adhere to the prescribed medication, they exhibited a more-than-baseline worsening in symptom severity, prompting the doctors to recommend adhering to the prescribed medication, changing the current therapy (for example, from inhaled steroids to steroid pills), or adjusting its dosage. No severe exacerbation cases were found; other patients whose SPO2 was below 90 percent were advised to return for further (laboratory) tests to determine whether advice related to their SPO2 levels could be provided. The comparison was conducted between the decisions made by the doctors and those from the application interface of our model: if the model computes a risk of exacerbation greater than 50% for a COPD patient, the case is considered an exacerbation; otherwise, it is not. The achieved concordance is 94.9% based on 56 patients evaluated for an exacerbation event, with 9 patients showing mismatches. The designed Decision Support System (DSS) demonstrated promising results, comparable to those determined by medical experts when labeling exacerbation or non-exacerbation. Furthermore, our DSS goes beyond the medical method by assessing the (positive or negative) contribution of each individual COPD predictor to the risk of exacerbation. Unlike the medical approach, our model evaluates SPO2 without depending on a fixed 90 percent threshold for oxygen initiation; instead, it provides a more nuanced analysis. Additionally, calculating the risk of exacerbation in percentage terms enables proactive measures to be taken before an exacerbation event occurs.
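A small illustration of the decision rule and concordance computation described above (a predicted risk above 50% counts as exacerbation, and agreement is measured against the clinicians' labels) is shown below; the probability and label arrays are placeholders, not the study data.

```python
import numpy as np

# Placeholder arrays: model-predicted exacerbation probabilities and clinicians' labels.
model_probs   = np.array([0.82, 0.31, 0.67, 0.12, 0.91])
doctor_labels = np.array([1,    0,    1,    0,    1])     # 1 = exacerbation

# Apply the 50% decision threshold and compute the concordance rate.
model_decisions = (model_probs > 0.5).astype(int)
concordance = (model_decisions == doctor_labels).mean() * 100
print(f"Concordance rate: {concordance:.1f}%")
```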

5 Conclusion Utilizing a Shap interpreter for the design of the Decision Support System (DSS) model provided valuable insights into the prediction process. It guided clinicians on whether to initiate oxygen therapy for evaluated COPD cases, especially by adapting a specific SPO2 contribution to an individual’s COPD impairment or exacerbation risk. For future work, we recommend evaluating our model with continuously monitored COPD cases in the same clinical context. This evaluation should consider additional COPD risk factors such as comorbidities or other hypoxemia contexts. Such an approach could offer guidance for future clinical trials, aiming for a more specific and targeted COPD management. Acknowledgement. We would like to express our gratitude to Prof. Dr. Peter Lucas from the University of Twente, Enschede, the Netherlands, for providing invaluable guidance during this research. Funding. The research reported in this paper has been funded firstly, by The International Development Research Centre (IDRC) and The Swedish International Development Cooperation Agency (SIDA), under ‘The Artificial Intelligence for Development in Africa (AI4D Africa) program with the management of The African Center for Technology Studies (ACTS)’ and secondly by The African Centre of Excellence in the Internet of Things (ACEIoT).



A Generic Methodology for Designing Smart Environment Based on Discrete-Event Simulation: A Conceptual Model

Shady Aly1(B), Tomáš Benda2, Jan Tyrychtr2, and Ivan Vrana2

1 Faculty of Engineering, Helwan University, Helwan, Egypt
[email protected]
2 Department of Information Engineering, Faculty of Economics and Management, Czech University of Life Sciences Prague, Prague, Czech Republic

Abstract. Designing a smart environment is a high-stakes undertaking. It starts with analysing the environment's goals and its functional and information requirements, and the process ends with the decision to employ smart objects and enabling technologies. One issue is the risk involved in investing in these technologies, which necessitates ensuring the justification, efficiency, and effectiveness of the employed technologies. Consequently, there is a need to test these technologies prior to the investment decision, which is difficult to achieve in reality. One feasible solution is a simulation model capable of testing the impact of these technologies. This research presents a new and pioneering methodology for simulating the implementation and testing of smart enabling technologies. Keywords: Smart environment · Discrete-event simulation (DES) · Fuzzy Logic · Smart Enabling Technologies

1 Introduction
Smart enabling technologies exhibit various useful attributes and are the major underpinning of smart environments. Logically, a given environment comprises certain activities, processes, or functionalities. Smart objects or smart enabling technologies act to provide the information needed to enhance these functionalities or processes, leading to reduced times and costs or improved quality. The decision to invest in smart enabling technologies in order to build a smart system or environment must be preceded by sufficient justification. The design process of a smart environment is a high-stakes undertaking: it starts with analysing the environment's goals and its functional and information requirements, and it ends with the decision to employ particular smart objects and enabling technologies. One issue is the risk involved in investing in these technologies, which necessitates ensuring their justification, efficiency, and effectiveness. Consequently, there is a need to test these technologies prior to the investment decision, which is difficult to achieve in reality. One feasible solution is a simulation model capable of testing their impact.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024
R. Silhavy and P. Silhavy (Eds.): CoMeSySo 2023, LNNS 935, pp. 459–468, 2024. https://doi.org/10.1007/978-3-031-54820-8_37


A discrete-event simulation (DES) models the processes and operations of a system as a sequence of discrete events in time. Each event marks a particular state of the system and occurs at a particular instant in time. It is assumed that no change in the system occurs between consecutive events; accordingly, the simulation clock jumps to the occurrence time of the next event, which is called next-event time progression. DES is a technique used to model real-world systems made up of logically separate processes that autonomously progress through time. This research presents a logical conceptual methodology for simulating the implementation and testing of smart enabling technologies. In order to investigate the feasibility and justification of investment in smart enabling technologies, some model-based experiment or test should be conducted. A typical experiment measures the performance of the environment or system before and after implementing a set of smart enabling technologies (SETs). Quantifying the impact of various SETs can guide the design of the smart environment. Due to the inherent vagueness and uncertainty of such a decision problem, adequate hybrid techniques are needed. Therefore, we introduce a novel methodology based on a discrete-event simulation model of the processes of the modelled system or environment. This enables running the developed model before and after the deployment of SETs in the environment. The effect of implementing a SET should appear as a modification or improvement of process parameters. Thus, the developed simulation model is first run before SET implementation, where the process parameters have their original values. Then, upon virtual deployment of the SETs in the environment to affect the process parameters, the model is re-run and the variation in performance is reported.
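To make next-event time progression concrete, the following minimal Python sketch (an illustrative toy of ours, not part of the authors' methodology) keeps a future-event list in a heap and advances the simulation clock directly from one event to the next.

import heapq

def simulate(initial_events, horizon):
    """Process (time, name) events in chronological order up to the horizon."""
    queue = list(initial_events)
    heapq.heapify(queue)                 # future-event list ordered by event time
    clock = 0.0
    while queue:
        time, name = heapq.heappop(queue)
        if time > horizon:
            break
        clock = time                     # jump straight to the next event (nothing happens in between)
        print(f"t={clock:5.1f}  {name}")
        # in a full model, state changes and follow-up events would be scheduled here

simulate([(1.0, "resource becomes available"),
          (2.0, "entity arrives"),
          (5.5, "process finishes")], horizon=10.0)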

2 Literature
Discrete-event simulation has wide applicability in modelling operations and processes in the service and production sectors. Basaglia et al. [5] implemented a DES model for analysing and improving patient flows in a hospital. Morabito et al. [24] proposed the use of DES in simulating the implementation of digital twins. Hofmann et al. [16] introduced a brief approach to deploying Amazon Web Services for online discrete-event simulation. Rocha and Lopes [31] presented research on bottleneck prediction and data-driven discrete-event simulation for a balanced manufacturing line. Hu et al. [17] investigated the potential of applying DES to modelling and improving a metro-based urban underground logistics system network. Research on smart healthcare systems, on the other hand, is growing rapidly due to the accelerated progress and continual emergence of new smart environment enabling technologies (SETs) that allow the implementation of modern technologies such as the IoT (Internet of Things), cloud computing, robotics, etc. Khan and Chattopadhyay [19] presented a smart health monitoring system that uses biomedical sensors to check a patient's condition and report it over the internet to the concerned doctors, either through an LCD monitor or through a smart phone application. Bansal and Gandhi [4] focused on combining IoT with Big Data technologies to transmit ECG sensor data from patients to doctors and other concerned persons for effective and efficient smart health monitoring.


Almazroa et al. [3] proposed a smart sensing mobile application that collects body temperature data through a sensor network. Rayan et al. [29] reviewed the machine learning approaches applied in smart health. They stated that machine learning is used in many smart health applications, such as Alzheimer's disease, glaucoma diagnosis, Intensive Care Unit (ICU) readmissions, bacterial sepsis diagnosis, and cataract detection, and pointed out that the Support Vector Machine (SVM), Artificial Neural Network (ANN), and Convolutional Neural Network (CNN) are the most widely utilized. Rajakumari et al. [27] presented an optimized smart hospital scheduling system based on a convolutional neural network to schedule patients without the heavy burden of consulting a doctor for their needs. Ahmid et al. [1] proposed an intelligent health monitoring system based on IoT and agents. Kamruzzaman [18] used artificial intelligence techniques to optimize patient care plans within an architecture for a smart healthcare system. Xie et al. [36] proposed a design of a hospital IoT smart system and nucleoside drugs for the treatment of hepatitis and liver cirrhosis. Zhao et al. [37] presented a smart IoT platform in hospital that can relieve postoperative pain; they developed patient pain models to describe the kinetics of chronic pain in people's physiological and psychological reactions. Smart enabling technologies have been exploited successfully to improve performance in health care through several smart features, including:

• Monitoring health parameters [25]
• Hospital asset tracking [26]
• Automated patient check-in [20]
• Location-aware access to the nearest doctor's door [6]
• Indoor navigation and optimized patient flow [11]
• Enhancing occupants' safety (patients, physicians, nurses, lab technicians, etc.) [10, 14, 32]

Concerning the application of simulation techniques to smart environment modelling, very few studies have applied the discrete-event simulation approach. Latorre-Biel et al. [21] introduced a simulation model of traffic in smart cities. They developed a Petri-net-based model that can easily be adapted to different cities or road networks by adding the layout of the city streets and roads, as well as traffic lights or the number and type of vehicles. Vasilateanu and Bernovici [34] pointed out that smart home simulation systems enable testing different smart home settings, including sensors and actuators, so as to satisfy the personalized needs of the owners while requiring minimum investment. Bicakci and Gunes [7] proposed a hybrid simulation system for testing artificial intelligence algorithms used in smart homes. First, they developed a real smart home system installed in a room for hybrid simulation. Then a simulated house was developed with the desired number of rooms, the desired number of smart home components with different tasks, virtual individuals who use the house, and weekly life scenarios for these individuals. They operated the hybrid simulation smoothly under different conditions for two months with different artificial intelligence algorithms. Friederich et al. [12] proposed a generic data-driven framework for the automated generation of simulation models as a basis for digital twins of smart factories. They stated that the goal of the framework is to minimize and fully define, or even eliminate, the need for expert knowledge in the extraction of the corresponding simulation models.


This paper is organized as follows. The next section introduces the basic components, features, and enabling technologies of smart environments. Section 4 then presents the proposed methodology for assessing smart enabling technologies based on discrete-event simulation models.

3 Smart Environment, Technologies, and Smartness Features
A smart environment typically consists of the following basic building blocks:

• Environmental objects: equipment, devices, instruments, furniture, etc.
• Sensors: perceive, measure, and monitor the activities and statuses of devices, objects, users, or resources, and send the related readings or messages to a control system. Examples include indoor positioning and navigation sensors. Motion sensors are used to detect motion and presence [9, 28]. Radio Frequency Identification (RFID) is used for positioning and detection [28]. Indoor navigation systems, such as Bluetooth Low Energy (BLE) beacons [2, 22], are used to guide people to indoor destinations. Outdoor positioning sensors such as GPS trackers currently have plenty of applications [15]. Physiological sensors also have many applications: EMG (electromyography) monitors muscle health and detects movement disorders [14, 28]; electrocardiography (ECG) monitors cardiac parameters [28, 30]; eye-tracking systems [8] measure the direction of eye movement; and computer-brain interfaces (CBI) use brain signals for controlling devices and assisting disabled persons [35]. Other sensors serve different applications, such as smoke detectors [10], fall sensors [14], and pressure sensors [23].
• Actuators: receive control signals from the microcontroller to execute real-time physical actions. Actuators usually include motors, such as stepper or DC motors, needed to exert the required actuation or movement.
• Control system: a computer controller or microcontroller that applies a set of programmed rules based on the information received from sensors and the feedback from actuators; an example is a home gateway [33].
• Middleware: an intermediary software layer that integrates and interoperates heterogeneous devices.
• Communication networks: enable communication between the control system, sensors, and actuators. These include wireless networks (LAN, Wi-Fi, Bluetooth, etc.) and, finally, the interfaces that enable users to communicate with the system [13, 28].

In this research, in order to demonstrate the implementation of the proposed approach, we consider the impact of three indoor positioning and navigation smart key enabling technologies:

• RFID
• Motion sensor
• Bluetooth Low Energy (BLE)


4 Simulation-Based Assessment of Smart Enabling Technology: Proposed Methodology
In this section, we present the core idea of this article by defining a methodology for designing a smart environment based on the outcomes of a simulation model that studies the impact of SETs on process performance or process parameters, whether improvement or decay. The focus is on introducing a novel approach for assessing the functional performance of smart enabling technologies through a simulation model in which the impact of the proposed technologies is measured and assessed, so that the feasibility of investing in them can be technically and economically evaluated and justified. Our proposal is to construct a discrete-event simulation model of the processes of the environment in question and then measure the performance of the modelled environment before and after the virtual implementation of the SETs.
4.1 Defining the Model: The Process Parameters
We consider three process parameters that can be affected by SET implementations:

• Process times
• Failures (resource or entity failures)
• Costs

These concepts are illustrated in the following figure (Fig. 1). Each process uses resources. Resources control the process time and may exhibit failures with an associated failure rate. Additionally, these resources have cost rates for busy and idle times.

Fig. 1. Modelling the impact of the SET on the three process performance parameters.
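As a concrete, hypothetical illustration of how these three parameter families might be represented in a model, the short Python sketch below groups them per resource and per process; the class and field names are our own and are not taken from the paper, and the numbers are purely illustrative.

from dataclasses import dataclass, field
from typing import List

@dataclass
class Resource:
    name: str
    failure_rate: float        # expected failures per unit of busy time
    repair_time: float         # mean delay added when a failure occurs
    busy_cost_rate: float      # cost per unit of busy time
    idle_cost_rate: float      # cost per unit of idle time

@dataclass
class Process:
    name: str
    mean_process_time: float   # parameter of the process-time distribution
    resources: List[Resource] = field(default_factory=list)

# Example: a reception process served by one clerk resource (illustrative values only).
reception = Process("reception", mean_process_time=8.0,
                    resources=[Resource("clerk", failure_rate=0.02, repair_time=15.0,
                                        busy_cost_rate=1.5, idle_cost_rate=0.4)])
print(reception)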

The definitions of the three key DES simulation model parameters are as follows:

1. Process times
a. p_i: the i-th process of the model or system.


b. pt_i: the process time of the i-th process.
c. R_ij: the j-th resource of the i-th process.
d. Each process within the model of the subject environment has a process time as one of its main parameters. This parameter is specified at the beginning of the simulation model development (see Fig. 2). Each process p_i, i = 1, 2, ..., n, has an associated process time t_i, defined as a probability density function based on a specific probability distribution.

2. Failures
a. f_ijk: the k-th failure that can occur to the j-th resource of the i-th process.
b. Failures are events that occur to resources during their use in the system, for several reasons. Each process p_i, i = 1, 2, ..., n, utilizes one or more resources r_ij, j = 1, 2, ..., m. Each resource may exhibit failure events f_ijk, k = 1, 2, ..., l, with associated rates Q_ijk, k = 1, 2, ..., l.

3. Costs
a. C^q_Rij: the cost of the j-th resource of the i-th process, where q = a for the idle-time cost and q = b for the busy-time cost.
b. Costs are associated with resources, activities, or entities. Each resource r_ij, j = 1, 2, ..., m, used in a process p_i, i = 1, 2, ..., n, has a cost rate value c_ij. This cost may be decomposed into more than one component (e.g., busy/idle or regular time/overtime).
c. SETs can reduce the cost of resources by mitigating the consequences of failures, reducing the time the resources are used, or improving the productivity of resources.

The next subsection presents the proposed logical design methodology.

4.2 Designing a Smart Environment Based on Discrete-Event Simulation: The Methodology
The proposed methodology consists of ten steps (see Fig. 2):

1. Analyse processes: the processes of the subject environment that serve, transform, or act on the environment's entities are analysed; the outcome is a process flow chart describing the key functionalities and activities.
2. Identify key entities, resources, and process parameters: the entities living in the environment are identified (e.g., people, customers, patients, clients), the resources used by each process to implement the necessary functionalities are defined, and the parameters, including costs, times, and failure types and rates, are specified.
3. Identify relevant SETs: the state-of-the-art SETs are reviewed, and only the group relevant to the given application context is considered and listed.
4. Define a logical SET implementation scenario: a smart environment designer or system analyst, together with the clinic operations manager, proposes an implementation scenario in which a group of SETs is adopted to improve operational performance, to be verified and validated; the scenario should specify the deployment locations, functionalities, and quantities.


5. Assess the impact of SETs on the parameters using fuzzy logic: a human expert uses the developed fuzzy logic scale to assess the impact of each SET on the cost, time, and failure-rate parameters of a process' resources.
6. Develop the discrete-event simulation model: the DES model of the environment's processes is built, incorporating the identified entities, resources, and parameters.
7. Run the model before SET implementation: a simulation modelling software package is used to implement and run the developed model without utilizing the SETs.
8. Run the model after SET implementation: the same software is used to re-run the model after utilizing the SETs.
9. Compare performance before and after SETs: the performance measures defined for the modelled environment are compared before and after SET implementation.
10. Conclude on SET performance improvement: the performance improvement achieved by the SET implementation is evaluated. (A minimal sketch of steps 7 to 9 follows this list.)
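To illustrate steps 7 to 9, the following sketch (our own illustrative example, not the authors' implementation) runs the same single-process model twice using the simpy package: once with the original parameters and once with hypothetical, fuzzy-assessed SET impact factors applied to the process time and failure rate. All parameter values and impact factors are assumptions.

import random
import statistics
import simpy

def run_model(mean_proc_time, failure_prob, busy_cost_rate, n_entities=500, seed=1):
    """Run one DES replication and return (average throughput time, busy-time cost)."""
    random.seed(seed)
    env = simpy.Environment()
    server = simpy.Resource(env, capacity=1)
    throughput_times, busy_time = [], [0.0]

    def entity(env):
        arrive = env.now
        with server.request() as req:
            yield req
            service = random.expovariate(1.0 / mean_proc_time)
            if random.random() < failure_prob:          # a failure adds a repair delay
                service += random.expovariate(1.0 / (3.0 * mean_proc_time))
            busy_time[0] += service
            yield env.timeout(service)
        throughput_times.append(env.now - arrive)

    def source(env):
        for _ in range(n_entities):
            env.process(entity(env))
            yield env.timeout(random.expovariate(1.0 / (1.2 * mean_proc_time)))

    env.process(source(env))
    env.run()
    return statistics.mean(throughput_times), busy_time[0] * busy_cost_rate

# Step 7: baseline run with the original process parameters.
before = run_model(mean_proc_time=10.0, failure_prob=0.10, busy_cost_rate=2.0)
# Step 8: re-run after virtual SET deployment; the factors 0.8 and 0.5 stand in for
# fuzzy-assessed reductions of process time and failure rate.
after = run_model(mean_proc_time=10.0 * 0.8, failure_prob=0.10 * 0.5, busy_cost_rate=2.0)
# Step 9: compare the performance measures before and after.
print(f"average throughput time: {before[0]:.1f} -> {after[0]:.1f}")
print(f"busy-time resource cost: {before[1]:.1f} -> {after[1]:.1f}")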

Fig. 2. Process of designing a smart environment based on discrete event simulation

5 Conclusion
This paper presents a logical conceptual model for using discrete-event simulation in designing a smart environment by studying the impact of each smart enabling technology, deployed individually or in groups, in the subject environment. The main advantage of the model is that it paves the way towards a no-cost investigation of the technical and economic feasibility of any proposed SET implementation scenario.
Acknowledgement. This work was conducted within the project "Smart environments - modelling and simulation of complex decision-making problems in intelligent systems" (2022B0010) funded through the IGA foundation of the Faculty of Economics and Management, Czech University of Life Sciences in Prague, and the project "Precision agriculture and digitization in the Czech Republic" (QK23020058) funded through the NAZV Program ZEMĚ 2022 of the Ministry of Agriculture of the Czech Republic.


References 1. Ahmid, M., Kazar, O., Benharzallah, S., et al.: An intelligent and secure health monitoring system based on agent. In: 2020 IEEE International Conference on Informatics, IoT, and Enabling Technologies (ICIoT). IEEE, Doha, Qatar, pp. 291–296 (2020) 2. AL-Madani, B., Orujov, F., Maskeli¯unas, R., et al.: Fuzzy logic type-2 based wireless indoor localization system for navigation of visually impaired people in buildings. Sensors 19, 2114(2019). https://doi.org/10.3390/s19092114 3. Almazroa, A., Alsalman, F., Alsehaibani, J., et al.: Easy clinic: smart sensing application in healthcare. In: 2019 2nd International Conference on Computer Applications & Information Security (ICCAIS). IEEE, Riyadh, Saudi Arabia, pp. 1–5 (2019) 4. Bansal, M., Gandhi, B.: IoT & big data in smart healthcare (ECG monitoring). In: 2019 International Conference on Machine Learning, Big Data, Cloud and Parallel Computing (COMITCon). IEEE, Faridabad, India, pp. 390–396 (2019) 5. Basaglia, A., Spacone, E., Van De Lindt, J.W., Kirsch, T.D.: A discrete-event simulation model of hospital patient flow following major earthquakes. Int. J. Disaster Risk Reduction 71, 102825 (2022). https://doi.org/10.1016/j.ijdrr.2022.102825 6. Bhadoria, R.K., Saha, J., Biswas, S., Chowdhury, C.: IoT-based location-aware smart healthcare framework with user mobility support in normal and emergency scenario: a comprehensive survey. In: Healthcare Paradigms in the Internet of Things Ecosystem. Elsevier, pp. 137–161 (2021) 7. Bicakci, S., Gunes, H.: Hybrid simulation system for testing artificial intelligence algorithms used in smart homes. Simul. Model. Pract. Theory 102, 101993 (2020). https://doi.org/10. 1016/j.simpat.2019.101993 8. Bissoli, A., Lavino-Junior, D., Sime, M., et al.: A human-machine interface based on eye tracking for controlling and monitoring a smart home using the Internet of Things. Sensors 19, 859 (2019). https://doi.org/10.3390/s19040859 9. Chan, M., Estève, D., Escriba, C., Campo, E.: A review of smart homes—present state and future challenges. Comput. Methods Programs Biomed. 91, 55–81 (2008). https://doi.org/10. 1016/j.cmpb.2008.02.001 10. Festag, S.: Analysis of the effectiveness of the smoke alarm obligation – experiences from practice. Fire Saf. J. 119, 103263 (2021). https://doi.org/10.1016/j.firesaf.2020.103263 11. Fixova, K., Macik, M., Mikovec, Z.: In-hospital navigation system for people with limited orientation. In: 2014 5th IEEE Conference on Cognitive Infocommunications (CogInfoCom). IEEE, Vietrisul Mare, Italy, pp. 125–130 (2014) 12. Friederich, J., Francis, D.P., Lazarova-Molnar, S., Mohamed, N.: A framework for data-driven digital twins of smart manufacturing systems. Comput. Ind. 136, 103586 (2022). https://doi. org/10.1016/j.compind.2021.103586 13. Gomez, C., Paradells, J.: Wireless home automation networks: a survey of architectures and technologies. IEEE Commun. Mag. 48, 92–101 (2010). https://doi.org/10.1109/MCOM.2010. 5473869 14. Han, H., Ma, X., Oyama, K.: Towards detecting and predicting fall events in elderly care using bidirectional electromyographic sensor network. In: 2016 IEEE/ACIS 15th International Conference on Computer and Information Science (ICIS). IEEE, Okayama, Japan, pp. 1–6 (2016) 15. He, Z.M., Peng, L., Han, H.Y., et al.: Research on indoor and outdoor comprehensive positioning technology based on multi-source information assistance. Procedia Comput. Sci. 166, 361–365 (2020). https://doi.org/10.1016/j.procs.2020.02.084 16. 
Hofmann, W., Lang, S., Reichardt, P., Reggelin, T.: A brief introduction to deploy amazon web services for online discrete-event simulation. Procedia Comput. Sci. 200, 386–393 (2022). https://doi.org/10.1016/j.procs.2022.01.237


17. Hu, W., Dong, J., Yang, K., et al.: Modeling Real-time operations of metro-based urban underground logistics system network: a discrete event simulation approach. Tunn. Undergr. Space Technol. 132, 104896 (2023). https://doi.org/10.1016/j.tust.2022.104896 18. Kamruzzaman, M.M.: Architecture of smart health care system using artificial intelligence. In: 2020 IEEE International Conference on Multimedia & Expo Workshops (ICMEW). IEEE, London, UK, pp. 1–6 (2020) 19. Khan, T., Chattopadhyay, M.K.: Smart health monitoring system. In: 2017 International Conference on Information, Communication, Instrumentation and Control (ICICIC). IEEE, Indore, pp. 1–6 (2017) 20. Krischer, J.P., Hurley, C., Pillalamarri, M., et al.: An automated patient registration and treatment randomization system for multicenter clinical trials. Control. Clin. Trials 12, 367–377 (1991). https://doi.org/10.1016/0197-2456(91)90017-G 21. Latorre-Biel, J.-I., Faulin, J., Jiménez, E., Juan, A.A.: Simulation model of traffic in smart cities for decision-making support: case study in Tudela (Navarre, Spain). In: Alba, E., Chicano, F., Luque, G. (eds.) Smart Cities, pp. 144–153. Springer International Publishing, Cham (2017) 22. Mohsin, N., Payandeh, S., Ho, D., Gelinas, J.P.: Study of activity tracking through bluetooth low energy-based network. J. Sens. 2019, 1–21 (2019). https://doi.org/10.1155/2019/6876925 23. Monroy, E.B., Polo Rodríguez, A., Espinilla Estevez, M., Medina Quero, J.: Fuzzy monitoring of in-bed postural changes for the prevention of pressure ulcers using inertial sensors attached to clothing. J. Biomed. Inform. 107, 103476 (2020). https://doi.org/10.1016/j.jbi.2020.103476 24. Morabito, L., Ippolito, M., Pastore, E., et al.: A discrete event simulation based approach for digital twin implementation. IFAC-PapersOnLine 54, 414–419 (2021). https://doi.org/10. 1016/j.ifacol.2021.08.164 25. Nasiri, S., Khosravani, M.R.: Progress and challenges in fabrication of wearable sensors for health monitoring. Sens. Actuators A 312, 112105 (2020). https://doi.org/10.1016/j.sna.2020. 112105 26. Pietrabissa, A., Poli, C., Ferriero, D.G., Grigioni, M.: Optimal planning of sensor networks for asset tracking in hospital environments. Decis. Support. Syst. 55, 304–313 (2013). https:// doi.org/10.1016/j.dss.2013.01.031 27. Rajakumari, K., Madhunisha, M.: Intelligent and convolutional-neural-network based smart hospital and patient scheduling system. In: 2020 International Conference on Computer Communication and Informatics (ICCCI). IEEE, Coimbatore, India, pp. 1–5 (2020) 28. Rashidi, P., Mihailidis, A.: A survey on ambient-assisted living tools for older adults. IEEE J. Biomed. Health Inform. 17, 579–590 (2013). https://doi.org/10.1109/JBHI.2012.2234129 29. Rayan, Z., Alfonse, M., Salem, A.-B.M.: Machine learning approaches in smart health. Procedia Comput. Sci. 154, 361–368 (2019). https://doi.org/10.1016/j.procs.2019.06.052 30. Riyadi, M.A., Iskandar, I.A., Rizal, A.: Development of FPGA-based three-lead electrocardiography. In: 2016 International Seminar on Intelligent Technology and Its Applications (ISITIA). IEEE, Lombok, Indonesia, pp. 67–72 (2016) 31. Rocha, E.M., Lopes, M.J.: Bottleneck prediction and data-driven discrete-event simulation for a balanced manufacturing line. Procedia Comput. Sci. 200, 1145–1154 (2022). https:// doi.org/10.1016/j.procs.2022.01.314 32. Shin, G.D.: Investigating the impact of daily life context on physical activity in terms of steps information generated by wearable activity tracker. Int. J. 
Med. Informatics 141, 104222 (2020). https://doi.org/10.1016/j.ijmedinf.2020.104222 33. Skubic, M., Alexander, G., Popescu, M., et al.: A smart home application to eldercare: current status and lessons learned. THC 17, 183–201 (2009). https://doi.org/10.3233/THC-2009-0551 34. Vasilateanu, A., Bernovici, B.: Lightweight smart home simulation system for home monitoring using software agents. Procedia Comput. Sci. 138, 153–160 (2018). https://doi.org/10. 1016/j.procs.2018.10.022


35. Vourvopoulos, A., Badia, S.B.I.: Usability and cost-effectiveness in brain-computer interaction: is it user throughput or technology related? In: Proceedings of the 7th Augmented Human International Conference 2016. ACM, Geneva Switzerland, pp. 1–8 (2016) 36. Xie, Z., Ji, X., Han, J.: Retracted: design of hospital IoT smart system and nucleoside drugs for treatment of hepatitis and liver cirrhosis. Microprocess. Microsyst. 81, 103691 (2021). https://doi.org/10.1016/j.micpro.2020.103691 37. Zhao, Y., Ge, S., Feng, Y.: Smart IoT data platform in hospital and postoperative analgesic effects of orthopedic patients. Microprocess. Microsyst. 81, 103653 (2021). https://doi.org/ 10.1016/j.micpro.2020.103653

From Algorithms to Grants: Leveraging Machine Learning for Research and Innovation Fund Allocation

Rebecca Lupyani(B) and Jackson Phiri

School of Natural Sciences, Department of Computer Science, University of Zambia, Lusaka, Zambia
{rebecca.lupyani,jackson.phiri}@cs.unza.zm

Abstract. In today's rapidly evolving world, research forms a cornerstone of human progress, leading to new products, services, and technologies, which, in turn, can stimulate economic growth and enhance the quality of life. Research enables people to learn, innovate, and address the complex challenges facing society. It is a powerful tool for making the world a better place, and as such many countries endeavor to support the research landscape by providing research grants in different sectors. The process of allocating research grants plays a pivotal role in fostering scientific progress, innovation, and knowledge. The traditional manual selection of grant proposals, while well established, can be resource-intensive, time-consuming, subjective, and prone to bias. This paper presents an unconventional strategy that leverages machine learning algorithms to enhance the fairness, efficiency, and transparency of the grant allocation process by removing human biases and prejudices that can inadvertently influence funding decisions. The study discusses the design and implementation of a machine learning-based grant allocation system using historical grant data from a reputable funding agency and provides empirical evidence of its effectiveness by selecting the best performing text classification algorithm from a comparative analysis of three models and integrating it into a web-based application. Keywords: Research fund · Machine Learning · Grant Allocation System

1 Introduction
Research and innovations are of paramount importance across various fields and disciplines. Research identifies and analyses problems and seeks solutions, whereas innovations lead to new products, services, and technologies which, in turn, can stimulate economic growth and enhance the quality of life [1]. These benefits, among others, justify the need for funding investments in the research and innovation landscape for the economic development of a country. Thus, many countries have established organisations that promote the research and innovation agenda by providing grants. Research and innovation grants play a fundamental role in advancing scientific endeavors, promoting discoveries, and nurturing talented researchers and innovators.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024
R. Silhavy and P. Silhavy (Eds.): CoMeSySo 2023, LNNS 935, pp. 469–480, 2024. https://doi.org/10.1007/978-3-031-54820-8_38


In the realm of research funding, the allocation of grants stands as a critical decision point on which the future of scientific progress depends [2]. Traditionally, this process has been driven by human judgement, making it susceptible to inefficiencies, inequalities, and biases. Research funding organizations receive a large number of research proposal applications every year, which need to be classified and evaluated for compliance. In addition to research proposals, organisations also receive applications for innovations, which are likewise scrutinised to determine their innovation potential, expected outcomes, and whether they support the goals of national development [3]. Both research and innovation proposal applications therefore need to be streamlined and classified into categories so that funding organisations can determine whether the proposals are viable ventures that can be supported and promoted. Many funding organisations, especially in Africa, use tagging systems in which human operators manually tag and classify applications based on the information submitted by the applicant [4]. The tagged applications are then evaluated for compliance. While such a system has been established as effective, it is not without limitations. It has the potential for subjective decisions, and the category assignments done by human operators may result in incorrect, inconsistent, or incomplete tags or labels [3]. These challenges have prompted a growing need for a more data-driven, objective, and equitable approach. Therefore, this paper proposes an automated categorization of research and innovation proposal submissions using machine learning techniques to heighten the efficiency and impartiality of the grant allocation process. Harnessing the capabilities of text categorisation algorithms will help funding organisations streamline the evaluation of research and innovation proposals, mitigate human bias, and make better-informed decisions about which proposals merit support.

2 Literature Review
A number of studies have explored the integration of machine learning techniques in various domains, including healthcare, finance, and the social sciences. Machine learning algorithms have been applied to problems such as information retrieval, content filtering, sentiment analysis, and text classification, to mention but a few. In the context of research grant allocation, which is a classification problem, a few recent works have investigated the application of machine learning models for predicting successful grant proposals. This section first discusses popular text classification algorithms and then discusses the problems to which they have been applied.
2.1 Text Classification Algorithms
Over recent years, a number of popular text classification algorithms have emerged, each with its own strengths and weaknesses. The choice of algorithm depends on factors such as the nature of the text data, the size of the dataset, and the specific classification problem at hand. One of these classifiers is the Naïve Bayes (NB) algorithm. Naive Bayes is a simple probabilistic model that handles classification tasks by applying Bayes' theorem. It assumes that every pair of features being classified is conditionally independent [5]. There are three types of NB text classifiers, namely the Multinomial, Bernoulli, and Gaussian Naive Bayes.


Each of them can handle different text classification tasks, but they all operate under the same principle, Bayes' theorem. Naive Bayes classifiers are known to provide insightful outcomes in text classification tasks, including the detection of sentiments and spam in text [6]. Another text classifier reviewed here is the K-Nearest Neighbour (KNN). This algorithm is a non-parametric, supervised learning classifier that is commonly used for regression and classification tasks. The performance of the KNN algorithm depends on the proximity of data points: it uses the k closest training examples in a dataset to classify new data points. The value of k is critical in the KNN algorithm, and it should be selected based on the input data. The algorithm works well on complex datasets, as it is highly sensitive to the structure of the data, making it suitable for datasets with complex boundaries [7]. Another classifier that is popular for regression and text classification tasks is the Random Forest (RF) algorithm. Random Forest is one of the most popular and widely used machine learning algorithms, exhibiting exceptional performance on a wide range of text classification problems [8]. The algorithm builds multiple decision trees, where each tree is trained on a random sample of the data and a random subset of the features. The Random Forest model has a few key hyper-parameters which can be tuned for better performance, such as the number of features, the number of trees, and the depth of the trees [8]. Another popular and widely used text classification algorithm is the Support Vector Machine (SVM). The SVM is a powerful machine learning algorithm that can be used for linear or nonlinear classification, regression, and outlier detection [9]. It is therefore useful for tasks such as image classification, spam detection, handwriting identification, gene expression analysis, face detection, anomaly detection, and text classification [9]. The SVM can be used for both binary and multi-class classification. As a binary classifier, the SVM divides data points into two classes, whereas in multi-class classification it breaks the problem down into multiple binary classification problems [10]. There are two main approaches for adapting the SVM to multi-class classification, namely One-vs-Rest and One-vs-One. The One-vs-Rest approach divides a multi-class classification task into one binary classification problem per class, while the One-vs-One approach divides it into one binary classification problem per pair of classes. The choice of approach is highly dependent on the size and nature of the dataset [10]. Due to the nature of the dataset at hand, this study compared three text classification algorithms, namely the KNN, the NB, and the multi-class SVM, and chose the best performing one.
2.2 Related Works
In a study carried out by Christian et al. [11], the application of machine-learning-based models across publications, grants, and other documents was investigated in order to determine whether a consistent portfolio view across inputs and outputs could be achieved.
Their study revealed that, using machine-learning-based models, it is possible to apply the same categorization approach to different document sets, for example to grant descriptions as inputs and to publications as an output.


The study concluded that this approach creates comparable data sets where natural language processing and machine learning are used to tap into the substance of the research, thereby allowing for immediate and deep insights [11]. A study comparing the performance of humans and machine learning (ML) classification models at classifying scientific research abstracts according to a fixed set of discipline groups was carried out by Chong et al. [12]. In their study, human operators (undergraduate and postgraduate assistants) were employed for this task in separate stages, and their performance was compared against that of a support vector machine learning algorithm at classifying European Research Council grant project abstracts. Their findings revealed that machine learning models are more accurate than human classifiers, across a variety of training and test datasets and across evaluation panels. They concluded that ML classifiers trained on different training sets were also more dependable than human classifiers. Additionally, they concluded that machine learning models are a cost-effective and highly accurate method for addressing problems in comparative bibliometric analysis, such as harmonising the discipline classifications of research from different funding agencies or countries [12]. In another study, Khor, Ko, and Theseira [13] evaluated research grant programs using machine learning. They compared the performance of three machine learning classification models, multinomial Naïve Bayes (MNB), multinomial logistic regression (MLR), and Support Vector Machines (SVM), at classifying research proposals according to the research funding structure used by the European Research Council (ERC). The results revealed that the SVM model performed better than the MNB and MLR in classifying the research proposals. Based on the results, the study was able to determine which funding programme was more successful than the other in a particular research discipline [13]. Freyman et al. [14] explored how topic co-clustering, an approach to text analysis based on machine learning, could be used to automatically tag National Science Foundation (NSF) grant awards with terms referring to scientific disciplines or to socioeconomic objectives. Their results revealed that, in the case of scientific disciplines, where their language models were well-formed and they had a valid comparison set for manual classification, the machine-assigned tags were a reasonable and valid means of describing the research conducted under each grant [14]. The approach used in the latter study is similar to what will be used in this study. The difference is that the performance of three text classification algorithms will be compared and the best performing model integrated into a web-based application to facilitate the awarding of grants.

3 Methodology
The study adopted the Cross-Industry Standard Process for Data Mining (CRISP-DM) approach. It is a standard process model that provides a generic basis for a data mining process and has been tried and tested in industry [15]. It was selected because it provides a common reference point that addresses data mining problems and increases the understanding of crucial data mining issues by all participants. It comprises six stages, namely business understanding, data understanding, data preparation, modelling, evaluation, and deployment [15]. Under the business understanding stage, the needs of this project with respect to the business perspective of funding organisations were considered. This was done so that the defined objectives were aligned with the business, by studying the challenges and inefficiencies faced by funding organisations and the process these organisations follow when reviewing and allocating research and innovation grants.


During the data understanding stage, a number of data sources were assessed to determine the availability of relevant, high-quality data for the study. Historical data was extracted from a reliable funding organisation, namely the National Science and Technology Centre (NSTC) in Zambia, and examined to identify patterns and trends relevant to the modelling phase. After extraction, the data was prepared by carrying out a number of pre-processing tasks. Pre-processing is the process of cleaning and transforming data into the format required for modelling. Cleaning of data removes missing values, null values, stopwords (unnecessary words), and outliers [16]. Additionally, the relevant input features to be used for modelling were selected. Data preparation is a crucial step, as the quality of the data directly impacts the performance of the models [16]. Other important tasks under data preparation are feature selection and feature extraction. These techniques play a vital role in machine learning projects by improving model performance and reducing overfitting. Feature selection involves selecting a subset of relevant features from the original set in order to keep the most informative and discriminative features while discarding redundant or irrelevant ones, also known as noisy data [17]. This helps to reduce the dimensionality of the dataset. Feature extraction involves the creation of new features from the existing ones: the information contained in the original set of features is summarised to produce new features. Term Frequency-Inverse Document Frequency (TF-IDF) was used for feature extraction. TF counts the occurrences of each word in a document and maps them to the feature space, while IDF assigns a higher weight to terms that occur in few documents, down-weighting terms that appear in many documents [17]. To model the machine learning text algorithms, experiments were undertaken to train three text classification algorithms, namely the K-Nearest Neighbour (KNN), the Naïve Bayes (NB), and the Support Vector Machine (SVM), using historical data on grant applications from the National Science and Technology Centre. To evaluate the efficacy of the system, the performance of the trained models was assessed by calculating the accuracy, precision, recall, and F1-score of each model. The SVM exhibited the best performance and was therefore selected as the text classifier for this study. Under the deployment stage, an Application Programming Interface (API) was developed so that the SVM classifier could be integrated into other platforms and third-party tools. To facilitate user interaction with the system, a Web User Interface (UI) application was also developed. The diagram below shows the phases involved in training a text classifier (Fig. 1).


Fig. 1. This figure shows the process taken to train a text classifier [18].
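Complementing the pipeline in Fig. 1, the sketch below shows one way the training and comparison could be reproduced in Python with scikit-learn. The authors implemented their system in PHP with PHP-ML, so this is an equivalent, hypothetical reconstruction; the file name and the column names "title" and "field" are assumptions, not the NSTC schema.

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.naive_bayes import MultinomialNB
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

# Hypothetical grant dataset: one row per proposal with a title and a field label.
df = pd.read_csv("grant_proposals.csv")              # assumed file and column names
X_train, X_test, y_train, y_test = train_test_split(
    df["title"], df["field"], test_size=0.2, random_state=42, stratify=df["field"])

models = {
    "SVM (one-vs-rest)":   LinearSVC(),
    "K-Nearest Neighbour": KNeighborsClassifier(n_neighbors=5),
    "Naive Bayes":         MultinomialNB(),
}
for name, clf in models.items():
    pipe = make_pipeline(TfidfVectorizer(stop_words="english"), clf)   # TF-IDF features
    pipe.fit(X_train, y_train)
    pred = pipe.predict(X_test)
    p, r, f1, _ = precision_recall_fscore_support(y_test, pred, average="weighted", zero_division=0)
    print(f"{name:22s} acc={accuracy_score(y_test, pred):.2f} prec={p:.2f} rec={r:.2f} f1={f1:.2f}")

A fitted pipeline of this kind could then classify a new title with pipe.predict(["..."]), mirroring the field prediction reported in Sect. 4.2.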

3.1 Design and Implementation of the Web Based System
The developed system comprises the Web User Interface (UI) application and the API web service. The Web UI provides users with a graphical interface for interacting with the system, whereas the API web service facilitates the integration of the text classification model with other third-party tools and applications. Both the Web UI application and the web service were developed in PHP and run on Apache as the web application host. PHP-ML was the library used to handle the machine learning tasks.
3.2 Model Design and Implementation
Three text classification algorithms, namely the Support Vector Machine (SVM) multi-classifier, the Naïve Bayes, and the K-Nearest Neighbour, were selected as the models for text classification. The three models were trained using a dataset compiled from historical grant data obtained from the National Science and Technology Centre in Zambia. A total of 129,987 records were extracted, including information such as the research or innovation proposal details, proposed budget, previous funding history, institutional information, research or innovation area or field, collaborator information (if any), and project timeline. The proposal details included the title and abstract of the proposal and keywords or subject areas related to it. The input features selected for the purpose of this study were the title of the research or innovation and the research or innovation field. The dataset was split into an 80% training set and a 20% testing set. The figures below illustrate the code snippets used to create each of the three text classification algorithms (Fig. 2).


Fig. 2. SVM Text Classification model code snippet

The figure below shows the code used to create the K-Nearest Neighbour text classifier (Fig. 3).

Fig. 3. KNN Text Classification model code snippet


The figure below shows the code used to create the Naïve Bayes text classifier (Fig. 4).

Fig. 4. Naïve Bayes Text Classification model code snippet

4 Results and Discussions
The study carried out experiments using the SVM multi-classifier, the K-Nearest Neighbour, and the Naïve Bayes algorithms. The three models were trained using records obtained from historical grant data in the NSTC database. The experiment was carried out using the title and the research area or field as the features. The performance of the models was determined by calculating the accuracy, precision, recall, and F1 score. The accuracy measures the number of correctly classified instances out of all instances in the dataset, expressed as the ratio of correctly classified instances to the total number of instances [19]:

Accuracy = (True Positives + True Negatives) / (True Positives + True Negatives + False Positives + False Negatives)

Precision measures the accuracy of the positive predictions among all instances predicted as positive [20]:

Precision = True Positives / (True Positives + False Positives) [20]

Recall measures the ability of the model to correctly identify all positive instances out of all actual positive instances:

Recall = True Positives / (True Positives + False Negatives) [20]

where:
True Positive = the actual class is positive and is predicted as positive
False Negative = the actual class is positive but is predicted as negative
True Negative = the actual class is negative and is predicted as negative
False Positive = the actual class is negative but is predicted as positive


The F1 score is also a vital performance metric, as it provides a more realistic measure of the model's performance. It is calculated as the weighted mean of the precision and recall [19].

4.1 Text Classification Models' Performance
The results obtained from training the three models are shown in the table below (Table 1).

Table 1. Text Classification Models' Performance

Text Classifier        Accuracy  Precision  Recall  F1-Score
SVM Multi-classifier   0.88      0.86       0.87    0.87
K-Nearest Neighbour    0.41      0.52       0.43    0.37
Naïve Bayes            0.86      0.86       0.81    0.84

The table above shows the performance metrics of the three text classification algorithms. These are further illustrated in the figure below (Fig. 5).

Fig. 5. Text Classification Models’ Performance
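For readers who wish to reproduce a comparison chart in the spirit of Fig. 5, the following short Python sketch (ours, not the authors' PHP code) plots the values from Table 1 with matplotlib.

import matplotlib.pyplot as plt
import numpy as np

metrics = ["Accuracy", "Precision", "Recall", "F1-Score"]
scores = {                                 # values taken from Table 1
    "SVM Multi-classifier": [0.88, 0.86, 0.87, 0.87],
    "K-Nearest Neighbour":  [0.41, 0.52, 0.43, 0.37],
    "Naïve Bayes":          [0.86, 0.86, 0.81, 0.84],
}
x = np.arange(len(metrics))
width = 0.25
for i, (name, vals) in enumerate(scores.items()):
    plt.bar(x + i * width, vals, width, label=name)   # one group of bars per model
plt.xticks(x + width, metrics)
plt.ylim(0, 1)
plt.ylabel("Score")
plt.title("Text classification models' performance")
plt.legend()
plt.show()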


The results reveal that the Support Vector Machine exhibited the best performance among the three models, with an accuracy of 88% and an F1 score of 87%. The K-Nearest Neighbour showed the lowest performance, with an accuracy of 41% and an F1 score of 37%. This can be attributed to the fact that KNN does not perform well on large datasets [7]. Therefore, the SVM was selected as the text classification algorithm for this study.

4.2 The Support Vector Machine (SVM)
The SVM multi-classifier makes predictions by first calculating the probability of a research/innovation title belonging to each of the field classes. The field class with the highest score is selected as the predicted field class. A sample research title was tested to demonstrate the model's ability to make predictions. The results are shown in the table below (Table 2).

Table 2. SVM prediction results

Field Category   Precision  Recall  F1 Score
Business         0.91       0.91    0.91
Education        0.80       0.76    0.78
Engineering      0.80       0.86    0.88
Medicine         0.90       0.95    0.92
Sciences         0.94       0.93    0.94
Social Science   0.95       0.93    0.95

Predicted Category: Social Science

The results in the table show that the Social Science field category exhibits the highest scores and is thus selected as the predicted category. Furthermore, the SVM multi-classifier was integrated into the web-based application to help determine the eligibility of research and innovation proposal applications. The figure below shows a screenshot of the system (Fig. 6). The experimental results revealed that the proposed SVM multi-classifier achieves superior accuracy and generalisation capabilities compared to the other models. By exhibiting high performance, the model can be used to ensure a high level of fairness in the grant allocation process, thereby reducing human bias and promoting equitable treatment for all applicants, regardless of their background or affiliations.


Fig. 6. SVM Multi-classifier integrated into a web based application

5 Conclusion
In this study, we showed that applying an SVM multi-classifier to automate the classification of research and innovation proposal applications can enhance efficiency, reduce human biases, and promote fair review of the applications. This is exhibited by the performance metrics of the classifier: 88% accuracy, 86% precision, 87% recall, and 87% F1 score. The SVM model was integrated into a web-based application to facilitate the grant allocation process by determining whether a submitted research or innovation topic is eligible or not eligible for a grant. This was done by assessing whether a submitted research/innovation topic falls under a field category that funding organisations are willing to support and fund. The resulting system has enhanced efficiency and offers a promising solution for improving the fairness and transparency of the grant allocation process. Even though the SVM model exhibited superior accuracy, its use in the allocation of grants may be biased towards the research or innovation topic details only. The awarding of grants depends not only on the research or innovation topic but also on other factors such as the budget, the project timeline, and prior funding history, to mention but a few. The dataset used in this study did not include such details, hence the possibility of bias towards the research or innovation topics only. Nevertheless, the model is still useful, as the research or innovation topic is the biggest determining factor in the awarding of grants, given that the goal of funding institutions is to award grants to projects aligned with a nation's economic development goals.


6 Recommendations and Future Work
Based on the experimental results and the limitations of the model, the study recommends that the dataset be extended to incorporate other input features such as budget and project timeline. The number of records in the dataset can also be increased so that the model can achieve better accuracy than that exhibited in this study.

References 1. Etzkowitz, H., Leydesdorff, L.: The dynamics of innovation: from national systems and mode 2 to a triple Helix of university-industry-government relations. Res. Policy. 29(2) (2000) 2. Smith, J., Johnson, A.: Machine Learning for Grant Proposal Evaluation (2018) 3. Srivastava, C.V., et al.: Challenges and opportunities for research portfolio analysis, management, and evaluation. Res. Eval. 16(3), 152–156 (2007) 4. Nidhi, V.G.: Recent trends in text classification techniques. Int. J. Comput. Appl. 35(6) (2011) 5. Loukas, S.: Text Classification Using Naive Bayes: Theory & A Working Example. Towards Data Science (2020) 6. Ersoy, P.: Naive Bayes Classifiers for Text Classification, Towards Data Science (2021) 7. Harrison, O.: Machine Learning Basics with the K-Nearest Neighbors Algorithm, Towards Data Science (2018) 8. Breiman, L.: Random forests. Mach. Learn. 45(1), 5–32 (2001) 9. Andreas, S.G., Müller, C.: Introduction to Machine Learning with Python, O’Reilly Media, Inc. (2016) 10. Bishop, C.M.: Pattern recognition and machine learning. In: Information Science and Statistics , Newyork, Springer (2006) 11. Herzog, C., et al.: Forward-looking analysis based on grants data and machine learning based research classifications as an analytical tool. Digit. Sci. (2000) 12. Goh, Y.C., et al.: Evaluating Human Versus Machine Learning Performance in Classifying Research Abstracts. Springer, Scientometrics (2020) 13. Khor, K.A., et al.: Applying machine learning to compare research grant programs. In: STI Leiden Conference on Science and Technology Indicator, Netherlands (2018) 14. Freyman, C., et al.: Machine-learning-based classification of research grant award records. Res. Eval. 25(4) (2016) 15. Wirth, R.H.J.: CRISP-DM : towards a data process model for data mining. In: Practical Application of Knowledge Discovery and Data mining (1995) 16. Dasu, T., Johnson, T.: Exploratory Data Mining and Data Cleaning. Wiley-Interscience (2003) 17. Kumar, A.: Feature Selection vs Feature Extraction: Machine Learning. Data Analytics (2023) 18. Lucy, D.: Step by Step Basics: Text Classifier 2023, Towards Data Science (2023) 19. Abhigyan, Calculating Accuracy of an ML Model, Analytics Vidhya (2020) 20. James, G., et al.: An Introduction to Statistical Learning. Springer (2013)

Ways and Directions of Development of the Terminal of Increased Security for Biometric Identification When Interacting with the Russian Unified Biometric System

Timur R. Abdullin1,3, Andrey M. Bonch-Bruevich3,4, Sergei A. Kesel1,3(B), Timur V. Shipunov2, Ilya V. Ovsyannikov1, Denis A. Konstantinov1, and Utkurbek B. Mamatkulov3

1 Moscow Polytechnic University, 107023 Bolshaya Semyonovskaya Street, 38, Moscow, Russia
2 JSC “Social card”, 420124 Meridiannaya Street, 4, Room 1, Kazan, Republic of Tatarstan, Russia
3 Bauman Moscow State Technical University, 105005 2nd Baumanskaya Street, 5, Moscow, Russia
[email protected]
4 Financial University Under the Government of the Russian Federation, 125167 Moscow, Leningradsky Prospekt, 49/2, Moscow, Russia

Abstract. The paper studies different ways and directions of development of biometric terminal devices that interact with the Unified Biometric System (UBS) for the purpose of biometric identification of users. The goal of the article is to develop a new solution in the field of biometric terminals that meets all regulatory requirements and offers a high level of quality, reliability and security of biometric personal data processing. To this end, the article examines the procedure and peculiarities of personal data processing when interacting with the Unified Biometric System and analyzes the legislative framework for personal data processing in Russia. As a solution to this problem, the design concept and functionality of a high-security biometric terminal are presented and described. To demonstrate its effectiveness, the chosen development and implementation path for the biometric terminal is justified through a comparison with competitors and other studies. It is concluded that the proposed realization of the biometric terminal is effective and competitive in the Russian biometric technology market and is able to solve the tasks assigned to it. #COMESYSO1120.

Keywords: Biometric authentication · Biometric terminal · Information security · Personal data


1 Introduction

1.1 General Information

In today's world, personal data is a valuable resource that can be used for both good and malicious purposes. Therefore, ensuring the protection and confidentiality of personal data is becoming an increasingly important task for the state, business, and society. It is especially important to protect biometric personal data, which reflects a person's unique physical or behavioral characteristics, such as fingerprints, face, voice, handwriting, etc. [1]. The use of biometrics simplifies the identification and authentication process for users while maintaining a high degree of accuracy and reliability, as with classical authentication methods [2]. In addition, the current level of technology development makes this method more flexible and mobile. As a result, biometrics is becoming more popular and in demand every year in various spheres of human life: banking, health care, education, transportation, etc. [3].

At the same time, the use of biometric authentication can have negative consequences for the subject if personal data is processed without his or her consent or in violation of legal requirements. In Russia, over the last two years there has been a sharp increase in the number of leaks of users' personal data, including biometric data: while in 2021 there were four such incidents, in 2022 there were over 140, and in the first seven months of 2023 already over 150 [4]. A biometric data leak can have serious consequences for the personal security, financial well-being and reputation of the subject, as biometric data is almost impossible to change or revoke. The Russian Ministry of Finance has called a biometrics leak "the worst thing that could happen" [4].

In this regard, the Federal Service for Supervision of Communications, Information Technology and Mass Media (Roskomnadzor) published a list of measures to ensure the security of personal data for all operators processing it. One such measure is the use by operators of their own technical and software tools that provide the necessary level of protection. In addition, in 2021, the Federal Service for Technical and Export Control (FSTEC) developed a methodology for assessing information security (IS) threats when processing personal data and added new threats related to machine learning and artificial intelligence to the threat databank.

1.2 The Problem of Protecting Biometric Identification Endpoints

Over the past few years, Russia has adopted a number of federal laws (FL) regulating the collection and processing of biometric personal data and establishing strict requirements for terminal devices such as biometric terminals. A biometric terminal is a device designed to collect, process, and transmit biometric data of a subject. Such devices can be used for various purposes, for example, for identification and authentication of bank clients, for access control to protected objects, for registration of visitors at events, etc.


However, today there are practically no biometric terminals on the market that meet all legislative requirements and ensure a high level of protection of biometric personal data from IS threats. Most of the devices in use today do not meet the security requirements and may be vulnerable to hacking, which in turn can lead to the substitution, theft, or corruption of personal data. Given the current situation on the Russian biometric terminal market, it is necessary to develop a new solution that meets all legal requirements and offers high quality, reliability, and security of biometric personal data processing. To achieve this goal, the article considers the following aspects:

• Legislative basis of biometric personal data processing in Russia;
• Procedure and peculiarities of biometric personal data processing when interacting with the Unified Biometric System (UBS);
• Justification of the chosen way of developing a biometric terminal, based on a comparison with existing solutions and other studies;
• Concept of design and functionality of the biometric terminal.

1.3 Legal Aspects of Biometrics Use in Russia

In Russia, legislation on biometrics and the UBS is based on the following main normative acts:

• The Constitution of the Russian Federation, which guarantees the right of citizens to inviolability of private life, personal and family secrecy, and protection of their honor and good name [5];
• Federal Law No. 152-FZ of July 27, 2006 “On Personal Data”, which defines the concept of personal data, the principles of their processing, the rights and obligations of subjects and operators of personal data, as well as measures to ensure their security [6];
• Federal Law No. 572-FZ of December 29, 2022 “On the Identification and (or) Authentication of Individuals Using Biometric Personal Data, on Amendments to Certain Legislative Acts of the Russian Federation and the Annulment of Certain Provisions of Legislative Acts of the Russian Federation”, which regulates the collection, storage, use and protection of biometric personal data and establishes the rights and obligations of subjects and operators of biometric personal data [7];
• Resolution of the Government of the Russian Federation No. 883 of May 31, 2023 “On Approval of the Regulations on the Unified Biometric System”, which defines the goals, objectives, functions, structure, and operating procedures of the Unified Biometric System (UBS), as well as the requirements for devices and software used to interact with the UBS [8];
• Resolution of the Government of the Russian Federation No. 1119 of November 1, 2012 “On Approval of Requirements for the Protection of Personal Data when Processed in Personal Data Information Systems”, which defines the types of personal data processing information systems and the corresponding types of threats for each system [9];
• Order of the Federal Security Service of Russia No. 378 of July 10, 2014 “On Approval of the Composition and Content of Organizational and Technical Measures to Ensure the Security of Personal Data when Processing in Personal Data Information Systems with the Use of Cryptographic Information Protection Means Required to Meet the Personal Data Protection Requirements Established by the Government of the Russian Federation for Each Level of Security”, which establishes a list of measures to ensure the security of personal data using cryptographic protection [10].

1.4 Unified Biometric System: Overview of the System and Interaction Procedures

According to the current legislation, the use of biometrics for identification and authentication is possible only in cooperation with the UBS. The UBS is a unified personal data information system that ensures the processing, including collection and storage, of biometric personal data, their verification, and the transfer of information on the degree of their correspondence to the biometric personal data provided by a natural person. The UBS was created to provide biometric registration, verification, and identification of users in order to confirm their identity for banking and other transactions or to gain access to third-party services. The system was developed by Rostelecom on the initiative of the Ministry of Digital Development, Communications and Mass Media of the Russian Federation (Mincifra) and the Bank of Russia and launched in 2018. At the end of 2021, the UBS acquired the status of a state information system. The operator of the UBS is JSC Center of Biometric Technologies.

The purpose of the UBS is to enable remote access to financial and non-financial services using biometric identification. For this, the UBS uses two modalities: voice and face (a photo image of a person's face). The system works in close interconnection with the Unified System of Identification and Authentication (USIA), which provides basic personal data of users. In order to use UBS services, the user needs to register with one of the authorized banks or through the “Gosuslugi” mobile application. In doing so, the user must give his or her consent to the processing of biometric personal data and confirm his or her identity by presenting a passport or another identification document. After registration, the user receives a unique identifier in the UBS, which allows him or her to remotely confirm his or her identity when accessing various services. Currently, the UBS provides the possibility to receive the following types of services:

• Opening a bank account or deposit without visiting a bank branch;
• Connection to the fast payments system for transferring money by phone number;
• Receiving loans, insurance policies, pension savings and other financial products;
• Receiving medical services, such as electronic registration for a doctor's appointment, or receiving an electronic prescription or sick leave certificate;
• Receiving educational services, such as enrollment in courses, exams or certification;
• Obtaining state and municipal services, such as certificates, statements, and other documents.

Biometric terminals, which ensure the collection, processing, and transmission of the subject's biometric personal data, can be used for interaction with the UBS. They can be installed in bank branches, medical institutions, educational organizations, administrative buildings, and other places where services are provided using the UBS. Biometric terminals must comply with strict security and data protection requirements established by legislation and regulatory documents. In particular, biometric terminals must:

• Use certified technical and software tools for the collection and processing of biometric personal data;
• Ensure the encryption and integrity of transmitted data;
• Prevent unauthorized access to or copying of data;
• Possess means of self-diagnostics and self-recovery in case of failures or attacks;
• Be remotely controlled and monitored by the UBS operator.

2 Methodology

2.1 Justification of the Choice of the Way of Development of a Biometric Terminal of Increased Security

Market Overview. According to a study by Mordor Intelligence, the global biometrics market was valued at USD 27.09 billion in 2020 and is projected to reach USD 62.52 billion by 2026, registering a compound annual growth rate of 15.2% over the forecast period 2021–2026 [11]. The major factors driving the growth of the biometrics market are the increase in terrorist attacks and thefts of confidential information, as well as growing security requirements for biometric data. The main market segments by type of biometric technology are fingerprint recognition, facial recognition, iris recognition and voice recognition. By the way the user interacts with the reader during authentication, the market is divided into contact and contactless solutions. The major sectors using biometric technologies include government and law enforcement, commerce and retail, healthcare, banking and finance, travel, and immigration.

According to the latest study of the Russian biometric technology market conducted by J'son & Partners Consulting, as of 2018 the market was at a more dynamic stage of development than the global market. Launched pilot projects were actively moving to the stage of real integration, and new areas of biometric
technology application were being developed. Annual growth rates of biometric technologies in Russia in the period from 2019 to 2022 exceeded the global figure by more than 1.5 times [12]. However, Russia's share of the global biometrics market remains insignificant: it exceeded 1% in 2022. The structure of the Russian biometric technology market also differs from the global one. While fingerprint recognition technologies continue to hold a dominant share globally (52.2% in 2018), Russia is witnessing an active penetration of facial recognition technologies. Over the past three years, facial recognition technologies in Russia have increased their share of the total Russian biometric market more than six-fold, to almost 50%. This is due to the development of machine learning and the emergence of many companies developing high-quality facial identification algorithms, which began to shape the demand for this technology in the country. In terms of biometrics, the Russian market can be divided into two main segments: payments (with a volume of 210 billion rubles) and security (with a volume of 145 billion rubles). The payment segment includes areas such as the banking sector, financial technologies (fintech), e-commerce and others. The security segment includes areas such as government and law enforcement agencies, transportation and logistics, healthcare, and others.

The above information confirms the relevance of developing our own concept of a high-security terminal, as the biometrics market is at a stage of intensive development both in Russia and around the world, and there is considerable demand for this authentication technology. Therefore, before creating our own concept, it is necessary to study the solutions available on the market.

An Overview of Current Biometric Terminal Solutions. Speaking of competitors, there are many well-known companies producing biometric facial identification devices on the domestic and international markets, such as VisionLabs [13], Biosmart [14], Ovision [15], Hikvision [16], Idemia [17], Rostelecom [18] and others. However, most of these devices lack hardware cryptographic security features, or such information is not reported. This means that the data transmitted and stored by these devices may be at risk of leakage, tampering or theft. The only direct competitor to the Inoface terminal is the joint development of Rostelecom and OKB SAPR [19], which also provides for the use of a resident security component for data encryption and signing. However, over the past few years, information on the progress of this project has not been updated and work on it has probably been suspended.

2.2 The Concept of a Biometric Terminal with Enhanced Security

As a solution to this problem, we have developed the concept of a biometric terminal with increased security. The device is based on our own unique developments, such as a camera module that creates a three-dimensional image using a system of stereo cameras, a carrier board of our
own design, and domestic software. Our biometric identification terminal is designed to provide the necessary level of protection of biometric personal data with respect to three main properties: confidentiality, integrity, and availability. For this purpose, we used the following technical and software solutions:

• Case opening sensors, which register any attempt of unauthorized access to the internal components of the terminal and block its operation if such an attempt is detected;
• No removable media interfaces that could be used to copy or modify data or software on the device [20];
• No debug ports that could be used to connect external devices or programmers to the terminal;
• A Security Resident Component (SRC), which is a non-retrievable repository of the keys used to encrypt and sign data. The SRC also contains hardware cryptographic processors and a random number generator, which provide high speed and reliability of cryptographic operations;
• Terminal software, which closely interacts with the SRC at all stages from loading to servicing. The terminal software implements the following functions:
  – Loading and authenticating the software at terminal startup using digital signatures and hash functions;
  – Encryption and signing of all data transmitted via secure communication channels using symmetric and asymmetric encryption algorithms;
  – Processing of biometric data using unique anti-spoofing algorithms, which distinguish the representation of a live person from a fake (photo, video, mask, etc.);
  – Interaction with the UBS, which is a centralized biometric data repository and provides identification and authentication of subjects by biometrics.

The basis of our solution is a camera unit on a rotating arm, which includes a stereo camera, an infrared (IR) or thermal imaging camera, and a white and IR illumination module. All three cameras have the same high-resolution sensors installed in portrait orientation and the same optics, and they operate synchronously; the axes of the cameras are parallel. The middle camera includes an IR light filter with transmission in the range of 700 nm and above. This IR camera allows the face to be imaged in low-light conditions or in the presence of interference such as glasses, hats, masks, etc. The two cameras at the edges form an RGB stereo pair and have filters that cut off IR radiation with wavelengths longer than 650 nm to avoid distortion of color reproduction under powerful IR illumination. The portrait orientation of the sensors is advantageous when processing video of people of different heights, as it captures more facial details.

Thanks to our camera module, we can create a three-dimensional image of the subject's face, which contains information about the depth of each point. This allows us to improve the accuracy of facial recognition and reduce the possibility of errors. We can also use the infrared or thermal imaging camera to measure the temperature of the subject's face and check the subject's vital signs (liveness). This prevents the system from being fooled by photos, videos, or masks.
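To make the load-time software authentication listed above more concrete, the following is a minimal, hypothetical sketch of verifying a software image against a detached signature before it is allowed to run. It uses an Ed25519 public key via the Python cryptography package purely for illustration; in the real terminal, key storage and cryptographic operations are delegated to the SRC, and the file names and key handling shown here are assumptions.

```python
# Illustrative only: load-time verification of a software image, in the spirit
# of the "loading and authenticating the software at terminal startup" function
# above. Key storage and verification would in practice live inside the SRC.
import hashlib
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PublicKey

def image_is_authentic(image: bytes, signature: bytes, vendor_pub_raw: bytes) -> bool:
    digest = hashlib.sha256(image).digest()            # hash of the software image
    try:
        Ed25519PublicKey.from_public_bytes(vendor_pub_raw).verify(signature, digest)
        return True                                    # signature valid: boot may continue
    except InvalidSignature:
        return False                                   # block startup and raise an IS event

# Hypothetical wiring at startup:
# image = open("/firmware/terminal.img", "rb").read()
# sig = open("/firmware/terminal.img.sig", "rb").read()
# if not image_is_authentic(image, sig, VENDOR_PUBLIC_KEY_RAW):
#     halt_and_report()
```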


Our biometric identification terminal concept is a unique and innovative solution that combines high performance, reliability, and security in the processing of biometric personal data. The terminal can be used for different purposes and in different conditions, as it adapts to the characteristics of the subject and the environment. As a consequence of all the above factors, our terminal concept has a number of competitive advantages over other facial biometric identification devices. Firstly, we use our own unique developments, such as the camera module that creates a three-dimensional image using a system of stereo cameras, a carrier board of our own design, domestic software, etc. Secondly, we can provide a high level of biometric data protection with the help of the SRC, which combines non-retrievable key storage, hardware cryptographic processors and a random number generator. Thirdly, we adapt our solution to different conditions and customer needs, using hybrid identification (a combination of two or more biometric methods), an infrared or thermal camera for facial temperature detection and vital signs verification, a swivel bracket that allows the camera module to process video of people of different heights, and a built-in contactless reader of bank cards and passes (NFC, QR, etc.).

2.3 Building a Threat Model for a Biometric Authentication System

In addition to the mandatory security requirements imposed by regulators on the objects of interaction when conducting operations with biometric personal data, it is necessary to have and keep up to date a model of IS threats. Such a model is being developed and supplemented for the presented biometric terminal concept; it refers to the general scheme of interaction during identification and authentication procedures using a biometric terminal. To determine the actual threats and vulnerabilities of the current system, the new “Methodology of Information Security Threat Assessment”, approved by FSTEC of Russia on February 5, 2021, is used. With the release of this methodology, the “Methodology for Determining Current Threats to the Security of Personal Data during Their Processing in Personal Data Information Systems”, approved by FSTEC of Russia in 2008, became invalid. The new methodology is suitable for determining IS threats both for significant state and municipal information systems and for personal data information systems. It is aimed at assessing anthropogenic threats to information, i.e. threats caused by the destructive actions of intruders. According to the new methodology, the task of assessing IS threats and vulnerabilities is divided into the stages described below.

Determination of Negative Consequences and Related Types of Risks Resulting from the Realization of Threats. At this stage, all possible negative consequences of the realization of threats to the system are determined.
They are grouped in relation to the type of risk: damage to an individual, legal entity or the state. Identification of IS Threat Impact Objects. System objects that can be subjected to destructive impact as a result of realization of certain threats are identified. The found objects are correlated with the previously determined negative consequences with a description of the type of impact on the current object. Identification of IS Threat Sources. At this stage, possible anthropogenic sources of IS threats are identified, which can be represented by persons realizing security threats by directly affecting the system components. An intruder model is compiled. A list of actual types of intruders is defined, taking into account the specifics of the system, potential capabilities, and motivation of intruders. An intruder is considered relevant only if the goals of his interference in the system correspond to possible negative consequences for this system and there is a necessary potential for the realization of threats. Thus, the intruder model should contain a list of possible types of intruders with their motivation and consequences from the realization of a particular threat. Next, the intruders’ potential is assessed based on their possession of certain skills and techniques, and they are assigned to one of four capability levels (N1-N4). This information is used to rank the current IS intruders. Determining the Ways in Which IS Threats Are Realized. At this step, an intruder model is formed, based on the previous steps, but taking into account the actual ways of realization of a particular threat by a particular intruder. Relevant are those ways that the intruder is able to use to realize IS threats and there are conditions under which this threat can be realized in relation to the object of influence. It is necessary to take into account that the same IS threat can be realized in different ways. The necessary condition, which allows an intruder to realize certain threats, is access to the corresponding interfaces of the objects of influence. Interfaces are determined based on the architecture, composition, and peculiarities of functioning of the system under study. During the analysis, both physical interfaces of access to objects, including the need for physical access to them, and logical interfaces are determined. Information about actual intruders realizing IS threats is correlated with the selected ways of realization of these threats. Assessment of the Possibility of IS Threats Realization and Preparation of Threat Realization Scenarios. At this stage, the relevance of previously defined security threats is assessed by considering specific tactics and techniques of threat realization presented by the FSTEC. A threat is considered relevant to the system if there is an intruder and a corresponding object of influence, which will be attacked by one of the available methods for the intruder, and in case of its success - there will be negative consequences for the organization. In

other words, there is a scenario of realization of a given security threat. The scenario may include a sequence of possible tactics and corresponding techniques that can be applied by the actual intruder in accordance with his capabilities and with the availability and accessibility of interfaces for the realization of the relevant threats. A scenario is created for each method of threat realization and must be projected onto each type of actual intruder, taking into account their level of capabilities. The actual scenarios are entered into the final threat model.

Thus, the threat model is the basis for building and maintaining a secure IS. Threat model development is a continuous process that involves constant monitoring of the current situation, searching for new threats, and the timely addition and revision of the available information. Based on the threat model, strategic decisions should be made on the implementation of the protection measures required to effectively counteract the current IS threats. In our case, all the previously mentioned protection measures are necessary to close most of the identified threats.
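As a purely illustrative aid, the sketch below captures the stages described above (impact objects, intruder capability levels N1–N4, realization methods and the relevance decision) as simple data structures. The field names, example values and the relevance rule are a simplified reading of the methodology, not a normative implementation.

```python
# Simplified, illustrative data model for the threat-assessment stages above.
from dataclasses import dataclass, field

@dataclass
class Intruder:
    name: str
    capability: int                      # 1..4, corresponding to levels N1-N4
    motivations: list = field(default_factory=list)

@dataclass
class Threat:
    description: str
    impact_object: str                   # e.g. "terminal-UBS communication channel"
    consequences: list                   # e.g. ["damage to an individual"]
    interfaces: list                     # interfaces needed to realize the threat
    min_capability: int                  # minimal intruder level able to realize it

def threat_is_relevant(threat: Threat, intruder: Intruder, reachable: set) -> bool:
    """A threat stays in the model only if some intruder has enough capability,
    can reach at least one required interface, and the realization would lead
    to a negative consequence (i.e. a realization scenario exists)."""
    return (intruder.capability >= threat.min_capability
            and any(i in reachable for i in threat.interfaces)
            and bool(threat.consequences))

# Hypothetical example:
attacker = Intruder("external attacker", capability=2, motivations=["theft of biometric data"])
threat = Threat("substitution of a biometric sample in transit",
                impact_object="terminal-UBS communication channel",
                consequences=["damage to an individual"],
                interfaces=["network"],
                min_capability=2)
print(threat_is_relevant(threat, attacker, reachable={"network"}))  # True
```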

3 Discussion

In this paper we have considered various concepts and approaches to the design and development of a high-security biometric terminal for integration with the UBS. Based on an analysis of the current situation on the Russian biometric technology market and the current legislation in this area, we have developed our own concept of this type of device. The proposed concept has the following distinctive features and advantages:

• A multifunctional camera module produced in-house;
• An in-house developed carrier board that meets all the functional requirements for this class of devices;
• Domestic software;
• Proprietary anti-spoofing algorithms that prevent attempts to use fake biometrics;
• A built-in resident security component responsible for the protection of biometric personal data;
• Case opening sensors that block the operation of the biometric terminal when unauthorized access attempts are detected;
• No technical ports or interfaces for removable media;
• Great flexibility in integration with systems of different purposes.

The paper also considers an approach to the development and maintenance of a secure IS that includes a biometric terminal. It is based on the process of compiling a threat model for the system under study, which results in a document containing complete information about the system's objects of influence, the sources of threats and the ways of their realization, and the possible negative consequences. The generalization of this information is the set of threat realization scenarios, on the basis of which it is possible to make a verdict on the
relevance of a given threat and to take additional protective measures in due time. The proposed approach is not the only possible way to solve the problem of secure biometric identification devices interacting with the UBS. At the moment, these devices and systems are only beginning to be actively implemented in various governmental and commercial projects, and the legislative base is still being filled with new requirements. Taking into account the diversity of the market and the number of non-standard conditions and situations in the operation of such systems, it is essential to actively collect feedback from customers and users; only on the basis of such statistics is it possible to fully assess the effectiveness of a particular solution.

4 Conclusion

The paper presents a concept of a biometric terminal with enhanced security for interaction with the Unified Biometric System (UBS) in Russia. The concept is based on the authors’ own developments, such as a camera module, a carrier board, a resident security component, and anti-spoofing algorithms. The paper also describes the legal and regulatory aspects of biometric data processing in Russia, as well as the methodology of building a threat model for the biometric authentication system. The paper claims that the proposed concept has several advantages over existing solutions, such as high performance, reliability, security, and flexibility. The paper concludes that the concept is effective and competitive in the market of biometric technologies in Russia and can solve the tasks assigned to it.

References

1. Deliversky, J., Deliverska, M.: Ethical and legal considerations in biometric data usage—Bulgarian perspective. Front. Public Health 6 (2018)
2. Blanco-Gonzalo, R., Lunerti, C., Sanchez-Reillo, R., Guest, R.M.: Biometrics: accessibility challenge or opportunity? PLoS ONE 13(3), e0194111 (2018). https://doi.org/10.1371/journal.pone.0194111
3. Imaoka, H., et al.: The future of biometrics technology: from face recognition to related applications. APSIPA Trans. Signal Inform. Process. 10, e9 (2021). https://doi.org/10.1017/ATSIP.2021.8
4. Nefyodova, A.: Personal data is shared: personal data leaks have increased 40 times in Russia (2023)
5. The Constitution of the Russian Federation. http://www.constitution.ru/en/10003000-01.htm (Accessed 15 Sep 2023)
6. Federal Law of July 27, 2006, No. 152-FZ “On Personal Data”. http://pravo.gov.ru/proxy/ips/?docbody&nd=102108261 (Accessed 17 Sep 2023)
7. Federal Law No. 572-FZ of December 29, 2022 “On the Identification and (or) Authentication of Individuals Using Biometric Personal Data, on Amendments to Certain Legislative Acts of the Russian Federation and the Annulment of Certain Provisions of Legislative Acts of the Russian Federation”, Ministry of Digital Development, Communications and Mass Media of the Russian Federation. https://digital.gov.ru/ru/documents/9003/ (Accessed 17 Sep 2023)
8. Resolution of the Government of the Russian Federation No. 883 of May 31, 2023 “On Approval of the Regulations on the Unified Biometric System”. http://government.ru/docs/all/147777/ (Accessed 17 Sep 2023)
9. Resolution of the Government of the Russian Federation dated November 1, 2012, No. 1119 “On Approval of Requirements for the Protection of Personal Data When Processed in Personal Data Information Systems”. http://pravo.gov.ru/proxy/ips/?docbody=&nd=102160483 (Accessed 17 Sep 2023)
10. Order of the Federal Security Service of Russia dated July 10, 2014 No. 378 “On Approval of the Composition and Content of Organizational and Technical Measures to Ensure the Security of Personal Data When Processing in Personal Data Information Systems with the Use of Cryptographic Information Protection Means Required to Meet the Personal Data Protection Requirements Established by the Government of the Russian Federation for Each Level of Security”. https://base.garant.ru/70727118/ (Accessed 17 Sep 2023)
11. Biometrics Market - Share, Growth & Size. https://www.mordorintelligence.com/industry-reports/biometrics-market (Accessed 15 Sep 2023)
12. Russian biometric market in 2019–2022. Results of a large-scale study by J’son & Partners Consulting. https://www.tbforum.ru/blog/rossijskij-biometricheskijrynok-v-2019-2022-godah.-rezultaty-masshtabnogo-issledovaniyajson-partnersconsulting (Accessed 15 Sep 2023)
13. VisionLabs. https://visionlabs.ai/ (Accessed 15 Sep 2023)
14. BioSmart. https://bio-smart.ru (Accessed 15 Sep 2023)
15. OVISION. https://ovision.ru (Accessed 15 Sep 2023)
16. Hikvision. https://us.hikvision.com/en (Accessed 15 Sep 2023)
17. IDEMIA. https://www.idemia.com/ (Accessed 15 Sep 2023)
18. PJSC Rostelecom. https://company.rt.ru/ (Accessed 15 Sep 2023)
19. PJSC Rostelecom with Participation of OKB SAPR Presents an Innovative Domestic Biometric Terminal for ACS Systems. https://ru-bezh.ru/novinki/news/20/09/18/pao-rostelekom-pri-uchastii-okb-sapr-predstavlyaetinnovaczionny (Accessed 15 Sep 2023)
20. Kalutskiy, I.V., Spevakov, A.G., Shumaylova, V.A.: Method for ensuring data privacy on the computer’s internal hard disk. In: Radionov, A.A., Gasiyarov, V.R. (eds.) RusAutoCon 2020. LNEE, vol. 729, pp. 543–549. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-71119-1_53

Research of the Correlation Between the Results of Detection the Liveliness of a Face and Its Identification by Facial Recognition Systems

Aleksandr A. Shnyrev2, Ramil Zainulin2, Daniil Solovyev2, Maxim S. Isaev2, Timur V. Shipunov2, Timur R. Abdullin1, Sergei A. Kesel1(B), Denis A. Konstantinov1, and Ilya V. Ovsyannikov1

1 Moscow Polytechnic University, 107023, Bolshaya Semyonovskaya Street, 38, Moscow, Russia
[email protected]
2 JSC Social Card, Republic of Tatarstan, 420124, Meridiannaya Street, 4, Room 1, Kazan, Russia

Abstract. In this paper, we investigate the hypothesis that a system capable of solving the face anti-spoofing problem in biometric authentication can also partially solve the recognition problem without additional recognition modules, by finding and excluding those faces that have a low probability of being successfully recognized. To this end, the paper considers the structure of a basic facial recognition system, highlighting the role of the anti-spoofing module. Other approaches to face recognition and to filtering out images that do not contain faces have also been studied. The problem under study is formalized and presented in mathematical form for further experiments. In a series of experiments on the selected data sets, results were obtained and visualized that demonstrate the absence of a relationship between the operation of the anti-spoofing module and the facial recognition module. In conclusion, plans for further work in this direction are presented. #COMESYSO1120.

Keywords: machine learning · facial recognition · biometric authentication · anti-spoofing

1 Introduction

1.1 General Information

In recent years, we have witnessed a breakthrough in the field of facial recognition, driven by the growth of computing power and research in the field of deep convolutional neural networks. Modern, high-performing architectures of such algorithms have allowed these systems to play an important role in many areas of human life, such as access control, user device unlocking, mobile payments, urban security and much more [1]. Due to the increasing use of facial recognition technology, the issue of protecting such systems from unauthorized access has become increasingly relevant [2].


Of course, the accuracy of modern recognition systems is so high that, under ideal conditions, the probability of a random person being mistaken by the system for another subject has approached zero [3]. Under good lighting and with a sufficiently high-quality image from the system's camera, the success of such an attack becomes almost impossible. Unfortunately, as soon as a user's biometric data is stolen or duplicated, the effectiveness of facial recognition systems is compromised. This situation most often occurs because one or several photos of a system user can be easily obtained without even getting into physical contact with him, for example, by downloading them from the Internet or simply taking a picture of his face with a mobile phone camera. The classic way to attack such systems is to present a copy of a living person's face, for example, in the form of a printout or on the screen of a mobile device [4]. If such attacks succeed, serious consequences may arise, but, unfortunately, there are still no effective methods to combat user substitution. Such fake faces are not only easy to produce but are usually quite effective at causing problems for the face recognition system, and this has become a significant issue. The use of an efficient module that stops such attacks (hereinafter referred to as the anti-spoof module) solves several problems at once, concerning both the security of the system and, possibly, its performance [5].

A basic facial recognition system is arranged as follows: an image from some source (camera, Internet resource, local resource) arrives at the input of the system, which determines whether there is a person's face in the image (Fig. 1). The detected face is then passed to the anti-spoof module, which determines whether it belongs to a living person or not. If this check is passed, the face goes directly to the recognition module, where a pre-trained neural model transforms the face into a numerical representation, a vector that is later compared with the vectors belonging to all users of the system. If the obtained vector and one of the vectors in the database are sufficiently similar according to a chosen metric, the system reports successful user recognition and allows further access.

In our work, we would like to examine whether a system that successfully solves the face anti-spoofing problem is able, without using additional recognition modules, to partially solve the recognition problem by cutting off those faces that have a low probability of being successfully recognized.

Fig. 1. The scheme of the face recognition system.
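To make the flow shown in Fig. 1 concrete, the following is a minimal sketch of the described pipeline: face detection, the anti-spoof check, embedding extraction and cosine-similarity matching against the user database. The detector, anti-spoof model and encoder are passed in as placeholders, and both thresholds are illustrative values rather than the system's actual settings.

```python
# Minimal sketch of the recognition flow in Fig. 1; all models are placeholders.
import numpy as np

SPOOF_THRESHOLD = 0.5       # minimal liveness probability (illustrative)
MATCH_THRESHOLD = 0.6       # minimal cosine similarity for a match (illustrative)

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def recognize(image, detect_face, liveness_prob, encode, gallery):
    """gallery: dict mapping user_id -> reference embedding vector."""
    face = detect_face(image)
    if face is None:
        return None                               # no face found in the frame
    if liveness_prob(face) < SPOOF_THRESHOLD:
        return None                               # rejected by the anti-spoof module
    h = encode(face)                              # vector representation of the face
    best_id, best_sim = None, -1.0
    for user_id, ref in gallery.items():
        sim = cosine_similarity(h, ref)
        if sim > best_sim:
            best_id, best_sim = user_id, sim
    return best_id if best_sim >= MATCH_THRESHOLD else None
```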


1.2 Literature Review

Since the early 1990s, face recognition has been a well-known biometric problem used in medicine, the military, security and other spheres of public life, and it has since passed through several important epochs. According to the accepted methods of operation, modern facial recognition systems can be divided into two categories: classical and deep-learning-based. Classical methods always use traditional algorithms to extract facial features, such as, for example, iterative closest point (ICP) analysis, principal component analysis (PCA) and other linear and nonlinear algorithms [6]. As for methods based on deep learning, almost all of them use both specially trained and pre-trained networks, which are adapted to work with appropriately transformed data (for example, 2D images obtained from 3D faces).

Face recognition was first implemented through holistic representations that reveal low-dimensional representations of faces through certain assumptions about their distributions, such as linear subspaces, manifolds, and sparse representation [7]. This approach dominated facial recognition algorithms in the 1990s and 2000s. However, a well-known problem was that these theoretically plausible methods could not take into account uncontrolled variations in faces that differed from their original, reference representations. In the early 2000s, the solution to this problem led to the emergence of systems based on the selection of local features [8]. Gabor and LBP methods, as well as their multilevel and multidimensional improvements, achieved high efficiency due to some invariant properties of local filtering. However, such hand-crafted features suffered from a lack of compactness and from inefficiency. In the early 2010s, learning-based local descriptors were introduced for the first time, in which local filters were trained to be more discriminative. Be that as it may, all these shallow representations still suffered inevitable losses of effectiveness in situations of complex variation of the human face.

Summarizing the algorithms described above, classical recognition methods tried to recognize a human face by one- or two-level representations, such as filtering responses, histograms of feature codes, etc. The research community was intensively engaged in improving image preprocessing, local descriptors and feature transformations, but these approaches only slowly improved the quality of recognition. Even worse, the work on these methods was usually aimed at solving only one aspect of face variation: lighting, pose, facial expression or disguise. The general approach to solving such problems changed after the success of convolutional neural networks in 2012, when the AlexNet network won the ImageNet competition. The multilevel nature and depth of such networks demonstrated their invariance to angles and facial expressions, as well as to illumination. Successes in the task of face recognition soon followed, and in 2014 the DeepFace algorithm achieved state-of-the-art accuracy on the Labeled Faces in the Wild dataset, for the first time approaching human performance. With the development of technology and the increase in computational resources, deep-learning-based methods have become the main approach to recognition problems, and over time they have almost completely replaced the classical algorithms for the selection of facial features [9].
As part of this study, we used an algorithm based on deep convolutional neural networks for the facial recognition system. Specifically, we used a neural network based on the IResNet-50 architecture, trained on the public MS1MV2 dataset [10]. The network was trained with the CosFace loss function, which achieves state-of-the-art results when training deep neural networks for facial recognition systems [11].

For the anti-spoofing module to work, algorithms are needed that can determine whether a face belongs to a living person – a user of the system – or is a fake of some kind. Modern anti-spoofing methods can be classified on the basis of various criteria, such as the types of biometric signals used, whether additional devices are used, or whether interaction with the person is required [12–17]. The most commonly used approaches include tracking movements across a series of images, such as blinking of the eyes and small, involuntary movements of parts of the face and head. Although no additional devices are required for these methods, the algorithms may encounter problems, for example, when a short video of a real user is displayed or a photo is simply moved in front of the camera. A number of works give examples in which eye blinking and a certain degree of mouth movement can be modeled well using only two photographs [18]. Other commonly used algorithms rely on skin surface texture and facial relief information [19]. If special devices are available, images in the near-infrared range or thermal images can also be considered. Based on mimic signals and multimodal information (for example, a person's voice or gestures), algorithms can use various human reactions (for example, responses to a request to blink, smile or turn the head), but these methods may require additional devices.

Simpler methods use information from a single static image of the face. The problem here is that the appearance of a human face can change dramatically under different lighting conditions, and there are also many camera-related factors that affect image quality, which may make it difficult to distinguish images of a living person from images of photographs [20]. In general, a real human face differs from a face in a photo mainly by two criteria: (1) the real face is a 3D object, while the photo is 2D; (2) the surface texture of the real face and of the face in the photo are different. These two factors, along with others (such as the clarity of the photo print and the noise generated by the camera), usually lead to different images of the real face and of the face in the photo under the same capture conditions. In our work, the anti-spoofing module is a neural model based on the MobileNetV2 architecture, to which a face image obtained with the MTCNN detector and the adaptive padding principle is fed, i.e., an expanded face area is transmitted to the model [21].
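For illustration, the sketch below shows an anti-spoof classifier in the spirit of the module just described: a face bounding box is expanded ("adaptive padding") and the crop is fed to a MobileNetV2 backbone with a two-class live/fake head. The detector output is assumed to come from an MTCNN-like model, and the padding ratio, input size and class ordering are assumptions rather than the authors' exact configuration.

```python
# Illustrative anti-spoof classifier: expanded face crop -> MobileNetV2 with a
# two-class head. Not the authors' training code; weights, padding ratio and
# the "live" class index are assumptions.
import torch
import torch.nn as nn
from torchvision import models, transforms
from PIL import Image

def expand_box(box, img_w, img_h, pad=0.3):
    """Enlarge an (x1, y1, x2, y2) face box by `pad` of its size, clipped to the image."""
    x1, y1, x2, y2 = box
    w, h = x2 - x1, y2 - y1
    return (max(0, int(x1 - pad * w)), max(0, int(y1 - pad * h)),
            min(img_w, int(x2 + pad * w)), min(img_h, int(y2 + pad * h)))

class AntiSpoofNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.backbone = models.mobilenet_v2(weights=None)
        # Replace the final classifier layer with a two-class (live / fake) head.
        self.backbone.classifier[1] = nn.Linear(self.backbone.last_channel, 2)

    def forward(self, x):
        return self.backbone(x)

preprocess = transforms.Compose([transforms.Resize((224, 224)), transforms.ToTensor()])

def liveness_probability(model: AntiSpoofNet, image: Image.Image, box) -> float:
    """`box` is assumed to come from an MTCNN-style face detector."""
    crop = image.crop(expand_box(box, *image.size))
    model.eval()
    with torch.no_grad():
        logits = model(preprocess(crop).unsqueeze(0))
    return torch.softmax(logits, dim=1)[0, 0].item()   # index 0 assumed to be "live"
```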

2 Methodology

2.1 Tasks of Face Recognition and Anti-spoofing

In order to formulate our hypothesis, it is necessary first to determine the mathematical formulations of the face anti-spoofing problem and the face recognition problem.


We define the face recognition task as an encoder task: the encoder translates the input signal, in this case the image of a face, into its numerical representation, the vector $h$:

\[ h = g(x). \tag{1} \]

To determine the similarity of two faces, we use the cosine similarity metric, i.e. the cosine of the angle between the two face vectors $h_1$ and $h_2$. If the cosine similarity exceeds a certain threshold set by us, we say that the faces belong to the same person.

The task of face anti-spoofing is defined as the task of classifying facial images into genuine and fake. After training on a labelled data set

\[ D = \{(x, y) : x \in \mathbb{R}^{H \times W \times 3},\ y \in \{0, 1\}\}, \tag{2} \]

the resulting model $f$ is such that

\[ f(\mathrm{image}) = \mathrm{prediction}, \tag{3} \]

where

\[ \mathrm{image} \in \mathbb{R}^{H \times W \times 3}. \tag{4} \]
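For readability, the two decision rules of this section can be written compactly as indicator functions; the threshold symbols $\tau_{\mathrm{rec}}$ and $\tau_{\mathrm{spoof}}$ are introduced here only as notation and do not appear in the original formulation.

```latex
% Compact form of the decision rules implied by (1)-(4); \tau_{rec} and
% \tau_{spoof} denote the recognition and anti-spoofing thresholds.
\[
\operatorname{same}(x_1, x_2) =
  \mathbb{1}\!\left[\frac{\langle g(x_1),\, g(x_2)\rangle}
  {\lVert g(x_1)\rVert \, \lVert g(x_2)\rVert} \ge \tau_{\mathrm{rec}}\right],
\qquad
\operatorname{live}(x) = \mathbb{1}\!\left[f(x) \ge \tau_{\mathrm{spoof}}\right].
\]
```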

A certain threshold is also set, the exceeding of which means that the face belongs to a living person. In general, the task is to find a function that classifies the input facial images as genuine or fake as accurately as possible. Within the framework of this work, we want to answer the question of whether failing to exceed the threshold in the anti-spoofing task also makes it likely that the threshold in the face recognition task is not exceeded, and vice versa: if the subject's face was not determined to be alive, is it also likely to go unrecognized by the system, or is there no such relationship?

2.2 Data Sets Used

In order to test our hypothesis, we used two publicly available datasets as test data: NUAA and CASIA-FASD [22, 23]. Using several sets at once allows us to draw conclusions that do not depend on the image quality, type of lighting, available types of attacks and other characteristics specific to a particular data set.

The NUAA dataset contains an extensive database of more than 50 thousand photos of 15 subjects. As the authors themselves describe it, the data set was created from photographs obtained using a conventional cheap webcam. The data were collected over three sessions with an interval of about two weeks between sessions; the location and conditions of each session were different. This approach allows results to be obtained that are independent of the specific type of lighting or of the interior surrounding each of the 15 subjects. During each session, images of both the data collection participants themselves and of their photographs were obtained. When shooting the images, each participant
was asked to look at the webcam full-face, with a neutral facial expression and without movements noticeable to a person, such as, for example, blinking or turning the head. In other words, the authors tried to make a living person look as much like his own photo as possible and vice versa. To collect images of photographs, the authors first took a high-quality photograph for each object using a conventional Canon camera so that the face area occupied at least 2/3 of the entire photo area, and then developed photographs on small and larger photo paper. The set itself contains 3362 images of living people obtained in the frames of the third session, as well as 5761 images of photographs also obtained in the framework of session number 3. As part of the work on the CASIA-FASD dataset, the researchers placed special emphasis on the issue of the effect of image quality on the success of anti-spoofing attacks. This problem has not been previously updated by other authors of data sets, because in their research they focused directly on fake faces and their variations. For example, in the NUAA dataset described above, the researchers relied only on attacks carried out using photographs, and in the Idiap dataset, the problem of fixing photos in front of the anti-spoofing module or holding them directly by the person conducting the attack was considered. In a number of studies, the authors referred to cases when cutouts were made on a fake image, allowing attackers to make some movements in the face area. As part of the work on the CASIA-FASD dataset, the authors wonder how anti-spoofing modules will work on images of good and poor quality, since this directly affects the performance of such models. For simplicity, the authors empirically determine the quality based more on the subjective perception of the preservation of facial textures than on strict quantitative indicators. To collect data, they used three different cameras to record data of different quality. The low-quality video was shot using a long-used USB camera, because according to the authors, long-term use always worsens the image quality. Thus, the width and height of the image for low-quality video were 640 and 480 pixels, respectively. The standard quality video was shot with a new USB camera, which allowed to preserve the original quality of its image. The width and height for this type of shooting are similar to low-quality videos. To obtain high-quality videos, the authors used a Sony NEX-5 camera, the maximum resolution of which is 1920 by 1080 pixels. The following three types of fake attacks were developed for the data set: a distorted photo attack, a photo attack using cut-out fragments, and a video attack. To carry out the first two types of attacks, the researchers used high-quality printed photographs of subjects. With a distorted photo attack, the attacker intentionally distorts the photo picture, trying to imitate the movement and shape of a human face. In the conditions of a photo attack using cut-out fragments, the attacker uses cutouts in the photo in the eye and tongue area to imitate the relief of a human face. The video attack was carried out using a mobile device, which showed high-quality photos of subjects. To test our hypothesis, we used a test dataset containing various images of 30 subjects.

3 Results and Discussion

A pipeline consisting of a Single Shot Detector based on the ResNet-10 architecture and the face recognition model based on the IResNet-50 architecture was used as the base pipeline for both data sets. Due to some features of a number of images, such as the
presence of glasses or poor lighting or large distortion of fake images of faces, not all images of the datasets were successfully recognized. A link to the list of such images is attached. Based on all successfully recognized faces of living subjects and fake photos of their faces, pairwise comparisons of vector representations were carried out with the calculation of cosine similarity between them. Also, for each fake face, the probability was found that the face in the photo belongs to a living person. The results of the ratios of these values can be found in Figs. 2 and 3.

Fig. 2. The relationship between the probability of a spoof attack and belonging to the same person on the NUAA dataset.

According to the results of the experiment on the NUAA dataset, we see that the probability of a spoof attack is only very weakly related to the probability that the face belongs to the same person. The correlation coefficient between these two values was only –0.04 for an attack with the face of the same person and –0.01 for an attack with the face of another person. Thus, correlation and relationship between these values is absent.

In the case of the CASIA-FASD dataset, a number of images also could not be recognized by our pipeline; the reason for this was the strong distortion of the fake face photos. The same technique was applied to the computed embeddings, namely, the pairwise calculation of the cosine similarity between all living persons and the fakes. In addition, for all images selected for verification, the probability of the face belonging to a living person was also computed. The results, analogous to those for the NUAA-Imposter dataset, can be found in Figs. 4 and 5. According to the results of the experiment on this data set, we see that the probability of a spoof attack is again only very weakly related to the probability that the face belongs to the same person. The correlation coefficient between these two values was only –0.03 for an attack with the face of the same person and –0.01 for an attack with the face of another person. Thus, correlation and interrelation between these values is absent here as well, as in the case of the NUAA-Imposter dataset.

Fig. 3. The relationship between the probability of a spoof attack and belonging to another person on the NUAA dataset.

Fig. 4. The relationship between the probability of a spoof attack and belonging to the same person on the CASIA-FASD data set.


Fig. 5. The relationship between the probability of a spoof attack and belonging to another person on a data set CASIA-FASD.
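As a sketch of the comparison reported in this section, the snippet below computes the Pearson correlation between anti-spoof ("liveness") probabilities and embedding cosine similarities with NumPy. The score arrays are random placeholders, not the paper's data; coefficients near zero indicate the absence of a linear relationship, as observed above.

```python
# Illustrative correlation check between liveness scores and cosine similarities.
import numpy as np

rng = np.random.default_rng(0)
liveness_scores = rng.uniform(0.0, 1.0, size=1000)        # anti-spoof probabilities (placeholder)
cosine_similarities = rng.uniform(-1.0, 1.0, size=1000)   # embedding similarities (placeholder)

r = np.corrcoef(liveness_scores, cosine_similarities)[0, 1]
print(f"Pearson correlation: {r:.3f}")    # values near 0 -> no linear relationship
```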

4 Conclusion

In the framework of this work, we have shown that there is no relationship between the results of the anti-spoofing module and those of the facial recognition module. A subject who did not exceed the admission threshold based on the determination of face liveliness could be either recognized or not recognized by the subsequent modules of the system. The use of multiple data sets allows us to confirm the results of our research with confidence. In the future, we plan to continue exploring the possibility of improving the anti-spoofing module and using it for various tasks by expanding our experiments. Of particular interest is the problem of choosing the optimal threshold that maintains a high degree of security without a large number of rejections of living people.

References

1. Wang, M., Deng, W.: Deep face recognition: a survey. Neurocomputing 429, 215–244 (2021). https://doi.org/10.1016/j.neucom.2020.10.081
2. Firc, A., Malinka, K., Hanáček, P.: Deepfakes as a threat to a speaker and facial recognition: an overview of tools and attack vectors. Heliyon 9, e15090 (2023). https://doi.org/10.1016/j.heliyon.2023.e15090
3. Sivapriyan, R., Pavan Kumar, N., Suresh, H.L.: Analysis of facial recognition techniques. In: Materials Today: Proceedings, vol. 57, pp. 2350–2354 (2022). https://doi.org/10.1016/j.matpr.2022.01.296
4. Hassani, A., Malik, H.: Securing facial recognition: the new spoofs and solutions. Biometric Technol. Today 2021, 5–9 (2021). https://doi.org/10.1016/S0969-4765(21)00059-X
5. Wang, G., et al.: Silicone mask face anti-spoofing detection based on visual saliency and facial motion. Neurocomputing 458, 416–427 (2021). https://doi.org/10.1016/j.neucom.2021.06.033
6. Deng, W., Hu, J., Lu, J., Guo, J.: Transform–invariant PCA: a unified approach to fully automatic facealignment, representation, and recognition. IEEE Trans. Pattern Anal. Mach. In-tell 36, 1275–1284 (2014). https://doi.org/10.1109/TPAMI.2013.194 7. Yang, X., et al.: Stable and compact face recognition via unlabeled data driven sparse representation–based classification. Signal Process. Image Commun. 111, 116889 (2023). https:// doi.org/10.1016/j.image.2022.116889 8. Ahonen, T., Hadid, A., Pietikainen, M.: Face description with local binary patterns: application to face recognition. IEEE Trans. Pattern Anal. Mach. Intell. 28, 2037–2041 (2006). https:// doi.org/10.1109/TPAMI.2006.244 9. Deng, J., Guo, J., Xue, N., Zafeiriou, S.: ArcFace: additive angular margin loss for deep face recognition. In: IEEE/CVF Computer Vision and Pattern Recognition, pp. 4685–4694 (2019). https://doi.org/10.1109/CVPR.2019.00482 10. Liu, W., et al.: Sphereface: deep hypersphere embedding for face recognition. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 212–220 (2017). https://doi.org/10. 1109/CVPR.2017.713 11. Wang, H., et al.: Cosface: large margin cosine loss for deep face recognition. In: IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5265–5274 (2018). https://doi. org/10.1109/CVPR.2018.00552 12. Wong, K.-W., et al.: A robust scheme for live detection of human faces in color images. Sign. Process. Image Commun. 18, 103–114 (2003). https://doi.org/10.1016/S0923-5965(02)000 88-7 13. Wang, L., Ding, X., Fang, C.: Face live detection method based on physiological motion analysis. Tsinghua Sci. Technol. 14, 685–690 (2009). https://doi.org/10.1016/S1007-0214(09)701 35-X 14. Shu, X., et al.: Face spoofing detection based on multi–scale color inversion dual–stream convolutional neural network. Expert Syst. Appl. 224, 119988 (2023). https://doi.org/10. 1016/j.eswa.2023.119988 15. Pei, M., Yan, B., Hao, H., Zhao, M.: Person-specific face spoofing detection based on a siamese network. Pattern Recogn. 135, 109148 (2023). https://doi.org/10.1016/j.patcog.2022.109148 16. Chang, H.–H., Yeh, C.–H.: Face anti–spoofing detection based on multi–scale image quality assessment. Image Vision Comput. 121, 104428 (2022). https://doi.org/10.1016/j.ima-vis. 2022.104428 17. Chen, S., et al.: A simple and effective patch–based method for frame–level face anti–spoofing. Pattern Recogn. Lett. 171, 1–7 (2023). https://doi.org/10.1016/j.patrec.2023.04.011 18. Kumar, S., Singh, S., Kumar, J.: A comparative study on face spoofing attacks. In: International Conference on Computing, Communication and Automation (ICCCA), pp. 1104–1108 (2017). https://doi.org/10.1109/CCAA.2017.8229961 19. Boulkenafet, Z., Komulainen, J., Hadid A.: Face anti–spoofing based on color texture analysis. In: IEEE International Conference on Image Processing (ICIP), pp. 2636–2640 (2015). https:// doi.org/10.1109/ICIP.2015.7351280 20. Dear, M., Harrison, W.: The influence of visual distortion on face recognition. Cortex 146, 238–249 (2022). https://doi.org/10.1016/j.cortex.2021.10.008 21. Sandler, M., et al.: Mobilenetv2: inverted residuals and linear bottlenecks. IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 4510–4520 (2018). https://doi.org/ 10.1109/CVPR.2018.00474 22. Tan, X., et al.: Face liveness detection from a single image with sparse low rank bilinear discriminative model. Comput. Vis. ECCV 6316, 504–517 (2010). https://doi.org/10.1007/ 978-3-642-15567-3_37 23. 
Zhang, Z., et al.: A face anti-spoofing database with diverse attacks. In: 5th IAPR International Conference on Biometrics (ICB), pp. 26–31 (2012). https://doi.org/10.1109/ICB.2012.619 9754

Cooperative Cost Sharing of Joint Working Capital in Financial Supply Networks

Vladislav Novikov1(B), Nikolay Zenkevich2, and Andrey Zyatchin2

1 Graduate School of Management, St. Petersburg University, 3, Volkhovsky Per., Saint-Petersburg 199034, Russian Federation
[email protected]
2 Graduate School of Management, Operations Management Department, St. Petersburg University, 3, Volkhovsky Per., Saint-Petersburg 199034, Russian Federation
{zenkevich,zyatchin}@gsom.spbu.ru

Abstract. The paper introduces a cooperative cost sharing solution for joint working capital in a financial supply chain, represented by a network with two suppliers, one distributor, and two retailers. The problem is investigated as a multi-agent system – a cooperative game of five players. It is assumed that players may use financial tools such as factoring, reverse factoring, and inventory financing. The availability and implementation of a tool depend on the structure of a coalition: suppliers playing alone may use only inventory financing. When a supplier and a distributor form a coalition, the supplier may use factoring and the distributor may use reverse factoring in addition to inventory financing. In a coalition of a supplier, a distributor, and a retailer, the supplier may use factoring, the distributor may use factoring and reverse factoring, and the retailer may use reverse factoring in addition to inventory financing. For this problem, a cooperative game has been constructed to distribute the costs of joint working capital. Numerical calculations show that the Shapley value can be considered as a cooperative cost sharing solution, since it belongs to the C-core of the cooperative game. #COMESYSO1120

Keywords: Supply Chain Network · Working Capital Management · Financial Supply Chain · Cooperative Solutions · C-core

1 Financial Supply Chain and its Management Tools

1.1 Cooperation in Financial Supply Chains

Oliver and Webber in their paper [11] discussed the potential advantages of integrating business processes such as manufacturing, procurement, distribution, and sales operations, which led to the term 'Supply Chain Management'. The main flows in supply chains, which are usually investigated together, are material, information, and financial. Focus on the financial flow led to the investigation of the financial supply chain (FSC) and financial supply chain management (FSCM) [1]. In FSCM, several tools for cooperation can be implemented. Some of them are internal and depend on the decision of a single company,


while others are external and depend on the partners in a coalition. Widely used management tools for FSCM are factoring [9, 10], reverse factoring [9, 21], inventory financing [2, 6], dynamic or invoice discounting [4, 17], revenue-sharing contracts, vendor-managed inventory (VMI) [3, 20], and consignment stock [5, 18]. Factoring requires the interaction of three parties: a seller, a buyer, and a factor. The actions within a factoring agreement are represented in Fig. 1.

Fig. 1. Actions within a factoring agreement.

Inventory financing does not require a direct agreement between a supplier and a buyer, but it does require an agreement between the supplier and a logistics service provider (LSP), Fig. 2.

Fig. 2. Scheme of the inventory financing solution.

Businesses with a limited customer base or a high concentration of customers, which is already seen as a high credit risk, may not have access to dynamic/invoice discounting. Revenue-sharing contracts require planning and coordination between the parties involved; in addition, they may pose risks to all stakeholders, as the revenue generated may fluctuate with demand and other variables. VMI involves the supplier taking control of inventory levels, which restricts the customer's ability to manage its own inventory; additionally, if the supplier fails to manage inventory levels effectively, shortages or excess supply of goods can result. Consignment stock can impose restrictions on a customer's inventory management capabilities and carries risks, as the supplier retains ownership of the inventory until it is sold. As a result, the following tools are chosen as controls in this research: factoring, reverse factoring, and inventory financing.


1.2 Collaborative Cash Conversion Cycle Formalization

The definition of working capital has been proposed in several forms by different authors. For instance, Jones [8] and Pass and Pike [12] define working capital as follows:

Working capital = Current assets − Current liabilities.

Pirtilla et al. [14] formulate working capital taking into account operational performance:

Working capital = Inventories + Accounts receivable − Accounts payable.

Consider tools for working capital management (WCM). Richards and Laughlin [16] proposed the concept of the cash conversion cycle (CCC). The cash conversion cycle measures the time it takes for the cash invested in raw materials to be recovered through customer purchases. The CCC is made up of three elements: DIO – days inventory outstanding, DRO – days receivables outstanding, and DPO – days payables outstanding. The dependence between these elements is represented in Fig. 3.

Fig. 3. Cash conversion cycle.

Therefore: CCC = DIO + DRO − DPO. Such an approach is applicable to a single company. Hofmann and Kotzab [7] introduced the concept of the collaborative cash conversion cycle (CCCC) for a supply chain. In this case the CCCC has the following form:

CCCC = \sum_{i=1}^{n} DIO_i + DRO_n − DPO_1.
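As a small illustration of these definitions (not from the paper), the following Python sketch computes the CCC of a single company and the CCCC of a chain; the sample values are taken from Table 2 of the numerical example below (Supplier 1, the distributor, and Retailer 1):

def ccc(dio, dro, dpo):
    """Cash conversion cycle of a single company, in days."""
    return dio + dro - dpo

def cccc(dio_list, dro_last, dpo_first):
    """Collaborative cash conversion cycle of a chain: the sum of all members'
    DIO plus the DRO of the last member minus the DPO of the first member."""
    return sum(dio_list) + dro_last - dpo_first

# Values taken from Table 2 of the numerical example in Sect. 2.4.
print(ccc(550.84, 131.01, 116.79))                      # Supplier 1
print(cccc([550.84, 539.57, 510.64], 209.66, 116.79))   # Supplier 1 -> Distributor -> Retailer 1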


2 Reduction and Allocation of Joint Working Capital Costs

2.1 Working Capital Costs for a Single Company

Viskari and Karri [19] proposed the following approach to calculating the working capital cost of a company:

FC = Inv × ((1 + c)^{DIO/365} − 1) + AR × ((1 + c)^{DRO/365} − 1) − AP × ((1 + c)^{DPO/365} − 1),

where:
Inv – value of inventory at the end of the year,
AR – value of accounts receivable at the end of the year,
AP – value of accounts payable at the end of the year,
DIO, DRO, DPO – elements of the CCC,
c – cost of capital.

If a supply chain with n participants is considered, then the cost for the entire chain has the following form:

FC_SC = \sum_{i=1}^{n} FC_i.   (1)
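A minimal Python sketch of this cost formula and of formula (1); the function and argument names are our own, not the authors':

def working_capital_cost(inv, ar, ap, dio, dro, dpo, c):
    """FC = Inv*((1+c)^(DIO/365) - 1) + AR*((1+c)^(DRO/365) - 1) - AP*((1+c)^(DPO/365) - 1)."""
    return (inv * ((1 + c) ** (dio / 365) - 1)
            + ar * ((1 + c) ** (dro / 365) - 1)
            - ap * ((1 + c) ** (dpo / 365) - 1))

def supply_chain_cost(members):
    """Formula (1): the chain cost is the sum of the members' costs."""
    return sum(working_capital_cost(**m) for m in members)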

In the previous section, factoring, reverse factoring, and inventory financing were chosen as the tools for working capital management. Consider the corresponding costs of implementing these tools.

Factoring can be used in a coalition of players from two successive levels of a supply chain network. For instance, in a coalition of a supplier (level 1) and a buyer (level 2), the supplier may use factoring. The factoring cost has the following form:

FC_F = x × AAR^0_1 × DPO^0_2 × r(f)_{daily},

where:
x – the fraction of the total financial flow between the two companies (players) that is factored,
AAR^0_1 – value of the average accounts receivable of the first player (supplier) before the implementation of factoring,
DPO^0_2 – days payables outstanding of the second player (buyer) before the implementation of factoring,
r(f)_{daily} – the factor's daily rate for services.

The values of accounts receivable and days receivables outstanding are altered by factoring, because factoring allows the supplier to immediately convert receivables into cash; therefore:

AAR^1_1 = AAR^0_1 × (1 − x),
DRO^1_1 = DRO^0_1 × (1 − x),


where:
AAR^1_1 – value of the average accounts receivable of the first player (supplier) after the implementation of factoring,
DRO^1_1 – days receivables outstanding of the first player (supplier) after the implementation of factoring.

Reverse factoring, like factoring, can be used in a coalition of players from two successive levels of a supply chain network, but in a coalition of a supplier (level 1) and a buyer (level 2) reverse factoring can be implemented only by the buyer. The reverse factoring cost has the following form:

FC_RF = y × AAP^0_2 × DPO^0_2 × r(rf)_{daily},

where:
y – the fraction of the total financial flow between the two companies that is reverse factored by the buyer,
AAP^0_2 – value of the average accounts payable of the second player (buyer) before the implementation of reverse factoring,
DPO^0_2 – days payables outstanding of the second player (buyer) before the implementation of reverse factoring,
r(rf)_{daily} – the factor's daily rate for services.

Reverse factoring enables the buyer to manage its payment terms effectively, so it is reasonable to assume that the buyer immediately settles the factored amount. As a result, the buyer's accounts payable and days payables outstanding are reduced for the portion that has been factored; therefore:

AAP^1_2 = AAP^0_2 × (1 − y),
DPO^1_2 = DPO^0_2 × (1 − y),
AAR^1_1 = AAR^0_1 × (1 − y),
DRO^1_1 = DRO^0_1 × (1 − y),

where:
AAP^1_2 – value of the accounts payable of the second player (buyer) after using reverse factoring,
DPO^1_2 – days payables outstanding of the second player (buyer) after using reverse factoring,
AAR^1_1 – value of the accounts receivable of the first player (supplier) after using reverse factoring,
DRO^1_1 – days receivables outstanding of the first player (supplier) after using reverse factoring.

Inventory financing can be used both in a coalition of players and by a single player. The inventory financing cost has the following form:

FC_IF = z × AI^0_1 × DIO^0_1 × r(IF)_{daily},

where:
z – the fraction of the average value of the player's (supplier's) inventory that is financed,


AI^0_1 – value of the average inventory of the player (supplier) before the implementation of inventory financing,
DIO^0_1 – days inventory outstanding of the player (supplier) before the implementation of inventory financing,
r(IF)_{daily} – the LSP's daily rate for services.

Using inventory financing does not affect the buyer's inventory values or days inventory outstanding. However, when the supplier transfers inventory from its warehouse to a third party, the average inventory value and days inventory outstanding change; therefore:

AI^1_1 = AI^0_1 × (1 − z),
DIO^1_1 = DIO^0_1 × (1 − z),

where:
AI^1_1 – average inventory of the player (supplier) after the implementation of inventory financing,
DIO^1_1 – days inventory outstanding of the player (supplier) after the implementation of inventory financing.

2.2 Game-Theoretical Approach for Joint Working Capital Cost Sharing

Consider a game of n players and coalitions S_1, S_2: S_1 ⊆ N, S_2 ⊆ N, S_1 ∩ S_2 = ∅, where N is the set of all players [13]. A characteristic function is a function defined for all coalitions S_1 ⊆ N, S_2 ⊆ N, S_1 ∩ S_2 = ∅, that meets superadditivity conditions of the following form:

v(S_1) + v(S_2) ≤ v(S_1 ∪ S_2), v(∅) = 0.

Consider a supply chain and the following value:

FC^0_Total − FC^1_Total,

where:
FC^0_Total – total costs of the network before the cooperation of the players of a coalition S,
FC^1_Total – total costs of the network after the cooperation in the coalition.

In this case, the benefit for the supply chain members (players) is the amount of money saved through the use of a tool (factoring, reverse factoring, or inventory financing), and these cost savings are distributed among the coalition members. As a result, consider a function v(S) of the following form:

v(S) = max(FC^0_Total − FC^1_Total).

According to formula (1) we have:

FC^1_Total(S) = \sum_{i=1}^{n} (FC^1_i + FC_{F,i} + FC_{RF,i} + FC_{IF,i}),


where:
i – the index of a member of coalition S,
n – the number of members of coalition S,
FC^1_i – working capital costs of member i,
FC_{F,i} – cost of using the factoring solution for member i,
FC_{RF,i} – cost of using the reverse factoring solution for member i,
FC_{IF,i} – cost of using the inventory financing solution for member i.

Consider a supply chain with the structure presented in Fig. 4. Each player is assigned a unique number based on their level and their position within that level. For instance, the top supplier is labeled 1.1, while the lower retailer is denoted 3.2.

Fig. 4. A supply chain network.

Several restrictions arise from the assumption that the network is isolated, i.e., no other companies are connected to those represented in Fig. 4:

Restriction 1. AAR_Distributor = Σ AAP_Retailers;
Restriction 2. AAP_Distributor = Σ AAR_Suppliers;
Restriction 3.1. Since there are no companies before the suppliers, they cannot use reverse factoring.
Restriction 3.2. Since there are no companies after the retailers, they cannot use regular factoring.
Restriction 4. If member 2.1 is not in the coalition, the use of factoring or reverse factoring is impossible.

In the context of this research, the term "management of working capital costs" can be interpreted in various ways. For the purpose of this study, the focus is on the leverage used, specifically the shares of accounts receivable, accounts payable, and inventory, which are considered as instruments. Therefore, the variables of the model are x (factoring), y (reverse factoring), and z (inventory financing). This gives the fifth restriction:

Restriction 5. 0 ≤ x ≤ 1; 0 ≤ y ≤ 1; 0 ≤ z ≤ 1, where 0 means not using the instrument at all and 1 means attributing the entire amount of AR, AP, or AI.


Consider functions of the following form:

F_{i.j}(z_{i.j}) = EI^0_{i.j} ((1 + c_{i.j})^{DIO^0_{i.j}/365} − 1) − EI^1_{i.j} ((1 + c_{i.j})^{DIO^1_{i.j}/365} − 1) − z_{i.j} AI^0_{i.j} DIO^0_{i.j} r(IF)_{daily},

P_{i.j}(y_{i.j}) = −EAP^0_{i.j} ((1 + c_{i.j})^{DPO^0_{i.j}/365} − 1) + EAP^1_{i.j} ((1 + c_{i.j})^{DPO^1_{i.j}/365} − 1) − y_{i.j} AAP^0_{i.j} DPO^0_{i.j} r(rf)_{daily},

R_{1.j}(x_{1.j}) = EAR^0_{1.j} ((1 + c_{1.j})^{DRO^0_{1.j}/365} − 1) − EAR^1_{1.j} ((1 + c_{1.j})^{DRO^1_{1.j}/365} − 1) − x_{1.j} AAR^0_{1.j} DPO^0_{2.1} r(f)_{daily},

R_{2.1}(x_{2.1}) = EAR^0_{2.1} ((1 + c_{2.1})^{DRO^0_{2.1}/365} − 1) − EAR^1_{2.1} ((1 + c_{2.1})^{DRO^1_{2.1}/365} − 1) − x_{2.1} AAR^0_{2.1} DRO^0_{2.1} r(f)_{daily},

where i is the number of the level in the supply chain (i = 1 corresponds to the suppliers, i = 2 to the distributor, i = 3 to the retailers), and j is the number of the player at the corresponding level.

As mentioned above, DIO^1_{i.j} = DIO^0_{i.j}(1 − z_{i.j}). The average inventory before cooperation can be found as AI^0_{i.j} = (BI^0_{i.j} + EI^0_{i.j})/2, hence EI^0_{i.j} = 2AI^0_{i.j} − BI^0_{i.j}. The same calculation can be used for the average inventory after cooperation: AI^1_{i.j} = (BI^0_{i.j} + EI^1_{i.j})/2 and AI^1_{i.j} = AI^0_{i.j}(1 − z_{i.j}), therefore EI^1_{i.j} = 2AI^0_{i.j}(1 − z_{i.j}) − BI^0_{i.j}. As a result, F_{i.j}(z_{i.j}) is:

F_{i.j}(z_{i.j}) = (2AI^0_{i.j} − BI^0_{i.j}) ((1 + c_{i.j})^{DIO^0_{i.j}/365} − 1) − (2AI^0_{i.j}(1 − z_{i.j}) − BI^0_{i.j}) ((1 + c_{i.j})^{DIO^0_{i.j}(1 − z_{i.j})/365} − 1) − z_{i.j} AI^0_{i.j} DIO^0_{i.j} r(IF)_{daily}.
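For concreteness, a minimal Python sketch of the inventory financing saving F_{i.j}(z_{i.j}) as expanded above (an illustration under the notation above, not the authors' implementation); the function name is ours, and the sample call uses Supplier 1's figures from the numerical example in Sect. 2.4 (Tables 1–3) with an arbitrary z = 0.5:

def inventory_financing_saving(z, ai0, bi0, dio0, c, r_if_daily):
    """F(z): old inventory carrying cost minus new carrying cost minus the LSP fee."""
    ei0 = 2 * ai0 - bi0                 # ending inventory before cooperation
    ei1 = 2 * ai0 * (1 - z) - bi0       # ending inventory after financing a share z
    dio1 = dio0 * (1 - z)               # days inventory outstanding after financing
    old_cost = ei0 * ((1 + c) ** (dio0 / 365) - 1)
    new_cost = ei1 * ((1 + c) ** (dio1 / 365) - 1)
    fee = z * ai0 * dio0 * r_if_daily
    return old_cost - new_cost - fee

# Supplier 1: AI = 2563, BI = 2345, DIO = 550.84, c = 8.4%, r(IF)_daily = 0.0241%.
print(inventory_financing_saving(0.5, 2563, 2345, 550.84, 0.084, 0.000241))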

By implementing the same idea, the function P_{i.j}(y_{i.j}) can be represented as follows:

P_{i.j}(y_{i.j}) = −(2AAP^0_{i.j} − BAP^0_{i.j}) ((1 + c_{i.j})^{DPO^0_{i.j}/365} − 1) + (2AAP^0_{i.j}(1 − y_{i.j}) − BAP^0_{i.j}) ((1 + c_{i.j})^{DPO^0_{i.j}(1 − y_{i.j})/365} − 1) − y_{i.j} AAP^0_{i.j} DPO^0_{i.j} r(rf)_{daily}.

Consider FC_F for a supplier and for the distributor. For the suppliers, the term DPO^0_{2.1} of the distributor should be used in the function FC_F, since the distributor will pay the same as before; therefore the cost function for the suppliers takes the form:

FC_{F,1.j}(x_{1.j}) = x_{1.j} AAR^0_{1.j} DPO^0_{2.1} r(f)_{daily}, j = 1, 2,

therefore

R_{1.j}(x_{1.j}) = (2AAR^0_{1.j} − BAR^0_{1.j}) ((1 + c_{1.j})^{DRO^0_{1.j}/365} − 1) − (2AAR^0_{1.j}(1 − x_{1.j}) − BAR^0_{1.j}) ((1 + c_{1.j})^{DRO^1_{1.j}/365} − 1) − x_{1.j} AAR^0_{1.j} DPO^0_{2.1} r(f)_{daily}.

For the distributor, the term DRO^0_{2.1} should be used in the function FC_F, since it reflects how the distributor was paid by the retailers before using the tool and can therefore serve as a metric to calculate the cost. The cost function for the distributor thus takes the form:

FC_{F,2.1}(x_{2.1}) = x_{2.1} AAR^0_{2.1} DRO^0_{2.1} r(f)_{daily},

therefore

R_{2.1}(x_{2.1}) = (2AAR^0_{2.1} − BAR^0_{2.1}) ((1 + c_{2.1})^{DRO^0_{2.1}/365} − 1) − (2AAR^0_{2.1}(1 − x_{2.1}) − BAR^0_{2.1}) ((1 + c_{2.1})^{DRO^1_{2.1}/365} − 1) − x_{2.1} AAR^0_{2.1} DRO^0_{2.1} r(f)_{daily}.
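A companion sketch (illustrative only, mirroring the expansions above) for the factoring saving R and the reverse factoring saving P; it assumes the simple adjustment rules DRO^1 = DRO^0(1 − x) and DPO^1 = DPO^0(1 − y), and all function and parameter names are our own:

def factoring_saving(x, aar0, bar0, dro0, payer_dpo0, c, r_f_daily):
    """R(x): reduction in receivables carrying cost minus the factor's fee.
    payer_dpo0 is the paying counterparty's term used for the fee
    (DPO^0_2.1 for the suppliers, DRO^0_2.1 for the distributor, per the text above)."""
    ear0 = 2 * aar0 - bar0
    ear1 = 2 * aar0 * (1 - x) - bar0
    dro1 = dro0 * (1 - x)
    return (ear0 * ((1 + c) ** (dro0 / 365) - 1)
            - ear1 * ((1 + c) ** (dro1 / 365) - 1)
            - x * aar0 * payer_dpo0 * r_f_daily)

def reverse_factoring_saving(y, aap0, bap0, dpo0, c, r_rf_daily):
    """P(y): change in the payables term minus the reverse factoring fee."""
    eap0 = 2 * aap0 - bap0
    eap1 = 2 * aap0 * (1 - y) - bap0
    dpo1 = dpo0 * (1 - y)
    return (-eap0 * ((1 + c) ** (dpo0 / 365) - 1)
            + eap1 * ((1 + c) ** (dpo1 / 365) - 1)
            - y * aap0 * dpo0 * r_rf_daily)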

2.3 Coalition Payoffs

Consider a coalition consisting of the first supplier of the first level. This player may implement only one tool – inventory financing. Therefore, the function v(S) for the coalition S = {1.1} has the following form:

v(S) = max F_{1.1}(z_{1.1}).

Consider a coalition of supplier 1.1 and distributor 2.1. Such a coalition is outlined with a red frame in Fig. 5.

Fig. 5. Coalition S = {1.1, 2.1}.


In such a case S = {1.1, 2.1}, and

v(S) = max(FC^0_S − (FC^1_S(x_{1.1}, y_{2.1}, z_{1.1}, z_{2.1}) + FC_{IF,1.1}(z_{1.1}) + FC_{IF,2.1}(z_{2.1}) + FC_{F,1.1}(x_{1.1}) + FC_{RF,2.1}(y_{2.1})))
= max(EI^0_{1.1} ((1 + c_{1.1})^{DIO^0_{1.1}/365} − 1) + EAR^0_{1.1} ((1 + c_{1.1})^{DRO^0_{1.1}/365} − 1)
+ EI^0_{2.1} ((1 + c_{2.1})^{DIO^0_{2.1}/365} − 1) − EAP^0_{2.1} ((1 + c_{2.1})^{DPO^0_{2.1}/365} − 1)
− (EI^1_{1.1} ((1 + c_{1.1})^{DIO^1_{1.1}/365} − 1) + EAR^1_{1.1} ((1 + c_{1.1})^{DRO^1_{1.1}/365} − 1)
+ EI^1_{2.1} ((1 + c_{2.1})^{DIO^1_{2.1}/365} − 1) − EAP^1_{2.1} ((1 + c_{2.1})^{DPO^1_{2.1}/365} − 1)
+ z_{1.1} AI^0_{1.1} DIO^0_{1.1} r(IF)_{daily} + z_{2.1} AI^0_{2.1} DIO^0_{2.1} r(IF)_{daily}
+ x_{1.1} AAR^0_{1.1} DPO^0_{2.1} r(f)_{daily} + y_{2.1} AAP^0_{2.1} DPO^0_{2.1} r(rf)_{daily})).

Taking into account the previously introduced notation for the functions F_{i.j}(z_{i.j}), R_{1.j}(x_{1.j}), and P_{i.j}(y_{i.j}), v(S) can be written as follows:

v(S) = max(F_{1.1}(z_{1.1}) + F_{2.1}(z_{2.1}) + R_{1.1}(x_{1.1}) + P_{2.1}(y_{2.1}))

subject to the following restriction:

AAP^0_{2.1}(1 − y_{2.1}) = AAR^0_{1.1}(1 − x_{1.1}) + AAR^0_{1.2}.
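To make the optimization step concrete, the following sketch maximizes this v(S) for S = {1.1, 2.1} numerically with scipy (SLSQP), using the case-study figures from Tables 1–3. It is a simplified illustration, not the authors' model: each instrument is assumed to adjust only its own balance-sheet item, so the resulting value will not exactly reproduce the paper's v(S).

# Illustrative sketch: maximizing v(S) for S = {1.1, 2.1} under Restriction 5
# (bounds) and the receivables/payables balance constraint stated above.
from scipy.optimize import minimize

def carry(amount, days, c):
    """Financing cost of holding `amount` for `days` at annual rate c."""
    return amount * ((1 + c) ** (days / 365) - 1)

# Player data in the paper's notation (Tables 1-2): supplier 1.1 and distributor 2.1.
p11 = dict(AI0=2563.0, BI0=2345.0, DIO0=550.84, AAR0=813.5, BAR0=834.0, DRO0=131.01, c=0.084)
p21 = dict(AI0=5525.6, BI0=5435.6, DIO0=539.57, AAP0=4137.5, BAP0=4079.0, DPO0=404.03, c=0.089)
AAR0_12 = 3324.0                                   # supplier 1.2's receivables (outside S)
r_f, r_rf, r_if = 0.000040, 0.000020, 0.000241     # daily rates from Table 3

def coalition_saving(u):
    x11, y21, z11, z21 = u
    F11 = (carry(2*p11["AI0"] - p11["BI0"], p11["DIO0"], p11["c"])
           - carry(2*p11["AI0"]*(1-z11) - p11["BI0"], p11["DIO0"]*(1-z11), p11["c"])
           - z11*p11["AI0"]*p11["DIO0"]*r_if)
    F21 = (carry(2*p21["AI0"] - p21["BI0"], p21["DIO0"], p21["c"])
           - carry(2*p21["AI0"]*(1-z21) - p21["BI0"], p21["DIO0"]*(1-z21), p21["c"])
           - z21*p21["AI0"]*p21["DIO0"]*r_if)
    R11 = (carry(2*p11["AAR0"] - p11["BAR0"], p11["DRO0"], p11["c"])
           - carry(2*p11["AAR0"]*(1-x11) - p11["BAR0"], p11["DRO0"]*(1-x11), p11["c"])
           - x11*p11["AAR0"]*p21["DPO0"]*r_f)
    P21 = (-carry(2*p21["AAP0"] - p21["BAP0"], p21["DPO0"], p21["c"])
           + carry(2*p21["AAP0"]*(1-y21) - p21["BAP0"], p21["DPO0"]*(1-y21), p21["c"])
           - y21*p21["AAP0"]*p21["DPO0"]*r_rf)
    return F11 + F21 + R11 + P21

balance = {"type": "eq",
           "fun": lambda u: p21["AAP0"]*(1 - u[1]) - p11["AAR0"]*(1 - u[0]) - AAR0_12}
res = minimize(lambda u: -coalition_saving(u), x0=[0.1, 0.1, 0.5, 0.5],
               bounds=[(0, 1)] * 4, constraints=[balance], method="SLSQP")
print("v(S) estimate:", -res.fun, "at (x11, y21, z11, z21) =", res.x)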

Consider a coalition S = {1.1, 1.2, 2.1}, Fig. 6.

Fig. 6. Coalition S = {1.1, 1.2, 2.1}.

For such a coalition v(S) has the following form:

v(S) = max(FC^0_S − (FC^1_S(x_{1.1}, x_{1.2}, y_{2.1}, z_{1.1}, z_{2.1}, z_{1.2}) + FC_{IF,1.1}(z_{1.1}) + FC_{IF,1.2}(z_{1.2}) + FC_{F,1.2}(x_{1.2}) + FC_{IF,2.1}(z_{2.1}) + FC_{F,1.1}(x_{1.1}) + FC_{RF,2.1}(y_{2.1})))
= max(F_{1.1}(z_{1.1}) + R_{1.1}(x_{1.1}) + F_{1.2}(z_{1.2}) + R_{1.2}(x_{1.2}) + F_{2.1}(z_{2.1}) + P_{2.1}(y_{2.1})).

Consider a coalition S = {1.1, 2.1, 3.1}, Fig. 7.

Fig. 7. Coalition S = {1.1, 2.1, 3.1}.

In coalition S, all players can use inventory financing; in addition, supplier 1.1 can use factoring, the distributor can use factoring and reverse factoring, and retailer 3.1 can use reverse factoring. For such a case v(S) has the following form:

v(S) = max(FC^0_S − (FC^1_S(x_{1.1}, z_{1.1}, x_{2.1}, y_{2.1}, z_{2.1}, y_{3.1}, z_{3.1}) + FC_{F,1.1}(x_{1.1}) + FC_{IF,1.1}(z_{1.1}) + FC_{F,2.1}(x_{2.1}) + FC_{RF,2.1}(y_{2.1}) + FC_{IF,2.1}(z_{2.1}) + FC_{RF,3.1}(y_{3.1}) + FC_{IF,3.1}(z_{3.1}))).

Substituting the functions F_{i.j}(z_{i.j}), R_{i.j}(x_{i.j}), and P_{i.j}(y_{i.j}) into v(S), we get:

v(S) = max(F_{1.1}(z_{1.1}) + R_{1.1}(x_{1.1}) + F_{2.1}(z_{2.1}) + R_{2.1}(x_{2.1}) + P_{2.1}(y_{2.1}) + F_{3.1}(z_{3.1}) + P_{3.1}(y_{3.1}))

subject to:

AAP^0_{2.1}(1 − y_{2.1}) = AAR^0_{1.1}(1 − x_{1.1}) + AAR^0_{1.2},
AAR^0_{2.1}(1 − x_{2.1}) = AAP^0_{3.1}(1 − y_{3.1}) + AAP^0_{3.2}.

Continuing in the same way for all the remaining coalitions yields an explicit form of the function v(S) for each case.

There are several principles of optimality for cooperative games. Here we consider the C-core [13]. If the C-core of a cooperative game is non-empty, then for any imputation inside the C-core no coalition has a reason to deviate from that imputation. In the next section we consider a numerical example and check the non-emptiness of the C-core.

2.4 Example

The example focuses on a network consisting of 2 suppliers, 1 distributor, and 2 retailers operating in the automotive industry over a one-year period, Fig. 4. The suppliers are car producers who sell their cars to the distributor, who then distributes them to the retailers for sale. The available data consist of the initial values of the cost of capital, working capital components, cost of goods sold, and revenue for all five participants at the beginning of the period, as well as projected values for the end of the period. The data are represented in Table 1.

Table 1. Initial data.

            Supplier 1   Supplier 2   Distributor   Retailer 1   Retailer 2
WACC*       8.4%         9.1%         8.9%          11.7%        10.8%
BI          2345         3856         5435.6        3004         6132
EI          2782         4202         5616.2        2876         6011
BAR         834          3245         4395          1543         2845
EAR         793          3403         4261          1403         2554
BAP         455          1452         4079          894          3501
EAP         632          1328         4196          765          3496
COGS        1623.25      3862.5       3738          2129         4105.75
Revenue     2179         4781         3564          2505.75      4829.5

* Note: WACC – weighted average cost of capital; BI – inventory at the beginning of the period; EI – inventory at the end of the period; BAR – accounts receivable at the beginning of the period; EAR – accounts receivable at the end of the period; BAP – accounts payable at the beginning of the period; EAP – accounts payable at the end of the period; COGS – cost of goods sold.

The values of the cash conversion cycle and the average values AI, AAR, and AAP are represented in Table 2.

Table 2. The values of the cash conversion cycle.

            Supplier 1   Supplier 2   Distributor   Retailer 1   Retailer 2
AI          2563         4029         5525.6        2940         6071.5
AAR         813.5        3324         4328          1473         2699.5
AAP         543.5        1390         4137.5        829.5        3498.5
DIO         550.84       391.4        539.57        510.64       538.03
DRO         131.01       247.5        443.25        209.66       206.06
DPO         116.79       135.03       404.03        144.07       310.02


The daily rates for each of the financial tools are represented in Table 3.

Table 3. Daily rates.

              Factoring   Reverse factoring   Inventory financing
Daily rate    0.0040%     0.0020%             0.0241%

Denote player 1.1 as 1, 1.2 as 2, 2.1 as 3, 3.1 as 4, and 3.2 as 5. For the maximum coalition S = N, the values of the variables x, y, z for each player that maximize v(N) are represented in Table 4, and v(N) = 2977.82.

Table 4. Values of decision variables for the grand coalition N.

      Player 1    Player 2    Player 3    Player 4    Player 5
x     0           0.289997    0.248991    0           0
z     0.542618    0.521469    0.50814     0.489116    0.495018
y     0           0           0.232979    0.461121    0.198695

Let us calculate the values of v(S) for all coalitions and then try to find an imputation α = (α_1, α_2, α_3, α_4, α_5) from the C-core by solving the following system:

α_1 + α_2 + α_3 + α_4 + α_5 = v(N),
α_1 + α_2 + α_3 + α_4 ≥ v(1, 2, 3, 4),
...
α_1 ≥ v(1).

For the given example a solution to this system exists and is represented in Table 5.

Table 5. An imputation from the C-core.

α1          α2          α3          α4          α5
359.6084    508.3847    814.5553    481.0096    814.2628
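A minimal sketch of how such a check can be automated (not the authors' code). Only v(N), v({1}), and the imputation above are taken from the paper; the remaining coalition values would have to be supplied, e.g., from Table 6, before the Shapley value mentioned in the abstract can be computed. Function names and the rounding tolerance are our own choices:

# Illustrative sketch: C-core membership check and Shapley value for the 5-player game.
from itertools import permutations
from math import factorial

players = (1, 2, 3, 4, 5)

# Characteristic function v(S); only the values quoted in the text are filled in here,
# the remaining coalition values would be added from Table 6 of the paper.
v = {frozenset(players): 2977.82,   # v(N)
     frozenset({1}): 359.6084}      # v(1)

alpha = {1: 359.6084, 2: 508.3847, 3: 814.5553, 4: 481.0096, 5: 814.2628}

def in_c_core(alpha, v, tol=1e-2):
    """Efficiency: sum(alpha) = v(N); stability: sum of alpha_i over S >= v(S)
    for every coalition S present in the dictionary v (tolerance for rounded values)."""
    if abs(sum(alpha.values()) - v[frozenset(players)]) > tol:
        return False
    return all(sum(alpha[i] for i in S) + tol >= value
               for S, value in v.items() if S != frozenset(players))

def shapley(v):
    """Shapley value as the average marginal contribution over all player orderings.
    Requires v(S) for every coalition S (v of the empty set is taken as 0)."""
    phi = dict.fromkeys(players, 0.0)
    for order in permutations(players):
        coalition = set()
        for i in order:
            before = v.get(frozenset(coalition), 0.0)
            coalition.add(i)
            phi[i] += v.get(frozenset(coalition), 0.0) - before
    return {i: phi[i] / factorial(len(players)) for i in players}

print("imputation satisfies the known C-core constraints:", in_c_core(alpha, v))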

To visualize the values of v(S) for all S and to check that the imputation from Table 5 belongs to the C-core, consider Table 6. As a result, for this example there exists an imputation from which no coalition has a reason to deviate.


Table 6. Values of v(S) for all S.

S      v(S)        Check
{1}    359.6084    =