Innovations in Smart Cities Applications Volume 7: The Proceedings of the 8th International Conference on Smart City Applications, Volume 2 (Lecture Notes in Networks and Systems, 938). ISBN 3031543750, 9783031543753

Many cities in the developed world are undergoing a digital revolution, and have placed the "smart city" on the …


English · Pages: 458 [443] · Year: 2024


Table of contents :
Preface
Committees
Keynote Speakers
Smart City: Why, What, Experience Feedback and the Future/Challenges
Imagining a Digital Competency Management Ecosystem Approach to Transforming the Productivity of People in the Built Environment
Challenges of Cybersecurity in Smart Cities
Advancing Urban Modelling with Emerging Geospatial Datasets and AI Technologies
Integrating System Dynamics in Digital Urban Twin
Background of Smart Navigation
Contents
Smart Agriculture
Plant Disease Classification and Segmentation Using a Hybrid Computer-Aided Model Using GAN and Transfer Learning
1 Introduction
2 Related Work
3 Proposed Methodology
3.1 Data Augmentation
3.2 Dataset Preprocessing
3.3 Leaf Segmentation
3.4 Model Building
4 Experiments and Results
4.1 Dataset
4.2 Performance Measures
4.3 Analysis of Experiments
5 Comparison with Existing Methods
6 Conclusion
7 Future Work
References
Water Amount Prediction for Smart Irrigation Based on Machine Learning Techniques
1 Introduction
2 Methodology
2.1 Data Source and Description
2.2 Data Pre-processing
2.3 Proposed Models
3 Experimental Study
3.1 Evaluation Metrics
3.2 Hyper-Parameter Tuning
3.3 Obtained Results
4 Discussion
5 Conclusion
References
Smart Irrigation System Using Low Energy
1 Introduction
2 Literature Survey
2.1 Data Collection
2.2 Data Processing
3 Analysis and Discussion
3.1 Analysis of Data and Results
3.2 Data Collected
3.3 Technologies Used
3.4 Discussion
4 Prototype of the Proposed System
4.1 Proposed Solution
4.2 Lora
4.3 Components
4.4 Description of the New System
5 Implementation and Testing
5.1 Working Steps
5.2 Mobile Application
5.3 Test of the New System
5.4 Critical Analysis
6 Conclusion
References
Smart Models
Advancing Crop Recommendation Systems Through Ensemble Learning Techniques
1 Introduction
2 Literature Review
3 Methodology
3.1 Dataset Description
3.2 Data Preprocessing
3.3 Classifier Models
4 Results and Discussion
5 Conclusion
References
Technology to Build Architecture: Application of Adaptive Facade on a New Multifunctional Arena
1 Introduction
2 Description of the Project
2.1 Case Study: Arena
2.2 Climate Analysis
2.3 Façade Requirements
3 Multifunctional Façade
3.1 Concept Idea
3.2 Kinetic Element Design
3.3 Parametric Modelling
3.4 System Components
4 Conclusions and Perspectives
References
Effectiveness of Different Machine Learning Algorithms in Road Extraction from UAV-Based Point Cloud
1 Introduction
2 Material and Methods
2.1 UAV-Based Point Cloud
2.2 Machine Learning Models
3 Experimental Study
4 Results and Discussion
5 Conclusion
References
A Comparative Analysis of Memory-Based and Model-Based Collaborative Filtering on Recommender System Implementation
1 Introduction
2 Literature Review
2.1 Types of Recommender Systems
2.2 Recommender System Challenges and Solutions
2.3 Recommender Systems Evaluation
3 Case Study: SVD and KNN RS Netflix Dataset
3.1 Research Methodology
3.2 Results and Analysis
4 Conclusion and Future Work
5 Limitations and Future Work
References
Critical Overview of Model Driven Engineering
1 Introduction
2 Model Driven Engineering Concepts
2.1 System
2.2 Model
2.3 Meta Model
2.4 Modelling Language
2.5 Software Products, Platforms and Transformations
3 Discussion
3.1 Areas of Advancements for MDE
3.2 Challenges to MDE Adoption
3.3 Enhancement of MDE with AI
4 Conclusion
References
A Synthesis on Machine Learning for Credit Scoring: A Technical Guide
1 Introduction
2 Literature Review
3 Experiments Setup
3.1 Software and Hardware
3.2 Data Description
3.3 Data Preprocessing and Splitting
3.4 Trial-and-Error Approach
3.5 Evaluation Metrics
4 Results and Discussions
5 Conclusion
References
Enhancing Writer Identification with Local Gradient Histogram Analysis
1 Introduction
2 Datasets
3 Methodology
3.1 Local Gradient Histogram (LGH)
3.2 Vector of Locally Aggregated Descriptors (VLAD)
4 Experimental Study
4.1 Impact of the Fragment Size
4.2 Impact of the LGH Dimension
4.3 Impact of the Number of Clusters
4.4 Comparison with State of Art
5 Conclusion
References
Solving a Generalized Network Design Problem Using Hybrid Metaheuristics
1 Introduction
1.1 Local Search Metaheuristics
1.2 Constructive Metaheuristics
1.3 Population-Based Metaheuristics
2 Related Works
2.1 The Multicommodity Capacitated Fixed-Charge Network Design Problem (MCNDP)
2.2 The Generalized Discrete Cost Multicommodity Network Design Problem (GDCMNDP)
3 Problem Formulation
4 Hybrid Metaheuristic for the GDCMNDP
4.1 Generation of an Initial Solution
4.2 Generation of a Neighboring Solution
5 Computational Results
5.1 Instances Characteristics
5.2 Parameter Settings and Results
6 Conclusion
References
Isolated Handwritten Arabic Character Recognition Using Convolutional Neural Networks: An Overview
1 Introduction
2 Background
2.1 IHAC Recognition: Characteristics and Challenges
2.2 Convolutional Neural Networks
3 Literature Review
3.1 Shallow CNN Architectures
3.2 Deep CNN Architectures
3.3 Hyperparameter Fine-Tuning
3.4 Hybrid Architectures
3.5 Transfer Learning
4 Analysis and Discussion
5 Conclusion
References
A New Approach for Quantum Phase Estimation Based Algorithms for Machine Learning
1 Introduction
2 Quantum Phase Estimation
3 Proposed Improvements for Quantum Phase Estimation
4 A New Approach for QPE Based Algorithms for Machine Learning
5 Conclusion
References
Model Risk in Financial Derivatives and The Transformative Impact of Deep Learning: A Systematic Review
1 Model Risk
1.1 Definition
1.2 Mitigating Model Risk in Derivatives Pricing
1.3 The Importance of Model Validation
2 The Recourse to AI to Reduce Model Risk
2.1 Conditional Expectations
2.2 Outperformance
2.3 Tree Methods as Function Approximators
2.4 Limitations of Tree Methods
2.5 Advanced Machine Learning Methods
3 Autocallable Notes
3.1 Introduction
3.2 Model Risk Precedents
3.3 Analysis
4 Conclusion
References
Digital Twins
Integrating Syrian Cadastral Data into Digital Twins Through Accurate Determination of Transformation Parameters
1 Introduction
2 Resources and Methods Used
2.1 Resources and Study Area
2.2 Methods Used
3 Experiment
4 Accuracy Assessment
5 Results and Discussion
6 Conclusions
References
Towards Linked Building Data: A Data Framework Enabling BEM Interoperability with Extended Brick Ontology
1 Introduction
2 Method Proposal: A Data Framework for BEM Interoperability
2.1 BEM Model
2.2 Brick Extension
2.3 BEM to Brick
2.4 Consistency Check
2.5 Graph Data and Time Series Storage
3 Experimentations
4 Discussion
5 Conclusion
References
Digital Twin for Construction Sites: Concept, Definition, Steps
1 Introduction
2 Objective
3 Digital Twin Challenges
4 Construction Industry
5 Steps of Digitalizing a Construction Site
5.1 Collecting Data for the Existing Buildings, Topography, and Site Layout
5.2 Drawing the 3D Model
5.3 Incorporate Design Data
5.4 Simulate the Construction Process
5.5 Monitor Progress in Real Time
5.6 Use Virtual Reality (VR) and Augmented Reality (AR)
5.7 Continuously Update the Model to Ensure that the Model Remains Accurate and Up-To-Date
6 Conclusion
References
Towards Digital Twins in Sustainable Construction: Feasibility and Challenges
1 Introduction
2 Digital Twins Applications in Climate Change and Reducing CO2 Emissions
3 Digital Twins and IoT Solutions for Energy Saving and the Sustainable Construction
4 Conclusion
References
Digital Twin Architectures for Railway Infrastructure
1 Digital Twin in Literature
1.1 Milestones in the Development of Digital Twins
1.2 Five-Dimensional Digital Twin Model
2 Digital Twin Architectures
2.1 The Cyber-Physical Perspective
2.2 Cyber-Physical Approach for Railway Infrastructure
3 Framework
3.1 Overview of the Architecture
4 Case Study: The Proposed Architecture Application for an SNCF Réseau Project-GeoLidar
5 Conclusion
References
Seismic Digital Twin of the Dumanoir Earth Dam
1 Introduction
2 Dumanoir Dam
3 System Architecture of the Seismic Digital Twin
3.1 Data Acquisition Layer
3.2 Digital Modelling Layer
3.3 Data/Model Integration Layer
4 Data and Digital Modeling
4.1 Data Source
4.2 Digital Model
5 Conclusion and Perspective
References
Digital Twin Base Model Study by Means of UAV Photogrammetry for Library of Gebze Technical University
1 Introduction
2 Study Area
3 Methodology
3.1 Outdoor Modeling
3.2 Indoor Modeling
4 Results and Conclusion
References
Leveraging Diverse Data Sources for ESTP Campus Digital Twin Development: Methodology and Implementation
1 Introduction
2 Literature Review
3 Methodology
4 Data Collection and Implementation: ESTP Campus Case Study
4.1 ESTP Campus Description
4.2 Data Collection
4.3 Level of Granularity and LOD
4.4 Data Implementation
5 Results and Discussion
6 Conclusion
References
3D Models and Computer Vision
Road Traffic Noise Pollution Mitigation Strategies Based on 3D Tree Modelling and Visualisation
1 Introduction
1.1 Road Traffic Noise
1.2 Influence of Trees to Mitigate Road Traffic Noise Levels and 3D Tree Modelling
1.3 3D Road Traffic Noise Visualisation
1.4 Research Problem
2 Methodology
3 Results and Discussion
4 Conclusion
References
Exploring Google Earth Engine Platform for Satellite Image Classification Using Machine Learning Algorithms
1 Introduction
2 Related Work
3 Methodology
3.1 Study Area and Dataset
3.2 Data Preparation
3.3 Background Techniques
3.4 Accuracy Assessment
4 Results and Discussion
5 Conclusion
References
A Review of 3D Indoor Positioning and Navigation in Geographic Information Systems
1 Introduction
2 3D Indoor Positioning
3 3D Navigation Systems
4 Conclusion
References
Enhancing Smart City Asset Management: Integrating Versioning and Asset Lifecycle for 3D Assets Management
1 Introduction
2 Related Works
2.1 Smart City 3D Asset Management and Versioning
2.2 CityJSON Versioning
3 Conceptual Framework: Integrating Versioning and Asset Lifecycle Management for 3D Assets
4 Discussion and Conclusions
References
Image Transformation Approaches for Occupancy Detection: A Comprehensive Analysis
1 Introduction
1.1 Contribution
1.2 Article Organization
2 Related Works
2.1 Image Encoding Techniques
2.2 Found Limitations
3 Prominent Approaches
3.1 Adopted Grayscale-Based Method
3.2 Gramian Angular Field Images
4 Results and Discussion
4.1 Custom CNN
4.2 Performance Assessment
4.3 Comparative Analysis Between Approaches
4.4 Energy Saving Potential
5 Conclusion
References
Low-Cost Global Navigation Satellite System for Drone Photogrammetry Projects
1 Introduction
1.1 Drone Imagery
1.2 GCP’s and Digital Surface Models
2 Methodology
2.1 Area of Study
2.2 Drone Specifications and Flight Planification
3 Control Points Observations
4 Image Acquisition, Processing and Analysis
5 Results
6 Conclusion
References
3D Spatio-Temporal Data Model for Strata Management
1 Introduction
2 Strata Management
2.1 Incorporating Spatial and Temporal Aspects of Data in Strata Management
2.2 3D Data and Strata Management
3 Temporal Database Types and Approaches
4 3D Spatio-Temporal Strata Management Data Model
5 Discussion
6 Conclusion
References
Investigating Wind-Driven Rain Effects on Buildings with 3D City Building Models: An Analysis of Building Complexity Using Computational Fluid Dynamics
1 Introduction
2 Methodology
3 Result and Discussion
4 Conclusion
References
Wildfire Detection from Sentinel Imagery Using Convolutional Neural Network (CNN)
1 Introduction
2 Methods
2.1 Data Collection
2.2 CNN Model Development
2.3 Performance Evaluation
3 Result and Discussion
4 Conclusion
References
3D Spatial Queries for High-Rise Buildings Using 3D Topology Rules
1 Introduction
2 Methodology
2.1 36IM Topology Rules Implementation
2.2 3D Spatial Queries
3 Experimental Results and Discussion
4 Conclusion
References
Smart Learning Systems
Data Analysis and Machine Learning for MOOC Optimization
1 Introduction
2 Related Work
3 Methodology
3.1 Database
3.2 Models
4 Results and Discussion
4.1 Results
4.2 Discussion
5 Conclusion and Outlook
References
Using Machine Learning to Enhance Personality Prediction in Education
1 Introduction
2 Background
2.1 Innovative Learning Techniques and Methods
2.2 Machine Learning
3 Review of Personality Theories
3.1 Theory of Vocational Personalities
3.2 The Big Five Model Theory
4 Related Work in Personality Prediction
5 Theoretical Framework and Contribution
6 Conclusion
References
Smart Education in the IoT: Issues, Architecture, and Challenges
1 Introduction
2 Smart Education Issues and Benefits
2.1 IoT in the Education
2.2 Smart Education Benefits
3 Smart Education Challenges
4 Smart Education Architecture and Framework
4.1 IoT-Based Smart Education Architecture
4.2 Smart Education Framework
5 Conclusion
References
Enhancing Book Recommendations on GoodReads: A Data Mining Approach Based Random Forest Classification
1 Introduction
2 Related Work
3 Research Methodology
3.1 Dataset
3.2 Methodology
4 Results and Discussion
4.1 Comparative Analysis of Random Forest Algorithm in Recommendation Systems
5 Conclusions and Suggestions
References
Reinforcement Learning Algorithms and Their Applications in Education Field: A Systematic Review
1 Introduction
2 State of the Art
2.1 Personalized Learning
2.2 Adaptive Tutoring Systems
2.3 Intelligent Assessment and Feedback
3 Discussion
4 Conclusion
References
Machine Reading Comprehension for the Holy Quran: A Comparative Study
1 Introduction
2 Related Works
3 Interpretation and Comparison Between the Models
3.1 BERT Methodology and Interpretation
3.2 CL-AraBERT Methodology and Interpretation
3.3 Model Based on Many BERT Architectures
4 Results and Discussion
5 Conclusion and Future Work
References
Author Index


Lecture Notes in Networks and Systems 938

Mohamed Ben Ahmed · Anouar Abdelhakim Boudhir · Rani El Meouche · İsmail Rakıp Karaş, Editors

Innovations in Smart Cities Applications Volume 7 The Proceedings of the 8th International Conference on Smart City Applications, Volume 2

Lecture Notes in Networks and Systems 938

Series Editor: Janusz Kacprzyk, Systems Research Institute, Polish Academy of Sciences, Warsaw, Poland

Advisory Editors:
Fernando Gomide, Department of Computer Engineering and Automation—DCA, School of Electrical and Computer Engineering—FEEC, University of Campinas—UNICAMP, São Paulo, Brazil
Okyay Kaynak, Department of Electrical and Electronic Engineering, Bogazici University, Istanbul, Türkiye
Derong Liu, Department of Electrical and Computer Engineering, University of Illinois at Chicago, Chicago, USA; Institute of Automation, Chinese Academy of Sciences, Beijing, China
Witold Pedrycz, Department of Electrical and Computer Engineering, University of Alberta, Alberta, Canada; Systems Research Institute, Polish Academy of Sciences, Warsaw, Poland
Marios M. Polycarpou, Department of Electrical and Computer Engineering, KIOS Research Center for Intelligent Systems and Networks, University of Cyprus, Nicosia, Cyprus
Imre J. Rudas, Óbuda University, Budapest, Hungary
Jun Wang, Department of Computer Science, City University of Hong Kong, Kowloon, Hong Kong

The series “Lecture Notes in Networks and Systems” publishes the latest developments in Networks and Systems—quickly, informally and with high quality. Original research reported in proceedings and post-proceedings represents the core of LNNS. Volumes published in LNNS embrace all aspects and subfields of, as well as new challenges in, Networks and Systems.

The series contains proceedings and edited volumes in systems and networks, spanning the areas of Cyber-Physical Systems, Autonomous Systems, Sensor Networks, Control Systems, Energy Systems, Automotive Systems, Biological Systems, Vehicular Networking and Connected Vehicles, Aerospace Systems, Automation, Manufacturing, Smart Grids, Nonlinear Systems, Power Systems, Robotics, Social Systems, Economic Systems and others. Of particular value to both the contributors and the readership are the short publication timeframe and the worldwide distribution and exposure which enable both a wide and rapid dissemination of research output.

The series covers the theory, applications, and perspectives on the state of the art and future developments relevant to systems and networks, decision making, control, complex processes and related areas, as embedded in the fields of interdisciplinary and applied sciences, engineering, computer science, physics, economics, social, and life sciences, as well as the paradigms and methodologies behind them.

Indexed by SCOPUS, INSPEC, WTI Frankfurt eG, zbMATH, SCImago. All books published in the series are submitted for consideration in Web of Science.

For proposals from Asia please contact Aninda Bose ([email protected]).

Mohamed Ben Ahmed · Anouar Abdelhakim Boudhir · Rani El Meouche · İsmail Rakıp Karaş, Editors

Innovations in Smart Cities Applications Volume 7 The Proceedings of the 8th International Conference on Smart City Applications, Volume 2

Editors

Mohamed Ben Ahmed, Faculty of Sciences and Techniques, Abdelmalek Essaadi University, Tangier, Morocco
Anouar Abdelhakim Boudhir, Faculty of Sciences and Techniques, Abdelmalek Essaâdi University, Tangier, Morocco
Rani El Meouche, École Spéciale des Travaux Publics, Paris, France
İsmail Rakıp Karaş, Computer Engineering Department, Karabük University, Karabük, Türkiye

ISSN 2367-3370 · ISSN 2367-3389 (electronic)
Lecture Notes in Networks and Systems
ISBN 978-3-031-54375-3 · ISBN 978-3-031-54376-0 (eBook)
https://doi.org/10.1007/978-3-031-54376-0

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2024

This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG. The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland.

Paper in this product is recyclable.

Preface

This Conference Proceedings volume comprises the written versions of the contributions presented at the 8th International Conference on Smart City Applications 2023. This multidisciplinary event was co-organized by ESTP in partnership with the Mediterranean Association of Sciences and Sustainable Development (Medi-ADD) and sponsored by the Digital Twins Chair of Construction and Infrastructure at ESTP. The contents of this volume delve into recent technological breakthroughs across diverse topics, including geo-smart information systems, digital twins of construction and infrastructure, smart building and home automation, smart environment and smart agriculture, smart education and intelligent learning systems, information technologies and computer science, smart healthcare, and more. The event was a valuable opportunity for more than 110 participants from different countries around the world to present and discuss topics in their respective research areas. In addition, six keynote speakers presented the latest achievements in their fields: Prof. Jason Underwood, “Imagining a digital competency management ecosystem approach to transforming the productivity of people in the built environment”; Prof. Isam Shahrour, “Smart city: why, what, experience feedback and the future/challenges”; Dr. Ihab Hijazi, “Integrating system dynamics and digital twin for the circular urban environment”; Prof. Mohammed Bouhorma, “Challenges of cybersecurity in smart cities”; Prof. Filip Biljecki, “Advancing urban modelling with emerging geospatial datasets and AI technologies”; and Prof. Ismail Rakip Karas, “Background of Smart Navigation”. We express our gratitude to all participants, members of the organizing and scientific committees, and session chairs for their valuable contributions. We also thank the Springer Nature Switzerland AG staff for their support and guidance, and for the edition of this book.
We also express our sincere thanks to Prof. Janusz Kacprzyk and Dr. Thomas Ditzinger for their kind support and help in promoting the success of this book.

November 2023

Rani El Meouche · Mohamed Ben Ahmed · Anouar Abdelhakim Boudhir · İsmail Rakıp Karaş

Committees

Conference Chair

Rani El Meouche, ESTP, Paris, France

Conference Co-chairs

Mohamed Ben Ahmed, FST Tangier, UAE University, Morocco
Anouar Abdelhakim Boudhir, FST Tangier, UAE University, Morocco
İsmail Rakıp Karaş, Karabuk University, Turkey

Conference Steering Committee

Rani El Meouche, ESTP, Paris, France
Rogério Dionisio, Polytechnic Institute Castelo Branco, Portugal
Domingos Santos, Polytechnic Institute Castelo Branco, Portugal
İsmail Rakıp Karaş, Karabuk University, Turkey
Alias Abdul Rahman, Universiti Teknologi Malaysia
Mohamed Wahbi, EHTP Casablanca, Morocco
Mohammed Bouhorma, FST Tangier, UAE University, Morocco
Chaker El Amrani, FST Tangier, UAE University, Morocco
Bernard Dousset, UPS, Toulouse, France
Rachid Saadane, EHTP Casablanca, Morocco
Ali Youness, FS, Tetouan, Morocco

Local Organizing Committee

Elham Farazdaghi, ESTP Paris, France
Mojtaba Eslahi, ESTP Paris, France
Muhammad Ali Sammuneh, ESTP Paris, France
Maryem Bouali, ESTP Paris, France
Mohamad Al Omari, ESTP Paris, France
Mohamad Ali, ESTP Paris, France
Zhiyu Zheng, ESTP Paris, France


Technical Programme Committee

Ali Jamali, Universiti Teknologi Malaysia
Ali Jamoos, Al-Quds University, Palestine
Alias Abdul Rahman, Universiti Teknologi Malaysia
Aliihsan Sekertekin, Cukurova University
Ana Paula Silva, Polytechnic Institute of Castelo Branco, Portugal
Ana Ferreira, Polytechnic Institute Castelo Branco, Portugal
Anabtawi Mahasen, Al-Quds University, Palestine
Anton Yudhana, Universitas Ahmad Dahlan, Indonesia
Arlindo Silva, Polytechnic Institute of Castelo Branco, Portugal
Arif Çağdaş Aydinoglu, Gebze Technical University, Türkiye
Arturs Aboltins, Technical University of Riga, Latvia
Assaghir Zainab, Lebanese University, Lebanon
Barış Kazar, Oracle, USA
Bataev Vladimir, Zaz Ventures, Switzerland
Behnam Atazadeh, University of Melbourne, Australia
Benabdelouahab Ikram, UAE, Morocco
Bessai-Mechmach Fatma Zohra, CERIST, Algeria
Beyza Yaman, Dublin City University, Ireland
Biswajeet Pradhan, University of Technology Sydney, Australia
Carlos Cambra, Universidad de Burgos, Spain
Damir Žarko, Zagreb University, Croatia
Darko Stefanovic, University of Novi Sad, Serbia
Domingos Santos, IPCB, Portugal
Edward Duncan, The University of Mines & Technology, Ghana
Eehab Hamzi Hijazi, An-Najah University, Palestine
Eftal Şehirli, Karabuk University, Türkiye
El Hebeary Mohamed Rashad, Cairo University, Egypt
EL Arbi Abdellaoui Allaoui, ENS, UMI, Morocco
Enrique Arias, Castilla-La Mancha University, Spain
Filip Biljecki, National University of Singapore
Francesc Anton Castro, Technical University of Denmark
Ghulam Ali Mallah, Shah Abdul Latif University, Pakistan
Gibet Tani Hicham, FP UAE University, Morocco
Habibullah Abbasi, University of Sindh, Pakistan
Ihab Hijazi, An-Najah National University and Technical University of Munich
Isam Shahrour, Lille University, France
J. Amudhavel, VIT Bhopal University, Madhya Pradesh, India
Jaime Lloret Mauri, Polytechnic University of Valencia, Spain
José Javier Berrocal Olmeda, Universidad de Extremadura, Spain

Jus Kocijan, Nova Gorica University, Slovenia
Khoudeir Majdi, IUT, Poitiers University, France
Labib Arafeh, Al-Quds University, Palestine
Loncaric Sven, Zagreb University, Croatia
Lotfi Elaachak, FSTT, UAE, Morocco
Mademlis Christos, Aristotle University of Thessaloniki, Greece
Maria Joao Simões, Universidade da Beira Interior, Portugal
Mónica Costa, Polytechnic Institute of Castelo Branco, Portugal
Mohamed El Ghami, University of Bergen, Norway
Muhamad Uznir Ujang, Universiti Teknologi Malaysia
Mahboub Aziz, FSTT, UAE University, Morocco
Omer Muhammet Soysal, Southeastern Louisiana University, USA
Ouederni Meriem, INP-ENSEEIHT Toulouse, France
Rachmad Andri Atmoko, Universitas Brawijaya, Indonesia
R. S. Ajin, DEOC DDMA, Kerala, India
Rani El Meouche, Ecole Spéciale des Travaux Publics, France
Rui Campos, INESC TEC, Porto, Portugal
Rogério Dionisio, Polytechnic Institute Castelo Branco, Portugal
Sagahyroon Assim, American University of Sharjah, United Arab Emirates
Saied Pirasteh, University of Waterloo, Canada
Senthil Kumar, Hindustan College of Arts and Science, India
Sonja Ristic, University of Novi Sad, Serbia
Sonja Grgić, Zagreb University, Croatia
Sri Winiarti, Universitas Ahmad Dahlan, Indonesia
Suhaibah Azri, Universiti Teknologi Malaysia
Sunardi, Universitas Ahmad Dahlan, Indonesia
Xiaoguang Yue, International Engineering and Technology Institute, Hong Kong
Yasyn Elyusufi, FSTT, UAE, Morocco
Youness Dehbi, University of Bonn, Germany
ZAIRI Ismael Rizman, Universiti Teknologi MARA, Malaysia

Keynote Speakers

Smart City: Why, What, Experience Feedback and the Future/Challenges

Isam Shahrour, Lille University, France

Prof. Isam Shahrour is a graduate of the National School of Bridges and Roads (Ponts et Chaussées, Paris) and has been strongly involved in research, higher education and partnership with the socio-economic sector. From 2007 to 2012, he served as Vice President for Research and Innovation at the University Lille 1. He is a distinguished professor at Lille University with about 35 years of intensive academic activity, with strong involvement in university management as well as in both socio-economic and international partnerships. His research has successively concerned geotechnical and environmental engineering, sustainability and, since 2011, smart cities and urban infrastructures. He is an Associate Editor of the Infrastructures journal (MDPI).

Imagining a Digital Competency Management Ecosystem Approach to Transforming the Productivity of People in the Built Environment

Jason Underwood, University of Salford, UK

Prof. Jason Underwood is a Professor in Construction ICT & Digital Built Environments and Programme Director of the MSc in Building Information Modelling (BIM) & Digital Built Environments within the School of Science, Engineering & Environment at the University of Salford. He holds a BEng (Hons) in Civil Engineering from Liverpool John Moores University, a Master’s in Psychology from Liverpool Hope University and a PhD from the University of Salford. His doctoral thesis was on “Integrating Design and Construction to Improve Constructability through an Effective Usage of IT”. He is a Chartered Member of both the Institution of Civil Engineering Surveyors (MCInstCES) and The British Psychological Society (CPsychol) and a Fellow of the Higher Education Academy (FHEA). He is actively engaged in the digital transformation of the UK construction industry. He is the current Chair of the UK BIM Academic Forum and Director of Construct IT For Business, an industry-led, non-profit, collaborative membership-based network.

Challenges of Cybersecurity in Smart Cities

Mohammed Bouhorma, UAE University, Morocco

Prof. Bouhorma is an experienced academic with more than 25 years of teaching and tutoring experience in the areas of information security, security protocols, AI, big data and digital forensics at Abdelmalek Essaadi University. He received his M.S. and Ph.D. degrees in Electronics and Telecommunications from INPT in France. He has held Visiting Professor positions at several universities (France, Spain, Egypt and Saudi Arabia). His research interests include cybersecurity, IoT, big data analytics, AI, smart city technologies and serious games. He serves on the editorial boards of dozens of international journals and has published more than 100 research papers in journals and conferences.

Advancing Urban Modelling with Emerging Geospatial Datasets and AI Technologies

Filip Biljecki, National University of Singapore

Prof. Filip Biljecki is a geospatial data scientist at the National University of Singapore, where he established the NUS Urban Analytics Lab. His background is in geomatic engineering, and he is jointly appointed as Assistant Professor at the Department of Architecture (College of Design and Engineering) and the Department of Real Estate (NUS Business School). He holds a PhD degree (with highest honours, top 5%) in 3D GIS from the Delft University of Technology in the Netherlands, where he also completed his MSc in Geomatics. In 2020, he was awarded the Presidential Young Professorship by NUS.

Integrating System Dynamics in Digital Urban Twin

Ihab Hijazi, An-Najah National University and Technical University of Munich

Dr. Hijazi is an Associate Professor of Geographic Information Science in the Urban Planning Engineering Department at An-Najah National University in Palestine, and a senior scientist at the Chair of Geoinformatics at the Technical University of Munich. He worked as a postdoctoral scholar at the Chair of Information Architecture, ETH Zurich, and was a researcher at ESRI, the world leader in GIS, and at the Institute for Geoinformatics and Remote Sensing (IGF) at the University of Osnabrueck in Germany.

Background of Smart Navigation

Ismail Rakip Karas, Karabuk University, Turkey

Prof. Ismail Rakip Karas is a Professor in the Computer Engineering Department and Head of the 3D GeoInformatics Research Group at Karabuk University, Turkey. He received his BSc degree from Selcuk University, his MSc degree from Gebze Institute of Technology and his PhD degree from the GIS and remote sensing programme of Yildiz Technical University, in 1997, 2001 and 2007, respectively, all three from Geomatics Engineering Departments. In 2002, he was involved in a GIS project as a Graduate Student Intern at the Forest Engineering Department, Oregon State University, USA. He has also carried out administrative duties such as Head of the Computer Science Division of his Department and Director of the Safranbolu Vocational School of Karabuk University. Currently, he is the Dean of the Safranbolu Fine Arts and Design Faculty at the same university. He is the author of many international and Turkish publications and papers on various areas of Geoinformation Science.

Contents

Smart Agriculture

Plant Disease Classification and Segmentation Using a Hybrid Computer-Aided Model Using GAN and Transfer Learning . . . . . . . 3
Khaoula Taji, Yassine Taleb Ahmad, and Fadoua Ghanimi

Water Amount Prediction for Smart Irrigation Based on Machine Learning Techniques . . . . . . . 21
Hamed Laouz, Soheyb Ayad, Labib Sadek Terrissa, and M’hamed Mancer

Smart Irrigation System Using Low Energy . . . . . . . 31
Kamal Elhattab, Karim Abouelmehdi, Abdelmajid Elmoutaouakkil, and Said Elatar

Smart Models

Advancing Crop Recommendation Systems Through Ensemble Learning Techniques . . . . . . . 45
M’hamed Mancer, Labib Sadek Terrissa, Soheyb Ayad, Hamed Laouz, and Noureddine Zerhouni

Technology to Build Architecture: Application of Adaptive Facade on a New Multifunctional Arena . . . . . . . 55
Alessandra Annibale, Emily Chiesa, Giulia Prelli, Gabriele Masera, Andrea Kindinis, Arnaud Lapertot, Davide Allegri, and Giulio Zani

Effectiveness of Different Machine Learning Algorithms in Road Extraction from UAV-Based Point Cloud . . . . . . . 65
Serkan Biçici

A Comparative Analysis of Memory-Based and Model-Based Collaborative Filtering on Recommender System Implementation . . . . . . . 75
Karim Seridi and Abdessamad El Rharras

Critical Overview of Model Driven Engineering . . . . . . . 87
Yahya El Gaoual and Mohamed Hanine

A Synthesis on Machine Learning for Credit Scoring: A Technical Guide . . . . . . . 98
Siham Akil, Sara Sekkate, and Abdellah Adib


Enhancing Writer Identification with Local Gradient Histogram Analysis . . . . . . . 111
Abdelillah Semma, Said Lazrak, and Yaâcoub Hannad

Solving a Generalized Network Design Problem Using Hybrid Metaheuristics . . . . . . . 123
Imen Mejri, Manel Grari, and Safa Bhar Layeb

Isolated Handwritten Arabic Character Recognition Using Convolutional Neural Networks: An Overview . . . . . . . 134
Mohsine El Khayati, Ismail Kich, and Youssfi Elkettani

A New Approach for Quantum Phase Estimation Based Algorithms for Machine Learning . . . . . . . 145
Oumayma Ouedrhiri, Oumayma Banouar, Salah El Hadaj, and Said Raghay

Model Risk in Financial Derivatives and The Transformative Impact of Deep Learning: A Systematic Review . . . . . . . 155
Mohammed Ahnouch, Lotfi Elaachak, and Abderrahim Ghadi

Digital Twins

Integrating Syrian Cadastral Data into Digital Twins Through Accurate Determination of Transformation Parameters . . . . . . . 169
Al-Kasem Shaza, Ramadan A. Al-Razzak, and Jibrini Hassan

Towards Linked Building Data: A Data Framework Enabling BEM Interoperability with Extended Brick Ontology . . . . . . . 182
Zhiyu Zheng, Esma Yahia, Elham Farazdaghi, Rani El Meouche, Fakhreddine Ababsa, and Patrick Beguery

Digital Twin for Construction Sites: Concept, Definition, Steps . . . . . . . 195
Mohamad Al Omari, Mojtaba Eslahi, Rani El Meouche, Laure Ducoulombier, and Laurent Guillaumat

Towards Digital Twins in Sustainable Construction: Feasibility and Challenges . . . . . . . 204
Mojtaba Eslahi, Elham Farazdaghi, and Rani El Meouche

Digital Twin Architectures for Railway Infrastructure . . . . . . . 213
Maryem Bouali, Muhammad Ali Sammuneh, Rani El Meouche, Fakhreddine Ababsa, Bahar Salavati, and Flavien Viguier


Seismic Digital Twin of the Dumanoir Earth Dam . . . . . . . 224
Mohamad Ali Noureddine, Florent De Martin, Rani El Meouche, Muhammad Ali Sammuneh, Fakhreddine Ababsa, and Mickael Beaufils

Digital Twin Base Model Study by Means of UAV Photogrammetry for Library of Gebze Technical University . . . . . . . 235
Bahadir Ergun, Cumhur Sahin, and Furkan Bilucan

Leveraging Diverse Data Sources for ESTP Campus Digital Twin Development: Methodology and Implementation . . . . . . . 243
Saffa Mansour, Rita Sassine, and Stéphanie Guibert

3D Models and Computer Vision

Road Traffic Noise Pollution Mitigation Strategies Based on 3D Tree Modelling and Visualisation . . . . . . . 261
Nevil Wickramathilaka, Uznir Ujang, and Suhaibah Azri

Exploring Google Earth Engine Platform for Satellite Image Classification Using Machine Learning Algorithms . . . . . . . 271
Hafsa Ouchra, Abdessamad Belangour, and Allae Erraissi

A Review of 3D Indoor Positioning and Navigation in Geographic Information Systems . . . . . . . 281
Buse Yaren Kazangirler, Ismail Rakip Karas, and Caner Ozcan

Enhancing Smart City Asset Management: Integrating Versioning and Asset Lifecycle for 3D Assets Management . . . . . . . 292
Nabila Husna Idris, Suhaibah Azri, and Uznir Ujang

Image Transformation Approaches for Occupancy Detection: A Comprehensive Analysis . . . . . . . 303
Aya N. Sayed, Faycal Bensaali, Yassine Himeur, and Mahdi Houchati

Low-Cost Global Navigation Satellite System for Drone Photogrammetry Projects . . . . . . . 312
Muhammad Ali Sammuneh, Alisson Villca Fuentes, Adrien Poupardin, Philippe Sergent, and Jena Jeong

3D Spatio-Temporal Data Model for Strata Management . . . . . . . 322
U. Mehmood, U. Ujang, S. Azri, and T. L. Choon


Investigating Wind-Driven Rain Effects on Buildings with 3D City Building Models: An Analysis of Building Complexity Using Computational Fluid Dynamics . . . . . . . 332
Nurfairunnajiha Ridzuan, Uznir Ujang, Suhaibah Azri, Liat Choon Tan, and Izham Mohd Yusoff

Wildfire Detection from Sentinel Imagery Using Convolutional Neural Network (CNN) . . . . . . . 341
Sohaib K. M. Abujayyab, Ismail R. Karas, Javad Hashempour, E. Emircan, K. Orçun, and G. Ahmet

3D Spatial Queries for High-Rise Buildings Using 3D Topology Rules . . . . . . . 350
Syahiirah Salleh, Uznir Ujang, Suhaibah Azri, and Tan Liat Choon

Smart Learning Systems

Data Analysis and Machine Learning for MOOC Optimization . . . . . . . 363
El Ghali Mohamed, Atouf Issam, and Talea Mohamed

Using Machine Learning to Enhance Personality Prediction in Education . . . . . . . 373
Hicham El Mrabet, Mohammed Amine El Mrabet, Khalid El Makkaoui, Abdelaziz Ait Moussa, and Mohammed Blej

Smart Education in the IoT: Issues, Architecture, and Challenges . . . . . . . 384
Ahmed Srhir, Tomader Mazri, and Mohammed Benbrahim

Enhancing Book Recommendations on GoodReads: A Data Mining Approach Based Random Forest Classification . . . . . . . 395
Sajida Mhammedi, Hakim El Massari, Noreddine Gherabi, and Mohamed Amnai

Reinforcement Learning Algorithms and Their Applications in Education Field: A Systematic Review . . . . . . . 410
Hafsa Gharbi, Lotfi Elaachak, and Abdelhadi Fennan

Machine Reading Comprehension for the Holy Quran: A Comparative Study . . . . . . . 419
Souhaila Reggad, Abderrahim Ghadi, Lotfi El Aachak, and Amina Samih

Author Index . . . . . . . 429

Smart Agriculture

Plant Disease Classification and Segmentation Using a Hybrid Computer-Aided Model Using GAN and Transfer Learning

Khaoula Taji(1), Yassine Taleb Ahmad(2), and Fadoua Ghanimi(1)

(1) Faculty of Science, Electronic Systems, Information Processing, Mechanics and Energy Laboratory, Ibn Tofail University, Kenitra, Morocco
(2) Engineering Science Laboratory, Ibn Tofail University, ENSA Kenitra, Kenitra, Morocco

Abstract. Plants are essential for life on earth, providing various resources and helping to maintain ecosystem balance. Plant diseases result in reduced crop productivity and yield, so detection and classification of plant diseases is a crucial task. This research presents a hybrid computer-aided model for plant disease classification and segmentation. In this research work we have utilized the PlantVillage dataset with 8 classes of plant diseases. The dataset was annotated using a Generative Adversarial Network (GAN), four transfer learning models were used for classification, and a hybrid model is proposed based on the pretrained deep learning models. Instance and semantic segmentation were used for localizing disease areas in plants, using a hybrid algorithm. The use of GAN and transfer learning models, as well as the hybrid approach for classification and segmentation, resulted in a robust and accurate model for plant disease detection and management in agriculture. This research could also serve as a model for other image classification and segmentation tasks in different domains. The proposed hybrid model achieved a promising accuracy of 98.78% as compared to the state-of-the-art techniques.

Keywords: plant disease · classification · segmentation · hybrid model · Generative Adversarial Network (GAN) · Convolutional Neural Network (CNN)

1 Introduction

External factors can alter a plant’s physiological processes, making it more susceptible to infection and causing changes to the plant’s structure, development, functions, or other features. Depending on the kind of causal agent, plant diseases can be classified as infectious or non-infectious, and depending on the disease’s etiology, type, and impact site, the symptoms can change. The prevalence of diseases brought on by bacterial, fungal, and viral infections has significantly increased recently, and plants in various stages of agricultural production have been impacted by these diseases. Plant diseases, whether contagious or not, significantly reduce agricultural output, leading to financial losses as well as decreased crop quality and quantity. Examining the extensive effects of plant diseases on global agricultural productivity is the goal of this study [18]. It is crucial to take prompt action in developing effective disease management plans to safeguard global food security and ensure a sustainable food supply for the world’s growing population [28]. Environmental aspects and production resources in the agricultural process, including temperature, humidity, and labor, must be taken into account if agricultural output is to be increased. Plant disease, on the other hand, considerably lowers agricultural productivity by 20–30%, making it the primary factor in the global agricultural industry’s decline in production and economic value. To prevent the spread of disease and make effective treatment possible, monitoring plant health conditions becomes an essential task [5]. Many systems have been proposed for plant identification based on leaf images [2,8,9], and deep learning approaches have achieved high accuracy when classifying plants from leaf images [4,12,19]. AI-based systems can help farmers quickly and accurately identify infected areas of their crops, leading to more efficient use of resources and improved crop yields. In this research work, eight classes of the PlantVillage dataset are utilized, but the images may contain noise, which can negatively impact the model’s performance. To address this, we applied denoising using a generative adversarial network (GAN) [3] and data augmentation techniques. For classification, three different convolutional neural network (CNN) [20] architectures were evaluated, and a hybrid model was built using all three, resulting in higher accuracy. For segmentation, two different approaches were evaluated: instance and semantic segmentation. We have utilized Mask-RCNN [7], VGGSegnet [10], and Unet [6], and a hybrid algorithm was created by combining the strengths of these approaches, resulting in more accurate segmentation of plant diseases. The proposed hybrid model can help farmers quickly and accurately identify infected areas of their crops; the use of GAN for denoising and data augmentation further enhances the quality of the images, and the proposed hybrid segmentation model can effectively localize and segment the disease areas in an image.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024. M. Ben Ahmed et al. (Eds.): SCA 2023, LNNS 938, pp. 3–20, 2024. https://doi.org/10.1007/978-3-031-54376-0_1

2 Related Work

The increasing prevalence of rice plant diseases has caused significant agricultural, economic, and communal losses. Researchers have been exploring image processing techniques to diagnose and identify these diseases. A literature review was conducted on studies published between 2020 and 2023, focusing on the development of disease detection, identification, and quantification methods for a variety of crops. Among them, Anjnaa, Meenakshi, and Pradeep [26] worked


on an automated system to detect and classify plant diseases in 2020. The paper presents an automated system for early detection and classification of plant diseases, specifically for capsicum plants. The system uses k-means clustering to identify the infected area of the plant and GLCM features to analyze its texture. The type of disease is then classified using various classifiers, with KNN and SVM providing the best results. The proposed system achieved an accuracy of 100% on a dataset of 62 images of healthy and diseased capsicum plants and their leaves. The research emphasizes the importance of early analysis and classification of plant diseases to improve crop production. The article by Prabira, Nalini, and Amiya [22] discusses the current advancements in the diagnosis of rice plant diseases, specifically highlighting the use of image processing techniques for disease identification and quantification. While acknowledging the potential of these methods, the authors also highlight the challenges faced in accurately classifying certain diseases due to the need for high-quality images. They suggest that further research is necessary to address these limitations and enhance the accuracy of these methods in diagnosing and identifying rice plant diseases. Parul, Yash, and Wiqas [23] investigated the use of segmented image data to train CNN models to improve automated plant disease detection. They compared the performance of a CNN model trained using full images to one trained using segmented images and found that the segmented model had a significantly higher accuracy of 98.6% when tested on previously unseen data. They used tomato plants and target spot disease as an example to demonstrate the improvement in self-classification confidence of the segmented model compared to the full image model. Kamal KC et al. [14] investigated the impact of background removal on convolutional neural networks (CNNs) for in-situ plant disease classification.
They contributed to the field by proposing a novel dataset of in-situ plant images with annotated ground truth, which they used to evaluate the performance of different CNN architectures with and without background removal. They also investigated the effect of different background removal techniques on CNN performance. Their results show that background removal can significantly improve CNN performance for in-situ plant disease classification, and that a combination of segmentation-based and color-based background removal methods achieves the best results. The framework of the model proposed in this study is given in Fig. 1. Azim, Khairul, and Farah [5] proposed a model in 2021 for detecting three common rice leaf diseases: bacterial leaf blight, brown spot, and leaf smut. The model uses saturation and hue thresholds to segment the disease-affected areas and extract distinctive features based on color, shape, and texture domains. They tested several classification algorithms and found that an extreme gradient boosting decision tree ensemble was the most effective method, achieving an accuracy of 86.58% on the rice leaf diseases dataset from UCI. The class-wise accuracy of the model was consistent among the classes, and it outperformed previous works on the same dataset. The paper emphasizes the importance of accurate segmentation and feature extraction for effective disease detection in plants. The authors [15] propose an automated project for leaf segmentation to detect the


Fig. 1. Framework of the proposed method

classification of disease. They use a deep convolutional neural network based on semantic segmentation for the classification of ten different diseases affecting a specific plant leaf, specifically tomato plant leaves. The model successfully identifies regions as healthy and diseased parts and estimates the area of a specific leaf affected by a disease. The proposed model achieved an average accuracy of 97.6% on a dataset of twenty thousand images. The study conducted by Raj, Anuradha, and Amit [16] found that 70% of machine learning-based studies used real-field plant leaf images, while 30% used laboratory-conditioned plant leaf images for disease classification. For deep learning-based approaches, 55% of studies used laboratory-conditioned images from the PlantVillage dataset. The average accuracy attained with deep learning-based approaches was 98.8%. The authors Pooja and Shubhada [13] discuss in their paper different methods that have been developed for plant disease detection using image processing. They also explore the use of machine learning algorithms such as neural networks and decision trees to improve the accuracy of disease detection. In their paper, Jinzhu, Lijuan, and Huanyu [17] provided a review of the latest CNN networks for plant leaf disease classification. They discussed the principles and challenges of using CNNs for this task, as well as future directions for development. They also collected plant datasets from Kaggle [24] and BIFROST [25], and their proposed model achieved high accuracies of 91.83% on the PlantVillage dataset and 92.00% on their own dataset. The general steps adopted in the majority of related research studies are given in Fig. 2.
The goal of this study is to create a hybrid model for segmenting and classifying plant illnesses to precisely identify and control them for sustaining agricultural yield and halting the spread of disease. In this research work we have utilized the PlantVillage dataset, and considered 8 different classes of plant diseases. Four transfer learning models are utilized for categorization. A stable and accurate model that can be a useful tool for plant disease identification and


Fig. 2. General steps involved in the disease prediction system

management in agriculture was created by using GAN to generate the dataset and combining transfer learning models, instance and semantic segmentation, and other techniques.

3 Proposed Methodology

Data collection, data preprocessing, image segmentation, feature extraction, model training and testing, model assessment, and deployment are the main phases of the approach we present in this research work. Figure 3 gives a visual representation of the workflow of the proposed method. The major phases of the proposed method are described in the following section. The proposed method is

Fig. 3. Working methodology of proposed method


based on the following steps: collecting data from the PlantVillage dataset, preprocessing images, segmenting leaves, extracting features, training and testing the deep learning model, evaluating the model’s performance using metrics such as accuracy, precision, recall, and F1-score, and drawing conclusions from the results while discussing future work to improve the model’s performance. The framework of our proposed method is presented in Algorithm 1:

Algorithm 1: Enhanced Hybrid Model for Plant Disease Segmentation and Classification

Input: PlantVillage dataset; segmentation models: Mask R-CNN, UNet, VGGSegNet; pretrained models: DenseNet, ResNet, and EfficientNetB1
Output: Preprocessed Dataset; Hyb_Seg_Model: ensemble hybrid model for disease segmentation; Hyb_Classification_Model: ensemble model for classification

1.  Load PlantVillage dataset
2.  Perform data annotation and augmentation
3.  foreach image in PlantVillage dataset do
4.      Perform preprocessing using Generative Adversarial Networks (GAN) to enhance input data quality and diversity
5.      Save Preprocessed Dataset
6.  end
7.  foreach segmentation model in segmentation models do
8.      Train segmentation model on Preprocessed Dataset
9.  end
10. Perform segmentation using Hyb_Seg_Model
11. foreach pretrained model in pretrained models do
12.     Train pretrained model on Preprocessed Dataset
13. end
14. Perform segmentation using pretrained models
15. Perform classification using Hyb_Classification_Model
16. Evaluate accuracy for segmentation and classification
17. Assess effectiveness in localizing disease areas
18. Analyze results and compare with state-of-the-art techniques

We also followed a timeline for our methodology:

1. Data collection: The PlantVillage dataset will be used for this project, which contains images of plant leaves affected by different diseases and pests. The dataset will be downloaded from the Kaggle [22] website.
2. Data pre-processing: The images will be pre-processed to remove any noise and improve the overall quality of the images. This will include color space conversion, image enhancement, and image cropping.
3. Image segmentation: The leaves in the images will be segmented using a suitable segmentation algorithm. This will involve separating the leaf from the background, and isolating the leaf from other parts of the image.


4. Feature extraction: After the segmentation step, features will be extracted from the segmented leaf images. These features will include color, texture, and shape features.
5. Model training and testing: A CNN model will be trained on the extracted features using appropriate hyperparameters and fine-tuned through hyperparameter tuning.
6. Model evaluation: The performance of the model is evaluated using metrics such as accuracy, precision, recall, and F1-score.
7. Deployment: The model will be deployed in a web or mobile application, which can be used by farmers and researchers to detect diseases in plants.

3.1 Data Augmentation

The practice of intentionally introducing random modifications to already-existing images is known as data augmentation. Its goal is to decrease overfitting and improve the generalisation of the model. The Keras ImageDataGenerator class is used in this project to apply data augmentation to the training images before supplying them to the GAN. The class supports a variety of augmentations, including random rotation, horizontal or vertical flipping, and zooming in or out. These augmentations expose the model to a larger variety of image changes and can increase its resistance to different kinds of noise and image fluctuations. By applying data augmentation to the images before feeding them into the GAN, the generator can learn to denoise the images while also becoming more accurate. Sample output of this phase is given in Fig. 4.
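The paper uses the Keras ImageDataGenerator for this step; as a framework-agnostic illustration, the same kind of random flip and rotation transforms can be sketched with NumPy (the `augment` helper below is illustrative, not the authors' code):

```python
import numpy as np

def augment(image, rng):
    """Apply one random flip/rotation combination to an H x W x C image array."""
    if rng.random() < 0.5:
        image = image[:, ::-1, :]          # horizontal flip
    if rng.random() < 0.5:
        image = image[::-1, :, :]          # vertical flip
    k = int(rng.integers(0, 4))            # random number of 90-degree rotations
    return np.rot90(image, k=k, axes=(0, 1))

rng = np.random.default_rng(0)
img = np.arange(2 * 2 * 3, dtype=np.uint8).reshape(2, 2, 3)
batch = [augment(img, rng) for _ in range(4)]  # four augmented variants of one image
```

Because flips and 90-degree rotations only rearrange pixels, each augmented image keeps the same shape and the same multiset of pixel values, which is easy to sanity-check in tests.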

Fig. 4. Sample Output of Data Augmentation Phase

3.2 Dataset Preprocessing

A crucial phase of image processing is data preparation. In this context, generative adversarial networks (GANs), advanced deep learning models composed of a generator and a discriminator network, are used to improve images with noise.


The generator network learns to produce images similar to the input data, whilst the discriminator network develops its capacity to distinguish real images from fake ones. After training, the GAN model is used to successfully remove noise from the images. The output of this phase is given in Fig. 5.
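For reference, the adversarial training of the generator against the discriminator described above corresponds to the standard GAN minimax objective [3], where $G$ is the generator, $D$ the discriminator, $p_{\text{data}}$ the data distribution, and $p_z$ the noise prior:

```latex
\min_G \max_D \; V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}(x)}\big[\log D(x)\big]
  + \mathbb{E}_{z \sim p_z(z)}\big[\log\big(1 - D(G(z))\big)\big]
```

The discriminator is trained to maximize this value while the generator is trained to minimize it, which drives the generated images toward the data distribution.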

Fig. 5. Sample Output of preprocessing phase

3.3 Leaf Segmentation

Segmentation is the process of dividing an image into multiple regions, each representing a different object or part of the image. Semantic segmentation is a technique that associates each pixel in the image with a label representing the object or region it belongs to. This process involves feeding an image to a neural network model that generates a probability map for each pixel in the image, representing the likelihood of that pixel belonging to a certain class or segment. The probability maps are thresholded to obtain a binary mask for each class, and these masks can be combined to obtain the final segmentation mask representing the different segments or regions in the image.
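The thresholding-and-combining step described above can be sketched with NumPy: per-class probability maps are binarized, and an argmax merges them into one label mask (a simplified illustration of the idea, not the paper's exact pipeline):

```python
import numpy as np

def masks_from_probabilities(prob_maps, threshold=0.5):
    """prob_maps: array of shape (num_classes, H, W) of per-pixel class probabilities.
    Returns per-class binary masks and a combined label mask (0 = background)."""
    binary = prob_maps >= threshold              # one binary mask per class
    labels = np.argmax(prob_maps, axis=0) + 1    # most likely class per pixel, 1-based
    labels[~binary.any(axis=0)] = 0              # pixels below threshold -> background
    return binary, labels

# Toy example: 2 classes over a 2x2 image
probs = np.array([[[0.90, 0.20],
                   [0.10, 0.40]],
                  [[0.05, 0.70],
                   [0.20, 0.30]]])
binary, labels = masks_from_probabilities(probs)
# labels: class 1 at (0,0), class 2 at (0,1), background elsewhere
```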

Fig. 6. Segmentation Output

Segmentation is important for various computer vision applications such as object detection, image recognition, and medical image analysis. In this research


work we have utilized both instance and semantic segmentation techniques to detect and segment individual objects and different regions in an image. The Mask R-CNN model is used for instance segmentation and generates bounding boxes around each object, while the semantic segmentation model combines the VGGSegnet and UNet models to classify the segmented objects into different classes. The resulting segmented image can be used for various tasks such as object detection, classification, and localization in different applications. Sample output of the segmentation phase is given in Fig. 6.

3.4 Model Building

A hybrid model was created by combining DenseNet, ResNet9, and EfficientNetB1 for the purpose of classification and segmentation of plant diseases using preprocessed and segmented images. The model input was preprocessed images of 256 × 256 × 3 dimensions. The paper uses the DenseNet121 architecture as the base model for plant disease classification and segmentation. The model is built using the Keras library and is trained on a dataset of plant disease images. In this project a hybrid model is created using three different architectures: DenseNet121, ResNet9, and EfficientNetB1, which are chosen for their ability to classify images with high accuracy and extract features effectively. The model is created by concatenating the output of the three architectures and passing it through fully connected layers. The Adam optimizer is utilized in this research work. The hybrid model is effective in extracting features and classifying images with high accuracy by combining the strengths of each architecture.
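The fusion step described above, concatenating backbone outputs and passing them through fully connected layers, can be illustrated framework-free. The three "backbones" below are random stand-ins for DenseNet121, ResNet9, and EfficientNetB1 feature extractors, and the dense-layer weights are random, purely for demonstration:

```python
import numpy as np

rng = np.random.default_rng(1)

def softmax(z):
    """Numerically stable softmax over a 1-D score vector."""
    e = np.exp(z - z.max())
    return e / e.sum()

# Stand-in feature vectors: in the paper these come from DenseNet121,
# ResNet9, and EfficientNetB1 backbones applied to the same image.
features = [rng.standard_normal(128) for _ in range(3)]

fused = np.concatenate(features)            # (384,) fused feature vector
W = rng.standard_normal((8, fused.size))    # fully connected layer: 8 disease classes
probs = softmax(W @ fused)                  # class probabilities
predicted_class = int(np.argmax(probs))
```

The design choice here is late fusion: each backbone contributes its own feature view, and only the classifier head sees the combined representation.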

4 Experiments and Results

4.1 Dataset

In this research work the PlantVillage dataset [11] is utilized. The PlantVillage dataset is a large-scale dataset used for plant disease recognition and classification. It contains over 50,000 images of healthy and diseased plant leaves belonging to 14 crop species, such as tomato, potato, corn, and grape. The images were collected from various sources, including field surveys and plant clinics. Each image is annotated with the corresponding plant species and disease class label, making it a valuable resource for developing machine learning models for plant disease detection and diagnosis. The classes used in this research work are given in Fig. 7. The dataset is publicly available and has been widely used by researchers to evaluate the performance of their algorithms. Some sample images of the dataset are given in Fig. 8.
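Datasets in the PlantVillage layout are organized as one folder per class; a minimal sketch for indexing such a layout and making a deterministic train/validation split might look like this (the helper names and folder layout are illustrative assumptions, not the authors' code):

```python
import random
from pathlib import Path

def list_dataset(root):
    """Map class-folder name -> sorted list of image paths (PlantVillage-style layout)."""
    root = Path(root)
    return {d.name: sorted(p for p in d.iterdir()
                           if p.suffix.lower() in {".jpg", ".jpeg", ".png"})
            for d in sorted(root.iterdir()) if d.is_dir()}

def train_val_split(files, val_frac=0.2, seed=42):
    """Deterministic shuffle-and-split of one class's file list."""
    files = list(files)
    random.Random(seed).shuffle(files)
    cut = int(round(len(files) * (1 - val_frac)))
    return files[:cut], files[cut:]
```

Splitting per class, as sketched here, keeps the class balance of the training and validation sets roughly equal to that of the full dataset.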


Fig. 7. Dataset Classes and Labels

Fig. 8. Sample Images of Dataset [11]

4.2 Performance Measures

In this research we have utilized several performance evaluation measures for assessing the performance of all experiments. The mathematical formulation of these measures is given in Eqs. (1)–(5):

$$\mathrm{Acc} = \frac{TP + TN}{TP + TN + FP + FN} \quad (1)$$

$$\mathrm{Sen} = \frac{TP}{TP + FN} \quad (2)$$

$$\mathrm{Spe} = \frac{TN}{TN + FP} \quad (3)$$

$$\mathrm{Pre} = \frac{TP}{TP + FP} \quad (4)$$

$$\mathrm{F1} = \frac{2 \times \mathrm{Pre} \times \mathrm{Sen}}{\mathrm{Pre} + \mathrm{Sen}} \quad (5)$$

4.3 Analysis of Experiments

In this study, we carried out six experiments to assess the effectiveness of several deep learning models for classifying plant diseases. The models employed in the experiments were EfficientNetB1, ResNet9, VGG16, Unet, MaskRCNN, and the hybrid model we propose. The objective was to identify the most effective model for correctly classifying diseases in leaf images. We used a number of performance measures to statistically analyse the performance of these models; these measures are necessary for evaluating the models' classification reliability and accuracy. Accuracy (Acc), Specificity (Spe), Sensitivity (Sen), Precision (Pre), and F1 Score are the performance metrics employed in this study. We utilised the Support Vector Machine (SVM) classifier for our classification tasks. SVM is a popular supervised learning method that has produced promising results in a range of classification problems, including image classification; its capacity to handle high-dimensional data and identify optimal decision boundaries makes it appropriate for our disease classification task. By conducting an extensive analysis of the experiments and contrasting the performance metrics of each model, we aim to determine the best deep learning architecture for precise and dependable disease classification. The outcomes of this study will help enhance automated disease detection technologies, which are essential for guaranteeing the quality and productivity of agricultural practices. Class-wise detailed quantitative results are given in Table 1. It is clear from the results that our proposed method achieved better results compared to the other models. We have also computed the average performance across all experiments to compare the methods; the average performance is given in Table 2. The results show that the hybrid model achieved the best performance.

The final results obtained using the considered algorithms are given in Table 2. It is clear from the results that our proposed hybrid method achieved better results compared to the other methods. Table 2 shows that the proposed hybrid model, with an accuracy of 98.78%, had the highest accuracy rate. This shows that the proposed method outperforms existing methods, such as EfficientNetB1, VGG16, MaskRCNN, and Unet, in terms of accuracy. Figure 9 provides the confusion matrix of the proposed method's best results, from which the hybrid method's results can be verified. Table 1 further confirms the authenticity of the proposed method's results: our hybrid method achieved promising results compared to the standard models.
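Eqs. (1)–(5) map directly onto confusion-matrix counts; a per-class helper might look like this (a sketch, not the authors' code):

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, sensitivity/recall, specificity, precision and F1 from raw
    confusion-matrix counts, following Eqs. (1)-(5)."""
    acc = (tp + tn) / (tp + tn + fp + fn)
    sen = tp / (tp + fn)
    spe = tn / (tn + fp)
    pre = tp / (tp + fp)
    f1 = 2 * pre * sen / (pre + sen)
    return {"Acc": acc, "Sen": sen, "Spe": spe, "Pre": pre, "F1": f1}

# Example: 90 true positives, 95 true negatives, 5 false positives, 10 false negatives
m = classification_metrics(tp=90, tn=95, fp=5, fn=10)
```

In a multi-class setting such as the 8-class experiments above, these counts are computed one-vs-rest for each class and then averaged to obtain the per-method figures reported in Table 2.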

K. Taji et al.

Table 1. Quantitative analysis of all experiments

Method           Class   Rec      Spe      Pre      F1
EfficientNetB1   1       0.8980   0.9874   0.9108   0.9043
                 2       0.9060   0.9863   0.9042   0.9051
                 3       0.8880   0.9937   0.9528   0.9193
                 4       0.9740   0.9940   0.9587   0.9663
                 5       0.9860   0.9966   0.9860   0.9860
                 6       0.9498   0.9829   0.8874   0.9176
                 7       0.9120   0.9863   0.9138   0.9129
                 8       0.9400   0.9949   0.9631   0.9514
ResNet9          1       0.9840   0.9837   0.8962   0.9380
                 2       0.8800   0.9991   0.9932   0.9332
                 3       0.9980   0.9997   0.9980   0.9980
                 4       0.9980   0.9974   0.9823   0.9901
                 5       0.9860   0.9989   0.9860   0.9860
                 6       0.9920   0.9991   0.9940   0.9930
                 7       0.9820   0.9971   0.9939   0.9879
                 8       0.9960   0.9986   0.9901   0.9930
VGG16            1       0.8420   0.9969   0.9745   0.9034
                 2       0.9720   0.9789   0.8679   0.9170
                 3       0.9960   1.0000   1.0000   0.9980
                 4       0.9340   0.9971   0.9790   0.9560
                 5       0.9880   0.9946   0.9880   0.9880
                 6       0.9780   0.9989   0.9919   0.9849
                 7       0.9320   0.9894   0.9395   0.9357
                 8       0.9860   0.9911   0.9408   0.9629
MaskRCNN         1       0.9620   0.9971   0.9796   0.9707
                 2       0.9720   0.9969   0.9779   0.9749
                 3       0.9720   0.9909   0.9382   0.9548
                 4       0.9700   1.0000   1.0000   0.9848
                 5       0.9800   0.9986   0.9800   0.9800
                 6       0.9260   0.9946   0.9606   0.9430
                 7       0.9720   0.9917   0.9586   0.9652
                 8       0.9940   0.9943   0.9613   0.9774
Unet             1       0.9620   0.9971   0.9796   0.9707
                 2       0.9720   0.9969   0.9779   0.9749
                 3       0.9920   0.9909   0.9394   0.9650
                 4       0.9900   0.9997   0.9980   0.9940
                 5       0.9780   0.9986   0.9780   0.9780
                 6       0.9240   0.9971   0.9788   0.9506
                 7       0.9720   0.9946   0.9779   0.9749
                 8       0.9940   0.9943   0.9613   0.9774
Proposed Hybrid  1       0.9660   0.9994   0.9959   0.9807
Method           2       0.9880   0.9966   0.9763   0.9821
                 3       0.9980   1.0000   1.0000   0.9990
                 4       0.9900   0.9997   0.9980   0.9940
                 5       0.9880   0.9983   0.9880   0.9880
                 6       0.9860   0.9980   0.9860   0.9860
                 7       0.9920   0.9969   0.9880   0.9900
                 8       0.9940   0.9971   0.9803   0.9871

Table 2. Average Performance Analysis of All Experiments

Method                   Acc (%)   Rec (%)   Spe (%)   Pre (%)   F1 (%)
EfficientNetB1           93.10     93.17     99.02     93.46     93.32
ResNet9                  97.78     97.70     99.67     97.92     97.81
VGG16                    95.34     95.35     99.34     96.02     95.68
MaskRCNN                 96.85     96.85     99.55     96.95     96.90
Unet                     97.34     97.30     99.61     97.39     97.34
Proposed Hybrid Method   98.78     98.78     99.83     98.91     98.84


Fig. 9. Dataset Classes and Labels

5 Comparison with Existing Methods

In this section, the results of the proposed approach are compared with those of recent state-of-the-art methods. Table 3 contrasts the proposed hybrid technique with existing methods in terms of the reported performance metrics. The comparison shows that our approach produced better results for plant disease classification, with an accuracy rate of 98.78%.

Table 3. Comparison with existing methods

Reference         Year   Acc (%)   Rec (%)   Spe (%)   Pre (%)   F1 (%)
[27]              2021   98.23     –         –         –         –
[21]              2021   97.15     –         –         –         –
[1]               2021   97.11     97.00     –         97.11     97.11
[28]              2021   91.43     –         –         –         –
[15]              2022   97.60     –         –         –         –
Proposed Method          98.78     98.78     99.83     98.91     98.84


Fig. 10. Dataset Classes and Labels

6 Conclusion

The development of advanced technology in the field of computer vision has made it possible to classify plant diseases with high accuracy, and recent research on plant disease detection using machine learning and deep learning approaches has produced encouraging results. In this paper, we worked with three different models to classify plant diseases and to detect and localize the affected area. We found that the EfficientNetB1 classifier outperformed the other two models we used, with an accuracy of 97.38%. To achieve even better results, however, we proposed a hybrid model that incorporates multiple models and techniques; using it, we achieved an accuracy of 98.78% on the classification task, demonstrating the effectiveness of combining multiple models and techniques. To localize and detect the affected area, we used instance and semantic segmentation, which are powerful techniques for identifying the regions of an image affected by disease. Among the three segmentation models we utilized (Mask-RCNN, Unet, and VGGSegnet), we found that VGGSegnet and Unet performed best. Specifically, U-Net was able to classify and detect the diseased area and also localize the affected percentage area, making it a powerful tool for plant disease recognition. The approach proposed in this paper provides an effective method for detecting plant diseases and identifying their location and extent. By combining multiple models and techniques, we achieved high accuracy in the classification and segmentation tasks. This research has important implications for the agriculture industry, as it can help farmers identify and address plant diseases quickly, improving crop yields and reducing economic losses. In conclusion, the proposed approach demonstrates the potential of advanced computer vision techniques to improve plant disease recognition.
By combining multiple models and techniques, we were able to achieve high accuracy in both classification and segmentation tasks. The results of this study provide valuable insights into the field of plant disease recognition and have important implications for the agriculture industry. With further research and development, plant disease recognition is likely to continue to improve, ultimately leading to better crop yields and reduced economic losses for farmers. A graphical comparison of the proposed method with state-of-the-art methods is presented in Fig. 10.

7 Future Work

The development of the web application and its successful results on unseen data open up many future possibilities for this work. One is to extend the model to detect and classify multiple diseases in a plant simultaneously. Another is to incorporate more advanced computer vision techniques: for instance, instead of relying on RGB images, hyperspectral imaging can capture spectral information across a wider range of the electromagnetic spectrum, providing more detailed information about the plant and its disease. Additionally, the proposed model can be enhanced with more advanced deep learning architectures such as transformers, which have shown significant improvements in natural language processing and image recognition tasks; these models can handle complex relationships between different parts of an image and learn more abstract features. The proposed model can also be integrated into IoT-based monitoring systems to provide real-time disease detection, allowing farmers to take timely preventive measures and optimize their crop yields. Our proposed model shows promising results in detecting and classifying plant diseases using computer vision techniques. The successful development of a web application, future plans for Android and iOS apps, and integration with IoT systems will make the model more accessible to farmers and help them make informed decisions about their crops. The directions discussed can further enhance the model's performance and open up more possibilities for plant disease detection and monitoring.

Acknowledgements. We acknowledge that all authors have no conflict of interest.

References 1. Abbas, A., Jain, S., Gour, M., Vankudothu, S.: Tomato plant disease detection using transfer learning with C-GAN synthetic images. Comput. Electron. Agric. 187, 106279 (2021) 2. Agarwal, M., Kotecha, A., Deolalikar, A., Kalia, R., Yadav, R.K., Thomas, A.: Deep learning approaches for plant disease detection: a comparative review. In: 2023 IEEE International Students’ Conference on Electrical, Electronics and Computer Science (SCEECS), pp. 1–6. IEEE (2023) 3. Aggarwal, A., Mittal, M., Battineni, G.: Generative adversarial network: an overview of theory and applications. Int. J. Inf. Manag. Data Insights 1(1), 100004 (2021)


4. Ahmad, A., Saraswat, D., El Gamal, A.: A survey on using deep learning techniques for plant disease diagnosis and recommendations for development of appropriate tools. Smart Agric. Technol. 3, 100083 (2023) 5. Azim, M.A., Islam, M.K., Rahman, M.M., Jahan, F.: An effective feature extraction method for rice leaf disease classification. Telkomnika (Telecommun. Comput. Electron. Control) 19(2), 463–470 (2021) 6. Barkau, R.L.: UNET, One-Dimensional Unsteady Flow Through a Full Network of Open Channels: User's Manual. US Army COE, Hydrologic Engineering Center (1996) 7. Bharati, P., Pramanik, A.: Deep learning techniques-R-CNN to mask R-CNN: a survey. Comput. Intell. Pattern Recogn. Proc. CIPR 2019, 657–668 (2020) 8. Bhatt, P., Sarangi, S., Pappula, S.: Comparison of CNN models for application in crop health assessment with participatory sensing. In: 2017 IEEE Global Humanitarian Technology Conference (GHTC), pp. 1–7. IEEE (2017) 9. Chand, S., Hari, R.: Plant disease identification and suggestion of remedial measures using machine learning. In: 2022 6th International Conference on Computing Methodologies and Communication (ICCMC), pp. 895–901. IEEE (2022) 10. Daniel, J., Rose, J., Vinnarasi, F., Rajinikanth, V.: VGG-UNet/VGG-SegNet supported automatic segmentation of endoplasmic reticulum network in fluorescence microscopy images. Scanning 2022, 7733860 (2022) 11. Hughes, D., Salathé, M., et al.: An open access repository of images on plant health to enable the development of mobile disease diagnostics. arXiv preprint arXiv:1511.08060 (2015) 12. Kamal, K., Yin, Z., Wu, M., Wu, Z.: Depthwise separable convolution architectures for plant disease classification. Comput. Electron. Agric. 165, 104948 (2019) 13. Kantale, P., Thakare, S.: A review on pomegranate disease classification using machine learning and image segmentation techniques. In: 2020 4th International Conference on Intelligent Computing and Control Systems (ICICCS), pp. 455–460. IEEE (2020) 14.
Kc, K., Yin, Z., Li, D., Wu, Z.: Impacts of background removal on convolutional neural networks for plant disease classification in-situ. Agriculture 11(9), 827 (2021) 15. Khan, K., Khan, R.U., Albattah, W., Qamar, A.M.: End-to-end semantic leaf segmentation framework for plants disease classification. Complexity 2022, 1168700 (2022) 16. Kumar, R., Chug, A., Singh, A.P., Singh, D.: A systematic analysis of machine learning and deep learning based approaches for plant leaf disease classification: a review. J. Sens. 2022, 1–13 (2022) 17. Lu, J., Tan, L., Jiang, H.: Review on convolutional neural network (CNN) applied to plant leaf disease classification. Agriculture 11(8), 707 (2021) 18. Nazarov, P.A., Baleev, D.N., Ivanova, M.I., Sokolova, L.M., Karakozova, M.V.: Infectious plant diseases: etiology, current status, problems and prospects in plant protection. Acta Naturae 12(3), 46 (2020) 19. Pokkuluri, K.S., Nedunuri, S.U.D., Devi, U.: Crop disease prediction with convolution neural network (CNN) augmented with cellular automata. Int. Arab J. Inf. Technol. 19(5), 765–773 (2022) 20. Sarvamangala, D., Kulkarni, R.V.: Convolutional neural networks in medical image understanding: a survey. Evol. Intel. 15(1), 1–22 (2022) 21. Sembiring, A., Away, Y., Arnia, F., Muharar, R.: Development of concise convolutional neural network for tomato plant disease classification based on leaf images. J. Phys. Conf. Ser. 1845, 012009 (2021). IOP Publishing


22. Sethy, P.K., Barpanda, N.K., Rath, A.K., Behera, S.K.: Image processing techniques for diagnosing rice plant disease: a survey. Procedia Comput. Sci. 167, 516–530 (2020) 23. Sharma, P., Berwal, Y.P.S., Ghai, W.: Performance analysis of deep learning CNN models for disease detection in plants using image segmentation. Inf. Process. Agric. 7(4), 566–574 (2020) 24. Shoaib, M., et al.: Deep learning-based segmentation and classification of leaf images for detection of tomato plant disease. Front. Plant Sci. 13, 1031748 (2022) 25. Singh, D., Jain, N., Jain, P., Kayal, P., Kumawat, S., Batra, N.: PlantDoc: a dataset for visual plant disease detection. In: Proceedings of the 7th ACM IKDD CoDS and 25th COMAD, pp. 249–253 (2020) 26. Sood, M., Singh, P.K., et al.: Hybrid system for detection and classification of plant disease using qualitative texture features analysis. Procedia Comput. Sci. 167, 1056–1065 (2020) 27. Swaminathan, A., Varun, C., Kalaivani, S., et al.: Multiple plant leaf disease classification using densenet-121 architecture. Int. J. Electr. Eng. Technol 12, 38–57 (2021) 28. Xian, T.S., Ngadiran, R.: Plant diseases classification using machine learning. J. Phys. Conf. Ser. 1962, 012024 (2021). IOP Publishing

Water Amount Prediction for Smart Irrigation Based on Machine Learning Techniques

Hamed Laouz(B), Soheyb Ayad, Labib Sadek Terrissa, and M'hamed Mancer
LINFI Laboratory, University of Biskra, Biskra, Algeria
[email protected]

Abstract. Water is a critical resource that needs to be perfectly managed in the agriculture field to achieve high crop production with minimum water usage and without wastage. In this paper, we propose a smart irrigation solution using different Machine Learning (ML) models to predict the daily irrigation water amount for the cucumber crop. The ML models take the plant's environmental condition parameters as input to predict the suitable amount of water as output. The results show that Support Vector Regression was the best model, giving the highest coefficient of determination (R2 score ≈ 60%) with the smallest Mean Squared Error value (0.28).

Keywords: Precision irrigation · Water amount prediction · Smart agriculture · Hyperparameter tuning · Machine learning

1 Introduction

The globe's population has more than tripled since the middle of the 20th century: it reached 8.0 billion on November 15, 2022, from an estimated 2.5 billion in 1950, adding 1 billion people since 2010 and 2 billion since 1998 [1]. With population growth, food demand will increase too [2], because agriculture is the primary food source [3]. It is therefore essential to enhance the agricultural sector by fixing the problems encountered there, all the more since water is one of the main requirements for plants to grow [4] and fresh water is increasingly scarce, especially under high population growth [5]. Efficient and rational use of irrigation water in agriculture may help achieve high crop production with less water use.

The water scarcity problem has motivated researchers to propose different solutions to manage the irrigation water amount. Some proposed solutions monitor the soil moisture: if the currently captured value crosses a predefined threshold, the system starts irrigating the plants; otherwise, it stays off [10,11]. Other solutions predict the soil moisture value rather than capturing it with sensors, which may reduce system costs. This type of system leverages the power of artificial intelligence (AI) on complex problems to predict the exact soil moisture value, which is viewed as information describing the plant's water needs [7–9]. Although these works may be considered promising for managing the irrigation process, they still suffer from problems that limit their use as a rational solution for managing the water resource: some took only a few plant environment parameters into account [7–9], while others used simple linear models [6]. In this work, we propose different Machine Learning models that take various plant environment parameters as inputs to predict the exact daily water amount the plant needs directly, instead of predicting soil moisture.

This paper is organized as follows. Section 2 presents the methodology of this study, starting with the source of the data we used, then the pre-processing techniques and the machine learning models. It is followed by the experimental study section, where we present the results obtained with the proposed models, the evaluation metrics, and the hyper-parameter tuning. In Sect. 4, we discuss the obtained results. Finally, the conclusion is provided in Sect. 5.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024
M. Ben Ahmed et al. (Eds.): SCA 2023, LNNS 938, pp. 21–30, 2024. https://doi.org/10.1007/978-3-031-54376-0_2
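The threshold-based baseline cited above [10,11] reduces to a simple on/off rule. A minimal sketch of that logic follows; the 30% threshold and the "irrigate when moisture falls below it" convention are our illustrative assumptions, and [10,11] describe hardware variants:

```python
def irrigation_command(soil_moisture: float, threshold: float = 30.0) -> bool:
    """Baseline on/off controller: start irrigating when the captured soil
    moisture (in %) drops below a predefined threshold; stay off otherwise.
    The threshold value 30.0 is a hypothetical example."""
    return soil_moisture < threshold
```

This rule reacts only to the instantaneous reading, which is exactly the limitation that motivates predicting the daily water amount from richer environmental inputs.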

2 Methodology

2.1 Data Source and Description

Since the plant's environment is thought to be the primary factor controlling the irrigation process, gathering information about it is crucial for forecasting the irrigation water amount. We therefore used the Autonomous Greenhouse Competition (AGC) datasets (first edition) [12]. The AGC, organized in the Netherlands, put six teams (five groups and an experts' team) in charge of growing a cucumber crop, each in a greenhouse (GH) equipped with a variety of sensors. Their task was to create machine learning (ML) and deep learning (DL) models that accurately monitor the plant's needs in terms of irrigation, ventilation, heating, carbon dioxide dosage, and other factors. The AGC provides six dataset collections (one per team); each collection contains six datasets:

– Crop Management: general information about the plant crop.
– Greenhouse Climate: information about the climate inside the GH where the cucumber plant was grown.
– VIP: the set points used by the teams to automatically manage the GH.
– Irrigation: information about the irrigation process.
– Production: information about the production.
– Resources Calculation: information about the resources consumed by the teams to grow their plants.

We used only the "GH Climate" and "Irrigation" datasets, because they contain the essential information that may affect the irrigation process [13].


GH Climate Dataset. Table 1 describes the "GH Climate" dataset, which contains information about the climate inside the GH in which the cucumber plant grew. This dataset contains 33133 rows distributed over 115 days of growing, where each row represents the data captured over 5 min.

Table 1. GH Climate dataset [12]

Feature      Description                                          Unit
Tair         GH air temperature                                   °C
RHair        GH relative air humidity                             %
AssimLight   Artificial light used inside the GH                  %
CO2air       CO2 concentration in the GH air                      ppm
HumDef       Humidity deficit inside the GH                       g/m2
Ventwind     Ventilation wind                                     %
PipeLow      Temperature of the rail pipe heating on the floor    °C
PipeGrow     Temperature of the pipe heating at crop height       °C
GHTime       GH time index                                        Minutes

Irrigation Dataset. Table 2 describes the "Irrigation" dataset, which contains information about the irrigation process during the growing days. This dataset has 115 rows; each row corresponds to the values captured for that day. We dropped the "drain" parameter because it was used in the AGC competition to calculate the net water used, which is not our purpose here.

Table 2. Irrigation dataset [12]

Feature    Description                                      Unit
pH Drain   The daily average of the pH of the drain water   [–]
EC Drain   The daily average of the EC of the drain water   dS/m
drain      The daily drain water                            l/m2
water      The supplied water for irrigation per day        l/m2
Time       Day index                                        /

The "Sonoma" team won this competition, so we used their datasets as a reference for the ideal environmental conditions that the plant needs to grow.

2.2 Data Pre-processing

This section presents the techniques used for pre-processing the data.

Handle Missing Values. As Table 3 shows, each of the "GH Climate" parameters contains 142 missing values. Because these represent only a small fraction of the data, we decided to ignore them when reconstructing the data. For the "Irrigation" dataset, only the water parameter contains a missing value (one cell, i.e. 0.87% of its 115 rows). Because the water parameter is our target variable and each cell in the irrigation column represents a whole day, we deleted this cell and all data concerning that day from both the "Irrigation" and "GH Climate" datasets.

Table 3. Missing values of "GH Climate" and "Irrigation" datasets

Dataset      Parameter        Missing values
GH Climate   All parameters   142
Irrigation   water            1
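This day-level deletion can be sketched in pandas; the column names and values below are hypothetical stand-ins for the AGC files, not their real schema:

```python
import pandas as pd

# Hypothetical frames mirroring the AGC files (column names assumed).
irrigation = pd.DataFrame({"Time": [1, 2, 3], "water": [1.2, None, 0.9]})
climate = pd.DataFrame({"day": [1, 1, 2, 2, 3],
                        "Tair": [20.1, None, 21.0, 20.5, 19.8]})

# Drop every day whose target (water) is missing, from both datasets.
bad_days = irrigation.loc[irrigation["water"].isna(), "Time"]
irrigation = irrigation[~irrigation["Time"].isin(bad_days)]
climate = climate[~climate["day"].isin(bad_days)]

# Remaining climate gaps are simply ignored when daily averages are
# computed, since mean() skips NaN by default.
daily_tair = climate.groupby("day")["Tair"].mean()
```

Dropping the whole day keeps the two datasets aligned, which matters because each irrigation row summarizes an entire day of climate rows.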

Data Reconstruction. Different parameters were collected at different intervals: some were captured every five minutes while others were captured daily. We therefore unified the time base to a daily range by:

– calculating the daily working time (in minutes) of the "AssimLight" parameter;
– calculating the daily average of the remaining parameters.

Scaling Data. Feature scaling standardizes a feature set whose data vary highly in magnitude, units, and range [15]. Scaling is an important step before training the models because the features employed have varied ranges (%, °C, [1–10] for pH, etc.). We therefore scaled our data into the single range [0–1] using the Min-Max scaler.

2.3 Proposed Models

Figure 1 presents the pre-processing techniques and the ML models used in this study:

– Linear regression
– Random Forest Regression (RFR)
– Support Vector Regression (SVR)
– K-Nearest Neighbors Regression (KNR)
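The scaling step and the four models can be sketched with scikit-learn as follows. The data here are synthetic placeholders for the AGC features, and we assume the "alpha"-regularized linear model is ridge regression, which the paper does not state explicitly:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler
from sklearn.linear_model import Ridge
from sklearn.ensemble import RandomForestRegressor
from sklearn.svm import SVR
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.default_rng(0)
X = rng.uniform(size=(114, 9))   # one row per growing day, 9 climate features
y = rng.uniform(size=114)        # daily irrigation water amount (target)

# Min-Max scaling to [0, 1], since the features use different units/ranges.
X = MinMaxScaler().fit_transform(X)
y = MinMaxScaler().fit_transform(y.reshape(-1, 1)).ravel()

# The four regressors, with the hyper-parameter values reported later
# in the paper (Table 4).
models = {
    "Linear (Ridge, alpha=0.1)": Ridge(alpha=0.1),
    "RFR (max_depth=6)": RandomForestRegressor(max_depth=6, random_state=0),
    "SVR (C=36, epsilon=0.1)": SVR(C=36, epsilon=0.1),
    "KNR (k=3)": KNeighborsRegressor(n_neighbors=3),
}
for name, model in models.items():
    model.fit(X, y)
```

With real data, the fitted models would then be scored on a held-out split rather than on the training days.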


Fig. 1. Diagram of the pre-processing techniques and ML model used

3 Experimental Study

3.1 Evaluation Metrics

In order to evaluate the proposed models we used the following metrics.

Coefficient of Determination (R2 Score). The R2 score can be defined as the proportion of variance 'explained' by the regression model [14]:

R^2 = 1 - \frac{RSS}{TSS}    (1)

where RSS is the residual (unexplained) sum of squares and TSS is the total sum of squares.

Mean Squared Error (MSE). MSE measures the amount of error of a given model; the closer this value is to 0, the fewer errors the model makes [16]:

MSE = \frac{1}{N} \sum_{i=1}^{N} (y_i - \hat{y}_i)^2    (2)

where y_i is the real output value (irrigation water amount) at instant i, \hat{y}_i is the predicted value at instant i, and N is the number of samples (days).

3.2 Hyper-Parameter Tuning

Hyper-parameter tuning consists of finding the parameter values that give the best results [17]. In this section, we search for the hyper-parameter values that maximise the R2 score of each ML model.
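A search of this kind can be sketched with scikit-learn's GridSearchCV, scoring each configuration by R2. This is illustrative: the data are synthetic and the paper does not say which search tool was actually used.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVR

rng = np.random.default_rng(0)
X = rng.uniform(size=(114, 9))                       # scaled daily features (synthetic)
y = X @ rng.uniform(size=9) + rng.normal(scale=0.05, size=114)

# Grid over the SVR hyper-parameters explored in the paper.
grid = GridSearchCV(
    SVR(),
    param_grid={"C": [25, 36, 50], "epsilon": [0.1, 0.3, 0.5, 0.9]},
    scoring="r2",   # select the configuration maximising the R2 score
    cv=5,
)
grid.fit(X, y)
best_C = grid.best_params_["C"]
best_eps = grid.best_params_["epsilon"]
```

The same pattern applies to the other models (alpha for the linear model, max depth for the RFR, K for the KNR).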


KNR Hyperparameter Tuning. Figure 2 shows the R2 score using different k values. From the figure, we can see that the KNR model gives the best R2 score when K = 3 with a score of 0.51.

Fig. 2. R2 score For different K values

Linear Regression Hyperparameter Tuning. Figure 3 shows the R2 score for different alpha values (alpha is a well-known regularization parameter used to optimize linear models). The maximum R2 score was obtained at alpha = 0.1; the score then decreases as alpha increases.

Fig. 3. R2 score for different alpha values


RFR Hyperparameter Tuning. Figure 4 shows the R2 score for different max depth values, where we can see that when the max depth increases, the R2 score increases too until it reaches the best value (max depth = 6) and then the R2 score starts to decrease.

Fig. 4. R2 Score for different max depth values

SVR Hyperparameter Tuning. Figure 5 shows the R2 scores obtained with different C and epsilon values for the SVR. For C = 25, 36, and 50 the model gives better results than for the other C values (slightly better at C = 36). Moreover, for each C value the model gave its highest results when epsilon = 0.1 (the lightest-coloured points), compared to the other values (0.3, 0.4, ..., 0.9). In short, (C = 36; epsilon = 0.1) is the best configuration for the SVR.

3.3 Obtained Results

Table 4 shows the results of the different regression models with the hyper-parameter values obtained in the previous section, evaluated with the MSE and R2 score metrics. Most of the models achieved an R2 score ≥ 0.4, meaning they explain more than 40% of the variation of the irrigation water amount from the given inputs. The SVR model was the best among them: it has the highest R2 score (0.61) with the minimum MSE (0.28), which indicates a good fit. This also means that around 60% of the change in irrigation water amount has been


Fig. 5. R2 score for different C and epsilon values for SVR

explained by this model's input parameters (the plant's environmental conditions). The KNR model also gave a decent R2 score (0.51) with a moderate MSE value (0.35). On the other hand, linear regression and the RFR gave the worst R2 scores (0.44 and 0.34, respectively) with the highest MSE values (0.40 and 0.48, respectively), which supports our hypothesis (presented in Sect. 1) that the irrigation process is far more complex than a simple linear problem. Still, these results must improve further before being used to manage one of the most critical resources in agriculture: water.

Table 4. Results obtained

Model    Hyper-parameter         MSE    R2
Linear   alpha = 0.1             0.40   0.44
RFR      max depth = 6           0.48   0.34
SVR      C = 36; epsilon = 0.1   0.28   0.61
KNR      K = 3                   0.35   0.51

4 Discussion

The results obtained from the different models point to a possible smart irrigation solution for the water wastage caused by wrong irrigation practices: the best model, the SVR, gives a high R2 score (≈60%) with a reasonably good error margin (MSE = 0.28). Although the SVR results may be a promising basis for predicting the irrigation water amount in the agricultural field, they are not sufficient yet (the MSE values of the different models are not small). This is because these models treat the irrigation process as a discrete problem, without considering previous variations: their output is not affected by earlier changes to the plant, and even the irrigation water of the preceding days is not taken into account when producing the current result. That is a limitation, because plant irrigation is complicated, and every change that happens at a given time alters the future strategy. To achieve exact and accurate results, we need to treat this problem as a time series problem, where the model retains previous information to predict the current output. Auto-correlation is a statistical measure of the relation between the dependent variable at the current time and at earlier times. Table 5 shows the auto-correlation of the output variable (irrigation water) between the present day and previous days: the correlation is high (0.80) at a lag of one day, and even at a lag of 3 days it is ≈0.6, meaning the irrigation water amount of the present day is related to that of the last three days at roughly the 60% level.

Table 5. Auto-correlation of the dependent variable (water parameter)

Day lag            1      2      3      6      9
Auto-correlation   0.80   0.67   0.58   0.42   0.18
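The lagged auto-correlations in Table 5 are plain Pearson correlations between the daily water series and a shifted copy of itself; a small sketch:

```python
import numpy as np

def autocorr(series, lag):
    """Pearson correlation between the series and itself shifted by `lag` days."""
    x = np.asarray(series, dtype=float)
    a, b = x[:-lag], x[lag:]          # overlapping parts of original and shifted copy
    return np.corrcoef(a, b)[0, 1]
```

Applied to the real water column, `autocorr(water, 1)` gives the 0.80 reported above; pandas users can obtain the same number with `Series.autocorr(lag)`.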

5 Conclusion

In this paper, we used the datasets of the winning team of the AGC competition to propose a smart irrigation solution based on different ML models that predict the exact irrigation water amount needed for a cucumber plant grown inside the GH. The models take a variety of plant environment parameters as input and predict the daily water amount the plant needs. The results show a decent R2 score and a reasonable MSE value, especially for the SVR model, which achieved the highest score among the models used. However, because the irrigation process is very complex and can be affected by many factors, and because of the high auto-correlation of the water parameter, models more sophisticated than simple ML regressors will be needed to predict the exact amount of irrigation water.


References 1. Nations, U.: Population-United Nations-un.org. https://www.un.org/en/globalissues/population. Accessed 8 June 2023 2. United Nations: UN/DESA Policy brief #102: Population, Food Security, Nutrition and Sustainable Development/Department of Economic and Social Affairs. United Nations, 20 April 2021. https://www.un.org/development/desa/dpad/ publication/un-desa-policy-brief-102-population-food-security-nutrition-andsustainable-development/. Retrieved 25 Dec 2022 3. Food, Agriculture Organization of the United Nations, & World Water Assessment Programme (United Nations). Agriculture, Food and Water. FAO (2003) 4. Asawa, G.L.: Irrigation and Water Resources Engineering. New Age International (P) Ltd., Publishers, New Delhi (2008) 5. van Kooten, O., Heuvelink, E., Stanghellini, C.: New developments in greenhouse technology can mitigate the water shortage problem of the 21st Century. Acta Horticulturae (767), 45–52 (2008). https://doi.org/10.17660/actahortic.2008.767.2 6. Kumar, A., Surendra, A., Mohan, H., Valliappan, K., Kirthika, N.: Internet of things based smart irrigation using regression algorithm. In: 2017 International Conference on Intelligent Computing, Instrumentation and Control Technologies (ICICICT), pp. 1652–1657 (2017) 7. Huang, Y., et al.: Soil moisture content prediction model for tea plantations based on SVM optimised by the Bald Eagle Search algorithm. Cogn. Comput. Syst. 3(4), 351–360 (2021). https://doi.org/10.1049/ccs2.12034 8. Adeyemi, O., et al.: Dynamic neural network modelling of soil moisture content for predictive irrigation scheduling. Sensors 18(10), 3408 (2018). https://doi.org/ 10.3390/s18103408 9. Goap, A., Sharma, D., Shukla, A.K., Rama Krishna, C.: An IoT based smart irrigation management system using machine learning and open source technologies. Comput. Electron. Agric. 155, 41–49 (2018) 10. 
Boutraa, T., Akhkha, A., Alshuaibi, A., Atta, R.: Evaluation of the effectiveness of an automated irrigation system using wheat crops. Agric. Biol. J. N. Am. 88, 2151–7517 (2011). https://doi.org/10.5251/abjna.2011.2.1.80.88 11. Gutiérrez, J., Villa-Medina, J.F., Nieto-Garibay, A., Porta-Gándara, M.Á.: Automated irrigation system using a wireless sensor network and GPRS module. IEEE Trans. Instrum. Meas. 63, 166–176 (2014) 12. Hemming, S., de Zwart, H.F., Elings, A., Righini, I., Petropoulou, A.: Autonomous Greenhouse Challenge, 1st edn. (2018). 4TU.ResearchData. Dataset (2019). https://doi.org/10.4121/uuid:e4987a7b-04dd-4c89-9b18-883aad30ba9a 13. Ahmad, U., Alvino, A., Marino, S.: Solar fertigation: a sustainable and smart IoT-based irrigation and fertilization system for efficient water and nutrient management. Agronomy 12(5), 1012 (2022) 14. Nagelkerke, N.J.: A note on a general definition of the coefficient of determination. Biometrika 78(3), 691–692 (1991) 15. Paper, D.: Hands-on Scikit-Learn for Machine Learning Applications. Apress Berkeley, Berkeley (2020) 16. James, G., Witten, D., Hastie, T., Tibshirani, R.: An Introduction to Statistical Learning: with Applications in R. Springer, Cham (2013). https://doi.org/10.1007/978-1-4614-7138-7 17. Toal, D.J., Bressloff, N.W., Keane, A.J.: Kriging hyperparameter tuning strategies. AIAA J. 46(5), 1240–1252 (2008)

Smart Irrigation System Using Low Energy

Kamal Elhattab1(B), Karim Abouelmehdi2, Abdelmajid Elmoutaouakkil1, and Said Elatar1

1 LAROSERI Laboratory, FS Chouaib Doukkali University, El Jadida, Morocco
[email protected]
2 ELITES Laboratory, FS Chouaib Doukkali University, El Jadida, Morocco

Abstract. The Internet of Things (IoT) makes many areas of our daily lives more comfortable by connecting physical objects to the internet without human intervention. The development of new intelligent systems in agriculture has strengthened agricultural production, made it more profitable, and reduced production costs. The purpose of this article is to realize a new, fully autonomous model capable of functioning correctly in the agricultural field, especially in places where there is no internet connection or electricity. Our new model uses a solar panel, an ESP32 microcontroller, and the LoRa protocol to irrigate agricultural fields and ensure good water management. The performance of the new model is measured in terms of energy savings. This model will improve techniques for using IoT in agriculture.

Keywords: Solar panel · ESP32 · New model · IoT · LoRa

1 Introduction

The Internet of Things (IoT) is used in agriculture to improve productivity, reduce the workload on humans, and make farming more profitable. Nowadays there is a great shortage of water, and many agricultural fields in our country face this problem; several crops have been unable to increase productivity because of it. To address this problem, we propose automatic watering using a new intelligent model that optimizes the use of water in large-scale agricultural fields. We began with a theoretical study and analysis of the use of IoT technology in irrigation, which allowed us to identify several problems, for example powering IoT hardware in large agricultural fields and choosing technologies to link IoT devices. To solve these problems, a new approach has been developed that uses solar energy to power IoT equipment in large-scale agricultural fields, an ESP32 to interconnect the IoT collectors, and LoRa to ensure connectivity in places with a weak internet connection. In this article, a new intelligent irrigation model is proposed that optimizes the watering of crops to conserve water resources. The realized system contains several sensors that help provide automatic, accurate irrigation for farmers.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024
M. Ben Ahmed et al. (Eds.): SCA 2023, LNNS 938, pp. 31–42, 2024. https://doi.org/10.1007/978-3-031-54376-0_3


K. Elhattab et al.

2 Literature Survey

2.1 Data Collection

Our literature study relied on the search engines of three databases: IEEE, the ACM library, and Science Direct. We collected 12 publications on the different procedures and methods of using IoT in agriculture, published between 2019 and 2023. The search terms used were IoT, smart irrigation, and smart agriculture.

2.2 Data Processing

After analyzing and processing the 12 articles, we compared the technologies used and the data collected, as shown in Table 1. During this processing we excluded the amount of data collected, the numbers and typologies of sensors, and other intermediate devices, because they have no bearing on our research work.

Table 1. Internet of things in smart irrigation.

Article | Data/Sensors | Technologies
[1] | Soil moisture | Arduino, Wi-Fi, web technology
[2] | Soil moisture, temperature and humidity | Nodemcu and cloud technology
[3] | Soil moisture | Arduino, cloud technology and web technology
[4] | Soil moisture | Raspberry and cloud technology
[5] | Soil moisture and water level | Raspberry, cloud technology and mobile technology
[6] | Soil moisture, temperature and humidity | MQTT and web technology
[7] | Soil moisture, temperature, humidity, water level and MQ2 gas | WSN, machine learning, Wifi, cloud technology, Raspberry and Arduino
[8] | Soil moisture and temperature | Wifi, Raspberry, cloud technology and mobile technology
[9] | Soil moisture, temperature and humidity | WSN, Wifi, Zigbee, Raspberry and Arduino
[10] | Soil moisture, temperature and humidity | Wifi, Nodemcu, Arduino, cloud, mobile application
[11] | Soil moisture, temperature and humidity | MQTT, Wifi, Nodemcu, Raspberry and cloud
[12] | Soil moisture and temperature | Raspberry, cloud and mobile


3 Analysis and Discussion

3.1 Analysis of Data and Results

After data collection, qualitative and quantitative data on the use of IoT in automatic irrigation were available. Our study focuses on the types of technologies used, the types of data collected, and the field of application, with the aim of increasing efficiency and productivity and saving human and natural resources (water and electrical energy) in agriculture.

3.2 Data Collected

To obtain accurate results, IoT systems collect a large amount of data. In our study we noticed that most articles focus on soil moisture (40%), temperature (27%), and humidity (20%). Table 2 details the results obtained.

Table 2. Data collected.

Data collected | (%)
Soil moisture | 40%
Temperature | 27%
Humidity | 20%
Water (level) | 7%
MQ2 (gas) | 3%
NPK | 3%
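The shares in Table 2 (and Table 3 below) are simple frequency counts over the 12 surveyed articles. A sketch of the tally; the article-to-attribute mapping in the usage example is abbreviated and illustrative, not the paper's full survey data:

```python
from collections import Counter

def attribute_shares(per_article_items):
    """Percentage share of each item across all mentions in the surveyed articles."""
    counts = Counter(item for items in per_article_items for item in items)
    total = sum(counts.values())
    return {item: round(100 * n / total) for item, n in counts.items()}
```

For example, `attribute_shares([["Cloud", "Wifi"], ["Cloud"], ["Arduino", "Cloud"]])` tallies three mentions of Cloud out of five mentions in total, giving it a 60% share.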

3.3 Technologies Used

From the study of approaches already carried out in agriculture, we found that cloud technology is the most used (21%), followed by the Raspberry Pi (17%), Wifi (14%), and Arduino (12%). The results obtained are illustrated in Table 3.

3.4 Discussion

In the articles reviewed, the authors develop approaches using wires and batteries as sources of electrical energy, which makes powering IoT equipment a real challenge:

– The surface of agricultural fields is very large, which increases the cost of installing electric cables.
– Maintenance and repair are difficult in case of breakdown because the cables are underground.


Table 3. Technologies used.

Technology | (%)
Cloud | 21%
Raspberry Pi | 17%
Wifi | 14%
Arduino | 12%
Mobile technology | 10%
Nodemcu | 6%
Web technology | 5%
Zigbee, machine learning | 2%

– The electrical energy stored in a battery is limited (it lasts from a few hours to several years).
– Batteries must be bought and replaced, at a cost.

From the collection and study of the data, we noted that the majority of articles did not take the power supply of IoT devices into consideration and used Raspberry technology to link IoT devices; hence the need to propose a new system that takes the economical consumption of energy (electricity and water) into account. Large agricultural fields also contain places with a weak connection (absence of internet), so we must propose a solution able to solve this problem.

4 Prototype of the Proposed System

4.1 Proposed Solution

To improve existing approaches in agriculture, we propose a new approach based on solar (light) energy linked to a battery to provide electricity, together with low-energy equipment (the ESP32 microcontroller). To solve the problem of weak internet connection in agricultural areas, we use the LoRa protocol.

4.2 LoRa

We use the LoRa communication protocol to cover large agricultural fields in places with poor conditions (absence of internet and electricity). LoRa saves electrical energy because it sends small messages over a large distance.

4.3 Components

– Soil moisture sensors: The purpose of the soil moisture sensor, shown in Fig. 1, is to determine the moisture level in the agricultural field. If the agricultural field


needs water, the pump starts irrigating the agricultural field. If the desired humidity level is reached, the system stops the pump automatically.

Fig. 1. Soil moisture sensor [13]

– ESP32 microcontroller: The ESP32, as shown in Fig. 2, is an IoT development board developed and realized by Heltec Automation (TM). It integrates WiFi, BLE, and LoRa functions and is widely used in smart homes, smart cities, and smart farms.

Fig. 2. ESP32 [14]

– Solar panel: The purpose of the solar panel (see Fig. 3) is to provide our system with electricity to power the related IoT objects, since the solar panel absorbs light rays and transforms them into electrical energy.

Fig. 3. Solar panel


– Water pump: To irrigate the agricultural fields automatically, a water pump is used, as shown in Fig. 4. The pump's operation is tied to soil moisture: when the soil moisture value drops, the water pump starts working, and when the soil moisture increases, the ESP32 stops the pump.

Fig. 4. Water pump [15]

– Lithium battery: The lithium battery, as shown in Fig. 5, helps make the system autonomous in terms of electricity, because lithium batteries can store and supply the electrical energy coming from the solar panel.

Fig. 5. Lithium battery [15]

– Card (power bank): The charging module, as shown in Fig. 6, turns the lithium batteries into a bank of electrical energy able to power all electrical objects compatible with 5 V.

Fig. 6. Card for power bank [16]

– Electrical relay: This module, as shown in Fig. 7, switches the pump on and off.


Fig. 7. Relay [17]

– Rain sensor: We use the rain detector, as shown in Fig. 8, to detect rain in the agricultural field.

Fig. 8. Rain sensor [18]

4.4 Description of the New System

Our new model is composed of two units. The first unit senses the parameters of the agricultural land (soil moisture and humidity) and sends them to the second unit using LoRa. The second unit receives the data and forwards it to the cloud. A mobile application allows farmers to monitor their agricultural field remotely. Figure 9 shows the proposed new model.


Fig. 9. New system
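The unit-1 to unit-2 handoff in Sect. 4.4 can be sketched as a tiny payload format. The 4-byte frame below is an illustrative assumption — the paper does not specify its LoRa frame layout — but it shows why LoRa's small messages suffice here:

```python
import struct

def pack_readings(moisture: int, humidity: int) -> bytes:
    """Unit 1: pack the two sensor readings into a compact LoRa payload.

    Hypothetical layout: two big-endian uint16s (raw soil moisture value,
    humidity in percent) -- 4 bytes in total.
    """
    return struct.pack(">HH", moisture, humidity)

def unpack_readings(frame: bytes) -> dict:
    """Unit 2: recover the readings before forwarding them to the cloud."""
    moisture, humidity = struct.unpack(">HH", frame)
    return {"moisture": moisture, "humidity": humidity}
```

A round trip such as `unpack_readings(pack_readings(712, 55))` recovers the original readings from a 4-byte frame.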

5 Implementation and Testing

5.1 Working Steps

– The soil moisture sensor checks the soil moisture. When the soil moisture reading is greater than 700, the motor pumps water to the agricultural field.
– When the rain sensor detects rain, the pump is stopped automatically.
– All information collected from the sensors is transferred to the user through the mobile app via the cloud.

Figure 10 shows the steps in the overall process of our new model.

Fig. 10. Working steps

5.2 Mobile Application

We use the Blynk platform to develop our mobile application. It works with several types of microcontrollers, for example the ESP32 and the Arduino. The Blynk platform contains:

– Blynk (application): to drive IoT devices and view collected data.
– Blynk (server): a cloud server responsible for communication between IoT devices and the smartphone.
– Blynk (libraries): notification display formats and command buttons.

We created a simple mobile application consisting of a single interface that displays information from the ESP32. The mobile app displays real-time information from the Blynk cloud (see Fig. 11).

5.3 Test of the New System

A – Electricity. We tested the consumption of our battery for 7 days. Using the screen of our new model, illustrated in Fig. 12, we observed that with the equipment used, the battery's electrical energy consumption did not exceed 25% over the 7 days.

B – Water. Comparing our new model with other irrigation methods (manual irrigation and drip irrigation), as illustrated in Fig. 13, we found that our new model consumes the least water.


Fig. 11. Interface mobile application

Fig. 12. Test - Electricity

Fig. 13. Test – Water
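The working steps in Sect. 5.1 reduce to a simple control rule. A minimal plain-Python sketch of that rule (sensor reads and pump actuation are stubbed out; on the real system this logic runs in the ESP32 firmware loop, and 700 is the raw soil moisture reading used above):

```python
RAW_MOISTURE_THRESHOLD = 700  # raw sensor value from Sect. 5.1

def pump_should_run(moisture_reading: int, rain_detected: bool) -> bool:
    """Decide whether the water pump should be on.

    Mirrors the working steps: irrigate when the raw soil moisture reading
    exceeds the threshold, but stop immediately when rain is detected.
    """
    if rain_detected:
        return False
    return moisture_reading > RAW_MOISTURE_THRESHOLD
```

The rain check comes first so that a rain event always overrides the moisture condition, as described in the second working step.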


5.4 Critical Analysis

From testing our new model in large agricultural fields, we can say that we have succeeded in realizing a model able to take decisions automatically and to function correctly in places with a weak internet connection.

6 Conclusion

In this paper, a new automatic irrigation system has been proposed. To develop it, we studied, analyzed, and processed 12 approaches in the field of automatic irrigation. This study gave us the opportunity to answer two questions: which energy source should power the IoT devices, and which technology should connect the IoT objects. With the sensors and conditions used, the tests of our new model were positive. In the future, we will try to improve the model by adding surveillance sensors to secure the agricultural field.

References 1. Rohith, M., Sainivedhana, R., Sabiyath Fatima, N.: IoT enabled smart farming and irrigation system. In: 2021 5th International Conference on Intelligent Computing and Control Systems (ICICCS), Madurai, India, pp. 434-439. IEEE (2021). https://doi.org/10.1109/ICICCS51141. 2021.9432085 2. Veerachamy, R., Ramar, R., Balaji, S., Sharmila, L.: Autonomous application controls on smart irrigation. Comput. Electr. Eng. 100, 107855 (2022). https://doi.org/10.1016/j.compel eceng.2022.107855 3. V, A.P.G., Sree, R.S., Meera, S., Kalpana, R.A.: Automated irrigation system and detection of nutrient content in the soil. In: 2020 International Conference on Power, Energy, Control and Transmission Systems (ICPECTS), Chennai, India, pp. 1–3. IEEE (2020). https://doi.org/10. 1109/ICPECTS49113.2020.9336990 4. Sunehra, D.: Web Based Smart Irrigation System Using Raspberry PI. Int. J. Adv. Res. Eng. Technol. 10(2) (2019). https://doi.org/10.34218/IJARET.10.2.2019.006 5. Vij, A., Vijendra, S., Jain, A., Bajaj, S., Bassi, A., Sharma, A.: IoT and machine learning approaches for automation of farm irrigation system. Procedia Comput. Sci. 167, 1250–1257 (2020). https://doi.org/10.1016/j.procs.2020.03.440 6. Raju, K.L., Vijayaraghavan, V.: IoT and cloud hinged smart irrigation system for urban and rural farmers employing MQTT protocol. In: 2020 5th International Conference on Devices, Circuits and Systems (ICDCS), Coimbatore, India, pp. 71–75. IEEE (2020). https://doi.org/ 10.1109/ICDCS48716.2020.243551 7. Srinidhi, J.A., Aasish, A., Kumar, N.K., Ramakrishnaiah, T.: WSN smart irrigation system and weather report system. IOP Conf. Ser. Mater. Sci. Eng. 1042(1), 012018 (2021). https:// doi.org/10.1088/1757-899X/1042/1/012018 8. Tapakire, B.A.: IoT based Smart Agriculture using Thingspeak. Int. J. Eng. Res. V8(12), IJERTV8IS120185 (2019). https://doi.org/10.17577/IJERTV8IS120185 9. 
Thaher, T., Ishaq, I.: Cloud-based Internet of Things approach for smart irrigation system: design and implementation. In: 2020 International Conference on Promising Electronic Technologies (ICPET), Jerusalem, Palestine, pp. 32–37. IEEE (2020). https://doi.org/10.1109/ICP ET51420.2020.00015


10. Kesarwani, A., Mishra, D., Srivastva, A., Agrawal, K.K.: Design and development of automatic soil moisture monitoring with irrigation system for sustainable growth in agriculture. SSRN Electron. J. (2019). https://doi.org/10.2139/ssrn.3356380 11. Fathy, C., Ali, H.M.: A secure IoT-based irrigation system for precision agriculture using the expeditious cipher. Sensors 23(4), 2091 (2023). https://doi.org/10.3390/s23042091 12. Azry, A.S., Derahman, N., Mohamad, Z., Rahiman, A.R.A., Muzakkari, B.A., Mohamed, M.A.: Fuzzy logic-based intelligent irrigation system with mobile application. (18) (2022) 13. Soil Moisture Detection Humidity Sensor – Kuongshun Electronic Shop. https://kuongshun. com/products/soil-moisture-detection-humidity-sensor. Accessed 3 Mar 2023 14. Amazon.com. HiLetgo ESP-WROOM-32 ESP32 ESP-32S Development Board 2.4GHz Dual-Mode WiFi + Bluetooth Dual Cores Microcontroller Processor Integrated with Antenna RF AMP Filter AP STA for Arduino IDE: Electronics. https://www.amazon.com/HiLetgoESP-WROOM-32-Development-Microcontroller-Integrated/dp/B0718T232Z. Accessed 3 Mar 2023 15. NCR18650B 3.7v 3400mah batterie Lithium Rechargeable (originale). A2itronic. https:// a2itronic.ma/product/pile-18650-protected-rechargeable-3000mah-37v-2/. Accessed 3 Mar 2023 16. Module de Recharge pour Batterie Externe au Lithium, Power Bank à Écran LED, Double USB de Type C, Carte de Circuit Imprimé avec Protection avec Sorties Micro, Chargeur Mobile de Smartphone, Capacité 5V et 2,4 A, 18650 | AliExpress. https://fr.aliexpress.com/ item/1005004909264413.html. Accessed 3 Mar 2023 17. Module Relais. https://www.robotique.tech/tutoriel/module-relais/. Accessed 19 Mar 2023 18. Rain Sensor Module. https://electropeak.com/rain-sensor-module. Accessed 3 Mar 2023

Smart Models

Advancing Crop Recommendation Systems Through Ensemble Learning Techniques

M’hamed Mancer1(B), Labib Sadek Terrissa1, Soheyb Ayad1, Hamed Laouz1, and Noureddine Zerhouni2

1 LINFI Laboratory, University of Biskra, Biskra, Algeria
[email protected]
2 Graduate School of Mechanics and Micro Technology, Besançon, France

Abstract. In order to assist farmers in selecting the most suitable crops based on environmental characteristics, this article introduces a novel system for crop recommendation that leverages machine learning techniques, specifically ensemble learning with a voting classifier. A comprehensive analysis of prior research in the field of crop recommendation systems reveals the limitations and challenges of previous approaches, particularly their low accuracy. To address these shortcomings, the proposed system incorporates a voting classifier that amalgamates the performance of various machine learning models, while taking into account the perspectives of all participating models. By harnessing the collective intelligence of these models, this approach aims to mitigate the limitations of previous methods and provide more dependable and precise crop recommendations. The results demonstrate the system’s capacity to generate highly accurate recommendations, with the ensemble learning approach achieving an accuracy rate of 99.31%. This empowers farmers to optimize their agricultural practices and maximize crop yields, enabling them to make informed decisions for sustainable and efficient farming.

Keywords: Crop recommendation · Machine learning · Ensemble learning · Voting classifier · Sustainable agriculture

1 Introduction

Sustainable development and food security are unattainable without the growth of the agricultural sector. In recent years, machine learning (ML) algorithms and Blockchain Technology have shown promising results in addressing complex agricultural challenges. These algorithms assist farmers in optimizing agricultural practices, minimizing resource wastage, and maximizing crop productivity [1,2,5].

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024
M. Ben Ahmed et al. (Eds.): SCA 2023, LNNS 938, pp. 45–54, 2024. https://doi.org/10.1007/978-3-031-54376-0_4

There is a growing interest in harnessing ML techniques to develop intelligent crop recommendation systems. This interest arises from the increasing demand


for agricultural products and the need to maximize crop yields by considering various environmental and soil conditions [3]. In light of the limitations of previous crop recommendation systems, our primary objective is to create an advanced model capable of providing precise and effective crop recommendations while overcoming the shortcomings of existing methods. To achieve this, we employ a range of algorithms to predict the most suitable crops based on specific environmental and soil conditions. Our proposed system adopts a systematic approach. It commences with data description, encompassing information on soil, climate, and other relevant factors. Subsequently, the data undergoes preprocessing to ensure the quality and compatibility of input variables for machine learning models. These preprocessing operations include data cleaning, encoding, and normalization. Following this, we employ various ML techniques for model training and testing. We evaluate each model’s performance using metrics such as accuracy, precision, recall, and F1 score to gauge its ability to classify crops accurately based on input data (N, P, K, ph, temperature, humidity, and rainfall). To enhance the accuracy and robustness of our system, we incorporate the Voting Classifier, an ensemble learning technique that aggregates predictions from multiple individual models and selects the class with the majority vote. This paper makes a significant contribution to the field of precision agriculture by introducing a crop recommendation system utilizing ensemble learning techniques. The results underscore how this system has the potential to revolutionize the agricultural sector, empowering farmers to make data-driven decisions that enhance crop yield and reduce losses.

2 Literature Review

This section provides a concise overview of previous research on machine learning-based crop recommendation and prediction systems. The authors of this study [4] developed a crop recommendation system utilizing machine learning algorithms, such as Random Forest, Support Vector Machine, and KNN. The primary aim of the project was to assist farmers in making crop prediction decisions by incorporating soil type, temperature, humidity, and other relevant data. The Random Forest algorithm achieved an impressive accuracy of 99%. In their research [6], the authors collected soil data from diverse locations in India to train their machine-learning models. They evaluated the performance of various algorithms, including KNN, Decision Tree, Random Forest, Naive Bayes, and Gradient Boosting, to determine the most suitable algorithm for crop recommendation. With an impressive accuracy rating of 98.18%, the Gradient Boosting method outperformed other algorithms in terms of precision. Farmers can easily input their soil data into the system and obtain recommendations due to its user-friendly interface and accessibility. Furthermore, a mobile application proposed in a study [7] recommends the most profitable crops and suggests the optimal time for fertilizer use by


employing various machine learning algorithms. The Random Forest algorithm achieved the highest accuracy, reaching a rate of 95%. Another research effort [8] introduces a crop forecasting system using machine learning algorithms. Recommendations are based on factors including temperature, humidity, rainfall, and levels of N, P, K, and pH. According to performance evaluations, the selected classifiers achieved an accuracy rating of 98.2%, with a maximum model construction time of 8.05 s. Finally, in another study [9], an Agricultural Recommendation System (ARS) was developed to predict crop production and recommend the optimal crop for a given agricultural land. The study evaluates four algorithms: Multiple Linear Regression (MLR), Decision Tree Regression (DTR), Random Forest Regression (RFR), and Support Vector Regression (SVR). The RFR approach surpasses other methods in terms of accuracy, with a coefficient of determination (R-squared) of 0.92. R-squared values for the other approaches range from 0.85 to 0.90. The study's findings indicate the ARS's competence in recommending the best crop selection, with an accuracy rate of 94.78%. A comparative analysis of previous works based on performance parameters is presented in Table 1.

Table 1. Comparative Analysis of Previous Works.

Paper | Features selected | Outcome | Models | Accuracy
[4] | Temperature, humidity, soil moisture, rainfall, and pH | Development of a real-time crop recommendation system | RF, SVM, KNN | 99%, 98.6%, 97.8%
[6] | N, P, K, temperature, humidity, and pH | Provide recommendations for the optimum crop based on soil factors | KNN, DT, NB, RF, GB | 97.45%, 96.72%, 97.09%, 98%, 98.18%
[7] | Crop type, year, season, soil type, area, and region | Provide recommendations for maximizing crop profitability | ANN, SVM, MLR, RF, KNN | 86%, 75%, 60%, 95%, 90%
[8] | N, P, K, temperature, humidity, and pH | Use ML algorithms for crop forecasting | MLP, DT, JRip | 98.23%, 88.59%, 96%
[9] | Precipitation, cloud cover, area, vapour pressure, season, yield and production, etc. | Provide recommendations for crop yield optimization | LR, DT, RF, PR, SVM | 89.7%, 84.3%, 86.6%, 87.5%, 83.35%

A comprehensive literature review, as depicted in Table 1, demonstrates that despite the diverse machine learning models used in earlier research, achieving consistently high accuracy remains a challenge. Notably, accuracy has ranged from as low as 60% in some cases to considerably higher rates, such as 99%. The disparity in predicted accuracy among models and research emphasizes the need for a more robust and dependable crop recommendation system. Our study directly addresses this critical challenge by applying ensemble learning approaches with the goal of significantly improving forecast accuracy. Ultimately, we aim to provide farmers with accurate and dependable crop recommendations.

3 Methodology

In this section, we will describe the methodology employed to develop the optimal ML model for crop recommendation.

Fig. 1. System Architecture.

Figure 1 illustrates the system’s architecture, which can be summarized as follows: (i) Data Description and Preprocessing: A comprehensive description of the dataset was provided, and essential preprocessing steps were executed to ensure its suitability and quality. (ii) Data Splitting: The dataset was partitioned into training and testing subsets, enabling a precise evaluation of the machine learning models. (iii) Model Training and Testing: The training data were utilized to train a diverse set of machine learning models, subsequently assessed using appropriate performance metrics. The significance of each classifier’s contribution was taken into account during the results evaluation. (iv) Model Creation: Ensemble learning techniques were implemented, employing the Voting Classifier to construct a hybrid model that amalgamates predictions from various classifiers, thereby enhancing overall performance.


(v) Final Classification Outcomes: The efficacy of the proposed model was ascertained by scrutinizing and interpreting the ultimate classification results.

These steps collectively form the core of our methodology, allowing us to systematically develop and assess an advanced ML model for crop recommendation.

3.1 Dataset Description

We utilized a publicly available dataset [21] that contains valuable information on nitrogen (N), phosphorus (P), and potassium (K) levels, as well as temperature, humidity, pH, and rainfall. Table 2 presents descriptive statistics for each variable.

Table 2. Dataset information.

Stats | N | P | K | Temperature | Humidity | Ph | Rainfall
Count | 2200 | 2200 | 2200 | 2200 | 2200 | 2200 | 2200
Mean | 50.55 | 53.36 | 48.15 | 25.62 | 71.48 | 6.47 | 103.46
Std | 36.92 | 32.99 | 50.65 | 5.06 | 22.26 | 0.77 | 54.96
Min | 0.00 | 5.00 | 5.00 | 8.83 | 14.26 | 3.50 | 20.21
25% | 21.00 | 28.00 | 20.00 | 22.77 | 60.26 | 5.97 | 64.55
50% | 37.00 | 51.00 | 32.00 | 25.60 | 80.47 | 6.43 | 94.87
75% | 84.25 | 68.00 | 49.00 | 28.56 | 89.95 | 6.92 | 124.27
Max | 140.00 | 145.00 | 205.00 | 43.68 | 99.98 | 9.94 | 298.56

The dataset consists of 2200 data points for each of the seven attributes, encompassing essential agricultural parameters. The table provides a comprehensive overview of the dataset's statistical properties, including measures such as mean, standard deviation, minimum, quartiles, and maximum values for each attribute. This information offers valuable insights into the dataset's distribution and variability.

3.2 Data Preprocessing

Preprocessing is a technique employed in data mining to transform raw data into a comprehensible format [10]. The data is comprehensive, containing no duplicates or missing instances, providing a robust foundation for further analysis. Data Cleaning: This step involves identifying and correcting errors, including all types of incorrect, corrupted, improperly formatted, duplicate, or incomplete data [10]. Data Encoding: This process transforms categorical data into a numerical format that can be readily utilized in machine learning models [11].
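The encoding step (and the 0-to-1 scaling this subsection goes on to describe) can be sketched without any particular library. This is a minimal illustration — the paper does not name its tooling, though these match the usual label-encoding and min–max scaling operations (e.g. scikit-learn's `LabelEncoder` and `MinMaxScaler`):

```python
def encode_labels(labels):
    """Map each distinct crop name to an integer code (classes sorted for
    determinism); returns the codes and the class list."""
    classes = sorted(set(labels))
    to_code = {c: i for i, c in enumerate(classes)}
    return [to_code[label] for label in labels], classes

def min_max_normalize(values):
    """Scale one numeric feature into the [0, 1] range."""
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # guard against a constant column
    return [(v - lo) / span for v in values]
```

For instance, `min_max_normalize([20.0, 30.0, 40.0])` yields `[0.0, 0.5, 1.0]`, putting a temperature-like feature on the same footing as N, P, and K.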


Data Normalization: When working with features that exhibit varying scales, normalization is applied to standardize the numerical data, typically within a 0 to 1 range. This procedure is crucial for many classifiers to enhance model performance and training stability [12].

3.3 Classifier Models

After preprocessing the data and dividing it into training and testing sets, the following classifier models were evaluated:

Multilayer Perceptron (MLP): MLP is a feedforward neural network comprising numerous layers of interconnected nodes or neurons. It is primarily used for supervised learning tasks such as classification and regression [18].

K-Nearest Neighbor (KNN): KNN is a supervised machine learning algorithm designed for classification and regression problems. It operates on the principle that data points close to each other in the feature space belong to the same class [13].

Extra Tree (ET): ET is an ensemble learning algorithm based on decision trees. It shares similarities with Random Forest, but there are notable differences in how the trees are constructed [19].

Extreme Gradient Boosting (XGBoost): XGBoost, an optimized version of the gradient boosting algorithm, constructs a robust predictive model by combining multiple weak prediction models. Regular introduction of new models to correct the errors of previous models leads to continuous performance improvement [16].

CatBoost: CatBoost is a high-performance gradient-boosting technique specifically designed for datasets with categorical features. It proves to be effective in machine learning tasks involving diverse and complex datasets [17].

Voting Classifier: The Voting Classifier is an ensemble learning algorithm that aggregates predictions from multiple independent models to produce a final prediction based on the majority's decisions [14,15]. Our proposed method is founded on the principle of weighted voting, where the prediction made by each classifier is assigned a weight reflecting its reliability and effectiveness. This can be expressed mathematically as:

Final Prediction = argmax_i ( Σ (w_i · y_i) )    (1)

In this equation, y_i represents the predicted probability of class i, while w_i signifies the weight attributed to classifier i. Initially, we assign equal weights of 1 to all models before evaluating the results. The validity of the proposed method is assessed using performance metrics, including Accuracy, Precision, Recall, and F1-score.
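The weighted soft-voting rule of Eq. (1) can be sketched directly. Each row of `probs` below stands for one classifier's predicted class-probability vector; the probabilities in the usage example are illustrative, not from the paper:

```python
def weighted_vote(probs, weights=None):
    """Soft-voting rule of Eq. (1): pick the class c that maximizes the
    sum over classifiers i of w_i * probs[i][c]."""
    weights = weights or [1.0] * len(probs)  # the paper starts from equal weights of 1
    n_classes = len(probs[0])
    scores = [sum(w * p[c] for w, p in zip(weights, probs))
              for c in range(n_classes)]
    return max(range(n_classes), key=scores.__getitem__)
```

With equal weights, three classifiers predicting `[0.6, 0.4]`, `[0.2, 0.8]`, and `[0.3, 0.7]` give weighted sums of 1.1 and 1.9, so class 1 wins; raising the first classifier's weight enough flips the decision, which is exactly the lever that per-classifier reliability weights provide.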

4 Results and Discussion

In this section, we present the results obtained and provide an explanation of the evaluation outcomes for the proposed system using various assessment metrics, such as accuracy, precision, recall, and the F1 score. The performance of the models employed in the voting classifier, namely MLP, KNN, ET, XGB, and CB, is depicted in Fig. 2.

Fig. 2. Comparative Analysis of Model Accuracies.

When compared to different estimators, including MLP, KNN, ET, XGB, and CB, which achieved accuracy ratings of 98.86%, 98.40%, 99.31%, and 99.09% respectively, it becomes evident that the proposed system exhibits strong predictive capabilities.

Fig. 3. Comparative Analysis of Model Performances.

We delved deeper into the models’ performance beyond just accuracy, examining factors such as recall, precision, and the F1-score. Figure 3 illustrates the results of this comprehensive comparison.
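Accuracy aside, the precision, recall, and F1-score examined here all derive from per-class confusion counts. A dependency-free sketch of that computation (the counts in the usage example are illustrative, not the paper's results):

```python
def precision_recall_f1(tp, fp, fn):
    """Per-class precision, recall and F1 from true-positive,
    false-positive and false-negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1
```

For example, 90 true positives with 10 false positives and 10 false negatives give precision, recall, and F1 of 0.9 each; macro-averaging such per-class values over the crops yields the figures plotted in Fig. 3.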


M. Mancer et al.

All metrics consistently yielded high values, in the range of 0.98 to 0.99. This indicates that all models provide accurate crop recommendations and classify them reliably.

Fig. 4. Comparative Evaluation of the Voting Classifier Performance.

The performance analysis of the voting classifier presented in Fig. 4, which combines the predictions of the individual models, demonstrates its effectiveness and accuracy in crop recommendation: it achieved a high accuracy of 99.31%, with precision, recall, and F1-score values of 99%, underscoring its exceptional ability to correctly identify crops.

Cross-validation is a fundamental technique for evaluating the performance of a model, providing significant information about its generalization, bias, and variance [20]. To validate the model's consistency and its ability to perform well on unseen data, we employ 10-fold cross-validation. The results, presented in Fig. 5, illustrate the model's capacity for generalization, with scores consistently ranging from 0.994 to 1.0 across folds. This demonstrates the model's ability to maintain its performance and reliability when applied to diverse data and indicates the system's readiness for real-world applications. Furthermore, our findings show minimal bias and variance, suggesting that the model strikes a balance between underfitting and overfitting, ensuring robust and reliable predictions.

Overall, the analysis of the voting classifier results reveals consistently high performance across the models used, indicating that the crop recommendations provided by the voting classifier can be trusted as a basis for planting decisions.


Fig. 5. Model Validation and Generalization: 10-Fold Cross-Validation Results.
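The 10-fold cross-validation step can be sketched with scikit-learn's `cross_val_score`. The Extra Trees model and synthetic data below stand in for the paper's voting classifier and crop dataset; they are assumptions for illustration.

```python
# 10-fold cross-validation: the data is split into ten folds, each fold
# serving once as the held-out test set; mean and std summarize how
# stably the model generalizes across folds.
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=500, n_features=8, n_informative=6,
                           n_classes=4, random_state=1)

scores = cross_val_score(ExtraTreesClassifier(random_state=1), X, y, cv=10)
print(f"mean={scores.mean():.3f}  std={scores.std():.3f}")
```

A small standard deviation across folds is what the text reads as low variance, i.e. evidence against overfitting.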

5 Conclusion

In conclusion, this article has presented a crop recommendation system designed to aid farmers in making well-informed decisions regarding the most suitable crops to cultivate on their land. The results have showcased the system’s high predictive accuracy, consistently surpassing 99% across various machine learning models. Furthermore, the ensemble learning technique has demonstrated its competence in accurately identifying crops, instilling confidence in its recommendations. The cross-validation results have further underscored the model’s adaptability to real-world scenarios. Ongoing efforts are dedicated to expanding its application to diverse agricultural contexts.

References

1. Hosseinzadeh, M., Samadi Foroushani, M., Sadraei, R.: Dynamic performance development of entrepreneurial ecosystem in the agricultural sector. Br. Food J. 124, 2361–2395 (2022)
2. Mancer, M., Terrissa, L., Ayad, S., Laouz, H.: A blockchain-based approach to securing data in smart agriculture. In: 2022 International Symposium on Innovative Informatics of Biskra (ISNIB), pp. 1–5 (2022)
3. Abbasi, R., Martinez, P., Ahmad, R.: The digitization of agricultural industry: a systematic literature review on agriculture 4.0. Smart Agric. Technol. 2, 100042 (2022)
4. Mathew, J., Joy, A., Sasi, D., Jiji, J., John, J.: Crop prediction and plant disease detection using IoT and machine learning. In: 2022 6th International Conference on Trends in Electronics and Informatics (ICOEI), pp. 560–565 (2022)
5. Mancer, M., et al.: Blockchain technology for secure shared medical data. In: 2022 International Arab Conference on Information Technology (ACIT), pp. 1–6 (2022)
6. Shariff, S., Shwetha, R., Ramya, O., Pushpa, H., Pooja, K.: Crop recommendation using machine learning techniques. Int. J. Eng. Res. Technol. (IJERT) (2022)
7. Pande, S., Ramesh, P., Anmol, A., Aishwarya, B., Rohilla, K., Shaurya, K.: Crop recommender system using machine learning approach. In: 2021 5th International Conference on Computing Methodologies and Communication (ICCMC), pp. 1066–1071 (2021)
8. Bakthavatchalam, K., et al.: IoT framework for measurement and precision agriculture: predicting the crop using machine learning algorithms. Technologies 10, 13 (2022)
9. Garanayak, M., Sahu, G., Mohanty, S., Jagadev, A.: Agricultural recommendation system for crops using different machine learning regression methods. Int. J. Agric. Environ. Inf. Syst. (IJAEIS) 12, 1–20 (2021)
10. Alasadi, S., Bhaya, W.: Review of data preprocessing techniques in data mining. J. Eng. Appl. Sci. 12, 4102–4107 (2017)
11. Majumdar, J., Naraseeyappa, S., Ankalaki, S.: Analysis of agriculture data using data mining techniques: application of big data. J. Big Data 4, 20 (2017)
12. Alexandropoulos, S., Kotsiantis, S., Vrahatis, M.: Data preprocessing in predictive data mining. Knowl. Eng. Rev. 34, e1 (2019)
13. Li, Y., Yang, Y., Che, J., Zhang, L.: Predicting the number of nearest neighbor for kNN classifier. IAENG Int. J. Comput. Sci. 46, 662–669 (2019)
14. Bauer, E., Kohavi, R.: An empirical comparison of voting classification algorithms: bagging, boosting, and variants. Mach. Learn. 36, 105–139 (1999)
15. Remadna, I., Terrissa, L., Al Masry, Z., Zerhouni, N.: RUL prediction using a fusion of attention-based convolutional variational autoencoder and ensemble learning classifier. IEEE Trans. Reliab. 72, 106–124 (2023)
16. Chen, T., et al.: XGBoost: extreme gradient boosting. R Package Version 0.4-2 1, 1–4 (2015)
17. Hancock, J., Khoshgoftaar, T.: CatBoost for big data: an interdisciplinary review. J. Big Data 7, 1–45 (2020)
18. Murtagh, F.: Multilayer perceptrons for classification and regression. Neurocomputing 2, 183–197 (1991)
19. Sharaff, A., Gupta, H.: Extra-tree classifier with metaheuristics approach for email classification. In: Advances in Computer Communication and Computational Sciences: Proceedings of IC4S 2018, pp. 189–197 (2019)
20. Schaffer, C.: Selecting a classification method by cross-validation. Mach. Learn. 13, 135–143 (1993)
21. Ingle, A.: Crop recommendation dataset. Kaggle, December 2020. www.kaggle.com/datasets/atharvaingle/crop-recommendation-dataset

Technology to Build Architecture: Application of Adaptive Facade on a New Multifunctional Arena

Alessandra Annibale1,2, Emily Chiesa1,2, Giulia Prelli1,2, Gabriele Masera1, Andrea Kindinis2(B), Arnaud Lapertot2, Davide Allegri1, and Giulio Zani3

1 Dip. ABC Architecture, Built Environment and Construction Engineering, Politecnico di Milano, 20133 Milan, Italy
2 Institut de Recherche en Constructibilité, ESTP Paris, 94230 Cachan, France
3 Department of Civil and Environmental Engineering, Politecnico di Milano, 20133 Milan, Italy
[email protected]

Abstract. Adaptive façades (AFs) can adapt to changing boundary conditions according to short-term weather fluctuations, diurnal cycles, or seasonal patterns. The behaviour of the indoor environment and the global comfort of a building depend strongly on the façade: traditional façades behave statically towards external and internal climate conditions. The objective of this study is to design an adaptive façade system with different layer functions, ensuring the thermal and visual comfort of the various indoor environments and controlling the incident solar radiation. Furthermore, by incorporating second- and third-generation photovoltaic cells into the adaptive envelope, it is possible to produce and store renewable energy, integrating "invisible" photovoltaic technology into the building (BIPV). This façade configuration fulfils the performance requirements of the case study presented in this paper: a new multifunctional arena in Paris. The building hosts a large number of users and different areas of use, so greater flexibility is also required of the envelope. Moreover, the architectural characterisation of the adaptive envelope contributes to establishing the building as a new landmark for the neighbourhood and the city. The paper proposes a methodological process that has led to the technological and architectural definition of the envelope element using parametric modelling. By developing a model in Rhinoceros and Grasshopper, it is possible to control the configuration and mechanism of the façade, depending on incident solar radiation and changes in surface temperature.

Keywords: Adaptive Façade · Smart Architecture · BIPV

1 Introduction

Paris has been chosen to host the Olympic and Paralympic Games from 26th July to 11th August 2024 [1]. This historic event for the French capital is an opportunity to redevelop the northern suburbs of the 18th district, characterised by serious social problems [2],

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024. M. Ben Ahmed et al. (Eds.): SCA 2023, LNNS 938, pp. 55–64, 2024. https://doi.org/10.1007/978-3-031-54376-0_5


A. Annibale et al.

through the construction of a new multi-purpose arena called the Adidas Arena. A key point in the planning of the Olympic Games is the integration of new buildings into the life cycle of the district and the optimisation of existing ones. For this reason, after the event, the Adidas Arena will host shows, concerts, and sports events [1]. This paper presents a methodology for designing and developing a new type of adaptive façade that meets the needs of the different spaces and coexisting functions in the multifunctional arena, as well as the climatic conditions of the area. The façade system is the result of the requirements for the appropriate use of the interior spaces, considering the outdoor climatic conditions. Through the design of a diversified functional layout, together with the study of boundary climatic conditions, the requirements of the various environments within the building, and consequently the properties at the interface, are integrated. In this dynamic process, each façade of the arena must respond to specific conditions. It is therefore necessary to develop an adaptive envelope, using parametric modelling in the software packages Rhinoceros and Grasshopper.

2 Description of the Project

2.1 Case Study: Arena

The concept of the proposed building is a variant of the Adidas Arena, based on the winning design by LeClercq Associés [3], with a focus on the building envelope. The ground plan is in the shape of a rectangular trapezoid, with two underground and three above-ground levels determined by the various functions inside. The basic scheme of the building is shown in Fig. 1: a central core consisting of the sports area, surrounded by three blocks with tertiary functions. This division creates a large pedestrian distribution space surrounding the four differentiated blocks. As for the context surrounding the arena, the Boulevard Périphérique and Boulevard Ney run along the north and west sides [2]; the east front of the building faces a sunken forecourt; to the south lies the large square leading to the entrance hall. The use of the rooms in the different blocks is derived from the Technical Manual on Venues - Design Standards for Competition Venues published by the International Olympic Committee (IOC) in 2005. This document sets out the rules and requirements for the design of a building intended to host Olympic disciplines. The program of the building (Fig. 1) comprises:

– 1630 m2 of sports arena, annexed services and changing rooms (blue) in the central core;
– in the eastern block, 200 m2 of city gym and 1680 m2 of services such as restaurants and merchandising (pink);
– in the north, 800 m2 of directional offices (yellow);
– in the south, a great hall of 400 m2;
– in the western block, 1500 m2 of press activities (orange) and 280 m2 of offices (purple).


Fig. 1. Ground floor of the Arena.

2.2 Climate Analysis

After defining the volumes of the building, climatic analyses were carried out to determine the incident solar radiation on the building and identify the most suitable façade types. The analyses were conducted using the Grasshopper plug-in Ladybug, integrated with weather data from EnergyPlus (Paris Orly 071490, IWEC). Simulations were performed on a monthly basis, considering the period between 9 a.m. and 6 p.m. In addition to the volume of the building, the surroundings were also modelled to account for possible shading. As can be seen in Fig. 2, the north façade, although minimally exposed due to its slope, has low solar radiation values (from 5.20 kWh/m2 in December to 30.38 kWh/m2 in July). For this reason, this façade does not require shading devices and will be mainly opaque. The east façade also receives low solar radiation, since it is exposed to the sun only during morning hours (from 10.39 kWh/m2 in December to 45.58 kWh/m2 in July). During summer there is no noticeable increase in radiation levels, although shading systems are required to modulate solar radiation at dawn, when the sun is at its lowest point. The south façade is not evenly exposed to solar radiation because a building stands a few metres in front of it. For this reason, the east-facing side has the lowest values of incident solar radiation, while the upper part of the façade and the west-facing side have the highest (from 20.78 kWh/m2 in December to 75.96 kWh/m2 in July). Shading systems are highly necessary, but the lower part of the façade should be kept clear in order to guide spectators towards the main entrance of the Arena. Lastly, the west façade displays a changing behaviour throughout the year: incident solar radiation is low in winter (7.79 kWh/m2 in December), while in summer it increases significantly, up to 75.96 kWh/m2 in July. This is because in summer the sun remains high for a longer period and the low rays of the setting sun reach the surface, increasing the total incident solar radiation. Therefore, better control of the shading systems is needed: they must be able to adapt to seasonal changes and filter the low sun rays in summer without compromising indoor visual comfort.


Fig. 2. Incident solar radiation on building’s elevations in December (left) and July (right), from north (up) to west (down).
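The orientation effects described above follow from simple incidence geometry: the direct component on a vertical surface scales with the cosine of the incidence angle. The sketch below is not the Ladybug/EnergyPlus workflow used in the study; the sun positions and direct-normal irradiance value are made-up inputs for illustration.

```python
# Direct irradiance on a vertical façade from sun altitude/azimuth.
import math

def direct_on_vertical(dni, sun_alt_deg, sun_az_deg, facade_az_deg):
    """Direct irradiance (W/m2) on a vertical façade facing facade_az_deg."""
    alt = math.radians(sun_alt_deg)
    daz = math.radians(sun_az_deg - facade_az_deg)
    cos_inc = math.cos(alt) * math.cos(daz)
    return dni * max(0.0, cos_inc)  # surfaces facing away receive no direct sun

# A low sun directly ahead of a west façade (azimuth 270°) loads it more
# than a high sun, matching the text's note about low evening rays.
low_sun = direct_on_vertical(600, 10, 270, 270)
high_sun = direct_on_vertical(600, 60, 270, 270)
print(low_sun, high_sun)
```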

2.3 Façade Requirements

Following the climatic analysis and the functional typology of the interior environments, the following conclusions can be drawn:

– the northern front, with 788 m2 of executive offices, is always in the shade and does not require shielding, but must be clearly recognisable from the Boulevard Périphérique;
– the east side is affected by solar radiation only in the morning and, enclosing mainly catering services and merchandising, is opened towards the lowered square in front of the arena.

The south elevation is characterised by intense radiation during the afternoon and must act as a thermal buffer to stabilise the temperature inside the arena, also because of the large turnout in the entrance hall. The west façade must modulate the solar radiation, which varies significantly between winter and summer (see Fig. 2), and at the same time manage the various needs of the functional mix inside, composed of offices, press activities, and public circulation spaces. Due to the multifunctional nature of the complex, the Adidas Arena can be considered an ideal case study for analysing the benefits of an adaptive façade. An adaptive façade is defined as follows: "A climate adaptive building shell has the ability to repeatedly and reversibly change some of its functions, features or behavior over time in response to changing performance requirements and variable boundary conditions and does this with the aim of improving overall building performance" [4]. The south- and west-facing elevations were chosen for the application of this type of technology because of the large number of requirements they must fulfil compared to the other elevations. To address lighting, energy balance, air quality, and occupant safety, a movable façade was chosen for the south elevation and a switchable façade for the west elevation. A movable façade is designed to adapt quickly to environmental conditions and to track the movement of the sun through movable shading systems capable of storing energy from solar radiation [5]. The south façade is therefore compatible with such a system, providing visual comfort, thermal comfort, and energy storage at the same time. A switchable façade exploits the potential of adaptive materials to regulate the flow of light and energy into the building [5].


3 Multifunctional Façade

3.1 Concept Idea

Just as skin and skeleton must interact, so building and envelope collaborate to create the best visual and thermal environment for occupants. The adaptive façade has two ways of modifying its behaviour: a morphological way, linked to the shape it can assume, and a physiological way, which refers to the adaptation of chemical processes in response to changing boundary conditions [6]. This mechanism can be seen as analogous to the biomimetics of living organisms [6], in particular flowers: corolla, stem, leaves, and roots constitute a system of different layers, each with a specific function in the life cycle of the plant. Likewise, over the life of a building, the multi-skin layers should be responsive to the specific needs and demands of the environment. The outermost part of the flower is in direct contact with atmospheric agents and solar radiation; it represents the moving, kinetic component that allows the selection of incoming energy and the protection of the inner layers. The stem and leaves, in addition to giving stability to the flower, are the emblem of the photosynthesis mechanism and thus the part responsible for absorbing energy. The material, by its properties, is responsible for collecting and converting energy, applicable in a similar way to the adaptive envelope. Finally, the flower's roots are the link with the ground, the deepest layer: they filter the nutrients needed to feed the plant. So, as shown in Fig. 3, the final layer of this adaptive façade is the occupant contact interface, responsible for controlling the mechanism.

Fig. 3. Basic principles for the adaptive façade's layering concept.

Starting from these premises, the concept is to create a multilayer façade system that combines different functions. In particular, the idea is to harness the large amount of energy reaching the various fronts of the building, interlacing it with the different demands to which they must respond, depending on their orientation and the functions inside. The system is conceived as follows:

– a kinetic façade [7] as the first layer towards the external environment; this layer is mobile and regulates the incoming light;


– a Building Integrated Photovoltaic (BIPV) [8] system inward; this is a fixed element enclosing the technology that captures and accumulates radiation and transforms it into energy, through a combination of high-performance photovoltaic systems integrated in a Smart Window (SW).

3.2 Kinetic Element Design

The kinetics of the façade system is developed to modulate absorption and protect the photovoltaic modules, as well as to regulate the filtration of light into the interior spaces, ensuring the comfort of users. Starting from the shape of the flower, the elements are dimensioned on a human scale, in proportion to the size of the office modules of about 3.5-4 m each: in this way the façade element, hexagonal in shape with a diameter of 1.5 m, is less intrusive to the view from the inside. In addition, the size of the triangular modules, equal to 0.75 m, ensures easier maintenance and freedom of control: consider the Al Bahar Towers in Dubai [9], which have a similar kinetic system but with a module size that covers an entire floor of the building. The size of the elements matters because each "umbrella" per floor strongly influences the amount of light entering the rooms: in this project, smaller modules were chosen also to better manage the activation and deactivation of n modules on each floor of the building, ensuring the versatility of the system. In order to meet the requirements of structural and handling adaptability, the geometry must be created using a method that allows the modules to be transported in their original condition and then deformed or expanded into the desired shape once assembled on the façade: the most suitable structural solution is rigid origami [10]. Figure 4 shows the process of preliminary conception and realisation of a small origami; its movement is based on the translation of the triangular module along an axis perpendicular to the plane.

Fig. 4. Handling of the façade module by origami: a) Module closed; b) Module open.

3.3 Parametric Modelling

In order to simulate and visualise the kinematics of the façade, a parametric model was developed using Grasshopper, as shown in Fig. 5. Starting from a triangular grid on the XY plane, whose extension in x and y can be varied to display several modules, a single equilateral triangular cell of unit side was considered in order to simplify the modelling. The segments of the cell and the vertices (Ai, A′, A″) were then distinguished, and the centre of the cell (O) was identified.


The segment A′-A″ was then selected and divided in half, defining the point B and further discretising the cell. From the centre of the cell, the direction Z was identified as the reference direction for movement. Then, dividing the length of Ai-A′ by a value of 30 (a chosen dimension that can be modified according to subsequent evaluations) and using the obtained result as the intensity of a translation vector in the Z direction applied to B, the point B at the initial position (Bi) was identified. Similarly, by dividing Ai-A′ by a value of 15 (identified as described above) and applying the translation vector to O, the point C at the initial position (Ci) was determined. To determine the position of C during movement (Cm), the distance from the centre of cell O to an external attractor point P of variable position, determined by simple coordinate input, was first calculated. At the same time, the side of the cell was divided by a value of 5 (identified as described above), and the obtained result was defined as the maximum permissible value of movement, giving a permissible movement target from 0 to 0.2. By combining the value of the distance between P and O, the point Ci, and the selected motion target, the intensity of the translation vector in the Z direction to be applied to C was determined, thus obtaining the point in the motion position (Cm). Once all these points were obtained, various trigonometric operations were performed to establish the planes and radii of motion of points A and B. By finding the plane Pt through the points Cm, O, and A at the initial position (Ai), and rotating it 60° around O, the plane Pt2 was derived. The line joining Ai and O was also determined; its length was divided by the cosine of the angle alpha, measured between the vectors Ai-Ci and Ai-O, to obtain the value r1. The distance between Ci and Bi, called r3, was multiplied by the cosine of the angle beta, measured between the vectors Ai-Ci and Ci-Bi, obtaining the value d; subsequently, this distance was multiplied by the sine of the beta angle to obtain r2. The circle of radius r1 was then drawn in the Pt plane and intersected with the line A-O to determine the point A in motion (Am). A circle of radius d was then determined on the plane Pt and intersected with the line Am-Cm. From the point of intersection, a plane in the Z direction was created from the vector Am-Cm and a circle of radius r2 was constructed. The circle of radius r3 was then drawn on the plane Pt2; intersecting it with the circle of radius r2, the moving point B (Bm) was determined. By joining the points Am-Bm and Bm-Cm, the surface of the analysed part of the cell in the movement position was obtained. Mirroring it with respect to the Pt plane, the first tip of the module was obtained, which was then replicated by means of a polar array around O for all the vertices of the cell. At this point, by increasing the extent of the grid and obtaining a delimited surface, it is possible to tilt the modules vertically by orienting them in the YZ plane, and then scale them to the desired size. Although such modelling is complex and requires numerous steps, it allows more control over the parameterisation and subsequently permits the attractor point to be driven by inputs from climate simulations, so that the movement of the modules can be linked to environmental variations.


Fig. 5. Modelling of the triangular cell in Grasshopper.
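The radii r1, d, and r2 described above can be checked numerically. This is a sketch assuming NumPy, with arbitrary stand-in coordinates for the Grasshopper cell points Ai, O, Ci, and Bi; it is not the Grasshopper definition itself.

```python
# Numerical sketch of the radii computed in the parametric model:
# r1 = |Ai-O| / cos(alpha), d = r3*cos(beta), r2 = r3*sin(beta).
import numpy as np

def angle_between(u, v):
    """Angle (rad) between two vectors."""
    return np.arccos(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

Ai = np.array([0.0, 0.0, 0.0])    # cell vertex
O  = np.array([0.5, 0.29, 0.0])   # cell centre
Ci = np.array([0.5, 0.29, 0.07])  # centre lifted along Z (initial C)
Bi = np.array([0.75, 0.43, 0.03]) # edge midpoint lifted along Z (initial B)

alpha = angle_between(Ci - Ai, O - Ai)       # between Ai-Ci and Ai-O
r1 = np.linalg.norm(O - Ai) / np.cos(alpha)  # |Ai-O| / cos(alpha)

r3 = np.linalg.norm(Bi - Ci)                 # |Ci-Bi|
beta = angle_between(Ci - Ai, Bi - Ci)       # between Ai-Ci and Ci-Bi
d = r3 * np.cos(beta)
r2 = r3 * np.sin(beta)
print(r1, d, r2)
```

By construction, d and r2 are the two legs of a right triangle with hypotenuse r3, which is what makes the two circle intersections in the text well defined.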

3.4 System Components

The façade system consists of a primary steel structure supporting the transparent surface, which combines second- and third-generation high-performance photovoltaic cells: CIGS in the frame and LSC in the transparent part, respectively. These two types of cells combine the low-cost production of thin films with higher energy efficiency than traditional systems [11]. LSCs make it possible to exploit solar radiation using large transparent devices, in this case triangular modules of 4 m side, while requiring a minimal amount of photovoltaic material, since the CIGS cells occupy only the edges of the façade [12]; they also work equally well in direct and diffuse sunlight. The mechanism works as follows (Fig. 6): the constituent particle, called a luminophore (or fluorophore), absorbs and re-emits the incident solar radiation, while the LSC waveguide (transparent matrix) concentrates it towards the edges of the plate; there, small photovoltaic solar cells (CIGS) convert the solar radiation into electricity [12].

Fig. 6. Energy mechanisms in an LSC device [12].
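The LSC-to-edge-cell chain in Fig. 6 amounts to multiplying incident power by an optical (waveguide) efficiency and the edge cells' conversion efficiency. A back-of-envelope sketch follows; all numbers are assumed, illustrative values, not measured data from the cited works.

```python
# Electrical output of an LSC plate: incident power times the fraction
# guided to the edges times the edge cells' PV efficiency.
def lsc_electrical_output(irradiance, area, eta_optical, eta_pv):
    """Electrical power (W) delivered by the edge cells of an LSC plate."""
    return irradiance * area * eta_optical * eta_pv

# e.g. 500 W/m2 on a 1 m2 plate, 30% of the light reaching the edges,
# 15% cell conversion efficiency (hypothetical figures).
p = lsc_electrical_output(500, 1.0, 0.30, 0.15)
print(p)  # → 22.5
```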


Moving on to the secondary steel structure, the triangular modules are made of PTFE fibreglass, a material chosen for its lightness, semi-transparency, and resistance to atmospheric agents. These are then assembled onto aluminium profiles. Behind them run the electrical conduits that connect the motor/power supply of the mechanical sails to the storage plate for the energy produced by the photovoltaic cells, which are divided per floor (Fig. 7).

Fig. 7. Structure of the multifunctional façade.

4 Conclusions and Perspectives

The new adaptive multifunctional façade system is the result of the requirements for the correct use of the interior spaces, according to the external climatic conditions; it allows solar control thanks to the integration of technology and movement. This study conceived the structure of the various façade components and created and replicated a parametric model of the kinetic element. In future research, DC motors could be driven by an actuator-probe system [11] with two temperature detectors, one external and one internal, which measure the temperature difference (ΔT) between the exterior façade and the interior environment. Together with the monthly average incident solar radiation measured on the building, they can govern the movement of the external sails, exploiting the energy produced by the CIGS-LSC smart window. During implementation, attention should be paid to the limits of these nanomaterials: in CIGS the biggest problem is due to electronic losses and the toxicity of the cadmium sulfide (CdS) present in the substrate [13]. In the case of LSCs, the limits are attributable to reabsorption losses related to the colour rendering index (CRI) of certain colours such as red and yellow, and thus linked to the long-term stability of the material [13]. These aspects will be the subject of future research on the system's durability.

References

1. Paris2024. https://www.paris2024.org/fr. Accessed 06 July 2023
2. Mairie de Paris: Etude d'impact de la ZAC Gare des Mines Fillettes (2018)
3. LeclercqAssocies. https://www.leclercqassocies.fr/fr/projets-d-architecture/gare-des-mines. Accessed 06 July 2023
4. Loonen, R.C.G.M., Trčka, M., Cóstola, D., Hensen, J.L.M.: Climate adaptive building shells: state-of-the-art and future challenges. Renew. Sustain. Energy Rev. 25, 483–493 (2013). https://doi.org/10.1016/j.rser.2013.04.016
5. Tabadkani, A., Roetzel, A., Xian Li, H., Tsangrassoulis, A.: Design approaches and typologies of adaptive façades: a review. Automat. Construct. 121, 103450 (2021). https://doi.org/10.1016/j.autcon.2020.103450
6. López, M., Rubio, R., Martín, S., Croxford, B.: How plants inspire façades. From plants to architecture: biomimetic principles for the development of adaptive architectural envelopes. Renew. Sustain. Energy Rev. 67, 692–703 (2017). https://doi.org/10.1016/j.rser.2016.09.018
7. Bui, D.K., Nguyen, T.N., Ghazlan, A., Ngo, N.T., Ngo, T.D.: Enhancing building energy efficiency by adaptive façade: a computational optimization approach. Appl. Energy 265, 114797 (2020). https://doi.org/10.1016/j.apenergy.2020.114797
8. Zhang, T., Wang, M., Yang, H.: A review of the energy performance and life-cycle assessment of building-integrated photovoltaic (BIPV) systems. Energies 11(11), 3157 (2018). https://doi.org/10.3390/en11113157
9. Karanouh, A., Kerber, E.: Innovations in dynamic architecture. The Al-Bahr Towers: design and delivery of complex façades. J. Façade Design Eng. 3, 185–221 (2015). https://doi.org/10.3233/fde-150040
10. Del Grosso, A.E., Basso, P.: Adaptive building skin structures. Smart Mater. Struct. 19(12) (2010). https://doi.org/10.1088/0964-1726/19/12/124011
11. Taşer, A., Koyunbaba, B.K., Kazanasmaz, T.: Thermal, daylight, and energy potential of building-integrated photovoltaic (BIPV) systems: a comprehensive review of effects and developments. Sol. Energy 251, 171–196 (2023). https://doi.org/10.1016/j.solener.2022.12.039
12. Hernández-Rodríguez, M.A., Correia, S.F.H., Ferreira, R.A.S., Carlos, L.D.: A perspective on sustainable luminescent solar concentrators. J. Appl. Phys. 131(14) (2022). https://doi.org/10.1063/5.0084182
13. Meinardi, F., Bruni, F., Brovelli, S.: Luminescent solar concentrators for building-integrated photovoltaics. Nat. Rev. Mater. 2 (2017). https://doi.org/10.1038/natrevmats.2017.72

Effectiveness of Different Machine Learning Algorithms in Road Extraction from UAV-Based Point Cloud

Serkan Biçici(B)

Department of Engineering, Geomatics Engineering, Artvin Coruh University, 08100 Artvin, Turkey
[email protected]

Abstract. This study evaluates seven different machine learning (ML) models for classifying the road surface from a point cloud. The study begins by converting two-dimensional images collected from unmanned aerial vehicle (UAV) flights into a three-dimensional (3D) point cloud. Seven ML models, namely Generalized Linear Model, Linear Discriminant Analysis, Robust Linear Discriminant Analysis, Random Forest, Support Vector Machine with Linear Kernel, Linear eXtreme Gradient Boosting, and eXtreme Gradient Boosting, were developed under different training samples. Finally, the road surface was classified from the 3D point cloud using the developed ML models. To assess their performance, manually extracted road surfaces were compared with those obtained from the ML models. The Generalized Linear Model produces the most accurate classification results in a shorter processing time. On the other hand, the Linear eXtreme Gradient Boosting and eXtreme Gradient Boosting models produce less accurate road classification in a longer processing time. The classification accuracies of the other ML models fall between these.

Keywords: UAV · 3D · point-cloud · ML-model · road

1 Introduction

Roads are one of the main topographical objects in both urban and rural areas and are used to transport people and goods from one place to another safely, comfortably, and quickly [1,2]. This supports economic development [3]. Therefore, road network information should be investigated and updated periodically and regularly [1,2]. Up-to-date information also allows necessary maintenance and rehabilitation actions to be carried out on time, reducing repair costs [1,4]. There are several methods and technologies for collecting road network information. The traditional method is to examine the road on site; however, this is time-consuming, expensive, and subjective [5,6]. Satellite images and aerial photos are also used to collect road information [7,8]. These images can provide two-dimensional (2D), pixel-based road network information over extensive areas. However, detailed information about the road surface

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024. M. Ben Ahmed et al. (Eds.): SCA 2023, LNNS 938, pp. 65–74, 2024. https://doi.org/10.1007/978-3-031-54376-0_6


may still need to be collected. A three-dimensional (3D) model can be developed to obtain such road information. LiDAR is a commonly used technology for developing 3D road models [9,10]. A vehicle drives along the road, and a 3D model of the road is built using the sensors of the LiDAR unit attached to the vehicle. However, LiDAR technology is costly. Another technology for developing a 3D model is the unmanned aerial vehicle (UAV). Recently, UAVs have been used in many fields and studies [1,2,11]. Quick, secure, and accurate information can be collected with UAV technology. Therefore, the 3D model developed from UAV images was used in this study. Several methods have been introduced to classify different types of objects from 3D models. Machine learning (ML) models are among the most widely used since they are easy to use, faster than many alternatives, and provide accurate results [12,13]. This study investigated the accuracy of several ML models for classifying the road surface from 3D UAV-based models. This manuscript is organized as follows. In the next section, the material and methods are summarized; specifically, the development of 3D models from UAV images and the ML models are discussed. Then, the experimental study is presented in Sect. 3. The results are summarized in Sect. 4. Finally, the conclusion is presented in the last section.

2 Material and Methods

UAV data were collected in the city of Artvin, a small city in northeastern Turkey. Figure 1(a) presents the location of Artvin within the borders of Turkey. Since Artvin lies in a mountainous region, almost all of its roads are inclined [14]. The road used in this study is a two-lane local asphalt road on the Seyitler campus of Artvin Coruh University (see Fig. 1(a)). There are a few buildings and green areas with trees around the road. Since essential locations such as the university campus, a large housing area, the principal police station, and one of the most prominent high schools in Artvin surround it, this road is among the most frequently used roads in Artvin. In addition, several previous studies have used this road as a study area [1,2]. The study consists of the following steps. First, two-dimensional (2D) images were collected from UAV flights. Then, a 3D point cloud was developed from the 2D images using Pix4D Mapper, a structure from motion (SfM)-based software package that several previous studies have used for this purpose [1,6,14]. Finally, machine learning (ML) models were applied to classify the road surface from the UAV-based point cloud. The Connected Component (CC) algorithm was used to remove noisy points from the ML results. The manually obtained road surface was compared with the extracted road surface to assess the performance of the ML models.


Fig. 1. The location of the study area used in this study.

2.1 UAV-Based Point Cloud

The UAV platform used in the study is a DJI Phantom 4 RTK (P4RTK). Aerial images over the study area were collected with a DJI FC6310R camera on the UAV platform. In addition, there are two essential sensors on the UAV platform, namely a high-sensitivity global navigation satellite system (GNSS) receiver and an inertial measurement unit (IMU). GNSS determines a user's geographic location on Earth in real time and with high precision using satellite signals. The IMU records the coordinates of the UAV platform to which the camera is connected, supporting the coordination of the objects in the images obtained from the camera [2,14]. Thus, the camera position is obtained with centimeter-level error using these sensitive sensors. Flight parameters and camera parameters are significant elements in collecting accurate data, and several studies have investigated their effect on accuracy [2,5]. Detailed information regarding the flight and camera parameters used in this study is summarized in previous studies [2,14]. A total of 165 aerial images, both oblique and nadir, were collected from 50 m above the study area. A 3D point cloud of the study area was developed from the 2D images using Pix4D Mapper, an SfM-based commercial software package. The primary purpose of the SfM technique is the production of 3D models. The SfM technique produces models by using the common points of the object to be reconstructed, using multiple 2D images taken from different angles and


intersecting with each other. This method provides the opportunity to work with high-resolution, large data sets at low cost. More detailed information about the SfM technique can be found in many studies [15,16].

2.2 Machine Learning Models

Several ML models were investigated to classify the road surface from the 3D model of the study area. There are two preliminary requirements before developing an ML model. First, several point features were calculated for each point in the point cloud to characterize it. For example, point features on flat terrain differ from those on rough terrain, and grassy land differs again from both. In this study, geometric features including Curvature, Omnivariance, Planarity, Linearity, Surface variance, and Anisotropy were used. A detailed explanation of the geometric feature calculation is given in a previous study [2]. In addition, the R (red), G (green), and B (blue) values of each point were recorded from the SfM-based software. Of course, other features might be obtained with different types of sensors or cameras; however, this study used the features listed above in the ML models. The second preliminary requirement is a training sample, a subset of the dataset used to train the ML model. Several studies have investigated the effect of training sample selection on ML model results [2,17,18]. The ML models were developed using both the point features and the training samples. However, different approaches can be applied to develop ML models, and this study investigated seven of them for classifying the road surface from the 3D point cloud. The models were developed in R software using the caret package [19], which offers more than 200 models. Table 1 lists the ML models investigated in this study, along with their abbreviations in the caret package and several previous studies on each model. Please note that default parameters were used in these ML models. The ML models classify some points around the road surface as road; since they were misclassified, these points are called noisy points.
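The geometric features listed above are commonly derived from the eigenvalues of a local neighborhood's 3D covariance matrix. The sketch below is an illustrative NumPy reconstruction using the standard eigenvalue formulations, not the exact formulas of this study (those are given in its reference [2]); the neighborhood definition and the toy patch are assumptions.

```python
import numpy as np

def geometric_features(neighborhood):
    """Eigenvalue-based features for one point's local neighborhood.

    `neighborhood` is an (n, 3) array of XYZ coordinates. The formulas
    are the standard eigenvalue formulations with l1 >= l2 >= l3.
    """
    cov = np.cov(neighborhood.T)                  # 3x3 covariance of XYZ
    lam = np.sort(np.linalg.eigvalsh(cov))[::-1]  # descending eigenvalues
    l1, l2, l3 = np.maximum(lam, 1e-12)           # guard against zeros
    return {
        "linearity":    (l1 - l2) / l1,
        "planarity":    (l2 - l3) / l1,
        "anisotropy":   (l1 - l3) / l1,
        "omnivariance": (l1 * l2 * l3) ** (1.0 / 3.0),
        "curvature":    l3 / (l1 + l2 + l3),      # a.k.a. surface variation
    }

# A nearly planar patch (e.g. a road surface) scores high on planarity
# and low on curvature.
rng = np.random.default_rng(0)
flat = np.column_stack([rng.uniform(0, 1, 200),
                        rng.uniform(0, 1, 200),
                        rng.normal(0, 1e-4, 200)])
f = geometric_features(flat)
```

Each point's feature vector (plus its RGB values) would then form one row of the training matrix passed to the classifier.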
This study used a basic classification algorithm to remove these noisy points from the ML model results. Among the many such algorithms, the Connected Component (CC) algorithm was applied here [30]. The CC algorithm requires two inputs: the octree level and the minimum number of points per cluster. Different input values produce different results. This study set the octree level to 12 and the minimum number of points per cluster to 1,500. Please note that these input values were found by trial and error.
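The CC step described above can be approximated with a simple occupancy grid. The sketch below is a grid-based stand-in, assuming an explicit voxel size in place of the octree level; it is illustrative, not the authors' implementation, although the 1,500-point minimum cluster size mirrors the paper's choice.

```python
import numpy as np
from scipy import ndimage

def cc_filter(points, voxel=0.5, min_pts=1500):
    """Keep only points belonging to large connected clusters.

    `voxel` is an assumed grid resolution standing in for the paper's
    octree level of 12; `min_pts` mirrors its 1,500-point minimum
    cluster size. Returns a boolean mask over `points`.
    """
    idx = np.floor((points - points.min(axis=0)) / voxel).astype(int)
    grid = np.zeros(idx.max(axis=0) + 1, dtype=bool)
    grid[tuple(idx.T)] = True
    labels, _ = ndimage.label(grid)        # 6-connected components in 3D
    point_label = labels[tuple(idx.T)]     # cluster id of each point
    counts = np.bincount(point_label)      # points (not voxels) per cluster
    return counts[point_label] >= min_pts

# Toy check: a dense lattice survives, a small far-away clump is removed.
big = np.stack(np.meshgrid(np.arange(20), np.arange(20), np.arange(5),
                           indexing="ij"), axis=-1).reshape(-1, 3) * 0.4
small = big[:50] + 100.0
mask = cc_filter(np.vstack([big, small]))
```

In practice the mask would be applied to the road-labelled points only, so that isolated misclassified patches away from the main road body are discarded.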

3 Experimental Study

This study investigates different ML models using training samples created in different ways. Given the limitations encountered during the ML model development, only 1% of points in the point cloud (around thirty-five thousand points)


Table 1. The list of ML models investigated in this study

ML models                                   | Abbreviation | Previous Studies
Generalized Linear Model                    | glm          | Knoblauch and Maloney [20]; Miller and Franklin [21]
Linear Discriminant Analysis                | lda          | Feldesman [22]; Croux et al. [23]
Robust Linear Discriminant Analysis         | Linda        | Todorov and Pires [24]
Random Forest                               | rf           | Biçici and Zeybek [2]; Speiser et al. [25]
Support Vector Machines with Linear Kernel  | svmLinear    | Cervantes et al. [26]; Gunn [27]
Linear eXtreme Gradient Boosting            | xgbLinear    | Bansal and Kaur [28]; Georganos et al. [29]
eXtreme Gradient Boosting                   | xgbTree      | Bansal and Kaur [28]; Georganos et al. [29]

was used as a training sample. Half of the thirty-five thousand points were selected from points representing the road surface, while the remaining seventeen thousand five hundred points were chosen from points representing the non-road surface. However, these points were selected from the point cloud in different ways: four different training sample selection processes were used. Figure 2 presents the four training sample examples used in this study. In these figures, the road surface is indicated by cyan points and the non-road surface by black points; red points are those selected as training samples. For example, in Fig. 2(a), the seventeen thousand five hundred points were selected as a single large group from each of the road and non-road surfaces. Similarly, Fig. 2(b) shows the seventeen thousand five hundred points selected in five groups for the road and non-road surfaces. Finally, the points were selected in fifty and one hundred groups for both surfaces, as seen in Fig. 2(c) and (d), respectively. These training samples were selected randomly, so there is randomness in the selection. Each of the four selection processes was repeated 25 times to account for this randomness. Figure 3 shows three examples of the 25 training samples for the seventeen thousand five hundred points selected in five groups. The same training sample size was used in these three examples, but the samples were drawn from different locations in the point cloud.


There are four different training sample selection processes, and each was repeated 25 times. This yields 100 training samples. These 100 training samples were used with each ML model to classify the road surface from the 3D model. Therefore, to eliminate the randomness of training data selection, each ML model performed road classification 100 times.
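The grouped selection scheme above can be sketched as follows. This is an illustrative NumPy reconstruction, assuming each group is a random seed point plus its nearest same-class neighbors; the authors' actual procedure may differ, and overlapping groups can make the final sample slightly smaller than the target size.

```python
import numpy as np

def grouped_sample(xyz, is_road, n_per_class, n_groups, rng):
    """Select n_per_class points per class as n_groups spatial clumps,
    each clump being a random seed plus its nearest same-class neighbors.
    Illustrative sketch, not the authors' exact procedure."""
    chosen = []
    for cls in (True, False):                        # road, then non-road
        pool = np.flatnonzero(is_road == cls)
        per_group = n_per_class // n_groups
        for seed in rng.choice(pool, n_groups, replace=False):
            d = np.linalg.norm(xyz[pool] - xyz[seed], axis=1)
            chosen.extend(pool[np.argsort(d)[:per_group]])
    return np.unique(np.array(chosen))               # overlaps collapse

# Toy point cloud: 1,000 points, first half labelled road.
rng = np.random.default_rng(1)
xyz = rng.uniform(0, 10, (1000, 3))
is_road = np.arange(1000) < 500
sample = grouped_sample(xyz, is_road, n_per_class=100, n_groups=5, rng=rng)
```

Repeating the call 25 times with fresh seeds per configuration (1, 5, 50, and 100 groups) would reproduce the 100-sample design of the experiment.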

Fig. 2. Four different training sample examples used in this study.

Fig. 3. Three different trials for the same training sample selection process.

4 Results and Discussion

To evaluate the ML model classification accuracy, the road surfaces obtained from the ML models were compared with the actual road surface, which was obtained manually. A confusion matrix with four elements is produced to compare these two road surfaces. The four elements of the confusion matrix are true positive (TP), false positive (FP), false negative (FN), and true negative (TN). Then, five commonly used measures, recall, precision, quality, accuracy, and F1-score, are calculated from the four elements of the confusion matrix. The detailed


definitions of the four elements of the confusion matrix and these five measures are summarized in several studies [2,31]. These five values lie between zero and one, and a larger value indicates a better classification result. The 100 training samples lead to 100 road surface extractions for each ML model. Box plots for each ML model are shown in Fig. 4 to summarize these five measures. The top three classification results were obtained by the glm, rf, and svmLinear models for the precision, quality, accuracy, and F1-score measures. However, these three models did not perform as well as the other ML models on the recall (completeness) measure. A paired t-test was also applied to statistically check whether two samples (e.g., rf accuracy results versus glm accuracy results) come from the same distribution [32]. The null hypothesis was not rejected among the top three ML model results for any of the five measures, whereas it was rejected between the top three ML model results and the other ML model results. These results indicate that the top three ML models, glm, rf, and svmLinear, are distinct from and better than the other ML models.
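The five measures and the paired t-test can be computed directly from the two boolean road masks. A minimal sketch, with toy accuracy vectors standing in for the 100 real runs:

```python
import numpy as np
from scipy import stats

def road_measures(pred, truth):
    """Recall, precision, quality (IoU), accuracy, and F1-score from two
    boolean masks over the same point cloud (True = road)."""
    tp = np.sum(pred & truth)
    fp = np.sum(pred & ~truth)
    fn = np.sum(~pred & truth)
    tn = np.sum(~pred & ~truth)
    recall = tp / (tp + fn)
    precision = tp / (tp + fp)
    quality = tp / (tp + fp + fn)          # intersection over union
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    f1 = 2 * precision * recall / (precision + recall)
    return recall, precision, quality, accuracy, f1

rec, prec, qual, acc, f1 = road_measures(
    np.array([True, True, False, False]),   # predicted road mask
    np.array([True, False, True, False]))   # manual (actual) road mask

# Paired t-test across repeated runs (toy stand-ins for the 100 trials):
rf_acc  = np.array([0.91, 0.92, 0.90, 0.93, 0.91])
glm_acc = np.array([0.92, 0.93, 0.91, 0.93, 0.92])
t_stat, p_value = stats.ttest_rel(rf_acc, glm_acc)
```

The pairing is essential here: both models are evaluated on the same 100 training samples, so differences per trial, not pooled variances, drive the test.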

Fig. 4. The box plots of five commonly used measures for each ML model.

Average processing times in seconds for developing each ML model and classifying the road surface in R software were also recorded and are presented in Table 2. The shortest processing times were obtained with the lda and Linda models. On the other hand, the xgbLinear model had the longest average processing time, along with rf and xgbTree. The glm model has the shortest average processing time among the top three ML models, while the rf model has the longest.


Table 2. Average processing times (in seconds) for each step of seven ML models

ML models  | Developing ML model | Classifying road surface | Total
glm*       | 40.8452             | 0.2159                   | 41.0611
lda        | 28.2866             | 0.2934                   | 28.5800
Linda      | 29.6039             | 0.2853                   | 29.8889
rf*        | 265.1866            | 0.1716                   | 265.3582
svmLinear* | 73.8678             | 0.3139                   | 74.1817
xgbLinear  | 451.2783            | 0.1746                   | 451.4529
xgbTree    | 283.0212            | 0.1743                   | 283.1955
* Top three ML models

Given all the above results, the Generalized Linear Model (glm) produces accurate road classification from the 3D UAV-based point cloud. On the other hand, the Linear eXtreme Gradient Boosting (xgbLinear) and eXtreme Gradient Boosting (xgbTree) results are not as good as those of glm, and these two models take considerably longer to develop.

5 Conclusion

This study evaluates seven different ML models for classifying the road surface from a 3D UAV-based point cloud. The study consists of the following steps. First, 2D images collected from UAV flights were converted into a 3D point cloud using Pix4D Mapper software. Seven different ML models (Generalized Linear Model, Linear Discriminant Analysis, Robust Linear Discriminant Analysis, Random Forest, Support Vector Machine with Linear Kernel, Linear eXtreme Gradient Boosting, and eXtreme Gradient Boosting) were developed under different training samples. These steps were conducted using the caret package in R software. Finally, road surfaces were classified using the developed ML models, and the CC algorithm was used to remove noisy points. To evaluate the classification results, manually extracted road surfaces were compared with the ones obtained from the ML models. The Generalized Linear Model gives the most accurate road classification from a 3D UAV-based point cloud. On the other hand, the Linear eXtreme Gradient Boosting and eXtreme Gradient Boosting models produce less accurate road classification. The classification accuracies of the other ML models lie between these performances.

Acknowledgments. The author would like to thank Dr. Mustafa Zeybek for his contribution to 3D data production.


References

1. Biçici, S., Zeybek, M.: An approach for the automated extraction of road surface distress from a UAV-derived point cloud. Autom. Constr. 122, 103475 (2021)
2. Biçici, S., Zeybek, M.: Effectiveness of training sample and features for random forest on road extraction from unmanned aerial vehicle-based point cloud. Transp. Res. Rec. 2675(12), 401–418 (2021)
3. Zeybek, M., Biçici, S.: Road surface and inventory extraction from mobile LiDAR point cloud using iterative piecewise linear model. Meas. Sci. Technol. 34(5), 055204 (2023)
4. Kavzoglu, T., Sen, Y.E., Cetin, M.: Mapping urban road infrastructure using remotely sensed images. Int. J. Remote Sens. 30(7), 1759–1769 (2009)
5. Saad, A.M., Tahar, K.N.: Identification of rut and pothole by using multirotor unmanned aerial vehicle (UAV). Measurement 137, 647–654 (2019)
6. Tan, Y., Li, Y.: UAV photogrammetry-based 3D road distress detection. ISPRS Int. J. Geo Inf. 8(9), 409 (2019)
7. Abburu, S., Golla, S.B.: Satellite image classification methods and techniques: a review. Int. J. Comput. Appl. 119(8), 20–25 (2015)
8. Lin, Y., Saripalli, S.: Road detection from aerial imagery. In: 2012 IEEE International Conference on Robotics and Automation, pp. 3588–3593 (2012)
9. Yadav, M., Lohani, B., Singh, A.: Road surface detection from mobile LiDAR data. ISPRS Ann. Photogram. Remote Sens. Spat. Inf. Sci. 4, 95–101 (2018)
10. Yadav, M., Singh, A.K.: Rural road surface extraction using mobile LiDAR point cloud data. J. Indian Soc. Remote Sens. 46(4), 531–538 (2018)
11. Akturk, E., Altunel, A.O.: Accuracy assessment of a low-cost UAV derived digital elevation model (DEM) in a highly broken and vegetated terrain. Measurement 136, 382–386 (2019)
12. Kotsiantis, S.B., Zaharakis, I.D., Pintelas, P.E.: Machine learning: a review of classification and combining techniques. Artif. Intell. Rev. 26(3), 159–190 (2006)
13. Novaković, J.D., Veljović, A., Ilić, S.S., Papić, Ž., Tomović, M.: Evaluation of classification models in machine learning. Theory Appl. Math. Comput. Sci. 7(1), 39 (2017)
14. Zeybek, M., Biçici, S.: Investigation of landslide-based road surface deformation in mountainous areas with single period UAV data. Geocarto Int. 37, 1–27 (2022)
15. Carrivick, J.L., Smith, M.W., Quincey, D.J.: Structure from Motion in the Geosciences. John Wiley & Sons, Hoboken (2016)
16. Wang, J.-A., Ma, H.-T., Wang, C.-M., He, Y.-J.: Fast 3D reconstruction method based on UAV photography. ETRI J. 40(6), 788–793 (2018)
17. Colditz, R.R.: An evaluation of different training sample allocation schemes for discrete and continuous land cover classification using decision tree-based algorithms. Remote Sens. 7(8), 9655–9681 (2015)
18. Millard, K., Richardson, M.: On the importance of training data sample selection in random forest image classification: a case study in peatland ecosystem mapping. Remote Sens. 7(7), 8489–8515 (2015)
19. Kuhn, M.: Caret: classification and regression training. Astrophysics Source Code Library, 1505 (2015)
20. Knoblauch, K., Maloney, L.T.: Estimating classification images with generalized linear and additive models. J. Vis. 8(16), 10 (2008)
21. Miller, J., Franklin, J.: Modeling the distribution of four vegetation alliances using generalized linear models and classification trees with spatial dependence. Ecol. Model. 157(2–3), 227–247 (2002)


22. Feldesman, M.R.: Classification trees as an alternative to linear discriminant analysis. Am. J. Phys. Anthropol. 119(3), 257–275 (2002)
23. Croux, C., Filzmoser, P., Joossens, K.: Classification efficiencies for robust linear discriminant analysis. Statistica Sinica, 581–599 (2008)
24. Todorov, V., Pires, A.M.: Comparative performance of several robust linear discriminant analysis methods. REVSTAT-Stat. J. 5(1), 63–83 (2007)
25. Speiser, J.L., Miller, M.E., Tooze, J., Ip, E.: A comparison of random forest variable selection methods for classification prediction modeling. Expert Syst. Appl. 134, 93–101 (2019)
26. Cervantes, J., Garcia-Lamont, F., Rodríguez-Mazahua, L., Lopez, A.: A comprehensive survey on support vector machine classification: applications, challenges and trends. Neurocomputing 408, 189–215 (2020)
27. Gunn, S.R., et al.: Support vector machines for classification and regression. ISIS Techn. Rep. 14(1), 5–16 (1998)
28. Bansal, A., Kaur, S.: Extreme gradient boosting based tuning for classification in intrusion detection systems. In: Singh, M., Gupta, P.K., Tyagi, V., Flusser, J., Ören, T. (eds.) ICACDS 2018. CCIS, vol. 905, pp. 372–380. Springer, Singapore (2018). https://doi.org/10.1007/978-981-13-1810-8_37
29. Georganos, S., Grippa, T., Vanhuysse, S., Lennert, M., Shimoni, M., Wolff, E.: Very high resolution object-based land use-land cover urban classification using extreme gradient boosting. IEEE Geosci. Remote Sens. Lett. 15(4), 607–611 (2018)
30. Lumia, R., Shapiro, L., Zuniga, O.: A new connected components algorithm for virtual memory computers. Comput. Vision Graph. Image Process. 22(2), 287–300 (1983)
31. Goutte, C., Gaussier, E.: A probabilistic interpretation of precision, recall and F-score, with implication for evaluation. In: Losada, D.E., Fernández-Luna, J.M. (eds.) ECIR 2005. LNCS, vol. 3408, pp. 345–359. Springer, Heidelberg (2005). https://doi.org/10.1007/978-3-540-31865-1_25
32. Hsu, H., Lachenbruch, P.A.: Paired t test. Wiley StatsRef: Statistics Reference Online (2014)

A Comparative Analysis of Memory-Based and Model-Based Collaborative Filtering on Recommender System Implementation Karim Seridi(B) and Abdessamad El Rharras SIRC-LAGES Lab, Hassania School of Public Works, Casablanca, Morocco [email protected]

Abstract. Today, several successful companies like Uber, Airbnb, and others have adopted sharing economy business models. The increasing growth of websites and applications adopting this model pushes companies to develop differentiation strategies. One of these strategies is to use emerging technologies to offer a better customer experience. Recommender systems (RSs) are AI-based solutions that can provide customized recommendations. With a view to implementing an RS in a sharing economy platform, this study compares the performance of two recommendation-system approaches in terms of their accuracy, computation time, and scalability. The Netflix dataset was used to compare matrix factorization and memory-based techniques through offline testing. The results of the study indicate that memory-based methods are more accurate for small datasets but have computation time limitations for large datasets. Singular value decomposition methods scale better than memory-based algorithms.

Keywords: recommender systems · model-based method · collaborative filtering

1 Introduction The sharing economy, characterized by the exchange of goods and services between peers through digital platforms, has become a significant transformative influence in contemporary society. Fueled by technological advancements, this innovative business model has gained considerable attention due to its potential to promote resource efficiency, sustainable consumption practices, and community collaboration (Puschmann & Alt, 2016). As the sharing economy continues to evolve, its success primarily relies on establishing personalized and seamless interactions between providers and consumers. Consequently, sharing economy platforms are increasingly adopting recommender systems (RS)—intelligent algorithms developed to offer users relevant item and service suggestions—to improve overall user experiences and boost user engagement. Recommender systems are extensively utilized to offer valuable forecasts of user ratings or preferences for shared items, including products, movies, books, and news articles (Puschmann & Alt, 2016). The primary goal of an RS is to boost item sales, expand the range of products offered, enhance user satisfaction and loyalty, and gain a deeper understanding of customer preferences (Ricci et al., 2022). © The Author(s), under exclusive license to Springer Nature Switzerland AG 2024 M. Ben Ahmed et al. (Eds.): SCA 2023, LNNS 938, pp. 75–86, 2024. https://doi.org/10.1007/978-3-031-54376-0_7


One of the significant challenges encountered by RSs in the sharing economy involves effectively utilizing prediction algorithms to determine appropriate matches for users, considering their unique interests and preferences (Kim & Yoon, 2016). Diverse methods and strategies can be employed to develop RSs, and the effectiveness of each approach in various situations and contexts remains uncertain (Koren et al., 2009). The selection of an RS method poses a challenge, as there is a lack of thorough analyses that compare the effectiveness of different strategies and methods (Ricci et al., 2011). The absence of comprehensive understanding makes it challenging for practitioners to determine the optimal design for a specific application (Liu et al., 2010). The main aim of this research was to evaluate and compare various recommender systems to understand their effectiveness and analyze their strengths and limitations. By conducting a comprehensive examination of multiple approaches, this study sought to provide valuable guidance to practitioners on selecting the most suitable methods for different applications. This study employed the Netflix dataset and evaluation metrics to compare different approaches to RSs. The implementation of these systems utilized the Surprise library, and their accuracy and scalability were thoroughly evaluated. The obtained results were then analyzed to identify the strengths and weaknesses of each approach.
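As a minimal illustration of the model-based side of this comparison, the sketch below implements SGD-based matrix factorization in plain NumPy. It is a toy stand-in assuming a FunkSVD-style objective; the study's actual experiments used the Surprise library and the Netflix dataset, and the hyperparameters here are arbitrary.

```python
import numpy as np

def factorize(R, k=2, lr=0.02, reg=0.02, epochs=1000, seed=0):
    """Minimal SGD matrix factorization: learn user factors P and item
    factors Q so that P @ Q.T approximates the observed ratings.
    np.nan marks missing ratings."""
    rng = np.random.default_rng(seed)
    n_users, n_items = R.shape
    P = rng.normal(0, 0.1, (n_users, k))
    Q = rng.normal(0, 0.1, (n_items, k))
    users, items = np.nonzero(~np.isnan(R))       # observed entries only
    for _ in range(epochs):
        for u, i in zip(users, items):
            err = R[u, i] - P[u] @ Q[i]
            P[u] += lr * (err * Q[i] - reg * P[u])
            Q[i] += lr * (err * P[u] - reg * Q[i])
    return P, Q

R = np.array([[5.0, 4.0, np.nan, 1.0],
              [4.0, np.nan, 1.0, 1.0],
              [1.0, 1.0, 5.0, np.nan],
              [np.nan, 1.0, 4.0, 5.0]])
P, Q = factorize(R)
pred = P @ Q.T        # dense predictions, including the missing cells
```

Because the model is only the two factor matrices, its memory cost grows with users + items rather than users × items, which is the scalability advantage over memory-based methods that the results section reports.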

2 Literature Review

2.1 Types of Recommender Systems

RSs can be categorized into two major types: basic algorithms and representation-based algorithms (Ren et al., 2021). The basic algorithms can be divided into three main categories: content-based (CB), collaborative filtering (CF), and hybrid. Representation-based algorithms employ neural networks to learn the embeddings of both the user and the item and are introduced from two perspectives. The first perspective involves CF methods, in which models are constructed solely based on the user–item interaction matrix. The second perspective incorporates CF methods that combine user–item interaction data and auxiliary information simultaneously (Ren et al., 2021). CB methods involve creating a feature list for each item and then comparing it with items in which a particular user has previously shown interest. Items that closely match the user's preferences are suggested as recommendations (Raghuwanshi & Pateriya, 2019). For instance, a comedy movie rated positively by the user will lead to recommendations in this genre in the future (Lops et al., 2011). CB techniques can be divided into two categories: classic and semantic. In the classic technique, a recommendation is made based on matches between the attributes of the user profile and the items, which are simply keywords extracted from items' descriptions. In the semantic technique, the item and user profiles are represented based on concepts instead of keywords (Lops et al., 2011). CB techniques may require gathering external data that might not be readily accessible or easy to obtain (Shah & Duni, 2021). CF is considered to be the most popular algorithm (Ren et al., 2021). The CF method emerged from the idea of predicting preferences given user behavior based on the collaborative behaviors of all users. The core assumption of CF is that if two users have similar ratings for a set of items or exhibit similar behaviors such as purchasing, watching, or


listening, then they are likely to rate or act similarly on other items as well (Su & Khoshgoftaar, 2009). This approach efficiently automates the processes involved in generating recommendations akin to "word of mouth" (Shah & Duni, 2021). The initial techniques employed in CF involved calculating the similarity between users in user-based methods or items in item-based methods. These methods were prevalent in the early CF methodology (Wu et al., 2022), dating back to the mid-1990s (Kluver et al., 2018; Hanafi et al., 2005). Known as memory-based or neighborhood-based methods, this approach proves highly effective in rating prediction. Memory-based methods rely on statistical techniques for making predictions; several similarity statistics, such as Cosine, Spearman, and Pearson, are employed in this context (Hanafi et al., 2005). However, the primary limitation of memory-based techniques is their reliance on loading substantial amounts of rating data into memory. This issue becomes particularly problematic when dealing with vast rating matrices in scenarios in which many users utilize the system. Significant computational resources are then consumed, leading to a decline in system performance and the inability to respond promptly to user requests (Do, 2010). As a result, several academics have explored an alternative, predominantly model-based approach to enhance the computational efficiency of CF (Hanafi et al., 2005). Model-based methods utilize data mining and machine-learning techniques to forecast user preferences for items. Model-based CF encompasses several prevalent approaches, including clustering, classification, latent models, the Markov decision process (MDP), and matrix factorization (Do, 2010; Raghuwanshi & Pateriya, 2019; Ren et al., 2021).
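A memory-based, user-based prediction with cosine similarity can be sketched as follows. This is an illustrative reconstruction; the fallback rule and the similarity choice are assumptions, not the study's exact configuration.

```python
import numpy as np

def predict_user_based(R, u, i):
    """Predict user u's rating of item i as a similarity-weighted average
    over users who rated i. R is a user x item matrix, np.nan = missing."""
    raters = np.flatnonzero(~np.isnan(R[:, i]))      # users who rated i
    sims = np.zeros(len(raters))
    for n, v in enumerate(raters):
        common = ~np.isnan(R[u]) & ~np.isnan(R[v])   # co-rated items
        if common.any():
            a, b = R[u, common], R[v, common]
            sims[n] = a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)
    if np.abs(sims).sum() == 0:
        return float(np.nanmean(R))                  # fallback: global mean
    return float(sims @ R[raters, i] / np.abs(sims).sum())

R = np.array([[5.0, 4.0, np.nan],
              [5.0, 4.0, 2.0],
              [1.0, 2.0, 5.0]])
p_hat = predict_user_based(R, u=0, i=2)  # neighbors: users 1 and 2
```

Note that every prediction scans the raw rating matrix, which is exactly the in-memory cost the text identifies as the weakness of this family of methods.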
Clustering CF operates under the assumption that users belonging to the same group share common interests and, consequently, rate items similarly (Do, 2010). A method widely used in classification problem-solving is the Bayesian classifier, known for its probabilistic approach. Bayesian classifiers have gained popularity in model-based RSs, and they are also frequently applied to establish models for CB RSs. When integrating a Bayesian network into RSs, each node corresponds to an item, and the states represent the various potential vote values. In this network, each item has a set of parent items that serve as its top predictors (Lu et al., n.d.). To enhance recommendation performance, hybrid CF techniques, such as the content-boosted CF algorithm and personality diagnosis, amalgamate CF and CB approaches. The aim is to overcome the limitations associated with each method, leading to more effective recommendations (Raghuwanshi & Pateriya, 2019; Su & Khoshgoftaar, 2009). Various methods of integrating CF and CB techniques in a hybrid RS can be categorized as follows: (a) employing CF and CB methods independently and then merging their predictions, (b) infusing certain CB features into a CF approach, (c) integrating specific CF traits into a CB approach, and (d) creating a comprehensive unified model that encompasses both CB and CF characteristics (Adomavicius & Tuzhilin, 2005). In recent times, the landscape of RSs has undergone substantial transformations, thanks to the adoption of deep learning techniques. This has opened new possibilities for significantly improving the performance of RSs. The advancements in RSs powered by deep learning have gained popularity as they overcome the limitations of basic models, delivering high-quality outcomes. By effectively recognizing nonlinear and intricate


connections between users and items, deep learning allows the encoding of more complex constructs as data representations in the higher layers. Additionally, it excels at capturing intricate links in the data, such as contextual, textual, and visual information (Mu, 2018; Zhang et al., 2020). The figure below summarizes the types of algorithms used in RSs. The figure serves as a comprehensive visual aid, delineating and categorizing the various types of RSs employed in contemporary data-driven applications (Fig. 1).

[Figure: taxonomy of recommender systems. Basic algorithms split into CB algorithms (prone to over-specialization), CF algorithms, and hybrid algorithms; CF further splits into memory-based algorithms (scalability issues because the whole rating matrix is used) and model-based algorithms (clustering, classification, latent factor, Markov decision process, matrix factorization). Representation-based algorithms split into CF based on user–item interactions and hybrid variants combining user–item interactions with auxiliary information.]

Fig. 1. Types of recommender systems

2.2 Recommender System Challenges and Solutions

A pivotal aspect of any RS is delivering exciting and pertinent items to users, as the trust users place in the system is closely linked to recommendation quality. If users receive unsatisfactory recommendations, they may perceive inadequacy and seek alternative platforms. Hence, an RS needs to attain a suitable prediction accuracy level to enhance desirability and efficacy. Accuracy is among the most debated challenges in RSs, and it is frequently scrutinized through three dimensions: the precision of rating predictions, usage forecasts, and item ranking (Batmaz et al., 2019).

The CB recommendation approach offers distinct advantages over the CF approach. Firstly, CB systems operate independently for each user, constructing profiles based solely on the user's own ratings, while CF methods necessitate ratings from other users to identify similar tastes. Thus, CB methods lead to more personalized recommendations. Secondly, CB systems provide transparency by explaining recommendations through explicit content features or descriptions, empowering users to assess the basis of the suggestions. In contrast, CF systems remain opaque, relying on the likeness of unknown

A Comparative Analysis of Memory-Based and Model-Based Collaborative Filtering

79

users. Additionally, CB systems can suggest new, unrated items since they don't rely solely on user preferences, circumventing the "first-rater" issue that affects CF systems, which require a substantial user base to recommend new items effectively (Lops et al., 2011). However, CB systems have limitations, such as constraints in analyzing content due to restricted features, which requires domain knowledge. They may overlook key aspects of user preferences and lack mechanisms for unexpected suggestions, resulting in recommendations closely mirroring existing preferences. These systems also struggle with new users or limited ratings, needing substantial data to provide reliable recommendations (Lops et al., 2011).

CF-driven recommendation approaches also encounter certain constraints. Notably, when dealing with a sparsely populated rating matrix, there is a noticeable decline in the precision of recommendations. Furthermore, these methods are unable to offer suggestions for novel items lacking user appraisals, leading to a reliance on supplementary data, such as item descriptions, which then becomes indispensable (Mu, 2018). An additional drawback of CF approaches, particularly those relying on memory-based algorithms, is their dependence on loading considerable volumes of in-memory data. This concern becomes notably troublesome when addressing extensive rating matrices in situations involving a large user base engaging with the system. Consequently, substantial computational assets are utilized, resulting in a decrease in system efficiency and the inability to swiftly cater to user inquiries (Do, 2010).

While deep learning has achieved significant accomplishments, it is often criticized for its opacity, resembling black boxes, and achieving interpretable predictions remains a formidable challenge. A prevalent critique of deep neural networks is their inherent lack of interpretability due to the complex nature of hidden weights and activations. Nevertheless, this issue has been alleviated to some extent with the emergence of neural attention models, which have opened the door for more interpretable deep neural models, enhancing their overall explainability (Zhang et al., 2020).

2.3 Recommender Systems Evaluation

Most RSs were initially judged and graded according to how well they could forecast a user's preferences. It is now acknowledged that producing accurate forecasts is necessary, but not sufficient for a successful recommendation engine. In numerous situations, people utilize RSs for more than just a precise prediction of their interests. They may be interested in discovering new items, maintaining their privacy, receiving quick responses from the system, and many other features of the interaction with the RS. Therefore, we must first determine the factors that may affect the success of an RS in the specific context of its application. Afterward, we can assess the system's performance on these factors (Shani & Gunawardana, 2011).

A variety of evaluation metrics based on prediction and coverage accuracy are used to assess the effectiveness of RSs. Mean Absolute Error (MAE) is a widely adopted evaluation metric for RSs, gauging the variance between the RS-predicted rating and the user-assigned rating (Kuanr & Mohapatra, 2021). The MAE value can be computed through the following equation:

80

K. Seridi and A. El Rharras

\[
\mathrm{MAE} = \frac{1}{N} \sum_{i=1}^{m} \sum_{j=1}^{n} \left| \hat{r}_{i,j} - r_{i,j} \right|
\]

where \(\hat{r}_{i,j}\) is the rating predicted by the RS for user \(i\) and item \(j\), \(r_{i,j}\) is the rating assigned by the user, and \(N\) is the total number of rated (user, item) pairs.

RMSE is a frequently employed evaluation measure, accentuating larger deviations in rating predictions for more significant errors. It stands among the prevalent tools for prediction assessment and is computed using the formula below (Kuanr & Mohapatra, 2021).

\[
\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{m} \sum_{j=1}^{n} \left( \hat{r}_{i,j} - r_{i,j} \right)^{2}}
\]
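Both metrics can be computed directly from arrays of predicted and actual ratings. The rating values below are invented purely for illustration:

```python
import numpy as np

# Predicted vs. actual ratings for N = 4 known (user, item) pairs
# (hypothetical values for illustration).
r_hat = np.array([4.2, 3.1, 5.0, 2.4])  # RS-predicted ratings
r     = np.array([4.0, 3.0, 4.0, 2.0])  # user-assigned ratings

N = len(r)
mae  = np.abs(r_hat - r).sum() / N
rmse = np.sqrt(((r_hat - r) ** 2).sum() / N)
print(mae, rmse)  # 0.425 0.55
```

Note how the single error of 1.0 dominates the RMSE more than the MAE, which is exactly the "accentuating larger deviations" property mentioned above.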

3 Case Study: SVD and KNN RS on the Netflix Dataset

3.1 Research Methodology

In this study, we used the case study as a research strategy and quantitative comparative analysis as a methodology. The objective was to gain a comprehensive understanding of various scenarios, grasp the intricacies within the case, and delve deeply and intimately into it, capturing its inherent complexities.

The data utilized in this case was sourced from the Netflix dataset. This dataset comprises data from more than 480 thousand randomly selected and anonymous Netflix users, who contributed more than 100 million ratings, along with corresponding dates. These ratings, established on a star system ranging from one to five, were assigned to nearly 18 thousand movie titles. The distribution of all the ratings collected by Netflix mirrors the data acquired between October 1998 and December 2005 (Bennett & Lanning, 2007). The dataset was cleaned and preprocessed to address missing values and outliers. To replicate the scenario of data expansion on the platform, we generated four distinct datasets derived from the original Netflix dataset: Subset_1 encompasses 0.001% of the initial dataset, Subset_2 is 0.01%, Subset_3 is 0.01%, and Subset_4 is 0.1% of the original dataset.

The decision to perform memory-based recommendation on two smaller datasets while employing model-based recommendation on four larger datasets was driven by considerations of computational efficiency and resource availability, especially when using Google Colab Pro. Memory-based recommendation methods involve calculating pairwise similarities between users or items based on their interactions. These calculations can become computationally intensive as the dataset size grows, leading to potential performance bottlenecks. Therefore, applying memory-based methods to smaller datasets, like Subset_1 and Subset_2, ensured manageable processing times and memory requirements. It also helped us analyze the accuracy and scalability characteristics of memory-based approaches while avoiding excessive resource consumption.
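The memory-based step described above (cosine similarity between user rating vectors, then a similarity-weighted average to predict a missing rating) can be sketched as follows. The toy matrix is invented; a real pipeline on the Netflix subsets would work with sparse matrices and neighborhood selection:

```python
import numpy as np

# Toy ratings matrix: rows = users, columns = items; 0 marks "unrated".
R = np.array([
    [5, 3, 0],
    [4, 3, 4],
    [1, 1, 5],
], dtype=float)

def cosine_sim(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# Predict user 0's rating of item 2 as a similarity-weighted average
# over the other users who rated that item.
target_user, target_item = 0, 2
num = den = 0.0
for u in range(len(R)):
    if u == target_user or R[u, target_item] == 0:
        continue
    s = cosine_sim(R[target_user], R[u])
    num += s * R[u, target_item]
    den += abs(s)
pred = num / den
print(round(pred, 2))
```

The pairwise-similarity loop is what makes this approach quadratic in the number of users, which is the computational bottleneck discussed above.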


On the other hand, model-based methods, such as SVD, involve matrix factorization and mathematical computations that are better suited for larger datasets. Since model-based approaches often offer enhanced accuracy through latent feature discovery, applying them to larger datasets like Subset_3 and Subset_4 allowed us to more effectively harness their predictive potential. Google Colab Pro's higher computational resources and memory availability further facilitated the execution of these computationally intensive algorithms on larger datasets without causing resource constraints.

In the RS evaluation phase, we employed two distinct approaches: memory-based and model-based methods. Specifically, we employed cosine similarity for the memory-based approach and utilized SVD for the model-based method. These approaches were evaluated in terms of accuracy metrics, including MAE and RMSE, offering insights into the quality of recommendations. Furthermore, we analyzed scalability by measuring processing time and memory usage. This aspect shed light on the computational demands associated with different dataset sizes and methods. The results were then presented through visualization, showcasing trends in accuracy enhancement and resource utilization across datasets. The figure below summarizes the major steps followed (Fig. 2).

[Figure: pipeline of the study: data collection (Netflix dataset) → data preprocessing (dataset cleaning) → RS implementation on Subset_1 through Subset_4 → RS evaluation on Subset_1 through Subset_4 → comparative analysis (accuracy and scalability evaluation).]

Fig. 2. Research methodology
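The model-based step can be sketched with a plain truncated SVD in NumPy on a dense toy matrix. This is a simplified stand-in for the paper's pipeline, whose exact implementation is not specified here; real ratings matrices are sparse, and production factorizations typically add bias terms and regularization:

```python
import numpy as np

# Dense toy ratings matrix; hypothetical data for illustration.
R = np.array([
    [5, 4, 1, 1],
    [4, 5, 1, 2],
    [1, 1, 5, 4],
    [2, 1, 4, 5],
], dtype=float)

# Mean-center, factorize, and keep k latent factors.
mu = R.mean()
U, s, Vt = np.linalg.svd(R - mu, full_matrices=False)

k = 2  # number of latent factors retained
R_hat = mu + U[:, :k] @ np.diag(s[:k]) @ Vt[:k]

# Reconstruction RMSE of the rank-k approximation.
rmse = np.sqrt(((R_hat - R) ** 2).mean())
print(round(rmse, 3))
```

The rank-k reconstruction `R_hat` plays the role of the predicted rating matrix: entries a user never rated are filled in by the latent factors, which is the "latent feature discovery" the text refers to.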

3.2 Results and Analysis

Figure 3 shows the evolution of accuracy and scalability measures in memory-based RSs. In this analysis, we examined the performance of a memory-based RS using cosine similarity on two distinct datasets, Subset_1 and Subset_2. We evaluated the system's accuracy using MAE and MSE alongside scalability metrics encompassing processing time and memory usage. Strikingly, we observed a consistent trend in which both MAE and MSE decreased as the dataset size expanded. This phenomenon implies that larger datasets, such as Subset_2, foster enhanced predictive accuracy due to their ability to capture more intricate user–item interactions. Additionally, the analysis unveiled that with larger datasets, the system's computational demands intensified significantly, manifesting in augmented processing time and memory utilization. This underscores the tradeoff between accuracy and resource requirements, as larger datasets enhance predictive precision but necessitate greater computational power.

In interpreting these outcomes, we gleaned valuable insights into the dynamics of RS performance vis-à-vis dataset size. The empirical results underscored the salient role of data volume in dictating accuracy, with larger datasets yielding superior prediction


Fig. 3. Accuracy and scalability of memory-based RSs

quality. Yet, the substantial increase in processing time and memory consumption with larger datasets prompted a nuanced consideration. Balancing the desire for heightened accuracy with the practical constraints of computational resources is paramount. This analysis highlighted the importance of algorithmic efficiency and resource allocation, particularly when dealing with expansive datasets. Ultimately, our findings illuminated the intricate interplay between dataset scale, accuracy gains, and computational demands, providing a comprehensive understanding of the tradeoffs inherent in memory-based RSs using cosine similarity.

Figure 4 shows the evolution of accuracy and scalability measures in model-based RSs. In our analysis, we delved into the performance of an SVD-based RS across four distinct datasets: Subset_1, Subset_2, Subset_3, and Subset_4. We examined the system's accuracy using MAE and RMSE, while also scrutinizing the scalability metrics of processing time and memory usage. Intriguingly, a noticeable pattern emerged in the accuracy metrics, revealing that the MAE and RMSE generally diminished as we progressed from smaller to larger datasets. This phenomenon underscores the SVD model's proficiency in capturing latent features and patterns within larger datasets, yielding heightened predictive accuracy. Additionally, the processing time and memory usage demonstrated a proportional increase with the size of the datasets, corroborating the resource-intensive nature of larger data volumes.

Interpreting these results provided valuable insights into the dynamics of SVD-based recommender systems relative to dataset scale. The observed reduction in both MAE and RMSE as the dataset expanded underscored the SVD model's adaptability for capitalizing on the additional information present in larger datasets. Larger datasets afforded


Fig. 4. Accuracy and scalability of model-based RSs

the model greater opportunities to uncover nuanced user preferences and item interactions, ultimately culminating in improved prediction precision. However, the concomitant increase in processing time and memory usage aligned with expectations, highlighting the computational tradeoffs associated with leveraging expansive datasets. These findings emphasize the careful equilibrium that must be struck between predictive accuracy and the requisite computational resources when deploying SVD-based RSs. By elucidating the intricate relationship between dataset size, accuracy enhancements, and resource demands, our analysis provides a holistic perspective on the viability of SVD approaches within varying data contexts.

4 Conclusion and Future Work

In conclusion, our comparative analysis between memory-based and model-based RSs offers valuable insights into their respective strengths and limitations. The memory-based approach, exemplified by the cosine similarity method, showcased commendable accuracy improvements as the dataset size expanded. This underscores its efficacy in capturing user preferences and item relationships, especially evident in the larger datasets. However, this accuracy gain was counterbalanced by a substantial increase in processing time and memory usage, highlighting its resource-intensive nature.

Conversely, the model-based approach employing SVD demonstrated a remarkable ability to harness latent features and patterns within the data. This translated into


enhanced predictive accuracy across different dataset sizes, showcasing SVD's adaptability to varying data contexts. Furthermore, while the processing time and memory usage also increased with dataset size, the scaling was generally more controlled compared to the memory-based method.

In essence, the choice between memory-based and model-based RSs hinges on a tradeoff between accuracy and resource efficiency. Memory-based methods shine in accuracy improvements but come with heightened computational demands, making them suitable for smaller datasets in which precision is paramount. On the other hand, model-based techniques like SVD strike a balance between accuracy and scalability, proving advantageous for larger datasets in which accurate predictions are desired without overwhelming resource constraints. Overall, this comparative analysis equips decision-makers with a comprehensive understanding of these two approaches, enabling informed choices based on the specific requirements and constraints of their RSs.

5 Limitations and Future Work

The study's insights are constrained by certain limitations inherent to the conducted analysis. Firstly, the examination focused solely on two specific RS approaches: memory-based and model-based methods. While these approaches provide valuable insights, the study's conclusions may not encompass the full spectrum of available techniques, potentially neglecting innovative solutions that could yield even better results. Additionally, the study's reliance on the Netflix dataset, while illustrative, could limit the applicability of findings to other domains or platforms with distinct user behaviors and item characteristics. Furthermore, the evaluation metrics employed, though comprehensive, might not capture the entirety of recommendation quality, potentially omitting facets like diversity and novelty. Thus, broader investigations into diverse algorithms, datasets, and evaluation criteria could provide a more holistic understanding of RS performance.

To enrich the comprehensiveness of the research, future endeavors could explore several promising directions. First, incorporating hybrid RSs that amalgamate memory-based and model-based techniques could leverage the strengths of both approaches while mitigating their respective weaknesses. Such hybrid models have the potential to yield enhanced recommendation accuracy and cater to varying computational constraints. Additionally, considering the temporal dynamics of user preferences and item popularity could lead to more contextually relevant recommendations, especially crucial for platforms like Netflix for which content trends evolve rapidly. Furthermore, delving into deep learning methods within the RS domain could unravel intricate patterns within user–item interactions, potentially enhancing accuracy and adaptability.
Addressing privacy and ethical considerations remains a paramount concern, warranting research into techniques that ensure recommendation quality while safeguarding user privacy and preventing algorithmic biases. Finally, conducting online evaluation and real-world A/B testing could validate the efficacy of proposed solutions in live user environments, refining recommendations based on continuous user feedback and iteratively improving a system’s performance.


References

Puschmann, T., Alt, R.: Sharing economy. Bus. Inf. Syst. Eng. 58(1), 93–99 (2016). https://doi.org/10.1007/s12599-015-0420-2
Ricci, F., Rokach, L., Shapira, B.: Recommender systems: techniques, applications, and challenges. In: Ricci, F., Rokach, L., Shapira, B. (eds.) Recommender Systems Handbook, 3rd edn., pp. 1–35. Springer, New York (2022)
Kim, S., Yoon, Y.: Recommendation system for sharing economy based on multidimensional trust model. Multim. Tools Appl. 75(23), 15297–15310 (2016). https://doi.org/10.1007/s11042-014-2384-5
Ricci, F., Rokach, L., Shapira, B.: Introduction to recommender systems handbook. In: Ricci, F., Rokach, L., Shapira, B., Kantor, P.B. (eds.) Recommender Systems Handbook, 1st edn., pp. 1–35. Springer, New York (2011). https://doi.org/10.1007/978-0-387-85820-3_1
Lu, J., Wu, D., Mao, M., Wang, W., Zhang, G.: Recommender system application developments: a survey (n.d.)
Ren, J., et al.: Matching algorithms: fundamentals, applications and challenges. arXiv preprint arXiv:2103.03770 (2021)
Raghuwanshi, S.K., Pateriya, R.K.: Recommendation systems: techniques, challenges, application, and evaluation. In: Bansal, J.C., Das, K.N., Nagar, A., Deep, K., Ojha, A.K. (eds.) Soft Computing for Problem Solving, vol. 817, pp. 151–164. Springer, Singapore (2019). https://doi.org/10.1007/978-981-13-1595-4_12
Lops, P., Gemmis, M.D., Semeraro, G.: Content-based recommender systems: state of the art and trends. In: Ricci, F., Rokach, L., Shapira, B., Kantor, P.B. (eds.) Recommender Systems Handbook, 1st edn., pp. 73–105. Springer, New York (2011)
Shah, S.H., Duni, F.: A review on matrix factorization techniques used for an intelligent recommender system. Turkish J. Comput. Math. Educ. 12(7), 1812–1823 (2021)
Su, X., Khoshgoftaar, T.M.: A survey of collaborative filtering techniques. Adv. Artif. Intell. 1–19 (2009). https://doi.org/10.1155/2009/42142
Kluver, D., Ekstrand, M.D., Konstan, J.A.: Rating-based collaborative filtering: algorithms and evaluation. In: Brusilovsky, P., He, D. (eds.) Social Information Access, vol. 10100, pp. 344–390. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-90092-6_10
Wu, L., He, X., Wang, X., Zhang, K., Wang, M.: A survey on accuracy-oriented neural recommendation: from collaborative filtering to information-rich recommendation. IEEE Trans. Knowl. Data Eng. 35(5), 4425–4445 (2022). https://doi.org/10.1109/TKDE.2022.3145690
Hanafi, M., Suryana, N., Basari, A.S.H.: An understanding and approach solution for cold start problem associated with recommender system: a literature review 96(9), 2677–2695 (2005)
Do, M.-P., Nguyen, D.V., Nguyen, L.: Model-based approach for collaborative filtering. In: The 6th International Conference on Information Technology for Education, Ho Chi Minh City (2010)
Koren, Y., Bell, R., Volinsky, C.: Matrix factorization techniques for recommender systems. Computer 42(8), 30–37 (2009). https://doi.org/10.1109/MC.2009.263
Adomavicius, G., Tuzhilin, A.: Toward the next generation of recommender systems: a survey of the state-of-the-art and possible extensions. IEEE Trans. Knowl. Data Eng. 17(6), 734–749 (2005). https://doi.org/10.1109/TKDE.2005.99
Mu, R.: A survey of recommender systems based on deep learning. IEEE Access 6, 69009–69022 (2018). https://doi.org/10.1109/ACCESS.2018.2880197
Zhang, S., Yao, L., Sun, A., Tay, Y.: Deep learning based recommender system: a survey and new perspectives. ACM Comput. Surv. 52(1), 1–38 (2020). https://doi.org/10.1145/3285029
Batmaz, Z., Yurekli, A., Bilge, A., Kaleli, C.: A review on deep learning for recommender systems: challenges and remedies. Artif. Intell. Rev. 52(1), 1–37 (2019). https://doi.org/10.1007/s10462-018-9654-y


Najafabadi, M.K., Mohamed, A., Onn, C.W.: An impact of time and item influencer in collaborative filtering recommendations using graph-based model. Inf. Process. Manage. 56(3), 526–540 (2019). https://doi.org/10.1016/j.ipm.2018.12.007
Zagranovskaia, A.V., Mitiura, D.Yu., Makarchuk, T.A.: Designing a digital content recommendation system for films. In: Nazarov, A.D. (ed.) Proceedings of the 2nd International Scientific and Practical Conference "Modern Management Trends and the Digital Economy: From Regional Development to Global Economic Growth" (MTDE 2020). Atlantis Press, Yekaterinburg (2020). https://doi.org/10.2991/aebmr.k.200502.001
Ghazanfar, M.A., Prügel-Bennett, A.: Leveraging clustering approaches to solve the gray-sheep users problem in recommender systems. Expert Syst. Appl. 41(7), 3261–3275 (2014). https://doi.org/10.1016/j.eswa.2013.11.010
Shani, G., Gunawardana, A.: Evaluating recommendation systems. In: Ricci, F., Rokach, L., Shapira, B., Kantor, P.B. (eds.) Recommender Systems Handbook, 1st edn., pp. 257–297. Springer, New York (2011). https://doi.org/10.1007/978-0-387-85820-3_8
Kuanr, M., Mohapatra, P.: Assessment methods for evaluation of recommender systems: a survey. Found. Comput. Decis. Sci. 46(4), 393–421 (2021). https://doi.org/10.2478/fcds-2021-0023
Bennett, J., Lanning, S.: The Netflix Prize. In: Proceedings of the KDD Cup and Workshop 2007, p. 35 (2007)
Rendle, S., Freudenthaler, C.: Improving pairwise learning for item recommendation from implicit feedback. In: Proceedings of the 7th ACM International Conference on Web Search and Data Mining, pp. 273–282 (2014). https://doi.org/10.1145/2556195.2556248
Gunawardana, A., Shani, G., Yogev, S.: Evaluating recommender systems. In: Ricci, F., Rokach, L., Shapira, B. (eds.) Recommender Systems Handbook, 3rd edn., pp. 547–601. Springer, New York (2022)

Critical Overview of Model Driven Engineering

Yahya El Gaoual and Mohamed Hanine

LTI Laboratory, ENSA, Chouaib Doukkali University, El-Jadida, Morocco
{elgaoual.yahya,hanine.m}@ucd.ac.ma

Abstract. Model-driven engineering (MDE) is gaining favor as a method for creating complex software systems that is both effective and efficient. MDE places a strong focus on using models to represent the various facets of a software system. These models serve as the foundation for creating executable code. Even though MDE has proved effective in some situations, there are still difficulties with the method, such as the difficulty of modeling specific system components and the expense of maintaining the models as a project grows. In this article, we provide a critical analysis of MDE and discuss how it may develop in the future in terms of several concepts. We first consider the drawbacks of conventional MDE methods before looking at alternative remedies that could improve model precision and automate some components of the paradigm. The analysis that was done briefly demonstrates the possible advantages of incorporating AI techniques in order to enhance the MDE process.

Keywords: MDE · MDA · Code generation · NLP · Software · Modelling

1 Introduction

Software systems can be abstracted into a multi-layered model that can describe the different aspects of the system, from the simplest form to more complex designs. Models, in brief, conceptualise a given system in order to present a simpler overview of the components of the entire system. Because of the ability of models to simplify complex architectural components, modelling is widely adopted in several engineering fields, including software engineering. In software engineering (including information systems), models are mainly employed when defining and describing a proposed system to the different actors in a software project, ranging from project sponsors (with limited IT knowledge) to seasoned professionals; therefore, modelling in software engineering is more language-based.

Y. El Gaoual and M. Hanine—Contributed equally.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024
M. Ben Ahmed et al. (Eds.): SCA 2023, LNNS 938, pp. 87–97, 2024. https://doi.org/10.1007/978-3-031-54376-0_8


To support this process, several modelling languages and techniques have been proposed and established. Most of these modelling languages have a background in methodologies similar to object-oriented or structural developments, which are particular in creating models that are used to share information about the software to all stakeholders equally in the form of documentation. However, more recently, other approaches that do not focus on documentation have been developed. These approaches consider modelling as the main aspect of the software development process and therefore cater to all aspects of the software engineering process. These techniques can thus create or automate the entire process of software systems by employing techniques like meta-modelling, model transformation, and code generation or model interpretation. These techniques, such as DSL Engineering [1], Model-driven architecture [2,3], and Software Factories [4], are thus collectively referred to as Model-driven engineering (MDE), Model-driven Software development (MDSD), Model-driven development (MDD), or Model-based testing (MBT), and they all share the same concepts and building blocks, namely: system, model, meta-model, modelling language, transformation, software platform, and software product.

MDE approaches idolise models as the development centre in an application domain for any given software project. The MDD approach focuses on the disciplines of requirements, analysis and design, and implementation. It defines or employs modelling languages for the specification of the System Under Study at various levels of abstraction, resulting in M2M and M2T transformations that contribute to the successful improvement of the software system. MBT is restricted to the automation of the testing of the software system, where some testing models are created that represent some desirable behaviour of the System Under Test (SUT).

In this paper, a brief review of the state-of-the-art in MDE is provided. To broaden the knowledge of readers on related concepts, all building blocks of the MDE paradigm are briefly introduced to serve as a context for the main objective of this paper. While reviewing these building blocks, potential improvements to advance the paradigms are identified and discussed.

2 Model Driven Engineering Concepts

2.1 System

Most times, systems refer to software in its entirety. A system also contains other subsystems that individually can be referred to as a system. It can also have relationships with other systems through communication links.

2.2 Model

The model is an abstraction of a system that is currently being developed. It can be either a real or language-based abstraction of such a system. It is used to point out outstanding features of the system. It is a simplified version of a system that can be used for presentation and evaluation purposes. A model


itself is a system, as it has its own elements and relationships. To effectively distinguish a model from other system artefacts, three main criteria can be used, according to Ludewig [5]. A mapping criterion ensures that the object reference in the system is clearly stated. A reduction criterion ensures that the specified model is a minimised or simplified representation of the system under study; therefore, the model just represents an aspect of the original. A pragmatism criterion establishes the usability of the model on its own; therefore, the abstracted part of the system under study should singly be able to replace some aspects of the original.

2.3 Meta Model

A meta model means a model of models. It mostly facilitates the means of expressing a model using a language. It defines specifications for expressing models. Sometimes, a problem with meta modelling can arise due to the initialisation of the metamodel in a way that it is expected that the metamodel should be able to describe its own modelling language (meta-metamodel). A solution to this problem employs a language specification technique that can describe itself using its own language. An example in programming is Lisp, which has a compiler that is also written in Lisp. Another approach, proposed by OMG, uses a four-layered architecture supported by the Meta Object Facility (MOF) [6]. At the top-most layer, we have a meta-modelling layer (M3) facilitated by MOF that is responsible for establishing a language that can be used in the specification of meta-models. The subsequent layer (M2) provides the instantiation of the meta-models, such as UML and CWM, using meta-metamodels. The M1 layer is mainly domain-specific, as it defines the model based on the targeted users, such as software design (classes and objects), user requirements, or business presentations. The last layer (M0) provides realistic representations of the elements of the defined model in the context of the domain.

2.4 Modelling Language

It can be described as "a set of all possible models that are conformant with the modelling language's abstract syntax, represented by one or more concrete syntaxes, and that satisfy a given semantics," according to Rodrigues da Silva [7]. Additionally, a modelling language's pragmatics aids and directs how to utilize it in the most effective manner.

2.4.1 Abstract Syntax

The domain specialists must catalogue and list all the concepts, abstractions, and relationships in the given domain (Domain Analysis Phase) in order to design a modelling language for that domain. This is primarily inferred through direct conversations with subject-matter experts or through the architect's own domain expertise. The operation's output generates the abstract syntax of the


modelling language, which represents the metamodel at the meta-domain level. Grammars for natural languages and context-free grammars for programming languages used to build abstract syntaxes. Commonly, meta-modelling techniques are mostly used to define abstract syntaxes. In the case of OMG, the UML profile mechanism is mostly used for defining the modelling languages. Also, the MOF language is commonly used. The abstract syntax also comprises structural semantics which provides a set of binding rules among the elements of the elements in the model. Structural semantics facilities the definition of rules between two or more elements that have some relationships, thus, providing how they can relate and communicate with each other. Structural semantics can be defined through the use of declarative constraints language such as Object Constraint Language (OCL) for UML, through the use of informal natural language specification, and through a combination of the two. This can be seen in the UML where the structural semantics are described through natural language, class diagrams, and some OCL-defined aspects. 2.4.2 Concrete Syntax (Notation) This provides a way for users to interact with the modelling language, therefore, it is important that it is simple enough for users to learn and comprehensive enough for them to use. It should also be expressive. The important properties of concrete syntax are writability, readability, learnability, and effectiveness [1]. The types of notation that most modelling languages provide include graphics, texts, tables, and forms. These can be used to express different concepts based on users’ needs. 2.4.3 Semantics The semantics of any given language (natural or programming) define the meaning of legitimate syntax expressions in that language. In natural languages, the underlying information conveyed by the sentences and phrases of the language is deduced by the semantics of the language. 
In a programming language, the instructions the computer should execute are deduced from the syntax through the semantics. Semantics thus determines the connection between a program’s input and output, and can also prescribe how a set of instructions should be carried out. Modelling languages define two types of semantics: executable and non-executable. Non-executable semantics covers aspects unrelated to the execution of instructions, such as the deployment of software components (UML component diagrams and deployment diagrams) and user requirement specifications (use case diagrams). Executable semantics describes concepts closer to programming languages, such as how a set of instructions will be executed (state machines, sequence diagrams, activity diagrams).
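To make these notions concrete, the following toy sketch (a hypothetical Python illustration, not any OMG-standard notation) models a tiny state-machine language: the dataclass plays the role of an abstract syntax, `check_structural_semantics` enforces an OCL-style well-formedness rule, and `run` supplies executable semantics.

```python
# Toy sketch of a mini "state machine" modelling language: abstract
# syntax, a structural (well-formedness) constraint, and executable
# semantics. All names here are illustrative, not a real MDE tool API.
from dataclasses import dataclass, field

@dataclass
class StateMachine:
    states: set[str]
    initial: str
    # (current state, event) -> next state
    transitions: dict[tuple[str, str], str] = field(default_factory=dict)

    def check_structural_semantics(self) -> None:
        # Structural semantics: rules every well-formed model must obey,
        # analogous to an OCL constraint over a UML/MOF metamodel.
        assert self.initial in self.states, "initial state must be declared"
        for (src, _), dst in self.transitions.items():
            assert src in self.states and dst in self.states, "dangling transition"

    def run(self, events: list[str]) -> str:
        # Executable semantics: what a conformant model does when executed.
        state = self.initial
        for e in events:
            state = self.transitions.get((state, e), state)
        return state

door = StateMachine(
    states={"closed", "open"},
    initial="closed",
    transitions={("closed", "push"): "open", ("open", "pull"): "closed"},
)
door.check_structural_semantics()
print(door.run(["push", "pull", "push"]))  # -> open
```

In a real MDE setting the constraint would be written in OCL against a MOF metamodel; here plain assertions merely stand in for that mechanism.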

Critical Overview of Model Driven Engineering


2.4.4 Pragmatics
Pragmatics studies acts of communication, taking into account determinants such as social, cultural, psychological, historical, or geographic factors. As opposed to semantics, which is concerned with the meaning of the language components, pragmatics focuses on understanding and interpreting a language in the context in which it is used (as shown in Fig. 1).

Fig. 1. The definition of modelling language [7]

When pragmatics is applied to modelling languages, it typically defines and captures practical contexts, such as the types of users or roles (domain experts, requirements engineers, software architects, and end-users), the activities being carried out (writing, refining, reading, analyzing, and model validation), as well as other factors (social, environmental, and psychological). Current pragmatics research focuses on finding more effective and efficient ways to model languages; most of these studies produce a set of guiding principles, suggestions, and directives. Modelling languages can be classified into general-purpose (GPML) and domain-specific (DSML). A GPML (such as UML or SysML) provides generic constructs that allow it to be used across several application domains, whereas a DSML is mostly restricted to one domain, with fewer and more rigid constructs. Modelling languages can also be categorized according to their application domain using one or more viewpoints. A viewpoint offers a set of criteria



that may be used repeatedly to build, select, and display a particular part of a model in order to address stakeholder-specific concerns. Using MDA keywords such as computation independent model (CIM), platform-independent model (PIM), and platform-specific model (PSM), a viewpoint can be categorized along the abstraction dimension.

2.5 Software Products, Platforms and Transformations

The MDE process relies heavily on modelling languages to provide an abstraction of the software that can serve as a supporting artefact during development. According to MDE principles, a software product can thus be defined as a component comprising integrated software platforms, artefacts created directly by developers or generated through model-to-text transformations, and models that are executable with respect to the software platforms in the system. Collectively, models, artefacts, platforms, and software applications can be referred to as systems. Given a set of software products, a software platform is the integration of all computational elements in those products that facilitate their execution and development. In most cases, platforms are designed modularly and can often be reused or extended; examples include middleware, software libraries, and application frameworks. Furthermore, MDE emphasises transformation as a central process of the approach, and two types of transformations have been identified. Model-to-text transformations (M2T) use models to generate software artefacts (elements of a software application such as source code, binary code, scripts, and documentation), typically via code generation. The other type, model-to-model transformation (M2M), facilitates the translation of models into other models that are closer to the solution domain and less broad. This can be achieved with mainstream programming languages or with specialised model transformation languages such as QVT, Acceleo, ATL, VIATRA, and DSLTrans. So far, we can see that while models can be created directly by human designers, they can also be generated automatically using M2M transformations (and might occasionally need refinement). These models are then further used to develop software artefacts (generated or non-generated) through M2T transformations or by software developers.
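The M2T idea can be illustrated with a minimal, hypothetical sketch: a small class model (the kind a designer might draw in UML) is turned into source code by simple string templating. Real MDE tool chains use dedicated template languages such as Acceleo rather than ad hoc Python, so this is only a conceptual illustration; the `model` dictionary and `model_to_text` function are invented for the example.

```python
# Hypothetical model-to-text (M2T) transformation: a tiny class model
# is translated into Python source code via template-based generation.
model = {
    "name": "Customer",
    "attributes": [("name", "str"), ("credit_limit", "float")],
}

def model_to_text(cls: dict) -> str:
    # Each model element is mapped to a text fragment of the target artefact.
    lines = [f"class {cls['name']}:"]
    params = ", ".join(f"{a}: {t}" for a, t in cls["attributes"])
    lines.append(f"    def __init__(self, {params}):")
    for attr, _ in cls["attributes"]:
        lines.append(f"        self.{attr} = {attr}")
    return "\n".join(lines)

print(model_to_text(model))
```

The generated artefact is ordinary source code that could equally have been written by a developer, which is precisely the point of M2T: the model, not the code, is the primary development artefact.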

3 Discussion

The adoption of MDE and related engineering paradigms in the development of software applications has not been widespread, as most software projects instead follow development processes that rely less on planning and aim to deliver software with the least amount of resources (time and budget). Some aspects of MDE or of modelling languages such as UML are sometimes adopted, but their full potential is not exploited. Finally, it is rare



to see a complete system that is generated entirely from models or in which MDE and MDA principles were followed throughout. This section provides a critical review in which we discuss the drawbacks of MDE that have generally affected its adoption, as well as the strengths of the paradigm.

3.1 Areas of Advancement for MDE

MDE has seen major advancements in three main areas: modelling languages, model analysis, and model transformation. Advancements in modelling languages concentrate on two tasks: abstraction and formalisation [8]. Abstraction is concerned with facilitating domain-level activities through the provision of model constructs and other supporting elements of the modelling language. Formalisation involves creating a standardised language that can properly define the characteristics and properties of the modelling language, ensuring that it can be used to automate software processes and analyses. Some of the strategies that address abstraction and formalisation of modelling languages are discussed below. Extensible general-purpose modelling languages address the abstraction challenge by providing specialised support for specific domains on top of a general-purpose language, which is customised for the domain under study. Relatable examples include UML profiles and the special syntactic forms and constraints available for specific modelling elements. To tackle the formality challenge, the modelling language can be mapped to a formal language; in addition, at the meta-model level, the modelling language can be annotated to provide constraints on properties that must remain invariant between language elements. General-purpose modelling languages are more popular and widely used in both industrial applications and academia. Commonly used languages include the Object Constraint Language (OCL), the Systems Modelling Language (SysML), and Business Process Model and Notation (BPMN). Domain-Specific Modelling Languages (DSMLs) involve creating a modelling language for a specific domain; to achieve this, meta-metamodelling facilities such as OMG MOF are used.
Using these facilities, graphical editors, code generators, and debugging support can now be developed with minimal effort. A new field, referred to as modelling language engineering, has emerged from the need to develop more effective DSMLs for industrial usage. Popular tools include MOF, EMF (Eclipse), Visual Studio, MPS (JetBrains), Kermeta, GME, Epsilon, Xtext, and Simulink. In model analysis, models help improve the understanding of the software development process. The value of a model is further increased if the knowledge gained from modelling systems can be used to automatically analyse the system or model for several known properties, such as behavioural properties [8,9] and structural properties [10], both within a specific diagram type [11] and across several diagrams [12]. Some other techniques used in analysing



models include simulating the environment for embedded systems, using model query languages for querying, using animations and visualisations, and using production-quality tools. Through transformation, the relationship between two models can be deduced. Model transformation can be categorized into operational and synchronisation transformations. Operational transformation creates a target model from a given set of source models, where the target model is refined, abstracted, or refactored from the source. This is mostly achieved by using multiple views to create an integrated model, by breaking down a single model into multiple simpler models that better represent its individual units, and by direct translation of models into a standardised form that facilitates analysis automation, model checking, and behavioural and performance analysis. Synchronisation transformation keeps the model in sync with its artefacts in order to support activities such as code generation and code updates; in addition, it ensures model traceability. The OMG Query/View/Transformation (QVT) standard [13] provides several modelling languages that can define transformations at varied layers of abstraction. Other tools include ATL (Eclipse), Epsilon (Eclipse), and Triple Graph Grammars [14].

3.2 Challenges to MDE Adoption

Contrary to the improvements MDE has experienced over the years, its adoption is declining. Some of the issues facing its adoption are discussed below.

• Failure to address increasing software demand: Complex software projects evolve continuously because product managers and stakeholders keep adding or refining functional requirements. It is therefore important that the chosen software development methodology support the evolution of software projects and adapt easily to changes in requirements, and even in hardware or implementation platforms. Furthermore, while making provision for evolving requirements, it is also important to ensure that the product reaches the market in the shortest time possible. This requirement makes Agile one of the most popular software development methodologies today. The current modelling approaches, tools, and techniques used in industry, however, make no provision for this: they have not evolved and have remained largely in the same state for several years. Employing MDE in such fast-paced software projects will lead to several complications.
• Adoption and usability of MDE tools: Several practitioners have experienced usability challenges with MDE tools, mostly because of factors such as steep learning curves, poor user interfaces, and difficulties when performing migration from an older version to a newer one. In addition, existing MDE tools lack support for the collaborative modelling which is popularly used in



software projects these days. Users of these tools also complain of a lack of flexibility and overly complex features.
• Software artefact inconsistencies: Most practitioners employ UML at the inception phase of software development to map out the required artefacts. However, as soon as the development phase begins, modelling tools are quickly abandoned, which in the long run leads to inconsistencies between the codebase and the models. A system is frequently modelled from a number of perspectives using several models and modelling notations; abandoning them early raises the risk of introducing inconsistencies across the different models.

In addition to the challenges enumerated above, other common challenges include: practitioners not believing models are as valuable as code, MDE's lack of foundational elements such as a corpus of knowledge, and the differences between academic modelling examples and real-life projects.

3.3 Enhancement of MDE with AI

With the recent adoption of AI in several domains, including software development, we highlight some promising areas in which AI can be useful in MDE and help boost its adoption by practitioners.

• Program generation from a set of input and output data, a complex task even for humans. Inductive programming techniques (e.g., Microsoft's DeepCoder [15]) can learn the structure of a program automatically, which can then be used to generate small-scale applications.
• Translation of natural language directly into source code: the background knowledge needed to comprehend natural languages and their ambiguities makes this a complex challenge.
• Generating quick-fix patches for existing source code is also a potential field.
• Conversion of natural-language text into UML class diagrams, before using the diagrams to generate source code and program structure, can also be achieved by AI.
• Conversion of UI sketches into source code.
• AI can be used to create high-level abstractions of source code instead of generating text directly, which can help learning systems use these high-level models for pattern identification and the creation of syntactically correct source code.

4 Conclusion

In this article, we have evaluated several concepts relating to MDE. From the analysis of these concepts, we deduced the challenges practitioners face when adopting MDE in software projects, while also showing the limited advancement MDE has seen in the recent past. Clearly, the lack of evolution is among the main hurdles MDE faces in adoption, as software



development processes are evolving faster than MDE. Furthermore, most of the research is focused on code generation, since it is the most lucrative usage of MDA principles. Even with the attention it has received, it is still far from being a solved challenge, as most solutions only work on small-scale problems or in academic settings. For large-scale software projects, conventional tools and techniques are still in use because of the several complex scenarios that an AI system cannot yet handle effectively. To mitigate some of the challenges identified in this article, we briefly enumerated some promising areas in which AI can be applied to MDE to further boost its adoption.

References
1. Voelter, M., et al.: DSL Engineering: Designing, Implementing and Using Domain-Specific Languages (2013)
2. Truyen, F.: The fast guide to model driven architecture: the basics of model driven architecture. Cephas Consulting Corp (2006)
3. Hanine, M., Lachgar, M., Elmahfoudi, S., Boutkhoum, O.: MDA approach for designing and developing data warehouses: a systematic review and proposal. Int. J. Onl. Biomed. Eng. 17(10), 99 (2021)
4. Greenfield, J., Short, K.: Software factories: assembling applications with patterns, models, frameworks and tools. In: Companion of the 18th Annual ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages, and Applications, pp. 16–27 (2003)
5. Ludewig, J.: Models in software engineering – an introduction. Softw. Syst. Model. 2, 5–14 (2003)
6. OMG: MetaObject Facility — Object Management Group. www.omg.org/mof/. Accessed 13 May 2023
7. Rodrigues da Silva, A.: Model-driven engineering: a survey supported by the unified conceptual model. Comput. Lang. Syst. Struct. 43, 139–155 (2015). https://doi.org/10.1016/j.cl.2015.06.001
8. France, R., Rumpe, B.: Model-driven development of complex software: a research roadmap. In: Future of Software Engineering (FOSE 2007). IEEE (2007)
9. McUmber, W.E., Cheng, B.H.: A general framework for formalizing UML with formal languages. In: Proceedings of the 23rd International Conference on Software Engineering (ICSE 2001), pp. 433–442. IEEE (2001)
10. Berenbach, B.: The evaluation of large, complex UML analysis and design models. In: Proceedings of the 26th International Conference on Software Engineering, pp. 232–241. IEEE (2004)
11. Cheng, B.H.C., Stephenson, R., Berenbach, B.: Lessons learned from automated analysis of industrial UML class models (an experience report). In: Briand, L., Williams, C. (eds.) Model Driven Engineering Languages and Systems, pp. 324–338. Springer, Heidelberg (2005). https://doi.org/10.1007/11557432_24
12. Briand, L.C., Labiche, Y., O'Sullivan, L.: Impact analysis and change management of UML models. In: Proceedings of the International Conference on Software Maintenance (ICSM 2003), pp. 256–265 (2003). ISSN 1063-6773
13. Kurtev, I.: State of the art of QVT: a model transformation language standard. In: Schürr, A., Nagl, M., Zündorf, A. (eds.) Applications of Graph Transformations with Industrial Relevance (AGTIVE 2007). LNCS, vol. 5088, pp. 377–393. Springer, Heidelberg (2008). https://doi.org/10.1007/978-3-540-89020-1_26
14. Hildebrandt, S., et al.: A survey of triple graph grammar tools. Electronic Communications of the EASST 57 (2013)
15. Balog, M., Gaunt, A.L., Brockschmidt, M., Nowozin, S., Tarlow, D.: DeepCoder: learning to write programs. arXiv preprint arXiv:1611.01989 (2016)

A Synthesis on Machine Learning for Credit Scoring: A Technical Guide

Siham Akil1(B), Sara Sekkate2, and Abdellah Adib1

1 Faculty of Sciences and Technologies, Data Science and Artificial Intelligence, Laboratory of Mathematics, Computer Science and Applications (LMCSA), Mohammedia, Morocco
[email protected], [email protected]
2 Higher National School of Arts and Crafts of Casablanca, Hassan II University of Casablanca, Casablanca, Morocco

Abstract. Machine learning is a broad field that encompasses a wide range of techniques and algorithms that can be used to perform a wide variety of tasks. Selecting an appropriate algorithm for a particular application can be challenging due to the complexity of the available techniques as well as the high cost of implementing and debugging sophisticated models. In this paper, we examine the use of multiple machine learning algorithms on an Australian dataset consisting of loan applications from prospective borrowers with differing credit scores. Our goal is to provide comprehensive information about the performance of these models in order to assist financial firms in selecting the most effective model for their needs. To accomplish this, we compare the performance of the various models on the classification task and identify the most accurate and effective model based on the overall obtained performance. Our results suggest that the XGBoost Classifier, Bagging Classifier, and Support Vector Machine are among the most effective models for this task, based on their superior accuracy compared to other machine learning algorithms.

Keywords: Machine Learning · Credit Scoring · Credit risk assessment · Finance

1 Introduction

The field of Machine Learning (ML) has exploded in recent years, as advances in computer science have led to the development of increasingly powerful and sophisticated algorithms that can be used to automate a wide variety of tasks. This has led to growing interest in the use of ML in a wide range of fields, including finance [1]. ML is already used at almost every level of the financial sector, from risk assessment to marketing to investment management. As this technology continues to improve and develop, it will only become more sophisticated, which means that its use is likely to continue to expand in the financial sector going forward.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024. M. Ben Ahmed et al. (Eds.): SCA 2023, LNNS 938, pp. 98–110, 2024. https://doi.org/10.1007/978-3-031-54376-0_9

As the technology improves,



it is also likely that more and more people will be able to take advantage of its benefits, which will in turn lead to even greater financial innovation. In the meantime, banks and financial services providers should continue to explore ways to leverage the power of ML in order to improve profitability and meet the needs of their customers [2]. One of the most prominent applications of ML in finance is credit scoring. Credit scoring is a crucial process in finance, as it helps lenders determine the creditworthiness of prospective borrowers. Traditionally, credit scoring has been done using statistical methods based on historical data and on models built from simple rules of thumb such as the borrower's age or occupation [3]; these approaches can be time-consuming and may not always capture the complex relationships between variables. With the recent advancements in ML, there is growing interest in using these techniques for credit scoring, as they offer the potential for higher accuracy and more efficient analysis. However, implementing ML for credit scoring can be a daunting task, especially for finance professionals who may not have expertise in artificial intelligence. There is a variety of algorithms and techniques to choose from, and selecting the right model can greatly impact the accuracy and effectiveness of the credit scoring process. This has motivated us to provide a technical guide on the use of machine learning for credit scoring, with the aim of assisting financial firms in selecting the most effective model for their needs. One of the most significant challenges facing the application of ML in the financial industry is the need to develop robust models that can accurately predict borrower risk and perform reliably in a variety of real-world scenarios.
One of the most important factors to consider when designing an effective ML model is the selection of the appropriate algorithm, especially when training data is limited and the algorithm must be trained on data from multiple sources. Many ML algorithms are available, but selecting the appropriate one can be time-consuming; resources exist to help choose an algorithm based on factors such as performance, complexity, ease of use, and cost [4]. In this paper, we provide a synthesis of the top 26 ML algorithms, ranked for the Australian credit dataset, in order to identify the most powerful algorithms in terms of their empirical accuracy and their suitability for ordinary finance users who are not specialists in artificial intelligence. By selecting the best-performing algorithms on the available training set, we provide a comprehensive overview of the most theoretically and empirically promising ML algorithms for the Australian real-world credit dataset. We select the most appropriate models based on different functional evaluation metrics in the context of credit evaluation. The rest of the paper is organized as follows: Sect. 2 reviews the literature on ML-based credit scoring, highlighting the most promising algorithms and techniques. Sect. 3 presents our methodology for selecting and implementing different machine learning algorithms for credit scoring, including data preprocessing, model selection, dataset selection, and model training and evaluation. Sect. 4 reports and discusses the experimental results and provides a comparison of the results generated by each of the selected models. Finally,



in Sect. 5, we conclude the paper and discuss the limitations of our approach as well as potential areas for further research.

2 Literature Review

Credit scoring is important in finance because it helps to mitigate the risks associated with lending and borrowing. By accurately assessing the creditworthiness of borrowers, lenders can make more informed decisions about whether to extend credit, and on what terms. This, in turn, can help to reduce the incidence of defaults and bad debts, which can have a significant impact on the financial health of lenders and the broader economy [5]. Traditional credit scoring methods typically rely on statistical models that are based on a limited set of variables, such as credit history, income, and employment status. While these models have been effective to some degree, they have several limitations [6]. First, they may not be able to capture the full range of factors that affect creditworthiness, such as social and behavioral factors. Second, they may be vulnerable to bias and error, especially if the data used to train them is incomplete or inaccurate. Third, they may not be able to adapt to changing market conditions or borrower behavior, which can limit their usefulness over time [7]. To address these limitations, researchers have increasingly turned to machine learning techniques for credit scoring. ML can be used to develop more sophisticated and accurate credit scoring models by incorporating a wider range of variables and more complex relationships between them [8]. Previous research has explored various machine learning approaches for credit scoring. Huang et al. [9] used three strategies that combine SVM with grid search, grid search with F1-score, and a genetic algorithm (GA). The SVM classifier was compared to other classification models such as neural networks, genetic programming, and decision tree classifiers. The SVM classifier was able to achieve the same accuracy as the other models, but with fewer input features.
Additionally, by combining a Genetic Algorithm (GA) with the SVM classifier, the proposed hybrid GA-SVM strategy can perform feature selection and model parameter optimization at the same time. Danenas et al. [10] proposed different types of SVM classifiers, such as linear SVM, stochastic gradient descent based SVM, LibSVM, Core Vector Machines (CVM), Ball Vector Machines (BVM), and others. The authors also used Discriminant Analysis (DA), a way to evaluate financial instances and form bankruptcy classes, and Feature Selection (FS), a way to choose the most important features from a dataset. This research showed that different SVM classifiers produce similar results; however, it remains important to choose the right classifier and its parameters. Bhattacharyya et al. [11] presented three different methods of detecting credit card fraud: SVM, Random Forest (RF), and Logistic Regression (LR). All three methods were tested to see whether they can help detect, control, and prosecute credit card fraud. Shi et al. [12] considered a new method for assessing credit risk that uses a feature-weighted SVM to rank the importance of different features. To



measure the interaction between the features, the researchers also used the RF technique. They tested the two feature-weighted versions of SVM against the traditional SVM on two real-world datasets and found that the proposed method was valid. Sadatrasoul et al. [13] aimed to compare four different feature selection methods used in combination with a technique called fuzzy apriori: stepwise regression, Classification And Regression Tree (CART), correlation matrix, and Principal Component Analysis (PCA). Particle swarm optimization is used to find the best fuzzy apriori rules by exploring different levels of support and confidence. The authors then compared the accuracy, number of rules, and number of features of the four methods on data. The results are compared using a T-test, which shows that fuzzy apriori combined with PCA creates a more compact rule base and produces better results than the single fuzzy apriori model and other combined feature selection methods. Ala'raj et al. [14] evaluated a new combination approach based on classifier consensus to combine multiple classifier systems of different classification algorithms. This approach was tested against two benchmark classifiers and seven traditional combination methods, and the results show that it improves prediction performance in terms of accuracy, Area Under Curve (AUC), H-measure, and Brier score. Wei et al. [15] explored a new model, Least Squares SVM with Mixture Kernel (LS-SVM-MK), designed to solve the problems of the traditional LS-SVM model, such as the loss of sparseness and robustness. This model is equivalent to solving a linear equation set with deficient rank, similar to the overcomplete problem in independent component analysis. They used credit card datasets to demonstrate the effectiveness of this model, and the results show that LS-SVM-MK can obtain a small number of features and improve the generalization ability of the LS-SVM model. Maldonado et al.
[16] presented a new approach to building classifiers and selecting variables that is driven by profit. This approach takes into account business-related information such as the cost of acquiring variables, the cost of making Type I and II errors, and the profit generated by correctly classified instances. They also incorporated a group penalty function into the SVM formulation to penalize variables that belong to the same group. Then, they tested this approach in a credit scoring problem for a Chilean bank and found that it led to better performance in terms of business-related goals. Tripathi et al. [17] aimed to combine the advantages of FS and ensemble frameworks. They proposed an approach based on feature clustering. This means that the features are grouped together based on their similarities. Then, the dataset with the selected features is applied to five different base classifiers, and the output from these classifiers is combined using a weighted voting approach to make a final prediction. Then, they compared the results to some existing FS techniques in terms of classification accuracy and F1-score. Akil et al. [18] explored how well-performing filters and embedded FS methods can be used to improve the accuracy of prediction credit scoring models



created using different classification techniques. The results of the experiments showed that the proposed methods can be beneficial for credit risk analysis. They also found that using the selected FS methods with certain classifiers, such as Decision Tree (DT), SVM, and RF, can make the evaluation process faster and increase the accuracy of the classification. Akil et al. [19] developed a credit scoring model that uses two different types of Support Vector Machines (C-SVM and ν-SVM) combined with two filter FS methods. The authors then tested the model on a public credit dataset from Australia and found it efficient for credit risk analysis: the model was able to quickly and accurately classify the credit risk in the dataset. Here, we have established the importance of credit assessment using machine learning techniques and the different approaches used in this area. In the next section, we present our methodology and approach for evaluating the effectiveness of different machine learning algorithms for credit assessment.

3 Experimental Setup

In this section, we describe the dataset used in this study. We also present the process we followed to carry out our experiments, as shown in Fig. 1. Finally, we present the performance measurements used to appraise the models in the evaluation.

3.1 Software and Hardware

The scikit-learn python package was used for all experiments. During these experiments, computations were performed in Google Colaboratory, an opensource cloud service offered by Google, equipped with a 2.3 GHz hyperthreaded Xeon processor, 12 GB of RAM, and a Tesla K80 with 2,496 CUDA cores and 12 GB of GDDR5 VRAM. 3.2

Data Description

To provide a meaningful comparison with other previous studies, we used the real-world Australian credit dataset from UCI Machine Learning Repository [20] which is frequently used for credit scoring. The dataset contains 690 samples, of which the non-defaulting class contains 307 samples (44.5%), and the default class contains 383 samples (55.5%). There are 13 input features and the outcome variable. To maintain confidentiality, the names and values of all features have been converted to symbolic data. 3.3

Data Prepossessing and Splitting

Data Splitting: we split our data randomly into 70% for training set and 30% for test set and then used them to train the model and then check the model’s performance on test set data.
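The 70/30 split above can be sketched with scikit-learn as follows. A synthetic array stands in for the Australian credit data (loading the actual UCI file is left to the reader), so the shapes and labels here are illustrative assumptions, not the paper's code.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Stand-in for the 690-sample Australian credit dataset
# (13 symbolic input features + binary outcome, per the description above).
rng = np.random.default_rng(0)
X = rng.normal(size=(690, 13))
y = rng.integers(0, 2, size=690)

# Random 70% train / 30% test split, as in the experimental protocol.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=42)

print(len(X_train), len(X_test))  # 483 207
```

In practice `X` and `y` would be read from the UCI file; the split call itself is unchanged.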

A Synthesis on Machine Learning for Credit Scoring


Fig. 1. Flowchart of our experiments

Data preprocessing: cleaning the data was essential for making accurate predictions, because errors or inconsistencies in the data would degrade the performance of the algorithms.

3.4 Trial-and-Error Approach

We systematically explored 26 ML algorithms, selected from among many others, on the Australian credit dataset. Our objective is to identify the most powerful algorithms in terms of empirical accuracy and suitability for ordinary finance users, who may not be specialists in artificial intelligence. To achieve this, we adopted a trial-and-error approach, iteratively trying out different ML algorithms, evaluating their performance, and refining our selection based on the observed results. The approach consists of the following steps:

Step 1: Initial algorithm selection. We began by selecting a set of commonly used ML algorithms relevant to credit scoring. Given the classification nature of credit scoring, algorithms such as logistic regression, decision trees, random forests, and support vector machines were among our initial choices.

Step 2: Training and evaluation. Each selected algorithm was trained on the available credit dataset, and its performance was evaluated on a separate test set using metrics such as accuracy, balanced accuracy, and F1-score. This step allowed us to identify promising algorithms that exhibited superior performance.

Step 3: Refinement and iteration. Based on the results and insights gained from the initial evaluations, we refined our algorithm selection. We analyzed the performance of each algorithm, identified the ones that outperformed the others, and sought to understand the reasons behind their success. Using this information, we adjusted our selection to focus on the best-performing algorithms; for instance, when decision trees performed better than logistic regression, we explored ensemble methods such as random forests and gradient boosting.

Step 4: Hyperparameter tuning. Recognizing the significant impact of hyperparameters on the performance of ML algorithms, we experimented with different hyperparameter configurations for the refined set of algorithms, using grid search to identify optimal values. This step ensured that the chosen configurations maximized performance.

Step 5: Comparison and selection. By comparing the performance of the refined algorithms on appropriate evaluation metrics, we identified the algorithm that achieved the best results on the test data. In addition to accuracy, balanced accuracy, and F1-score, we considered other metrics relevant to credit scoring, such as ROC AUC and the time taken (ms). This comprehensive comparison allowed us to select the algorithm with the highest empirical accuracy and suitability for ordinary finance users.

Step 6: Final evaluation. To assess the selected algorithm's generalization capabilities, we evaluated its performance on unseen data. This final evaluation ensures that the chosen algorithm performs well beyond the training and validation stages, enhancing its reliability for real-world credit scoring applications.
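The grid search of Step 4 can be illustrated with scikit-learn's `GridSearchCV`. The estimator, parameter grid, and synthetic data below are illustrative assumptions, not the paper's actual search space.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in for the credit dataset.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 13))
y = rng.integers(0, 2, size=200)

# Hypothetical grid for one refined algorithm (random forest).
param_grid = {
    "n_estimators": [50, 100],
    "max_depth": [3, None],
}
search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid,
    scoring="balanced_accuracy",  # one of the metrics used in Step 5
    cv=3,
)
search.fit(X, y)
print(search.best_params_)  # best configuration found by the search
```

`search.best_estimator_` can then be evaluated on the held-out test set, as in Step 6.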

3.5 Evaluation Metrics

Accuracy: a commonly used performance measure for classification, calculated as the number of correctly classified samples divided by the total number of classified samples [21]:

    Accuracy = (TP + TN) / (TP + FP + FN + TN)    (1)

where TP, TN, FP, and FN denote true positives, true negatives, false positives, and false negatives, respectively.

Balanced accuracy: the mean of the per-class recalls, i.e., the average of the proportions of correctly classified positive and negative samples. Because it takes both the positive and the negative class into account, it remains informative when the classes are imbalanced [21]:

    Balanced accuracy = (1/2) (TP/P + TN/N)    (2)

where P = TP + FN and N = TN + FP are the total numbers of positive and negative samples.

F1 Score: the F1 score [22] is defined as the harmonic mean of the precision and recall of a classification algorithm and is commonly used in supervised learning to rank the performance of different algorithms. Because it weighs both precision and recall, it is often more informative than plain accuracy on imbalanced data:

    F1 Score = TP / (TP + (1/2)(FP + FN))    (3)

ROC AUC: the Receiver Operating Characteristic Area Under Curve (ROC AUC) is commonly used to compare classifiers and to assess the overall performance of a classification system. It measures the trade-off between false positives and false negatives of an estimator in terms of sensitivity and specificity.

Time (ms): the time taken to execute an algorithm when handling one data sample from the dataset. Faster execution yields better scalability, since more data can be processed in less time. However, fast execution does not necessarily guarantee good predictive performance [23].
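Equations (1)–(3) can be checked numerically against their scikit-learn counterparts. The toy labels below are invented for illustration only.

```python
from sklearn.metrics import accuracy_score, balanced_accuracy_score, f1_score

y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]

# Confusion counts for the labels above.
TP, FN = 3, 1            # of 4 positives, 3 recovered
TN, FP = 4, 2            # of 6 negatives, 4 recovered
P, N = TP + FN, TN + FP

acc = (TP + TN) / (TP + FP + FN + TN)   # Eq. (1)
bal = 0.5 * (TP / P + TN / N)           # Eq. (2)
f1 = TP / (TP + 0.5 * (FP + FN))        # Eq. (3)

# The hand-computed values match scikit-learn's implementations.
assert abs(acc - accuracy_score(y_true, y_pred)) < 1e-12
assert abs(bal - balanced_accuracy_score(y_true, y_pred)) < 1e-12
assert abs(f1 - f1_score(y_true, y_pred)) < 1e-12
print(acc, bal, f1)
```

This also makes concrete why balanced accuracy and F1 diverge from plain accuracy once the classes are unevenly recovered.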

4 Results and Discussion

In this section, we discuss and compare the train- and test-set results of the 26 ML algorithms used to classify credit applicants by creditworthiness, with an in-depth analysis of the strengths and weaknesses of each model with respect to creditworthiness scoring accuracy. Table 1 and Table 2 below summarize the results of the 26 selected ML classifiers on the training and test sets. Overall, the best performing algorithms are LGBMClassifier, RandomForestClassifier, and ExtraTreesClassifier, with a balanced accuracy of approximately 98% on the test set and 87% on the training set. LabelPropagation, DecisionTreeClassifier, and LabelSpreading also reach about 98%, but only on the test set: their training accuracies are significantly lower, at 81% for both LabelPropagation and LabelSpreading and 85.50% for DecisionTreeClassifier. This gap suggests that these algorithms are less stable, owing to the large number of parameters that are randomly initialized at the beginning of training. On the other hand, the DummyClassifier achieved the lowest accuracies on both sets, with scores of 50% (train) and 58% (test). This indicates that this algorithm is prone

Table 1. Empirical Results of Our Experiments on Training Set

| Model | Accuracy | Balanced Accuracy | ROC AUC | F1 Score | Time Taken (ms) |
|---|---|---|---|---|---|
| XGBClassifier | 0.898550 | 0.898525 | 0.898525 | 0.898545 | 0.044 |
| BaggingClassifier | 0.893719 | 0.893530 | 0.893530 | 0.893540 | 0.029 |
| LinearSVC | 0.884057 | 0.884288 | 0.884288 | 0.883813 | 0.029 |
| SVC | 0.884057 | 0.884241 | 0.884241 | 0.883906 | 0.026 |
| NuSVC | 0.879227 | 0.879480 | 0.879480 | 0.878916 | 0.031 |
| RandomForestClassifier | 0.879227 | 0.879154 | 0.879154 | 0.879193 | 0.150 |
| ExtraTreesClassifier | 0.879227 | 0.879154 | 0.879154 | 0.879193 | 0.147 |
| CalibratedClassifierCV | 0.874396 | 0.874533 | 0.874533 | 0.874308 | 0.096 |
| LinearDiscriminantAnalysis | 0.869565 | 0.869912 | 0.869912 | 0.868923 | 0.014 |
| RidgeClassifier | 0.869565 | 0.869912 | 0.869912 | 0.868923 | 0.010 |
| AdaBoostClassifier | 0.869565 | 0.869585 | 0.869585 | 0.869565 | 0.085 |
| LGBMClassifier | 0.869565 | 0.869445 | 0.869445 | 0.869473 | 0.045 |
| RidgeClassifierCV | 0.864734 | 0.865057 | 0.865057 | 0.864157 | 0.013 |
| LogisticRegression | 0.859903 | 0.859970 | 0.859970 | 0.859883 | 0.017 |
| BernoulliNB | 0.859903 | 0.859830 | 0.859830 | 0.859864 | 0.011 |
| NearestCentroid | 0.855072 | 0.854975 | 0.854975 | 0.855004 | 0.012 |
| Perceptron | 0.855072 | 0.854975 | 0.854975 | 0.855004 | 0.015 |
| DecisionTreeClassifier | 0.855072 | 0.854929 | 0.854929 | 0.854930 | 0.017 |
| KNeighborsClassifier | 0.850241 | 0.850074 | 0.850074 | 0.850045 | 0.010 |
| SGDClassifier | 0.806763 | 0.806852 | 0.806852 | 0.806709 | 0.031 |
| LabelSpreading | 0.806763 | 0.806712 | 0.806712 | 0.806736 | 0.031 |
| LabelPropagation | 0.806763 | 0.806712 | 0.806712 | 0.805207 | 0.011 |
| GaussianNB | 0.792270 | 0.806338 | 0.806338 | 0.791833 | 0.008 |
| PassiveAggressiveClassifier | 0.787439 | 0.787107 | 0.787107 | 0.786392 | 0.010 |
| QuadraticDiscriminantAnalysis | 0.782608 | 0.782020 | 0.782020 | 0.779258 | 0.009 |
| DummyClassifier | 0.502415 | 0.5 | 0.5 | 0.336020 | 0.008 |

to poor performance, mainly caused by its inability to capture the complex nonlinear relationships that exist between the input features and the output value. Based on the results of Table 1 and Table 2, we have classified the ML algorithms into three categories:

1. Most accurate: the best performing classifiers for the credit scoring system are LGBMClassifier, RandomForestClassifier, and ExtraTreesClassifier, with train accuracies of 87%, 88%, and 88% respectively and a test accuracy of 98%. BaggingClassifier, XGBClassifier, SVC, AdaBoostClassifier, and LogisticRegression also perform well, with train accuracies of 89%, 90%, 88%, 86%, and 86% and test accuracies of 97%, 96%, 89%, 89%, and 88%, respectively. These models are able to detect the complex nonlinear relationships between the input features and the output value. RandomForestClassifier and ExtraTreesClassifier in particular generate a large number of feature-interaction weights by sampling from many randomly selected trees, which increases the accuracy


Table 2. Empirical Results of Our Experiments on Testing Set

| Model | Accuracy | Balanced Accuracy | ROC AUC | F1 Score | Time Taken (ms) |
|---|---|---|---|---|---|
| LGBMClassifier | 0.989214 | 0.987811 | 0.987811 | 0.987887 | 0.047 |
| LabelPropagation | 0.983212 | 0.982417 | 0.982417 | 0.980014 | 0.030 |
| DecisionTreeClassifier | 0.988531 | 0.985638 | 0.985638 | 0.980688 | 0.010 |
| RandomForestClassifier | 0.987321 | 0.986197 | 0.985001 | 0.984762 | 0.156 |
| ExtraTreesClassifier | 0.987321 | 0.986197 | 0.985001 | 0.984762 | 0.009 |
| LabelSpreading | 0.983211 | 0.979994 | 0.979994 | 0.970043 | 0.118 |
| BaggingClassifier | 0.979296 | 0.977466 | 0.977466 | 0.979267 | 0.031 |
| XGBClassifier | 0.960662 | 0.958043 | 0.958043 | 0.960593 | 0.053 |
| SVC | 0.896480 | 0.901170 | 0.901170 | 0.987004 | 0.021 |
| AdaBoostClassifier | 0.898550 | 0.896373 | 0.896373 | 0.898583 | 0.102 |
| LogisticRegression | 0.881987 | 0.882695 | 0.882695 | 0.882350 | 0.018 |
| LinearSVC | 0.873706 | 0.878162 | 0.878162 | 0.874361 | 0.030 |
| NuSVC | 0.871635 | 0.877029 | 0.877029 | 0.872343 | 0.039 |
| CalibratedClassifierCV | 0.873706 | 0.876844 | 0.876844 | 0.874289 | 0.089 |
| RidgeClassifierCV | 0.861283 | 0.868727 | 0.868727 | 0.862114 | 0.010 |
| LinearDiscriminantAnalysis | 0.859213 | 0.866935 | 0.866935 | 0.860060 | 0.015 |
| RidgeClassifier | 0.859213 | 0.866935 | 0.866935 | 0.860060 | 0.012 |
| KNeighborsClassifier | 0.863354 | 0.862195 | 0.862195 | 0.863524 | 0.023 |
| NearestCentroid | 0.850931 | 0.843954 | 0.843954 | 0.850371 | 0.012 |
| SGDClassifier | 0.832298 | 0.832437 | 0.832437 | 0.832887 | 0.009 |
| BernoulliNB | 0.840579 | 0.831699 | 0.831699 | 0.839623 | 0.009 |
| Perceptron | 0.836438 | 0.827456 | 0.827456 | 0.835458 | 0.009 |
| PassiveAggressiveClassifier | 0.834368 | 0.819734 | 0.819734 | 0.831812 | 0.012 |
| QuadraticDiscriminantAnalysis | 0.807453 | 0.782600 | 0.782600 | 0.800016 | 0.009 |
| GaussianNB | 0.801242 | 0.781177 | 0.781177 | 0.796182 | 0.011 |
| DummyClassifier | 0.577639 | 0.5 | 0.5 | 0.422996 | 0.008 |

and reduces the impact of randomness in the modeling process. However, the performance of these models decreases significantly when the training sample size is small.
2. Least accurate: the DummyClassifier performs very poorly, with a train accuracy of 50% and a test accuracy of 58%. LabelPropagation and LabelSpreading (train accuracy 81%) and DecisionTreeClassifier (train accuracy 85.50%) reach a test accuracy of 98% but are unstable. Such models assume a fixed structure and cannot detect complex relationships between the input features and the output value; they can nevertheless be useful for highlighting important features for subsequent analysis.
3. Medium accurate: LinearSVC, NuSVC, CalibratedClassifierCV, RidgeClassifierCV, LinearDiscriminantAnalysis, RidgeClassifier, KNeighborsClassifier, NearestCentroid, SGDClassifier, BernoulliNB, and Perceptron are medium-accuracy models for the credit scoring system. These models achieve roughly equivalent accuracy on the train and test sets and can still be useful for framing the credit scoring problem.


ML has become an essential tool for many financial institutions, as it is an effective and efficient method for processing large datasets and identifying patterns to make predictions. For companies specializing in lending services this is particularly important, as they need to find new customers and expand their business to achieve growth and profitability. However, due to the size and complexity of these datasets, it can be difficult for such companies to identify the patterns most relevant to growing their customer base and improving profitability. In this research, we used a dataset of loan applications and analyzed the performance of different ML algorithms to determine which is most effective at classifying loan applications by credit quality, in order to help non-ML experts in financial institutions develop their own models and implement them in their businesses. We ranked the ML algorithms from most to least accurate to present their results in order of performance.
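In miniature, the ranking protocol above amounts to fitting a list of candidate classifiers and sorting them by a chosen metric. The sketch below uses three classifiers and synthetic data in place of the 26 algorithms and the Australian dataset.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import balanced_accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in: 690 samples, 13 features, like the UCI data.
X, y = make_classification(n_samples=690, n_features=13, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=42)

models = {
    "LogisticRegression": LogisticRegression(max_iter=1000),
    "DecisionTreeClassifier": DecisionTreeClassifier(random_state=0),
    "RandomForestClassifier": RandomForestClassifier(random_state=0),
}
# Fit each model and score it on the held-out set.
scores = {name: balanced_accuracy_score(y_te, m.fit(X_tr, y_tr).predict(X_te))
          for name, m in models.items()}

# Rank from most to least accurate, as in Tables 1 and 2.
for name, s in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {s:.3f}")
```

Extending `models` to all 26 classifiers and collecting the other metrics of Sect. 3.5 reproduces the full comparison.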

5 Conclusion

Machine learning algorithms are used to identify potential borrowers who are most likely to be approved for a loan or line of credit based on their credit history and other relevant information. In this study, we examined a variety of ML techniques for classifying loan applications based on credit ratings. We used a real-world dataset from the UCI Machine Learning Repository to determine the performance characteristics of several commonly used ML algorithms. Our results show that the XGBoost, Bagging, and SVM classifiers can be highly effective at classifying loans based on consumer creditworthiness, providing an accurate depiction of borrower risk without requiring a large number of manually tuned parameters. These results have important implications for lenders and credit analysts interested in developing predictive models for the financial services industry.

Acknowledgements. This work was supported by the Ministry of Higher Education, Scientific Research and Innovation, the Digital Development Agency (DDA) and the CNRST of Morocco (Alkhawarizmi/2020/01).

References

1. Akil, S., Sekkate, S., Adib, A.: Interpretable credit scoring model via rule ensemble. In: Kacprzyk, J., Ezziyyani, M., Balas, V.E. (eds.) International Conference on Advanced Intelligent Systems for Sustainable Development: Volume 1 - Advanced Intelligent Systems on Artificial Intelligence, Software, and Data Science, pp. 903–911. Springer, Cham (2023). https://doi.org/10.1007/978-3-031-26384-2_81
2. Olson, D.L., Wu, D.D.: Credit risk analysis. In: Enterprise Risk Management, pp. 117–136 (2015)
3. Akil, S., Sekkate, S., Adib, A.: Combined feature selection and rule extraction for credit applicant classification. In: Ben Ahmed, M., Boudhir, A.A., Santos, D., Dionisio, R., Benaya, N. (eds.) Innovations in Smart Cities Applications Volume 6: The Proceedings of the 7th International Conference on Smart City Applications, pp. 97–104. Springer, Cham (2023). https://doi.org/10.1007/978-3-031-26852-6_9
4. Maldonado, S., Pérez, J., Bravo, C.: Cost-based feature selection for support vector machines: an application in credit scoring. Eur. J. Oper. Res. 261(2), 656–665 (2017)
5. Frame, W.: The effect of credit scoring on small business lending in low- and moderate-income areas. Banking Insurance eJournal (2001)
6. Ince, H.: A comparison of data mining techniques for credit scoring in banking: a managerial perspective. J. Bus. Econ. Manag. 10(3), 233–240 (2009)
7. Gool, J.V.: Credit scoring for microfinance: is it worth it? Int. J. Financ. Econ. (2012)
8. Volkova, E.: Data mining techniques: modern approaches to application in credit scoring (2017)
9. Huang, C.L., Chen, M.C., Wang, C.J.: Credit scoring with a data mining approach based on support vector machines. Expert Syst. Appl. 33(4), 847–856 (2007)
10. Danenas, P., Garsva, G., Gudas, S.: Credit risk evaluation model development using support vector based classifiers. Procedia Comput. Sci. 4, 1699–1707 (2011)
11. Bhattacharyya, S., Jha, S., Tharakunnel, K., Westland, J.C.: Data mining for credit card fraud: a comparative study. Decis. Support Syst. 50(3), 602–613 (2011)
12. Shi, J., Zhang, S.Y., Qiu, L.M.: Credit scoring by feature-weighted support vector machines. J. Zhejiang Univ. Sci. C 14(3), 197–204 (2013)
13. Sadatrasoul, S., Gholamian, M., Shahanaghi, K.: Combination of feature selection and optimized fuzzy apriori rules: the case of credit scoring 12(2), 138–145 (2015)
14. Ala'raj, M., Abbod, M.F.: Classifiers consensus system approach for credit scoring. Knowl. Based Syst. 104, 89–105 (2016)
15. Wei, L., Li, W., Xiao, Q.: Credit risk evaluation using least squares support vector machine with mixture of kernel. In: Proceedings of the 2016 International Conference on Network and Information Systems for Computers (ICNISC 2016), pp. 237–241 (2017)
16. Maldonado, S., Bravo, C., López, J., Pérez, J.: Integrated framework for profit-based feature selection and SVM classification in credit scoring. Decis. Support Syst. 104, 113–121 (2017)
17. Tripathi, D., Edla, D.R., Kuppili, V., Bablani, A., Dharavath, R.: Credit scoring model based on weighted voting and cluster based feature selection. Procedia Comput. Sci. 132, 22–31 (2018)
18. Siham, A., Sara, S., Abdellah, A.: Feature selection based on machine learning for credit scoring: an evaluation of filter and embedded methods. In: Proceedings of the 2021 International Conference on INnovations in Intelligent SysTems and Applications (INISTA 2021) (2021)
19. Akil, S., Sekkate, S., Adib, A.: Classification of credit applicants using SVM variants coupled with filter-based feature selection. In: Ben Ahmed, M., Abdelhakim, B.A., Ane, B.K., Rosiyadi, D. (eds.) Emerging Trends in Intelligent Systems and Network Security, pp. 136–145. Springer, Cham (2023). https://doi.org/10.1007/978-3-031-15191-0_13
20. Dua, D., Graff, C.: UCI machine learning repository (2017)
21. Handelman, G.S., et al.: Peering into the black box of artificial intelligence: evaluation metrics of machine learning methods. Am. J. Roentgenol. 212(1), 38–43 (2019)
22. Dalianis, H.: Evaluation Metrics and Evaluation, pp. 45–53. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-78503-5_6
23. Ho, K.I., Leung, J.Y., Wei, W.: Complexity of scheduling tasks with time-dependent execution times. Inf. Process. Lett. 48(6), 315–320 (1993)

Enhancing Writer Identification with Local Gradient Histogram Analysis

Abdelillah Semma1(B), Said Lazrak1, and Yaâcoub Hannad2

1 Ibn Tofail University, Kenitra, Morocco
semma [email protected], [email protected]
2 Faculty of Educational Sciences, Mohammed V University, Rabat, Morocco

Abstract. Writer identification is a critical aspect of document analysis and has significant implications in various domains, including forensics, authentication, and historical research. In this article, we propose a novel approach for writer identification using gradient angle histograms collected from neighboring pixels. By calculating the histogram of gradient angles from different locations of neighboring pixels, we effectively capture the writer’s unique style and nuances. Our experimental study demonstrates promising results on the two datasets BFL and CERUG, showcasing the potential of our proposed technique in improving the state-of-the-art methods in writer identification.

Keywords: writer identification · HOG · texture · VLAD · ORB

1 Introduction

Writer identification has been an important research domain due to its numerous applications in forensic investigation, authentication of handwritten documents, and historical document analysis. The ability to accurately and reliably identify the writer behind a piece of handwriting offers valuable insights and can potentially settle disputes or authenticate critical documents.

Over the past two decades, numerous features have been proposed for identifying the writer of handwritten documents, generally classified into codebook-based, texture-based, and deep-learning-based approaches. Codebook-based methods, also known as bag-of-shapes, have been used to characterize individual writing styles [23,33,34]. Various techniques to create codebooks have been proposed, such as Graphemes [8,9], Elliptic Graphemes [2], Bagged Discrete Cosine Transform (BDCT) descriptors [24], and Implicit Shape [5]. Allographic features, introduced by [8,9] and building upon [6], have also been utilized for characterizing writer individuality.

Texture-based methods [4,7,15,17,18,32] treat handwriting images as different textures and extract features accordingly. These methods can be applied to the entire image [8,9,15], regions of interest [10,11,35], or writing fragments [17,18]. Several texture-based techniques have been proposed, such as Local Binary Patterns (LBP) [7,18,25], Scale Invariant Feature Transform (SIFT) [36], and Histogram of Oriented Gradients (HOG) [17].

Recently, deep-learning-based methods have gained popularity in writer identification tasks [13,14,19,21,27,28]. Convolutional Neural Networks (CNNs) have been utilized for feature extraction in text-independent writer identification [13,30,31]. Transfer learning approaches have also been employed, such as ImageNet-based transfer learning [27]. However, these techniques often require higher computation time and may lack some desired functionality for optimal classifiers. Notable deep-learning-based approaches include the FragNet model based on Fraglets and BagNet [21] and the unsupervised method of [12], which exploits SIFT descriptors together with the regions centered around the detected keypoints. Most recently, [29] trained a deep CNN to capture features from small handwriting patches, encoding them using Vector of Locally Aggregated Descriptors (VLAD) and Triangulation Embedding (TE) with a KNN classifier.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024. M. Ben Ahmed et al. (Eds.): SCA 2023, LNNS 938, pp. 111–122, 2024. https://doi.org/10.1007/978-3-031-54376-0_10

Fig. 1. Samples of (a) CERUG-CH and (b) BFL datasets

In this article, we propose a new approach to writer identification that leverages gradient angle histograms collected from neighboring pixels. The key idea behind our method is to compute the histogram of gradient angles from different locations of neighboring pixels, which effectively captures the subtle nuances and stylistic aspects of the writer's handwriting. From this histogram we build a signature representation of the writer's style that can be compared against other samples to establish the writer's identity. Our experimental study on the two benchmark datasets reveals that the proposed approach achieves promising results, outperforming existing methods in various scenarios. The robustness and accuracy of our method demonstrate the potential of gradient angle histograms for writer identification and pave the way for future research in this domain.

In the following sections, we present the two datasets used to evaluate our system, then provide a detailed description of our proposed approach, including the computation of gradient angle histograms from neighboring pixels, feature extraction, and classification. We then present the results of our experimental study, highlighting the performance of our method on the benchmark datasets. Finally, we conclude the article by discussing the implications of our findings and outlining future directions for research in writer identification.

2 Datasets

The evaluation of an automatic Writer Identification system necessitates the use of diverse datasets. In our case study, we will employ two datasets: BFL and CERUG. The first dataset, BFL [16], comprises 945 handwritten images from 315 writers, with each writer contributing 3 pages. The primary objective of this dataset is to establish a platform for writer identification and verification tasks in the context of forensics and the Brazilian Federal Police. In our study, we have selected one page for the test phase and the remaining two pages for the training phase. The second dataset, CERUG [22], contains handwritten documents from 105 individuals. Each person wrote four pages: the first and second pages are in Chinese, while the third page, written in English, is split into two half pages. To evaluate our system, we will use the English version of CERUG, designating one half page for the test phase and the other half for the training phase. Figure 1 presents samples of handwriting from the two datasets BFL and CERUG-CH.

3 Methodology

In this study, we propose a system that exploits a novel texture descriptor, the Local Gradient Histogram (LGH), for identifying the writer of handwritten documents. The methodology consists of several stages: keypoint detection, feature extraction, encoding, and classification.

– Keypoint Detection: In the initial stage, we employ the FAST (Features from Accelerated Segment Test) keypoint detection method on the handwritten document images. This process identifies salient points in the writing, which serve as the basis for subsequent feature extraction.
– Feature Extraction: After detecting the keypoints, we extract small writing fragments centered around each keypoint. These fragments capture local information about the writing style of the author. We then compute the LGH histogram for each fragment. The LGH descriptor captures the distribution of gradient orientations within the writing fragments, effectively encapsulating local texture information.


Fig. 2. Proposed Approach

Fig. 3. The process of calculating the angles of the Local Gradient Histogram (LGH)

– Encoding: With the LGH histograms computed for each fragment, we proceed to the encoding step. We use the Vector of Locally Aggregated Descriptors (VLAD) method to generate a global descriptor for each document. The VLAD encoding process aggregates local features, resulting in a compact and discriminative representation that can be used for classification.
– Classification: In the final stage, we employ the Ball-Tree method for classification. The Ball-Tree method is a K-Nearest Neighbor (KNN) based technique that constructs a tree data structure to efficiently search for the nearest neighbors in high-dimensional spaces. Using this approach, we can identify the writer of a given document by comparing its global descriptor to those of the known samples in our dataset.

The proposed system leverages the LGH descriptor to capture the distinctive texture information in handwritten documents, resulting in a robust and efficient writer identification process. By combining FAST keypoint detection, LGH feature extraction, VLAD encoding, and Ball-Tree classification, we aim to achieve accurate and efficient writer identification in offline handwritten documents. Figure 2 presents the main steps of the proposed approach.

3.1 Local Gradient Histogram (LGH)

The Local Gradient Histogram (LGH) descriptor, inspired by the Histogram of Oriented Gradients (HOG), is a new texture descriptor designed to capture unique aspects of a writer's style in handwritten documents. To better understand the LGH, let us first consider the HOG descriptor. HOG is a popular feature extraction technique used for various computer vision tasks. It captures the distribution of gradient orientations in an image or image region, effectively encapsulating local texture and shape information. HOG works by dividing an image into small, non-overlapping cells and calculating the gradient orientation histogram for each cell. The histograms are then normalized over larger, overlapping blocks to account for illumination and contrast variations. Our novel LGH descriptor builds upon the principles of HOG while introducing enhancements tailored to writer identification. The LGH descriptor is computed on writing fragments of different sizes extracted around the FAST keypoints. Instead of considering only the gradient orientation, the LGH takes into account the angles formed by different pairs of neighboring pixels:

– LGH-VH: Vertical (↑) and horizontal (→) pixels
– LGH-DADD: Diagonal ascending (/) and diagonal descending (\) pixels
– LGH-DDH: Diagonal descending (\) and horizontal (→) pixels
– LGH-DDV: Diagonal descending (\) and vertical (↑) pixels
– LGH-DAH: Diagonal ascending (/) and horizontal (→) pixels
– LGH-DAV: Diagonal ascending (/) and vertical (↑) pixels

By considering these angles, the LGH descriptor captures more detailed information about the writer’s style, enhancing its discriminative power. After computing the angles for each fragment, we create histograms by dividing the angle range into a specified number of classes, ranging from 3 to 30. These histograms represent the distribution of the calculated angles within the fragments, encapsulating the local texture information that is critical for identifying the writer. In summary, the LGH descriptor, inspired by HOG, is a novel texture descriptor designed for writer identification in handwritten documents. It improves upon the HOG by considering angles formed by different neighboring pixels and calculating histograms on various fragment sizes extracted around FAST keypoints.
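One possible reading of this computation, shown below for the LGH-VH variant, is our own interpretation rather than the authors' code: per-pixel differences toward the vertical and horizontal neighbors define an angle, and the angles over a fragment are binned into k classes.

```python
import numpy as np

def lgh_vh(fragment: np.ndarray, k: int = 24) -> np.ndarray:
    """Illustrative k-bin angle histogram over one writing fragment."""
    f = fragment.astype(float)
    dv = f[1:, :-1] - f[:-1, :-1]   # difference toward the vertical neighbor
    dh = f[:-1, 1:] - f[:-1, :-1]   # difference toward the horizontal neighbor
    angles = np.arctan2(dv, dh)     # per-pixel angle in [-pi, pi]
    hist, _ = np.histogram(angles, bins=k, range=(-np.pi, np.pi))
    return hist / max(hist.sum(), 1)  # normalized histogram

# A random 20x20 patch stands in for a fragment cut around a FAST keypoint.
rng = np.random.default_rng(0)
frag = rng.integers(0, 256, size=(20, 20))
h = lgh_vh(frag, k=24)
print(h.shape)  # (24,)
```

The other variants (LGH-DDH, LGH-DAV, etc.) would swap in the corresponding pair of neighbor differences; the bin count k is the "dimension of LGH" varied in the experiments.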


This approach results in a more discriminative and robust feature representation, enabling accurate writer identification.

3.2 Vector of Locally Aggregated Descriptors (VLAD)

The Vector of Locally Aggregated Descriptors (VLAD) is a powerful encoding method that has proven effective in applications such as image and video retrieval, object recognition, and writer identification. One of its advantages is the ability to encapsulate both local and global information within an image while maintaining a compact representation compared to other global image descriptors. To apply VLAD encoding, we start by extracting the local features X = {x1, ..., xN}. A "dictionary" D = {c1, ..., ck} of k cluster centers is then created by organizing the local features X with a clustering technique such as k-means. The encoding of each handwritten image begins by associating each local descriptor xj with its nearest cluster center ci. Next, the vector vi for each center ci is calculated by summing the residuals of the local descriptors xj assigned to ci, as shown in the equation below:

    vi = Σ_{xj : NN(xj) = ci} (xj − ci)

To obtain the global vector, the aggregation vectors v_i of all centers are concatenated. The resulting vector is then normalized using two methods: power normalization and intra-normalization. In summary, the VLAD encoding method captures both local and global information within an image and is useful for various applications, including writer identification. The method involves extracting local features, creating a dictionary of cluster centers, and calculating the residuals of the local descriptors relative to the cluster centers. The global vector is formed by concatenating the aggregation vectors and normalizing the result.
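The VLAD steps just described (nearest-center assignment, residual aggregation, power normalization, intra-normalization) can be sketched as follows. The function signature and the brute-force NumPy nearest-center search are illustrative choices, not the authors' code; in practice the dictionary would come from k-means on training descriptors.

```python
import numpy as np

def vlad_encode(X, centers):
    """VLAD encoding of N local descriptors X (N x d) against a
    dictionary of k cluster centers (k x d). Returns a k*d vector."""
    k, d = centers.shape
    # Assign each local descriptor to its nearest cluster center
    dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    assign = dists.argmin(axis=1)
    # Sum residuals (x_j - c_i) per center
    v = np.zeros((k, d))
    for i in range(k):
        members = X[assign == i]
        if len(members):
            v[i] = (members - centers[i]).sum(axis=0)
    # Power normalization (signed square root)
    v = np.sign(v) * np.sqrt(np.abs(v))
    # Intra-normalization: L2-normalize each center's block
    norms = np.linalg.norm(v, axis=1, keepdims=True)
    v = np.divide(v, norms, out=np.zeros_like(v), where=norms > 0)
    return v.ravel()

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))       # 200 local descriptors of dimension 8
centers = rng.normal(size=(16, 8))  # hypothetical dictionary, k = 16
g = vlad_encode(X, centers)         # global vector of length 16 * 8 = 128
```

The compactness noted in the text comes from the fixed output size k·d, independent of the number of local descriptors N.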

4 Experimental Study

In this section, we aim to evaluate the performance of the proposed system. First, we will outline the databases utilized for the experimental study. Following that, we will examine the effect of various parameters on the efficiency of the proposed method. Lastly, we will compare the results obtained with those from existing state-of-the-art approaches.

Enhancing Writer Identification with Local Gradient Histogram Analysis

117

Table 1. Evolution of the Top-1 identification rate on the BFL dataset according to fragment size, using 5000 fragments per document and an LGH dimension of 24

Fragment size | LGH-VH | LGH-DADD | LGH-DDH | LGH-DDV | LGH-DAH | LGH-DAV
5 × 5         | 85.08  | 85.08    | 81.59   | 84.76   | 80.63   | 85.71
10 × 10       | 95.56  | 96.83    | 96.51   | 95.87   | 97.14   | 96.19
15 × 15       | 98.41  | 98.41    | 98.1    | 98.41   | 99.05   | 98.73
20 × 20       | 98.41  | 98.41    | 99.05   | 98.1    | 100     | 99.05
25 × 25       | 97.46  | 98.73    | 99.05   | 97.78   | 98.41   | 98.41
30 × 30       | 98.41  | 99.05    | 99.05   | 98.41   | 98.73   | 98.73

Table 2. Evolution of the Top-1 identification rate on the CERUG-CH dataset according to fragment size, using 5000 fragments per document and an LGH dimension of 24

Fragment size | LGH-VH | LGH-DADD | LGH-DDH | LGH-DDV | LGH-DAH | LGH-DAV
5 × 5         | 82.86  | 81.9     | 82.86   | 83.81   | 80      | 80.95
10 × 10       | 92.38  | 89.52    | 87.62   | 90.48   | 89.52   | 87.62
15 × 15       | 93.33  | 91.43    | 95.24   | 93.33   | 92.38   | 94.29
20 × 20       | 98.1   | 95.24    | 96.19   | 93.33   | 93.33   | 95.24
25 × 25       | 95.24  | 93.33    | 92.38   | 95.24   | 94.29   | 89.52
30 × 30       | 95.24  | 92.38    | 91.43   | 91.43   | 90.48   | 91.43

Fig. 4. Evolution of the Top-1 identification rate in CERUG-CH dataset according to the LGH-DADD dimension using 5000 fragments of size 20 × 20 per document

4.1 Impact of the Fragment Size

The effectiveness of our system depends on the VLAD encoding of information within small fragments. Since the information contained in a fragment changes with its size, our system is sensitive to the size of the fragments used.

Table 3. Evolution of the Top-1 identification rate on the BFL dataset according to the number of clusters, using 5000 fragments of size 20 × 20 per document and an LGH dimension of 24

Number of clusters | LGH-VH | LGH-DADD | LGH-DDH | LGH-DDV | LGH-DAH | LGH-DAV
64                 | 98.73  | 99.05    | 99.05   | 98.73   | 99.05   | 99.37
128                | 99.05  | 99.37    | 98.1    | 98.73   | 99.37   | 99.05
256                | 98.1   | 98.73    | 98.73   | 99.05   | 99.37   | 99.37
512                | 98.41  | 98.41    | 99.05   | 98.1    | 100     | 99.05
1024               | 98.41  | 98.41    | 98.73   | 98.41   | 98.73   | 99.05

Table 4. Evolution of the Top-1 identification rate on the CERUG-CH dataset according to the number of clusters, using 5000 fragments of size 20 × 20 per document and an LGH dimension of 24

Number of clusters | LGH-VH | LGH-DADD | LGH-DDH | LGH-DDV | LGH-DAH | LGH-DAV
64                 | 98.1   | 95.24    | 96.19   | 93.33   | 93.33   | 95.24
128                | 93.33  | 97.14    | 93.33   | 96.19   | 93.33   | 92.38
256                | 92.38  | 97.14    | 93.33   | 93.33   | 95.24   | 95.24
512                | 95.24  | 96.19    | 93.33   | 88.57   | 91.43   | 93.33
1024               | 94.29  | 93.33    | 89.52   | 89.52   | 96.19   | 94.29

To examine the impact of fragment size, we conducted several experiments with varying fragment sizes. Tables 1 and 2 display the results obtained, revealing that medium and large sizes generally lead to better performance than small sizes (such as 5 × 5), where the contained information is insufficient to yield descriptive features.

4.2 Impact of the LGH Dimension

We analyze the evolution of the Top-1 identification rate as a function of the dimension of the LGH-DADD descriptor, i.e., the number of classes into which the angle range of the diagonal pixels is divided when computing the histograms. Figure 4 illustrates the progression of the results in relation to the number of classes chosen. The best performance is attained for medium and large class counts, while small class counts, such as 3, lead to a decline in system performance; in that case, the identification rate drops below 31%.

4.3 Impact of the Number of Clusters

Tables 3 and 4 illustrate the changes in the Top-1 identification rate according to the number of clusters selected during the clustering step, which is essential for VLAD encoding. The number of clusters clearly influences the system's performance. For instance, on the BFL database, the scores range from 98.1%, attained by the local histogram of gradient angles formed by vertical and horizontal pixels (LGH-VH) with 256 clusters, to 100%, achieved by the local histogram of gradient angles formed by diagonal ascending and horizontal pixels (LGH-DAH). Conversely, the CERUG-CH dataset exhibits more significant variations, with identification rates ranging between 88.57% and 98.1%. This demonstrates that the system is more stable on BFL than on CERUG-CH, where it encounters greater challenges in capturing the optimal characteristics of writers' styles. Furthermore, the performance of the different histograms is distinct: while LGH-DAH excels on the BFL database, LGH-VH attains the best performance on the CERUG-CH dataset. It is also clear that several histograms achieve noteworthy results on both databases.

4.4 Comparison with the State of the Art

Table 5. Comparison with the state of the art on the BFL database

System | Method                  | Top-1
[7]    | Texture (LBP & LPQ)     | 99.2
[5]    | Implicit shape codebook | 98.3
[26]   | Document filter         | 86.1
[1]    | LBP & oBIF              | 98.6
Our    | LGH-DAH                 | 100

Table 6. Comparison with the state of the art on the CERUG-CH database

System | Method               | Top-1
[22]   | Codebook and texture | 94.20
[20]   | Texture (LBPrun)     | 93.8
[11]   | Texture (LSTP)       | 100.0
[3]    | Texture (MLBP)       | 100.0
Our    | LGH-VH               | 98.1


To assess the efficiency and robustness of our system, it is essential to compare it with the primary state-of-the-art works that have been conducted using the same databases. In this context, we provide a comparison with the key research efforts carried out on the two databases BFL and CERUG-CH in Tables 5 and 6, respectively. The analysis reveals that our approach attains scores that are comparable to the best state-of-the-art performance. This highlights the effectiveness of the proposed system in writer identification tasks and demonstrates its potential to stand alongside leading methods within the field.

5 Conclusion

In conclusion, this study introduced a novel writer identification system based on the Local Gradient Histogram (LGH) descriptor, inspired by the Histogram of Oriented Gradients (HOG) approach. The system leverages FAST keypoint detection, VLAD encoding, and Ball-Tree classification to achieve a robust and efficient identification process. Our experimental results on the CERUG-CH and BFL datasets demonstrate the effectiveness of the proposed system. The impact of different parameters, such as the number of clusters, fragment sizes, and LGH descriptor dimensions, was thoroughly analyzed, revealing optimal settings for achieving high identification rates. Comparisons with state-of-the-art works show that our approach achieves competitive performance, highlighting its potential for practical application in the writer identification domain. Future research may explore the integration of deep learning techniques or the development of more advanced descriptors to further enhance the system's performance. Additionally, testing the proposed system on more diverse and challenging datasets, and extending its application to related tasks such as handwriting recognition or writer verification, could yield valuable insights and contribute to the ongoing advancement of the field.
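As a concrete illustration of the Ball-Tree classification step mentioned above, the sketch below uses scikit-learn's BallTree to match a query encoding against a gallery of per-writer global vectors. The gallery, dimensionality, and labels are invented for the example; they are not the paper's actual data or exact matching procedure.

```python
import numpy as np
from sklearn.neighbors import BallTree

# Hypothetical gallery: one global (e.g., VLAD) vector per enrolled writer
rng = np.random.default_rng(42)
gallery = rng.normal(size=(50, 128))          # 50 writers, 128-dim encodings
labels = np.arange(50)

tree = BallTree(gallery, metric="euclidean")  # build once, query many times

# Query: a slightly perturbed copy of writer 7's encoding
query = gallery[7] + 0.01 * rng.normal(size=128)
dist, idx = tree.query(query[None, :], k=1)   # nearest neighbor in the gallery
predicted_writer = labels[idx[0, 0]]
```

Ball trees give sub-linear nearest-neighbor queries in moderate dimensions, which is why they are a natural fit for matching one query document against a large writer gallery.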

References

1. Abbas, F., Gattal, A., Djeddi, C., Siddiqi, I., Bensefia, A., Saoudi, K.: Texture feature column scheme for single- and multi-script writer identification. IET Biometrics 10(2), 179–193 (2021)
2. Abdi, M.N., Khemakhem, M.: A model-based approach to offline text-independent Arabic writer identification and verification. Pattern Recogn. 48(5), 1890–1903 (2015)
3. Bahram, T.: A texture-based approach for offline writer identification. J. King Saud Univ. Comput. Inf. Sci. 34(8), 5204–5222 (2022)
4. Bendaoud, N., Hannad, Y., Samaa, A., El Kettani, M.E.Y.: Effect of the sub-graphemes' size on the performance of off-line Arabic writer identification. In: Tabii, Y., Lazaar, M., Al Achhab, M., Enneya, N. (eds.) Big Data, Cloud and Applications: Third International Conference (BDCA 2018). CCIS, vol. 872, pp. 512–522. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-96292-4_40


5. Bennour, A., Djeddi, C., Gattal, A., Siddiqi, I., Mekhaznia, T.: Handwriting based writer recognition using implicit shape codebook. Forensic Sci. Int. 301, 91–100 (2019)
6. Bensefia, A., Paquet, T., Heutte, L.: A writer identification and verification system. Pattern Recogn. Lett. 26(13), 2080–2092 (2005)
7. Bertolini, D., Oliveira, L.S., Justino, E., Sabourin, R.: Texture-based descriptors for writer identification and verification. Expert Syst. Appl. 40(6), 2069–2080 (2013)
8. Bulacu, M., Schomaker, L.: Text-independent writer identification and verification using textural and allographic features. IEEE Trans. Pattern Anal. Mach. Intell. 29(4), 701–717 (2007)
9. Bulacu, M., Schomaker, L.: Text-independent writer identification and verification using textural and allographic features. IEEE Trans. Pattern Anal. Mach. Intell. 29(4), 701–717 (2007)
10. Chahi, A., Ruichek, Y., Touahni, R., et al.: Block wise local binary count for off-line text-independent writer identification. Expert Syst. Appl. 93, 1–14 (2018)
11. Chahi, A., Ruichek, Y., Touahni, R., et al.: Local gradient full-scale transform patterns based off-line text-independent writer identification. Appl. Soft Comput. 92, 106277 (2020)
12. Christlein, V., Bernecker, D., Maier, A., Angelopoulou, E.: Offline writer identification using convolutional neural network activation features. In: Gall, J., Gehler, P., Leibe, B. (eds.) Pattern Recognition: 37th German Conference, GCPR 2015, Aachen, pp. 540–552. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-24947-6_45
13. Christlein, V., Gropp, M., Fiel, S., Maier, A.: Unsupervised feature learning for writer identification and writer retrieval. In: 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), vol. 1, pp. 991–997. IEEE (2017)
14. Christlein, V., Maier, A.: Encoding CNN activations for writer recognition. In: 2018 13th IAPR International Workshop on Document Analysis Systems (DAS), pp. 169–174. IEEE (2018)
15. Djeddi, C., Meslati, L.S., Siddiqi, I., Ennaji, A., El Abed, H., Gattal, A.: Evaluation of texture features for offline Arabic writer identification. In: 2014 11th IAPR International Workshop on Document Analysis Systems, pp. 106–110. IEEE (2014)
16. Freitas, C., Oliveira, L.S., Sabourin, R., Bortolozzi, F.: Brazilian forensic letter database. In: 11th International Workshop on Frontiers on Handwriting Recognition, Montreal (2008)
17. Hannad, Y., Siddiqi, I., Djeddi, C., El-Kettani, M.E.Y.: Improving Arabic writer identification using score-level fusion of textural descriptors. IET Biometrics 8(3), 221–229 (2019)
18. Hannad, Y., Siddiqi, I., El Kettani, M.E.Y.: Writer identification using texture descriptors of handwritten fragments. Expert Syst. Appl. 47, 14–22 (2016)
19. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016)
20. He, S., Schomaker, L.: Writer identification using curvature-free features. Pattern Recogn. 63, 451–464 (2017)
21. He, S., Schomaker, L.: FragNet: writer identification using deep fragment networks. IEEE Trans. Inf. Forens. Secur. 15, 3013–3022 (2020)
22. He, S., Wiering, M., Schomaker, L.: Junction detection in handwritten documents and its application to writer identification. Pattern Recogn. 48(12), 4036–4048 (2015)


23. Khalifa, E., Al-Maadeed, S., Tahir, M.A., Bouridane, A., Jamshed, A.: Off-line writer identification using an ensemble of grapheme codebook features. Pattern Recogn. Lett. 59, 18–25 (2015)
24. Khan, F.A., Tahir, M.A., Khelifi, F., Bouridane, A., Almotaeryi, R.: Robust off-line text independent writer identification using bagged discrete cosine transform features. Expert Syst. Appl. 71, 404–415 (2017)
25. Lazrak, S., Semma, A., El Kaab, N.A., El Kettani, M.E.Y., Mentagui, D.: Writer identification using textural features. In: ITM Web of Conferences, vol. 43, p. 01027. EDP Sciences (2022)
26. Pinhelli, F., Britto, Jr., A.S., Oliveira, L.S., Costa, Y.M., Bertolini, D.: Single-sample writers - "document filter" and their impacts on writer identification. arXiv preprint arXiv:2005.08424 (2020)
27. Rehman, A., Naz, S., Razzak, M.I., Hameed, I.A.: Automatic visual features for writer identification: a deep learning approach. IEEE Access 7, 17149–17157 (2019)
28. Semma, A., Hannad, Y., El Kettani, M.E.Y.: Impact of the CNN patch size in the writer identification. In: Networking, Intelligent Systems and Security, pp. 103–114. Springer, Cham (2022). https://doi.org/10.1007/978-981-16-3637-0_8
29. Semma, A., Hannad, Y., Siddiqi, I., Djeddi, C., El Kettani, M.E.Y.: Writer identification using deep learning with FAST keypoints and Harris corner detector. Expert Syst. Appl. 184, 115473 (2021). https://doi.org/10.1016/j.eswa.2021.115473
30. Semma, A., Hannad, Y., Siddiqi, I., Lazrak, S., Kettani, M.E.Y.E.: Feature learning and encoding for multi-script writer identification. Int. J. Doc. Anal. Recogn. 25(2), 79–93 (2022). https://doi.org/10.1007/s10032-022-00394-8
31. Semma, A., Lazrak, S., Hannad, Y., Boukhani, M., El Kettani, Y.: Writer identification: the effect of image resizing on CNN performance. Int. Archiv. Photogram. Remote Sens. Spatial Inf. Sci. 46, 501–507 (2021)
32. Semma, A., Lazrak, S., Hannad, Y., El Kettani, M.E.Y.: Writer identification using VLAD encoding of the histogram of gradient angle distribution. E3S Web Conf. 351, 01073 (2022). EDP Sciences
33. Siddiqi, I., Vincent, N.: Writer identification in handwritten documents. In: Ninth International Conference on Document Analysis and Recognition (ICDAR 2007), vol. 1, pp. 108–112. IEEE (2007)
34. Siddiqi, I., Vincent, N.: Text independent writer recognition using redundant writing patterns with contour-based orientation and curvature features. Pattern Recogn. 43(11), 3853–3865 (2010)
35. Singh, P., Roy, P.P., Raman, B.: Writer identification using texture features: a comparative study. Comput. Electric. Eng. 71, 1–12 (2018)
36. Wu, X., Tang, Y., Bu, W.: Offline text-independent writer identification based on scale invariant feature transform. IEEE Trans. Inf. Forens. Secur. 9(3), 526–536 (2014)

Solving a Generalized Network Design Problem Using Hybrid Metaheuristics Imen Mejri1(B) , Manel Grari2 , and Safa Bhar Layeb1 1 LR-OASIS, National Engineering School of Tunis, University of Tunis El Manar,

Tunis, Tunisia
{imen.mejri,safa.layeb}@enit.utm.tn
2 National Engineering School of Tunis, University of Tunis El Manar, Tunis, Tunisia

Abstract. Metaheuristics have emerged as a practical and highly effective alternative to traditional exact methods in mixed-integer optimization. Their ability to strike a favorable balance between solution quality and computational time has made them the preferred choice for tackling complex problems and large instances. In this paper, we focus on the Generalized Discrete Cost Multicommodity Network Design Problem (GDCMNDP), a challenging network design problem. We investigate the performance of hybrid metaheuristics, specifically the Genetic Algorithm and the Non-Linear Threshold Algorithm, known for their success in diverse applications. Our proposed collaborative framework, featuring a multistage structure, harnesses the strengths of these metaheuristics. The numerical results obtained demonstrate the effectiveness of our approach in solving various test problems, highlighting its favorable performance.

Keywords: Network design problems · hybrid metaheuristics · NLTA · GA · Optimization

1 Introduction

The efficient design and optimization of network infrastructures are paramount in domains such as transportation, telecommunications, logistics, and supply chain management. However, network design problems are characterized by their combinatorial nature and high-dimensional search spaces, making the discovery of optimal solutions a challenging task. Traditional optimization methods, such as mathematical programming and exact algorithms, encounter difficulties when addressing the computational complexity and scalability issues associated with network design problems. To overcome these limitations, researchers have turned to metaheuristic algorithms as effective alternatives. Metaheuristics are iterative optimization techniques that draw inspiration from natural and social phenomena to navigate complex problem spaces intelligently [1]. A metaheuristic comprises a collection of fundamental principles that enable the creation of heuristic approaches to tackle specific optimization problems. As a result, metaheuristics are flexible and efficient, and can be adapted and tailored to handle a wide range of real-life optimization problems. Metaheuristic algorithms can be grouped into three primary categories.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024
M. Ben Ahmed et al. (Eds.): SCA 2023, LNNS 938, pp. 123–133, 2024. https://doi.org/10.1007/978-3-031-54376-0_11


1.1 Local Search Metaheuristics

Local search metaheuristics, such as simulated annealing (SA) and threshold accepting (TA), iteratively improve solutions by exploring neighboring solutions [2]. SA mimics the annealing process to escape local optima and search for global optima. It compares the current and newly selected solutions, accepting improvements and occasionally accepting non-improving solutions based on a decreasing temperature parameter. SA is widely used for discrete optimization. TA, on the other hand, accepts deteriorating solutions based on a deterministic threshold parameter. TA is simpler and avoids probabilistic computations. Both SA and TA are powerful optimization heuristics with successful applications across a wide range of problems [3].

1.2 Constructive Metaheuristics

Constructive metaheuristics are characterized by their approach of constructing solutions by systematically assembling their constituent elements, rather than solely focusing on improving complete solutions. This is achieved through an incremental process of adding one element at a time, often referred to as a move, to a partial solution. Many constructive metaheuristics are derived from greedy algorithms, which select the most favorable element at each iteration. To enhance the quality of the final solutions, it is common for constructive metaheuristics to incorporate a subsequent phase of local search after the initial construction phase [4]. Ant colony optimization (ACO) refers to a collection of interrelated constructive metaheuristics, inspired by the foraging behavior of ants. These algorithms construct solutions by simulating how ants search for resources [5].

1.3 Population-Based Metaheuristics

Population-based metaheuristics are highly effective in discovering high-quality solutions through the iterative selection and combination of existing solutions from a predefined population.
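To make the deterministic TA acceptance rule concrete, here is a generic sketch: a move is accepted whenever it worsens the cost by no more than the current threshold, and the thresholds decrease to zero. The function names, threshold schedule, and toy objective are illustrative assumptions, not the paper's NLTA.

```python
import random

def threshold_accepting(cost, neighbor, x0, thresholds, iters_per_level=100):
    """Basic Threshold Accepting: deterministic acceptance, no probabilities.

    cost/neighbor are problem-specific; thresholds is a decreasing schedule.
    """
    x, best = x0, x0
    for t in thresholds:
        for _ in range(iters_per_level):
            y = neighbor(x)
            if cost(y) - cost(x) <= t:   # accept if no worse than threshold t
                x = y
                if cost(x) < cost(best):
                    best = x
    return best

# Toy usage: minimize a 1-D integer quadratic with +/-1 moves
random.seed(1)
cost = lambda x: (x - 3) ** 2
neighbor = lambda x: x + random.choice([-1, 1])
best = threshold_accepting(cost, neighbor, x0=20, thresholds=[4, 2, 0])
```

Contrast with SA: replacing the `<= t` test by a Metropolis acceptance probability `exp(-delta / T)` with a cooling temperature T would turn this same loop into simulated annealing.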
One notable example within this category is evolutionary algorithms, which emulate the principles of natural evolution to guide the search process. Genetic Algorithms [6], Evolutionary Programming [7], and Evolutionary Strategies [8] are examples of evolutionary algorithms. However, these classes are not mutually exclusive, and numerous metaheuristic algorithms integrate concepts from different categories. These approaches, referred to as "hybrid" metaheuristics, combine ideas from various classes to enhance their problem-solving capabilities. The goal of hybridization in metaheuristics is to merge the strengths of multiple algorithms into a single hybrid algorithm, while also striving to minimize any significant drawbacks associated with the individual algorithms. There has been a recent trend to perceive metaheuristic frameworks as versatile concepts or modules that can be utilized to construct optimization methods, rather than rigid guidelines that must be strictly followed. This shift in perspective highlights the flexibility and adaptability of metaheuristics in developing customized optimization approaches [9]. Hybrid metaheuristics combining different algorithms have proven effective in various domains [10–14].
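As a minimal illustration of a population-based metaheuristic, the sketch below implements a basic binary genetic algorithm (tournament selection, one-point crossover, bit-flip mutation) on the classic OneMax toy problem. All operators and parameter values are illustrative, not those of the hybrid GA proposed in this paper.

```python
import random

def genetic_algorithm(fitness, n_bits=16, pop_size=30, generations=60,
                      p_mut=0.05, seed=0):
    """Minimal binary GA maximizing `fitness` over bit strings."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]

    def tournament():
        # Binary tournament: keep the fitter of two random individuals
        a, b = rng.sample(pop, 2)
        return a if fitness(a) >= fitness(b) else b

    for _ in range(generations):
        nxt = []
        while len(nxt) < pop_size:
            p1, p2 = tournament(), tournament()
            cut = rng.randrange(1, n_bits)          # one-point crossover
            child = p1[:cut] + p2[cut:]
            child = [1 - g if rng.random() < p_mut else g for g in child]
            nxt.append(child)
        pop = nxt
    return max(pop, key=fitness)

# Toy usage: maximize the number of ones (OneMax)
best = genetic_algorithm(lambda c: sum(c))
```

For a network design problem, the bit string would instead encode design decisions (e.g., which facility, if any, to install on each edge), and the fitness would evaluate installation costs plus penalties.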


In this paper, we explore hybrid metaheuristics, specifically the Genetic Algorithm [15] and the Non-Linear Threshold Algorithm [16], for solving a challenging variant of network design problems. These two algorithms synergistically complement each other, resulting in a mutually beneficial integration. To provide context, we present relevant works on NDPs in the next section. In Sect. 3, we introduce the problem formulation for the Generalized Discrete Cost Multicommodity Network Design Problem (GDCMNDP). Section 4 describes the proposed hybrid metaheuristic, beginning with an overview of the NLTA algorithm, and then outlines the tailored algorithm designed specifically to address the GDCMNDP. To evaluate the effectiveness of our algorithm, we conduct a computational study in Sect. 5, using benchmark and real-world instances. Finally, we provide concluding remarks in Sect. 6.

2 Related Works

The literature on network design problems (NDPs) is extensive, given their relevance and complexity. In this section, we provide an overview of the closely related NDP models, namely the MCNDP and the GDCMNDP.

2.1 The Multicommodity Capacitated Fixed-Charge Network Design Problem (MCNDP)

Extensive research has been conducted on the Multicommodity Capacitated Fixed-Charge Network Design Problem (MCNDP) [17]. Heuristics and metaheuristics have proven valuable in addressing the MCNDP, offering efficient solutions despite the problem's complexity [18]. Various heuristics, including greedy algorithms, constructive algorithms, and local search algorithms, have been explored to rapidly find feasible solutions for the MCNDP [18]. Additionally, metaheuristic algorithms like genetic algorithms [19], simulated annealing [20], and tabu search [21] have been utilized to explore the solution space and obtain high-quality solutions that approach or even reach the optimal solution. These heuristics and metaheuristics play a pivotal role in addressing the MCNDP and have garnered significant attention in the optimization community, leading to extensive research and development in this field.

2.2 The Generalized Discrete Cost Multicommodity Network Design Problem (GDCMNDP)

The Generalized Discrete Cost Multicommodity Network Design Problem (GDCMNDP) is a specific variant of Discrete Cost Multicommodity Network Design Problems [22–24]. It involves the creation of an optimal network layout that satisfies certain constraints while minimizing costs. In this problem, a network is defined by a set of nodes and edges. The edges correspond to bidirectional facilities with discrete predefined costs and capacities. The GDCMNDP focuses on the transfer of flow commodities between specified pairs of nodes. The objective is to determine the placement of facilities on the edges, ensuring that no more than one facility is implemented per edge.
This optimization aims to minimize the sum of fixed installation costs and penalties for undelivered demands.


By effectively selecting and locating facilities, the GDCMNDP aims to achieve a cost-efficient network layout that facilitates the smooth flow of commodities while meeting the specified constraints. The GDCMNDP poses a significant optimization challenge. To tackle this problem, various metaheuristic algorithms have been employed, each offering efficient solutions. The biogeography-based optimization algorithm (BBO) with a hybrid genetic algorithm (HGA) that combines the GA with a variable neighborhood search procedure [25], the Red Deer Algorithm (RDA) [26], and the Archimedes Optimization Algorithm (AOA) [27] have all been successfully applied to address the GDCMNDP. These algorithms demonstrate the ongoing efforts to find effective solutions for this complex optimization problem. Recently, a well-engineered simulation-based optimization approach was successfully employed to effectively solve a stochastic variant of the GDCMNDP, which takes into account uncertain demands [28].

3 Problem Formulation

In the context of graph theory, the GDCMNDP can be formally defined as follows. Consider a connected undirected graph G = (V, E), where V represents the set of nodes (|V| = n) and E represents the set of edges (|E| = m). The GDCMNDP involves K multicommodity flow demands denoted by ψ, each corresponding to a distinct point-to-point commodity flow. We consider K distinct commodities, denoted k (k = 1…K), each with a specific demand flow value d_k. Each commodity can be routed from a predefined source node s_k to a predefined sink node t_k, allowing for the possibility of bifurcated routing through multiple paths. The routing can be either partial or complete, depending on the specific demands and network constraints involved. For each unit of demand d_k that remains unfulfilled, a per-unit cost/penalty γ_k is assigned (k = 1…K). The number of available facilities that can be installed on each edge e is denoted by L_e. Each facility l (l = 1,…, L_e) is associated with a discrete finite bidirectional capacity u_l^e and a fixed cost f_l^e. These capacities and costs are step-increasing functions, meaning that u_1^e < u_2^e