Classification Applications with Deep Learning and Machine Learning Technologies 9783031175756, 9783031175763



English Pages [287] Year 2023


Table of contents :
Preface
Contents
Artocarpus Classification Technique Using Deep Learning Based Convolutional Neural Network
1 Introduction
2 Proposed Deep Learning
2.1 Proposed Convolutional Neural Network (CNN) Architecture
2.2 Transfer Learning Model for Artocarpus Classification
2.3 Dataset
2.4 Augmentation
3 Performance Result
3.1 Experimental Setup
3.2 Performance of Proposed CNN Model
3.3 Accuracy Comparison
3.4 Model Performance Comparison
4 Conclusion
References
Rambutan Image Classification Using Various Deep Learning Approaches
1 Introduction
2 Literature Review
3 Proposed Deep Learning Method
3.1 CNN
3.2 Transfer Learning
3.3 Dataset
4 Performance Results and Recommendation
4.1 Convolutional Neural Network (CNN)
4.2 Transfer Learning Model
5 Concluding Remarks
References
Mango Varieties Classification-Based Optimization with Transfer Learning and Deep Learning Approaches
1 Introduction
2 Methodology
2.1 Dataset
2.2 Data Preparation
2.3 Proposed CNN Architecture
2.4 Transfer Learning Model
3 Experiment Result
3.1 CNN
3.2 Transfer Learning
3.3 Xception
3.4 Accuracy Comparison
4 Conclusion
References
Salak Image Classification Method Based Deep Learning Technique Using Two Transfer Learning Models
1 Introduction
2 Dataset
2.1 Dataset Description
2.2 Dataset Preparation
3 Proposed Deep Learning
3.1 CNN
3.2 VGG16
3.3 ResNet50
4 Performance Result
4.1 Experimental Setup
4.2 Effect of Kernel Size: CNN
4.3 Effect of Pool Size: CNN
4.4 Effect of Epoch
4.5 Effect of Optimizer
4.6 Effect of Learning Rate
4.7 Effect of Dense Layer
4.8 Effect of Fine-Tuning for Pre-trained Models (VGG16 and ResNet50)
4.9 Accuracy Comparison
5 Conclusion
References
Image Processing Identification for Sapodilla Using Convolution Neural Network (CNN) and Transfer Learning Techniques
1 Introduction
2 Literature Survey
3 Proposed Deep Learning for Sapodilla Recognition
3.1 The Proposed CNN Architecture
3.2 Transfer Learning Model
3.3 Dataset
3.4 Augmentation
4 Performance Result
4.1 Experimental Setup
4.2 Performance of Proposed CNN Model
4.3 Accuracy Comparison
5 Conclusion
References
Comparison of Pre-trained and Convolutional Neural Networks for Classification of Jackfruit Artocarpus integer and Artocarpus heterophyllus
1 Introduction
2 Literature Review
3 Methodology
3.1 Dataset
3.2 Data Preprocessing and Partition
3.3 Convolutional Neural Networks
3.4 Transfer Learning
4 Result and Discussion
5 Conclusion
References
Markisa/Passion Fruit Image Classification Based Improved Deep Learning Approach Using Transfer Learning
1 Introduction
2 Literature Survey
3 Proposed CNN Architecture for Passion Food Recognition
3.1 The Proposed CNN Architectures
3.2 Transfer Learning Models
3.3 Dataset
3.4 Augmentation
4 Performance Result
4.1 Experimental Setup
4.2 Performance of Proposed CNN Model
4.3 Performance of Proposed Transfer Learning Model
4.4 Accuracy Comparison
5 Conclusion
Appendix
References
Enhanced MapReduce Performance for the Distributed Parallel Computing: Application of the Big Data
1 Introduction
2 Background
2.1 Big Data (BD)
2.2 Hadoop
2.3 Apriori Algorithm
3 Related Work
4 Methodology (Prescriptive Study)
4.1 Hadoop Architecture
4.2 MR Programming Model
4.3 Apriori Algorithm
5 Result and Discussion (Proposed Framework)
6 Conclusion
References
A Novel Big Data Classification Technique for Healthcare Application Using Support Vector Machine, Random Forest and J48
1 Introduction
2 Literature Review
3 Methodology
4 The Proposed Method
5 Experiments and Results
6 Conclusion
References
Comparative Study on Arabic Text Classification: Challenges and Opportunities
1 Introduction
2 Literature Review
3 Background
4 Literature Review Results and Discussion
5 Results and Discussion
6 Conclusions and Future Work
References
Pedestrian Speed Prediction Using Feed Forward Neural Network
1 Introduction
2 Material and Method
2.1 Data Collection Location
2.2 Data Capturing and Extraction
2.3 Data Preparation
2.4 Sensitivity Analysis
2.5 ANN Model Formulation
2.6 ANN Model Validations
3 Results Analysis and Discussion
3.1 Descriptive of Observed Pedestrian Data.
3.2 Speed Characteristic and Distribution Results
3.3 Result of Sensitivity Analysis
3.4 Model Estimation Analysis Results
4 Conclusion
References
Arabic Text Classification Using Modified Artificial Bee Colony Algorithm for Sentiment Analysis: The Case of Jordanian Dialect
Abstract
1 Introduction
2 Related Works
2.1 Introduction
3 The Proposed Method
3.1 Introduction
3.2 Data Preparation
3.3 Data Annotation
3.4 Preprocessing
3.4.1 Tokenization
3.4.2 Text Pre-processing
3.4.3 Stemming
3.4.4 Text to Numeric Data Representation
3.4.5 Most Affective Jordanian Words
3.5 Modified Artificial Bee Colony Algorithm with Upper Confidence Bound Algorithm
3.5.1 The Original Artificial Bee Colony Algorithm
3.5.2 Enhancing Artificial Bee Algorithm with Upper Confidence Bound
3.5.3 Obtain the Number of Feature Selection Using the Modified ABC-UBC
3.6 Feature Selection
3.7 The Text Classification
3.7.1 Support Vector Machines Classifier (SVM)
3.7.2 K-Nearest Neighbors Classifier (KNN)
3.7.3 Naïve- Bayes Classifier
3.7.4 Polynomial Neural Networks Classifier
4 Results
4.1 Results Information
4.2 The Jordanian Dialect Dataset Experiments
4.2.1 The Result of Arabic Text Classifiers with Pre-processing Phase
4.2.2 The Result of Arabic Text Classifiers Without Pre-Processing Phase
4.2.3 The Result of Arabic Text Using Forward Feature Selection with ABC-UBC and Pre-Processing Phase
4.2.4 The Result of Arabic Text Using Forward Feature Selection with ABC-UBC Without Pre-Processing Phase
4.3 The Algerian Dialect Dataset Experiments
4.3.1 The Result of Arabic Text Classifiers with Pre-processing Phase
4.3.2 The Result of Arabic Text Classifiers Without Pre-processing Phase
4.3.3 The Result of Arabic Text Using Forward Feature Selection with ABC-UBC and Pre-processing Phase
4.3.4 The Result of Arabic Text Using Forward Feature Selection with ABC-UBC Without Pre-processing Phase
4.4 Experimental Results and Discussion
4.5 Experimental Results and Discussion
5 Conclusion
References

Studies in Computational Intelligence 1071

Laith Abualigah   Editor

Classification Applications with Deep Learning and Machine Learning Technologies

Studies in Computational Intelligence Volume 1071

Series Editor Janusz Kacprzyk, Polish Academy of Sciences, Warsaw, Poland

The series “Studies in Computational Intelligence” (SCI) publishes new developments and advances in the various areas of computational intelligence—quickly and with a high quality. The intent is to cover the theory, applications, and design methods of computational intelligence, as embedded in the fields of engineering, computer science, physics and life sciences, as well as the methodologies behind them. The series contains monographs, lecture notes and edited volumes in computational intelligence spanning the areas of neural networks, connectionist systems, genetic algorithms, evolutionary computation, artificial intelligence, cellular automata, self-organizing systems, soft computing, fuzzy systems, and hybrid intelligent systems. Of particular value to both the contributors and the readership are the short publication timeframe and the world-wide distribution, which enable both wide and rapid dissemination of research output. Indexed by SCOPUS, DBLP, WTI Frankfurt eG, zbMATH, SCImago. All books published in the series are submitted for consideration in Web of Science.


Editor Laith Abualigah Hourani Center for Applied Scientific Research Al-Ahliyya Amman University Amman, Jordan Faculty of Information Technology Middle East University Amman, Jordan School of Computer Sciences Universiti Sains Malaysia George Town, Pulau Pinang, Malaysia

ISSN 1860-949X ISSN 1860-9503 (electronic) Studies in Computational Intelligence ISBN 978-3-031-17575-6 ISBN 978-3-031-17576-3 (eBook) https://doi.org/10.1007/978-3-031-17576-3 © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Springer imprint is published by the registered company Springer Nature Switzerland AG The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland

Preface

Deep learning and machine learning classification approaches have grown considerably and now address many real-world problems, such as Artocarpus classification, rambutan classification, mango variety classification, salak classification, image-processing identification of sapodilla with transfer learning techniques, classification of the jackfruit species Artocarpus integer and Artocarpus heterophyllus, markisa/passion fruit classification, big data classification, and Arabic text classification. Deep learning and machine learning have become indispensable technologies, and this is the era of artificial intelligence. These techniques find their marks in data analysis, text mining, classification problems, computer vision, image analysis, pattern recognition, medicine, etc. There is a continuous flow of data, so it is impossible to manage and analyze these data manually. The outcome depends on the processing of high-dimensional data, most of which is irregular and unordered and is present in various forms such as text, images, videos, audio, and graphics. Fruit image recognition systems are used to classify different types of fruits and to differentiate variants of a single fruit type. Rambutan is an exotic fruit found mainly in Southeast Asia and is prevalent in Malaysia. It comes in different varieties, or cultivars, which look alike to the naked eye. Hence, an image recognition system powered by deep learning methods can be applied to classify rambutan cultivars accurately. Currently, mango cultivars are sorted and classified manually by observing features such as size, skin color, shape, sweetness, and flesh color. Experienced taxonomy experts can generally identify different species, but most people cannot easily distinguish these mangoes. Nowadays, society is advancing in science and technology.
Many technologies could solve this problem and make it easy for people to distinguish the cultivars. The solution we propose is computer vision. Artificial intelligence trains computers to interpret and understand the visual world, such as images and video. Deep learning, also known as deep neural networks or deep neural learning, is used to process data and create patterns for decision making by imitating the human brain. It uses interconnected nodes within a hierarchical neural network to analyze the incoming data. Image recognition is one of the most popular deep learning applications and helps many fields, especially fruit agriculture, to classify fruit. This book intends to bring together researchers and developers from academia and industry worldwide who work in the broad areas of deep learning and machine learning, for a community-wide discussion of ideas that will influence and foster continued research in this field to better humanity. The book emphasizes technology-based solutions that make the classification process more efficient. It also provides deep insight into classification techniques by capturing information from the given chapters.

Amman, Jordan

Laith Abualigah

Contents

Artocarpus Classification Technique Using Deep Learning Based Convolutional Neural Network
Lee Zhi Pen, Kong Xian Xian, Ching Fum Yew, Ong Swee Hau, Putra Sumari, Laith Abualigah, Absalom E. Ezugwu, Mohammad Al Shinwan, Faiza Gul, and Ala Mughaid

Rambutan Image Classification Using Various Deep Learning Approaches
Nur Alia Anuar, Loganathan Muniandy, Khairul Adli Bin Jaafar, Yi Lim, Al Lami Lamyaa Sabeeh, Putra Sumari, Laith Abualigah, Mohamed Abd Elaziz, Anas Ratib Alsoud, and Ahmad MohdAziz Hussein

Mango Varieties Classification-Based Optimization with Transfer Learning and Deep Learning Approaches
Chen Ke, Ng Tee Weng, Yifan Yang, Zhang Ming Yang, Putra Sumari, Laith Abualigah, Salah Kamel, Mohsen Ahmadi, Mohammed A. A. Al-Qaness, Agostino Forestiero, and Anas Ratib Alsoud

Salak Image Classification Method Based Deep Learning Technique Using Two Transfer Learning Models
Lau Wei Theng, Moo Mei San, Ong Zhi Cheng, Wong Wei Shen, Putra Sumari, Laith Abualigah, Raed Abu Zitar, Davut Izci, Mehdi Jamei, and Shadi Al-Zu’bi

Image Processing Identification for Sapodilla Using Convolution Neural Network (CNN) and Transfer Learning Techniques
Ali Khazalah, Boppana Prasanthi, Dheniesh Thomas, Nishathinee Vello, Suhanya Jayaprakasam, Putra Sumari, Laith Abualigah, Absalom E. Ezugwu, Essam Said Hanandeh, and Nima Khodadadi

Comparison of Pre-trained and Convolutional Neural Networks for Classification of Jackfruit Artocarpus integer and Artocarpus heterophyllus
Song-Quan Ong, Gomesh Nair, Ragheed Duraid Al Dabbagh, Nur Farihah Aminuddin, Putra Sumari, Laith Abualigah, Heming Jia, Shubham Mahajan, Abdelazim G. Hussien, and Diaa Salama Abd Elminaam

Markisa/Passion Fruit Image Classification Based Improved Deep Learning Approach Using Transfer Learning
Ahmed Abdo, Chin Jun Hong, Lee Meng Kuan, Maisarah Mohamed Pauzi, Putra Sumari, Laith Abualigah, Raed Abu Zitar, and Diego Oliva

Enhanced MapReduce Performance for the Distributed Parallel Computing: Application of the Big Data
Nathier Milhem, Laith Abualigah, Mohammad H. Nadimi-Shahraki, Heming Jia, Absalom E. Ezugwu, and Abdelazim G. Hussien

A Novel Big Data Classification Technique for Healthcare Application Using Support Vector Machine, Random Forest and J48
Hitham Al-Manaseer, Laith Abualigah, Anas Ratib Alsoud, Raed Abu Zitar, Absalom E. Ezugwu, and Heming Jia

Comparative Study on Arabic Text Classification: Challenges and Opportunities
Mohammed K. Bani Melhem, Laith Abualigah, Raed Abu Zitar, Abdelazim G. Hussien, and Diego Oliva

Pedestrian Speed Prediction Using Feed Forward Neural Network
Abubakar Dayyabu, Hashim Mohammed Alhassan, and Laith Abualigah

Arabic Text Classification Using Modified Artificial Bee Colony Algorithm for Sentiment Analysis: The Case of Jordanian Dialect
Abdallah Habeeb, Mohammed A. Otair, Laith Abualigah, Anas Ratib Alsoud, Diaa Salama Abd Elminaam, Raed Abu Zitar, Absalom E. Ezugwu, and Heming Jia

Artocarpus Classification Technique Using Deep Learning Based Convolutional Neural Network

Lee Zhi Pen, Kong Xian Xian, Ching Fum Yew, Ong Swee Hau, Putra Sumari, Laith Abualigah, Absalom E. Ezugwu, Mohammad Al Shinwan, Faiza Gul, and Ala Mughaid

Abstract There are many species of Artocarpus fruits in Malaysia, and they have different market potentials. This study classifies 4 species of Artocarpus fruits using a deep learning approach, the Convolutional Neural Network (CNN). A newly proposed CNN model is compared with the pre-trained models VGG-16, ResNet50, and Xception. The effects of several variables on the proposed model, namely hidden layers, perceptrons, filter number, optimizers, and learning rate, are also investigated. The best performing model in this study is the newly proposed model with 2 CNN layers (12 and 96 filters) and 6 dense layers with 147 perceptrons, which achieves an accuracy of 87%.

Keywords Deep learning · Transfer learning · Convolutional neural network · Fruit classification · Artocarpus

L. Z. Pen · K. Xian Xian · C. F. Yew · O. S. Hau · P. Sumari · L. Abualigah (B)
School of Computer Sciences, Universiti Sains Malaysia, 11800 George Town, Pulau Pinang, Malaysia
e-mail: [email protected]

L. Abualigah
Hourani Center for Applied Scientific Research, Al-Ahliyya Amman University, Amman 11328, Jordan
Faculty of Information Technology, Middle East University, Amman 11831, Jordan

A. E. Ezugwu
School of Computer Science, University of KwaZulu-Natal, Pietermaritzburg Campus, Pietermaritzburg 3201, South Africa

M. A. Shinwan
Faculty of Information Technology, Applied Science Private University, Amman 11931, Jordan

F. Gul
Department of Electrical Engineering, Air University, Aerospace and Aviation Campus, Kamra, Attock 43600, Pakistan

A. Mughaid
Department of Information Technology, Faculty of Prince Al-Hussien Bin Abdullah for IT, The Hashemite University, PO Box 330127, Zarqa 13133, Jordan

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023
L. Abualigah (ed.), Classification Applications with Deep Learning and Machine Learning Technologies, Studies in Computational Intelligence 1071, https://doi.org/10.1007/978-3-031-17576-3_1


1 Introduction

Agricultural fields face the challenge of labour costs, and automated agricultural systems are in demand to overcome it [1]. Computer vision technology has contributed to automation; for example, weed removal robots use real-time weed recognition to remove weeds from crop fields, reducing both labour and chemical costs [2, 3]. Fruit harvesting can harness this technology to enhance the industry’s profitability, and fruit recognition is a crucial part of the solution [4]. Multiple works have addressed fruit recognition with machine learning, but only a few cover Malaysian fruits.

Previous works on fruit recognition or classification have used both conventional machine learning and deep learning approaches. By extracting fruit color and fruit shape as features via specialized computing modules, a fruit recognition system using KNN achieved accuracies ranging from 30 to 90%, even though the fruit types were highly distinct from one another [5]. This wide range of accuracies raises doubts about the system's capabilities, and optimizing the feature-extraction modules for various fruit types would be time-consuming and costly. Another study using conventional machine learning was done on the Supermarket Produce data set, which is very well documented with minimal noise. Although a Support Vector Machine model scored high accuracy on it, the generalization of such a model to a complicated, real harvesting environment remains questionable. A few studies using deep learning approaches also obtained high accuracy (>90%) with well-documented datasets, while researchers are still investigating the effects of noise on the generalization of neural networks.

This study uses a deep learning approach to recognize four species of Artocarpus fruit in Malaysia: breadfruit (Artocarpus altilis), Keledang (Artocarpus lanceifolius), Nangka (Artocarpus heterophyllus), and Tarap (Artocarpus odoratissimus).

2 Proposed Deep Learning

2.1 Proposed Convolutional Neural Network (CNN) Architecture

Figure 1 shows our proposed CNN architecture. It consists of two convolutional layers, two max pooling layers, one flatten layer, six dense layers, and one output layer [6, 7]. The hyperparameters are shown in Fig. 2. The first convolutional layer has 12 filters, a kernel size of 3, and the ReLU activation function, and is followed by a max pooling layer with a pool size of 2. Its output is fed into the second convolutional layer, which has 96 filters, a kernel size of 3, and the ReLU activation function, followed by a second max pooling layer with a pool size of 2. The convolutional layers summarize the presence of detected features in the input image, and the max pooling layers reduce the dimensions of the feature maps so that fewer parameters need to be trained. The output is then flattened by a flatten layer and passed to six dense layers with 147 perceptrons. These dense layers identify the features in the input data and help the output layer generate a correct output. Before the output layer, a dropout layer with a rate of 0.3 is applied. Lastly, the output layer uses the softmax activation function to produce one of 4 class labels: breadfruit (Artocarpus altilis), Keledang (Artocarpus lanceifolius), Nangka (Artocarpus heterophyllus), and Tarap (Artocarpus odoratissimus).

Fig. 1 The proposed CNN architecture

Fig. 2 The hyperparameter of proposed CNN architecture
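The layer sequence described in this section can be checked with some simple shape and parameter bookkeeping. The sketch below is not the authors' code: it assumes a 224 × 224 × 3 input, stride-1 convolutions with "same" padding by default, and 147 perceptrons in each of the six dense layers, none of which the text states explicitly. The function names are illustrative only.

```python
def conv2d(shape, filters, kernel=3, padding="same"):
    """Return (output_shape, parameter_count) for a stride-1 2D convolution."""
    h, w, c = shape
    if padding == "valid":
        h, w = h - kernel + 1, w - kernel + 1
    params = (kernel * kernel * c + 1) * filters  # weights plus one bias per filter
    return (h, w, filters), params


def max_pool(shape, size=2):
    """Return the output shape of non-overlapping max pooling."""
    h, w, c = shape
    return (h // size, w // size, c)


def dense(n_in, units):
    """Return (units, parameter_count) for a fully connected layer."""
    return units, (n_in + 1) * units


def trace_proposed_cnn(input_shape=(224, 224, 3), padding="same"):
    """Trace the flatten size and trainable parameters of the described network."""
    total = 0
    shape = input_shape
    for filters in (12, 96):              # two convolution + max pooling stages
        shape, p = conv2d(shape, filters, kernel=3, padding=padding)
        total += p
        shape = max_pool(shape, size=2)
    flat = shape[0] * shape[1] * shape[2]  # flatten layer
    n = flat
    for _ in range(6):                     # six dense layers, 147 units each (assumed)
        n, p = dense(n, 147)
        total += p
    # The dropout layer (rate 0.3) has no trainable parameters.
    _, p = dense(n, 4)                     # softmax output layer, 4 classes
    total += p
    return {"flatten_units": flat, "total_params": total}


summary = trace_proposed_cnn()
print(summary)
```

Under these assumptions almost all trainable parameters sit in the first dense layer, which is one reason the number of perceptrons is worth tuning.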


2.2 Transfer Learning Model for Artocarpus Classification

Transfer learning carries over what a model has learnt in a previous application to a new application, which in our case is Artocarpus classification [8–12]. Models that have been trained on a different application are called pre-trained models. For our study, we selected three pre-trained models: VGG16, ResNet50, and Xception. Other optimization methods that can be used to optimize such problems are given in [13–18].

VGG16

VGG16 was proposed by Karen Simonyan and Andrew Zisserman in a paper published at the International Conference on Learning Representations in 2015 [19]. The model achieved 90.1% top-5 accuracy on the ImageNet validation dataset; ImageNet as a whole consists of over 14 million images. The architecture of VGG16 is shown in Fig. 3. A number of different configurations and fine-tuning settings were tried to identify the best performing model for our Artocarpus image classification. For VGG16, the highest accuracy achieved was 81.50%, using 4096 perceptrons, freezing the whole model except the top layer, and running it with a new classifier with 2 dense layers, as shown in Fig. 4. Figure 5 shows the VGG16 transfer model with all layers frozen except the top layer and a new classifier with 2 dense layers and 4096 perceptrons.

Fig. 3 VGG16 architecture


Fig. 4 Performance of VGG16 transfer model on Artocarpus image classification

ResNet50

ResNet50 is a variant of the residual network that consists of 48 convolution layers, 1 max pooling layer, and 1 average pooling layer. The residual architecture makes it possible to train very deep networks (hundreds to thousands of layers) while maintaining high performance, which earlier models could not achieve at such depths. ResNet50 achieved 92.1% top-5 accuracy on the ImageNet validation dataset. Figure 6 shows the ResNet50 architecture. For our Artocarpus image classification using ResNet50, the highest accuracy was achieved by freezing all layers except the top layer and running the model with a new classifier with 2 dense layers. The first dense layer uses 1024 perceptrons and the second uses 4096 perceptrons. This configuration achieved 86% accuracy on our Artocarpus image classification. Figure 7 shows the performance of the ResNet50 transfer model on Artocarpus image classification. Figure 8 shows the ResNet50 transfer model with all layers frozen except the top layer and a new classifier with 2 dense layers, with 1024 perceptrons followed by 4096 perceptrons.

Xception

Xception is a deep convolutional neural network developed by Francois Chollet at Google Inc. Figure 9 shows the Xception architecture. The name stands for Extreme Inception; the model is based on the Inception model, with the Inception modules replaced by depthwise separable convolutions. Xception achieved 94.5% top-5 accuracy on the ImageNet validation dataset. Figure 10 shows the performance of the Xception transfer model on Artocarpus image classification. Figure 11 shows the Xception transfer model with all layers frozen except the top layer and a new classifier with 3 dense layers, each with 4096 perceptrons. For the Artocarpus image classification using Xception, the best performing model only managed to achieve 66.50% accuracy. It was achieved by freezing all layers and adding a new classifier with 3 dense layers, each with 4096 perceptrons.


Fig. 5 VGG16 transfer model with freeze all except top layer, new classifier with 2 dense layers and 4096 perceptron


Fig. 6 ResNet50 architecture

Fig. 7 Performance of ResNet50 transfer model on Artocarpus image classification

Fig. 8 ResNet50 transfer model with freeze all except top layer, new classifier with 2 dense layers with 1024 perceptron followed by 4096 perceptron


Fig. 9 Xception architecture

Fig. 10 Performance of Xception transfer model on Artocarpus image classification

Summary on Transfer Learning Models

Figure 12 shows the performance of VGG16, ResNet50, and Xception on Artocarpus image classification. At a glance, all three models perform poorly when run as the original pre-trained model with the original classifier and all layers frozen. This is probably due to the lack of Artocarpus images in the ImageNet dataset on which they were pre-trained. Secondly, the VGG16 and Xception models tend to perform better with a higher perceptron count. This characteristic is less pronounced for ResNet50, since all of its configurations perform reasonably well. Another characteristic seen in all three models is that at a lower perceptron count (150), increasing the number of dense layers reduces the accuracy, whereas at a higher perceptron count (4096), increasing the number of dense layers increases the accuracy. In addition, ResNet50 is the most suitable and best performing transfer model for Artocarpus image classification, because it maintains reasonably good accuracy (>70%) across all configurations tested. VGG16 comes second, with around half of its configurations performing reasonably well, whereas Xception does not exceed 70% on any of the configurations tested. We therefore conclude that the Xception model is not suitable for Artocarpus image classification. However, it is worth noting that all three models could still achieve much higher accuracy if the number of epochs were increased. To conclude, ResNet50 is the best transfer model for Artocarpus image classification compared with VGG16 and Xception.

Fig. 11 Xception transfer model with freeze all except top layer, new classifier with 3 dense layers each with 4096 perceptrons

Fig. 12 Performance of VGG16, ResNet50 and Xception on Artocarpus image classification


Fig. 13 Sample images of Artocarpus dataset

2.3 Dataset

The Artocarpus genus consists of approximately 50 species of trees, which are mainly restricted to Southeast Asia [20]. For our study, we focused on four edible fruit species: (1) Artocarpus altilis, (2) Artocarpus lanceifolius, (3) Artocarpus heterophyllus, and (4) Artocarpus odoratissimus. The dataset consists of 1000 images in total, with 250 images per species. The images are resized to 224 × 224 pixels. The dataset was then split into an 80% training set and a 20% test set. Sample images can be seen in Fig. 13.
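As a rough illustration of the 80/20 split, the following sketch shuffles a list of image names with a fixed seed and cuts it into a training and a test part. The file names, the seed, and the helper name split_dataset are hypothetical; the chapter does not say how the split was randomized or whether it was stratified by species.

```python
import random

# Hypothetical labels and file names, used only to obtain 4 x 250 = 1000 items.
SPECIES = ["altilis", "lanceifolius", "heterophyllus", "odoratissimus"]
images = [f"{name}_{i:03d}.jpg" for name in SPECIES for i in range(250)]


def split_dataset(items, test_fraction=0.2, seed=0):
    """Shuffle a copy of items and return (train, test) lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded for repeatability
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


train, test = split_dataset(images)
print(len(train), len(test))
```

A plain shuffle like this does not guarantee exactly 50 test images per species; a per-class split would be needed for that.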

2.4 Augmentation

The images were augmented with a 90° rotation to increase accuracy and train the model better. The code and sample images can be seen in Fig. 14.
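The authors' augmentation code is only available as Fig. 14, so the following is an independent minimal sketch of the idea. It treats an image as a list of pixel rows and rotates it clockwise; the rotation direction and the function names are assumptions, since the text only says "90° rotation".

```python
def rotate_90_clockwise(image):
    """Rotate a 2D image (a list of rows) by 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def augment_with_rotation(images):
    """Return the original images followed by one rotated copy of each."""
    return list(images) + [rotate_90_clockwise(img) for img in images]


tiny = [[1, 2],
        [3, 4]]
augmented = augment_with_rotation([tiny])
print(augmented)
```

Adding one rotated copy per image doubles the dataset, which matches the step from 1000 to 2000 images mentioned in the experimental setup.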

3 Performance Result

3.1 Experimental Setup

The original dataset consists of 1000 images in 4 classes: breadfruit (Artocarpus altilis), Keledang (Artocarpus lanceifolius), Nangka (Artocarpus heterophyllus), and Tarap (Artocarpus odoratissimus). Each class has 250 images, which have already been preprocessed to 224 × 224 pixels with 3 channels. We use the Python programming language with the Keras and TensorFlow libraries in a Jupyter notebook to build our program. First, we load all the images and perform data augmentation by rotating every image by 90°. Then, we feed all 2000 images into the Keras library function “image_dataset_from_directory()” to convert the data to the format supported by the TensorFlow library. The dataset is further split into an 80% training set and a 20% test set. Next, we perform hyperparameter optimization, covering the number of hidden layers (dense layers and CNN layers), the number of perceptrons, the number of filters, the optimizer, the number of epochs, and the learning rate. To reduce the time needed to try different combinations of hyperparameters, we tune each hyperparameter separately, keeping all other hyperparameters fixed while one is being tuned. Once that hyperparameter reaches its optimum, we proceed to the next one. The hyperparameter optimization workflow and the hyperparameters used are detailed in Fig. 15 and Table 1.

Fig. 14 Augmented images using 90° rotation
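The one-hyperparameter-at-a-time procedure described above can be sketched as a simple loop. This is an illustration rather than the authors' implementation: toy_evaluate is a hypothetical stand-in for "train the model and return its test accuracy", its peak values are arbitrary, and none of its scores are results from the study.

```python
def tune_one_at_a_time(search_space, defaults, evaluate):
    """Tune each hyperparameter in turn while all the others stay fixed."""
    best = dict(defaults)
    for name, candidates in search_space.items():
        scores = {}
        for value in candidates:
            trial = dict(best, **{name: value})  # change only one hyperparameter
            scores[value] = evaluate(trial)
        best[name] = max(scores, key=scores.get)  # keep the optimum, then move on
    return best


def toy_evaluate(config):
    """Hypothetical score with an arbitrary peak; not data from the chapter."""
    return -abs(config["dense_layers"] - 3) - abs(config["perceptrons"] - 588) / 100


search_space = {
    "dense_layers": [1, 2, 3, 4, 5, 6, 7],
    "perceptrons": [9408, 4704, 2352, 1176, 588, 294, 147],
}
defaults = {"dense_layers": 1, "perceptrons": 9408}
best = tune_one_at_a_time(search_space, defaults, toy_evaluate)
print(best)
```

This strategy trains one model per candidate value (a sum over hyperparameters) instead of one per combination (a product), at the cost of possibly missing interactions between hyperparameters.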

3.2 Performance of Proposed CNN Model

In this section, we discuss the effect of the hidden layers, perceptrons, filter number, optimizer, number of epochs, and learning rate on the performance of our model. We then identify the best hyperparameters for our proposed CNN model and compare its accuracy with the transfer learning performance of the VGG16 and Xception models.


Fig. 15 The hyperparameter optimization workflow

3.2.1 Effect of Hidden Layers (Convolutional Layers and Dense Layers)

Hyperparameter tuning is first done on the hidden layers, which are the convolutional layers and dense layers. The performance of the convolutional neural network was found to be greatly affected by the number of hidden layers. Figure 16 shows the accuracy of the model when different combinations of convolutional layers and dense layers are used to build it. The convolutional layers are tested with 2, 3, 4, and 5 layers, while the dense layers are tested with 1, 2, 3, 4, 5, 6, and 7 layers. Different combinations are tested, such as 2 convolutional layers with 1 dense layer, 2 convolutional layers with 2 dense layers, 5 convolutional layers with 6 dense layers, and 5 convolutional layers with 7 dense layers. The best result was obtained with 2 convolutional layers and 5 dense layers, giving an accuracy of 76%.

Table 1 The hyperparameter utilized in optimization and its values

No. | Hyperparameter explanation | Hidden layer | Number of perceptron | Number of filter | Number of epochs | Optimizer | Learning rate
1 | Effect of hidden layer (CNN layer and dense layer) | Tuning | Same as the number of perceptron after flattening | 3 | 15 | Loss = ‘sparse_categorical_crossentropy’, optimizer = ‘adam’ | 0.01 (default)
2 | Effect of number of perceptron | Optimum | Tuning | 3 | 15 | Loss = ‘sparse_categorical_crossentropy’, optimizer = ‘adam’ | 0.01 (default)
3 | Effect of filter number | Optimum | Optimum | Tuning | 15 | Loss = ‘sparse_categorical_crossentropy’, optimizer = ‘adam’ | 0.01 (default)
4 | Effect of optimizers | Optimum | Optimum | Optimum | 15 | Tuning | 0.01 (default)
5 | Effect of number of epochs | Optimum | Optimum | Optimum | Tuning | Optimum | 0.01 (default)
6 | Effect of learning rate | Optimum | Optimum | Optimum | Optimum | Optimum | Tuning

Fig. 16 Accuracy results from different combinations of convolutional layers and dense layers

3.2.2 Effect of Perceptrons

The next hyperparameter tested is the number of perceptrons. Using the best model obtained in Sect. 3.2.1, the number of perceptrons in the dense layers is decreased to observe its effect on the model’s accuracy. Originally, the number of perceptrons is 9408, matching the output size of the flatten layer. This number is then divided by 2, 4, 8, 16, 32, 64 and 128 in turn. Figure 17 shows how the accuracy of the model varies with the number of perceptrons. The model achieved its highest accuracy, 81%, when the number of perceptrons in the dense layers was reduced by a factor of 64, to 147 perceptrons.
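The perceptron counts implied by these divisions can be checked with a short calculation. This is a sketch only: the names are illustrative, and the use of floor division is an assumption, since the text does not say how the non-integer result for a divisor of 128 was rounded.

```python
# Output size of the flatten layer, as stated in the text.
FLATTEN_UNITS = 9408

# Divisors applied to the original number of perceptrons.
DIVISORS = [2, 4, 8, 16, 32, 64, 128]

# Floor division is an assumption; 9408 / 128 is not an integer.
dense_units = {divisor: FLATTEN_UNITS // divisor for divisor in DIVISORS}

for divisor, units in dense_units.items():
    print(f"9408 / {divisor:>3} -> {units} perceptrons")
```

A divisor of 64 gives exactly the 147 perceptrons reported as the best setting.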

3.2.3 Effect of Filter Number

The number of filters in the convolutional layers was tested with 3, 6, 12, 24, 48, 96 and 192 filters. According to Fig. 18, when every convolutional layer uses the same number of filters, increasing that number from 3 to 192 decreases the accuracy from 81 to 54%. Different combinations of filter numbers across the convolutional layers were also tested, for example 3 filters in the first convolutional layer and 24 filters in the second; the results are shown in Fig. 19. The highest accuracy, 85%, is obtained when the first convolutional layer uses 12 filters and the second uses 96 filters. Based on these results, using different filter numbers in the convolutional layers achieved higher accuracy than using the same filter number in every layer.

Fig. 17 Accuracy of model with different number of perceptrons

Fig. 18 Accuracy of the model when convolutional layers use same filter numbers

3.2.4 Effect of Optimizers

Eight optimizers were compared: Adam, Adagrad, RMSprop, SGD, Adadelta, Adamax, Nadam and Ftrl. An optimizer minimizes the loss of the neural network by updating its weight parameters. According to Fig. 20, the best optimizer for the model is Adagrad, which achieved an accuracy of 86.3%. The other optimizers achieved accuracies of 86% (Adamax), 82% (Nadam), 79% (RMSprop), 77% (Adam), 65% (Adadelta), 37% (Ftrl) and 28% (SGD).

Fig. 19 Accuracy of the model when convolutional layers use different filter numbers

Fig. 20 Accuracy of model using different types of optimizers

Based on Fig. 21, Adam reaches its own highest accuracy faster than the other optimizers, achieving 78% at 7 epochs. The other optimizers only reached their own highest accuracy after 13 epochs; Adagrad reached its highest accuracy at 14 epochs. Adamax and Adagrad had fairly consistent accuracy after 4 epochs. RMSprop reached 71% accuracy after 1 epoch, but its accuracy was not consistent: across the epochs run, its highest accuracy was 85% and its lowest was 60%, with a final accuracy of 79%.

Fig. 21 Accuracy of different optimizers in each epoch

3.2.5 Effect of Learning Rate

The learning rate is one of the important hyperparameters in training the CNN model. The learning rates tested in this project are 0.1, 0.01, 0.001, 0.0001, 0.00001 and 0.000001. The model reached its highest accuracy of 87% at a learning rate of 0.001. Figure 22 shows that the accuracy improves from 23 to 87% as the learning rate decreases from 0.1 to 0.001. However, the accuracy drops to 42% at a learning rate of 0.0001, rises to 63% at 0.00001 and decreases again at 0.000001. We therefore conclude that 0.001 is the optimum learning rate for the CNN model. Based on Fig. 23, the number of epochs may need to be adjusted for the other learning rates to reach higher accuracy.

Fig. 22 Accuracy of the model with different learning rates

Fig. 23 Accuracy of the model with different learning rates in each epoch

3.3 Accuracy Comparison

The accuracies of the pre-trained and proposed models are shown in Table 2, where bold font marks the best result. The model with the best performance is our proposed model, with an accuracy of 87.00%. It is followed by ResNet50 (freeze all with new classifier, 1024 then 4096 perceptrons, 2 dense layers) and VGG16 (freeze all with new classifier, 4096 perceptrons, 2 dense layers), with accuracies of 86.00% and 81.50% respectively. These models have similar accuracy and did not improve when we tried other combinations of hyperparameters. This may be due to Bayes error in our dataset, that is, images with almost identical features but different targets. This is plausible because almost all of our images contain a large number of green pixels yet carry different labels, which makes them difficult to learn and introduces Bayes error, which is irreducible. Thus, our model may have achieved its optimum performance. The pre-trained models with the “freeze all” setting do not achieve high accuracy, ranging from 22.00 to 30.00%. This is because the pre-trained model is complex and requires more epochs to converge to the optimum accuracy.

3.4 Model Performance Comparison

The accuracy of the pretrained models and the proposed model over 15 consecutive epochs is shown in Fig. 24. The proposed model has the highest accuracy in the first epoch. Its accuracy then increases sharply and reaches its maximum at the tenth epoch, after which it settles in the range of 80–87%. For ResNet50 (freeze all with new classifier, 1024 then 4096 perceptrons, 2 dense layers) and VGG16 (freeze all with new classifier, 4096 perceptrons, 2 dense layers), the accuracy increases gradually from the first epoch to the fifteenth epoch, but the increase is smaller than that of the proposed model. This means that our proposed model needs fewer training epochs to reach its optimum accuracy, which is also higher than that of these models. The other pretrained models do not improve significantly between the first and the fifteenth epoch, although they still show an upward trend.

Table 2 Accuracy of pre-trained, proposed model and its hyperparameter

Model | Hyperparameter | Accuracy (%)
VGG-16 | Freeze all | 30.00
VGG-16 | Freeze all with new classifier, 4096 perceptrons, 2 dense layers | 81.50
ResNet50 | Freeze all | 23.00
ResNet50 | Freeze all with new classifier, 1024 then 4096 perceptrons, 2 dense layers | 86.00
Xception | Freeze all | 22.00
Xception | Freeze all with new classifier, 4096 perceptrons, 3 dense layers | 66.50
Proposed model | 2 CNN layers (12, 96 filters) and 6 dense layers with 147 perceptrons | 87.00

Fig. 24 Accuracy of the pretrained model and proposed model in each epoch (plot title: Pretrained Model and Proposed Model Accuracy in Each Epoch; x-axis: epoch, 1–15; y-axis: accuracy, 0–1; series: VGG-16 (freeze all), VGG-16 (freeze all with new classifier), ResNet50 (freeze all), ResNet50 (freeze all with new classifier), Xception (freeze all), Xception (freeze all with new classifier), proposed model)

4 Conclusion

In conclusion, the best performing model is our proposed model, which achieves a prediction accuracy of 87% with an architecture of 2 CNN layers (12 and 96 filters) and 6 dense layers with 147 perceptrons. It also needs fewer training epochs than the pretrained models to reach its optimum accuracy.

References

1. Araújo, S. O., Peres, R. S., Barata, J., Lidon, F., & Ramalho, J. C. (2021). Characterising the agriculture 4.0 landscape—Emerging trends, challenges and opportunities. Agronomy, 11(4), 667.
2. Fennimore, S. A., Slaughter, D. C., Siemens, M. C., Leon, R. G., & Saber, M. N. (2016). Technology for automation of weed control in specialty crops. Weed Technology, 30(4), 823–837.
3. Jamei, M., Karbasi, M., Malik, A., Abualigah, L., Islam, A. R. M. T., & Yaseen, Z. M. (2022). Computational assessment of groundwater salinity distribution within coastal multi-aquifers of Bangladesh. Scientific Reports, 12(1), 1–28.
4. Sarig, Y. (1993). Robotics of fruit harvesting: A state-of-the-art review. Journal of Agricultural Engineering Research, 54(4), 265–280.
5. Sa, I., Ge, Z., Dayoub, F., Upcroft, B., Perez, T., & McCool, C. (2016). Deepfruits: A fruit detection system using deep neural networks. Sensors, 16(8), 1222.
6. Daradkeh, M., Abualigah, L., Atalla, S., & Mansoor, W. (2022). Scientometric analysis and classification of research using convolutional neural networks: A case study in data science and analytics. Electronics, 11(13), 2066.
7. AlShourbaji, I., Kachare, P., Zogaan, W., Muhammad, L. J., & Abualigah, L. (2022). Learning features using an optimized artificial neural network for breast cancer diagnosis. SN Computer Science, 3(3), 1–8.
8. ud Din, A. F., Mir, I., Gul, F., Mir, S., Saeed, N., Althobaiti, T., Abbas, S. M., & Abualigah, L. (2022). Deep reinforcement learning for integrated non-linear control of autonomous UAVs. Processes, 10(7), 1307.
9. Alkhatib, K., Khazaleh, H., Alkhazaleh, H. A., Alsoud, A. R., & Abualigah, L. (2022). A new stock price forecasting method using active deep learning approach. Journal of Open Innovation: Technology, Market, and Complexity, 8(2), 96.
10. Shehab, M., Abualigah, L., Shambour, Q., Abu-Hashem, M. A., Shambour, M. K. Y., Alsalibi, A. I., & Gandomi, A. H. (2022). Machine learning in medical applications: A review of state-of-the-art methods. Computers in Biology and Medicine, 145, 105458.
11. Ezugwu, A. E., Ikotun, A. M., Oyelade, O. O., Abualigah, L., Agushaka, J. O., Eke, C. I., & Akinyelu, A. A. (2022). A comprehensive survey of clustering algorithms: State-of-the-art machine learning applications, taxonomy, challenges, and future research prospects. Engineering Applications of Artificial Intelligence, 110, 104743.
12. Wu, D., Wang, S., Liu, Q., Abualigah, L., & Jia, H. (2022). An improved teaching-learning-based optimization algorithm with reinforcement learning strategy for solving optimization problems. Computational Intelligence and Neuroscience.
13. Abualigah, L., Diabat, A., Mirjalili, S., Abd Elaziz, M., & Gandomi, A. H. (2021). The arithmetic optimization algorithm. Computer Methods in Applied Mechanics and Engineering, 376, 113609.
14. Abualigah, L., Yousri, D., Abd Elaziz, M., Ewees, A. A., Al-Qaness, M. A., & Gandomi, A. H. (2021). Aquila optimizer: A novel meta-heuristic optimization algorithm. Computers and Industrial Engineering, 157, 107250.
15. Abualigah, L., Abd Elaziz, M., Sumari, P., Geem, Z. W., & Gandomi, A. H. (2022). Reptile search algorithm (RSA): A nature-inspired meta-heuristic optimizer. Expert Systems with Applications, 191, 116158.
16. Agushaka, J. O., Ezugwu, A. E., & Abualigah, L. (2022). Dwarf mongoose optimization algorithm. Computer Methods in Applied Mechanics and Engineering, 391, 114570.
17. Oyelade, O. N., Ezugwu, A. E. S., Mohamed, T. I., & Abualigah, L. (2022). Ebola optimization search algorithm: A new nature-inspired metaheuristic optimization algorithm. IEEE Access, 10, 16150–16177.
18. Ezugwu, A. E., Agushaka, J. O., Abualigah, L., Mirjalili, S., & Gandomi, A. H. (2022). Prairie dog optimization algorithm. Neural Computing and Applications, 1–49.
19. Hong, S., Noh, H., & Han, B. (2015). Decoupled deep neural network for semi-supervised semantic segmentation. Advances in Neural Information Processing Systems, 28.
20. Jagtap, U. B., & Bapat, V. A. (2010). Artocarpus: A review of its traditional uses, phytochemistry and pharmacology. Journal of Ethnopharmacology, 129(2), 142–166.

Rambutan Image Classification Using Various Deep Learning Approaches

Nur Alia Anuar, Loganathan Muniandy, Khairul Adli Bin Jaafar, Yi Lim, Al Lami Lamyaa Sabeeh, Putra Sumari, Laith Abualigah, Mohamed Abd Elaziz, Anas Ratib Alsoud, and Ahmad MohdAziz Hussein

Abstract Rambutan (Nephelium lappaceum L.) is a widely grown and favored fruit in tropical countries such as Malaysia, Indonesia, Thailand, and the Philippines. The fruit is classified into tens of cultivars based on fruit, flesh, and tree features. In this project, models for classifying five rambutan cultivars were developed using deep learning techniques on a dataset of 1000 rambutan images. Two common deep learning approaches to image classification, the Convolutional Neural Network (CNN) and transfer learning, were applied to recognize each rambutan variant. The results show that the VGG16 pre-trained model performed best, achieving 96% accuracy on the test dataset, which indicates that the model is reliable for the rambutan classification task.

Keywords Deep learning · Convolutional neural networks · Fruit classification · Rambutan · ResNet · VGG

N. A. Anuar · L. Muniandy · K. A. B. Jaafar · Y. Lim · A. L. L. Sabeeh · P. Sumari · L. Abualigah (B)
School of Computer Sciences, Universiti Sains Malaysia, 11800 George Town, Pulau Pinang, Malaysia
e-mail: [email protected]

L. Abualigah · A. R. Alsoud
Hourani Center for Applied Scientific Research, Al-Ahliyya Amman University, Amman 19328, Jordan

L. Abualigah
Faculty of Information Technology, Middle East University, Amman 11831, Jordan

M. A. Elaziz
Faculty of Computer Science and Engineering, Galala University, Al Galala City, Egypt
Artificial Intelligence Research Center (AIRC), Ajman University, 346 Ajman, United Arab Emirates
Department of Mathematics, Faculty of Science, Zagazig University, Zagazig 44519, Egypt
School of Computer Science and Robotics, Tomsk Polytechnic University, Tomsk, Russia

A. M. Hussein
Deanship of E-Learning and Distance Education, Umm Al-Qura University, Makkah 21955, Saudi Arabia

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023
L. Abualigah (ed.), Classification Applications with Deep Learning and Machine Learning Technologies, Studies in Computational Intelligence 1071, https://doi.org/10.1007/978-3-031-17576-3_2

1 Introduction

Computer vision is a subfield of Artificial Intelligence (AI) concerned with “teaching” machines to understand and interpret the visual world, such as digital images or videos. The rise of big data, faster and cheaper computing resources, and new algorithms have contributed to the widespread adoption of this domain. Image classification, one of the computer vision approaches, is applied in various fields including technology, medicine, manufacturing, and agriculture. In agriculture, automated fruit image recognition can assist in quality control and in the development of robotic systems for harvesting from orchards [1]. Fruit image recognition systems are used to classify different types of fruits and to differentiate variants of a single fruit type [2, 3]. Rambutan is an exotic fruit found mainly in the Southeast Asian region and is particularly popular in Malaysia. It comes in different varieties or cultivars such as Binjai, Gading, Gula Batu, Jarum Mas, and Rongrien [4]. These cultivars look alike to the naked eye. Hence, an image recognition system powered by deep learning methods can be applied to classify rambutan cultivars accurately [5–11]. The Convolutional Neural Network (CNN) algorithm consistently shows remarkable performance on image classification tasks on image databases including the MNIST database, the NORB database, and the CIFAR10 dataset [12]. Besides CNN, transfer learning is among the popular methods used by researchers for image classification. Transfer learning uses a pre-trained model, that is, a network trained on a huge dataset that has achieved state-of-the-art performance. In this paper, we study rambutan cultivar classification using deep learning models, namely CNN and transfer learning.

2 Literature Review

Deep learning enables a computer model to learn and perform classification tasks directly from various types of data such as images, text, or audio [13–16]. It achieves a high accuracy rate when models are trained on a huge amount of labeled data with neural network architectures that contain many layers [17]. The relevant features are learned while the network trains on a collection of data. This feature extraction during training makes deep learning models highly accurate for computer vision tasks such as object classification. Deep learning has become one of the core technologies for critical artificial intelligence applications, including medical diagnosis and screening for various types of cancer [18]. Most recently, image classification was used for Covid-19 screening with chest X-ray and CT images of patients [19]. Deep learning achieves tremendous performance in many applications including fruit classification. There are research works on fruit classification with different goals and applications [20], one of which is agriculture. However, deep learning has the drawback of requiring exceptionally high processing power due to its massive number of parameters, which can easily reach the millions. Hence the need for a lightweight deep learning architecture that speeds up the diagnosis without sacrificing accuracy.

In this section, we review several previous attempts to use neural networks and deep learning for fruit recognition. On the topic of detecting fruits from images using deep neural networks, paper [21] presents a network trained to recognize fruits. The researchers appear to adapt a Faster Region-based convolutional network. The objective is to create a neural network for use by autonomous robots that can harvest fruits. The network is trained using RGB and NIR (near infra-red) images. The RGB and NIR models are combined in two separate ways, named early and late fusion. The result is a multi-modal network that obtains much better performance than the existing networks. Another paper [22] uses two backpropagation neural networks trained on images of apple trees of the “Gala” variety to predict the yield for the upcoming season. For this task, four features are extracted from the images: the total cross-sectional area of fruits, the fruit number, the total cross-sectional area of small fruits, and the cross-sectional area of foliage. It was found that the deep learning methods were highly useful for classifying the fruits effectively. Some other optimization methods can be used to optimize the problems, as given in [23–28].

3 Proposed Deep Learning Method

In this paper, we use several deep learning models, namely convolutional neural networks (CNN), residual networks (ResNet) and VGG16.

3.1 CNN

A convolutional neural network (CNN) is a particular type of feed-forward neural network that is widely used for image recognition. A CNN processes each portion of the input image, known as a receptive field, and assigns weights to each neuron according to how significant its receptive field is, so that the importance of neurons can be discriminated from one another. The architecture of a CNN consists of three types of layers: (1) convolution, (2) pooling, and (3) fully connected, as shown in Fig. 1. The convolution operation applies multiple filters to extract features from the images; the result is known as a feature map. In this way, the corresponding spatial information from the dataset is preserved. The pooling operation, also called subsampling, reduces the dimensionality of the feature maps produced by the convolution operation. A pooling layer is added after the convolutional layer, specifically after a nonlinearity has been applied to the feature maps output by the convolutional layer. Max pooling and average pooling are the most common pooling operations, while ReLU is the common choice of activation function for propagating gradients during training by backpropagation.

Fig. 1 Basic architecture of CNN

In our work, we propose a CNN model for classifying five rambutan types: Binjai, Gading, Gula Batu, Jarum Mas, and Rongrien. The model consists of four convolutional layers. The first convolutional layer uses 32 filters with a filter size of 3 × 3 and a kernel regularizer of 0.001. The regularizer adds penalties on the layer during optimization; these penalties are included in the loss function that the network optimizes. Padding is used to ensure that the input and output tensors keep the same shape. The input image size is 224 × 224 × 3. Batch normalization is applied to the output of each convolution, before the activation. ReLU, the rectified linear activation function, is used as the activation function at every convolution; it ensures that the output is either positive or zero. The output of each convolutional layer is passed to a max-pooling layer with a pool size of 2 × 2. This layer reduces the number of parameters by down-sampling, which reduces the memory and time required for computation, so that only the features required for classification are aggregated. Dropout rates of 0.3, 0.2, and 0.1 are applied respectively, starting from the second convolutional layer. This aims to reduce the model complexity to prevent overfitting and to reduce the computation power and time at each convolution. The second convolutional layer uses 64 filters with a 2 × 2 kernel size, the third uses 128 filters with a 2 × 2 kernel size, and the fourth uses 256 filters with a 2 × 2 kernel size. Finally, the feature map of the fourth convolution is flattened and passed to a fully connected part with 4 dense layers and a dropout of 0.5, ending with a SoftMax classifier. The loss function is categorical cross-entropy and the optimizer is Adam with a learning rate of 0.001. The architecture of the proposed CNN model is shown in Figs. 2, 3, and 4. Figure 5 shows the expected classification output from the model.
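To make the ReLU and 2 × 2 max-pooling operations described in this section concrete, the following minimal sketch applies them to a small feature map in plain Python. The 4 × 4 values and the function names are made up for illustration; this is not the authors' Keras implementation.

```python
def relu(feature_map):
    """Replace every negative value with zero; keep positive values."""
    return [[max(0.0, value) for value in row] for row in feature_map]


def max_pool_2x2(feature_map):
    """Down-sample by keeping the maximum of each non-overlapping 2 x 2 window."""
    pooled = []
    for r in range(0, len(feature_map) - 1, 2):
        pooled_row = []
        for c in range(0, len(feature_map[0]) - 1, 2):
            window = [
                feature_map[r][c], feature_map[r][c + 1],
                feature_map[r + 1][c], feature_map[r + 1][c + 1],
            ]
            pooled_row.append(max(window))
        pooled.append(pooled_row)
    return pooled


# Hypothetical 4 x 4 feature map produced by one convolution filter.
feature_map = [
    [1.0, -2.0, 3.0, 0.5],
    [-1.0, 4.0, -3.0, 2.0],
    [0.0, -5.0, -1.0, -2.0],
    [2.0, 1.0, -4.0, -0.5],
]

activated = relu(feature_map)      # negative responses become 0
pooled = max_pool_2x2(activated)   # 4 x 4 map shrinks to 2 x 2
print(pooled)
```

Each pooling step halves the height and width of the feature map, which is the source of the memory and time savings mentioned above.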

Fig. 2 Proposed CNN architecture

Fig. 3 Building the proposed model

3.2 Transfer Learning

3.2.1 ResNet

Residual networks (ResNet) were developed by the Microsoft Research team for image recognition tasks using deep residual learning. This architecture secured 1st place in the ILSVRC 2015 classification task. Deep residual learning was developed to address the degradation problem that occurs as the number of stacked layers (depth) increases. Despite being considerably deeper than VGG nets, the networks have lower complexity [29]. The models were trained on over 1.28 million images and evaluated on 50,000 validation images. ResNet was constructed with five convolutional blocks in 18-, 34-, 50-, 101-, and 152-layer variants. We propose applying the ResNet-50 and ResNet-101 pre-trained models to our rambutan type classification task using the Keras library. Either all or some of the ResNet convolutional blocks are frozen, to study the effect of a fully frozen versus a partially frozen pre-trained model. A new classifier consisting of two dense layers with 256 neuron units per layer and a SoftMax activation function is added. The Adaptive Moment Estimation (Adam) optimizer is used to compute the optimum weights of the classifier layers with different learning rates. The categorical cross-entropy loss function is used to calculate the loss between predictions and actual labels. Figure 6 shows the architecture of the ResNet model.

Fig. 4 Model’s summary

Feature Extraction and Model Training for ResNet and VGG:

1. Load the pre-trained model, specifying “include_top = False” and the shape of the image data.
2. Extract convolved visual features by passing the image data through the pre-trained layers.
3. The resulting feature stack is three-dimensional and must be flattened before the classifier can use it for prediction.
4. Create a fully connected layer and use it in conjunction with the pre-trained layers. Initialize this fully connected layer with random weights, which are updated during training (Figs. 7 and 8).
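Step 3 above, flattening the three-dimensional feature stack, can be illustrated without any deep learning library. The sketch below uses a tiny made-up 2 × 2 × 3 stack and hypothetical helper names; the 7 × 7 × 512 shape is the output of the VGG16 convolutional base for a 224 × 224 input.

```python
def flatten(feature_stack):
    """Turn a height x width x channels nested list into one flat vector."""
    return [value for row in feature_stack for pixel in row for value in pixel]


def flattened_size(height, width, channels):
    """Length of the vector the classifier receives after flattening."""
    return height * width * channels


# Hypothetical 2 x 2 x 3 feature stack (height x width x channels).
feature_stack = [
    [[1, 2, 3], [4, 5, 6]],
    [[7, 8, 9], [10, 11, 12]],
]

flat = flatten(feature_stack)
print(len(flat))  # 2 * 2 * 3 values

# VGG16 without its top layers maps a 224 x 224 image to a 7 x 7 x 512 stack.
print(flattened_size(7, 7, 512))
```

The flattened length determines the input size of the first dense layer of the new classifier, which is why the convolutional base and the classifier must be defined together.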

Fig. 5 Snapshot of a part of rambutan classifications output from the CNN model

Fig. 6 A 34-layer ResNet architecture

Fig. 7 Setting up the ResNet model

Fig. 8 Resnet model summary

3.2.2 VGG

Transfer learning is the reuse of a pre-trained model on a new problem. Its popularity in deep learning comes from its ability to train deep neural networks with comparatively little data. This is very useful, since most real-world problems do not have millions of labeled data points with which to train such complex models [30]. To reiterate, in transfer learning the knowledge of an already trained machine learning model is applied to a different but related problem. With transfer learning, we try to exploit what has been learned in one task to improve generalization in another: we transfer the weights that a network has learned on “task A” to a new “task B” [31]. VGG16 is one of the pre-trained models used for transfer learning. The model achieves 92.7% top-5 test accuracy on ImageNet, a dataset of over 14 million images belonging to 1000 classes [32]. It was one of the famous models submitted to ILSVRC-2014. Instead of using large kernels (11 × 11 and 5 × 5 in the first and second layers), VGG16 improved upon AlexNet by opting for a smaller kernel size of 3 × 3. The VGG16 architecture accepts a fixed input size of 224 × 224 RGB images and has a total of 138 million parameters. It comprises 5 blocks of convolution layers, each followed by a max-pool layer, and ends with three fully connected layers of 4096, 4096, and 1000 neurons respectively. The last fully connected layer is the SoftMax layer for classification. After every convolutional layer, a non-linear operation is performed by a ReLU activation function. Every block contains at least two and at most three convolution layers, and the number of convolution filters increases in powers of two from 64 to 512 [33]. Figure 9 shows the architecture of VGG16.

Fig. 9 VGG16 architecture

After loading the VGG16 pre-trained model, all layers were frozen except for the last 5, because the last few layers represent higher-level combinations of lower-level features and we want to train these layers to suit our problem (Fig. 10). A sequential model was then created by adding the VGG convolutional base model and some fully connected layers: a Flatten layer and 3 Dense layers with 1024, 1024, and 5 units respectively. The first 2 Dense layers use the ReLU activation function. After the second Dense layer, a Dropout layer with a rate of 0.5 is added to minimize overfitting. The final Dense layer is the classification layer with a SoftMax activation function. The model uses the Adam optimizer with a learning rate of 0.0001. The model summary is shown in Fig. 11.
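The "freeze all layers except the last 5" step can be sketched with a small stand-in for the convolutional base. Everything here is an assumption made for illustration: the `Layer` class is a toy substitute for a framework layer object with a `trainable` flag, the layer names only mimic the block/conv/pool naming of VGG16, and the per-block counts (2, 2, 3, 3, 3) follow the description of five blocks with two or three convolutions each. It does not load any pre-trained weights.

```python
from dataclasses import dataclass


@dataclass
class Layer:
    """Toy stand-in for a framework layer that carries a trainable flag."""
    name: str
    trainable: bool = True


def build_vgg16_like_base():
    """List the layers of a VGG16-style base: 5 blocks, each ending in a pool."""
    convs_per_block = [2, 2, 3, 3, 3]  # assumed from the architecture description
    layers = []
    for block, n_convs in enumerate(convs_per_block, start=1):
        layers += [Layer(f"block{block}_conv{i}") for i in range(1, n_convs + 1)]
        layers.append(Layer(f"block{block}_pool"))
    return layers


def freeze_all_but_last(layers, n_trainable):
    """Freeze every layer except the final n_trainable ones."""
    cutoff = len(layers) - n_trainable
    for index, layer in enumerate(layers):
        layer.trainable = index >= cutoff


base = build_vgg16_like_base()
freeze_all_but_last(base, 5)

trainable_names = [layer.name for layer in base if layer.trainable]
print(trainable_names)
```

Under these assumptions, the trainable part is the whole fifth block plus the pooling layer that closes the fourth block, so the lower-level feature extractors keep their pre-trained weights while the highest-level features adapt to the rambutan images.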

3.3 Dataset

There are various types of rambutan. However, we collected images of only five types (Gading, Binjai, Gula Batu, Jarum Mas, Rongrien), with 200 images per label, so the dataset used in this work contains 1000 images in total. All images are resized to 224 × 224 pixels and split into 80% training, 10% validation, and 10% testing sets. Figure 12 shows the types of rambutan available.
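The 80/10/10 split of the 1000 images can be sketched as follows. The file names, the fixed seed and the simple shuffled (non-stratified) split are all assumptions made for illustration; the text does not say how the authors partitioned the data.

```python
import random


def split_dataset(items, train_frac=0.8, val_frac=0.1, seed=42):
    """Shuffle the items and cut them into train, validation and test lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded for a repeatable split
    n_train = round(len(shuffled) * train_frac)
    n_val = round(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]      # the remainder is the test set
    return train, val, test


# Hypothetical file names: 5 cultivars x 200 images = 1000 images.
cultivars = ["gading", "binjai", "gula_batu", "jarum_mas", "rongrien"]
images = [f"{name}_{i:03d}.jpg" for name in cultivars for i in range(200)]

train, val, test = split_dataset(images)
print(len(train), len(val), len(test))
```

A stratified split, which keeps 200 × 0.8 = 160 training images per cultivar, would be a reasonable alternative when class balance in each subset matters.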

Fig. 10 Building the VGG16 model

Fig. 11 VGG16 model summary

Fig. 12 Types of rambutan (R-156Y is a variety code for Gading cultivar)

4 Performance Results and Recommendation

4.1 Convolutional Neural Network (CNN)

A basic Convolutional Neural Network (CNN) was established as a baseline against which to compare the more complex transfer learning models, ResNet and VGG16. The basic CNN model is built on 4 convolution layers whose number of filters doubles after each convolution. Max pooling was used in the model. The architecture of the convolution layers is shown in Fig. 2. Three dense layers were used, with the number of neurons halved at each layer (256 neurons > 128 neurons > 64 neurons), before the output passed to an output layer with 5 neurons, one for each of the 5 classes of rambutan to be predicted. The batch size was set at 128 samples per batch for 100 epochs. Nevertheless, an early stopping mechanism was used to ensure that training ends at the lowest loss possible. The ReLU activation function was used for all layers except the final output layer for class prediction, which used the SoftMax activation function. With these parameters, the overall accuracy of the model was 79% on the test set. The model was trained until the 40th epoch, where it achieved its lowest loss. Table 1 shows the breakdown of the F1-score for each class of rambutan for this basic model.

Table 1 Segmentation of F1-score for CNN

Rambutan class | Rambutan photo | F1-score (%)
Binjai | (photo) | 77
Gading | (photo) | 99
Gula Batu | (photo) | 82
Jarum Mas | (photo) | 74
Rongrien | (photo) | 68

Gading rambutan has the highest classification score at 99% and Rongrien has the lowest score at 59%. Among the 5 classes of rambutan, Gading has a distinctive yellow color while the others are red. This feature is well extracted by the model as the defining feature of Gading. The distinguishing features of the other 4 rambutans, on the other hand, may overlap, giving lower classification performance. Looking further into Rongrien (the lowest performer), its recall is also significantly lower than that of the other classes, at only 47%. This means that the false-negative rate for Rongrien is high, i.e., Rongrien is commonly mistaken for other classes of rambutan.

With the baseline model in place, we then varied the training parameters, namely the batch size, the number of epochs, and the number of convolution layers, to observe the model’s performance. The parameters were changed one at a time, with the rest fixed as in the baseline model. The results are shown in Tables 2, 3 and 4. For the convolution layers, the maximum is 6 layers, beyond which max pooling produces a negative dimension; hence the number of layers was varied slightly below and above the baseline (2 to 6 layers).
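The link between recall, false negatives and the F1-score discussed above can be made explicit with a small calculation. The counts below are invented purely for illustration and are not taken from the experiments; the function name is likewise hypothetical.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall and F1 for one class from raw counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Made-up counts for one class: 8 correct detections, 2 false alarms,
# and 8 images of the class that were assigned to other classes.
precision, recall, f1 = precision_recall_f1(tp=8, fp=2, fn=8)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Because F1 is the harmonic mean of precision and recall, a class whose images are often assigned to other classes (many false negatives, hence low recall) ends up with a low F1-score even when its precision is high.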

Table 2 Performance results for different batch sizes (epochs: early stopping; 4 convolution layers)

| Batch size | Overall accuracy, % |
|---|---|
| 32 | 79 |
| 64 | 74 |
| 256 | 77 |

Table 3 Performance results for different numbers of epochs (batch size: 128 samples/batch; 4 convolution layers)

Epochs: 30, 90, 120
Overall accuracy, %: 79, 81, 80, 79

Table 4 Performance results for different numbers of convolution layers (batch size: 128 samples/batch; epochs: early stopping)

Convolution layers: 3, 6, 2, 5
Overall accuracy, %: 77, 65, 80, 80

Additional values listed with Tables 3 and 4: 77, 100, 60

Table 5 Performance result for combining the best parameters from each test

| Convolution layers | Batch size | Epochs | Overall accuracy, % |
|---|---|---|---|
| 5 | 32 | 60 | 20 |
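The limit on the number of convolution layers mentioned in Sect. 4.1 follows from how each convolution and pooling step shrinks the feature map. The sketch below is an illustration under stated assumptions, not the authors' code: it assumes a 224 × 224 input, 3 × 3 convolutions without padding, and 2 × 2 max pooling after every convolution, none of which are confirmed for the rambutan model in this section. The function name is hypothetical.

```python
def max_conv_pool_blocks(input_size: int, kernel: int = 3, pool: int = 2):
    """Count how many (conv -> max pool) blocks fit before the map vanishes.

    Assumes unpadded convolutions with stride 1 and non-overlapping pooling.
    Returns the number of blocks and the feature-map side after each block.
    """
    size = input_size
    sizes = []
    while True:
        after_conv = size - kernel + 1      # unpadded convolution
        if after_conv < pool:               # pooling window no longer fits
            break
        size = after_conv // pool           # non-overlapping max pooling
        sizes.append(size)
    return len(sizes), sizes


blocks, sizes = max_conv_pool_blocks(224)
```

With these assumed settings the feature map shrinks to a single pixel after the sixth block, so a seventh block would request a negative dimension, which is consistent with the six-layer limit reported in the text.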

Combining the best-performing value of each parameter gives the result in Table 5. The overall performance is much worse than that of the baseline model when all the best parameters are combined. Inspection of the per-class results shows that predictions were made for only a single type of rambutan. The main contributor is the small batch size: changing the batch size back to its baseline value of 128 gives an accuracy of 77%, which is still lower than that of the baseline model. Hence, the baseline model with 4 convolutional layers, a batch size of 128 and early stopping gave the highest accuracy, at 79%.


4.2 Transfer Learning Model

We used two different transfer learning models, as discussed in the previous section: VGG16 and ResNet. For both transfer learning models, we unfroze some of the layers for training.

4.2.1 ResNet

Two parameters were tested for the ResNet models, namely batch size and learning rate. Three batch sizes were tested: 32, 64, and 128. Table 6 summarizes the model accuracy for each batch size. One interesting observation is that unfreezing some layers improved the model’s performance, and this effect is more noticeable than the effect of batch size. Within each model, changing the batch size does not significantly change the accuracy, except for ResNet101, whose accuracy jumped from 20 to 77% when the batch size increased from 32 to 64. Doubling the batch size again to 128, however, does not bring any further significant improvement. As for the partially frozen models, the unfrozen layers can extract and learn the distinctive features of our dataset, which improved their performance.

Table 6 Performance result for different batch sizes using different models (learning rate set at 0.001)

| Model | Batch size | Accuracy, % |
|---|---|---|
| ResNet50 | 32 | 38.9 |
| ResNet50 | 64 | 40 |
| ResNet50 | 128 | 40 |
| ResNet50 (partially frozen) | 32 | 66 |
| ResNet50 (partially frozen) | 64 | 67 |
| ResNet50 (partially frozen) | 128 | 71 |
| ResNet101 | 32 | 20 |
| ResNet101 | 64 | 77 |
| ResNet101 | 128 | 78 |
| ResNet101 (partially frozen) | 32 | 76 |
| ResNet101 (partially frozen) | 64 | 80 |
| ResNet101 (partially frozen) | 128 | 80 |

For the learning rate, we used two lower learning rates (0.01, 0.05) and two higher learning rates (0.1, 0.5). Table 7 summarizes the model performance. For three models (ResNet50, ResNet50 partially frozen, and ResNet101), the observed trend is that accuracy increased with the learning rate before reaching a plateau. 50 epochs with early stopping were used for all model training, so with a lower learning rate the model may still be far from the lowest-loss solution when training stops, whereas with a higher learning rate it may be closer to the optimum when training ends, either by reaching the final epoch or because the loss has stopped improving. On the other hand, increasing the learning rate for the partially frozen ResNet101 model caused the performance to drop slightly, for the opposite reason of overshooting the optimum (Fig. 13).

Table 7 Performance result for different learning rates used (batch size 64)

| Model | Learning rate | Accuracy, % |
|---|---|---|
| ResNet50 | 0.01 | 81 |
| ResNet50 | 0.05 | 85 |
| ResNet50 | 0.1 | 85 |
| ResNet50 | 0.5 | 85 |
| ResNet50 (partially frozen) | 0.01 | 58 |
| ResNet50 (partially frozen) | 0.05 | 71 |
| ResNet50 (partially frozen) | 0.1 | 75 |
| ResNet50 (partially frozen) | 0.5 | 75 |
| ResNet101 | 0.01 | 67 |
| ResNet101 | 0.05 | 83 |
| ResNet101 | 0.1 | 82 |
| ResNet101 | 0.5 | 83 |
| ResNet101 (partially frozen) | 0.01 | 84 |
| ResNet101 (partially frozen) | 0.05 | 82 |
| ResNet101 (partially frozen) | 0.1 | 82 |
| ResNet101 (partially frozen) | 0.5 | 82 |
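The learning-rate argument above (too small a rate leaves the model far from the optimum within a fixed epoch budget, too large a rate overshoots) can be illustrated on a toy problem. The sketch below is not the ResNet training loop: it runs plain gradient descent on the one-dimensional loss f(w) = w², and the function name, step budget, and learning rates are chosen purely for illustration.

```python
def gradient_descent(lr: float, steps: int = 50, w0: float = 1.0) -> float:
    """Minimise f(w) = w**2 with a fixed step budget; return the final w."""
    w = w0
    for _ in range(steps):
        grad = 2.0 * w          # derivative of w**2
        w -= lr * grad
    return w


too_small = gradient_descent(lr=0.001)   # barely moves from w0 in 50 steps
reasonable = gradient_descent(lr=0.1)    # reaches the minimum region
too_large = gradient_descent(lr=1.1)     # each step overshoots and diverges
```

On this toy loss the three runs end far from, close to, and beyond the minimum at w = 0 respectively, mirroring the qualitative explanation given for Table 7.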

Fig. 13 Accuracy and loss for best ResNet model

4.2.2 VGG16

The VGG16 model was tested with different architectures, batch sizes, numbers of epochs, and optimizers. The batch sizes used for VGG16 training are 100, 128, and 256. As mentioned previously, all the layers are frozen except for the last few. The model’s performance differs across batch sizes and architectures. Table 8 shows the VGG16 performance summary. Model 2 performed exceptionally well compared with the other models, with 96% accuracy. It was trained for 125 epochs with a batch size of 128 and the Adam optimizer. The models with the same architecture and the Adam optimizer achieved a validation accuracy of 89% with a batch size of 256 (Model 1) and 91% with a batch size of 100 (Model 3). Model 4, with the SGD optimizer, achieved a validation accuracy of 87% for a batch size of 256. Model 5, with the RMSprop optimizer, also performed well, with a validation accuracy of 95%. Model 6 and Model 7 used the same architecture and the Adam optimizer but different batch sizes: Model 6, with a batch size of 256, achieved a validation accuracy of 87%, whereas Model 7, with a batch size of 128, achieved 94%. The performance of the model therefore improves when the batch size decreases from 256 to 128; with the Adam optimizer and two dense layers, this change improves the validation accuracy from 89 to 96%. Compared with the other optimizers, RMSprop achieved a good validation accuracy of 95% for a batch size of 256. Increasing the number of dense layers in the architecture does not bring any significant improvement in model performance. The training history and performance metrics of the best model are shown in Fig. 14 (see also Fig. 15 and Table 9).

The overall accuracy of the best model was 96% on the validation set. The model was trained for 125 epochs with a batch size of 128 and the Adam optimizer. The F1-score for each class of rambutan for this model is given in Table 10. Gading has the highest F1-score at 100% and Binjai has the lowest at 92%. As discussed previously, Gading has a distinctive yellow color while the others are red, which the model extracts well. The features of the other 4 rambutans, on the other hand, may overlap, which gives lower classification performance than for Gading. Nevertheless, the model is still able to classify each type of rambutan with high accuracy compared with the other models discussed previously. Based on this highest accuracy, we recommend VGG16 as the classifier for the listed rambutan types.

Table 8 VGG16 performance summary

| Model (VGG16) | Optimizer | Batch size | Epochs | Fully connected layer | Training accuracy % | Testing accuracy % |
|---|---|---|---|---|---|---|
| Model 1 | Adam | 256 | 150 | Flatten + 2 dense layers with filter sizes (1024, 1024) + dropout (0.5) + output layer | 94.5 | 89 |
| Model 2 | Adam | 128 | 125 | Flatten + 2 dense layers with filter sizes (1024, 1024) + dropout (0.5) + output layer | 97 | 96 |
| Model 3 | Adam | 100 | 100 | Flatten + 2 dense layers with filter sizes (1024, 1024) + dropout (0.5) + output layer | 98.5 | 91 |
| Model 4 | SGD | 256 | 125 | Flatten + 3 dense layers with filter sizes (4098, 1024, 512) + output layer | 93.6 | 87 |
| Model 5 | RMSprop | 256 | 100 | Flatten + 2 dense layers with filter sizes (1024, 1024) + dropout (0.5) + output layer | 97.75 | 95 |
| Model 6 | Adam | 256 | 125 | Flatten + 3 dense layers with filter sizes (4098, 1024, 512) + dropout (0.5) + output layer | 96.5 | 87 |
| Model 7 | Adam | 128 | 125 | Flatten + 3 dense layers with filter sizes (4098, 1024, 512) + dropout (0.5) + output layer | 98.87 | 94 |

Fig. 14 VGG16 best model accuracy and loss

Fig. 15 Best model confusion matrix

Table 9 VGG16 best model parameters (Model 2)

| Parameter | Value |
|---|---|
| Optimizer | Adam |
| Batch size | 128 |
| Epochs | 125 |
| Learning rate | 0.0001 |
| Training accuracy | 0.9737 |
| Training loss | 0.0790 |
| Validation accuracy | 0.9600 |
| Validation loss | 0.1914 |

Table 10 F1-score segmentation for VGG16 best model

| Rambutan class | F1-score (%) |
|---|---|
| Binjai | 92 |
| Gading | 100 |
| Gula Batu | 95 |
| Jarum Mas | 97 |
| Rongrien | 98 |

5 Concluding Remarks

The use of a convolutional neural network to classify rambutan shows immense potential for correctly identifying the type of rambutan. The initial hypothesis that all types of transfer learning models would outperform the conventional, built-from-scratch CNN model is supported: both the ResNet and VGG models yield improvements over the conventional CNN model. Between the two transfer learning models, VGG16 has the better accuracy in classifying all the types of rambutan, achieving 96% overall accuracy compared with 85% for ResNet50. VGG16 also identifies each type of rambutan well, with each type correctly classified at a rate above 90%. The built-from-scratch CNN model has the lowest accuracy, with its best model achieving 79%. Rambutan Gading has the highest accuracy among the types of rambutan, which is believed to be due to its distinct color being extracted well by the model. For the next training iteration, we suggest removing Rambutan Gading so that the model can fully extract the defining features of the other 4 types of rambutan. This system can serve as a basis for future work. We recommend that future research expand the dataset to classify more varieties of rambutan, so that the approach can be applied in the agriculture field.


Mango Varieties Classification-Based Optimization with Transfer Learning and Deep Learning Approaches

Chen Ke, Ng Tee Weng, Yifan Yang, Zhang Ming Yang, Putra Sumari, Laith Abualigah, Salah Kamel, Mohsen Ahmadi, Mohammed A. A. Al-Qaness, Agostino Forestiero, and Anas Ratib Alsoud

Abstract Mango is one of the well-known tropical fruits native to South Asia, and over 500 varieties of mango are currently known. Depending on the variety, mango fruit can vary in size, skin color, shape, sweetness, and flesh color, which may be pale yellow, gold, or orange. However, it is sometimes difficult to tell which type of mango a fruit is. Thus, this paper presents an approach for classifying four types of mango. We use a convolutional neural network (CNN) algorithm and transfer learning methods (VGG16 and Xception) to train on the 1000 mango images collected and obtain a deep learning model that is able to classify four types of mango (Alampur Baneshan, Alphonso, Harum Manis and Keitt) automatically. In summary, the objective of this paper is to develop a deep learning algorithm to automatically classify four types of mango cultivar.

Keywords Mango · Convolutional neural network (CNN) · Transfer learning · Deep learning · VGG16 · Xception

C. Ke · N. T. Weng · Y. Yang · Z. M. Yang · P. Sumari · L. Abualigah (B)
School of Computer Sciences, Universiti Sains Malaysia, 11800 George Town, Pulau Pinang, Malaysia
e-mail: [email protected]

L. Abualigah · A. R. Alsoud
Hourani Center for Applied Scientific Research, Al-Ahliyya Amman University, Amman, Jordan

L. Abualigah
Faculty of Information Technology, Middle East University, Amman 11831, Jordan

S. Kamel
Department of Electrical Engineering, Faculty of Engineering, Aswan University, Aswan 81542, Egypt

M. Ahmadi
Department of Industrial Engineering, Urmia University of Technology, Urmia, Iran

M. A. A. Al-Qaness
State Key Laboratory for Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, Wuhan 430079, China
Faculty of Engineering, Sana’a University, 12544 Sana’a, Yemen
College of Physics and Electronic Information Engineering, Zhejiang Normal University, Jinhua 321004, China

A. Forestiero
Institute for High Performance Computing and Networking, National Research Council of Italy, Rende, Cosenza, Italy

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023
L. Abualigah (ed.), Classification Applications with Deep Learning and Machine Learning Technologies, Studies in Computational Intelligence 1071, https://doi.org/10.1007/978-3-031-17576-3_3

1 Introduction

Currently, sorting and classifying mango cultivars is done manually by observing features or attributes of the mango such as size, skin color, shape, sweetness, and flesh color [1–3]. Experienced taxonomy experts can generally identify different species, but most people find it difficult to distinguish these mangoes. As science and technology advance, many technologies have become available that could make it easy for people to distinguish the cultivars. The solution we propose is computer vision, a field of artificial intelligence that trains computers to interpret and understand the visual world, such as images and videos [4–8]. Computer vision is currently among the most popular technologies for fruit recognition. Compared with other machine learning algorithms, convolutional neural networks (CNNs) provide promising results in identifying fruits in images [9]. Deep learning has helped solve problems such as seed classification and retrieval [10], fruit detection for farmers [11], and discrimination of litchi fruit [12]. The image classification process consists of three major steps: feature extraction, model training, and testing. Feature extraction takes the characteristic properties of the images. During training, an algorithm is used to train a model that forms a unique description for each class. The testing step classifies the test images into the various classes with the trained model [13]. In addition, the convolutional layers can be modified to obtain more accurate and faster detection; test results show that the algorithm proposed in [11] achieved higher detection accuracy and lower processing time than traditional detectors. Other optimization methods can also be used to optimize such problems, as given in [14–19]. Table 1 summarizes the literature review.

By using digital images from cameras and videos together with deep learning models, machines can accurately identify and classify the four types of mango. Therefore, in this paper we develop a deep learning model trained on the 1000 images we collected. Three algorithms are tested: a convolutional neural network (CNN) and two transfer learning methods, VGG16 and Xception. With the trained model, we might be able to deploy it in a phone system or application so that people could classify the mango cultivar simply by snapping a picture with their phone camera.

Table 1 Summary of literature review

| Author | Topic | Objective | Data | Algorithms | Performance (%) |
|---|---|---|---|---|---|
| Jaswal et al. [13] | Image classification using convolutional neural networks | Image classification | The images are converted to gray scale | CNN | 95 |
| Chung and Van Tai [9] | A fruits recognition system based on a modern deep learning technique | Fruits recognition | Fruit 360 dataset | CNN/DL | 95 |
| Shaohua and Goudos [11] | Faster R-CNN for multi-class fruit detection using a robotic vision system | Multi-class fruit detection using a robotic vision system | Fruit images | Faster CNN | 86.41 |
| Osako et al. [12] | Cultivar discrimination of litchi fruit images using deep learning | Cultivar discrimination of litchi fruit images | Litchi fruit images | DL | 98.33 |
| Andrea et al. [10] | A novel deep learning based approach for seed image classification and retrieval | Seed image classification and retrieval | Seeds pictures | CNNs with different structure | 95.65 |

2 Methodology

2.1 Dataset

The data set for this study consists of 1000 mango photographs divided into 4 categories (Alampur Baneshan, Alphonso, Harum Manis, and Keitt), with 250 images per category, all collected from Google Images. Figure 1 shows some examples of each type of mango. All images have 3 color channels and are resized to 224 × 224. Data augmentation is used to increase the robustness of the model. Using these data, we train models with three different deep learning algorithms: one convolutional neural network and two transfer learning methods. The following sections present the deep learning models we designed and discuss the performance of the trained models.

Fig. 1 The used image dataset (panels: Alampur Baneshan, Harum Manis, Alphonso, Keitt)

2.2 Data Preparation

2.2.1 Augmentation

Data augmentation is an important step in data processing. It increases the effective data size by transforming the images, for example by rotating, magnifying, or changing the color intensity, which helps prevent the model from overfitting and at the same time enhances its generalization ability. In all the experiments, we use the ImageDataGenerator function to augment the input image data. Figure 2 shows the augmentation code used in the experiments. In the first row, the RGB values are rescaled from the range 0–255 to 0–1. In the second, the image is randomly rotated by an angle between 0 and 180 degrees. The third and fourth rows randomly shift the image in the horizontal or vertical direction. The fifth row applies a random shear transform to the image. In the sixth row, the zoom function randomly scales the image to different sizes. Furthermore, horizontal_flip flips the image horizontally with a random probability of 50%. Lastly, the nearest fill mode is the strategy used to fill in the pixels left empty after a transformation such as rotation or translation.
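Since the code in Fig. 2 is not reproduced here, the settings described above can be sketched as a set of keyword arguments for Keras's `ImageDataGenerator`. This is a reconstruction under assumptions, not the authors' code: the rescale factor, rotation range, horizontal flip, and fill mode follow the description, while the shift, shear, and zoom magnitudes (0.2) are placeholder values that the text does not state. The helper `build_generator` is a hypothetical name and imports TensorFlow lazily, so the configuration itself can be inspected without it; it assumes a TensorFlow 2.x release that still ships `ImageDataGenerator`.

```python
# Keyword arguments in the same order as the rows described for Fig. 2.
AUGMENTATION_KWARGS = {
    "rescale": 1.0 / 255,        # map pixel values from 0-255 to 0-1
    "rotation_range": 180,       # degree range for random rotations
    "width_shift_range": 0.2,    # assumed value: horizontal shift fraction
    "height_shift_range": 0.2,   # assumed value: vertical shift fraction
    "shear_range": 0.2,          # assumed value: shear intensity
    "zoom_range": 0.2,           # assumed value: random zoom range
    "horizontal_flip": True,     # flip horizontally at random
    "fill_mode": "nearest",      # fill empty pixels with the nearest value
}


def build_generator(kwargs=None):
    """Create the Keras generator; requires TensorFlow to be installed."""
    from tensorflow.keras.preprocessing.image import ImageDataGenerator
    return ImageDataGenerator(**(kwargs or AUGMENTATION_KWARGS))


def rescale_pixel(value: float) -> float:
    """Apply only the rescaling step to a single pixel value."""
    return value * AUGMENTATION_KWARGS["rescale"]
```

Only the training images would normally be passed through the full set of transformations; validation and test images are usually only rescaled, although the chapter does not say which convention was followed.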

Fig. 2 Augmentation code

2.3 Proposed CNN Architecture

A convolutional neural network (CNN) is a kind of feedforward neural network that performs very well in large-scale image processing. It consists of one or more convolution layers with fully connected layers on top, together with the associated weights and pooling layers. Compared with other deep learning structures, convolutional neural networks can give better results in image and speech recognition. First, the training set is augmented, because deep learning generally requires a sufficient number of samples: the more samples there are, the better the trained model performs and the stronger its generalization ability. The input images undergo simple translations, scaling, color changes, and similar transformations.

As shown in Fig. 3, the CNN architecture consists of five convolution layers, each followed by a max pooling layer, and two fully connected layers. The network input is a 224 × 224 × 3 RGB image. Convolution and pooling layers: convolution layer 1 contains 32 kernels of size 3 × 3 with ReLU as the activation function, followed by max pooling layer 1 of size 2 × 2. Convolution layer 2 has 64 kernels of size 3 × 3 with ReLU, followed by max pooling layer 2 of size 2 × 2. Convolution layer 3 has 128 kernels of size 3 × 3 with ReLU, followed by max pooling layer 3 of size 2 × 2. Convolution layer 4 has 256 kernels of size 3 × 3 with ReLU, followed by max pooling layer 4 of size 2 × 2. Convolution layer 5 has 512 kernels of size 3 × 3 with ReLU, followed by max pooling layer 5 of size 2 × 2. Flatten layer: converts the multi-dimensional input to one dimension before it enters the fully connected layer. Fully connected layer: Dense(256, activation = ‘relu’), followed by dropout with a rate of 0.5. Finally, the classification layer, Dense(4, activation = ‘softmax’), predicts the output of the model and represents the four different kinds of mangoes. SGD: we set the parameters of the SGD optimizer to LR = 0.001, decay = 1e−6, momentum = 0.9, nesterov = True.

Fig. 3 CNN model
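To make the layer description above concrete, the feature-map sizes and parameter counts implied by it can be worked out by hand. The sketch below is illustrative and rests on assumptions the text does not confirm: unpadded (“valid”) 3 × 3 convolutions with stride 1, which is the Keras default, and bias terms in every layer. The function name is hypothetical.

```python
def conv_pool_stack(input_size, in_channels, filters, kernel=3, pool=2):
    """Trace output shapes and count weights for conv -> max-pool blocks."""
    size, channels = input_size, in_channels
    shapes, params = [], 0
    for n_filters in filters:
        # Each filter has kernel*kernel*channels weights plus one bias.
        params += (kernel * kernel * channels + 1) * n_filters
        size = (size - kernel + 1) // pool   # unpadded conv, then pooling
        channels = n_filters
        shapes.append((size, size, channels))
    return shapes, params


shapes, conv_params = conv_pool_stack(224, 3, [32, 64, 128, 256, 512])
height, width, depth = shapes[-1]
flatten_size = height * width * depth          # input to Dense(256)
dense_params = flatten_size * 256 + 256        # Dense(256) weights + biases
output_params = 256 * 4 + 4                    # Dense(4) softmax layer
total_params = conv_params + dense_params + output_params
```

Under these assumptions most of the weights sit in the first dense layer rather than in the convolutions, which is one reason the dropout after that layer matters for a 1000-image data set.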

2.4 Transfer Learning Model

2.4.1 VGG16

VGG16 is a convolutional neural network (CNN) architecture proposed by K. Simonyan and A. Zisserman of the University of Oxford in the paper “Very Deep Convolutional Networks for Large-Scale Image Recognition”. The model achieves a top-1 accuracy of 0.713 and a top-5 accuracy of 0.901 on ImageNet, which contains 14 million images with 1000 class labels. The model contains 13 convolutional layers arranged in 5 blocks, 1 flatten layer, and a fully connected part. The fully connected part contains 2 layers with 4096 neurons each, and the original output layer has 1000 units. However, our dataset has only 4 class labels (Alampur Baneshan, Alphonso, Harum Manis and Keitt), so in our experiment we set the number of output units to 4. As in our CNN model, the activation function is ReLU, with softmax for the output. Figure 4 shows the summary of the VGG16 model.

Fig. 4 VGG16 model

2.4.2 Xception

Xception is a convolutional neural network (CNN) architecture based on Inception, presented in Fig. 5. The Xception architecture has 36 convolution layers, which form the feature extraction base of the network. Our experiment focuses on mango image classification, so the convolutional base is followed by a logistic regression layer. A fully connected layer is inserted before the logistic regression layer, which is discussed in the section on the effect of the dense layer. The 36 convolution layers are organized into 14 modules, all of which have linear residual connections around them except the first and last modules. In short, the Xception architecture is a linear stack of depthwise separable convolution layers with residual connections. This makes the architecture very easy to define and modify based on requirements; with high-level libraries such as Keras or TensorFlow-Slim, it requires very little code.

Fig. 5 Xception
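The depthwise separable convolutions that Xception is built from can be compared with ordinary convolutions by counting weights. The sketch below is a back-of-the-envelope illustration with hypothetical function names; the layer size used (3 × 3 kernel, 128 input channels, 256 output channels) is an arbitrary example rather than a specific Xception layer, and bias terms are ignored.

```python
def standard_conv_params(kernel: int, c_in: int, c_out: int) -> int:
    """Weights in an ordinary convolution (one k x k x c_in filter per output)."""
    return kernel * kernel * c_in * c_out


def separable_conv_params(kernel: int, c_in: int, c_out: int) -> int:
    """Weights in a depthwise separable convolution.

    Depthwise step: one k x k filter per input channel.
    Pointwise step: a 1 x 1 convolution mixing c_in channels into c_out.
    """
    depthwise = kernel * kernel * c_in
    pointwise = c_in * c_out
    return depthwise + pointwise


standard = standard_conv_params(3, 128, 256)
separable = separable_conv_params(3, 128, 256)
reduction = standard / separable
```

For this example layer the separable form needs roughly one ninth of the weights, which illustrates why a deep stack of such layers stays comparatively compact.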

3 Experiment Result

3.1 CNN

3.1.1 Experimental Setup

The data set contains 1000 mango images, all 224 × 224 pixels in size, in four classes: Alampur Baneshan, Alphonso, Harum Manis, and Keitt. The data are split into a 60% training set, a 20% validation set, and a 20% test set. The deep learning experiments were carried out in a local Jupyter notebook. Figure 6 shows the model summary and the model architecture, including the input and output of each layer.
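The 60/20/20 split described above can be sketched as follows. This is a minimal illustration with a hypothetical function name; it assumes a simple seeded random shuffle over image indices, whereas the chapter does not say how the split was drawn or whether it was stratified by class.

```python
import random


def split_indices(n_items: int, fractions=(0.6, 0.2, 0.2), seed: int = 0):
    """Shuffle item indices and cut them into train/validation/test parts."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)     # seeded for reproducibility
    n_train = round(n_items * fractions[0])
    n_val = round(n_items * fractions[1])
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]         # remainder goes to the test set
    return train, val, test


train_idx, val_idx, test_idx = split_indices(1000)
```

For the 1000 mango images this gives 600 training, 200 validation, and 200 test images.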

Fig. 6 a Model summary, b model architecture

3.1.2 Dense Layer

The Flatten layer is used to flatten the input, that is, to turn the multi-dimensional input into a one-dimensional vector; it is commonly used in the transition from the convolution layers to the fully connected (Dense) layers, as shown in Fig. 7. In other words, a Dense layer cannot be connected directly after a convolution layer: the output of the convolution layer must first be flattened (Flatten), and only then can the Dense layer be added. The first fully connected layer is Dense(256, activation = ‘relu’). After the ReLU activation, training uses conventional dropout with a drop rate of 0.5, so each neuron in the layer has a 50% probability of being dropped during training. The last fully connected layer uses softmax to output the 4 categories.

Fig. 7 Dense layer
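The dropout step described above can be sketched in plain Python. This is an illustrative version of inverted dropout, the variant Keras applies, in which the surviving activations are scaled up during training so that nothing needs to change at inference time; the function name and the list-based representation are choices made for this example only.

```python
import random


def dropout(activations, rate=0.5, rng=None, training=True):
    """Inverted dropout: zero each value with probability `rate` in training."""
    if not training:
        return list(activations)             # inference: pass through unchanged
    rng = rng or random.Random()
    keep_probability = 1.0 - rate
    return [
        value / keep_probability if rng.random() >= rate else 0.0
        for value in activations
    ]


dense_output = [1.0] * 1000
dropped = dropout(dense_output, rate=0.5, rng=random.Random(0))
n_zeroed = sum(1 for value in dropped if value == 0.0)
```

With a rate of 0.5, about half of the 256 dense-layer activations are zeroed in each training step and the remainder are doubled, which keeps the expected sum of the layer output unchanged.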

3.1.3 Model Optimizer

The model uses the SGD optimizer with LR = 0.001, decay = 1e−6, momentum = 0.9, and nesterov = True; gradient descent drives the loss down. After training, the accuracy is calculated on the test set.
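The optimizer settings above can be read as a concrete update rule. The sketch below follows the Nesterov momentum update documented for the Keras SGD optimizer and the time-based learning-rate decay used by the older Keras `decay` argument; applying both together in this form is an assumption about the authors' setup, and the one-dimensional loss f(w) = w² and the function name are purely illustrative.

```python
def sgd_nesterov_step(w, velocity, grad, iteration,
                      lr=0.001, decay=1e-6, momentum=0.9):
    """One SGD step with Nesterov momentum and time-based decay."""
    lr_t = lr / (1.0 + decay * iteration)          # decayed learning rate
    velocity = momentum * velocity - lr_t * grad   # update the velocity
    w = w + momentum * velocity - lr_t * grad      # Nesterov look-ahead step
    return w, velocity


# First step on the toy loss f(w) = w**2, starting from w = 1.
first_w, first_velocity = sgd_nesterov_step(1.0, 0.0, grad=2.0, iteration=0)

# Run the same update for many steps to watch the loss go down.
w, velocity = 1.0, 0.0
for step in range(2000):
    w, velocity = sgd_nesterov_step(w, velocity, grad=2.0 * w, iteration=step)
```

With decay = 1e−6 the learning rate changes very little over a few thousand iterations, so in practice the momentum term dominates the behaviour of this configuration.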

3.1.4 Number of Epochs

We trained the model for 10, 50, and 100 epochs; Fig. 8 shows the accuracy and loss for these runs. With 10 epochs, the test set accuracy is 0.65 and the loss is 0.82. With 50 epochs, the test set accuracy is 0.78 and the loss is 0.67. With 100 epochs, the test set accuracy is 0.75 and the loss is 1.07.

3.1.5

Learning Rate

As shown in Table 2, the effect of different LR on accuracy. As shown in Fig. 9, the impact of different LR on the accuracy of the training set and the accuracy of the validation set, and the loss.


Fig. 8 Epochs 10, epochs 50, and epochs 100

Table 2 The effect of different LR on accuracy

| Epochs | LR | Test set accuracy | Loss |
|---|---|---|---|
| 10 | 0.01 | 0.72 | 0.82 |
| 10 | 0.001 | 0.65 | 0.82 |
| 50 | 0.001 | 0.78 | 0.67 |
| 100 | 0.001 | 0.75 | 1.07 |

3.2 Transfer Learning

In this section we conduct our transfer learning experiments, following the guideline proposed in [20]. From the figure in that guideline, the third and fourth quadrants suit our case best, since our dataset has only 1000 images, which can be considered a small quantity. In the first experiment, we train the original model without freezing any layer, as shown in Figs. 10 and 11. In the second experiment, we fine-tune the lower layers of the pretrained model, and in the last experiment we fine-tune the output dense layers of the pretrained model.

3.2.1 VGG16

Experiment 1: Train the entire model with the original design (no layer frozen)

The hyperparameters for this experiment are a batch size of 2, a learning rate of 0.0001, and 18 and 100 epochs. The output layer is changed from 1000 units to 4.

Fig. 9 The impact of different LR on the accuracy of the training set and the accuracy of the validation set, and the loss. Panels: A (epochs 10, LR = 0.01), B (epochs 10, LR = 0.001), C (epochs 50, LR = 0.001), D (epochs 100, LR = 0.001)


Fig. 10 The result obtained from 100 epochs

Fig. 11 The result obtained from 18 epochs

This first experiment indicates that the model does not perform well on our data. As the figures and Table 3 show, the model overfits at 100 epochs, and the accuracy obtained in both runs is below 0.4. We therefore proceed to Experiment 2 to test different methods and hyperparameters.

Table 3 The obtained results for both models

| Epochs | Accuracy | Loss |
|---|---|---|
| 18 | 0.3 | 0.665 |
| 100 | 0.25 | 0.5627 |

Experiment 2: Train the model by freezing the convolutional layers

In this experiment we froze all the convolutional layers and trained the model with the original fully connected layers, then compared it with the fully connected layer used in our CNN model. The result obtained with the original VGG16 dense layers is shown in Fig. 12. Both models are trained for 100 epochs.


Fig. 12 The result for original design

With this model we obtain an accuracy of 61.5% with a loss of 3.9735. The result is not ideal, and the model also suffers from overfitting: as Fig. 12 shows, the validation accuracy and the training accuracy are far apart. Next, we train again after modifying the fully connected layer according to the method shown in [21]; the results are shown in Fig. 13. For this model the best accuracy is 0.61 and the loss is 1.1217. Fig. 13 indicates that the model still suffers from overfitting and that its accuracy differs little from that of the original design, but its loss is better. We therefore use this new design in the next experiment. Furthermore, we also reduced the number of neurons from 4096 to 128 and, surprisingly, obtained a better result than in the previous run, with an accuracy of 66.5% and a loss of 0.5039. Fig. 14 shows the result obtained in this experiment.

Experiment 3: Train some layers and leave others frozen

In this experiment we freeze the first few layers and keep the rest of the layers trainable. We train for 100 epochs, and the remaining settings are the same as in Experiment 1.

Fig. 13 The result after changing the fully connected layer


Fig. 14 Result obtained with 128 neurons

Fig. 15 The result for freezing first 10 layers

We froze the first 10 and the first 15 layers, as shown in Figs. 15 and 16. The best result is an accuracy of 72.5% with a loss of 0.4586, obtained by the model with the first 15 layers frozen. Compared with the original model and the previous experiment, this model improves considerably, by up to 10% over the previous experiment. However, the overfitting issue is still not solved; according to some papers we consulted, this problem may be caused by the data we collected. We conclude that the model trained with 128 neurons and with the first fifteen layers frozen is the best model obtained in this experiment.
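Freezing the first N layers amounts to marking them as non-trainable while leaving the later layers trainable. The sketch below shows the idea with a hypothetical `Layer` stand-in so that it runs without a deep learning framework; in Keras the same flag is the `trainable` attribute of each layer. The layer count of 19 is a placeholder, not a statement about the model used in the experiments.

```python
from dataclasses import dataclass

@dataclass
class Layer:
    """Hypothetical stand-in for a framework layer with a `trainable` flag."""
    name: str
    trainable: bool = True

def freeze_first(layers, n_frozen):
    """Freeze the first `n_frozen` layers and leave the rest trainable."""
    for index, layer in enumerate(layers):
        layer.trainable = index >= n_frozen
    return layers

# Placeholder layers; the real count depends on the pretrained model.
layers = [Layer(f"layer_{i}") for i in range(19)]
freeze_first(layers, n_frozen=15)
print([layer.name for layer in layers if layer.trainable])
```

Only the layers that remain trainable have their weights updated during fine-tuning; the frozen ones keep their pretrained weights.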

3.3 Xception

3.3.1 Experimental Setup

First, we create a baseline model and then modify one parameter at a time, comparing each result with the baseline to isolate the impact of that parameter. To achieve this goal, a total of five experiments were designed.

Experiment 1: Create a baseline model, mainly by varying the number of frozen layers.
Experiment 2: Modify the optimizer and compare the performance with the baseline model.
Experiment 3: Modify the dense layer and compare the performance with the baseline model.
Experiment 4: Modify the number of epochs and compare the performance with the baseline model.
Experiment 5: Modify the learning rate and compare the performance with the baseline model.

We divide the dataset into three parts: a training dataset, a validation dataset, and a testing dataset. The performance on the test dataset is the evaluation standard for the model.

Fig. 16 The result for freezing first 15 layers

3.3.2 Experiment 1: Create a Baseline Model

When creating the baseline model, we try freezing all, part, and none of the layers in the original model. Table 4 shows the baseline model setting. Table 5 shows the performance of the model with different frozen layers.

Table 4 Baseline model setting

Optimizer: RMSprop
Number of epochs: 50
Learning rate: Learning rate scheduler
Dense layer:
    x = GlobalAveragePooling2D()(x)
    x = Dropout(0.5)(x)
    x = Dense(1024)(x)
    x = Activation('relu')(x)
    x = Dropout(0.5)(x)
    x = Dense(512)(x)
    x = Activation('relu')(x)
    Predictions = Dense(4, activation = 'sigmoid')(x)


Table 5 Performance of model with different freezing layer

| Setting | Accuracy (test) | Loss (test) |
|---|---|---|
| Freeze all | 0.185 | 1.3884 |
| Freeze part | 0.42 | 1.4410 |
| Freeze no | 0.78 | 1.5862 |

Because the unfrozen model performs best, we choose it as the baseline model, and all of the following experiments use the unfrozen model.

3.3.3 Experiment 2: Effect of Optimizers

To allow a controlled comparison with Experiment 1, only the optimizer is modified here. Table 6 shows the Experiment 2 model setting and Table 7 the model comparison. The experimental results in Fig. 17 show that the accuracy of the model decreases considerably. For our dataset, RMSprop is the better choice. The reason is that the Adagrad learning rate decreases more slowly than that of RMSprop, which leads to slow convergence of the model.

Table 6 Experiment 2 model setting

Optimizer: Adagrad()
Number of epochs: 50
Learning rate: Learning rate scheduler
Dense layer:
    x = GlobalAveragePooling2D()(x)
    x = Dropout(0.5)(x)
    x = Dense(1024)(x)
    x = Activation('relu')(x)
    x = Dropout(0.5)(x)
    x = Dense(512)(x)
    x = Activation('relu')(x)
    Predictions = Dense(4, activation = 'sigmoid')(x)

Table 7 Experiment 2 model comparison

| Model | Accuracy (test) | Loss (test) |
|---|---|---|
| Baseline model | 0.78 | 1.5862 |
| Experiment 2 model | 0.315 | 1.3310 |


Fig. 17 Comparison of learning rate between experiment 1 and experiment 2

3.3.4 Experiment 3: Effect of Dense Layer

Compared with Experiment 1, we changed the setting of the dense layer. Table 8 shows the Experiment 3 settings and Table 9 the model comparison. The dense layer of the baseline model clearly performs better, which suggests that it distinguishes image features more effectively.

Table 8 Experiment 3 setting

Optimizer: RMSprop
Number of epochs: 50
Learning rate: Learning rate scheduler
Dense layer:
    x = GlobalAveragePooling2D()(x)
    x = Dense(1024)(x)
    x = BatchNormalization()(x)
    x = Activation('relu')(x)
    x = Dropout(0.2)(x)
    x = Dense(256)(x)
    x = BatchNormalization()(x)
    x = Activation('relu')(x)
    x = Dropout(0.2)(x)
    Predictions = Dense(4, activation = 'sigmoid')(x)

Table 9 Experiment 3 model comparison

| Model | Accuracy (test) | Loss (test) |
|---|---|---|
| Baseline model | 0.78 | 1.5862 |
| Experiment 3 model | 0.70 | 1.6954 |


3.3.5 Experiment 4: Effect of Number of Epochs

In this experiment, the number of epochs was increased from 50 to 100. Table 10 shows the Experiment 4 settings and Table 11 the model comparison. Table 11 shows that the test accuracy of the model has declined. The training log, however, shows that the accuracy on the training dataset reaches 0.96, which indicates that the larger number of epochs makes the model overfit.

Table 10 Experiment 4 setting

Optimizer: RMSprop
Number of epochs: 100
Learning rate: Learning rate scheduler
Dense layer:
    x = GlobalAveragePooling2D()(x)
    x = Dropout(0.5)(x)
    x = Dense(1024)(x)
    x = Activation('relu')(x)
    x = Dropout(0.5)(x)
    x = Dense(512)(x)
    x = Activation('relu')(x)
    Predictions = Dense(4, activation = 'sigmoid')(x)

Table 11 Experiment 4 model comparison

| Model | Accuracy (test) | Loss (test) |
|---|---|---|
| Baseline model | 0.78 | 1.5862 |
| Experiment 4 model | 0.61 | 3.2966 |
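One common way to limit the overfitting seen with a larger number of epochs is early stopping: training is halted once the validation loss stops improving. The sketch below is an illustration under assumptions, not part of the reported experiments; the `best_epoch` helper and the loss values are made up.

```python
def best_epoch(val_losses, patience=5):
    """Return (epoch_index, loss) of the best epoch seen before stopping.

    Scans validation losses epoch by epoch and stops once the loss has not
    improved for `patience` consecutive epochs.
    """
    best_idx, best_loss, wait = 0, float("inf"), 0
    for idx, loss in enumerate(val_losses):
        if loss < best_loss:
            best_idx, best_loss, wait = idx, loss, 0
        else:
            wait += 1
            if wait >= patience:
                break  # no improvement for `patience` epochs: stop training
    return best_idx, best_loss

# Made-up validation losses that first fall and then rise (overfitting).
history = [1.40, 1.10, 0.90, 0.80, 0.85, 0.95, 1.10, 1.30, 1.60, 2.00]
print(best_epoch(history, patience=3))
```

In this toy history the loss is lowest at index 3, so the weights from that epoch would be kept instead of those from the final epoch.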

3.3.6 Experiment 5: Effect of Learning Rate

In Experiment 2, we tested the influence of different learning rates on the accuracy of the model by changing the optimizer. In this experiment, we test the influence of the learning rate on the accuracy through ReduceLROnPlateau. Table 12 shows the Experiment 5 settings and Table 13 the model comparison. The training log shows that ReduceLROnPlateau keeps the learning rate at 0.0001, but the result is not as good as that of the baseline model.
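The idea behind a reduce-on-plateau rule is to lower the learning rate when the monitored validation loss stops improving. The class below is a simplified re-implementation for illustration only; it is not the Keras `ReduceLROnPlateau` code, and the starting learning rate, factor, patience, and loss values are assumptions.

```python
class PlateauScheduler:
    """Simplified reduce-on-plateau rule (illustrative, not the Keras code)."""

    def __init__(self, lr, factor=0.1, patience=2, min_lr=1e-6):
        self.lr = lr
        self.factor = factor
        self.patience = patience
        self.min_lr = min_lr
        self.best = float("inf")
        self.wait = 0

    def step(self, val_loss):
        """Update the learning rate after one epoch and return it."""
        if val_loss < self.best:
            self.best, self.wait = val_loss, 0
        else:
            self.wait += 1
            if self.wait >= self.patience:
                # Plateau detected: shrink the rate, but never below min_lr.
                self.lr = max(self.lr * self.factor, self.min_lr)
                self.wait = 0
        return self.lr

scheduler = PlateauScheduler(lr=0.001)
losses = [1.2, 1.0, 1.1, 1.05, 0.9, 0.95]  # made-up validation losses
rates = [scheduler.step(loss) for loss in losses]
print(rates)
```

If the validation loss keeps improving, the rule never fires and the learning rate stays at its current value.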

3.4 Accuracy Comparison

In this paper we have trained models with three deep learning approaches: a Convolutional Neural Network, transfer learning with Xception, and transfer learning with VGG16. Table 14 shows the best result obtained for each model.

Table 12 Experiment 5 setting

Optimizer: RMSprop
Number of epochs: 50
Learning rate: ReduceLROnPlateau
Dense layer:
    x = GlobalAveragePooling2D()(x)
    x = Dropout(0.5)(x)
    x = Dense(1024)(x)
    x = Activation('relu')(x)
    x = Dropout(0.5)(x)
    x = Dense(512)(x)
    x = Activation('relu')(x)
    Predictions = Dense(4, activation = 'sigmoid')(x)

Table 13 Experiment 5 model comparison

| Model | Accuracy (test) | Loss (test) |
|---|---|---|
| Baseline model | 0.78 | 1.5862 |
| Experiment 5 model | 0.675 | 3.872 |

Table 14 Accuracy comparison

| Model | Accuracy | Loss |
|---|---|---|
| CNN | 0.78 | 0.67 |
| VGG16 | 0.725 | 0.4586 |
| Xception | 0.78 | 1.59 |

Two issues occurred in these experiments. First, although the results are good, the overfitting problem could not be minimized. Second, the parameters of the pre-trained models do not fit our dataset accurately. To address the first issue, we may need to increase the number of samples in our dataset and diversify the image collection, or improve the augmentation function, which can reduce overfitting effectively. To address the second issue, we would need to retrain all the parameters on our training data, which is quite time consuming. A cloud virtual machine that can process the data and return results quickly would allow us to run more experiments in the limited time available.

4 Conclusion

In this study, three CNN-based models are proposed: a custom CNN model and two transfer learning models, Xception and VGG16. Comparing the accuracy of these three models, we conclude that the CNN model presented in Sect. 3.1 is our best model. Although Xception reaches the same accuracy, the loss of the CNN model is lower. Compared with VGG16, however, the loss of the CNN model is less favorable. Among the three models, we choose the CNN as the best because it gives the most balanced performance relative to the other two.

References

1. Alhaj, Y. A., Dahou, A., Al-Qaness, M. A., Abualigah, L., Abbasi, A. A., Almaweri, N. A. O., Elaziz, M. A., & Damaševičius, R. (2022). A novel text classification technique using improved particle swarm optimization: A case study of Arabic language. Future Internet, 14(7), 194.
2. Daradkeh, M., Abualigah, L., Atalla, S., & Mansoor, W. (2022). Scientometric analysis and classification of research using convolutional neural networks: A case study in data science and analytics. Electronics, 11(13), 2066.
3. Wu, D., Jia, H., Abualigah, L., Xing, Z., Zheng, R., Wang, H., & Altalhi, M. (2022). Enhance teaching-learning-based optimization for tsallis-entropy-based feature selection classification approach. Processes, 10(2), 360.
4. Ali, M. A., Balasubramanian, K., Krishnamoorthy, G. D., Muthusamy, S., Pandiyan, S., Panchal, H., Mann, S., Thangaraj, K., El-Attar, N. E., Abualigah, L., & Elminaam, A. (2022). Classification of glaucoma based on elephant-herding optimization algorithm and deep belief network. Electronics, 11(11), 1763.
5. Abualigah, L., Kareem, N. K., Omari, M., Elaziz, M. A., & Gandomi, A. H. (2021). Survey on Twitter sentiment analysis: Architecture, classifications, and challenges. In Deep learning approaches for spoken and natural language processing (pp. 1–18). Springer.
6. Fan, H., Du, W., Dahou, A., Ewees, A. A., Yousri, D., Elaziz, M. A., Elsheikh, A. H., Abualigah, L., & Al-Qaness, M. A. (2021). Social media toxicity classification using deep learning: Real-world application UK Brexit. Electronics, 10(11), 1332.
7. Alomari, O. A., Khader, A. T., Al-Betar, M. A., & Abualigah, L. M. (2017). MRMR BA: A hybrid gene selection algorithm for cancer classification. Journal of Theoretical and Applied Information Technology, 95(12), 2610–2618.
8. Alomari, O. A., Khader, A. T., Al-Betar, M. A., & Abualigah, L. M. (2017). Gene selection for cancer classification by combining minimum redundancy maximum relevancy and bat-inspired algorithm. International Journal of Data Mining and Bioinformatics, 19(1), 32–51.
9. Chung, D. T. P., & Van Tai, D. (2019). A fruit recognition system based on a modern deep learning technique. Journal of Physics: Conference Series, 1327.
10. Andrea, L., Mauro, L., & Di Ruberto, C. (2021). A novel deep learning based approach for seed image classification and retrieval. Computers and Electronics in Agriculture, 187.
11. Shaohua, W., & Guodos, S. (2019). Faster R-CNN for multi-class fruit detection using a robotic vision system. School of Information and Safety Engineering.
12. Osako, Y., et al. (2020). Cultivar discrimination of litchi fruit images using deep learning. Scientia Horticulturae, 269.
13. Jaswal, D., Vishvanathan, S., & Soman, K. P. (2014). Image classification using convolutional neural networks. International Journal of Scientific and Engineering Research, 5(6), 1661–1668.
14. Abualigah, L., Diabat, A., Mirjalili, S., Abd Elaziz, M., & Gandomi, A. H. (2021). The arithmetic optimization algorithm. Computer Methods in Applied Mechanics and Engineering, 376, 113609.
15. Abualigah, L., Yousri, D., Abd Elaziz, M., Ewees, A. A., Al-Qaness, M. A., & Gandomi, A. H. (2021). Aquila optimizer: A novel meta-heuristic optimization algorithm. Computers and Industrial Engineering, 157, 107250.
16. Abualigah, L., Abd Elaziz, M., Sumari, P., Geem, Z. W., & Gandomi, A. H. (2022). Reptile search algorithm (RSA): A nature-inspired meta-heuristic optimizer. Expert Systems with Applications, 191, 116158.
17. Agushaka, J. O., Ezugwu, A. E., & Abualigah, L. (2022). Dwarf mongoose optimization algorithm. Computer Methods in Applied Mechanics and Engineering, 391, 114570.
18. Oyelade, O. N., Ezugwu, A. E. S., Mohamed, T. I., & Abualigah, L. (2022). Ebola optimization search algorithm: A new nature-inspired metaheuristic optimization algorithm. IEEE Access, 10, 16150–16177.
19. Ezugwu, A. E., Agushaka, J. O., Abualigah, L., Mirjalili, S., & Gandomi, A. H. (2022). Prairie dog optimization algorithm. Neural Computing and Applications, 1–49.
20. Diahashree, G. (2017, June 1). Transfer learning and the art of using pre-trained models in deep learning. https://www.analyticsvidhya.com/blog/2017/06/transfer-learning-the-art-of-fine-tuning-a-pre-trained-model/
21. Transfer learning in Keras using VGG16, 2020. https://thebinarynotes.com/transfer-learningkeras-vgg16/

Salak Image Classification Method Based Deep Learning Technique Using Two Transfer Learning Models

Lau Wei Theng, Moo Mei San, Ong Zhi Cheng, Wong Wei Shen, Putra Sumari, Laith Abualigah, Raed Abu Zitar, Davut Izci, Mehdi Jamei, and Shadi Al-Zu’bi

Abstract Salak is one of the fruit plants of Southeast Asia, and there are at least 30 cultivars of salak. The size, shape, skin color, sweetness, and even flesh color differ depending on the cultivar. Classifying salak by cultivar is therefore a daily job for fruit farmers. Many computer vision techniques can be used for fruit classification, and deep learning is the most promising approach compared with other Machine Learning (ML) algorithms. This paper presents an image classification method for 4 types of salak (salak pondoh, salak gading, salak sideempuan and salak affinis) using a Convolutional Neural Network (CNN), VGG16 and ResNet50. The dataset consists of 1000 images, with 250 images for each type of salak. The dataset is pre-processed to standardize it by resizing the images to 224 * 224 pixels, converting them to jpg format, and applying augmentation. Based on the accuracy results, the best model for salak classification is ResNet50, which gave an accuracy of 84%, followed by VGG16 with an accuracy of 77% and CNN with 31%.

L. W. Theng · M. M. San · O. Z. Cheng · W. W. Shen · P. Sumari · L. Abualigah (B)
School of Computer Sciences, Universiti Sains Malaysia, 11800 George Town, Pulau Pinang, Malaysia
e-mail: [email protected]

L. Abualigah
Hourani Center for Applied Scientific Research, Al-Ahliyya Amman University, Amman 19328, Jordan
Faculty of Information Technology, Middle East University, Amman 11831, Jordan

R. A. Zitar
Sorbonne Center of Artificial Intelligence, Sorbonne University-Abu Dhabi, 38044 Abu Dhabi, United Arab Emirates

D. Izci
Department of Electronics and Automation, Batman University, Batman 72060, Turkey

M. Jamei
Faculty of Engineering, Shohadaye Hoveizeh Campus of Technology, Shahid Chamran University of Ahvaz, Dashte Azadegan, Iran

S. Al-Zu’bi
Faculty of Science and IT, Al-Zaytoonah University of Jordan, Amman, Jordan

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023
L. Abualigah (ed.), Classification Applications with Deep Learning and Machine Learning Technologies, Studies in Computational Intelligence 1071, https://doi.org/10.1007/978-3-031-17576-3_4


Keywords Salak classification · Deep learning · CNN · ResNet50 · VGG16

1 Introduction

Snake fruit, also known as salak or Salacca zalacca, is a species of palm tree that is native to Indonesia and is now grown and produced across Southeast Asia [1]. It is called snake fruit because of its reddish-brown scaly skin [2]. The inside of the fruit consists of 3 lobes that resemble large, white, peeled garlic cloves. The taste is commonly sweet and acidic, with an apple-like texture [2]. There are many types of salak, such as salak pondoh, salak sidempuan, salak gading, and salak affinis. They look very similar, and it is hard to differentiate among them. This is where deep learning comes into the picture. Deep learning, also known as deep neural networks or deep neural learning, processes data and creates patterns for decision making by imitating the human brain [3]. It uses neuron nodes that are linked together within a hierarchical neural network to analyze the incoming data [3]. Image recognition is one of the most popular deep learning applications and helps many fields, especially fruit agriculture, to classify fruit. In the past few decades, CNNs and deep learning have proven to be powerful tools for handling large amounts of data, especially for the classification of fruits, characters, and animals [4–8]. This is easier said than done, however, because image classification also poses challenges. Image classification is mainly the process of labelling an image according to its patterns (classes) [9, 10]. For example, images of an apple can fall into at least three color categories, such as red, green, and yellowish. Common problems in fruit detection are variations in size, color, and viewpoint; for instance, an input image of a red cherry or a tomato can look similar to a red apple [10].
According to a journal article on fruit classification systems using computer vision, image classification and processing are used for fruit quality grading, sorting, and disease detection before the fruit is sold to the market [11]. This benefits the fruit industry by saving time, reducing human error, improving speed and efficiency, and protecting good consumer relations [11]. Fruit disease detection involves techniques such as clustering, color-based segmentation, and other disease categorization classifiers [11].

Convolutional Neural Network (CNN) is one of the popular algorithms used to identify patterns in an image [12]. An image is a picture that captures the appearance of an object such as a durian, strawberry, or mango. It is easy for the human eye to detect the object in an image, but a computer only reads the image as pixels in binary format. A CNN is a kind of deep neural network that is efficient and reliable for image processing. It combines a few convolution layers, pooling layers, and a fully connected neural network [13]. The first step takes an input image and passes a cropped section of it to the convolution layer. The convolution layer consists of a number of filters that extract features from the section of the input image with a kernel (K) of size 3 × 3 × 1 [13]. Next, the image passes through a pooling layer that uses non-linear down-sampling, which halves the size of the image [13]. There are two kinds of pooling layer, max pooling and average pooling, whose outputs are also referred to as activation maps [13]. Max pooling takes the largest value in each section of the image, while average pooling takes the mean (the total sum divided by the number of values in the pool). The convolution and pooling steps are then repeated to extract more information from the image. The last step uses only one fully connected layer, in which all neurons are connected to the output classes. It assigns the image a probability for each possible class, such as 97% for apple, 2% for banana, and 1% for durian, and the class with the highest probability is selected as the result.

A good way to resolve an image classification problem quickly is to use transfer learning models. One of their most significant advantages is that they reduce the developer's work: a new model does not have to be built from scratch, because the transfer learning model can be applied directly to the image classification problem at hand [14]. Rather than applying a transfer learning model as-is, however, the developer or user should understand the definition of the classification problem and fine-tune certain convolution layers, freezing some layers and training others to fit the objective. Various transfer learning models can be used, such as VGG, AlexNet, MobileNet, and ResNet [14].

Beyond the plain CNN, Karen Simonyan and Andrew Zisserman from the University of Oxford published the paper “Very Deep Convolutional Networks for Large-Scale Image Recognition”, which introduced the VGG16 model [15]. Like AlexNet, VGG16 has a large number of parameters, but it consists of 16 weight layers (13 convolutional and 3 fully connected). The first convolution block takes a fixed-size (224 × 224) RGB input and is followed by a max pooling layer (2 × 2). The second convolution block operates at (112 × 112) and is followed by another max pooling layer (2 × 2). The third to fifth blocks each have three convolution layers and one max pooling layer, and the network ends with three fully connected (dense) layers. Each max pooling layer halves the size of the extracted feature maps. The model achieved up to 92.7% top-5 accuracy on ImageNet [15]. Although this result is good, the model also has disadvantages: it requires more time to train and the architecture is large [16].

Another popular transfer learning model is ResNet50, a residual neural network [17]. ResNet50 uses fewer parameters than VGG16, so the model runs faster because it carries fewer weights. For feature extraction and weight learning, ResNet50 uses a softmax layer in the same way as a CNN [18]. Pre-processing first resizes all images to (224 × 224) pixels to fit the model input size [18]. Convolutional filtering is then applied for feature extraction, depending on the filtering mask applied with a (3 × 3) kernel [18]. Next, each section of the input image goes through feature extraction with a 2D convolution filter [18]. The more weight a feature carries in the image, the more valuable the extracted feature. Each layer then passes through an activation layer to learn complex features. Finally, the fully connected layer is trained by repeating the backpropagation process for the given number of iterations [18]. According to the Keras Applications results, this model achieves 92.1% top-5 accuracy with 25,636,712 parameters [19]. Other optimization methods that can be used to optimize such problems are given in [20–25].

The main goal of this paper is to develop a CNN model and 2 transfer learning models, VGG16 and ResNet50, for image classification. The developed models should be able to classify salak images into 4 classes: salak pondoh, salak gading, salak sideempuan and salak affinis.

Fig. 1 Sample of salak dataset
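The max pooling and average pooling operations described in the Introduction can be sketched in a few lines of plain Python. The 4 × 4 feature map is a made-up example, and the `pool2d` helper is illustrative rather than taken from any library.

```python
def pool2d(image, size=2, mode="max"):
    """Non-overlapping pooling over a 2D list (stride equals the window size)."""
    rows, cols = len(image), len(image[0])
    pooled = []
    for r in range(0, rows - size + 1, size):
        out_row = []
        for c in range(0, cols - size + 1, size):
            window = [image[r + i][c + j] for i in range(size) for j in range(size)]
            if mode == "max":
                out_row.append(max(window))                 # largest value in the window
            else:
                out_row.append(sum(window) / len(window))   # mean of the window
        pooled.append(out_row)
    return pooled

feature_map = [
    [1, 3, 2, 0],
    [5, 7, 1, 1],
    [0, 2, 4, 8],
    [2, 0, 6, 2],
]
print(pool2d(feature_map, mode="max"))      # [[7, 2], [2, 8]]
print(pool2d(feature_map, mode="average"))  # [[4.0, 1.0], [1.0, 5.0]]
```

With a 2 × 2 window, both variants halve the height and width of the feature map, which is the down-sampling effect mentioned above.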

2 Dataset

2.1 Dataset Description

The dataset is a collection of images from Google, Facebook, Instagram, YouTube, and similar sources. All collected images are real-life color photos with no more than 30% noise. The salak dataset contains a total of 1000 color images, with 250 images for each class (salak pondoh, salak affinis, salak gading and salak sideempuan). Figure 1 shows a sample of the collected salak dataset.

2.2 Dataset Preparation

Dataset preparation processes or transforms the collected images into a form that can be used in designing the model. In this study, the images are resized, augmented, and converted into a standard format.

• Resizing: each image is resized to 224 × 224 × 3 pixels.
• Image format: each image is converted to the standard JPEG format.
• Augmentation: images are transformed by rotation, flipping, etc. to expand the size of the dataset. This is applied only to classes with insufficient images, namely salak affinis, salak gading and salak sideempuan. Figure 2 shows a sample of augmented images.

All 1000 images are split into 70% training, 20% validation, and 10% testing datasets. The training and validation datasets are used to build the model, while the testing dataset contains images not used in training and is used to test the overall accuracy of the model. Figure 3 shows the directory structure, which has a main directory called Salak. The main directory contains 3 folders, training, validation and testing, and each folder has 4 subfolders called Pondoh, Gading, Sideempuan and Affinis, as shown in Figs. 4, 5 and 6.
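The directory structure and the 70/20/10 split described above can be planned with a short script. This is a sketch under assumptions: the exact folder names (`training`, `validation`, `testing`) and the even split of 250 images per class are inferred from the text, and `plan_layout` is a hypothetical helper that only computes the layout without touching the file system.

```python
from pathlib import PurePosixPath

CLASSES = ["Pondoh", "Gading", "Sideempuan", "Affinis"]
SPLITS = {"training": 70, "validation": 20, "testing": 10}  # percentages

def plan_layout(root="Salak", images_per_class=250):
    """Map each split/class folder to the number of images it should hold."""
    layout = {}
    for split, percent in SPLITS.items():
        for cls in CLASSES:
            folder = PurePosixPath(root) / split / cls
            layout[str(folder)] = images_per_class * percent // 100
    return layout

layout = plan_layout()
for folder, count in layout.items():
    print(folder, count)
```

One subfolder per class is the layout that directory-based image loaders typically expect, since the class label is taken from the folder name.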

Fig. 2 Sample of augmented images

Fig. 3 Salak directory


Fig. 4 Test directory

Fig. 5 Train directory

Fig. 6 Validation directory

3 Proposed Deep Learning

In this study, a Convolutional Neural Network (CNN) and two transfer learning models, VGG16 and ResNet50, are developed. All the models are trained and tested on the salak dataset to select the one with the best accuracy.

3.1 CNN

Our proposed CNN model uses 2 convolutional layers, 2 pooling layers, 1 flatten operator and 2 dense layers to generate the desired output. We first take in input images of size (224 × 224 × 3) and feed them into the 2 sets of convolutional and pooling layers. The outputs are then flattened into a single dimension and fed into 2 hidden layers before the final layer. The dense layers use relu as their activation function, and the final layer of the classifier uses softmax. Since there are 4 classes in the salak dataset, the final output has 4 nodes (Fig. 7).

Fig. 7 CNN model diagram

3.2 VGG16

In VGG16, the convolutional base model is frozen, and we unfreeze the top layer. Two dense layers are added with 2048 and 1048 units respectively, followed by an output layer with 4 units, one for each output class. The VGG16 model diagram is shown in Fig. 8.

Fig. 8 VGG16 model diagram

3.3 ResNet50

In ResNet50, the convolutional base model is frozen, and we unfreeze the top layer. Two dense layers are added with 2048 and 1048 units respectively, followed by an output layer with 4 units, one for each output class. The ResNet50 model diagrams are shown in Figs. 9, 10, and 11.

Fig. 9 Overall ResNet50


Fig. 10 Conv block on ResNet50 A

Fig. 11 Conv block on ResNet50 B

4 Performance Result

4.1 Experimental Setup

The salak dataset contains a total of 1000 color images, with 250 images for each class (salak pondoh, salak affinis, salak gading and salak sideempuan). All images are resized to 224 × 224 pixels. The dataset is split into 70% training, 20% validation and 10% test. The training dataset is used to train the model, the validation dataset is used to evaluate model performance while tuning hyperparameters, and the test dataset acts as new data for evaluating the final model.

Python is used in these experiments because it has an extensive set of libraries for artificial intelligence and machine learning, such as TensorFlow, Keras and Scikit-learn. We used the Keras API to build, train and validate our models. The Google Colaboratory (Colab) platform is used to perform all the experiments because it requires no setup, makes it easy to share code with others, and is easy to use. The dataset is uploaded to Google Drive and the path is shared with the team members, who add a shortcut to the shared path in their own Drive. Colab allows us to access Google Drive through the drive module from google.colab. Figure 12 shows the code for mounting the drive. After the authorization code obtained from the link is entered, the drive is mounted, and everyone can access the same dataset without downloading it. The ImageDataGenerator API is used to return batches of images from the subdirectories Sideempuan, Pondoh, Gading and Affinis. The model summaries for VGG16 and ResNet50 are shown in Figs. 13, 14, 15, 16, 17, 18 and 19.

For the transfer learning models (VGG16 and ResNet50) and the CNN, we tune several parameters: the number of epochs, the optimizer, the learning rate and the number of dense layers. For the CNN we additionally tune the filter size, and for the transfer learning models the percentage of the model that is unfrozen. The relu activation function is used for the dense layers, except for the output layer, which uses softmax in all experiments. Validation and test accuracy are used to evaluate the performance of the models.

Fig. 12 Mounting to Google drive

4.2 Effect of Kernel Size: CNN

Kernel size refers to the size of the filter that convolves around the feature map. In these experiments, three kernel sizes (2, 3 and 4) are used for the CNN only, while the VGG16 and ResNet50 models keep their default values. Figures 20 and 21 show the validation and test accuracy obtained. The validation accuracy is best, at 68%, when the kernel size is 3, whereas the test accuracy is worst at this setting, at only 20%. The test accuracy is best, at 31%, when the kernel size is 4.
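To make the role of the kernel size concrete, the sketch below computes the output size and the number of trainable parameters of a single convolutional layer for the three kernel sizes. It assumes a stride of 1, no padding and 32 filters; these values are illustrative and are not taken from the chapter.

```python
def conv_output_size(input_size, kernel_size, stride=1, padding=0):
    """Spatial output size of a convolution along one dimension."""
    return (input_size + 2 * padding - kernel_size) // stride + 1

def conv_param_count(kernel_size, in_channels, filters):
    """Trainable parameters: one k x k x in_channels kernel plus one bias per filter."""
    return (kernel_size * kernel_size * in_channels + 1) * filters

# 224 x 224 RGB input as in the experimental setup; 32 filters is an assumed value.
for k in (2, 3, 4):
    side = conv_output_size(224, k)
    params = conv_param_count(k, in_channels=3, filters=32)
    print(f"kernel {k}x{k}: output {side}x{side}, parameters {params}")
```

A larger kernel covers a wider neighbourhood of pixels per filter, at the cost of more parameters per filter.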

4.3 Effect of Pool Size: CNN

Pool size refers to the size of the window used to reduce the dimensions of the feature maps. Pooling reduces the number of parameters to learn and the amount of computation performed in the network. In this experiment, three pool sizes are used, which are

Salak Image Classification Method Based Deep Learning …

Fig. 13 VGG16 model summary

Fig. 14 ResNet50 model wrapper summary part 1

2, 3 and 4 for the CNN model only, while the other models use the default value. Figures 22 and 23 show the validation and test accuracy. A pattern similar to that of the kernel size can be seen: the best validation accuracy, 36%, is obtained when the pool size is 3, and the best test accuracy, 31%, when the pool size is 2.
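The dimension reduction performed by pooling can be sketched as follows. The example assumes max pooling with a stride equal to the pool size (non-overlapping windows) and a feature map of side 224; the chapter does not state the pooling type or stride, so both are assumptions.

```python
def pool_output_size(input_size, pool_size):
    """Output size along one dimension when the stride equals the pool size."""
    return (input_size - pool_size) // pool_size + 1

def max_pool_2d(feature_map, pool_size):
    """Max pooling over non-overlapping pool_size x pool_size windows."""
    out_h = pool_output_size(len(feature_map), pool_size)
    out_w = pool_output_size(len(feature_map[0]), pool_size)
    pooled = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            window = [
                feature_map[i * pool_size + di][j * pool_size + dj]
                for di in range(pool_size)
                for dj in range(pool_size)
            ]
            row.append(max(window))
        pooled.append(row)
    return pooled

# A toy 4 x 4 feature map, pooled with a 2 x 2 window.
toy_map = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
print(max_pool_2d(toy_map, 2))

# Output side for a 224-wide feature map at the three pool sizes.
for p in (2, 3, 4):
    print(p, pool_output_size(224, p))
```

A larger pool size shrinks the feature map more aggressively, which lowers the computation in later layers but discards more spatial detail.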

Fig. 15 ResNet50 model wrapper summary part 2

4.4 Effect of Epoch

The number of epochs is a neural network hyperparameter that controls how many complete passes the gradient descent procedure makes through the training dataset. In this experiment, three epoch values are used: 10, 20 and 50.
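As a rough illustration of what the epoch value implies, the sketch below counts the mini-batch weight updates performed for each epoch setting. The 700 training images follow from the 70% split of 1000 images; the batch size of 32 is an assumed value that the chapter does not report.

```python
import math

def weight_updates(num_train_images, batch_size, epochs):
    """Total mini-batch gradient updates performed over the given number of epochs."""
    steps_per_epoch = math.ceil(num_train_images / batch_size)
    return steps_per_epoch * epochs

NUM_TRAIN = 700   # 70% of the 1000 salak images
BATCH_SIZE = 32   # assumed value, not stated in the chapter

for epochs in (10, 20, 50):
    print(epochs, weight_updates(NUM_TRAIN, BATCH_SIZE, epochs))
```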

Fig. 16 ResNet50 model wrapper summary part 3

4.4.1 Effect of Epoch: CNN

Based on Fig. 24, the validation accuracy is highest, at 35%, when the epoch value is 10 or 50, and lowest, at 20%, when the epoch value is 20. As shown in Fig. 25, the test accuracy is 31% at 10 epochs, followed by a steady 27% at 20 and 50 epochs.

Fig. 17 ResNet50 model wrapper summary part 4

4.4.2 Effect of Epoch: VGG16

Figures 26 and 27 show the accuracy obtained on the validation and test datasets. The validation accuracy is highest, at 75.5%, when the epoch value is 10, compared with 69.5% at 20 epochs and 71% at 50 epochs. The test accuracy is 75% at 20 epochs, 73% at 50 epochs and 68% at 10 epochs.

Fig. 18 ResNet50 model wrapper summary part 5

4.4.3 Effect of Epoch: ResNet50

The validation and test accuracy are shown in Figs. 28 and 29. An epoch value of 10 gives the highest validation accuracy, 84%, and the accuracy decreases as the epoch value increases. The test accuracy peaks at 82% when the epoch value is 20.

Fig. 19 ResNet50 overall model summary

Fig. 20 CNN—effect of kernel size on validation accuracy

4.5 Effect of Optimizer

An optimizer is an algorithm that changes the attributes of a neural network, such as its weights and learning rate. The objective of the optimizer is to reduce the loss of the neural network by adjusting the

Fig. 21 CNN—effect of kernel size on test accuracy

Fig. 22 CNN—effect of pool size on validation accuracy

parameters of the neural network. In this experiment, four optimizers are used: Adam, SGD, Adadelta and Adagrad.
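The difference between optimizers lies in how each one turns a gradient into a weight update. The sketch below contrasts plain SGD with Adagrad on a toy one-parameter loss; the loss function, starting point, learning rate and step count are illustrative assumptions and are unrelated to the salak experiments.

```python
import math

def grad(w):
    """Gradient of the toy loss f(w) = (w - 3)^2."""
    return 2.0 * (w - 3.0)

def run_sgd(w, lr, steps):
    """Plain gradient descent: the step is the gradient scaled by a fixed rate."""
    for _ in range(steps):
        w -= lr * grad(w)
    return w

def run_adagrad(w, lr, steps, eps=1e-8):
    """Adagrad: the step is divided by the root of the accumulated squared gradients."""
    accum = 0.0  # running sum of squared gradients
    for _ in range(steps):
        g = grad(w)
        accum += g * g
        w -= lr * g / (math.sqrt(accum) + eps)
    return w

w_sgd = run_sgd(0.0, lr=0.1, steps=100)
w_adagrad = run_adagrad(0.0, lr=0.1, steps=100)
print(w_sgd, w_adagrad)
```

Because Adagrad keeps accumulating squared gradients, its effective step shrinks over time, so the same nominal learning rate can behave very differently from one optimizer to another.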

4.5.1 Effect of Optimizer: CNN

Figures 30 and 31 show the validation and test accuracy obtained with the different optimizers. The Adagrad optimizer gave the best validation accuracy, 67%,

Fig. 23 CNN—effect of Pool size on test accuracy

Fig. 24 CNN—effect of epoch on validation accuracy

while Adadelta gave 41.5%, Adam gave 35% and SGD gave 25%. For the test accuracy, Adam gave the highest value, 31%, compared with 25% for SGD, 19% for Adagrad and 17% for Adadelta.

Fig. 25 CNN—effect of epoch on test accuracy

Fig. 26 VGG16—effect of epoch on validation accuracy

4.5.2 Effect of Optimizer: VGG16

Figures 32 and 33 compare the validation and test accuracy of the VGG16 model. SGD is the best optimizer on the validation dataset, with 71%, followed by Adam and Adagrad with 69.5% each, and lastly Adadelta with 44%. On the test dataset, Adam gives the best accuracy, 76%, followed by SGD with 69%, Adagrad with 66% and Adadelta with 50%.

Fig. 27 VGG16—effect of epoch on test accuracy

Fig. 28 ResNet50—effect of epoch on validation accuracy

4.5.3 Effect of Optimizer: ResNet50

The effect of the optimizer on ResNet50 is shown in Figs. 34 and 35. For the validation accuracy, Adadelta gives the highest accuracy, 86.5%, while Adagrad gives 78%. Adam and SGD give the lowest accuracy, 25%, which is the chance level for four equally sized classes. For the test accuracy, Adagrad gives the best result, 82%. Adadelta also gives a fairly high accuracy of 79%, while Adam and SGD are again the lowest at 25%.

Fig. 29 ResNet50—effect of epoch on test accuracy

Fig. 30 CNN—effect of optimizer on validation accuracy

4.6 Effect of Learning Rate

The learning rate is a neural network hyperparameter that controls how much the model changes in response to the estimated error each time the weights are updated. Selecting the learning rate is a challenge, as too small a value results in a long training process, while too high a value makes the training process

Fig. 31 CNN—effect of optimizer on test accuracy

Fig. 32 VGG16—effect of optimizer on validation accuracy

unstable. Four learning rate values are used in this experiment: 0.1, 0.01, 0.001 and 0.0001.
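The trade-off between slow and unstable training can be reproduced on a toy problem. The sketch below runs gradient descent on a one-parameter quadratic loss with the four learning rates listed above; the curvature, starting point and step count are assumed values chosen only so that the largest rate diverges, and they say nothing about the salak models themselves.

```python
def final_distance(lr, curvature=15.0, w0=1.0, steps=50):
    """Distance from the minimum of f(w) = curvature * w^2 after gradient descent."""
    w = w0
    for _ in range(steps):
        # The gradient of curvature * w^2 is 2 * curvature * w.
        w -= lr * (2.0 * curvature * w)
    return abs(w)

results = {lr: final_distance(lr) for lr in (0.1, 0.01, 0.001, 0.0001)}
for lr, dist in results.items():
    print(lr, dist)
```

On this toy loss the largest rate overshoots the minimum on every step and diverges, the two smallest rates approach the minimum only slowly, and the intermediate rate converges quickly.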

Fig. 33 VGG16—effect of optimizer on test accuracy

Fig. 34 ResNet50—effect of optimizer on validation accuracy

4.6.1 Effect of Learning Rate: CNN

Figures 36 and 37 show the validation and test accuracy. The validation accuracy of the CNN peaks at 82.14% when the learning rate is 0.01. A learning rate of 0.1 gives an accuracy of 26.7%, followed by 25% for learning rates of 0.001 and 0.0001. The test accuracy shows a pattern similar to that of the validation

Fig. 35 ResNet50—effect of optimizer on test accuracy

Fig. 36 CNN—effect of learning rate on validation accuracy

accuracy. When the learning rate is 0.01, the test accuracy is highest, at 35%, followed by 25% when the learning rate is 0.1, 0.001 or 0.0001.

4.6.2 Effect of Learning Rate: VGG16

Figures 38 and 39 show the validation and test accuracy. The highest accuracy is 76%, obtained when the learning rate is 0.0001 for validation accuracy and

Fig. 37 CNN—effect of learning rate on test accuracy

0.001 for test accuracy. Overall, the results show that the higher the learning rate, the lower the accuracy.

Fig. 38 VGG16—effect of learning rate on validation accuracy

Fig. 39 VGG16—effect of learning rate on test accuracy

4.6.3 Effect of Learning Rate: ResNet50

Figures 40 and 41 show the accuracy obtained for ResNet50 at each learning rate. The results show a pattern similar to that of VGG16: the higher the learning rate, the lower the accuracy. The highest test and validation accuracy are both 83%, obtained at learning rates of 0.0001 and 0.001 respectively.

Fig. 40 ResNet50—effect of learning rate on validation accuracy

Fig. 41 ResNet50—effect of learning rate on test accuracy

4.7 Effect of Dense Layer

A dense layer is a fully connected neural network layer: every neuron in the dense layer receives inputs from all the neurons of the previous layer. In this experiment, four different numbers of dense layers are used: 1, 2, 3 and 4.
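Because every neuron is connected to every output of the previous layer, each extra dense layer adds a full weight matrix to the model. The sketch below counts the parameters of a classifier head with 1 to 4 hidden dense layers followed by the 4-class softmax output; the 512 input features and 128 units per layer are assumed sizes, not values reported in the chapter.

```python
def dense_param_count(input_features, hidden_units, num_classes):
    """Weights plus biases of a stack of dense layers followed by a softmax output."""
    total = 0
    fan_in = input_features
    for units in list(hidden_units) + [num_classes]:
        total += fan_in * units + units  # weight matrix plus one bias per unit
        fan_in = units
    return total

# Assumed sizes: 512 input features and 128 units per hidden dense layer.
for n_layers in (1, 2, 3, 4):
    print(n_layers, dense_param_count(512, [128] * n_layers, num_classes=4))
```

More dense layers therefore mean more trainable parameters, which on a dataset of 1000 images can make overfitting more likely.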

4.7.1 Effect of Dense Layer: CNN

Figures 42 and 43 show the effect of the number of dense layers on validation and test accuracy for the CNN. The results show that, in general, the accuracy decreases as the number of dense layers increases. The validation accuracy is 51% for 1 dense layer, followed by 33.5%, 28% and 31.5% for 2, 3 and 4 dense layers respectively. For the test accuracy, the highest value is 25%, followed by 16% and 23%.

4.7.2 Effect of Dense Layer: VGG16

Figures 44 and 45 show the validation and test accuracy. With 3 dense layers, the validation accuracy is highest at 76% and the test accuracy is highest at 77%. The lowest value for both is 25%, obtained with 2 dense layers.

Fig. 42 CNN—effect of dense layer on validation accuracy

Fig. 43 CNN—effect of dense layer on test accuracy

4.7.3 Effect of Dense Layer: ResNet50

Figures 46 and 47 show the validation and test accuracy. The validation accuracy rises as the number of dense layers increases, reaching its highest value of 86.5%. The test accuracy is highest at 82% and lowest at 72%.

Fig. 44 Effect of dense layer on validation accuracy

Fig. 45 Effect of dense layer on test accuracy

4.8 Effect of Fine-Tuning for Pre-trained Models (VGG16 and ResNet50)

Fine-tuning is a process that modifies the feature representation of a pretrained model to make it more suitable for a specific task, in this case the salak dataset. The fine-tuning steps involve unfreezing the top layers of a frozen pre-trained

Fig. 46 ResNet50—effect of dense layer on validation accuracy

Fig. 47 ResNet50—effect of dense layer on test accuracy

model base and attaching a few newly added classifier layers. The modified models must then be retrained to obtain the new weights and biases. As shown in Tables 1 and 2, the pretrained models achieve their best test accuracy when 100% of their layers are frozen. In general, ResNet-50 performs better than VGG-16 on this dataset, obtaining over 80% accuracy when 100% of the layers are frozen. Bold font refers to the best result. The test accuracy of both models declines after we unfreeze the layers. VGG-16 gets 25% accuracy at every nonzero unfrozen

Table 1 Validation accuracy

Unfrozen percentage of pre-trained models (%)    0       20      40      60      80      100
VGG-16                                           0.70    0.25    0.25    0.25    0.25    0.25
ResNet-50                                        0.83    0.80    0.81    0.68    0.71    0.87

Table 2 Test accuracy

Unfrozen percentage of pre-trained models (%)    0       20      40      60      80      100
VGG-16                                           0.76    0.25    0.25    0.25    0.25    0.25
ResNet-50                                        0.81    0.21    0.23    0.18    0.26    0.23

percentage, whereas ResNet-50 obtains high validation accuracy but very low test accuracy. This suggests that the ResNet-50 model overfits after we unfreeze the layers. In a nutshell, the pre-trained models perform better on the salak dataset when all the layers are frozen.
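The gap between validation and test accuracy that motivates this overfitting reading can be computed directly from Tables 1 and 2. The sketch below uses the tabulated values as they appear in the chapter; the function and variable names are illustrative.

```python
UNFROZEN = [0, 20, 40, 60, 80, 100]

# Accuracy values copied from Table 1 (validation) and Table 2 (test).
VALIDATION = {
    "VGG-16": [0.70, 0.25, 0.25, 0.25, 0.25, 0.25],
    "ResNet-50": [0.83, 0.80, 0.81, 0.68, 0.71, 0.87],
}
TEST = {
    "VGG-16": [0.76, 0.25, 0.25, 0.25, 0.25, 0.25],
    "ResNet-50": [0.81, 0.21, 0.23, 0.18, 0.26, 0.23],
}

def accuracy_gaps(model):
    """Validation minus test accuracy at each unfrozen percentage."""
    return [round(v - t, 2) for v, t in zip(VALIDATION[model], TEST[model])]

for model in VALIDATION:
    for pct, gap in zip(UNFROZEN, accuracy_gaps(model)):
        print(f"{model}: {pct}% unfrozen, gap {gap:+.2f}")
```

For ResNet-50 the gap is 0.02 with all layers frozen and between 0.45 and 0.64 once any layers are unfrozen, whereas VGG-16 shows no gap after unfreezing because both accuracies fall to 0.25.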

4.9 Accuracy Comparison

Within the epoch range of 10, 20 and 50, all three models have their highest validation accuracy when epoch = 10. The validation accuracy then drops at epoch = 20 and increases again at epoch = 50. As for the test accuracy, the graphs show that the pretrained models achieve their highest test accuracy when epoch = 20, whereas the CNN has its highest test accuracy when epoch = 10. This shows that the CNN reaches its highest test accuracy in fewer epochs than the pre-trained models (Figs. 48 and 49). From the two bar charts (Figs. 50 and 51), we can infer that Adadelta and Adagrad yield better validation accuracy for all the models. When comparing test accuracy, the pretrained models that use Adadelta and Adagrad give a higher test accuracy, whereas the CNN model gives a higher test accuracy with the Adam optimizer than with the other optimizers. In the learning rate charts (Figs. 52 and 53), validation accuracy and test accuracy share quite similar trends. The pretrained models show a decreasing trend in validation accuracy as the learning rate increases, while the CNN model has its optimal learning rate at 0.001 for both validation and test accuracy. The pretrained models also have their highest test accuracy at a learning rate of 0.001. Therefore, we can deduce that a learning rate of 0.001 works best for the salak dataset in this study. Based on the two graphs (Figs. 54 and 55), we can see that increasing the number of dense layers for the ResNet-50 pre-trained model increases its validation accuracy but decreases its test accuracy, while VGG-16 shares a similar pattern for validation and test accuracy, with the highest score when dense layer = 3 and the lowest when dense layer = 2. For the CNN model, we can see a decrease in

Fig. 48 VGG16, ResNet50 and CNN—effect of epoch on validation accuracy

Fig. 49 VGG16, ResNet50 and CNN—effect of epoch on test accuracy

validation accuracy and test accuracy when the number of dense layers is increased. Therefore, the CNN performs best when dense layer = 1. The best-performing model is ResNet-50 with 84% test accuracy, followed by the VGG-16 model with 77% test accuracy (Fig. 56). The CNN has the lowest test accuracy, 31%, for the salak dataset. The best combinations of parameters and hyperparameters for the three models are presented in Table 3.

Fig. 50 VGG16, ResNet50 and CNN—effect of optimizer on validation accuracy

Fig. 51 VGG16, ResNet50 and CNN—effect of optimizer on test accuracy

Fig. 52 VGG16, ResNet50 and CNN—effect of learning rate on validation accuracy

Fig. 53 VGG16, ResNet50 and CNN—effect of learning rate on test accuracy

Fig. 54 VGG16, ResNet50 and CNN—effect of dense layer on validation accuracy

Fig. 55 VGG16, ResNet50 and CNN—effect of dense layer on test accuracy

Fig. 56 VGG16, ResNet50 and CNN: comparison of best validation and test accuracy

Table 3 Best combination of parameters/hyperparameters for models

Models: Best combination of parameters/hyperparameters

VGG-16
• base_model: VGG-16 (100% frozen weight)
• 20 epochs
• Adam optimizer
• Learning rate of 0.001
• 3 dense layers

ResNet-50
• base_model: ResNet-50 (100% frozen weight)
• 20 epochs
• Adagrad optimizer
• Learning rate of 0.001
• 2 dense layers

CNN
• 10 epochs
• Adam optimizer
• Learning rate of 0.001
• 2 Convolutional layers
• 2 Pooling layers
• 1 hidden layer/dense layer
• Kernel size: (4, 4)
• Pool size: (2, 2)
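The CNN settings in Table 3 can be assembled into a Keras model along the following lines. This is a minimal sketch rather than the authors' implementation: the number of filters, the number of dense units, the use of max pooling and the categorical cross-entropy loss are assumptions, since the table does not specify them.

```python
from tensorflow.keras import layers, models, optimizers

def build_cnn(num_classes=4, filters=32, dense_units=64):
    """CNN following the Table 3 settings; filters and dense_units are assumed values."""
    model = models.Sequential([
        layers.Input(shape=(224, 224, 3)),            # images resized to 224 x 224
        layers.Conv2D(filters, kernel_size=(4, 4), activation="relu"),
        layers.MaxPooling2D(pool_size=(2, 2)),
        layers.Conv2D(filters, kernel_size=(4, 4), activation="relu"),
        layers.MaxPooling2D(pool_size=(2, 2)),
        layers.Flatten(),
        layers.Dense(dense_units, activation="relu"),     # the single hidden dense layer
        layers.Dense(num_classes, activation="softmax"),  # one output per salak class
    ])
    model.compile(
        optimizer=optimizers.Adam(learning_rate=0.001),
        loss="categorical_crossentropy",  # assumed loss for one-hot class labels
        metrics=["accuracy"],
    )
    return model

model = build_cnn()
model.summary()
```

Training would then call model.fit for 10 epochs on the batches returned by the image generators described in Sect. 4.1.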

5 Conclusion

In conclusion, the experiments were performed using a CNN model and two transfer learning models, VGG16 and ResNet50. The results are compared using test accuracy and validation accuracy to evaluate the performance of each model for each of the tuned parameters. The highest validation accuracy for each model is obtained at 10 epochs. The highest test accuracy for the transfer learning models (VGG16 and ResNet50) is obtained at 20 epochs, while for the CNN it is obtained at 10 epochs. ResNet50 has the highest test accuracy, 84%, compared with VGG16 and the CNN. The transfer learning models perform better than the CNN model. On this dataset there is overfitting, as the models perform well on the training data but poorly on the validation data, which is not used during training. Some future work can be done to increase the accuracy. For example, a sampling method can be used to split the dataset into train, validation and test datasets.

Image Processing Identification for Sapodilla Using Convolution Neural Network (CNN) and Transfer Learning Techniques

Ali Khazalah, Boppana Prasanthi, Dheniesh Thomas, Nishathinee Vello, Suhanya Jayaprakasam, Putra Sumari, Laith Abualigah, Absalom E. Ezugwu, Essam Said Hanandeh, and Nima Khodadadi

Abstract Image identification is a useful tool for classifying and organizing fruits in agribusiness. This study aims to use deep learning to construct a design for Sapodilla identification and classification. Sapodilla comes in a variety of types from throughout the world and can differ in size, form and taste depending on species and kind. The goal is to create a system that uses convolutional neural networks and transfer learning to extract features and determine the type of Sapodilla, so that the system can sort Sapodilla by type. This research uses a dataset of over 1000 pictures to demonstrate the classification of four different types of Sapodilla. The work was completed using Convolutional Neural Network (CNN) algorithms, a deep learning technology widely utilised in image classification. Deep learning-based classifiers have recently made it possible to distinguish Sapodilla in various images. Furthermore, we used different numbers of hidden layers and epochs to improve predictive performance. We also investigated transfer learning approaches for the classification of Sapodilla. The suggested CNN model improves on transfer learning techniques and state-of-the-art approaches in terms of results.

A. Khazalah · B. Prasanthi · D. Thomas · N. Vello · S. Jayaprakasam · P. Sumari · L. Abualigah (B)
School of Computer Sciences, Universiti Sains Malaysia, 11800 George Town, Pulau Pinang, Malaysia
e-mail: [email protected]

L. Abualigah
Hourani Center for Applied Scientific Research, Al-Ahliyya Amman University, Amman, Jordan
Faculty of Information Technology, Middle East University, Amman 11831, Jordan

A. E. Ezugwu
School of Mathematics, Statistics, and Computer Science, University of KwaZulu-Natal, King Edward Road, Pietermaritzburg 3201, KwaZulu-Natal, South Africa

E. S. Hanandeh
Department of Computer Information System, Zarqa University, Zarqa, Jordan

N. Khodadadi
Department of Civil and Environmental Engineering, Florida International University, Miami, FL, USA

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023
L. Abualigah (ed.), Classification Applications with Deep Learning and Machine Learning Technologies, Studies in Computational Intelligence 1071, https://doi.org/10.1007/978-3-031-17576-3_5

A. Khazalah et al.

Keywords Sapodilla · Deep learning · Convolution neural network · Transfer learning

1 Introduction

Sourcing skilled farm labour in the agricultural industry (particularly horticulture) is one of the most cost-demanding factors in the industry [1]. This is compounded by rising supply costs for things like power, water irrigation and genetically modified crops, among other things. As a result, farm businesses and the agricultural sector are being squeezed by low profit margins. Under these conditions, agricultural production must still meet the rising demands of an ever-increasing worldwide population, which poses a serious concern for the future. Sapodilla is a tropical fruit that can be found in South America as well as South Asia. In Malaysia, this fruit is better known as Ciku. One of the greatest issues in the farm fields is detecting Sapodilla and classifying its many types. Furthermore, this leads to higher prices [2]. As a result, we require an automation system that will reduce manual labour, improve productivity, and decrease maintenance cost and effort. Figure 1 shows different types of Sapodilla. Robotic cultivation has the potential to overcome this challenge by lowering labour expenses (owing to increased durability and predictability) while also improving crop productivity. For these reasons, over the last three decades there has been significant interest in using agricultural robots to harvest fruits [3]. The creation of such systems entails a variety of difficult tasks, such as manipulation and picking. Nevertheless, developing a precise fruit recognition system is a critical step towards a completely automated harvesting robot,

Fig. 1 Different types of Sapodilla

Image Processing Identification for Sapodilla Using Convolution Neural …

since it is the front-end perception technology that precedes the subsequent manipulation and grasping technologies; if a fruit is not recognised or even seen, it cannot be harvested [1]. This step is difficult due to a variety of circumstances, including lighting variations, occlusions, and cases in which the fruit has a similar visual appearance to the background. With the rapid advancement of human civilization, more emphasis has been placed on the quality of our lives, especially the foods we consume. Computer vision has become increasingly popular in personalized recommender technologies in recent years. Deep Neural Networks (DNNs) are often used to recognise fruits from photos in the areas of image recognition and classification [4], and DNNs outperform other machine learning algorithms. Convolutional Neural Networks (CNNs) are a type of neural network categorized as a deep learning algorithm. CNNs are now the most widely used type of network in deep learning and are employed in a variety of image processing analyses. The accuracy rates in certain areas, such as fruit classification using CNNs, have surpassed human abilities [5]. The framework of a CNN is remarkably similar to that of an ANN. Each layer of an ANN has many neurons, and the weighted sum of a layer's neurons, with a bias added, becomes the input of a neuron in the following layer. A layer in a CNN contains three elements [6]. The neurons of a convolutional layer are not fully connected; instead, each is linked to only part of the preceding layer. To train the classifier, a cost function is defined, which compares the network output with the expected outcome [7]. Deep learning approaches have made good advances in meeting these objectives in recent times. Fruit detection is a challenge that can be framed as a feature extraction problem. In the presented system, Convolutional Neural Networks (CNNs) are employed to detect fruit from photographs [5]. Compared with other studies, the suggested technique attempts to overcome the limitations of comparable fruit detection systems and achieve a high level of accuracy. The technology delivers functionality that is both simple and efficient [3]. Other optimization methods, such as those given in [8–13], can also be used to optimize such problems. To accomplish fruit identification by machine, we suggest CNN training, in which the computer must output the class of the fruit presented to the network, regardless of its color, number, texture, or other characteristics [14].

2 Literature Survey

Although several researchers have addressed the subject of fruit recognition, such as the work in [2, 15–17], the survey concluded that the difficulty of developing a quick and reliable fruit detector remains. This is owing to the variety of colors, shapes, sizes and textures of fruit, and to the constantly shifting lighting and shadow conditions in the bulk of these scenarios. The subject of fruit recognition as a feature extraction issue has been addressed

in many works in the literature (i.e., fruit vs. background). Apple recognition for yield estimation was investigated by Wang et al. [2]. They developed a system that could recognise apples primarily from their color and the structure of their specular reflections. Additional information, including the size distribution of apples, was used either to eliminate false detections or to split regions that might contain several apples. Another strategy used was to consider only detections from regions that were predominantly circular. Bac et al. [15] presented a classification method for sweet peppers. They employed 6 multi-spectral cameras and a variety of features, comprising raw spectral information, standardized precipitation indexes, and feature descriptors based on entropy. Experiments in a carefully controlled glasshouse setting showed that this method yielded fairly accurate segmentation. The authors noted, however, that it was not precise enough to create a reliable obstacle map. For almond identification, Hung et al. [16] advocated using conditional random fields (CRFs). They suggested a five-class segmentation method based on features learnt by a Sparse Autoencoder (SAE). These features were then applied within a CRF framework, which outperformed earlier research. They were able to segment the data quite well, but did not recognise individual objects. They also mentioned that occlusion was a significant difficulty. Intuitively, such a strategy can only handle modest amounts of occlusion. Yamamoto et al. [15], for example, used color-based segmentation to perform tomato identification. Then, using color and shape information, a Classification and Regression Trees (CART) classifier was trained. As a result, a classification map was created, which grouped connected pixels into regions. To limit the number of false alarms, a detector was applied to each region. They used a random forest to train a non-fruit classifier in controlled glasshouse conditions. A pixel-level segmentation approach has been used in all of the research described above, and the majority of these efforts have focused on fruit recognition primarily for yield estimation. In the few studies that have been carried out, fruit recognition has only been performed in controlled glasshouse conditions. All things considered, the issue of fruit detection in highly challenging conditions remains unsolved. This is because of the great variability in the appearance of the target objects in agricultural settings, which means that the classic sliding-window approaches, although showing good performance when tested on datasets of selected images, cannot handle the variability in scale and appearance of the target objects when deployed in real farm settings [2, 15]. Deep learning models have already made significant advances in the classification and recognition of objects. On PASCAL-VOC, the state-of-the-art recognition architecture is divided into two stages. The first stage of the pipeline uses a fully convolutional approach, such as feature extraction or edge boxes, to pick regions of interest from an image, which are then sent to a deep neural network for classification. This pipeline is computationally intensive, which prevents it from being used in real time for an engineering application, despite its high recall [16, 17]. Region Proposal Networks (RPNs)

Image Processing Identification for Sapodilla Using Convolution Neural …


solve these problems by combining a region proposal network with a classification network that shares its convolutional features, allowing the system to predict object bounds and classify them at each location simultaneously. Because the parameters of the two networks are shared, throughput is significantly higher, which makes the approach well suited to robotic applications. In real-world outdoor farm settings, a single sensor modality is seldom enough to detect the target fruits under varying lighting conditions, partial occlusion, and differing appearance. This makes a strong case for multi-modal fruit detection methods, since different types of sensors can offer complementary information on specific features of the fruit. Deep learning models have already shown considerable potential for multi-modal systems in fields other than agricultural automation, for example with combined audio/video data, which has been used very effectively, and with combined image modalities, which have outperformed each modality used separately. As shown in the following sections, this study takes the same approach and shows how a multi-modal, region-based fruit detection system outperforms pixel-level segmentation techniques [15–17]. A notable method for recognising fruits from images using deep convolutional neural networks is presented in [26]. The authors adapt a Faster Region-based CNN (Faster R-CNN) model for this purpose. The objective is to develop a neural network that autonomous robots can use to harvest fruit. RGB and NIR (near-infrared) images are used to train the network. The RGB and NIR models are combined in two different ways: early and late fusion. For early fusion, the input layer has four channels: three for the RGB image and one for the NIR image.
Late fusion trains two independent models and combines them by averaging the outputs of both. The result is a multi-modal network that performs much better than previous systems [26].
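The two ways of combining RGB and NIR data described above can be illustrated with a minimal NumPy sketch. The image sizes, the random inputs, and the class probabilities are hypothetical and are not taken from the cited work; the sketch only shows what a four-channel input and an averaged output look like.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical inputs: one 8x8 RGB image and the matching NIR image.
rgb = rng.random((8, 8, 3))
nir = rng.random((8, 8, 1))

# Early fusion: stack the modalities into a single 4-channel input
# (three RGB channels plus one NIR channel) for one network.
early_input = np.concatenate([rgb, nir], axis=-1)

def late_fusion(probs_rgb, probs_nir):
    """Average the class probabilities predicted by two independent models."""
    return (probs_rgb + probs_nir) / 2.0

# Hypothetical class probabilities from an RGB-only and a NIR-only model.
probs_rgb = np.array([0.7, 0.2, 0.1])
probs_nir = np.array([0.5, 0.4, 0.1])
fused = late_fusion(probs_rgb, probs_nir)

print(early_input.shape)
print(fused)
```

Early fusion lets a single network learn cross-modal features from the first layer onward, whereas late fusion keeps the two models independent and only merges their predictions.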

3 Proposed Deep Learning for Sapodilla Recognition

Artificial neural networks [18, 19] have produced the most successful results in image recognition and classification. Most deep learning methods are built on these networks. Deep learning [18] is a class of machine learning techniques that uses multiple layers of nonlinear processing units. Each layer learns to transform its input into a slightly more abstract and composite representation [19]. Deep learning models have outperformed other machine learning techniques, and in some domains they achieved the first superhuman image recognition [18]. This is reinforced by the view that deep learning is an important step towards strong artificial intelligence. In addition, deep learning models, particularly convolutional neural networks (CNNs), have been shown to give excellent performance in image recognition and classification.


A. Khazalah et al.

Fig. 2 CNN architecture

3.1 The Proposed CNN Architecture

A deep learning framework is used for the proposed model, which contains three convolutional layers. A group of pixels in an image might indicate an edge, a shadow, or some other pattern, and convolution is one method for detecting such patterns. During computation the image pixels are represented as a matrix. The framework of the CNN model is shown in Fig. 2. It comprises feature extraction and classification. Cropping removes unnecessary data from the input images, and all images are resized. Convolution and pooling layers are applied repeatedly to extract features. Each of the first two blocks contains one convolution layer and one max-pooling layer. To detect patterns, a “filter” matrix is multiplied with the image pixel matrix. Filter sizes may vary, and the multiplication depends on the filter size: a subset of the image pixel matrix matching the filter size is taken for convolution, starting from the first pixel. The convolution then moves on to the next pixel, and this process is repeated until all pixels in the image matrix have been covered. The pooling layer is the next kind of layer in the CNN. It reduces the size of the output, i.e. the feature map, and hence helps to avoid overfitting. A fully connected layer is used as the output layer. This layer flattens the output of the preceding layers into a feature vector that can be used as input for the following stage. Figure 3 shows the trained images of the CNN model.
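The sliding-filter and pooling operations described above can be sketched in a few lines of NumPy. This is an illustration only, not the chapter's Keras model: the 6 × 6 input, the vertical-edge filter, stride 1, no padding, and the 2 × 2 pooling window are all assumptions chosen to keep the arithmetic easy to follow.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide the kernel over the image (stride 1, no padding); at each
    position multiply element-wise and sum. As in most CNN libraries,
    the kernel is not flipped."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool2d(feature_map, size=2):
    """Non-overlapping max pooling; rows and columns that do not fill
    a complete window are dropped."""
    h = feature_map.shape[0] // size
    w = feature_map.shape[1] // size
    trimmed = feature_map[:h * size, :w * size]
    return trimmed.reshape(h, size, w, size).max(axis=(1, 3))

image = np.arange(36, dtype=float).reshape(6, 6)   # hypothetical 6x6 image
kernel = np.array([[1.0, 0.0, -1.0]] * 3)          # simple vertical-edge filter
features = conv2d_valid(image, kernel)             # feature map, shape (4, 4)
pooled = max_pool2d(features)                      # reduced map, shape (2, 2)
flat = pooled.flatten()                            # vector for a dense layer
```

The pooled map has a quarter of the entries of the feature map, which is the size reduction the text refers to.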

3.2 Transfer Learning Model

Transfer learning is a machine learning technique in which a model developed for one task is used as the starting point for a model on another task. This approach works well when the available data is limited, and the model converges quickly. Transfer learning can be applied to image classification in several ways. We began by loading the pre-trained models and discarding the final layer. We set the remaining layers to non-trainable once the final layer had been


Fig. 3 Trained images of CNN model

Fig. 4 Transfer learning

removed. Then, at the end of the network, we added a new dense layer whose number of units equals the number of sapodilla types we wish to predict. Figure 4 shows the transfer learning.
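The freeze-and-replace procedure can be sketched without any deep learning library. In the NumPy example below, a fixed random projection stands in for the pre-trained base (an assumption made purely for illustration; the chapter uses pre-trained Keras models), and only a new softmax "dense" head with four outputs is trained. All data are synthetic.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_inputs, n_features, n_classes = 40, 12, 8, 4

# Stand-in for a pre-trained base: fixed weights that are never updated.
base_weights = rng.normal(size=(n_inputs, n_features))
base_snapshot = base_weights.copy()

def frozen_base(x):
    """Feature extractor with frozen weights (ReLU of a fixed projection),
    followed by row normalisation."""
    feats = np.maximum(x @ base_weights, 0.0)
    norms = np.linalg.norm(feats, axis=1, keepdims=True)
    return feats / np.maximum(norms, 1e-12)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def cross_entropy(probs, labels):
    return -np.mean(np.log(probs[np.arange(len(labels)), labels] + 1e-12))

x = rng.normal(size=(n_samples, n_inputs))        # synthetic inputs
y = rng.integers(0, n_classes, size=n_samples)    # synthetic labels
features = frozen_base(x)

# New dense head: the only trainable parameters, one output per class.
head = np.zeros((n_features, n_classes))
one_hot = np.eye(n_classes)[y]
initial_loss = cross_entropy(softmax(features @ head), y)
for _ in range(200):
    probs = softmax(features @ head)
    grad = features.T @ (probs - one_hot) / n_samples
    head -= 0.5 * grad                            # gradient step on the head only
final_loss = cross_entropy(softmax(features @ head), y)
```

Because gradients are applied only to `head`, the base weights stay exactly as loaded, which is what setting layers to non-trainable achieves in a framework.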

3.2.1 VGG16

VGG16 is a 16-layer network made up of convolutional and fully connected layers. For simplicity, only 3 × 3 convolutional layers are stacked on top of one another. The first and second convolutional layers each have 64 kernel filters of size 3 × 3. As the input image passes through the first and second convolutional layers, its dimensions become 224 × 224 × 64. The output is then passed to a max-pooling layer with a stride of 2. The third and fourth convolutional layers have 128 kernel filters of size 3 × 3. These two layers are followed by a 2 × 2 max-pooling layer, and the output is reduced to 56 × 56 × 128. The fifth, sixth, and seventh layers are convolutional layers with 3 × 3 kernels, and all three use 256 feature maps. These layers are followed by a max-pooling layer with a stride of 2.


Layers eight to thirteen form two sets of convolutional layers with 3 × 3 kernels, and every layer in these sets has 512 kernel filters. Each set is followed by a max-pooling layer with a stride of 2. Figure 5 shows the architecture of VGG16.

Fig. 5 Architecture of VGG16
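The layer-by-layer dimensions can be checked with a short shape calculation. The sketch below assumes the standard VGG16 configuration (3 × 3 convolutions with "same" padding and a 2 × 2 max pooling with stride 2 after each of the five blocks); the function names are illustrative and no weights are involved.

```python
def conv_same(shape, filters):
    """3x3 convolution with 'same' padding: spatial size kept, channels = filters."""
    h, w, _ = shape
    return (h, w, filters)

def max_pool(shape, stride=2):
    """2x2 max pooling: spatial size divided by the stride."""
    h, w, c = shape
    return (h // stride, w // stride, c)

# (number of 3x3 conv layers, filters) for the five convolutional blocks
VGG16_BLOCKS = [(2, 64), (2, 128), (3, 256), (3, 512), (3, 512)]

def trace_shapes(input_shape=(224, 224, 3)):
    """Return the output shape after the pooling layer of each block."""
    shape = input_shape
    shapes_after_pool = []
    for n_convs, filters in VGG16_BLOCKS:
        for _ in range(n_convs):
            shape = conv_same(shape, filters)
        shape = max_pool(shape)
        shapes_after_pool.append(shape)
    return shapes_after_pool

shapes = trace_shapes()
n_conv_layers = sum(n for n, _ in VGG16_BLOCKS)
for block, shape in enumerate(shapes, start=1):
    print(f"block {block}: {shape}")
```

The thirteen convolutional layers counted here, together with three fully connected layers, give the sixteen weight layers in the network's name.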

3.2.2 VGG19

VGG19 is a newer VGG architecture and looks very similar to VGG16. Comparing its structure with that of VGG16, we see that both are built on five convolutional blocks. However, the depth of the network was increased further by adding a convolutional layer to each of the last three blocks. The input is again an RGB image of shape (224, 224, 3), and the output is a feature vector with the same structure (224, 224, 3). VGG19 has its own preprocessing function in Keras; however, if we look at the source, we see that it is exactly the same as that of VGG16. As a result, we do not have to redefine anything.

3.2.3 MobileNet

MobileNet is a neural network used for classification, detection, and other common tasks. MobileNets are quite small, about 17 MB, which allows them to be used in mobile apps. They are built on a streamlined architecture that uses depthwise separable convolutions to build lightweight deep neural networks. These depthwise separable convolutions have a factorisation effect that reduces the size of the model and speeds up execution. MobileNets can be used in a variety of applications, including object detection, fine-grained classification, and face attribute classification, and MobileNet is a strong network for image recognition. In this project, we replaced the very last layer of MobileNet with one sized to the number of classes in our model.
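MobileNet's published design is built on depthwise separable convolutions, and the size reduction they bring can be seen from a simple parameter count. The layer sizes below (3 × 3 kernel, 128 input channels, 256 output channels) are hypothetical, and bias terms are ignored.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution."""
    return k * k * c_in * c_out

def depthwise_separable_params(k, c_in, c_out):
    """Weights in a depthwise k x k convolution (one filter per input
    channel) followed by a 1 x 1 pointwise convolution."""
    depthwise = k * k * c_in
    pointwise = c_in * c_out
    return depthwise + pointwise

std = standard_conv_params(3, 128, 256)
sep = depthwise_separable_params(3, 128, 256)
ratio = sep / std   # equals 1/c_out + 1/k**2

print(std, sep, round(ratio, 3))
```

For this layer the separable version needs roughly one ninth of the weights, which is where the smaller model size and faster execution come from.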

3.3 Dataset

The dataset contains images of four types of sapodilla (ciku): ciku Subang, ciku Mega, ciku Jantung, and ciku Betawi. The images in each class show ciku of various sizes, and the backgrounds are not uniform. The same type of ciku appears in a variety of poses and viewpoints, including side and back views, different backgrounds, partially cut, sliced on a plate, cut into pieces, and with the seeds showing. The ciku may be fresh, rotten, or packed in bunches. Many photos have poor or unusual lighting, or show fruit that is covered with a net, decorated, or still on the tree among leaves. The dataset consists of more than 1000 images. Figure 6 shows sample dataset images. Table 1 shows the dataset description.


Fig. 6 Sample dataset images

Table 1 Dataset description

Class (label)    Number of images
Ciku Subang      250
Ciku Mega        250
Ciku Betawi      250
Ciku Jantung     250

3.4 Augmentation

The performance of deep learning neural networks often improves with the amount of data available. Data augmentation is a technique for artificially creating new training data from existing training data. This is done by applying domain-specific techniques to examples from the training data to create new and different training examples. Image data augmentation is perhaps the best-known type of data augmentation; it creates transformed versions of images in the training dataset that belong to the same class as the original image. The transformations include shifts, rotations, zooms, and other operations from the field of image manipulation. The goal is to add new, plausible examples to the training set, that is, variations of the training images that the model is likely to see. A horizontal flip of a sapodilla photo, for instance, makes sense because the photo could have been taken from either the left or the right. A vertical flip of a sapodilla image makes little sense and is probably not appropriate, since the model is unlikely to see an upside-down sapodilla photo. The specific data augmentation techniques used for a training dataset must therefore be chosen carefully, taking into account the training set and knowledge of the problem domain. Furthermore, experimenting


Fig. 7 Images that are augmented

with data augmentation approaches alone and in combination, to determine whether they lead to a measurable gain in model performance, can be beneficial. Modern deep learning methods, including the convolutional neural network (CNN) [17], can learn features that are invariant to their location in the image. Nonetheless, augmentation can further support this transform-invariant approach to learning by helping the model learn features that are also invariant to transforms such as left-to-right and top-to-bottom ordering and light levels in photos. Image data augmentation is usually applied only to the training set, not to the validation or test sets. This differs from pre-processing such as image cropping and pixel rescaling, which must be applied consistently to all datasets that interact with the model. Figure 7 shows the augmented images.
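A minimal NumPy sketch of two such transformations, a horizontal flip and a horizontal shift, is given below. The tiny 4 × 5 image, the zero padding, and the maximum shift of two pixels are assumptions for illustration; the chapter does not list the exact augmentation settings it used.

```python
import numpy as np

def horizontal_flip(image):
    """Mirror the image left to right (axis 1 is the width)."""
    return image[:, ::-1, :]

def shift_right(image, pixels):
    """Shift the image right by `pixels` columns, padding the left edge with zeros."""
    if pixels == 0:
        return image.copy()
    shifted = np.zeros_like(image)
    shifted[:, pixels:, :] = image[:, :-pixels, :]
    return shifted

def augment(image, rng, max_shift=2):
    """Return a randomly flipped and shifted copy; the class label is unchanged."""
    out = horizontal_flip(image) if rng.random() < 0.5 else image
    return shift_right(out, int(rng.integers(0, max_shift + 1)))

rng = np.random.default_rng(0)
image = np.arange(4 * 5 * 3, dtype=float).reshape(4, 5, 3)  # hypothetical tiny image
augmented = [augment(image, rng) for _ in range(4)]         # training images only
```

In line with the text, such a function would be applied to training images only, while validation and test images receive just the shared pre-processing.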

4 Performance Result

To remove unnecessary information, the images in the dataset are normalised, resized, and cropped. The data is split into two parts, training and validation, with 80% used for training and 20% for validation.
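An 80/20 split of this kind can be sketched as a shuffled index split. The function name, the fixed seed, and the use of 1000 items (matching the dataset size reported in the chapter) are assumptions; the authors' own splitting code is not given.

```python
import numpy as np

def train_validation_split(n_items, validation_fraction=0.2, seed=0):
    """Shuffle item indices and split them into training and validation sets."""
    rng = np.random.default_rng(seed)
    indices = rng.permutation(n_items)
    n_val = int(round(n_items * validation_fraction))
    return indices[n_val:], indices[:n_val]

train_idx, val_idx = train_validation_split(1000)
print(len(train_idx), len(val_idx))
```

Shuffling before splitting keeps each class represented in both parts when the images are stored class by class.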

4.1 Experimental Setup

We conducted comprehensive tests to examine classifier performance on features related to skin colour, texture, and structure, using numerous independently selected image samples, in order to determine the best subset of features and classification techniques. The sample used was well balanced, with 1000 images in total: 250 Ciku Mega, 250 Ciku Jantung, 250 Ciku Subang, and 250 Ciku Betawi. All images in this dataset were normalised. To develop the proposed approach, we used the Python programming language (specifically, the Keras modules). We


Fig. 8 Own model

employed several models: our own model built from scratch and several existing image classification models used for transfer learning, as detailed below. Figure 8 shows the proposed model. Figure 9 shows the VGG16 model. Figure 10 shows the VGG19 model.

4.2 Performance of Proposed CNN Model

The training configuration comprises a 20% validation split, a learning rate, a batch size of 500, and 20 epochs. After training on the sapodilla training dataset, the model was evaluated on the test sample. The model’s accuracy is 0.54. Figure 11 shows the plot of training accuracy and validation accuracy.

4.2.1 Effect of Optimizers

The neural network is trained by repeatedly updating the parameters of all nodes in the network, and the optimizer plays a key role in this process. The gradient

Fig. 9 VGG16

Fig. 10 VGG19

Fig. 11 Plot of training accuracy and validation accuracy

descent technique is a popular choice for CNN optimization. Furthermore, the classification results obtained with the Adam optimizer are somewhat better than those of many other adaptive optimization methods. Adam stores an exponentially decaying average of past gradients ($a_t$), which represents the first moment (mean), and of past squared gradients ($u_t$), which represents the second moment (variance). With $d_t$ denoting the gradient at step $t$, $a_t$ and $u_t$ are calculated as follows:

\[ a_t = \beta_1 a_{t-1} + (1 - \beta_1) d_t \]
\[ u_t = \beta_2 u_{t-1} + (1 - \beta_2) d_t^2 \]

The performance of the deep CNN with several optimization methods was compared quantitatively. With the Adam optimizer, the accuracy rose to 0.99 at 30 epochs. Figure 12 shows the trained CNN model. Figure 13 shows the accuracy results after the training.
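The two moment updates can be turned into a complete parameter update with a few extra lines. The sketch below uses the chapter's symbols (a for the first moment, u for the second, d for the gradient); the bias-correction step and the final update rule follow the standard published form of Adam and are not stated in the text, and the learning rate, the default beta values, and the test function f(x) = x² are assumptions.

```python
import math

def adam_step(param, grad, a_prev, u_prev, t,
              lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a scalar parameter.

    a: running mean of gradients (first moment)
    u: running mean of squared gradients (second moment)
    """
    a = beta1 * a_prev + (1 - beta1) * grad
    u = beta2 * u_prev + (1 - beta2) * grad ** 2
    # Bias correction (standard Adam, not given in the text above).
    a_hat = a / (1 - beta1 ** t)
    u_hat = u / (1 - beta2 ** t)
    new_param = param - lr * a_hat / (math.sqrt(u_hat) + eps)
    return new_param, a, u

# First step on f(x) = x**2 from x = 5, where the gradient is 2x = 10.
x, a, u = 5.0, 0.0, 0.0
x1, a1, u1 = adam_step(x, 2 * x, a, u, t=1)
print(x1, a1, u1)
```

On the first step the bias-corrected ratio is close to the sign of the gradient, so the parameter moves by almost exactly the learning rate regardless of the gradient's magnitude.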

Fig. 12 Trained CNN model


Fig. 13 After training

Table 2 Performance of all the models using dense layers

Model                 Dense layers   Accuracy
CNN model             3              0.54
CNN model (Adam)      4              0.99
Transfer learning 1   3              0.45
VGG16                 3              0.57
VGG19                 3              0.70
MobileNet             3              0.65

4.2.2 Effect of Dense Layer

We added dense layers and used a compact convolutional neural network to speed up the feed-forward operation. Table 2 shows the performance of all the models using dense layers. The number of dense layers has a strong effect: with four dense layers the accuracy is 0.99, whereas with three dense layers the accuracy is lower.

4.2.3 Effect of Filter Number

In the first experiment, three convolutional layers were used, each with 32 filters of size 3 × 3 pixels. In the second experiment, the number of filters was increased from 32 to 64 for the three convolutional layers, with the same 3 × 3 filter size. In the third experiment, 128 filters of size 3 × 3 pixels were used. Table 3 shows the number of filters used. The runtime was also affected by the filter size and the number of filters. Table 3 shows that the highest accuracy (0.99) was obtained with 64 filters.

Table 3 Filter size

Model                 Dense layers   Filter size (number of filters)   Accuracy
CNN model             3              32                                0.54
CNN model (Adam)      4              64                                0.99
Transfer learning 1   3              128                               0.45
VGG16                 3              32                                0.57
VGG19                 3              32                                0.70
MobileNet             3              32                                0.65
Inception             1              32                                0.27
Xception              1              32                                0.24

4.2.4 Effect of Number of Epochs

Each unit in the hidden layers of the neural network contributes a flexible decision boundary. Because these boundaries are adaptable, we try to make them fit the characteristics of the data. The boundary of each hidden unit is composed of a number of lines, so we adjust the weights of these lines to change the shape of the boundary. Figure 14 and Table 4 show the accuracy for different numbers of epochs. The number of epochs determines how many times the network’s parameters are updated. As the number of epochs grows, the parameters are updated more times, and the fit moves from underfitting to optimal to overfitting. In this experiment, with 30 epochs the accuracy of the model increased to 0.99.

Fig. 14 Number of epochs


Table 4 Number of epochs

Model                 Dense layers   Epochs   Accuracy
CNN model             3              20       0.54
CNN model (Adam)      4              30       0.99
Transfer learning 1   3              20       0.45
VGG16                 3              25       0.57
VGG19                 3              25       0.70
MobileNet             3              20       0.65
Inception             1              100      0.27
Xception              1              100      0.24

4.2.5 Effect of Learning Rate

The amount by which the parameters are changed during training is known as the step size or learning rate. Figure 15 shows the learning rates. The learning rate is a configurable hyperparameter with a small positive value, usually between 0.0 and 1.0, used in the training of neural networks. It determines how quickly the model adapts to the problem. Because each update changes the parameters only slightly, smaller learning rates require more training epochs, whereas larger learning rates require fewer. A high learning rate lets the network learn more quickly, but at the price of a sub-optimal final set of weights. A lower learning rate may allow the model to reach a more optimal, or even globally optimal, set of weights, but training takes considerably longer.

Fig. 15 Learning rates
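The trade-off can be made concrete with plain gradient descent on a one-dimensional toy problem. The function f(w) = w², the tolerance, and the three learning rates below are assumptions chosen for illustration and are unrelated to the values used in the chapter's experiments.

```python
def steps_to_converge(lr, tol=1e-3, max_steps=1000):
    """Run gradient descent on f(w) = w**2 (gradient 2w) from w = 1.0 and
    return the number of steps until |w| < tol, or None if never reached."""
    w = 1.0
    for step in range(1, max_steps + 1):
        w -= lr * 2 * w
        if abs(w) < tol:
            return step
    return None

small = steps_to_converge(0.01)     # small steps: many updates needed
large = steps_to_converge(0.1)      # larger steps: far fewer updates
too_large = steps_to_converge(1.1)  # overshoots on every step and diverges

print(small, large, too_large)
```

The smaller rate needs roughly ten times as many updates as the larger one, while the excessive rate never settles at all.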


Table 5 Comparison of accuracy

Model                 Dense layers   Filter size   Epochs   Accuracy
CNN model             3              32            20       0.54
Proposed model        4              64            30       0.99
Transfer learning 1   3              128           20       0.45
VGG16                 3              32            25       0.57
VGG19                 3              32            24       0.70
MobileNet             3              32            20       0.65
Inception             1              32            100      0.27
Xception              1              32            100      0.24

Fig. 16 Bar chart representing accuracy scores

4.3 Accuracy Comparison

Compared with deeper networks, CNNs with fewer layers have lower hardware requirements and shorter training times. Table 5 shows the comparison of accuracy. Short training times allow more parameters to be tested and make the whole development process easier. Reduced computational requirements also allow for higher image resolution. The best model is the one trained with the Adam optimizer, which achieved an accuracy of 0.99. Figure 16 shows the bar chart representing accuracy scores.

5 Conclusion

This study develops a deep convolutional neural network for sapodilla recognition and classification, and describes a system that performs automated detection of sapodilla types. On the dataset used, the CNN approach performs very well. The


technique could be used to train on a wider range of sapodilla types in future applications. Future work may also examine the effects of other variables such as optimizers, epochs, dense layers, learning rates, and pooling functions. We additionally ran several quantitative tests using the Keras library to classify the photos based on their content. With the aid of the proposed convolutional neural network, the method can simplify the task of classifying the type of sapodilla and reduce human error in sapodilla classification. The proposed CNN achieves an accuracy of 99%.

References

1. ABARE. (2015). Australian vegetable growing farms: An economic survey, 2013–14 and 2014–15. Australian Bureau of Agricultural and Resource Economics (ABARE), Canberra, Australia. Research report.
2. Abualigah, L., Al-Okbi, N. K., Elaziz, M. A., & Houssein, E. H. (2022). Boosting marine predators algorithm by salp swarm algorithm for multilevel thresholding image segmentation. Multimedia Tools and Applications, 81(12), 16707–16742.
3. Palakodati, S. S. S., Chirra, V. R., Dasari, Y., & Bulla, S. (2020). Fresh and rotten fruits classification using CNN and transfer learning. Revue d’Intelligence Artificielle, 34(5), 617–622. https://doi.org/10.18280/ria.340512
4. Sakib, S., Ashrafi, Z., & Siddique, M. A. (2019). Implementation of fruits recognition classifier using convolutional neural network algorithm for observation of accuracies for various hidden layers. ArXiv, abs/1904.00783.
5. Mettleq, A. S. A., Dheir, I. M., Elsharif, A. A., & Abu-Naser, S. S. (2020). Mango classification using deep learning. International Journal of Academic Engineering Research (IJAER), 3(12), 22–29.
6. Rojas-Aranda, J. L., Nunez-Varela, J. I., Cuevas-Tello, J. C., & Rangel-Ramirez, G. (2020). Fruit classification for retail stores using deep learning. In Proceedings of pattern recognition 12th Mexican conference, Morelia, Mexico (pp. 3–13).
7. Risdin, F., Mondal, P., & Hassan, K. M. (2020). Convolutional neural networks (CNN) for detecting fruit information using machine learning techniques.
8. Abualigah, L., Diabat, A., Mirjalili, S., Abd Elaziz, M., & Gandomi, A. H. (2021). The arithmetic optimization algorithm. Computer Methods in Applied Mechanics and Engineering, 376, 113609.
9. Abualigah, L., Yousri, D., Abd Elaziz, M., Ewees, A. A., Al-Qaness, M. A., & Gandomi, A. H. (2021). Aquila optimizer: A novel meta-heuristic optimization algorithm. Computers and Industrial Engineering, 157, 107250.
10. Abualigah, L., Abd Elaziz, M., Sumari, P., Geem, Z. W., & Gandomi, A. H. (2022). Reptile search algorithm (RSA): A nature-inspired meta-heuristic optimizer. Expert Systems with Applications, 191, 116158.
11. Agushaka, J. O., Ezugwu, A. E., & Abualigah, L. (2022). Dwarf mongoose optimization algorithm. Computer Methods in Applied Mechanics and Engineering, 391, 114570.
12. Oyelade, O. N., Ezugwu, A. E. S., Mohamed, T. I., & Abualigah, L. (2022). Ebola optimization search algorithm: A new nature-inspired metaheuristic optimization algorithm. IEEE Access, 10, 16150–16177.
13. Ezugwu, A. E., Agushaka, J. O., Abualigah, L., Mirjalili, S., & Gandomi, A. H. (2022). Prairie dog optimization algorithm. Neural Computing and Applications, 1–49.
14. Álvarez-Canchila, O. I., Arroyo-Pérez, D. E., Patino-Saucedo, A., González, H. R., & Patiño-Vanegas, A. (2020). Colombian fruit and vegetables recognition using convolutional neural networks and transfer learning.
15. Otair, M., Abualigah, L., & Qawaqzeh, M. K. (2022). Improved near-lossless technique using the Huffman coding for enhancing the quality of image compression. Multimedia Tools and Applications, 1–21.
16. Liu, Q., Li, N., Jia, H., Qi, Q., & Abualigah, L. (2022). Modified remora optimization algorithm for global optimization and multilevel thresholding image segmentation. Mathematics, 10(7), 1014.
17. Lin, S., Jia, H., Abualigah, L., & Altalhi, M. (2021). Enhanced slime mould algorithm for multilevel thresholding image segmentation using entropy measures. Entropy, 23(12), 1700.
18. Ciresan, D. C., Meier, U., Masci, J., Gambardella, L. M., & Schmidhuber, J. (2011). Flexible, high performance convolutional neural networks for image classification. In Proceedings of the twenty-second international joint conference on artificial intelligence, Volume Two, IJCAI’11 (pp. 1237–1242). AAAI Press.
19. Srivastava, R. K., Greff, K., & Schmidhuber, J. (2015). Training very deep networks. CoRR abs/1507.06228.

Comparison of Pre-trained and Convolutional Neural Networks for Classification of Jackfruit Artocarpus integer and Artocarpus heterophyllus

Song-Quan Ong, Gomesh Nair, Ragheed Duraid Al Dabbagh, Nur Farihah Aminuddin, Putra Sumari, Laith Abualigah, Heming Jia, Shubham Mahajan, Abdelazim G. Hussien, and Diaa Salama Abd Elminaam

S.-Q. Ong · G. Nair · R. D. A. Dabbagh · N. F. Aminuddin · P. Sumari · L. Abualigah (B)
School of Computer Sciences, Universiti Sains Malaysia, 11800 George Town, Pulau Pinang, Malaysia
e-mail: [email protected]

L. Abualigah
Hourani Center for Applied Scientific Research, Al-Ahliyya Amman University, Amman, Jordan
Faculty of Information Technology, Middle East University, Amman 11831, Jordan

H. Jia
School of Information Engineering, Sanming University, Sanming 365004, China

S. Mahajan
School of Electronics and Communication, Shri Mata Vaishno Devi University, Katra, Jammu and Kashmir 182320, India

A. G. Hussien
Department of Computer and Information Science, Linköping University, 581 83 Linköping, Sweden
Faculty of Science, Fayoum University, Faiyum 63514, Egypt

D. S. A. Elminaam
Information Systems Department, Faculty of Computers and Artificial Intelligence, Benha University, Benha 12311, Egypt
Computer Science Department, Faculty of Computer Science, Misr International University, Cairo 12585, Egypt

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023
L. Abualigah (ed.), Classification Applications with Deep Learning and Machine Learning Technologies, Studies in Computational Intelligence 1071, https://doi.org/10.1007/978-3-031-17576-3_6

Abstract Cempedak (Artocarpus integer) and nangka (Artocarpus heterophyllus) are highly similar in their external appearance and are difficult for a human to tell apart visually. Both are also commonly called jackfruit. Computer vision and deep convolutional neural networks (DCNN) can provide an excellent solution for recognizing the fruits. Although several studies have demonstrated the application of DCNN and transfer learning in fruit recognition systems, previous studies did not solve two crucial problems: classification of fruit down to species level, and comparison of pre-trained CNNs in transfer learning. In this study, we aim to construct a recognition system for cempedak and nangka, and compare the performance of proposed DCNN


S.-Q. Ong et al.

architecture and transfer learning with five pre-trained CNNs. We also compared the effect of optimizers and three levels of epoch on the performance of the model. In general, transfer learning with a pre-trained VGG16 neural network provides higher performance for the dataset; the dataset performed better with the SGD optimizer than with ADAM.

Keywords Cempedak · Nangka · Deep learning · Computer vision · Optimization

1 Introduction

The “Nangka” (Artocarpus heterophyllus) and “Cempedak” (Artocarpus integer), shown in Fig. 1, are tropical fruits common in Southeast Asia. People often unwittingly call both of them jackfruit. Both fruits belong to the genus Artocarpus, which is characterised by an irregular oval, slightly curved shape and a large size. The skin is distinguished by its sharp thorns, although sometimes these thorns do not come to a point. The skin turns yellow when the fruit is ripe or old [1]. From a distance the two are very difficult to tell apart, and it is easier to approach them and examine them closely. Even so, the outward appearance of the fruit makes this a distinct challenge [2]. In many respects cempedak is similar to nangka; however, cempedak is smaller and has a thinner peduncle. Nangka may range in size from 8 inches to 3 ft (20–90 cm) long and 6 to 20 in. (15–50 cm) broad, with weights ranging from 10 to 60 pounds or even more (4.5–20 or 50 kg). When mature, the ‘skin’, or exterior of the compound fruit, is green or yellow, with many hard, conical points attached to a thick, rubbery, pale yellow or whitish wall [2]. Cempedak range in size from 10 to 15 cm wide and 20 to 35 cm long, and can be cylindrical or oval. The thin, leathery skin is greenish, yellowish, or brown, and is patterned with pentagons that have raised bumps or flattened eye facets [3, 4]. Odour and the texture of the fruit bundles are the most common means of distinguishing cempedak from nangka: cempedak usually has a stronger smell and a softer texture. Due to the similarities between classes and inconsistent features within a cultivar, fruit and vegetable classification presents significant problems [3, 5].
It is common to mistake cempedak for nangka, and vice versa, based on size and sometimes fragrance; to the naked eye, the two fruits are often deceptively alike. Although this may not be a major issue, this report proposes distinguishing between the two using DCNN and transfer learning algorithms. Methods for quality assessment and automated harvesting of fruits and vegetables have been created, but the latest technologies have been created for limited classes and small data sets. Applying a DCNN often requires trying different algorithms to find the best-fitting model, and so far no accuracy results have been reported for distinguishing between cempedak and nangka.

Comparison of Pre-trained and Convolutional …


Fig. 1 Sample of images of “Nangka” (Artocarpus heterophyllus) and “Cempedak” (Artocarpus integer)

This research aims to use multimodal information retrieval to identify cempedak and nangka fruit accurately. The objectives of the research are:

(a) To construct a DCNN classification system for cempedak (Artocarpus integer) and nangka (Artocarpus heterophyllus)
(b) To compare the performance of a customized DCNN and transfer learning with the pre-trained CNNs Xception, VGG16, VGG19, ResNet50 and InceptionV3.

2 Literature Review

Due to the similarities between classes and inconsistent features within a cultivar, fruit and vegetable classification presents significant problems [6–8]. Because of the wide diversity of each type, the selection of appropriate data collection sensors and feature representation methods is particularly critical [9–12]. Methods for quality assessment and automated harvesting of fruits and vegetables have been created, but the latest technologies have been created for limited classes and small data sets. The problem is multidimensional, with many hyper-dimensional properties, which is one of the fundamental problems in current machine learning techniques [13]. The authors of that study concluded that machine vision methods are ineffective when dealing with multi-characteristic, hyperdimensional data for classification. Fruits and vegetables are divided into several groups, each with its own set of characteristics. Because basic data sets are scarce, specific classification methods are limited, and most trials are restricted in either the number of categories or the size of the data set. Current work on building pre-trained CNNs is a step toward turnkey computer vision components. These pre-trained CNNs, however, are data-driven, and large datasets of fruits and vegetables are scarce [13]. Rahnemoonfar and Sheppard [14] applied a deep neural network (DNN) to robotic agriculture. Their study focuses on tomato images found on the internet. They used a modified Inception-ResNet architecture. The model was trained on a variety of training data (fruit in the shade, surrounded by leaves, surrounded by branches, and overlapping other fruit). Their results showed an average test accuracy of 93% on synthetic images and 91% on real photos.
In another study, researchers used a CNN to build a model that can alert a car driver who is sleepy. A deep convolutional network was created to extract features and apply them in the learning phase. The CNN classifier uses a softmax layer to determine whether a driver is sleepy or not. The Viola-Jones face detection method was adapted for this research: once the face has been detected, the eye region is extracted from it. The proposed stacked deep CNN overcomes drawbacks of a standard CNN, such as location accuracy in regression. The proposed model has an accuracy of 96.42%. The researchers suggest that transfer learning could be used in future to improve the model’s performance [15]. Another research article provides a method for recognising four kinds of fruit (litchi, apple, grape and lemon) [16]. Smartphones were used to capture the photos, which were then processed using a modern detection framework. A CNN was trained on a new data set of 2403 samples from the four fruit classes. The model’s total performance was outstanding,

Comparison of Pre-trained and Convolutional …


with a precision of 99.89%; the CNN was successful in identifying the type of fruit. The researchers plan to use the algorithm to detect a wider variety of fruits in the future. Other optimization methods that can be applied to such problems are given in [17–22].

3 Methodology

3.1 Dataset

The fruit dataset was shot with a digital single-lens reflex (DSLR) camera (Canon 7D, 22.3 × 14.9 mm CMOS sensor, RGB colour filter array, 18 million effective pixels). The data comprise two classes, cempedak (Artocarpus integer) and nangka (Artocarpus heterophyllus), with a total of 1000 images (500 images per class) at a resolution of 4608 × 3456 pixels. For training the network, the entire data set was sub-sampled by a factor of 72, producing images of 48 × 64 pixels. The images were collected under three coloured spectra of light, green, red and blue (obtained by placing an external gel filter on the flashlight), as well as white light. The aim is a dataset that represents high variability in the position and number of fruits, approximating a real scenario.
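The sub-sampling arithmetic stated above can be checked with a short sketch. The function name `subsample_shape` is an illustrative assumption, not part of the study's code, and the sketch only computes the resulting shape rather than resampling any image.

```python
def subsample_shape(width, height, factor):
    """Return (width, height) after sub-sampling both axes by `factor`."""
    if width % factor or height % factor:
        raise ValueError("factor must divide both dimensions exactly")
    return width // factor, height // factor


# Full-resolution images as reported in the text: 4608 x 3456 pixels.
small_width, small_height = subsample_shape(4608, 3456, 72)
print(small_width, small_height)
```

The result is 64 pixels along the long side and 48 along the short side, which matches the 48 × 64 image size reported in the text.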

3.2 Data Preprocessing and Partition

All images in the dataset are resized to 224 × 224 × 3 and converted into a NumPy array for faster convolution when building the CNN model. The converted images are labelled according to the two classes. Training is conducted with random image augmentation, validation is performed in parallel with training, and the model is then evaluated on the test set. Data partitioning was performed by splitting the data into training and test sets, as illustrated in Fig. 2.
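A minimal sketch of the partitioning step is given below, assuming a simple shuffled 70/30 split of image indices as indicated in Fig. 2. The function name `split_indices` and the fixed seed are illustrative assumptions; the study does not state how the split was randomised.

```python
import random


def split_indices(n_items, train_fraction, seed=0):
    """Shuffle item indices and split them into train and test lists."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)  # seeded for reproducibility (assumption)
    n_train = round(n_items * train_fraction)
    return indices[:n_train], indices[n_train:]


# 1000 images in total, 70% for training and 30% for testing.
train_idx, test_idx = split_indices(1000, 0.70)
print(len(train_idx), len(test_idx))
```

With 1000 images this gives 700 training and 300 test indices, with no index appearing in both sets.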

3.3 Convolutional Neural Networks

The DCNN model used in this research to classify cempedak (Artocarpus integer) and nangka (Artocarpus heterophyllus) is shown in Fig. 3. It consists of 15 convolutional layers/blocks of deep learning. The first convolution layer uses 16 convolution filters with a filter size of 3 × 3, and a kernel regularizer and bias regularizer of 0.05. It also uses random_uniform as the kernel initializer, which initializes the network with some weights that are then updated to better values at every iteration. Random_uniform is an initializer that generates tensors with a uniform distribution.

Fig. 2 Data splitting and process to be used for training and testing (the Jackfruits dataset is split into a train set of 70%, 700 images, and a test set of 30%, 300 images; stages shown: trained model, model adjustment, tested model, model deployment)

S.-Q. Ong et al.


Its minimum value is −0.05 and its maximum value is 0.05. The regularizer adds penalties on the layer during optimization; these penalties are added to the loss function that the network optimizes. No padding is used; note that without padding a k × k kernel reduces each spatial dimension by k − 1, so the input and output tensors of a convolution have the same shape only when 'same' (zero) padding is applied. The input image size is 224 × 224 × 3. Before the output tensor is passed to the max-pooling layer, batch normalization is applied at each convolution layer, which keeps the mean activation close to zero and the activation standard deviation close to 1. After normalization, the rectified linear unit (ReLU) activation function is applied at every convolution. ReLU is a piecewise linear function: it outputs the input when the input is positive, and zero otherwise. The output of each convolutional layer is given as input to a max-pooling layer with a pool size of 2 × 2. This layer down-samples the feature maps, which reduces the amount of memory and time required for computation while aggregating only the features required for the classification. Finally, a dropout of 0.5 is applied at each convolution block; dropout randomly deactivates units during training and acts primarily as a regularizer against overfitting. The second convolution layer uses 16 convolution filters with a 5 × 5 kernel size and the third convolution layer uses 16 convolution filters with a 7 × 7 kernel size. Finally, a fully connected (dense) layer is used; before it, the feature map of the last convolution has to be flattened. In our model, the loss function is categorical cross-entropy, and we compare the performance of the Adam and SGD optimizers at three levels of epochs (25, 50 and 75) with a learning rate of 0.001.
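The behaviour of ReLU and the effect of padding and pooling on tensor size can be illustrated with a small sketch. The helper names below are assumptions made for illustration, and the size formulas are the standard ones for stride-1 convolutions and non-overlapping pooling; they are not taken from the study's code.

```python
def relu(x):
    """Rectified linear unit: pass positive inputs through, clamp the rest to zero."""
    return x if x > 0 else 0.0


def conv_output_size(size, kernel, padding="valid", stride=1):
    """Spatial output size of a convolution along one dimension."""
    if padding == "same":
        return -(-size // stride)  # ceiling division: shape preserved at stride 1
    return (size - kernel) // stride + 1  # 'valid': no padding


def pool_output_size(size, pool=2):
    """Spatial output size of non-overlapping max pooling along one dimension."""
    return size // pool


print(relu(-3.0), relu(2.5))
print(conv_output_size(224, 3, "valid"), conv_output_size(224, 3, "same"))
print(pool_output_size(224))
```

For a 224-pixel input and a 3 × 3 kernel, the unpadded convolution yields 222 pixels per side while 'same' padding keeps 224; a 2 × 2 pooling then halves the size.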

3.4 Transfer Learning

Training a customized deep convolutional neural network from scratch may take a long time. Transfer learning consists of taking features that have been learned on one problem and leveraging them on a new, similar problem. In this study, the workflow for constructing the model was as follows. First, layers are taken from a previously trained model (VGG16, VGG19, Xception, ResNet50 or InceptionV3) and frozen, to avoid destroying any of the information they contain during future training rounds. Next, new trainable layers are added on top of the frozen layers. These new layers then learn to turn the old features into predictions on the new dataset. We compare five transfer learning models (VGG16, VGG19, Xception, ResNet50 and InceptionV3) with the proposed CNN model.
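The freeze-then-train-a-new-head workflow can be sketched with a deliberately tiny, dependency-free toy. This is not the Keras pipeline used in the study: the function `frozen_features` is a hypothetical stand-in for a frozen pre-trained base, the head is a plain logistic regression, and the six two-dimensional samples are made up purely for illustration.

```python
import math


def frozen_features(x):
    """Stand-in for a frozen pre-trained base: fixed mapping, never updated."""
    return [x[0] + x[1], x[0] - x[1]]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_head(samples, labels, epochs=500, lr=0.5):
    """Fit only the new head (logistic regression) on top of the frozen features."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            f = frozen_features(x)
            p = sigmoid(sum(w * v for w, v in zip(weights, f)) + bias)
            error = p - y  # gradient of the log loss with respect to the logit
            weights = [w - lr * error * v for w, v in zip(weights, f)]
            bias -= lr * error
    return weights, bias


def predict(x, weights, bias):
    f = frozen_features(x)
    return 1 if sigmoid(sum(w * v for w, v in zip(weights, f)) + bias) >= 0.5 else 0


# Hypothetical toy data: two well-separated classes.
samples = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (1.0, 0.9), (0.8, 1.0), (0.9, 0.8)]
labels = [0, 0, 0, 1, 1, 1]

weights, bias = train_head(samples, labels)
predictions = [predict(x, weights, bias) for x in samples]
print(predictions)
```

Only `weights` and `bias` change during training; the base mapping stays fixed, which is the essential property of the frozen layers described above.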

3.4.1

VGG16

VGG16 was developed by Simonyan and Zisserman for the ILSVRC 2014 competition. It consists of 16 weight layers (13 convolutional and 3 fully connected), and its convolutions use only 3 × 3 kernels. The design chosen by the authors is similar to AlexNet, i.e., the number of feature maps increases as the depth of the network increases. The network comprises 138 million parameters. In our model, this architecture is modified at the last FC layer


Fig. 3 Proposed DCNN architecture

with 1000 classes. We replaced the 1000 classes with our number of classes, i.e., 6. The Adam optimizer is used and the accuracy is obtained. Similarly, the VGG19 architecture is defined by pushing the depth to 19 layers. As stated above, we changed the number of output classes to 6 in the last layer.
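The figure of 138 million parameters can be reproduced by counting weights and biases layer by layer. The sketch below assumes the standard VGG16 configuration (thirteen 3 × 3 convolutions, a 224 × 224 × 3 input reduced to 7 × 7 × 512 by five poolings, and three fully connected layers); the function names are illustrative, and `num_classes` is a parameter so the effect of replacing the last FC layer can be seen.

```python
def conv_params(c_in, c_out, kernel=3):
    """Weights plus biases of one convolution layer."""
    return kernel * kernel * c_in * c_out + c_out


def dense_params(n_in, n_out):
    """Weights plus biases of one fully connected layer."""
    return n_in * n_out + n_out


# Output channels of the 13 convolution layers in the standard VGG16 layout.
VGG16_CONV_CHANNELS = [64, 64, 128, 128, 256, 256, 256,
                       512, 512, 512, 512, 512, 512]


def vgg16_param_count(num_classes=1000):
    total, c_in = 0, 3  # RGB input
    for c_out in VGG16_CONV_CHANNELS:
        total += conv_params(c_in, c_out)
        c_in = c_out
    # After five 2 x 2 poolings a 224 x 224 input becomes 7 x 7 x 512.
    total += dense_params(7 * 7 * 512, 4096)
    total += dense_params(4096, 4096)
    total += dense_params(4096, num_classes)
    return total


print(vgg16_param_count())   # original 1000-class network
print(vgg16_param_count(2))  # same network with a 2-class output layer
```

Most of the parameters sit in the first fully connected layer, so changing only the size of the output layer alters the total by a few million at most.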


3.4.2


VGG19

VGG19 is an upgrade of the VGG16 model. VGG19 enhances the VGG16 architecture by eliminating AlexNet’s flaws and increasing system accuracy [3]. It is a 19-layer convolutional neural network built by stacking convolutions; however, the depth of such models is limited by a phenomenon known as the vanishing gradient, which makes deep convolutional networks difficult to train.

3.4.3

ResNet50

ResNet is short for Residual Network. ResNet-50 is a pre-trained deep learning model for image classification based on a convolutional neural network. Many other image recognition tasks have also benefited substantially from very deep models [5]. ResNet-50 is a 50-layer neural network that was trained on a million images from the ImageNet database in 1000 categories. The model comprises approximately 23 million trainable parameters, indicating a deep architecture that improves image identification. Compared with building a model from scratch, which usually requires collecting and training on a large amount of data, using a pre-trained model is a highly effective option. ResNet-50 is a useful tool because of its high generalisation performance and low error rates on recognition tasks.

3.4.4

Inception V3

Inception-v3 is a 48-layer deep pre-trained convolutional neural network model, trained on more than a million images from ImageNet. It is the third version of Google’s Inception CNN model, which was first proposed during the ImageNet Recognition Challenge. Inception V3 can categorise images into 1000 object classes, and the network has therefore learned rich feature representations for a wide variety of images. The network’s image input size is 299 × 299 pixels. In the first stage, the model extracts generic features from the input images; in the second stage, it classifies them using those features. On the ImageNet dataset, Inception v3 has been demonstrated to achieve better than 78.1% top-1 accuracy and roughly 93.9% top-5 accuracy.
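The difference between top-1 and top-5 accuracy can be made concrete with a short sketch. The function name `top_k_accuracy` and the four score rows are hypothetical and serve only to illustrate the metric; they are not ImageNet results.

```python
def top_k_accuracy(scores, true_labels, k):
    """Fraction of samples whose true label is among the k highest-scoring classes."""
    hits = 0
    for row, label in zip(scores, true_labels):
        ranked = sorted(range(len(row)), key=lambda i: row[i], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(true_labels)


# Hypothetical class scores for four samples over three classes.
scores = [
    [0.1, 0.7, 0.2],  # highest score: class 1
    [0.5, 0.3, 0.2],  # highest score: class 0
    [0.2, 0.5, 0.3],  # highest score: class 1
    [0.6, 0.1, 0.3],  # highest score: class 0
]
true_labels = [1, 1, 2, 0]

print(top_k_accuracy(scores, true_labels, k=1))
print(top_k_accuracy(scores, true_labels, k=2))
```

A prediction counts as correct under top-k as soon as the true class appears anywhere among the k best-scoring classes, so top-5 accuracy is never lower than top-1 accuracy.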

3.4.5

Xception

Xception stands for “Extreme Inception”. This architecture was proposed by Google. It has roughly the same number of parameters as Inception V3, so the performance increase in Xception is attributed to a more efficient use of the model parameters rather than to increased capacity. The output maps in inception architecture


consist of cross-channel and spatial correlation mappings. These two types of mapping are completely decoupled in the Xception architecture [23]. The feature extraction base of the network consists of 36 convolutional layers, organised into 14 modules. All modules except the first and the last are surrounded by linear residual connections. In the last FC layer, the number of classes is replaced with 6.
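Xception decouples spatial and cross-channel mappings through depthwise separable convolutions [23]. The sketch below compares the number of weights (biases omitted) in a standard convolution with its depthwise separable counterpart; the function names and the example layer size (3 × 3 kernel, 128 input and 256 output channels) are illustrative assumptions.

```python
def standard_conv_weights(kernel, c_in, c_out):
    """One kernel x kernel filter per (input channel, output channel) pair."""
    return kernel * kernel * c_in * c_out


def separable_conv_weights(kernel, c_in, c_out):
    """Depthwise spatial filters followed by a 1 x 1 pointwise channel mixer."""
    depthwise = kernel * kernel * c_in  # spatial correlations, one filter per channel
    pointwise = c_in * c_out            # cross-channel correlations
    return depthwise + pointwise


standard = standard_conv_weights(3, 128, 256)
separable = separable_conv_weights(3, 128, 256)
print(standard, separable)
```

For this example layer, the separable form needs roughly one ninth of the weights of the standard convolution, which illustrates how parameters can be used more efficiently.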

4 Result and Discussion

The dataset has been processed and analysed using various analysis methods. With more trainable weights in the custom-built proposed DCNN model, training takes longer. Table 1 shows that the proposed DCNN architecture provides an accuracy of 0.8933 to 0.9367. The comparisons between the proposed method (highlighted in yellow) and the other models are shown in Figs. 4 and 5. Of all configurations, VGG16 with the SGD optimizer achieves the highest accuracy (0.9633 at 75 epochs). While SGD gives the highest accuracy, VGG16 provides more stable and consistent performance throughout the epochs, as is evident in Fig. 6. Overall, accuracy tends to increase with the number of epochs, although not in every case (for example, ResNet50 with SGD drops from 0.7933 at 25 epochs to 0.6900 at 50 epochs).

Table 1 Accuracy of the proposed DCNN and transfer learning models with the Adam or SGD optimizer at three levels of epochs

Optimizer        Adam                        SGD
Epochs           25       50       75        25       50       75
Proposed model   0.8933   0.9267   0.9100    0.9233   0.9267   0.9367
Xception         0.8200   0.8800   0.9000    0.9000   0.9167   0.9000
VGG16            0.4733   0.8667   0.8700    0.6000   0.9567   0.9633
VGG19            0.7967   0.8567   0.8800    0.8800   0.8800   0.8800
ResNet50         0.6800   0.7200   0.7500    0.7933   0.6900   0.8000
InceptionV3      0.8800   0.8900   0.9167    0.9133   0.9000   0.9167
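The accuracies reported in Table 1 can be summarised programmatically. The sketch below copies the table values into a dictionary and looks up the best configuration and the range of the proposed model; the names `ACCURACY` and `best_configuration` are illustrative assumptions.

```python
EPOCHS = (25, 50, 75)

# Accuracies copied from Table 1, per model and optimizer, in epoch order.
ACCURACY = {
    "Proposed model": {"Adam": (0.8933, 0.9267, 0.9100), "SGD": (0.9233, 0.9267, 0.9367)},
    "Xception": {"Adam": (0.8200, 0.8800, 0.9000), "SGD": (0.9000, 0.9167, 0.9000)},
    "VGG16": {"Adam": (0.4733, 0.8667, 0.8700), "SGD": (0.6000, 0.9567, 0.9633)},
    "VGG19": {"Adam": (0.7967, 0.8567, 0.8800), "SGD": (0.8800, 0.8800, 0.8800)},
    "ResNet50": {"Adam": (0.6800, 0.7200, 0.7500), "SGD": (0.7933, 0.6900, 0.8000)},
    "InceptionV3": {"Adam": (0.8800, 0.8900, 0.9167), "SGD": (0.9133, 0.9000, 0.9167)},
}


def best_configuration(table):
    """Return (accuracy, model, optimizer, epochs) of the highest accuracy."""
    best = None
    for model, by_optimizer in table.items():
        for optimizer, values in by_optimizer.items():
            for epochs, accuracy in zip(EPOCHS, values):
                if best is None or accuracy > best[0]:
                    best = (accuracy, model, optimizer, epochs)
    return best


proposed = [a for values in ACCURACY["Proposed model"].values() for a in values]
print(best_configuration(ACCURACY))
print(min(proposed), max(proposed))
```

The lookup confirms that VGG16 with SGD at 75 epochs is the best configuration in the table, and that the proposed model ranges from 0.8933 to 0.9367.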

Fig. 4 Accuracy of the model for Adam optimizer at three levels of epochs (bar chart of accuracy per model at 25, 50 and 75 epochs; values as in Table 1)

Fig. 5 Accuracy of the model for SGD optimizer at three levels of epochs (bar chart of accuracy per model at 25, 50 and 75 epochs; values as in Table 1)

Fig. 6 Performance of model on train and test set by using Adam or SGD optimizer at three levels of epochs

5 Conclusion

Cempedak (Artocarpus integer) and nangka (Artocarpus heterophyllus) are highly similar in external appearance and are difficult for a human to distinguish visually. Owing to the similarities between classes and inconsistent features within a cultivar, fruit and vegetable classification presents significant problems. For these two classes, 500 images each of cempedak and nangka were collected at a resolution of 4608 × 3456 pixels and, for training the network, the entire data set was sub-sampled by a factor of 72, producing images of 48 × 64 pixels. Based on the experiment conducted,


the dataset has been processed and analysed using various CNN methods. The results show that the proposed DCNN architecture provides an accuracy of 89.33–93.67%. While SGD gives the highest accuracy, VGG16 provides more stable and consistent performance throughout the epochs, as shown in Fig. 6. Overall, accuracy tends to be higher at higher numbers of epochs.

References

1. Grimm, J. E., & Steinhaus, M. (2020). Characterization of the major odorants in Cempedak: Differences to jackfruit. Journal of Agricultural and Food Chemistry, 68(1), 258–266.
2. Balamaze, J., Muyonga, J. H., & Byaruhanga, Y. B. (2019). Physico-chemical characteristics of selected jackfruit (Artocarpus Heterophyllus Lam) varieties. Journal of Food Research, 8(4), 11.
3. Shaha, M., & Pawar, M. (2018). Transfer learning for image classification. In 2018 Second international conference on electronics, communication and aerospace technology (ICECA) (pp. 656–660). https://doi.org/10.1109/ICECA.2018.8474802
4. Wang, M. M. H., Gardner, E. M., Chung, R. C. K., Chew, M. Y., Milan, A. R., Pereira, J. T., & Zerega, N. J. C. (2018). Origin and diversity of an underutilized fruit tree crop, cempedak (Artocarpus integer, Moraceae). American Journal of Botany, 105(5), 898–914.
5. Sharma, N., Jain, V., & Mishra, A. (2018). An analysis of convolutional neural networks for image classification. In International conference on computational intelligence and data science (ICCIDS 2018); Procedia Computer Science, 132, 377–384. ISSN 1877-0509. https://doi.org/10.1016/j.procs.2018.05.198
6. Alhaj, Y. A., Dahou, A., Al-qaness, M. A., Abualigah, L., Abbasi, A. A., Almaweri, N. A. O., Elaziz, M. A., & Damaševičius, R. (2022). A novel text classification technique using improved particle swarm optimization: A case study of Arabic language. Future Internet, 14(7), 194.
7. Daradkeh, M., Abualigah, L., Atalla, S., & Mansoor, W. (2022). Scientometric analysis and classification of research using convolutional neural networks: A case study in data science and analytics. Electronics, 11(13), 2066.
8. Wu, D., Jia, H., Abualigah, L., Xing, Z., Zheng, R., Wang, H., & Altalhi, M. (2022). Enhance teaching-learning-based optimization for tsallis-entropy-based feature selection classification approach. Processes, 10(2), 360.
9. Ali, M. A., Balasubramanian, K., Krishnamoorthy, G. D., Muthusamy, S., Pandiyan, S., Panchal, H., Mann, S., Thangaraj, K., El-Attar, N. E., Abualigah, L., & Elminaam, A. (2022). Classification of glaucoma based on elephant-herding optimization algorithm and deep belief network. Electronics, 11(11), 1763.
10. Abualigah, L., Kareem, N. K., Omari, M., Elaziz, M. A., & Gandomi, A. H. (2021). Survey on Twitter sentiment analysis: Architecture, classifications, and challenges. In Deep learning approaches for spoken and natural language processing (pp. 1–18). Springer.
11. Fan, H., Du, W., Dahou, A., Ewees, A. A., Yousri, D., Elaziz, M. A., Elsheikh, A. H., Abualigah, L., & Al-qaness, M. A. (2021). Social media toxicity classification using deep learning: Real-world application UK Brexit. Electronics, 10(11), 1332.
12. Abualigah, L. M. Q. (2019). Feature selection and enhanced krill herd algorithm for text document clustering (pp. 1–165). Springer.
13. Hameed, K., Chai, D., & Rassau, A. (2018). A comprehensive review of fruit and vegetable classification techniques. Image and Vision Computing, 80(September), 24–44.
14. Rahnemoonfar, M., & Sheppard, C. (2017). Deep count: Fruit counting based on deep simulated learning. Sensors (Switzerland), 17(4), 1–12.
15. Reddy Chirra, V. R., Uyyala, S. R., & Kishore Kolli, V. K. (2019). Deep CNN: A machine learning approach for driver drowsiness detection based on eye state. Revue d’Intelligence Artificielle, 33(6), 461–466.
16. Risdin, F., Mondal, P. K., & Hassan, K. M. (2020). Convolutional neural networks (CNN) for detecting fruit information using machine learning techniques. IOSR Journal of Computer Engineering, 22(2), 1–13.
17. Abualigah, L., Diabat, A., Mirjalili, S., Abd Elaziz, M., & Gandomi, A. H. (2021). The arithmetic optimization algorithm. Computer Methods in Applied Mechanics and Engineering, 376, 113609.
18. Abualigah, L., Yousri, D., Abd Elaziz, M., Ewees, A. A., Al-Qaness, M. A., & Gandomi, A. H. (2021). Aquila optimizer: A novel meta-heuristic optimization algorithm. Computers and Industrial Engineering, 157, 107250.
19. Abualigah, L., Abd Elaziz, M., Sumari, P., Geem, Z. W., & Gandomi, A. H. (2022). Reptile search algorithm (RSA): A nature-inspired meta-heuristic optimizer. Expert Systems with Applications, 191, 116158.
20. Agushaka, J. O., Ezugwu, A. E., & Abualigah, L. (2022). Dwarf mongoose optimization algorithm. Computer Methods in Applied Mechanics and Engineering, 391, 114570.
21. Oyelade, O. N., Ezugwu, A. E. S., Mohamed, T. I., & Abualigah, L. (2022). Ebola optimization search algorithm: A new nature-inspired metaheuristic optimization algorithm. IEEE Access, 10, 16150–16177.
22. Ezugwu, A. E., Agushaka, J. O., Abualigah, L., Mirjalili, S., & Gandomi, A. H. (2022). Prairie dog optimization algorithm. Neural Computing and Applications, 1–49.
23. Chollet, F. (2021). Xception: Deep learning with depthwise separable convolutions. [online] arXiv.org. https://arxiv.org/abs/1610.02357v3. Accessed May 30, 2021.

Markisa/Passion Fruit Image Classification Based Improved Deep Learning Approach Using Transfer Learning

Ahmed Abdo, Chin Jun Hong, Lee Meng Kuan, Maisarah Mohamed Pauzi, Putra Sumari, Laith Abualigah, Raed Abu Zitar, and Diego Oliva

Abstract Fruit recognition is becoming more and more important in the agricultural industry. Traditionally, all the fruits on the production line need to be identified and labelled manually, which is labor-intensive, error-prone, and ineffective. Many fruit recognition systems have therefore been created to automate the process, but fruit recognition systems for Malaysian local fruit are limited. This project thus focuses on classifying one of the Malaysian local fruits, markisa/passion fruit. We propose two CNN models for markisa classification. The proposed models are evaluated on our own dataset collection and achieve accuracies of 97% and 65%, respectively. The results indicate that the architecture of a CNN model is very important, because different architectures can produce different results. The first CNN model is therefore selected, because it can classify 4 types of markisa with higher accuracy. In the proposed work, we also inspect two transfer learning methods for the classification of markisa, VGG-16 and InceptionV3. The results show that the first proposed CNN model outperforms VGG-16 (95% accuracy) and InceptionV3 (65% accuracy).

Keywords Markisa · Passion fruit · Convolutional neural network · Deep learning · Transfer learning · VGG-16 · InceptionV3

A. Abdo · C. J. Hong · L. M. Kuan · M. M. Pauzi · P. Sumari · L. Abualigah (B)
School of Computer Sciences, Universiti Sains Malaysia, 11800 George Town, Pulau Pinang, Malaysia
e-mail: [email protected]

L. Abualigah
Hourani Center for Applied Scientific Research, Al-Ahliyya Amman University, Amman, Jordan
Faculty of Information Technology, Middle East University, Amman 11831, Jordan

R. A. Zitar
Sorbonne Center of Artificial Intelligence, Sorbonne University-Abu Dhabi, Abu Dhabi, United Arab Emirates

D. Oliva
IN3, Computer Science Department, Universitat Oberta de Catalunya, Castelldefels, Spain
Depto. de Ciencias Computacionales, Universidad de Guadalajara, CUCEI, Guadalajara, Jalisco, Mexico

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023
L. Abualigah (ed.), Classification Applications with Deep Learning and Machine Learning Technologies, Studies in Computational Intelligence 1071, https://doi.org/10.1007/978-3-031-17576-3_7


A. Abdo et al.

1 Introduction

Recent developments in computer vision, together with advances in Machine Learning (ML), Neural Networks (NN), and Convolutional Neural Networks (CNN), have improved the efficiency of image classification tasks. Detecting and classifying the several distinct varieties of Passiflora edulis [1, 2], commonly known as passion fruit, or markisa in the Malay language, is one of the significant challenges in the fruit packaging and processing industry [3, 4]. The different colors, sizes, shapes, and orientations of the several cultivars of this fruit, shown in Fig. 1, result in misclassification, affecting productivity and packaging quality. Customarily, the classification of Markisa cultivars is done manually on production lines, where a laborer sorts the fruit into different processing lines, making the entire process labor-intensive, time-consuming, error-prone, and

Fig. 1 Samples of several cultivars of Markisa


ineffective. Additionally, it increases the cost of production, as hiring employees to do manual work introduces wage costs and operating overheads, resulting in a decrease in production output. Hence, we need an automated system that can reduce human effort, increase production, and reduce the cost and time of production [5, 6]. In this study, we propose a CNN architecture to classify several cultivars of Markisa, namely Sweet Passion Fruit (Markisa Manis), Yellow Passion Fruit (Markisa Kuning), Purple Passion Fruit (Markisa Ungu), and Big Passion Fruit (Markisa Besar). The novelty of this paper is (i) the end-to-end deep learning pipeline architecture for Markisa classification. The paper is organized as follows: Sect. 2 provides the literature survey for the proposed CNN architecture. Sect. 3 presents the details of the proposed CNN pipeline architecture for Markisa classification. Sect. 4 presents the experiment overview, listing the tools, parameters, and criteria of the experiment and the results. Finally, Sect. 5 presents our conclusion.

2 Literature Survey

In image object detection or classification, two approaches are available: deep learning with Convolutional Neural Networks (CNN) and the traditional Computer Vision (CV) approach [7, 8]. Traditional CV algorithms for feature extraction include edge detection, corner detection and threshold segmentation [9–12]. The deep learning approach achieves better accuracy in image classification than traditional CV techniques [13]. Deep learning also demands less expert effort for fine-tuning or feature extraction, since the CNN can learn the features with high flexibility and be re-trained to obtain the optimum result. CNNs are therefore applied in many image classification tasks, including fruit classification, which can support robotic harvesting systems or fruit quality checks [14]. Risdin et al. [14] developed a CNN for fruit classification that achieves 98.99% accuracy, better than traditional machine learning techniques such as SVM with only 87% accuracy [5]. Moreover, Palakodati et al. [15] developed a CNN model for classifying fresh and rotten fruit that achieves accuracies of up to 97.82%. Among the transfer learning models that have been tested on fruit and vegetable datasets, VGG16 is one of the best. A study by Kishore et al. [16] showed that VGG16 achieves about 97% accuracy on a dataset of 4 classes (Banana, Tomato, Carrot, and Potato), each consisting of 600 images. Interestingly, the model was tested with different image sizes to show that VGG16 works well with smaller and noisier images. Given this accuracy, VGG16 is a good option for fruit or vegetable datasets. Another study, by Pardede et al. [17], applied the VGG16 model to a fruit dataset. The aim of that study was to build a deep learning model that can detect fruit ripeness, which differs somewhat from the previous study but involves a dataset of the same nature. That study used 8 classes of fruit (Ripe Mango, Unripe Mango, Ripe Tomato, Unripe Tomato, Ripe Orange, Unripe Orange,


Ripe Apple, Unripe Apple). They achieved about 90% accuracy with a dropout rate of 0.5. The study concluded that dropout is the best technique for reducing overfitting in transfer learning. Inception-v3 is a convolutional neural network architecture named after the movie Inception, directed by Christopher Nolan; the model is mainly used for image analysis and object detection and was introduced by Google during the ImageNet Recognition Challenge, as described in an article published by the Wikimedia Foundation [18]. Szegedy et al. [19] proposed the Inception-v3 architecture and studied design principles in the context of Inception architectures, which have proven to be “high-performance vision networks that have a relatively modest computation cost compared to simpler, more monolithic architectures”. In addition, the highest quality trained version of Inception-v3 has “reached 21.2%, top-1 and 5.6% top-5 error for single crop evaluation” compared to other CNN architectures at the time. Another paper, published by Chunmian Lin et al., explored the application of the Inception-v3 model through transfer learning; the transfer learning-based model “is retrained for 5000 epochs at different learning rates [20]. The accuracy test results indicate that the transfer learning-based method is robust for traffic sign recognition, with the best recognition accuracy of 99.18 % at the learning rate of 0.05”. Other optimization methods that can be applied to such problems are given in [8, 21–25].

3 Proposed CNN Architecture for Passion Fruit Recognition

3.1 The Proposed CNN Architectures

3.1.1

Model 1

The proposed architecture of the CNN model for passion fruit classification can be seen in Figs. 2 and 3. This CNN model consists of 4 convolutional layers, as illustrated in Fig. 2, and 4 dense layers for the classifier of the neural network, excluding the input and output layers, as shown in Fig. 3 [12, 26]. This model gives a testing accuracy of 97% in classifying the 4 types of passion fruit. As seen in Fig. 2, the training data, RGB color images of size (224, 224), are passed into the first convolutional layer. The first convolutional layer has 64 convolutional filters of size (3, 3); the stride is (1, 1), so the filters are translated over the input images one step at a time, and padding is set to 'same', which gives an output of the same spatial size as the input. Next, batch normalization is applied after the ReLU activation function to bring the mean activation close to zero and the standard deviation close to 1 [3]. The result is then passed to a max-pooling layer of size (2, 2), which reduces the output size and simplifies the model. The same number of convolutional filters is used in convolutional layer 2; however, the filter size is increased to (5, 5) with no padding, and the same batch normalization is applied to


the ReLU activation function before the max-pooling layer of size (2, 2). The architecture of convolutional layer 2 is repeated in convolutional layer 3, but with the filter size increased to (7, 7); the same ReLU activation and batch normalization are applied before the max-pooling layer of size (2, 2) to help prevent overfitting. In convolutional layer 4, the number of convolutional filters is reduced to only 16, with size (7, 7), and batch normalization is applied to the output without the ReLU function or a max-pooling layer. After this convolutional base has extracted the features from the input images, its output becomes the input to the neural network classifier; the feature maps are flattened before being fed to the network, as seen in Fig. 3. The network has 3 dense layers, not counting the input and output layers. The first dense layer has 512 nodes and uses L2 (ridge) regularization, with a lambda of 0.01 for both the weights and the bias, to add penalties on the weights, create a simpler model and prevent overfitting [4]. A dropout rate of 0.25 is also applied, ignoring 25% of the neurons during training to avoid overfitting. Two regularization techniques, L2 regularization and dropout of 0.25, are therefore used in the first dense layer because of its large number of trainable neurons. The first dense layer uses the ReLU activation function and feeds the second dense layer, which has only 64 neurons and the same dropout rate of 0.25 as the first layer, without L2 regularization. Its output is passed through the ReLU activation function and becomes the input of the last dense layer before the output layer. This third dense layer has only 32 nodes, with no dropout or regularization. ReLU is used for the third dense layer, and the Softmax activation function is used for the output layer, which has 4 neurons for the multiclass classification. The loss function is categorical cross-entropy, because the labels are one-hot encoded, and the weights and biases are updated by the Adagrad optimizer with a learning rate of 0.001. The number of epochs is set to 30 and the batch size to 10. The evaluation metric is categorical accuracy for the multiclass classification.
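The softmax output and categorical cross-entropy loss described above can be written out in a few lines. This is a minimal sketch using the standard definitions, not the framework code used in the study; the function names and the example logits are illustrative assumptions.

```python
import math


def softmax(logits):
    """Turn raw output scores into class probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def categorical_cross_entropy(one_hot, probs):
    """Negative log-probability assigned to the true (one-hot) class."""
    return -sum(t * math.log(p) for t, p in zip(one_hot, probs) if t > 0)


# Four output neurons, one per passion fruit class; equal logits mean no preference.
probs = softmax([0.0, 0.0, 0.0, 0.0])
loss = categorical_cross_entropy([0, 1, 0, 0], probs)
print(probs, loss)
```

With four equally likely classes each probability is 0.25 and the loss equals ln 4; the loss falls toward zero as the probability of the true class approaches 1.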

3.1.2

Model 2

Figure 4 shows the second proposed CNN model architecture. This model has 6 convolution blocks, 2 pooling blocks, 2 fully connected layers and a SoftMax classifier. All input images are color images of 224 × 224 pixels with 3 channels. All convolution blocks use the same filter size (3 × 3), and padding is applied so that the output feature maps have the same size as the inputs. The number of filters differs, however: 128 filters for convolution block 1, 96 for block 2, 64 for block 3, 32 for blocks 4 and 5, and 12 for block 6. All convolution blocks use the same activation function, ReLU. Max pooling of size 2 × 2 is applied after convolution blocks 1 and 2, halving the image size twice (from 224 × 224 to 56 × 56). After the convolution base, the dimension of the feature maps is 56 × 56 × 12. They are then flattened to vectors

Fig. 2 CNN architecture for model 1

Fig. 3 Classifier for the proposed CNN architecture for model 1

of size 37,632 before being fed into the fully connected layers. Both fully connected layers have 1000 nodes and use ReLU as the activation function; the only difference is that a dropout rate of 0.3 is applied to the first layer. The SoftMax classifier then outputs the result, i.e., whether the image is markisa besar, markisa kuning, markisa manis or markisa ungu.
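The tensor sizes quoted for Model 2 can be verified by tracing the shape through the convolution base. The sketch below follows the description in the text (six 3 × 3 blocks with shape-preserving padding, 2 × 2 pooling after blocks 1 and 2); the function name `trace_model2` is an illustrative assumption.

```python
def trace_model2(height=224, width=224):
    """Follow the feature-map shape through the six convolution blocks of Model 2."""
    filters = [128, 96, 64, 32, 32, 12]  # filters per block, as listed in the text
    pool_after = {1, 2}                  # 2 x 2 max pooling follows blocks 1 and 2
    channels = 3                         # RGB input
    for block, n_filters in enumerate(filters, start=1):
        channels = n_filters             # padded 3 x 3 conv keeps height and width
        if block in pool_after:
            height //= 2
            width //= 2
    return height, width, channels


h, w, c = trace_model2()
flattened = h * w * c
print((h, w, c), flattened)
```

The trace ends at 56 × 56 × 12, and flattening gives 37,632 values, in agreement with the sizes stated in the text.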

3.2 Transfer Learning Models

As part of this study, we also include transfer learning models. Because of device and resource limitations, we are only able to compare the VGG16 and InceptionV3 models. In both models, we freeze the base convolutional layers and


Fig. 4 Second CNN model

remove the flatten layer and its classifier. However, we keep the pre-trained weights by using the ‘imagenet’ option. We then replace the classifier with one suited to our dataset, which contains 4 classes. We use ReLU as the dense layer activation function and softmax as the output layer activation function, and we implement early stopping to reduce the training time: whenever the model reaches 99% accuracy, we stop the training. This also reduces the possibility of overfitting.
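The stopping rule described above can be sketched independently of any framework. The function below is a hypothetical illustration of the logic only: it walks through a list of per-epoch accuracies (invented values here) and stops at the first epoch that reaches the threshold. In Keras, a rule of this kind is usually written as a custom callback that checks the logged accuracy at the end of each epoch.

```python
def train_with_accuracy_stop(epoch_accuracies, threshold=0.99):
    """Return (epochs_run, final_accuracy) for a run that stops at the threshold.

    epoch_accuracies: accuracy measured after each epoch, in order.
    """
    final = None
    for epoch, acc in enumerate(epoch_accuracies, start=1):
        final = acc
        if acc >= threshold:
            return epoch, final  # stop early: the target accuracy has been reached
    return len(epoch_accuracies), final  # threshold never reached: all epochs are used

# Hypothetical accuracy curve for a run planned for 7 epochs.
history = [0.62, 0.81, 0.93, 0.97, 0.991, 0.995, 0.998]
epochs_run, final_acc = train_with_accuracy_stop(history)
print(epochs_run, final_acc)
```

Here training stops after the fifth epoch, so the remaining two epochs are never computed.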

3.2.1 VGG16

VGG16 is a convolutional neural network developed by Karen Simonyan and Andrew Zisserman of Oxford University in 2014 [27]. The model contains 16 layers and achieves 92.7% top-5 test accuracy on the ImageNet dataset, which contains 14 million images belonging to 1000 classes. Figure 5 shows the architecture of VGG16.


Fig. 5 VGG-16 architecture

3.2.2 InceptionV3

The second transfer learning model is built sequentially on top of the Inception-v3 model. The model learns low-level feature mappings through basic convolutional operations with (1 × 1) and (3 × 3) kernels. In addition, multiscale feature representations, produced with diverse convolution kernels (i.e., 1 × 1, 1 × 3, 3 × 1, 3 × 3, 5 × 5, 1 × 7, and 7 × 7 filters), are concatenated and fed into auxiliary classifiers, which helps convergence. In the experiment section, we first configure the fully connected part with one Dense layer and one dropout layer, and then run a second experiment with two Dense layers and two dropout layers. Finally, a Sigmoid classifier produces a vector of four class probabilities, and the classification result is the class with the maximum probability. Figure 6 shows the architecture of InceptionV3.

3.3 Dataset

There are many types of Markisa/passion fruit. Our dataset includes 4 different types of this fruit: Markisa Besar (Giant Passion Fruit), Markisa Kuning

Fig. 6 InceptionV3 architecture


Fig. 7 Examples of images

(Yellow Passion Fruit), Markisa Manis (Sweet Passion Fruit), and Markisa Ungu (Purple Passion Fruit). We divided the dataset into 80% for training, 10% for validation and 10% for testing. Figure 7 shows example images from our dataset.
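An 80/10/10 split of a balanced dataset can be sketched as follows. The per-class (stratified) splitting, the file names and the function name `stratified_split` are assumptions made for the example; the chapter states only the overall percentages, although Sect. 4 reports 250 images per class and a testing set with 25 images per label, which is what a per-class split produces.

```python
import random

def stratified_split(items_by_class, fractions=(0.8, 0.1, 0.1), seed=42):
    """Split each class separately so every subset keeps the class balance."""
    rng = random.Random(seed)
    splits = {"train": [], "val": [], "test": []}
    for label, items in items_by_class.items():
        shuffled = list(items)
        rng.shuffle(shuffled)
        n_train = round(fractions[0] * len(shuffled))
        n_val = round(fractions[1] * len(shuffled))
        splits["train"] += [(x, label) for x in shuffled[:n_train]]
        splits["val"] += [(x, label) for x in shuffled[n_train:n_train + n_val]]
        splits["test"] += [(x, label) for x in shuffled[n_train + n_val:]]
    return splits

# Hypothetical file names: 250 images for each of the 4 classes.
labels = ["besar", "kuning", "manis", "ungu"]
dataset = {c: [f"{c}_{i:03d}.jpg" for i in range(250)] for c in labels}

splits = stratified_split(dataset)
print({name: len(subset) for name, subset in splits.items()})
```

Fixing the random seed keeps the three subsets identical across runs, so that different models are compared on the same testing images.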

3.4 Augmentation

We also apply image augmentation by rotating the images by fixed angles. Table 1 shows the rotations for one sample image from the Markisa Besar class.

Table 1 Image augmentation of Markisa Besar (sample images not reproduced)
Degree of rotation: 0° (original image), 180°, 90° anticlockwise, 275°
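Rotation by multiples of 90° can be illustrated on a small grid of pixel values without any image library. This is only a sketch of the idea: the function names are hypothetical, the 2 × 2 "image" is a toy example, and arbitrary angles such as the 275° listed in Table 1 would need interpolation from an image library, which is not covered here.

```python
def rotate90_ccw(image):
    """Rotate a 2-D grid (list of rows) by 90 degrees anticlockwise."""
    # Transposing and then reversing the row order gives an anticlockwise quarter turn.
    return [list(row) for row in zip(*image)][::-1]

def rotate(image, quarter_turns):
    """Apply the given number of anticlockwise quarter turns."""
    for _ in range(quarter_turns % 4):
        image = rotate90_ccw(image)
    return image

# Toy 2 x 2 "image" whose values make the orientation easy to follow.
tile = [[1, 2],
        [3, 4]]

augmented = {angle: rotate(tile, angle // 90) for angle in (0, 90, 180, 270)}
for angle, rotated in augmented.items():
    print(angle, rotated)
```

Each rotated copy keeps the label of the original image, which is what makes rotation usable as augmentation for fruit photographs.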


Table 2 Parameters tuning options for proposed CNN model
Parameter       Options
Optimizer       Adam, SGD, Adagrad, RMSprop
Dense layer
Learning rate   0.1, 0.01, 0.001, 0.0001
Epoch           10, 30, 50, 70
Filter
Batch size      10, 20, 30, 40

Table 3 Parameters tuning options for transfer learning
Parameter                             Options
No of neurons in single dense layer   512, 1024
Type of optimizer                     Adam, SGD
Epochs                                10, 20
Dropout                               0.1, 0.2
Learning rate                         0.01, 0.001
Batch size                            50, 100

4 Performance Result

4.1 Experimental Setup

The dataset is balanced: it consists of 250 Markisa Besar, 250 Markisa Kuning, 250 Markisa Manis and 250 Markisa Ungu images, 1000 images in total. All images are resized to 224 × 224 × 3. The programming language used in this study is Python, with the TensorFlow and Keras libraries. To run the code, we use Google Colaboratory with a GPU. However, the GPU runtime is limited, and we are unable to use it extensively. Thus, for the transfer learning section only, we reduce the tuning options from 4 values per parameter to 2. Tables 2 and 3 show the parameter tuning options for the proposed CNN model and for transfer learning, respectively.
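With 6 parameters and 2 options each, the transfer learning grid in Table 3 contains 2^6 = 64 combinations, which agrees with the 64 models per configuration reported in Sect. 4.3. A minimal sketch of enumerating this grid is given below; the dictionary keys are illustrative names (only `num_units` and `batch_size` echo the column names of the Appendix table) and are not taken from the authors' code.

```python
from itertools import product

# Options from Table 3; the key names are assumptions for illustration.
grid = {
    "num_units": [512, 1024],
    "optimizer": ["Adam", "SGD"],
    "epochs": [10, 20],
    "dropout": [0.1, 0.2],
    "learning_rate": [0.01, 0.001],
    "batch_size": [50, 100],
}

# One dictionary per combination of options (the Cartesian product of all lists).
configs = [dict(zip(grid, values)) for values in product(*grid.values())]

print(len(configs))
print(configs[0])
```

Keeping 4 options per parameter instead of 2 would give 4^6 = 4096 combinations, which shows why the options were reduced for the limited GPU runtime.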

4.2 Performance of Proposed CNN Model

4.2.1 Model 1

The final model, shown in Figs. 2 and 3 with its model summary in Fig. 8, reaches 97% accuracy on the passion fruit classification. To arrive at it, we first experimented with different architectural designs and different hyperparameter settings.


The first proposed model, shown in Fig. 9, is used to experiment with different optimizers, numbers of dense layers, learning rates, numbers of epochs, numbers of filters and, lastly, training batch sizes. The best model is the CNN architecture shown in Fig. 8, with 97% accuracy on the testing data. The initial CNN architecture in Fig. 9 consists of 4 convolutional layers and 3 dense layers, just like the best model in Fig. 8. The

Fig. 8 The model summary for the best proposed CNN model with accuracy of 100%


Fig. 9 The first experimented model

first convolutional layer consists of 64 filters of size (3, 3) with stride (1, 1) and zero-padding, so the output has the same spatial size as the input, just as in the best model in Figs. 2, 3 and 8. However, the second and third convolutional layers have only 16 filters each, instead of the 64 used in the best model. The rest of the convolutional architecture and the dense layer architecture are the same as in the best model. The initial learning rate was set to 0.0001, the Adam optimizer was used, and categorical accuracy was the metric. In addition, the number of epochs was set to 30 and the batch size to 10 for the training dataset. As a result, the best model has 17,725,876 parameters in total, more than the 17,422,804 of the initial model, because the initial model uses fewer filters (Figs. 8 and 9).


(I) Effect of Optimizers

The initial model was trained with four optimizers: Adam, SGD, Adagrad and RMSprop. The training and validation accuracies are shown in Fig. 10. Of the 4 optimizers, Adagrad gives a stable validation accuracy of close to 90% on the multiclass classification. RMSprop also reaches a good validation accuracy near epoch 30, but its values fluctuate more than those of Adagrad. The training and validation accuracies for Adagrad are more stable than for the other 3 optimizers, with no fluctuation in the validation accuracy, and they stay close to each other, which indicates less overfitting. Figure 11 compares the testing evaluation accuracies of the optimizers in a horizontal bar graph. Based on the validation accuracies, Adagrad performs better and overfits less, and its high validation accuracy is matched by a high testing evaluation accuracy in Fig. 11. Therefore, we switch from the Adam optimizer to Adagrad and continue tuning the hyperparameters.

(II) Effect of Dense Layer

The initial model has 3 dense layers in the neural network classifier. In this experiment, we add one dense layer with 64 nodes, a dropout rate of 0.25 and ReLu

Fig. 10 The effect of the optimizer on the training and validation accuracy against epoch


Fig. 11 The testing evaluation accuracies comparison with different optimizers

activation after the second dense layer as seen in Fig. 12. Repeating this 3 times gives models with up to 6 dense layers in the classifier. The results show that as more dense layers are added, the model learns more slowly, because it needs more training epochs before it predicts well. Once the model has 4 or more dense layers, the validation accuracy is higher than the training accuracy as the complexity of the model increases. The testing evaluation accuracies in Fig. 13 show that, at 30 epochs, the model with 3 dense layers has the highest testing evaluation accuracy, 0.96. Therefore, our current model keeps 3 dense layers, the Adagrad optimizer, a learning rate of 0.0001, and 30 training epochs with a batch size of 10.

(III) Effect of Learning Rate

Next, the model is tested with learning rates of 0.1, 0.01, 0.001 and 0.0001, as seen in Fig. 14. The results show a clear trend: when the learning rate is high, the model converges faster, as shown by the validation accuracy approaching the training accuracy in fewer than 10 epochs. However, this is not stable, because a high learning rate means the model updates its weights and biases in larger steps and may not learn well after 10 epochs, as the fluctuating validation accuracies show. With a smaller learning rate, the model learns more slowly and reaches better accuracies, as illustrated in Fig. 14. The testing evaluation accuracy increases as the learning rate decreases, but only down to 0.001: a learning rate of 0.001 gives the highest testing evaluation accuracy, 0.99, compared with 0.96 for a learning rate of 0.0001. A likely explanation is that only 30 epochs are tested, and the smaller learning rate might need more epochs to train well.

As a result, because it gives high accuracy within a small number of epochs, we select the learning rate of 0.001 instead of the initial learning rate


Fig. 12 The effect of the number of dense layers on the training and validation accuracy against epoch

Fig. 13 The testing evaluation accuracies comparison with different number of dense layers


of 0.0001 set in the initial model. Although the validation accuracy for a learning rate of 0.0001 is slightly higher than for 0.001, as seen in Fig. 15, we prefer the model that converges faster. The current model uses the Adagrad optimizer, a learning rate of 0.001, 3 dense layers, and 30 epochs with a batch size of 10.

(IV) Effect of Number of Epochs

Next, the model is trained with 10, 30, 50 and 70 epochs, as seen in Fig. 16. With only 10 epochs, the validation accuracy is low and the model is overfitting, with a poor testing evaluation accuracy of only 0.39. As the number of epochs grows, the validation accuracy converges and stays consistent even with further increases, as seen at 70 epochs. The validation accuracy becomes consistent after about epoch 20. The testing evaluation accuracies for the different numbers of epochs are shown in Fig. 17. Training for 30 and 50 epochs yields the highest testing evaluation accuracy, 0.99, compared with 0.95 for 70 epochs. Consequently, we pick 30 epochs for training the model, as it requires less computation and still gives a good validation accuracy. The current model uses the Adagrad optimizer, a learning rate changed from 0.0001 to 0.001, 3 dense layers, and 30 epochs with a batch size of 10.

Fig. 14 The effect of learning rate on the training and validation accuracy against epoch

Fig. 15 The testing evaluation accuracies comparison with different learning rate

Fig. 16 The effect of number of epochs on the training and validation accuracy against epoch


Fig. 17 The testing evaluation accuracies comparison with different number of epochs

(V) Effect of Filter Number

In the initial model, the first convolutional layer has 64 filters, while the second and third convolutional layers have 16 filters each. Figure 18 shows the results when the second layer, and then both the second and third layers, are changed to 64 filters. The total number of filters added is 48 if only the second convolutional layer is changed, and 90 if both the second and third convolutional layers are changed to 64 filters. With more filters, the model captures the image features better, as seen from the validation accuracies staying close to the training accuracies in the 48 and 90 added-filter settings. The testing evaluation accuracies also improve: both settings yield 100% accuracy, up from 99%, as seen in Fig. 19. Therefore, the first 3 convolutional layers are all set to 64 filters, which gives the best model architecture illustrated in Figs. 2 and 3.

(VI) Effect of Batch Size

With the best model architecture determined, we now examine how the training batch size affects the performance of the model, as seen in Fig. 20. When the batch size increases, the model updates more slowly and the testing evaluation accuracy becomes smaller, as seen in Fig. 21: the accuracy drops to 0.97 and 0.98 with the larger batch sizes. This is because a larger batch size reduces the number of parameter updates per epoch. As a result, we retain the batch size of 10 for training. The best model now uses 30 epochs with a batch size of 10, a learning rate of 0.001, 3 dense layers, 64 filters in each of the first 3 convolutional layers, and the Adagrad optimizer.
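The link between batch size and the number of parameter updates can be made concrete with a small calculation. The sketch below assumes 800 training images (80% of the 1000 images, before any augmentation), 30 epochs, and one update per mini-batch with the last partial batch kept; the function name is hypothetical.

```python
import math

def updates_per_run(n_train, batch_size, epochs):
    """Return (updates per epoch, total updates) for mini-batch training."""
    steps_per_epoch = math.ceil(n_train / batch_size)  # the last batch may be smaller
    return steps_per_epoch, steps_per_epoch * epochs

n_train = 800  # assumption: 80% of 1000 images
epochs = 30
table = {b: updates_per_run(n_train, b, epochs) for b in (10, 20, 30, 40)}

for batch_size, (per_epoch, total) in table.items():
    print(batch_size, per_epoch, total)
```

Under these assumptions a batch size of 10 gives four times as many weight updates as a batch size of 40 over the same 30 epochs, which is consistent with the slower learning observed for larger batches.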


Fig. 18 The effect of number of filters on the training and validation accuracy against epoch

Fig. 19 The testing evaluation accuracies comparison with different number of filters


Fig. 20 The effect of batch size on the training and validation accuracy against epoch

Fig. 21 The testing evaluation accuracies comparison with different batch sizes

Figure 22 shows the predictions of the best model on the testing dataset, with 97% accuracy in the passion fruit classification. Three misclassifications occurred on Markisa Besar, and the model predicted all the other classes correctly. The best model shows 100% accuracy in the testing evaluation, but the


Fig. 22 The best model predicted accuracy for Model 1 is 97%

actual predicted accuracy on the testing dataset is 97%, with 3 of the 100 testing images misclassified. We therefore believe the testing accuracy can be increased by feeding the model more, and more varied, Markisa Besar images.

4.2.2 Model 2

Figure 23 shows the summary of the second CNN model. Based on the summary, the model has 38,838,816 parameters to train. We first trained the model on the training data, then performed hyperparameter tuning using the validation data, and finally tested the accuracy of the model on the testing data. To obtain the best parameters for this model, we performed hyperparameter tuning according to the setup in Table 2. After tuning, the model has a test accuracy of 65% with the following parameters:

• Optimizer = Adam
• Learning rate = 0.001
• Last filter numbers = 12
• Number of nodes in each dense layer = 1000
• Epochs = 50
• Batch size = 20

The section below shows the effect of each hyperparameter on the model performance.


Fig. 23 Summary of second CNN models

(I) Effect of Last Filter Size

To test the effect of the last filter size of the convolution base on the model performance, we keep the other parameters constant:

• Number of nodes = 2000
• Optimizer = Adam
• Epochs = 30
• Batch size = 40

The effect of the last filter size is shown in Fig. 24. A filter number of 12 gives the highest validation accuracy, 0.71. The model is overfitted to the training data: it classifies the training images perfectly, but reaches only 0.71 accuracy on the validation set. In the next hyperparameter tuning, we keep the filter size at 12.

(II) Effect of Number of Nodes in Each Dense Layer

To test the effect of the number of nodes in each dense layer on the model performance, we keep the other parameters constant:


Fig. 24 Effect of the last filter size on model performance

• Last filter size = 12
• Optimizer = Adam
• Epochs = 30
• Batch size = 40

The effect of the number of nodes in each dense layer is shown in Fig. 25. A setting of 1000 nodes gives the highest validation accuracy, 0.70. The model is overfitted to the training data: it classifies the training images perfectly, but reaches only 0.70 accuracy on the validation set. In the next hyperparameter tuning, we keep the number of nodes at 1000.

(III) Effect of Optimizers

To test the effect of optimizers on the model performance, we keep the other parameters constant:

• Last filter size = 12
• Number of nodes = 1000
• Epochs = 30
• Batch size = 40

The effect of optimizers is shown in Fig. 26. The Adam optimizer gives the highest validation accuracy, 0.80. The model is overfitted to the training data: it classifies the training images with 0.99 accuracy, but reaches only 0.80 accuracy on the validation set. In the next hyperparameter tuning, we keep Adam as the optimizer.

Fig. 25 Effect of the number of nodes on model performance


Fig. 26 Effect of optimizers on model performance

(IV) Effect of Batch Size

To test the effect of batch size on the model performance, we keep the other parameters constant:

• Last filter size = 12
• Number of nodes = 1000
• Epochs = 30
• Optimizer = Adam

The effect of batch size is shown in Fig. 27. A batch size of 20 gives the highest validation accuracy, 0.58. The model is overfitted to the training data: it classifies the training images perfectly, but reaches only 0.58 accuracy on the validation set. In the next hyperparameter tuning, we keep the batch size at 20.

(V) Effect of Epochs

To test the effect of epochs on the model performance, we keep the other parameters constant:

• Last filter size = 12
• Number of nodes = 1000
• Batch size = 20
• Optimizer = Adam

The effect of epochs is shown in Fig. 28. Training for 50 epochs gives the highest validation accuracy, 0.65. We can also see that the

Fig. 27 Effect of batch size on model performance


Fig. 28 Effect of epochs on model performance

model is overfitted to the training data: it classifies the training images perfectly, but reaches only 0.65 accuracy on the validation set.

4.3 Performance of Proposed Transfer Learning Model

4.3.1 VGG16

In this experiment, we add only 1 dense layer after the flatten layer and 1 dropout layer before the output layer, while the base VGG16 convolutional layers are frozen. Figure 29 illustrates the model architecture. As stated in the previous section, we tune several parameters during model training. From the parameter options, we trained 64 models with different combinations of the parameters (see the Appendix). For the comparisons below, we take the best model and vary one parameter at a time. The best accuracy achieved in training is 0.97. Figure 30 shows the training and validation accuracy and loss across epochs for the best model in this part, up to the point where training stopped on reaching 99% accuracy.

(I) Effect on Optimizers

For the optimizer, we compare Adam and SGD with the other parameters held the same. The comparison between these 2 optimizers is shown in Table 4. The Adam optimizer performs better than SGD under the same parameters.

Fig. 29 VGG-16 model architecture


Fig. 30 Training/validation accuracy and loss across different epochs

Table 4 Comparison between Adam and SGD optimizers
Same parameters: Number of neurons in dense layer: 512; Dropout: 0.2; Epochs: 20; Learning rate: 0.01; Batch size: 100
Optimizer   Accuracy
Adam        0.97
SGD         0.75

(II) Effect on Number of Neurons in Dense Layer

For the number of neurons in the dense layer, we try 2 values, 512 and 1024. The result is shown in Table 5. The lower number of neurons performs better than the higher number, with an accuracy difference of 0.10.

Table 5 Comparison between different number of neurons
Same parameters: Optimizer: Adam; Dropout: 0.2; Epochs: 20; Learning rate: 0.01; Batch size: 100
No. of neurons in dense layer   Accuracy
512                             0.97
1024                            0.87

(III) Effect on Dropout

The dropout rates used in this experiment are 0.1 and 0.2. The result is shown in Table 6.

Table 6 Comparison between different dropout rate
Same parameters: Optimizer: Adam; No of neurons in dense layer: 512; Epochs: 20; Learning rate: 0.01; Batch size: 100
Dropout   Accuracy
0.2       0.97
0.1       0.81

Table 7 Comparison between different learning rate
Same parameters: Optimizer: Adam; No of neurons in Dense layer: 512; Epochs: 20; Dropout: 0.2; Batch size: 100
Learning rate   Accuracy
0.01            0.97
0.001           0.97

From the result, it is clear that the higher dropout rate performs better than the lower one.

(IV) Effect on Learning Rate

Besides the dropout rate, we also test 2 learning rates, 0.01 and 0.001. The result is shown in Table 7. Changing the learning rate has no effect, as both values give the same accuracy.

(V) Effect on Batch Size

We also test the effect of the batch size during model training. The result is shown in Table 8. The batch size makes only a small difference in accuracy, 0.03, so we conclude that it affects the performance only slightly.

(VI) Effect on Epochs

Lastly, we try 2 numbers of epochs, 10 and 20. The result is shown in Table 9. As with the batch size, the number of epochs affects the accuracy only a little, with a difference of 0.06.

In summary, for VGG16 transfer learning we need to choose the best optimizer, dropout rate and number of neurons in the dense layer to get the best model. The learning rate does not affect the performance of the model, while the batch size and number of epochs affect the accuracy only slightly.

Table 8 Comparison between different batch size
Same parameters: Optimizer: Adam; No of neurons in dense layer: 512; Epochs: 20; Dropout: 0.2; Learning rate: 0.01
Batch size   Accuracy
100          0.97
50           0.94


Table 9 Comparison between different epochs
Same parameters: Optimizer: Adam; No of neurons in dense layer: 512; Batch size: 100; Dropout: 0.2; Learning rate: 0.01
Epochs   Accuracy
20       0.97
10       0.91

Fig. 31 Confusion matrix for VGG-16

After obtaining the best model with the best parameters, we apply it to the testing dataset, which consists of 100 images with 25 images per label. The results are quite good, as illustrated in Fig. 31. Only 5 images of Markisa Manis are misclassified as Markisa Kuning, giving a predicted accuracy of 95%; the rest are predicted correctly. Markisa Kuning appears to be a dominant label. In future work, we can investigate why the other labels are misclassified as Markisa Kuning in this model, even though the misclassified images are a minority.
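The 95% figure can be reproduced from a confusion matrix. The matrix below is reconstructed from the description in the text (25 images per label, 5 Markisa Manis images predicted as Markisa Kuning, everything else correct); it is not read from Fig. 31, and the class order and function names are assumptions made for the example.

```python
classes = ["Markisa Besar", "Markisa Kuning", "Markisa Manis", "Markisa Ungu"]

# Rows are true labels, columns are predicted labels (reconstructed from the text).
confusion = [
    [25, 0, 0, 0],
    [0, 25, 0, 0],
    [0, 5, 20, 0],
    [0, 0, 0, 25],
]

def accuracy(matrix):
    """Share of all images that lie on the diagonal (correct predictions)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total

def recall(matrix, i):
    """Share of the images of class i that are predicted as class i."""
    return matrix[i][i] / sum(matrix[i])

def precision(matrix, j):
    """Share of the predictions of class j that are truly class j."""
    return matrix[j][j] / sum(row[j] for row in matrix)

overall = accuracy(confusion)
manis_recall = recall(confusion, classes.index("Markisa Manis"))
kuning_precision = precision(confusion, classes.index("Markisa Kuning"))
print(overall, manis_recall, round(kuning_precision, 3))
```

Per-class recall and precision show where the errors concentrate: the overall accuracy hides that all of the misclassifications involve the Markisa Manis and Markisa Kuning pair.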

4.3.2 InceptionV3

In this experiment, we import InceptionV3’s base model and omit its top layers, which consist of Dense and dropout layers. The base model’s weights and biases are then frozen to preserve the parameters learned in the previous training. Next, a fully connected part with one dense layer, using the ReLU activation function, and a dropout layer is added. Finally, the Sigmoid activation function is used for the output layer, with four outputs representing the four cultivars of Markisa. Figure 33 illustrates the model architecture. Note that a configuration with two Dense layers and two dropout layers is also used in the experiment part. Figure 32 shows the transfer learning through the Inception-V3 model.


Fig. 32 Transfer learning through Inception-V3 model

Fig. 33 Best performing models

As stated in the previous section, we specified different parameter tuning options for our transfer learning model. To speed up experimentation with different combinations of parameters and to automate the selection of the model with the best accuracy, we conducted hyperparameter tuning with the HParams Dashboard [7]. This resulted in 64 models, or trials, for the classifier with one Dense layer and an additional 64 trials for the classifier with two Dense layers. Comparing the accuracy results of all 128 models, we find that the best performing model has 93% accuracy (refer to Appendix).

(I) Effect on Optimizers

For the optimizer, we limited our selection to Adam and SGD, as shown in Fig. 34. From Fig. 34, we can conclude that the Adam optimizer gives overall higher accuracy than the SGD optimizer in the one Dense layer experiment; it also produced the top-performing model in terms of accuracy. From Fig. 35, we can conclude that the Adam optimizer also gives overall higher accuracy than the SGD optimizer in the two Dense layers experiment. The results are not as good as in the one Dense layer experiment, but Adam again produced the top-performing model.

(II) Effect on Number of Neurons in the Dense Layer

For the number of neurons in the Dense layer, we experimented with two values, 512 and 1024. From Fig. 36, we can conclude that the number of neurons has not affected the model accuracy with one Dense layer, as both the 512 and 1024 configurations produced both lower and higher accuracies; probably, other parameters have more


Fig. 34 HParams scatter plot matrix view for one dense layer optimizer

Fig. 35 HParams scatter plot matrix view for two dense layers optimizer

effect on the model’s accuracy. However, 512 neurons contributed to the top-performing model. From Fig. 37, the results are similar to the one Dense layer experiment; however, 512 neurons contributed to both the top-performing and the lowest-performing model.

(III) Effect on Dropout

For the dropout rate, we experimented with the values 0.1 and 0.2. From Fig. 38, we can conclude that a dropout value of 0.2 gives overall better results for one Dense layer; however, it did not produce the top-performing model, and it did produce the lowest-performing model.


Fig. 36 HParams scatter plot matrix view for one dense layers neurons

Fig. 37 HParams scatter plot matrix view for two dense layers neurons

From Fig. 39, we can conclude that a dropout value of 0.2 resulted in a better performing model with two Dense layers than with one Dense layer; probably, higher dropout values correlate with higher accuracy when there are multiple Dense layers.

(IV) Effect on Learning Rate

In addition to the dropout rate, we also test 2 learning rates, 0.01 and 0.001. From Fig. 40, we can conclude that a learning rate of 0.01 gives overall higher accuracy than 0.001 in the one Dense layer experiment; it also produced the top-performing model in terms of accuracy (Fig. 41).


Fig. 38 HParams scatter plot matrix view for one dense layer dropout

Fig. 39 HParams scatter plot matrix view for two dense layers dropout

Contrary to the one Dense layer experiment, the lower value of 0.001 performed better in the two Dense layers experiment; probably, lower learning rates correlate with higher accuracy when there are multiple Dense layers.

(V) Effect on Batch Size

We also tested the effect of different batch sizes during model training. From Fig. 42, we can conclude that the batch size has not affected the model accuracy with one Dense layer: both the 50 and 100 configurations produced both lower and higher accuracies; probably, other parameters affect the model’s accuracy more. However, a batch size of 50 contributed to the top-performing model.


Fig. 40 HParams scatter plot matrix view for one Dense layer learning rates

Fig. 41 HParams scatter plot matrix view for two Dense layers learning rates

From Fig. 43, the results are similar to the one Dense layer experiment; however, a batch size of 100 contributed to both the top-performing and the lowest-performing model.

(VI) Effect on Epochs

For the number of epochs used to train the models, we experimented with two values, 10 and 20.


Fig. 42 HParams scatter plot matrix view for one dense layer batch size

Fig. 43 HParams scatter plot matrix view for two dense layers batch sizes

From Fig. 44, we can conclude that the higher number of epochs gives overall higher accuracy than the lower one in the one Dense layer experiment; it also produced the top-performing model in terms of accuracy. From Fig. 45, we can conclude that the higher number of epochs also gives higher accuracy in the two Dense layers experiment. Note, however, that although more epochs lead to very high training accuracy, too many epochs cause overfitting, and the validation accuracy then decreases because the model does not generalize well.


Fig. 44 HParams scatter plot matrix view for one dense layer epochs

Fig. 45 HParams scatter plot matrix view for two dense layers epochs

(VII) Effect of the Dense Layer

Figure 46 shows that the model with one Dense layer performs more consistently than the model with two Dense layers. However, the latter shows a performance spike when configured with specific attributes.

In summary, from the experiments on Inception-V3 transfer learning, the best performing model is the one with the parameters shown in Fig. 47. Finally, after finding the best performing model with the best parameters, we test it on a dataset of 100 images with 25 images per label. The results are illustrated in Fig. 48.


Fig. 46 Accuracy comparison between the different configurations of the dense layers

Fig. 47 Top model parameters

From Fig. 48, we can conclude that the best performing Inception-V3 transfer learning model has low testing performance, with an average accuracy of 65.3%. The model failed to classify Markisa Manis.

4.4 Accuracy Comparison

For comparison, the exact same testing set is applied to each of the deep learning architectures; the results are shown in Table 10.


Fig. 48 Confusion matrix for Inception-V3

Table 10 Testing accuracy of all models
Model          Accuracy (%)
Custom CNN 1   97
Custom CNN 2   65
VGG16          95
Inceptionv3    65

5 Conclusion

In this study, 4 different CNN models are built to classify 4 types of Markisa fruit. Two are custom CNN models, and 2 are transfer learning models based on VGG16 and Inceptionv3. The classifiers of the two transfer learning models are replaced with custom classifiers, which are used to make the predictions. The first custom CNN model shows the highest accuracy, 97%, followed by the VGG16 transfer learning model with an accuracy of 95%. The second custom CNN model and the Inceptionv3 model both give a testing accuracy of 65%. Consequently, a custom CNN can achieve testing accuracy comparable to that of a transfer learning model such as VGG16. The architecture design is crucial in determining how well the model captures the features in the input dataset.


Appendix

Table 10 Result of parameter tuning in VGG16

Num_units | Dropout | Optimizer | Epochs | Learning rate | Batch_size | Accuracy
512 | 0.2 | Adam | 20 | 0.01 | 100 | 0.97
512 | 0.2 | Adam | 20 | 0.001 | 100 | 0.97
512 | 0.1 | Adam | 20 | 0.001 | 50 | 0.96
512 | 0.2 | Adam | 20 | 0.001 | 50 | 0.96
1024 | 0.2 | Adam | 10 | 0.001 | 50 | 0.96
512 | 0.1 | SGD | 20 | 0.001 | 100 | 0.95
1024 | 0.1 | Adam | 10 | 0.001 | 100 | 0.95
512 | 0.1 | Adam | 10 | 0.001 | 50 | 0.94
512 | 0.2 | Adam | 20 | 0.01 | 50 | 0.94
512 | 0.1 | SGD | 10 | 0.001 | 50 | 0.94
512 | 0.2 | Adam | 10 | 0.01 | 50 | 0.94
512 | 0.1 | Adam | 20 | 0.001 | 100 | 0.94
512 | 0.2 | Adam | 10 | 0.001 | 100 | 0.94
1024 | 0.2 | Adam | 10 | 0.01 | 100 | 0.94
1024 | 0.1 | SGD | 10 | 0.001 | 100 | 0.94
1024 | 0.1 | SGD | 20 | 0.001 | 100 | 0.94
1024 | 0.2 | Adam | 10 | 0.001 | 100 | 0.94
512 | 0.1 | Adam | 10 | 0.01 | 100 | 0.93
1024 | 0.2 | Adam | 20 | 0.01 | 50 | 0.93
512 | 0.2 | SGD | 20 | 0.001 | 50 | 0.92
1024 | 0.1 | Adam | 10 | 0.01 | 50 | 0.92
1024 | 0.1 | Adam | 20 | 0.001 | 50 | 0.92
512 | 0.2 | Adam | 10 | 0.01 | 100 | 0.91
1024 | 0.2 | SGD | 10 | 0.01 | 50 | 0.91
1024 | 0.2 | Adam | 20 | 0.001 | 100 | 0.9
1024 | 0.2 | SGD | 10 | 0.001 | 50 | 0.89
512 | 0.1 | SGD | 20 | 0.01 | 100 | 0.87
512 | 0.1 | Adam | 10 | 0.001 | 100 | 0.87
1024 | 0.1 | Adam | 20 | 0.01 | 50 | 0.87
1024 | 0.2 | Adam | 20 | 0.01 | 100 | 0.87
1024 | 0.1 | Adam | 10 | 0.01 | 100 | 0.87
1024 | 0.2 | SGD | 20 | 0.001 | 100 | 0.87
512 | 0.2 | SGD | 20 | 0.001 | 100 | 0.86
1024 | 0.1 | Adam | 20 | 0.01 | 100 | 0.86
512 | 0.1 | SGD | 10 | 0.001 | 100 | 0.85
1024 | 0.2 | SGD | 20 | 0.001 | 50 | 0.85
1024 | 0.2 | SGD | 20 | 0.01 | 50 | 0.85
512 | 0.1 | SGD | 10 | 0.01 | 100 | 0.84
1024 | 0.2 | SGD | 10 | 0.01 | 100 | 0.84
512 | 0.1 | Adam | 20 | 0.01 | 50 | 0.83
1024 | 0.2 | SGD | 20 | 0.01 | 100 | 0.83
1024 | 0.2 | Adam | 20 | 0.001 | 50 | 0.83
512 | 0.2 | SGD | 10 | 0.01 | 50 | 0.82
1024 | 0.1 | SGD | 20 | 0.01 | 50 | 0.82
1024 | 0.1 | SGD | 20 | 0.001 | 50 | 0.82
512 | 0.1 | SGD | 20 | 0.001 | 50 | 0.81
512 | 0.1 | Adam | 20 | 0.01 | 100 | 0.81
512 | 0.1 | Adam | 10 | 0.01 | 50 | 0.81
1024 | 0.2 | SGD | 10 | 0.001 | 100 | 0.81
1024 | 0.1 | Adam | 20 | 0.001 | 100 | 0.81
512 | 0.2 | SGD | 20 | 0.01 | 50 | 0.8
1024 | 0.1 | SGD | 10 | 0.001 | 50 | 0.8
1024 | 0.1 | Adam | 10 | 0.001 | 50 | 0.8
512 | 0.2 | SGD | 10 | 0.001 | 100 | 0.79
512 | 0.2 | SGD | 10 | 0.001 | 50 | 0.78
512 | 0.1 | SGD | 10 | 0.01 | 50 | 0.78
512 | 0.1 | SGD | 20 | 0.01 | 50 | 0.77
512 | 0.2 | Adam | 10 | 0.001 | 50 | 0.75
512 | 0.2 | SGD | 20 | 0.01 | 100 | 0.75
512 | 0.2 | SGD | 10 | 0.01 | 100 | 0.75
1024 | 0.1 | SGD | 10 | 0.01 | 50 | 0.75
1024 | 0.1 | SGD | 20 | 0.01 | 100 | 0.73
1024 | 0.2 | Adam | 10 | 0.01 | 50 | 0.73
1024 | 0.1 | SGD | 10 | 0.01 | 100 | 0.72
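The VGG16 tuning table lists 64 configurations, which matches the size of a full grid over six hyperparameters with two values each (2^6 = 64). As a minimal sketch, and assuming the search was an exhaustive grid (the text here does not state how the configurations were generated), such a grid can be enumerated as follows. The names `GRID`, `enumerate_grid`, and `best_config` are illustrative, and no training is performed.

```python
from itertools import product

# Candidate values as they appear in the tuning table.
GRID = {
    "num_units": [512, 1024],
    "dropout": [0.1, 0.2],
    "optimizer": ["Adam", "SGD"],
    "epochs": [10, 20],
    "learning_rate": [0.001, 0.01],
    "batch_size": [50, 100],
}


def enumerate_grid(grid):
    """Return one dict per combination of hyperparameter values."""
    names = list(grid)
    return [dict(zip(names, values)) for values in product(*(grid[n] for n in names))]


def best_config(results):
    """Pick the (config, accuracy) pair with the highest accuracy."""
    return max(results, key=lambda pair: pair[1])


configs = enumerate_grid(GRID)
print(len(configs))
```

In practice each configuration would be passed to a training routine and the resulting accuracy recorded, after which `best_config` selects the top row of the table.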

Table 10 Result of parameter tuning for Inception-V3 with one dense layer

Number of neurons | Dropout rate | Optimizer | Epochs | Learning rate | Batch size | Accuracy (%)
512 | 0.1 | Adam | 20 | 0.01 | 50 | 93
512 | 0.2 | Adam | 20 | 0.01 | 100 | 91
1024 | 0.1 | Adam | 10 | 0.001 | 50 | 90
1024 | 0.1 | Adam | 10 | 0.001 | 100 | 89
1024 | 0.1 | Adam | 20 | 0.01 | 50 | 89
512 | 0.1 | Adam | 20 | 0.001 | 50 | 89
512 | 0.2 | Adam | 10 | 0.001 | 50 | 89
512 | 0.1 | Adam | 10 | 0.001 | 50 | 89
1024 | 0.2 | Adam | 20 | 0.001 | 100 | 89
1024 | 0.2 | Adam | 20 | 0.01 | 50 | 89
1024 | 0.2 | Adam | 20 | 0.01 | 100 | 89
1024 | 0.1 | Adam | 20 | 0.001 | 100 | 89
512 | 0.2 | Adam | 20 | 0.001 | 100 | 89
1024 | 0.2 | Adam | 20 | 0.001 | 50 | 89
512 | 0.2 | Adam | 20 | 0.001 | 50 | 89
512 | 0.1 | Adam | 10 | 0.01 | 100 | 88
1024 | 0.2 | Adam | 10 | 0.001 | 50 | 88
1024 | 0.1 | Adam | 20 | 0.01 | 100 | 88
512 | 0.2 | Adam | 10 | 0.01 | 50 | 88
1024 | 0.1 | Adam | 10 | 0.01 | 100 | 88
512 | 0.1 | Adam | 20 | 0.01 | 100 | 88
512 | 0.1 | Adam | 10 | 0.01 | 50 | 88
1024 | 0.1 | Adam | 20 | 0.001 | 50 | 88
512 | 0.1 | Adam | 10 | 0.001 | 100 | 88
512 | 0.1 | Adam | 20 | 0.001 | 100 | 88
1024 | 0.1 | Adam | 10 | 0.01 | 50 | 87
1024 | 0.2 | Adam | 10 | 0.01 | 50 | 87
1024 | 0.2 | Adam | 10 | 0.001 | 100 | 87
512 | 0.2 | Adam | 10 | 0.01 | 100 | 86
1024 | 0.2 | Adam | 10 | 0.01 | 100 | 86
512 | 0.2 | Adam | 20 | 0.01 | 50 | 86
512 | 0.2 | Adam | 10 | 0.001 | 100 | 86
1024 | 0.2 | SGD | 20 | 0.001 | 100 | 90
1024 | 0.2 | SGD | 20 | 0.01 | 100 | 89
512 | 0.1 | SGD | 20 | 0.001 | 100 | 89
1024 | 0.1 | SGD | 20 | 0.01 | 50 | 89
1024 | 0.1 | SGD | 20 | 0.01 | 100 | 89
512 | 0.2 | SGD | 20 | 0.01 | 100 | 89
512 | 0.1 | SGD | 20 | 0.01 | 100 | 89
1024 | 0.2 | SGD | 10 | 0.01 | 50 | 89
1024 | 0.1 | SGD | 10 | 0.01 | 100 | 88
1024 | 0.2 | SGD | 10 | 0.01 | 100 | 88
1024 | 0.1 | SGD | 20 | 0.001 | 50 | 88
512 | 0.1 | SGD | 10 | 0.01 | 50 | 88
1024 | 0.2 | SGD | 20 | 0.01 | 50 | 88
512 | 0.2 | SGD | 20 | 0.001 | 100 | 87
512 | 0.2 | SGD | 10 | 0.01 | 50 | 87
1024 | 0.2 | SGD | 20 | 0.001 | 50 | 87
1024 | 0.1 | SGD | 10 | 0.01 | 50 | 87
512 | 0.1 | SGD | 20 | 0.01 | 50 | 87
512 | 0.1 | SGD | 10 | 0.01 | 100 | 87
512 | 0.1 | SGD | 20 | 0.001 | 50 | 87
512 | 0.2 | SGD | 10 | 0.001 | 100 | 87
1024 | 0.1 | SGD | 20 | 0.001 | 100 | 87
1024 | 0.1 | SGD | 10 | 0.001 | 50 | 86
1024 | 0.2 | SGD | 10 | 0.001 | 50 | 86
512 | 0.2 | SGD | 10 | 0.01 | 100 | 86
512 | 0.2 | SGD | 10 | 0.001 | 50 | 85
512 | 0.1 | SGD | 10 | 0.001 | 100 | 85
512 | 0.2 | SGD | 20 | 0.01 | 50 | 85
1024 | 0.1 | SGD | 10 | 0.001 | 100 | 84
512 | 0.2 | SGD | 20 | 0.001 | 50 | 84
512 | 0.1 | SGD | 10 | 0.001 | 50 | 83
1024 | 0.2 | SGD | 10 | 0.001 | 100 | 77

Table 10 Result of parameter tuning for Inception-V3 with two dense layers

Optimizer | Learning rate | Batch size | Dropout rate | Epochs | Number of neurons | Accuracy (%)
Adam | 0.001 | 100 | 0.2 | 20 | 512 | 92
Adam | 0.001 | 50 | 0.1 | 20 | 512 | 90
Adam | 0.001 | 100 | 0.1 | 20 | 512 | 89
Adam | 0.001 | 100 | 0.1 | 10 | 512 | 89
Adam | 0.001 | 50 | 0.1 | 20 | 1024 | 89
Adam | 0.001 | 50 | 0.2 | 20 | 1024 | 89
Adam | 0.01 | 100 | 0.2 | 20 | 512 | 89
Adam | 0.01 | 50 | 0.1 | 20 | 512 | 89
Adam | 0.001 | 100 | 0.1 | 10 | 1024 | 89
SGD | 0.001 | 50 | 0.1 | 20 | 1024 | 89
SGD | 0.001 | 50 | 0.2 | 20 | 512 | 89
SGD | 0.01 | 100 | 0.1 | 20 | 512 | 89
SGD | 0.01 | 100 | 0.2 | 20 | 512 | 89
SGD | 0.01 | 50 | 0.1 | 20 | 512 | 89
SGD | 0.01 | 50 | 0.1 | 20 | 1024 | 89
SGD | 0.01 | 50 | 0.1 | 10 | 1024 | 89
SGD | 0.01 | 100 | 0.2 | 20 | 1024 | 89
Adam | 0.001 | 100 | 0.2 | 10 | 1024 | 88
Adam | 0.001 | 100 | 0.2 | 10 | 512 | 88
Adam | 0.001 | 50 | 0.2 | 20 | 512 | 88
Adam | 0.001 | 50 | 0.1 | 10 | 1024 | 88
Adam | 0.001 | 100 | 0.1 | 20 | 1024 | 88
Adam | 0.01 | 100 | 0.2 | 20 | 1024 | 88
Adam | 0.001 | 100 | 0.2 | 20 | 1024 | 88
Adam | 0.01 | 50 | 0.2 | 20 | 512 | 88
Adam | 0.001 | 50 | 0.2 | 10 | 512 | 88
Adam | 0.001 | 50 | 0.2 | 10 | 1024 | 88
Adam | 0.01 | 100 | 0.1 | 10 | 512 | 88
SGD | 0.01 | 50 | 0.2 | 20 | 1024 | 88
SGD | 0.01 | 50 | 0.1 | 10 | 512 | 88
SGD | 0.01 | 50 | 0.2 | 10 | 1024 | 88
SGD | 0.01 | 50 | 0.2 | 20 | 512 | 88
SGD | 0.01 | 100 | 0.1 | 20 | 1024 | 88
SGD | 0.01 | 50 | 0.2 | 10 | 512 | 88
Adam | 0.01 | 100 | 0.1 | 20 | 512 | 87
Adam | 0.01 | 100 | 0.1 | 10 | 1024 | 87
Adam | 0.01 | 50 | 0.2 | 20 | 1024 | 87
Adam | 0.001 | 50 | 0.1 | 10 | 512 | 87
Adam | 0.01 | 100 | 0.2 | 10 | 1024 | 87
SGD | 0.01 | 100 | 0.2 | 10 | 512 | 87
SGD | 0.001 | 50 | 0.1 | 20 | 512 | 87
SGD | 0.01 | 100 | 0.2 | 10 | 1024 | 87
SGD | 0.01 | 100 | 0.1 | 10 | 512 | 87
SGD | 0.01 | 100 | 0.1 | 10 | 1024 | 87
Adam | 0.01 | 100 | 0.2 | 10 | 512 | 86
SGD | 0.001 | 50 | 0.1 | 10 | 1024 | 86
SGD | 0.001 | 100 | 0.1 | 20 | 512 | 86
SGD | 0.001 | 50 | 0.2 | 10 | 512 | 86
Adam | 0.01 | 50 | 0.1 | 10 | 512 | 85
Adam | 0.01 | 50 | 0.2 | 10 | 1024 | 85
Adam | 0.01 | 100 | 0.1 | 20 | 1024 | 85
Adam | 0.01 | 50 | 0.1 | 20 | 1024 | 85
SGD | 0.001 | 100 | 0.1 | 20 | 1024 | 85
SGD | 0.001 | 100 | 0.2 | 10 | 512 | 85
SGD | 0.001 | 50 | 0.2 | 10 | 1024 | 85
SGD | 0.001 | 100 | 0.2 | 20 | 512 | 85
SGD | 0.001 | 100 | 0.2 | 20 | 1024 | 84
SGD | 0.001 | 100 | 0.2 | 10 | 1024 | 84
SGD | 0.001 | 100 | 0.1 | 10 | 1024 | 83
SGD | 0.001 | 50 | 0.2 | 20 | 1024 | 83
Adam | 0.01 | 50 | 0.1 | 10 | 1024 | 82
SGD | 0.001 | 100 | 0.1 | 10 | 512 | 82
Adam | 0.01 | 50 | 0.2 | 10 | 512 | 80
SGD | 0.001 | 50 | 0.1 | 10 | 512 | 78

References

1. Passiflora edulis. July 1, 2021. [Online]. https://en.wikipedia.org/wiki/Passiflora_edulis
2. Abualigah, L. M. Q. (2019). Feature selection and enhanced krill herd algorithm for text document clustering (pp. 1–165). Springer.
3. Dwivedi, R. (2020, December 4). Everything you should know about dropouts and batch normalization in CNN. Analytics India Magazine. https://analyticsindiamag.com/everythingyou-should-know-about-dropouts-and-batchnormalization-in-cnn/
4. Khandelwal, R. (2019, January 10). L1 and L2 regularization. DataDrivenInvestor, Medium. https://medium.datadriveninvestor.com/l1-l2-regularization-7f1b4fe948f2?gi=bccf46d4504a
5. Kumari, N., Bhatt, A. K., Dwivedi, R. K., & Belwal, R. (2019). Performance analysis of support vector machine in defective and non defective mangoes classification.
6. Alhaj, Y. A., Dahou, A., Al-qaness, M. A., Abualigah, L., Abbasi, A. A., Almaweri, N. A. O., Elaziz, M. A., & Damaševičius, R. (2022). A novel text classification technique using improved particle swarm optimization: A case study of Arabic language. Future Internet, 14(7), 194.
7. Hyperparameter tuning with the HParams dashboard. TensorFlow, April 8, 2021. [Online]. Available: https://www.tensorflow.org/tensorboard/hyperparameter_tuning_with_hparams. Accessed June 5, 2021.
8. Abualigah, L., Diabat, A., Mirjalili, S., Abd Elaziz, M., & Gandomi, A. H. (2021). The arithmetic optimization algorithm. Computer Methods in Applied Mechanics and Engineering, 376, 113609.
9. Daradkeh, M., Abualigah, L., Atalla, S., & Mansoor, W. (2022). Scientometric analysis and classification of research using convolutional neural networks: A case study in data science and analytics. Electronics, 11(13), 2066.
10. Wu, D., Jia, H., Abualigah, L., Xing, Z., Zheng, R., Wang, H., & Altalhi, M. (2022). Enhance teaching-learning-based optimization for tsallis-entropy-based feature selection classification approach. Processes, 10(2), 360.
11. Ali, M. A., Balasubramanian, K., Krishnamoorthy, G. D., Muthusamy, S., Pandiyan, S., Panchal, H., Mann, S., Thangaraj, K., El-Attar, N. E., Abualigah, L., & Elminaam, A. (2022). Classification of glaucoma based on elephant-herding optimization algorithm and deep belief network. Electronics, 11(11), 1763.
12. Abualigah, L., Kareem, N. K., Omari, M., Elaziz, M. A., & Gandomi, A. H. (2021). Survey on Twitter sentiment analysis: architecture, classifications, and challenges. In Deep learning approaches for spoken and natural language processing (pp. 1–18). Springer.
13. O’Mahony, N., Campbell, S., Carvalho, A., Harapanahalli, S., Hernandez, G. V., Krpalkova, L., Riordan, D., & Walsh, J. (2019, April). Deep learning vs. traditional computer vision. In Science and information conference (pp. 128–144). Springer.
14. Risdin, F., Mondal, P. K., & Hassan, K. M. (2020). Convolutional neural networks (CNN) for detecting fruit information using machine learning techniques. IOSR Journal of Computer Engineering, 22, 1–13.
15. Palakodati, S. S. S., Chirra, V. R. R., Yakobu, D., & Bulla, S. (2020). Fresh and rotten fruits classification using CNN and transfer learning. Revue d’Intelligence Artificielle, 34(5), 617–622.
16. Kishore, M., Kulkarni, S., & Senthil Babu, K. (n.d.). Fruits and vegetables classification using progressive resizing and transfer learning. Journal of University of Shanghai for Science and Technology. Retrieved July 5, 2021, from https://jusst.org/wp-content/uploads/2021/02/Fruitsand-Vegetables-Classification-using-Progressive-Resizing-and-Transfer-Learning-1.pdf
17. Pardede, J., Sitohang, B., Akbar, S., & Khodra, M. (2021). Implementation of transfer learning using VGG16 on fruit ripeness detection. International Journal of Intelligent Systems and Applications, 13(2), 52–61. https://doi.org/10.5815/ijisa.2021.02.04
18. Inceptionv3. Wikimedia Foundation, June 29, 2021. [Online]. https://en.wikipedia.org/wiki/Inceptionv3. Accessed July 5, 2021.
19. Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., & Wojna, Z. (2016). Rethinking the inception architecture for computer vision. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 2818–2826).
20. Lin, C., Li, L., Luo, W., Wang, K. C., & Guo, J. (2019). Transfer learning based traffic sign recognition using inception-v3 model. Periodica Polytechnica Transportation Engineering, 242–250.
21. Abualigah, L., Yousri, D., Abd Elaziz, M., Ewees, A. A., Al-Qaness, M. A., & Gandomi, A. H. (2021). Aquila optimizer: A novel meta-heuristic optimization algorithm. Computers and Industrial Engineering, 157, 107250.
22. Abualigah, L., Abd Elaziz, M., Sumari, P., Geem, Z. W., & Gandomi, A. H. (2022). Reptile Search Algorithm (RSA): A nature-inspired meta-heuristic optimizer. Expert Systems with Applications, 191, 116158.
23. Agushaka, J. O., Ezugwu, A. E., & Abualigah, L. (2022). Dwarf mongoose optimization algorithm. Computer Methods in Applied Mechanics and Engineering, 391, 114570.
24. Oyelade, O. N., Ezugwu, A. E. S., Mohamed, T. I., & Abualigah, L. (2022). Ebola optimization search algorithm: A new nature-inspired metaheuristic optimization algorithm. IEEE Access, 10, 16150–16177.
25. Ezugwu, A. E., Agushaka, J. O., Abualigah, L., Mirjalili, S., & Gandomi, A. H. (2022). Prairie dog optimization algorithm. Neural Computing and Applications, 1–49.
26. Fan, H., Du, W., Dahou, A., Ewees, A. A., Yousri, D., Elaziz, M. A., Elsheikh, A. H., Abualigah, L., & Al-qaness, M. A. (2021). Social media toxicity classification using deep learning: Realworld application UK Brexit. Electronics, 10(11), 1332.
27. Simonyan, K., & Zisserman, A. (2014). Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556

Enhanced MapReduce Performance for the Distributed Parallel Computing: Application of the Big Data

Nathier Milhem, Laith Abualigah, Mohammad H. Nadimi-Shahraki, Heming Jia, Absalom E. Ezugwu, and Abdelazim G. Hussien

Abstract In recent years, the volume of data has grown at an accelerating rate, which requires ever more storage. Big data and cloud computing have a huge number of users, and these users need to access data securely and privately from any device at any time. Therefore, it is important to provide a safe flow of data in the Internet of Things (IoT record files) and to reduce its size in a way that does not affect its purpose. The most important field of data mining is the search for repeated items and data inside storage locations. The Apriori algorithm has been the most common algorithm for finding sets of repeated elements in data. This requires deleting groups of data that are repeated more than once and creating new groups after the repeated ones are deleted, which increases the available storage space and improves performance. In this paper, we implement the MapReduce Apriori (MRA) algorithm on an Apache Hadoop cluster; it includes two functions (Map and Reduce) to find the repeated sets of k elements.

Keywords Internet of Things (IoT) · Big Data · Hadoop · Map Reduce · Apriori algorithms · Data mining

N. Milhem · L. Abualigah (B)
School of Computer Sciences, Universiti Sains Malaysia, 11800, George Town, Pulau Pinang, Malaysia
e-mail: [email protected]

L. Abualigah
Hourani Center for Applied Scientific Research, Al-Ahliyya Amman University, Amman, Jordan
Faculty of Information Technology, Middle East University, Amman 11831, Jordan

M. H. Nadimi-Shahraki
Faculty of Computer Engineering, Islamic Azad University, Najafabad Branch, 8514143131 Najafabad, Iran
Big Data Research Center, Islamic Azad University, Najafabad Branch, 8514143131 Najafabad, Iran
Centre for Artificial Intelligence Research and Optimisation, Torrens University Australia, Brisbane 4006, Australia

H. Jia
Department of Information Engineering, Sanming University, Fujian 365004, China

A. E. Ezugwu
School of Mathematics, Statistics, and Computer Science, University of KwaZulu-Natal, KwaZulu-Natal, King Edward Road, Pietermaritzburg 3201, South Africa

A. G. Hussien
Department of Computer and Information Science, Linköping University, Linköping, Sweden
Faculty of Science, Fayoum University, Faiyum, Egypt

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023
L. Abualigah (ed.), Classification Applications with Deep Learning and Machine Learning Technologies, Studies in Computational Intelligence 1071, https://doi.org/10.1007/978-3-031-17576-3_8

1 Introduction

Modern technology has become more complex, especially with the development of Internet of Things devices. As a result, data volumes grow dramatically over time and reach a size and complexity that make the data difficult to store, while tools to manage or process it with high efficiency are lacking [1]. Internet of Things devices connected to distributed and cloud infrastructure provide and transmit data and other resources for uploading to the cloud. Therefore, it is important to ensure that data and resources are ready to be accessed, that users can access them securely in any IoT environment, and that they are distributed in an orderly manner with reduced volume [2].

Distributed and parallel computing systems are the best way to process data on a large scale, and existing algorithms have been transformed into ‘large algorithms’ to work with big data. MapReduce is a programming model for the parallel and distributed execution of big data workloads and is one of the best approaches to data analysis in this field [3].

The Apriori algorithm is the most popular and widely used algorithm in data mining; it mines sets of frequent elements using candidate generation. Apriori is the core algorithm of Association Rule Mining (ARM), and its genesis has fueled research in data mining. Apriori is one of the top 10 data mining algorithms identified by the IEEE International Conference on Data Mining (ICDM) in 2006 as the most impactful in data mining [4].

Big data is determined not only by its large size but also by characteristics such as high velocity and the movement of highly varied data in many forms. Traditional data mining techniques and tools are effective in analyzing and extracting data but are neither scalable nor efficient in managing big data. Big data architectures and technologies have been adopted to analyze this data.

This study proposes an application of distributed parallel computing to big data and examines how big data collected from the Internet of Things can be taken as input and simplified after processing with Hadoop. It also asks how effective and valid it is to reduce the repetition in big data while ensuring its quality in operations that include data collection and processing, using algorithms that analyze the data and its results.


2 Background

2.1 Big Data (BD)

The term “Big Data” covers the large volume, different forms, speed of processing, technology, methods, and impact of digital data collected from companies and individuals [5–12]. Big Data is an information asset characterized by such high volume, velocity, and variety that it requires specific technology and analytical methods to transform it into value [13, 14].

Volume: The large amount of data that is generated or obtained from various sources such as social media, banks, and the government and private sectors. It keeps increasing and was put at more than 44 trillion GB by the year 2021.
Value: The value obtained by collecting data from different sources, analyzing it, and confirming what it is worth. The analysis yields values of interest to companies and businesses for growth and progress, on which future decisions and ideas can be based.
Veracity: The contradictions and doubts that exist in the data during processing. Some data packets have to be removed.
Velocity: The rate at which data accumulates. This property measures the data generation rate as the number of users accessing data via IoT increases.
Variety: The different formats of data, including data coming from IoT (images, video, JSON files, and social media). Data comes in three formats: structured, unstructured, and semi-structured. Fig. 1 shows these formats.

Fig. 1 Big data classification


2.2 Hadoop

Hadoop is an open-source, Java-based programming framework that processes big data sets in a distributed computing environment. The Hadoop ecosystem provides various services for solving big data problems. It contains Apache projects and a set of tools and special solutions. Hadoop has four major components: HDFS, MapReduce, YARN, and Hadoop Common. The other tools support these key components and are linked together to provide services such as data ingestion, analysis, storage, and maintenance.

2.2.1 HDFS

The Hadoop Distributed File System (HDFS) is designed to hold very large amounts of data (terabytes or even petabytes), such as data obtained from the Internet of Things, and to provide access to that information. It stores files redundantly across storage devices to tolerate failures and to ensure high availability.

2.2.2 Map Reduce

MapReduce is a programming model created to process big data by dividing the work into a group of independent tasks that run in parallel, using two functions [1]. The first is Map and the second is Reduce. The Map function extracts results as key-value pairs; the Reduce function takes the output of Map and processes it to produce a collection of values (Fig. 2).

2.3 Apriori Algorithm

The Apriori algorithm works on sets of frequent elements in order to establish the associations between them. It is designed for big data whose items are related to one another: the frequent sets of K elements are used to derive the sets of K + 1 elements, which makes it possible to determine how strong or weak the connection between two objects is. The algorithm is widely used to compute frequent element sets efficiently. The goal of this iterative process is to find the repeated data sets within a huge data set. Other optimization methods can also be used to optimize such problems, as given in [15–20] (Fig. 3).


Fig. 2 MapReduce programming model

Fig. 3 The APRIORI-algorithm

3 Related Work

Researchers and research communities have experimented with many approaches to big data analytics. This section presents the work conducted and the results of each study.

In [3], MapReduce is used to scale algorithms for big data analytics. The focus was on reducing the volume of data through the development of MapReduce and its integration with Apriori, with these algorithms working on analyzing big data. The study focused on solving the problem of scaling the ‘large algorithms’ of the common correlation mining algorithm. Its results confirm that an effective MapReduce implementation should avoid dependent iterations, such as those of the original sequential Apriori algorithm.

In “Utility Frequent Patterns Mining on Large Scale Data based on Apriori MapReduce Algorithm”, the main objective was to enhance pattern mining algorithms to work on big data by proposing a set of algorithms based on the MapReduce architecture and the Hadoop environment. The algorithm merged Apriori with MapReduce, and the results indicated good performance [21].

In work on the effective implementation of the Apriori algorithm based on MapReduce on Hadoop, a set of problems was posed, such as load balancing, the mechanism for dividing and distributing data, monitoring, and passing parameters between nodes [22].

Parallel and distributed computing has become a wide and diversified field. What distinguishes Hadoop is its scalability, its simplicity, and its high reliability, which let it solve most challenges and problems easily and effectively. To determine how a distributed parallel computing environment for big data can be built on MapReduce-based Apriori algorithms, we present the surveyed literature (Table 1) as a case study to highlight the challenges in effectively implementing the MapReduce Apriori (MRA) algorithm on an Apache Hadoop cluster to stream and process big data.

4 Methodology (Prescriptive Study)

4.1 Hadoop Architecture

The architecture of a Hadoop cluster (Fig. 4) consists of a master and slaves: the master is the NameNode and the slaves are DataNodes. The master runs the NameNode daemon for HDFS, and each slave runs a DataNode daemon. The job submission node on the master side runs the JobTracker, which is the only point of contact for a client wishing to execute a MapReduce job, and each slave runs a TaskTracker. The JobTracker monitors the progress of running MapReduce tasks and is responsible for coordinating the execution of the map and reduce phases [14]. The NameNode and JobTracker services typically run on two different machines, although in a small cluster they are often co-located. The bulk of a Hadoop cluster consists of slave nodes that run both a TaskTracker, which is responsible for running the user code, and a DataNode daemon, which serves HDFS data [13].


Table 1 Comparison of existing approaches for handling frequent elements with respect to efficiency, validation, scalability, and reliability

Author | Year | Objective | Pros | Cons
[23] | 2010 | The Apache MapReduce framework is used to achieve parallelism and find frequent elements | A set of 9 machines (1 master and 8 slaves); data from IBM Corporation; the number of nodes was compared with the speed-up on the Hadoop cluster | MapReduce combined with Apriori gives more advantage: it can easily be applied on many machines to deal with big data without synchronization problems
[24] | 2011 | A new rule-mining strategy focused on the cloud computing environment, with a proposed method for distributing big data sets | Data set from Google; the input data was divided into two groups, one of 16 MB and one of 64 MB; experiments relate the number of nodes to execution time | The algorithm works effectively in the cloud computing environment and can extract the redundant set of data from the data group through its data segmentation and distribution mechanism; the efficiency of the algorithm was improved
[25] | 2012 | Proposes a new framework for big data on certain types of distributable problems, using a huge number of nodes to scale well and efficiently | Experiments on a data set for an AllElectronics branch; the framework used three stages: scaleup, sizeup, and speedup | The experimental results across the three stages show that it is more efficient and can work on huge data
[26] | 2013 | A new model for mining data sets of frequent elements that provides validation, scalability, and reliability | Used 256 MB data sets and a single machine to compare the Apriori and FP-Growth algorithms by running time against data size | The model showed that the results of this method are feasible, valid, and capable of improving the overall performance of large-scale data mining
[27] | 2014 | An algorithm for big data processing and efficient data mining when the threshold value and the original database change at the same time | A Java program run on an Intel i3-200 dual-core 3.10 GHz processor with 4 GB main memory; the Apriori and FP-Growth algorithms were compared by data set size and data set transactions | The results showed that the algorithm leads to higher acceleration and is effective in reducing running time
[28] | 2015 | Discusses the use and implementation of big data mining for e-commerce companies to improve sales processes | A Hadoop cluster set up with 4 nodes (3 slaves and 1 master NameNode) on Ubuntu 14.04 | The product inventory of e-commerce companies can be updated at regular intervals based on the set of recurring items
[29] | 2016 | Focuses on taking the timestamp at each stage and treating it as a symbol in the transaction, which suits indexing data by timestamp | The design enhances Apriori by implementing it on MapReduce and HBase on a Hadoop cluster and compares the original Apriori with MH-Apriori; Linux with Hadoop 0.20.0; 5 nodes (1 master, 4 slaves); 1.8 GB data set from IBM | Map-HBase-Apriori needs only one scan of the database to finish matching the frequent elements
[30] | 2017 | Enhances Apriori on a Hadoop cluster for big data, applied to axle faults of EMUs | 350,000 records from 2007 to 2014 were obtained after data preprocessing; Apriori was applied on a Hadoop cluster | The results showed that the algorithm achieved high accuracy in error prediction and speed in operation
[31] | 2018 | Develops a MapReduce-based approach with the Apriori algorithm for recursive data mining that works on any type of database | A new algorithm, Apriori Core MapReduce, is proposed for big data; it takes less time and memory than the original algorithm | This algorithm works on any type of database
[32] | 2019 | Improving the performance of parallel mining of iterative element sets using Hadoop, with a comparison of FP-Growth and Apriori | The algorithm is implemented on a data set and market basket analyses | If the proposed approach does not work with MapReduce, the time for exploration forces will decrease
[33] | 2020 | Creates an algorithm that mines a real data set effectively, works in parallel, and splits the original data set | Hadoop v1.2.1; 400 GB of data from AIS Global covering two months (4–5) of 2012; experiments in three stages: calculating the partition number N, determining partition boundaries, and partitioning the data set | The number of frequent cases decreases rapidly, but the authors considered the data size small relative to the experiment, and more data is needed for comparison

Fig. 4 Hadoop: master/slave architecture


Fig. 5 MapReduce in Hadoop

4.2 MR Programming Model

The MapReduce computing model (Fig. 5) includes two functions, Map() and Reduce(). Both functions are defined over key-value pairs. The Map function is applied to each (key1, value1) pair in the input data set, and each call produces a list of (key2, value2) pairs. All pairs in the output lists that share the same key are passed to the Reduce() function, which generates one value (value3) or returns nothing.
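The model described above can be sketched in a few lines of single-process Python. This is only an illustration of the data flow under simplifying assumptions, not Hadoop code: the names `map_fn`, `reduce_fn`, and `run_mapreduce` are hypothetical, the grouping step stands in for Hadoop's shuffle phase, and the example simply counts how often each item occurs.

```python
from collections import defaultdict


def map_fn(key1, value1):
    """Map: one (key1, value1) input record -> list of (key2, value2) pairs."""
    return [(item, 1) for item in value1]


def reduce_fn(key2, values):
    """Reduce: all values that share key2 -> one output value (value3)."""
    return sum(values)


def run_mapreduce(records, mapper, reducer):
    """Apply the mapper to every record, group by key, then reduce each group."""
    groups = defaultdict(list)
    for key1, value1 in records:
        for key2, value2 in mapper(key1, value1):
            groups[key2].append(value2)  # stands in for the shuffle phase
    return {key2: reducer(key2, values) for key2, values in groups.items()}


# Each record is (transaction id, list of items).
records = [(0, ["a", "b"]), (1, ["a", "c"]), (2, ["a", "b", "c"])]
counts = run_mapreduce(records, map_fn, reduce_fn)
print(counts)
```

On a real cluster the mapper calls run on different nodes and the grouping is done by the framework; the sequential loop here only mirrors the logical steps.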

4.3 Apriori Algorithm

The Apriori algorithm uses an iterative method called layer-by-layer search, whereby the sets of k elements are used to explore the sets of (k + 1) elements. First, scan the database, count the occurrences of each item, and collect the items that meet the minimum support to obtain the frequent 1-element sets. This set is referred to as L1. Then use L1 to find L2, the set of frequent 2-element sets, use L2 to find L3, and so on, until no more frequent k-element sets can be found. Finding each Lk requires one full database scan. The Apriori algorithm uses the a priori property of frequent element sets, namely that every subset of a frequent set must itself be frequent, to compress the search space.
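The layer-by-layer search can be sketched as follows. This is a minimal in-memory illustration, assuming that support is given as an absolute count and that transactions fit in memory; the function name `apriori` and the sample transactions are hypothetical and not taken from the chapter.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frozenset(itemset): support count} for all frequent itemsets."""
    transactions = [frozenset(t) for t in transactions]

    def count(candidates):
        # One full scan of the database per level.
        return {c: sum(1 for t in transactions if c <= t) for c in candidates}

    # L1: frequent single items.
    items = {frozenset([i]) for t in transactions for i in t}
    level = {c: n for c, n in count(items).items() if n >= min_support}
    frequent = dict(level)

    k = 1
    while level:
        k += 1
        prev = set(level)
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in prev for s in combinations(c, k - 1))
        }
        level = {c: n for c, n in count(candidates).items() if n >= min_support}
        frequent.update(level)
    return frequent


baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
result = apriori(baskets, min_support=2)
print(sorted((sorted(s), n) for s, n in result.items()))
```

In this toy data set the candidate {bread, milk, butter} is discarded in the prune step without a database scan, because its subset {milk, butter} is not frequent.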


N. Milhem et al.

5 Result and Discussion (Proposed Framework)

This section presents the proposed framework for enhancing the performance of distributed parallel computing for big data in a cloud computing environment; Fig. 6 shows how the framework processes data. The section explains the framework architecture and its basic concepts in the context of the MapReduce-based Apriori algorithm, and presents a high-level abstract architecture for finding frequent itemsets in data. The framework consists of the following components:

Logs file: The colored group is data collected from the Internet of Things and stored as raw, unordered data in HDFS on the Hadoop cluster.

MAP-Apriori: The data is fetched from HDFS in the form of (K, V) pairs and arranged according to the type of data. This process can be repeated more than once because it involves frequent items.

Reduce-Apriori: The inputs of this process are groups of (K, V) pairs of the same kind coming from MAP-Apriori across multiple layers. Its main task is to collect the similar values (K, Vn).

Output: At this stage, the data is verified before it is stored in the HDFS output. All frequent values (K1, V1) are fed back to MAP-Apriori until they are completely collected.
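One way to read the MAP-Apriori / Reduce-Apriori loop is sketched below in single-process Python. It is an interpretation of the components listed above, not the authors' implementation: the function names, the use of in-memory lists in place of HDFS partitions, and the absolute `min_support` count are all assumptions.

```python
from collections import Counter

def map_apriori(partition, candidates):
    """MAP-Apriori: count each candidate itemset inside one data partition."""
    local = Counter()
    for transaction in partition:
        for candidate in candidates:
            if candidate <= transaction:
                local[candidate] += 1
    return list(local.items())          # (K, V) pairs

def reduce_apriori(pairs, min_support):
    """Reduce-Apriori: sum the values of identical keys, keep the frequent ones."""
    totals = Counter()
    for key, value in pairs:
        totals[key] += value
    return {key: n for key, n in totals.items() if n >= min_support}

def mine(partitions, min_support):
    # Start with every single item as a candidate.
    candidates = {frozenset([i]) for p in partitions for t in p for i in t}
    frequent, k = {}, 1
    while candidates:
        pairs = [kv for p in partitions for kv in map_apriori(p, candidates)]
        level = reduce_apriori(pairs, min_support)
        frequent.update(level)
        k += 1
        # Output stage: feed frequent itemsets back to the map stage
        # as the next, larger candidates.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
    return frequent

# Two hypothetical partitions standing in for HDFS blocks.
partitions = [
    [frozenset({"a", "b"}), frozenset({"a", "c"})],
    [frozenset({"a", "b", "c"}), frozenset({"b"})],
]
result = mine(partitions, min_support=2)
```

Each pass of the `while` loop corresponds to one MapReduce job, which is why the number of passes (and full scans of the data) grows with the length of the longest frequent itemset.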

Fig. 6 MP-A frame


6 Conclusion

In this paper, we have proposed a new framework for efficiently mining frequent patterns in big data, and applied algorithms that reduce repetition in big data effectively and validly while ensuring its quality in operations, using MapReduce-based Apriori algorithms on a Hadoop cluster. The practical studies related to this field were compared with each other, and the results show that the approach is widely effective in the field of data mining. Having compared these studies and verified that the algorithms give reliable results in this field, we will apply them to neural-network-based deep learning, especially since different studies already run it on MapReduce.

References

1. Altaf, M. A. B., Barapatre, H. K., & Sangvi, A. Mining condensed representations of frequent patterns on big data using max Apriori map reducing technique.
2. Apache Hadoop. http://hadoop.apache.org/
3. Kijsanayothin, P., Chalumporn, G., & Hewett, R. (2019). On using MapReduce to scale algorithms for big data analytics: A case study. Journal of Big Data, 6, 105. https://doi.org/10.1186/s40537-019-0269-1
4. Singh, S., Garg, R., & Mishra, P. K. (2018). Performance optimization of MapReduce-based Apriori algorithm on Hadoop cluster. Computers and Electrical Engineering, 67, 348–364. ISSN 0045-7906.
5. Gharaibeh, M., Alzu’bi, D., Abdullah, M., Hmeidi, I., Al Nasar, M. R., Abualigah, L., & Gandomi, A. H. (2022). Radiology imaging scans for early diagnosis of kidney tumors: A review of data analytics-based machine learning and deep learning approaches. Big Data and Cognitive Computing, 6(1), 29.
6. Gandomi, A. H., Chen, F., & Abualigah, L. (2022). Machine learning technologies for big data analytics. Electronics, 11(3), 421.
7. Bashabsheh, M. Q., Abualigah, L., & Alshinwan, M. (2022). Big data analysis using hybrid meta-heuristic optimization algorithm and MapReduce framework. In Integrating meta-heuristics and machine learning for real-world optimization problems (pp. 181–223). Springer.
8. Gharaibeh, M., Almahmoud, M., Ali, M. Z., Al-Badarneh, A., El-Heis, M., Abualigah, L., Altalhi, M., Alaiad, A., & Gandomi, A. H. (2021). Early diagnosis of Alzheimer’s disease using cerebral catheter angiogram neuroimaging: A novel model based on deep learning approaches. Big Data and Cognitive Computing, 6(1), 2.
9. Abualigah, L., Diabat, A., & Elaziz, M. A. (2021). Intelligent workflow scheduling for big data applications in IoT cloud computing environments. Cluster Computing, 24(4), 2957–2976.
10. Abualigah, L., Gandomi, A. H., Elaziz, M. A., Hamad, H. A., Omari, M., Alshinwan, M., & Khasawneh, A. M. (2021). Advances in meta-heuristic optimization algorithms in big data text clustering. Electronics, 10(2), 101.
11. Abualigah, L., & Masri, B. A. (2021). Advances in MapReduce big data processing: Platform, tools, and algorithms. In Artificial intelligence and IoT (pp. 105–128).
12. Al-Sai, Z. A., & Abualigah, L. M. (2017, May). Big data and e-government: A review. In 2017 8th International conference on information technology (ICIT) (pp. 580–587). IEEE.
13. Kumar, A., Kiran, M., Mukherjee, S., & Ravi Prakash, G. (2013). Verification and validation of MapReduce program model for parallel K-means algorithm on Hadoop cluster. International Journal of Computer Applications, 72(8). (0975-8887).


14. Qayyum, R. (2020). A roadmap towards big data opportunities, emerging issues and Hadoop as a solution. International Journal of Education and Management Engineering, 10, 8–17. https://doi.org/10.5815/ijeme.2020.04.02
15. Abualigah, L., Diabat, A., Mirjalili, S., Abd Elaziz, M., & Gandomi, A. H. (2021). The arithmetic optimization algorithm. Computer Methods in Applied Mechanics and Engineering, 376, 113609.
16. Abualigah, L., Yousri, D., Abd Elaziz, M., Ewees, A. A., Al-Qaness, M. A., & Gandomi, A. H. (2021). Aquila optimizer: A novel meta-heuristic optimization algorithm. Computers and Industrial Engineering, 157, 107250.
17. Abualigah, L., Abd Elaziz, M., Sumari, P., Geem, Z. W., & Gandomi, A. H. (2022). Reptile search algorithm (RSA): A nature-inspired meta-heuristic optimizer. Expert Systems with Applications, 191, 116158.
18. Agushaka, J. O., Ezugwu, A. E., & Abualigah, L. (2022). Dwarf mongoose optimization algorithm. Computer Methods in Applied Mechanics and Engineering, 391, 114570.
19. Oyelade, O. N., Ezugwu, A. E. S., Mohamed, T. I., & Abualigah, L. (2022). Ebola optimization search algorithm: A new nature-inspired metaheuristic optimization algorithm. IEEE Access, 10, 16150–16177.
20. Ezugwu, A. E., Agushaka, J. O., Abualigah, L., Mirjalili, S., & Gandomi, A. H. (2022). Prairie dog optimization algorithm. Neural Computing and Applications, 1–49.
21. Nandini, G. V. S., & Rao, N. K. K. (2019). Utility frequent patterns mining on large scale data based on Apriori MapReduce algorithm. International Journal of Research in Informative Science Application and Techniques (IJRISAT), 3(8), 19381–19387.
22. Yahya, A. A., & Osman, A. (2019). Using data mining techniques to guide academic programs design and assessment. Procedia Computer Science, 163, 472–481. ISSN 1877-0509.
23. Yang, X. Y., Liu, Z., & Fu, Y. (2010). MapReduce as a programming model for association rules algorithm on Hadoop. In The 3rd international conference on information sciences and interaction sciences (pp. 99–102). https://doi.org/10.1109/ICICIS.2010.5534718
24. Li, L., & Zhang, M. (2011). The strategy of mining association rule based on cloud computing. In 2011 International conference on business computing and global informatization (pp. 475–478). https://doi.org/10.1109/BCGIn.2011.125
25. Li, N., Zeng, L., He, Q., & Shi, Z. (2012). Parallel implementation of Apriori algorithm based on MapReduce. In 2012 13th ACIS international conference on software engineering, artificial intelligence, networking and parallel/distributed computing (pp. 236–241). https://doi.org/10.1109/SNPD.2012.31
26. Rong, Z., Xia, D., & Zhang, Z. (2013). Complex statistical analysis of big data: Implementation and application of Apriori and FP-Growth algorithm based on MapReduce. In 2013 IEEE 4th international conference on software engineering and service science (pp. 968–972). https://doi.org/10.1109/ICSESS.2013.6615467
27. Wei, X., Ma, Y., Zhang, F., Liu, M., & Shen, W. (2014). Incremental FP-Growth mining strategy for dynamic threshold value and database based on MapReduce. In Proceedings of the 2014 IEEE 18th international conference on computer supported cooperative work in design (CSCWD) (pp. 271–276). https://doi.org/10.1109/CSCWD.2014.6846854
28. Chaudhary, H., Yadav, D. K., Bhatnagar, R., & Chandrasekhar, U. (2015). MapReduce based frequent itemset mining algorithm on stream data. In 2015 Global conference on communication technologies (GCCT) (pp. 598–603). https://doi.org/10.1109/GCCT.2015.7342732
29. Feng, D., Zhu, L., & Zhang, L. (2016). Research on improved Apriori algorithm based on MapReduce and HBase. In 2016 IEEE advanced information management, communicates, electronic and automation control conference (IMCEC) (pp. 887–891). https://doi.org/10.1109/IMCEC.2016.7867338
30. Li, L., Shi, T., & Zhang, W. (2017). Axle fault prognostics of electric multiple units based on improved Apriori algorithm. In 2017 29th Chinese control and decision conference (CCDC) (pp. 4229–4233). https://doi.org/10.1109/CCDC.2017.7979241


31. Pandey, K. K., & Shukla, D. (2018). Mining on relationships in big data era using improve apriori algorithm with MapReduce approach. In 2018 International conference on advanced computation and telecommunication (ICACAT) (pp. 1–5). https://doi.org/10.1109/ICACAT.2018.8933674
32. Deshmukh, R. A., Bharathi, H. N., & Tripathy, A. K. (2019). Parallel processing of frequent itemset based on MapReduce programming model. In 2019 5th International conference on computing, communication, control and automation (ICCUBEA) (pp. 1–6). https://doi.org/10.1109/ICCUBEA47591.2019.9128369
33. Lei, B. (2020). Apriori-based spatial pattern mining algorithm for big data. In 2020 International conference on urban engineering and management science (ICUEMS) (pp. 310–313). https://doi.org/10.1109/ICUEMS50872.2020.00074

A Novel Big Data Classification Technique for Healthcare Application Using Support Vector Machine, Random Forest and J48

Hitham Al-Manaseer, Laith Abualigah, Anas Ratib Alsoud, Raed Abu Zitar, Absalom E. Ezugwu, and Heming Jia

Abstract This study examines how the capabilities of artificial intelligence (AI) and machine learning (ML) can increase the effectiveness of the Internet of Things (IoT) and big data in developing a system that supports decision makers in the medical field. It does so by studying the performance of three well-known classification algorithms, Random Forest Classifier (RFC), Support Vector Machine (SVM), and Decision Tree-J48 (J48), in predicting the probability of a heart attack. The accuracy of the algorithms was evaluated using the Healthcare (heart attack possibility) dataset, freely available on Kaggle. The data was divided into three categories of 303, 909, and 1818 instances, which were analyzed on the WEKA platform. The results showed that the RFC was the best performer.

Keywords Big data · Internet of Things · Random forest classifier · J48 · Support vector machine · Weka · E-Health

H. Al-Manaseer · L. Abualigah (corresponding author)
School of Computer Sciences, Universiti Sains Malaysia, 11800 George Town, Pulau Pinang, Malaysia
e-mail: [email protected]

L. Abualigah · A. R. Alsoud
Hourani Center for Applied Scientific Research, Al-Ahliyya Amman University, Amman 19328, Jordan

L. Abualigah
Faculty of Information Technology, Middle East University, Amman 11831, Jordan

R. A. Zitar
Sorbonne Center of Artificial Intelligence, Sorbonne University-Abu Dhabi, Abu Dhabi, United Arab Emirates

A. E. Ezugwu
School of Mathematics, Statistics, and Computer Science, University of KwaZulu-Natal, King Edward Road, Pietermaritzburg 3201, KwaZulu-Natal, South Africa

H. Jia
Department of Information Engineering, Sanming University, Fujian 365004, China

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023
L. Abualigah (ed.), Classification Applications with Deep Learning and Machine Learning Technologies, Studies in Computational Intelligence 1071, https://doi.org/10.1007/978-3-031-17576-3_9


1 Introduction

In the current era, communication over the Internet has become widespread between many things, such as computers, large web servers, and smart devices. This form of connectivity is known as the Internet of Things (IoT) [1]. The IoT is characterized by its massive structure and complexity; it represents the second generation of the Internet and may comprise trillions of interconnected points. The use of the IoT is expected to bring high economic benefit to various sectors because it enhances production and innovation [2–5]. It has brought about tremendous and unprecedented changes that helped reduce costs, improve efficiency, and increase revenues, and it has led to the generation of a huge volume of data. Figure 1 illustrates the IoT concept.

Fig. 1 IoT concept

The current technological revolution has resulted in the generation of large amounts of data [6–9], and the massive development of the IoT has added to it. This data is called “big data”: a wide range of data that needs new structures and technologies to manage, capture, and process it so that value can be extracted to enhance insight and decision making [10]. Big data is characterized by large size, high speed, high diversity, and high accuracy [11, 12]. Advances in healthcare data management systems have generated large amounts of medical data, and machine learning on this kind of data is classified as supervised learning. Analysis and classification methods from big data science and data mining can be used to enhance the effectiveness of the IoT and to meet the challenges it faces, such as storing, transporting, and processing large volumes of data.

One of the problems facing big data science is classification. If the dataset has many dimensions, processing becomes slower. However, care must be taken when choosing a method for extracting the desired features from the dataset's full feature set, because feature selection discards part of the dataset’s data [13, 14]. The main benefit of selecting specific features and ignoring unnecessary ones is to reduce data volumes and improve “classification/prediction accuracy” [15]. Classification is one of the most widely applied methods in data mining, as it uses a set of previously classified examples to build a new model that can be used in several applications, such as IoT E-Health systems. Data mining is defined as the mechanism of extracting data from a data set, discovering useful information in it, and then analyzing the collected data in order to enhance decision making. Data mining uses different algorithms and seeks to reveal specific features of data [16].

This study aims to apply data mining techniques in IoT E-Health systems, in particular to the Health care (heart attack possibility) dataset, and to assess the real feasibility of these techniques for IoT E-Health. There are various ways to use the principles of data mining to create smart E-Health systems for the IoT. As a case study, a technologically scalable study of a healthcare dataset was developed using free, open source software, namely WEKA (Waikato Environment for Knowledge Analysis). The study also aims to compare the accuracy of the Random Forest Classifier (RFC), Support Vector Machine (SVM), and Decision Tree-J48 (J48) algorithms in classifying and analyzing medical data. The main benefits of using healthcare data mining are:

• Predicting the patient’s likelihood of having a heart attack.
• Helping decision makers (i.e. health care workers) make decisions related to disease cases.
• Reducing the rate of medical errors, since the data mining techniques in this study predict the possibility of a heart attack in advance.

The rest of this paper is organized as follows. Section 2 reviews the literature. Section 3 describes the methodology. Section 4 presents the proposed method. Section 5 reports the experiments and results. Finally, Sect. 6 gives conclusions and future work.

2 Literature Review

A wide body of literature covers techniques that can be used as an integral part of big data and the IoT. This section surveys the main methods used in this field.

Lakshmanaprabu et al. [17] used the Random Forest Classifier (RFC) and the MapReduce process to develop a technique based on big data analysis in an IoT-based healthcare system. E-Health data was collected from patients with various diseases and considered in the analysis. To obtain the best classification, the optimal features were selected from the dataset using the Improved Dragonfly Algorithm (IDA). RFC was then used to classify the E-Health data with the selected features. The proposed technique outperforms other classification methods such as the Gaussian mixture model and logistic regression; its best training and testing results are 94.2% precision and 89.99% recall. Various performance measures were analyzed and compared with existing methods in order to verify the efficiency of the proposed method. Its limitation is that it is computationally slow on large datasets. Other optimization methods, such as those given in [18–23], can be used to solve such optimization problems.

Cervantes et al. [24] conducted a comprehensive survey of the SVM taxonomy, including a brief introduction to SVMs, a description of their many applications, and a summary of their challenges and trends. The survey examines and defines the limitations of SVMs, discusses the future of SVMs in conjunction with more applications, and describes in detail the major flaws of SVMs and the various algorithms implemented to address them, based on the work of researchers who encountered these flaws.

Jain et al. [25] linked Apache Hadoop to Weka. Big data is stored on the Hadoop distributed file system (HDFS) and processed with Weka using Weka’s Knowledge Flow. Knowledge Flow provides a good way to build topologies using HDFS components that supply data to the machine learning algorithms available in Weka [25]. The supervised machine learning methods used for big data mining were Naïve Bayes, SVM, and J48. The accuracy of these methods was compared on raw data and normalized data with the same structure. The authors proposed a new approach to big data mining that gives better results than the reference approach: the accuracy of classifying raw data sets increased, and when normalization was applied to the raw dataset, accuracy was found to improve after supervised estimation of the dataset.

Siou-Wei et al. [26] used the SVM to classify and process data into three classes: healthy, unhealthy, and very unhealthy. The physiological parameters of the test subject and the classification results were uploaded to cloud storage and rendered on a web page to provide the basis for big data analysis in future research. All biomedical units equipped with wireless sensor network chips can collect and process the measured data and then transmit it over the wireless network to the cloud server for storage and analysis.

Li et al. [27] presented a comprehensive survey of the use of big data science and data mining methods in the IoT, aiming to identify the topics that current and future research should focus on. They followed conference articles and published journals on IoT big data and IoT data mining between 2010 and 2017. Articles were screened using the literature review set and methodological maps of 44 articles. These articles fall into three classes: architecture and platform, framework, and application.

3 Methodology

This section describes the methodology used to analyze big data in IoT E-Health systems, using some modeling procedures. The analysis uses the Health care (heart attack possibility) dataset for training and testing.

IoT data is used to assess the performance of systems, infrastructure, and IoT objects. IoT objects hold data produced by interactions between people, between people and systems, and between systems. This data can be used to improve the services provided by the IoT. With big data science, all health centers have access to each patient’s information regardless of where testing is conducted, and test results are stored as soon as the test is made, so appropriate decisions can be made from the moment the patient is tested. Extracting specific data from big data, as well as extracting any data from smart data, are thorny problems that data mining techniques can solve. Different models can therefore be used to extract data. Figure 2 illustrates a model of big data in the IoT [17]. The dataset on IoT objects and infrastructure includes detailed healthcare information such as patient age and sex. Health data were classified using RFC, SVM, and J48.

Fig. 2 Big data model in the IoT [17]

A. Random Forest Classifier (RFC)

The RFC is an ensemble of classification trees. It performs well on many practical problems because it is not sensitive to noise in the data set and is less prone to overfitting. It combines the predictions of many trees, each trained independently of the rest [28]. The RFC draws a randomized sample of the data and identifies a key set of attributes in order to grow each decision tree.

B. Support Vector Machine (SVM)

The SVM is regarded as one of the best techniques for predicting expected solutions [29]. It was presented by Vapnik as a machine learning model that performs classification and regression. Owing to its generalization, optimization, and discriminatory power, the SVM has been used in data mining, machine learning, and pattern recognition in past years. SVMs are used extensively to solve binary classification problems and have been reported to outperform other supervised machine learning methods [30]. In recent years SVMs have become one of the most widely used classification methods due to their good theoretical foundations and generalization [24].
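The random forest idea in subsection A (many trees, each trained independently on a randomized sample and then combined) can be sketched with one-split trees and majority voting. This is a toy illustration, not the WEKA RandomForest used in the study: the function names, the use of decision stumps instead of full trees, and the six made-up training rows are all assumptions.

```python
import random
from collections import Counter

def majority(labels, default=None):
    """Most common label, or `default` for an empty list."""
    return Counter(labels).most_common(1)[0][0] if labels else default

def train_stump(rows, labels, feature):
    """One-split tree on one attribute: keep the threshold with fewest errors."""
    default = majority(labels)
    best = None
    for threshold in sorted({row[feature] for row in rows}):
        left = [y for row, y in zip(rows, labels) if row[feature] <= threshold]
        right = [y for row, y in zip(rows, labels) if row[feature] > threshold]
        left_label = majority(left, default)
        right_label = majority(right, default)
        errors = (sum(y != left_label for y in left)
                  + sum(y != right_label for y in right))
        if best is None or errors < best[0]:
            best = (errors, threshold, left_label, right_label)
    return feature, best[1], best[2], best[3]

def predict_stump(stump, row):
    feature, threshold, left_label, right_label = stump
    return left_label if row[feature] <= threshold else right_label

def train_forest(rows, labels, n_trees=25, seed=0):
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        # Bootstrap sample: draw len(rows) row indices with replacement.
        sample = [rng.randrange(len(rows)) for _ in rows]
        # Each tree looks at one randomly chosen attribute.
        feature = rng.randrange(len(rows[0]))
        forest.append(train_stump([rows[i] for i in sample],
                                  [labels[i] for i in sample], feature))
    return forest

def predict_forest(forest, row):
    # Combine the independent trees by majority vote.
    return majority([predict_stump(stump, row) for stump in forest])

# Made-up, clearly separable toy data (two attributes, binary class).
rows = [[1, 10], [2, 11], [3, 12], [8, 20], [9, 21], [10, 22]]
labels = [0, 0, 0, 1, 1, 1]
forest = train_forest(rows, labels)
```

A single stump trained on an unlucky bootstrap sample can be wrong, but the vote over many stumps is stable, which is the property the section attributes to the RFC.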

4 The Proposed Method

This section describes the approach chosen to develop data mining techniques that focus on analyzing data and discovering principles by which health information can be provided to the patient and heart disease can be predicted.

C. Case Study

Archived historical data was used. The data set consists of 76 attributes, but all published experiments use a subset of 14 attributes. In particular, machine learning (ML) researchers have so far used only the Cleveland dataset. The “target” field expresses the extent to which the patient has heart disease: the value 0 indicates no or a low chance of having a heart attack, and the value 1 indicates a higher chance. This data set is freely available on the Kaggle website. Table 1 shows the full list of attributes [31].

Table 1 Descriptions of the dataset attributes for heart attack [31]

No.  Attribute  Description
1    age        Age
2    sex        Sex
3    cp         Chest pain type (4 values)
4    trestbps   Resting blood pressure
5    chol       Serum cholesterol in mg/dl
6    fbs        Fasting blood sugar > 120 mg/dl
7    restecg    Resting electrocardiographic results (values 0, 1, 2)
8    thalach    Maximum heart rate achieved
9    exang      Exercise-induced angina
10   oldpeak    ST depression induced by exercise relative to rest
11   slope      The slope of the peak exercise ST segment
12   ca         Number of major vessels (0–3) colored by fluoroscopy
13   thal       0 = normal; 1 = fixed defect; 2 = reversible defect
14   target     Objective: 0 = low probability of having a heart attack, 1 = high probability of having a heart attack

Depending on the composition of the data set, a mechanism for preparing the data and extracting knowledge from it was hypothesized. After validation through the case study, the approach proved applicable and feasible in many analyses of patients’ E-health data [17]. The objective of the approach is to build an analytical model to produce a set of decisions for use as a decision support system for E-Health. Figure 3 shows a flowchart illustrating the proposed approach.

The WEKA data mining software was used to implement the proposed system. WEKA is free, open source software: a collection of ML methods for solving real-world data mining problems, developed in Java and able to run on almost any platform. It is an analytical tool that applies a data mining approach to any dataset. Although several supported, professional data mining software packages exist, WEKA offers many advantages: it is open source, downloadable, fast, easy to use and access, easy to implement, and free of charge [32, 33].

In this study, the data was stored as a comma-separated values (CSV) file. The target attribute was chosen as the class attribute of the experiment. The resulting set of rules is then used by decision makers in health centers as a decision support system that provides them with information for predicting the possibility of a heart attack.

Fig. 3 The correctly classified instances using cross validation
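As a rough illustration of the data preparation step, the sketch below reads a CSV with the 14 attributes of Table 1 and separates the `target` column as the class attribute. It uses only the Python standard library rather than WEKA, and the function name `load_dataset` and the two sample rows are assumptions: the rows are invented values, not records from the Kaggle file.

```python
import csv
import io

# Attribute names as listed in Table 1; "target" is the class attribute.
ATTRIBUTES = ["age", "sex", "cp", "trestbps", "chol", "fbs", "restecg",
              "thalach", "exang", "oldpeak", "slope", "ca", "thal", "target"]

def load_dataset(handle, class_attribute="target"):
    """Read CSV rows and split each one into (feature vector, class label)."""
    reader = csv.DictReader(handle)
    missing = set(ATTRIBUTES) - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing attributes: {sorted(missing)}")
    features, labels = [], []
    for row in reader:
        labels.append(int(row[class_attribute]))
        features.append([float(row[name]) for name in ATTRIBUTES
                         if name != class_attribute])
    return features, labels

# Two invented rows, for illustration only (not taken from the real data set).
SAMPLE = io.StringIO(
    ",".join(ATTRIBUTES) + "\n"
    "60,1,0,130,250,0,1,150,0,1.5,1,0,2,1\n"
    "45,0,2,120,210,0,0,170,0,0.0,2,0,2,0\n"
)
features, labels = load_dataset(SAMPLE)
```

To read a real file, the `io.StringIO` object would be replaced by an opened file handle; the header check makes a mismatched file fail early instead of silently producing wrong feature vectors.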

5 Experiments and Results

As a result of increasing computing power and the massive amount of data currently available, machine learning algorithms are becoming increasingly complex and more powerful [34]. In this study, three types of classification algorithms are tested: SVM, RFC, and J48.

Table 2 Correctly classified instances (%), cross validation

Algorithms  T303  T909  T1818
SVM         47.4  96    100
RF          84.2  98.7  100
J48         82.9  92.5  99.1

Determining the optimal size of the dataset is essential, as too many or too few cases can lead to imprecise models [32]. For this reason, the Health care (Heart attack possibility) dataset was divided into three categories: the first consisting of 303 instances, the second of 909 instances, and the third of 1818 instances. The SVM, RFC, and J48 algorithms were run and evaluated with tenfold cross-validation. Cross-validation can suffer from overfitting because the data used for testing comes from the same data set used in training, so the model often learns and retains patterns within this dataset [34]. Therefore, a second evaluation mechanism was used: an isolated test set consisting of 25% of the total dataset was created for each of the three categories and used to evaluate the algorithms.

Figure 3 shows the percentage of correctly classified instances under cross-validation for the three categories. The graph shows that the algorithms converged in classification accuracy when the dataset size exceeded 909 cases, whereas SVM performed poorly at 303 cases. Table 2 summarizes the results.

Figure 4 shows the percentage of correctly classified instances under the 25% percentage split for the three categories. The graph shows that the RFC outperformed the other algorithms and that the three converged in classification accuracy when the dataset size reached 1818 instances. Again, SVM performed poorly at 303 cases. Table 3 summarizes the results.
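The two evaluation schemes can be sketched as index-splitting helpers plus an accuracy measure. This is a plain, non-stratified illustration and not WEKA's own implementation; the function names and the fixed random seed are assumptions made for the example.

```python
import random

def accuracy(predicted, actual):
    """Percentage of correctly classified instances."""
    correct = sum(p == a for p, a in zip(predicted, actual))
    return 100.0 * correct / len(actual)

def k_fold_indices(n, k=10, seed=0):
    """Yield (train, test) index lists; every instance is tested exactly once."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test

def holdout_indices(n, test_fraction=0.25, seed=0):
    """Return (train, test) index lists with an isolated test set."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    n_test = round(n * test_fraction)
    return indices[n_test:], indices[:n_test]

# Split sizes for the smallest category (303 instances).
folds = list(k_fold_indices(303, k=10))
train, test = holdout_indices(303)
```

In tenfold cross-validation the reported figure is the accuracy over all ten test folds, while the percentage split reports accuracy on the single held-out quarter; a classifier would be trained on each `train` list and scored with `accuracy` on the matching `test` list.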

Fig. 4 Correctly classified instances percentage split (25%)

Table 3 Correctly classified instances (%), percentage split (25%)

Algorithms  T303  T909  T1818
SVM         47.4  96    100
RF          84.2  98.7  100
J48         82.9  92.5  99.1

6 Conclusion

The IoT works hand in hand with big data when huge volumes of information must be processed and analyzed. In this study, E-health data were analyzed using classification algorithms, in particular on the Health care (Heart attack possibility) dataset. The optimal features of the medical database were identified, which helps in building an effective model for predicting heart disease. The results showed the superiority of RFC over the other algorithms.

References

1. Firouzi, F., Farahani, B., Weinberger, M., DePace, G., & Aliee, F. S. (2020). IoT fundamentals: Definitions, architectures, challenges, and promises. In Intelligent Internet of Things (pp. 3–50). Springer.
2. Gharaibeh, M., Alzu’bi, D., Abdullah, M., Hmeidi, I., Al Nasar, M. R., Abualigah, L., & Gandomi, A. H. (2022). Radiology imaging scans for early diagnosis of kidney tumors: A review of data analytics-based machine learning and deep learning approaches. Big Data and Cognitive Computing, 6(1), 29.
3. Gandomi, A. H., Chen, F., & Abualigah, L. (2022). Machine learning technologies for big data analytics. Electronics, 11(3), 421.
4. Bashabsheh, M. Q., Abualigah, L., & Alshinwan, M. (2022). Big data analysis using hybrid meta-heuristic optimization algorithm and MapReduce framework. In Integrating meta-heuristics and machine learning for real-world optimization problems (pp. 181–223). Springer.
5. Gharaibeh, M., Almahmoud, M., Ali, M. Z., Al-Badarneh, A., El-Heis, M., Abualigah, L., Altalhi, M., Alaiad, A., & Gandomi, A. H. (2021). Early diagnosis of Alzheimer’s disease using cerebral catheter angiogram neuroimaging: A novel model based on deep learning approaches. Big Data and Cognitive Computing, 6(1), 2.
6. Abualigah, L., Diabat, A., & Elaziz, M. A. (2021). Intelligent workflow scheduling for big data applications in IoT cloud computing environments. Cluster Computing, 24(4), 2957–2976.
7. Abualigah, L., Gandomi, A. H., Elaziz, M. A., Hamad, H. A., Omari, M., Alshinwan, M., & Khasawneh, A. M. (2021). Advances in meta-heuristic optimization algorithms in big data text clustering. Electronics, 10(2), 101.
8. Abualigah, L., & Masri, B. A. (2021). Advances in MapReduce big data processing: Platform, tools, and algorithms. In Artificial intelligence and IoT (pp. 105–128).
9. Al-Sai, Z. A., & Abualigah, L. M. (2017, May). Big data and e-government: A review. In 2017 8th international conference on information technology (ICIT) (pp. 580–587). IEEE.
10. Katal, A., Wazid, M., & Goudar, R. H. (2013). Big data: Issues, challenges, tools and good practices. In 2013 Sixth international conference on contemporary computing (IC3) (pp. 404–409). IEEE.


11. Chebbi, I., Boulila, W., & Farah, I. R. (2015) Big data: Concepts, challenges and applications. In Computational collective intelligence (pp. 638–647). Springer. 12. Alam, F., Mehmood, R., Katib, I., Albogami, N. N., & Albeshri, A. (2017). Data fusion and IoT for smart ubiquitous environments: A survey. IEEE Access, 5, 9533–9554. 13. Revathi, L., & Appandiraj, A. (2015). Hadoop based parallel framework for feature subset selection in big data. International Journal of Innovative Research in Science, Engineering and Technology, 4(5), 3530–3534. 14. Shankar, K. (2017). Prediction of most risk factors in hepatitis disease using apriori algorithm. Research Journal of Pharmaceutical Biological and Chemical Sciences, 8(5), 477–484. 15. Manogaran, G., Lopez, D., & Chilamkurti, N. (2018). In-mapper combiner based MapReduce algorithm for processing of big climate data. Future Generation Computer Systems, 86, 433– 445. 16. Injadat, M., Moubayed, A., Nassif, A. B., & Shami, A. (2020). Multi-split optimized bagging ensemble model selection for multi-class educational data mining. Applied Intelligence, 50(12), 4506–4528. 17. Lakshmanaprabu, S. K., et al. (2019). Random forest for big data classification in the internet of things using optimal features. International Journal of Machine Learning and Cybernetics, 10(10), 2609–2618. 18. Abualigah, L., Diabat, A., Mirjalili, S., Abd Elaziz, M., & Gandomi, A. H. (2021). The arithmetic optimization algorithm. Computer Methods in Applied Mechanics and Engineering, 376, 113609. 19. Abualigah, L., Yousri, D., Abd Elaziz, M., Ewees, A. A., Al-Qaness, M. A., & Gandomi, A. H. (2021). Aquila optimizer: A novel meta-heuristic optimization algorithm. Computers and Industrial Engineering, 157, 107250. 20. Abualigah, L., Abd Elaziz, M., Sumari, P., Geem, Z. W., & Gandomi, A. H. (2022). Reptile search algorithm (RSA): A nature-inspired meta-heuristic optimizer. Expert Systems with Applications, 191, 116158. 21. Agushaka, J. O., Ezugwu, A. 
E., & Abualigah, L. (2022). Dwarf mongoose optimization algorithm. Computer Methods in Applied Mechanics and Engineering, 391, 114570. 22. Oyelade, O. N., Ezugwu, A. E. S., Mohamed, T. I., & Abualigah, L. (2022). Ebola optimization search algorithm: A new nature-inspired metaheuristic optimization algorithm. IEEE Access, 10, 16150–16177. 23. Ezugwu, A. E., Agushaka, J. O., Abualigah, L., Mirjalili, S., & Gandomi, A. H. (2022). Prairie dog optimization algorithm. Neural Computing and Applications, 1–49. 24. Cervantes, J., Garcia-Lamont, F., Rodríguez-Mazahua, L., & Lopez, A. (2020). A comprehensive survey on support vector machine classification: Applications, challenges and trends. Neurocomputing, 408, 189–215. 25. Jain, A., Sharma, V., & Sharma, V. (2017). Big data mining using supervised machine learning approaches for Hadoop with Weka distribution. International Journal of Computational Intelligence Research, 13(8), 2095–2111. 26. Su, M. Y., Wei, H. S., Chen, X. Y., Lin, P. W., & Qiu, D. Y. (2018). Using ad-related network behavior to distinguish ad libraries. Applied Sciences, 8(10), 1852. 27. Li, W., Chai, Y., Khan, F., Jan, S. R. U., Verma, S., Menon, V. G., & Li, X. (2021). A comprehensive survey on machine learning-based big data analytics for IoT-enabled smart healthcare system. Mobile Networks and Applications, 26(1), 234–252. 28. Chin, J., Callaghan, V., & Lam, I. (2017). Understanding and personalising smart city services using machine learning, the internet-of-things and big data. In 2017 IEEE 26th International Symposium on Industrial Electronics (ISIE) (pp. 2050–2055). IEEE. 29. Vapnik, V. (2013). The nature of statistical learning theory. Springer Science & Business Media. 30. Liang, X., Zhu, L., & Huang, D. (2017). S Multi-task ranking SVM for image cosegmentation. Neurocomputing, 247, 126–136. 31. Naresh, B. (2021) Health care: Heart attack possibility [Online]. Kaggle, July 4, 2021. 
https://www.kaggle.com/nareshbhat/health-care-data-set-on-heart-attack-possibility


32. Oliff, H., & Liu, Y. (2017). Towards industry 4.0 utilizing data-mining techniques: A case study on quality improvement. Procedia CIRP, 63, 167–172. 33. WEKA. (2021). The workbench for machine learning [Online]. WEKA. https://www.cs.waikato.ac.nz/ml/weka/index.html. Last accessed June 4, 2021. 34. Géron, A. (2019). Hands-on machine learning with Scikit-Learn, Keras, and TensorFlow: Concepts, tools, and techniques to build intelligent systems. O’Reilly Media.

Comparative Study on Arabic Text Classification: Challenges and Opportunities

Mohammed K. Bani Melhem, Laith Abualigah, Raed Abu Zitar, Abdelazim G. Hussien, and Diego Oliva

Abstract Web technology has improved greatly over the past years, loading the Internet with digital content from many different fields. This has made it difficult for researchers to find text classification algorithms that fit a specific language or set of languages. Text classification, or categorization, is the practice of allocating a given text document to one or more predefined labels or categories; it aims to obtain valuable information from unstructured text documents. This paper presents a comparative study of selected published papers that focus on improving Arabic text classification. It highlights the proposed models and the classifiers used, discusses the challenges faced in this type of research, and then proposes expected research opportunities in the field of text classification. Based on the reviewed studies, SVM and Naive Bayes were the most widely used classifiers for Arabic text classification, while more effort is needed to develop and implement flexible Arabic text classification methods and classifiers.

M. K. B. Melhem · L. Abualigah (B)
School of Computer Sciences, Universiti Sains Malaysia, 11800 George Town, Pulau Pinang, Malaysia
e-mail: [email protected]

L. Abualigah
Hourani Center for Applied Scientific Research, Al-Ahliyya Amman University, Amman, Jordan
Faculty of Information Technology, Middle East University, Amman 11831, Jordan

R. A. Zitar
Sorbonne Center of Artificial Intelligence, Sorbonne University-Abu Dhabi, Abu Dhabi, United Arab Emirates

A. G. Hussien
Department of Computer and Information Science, Linköping University, Linköping, Sweden
Faculty of Science, Fayoum University, Faiyum, Egypt

D. Oliva
IN3—Computer Science Department, Universitat Oberta de Catalunya, Castelldefels, Spain
Depto. de Ciencias Computacionales, Universidad de Guadalajara, CUCEI, Guadalajara, Jalisco, Mexico

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 L. Abualigah (ed.), Classification Applications with Deep Learning and Machine Learning Technologies, Studies in Computational Intelligence 1071, https://doi.org/10.1007/978-3-031-17576-3_10


Keywords Arabic text classification · Deep learning · CHI square · Single-label text categorization · Multi-label text categorization · Naïve Bayes · Arabic natural language processing · Feature selection

1 Introduction

Nowadays, enormous amounts of information and much hidden knowledge are available on the Internet behind globally distributed digital content. This knowledge can be extracted if suitable and innovative tools are applied to that content [1, 2]. Several sciences and methods help in automatically extracting information from digital content. Text classification is one of these methods, and it contributes significantly to the speed and accuracy of obtaining information and knowledge. Previously, professionals and domain experts classified documents manually [3, 4]. However, with the tremendous and increasing growth in the quantity and quality of Arabic digital content, manual classification became ineffective and unfeasible. This motivated researchers to develop and improve automatic methods for text classification, which in turn created many challenges [5–12]. Researchers presented and implemented many solutions, but most were restricted to classical machine learning classifiers with small datasets that were neither freely available nor, in most cases, sufficient. To overcome this challenge, many researchers turned to deep learning techniques, improved the existing algorithms, and provided more free datasets, which brought clear improvements to text classification. The purpose of this study is to explore the publications available on Arabic text classification, summarize their results, and list them as research challenges and opportunities that can help researchers interested in this type of research.

Therefore, the main objective of this paper is to select some of the latest studies on Arabic text classifiers and explore them to highlight the most prominent improvements they brought and the research areas that emerged from these improvements, to help users, researchers and the community take advantage of the information in Arabic digital content. To achieve this goal, five Arabic text classification papers published in 2020 were selected, each concerned with using a different technique to improve the text classification process.

2 Literature Review

Alshaer et al. [13] studied the impact of improved CHI square (ImpCHI) as a feature selection method on six text classifiers (Random Forest, Naïve Bayes Multinomial, Decision Tree, Bayes Net, Artificial Neural Networks and Naïve Bayes), evaluating the resulting models according to precision, recall, F-measure and model build time. They also described the importance of data pre-processing steps in the text classification process for deriving supporting results and improving efficiency. Chantar et al. [14] studied the impact of an enhanced binary Grey Wolf Optimizer (GWO) within a wrapper feature selection (FS) method on the Arabic text classification problem. Using the news datasets Akhbar-Alkhaleej, Al-jazeera and Alwatan, the authors compared the performance of the proposed method with SVM, Decision Tree, NB and KNN classifiers. Bahassine et al. [15] proposed an improved method that employs Chi-square feature selection (referred to hereafter as ImpCHI) to enhance Arabic text classification performance, and compared it with three metrics (mutual information, information gain and Chi-square). Marie-Sainte et al. [16] studied a newly proposed feature selection method based on the firefly algorithm and applied it to different combinatorial problems. The technique was validated using the Support Vector Machine classifier and three evaluation measures (precision, recall and F-measure). Elnagar et al. [17] introduced new, rich, unbiased and freely available datasets for both Arabic text categorization tasks: single-label (SANAD) and multi-label (NADiA). They also presented a comprehensive comparison of various deep learning models for Arabic text classification, evaluating the effectiveness of these models on the SANAD and NADiA datasets. Other optimization methods that can be applied to such problems are given in [18–23].

3 Background

1. Text Classification
Text classification, or categorization, is the practice of allocating a given text document to one or more predefined labels or categories [24, 25]. It aims to obtain valuable information from unstructured text documents for use in many applications, such as detecting and categorizing spam emails, news monitoring, and automated indexing of scientific articles [26, 27]. Generally, there are two types of classification: single-label classification (assigning each document to one specific class or category) and multi-label classification (assigning each document or instance to several classes or categories) [1, 2]. In most cases, the text document is represented as a word-frequency vector [3, 4].

2. Deep Learning Models
A deep neural network (DNN) is a neural network with a deep, rich set of hidden layers. The three main parts of the network are the input layer, the hidden layers, and the output layer. As the names suggest, the input and output layers handle the network's input and output. A hidden layer is an additional layer that adds more computation to the network when the task is too complicated for a small network. The number of hidden layers can reach a hundred or more. DNNs can achieve excellent accuracy and are considered revolutionary. There are many types of DNN, such as convolutional neural networks (CNN) and recurrent neural networks (RNN); the models differ in how their layers are connected [28].

3. Feature Selection
Feature selection is one of the most important elements that can improve the performance of the classification process. It eliminates redundant and irrelevant data and selects important data to reduce the complexity of the classification process [15].

4. CHI Square
CHI square is a statistical approach that tests the independence of two variables. In data mining, it is used as a feature selection method, and it is applied in the pre-processing step of the text classification system [13].

5. Improved CHI Square
The improved CHI method (ImpCHI) is an enhancement of the classical CHI method. ImpCHI has been used with Chinese text, and research results showed that it is also effective for selecting features from Arabic text data. Additionally, ImpCHI has been used with Arabic text and decision trees. The reported results showed that, in terms of recall, ImpCHI performs better than conventional CHI.

6. Grey Wolf Optimizer (GWO)
This algorithm was proposed in [29]. It is one of the most recent swarm intelligence (SI) algorithms and has attracted the attention of many researchers in different fields of optimization.

7. Firefly Algorithm
The Firefly Algorithm (FA) is a well-known and efficient bio-inspired algorithm [30]. It has been applied successfully as a feature selection (FS) method for Arabic speaker recognition systems, but it had not been implemented for Arabic text classification [31].
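To make the word-frequency representation and the classical CHI square score more concrete, the following minimal Python sketch computes the statistic for one term and one category from a 2×2 table of document counts. It is an illustration only: the function names (`term_frequencies`, `chi_square`, `rank_terms`) and the toy corpus are hypothetical, and the code implements the classical CHI square formulation, not the ImpCHI variant studied in [13, 15].

```python
from collections import Counter


def term_frequencies(tokens):
    """Represent one document as a word-frequency vector (term -> count)."""
    return Counter(tokens)


def chi_square(docs, labels, term, category):
    """Classical chi-square score between a term and a category.

    docs   : list of token lists (hypothetical toy input)
    labels : list of category names, one per document
    """
    a = b = c = d = 0
    for tokens, label in zip(docs, labels):
        has_term = term in tokens
        in_category = label == category
        if has_term and in_category:
            a += 1  # documents of the category that contain the term
        elif has_term:
            b += 1  # documents of other categories that contain the term
        elif in_category:
            c += 1  # documents of the category without the term
        else:
            d += 1  # documents of other categories without the term
    n = a + b + c + d
    denominator = (a + c) * (b + d) * (a + b) * (c + d)
    if denominator == 0:
        return 0.0  # term or category never (or always) occurs
    return n * (a * d - c * b) ** 2 / denominator


def rank_terms(docs, labels, category):
    """Return the vocabulary sorted by decreasing chi-square score."""
    vocabulary = sorted({token for tokens in docs for token in tokens})
    scored = [(chi_square(docs, labels, t, category), t) for t in vocabulary]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [term for _, term in scored]


if __name__ == "__main__":
    # Hypothetical toy corpus: two "sport" and two "economy" documents.
    docs = [["goal", "match", "team"], ["team", "player", "goal"],
            ["market", "price", "bank"], ["bank", "loan", "market"]]
    labels = ["sport", "sport", "economy", "economy"]
    print(term_frequencies(docs[0]))
    print(rank_terms(docs, labels, "sport"))
```

Note that the score measures dependence in either direction, so a term that never appears in a category can rank as high as one that always appears in it. A practical feature selector would keep only the top-ranked terms for each category.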

4 Literature Review Results and Discussion

Alshaer et al. [13] used different types of Arabic text classifiers (Bayes Net (BN), Naïve Bayes (NB), Naïve Bayes Multinomial (NBM), Random Forest (RF), Decision Tree (DT) and Artificial Neural Networks (ANNs)) with the improved CHI (ImpCHI) square algorithm and compared them with each other according to average precision, average recall, average F-measure, and average time. Six tests were conducted for each classifier: without pre-processing, with pre-processing, without pre-processing and with CHI, with pre-processing and CHI, without pre-processing and with ImpCHI, and with pre-processing and ImpCHI. The results of this study show that using ImpCHI square as the feature selection method gave better results in precision, recall and F-measure, but worse results in model build time. Moreover, the results were superior to those of classical CHI square without pre-processing in average precision, average recall, average F-measure and average time. Overall, the Naïve Bayes classifier obtained the best results for average precision, average recall and average F-measure, which makes it the best of the compared algorithms. The dataset was collected from different Arabic resources and contains 9055 Arabic documents.

In another study, Bahassine et al. [15] used a feature selection method based on improved Chi-square with an SVM classifier to enhance the Arabic text classification process. Using the common evaluation criteria of precision, recall and F-measure, they compared the results with those of previous feature selection methods: Mutual Information (MI), Chi-square, Information Gain (IG) and Term Frequency-Inverse Document Frequency (TFIDF). The results showed that ImpCHI performs better than the other feature selection methods for most feature-set sizes, the exception being when the number of features equals 20. Across the different feature-set sizes, precision, recall and F-measure were better with the SVM classifier than with DT for all feature selection methods. However, the study notes that the decision tree produces results that are easily interpretable by non-experts, which helps to identify the important and pertinent terms for every class, whereas the results of SVM are difficult to interpret.

Chantar et al. [14] proposed an enhanced binary grey wolf optimizer (BGWO) within a wrapper FS approach. To study and evaluate the efficacy of different BGWO-based wrapper methods, they used different learning models, with decision tree, K-nearest neighbour, Naive Bayes, and SVM classifiers, and three public Arabic datasets: Alwatan, Akhbar-Alkhaleej, and Al-jazeera-News. Two methods, BGWO1 and BGWO2, were proposed to convert the continuous GWO (CGWO) to a binary version (BGWO). The common evaluation criteria of precision, recall and F-measure were also used. The results show that the SVM-based feature selection technique, the proposed binary GWO optimizer and the elite-based crossover scheme deliver strong performance in Arabic document classification.

Marie-Sainte et al. [16] took a different approach to enhancing Arabic text classification, using firefly-algorithm-based feature selection. A Support Vector Machine classifier and three evaluation measures (precision, recall and F-measure) were used to validate this method. The dataset used in this study, named OSAC, was collected from the BBC and CNN Arabic websites and contains 5843 text documents. It is divided into two subsets to construct the training and test data of the classification system. The pre-processing stage was skipped in this study because the dataset had already been pre-processed. The results showed that the proposed feature selection method is very efficient in improving Arabic text classification accuracy; it achieved a precision value of 0.994, which is strong evidence of its efficiency.

In an extensive study of the impact of deep learning models on Arabic text classification, Elnagar et al. [28] introduced rich, unbiased datasets, freely available to the research community, for both Arabic text classification tasks: SANAD for single-label and NADiA for multi-label classification. The final size of NADiA is approximately 485,000 articles, covering a subset of 30 categories. In this research, nine deep learning models (BIGRU, BILSTM, CGRU, CLSTM, CNN, GRU, HANGRU, HANLSTM and LSTM) were developed for Arabic text classification tasks with no pre-processing requirements. The study shows that all models perform well on the SANAD corpus: the lowest accuracy, achieved by the convolutional GRU, is 91.18%, and the highest, achieved by the attention-GRU, is 96.94%. Regarding NADiA, attention-GRU achieved the highest overall accuracy, 88.68%, on the largest subset of 10 categories in the "Masrawy" dataset.
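Since every reviewed study reports precision, recall and F-measure, a short sketch of how these measures are commonly computed may help when comparing the results above. The Python code below is a minimal illustration; the function names and toy labels are hypothetical, and macro-averaging over classes is an assumption, because the reviewed papers do not state here which averaging scheme their "average" figures use.

```python
def precision_recall_f1(y_true, y_pred, positive):
    """Precision, recall and F-measure for one class treated as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if p == positive and t == positive)
    fp = sum(1 for t, p in pairs if p == positive and t != positive)
    fn = sum(1 for t, p in pairs if p != positive and t == positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def macro_average(y_true, y_pred):
    """Unweighted mean of the per-class scores (an assumed averaging scheme)."""
    classes = sorted(set(y_true))
    scores = [precision_recall_f1(y_true, y_pred, c) for c in classes]
    return tuple(sum(s[i] for s in scores) / len(classes) for i in range(3))


if __name__ == "__main__":
    # Hypothetical gold labels and classifier predictions.
    y_true = ["sport", "sport", "economy", "economy"]
    y_pred = ["sport", "economy", "economy", "economy"]
    print(precision_recall_f1(y_true, y_pred, "sport"))
    print(macro_average(y_true, y_pred))
```

Macro-averaging gives every category the same weight regardless of its size, which matters for news corpora whose categories are unbalanced.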

5 Results and Discussion

Five publications were reviewed in this study: one implemented the firefly algorithm, one implemented the binary grey wolf optimizer, two used improved CHI square, and one implemented deep learning models. All of the selected publications were published in 2020. Two publications introduced new datasets, one of which is extensive and large. All of the publications used ready-made datasets, some of which had already been pre-processed. Overall, each of the reviewed publications reported an improvement in the Arabic text classification process using its proposed method.

List of challenges and research opportunities identified by this study:
• The scarcity of freely available Arabic datasets is still an important challenge for researchers.
• A verified good classifier for document classification, such as Naive Bayes or SVM, can be used with the methods proposed in other studies.
• The proposed methods can be used with other classifiers, even if a classifier gives worse results with a specific method.
• ImpCHI, Firefly and GWO are effective methods that offer good research opportunities.
• Deep learning is an important technique that may be adapted to any method or algorithm, with superior results.


6 Conclusions and Future Work

In recent years, the classification of Arabic texts has been regarded as one of the most important topics in the field of knowledge discovery. Large amounts of data are submitted online every day, from social media posts and comments to product reviews. With Arabic text classification tools, these data sources can be used to obtain useful information. Our research explored and analyzed five recent articles that applied different techniques to explore and improve the classification of Arabic texts. Our findings are summarized as follows:
• Arabic is still considered low-resource in terms of the datasets available to researchers.
• Using verified classifiers with different algorithms may enhance Arabic text classification.
• There are many research opportunities in the hot topics of deep learning.

In future work, we will expand the selection to all publications published in 2020 and identify the most effective classifiers and methods that may accept enhancement, as well as the worst-performing classifiers and methods used in Arabic text classification.

References 1. Jackson, P., & Moulinier, I. (2007). Natural language processing for online applications: text retrieval, extraction and categorization (vol. 5). John Benjamins Publishing. 2. Sanasam, R., Murthy, H., & Gonsalves, T. (2010). Feature selection for text classification based on Gini coefficient of inequality. FSDM, 10, 76–85. 3. Feldman, R. (2007). The text mining handbook: Advanced approaches in analyzing unstructured data. Cambridge University Press. 4. Salton, G., & Buckley, C. (1988). Term-weighting approaches in automatic text retrieval. 5. Gharaibeh, M., Alzu’bi, D., Abdullah, M., Hmeidi, I., Al Nasar, M. R., Abualigah, L., & Gandomi, A. H. (2022). Radiology imaging scans for early diagnosis of kidney tumors: a review of data analytics-based machine learning and deep learning approaches. Big Data and Cognitive Computing, 6(1), 29. 6. Gandomi, A. H., Chen, F., & Abualigah, L. (2022). Machine learning technologies for big data analytics. Electronics, 11(3), 421. 7. Bashabsheh, M. Q., Abualigah, L., & Alshinwan, M. (2022). Big data analysis using hybrid meta-heuristic optimization algorithm and MapReduce framework. In Integrating meta-heuristics and machine learning for real-world optimization problems (pp. 181–223). Springer. 8. Gharaibeh, M., Almahmoud, M., Ali, M. Z., Al-Badarneh, A., El-Heis, M., Abualigah, L., Altalhi, M., Alaiad, A., & Gandomi, A. H. (2021). Early diagnosis of alzheimer’s disease using cerebral catheter angiogram neuroimaging: A novel model based on deep learning approaches. Big Data and Cognitive Computing, 6(1), 2. 9. Abualigah, L., Diabat, A., & Elaziz, M. A. (2021). Intelligent workflow scheduling for big data applications in IoT cloud computing environments. Cluster Computing, 24(4), 2957–2976.


10. Abualigah, L., Gandomi, A. H., Elaziz, M. A., Hamad, H. A., Omari, M., Alshinwan, M., & Khasawneh, A. M. (2021). Advances in meta-heuristic optimization algorithms in big data text clustering. Electronics, 10(2), 101. 11. Abualigah, L., & Masri, B. A. (2021). Advances in MapReduce big data processing: platform, tools, and algorithms. In Artificial intelligence and IoT (pp. 105–128). 12. Al-Sai, Z. A., & Abualigah, L. M. (2017, May). Big data and e-government: A review. In 2017 8th international conference on information technology (ICIT) (pp. 580–587). IEEE. 13. Alshaer, H., Otair, M., Abualigah, L., Alshinwan, M., & Khasawneh, A. (2020). Feature selection method using improved CHI Square on Arabic text classifiers. 14. Chantar, H., Mafarja, M., Alsawalqah, H., Heidari, A. A., Aljarah, I., & Faris, H. (2020). Feature selection using binary grey wolf optimizer with elite-based crossover for Arabic text classification. 15. Bahassine, S., Madani, A., Al-Sarem, M., & Kissi, M. (2020). Feature selection using an improved Chi-square for Arabic text. 16. Marie-Sainte, S. L., & Alalyani, N. (2020). Firefly algorithm based feature selection for Arabic text classification. 17. Elnagar, A., Al-Debsi, R., & Einea, O. (2020). Arabic text classification using deep learning models. 18. Abualigah, L., Diabat, A., Mirjalili, S., Abd Elaziz, M., & Gandomi, A. H. (2021). The arithmetic optimization algorithm. Computer Methods in Applied Mechanics and Engineering, 376, 113609. 19. Abualigah, L., Yousri, D., Abd Elaziz, M., Ewees, A. A., Al-Qaness, M. A., & Gandomi, A. H. (2021). Aquila optimizer: A novel meta-heuristic optimization algorithm. Computers and Industrial Engineering, 157, 107250. 20. Abualigah, L., Abd Elaziz, M., Sumari, P., Geem, Z. W., & Gandomi, A. H. (2022). Reptile search algorithm (RSA): A nature-inspired meta-heuristic optimizer. Expert Systems with Applications, 191, 116158. 21. Agushaka, J. O., Ezugwu, A. E., & Abualigah, L. (2022). 
Dwarf mongoose optimization algorithm. Computer Methods in Applied Mechanics and Engineering, 391, 114570. 22. Oyelade, O. N., Ezugwu, A. E. S., Mohamed, T. I., & Abualigah, L. (2022). Ebola optimization search algorithm: A new nature-inspired metaheuristic optimization algorithm. IEEE Access, 10, 16150–16177. 23. Ezugwu, A. E., Agushaka, J. O., Abualigah, L., Mirjalili, S., & Gandomi, A. H. (2022). Prairie dog optimization algorithm. Neural Computing and Applications, 1–49. 24. Khreisat, L. (2009). A machine learning approach for Arabic text classification using N-gram frequency statistics. Journal of Informetrics, 72–77. 25. Sebastiani, F. (2005). Text categorization. In J. H. Doorn, L. C. Rivero, & V. E. Ferraggine (Eds.), Encyclopedia of database technologies and applications (pp. 683–687). IGI Global. 26. Dharmadhikari, S., Ingle, M., & Kulkarni, P. (2011). Empirical studies on machine learning based text classification algorithms. Advanced Computing: An International Journal, 161–169. 27. El Kourdi, M., Bensaid, A., & Rachidi, T. (2004). Automatic Arabic document categorization based on the Naïve Bayes algorithm. In Proceedings of the workshop on computational approaches to Arabic script-based languages (pp. 51–58). 28. Elnagar, A., Al-Debsi, R., & Einea, O. (2020). Arabic text classification using deep learning models. Information Processing and Management. 29. Mirjalili, S., Mirjalili, S. M., & Lewisa, A. (2014). Grey Wolf optimizer. Advances in Engineering Software. 30. Sayadi, M. K., Ramezanian, R., & Ghaffarinasab, N. (2010). A discrete firefly meta-heuristic with local search for makespan minimization in permutation flow shop scheduling problems. International Journal of Industrial Engineering Computations. 31. Harrag, A., & Nassir, H. (2014). Firefly feature subset selection application to Arabic speaker recognition system. International Journal of Engineering Intelligent Systems for Electrical Engineering and Communications.

Pedestrian Speed Prediction Using Feed Forward Neural Network

Abubakar Dayyabu, Hashim Mohammed Alhassan, and Laith Abualigah

Abstract Pedestrian speed behavior is governed by pedestrian characteristics such as gender, age and group size, and by facility type, as many researchers in pedestrian dynamics have investigated. However, little attention has been given to the effect of pedestrians' clothing on their speed behavior. This research investigates the effect of dressing type on pedestrian speed behavior on an overhead pedestrian crossing bridge, using a non-linear feed-forward neural network to model the speed behavior. The research uses video for data collection, manual extraction of data from the video, Excel and Minitab for statistical analysis, and an artificial neural network (ANN) for model building, training, validation, and prediction. The statistical analyses indicate that pedestrian speed in the ascending direction is higher than in the descending direction, with values of 67.72 m/min and 52.19 m/min, respectively. The speed distribution also indicates that male pedestrians wearing English/short African clothes and cover shoes have a higher mean speed, 84.21 m/min and 60.10 m/min in the ascending and descending directions, respectively. The artificial neural network was satisfactory in building, training and validation, as indicated by the R and RMSE values presented in Table 3a–d.

Keywords Pedestrian microscopic modeling · ANN · R and RMSE

A. Dayyabu · H. M. Alhassan
Department of Civil Engineering, Bayero University Kano, Gwarzo Road New Campus, Kano 700241, Nigeria
e-mail: [email protected]

A. Dayyabu
Department of Civil Engineering, Nile University of Nigeria, Abuja, Nigeria

L. Abualigah (B)
Hourani Center for Applied Scientific Research, Al-Ahliyya Amman University, Amman 19328, Jordan
e-mail: [email protected]
Faculty of Information Technology, Middle East University, Amman 11831, Jordan

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 L. Abualigah (ed.), Classification Applications with Deep Learning and Machine Learning Technologies, Studies in Computational Intelligence 1071, https://doi.org/10.1007/978-3-031-17576-3_11


1 Introduction

Walking is the oldest, most natural, and most used mode of transportation, used by man in search of materials for shelter, water to drink, and food to eat for his survival. Pedestrian facilities can therefore be traced as far back as the origin of man. After shelter, the first man created footpaths to source water and food, and the footpath remained the only means of transportation until animals were domesticated [1]. Many people walk for recreation or exercise; some walk for the health benefits, some for its simplicity, and some because it is cheap or because they have no personal vehicle [2, 3]. Despite these advantages, its wide usage, and its historical origin, little attention is given to walking facilities regarding design standards, regulation, and safety. This results in more pedestrian-involved accidents. According to the World Health Organization (WHO, 2010, 2013, [4]), 22% of those killed in road traffic accidents worldwide are pedestrians. The African region accounts for the highest share, thirty-eight percent (38%), even though it has the fewest motorized vehicles among the six world regions. Nigeria and South Africa have the highest fatality rates in the region (33.7 and 31.9 deaths per 100,000 population per year, respectively). A study conducted in Ghana found that 68% of the pedestrians killed were knocked down by a vehicle while in the middle of the roadway, crossing the road [5]. In another study, Ogendi et al. [6] reported that out of the 176 persons involved in road traffic accidents in Kenya, 59.1% were pedestrians. The study also revealed that 72.6% of the pedestrians involved were injured while crossing the road, 11% while standing by the road, 8.2% while walking along the road, and another 8.2% while engaging in other activities, including hawking. The trend is similar in Nigeria; for instance, Aladelusi et al. [7] found pedestrians to be among the most frequent victims of road traffic accidents. Solagberu et al. [8], investigating pedestrian injuries in Lagos, Nigeria, found that 67% of the 702 pedestrians involved in road accidents were injured while crossing the road. Odeleye [9] identified poor planning, reckless behavior of motorized drivers toward pedestrians, and the unsafe state of the road traffic environment as the leading causes of pedestrian accidents in Nigeria.

Given the rising trend in pedestrian fatalities globally and locally, understanding pedestrian behavior is the focus of this research. This study aims to develop a model for predicting pedestrian speed using an artificial neural network (ANN) approach based on field data, considering the effect of gender, clothing type, and shoe type of individual pedestrians in Kano, Nigeria.

The microscopic pedestrian model has been studied extensively by many researchers, including [10], who used the concept of magnetic theory to describe movement, representing each pedestrian's movement as the motion of a magnetized object in a magnetic field, with each pedestrian and obstacle assumed to be a positive magnetic pole and the pedestrian's destination a negative magnetic pole. Gipps and Marksjö [11] used a cellular automata (CA)-like concept to model pedestrian traffic flow, using reverse gravity-based rules to move pedestrians over a grid of hexagonal cells. Blue and Adler ([12], 2001) used cellular automata principles to model pedestrian behavior in unidirectional and bidirectional movement. Dijkstra and Jessurun [13] and Wang et al. [14] both extended the cellular automata model to simulate pedestrian behavior in public places. Chen et al. [15] extended the cellular automata model to pedestrian behavior under attracting incidents. Hu et al. [16] extended CA to enhance evacuation efficiency and analyzed the model with respect to queuing time. Alghadi et al. [17] allowed more than one pedestrian to be in the same cell. Lu et al. [18] extended the floor field CA model to capture and evaluate the influence of group behavior on crowd evacuation, since the presence of individuals with family and friends makes a crowd a mixture of groups rather than a pure collection of individuals.

Helbing and Molnár [19] suggested that pedestrians' behavior along their movement path could be modeled as social forces. Some steps toward this had been taken earlier by Lewin [20], who suggested that human behavioral change is guided by a social field or social force. Teknomo [21] extended the social force model with two repulsive forces, one taking effect when there is a pedestrian in front and the other when the radii of two or more pedestrians overlap. Helbing et al. [22], Lakoba et al. [23], and Parisi et al. (2009) introduced a "self-stopping" mechanism to prevent a pedestrian from pushing over other pedestrians in the simulation process. Zanlungo et al. [24] introduced collision prediction and avoidance mechanisms in the simulation process. Moussaid et al. [25] developed an individual-based model that describes how a pedestrian interacts with other members of the same group and with members of other groups. Xun et al. (2015) investigated the effects of spatial distance, occupant density, and exit width on exit selection in a subway station. Abualigah et al. [26] proposed a new optimization technique that can be used to solve this problem.

Gruden et al. [27] used ANN to model microscopic pedestrian crossing behavior. Das et al. [28] used ANN to model pedestrian macroscopic traffic flow relationships. The authors compared ANN with other deterministic models on different pedestrian facilities and found that ANN had an outstanding performance. Zampieri et al. [29] compared space syntax and ANN for modeling pedestrian movement behavior and found that ANN performed better, with a correlation coefficient above 90% and an average error smaller than 0.02.

2 Material and Method

2.1 Data Collection Location

The data for the research were collected at a pedestrian overhead bridge located at Sa’adatu Rimi College of Education, Kano, Nigeria. The bridge was constructed in 2014 by the Kano State Government, through the state Ministry of Works, Housing, and Transport, to improve pedestrian safety and reduce the delays that crossing pedestrians cause to motorists. The majority of the people using the pedestrian overhead bridge were from Sa’adatu


A. Dayyabu et al.

Fig. 1 Location of pedestrian bridge 001 used for the data collection

Rimi College of Education, Kano. The college is among the largest teacher training institutions in Nigeria, with a student population above 45,000 in 2012. The location of the data collection is presented in Fig. 1. The road under the bridge is a four-lane divided arterial road with a high traffic flow.

2.2 Data Capturing and Extraction

A Hikvision cube IP security camera (DS-2CD2442FWD-IW, 4 MP, WDR) was used for data capturing. The camera was mounted 7 m above ground level to capture the pedestrians’ features of gender, clothing type, and shoe type. Twelve hours of data were collected each day, from 7:00 a.m. to 7:00 p.m., Monday through Thursday. Features and speeds of individuals were extracted manually from playback of the recorded video using AVS video editing software. The research considers speed data for single pedestrians in the age range of 18–40. The speed data were grouped into three pedestrian combinations: All pedestrians, made up of all single pedestrians; Combination I, made up of single pedestrians wearing English/African short clothing; and Combination II, made up of single pedestrians wearing African long clothing/gown.

2.3 Data Preparation

The input data of pedestrians’ gender, clothing type, shoe type, and speed, obtained from the playback of the field observation video, were normalized to a standard scale of 0–1 prior to model building and analysis. The normalization was carried out using Eq. (1):

$$X_s = \frac{X_i - X_{\min}}{X_{\max} - X_{\min}} \tag{1}$$

where $X_s$ is the standardized value, $X_i$ is the original value, $X_{\min}$ is the minimum value of $X$, and $X_{\max}$ is the maximum value of $X$.
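As a minimal sketch of the min-max normalization in Eq. (1), the function below rescales a list of values to the 0–1 range. The function name, the handling of a constant column, and the sample speeds are assumptions made for illustration; they are not taken from the study's data.

```python
def min_max_normalize(values):
    """Scale a list of numbers to the 0-1 range following Eq. (1)."""
    x_min, x_max = min(values), max(values)
    if x_max == x_min:
        # Eq. (1) is undefined for a constant column; returning zeros
        # is a choice made for this sketch only.
        return [0.0 for _ in values]
    return [(x - x_min) / (x_max - x_min) for x in values]


# Hypothetical walking speeds in m/min (illustrative values only).
speeds = [34.0, 51.0, 68.0, 85.0, 102.0]
normalized = min_max_normalize(speeds)
print(normalized)
```

The smallest value maps to 0 and the largest to 1, so inputs measured on different scales become comparable before model building.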

2.4 Sensitivity Analysis

Sensitivity analysis was carried out to find the relationship between the input variables and the output variable and to establish the significance of each input variable in model building. The Pearson product-moment coefficient of correlation was used for the sensitivity analysis. The Pearson correlation equation is presented in Eq. (2):

$$r = \frac{S_{xy}}{\sqrt{S_{xx}\,S_{yy}}} \tag{2}$$

where $r$ is the Pearson product-moment coefficient of correlation; $S_{xx}$ is the sum of squared deviations of variable $X$ from its mean; $S_{yy}$ is the sum of squared deviations of variable $Y$ from its mean; and $S_{xy}$ is the sum of the products of the deviations of $X$ and $Y$ from their means.
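The following sketch computes the Pearson product-moment correlation in its standard form, using sums of squared deviations and of cross-products of deviations. The function name and the small coded dataset (1 for slippers, 0 for cover shoe, with normalized speeds) are hypothetical and serve only to illustrate the calculation.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products of deviations and sums of squared deviations.
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / math.sqrt(s_xx * s_yy)


# Hypothetical coded inputs: 1 = slippers, 0 = cover shoe.
wears_slippers = [0, 0, 1, 1]
normalized_speed = [0.9, 0.7, 0.4, 0.2]
r = pearson_r(wears_slippers, normalized_speed)
print(round(r, 4))
```

A negative value here would mean that, in this made-up sample, wearing slippers goes with lower speed; the sign and magnitude are what the sensitivity analysis reads off for each input.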

2.5 ANN Model Formulation

An artificial neural network (ANN) is an AI-based mathematical model that aims to handle the non-linear relationship of an input–output dataset. Historically, ANNs are information processing tools derived by analogy with the brain’s biological nervous system, with the fundamental component called a neuron (node) (Sirhan and Koch, 2013). ANN has proved practical for complex functions in various fields, including prediction, pattern recognition, classification, forecasting, control systems, and simulation [30, 31]. Among the different classes of ANN algorithms, the Feed-Forward Neural Network (FFNN) with Backpropagation (BP) is the most common and most widely applied [32]. ANN is a rapidly emerging technique in non-linear modeling because of its predictive capability and its ability to learn system behavior quickly. An ANN has a parallel operating architecture consisting of input, hidden, and output layers interconnected by neurons, as presented in Fig. 2. It is trained on the association of input and target output values through the activation function of the hidden neurons, and its predictive capability can be improved by adjusting the connection weights of each neuron until the required performance value is reached (maximum correlation coefficient or minimum mean square error between the target and output values). The critical problem in designing a complex ANN architecture is obtaining the required performance value and choosing the number of hidden layers and neurons. Several alternatives have been tried based


Fig. 2 Artificial neural network (ANN) structure

on the association of input and target output to represent the ANN architecture, because no general rules are known (Bums and Whitesides, 1993). The research proposes an ANN model based on feed-forward with the backpropagation algorithm. The chosen feed-forward ANN comprises an input layer, a hidden layer, and an output layer. The required number of neurons in the hidden layer is selected by trial and error based on the best performance value. Depending on the model, the input layer comprises 2, 3, 4, or 5 neurons, while the target output layer has a single neuron, the field-observed speed. The strength of each connection between neurons is referred to as its weight. The weighted sum of the inputs is given in Eq. (3):

$$\mathrm{NET}_j = \sum_{i=1}^{n} W_{ij} X_{ij} \tag{3}$$

where $W_{ij}$ is the established weight, $X_{ij}$ is the input value, and $\mathrm{NET}_j$ is the input to a node in layer $j$. In the backpropagation technique, the output of the target neuron is quantified by the sigmoid function given in Eq. (4):

$$f(\mathrm{NET}_j) = \frac{1}{1 + \exp(-\mathrm{NET}_j)} \tag{4}$$

The backpropagation algorithm is a form of supervised training that minimizes the sum of squared errors by modifying the connection weights.
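To make the weighted sum, the sigmoid activation, and the idea of error-driven weight adjustment concrete, here is a minimal sketch for a single sigmoid neuron. It uses plain gradient descent on the squared error, which is an assumption chosen for simplicity and is not the Levenberg–Marquardt training used in the study; the inputs, initial weights, target, and learning rate are all hypothetical.

```python
import math


def sigmoid(net):
    """Sigmoid activation, f(NET) = 1 / (1 + exp(-NET))."""
    return 1.0 / (1.0 + math.exp(-net))


def neuron_output(weights, inputs):
    """Weighted sum of the inputs followed by the sigmoid."""
    net = sum(w * x for w, x in zip(weights, inputs))
    return sigmoid(net)


def gradient_step(weights, inputs, target, learning_rate=0.5):
    """One gradient-descent update that reduces 0.5 * (target - output)^2."""
    out = neuron_output(weights, inputs)
    # Chain rule: error term times the sigmoid derivative out * (1 - out).
    delta = (target - out) * out * (1.0 - out)
    return [w + learning_rate * delta * x for w, x in zip(weights, inputs)]


# Hypothetical normalized inputs (e.g. gender, clothing type, shoe type).
inputs = [1.0, 0.0, 1.0]
weights = [0.1, -0.2, 0.3]
target = 0.8  # hypothetical normalized observed speed

error_before = (target - neuron_output(weights, inputs)) ** 2
for _ in range(100):
    weights = gradient_step(weights, inputs, target)
error_after = (target - neuron_output(weights, inputs)) ** 2
print(error_before, error_after)
```

Repeating the update shrinks the squared error for this one sample; a full network applies the same idea layer by layer over the whole training set.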

2.6 ANN Model Validations

Validation is an essential part of modeling as it demonstrates how reasonably the model represents the actual system. The coefficient of correlation (R), coefficient of determination (R²), MSE, and RMSE are used for model validation. RMSE represents the sample standard deviation of the differences between predicted values and observed values. The values of R², R, MSE, and RMSE are estimated using Eqs. (5)–(8). Table 3b, d present the validation results for pedestrians in the ascending and descending directions.

$$R^2 = 1 - \frac{\sum_{i=1}^{n} (O_i - P_i)^2}{\sum_{i=1}^{n} (O_i - \bar{O})^2} \tag{5}$$

$$R = \sqrt{R^2} \tag{6}$$

$$\mathrm{MSE} = \frac{\sum_{i=1}^{n} (O_i - P_i)^2}{N} \tag{7}$$

$$\mathrm{RMSE} = \sqrt{\mathrm{MSE}} \tag{8}$$

where $O_i$ is the observed value, $P_i$ is the predicted value, $\bar{O}$ is the mean of the observed values, and $N = n$ is the number of observations.
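The four validation measures can be computed together, as in the sketch below. The function name, the handling of a negative R², and the observed and predicted values are assumptions for illustration only.

```python
import math


def validation_metrics(observed, predicted):
    """Return R2, R, MSE and RMSE for paired observed/predicted values."""
    n = len(observed)
    mean_obs = sum(observed) / n
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    r2 = 1.0 - ss_res / ss_tot
    # R is taken as the square root of R2; it is undefined when R2 < 0.
    r = math.sqrt(r2) if r2 >= 0 else float("nan")
    mse = ss_res / n
    rmse = math.sqrt(mse)
    return {"R2": r2, "R": r, "MSE": mse, "RMSE": rmse}


# Hypothetical normalized speeds.
observed = [0.2, 0.4, 0.6, 0.8]
predicted = [0.3, 0.4, 0.5, 0.9]
metrics = validation_metrics(observed, predicted)
print(metrics)
```

Because the speeds are normalized to 0–1 before modeling, MSE and RMSE computed this way are on the normalized scale rather than in m/min.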

3 Results Analysis and Discussion

3.1 Descriptive of Observed Pedestrian Data

The data collected were classified into discrete and continuous data. The discrete data are presented in Fig. 3a–e: Fig. 3a shows the pedestrian classification based on gender; Fig. 3b, based on the direction of movement; Fig. 3c, based on age group; Fig. 3d, based on clothing type; and Fig. 3e, based on shoe type. The observations show the presence of different types of pedestrians. A total of 5672 male pedestrians were observed, 4443 in the ascending direction and 1229 in the descending direction, together with 1138 female pedestrians, 983 in the ascending direction and 155 in the descending direction, as presented in Fig. 3a. The pedestrian group sizes observed were single pedestrians, with a total of 4219 (3254 ascending, 965 descending); groups of two, with a total of 1939 (1716 ascending, 223 descending); groups of three, with a total of 631 (456 ascending, 175 descending); and groups of four, with a total of 271 (250 ascending, 21 descending), as presented in Fig. 3b. The pedestrians comprised all ages: the 18–40 age range had a total of 5233 (4182 ascending, 1051 descending), the under-18 age range had a total of 402 (242 ascending, 160 descending), and the over-40 age range had a total of 1175 (1002 ascending, 173 descending), as presented in Fig. 3c. The pedestrians were observed wearing different types of clothes, ranging from English wear with a total of 1601, 1342 in ascending direction, 259


Fig. 3 a Pedestrian classification based on gender type. b Pedestrian classification based on the direction of movement. c Pedestrian classification based on age group. d Pedestrian classification based on clothing types. e Pedestrian classification based on shoe types

in descending direction, to short African wear with a total of 583 (344 ascending, 239 descending), long African wear with a total of 4131 (3407 ascending, 724 descending), and gown/hijab with a total of 495 (333 ascending, 162 descending), as presented in Fig. 3d. The pedestrians observed were wearing different types of shoes: 1756 were wearing cover shoes (1376 ascending, 380 descending), while 5054 were wearing slippers (4050 ascending, 1004 descending), as presented in Fig. 3e.


3.2 Speed Characteristic and Distribution Results

The speed characteristics of maximum, minimum, and mean speed for all the pedestrian combinations mentioned in the methodology are presented in Table 1: Table 1a presents the speed characteristics of male pedestrians wearing cover shoes; Table 1b, of male pedestrians wearing slippers; and Table 1c, of female pedestrians wearing slippers. The statistical analyses presented in Table 1a–c indicate that the mean speed in the ascending direction is higher than in the descending direction, with values of 67.74 m/min and 52.19 m/min, respectively. The speed distribution also indicates that male pedestrians wearing English/short African clothes and cover shoes have the highest mean speed, 84.21 m/min and 60.10 m/min in the ascending and descending directions, followed by male pedestrians wearing English/short African clothes and slippers (72.70 and 57.70 m/min), male pedestrians wearing long/gown clothes and cover shoes (70.14 and 58.92 m/min), male pedestrians wearing long/gown clothes and slippers (68.30 and 56.07 m/min), female pedestrians wearing English/short African clothes and slippers (55.25 and 50.50 m/min), and lastly female pedestrians wearing long/gown African clothes and slippers (49.10 and 48.90 m/min). The speed distribution of the observed pedestrian data is presented based on the combinations specified in the methodology: Fig. 4a for all single pedestrians in the ascending direction; Fig. 4b for all single pedestrians in the descending direction; Fig. 4c–f for male pedestrians wearing cover shoes; Fig. 4g–j for male pedestrians wearing slippers; and Fig. 4k–n for female pedestrians wearing slippers.

Table 1 a Male pedestrian speed characteristics based on cover shoe type. b Male pedestrian speed characteristics based on slippers shoe type. c Female pedestrian speed characteristics based on slippers shoe type

(a)
                   All pedestrian          Pedestrian comb. I      Pedestrian comb. II
                   Ascending   Descending  Ascending   Descending  Ascending   Descending
No. of pedestrian  1167        300         240         60          141         39
Max (m/min)        102         85          102         63.75       78          54.35
Min (m/min)        34          34          51          46.36       51.57       42.35
Mean (m/min)       67.74       52.19       84.21       60.1        70.14       58.92

(b)
                   All pedestrian          Pedestrian comb. I      Pedestrian comb. II
                   Ascending   Descending  Ascending   Descending  Ascending   Descending
No. of pedestrian  1167        300         203         56          583         145
Max (m/min)        102         85          102         72.86       70          48.35
Min (m/min)        34          34          56.67       46.36       39.35       34.23
Mean (m/min)       67.74       52.19       72.7        57.70       68.3        56.07

(c)
                   All pedestrian          Pedestrian comb. I      Pedestrian comb. II
                   Ascending   Descending  Ascending   Descending  Ascending   Descending
No. of pedestrian  330         123         37          18          293         105
Max (m/min)        85          85          85          72.86       85          85
Min (m/min)        26.84       28.33       28.33       34          26.84       28.33
Mean (m/min)       50.42       47.09       55.25       50.5        49.1        48.90

Table 2 a Pearson correlation coefficient matrix for ascending direction pedestrians. b Pearson correlation coefficient matrix for descending direction pedestrians

(a)
           Male       Female     C-Type I   C-Type II  S-Type I   S-Type II  Speed
Male       1
Female     −1         1
C-Type I   0.002295   −0.00229   1
C-Type II  0.010094   −0.01009   −0.9053    1
S-Type I   −0.07758   0.077585   0.319891   −0.39542   1
S-Type II  0.06734    −0.06734   −0.32385   0.389791   −0.99453   1
Speed      0.205272   −0.20527   0.522692   −0.49069   0.611517   −0.6234    1

(b)
           Male       Female     C-Type I   C-Type II  S-Type I   S-Type II  Speed
Male       1
Female     −1         1
C-Type I   0.267199   −0.2672    1
C-Type II  −0.20397   0.203973   −0.95477   1
S-Type I   0.147209   −0.14721   0.153344   −0.15923   1
S-Type II  −0.14413   0.144127   −0.1508    0.157253   −0.98757   1
Speed      0.487616   −0.487622  0.822385   −0.78832   0.433206   −0.42792   1

3.3 Result of Sensitivity Analysis

The research uses the Pearson correlation method to determine the order of importance of each variable in model building. The sensitivity analysis presented in Table 2a, b provides the relationship between the independent variables and the dependent variable and


Table 3 a ANN model training ascending direction. b ANN model validation ascending direction. c ANN model training descending direction. d ANN model validation descending direction

(a) Training-phase
         R2       R        MSE      RMSE
ANN-M1   0.4125   0.6423   0.03880  0.1971
ANN-M2   0.4559   0.6752   0.0357   0.1890
ANN-M3   0.4953   0.7038   0.0326   0.1806
ANN-M4   0.4165   0.6454   0.0386   0.1964
ANN-M5   0.5020   0.7085   0.0320   0.1790

(b) Validation-phase
         R2       R        MSE      RMSE
ANN-M1   0.4272   0.6536   0.0364   0.1908
ANN-M2   0.4499   0.6708   0.0348   0.1866
ANN-M3   0.4946   0.7046   0.0312   0.1767
ANN-M4   0.4311   0.6566   0.0362   0.1901
ANN-M5   0.4948   0.7034   0.0314   0.1771

(c) Training-phase
         R2       R        MSE      RMSE
ANN-M1   0.3997   0.6366   0.0285   0.1687
ANN-M2   0.5908   0.6322   0.0287   0.1695
ANN-M3   0.5482   0.7687   0.0196   0.1400
ANN-M4   0.6193   0.7404   0.0216   0.1471
ANN-M5   0.6193   0.7870   0.0182   0.1350

(d) Validation-phase
         R2       R        MSE      RMSE
ANN-M1   0.3974   0.6304   0.0297   0.1723
ANN-M2   0.3975   0.6305   0.0297   0.1723
ANN-M3   0.5803   0.7618   0.0207   0.1438
ANN-M4   0.5405   0.7352   0.0226   0.1505
ANN-M5   0.6077   0.7795   0.0193   0.1390


Fig. 4 Pedestrian speed distribution. a All pedestrians, ascending direction. b All pedestrians, descending direction. c Pedestrian comb. I, ascending direction, based on cover shoe. d Pedestrian comb. I, descending direction, based on cover shoe. e Pedestrian comb. II, ascending direction, based on cover shoe. f Pedestrian comb. II, descending direction, based on cover shoe. g Pedestrian comb. I, ascending direction, based on slipper shoe. h Pedestrian comb. I, descending direction, based on slipper shoe. i Pedestrian comb. II, ascending direction, based on slipper shoe. j Pedestrian comb. II, descending direction, based on slipper shoe. k Pedestrian comb. I, ascending direction, based on slipper shoe. l Pedestrian comb. I, descending direction, based on slipper shoe. m Pedestrian comb. II, ascending direction, based on slipper shoe. n Pedestrian comb. II, descending direction, based on slipper shoe


each variable’s significance in model building. Table 2a presents the relationship for the ascending direction, and Table 2b presents the relationship for the descending direction. The sensitivity analysis indicates that shoe type is the most significant variable in the ascending direction, with slippers being the most significant, followed by cover shoe, clothing type I, and clothing type II, with gender the least significant, as presented in Table 2a. In the descending direction, clothing type I is the most significant, followed by clothing type II, female gender, male gender, cover shoe type, and shoe type II, as presented in Table 2b.
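The ordering described for the ascending direction can be checked by sorting the input variables by the absolute value of their correlation with speed. The sketch below uses the speed row of Table 2a as reported; the dictionary layout and variable names are assumptions for illustration, and reading S-Type II as slippers and S-Type I as cover shoe is inferred from the text rather than stated in the table.

```python
# Correlation of each input with speed, ascending direction (Table 2a).
ascending_r = {
    "Male": 0.205272,
    "Female": -0.20527,
    "C-Type I": 0.522692,
    "C-Type II": -0.49069,
    "S-Type I": 0.611517,
    "S-Type II": -0.6234,
}

# Rank inputs from most to least significant by |r|.
ranked = sorted(ascending_r, key=lambda name: abs(ascending_r[name]), reverse=True)
for name in ranked:
    print(f"{name:10s} |r| = {abs(ascending_r[name]):.4f}")
```

The sign of each coefficient gives the direction of the relationship, while the absolute value is what determines the order in which inputs are considered for model building.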

3.4 Model Estimation Analysis Results

In this research, a two-layer feed-forward network trained with the Levenberg–Marquardt algorithm is used to build the ANN models. Feed-forward networks consist of a series of layers, and each subsequent layer has a connection from the previous one. The models were built using MATLAB 2019a; five ANN models were developed based on the Pearson correlation coefficients. For each model, 75% of the data were used for training and 25% for validation. Network performance was measured by the mean squared error (MSE). Table 3a–d present the performance measures of all five ANN models in training and validation for the ascending and descending directions. In the performance analysis presented in Table 3a–d, the values of R and RMSE provide evidence that ANN can be used to model pedestrian speed on stairs: all the values of R from model 1 to model 5 are greater than 0.5, and model 5 has the best performance, with R-values of 0.7085 and 0.7034 in training and validation for the ascending direction and 0.7870 and 0.7795 in training and validation for the descending direction (Fig. 5).
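A 75%/25% division of the data into training and validation sets could be sketched as follows. The text does not say how the records were assigned to each set, so the random shuffle, the fixed seed, and the function name are assumptions; the study itself used MATLAB, and this Python version is only illustrative.

```python
import random


def split_train_validation(samples, train_fraction=0.75, seed=0):
    """Shuffle the samples and split them into training and validation sets."""
    rng = random.Random(seed)  # fixed seed so the split is repeatable
    shuffled = list(samples)
    rng.shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


# Hypothetical record identifiers standing in for pedestrian observations.
samples = list(range(20))
train, validation = split_train_validation(samples)
print(len(train), len(validation))
```

Keeping the validation records out of training is what lets the validation-phase R and RMSE indicate how well a model generalizes.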

4 Conclusion

Artificial intelligence modeling based on ANN can be used for pedestrian speed prediction that considers the effects of gender, clothing type, and shoe type, as shown by the ANN performance analysis conducted in this research. All the ANN models built from the observed data have a correlation coefficient (R) greater than 0.5, indicating the acceptability of ANN for pedestrian speed prediction on a stairway. The research also concludes that a pedestrian’s dress, in terms of clothing and shoe type, together with gender, affects pedestrian speed: male pedestrians wearing English/short African clothes with cover shoes have the highest speed of all the combinations observed, while female pedestrians wearing long African clothes with slippers have the lowest speed.


Fig. 5 a Pedestrian speed relationship between predicted and observed data (TRAINING). b Pedestrian speed relationship between predicted and observed data (TESTING). c Pedestrian speed relationship between predicted and observed data (TRAINING). d Pedestrian speed relationship between predicted and observed data (TESTING)

References

1. Jacobson, H. R. (1940). A history of roads from ancient times to the motor age (Georgia Institute of Technology). https://smartech.gatech.edu/bitstream/handle/1853/36216/jacobson_herbert_r_194005_ms_95034.pdf
2. Olojede, O., Yoade, A., & Olufemi, B. (2017). Determinants of walking as an active travel mode in a Nigerian city. Journal of Transport and Health, 6, 327–334. https://doi.org/10.1016/j.jth.2017.06.008
3. Litman, T. (2011). Evaluating public transportation health benefits. (April). http://site.ebrary.com/lib/sfu/docDetail.action?docID=10534560
4. WHO. (2015). Global status report on road safety 2013. WHO. http://www.who.int/violence_injury_prevention/road_safety_status/2013/en/
5. Damsere-Derry, J., et al. (2010). Pedestrians’ injury patterns in Ghana. Accident Analysis and Prevention, 42(4), 1080–1088.
6. Ogendi, J., Odero, W., Mitullah, W., & Khayesi, M. (2013). Pattern of pedestrian injuries in the city of Nairobi: Implications for urban safety planning. Journal of Urban Health, 90(5), 849–856.
7. Aladelusi, T. O., et al. (2014). Evaluation of pedestrian road traffic maxillofacial injuries in a Nigerian tertiary hospital. African Journal of Medicine and Medical Sciences, 43(4), 353–359.
8. Solagberu, B. A., et al. (2014). Child pedestrian injury and fatality in a developing country. Pediatric Surgery International, 30(6), 625–632.
9. Odeleye, A. J. (2001). Improved road traffic environment for better child safety in Nigeria. In Road user characteristics with emphasis on life-styles, quality of life and safety. Proceedings of 14th ICTCT workshop held Caserta, Italy, October, 2001, pp. 72–82. http://trid.trb.org/view/745284
10. Okazaki, S., & Matsushita, S. (1979). A study of simulation model for pedestrian movement. In Architectural space, part 3: along the shortest path, taking fire, congestion and unrecognized space into account, transactions of architectural institute of Japan, 285. https://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.626.596
11. Gipps, P. G., & Marksjö, B. (1985). A micro-simulation model for pedestrian flows. Mathematics and Computers in Simulation, 27(2–3), 95–105. https://doi.org/10.1016/0378-4754(85)90027-8
12. Blue, V. J., & Adler, J. L. (1998). Emergent fundamental pedestrian flows from cellular automata microsimulation. Transportation Research Record: Journal of the Transportation Research Board, 1644(1), 29–36. https://doi.org/10.3141/1644-04
13. Dijkstra, J., & Jessurun, J. (2001). Theory and practical issues on cellular automata. Theory and practical issues on cellular automata, (January 2000). https://doi.org/10.1007/978-1-4471-0709-5
14. Wang, J., Zhang, L., Shi, Q., Yang, P., & Hu, X. (2015). Modeling and simulating for congestion pedestrian evacuation with panic. Physica A: Statistical Mechanics and Its Applications, 428, 396–409. https://doi.org/10.1016/j.physa.2015.01.057
15. Chen, Y., Chen, N., Wang, Y., Wang, Z., & Feng, G. (2015). Modeling pedestrian behaviors under attracting incidents using cellular automata. Physica A: Statistical Mechanics and Its Applications, 432, 287–300. https://doi.org/10.1016/j.physa.2015.03.017
16. Hu, J., You, L., Zhang, H., Wei, J., & Guo, Y. (2018). Study on queueing behavior in pedestrian evacuation by extended cellular automata model. Physica A: Statistical Mechanics and Its Applications, 489, 112–127. https://doi.org/10.1016/j.physa.2017.07.004
17. Alghadi, M. Y., Mazlan, A. R., & Azhari, A. (2019). The impact of board gender and multiple directorship on cash holdings: Evidence from Jordan. International Journal of Finance and Banking Research, 5(4), 71–75.
18. Lu, L., Guo, X., & Zhao, J. (2017). A unified nonlocal strain gradient model for nanobeams and the importance of higher order terms. International Journal of Engineering Science, 119, 265–277.
19. Helbing, D., & Molnár, P. (1995). Social force model for pedestrian dynamics. Physical Review E, 51(5), 4282–4286. https://doi.org/10.1103/PhysRevE.51.4282
20. Lewin, K. (1951). Field theory in social science. Amazon.co.uk: Lewin, Kurt: Books. Retrieved September 24, 2020, from https://www.amazon.co.uk/Field-Theory-Social-Science-Lewin/dp/B0007DDXKY
21. Teknomo, K. (2006). Application of microscopic pedestrian simulation model. Transportation Research Part F: Traffic Psychology and Behaviour, 9(1), 15–27. https://doi.org/10.1016/j.trf.2005.08.006
22. Helbing, D., Buzna, L., Johansson, A., & Werner, T. (2005). Self-organized pedestrian crowd dynamics: Experiments, simulations, and design solutions. Transportation Science, 39(1), 1–24.
23. Lakoba, T. I., Kaup, D. J., & Finkelstein, N. M. (2005). Modifications of the Helbing-Molnár-Farkas-Vicsek social force model for pedestrian evolution. Simulation, 81(5), 339–352. https://doi.org/10.1177/0037549705052772
24. Zanlungo, F., Brščić, D., & Kanda, T. (2014). Pedestrian group behaviour analysis under different density conditions. Transportation Research Procedia, 2, 149–158. https://doi.org/10.1016/j.trpro.2014.09.020
25. Moussaïd, M., Perozo, N., Garnier, S., Helbing, D., & Theraulaz, G. (2010). The walking behaviour of pedestrian social groups and its impact on crowd dynamics. PLoS ONE, 5(4), e10047. https://doi.org/10.1371/journal.pone.0010047
26. Abualigah, L., Diabat, A., Mirjalili, S., Abd Elaziz, M., & Gandomi, A. H. (2021). The arithmetic optimization algorithm. Computer Methods in Applied Mechanics and Engineering, 376, 113609.
27. Gruden, C., Otković, I. I., & Šraml, M. (2020). Neural networks applied to microsimulation: A prediction model for pedestrian crossing time. Sustainability (Switzerland), 12(13).
28. Das, P., Parida, M., & Katiyar, V. K. (2015). Analysis of interrelationship between pedestrian flow parameters using artificial neural network. Journal of Medical and Biological Engineering, 35(6), 298–309.
29. Zampieri, F. L., Rigatti, D., & Ugalde, C. (2009). Evaluated model of pedestrian movement based on space syntax, performance measures and artificial neural nets. In 7th International space syntax symposium, pp. 1–8.
30. Govindaraju, R. S. (2000). Artificial neural networks in hydrology. II: Hydrologic applications. Journal of Hydrologic Engineering, 5(2), 124–137.
31. Solgi, M., Najib, T., Ahmadnejad, S., & Nasernejad, B. (2017). Synthesis and characterization of novel activated carbon from Medlar seed for chromium removal: Experimental analysis and modeling with artificial neural network and support vector regression. Resource-Efficient Technologies, 3(3), 236–248.
32. Elkiran, G., Nourani, V., & Abba, S. I. (2019). Multi-step ahead modelling of river water quality parameters using ensemble artificial intelligence-based approach. Journal of Hydrology, 577, 123962.
33. Price, J. L., McKeel Jr, D. W., Buckles, V. D., Roe, C. M., Xiong, C., Grundman, M., ... & Morris, J. C. (2009). Neuropathology of nondemented aging: Presumptive evidence for preclinical Alzheimer disease. Neurobiology of Aging, 30(7), 1026–1036.
34. Zare, M., & Koch, M. (2016, July). Using ANN and ANFIS models for simulating and predicting groundwater level fluctuations in the Miandarband Plain, Iran. In Proceedings of the 4th IAHR Europe congress. Sustainable hydraulics in the era of global change (p. 416), Liege, Belgium.
35. Schuchhardt, J., Schneider, G., Reichelt, J., Schomburg, D., & Wrede, P. (1995). Classification of local protein structural motifs by kohonen networks. Bioinformatics: From Nucleic Acids and Proteins to Cell Metabolism, 85–92.
36. Blue, V. J., & Adler, J. L. (2001). Cellular automata microsimulation for modeling bi-directional pedestrian walkways. Transportation Research Part B: Methodological, 35(3), 293–312.
37. Zheng, X., Li, H. Y., Meng, L. Y., Xu, X. Y., & Chen, X. (2015). Improved social force model based on exit selection for microscopic pedestrian simulation in subway station. Journal of Central South University, 22(11), 4490–4497.

Arabic Text Classification Using Modified Artificial Bee Colony Algorithm for Sentiment Analysis: The Case of Jordanian Dialect

Abdallah Habeeb, Mohammed A. Otair, Laith Abualigah, Anas Ratib Alsoud, Diaa Salama Abd Elminaam, Raed Abu Zitar, Absalom E. Ezugwu, and Heming Jia

Abstract Arab customers give their comments and opinions daily, and their volume grows dramatically through online reviews of companies’ products or services, written both in Arabic and in its dialects. Such text describes the user’s condition or needs and expresses satisfaction or dissatisfaction, so its evaluation has either negative or positive polarity. This work addresses the Arabic text sentiment analysis problem for the case of the Jordanian dialect. The main purpose of this paper is to classify text into two classes, negative or positive, which may help a business to maintain a report

A. Habeeb · M. A. Otair
Faculty of Computer Sciences and Informatics, Amman Arab University, Amman 11953, Jordan

L. Abualigah (✉) · A. R. Alsoud
Hourani Center for Applied Scientific Research, Al-Ahliyya Amman University, Amman, Jordan
e-mail: [email protected]

L. Abualigah
Faculty of Information Technology, Middle East University, Amman 11831, Jordan
School of Computer Sciences, Universiti Sains Malaysia, 11800 Pulau Pinang, Gelugor, Malaysia

D. S. A. Elminaam
Faculty of Computers and Artificial Intelligence, Benha University, Benha, Egypt
Faculty of Computer Science, Misr International University, Obour, Egypt

R. A. Zitar
Sorbonne Center of Artificial Intelligence, Sorbonne University-Abu Dhabi, Abu Dhabi, United Arab Emirates

A. E. Ezugwu
School of Mathematics, Statistics, and Computer Science, University of KwaZulu-Natal, King Edward Road, Pietermaritzburg, KwaZulu-Natal 3201, South Africa

H. Jia
Department of Information Engineering, Sanming University, Fujian 365004, China

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023
L. Abualigah (ed.), Classification Applications with Deep Learning and Machine Learning Technologies, Studies in Computational Intelligence 1071, https://doi.org/10.1007/978-3-031-17576-3_12


A. Habeeb et al.

about a service or product. The first phase applies natural language processing tools (stemming, stop word removal, and tokenization) to filter the text. The second phase modifies the Artificial Bee Colony (ABC) algorithm with the Upper Confidence Bound (UCB) algorithm to promote its exploitation ability in the minimum dimension and obtain the minimum number of optimal features; a forward feature selection strategy is then applied with four machine learning classifiers: K-Nearest Neighbors (KNN), Support Vector Machines (SVM), Naïve Bayes (NB), and Polynomial Neural Networks (PNN). The proposed model has been applied to a Jordanian dialect database that contains comments from Jordanian telecom customers. Based on the results of the sentiment analysis, suggestions can be made to discontinue, drop, or upgrade products or services. Moreover, the proposed model is applied to an Algerian dialect database that contains long Arabic texts, in order to assess the efficiency of the proposed model on both short and long texts. Four performance evaluation criteria were used: precision, recall, F1-score, and accuracy. The experimental results show that the proposed model gives high accuracy, up to 99% on the Jordanian dialect and 82% on the Algerian dialect, and that it can be built on or used in future work on the classification of Arabic dialects.

Keywords Natural language processing · Text classification · Sentiment analysis · Feature selection · Inspired algorithms · ABC · UCB · KNN · SVM · PNN · Naïve Bayes

1 Introduction

A part of human intelligence is the use of language in communication, including the ability to speak, read, and analyze images to understand content. Artificial intelligence uses machine learning to reach a part of this intelligence, the ability to read and understand context [1]. Machine learning algorithms deal with automatic text classification by learning features, and they are used to build text classifiers in various fields such as email routing, spam filtering, web page classification, sentiment analysis, and topic tracking. To perform the text classification job, this work uses the preprocessing tools of stemming, tokenization, and stop word removal together with a proposed feature selection method based on optimization algorithms, to handle the high dimensionality of the features. Feature selection is an approach to choosing the most valuable features from a dataset of high dimensionality; the selected features are then used to improve classification performance [2]. The amount of data on dynamic web pages and the Internet is increasing every second, produced by social media, companies that care about customer opinion, and many other sources; hence the need to classify textual documents of unstructured data. The existence of unstructured data creates a need for knowledge that is used in many domains. Text classification and categorization refer to the task of assigning predefined domains or categories to a given written text. The automated

Arabic Text Classification Using Modified Artificial Bee Colony …


classification task assigns single or multiple relevant labels, formats the unstructured text so that it is compatible with ML algorithms, mines interesting knowledge, and helps to understand customer needs. One of the most important tasks in natural language processing (NLP) is sentiment analysis, which determines whether a text is positive or negative. NLP is used to analyze text automatically and to represent the data in a format suitable for machine learning [3]. One optimization algorithm is the artificial bee colony (ABC) algorithm, which has been used successfully in many studies. The algorithm suffers in part from its stochastic nature: its search equation exploits poorly when refining toward the best solutions [4]. Because of this weakness, the ABC algorithm with an elite opposition-based learning strategy (EOABC) has been used to address the poor exploitation of the original ABC [5].

Customer feedback is important for a business: to fully understand customers' requirements and their level of satisfaction, it is necessary to take customers' notes and evaluate their responses. This helps with innovation, product development, and service improvement, which build a loyal customer base. However, a huge volume of data needs to be processed. The problem addressed in this paper is the classification of Arabic text in the Jordanian dialect; the text is passed to classification algorithms that are trained on a labeled dataset and tested against the predicted labels. The search equation of the typical ABC algorithm is good at exploration but often shows insufficient exploitation, where exploitation is the act of confining the search to a small area of the search space in order to refine the solutions. In the ABC algorithm, a food source is chosen according to its probability value, based on the roulette wheel method, and a greedy selection is applied between the current food source and the new food source. As a first contribution, we modify the Artificial Bee Colony algorithm to enhance exploitation by applying the UCB algorithm instead of greedy selection, so that a food source is chosen according to the probability value and the optimal solution is sought in a small area of the search space. As a second contribution, supervised machine learning classifiers are used to determine the polarity of a text, which can be negative, expressing dissatisfaction, or positive, expressing satisfaction, in order to describe a person's feeling towards a product, service, or current state.

2 Related Works

2.1 Introduction

This research addresses the Arabic sentiment analysis (SA) problem for customer opinions in social media content, analyzing the written text in order to improve customer services and product quality. SA deals with massive data.


A. Habeeb et al.

To reduce the high-dimensional feature space, machine learning needs feature selection. The study in [6] proposed an enhancement of a bio-inspired optimizer, the salp swarm algorithm (SSA), designed for feature selection (FS) to solve the problem of Arabic sentiment analysis. It has two phases: the first reduces the number of features by applying a filtering technique based on the information gain metric; the second applies a wrapper FS technique that combines the SSA optimizer with four variants of S-shaped transfer functions and uses KNN for classification. Experimental results show that the classification accuracy of SSA combined with the S-shaped transfer functions outperformed the particle swarm optimizer and the grey wolf optimizer [6].

For sentiment analysis, [7] proposed a semi-supervised approach applied to Arabic and its dialects. The method classifies Arabic text by detecting its polarity (positive, negative) on a sentiment corpus. The approach is applied to Facebook (FB) text messages written in Modern Standard Arabic (MSA) and in the Algerian dialect (DALG), in both Arabic and Arabizi scripts. There are two options for handling Arabizi: translation and transliteration. The experiments were done on several test corpora dedicated to DALG/MSA, with classifiers such as Logistic Regression (LR), Random Forest (RF), Long Short-Term Memory (LSTM), and Convolutional Neural Network (CNN). The classifiers are combined with fastText and Word2vec. The experimental results reach an F1 score of up to 95%, and 89% for the extrinsic experiments [7].

An optimization algorithm is an important way to perform feature selection, which matters in the classification of high-dimensional text because it selects a set of optimal features that reduces computation and cost and improves the accuracy of text classification. In [8], a feature selection method based on the normalized difference measure and the binary Jaya optimization algorithm (NDM-BJO) is evaluated with Support Vector Machine and Naive Bayes classifiers to find the error rate. The results show that the NDM-BJO model gives improvements; the study also evaluates various categories of feature selection methods [8].

Text classification is a difficult mathematical task in machine learning because of the large increase in natural language text documents. Feature selection is the basis of the process because thousands of feature sets are possible when classifying text. The model in [9] is an enhanced binary grey wolf optimizer (GWO) used within a wrapper FS approach to address Arabic text classification problems. The wrapper-based feature selection is used with various learning models (Naive Bayes, K-nearest neighbor, and SVM classifiers) on training data from three public Arabic datasets: Gulf News, Al Watan, and Al Jazeera News. Results and analysis show that the SVM-based feature selection technique with the proposed binary GWO optimizer and elite-based crossover scheme is more effective in dealing with Arabic text classification problems than its peers [9].

Choosing efficient features from datasets is important to artificial intelligence, pattern recognition, text classification, and data mining. Feature selection (FS) can exclude features that are not relevant to the classification task and reduce the dimensionality of datasets, which helps us understand the data better. By choosing feature


selection, machine learning techniques are optimized and computational requirements are reduced. So far, a large number of feature selection methods have been suggested, but no single most practical method has been found. Although different classes of feature selection methods follow various criteria to evaluate the variables, few studies have focused on evaluating the different classes of feature selection methods. In [10], thirteen feature selection methods under five different categories are assessed, with a focus on comparing the general versatility and effectiveness of these methods. The thirteen methods are ranked using the rank aggregation method, and the five best FS methods are then chosen to perform multi-class classification with an SVM classifier. Different numbers of selected features, different languages, and different performance measures are used to measure the general versatility and validity of these methods. The analysis indicates that the Mahalanobis distance is the best approach [10].

Many different techniques are used to identify offensive speech in media and tweets. The research in [11] uses neural networks (NN) for classification. To participate in Task 12 (OffensEval) of the SemEval 2020 workshop, a C-BiGRU model composed of a CNN and a bidirectional RNN was used to identify offensive speech. Each word is given a multidimensional numerical representation using fastText, and the model is trained on a dataset of labeled tweets to detect words with an offensive meaning. The model was used for English, Turkish, and Danish, achieving F1-scores of 90.88%, 76.76%, and 76.70%, respectively [11].

The emotional state of clients needs to be understood through sentiment analysis techniques in natural language processing. To analyze the Chinese language, [12] proposes an LSTM-based Chinese text sentiment analysis model with Bi-GRU and an attention mechanism. The model works on deep properties of the text and merges context to learn text properties with greater precision. A multi-head self-attention model is then used to reduce dependence on external parameters, determine word weights, and emphasize the distinctive parts of the text. The experiment achieves 87.1% accuracy [12].

Cyberbullying is a problem with real victims, and it grows as Internet use increases. Classification studies on bullying in Arabic and English have been carried out. The paper in [13] suggests using RNN algorithms with pre-trained word embeddings in a connected set of experiments on a channel news comments dataset, reaching an F1 score of 0.84 [13].

The exploitation problem is prominent in the ABC algorithm, which was inspired by swarms of honeybees and has been applied to many problems. To improve exploitation in ABC algorithms, [14] proposes a chaotic ABC with an elite opposition-based learning strategy. The outcome is improved exploitation ability; furthermore, the elite opposition strategy is used to better exploit the potential of the available solutions. The results are compared with several artificial bee colony algorithms [14].

Sentiment analysis contributes to natural language processing by classifying the polarity of text, driven by the urgent need to understand opinions, feelings, emotions, and evaluations in data. This work aims to implement a


sentiment analysis system that identifies and understands semantics without linguistic resources. The proposed model is examined on detecting the polarity of a text as positive or negative [15].

Feature selection is very important for classification: it enhances classification performance, removes redundant features, and reduces computational time. The work in [16] proposes a new standard error-based artificial bee colony algorithm for the feature selection problem, developed by incorporating new standard error-based solution search mechanisms. Thirteen machine learning datasets are used with SVM and KNN classification algorithms [16].

The multi-objective artificial bee colony-based feature weighting technique for naïve Bayes (MOABC-FWNB) considers the feature-feature relationship (redundancy) and the feature-class relationship (relevancy) independently and uses Naïve Bayes (NB) to determine the weights of the features. An experimental study was conducted on 20 benchmark UCI datasets [17] (Table 1).

The literature reviewed is related to extracting value from text so that it can be used in diverse ways, such as classification, sentiment analysis, or extracting a specific value. The research problem in this paper is sentiment analysis of Arabic text. To cover the gaps in the literature, we identify a subset of optimal features by modifying the artificial bee colony algorithm, and then use this subset of features in classification with supervised machine learning to build an integrated application that predicts human feeling from the content of a text.

3 The Proposed Method

3.1 Introduction

This section presents the procedures and implementation of the experiments and how the results of the proposed model are obtained. The aim is to obtain the minimum number of optimal features that affect the polarity of a text, using the enhanced ABC-UCB designed for feature selection (described in detail in Sect. 3.5), and then to apply it with a wrapper technique and the classifiers needed for machine learning in order to improve the accuracy of text classification and solve the problem of Arabic sentiment analysis. The proposed model has been examined on two datasets: (1) the Jordanian dialect sentiment corpus and (2) the Algerian dialect sentiment corpus. Each dataset is divided into 80% training and 20% test data. The learning phase depends on the optimal set of features from the previous phase, which is used by the classification algorithms to train on the training set and to test the predicted labels. The proposed model is evaluated against widely used classification techniques. The pre-processing steps of the dataset are also discussed in this section. The entire experiment was designed and implemented in Python. Python 3.8, Spyder 3, and Jupyter Notebook server 6.1.4 were used to import the datasets and


Table 1 Related studies summary

Author | Method | Dataset | Research title | Summary
[6] | Salp swarm algorithm | Benchmark dataset of Arabic tweets | Arabic sentiment analysis based on salp swarm algorithm with S-shaped transfer functions | Average classification accuracy rate of 80.08%; PSO came next with an average classification accuracy rate of 80.06%
[7] | Word2vec | Algerian dialect corpus | A semi-supervised approach for sentiment analysis of Arab(ic+izi) messages: application to the Algerian dialect | The best results obtained are up to 80.58% (F1 score)
[8] | Hybrid feature selection method based on normalized difference measure and binary Jaya optimization algorithm (NDM-BJO) | 10 News group text corpus | Optimal feature subset selection using hybrid binary Jaya optimization algorithm for text classification | SVM, NB; accuracy of 92.5% for 5648 features and 97.8% for 300 features
[9] | Enhanced binary grey wolf optimizer | Corpus of Arabic texts | Feature selection using binary grey wolf optimizer with elite-based crossover for Arabic text classification | The best results obtained with SVM are up to 96% (F-measure)
[10] | Machine learning techniques | Corpus of English novels | Comparing multiple categories of feature selection methods for text classification | The macro-averaged F-measures are 0.93, 0.94, 0.89, and 0.90; the kappa coefficients are 0.93, 0.94, 0.88, with the increase in the number of selected features
[11] | Neural network model representation embedding | Tweet dataset | NLP_Passau at SemEval-2020 task 12: multilingual neural network for offensive language detection in English, Danish and Turkish | CNN, NN; F1 scores of up to 90.88% and 76.76%


Table 1 (continued)

Author | Method | Dataset | Research title | Summary
[12] | Neural network models | Chinese text | An intelligent CNN-BiLSTM approach for Chinese sentiment analysis on Spark | Model (CNN-BiLSTM); the experiment gets 87.1% accuracy
[13] | Neural network models | Arabic channel news comments dataset | Classification of cyberbullying text in Arabic | F1 scores of up to 84%
[14] | ABC algorithm with elite opposition-based learning strategy (EOABC) | Benchmark test functions | A survey on the studies employing machine learning (ML) for enhancing artificial bee colony (ABC) optimization algorithm | Presented a survey on studies of improving the ABC using ML
[15] | Neural network models | Corpus of Arabic texts | Deep attention-based review level sentiment analysis for Arabic reviews | Using deep learning ANN
[16] | New standard error-based artificial bee colony (SEABC) algorithm | Thirteen datasets from the UCI machine learning datasets | A new standard error based artificial bee colony algorithm and its applications in feature selection | Using artificial bee colony algorithm
[17] | Multi-objective artificial bee colony-based feature weighting technique for naïve Bayes | Twenty benchmark UCI datasets | Feature weighting for naïve Bayes using multi objective artificial bee colony algorithm | Using multi-objective artificial bee colony algorithm for feature weighting

evaluate and compare the results. CountVectorizer breaks down a sentence, paragraph, or any text into words and converts the words into a multidimensional matrix; this matrix is the training data for the classifiers used in forward feature selection with the machine learning algorithms. The experiments were run on Windows 10 20H2 with an Intel(R) Core(TM) i7-3520M processor and 12 GB of RAM.
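The forward feature selection strategy mentioned above can be sketched as a greedy loop. This is a minimal illustration and not the authors' implementation: the names `forward_selection` and `toy_score` are hypothetical, and in the chapter's setting the score would be the accuracy of a classifier (KNN, SVM, NB, or PNN) trained on the selected columns.

```python
from typing import Callable, List, Sequence


def forward_selection(candidates: Sequence[int],
                      score: Callable[[List[int]], float]) -> List[int]:
    """Greedy forward selection: add one feature at a time while the score improves."""
    selected: List[int] = []
    remaining = list(candidates)
    best = float("-inf")
    while remaining:
        # Score every remaining feature when added to the current subset.
        trials = [(score(selected + [f]), f) for f in remaining]
        top_score, top_feature = max(trials)
        if top_score <= best:
            break  # no remaining feature improves the score
        selected.append(top_feature)
        remaining.remove(top_feature)
        best = top_score
    return selected


def toy_score(features: List[int]) -> float:
    """Hypothetical score: features 2 and 5 help, every feature has a small cost."""
    useful = {2, 5}
    return len(useful & set(features)) - 0.1 * len(features)
```

With the toy score, `forward_selection(range(8), toy_score)` keeps only the two useful features and stops as soon as adding a third one lowers the score.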

3.2 Data Preparation

This paper uses two datasets, as shown in Fig. 1. The first is the Jordanian dialect sentiment corpus: 3000 notes written specifically in the Jordanian Arabic dialect and collected from different telecommunication companies. The dataset was collected from


Fig. 1 The proposed model


Table 2 The characteristics of the Jordanian dialect sentiment corpus

Number of instances | 3000
Number of positive notes | 1116
Number of negative notes | 1884
Topics | Reviews and feedback from customers' notes
Language | Jordanian dialects (AD)
Annotation | Manual (by expert native speakers)
Predicted attribute | Class of opinion polarity (positive, negative)
Count of words | 1631
Count of stem words | 847

Jordanian telecom company notes written by call center employees during customers' calls with the call center; the employees summarize the calls they receive as notes. The characteristics of the dataset are given in Table 2. The second dataset, the Algerian dialect sentiment corpus, consists of articles on politics, news, sports, religion, and society selected from Algerian Arabic newspaper websites. The characteristics of the dataset are given in Table 3.

3.3 Data Annotation

Each dataset has been divided into two categories, positive and negative, and annotated by a group of experts. Each category is linked to a number to facilitate the classification process: 1 for positive and 2 for negative. Table 4 shows a sample of the Jordanian dialect sentiment corpus, and Table 5 shows a sample of the Algerian dialect sentiment corpus.

Table 3 The characteristics of the Algerian dialect sentiment corpus

Number of instances | 5630
Number of positive notes | 3046
Number of negative notes | 2584
Topics | Articles extracted from news, political, religion, sports, and society
Language | Algerian Arabic dialects (AD), Modern Standard Arabic (MSA)
Annotation | Manual (by expert native speakers)
Predicted attribute | Class of opinion polarity (positive, negative)
Count of words | 9468
Count of stem words | 3848

Table 4 Dataset example

Polarity | Note
1 | التغطية صارت ممتازة بضاحية الرشيد
1 | سرعة الانترنت صارت منيحة بتلاع العلي
1 | العروض الجديدة مشجعة
1 | التطبيق سهل علينا كثير
2 | عم بحاول احول تعرفت خطي لخط الكل بلكل و بعطيني يرجى المحاولة لاحقا
2 | النت بضل يفصل مع انو معي حزم كثير
2 | برن علي رقم خاص و بضل يزعجني
2 | حكيت معكم كتير و هاي المكالمة العشرين و ما حدا حللي مشكلتي

Table 5 Dataset example

‫‪Polarity‬‬

‫‪Articles‬‬

‫‪2‬‬

‫ﺷﻲء ﻋﺠﻴﺐ ﻭ ﺍﻟﻠﻪ ﺍﻥ ﻳﻜﻮﻥ ﻣﻨﺎﻇﻞ ﻛﺒﻴﺮ ﻛﻤﺎ ﻳﻘﺎﻝ ﻭ ﺭﺋﻴﺲ ﺣﻜﻮﻣﺔ ﻳﺠﻬﻞ ﻣﻜﺎﻧﺔ ﺍﻟﺸﻴﺦ ﺍﻻﺑﺮﺍﻫﻤﻲ ﻓﻲ ﺍﻟﻌﺎﻟﻢ‬ ‫ﺍﻻﺳﻼﻣﻲ ﻭ ﻣﺎ ﻗﺪﻣﻪ ﻟﻠﺜﻮﺭﺓ ﺍﻟﺠﺰﺍﺋﺮﻳﺔ ﻭ ﻛﻼﻣﻪ ﻭ ﺑﻴﺎﻧﺎﺗﻪ ﻣﻌﺮﻭﻓﺔ ﻭ ﻣﻨﺸﻮﺭﺓ ﺑﺎﻣﻜﺎﻥ ﺍﻱ ﺍﻧﺴﺎﻥ ﺍﻻﻃﻼﻉ ﻋﻠﻴﻬﺎ ﻭ‬ ‫ﻋﻠﻰ ﺗﺎﺭﻳﺦ ﺍﺻﺪﺍﺭﻫﺎ ﻭ ﻗﺪ ﺍﺳﺘﻐﻞ ﺍﺣﺪ ﺍﻟﺼﺤﺎﻓﻴﻴﻦ ﺍﻟﺘﺎﻓﻬﻴﻦ ﻛﻼﻡ ﺍﻟﺴﻴﺪ ﺑﻠﻌﻴﺪ ﻭ ﺑﺪﺃ ﻳﻠﻮﻙ ﻛﻼﻡ ﺍﻟﺘﺸﻔﻲ ﻭ ﺍﻻﻧﺘﻘﺎﺹ‬ ‫ﻟﻼﻣﺎﻡ ﺍﻟﺒﺸﻴﺮ ﺍﻻﺑﺮﺍﻫﻤﻲ ﻫﺬﺍ ﺍﻻﻣﺎﻡ ﺍﻟﺬﻱ ﻛﺎﻧﺖ ﺗﺘﺠﻨﺒﻪ ﻋﻴﻮﻥ ﺍﻟﻌﻠﻤﺎﺀ ﺍﻣﺜﺎﻝ ﺍﻟﻌﻘﺎﺩ ﻭ ﻃﻪ ﺣﺴﻴﻦ ﻓﻲ ﺍﺭﻭﻗﺔ ﻣﺠﻤﻊ‬ ‫ﺍﻟﻠﻐﺔ ﺍﻟﻌﺮﺑﻴﺔ ﺑﺎﻟﻘﺎﻫﺮﺓ ﺗﻘﺪﻳﺮﺍ ﻭ ﺍﺣﺘﺮﺍﻣﺎ ﻟﻌﻠﻤﻪ ﺍﻟﻜﺒﻴﺮ ﻭ ﺗﺒﺤﺮﻩ ﻻ ﻣﺘﻨﺎﻫﻲ ﻓﻲ ﺍﻟﻠﻐﺔ ﻭ ﺍﻻﺩﺏ ﻳﻬﺎﻥ ﻫﺬﻩ ﺍﻟﻤﺮﺓ ﻋﻠﻰ‬ ‫ﺍﻳﺪﻱ ﺍﻃﻔﺎﻝ ﻓﻲ ﺍﻟﻌﻠﻢ ﻭ ﺍﻟﻔﻜﺮ ﻛﻢ ﺗﻤﻨﻴﺖ ﻟﻮ ﻛﺎﻥ ﺍﻻﺑﺮﺍﻫﻤﻲ ﻣﺼﺮﻳﺎ ﻟﺮﺃﻳﻨﺎ ﺍﻟﻌﺠﺐ ﺍﻟﻌﺠﺎﺏ ﻓﻲ ﺗﻘﺪﻳﺮﻩ ﻭ ﺍﺣﺘﺮﺍﻣﻪ ﻭ‬ ‫ﺭﺑﻤﺎ ﻟﻘﺐ ﻣﻦ ﻃﺮﻑ ﺍﻟﻤﺼﺮﻳﻴﻦ ﺑﻤﻠﻚ ﺍﻟﺒﻴﺎﻥ ﺍﻟﻌﺮﺑﻲ ﻭ ﺟﻌﻠﻮﺍ ﻟﻪ ﺗﻤﺜﺎﻻ ﻳﻨﺎﻓﺲ ﺗﻤﺜﺎﻝ ﻃﻪ ﺣﺴﻴﻦ ﻓﻲ ﺍﻟﺠﺎﻣﻌﺔ‬ ‫ﺍﻟﻤﺼﺮﻳﺔ ﻟﻜﻦ ﻋﻨﺪﻧﺎ ﺣﻴﺚ ﺍﻟﺠﻬﻞ ﻭ ﺍﻟﺮﻛﺎﻛﺔ ﺍﻟﻠﻐﻮﻳﺔ ﻻ ﺑﺪ ﻣﻦ ﺍﻧﺘﻘﺎﺹ ﻣﻦ ﻗﻴﻤﺘﻪ ﻭ ﺟﻬﺎﺩﻩ ﻭ ﻣﻦ ﺧﻼﻝ ﺗﺘﺒﻌﻲ‬ ‫ﻟﻤﺴﻴﺮﺓ ﺗﺎﺭﻳﺨﻨﺎ ﺍﻟﻤﻌﺎﺻﺮ ﻻﺣﻈﺖ ﺍﻥ ﻣﻌﻈﻢ ﻣﺴﺆﻭﻟﻴﻨﺎ ﺭﻣﺖ ﺑﻬﻢ ﺍﻟﺼﺪﻓﺔ ﺍﻟﻰ ﻭﺍﺟﻬﺔ ﺍﻟﺤﻜﻢ ﻭ ﻟﻴﺲ ﻛﻤﺎ ﻫﻮ ﻋﻨﺪ‬ ‫ﻏﻴﺮﻧﺎ ﺣﻴﺚ ﺍﻟﻜﻔﺎءﺔ ﻭ ﺍﻟﻌﻠﻢ ﻭ ﺍﻟﻨﺰﺍﻫﺔ ﻫﻲ ﻣﻴﺰﺍﻥ ﺍﻻﺧﺘﻴﺎﺭ ﻭ ﻟﻠﺤﻘﻴﻘﺔ ﺍﻥ ﻣﻌﻈﻢ ﻭﺯﺭﺍﺀ ﺍﻟﺴﺒﻌﻴﻨﺎﺕ ﻛﺎﻧﻮﺍ ﻇﻞ ﻟﻼﺥ‬ ‫ﺑﻮﻣﺪﻳﻦ ﻓﻬﻮ ﻣﻦ ﻗﺎﻡ ﺑﻜﻞ ﺷﻲء ﻓﻲ ﺍﻟﻤﺠﺎﻝ ﺍﻟﺴﻴﺎﺳﻲ ﻭ ﺍﻻﻗﺘﺼﺎﺩﻱ ﻭ ﺍﻻﺟﺘﻤﺎﻋﻲ ﻭ ﻫﻢ ﻣﺠﺮﺩ ﺩﻣﻰ ﻣﺘﺤﺮﻛﺔ‬ ‫ﻻﻭﺍﻣﺮﻩ ﺍﻟﻨﺎﻓﺬﺓ ﻭ ﺍﻟﺪﻟﻴﻞ ﺍﻥ ﻫﺆﻻﺀ ﻋﻨﺪﻣﺎ ﺭﺟﻌﻮﺍ ﻟﻠﺤﻜﻢ ﻣﺮﺓ ﻟﻢ ﻳﻘﺪﻣﻮﺍ ﺷﻲء ﻳﺬﻛﺮ ﻭ ﺍﻻﺥ ﺑﻠﻌﻴﺪ ﺍﻟﺬﻱ ﻟﻘﺐ ﺑﺄﺏ‬ ‫ﺍﻟﺼﻨﺎﻋﺔ ﺍﻟﺜﻘﻴﻠﺔ ﻛﺎﻥ ﻣﻦ ﺍﻟﻤﻔﺮﻭﺽ ﺍﻥ ﻳﻨﺴﺐ ﻟﺒﻮﻣﺪﻳﻦ ﻓﻬﻮ ﺻﺎﺣﺐ ﺍﻟﻔﻜﺮﺓ ﻭ ﺍﻟﻔﻀﻞ ﻟﻢ ﻳﺴﺘﻄﻊ ﺍﻧﻘﺎﺫ ﻣﺼﻨﻊ ﻭﺍﺣﺪ‬ ‫ﺻﻐﻴﺮ ﻭ ﺍﺿﻄﺮ ﻟﺒﻴﻌﻪ ﻟﻠﺨﻮﺍﺹ ﻻ ﺑﺪ ﻣﻦ ﺍﻋﺎﺩﺓ ﻗﺮﺍءﺔ ﺗﺎﺭﻳﺨﻨﺎ ﺑﻌﻴﻮﻥ ﻧﺎﻗﺪﺓ ﻭ ﻭﺍﻋﻴﺔ ﺗﻌﺘﻤﺪ ﻓﻘﻂ ﻋﻠﻰ ﺍﻟﺤﻘﺎﺋﻖ‬ ‫ﺍﻟﺪﺍﻣﻐﺔ ﺍﻟﺘﻲ ﺗﺴﻨﺪﻫﺎ ﺍﻟﻮﺛﺎﺋﻖ ﻭ ﻟﻴﺲ ﻋﻠﻰ ﺍﻻﺑﺎﻃﻴﻞ‬ ‫ﻛﻠﻤﺎ ﺃﻃﺎﻝ ﺍﻟﻠﻪ ﻓﻲ ﻋﻤﺮﻱ ﺃﺗﺄﻛﺪ ﺑﻤﺎ ﻻ ﻳﺪﻉ ﻣﺠﺎﻻ ﻟﻠﺸﻚ ﺃﻥ ﺟﻞ ﻣﻦ ﻗﺎﺩﻭﺍ ﺍﻟﺠﺰﺍﺋﺮ ﺑﻌﺪ ﺍﻻﺳﺘﻘﻼﻝ ﺍﻟﻰ ﻳﻮﻣﻨﺎ ﻫﺬﺍ ﻫﻢ‬ ‫ﺃﻗﺮﺏ ﺍﻟﻰ ﺍﻟﺠﻬﻞ ﺍﻟﻤﺮﻛﺐ ﺃﻭ ﺍﻟﻌﻤﺎﻟﺔ ﺑﻞ ﺍﻻﻧﺒﻄﺎﺡ ﺍﻟﻰ ﻓﺮﻧﺴﺎ ﺍﻻﺳﺘﻌﻤﺎﺭﻳﺔ ﻓﻼ ﺍﺗﺼﻮﺭ ﻛﻴﻒ ﻟﻤﺠﺎﻫﺪ ‪ -‬ﻛﻤﺎ ﻳﻘﻮﻟﻮﻥ‬ ‫ ﻭﺭﺋﻴﺲ ﺣﻜﻮﻣﺔ ﻟﻠﺪﻭﻟﺔ ﺍﻟﺠﺰﺍﺋﺮﻳﺔ ﺍﻟﻤﺴﺘﻘﻠﺔ ﻳﺘﺴﻢ ﺑﻬﺬﺍ ﺍﻟﺠﻬﻞ ﺍﻟﻤﺮﻛﺐ ﻭﻧﻠﻮﻡ ﺍﻟﺤﺮﺍﻗﻴﻦ ﻭﻧﻜﻔﺮ ﺍﻟﻤﻨﺘﺤﺮﻳﻦ ﻭﻧﺴﺠﻦ‬‫ﺃﺻﺤﺎﺏ ﺍﻟﺮﺃﻱ ﺍﻵﺧﺮ ﺃﻟﻴﺲ ﻣﺎ ﻧﺤﻦ ﻋﻠﻴﻪ ﺍﻵﻥ ﻫﻮ ﺛﻤﺮﺓ ﻣﺎ ﺯﺭﻋﻪ ﺃﻣﺜﺎﻝ ﻫﺆﻻﺀ ﺍﻟﻤﺘﺨﻠﻔﻮﻥ ﻋﻘﻠﻴﺎ ﻭﺍﻟﻤﻨﺒﻄﺤﻮﻥ ﻣﻨﺬ‬ ‫ﺯﻣﻦ ﻭﺍﻟﺸﻴﺎﺗﻮﻥ ﺣﺎﻟﻴﺎ ﻟﻚ ﺍﻟﻠﻪ ﻳﺎﺟﺰﺍﺋﺮ‬ ‫ﻣﻨﺎﻓﻘﻮﻥ ﺑﻼ ﻋﻨﻮﺍﻥ ﻫﺆﻻﺀ ﻻ ﻳﺴﺘﺤﻮﻥ ﻣﺎﺯﺍﻟﻮﺍ ﻳﺴﺘﺒﻐﻠﻮﻥ ﺍﻟﺸﻌﺐ ﻭﻳﻜﺬﺑﻮﻥ ﻋﻠﻴﻪ ﺍﻟﺮﺟﻞ ﻓﻲ ﺣﺎﻟﺔ ﻭﺍﻟﻨﺎﺯﻋﺎﺕ ﻏﺮﻗﺎ‬ ‫ﻳﺮﻛﺰﻭﻥ ﻋﻠﻰ ﺇﻇﻬﺎﺭ ﺻﻮﺭﺗﻪ ﻟﻠﺸﻌﺐ ﻭﻛﺄﻧﻪ 
ﻫﻮ ﻣﻦ ﻳﺤﻜﻢ ﻭﻳﺪﻳﺮ ﺷﺆﻭﻥ ﺍﻟﺒﻠﺪ ﻭﺍﻟﻠﻪ ﻻ ﺗﺴﺘﺤﻮﻥ ﻋﻠﻰ ﺍﺭﻭﺍﺣﻜﻢ‬ ‫ﻭﺍﻟﻠﻪ ﻟﻮ ﻛﻨﺎ ﻓﻲ ﺩﻭﻟﺔ ﺍﻟﻌﺪﺍﻟﺔ ﺍﻟﻤﺴﺘﻘﻠﺔ ﻭﺩﻭﻟﺔ ﺍﻟﺤﻖ ﻭﺍﻟﻘﺎﻧﻮﻥ ﻟﺤﻮﻛﻢ ﻫﺆﻻﺀ ﻋﻠﻰ ﻣﺒﻠﻎ ‪ 800‬ﻣﻠﻴﺎﺭ ﺩﻭﻻﺭ ﺃﻳﻦ ﺫﻫﺒﺖ‬ ‫ﻭﺃﻳﻦ ﺻﺮﻓﺖ ﻣﺎﺩﻣﻨﺎ ﻧﺨﺎﻑ ﻣﻦ ﻇﻠﻨﺎ ﺗﺠﻤﻌﻨﺎ ﺍﻟﺰﺭﻧﺔ ﻋﻨﺪ ﺃﻫﻞ ﺍﻟﺸﺮﻕ ﻭﺗﻔﺮﻗﻨﺎ ﻫﺮﺍﻭﺓ ﺍﻟﺪﺭﻛﻲ ﻓﻠﻦ ﻳﺘﻐﻴﺮ ﺣﺎﻟﻨﺎ‬ ‫ﻗﺪ ﺗﻜﻮﻥ ﻣﻌﻲ ﻭﻗﺪ ﺗﻜﻮﻥ ﺿﺪﻱ ﻓﻴﻤﺎ ﺃﻗﻮﻟﻪ ﻳﺎ ﺳﻴﺪ ﺳﻌﺪ ﺑﻮﻋﻘﺒﺔ ﻫﺆﻻﺀ ﺍﻟﻤﻨﺎﺿﻠﻴﻦ ﻳﺘﺮﺑﺼﻮﻥ ﺍﻟﻔﺮﺻﺔ ﻟﻠﺬﻫﺎﺏ ﺑﻌﻴﺪ‬ ‫ﻭﻫﻢ ﻋﻠﻰ ﻋﻠﻢ ﺑﺄﻧﻬﻢ ﻻﻳﺴﺘﻄﻴﻌﻮﻥ ﺗﺄﺩﻳﺔ ﺭﺑﻊ ﺍﻟﻤﻬﺎﻡ ﺍﻟﺘﻲ ﺗﺴﻨﺪ ﺍﻟﻴﻬﻢ ﻣﺜﻞ ﻣﻦ ﻛﺎﻧﻮﺍ ﻳﺼﻔﻘﻮﻥ ﻟﻪ ﻭﻫﻮ ﻳﻘﺬﻑ‬ ‫ﺍﻟﻤﻨﺎﺿﻠﻴﻦ ﺍﻷﻗﺤﺎﺡ ﻭﻳﺰﻏﺮﺩﻭﻥ ﻭﻳﺼﻔﻘﻮﻥ ﻟﻮ ﻛﺎﻧﻮﺍ ﻓﻌﻼ ﻣﻨﺎﺿﻠﻴﻦ ﻣﻦ ﺃﺟﻞ ﺍﻟﺒﻼﺩ ﻭﺍﻟﻌﺒﺎﺩ ﻛﺎﻥ ﺍﻷﺟﺪﺭ ﺑﻬﻢ ﺃﻥ‬ ‫ﻳﺴﺘﻘﻴﻠﻮﺍ ﻣﻨﺎ ﻣﻨﺎﺻﺒﻬﻢ ﻷﻧﻬﻢ ﻏﻴﺮ ﻣﻘﺘﻨﻌﻴﻦ ﺑﻤﺎ ﺣﺪﺙ ﺍﻟﺬﻳﻦ ﺻﻔﻘﻮﺍ ﻟﺴﻴﺪﻫﻢ ﺳﻴﺼﻔﻘﻮﻥ ﻟﺴﻴﺪ ﺃﺧﺮ ﻓﻘﻂ ﺃﻏﺎﺿﻬﻢ‬ ‫ﺳﻴﺪﻫﻢ ﺍﻷﻭﻝ ﻷﻧﻬﻢ ﻛﺎﻧﻮﺍ ﻳﺘﻤﻨﻮﻥ ﺃﻥ ﻳﻜﻮﻧﻮﺍ ﻣﺸﺮﻋﻴﻦ ﻓﻲ ﺍﻟﺒﺮﻟﻤﺎﻥ ﺷﻜﺮ ﺍﻷﺥ ﺳﻌﺪ ﺃﻋﺎﻧﻚ ﺍﻟﻠﻪ ﻟﻠﺮﺩ ﻋﻠﻰ ﻣﻨﺎﺿﻠﻴﻦ‬ ‫ﻳﻔﻜﺮﻭﻥ ﻓﻘﻂ ﻓﻲ ﺃﻧﻔﺴﻬﻢ‬ ‫ﺍﺭﻳﺪ ﺍﻥ ﺍﻧﺒﻪ ﺍﻟﺴﻴﺪﺓ ﻧﺠﻼﻭﻯ ﺍﻥ ﺍﻟﻤﻘﺎﻝ ﺍﻟﺬﻯ ﺗﻨﺒﺎ ﻓﻴﻪ ﺍﻟﺴﻴﺪ ﺑﻮﻋﻘﺒﺔ ﺑﺮﺣﻴﻞ ﺍﻟﺴﻴﺪ ﺳﻌﺪﺍﻧﻰ ﻛﺎﻥ ‪ 05‬ﺍﻳﺎﻡ ﻗﺒﻞ ﺣﺪﻭﺙ‬ ‫ﺍﻟﺤﺪﺙ ‪ -‬ﻭﻫﺬﺍ ﻳﺪﻝ ﻋﻠﻰ ﺍﻥ ﺍﻟﺴﻴﺪ ﺑﻮﻋﻘﺒﺔ ﻫﻮ ﺍﻛﺒﺮ ﻣﻦ ﺍﻥ ﻳﺘﻨﺎﻭﻟﻪ ﺍﺣﺪ ﺑﺴﻮﺀ ﻫﻮ ﻓﻰ ﺧﺪﻣﺔ ﺍﻟﻘﺮﺍﺀ ﻭﻟﻮ ﻛﺎﻥ ﻳﺮﻳﺪ‬ ‫ﺍﻟﺠﺎﻩ ﻭﺍﻟﻤﺎﻝ ﻟﻜﺎﻥ ﻟﻪ ﺫﻟﻚ ﻣﻨﺬ ﺯﻣﻦ ﻃﻮﻳﻞ ﻭﻫﻮ ﻋﻨﺪﻣﺎ ﺗﺤﺪﺙ ﻋﻦ ﺍﻟﺴﻴﺪ ﺳﻌﺪﺍﻧﻰ ﻓﺎﻧﻤﺎ ﻋﺒﺮ ﻋﻤﺎ ﻳﺪﻭﺭ ﻓﻰ ﻣﺨﻴﻠﺔ‬ ‫ﺍﻟﻘﺮﺍﺀ ﻭﺍﻗﻮﻝ ﻟﻠﺴﻴﺪﺓ ﺍﻟﻔﺎﺿﻠﺔ ﺍﻥ ﺣﺰﺏ ﺟﺒﻬﺔ ﺍﻟﺘﺤﺮﻳﺮ ﻳﻌﺞ ﺑﺎﻟﻤﻨﺎﺿﻠﻴﻦ ﺍﻻﻛﻔﺎﺀ ﻭﻋﻠﻴﻜﻢ ﻣﻦ ﺍﻻﻥ ﺗﺼﺤﻴﺢ‬ ‫ﺍﻻﻭﺿﺎﻉ ﻭﺍﻥ ﺑﺪﺍ ﻟﻜﻢ ﺍﻻﻣﺮ ﻻ ﻳﺴﺘﺤﻖ ﺫﻟﻚ ﻓﺎﻧﺰﻟﻮﺍ ﺍﻟﻰ ﺍﻟﺸﺎﺭﻉ ﻭﺍﺳﻤﻌﻮﺍ ﺣﺪﻳﺚ ﺍﻟﺸﻌﺐ‬

‫‪2‬‬

‫‪2‬‬

‫‪1‬‬

‫‪1‬‬


The Algerian dataset contains articles that express human feeling in a positive or negative state; it is used because this paper needs long paragraphs to train the proposed model.

3.4 Preprocessing

3.4.1 Tokenization

Tokenization is the process of converting text into tokens before transforming it into vectors; it also makes it easier to filter out unnecessary tokens. For example, a document can be split into paragraphs, or sentences into words. In this work, tokenization splits sentences into words, as shown in the pre-processing phase of Fig. 1, using CAMeL Tools, a Python toolkit for Arabic natural language processing (ANLP) [18].

3.4.2 Text Pre-processing

The main task is to remove non-meaningful content, which is important for text classification because it reduces error and increases accuracy. Each file of the corpus was subjected to the following procedure, as shown in the pre-processing phase of Fig. 1:

• Delete digits, punctuation marks and numbers.
• Delete all non-Arabic characters.
• Delete stop-words and non-useful words such as pronouns, articles, and prepositions.
• Change the letter "ى" to "ي".
• Change the letter "ة" to "ه".
• Change the letters "آ", "إ", "ؤ", "ئ", "أ" to "ا".
• Delete characters that confuse the classification process [19].
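The cleaning steps listed above can be sketched with the Python standard library. This is an illustrative sketch rather than the authors' code: the function name `normalize` and the `stop_words` argument are assumptions, the stop-word list is expected to be already normalized, and the Arabic letter range U+0621 to U+064A is one reasonable choice for "Arabic characters".

```python
import re

# Anything that is not an Arabic letter (U+0621..U+064A) or whitespace:
# digits, punctuation, Latin text, and diacritics are all dropped.
NON_ARABIC = re.compile(r"[^\u0621-\u064A\s]")
# The five letters mapped to plain alef: آ أ ؤ إ ئ
ALEF_LIKE = re.compile("[\u0622\u0623\u0624\u0625\u0626]")


def normalize(text, stop_words=frozenset()):
    """Apply the listed cleaning steps and return the cleaned text."""
    text = NON_ARABIC.sub(" ", text)
    text = text.replace("\u0649", "\u064A")  # ى -> ي
    text = text.replace("\u0629", "\u0647")  # ة -> ه
    text = ALEF_LIKE.sub("\u0627", text)     # -> ا
    # Stop words are compared after normalization.
    return " ".join(tok for tok in text.split() if tok not in stop_words)
```

For example, a note ending in digits and punctuation is reduced to its Arabic words only, with the final "ة" written as "ه".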

3.4.3 Stemming

Stemming is the process of reducing inflected words to one root or stem by removing suffixes, prefixes, and infixes; one type of stemming is statistical [20]. Stemming of Arabic words is implemented with CAMeL Tools for Arabic natural language processing (ANLP) in Python, as shown in the pre-processing phase of Fig. 1. CAMeL Tools is a collection of open-source utilities for pre-processing, morphological modeling, dialect identification, named entity recognition, and sentiment analysis [18].

3.4.4 Text to Numeric Data Representation

The CountVectorizer algorithm is used to encode the presence of words (see the example in Table 6) and to calculate the matrix of numeric values for each word within each review text, as shown in Fig. 2 [21].

Table 6 Count vectorizer example

Words | Vector No
يفصل | 898
يوجد | 901
ينخصم | 899
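What a count vectorizer computes can be illustrated in a few lines of plain Python. This is a simplified stand-in, not scikit-learn's `CountVectorizer`: the helper names `fit_vocabulary` and `transform` are hypothetical, tokens are split on whitespace only, and column indices are assigned in sorted token order, which is consistent with the indices shown in Table 6.

```python
from collections import Counter


def fit_vocabulary(docs):
    """Map every distinct token to a column index, in sorted token order."""
    tokens = sorted({tok for doc in docs for tok in doc.split()})
    return {tok: idx for idx, tok in enumerate(tokens)}


def transform(docs, vocab):
    """Return one row of token counts per document (a document-term matrix)."""
    rows = []
    for doc in docs:
        counts = Counter(doc.split())
        rows.append([counts.get(tok, 0) for tok in vocab])
    return rows


docs = ["net slow slow", "net good"]
vocab = fit_vocabulary(docs)      # {'good': 0, 'net': 1, 'slow': 2}
matrix = transform(docs, vocab)   # [[0, 1, 2], [1, 1, 0]]
```

Each row of `matrix` corresponds to one note, and each column to one word of the vocabulary, which is the numeric form passed on to the classifiers.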

3.4.5 Most Affective Jordanian Words

See Table 7. Figure 3 shows how the steps of the pre-processing phase lead to the numeric data representation produced by CountVectorizer.

Fig. 2 Example of matrix of numeric value for words

3.5 Modified Artificial Bee Colony Algorithm with Upper Confidence Bound Algorithm

3.5.1 The Original Artificial Bee Colony Algorithm

Karaboga [22] has defined swarm intelligence as "any attempt to design algorithms or distributed problem-solving devices inspired by the collective behavior of social insect colonies and other animal societies". A honey bee swarm shows a special intelligent behavior, foraging, and this foraging behavior was used to


Table 7 Most affective Jordanian dialect words in classifiers: Features Feature: Feature: Feature: Feature: Feature: Feature: Feature: Feature: Feature: Feature: Feature: Feature: Feature: Feature: Feature: Feature: Feature: Feature: Feature: Feature: Feature: Feature: Feature: Feature: Feature: Feature: Feature: Feature: Feature: Feature: Feature: Feature: Feature: Feature: Feature: Feature: Feature: Feature: Feature: Feature:

750 728 626 605 598 597 306 153 211 223 246 261 272 294 306 315 318 321 337 348 401 402 403 455 467 492 498 575 576 587 581 626 637 641 644 656 733 740 751 760

Words

Score

‫ﻣﺎ‬ ‫ﻻ‬ ‫ﻋﺎﻟﺘﻄﺒﻴﻖ‬ ‫ﺻﺎﺭﺕ‬ ‫ﺷﻜﺮ‬ ‫ﺍﺷﺘﻜﻲ‬ ‫ﻏﺎﻟﻲ‬ ‫ﺧﻄﺄ‬ ‫ﻏﻴ ﺮ‬ ‫ﻣﺶ‬ ‫ﺯﺍﺑﻂ‬ ‫ﺟﻴﺪﺓ‬ ‫ﻣﺸﻜﻠﺔ‬ ‫ﺍﺭﺧﺺ‬ ‫ﺭﺻ ﻴ ﺪﻱ‬ ‫ﺻﻼﺣﻴﺔ‬ ‫ﺍﻟﻐﻴﻬﺎ‬ ‫ﻋﺎﻟﻤﺴﺆﻭﻝ‬ ‫ﺍﻭﻓﺮ‬ ‫ﺑﺎﻷﻗﺴﺎﻡ‬ ‫ﻋﺎﺳﺎﺱ‬ ‫ﺑﻀﻞ‬ ‫ﺑﻄﺊ‬ ‫ﺗﺘﻌﻠﻖ‬ ‫ﻣﻀﻄﺮ‬ ‫ﻣﺤﺘﺮﻣﻴﻦ‬ ‫ﺟﻮﺍﺋﺰ‬ ‫ﺳﺎﻋﺪﻧﻲ‬ ‫ﺳﺮﻋﺔ‬ ‫ﻟﻘ ﻴ ﺖ‬ ‫ﺳﻬﻞ‬ ‫ﺍﻟﺤﻞ‬ ‫ﺍﺣ ﺴﻦ‬ ‫ﺳﻬﻞ‬ ‫ﻋﻠﻴﻨﺎ‬ ‫ﻋ ﻨ ﺪﻱ‬ ‫ﻓﺎﺻﻞ‬ ‫ﻟﻤﺎ‬ ‫ﻣﺎﻛﺲ‬ ‫ﺭﻳﺤﻨﺎ‬

Score: Score: Score: Score: Score: Score: Score: Score: Score: Score: Score: Score: Score: Score: Score: Score: Score: Score: Score: Score: Score: Score: Score: Score: Score: Score: Score: Score: Score: Score: Score: Score: Score: Score: Score: Score: Score: Score: Score: Score:

0.01007 0.28105 0.01048 0.01314 0.21708 0.05050 0.01008 0.01989 0.00226 0.00115 0.00152 0.00186 0.00426 0.00265 0.01008 0.00160 0.00114 0.00152 0.00488 0.00379 0.00371 0.00613 0.00006 0.00267 0.00335 0.00155 0.00666 0.00446 0.00342 0.00471 0.00362 0.01048 0.00479 0.00646 0.00423 0.00317 0.00111 0.00125 0.00179 0.00446 (continued)


Table 7 (continued) Features Feature: Feature: Feature: Feature: Feature: Feature: Feature: Feature: Feature: Feature:

779 780 782 802 804 815 812 828 887 888

Words

Score

‫ﻣﺶ‬ ‫ﻣﺸﺎﻛﻞ‬ ‫ﻣﺸﺠﻌﺔ‬ ‫ﻣﻤﺘﺎﺯ‬ ‫ﺍﻋﻄﻮﻧﻲ‬ ‫ﻓ ﺰﺕ‬ ‫ﻣﻨﻴﺤﺔ‬ ‫ﻧﺰﻟﺖ‬ ‫ﻳﻄﻠﻌﻠﻲ‬ ‫ﻳﻌﻤﻞ‬

Score: Score: Score: Score: Score: Score: Score: Score: Score: Score:

0.00286 0.00458 0.00552 0.01359 0.00505 0.00443 0.00310 0.00472 0.00542 0.00117

establish the new ABC algorithm, which simulates this real-world behavior. ABC algorithms can be used efficiently for solving multimodal and multidimensional optimization problems. The ABC colony has three groups of bees: employed bees, onlookers, and scouts. The first half of the colony consists of employed artificial bees and the second half of onlookers. There is one employed bee for each food source; onlooker bees wait in the hive and decide on a food source to exploit based on the information shared by the employed bees. An employed bee becomes a scout after its food source is depleted [22]. The original ABC algorithm is:

(1) Generate the initial solutions (food sources) randomly
(2) Evaluate the fitness fit(xi) of the population
(3) Set cycle to 1
(4) Repeat
(5) For each employed bee {
    (a) Produce new solution Vi by using (2)
    (b) Calculate its fitness value fit(Vi)
    (c) Apply greedy selection process }
(6) Calculate the probability values Pi for the solutions xi by (3)
(7) For each onlooker bee {
    (a) Select a solution xi depending on Pi
    (b) Produce new solution Vj
    (c) Calculate its fitness value fit(Vj)
    (d) Apply greedy selection process }
(8) If there is an abandoned solution for the scout, then replace it with a new solution, which will be randomly produced by (1)
(9) Memorize the best solution so far
(10) Cycle = cycle + 1
(11) Until cycle = maximum cycle number

Pseudocode 1: ABC algorithm


Fig. 3 Second phase preprocessing

The ABC algorithm, as a swarm intelligence method, is an iterative process. Each solution $x_i$, $i = 1, 2, \ldots, SN$, where SN is the number of solutions, is a D-dimensional vector with components $x_i^j$, $j = 1, 2, \ldots, D$. The food sources are randomly assigned to the SN employed bees according to Eq. (1) and their fitness is evaluated. Then the cycle of the search process of the employed, onlooker, and scout bees is repeated.

$$x_i^j = x_{\min}^j + \mathrm{rand}(0, 1)\,\left(x_{\max}^j - x_{\min}^j\right) \tag{1}$$

In the employed bee phase, a candidate position $v_i$ is produced from the old one by Eq. (2), where $j \in \{1, 2, \ldots, D\}$, $k \in \{1, 2, \ldots, SN\}$, and $\varphi_i^j$ is a random number in [−1, 1]. A candidate food source $v_i$ is produced for every food source $x_i$. Once $v_i$ is obtained, it is evaluated and compared with $x_i$. A greedy selection is applied between $x_i$ and $v_i$: the better one is selected depending on the fitness value, which represents the food amount at that position.

$$v_i^j = x_i^j + \varphi_i^j\,\left(x_i^j - x_k^j\right) \tag{2}$$

Each onlooker bee selects a food source depending on the fitness values obtained from the employed bees, where fit($x_i$) is the fitness value of solution i. The onlooker then produces a new candidate position from the selected food source. The selection probability $p_i$ of each solution is calculated by:

$$p_i = \frac{\mathrm{fit}(x_i)}{\sum_{m=1}^{SN} \mathrm{fit}(x_m)} \tag{3}$$

After the search of the employed and onlooker bees is complete, the ABC algorithm checks whether there is any exhausted source to be abandoned. The scouts can then discover rich, entirely unknown food sources.
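Equations (1) to (3) translate directly into small helper functions. The sketch below is illustrative only: the function names are hypothetical, and excluding k = i when picking the partner source is the usual ABC convention, assumed here rather than stated in the text.

```python
import random


def init_source(lower, upper, rng=random):
    """Eq. (1): a random position inside [lower_j, upper_j] for every dimension j."""
    return [lo + rng.random() * (hi - lo) for lo, hi in zip(lower, upper)]


def neighbour(sources, i, rng=random):
    """Eq. (2): change one random dimension j of source i using another source k."""
    x_i = sources[i]
    j = rng.randrange(len(x_i))
    k = rng.choice([m for m in range(len(sources)) if m != i])  # assumed k != i
    phi = rng.uniform(-1.0, 1.0)
    v = list(x_i)
    v[j] = x_i[j] + phi * (x_i[j] - sources[k][j])
    return v


def selection_probabilities(fitness):
    """Eq. (3): p_i = fit(x_i) / sum of fit(x_m) over all sources."""
    total = sum(fitness)
    return [f / total for f in fitness]
```

For instance, `selection_probabilities([1.0, 1.0, 2.0])` gives `[0.25, 0.25, 0.5]`, so the third food source is twice as likely to attract an onlooker as either of the others.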


The original Artificial Bee Colony algorithm has three control parameters: the number of food sources, the limit value after which a food source that has not improved is abandoned, and MCN, the maximum cycle number [23].
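To show how the three control parameters interact, the following compact sketch runs the original ABC loop (with greedy selection) on a toy minimization problem. It is not the authors' implementation: the function name `abc_minimize`, the fitness transform 1 / (1 + f(x)), the clamping to the bounds, and the sphere test function are assumptions made for illustration.

```python
import random


def abc_minimize(f, lower, upper, sn=10, limit=20, max_cycles=100, seed=0):
    """Original ABC with greedy selection; assumes f(x) >= 0 everywhere."""
    rng = random.Random(seed)
    dim = len(lower)

    def fit(x):
        return 1.0 / (1.0 + f(x))  # higher fitness means lower objective

    def random_source():  # Eq. (1)
        return [lower[j] + rng.random() * (upper[j] - lower[j]) for j in range(dim)]

    xs = [random_source() for _ in range(sn)]
    trials = [0] * sn  # cycles without improvement, per food source

    def try_neighbour(i):  # Eq. (2) followed by greedy selection
        j = rng.randrange(dim)
        k = rng.choice([m for m in range(sn) if m != i])
        v = list(xs[i])
        v[j] += rng.uniform(-1.0, 1.0) * (xs[i][j] - xs[k][j])
        v[j] = min(max(v[j], lower[j]), upper[j])  # keep inside the bounds
        if fit(v) > fit(xs[i]):
            xs[i] = v
            trials[i] = 0
        else:
            trials[i] += 1

    best = min(xs, key=f)
    for _ in range(max_cycles):
        for i in range(sn):  # employed bees
            try_neighbour(i)
        weights = [fit(x) for x in xs]  # proportional to Eq. (3)
        for i in rng.choices(range(sn), weights=weights, k=sn):  # onlookers
            try_neighbour(i)
        stale = max(range(sn), key=lambda i: trials[i])  # scout
        if trials[stale] > limit:
            xs[stale] = random_source()
            trials[stale] = 0
        best = min(xs + [best], key=f)  # memorize the best solution so far
    return best


def sphere(x):
    """Toy objective with its minimum at the origin."""
    return sum(t * t for t in x)
```

Calling `abc_minimize(sphere, [-5.0, -5.0], [5.0, 5.0])` returns a point close to the origin; a smaller `limit` sends scouts out more often, while a larger `max_cycles` gives the employed and onlooker bees more time to refine existing sources.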

3.5.2

Enhancing Artificial Bee Algorithm with Upper Confidence Bound

Upper Confidence Bound (UCB) algorithms, abbreviated UBC in the name of the modified method (ABC-UBC), shift their balance between exploration and exploitation as they gather more information about the environment [24]. Exploration and exploitation are essential for population-based optimization algorithms such as PSO, GA, and DE. Exploration refers to the ability to discover unknown areas of the search space, whereas exploitation is the ability to apply prior knowledge to obtain a better solution [25]. The ABC algorithm searches for the maximum or minimum solution of a problem within the possible search space. The scout bees control the exploration ability, while the employed and onlooker bees provide the exploitation ability. The artificial bee colony is efficient for constrained and multidimensional basic functions, but its local search ability is weak: the convergence rate is poor on complex multimodal functions.

The artificial bee colony algorithm, in equation (2), chooses a food source according to its probability value, based on the roulette wheel method, and a greedy selection is applied between xi and vi. This greedy selection takes place in steps (5)(c) and (6)(d) of the original ABC Pseudocode 1. To improve exploitation, these steps are modified with a rule inspired by the Upper Confidence Bound algorithm. The modification affects four results: mode, mean, median, and standard deviation. UCB adjusts its levels of exploration and exploitation as it gains information about the available actions: when confidence in the estimated best action is low, it explores more, and as confidence grows it favors exploiting the good actions. By adjusting this balance as time progresses, UCB can achieve a better average reward than greedy selection. The UCB rule is

$$A(t) = \arg\max_{a} \left[ Q_t(a) + c \sqrt{\frac{\log t}{N_t(a)}} \right] \tag{4}$$

where N_t(a) is the number of times the action (treatment arm) a has been selected up to time t, and c is the confidence value that controls the level of exploration. The greedy rule is

$$A(t) = \arg\max_{a} Q_t(a) \tag{5}$$

where argmax specifies choosing the action a for which Q_t(a) is maximized, and Q_t(a) is the estimated value of action a at time step t.
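The difference between the two selection rules in Eqs. (4) and (5) can be illustrated with a minimal Python sketch. The function names, the default value c = 2.0, and the convention that an untried action is selected first are assumptions made for illustration; they are not taken from the chapter.

```python
import math


def greedy_action(q):
    """Eq. (5): choose the action with the largest estimated value Q_t(a)."""
    return max(range(len(q)), key=lambda a: q[a])


def ucb_action(q, n, t, c=2.0):
    """Eq. (4): choose the action maximizing Q_t(a) + c * sqrt(log t / N_t(a)).

    Assumption: an action that has never been tried (N_t(a) == 0) is chosen
    first, which avoids a division by zero.
    """
    for a, count in enumerate(n):
        if count == 0:
            return a
    scores = [q[a] + c * math.sqrt(math.log(t) / n[a]) for a in range(len(q))]
    return max(range(len(q)), key=lambda a: scores[a])


# Hypothetical estimates: action 0 looks slightly better but has been tried
# 50 times, while action 1 has been tried only twice.
q = [0.60, 0.55]
n = [50, 2]
t = 52

print("greedy picks", greedy_action(q))
print("UCB picks   ", ucb_action(q, n, t))
```

With these hypothetical numbers the greedy rule keeps the action with the higher estimate, while the UCB rule prefers the rarely tried action because its exploration bonus is larger. Setting c to 0 makes the two rules coincide.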

A. Habeeb et al.

Table 8 Mapping of equation parameters between the greedy and UBC algorithms

UBC parameter   Description                                                             Greedy parameter
Qt(a)           Estimated value of action 'a' at time step 't'                          Qt(a)
argmax          Specifies choosing the action 'a' for which Qt(a) is maximized          argmax
Nt(a)           Number of times that action 'a' has been selected, prior to time 't'
C               Confidence value that controls the level of exploration                 Constant
Qt(a)           Represents the exploitation part of the equation                        Qt(a)

Table 8 shows how the parameters in the equation map from the greedy selection to the UBC selection process. Because the UBC has a high potential for being optimal, it inspired an approach to multi-armed bandit (MAB) problems called the upper confidence bound approach [26]. The modified ABC with UBC is shown in Pseudocode 2. Steps (5)(c) and (7)(d) replace the greedy selection, and the modified selection of a new food source changes the behavior of ABC-UBC by bringing a reinforcement learning rule into the artificial bee colony.

(1) Generate the initial population xi (i = 1, 2, ..., SN)
(2) Evaluate the fitness (fit(xi)) of the population
(3) Set cycle to 1
(4) Repeat
(5) For each employed bee {
    (a) Produce new solution Vi by using (2)
    (b) Calculate its fitness value fit(Vi)
    (c) Apply UBC selection process }
(6) Calculate the probability values Pi for the solution (xi) by (3)
(7) For each onlooker bee {
    (a) Select a solution xi depending on Pi
    (b) Produce new solution Vj
    (c) Calculate its fitness value fit(Vj)
    (d) Apply UBC selection process }
(8) If there is an abandoned solution for the scout, then replace it with a new solution which will be randomly produced by (4)
(9) Memorize the best solution so far
(10) Cycle = cycle + 1
(11) Until cycle = maximum cycle number
Pseudocode 2: modified ABC with UBC

Arabic Text Classification Using Modified Artificial Bee Colony …

3.5.3

Obtain the Number of Feature Selection Using the Modified ABC-UBC

The modified ABC-UBC process finds the minimum subset of features (words) that gives a higher classification accuracy. The initial food sources span a search space whose size equals the number of features. This step determines the number of features used by the wrapper method, forward feature selection, in the proposed model: the optimal features are sought with the minimum number of features, and that number is then applied as the subset size in forward feature selection. Each food source is represented as a bit vector in which a feature is considered if its value is 1 and not considered if its value is 0. For each position in each food source, a random number Ri between 0 and 1 is generated. The position is set to 1, and the feature becomes part of the subset, if Ri is less than the MR value; otherwise it is set to 0 and the feature is not considered. The number of features n is therefore a random variable that controls the size of the subset. The subset of features is evaluated by the classifier, and the resulting classification accuracy is used as the fitness value of the food source. The neighbors of a food source (feature subset) are determined by the employed bees, and the new food source is selected with the UBC algorithm, as indicated in Eq. (4) [27].
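The bit-vector encoding of a food source described above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the function names, the toy vocabulary, and the `evaluate_accuracy` callable (standing in for training and scoring a classifier on the selected features) are all hypothetical.

```python
import random


def random_food_source(n_features, mr, rng):
    """Build one food source: position i is 1 (feature kept) when the
    random draw R_i in [0, 1) is less than the MR value, otherwise 0."""
    return [1 if rng.random() < mr else 0 for _ in range(n_features)]


def selected_features(bits, vocabulary):
    """Return the words whose position in the bit vector is 1."""
    return [word for word, bit in zip(vocabulary, bits) if bit == 1]


def fitness(bits, evaluate_accuracy):
    """Fitness of a food source: the classification accuracy obtained on
    the selected subset. `evaluate_accuracy` is a hypothetical callable
    supplied by the caller; an empty subset is given fitness 0."""
    if not any(bits):
        return 0.0
    return evaluate_accuracy(bits)


rng = random.Random(42)                      # fixed seed for repeatability
vocabulary = ["w1", "w2", "w3", "w4", "w5"]  # placeholder feature names
source = random_food_source(len(vocabulary), mr=0.5, rng=rng)
print(source, selected_features(source, vocabulary))
```

A larger MR value makes each position more likely to be 1, so it yields larger feature subsets on average.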

3.6

Feature Selection

Feature selection (FS) has three objectives: improving the prediction performance of the predictors, providing faster and more cost-effective predictors, and providing a better understanding of the underlying process that generated the data [28]. In the first phase, selecting the optimal features is very important: it means choosing distinctive features from a feature set while excluding extraneous ones [29]. In practice, any combination of machine learning algorithm and search strategy can be used as a wrapper that trains models to find the combination of features giving the best performance. Here we look for a subset of n features, where n is the number of features obtained from the modified ABC-UBC, to optimize the performance of the machine learning classifiers in the next step, and we evaluate the newly trained models with performance metrics. The stopping criterion in the proposed model is the predefined number of features given by ABC-UBC. In addition, Receiver Operating Characteristic (ROC) analysis is used to measure the performance of the classifiers. ROC graphs are used to visualize, organize, and select classifiers based on their performance. The difference between ROC and accuracy is that ROC is helpful in handling unbalanced classes, whereas accuracy is a single number that sums up the performance.
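The wrapper idea with a predefined number of features n as the stopping criterion can be sketched as a greedy forward search. The sketch below is illustrative only: `score` is a hypothetical callable that would, in the setting of this chapter, train a classifier on the given feature indices and return its accuracy, and the toy weights are invented for the example.

```python
def forward_selection(n_features_total, n_select, score):
    """Greedy forward selection.

    Start with no features and, in each round, add the single feature
    whose inclusion gives the best score, until n_select features are
    chosen (the predefined stopping criterion).
    """
    selected = []
    remaining = list(range(n_features_total))
    while remaining and len(selected) < n_select:
        best = max(remaining, key=lambda f: score(selected + [f]))
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy scoring function (assumption): each feature contributes a fixed
# amount, so the search should pick the two largest weights.
weights = [0.10, 0.50, 0.30, 0.05]
chosen = forward_selection(len(weights), 2, lambda s: sum(weights[f] for f in s))
print(chosen)
```

In a real wrapper the scoring call dominates the cost, because each round retrains the classifier once per remaining feature.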

ROC analysis evaluates models using the false positive rate (FPR) and the true positive rate (TPR). These are calculated as $\mathrm{FPR} = FP/N$ and $\mathrm{TPR} = TP/P$, where FP is the number of false positives, TP is the number of true positives, N is the number of negatives, and P is the number of positives. Forward feature selection starts with no features and adds one at a time: all candidate features are evaluated individually, and the feature that results in the best performance is selected [6].
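As a small worked example of the two rates, the following Python sketch computes one ROC point from true and predicted labels. The function name and the toy labels are assumptions for illustration.

```python
def roc_point(y_true, y_pred, positive=1):
    """Return (FPR, TPR) for one set of predictions.

    FPR = FP / N and TPR = TP / P, where N and P are the numbers of
    negative and positive samples in y_true.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    pos = sum(1 for t in y_true if t == positive)
    neg = len(y_true) - pos
    fpr = fp / neg if neg else 0.0
    tpr = tp / pos if pos else 0.0
    return fpr, tpr


# Toy data: 3 positives and 5 negatives.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
fpr, tpr = roc_point(y_true, y_pred)
print(f"FPR = {fpr:.2f}, TPR = {tpr:.2f}")
```

Here two of the three positives are found (TPR = 2/3) and one of the five negatives is wrongly flagged (FPR = 1/5).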

3.7

The Text Classification

Text categorization, or tagging, is the process of assigning text to labeled groups. Text classifiers analyze text and assign labels or tags based on its content [30].

3.7.1

Support Vector Machines Classifier (SVM)

SVM belongs to the nonparametric supervised techniques. It is a binary classifier that separates two classes with a single decision boundary. The most important SVM models for text classification use linear and radial basis functions. Linear classification trains on the dataset and then builds a model that assigns classes or categories [31]. In this model, SVM is used as the text classifier within forward feature selection. In the simplest case, it uses the training data to find an optimal line that separates the data into classes according to the training labels (0 and 1). The learning phase of SVM repeatedly refines the classifier under its constraints until it reaches an optimal decision boundary [31].

3.7.2

K-Nearest Neighbors Classifier (KNN)

KNN belongs to the nonparametric supervised techniques and assumes that samples of a similar class exist in close proximity; it is used for classification problems. In this model, KNN is used as the text classifier within forward feature selection to solve the problem of Arabic sentiment analysis. KNN determines the label of a new sample based on the labels of its nearest neighbors [32].
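The nearest-neighbor rule can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: samples are small count vectors, distance is squared Euclidean, k = 3, and the toy data are invented; none of these choices is specified in the chapter.

```python
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Label a new sample by majority vote among its k nearest neighbors.

    Distance is squared Euclidean distance between feature vectors
    (an assumption; other distance measures are possible).
    """
    distances = sorted(
        (sum((a - b) ** 2 for a, b in zip(x, query)), label)
        for x, label in zip(train_x, train_y)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Toy word-count vectors with two classes (1 and 0).
train_x = [[2, 0], [3, 1], [0, 2], [1, 3]]
train_y = [1, 1, 0, 0]
print(knn_predict(train_x, train_y, [2, 1]))
```

For the query [2, 1], two of the three nearest training samples carry label 1, so the vote returns 1.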

3.7.3

Naïve Bayes Classifier

Naive Bayes is a probabilistic learning method that is commonly used with a multinomial model. Naïve Bayes often relies on a bag-of-words view of a document, keeping the most frequently used words while ignoring the rare ones. The bag-of-words representation serves as the feature extraction method that supplies the data for classification. In this model, it is used as the text classifier within forward feature selection [33].
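A bag-of-words Naive Bayes classifier can be sketched as follows. The sketch assumes a multinomial model with add-one (Laplace) smoothing and skips words that were never seen in training; these choices, the function names, and the toy English tokens are illustrative assumptions rather than details from the chapter.

```python
import math
from collections import Counter, defaultdict


def train_nb(docs, labels):
    """Count words per class from tokenized documents (bag of words)."""
    word_counts = defaultdict(Counter)
    class_counts = Counter(labels)
    vocab = set()
    for tokens, label in zip(docs, labels):
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return word_counts, class_counts, vocab


def predict_nb(model, tokens):
    """Return the class with the highest log posterior.

    Uses add-one smoothing; words outside the training vocabulary are
    ignored (both are assumptions of this sketch).
    """
    word_counts, class_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in class_counts:
        score = math.log(class_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for w in tokens:
            if w in vocab:
                score += math.log(
                    (word_counts[label][w] + 1) / (total_words + len(vocab))
                )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


docs = [["good", "great"], ["good", "nice"], ["bad", "awful"], ["bad", "poor"]]
labels = [1, 1, 0, 0]
model = train_nb(docs, labels)
print(predict_nb(model, ["good", "nice"]), predict_nb(model, ["bad"]))
```

Working with log probabilities avoids numerical underflow when documents contain many words.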

3.7.4

Polynomial Neural Networks Classifier

The Polynomial Neural Network (PNN) is a flexible neural architecture whose classifier algorithm is based on the GMDH method and uses a class of polynomials such as linear, modified quadratic, and cubic. The number of layers can be set during training and testing, and the classifier is able to capture relationships between the words in a sentence. In this model, it is used as the text classifier within forward feature selection to solve the problem of Arabic sentiment analysis [34].

4 Results

4.1

Results Information

This paper aims to extract the polarity of Arabic text in the introduced datasets and to classify these texts using the proposed model (Fig. 1). This section summarizes the experiments and presents the results in tables. Four performance evaluation criteria are used: precision, recall, F1-score, and accuracy.
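The four criteria, together with the macro and weighted averages that appear in the result tables, can be computed as in the following Python sketch. The function name and the toy labels are assumptions for illustration; the layout follows the usual per-label report with macro and weighted averages.

```python
def classification_summary(y_true, y_pred):
    """Per-label precision, recall, F1 and support, plus macro average,
    weighted average (weighted by support), and overall accuracy."""
    labels = sorted(set(y_true))
    per_label = {}
    for lab in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == lab and p == lab)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != lab and p == lab)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == lab and p != lab)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        per_label[lab] = (precision, recall, f1, tp + fn)
    n = len(y_true)
    macro = tuple(
        sum(v[i] for v in per_label.values()) / len(labels) for i in range(3)
    )
    weighted = tuple(
        sum(v[i] * v[3] for v in per_label.values()) / n for i in range(3)
    )
    accuracy = sum(1 for t, p in zip(y_true, y_pred) if t == p) / n
    return per_label, macro, weighted, accuracy


# Toy imbalanced data: 2 samples of label 1 and 8 samples of label 2.
y_true = [1, 1, 2, 2, 2, 2, 2, 2, 2, 2]
y_pred = [1, 2, 2, 2, 2, 2, 2, 2, 2, 2]
per_label, macro, weighted, accuracy = classification_summary(y_true, y_pred)
print("macro recall:", macro[1], "weighted recall:", weighted[1], "accuracy:", accuracy)
```

On imbalanced data the macro average treats both labels equally, while the weighted average is dominated by the larger class, which is why the two rows can differ noticeably in the tables that follow.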

4.2

The Jordanian Dialect Dataset Experiments

4.2.1

The Result of Arabic Text Classifiers with Pre-processing Phase

The results of the classifiers with the pre-processing phase are presented as follows: KNN (Table 9), SVM (Table 10), NB (Table 11), and PNN (Table 12).

Table 9 Result of KNN with pre-processing Models

DataSet

Classifier

Precision for label 1 2

Recall label 1 2

f1-score label 1 2

Accuracy

Pre-processing (Stemming, stop word) CountVectorizer

Jordan Dialect

KNN Macro avg Weighted avg Accuracy

0.82 0.91

0.93 0.96

0.87 0.93

0.98

0.98

1.00

0.97

0.99

0.97 0.98

0.99

Table 10 Result of SVM with pre-processing Models

DataSet

Classifier

Precision for label 1 2

Recall label 1 2

f1-score label 1 2

Accuracy

Pre-processing (Stemming, stop word) CountVectorizer

Jordan Dialect

SVM Macro avg Weighted avg Accuracy

1.00 0.99

0.80 0.90

0.89 0.94

0.99

0.99

0.99

1.00

0.99

0.99

0.99 0.99

Table 11 Result of NB with pre-processing Models

DataSet

Classifier

Precision for label 1 2

Recall label 1 2

f1-score label 1 2

Accuracy

Pre-processing (Stemming, stop word) CountVectorizer

Jordan Dialect

Naïve Bayes Macro avg Weighted avg Accuracy

0.67

0.80

0.73

0.96

0.99

0.97

0.83

0.89

0.85

0.97

0.96

0.96

0.98

0.96

Table 12 Result of PNN with pre-processing Models

DataSet

Classifier

Precision for label 1 2

Recall label 1 2

f1-score label 1 2

Accuracy

Pre-processing (Stemming, stop word) CountVectorizer

Jordan Dialect

PNN Macro avg Weighted avg Accuracy

0.80 0.89

0.80 0.89

0.80 0.89

0.97

0.97

0.99

0.97

0.99

0.99

0.97 0.97

4.2.2

The Result of Arabic Text Classifiers Without Pre-Processing Phase

The results are presented for KNN (Table 13), SVM (Table 14), NB (Table 15), and PNN (Table 16).

Table 13 Result of KNN without pre-processing Models

DataSet

Classifier

Precision for label 1 2

Recall label 1 2

f1-score label 1 2

accuracy

Without pre-processing CountVectorizer

Jordan Dialect

KNN Macro avg Weighted avg Accuracy

0.56 0.78

0.93 0.94

0.70 0.84

0.95

1.00

0.97

0.95

0.97

0.95 0.95

Table 14 Result of SVM without pre-processing Models

DataSet

Classifier

Precision for label 1 2

Recall label 1 2

f1-score label 1 2

Accuracy

Without pre-processing CountVectorizer

Jordan Dialect

SVM Macro avg Weighted avg Accuracy

1.00 0.99

0.80 0.90

0.89 0.94

0.99

0.99

0.99

1.00

0.99

0.99

0.99 0.99

Table 15 Result of NB without pre-processing Models

DataSet

Classifier

Precision for label 1 2

Recall label 1 2

f1-score label 1 2

Accuracy

Without pre-processing CountVectorizer

Jordan Dialect

Naïve Bayes Macro avg Weighted avg Accuracy

0.71

0.80

0.75

0.97

0.99

0.85 0.97

0.98

0.89 0.97

0.98

0.87 0.97 0.97

Table 16 Result of PNN without pre-processing Models

DataSet

Classifier

Precision for label 1 2

Recall label 1 2

f1-score label 1 2

Accuracy

Without pre-processing CountVectorizer

Jordan Dialect

PNN Macro avg Weighted avg Accuracy

0.87 0.93 0.98

0.87 0.93 0.98

0.87 0.93 0.98

0.98

0.99

0.99

0.98

0.99

4.2.3

The Result of Arabic Text Using Forward Feature Selection with ABC-UBC and Pre-Processing Phase

The results are presented for KNN (Table 17 and Fig. 4), SVM (Table 18 and Fig. 5), NB (Table 19 and Fig. 6), and PNN (Table 20 and Fig. 7).

Table 17 Result of KNN using forward feature selection with ABC-UBC and pre-processing phase Models

DataSet

ABC UBC Fno

Feature selection with ABC-UBC results with pre-processing

Jordan Dialect

10

Classifier

Precision for label 1 2 0.94

Recall label 1 2 0.94

0.89

f1-score label 1 2

Accuracy

0.92

0.92

KNN

0.89

Macro avg

0.92

0.92

0.92

Weighted avg

0.92

0.92

0.92

Accuracy

Fig. 4 Performance of features with the selected number of features

0.92

0.91

Table 18 Result of SVM using forward feature selection with ABC-UBC and pre-processing phase Models

DataSet

ABC UBC Fno

Feature selection with ABC-UBC results with pre-processing

Jordan dialect

10

Classifier

Precision for label 1 2 0.99

Recall label 1 2 0.80

1.00

f1-score label 1 2

Accuracy

0.83

0.98

SVM

0.86

Macro avg

0.92

0.90

0.91

Weighted avg

0.98

0.98

0.98

Accuracy

0.99

0.98

Fig. 5 Performance of features

Table 19 Result of NB using forward feature selection with ABC-UBC and pre-processing phase Models

DataSet

ABC UBC Fno

Classifier

Precision for label 1 2

Recall label 1 2

f1-score label 1 2

Accuracy

Feature selection with ABC-UBC results with pre-processing

Jordan Dialect

10

Naïve Bayes

1.00

0.80

0.89

0.99

Macro avg

0.99

0.90

0.94

Weighted avg

0.99

0.99

0.99

Accuracy

0.99

1.00

0.99

0.99

Fig. 6 Performance of features

Table 20 Result of PNN using forward feature selection with ABC-UBC and pre-processing phase Models

DataSet

ABC UBC Fno

Feature selection with ABC-UBC results with pre-processing

Jordan Dialect

8

Classifier

Precision for label 1 2

0.80

1.00

f1-score label 1 2

Accuracy

0.86

0.98

PNN

0.92

Macro avg

0.95

0.90

0.92

Weighted avg

0.98

0.98

0.98

Accuracy

0.99

Recall label 1 2

0.98

0.99

Fig. 7 Performance of features

4.2.4

The Result of Arabic Text Using Forward Feature Selection with ABC-UBC Without Pre-Processing Phase

The results are presented for KNN (Table 21 and Fig. 8), SVM (Table 22 and Fig. 9), NB (Table 23 and Fig. 10), and PNN (Table 24 and Fig. 11).

Table 21 Result of KNN using forward feature selection with ABC-UBC without pre-processing phase Models

DataSet

ABC UBC Fno

Feature selection with ABC-UBC results without pre-processing

Jordan Dialect

10

Classifier

Precision for label 1 2

0.94

0.89

f1-score label 1 2

Accuracy

0.92

0.92

KNN

0.89

Macro avg

0.92

0.92

0.92

Weighted avg

0.92

0.92

0.92

Accuracy

0.94

Recall label 1 2

0.92

0.91

Fig. 8 Performance of features

Table 22 Result of SVM using forward feature selection with ABC-UBC without pre-processing phase Models

DataSet

ABC UBC Fno

Feature selection with ABC-UBC results without pre-processing

Jordan Dialect

10

Classifier

Precision for label 1 2

Recall label 1 2

f1-score label 1 2

0.80

0.83

SVM

0.86

Macro avg

0.92

0.90

0.91

Weighted avg

0.98

0.98

0.98

Accuracy

0.99

0.99

0.98

0.99

Accuracy

Fig. 9 Performance of features

Table 23 Result of NB using forward feature selection with ABC-UBC without pre-processing phase Models

DataSet

ABC UBC Fno

Classifier

Precision for label 1 2

Recall label 1 2

f1-score label 1 2

Accuracy

Feature selection with ABC-UBC results without pre-processing

Jordan Dialect

10

Naïve Bayes

0.92

0.80

0.86

0.98

Macro avg

0.95

0.90

0.92

Weighted avg

0.98

0.98

0.98

Accuracy

0.99

1.00

0.98

0.99

Fig. 10 Performance of features

Table 24 Result of PNN using forward feature selection with ABC-UBC without pre-processing phase Models

DataSet

ABC UBC Fno

Classifier

Precision for label 1 2

Recall label 12

f1-score label 1 2

Accuracy training test

Feature selection with ABC-UBC results without pre-processing

Jordan Dialect

8

PNN

0.92

0.80

0.86

0.99

Macro avg

0.95

0.90

0.98

Weighted avg

0.98

0.98

0.92

Accuracy

0.99

1.00

0.98

0.99

0.97

Fig. 11 Performance of features

Table 25 Result of KNN with pre-processing Models

DataSet

Classifier

Precision for label 1 2

Recall label 1 2

f1-score label 1 2

Accuracy

Pre-processing (Stemming, stop word) CountVectorizer

Jordan Dialect

KNN macro avg weighted avg accuracy

0.59 0.61

0.88 0.57

0.71 0.53

0.60

0.61

0.62

0.26

0.60

0.36

0.55 0.60

4.3

The Algerian Dialect Dataset Experiments

4.3.1

The Result of Arabic Text Classifiers with Pre-processing Phase

The results are presented for KNN (Table 25), SVM (Table 26), NB (Table 27), and PNN (Table 28).

Table 26 Result of SVM with pre-processing Models

DataSet

Classifier

Precision for label 1 2

Recall label 1 2

f1-score label 1 2

Accuracy training test

Pre-processing (Stemming, stop word) CountVectorizer

Jordan Dialect

SVM Macro avg Weighted avg Accuracy

0.72 0.70

0.75 0.70

0.73 0.70

0.70

0.68

0.70

0.64

0.70

0.66

0.72

0.70 0.70

Table 27 Result of NB with pre-processing Models

DataSet

Classifier

Precision for label 1 2

Recall label 1 2

f1-score label 1 2

Accuracy

Pre-processing (Stemming, stop word) CountVectorizer

Jordan Dialect

Naïve Bayes Macro avg Weighted avg Accuracy

0.63

0.79

0.70

0.63

0.63

0.44

0.63

0.61

0.61

0.63

0.63

0.62

0.52

0.63

Table 28 Result of PNN with pre-processing Models

DataSet

Classifier

Precision for label 1 2

Recall label 1 2

f1-score label 1 2

Accuracy training test

Pre-processing (Stemming, stop word) CountVectorizer

Jordan Dialect

PNN Macro avg Weighted avg Accuracy

0.61 0.59

0.73 0.58

0.67 0.58

0.60

0.59

0.57

0.60

0.44

0.49

0.59 0.60

4.3.2

The Result of Arabic Text Classifiers Without Pre-processing Phase

The results are presented for KNN (Table 29), SVM (Table 30), NB (Table 31), and PNN (Table 32).

0.74

Table 29 Result of KNN without pre-processing Models

DataSet

Classifier

Precision for label 1 2

Recall label 1 2

f1-score label 1 2

Accuracy

Without pre-processing CountVectorizer

Jordan Dialect

KNN Macro avg Weighted avg Accuracy

0.66 0.75

0.94 0.67

0.78 0.66

0.70

0.84

0.74

0.91

0.70

0.75

0.68 0.70

Table 30 Result of SVM without pre-processing Models

DataSet

Classifier

Precision for label 1 2

Recall label 1 2

f1-score label 1 2

Accuracy training test

Without pre-processing CountVectorizer

Jordan Dialect

SVM Macro avg Weighted avg Accuracy

0.79 0.76

0.77 0.76

0.78 0.76

0.76

0.72

0.76

0.74

0.76

0.73

0.76

0.76 0.76

Table 31 Result of NB without pre-processing Models

DataSet

Classifier

Precision for label 1 2

Recall label 1 2

f1-score label 1 2

Accuracy

Without Pre-processing CountVectorizer

Jordan Dialect

Naïve Bayes Macro avg Weighted avg Accuracy

0.60

0.79

0.68

0.60

0.58

0.36

0.59

0.58

0.56

0.59

0.60

0.58

0.44

0.60

Table 32 Result of PNN without pre-processing Models

DataSet

Classifier

Precision for label 1 2

Recall label 1 2

f1-score label 1 2

Accuracy

Without pre-processing CountVectorizer

Jordan Dialect

PNN Macro avg Weighted avg Accuracy

0.70 0.65 0.66

0.67 0.65 0.66

0.68 0.65 0.66

0.66

0.61

0.64

0.66

0.62

4.3.3

The Result of Arabic Text Using Forward Feature Selection with ABC-UBC and Pre-processing Phase

The results are presented for KNN (Table 33 and Fig. 12), SVM (Table 34 and Fig. 13), NB (Table 35 and Fig. 14), and PNN (Table 36 and Fig. 15).

Table 33 Result of KNN using forward feature selection with ABC-UBC and pre-processing phase Models

DataSet

ABC UBC Fno

Feature selection with ABC-UBC Results with Pre-processing

Jordan Dialect

10

Classifier

0.67

Recall label 1 2 0.62

0.95

f1-score label 1 2

Accuracy

0.75

0.77

KNN

0.94

Macro avg

0.81

0.79

0.77

Weighted avg

0.82

0.77

0.77

Accuracy

Fig. 12 Performance of features

Precision for label 1 2

0.77

0.79

Table 34 Result of SVM using forward feature selection with ABC-UBC and pre-processing phase Models

DataSet

ABC UBC Fno

Feature selection with ABC-UBC Results with Pre-processing

Jordan Dialect

10

Classifier

Precision for label 1 2 0.76

Recall label 1 2 0.79

0.79

f1-score label 1 2

Accuracy

0.81

0.79

SVM

0.83

Macro avg

0.79

0.79

0.79

Weighted avg

0.79

0.79

0.79

Accuracy

0.77

0.79

Fig. 13 Performance of features

Table 35 Result of NB using forward feature selection with ABC-UBC and pre-processing phase Models

DataSet

ABC UBC Fno

Classifier

Precision for label 1 2

Recall label 1 2

f1-score label 1 2

Accuracy

Feature selection with ABC-UBC results with pre-processing

Jordan Dialect

10

Naïve Bayes

0.90

0.75

0.82

0.82

Macro avg

0.82

0.82

0.82

Weighted avg

0.83

0.82

0.82

Accuracy

0.74

0.90

0.82

0.81

Fig. 14 Performance of features

Table 36 Result of PNN using forward feature selection with ABC-UBC and pre-processing phase Models

DataSet

ABC UBC Fno

Feature selection with ABC-UBC results with pre-processing

Jordan Dialect

10

Classifier

Precision for label 1 2

f1-score label 1 2

Accuracy Training Test

0.62

0.72

0.74

PNN

0.86

macro avg

0.76

0.75

0.74

weighted avg

0.77

0.74

0.73

accuracy

0.65

Recall label 1 2 0.87

0.74

0.75

0.70

Fig. 15 Performance of features

Table 37 Result of KNN using forward feature selection with ABC-UBC without pre-processing phase Models

DataSet

ABC UBC Fno

Feature selection with ABC-UBC results without pre-processing

Jordan Dialect

10

Classifier

0.70

Recall label 12 0.67

0.95

f1-score label 1 2

Accuracy training test

0.78

0.79

KNN

0.94

Macro avg

0.82

0.81

0.79

Weighted avg

0.70

0.95

0.80

Accuracy

Precision for label 1 2

0.80

0.79

4.3.4

The Result of Arabic Text Using Forward Feature Selection with ABC-UBC Without Pre-processing Phase

The results are presented for KNN (Table 37 and Fig. 16), SVM (Table 38 and Fig. 17), NB (Table 39 and Fig. 18), and PNN (Table 40 and Fig. 19).

0.793

Fig. 16 Performance of features

Table 38 Result of SVM using forward feature selection with ABC-UBC without pre-processing phase Models

DataSet

ABC UBC Fno

Feature selection with ABC-UBC results without pre-processing

Jordan Dialect

10

Classifier

Precision for label 1 2

0.71

0.79

f1-score label 1 2

Accuracy

0.76

0.75

SVM

0.81

Macro avg

0.75

0.75

0.75

0.75

0.75

0.75

0.75

Accuracy

0.96

Recall label 12

0.75

0.74

Fig. 17 Performance of features

Table 39 Result of NB using forward feature selection with ABC-UBC without pre-processing phase Models

DataSet

ABC UBC Fno

Classifier

Precision for label 1 2

Recall label 12

f1-score label 1 2

Accuracy training test

Feature selection with ABC-UBC results without pre-processing

Jordan Dialect

10

Naïve Bayes

0.90

0.75

0.82

0.82

Macro avg

0.82

0.82

0.82

Weighted avg

0.83

0.82

0.82

Accuracy

0.74

0.90

0.82

0.81

0.81

Fig. 18 Performance of features

Table 40 Result of PNN using forward feature selection with ABC-UBC without pre-processing phase Models

DataSet

ABC UBC Fno

Classifier

Precision for label 1 2

Recall label 12

f1-score label 1 2

Accuracy

Feature selection with ABC-UBC results without pre-processing

Jordan Dialect

6

PNN

0.91

0.62

0.74

0.76

Macro avg

0.79

0.77

0.76

Weighted avg

0.80

0.76

0.76

Accuracy

0.67

0.92

0.76

0.77

Fig. 19 Performance of features

4.4

Experimental Results and Discussion

The results of the Jordanian dialect dataset experiments are compared in Table 41 and Fig. 20.

Table 41 Comparison of performance values

Model / Classifier                                  Precision  Recall  F1-score  Accuracy

Arabic text classifiers with pre-processing phase
  KNN                                               0.91       0.96    0.93      0.98
  SVM                                               0.99       0.90    0.94      0.99
  NB                                                0.83       0.89    0.85      0.96
  PNN                                               0.89       0.89    0.89      0.97

Arabic text classifiers without pre-processing phase
  KNN                                               0.78       0.94    0.84      0.95
  SVM                                               0.99       0.90    0.94      0.99
  NB                                                0.85       0.89    0.87      0.97
  PNN                                               0.98       0.98    0.98      0.98

Arabic text using forward feature selection with ABC-UBC and pre-processing phase
  KNN                                               0.92       0.92    0.92      0.92
  SVM                                               0.92       0.90    0.91      0.98
  NB                                                0.99       0.90    0.94      0.99
  PNN                                               0.95       0.90    0.92      0.98

Arabic text using forward feature selection with ABC-UBC without pre-processing phase
  KNN                                               0.92       0.92    0.92      0.99
  SVM                                               0.92       0.90    0.91      0.98
  NB                                                0.95       0.90    0.92      0.98
  PNN                                               0.95       0.90    0.92      0.98

[Figure 20 panels: "Arabic text classifiers with Pre-processing", "Arabic text classifiers without Pre-processing", "F.F.S with ABC-UBC with Pre-processing phase", and "F.F.S with ABC-UBC without Pre-processing phase". Each panel plots Precision, Recall, F1-SCORE, and Accuracy on a 0 to 1 axis for SVM, NB, PNN, and KNN.]

Fig. 20 Compared prediction accuracy for the four tests using Jordanian dialect dataset

4.5

Experimental Results and Discussion

The Algerian dialect dataset experiments (Table 42 and Fig. 21).

Table 42 Comparison of performance values

Model / Classifier                                  Precision  Recall  F1-score  Accuracy

Arabic text classifiers with pre-processing phase
  KNN                                               0.61       0.57    0.53      0.60
  SVM                                               0.70       0.70    0.70      0.70
  NB                                                0.63       0.61    0.61      0.63
  PNN                                               0.59       0.58    0.58      0.60

Arabic text classifiers without pre-processing phase
  KNN                                               0.75       0.76    0.66      0.70
  SVM                                               0.76       0.76    0.76      0.76
  NB                                                0.59       0.58    0.56      0.60
  PNN                                               0.63       0.65    0.65      0.66

Arabic text using forward feature selection with ABC-UBC and pre-processing phase (optimization algorithm: Modified ABC-UBC)
  KNN                                               0.81       0.79    0.77      0.77
  SVM                                               0.79       0.79    0.79      0.79
  NB                                                0.82       0.82    0.82      0.82
  PNN                                               0.76       0.75    0.74      0.74

Arabic text using forward feature selection with ABC-UBC without pre-processing phase (optimization algorithm: Modified ABC-UBC)
  KNN                                               0.82       0.81    0.79      0.79
  SVM                                               0.75       0.75    0.75      0.75
  NB                                                0.82       0.82    0.82      0.82
  PNN                                               0.74       0.72    0.72      0.76

[Figure 21 panels: "Arabic text classifiers with Pre-processing", "Arabic text classifiers without Pre-processing", "F.F.S with ABC-UBC with Pre-processing phase", and "F.F.S with ABC-UBC without Pre-processing phase". Each panel plots Precision, Recall, F1-SCORE, and Accuracy on a 0 to 1 axis for SVM, NB, PNN, and KNN.]

Fig. 21 Compared prediction accuracy for the four tests using Algerian dialect dataset

5 Conclusion

This paper examined the extent to which the modified algorithm influences the selection of optimal features in Jordanian and Algerian text classifiers, and its effect on them. The proposed modified ABC-UBC finds the minimum number of features and picks out the optimal features from the words for the classification task.

The first test was carried out using the Jordanian dialect dataset. Table 41 compares the performance measures for four tests of the Jordanian text classifiers: with the pre-processing phase, without the pre-processing phase, forward feature selection with ABC-UBC with the pre-processing phase, and forward feature selection with ABC-UBC without the pre-processing phase. When the optimized features are passed to the classification task, accuracy reaches up to 99%, and the precision, recall, and F1-score range from 95% to 99%. After testing the classification algorithms, we compared prediction accuracy across the four tests for Support Vector Machine (SVM), K-Nearest Neighbors (KNN), Naive Bayes (NB), and Polynomial Neural Network (PNN), as shown in Fig. 20; the best results of KNN, NB, and PNN reach an accuracy of up to 99.9%.

The second test was carried out using the Algerian dialect dataset. Table 42 compares the performance measures for the same four tests of the Algerian text classifiers. Across the four tests, this model gives results of up to 82% (F1-score).

The contents of the two datasets differ. The texts in the Jordanian dialect dataset do not exceed twenty words per row in the database, whereas the texts in the Algerian dialect dataset are long paragraphs of more than 100 words per row. From these experiments, we observed that the accuracy of classification is affected by the number of words: as the number of words decreases, the accuracy of classification increases.

In the future, the objective is to apply the proposed supervised model to Arabic and its dialects and, after testing on more Arabic datasets, to compare it with other methods. The method will also be extended to other tasks, such as spam detection, to further improve the results of the Arabic text classification system.

References

1. Proudfoot, D. (2020). Rethinking Turing's test and the philosophical implications. Minds and Machines, 1–26.
2. Janani, R., & Vijayarani, S. (2020). Automatic text classification using machine learning and optimization algorithms. Soft Computing, 1–17.
3. Elnagar, A., Al-Debsi, R., & Einea, O. (2020). Arabic text classification using deep learning models. Information Processing & Management, 57(1), 102121.
4. Karaboga, D., Gorkemli, B., Ozturk, C., & Karaboga, N. (2014). A comprehensive survey: Artificial bee colony (ABC) algorithm and applications. Artificial Intelligence Review, 42(1), 21–57.
5. Jiang, D., Yue, X., Li, K., Wang, S., & Guo, Z. (2015). Elite opposition-based artificial bee colony algorithm for global optimization. International Journal of Engineering, 28(9), 1268–1275.
6. Alzaqebah, A., Smadi, B., & Hammo, B. H. (2020, April). Arabic sentiment analysis based on salp swarm algorithm with S-shaped transfer functions. In 2020 11th International Conference on Information and Communication Systems (ICICS) (pp. 179–184). IEEE.
7. Guellil, I., Adeel, A., Azouaou, F., Benali, F., Hachani, A. E., Dashtipour, K., ... & Hussain, A. (2021). A semi-supervised approach for sentiment analysis of arab (ic+ izi) messages: Application to the algerian dialect. SN Computer Science, 2(2), 1–18.
8. Thirumoorthy, K., & Muneeswaran, K. (2020). Optimal feature subset selection using hybrid binary Jaya optimization algorithm for text classification. Sādhanā, 45(1), 1–13.
9. Chantar, H., Mafarja, M., Alsawalqah, H., Heidari, A. A., Aljarah, I., & Faris, H. (2020). Feature selection using binary grey wolf optimizer with elite-based crossover for Arabic text classification. Neural Computing and Applications, 32(16), 12201–12220.
10. Zheng, W., & Jin, M. (2020). Comparing multiple categories of feature selection methods for text classification. Digital Scholarship in the Humanities, 35(1), 208–224.
11. Hussein, O., Sfar, H., Mitrović, J., & Granitzer, M. (2020, December). NLP_Passau at SemEval-2020 Task 12: Multilingual neural network for offensive language detection in English, Danish and Turkish. In Proceedings of the Fourteenth Workshop on Semantic Evaluation (pp. 2090–2097).
12. Pan, Y., & Liang, M. (2020, June). Chinese text sentiment analysis based on BI-GRU and self-attention. In 2020 IEEE 4th Information Technology, Networking, Electronic and Automation Control Conference (ITNEC) (vol. 1, pp. 1983–1988). IEEE.
13. Rachid, B. A., Azza, H., & Ghezala, H. H. B. (2020, July). Classification of cyberbullying text in Arabic. In 2020 International Joint Conference on Neural Networks (IJCNN) (pp. 1–7). IEEE.
14. Guo, Z., Shi, J., Xiong, X., Xia, X., & Liu, X. (2019). Chaotic artificial bee colony with elite opposition-based learning. International Journal of Computational Science and Engineering, 18(4), 383–390.
15. Almani, N., & Tang, L. H. (2020, March). Deep attention-based review level sentiment analysis for Arabic reviews. In 2020 6th Conference on Data Science and Machine Learning Applications (CDMA) (pp. 47–53). IEEE.
16. Hanbay, K. (2021). A new standard error based artificial bee colony algorithm and its applications in feature selection. Journal of King Saud University-Computer and Information Sciences.
17. Chaudhuri, A., & Sahu, T. P. (2021). Feature weighting for naïve Bayes using multi objective artificial bee colony algorithm. International Journal of Computational Science and Engineering, 24(1), 74–88.
18. Obeid, O., Zalmout, N., Khalifa, S., Taji, D., Oudah, M., Alhafni, B., ... & Habash, N. (2020, May). CAMeL tools: An open source python toolkit for Arabic natural language processing. In Proceedings of the 12th language resources and evaluation conference (pp. 7022–7032).
19. Ayedh, A., Tan, G., Alwesabi, K., & Rajeh, H. (2016). The effect of preprocessing on arabic document categorization. Algorithms, 9(2), 27.
20. Chen, P. H. (2020). Essential elements of natural language processing: What the radiologist should know. Academic radiology, 27(1), 6–12.
21. Vijayaraghavan, S., & Basu, D. (2020). Sentiment analysis in drug reviews using supervised machine learning algorithms. arXiv preprint arXiv:2003.11643.
22. Karaboga, D. (2005). An idea based on honey bee swarm for numerical optimization (vol. 200, pp. 1–10). Technical report-tr06, Erciyes university, engineering faculty, computer engineering department.
23. Ghambari, S., & Rahati, A. (2018). An improved artificial bee colony algorithm and its application to reliability optimization problems. Applied Soft Computing, 62, 736–767.
24. Xiang, Z., Xiang, C., Li, T., & Guo, Y. (2020). A self-adapting hierarchical actions and structures joint optimization framework for automatic design of robotic and animation skeletons. Soft Computing, 1–14.
25. Sharma, A., Sharma, A., Choudhary, S., Pachauri, R. K., Shrivastava, A., & Kumar, D. A. (2020). Review on artificial bee colony and it's engineering applications. Journal of Critical Reviews.
26. Li, Y. (2020). Comparison of various multi-armed bandit algorithms (E-greedy, Thompson sampling and UCB-) to standard A/B testing.
27. Hijazi, M., Zeki, A., & Ismail, A. (2021). Arabic text classification using hybrid feature selection method using chi-square binary artificial bee colony algorithm. Computer Science, 16(1), 213–228.
28. Zhang, X., Fan, M., Wang, D., Zhou, P., & Tao, D. (2020). Top-k feature selection framework using robust 0–1 integer programming. IEEE Transactions on Neural Networks and Learning Systems.
29. Janani, R., & Vijayarani, S. (2020). Automatic text classification using machine learning and optimization algorithms. Soft Computing, 1–17.
30. Dhar, A., Mukherjee, H., Dash, N. S., & Roy, K. (2021). Text categorization: Past and present. Artificial Intelligence Review, 54(4), 3007–3054.
31. Sheykhmousa, M., Mahdianpari, M., Ghanbari, H., Mohammadimanesh, F., Ghamisi, P., & Homayouni, S. (2020). Support vector machine vs. random forest for remote sensing image classification: A meta-analysis and systematic review. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing.
32. Saadatfar, H., Khosravi, S., Joloudari, J. H., Mosavi, A., & Shamshirband, S. (2020). A new K-nearest neighbors classifier for big data based on efficient data pruning. Mathematics, 8(2), 286.
33. Ruan, S., Li, H., Li, C., & Song, K. (2020).
Class-specific deep feature weighting for Naïve Bayes text classifiers. IEEE Access, 8, 20151–20159. 34. Oh, S. K., Pedrycz, W., & Park, B. J. (2003). Polynomial neural networks architecture: Analysis and design. Computers & Electrical Engineering, 29(6), 703–725.