Intelligent Human Centered Computing: Proceedings of HUMAN 2023 981993477X, 9789819934775

This book features high-quality research papers presented at the First Doctoral Symposium on Human Centered Computing (H


English Pages 428 [429] Year 2023


Table of contents :
Preface
Organisation
Contents
About the Editors
An Intelligent Approach for Brain Tumor Classification Using Different CNN Variants
1 Introduction
2 Literature Review
3 Methodologies
4 Experiment and Analysis
5 Results and Discussion
6 Conclusion and Future Scope
References
Effective Estimation of Relationship Strength Among Facebook Users Applying Pearson Correlation and Jaccard's Coefficient
1 Introduction
2 Related Work
3 Enumerated Structure of the Proposed Approach
3.1 Calculation of Relationship Strength
3.2 Assessment Matrices
3.3 Dataset
4 Results Analysis
5 Conclusion and Future Work
References
Taxonomy of Music Genre Using Machine Intelligence from Feature Melting Technique
1 Introduction
2 Literature Review
3 Methodology
3.1 Dataset Used
3.2 Feature Extraction
4 Result and Analysis
5 Comparative Study
6 Conclusion
References
Optimization of Intraday Trading in F&O on the NSE Utilizing BOLLINGER BANDS
1 Introduction
2 Literature Review
3 Methodology
4 Result
5 Conclusion
References
Identification of Mental State Through Speech Using a Deep Learning Approach
1 Introduction
2 Literature Review
3 Methodology
3.1 Dataset Used
3.2 Feature Extraction
3.3 Classification with CNN
4 Experimental Analysis
4.1 Comparative Study
5 Conclusion
References
Intellectual Property in Human Genomics in India
1 Introduction
1.1 Statement of Problem
1.2 Need for the Research
1.3 Research Methodology
1.4 Objective of the Study
1.5 Scope of the Study
2 Human Genes – Scientific Overview
2.1 Genome Engineering
2.2 Application of Genomics
3 Social, Legal, and Ethical Issues in Patenting Human Genomics
3.1 Gene Patents and Research-Related Issues
3.2 Patenting Human Genes: The Ethical Debate
4 Intellectual Property and Role of Human Genomics in India
4.1 Indian Patents Act 1970
4.2 Gene Patents Under the Patents Act of 1970
5 Right to Health and Genetic Patents
6 Conclusion
References
Performance of Automated Machine Learning Based Neural Network Estimators for the Classification of PCOS
1 Introduction
2 Background
2.1 Artificial Neural Network
2.2 Automated Machine Learning
3 Dataset Description
3.1 Experimental Set up
3.2 Hardware and High-Performance Computing Environment
4 Results
4.1 Analysis of Different PyTorch Neural Network Estimators
5 Conclusion
References
Job Recommendation a Hybrid Approach Using Text Processing
1 Introduction
2 Proposed Work
3 Machine Learning Algorithms
4 Preprocessing of Data
5 Result and Discussion
6 Use of Optimization in Machine Learning Classifiers
7 Conclusion
References
A Study to Investigate the Existence of Monolexemic Colour Terms in Dravidian Languages: A Visual Psychophysics Approach
1 Introduction
2 Literature Review
3 Materials and Methods
4 Experimental Apparatus
5 Experimental Methods
6 Results
7 Discussion
8 Conclusion
References
Deep Artificial Neural Network Based Blind Color Image Watermarking
1 Introduction
2 Proposed Blind Color Image Watermarking
2.1 Watermark Embedding Algorithm
2.2 Watermark Extraction Algorithm
3 Result and Discussion
3.1 Embedding Payload
4 Conclusion
References
IoT Based Automatic Temperature Screening & Alert System for Symptomatic COVID-19 Detection
1 Introduction
1.1 Infrared Thermal Imaging
1.2 Hand Held Non-contact Infrared Thermometers
2 Device Design Schema
2.1 Hardware
2.2 Embedded Software
3 Results and Discussions
3.1 Experimental Outcome
4 Conclusion and Future Scope
References
Boosting Machine Learning Algorithm to Classify Road Conditions for Maintenance Strategy of Flexible Pavements
1 Introduction
2 Literature Review
3 Methodology
3.1 Data Collection
3.2 Data Preprocessing
3.3 Balancing the Dataset
3.4 Feature Scaling (Normalization)
3.5 Data Analysis
4 Data Validation
4.1 Metrics in This Study for Performance Evaluation
5 Results
6 Conclusions
References
Hyper Parameterized LSTM Models for Predicting NSE Intraday Bias Based on Global Market Trends
1 Introduction
2 Systematic Review
3 Proposed Method
3.1 Data Collection
3.2 Brief Insight into LSTM
4 Result Analysis
5 Conclusion
References
Malicious URL Classification Using Machine Learning
1 Introduction
2 Survey
3 Proposed Work
3.1 List of Features Used for Classification
3.2 Machine Learning Algorithms
4 Experiment and Results
4.1 Dataset Information
4.2 Performance Metrics
5 Conclusion and Future Scope
References
Prognostic Stage Classification for Invasive Breast Cancer by Analysing Affected Lymph Node
1 Introduction
2 Literature Review
3 Methodology
4 Result and Discussion
4.1 Comparison Table
5 Conclusion
References
Study of Task Scheduling Algorithms for Energy Minimization in a Cloud Computing Environment
1 Introduction
1.1 The Benefits of Task Scheduling Algorithms
1.2 Paper Organization
2 Literature Survey
3 Energy Management Algorithms
4 Conclusion
References
Human Stress Detection from SWCT EEG Data Using Optimised Stacked Deep Learning Model
1 Introduction
2 Related Work
3 Dataset Description
4 Methodology
4.1 Discrete Wavelet Transform (DWT)
4.2 Particle Swarm Optimization (PSO)
4.3 Grid Search Optimization (GSHPO)
4.4 Long Short-Term Memory (LSTM)
4.5 Bidirectional Long Short-Term Memory (BiLSTM)
5 Proposed Work
5.1 Data Preprocessing
5.2 Proposed Model
6 Results
6.1 Metrics Based Performance
6.2 Convergence Based Performance
6.3 ROC Curve Analysis
7 Conclusion and Future Work
References
Impact of Carbon Emission Policies on an Imperfect EOQ Model Under Cloud Fuzzy Environment
1 Introduction
2 Notations and Assumptions
3 Mathematical Model
3.1 Crisp Mathematical Model
3.2 Fuzzy Mathematical Model:
3.3 Cloud Fuzzy Mathematical Model:
4 Numerical Analysis
5 Sensitivity Analysis
6 Conclusions
References
Rule-Based Investigation on Positive Change in Air Quality at Kolkata During Lockdown Period Due to Covid-19 Pandemic
1 Introduction
2 Related Works
3 Preliminaries and Problem Statement
3.1 Association Rule Mining
3.2 Air Quality Index (AQI)
3.3 Target Location and Data Collection
4 Proposed Approach
4.1 Design of Binary TDB of Pollutants
4.2 Proposed Algorithm
5 Experimental Result
6 Conclusion and Future Scope
References
Performance Analysis of Professional Higher Education Programmes Driven by Students Perception: A Latent Variable Computation Model for Industry 5.0
1 Introduction
2 Literature Review
3 Methodology
4 Results
5 Conclusion
References
Graph Based Zero Shot Adverse Drug Reaction Detection from Social Media Reviews Using GPT-Neo
1 Introduction
2 Review Literature
3 Proposed Methodology
3.1 Data Collection
3.2 Data Pre-processing
3.3 Grouping
3.4 Graph Construction
3.5 GPT-Neo Model
4 Result
5 Conclusion
References
Digital Twin for Industry 5.0: A Vision, Taxonomy, and Future Directions
1 Introduction
1.1 Data Flow for Digital Twin
1.2 Simulation and Digital Twin
1.3 Motivation
1.4 Contributions
2 Related Work
3 Taxonomy of Digital Twin
3.1 Types of Digital Twin
3.2 Sustainability Taxonomy of Digital Twin
4 Application of Digital Twin
4.1 Digital Twin in Healthcare
4.2 Digital Twins in Smart Cities
4.3 Digital Twin in Manufacturing Industry
5 Data Handling Methodologies in Digital Twins
5.1 Procedure for Data Acquisition of Digital Twin
5.2 Procedure for Data Communication of Digital Twin
5.3 Procedure for Data Storage of Digital Twin
5.4 Procedure for Data Collaboration of Digital Twin
5.5 Techniques for Data Integration of Digital Twin
5.6 Techniques for Data Transformation of Digital Twin
5.7 Techniques for Data Service of Digital Twin
6 Opportunities
7 Challenges
8 Conclusion
References
Application of Machine Learning Technology for Screening of Mental Health Disorder
1 Introduction
2 Review of Literature
3 Methodology
References
Novel Machine Learning Techniques for Diabetes Prediction
1 Introduction
2 Literature Survey
3 Proposed Methodology
3.1 Dataset Description
3.2 Data Pre-processing
3.3 Model Building
4 Experiment
4.1 Experimental Setup
4.2 Evaluation
4.3 Base Model
4.4 Result and Discussion
5 Conclusion and Future Scope
References
Analysis of BLER and Throughput During the Coexistence of Two 5G NR
1 Introduction
2 Design of the Coexistence of Two 5G NR
2.1 Simulation Diagram
2.2 Experimental Procedure
3 Results of the Coexistence of Two 5G NR
3.1 Episode 1: f1 = 1.901 GHz, f2 = 2.16 GHz
3.2 Episode 2: f1 = 1.901 GHz, f2 = 2.1599937 GHz
3.3 Episode 3: f1 = 1.901 GHz, f2 = 2.15999381 GHz
3.4 Episode 4: f1 = 1.901 GHz, f2 = 2.15999387 GHz
3.5 Episode 5: f1 = 1.901 GHz, f2 = 2.159995 GHz
3.6 Episode 6: f1 = 1.901 GHz, f2 = 2.1600013 GHz
3.7 Episode 7: f1 = 1.901 GHz, f2 = 2.16000137 GHz
4 Summary and Conclusion
References
An Assessment of Forest Fire Dataset Using Soft Computing Technique
1 Introduction
2 Literature Survey
3 Methodology
4 Results and Discussion
5 Conclusion and Future Scope
References
Predicting the Data Science Employability Rate Using Data Mining Techniques
1 Introduction
2 Literature Review
3 Research Methodology
4 Results and Discussions
5 Conclusions
References
Classification of Online Fake News Using N-Gram Approach and Machine Learning Techniques
1 Introduction
2 Related Literatures
3 Proposed Methodology and Model Building
3.1 Data Collection
3.2 N-Gram Model
3.3 Data Pre-processing
3.4 Feature Extraction
3.5 Training and Testing the Model
3.6 Machine Learning Classifiers
3.7 Results and Discussions
4 Conclusion and Future Scope
References
An Edge Assisted Robust Smart Traffic Management and Signalling System for Guiding Emergency Vehicles During Peak Hours
1 Introduction
2 Related Work
3 Methodology
3.1 Calculating Index Value
3.2 Arrival of Emergency Vehicle
3.3 Decision Making at Edge Server
4 Experiment Results for Different Weather Conditions
5 Conclusion
References
Patent Analysis on Artificial Intelligence in Food Industry: Worldwide Present Scenario
1 Introduction
2 Methodology
3 Results
4 Discussion
4.1 AI in Food Industry: Direct Application
4.2 AI in Food Industry: Indirect Application
5 Significance of AI in Food Industry: Post COVID-19 Time
6 Benefits of the Present Research
7 Conclusion
Appendix
References
Research of the Influence of the Fuzzy Rules Number on the Learning of a Neuro-Fuzzy System
1 Introduction
2 Estimation of the Minimum and Maximum the Number of Fuzzy Rules Included into a Fuzzy-Logical System
3 Soft and Hard Models for Fuzzy Inference Systems
4 Multiple Regression Analysis
5 Numerical Simulation of the Functioning and Performance Assessment of a Fuzzy Model with a Different Number of Fuzzy Rules
6 Conclusion
References
Optimization of Traffic Flow Based on Periodic Fuzzy Graphs
1 Introduction
2 Literary Review
3 Periodic Temporal Fuzzy Graphs
4 Possibility of Real Use
5 Conclusion
References
ChatGPT: A OpenAI Platform for Society 5.0
1 Introduction
2 Fundamentals of ChatGPT
3 Comparison Between ChatGPT and BERT
4 Features and Benefits of ChatGPT
5 Method of Operations and Enterprise Architecture of ChatGPT
6 Potential Industrial Applications of ChatGPT
7 Role of Chat GPT in Software Development
8 Limitations of ChatGPT
9 Conclusions
References
Author Index

Springer Tracts in Human-Centered Computing

Siddhartha Bhattacharyya Jyoti Sekhar Banerjee Debashis De Mufti Mahmud Editors

Intelligent Human Centered Computing Proceedings of HUMAN 2023

Series Editors
Siddhartha Bhattacharyya, Rajnagar Mahavidyalaya, Birbhum, West Bengal, India
Janusz Kacprzyk, Systems Research Institute, Polish Academy of Sciences, Warszawa, Poland
Mario Koeppen, Department of Computer Science and Electronics, Kyushu Institute of Technology, Fukuoka, Japan
Vaclav Snasel, Dept of Computer Science, VŠB-TUO, Ostrava-Poruba, Czech Republic
Rudolf Kruse, Faculty of Computer Science, Otto-von-Guericke University, Magdeburg, Sachsen-Anhalt, Germany

The book series is aimed at providing an exchange platform for researchers to summarize the latest research and developments related to human-centered computing. Human-centered computing is focused on the study of design, development, and deployment of human-computer systems based on mixed-initiatives. It may be visualized as a three-dimensional space comprising human, computer, and environment. This upcoming computing paradigm has emerged from the convergence of multiple disciplines concerned both with understanding human beings and with the design of computational artifacts. The field of human-centered computing is a multidisciplinary field encompassing disciplines such as computer science, human factors, sociology, psychology, cognitive science, anthropology, communication studies, graphic design and industrial design.

The book series covers the topics and fields of distributed environments entailing Internet-based information systems, grids, sensor-based information networks, and mobile and wearable information appliances, multimedia and multi-modal human-computer interfaces, design of intelligent interfaces and information representation and visualization, multi-agent systems, effective and constrained computer-mediated human-human interaction, defining relevant semantic structures for multimedia information representation, community specific HCI solutions, collaborative and interactive systems, social interaction, social orthotics, affective computing, knowledge-driven human-computer interaction, human-centered semantic formulation, human-centered management science, and participatory action research. The series will include monographs, edited volumes, and selected proceedings.


Editors
Siddhartha Bhattacharyya, Algebra University College, Zagreb, Croatia; Rajnagar Mahavidyalaya, Rajnagar, Birbhum, India
Jyoti Sekhar Banerjee, Department of Computer Science and Engineering (AI & ML), Bengal Institute of Technology, Kolkata, India
Debashis De, Department of Computer Science and Engineering, Maulana Abul Kalam Azad University of Technology, Kolkata, West Bengal, India
Mufti Mahmud, Department of Computer Science, Nottingham Trent University, Nottingham, UK

ISSN 2662-6926 ISSN 2662-6934 (electronic) Springer Tracts in Human-Centered Computing ISBN 978-981-99-3477-5 ISBN 978-981-99-3478-2 (eBook) https://doi.org/10.1007/978-981-99-3478-2 © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd. The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721, Singapore

Siddhartha Bhattacharyya would like to dedicate this volume to his beloved wife Rashni
Jyoti Sekhar Banerjee would like to dedicate this volume to his father Late Chandra Sekhar Banerjee
Debashis De would like to dedicate this volume to his father Late Dilip Kumar De
Mufti Mahmud would like to dedicate this book to his parents, Reazuddin Ahmed and Monowara Begum

Preface

Computing is today at one of its most exciting junctures, enabling a wide range of vital human endeavours. To support all human activities, we need to develop new paradigms to search, organise, and integrate information, given the exponential growth in the amount of data we can collect as more information becomes available in more media forms and via more sensors and devices. Traditionally, computing technologies have been developed in isolation from their societal and cultural contexts, and human centered computing (HCC) seeks to shift this paradigm. By making computer systems easier to use and more affordable worldwide, HCC could help protect the environment and economic distribution and give people who would not otherwise have these opportunities access to education and health care.

This volume aims to cover almost all aspects of modern human centered computing and its applications. As HCC is an interdisciplinary field, the volume showcases the fusion of human sciences (such as social and cognitive) with computer science (including human–computer interaction (HCI), signal processing, machine learning, and ubiquitous computing) for creating computing systems that are designed from the ground up with people in mind. The personal, social, and cultural contexts in which these technologies are used should be at the centre of this attention. HCC is more than just a place where people from different fields can meet and talk. It also aims to change the existing computing paradigm by finding new ways to design and build systems that improve people’s lives. This volume comprises thirty-three well-versed contributions involving HCC and its applications.

The growth of abnormal cells inside the brain is termed a swelling of the brain, or a brain tumour. A tumour may be either malignant (cancerous) or benign (noncancerous).
These tumours grow rapidly if malignant and affect the brain by creating pressure under the skull. The aim of Chapter 1 is to develop an automatic and efficient binary classification approach for brain tumour MRI to help doctors make decisions. The proposed approach estimates the accuracy through some self-created optimal variants of a recently adopted and widely used CNN architecture, the Residual Network 50 (ResNet-50) model.

Numerous ways have been established to date to identify tie strength among users on online social media. Although calculating relationship strength among interconnected users has been a burning research topic, it is challenging to find strong interconnections in a dynamic network owing to the volatility of user connections over time. In Chapter 2, the authors propose a novel method to calculate relationship strength among Facebook users with Pearson correlation and Jaccard’s coefficient. Two factors, viz. analogy profile and analogy friendship, have been proposed to obtain the final relationship strength. The performance of the proposed model is compared with the popular existing model, namely trust propagation-user relationship strength (TP-URS), using the assessment matrices precision, recall, and dice similarity coefficient (DSC). The


proposed method provides a precision value of 0.66, a recall value of 0.74, and a DSC value of 0.71, which are comparatively better than those of the existing algorithm.

Music is an effectual therapy that makes us calm, cheerful, and excited. Music genre classification (MGC) is essential for music recommendation and information retrieval. In Chapter 3, an effective automatic musical genre classification approach is explored, where different features and orders are fused to obtain better results than the existing methods. Frame-wise extraction of time-domain features (wavelet scattering, zero crossing rate, energy) and frequency-domain features (mel frequency cepstral coefficient (MFCC), pitch, linear predictive coefficient (LPC)) is performed here. After that, the mean value of each extracted feature is put in a vector and fed to the classifier. Two well-known machine learning (ML) algorithms, support vector machine (SVM) and K-nearest neighbour (KNN), are used to classify the GTZAN dataset. The proposed method outperforms the existing works.

As AI-based ALGO-TRADE technologies have advanced, firms around the globe have altered their practices. Investors can now increase their chances of success and depend less on luck by using this new technology. When there is only one remaining share of stock or product, buy-or-write is sometimes employed. The “buy-and-sell maximisation study design” is a research approach to create an opinion. Bollinger bands are used to develop the ALGO-TRADE programmes used in the investigation carried out in Chapter 4. The Bollinger bands assist in determining whether the present price is abnormally high or low. This study also examines how much money the NIFTY options trading account made and lost in 2021 and 2022.

Speech is a powerful medium for expressing, and thus identifying, one’s feelings and attitude.
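As a rough illustration of the Bollinger band idea mentioned for Chapter 4, the following minimal sketch flags a price as abnormally high or low relative to a rolling mean. The 20-period window, the multiplier of 2, and the function names `bollinger_bands` and `classify_price` are assumptions chosen for illustration; the preface does not state the parameters or code used in the chapter.

```python
from statistics import mean, pstdev


def bollinger_bands(prices, window=20, k=2.0):
    """Return (middle, upper, lower) for each full window of prices.

    window and k are illustrative defaults, not values from the chapter.
    """
    if window < 2 or len(prices) < window:
        raise ValueError("need at least `window` prices and window >= 2")
    bands = []
    for end in range(window, len(prices) + 1):
        recent = prices[end - window:end]
        middle = mean(recent)               # rolling simple moving average
        spread = k * pstdev(recent)         # k population standard deviations
        bands.append((middle, middle + spread, middle - spread))
    return bands


def classify_price(price, band):
    """Label a price as 'high', 'low', or 'normal' relative to one band."""
    _, upper, lower = band
    if price > upper:
        return "high"
    if price < lower:
        return "low"
    return "normal"


# Hypothetical series: a flat price followed by a sudden jump.
prices = [100.0] * 19 + [130.0]
latest_band = bollinger_bands(prices)[-1]
print(classify_price(prices[-1], latest_band))
```

A price outside the upper or lower band is only a signal that it is unusual relative to recent history; how such a signal is turned into a trade is left to the chapter itself.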
Finding the emotional content in speech signals and identifying the emotions in speech utterances is crucial for researchers. Chapter 5 examines how well a deep learning-based model can identify speech emotions from two well-known datasets, TESS and RAVDESS. In this work, a proper combination of frequency-domain acoustic features of thirteen (13) linear predictive coefficients (LPC) and mel frequency cepstral coefficients (MFCCs) is fed into a two-dimensional convolutional neural network (CNN) model for classification. According to the experimental findings, the suggested method can recognise speech emotions with an average accuracy of 99% (TESS) and 73% (RAVDESS) for speaker-dependent (SD) speech.

“Genome” and “Human Genomics” are essential in their coexistence. Genetic science continues to grow with the rapid developments of biological sciences and the biotechnology industry. The role of patents has become crucial in pharmaceuticals and genetic science, which the authors discuss in Chapter 6. The main idea of selecting this arena is to analyse the gaps concerning privacy in DNA fingerprinting methods and human genetic science. Further, the chapter explores whether international intellectual property encompasses more than ensuring intellectual property protection and transfer. Ethical and legislative challenges about the consequences of technological advances and information are becoming widespread. This chapter analyses the global problems which may arise in the future regarding intellectual property and growth in the genomics arena. Moreover, this chapter focuses on genetics’ physiological, legal, ethical, and sociological issues and implications, thereby examining the impact of genomics on individuals and the entire society and the patentability of human genes in India.


Artificial neural networks (ANNs) and automated machine learning (AutoML) have transformed the study of artificial intelligence by producing models that are very effective at completing inductive learning tasks. Despite their accomplishments, more information is needed on when to utilise one over the other. In Chapter 7, the performance of a TPOT-NN-based integrated AutoML and NN estimator is compared with that of a non-AutoML NN estimator on polycystic ovary syndrome (PCOS) benchmark datasets. The main findings of this study point to TPOT-NN as a useful tool for some datasets. They also point to potential avenues for basic TPOT to improve classification models in the future.

The work reported in Chapter 8 is an attempt to collate data and discover the most relevant candidate-job association mapping in accordance with a user’s skills, interests, and preferences, and to provide a possible job opportunity as an efficient solution. Several personalised content-based and case-based approaches are considered in this regard. The investigation involves several feature-based item representation methods along with feature-weighted schemes. A comparative evaluation of the distinct perspectives is performed using the Kaggle data repository. The investigation shows that job transitions can be successfully predicted. The accuracy of the model is evaluated based on various machine learning algorithms such as Naïve Bayes, logistic regression, support vector machine, random forest, K-nearest neighbours, and multilayer perceptron. In this work, a hybrid recommender framework is created to support further investigation and to identify the region of opportunity for predicting a suitable job.

Colour naming remains problematic in cognitive science and linguistics, especially since colour terms exhibit structural regularities across cultures.
The universality of colour terms proposed by Berlin and Kay shows eleven colour prototypes in typological patterns of fundamental colour words in English and other languages. In Chapter 9, the authors try to evaluate the monolexemic colour theory and the presence of these postulated colour words in widely spoken Dravidian languages and map them from a vision sciences viewpoint. Visual psychophysical methods are employed in the investigation by recruiting individuals from four L1 (mother tongue) languages. The experiment examines semantic fluency, colour naming, and Stroop tasks.

Digital data is growing enormously as the years pass, so a mechanism to protect digital content is needed. Image watermarking is one of the important tools for providing copyright protection and authorship. To achieve the ideal balance between imperceptibility and robustness, a robust blind colour image watermarking method employing deep artificial neural networks (DANN), LWT, and the YIQ colour model is presented in Chapter 10. In the suggested watermarking method, an original 512-bit watermark is applied for testing, and a randomly generated watermark of the same length is used for training. PCA extracts ten statistical features with significant values out of 18 statistical features, and binary classification is used to extract watermarks here. Four images, Lena, Peppers, Mandril, and Jet, display an average imperceptibility of 52.48 dB. For the threshold value of 0.3, the method achieves a good balance between robustness and imperceptibility. Except for the Gaussian noise, rotation, and average filtering attacks, it also demonstrates good robustness against common image attacks. The experimental results demonstrate that the suggested watermarking method outperforms competing methods.
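The preface reports imperceptibility for Chapter 10 in dB without naming the metric. Peak signal-to-noise ratio (PSNR) is a common way to express watermark imperceptibility in dB, so the sketch below assumes PSNR is what is meant; the function name `psnr`, the flat pixel lists, and the 8-bit peak value of 255 are illustrative assumptions, not details from the chapter.

```python
import math


def psnr(original, watermarked, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences.

    Assumes 8-bit intensities (peak=255) unless told otherwise.
    """
    if len(original) != len(watermarked) or not original:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error between host and watermarked pixels.
    mse = sum((a - b) ** 2 for a, b in zip(original, watermarked)) / len(original)
    if mse == 0:
        return math.inf  # identical images: distortion is zero
    return 10.0 * math.log10(peak ** 2 / mse)


# Hypothetical 4-pixel example: every pixel changed by one intensity level.
host = [120, 121, 119, 122]
marked = [121, 122, 120, 123]
print(round(psnr(host, marked), 2))
```

Higher values mean the watermarked image is closer to the original, which is why a figure above 50 dB is usually read as a visually negligible change.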


Chapter 11 deals with an automatic Internet of Things (IoT)-enabled temperature screening system using a non-contact infrared sensor. Since 2020, it has been almost mandatory at every crowded gathering or premises to monitor human body temperature using a hand-held, non-contact device operated by a person at the entrance. Here, instead of a human operator, a portable device is mounted at the entry gate, intelligent enough to upload the data to a cloud database. This device can also count the number of persons entering the premises. To develop this, the authors have chosen a non-contact infrared temperature sensor, which converts the human body’s reflected radiation into a voltage that is processed by a microcontroller, which pushes the data to the cloud. The door-mounted device contains an OLED which displays the person’s body temperature, and if it is more than the normal range (98.6 ºF), an alarm starts buzzing with a caution message on the display. In addition to temperature screening, this device can also be used as a self-attendance system in closed premises, especially for office employees or staff in corporate areas.

Pavements are one of the most important public assets and play a crucial role in an enriched economy. India, a developing country with tropical weather conditions and an overloaded traffic stream, needs periodic evaluation and maintenance of roads. The appropriate treatment applied to the right pavement at the appropriate moment can reduce life cycle costs, produce higher-quality pavements, and ensure that good pavements remain in good condition. Chapter 12 aims to develop a boosting machine learning algorithm to classify roads as being in good, fair, or poor condition depending upon the type and area of distress so that maintenance methods are applied accordingly, resulting in the proper allocation of government funds.
It thus helps to find the optimal maintenance and repair policy with respect to budgetary limits and pavement conditions.

The stock market makes it easier for businesses to raise money by letting people buy and sell stocks and bonds and by giving investors capital gains and dividends. It is where individuals’ investments and savings can be utilised, thus contributing to economic growth. Those who invest should be aware of the associated market risk. Numerous researchers are actively investigating various risk reduction strategies. Chapter 13 focuses on predicting intraday bias on the National Stock Exchange (NSE) using static natural language processing (NLP) on patterns from several international markets. It is important to remember that the market can move quickly in response to big world events like the COVID-19 epidemic, the Suez Canal problem, the Russia-Ukraine conflict, etc. The primary objective of this research is to find solutions to such issues. This is achieved by employing an LSTM model with tuned hyperparameters. Root mean squared error (RMSE) is used to compare the LSTM with regression, ARIMA, and Holt-Winters models. The study also looks at how closely the NSE is linked to other markets worldwide. The model fits the data well, and the correlation values are close to 1.

In this era of emerging web technologies, people are always dealing with online resources. Web phishing is a social engineering attack in which cyber-attackers trick people into revealing important credentials. People type a search query and, based on the query results, generally visit the top few websites. Cyber-attackers deploy different mechanisms like keyword stuffing and content cloaking to modify the rank of a webpage. In Chapter 14, the authors check whether a website’s content is cloaked and then further


classify whether the URL is phishing or not. SVM and KNN classifiers are used on a dataset of 11,72,598 URLs to achieve a 90% accuracy score.

To accurately identify the cancer stage, it is required to classify the tumour and lymph node stages from pathological and clinical data. Most existing models concentrate on cancer detection, whereas only a limited number of models perform tumour and lymph node stage classification at diagnosis time. For personalised treatment, prognostic stage classification of cancer is required during diagnosis. The aim of Chapter 15 is to build a computational model which accurately classifies the stages of tumour (T) and lymph node (N) at the time of detection. The proposed model identifies a few crucial features (tumour size, nuclear grade, axillary lymph node, and so on) using recursive feature elimination (RFE). A modified feature set prepared by RFE with the crucial and necessary features of the data is fed into the proposed model for classifying the tumour (T) and node (N) stages. Necessary features (such as mitotic count, tubule formation, oestrogen receptor) highly correlated with the crucial features are used to enhance the model’s performance.

Real-time applications are supported and permitted to run on virtualised resources that can be dynamically provisioned in a cloud computing environment. It is one of the effective platform services that enable a wide range of cloud infrastructure-based applications. With the aid of workflow systems, current scientific applications can now be constructed in a simple and time-efficient way. Implementing efficient workflow scheduling algorithms makes it feasible to increase the utilisation of resources, hence improving the performance of cloud computing and meeting user expectations. In cloud computing, the scheduling of jobs can directly affect the total amount of resources utilised and the operational expense of a system.
In Chapter 16, the authors study and discuss the advantages and disadvantages of the existing task scheduling methods employed by cloud service providers, such as particle swarm optimisation (PSO), the crow search algorithm (CSA), and the sparrow search algorithm (SSA). Several variables, including response time, load balance, execution time, and makespan, are examined to determine the most effective strategy for task scheduling under any conditions. It can be concluded that CSA, CPO, and SSA are the most effective algorithms for reducing energy consumption, boosting performance, and shortening makespan.

In today's fast-paced technical world, mental stress among humans is rising, and its diagnosis is crucial. The brain's electrical signals are the reference point for electroencephalography (EEG), an approach that identifies the brain's functions. However, because of their intricacy, only a specialist with dedicated training, such as a neurologist or a mental health counsellor, can interpret EEG signals when deciding on stress levels. To detect human stress levels, this research uses raw EEG data from the SAM 40 dataset, recorded from 40 individuals performing the Stroop Colour-Word Test (SCWT). Chapter 17 proposes (i) discrete wavelet transform and particle swarm optimisation (DWT-PSO)-based pre-processing for data refinement and (ii) a hybrid deep learning model, namely a grid search hyperparameter optimisation (GSHPO)-based stacked bidirectional long short-term memory (BiLSTM) and long short-term memory (LSTM) model, for classifying stressed and relaxed states in individuals. The proposed model reaches a highest classification accuracy of 98.07%.

Chapter 18 examines a cloud fuzzy inventory system that involves deteriorating items under different carbon regulations. Traditional inventory systems mostly do not take product quality into account. In reality, every prepared batch may contain a number of imperfect products. Because of these imperfect products, the seller bears additional costs from repair, reimbursement, negative consumer response, and so on. Screening of ordered batches therefore becomes essential for all industries. Over the last few decades, organisations have been trying to reduce carbon emissions in the supply chain because of strict carbon emission regulations. The two best-known carbon regulations are the carbon tax and the carbon cap-and-trade regulation. Because of high market complexity and uncertainty, the market demand rate cannot be taken as a crisp parameter. Hence, the demand rate is treated as imprecise. In this work, the authors first solve the crisp model. They then take the demand rate as a general fuzzy number and as a cloud fuzzy number. To solve the problem, Yager's index method is applied. A numerical illustration compares the objective value under the different environments. The authors also discuss the effects of the cloud fuzzy environment. Lastly, a sensitivity analysis is carried out to study the effect of the various parameters.

During the COVID-19 lockdown from 24 March 2020 to 15 September 2021, there was a significant positive change in the air quality in Kolkata. Better air quality promotes better health conditions. Most existing research has focused on chemical or geographical analysis of this positive change. Chapter 19 presents an association rule-based analysis of the improvement in air quality in Kolkata during the lockdown period.
The proposed approach finds quality association rules among the prime pollutants in the air. The derived association rules express knowledge about the changing patterns of the pollutants during the lockdown period. Relevant experiments with real-life datasets are performed to justify the proposed approach. The results of the experiments and the allied discussions shed light on future strategies for controlling air pollution in Kolkata.

Complexity is prevalent in contemporary culture. As Industry 5.0 takes hold, people will encounter greater difficulties in accomplishing jobs in the future. Human inventiveness and intelligent machines will converge to produce more precise, user-friendly, and resource-efficient solutions. This transition envisages both meeting industry's production goals and keeping the planet's biodiversity in good shape. Key aspects of Industry 5.0, such as the creation of new types of jobs, the need for new skills, and the rapid development of technology, are becoming far too complicated. The digital revolution can help develop these skills in students. Chapter 20 investigates the extent to which primary proficiency parameters (e.g. motivational, intellectual, social and emotional, with one or more sub-parameters in each area) are perceived to have evolved in preparation for work in the Industry 5.0 era, using data collected from 198 students. The raw data are subjected to a battery of one-sample normality tests. The principal components are then extracted, and the factor loadings are analysed to determine how students in these programmes value various parts of their education. The findings are summarised in accordance with the conceptual framework and overall conclusions of the study.
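For readers unfamiliar with factor loadings, the following minimal sketch shows how the loadings on a first principal component can be obtained from a correlation matrix. It is an illustration only and not the procedure used in Chapter 20: the responses are synthetic, the three survey items are hypothetical, and power iteration is used here simply because it needs nothing beyond the Python standard library.

```python
import math

def correlation_matrix(rows):
    """Pearson correlation matrix of the columns of `rows` (equal-length lists)."""
    n, k = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(k)]
    sds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1))
           for j in range(k)]
    # Standardise every column, then average the cross-products.
    z = [[(r[j] - means[j]) / sds[j] for j in range(k)] for r in rows]
    return [[sum(zr[a] * zr[b] for zr in z) / (n - 1) for b in range(k)]
            for a in range(k)]

def first_component(corr, iterations=500):
    """Leading eigenvalue and loadings of `corr`, found by power iteration."""
    k = len(corr)
    v = [1.0 / math.sqrt(k)] * k
    for _ in range(iterations):
        w = [sum(corr[a][b] * v[b] for b in range(k)) for a in range(k)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # The Rayleigh quotient of the converged unit vector is the eigenvalue.
    eigval = sum(v[a] * sum(corr[a][b] * v[b] for b in range(k))
                 for a in range(k))
    # A loading is the eigenvector entry scaled by sqrt(eigenvalue).
    loadings = [x * math.sqrt(eigval) for x in v]
    return eigval, loadings

# Synthetic responses: rows are students, columns are three hypothetical items.
responses = [
    [4, 5, 2],
    [3, 4, 3],
    [5, 5, 1],
    [2, 3, 4],
    [4, 4, 3],
    [1, 2, 5],
]
corr = correlation_matrix(responses)
eigval, loadings = first_component(corr)
print("eigenvalue:", round(eigval, 3))
print("loadings:", [round(x, 3) for x in loadings])
```

Items whose loadings share a sign move together on the component, while an item with the opposite sign moves against them; the squared loadings add up to the eigenvalue, which is the amount of variance the component accounts for.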

Adverse drug reactions (ADRs) are side effects caused by misuse of drugs. They are described in various sources, including biomedical literature, drug reviews, and social media user posts. Natural language processing techniques have recently made it possible to identify ADRs automatically from documents. Chapter 21 presents a contextualised language model and various graphical representations. A contextualised graph is presented after adverse drug reactions have been identified. Finally, a zero-shot text classification algorithm is developed to determine whether patient reviews of specific drugs for specific conditions contain evidence of adverse reactions to the drug. A graph network associates drug names, conditions, and reviews in this working dataset.

Digital twin (DT) is envisioned as a critical technology that will revolutionise the manufacturing and maintenance of real-life gadgets in the near future. It is a collection of systems that goes beyond traditional simulation and analysis techniques. It converts a physical system's components, operations, dynamics, and firmware into a digital equivalent. Physical and digital systems share all inputs, outputs, and processes via real-time data connections and information transmission. Digital twin models improve human user efficiency, reduce wear on real-world systems, and promote creativity and innovation while posing the fewest operational risks in Industry 5.0. The authors discuss the concepts, applications, methodologies, and research scope of digital twins in detail in Chapter 22. The study contains a systematic review of the architecture and operation of digital twins along with a contextual taxonomy of application domains such as manufacturing, health care, and smart cities.

Mental health disorders are one of the most significant public health problems worldwide. Currently, one in every eight individuals suffers from some mental health issue.
Anxiety and depression are common manifestations of neurotic and psychotic problems, respectively. Screening and diagnosing mental health disorders in the population is a complex clinical task. The conventional approach of questionnaire-based interviews and detailed medical examination for screening and diagnosing anxiety and depression requires highly trained healthcare professionals such as psychiatrists or psychologists, as well as a significant amount of time and patience. Machine learning (ML) is a state-of-the-art technology in which computers learn to perform a task from data, so screening for mental health disorders can also be performed using ML algorithms. Attributes such as job profile, age, marital status, employment status, duration of service, working hours, and chronic disease conditions are selected to predict mental health disorders. In Chapter 23, three popular classification algorithms, i.e. decision tree, logistic regression, and random forest, are selected based on the literature review and evaluated on accuracy, precision, recall, and F1 score. A random forest model with hyperparameter tuning is found to be the best fit for this specific purpose.

Diabetes mellitus is a dangerous disease that raises blood glucose levels, which over time causes serious damage to body organs. It has no permanent cure, so early detection is essential. Diabetes also leads to many other conditions such as heart attack, brain stroke, and heart failure. Machine learning is widely used in this field for early detection. It has been successfully applied to various domains such as computer vision, computer graphics, natural language processing, speech recognition, computer networks and intelligent control, and health care. Machine learning and the Internet of Things help to improve the quality of human lives by automating some basic tasks. They can monitor individuals using applications and sensors attached to the body. Various machine learning algorithms have been implemented for diabetes prediction. In Chapter 24, the authors discuss the methodologies of diabetes prediction using different machine learning algorithms. They also propose a novel algorithm based on two blocks with a minimum of 13 layers that offers better accuracy, thus providing an effective way to predict diabetes.

The aim of the next development is to let more than one 5G NR device communicate with the 5G NR base station. In this situation, two 5G NR devices must coexist in uplink mode. In Chapter 25, to evaluate throughput and block error rate (BLER), one 5G NR device is fixed at 1.901 GHz and the other is variable. Maximum throughput and minimum BLER are obtained over the frequency range from 2.1599937 GHz to 2.1600014 GHz. The region of maximum throughput and minimum BLER denotes the reliability of the signal. As a result, two 5G NR devices can communicate with the 5G base station. When 4G LTE is replaced by 5G NR, the number of devices will increase massively. At that time, more than one 5G NR device will communicate with a 5G NR base station, and the facilities available to users will increase.

Because human experts may overlook significant signs, forest fires continue to destroy forest property, kill innocent animals, and cause considerable environmental pollution. Chapter 26 attempts to anticipate the causes of forest fires. In this study, the authors use a dataset of forest fires to conduct exploratory data analysis. It is found that two main factors, temperature and the drought code, have an impact on forest fires. The authors then develop a device for continuously monitoring the temperature and the drought code.

Artificial intelligence (AI) attempts to build robots that think and behave like people.
To understand more about the development of AI-related disciplines, a dataset available on different websites is used. The dataset contains employee parameters such as project size, expertise, job profile, and annual package. Chapter 27 aims to reflect on the future of data science and its disciplines in terms of employability rate. The salary packages of data science, engineering, and machine learning professionals are analysed. The employee parameters are the job title held in the year, experience, and the number of hours worked in a particular year. K-means clusters are formed on parameters such as project size and human expertise, and a forecast of the salary package is obtained. This information is important for predicting increases in the number of people in these specific jobs and allows their annual income to be examined further. A Naïve Bayes classifier is used for prediction and achieves 78% precision.

Over time, there has been a significant increase in text data, leading to an alarming rise in the spread of fake news globally. This has detrimentally impacted society, including the economy, politics, organisations, and individuals. To combat this issue, it is crucial to identify fake news early on. Fake news propagators often target innocent people, making it necessary to develop effective techniques for detecting and preventing the spread of fake news. One approach is to use supervised machine learning algorithms to classify news articles as true or false by analysing their language and features. In Chapter 28, natural language processing techniques and feature extraction techniques are implemented to analyse and predict the spread of fake news. The XGBoost algorithm yields the best results during testing, with an accuracy of 99.62%.

Traffic congestion is unavoidable in many cities in India and other countries, and it is an issue of major concern. The steep rise in automobiles on the roads, together with old infrastructure, accidents, pedestrian traffic, and traffic rule violations, all add to challenging traffic conditions. Given these poor traffic conditions, there is a critical need for automatic detection and signalling systems. Various technologies, such as video analysis, infrared, and wireless sensors, are already used for traffic management and signalling systems. The main issue with these methods is that they are costly and require high maintenance. In Chapter 29, the authors propose a three-phase system that can guide emergency vehicles and manage traffic based on the degree of congestion. In the first phase, the system processes the captured images and calculates the index value used to determine the degree of congestion. The index value of a particular road depends on its width and on the length of road that the camera captures; both are supplied as inputs when the system is set up. In the second phase, the system checks whether any emergency vehicles are present in any lane. In the third phase, processing and decision-making are performed at the edge server. The proposed model is robust and takes adverse weather conditions such as haze, fog, and wind into account. It also works very efficiently in low light conditions. The edge server is a strategically placed server that provides low latency and better connectivity. Using edge technology in this traffic management system reduces the strain on the cloud servers. The system becomes more reliable in real time because latency and bandwidth usage are reduced by processing at the intermediate edge server.
Technological advancement is largely based on artificial intelligence (AI), which has gained prominence in the food industry over the last few years. Patent analysis on "AI and Food Industry" in global databases indicates extensive applications of AI, especially in upgrading methods of food sorting, food safety, food nutrition, food delivery, and supply chain food processors. Chapter 30 analyses patent documents (from 2000 to 2019) which reveal significant research on AI reshaping the food industry. The major findings of the present patent analysis are based on patent searches in different paid and free databases. They establish that the major contributing jurisdictions for patent filing in this domain are China, the USA, Korea, and Japan. The study indicates that the largest share of the patents (26%) on AI in the food industry comes from the application of AI in cooking devices or kitchen appliances, followed by AI and health monitors (15%), the restaurant sector (13%), and the agriculture sector (10%). The statistics on domestic and international patent applications, the contributions of companies and universities, and current research trends have also been analysed.

The root mean square error (RMSE) measures the accuracy of fuzzy logic systems, with lower values indicating greater precision. The adaptive neuro-fuzzy inference system (ANFIS) is employed to minimise the RMSE. ANFIS algorithms are based on the traditional fuzzy logic models of Mamdani and Sugeno and use hard arithmetic operations when finding minimum and maximum values. The RMSE can be reduced several-fold by adding fuzzy rules to the inference. Furthermore, using soft arithmetic operations in the fuzzy system can reduce the RMSE without using the ANFIS models. Chapter 31 outlines the analytic expressions that demonstrate the proposed method's effectiveness.

Chapter 32 considers one of the most significant and frequently encountered problems of rapidly developing cities: inconsistent regulation of traffic lights at several consecutive sections of a road intersection. This problem is most relevant in cities with a rapidly growing population and, as a result, an increasing number of vehicles on public roads. The growing number of road users entails the risk of traffic accidents and congestion, which greatly complicates logistics for transport and other companies and thereby increases their expenses. To eliminate the identified problem, the chapter considers an approach that optimises the regulation of traffic flows based on periodic fuzzy graphs, that is, graphs in which the fuzzy adjacency of vertices changes in discrete time, while discrete time has the property of cyclicity.

Society 5.0 is a new human-centric society in which globalisation and the fast expansion of digital technologies such as robotics, artificial intelligence (AI), and the Internet of Things (IoT) have resulted in profound shifts in how people live and work together. To meet the new societal requirement, OpenAI launched ChatGPT in the fourth quarter of 2022; it is a new AI-based conversational chatbot, powered by a Large Language Model (LLM), that produces unique text in different languages. Much industrial automation in various Lines of Business (LOB) can be accomplished by leveraging ChatGPT's powerful AI capabilities to create smarter and more human-like innovations that increase customer satisfaction (CS).
Chapter 33 highlights various AI-enabled features and benefits of the engine and how they can be seamlessly embedded in industrial automation to drive business processes in banking and financial services (BFS), health care, customer relationship management (CRM), etc., in smarter and more innovative ways with the latest machine learning capabilities. It also emphasises how the delivery operations of an existing system can be upgraded or updated by enabling ChatGPT in its legacy software engines. The study also presents a comparison with other recently introduced language models that offer nearly similar benefits.

The primary target audience of this book is researchers, professors, graduate students, scientists, policymakers, professionals, and developers working in IT and ITeS, i.e. people working on intelligent human centered computing technologies. The book will also be very useful to professors who wish to adopt it as a text for computational intelligence-related studies or to supplement independent study projects. The audience also includes interested professionals and experts from the public and private computer, electronics, data science, and information technology industries.

Siddhartha Bhattacharyya
Jyoti Sekhar Banerjee
Debashis De
Mufti Mahmud

Organisation

International Advisory Committee Chairs

Panagiotis Sarigiannidis, University of Western Macedonia, Kozani, Greece
Leo Mrsic, Algebra University College, Croatia
Rajkumar Buyya, CLOUDS Lab, School of Computing and Information Systems, The University of Melbourne, Australia
M Shamin Kaisar, Institute of Information Technology, Jahangirnagar University, Dhaka, Bangladesh
Nabarun Bhattacharyya, Maulana Abul Kalam Azad University of Technology, West Bengal, India

International Advisory Committee Members

Badlishah Ahmad, Universiti Malaysia Perlis, Malaysia
Alessio Botta, University of Napoli Federico II, Italy
Miguel Camelo, University of Girona, Spain
Youcef Brik, USTHB, Algeria
Chi Cheng, Huazhong University of Science and Technology, China
Roshan Chhetri
Konstantin Glasman, St. Petersburg State University of Film and Television, Russia
Quansheng Guan, South China University of Technology, China
Byeong-jun, Korea University, South Korea
Lydia Chen, IBM Zurich Research Laboratory, Switzerland
Zhen Chen, Tsinghua University, China
Mohammad Hassan, King Saud University, Saudi Arabia
Horst Hellbruck, University of Applied Sciences Lubeck, Germany
Angelo Genovese, Universita degli Studi di Milano, Italy
Ehab Hussein, University of Babylon, Iraq
Konstantinos Katsaros, University College London, England
Shady Khalifa, Queen’s University, Northern Ireland
Hing Keung Lau, The Open University of Hong Kong, Hong Kong
Mangui Liang, Beijing Jiaotong University, China
Vincent Luk, The Government of Hong Kong SAR, Hong Kong
Mahdin Mahboob, University of Liberal Arts Bangladesh, Bangladesh
Kennedy Offor, Anambra State University, Nigeria
Rodolfo Oliveira, Nova University of Lisbon, Portugal
Mehmet Tukel, Istanbul Technical University, Turkey
Manoj Vadakkepalisseri, Mekelle Institute of Technology, Ethiopia

Technical Programme Committee

Celia Shahnaz, Bangladesh University of Engineering and Technology (BUET) Dhaka
Sheng-Lung Peng, National Dong Hwa University, Hualien, Taiwan
Seunghun Yoo, Samsung Electronics, North Carolina, USA
Hesam Yousefian, Roozbeh Institute of Higher Education, Iran
Peiyan Yuan, Beijing University of Posts and Telecommunications, China
Chen Yunfang, Nanjing University of Posts and Telecommunications, China
Manzil Zaheer, Indian Institute of Technology Guwahati, India
Rafi Zaman, Muffakham Jah College of Engineering and Technology, India
Deze Zeng, School of Computer Science and Engineering, The University of Aizu, Japan
Wei Zeng, Florida International University, Florida
Haibo Zhou, Shanghai Jiao Tong University, China
Mu Zhou, Chongqing University of Posts and Telecommunications, China
Bartlomiej Zielinski, Silesian University of Technology, Poland
Kayhan Zrar Ghafoor, University of Koya, Iraq
Subha Ghosh, Maulana Abul Kalam Azad University of Technology, West Bengal, India
Kunwar Vaisla, BT Kumaon Institute of Technology, India
Nicholas Valler, CrowdCompass, Portland
Rambabu Vatti, Sinhgad College of Engineering, India
Francisco Vazquez-Gallego, Centre Tecnologic de Telecomunicacions de Catalunya (CTTC), Spain
Hao Wang, Tsinghua University, China
Jian Wang, Jilin University, China
Jiayin Wang, Washington University, Missouri
Ke Wang, Tsinghua University, China
Hadi Omar, MIMOS Berhad, Malaysia
Yahya Osais, King Fahd University of Petroleum and Minerals, Saudi Arabia
Amjad Osmani, Qazvin branch Azad university, Iran

Zhipeng Ouyang, OPNET, USA
Bhadoria P. B. S., Indian Institute of Technology, Kharagpur, India
Jose Pacheco, Universidad de Carabobo, Venezuela
Nikita Lyamin, Halmstad University, Sweden
Deepak M, Indian Institute of Space Science and Technology, India
Xiaolin Ma, Wuhan University of Technology, China
Saida Maaroufi, Ecole Polytechnique de Montreal, Canada
Douglas Macedo, Federal University of Santa Catarina, Brazil
Mahdin Mahboob, University of Liberal Arts Bangladesh, Bangladesh
Milad Mahdavi, Islamic Azad University of Qazvin, Iran
Mohammed Abdo Mohammed Mahdi, Universiti Sains Malaysia, Malaysia
Ljubomir Jacic, Technical College Pozarevac, Serbia
Kuldeep Jadon, Institute of Technology and Management, India
Rakesh Jadon, Madhav Institute of Technology & Science, Gwalior, India
Mohsen Jahanshahi, Islamic Azad University, Iran
Ankit Jain, Swami Vivekanand College of Engineering, India
Kamal Kant, AMITY University Noida, India
Theo Kanter, Stockholm University, Sweden
Murizah Kassim, Universiti Teknologi MARA, Malaysia
Chi Harold Liu, IBM Research, China
Nishakumari Lodha, North Maharashtra University, India
Thanasis Loukopoulos, Technological Educational Institute of Lamia, Greece
Bhanu Kaushik, University of Massachusetts, USA
Sumit Kaushik, Kurukshetra University, India
Ken-ichi Kawarabayashi, National Institute of Informatics, Japan
Mounir Kellil, CEA LIST, France
Xiaoya Hu, Huazhong University of Science and Technology, China
Zhaozheng Hu, Georgia Institute of Technology, Georgia
Pingguo Huang, Tokyo University of Science, Japan
Philippe Hunel, University of Antilles Guyane, Guadeloupe
Christian Esposito, ICAR - CNR, Italy
Enver Ever, Middle East Technical University, Turkey
Leila Falek, Bab Ezzouar Alger, Algeria
Adriano Goes, State University of Campinas - UNICAMP, Brazil
Spyridon Gogouvitis, National Technical University of Athens, Greece
Abhijeet Gole, Ruia College, India

Glauco Goncalves, Federal Rural University of Pernambuco, Brazil
Sachin Goyal, RGPV Bhopal, India
Bo Gu, Waseda University, Japan
Daniela Castelluccia, University of Bari, Italy
Eduardo Cerqueira, Federal University of Para, Brazil
Chinmay Chakraborty, Birla Institute of Technology, India
Woo Chaw Seng, University of Malaya, Malaysia
Mohamed Boucadair, France Telecom, France
Reinaldo Braga, Federal University of Ceara, Brazil
Raouyane Brahim, INPT, Morocco
Indranil Sengupta, VC-JIS University, India
Abdull Gani, University of Malaysia, Malaysia
Jemal Hussien, Deakin University, Australia
Mohamed A. Azim, Taibah University, UAE
Ravinder Agarwal, Thapar University, India
Lucio Agostinho, University of Campinas, Brazil
Abhishek Bhattacharya, Institute of Engineering & Management, India
Angsuman Sarkar, Kalyani Government Engineering College, India
Arpita Chakraborty, Bengal Institute of Technology, India
Biswapati Jana, Vidyasagar University, India
Dulal Acharjee, Purushottam Institute of Engineering & Technology, India
Indrajit Pan, RCC Institute of Information Technology, Kolkata, India
Santanu Padhikar, MAKAUT, Kolkata, India
Abhishek Basu, RCC Institute of Information Technology, Kolkata, India
Ankan Bhattacharya, Mallabhum Institute of Technology, Bishnupur, India
Renjith V. Ravi, M.E.A. Engineering College, India
Ehtiram Khan, Jamia Hamdard University, India
Debabrata Samanta, CHRIST (Deemed to be University), Bangalore, India
Sourav De, Cooch Behar Government Engineering College, India
Jadav Chandra Das, MAKAUT, Kolkata, India
Rik Das, Siemens, India
Nitish Pathak, Guru Gobind Singh Indraprastha University (GGSIPU), New Delhi, India
Soumen Mukherjee, RCC Institute of Information Technology, West Bengal, India

Asia-Pacific Artificial Intelligence Association (AAIA), Kolkata Branch Advisory Committee Members

Sankar K. Pal, Indian Statistical Institute, Kolkata, India
Nikhil Ranjan Pal, Indian Statistical Institute, Kolkata, India
Ujjwal Maulik, Jadavpur University, India
Anupam Basu, Indian Institute of Technology Kharagpur, India
Amit Konar, Jadavpur University, India
Bidyut Baran Chaudhuri, Indian Statistical Institute, Kolkata, India
Sushmita Mitra, Indian Statistical Institute, Kolkata, India
Bijaya Ketan Panigrahi, Indian Institute of Technology Delhi, India
Ponnuthurai Nagaratnam, Nanyang Technological University, Singapore
Elizabeth C. Behrman, Wichita State University, USA
Rajkumar Buyya, The University of Melbourne, Australia
Saman K. Halgamuge, The University of Melbourne, Australia
Aboul Ella Hassanien, Cairo University, Egypt
Mario Koeppen, Kyushu Institute of Technology, Japan
Rudolf Kruse, Otto-von-Guericke University Magdeburg, Germany
Janusz Kacprzyk, Polish Academy of Sciences, Poland
Bhabani P. Sinha, Indian Statistical Institute, Kolkata, India
Malay Kumar Kundu, Indian Statistical Institute, Kolkata, India
Vincenzo Piuri, University of Milan, Italy
Bhargab B. Bhattacharya, Indian Statistical Institute, India
Wei-Chang Yeh, National Tsing Hua University (NTHU), Taiwan
Punam Kumar Saha, University of Iowa, USA
Pawan Lingras, Saint Mary’s University, Halifax, Nova Scotia, Canada
Sukumar Nandy, Indian Institute of Technology Guwahati, India
L. M. Patnaik, Indian Institute of Science, Bangalore, India
K. V. S. Hari, IISc Bangalore, India

Organising Committee

Honorary Chair(s)

Panagiotis Sarigiannidis, University of Western Macedonia, Kozani, Greece
Ivan Zelinka, VSB Technical University of Ostrava, Czech Republic
Leo Mrsic, Algebra University College, Croatia

Chief Patron(s)

Goutam Roychowdhury (Honourable Chancellor), Techno India University, West Bengal, India
Manoshi Roychowdhury (Honourable Co-chancellor), Techno India University, West Bengal, India

Patron(s)

Goutam Sengupta (Honourable Vice Chancellor), Techno India University, West Bengal, India
Saikat Maitra (Honourable Vice Chancellor), Maulana Abul Kalam Azad University of Technology (MAKAUT), West Bengal, India
Manas Kumar Sanyal (Honourable Vice Chancellor), Kalyani University, West Bengal, India
Rina Paladhi (Director), Techno India University, West Bengal, India

General Chair(s)

Siddhartha Bhattacharyya, Rajnagar Mahavidyalaya, Birbhum, India and Algebra University College, Zagreb, Croatia
Mufti Mahmud, Nottingham Trent University, UK

Organising Chair

Subhabrata Roychaudhuri (Chairman), Computer Society of India, Kolkata Chapter, India

Organising Co-chair(s)

D. P. Sinha (Fellow), Computer Society of India, Kolkata Chapter, India
Subimal Kundu (Fellow), Computer Society of India, Kolkata Chapter, India
Phalguni Mukherjee (Fellow), Computer Society of India, Kolkata Chapter, India
Sib Daspal (Fellow), Computer Society of India, Kolkata Chapter, India
Aniruddha Nag (Former Chairman), Computer Society of India, Kolkata Chapter, India
Snehasis Banerjee, TCS Research & Innovation, Kolkata, India

Programme Chair(s)

Debashis De, Maulana Abul Kalam Azad University of Technology, West Bengal, India
Jyoti Sekhar Banerjee, Bengal Institute of Technology, West Bengal, India

Programme Co-chair(s)

Abhishek Das, Aliah University, India
Diganta Sengupta, MSIT, Kolkata, India
Anwesha Mukherjee, Department of Computer Science, Mahishadal Raj College, India
Khondekar Lutful Hassan, Department of CSE, Aliah University, India

Organising Committee Member(s)

Ishan Ghosh, Computer Science and Engineering, Techno India University, West Bengal, India
Abhishek Majumader, Computer Science and Engineering, Techno India University, West Bengal, India
Avijit Gayen, Computer Science and Engineering, Techno India University, West Bengal, India
Sanjoy Bhattacharjee, Computer Science and Engineering, Techno India University, West Bengal, India
Arnab Mandal, Computer Science and Engineering, Techno India University, West Bengal, India
Gopal Purkait, Computer Application, Techno India University, West Bengal, India
Moutushi Singh, IEM Kolkata, India
Nilanjana Dutta Ray, TINT, Newtown, West Bengal, India
Prantosh Pal, Raigunj University, India
Soumya Pal, St. Mary’s Engineering College, Kolkata, India
Nabajit Chakravarty (MC Member), Computer Society of India, Kolkata Chapter, India
Swaraj Kumar Nath (MC Member), Computer Society of India, Kolkata Chapter, India
Samir Mandal (MC Member), Computer Society of India, Kolkata Chapter, India
Joydeep Ghosh (MC Member), Computer Society of India, Kolkata Chapter, India
Sharmila Ghosh (MC Member), Computer Society of India, Kolkata Chapter, India
Dibyendu Biswas (MC Member), Computer Society of India, Kolkata Chapter, India

Industry Chair(s)

Tanushyam Chattopadhyay (Associate General Manager), Department of Artificial Intelligence, Adani Enterprise Limited, India
Anupam Dutta (Partner), PWC Kolkata, India

Industry Academic Partnership Chair

Partha Sarkar, TCS Academic Relationship Manager, India

Finance Chair

Rajat Kanti Chatterjee (Treasurer), Computer Society of India, Kolkata Chapter, India

Registration Chair

Sudipta Sahana (MC Member), Computer Society of India, Kolkata Chapter, India

Publicity Chair(s)

Sourav Chakraborty (Immediate Past Chairman), Computer Society of India, Kolkata Chapter, India
Gautam Hajra (Former Chairman), Computer Society of India, Kolkata Chapter, India

Contents

An Intelligent Approach for Brain Tumor Classification Using Different CNN Variants
Sumit Das, Manas Kumar Sanyal, and Diprajyoti Majumdar

Effective Estimation of Relationship Strength Among Facebook Users Applying Pearson Correlation and Jaccard’s Coefficient
Deepjyoti Choudhury and Tapodhir Acharjee

Taxonomy of Music Genre Using Machine Intelligence from Feature Melting Technique
Debashri Das Adhikary, Tanushree Dey, Somnath Bera, Sumita Guchhhait, Utpal Nandi, Mehadi Hasan, and Bachchu Paul

Optimization of Intraday Trading in F&O on the NSE Utilizing BOLLINGER BANDS
Joyjit Patra, Mimo Patra, and Subir Gupta

Identification of Mental State Through Speech Using a Deep Learning Approach
Somnath Bera, Tanushree Dey, Debashri Das Adhikary, Sumita Guchhhait, Utpal Nandi, Nuruzzaman Faruqui, and Bachchu Paul

Intellectual Property in Human Genomics in India
Aranya Nath and Gautami Chakravarty

Performance of Automated Machine Learning Based Neural Network Estimators for the Classification of PCOS
Pijush Dutta, Shobhandeb Paul, Arindam Sadhu, Gour Gopal Jana, and Pritam Bhattacharjee

Job Recommendation a Hybrid Approach Using Text Processing
Dipanwita Saha, Dinabandhu Bhandari, and Gunjan Mukherjee

A Study to Investigate the Existence of Monolexemic Colour Terms in Dravidian Languages: A Visual Psychophysics Approach
Male Shiva Ram, B. R. Shamanna, Rishi Bhardwaj, P. Phani Krishna, S. Arulmozi, and Chakravarthy Bhagvati

Deep Artificial Neural Network Based Blind Color Image Watermarking
Sushma Jaiswal and Manoj Kumar Pandey

IoT Based Automatic Temperature Screening & Alert System for Symptomatic COVID-19 Detection
Madhurima Chattopadhyay, Debjyoti Chowdhury, and Kumar Harsh

Boosting Machine Learning Algorithm to Classify Road Conditions for Maintenance Strategy of Flexible Pavements
Gurpreet Kaur and Rajiv Kumar

Hyper Parameterized LSTM Models for Predicting NSE Intraday Bias Based on Global Market Trends
Tamoghna Mukherjee, Subir Gupta, and Anirban Mitra

Malicious URL Classification Using Machine Learning
Trinanjan Daw, Pourik Saha, Mainak Sen, Khokan Mondal, and Amlan Chakrabarti

Prognostic Stage Classification for Invasive Breast Cancer by Analysing Affected Lymph Node
Sweta Manna and Sujoy Mistry

Study of Task Scheduling Algorithms for Energy Minimization in a Cloud Computing Environment
Sanna Mehraj Kak, Parul Agarwal, M. Afshar Alam, and Ahmed J. Obaid

Human Stress Detection from SWCT EEG Data Using Optimised Stacked Deep Learning Model
Akshay Jadhav, Lokesh Malviya, Shishir Kumar Shandilya, and Sandip Mal

Impact of Carbon Emission Policies on an Imperfect EOQ Model Under Cloud Fuzzy Environment
Srabani Shee and Tripti Chakrabarti

Rule-Based Investigation on Positive Change in Air Quality at Kolkata During Lockdown Period Due to Covid-19 Pandemic
Atreyee Datta, Khondekar Lutful Hassan, and Krishan Kundu

Performance Analysis of Professional Higher Education Programmes Driven by Students Perception: A Latent Variable Computation Model for Industry 5.0
Bhaswati Roy, Sandip Mukherjee, Niloy Kumar Bhattacherjee, Sayanti Samanta, and Subir Gupta

Graph Based Zero Shot Adverse Drug Reaction Detection from Social Media Reviews Using GPT-Neo
Arijit Dey, Jitendra Nath Shrivastava, and Chandan Kumar

Digital Twin for Industry 5.0: A Vision, Taxonomy, and Future Directions
Anusua Mazumder, Partha Sarathi Banerjee, Amiya Karmakar, Pritam Ghosh, Debashis De, and Houbing Song

Application of Machine Learning Technology for Screening of Mental Health Disorder
Arkaprabha Sau, Santanu Phadikar, and Ishita Bhakta

Novel Machine Learning Techniques for Diabetes Prediction
Mehboob Zahedi, Samit Bhajna, and Abhishek Das

Analysis of BLER and Throughput During the Coexistence of Two 5G NR
Jayanta Kumar Ray, Rabindranath Bera, Sanjib Sil, and Quazi Mohmmad Alfred

An Assessment of Forest Fire Dataset Using Soft Computing Technique
Adnan Ahmad, Md Adib, Anjali Sharma, Dharmpal Singh, Ira Nath, Sudipta Sahana, and Nirbhay Mishra

Predicting the Data Science Employability Rate Using Data Mining Techniques
Ritika Mishra, Karan Philip, and Prafulla Bafna

Classification of Online Fake News Using N-Gram Approach and Machine Learning Techniques
Atanu Sarkar, Anil Bikash Chowdhury, and Mauparna Nandan

An Edge Assisted Robust Smart Traffic Management and Signalling System for Guiding Emergency Vehicles During Peak Hours
Shuvadeep Masanta, Ramyashree Pramanik, Sourav Ghosh, and Tanmay Bhattacharya

Patent Analysis on Artificial Intelligence in Food Industry: Worldwide Present Scenario
Rosalima Gupta, Mahuya Hom Choudhury, Mufti Mahmud, and Jyoti Sekhar Banerjee

Research of the Influence of the Fuzzy Rules Number on the Learning of a Neuro-Fuzzy System
Bobyr Maxim, Milostnaya Natalia, Khrapova Natalia, and Suprunova Olga

Optimization of Traffic Flow Based on Periodic Fuzzy Graphs
Sergey Gorbachev, Alexander Bozhenyuk, and Polina Nikashina

ChatGPT: A OpenAI Platform for Society 5.0
Chandan Pan, Jyoti Sekhar Banerjee, Debashis De, Panagiotis Sarigiannidis, Arpita Chakraborty, and Siddhartha Bhattacharyya

Author Index

About the Editors

Dr. Siddhartha Bhattacharyya [FRSA, FIET (UK), FIEI, FIETE, LFOSI, SMIEEE, SMACM, SMAAIA, SMIETI, LMCSI, LMISTE] is currently the principal of Rajnagar Mahavidyalaya, Birbhum, India. He is also serving as a scientific advisor at Algebra University College, Zagreb, Croatia. Prior to this, he was a professor at CHRIST (Deemed to be University), Bangalore, India. He also served as the principal of RCC Institute of Information Technology, Kolkata, India, and as a senior research scientist at VSB Technical University of Ostrava, Czech Republic. He is the recipient of several coveted national and international awards. He received the Honorary Doctorate Award (D. Litt.) from the University of South America and the SEARCC International Digital Award ICT Educator of the Year in 2017. He is a co-author of 6 books and the co-editor of 98 books and has more than 400 research publications in international journals and conference proceedings to his credit.

Dr. Jyoti Sekhar Banerjee is currently serving as the head of the Department of Computer Science and Engineering (AI and ML) at the Bengal Institute of Technology, Kolkata, India. He is also the professor in-charge of the R & D and Consultancy Cell and the nodal officer of the IPR Cell of BIT. Dr. Banerjee did his Postdoctoral Fellowship at Nottingham Trent University, UK, in the Department of Computer Science. He also completed the postgraduate diploma in IPR & TBM from MAKAUT, WB. He has teaching and research experience spanning 18 years and has completed one IEI-funded project. He is the lead author of “A Text Book on Mastering Digital Electronics: Principle, Devices, and Applications” and the co-editor of six books. He has also co-authored another book and is currently working on three edited books with reputed international publishers like Springer, CRC Press, De Gruyter, etc.

Debashis De is a professor in the Department of Computer Science and Engineering at the Maulana Abul Kalam Azad University of Technology, West Bengal, India. He received an M.Tech. from the University of Calcutta in 2002 and a Ph.D. from Jadavpur University in 2005. He is a senior member of IEEE, a fellow of IETE, and a life member of CSI. He was awarded the prestigious Boyscast Fellowship by the Department of Science and Technology, Government of India, to work at the Heriot-Watt University, Scotland, UK. He received the Endeavour Fellowship Award from DEST Australia for 2008 to 2009 to work at the University of Western Australia. He received the Young Scientist award both in 2005 in New Delhi and in 2011 in Istanbul, Turkey, from the International Union of Radio Science, Belgium. In 2016, he received the JC Bose research award from IETE, New Delhi.

Mufti Mahmud is an associate professor of Cognitive Computing at the Computer Science Department of Nottingham Trent University (NTU), UK. He has been listed among the top 2% cited scientists worldwide in computer science (2020 and 2021) and is a recipient of the NTU VC outstanding research award 2021 and the Marie-Curie postdoctoral fellowship. He is the coordinator of the Computer Science and Informatics research excellence framework unit of assessment at NTU and the deputy group leader of the Cognitive Computing & Brain Informatics and the Interactive Systems research groups. His research portfolio consists of GBP3.3 million grant capture with expertise that includes brain informatics, computational intelligence, applied data analysis, and big data technologies focusing on healthcare applications. He has over 15 years of academic experience and over 200 peer-reviewed publications. He is the general chair of the Brain Informatics conference 2020, 2021, and 2022; the Applied Intelligence and Informatics conference 2021 and 2022; and Trends in Electronics and Health Informatics 2022. He has been the chair of the IEEE CICARE symposium since 2017 and was the local organising chair of the IEEE WCCI 2020. He will serve as one of the general chairs of the 31st edition of the ICONIP conference to be held in Auckland (NZ) in 2024. He is the section editor of Cognitive Computation, the regional editor (Europe) of the Brain Informatics journal, and an associate editor of Frontiers in Neuroscience. He has been serving as the chair of the Intelligent System Application and Brain Informatics Technical Committees of the IEEE Computational Intelligence Society (CIS), a member of the IEEE CIS Task Force on Intelligence Systems for Health, an advisor of the IEEE R8 Humanitarian Activities Subcommittee, the publications chair of the IEEE UK and Ireland Industry Applications Chapter, the project liaison officer of the IEEE UK and Ireland SIGHT Committee, the secretary of the IEEE UK and Ireland CIS Chapter, and the Social Media and Communication officer of the British Computer Society’s Nottingham and Derby Chapter.

An Intelligent Approach for Brain Tumor Classification Using Different CNN Variants

Sumit Das1(B), Manas Kumar Sanyal2, and Diprajyoti Majumdar1

1 JIS College of Engineering, Information Technology, Kalyani 741235, India
[email protected]
2 Business Administration, University of Kalyani, Kalyani 741235, India
[email protected]

Abstract. A brain tumor is a growth of abnormal cells inside the brain. A tumor may be either malignant (cancerous) or benign (noncancerous). Malignant tumors grow rapidly and affect the brain by creating pressure inside the skull. This pressure is dangerous for the normal functioning of the human body, so early detection of these tumors is a primary task of the physician, and effective treatment of brain cancers depends on early identification. One of the most widely used scanning techniques in neurology is magnetic resonance imaging (MRI). MRI uses a strong magnetic field together with radiofrequency pulses to stimulate the target tissue. High soft-tissue contrast and no exposure to ionizing radiation are two of its benefits. The major objective of this study is to develop an automatic and efficient binary classification approach for brain tumor MRI images to help doctors make decisions. The proposed approach estimates the accuracy of several self-created variants of a widely used CNN architecture, the Residual Network 50 (ResNet50) model.

Keywords: Deep Learning · Brain tumor classifier · Residual Net Architecture · Ensemble learning · K-Fold Cross Validation

1 Introduction

There are 150 different forms of brain tumours, but they are most commonly divided into two groups: primary and metastatic. Primary tumours originate from the tissue of the brain, whereas metastatic tumours originate from cancer elsewhere in the body, such as lung cancer, and reach the brain through the bloodstream. Once a tumor is detected, surgery and radiation are the treatments carried out to cure the patient. Even a benign tumor is dangerous and can cause death if it grows and puts pressure on the nerves. The most common type of malignant brain tumor is glioma, which originates from glia, the supporting cells of the brain, or from the cerebrum. The World Health Organization (WHO) devised a grading system to indicate the aggressiveness of tumours based on their histological characteristics. Brain tumors arise when genes on the chromosomes are damaged or do not function properly. The symptoms of brain tumor patients include lethargy, vomiting, headaches, and many more. In addition to surgery, chemotherapy, radiation therapy, and radiosurgery are available treatments [1].

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 S. Bhattacharyya et al. (Eds.): Human 2023, STHC, pp. 1–14, 2023. https://doi.org/10.1007/978-981-99-3478-2_1

Deep learning is a more sophisticated form of machine learning that is important for improving prediction accuracy and prognosis identification. Machine learning techniques such as linear regression, decision trees, k-nearest neighbours, random forests, support vector machines, and artificial neural networks provide accuracy ranging from low to high and are progressively more complicated to understand. Since 2015, researchers' use of these techniques has grown exponentially, and they are transforming every corner of the industry, especially prognosis, diagnosis, and treatment in healthcare [2]. The rest of the paper is structured as follows. The Literature Review section examines the papers connected to our study, along with the gaps in the related work. The Methodologies section discusses the processes, the steps of the proposed method, the background study, and the implementation. The Experiment and Analysis section describes some self-created variants and an experimental analysis of advanced Residual Network models (that is, variants of the ResNet identity block and convolutional block) on MRI image data. The Results and Discussion section presents the comparative study between the selected architectures and what is achieved by the research. The Conclusion and Future Scope section concludes which model gives satisfactory accuracy and speculates on its scope in medical diagnosis.

2 Literature Review

Medical diagnosis using artificial intelligence (AI) and machine learning (ML) is regarded as one of the most important problems in computing systems. Researchers have tried to use a machine learning approach to determine whether or not a brain image contains a neoplasm. The results showed that the Artificial Neural Network (ANN) approach is extremely promising [3]. In that article, the classification of tumours as malignant or benign is carried out using a backpropagation neural network and the grey level co-occurrence feature extraction technique. Other researchers present and discuss methods for segmenting brain tumours in multidimensional imaging such as MRI, CT (computed tomography), and PET (positron emission tomography). They expect future developments in this field, building on the breakthroughs already noted, to demonstrate significant potential for brain tumour segmentation procedures [4]. K-Nearest Neighbor (K-NN), an intelligent method for classifying tumours in MRI images, is used to provide accurate diagnosis of severe diseases. According to the experimental findings, the method obtains 100% classification accuracy when using K-NN and 98.92% when using ANN [5]. Through the use of segmentation techniques in image processing, a complex tumour dataset may be efficiently segmented. Combining FLAIR (fluid attenuated inversion recovery) with a T2 sequence image and applying fuzzy C-means results in a more pronounced tumour boundary. By applying the statistical feature kurtosis and the texture feature energy, the tumour portion can be distinguished from surrounding tissue, and segmentation accuracy is improved, as shown by the Dice overlap and the Jaccard index [6].

The extensive study of the literature shows that AI-enabled expert systems [7–10], Soft Computing [11–13], and Machine Learning (ML) techniques [14–16] are playing a significant role in this advanced digital era in the field of medical prognosis, diagnosis, and treatment as well as healthcare management. One study looked into how effective trained convolutional neural networks (CNNs) are as feature extractors. The Generative Adversarial Network and the Capsule Network are also used for feature extraction in the analysis of medical images. In the most recent pipelines, CNNs were found to have taken the place of classical machine learning algorithms in real-world feature extraction from medical images [17]. It takes a specialist a long time to diagnose a magnetic resonance image, but with the development of computer vision and machine learning a diagnosis strategy can be built that is quicker than the conventional method. The algorithmic software being developed has four stages: capture of MRI images, preprocessing, feature extraction, and classification. Magnetic resonance images of the brain with neoplasm were classified using the bag of features, and an average classification accuracy of 97% was attained across all categories [18]. Age-related increases in the prevalence of brain tumours have led to the current clinical use of computerized tomography (CT), MRI, and PET imaging tools to aid doctors in removing dangerous tumours precisely through surgery. To predict the labels of features, a deep convolutional neural network and a support vector machine are trained; to train the deep classifier, the support vector machine and the deep convolutional neural network are pipelined in series [19]. For the categorization of brain tumours, a large number of researchers have adopted pre-trained CNN models, mostly trained using the transfer learning approach. For instance, Çinar and Yildirim used a modified version of the pre-trained ResNet-50 CNN model for brain tumour identification, replacing the model’s last 5 layers with 8 new layers.
With their improved CNN model, they reached 97.2% accuracy on MRI scans [20]. Similarly, Khawaldeh et al. (2017) suggested a modified version of the AlexNet CNN model to classify brain MRI images as healthy, low-grade glioma, or high-grade glioma. Using 4069 brain MRI images, an overall accuracy of 91.16% was attained [21]. Talo et al. recommended the pre-trained ResNet-34 CNN model to identify brain cancers from MRI scans. Although they reported a detection accuracy of 100%, they only employed 613 images for the deep learning model, which is not a large quantity for machine learning studies [22]. Rehman et al. suggested classifying brain tumours into glioma, meningioma, and pituitary types using three well-known pre-trained CNN models: AlexNet, GoogleNet, and VGG16. Using the same transfer learning strategy, VGG16 achieved the highest accuracy of 98.69%. They made use of 3064 brain MRI scans from 233 patients [23]. Mehrotra et al. (2020) used 696 T1-weighted MRI scans to categorise brain tumour images as malignant or benign using a deep learning-based transfer learning technique. For the classification investigation, the most well-known CNN models, including ResNet-101, ResNet-50, GoogleNet, AlexNet, and SqueezeNet, were used. With the aid of transfer learning using the pre-trained AlexNet CNN model, they attained the greatest accuracy of 99.04% [24].

Additionally, some researchers classify brain tumours using a combination of deep neural networks and other cutting-edge techniques. For instance, Mohsen et al. classified brain MRI images into four classes (normal brain, glioblastoma, sarcoma, and metastatic bronchogenic carcinoma tumors) using a deep neural network (DNN) classifier in conjunction with the discrete wavelet transform (DWT) and principal component analysis (PCA). An accuracy of 96.97% was achieved [25]. Kabir Anaraki et al. proposed CNN and genetic algorithm (GA)-based approaches for noninvasively classifying various grades of glioma using MRI images. They classified three glioma grades with an accuracy of 90.92%, and glioma, meningioma, and pituitary tumour types with an accuracy of 94.23% [26]. Yang et al. looked at the general impact of CNNs trained with transfer learning and fine-tuning for noninvasively classifying low-grade and high-grade gliomas from conventional MRI images. Using pre-trained GoogleNet and pre-trained AlexNet, they attained accuracies of 86.67% and 87.4%, respectively [27]. For the challenge of classifying and grading gliomas from raw pathology images, Ertosun and Rubin created a deep learning pipeline including ensemble learning of CNNs. Their approach was regarded as being highly effective. However, they ran into the issue of data scarcity, a difficulty that deep learning systems frequently encounter. For the HGG vs. LGG classification task, they achieved 96.53% accuracy, while for the LGG Grade I vs. LGG Grade II classification task, they attained 71% accuracy [28].

These are some of the popular related works since 2015 in the field of detection or classification of brain tumours from MRI images. The review of the literature helped the authors acquire knowledge of recent research and developments, and it also shows the research direction of the ongoing work on brain tumor classification and risk prediction.

3 Methodologies

The three image datasets of brain tumours used in this paper were gathered from Kaggle. The Br35H dataset contains slightly contrast-enhanced images: almost 3060 MRI images, of which 1560 are labeled “yes” and 1500 are labeled “no” [29]. The second dataset contains almost 253 images, of which 98 are labeled “no” and 155 are labeled “yes”. Every image is slightly contrast enhanced and was scanned by radiologists [30]. The third dataset, also from Kaggle, consists of 170 normal and 230 tumor images captured from 400 patients [31]. Sample images with a brain tumor (Yes) and without a brain tumor (No) from the datasets are shown in Fig. 1.

Fig. 1. Sample image with brain tumor (Yes) and without brain tumor (No)

An ensemble-based system is created by combining diverse models that solve the same kind of problem. As a result, these systems are sometimes referred to as ensemble systems or multiple classifier systems. For instance, individuals frequently consult several doctors before deciding on a medical procedure, read user reviews before making a big-ticket purchase, and check references on potential hires. Even this article underwent multiple expert reviews before it was approved for publication. In each case, the separate judgments of various specialists are combined to reach a final judgment. The main objective is to reduce the risk of an unfortunate choice: an unneeded medical procedure, a subpar product, or a badly qualified employee [32].

The pre-trained ResNet50 model from Keras Applications, which has been widely adopted and approved by researchers working on image classification tasks and medical image analysis, is used in this work. With its skip-connection-based identity blocks, ResNet has already resolved the problems that deep hidden layers cause in convolutional neural networks. In theory, adding hidden layers to a neural network should increase its accuracy, but in practice this does not happen. Network depth cannot be increased merely by stacking layers on top of one another. Since the gradient is back-propagated to earlier layers and may become arbitrarily small through repeated multiplication, deep networks are notoriously difficult to train. As a result, Fig. 2 shows that as the network grows deeper, its performance becomes saturated or even starts to decline quickly. Figure 3 illustrates the fundamental concept of ResNet, which is the introduction of an “identity shortcut link” that skips one or more layers.

Fig. 2. Increasing network depth leads to worse performance [33]

Fig. 3. A Residual Block Architecture [33]
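The effect of repeated multiplication on the back-propagated gradient, and how an identity shortcut changes it, can be illustrated with a scalar toy model. This is only a sketch under strong assumptions that are not from the paper: each layer is reduced to a single local derivative `d`, so a plain stack multiplies the factors `d`, while a residual block `y = x + F(x)` contributes a factor `1 + d`. The function names are hypothetical.

```python
def plain_gradient(local_grads):
    """Gradient through a plain stack of layers: product of local derivatives."""
    grad = 1.0
    for d in local_grads:
        grad *= d
    return grad


def residual_gradient(local_grads):
    """Gradient through residual blocks y = x + F(x).

    Each block contributes (1 + d): the 1 comes from the identity shortcut,
    the d from the residual branch F.
    """
    grad = 1.0
    for d in local_grads:
        grad *= 1.0 + d
    return grad


if __name__ == "__main__":
    small = [0.01] * 50  # 50 layers, each with a small local derivative (assumed)
    print("plain   :", plain_gradient(small))     # shrinks towards zero
    print("residual:", residual_gradient(small))  # stays of order one
```

In this toy setting the shortcut keeps a direct path for the gradient, which is the intuition behind the identity shortcut described above; real networks involve Jacobian matrices rather than scalars.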

In this technique, ResNet50 along with 4 variants of the residual identity block has been designed for research. The established ResNet50 architecture is given in Fig. 4.

Fig. 4. The established Resnet50 architecture

The interior designs of the convolution block and the identity block are depicted in Figs. 5 and 6, respectively.

Fig. 5. Convolution block

Fig. 6. Identity block- skip connection over 3 layers

In the diagrams of the convolutional block and the identity block, the original layers are arranged as follows:

X_shortcut = X (saving the input);
X → Conv2d → BN → ReLU → Conv2d → BN → ReLU → Conv2d → BN → (X + X_shortcut) → ReLU

a) The first variant is designed by adding the batch normalization layer after the addition and before the ReLU. The new diagram is:

X_shortcut = X (saving the input);
X → Conv2d → BN → ReLU → Conv2d → BN → ReLU → Conv2d → (X + X_shortcut) → BN → ReLU → X’

This technique normalizes the sum of the skip connection output and the main branch output and then passes it through the activation. Because normalization happens after the addition, the shortcut values are also scaled together with the new outputs, which can make the results slightly different.

b) The second variant is ReLU before addition. Here, the architecture is:

X_shortcut = X (saving the input);
X → Conv2d → BN → ReLU → Conv2d → BN → ReLU → Conv2d → BN → ReLU → (X + X_shortcut) → X’

This technique first removes all negative values from the output of the last BN layer of the main branch, then adds the result to the skip connection output and passes it to the next block.

c) The third variant is ReLU-only pre-activation, and the architecture is:

X_shortcut = X (saving the input);
X → ReLU → Conv2d → BN → ReLU → Conv2d → BN → ReLU → Conv2d → BN → (X + X_shortcut) → X’

In this technique, ReLU comes before every Conv2d (weight) and BN layer. The input first passes through the activation layer, so only its non-negative values remain; it is then passed through Conv2d and batch normalization, which normalizes the output features, and again through the activation.

d) The last variant is the fully pre-activated identity block, and the architecture is:

X_shortcut = X (saving the input);
X → BN → ReLU → Conv2d → BN → ReLU → Conv2d → BN → ReLU → Conv2d → (X + X_shortcut) → X’

In this architecture, the input is normalized before it is passed through each activation function, so every feature map is scaled properly before the activation. This approach can bring a change in the accuracy.
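The layer orderings of the original identity block and the four variants can be written down as data and run on a toy signal. The sketch below is not the authors' Keras implementation: `conv` is a hypothetical one-dimensional stand-in for Conv2d with a fixed weight, `batch_norm` normalizes a single vector without learned scale or shift, and the variant names are labels chosen here for illustration. Only the order of operations is taken from the text.

```python
from statistics import fmean, pstdev


def relu(v):
    return [max(0.0, a) for a in v]


def batch_norm(v, eps=1e-5):
    # Normalize to zero mean and unit variance (no learned scale/shift: assumption).
    mean, std = fmean(v), pstdev(v)
    return [(a - mean) / (std * std + eps) ** 0.5 for a in v]


def conv(v, w=0.5):
    # Toy stand-in for Conv2d: 3-tap sum with zero padding and a fixed weight.
    padded = [0.0] + list(v) + [0.0]
    return [w * (padded[i] + padded[i + 1] + padded[i + 2]) for i in range(len(v))]


OPS = {"conv": conv, "bn": batch_norm, "relu": relu}

# Order of operations per block; "add" is the addition of the saved shortcut.
BLOCKS = {
    "original": ["conv", "bn", "relu", "conv", "bn", "relu", "conv", "bn", "add", "relu"],
    "bn_after_addition": ["conv", "bn", "relu", "conv", "bn", "relu", "conv", "add", "bn", "relu"],
    "relu_before_addition": ["conv", "bn", "relu", "conv", "bn", "relu", "conv", "bn", "relu", "add"],
    "relu_only_preactivation": ["relu", "conv", "bn", "relu", "conv", "bn", "relu", "conv", "bn", "add"],
    "full_preactivation": ["bn", "relu", "conv", "bn", "relu", "conv", "bn", "relu", "conv", "add"],
}


def identity_block(x, variant):
    shortcut = list(x)  # X_shortcut = X (saving the input)
    h = list(x)
    for op in BLOCKS[variant]:
        if op == "add":
            h = [a + b for a, b in zip(h, shortcut)]
        else:
            h = OPS[op](h)
    return h


if __name__ == "__main__":
    x = [-2.0, -1.0, 0.5, 3.0]
    for name in BLOCKS:
        print(name, [round(a, 3) for a in identity_block(x, name)])
```

One visible consequence of the ordering: with ReLU before addition, the main branch is non-negative, so each output is at least as large as the corresponding shortcut value, whereas the original block and the first variant end in ReLU and therefore never output negative values.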

8

S. Das et al.

4 Experiment and Analysis

Little pre-processing is applied to the data in this work because it had already been enhanced and pre-processed. Every image was resized to 224 × 224 (because the proposed model takes an input of shape 224 × 224 × 3), cropped and zoomed slightly, and passed through a Gaussian blur filter to reduce extreme sharpness, as shown in Figs. 7 and 8. Finally, the data is rescaled by dividing by 255, which gives pixel values in the range 0 to 1.

Fig. 7. Original image data

Fig. 8. Pre-processed image data
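The blur and rescaling steps can be sketched on a small grayscale image stored as nested lists. The paper does not state the blur kernel, so the 3 × 3 kernel below is an assumption, as are the function names; resizing, cropping, and zooming are omitted, and in practice an image library would be used instead of pure Python.

```python
# 3x3 Gaussian-like kernel (assumed; the paper does not give the kernel size).
KERNEL = [[1, 2, 1],
          [2, 4, 2],
          [1, 2, 1]]
KERNEL_SUM = 16.0


def gaussian_blur3(img):
    """Blur a grayscale image (list of rows) with edge-replicating padding."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            acc = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), h - 1)  # clamp to the image border
                    xx = min(max(x + dx, 0), w - 1)
                    acc += KERNEL[dy + 1][dx + 1] * img[yy][xx]
            row.append(acc / KERNEL_SUM)
        out.append(row)
    return out


def rescale(img):
    """Map 8-bit pixel values from [0, 255] to [0, 1]."""
    return [[p / 255.0 for p in row] for row in img]


def preprocess(img):
    # Order follows the text: blur first, then divide by 255.
    return rescale(gaussian_blur3(img))


if __name__ == "__main__":
    image = [[0, 0, 0],
             [0, 255, 0],
             [0, 0, 0]]
    for row in preprocess(image):
        print([round(p, 4) for p in row])
```

The single bright pixel is spread over its neighbours, which is the sharpness reduction the blur is meant to provide, and every output value lies between 0 and 1.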

The strategy of the experiment is shown in Fig. 9.

Stratified K-Fold Cross Validation. Stratified sampling preserves the class proportions of the original dataset in every split. Consider a dataset with 100 samples, of which 80 are from the negative class and 20 from the positive class. With an 80/20 stratified split, the training data consist of 64 negative samples (80% of 80) and 16 positive samples (80% of 20), totaling 80 samples with the same class proportions as the original dataset. The test set contains 16 negative samples (20% of 80) and 4 positive samples (20% of 20), totaling 16 (−ve) + 4 (+ve) = 20 samples.

Cross-validation using stratified sampling: In stratified k-fold cross-validation, several training and testing partitions are produced in order to train one or more models. The number of folds is given by k. If k = 5, the dataset is divided into 5 folds and training and testing are run once per fold. Figure 9 illustrates the flow: in each run, one fold is held out for testing and the others are used for training. In this study, stratified k-fold cross-validation (k-fold cross-validation using stratified sampling) is employed with k = 5. The data are therefore divided into 5 separate folds; in each of the 5 iterations, 4 parts are used for training and 1 part for testing, and the training and testing data are shuffled and stratified for every iteration, as depicted in Fig. 9. Five models are evaluated: the original pre-trained ResNet50 from Keras Applications and 4 newly created ResNet50 models with 4 variants of the residual identity block.
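The 80/20 example above can be reproduced with a small stratified splitter. The sketch below is a plain-Python illustration of the idea, assuming round-robin assignment within each class; the function name `stratified_k_fold` is hypothetical and this is not the routine used in the experiment.

```python
import random
from collections import Counter, defaultdict

def stratified_k_fold(labels, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs whose class ratios match the full set."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)

    # Shuffle each class separately, then deal its samples round-robin into folds.
    folds = [[] for _ in range(k)]
    for idxs in by_class.values():
        rng.shuffle(idxs)
        for pos, idx in enumerate(idxs):
            folds[pos % k].append(idx)

    # Each fold serves once as the test set; the remaining folds form the training set.
    for i in range(k):
        test = sorted(folds[i])
        train = sorted(j for f, fold in enumerate(folds) if f != i for j in fold)
        yield train, test

# 80 negative (0) and 20 positive (1) samples, as in the worked example.
labels = [0] * 80 + [1] * 20
splits = list(stratified_k_fold(labels, k=5))
for train, test in splits:
    train_counts = Counter(labels[i] for i in train)
    test_counts = Counter(labels[i] for i in test)
    print(dict(train_counts), dict(test_counts))  # {0: 64, 1: 16} {0: 16, 1: 4}
```

Every one of the 5 iterations trains on 64 negative and 16 positive samples and tests on 16 negative and 4 positive samples, so each split keeps the 4:1 class ratio of the full dataset.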

Fig. 9. K Fold Cross validation [34] and flow of experiment strategies

5 Results and Discussion

The main idea proposed in this work was to experiment with the original ResNet50 network and self-created ResNet50 networks with different variants of the residual identity block, because the identity blocks are the main reason the model remains less complicated while accuracy increases as the layer density increases. In some previous research by the authors, the pre-trained ResNet50 model did not give satisfactory results on magnetic resonance images, but the gap between training and validation accuracy was much smaller than for other CNN architectures (training accuracy was 86% and validation accuracy on test data was 74%). The overall average accuracies of every model are shown in Fig. 10 (Table 1).

Table 1. Accuracy Graph of five models fold-wise

For this reason, it was necessary in this work to build new methodologies and architectures. In the previous work, the residual network had proved less complex than VGG16, VGG19, AlexNet, etc., so the authors adopted this architecture for updating and creating variants. There are 4 variants of the identity block, which have been discussed in the methodologies section: BATCH-NORM AFTER ADDITION, RELU BEFORE ADDITION, RELU ONLY PRE-ACTIVATION and FULLY PRE-ACTIVATED ID BLOCK. In the experiment with 5-fold cross-validation, the 5 models were passed through 5 iterations, each with separately shuffled training and validation datasets (every model was trained for 20 epochs per fold iteration). The resulting accuracy plots are shown in Fig. 11. According to these results, variant 1 and variant 2 show the highest accuracy (97% in round figures). The mean (M) and standard deviation (SD), 0.92 and 0.06, are meaningful statistics that indicate an adequate computer-aided diagnostic assistant. In absolute terms, ResNet variant 2 (“ReLU before addition in ID block”) has shown the highest accuracy.
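The descriptive statistics mentioned above (M and SD over the models' average accuracies) could be computed along the following lines. This is only a sketch: the accuracy values are hypothetical placeholders and not the paper's results, the model names are made up, and the use of the sample standard deviation is an assumption because the text does not say which estimator was used.

```python
from statistics import mean, stdev

def summarise(fold_accuracies):
    """Average each model over its folds, then take mean and SD across models."""
    per_model = {name: mean(accs) for name, accs in fold_accuracies.items()}
    overall = list(per_model.values())
    return per_model, mean(overall), stdev(overall)  # sample SD (assumption)

# Hypothetical placeholder accuracies (NOT the paper's results): model -> 5 fold scores.
fold_accuracies = {
    "resnet50_pretrained": [0.80, 0.82, 0.81, 0.83, 0.79],
    "variant_1": [0.96, 0.97, 0.98, 0.97, 0.97],
    "variant_2": [0.97, 0.98, 0.97, 0.96, 0.98],
    "variant_3": [0.93, 0.94, 0.92, 0.95, 0.93],
    "variant_4": [0.91, 0.92, 0.93, 0.90, 0.92],
}

per_model, m, sd = summarise(fold_accuracies)
for name, acc in per_model.items():
    print(f"{name}: {acc:.3f}")
print(f"M = {m:.2f}, SD = {sd:.2f}")
```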


Fig. 10. Overall average accuracies of each and every model

Fig. 11. Overall average accuracies graph with descriptive statistics

A conclusion from this investigation of residual network architecture variants for the binary classification of MRI brain tumour images is that the new variants of ResNet50 yield meaningful analysis with less complex model training, and that, with some additional features, variant 1 and variant 2 achieve high accuracy. As an experimental note, researchers who wish to design variants of this kind on newly adopted CNN architectures are strongly advised to have ample GPU and RAM capacity for training the models. In this experiment, the Google Colab PRO GPU took approximately 1 h 45 min for every fold iteration, where each of the 5 folds involves 5 models, each trained for 20 epochs (~1–2 min per epoch).
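As a quick consistency check on these figures, the per-epoch time implied by the reported fold time can be worked out directly. The total over all 5 folds is a derived estimate, not a number reported in the text.

```python
models_per_fold = 5
epochs_per_model = 20
folds = 5
minutes_per_fold = 105  # reported as approximately 1 h 45 min

epochs_per_fold = models_per_fold * epochs_per_model    # 100 epochs per fold
minutes_per_epoch = minutes_per_fold / epochs_per_fold   # 1.05 min, within the ~1-2 min range
total_hours = folds * minutes_per_fold / 60              # 8.75 h for all 5 folds (derived)

print(epochs_per_fold, minutes_per_epoch, total_hours)
```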

6 Conclusion and Future Scope

Doctors are human, and humans make errors. Although they analyze MRI scans manually using their knowledge and practice, it remains difficult for them to detect the correct position of a tumor. It can therefore be concluded that this model can not only detect the tumor but also classify its type and hazard, which is useful to some extent both to doctors, as a personal AI assistant, and to ordinary people, for personal AI medication. In the future, these variants will be able to mark the extent of important portions and nerves of the brain and the tumor besides classifying its type. As a surgery helper and business application, it will therefore be very advantageous. The authors are now working on a model that has two prime operations: predicting brain tumor probability from patients’ symptoms, mostly with a logistic regression model or another classification model, and analyzing and classifying tumors from MRI scan images with the help of a convolutional neural network. In the future, these techniques can be applied to breast, lung, ovarian, and skin tumors.

Consent for publication: Yes
Availability of supporting data: Provided as and when required.
Funding: No funding.
Authors’ contributions: This article is the outcome of the contributions of all the authors (Sumit Das, Manas Kumar Sanyal, Diprajyoti Majumdar).

Acknowledgement. Dr. Sumit Das would especially like to thank the expert Prof. Manas Kumar Sanyal, University of Kalyani, and co-author Mr. Diprajyoti Majumdar for prompt counsel and support. The management, JIS Group, JIS College of Engineering, and Department of Information Technology deserve special thanks for providing a variety of R&D facilities.

Conflict of Interest. The authors declare that they have no conflict of interest. This article does not contain any studies with human participants or animals performed by any of the authors. Informed consent was obtained from all individual participants included in the study.

References

1. Brain Tumors - Classifications, Symptoms, Diagnosis and Treatments (2022). https://www.aans.org/. Accessed 13 Jan 2022
2. Gulum, M.A., Trombley, C.M., Kantardzic, M.: A review of explainable deep learning cancer detection models in medical imaging. Appl. Sci. 11(10), Art. no. 10 (2021). https://doi.org/10.3390/app11104573
3. Al-Ayyoub, M., Husari, G., Darwish, O., Alabed-alaziz, A.: Machine learning approach for brain tumor detection. In: Proceedings of the 3rd International Conference on Information and Communication Systems - ICICS 2012, pp. 1–4. Irbid, Jordan (2012). https://doi.org/10.1145/2222444.2222467
4. Angulakshmi, M., Lakshmi Priya, G.G.: Automated brain tumour segmentation techniques - a review. Int. J. Imaging Syst. Technol. 27(1), 66–77 (2017). https://doi.org/10.1002/ima.22211
5. Al-Badarneh, A., Najadat, H., Alraziqi, A.M.: A classifier to detect tumor disease in MRI brain images. In: 2012 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, pp. 784–787. Istanbul (2012). https://doi.org/10.1109/ASONAM.2012.142
6. Ejaz, K., Rahim, M.S.M., Bajwa, U.I., Rana, N., Rehman, A.: An unsupervised learning with feature approach for brain tumor segmentation using magnetic resonance imaging. In: Proceedings of the 2019 9th International Conference on Bioscience, Biochemistry and Bioinformatics - ICBBB 2019, pp. 1–7. Singapore, Singapore (2019). https://doi.org/10.1145/3314367.3314384
7. Das, S., Biswas, S., Paul, A., Dey, A.: AI doctor: an intelligent approach for medical diagnosis. In: Bhattacharyya, S., Sen, S., Dutta, M., Biswas, P., Chattopadhyay, H. (eds.) Industry Interactive Innovations in Science, Engineering and Technology. LNNS, vol. 11, pp. 173–183. Springer, Singapore (2018). https://doi.org/10.1007/978-981-10-3953-9_17
8. Das, S., Sanyal, M., Datta, D., Biswas, A.: AISLDr: artificial intelligent self-learning doctor. In: Bhateja, V., Coello Coello, C.A., Satapathy, S.C., Pattnaik, P.K. (eds.) Intelligent Engineering Informatics. AISC, vol. 695, pp. 79–90. Springer, Singapore (2018). https://doi.org/10.1007/978-981-10-7566-7_9
9. Das, S., Sanyal, M.K., Datta, D.: Advanced diagnosis of deadly diseases using regression and neural network. In: Mandal, J.K., Sinha, D. (eds.) CSI 2018. CCIS, vol. 836, pp. 330–351. Springer, Singapore (2018). https://doi.org/10.1007/978-981-13-1343-1_29
10. Das, S., Sanyal, M.K.: Machine intelligent diagnostic system (MIDs): an instance of medical diagnosis of tuberculosis. Neural Comput. Appl. 32(19), 15585–15595 (2020). https://doi.org/10.1007/s00521-020-04894-8
11. Das, S., Sanyal, M.: Application of AI and soft computing in healthcare: a review and speculation. Int. J. Sci. Technol. Res. 8, 1786–1806 (2019)
12. Das, S., Sanyal, M.K., Datta, D.: Artificial Intelligent Embedded Doctor (AIEDr.): a prospect of low back pain diagnosis. Int. J. Big Data Anal. Healthc. IJBDAH 4(2), 34–56 (2019). https://doi.org/10.4018/IJBDAH.2019070103
13. Das, S., Sanyal, M.K., Datta, D.: Intelligent approaches for the diagnosis of low back pain. In: 2019 Amity International Conference on Artificial Intelligence (AICAI), pp. 684–695 (2019). https://doi.org/10.1109/AICAI.2019.8701266
14. Das, S., Sanyal, M.K., Kumar Upadhyay, S.: A Comparative Study for Prediction of Heart Diseases Using Machine Learning. Social Science Research Network, Rochester, NY, SSRN Scholarly Paper ID 3526776 (2020). https://doi.org/10.2139/ssrn.3526776
15. Das, S., Sanyal, M.K., Datta, D.: A comprehensive feature selection approach for machine learning. Int. J. Distrib. Artif. Intell. IJDAI 13(2), 13–26 (2021). https://doi.org/10.4018/IJDAI.2021070102
16. Das, S., Sanyal, M.K., Upadhyay, S.K., Chatterjee, S.: An intelligent approach for predicting emotion using convolution neural network. J. Phys. Conf. Ser. 1797(1), 012014 (2021). https://doi.org/10.1088/1742-6596/1797/1/012014
17. Nadeem, M.W., et al.: Brain tumor analysis empowered with deep learning: a review, taxonomy, and future challenges. Brain Sci. 10(2), 118 (2020). https://doi.org/10.3390/brainsci10020118
18. Marghalani, B.F., Arif, M.: Automatic classification of brain tumor and Alzheimer’s disease in MRI. Procedia Comput. Sci. 163, 78–84 (2019). https://doi.org/10.1016/j.procs.2019.12.089
19. Wu, W., et al.: An intelligent diagnosis method of brain MRI tumor segmentation using deep convolutional neural network and SVM algorithm. Comput. Math. Methods Med. 2020, e6789306 (2020). https://doi.org/10.1155/2020/6789306
20. Çinar, A., Yildirim, M.: Detection of tumors on brain MRI images using the hybrid convolutional neural network architecture. Med. Hypotheses 139, 109684 (2020). https://doi.org/10.1016/j.mehy.2020.109684
21. Khawaldeh, S., Pervaiz, U., Rafiq, A., Alkhawaldeh, R.S.: Noninvasive grading of glioma tumor using magnetic resonance imaging with convolutional neural networks. Appl. Sci. 8(1), Art. no. 1 (2018). https://doi.org/10.3390/app8010027
22. Talo, M., Baloglu, U.B., Yıldırım, Ö., Rajendra Acharya, U.: Application of deep transfer learning for automated brain abnormality classification using MR images. Cogn. Syst. Res. 54, 176–188 (2019). https://doi.org/10.1016/j.cogsys.2018.12.007
23. Rehman, A., Naz, S., Razzak, M.I., Akram, F., Imran, M.: A deep learning-based framework for automatic brain tumors classification using transfer learning. Circuits Syst. Sign. Process. 39(2), 757–775 (2019). https://doi.org/10.1007/s00034-019-01246-3
24. Mehrotra, R., Ansari, M.A., Agrawal, R., Anand, R.S.: A Transfer Learning approach for AI-based classification of brain tumors. Mach. Learn. Appl. 2, 100003 (2020). https://doi.org/10.1016/j.mlwa.2020.100003
25. Mohsen, H., El-Dahshan, E.-S.A., El-Horbaty, E.-S.M., Salem, A.-B.M.: Classification using deep learning neural networks for brain tumors. Future Comput. Inform. J. 3(1), 68–71 (2018). https://doi.org/10.1016/j.fcij.2017.12.001
26. Kabir Anaraki, A., Ayati, M., Kazemi, F.: Magnetic resonance imaging-based brain tumor grades classification and grading via convolutional neural networks and genetic algorithms. Biocybern. Biomed. Eng. 39(1), 63–74 (2019). https://doi.org/10.1016/j.bbe.2018.10.004
27. Yang, Y., et al.: Glioma grading on conventional MR images: a deep learning study with transfer learning. Front. Neurosci. 12 (2018). https://www.frontiersin.org/article/10.3389/fnins.2018.00804. Accessed 10 June 2022
28. Ertosun, M.G., Rubin, D.L.: Automated grading of gliomas using deep learning in digital pathology images: a modular approach with ensemble of convolutional neural networks. AMIA Annu. Symp. Proc. 2015, 1899–1908 (2015)
29. Br35H: Brain Tumor Detection 2020 (2022). https://www.kaggle.com/ahmedhamada0/brain-tumor-detection. Accessed 10 June 2022
30. Brain MRI Images for Brain Tumor Detection (2022). https://www.kaggle.com/navoneel/brain-mri-images-for-brain-tumor-detection. Accessed 10 June 2022
31. MRI Based Brain Tumor Images (2022). https://www.kaggle.com/mhantor/mri-based-brain-tumor-images. Accessed 10 June 2022
32. Ensemble learning, Wikipedia (2022). https://en.wikipedia.org/w/index.php?title=Ensemble_learning&oldid=1093276853. Accessed 22 June 2022
33. He, K., Zhang, X., Ren, S., Sun, J.: Deep Residual Learning for Image Recognition. arXiv (2015). http://arxiv.org/abs/1512.03385. Accessed 22 June 2022
34. Manna, S.: K-Fold Cross Validation for Deep Learning using Keras. The Owl (2020). https://medium.com/the-owl/k-fold-cross-validation-in-keras-3ec4a3a00538. Accessed 22 June 2022

Effective Estimation of Relationship Strength Among Facebook Users Applying Pearson Correlation and Jaccard’s Coefficient

Deepjyoti Choudhury and Tapodhir Acharjee

Department of Computer Science and Engineering, Assam University, Silchar, India

Abstract. Numerous methods have been established to date to identify tie strength among users on online social media. Although calculating relationship strength among interconnected users has been a burning research topic, it is still not easy to identify strong interconnections in a dynamic network because user connections are volatile over time. In general, a high frequency of information sharing between two users indicates a strong bond in a social network, and possible mutual connections may be established based on that strong bond. In this paper, we propose a novel method to calculate relationship strength among Facebook users with Pearson Correlation and Jaccard’s Coefficient. We propose two factors, Analogy Profile and Analogy Friendship, to obtain the final relationship strength. The performance of our proposed model is compared with a popular existing model, Trust Propagation-User Relationship Strength (TP-URS), using the assessment matrices Precision, Recall, and Dice Similarity Coefficient (DSC). Our proposed method provides a precision of 0.66, a recall of 0.74, and a DSC of 0.71, which are better than those of the existing algorithm.

Keywords: Relationship Strength · Tie Strength · Social Networks · Facebook · Analogy

1 Introduction

Direct or mutual relationships have long been established in real-world networks, and the strength of the relationships among users on online social platforms has accordingly been calculated through traditional algorithms. Different network methods, along with the structure of ties, have also been described [1]. Full network methods [2], snowball methods [3], and ego-centric networks [4] are popular among them. Mark S. Granovetter [5] first presented the argument that the degree of overlap between two individuals' friendship networks varies with the strength of their tie to one another. He argued that relationship strength is the amalgamation of several factors such as association time, emotional intensity, intimacy among users, etc. According to Granovetter, strong ties are associated with overlapping circles of friendship, whereas weak ties act as bridges between such circles. This means that weak ties are more significant than strong ties for network-wide information diffusion.

Identifying the strength of interconnections in a network is one of the most actively debated research topics today. The main research challenge in this field is how to appropriately quantify ‘Relationship Strength’. This is a basic issue that must be addressed before moving on to other relevant problems in social networks, such as friend recommendation in a social bookmarking system [6], diffusion-based similarity on tripartite graphs [7], etc. Information on a person’s relationship strength with another user is useful in several settings, such as link prediction, item recommendation, news feeds, etc. [8]. Based on the evaluation of relationship strength, a social network provider may improve the quality of some of its services. Much recent research has concentrated on estimating the strength of social network interactions [9], and interaction data have been used to predict the strength of a relationship. However, almost all existing methodologies evaluate the strength of direct links among users in online social networks while ignoring indirect interactions. Determining the strength of ties between indirectly connected users, which is equally important in social networks, remains a key difficulty. This motivated us to adopt a novel approach to finding relationship strength in indirect relationships.

The paper is organized as follows: the second section elaborates on the historical background as well as the existing work on finding relationship strength. The detailed structure of our proposed approach is discussed in the third section, which covers the workflow of our proposed model, the procedures for calculating relationship strength, the assessment matrices, and the dataset. Experimental results, as well as an analysis of findings, are detailed in the fourth section. Finally, we summarize our paper by discussing the evaluation findings and future work in the last section.

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023. S. Bhattacharyya et al. (Eds.): Human 2023, STHC, pp. 15–24, 2023. https://doi.org/10.1007/978-981-99-3478-2_2

2 Related Work

Many works have sought to identify the strength of user relationships on social media. Friendship links are denser in online social networks, where the concept of “friendship” is broader than what is normally considered in sociological research, yet the relationships carry weaker information [10]. Seven well-constructed features derived from activity data were defined in that work to predict tie strength. Gilbert et al. [11] described a model that maps social media data to tie strength. The model is built on a dataset of more than two thousand social media ties and performs well, distinguishing between strong and weak ties with more than 85% accuracy. Viswanath et al. [12] examined the evolution of activity among Facebook users. They found that links in the activity network tend to appear and disappear quickly over time, and that the strength of ties shows a general decreasing trend of activity as the social network link ages. An unsupervised model was designed to calculate link strength among users from different interaction activities such as communication, tagging, etc. [13]. Srba et al. [14] proposed and tested a technique for analyzing the intensity of relationships in social networks. The estimated relationship strengths can also be utilized in a variety of ways to obtain more accurate information for specific users. Using a tag vocabulary, relationship strength was estimated between web pages in social bookmarking services and used to construct the interconnections of those pages [15]. Bayes' theorem was employed at this stage to calculate positive strength from an individual’s strength calculation. Four partial trust estimates were identified, and the final trust value from trustor to trustee was calculated as the weighted average of these partial estimates [16].

Earlier, Adamic et al. [17] developed methodologies and tools for mining data to uncover social networks and the exogenous elements that underpin network structure. Bilgic et al. [18] evaluated a model that addressed two challenges, entity categorization and estimation of connection strength, within a single algorithm. A novel parameterized probabilistic model of network evolution was presented, and an efficient incremental learning algorithm for such models was derived and then used to predict links between nodes [19]. A topological approach to defining features in a real-time network was also identified in that paper. Khadangi et al. [20] attempted to calculate tie strength based on users' activity and profile data. A learning framework capable of capturing variation in publicly available data across auxiliary networks was presented, and a feature selection method was derived using various sources [21]. Manca et al. [22] elaborated on the design models and architectural procedures for recommending friendships in a social bookmarking system.

Probabilistic methods have long attracted attention for modelling co-occurring activities among users. Wang et al. [23] introduced a local probabilistic graphical approach that scales to very large networks to estimate the probability of two nodes co-occurring. Lin et al. [24] proposed a method to calculate tie strength in direct relationships among social media users; a new approach, the ‘Trust Propagation Strategy’, for estimating relationship strength was presented in that paper. By forming communities, data sparsity may be decreased and attention can be focused on uncovering the latent properties of communities rather than individuals [25]. Zhao et al. [26] presented a generic framework for assessing the strength of relationships between users, taking into account not only users' profile information but also their interaction activities and activity fields. A technique for measuring relationship strength was presented that draws on users' choice of activity fields and their interaction behaviour [27]. Ureña-Carrion et al. [28] used a large mobile phone dataset to assess a variety of time-series variables for each link, which were then used to predict neighbourhood overlap, a trait associated with strong relationships in the literature. Perikos et al. [29] recently presented a survey of progress in predicting relationship strength in online social networks.


Fig. 1. Work Flow of the Proposed Model.

3 Enumerated Structure of the Proposed Approach

Figure 1 depicts the workflow of our proposed approach. In our model, we consider two main factors to calculate relationship strength. As we use Facebook data to experiment with our model, we first need to collect profile information, circles, and ego networks. A circle represents an interconnection list (‘friend list’) in the Facebook network.

3.1 Calculation of Relationship Strength

Two factors, namely Analogy Profile and Analogy Friendship, are considered in our paper to compute relationship strength among users on Facebook. Analogy Profile captures the influence of a user in social media. Analogy Friendship, on the other hand, captures the number of available connections and the possibility of future connections among the users in a real-time network. That is why we have chosen these two factors to generate the relationship strength in this paper. We take Pearson Correlation as the Analogy Profile and Jaccard’s coefficient as the Analogy Friendship, and from them the final relationship strength is computed for Facebook users. The factors are elaborated briefly below.

Analogy Profile. Here, we use a modified Pearson Correlation [30] as the Analogy Profile in our approach. To estimate the likelihood of tie formation between the vertices $v_i$ and $v_j$, the unification neighbourhood set $Uni_{ij}$ is defined as:

$$Uni_{ij} = \{\, p \mid (A_i[p] > 0) \ \text{or} \ (A_j[p] > 0) \,\} \tag{1}$$

that is, the set of positions $p$ at which the vector $A_i$ or the vector $A_j$ has a non-null entry. A strong correlation between $A_i$ and $A_j$ over the unification neighbourhood set $Uni_{ij}$ indicates a greater structural analogy between vertices $i$ and $j$. To determine the association between two vertices, the correlation coefficient between their vectors over the unification neighbourhood set is computed. In our paper, the Pearson correlation coefficient ($CR_{ij}$) is used as the Analogy Profile to calculate the correlation strength of two users on the Facebook network. The correlation between the vectors $A_i$ and $A_j$ over the unification neighbourhood set is estimated as:

$$CR_{ij} = \frac{\sum_{p \in Uni_{ij}} (A_i[p] - \bar{A}_i)(A_j[p] - \bar{A}_j)}{\sqrt{\sum_{p \in Uni_{ij}} (A_i[p] - \bar{A}_i)^2}\,\sqrt{\sum_{p \in Uni_{ij}} (A_j[p] - \bar{A}_j)^2}} \tag{2}$$

Here $\bar{A}_i$ is the mean value of the vector $A_i$ over the unification neighbourhood set, which can be estimated as:

$$\bar{A}_i = \frac{\sum_{p \in Uni_{ij}} A_i[p]}{|Uni_{ij}|}$$

Even if two nodes have no shared neighbours in our approach, they may have considerable structural similarities. As a result, a link may be identified by comparing their neighbours.

Analogy Friendship. Here, Jaccard’s coefficient [31] is used to calculate Analogy Friendship. Generally in online social networks, two users have a high probability of being connected with each other if they share a large number of mutual friends. In this way, relationship strength can also be calculated based on the mutual connections in a network. Jaccard’s coefficient is a common similarity measure in information retrieval: it gives the probability that both p and q have a feature f, for a randomly selected feature f that either p or q has. With $\gamma(p)$ denoting the set of neighbours of p, this results in the measure:

$$value(p; q) := \frac{|\gamma(p) \cap \gamma(q)|}{|\gamma(p) \cup \gamma(q)|} \tag{3}$$

Relationship Strength. Having calculated both factors, we can identify the relationship strength by assigning weights to the different interconnections among the users in our Facebook network. The desired relationship strength is obtained as:

$$Relationship\ Strength = \alpha \cdot Analogy\ Profile + \beta \cdot Analogy\ Friendship \tag{4}$$

Here, α and β are weighting parameters whose values lie in [0, 1] and whose sum is 1.
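Equations (1)–(4) can be sketched in a few lines of Python. This is an illustrative reading rather than the authors' implementation: it assumes that A_i and A_j are 0/1 adjacency vectors of equal length, it uses the standard Pearson form restricted to the unification neighbourhood set (the "modified" variant of [30] may differ), and the function names are hypothetical.

```python
import math

def union_neighbourhood(a_i, a_j):
    """Positions p where either vector has a non-zero entry (Eq. 1)."""
    return [p for p in range(len(a_i)) if a_i[p] > 0 or a_j[p] > 0]

def analogy_profile(a_i, a_j):
    """Pearson correlation of a_i and a_j over their union set (Eq. 2)."""
    uni = union_neighbourhood(a_i, a_j)
    if not uni:
        return 0.0
    mean_i = sum(a_i[p] for p in uni) / len(uni)
    mean_j = sum(a_j[p] for p in uni) / len(uni)
    num = sum((a_i[p] - mean_i) * (a_j[p] - mean_j) for p in uni)
    den = (math.sqrt(sum((a_i[p] - mean_i) ** 2 for p in uni))
           * math.sqrt(sum((a_j[p] - mean_j) ** 2 for p in uni)))
    # A zero denominator (a constant vector) is treated as "no correlation" here.
    return num / den if den else 0.0

def analogy_friendship(friends_p, friends_q):
    """Jaccard's coefficient of two friend sets (Eq. 3)."""
    union = friends_p | friends_q
    return len(friends_p & friends_q) / len(union) if union else 0.0

def relationship_strength(profile, friendship, alpha=0.5, beta=0.5):
    """Weighted combination of the two factors (Eq. 4); alpha + beta must be 1."""
    if abs(alpha + beta - 1.0) > 1e-9:
        raise ValueError("alpha and beta must sum to 1")
    return alpha * profile + beta * friendship

# Toy example with made-up adjacency vectors for two users.
a_i = [1, 1, 0, 1, 0]
a_j = [1, 0, 1, 1, 0]
profile = analogy_profile(a_i, a_j)
friendship = analogy_friendship({0, 1, 3}, {0, 2, 3})
print(profile, friendship, relationship_strength(profile, friendship, 0.49, 0.51))
```

Returning 0.0 when a vector is constant over the union set is a design choice of this sketch; the paper does not say how that case is handled.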

3.2 Assessment Matrices

Based on the values generated by the error matrix (confusion matrix), we calculate Precision (P) [32], Recall (R), and the Dice Similarity Coefficient (DSC) to evaluate our experimental results. As Precision and Recall are commonly used as assessment metrics in machine learning problems, we use these two metrics and then compute the DSC from the precision and recall values. In Eq. (5), Accurate Pragmatic (true positives) is the number of existing connections correctly detected as positive by our model, and Fake Pragmatic (false positives) is the number of non-existing connections wrongly detected as positive. In Eq. (6), Fake Contradiction (false negatives) is the number of existing connections in the network wrongly detected as negative, i.e., non-existing. The formulas are stated below:

$$Precision,\ P = \frac{|AccuratePragmatic|}{|AccuratePragmatic| + |FakePragmatic|} \tag{5}$$

$$Recall,\ R = \frac{|AccuratePragmatic|}{|AccuratePragmatic| + |FakeContradiction|} \tag{6}$$

$$Dice\ Similarity\ Coefficient,\ DSC = 2 \cdot \frac{P \cdot R}{P + R} \tag{7}$$

3.3 Dataset

We collected the Facebook dataset from the SNAP [33] library. The Facebook network consists of ‘friends lists’, which are represented as circles. All the data in the network were collected from survey participants. The dataset comprises profile information, ‘circles’, and ego networks. The data have been anonymized by replacing each user’s internal id with a new value. Furthermore, while feature vectors derived from the Facebook data are published, their meaning has been hidden. The statistics of the dataset are presented in Fig. 2.
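A minimal sketch of reading circles and ego-network edges into Python structures is given below. The file layout is an assumption (one circle per line as a name followed by whitespace-separated member ids, and one `u v` pair per edge line); it should be checked against the files actually downloaded. The sample strings are made up and are not real SNAP data.

```python
from collections import defaultdict

def parse_circles(text):
    """Map circle name -> set of member ids (assumed: one circle per line)."""
    circles = {}
    for line in text.splitlines():
        parts = line.split()
        if parts:
            circles[parts[0]] = {int(x) for x in parts[1:]}
    return circles

def parse_edges(text):
    """Build an undirected adjacency map from 'u v' lines (assumed layout)."""
    adj = defaultdict(set)
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            u, v = int(parts[0]), int(parts[1])
            adj[u].add(v)
            adj[v].add(u)
    return adj

# Tiny made-up sample in the assumed layout.
sample_circles = "circle0\t1\t2\t3\ncircle1\t3\t4\n"
sample_edges = "1 2\n2 3\n3 4\n"

circles = parse_circles(sample_circles)
adjacency = parse_edges(sample_edges)
print(circles)
print(dict(adjacency))
```

The friend sets in `adjacency` are the kind of input a Jaccard-style Analogy Friendship computation would need.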

4 Results Analysis

The experiments were conducted in a Jupyter notebook using the Python programming language. Our system configuration comprises the Windows 10 operating system and 8 GB RAM. The factors α and β have a considerable influence on the computation of Relationship Strength, as well as on the friend recommendation from the ‘circles’ of the Facebook data. A simple and effective method is used to adjust their values to a suitable range: the value 1 is assigned to α and β in turn while the other parameter is set to 0, and the outcome is computed for the two cases. The outcome is shown in Table 1.

Fig. 2. Statistics of the Facebook Data.

Table 1. Experimental Results of Precision, Recall and DSC.

α    β    Precision    Recall    DSC
1    0    0.65         0.72      0.68
0    1    0.68         0.77      0.72

With α equal to 1, we achieved a precision of 0.65 and a recall of 0.72; the DSC, computed from these precision and recall values, is 0.68. With β equal to 1, our experiment yielded a precision of 0.68, a recall of 0.77, and a DSC of 0.72. Based on the DSC values achieved in the two cases, α and β are recalculated as:

$$\alpha = \frac{DSC(\alpha)}{DSC(\alpha) + DSC(\beta)} = \frac{0.68}{0.68 + 0.72} = 0.49 \tag{8}$$

$$\beta = \frac{DSC(\beta)}{DSC(\alpha) + DSC(\beta)} = \frac{0.72}{0.68 + 0.72} = 0.51 \tag{9}$$
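The DSC values in Table 1 and the re-weighting in Eqs. (8) and (9) can be checked with a short script. The function names are hypothetical; the precision and recall inputs are the values reported in Table 1, and the DSC values are rounded to two decimals before re-weighting, as the reported figures suggest.

```python
def dice_similarity(precision, recall):
    """DSC as the harmonic mean of precision and recall (Eq. 7)."""
    return 2 * precision * recall / (precision + recall)

def reweight(dsc_alpha, dsc_beta):
    """Normalise the two DSC values so that alpha + beta = 1 (Eqs. 8 and 9)."""
    total = dsc_alpha + dsc_beta
    return dsc_alpha / total, dsc_beta / total

# Precision/recall pairs reported in Table 1.
dsc_a = round(dice_similarity(0.65, 0.72), 2)  # case alpha = 1, beta = 0
dsc_b = round(dice_similarity(0.68, 0.77), 2)  # case alpha = 0, beta = 1
alpha, beta = reweight(dsc_a, dsc_b)
print(dsc_a, dsc_b, round(alpha, 2), round(beta, 2))  # 0.68 0.72 0.49 0.51
```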

We have now found the revised values of α and β to be 0.49 and 0.51, respectively. The revised values are then applied in a new trial, and the new assessment result, shown in Table 2, indicates that this modification improves the evaluation indicators. To demonstrate the efficacy of the suggested technique, we conducted tests comparing our algorithm to TP-URS [24]. TP-URS estimates a synthetic relationship strength, computing its final result from two factors: direct and indirect relationships. As the factors used in TP-URS are quite similar to those in our proposed approach, we chose it as the baseline for comparison. Table 2 shows the final outcome of our experiment.

Table 2. Experimental Results after Adjustments of Parameters.

                  TP-URS                       Proposed Method
α      β      Precision  Recall  DSC      Precision  Recall  DSC
0.49   0.51   0.62       0.68    0.66     0.66       0.74    0.71

The results in Table 2 show that our proposed method achieves better precision than TP-URS: 0.66 compared with 0.62. Our proposed approach also attains a recall of 0.74, better than the 0.68 of TP-URS, and a DSC of 0.71, higher than the 0.66 of TP-URS. Overall, our proposed method outperformed TP-URS on all three measures in our experiment.

5 Conclusion and Future Work

A novel approach to identifying relationship strength based on two factors has been presented in this paper. The Facebook network has been used for the experiments, and we have compared our findings with one popular method, namely TP-URS. According to the obtained results, our proposed approach delivers better recommendation performance than TP-URS. Our approach assesses two independent characteristics concurrently in a compatible manner, resulting in much-reduced data noise and loss, a more accurate relationship strength value, and enhanced friend recommendation performance. The findings demonstrated that the suggested technique outperformed the existing algorithm. However, as a limitation, the proposed technique requires parameter adjustment for improved performance. Better values of α and β will make it more efficient to establish the relationship strength among users in real-time networks. Another parameter may also be added to the formula for relationship strength to improve the results. In our future work, we therefore want to make the algorithm adaptive so that it adjusts the settings automatically. We will also try to apply our technique to more real-world networks to examine how it differs from existing methods.

References 1. Hanneman, R.A., Riddle, M.: Introduction to social network methods (2005) 2. Al Hasan, M.: Optimization Challenges in Complex, Networked and Risky Systems (INFORMS), pp. 115–139 (2016) 3. Naderifar, M., Goli, H., Ghaljaie, F.: Snowball sampling: a purposeful method of sampling in qualitative research, Strides in Development of Medical Education 14(3) (2017)

Effective Estimation of Relationship Strength Among Facebook Users

23

4. Small, M.L.: introduction: The past and future of ego-centric network analysis mario l. small, bernice pescosolido, brea l. perry, edward (ned) smith 5. Granovetter, M.: Thestarch of weak ties. Am. J. Sociol. 78(6), 1360 (1973) 6. Manca, M., Boratto, L., Carta, S.: Using Behavioral Data Mining to Produce Friend Recommendations in a Social Bookmarking System. In: Helfert, M., Holzinger, A., Belo, O., Francalanci, C. (eds.) DATA 2014. CCIS, vol. 178, pp. 99–116. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-25936-9 7 7. Shang, M.S., Zhang, Z.K., Zhou, T., Zhang, Y.C.: Using behavioral data mining to Produce friend recommendations in a social bookmarking system. Phys. A 389(6), 1259 (2010) 8. Tang, F.: Link-prediction and its application in online social networks. Ph.D. thesis, Victoria University (2017) 9. Brauer, K., Sendatzki, R., Gander, F., Ruch, W., Proyer, R.T.: Profile similarities among romantic partners’ character strengths and their associations with relationship-and life satisfaction 99, 104248 (2022) 10. Kahanda, I., Neville, J.: Proceedings of the International AAAI Conference on Web and Social Media, vol.3, pp. 74–81 (2009) 11. Gilbert, E., Karahalios, K.: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pp. 211–220 (2009) 12. Viswanath, B., Mislove, A., Cha, M., Gummadi, K.P.: Proceedings of the 2nd ACM Workshop on Online Social Networks, pp. 37–42 (2009) 13. Xiang, R., Neville, J., Rogati, M.: Proceedings of the 19th international conference on World Wide Web, pp. 981–990 (2010) 14. Srba, I., Bielikov´ a, M.: 2010 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology. IEEE 3, pp. 13–16 (2010) 15. Yanagimoto, H., Yoshioka, M.: 2012 IEEE International Conference on Fuzzy Systems. IEEE, pp. 1–8 (2012) 16. Nasir, S.U., Kim, T.H.: Trust computation in online social networks using cocitation and transpose trust propagation. IEEE Access 8, 41362 (2020) 17. 
Adamic, L.A., Adar, E.: Friends and neighbors on the web. Soc. Netw. 25(3), 211 (2003) 18. Bilgic, M., Namata, G.M., Getoor, L.: 7th IEEE International Conference on Data Mining Workshops (ICDMW). IEEE, pp. 381–386 (2007) 19. Kashima, H., Abe, N.: 6th International Conference on Data Mining (ICDM’06) IEEE, pp. 340–349 (2006) 20. Khadangi, E., Zarean, A., Bagheri, A., Jafarabadi, A.B.: ICCKE 2013 IEEE, pp. 461–465 (2013) 21. Lu, Z., Savas, B., Tang, W., Dhillon, I.S.: 2010 IEEE international conference on data mining IEEE, pp. 923–928 (2010) 22. Manca, M., Boratto, L., Carta, S.: Science and Information Conference Springer, pp. 227–242 (2014) 23. Wang, C., Satuluri, V., Parthasarathy, S.: 7th IEEE international conference on data mining (ICDM) IEEE, pp. 322–331 (2007) 24. Lin, X., Shang, T., Liu, J.: An estimation method for relationship strength in weighted social network graphs. J. Comput. Commun. 2(04), 82 (2014) 25. Zhao, G., Lee, M.L., Hsu, W., Chen, W., Hu, H.: Proceedings of the 22nd ACM international conference on Information Knowledge Management, pp. 189–198 (2013) 26. Zhao, X., Yuan, J., Li, G., Chen, X., Li, Z.: Relationship strength estimation for online social networks with the study on face book. Neurocomputing 95, 89 (2012)

24

D. Choudhury and T. Acharjee

27. Tao, W., Ju, C., Xu, C.: Research on relationship strength under personalized recommendation service. Sustainability 12(4), 1459 (2020) 28. Ure˜ na-Carrion, J., Saram¨ aki, J., Kivel¨ a, M.: Estimating tie strength in social networks using temporal communication data. EPJ Data Science 9(1), 37 (2020) 29. Perikos, I., Michael, L.: A survey on tie strength estimation methods in online social networks. ICAART 3, 484–491 (2022) 30. Nettleton, D.: Selection of variables and factor derivation, Commercial Data Mining, pp. 79–104 (2014) 31. Liben-Nowell, D., Kleinberg, J.: The link-prediction problem for social networks. J. Am. Soc. Inform. Sci. Technol. 58(7), 1019 (2007) 32. Goutte, C., Gaussier, E.: A Probabilistic Interpretation of Precision, Recall and F-Score, with Implication for Evaluation. In: Losada, D.E., Fern´ andez-Luna, J.M. (eds.) ECIR 2005. LNCS, vol. 3408, pp. 345–359. Springer, Heidelberg (2005). https://doi.org/10.1007/978-3-540-31865-1 25 33. Leskovec, J., et al.: Stanford network analysis project (2010)

Taxonomy of Music Genre Using Machine Intelligence from Feature Melting Technique

Debashri Das Adhikary1, Tanushree Dey1, Somnath Bera1, Sumita Guchhhait1, Utpal Nandi1, Mehadi Hasan2, and Bachchu Paul1(B)

1 Department of Computer Science, Vidyasagar University, Midnapore, West Bengal 721102, India
[email protected], [email protected], [email protected]
2 Begum Rokeya University, Park Mor, Modern, 5404 Rangpur, Bangladesh

Abstract. Music is an effective therapy in our lives that makes us calm, cheerful, and excited. Music genre classification (MGC) is essential for music recommendation and information retrieval. In this work, we experiment with an automatic music genre classification approach in which different features and orders are fused together to obtain better results than existing methods. Time-domain features (wavelet scattering, zero crossing rate, energy) and frequency-domain features (Mel Frequency Cepstral Coefficients (MFCC), pitch, Linear Predictive Coefficients (LPC)) are extracted frame by frame. The mean value of each extracted feature is then placed in a vector and fed to the classifier. Two well-known machine learning (ML) algorithms, Support Vector Machine (SVM) and K-Nearest Neighbor (KNN), are used to classify the GTZAN dataset. The proposed method outperforms existing work.

Keywords: Music genre classification · Mel Frequency Cepstral Coefficient · Linear Predictive Coding · Support Vector Machine · K-Nearest Neighbor

1 Introduction

Human feelings and emotions can be expressed through music. Music is not only for enjoyment; it also has great physiological and social effects on our lives. In recent years a huge amount of music has become accessible on different platforms, but it is difficult to organize or structure such a vast collection [1]. To overcome this difficulty, music genres were introduced to group musical data by how closely pieces resemble each other. Interest in MGC is gradually increasing in the field of automatic speech recognition, for music recommendation, musical instrument classification, and emotion classification [2]. Music is classified so that music lovers can enjoy their preferred genre [3], and listeners may choose music according to their mood. MGC can be used for music therapy in medical science. Music affects mood, improves memory, helps people work out with more energy, and can lead to better learning by stimulating the brain. Managing songs by genre helps users handle music efficiently.

Classification relies on the proper selection of feature sets for representing different genres. Pitch, timbre, tonality, and rhythm are related feature sets that distinguish genres and help achieve better accuracy [4]. In our proposed system we use both time-domain features (wavelet scattering, ZCR, and energy) and frequency-domain features (MFCC, pitch, and LPC). We classify 10 genres (blues, classical, rock, jazz, reggae, metal, country, pop, disco, and hip-hop) using the ML methods KNN and SVM, obtaining an accuracy of 80.5% with SVM and 71.5% with KNN.

The rest of the paper is structured as follows: Sect. 2 discusses related work on genre classification, Sect. 3 presents the proposed methodology, Sect. 4 gives the results, Sect. 5 compares the proposed method with existing work, and Sect. 6 concludes.

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 S. Bhattacharyya et al. (Eds.): Human 2023, STHC, pp. 25–32, 2023. https://doi.org/10.1007/978-981-99-3478-2_3

2 Literature Review

Music genre classification has been developing steadily, and a range of research has built genre classification techniques on different features and classifiers. MFCC is the most widely used feature; it is used to recognize ten music genres on the GTZAN dataset in [5]. Mel spectrum and MFCC are used to build a music genre classification system on the Million Song Dataset (MSD) in [6]. In [4], MFCC is combined with LPC, tonality, and music surface features such as spectral flux, spectral centroid, spectral roll-off, ZCR, and low energy to build an MGC system that classifies ten classes with SVM, using a 10-fold cross-validation strategy on the GTZAN dataset. In [7], chroma frequencies are used with MFCC, spectral centroid, spectral contrast, spectral roll-off, and ZCR to build a music genre classification system on the GTZAN dataset; nine genres (Blues, Classical, Country, Disco, Jazz, Metal, Pop, Reggae, and Rock) are classified. Spectral features (MFCC, spectral centroid, spectral spread, spectral flux, spectral roll-off, and chroma) and temporal features such as ZCR are used for MGC in [8]. Although MFCC appears as a feature in several works, [3] instead uses a combination of short-time energy (STE), ZCR, spectral centroid, and spectral flux to classify four classes: rock, pop, dangdut, and jazz. In [10], wavelet scattering is used along with MFCC, with a deep neural network as the classifier, for music genre classification on GTZAN (ten genres) and phone segment classification on the TIMIT dataset (forty-eight phone classes).

3 Methodology

The overall steps of the proposed method are shown in the flow diagram in Fig. 1. Here 80% of the samples are used for training and the remaining 20% for testing. In our approach, feature extraction is followed by classification, and before either step the data must be in a suitable, consistent format. In the feature extraction step, different time-domain and frequency-domain features are extracted from the audio signal. The features from the training samples are then used to train different ML algorithms, producing models that predict the music genre of the test samples.
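The 80/20 split described above can be sketched as follows. This is a minimal sketch, assuming the split is made per genre (stratified) with a fixed random seed; the paper states neither, and the function name `stratified_split` is hypothetical. With 100 clips for each of the 10 GTZAN genres, it yields 800 training and 200 test clips, 20 per genre.

```python
import random

GENRES = ["blues", "classical", "country", "disco", "hiphop",
          "jazz", "metal", "pop", "reggae", "rock"]


def stratified_split(labels, train_fraction=0.8, seed=0):
    """Return (train_idx, test_idx), splitting each class separately."""
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)

    train_idx, test_idx = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        cut = round(len(indices) * train_fraction)
        train_idx.extend(indices[:cut])   # first 80% of the shuffled class
        test_idx.extend(indices[cut:])    # remaining 20%
    return train_idx, test_idx


# GTZAN layout: 100 clips for each of the 10 genres.
labels = [g for g in GENRES for _ in range(100)]
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))
```

Splitting per class keeps the test set balanced, so per-genre accuracy is computed over the same number of clips for every genre.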

Fig. 1. Phases of proposed method

3.1 Dataset Used

The GTZAN dataset [9] contains 1000 music files in audio format. Each file is 30 s long and stored in ".wav" format, with a sampling rate of 22050 Hz, 16 bits, and a mono channel. The dataset covers 10 genres, namely blues, classical, country, disco, hiphop, jazz, metal, pop, reggae, and rock, with 100 different files per genre.

3.2 Feature Extraction

The first step in this system is to extract features: each music file must be described by numerical features before it can be passed to a classification system.

3.2.1 Time Domain Features

• Wavelet Scattering: The wavelet scattering technique is an efficient tool for data representation. As a feature extraction technique it can be combined with various popular classification algorithms. It allows us to extract effective, otherwise hidden features at different scales. It works in the time-frequency domain, where the energy spectrum of different signals is calculated without loss of valuable features. Wavelet scattering can provide better coefficient values than the Fourier transform. The wavelet scattering transform follows a step-by-step procedure in which the output of each step is used as the input to the next. The specific parameters of a wavelet scattering network are the duration of the time invariance, the number of filter banks, and the number of wavelets per octave. Three consecutive operations, convolution, nonlinearity, and averaging, produce the wavelet scattering transform (WST) of the input signal X = {x1, x2, …, xn}. The WST coefficients are obtained by averaging the wavelet modulus coefficients with a low-pass filter φ [10, 11, 17].

Fig. 2. Wavelet Scattering Transform process

To cover all the frequencies contained in a signal, a low-pass filter $\phi$ and a wavelet function $\psi$ are designed. The wavelet $\psi(t)$ is a band-pass filter with its central frequency set to 1. A wavelet filter bank $\psi_\lambda(t)$ is created by dilating the wavelet:

$$\psi_\lambda(t) = \lambda\,\psi(\lambda t) \tag{1}$$

where $\lambda = 2^{p/Q}$, $p \in \mathbb{Z}$ runs up to the highest level of scattering, and $Q$ is the number of wavelets per octave. The filter bank is a collection of band-pass filters centered in the frequency domain at $\lambda$, each with frequency bandwidth $\lambda/Q$, since the bandwidth of the wavelet $\psi(t)$ is of order $1/Q$. At the 0th level we take a single coefficient, given in Eq. 2:

$$C_0 x(t) = x \ast \phi(t) \tag{2}$$

where $\ast$ is the convolution operator. For audio signals this coefficient is close to 0. In our work we take $Q = 8$ for the first order, which defines wavelets with the same frequency resolution as mel-frequency filters. Averaging the wavelet modulus coefficients gives approximate mel-frequency spectral coefficients, as in Eq. 3:

$$C_1 x(t, \lambda_1) = \left| x \ast \psi_{\lambda_1} \right| \ast \phi(t) \tag{3}$$

The second-order coefficients capture the high-frequency amplitude modulation in each first-layer frequency band, as in Eq. 4:

$$C_2 x(t, \lambda_1, \lambda_2) = \left| \left| x \ast \psi_{\lambda_1} \right| \ast \psi_{\lambda_2} \right| \ast \phi(t) \tag{4}$$

• Zero Crossing Rate: ZCR measures the noisiness of a signal and gives higher values for noisy signals. First, each sound signal is divided into short frames. Then the number of zero crossings in each frame is determined [3, 12].

• Energy: Energy is produced by the vibrations of sound waves. It is generally measured by summing the squares of the signal values and normalizing by the frame length.

3.2.2 Frequency Domain Features

• Pitch: The pitch of a sound is generally described by the frequency of vibration of its waves. Pitch can be determined only when the frequency is stable and clear enough to be distinguished from noise [13].

• LPC: An LPC algorithm models sounds using either periodic pulses (voiced) or a random noise generator (unvoiced) as the source. LPC has become an important approach for estimating speech parameters such as spectra, pitch, vocal tract area functions, and formants, and for representing speech for low bit rate transmission [14]. In our work we use a frame size of 20 ms.

• MFCC: Mel-frequency cepstral coefficients are a popular feature extraction technique, where the mel-frequency cepstrum is a representation of the short-term power spectrum of an audio signal [15, 16]. In our work we use a frame size of 20 ms.

The 365 features extracted and fused from the GTZAN dataset are listed in Table 1.

Table 1. Features of GTZAN dataset

| Feature | Order |
|---|---|
| Pitch | 1 |
| MFCC | 13 |
| LPC | 15 |
| Wavelet scattering | 334 |
| ZCR | 1 |
| Energy | 1 |
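The two simplest features above, ZCR and energy, can be sketched frame by frame and then averaged into single values, as the abstract describes for every feature. This is a minimal sketch under stated assumptions: non-overlapping 20 ms frames are used for ZCR and energy (the paper gives 20 ms only for LPC and MFCC), a synthetic 440 Hz tone stands in for a music clip, and all function names are illustrative. The last lines check that the per-feature orders in Table 1 add up to the fused vector length of 365.

```python
import math


def frames(signal, frame_len):
    """Split a signal into consecutive non-overlapping frames."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, frame_len)]


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def energy(frame):
    """Sum of squared samples, normalized by the frame length."""
    return sum(x * x for x in frame) / len(frame)


def mean(values):
    return sum(values) / len(values)


SAMPLE_RATE = 22050                    # GTZAN sampling rate in Hz
FRAME_LEN = SAMPLE_RATE * 20 // 1000   # 20 ms frame, i.e. 441 samples

# Synthetic one-second 440 Hz tone standing in for a music clip (assumption).
signal = [math.sin(2 * math.pi * 440 * n / SAMPLE_RATE) for n in range(SAMPLE_RATE)]

fr = frames(signal, FRAME_LEN)
zcr_mean = mean([zero_crossing_rate(f) for f in fr])
energy_mean = mean([energy(f) for f in fr])

# Per-feature orders from Table 1; their sum is the fused vector length.
ORDER = {"pitch": 1, "mfcc": 13, "lpc": 15,
         "wavelet_scattering": 334, "zcr": 1, "energy": 1}
print(zcr_mean, energy_mean, sum(ORDER.values()))
```

Averaging over frames turns a variable-length clip into a fixed-length vector, which is what SVM and KNN need as input.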

4 Result and Analysis

The GTZAN dataset has ten classes: blues (Bs), classical (Cl), country (Cy), disco (Do), hiphop (Hp), jazz (Jz), metal (Ml), pop (Pp), reggae (Re), and rock (Rk). The model is trained with two machine learning algorithms, SVM and KNN. The predicted genres are compared with the actual genres in confusion matrices: Tables 2 and 3 show the confusion matrices for the GTZAN dataset using the SVM and KNN classifiers, respectively.

Table 2. Confusion Matrix for GTZAN dataset using SVM classifier

|    | Bs | Cl | Cy | Do | Hp | Jz | Ml | Pp | Re | Rk |
|----|----|----|----|----|----|----|----|----|----|----|
| Bs | 19 | 0  | 0  | 1  | 0  | 0  | 0  | 0  | 0  | 0  |
| Cl | 0  | 19 | 0  | 0  | 0  | 1  | 0  | 0  | 0  | 0  |
| Cy | 1  | 0  | 15 | 0  | 0  | 1  | 0  | 0  | 0  | 3  |
| Do | 1  | 1  | 1  | 12 | 3  | 0  | 0  | 0  | 1  | 1  |
| Hp | 0  | 0  | 1  | 0  | 15 | 0  | 1  | 0  | 2  | 1  |
| Jz | 0  | 0  | 2  | 0  | 0  | 18 | 0  | 0  | 0  | 0  |
| Ml | 0  | 0  | 0  | 2  | 0  | 0  | 18 | 0  | 0  | 0  |
| Pp | 0  | 0  | 1  | 1  | 0  | 0  | 0  | 17 | 0  | 1  |
| Re | 0  | 0  | 0  | 0  | 2  | 0  | 0  | 1  | 16 | 1  |
| Rk | 0  | 1  | 0  | 0  | 0  | 1  | 2  | 2  | 1  | 13 |

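Table 3. Confusion Matrix for GTZAN dataset using KNN classifier

|    | Bs | Cl | Cy | Do | Hp | Jz | Ml | Pp | Re | Rk |
|----|----|----|----|----|----|----|----|----|----|----|
| Bs | 19 | 0  | 0  | 1  | 0  | 0  | 0  | 0  | 0  | 0  |
| Cl | 0  | 20 | 0  | 0  | 0  | 0  | 0  | 0  | 0  | 0  |
| Cy | 1  | 0  | 12 | 2  | 0  | 2  | 0  | 1  | 1  | 1  |
| Do | 0  | 1  | 0  | 11 | 2  | 1  | 0  | 1  | 2  | 2  |
| Hp | 1  | 0  | 1  | 2  | 11 | 0  | 2  | 0  | 3  | 0  |
| Jz | 0  | 0  | 2  | 1  | 0  | 15 | 0  | 1  | 1  | 0  |
| Ml | 1  | 0  | 0  | 3  | 0  | 0  | 16 | 0  | 0  | 0  |
| Pp | 0  | 1  | 1  | 0  | 0  | 1  | 0  | 16 | 1  | 0  |
| Re | 0  | 0  | 2  | 0  | 3  | 1  | 0  | 1  | 13 | 0  |
| Rk | 0  | 2  | 2  | 0  | 0  | 2  | 0  | 4  | 0  | 10 |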
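The accuracies discussed in this section can be recomputed directly from a confusion matrix. The sketch below does this for the KNN matrix in Table 3, assuming that rows are the actual genres and columns the predicted genres (each row sums to 20, the number of test clips per genre); the function names are illustrative. Overall accuracy is the diagonal sum divided by the total, and per-genre accuracy is each diagonal entry divided by its row sum.

```python
GENRES = ["Bs", "Cl", "Cy", "Do", "Hp", "Jz", "Ml", "Pp", "Re", "Rk"]

# Table 3 (KNN), one row per genre in the order of GENRES.
KNN_CM = [
    [19, 0, 0, 1, 0, 0, 0, 0, 0, 0],
    [0, 20, 0, 0, 0, 0, 0, 0, 0, 0],
    [1, 0, 12, 2, 0, 2, 0, 1, 1, 1],
    [0, 1, 0, 11, 2, 1, 0, 1, 2, 2],
    [1, 0, 1, 2, 11, 0, 2, 0, 3, 0],
    [0, 0, 2, 1, 0, 15, 0, 1, 1, 0],
    [1, 0, 0, 3, 0, 0, 16, 0, 0, 0],
    [0, 1, 1, 0, 0, 1, 0, 16, 1, 0],
    [0, 0, 2, 0, 3, 1, 0, 1, 13, 0],
    [0, 2, 2, 0, 0, 2, 0, 4, 0, 10],
]


def overall_accuracy(cm):
    """Correct predictions (diagonal) divided by all predictions."""
    correct = sum(cm[i][i] for i in range(len(cm)))
    total = sum(sum(row) for row in cm)
    return correct / total


def per_genre_accuracy(cm, names):
    """Diagonal entry divided by row sum, for each genre."""
    return {name: row[i] / sum(row)
            for i, (name, row) in enumerate(zip(names, cm))}


knn_overall = overall_accuracy(KNN_CM)
knn_per_genre = per_genre_accuracy(KNN_CM, GENRES)
print(f"overall: {knn_overall:.3f}")
for name, acc in knn_per_genre.items():
    print(f"{name}: {acc:.2f}")
```

For Table 3 the diagonal sums to 143 of 200 test clips, which matches the 71.5% reported for KNN.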

Figures 3(a) and 3(b) show the per-genre accuracy on the GTZAN dataset for the SVM and KNN classifiers, respectively.

Fig. 3. (a) Accuracy of different genres of GTZAN dataset of SVM classifier. (b) Accuracy of different genres of GTZAN dataset of KNN classifier

5 Comparative Study

Table 4 compares the accuracy of several research works with that of our proposed method. As GTZAN is a well-known standard dataset for music genres, most researchers prefer it for classifying music styles. On this dataset, Mutiara et al. [4] and Ali et al. [5] achieved 76.6% and 77%, respectively, whereas Patil et al. [7] and Chathuranga et al. [13] each achieved 78% overall accuracy, all using the SVM classifier. Our proposed work achieves 80.5% classification accuracy with the SVM classifier on the same GTZAN dataset.

Table 4. Comparative analysis with respect to accuracy

| Research work on Music Genre Classification | Features | Classifiers | Accuracy |
|---|---|---|---|
| Mutiara et al. [4] | MS, MFCC, Tonality, LPC | SVM | 76.6% for GTZAN dataset |
| Ali et al. [5] | MFCC | SVM, KNN | 77% using SVM for GTZAN dataset |
| Patil et al. [7] | MFCC, Chroma features, spectral centroid, spectral roll-off, ZCR | SVM, KNN | 78% using SVM for GTZAN dataset |
| Chathuranga et al. [13] | ZCR, Spectral flux, Spectral Centroid, Spectral roll-off, Pitch, Chroma, MFCC | SVM | 78% for GTZAN dataset |
| Proposed Work | Pitch, MFCC, LPC, wavelet scattering, ZCR, energy | SVM, KNN, Naive Bayes, Decision Tree | 80.5% using SVM for GTZAN dataset |

6 Conclusion

This work presents an automated system for MGC on the GTZAN dataset. Pitch, MFCC, LPC, wavelet scattering, ZCR, and energy are combined into a fused audio feature set of 365 features. On the GTZAN dataset we obtain an accuracy of 80.5% with the SVM classifier and 71.5% with the KNN classifier. The fused feature set outperforms the use of any single feature category alone, and it also performs well compared with other research work on the same dataset. As part of our ongoing research, we want to create deep-learning-based MGC systems that include additional features able to extract more valuable information from audio signals.

References

1. Pushpalatha, K., Sagar, U.S.: Music genre classification using machine learning techniques. Int. J. Res. Eng. Sci. Manage. 4(7), 77–82 (2021)
2. Girsang, A.S., Manalu, A.S., Huang, K.W.: Feature Selection for Musical Genre Classification Using a Genetic Algorithm
3. Ardiansyah, B.Y., Sahara, R.: Music genre classification using naïve bayes algorithm. Int. J. Comput. Trends Technol. 62(1), 50–57 (2018)
4. Mutiara, A.B., Refianti, R., Mukarromah, N.R.A.: Musical genre classification using support vector machines and audio features. TELKOMNIKA (Telecommun. Comput. Electron. Control) 14, 1024–1034 (2016)
5. Ali, M.A., Siddiqui, Z.A.: Automatic music genres classification using machine learning. Int. J. Adv. Comput. Sci. Appl. (IJACSA) 8(8), 337–344 (2017)
6. Vishnupriya, S., Meenakshi, K.: Automatic music genre classification using convolution neural network. In: 2018 International Conference on Computer Communication and Informatics (ICCCI), pp. 1–4. IEEE (2018)
7. Patil, N.M., Nemade, M.U.: Music genre classification using MFCC, K-NN and SVM classifier. Int. J. Comput. Eng. Res. Trends 4(2), 43–47 (2017)
8. Banitalebi-Dehkordi, M., Banitalebi-Dehkordi, A.: Music genre classification using spectral analysis and sparse representation of the signals. J. Sign. Process. Syst. 74(2), 273–280 (2014)
9. Tzanetakis, G., Cook, P.: Musical genre classification of audio signals. IEEE Trans. Speech Audio Process. 10(5), 293–302 (2002)
10. Andén, J., Mallat, S.: Deep scattering spectrum. IEEE Trans. Signal Process. 62(16), 4114–4128 (2014)
11. Ghezaiel, W., Brun, L., Lézoray, O.: Wavelet Scattering Transform and CNN for Closed Set Speaker Identification. In: 2020 IEEE 22nd International Workshop on Multimedia Signal Processing (MMSP), pp. 1–6 (2020)
12. Bahuleyan, H.: Music genre classification using machine learning techniques. arXiv preprint arXiv:1804.01149 (2018)
13. Chathuranga, D., Jayaratne, L.: Musical genre classification using ensemble of classifiers. In: 2012 Fourth International Conference on Computational Intelligence, Modelling and Simulation, pp. 237–242. IEEE (2012)
14. Mada Sanjaya, W.S., Anggraeni, D., Santika, I.P.: Speech recognition using linear predictive coding (LPC) and adaptive neuro-fuzzy (ANFIS) to control 5 DoF arm robot. J. Phys. Conf. Ser. 1090, 012046 (2018). https://doi.org/10.1088/1742-6596/1090/1/012046
15. Swedia, E.R., Mutiara, A.B., Subali, M.: Deep learning long-short term memory (LSTM) for Indonesian speech digit recognition using LPC and MFCC Feature. In: 2018 Third International Conference on Informatics and Computing (ICIC), pp. 1–5. IEEE (2018)
16. Gamit, M.R., Dhameliya, K.: Isolated words recognition using MFCC, LPC and neural network. Int. J. Res. Eng. Technol. 4(6), 146–149 (2015)
17. Soro, B., Lee, C.: A wavelet scattering feature extraction approach for deep neural network based indoor fingerprinting localization. Sensors 19(8), 1790 (2019)

Optimization of Intraday Trading in F&O on the NSE Utilizing BOLLINGER BANDS

Joyjit Patra1, Mimo Patra1, and Subir Gupta2(B)

1 Department of C.S.E, Dr. B. C. Roy Engineering College, Durgapur, West Bengal, India
2 Department of CSE, Swami Vivekananda University, Kolkata 700121, West Bengal, India
[email protected]

Abstract. As AI-based ALGO-TRADE technologies have advanced, firms around the globe have altered their practices. With this technology, investors can raise their chances of success and depend less on luck. When there is only one remaining share of stock or product, buy-or-write is sometimes employed. The "buy-and-sell maximization study design" is a type of research approach used to form an opinion. Bollinger Bands were used to develop the ALGO-TRADE programs in this investigation. In finance, Bollinger Bands are a technical indicator used to display the top and bottom of a trading range. They are price-bounding envelopes drawn a set number of standard deviations above and below a simple moving average of prices. Because the standard deviation determines the width of the band, the band is sensitive to market fluctuations. Bollinger Bands help the study determine whether the present price is abnormally high or low. The upper and lower bands are typically used in conjunction with a moving average; the bands were designed to work together and will not function properly if used independently. They should also be combined with complementary indicators to cross-check their signals. The study also examines the profit and loss of a NIFTY option trading account for the years 2021 and 2022. Based on the results, a trader who starts with an investment of Rs 50,000 can make a profit of Rs 30,000.

Keywords: ALGO-TRADE · AI · Bollinger bands · GARCH model

1 Introduction

Over time, new traders, new investors, and retail traders are finding it harder and harder to trade and invest in different market segments. With improvements in AI technology and the use of AI in trading and investing by large institutions such as FIIs and DIIs, making decisions and placing orders for larger quantities has become easier [1]. Small and retail traders find it very hard to make a profit because they become emotionally attached to each trade and overthink it. This makes them slower to decide, which leads to more slippage. Managing the complexity of an algorithm has also become easier over time. ALGO-TRADE is not a novel concept in the trading industry: algorithmic trading is one of the oldest available trading strategies [2]. Because algo trading is hard to set up, however, it has not been widely used for some time.

ALGO-TRADE tools offer a distinct advantage over traditional backward- and forward-looking trading strategies. These tools use an artificial intelligence algorithm to predict the future price of an asset from its past price changes, so traders can capitalise on future market trends. They do not rely on the trader's own analysis; instead they use technical analysis, which does not take market news or fundamental values into account. ALGO-TRADE relies on complex algorithms that can accept a variety of inputs [3]. For instance, one algorithm may analyse past prices to predict the future direction of an asset, while a second could analyse current news stories to reach a conclusion. Consequently, ALGO-TRADE has gained immense popularity among traders. It enables profitable, automated trades: traders can leave their computers on while the algorithm performs all the work. Since the creation of algo trading, and with its heavy use by large institutions in the past few years, the volume and number of trades have grown considerably.

Institutions typically employ several types of trading. Hedging uses different strategies to reduce risk: for example, shares of a coffee company can be hedged with shares of a tea company [4], and currency can be hedged with gold. In swing trading, a trade or investment is held over a week or more; in scalping, institutions buy and sell large quantities and stay in the trade for only a few minutes before exiting. In scalping, institutions use their large capital to take advantage of small price changes, which yield large profits because of the size of their positions. This type of trading requires orders to be entered and exited very quickly, which is hard for a person but easier for an algorithm. ALGO-TRADE software is designed to place orders at precisely the right time and from any location. Institutions and traders use large systems built to exploit price changes very efficiently. The algorithm can also be configured for different trading styles, such as day trading, swing trading, and scalping. Algorithmic trading is also used on futures markets for commodities, stocks, and currencies. Because the prices of these assets fluctuate constantly, algorithmic trading can profit greatly from the price changes [5, 6].

ALGO-TRADE can also be used to backtest an algorithm's past performance, which would take more time and cost more if done by hand. The main benefit of algo trading is that it cuts the time and money needed for trading by removing the human element. It also lets traders execute orders more quickly, since algorithms can make decisions faster and more objectively than people can, which takes bias and emotion out of the decision-making process [7].

The rest of the paper is organized as follows. Section 2 reviews the literature in greater depth, Sect. 3 describes the methodology of the proposed system, Sect. 4 discusses the results, and Sect. 5 provides a summary and some suggestions for future work.

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 S. Bhattacharyya et al. (Eds.): Human 2023, STHC, pp. 33–42, 2023. https://doi.org/10.1007/978-981-99-3478-2_4

2 Literature Review

The second method is backed by a large body of research on how trading and volatility behave in different types of derivative contracts. This research sought to answer a question shared by policymakers, financiers, and market participants: as a derivatives contract approaches its close, are volatility and returns subject to sustained pressure to rise or fall? If a trader's techniques generate consistent profits, they may be successful; if volatility increases significantly around expiration, the authorities will be concerned. In the United States, the effect of expiration day on derivatives trading has been the subject of large-scale empirical study [8], and the triple witching hour has inspired a great deal of work. Researchers began by gathering real-world data, which showed that on expiration dates the number of transactions went up, volatility was higher than expected, and money was spent on non-economic items. Many studies do not say where their data came from, for no clear reason. They compared the mean and standard deviation of earnings that were and were not limited by expiration dates.

The purpose of this study is to shed light on this crucial topic. For conclusions about expiration-date effects to be valid, researchers insist on meticulous modeling of the underlying process that drives stock returns. The most essential contribution of the study is a more precise Generalized Auto-Regressive Conditional Heteroscedastic (GARCH) model of how the data are generated and how the expiration date influences them [9, 10]. According to numerous studies, the correlation between market volatility and volume is robust, but the unusual market liquidity seen around expiration does not seem to be linked to larger changes in spot prices. The study intends to fulfill an urgent need in market research by demonstrating that the method is applicable to the Indian market.

The expiration date and time must be set before proceeding. Every month, on the last Thursday, all contracts expire. The final half-hour of the expiration day is known as the "expiration hour." To find out how open futures contracts affected the market, the last half-hour of the last day of the contract was examined. On expiration days and the five days preceding them, there were significantly more trades than on other days. The study also shows that the expiration date has a large effect on the market index's daily returns and volatility. On the cash market, the price, volatility, and volume of shares are compared with their values one and two weeks prior to expiration. The day before futures contracts expire, cash market values are often low, but they rise the day after [10, 11]. However, this does not imply that the great majority of stock values will change. Finally, expiration days are significantly busier than other days.

According to the second definition, all open contracts are settled in cash using the most recent half-hourly weighted average of the underlying price. The cash settlement terms stated that contracts that did not automatically renew were settled at midnight based on the closing price of the underlying securities. The phrase "triple witching hour" appears more often than one might expect in this body of work; it refers to the time at which all futures, options, and index contracts expire. The Indian stock market currently experiences what is known as a "quadruple witching hour": each month's options on indices and stocks expire on the last Thursday. The research can use less frequent daily data to determine how the deadline influences the circumstances [12–14]. The conditional mean and variance of returns are not affected at all by the maturity date.

3 Methodology
This study focuses on reducing the trading and investment effort, and the stress a person experiences while punching an order, by automating the process with AI tools. Figure 1 shows the methodology diagram and Table 1 shows the pseudo code. The algorithm executes all trades automatically and can place the entry order, stop-loss order, and target order at once. It retrieves live NSE data and places trades through the broker's APIs based on the signals it generates. In the first step, the algorithm (Algo) collects market data from the NSE API. Once all the data are fetched, they are filtered by Exchange Segment, Instrument Type, Strike Price, and, for OPTSTK or OPTIDX instruments, CE/PE. The filtered data are then stored in a local database for further reference. Next, Algo loads the Bollinger Band indicator, sets its standard deviation multiplier to 1.5, and retrieves the current candle's Open, High, Low, and Close together with the upper and lower Bollinger Band values. The formation of the two bands follows from their mathematical definition: Eq. 1 gives the upper Bollinger Band and Eq. 2 gives the lower Bollinger Band.

$$\mathrm{BOLU} = \mathrm{MA}(TP, b) + a \cdot \sigma[TP, b] \tag{1}$$

$$\mathrm{BOLD} = \mathrm{MA}(TP, b) - a \cdot \sigma[TP, b] \tag{2}$$

where, following the standard definition of the indicator:
BOLU is the upper Bollinger Band and BOLD is the lower Bollinger Band;
MA is the moving average;
TP is the typical price, (High + Low + Close) / 3;
b is the number of periods in the smoothing window;
a is the number of standard deviations (1.5 in this study);
σ[TP, b] is the standard deviation of TP over the last b periods.
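As an illustration of Eqs. 1 and 2, the following minimal sketch computes both bands from a list of candles. It is not the authors' implementation: the function names, the tuple layout (high, low, close), the window length b = 20, and the use of the population standard deviation are assumptions made for this example. Only the multiplier a = 1.5 comes from the text.

```python
from statistics import mean, pstdev


def typical_price(high, low, close):
    """Typical price TP of one candle: the average of high, low and close."""
    return (high + low + close) / 3.0


def bollinger_bands(candles, b=20, a=1.5):
    """Return (BOLU, BOLD) for the most recent b candles.

    candles: list of (high, low, close) tuples, oldest first.
    b: window length (assumed value, not stated in the text).
    a: number of standard deviations (1.5 as in the text).
    """
    if len(candles) < b:
        raise ValueError("need at least b candles")
    tp = [typical_price(h, l, c) for h, l, c in candles[-b:]]
    ma = mean(tp)        # MA(TP, b)
    sigma = pstdev(tp)   # sigma[TP, b], population standard deviation (assumption)
    return ma + a * sigma, ma - a * sigma


if __name__ == "__main__":
    sample = [(11, 9, 10), (13, 11, 12), (15, 13, 14)]
    upper, lower = bollinger_bands(sample, b=3, a=1.5)
    print(upper, lower)
```

A wider multiplier a moves both bands away from the moving average, so fewer candles fall outside them and fewer trigger candles are produced.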

Once all the data are retrieved, they are stored in the local database for further reference. The algorithm then checks whether the current candle's Open, High, Low, and Close are above the upper Bollinger Band value for that candle, or whether they are below the lower Bollinger Band value. If either condition is true, the current candle is marked as the trigger candle and its High and Low are recorded for further reference. Algo then checks the close of each upcoming candle until one closes above the trigger candle's High or below its Low. Once a candle closes above the trigger candle's High, Algo generates a BUY signal with the entry price at the current candle's Close and the stop-loss at the current candle's Low, and the target order is placed at current candle Close + 4 * (current candle Close – current candle Low). If a candle closes below the trigger candle's Low, Algo generates a SELL signal with the entry price at the current candle's Close and the stop-loss at the current candle's High, and the target order is placed at current candle Close - 4 * (current candle High – current candle Close).

Fig. 1. Procedure of share market prediction using the Bollinger Bands Algorithm

Table 1. Pseudo Code of share market prediction using the Bollinger Bands Algorithm

1. Start
2. Load Market Data
3. DB = Filter {Exchange Segment, Symbol, Instrument Type, Strike Price, PE/CE}
4. Set Std Dev to 1.5 and Calculate Bollinger Upper Band, Bollinger Lower Band
5. DB = Read Current Candle {Open, High, Low, Close}
6. If, Current Candle {Open, High, Low, Close} > Bollinger Upper Band || < Bollinger Lower Band
      Mark Current Candle as Trigger Candle and Mark its High, Low
   Else,
      Repeat from Step 4
7. If, Next Candle Close > High
      Set Entry order limit price = current candle close,
          Stoploss order limit price = current candle low,
          Target order limit price = current candle close + 4 * (current candle close - current candle low)
   Else If, Next Candle Close < Low
      Set Entry order limit price = current candle close,
          Stoploss order limit price = current candle high,
          Target order limit price = current candle close - 4 * (current candle high - current candle close)
   Else,
      Repeat from Step 4
8. Place Orders
9. Exit
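Steps 6 and 7 of the pseudo code can be sketched in Python as follows. This is an illustrative reading, not the authors' code: the `Candle` and `Order` classes and the function names are hypothetical, and the trigger test assumes that all four prices of the candle must lie outside the band, which is one interpretation of the description. Order placement through a broker API is deliberately left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Candle:
    open: float
    high: float
    low: float
    close: float


@dataclass
class Order:
    side: str        # "BUY" or "SELL"
    entry: float     # entry limit price
    stoploss: float  # stop-loss limit price
    target: float    # target limit price


def is_trigger(candle: Candle, upper: float, lower: float) -> bool:
    """Step 6: the whole candle lies above the upper band or below the lower band
    (assumed interpretation of the trigger condition)."""
    prices = (candle.open, candle.high, candle.low, candle.close)
    return all(p > upper for p in prices) or all(p < lower for p in prices)


def make_signal(trigger: Candle, current: Candle, multiple: float = 4.0) -> Optional[Order]:
    """Step 7: derive entry, stop-loss and target from the candle that closes
    beyond the trigger candle's High or Low. Returns None when there is no signal."""
    if current.close > trigger.high:
        risk = current.close - current.low
        return Order("BUY", current.close, current.low, current.close + multiple * risk)
    if current.close < trigger.low:
        risk = current.high - current.close
        return Order("SELL", current.close, current.high, current.close - multiple * risk)
    return None


if __name__ == "__main__":
    trig = Candle(105, 108, 104, 107)
    if is_trigger(trig, upper=103, lower=95):
        print(make_signal(trig, Candle(107, 112, 106, 110)))
```

The fixed multiple of 4 means each trade risks one unit (entry to stop-loss) for a potential gain of four units (entry to target).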

4 Result
The results section examines the potential economic outcomes of the strategy. Three charts, the "target achieved chart," the "stop loss hit chart," and the "trigger candle identification chart," are used to determine whether the trading system is operating as intended. Each of the three conditions is therefore considered separately, and the rest of this section provides additional detail on all three cases.

Fig. 2. Target Achieved Chart

A "trigger candle" occurs when the open, high, low, or closing price of a candle does not fall within the Bollinger Band. This type of candle is referred to as an "exterior Bollinger band candle." Once a trigger candle has been found, its high, low, and closing values are recorded so they can be used in the next step. When a candle closes above the high of the trigger candle, a buy signal is generated. When a candle closes below the low of the trigger candle, a sell signal is generated. When the SL buy signal matches the low of the trigger candle, the goals are always multiples of 4, but they are never less than 2 when the SL sell signal matches the high of the trigger candle.
In the case shown in Fig. 2, the open, high, low, and close of the trigger candle did not interact with the Bollinger Band, and the following candle closed below the low of the trigger candle. The system therefore sent a sell signal with the S.L. at the high of the trigger candle, entry at the close of the current candle, and a target of four times the S.L. range. The target was reached.
In the case shown in Fig. 3, the trigger candle was identified because the ends of the candle never touched the Bollinger Band. A buy signal was sent because the next candle closed above the high of the trigger candle. The target was set at four times the S.L., and the S.L. was kept at the low of the trigger candle. The stop-loss was hit very quickly, and the algorithm closed the trade at a loss immediately.


Fig. 3. Stop Loss Hit Chart

Fig. 4. Trigger Candle Identification Chart or System Ideal

If there is no trigger candle, the system is in perfect order and waits for a candle to violate the trigger candle's requirements before placing an order. If a trigger candle is found, the system enters a state of imperfect order in which it waits for a new candle to either meet or fail to meet the conditions. The example shows which candle should not be used as a trigger candle. In this situation the system is in a "hunting state," which is the optimal state for locating a trigger candle. A trigger candle is simple to identify by observing its high, low, open, and close as it forms. The trigger candle identification chart, or system ideal, is shown in Fig. 4.


5 Conclusion
Businesses all over the world have changed the way they operate because of the rise of AI-based technologies such as ALGO-TRADE. Using this technique, investors may improve their odds of making money and rely less on luck. Depending on the situation, a buy-or-write decision may need to be made when only one unit of a product or asset is available. A number of distinct study designs, such as "buy-and-sell maximization," aim to reach a conclusion based on research conducted in real markets. The primary objective here was to enhance a simulation model employing Bollinger Bands and to use it to create the ALGO-TRADE program used in this inquiry. In finance, Bollinger Bands are a technical indicator that displays the high and low points of a trading range. A histogram is utilised to illustrate these facts. The upper and lower bands are placed above and below a simple moving average of prices at a distance given by a multiple of the standard deviation, so the bands serve as price envelopes. Because the width of the band depends on the standard deviation, which measures the dispersion of the data, it is highly sensitive to market fluctuations. Using Bollinger Bands, traders can judge whether the current price is relatively high or low. Sometimes, upper and lower bands are added to a moving average to show more information. Using the bands separately will not give the same results as using them together; the best results are achieved by combining the bands. Further signals can then be added and tested based on what was learned from the first tests.
The analysis examined three charts, the "Target Achievement Chart," the "Stop Loss Hit Chart," and the "Trigger Candle Identification Chart," to assess whether a NIFTY option trading account would make money in 2021 and 2022. Overall, the system performed well. Based on the results, a trader who invests 50,000 rupees might expect to make 30,000 rupees in profit. Each transaction was completed within a single trading day. Taken together, the findings indicate that the approach has value and can help traders do their jobs better.

References
1. Razavi, S.: Deep learning, explained: Fundamentals, explainability, and bridgeability to process-based modelling. Environ. Model. Softw. 144, 105159 (2021). https://doi.org/10.1016/j.envsoft.2021.105159
2. Sheu, H.J., Wei, Y.C.: Effective options trading strategies based on volatility forecasting recruiting investor sentiment. Expert Syst. Appl. 38, 585–596 (2011). https://doi.org/10.1016/j.eswa.2010.07.007
3. Mann, J., Kutz, J.N.: Dynamic mode decomposition for financial trading strategies. Quant. Financ. 16, 1643–1655 (2016). https://doi.org/10.1080/14697688.2016.1170194
4. Arfaoui, M., Ben Rejeb, A.: Oil, gold, US dollar and stock market interdependencies: a global analytical insight. Eur. J. Manag. Bus. Econ. 26, 278–293 (2017). https://doi.org/10.1108/EJMBE-10-2017-016
5. Akhtar, M.M., Zamani, A.S., Khan, S., Shatat, A.S.A., Dilshad, S., Samdani, F.: Stock market prediction based on statistical data using machine learning algorithms. J. King Saud Univ. Sci. 34, 101940 (2022). https://doi.org/10.1016/j.jksus.2022.101940
6. Singh, A.K., Patra, J., Chakraborty, M., Gupta, S.: Prediction of Indian government stakeholder oil stock prices using hyper parameterized LSTM models. In: 2022 International Conference on Intelligent Controller and Computing for Smart Power, ICICCSP 2022, pp. 1–6 (2022). https://doi.org/10.1109/ICICCSP53532.2022.9862425
7. Joshi, V.K.: Impact of Fluctuation: Stock/Forex/Crude Oil on Gold. SCMS J. Indian Manag. 96–115 (2009)
8. Basarir, C., Bayramoglu, M.F.: Global macroeconomic determinants of the domestic commodity derivatives. In: Dincer, H., Hacioglu, Ü., Yüksel, S. (eds.) Global Approaches in Financial Economics, Banking, and Finance. CE, pp. 331–349. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-78494-6_16
9. Fang, X., Xu, M., Xu, S., Zhao, P.: A deep learning framework for predicting cyber attacks rates. Eurasip J. Inf. Secur. 2019 (2019). https://doi.org/10.1186/s13635-019-0090-6
10. Kumar, B., Pandey, A.: Market efficiency in Indian commodity futures markets. J. Indian Bus. Res. 5, 101–121 (2013). https://doi.org/10.1108/17554191311320773
11. Vashishtha, A., Kumar, S.: Development of financial derivatives market in India - a case study. Int. Res. J. Financ. Econ. 37, 15–29 (2010)
12. Aboura, S., Chevallier, J., Jammazi, R., Tiwari, A.K.: The place of gold in the cross-market dependencies. Stud. Nonlinear Dyn. Econom. 20, 567–586 (2016). https://doi.org/10.1515/snde-2015-0017
13. Mittal, S., Nagpal, C.K.: Predicting a reliable stock for mid and long term investment. J. King Saud Univ. Comput. Inf. Sci. (2021). https://doi.org/10.1016/j.jksuci.2021.08.022
14. Lin, X., Yang, Z., Song, Y.: Short-term stock price prediction based on echo state networks. Expert Syst. Appl. 36, 7313–7317 (2009). https://doi.org/10.1016/j.eswa.2008.09.049

Identification of Mental State Through Speech Using a Deep Learning Approach

Somnath Bera1, Tanushree Dey1, Debashri Das Adhikary1, Sumita Guchhhait1, Utpal Nandi1, Nuruzzaman Faruqui2, and Bachchu Paul1(B)

1 Department of Computer Science, Vidyasagar University, Midnapore 721102, West Bengal, India
[email protected]
2 Department of Software Engineering, Daffodil International University, Daffodil Smart City, Ashulia, Dhaka, Bangladesh
[email protected]

Abstract. Speech is a powerful medium for expressing one's feelings and attitude. Finding the emotional content in speech signals and identifying the emotions in speech utterances is an important research problem. This paper examines how well a deep learning-based model can identify speech emotions on two well-known datasets, TESS and RAVDESS. A combination of frequency-domain acoustic features, thirteen (13) Linear Predictive Coefficients (LPC) and thirteen Mel Frequency Cepstral Coefficients (MFCCs), is fed into a two-dimensional Convolutional Neural Network (CNN) model for classification. According to the experimental findings, the suggested method recognizes speech emotions with an average accuracy of 99% (TESS) and 73% (RAVDESS) for speaker-dependent (SD) speech.
Keywords: Speech Emotion Recognition · Mel Frequency Cepstral Coefficients · Linear Predictive Coefficients · Convolutional Neural Network

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 S. Bhattacharyya et al. (Eds.): Human 2023, STHC, pp. 43–53, 2023. https://doi.org/10.1007/978-981-99-3478-2_5

1 Introduction
Verbal communication is the most practical and efficient method of sharing ideas and thoughts. Speech expresses the speaker's mood in addition to the syntactic and semantic content of the spoken words, and communication is much more effective when appropriate expressions are used and feelings are identified. The cues of whole-body emotional events include speech, facial expression, body language, brain impulses, etc. [1]. A variety of emotions, including anger, sadness, and happiness, are used to express feelings [2]. Speech Emotion Recognition (SER) is an effective technique for human-machine interaction that can detect the speaker's emotion as well as their actual and virtual presence. SER is used for a variety of purposes, including entertainment, education, the automotive industry, and medical care [3]. Psychiatrists use SER systems to assess patients' mental health, and SER is helpful for treating people who are experiencing depression or a great deal of stress. The global education system is gradually shifting to an online format. However, it can be challenging to pinpoint a student's mental state in this mode, which may interfere with efficient learning. With SER, it is possible to determine from a student's emotions whether or not they are currently engaged in studying [4, 5].
People can usually identify someone's emotions from their speech, but it is very challenging for a machine to do so. The identification of emotions is complicated by a number of speech-related factors, including changes in the speaker's vocal tract, rate of speech, culture, and surroundings. An emotional conflict can sometimes be determined by the words that follow a spoken utterance. Shifting social and cultural environments also produce new ways of communicating, which makes it harder to recognize emotions accurately [6]. To ascertain the speaker's emotional state, it is necessary to separate paralinguistic components that are unrelated to the speaker or the lexical content from the speech. Linguistic information always refers to the context or meaning of the utterance, whereas paralinguistic information refers to implicit messages such as the emotion in the speech [7]. The following distinct acoustic features are frequently used to identify the spoken emotion: qualitative features, spectral features, and continuous features. Many features have been studied for identifying speech emotion. Researchers have weighed the benefits and drawbacks of each, but no group of features has yet been shown to be the best [8]. Because speech-related factors vary, features must be extracted and selected carefully to achieve successful results from SER.
While extracting features, MFCC takes into account the nature of the speech, whereas LPC predicts each speech sample from previous samples. Many well-known feature extractors, such as MFCC, LPC, energy, spectrogram, pitch, Zero Crossing Rate (ZCR), and Principal Component Analysis (PCA), as well as well-known classifiers, such as the Gaussian Mixture Model (GMM), Hidden Markov Model (HMM), Support Vector Machine (SVM), K Nearest Neighbor (KNN), and Linear Discriminant Analysis (LDA), are typically employed in speech-oriented research projects. Many deep learning architectures have also been developed for speech emotion recognition in an effort to extract high-level features from utterances expressing various emotions and to build a hierarchical representation of the speech. In this paper, the TESS [9] and RAVDESS [10] speech corpora are used for the experiments. We describe a convolutional neural network (CNN) model that uses LPC and MFCC parameters.

2 Literature Review
Prior research on SER systems has focused on the speech features and classification methods used. Other significant problems concerning databases of emotional speech recordings and the categories of emotion under investigation have also been noted. To obtain substantial feature vectors for accurately identifying emotions, several researchers focus on various types of features, such as prosodic and spectral features, individually or in combination. MFCC, LPCC, energy, and pitch were extracted from the speech signal in [3] to recognize the speech properly. A variety of classifier types have been proposed for recognizing speech emotions. The SER system employed classifiers such as the Gaussian Mixture Model (GMM), K-Nearest Neighbours (KNN), Hidden Markov Model (HMM), Support Vector Machine (SVM), and Artificial Neural Network (ANN), and achieved 80% accuracy for a speaker-dependent system using SVM. To learn local and global emotion-related features from speech and the log-mel spectrogram, two convolutional neural network and long short-term memory (CNN LSTM) networks were built in [11]. In place of CNN, a local feature learning block (LFLB) was used to extract emotional features. An emotion recognition system based on uttered speech signals was built in [12]. Several features, including 39 MFCCs, ZCR, Harmonic to Noise Rate (HNR), and Teager Energy Operator (TEO), were extracted from the speech signals. SVM was used as the classifier and achieved 74.07% accuracy. A novel deep dual recurrent encoder model was introduced in [13] that uses text data and audio signals simultaneously to better understand speech data. Four categories of emotion (angry, happy, sad, and neutral) were recognized on the IEMOCAP dataset. In [14], a modular end-to-end speech emotion recognition system was developed, with an upstream and downstream architecture that recognizes emotions using self-supervised feature extraction. In [15], 3-D log-mel spectra were extracted from raw signals and fed to the ADRNN system to recognize the speech emotion. Some researchers used various classifier types to predict emotions and compared the accuracy of each classifier. In [16], four emotions (angry, happy, sad, and neutral) are classified using three well-known classification algorithms: SVM, LDA, and decision tree (D-tree).
They attained the greatest accuracy, 85%, with D-tree among the designated classifiers, using MFCC, DWT, pitch, energy, and ZCR features from the speech signal. Emotions were categorized in [17] using GMM, HMM, KNN, Multi-layer Perceptron (MLP), Recurrent Neural Network (RNN), and back propagation. In [2], the six common emotions were classified using LDA, Regularized Discriminant Analysis (RDA), SVM, and KNN on both the Berlin and the Spanish datasets. Among the four classifiers, RDA achieved 92.6% accuracy on the Berlin dataset and 90.5% accuracy on the Spanish dataset using MFCC and pitch with energy. Classifiers built on neural networks have also been used in SER systems. In [18], seven key emotions were predicted on two well-known databases, Berlin and Spanish, using SVM, Multivariate Linear Regression (MLR), and RNN. Three classifiers were employed for classification using MFCC and Modulation Spectral Features (MSF), one of which was applied to the Spanish dataset. On the Berlin database, 90% accuracy was reached for RNN and 82.41% for MLR. In [19], RNN, MLR, and SVM are used to classify a Bengali speech dataset in order to identify emotions, with MFCC and MSF features extracted from the authors' own dataset. The RNN modulation technique aids this system's accuracy, which is 51.33%. Extensive research has shown that no single feature alone affects the overall performance of an SER system, either positively or negatively. However, combining several features makes it easier to obtain the extracted values for classification. After reviewing numerous research studies, we identified another issue: some studies classified only a few common emotions and thereby achieved a higher level of accuracy, whereas our main goal is to classify more emotions and use as much data as possible while still achieving a satisfactory level of accuracy.

3 Methodology
The SER system aims to automatically identify the emotional state from spoken audio samples. Eighty percent of the recorded audio data are used to train the classifiers, and the remaining twenty percent are used to test the system's ability to predict emotions. Speakers convey their emotional state through different speaking styles. Features are useful data extracted from the input audio signal. The proposed system is built through a number of steps, which are shown in Fig. 1 below.

Fig. 1. Suggested Methodology of SER system

Here, 80% of the emotion-based speech samples are used for training, and the remaining 20% are used as testing samples. Preprocessing and feature extraction are standard processes applied to both training and test samples. Preprocessing removes unwanted noise from the audio input samples. In the feature extraction step, different prosodic and spectral features are extracted from the preprocessed data.

3.1 Dataset Used
In this work, we use the Toronto Emotional Speech Set (TESS) [9], a well-known English audio dataset. TESS, which includes only female speakers, has 2800 utterances covering 7 emotions: anger, disgust, fear, happiness, pleasant surprise, sadness, and neutral. The two female speakers in the audio samples are 26 and 64 years old, and both have formal education and musical training. The audio samples are recorded at a sampling frequency of 24414 Hz on a 32-bit mono channel. We also use the multimodal Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS) [10], which is in English. In this database, 24 professional actors (12 male and 12 female) recorded their voices using two lexically matched statements. This dataset includes eight emotions: neutral, calm, happy, sad, angry, fearful, disgust, and surprised. The audio samples are recorded at a sampling frequency of 16000 Hz on a 32-bit mono channel.


3.2 Feature Extraction
According to some researchers, spectral properties represent the relationship between changes in vocal tract shape and articulatory movements. MFCC and LPC are spectral features frequently employed in speech emotion identification because they are easy to calculate, discriminate well, and are highly robust to noise. The human ear's sensitivity to sound waves of different frequencies is non-linear, and the MFCC is inspired by this property of the human auditory system [20, 21]. As a result, the relationship between these spectral features and frequency is nonlinear. For both LPC and MFCC, the entire voiced portion of the speech signal is divided into 25 ms frames with a 60% overlap, which is important for capturing the relevant data at frame boundaries. The extraction procedure of MFCC is depicted in Fig. 2.

Fig. 2. Illustrates the extraction procedure for MFCC.
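The framing and windowing stage shared by both feature extractors can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' code: the function names are hypothetical, rounding the frame and hop lengths to whole samples is an assumption, and trailing samples that do not fill a complete frame are simply dropped. The 25 ms frame length and 60% overlap come from the text.

```python
import math


def frame_signal(samples, sample_rate, frame_ms=25.0, overlap=0.60):
    """Split a signal into overlapping frames.

    Returns (frames, frame_len, hop), where hop is the step between frame starts.
    """
    frame_len = int(round(sample_rate * frame_ms / 1000.0))
    hop = max(1, int(round(frame_len * (1.0 - overlap))))
    frames = [
        samples[start:start + frame_len]
        for start in range(0, len(samples) - frame_len + 1, hop)
    ]
    return frames, frame_len, hop


def hamming(n):
    """Hamming window of length n: 0.54 - 0.46 * cos(2*pi*i / (n - 1))."""
    if n == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * i / (n - 1)) for i in range(n)]


def window_frames(frames):
    """Multiply every frame by a Hamming window of matching length."""
    if not frames:
        return []
    w = hamming(len(frames[0]))
    return [[s * wi for s, wi in zip(frame, w)] for frame in frames]


if __name__ == "__main__":
    one_second = [0.0] * 16000  # silent placeholder signal at 16000 Hz
    frames, frame_len, hop = frame_signal(one_second, 16000)
    print(len(frames), frame_len, hop)
```

At 16000 Hz this gives 400-sample frames that advance by 160 samples, so consecutive frames share 240 samples.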

To extract the MFCC, each speech frame is first multiplied by a Hamming window, and the discrete Fourier transform (DFT) is then used to compute its discrete spectrum. A Mel filter bank is applied next, which also diminishes the impact of harmonics. The MFCC computation is completed with the discrete cosine transform (DCT). LPC, one of the most effective speech analysis methods, provides a practical way to encode high-quality speech at a low bit rate. The fundamental principle of linear predictive analysis is that a speech sample at a given moment can be approximated as a linear combination of past speech samples [22]. The phases of LPC are depicted in Fig. 3.

Fig. 3. Illustrates the extraction procedure for LPC
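The autocorrelation method of LPC analysis can be illustrated with the sketch below. The text does not say how the autocorrelations are converted into LPC coefficients; the Levinson-Durbin recursion used here is a common choice and is an assumption of this example, as are the function names. The default order of 13 matches the 13 LPC coefficients used in this work.

```python
def autocorrelation(frame, max_lag):
    """Autocorrelation r[0..max_lag] of one (windowed) frame."""
    n = len(frame)
    return [
        sum(frame[i] * frame[i + lag] for i in range(n - lag))
        for lag in range(max_lag + 1)
    ]


def levinson_durbin(r, order):
    """Solve the LPC normal equations from autocorrelations r[0..order].

    Returns (coeffs, error), where coeffs[k-1] multiplies s[n-k] in the
    prediction s[n] ~ sum_k coeffs[k-1] * s[n-k], and error is the residual
    prediction error energy.
    """
    a = [0.0] * (order + 1)
    error = r[0]
    if error == 0:
        return a[1:], 0.0
    for i in range(1, order + 1):
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / error  # reflection coefficient for this order
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        error *= (1.0 - k * k)
        if error <= 0:
            break
    return a[1:], error


def lpc(frame, order=13):
    """LPC coefficients of one frame (order 13 as used in this work)."""
    return levinson_durbin(autocorrelation(frame, order), order)[0]


if __name__ == "__main__":
    print(levinson_durbin([1.0, 0.5, 0.25], 2))
```

The highest autocorrelation lag passed to the recursion equals the prediction order, so a frame needs more than 13 samples for a 13th-order analysis.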

The goal of LPC is to minimize the sum of the squared differences between the original speech signal and the estimated speech signal over a finite interval, which yields a unique set of predictor coefficients. These predictor coefficients are estimated for every frame. Each frame of the windowed signal is then autocorrelated, with the highest autocorrelation lag equal to the order of the linear prediction analysis. In the LPC analysis [22] that follows, the autocorrelations of each frame are converted into a set of LPC parameters, including the LPC coefficients. Finally, 13 coefficients each of MFCC and LPC are used here.

3.3 Classification with CNN
This section explains the operation of the CNN. A CNN consists of an input layer, convolutional layers, pooling layers, fully connected layers, and an output layer. The network has multiple layers, each of which is made up of a number of two-dimensional planes and a number of separate neurons. The majority of CNN architectures accept images as inputs, which allows particular properties to be built into the architecture, making the forward function easier to implement, reducing the number of parameters, and accelerating model training. Figure 4 illustrates the typical CNN architecture, which consists of a convolutional and a max-pooling layer.

Fig. 4. Architecture of CNN

The convolution layer applies the CNN weights to its input, the bias term is added to the outputs, the activation function is then applied, and finally the pooling layer is used. The proposed model contains three convolution layers for each dataset used in this work. Adam optimization is used for training. The whole experiment is carried out on the Google Colab Pro platform. The input shape is (526, 13, 1) for the RAVDESS dataset and (297, 13, 1) for the TESS dataset, depending on where LPC is used; the three values represent the height, width, and color channel, respectively. Input data are fed into the first 2D convolutional layer (Conv2D) with 64 filters and a ReLU activation function. This process is repeated twice more with the second Conv2D pair layer (128 filters) and the third Conv2D pair layer (256 filters). The feature matrix is post-processed to equal dimensions. The output is then routed through a 2D max-pooling layer with a pool size of (2, 2). After max pooling, the output is passed to a dropout layer with a dropout rate of 0.3, which removes some values in order to decrease the likelihood of overfitting. Feature learning is now complete. The Flatten layer then converts the result into a one-dimensional vector, which is simple to use for classification. Finally, a fully connected Dense layer with 8 units for the RAVDESS dataset and 7 units for the TESS dataset, with a Softmax activation function, is applied to this vector to produce the final prediction.
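To make the layer sequence concrete, the sketch below traces how the tensor shape changes from input to output. It performs shape arithmetic only and is not the authors' model: the text does not state the kernel size, stride, or padding, so stride 1 with "same" padding in the convolutions and floor division in the pooling layer are assumptions, and all function names are hypothetical.

```python
def conv2d_same(shape, filters):
    """Output shape of a stride-1 convolution with 'same' padding (assumed)."""
    height, width, _ = shape
    return (height, width, filters)


def max_pool(shape, pool=(2, 2)):
    """Output shape of max pooling without padding (floor division, assumed)."""
    height, width, channels = shape
    return (height // pool[0], width // pool[1], channels)


def trace_shapes(input_shape, n_classes, filters=(64, 128, 256)):
    """List (layer name, output shape) pairs for the described layer order.

    Dropout is omitted because it does not change the shape.
    """
    shape = input_shape
    trace = [("input", shape)]
    for f in filters:
        shape = conv2d_same(shape, f)
        trace.append(("conv2d_%d" % f, shape))
    shape = max_pool(shape)
    trace.append(("max_pool", shape))
    flat = shape[0] * shape[1] * shape[2]
    trace.append(("flatten", (flat,)))
    trace.append(("dense_softmax", (n_classes,)))
    return trace


if __name__ == "__main__":
    for name, shape in trace_shapes((526, 13, 1), n_classes=8):
        print(name, shape)
```

Under these assumptions the flattened vector is large compared with the 7 or 8 output units, which is one reason a dropout layer before the dense layer helps limit overfitting.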

4 Experimental Analysis
A number of parameters affect the overall performance of an SER system. Among these key elements, the proposed approach addresses audio sample quality, extracted features, and the classification model. The English dataset TESS is the primary focus of our work, and the popular RAVDESS dataset is also used to compare our results with other recent studies. The CNN model classifies emotions using 13 MFCC and 13 LPC coefficients as features. The TESS dataset is more balanced than the RAVDESS dataset and gives a moderately steady result for each model. The confusion matrix for the highest accuracy obtained on the TESS dataset is given in Table 1. Each row of this matrix corresponds to one emotion class, and each cell gives the number of samples of that class identified as the emotion in the corresponding column. The sum of the entries in each row is therefore the number of samples in that class, and the diagonal entries are the correctly classified emotions.

Table 1. Confusion matrix of highest accuracy obtained for TESS dataset.

|         | Sad | Angry | Neutral | PS | Disgust | Happy | Fear |
|---------|-----|-------|---------|----|---------|-------|------|
| Sad     | 78  | 0     | 0       | 1  | 1       | 0     | 0    |
| Angry   | 0   | 79    | 0       | 0  | 0       | 1     | 0    |
| Neutral | 0   | 0     | 80      | 0  | 0       | 0     | 0    |
| PS      | 0   | 0     | 0       | 78 | 0       | 2     | 0    |
| Disgust | 0   | 0     | 0       | 0  | 80      | 0     | 0    |
| Happy   | 0   | 0     | 1       | 1  | 0       | 78    | 0    |
| Fear    | 0   | 0     | 0       | 0  | 0       | 0     | 80   |

The table above gives a detailed view of the classifier's recognition accuracy. With the CNN, we achieved 99% accuracy on the TESS dataset and 73% accuracy on the RAVDESS dataset. Our main objective is to recognize the maximum number of emotions with a high success rate using appropriate features in this SER system. The class-wise f1-score for the TESS dataset is presented in Fig. 5.
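The overall accuracy and the class-wise f1-scores can be recomputed from the confusion matrix in Table 1, as in the sketch below. The matrix values are those read from Table 1, with the Neutral row taken as 80 correct out of 80; rows are treated as actual classes and columns as predicted classes. The function names are illustrative.

```python
LABELS = ["Sad", "Angry", "Neutral", "PS", "Disgust", "Happy", "Fear"]

# Rows: actual class, columns: predicted class (values read from Table 1).
CONFUSION = [
    [78, 0, 0, 1, 1, 0, 0],
    [0, 79, 0, 0, 0, 1, 0],
    [0, 0, 80, 0, 0, 0, 0],
    [0, 0, 0, 78, 0, 2, 0],
    [0, 0, 0, 0, 80, 0, 0],
    [0, 0, 1, 1, 0, 78, 0],
    [0, 0, 0, 0, 0, 0, 80],
]


def accuracy(cm):
    """Fraction of samples on the diagonal of the confusion matrix."""
    correct = sum(cm[i][i] for i in range(len(cm)))
    total = sum(sum(row) for row in cm)
    return correct / total


def per_class_f1(cm):
    """F1-score of every class: harmonic mean of precision and recall."""
    n = len(cm)
    scores = []
    for k in range(n):
        true_pos = cm[k][k]
        actual = sum(cm[k])                        # row sum
        predicted = sum(cm[i][k] for i in range(n))  # column sum
        precision = true_pos / predicted if predicted else 0.0
        recall = true_pos / actual if actual else 0.0
        denom = precision + recall
        scores.append(2 * precision * recall / denom if denom else 0.0)
    return scores


if __name__ == "__main__":
    print("accuracy:", accuracy(CONFUSION))
    for label, f1 in zip(LABELS, per_class_f1(CONFUSION)):
        print(label, round(f1, 4))
```

The diagonal sums to 553 of 560 test samples, which is consistent with the reported accuracy of about 99% on TESS.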


Fig. 5. Class wise f1 Score for TESS dataset.

4.1 Comparative Study

After obtaining the prediction accuracy, we compare our proposed work with several existing, accepted studies. The comparison is given in Table 2.

Table 2. Comparative study based on these standard databases.

| Dataset | Research work on SER | Features | Classifiers | Accuracy | Works on |
| --- | --- | --- | --- | --- | --- |
| TESS | Agarwal et al. [23] | MFCC, LPC, PSD, Energy, Entropy, Formant frequency, Pitch | DNN-DHO, DAE | 97.85% | Evaluate performance of speech emotion recognition |
| TESS | Choudhury et al. [22] | RMFCC, Epoch Location, Strength of Epoch, Slope of Strength of Epoch, Energy, Entropy of Energy, Zero-Crossing Rate, Spectral Centroid, Spectral Spread, Spectral Entropy, Spectral Flux, Spectral Rolloff | SMO | 99% | Recognize emotions from uttered speech |
| TESS | Proposed Method | MFCC, LPC | Conv2D | 99% | Recognize speech emotion |
| RAVDESS | Patil et al. [24] | MFCC, FFT, ZCR, Energy, Pitch and GVV | SVM | 70.19% | Recognize emotions from speech |
| RAVDESS | Issa et al. [25] | MFCC + Chromagram + Mel-spectrogram + Spectral Contrast + Tonnetz | Deep CNN | 71.61% | Recognize uttered speech emotions |
| RAVDESS | Proposed Method | MFCC, LPC | Conv2D | 73% | Recognize speech emotion |

The comparative study above shows that, on the TESS dataset, Agarwal et al. [23] achieved 97.85% accuracy and Choudhury et al. [22] achieved 99% accuracy, which is similar to our proposed work. However, those studies used a larger number of features to obtain their results, whereas our proposed work uses only 13 coefficients each of MFCC and LPC to build a model with 99% accuracy. Similarly, using the same 13 coefficients of each, we achieved 73% accuracy on the RAVDESS dataset.
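To make the LPC part of this compact feature set concrete, the sketch below estimates 13 LPC coefficients for a single frame using the autocorrelation method with the Levinson-Durbin recursion. This is one common way to compute LPC and is an assumption here: the framing, windowing, and library actually used in this work are not stated in this section, and the function names and the test frame are hypothetical.

```python
def autocorrelation(frame, max_lag):
    """Autocorrelation r[0..max_lag] of one frame of samples."""
    n = len(frame)
    return [
        sum(frame[i] * frame[i + lag] for i in range(n - lag))
        for lag in range(max_lag + 1)
    ]


def lpc_coefficients(frame, order=13):
    """Estimate a[1..order] such that x[n] is approximated by
    sum(a[k] * x[n - k] for k in 1..order), via Levinson-Durbin."""
    r = autocorrelation(frame, order)
    a = [0.0] * (order + 1)  # a[0] is unused
    error = r[0]             # prediction error energy
    for i in range(1, order + 1):
        if error <= 0.0:     # silent or perfectly predicted frame
            break
        # Reflection coefficient for model order i.
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / error
        updated = a[:]
        updated[i] = k
        for j in range(1, i):
            updated[j] = a[j] - k * a[i - j]
        a = updated
        error *= 1.0 - k * k
    return a[1:]


# Hypothetical frame: a decaying signal with x[n] = 0.9 * x[n - 1].
frame = [0.9 ** n for n in range(200)]
coeffs = lpc_coefficients(frame, order=13)
print(len(coeffs), round(coeffs[0], 3))
```

For the hypothetical frame above, each sample is 0.9 times the previous one, so the first coefficient should recover that factor and the remaining twelve should be close to zero. In an SER pipeline such a vector would typically be computed per frame and combined with the MFCC vector, but how the two are combined here is not described in this section.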

5 Conclusion

In this study, a two-dimensional CNN-based SER system is developed. The whole experiment is carried out on the two standard datasets TESS and RAVDESS, considering the standard seven and eight emotions respectively. Achieving an acceptable level of success with a particular feature set and an appropriate classifier is one of the primary goals of SER studies. For each dataset, a two-dimensional CNN (Conv2D) is applied, and accuracy is evaluated on the combination of 13 MFCC and 13 LPC coefficients. With this combination, the success rate is 99% on the TESS dataset and 73% on the RAVDESS dataset. In terms of simplicity and applicability, our study holds its own on the aforementioned datasets. Our aim is to develop a hybrid system with fewer features. In the future, we hope to achieve improved accuracy by combining additional features with deep neural network-based classifiers, which will improve SER systems for all common languages.


References

1. Yuvaraj, R., et al.: Detection of emotions in Parkinson’s disease using higher order spectral features from brain’s electrical activity. Biomed. Signal Process. Control 14, 108–116 (2014)
2. Kuchibhotla, S., Vankayalapati, H.D., Anne, K.R.: An optimal two stage feature selection for speech emotion recognition using acoustic features. Int. J. Speech Technol. 19(4), 657 (2016)
3. Ingale, A.B., Chaudhari, D.: Speech emotion recognition (2012)
4. Özseven, T.: A novel feature selection method for speech emotion recognition. Appl. Acoust. 146, 320–326 (2019). https://doi.org/10.1016/j.apacoust.2018.11.028
5. El Ayadi, M., Kamel, M.S., Karray, F.: Survey on speech emotion recognition: features, classification schemes, and databases. Pattern Recogn. 44(3), 572–587 (2011). https://doi.org/10.1016/j.patcog.2010.09.020
6. Zhang, Z., Coutinho, E., Deng, J., Schuller, B.: Cooperative learning and its application to emotion recognition from speech. IEEE/ACM Trans. Audio Speech Lang. Process. 1 (2014). https://doi.org/10.1109/TASLP.2014.2375558
7. Pérez-Espinosa, H., Reyes-García, C.A., Villaseñor-Pineda, L.: Acoustic feature selection and classification of emotions in speech using a 3D continuous emotion model. Biomed. Signal Process. Control 7(1), 79–87 (2012). https://doi.org/10.1016/j.bspc.2011.02.008
8. Huang, Z., Dong, M., Mao, Q., Zhan, Y.: Speech emotion recognition using CNN. In: Proceedings of the 22nd ACM International Conference on Multimedia, pp. 801–804 (2014)
9. Dupuis, K., Pichora-Fuller, M.K.: Recognition of emotional speech for younger and older talkers: behavioural findings from the Toronto emotional speech set (2011)
10. Livingstone, S.R., Russo, F.A.: The Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS): a dynamic, multimodal set of facial and vocal expressions in North American English. PLOS ONE 13(5), e0196391 (2018). https://doi.org/10.1371/journal.pone.0196391
11. Zhao, J., Mao, X., Chen, L.: Speech emotion recognition using deep 1D & 2D CNN LSTM networks. Biomed. Signal Process. Control 47, 312–323 (2019)
12. Aouani, H., Ayed, Y.B.: Speech emotion recognition with deep learning. Procedia Comput. Sci. 176, 251–260 (2020). https://doi.org/10.1016/j.procs.2020.08.027
13. Yoon, S., Byun, S., Jung, K.: Multimodal speech emotion recognition using audio and text. In: 2018 IEEE Spoken Language Technology Workshop (SLT), pp. 112–118. IEEE (2018)
14. Morais, E., Hoory, R., Zhu, W., Gat, I., Damasceno, M., Aronowitz, H.: Speech emotion recognition using self-supervised features. In: ICASSP 2022–2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 6922–6926. IEEE (2022)
15. Meng, H., Yan, T., Yuan, F., Wei, H.: Speech emotion recognition from 3D log-mel spectrograms with deep learning network. IEEE Access 7, 125868–125881 (2019). https://doi.org/10.1109/ACCESS.2019.2938007
16. Koduru, A., Valiveti, H.B., Budati, A.K.: Feature extraction algorithms to improve the speech emotion recognition rate. Int. J. Speech Technol. 23(1), 45–55 (2020). https://doi.org/10.1007/s10772-020-09672-4
17. Basu, S., Chakraborty, J., Bag, A., Aftabuddin, M.: A review on emotion recognition using speech. In: 2017 International Conference on Inventive Communication and Computational Technologies (ICICCT), pp. 109–114. IEEE (2017)
18. Kerkeni, L., Serrestou, Y., Mbarki, M., Raoof, K., Mahjoub, M.A.: Speech Emotion Recognition: Methods and Cases Study (2018)
19. Hasan, H.M., Islam, M.A.: Emotion recognition from Bengali speech using RNN modulation-based categorization. In: 2020 Third International Conference on Smart Systems and Inventive Technology (ICSSIT), pp. 1131–1136. IEEE (2020)
20. Liu, Z.-T., Xie, Q., Min, W., Cao, W.-H., Mei, Y., Mao, J.-W.: Speech emotion recognition based on an improved brain emotion learning model. Neurocomputing 309, 145–156 (2018). https://doi.org/10.1016/j.neucom.2018.05.005
21. Zhu, L., Chen, L., Zhao, D., Zhou, J., Zhang, W.: Emotion recognition from Chinese speech for smart affective services using a combination of SVM and DBN. Sensors 17(7), 1694 (2017). https://doi.org/10.3390/s17071694
22. Choudhury, A.R., Ghosh, A., Pandey, R., Barman, S.: Emotion recognition from speech signals using excitation source and spectral features. In: 2018 IEEE Applied Signal Processing Conference (ASPCON), pp. 257–261. IEEE (2018)
23. Agarwal, G., Om, H.: Performance of deer hunting optimization based deep learning algorithm for speech emotion recognition. Multimedia Tools Appl. 80(7), 9961–9992 (2020). https://doi.org/10.1007/s11042-020-10118-x
24. Patil, S., Kharate, G.: Implementation of SVM with SMO for Identifying Speech Emotions using FFT and Source Features (2021)
25. Issa, D., Demirci, M.F., Yazici, A.: Speech emotion recognition with deep convolutional neural networks. Biomed. Signal Process. Control 59, 101894 (2020). https://doi.org/10.1016/j.bspc.2020.101894

Intellectual Property in Human Genomics in India

Aranya Nath1 and Gautami Chakravarty2

1 Damodaram Sanjivayya National Law University, Visakhapatnam 531035, AP, India

[email protected]

2 Kalinga Institute of Industrial Technology, Bhubaneswar 751024, Odisha, India

[email protected]

Abstract. “Genome” and “human genomics” are inseparable concepts. Genetic science has grown with the rapid development of the biological sciences and the biotechnology industry. Patents have become crucial in pharmaceuticals and genetic science, and this paper discusses their role. The main reason for selecting this area is to analyze the gaps concerning privacy in DNA fingerprinting methods and human genetic science. Further, the paper explores whether international intellectual property encompasses more than ensuring intellectual property protection and transfer. Ethical and legislative challenges arising from the consequences of technological advances and information are becoming widespread. This chapter analyses the global problems that may arise in the future regarding intellectual property and growth in the genomics arena. Genomics affects everyone and is the consequence of international scientific collaboration. It represents the intersection of intellectual property, development, and globalization. This chapter focuses on the physiological, legal, ethical, and sociological issues and implications of genetics. It examines the impact of genomics on individuals and on society as a whole, and the patentability of human genes in India.

Keywords: Genomics · Biotechnology · Intellectual Property Laws · Privacy · Ethical norms · Health · Patents

Abbreviations

1. IPR - Intellectual Property Rights
2. IPO - Indian Patent Office
3. cDNA - Complementary Deoxyribonucleic Acid
4. DNA - Deoxyribonucleic Acid
5. RNA - Ribonucleic Acid
6. TRIPS - Trade-Related Aspects of Intellectual Property Rights
7. R&D - Research and Development
8. EST - Expressed Sequence Tag

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 S. Bhattacharyya et al. (Eds.): Human 2023, STHC, pp. 54–64, 2023. https://doi.org/10.1007/978-981-99-3478-2_6


1 Introduction

“Genetics” or “genome science” comes under the domain of bioinformatics. This biomedical field arose in response to the overwhelming need for a flexible and intelligent method of storing, maintaining, and accessing huge and complex biological data sets. Rapid advances in genomic and other molecular research tools and in information technology have resulted in a massive volume [1] of molecular genetic data. The core objective of bioinformatics at the start of the genomic revolution was to develop and manage a database holding biological information such as nucleotide and amino acid sequences. Building this database posed architectural challenges and required an interface through which academics could access current data and enter new or changed data. The need to establish massive datasets such as germplasm and DNA databases, and to compare the DNA sequence data emerging from the human genome and other genome sequencing [2] initiatives, prompted the development of bioinformatics. Bioinformatics now includes:

• Protein structural analysis.
• Gene and protein functional information.
• Patient data.
• Metabolic pathways from many species.

DNA is intrinsically contradictory. With exceptional precision, distinctive patterns in an individual’s DNA can confirm or disprove that person’s purported presence at the scene of a crime. In contrast, the similarity between people’s DNA sequences can be used to prove their link to others, primarily through distant ancestors, to the benefit of genealogists, anthropologists, and archaeologists. The increasing use of DNA as a tool for identifying and interpreting people and cohorts raises questions about how cultures and legal systems define personhood, [3] the cornerstone of human-animal rights within national and international laws. As a result, debates about personal and societal values are inherent in all privacy and intellectual property laws. It also raises concerns regarding abstraction. Do we see humans as simple expressions of genetic data that can be mined and commercialized, or as beings outside the scope of commercial use, as typified by ancient English jurisprudence about ownership and use? Intellectual property rights are essential for the success of [4] any technical invention. IPR is critical in designing methods for disseminating and transferring technology that will benefit society the most. Biotech intellectual property takes several forms, and one item may attract more than one kind of IP protection. The most prevalent form of intellectual property protection is patenting. The purpose of patents is to encourage innovators to create. Without incentives to promote innovation, future improvements will stagnate. Because of such incentives, individuals are often willing to invest in risky research that might otherwise go in vain. Gene patents are a crucial pillar of the current biotechnology sector. The understanding of genetics has changed dramatically during the last few decades. The gene was first understood as a basic unit of heredity, conveying traits from generation to generation. Once the DNA structure, or the ‘genetic code,’ was discovered, gene fragments became a data source. A gene is a distinct DNA unit holding the information required to create a specific protein. Proteins are the building blocks of cellular structures and carry out most cellular operations. As a result, genes may be referred to as the foundation of life.


1.1 Statement of Problem

The COVID-19 outbreak, the millennium’s most terrible disaster, pushed the world’s extensive international system into turmoil and brought numerous issues to the forefront. Human genomics first came into the picture when controversy arose over the patentability of human genes in European countries. In India, given the stringent patent protection system for inventions in human genomics, critical thinking about the patentability of genes for anti-cancer drugs is required. In the midst of COVID-19, when everyone was frightened, medical practitioners and scientists were compelled to research and test potential gene-based vaccines through clinical trials. This raises the issue of whether patenting genes for COVID-19 vaccines is acceptable. Human beings are not yet ready for clinical trials of human genes; moreover, patenting human genes would lead to various bioethical issues. A uniform legal framework is therefore needed that addresses the ethical, moral, and social ramifications of such patenting, which necessitate close examination and study.

1.2 Need for the Research

In an overview of patents in human genomics, Dinesh Yadav discusses recent developments in biotechnology and demarcates the relevant patents, including those focusing on genomics developments and those involving deoxyribonucleic acid, genes, sequences, and other significant genetic material and gene technologies. T.R. Sivaramjani and Samir K. Brahmachari’s research on human genome studies and intellectual property focuses on the ethical and legal rights of human genome sequence patenting. Verkey E., “Patenting of Medical Methods – Need of the Hour,” Journal of Intellectual Property Law & Practice, 2007; 2(2), discusses an additional risk: granting medical technique patents may raise medical costs through royalty payments imposed by the patent holder.
Myriad Genetic Laboratories Inc., for example, used patent rights to prevent access to genetic technology used to diagnose cancer. Many people carry cancer-causing mutations, but early discovery can help in avoiding or treating the disease effectively. Genetic tests are available to determine the existence of BRCA1 and BRCA2 mutations. Myriad Genetics owns most of the patents for these diagnostic testing methods. The cumulative consequence of Myriad’s European patents is its monopoly on all breast and ovarian cancer diagnostic tests throughout Europe. The patents cover all diagnostic procedures, particular mutations, and diagnostic kits. Myriad has also patented work on BRCA2 gene sequencing. The patent covers somatic mutations in the gene linked to breast and other malignancies and their application in diagnostics, therapy, [5] and pharmacological screening. Basheer S., Purohit S., and Reddy P., in “Exclusions from Patents that Advance Public Health Goals,” a 2011 study on exclusions from patentable subject matter and exceptions and limitations to the rights, note that the “incentive” or “reward” theory is one of the many ideas that exist today to explain the goal and logic of the patent system. More importantly, the challenges are unique in terms of technology. Evidence that patents are helpful in a capital-intensive industry like medicines may not transfer easily to sectors like semiconductors and information technology. Another issue is whether developing and least developed nations will end up importing technology on a net basis. The question is whether patent systems promote technology transfer to these countries or impede the advancement these nations could have made had they been allowed to imitate and learn freely, as many developed nations were before TRIPS.

1.3 Research Methodology

The research is purely doctrinal, analytical, and exploratory. In this study, the researchers evaluate the patentability of human genomics. They use the doctrinal method, collecting the information for the first chapter from various articles, journals, e-books, and other secondary sources. Next, they use an analytical method to analyze the social, legal, and ethical issues of the patentability of human genomics. Lastly, they use the exploratory research method, which describes matters not discussed before, to explain the patentability of genes in India. The researchers therefore seek to establish the lacunae in the legislation by providing suitable examples and judicial precedents.

1.4 Objective of the Study

• To present an overview of genetics and gene patents from a scientific viewpoint.
• To examine numerous laws, policies, and judicial approaches to gene patentability in India and other countries.
• To determine the social, legal, ethical, and moral consequences of gene patents.
• To provide solutions or remedies to the challenges that come with gene patenting.

1.5 Scope of the Study

The study aims to understand the concept of human genomics and analyses the global problems that may arise in the future regarding intellectual property and growth in the genomics arena. The study also explores the physiological, legal, ethical, and sociological issues and implications of genetics.

2 Human Genes – Scientific Overview

For many years, it has been known that all living things inherit traits from their parents. Johann Gregor Mendel (1822–1884), the “father of genetics,” undertook a long search for patterns of inheritance. He conducted experiments on pea plants and derived the law of segregation [6] and the law of independent assortment. Through his experiments, Mendel deduced that biological traits are inherited from parent organisms. Although his work was published in 1865, it was only in 1900 that his findings were recognized and understood, through independent research. Mendel hypothesized a factor that transmits traits from parents to their children, namely [7] “genes.” Mendel himself, however, never used the term “gene” in his observations. Charles Darwin used the word “gemmule” for genetic units, later known as chromosomes.


Human cells contain a nucleus, within which are tightly coiled structures called chromosomes. Humans have 23 pairs of chromosomes, with one chromosome of each pair coming from each parent. Each chromosome [8] includes thousands of genes. Genes carry our traits through generations and consist of deoxyribonucleic acid (DNA). DNA contains all the information needed to build and maintain an organism. Passing on all or part of an organism’s DNA helps provide continuity from one generation to the next while allowing small changes that contribute to the diversity of life. Almost all living cells contain DNA. Organisms are classified into eukaryotes and prokaryotes. Eukaryotes consist of cells that contain a nucleus, and their DNA is present in the nucleus. Prokaryotes, on the other hand, consist of cells that lack a nucleus, so their DNA resides directly in the cell’s [9] cytoplasm. Except for some viruses, whose genes consist of a closely related compound called RNA, all living organisms contain DNA. Until the 1950s, the structure of DNA was unknown. Knowledge of the structure of DNA [10] has proven very helpful in understanding the fundamentals of genetics. At the most basic level, all DNA comprises a series of smaller molecules called nucleotides. Each nucleotide consists of three main components: a nitrogen-containing region called a nitrogenous base, a carbon-based sugar molecule called deoxyribose, and a phosphorus-containing region called a phosphate group. There are four different types of nucleotides, each identified by a specific nitrogenous base. A DNA molecule comprises two strands of nucleotides coiled together like the parallel handrails of a twisted ladder. Both sides of the ladder consist of sugars and phosphates, and the bonded nitrogenous base pairs form the rungs. The base sequence in DNA provides the code for protein structure. A protein comprises a chain of amino acids, and the order of the amino acids determines the unique properties of each protein.
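The relationship between base pairing and the protein code described above can be illustrated with a short Python sketch. It is a simplification for illustration only: the base letters A, T, G, and C and the small codon table are standard biology but are not taken from this chapter, the table is deliberately partial, and the codons are read directly from the coding DNA strand, skipping the intermediate RNA step. The function names are hypothetical.

```python
# Pairing rule between the two strands: A pairs with T, G pairs with C.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

# Partial codon table (DNA coding-strand codons); illustrative subset only.
CODON_TABLE = {
    "ATG": "Met", "TTT": "Phe", "GGC": "Gly",
    "GCT": "Ala", "AAA": "Lys", "TGG": "Trp",
    "TAA": "Stop", "TAG": "Stop", "TGA": "Stop",
}


def complement_strand(sequence):
    """Return the base-by-base partner strand of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in sequence)


def translate(coding_sequence):
    """Read codons three bases at a time until a stop codon is reached.

    Codons missing from the partial table are reported as "?".
    """
    amino_acids = []
    for start in range(0, len(coding_sequence) - 2, 3):
        residue = CODON_TABLE.get(coding_sequence[start:start + 3], "?")
        if residue == "Stop":
            break
        amino_acids.append(residue)
    return amino_acids


if __name__ == "__main__":
    gene = "ATGTTTGGCTAA"  # hypothetical 12-base sequence
    print(complement_strand(gene))
    print("-".join(translate(gene)))
```

Changing a single base in the hypothetical sequence changes the corresponding codon, which is how the order of bases determines the order of amino acids in the protein.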
DNA packaging refers to fitting DNA into a compact structure inside the cell. The lengthy double-stranded DNA is tightly looped, coiled, and folded to fit inside the cell. Eukaryotes wrap their DNA around proteins called histones within the nucleus. Chromatin refers to eukaryotic DNA together with the histone proteins [11] that keep it in a coiled structure. Furthermore, DNA is compacted by a twisting process known as supercoiling. In both eukaryotes and prokaryotes, such closely packed DNA is arranged into structures known as chromosomes. Except for eggs, sperm, and red blood cells, every cell in our body carries a complete set of chromosomes in its nucleus. Different organisms have differently shaped chromosomes. In eukaryotes, chromosomes are often seen as X-shaped structures. Humans have 23 pairs of chromosomes, including the two sex chromosomes, X and Y. A male has XY chromosomes, whereas a female has XX chromosomes.

2.1 Genome Engineering

Genetic engineering is the alteration of an organism’s genotype using recombinant DNA technologies to change the organism’s genome and attain desirable features. It is sometimes called gene editing, gene modification, or gene transfer. In addition to recombinant DNA technology, genetic engineering [12] employs compression molding, bio ballistics, and electro and porous chemical technologies. The DNA of all living creatures comprises the same nucleotide building blocks, allowing genes from one organism to be read by [13] another organism.


In the simplest terms, genetic engineering is accomplished [14] through the following procedures:

• Gene identification and isolation
• Modification of genes so they can be transferred to another organism
• Gene removal
• Insertion of the isolated gene into the host organism through a vector
• Evaluating the success of the resultant gene combination
• The successful completion of gene cloning results in a specific DNA sequence, which can be used commercially for protein production, [15] genetically modified microorganisms, transgenic plants, and transgenic animals.

2.2 Application of Genomics

Medicine and pharmaceuticals: Genetic engineering is used to create a variety of treatments and therapies. Genetically engineered drugs include insulin, growth hormones, Taxol, and interferon. Transgenic animals are also used in the manufacture of medicinal products, a procedure known as pharming. Gene therapy has also gained popularity in the medical industry because it can cure and prevent genetic disorders. New scientific findings on this subject emerge daily.

Food sector: Many genetically modified foods and substances are now available due to genetic manipulation. Transgenic plants exhibit several enhanced properties due to genetic changes, such as faster growth rate, disease resistance, better taste, extended shelf life, and lower water demand.

3 Social, Legal, and Ethical Issues in Patenting Human Genomics

The number of gene patent applications continues to rise with the fast growth of biological science and related fields. Generally, patents are granted as a social compact between the inventor and society. The patent system is structured to encourage innovation. It promotes innovation by allowing individual inventors to recoup their R&D expenditures and benefit from their technical advances. Given that inventions often benefit society by providing higher-quality products or methods of production, it has long been assumed that patent protection benefits the community as an incentive for innovation. Although gene patents are essential to biotechnological advancement, issues such as scientific research, healthcare access, and moral considerations must all be [16] considered.

3.1 Gene Patents and Research-Related Issues

Over the last several decades, advances in medicine and healthcare have resulted in patentable innovations and discoveries [17], such as newly identified genes that create a specific protein or a unique chemical entity that may be sold as a medicine. Gene patents act as gateway patents in diagnostic, pharmaceutical, and biomedical research, providing a financial incentive for scientists to do more studies. However, there is a growing fear that these patents will inhibit future investigations into the functioning and potential applications of patented genes. Gene patents have a comprehensive scope since the holder claims the isolated gene sequence itself as the invention. The patent owner’s rights are not limited to the product generated by the technique or process indicated in the patent. Research equipment such as chemical reagents is unlikely to be affected by patents. Still, patents can be a risk to researchers when the patent holder makes the patented innovations [18] available only under onerous licensing terms. If the license is too expensive, conducting research becomes cost-prohibitive, and when more than one license is necessary, the procedure becomes more time-consuming. Patent thickets arise when patents are granted on smaller and more commonplace genes, creating practical and economic problems such as slowing the rate at which clinical information is disseminated and obstructing the development of genetic technology. Proscribing gene patents, however, may also reduce the quantity of socially valuable information available to the general public.

3.2 Patenting Human Genes: The Ethical Debate

The International Human Genome Project, which included scientists from the United States, the United Kingdom, France, Germany, Japan, and China, produced the world’s first human genome sequence. There are four reasons why patenting human genes is said to affect human dignity: it alters our genetic integrity, it equates to human ownership, it commercializes bodily parts that should not become commodities, and it privatizes what should be considered collective property. Opponents of gene patents believe that changing human genetic content to develop different and better humans interferes with nature and natural processes, and that an incorrect modification of our genetic material might eventually jeopardize genetic integrity. Materialists argue that humans can control life and manipulate its genetic structure in a streamlined and reliable way. In contrast, vitalists argue that the complexity and autonomy of life defy [19] exact science. Ethical considerations regarding the patentability of human genes include uniqueness and fungibility, and sanctity or inviolability, which turn on whether ethical and moral boundaries coincide with what is legally enforced. Sanctity encompasses two core values: great regard for life and care for the human psyche. If these two standards are violated, patenting any life, including human DNA, is immoral. Additionally, patenting human genes may lead to the use of genetic testing by employers, insurance companies, government agencies, and other groups. Tampering with the human genome can harm future generations. Participatory research approaches centered on communities help ensure that genetic research reflects community priorities.

4 Intellectual Property and Role of Human Genomics in India

Intellectual property rights balance the moral and economic rights of creators and inventors with broader interests and societal needs. The rationale behind patents and copyrights is to provide incentives and rewards for inventors so as to benefit society. The Patents Act makes provisions to promote the advancement of science and technology and incentivizes inventors and investors by giving them exclusive rights. The inventor makes the invention available to the public and exercises exclusive rights over it for a limited time.


Encouraging economic benefits from exercising exclusive rights Inventors invent, and shareholders invest. From an evolutionary point of view states, intellectual Property is a private right and protected like any other right Tangible property. Still, for developing countries, IP is a public good that should use to promote economic development. As we all know, India is a developing country with a lot of advancement in the field of Science, technology, and pharmaceuticals. Henceforth, it should enable a strong and effective protection mechanism to reduce the risks of getting the invention to be duplicated. As a result, the Indian Patents Act of 1970 [20] came to the forefront with many amendments for better protection of scientific inventions as the days are changing with modern times. 4.1 Indian Patents Act 1970 India’s patent policy seeks to balance development and innovation, with the period of each patent awarded extended to 20 years from the date of application. Eligibility of patent application in India, an invention must be original, non-obvious, and helpful. The Patents Amendment Act of 2002 modified the concept of “innovation” to conform to TRIPS Article 27. However, Sects. 3 and 4 of the Patents Act of 1970 provide for a list containing subject matter that is not patentable. It includes inventions that cause a threat to the public order of humans, animals, plants, health, or the environment, scientific principles, abstract ideas, and the discovery of any living or non-living item in nature. Discovery is the process that produces a fresh product, a new technique, or a combination of the two. Patent protection is granted only when the invention is novel in all aspects and if it is a combination. 4.2 Gene Patents Under the Patents Act of 1970 The controversy over whether biotechnological creations are inventions or merely discoveries has raged on for a long time and relates to gene patentability. 
There is no established standard for establishing gene patentability. The patent law proposes a set of criteria to resolve the patentability subject effectively. The subject matter’s patentability will be calculated based on its uniqueness, utility/industrial application, and inventive step or non-obviousness. Novelty Regardless of the subject matter to be patented, novelty is a necessary criterion under patent law. The word “novelty” is substituted in the Patents Act of 1970 by the term “new invention,” which indicates that the subject matter has not gone into the public domain or has not yet become established of the current state of the art [21] is inadmissible for patent protection. They are naturally occurring substances whose characteristics and composition are recognized initially in the case of DNA. As a result, isolating DNA from its natural condition without human involvement in the [22] operation of the specified gene or gene sequence does not qualify for patentability. Non-obviousness or Inventiveness The Indian patent law requires an inventive step to grant a patent to an invention, a feature of the invention that is not obvious to a person skilled in the art. It is essential

A. Nath and G. Chakravarty

to establish the non-obviousness of the invention: it is not enough that the invention was not previously known; a person of ordinary skill in the art must also have been unable to arrive at it without help. The burden of supporting the claim falls on the patent applicant, who must establish that the invention improves on the existing prior art. Demonstrating economic value is straightforward, as biotechnology advances have various applications in medicine and diagnostics.

Utility or Industrial Application. Patent law in India requires that biotechnology innovations be capable of industrial replication, which can lead to a total monopoly and high-priced medicines. It is reasonable to treat biotech patents as broadly industrially applicable, as biotechnology ideas can be developed, used in an industry, and reproduced many times. The guidelines in the Manual of Patent Procedure for examining biotechnology innovations state that gene sequences and DNA sequences with unknown functions do not fulfill the industrial application requirement. Fragments/ESTs (Expressed Sequence Tags) are permissible, provided they meet the requirements of utility and industrial applicability.

5 Right to Health and Genetic Patents

Formulating patent policy in India is complicated by the Indian Constitution, which protects the right to health and medical care as a basic human right. Article 25 of the Universal Declaration of Human Rights and Article 12 of the International Covenant on Economic, Social and Cultural Rights require parties to respect everyone's right to the best possible physical and mental health in a welfare state [23]. The right to health comprises four major components: availability, accessibility, acceptability, and quality of healthcare. All uses of a gene must pass through the original gene patent, or 'gatekeeper patent,' before the gene can be used in an invention. Such patents have an 'anti-commons' effect in society and are known as 'blocking patents.' A blocking patent covers critical parts of a technology so as to prevent others from inventing around it, leading to restricted licensing. Under Indian patent law, a person can apply to the Controller of Patents for a compulsory license where the reasonable requirements of the public with respect to the patented invention are not met, or where the patented product is not available to the public at a reasonably affordable price. The Act also provides an exemption from patent protection for research, experimentation, or education. Section 4 of the Competition Act can be invoked when an enterprise abuses its dominant market position. This principle may apply to some patent holders whose permission is required to manufacture downstream gene products. Still, it is not a panacea for resolving conflicts between gene patents and the right to health.

6 Conclusion

Genes determine the traits of all living things and carry them from parents to offspring. Genetic engineering, that is, the deliberate mutation or manipulation of an organism's genes through biotechnology, has uses in agriculture, medicine, health, the environment, and industry. There

Intellectual Property in Human Genomics in India

are strong reasons for and against patenting genetic information in the current political atmosphere. Still, opposition is unlikely to prevent gene patenting. To ensure that gene patents are used in a socially worthwhile manner, lawmakers must explore various techniques within and outside existing patent rules. To some extent, a precise definition of 'microorganism' can clear up misunderstanding about Indian law's stance on gene patenting. On the other hand, laxer standards for biological breakthroughs than for chemical breakthroughs might result in evergreening and invalid patents. As a result, India requires explicit criteria for genetic patenting, and must adequately adapt the fundamental standards of patentability (novelty, non-obviousness, and utility) to biotechnology.

References

1. Chauhan, S.: Computational biology and intellectual property rights: a need of the hour. Res. Rev. J. Comput. Biol. 10(1) (2021). https://medicaljournals.stmjournals.in/index.php/RRJoCB/article/view/2521. Accessed 21 Dec 2022
2. Thampi, S.M.: Introduction to Distributed Systems. arXiv, Nov. 23 (2009). https://doi.org/10.48550/arXiv.0911.4395
3. Li, Y., Huang, C., Ding, L., Li, Z., Pan, Y., Gao, X.: Deep learning in bioinformatics: introduction, application, and perspective in big data era. arXiv, Feb. 28 (2019). https://doi.org/10.48550/arXiv.1903.00342
4. Madhani, P.M.: Indian Bioinformatics: Growth Opportunities and Challenges. Rochester, NY (2011). https://papers.ssrn.com/abstract=1967317. Accessed 21 Dec 2022
5. Diagnostics Lit Review Final.pdf. https://msfaccess.org/sites/default/files/2020-05/Diagnostics%20Lit%20Review%20Final.pdf. Accessed 21 Dec 2022
6. History of Genetics. https://www.news-medical.net/life-sciences/History-of-Genetics.aspx. Accessed 21 Dec 2022
7. 1909: The Word Gene Coined, Genome.gov. https://www.genome.gov/25520244/online-education-kit-1909-the-word-gene-coined. Accessed 21 Dec 2022
8. Human chorionic somatomammotropin and growth hormone gene expression in rat pituitary tumour cells is dependent on proximal promoter sequences. PMC. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC317937/. Accessed 21 Dec 2022
9. Essentials of Genetics by Heidi Chial, et al. Read online. https://www.e-booksdirectory.com/details.php?ebook=9457. Accessed 21 Dec 2022
10. Admin: The History of DNA Timeline, DNA Worldwide, Oct. 17, 2014. https://www.dna-worldwide.com/resource/160/history-dna-timeline. Accessed 21 Dec 2022
11. Francis Crick, Rosalind Franklin, James Watson, and Maurice Wilkins | Science History Institute. https://www.sciencehistory.org/historical-profile/james-watson-francis-crickmaurice-wilkins-and-rosalind-franklin. Accessed 21 Dec 2022
12. (PDF) Recent Advances in Genetic Engineering-A Review. https://www.researchgate.net/publication/233953717_Recent_Advances_in_Genetic_Engineering-A_Review. Accessed 21 Dec 2022
13. What Is Genetic Modification? | Live Science. https://www.livescience.com/64662-geneticmodification.html. Accessed 21 Dec 2022
14. Basic steps in genetic engineering: International Journal of Environmental Studies: Vol 15, No 1. https://www.tandfonline.com/doi/abs/https://doi.org/10.1080/00207238008737419. Accessed 21 Dec 2022
15. Genome Engineering Technologies (GET) and their Applications | Department of Biotechnology. https://dbtindia.gov.in/schemes-programmes/research-development/knowledge-generation-discovery-research-new-tools-and-1. Accessed 21 Dec 2022
16. Allen, A.: Biotechnology, research and intellectual property law. [2002] CanterLawRw 5; (2002) 8 Canterbury Law Review 365. http://www.nzlii.org/nz/journals/CanterLawRw/2002/5.html. Accessed 21 Dec 2022
17. The Difficulties and Challenges of Biomedical Research and Health Advances, Policy & Medicine. https://www.policymed.com/2011/02/the-difficulties-and-challenges-of-biomedical-research-and-health-advances.html. Accessed 21 Dec 2022
18. Johnston, J., Wasunna, A.: Patents, biomedical research, and treatments: examining concerns, canvassing solutions. Hastings Cent. Rep. 37, S1-36 (2007). https://doi.org/10.1353/hcr.2007.0006
19. Ratcliffe, S.: The ethics of genetic patenting and the subsequent implications on the future of health care. Touro Law Rev. 27(2) (2011)
20. Kankanala, K.C.: Genetic Patent Law and Strategy. Manupatra (2007)
21. Himatej Reddy, T.: Patenting Biotechnology Based Inventions - In India. Rochester, NY (2012). https://doi.org/10.2139/ssrn.2198744
22. Determination Of Obviousness/Inventive Step - Indian Approach - Patent - India. https://www.mondaq.com/india/patent/345598/determination-of-obviousnessinventive-step--indian-approach. Accessed 21 Dec 2022
23. Bandhua Mukti Morcha vs Union Of India & Others on 16 December, 1983. https://indiankanoon.org/doc/595099/. Accessed 21 Dec 2022

Performance of Automated Machine Learning Based Neural Network Estimators for the Classification of PCOS

Pijush Dutta1, Shobhandeb Paul2, Gour Gopal Jana1, Arindam Sadhu1, and Pritam Bhattacharjee3(B)

1 Greater Kolkata College of Engineering and Management, Kolkata 743387, West Bengal, India
2 Gurunanak Institute of Technology, Kolkata 700114, West Bengal, India
3 School of Computing, Amrita Vishwa Vidyapeetham, Amritapuri 690525, Kerala, India

[email protected]

Abstract. Artificial neural networks (ANNs) and automated machine learning (AutoML) have transformed the study of artificial intelligence by producing models that are very effective at inductive learning tasks. Despite these accomplishments, there is little information on when to use one rather than the other. In the present research, the performance of TPOT-NN, an estimator that integrates AutoML with NNs, is compared with that of non-AutoML NN estimators on polycystic ovary syndrome (PCOS) benchmark datasets. The main findings point to TPOT-NN as a useful tool that, on some datasets, achieves higher accuracy than the other methods used in this research. They also point to potential avenues for basic TPOT to improve classification models in the future.

Keywords: Automated machine learning · Artificial neural networks · TPOT · Hybrid ANN based TPOT · PCOS

1 Introduction

Artificial neural networks (ANNs) and automated machine learning (AutoML) are both used to develop highly effective classification and regression models [1–3]. NNs and AutoML are frequently viewed as rival techniques because, despite their successes, there is still a great deal of disagreement, and no quantitative information, about the relative benefits of the two approaches and how to choose which will work better on a given real-world problem [3]. Fortunately, there is considerable interest in merging AutoML with NNs to boost performance by making use of their combined benefits [4]. Well-known tools that use NNs in AutoML contexts include Auto-Net and Amazon's AutoGluon [5, 6], among others, but none of them makes both widely accessible through the same ML pipeline. This work attempts to address these concerns by extending TPOT to use NN estimators in its classification pipelines, yielding the TPOT-NN classifier [7]. A comparison study is conducted between TPOT-NN, non-NN TPOT (base TPOT), and an NN classifier on the UCI PCOS with and without fertility dataset, which has

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 S. Bhattacharyya et al. (Eds.): Human 2023, STHC, pp. 65–73, 2023. https://doi.org/10.1007/978-981-99-3478-2_7

P. Dutta et al.

541 records with 41 characteristics, in order to determine which method performs better [8]. The final section examines TPOT-NN architectures, the ways in which its better performance manifests, and future directions [9]. In summary, this research offers the following observations. TPOT-NN is a recent addition to AutoML that combines TPOT with neural network estimators. Adding NN estimators to TPOT currently comes with a trade-off: longer model training times. The architectures trained by TPOT-NN occasionally contain intriguing structures that mimic parts of bigger, more complicated deep learning architectures. The core of TPOT-NN concentrates on classifying increasingly complicated, atypically organized data types.

2 Background

2.1 Artificial Neural Network

Artificial neural networks are made up of individual neurons, which convey signals to other neurons through a set of adjustable parameters. The "deep" design of an ML model, which consists of many stacked ANN (MLP-ANN) layers (more than ten deep layers), enables deep models to approximate exceedingly complicated and nonlinear objective functions [11, 12]. One of the simplest ANNs is logistic regression, which is defined as

$$y = f(wx + b) \tag{1}$$

where w is the weight matrix, b is the bias vector, f is an activation function, and x and y are the input and output vectors. Here f is the softmax function, the multivariate generalization of the standard logistic function. Stacking layers on this most basic ANN design gives the MLP-NN, which for one hidden layer is

$$h_1 = \phi_1(w_1 x + b_1) \tag{2}$$

$$y = \phi_2(w_2 h_1 + b_2) \tag{3}$$

where h_1 is the output of the hidden layer. The main goal of the algorithm is to find the values of w and b from the training dataset using convex optimization, which in turn helps to obtain the lowest value of the loss function L on the test dataset. An ANN model's estimating capability often rises with increased depth or breadth [13, 14].
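The forward passes in Eqs. (1) to (3) can be sketched in plain Python as follows. This is a minimal illustration only: the weights and inputs are made-up values, the hidden activation is assumed to be ReLU (the text does not specify it), and the output layer is taken to act on the hidden activation h1.

```python
import math

def softmax(z):
    """Softmax, the multivariate generalization of the logistic function."""
    m = max(z)  # subtract the maximum for numerical stability
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def affine(w, x, b):
    """Compute w x + b for a weight matrix w (list of rows) and vectors x, b."""
    return [sum(wij * xj for wij, xj in zip(row, x)) + bi
            for row, bi in zip(w, b)]

def relu(z):
    """Assumed hidden activation; the paper leaves the hidden activation open."""
    return [max(0.0, v) for v in z]

def logistic_regression(w, b, x):
    # Eq. (1): y = f(w x + b) with f = softmax
    return softmax(affine(w, x, b))

def mlp_one_hidden(w1, b1, w2, b2, x):
    h1 = relu(affine(w1, x, b1))        # Eq. (2): hidden layer
    return softmax(affine(w2, h1, b2))  # Eq. (3): output layer on h1

# Hypothetical input with 3 features and 2 output classes.
x = [0.5, -1.0, 2.0]
w = [[0.1, 0.2, 0.3], [-0.3, 0.1, 0.0]]
b = [0.0, 0.1]
y_lr = logistic_regression(w, b, x)

# Hypothetical hidden layer with 4 units.
w1 = [[0.2, -0.1, 0.4], [0.0, 0.3, -0.2], [0.5, 0.5, 0.5], [-0.4, 0.1, 0.2]]
b1 = [0.0, 0.0, 0.0, 0.0]
w2 = [[0.3, -0.2, 0.1, 0.4], [-0.1, 0.2, 0.3, -0.3]]
b2 = [0.0, 0.0]
y_mlp = mlp_one_hidden(w1, b1, w2, b2, x)

print(y_lr, y_mlp)
```

Both outputs are probability vectors over the two classes; training would adjust the weights and biases to reduce the loss L.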

Fig. 1. Pipelining model for MLP-ANN

Performance of Automated Machine Learning

Furthermore, sufficiently large NN designs exhibit an interesting phenomenon: contrary to the bias-variance tradeoff indicated in Fig. 1, model complexity increases as variance and bias increase. Compact NN structures significantly outperform non-NN learners for typical binary classification on basic datasets, in terms of training time and error [15].

2.2 Automated Machine Learning

Selecting the optimal feature transformations, model topologies, and hyperparameter settings is one of the most difficult parts of developing an ML system [16]. These architectural choices are often made using prior knowledge and experimenter intuition, both of which can impede the performance of the final learned pipeline and complicate the development of the ML system [17]. Given a wide range of potential architectural configurations, AutoML offers techniques for making these decisions automatically. The ideal architecture for a particular task may be determined using a variety of AutoML approaches, one of which is genetic programming (GP). In general, this kind of AutoML builds trees of mathematical functions that are optimized with respect to fitness metrics such as classification accuracy [18, 19]. Each generation of trees is built using random mutations, and architectures that are progressively more suitable are passed down to subsequent generations, much as in natural evolution. Repeating this procedure for several training generations produces an optimal tree.

2.2.1 TPOT Model

A TPOT model employs GP to find the best ML pipelines for classification or regression [20, 21]. TPOT may use all four types of operators when conducting GP on trees [9]. Figure 2 shows a simplified model of TPOT pipelines, which can perform as well as or better than competing state-of-the-art ML methods while using a very small number of operators. A crucial part of the system is TPOT's implementation of Stacking Estimator components, which enable estimators to propagate their outputs to succeeding operators [22]. The raw outputs of a single estimator can be passed to any number of succeeding operators, a useful trait that can be used to build nonlinear multi-layer NN structures with adjacent layers of arbitrary size.

2.2.2 TPOT-NN Model

An LR classifier and an MLP-NN, both implemented as ANNs in PyTorch, serve as the two new classification estimators in this initial version of TPOT-NN [9, 23]. The dimensionality of the training layers, the learning rate, and other key ML parameters, which are typically tweaked manually, can be tuned by TPOT using GP. The number of features in the input dataset determines the alternatives available in a specific experiment (Fig. 3). TPOT-NN estimator performance can be further improved by LR and MLP [24, 25], while GP can be used to optimize the dimensionality of the intermediate layers in TPOT-NN estimators.
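The evolutionary search described in Sect. 2.2 (random mutation, selection by fitness, repetition over generations) can be illustrated with a deliberately simplified sketch. It is not TPOT's algorithm: real GP evolves pipeline trees, whereas this toy evolves a (hidden-layer width, learning rate) pair, and the `fitness` function is a hypothetical stand-in for cross-validated accuracy.

```python
import random

def fitness(candidate):
    """Hypothetical stand-in for CV accuracy; best at hidden=16, lr=0.01."""
    hidden, lr = candidate
    return 1.0 - 0.01 * abs(hidden - 16) - 5.0 * abs(lr - 0.01)

def mutate(candidate, rng):
    """Randomly perturb either the layer width or the learning rate."""
    hidden, lr = candidate
    if rng.random() < 0.5:
        hidden = max(1, hidden + rng.choice([-4, -2, -1, 1, 2, 4]))
    else:
        lr = min(0.5, max(1e-4, lr * rng.choice([0.5, 2.0])))
    return (hidden, lr)

def evolve(generations=35, population_size=20, seed=0):
    """Keep the fitter half each generation and refill with mutated copies."""
    rng = random.Random(seed)
    population = [(rng.randint(1, 64), rng.choice([0.001, 0.01, 0.1]))
                  for _ in range(population_size)]
    history = []
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        survivors = population[: population_size // 2]
        children = [mutate(rng.choice(survivors), rng)
                    for _ in range(population_size - len(survivors))]
        population = survivors + children
        history.append(fitness(max(population, key=fitness)))
    return max(population, key=fitness), history

best, history = evolve()
print(best, history[-1])
```

Because the fitter half always survives, the best fitness recorded in `history` never decreases from one generation to the next, which mirrors the idea that more suitable architectures are passed down.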

Fig. 2. Pipelining model for TPOT basic

MLPs add depth to LR's shallow architecture, which makes it possible to improve the performance of TPOT pipelines that incorporate MLP-ANN. By supplying a "configuration dictionary," TPOT users can control which operators are available, which parameters can be tuned, and the values those parameters can take. In the experiments, this functionality is used to selectively restrict TPOT's estimators in order to separate these two sources of variance.
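A configuration dictionary of the kind described above maps each allowed operator to the values its parameters may take. The sketch below follows that general shape; the operator paths and hyperparameter names are assumptions modeled on TPOT's configuration format and should be checked against the installed TPOT version, and the value grids are purely illustrative.

```python
from itertools import product

# Hypothetical search space limited to the two NN estimators. Operator paths
# and parameter names are assumptions, not confirmed by the paper.
NN_ONLY_CONFIG = {
    "tpot.builtins.PytorchLRClassifier": {
        "learning_rate": [1e-3, 1e-2, 1e-1],
        "batch_size": [8, 16, 32],
        "num_epochs": [5, 10, 15],
    },
    "tpot.builtins.PytorchMLPClassifier": {
        "learning_rate": [1e-3, 1e-2, 1e-1],
        "batch_size": [8, 16, 32],
        "num_epochs": [5, 10, 15],
    },
}

def candidate_settings(grid):
    """Yield every combination of values allowed for one operator."""
    names = sorted(grid)
    for values in product(*(grid[name] for name in names)):
        yield dict(zip(names, values))

def search_space_size(config):
    """Count the distinct hyperparameter settings allowed per operator."""
    return {op: sum(1 for _ in candidate_settings(grid))
            for op, grid in config.items()}

sizes = search_space_size(NN_ONLY_CONFIG)
print(sizes)
```

In TPOT, a dictionary like this would typically be supplied through the `config_dict` argument of `TPOTClassifier`; restricting it to a subset of operators is one way to separate the contribution of the NN estimators from that of the AutoML search.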

Fig. 3. Pipelining model for TPOT-ANN

3 Dataset Description

The UCI PCOS with and without fertility dataset [8, 27] was used. The dataset has 541 records with 41 attributes for women who were not fertile and 541 records with 3 possible attributes for women who were fertile. Most published studies have employed a subset of 12 attributes. The main reason for singling out these attributes is that they are considered the most important when treating a patient for PCOS. Figures 4 and 5 (whisker plot) show the sample dataset for PCOS without infertility and its attributes, respectively. These attributes and their associated values demonstrate how PCOS can be tied to a list of characteristics. Of these 12 attributes, 11 are used in PCOS prediction.

Fig. 4. Sample datasets for PCOS without infertility

3.1 Experimental Setup

The experiments used 541 records for PCOS with fertility (3 potential attributes) and 541 records for PCOS without fertility (12 potential attributes out of 41). The datasets were split into 80% training and 20% test data, and 5-fold cross-validation was used to rate the classification accuracy of the pipelines. Training for all three experiments (MLP-ANN, TPOT-base, and TPOT-NN) was terminated after 35 generations, with each generation comprising 100 distinct pipeline trees, allowing full training to take place.

3.2 Hardware and High-Performance Computing Environment

Each experiment was run on an Intel Core i3 6th Generation CPU with 8 GB of RAM and 500 GB of HDD storage. All tests using PyTorch neural network estimators were conducted with Python 3.7.6 and Jupyter Notebook 6.0.3, with Ubuntu as the operating system [26].
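The data partitioning in Sect. 3.1 can be sketched with index arithmetic alone. The sketch assumes that the 5-fold cross-validation is run inside the 80% training portion and that the split is a simple shuffled split; the paper does not state either detail, and the function names are hypothetical.

```python
import random

def train_test_split_indices(n, test_fraction=0.2, seed=0):
    """Shuffle record indices and hold out a fraction for testing."""
    rng = random.Random(seed)
    indices = list(range(n))
    rng.shuffle(indices)
    n_test = round(n * test_fraction)
    return indices[n_test:], indices[:n_test]

def k_fold(indices, k=5):
    """Yield (train, validation) index lists for k-fold cross-validation."""
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, validation

# 541 records, as in the PCOS dataset described above.
train_idx, test_idx = train_test_split_indices(541)
splits = list(k_fold(train_idx, k=5))
print(len(train_idx), len(test_idx), [len(v) for _, v in splits])
```

Each record in the training portion appears in exactly one validation fold, so averaging the per-fold accuracy gives one cross-validated score per candidate pipeline.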

Fig. 5. Whisker plot of the PCOS attributes

4 Results

The performance of the pipelines was assessed by accuracy and model training time. Table 2 displays the outcomes of running TPOT on the described datasets. To check for significant changes in accuracy and variance, a samples t-test and Levene's test were carried out, along with the mean classification accuracy, for the MLP-ANN, TPOT-base, and TPOT-NN experiments.

4.1 Analysis of Different PyTorch Neural Network Estimators

The classification accuracy for PCOS with and without infertility is much higher with TPOT-NN. When TPOT is combined with the new PyTorch NN estimators, it performs consistently while also improving model accuracy. Moreover, TPOT-NN can reduce the variance, further improving accuracy, for both data types, PCOS with and without fertility (Table 1).
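The two statistical checks mentioned in Sect. 4 can be sketched as follows. The sketch computes only the test statistics, assumes Welch's unequal-variance form of the two-sample t-test and the mean-centered form of Levene's test (the paper does not say which variants were used), and runs on made-up fold accuracies that are not results from the study.

```python
from statistics import mean, variance

def welch_t(a, b):
    """Welch's two-sample t statistic (does not assume equal variances)."""
    se = (variance(a) / len(a) + variance(b) / len(b)) ** 0.5
    return (mean(a) - mean(b)) / se

def levene_w(*groups):
    """Levene's W statistic using absolute deviations from each group mean."""
    z = [[abs(v - mean(g)) for v in g] for g in groups]
    n = sum(len(g) for g in z)
    k = len(z)
    grand = mean([v for g in z for v in g])
    between = sum(len(g) * (mean(g) - grand) ** 2 for g in z)
    within = sum((v - mean(g)) ** 2 for g in z for v in g)
    return (n - k) / (k - 1) * between / within

# Made-up per-fold accuracies for two hypothetical configurations.
baseline = [0.90, 0.92, 0.91, 0.93, 0.89]
candidate = [0.97, 0.99, 0.98, 0.98, 0.99]

t_stat = welch_t(baseline, candidate)
w_stat = levene_w(baseline, candidate)
print(t_stat, w_stat)
```

A large negative t statistic here would indicate that the second configuration has the higher mean accuracy, while W compares the spread of the two groups; turning either statistic into a p-value requires the corresponding t or F distribution, which is omitted from this sketch.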

Table 2 shows the impact of the different PyTorch neural network estimators integrated into TPOT in each of the two configurations.

Table 1. Analysis of PyTorch neural network estimators

| Name of the Estimator | Description of Estimator | TPOT-NN? |
|---|---|---|
| MLP-NN | TPOT restricted to MLP-NN only | No |
| TPOT-base | Baseline TPOT | No |
| TPOT-all | TPOT with all estimators | Yes |

Table 2. Performance of integrated TPOT in various configurations Types of Dataset

Parameter

PCOS with infertility

Accuracy

91.25%

Training Time

19.742 ms

PCOS without infertility

Accuracy

95.36%

97.54%

99.1%

Training Time

MLP-NN

8.987 ms

TPOT-Base

TPOT-all

t-test

Levene

93.74%

98.65%