Language: English. Pages: 129. Year: 2023.
IoT and Big Data Analytics (Volume 2)
Deep Learning for Healthcare Services
Edited by
Parma Nand
School of Engineering and Technology, Sharda University Greater Noida, U.P., India
Vishal Jain
Department of Computer Science and Engineering, School of Engineering and Technology, Sharda University, Greater Noida, U.P., India
Dac-Nhuong Le
Faculty of Information Technology, Haiphong University, Haiphong, Vietnam
Jyotir Moy Chatterjee
Lord Buddha Education Foundation, Kathmandu, Nepal
Ramani Kannan
Center for Smart Grid Energy Research, Institute of Autonomous System, Universiti Teknologi PETRONAS (UTP), Malaysia
& Abhishek S. Verma
Department of Computer Science & Engineering, School of Engineering & Technology, Sharda University, Greater Noida, U.P., India
IoT and Big Data Analytics (Volume 2)
Deep Learning for Healthcare Services
Editors: Parma Nand, Vishal Jain, Dac-Nhuong Le, Jyotir Moy Chatterjee, Ramani Kannan, and Abhishek S. Verma
ISBN (Online): 978-981-5080-23-0
ISBN (Print): 978-981-5080-24-7
ISBN (Paperback): 978-981-5080-25-4
©2023, Bentham Books imprint. Published by Bentham Science Publishers Pte. Ltd. Singapore. All Rights Reserved. First published in 2023.
BSP-EB-PRO-9789815080230-TP-115-TC-05-PD-20230707
BENTHAM SCIENCE PUBLISHERS LTD.
End User License Agreement (for non-institutional, personal use) This is an agreement between you and Bentham Science Publishers Ltd. Please read this License Agreement carefully before using the book/echapter/ejournal (“Work”). Your use of the Work constitutes your agreement to the terms and conditions set forth in this License Agreement. If you do not agree to these terms and conditions then you should not use the Work. Bentham Science Publishers agrees to grant you a non-exclusive, non-transferable limited license to use the Work subject to and in accordance with the following terms and conditions. This License Agreement is for non-library, personal use only. For a library / institutional / multi user license in respect of the Work, please contact: [email protected].
Usage Rules: 1. All rights reserved: The Work is the subject of copyright and Bentham Science Publishers either owns the Work (and the copyright in it) or is licensed to distribute the Work. You shall not copy, reproduce, modify, remove, delete, augment, add to, publish, transmit, sell, resell, create derivative works from, or in any way exploit the Work or make the Work available for others to do any of the same, in any form or by any means, in whole or in part, in each case without the prior written permission of Bentham Science Publishers, unless stated otherwise in this License Agreement. 2. You may download a copy of the Work on one occasion to one personal computer (including tablet, laptop, desktop, or other such devices). You may make one back-up copy of the Work to avoid losing it. 3. The unauthorised use or distribution of copyrighted or other proprietary content is illegal and could subject you to liability for substantial money damages. You will be liable for any damage resulting from your misuse of the Work or any violation of this License Agreement, including any infringement by you of copyrights or proprietary rights.
Disclaimer: Bentham Science Publishers does not guarantee that the information in the Work is error-free, or warrant that it will meet your requirements or that access to the Work will be uninterrupted or error-free. The Work is provided "as is" without warranty of any kind, either express or implied or statutory, including, without limitation, implied warranties of merchantability and fitness for a particular purpose. The entire risk as to the results and performance of the Work is assumed by you. No responsibility is assumed by Bentham Science Publishers, its staff, editors and/or authors for any injury and/or damage to persons or property as a matter of products liability, negligence or otherwise, or from any use or operation of any methods, products instruction, advertisements or ideas contained in the Work.
Limitation of Liability: In no event will Bentham Science Publishers, its staff, editors and/or authors, be liable for any damages, including, without limitation, special, incidental and/or consequential damages and/or damages for lost data and/or profits arising out of (whether directly or indirectly) the use or inability to use the Work. The entire liability of Bentham Science Publishers shall be limited to the amount actually paid by you for the Work.
General: 1. Any dispute or claim arising out of or in connection with this License Agreement or the Work (including non-contractual disputes or claims) will be governed by and construed in accordance with the laws of Singapore. Each party agrees that the courts of the state of Singapore shall have exclusive jurisdiction to settle any dispute or claim arising out of or in connection with this License Agreement or the Work (including non-contractual disputes or claims). 2. Your rights under this License Agreement will automatically terminate without notice and without the
need for a court order if at any point you breach any terms of this License Agreement. In no event will any delay or failure by Bentham Science Publishers in enforcing your compliance with this License Agreement constitute a waiver of any of its rights. 3. You acknowledge that you have read this License Agreement, and agree to be bound by its terms and conditions. To the extent that any other terms and conditions presented on any website of Bentham Science Publishers conflict with, or are inconsistent with, the terms and conditions set out in this License Agreement, you acknowledge that the terms and conditions set out in this License Agreement shall prevail. Bentham Science Publishers Pte. Ltd. 80 Robinson Road #02-00 Singapore 068898 Singapore Email: [email protected]
CONTENTS

PREFACE
LIST OF CONTRIBUTORS

CHAPTER 1  ROLE OF DEEP LEARNING IN HEALTHCARE INDUSTRY: LIMITATIONS, CHALLENGES AND FUTURE SCOPE
    Mandeep Singh, Megha Gupta, Anupam Sharma, Parita Jain and Puneet Kumar Aggarwal
    INTRODUCTION
        A Framework of Deep Learning
    LITERATURE REVIEW
        E-Health Records by Deep Learning
        Medical Images by Deep Learning
        Genomics by Deep Learning
        Use of Mobiles by Deep Learning
    FROM PERCEPTRON TO DEEP LEARNING
        Recurrent Neural Network (RNN)
        Convolutional Neural Network (CNN)
        Boltzmann Machine Technique
        Auto-Encoder and Deep Auto-Encoder
        Hardware/Software-Based Implementation
    DEEP LEARNING IN HEALTHCARE: FUTURE SCOPE, LIMITATIONS, AND CHALLENGES
    CONCLUSION
    REFERENCES

CHAPTER 2  GENERATIVE ADVERSARIAL NETWORKS FOR DEEP LEARNING IN HEALTHCARE: ARCHITECTURE, APPLICATIONS AND CHALLENGES
    Shankey Garg and Pradeep Singh
    INTRODUCTION
    DEEP LEARNING
        The Transition from Machine Learning to DL
        Deep Feed-forward Networks
        Restricted Boltzmann Machines
        Deep Belief Networks
        Convolutional Neural Networks
        Recurrent Neural Networks
    GENERATIVE ADVERSARIAL NETWORKS
        GANs Architectures
            Deep Convolutional GAN (DCGAN)
            InfoGAN
            Conditional GANs
            Auto Encoder GANs
            Cycle GANs
        GANs Training Tricks
            Objective Function-Based Improvement
            Skills-Based Techniques
            Other Miscellaneous Techniques
    STATE-OF-THE-ART APPLICATIONS OF GANS
        Image-Based Applications
        Sequential Data-Based Applications
        Other Applications
    FUTURE CHALLENGES
    CONCLUSION
    REFERENCES

CHAPTER 3  ROLE OF BLOCKCHAIN IN HEALTHCARE SECTOR
    Sheik Abdullah Abbas, Karthikeyan Jothikumar and Arif Ansari
    INTRODUCTION
    FEATURES OF BLOCKCHAIN
    DATA MANAGEMENT AND ITS SERVICES (TRADITIONAL VS DISTRIBUTED)
    DATA DECENTRALIZATION AND ITS DISTRIBUTION
    ASSET MANAGEMENT
    ANALYTICS
        Analytics Process Model
        Analytic Model Requirements
    IMMUTABILITY FOR BIOMEDICAL APPLIANCES IN BLOCKCHAIN
    SECURITY AND PRIVACY
    BLOCKCHAIN IN BIOMEDICINE AND ITS APPLICATIONS
        Case Study
    CONCLUSION AND FUTURE WORK
    REFERENCES

CHAPTER 4  BRAIN TUMOR DETECTION BASED ON DIFFERENT DEEP NEURAL NETWORKS - A COMPARISON STUDY
    Shrividhiya Gaikwad, Srujana Kanchisamudram Seshagiribabu, Sukruta Nagraj Kashyap, Chitrapadi Gururaj and Induja Kanchisamudram Seshagiribabu
    INTRODUCTION
    RELATED WORK
    APPROACH
        Dataset
        Data Pre-Processing
        Data Augmentation
        Contouring
        Transfer Learning
    MODELS USED IN THE COMPARISON STUDY
        Convolutional Neural Network
            Input Layer
            Convolution Layer
            Activation Layer
            Pooling Layer
            Fully Connected Layer
            Output
        VGG 16
        ResNet 50
    EVALUATION PARAMETERS
    RESULTS AND DISCUSSION
        Convolutional Neural Network
        VGG16 and ResNet50
        GUI
    CONCLUSION AND FUTURE WORK
    NOTES
    REFERENCES

CHAPTER 5  A ROBUST MODEL FOR OPTIMUM MEDICAL IMAGE CONTRAST ENHANCEMENT AND TUMOR SCREENING
    Monika Agarwal, Geeta Rani, Vijaypal Singh Dhaka and Nitesh Pradhan
    INTRODUCTION
    LITERATURE REVIEW
    PROPOSED MODEL
        Dataset
        Image Pre-Processing
        Features Extraction
        Tumor Detection
    RESULTS AND DISCUSSION
    FUTURE SCOPE
    CONCLUSION
    REFERENCES

SUBJECT INDEX
PREFACE

This book aims to highlight the different applications of deep learning algorithms in implementing Big Data and IoT-enabled smart solutions to treat and care for terminally ill patients. The book shall also unveil how the combination of big data, IoT, and the cloud can empower the conventional doctor-patient relationship in a more dynamic, transparent, and personalized manner. Incorporation of these smart technologies can also successfully port over powerful analytical methods from the financial services and consumer industries, such as claims management. This, coupled with the availability of data on social determinants of health – such as socioeconomic status, education, living status, and social networks – opens novel opportunities for providers to understand individual patients on a much deeper level, opening the door for precision medicine to become a reality. The real value of such systems stems from their ability to deliver in-the-moment insights to enable personalized care, understand variations in care patterns, risk-stratify patient populations, and power dynamic care journey management and optimization.

Successful application of deep learning frameworks to enable meaningful, cost-effective personalized healthcare services is the primary aim of the healthcare industry in the present scenario. However, realizing this goal requires effective understanding, application, and amalgamation of deep learning, IoT, Big Data, and several other computing technologies to deploy such systems effectively. This book shall help clarify the understanding of certain key mechanisms and technologies helpful in realizing such systems. Through this book, we attempt to combine numerous compelling views, guidelines, and frameworks on enabling personalized healthcare service options through the successful application of Deep Learning frameworks.

Chapter 1 presents a survey of the role of deep learning in the healthcare industry with its challenges and future scope.
Chapter 2 focuses on recent work on Generative Adversarial Networks (GANs) and their use in different deep learning applications for healthcare. Chapter 3 focuses on the role of blockchain in biomedical engineering applications. Chapter 4 compares three different deep neural network architectures, a Convolutional Neural Network (CNN), VGG16, and ResNet50, and visually presents the results to users through a GUI.
Chapter 5 proposes an efficient model for medical image contrast enhancement and accurate tumor prediction.
Parma Nand School of Engineering and Technology Sharda University Greater Noida, U.P. India Vishal Jain Department of Computer Science and Engineering School of Engineering and Technology Sharda University, Greater Noida U.P., India Dac-Nhuong Le Faculty of Information Technology Haiphong University, Haiphong Vietnam Jyotir Moy Chatterjee Lord Buddha Education Foundation Kathmandu, Nepal Ramani Kannan Center for Smart Grid Energy Research Institute of Autonomous System Universiti Teknologi PETRONAS (UTP) Malaysia & Abhishek S. Verma Department of Computer Science & Engineering School of Engineering & Technology Sharda University, Greater Noida U.P., India
List of Contributors Abhishek S. Verma
Department of Computer Science & Engineering, School of Engineering & Technology, Sharda University, Greater Noida, U.P., India
Anupam Sharma
HMR Institute of Technology & Management, Delhi, India
Arif Ansari
Data Sciences and Operations, Marshall School of Business, University of Southern California, Los Angeles, California, United States
Chitrapadi Gururaj
Department of Electronics and Telecommunication Engineering, BMS College of Engineering, Bengaluru, Visvesvaraya Technological University, Belagavi, India
Dac-Nhuong Le
Faculty of Information Technology, Haiphong University, Haiphong, Vietnam
Geeta Rani
Computer and Communication Engineering, Manipal University, Jaipur, India
Induja Kanchisamudram Seshagiribabu
Department of Computer Science (Artificial Intelligence), Andrew and Erna Viterbi School of Engineering, University of Southern California, Los Angeles, California, United States
Jyotir Moy Chatterjee
Lord Buddha Education Foundation, Kathmandu, Nepal
Karthikeyan Jothikumar
Department of Computer Science Engineering, National Engineering College, Tamil Nadu, India
Monika Agarwal
Dayanand Sagar University, Bangalore, India
Mandeep Singh
Raj Kumar Goel Institute of Technology, Ghaziabad, India
Megha Gupta
IMS Engineering College, Ghaziabad, India
Nitesh Pradhan
Computer Science Engineering, Manipal University, Jaipur, India
Parita Jain
KIET Group of Institutes, Ghaziabad, India
Parma Nand
School of Engineering and Technology, Sharda University Greater Noida, U.P., India
Puneet Kumar Aggarwal
ABES Engineering College, Ghaziabad, India
Pradeep Singh
Department of Computer Science & Engineering, National Institute of Technology, Raipur, Chhattisgarh, India
Ramani Kannan
Center for Smart Grid Energy Research, Institute of Autonomous System, Universiti Teknologi PETRONAS (UTP), Malaysia
Shankey Garg
Department of Computer Science & Engineering, National Institute of Technology, Raipur, Chhattisgarh, India
Sheik Abdullah Abbas
School of Computer Science Engineering, Vellore Institute of Technology, Chennai, Tamil Nadu, India
Shrividhiya Gaikwad
Department of Electronics and Telecommunication Engineering, BMS College of Engineering, Bengaluru, Visvesvaraya Technological University, Belagavi, India
Sukruta Nagraj Kashyap
Department of Electronics and Telecommunication Engineering, BMS College of Engineering, Bengaluru, Visvesvaraya Technological University, Belagavi, India
Srujana Kanchisamudram Seshagiribabu
Department of Electronics and Telecommunication Engineering, BMS College of Engineering, Bengaluru, Visvesvaraya Technological University, Belagavi, India
Vishal Jain
Department of Computer Science and Engineering, School of Engineering and Technology, Sharda University, Greater Noida, U.P., India
Vijaypal Singh Dhaka
Computer and Communication Engineering, Manipal University, Jaipur, India
IoT and Big Data Analytics, 2023, Vol. 2, 1-22
CHAPTER 1
Role of Deep Learning in Healthcare Industry: Limitations, Challenges and Future Scope
Mandeep Singh1,*, Megha Gupta2, Anupam Sharma3, Parita Jain4 and Puneet Kumar Aggarwal5
1 Raj Kumar Goel Institute of Technology, Ghaziabad, India
2 IMS Engineering College, Ghaziabad, India
3 HMR Institute of Technology & Management, Delhi, India
4 KIET Group of Institutes, Ghaziabad, India
5 ABES Engineering College, Ghaziabad, India
Abstract: Nowadays, the adoption of different deep learning (DL) algorithms is becoming an advantage in the healthcare sector. Algorithms like the CNN (Convolutional Neural Network) are used to detect diseases and classify images of various disease abnormalities. It has been proven that CNNs show high performance in disease classification, so deep learning can reduce diagnostic uncertainty in the healthcare sector. DL is also used in the reconstruction of medical diagnostic images such as Computed Tomography (CT) and Magnetic Resonance Imaging (MRI) scans. A CNN can be used to map input image data to reference image data, a process known as image registration using deep learning. DL is also used to extract hidden patterns from healthcare data. A CNN has many hidden layers in the network, so prediction and analysis can be made accurately. Deep learning has many applications in the healthcare system, such as cancer detection, gene selection, tumor detection, recognition of human activities, and prediction of infectious disease outbreaks. DL has become popular in the field of healthcare due to the availability of open data sources. In the case of small datasets, however, CNN results may lack statistical significance. Deep learning is a technique built on the basis of ANNs (Artificial Neural Networks); it has emerged as a robust tool for machine learning and is encouraging a recasting of artificial intelligence. A deep learning architecture has more than two hidden layers, whereas a conventional ANN has only one or two. This chapter therefore presents a survey of the role of deep learning in the healthcare industry with its challenges and future scope. * Corresponding author Mandeep Singh: Raj Kumar Goel Institute of Technology, Ghaziabad, India; E-mail: [email protected]
Parma Nand, Vishal Jain, Dac-Nhuong Le, Jyotir Moy Chatterjee, Ramani Kannan, and Abhishek S. Verma (Eds.) All rights reserved-© 2023 Bentham Science Publishers
Keywords: Artificial neural networks (ANN), Auto-encoders (AEs), Bioinformatics, Biological neural networks, Boltzmann machine, Convolution neural networks (CNN), Deep autoencoders, Deep belief networks (DBNs), Deep learning (DL), Deep neural nets (DNNs), Deep structures, Electronic health records (EHRs), Genomics, Machine learning (ML), Medical images, Medical informatics, Pervasive sensing, Restricted Boltzmann machines (RBMs), Recurrent neural nets (RNNs), State-of-the-art ML, Unified medical language system (UMLS).

INTRODUCTION

Deep learning has emerged as an interesting new technique in machine learning in recent years. Deep learning, in contrast to more standard Neural Networks (NNs), makes use of numerous hidden layers. A large number of neurons provides broad coverage of the raw input data; non-linear combinations of their outputs form a lower-dimensional projection of the feature space, so each higher perceptual level corresponds to a more abstract, lower-dimensional representation. If the network is suitably weighted, the result is an effective high-level abstraction of the raw data or images. This high level of abstraction allows for the creation of an automatic feature set that would otherwise require hand-crafted or customized features [1]. The development of an autonomous feature set without human interaction has significant advantages in sectors such as health informatics. In medical imaging, for example, it might be complex and difficult to describe features using descriptive methods. Implicit traits could be used to identify fibroids and polyps, as well as anomalies in tissue morphology like tumors. Such traits may also be used in translational bioinformatics to determine nucleotide sequences that potentially bind strongly [2]. Several architectures stand out among the numerous methodological versions of deep learning.
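The idea of stacked non-linear layers mapping raw inputs to progressively lower-dimensional, more abstract representations can be sketched in a few lines of NumPy. This is a minimal illustration, not code from the chapter; the layer sizes, ReLU activation, and random weights are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    # Simple non-linearity applied at every hidden layer
    return np.maximum(0.0, x)

def forward(x, weights, biases):
    """Propagate x through the hidden layers; each layer applies a
    non-linear transform of the previous layer's representation."""
    h = x
    for W, b in zip(weights, biases):
        h = relu(h @ W + b)
    return h

# A raw input of 64 features, compressed through 32- and 8-unit hidden layers.
sizes = [64, 32, 8]
weights = [rng.normal(0, 0.1, (sizes[i], sizes[i + 1]))
           for i in range(len(sizes) - 1)]
biases = [np.zeros(sizes[i + 1]) for i in range(len(sizes) - 1)]

x = rng.normal(size=(1, 64))      # one raw sample
h = forward(x, weights, biases)   # abstract 8-dimensional representation
print(h.shape)                    # (1, 8)
```

In a trained network the weights are learned from data rather than random, so the final 8-dimensional vector becomes the "automatic feature set" the text describes.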
Since 2010, the number of papers using deep learning methods has increased. A Convolutional Neural Network (CNN), for instance, has an interleaved sequence of feedforward layers that employ convolutional filters, followed by reduction, rectification, or pooling layers. Each network layer generates a high-level abstract characteristic [3]. This physiologically inspired architecture processes visual information through receptive fields, much as the visual cortex does. Deep Belief Networks (DBNs), stacked Auto-encoders acting as deep Auto-encoders, artificial NNs extended with many layers as Deep Neural Nets (DNNs), and artificial NNs extended with directed cycles as Recurrent Neural Nets (RNNs) are all possible architectures for deep learning. The latest developments in graphics processing units (GPUs) have also had a substantial impact on deep learning's practical adoption and acceleration. Many of the theoretical notions that underlie deep learning were proposed before the advent of GPUs, albeit they have only recently gained traction [4].
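The CNN building blocks just mentioned (a convolutional filter, followed by rectification and pooling) can be sketched in plain NumPy. This is an illustrative toy example, not from the chapter; the 8x8 image, the hand-picked edge kernel, and the layer sizes are assumptions made for demonstration.

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2-D convolution (cross-correlation) of a single-channel image."""
    H, W = image.shape
    kH, kW = kernel.shape
    out = np.zeros((H - kH + 1, W - kW + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kH, j:j + kW] * kernel)
    return out

def max_pool(fmap, size=2):
    """Non-overlapping max pooling: keep the strongest response per window."""
    H, W = fmap.shape
    H2, W2 = H // size, W // size
    return fmap[:H2 * size, :W2 * size].reshape(H2, size, W2, size).max(axis=(1, 3))

def relu(x):
    return np.maximum(0.0, x)

# A toy 8x8 "scan" with a bright region on the right, producing a vertical edge.
image = np.zeros((8, 8))
image[:, 4:] = 1.0
kernel = np.array([[-1.0, 1.0],
                   [-1.0, 1.0]])   # responds to left-to-right intensity increases

feature_map = max_pool(relu(conv2d(image, kernel)))
print(feature_map.shape)           # (3, 3)
```

The pooled map is strongest exactly where the edge lies; in a real CNN many such kernels are learned from data and stacked over several layers, each producing a more abstract characteristic.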
Healthcare is entering a new era in which vast biomedical data is becoming increasingly crucial. The abundance of biomedical data presents both opportunities and obstacles for healthcare research. In particular, exploring the relationships between the many pieces of information in these data sets is a major difficulty in developing a credible medical tool based on machine learning and data-driven approaches. Previous research has attempted to achieve this goal by linking heterogeneous data sources to derive information from data clusters. Analytical tools based on machine learning techniques are still not widespread in the medical field, even though existing models show significant promise. Indeed, the sparsity, variability, temporal interdependence, and irregularity of biomedical data make this an important open issue. The different medical ontologies used in the data introduce new challenges as well [5]. In biomedical research, ad hoc feature selection by domain experts is a frequent technique. The supervised specification of the feature space, on the other hand, scales poorly and misses out on chances for new pattern discovery. Representation learning methodologies, by contrast, allow the representations needed for prediction to be learned automatically from data sets. Such models are algorithms with several representation levels, made up of simple but non-linear modules that successively transform the representation at one level, starting from the raw input data, into a slightly more abstract representation at the next level. Deep learning models have performed well and shown considerable promise in computer vision, audio recognition, and natural language processing applications.
Deep learning presents intriguing potential for biomedical informatics, given its established efficacy in several areas and the rapid growth of methodological advancements. DL approaches are already being used, or are being considered for use, in health care [4]. On the other hand, deep learning technologies have not yet been evaluated on medical problems thoroughly enough to establish their accomplishments. Deep learning has various attributes that could be beneficial in health care, such as improved performance, an end-to-end learning scheme with integrated feature learning, and the ability to handle complicated and multi-modal data. To accelerate these efforts, deep learning researchers must clarify several problems associated with the characteristics of patient records, and there is a need for enhanced models and strategies that also allow transfer learning, so that models can connect to clinical information via decision-support frameworks in the clinic [5]. This article stresses the essential components that will have a significant effect on healthcare, rather than giving a full background on technological aspects or a broad survey of deep learning applications. We concentrate solely on biomedical data, including clinical images, EHRs, genomics, and data from medical equipment. Other data sources are also useful for patient health monitoring, but deep
Singh et al.
learning has yet to be widely applied in these areas. As a result, we briefly present the basics of deep learning and its medical applications, and examine the problems, prospects, and uses of these methods in medicine and next-generation health care [6].
A Framework of Deep Learning
Deep learning is an artificial intelligence technology that can discover associations in data without requiring them to be defined beforehand. Its main attraction is the capacity to build predictive models without strong assumptions about the underlying mechanisms, which are often unclear or inadequately characterized. The single most important modification over traditional techniques is the ability to learn the required features directly from raw data: DL differs from traditional machine learning precisely in how it obtains the required information from the raw data [7]. DL, in fact, permits computational models made up of many intermediate layers, forming neural networks that learn several degrees of abstraction for information representation [8]. Traditional ANNs, on the other hand, typically have three layers, are trained with supervision solely for the task at hand, and are rarely generalizable. In a deep learning system, by contrast, each layer optimizes a local unsupervised criterion to build a representation of the observations it receives as input from the layer below. Deep neural networks use this layer-by-layer unsupervised method to initialize the weights of successive hidden layers, learning generalizable "deep structures" and their representations. These representations are then fed into a supervised layer, and the entire network is fine-tuned with backpropagation to optimize the final objective [9].
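The layer-by-layer abstraction just described can be sketched in a few lines of NumPy. The layer sizes, random weights, and the "patient feature vector" below are purely illustrative assumptions, not taken from the text:

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(0.0, x)

def forward(x, layers):
    """Pass an input through a stack of (W, b) layers; each layer
    yields a progressively more abstract representation."""
    reps = [x]
    for W, b in layers:
        x = relu(x @ W + b)
        reps.append(x)
    return reps

# hypothetical raw input: 8 clinical measurements for one patient
x = rng.normal(size=(1, 8))
layers = [(rng.normal(size=(8, 16)), np.zeros(16)),   # low-level features
          (rng.normal(size=(16, 4)), np.zeros(4))]    # more abstract features
reps = forward(x, layers)
print([r.shape for r in reps])  # [(1, 8), (1, 16), (1, 4)]
```

In a real system the weights would be learned (by layer-wise pre-training and/or backpropagation) rather than drawn at random; the sketch only shows how each layer re-represents the output of the layer below.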
The unsupervised pre-training breakthrough, new ways to avoid overfitting, and the use of general-purpose graphics processing units to speed up calculations made it possible to develop high-level components for quickly assembling neural networks that solve different tasks and establish the state of the art [10]. In practice, DL has proven effective at uncovering subtle structures and has achieved outstanding results in image and object detection and in natural language translation and generation. Comparable clinical-ready successes in healthcare could be achieved through a new generation of deep learning-based smart solutions for genuine medical care [11].
LITERATURE REVIEW
Deep learning's application to medicine is new and has not been thoroughly investigated. This chapter reviews some of the most important recent literature
on deep model applications. The publications cited in this literature review highlight the types of neural networks and medical data that were taken into account (Table 1). To our knowledge, no research has used deep learning on all of these data sets, or on a combined subset of them, for medical data analysis and predictive modeling. Many exploratory studies assessed the combined use of genomes and EHRs, but they did not use deep learning and were therefore not included in this review. Table 2 lists the deep learning architectures most commonly used in the healthcare industry and explains the basic concepts that underpin their construction.
E-Health Records by Deep Learning
Deep learning (DL) has lately been used to handle aggregated EHR data, which includes both structured elements (e.g., diagnoses, prescriptions) and unstructured elements. The majority of this literature applies a deep architecture to a specific clinical task within a health care system. A frequent approach is to demonstrate that deep learning outperforms traditional machine learning models on standard metrics. While most articles present end-to-end supervised networks in this situation, unsupervised models are also proposed in multiple papers [12]. Deep learning has been utilized in much research to predict disease conditions. Liu et al. [13] reported that a four-layer network outperformed baselines in predicting serious heart failure and serious chronic diseases. A deep dynamic end-to-end network based on RNNs was used to infer current disease conditions and project the patient's medical future. The authors also advocated using a decay effect on the LSTM unit to manage irregularly timed events, which are difficult to handle in longitudinal EHRs. DeepCare was tested on diabetes and mental health patient cohorts for disease progression modeling, intervention recommendation, and future risk prediction.
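As a rough sketch of two ideas just mentioned — encoding EHR visits for a sequence model, and down-weighting irregularly spaced events with a decay effect — the following represents each visit as a multi-hot vector over diagnosis codes. The code vocabulary, decay rate `gamma`, and time stamps are all hypothetical:

```python
import numpy as np

# hypothetical vocabulary of ICD-10 diagnosis codes
vocab = {"E11.9": 0, "I50.9": 1, "F32.9": 2, "N18.3": 3}

def visits_to_multihot(visits, vocab):
    """Encode each visit (a set of codes) as a multi-hot vector,
    a common input representation for EHR sequence models."""
    X = np.zeros((len(visits), len(vocab)))
    for t, codes in enumerate(visits):
        for c in codes:
            X[t, vocab[c]] = 1.0
    return X

def decay(X, days_since, gamma=0.01):
    """Attenuate each visit by the time elapsed since it occurred,
    mimicking the decay effect used to handle irregular events."""
    return X * np.exp(-gamma * np.asarray(days_since))[:, None]

patient = [{"E11.9"}, {"E11.9", "I50.9"}, {"I50.9", "N18.3"}]
X = visits_to_multihot(patient, vocab)
Xd = decay(X, days_since=[400, 30, 0])
print(X.shape)             # (3, 4)
print(Xd[2, 1], Xd[2, 3])  # most recent visit is unattenuated: 1.0 1.0
```

In the papers above the decay is learned inside the LSTM gates rather than applied as a fixed preprocessing step; this sketch only illustrates the underlying intuition.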
It utilized RNNs with gated recurrent units (GRUs) to create a model that uses patient history to anticipate future diagnoses and treatments. Deep learning has also been applied to continuous-time information, such as laboratory findings, to identify specific phenotypes automatically. Lipton et al. [14] utilized RNNs with LSTM units, training a model on 13 commonly gathered clinical parameters from patients in a pediatric ICU to classify 128 illnesses. The results outperformed numerous strong baselines, including a multilayer perceptron trained on hand-engineered features, by a large margin. Che et al. [15] employed SDAs regularized with prior knowledge based on ICD-9 codes. Lasko et al. [16] employed a two-layer stacked AE (without regularization) to model sequences of serum uric acid readings. Razavian et al. [17] used LSTM
units in different networks to predict disease onset in patients from laboratory tests, and found that they performed better than logistic regression with hand-engineered clinical features [18]. EHRs have also been processed with deep neural models trained to embed medical concepts, including diseases, drugs, and tests, which can then be utilized for analysis and prediction [19]. Tran et al. [20] employed RBMs to predict suicide risk in a population of 7,578 mental health patients by learning abstractions of ICD-10 codes. A network model based on RNNs also showed promise in extracting patient data from medical records, allowing an unlimited number of patient summaries to be processed automatically. Predicting unplanned readmissions of patients after discharge has also recently received a lot of interest [21]. Nguyen et al. [22] presented Deepr, a CNN-based methodology for stratifying medical risks by analyzing and combining clinical cues in EHRs. Deepr enables the investigation of important and readable medical patterns, and its efficacy in predicting readmission over a short period was excellent.
Medical Images by Deep Learning
Following the breakthrough of deep learning in computer vision, the earliest clinical applications were in image analysis, notably the study of brain Magnetic Resonance Imaging (MRI) scans to predict various stages of Alzheimer's disease [23, 24]. CNNs have been used to infer multiple layers of knowledge from knee MRI scans to automatically segment tissues and help assess the risk of osteoarthritis [25]. This method outperformed a state-of-the-art method that used manually selected 3D multi-scale features, despite operating on 2D images. Deep learning has also been used to discriminate multiple sclerosis lesions in multi-channel 3D MRIs [26].
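At the core of these imaging models is convolution followed by pooling. A minimal NumPy sketch — with a toy 6×6 "image" and a simple averaging filter, both purely illustrative — looks like this:

```python
import numpy as np

def conv2d(img, kernel):
    """'Valid' 2-D convolution (strictly, cross-correlation, as in
    most deep-learning libraries): slide the kernel over the image."""
    kh, kw = kernel.shape
    H, W = img.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i+kh, j:j+kw] * kernel)
    return out

def max_pool(x, size=2):
    """Non-overlapping max pooling: keep the strongest response
    in each size×size block (the sub-sampling step)."""
    H, W = (x.shape[0] // size) * size, (x.shape[1] // size) * size
    x = x[:H, :W].reshape(H // size, size, W // size, size)
    return x.max(axis=(1, 3))

img = np.arange(36, dtype=float).reshape(6, 6)
feat = conv2d(img, np.ones((3, 3)) / 9.0)   # a simple 3x3 averaging filter
pooled = max_pool(feat)
print(feat.shape, pooled.shape)  # (4, 4) (2, 2)
```

A trained CNN learns many such filters per layer and stacks these operations; segmentation networks additionally classify each pixel or patch rather than the whole image.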
More recently, researchers have diagnosed diabetic retinopathy in retinal photographs, achieving high sensitivity and specificity against certified ophthalmologist annotations over roughly 10,000 test photos [27]. In classifying clinical images of different forms of skin cancer, a CNN trained on a large data set of more than 100,000 images matched the performance of 21 board-certified dermatologists (on around 2,000 biopsy-labeled test images) [28].
Genomics by Deep Learning
Deep learning is used in high-throughput biology to capture the internal structure of ever-increasing datasets. Deep models allow for the discovery of high-level characteristics, resulting in improved performance over traditional models, increased interpretability, and better knowledge of the structure of biological data.
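Genomic CNNs consume raw DNA sequence as a one-hot matrix and scan learned motif filters along it. A toy sketch — the sequence and the hand-built "ACG" filter are illustrative, not a trained model — shows the core operation:

```python
import numpy as np

BASES = "ACGT"

def one_hot(seq):
    """One-hot encode a DNA sequence into a (length, 4) matrix."""
    X = np.zeros((len(seq), 4))
    for i, b in enumerate(seq):
        X[i, BASES.index(b)] = 1.0
    return X

def scan(X, motif):
    """Correlate a (k, 4) motif filter along the sequence — the core
    operation of a 1-D convolutional layer (one filter, no padding)."""
    k = motif.shape[0]
    return np.array([np.sum(X[i:i+k] * motif) for i in range(len(X) - k + 1)])

X = one_hot("ACGTACGT")
motif = one_hot("ACG")          # toy filter that matches 'ACG' exactly
scores = scan(X, motif)
print(scores.max())             # 3.0 — a perfect 3-mer match
print(int(scores.argmax()))     # 0 — first match position
```

Models such as DeepBind learn many such filters from data instead of specifying them by hand, and stack pooling and fully connected layers on top of the motif scores.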
Table 1. Literature review summary of deep learning architectures applied and clinical diseases.

| Approach | Year | Application Use | Model Type | References |
|---|---|---|---|---|
| E-health records | 2015 | Prediction-based framework for congestive heart failure | CNN | Liu et al. [12] |
| E-health records | 2015 | Diagnosis and clinical measurements of patients from ICU data | LSTM RNN | Lipton et al. [14] |
| E-health records | 2016 | Memory model (MM) of patient history for predictive medicine | LSTM RNN | Pham et al. [19] |
| E-health records | 2016 | Predict future clinical events from unsupervised patient history | Stacked Denoising AE | Miotto et al. [29, 30] |
| E-health records | 2014 | Automatic diagnosis of patients from their history | RBM | Liang et al. [18] |
| E-health records | 2015 | Time-series clinical physiological patterns | Stacked AE | Che et al. [23] |
| E-health records | 2013 | Uric acid measurements to suggest multiple disease subtypes | Stacked AE | Lasko et al. [16] |
| E-health records | 2017 | Patient history used to define medications | GRU RNN | Choi et al. [1] |
| E-health records | 2016 | Lab tests to predict disease | LSTM RNN | Razavian et al. [17] |
| Medical imaging | 2014 | Alzheimer's disease predicted from different MRIs | Sparse AE | Liu et al. [13] |
| Medical imaging | 2013 | Detecting modes of variation of Alzheimer's disease | RBM | Brosch et al. [25] |
| Medical imaging | 2013 | Automatic MRI segmentation for osteoarthritis | CNN | Prasoon et al. [26] |
| Medical imaging | 2014 | Multiple-channel 3-dimensional MRIs | RBM | Yoo et al. [11] |
| Medical imaging | 2016 | Ultrasound images for breast cancer diagnosis | Stacked Denoising AE | Cheng et al. [27] |
| Medical imaging | 2016 | Diabetic retinopathy diagnosed from retinal photographs | CNN | Gulshan et al. [28] |
| Medical imaging | 2017 | Detecting skin cancer at dermatologist level | CNN | Esteva et al. [29] |
| Genomics | 2015 | DNA sequence used to predict chromatin | CNN | Zhou et al. [32] |
| Genomics | 2016 | Multiple-cell prediction in an open-source platform | CNN | Kelley et al. [36] |
| Genomics | 2015 | Determining RNA- and DNA-binding proteins | CNN | Alipanahi et al. [35] |
| Genomics | 2016 | Estimating different chromatin marks | CNN | Koh et al. [37] |
| Genomics | 2013 | Cancer classification from gene expressions | Stacked Sparse AE | Fakoor et al. [39] |
| Mobile | 2016 | Detecting freezing of gait in patients | CNN/RNN | Hammerla et al. [43] |
| Mobile | 2015 | Estimation from wearable sensors | CNN | Zhu et al. [44] |
| Mobile | 2016 | Health monitoring from photoplethysmography signals | RBM | Jindal et al. [40] |
| Mobile | 2016 | Sleep quality predicted from physical activity during awake time | CNN | Sathyanarayana et al. [45] |
In the literature, various works have been proposed; we go through the main points here and refer readers to [29 - 32] for more in-depth discussions. The initial neural network applications in genomics used deep architectures to substitute for traditional machine learning without modifying the input features. To forecast the splicing activity of specific exons, Xiong et al. [33] used a fully connected feed-forward NN. Compared to simpler approaches, this method achieved greater prediction accuracy of splicing activity and was able to discover uncommon mutations associated with splicing misregulation. Recent research has applied CNNs directly to DNA sequences, eliminating the requirement to engineer features beforehand. Because CNNs share parameters between regions, they need far fewer parameters than a fully connected network; this improved the discovery of important patterns by allowing the models to be trained on larger DNA sequence windows [34]. Alipanahi et al. [35], for example, introduced DeepBind, a deep architecture based on CNNs that predicts the specificities of RNA- and DNA-binding proteins. DeepBind enables the reconstruction of both novel and known motif sequences, evaluates the impact of variants, and finds functional single-nucleotide variants (SNVs). Kelley et al. [36] created Basset, employing CNNs to predict DNA methylation states in single-cell bisulfite sequencing investigations, while Koh et al. [37, 38] utilized CNNs to de-noise genome-wide chromatin immunoprecipitation followed by sequencing (ChIP-seq) data, which helps obtain an accurate estimation of many chromatin marks.
Use of Mobiles by Deep Learning
Sensing cell phones and gadgets are revolutionizing mobile apps on a large scale, including health tracking [39]. As the line between consumer health wearables and medical devices blurs, a single wearable gadget can monitor a whole range of risk factors.
These devices can offer patients direct personal and medical analytics that can enhance their wellness, enable routine care, and help with chronic illness management [40]. Deep learning is seen as a critical component in the analysis of this new form of data. However, because of hardware limitations, only a few recent efforts in the healthcare sensing arena
have used deep models. Running deep models on a mobile device, which must also handle noisy and complicated contextual information, remains a difficult effort [41]. Several investigations have looked for ways to get around such hardware constraints. Lane and Georgiev [42] suggested a low-power neural network inference engine that exploits the mobile device's Central Processing Unit and Digital Signal Processor without overburdening the hardware. It is a type of software accelerator capable of decreasing the device requirements of deep learning, which is currently the bottleneck for mobile adoption.

Table 2. Review table of deep learning health care domain architectures.

| Technology | Use | Descriptions |
|---|---|---|
| RBM | Diagnosis, monitoring | An RBM is a generative model that learns a probability distribution over a set of inputs. RBMs are Boltzmann machines whose neurons are required to form a bipartite graph: connections run between nodes of the two main kinds (visible and hidden), but none exist between nodes in the same group. |
| RNN | Predict medication | RNNs are effective for processing data streams. They consist of a single network that performs the same task for each element of a sequence, and the output at each step depends on the preceding calculations. |
| CNN | Predict multiple diseases | The arrangement of the cat's visual system inspired the CNN. To provide translation-invariant descriptors, CNNs use local connections and feature pooling with weights tied across units. The basic CNN design uses one convolutional layer followed by a fully connected layer, usually for supervised prediction. |
| AE | Predict diseases like cancer | The AE is an unsupervised framework in which the input and target values are the same. An encoder converts the input into a latent representation, which a decoder then uses to rebuild the input from this depiction. By restricting the representation, it is possible to find meaningful patterns in the data (and hence in the output). |
On the other hand, some data from phones and medical monitors has been processed with deep models. Human Activity Recognition (HAR), in particular, has been the subject of significant deep-learning investigations, although many studies do not directly target clinical predictions. HAR can also be used in therapeutic settings. Hammerla et al. [43] addressed freezing of gait, a typical motor consequence of Parkinson's disease in which patients have difficulty initiating movements such as walking; using accelerometer data, their deep model outperformed all other approaches, including CNNs. Despite the limited data set, this shows that deep learning can be used for activity recognition in clinical settings. Using data from a triaxial accelerometer and a cadence sensor during ambulatory activities, Zhu et al. [44] were able to forecast Energy Expenditure (EE) with encouraging results.
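Before any of these models see the accelerometer stream, it is typically segmented into fixed-width, overlapping windows. A sketch of that preprocessing step, with invented window and step sizes:

```python
import numpy as np

def windows(signal, width, step):
    """Segment a multi-channel sensor stream into fixed-width,
    overlapping windows — the standard preprocessing step before a
    CNN/RNN activity-recognition model."""
    n = signal.shape[0]
    return np.stack([signal[s:s + width]
                     for s in range(0, n - width + 1, step)])

# 100 samples of a triaxial (x/y/z) accelerometer, simulated here
stream = np.random.default_rng(1).normal(size=(100, 3))
W = windows(stream, width=20, step=10)
print(W.shape)  # (9, 20, 3): 9 half-overlapping windows of 20 samples
```

Each window then becomes one training example, labeled with the activity performed during that interval.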
Deep learning, specifically CNNs and RBMs, has outperformed traditional machine learning in interpreting portable neurophysiological signals such as electroencephalograms, photoplethysmography, and local field potentials. Sathyanarayana et al. [45] used actigraphy data on patients' physical activity during awake time, with an experimental dataset of 92 adolescents. The researchers reported that the deep model obtained the highest specificity and sensitivity, improving on the regression model by a reported 46 percent.
FROM PERCEPTRON TO DEEP LEARNING
The perceptron is one of the earliest bio-inspired neural network algorithms proposed [46]. It mathematically models how biological neurons work. The brain processes information through the interconnection of billions of neurons: each neuron rests until input from interconnected neurons drives its voltage across a threshold, at which point an action potential is generated. These potentials allow neurons to excite or inhibit other interconnected neurons, and biological networks thereby transmit and process encoded information. Biological neural networks learn by modifying these connection strengths and by forming new connections. Perceptrons mimic this biochemical process through an activation function and a set of weights connecting an input layer directly to an output node. By adjusting these weights, a perceptron can learn to classify linearly separable patterns. To solve more complicated problems, neural networks with one or more hidden layers of perceptrons were introduced [47]. The Delta rule is used to train such networks, adjusting the weights according to the inputs presented to the network.
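The perceptron update just described — adjust the weights by the error between the desired and the produced output — fits in a few lines. The toy task (logical AND) and learning rate are illustrative:

```python
import numpy as np

def train_perceptron(X, y, epochs=20, lr=0.1):
    """Classic perceptron: weights change only when the prediction
    is wrong, by lr * (target - prediction) * input."""
    w = np.zeros(X.shape[1] + 1)                # weights + bias term
    Xb = np.hstack([X, np.ones((len(X), 1))])   # append constant 1 input
    for _ in range(epochs):
        for xi, yi in zip(Xb, y):
            pred = 1 if xi @ w > 0 else 0
            w += lr * (yi - pred) * xi          # update only on mistakes
    return w

# linearly separable toy data: logical AND
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 0, 0, 1])
w = train_perceptron(X, y)
preds = [(1 if np.append(x, 1.0) @ w > 0 else 0) for x in X]
print(preds)  # [0, 0, 0, 1]
```

Because AND is linearly separable, the perceptron convergence theorem guarantees this loop finds a separating weight vector; XOR, which is not separable, is the classic case requiring the hidden layers discussed next.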
This rule is implemented through the backpropagation procedure and is used during supervised training of NNs [48]. Network weights are initialized to random values, without prior knowledge, and are then adjusted recursively during training to minimize the gap between the desired output and the network output. To search the error surface for a minimum, the most common iterative training methods rely on activation functions that make the surface differentiable. Hidden layers capture nonlinear relationships, and more complex hypotheses can be expressed by deepening the architecture with more hidden layers; such NNs are called deep neural networks. Training deep neural networks is non-trivial, however, and naive backpropagation results in slow learning, so advanced backpropagation algorithms are used to solve this problem [49]. Training of deep neural networks is
done by supervised and unsupervised learning techniques. In supervised learning, labeled data is used to train the DNN, and the weights are trained to predict the target value. In unsupervised learning, training is based on unlabeled data and is mainly used for feature extraction, clustering, or dimensionality reduction. Extracting the most relevant features can be combined with an initial training procedure, with the resulting features then used by supervised learning for classification. Fig. (1) shows the different DL architectures, which are detailed in the following sections: Recurrent Neural Network (RNN), Convolutional Neural Network (CNN), Boltzmann Machine Technique, Auto-Encoder and Deep Auto-Encoder, and Hardware/Software-based implementation.
Fig. (1). Deep Learning Architectures.
Recurrent Neural Network (RNN)
An RNN is a type of neural network in which recurrent hidden-state connections make the system capable of analyzing sequential data. It is used in many applications, such as DNA sequences, speech recognition, and text analysis [50]. Information from previous steps is maintained in the representation, capturing strong interdependencies among the training examples. To produce the output for new data, an RNN utilizes two input sources, the past and the present, so the RNN is said to have memory. Training can, however, suffer from vanishing and exploding gradients [51]. This problem is addressed by the work proposed in [52], which introduced long short-term memory (LSTM). To minimize classification errors, an LSTM network learns what type of data to store and allows the writing and reading of that data during training.
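The "memory" and weight reuse of an RNN are visible in a minimal forward pass; the dimensions and random weights below are illustrative only:

```python
import numpy as np

def rnn_forward(xs, Wx, Wh, b):
    """Vanilla RNN: the same weights (Wx, Wh, b) are reused at every
    time step, and the hidden state h carries memory of past inputs."""
    h = np.zeros(Wh.shape[0])
    hs = []
    for x in xs:                          # one step per sequence element
        h = np.tanh(x @ Wx + h @ Wh + b)  # mixes present input and past state
        hs.append(h)
    return np.stack(hs)

rng = np.random.default_rng(0)
xs = rng.normal(size=(5, 3))              # 5 time steps, 3 features each
Wx = rng.normal(size=(3, 4))
Wh = rng.normal(size=(4, 4))
hs = rnn_forward(xs, Wx, Wh, np.zeros(4))
print(hs.shape)  # (5, 4): one hidden state per time step
```

An LSTM replaces the single `tanh` update with gated cell-state updates, which is what lets it decide what to write, keep, and read over long sequences.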
An RNN or LSTM shares the same weights at every step, whereas other DNN algorithms use different weights at each layer. This weight sharing reduces the number of parameters and makes the network easier to train. RNN algorithms show great results in many applications, such as language processing, bioinformatics, and generating image descriptions.
Convolutional Neural Network (CNN)
Compared with other DNN algorithms, convolutional neural networks have the advantage of working well on multidimensional input such as images, and they solve the main problem of having too many parameters [53]. The algorithm's name derives from the convolution filters used in image filtering, which perform complex operations [54]. A CNN follows these steps:
1. The input image is convolved with several filters.
2. The output of the previous step is sub-sampled.
3. The output of step 2 is taken as new input, and the above steps are iterated until the extraction of high-level features is done.
After the last sub-sampling layer, a CNN appends one or more fully connected layers to convert the 2D feature maps into a 1D vector for classification. These fully connected layers contain the majority of a CNN's parameters, increasing the training effort. To address this issue, various architectures have been proposed in the literature, among them GoogLeNet [55], Clarifai [56], VGG [57], and AlexNet [58]. One more deep learning technique, the Convolutional Deep Belief Network (CDBN), has also been introduced [59]. A CDBN is trained similarly to a DBN, but its structure is similar to that of a CNN.
Boltzmann Machine Technique
The restricted Boltzmann machine was first proposed in [60]. These neural networks are used to model probabilistic relationships between variables and are built from probabilistic units with a particular distribution.
In this technique, the error is minimized by slowly adjusting the weights, and this learning procedure relies on Gibbs sampling. A Bayesian network is a particular type of network in which probabilistic units characterize the independence relations, in the form of an acyclic graph, between
variables [61]. To implement an efficient training algorithm, the hidden and visible layers of the restricted Boltzmann machine are constrained to form a bipartite graph. The literature [62] explains that Contrastive Divergence (CD) is the algorithm mainly used to train RBMs. CD is an unsupervised, two-phase algorithm with a positive and a negative phase: during the positive phase the network is driven by the training data, whereas during the negative phase it tries to recreate the data from the model's current configuration.
Auto-Encoder and Deep Auto-Encoder
An auto-encoder is a type of neural network designed to extract features through data-driven learning. It has an equal number of input and output nodes, and it is trained to recreate the input vector rather than to assign class labels; thus, it is an unsupervised learning technique. The number of nodes in the hidden layers is smaller than in the input/output layers, so that encoding the data extracts the most discriminative features in a lower-dimensional space. Deep auto-encoders also face the vanishing-gradient issue, whereby the network merely learns to reproduce the average of the training data. The solution to this problem is to initialize the weights over the network so that it starts from a good approximation of the final configuration; finding these initial weights means pre-training, i.e., training each layer separately. After this step, the parameters are fine-tuned by standard backpropagation to achieve the best or desired performance. To obtain representations that are more robust to variations of the input pattern, several auto-encoder variants have been proposed.
Hardware/Software-Based Implementation
Some companies, such as Wolfram Mathematica [63] and Nervana Systems [64], provide cloud-based services.
Many possible future hardware solutions, such as neuromorphic electronic systems, are used for neuroscience simulation; this future hardware implements synapses and artificial neurons on a chip [65]. Hardware designs in use today include IBM TrueNorth, SpiNNaker [66], Intel Curie, and NuPIC.
DEEP LEARNING IN HEALTHCARE: FUTURE SCOPE, LIMITATIONS, AND CHALLENGES
Compared with traditional machine learning algorithms, deep learning techniques are impressive for prediction tasks in the healthcare sector and in medical applications generally. Nevertheless, DL algorithms have not given a complete solution for every problem, and some
questions remain unanswered. The following aspects summarise the issues with deep learning:
1. Many researchers use deep learning techniques as a black box: there is no way to inspect or modify the model when a misclassification occurs.
2. It is well known that a large training set is required to train a reliable and effective model. A great deal of healthcare data is now available from many organizations, ranging from paper medical records to electronic records, but disease-specific data is limited, so deep learning is poorly suited to rare diseases. A common problem during the training of a DNN is overfitting, which mainly occurs when the number of points in the training set is comparable to the number of parameters used; such a network cannot generalize to new samples but merely memorizes the training examples. Regularization methods [90] are used to avoid the overfitting problem.
3. With DNN algorithms, raw input data often cannot be used directly, so before training we require steps such as preprocessing, normalization, and input-domain changes. Finding the correct preprocessing of the data is a challenging task, and it makes the training process take comparatively more time.
4. Another important challenge for DNNs is that small, carefully crafted changes to the input data can easily cause it to be misclassified; note that every machine learning algorithm is susceptible to this issue.
5. Deep learning algorithms work well on applications with huge amounts of data, like speech recognition. Healthcare is a different domain, where we do not have enough data for the comprehensive study of many diseases, since the world population is only approximately 7.5 billion.
Consequently, the number of patients with any given condition is small, so deep learning algorithms cannot give detailed results.
6. Data quality is another issue in healthcare informatics: healthcare data is not clean but noisy and incomplete, so it is challenging to apply deep learning algorithms to such data and obtain good results. Deep learning algorithms that address this data-quality issue are needed (Table 3).
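The regularization mentioned above as a remedy for overfitting can be illustrated by its simplest form, L2 weight decay, which shrinks the weights toward zero on every update; the learning rate and decay coefficient below are arbitrary:

```python
import numpy as np

def sgd_step(w, grad, lr=0.1, weight_decay=0.01):
    """One SGD step with L2 regularization ('weight decay'): the
    penalty term weight_decay * w pulls every weight toward zero,
    discouraging the large weights typical of an overfit network."""
    return w - lr * (grad + weight_decay * w)

w = np.array([1.0, -2.0])
w2 = sgd_step(w, grad=np.zeros(2))   # zero data gradient, decay only
print(w2)  # [ 0.999 -1.998] — weights shrink even with no data signal
```

Other common remedies in the same spirit include dropout and early stopping; all of them trade a little training accuracy for better generalization to new samples.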
Table 3. Different deep learning methods with applications in healthcare.

| Area | Applications | Input Data | Base Method | References |
|---|---|---|---|---|
| Public Health | Air pollutant prediction; disease prediction; population prediction | Text messages; geo-tagged images; social media data | DNN; CNN; Deep Autoencoder | [67 - 72] |
| Medical Imaging | 3-D brain reconstruction; tumor detection | MRI; CT scan; X-ray images | Deep Autoencoder; DNN | [73 - 78] |
| Pervasive Sensing | Hand gesture recognition | Depth camera | CNN | [79] |
| Medical Informatics | Monitoring of human behaviour; prediction of disease; data mining | Medical datasets; electronic health records; lab tests | Deep Autoencoder; CNN; RNN | [80 - 88] |
| Bioinformatics | Cancer diagnosis | Gene expression | Deep Autoencoder | [89] |
It can be concluded that deep learning algorithms and techniques make the diagnosis of diseases faster and smarter in situations where human interpretation is difficult, and they can also decrease uncertainty in the decision-making process. The challenges and limitations above open up various opportunities for later research to enhance the usage of deep learning algorithms, and the points raised here should be kept in mind for future applications of deep learning in healthcare informatics.
CONCLUSION
Deep learning has lately acquired a central position in AI and machine learning. In this chapter, we have illustrated how deep learning has empowered the development of more data-driven solutions in healthcare informatics by permitting the automatic extraction of features, which reduces the amount of human intervention required. This is an advantage for many problems in medical informatics and has ultimately supported a great step forward for unstructured information such as that arising from clinical informatics, bioinformatics, and clinical imaging. Until recently, most applications of deep learning to medical informatics have involved processing medical information as an unstructured source. Nevertheless, a great deal of information is likewise encoded in structured data such as electronic health records, which give an elaborate picture of the patient's history, diagnoses, and results. This information is important for securing a comprehensive viewpoint on a patient's condition and, thereafter, for improving the quality of the resulting inferences. In fact, a strong deep learning model over such combined data could
16 IoT and Big Data Analytics, Vol. 2
Singh et al.
improve the reliability of clinical decision support services. Patient and clinical data are expensive to obtain, and healthy control subjects make up a large portion of a typical health dataset. Deep learning algorithms have commonly been used in applications where the datasets were balanced or, as a workaround, where synthetic data was added to achieve balance. The latter approach introduces a further issue regarding the reliability of the generated data samples. The fastest-growing kinds of data in biomedical research, such as EHRs, imaging, -omics profiles, and monitoring data, are complex, heterogeneous, poorly annotated, and, for the most part, unstructured. Applications of deep learning have been explored on a vast scale in biomedical analysis and have shown promising opportunities to model, represent, and learn from such heterogeneous and complex data. State-of-the-art deep learning approaches still need to improve with respect to data integration, security, interpretability, and temporal modeling before they can be effectively applied in the clinical space. DL can open the path toward the next generation of intelligent healthcare systems, which can scale to include billions of patient records and rely on a single comprehensive patient representation to effectively support clinicians in their daily activities. Deep learning can serve as a guiding principle to organize both hypothesis-driven research and exploratory investigation in clinical domains based on diverse healthcare data.

REFERENCES

[1]
E. Choi, Z. Xu, Y. Li, M. Dusenberry, G. Flores, E. Xue, and A. Dai, "Learning the Graphical Structure of Electronic Health Records with Graph Convolutional Transformer", Proc. Conf. AAAI Artif. Intell., vol. 34, no. 1, pp. 606-613, 2020. [http://dx.doi.org/10.1609/aaai.v34i01.5400]
[2]
B.L.P. Cheung, and D. Dahl, "Deep learning from electronic medical records using attention-based cross-modal convolutional neural networks", Proc. IEEE EMBS Int. Conf. Biomed. Health Inform. (BHI). Las Vegas, NV, USA, pp. 222–225, 2018. [http://dx.doi.org/10.1109/BHI.2018.8333409]
[3]
X. Zeng, Y. Feng, S. Moosavinasab, D. Lin, S. Lin, and C. Liu, "Multilevel self-attention model and its use on medical risk prediction", Pac. Symp. Biocomput., vol. 25, pp. 115-126, 2020. [PMID: 31797591]
[4]
F. Ma, R. Chitta, J. Zhou, Q. You, T. Sun, and J. Gao, "Dipole: Diagnosis prediction in healthcare via attention-based bidirectional recurrent neural networks", Proc. 23rd ACM SIGKDD Int. Conf. Knowl. Discovery Data Mining, pp. 1903-1911, 2017.
[5]
L. Li, W.Y. Cheng, B.S. Glicksberg, O. Gottesman, R. Tamler, R. Chen, E.P. Bottinger, and J.T. Dudley, "Identification of type 2 diabetes subgroups through topological analysis of patient similarity", Sci. Transl. Med., vol. 7, no. 311, 2015. [http://dx.doi.org/10.1126/scitranslmed.aaa9364] [PMID: 26511511]
[6]
M. Gerstung, E. Papaemmanuil, I. Martincorena, L. Bullinger, V.I. Gaidzik, P. Paschka, M. Heuser, F. Thol, N. Bolli, P. Ganly, A. Ganser, U. McDermott, K. Döhner, R.F. Schlenk, H. Döhner, and P.J. Campbell, "Precision oncology for acute myeloid leukemia using a knowledge bank approach", Nat. Genet., vol. 49, no. 3, pp. 332-340, 2017. [http://dx.doi.org/10.1038/ng.3756] [PMID: 28092685]
Deep Learning in Healthcare Industry
[7]
Y. Lecun, L. Bottou, Y. Bengio, and P. Haffner, "Gradient-based learning applied to document recognition", Proc. IEEE, vol. 86, no. 11, pp. 2278-2324, 1998. [http://dx.doi.org/10.1109/5.726791]
[8]
R.J. Williams, and D. Zipser, "A learning algorithm for continually running fully recurrent neural networks", Neural Comput., vol. 1, no. 2, pp. 270-280, 1989. [http://dx.doi.org/10.1162/neco.1989.1.2.270]
[9]
P. Smolensky, Information processing in dynamical systems: Foundations of harmony theory (No. CU-CS-321-86). Colorado Univ at Boulder Dept of Computer Science, 1986.
[10]
G.E. Hinton, and R.R. Salakhutdinov, "Reducing the dimensionality of data with neural networks", Science, vol. 313, no. 5786, pp. 504-507, 2006. [http://dx.doi.org/10.1126/science.1127647] [PMID: 16873662]
[11]
Y. Yoo, T. Brosch, and A. Traboulsee, "Deep learning of image features from unlabeled data for multiple sclerosis lesion segmentation", International Workshop on Machine Learning in Medical Imaging, Boston, MA, USA, pp. 117-124, 2014.
[12]
S. Liu, S. Liu, and W. Cai, "Early diagnosis of Alzheimer’s disease with deep learning", International Symposium on Biomedical Imaging. pp. 1015-18, 2014. [http://dx.doi.org/10.1109/ISBI.2014.6868045]
[13]
C. Liu, F. Wang, and J. Hu, "Risk prediction with electronic health records: a deep learning approach", ACM International Conference on Knowledge Discovery and Data Mining. pp. 705-14, Sydney, NSW, Australia, 2015.
[14]
Z.C. Lipton, D.C. Kale, and C. Elkan, "Learning to diagnose with LSTM recurrent neural networks", International Conference on Learning Representations. pp. 1-18, San Diego, CA, USA, 2015.
[15]
Z. Che, D. Kale, and W. Li, "Deep computational phenotyping", ACM International Conference on Knowledge Discovery and Data Mining. pp. 507-16, Sydney, NSW, Australia, 2015.
[16]
T.A. Lasko, J.C. Denny, and M.A. Levy, "Computational phenotype discovery using unsupervised feature learning over noisy, sparse, and irregular clinical data", PLoS ONE, vol. 8, no. 6, p. e66341, 2013. [http://dx.doi.org/10.1371/journal.pone.0066341]
[17]
N. Razavian, J. Marcus, and D. Sontag, "Multi-task prediction of disease onsets from longitudinal laboratory tests", Proceedings of the 1st Machine Learning for Healthcare Conference, pp. 73–100, 2016.
[18]
Z. Liang, G. Zhang, and J.X. Huang, "Deep learning for healthcare decision making with EMRs", IEEE International Conference on Bioinformatics and Biomedicine pp. 556-9, 2014. [http://dx.doi.org/10.1109/BIBM.2014.6999219]
[19]
T. Pham, T. Tran, and D. Phung, "DeepCare: a deep dynamic memory model for predictive medicine", In Advances in Knowledge Discovery and Data Mining: 20th Pacific-Asia Conference, PAKDD 2016, Springer International Publishing. [http://dx.doi.org/10.1007/978-3-319-31750-2_3]
[20]
T. Tran, T.D. Nguyen, D. Phung, and S. Venkatesh, "Learning vector representation of medical objects via EMR-driven nonnegative restricted Boltzmann machines (eNRBM)", J. Biomed. Inform., vol. 54, pp. 96-105, 2015. [http://dx.doi.org/10.1016/j.jbi.2015.01.012] [PMID: 25661261]
[21]
M. Singh, N. Sukhija, A. Sharma, M. Gupta, and P.K. Aggarwal, "Security and privacy requirements for IoMT-based smart healthcare system", Big Data Analysis for Green Computing., 17–37, 2021. [http://dx.doi.org/10.1201/9781003032328-2]
[22]
P. Nguyen, T. Tran, N. Wickramasinghe, and S. Venkatesh, "Deepr: A Convolutional Net for Medical Records", IEEE J. Biomed. Health Inform., vol. 21, no. 1, pp. 22-30, 2017.
[http://dx.doi.org/10.1109/JBHI.2016.2633963] [PMID: 27913366]
[23]
M.K.K. Leung, A. Delong, B. Alipanahi, and B.J. Frey, "Machine Learning in Genomic Medicine: A Review of Computational Problems and Data Sets", Proc. IEEE, vol. 104, no. 1, pp. 176-197, 2016. [http://dx.doi.org/10.1109/JPROC.2015.2494198]
[24]
C. Angermueller, T. Pärnamaa, L. Parts, and O. Stegle, "Deep learning for computational biology", Mol. Syst. Biol., vol. 12, no. 7, p. 878, 2016. [http://dx.doi.org/10.15252/msb.20156651] [PMID: 27474269]
[25]
T. Brosch, R. Tam, "Manifold learning of brain MRIs by deep learning", Med Image Comput Comput Assist Interv., vol. 16, pp. 633–40, 2013.
[26]
A. Prasoon, K. Petersen, and C. Igel, "Deep feature learning for knee cartilage segmentation using a triplanar convolutional neural network", In Medical Image Computing and Computer-Assisted Intervention–MICCAI, 16th International Conference, Nagoya, Japan, Proceedings, Part II 16, pp. 246-253, 2013. [http://dx.doi.org/10.1007/978-3-642-40763-5_31]
[27]
J.Z. Cheng, D. Ni, Y.H. Chou, J. Qin, C.M. Tiu, Y.C. Chang, C.S. Huang, D. Shen, and C.M. Chen, "Computer-aided diagnosis with deep learning architecture: applications to breast lesions in US images and pulmonary nodules in CT scans", Sci. Rep., vol. 6, no. 1, p. 24454, 2016. [http://dx.doi.org/10.1038/srep24454] [PMID: 27079888]
[28]
V. Gulshan, L. Peng, M. Coram, M.C. Stumpe, D. Wu, A. Narayanaswamy, S. Venugopalan, K. Widner, T. Madams, J. Cuadros, R. Kim, R. Raman, P.C. Nelson, J.L. Mega, and D.R. Webster, "Development and validation of a deep learning algorithm for detection of diabetic retinopathy in retinal fundus photographs", JAMA, vol. 316, no. 22, pp. 2402-2410, 2016. [http://dx.doi.org/10.1001/jama.2016.17216] [PMID: 27898976]
[29]
A. Esteva, B. Kuprel, R.A. Novoa, J. Ko, S.M. Swetter, H.M. Blau, and S. Thrun, "Dermatologist-level classification of skin cancer with deep neural networks", Nature, vol. 542, no. 7639, pp. 115-118, 2017. [http://dx.doi.org/10.1038/nature21056] [PMID: 28117445]
[30]
R. Miotto, L. Li, B.A. Kidd, and J.T. Dudley, "Deep Patient: An Unsupervised Representation to Predict the Future of Patients from the Electronic Health Records", Sci. Rep., vol. 6, no. 1, p. 26094, 2016. [http://dx.doi.org/10.1038/srep26094] [PMID: 27185194]
[31]
E. Choi, M.T. Bahadori, A. Schuetz, "Doctor AI: Predicting clinical events via recurrent neural networks", In Machine learning for healthcare conference, pp. 301-318, PMLR, 2016.
[32]
J. Zhou, and O.G. Troyanskaya, "Predicting effects of noncoding variants with deep learning–based sequence model", Nat. Methods, vol. 12, no. 10, pp. 931-934, 2015. [http://dx.doi.org/10.1038/nmeth.3547] [PMID: 26301843]
[33]
H.Y. Xiong, B. Alipanahi, L.J. Lee, H. Bretschneider, D. Merico, R.K.C. Yuen, Y. Hua, S. Gueroussov, H.S. Najafabadi, T.R. Hughes, Q. Morris, Y. Barash, A.R. Krainer, N. Jojic, S.W. Scherer, B.J. Blencowe, and B.J. Frey, "The human splicing code reveals new insights into the genetic determinants of disease", Science, vol. 347, no. 6218, 2015. [http://dx.doi.org/10.1126/science.1254806] [PMID: 25525159]
[34]
S. Sharma, M. Singh, R. Singh, and R. Prajapati, "ASD screening using machine learning", Int. J. Sci. Res. Management Studies, vol. 5, no. 7, pp. 1-8, 2021.
[35]
B. Alipanahi, A. Delong, M.T. Weirauch, and B.J. Frey, "Predicting the sequence specificities of DNA- and RNA-binding proteins by deep learning", Nat. Biotechnol., vol. 33, no. 8, pp. 831-838, 2015. [http://dx.doi.org/10.1038/nbt.3300] [PMID: 26213851]
[36]
D.R. Kelley, J. Snoek, and J.L. Rinn, "Basset: learning the regulatory code of the accessible genome
with deep convolutional neural networks", Genome Res., vol. 26, no. 7, pp. 990-999, 2016. [http://dx.doi.org/10.1101/gr.200535.115] [PMID: 27197224]
[37]
P.W. Koh, E. Pierson, and A. Kundaje, "Denoising genome-wide histone ChIP-seq with convolutional neural networks", Bioinformatics, vol. 33, no. 14, pp. 225-233, 2016. [http://dx.doi.org/10.1101/052118]
[38]
L. Piwek, D.A. Ellis, S. Andrews, and A. Joinson, "The rise of consumer health wearables: promises and barriers", PLoS Med., vol. 13, no. 2, 2016. [http://dx.doi.org/10.1371/journal.pmed.1001953] [PMID: 26836780]
[39]
R. Fakoor, F. Ladhak, and A. Nazi, "Using deep learning to enhance cancer diagnosis and classification", International Conference on Machine Learning, Atlanta, GA, USA, 2013.
[40]
V. Jindal, J. Birjandtalab, and M.B. Pouyan, "An adaptive deep learning approach for PPG-based identification", 38th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), Orlando, FL, USA, pp. 6401-4, 2016. [http://dx.doi.org/10.1109/EMBC.2016.7592193]
[41]
C.D. Manning, P. Raghavan, and H. Schütze, "Introduction to information retrieval", Cambridge University Press: Cambridge, vol. 3, 2008. [http://dx.doi.org/10.1017/CBO9780511809071]
[42]
N.D. Lane, and P. Georgiev, "Can deep learning revolutionize mobile sensing?", International Workshop on Mobile Computing Systems and Applications. pp. 117-22, Santa Fe, NM, USA, 2015.
[43]
N.Y. Hammerla, S. Halloran, and T. Ploetz, "Deep, convolutional, and recurrent models for human activity recognition using wearables", arXiv:1604.08880, 2016.
[44]
J. Zhu, A. Pande, and P. Mohapatra, "Using deep learning for energy expenditure estimation with wearable sensors", In: 17th International Conference on E-health Networking, Application Services (HealthCom), Cambridge, MA, USA, pp. 501–6, 2015. [http://dx.doi.org/10.1109/HealthCom.2015.7454554]
[45]
A. Sathyanarayana, S. Joty, L. Fernandez-Luque, F. Ofli, J. Srivastava, A. Elmagarmid, T. Arora, and S. Taheri, "Sleep Quality Prediction From Wearable Data Using Deep Learning", JMIR Mhealth Uhealth, vol. 4, no. 4, 2016. [http://dx.doi.org/10.2196/mhealth.6562] [PMID: 27815231]
[46]
S. Dalal, and S. Jain, "Smart mental healthcare systems", J. Web Semant., pp. 153-163, 2021. [http://dx.doi.org/10.1016/B978-0-12-822468-7.00010-9]
[47]
M. Rath, and J. M. Chatterjee, "Exploration of information retrieval approaches with focus on medical information retrieval", Ontology-Based Information Retrieval for Healthcare Systems, pp. 275-291, 2020. [http://dx.doi.org/10.1002/9781119641391.ch13]
[48]
D.E. Rumelhart, G.E. Hinton, and R.J. Williams, in Neurocomputing: Foundations of Research, J.A. Anderson and E. Rosenfeld, Eds. Cambridge, MA, USA: MIT Press, pp. 696-699, 1988.
[49]
J. Ngiam, A. Coates, A. Lahiri, B. Prochnow, Q.V. Le, and A.Y. Ng, "On optimization methods for deep learning", Proc. ICML, pp. 265-272, 2011.
[50]
A. Chakrabarty, and U.S. Das, "Big data analytics in excelling health care: Achievement and challenges in India", Big Data Analytics and Intelligence: A Perspective for Health Care, pp. 55-74, 2020. [http://dx.doi.org/10.1108/978-1-83909-099-820201008]
[51]
H. Nahata, and S. P. Singh, "Deep learning solutions for skin cancer detection and diagnosis", Machine Learning with Health Care Perspective: Machine Learning and Healthcare, pp. 159-182, 2020. [http://dx.doi.org/10.1007/978-3-030-40850-3_8]
[52]
S. Hochreiter, and J. Schmidhuber, "Long short-term memory", Neural Comput., vol. 9, no. 8, pp.
1735-1780, 1997. [http://dx.doi.org/10.1162/neco.1997.9.8.1735] [PMID: 9377276]
[53]
S. Goyal, N. Sharma, B. Bhushan, A. Shankar, and M. Sagayam, "IoT enabled technology in secured healthcare: applications, challenges and future directions", Cognitive Internet of Medical Things for Smart Healthcare. Springer: Cham, pp. 25-48, 2021. [http://dx.doi.org/10.1007/978-3-030-55833-8_2]
[54]
R. Singh, P. Singh. "Smart Nursery with Health Monitoring System Through Integration of IoT and Machine Learning." In Big Data Analytics and Intelligence: A Perspective for Health Care, Emerald Publishing Limited, pp. 93-114, 2020. [http://dx.doi.org/10.1108/978-1-83909-099-820201017]
[55]
C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich, "Going deeper with convolutions", Proc. CVPR, pp. 1-9, 2015.
[56]
M. D. Zeiler, and R. Fergus, “Visualizing and understanding convolutional networks”, ECCV, pp. 818–833, 2014. [http://dx.doi.org/10.1007/978-3-319-10590-1_53]
[57]
K. Simonyan, and A. Zisserman, "Very deep convolutional networks for large-scale image recognition", arXiv:1409.1556, 2014.
[58]
A. Krizhevsky, I. Sutskever, and G. E. Hinton, “Imagenet classification with deep convolutional neural networks”, NIPS, pp. 1097–1105, 2012.
[59]
H. Lee, R. Grosse, R. Ranganath, and A.Y. Ng, "Convolutional deep belief networks for scalable unsupervised learning of hierarchical representations", Proc. ICML pp. 609-616, 2009. [http://dx.doi.org/10.1145/1553374.1553453]
[60]
D. Ackley, G. Hinton, and T. Sejnowski, "Learning and relearning in Boltzmann machines", Parallel Distributed Processing: Explorations in the Microstructure of Cognition, vol. 1, no. 2, pp. 282-317, 1986.
[61]
H. Wang, and D.-Y. Yeung, "Towards bayesian deep learning: A survey”, ArXiv e-prints, 2016.
[62]
M. A. Carreira-Perpinan, and G. Hinton, “On contrastive divergence learning”, AISTATS, vol. 10, pp. 33–40, 2005.
[63]
Wolfram Research, “Wolfram math,” [Online]. Available: https://www.wolfram.com/mathematica/.
[64]
Nervana Systems, “Neon,” [Online]. Available: https://github.com/NervanaSystems/neon.
[65]
NVIDIA corp., “Nvidia dgx-1,” [Online]. Available: http://www.nvidia.com/object/deep-learningsystem.html.
[66]
L. Pastur-Romay, F. Cedrón, A. Pazos, and A. Porto-Pazos, "Deep artificial neural networks and neuromorphic chips for big data analysis: Pharmaceutical and bioinformatics applications", Int. J. Mol. Sci., vol. 17, no. 8, p. 1313, 2016. [http://dx.doi.org/10.3390/ijms17081313] [PMID: 27529225]
[67]
B. Zou, V. Lampos, R. Gorton, and I. J. Cox, “On infectious intestinal disease surveillance using social media content,” DigitalHealth, pp. 157–161, 2016. [http://dx.doi.org/10.1145/2896338.2896372]
[68]
V.R.K. Garimella, A. Alfayad, and I. Weber, "Social media image analysis for public health", in Proc. CHI, New York, NY, USA: ACM, pp. 5543–5547, 2016. [http://dx.doi.org/10.1145/2858036.2858234]
[69]
L. Zhao, J. Chen, F. Chen, W. Wang, C.-T. Lu, and N. Ramakrishnan, "SimNest: Social media nested epidemic simulation via online semi-supervised deep learning", IEEE ICDM, pp. 639-648, 2015.
[70]
E. Horvitz, and D. Mulligan, "Data, privacy, and the greater good", Science, vol. 349, no. 6245, pp. 253-255, 2015.
[http://dx.doi.org/10.1126/science.aac4520] [PMID: 26185242]
[71]
B. Felbo, P. Sundsoy, A. Pentland, S. Lehmann, and Y.A. de Montjoye, “Using deep learning to predict demographics from mobile phone metadata”, 2016.
[72]
B.T. Ong, K. Sugiura, and K. Zettsu, "Dynamically pre-trained deep recurrent neural networks using environmental monitoring data for predicting PM2.5", Neural Comput. Appl., pp. 1-14, 2015. [PMID: 27418719]
[73]
J. Shan, and L. Li, “A deep learning method for microaneurysm detection in fundus images”, IEEE Chase, pp. 357–358, 2016.
[74]
A. Mansoor, J.J. Cerrolaza, R. Idrees, E. Biggs, M.A. Alsharid, R.A. Avery, and M.G. Linguraru, "Deep learning guided partitioned shape model for anterior visual pathway segmentation", IEEE Trans. Med. Imaging, vol. 35, no. 8, pp. 1856-1865, 2016. [http://dx.doi.org/10.1109/TMI.2016.2535222] [PMID: 26930677]
[75]
D. C. Rose, I. Arel, T. P. Karnowski, and V. C. Paquit, "Applying deep layered clustering to mammography image analytics”, BSEC, pp. 1–4, 2010.
[76]
Y. Zhou, and Y. Wei, "Learning hierarchical spectral-spatial features for hyperspectral image classification", IEEE Trans. Cybern., vol. 46, no. 7, pp. 1667-1678, 2016. [http://dx.doi.org/10.1109/TCYB.2015.2453359] [PMID: 26241988]
[77]
J. Lerouge, R. Herault, C. Chatelain, F. Jardin, and R. Modzelewski, "IODA: An input/output deep architecture for image labeling", Pattern Recognit., vol. 48, no. 9, pp. 2847-2858, 2015. [http://dx.doi.org/10.1016/j.patcog.2015.03.017]
[78]
J. Wang, J. D. MacKenzie, R. Ramachandran, and D. Z. Chen, “A deep learning approach for semantic segmentation in histology tissue images”, in MICCAI, pp. 176–184, 2016.
[79]
M. Poggi, and S. Mattoccia, "A wearable mobility aid for the visually impaired based on embedded 3D vision and deep learning", in IEEE ISCC, pp. 208-213, 2016.
[80]
Z. Che, S. Purushotham, R. Khemani, and Y. Liu, “Distilling knowledge from deep networks with applications to healthcare domain”, ArXiv e-prints, 2015.
[81]
Z. Yan, Y. Zhan, Z. Peng, S. Liao, Y. Shinagawa, S. Zhang, D.N. Metaxas, and X.S. Zhou, "Multi-instance deep learning: Discover discriminative local anatomies for bodypart recognition", IEEE Trans. Med. Imaging, vol. 35, no. 5, pp. 1332-1343, 2016. [http://dx.doi.org/10.1109/TMI.2016.2524985] [PMID: 26863652]
[82]
H. Shin, L. Lu, L. Kim, A. Seff, J. Yao, and R.M. Summers, "Interleaved text/image deep mining on a large-scale radiology database for automated image interpretation", CoRR, 2015.
[83]
T. Huang, L. Lan, X. Fang, P. An, J. Min, and F. Wang, "Promises and challenges of big data computing in health sciences", Big Data Research, vol. 2, no. 1, pp. 2-11, 2015. [http://dx.doi.org/10.1016/j.bdr.2015.02.002]
[84]
N. Tajbakhsh, J.Y. Shin, S.R. Gurudu, R.T. Hurst, C.B. Kendall, M.B. Gotway, and J. Liang, "Convolutional neural networks for medical image analysis: Full training or fine tuning?", IEEE Trans. Med. Imaging, vol. 35, no. 5, pp. 1299-1312, 2016. [http://dx.doi.org/10.1109/TMI.2016.2535302] [PMID: 26978662]
[85]
L. Nie, M. Wang, L. Zhang, S. Yan, B. Zhang, and T.S. Chua, "Disease inference from health-related questions via sparse deep learning", IEEE Trans. Knowl. Data Eng., vol. 27, no. 8, pp. 2107-2119, 2015. [http://dx.doi.org/10.1109/TKDE.2015.2399298]
[86]
S. Mehrabi, S. Sohn, D. Li, J.J. Pankratz, T. Therneau, J.L.S. Sauver, H. Liu, and M. Palakal, “Temporal pattern and association discovery of diagnosis codes using deep learning”, ICHI, pp. 408–416, 2015.
[87]
K. Fritscher, P. Raudaschl, P. Zaffino, M. F. Spadea, G. C. Sharp, and R. Schubert, “Deep neural
networks for fast segmentation of 3D medical images", in MICCAI, pp. 158-165, 2016. [http://dx.doi.org/10.1007/978-3-319-46723-8_19]
[88]
W.J. Gordon, and C. Catalini, "Blockchain technology for healthcare: Facilitating the transition to patient-driven interoperability", Comput. Struct. Biotechnol. J., vol. 16, pp. 224-230, 2018.
[89]
N.A. Latha, B.R. Murthy, and U. Sunitha, "Electronic health record", Int. J. Eng., vol. 1, no. 10, pp. 25-27, 2012.
[90]
N. Srivastava, G.E. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov, "Dropout: a simple way to prevent neural networks from overfitting", J. Mach. Learn. Res., vol. 15, no. 1, pp. 1929-1958, 2014.
IoT and Big Data Analytics, 2023, Vol. 2, 23-47
CHAPTER 2
Generative Adversarial Networks for Deep Learning in Healthcare: Architecture, Applications and Challenges

Shankey Garg1,* and Pradeep Singh1

1 Department of Computer Science & Engineering, National Institute of Technology, Raipur, Chhattisgarh, India
Abstract: Deep learning is a new generation of artificial neural networks that has transformed our daily lives and impacted several industries and scientific disciplines in recent years. Recent developments in deep learning provide various significant methods for obtaining end-to-end learning models from complex data. Generative Adversarial Networks introduce the idea of learning deep representations without extensively annotated training data, using a generative modeling approach built on deep learning. The chapter is broadly divided into the following sections: (1) insights into deep learning and generative adversarial networks; (2) GANs' representative variants, training methods, architecture, and mathematical representation; and (3) the efficacy of GANs in different applications. The chapter surveys recent work on GANs and their implementation in different deep learning applications for healthcare, and also analyzes some future perspectives and trends.
Keywords: Deep Learning, Generative Adversarial Networks, Healthcare.

INTRODUCTION

Artificial Neural Networks (ANNs) were designed to imitate the biological nervous system in specially designed hardware and software and are now among the most popular methods in computational intelligence. For 70 years now, ANNs have held the attention of researchers, and they continue to do so. Multi-layer networks of neurons became the most widely used form at the end of the previous century. The reasons for their resurgence include the availability of huge training datasets containing high-quality labels, the emergence of multi-core, multi-threaded implementations, advancements in parallel computing capabilities, etc. [1].

* Corresponding author Shankey Garg: Department of Computer Science & Engineering, National Institute of Technology, Raipur, Chhattisgarh, India; E-mail: [email protected]
Parma Nand, Vishal Jain, Dac-Nhuong Le, Jyotir Moy Chatterjee, Ramani Kannan, and Abhishek S. Verma (Eds.) All rights reserved-© 2023 Bentham Science Publishers

The conceptual roots of deep learning are firmly planted in traditional neural network architecture. Compared to the classical use of neural networks, deep learning employs multiple hidden layers of neurons, usually more than two, combined with new training methodologies that give it an advantage. A deep architecture can be built by adding more hidden layers, making it capable of expressing more complex hypotheses, since the hidden layers can capture non-linear relationships. Networks formed from multiple hidden layers are called Deep Neural Networks. Deep learning provides advanced methods to train deep neural network architectures. Usually, a DNN is trained using both supervised and unsupervised techniques. Supervised learning uses class labels to train deep neural networks, determining the weights that reduce the error in predicting a target value, and is usually applied to classification and regression tasks. In contrast, unsupervised learning uses feature extraction, dimensionality reduction, and clustering methods; class labels are not provided for training [2]. Researchers in almost every field have actively explored deep learning, and image processing is one among many. Image super-resolution, an integral part of image processing, consists of obtaining a higher-resolution image from a lower-resolution one. This technique is used in various applications, viz., medical imaging, remote sensing, pattern recognition, security, surveillance, etc. Many deep learning methods are applied to image super-resolution and image processing tasks, ranging from traditional convolutional neural networks to the newer generative adversarial networks [3].
Likewise, Reinforcement Learning has recently seen a lot of success in solving tedious continuous-control tasks. The deep learning methodology applied in reinforcement learning typically uses multiple function approximators (usually a network whose hidden layers are shared). A new distributional algorithm, C51, was introduced that is capable of solving the reinforcement learning problem. On top of the methodologies mentioned above, there is also a new way to learn the state-value distribution, inspired by the comparison between the actor-critic architecture and generative adversarial networks [4]. The details of the above-mentioned techniques will be discussed in the coming sections.
DEEP LEARNING

Deep learning (DL) is a part of machine learning with its base in ANNs. It is a reliable technique that takes AI to a higher stage and has brought many advancements. Its huge success rests on the availability of increased computational power, high-speed data storage, and parallelism [2]. In a classical machine learning scenario, the system converts raw data into a feature vector in order to learn and classify patterns, whereas deep learning involves multiple layers in learning complex functions [5]. DL is well suited to discovering intricate structures in high-dimensional data, as it provides complex problem-solving methods in many fields of science, industry, and government. Deep learning started booming in late 2012, when convolutional neural networks (CNNs) gained overwhelming success in research, and researchers and scientists from almost every field started exploring it [6]. The future of deep learning appears promising, as it requires significantly less hand-engineering and achieves more success by exploiting vast amounts of data along with novel algorithms and architectures that accelerate its progress [5].

The Transition from Machine Learning to DL

Training methodologies in ML can be broadly categorized as supervised and unsupervised learning. A supervised learning approach is used when output labels are given for the problem set. Here, the training data consists of numerical or nominal vectors describing the features of the input data together with the respective outputs. The training process is called regression when the output is continuous, and classification when the output consists of categorical values. Unsupervised learning, by contrast, involves unlabeled data and infers a function that describes hidden structure; as the data is not labeled, accuracy cannot be evaluated directly [7].
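The three settings just described (regression, classification, clustering) can be contrasted on synthetic data. This is a minimal NumPy sketch, not from the chapter; the data, thresholds, and centroids are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

# Supervised regression: continuous target y = 2x + 1 plus small noise,
# fitted with ordinary least squares on [x, 1].
x = rng.uniform(0.0, 1.0, size=(100, 1))
y = 2.0 * x[:, 0] + 1.0 + rng.normal(0.0, 0.01, size=100)
X = np.hstack([x, np.ones((100, 1))])          # add bias column
slope, intercept = np.linalg.lstsq(X, y, rcond=None)[0]

# Supervised classification: the target is a categorical value (0 or 1).
labels = (x[:, 0] > 0.5).astype(int)

# Unsupervised clustering: no labels; group points by the nearest of
# two fixed centroids to infer hidden structure.
centroids = np.array([[0.25], [0.75]])
assignments = np.argmin(np.abs(x - centroids.T), axis=1)

print(round(slope, 2), round(intercept, 2))
```

The fitted slope and intercept recover the generating values (close to 2 and 1), while the clustering step produces groupings whose quality cannot be scored against ground truth, exactly as the text notes for unsupervised learning.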
The Naïve Bayes model is a classification algorithm that relies on the probability distribution of the input data [8]. Among the different classification algorithms, the support vector machine is the most popular due to its high performance on most related problems [9]. Ensemble learning, which combines many classification algorithms, is also used for more precise prediction and more advanced classification [10]. The Artificial Neural Network is a well-known regression and classification algorithm in ML that tries to imitate signal transmission in the biological nervous system [7]. Fig. (1) gives the conceptual framework of artificial neural networks, derived through natural inspiration. ANNs have given outstanding performance in different fields, but they faced several difficulties, such as local minima during optimization and overfitting, and deep networks emerged as the solution. Deep Neural Networks are formed by the
number of layers stacked in a series. Here, the initial layer is the input layer, through which the data for prediction enters; the last is the output layer, which produces the predicted value; and the layers between the input and output layers are the hidden layers [7]. Each neuron is linked to the neuron outputs of the preceding layer, and every connection is multiplied by a number known as a weight [1]. Increasing the number of hidden layers results in a deep architecture capable of representing complex hypotheses; such networks are called deep neural networks. For many years, hardware limitations made DNN implementations impractical due to the complex computational demands of both training and inference, but recent advancements in hardware parallelization have largely overcome this problem and allow DNN architectures to be utilized as needed [2].
Fig. (1). (A) Biological neuron (B) Artificial neuron.
Several deep architectures are available, and their number is increasing day by day. It would not be fair to compare these architectures directly, as each has its own pros and cons and is relevant for particular applications. As discussed, deep learning is capturing almost every field, so it is not surprising that newer algorithms and architectures are developed quite often in pursuit of human-like efficiency. We will discuss some of the significant deep architectures in the next sections [1, 2].
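The layered computation described above, where each neuron combines weighted outputs of the preceding layer and hidden layers apply a non-linearity, can be sketched as a minimal NumPy forward pass. This is illustrative only (the layer sizes, ReLU activation, and initialization are assumptions, not taken from the chapter):

```python
import numpy as np

def relu(z):
    # Non-linearity applied in the hidden layers; lets the network
    # capture non-linear relationships.
    return np.maximum(0.0, z)

def forward(x, weights, biases):
    """Forward pass of a feed-forward DNN: each hidden layer multiplies
    by a weight matrix, adds a bias, and applies ReLU; the output layer
    is left linear (suitable for regression)."""
    a = x
    for W, b in zip(weights[:-1], biases[:-1]):
        a = relu(a @ W + b)
    return a @ weights[-1] + biases[-1]

rng = np.random.default_rng(0)
sizes = [4, 8, 8, 1]                # input, two hidden layers, output
weights = [rng.normal(0.0, 0.1, (m, n)) for m, n in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(n) for n in sizes[1:]]

y = forward(rng.normal(size=(3, 4)), weights, biases)
print(y.shape)                      # prints (3, 1): one output per sample
```

Adding entries to `sizes` deepens the architecture without changing the code, which is the stacking idea the text describes.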
Fig. (2). Deep feed forward network.
Deep Feed-forward Networks

This is the basic form of deep architecture, in which connections move only in the forward direction. The DNN comprises multiple hidden layers in a standard multi-layer neural network [1]. Fig. (2) shows a deep feed-forward network with inputs 'i', weights 'w', and output 'Y'. This architecture is mainly used for classification and regression and, because it contains many hidden layers, can express and handle complex hypotheses. It has proven successful in most areas [2]. A neural network (NN) with enough hidden units can approximate a wide range of continuous functions and hence can be called a universal approximator; a deep architecture, however, can approximate complex functions with similar accuracy using far fewer units [11].

Restricted Boltzmann Machines

Boltzmann machines (BM) constitute a specific class of log-linear Markov Random Fields whose energy functions are linear in their free parameters [12]. The Restricted Boltzmann Machine (RBM) is a variant of the BM that restricts intra-layer connections among the units, hence 'restricted' [1]. RBMs are graph-based generative stochastic network models consisting of two fully connected layers, viz., the hidden layer and the visible layer; connections within a layer are not allowed. The network learns a joint probability distribution over its n visible units V = [v_1, ..., v_n] and m hidden units H = [h_1, ..., h_m], with the symmetric connections between the two layers given by the weight matrix w [13, 14]. The visible layer in a basic RBM model depicts the observed
28 IoT and Big Data Analytics, Vol. 2
Garg and Singh
input unit’s v. In contrast, the feature detectors are represented by hidden layers that are the hidden units h. Here, the data of making mensuration tends to obey the Gaussian distribution. Hence, it is clear that the energy function of RBM is much more complicated as compared to the general binary case considering Gaussian visible neurons and the hidden neurons. The probability of every node is updated sequentially, and the system comes to equilibrium when the probability for each neuron does not modify further. Here, the energy of the combined configuration at the equilibrium of both the hidden and the visible layers is given by the negative weighted sum of each set of nodes subtracting the biases of all the visible and the hidden neurons; architecture is depicted in Fig. (3). Hence, the summary of RBM is to upgrade this energy function which is defined as E (v, h) and is given as: E(v, h) =∑𝒏𝝐𝒗𝒊𝒔𝒊𝒃𝒍𝒆
(𝒗𝒏 −𝒂𝒏 )𝟐 𝟐𝜹𝟐𝒏
+ ∑𝒎𝝐𝒉𝒊𝒅𝒅𝒆𝒏
(𝒉𝒎 −𝒃𝒎 )𝟐 𝟐𝜹𝟐𝒎
− ∑𝒏,𝒎
𝒗𝒏 𝒉𝒎 𝜹𝒏 𝜹𝒎
𝒘𝒏,𝒎
(1)
In the above equation, v_n and h_m are the states of visible unit 'n' and hidden unit 'm'. The Gaussian means of the visible and hidden nodes are denoted by a_n and b_m, respectively; δ_n and δ_m are the corresponding standard deviations, and w_{n,m} is the weight between the two units. The probability the RBM network assigns to a visible vector is obtained by summing over all hidden vectors [12]. Training a deep belief network using RBMs as the building block is one of the most significant applications in the history of RBMs and is considered the beginning of the deep learning era. RBM proved its capability on the Netflix dataset by performing collaborative filtering, and hence it is widely used in applications that require collaborative filtering [1].

Deep Belief Networks

Restricted Boltzmann Machines are limited in what they can represent, but they become more powerful when stacked to form a DBN. A DBN is a generative graph-based framework comprising binary latent variables that represent hidden causes of the input values. To form a DBN, the output of each RBM's hidden layer is used as the input to the visible layer of the RBM above it. The connection formed between the top two layers of a DBN is undirected, just as in a standard RBM; thus a DBN with only one hidden layer is simply an RBM [1, 15, 16]. DBNs composed of stacked RBMs are learned greedily, layer by layer: the RBM at one level is trained without considering the others. Fig. (4) shows a three-layer DBN in which each layer is an RBM as depicted in Fig. (3).
Fig. (3). Restricted Boltzmann machine model.
Fig. (4). A deep belief network.
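To make Eq. (1) concrete, the energy of a small Gaussian–Gaussian RBM configuration can be computed directly. The following NumPy sketch is illustrative only; the sizes, values, and function name are ours, not from the cited works.

```python
import numpy as np

def gaussian_rbm_energy(v, h, a, b, w, delta_v, delta_h):
    """Energy of a joint configuration (v, h) of a Gaussian-Gaussian RBM,
    following Eq. (1): quadratic terms for the visible and hidden units,
    minus the weighted interaction between the two layers."""
    visible_term = np.sum((v - a) ** 2 / (2.0 * delta_v ** 2))
    hidden_term = np.sum((h - b) ** 2 / (2.0 * delta_h ** 2))
    interaction = np.sum(np.outer(v / delta_v, h / delta_h) * w)
    return visible_term + hidden_term - interaction

# Tiny example: 3 visible units, 2 hidden units (all values illustrative).
rng = np.random.default_rng(0)
v = rng.normal(size=3)                      # visible unit states
h = rng.normal(size=2)                      # hidden unit states
a, b = np.zeros(3), np.zeros(2)             # Gaussian means a_n, b_m
delta_v, delta_h = np.ones(3), np.ones(2)   # standard deviations
w = rng.normal(size=(3, 2))                 # symmetric weight matrix
print(gaussian_rbm_energy(v, h, a, b, w, delta_v, delta_h))
```

Note that when the weights are zero and each unit sits exactly at its mean, the energy is zero, matching the equation term by term.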
Convolutional Neural Networks

Deep neural networks cannot perform well with multidimensional input data such as image data. One reason is the massive number of nodes and parameters that would need to be trained, which is practically infeasible. Hence, CNNs have been proposed for the analysis of imagery data [2]. CNNs are inspired by the human visual system and gained recognition among deep learning methods after the victory of the highly successful AlexNet model in the ILSVRC-2012 competition. With AlexNet, a new era of AI began, and CNN models have outperformed human recognition abilities in many scenarios [1]. A color image consists of 2D arrays
with pixel intensities in its three color channels, so such images are represented as multiple arrays. These kinds of data can be processed by convolutional neural networks: 1D for signals, sequences, and language; 2D for images or audio spectrograms; and 3D for video or volumetric images. CNNs exploit the properties of natural signals through four main ideas: local connections, shared weights, pooling, and the use of many layers. The basic CNN model has many layers; the initial layers are of two varieties, convolutional and pooling. Units in a convolutional layer are organized in feature maps and detect features from the preceding layer, while a pooling layer merges similar features. Stages of convolution, pooling, and non-linearity are stacked, followed by further convolutional and fully connected layers [5]. A pooling layer follows a convolutional layer to reduce the preceding layer's feature maps into a condensed feature map of the current layer [11]. Fig. (5) shows the basic architecture of a convolutional neural network in which the processing determines whether a given image shows a cat or a dog. For this classification, the image is given to the input layer, passes through several convolutions, and then goes to a pooling layer for downsampling. It is then passed to the fully connected layer, also called a dense layer, and finally the image is classified as a dog or a cat.
Fig. (5). Basic architecture of convolutional neural network.
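The convolution and pooling stages described above can be sketched with plain NumPy. This is a toy, loop-based illustration; real CNN libraries use optimized kernels, and all names and values here are ours.

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2-D convolution (cross-correlation, as CNN libraries use it):
    each output unit sees only a local patch and shares the same weights."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool(feature_map, size=2):
    """Non-overlapping max pooling: downsamples by keeping the strongest
    activation in each size x size block."""
    h, w = feature_map.shape
    trimmed = feature_map[:h - h % size, :w - w % size]
    return trimmed.reshape(h // size, size, w // size, size).max(axis=(1, 3))

image = np.arange(36, dtype=float).reshape(6, 6)    # toy "image"
edge_kernel = np.array([[1.0, -1.0], [1.0, -1.0]])  # toy feature detector
features = conv2d(image, edge_kernel)               # 5x5 feature map
pooled = max_pool(features)                         # 2x2 after pooling
print(features.shape, pooled.shape)  # (5, 5) (2, 2)
```

The shared kernel and the pooling step correspond directly to the "shared weights" and "pooling" ideas listed above.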
Recurrent Neural Networks

A neural network whose hidden units can analyze streams of data is known as a recurrent neural network. Training samples with strong inter-dependencies, carrying meaningful information about previous time steps, are usually fed to an RNN. The output obtained at time t−1 influences the decision at time t; hence, it can be concluded that RNNs use
two sources of input, the present and the recent past, to produce the output for new data; hence, RNNs are said to have memory [2]. Backpropagation is used for training RNNs. RNNs are used for data with sequential inputs such as speech and language, and they play a vital role in other applications where the output depends on previous time steps or computations, such as text prediction and DNA sequencing. Although RNNs are dynamic and have proved very powerful in the areas mentioned above, training an RNN is still a tedious task because the backpropagated gradient either grows or shrinks rapidly at each time step. A variant of the RNN, the Long Short-Term Memory (LSTM) unit, was introduced to overcome this problem [5]. RNNs can also be combined with CNNs to increase the effective pixel neighborhood. Fig. (6) depicts an unrolled recurrent neural network. Here, X(0) is taken as an input from the given input sequence and yields Y(0) as output. This output, together with X(1), becomes the input for the next step; similarly, Y(1) and X(2) become the input for the step after that, and so on. Hence, the network keeps remembering context while training.
Fig. (6). Unrolled recurrent neural network.
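The unrolling in Fig. (6) can be sketched as a loop in which each step mixes the current input with the previous hidden state. This is an illustrative NumPy sketch; the weight shapes, sizes, and names are ours.

```python
import numpy as np

def rnn_forward(xs, W_xh, W_hh, W_hy, h0):
    """Unrolled RNN forward pass: the hidden state at step t combines the
    current input (the present) with the previous hidden state (the recent
    past), so each output depends on earlier time steps."""
    h, ys = h0, []
    for x in xs:
        h = np.tanh(W_xh @ x + W_hh @ h)  # present input + memory of the past
        ys.append(W_hy @ h)
    return ys, h

rng = np.random.default_rng(1)
W_xh = rng.normal(scale=0.5, size=(4, 3))   # input -> hidden
W_hh = rng.normal(scale=0.5, size=(4, 4))   # hidden -> hidden (the recurrence)
W_hy = rng.normal(scale=0.5, size=(2, 4))   # hidden -> output
xs = [rng.normal(size=3) for _ in range(5)]  # a length-5 input sequence
ys, h_final = rnn_forward(xs, W_xh, W_hh, W_hy, np.zeros(4))
print(len(ys), ys[0].shape)  # 5 (2,)
```

The repeated multiplication by W_hh in this loop is also the source of the exploding/vanishing gradient problem mentioned above.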
GENERATIVE ADVERSARIAL NETWORKS

Generative adversarial networks (GANs) have become a fascinating idea in machine learning over the last decade, owing to their applicability in many different fields. GANs can model complex, multi-dimensional data and can be used for image processing, video, music, natural language processing, and other domains including medical image synthesis, security, and fraud detection [17, 18]. GAN is a trending technique applicable to both semi-supervised and unsupervised learning, achieved by implicitly modeling high-dimensional data distributions [19]. This network is specified by
simply training a pair of networks to compete with each other: the generative and the discriminative models are trained simultaneously [1, 17, 19]. GANs are motivated by the two-player zero-sum game, in which the total gain of the two players is zero and each player's gain or loss is exactly balanced by that of the opponent. GANs train the generator and the discriminator concurrently. The generator module generates new examples, while the discriminator module classifies examples as either real or fake, so it can also be called a binary classifier. The optimization adopted by GANs is a minimax problem, and the aim is to reach the Nash equilibrium [17, 20]. When both models are neural networks, the GAN framework is simple to create. To learn the generator's distribution H_g over the data x, a prior on the input noise variable is defined as H_z(z), where z is the noise variable. The generator then represents a mapping from noise space to data space as G(z, θ_g), where G is a differentiable function given by a neural network with parameters θ_g. A second neural network D(x, θ_d), with parameters θ_d, produces a single scalar output: D(x) denotes the probability that x came from the data rather than from the generator G. The discriminator D is trained to maximize the probability of assigning the correct label to both the training data and the fake data produced by the generator, while G is simultaneously trained to minimize log(1 − D(G(z))) [17]. Fig. (7) shows the structure of a GAN. For computation, we use G and D as differentiable functions representing the generator and the discriminator, with the real data x and the noise z as their respective inputs.
Fig. (7). Generative adversarial network.
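The value function that D maximizes and G minimizes can be estimated from samples. Below is a toy 1-D sketch of V(D, G) = E[log D(x)] + E[log(1 − D(G(z)))]; the particular G and D parameterizations are hypothetical stand-ins, not trained networks.

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def gan_value(D, G, real_x, noise_z):
    """Monte-Carlo estimate of V(D, G) = E[log D(x)] + E[log(1 - D(G(z)))].
    The discriminator tries to maximize this value; the generator tries to
    minimize the second term."""
    real_term = np.mean(np.log(D(real_x)))
    fake_term = np.mean(np.log(1.0 - D(G(noise_z))))
    return real_term + fake_term

# Illustrative 1-D "generator" and "discriminator" (hypothetical, untrained).
G = lambda z: 2.0 * z + 1.0              # maps noise space to data space
D = lambda x: sigmoid(1.5 * (x - 1.0))   # probability that x is real

rng = np.random.default_rng(2)
real_x = rng.normal(loc=2.0, size=1000)  # samples of "real" data
noise_z = rng.normal(size=1000)          # input noise samples
value = gan_value(D, G, real_x, noise_z)
print(value)
```

Since both log terms are non-positive, the estimate is always negative; training adjusts θ_d to push it up and θ_g to push the fake term down.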
The sample produced by G, which should follow the distribution H_data of the actual data, is denoted by G(z). If the input to the discriminator D comes from the original data x, the classification should be accurate and the label obtained would be 1. Accordingly, if the input to D comes from G(z), the classification result should be fake, with label 0. The discriminator D is used to
provide a correct classification of the data, while G tries to make the generated data G(z) match the real data x so closely that D cannot tell them apart [20].

GANs Architectures

Deep Convolutional GAN (DCGAN)

DCGAN provides a framework for training the generator and discriminator networks in which the generator is modeled with a CNN. DCGANs use strided and fractionally-strided convolutions, which allow the spatial downsampling and upsampling operators to be learned during training. These operators handle changes in location and sampling rate, which is essential. Applying batch normalization in both networks is also suggested to keep training stable in deeper models, and reducing the number of fully connected layers is recommended to enhance the training performance of deeper models [18, 19].

InfoGAN

Traditional GANs can learn a few semantic features, but they cannot capture the relationship between the noise source and specific semantics, as shown in Fig. (8). InfoGAN decomposes the input noise source into two parts, z and c, where z is the incompressible noise and c is a latent code representing the structured semantic data distribution. InfoGAN is expected to solve the following equation:

min_G max_D V_I(D, G) = V(D, G) − λ I(c; G(z, c))   (2)

Here, V(D, G) is the objective function of the GAN, G(z, c) is the generated data, I is the mutual information, and λ is a tunable regularization parameter. A lower bound of I(c; G(z, c)) can be written with the help of an auxiliary distribution Q(c|x) that approximates P(c|x). The objective function of InfoGAN is then:

min_G max_D V_I(D, G) = V(D, G) − λ L_I(c; Q)   (3)

where L_I(c; Q) is the lower bound of I(c; G(z, c)). InfoGAN also comes in several variants, such as causal InfoGAN and semi-supervised InfoGAN [17, 20].
Fig. (8). InfoGAN.
Conditional GANs

GANs are extended by making the generator and the discriminator networks conditional on a class label y. The advantage of the conditional GAN is that it provides better representations for multi-modal data generation (Fig. 9). Equation (4) gives the objective function of conditional GANs [17, 19]:

min_G max_D V(D, G) = E_{x~p_data}[log D(x|y)] + E_{z~p_z}[log(1 − D(G(z|y)))]   (4)
Fig. (9). Conditional GAN.
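A common way to realize the class conditioning y is to one-hot encode the label and concatenate it with the generator's noise input (the discriminator receives the label the same way). A minimal sketch, with sizes chosen purely for illustration:

```python
import numpy as np

def condition_input(z, y, num_classes):
    """Conditioning as in a conditional GAN: the class label y is one-hot
    encoded and concatenated with the noise vector before it enters G."""
    one_hot = np.zeros(num_classes)
    one_hot[y] = 1.0
    return np.concatenate([z, one_hot])

z = np.random.default_rng(3).normal(size=100)  # 100-dim noise vector
gen_input = condition_input(z, y=4, num_classes=10)
print(gen_input.shape)  # (110,)
```

Both networks then see the label on every example, which is what makes the generated distribution class-conditional.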
Comparing equations (3) and (4), it can be concluded that the generator module of InfoGAN is essentially the same as that of the conditional GAN; however, the latent code c of InfoGAN is not known in advance and is discovered through training.

Autoencoder GANs

Autoencoders are neural networks for unsupervised learning consisting of an encoder and a decoder. An autoencoder is trained in a self-supervised way, since it takes its input as its own target value, and it learns non-linear mappings in both directions.
Here, the autoencoder structure is combined with a GAN model. The encoder compresses data p to a latent variable q, and the decoder tries to reconstruct the encoded data back to the actual data p. The method can stabilize a GAN because it learns the posterior distribution p(q|p) for reconstructing the data p, which reduces the mode collapse caused by the GAN's limited ability to infer a mapping from data p to q [18, 19]. When one wants the latent space to have a meaningful arrangement and also wants to perform feed-forward sampling from an autoencoder, adversarial training offers a way to achieve both objectives.

Cycle GANs

CycleGAN can perform image-to-image translation between two unpaired domains, as shown in Fig. (10). The main idea behind CycleGAN is that translating one domain into another, and then back again, should return us to where we started [21]. CycleGAN is an essential advance for unpaired data. It trains two transformations P: X→Y and Q: Y→X in parallel, to satisfy the following conditions:

1. P(x) ~ p(y) for x ~ p(x), and Q(y) ~ p(x) for y ~ p(y);
2. Q(P(x)) = x for all x ∈ X, and P(Q(y)) = y for all y ∈ Y.

Here p(x) and p(y) are the image distributions of domains X and Y. The first condition requires that generated images come from the desired domains and is enforced by training two discriminators, one for X and one for Y. The second condition requires that information from the source image be encoded in the generated image and is enforced by a cyclic consistency loss of the form ||Q(P(x)) − x|| + ||P(Q(y)) − y|| [22]. Fig. (10) shows the simplified architecture of a CycleGAN that converts an image of a cat to a dog. CycleGAN needs two generators: generator X2Y converts a cat to a dog, while generator Y2X converts a dog to a cat. The two discriminators judge whether images of the cat and the dog are real or fake, respectively. CycleGAN can be applied in many areas where a paired training dataset is not available.
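The cyclic consistency loss above can be sketched directly. In this toy example, P and Q are hypothetical 1-D mappings that happen to be exact inverses, so the loss is (numerically) zero; for real image translators the loss is minimized during training.

```python
import numpy as np

def cycle_consistency_loss(P, Q, xs, ys):
    """L1 cycle-consistency loss ||Q(P(x)) - x|| + ||P(Q(y)) - y||:
    translating to the other domain and back should recover the input."""
    forward = np.mean(np.abs(Q(P(xs)) - xs))   # X -> Y -> X
    backward = np.mean(np.abs(P(Q(ys)) - ys))  # Y -> X -> Y
    return forward + backward

# Toy "translators" between two 1-D domains (hypothetical linear mappings).
P = lambda x: 2.0 * x + 1.0    # X -> Y
Q = lambda y: (y - 1.0) / 2.0  # Y -> X, the exact inverse of P here

xs = np.linspace(-1, 1, 50)
ys = np.linspace(0, 3, 50)
print(cycle_consistency_loss(P, Q, xs, ys))  # ~0 for exact inverses
```

In the full CycleGAN objective this term is added to the two adversarial losses that enforce the first condition.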
GANs Training Tricks

Although unique solutions exist theoretically, GAN training is still complex and unstable for various reasons. One of them is that the ideal weights correspond to saddle points of the loss rather than minima. GAN training amounts to searching for discriminator parameters that maximize classification accuracy while searching for generator parameters that maximally confuse the discriminator. The training procedure is depicted in Fig. (11): the parameters of one model are updated during training while the parameters of the other are frozen.
For a fixed generator, there is a unique ideal discriminator. The generator G is optimal when the discriminator D is maximally confused and cannot distinguish real samples from fake ones. Theoretically, the discriminator would be trained until it is optimal with respect to the current generator, and only then would the generator be updated again. In practice, however, the discriminator may not be trained to optimality; it is trained for only a few steps at a time, and the generator is updated alongside it. Despite many proposed approaches, GAN training remains complex and unstable for many reasons, and researchers have proposed several methods to improve it. Some of these techniques are briefly discussed in this section [17, 19].
Fig. (10). CycleGAN architecture.
Fig. (11). GANs training loop: new data samples x are generated by passing random samples z through the generator network; the discriminator's gradient may be updated k times before updating the generator.
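The alternating schedule in Fig. (11) can be sketched as a loop that freezes one network while stepping the other; here `d_step` and `g_step` are stand-ins for real gradient updates.

```python
def train_gan(d_step, g_step, steps=100, k=2):
    """Alternating GAN training as in Fig. (11): update the discriminator's
    parameters k times with the generator frozen, then update the generator
    once with the discriminator frozen."""
    for _ in range(steps):
        for _ in range(k):  # discriminator updates (generator frozen)
            d_step()
        g_step()            # generator update (discriminator frozen)

# Count the updates to confirm the k-to-1 schedule.
counts = {"d": 0, "g": 0}
train_gan(lambda: counts.__setitem__("d", counts["d"] + 1),
          lambda: counts.__setitem__("g", counts["g"] + 1),
          steps=100, k=2)
print(counts)  # {'d': 200, 'g': 100}
```

Choosing k > 1 keeps the discriminator near its optimum for the current generator, which is the practical compromise described above.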
Objective Function-Based Improvement

If we use the original objective function of the GAN, we face the vanishing gradient problem when training G. It has also been observed that using the G-loss in a non-saturating game can lead to mode collapse. These issues stem from the objective function and cannot be resolved by
updating the architecture of GANs. A surrogate objective function is one approach to overcoming these issues, and many objective function-based variants have been introduced for this purpose. Some of them are summarized in Table 1.

Table 1. Objective function-based GANs training variants.

Unrolled GANs
Pros:
● Manages mode collapse by adopting an alternative objective function for the generator with respect to an unrolled optimization of the discriminator.
● Effectively overcomes mode collapse, stabilizes GAN training, and enhances the diversity and coverage of the distribution generated by the generator.
Cons:
● The computation cost of each training step grows linearly with the number of unrolling steps [23].

Energy-Based GANs
● Bridges two classes of unsupervised learning methods, GANs and autoencoders, into an alternative energy-based framework.
● The discriminator is viewed as an energy function, and this energy function can be seen as a trainable cost function for the generator.
● Shows better training behavior than regular GANs [24].

Least Squares GANs
● While updating the generator, the regular GAN causes vanishing gradients for data that lie on the correct side of the decision boundary. LSGANs use a least squares loss function for the discriminator and propose an objective function that overcomes the vanishing gradient.
● Shows stability during the learning process and generates better quality images [25].

Mode Regularized GANs
● Introduces several ways to regularize the objective, thereby stabilizing GAN training.
● Proposes an autoencoder-based regularizer that provides a systematic way to measure and avoid the missing-mode problem and also stabilizes training [26].

Relativistic GAN
● The relativistic discriminator fixes and improves the standard GAN.
● The modification can be interpreted as: “the discriminator estimates the probability that the given real data is more realistic than randomly sampled fake data” [27].
Skill-Based Techniques

GAN training requires finding the Nash equilibrium of a non-convex min-max game, yet GANs are typically trained with gradient descent. Some techniques that can be applied to ensure stability and better results are summarized below:

● Feature Matching: Specifies a new objective for the generator that stops it from overtraining on the current discriminator. Rather than maximizing the discriminator's output, it guides the generator to generate samples that match the features of the real samples [18, 28].
● Label Smoothing: Replaces the hard 0 and 1 targets of the classifier with smoothed values such as 0.9 or 0.1, and is currently used to reduce the vulnerability of neural networks to adversarial examples. For GANs, label smoothing is usually applied only to the real samples and not to the fake ones; otherwise the discriminator can behave incorrectly [28].
● Virtual Batch Normalization: Batch normalization drastically improves the optimization of neural networks and proved efficient for DCGANs, but it makes the network's output for an input x dependent on the other inputs x' in the same mini-batch. To overcome this, virtual batch normalization is proposed: every sample x is normalized based on statistics gathered from a reference batch of examples chosen once at the start of training, together with x itself; the reference batch is normalized using its own statistics only. Virtual batch normalization is computationally more expensive, as it requires forward propagation on two mini-batches of data, so it is generally used only in the generator network [28].
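One-sided label smoothing, as described above, amounts to replacing the hard target 1 for real samples with a softer value while leaving fake targets at 0. A minimal sketch (the function name is ours):

```python
import numpy as np

def smooth_labels(labels, real_target=0.9):
    """One-sided label smoothing for GAN discriminator targets: real targets
    1 become e.g. 0.9, while fake targets stay at 0 (only the real samples
    are smoothed, as recommended above)."""
    labels = np.asarray(labels, dtype=float)
    return np.where(labels == 1.0, real_target, labels)

targets = np.array([1, 1, 0, 1, 0])  # 1 = real sample, 0 = fake sample
print(smooth_labels(targets))        # real targets become 0.9, fakes stay 0
```

The smoothed targets then feed the discriminator's usual cross-entropy loss, discouraging it from becoming over-confident on real data.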
Other Miscellaneous Techniques

The traditional GAN architecture uses a multi-layer perceptron (MLP) as the generator G and the discriminator D, but an MLP is only adequate for small datasets and cannot generalize to complex images. Deep Convolutional GANs (DCGANs) were introduced for better performance: deep CNNs are employed to define the generator G and discriminator D, and CNNs are known to work better than MLPs for images [29]. Another architecture, the Laplacian generative adversarial network (LAPGAN) [36], can produce better quality images than the original GAN; it utilizes a cascade of CNNs within a Laplacian pyramid architecture to produce high-resolution images. SinGAN [30] trains a generative model from a single natural image: the model consists of a pyramid of fully convolutional GANs that learn the patch distribution at various image scales, and the samples it generates are often mistaken for real data. Another novel training methodology uses progressively grown networks to define a new GAN structure called Progressive GAN (PGGAN). PGGAN uses generator and discriminator networks that mirror each other and always progress in synchrony, and all existing layers in both networks remain trainable throughout training. Starting from a low-resolution image, new layers are attached to model finer details as training proceeds [31].
STATE-OF-THE-ART APPLICATIONS OF GANS

GANs are very efficient generative models capable of generating realistic images, which allows them to be utilized in many academic and engineering fields. Some state-of-the-art applications of GANs are discussed in this section. Nagano and Kikuta [32] stated that SRGAN can generate photo-realistic images from downsampled images, whereas super-resolution of low-resolution original sources containing artifacts, taken many years ago, is very difficult. Working in the food domain, the authors proposed two methods to recover original high-resolution images from downsampled ones. In the first method, downsampling is performed using noise injection to produce the desired low-resolution images from high-resolution ones for model training. The second method trains a model for each target domain. They also suggested a new evaluation method called the xception score. The results were compared with several existing methods, and the proposed method produced better and more realistic images. Wang et al. [33] modified SRGAN by improving three of its components. According to the authors, SRGAN is an influential work for generating realistic textures in image super-resolution, but the imagined details are often accompanied by unpleasant artifacts. To resolve this, they studied the network model, adversarial loss, and perceptual loss of SRGAN, improving each and deriving an Enhanced SRGAN. The authors introduced the Residual-in-Residual Dense Block (RRDB), without batch normalization, as the basic network-building unit. RRDB also borrows ideas from relativistic GANs to predict relative realism rather than absolute values. Lastly, the perceptual loss was enhanced by applying the features before activation. With these enhancements, the proposed method gives superior visual quality with more realistic and natural textures than SRGAN.
According to Reed et al. [34], modern AI systems still struggle to synthesize realistic images from text, although recurrent neural networks can learn discriminative text feature representations, and DCGANs have begun generating quality images of particular categories such as faces, album covers, and room interiors. The authors proposed a new deep architecture and GAN formulation that efficiently bridges text and image modeling, translating visual concepts from characters to pixels. They demonstrated the efficiency of their model by generating compelling images of birds and flowers from descriptive text. Men et al. [35] introduced the Attribute-Decomposed GAN, a new generative framework for controllable person image synthesis that produces realistic images of people with the required attributes provided in different source inputs. The main idea behind the model is to embed human attributes into the latent space as self-contained codes, consequently gaining flexible and continuous control of attributes through mixing and interpolation
methods in explicit style representations. Karnewar and Wang [36] suggested that although GANs have seen huge success in image synthesis, their instability may be due to the gradient passing from the discriminator to the generator becoming uninformative when the supports of the real and fake distributions barely overlap. In their study, the authors introduced the Multi-Scale Gradient GAN, a simple and efficient method that addresses this by allowing gradients to flow from the discriminator to the generator at multiple scales, and which is an effective approach for high-resolution image synthesis. Image synthesis is a widely used method in medical imaging. Zhao et al. [44] stated that the emergence of deep neural networks has had a drastic impact on medical imaging by providing various deep learning-based models that improve image synthesis accuracy; still, assessing the uncertainty in the model, which had been the missing part, is most important in clinical applications. To overcome this issue, the authors proposed a Bayesian conditional GAN with dropout to enhance image synthesis accuracy. A brain tumor dataset of 102 subjects was used for validation on T1w to T2w MR image translation. Compared to a traditional Bayesian neural network with Monte Carlo dropout, the given approach reaches a lower RMSE with a p-value of 0.0186. The approach also provides an enhanced calibration of the generated uncertainty through uncertainty recalibration. Qin et al. [37] used a GAN-based data augmentation method to enhance skin lesion classification efficiency. The authors modified the conventional structure of style-based GANs by controlling the input noise of the generator and adjusting the generator and discriminator for synthesizing skin lesion images. The transfer learning method was applied to construct a classifier for image classification using a pre-trained deep neural network.
Finally, artificial images from the style-based GAN are added to the training set for better classification results. The style-based GAN outperforms the other GAN models, with evaluation based on the Inception score, Fréchet inception distance, precision, and recall. Karras et al. [38] proposed an alternative generator model for GANs, borrowed from the style transfer literature. The novel model automatically learns an unsupervised separation of high-level attributes and stochastic variation in the generated images. This generator improves on previous ones with respect to traditional distribution quality metrics and yields visibly better interpolation properties. Zhang et al. [39] stated that obtaining accurate pixel-wise classes on X-ray images is tedious, as the visible anatomy overlaps and forms complex patterns, whereas labeled CT data are easier to access, since 3D CT scans give better structure. In their work, the authors gave a method for X-ray image parsing learned from labeled 3D CT scans. The proposed model does not need any topogram labels and achieves an average dice score of 86%, close to the result of supervised training (89%). Lin et al. [40] presented a new method called RelGAN for multi-domain image-to-image translation. The objective is to utilize the relative attributes that
describe the change needed on the selected attributes. The method can modify images by altering particular attributes simultaneously while preserving the other attributes. A category-wise exploration of GAN applications is discussed below.

Image-Based Applications

● Image Synthesis: GAN research usually focuses on improving the quality and utility of generated images. The LAPGAN model uses a cascade of CNNs in a Laplacian pyramid architecture to produce images in a coarse-to-fine fashion [41]. LAPGAN extends the conditional version of the GAN, where the generator and the discriminator receive additional label information as input; this proved very helpful and is commonly used to enhance the quality of images [41, 19]. Recent deep learning-based techniques give satisfying results on the complicated job of inpainting a large missing area in a picture, generating visually credible image structures and textures [42].
● Image-to-Image Translation: Translating an input image into an output image is performed by conditional adversarial networks. The pix2pix model is a standard method for these types of problems; it also constructs a loss function for training this mapping. The technique is considered a recurring theme of image processing, computer graphics, and computer vision, and performs well on many other problems in this domain [42]. CycleGAN extends this work by introducing a cycle consistency loss that preserves the original image after a translation cycle. This model does not require paired images for training, which makes data preparation easy and opens the method to an extensive range of applications [22].
● Image Super-resolution: Super-resolution refers to a set of methods for upscaling videos or images, i.e., generating high-resolution images from a given lower-resolution image [3]. A novel training method for GANs is suggested in [31]: the key idea is to grow the generator and discriminator incrementally, adding layers to the model as training progresses, starting from a low resolution.
It results in fast and stabilized training and produces very high-quality images. These efforts were extended in the SRGAN model, which adds an adversarial loss component that constrains images to lie on the manifold of real images. The generator of SRGAN is conditioned on a low-resolution image and produces a photo-realistic natural image at a 4× upscaling factor. SRGAN is easy to tailor to particular domains, as new training image pairs can be created by downsampling high-resolution images. The visual quality of SRGAN was enhanced through its three main components: the network architecture, the adversarial loss, and the perceptual loss; the improvement of each element resulted in the Enhanced SRGAN. ESRGAN gives consistently good visual quality with more realistic and natural textures than SRGAN [42].
● Video Prediction and Generation: Core problems of computer vision include understanding object motion and image dynamics. For both the video recognition and video generation tasks, called action classification and future prediction, a framework is needed to determine how scenes transform. A motion- and content-decomposed generative adversarial network for video generation is given in the literature [43]: video is generated by mapping a sequence of random vectors to a sequence of video frames. Extensive experimental results on several datasets, with qualitative and quantitative comparisons against current approaches, prove the potential of the proposed architecture.
● Face Ageing: Face age progression and regression, usually called predicting future looks and estimating previous looks, are known as face aging and rejuvenation; they mainly focus on rendering face images with or without the aging effect while preserving the personalized features of the face. A conditional adversarial autoencoder (CAAE) network can be used to achieve this; it flexibly handles age progression and regression simultaneously by controlling the age attribute. The benefits of CAAE can be summarized in four aspects. First, the new architecture generates both age-progressed and age-regressed photo-realistic face images. Second, the framework is much more flexible and general, as it requires neither paired data in the training set nor labeled data for testing. Third, it preserves personality and avoids ghosting artifacts by disentangling age and identity in the latent vector space. Lastly, it is robust to pose variation, expression, and occlusion [42].
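Creating SRGAN-style training pairs by downsampling, as mentioned above, can be sketched with average pooling; this is a simplified stand-in for the bicubic downsampling typically used, and the array sizes are illustrative.

```python
import numpy as np

def make_sr_pair(hr_image, factor=4):
    """Create a (low-resolution, high-resolution) training pair by
    average-pooling the HR image, mirroring how super-resolution training
    data can be prepared from high-resolution images alone."""
    h, w = hr_image.shape
    h, w = h - h % factor, w - w % factor   # crop to a multiple of factor
    hr = hr_image[:h, :w]
    lr = hr.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))
    return lr, hr

hr = np.random.default_rng(4).random((64, 64))  # stand-in for an HR image
lr, hr_crop = make_sr_pair(hr, factor=4)
print(lr.shape, hr_crop.shape)  # (16, 16) (64, 64)
```

The generator is then trained to map `lr` back to `hr_crop`, with the adversarial and perceptual losses discouraging over-smoothed outputs.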
Sequential Data-Based Applications
GANs have also seen achievements with sequential data such as speech, music, voice, natural language, and time series.
● Speech: For speech processing, a voice conversion system built from a GAN and a VAE is called the variational autoencoding Wasserstein GAN (VAW-GAN). In this model, the encoder extracts phonetic representations of the source speech, and the decoder synthesizes the converted output speech when conditioned on the target speaker's information, much like a conditional VAE. A plain VAE is not capable of generating sharp outputs because of its Gaussian assumption, so VAW-GAN uses WGAN to provide a decoder for the generator that focuses on reconstructing the output voice [42].
● Music: For generating whole sequences of music, a simple and direct method is the continuous RNN-GAN (C-RNN-GAN), which models both the generator and the discriminator as RNNs with long short-term memory (LSTM) units. However, such models can only handle the whole sequence of music, not a partially generated sequence. Several GAN models, such as sequence GAN (SeqGAN) [44] and objective-reinforced GAN (ORGAN) [45], employ policy gradient algorithms so that the whole sequence need not be generated at once.
Generative Adversarial Networks
● Natural Language Processing: For information retrieval tasks, IRGAN is used, and adversarial learning is applied to neural dialogue generation. GANs can also be used for text generation and speech-language processing. Adversarial reward learning has been used for visual storytelling. Furthermore, GANs can be applied to other NLP tasks such as question-answer selection, poetry generation, and review detection and generation [17].
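The policy-gradient idea behind SeqGAN and ORGAN can be illustrated with a deliberately tiny sketch: treat the generator as a policy and use the discriminator's score as a reward. The three-token vocabulary and the stand-in reward function (which simply prefers token "C") are assumptions for illustration only, not part of the original models:

```python
import math
import random

VOCAB = ["A", "B", "C"]

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def discriminator_reward(token):
    # Stand-in for a learned discriminator: it deems token "C"
    # "realistic" (an assumption purely for this illustration).
    return 1.0 if token == "C" else 0.0

random.seed(0)
logits = [0.0, 0.0, 0.0]  # the generator's trainable policy parameters
learning_rate = 0.5
for _ in range(200):
    probs = softmax(logits)
    idx = random.choices(range(len(VOCAB)), weights=probs)[0]
    reward = discriminator_reward(VOCAB[idx])
    # REINFORCE update: the gradient of log pi(idx) is one_hot(idx) - probs.
    for k in range(len(logits)):
        grad = (1.0 if k == idx else 0.0) - probs[k]
        logits[k] += learning_rate * reward * grad

best_token = VOCAB[max(range(len(VOCAB)), key=lambda k: logits[k])]
```

After training, the generator's policy concentrates on the token the stand-in discriminator rewards; SeqGAN applies the same credit-assignment trick per generated token with a learned discriminator.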
Other Applications
● Medical Field: GANs are extensively used in medical areas such as generating and designing DNA, drug discovery, medical image processing, dental restoration, and doctor recommendation [17].
❍ Drug Discovery: Many researchers have applied GANs to images and videos, but researchers and scientists at Insilico Medicine proposed a method for automatic drug discovery using GANs. They train the generator to sample drug candidates for a given ailment as close as feasible to existing drugs from a drug database. After training, medications for untreated diseases become available from the generator and discriminator at an early stage, and it is reported that the sampled medicines indeed address the given disease [42].
❍ Oncology: Insilico Medicine uses a defined set of parameters to represent the design of novel anticancer molecules. The objective is to forecast the response of drugs and identify compounds effective in resisting cancer cells. To achieve this, the researchers propose an adversarial autoencoder architecture for the recognition and generation of novel compounds that relies on available biochemical data [42].
● Data Science: GANs can also be used for data generation, neural network generation, data augmentation, spatial representation learning, etc. They are also used in other miscellaneous areas such as malware detection, game playing, steganography, and privacy preservation [17].
● RL-CycleGAN: The authors of [46] introduced the RL-scene consistency loss for image translation, which ensures that the translation operation is invariant with respect to the Q-values attached to the images and thus helps in learning a task-aware translation. Using this loss in unsupervised domain translation yields RL-CycleGAN, a novel method for simulation-to-real-world transfer in reinforcement learning.
● GAN Q-Learning: Distributional reinforcement learning has proven its efficacy in complex Markov decision processes using non-linear function approximation, and distributional RL could also be used in several other ways. The authors of [4] propose GAN Q-Learning, a new distributional RL-based method that uses GANs, and analyze its performance in a simple tabular environment and in OpenAI Gym.
FUTURE CHALLENGES
GANs have attracted researchers' attention from almost every field because of their capacity to exploit vast amounts of unlabeled data. Although a lot of progress has been made in overcoming some of the significant challenges, a further question concerns the basic problem of equilibrium for GANs. In the future, other enhanced adversarial learning techniques and mask mechanisms could be developed to achieve better performance than recent tools. Arora et al. [47] study equilibrium for a finite mixture of neural networks, showing that an equilibrium may not exist under certain settings. The authors also argue that even when GAN training appears to converge, the trained distribution may be very far from the target distribution, and to measure this they propose a new metric called the neural net distance. How should one measure the accuracy of data synthesized by generative models for evaluation? Is it better to use likelihood estimation? Is model comparison possible? These are some of the open problems and queries concerning GANs and probabilistic models. The study in [48] shows that evaluating GANs with different procedures can lead to conflicting conclusions about the quality of the synthesized data; the preference for one evaluation method over another depends on the application.
CONCLUSION
We have tried to give an in-depth review of GANs from different perspectives. Since 2014, researchers in the Artificial Intelligence community have been very fond of this model, and many studies are ongoing in this field. We have tried to present a systematic course of the development of GANs.
It is seen here that GANs are not only notable for their ability to learn deeply but are also capable of making use of vast unlabeled samples while learning; they are a promising class of generative models. There are still ample development opportunities in the theory, applications, and algorithms from the training perspective.
REFERENCES
[1]
S. Sengupta, S. Basak, P. Saikia, S. Paul, V. Tsalavoutis, F. Atiah, V. Ravi, and A. Peters, "A review of deep learning with special emphasis on architectures, applications and recent trends", Knowl. Base. Syst., vol. 194, 2020. [http://dx.doi.org/10.1016/j.knosys.2020.105596]
[2] D. Ravi, C. Wong, F. Deligianni, M. Berthelot, J. Andreu-Perez, B. Lo, and G.Z. Yang, "Deep Learning for Health Informatics", IEEE J. Biomed. Health Inform., vol. 21, no. 1, pp. 4-21, 2017. [http://dx.doi.org/10.1109/JBHI.2016.2636665] [PMID: 28055930]
[3] Z. Wang, J. Chen, and S.C.H. Hoi, "Deep Learning for Image Super-resolution: A Survey", IEEE Trans. Pattern Anal. Mach. Intell., vol. 8828, pp. 1-1, 2020. [PMID: 32217470]
[4] T. Doan, B. Mazoure, and C. Lyle, "GAN Q-learning", arXiv preprint, 2018.
[5] Y. LeCun, Y. Bengio, and G. Hinton, "Deep learning", Nature, vol. 521, no. 7553, pp. 436-444, 2015.
[6] K. Suzuki, "Overview of deep learning in medical imaging", Radiological Phys. Technol., vol. 10, no. 3, pp. 257-273, 2017. [http://dx.doi.org/10.1007/s12194-017-0406-5] [PMID: 28689314]
[7] J. Lee, S. Jun, Y. Cho, H. Lee, G.B. Kim, and J.B. Seo, "Deep learning in medical imaging: General overview", Korean J. Radiol., vol. 18, no. 4, pp. 570-584, 2017.
[8] Q. Wang, G.M. Garrity, J.M. Tiedje, and J.R. Cole, "Naive Bayesian classifier for rapid assignment of rRNA sequences into the new bacterial taxonomy", Appl. Environ. Microbiol., vol. 73, no. 16, pp. 5261-5267, 2007. [http://dx.doi.org/10.1128/AEM.00062-07] [PMID: 17586664]
[9] S. Tong, and E. Chang, "Support vector machine active learning for image retrieval", Proc. Ninth ACM Int. Conf. Multimedia, vol. 54, pp. 1-12, 2001.
[10] M.D. Engelhart, H. Moughamian, and J.A. Walsh, "The handbook of brain theory and neural networks", Educ. Psychol. Meas., vol. 30, no. 1, pp. 187-187, 1970. [http://dx.doi.org/10.1177/001316447003000129]
[11] H.P. Chan, R.K. Samala, L.M. Hadjiiski, and C. Zhou, "Deep learning in medical image analysis", Adv. Exp. Med. Biol., vol. 1213, pp. 3-21, 2020. [http://dx.doi.org/10.1007/978-3-030-33128-3_1] [PMID: 32030660]
[12] L. Liao, W. Jin, and R. Pavel, "Enhanced restricted Boltzmann machine with prognosability regularization for prognostics and health assessment", IEEE Trans. Ind. Electron., vol. 63, no. 11, pp. 7076-7083, 2016. [http://dx.doi.org/10.1109/TIE.2016.2586442]
[13] Y. Hua, J. Guo, and H. Zhao, "Deep Belief Networks and deep learning", Proc. 2015 Int. Conf. Intell. Comput. Internet Things (ICIT), pp. 1-4, 2015. [http://dx.doi.org/10.1109/TFUZZ.2015.2406889]
[14] H. Salman, J. Grover, and T. Shankar, "Hierarchical reinforcement learning for sequencing behaviors", pp. 2709-2733, 2018.
[15] Y. Hua, J. Guo, and H. Zhao, "Deep Belief Networks and deep learning", Proc. 2015 Int. Conf. Intell. Comput. Internet Things (ICIT), pp. 1-4, 2015.
[16] H. Lee, R. Grosse, R. Ranganath, and A.Y. Ng, "Unsupervised learning of hierarchical representations with convolutional deep belief networks", Commun. ACM, vol. 54, no. 10, pp. 95-103, 2011. [http://dx.doi.org/10.1145/2001269.2001295]
[17] J. Gui, Z. Sun, Y. Wen, D. Tao, and J. Ye, "A review on generative adversarial networks: Algorithms, theory, and applications", IEEE Trans. Knowl. Data Eng., 2020.
[18] Y. Hong, U. Hwang, J. Yoo, and S. Yoon, "How generative adversarial networks and their variants work: An overview", ACM Comput. Surv., vol. 52, no. 1, pp. 1-43, 2020. [http://dx.doi.org/10.1145/3301282]
[19] J. Luo, and J. Huang, "Generative adversarial network: An overview", J. Sci. Instrum., vol. 40, no. 3, pp. 74-84, 2019.
[20] K. Wang, C. Gou, Y. Duan, Y. Lin, X. Zheng, and F.Y. Wang, "Generative adversarial networks: Introduction and outlook", IEEE/CAA J. Autom. Sin., vol. 4, no. 4, pp. 588-598, 2017.
[21] X. Zhu, Y. Liu, J. Li, T. Wan, and Z. Qin, "Emotion classification with data augmentation using generative adversarial networks", Lect. Notes Comput. Sci., pp. 349-360, 2018. [http://dx.doi.org/10.1007/978-3-319-93040-4_28]
[22] C. Chu, A. Zhmoginov, and M. Sandler, "CycleGAN, a master of steganography", NIPS, pp. 1-6, 2017.
[23] M.Y. Liu, and O. Tuzel, "Coupled generative adversarial networks", Adv. Neural Inf. Process. Syst., pp. 469-477, 2016.
[24] G. Dai, J. Xie, and Y. Fang, "Energy-based generative adversarial networks", Proc. 2017 ACM Multimedia Conf., pp. 672-680, 2017.
[25] X. Mao, Q. Li, R.Y.K. Lau, Z. Wang, H. Xie, and S.P. Smolley, "Least squares generative adversarial networks", Proc. IEEE Int. Conf. Computer Vision, pp. 2794-2802, 2017.
[26] T. Che, Y. Li, A.P. Jacob, Y. Bengio, and W. Li, "Mode regularized generative adversarial networks", 5th Int. Conf. Learn. Represent. (ICLR), pp. 1-13, 2017.
[27] A. Jolicoeur-Martineau, "The relativistic discriminator: A key element missing from standard GAN", 7th Int. Conf. Learn. Represent. (ICLR), 2019.
[28] T. Salimans, I. Goodfellow, W. Zaremba, and V. Cheung, "Improved Techniques for Training GANs", vol. 19, no. 1, pp. 1-9, 2018.
[29] A. Radford, L. Metz, and S. Chintala, "Unsupervised representation learning with deep convolutional generative adversarial networks", 4th Int. Conf. Learn. Represent. (ICLR), pp. 1-16, 2016.
[30] T.R. Shaham, T. Dekel, and T. Michaeli, "SinGAN: Learning a generative model from a single natural image", Proc. IEEE Int. Conf. Comput. Vis., pp. 4569-4579, 2019.
[31] T. Karras, T. Aila, S. Laine, and J. Lehtinen, "Progressive growing of GANs for improved quality, stability, and variation", 6th Int. Conf. Learn. Represent. (ICLR), pp. 1-26, 2018.
[32] Y. Nagano, and Y. Kikuta, "SRGAN for super-resolving low-resolution food images", ACM Int. Conf. Proceeding Ser., pp. 33-37, 2018.
[33] X. Wang, K. Yu, S. Wu, J. Gu, Y. Liu, C. Dong, Y. Qiao, and C.C. Loy, "ESRGAN: Enhanced super-resolution generative adversarial networks", Proc. European Conference on Computer Vision (ECCV), 2018.
[34] S. Reed, Z. Akata, X. Yan, L. Logeswaran, B. Schiele, and H. Lee, "Generative adversarial text to image synthesis", 33rd Int. Conf. Mach. Learn. (ICML), pp. 1681-1690, 2016.
[35] Y. Men, Y. Mao, Y. Jiang, W.Y. Ma, and Z. Lian, "Controllable person image synthesis with attribute-decomposed GAN", Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit., 2020. [http://dx.doi.org/10.1109/CVPR42600.2020.00513]
[36] A. Karnewar, and O. Wang, "MSG-GAN: Multi-scale gradients for generative adversarial networks", Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit., pp. 7799-7808, 2020.
[37] Z. Qin, Z. Liu, P. Zhu, and Y. Xue, "A GAN-based image synthesis method for skin lesion classification", Comput. Methods Programs Biomed., vol. 195, 2020. [http://dx.doi.org/10.1016/j.cmpb.2020.105568] [PMID: 32526536]
[38] T. Karras, S. Laine, and T. Aila, "A style-based generator architecture for generative adversarial networks", Proc. IEEE Comput. Soc. Conf. Comput. Vis. Pattern Recognit., pp. 4396-4405, 2019. [http://dx.doi.org/10.1109/CVPR.2019.00453]
[39] Y. Zhang, S. Miao, T. Mansi, and R. Liao, "Unsupervised X-ray image segmentation with task driven generative adversarial networks", Med. Image Anal., vol. 62, 2020. [http://dx.doi.org/10.1016/j.media.2020.101664] [PMID: 32120268]
[40] Y.J. Lin, P.W. Wu, C.H. Chang, E. Chang, and S.W. Liao, "RelGAN: Multi-domain image-to-image translation via relative attributes", Proc. IEEE Int. Conf. Comput. Vis., pp. 5913-5921, 2019.
[41] E. Denton, A. Szlam, and R. Fergus, "Deep generative image models using a Laplacian pyramid of adversarial networks", pp. 1-9, 2015.
[42] H. Alqahtani, M. Kavakli, and T. Gulshan, "Applications of Generative Adversarial Networks (GANs): An Updated Review", Arch. Comput. Methods Eng., 2019. [http://dx.doi.org/10.1007/s11831-019-09388-y]
[43] S. Tulyakov, M.Y. Liu, X. Yang, and J. Kautz, "MoCoGAN: Decomposing Motion and Content for Video Generation", Proc. IEEE Comput. Soc. Conf. Comput. Vis. Pattern Recognit., pp. 1526-1535, 2018. [http://dx.doi.org/10.1109/CVPR.2018.00165]
[44] L. Yu, W. Zhang, J. Wang, and Y. Yu, "SeqGAN: Sequence generative adversarial nets with policy gradient", 31st AAAI Conf. Artif. Intell. (AAAI), pp. 2852-2858, 2017. [http://dx.doi.org/10.1609/aaai.v31i1.10804]
[45] G.L. Guimaraes, B. Sanchez-Lengeling, C. Outeiral, P.L.C. Farias, and A. Aspuru-Guzik, "Objective-Reinforced Generative Adversarial Networks (ORGAN) for Sequence Generation Models", arXiv preprint arXiv:1705.10843, 2017.
[46] K. Rao, C. Harris, A. Irpan, S. Levine, J. Ibarz, and M. Khansari, "RL-CycleGAN: Reinforcement Learning Aware Simulation-To-Real", Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit., pp. 11157-11166, 2020.
[47] S. Arora, R. Ge, Y. Liang, T. Ma, and Y. Zhang, "Generalization and equilibrium in generative adversarial nets (GANs)", 34th Int. Conf. Mach. Learn. (ICML), pp. 322-349, 2017.
[48] L. Theis, A. van den Oord, and M. Bethge, "A note on the evaluation of generative models", 4th Int. Conf. Learn. Represent. (ICLR), pp. 1-10, 2016.
IoT and Big Data Analytics, 2023, Vol. 2, 48-62
CHAPTER 3
Role of Blockchain in Healthcare Sector
Sheik Abdullah Abbas1,*, Karthikeyan Jothikumar2 and Arif Ansari3
1 School of Computer Science Engineering, Vellore Institute of Technology, Chennai, Tamil Nadu, India
2 Department of Computer Science Engineering, National Engineering College, Tamil Nadu, India
3 Data Sciences and Operations, Marshall School of Business, University of Southern California, Los Angeles, California, United States
Abstract: The domain of biomedical engineering faces significant challenges in data management, data attribution, data security, data availability, and privacy. A blockchain is a time-stamped set of data records maintained and managed by a cluster/group of computer systems. The collected blocks are secured and linked by cryptographic values, which form the chains. Blockchain has progressed into a variety of applications in asset management, IoT, smart devices, supply chains, public data validation, and personal identification. Even though this growth spans different emerging disciplines, the impact of blockchain on technology-based biomedical engineering has made a vast difference across various fields of operation. In a medical data management system, an audit trail is usually needed for data operations such as insert, delete, and update; blockchain is therefore suitable for processes that require an immutable ledger for recording and updating critical information services. This chapter focuses on the role of blockchain in biomedical engineering applications.
Keywords: Asset management, Biomedical appliances, Data management, Data analytics, Digital medicine, Healthcare infrastructure, Medical data.
INTRODUCTION
Blockchain is one of the continuously growing applications in information and communication technology (ICT), with a view to its significant opportunities and challenges. It has gained significance in different sorts of applications, such as data analytics, the financial sector, food technology, the Internet of Things, and biomedical engineering applications.
* Corresponding author Sheik Abdullah Abbas: School of Computer Science Engineering, Vellore Institute of Technology, Chennai, Tamil Nadu, India; E-mail: [email protected]
There exist many situations and
forms of applications which efficiently utilize blockchain in the domain of medical applications [1]. Healthcare data management is one of the most essential fields in recent days. Managing and monitoring healthcare data is becoming a tedious process with the rapid generation of data in clinical data analysis. Consequently, data-intensive applications such as healthcare data analysis need extensive data security and data management services. In healthcare diagnosis, data management and its confidentiality play a significant role for patients, doctors, and data administrators. The analysis of the correct form of data by the right person, and its follow-up, has to be maintained in order to deliver a quality healthcare service. Improvements in healthcare data analysis have moved toward e-health management services with telemedicine facilitation. To manage and transform this into an excellent form of healthcare service, blockchain and its functionalities are needed to provide a unique and well-transferable healthcare service [2]. With blockchain, the realm of traditional healthcare service can be improved in terms of reliability, safety, and security, with an impact on real-time healthcare data analytics. Biomedical informatics has three significant components for classifying the data intended to develop the decision-support model. Risk analysis and disease prediction in bioinformatics apply computational techniques to formulate this goal. Informatics techniques such as statistics, machine learning, soft computing, swarm intelligence, data mining, and data visualization have been applied to medical data. Hence, computational and statistical methods are used to determine the aspects related to a specified disease. Biomedical informatics can be broadly classified into three types based on the type of data to be processed to frame the decision-support model.
They are bioinformatics, imaging informatics, and public health informatics. The process behind biomedical informatics includes data analysis and data interpretation, which are considered significant tasks in risk determination. The platform of bioinformatics includes determining aspects related to gene structure, anticipated properties of diseases, and nucleotide polymorphism structure. This structure supports the determination of disease syndromes together with their attribute properties [3]. Protein sequences and their features can be located for a specified disease. The sequential structure of proteins and the organizational structure of nucleic acids can be clearly understood with the processing paradigms and incorporations of bioinformatics. The field of bioinformatics includes the mechanism of processing a large variety of data based on the type and nature of the informatics. It entails the development of algorithms to detect gene structure and sequence, and to predict the nature of protein structure and its function-related sequences. The primary goal of bioinformatics is to advance the understanding of biological processes. Over the decades, the field of bioinformatics has grown rapidly with technological developments in molecular biology. Common research interests in bioinformatics include the analysis and design of deoxyribonucleic acid (DNA) sequences, protein sequences, and protein structures with 3-D modeling [4].
The health sector needs to be improved by enhancing medical facilities, determining disease-specific risk factors, and spreading health awareness among people. In addition to the health sector's role, there lies an individual responsibility and awareness specific to the disease. Enhancing and rendering health-based services also depends upon the lifestyle and habits of the people in a specific region. If risk-specific syndromes are detected in advance, treatment expenses can be reduced, and thereby a cost-effective, population-based healthcare service can be rendered. Real-time healthcare data analysis involves analyzing and monitoring healthcare records from a real-time perspective, with significant analysis of risk factors and comorbidities [5]. The incorporation of blockchain technology and its functionalities gives the healthcare domain a good decision-support system, and thereby a "valuable healthcare service" can be provided.
FEATURES OF BLOCKCHAIN
Blockchain technology was first explored in the form of Bitcoin, which is considered the most popular form of cryptocurrency. Now, however, it has been explored through various technological incorporations in different fields of action.
The following features of blockchain technology make it worth exposing its usage in different fields of action: decentralized technology, enhanced security, faster settlement, consensus, distributed ledgers, and an environment that cannot be corrupted. These salient features make blockchain technology well suited to the domain of medical data analysis and its applications. Since the healthcare domain needs good and advanced forms of security to manage and store data, blockchain can provide a good platform for rendering solutions in a secure way for medical data analysis [6].
Instead of working in a centralized way, a blockchain platform works with a coordinating collection of nodes. Each node has a copy of the corresponding digital ledger. In order to add a transaction, every executing node has to check and confirm its validity. If the majority of nodes confirm that it is valid, it is added to the ledger; this mechanism improves transparency. Therefore, without the respective consent of the majority of nodes in the distributed platform, no one can add any transaction block to the ledger. Also, once a transaction block is added, there is no possibility of performing any edit, update, or delete operation on it. The information content present in the blockchain is hashed cryptographically; therefore, the information available in the network hides the exact nature of the underlying data [7]. Fig. (1) shows the opportunities and challenges observed in a typical blockchain environment.
Fig. (1). Opportunities vs Challenges of a typical blockchain environment.
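The cryptographic hashing and append-only behaviour described above can be sketched with Python's standard library. The record fields are hypothetical, and a real blockchain adds consensus, signatures, and distribution across nodes on top of this minimal structure:

```python
import hashlib
import json

def block_hash(body):
    """SHA-256 over a canonical JSON serialization of the block body."""
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()

def append_block(chain, data):
    """Append-only: each new block commits to the previous block's hash."""
    prev = chain[-1]["hash"] if chain else "0" * 64
    body = {"index": len(chain), "data": data, "prev_hash": prev}
    chain.append(dict(body, hash=block_hash(body)))

def verify_chain(chain):
    """Recompute every hash; editing any earlier block breaks the links."""
    for i, block in enumerate(chain):
        body = {k: v for k, v in block.items() if k != "hash"}
        if block["hash"] != block_hash(body):
            return False
        if i > 0 and block["prev_hash"] != chain[i - 1]["hash"]:
            return False
    return True

ledger = []
append_block(ledger, {"patient": "P-001", "op": "insert"})
append_block(ledger, {"patient": "P-001", "op": "update"})
ok_before = verify_chain(ledger)
ledger[0]["data"]["op"] = "delete"   # attempt to tamper with history
ok_after = verify_chain(ledger)
```

Because each block commits to the previous block's hash, editing any historical entry invalidates the rest of the chain, which is exactly the tamper-evidence the text relies on.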
DATA MANAGEMENT AND ITS SERVICES (TRADITIONAL VS DISTRIBUTED)
Data management in blockchain technology provides a platform for working in a unified environment with a decentralized way of handling data. The peers communicate the shared information by means of a consensus algorithm, so there is no need for a centralized manager to take responsibility for the whole process across the network. In general, a traditional database follows a client-server mechanism in which a prescribed user modifies data that is then updated and stored on the centralized server. The controlling authority is responsible for the entire flow of data regulation and updating through the database access controls.
Meanwhile, in the traditional setting, the controlling authority is responsible for the administrative controls over the database, which involve insert, delete, and update operations [8]. From the point of view of the service and operation provided to vendors, blockchain is differentiated into various types: community-level blockchain, personal-level blockchain, mixed-level (hybrid) blockchain, and associated blockchain. Blockchain technology consists of numerous decentralized nodes, each of which acts as an active participant in administrative roles. Each node confirms the participation of newly joined nodes that enter the platform and then access the database [9]. For the addition of any controls to the database, the majority of nodes must agree and confirm the access-control mechanism in order to provide the service and security features over the database. The difference among these services lies in data integrity and transparency, which are key properties for any data access and storage platform. Notable features that distinguish blockchain from traditional technology are integrity and transparency. Each user can be assured that every segment of data they access is uncorrupted and unmodified, and verification of the time-append mechanism can be made with the service corresponding to blockchain technology [10].
DATA DECENTRALIZATION AND ITS DISTRIBUTION
Most blockchain networks are decentralized in their decision-making activities, events, and database updates. With regard to this technological feature, decentralization is critically safe when handling data updates and processing [11]. This also paves the way for decentralized applications, which trust the network as a whole rather than any single entity, providing a form of trustless validation. Hence, the network trusts the medium of distribution for data transfer over the network.
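The majority-agreement rule described above can be sketched as follows; each element of the vote list stands in for one node's validation result (node counts and transactions are illustrative assumptions):

```python
def majority_approves(votes):
    """A transaction is accepted only if a strict majority of the
    validating nodes confirm it."""
    return sum(1 for v in votes if v) * 2 > len(votes)

def propose(ledger, transaction, votes):
    """Append the transaction only on majority approval; in a real
    network every node would validate the transaction itself."""
    if majority_approves(votes):
        ledger.append(transaction)
        return True
    return False

shared_ledger = []
accepted = propose(shared_ledger, {"op": "insert", "record": "R1"},
                   [True, True, True, False, False])
rejected = propose(shared_ledger, {"op": "update", "record": "R1"},
                   [True, False, False, False, True])
```

Only the majority-approved transaction reaches the shared ledger; the rejected proposal leaves it untouched.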
ASSET MANAGEMENT
Asset management is one of the key features of any technological incorporation that seeks good intervention in customer and security needs. There are various key features through which the blockchain platform can improve asset management and its maintenance. The features through which asset management benefits from blockchain technology are: open collaboration, transparency in transactions, transparency in asset management, and enforced consistency. With the creation of a system comprising the technology and its service providers, we can create a new set of processes and collaborators [12]. This makes way for introducing new vendors, with their interactions and management governed by blockchain features. The transactions made in the blockchain are immutable; therefore, the activities and the controls cannot be modified without prior permission from the network. Hence, all the events and activities performed on asset management are verified by the network before being taken into action. This enables mechanisms for data sharing, history access, flexible data management, and so on.
ANALYTICS
Analytics is defined as the process of dealing with raw data, involving various pre-processing steps, in order to build an analytical model for interpretation and evaluation. In recent days, data analytics has been used extensively by various industries and marketing agencies to draw decisions at the proper levels. The interpreted solution must reveal previously unidentified patterns from the available model so that the model behaves as a proper decision-support model [13, 14]. The domain of data-driven decision making in blockchain has been acknowledged across research areas, and there is promising growth around the idea of "big data". Meanwhile, the integrity, volume, variety, timeliness, and privacy issues of big data inhibit progress at all the phases that create value from the data. The underlying computer hardware and software usually face more complications in providing the necessary efficiency and in increasing the speed of computing over the data [15].
The applications corresponding to big data, involving the concepts of data exploitation, efficiency, profit, and sustainability, face the following challenges: scalability of computing encompassing architectures and algorithms, data querying with improvements in data processing techniques, planning techniques and tools, fault tolerance, high volume of data, powerful visualization engines, data privacy and integrity, computational speed, energy-aware resource provisioning, resource monitoring, and operational efficiency. Predictive analytics in the blockchain is a division of progressive analytics which ultimately focuses on the prediction of future events from the available data. The mechanism of prediction focuses on the analysis and determination of patterns in the data; hence, decisions can be made at the proper levels to predict the near future [16]. Predictive data analytics deals with various techniques in statistics, data mining, analytics, and information management. As an inference, executable predictive data models are derived from the execution. Hence, blockchain technology for healthcare specifically focuses on the methods used for data availability [17]. Blockchain technology in healthcare benefits patients and the user-supported environment in many ways, such as: mastering of patient indices, a single mode of longitudinal patient records, compilation of regular visits (episodes), records corresponding to a specific disease, laboratory results, treatments, ambulance services, hospitalization data, and medications.
Analytics Process Model
Data analytics is an interdisciplinary platform that supports all types of data to be processed and analyzed in order to determine useful patterns from it. It mainly focuses on the development of an algorithmic model that best suits the defined domain for the application in process. The observed data need to be carefully analyzed, and suitable data pre-processing methods have to be applied; this is considered an important step of the analytical process model. The execution flow of the model is given in Fig. (2).
Fig. (2). Basic workflow of analytics process model.
The data of the medical domain comes in various formats for a single patient record, and in this case it has to be clearly observed and monitored [18]. Thereby, the data transformation for each segment of the medical data type, whether in text, image, audio, or video format, has to be clearly observed and updated with regard to the corresponding patient. This can be made significantly easier by blockchain technology and its incorporation at all levels of data storage, access, and security. This can still be challenging to some extent, which can leave users with some erroneous results; the same can be handled efficiently by the algorithmic incorporations in data analytics and its process-model depictions, where data plays a significant role [19].
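The pre-process-then-model flow of the analytics process model can be illustrated with a toy, stdlib-only sketch; the vital-sign values and the |z| > 1.5 outlier threshold are invented purely for illustration:

```python
import statistics

def standardize(values):
    """Pre-processing step: z-score each measurement so that features
    on different scales become comparable."""
    mu = statistics.mean(values)
    sigma = statistics.pstdev(values)
    return [(v - mu) / sigma for v in values]

# Toy cohort: one vital-sign reading per patient (values invented
# purely for illustration).
raw_readings = [98.0, 99.1, 97.8, 104.5, 98.4]
z_scores = standardize(raw_readings)

# "Model" step: flag outlying patients (|z| > 1.5) for follow-up,
# a stand-in for the decision-support model described in the text.
follow_up_flags = [abs(z) > 1.5 for z in z_scores]
```

The same two-stage shape (careful pre-processing, then a model that surfaces actionable patterns) is what the analytics process model formalizes for real clinical data.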
Blockchain in Healthcare
IoT and Big Data Analytics, Vol. 2 55
Analytic Model Requirements

The requirements of the analytical model begin with a questionnaire designed to frame the problems to be solved; it has to be clearly answered and properly defined at all levels. The model to be developed must exhibit predictive power for the observed dataset. It must be interpretable and justifiable for the actual algorithmic incorporation and its requirements, and it must have well-established statistical and analytical power, with proof of validation at an affordable cost-benefit [28].

IMMUTABILITY FOR BIOMEDICAL APPLIANCES IN BLOCKCHAIN

Beyond its use for cryptocurrency, the Ethereum platform in blockchain technology has provided a way for various applications; in general, healthcare applications and their supporting strategies are developed using the Ethereum framework. In this digital era, data is one of the strongest assets and is considered crucial with respect to processing, retrieval, and storage [20]. With this extensive development in technology, the usage of data has been made secure in numerous aspects. In particular, blockchain provides a means of sharing and retrieving Electronic Health Records (EHR) securely, protecting patients' information in terms of privacy and data security [21, 22]. Blockchain supports various patient-centric services and the disparate parties involved in the modern healthcare system. A patient's information is among the most confidential of all and must be protected from malicious attacks such as DoS, dropping attacks, mining attacks, and so on. Blockchain provides various access control mechanisms to deal with the data, mainly concerning medical records and their applications. Fig. (3) provides the entire structure of a Hospital Management System (HMS).
Fig. (3). Overview of Hospital Management System.
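The access-control idea behind such an HMS can be sketched as an allow-list per record type: a request is served only if the requester's role is listed. The role names and record layout below are illustrative assumptions, not a blockchain implementation:

```python
# Hedged sketch of role-based access control over medical records:
# each record type carries an allow-list of roles.
RECORD_ACCESS = {
    "lab_results": {"doctor", "lab_technician", "patient"},
    "medications": {"doctor", "pharmacist", "patient"},
    "billing":     {"administrator", "patient"},
}

def can_access(role, record_type):
    """Return True only if the role is allow-listed for the record type."""
    return role in RECORD_ACCESS.get(record_type, set())

print(can_access("doctor", "lab_results"))   # True
print(can_access("doctor", "billing"))       # False: not on the allow-list
```

In a blockchain-backed system, the allow-list itself would be stored on-chain so that changes to it are auditable.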
SECURITY AND PRIVACY

The foremost security feature of blockchain is built from blocks of cryptographic chains. The blockchain is designed with security attributes such as tamper-resistance, consistency, resistance to DoS attacks, and resistance to double-spending attacks. In addition to consistency, availability, and integrity, unlinkability and confidentiality can be enhanced to improve security through the consensus algorithm and its improvements [23, 24]. With consistency, the global distributed ledger confirms the systematic property that all the specified nodes hold the same ledger at the same time [25]. In general, tamper-resistance is defined as resistance against intentional tampering with an entity, whether by a user or by adversaries with keen access to the entity [26]. Hence any record that has been stored on the blockchain platform cannot be modified or tampered with after block generation [27, 28]. This is one key property that can be used specifically in biomedical engineering applications where the alteration or modification of data is prohibited. The features of blockchain in the security aspect with regard to biomedical applications are: increased capacity, enhanced security, immutability, minting, quicker settlement, and decentralized system and operation. The traditional healthcare system is slow, as it requires an abundance of time for a transaction or patient process to be completed [29]. This is why most hospitals need to upgrade their systems and infrastructure for handling medical records. This problem can be approached through technological incorporations from blockchain for a faster response, which saves money and time, thereby providing patients with a convenient environment at a safe level.
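The tamper-resistance property described above can be sketched in a few lines: each block stores the hash of its predecessor, so modifying any stored record breaks every later link. This is an illustrative sketch, not the consensus machinery of a real blockchain:

```python
# Minimal hash-chain sketch of blockchain immutability.
import hashlib
import json

def block_hash(block):
    payload = json.dumps(block, sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()

def make_block(data, prev_hash):
    return {"data": data, "prev_hash": prev_hash}

def chain_is_valid(chain):
    """Recompute each link: block i must store the hash of block i-1."""
    return all(chain[i]["prev_hash"] == block_hash(chain[i - 1])
               for i in range(1, len(chain)))

genesis = make_block("patient-record-1", "0" * 64)
chain = [genesis, make_block("patient-record-2", block_hash(genesis))]
print(chain_is_valid(chain))      # True

chain[0]["data"] = "tampered"     # modify an already-stored record
print(chain_is_valid(chain))      # False: the chain detects the change
```

This is exactly why altering a stored medical record is detectable: the change would have to be re-hashed through every subsequent block.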
BLOCKCHAIN IN BIOMEDICINE AND ITS APPLICATIONS

Blockchain is considered one of the distributed immutable ledger technologies introduced to enable various services in recent fields of evolution [30]. The promising goal for biomedical problems is to obtain solutions oriented toward user-centric activities and user satisfaction. Here, blockchain and its functionalities have lent their framework to the development of biomedical applications and their data-centric activities. Research reveals the integration, access, and control of patient EHRs, addressing clinical results, medical research, clinical medications, and the supply chain focused on medical research and its applications [31]. The most fundamental component is adapting the infrastructure for the modern
healthcare information system, which exchanges data in various formats and modes of transfer, specifically between patients, service providers, and other trusted parties. The data maintenance and management services provide improved medical record management, a better claims process, and improved research based on clinical trials and experimentation. The key features of blockchain for biomedical engineering applications are: improved patient-managed EHRs, patient records non-modifiable by anyone other than the owner/doctor/administrative authority, source-verifiable records, increased security, etc. From all perspectives, the management of medical records is one of the critical parts, where the complete information regarding the patient lies. This is important because all fields of data can be ascertained from the EHR; the details of the medical record should therefore be managed by the built-in infrastructure facilities supported by blockchain technology [32]. The features that make blockchain technology fit for the healthcare sector are: design, data access, data sharing, decision-support model generation, technological updates, governance, functionality, security, and confidentiality. The ultimate goal behind blockchain technology is to improve healthcare and its associated co-factors in order to provide a valuable healthcare service. Beyond these benefits, the transparent handling of data with a distributed, immutable, and trust-based model makes the healthcare system a more improved and beneficial platform for healthcare technology [33].

Case Study

A sample case study has been investigated for a blockchain dataset and its attributes, developing a predictive model in order to determine the most significant attributes.
The implementation proceeds on a Bitcoin evaluation dataset with 22 attributes and a class label. Model development begins with data pre-processing, applying normalization techniques; the normalization used for this evaluation is Z-score normalization. Once normalization is applied, the data classification scheme is selected: a tree-based (decision tree) classifier and an ensemble (random forest) classifier. The classifiers are distinct in their execution parameters and supporting paradigms. Let us visualize the output produced by the decision tree algorithm: the tree in Fig. (4) depicts the output as observed from the decision tree.
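The Z-score normalization step can be sketched directly: each attribute value is rescaled to (x - mean) / std, so every attribute ends up with mean 0 and standard deviation 1. The sample values below are an illustrative stand-in for one attribute of the 22-attribute dataset, not real data:

```python
# Hedged sketch of Z-score normalization applied before classification.
import statistics

def z_score(values):
    mean = statistics.mean(values)
    std = statistics.pstdev(values)          # population standard deviation
    return [(v - mean) / std for v in values]

btc_output_volume = [120.0, 135.0, 150.0, 165.0, 180.0]   # synthetic values
normalized = z_score(btc_output_volume)
print([round(v, 3) for v in normalized])
# [-1.414, -0.707, 0.0, 0.707, 1.414]
```

Normalizing every attribute to a common scale prevents attributes with large numeric ranges from dominating the split criteria of the tree-based classifiers.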
Fig. (4). Output Decision Tree.
In the output decision tree, btc_output_volume is found to be the root node for all the succeeding attributes, and the splitting criterion used for evaluation is gain ratio. The confusion matrix obtained from the decision tree, with 70.48% accuracy, is depicted in Table 1.

Table 1. Confusion matrix obtained using the Decision tree algorithm.

Matrix         True Zero  True One  True Three  True Two  True Four  True Five  True Seven  Class Precision
pred. zero       2039       270        63         152        27         11          0           79.59%
pred. one           3         4         0           0         1          0          0           50.00%
pred. three         0         0         0           0         0          0          0            0.00%
pred. two         271        52         9          15         2          0          1            4.29%
pred. four          0         0         0           0         0          0          0            0.00%
pred. five          0         0         0           0         0          0          0            0.00%
pred. seven         0         0         0           0         0          0          0            0.00%
class recall   88.15%      1.23%     0.00%       8.98%     0.00%      0.00%      0.00%
The following metrics were observed for the evaluation of the blockchain dataset:

absolute_error          : 0.371 +/- 0.433
mean_squared_error      : 0.411
kendall_tau             : -0.033
spearman_rho            : -0.035
weighted_mean_precision : 13.57%
weighted_mean_recall    : 13.44%
kappa                   : 0.009
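As a sanity check, the reported 70.48% accuracy and the 88.15% recall for class zero can be re-derived from the confusion matrix in Table 1. This sketch assumes rows are predicted classes and columns are true classes, in the order zero, one, three, two, four, five, seven, as printed in the table:

```python
# Re-deriving the headline metrics from the Table 1 counts.
cm = [
    [2039, 270, 63, 152, 27, 11, 0],   # pred. zero
    [3,      4,  0,   0,  1,  0, 0],   # pred. one
    [0,      0,  0,   0,  0,  0, 0],   # pred. three
    [271,   52,  9,  15,  2,  0, 1],   # pred. two
    [0,      0,  0,   0,  0,  0, 0],   # pred. four
    [0,      0,  0,   0,  0,  0, 0],   # pred. five
    [0,      0,  0,   0,  0,  0, 0],   # pred. seven
]

total = sum(sum(row) for row in cm)
correct = sum(cm[i][i] for i in range(len(cm)))
accuracy = 100.0 * correct / total
print(round(accuracy, 2))            # 70.48, matching the reported accuracy

# Recall of class "zero": diagonal cell over its column total.
col0 = sum(row[0] for row in cm)
print(round(100.0 * cm[0][0] / col0, 2))   # 88.15, matching the class recall
```

The low kappa (0.009) reflects what the matrix shows: almost all correct predictions come from the dominant "zero" class.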
Accordingly, model development and evaluation can be carried out for any sort of dataset available in raw format. A similar analysis has been made using the random forest algorithm, which generates multiple trees in connection with weight assumptions. The graph in Fig. (5) illustrates the weight assumptions produced by the random forest algorithm. From Fig. (5), it can be observed that the weight assumption using random forest generates multiple assumptions and considerations for the blockchain dataset. This provides an efficient way of handling data-analytic model development for the given dataset, and the approach is platform-independent, enabling patterns to be drawn from various sorts of real-time datasets and their applications [34, 35].
Fig. (5). Result obtained using Random Forest algorithm.
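The random forest idea behind Fig. (5) can be sketched as follows: many small trees are grown on random attribute choices, each casts a vote, and the forest "weights" attributes by how often they are chosen for splits. The one-split "stump" trees and toy data below are illustrative assumptions, not the fitted model from the case study:

```python
# Hedged sketch of a random forest as an ensemble of random stumps.
import random

def grow_stump(rows):
    """Pick a random attribute and split at its mean (labels ignored
    for simplicity); a one-split 'tree'."""
    attr = random.randrange(len(rows[0]))
    threshold = sum(r[attr] for r in rows) / len(rows)
    return attr, threshold

def forest_predict(forest, row):
    votes = sum(1 if row[attr] > thr else 0 for attr, thr in forest)
    return votes > len(forest) / 2          # majority vote

random.seed(0)
rows = [[1.0, 10.0], [2.0, 20.0], [8.0, 80.0], [9.0, 90.0]]
forest = [grow_stump(rows) for _ in range(25)]

# Attribute "weight": how many trees split on each attribute.
counts = [sum(attr == i for attr, _ in forest) for i in range(2)]
print(forest_predict(forest, [9.5, 95.0]))   # True: lands in the high class
print(sum(counts) == len(forest))            # True: every tree split somewhere
```

A real random forest also bootstraps the training rows and grows full trees, but the vote-and-weight structure is the same.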
CONCLUSION AND FUTURE WORK

Medical data analysis and management is one of the important sectors in which data has to be managed and monitored confidentially, and technological impacts, including their adverse effects, must still yield a good outcome for biomedical engineering and its applications. In this regard, blockchain provides a good means of data transmission, storage, analysis, and management, and a well-established infrastructure for medical data management and access, with its various significant features. This chapter has focused on blockchain and its services for biomedical engineering and its applications, providing a view of blockchain technology, its features, data decentralization, asset management, analytics, immutability, and
data privacy with its security constraints. Hence the future of blockchain technology must extend its service and solutions to various sorts of real-time applications. Future work can be made in accordance with the applicability of blockchain to real-time healthcare monitoring systems.

REFERENCES
[1] A. Ashta, and G. Biot-Paquerot, "FinTech evolution: Strategic value management issues in a fast changing industry", Strateg. Change, vol. 27, no. 4, pp. 301-311, 2018. [http://dx.doi.org/10.1002/jsc.2203]
[2] R.T. Aune, A. Krellenstein, M. O’Hara, and O. Slama, "Footprints on a Blockchain: Trading and Information Leakage in Distributed Ledgers", Journal of Trading, vol. 12, no. 3, pp. 5-13, 2017. [http://dx.doi.org/10.3905/jot.2017.12.3.005]
[3] C. Pranav, "Applications of blockchain in healthcare", Center for Open Science, 2020. [http://dx.doi.org/10.31224/osf.io/nkvcd]
[4] R. Banach, "Blockchain applications beyond the cryptocurrency casino: The Punishment not Reward blockchain architecture", Concurr. Comput., vol. 33, no. 1, 2021. [http://dx.doi.org/10.1002/cpe.5749]
[5] R. Böhme, N. Christin, B. Edelman, and T. Moore, "Bitcoin: Economics, Technology, and Governance", J. Econ. Perspect., vol. 29, no. 2, pp. 213-238, 2015. [http://dx.doi.org/10.1257/jep.29.2.213]
[6] M.A. Cyran, "Blockchain as a foundation for sharing healthcare data", Blockchain in Healthcare Today, 2018. [http://dx.doi.org/10.30953/bhty.v1.13]
[7] C. Michael, P. Nachiappan, V. Sanjeev, and K. Vignesh, "Blockchain Technology: Beyond Bitcoin", Applied Innovation Review, no. 2, pp. 6-19, 2016.
[8] B. Chevallereau, G. Carter, and S. Sneha, "Voice Biometrics and Blockchain: Secure Interoperable Data Exchange for Healthcare", Blockchain in Healthcare Today, vol. 2, 2019. [http://dx.doi.org/10.30953/bhty.v2.119]
[9] J. Dai, and M.A. Vasarhelyi, "Toward Blockchain-Based Accounting and Assurance", J. Inf. Syst., vol. 31, no. 3, pp. 5-21, 2017. [http://dx.doi.org/10.2308/isys-51804]
[10] H. Deng, R.H. Huang, and Q. Wu, "The regulation of initial coin offerings in China: Problems, prognoses and prospects", European Business Organization Law Review, vol. 19, no. 3, pp. 465-502, 2018. [http://dx.doi.org/10.1007/s40804-018-0118-2]
[11] H. Joseph, "Enterprise blockchain sales and solutions engineering", Architecting Enterprise Blockchain Solutions, Wiley, pp. 137-162, 2020. [http://dx.doi.org/10.1002/9781119557722.ch5]
[12] M.A. Engelhardt, "Hitching Healthcare to the Chain: An Introduction to Blockchain Technology in the Healthcare Sector", Technol. Innov. Manag. Rev., vol. 7, no. 10, pp. 22-34, 2017. [http://dx.doi.org/10.22215/timreview/1111]
[13] W. Antony, "Future of Blockchain", Commercializing Blockchain, Wiley, pp. 251-268, 2019. [http://dx.doi.org/10.1002/9781119578048.ch13]
[14] K. Lee, "Towards on blockchain standardization including blockchain as a service", Journal of Security Engineering, vol. 14, no. 3, pp. 231-238, 2017. [http://dx.doi.org/10.14257/jse.2017.06.05]
[15] N. Arvind, B. Joseph, F. Edward, M. Andrew, and G. Steven, "Bitcoin and Cryptocurrency Technologies", Network Security, vol. 2016, no. 8, p. 4, 2016. [http://dx.doi.org/10.1016/S1353-4858(16)30074-5]
[16] M. O’Dair, and Z. Beaven, "The networked record industry: How blockchain technology could transform the record industry", Strateg. Change, vol. 26, no. 5, pp. 471-480, 2017. [http://dx.doi.org/10.1002/jsc.2147]
[17] A. Pazaitis, P. De Filippi, and V. Kostakis, "Blockchain and value systems in the sharing economy: The illustrative case of Backfeed", Technol. Forecast. Soc. Change, vol. 125, pp. 105-115, 2017. [http://dx.doi.org/10.1016/j.techfore.2017.05.025]
[18] I. Radanović, and R. Likić, "Opportunities for use of blockchain technology in medicine", Applied Health Economics and Health Policy, vol. 16, no. 5, pp. 583-590, 2018. [http://dx.doi.org/10.1007/s40258-018-0412-8]
[19] A. Welfare, Commercializing Blockchain, Wiley, 2019. [http://dx.doi.org/10.1002/9781119578048]
[20] M.S. Gross, and R.C. Miller, "Ethical Implementation of the Learning Healthcare System with Blockchain Technology", Blockchain in Healthcare Today, vol. 2, 2019. [http://dx.doi.org/10.30953/bhty.v2.113]
[21] P. Gomber, R.J. Kauffman, C. Parker, and B.W. Weber, "On the Fintech Revolution: Interpreting the Forces of Innovation, Disruption, and Transformation in Financial Services", J. Manage. Inf. Syst., vol. 35, no. 1, pp. 220-265, 2018. [http://dx.doi.org/10.1080/07421222.2018.1440766]
[22] A.F. Hussein, A.K. ALZubaidi, Q.A. Habash, and M.M. Jaber, "An Adaptive Biomedical Data Managing Scheme Based on the Blockchain Technique", Appl. Sci. (Basel), vol. 9, no. 12, p. 2494, 2019. [http://dx.doi.org/10.3390/app9122494]
[23] X. Xu, I. Weber, and M. Staples, "Model-Driven Engineering for Blockchain Applications", Architecture for Blockchain Applications, Springer, Cham, 2019. [http://dx.doi.org/10.1007/978-3-030-03035-3_8]
[24] X. Xu, I. Weber, and M. Staples, "Design Process for Applications on Blockchain", Architecture for Blockchain Applications, Springer, Cham, 2019. [http://dx.doi.org/10.1007/978-3-030-03035-3_6]
[25] W. Antony, "Introduction to Blockchain Technology", Commercializing Blockchain, Wiley, pp. 7-35, 2019. [http://dx.doi.org/10.1002/9781119578048.ch1]
[26] X. Perry, "Java programming for blockchain applications", Practical Java Programming for IoT, AI, and Blockchain, Wiley, pp. 347-388, 2018. [http://dx.doi.org/10.1002/9781119560050.ch10]
[27] A. Sheik Abdullah, S. Selvakumar, C. Ramya, V. Priyadharsini, and C. Reshma, "A Survey on Evolutionary Techniques for Feature Selection", International Conference on Emerging Devices and Smart Systems (ICEDSS), 2017. [http://dx.doi.org/10.1109/ICEDSS.2017.8073659]
[28] Y. Wang, J.H. Han, and P. Beynon-Davies, "Understanding blockchain technology for future supply chains: a systematic literature review and research agenda", Supply Chain Management: An International Journal, vol. 24, no. 1, pp. 62-84, 2018. [http://dx.doi.org/10.1108/SCM-03-2018-0148]
[29] A. Sheik Abdullah, R. Parkavi, P. Priyadharshini, and T. Saranya, "Data Analytics and its Applications to Cyber-Physical Systems", Cyber-Physical Systems and Industry 4.0: Practical Applications and Security Management, 2021.
[30] R. Suganya, S. Rajaram, and A.S. Abdullah, Big Data in Medical Image Processing, CRC Press, Taylor and Francis, 2018.
[31] X. Xu, I. Weber, and M. Staples, "Blockchain in Software Architecture", Architecture for Blockchain Applications, Springer, Cham, pp. 83-92, 2019. [http://dx.doi.org/10.1007/978-3-030-03035-3_5]
[32] A. Sheik Abdullah, S. Selvakumar, and C. Ramya, "Descriptive Analytics", Applying Predictive Analytics Within the Service Sector, IGI Global, pp. 88-112, 2017. [http://dx.doi.org/10.4018/978-1-5225-2148-8.ch006]
[33] A. Sheik Abdullah, and S. Selvakumar, "Assessment of the risk factors for type II diabetes using an improved combination of particle swarm optimization and decision trees by evaluation with Fisher’s linear discriminant analysis", Soft Comput., vol. 23, no. 20, pp. 9995-10017, 2019. [http://dx.doi.org/10.1007/s00500-018-3555-5]
[34] A.S. Abdullah, S. Selvakumar, P. Karthikeyan, and M. Venkatesh, "Comparing the efficacy of decision tree and its variants using medical data", Indian J. Sci. Tech., vol. 10, no. 18, 2017. [http://dx.doi.org/10.17485/ijst/2017/v10i18/111768]
[35] A.S. Abdullah, S. Selvakumar, and A.M. Abirami, "An introduction to data analytics", Handbook of Research on Advanced Data Mining Techniques and Applications for Business Intelligence, IGI Global, pp. 1-14, 2017. [http://dx.doi.org/10.4018/978-1-5225-2031-3.ch001]
IoT and Big Data Analytics, 2023, Vol. 2, 63-89
CHAPTER 4
Brain Tumor Detection Based on Different Deep Neural Networks - A Comparison Study

Shrividhiya Gaikwad1, Srujana Kanchisamudram Seshagiribabu1, Sukruta Nagraj Kashyap1, Chitrapadi Gururaj1,* and Induja Kanchisamudram Seshagiribabu2
1 Department of Electronics and Telecommunication Engineering, BMS College of Engineering, Bengaluru, Visvesvaraya Technological University, Belagavi, India
2 Department of Computer Science (Artificial Intelligence), Andrew and Erna Viterbi School of Engineering, University of Southern California, Los Angeles, California, United States
Abstract: Glioblastoma, better known as brain cancer, is an aggressive and fatal type of cancer. Biomedical imaging technology now plays a prominent part in the diagnosis of cancer, and Magnetic Resonance Imaging (MRI) is among the most efficient methods for detecting and locating brain tumors. Examining these images requires domain knowledge and is prone to human error. As computer-aided diagnosis is not widely used, this is one attempt to develop different models to detect brain tumors from MRI images. In this chapter, we carry out a comparison between three different Convolutional Neural Network architectures, a basic CNN, VGG16, and ResNet50, and visually represent the result to users through a GUI. Users can upload their MRI scans and check the tumor region if they have been diagnosed with cancer. Initially, pre-processed data is taken as input, and features are extracted based on the different model approaches. Lastly, the Softmax function is used for binary classification of the tumor. To further validate the methodology, parameters like Accuracy, Recall, Precision, Sensitivity, Specificity, and F1 score are calculated. We observed up to 86% accuracy with the CNN model, whereas VGG16 and ResNet50 achieved an accuracy of 100% on our test dataset and 96% on our validation dataset.
Keywords: Bottleneck design, Brain tumor, CNN, Confusion matrix, Contouring, Data augmentation, Data pre-processing, Deep neural network, GUI, MRI images, Residual blocks, ResNet50, Transfer learning, Tumor region, User interface, Vanishing gradient, VGG16, Windows application.

* Corresponding author Chitrapadi Gururaj: Department of Electronics and Telecommunication Engineering, BMS College of Engineering, Bengaluru, Visvesvaraya Technological University, Belagavi, India; E-mail: [email protected]

Parma Nand, Vishal Jain, Dac-Nhuong Le, Jyotir Moy Chatterjee, Ramani Kannan, and Abhishek S. Verma (Eds.) All rights reserved-© 2023 Bentham Science Publishers
INTRODUCTION

In the biomedical sciences, the conventional approach to identifying and classifying tumors is human inspection. Because of the abnormal proliferation of cells in the brain, brain tumors have a negative impact on humans: they can interrupt genuine brain function and be fatal. Manual inspection is more liable to human error, time-consuming, and in specific instances very impractical for analysis. Moreover, the treatment therapy depends upon the tumor's degree at the time of inspection, the pathogenic type, and the tumor's category [1]. With advances in medical and biological imaging technology like Computed Tomography (CT) and Magnetic Resonance Imaging (MRI), detection has become much more effective and precise [2]. In the healthcare sector, computer-aided technologies for diagnosis and surgery, and Artificial Intelligence techniques, play a significant role. Any analysis of abnormally fast-growing tumor cells needs automatic brain tumor segmentation from multimodal MR images [3]. Hence a model with increased efficiency is necessary for the accurate identification of tumors, as computer-aided diagnosis of brain tumors is usually not opted for because of the various factors influencing diagnosis [4]. As a result, the proposed work focuses on correctly recognizing brain-tumor and non-tumor MRIs using techniques that isolate the tumor in the images before applying various deep learning and classification models. From another perspective, deep learning is a subsection of artificial intelligence that focuses on a model learning from experience, based on the data provided to it. The model's decisions are based on the various input data fed to it during the learning phase. This learning method entails extracting features or patterns from the data using the model's algorithm.
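The idea of "learning from experience" can be sketched at its smallest scale: a single neuron adjusts its weight by gradient descent until its output matches the labeled data. The learning rate and toy data below are illustrative assumptions:

```python
# Minimal sketch of gradient-descent learning with one weight.
w = 0.0                                       # the weight to be learned
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]   # target relation: y = 2x

for epoch in range(200):
    for x, y in data:
        prediction = w * x
        error = prediction - y                # gradient of 0.5 * error**2
        w -= 0.05 * error * x                 # step against the gradient

print(round(w, 3))                            # ~2.0: the learned pattern
```

A deep network repeats this same weight update across millions of parameters and many hidden layers, with backpropagation distributing the error to each weight.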
As new input data is applied to the neural network, weights are determined as the data moves through the neurons (also known as activation units), and these weights are used to minimize the loss. In addition, hidden layers exist between the input and output layers, which aid in decision-making. In the final layer, a prediction is made on the basis of the model, and the weights are updated. The prediction is then evaluated for accuracy, and if the accuracy is acceptable, the model is deployed. Furthermore, based on our observations, we conclude that an appropriate pixel size should be used to detect the feature or pattern for a given number of hidden layers. This entire procedure is referred to as data pre-processing, and it is a
crucial step before training the classification model. Data augmentation (a form of pre-processing) is used to improve the model's accuracy.

RELATED WORK

The diagnosis of brain tumors is usually made using imaging data and examination of brain tumor scans. The correct interpretation of these images is crucial in establishing a patient's status. However, the accumulation of doctors' medical knowledge, differences in experience levels, and evident exhaustion can all affect how well image results are evaluated. Therefore, a way to correctly identify brain tumors in scans is very important. Clinical research is aided by the use of PCA-based feature extraction and pattern recognition [5]. This suggests that the entire cerebral venous system is imaged separately using MRI. In layman's terms, a segmenting function is carried out, distinguished by a high level of homogeneity between the anatomy and adjacent brain tissue. The convolutional neural network classification method has been used to train and test the accuracy of tumor identification in brain MRI images [2]. In terms of delineations, there is a high degree of spatial resemblance and consistency in volume estimates. The strategy beat existing methods for segmenting brain cells not just in terms of volume-similarity metrics, but also in terms of segmentation time [5]. A framework for brain tumor segmentation treats the segmentation problem as a machine learning problem [6]. The suggested brain tumor detection method can efficiently detect tumor cells, with improved results in terms of correlation coefficient, sensitivity, and specificity, according to experimental data. In comparison with the above-mentioned brain tumor detection approaches, the detection accuracy of a 2D detection network and single mode is greatly enhanced [7]. Deep Convolutional Neural Networks (ConvNets) are being investigated for classifying brain tumors utilizing multisequence MR images [8].
The goal is to create an algorithm that makes it easier to extract data from the brain's right and left hemispheres while simultaneously highlighting higher-level statistical features drawn from different levels of the specified brain area. This approach can be used to locate tumor cells using a single spectral magnetic resonance picture [9]. RDM-Net combines a residual and a dilated convolutional network; it can alleviate vanishing-gradient problems while also increasing the receptive field without lowering the resolution. Some knowledge about regions with small tumors can be discarded in image processing, as resolution is diluted to single pixels through successive convolution procedures [10]. A Convolutional Neural Network has been used to improve diagnostic outcomes: it primarily segments and
66 IoT and Big Data Analytics, Vol. 2
Gaikwad et al.
detects MRI brain tumor scans before using convolution to increase efficiency and rate through machine learning technologies. PCA-based feature extraction and pattern identification are carried out, which aids clinical research [11]. To validate this model, the CNN architectures VGG16 and ResNet50 were trained, and the identification and classification of tumors were also examined [12]. In computer-aided diagnostics, automatic tissue-type classification for the region of interest (ROI) is critical. The ROI is dilated and made more useful than before; this is supported by augmenting the tumor region to produce finer sub-regions and histograms that show high-quality classification with greater accuracy. Studies in [13] described a hybrid machine learning system for brain tumor classification based on the Genetic Algorithm and Support Vector Machine (SVM). A comparison study of the two and GA-SVM was recommended, taking into account the texture and intensity of the current brain tumor [14]. Fine-tuning the parameters that balance the model is unavoidable for improving the outcomes, and it is one of the most difficult duties for real-world applications [15]. Image slice sampling is fed into a Squeeze-and-Excitation ResNet to provide an automatic method for brain tumor classification from MRI; a Convolutional Neural Network (CNN) with zero-centering and normalization is used in this model [16]. A model based on ResNet and randomized neural networks goes into additional depth about pathological brain identification from magnetic resonance images of the brain; to conduct ELM-CBA subsequently, ResNet is utilized as the feature extractor, which is a convolutional neural network structure [17]. To extract features, a feed-forward back-propagation artificial neural network with k-nearest neighbors is utilized, with a first stage of discrete wavelet transformation (DWT) [18] and a second stage of principal component analysis (PCA) [19].
Research in [20] applied CNNs for image classification, segmentation, registration, and detection to automate the classification of brain tumors, and further detailed VGG16, a CNN architectural model.

APPROACH

For accurate tumor detection, the collection of brain MRI images is subjected to various techniques. These are the data pre-processing steps, including data augmentation, so that a huge dataset can be considered without overfitting. We then feed the processed data to a Convolutional Neural Network to classify the images as tumor or non-tumor. To improve detection accuracy, contouring is performed after augmentation, before feeding the data to pre-trained models like VGG16 and ResNet50, which use the notion of transfer learning, as shown in
Fig. (1). Using all three models, a comparison study is carried out that identifies which of the three is the best model for the detection of brain tumors.
Fig. (1). Flowchart of implementation.
Dataset

MRI images of the brain are taken as the dataset, with 250 images showing the presence of a tumor and 110 showing its absence. The datasets are obtained from Kaggle, an open-source platform with large datasets mainly used in ML and DL studies. To increase the accuracy and precision of the model, data augmentation is performed so that larger data can be used in detection.

Data Pre-Processing

As discussed earlier, data pre-processing is a crucial part of image processing. Machine Learning and Deep Learning algorithms work effectively when pre-processed data is fed as the input [21]; it is the most important step in data mining [22]. Pre-processing is a technique to modify the given data so that using the modified data gives a significant change in the efficiency of the algorithm. Later, feature encoding is performed on the data, transforming it so that the model easily accepts it without losing the original meaning [23]. The datasets we obtained from Kaggle have undergone this pre-processing, with further splitting into Train, Test, and Validation sets of images, as can be observed in Fig. (2). There is difficulty in distinguishing brain tumor tissue from normal tissue because of the similar color, and brain tumors must be analyzed accurately [24]; thus, this splitting of the data is advisable.
Train Dataset: This dataset is used to train our algorithm to build a model by learning from the data. The learning process includes the model learning all the intricate details of our data to extract the necessary information and recognize patterns.

Validation Dataset: This dataset verifies the parameters/weights obtained during the training phase in order to improve the model's performance. To put it another way, it aids the model in determining how good the acquired hyperparameters are. The model does not learn from this dataset, but it does use it as a source of information to improve its hyperparameters.

Test Dataset: This dataset is used to test our model assumptions. This data is not revealed until the final model and its weights are decided. Afterwards, experiments on this untouched test data give us a brief overview of how the model performs on real-world data.
Fig. (2). Splitting of dataset.
The split ratio depends on the objective of the problem; in this case of brain tumor detection, we have given more emphasis to the training data. Each time the model is run, the probability of the occurrence of images in each set changes owing to different combinations of datasets. As a result, a random split of 193 training, 50 validation, and 10 test images is made. The images possess different pixel sizes, i.e., the width and height of the image differ, and even the black corners of the images differ, resulting in distorted pictures after resizing a few with a wide width. Hence a histogram of ratio distributions (ratio distribution = image width / image height) is also shown in Fig. (3). This is referred to as thresholding.
Deep Neural Networks
Fig. (3). Histogram of image ratios.
Data Augmentation

Image data augmentation is a technique that modifies images in different ways to extend the size of the dataset, helping the model learn better. Deep Learning models trained with augmented data perform better: augmentation creates multiple viewpoints of the information, allowing fitted models to apply what they learned from the additional images more effectively. Two methods, data warping and synthetic over-sampling, can be used to grow the dataset beyond what previously existed [25]. Different operations are utilized, causing images to be rotated, shifted, cropped, flipped, scaled and modified in brightness. Any imbalance in image classes between training and testing data can be overcome by the augmentation process. Many open-source Python packages offer data augmentation functions; be it ImageDataGenerator in Keras or OpenCV, the simple transformation operations mentioned above can be performed with them. The two poor-prediction conditions, overfitting and underfitting*, can be substantially prevented by data augmentation. Having a large dataset is absolutely crucial for the proper working of a computer vision model, and deep learning techniques gain from this to produce a very accurate model [26]. We have applied different angles of rotation, such as 5, 15, 30 and 45 degrees. As can be seen in Fig. (4), other modifications such as horizontal and vertical flips and width and height shifts are also performed as part of augmenting the data.
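The flip and shift operations above can be sketched with plain NumPy; this is a minimal stand-in for what Keras's ImageDataGenerator does, with rotation by arbitrary angles (such as the 5–45 degrees used here) omitted, since that typically needs an interpolating routine like `scipy.ndimage.rotate`.

```python
import numpy as np

def augment(image, shift=2):
    """Generate simple augmented variants of a 2-D grayscale image:
    horizontal/vertical flips and width/height shifts (vacated border zero-filled)."""
    return {
        "horizontal_flip": np.fliplr(image),
        "vertical_flip": np.flipud(image),
        # width shift: pad zero columns on the left, then crop back to size
        "width_shift": np.pad(image, ((0, 0), (shift, 0)))[:, :image.shape[1]],
        # height shift: pad zero rows on top, then crop back to size
        "height_shift": np.pad(image, ((shift, 0), (0, 0)))[:image.shape[0], :],
    }

img = np.arange(16, dtype=float).reshape(4, 4)   # toy 4x4 "image"
aug = augment(img)
```

Each variant keeps the original image size, so the augmented copies can be fed to the model alongside the originals.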
Fig. (4). Showing data augmentation on one image.
Contouring

Automated image contouring offers efficiency enhancements for several radiotherapy planning tasks. While atlas-based segmentation has been fairly beneficial, the next generation of algorithms, based on convolutional neural networks, already points to improvements in accuracy and performance. This chapter offers a thorough analysis of the effects of machine learning on those tasks. Current atlas contouring methods require a significant amount of computational overhead to classify the best-fit cases and to perform non-rigid registration to fit individual labels. It is becoming evident, however, that the flaws of today's automatic contouring architectures will be addressed as technology advances to fully convolutional neural networks. Such neural networks, often referred to as Deep Learning, use a series of pattern recognition phases over enhanced computational resources. Filter convolution is used in the network's early layers to find core image features such as edges and complicated intensity patterns. Although CNNs have a high computational overhead for model training (filters and layer weights are sometimes optimized over several days), inference with a trained network on each patient is relatively efficient, and it is straightforward to create different models within one family of CNN approaches. Because trained models run quickly, it is also reasonably effective to employ distinct CNNs for each volume of interest; this means that larger or higher-contrast regions do not dominate the data used to delineate subtle structures. The capacity to create detailed organ contours on the fly, along with the automation of recurring clinical labor, allows dimensions to be captured that would ordinarily be ignored. Here, brain images are contoured by cropping the images to the largest contour identified, as shown in Fig. (5), to lower the computational cost of training.
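Cropping to the largest contour is typically done with OpenCV's `findContours`; the NumPy sketch below is a simplified stand-in that crops an MRI slice to the bounding box of its above-threshold pixels, discarding the black border around the brain.

```python
import numpy as np

def crop_to_brain(image, threshold=10):
    """Crop a 2-D slice to the bounding box of above-threshold pixels,
    removing the black background that surrounds the brain region."""
    mask = image > threshold
    rows = np.any(mask, axis=1)      # which rows contain bright pixels
    cols = np.any(mask, axis=0)      # which columns contain bright pixels
    if not rows.any():               # blank image: nothing to crop
        return image
    r0, r1 = np.where(rows)[0][[0, -1]]
    c0, c1 = np.where(cols)[0][[0, -1]]
    return image[r0:r1 + 1, c0:c1 + 1]

# toy slice: a bright 2x3 "brain" on a black background
img = np.zeros((6, 6))
img[2:4, 1:4] = 100
cropped = crop_to_brain(img)         # shrinks from 6x6 to 2x3
```

The smaller cropped images then reduce the computational cost of training, as described above.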
Fig. (5). Contouring of brain images.
Transfer Learning

In contrast to classic supervised learning methods, transfer learning transfers knowledge learned previously by one model to another model, which reuses it and thus reduces the time required for training. Transfer learning saves training time and yields better-performing neural networks while requiring far less data. The last layer of the pre-trained model is substituted with a new layer for the task at hand. Transfer learning has vast applications in text classification, reinforcement learning and artificial intelligence. It permits us to address such scenarios by leveraging the already existing labeled data of related task domains. In practice, we seek to transfer as much expertise as possible from the source setting to our domain: we attempt to store the expertise gained in solving the source task in the source domain and use it in our problem of interest, as shown in Fig. (6).
Fig. (6). Traditional ML vs. Transfer Learning.
This expertise can take numerous forms depending on the data: it can pertain to how objects are composed, permitting us to discover novel objects more effortlessly, or it can concern the general words humans use to express their opinions, and so on. We will now discuss the models that use transfer learning for detecting brain tumors, which is the main objective.

MODELS USED IN THE COMPARISON STUDY

Convolutional Neural Network

Convolutional neural networks are commonly used in image categorization and recognition, with categorization being the primary goal. As a vital element of deep neural networks, they can recognize and classify certain aspects of photographs. A convolutional neural network is made up of a convolutional front end with many convolutional layers, pooling layers, and a fully connected back end.
The CNN architecture, shown in Fig. (7), consists foremost of feature extraction, the identification and separation of the various features of the dataset used for further analysis, and classification, which predicts the classes of the images acquired from the previous convolutions.
Fig. (7). Schematic diagram of CNN structure.
We process the CNN on 28x28-pixel images with a stride of 2 in each convolution, followed by 4 dense layers. Functions like dropout and flatten are used to reduce overfitting. The basic functionality of the CNN above is segregated into four key areas:

Input Layer

It holds the pixel values of the images. Unlike in traditional networks, the data of multi-channel images can be fed directly without much pre-processing. As discussed earlier, images with the appropriate resolution are given as input.

Convolution Layer

The most essential layer is the convolution layer, which performs mathematical operations whose parameters are focused on learnable kernels. This layer, whose main objective is to extract features from the images (and which appears to be the source of CNN's name), uses kernels in various layers to maintain spatial relationships. With these convolution operations and spatial dimensionality, the entire depth of the input can be covered. To obtain different feature maps from the images, CNN endorses multi-kernel convolution to extract rich image features. Fig. (8) shows the convolution using a 3x3 kernel matrix, where the dot product of the input data and the convolution matrix is taken. This procedure is applied across the whole input to obtain the final output layer.
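The sliding dot product of Fig. (8) can be written directly in NumPy. This sketch performs a "valid" convolution with the stride-2 setting and 28x28 input size described above.

```python
import numpy as np

def conv2d(image, kernel, stride=1):
    """Valid convolution: slide the kernel over the image and take the
    element-wise dot product at each position, as in Fig. (8)."""
    kh, kw = kernel.shape
    ih, iw = image.shape
    oh = (ih - kh) // stride + 1     # output height
    ow = (iw - kw) // stride + 1     # output width
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            patch = image[i * stride:i * stride + kh,
                          j * stride:j * stride + kw]
            out[i, j] = np.sum(patch * kernel)
    return out

image = np.ones((28, 28))            # toy 28x28 input
kernel = np.ones((3, 3)) / 9.0       # simple 3x3 averaging kernel
feat = conv2d(image, kernel, stride=2)   # stride 2, as used in the chapter
```

With a 3x3 kernel and stride 2, the 28x28 input shrinks to a 13x13 feature map, illustrating how convolution covers the input's spatial extent.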
Fig. (8). Schematic of the convolution operation.
CNN introduces local perception and parameter sharing to reduce the network's complexity. Parameter sharing assumes that if a feature is beneficial to compute in one spatial region, it will likely be valuable in another.

Activation Layer

It is attached to the convolution layer. Having both linear and non-linear functions helps in learning the complexity of the data. Commonly used functions are ReLU, Sigmoid, and tanh, shown in Fig. (9); the Softmax function is additionally used.
Fig. (9). Activation functions.
Deep learning architectures use the sigmoid function to estimate probabilities and solve binary classification tasks:

Sigmoid(x) = 1 / (1 + exp(−x))  (1)
The hyperbolic tangent activation function is very useful in DL models and gives better results in multi-layer neural networks. By virtue of its shape, it attains a gradient of 1 at input zero:

tanh(x) = (1 − exp(−2x)) / (1 + exp(−2x))  (2)
The rectified linear unit (ReLU) activation function offers faster computation and avoids the dead-neuron issue that affects sigmoid and tanh (a dead neuron is a condition where zero gradients occur because an activation weight is rarely used). It also mitigates the vanishing gradient problem (the gradient of the loss function tending to zero because of certain activation functions, which makes training hard). So, most of the time, ReLU is used:

ReLU(x) = max(0, x)  (3)
The Softmax function computes a probability distribution from a vector of real values. It returns a probability for each class in multi-class models, with the target class having the highest likelihood:

f(x_i) = exp(x_i) / Σ_j exp(x_j)  (4)
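Equations (1)–(4) translate directly into NumPy; the only addition below is the standard max-shift inside softmax for numerical stability.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))                    # Eq. (1)

def tanh(x):
    return (1 - np.exp(-2 * x)) / (1 + np.exp(-2 * x))  # Eq. (2)

def relu(x):
    return np.maximum(0.0, x)                           # Eq. (3)

def softmax(x):
    e = np.exp(x - np.max(x))   # shift by the max for numerical stability
    return e / e.sum()                                  # Eq. (4)
```

For example, `softmax(np.array([1.0, 2.0, 3.0]))` sums to 1 and assigns the highest probability to the last entry, matching the "target class has the highest likelihood" behavior described above.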
Pooling Layer

The pooling layer helps in reducing the computational cost by removing redundant information, thereby reducing overfitting, and it shrinks the dimension of the feature map, reducing the amount of calculation. Overfitting is a condition where the model does not generalize to the unseen dataset: the error on the testing or validation set is much greater than the error on the training set. In the pooling layer, down-sampling is done with the help of strides. The model uses max pooling, defined as the pooling operation that takes the maximum of a particular region in the feature map, where the region's area depends on the selected stride length. Thus, the output after max pooling retains the most prominent features of the preceding feature maps.
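The stride-based down-sampling just described can be sketched as follows, taking the maximum over each window so the most prominent features survive.

```python
import numpy as np

def max_pool(feature_map, size=2, stride=2):
    """Down-sample by taking the maximum of each size x size window,
    keeping the most prominent feature in each region."""
    h, w = feature_map.shape
    oh = (h - size) // stride + 1
    ow = (w - size) // stride + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            window = feature_map[i * stride:i * stride + size,
                                 j * stride:j * stride + size]
            out[i, j] = window.max()
    return out

fmap = np.array([[1., 3., 2., 4.],
                 [5., 6., 7., 8.],
                 [3., 2., 1., 0.],
                 [1., 2., 3., 4.]])
pooled = max_pool(fmap)   # 4x4 feature map -> 2x2
```

Here each 2x2 window collapses to its maximum, so the 4x4 input becomes the 2x2 map [[6, 8], [3, 4]], halving each spatial dimension.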
Fully Connected Layer

The Fully Connected (FC) layer holds the weights learned from the images and passes this information to further neural layers. The output of the previous layers is flattened and fed to the FC layer, where the actual classification process begins.

Output

This layer shows the presence or absence of a tumor after the Softmax operation.

VGG16

VGG16, developed by the Visual Geometry Group, is a Convolutional Neural Network (CNN) architecture. As Fig. (10) shows, it has 16 layers that carry weights, which is why it is referred to as VGG16: 13 convolutional layers and 5 max-pooling layers followed by 3 fully connected layers. It has the following layers:
1. Input Layer: This layer takes only colored images as input, with a size of 224x224 and red, green, and blue channels.
2. Convolution Layer: The input image passes through a stack of convolution layers whose filters have a very small receptive field of 3x3 and a stride of 1.
3. The ReLU activation function is present in the hidden layers, and Softmax is applied at the output.
Recent experiments on deep architectures have used models that can be trained as deep supervised neural networks, but with initialization or training methods different from those used in traditional feedforward networks [27]. Backpropagation determines the gradient of the loss function with respect to all of the network's weights, so in VGG the loss at each node can be minimized. Although it offers excellent performance, the large number of parameters makes it difficult to operate [28]. This model requires the images to be resized to 224x224. Running the data for 50 epochs yields a higher-precision model with a good f1 score, a validation accuracy of 0.96 and a test accuracy of 0.90. A final average accuracy of 0.96 is obtained from VGG16.
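A minimal Keras sketch of this transfer-learning setup follows: VGG16 as a frozen convolutional base with its top replaced by a new classification head. The 256-unit dense layer with dropout is an illustrative assumption, not the chapter's published configuration, and `weights=None` is used only so the sketch runs without downloading the pre-trained filters (in practice one would pass `weights="imagenet"`).

```python
from tensorflow.keras.applications import VGG16
from tensorflow.keras import layers, models

# Convolutional base; in practice weights="imagenet" loads pre-trained filters
base = VGG16(weights=None, include_top=False, input_shape=(224, 224, 3))
base.trainable = False               # freeze the pre-trained feature extractor

model = models.Sequential([
    base,
    layers.Flatten(),                # flatten feature maps for the FC head
    layers.Dense(256, activation="relu"),
    layers.Dropout(0.5),             # reduce overfitting
    layers.Dense(2, activation="softmax"),  # tumor / no tumor
])
model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])
```

Only the new head is trained, which is what lets transfer learning reach high accuracy with relatively little data and few epochs.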
Fig. (10). VGG16 Architecture.
ResNet50

ResNet is a symbolic deep network architecture and a key structure seen in many computer vision applications. It regulates the gradient problems that inevitably occur in deep neural networks, such as the vanishing gradient problem: as the gradient is backpropagated to earlier layers, repeated multiplication makes it small, so the network's output stagnates or even degrades rapidly as the network goes deeper. ResNet established two simultaneous routes, identity mapping and residual mapping, to address the problem that "as the network deepens, the accuracy diminishes". If the network reaches its optimum condition and continues to deepen, the residual mapping is driven to zero and only the identity mapping remains, ensuring that neither gradient recession nor performance reduction occurs [29]. ResNet's salient feature is the skip connection, an example of which is shown in Fig. (11). The image on the left shows convolution layers stacked one after the other. On the right, convolution layers are stacked as before, but the initial input is also added to the output of the convolution block; this whole arrangement is the skip connection. Two types of shortcut modules are present in a ResNet implementation. The first is the identity block, where no convolutional layer is present in the skip connection; here the input has dimensions identical to the output. The other is the convolution block, where a convolution layer exists in the shortcut; here the input dimensions are smaller than the output dimensions. In both blocks, the start and end of the network have a convolution layer with a 1x1 kernel, which represents the bottleneck design. This bottleneck design decreases the number of parameters while the network output does not worsen too much. ResNet50 outperforms by a remarkable margin in the case of a deeper network, so Residual Networks are used in object detection and recognition [30].
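The identity-type skip connection can be sketched in a few lines of NumPy. The two-layer mapping F below is an illustrative stand-in for the convolutional residual branch; the key point is the `+ x` term, which means that when F is driven to zero only the identity mapping remains, exactly as described above.

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def residual_block(x, w1, w2):
    """Identity-type residual block: output = ReLU(F(x) + x), where F is a
    small two-layer transformation and '+ x' is the skip connection."""
    fx = relu(x @ w1) @ w2          # residual mapping F(x)
    return relu(fx + x)             # skip connection adds the identity back

x = np.array([1.0, 2.0, 3.0])
w_zero = np.zeros((3, 3))
# if the residual mapping collapses to zero, the block passes x through intact
out = residual_block(x, w_zero, w_zero)
```

Because the identity path always survives, gradients can flow straight through the block, which is what prevents the vanishing-gradient recession in very deep networks.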
Fig. (11). Skip connection.
Training for 50 epochs with 7 steps each increases the efficiency of ResNet50, giving the accuracies shown in Fig. (12). A final accuracy of 0.96 is obtained with ResNet50.
Fig. (12). ResNet50.
EVALUATION PARAMETERS

Two major evaluation procedures are considered in this chapter: first, the confusion matrix, which describes the model's classification accuracy, and second, the formula-based calculation of accuracy, precision, f1 score, recall, specificity, and sensitivity [31]. The former is advisable as the main evaluation standard when the dataset is irregularly distributed across training, testing, and validation, because a correct and unique accuracy is obtained for every trial [32]. Sometimes, measurement errors in identification or detection during classification may lead to improper results [33]. Accuracy indicates the proportion of all cases detected precisely. It is most widely used where all classes are equally relevant, i.e., where correct predictions, True Positives (TP) and True Negatives (TN), matter and false predictions, False Positives (FP) and False Negatives (FN), matter just as much. The f1-score, in contrast, characterizes the model when there is a disparity in the distribution of images across classes. If the class distribution is balanced, accuracy can be used, although the F1-score is the superior metric in our case with unbalanced classes.

Accuracy = (TP + TN) / (TP + FP + TN + FN)  (5)

Precision = TP / (TP + FP)  (6)

Recall = TP / (TP + FN)  (7)

F1 score = 2 × (Precision × Recall) / (Precision + Recall)  (8)

Sensitivity = TP / (TP + FN)  (9)

Specificity = TN / (TN + FP)  (10)
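Equations (5)–(10) can be computed directly from confusion-matrix counts; the example counts below are the validation-set values reported for VGG16/ResNet50 in Table 1.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute Eqs. (5)-(10) from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)      # Eq. (5)
    precision = tp / (tp + fp)                      # Eq. (6)
    recall = tp / (tp + fn)                         # Eq. (7)
    f1 = 2 * precision * recall / (precision + recall)  # Eq. (8)
    sensitivity = tp / (tp + fn)                    # Eq. (9), same as recall
    specificity = tn / (tn + fp)                    # Eq. (10)
    return accuracy, precision, recall, f1, sensitivity, specificity

# validation-set counts for VGG16/ResNet50 (Table 1): TP=18, FP=1, FN=1, TN=30
acc, prec, rec, f1, sens, spec = classification_metrics(18, 1, 1, 30)
```

These counts give an accuracy of (18 + 30) / 50 = 0.96, matching the validation accuracy reported later in the chapter.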
In predictive analytics, accuracy optimization is primarily used in classification and decision frameworks [34]. If the data is unbalanced, or if false positive and false negative decisions carry different costs, accuracy alone has only limited informative value [35]. As a result, more sophisticated quality measurements, such as precision and recall, are preferred in medicine [36].
RESULTS AND DISCUSSION

From the models developed, we obtained an accuracy of 84% for the basic CNN model and a validation accuracy of 96% for both the VGG16 and ResNet50 models. For the test dataset, which contains 10 images taken randomly from the dataset, we obtained an accuracy of 86% for CNN and 100% for the VGG16 and ResNet50 architectures.

Convolutional Neural Network

The peak validation loss for CNN is around 2.5, and the overall accuracy reaches 86%. For the test dataset, all the metrics necessary for comparing the classification done by the CNN model are demonstrated in Figs. (13) and (14).
Fig. (13). Classification report of Basic CNN figure.
Fig. (14). CNN network convergence.
VGG16 and ResNet50

Here the peak validation loss is around 1.4 for VGG16 and around 1.3 for ResNet50, much better than the basic CNN model, with an average accuracy of 96% for both, as shown in Figs. (15) and (16). As discussed earlier, the f1-score is a better metric for comparing the models' performance because our dataset is unequally distributed [37]. From the operations mentioned above, we calculated precision, recall, and f1-score from the confusion matrix obtained from each model [38]. This confusion matrix contains 4 cells in 2 rows and 2 columns, where the rows represent the true class of the images and the columns represent the class predicted by our model. From these true and false predictions, all necessary metrics were calculated and a classification report was made. We obtained similar results for the VGG16 and ResNet50 models on both the validation and test datasets; hence the classification-report metrics are the same for the two models. This is represented in Table 1.
Fig. (15). VGG16 network convergence.
Fig. (16). ResNet50 network convergence.

Table 1. Confusion matrix values for VGG16 and ResNet50.

VGG16 & ResNet50      TP   FP   FN   TN   Accuracy
Validation dataset    18    1    1   30   0.96
Test dataset           5    0    0   10   1.00
GUI

For the prediction of brain tumors, a Windows tool has been developed. Using contouring, this application assists in identifying the specific tumor region in Magnetic Resonance scans. The existence of a tumor is critical in determining its malignancy, and because there are many different types of tumors, such as glioblastoma, meningioma, and pituitary tumors, it is crucial to identify the exact location of the tumor. The program has a query button that takes the user to the Brain Tumor Q&A website of the National Cancer Institute's Center for Cancer Research, which offers a wealth of information about early detection and treatment approaches for cancerous brain tumors. The instructions for using the software for predictive analytics and viewing tumors are shown in Fig. (17). To begin, the user must choose a brain MRI picture to test from local storage, as well as a model for prediction. For users inexperienced with the models' technicalities, the tool uses ResNet50 as the default model, because VGG16 and ResNet50 fared equally in the statistics. After selecting a picture, the user can choose between two operations: tumor detection or ground-truth visualization. The query button opens the webpage shown in Fig. (18).
Fig. (17). Application overview.
Fig. (18). NCI website.
To analyze tumors, the user selects the Detect Tumor operation; to examine the tumor region, the View Tumor operation. Uploaded pictures must be in .jpg, .jpeg, or .png format. If a tumor has been diagnosed from the picture, the user may visualize it. Clicking the View button in the View Tumor operation makes the application provide first- and second-level extraction: the first level is the result of the mask operation, and the second is the segmented tumor picture. If the user attempts to view the picture for No Tumor, only the first mask operation is shown, since the process does not proceed to the second extraction in the absence of a tumor, as shown in Fig. (19). The presence of a tumor enables the View button, as shown in Fig. (20).
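The format check described above (only .jpg, .jpeg, or .png uploads are accepted) can be sketched as a small helper; the function name is hypothetical, not taken from the tool's source.

```python
def is_supported_image(filename):
    """Return True if the upload is in a format the tool accepts
    (.jpg, .jpeg, or .png, case-insensitive)."""
    return filename.lower().endswith((".jpg", ".jpeg", ".png"))
```

Rejecting unsupported files before prediction avoids handing the model an image it cannot decode.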
Fig. (19). Tumor Absent.
Fig. (20). Tumor Present.
The ground truth tumor visualization is shown in Fig. (21). To distinguish the tumor from the healthy brain, the tumor is highlighted in green. The purple region surrounding the tumor signifies the tumor's probability of spreading. The close button will close all of the windows that have been opened.
Fig. (21). Tumor visualization.
CONCLUSION AND FUTURE WORK

In this chapter, we implemented a technique to detect the presence or absence of tumors in MRI images. It uses an artificial neural network model that is first trained on the datasets and then used to predict the presence or absence of the disease. The model performs well in producing accurate tumor detections, with an accuracy of 96%. To affirm this performance, the model was trained multiple times: first as a basic Convolutional Neural Network, and then using the VGG16 and ResNet50 architectures. The transfer learning method was chosen over traditional ML techniques so that the model could be trained repeatedly to obtain better precision and accuracy, and the f1 scores were calculated on this basis. To eliminate complexities in the CNN, images of various sizes were resized to 28x28 pixels before partitioning the data into training and testing sets. After testing random subsets, weights were calculated and the validation loss and accuracy were obtained. Furthermore, when the model was trained using VGG16, it was observed that the increased depth of the network enables it to learn more complex features, and VGG16 gave a better accuracy than the basic CNN. It was also observed from Fig. (22) that the model loss decreased rapidly, which helped in detecting the tumor better.
Fig. (22). Evaluation standards for 3 models.
A confusion matrix was also obtained, showing the number of tumors present in the datasets. Next, the datasets were trained using another architecture, ResNet50, which gave different results compared to VGG16; it was observed that ResNet50 overcomes the gradient issues that occur in deep neural networks.
Replacing the VGG-16 layers in Faster R-CNN with ResNet-101 has been reported to yield a 28 percent relative performance improvement, and the ResNet model outperforms by an exceptional margin in the case of deeper networks. From the comparison of the models developed, it was observed that the f1-scores for VGG16 and ResNet50 are better than for the basic CNN. After comparing the 3 models, parameters such as sensitivity, specificity, and f1-score were calculated as evaluation measures. We observed up to 86% accuracy for the CNN model, whereas VGG16 and ResNet50 had an accuracy of 100% on our test dataset and 96% on our validation dataset. These confusion-matrix results were obtained by training and testing the two models for multiple epochs and represent an average over our observations. So, both VGG16 and ResNet50 perform equally well for the considered scenario. A user-friendly graphical user interface for the detection of brain tumors extends the implemented idea into the social-utility domain, helping to automate detection. The GUI not only serves as a base for tumor detection but also provides information by linking to the NCI portal, and it serves as a medium for future advancements in cloud and web applications. The performance of the model will change with the dataset, as it depends on the grade of the images used for training. This research can be extended by classifying into multinomial classes, and survival-rate predictions can also be made by training the model with patients' survival records.

NOTES

*An overfitting model captures noise rather than the underlying data.
*An underfitting model cannot capture the trend underlying the given data.

REFERENCES

[1]
N. Noreen, S. Palaniappan, A. Qayyum, I. Ahmad, M. Imran, and M. Shoaib, "A Deep Learning Model Based on Concatenation Approach for the Diagnosis of Brain Tumor", IEEE Access, vol. 8, pp. 55135-55144, 2020. [http://dx.doi.org/10.1109/ACCESS.2020.2978629]
[2]
J. Dolz, A. Laprie, S. Ken, H.A. Leroy, N. Reyns, L. Massoptier, and M. Vermandel, "Supervised machine learning-based classification scheme to segment the brainstem on MRI in multicenter brain tumor treatment context", Int. J. CARS, vol. 11, no. 1, pp. 43-51, 2016. [http://dx.doi.org/10.1007/s11548-015-1266-2] [PMID: 26206715]
[3]
M. Ali, S.O. Gilani, A. Waris, K. Zafar, and M. Jamil, "Brain Tumour Image Segmentation Using Deep Networks", IEEE Access, vol. 8, pp. 153589-153598, 2020. [http://dx.doi.org/10.1109/ACCESS.2020.3018160]
[4]
X. Tian, and C. Chen, "Modulation Pattern Recognition Based on Resnet50 Neural Network", 2019 IEEE 2nd International Conference on Information Communication and Signal Processing (ICICSP), pp. 34-38, 2019. [http://dx.doi.org/10.1109/ICICSP48821.2019.8958555]
[5]
M. Li, L. Kuang, S. Xu, and Z. Sha, "Brain tumor detection based on multimodal information fusion and convolutional neural network", IEEE Access, vol. 7, pp. 180134-180146, 2019. [http://dx.doi.org/10.1109/ACCESS.2019.2958370]
[6]
W. Wang, F. Bu, Z. Lin, and S. Zhai, "Learning Methods of Convolutional Neural Network Combined With Image Feature Extraction in Brain Tumor Detection", IEEE Access, vol. 8, pp. 152659-152668, 2020. [http://dx.doi.org/10.1109/ACCESS.2020.3016282]
[7]
M. Havaei, P. Jodoin, and H. Larochelle, "Efficient interactive brain tumor segmentation as withinbrain kNN classification", 22nd International Conference on Pattern Recognition, pp. 556-561, 2014. [http://dx.doi.org/10.1109/ICPR.2014.106]
[8]
K.H. Oh, S.H. Kim, and M. Lee, "Tumor detection on brain MR images using regional features: Method and preliminary results", 21st Korea-Japan Joint Workshop on Frontiers of Computer Vision (FCV), pp. 1-4, 2015. [http://dx.doi.org/10.1109/FCV.2015.7103705]
[9]
M. Arbane, R. Benlamri, Y. Brik, and M. Djerioui, "Transfer learning for automatic brain tumor classification using MRI images", 2nd International Workshop on Human-Centric Smart Environments for Health and Well-being (IHSH), pp. 210-214, 2021. [http://dx.doi.org/10.1109/IHSH51661.2021.9378739]
[10]
Y. Ding, C. Li, Q. Yang, Z. Qin, and Z. Qin, "How to improve the deep residual network to segment multi-modal brain tumor images", IEEE Access, vol. 7, pp. 152821-152831, 2019. [http://dx.doi.org/10.1109/ACCESS.2019.2948120]
[11]
Z. Qu, S.Y. Wang, L. Liu, and D.Y. Zhou, "Visual cross-image fusion using deep neural networks for image edge detection", IEEE Access, vol. 7, pp. 57604-57615, 2019. [http://dx.doi.org/10.1109/ACCESS.2019.2914151]
[12]
Z. Jia, and D. Chen, "Brain tumor identification and classification of MRI images using deep learning techniques", IEEE Access, 2020. [http://dx.doi.org/10.1109/ACCESS.2020.3016319]
[13]
R.C.G. Chehata, and W.B. Mikhael, "Data augmentation for face recognition system implemented in multiple transform domains", IEEE 62nd International Midwest Symposium on Circuits and Systems (MWSCAS), pp. 203-206, 2019. [http://dx.doi.org/10.1109/MWSCAS.2019.8885339]
[14]
J. Sachdeva, V. Kumar, I. Gupta, N. Khandelwal, and C.K. Ahuja, "Multiclass Brain Tumor Classification Using GA-SVM", Developments in E-systems Engineering, vol. 2011, pp. 182-187, 2011. [http://dx.doi.org/10.1109/DeSE.2011.31]
[15]
I. Ramírez, A. Martín, and E. Schiavi, "Optimization of a variational model using deep learning: An application to brain tumor segmentation," IEEE 15th International Symposium on Biomedical Imaging (ISBI 2018), pp. 631-634, 2018. [http://dx.doi.org/10.1109/ISBI.2018.8363654]
[16]
S. Waheed, R.A. Moffitt, Q. Chaudry, A.N. Young, and M.D. Wang, "Computer aided histopathological classification of cancer subtypes", IEEE 7th International Symposium on
BioInformatics and BioEngineering, pp. 503-508, 2007. [http://dx.doi.org/10.1109/BIBE.2007.4375608] [17]
P. Ghosal, L. Nandanwar, S. Kanchan, A. Bhadra, J. Chakraborty, and D. Nandi, "Brain Tumor Classification Using ResNet-101 Based Squeeze and Excitation Deep Neural Network", 2019 Second International Conference on Advanced Computational and Communication Paradigms (ICACCP) pp. 1-6, 2019. [http://dx.doi.org/10.1109/ICACCP.2019.8882973]
[18]
S. Somasundaram, and R. Gobinath, "Current Trends on Deep Learning Models for Brain Tumor Segmentation and Detection – A Review", 2019 International Conference on Machine Learning, Big Data, Cloud and Parallel Computing (COMITCon) pp. 217-221, 2019. [http://dx.doi.org/10.1109/COMITCon.2019.8862209]
[19]
IoT and Big Data Analytics, 2023, Vol. 2, 90-111
CHAPTER 5
A Robust Model for Optimum Medical Image Contrast Enhancement and Tumor Screening

Monika Agarwal(1), Geeta Rani(2,*), Vijaypal Singh Dhaka(2) and Nitesh Pradhan(3)

(1) Dayanand Sagar University, Bangalore, India
(2) Computer and Communication Engineering, Manipal University, Jaipur, India
(3) Computer Science Engineering, Manipal University, Jaipur, India
Abstract: The use of medical imaging techniques has improved the accuracy of disease screening and diagnosis. However, the quality of these images is greatly affected by real-time factors such as the type of machinery used, the position of the patient, and the intensity of light. Poorly maintained machines, incorrect positioning of patients, and inadequate lighting lead to low-contrast, poor-quality medical images that hinder their examination. Thus, there is a need to enhance the quality of medical images. Researchers have applied histogram equalization for contrast enhancement; it improves the visual appearance of medical images but suffers from over-enhancement, noise, and undesirable artifacts, and such techniques report low accuracy in tumor detection. Therefore, we propose an efficient model for medical image contrast enhancement and correct tumor prediction. The model performs segmentation, weighted distribution, gamma correction, and filtering to improve the visual appearance of MRI images. Further, it employs optimum feature extraction for the correct detection of regions infected with tumors. Findings obtained in a simulated environment demonstrate that our proposed model outperforms current models.
Keywords: Automatic, Adaptive gamma correction, Brightness preservation, Brain tumor detection, Contrast enhancement, Convolutional neural network, Deep learning, Entropy, Gray level co-occurrence matrix, Histogram equalization, Homomorphic filtering, Image classification, Model, Magnetic resonance imaging, Machine learning, Medical imaging, Optimum, Peak signal to noise ratio, Threshold, Tumor, Weighted distribution.

* Corresponding author Geeta Rani: Computer and Communication Engineering, Manipal University, Jaipur, India; E-mail: [email protected]
Parma Nand, Vishal Jain, Dac-Nhuong Le, Jyotir Moy Chatterjee, Ramani Kannan, and Abhishek S. Verma (Eds.) All rights reserved-© 2023 Bentham Science Publishers
INTRODUCTION

The brain is the centralized information center of the human body, controlling functions such as muscle coordination, breathing, metabolism, and the sense organs. A brain tumor is an unregulated, unorganized, and undifferentiated mass of cells formed in the brain. It is a potentially fatal illness that adversely affects the working efficiency of a person. Among all ailments, brain tumors are responsible for 85 to 90 percent of damage to the central nervous system [1]. Brain and nervous system diseases are the tenth leading cause of death for both men and women across the globe [2, 3]. Brain tumors were estimated to cause the death of approximately 18,020 people, including 10,190 men and 7,830 women, in the year 2020 [4].

The survival rate of patients depends on their age group and decreases with increasing age. The 5-year life expectancy for persons younger than 15 years of age is nearly 74%; it decreases to 71% for people aged 15 to 39 years and to 21% for people above 40 years. As per the reports circulated by the World Health Organization (WHO) [2], there has been significant growth in the number of brain tumor cases worldwide. In the United States in 2020, 23,890 people, including 13,590 men and 10,300 women, were diagnosed with primary cancerous spinal cord and brain tumors, and 3,540 children under the age of 15 were expected to be diagnosed with a brain tumor in 2021 [5].

The alarming rise in the number of patients and deaths due to brain tumors across the world raises the demand for systems for early detection and for determining the severity of brain tumors. The symptoms of a brain tumor vary with the affected part of the brain and include seizures, headache with vomiting, difficulty in speaking and walking, and vision and mental disorders.
Brain tumors are categorized into benign and malignant types. A benign tumor comprises a uniformly distributed mass of non-cancerous cells, whereas a malignant tumor consists of a non-uniform mass of cancerous cells. The American Brain Tumor Association (ABTA) and the World Health Organization (WHO) grade brain tumors on a scale from I to IV to categorize them as benign or malignant [1]. Tumors of grades I and II are categorized as benign, while tumors of grades III and IV are classified as malignant. A benign tumor may turn malignant if it is not identified and treated at the primary phase [1]. As a result, detecting a brain tumor at the earliest stage is important. Many pioneering research works are available for the early assessment of the severity of brain tumors from medical images [1 - 5]. However, complex backgrounds, the presence of noise in medical modalities, and poor image quality are the
identified obstacles in the correct and early detection of tumors from medical images. Image processing techniques must therefore be used to strengthen the visual appearance of medical images. These techniques ease the tasks of Machine Learning (ML) and computer vision systems developed for the early screening of tumors, and they improve the visual interpretation, feature extraction, and image analysis efficacy of ML models.

The literature [6 - 20] reveals that histogram equalization (HE) is the most widely used technique for contrast enhancement. HE normalizes the intensity distribution of the input image's gray levels using the cumulative distribution function. It is easy to implement, evenly heightens the contrast of an image, and allows the original histogram to be recovered from the equalized image. However, HE shifts the mean brightness of an image toward the center of the dynamic range. Thus, it creates the problems of over-enhancement, amplification of background noise, and intensity saturation artifacts. Further, it favors high-frequency histogram bins and suppresses low-frequency bins, which produces washed-out effects. These limitations are a barrier to the use of HE in medical imaging applications. Moreover, few research works propose an integrated system that both refines the quality of medical images and detects brain tumors from modalities such as MRI. To address the above-stated limitations, in this chapter we propose the Robust Otsu's Double Threshold Weighted Constrained Histogram Equalization (ODTWCHE) technique with optimized feature extraction. The overall workflow of the technique is shown in Fig. (1).
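For context on the limitations discussed above, the baseline HE transform can be sketched in a few lines. This is an illustrative Python/NumPy sketch of classical CDF-based HE, not the chapter's MATLAB implementation; the synthetic low-contrast image is our own example.

```python
import numpy as np

def histogram_equalization(img, levels=256):
    """Classical HE: map each gray level through the normalized CDF."""
    hist = np.bincount(img.ravel(), minlength=levels)
    cdf = np.cumsum(hist) / img.size                  # cumulative distribution
    mapping = np.round((levels - 1) * cdf).astype(np.uint8)
    return mapping[img]

# A synthetic low-contrast image confined to gray levels 100..130.
rng = np.random.default_rng(0)
low = rng.integers(100, 131, size=(64, 64)).astype(np.uint8)
eq = histogram_equalization(low)
print(low.min(), low.max(), "->", eq.min(), eq.max())
```

Running this spreads the narrow 100..130 range across nearly the full dynamic range, which illustrates both the contrast gain and the mean-brightness shift that motivates the brightness-preserving variants reviewed below.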
Fig. (1). Activity flow of the proposed model.
The proposed ODTWCHE technique [20] first reads a low-contrast input MRI image and generates its histogram. Based on the generated histogram, the technique calculates the amount of enhancement required. The histogram is then divided into three sections using Otsu's double threshold process. Next, a weighted normalized constrained model is applied to each segmented sub-histogram to modify its probabilities, and every sub-histogram is equalized individually. Lastly, an adaptive gamma correction process further improves the overall contrast, and Wiener filtering removes noise. After preprocessing, MATLAB's built-in function regionprops() is applied to extract statistical features such as area, bounding box, centroid, major axis length, eccentricity, minor axis length, filled area, orientation, convex area, equivalent diameter, extent, solidity, and perimeter from the preprocessed image. Based on these statistical features, ODTWCHE [20] efficiently detects all tumorous cells present in the preprocessed MRI image. These detections are important for diagnosis, recognition of the severity of the tumor, and assessment of the response of the patient's body to therapy.

The remainder of the chapter is organized as follows: the Literature Review section reviews the related literature, the Proposed Model section presents the details of the model proposed for tumor detection, the Results and Discussion section discusses the results, and the last section summarizes the work.

LITERATURE REVIEW

In this section, we present an analysis of the state-of-the-art works [6 - 31] available for contrast enhancement and early detection of tumors and cancer.
These research works focus on the approaches of histogram equalization [6 - 20], weighted distribution [10 - 14], adaptive gamma correction [11 - 17], homomorphic filtering [17 - 20], image segmentation [6 - 20], feature extraction [21 - 26], and tumor detection [27 - 32].

Image segmentation approaches include Brightness Preserving Bi-Histogram Equalization (BBHE) [6] and Dualistic Sub-Image Histogram Equalization (DSIHE) [7]. These methods use the mean or median intensity as the threshold to divide the histogram of the input image into two sub-histograms, and each sub-histogram is then equalized separately. These techniques yield strong contrast improvement but also introduce intensity saturation artifacts and a mean brightness shift. To resolve these problems, researchers proposed recursive image segmentation approaches [8, 9] based on the local mean value of the histogram of an input image. In these approaches, each sub-histogram is split into two at every phase, producing 2^r sub-histograms after the rth step, where r is a natural number chosen according to the preference of the user. The researchers claimed that the output image intensity approaches the input image intensity as r increases [8]. Compared to the previous works [6, 7], this approach maintains more brightness; however, it still faces the issue of over-enhancement in low-contrast regions, and determining the best value of r is difficult. To address these problems, researchers applied the median value for the recursive thresholding of the histogram of an input image [9]. This median-based work preserves brightness and data content better, but it does not conserve the natural representation of the image due to unwanted annoying artifacts. To conserve a more natural appearance and the input image intensity, the Quad Weighted Histogram Equalization with Adaptive Gamma Correction (QWAGC-FIL) approach [10] separates the histogram into more than two segments based on its valley positions and then applies histogram equalization, gamma correction, and homomorphic filtering. Although this technique provides better visual quality and preserves more entropy than BBHE [6] and DSIHE [7], it does not resolve the problem of over-enhancement. To further enhance image quality and resolve the over-enhancement problem, the authors of [11] proposed the Adaptive Gamma Correction with Weighting Distribution (AGCWD) technique, which is effective for global contrast improvement of digital images and improves brightness using gamma correction and the probability density function of luminance pixels. In this series, the authors of [12] proposed another method to effectively handle the mean shift problem of the histogram equalization process.
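The bi-histogram idea underlying BBHE [6] and its recursive variants can be sketched as follows. This is an illustrative Python/NumPy sketch under our own naming, splitting at the mean as in BBHE; the synthetic test image is ours.

```python
import numpy as np

def bbhe(img, levels=256):
    """BBHE sketch: split the histogram at the mean gray level and
    equalize each sub-histogram within its own intensity range, so the
    overall mean brightness is roughly preserved."""
    m = int(img.mean())
    out = np.empty_like(img)
    parts = [(0, m, img <= m), (m + 1, levels - 1, img > m)]
    for lo, hi, sel in parts:
        sub = img[sel].astype(int)
        if sub.size == 0:
            continue
        hist = np.bincount(sub - lo, minlength=hi - lo + 1)
        cdf = np.cumsum(hist) / sub.size
        # Map each sub-histogram into its own range [lo, hi].
        out[sel] = (lo + np.round((hi - lo) * cdf[sub - lo])).astype(img.dtype)
    return out

rng = np.random.default_rng(1)
img = rng.integers(60, 200, size=(32, 32)).astype(np.uint8)
res = bbhe(img)
print(img.mean(), res.mean())
```

Because each half is equalized only within its own range, pixels below the mean stay below it and pixels above stay above, which is what keeps the output mean close to the input mean.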
Based on quantile values, this approach first partitions the input image histogram into more than two sub-histograms, and a weighted distribution is then applied to the partitioned sub-histograms. This method takes less time and preserves more brightness. Further, in [13], the authors proposed a method based on a recursive division of the input image histogram accompanied by weighted distribution and an adaptive gamma correction process, with the main purpose of enhancing contrast efficiently. To preserve brightness and enhance contrast more accurately than the above-mentioned contrast enhancement methods, the researchers in [14] first segment the input image histogram recursively, then clip the segmented histogram with six plateau limits, followed by weighted distribution and histogram equalization. However, these techniques fail to identify the region of interest properly due to improper segmentation. Thus, to address the challenge of identifying the Region of Interest in enhanced MRI images, the authors of [15 - 20] employed Otsu's thresholding
method. This method partitions an image into multiple sub-regions based on identified and desirable features, and it underlies techniques such as Range Limited Bi-Histogram Equalization (RLBHE) [15] and Range Limited Double Threshold Multi-Histogram Equalization (RLDTMHE) [16]. RLBHE partitions the histogram of an input image into two independent sub-histograms using Otsu's single thresholding method; the range of the equalized image is then chosen to produce the least Absolute Mean Brightness Error (AMBE) between the processed image and the original image. RLDTMHE, in contrast, partitions the input image histogram into three independent sub-histograms using Otsu's double thresholding method and chooses the range of the equalized image to minimize the AMBE between the processed image and the input image. Both RLWHE [17] and RLDTWHE [18] also use Otsu's thresholding method to divide the histogram of an input image. RLDTWHE is more effective in identifying the Region of Interest (ROI), even in images with complex backgrounds and multiple objects, and therefore outperforms RLWHE. To further improve the accuracy of identifying the desirable ROI, the authors of [19] employed Maximum Entropy-based Weighted Constrained Histogram Equalization with Adaptive Gamma Correction (MEWCHE-AGC). This technique uses maximum entropy as the threshold for dividing the histogram of an input image. It not only identifies the correct ROI but also reports higher entropy and brightness preservation than RLDTWHE, and it is effective in retaining the natural appearance of a digital image. In line with these works, the authors of [20] developed the Optimized Double Threshold Weighted Constrained Histogram Equalization (ODTWCHE) technique for contrast improvement and tumor detection.
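The Otsu double threshold shared by RLDTMHE, RLDTWHE, and ODTWCHE, together with the AMBE criterion, can be sketched as follows. This is a brute-force Python/NumPy sketch of our own (the cited works use more efficient formulations); the three-cluster test image is a synthetic assumption.

```python
import numpy as np
from itertools import combinations

def otsu_double_threshold(img, levels=256):
    """Exhaustive two-threshold Otsu: pick (t1, t2) maximizing the
    between-class variance of the three resulting gray-level classes."""
    hist = np.bincount(img.ravel(), minlength=levels).astype(float)
    p = hist / hist.sum()
    lvl = np.arange(levels)
    best, best_t = -1.0, (1, 2)
    for t1, t2 in combinations(range(1, levels), 2):
        var_b = 0.0
        for sel in (lvl < t1, (lvl >= t1) & (lvl < t2), lvl >= t2):
            w = p[sel].sum()                  # class probability
            if w > 0:
                mu = (lvl[sel] * p[sel]).sum() / w
                var_b += w * mu * mu          # maximizing sum(w*mu^2) maximizes
        if var_b > best:                      # the between-class variance
            best, best_t = var_b, (t1, t2)
    return best_t

def ambe(a, b):
    """Absolute Mean Brightness Error between two images."""
    return abs(float(a.mean()) - float(b.mean()))

# Three well-separated intensity clusters; the two thresholds should
# fall in the gaps between them.
img = np.repeat(np.array([30, 120, 220], dtype=np.uint8), 100)
t1, t2 = otsu_double_threshold(img)
print(t1, t2, ambe(img, img))
```

On the synthetic image, the search places one threshold between the 30 and 120 clusters and the other between 120 and 220, yielding the three classes the histogram is later split into.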
This technique is a sequential integration of the range-limited Otsu's double threshold method, a weighted constrained model, the Particle Swarm Optimization (PSO) technique, an adaptive gamma correction process, and Wiener filtering. It achieves maximum entropy and illumination retention, and it addresses the issue of image over-enhancement while maintaining the natural appearance of images.

A further body of work [21 - 32] focuses on feature extraction and tumor detection with machine learning and deep learning methods. The research work proposed in [21] first processed the image of the tumorous brain with an anisotropic diffusion filter, then employed the active contour model (ACM) for accurate tumor detection, the discrete wavelet transform (DWT) for feature extraction, and Independent Component Analysis (ICA) for the selection of relevant features. Finally, for tumor classification, a trained support vector machine with various
kernels is used. Other researchers proposed a hybrid model that focuses on segmentation and classification for the identification of tumorous cells. The model involves three stages: segmentation based on maximum likelihood estimators, feature extraction based on the Gray Level Co-occurrence Matrix (GLCM), and classification using an Artificial Neural Network (ANN) [22]. The work proposed in [23] integrated the bio-inspired Berkeley Wavelet Transform (BWT) for segmentation with the Support Vector Machine (SVM) for classification to improve treatment efficacy. In this work, the authors extracted the texture features of a brain tumor based on GLCM. Further, the researchers in [24] used an efficient combination of wavelet texture features and a Deep Neural Network (DNN) for tumor segmentation and classification. First, they integrated GLCM and wavelet GLCM to extract wavelet texture features from noise-free images. Then, they selected the useful features using the Oppositional Flower Pollination Algorithm (OFPA) and applied a DNN to classify the tumor. In the last step, the authors used the Probabilistic Fuzzy C-Means (PFCM) clustering method for tumor segmentation.

These approaches do not provide superior performance in extracting features and detecting brain tumors. To achieve superior performance, the researchers in [25] focused on noise reduction strategies, GLCM-based feature extraction, and DWT-based brain tumor detection, accompanied by morphological filtering and Probabilistic Neural Network (PNN) based classification. This blend of techniques achieved the highest precision of 100% in distinguishing healthy from unhealthy brain MRI images. The work in [26] used another efficient approach for improving accuracy. It first finds the region of interest (ROI) of a brain tumor by using morphological operations.

The ROI is used to extract the features of the brain tumor, from which shape features are calculated. Based on these features, both Random Forest and SVM are used for tumor classification; according to the experimental results, the Random Forest gives better accuracy than other existing state-of-the-art methods. The authors of [27] applied pre-processing, segmentation, and classification to input brain tumor MRI images. They employed the median filter, morphological operations, masking, feature extraction, and SVM classification, and reported a classification accuracy of approximately 99% in classifying malignant tumors. Continuing this line of accuracy improvements, researchers used manual skull stripping to accurately retrieve the region of interest and Gaussian filtering to eliminate noise. They then applied tumor segmentation based on an improved thresholding method, followed by extraction of geometric and texture features. Further, they chose the most relevant features by employing a genetic algorithm. In the last step, they
applied SVM with a static kernel function to classify the selected features. This is a unique approach for the precise segmentation and classification of brain tumors based on better saliency segmentation and feature selection. The researchers in [28] proposed a framework based on three architectures, viz. AlexNet, VGGNet, and GoogLeNet, to classify brain tumors into different categories. Each study incorporates transfer learning strategies, i.e., fine-tuning and freezing, using MRI images from various brain tumor datasets. In this framework, a data augmentation strategy is used to increase the number of dataset samples and mitigate the chances of over-fitting. As a result, VGG 16 reached the best accuracy of up to 98.69% in the diagnosis and classification of brain tumors. Further, the researchers in [29] offered a transfer learning-based block-wise fine-tuning strategy. They minimized the need for pre-processing and did not use handcrafted feature extraction; their approach achieved an average accuracy of 94.82% with fivefold cross-validation. The researchers in [30] used transfer learning to improve the efficacy of classification models even when only small datasets are available for training. They employed the pre-trained GoogLeNet model for tumor detection from the brain MRI images available in the Figshare MRI dataset; their method gave an average prediction performance of 98%. The techniques discussed so far involve manual feature extraction, which is prone to human errors. Thus, the researchers in [31] proposed automated diagnosis and assessment of brain tumors. In this model, the brain MRI images are preprocessed using a Gaussian filter; deep features are then extracted using the AlexNet and VGG 16 architectures, and classification is performed using Extreme Learning Machine methods. One more group of researchers suggested a two-phase multi-model automated diagnostic process [32] for finding and localizing brain tumors.

The first stage of this process involves pre-processing, CNN-based feature extraction, and feature classification based on an error-correcting output codes support vector machine (ECOC-SVM). The key aim of the first stage is to distinguish brain tumors from healthy tissue. In the second stage, the tumor is located by a fully developed five-layer region-based convolutional neural network (R-CNN). The efficiency of the first stage was analyzed with three CNN models, viz. VGG-16, AlexNet, and VGG-19. The approach reported an overall accuracy of 99.55% and generated a dice score of 0.87 for the brain tumor localization phase.
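Segmentation metrics such as the dice score quoted above can be computed directly from binary masks. This is an illustrative Python/NumPy sketch; the ground-truth and predicted masks here are synthetic examples of our own.

```python
import numpy as np

def dice_score(pred, truth):
    """Dice coefficient between two binary masks: 2|A∩B| / (|A| + |B|)."""
    inter = np.logical_and(pred, truth).sum()
    return 2.0 * inter / (pred.sum() + truth.sum())

# Synthetic ground truth and a prediction shifted down by one row.
truth = np.zeros((10, 10), dtype=bool)
truth[2:6, 2:6] = True            # 4x4 = 16 foreground pixels
pred = np.zeros((10, 10), dtype=bool)
pred[3:7, 2:6] = True             # overlaps the truth in 3 of its 4 rows
print(dice_score(pred, truth))    # 2*12 / (16 + 16) = 0.75
```

A score of 1.0 means the masks coincide exactly, so the 0.87 localization score reported for [32] indicates substantial but imperfect overlap with the ground-truth tumor region.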
The above discussion of the state-of-the-art techniques for contrast improvement, feature extraction, and tumor detection shows that the techniques proposed in [1 - 19] face the problems of mean brightness shift, uneven illumination, over-enhancement, and under-enhancement. These techniques also fail to preserve maximum brightness and detailed information. Therefore, they are not effective in accurate feature extraction and brain tumor detection from MRI images. A further review of the existing works shows that the ODTWCHE technique overcomes the above-mentioned limitations and performs better in feature extraction and tumor detection. In this chapter, we enhance the robustness of the ODTWCHE technique proposed in [20]. We experimented with MRI images collected from different sources, both online [33] and offline. We collected real-time images from hospitals across India to validate the robustness of ODTWCHE in screening for tumors in real-time patients using their MRI images. Table 1 compares the medical image contrast enhancement techniques, and Table 2 compares the feature extraction and brain tumor detection techniques.

Table 1. Comparison of medical image contrast enhancement techniques in the literature.

| Author | Technique | Features | Weaknesses |
|---|---|---|---|
| Y.T. Kim [6] | Segmentation of the input image histogram into two sub-histograms at the mean gray level, followed by HE of each. | Preserves mean brightness well compared to HE and provides natural enhancement. | Less entropy preservation and an over-enhancement problem. |
| Y. Wang et al. [7] | The image is decomposed into two equal-area sub-images based on its PDF, followed by HE. | Effectively conserves the image luminance. | Unnatural contrast enhancement and unpleasant visual artifacts in the background. |
| S.D. Chen et al. [8] | Recursive segmentation based on the mean brightness. | Provides scalable brightness preservation. | Not suitable for highly structured images; difficult to search for the appropriate value of r. |
| K.S. Sim [9] | Recursive segmentation based on the median intensity value. | Preserves more brightness with good MSSIM and PSNR. | Assessing the number of separations is a crucial problem, degrading the results. |
| M. Agarwal et al. [10] | Segmentation based on the valley positions of the histogram. | Better contrast enhancement with maximum entropy preservation. | Produces undesirable visual artifacts; not suitable for images with highly complex structured backgrounds. |
| S.C. Huang et al. [11] | Gamma correction driven by the probability distribution of luminance pixels. | Better contrast enhancement with less computation time. | Not suitable for highly structured, unevenly illuminated, and non-symmetrically distributed images. |
| M. Tiwari et al. [12] | Segmentation based on quantile values, followed by dynamic range stretching and a weighted distribution process. | Non-recursive; preserves image brightness with less computation time. | Fails to remove noise, blurriness, sharp edges, and angular corners from medical images. |
| S. Rani et al. [13] | Recursive division of the input image histogram accompanied by weighted distribution and adaptive gamma correction. | Produces a linear minimum mean square error (MMSE). | High computational time complexity. |
| M.A. Qadar et al. [14] | Contrast enhancement of brightness-distorted images by a weighted distribution process. | Better contrast enhancement without incurring annoying artifacts. | Not suitable for all kinds of medical MRI images. |
| C. Zuo et al. [15] | Segmentation based on Otsu's method followed by a range optimization process. | Preserves the original image brightness well. | Not suitable for images with highly complex structured backgrounds due to improper segmentation. |
| H. Xu et al. [16] | Segmentation based on Otsu's double threshold. | Obtains clearer details with good brightness preservation. | Not suitable for 3D medical MRI images. |
| M. Agarwal et al. [17] | Segmentation based on Otsu's single thresholding method, followed by weighted distribution and range optimization. | Improved contrast enhancement while maintaining optimum entropy and illumination. | Not applicable to all forms of medical MRI images. |
| M. Agarwal et al. [18] | Weighted probability distribution, gamma correction, and homomorphic filtering. | Provides a clear view of local information with appropriate contrast. | Not suitable for 3D medical MRI images. |
| M. Agarwal et al. [19] | Segmentation based on the entropy maximization method, followed by a probability modification process. | Maximum entropy and brightness preservation by controlling the enhancement rate. | Limited improvement in contrast; not suitable for images with highly complex structured backgrounds. |
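Several of the feature-extraction pipelines compared in Table 2 rely on GLCM texture statistics. The following is a minimal Python/NumPy sketch of our own for a single horizontal offset (not the cited authors' code); the quantization scheme and the flat test image are assumptions made for illustration.

```python
import numpy as np

def glcm_features(img, levels=8):
    """Build a horizontal-offset GLCM and derive contrast, energy, and
    homogeneity, three common Haralick-style texture features."""
    q = img.astype(int) * levels // 256          # quantize to a few gray levels
    glcm = np.zeros((levels, levels), dtype=float)
    # Count co-occurrences of horizontally adjacent pixel pairs (offset (0, 1)).
    np.add.at(glcm, (q[:, :-1].ravel(), q[:, 1:].ravel()), 1)
    glcm /= glcm.sum()                           # normalize to joint probabilities
    i, j = np.indices(glcm.shape)
    return {
        "contrast": float((glcm * (i - j) ** 2).sum()),
        "energy": float((glcm ** 2).sum()),
        "homogeneity": float((glcm / (1.0 + np.abs(i - j))).sum()),
    }

flat = np.full((16, 16), 128, dtype=np.uint8)    # perfectly uniform texture
print(glcm_features(flat))                       # contrast 0, energy 1, homogeneity 1
```

A perfectly uniform image puts all co-occurrence mass on the diagonal, giving zero contrast and maximal energy and homogeneity; real tumor textures spread mass off the diagonal, which is what these features quantify.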
Table 2. Comparison of feature extraction and brain tumor detection techniques in the literature.

| Author | Technique Used | Data Set Used | Performance Evaluation |
|---|---|---|---|
| S.S. Sandhya et al. [21] | Gaussian filter, FCM, curvelet transform, and PNN. | Collected from Kadhmiya teaching hospital and the Internet. | Achieved a recognition rate of 98%. |
| R. Deepa et al. [22] | A mixture model based on GLCM and ANN for tumor identification and classification. | The Cancer Imaging Archive (TCIA). | Achieved sensitivity of 93.47%, specificity of 100%, and accuracy of 96.34%. |
| N.B. Bahadure et al. [23] | A hybrid of the biologically inspired Berkeley Wavelet Transform (BWT) and the Support Vector Machine (SVM) as a classifying approach. | All test images were taken using a 3 Tesla Siemens Magnetom Spectra MRI machine. | Achieved specificity of 94.2%, accuracy of 96.51%, sensitivity of 97.2%, and a dice similarity coefficient of 0.82. |
| N.V. Shree et al. [25] | GLCM, DWT, morphological filtering, and PNN. | Collected from the website www.diacom.com. | Achieved nearly 100% accuracy. |
| S. Arivoli et al. [27] | Median filtering, morphological operations, masking, feature extraction, and SVM classification. | Mostly collected from BraTS and TCIA. | Provided approximately 99% accuracy. |
| A. Rehman et al. [28] | Transfer learning techniques (fine-tuning and freezing) with three studies on AlexNet, VGGNet, and GoogLeNet. | Figshare brain tumor dataset. | Attained the highest accuracy of 98.69% on the fine-tuned VGG 16 architecture. |
| Z.N.K. Swati et al. [29] | A transfer learning-based block-wise fine-tuning strategy. | Figshare brain tumor dataset. | Achieved 94.82% average accuracy under fivefold cross-validation. |
| S. Deepak et al. [30] | A pre-trained GoogLeNet model for feature extraction from brain tumor MRI images. | Figshare brain tumor dataset. | Generated a mean classification accuracy of 98%. |
| A. Ari et al. [31] | Gaussian filter, AlexNet and VGG 16 models for deep feature extraction, and an ELM classifier. | Three datasets: TCIA, Figshare, and Henry Ford Hospital (RETRO). | Provided superior performance in detecting and classifying brain tumors. |
| M.K. Abd-Ellah et al. [32] | A two-phase multi-model automatic detection model for the diagnosis and localization of brain tumors. | BraTS 2013 database. | Obtained an accuracy of 99.55% and a dice score of 0.87. |
| A. Cinar et al. [35] | A deep learning CNN model (ResNet 50) for brain tumor detection. | Dataset from the Kaggle site. | Obtained an accuracy of 97.2%. |
PROPOSED MODEL

In this chapter, we use the architecture of ODTWCHE [20]. We further optimize its feature extraction and improve its robustness. The improved feature extraction and high robustness make the technique more effective in the early screening of brain tumors. The details of the experiments and the optimized feature extraction are discussed in the following sub-sections.
Contrast Enhancement and Tumor Screening
IoT and Big Data Analytics, Vol. 2 101
Dataset

In this model, we used the publicly available Figshare database [36] for evaluating the performance of the suggested system. The researcher Cheng prepared this dataset in 2017. It comprises 3,064 brain MRI images from 233 patients, of which 397 images form the test dataset. The dataset contains MRI images of patients diagnosed with benign or malignant brain tumors. All images belong to the T1-CE-MRI modality and include axial, coronal, and sagittal views. The training set of benign tumors comprises 561 MRI images, and the corresponding test set contains 219 images. The training set of malignant tumors comprises 2,106 MRI images, while the testing data contain 178 images. The breakdown of the number of MRI images in each category is given in Table 3.

Table 3. The details about the training and testing datasets used for tumor detection and classification.

| Training Benign | Training Malignant | Testing Benign | Testing Malignant | Total Images |
| 561 | 2,106 | 219 | 178 | 3,064 |
Image Pre-Processing

Due to the presence of several types of noise, raw images collected from websites and scan centers are not appropriate for direct processing. Thus, image pre-processing is the most important step for removing noise and improving the quality of low-contrast input images. In addition, medical image pre-processing plays a significant part in patient analysis. Pre-processing is particularly required for MRI/CT images because:

1. Labels or marks (film artifacts) and noise can interfere with post-processing of these images, such as tumor detection and classification.
2. Images must be made more appropriate for further processing in CAD systems.
3. Image quality needs to be improved.
4. Noise must be excluded from the images.

In this work, the primary focus is on removing noise, unwanted visual artifacts, and additional cranial (skull) tissues, as well as improving contrast. The automated system's success is affected by the presence of these unwanted visual artifacts and skull tissues. Therefore, the image pre-processing technique ODTWCHE [20] is used in this proposed work to effectively remove these undesirable visual artifacts and cranial tissues. The ODTWCHE technique consists of six main steps, described as follows; the block diagram is given in Fig. (2) [20].
Fig. (2). Block Diagram of ODTWCHE Technique. [Figure: pipeline of Input Image → Histogram Building → Image Segmentation based on Otsu’s Threshold Method → Probability Modification Process → Particle Swarm Optimization Technique → Histogram Equalization Process → Gamma Correction Process → Filtering Process → Processed Image]
The first step segments the input image histogram into three sub-histograms, namely foreground, background, and target, on the basis of Otsu’s double threshold method. This method performs global thresholding according to the shape of the histogram. It automatically selects optimal threshold values by maximizing the inter-class variance. Hence, it effectively overcomes the difficulty of segmenting image histograms with complex backgrounds and multiple objects.

The second step consists of the probability modification process. In this process, we modify the probabilities of the statistical sub-histograms based on a weighted normalized constrained model. This model assigns higher weights to less frequent gray levels and lower weights to more frequent gray levels. As a result, it controls the problem of over-enhancement and effectively reduces the dominating nature of high-frequency histogram bins.

The third step employs the Particle Swarm Optimization (PSO) technique to find the optimal values of the parameters used in the weighted normalized constrained model. This technique uses entropy as the fitness function to preserve more information and to control the degree of enhancement.
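To make the first step concrete, the following NumPy sketch selects Otsu's two thresholds by exhaustively searching for the pair that maximizes the between-class variance of the three resulting classes. This is a minimal illustration under our own naming (`otsu_double_threshold`), not the authors' implementation, and it omits the PSO-tuned weighting and the later equalization stages.

```python
import numpy as np

def otsu_double_threshold(image, levels=256):
    """Pick thresholds t1 < t2 splitting the gray-level histogram into
    three classes (background / target / foreground) by maximizing the
    between-class variance, as in Otsu's double-threshold method."""
    hist, _ = np.histogram(image, bins=levels, range=(0, levels))
    p = hist / hist.sum()                              # gray-level probabilities
    g = np.arange(levels)
    cw = np.concatenate(([0.0], np.cumsum(p)))         # cumulative class weights
    cm = np.concatenate(([0.0], np.cumsum(p * g)))     # cumulative (unnormalized) means
    mu_total = cm[-1]                                  # global mean gray level
    best, t_best = -1.0, (0, 0)
    for t1 in range(1, levels - 1):
        for t2 in range(t1 + 1, levels):
            var_between, ok = 0.0, True
            for lo, hi in ((0, t1), (t1, t2), (t2, levels)):
                w = cw[hi] - cw[lo]
                if w == 0.0:                           # empty class: invalid split
                    ok = False
                    break
                mu = (cm[hi] - cm[lo]) / w
                var_between += w * (mu - mu_total) ** 2
            if ok and var_between > best:
                best, t_best = var_between, (t1, t2)
    return t_best
```

On an image with three well-separated gray clusters, the returned thresholds fall between the clusters, which is exactly what the segmentation step needs to isolate foreground, target, and background.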
In the fourth step, all sub-histograms are equalized independently by the histogram equalization process. This process spreads intensity values over the whole range of the input image on the basis of a transformation function. It generates a histogram with a more uniform intensity distribution, i.e., wider and flatter. As a result, it enhances the contrast of the input image [4].

In the fifth step, we apply an adaptive gamma correction process for further global contrast enhancement of the equalized image. This process maintains a balance between low computational cost and high visual quality. It effectively increases low-intensity values while avoiding a significant decrement in high-intensity values. As a result, the gamma correction process avoids unnatural changes in the CDF.

Finally, Wiener filtering is applied in the sixth step to reduce the noise in visually important areas of the enhanced image. The Wiener filter is a linear space-invariant filter that minimizes the mean square error (MSE) between the original image and its corresponding restored image.

Features Extraction

Feature extraction is the procedure of reducing an initial set of raw data to understandable, smaller, and adaptable groups known as features or feature vectors, while correctly and precisely describing the original dataset. It mainly collects the higher-level details of an image, such as texture, shape, contrast, and color. The most prominent application of feature extraction is in medical image processing. Features such as edges, pixel density, shape-based descriptors, motion, and template matches are extracted from an image to process, analyze, and predict various other parameters needed by tumor detection and classification algorithms [22 - 31]. Feature extraction mainly depends on the expert’s knowledge of a particular domain.
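The equalization and gamma-correction steps above can be sketched in NumPy as follows. This is a deliberately simplified illustration: ODTWCHE equalizes each sub-histogram separately and derives gamma adaptively from the weighted distribution, whereas here a single global equalization and a fixed gamma are shown, and the function names are our own.

```python
import numpy as np

def equalize_histogram(image, levels=256):
    """Classical histogram equalization: map gray levels through the CDF
    so the output histogram spreads over the full intensity range."""
    hist, _ = np.histogram(image, bins=levels, range=(0, levels))
    cdf = np.cumsum(hist) / hist.sum()
    lut = np.round((levels - 1) * cdf).astype(np.uint8)  # lookup table
    return lut[image]

def gamma_correct(image, gamma=0.7, levels=256):
    """Global gamma correction. A gamma < 1 lifts low intensities while
    only mildly compressing high ones, matching the fifth step's intent.
    The adaptive variant would compute gamma per image; a fixed value
    is used here purely for illustration."""
    norm = image / (levels - 1)
    return np.round((levels - 1) * norm ** gamma).astype(np.uint8)
```

Applied in sequence, `gamma_correct(equalize_histogram(img))` mimics the fourth and fifth stages of the pipeline on a whole image rather than on per-class sub-histograms.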
It is a crucial step in conventional machine learning, and the accuracy of the brain tumor detection process depends on the extracted features. Generally, feature extraction can be categorized into two types. The first type is manual, handcrafted, or low-level feature extraction based on traditional algorithms such as the “Gray Level Co-occurrence Matrix (GLCM)” and the MATLAB built-in function regionprops(); the second type is automated feature extraction based on deep learning, i.e., Convolutional Neural Networks (CNNs). Texture analysis based on handcrafted features remains a critical component of machine-assisted visual interpretation. It is used effectively by selecting standard features to improve the diagnostic system's accuracy. The most commonly used image processing tools for the retrieval of statistical and texture properties are GLCM and regionprops(). The observation and assessment of these features can enhance the diagnosis, the staging of the tumor, and the evaluation of therapy response. The extracted texture features are “Standard Deviation, Mean, Entropy, Skewness, Contrast, Inverse Difference Moment (IDM) or Homogeneity, Kurtosis, Energy, Correlation, Directional Moment (DM), and Coarseness”. The extracted statistical features are “centroid, area, bounding box, major axis length, eccentricity, minor axis length, convex area, orientation, filled area, solidity, equivalent diameter, perimeter, and extent”. Feature extraction provides a starting point for the brain tumor detection task and ensures better diagnostic analysis of brain MRI images [32]. The proposed technique employs GLCM for feature extraction [23]. It effectively extracts all statistical and textural features from the pre-processed MRI images. The difference in the values of statistical features such as “area, centroid, bounding box, major axis length, minor axis length, eccentricity, orientation, convex area, filled area, equivalent diameter, solidity, extent, and perimeter” between non-tumorous and tumorous cells is useful in tumor detection. The values of these features for the detected tumors are shown in Table 4. Further, textural features such as “contrast, inverse difference moment, entropy, correlation, variance, sum average, sum entropy, difference entropy, inertia, cluster shade, cluster prominence, dissimilarity, homogeneity, energy, autocorrelation, maximum probability, and inverse difference normalized (INN)” from the pre-processed brain MRI are also important for distinguishing tumorous and non-tumorous tissues.
The findings of the textural and statistical features of tumorous cells are important for diagnosis, recognition of tumor severity, and assessment of the patient's response to therapy.

Table 4. Statistical features of identified tumor in MRI images.

| Tumor Features | MRI Brain 1 | MRI Skull 1 | MRI Brain 2 | MRI Skull 2 |
| Area | 805 | 423 | 982 | 1090 |
| Centroid | [130.1205, 128.1081] | [131.3073, 92.2695] | [126.7912, 128.9959] | [149.3532, 98.2807] |
| Bounding Box | [24.5000, 6.5000, 209, 241] | [87.5000, 69.5000, 89, 46] | [3.5000, 3.5000, 247, 251] | [35.5000, 15.5000, 189, 239] |
| Major Axis Length | 339.9325 | 99.7980 | 404.2464 | 316.5386 |
| Minor Axis Length | 296.7446 | 48.1141 | 398.0508 | 224.1649 |
| Eccentricity | 0.4878 | 0.8761 | 0.1744 | 0.7060 |
| Orientation | 88.4660 | -11.2281 | 84.7256 | -47.4157 |
| Convex Area | 40218 | 2955 | 61501 | 33015 |
| Filled Area | 805 | 423 | 60865 | 1090 |
| Equivalent Diameter | 32.0150 | 23.2073 | 33.3599 | 37.2536 |
| Solidity | 0.0200 | 0.1431 | 0.0160 | 0.0330 |
| Extent | 0.0160 | 0.1033 | 0.0158 | 0.0241 |
| Perimeter | 1.4198e+03 | 752.8540 | 963.5020 | 1.9532e+03 |
Tumor Detection

A brain tumor is an unregulated growth of irregular cells in the brain. The severity of a brain tumor differs greatly from that of other cancers. It harms humans and has the potential to impair normal brain functions. As a result, early diagnosis and care are critical for saving lives. However, owing to the unpredictability of tumor size and location, detecting a brain tumor in an MRI is a challenging task. This proposed work mainly focuses on the accurate detection of brain tumors using image processing techniques. A brain tumor is more curable and treatable if detected and classified in its early stage. The architecture of ODTWCHE employed in this chapter accurately detects the brain tumor at an initial phase. In the first phase, it converts the pre-processed output image into a binary format using the built-in MATLAB function bwlabeln() [10]. The resultant binary image is then subjected to area opening operations to remove all connected components that have fewer than 50 pixels. Next, it calculates the area (the number of pixels occupied by the object) and solidity (the image area divided by the smallest convex polygon that could cover the image) of each brain MRI image. It then extracts the tumorous tissues based on the solidity and area values of the brain MRI image [30]. Generally, a tumor has a higher solidity value than normal brain tissues. Samples of the experimental results obtained using MRI images of the brain and skull are shown in Figs. (3 - 6).
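The labeling and area-opening steps above can be sketched in plain Python/NumPy as a stand-in for the MATLAB routines the chapter uses; the function names, the 4-connectivity choice, and the BFS labeling are our own illustrative choices, not the authors' code.

```python
import numpy as np
from collections import deque

def label_components(binary):
    """4-connected component labelling (a small stand-in for MATLAB's
    bwlabeln), returning a label image and per-component pixel counts."""
    labels = np.zeros(binary.shape, dtype=int)
    sizes, current = {}, 0
    h, w = binary.shape
    for sy in range(h):
        for sx in range(w):
            if binary[sy, sx] and labels[sy, sx] == 0:
                current += 1
                labels[sy, sx] = current
                q, count = deque([(sy, sx)]), 0
                while q:                              # breadth-first flood fill
                    y, x = q.popleft()
                    count += 1
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if 0 <= ny < h and 0 <= nx < w and binary[ny, nx] and labels[ny, nx] == 0:
                            labels[ny, nx] = current
                            q.append((ny, nx))
                sizes[current] = count
    return labels, sizes

def area_open(binary, min_pixels=50):
    """Remove connected components smaller than min_pixels, mirroring the
    chapter's area-opening step before the solidity/area screening."""
    labels, sizes = label_components(binary)
    keep = [k for k, s in sizes.items() if s >= min_pixels]
    return np.isin(labels, keep) & binary
```

After `area_open`, the surviving regions would be ranked by area and solidity to pick the tumor candidate, as described in the text.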
Fig. (3). (a) Input Image (b) Binary Image (c) Detected Tumor (d) Improved Tumor Detection.
Fig. (4). (a) Input Image (b) Binary Image (c) Detected Tumor (d) Improved Tumor Detection.
Fig. (5). (a) Input Image (b) Binary Image (c) Detected Tumor (d) Improved Tumor Detection.
Fig. (6). (a) Input Image (b) Binary Image (c) Detected Tumor (d) Improved Tumor Detection.
Based on the experimental findings, the authors suggest that the proposed system is effective in identifying a brain tumor at a preliminary phase. Early detection of a brain tumor will help lessen the chance of unnecessary surgery and increase the number of available treatment options, thereby improving the survival rate of patients.

RESULTS AND DISCUSSION

In this section, the authors explain the specifics of the outcomes acquired from the experiments. The results obtained by employing the ODTWCHE technique reported an average entropy of 6.91 bits, which is close to the entropy of the input image. This reveals that the technique is efficient in preserving detailed information.
Furthermore, the technique achieved a PSNR of 31.05 dB, which is better than the current techniques. This indicates that ODTWCHE preserves maximum brightness and reduces noise. The methodology also showed an average contrast of 40.15 dB, which is greater than the contrast reported by other techniques in the literature. These findings show that the ODTWCHE approach achieves the best contrast enhancement, eliminates over-enhancement, and decreases noise. It can also distinguish tumors in images with complicated backgrounds and numerous objects.

It is clear from the simulation results shown in Figs. (3 - 6) that the optimized ODTWCHE technique accurately detects the tumor in all types of enhanced medical MRI images. Further, we compared the results of the robust ODTWCHE with optimized feature extraction against the state-of-the-art techniques proposed in [3 - 17]. The comparison of the results is shown in Fig. (7). It is evident from Fig. (7) that the suggested method outperforms the others in terms of detail preservation, PSNR, and contrast. Fig. (7) displays the comparison of diverse image contrast enhancement techniques, namely “GHE, BBHE [6], DSIHE [7], AGCWD [11], RLBHE [15], RLDTMHE [16], TOHE [37], EASHE [38]”, and the proposed ODTWCHE technique on the Figshare database [36] in the MATLAB simulation environment. Based on the experimental findings in Fig. (7), the suggested ODTWCHE algorithm outperforms the other eight HE-based contrast improvement methods. It preserves the most information while offering the best contrast improvement with the fewest artifacts.
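The entropy and PSNR figures quoted above follow standard definitions, which the sketch below implements; the chapter does not specify its contrast measure, so only these two metrics are shown, and the function names are ours.

```python
import numpy as np

def entropy_bits(image, levels=256):
    """Shannon entropy of the gray-level histogram, in bits.
    Higher values indicate more preserved detail."""
    hist, _ = np.histogram(image, bins=levels, range=(0, levels))
    p = hist[hist > 0] / hist.sum()
    return float(-(p * np.log2(p)).sum())

def psnr_db(original, enhanced, peak=255.0):
    """Peak signal-to-noise ratio between two images, in dB.
    Higher values indicate less distortion of the original."""
    mse = np.mean((original.astype(float) - enhanced.astype(float)) ** 2)
    return float('inf') if mse == 0 else float(10 * np.log10(peak ** 2 / mse))
```

Comparing `entropy_bits` before and after enhancement, and `psnr_db` between the input and enhanced images, reproduces the kind of evaluation reported in this section.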
Fig. (7). Comparison of different image contrast enhancement techniques.
Besides that, since it applies an automated weighted constrained model before the HE phase, the ODTWCHE algorithm effectively preserves the natural quality of an image and manages the enhancement rate. These findings show that the proposed algorithm strikes a stronger balance between brightness preservation and contrast improvement.

FUTURE SCOPE

This work can be extended to locate tumors using other imaging modalities such as X-rays and CT scans. There is also scope to enhance its applicability for the classification of malignant and benign tumors.

CONCLUSION

In this chapter, we highlight the need for tumor screening at an early stage. We proposed a technique for improved feature extraction and optimized the ODTWCHE technique for tumor detection from MRI images. Further, we improved the robustness of the technique by experimenting with images obtained from different sources. The technique is efficient in the early detection of tumors, even from images with complex backgrounds and multiple organs. We also present the features of the tumors detected from the MRI images, used for validating the effectiveness of the technique. We have provided the results of the assessment metrics used to evaluate the visual appearance of images created using the ODTWCHE methodology. The high values of PSNR, entropy, and contrast prove the efficacy of the improvements proposed in this chapter. Based on the experimental findings, we conclude that the suggested approach is applicable for contrast improvement of low-quality MRI images and primary tumor screening. It can be used as an assistant for clinical experts and radiologists.

REFERENCES

[1]
J. Liu, M. Li, J. Wang, F. Wu, T. Liu, and Y. Pan, "A survey of MRI-based brain tumor segmentation methods", Tsinghua Sci. Technol., vol. 19, no. 6, pp. 578-595, 2014. [http://dx.doi.org/10.1109/TST.2014.6961028]
[2]
A. Philips, D.L. Henshaw, G. Lamburn, and M.J. O’Carroll, "Authors’ Comment on “Brain Tumours: Rise in Glioblastoma Multiforme Incidence in England 1995–2015 Suggests an Adverse Environmental or Lifestyle Factor”", J. Environ. Public Health, vol. 2018, pp. 1-3, 2018. [http://dx.doi.org/10.1155/2018/2170208] [PMID: 30046315]
[3]
R. Hua, Q. Huo, Y. Gao, H. Sui, B. Zhang, Y. Sun, Z. Mo, and F. Shi, "Segmenting Brain Tumor Using Cascaded V-Nets in Multimodal MR Images", Front. Comput. Neurosci., vol. 14, p. 9, 2020. [http://dx.doi.org/10.3389/fncom.2020.00009] [PMID: 32116623]
[4]
Z.U. Rehman, M.S. Zia, G.R. Bojja, M. Yaqub, F. Jinchao, and K. Arshid, "Texture based localization of a brain tumor from MR-images by using a machine learning approach", Med. Hypotheses, vol. 141, 2020. [http://dx.doi.org/10.1016/j.mehy.2020.109705]
[5]
T. Hossain, F.S. Shishir, M. Ashraf, M.A. Al Nasim, and F.M. Shah, "Brain Tumor Detection Using Convolutional Neural Network", 1st Int. Conf. Adv. Sci. Eng. Robot. Technol. (ICASERT), pp. 1-5, 2019. [http://dx.doi.org/10.1109/ICASERT.2019.8934561]
[6]
Y.T. Kim, "Contrast enhancement using brightness preserving bi-histogram equalization", IEEE Trans. Consum. Electron., vol. 43, no. 1, pp. 1-8, 1997. [http://dx.doi.org/10.1109/TCE.2002.1010085]
[7]
Y. Wang, Q. Chen, and B. Zhang, "Image enhancement based on equal area dualistic sub-image histogram equalization method", IEEE Trans. Consum. Electron., vol. 45, no. 1, pp. 68-75, 1999. [http://dx.doi.org/10.1109/30.754419]
[8]
S.D. Chen, and A.R. Ramli, "Contrast enhancement using recursive mean-separate histogram equalization for scalable brightness preservation", IEEE Trans. Consum. Electron., vol. 49, no. 4, pp. 1301-1309, 2003. [http://dx.doi.org/10.1109/TCE.2003.1261233]
[9]
K.S. Sim, C.P. Tso, and Y.Y. Tan, "Recursive sub-image histogram equalization applied to gray scale images", Pattern Recognit. Lett., vol. 28, no. 10, pp. 1209-1221, 2007. [http://dx.doi.org/10.1016/j.patrec.2007.02.003]
[10]
M. Agarwal, and R. Mahajan, "Medical Images Contrast Enhancement using Quad Weighted Histogram Equalization with Adaptive Gamma Correction and Homomorphic Filtering", 7th International Conference on Advances in Computing & Communication (ICACC) pp. 509-517, 2017.
[11]
S.C. Huang, F.C. Cheng, and Y.S. Chiu, "Efficient contrast enhancement using adaptive gamma correction with weighting distribution", IEEE Trans. Image Process., vol. 22, no. 3, pp. 1032-1041, 2013. [http://dx.doi.org/10.1109/TIP.2012.2226047] [PMID: 23144035]
[12]
M. Tiwari, B. Gupta, and M. Shrivastava, "High-speed quantile-based histogram equalisation for brightness preservation and contrast enhancement", IET Image Process., vol. 9, no. 1, pp. 80-89, 2015. [http://dx.doi.org/10.1049/iet-ipr.2013.0778]
[13]
S. Rani, and M. Kumar, "Contrast Enhancement using Improved Adaptive Gamma Correction with Weighting Distribution Technique", Int. J. Comput. Appl., vol. 101, no. 11, pp. 47-53, 2014. [http://dx.doi.org/10.5120/17735-8849]
[14]
M.A. Qadar, Y. Zhaowen, A. Rehman, and M.A. Alvi, "Recursive weighted multi-plateau histogram equalization for image enhancement", Optik (Stuttg.), vol. 126, no. 24, pp. 5890-5898, 2015. [http://dx.doi.org/10.1016/j.ijleo.2015.08.278]
[15]
C. Zuo, Q. Chen, and X. Sui, "Range Limited Bi-Histogram Equalization for image contrast enhancement", Optik (Stuttg.), vol. 124, no. 5, pp. 425-431, 2013. [http://dx.doi.org/10.1016/j.ijleo.2011.12.057]
[16]
H. Xu, Q. Chen, C. Zuo, C. Yang, and N. Liu, "Range limited double threshold multi histogram equalization for image contrast enhancement", Opt. Rev., vol. 22, no. 2, pp. 246-255, 2015.
[17]
M. Agarwal, and R. Mahajan, "Medical Images Contrast Enhancement using Range Limited Weighted Histogram Equalization", In 6th International Conference on Smart Computing and Communications (ICSCC), NIT Kurukshetra, Haryana, India, vol. 125, pp. 149-156, 2018.
[18]
G. Rani, and M. Agarwal, "Contrast Enhancement using Optimum Threshold Selection", Int. J. Soft. Innov., vol. 8, no. 3, p. 7, 2019.
[19]
M. Agarwal, G. Rani, S. Agrawal. "Sequential model for digital image contrast enhancement", Recent Advances in Computer Science and Communications, vol. 14, no. 9, pp. 2772-2784, 2021. [http://dx.doi.org/10.2174/2666255813999200717231942]
[20]
M. Agarwal, G. Rani, and V.S. Dhaka, "Optimized contrast enhancement for tumor detection", Int. J. Imaging Syst. Technol., vol. 30, no. 3, pp. 687-703, 2020. [http://dx.doi.org/10.1002/ima.22408]
[21]
S.S. Sandhya, and G. Giri Babu Kande, "A novel approach for the detection of tumor in MR images of the brain and its classification via independent component analysis and kernel support vector machine", Imaging Med., vol. 9, no. 3, pp. 33-44, 2017.
[22]
R. Deepa, and W.R.M. Sam Emmanuel, "Identification and classification of brain tumor through mixture model based on magnetic resonance imaging segmentation and artificial neural network", Concepts Magn. Reson. Part A, vol. 45, no. 2, pp. 1-12, 2017. [http://dx.doi.org/10.1002/cmr.a.21390]
[23]
N.B. Bahadure, A.K. Ray, and H.P. Thethi, "Image analysis for MRI based brain tumor detection and feature extraction using biologically inspired BWT and SVM", Int. J. Biomed. Imaging, 2017.
[24]
S. Preethi, and P. Aishwarya, "Combining wavelet texture features and deep neural network for tumor detection and segmentation over MRI", J. Intell. Syst., 2017. [http://dx.doi.org/10.1515/jisys-2017-0090]
[25]
N. Varuna Shree, and T.N.R. Kumar, "Identification and classification of brain tumor MRI images with feature extraction using DWT and probabilistic neural network", Brain Inform., vol. 5, no. 1, pp. 23-30, 2018. [http://dx.doi.org/10.1007/s40708-017-0075-5]
[26]
B. Asodekar, and P.S.A.S. Gore, "Brain Tumor Classification Using Shape Analysis of MRI Images", Int. Conf. Commun. Inf. Process., 2019. [http://dx.doi.org/10.2139/ssrn.3425335]
[27]
S. Arivoli, K.J. Ravindran, R. Raveen, and S. Tennyson, "Detection and Classification of Brain Tumor using Machine Learning Approaches", Int. J. Res. Pharm. Sci., vol. 10, no. 3, pp. 2153-2162, 2019.
[28]
A. Rehman, S. Naz, M.I. Razzak, F. Akram, and M. Imran, "A deep learning-based framework for automatic brain tumors classification using transfer learning", Circuits Syst. Signal Process., vol. 39, no. 2, pp. 757-775, 2020. [http://dx.doi.org/10.1007/s00034-019-01246-3]
[29]
Z.N.K. Swati, Q. Zhao, M. Kabir, F. Ali, Z. Ali, S. Ahmed, and J. Lu, "Brain tumor classification for MR images using transfer learning and fine-tuning", Comput. Med. Imaging Graph., vol. 75, pp. 34-46, 2019. [http://dx.doi.org/10.1016/j.compmedimag.2019.05.001] [PMID: 31150950]
[30]
S. Deepak, and P.M. Ameer, "Brain tumor classification using deep CNN features via transfer learning", Comput. Biol. Med., vol. 111, pp. 1-7, 2019. [http://dx.doi.org/10.1016/j.compbiomed.2019.103345]
[31]
A. Arı, "Brain MR Image Classification Based on Deep Features by Using Extreme Learning Machines", Biomed. J. Sci. Tech. Res., vol. 25, no. 3, pp. 19137-19144, 2020. [http://dx.doi.org/10.26717/BJSTR.2020.25.004201]
[32]
M.K. Abd-Ellah, A.I. Awad, A.A.M. Khalaf, and H.F.A. Hamed, "Two-phase multi-model automatic brain tumor diagnosis system from magnetic resonance images using convolutional neural networks", EURASIP J. Image Video Process., no. 97, pp. 1-10, 2018. [http://dx.doi.org/10.1186/s13640-018-0332-4]
[33]
CVG-UGR-Database. Available from: http://decsai.ugr.es/cvg/dbimagenes
[34]
L. Zhuang, and Y. Guan, "Image enhancement via subimage histogram equalization based on mean and variance", Comput. Intell. Neurosci., pp. 1-12, 2017. [http://dx.doi.org/10.1155/2017/6029892]
[35]
A. Çinar, and M. Yildirim, "Detection of tumors on brain MRI images using the hybrid convolutional neural network architecture", Med. Hypotheses, vol. 139, 2020. [http://dx.doi.org/10.1016/j.mehy.2020.109684]

[36]
J. Cheng, "Brain magnetic resonance imaging tumor dataset", Figshare MRI Dataset Version 5, 2017. [http://dx.doi.org/10.6084/m9.figshare.1512427.v5]
[37]
P. Shanmugavadivu, and K. Balasubramanian, "Thresholded and Optimized Histogram Equalization for contrast enhancement of images", Comput. Electr. Eng., vol. 40, no. 3, pp. 757-768, 2014. [http://dx.doi.org/10.1016/j.compeleceng.2013.06.013]
[38]
L. Zhuang, and Y. Guan, "Adaptive image enhancement using entropy-based subhistogram equalization", Comput. Intell. Neurosci., 2018. [http://dx.doi.org/10.1155/2018/3837275]
IoT and Big Data Analytics, 2023, Vol. 2, 112-115
SUBJECT INDEX A Algorithms 66, 96, 103 genetic 66, 96 traditional 103 Alzheimer’s disease 6, 7 Artifacts 92, 93, 98, 101, 102 intensity saturation 92, 93 visual 98, 101, 102 Artificial 1, 2, 4, 23, 25, 26, 64, 66, 96, 99 intelligence techniques 64 intelligence technology 4 neural networks (ANN) 1, 2, 23, 25, 66, 96, 99 neuron 26 Automated diagnostic process 97 Automatic 64, 70 brain tumor 64 contouring architectures 70
B Berkeley wavelet transform (BWT) 96, 100 Binary classification tasks 75 Biochemical processes 10 Biological 2, 10, 23, 25 nervous system 23, 25 neural networks 2, 10 Biomedical 3, 48, 49, 56, 57, 59 informatics 49 problems 56 technology-based 48 Blockchain 50, 51, 52, 53, 54, 55, 56, 57, 59, 60 platform 51, 52, 56 technology 50, 51, 52, 53, 54, 55, 57, 59, 60 Boltzmann machine(s) (BM) 2, 9, 11, 12, 27 technique 11, 12 Brain 10, 15, 63, 64, 65, 66, 97, 104, 105 cancer 63
function 64 MRI images 65, 66, 97, 104, 105 reconstruction 15 related information 10 Brain tumor(s) 40, 63, 64, 65, 66, 67, 82, 91, 92, 96, 97, 98, 99, 100, 101, 103, 105, 106 datasets 40, 97 detection process 103 detection techniques 98, 99 malignant 101 segmentation 65
C Cadence sensor 9 CAD systems 101 Cancer 1, 6, 7, 9, 63, 93, 105 breast 7 Computational intelligence 23 Computed tomography (CT) 1, 64 Computer-aided 64, 66 diagnostics 66 technologies 64 Contrastive divergence (CD) 13 Convolutional 11, 12, 25, 29, 30, 63, 65, 66, 70, 72, 76, 80, 85, 90, 97 developed five-layer 97 neural network 11, 12, 29, 30, 63, 65, 66, 72, 76, 80, 85 Convolution neural networks (CNN) 1, 2, 6, 7, 8, 9, 10, 11, 12, 15, 29, 63, 66, 73, 80
D Data management 48, 49, 51, 59 in blockchain technology 51 medical 59 Data 53, 54, 59 processing techniques 53
Parma Nand, Vishal Jain, Dac-Nhuong Le, Jyotir Moy Chatterjee, Ramani Kannan, and Abhishek S. Verma (Eds.) All rights reserved-© 2023 Bentham Science Publishers
storage 54 transformation 54 transmission 59 Datasets 10, 16, 57, 58, 59, 66, 67, 68, 69, 73, 80, 85, 86, 100, 101 bitcoin evaluation 57 block chain 58 testing random 85 Decoder processes 42 Deep 2, 5, 8, 10, 12, 24, 26, 27, 28, 65, 76 architecture 5, 8, 10, 24, 26, 27, 76 belief networks (DBNs) 2, 12, 28 Feed-forward Networks 27 model applications 5 neural networks training process 10 residual dilate network 65 Deep convolutional 33, 38, 39, 65 GAN (DCGANs) 33, 38, 39 neural networks 65 Deep learning 3, 12, 13, 14, 15, 18, 24, 28, 69, 103 algorithms 14, 15, 18 era 28 methodology 24 techniques 12, 13, 14, 69, 103 technologies 3 Discrete wavelet 66, 95, 96, 100 transform 95 transformation (DWT) 66, 95, 96, 100 Discriminator 32, 33, 38 learning 32 networks 33, 38 Diseases 1, 5, 14, 91 chronic 5 infectious 1 nervous system 91 rare 14 DNA 8, 43 binding protein specificities 8 designing 43
E Electroencephalograms 10 Energy expenditure (EE) 9
F Feature(s) 9, 56
pooling 9 confidentiality 56 Framework 3, 4, 7, 25, 33, 37, 40, 42, 56, 65, 97 conceptual 25 energy-based 37 prediction-based 7 Frechet inception score 40 Functions 27, 65, 75, 94, 103, 105 linear valued energy 27 luminance pixels probability density 94 normal brain 105 segmenting 65 sigmoid 75 transformation 103
G GANs 32, 43 framework 32 reinforced 43 Gated recurrent units (GRU) 5
H Hardware parallelization 26 Healthcare 4, 56, 57, 60 flooring 4 information system 57 monitoring systems, real-time 60 system, traditional 56 Healthcare data 14, 49 analysis 49 management 49 Hybrid machine learning system 66
I Image(s) 12, 40, 41, 43, 66, 71, 92, 93, 98, 103 filtering process 12 luminance 98 processing, medical 43, 103 processing technology 92 brain 71 magnetic resonance 66 segmentation approaches 93 slice sampling 66
synthesis 40, 41 Image processing 24, 67, 103, 105 applications 103 techniques 24, 67, 105 Imaging techniques 108 Input data 14, 25, 29, 64, 74 multidimensional 29 Input image 41, 76, 93, 94, 95, 98, 99, 103, 105, 106, 107 Input noise 33, 40 source 33 Inverse difference moment (IDM) 104
K Kurtosis 104
L Language processing 12, 31, 42, 43 natural 31, 42, 43 Learning 16, 97, 100 based block-wise fine-tuning strategy 97, 100 computations 16 Learning algorithm 14, 28 machine 14 Long-short term memory (LSTM) 5, 12, 31, 42
M Machine learning 3, 4, 8, 13, 66, 103 algorithms, traditional 13 traditional 4, 8 techniques 3 technologies 66, 103 Magnetic resonance imaging (MRI) 1, 6, 7, 15, 63, 64, 65, 66, 92, 105 Management 8, 53, 57, 59 chronic illness 8 data flexibility 53 Manual inspection technique 64 Mean square error (MSE) 99, 103 Mechanism 2, 4, 48, 50, 51, 52, 53, 55 access control 52, 55 client-server 51 Medical 90, 92 imaging applications 92
resonance imaging 90 Minimum mean square error (MMSE) 99 MRI images 63, 67, 85, 90, 97, 98, 99, 101, 104, 108 medical 99 Multi 24, 38 layer perceptron (MLP) 38 threaded implementations 24
N Network 1, 2, 4, 10, 11, 12, 13, 14, 24, 25, 31, 32, 51, 52, 53, 71, 77, 78, 86 bayesian 12 biological 10 deep 25 deeper 78, 86 pair 32 Neural 11, 38 network’s output 38 network technique 11 Neurons 2, 9, 10, 26, 28, 64, 75 dead 75 work 10 Noise 39, 90, 91, 93, 96, 99, 101, 103, 107 elimination 93 injection 39
O ODTWCHE 98, 102, 106, 107, 108 approach 107 methodology 108 technique 98, 102, 106, 108 Omics profiles 16 Operations 69, 83 mask 83 transformation 69 Osteoarthritis 6, 7 Otsu’s thresholding method 95
P Particle swarm optimization (PSO) 95, 102 Principal component analysis (PCA) 66 Probabilistic neural network 96 Processing 10, 26, 30, 43, 50, 52, 55, 56, 101, 103 speech-language 43
Progressive 38 GAN (PGGAN) 38 neural networks 38
R Range optimization process 99 Real-time healthcare data analysis 50 Recurrent neural nets (RNNs) 2, 5, 6, 9, 11, 12, 15, 30, 31, 42 Regression techniques 27 Reinforcement learning problem 24 Residual-in-residual dense block (RRDB) 39 Restricted boltzmann machines (RBMs) 2, 7, 8, 9, 10, 13, 27, 28
S Sclerosis lesions 6 Services 13, 49, 50, 51, 52, 54, 56, 59, 60 ambulance 54 cloud-based 13 health-based 50 traditional healthcare 49 transferable healthcare 49 Skills based techniques 37 Software accelerator 9 Splicing activity 8 SRGAN model 41
T Transfer learning 40, 85, 100 method 40, 85 techniques 100 Tumor(s) 84, 91, 96, 101, 104, 105 features MRI brain 104, 105 malignant 91, 96, 101 visualization 84 Tumor cells 64, 65 fast-growing 64 Tumor detection 64, 66, 82, 86, 90, 93, 95, 96, 98, 101, 103, 104, 105, 108 based brain 96
U Unified medical language system (UMLS) 2
W Wearable sensors 8 World Health Organization (WHO) 91