E. Priya • V. Rajinikanth, Editors

Signal and Image Processing Techniques for the Development of Intelligent Healthcare Systems
Editors
E. Priya, Electronics and Communication Engineering, Sri Sairam Engineering College, Chennai, Tamil Nadu, India
V. Rajinikanth, Electronics and Instrumentation Engineering, St. Joseph's College of Engineering, Chennai, Tamil Nadu, India
ISBN 978-981-15-6140-5    ISBN 978-981-15-6141-2 (eBook)
https://doi.org/10.1007/978-981-15-6141-2
© Springer Nature Singapore Pte Ltd. 2021

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd. The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721, Singapore
Preface
Healthcare systems are essential to diagnose a disease in its early phase and to execute a possible treatment process to control or cure that disease. Most healthcare systems assist the doctor during the disease detection and treatment planning process. Owing to the availability of modern tools, semi-automated and automated disease detection methods are widely suggested by researchers to diagnose diseases in various organs. In hospitals, bio-image and bio-signal based methods are widely used to examine the condition and progression of various diseases.

The purpose of this book is to help beginners, researchers, and doctors understand the basic concepts and recent advancements in medical image and signal-assisted disease detection systems. This book is concerned with supporting and enhancing medical data assessment through a variety of real-world medical data analyses. It presents a well-established forum to discuss the characteristics of traditional and recent approaches to disease examination with a chosen image/signal procedure. The book is intended for professionals, scientists, and engineers concerned with medical data examination methods using deep-learning and machine-learning techniques, and it also provides an outstanding foundation for undergraduate and postgraduate students. Its features include a solid grounding in medical data analysis and extensive studies of applications and challenges for systems that use medical data examination procedures.

The book is organized as follows:

Chapter 1 provides a procedure for brain tumor detection using the benchmark BRATS database. In medical imaging, brain tumor detection and recognition from MRI examination are essential for both the analysis and processing of brain cancers. This chapter proposes an integrated framework for brain tumor recognition based on fuzzy C-means and multi-property feature reduction. Three primary steps are involved in this work: auto skull stripping and tumor contrast stretching performed through a combination of well-known filtering methods, followed by segmentation of the tumor with fuzzy C-means; fusion of multi-property features such as shape, texture, point, and Gabor wavelet by weight assignment; and finally neighborhood component analysis to remove irrelevant features from the fused feature vector. The final compressed feature vector is fed
to a one-against-all support vector machine and achieves accuracies of 100% and 96.3% on the BRATS2013 and BRATS2015 datasets, respectively.

Chapter 2 implements an image processing technique to examine brain tumor and ischemic stroke from an MRI slice. The clinical-level diagnosis of these conditions is normally carried out with MRI, a well-known imaging technique, due to its multi-modality and proven nature. This work implements a hybrid imaging technique combining thresholding and segmentation. Thresholding, executed with the Brain-Storm-Optimization based Otsu/Kapur function, helps improve the visibility of the abnormal part. Later, the abnormal part is extracted from this image using a chosen segmentation technique. A detailed assessment of existing segmentation procedures, such as watershed, active-contour, level-set, and region growing, is presented. Finally, a study comparing the extracted part with the ground truth based on Jaccard, Dice, and accuracy validates the performance of the proposed technique.

Chapter 3 presents an asymmetrical pattern of thermal image analysis with Coherence Enhanced Diffusion Filter (CEDF) based reaction diffusion level set and Curvelet Transform (CT). The breast tissues are segmented using the Reaction Diffusion Level Set Method (RDLSM), in which an edge map generated by CEDF acts as the edge indicator. The left and right breast regions are separated from the segmented images, and abnormal and normal sets are established from the pathological and healthy conditions of the separated regions. Three levels of curvelet decomposition are performed on these tissues, and texture features such as contrast, dissimilarity, and difference of variance are determined from the extracted curvelet coefficients. The results show that coherence enhanced diffusion filter based RDLSM is able to segment the breast regions, and this technique shows a higher correlation between the ground truth and segmented output than the conventional level set method.

Chapter 4 establishes that capturing analytical data from the fusion of medical images is a demanding and emerging area of exploration. This work proposes a methodology that strengthens lung tumor examination for mass screening by effectively fusing CT and PET images. The existing system automatically differentiates lung cancer in PET/CT images using a framework study, and Fuzzy C-Means (FCM) was developed successfully. The preprocessing approach strengthens the certainty of cancer revelation, and semantic operations implement authentic lung ROI extraction. The elementary problem, however, is that directionality and phase information are not resolved. To defeat this problem, the proposed model uses the Dual-Tree Complex Wavelet Transform (DTCWT) as a method for image fusion. The fusion outcome is improved when DTCWT is compared with the Discrete Wavelet Transform (DWT); it also improves the PSNR, entropy, and similarity values. The proposed work presents a region growing technique for segmentation of CT, PET, and fused images.
Chapter 5 discusses the early diagnosis of breast cancer using medical thermographs. The inherent limitations of breast thermal images are low contrast and a low signal-to-noise ratio, which make segmentation a challenging task. In this chapter, segmentation of breast thermal images is attempted using weighted level set evolution (WLSE) with a phase congruency edge map. The segmented outputs are validated using an overlap measure and regional statistics measures such as efficiency (EFI), Youden (Y), and ROI indices. The segmentation performance of WLSE is also compared with the segmented output of DRLSE. Results show that the WLSE method could handle the inherent limitations of breast thermal images and could segment the breast regions with minimal information loss. The regional statistics measures EFI and Y-index are found to be 0.97 and 0.95, respectively, which indicate good correlation between the segmented and ground truth images. Hence, the weighted level set method could be used to extract the breast tissue from other regions for automated analysis and clinical diagnosis.

Chapter 6 provides information on a MEMS-based gripper tool, not available in the market, for handling thin components with a large surface area; it is capable of functioning even in moisture, as the tool is devised from a polymer material. The proposed device has a default in-plane displacement of less than 500 microns when the tools are completely in the closed position. The device works on the principle of a push-pull actuation method and can hold components with a thickness of about 300 microns, with the entire device controlled precisely by a screw-based actuation mechanism. The device can be fabricated by a rapid prototyping process, and a structural mechanics simulation study is carried out with the COMSOL Multiphysics simulation software to identify the appropriate results.

Chapter 7 discusses musculoskeletal disorders, which are a major concern globally. They cause pain and suffering to the individual, resulting in productivity loss in both the manufacturing and service sectors, which has an adverse impact on the economy. The main aim of this chapter is to design and test an upper body exoskeleton arm for empowering able-bodied, that is, healthy users. The exoskeleton arm is intended to power or amplify the ability of the human elbow. An inverse dynamic model of the system is simulated through kinematic analysis and workspace analysis. Upon validation of the model, the mechanical system design constitutes the material selection for the exoskeleton frame, the design for joint imitation, the load sharing of the complete mechanical structure, and the dimensioning. The electrical system of the prototype constitutes the important issues of actuator selection, power supply requirement analysis, control scheme design, component selection, and controller selection. The two systems are integrated, and the orchestrated motion of the prototype and human arm is tested under different conditions.

Chapter 8 presents image fusion, a method by which data from multiple images are incorporated into a single image in order to enhance image quality and reduce artifacts, randomness, and redundancy. Image fusion plays a vital role in medical diagnosis and treatment. The objective of image fusion is to
process the content at every pixel position in the input images and sustain the data from the image that represents the genuine scene or upgrades the potency of the fused image for a given application. The fused image provides an intuition into the data contained within multiple images in a single image, which facilitates physicians to diagnose diseases in a more effective manner. Though numerous singular fusion strategies yield optimal results, the focus of researchers is moving towards hybrid fusion techniques, which can exploit the attributes of both multi-scale and non-multi-scale decomposition methods.

Chapter 9 implements the detection of breast cancer using the mammography technique, which is used to detect and analyze the level of cancer. The existing methods use a specific level of analysis which is not suitable for accurate prediction and identification. Normally, cancer is identified based on the region in which the cancer is present. Such methods never consider the outlier, that is, the occurrence of cancer outside the region; this is addressed by introducing a multilevel convex hull based analysis. The hulls are formed based on the closeness of the image pixels. Outlier pixels are also considered by taking the sub convex hull, which detects the presence of cancer and helps prevent its spread to other regions of the body. The main objective of the proposed method is to perform inside- and outside-region based detection for breast cancer identification using a convex hull based approach.

Chapter 10 shows a technique to diagnose retinal abnormality by analyzing the abnormalities in retinal fundus images. In diabetic patients, the blood vessels become abnormal over time, resulting in blockages. These abnormalities lead to the development of various types of aberrations in the retina; high sugar levels make the blood vessels defective and result in the formation of bright and dark lesions. The risk of loss of vision is reduced by detecting lesions through analysis of the fundus image in the early stage of diabetic retinopathy. This chapter reviews various studies on automatic abnormality detection in fundus images with the purpose of easing the work of researchers in the field of diabetic retinopathy. A condensed study of the methodology and performance analysis of each detection algorithm is laid out in a simple table form, which can be reviewed effortlessly by any researcher to realize the advantages and shortcomings of each of these algorithms. This chapter highlights various research protocols for the detection and classification of retinal lesions and provides guidance to researchers working on retinal fundus image processing.

Chapter 11 presents medical imaging systems in light of the advent of fast computing systems and multimedia technologies. Images from various imaging modalities are processed, stored, and transmitted for different reasons. This attracts huge investment in high-end computing systems to ensure security for huge volumes of medical data. Medical images are manipulated intentionally or unintentionally for analysis, diagnosis, access to patient data, and research. Hence, providing authenticity and security to medical images becomes essential for an entrusted workflow among the medical fraternity. Among the solutions existing at present,
watermarking tends to be promising for copyright, authentication, and ownership issues. This chapter reviews wavelet-based medical image watermarking techniques, the attacks on medical images, and the performance measures. A few existing state-of-the-art methods are discussed with respect to their limitations, level of security, and robustness.

Chapter 12 presents the study and analysis of EEG signal extraction information. In this chapter, several nonlinear techniques used in the literature are discussed, and a case study on the application of sudden unexpected death in epilepsy is analyzed. Structural and bio-signal changes that potentially contribute to sudden death pathogenesis are analyzed using multimodal analysis of brain activity. The localization of the epileptic zone based on scalp EEG was performed with prior information from MRI using the SPM8 toolbox, and combined EEG and ECG features were extracted using the EPILAB software. The results show that the combined biomarkers of MRI volume, EEG-MRI localization, and combined EEG/ECG features provide better results. The combined MRI, EEG, and ECG biomarkers are helpful in the diagnosis of sudden unexpected death in epilepsy and of people at high risk.

Chapter 13 presents a technique to analyze the sEMG signal. The sEMG signal is acquired using a dual-channel amplifier from the below-elbow muscles of one amputee for six actions. The acquired signal is preprocessed using band-pass and band-stop filters to eliminate the noise in the signal. The classification is accomplished using machine learning and deep learning approaches, and the testing of both is implemented on a Raspberry Pi 3 using a Python script. In the machine learning approach, 11 relevant time-domain features are extracted from the preprocessed signal and fed as input to a linear support vector machine for classification. In the second approach, the signals are converted into images for deep learning analysis, and a Convolutional Neural Network (CNN) is used for classification of the six hand actions. The model is trained and tested by varying the number of steps per epoch, and the number of epochs and accuracy are compared with the linear SVM. Results demonstrate that the mean accuracy of the linear support vector machine is 76.66%, while that of the CNN model with 1000 steps per epoch and 10 epochs is 91.66%.

Chapter 14 discusses image processing based tuberculosis (TB) diagnosis. The non-uniform illumination in microscopic digital TB images due to light source optics and camera noise degrades the visual perception of these images. Decomposition-based methods such as Bi-dimensional Empirical Mode Decomposition (BEMD) and Discrete Wavelet Transform (DWT) are attempted to preprocess the sputum smear images. The most appropriate illumination correction method is evaluated by qualitative and quantitative measures: the intensity profile is used for qualitative analysis, and histogram-based statistical features serve as quantitative measures to validate the illumination correction methods. The preprocessed sputum smear images are subjected to threshold-based segmentation methods. Results demonstrate that BEMD performs better than DWT in
removing the non-uniform illumination in the sputum smear images. This helps in identifying the TB objects by the Otsu segmentation method. It is observed from the results that Otsu-based segmentation results in a closer match with the ground truth than maximum entropy-based segmentation. Thus the developed workflow results in better identification of the disease-causing objects, namely the tubercle bacilli. This will further enhance the classification of these images into positive and negative images, aiding mass screening of pulmonary tuberculosis.

Chennai, Tamil Nadu, India    E. Priya
Chennai, Tamil Nadu, India    V. Rajinikanth
Contents
1  An Integrated Design of Fuzzy C-Means and NCA-Based Multi-properties Feature Reduction for Brain Tumor Recognition . . . 1
   Muhammad Attique Khan, Habiba Arshad, Wasif Nisar, Muhammad Younus Javed, and Muhammad Sharif
2  Hybrid Image Processing-Based Examination of 2D Brain MRI Slices to Detect Brain Tumor/Stroke Section: A Study . . . 29
   David Lin, V. Rajinikanth, and Hong Lin
3  Edge-Enhancing Coherence Diffusion Filter for Level Set Segmentation and Asymmetry Analysis Using Curvelets in Breast Thermograms . . . 51
   S. Prabha
4  Lung Cancer Diagnosis Based on Image Fusion and Prediction Using CT and PET Image . . . 67
   J. Dafni Rose, K. Jaspin, and K. Vijayakumar
5  Segmentation and Validation of Infrared Breast Images Using Weighted Level Set and Phase Congruency Edge Map Framework . . . 87
   J. Thamil Selvi
6  Analysis of Material Profile for Polymer-Based Mechanical Microgripper for Thin Plate Holding . . . 103
   T. Aravind, S. Praveen Kumar, G. Dinesh Ram, and D. Lingaraja
7  Design and Testing of Elbow-Actuated Wearable Robotic Arm for Muscular Disorders . . . 119
   D. Manamalli, M. Mythily, and A. Karthi Raja
8  A Comprehensive Study of Image Fusion Techniques and Their Applications . . . 129
   R. Indhumathi, S. Nagarajan, and T. Abimala
9  Multilevel Mammogram Image Analysis for Identifying Outliers: Misclassification Using Machine Learning . . . 161
   K. Vijayakumar and C. Saravanakumar
10 A Review on Automatic Detection of Retinal Lesions in Fundus Images for Diabetic Retinopathy . . . 177
   Remya Koppara Revindran and Mahendra Nanjappa Giriprasad
11 Medical Image Watermarking: A Review on Wavelet-Based Methods . . . 203
   Nagarajan Sangeetha, X. Anita, and Rajangam Vijayarajan
12 EEG Signal Extraction Analysis Techniques . . . 223
   M. Kayalvizhi
13 Classification of sEMG Signal-Based Arm Action Using Convolutional Neural Network . . . 241
   C. N. Savithri, E. Priya, and J. Sudharsanan
14 An Automated Approach for the Identification of TB Images Enhanced by Non-uniform Illumination Correction . . . 261
   E. Priya
About the Editors
E. Priya is a Professor in the Department of ECE at Sri Sairam Engineering College. She holds a B.E. degree in Electronics and Communication Engineering from the University of Madras and an M.E. degree from Madras Institute of Technology, Anna University. She received her Ph.D. in Biomedical Engineering from the same university. With 17 years of teaching experience, she is currently guiding students in the areas of biomechanical modeling and image and signal processing. Her research interests include biomedical imaging, image processing, signal processing, and the application of artificial intelligence and machine learning techniques. A recipient of the DST-PURSE fellowship, she has published several articles in international journals and conference proceedings, as well as book chapters, in the areas of medical imaging and infectious diseases. She also serves on the editorial review board of the International Journal of Information Security and Privacy (IJISP), IGI Global.
V. Rajinikanth is a Professor in the Department of Electronics and Instrumentation Engineering at St. Joseph's College of Engineering, Chennai, India. His research chiefly concerns medical image and signal analysis, including EEG signals, brain MRI assessment, histopathology image analysis, evaluation of dermoscopy images, and ischemic stroke examination using brain MRIs recorded with various modalities. With 18 years of teaching experience in the fields of controller design, artificial intelligence applications, optimization, and biomedical instrumentation, he has published more than 75 research articles in peer-reviewed international journals and conference proceedings and has authored or co-authored 8 book chapters. He edited the book Advances in Artificial Intelligence Systems and currently serves as an Associate Editor for the International Journal of Rough Sets and Data Analysis (IJRSDA).
1 An Integrated Design of Fuzzy C-Means and NCA-Based Multi-properties Feature Reduction for Brain Tumor Recognition

Muhammad Attique Khan, Habiba Arshad, Wasif Nisar, Muhammad Younus Javed, and Muhammad Sharif
Abstract
In medical imaging, brain tumor detection and recognition from magnetic resonance imaging examination are essential for both the analysis and processing of brain cancers. From the literature, it is quite clear that recognition of brain tumors with high accuracy depends on multi-level feature fusion. In this book chapter, we propose an integrated framework for brain tumor recognition based on fuzzy C-means and multi-property feature reduction. Three primary steps are involved in this work. In the first step, auto-skull stripping and tumor contrast stretching are performed through a combination of well-known filtering methods, and the tumor region is then segmented with fuzzy C-means. In the second step, multi-property features such as shape, texture, point, and Gabor wavelet are fused by weight assignment. In the third step, NCA (neighborhood component analysis) based irrelevant features are removed from the fused feature vector (FV). The final compressed FV is fed to a one-against-all support vector machine and achieves an accuracy of 100% and 96.3% on the BRATS2013 and BRATS2015 datasets, respectively. Comparison with other techniques shows that the NCA-based reduction approach outperforms on the selected datasets.

Keywords

Brain tumor · Fuzzy C-means · Feature extraction · Feature reduction · Recognition
M. A. Khan (*) · M. Y. Javed
Department of Computer Science, HITEC University, Taxila, Pakistan
H. Arshad · W. Nisar · M. Sharif
Department of CS, COMSATS University Islamabad, Islamabad, Pakistan
© Springer Nature Singapore Pte Ltd. 2021
E. Priya, V. Rajinikanth (eds.), Signal and Image Processing Techniques for the Development of Intelligent Healthcare Systems, https://doi.org/10.1007/978-981-15-6141-2_1
1.1 Introduction
In the last few years, owing to advancements in computer vision and machine learning, automated e-healthcare systems in medical imaging have provided a lot of help to doctors for better and quicker treatment of patients [1–4]. The human brain is considered one of the most complex organs, as it works with more than 100 billion nerve cells. A brain tumor occurs due to abnormal cell growth and uncontrolled cell division inside the brain. These cells grow rapidly, affect the normal performance of the brain, and also damage the healthy cells [5, 6]. Brain tumors can be classified into two groups, benign (non-cancerous) and malignant (cancerous), and into different grades according to their severity. Low-grade (grade I and II) meningiomas and gliomas are non-cancerous tumors. These types of tumor have a homogeneous shape, can be monitored radiologically or fully eradicated through surgical methods, and typically do not recur [7]. High-grade (grade III and IV) glioblastoma multiforme is a cancerous tumor with a heterogeneous structure which can be treated using chemotherapy, radiotherapy, or a combination of these. As these tumor treatments are life-risking processes, timely diagnosis of a brain tumor at an early stage is essential so that further remedies can be taken [8, 9].

MRI is the most widely used medical imaging technique for visualization of the brain structure and for detection of the size and position of affected tissues in the human body. It produces a detailed image of the structure of the brain from all directions and is able to show different soft tissues at high contrast with high spatial resolution. It is noninvasive, as it does not use harmful radiation, in contrast to other imaging methods like CT scans and X-rays [10, 11]. Research in many developed countries has shown that the death rate due to brain tumors has increased over the past few decades. Multifaceted brain tumors can be categorized into two groups based on the tumor growth pattern, their origin, and their malignancy.
1.1.1 Problem Statement and Contributions

The manual segmentation of tumor regions in MRI images is a difficult and time-consuming process. Generally, tumors vary largely in appearance and structure, and the great differences in the size and location of a tumor are also a major challenge in this process. Tumorous tissue intensities tend to overlap with those of healthy brain tissue, and a developing tumor can often deflect and change the adjacent brain tissues [6, 12]. An automated system must be designed to overcome human diagnostic faults, for instance, incorrect identification when a great number of brain MRIs are analyzed. Errors in image classification mostly occur because of random noise, low image contrast, weak edges, and non-homogeneity, which are common in medical imaging. Hence, correct classification of medical images, many of which have complex geometry, is required for accurate medical diagnosis [8]. A number of methods for edge detection in
MRI brain tumor images have been presented, for instance, the fuzzy C-means (FCM) method [13], cellular automata [14], neural networks [15], Markov random fields [16], and entropy-based [17, 18], stochastic [19], and random forest [20] methods.

In this chapter, we propose a new automated system for brain tumor segmentation and classification based on fuzzy C-means and neighborhood component analysis (NCA) feature reduction. Our major contributions in this work are:

1. Auto-skull stripping is performed through morphological operators, and tumor visibility is then improved through well-known preprocessing filters.
2. Fuzzy C-means and a thresholding approach are fused for accurate brain tumor segmentation.
3. Multi-property features such as HOG, LBP, MSER, and Gabor wavelet features are fused. Then, the NCA reduction algorithm is applied to handle the curse of dimensionality.
4. A feature-based comparison is conducted with NCA-based reduced features to show how irrelevant features affect the overall accuracy.
1.2 Related Work
A lot of segmentation and feature extraction techniques have been introduced in the literature, such as clustering techniques, statistical methods [21, 22], saliency methods [23], thresholding [24], fusion methods [25], shape features, texture features [26], point features [27], and a few more for recognition [28–31]. Of these, clustering algorithms and texture features are the most important in medical imaging for segmentation and classification of abnormal deformities. In the medical domain, the diagnosis and treatment of a brain tumor is an essential step for the survival of human life [32]. Therefore, accurate segmentation of the tumor always generates good features for correct classification. However, the presence of noise and inaccurate tumor segmentation sometimes produces irrelevant and high-dimensional features. This kind of problem is resolved by reduction algorithms, known as feature reduction.

Selvapandian et al. [33] presented a glioma tumor detection and segmentation method for brain MR images. In the presented approach, the original MRI scans are enhanced through a fusion method based on the non-subsampled contourlet transform and texture feature extraction. An adaptive neuro-fuzzy (ANF) inference system is applied for training the extracted features, which later classifies the scans into normal and glioma. Morphological methods are then used for segmentation of the tumor regions in brain MR images. The validation of this method is done on a publicly available brain MRI dataset and shows improved accuracy.

Lahmiri et al. [34] presented three automated analysis methods for identification of glioma in brain MRIs. To segment the MRI image, each method utilizes a particle swarm optimization (PSO) variant, namely Darwinian PSO, classical PSO, and fractional-order Darwinian PSO. Using these segmented images, the directional spectral distribution mark is evaluated. Later, by using generalized Hurst exponents
(HE), multifractals of the evaluated directional spectral distribution are assessed. Finally, a support vector machine (SVM) is applied on the evaluated multifractals for classification, and a leave-one-out cross-validation approach is utilized to evaluate the performance of the three methods.

Soltaninejad et al. [35] presented an automated brain tumor identification and segmentation method using fluid-attenuated inversion recovery MR imaging. This method employs a superpixel approach, and from these superpixels Gabor textons, fractal analysis, and statistical and curvature intensity features are computed. For classification of each superpixel into tumor and healthy regions, an extremely randomized trees (ERT) classifier is utilized. The presented method shows improved performance on a private clinical dataset and the BRATS 2012 dataset.

Reddy et al. [36] described an automated system for brain tumor and pancreatic cancer detection using accurate segmentation and classification algorithms. The presented algorithm consists of three phases: (i) in preprocessing, a median filtering model is applied; (ii) segmentation is performed through the fuzzy C-means segmentation method; and (iii) significant features such as the gray-level co-occurrence matrix are selected. The presented method is tested on the Harvard Medical School database and the cancer imaging collection with improved performance.

Kebir et al. [37] introduced a recognition and segmentation method for brain tumors in MRI scans. The introduced approach consists of two primary steps: (i) skull stripping and (ii) tumor auto-detection and segmentation. For these two steps, Gaussian mixture model, fuzzy C-means, active contour, wavelet transform, and entropy segmentation approaches are utilized. The first step is assessed using the OASIS, IBSR, and LPBA40 datasets, and the second step is measured using the BRATS dataset. The achieved results illustrate that the presented method gives better results.

Rajesh et al. [38] presented a brain tumor detection and classification method for brain MRI images. In this method, rough set theory is used for feature extraction, and a particle swarm optimization neural network (PSONN) is applied for the classification of healthy and unhealthy brain MRI images.
1.3 Proposed Methodology
The proposed automated brain tumor recognition system is based on three primary steps: tumor visibility enhancement, tumor segmentation through fuzzy C-means, and multi-property feature fusion and reduction. The mainstream design of the proposed system is shown in Fig. 1.1.

Fig. 1.1 Proposed architecture of automated brain tumor recognition through fusion of multiple features and reduction
1.3.1 Tumor Contrast Enhancement

Contrast enhancement (CE) is an essential step in any computer vision application, and it is especially important in medical imaging due to the size and color of infection or disease symptoms [39, 40]. Captured images often have low resolution owing to several factors such as the capturing instrument, lighting condition, noise, and occlusion. These problems hide important information in the image, such as infected regions. CE techniques resolve these problems and improve the visualization of infected regions such as a brain tumor.

In this work, we initially perform auto-skull stripping to remove irrelevant regions from brain MRI scans. The skull stripping is performed using a few morphological operations such as opening, area removal, and thresholding. At first, a threshold value is selected to convert the original image into binary, and then the extra regions are removed through an area removal operation. Finally, the opening operation is performed, and the refined image is mapped back to gray format. The effect of this process is shown in Fig. 1.2, which clearly shows that after skull stripping the extra regions are removed from the original image (Fig. 1.2d).

Fig. 1.2 Automatic skull stripping using thresholding and morphological operations. (a) Original image, (b) thresholding image, (c) refinement after morphological operations, and (d) skull-stripped image

After that, contrast enhancement is performed using a top-hat filtering operation, which enhances the foreground region such as the tumor part. Top-hat filtering includes both opening and closing operations. Letting $X = \{(u, f(u)) \mid u \in P,\ P \subseteq E^2\}$ and the structuring element $S = \{(v, b(v)) \mid v \in Q,\ Q \subseteq E^2\}$, the opening top-hat operation is defined by the following equations:
$$\psi_{X,S}(u) = (X - X \circ S)(u) \tag{1.1}$$

$$X \circ S = (X \ominus S) \oplus S \tag{1.2}$$

$$(X \ominus S)(u) = \inf_{v \in Q,\ u+v \in P} \{ f(u+v) - b(v) \} \tag{1.3}$$

where $\psi_{X,S}(u)$ is the opening top-hat filtered image; its effects are shown in Fig. 1.3.

Fig. 1.3 Contrast enhancement results using opening top-hat filtering operation. The above row shows the original skull stripping images, whereas the bottom row shows the filtering image
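The skull-stripping and top-hat steps above can be sketched compactly with standard morphology routines. The following Python fragment is a minimal illustration using OpenCV; the threshold value, structuring-element sizes, and the final contrast-addition step are our assumptions, not parameters reported in this chapter.

```python
import cv2
import numpy as np

def strip_skull_and_enhance(mri_gray, thresh_val=40, tophat_size=45):
    # Threshold the scan to separate head tissue from the dark background
    _, mask = cv2.threshold(mri_gray, thresh_val, 255, cv2.THRESH_BINARY)
    # Area removal: keep only the largest connected component (the brain)
    n, labels, stats, _ = cv2.connectedComponentsWithStats(mask)
    largest = 1 + int(np.argmax(stats[1:, cv2.CC_STAT_AREA]))
    mask = np.uint8(labels == largest) * 255
    # Morphological opening refines the brain mask
    k = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (11, 11))
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, k)
    brain = cv2.bitwise_and(mri_gray, mri_gray, mask=mask)
    # Top-hat: image minus its opening (Eqs. 1.1-1.3), boosts bright regions
    se = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (tophat_size, tophat_size))
    tophat = cv2.morphologyEx(brain, cv2.MORPH_TOPHAT, se)
    return cv2.add(brain, tophat)
```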
1.3.2 Fuzzy C-Means (FCM)-Based Segmentation

The FCM is a fuzzy clustering approach based on the minimization of a quadratic criterion, where clusters are characterized by their corresponding centers [41]. In general, FCM is an iterative clustering process, and the result is attained by continuously updating the cluster centers and membership values. It allocates a value to each class using fuzzy memberships, and the update is achieved by computing the cost function. Suppose that for the $N$ data samples $T = (t_1, t_2, \ldots, t_N)$, the method splits the data samples by computing the class centers $x$ and the membership values $M$; according to these centers and membership values, the minimization of an objective function $I_{fcm}$ is represented as
$$I_{fcm} = \sum_{l=1}^{c} \sum_{m=1}^{N} \mu_{lm}^{n} \left\| t_m - x_l \right\|^2 \tag{1.4}$$

$$\sum_{l=1}^{c} \mu_{lm} = 1, \quad \mu_{lm} \in [0, 1], \quad 0 \le \sum_{m=1}^{N} \mu_{lm} \le N \tag{1.5}$$
where $N$ is the number of samples, $c$ is the number of clusters, and $n$ is any real number ($n > 1$) that manages the fuzziness of the resultant partitioning; $\{x_l\ (1 \le l \le c)\}$ represents the cluster centroids, and $\mu_{lm}\ (1 \le l \le c,\ 1 \le m \le N)$ shows the fuzzy membership value of $t_m$ belonging to the $l$th cluster. $\|\cdot\|$ represents the Euclidean norm, and in this work $n = 2$ is utilized. The objective function is reduced when greater membership values are given to input samples that are near their adjacent clusters, and minimum membership values are given when they are far away from the cluster centers. Minimizing the objective function $I_{fcm}$ using the Lagrange multiplier method, the condition is defined as

$$\sum_{l=1}^{c} \mu_{lm} = 1 \tag{1.6}$$

so we have $\frac{\partial I_{fcm}}{\partial \mu_{lm}} = 0$ and $\frac{\partial I_{fcm}}{\partial x_l} = 0$. Updating the membership function $\mu_{lm}$ and the cluster centroid $x_l$ gives the iterative solutions described by the following equations:
$$\mu_{lm} = \frac{1}{\sum_{i=1}^{c} \left( \left\| t_m - x_l \right\| / \left\| t_m - x_i \right\| \right)^{\frac{2}{n-1}}} \tag{1.7}$$

$$x_l = \frac{\sum_{m=1}^{N} \mu_{lm}^{n}\, t_m}{\sum_{m=1}^{N} \mu_{lm}^{n}} \tag{1.8}$$
The segmentation effects after FCM clustering are shown in Fig. 1.4. After segmentation of the tumor region, the boundary is drawn through active contours to show the actual tumor region on the original image. Later, the extracted tumor regions are utilized in the feature extraction step, as shown in Fig. 1.1.

Fig. 1.4 Segmentation results of FCM. (a) Original scan, (b) segmented scan, and (c) mapped scan
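For readers who want to reproduce the clustering step, the following is a minimal sketch of the FCM iteration of Eqs. (1.7) and (1.8) applied to pixel intensities; the cluster count, the initialization, and the choice of the brightest cluster as the tumor are illustrative assumptions, not settings reported by the chapter.

```python
import numpy as np

def fuzzy_c_means(pixels, c=4, n=2.0, max_iter=100, tol=1e-5, seed=0):
    rng = np.random.default_rng(seed)
    t = pixels.astype(float).reshape(-1, 1)          # N x 1 samples
    x = rng.choice(t.ravel(), size=c).reshape(1, c)  # initial centers
    for _ in range(max_iter):
        d = np.abs(t - x) + 1e-12                    # N x c distances
        # Eq. (1.7): mu = 1 / sum_i (d_l / d_i)^(2/(n-1))
        mu = 1.0 / np.sum((d[:, :, None] / d[:, None, :]) ** (2.0 / (n - 1)),
                          axis=2)
        # Eq. (1.8): x_l = sum_m mu^n t_m / sum_m mu^n
        x_new = (mu ** n * t).sum(axis=0) / (mu ** n).sum(axis=0)
        if np.max(np.abs(x_new - x)) < tol:
            break
        x = x_new.reshape(1, c)
    return mu.argmax(axis=1), x.ravel()              # hard labels, centers

# Tumor mask sketch: pixels assigned to the brightest cluster, e.g.
# labels, centers = fuzzy_c_means(image.ravel())
# mask = (labels == centers.argmax()).reshape(image.shape)
```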
1.3.3 Feature Extraction and Reduction

Feature extraction and reduction are primary steps in pattern recognition and machine learning for classifying objects into their relevant class. Several types of features have been extracted in the medical imaging literature, such as shape, texture, color, point, and wavelet features. The importance of any feature depends on the number of extracted features and their pattern information, but the presence of extraneous features diminishes the recognition accuracy of an automated system. In this work, we extract four different types of features, LBP, HOG, MSER, and Gabor wavelet, from the segmented tumor images. These features are fused by a serial-based approach with the major aim of combining their pattern information. The fusion has advantages and disadvantages: it improves the overall accuracy, but it increases the computational time and adds some redundant information. This challenge is resolved through reduction of the redundant features; therefore, we utilize the NCA method for feature reduction. Each step is described in detail below.

1.3.3.1 LBP Features

The most useful texture features used in pattern recognition are local binary patterns (LBP), which extract the texture information of an object in a given image. Here, we extract the texture information of the tumor through LBP features [42, 43]. To extract an LBP feature, a binary value is generated for each pixel by constructing a 3 × 3 window around it. Each neighbor pixel of the 3 × 3 window is compared with its center pixel, which acts as the threshold value: pixel values greater than or equal to the threshold output 1, and lower values output 0, as shown in Fig. 1.5 and described mathematically in Eq. (1.9):
Fig. 1.5 Representation of threshold for LBP feature extraction
$$L = \sum_{i=0}^{7} f(I_i - I_c)\, 2^i \tag{1.9}$$
where $I_c$ denotes the central pixel value, $I_i$ is the current neighboring pixel, $f(x)$ is a threshold function, and the $2^i$ weights encode the different patterns found in an image. The index $i$ iterates eight times because of the eight neighboring pixels in a 3 × 3 window. The threshold function $f(x)$ is defined by Eq. (1.10):

$$f(x) = f(I_i - I_c) = \begin{cases} 1 & \text{if } x \ge I_c \\ 0 & \text{if } x < I_c \end{cases} \tag{1.10}$$
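A direct translation of Eqs. (1.9) and (1.10) into code is shown below as a minimal sketch; the neighbor ordering is one of several valid conventions and is our choice.

```python
import numpy as np

def lbp_image(img):
    img = img.astype(int)
    # offsets of the 8 neighbors in a 3 x 3 window, in a fixed order
    offs = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
            (1, 1), (1, 0), (1, -1), (0, -1)]
    out = np.zeros_like(img)
    for i, (dy, dx) in enumerate(offs):
        neigh = np.roll(np.roll(img, dy, axis=0), dx, axis=1)
        out += (neigh >= img) * (1 << i)   # f(x) = 1 if neighbor >= center
    return out[1:-1, 1:-1]                 # drop the wrapped border

# A 256-bin histogram of lbp_image(tumor_roi) serves as the texture feature.
```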
1.3.3.2 HOG Feature

The HOG feature is widely utilized in the computer vision community for object detection and localization [44]. It is based on the theory that an object's shape can be recognized through the direction of its edges. In this work, the segmented tumor regions are given as input, and in the first step the gradient information is computed in both the horizontal and vertical directions, denoted by $G_x$ and $G_y$. The gradient $G_x$ is computed along the horizontal axis $h_x$, whereas $G_y$ is computed along the vertical axis $h_y$, by the following equations:

$$G_x = \frac{\Delta(h_x + 1, h_y) - \Delta(h_x - 1, h_y)}{(h_x + 1) - (h_x - 1)} \tag{1.11}$$

$$G_y = \frac{\Delta(h_x, h_y + 1) - \Delta(h_x, h_y - 1)}{(h_y + 1) - (h_y - 1)} \tag{1.12}$$
In the second step, the orientation and magnitude are computed from the gradients for all pixels in the given image. In the third step, histograms are computed for each 8 × 8 cell of the image; the main aim of this step is to capture the local tumor information in different patches. In the fourth step, the computed histogram values are grouped into 16 × 16 blocks, and the resultant values are normalized. Finally, a feature vector of dimension N × 3780 is obtained. The visual representation of the HOG feature on a segmented tumor image is shown in Fig. 1.6.

Fig. 1.6 Visualization effects of HOG feature on segmented tumor image
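The four steps above can be reproduced with a standard HOG implementation. The sketch below uses scikit-image; the 64 × 128 resize, which yields the 3780-dimensional vector mentioned above, is our assumption rather than a reported setting.

```python
import cv2
from skimage.feature import hog

def hog_vector(tumor_roi):
    # Resize so that 8x8 cells and 2x2-cell (16x16 pixel) blocks yield
    # the 3780-dimensional descriptor: 7 x 15 blocks x 36 values
    roi = cv2.resize(tumor_roi, (64, 128))  # (width, height)
    return hog(roi, orientations=9, pixels_per_cell=(8, 8),
               cells_per_block=(2, 2), block_norm='L2-Hys')
```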
1.3.3.3 MSER Features

Maximally stable extremal region (MSER) features [45] are used to find threshold levels and extremal regions of an image by examining the change of area with respect to the change of intensity. Here, the change in area is normalized by the area of the segmented tumor pixels, which is utilized as a resistance measure. Mathematically, the resistance is defined by the following equation:
$$S_t(\xi_t) = \frac{A(\xi_t)}{D_t(\xi_t)} \tag{1.13}$$
where $S_t(\xi_t)$ is the resistance of a region $\xi_t$, $A(\xi_t)$ is the area of the connected tumor region, and $D_t$ is the first derivative of the region $\xi_t$. This approach is affected by the intensity of light; therefore, we extract these features from segmented tumor images and import the extremal regions, because a connected region is defined by its border and intensity function. These features are extracted from segmented images because stable regions keep their thresholds close under multi-scale detection, which detects both small and large structures. The visual representation of the MSER feature is shown in Fig. 1.7.

Fig. 1.7 Representation of MSER feature
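As an illustration, OpenCV ships an MSER detector implementing this stability criterion; the minimal sketch below uses its default parameters, which are not necessarily those used by the authors.

```python
import cv2

def mser_regions(gray):
    # Keeps regions whose area is stable across intensity thresholds,
    # i.e., regions with a low resistance S_t in Eq. (1.13)
    mser = cv2.MSER_create()
    regions, bboxes = mser.detectRegions(gray)
    return regions, bboxes  # pixel lists of stable regions and their boxes
```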
1.3.3.4 Gabor Wavelet Feature

Based on its multi-resolution and multi-orientation properties, this feature is very useful in computer vision applications such as face recognition and medical imaging [46, 47]. By using different angles and scales, the Gabor projection provides a high-dimensional Gabor coefficient matrix containing unnecessary features. Therefore, we remove the extra features using a few filtering processes to achieve good accuracy. In this work, a DWT filtering approach is utilized; the selection function of the discrete wavelet transform (DWT) feature selection is defined by the following equation:
$$D = \int_{-\infty}^{\infty} \left| \alpha(x) \right|^2 dx \tag{1.14}$$
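A hedged sketch of a Gabor feature bank is given below; the chapter does not report its exact scales, orientations, or the DWT-based selection settings, so the kernel parameters and the mean/standard-deviation summary here are purely illustrative.

```python
import numpy as np
import cv2

def gabor_features(gray, scales=(7, 11, 15), n_orient=4):
    feats = []
    for ksize in scales:                       # several filter scales
        for k in range(n_orient):              # several orientations
            theta = k * np.pi / n_orient
            kern = cv2.getGaborKernel((ksize, ksize), sigma=3.0,
                                      theta=theta, lambd=8.0, gamma=0.5)
            resp = cv2.filter2D(gray.astype(np.float32), cv2.CV_32F, kern)
            feats += [resp.mean(), resp.std()]  # summary of each response
    return np.array(feats)
```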
1.3.3.5 Features Fusion and Reduction

The extracted multi-property features are fused in a serial-based manner [48] to combine the information of all feature types in one vector. Serial-based fusion means simple concatenation of all the features into one matrix. This method increases the execution time and consumes much memory because of the vector length. The serial-based fusion is defined by Eq. (1.15):

$$F_{\alpha} = \left( \mathrm{LBP}_{\alpha 1},\ \mathrm{HOG}_{\alpha 2},\ \mathrm{MSER}_{\alpha 3},\ \mathrm{WT}_{\alpha 4} \right) \tag{1.15}$$
where $F_{\alpha}$ denotes the serially fused feature vector (FV), $\mathrm{LBP}_{\alpha 1}$ is the LBP FV, $\mathrm{HOG}_{\alpha 2}$ is the HOG FV, $\mathrm{MSER}_{\alpha 3}$ is the point FV, and $\mathrm{WT}_{\alpha 4}$ is the wavelet transform FV, respectively. A few of the features in the fused FV are redundant and noisy; thus, they must be removed before recognition. To tackle this challenge, we use the neighborhood component analysis (NCA) approach for irrelevant feature reduction. NCA [49] is non-parametric and makes no supposition about the contour of the category classification. Its major advantage is that no information is lost during the feature reduction process [50, 51]. Suppose $T = \{t_1, t_2, \ldots, t_n\}$ are labeled input sample points with fused features $u_i \in F_{\alpha}$ and corresponding labels $\{s_1, s_2, \ldots, s_n\}$. A Mahalanobis distance is defined as

$$d(t_i, t_j) = \sqrt{\left( F_{\alpha} t_i - F_{\alpha} t_j \right)^{\top} \left( F_{\alpha} t_i - F_{\alpha} t_j \right)} \tag{1.16}$$

$$d(t_i, t_j) = \sqrt{\left( t_i - t_j \right)^{\top} F_{\alpha}^{\top} F_{\alpha} \left( t_i - t_j \right)} \tag{1.17}$$
$t_i$ selects the point $t_j$ as its neighbor with some probability $P_{ij}$ and takes its class label from the selected neighbor. Using the Euclidean distance in the transformed feature space, the probability $P_{ij}$ is defined as

$$P_{ij} = \begin{cases} \dfrac{\exp\left( -\left\| F_{\alpha} t_i - F_{\alpha} t_j \right\|^2 \right)}{\sum_{y \ne i} \exp\left( -\left\| F_{\alpha} t_i - F_{\alpha} t_y \right\|^2 \right)}, & j \ne i \\[2ex] 0, & j = i \end{cases} \tag{1.18}$$

As each sample point is taken as a neighbor, the input sample takes all the class labels. So, the probability $P_i$ that the data sample $u_i$ is correctly classified can be measured as follows (denoting the data samples in the same class as $u_i$ by $S_i$):

$$P_i = \sum_{j \in S_i} P_{ij} \tag{1.19}$$
NCA analyzes the transformed feature vector $F_{\alpha}$ to maximize the number of projected feature points that are accurately classified, using the objective function

$$F(F_{\alpha}) = \sum_i P_i = \sum_i \sum_{j \in S_i} P_{ij} \tag{1.20}$$
A gradient rule is obtained by differentiating $F(F_{\alpha})$ with respect to the transformed feature vector $F_{\alpha}$:

$$\frac{\partial F(F_{\alpha})}{\partial F_{\alpha}} = 2 F_{\alpha} \sum_i \sum_{j \in S_i} P_{ij} \left( t_{ij} t_{ij}^{\top} - \sum_{y} P_{iy}\, t_{iy} t_{iy}^{\top} \right) \tag{1.21}$$

where $t_{ij} = t_i - t_j$. If the transformation $F_{\alpha}$ is constrained to be a non-square matrix of dimension $d \times D$, then both linear dimensionality reduction and metric learning can be implemented at the same time. The features are then selected using a stochastic rule, and the best value of the reduction function is 0.4, which means 40% of the features are reduced through this approach for better results. After feature reduction, the selected features, denoted by $\hat{F}$, are fed to a feed-forward neural network (FFwNN) for final recognition.
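The serial fusion of Eq. (1.15) and the NCA-based reduction can be sketched as follows. Note that scikit-learn's NeighborhoodComponentsAnalysis is our substitution for the MATLAB implementation the authors used, and keeping 60% of the dimensions (mirroring the 40% reduction described above) is our reading of the text.

```python
import numpy as np
from sklearn.neighbors import NeighborhoodComponentsAnalysis
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

def fuse_and_reduce(lbp, hog, mser, gabor, labels, keep=0.6):
    # Serial fusion (Eq. 1.15): one concatenated row per scan
    fused = np.hstack([lbp, hog, mser, gabor])
    d = max(1, int(keep * fused.shape[1]))  # keep 60% of the dimensions
    pipe = Pipeline([
        ("scale", StandardScaler()),
        ("nca", NeighborhoodComponentsAnalysis(n_components=d,
                                               random_state=0)),
    ])
    return pipe.fit_transform(fused, labels)  # reduced features F_hat
```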
1.3.3.6 Feed-Forward Neural Network

In the FFwNN structure, the inputs are processed in only one direction, from the input layer to the output layer [52]. The FFwNN is trained on the selected features obtained after the NCA-based reduction approach. The structure of the proposed FFwNN is shown in Fig. 1.8: the first layer is the input layer, followed by two hidden layers and finally the output layer. The number of input neurons depends on the number of selected features. The output is computed through the following equation:

$$Y = \eta \left\{ \beta + \sum_{j=1}^{h} \beta_j\, \Phi\!\left( \lambda_j + \sum_{i=1}^{d} \lambda_{ji} \hat{F}_i \right) \right\} \tag{1.22}$$
where $\beta$ denotes the bias, $\lambda$ the weight values, and $\Phi$ and $\eta$ the activation functions of the hidden and output layers, respectively. The sigmoid function is utilized as the activation in this work.
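A minimal stand-in for this two-hidden-layer network with sigmoid activations (Eq. 1.22) can be built with scikit-learn; the layer widths below are illustrative, as the chapter does not report them.

```python
from sklearn.neural_network import MLPClassifier

# Two hidden layers with logistic (sigmoid) activations, as in Eq. (1.22)
ffwnn = MLPClassifier(hidden_layer_sizes=(64, 32), activation='logistic',
                      max_iter=1000, random_state=0)
# Usage: ffwnn.fit(F_hat_train, y_train); ffwnn.score(F_hat_test, y_test)
```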
1.4 Experimental Setup and Analysis
1.4.1 Experimental Setup

The proposed integrated framework is tested on two datasets, namely, BRATS2013 and BRATS2015. The BRATS2013 dataset consists of a total of 30 subjects including 20 high-grade (HG) and 10 low-grade (LG) scans. From these 30 subjects, we divide the images into three classes, healthy, benign, and malignant, as shown in Fig. 1.9. The BRATS2015 dataset consists of a total of 384 subjects including 220 HG and 54 LG scans, where each scan further includes four modalities, Flair, T1, T1-W, and T2, as shown in Fig. 1.10. The results are evaluated in two phases. In the first phase, a few segmentation results of fuzzy C-means are analyzed on a privately collected dataset which includes 80 images (40 unhealthy and 40 healthy). In the second phase, the recognition results are calculated on the BRATS2013 and BRATS2015 datasets. The recognition results for both datasets are obtained in four different ways, as discussed in the results section. Ten classifiers of different categories, such as decision trees, logistic regression, SVM, fine KNN, and ensemble methods, are utilized for comparison of classification accuracy. All simulations of this framework are performed with the Statistics and Machine Learning Toolbox of MATLAB on a personal computer with an Intel i7 CPU, 16 GB of RAM, and an 8 GB Nvidia GeForce GTX 980 desktop graphics card.

Fig. 1.8 Architecture of FFwNN for selected features after NCA-based reduction

Fig. 1.9 Sample images of BRATS2013 dataset (healthy, benign, and malignant classes)

Fig. 1.10 Sample images of multimodal BRATS2015 dataset (Flair, T1, T1-C, and T2 modalities)
1.4.2 Results and Analysis

The detailed results on BRATS2013 and BRATS2015 are presented in this section in terms of numerical values and graphical plots. The results on each dataset are calculated in four steps: fusion of LBP and HOG features along with NCA reduction; fusion of HOG, LBP, and MSER features along with NCA reduction; fusion of all features along with NCA-based 20% feature reduction; and finally NCA-based 40% feature reduction. The recognition results for each feature set are obtained through a number of classifiers; the numerical results are presented in Table 1.1 (for BRATS2013) and Table 1.6 (for BRATS2015). The detailed recognition results for each dataset are given below.
1.4.2.1 BRATS2013 Dataset Results

The numerical and visual results of the BRATS2013 dataset are described in this section. In Table 1.1, the fusion of LBP and HOG features along with NCA reduction is utilized as input to the classifiers, and the best accuracy of 99.4% is achieved with the feed-forward neural network (FFwNN), which is also verified through the confusion matrix in Table 1.2. Moreover, a graphical comparison of each classifier for the fusion of LBP and HOG features is shown in Fig. 1.11. In Fig. 1.11, the results are obtained
Table 1.1 Proposed recognition results of BRATS2013 dataset (accuracy, %)

Classifier                    | Fusion (LBP + HOG) + NCA | Fusion (LBP + HOG + MSER) + NCA | Fusion (all features) + NCA (20%) | NCA-based 40% feature reduction
Fine tree                     | 96.5 | 96.7 | 97.8 | 98.3
Logistic regression           | 97.0 | 97.6 | 98.2 | 98.6
Linear SVM (O vs. A)          | 96.4 | 94.0 | 97.4 | 98.1
Quadratic SVM (O vs. A)       | 97.9 | 98.2 | 98.6 | 98.7
Medium Gaussian SVM (O vs. A) | 98.0 | 98.5 | 95.6 | 98.9
Fine KNN                      | 92.1 | 90.9 | 93.1 | 95.8
Weighted KNN                  | 92.3 | 94.5 | 97.0 | 97.8
Ensemble bagged trees         | 98.5 | 97.4 | 98.7 | 99.2
Ensemble RUSBoosted trees     | 92.4 | 91.0 | 94.7 | 95.3
FFwNN                         | 99.4 | 99.6 | 99.7 | 99.8
Table 1.2 Confusion matrix for FFwNN using fusion of LBP and HOG features
Tumor class | Benign (%) | Healthy (%) | Malignant (%)
Benign      | 99.6 | 0.1  | 0.3
Healthy     | 1    | 99.0 | -
Malignant   | 0.3  | 0.1  | 99.6
Fig. 1.11 Recognition accuracy after fusion of HOG and LBP features along NCA reduction using cropped images. These results are obtained after 500 times of iterations of this method
Table 1.3 Confusion matrix for FFwNN using fusion of LBP, HOG, and MSER features
Tumor class | Benign (%) | Healthy (%) | Malignant (%)
Benign      | 99.8 | 0.1  | 0.1
Healthy     | -    | 99.0 | 1
Malignant   | -    | -    | 100
after 500 iterations; only a very small change in the results is noted, which shows the consistency of the fusion results. In the second phase, the recognition results are obtained on the fusion of HOG, LBP, and MSER features along with the NCA reduction approach. The best accuracy of 99.6%, achieved with FFwNN, is presented in the third column of Table 1.1 and verified through Table 1.3. The overall average accuracy of this step is improved by 0.2% compared to the first phase. This fusion and reduction step is also iterated up to 500 times, and the variation in the results is shown in Fig. 1.12.
Fig. 1.12 Recognition accuracy after fusion of HOG, LBP, and MSER features along NCA reduction using cropped images. These results are obtained after 500 times of iterations of this method
Table 1.4 Confusion matrix for FFwNN using NCA-based reduction
Tumor class | Benign (%) | Healthy (%) | Malignant (%)
Benign      | 99.9 | 0.1  | -
Healthy     | 0.2  | 99.8 | -
Malignant   | -    | -    | 100
In the third phase, all features are fused, and 20% of the features are reduced by the NCA approach. After this 20% feature reduction, the accuracy of FFwNN increases to 99.7%, whereas the ensemble bagged trees and quadratic SVM achieve accuracies of 98.7% and 98.6%, respectively. The FFwNN results are verified by Table 1.4, and the variation among the results for all classifiers after 500 iterations is shown in Fig. 1.13. In the last phase, 40% of the features are reduced by the NCA approach and fed to the classifiers. The best accuracy of 99.8%, achieved with FFwNN, is presented in Table 1.1 and verified by Table 1.5. The overall results for all classifiers improve after the 40% reduction of the extracted features, but when more than 40% of the features are reduced, important information is also removed, which affects the overall accuracy of the system. The recognition results after 500 iterations are also demonstrated in Fig. 1.14, which clearly shows the consistency of the overall system after 40% feature reduction.
Fig. 1.13 Recognition accuracy after fusion of all features and 20% feature reduction by NCA approach using cropped images. These results are obtained after 500 times of iterations of this method
Table 1.5 Confusion matrix for FFwNN using NCA-based reduction
Tumor class | Benign (%) | Healthy (%) | Malignant (%)
Benign      | 99.9 | 0.1  | -
Healthy     | 0.2  | 99.8 | -
Malignant   | -    | -    | 100
1.4.2.2 BRATS2015 Dataset Results

The recognition results of the BRATS2015 dataset are described in this section and are computed in the same manner as for BRATS2013. In Table 1.6, the numerical results are presented for all four feature-selection approaches. The fusion of LBP and HOG features along with the NCA-based reduction approach achieves a maximum accuracy of 93.7% with FFwNN, whereas the fusion of LBP, HOG, and MSER features along with NCA-based reduction improves the recognition accuracy by almost 1%, from 93.7% to 94.7%. Later, all features are fused, 20% of the features are reduced by the NCA approach, and an accuracy of 95.4% is achieved. Finally, 40% of the features are reduced by the NCA approach, and the accuracy reaches 96.3%. The overall results of FFwNN show a significant improvement, and the results can be verified by the confusion matrices given in Tables 1.7, 1.8, 1.9, and 1.10. In addition, a Monte Carlo simulation is performed in which the entire algorithm is run up to 500 times for all four feature-selection approaches, as shown in Figs. 1.15, 1.16, 1.17, and 1.18. The accuracies plotted in these figures show only a little change after 500 iterations of the proposed method.
Fig. 1.14 Recognition accuracy after fusion of all features and 40% feature reduction by NCA approach using cropped images. These results are obtained after 500 times of iterations of this method
1.4.3 Discussion and Comparison

A brief discussion of the proposed automated tumor recognition approach is given in this section. As shown in Fig. 1.1, the proposed approach consists of three steps: tumor visibility enhancement, tumor segmentation, and multi-property feature extraction and reduction through the NCA approach. The effects of tumor visibility enhancement are demonstrated in Figs. 1.2 and 1.3, and the segmentation of tumor regions is demonstrated in Fig. 1.4. Later, multi-property features are extracted, and their visual representation is shown in Figs. 1.5, 1.6, and 1.7. The validation of the proposed system is conducted on two publicly available datasets, BRATS2013 and BRATS2015. The recognition results on each dataset are computed on four different feature sets: fusion of LBP and HOG features along with NCA reduction; fusion of HOG, LBP, and MSER features along with NCA reduction; fusion of all features along with NCA-based 20% feature reduction; and finally NCA-based 40% feature reduction. The numerical results of the BRATS2013 and BRATS2015 datasets are presented in Tables 1.1 and 1.6 and are verified by the confusion matrices given in Tables 1.2, 1.3, 1.4, 1.5, 1.7, 1.8, 1.9, and 1.10. The best results are achieved with FFwNN, as demonstrated in Figs. 1.11, 1.12, 1.13, 1.14, 1.15, 1.16, 1.17, and 1.18. These results are computed after a Monte Carlo simulation of 500 iterations; the average results show the consistency of the proposed system. In addition, the segmentation results are verified against their ground truth images, as shown in Fig. 1.19. Finally, an extensive comparison of the proposed method with existing techniques is presented in Table 1.11, which demonstrates that the proposed
Table 1.6 Proposed recognition results of BRATS2015 dataset

| Classifier | Fusion (LBP + HOG) + NCA (%) | Fusion (LBP + HOG + MSER) + NCA (%) | Fusion (all features) + NCA (20%) (%) | NCA-based 40% feature reduction (%) |
|---|---|---|---|---|
| Fine tree | 88.6 | 85.4 | 87.4 | 89.8 |
| Logistic regression | 87.0 | 88.2 | 89.4 | 91.0 |
| Linear SVM (O vs. A) | 84.1 | 80.5 | 87.6 | 89.6 |
| Quadratic SVM (O vs. A) | 86.7 | 85.1 | 88.4 | 91.5 |
| Medium Gaussian SVM (O vs. A) | 83.7 | 87.0 | 87.6 | 89.4 |
| Fine KNN | 85.8 | 85.0 | 86.3 | 87.4 |
| Weighted KNN | 89.4 | 89.8 | 89.5 | 90.6 |
| Ensemble bagged trees | 85.8 | 84.1 | 85.7 | 91.9 |
| Ensemble RUSBoosted trees | 88.2 | 87.0 | 88.9 | 89.8 |
| FFwNN | 93.7 | 94.7 | 95.4 | 96.3 |
Table 1.7 Confusion matrix for cubic SVM using fusion of LBP and HOG features

| Tumor class | Flair (%) | T1 modality (%) | T1-C modality (%) | T2 modality (%) |
|---|---|---|---|---|
| Flair | 94.0 | 2 | – | 4 |
| T1 modality | 4 | 91.9 | 3.5 | 0.6 |
| T1-C modality | 2 | 1 | 95 | 2 |
| T2 modality | 2 | 0.9 | 2.9 | 94.1 |
Table 1.8 Confusion matrix for cubic SVM using fusion of LBP, HOG, and MSER features

| Tumor class | Flair (%) | T1 modality (%) | T1-C modality (%) | T2 modality (%) |
|---|---|---|---|---|
| Flair | 94.4 | 4 | 1 | 0.6 |
| T1 modality | 2 | 94.5 | 2.5 | 1 |
| T1-C modality | 3 | 1 | 95.4 | 0.6 |
| T2 modality | 4 | – | 1.4 | 94.6 |
Table 1.9 Confusion matrix for cubic SVM using fusion of all features

| Tumor class | Flair (%) | T1 modality (%) | T1-C modality (%) | T2 modality (%) |
|---|---|---|---|---|
| Flair | 98.1 | 1.3 | – | 0.6 |
| T1 modality | 5 | 94.7 | 0.3 | – |
| T1-C modality | 2 | 3 | 93.2 | 1.7 |
| T2 modality | – | 0.9 | 0.1 | 99.0 |
Table 1.10 Confusion matrix for cubic SVM using NCA-based reduction

| Tumor class | Flair (%) | T1 modality (%) | T1-C modality (%) | T2 modality (%) |
|---|---|---|---|---|
| Flair | 96.4 | – | 3 | 0.6 |
| T1 modality | 2 | 92.5 | 4 | 1.4 |
| T1-C modality | – | 4 | 94.0 | 2 |
| T2 modality | – | 1 | 0.5 | 98.5 |
The comparison in Table 1.11 demonstrates that the proposed results are significantly better than the existing techniques. Moreover, our results are better because existing techniques such as [53] consider only healthy and unhealthy samples, whereas the BRATS datasets include four modalities: T1, T1-weighted contrast (T1-C), Flair, and T2.
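The chapter gives no code for the NCA-reduction-plus-FFwNN stage, so the following scikit-learn sketch is only an approximation: `NeighborhoodComponentsAnalysis` is used here as a projection down to the kept dimensionality (the authors' selection may instead rank per-feature NCA weights), and `MLPClassifier` stands in for the FFwNN; the 40% reduction ratio mirrors Table 1.6.

```python
from sklearn.neighbors import NeighborhoodComponentsAnalysis
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

def build_nca_ffwnn(n_features, reduction=0.40):
    """Pipeline: scale, NCA-project the fused features, classify with an MLP."""
    n_keep = int(round(n_features * (1.0 - reduction)))  # e.g. 40% reduction
    return make_pipeline(
        StandardScaler(),
        NeighborhoodComponentsAnalysis(n_components=n_keep, random_state=0),
        MLPClassifier(hidden_layer_sizes=(100,), max_iter=1000, random_state=0),
    )
```

Combined with the Monte Carlo helper above, `monte_carlo_accuracy(build_nca_ffwnn(X.shape[1]), X, y)` would reproduce the kind of averaged accuracy the chapter reports.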
Fig. 1.15 Recognition accuracy after fusion of HOG and LBP features along with NCA reduction. The accuracy is plotted over 500 iterations of the method
Fig. 1.16 Recognition accuracy after fusion of HOG, LBP, and MSER features along with NCA reduction. The accuracy is plotted over 500 iterations of the method
Fig. 1.17 Recognition accuracy after fusion of all features along with NCA-based 20% feature reduction. The accuracy is plotted over 500 iterations of the method
Fig. 1.18 Recognition accuracy after fusion of all features along with NCA-based 40% feature reduction. The accuracy is plotted over 500 iterations of the method

1.5 Conclusion
We have proposed an automated system for brain tumor recognition using MRI scans. Skull stripping and opening top-hat filtering are performed to improve the visibility of the tumor region, which is later segmented by fuzzy C-means clustering along with thresholding. Multi-properties features are fused by the serial-based approach, and the irrelevant features are reduced through the NCA reduction approach. For comparative analysis of the selected features after NCA-based reduction, different classifiers are utilized, and the FFwNN classifier is selected due to its better numerical results compared to the other classification algorithms. The recognition results demonstrate that the reduction of 40% of the features improves the overall system accuracy. The validation of the proposed system is conducted on the BRATS2013 and BRATS2015 datasets, achieving accuracies of 99.8% and 96.3%, respectively. Moreover, the comparison with existing techniques shows the overall improvement of our system.
Fig. 1.19 Proposed tumor segmentation results using MRI scans. (a) Original image, (b) proposed segmentation scan, (c) ground truth image, and (d) boundary comparison. In (d) the blue line shows the ground truth area, and the red line shows the proposed segmented region

Table 1.11 Comparison with existing techniques for both datasets

| Method | Year | Dataset | Accuracy (%) |
|---|---|---|---|
| [12] | 2017 | BRATS2013 | 93.0 |
| [53] | 2018 | BRATS2013 | 99.8 |
| [53] | 2018 | BRATS2015 | 95.1 |
| [54] | 2017 | BRATS2015 | 90.4 |
| Proposed | 2019 | BRATS2013 | 99.8 |
| Proposed | 2019 | BRATS2015 | 96.3 |
In the future, a deep learning-based system will be implemented, and classification will be performed through reinforcement learning (RL). The advent of RL in the machine learning domain provides better performance.
References 1. Bahadure NB, Ray AK, Thethi HP (2017) Image analysis for MRI based brain tumor detection and feature extraction using biologically inspired BWT and SVM. Int J Biomed Imaging 2017 2. Saba T, Khan MA, Rehman A, Marie-Sainte SL (2019) Region extraction and classification of skin cancer: a heterogeneous framework of deep CNN features fusion and reduction. J Med Syst 43:289 3. Khan MA, Rashid M, Sharif M, Javed K, Akram T (2019) Classification of gastrointestinal diseases of stomach from WCE using improved saliency-based method and discriminant features selection. Multimed Tools Appl:1–28 4. Khan MA, Akram T, Sharif M, Saba T, Javed K, Lali IU et al (2019) Construction of saliency map and hybrid set of features for efficient segmentation and classification of skin lesion. Microsc Res Tech 82:741–763 5. Mohsen H, El-Dahshan E-SA, El-Horbaty E-SM, Salem A-BM (2018) Classification using deep learning neural networks for brain tumors. Future Comput Inform J 3:68–71 6. Khan MA, Lali IU, Rehman A, Ishaq M, Sharif M, Saba T et al (2019) Brain tumor detection and classification: a framework of marker-based watershed algorithm and multilevel priority features selection. Microsc Res Tech 82:909–922 7. Nazir M, Khan MA, Saba T, Rehman A (2019) Brain tumor detection from MRI images using multi-level wavelets. In: 2019 international conference on Computer and Information Sciences (ICCIS), pp 1–5 8. Anitha V, Murugavalli S (2016) Brain tumour classification using two-tier classifier with adaptive segmentation technique. IET Comput Vis 10:9–17 9. Amin J, Sharif M, Yasmin M, Fernandes SL (2017) A distinctive approach in brain tumor detection and classification using MRI. Pattern Recogn Lett 10. Nabizadeh N, Kubat M (2015) Brain tumors detection and segmentation in MR images: Gabor wavelet vs. statistical features. Comput Electr Eng 45:286–301 11. Damodharan S, Raghavan D (2015) Combining tissue segmentation and neural network for brain tumor detection. IAJIT 12 12. Abbasi S, Tajeripour F (2017) Detection of brain tumor in 3D MRI images using local binary patterns and histogram orientation gradient. Neurocomputing 219:526–535 13. Kumar SA, Harish B, Aradhya VM (2016) A picture fuzzy clustering approach for brain tumor segmentation. In: 2016 second international conference on Cognitive Computing and Information Processing (CCIP), pp 1–6 14. Sompong C, Wongthanavasu S (2017) An efficient brain tumor segmentation based on cellular automata and improved tumor-cut algorithm. Expert Syst Appl 72:231–244 15. Havaei M, Davy A, Warde-Farley D, Biard A, Courville A, Bengio Y et al (2017) Brain tumor segmentation with deep neural networks. Med Image Anal 35:18–31 16. Held K, Kops ER, Krause BJ, Wells WM, Kikinis R, Muller-Gartner H-W (1997) Markov random field segmentation of brain MR images. IEEE Trans Med Imaging 16:878–886 17. Rajinikanth V, Satapathy SC, Fernandes SL, Nachiappan S (2017) Entropy based segmentation of tumor from brain MR images–a study with teaching learning based optimization. Pattern Recogn Lett 94:87–95 18. Sert E, Avci D (2019) Brain tumor segmentation using neutrosophic expert maximum fuzzysure entropy and other approaches. Biomed Signal Process Control 47:276–287 19. Farhi L, Yusuf A, Raza RH (2017) Adaptive stochastic segmentation via energy-convergence for brain tumor in MR images. J Vis Commun Image Represent 46:303–311
20. Tustison NJ, Shrinidhi K, Wintermark M, Durst CR, Kandel BM, Gee JC et al (2015) Optimal symmetric multimodal templates and concatenated random forests for supervised brain tumor segmentation (simplified) with ANTsR. Neuroinformatics 13:209–225 21. Khan MA, Akram T, Sharif M, Shahzad A, Aurangzeb K, Alhussein M et al (2018) An implementation of normal distribution based segmentation and entropy controlled features selection for skin lesion detection and classification. BMC Cancer 18:638 22. Sharif M, Khan MA, Zahid F, Shah JH, Akram T (2019) Human action recognition: a framework of statistical weighted segmentation and rank correlation-based selection. In: Pattern analysis and applications, pp 1–14 23. Akram T, Khan MA, Sharif M, Yasmin M (2018) Skin lesion segmentation and recognition using multichannel saliency estimation and M-SVM on selected serially fused features. J Ambient Intell Humaniz Comput:1–20 24. Sharif M, Khan MA, Iqbal Z, Azam MF, Lali MIU, Javed MY (2018) Detection and classification of citrus diseases in agriculture based on optimized weighted segmentation and feature selection. Comput Electron Agric 150:220–234 25. Liaqat A, Khan MA, Shah JH, Sharif M, Yasmin M, Fernandes SL (2018) Automated ulcer and bleeding classification from wce images using multiple features fusion and selection. J Mech Med Biol 18:1850038 26. Afza F, Khan MA, Sharif M, Rehman A (2019) Microscopic skin laceration segmentation and classification: a framework of statistical normal distribution and optimal feature selection. Microsc Res Tech 27. Sharif M, Khan MA, Akram T, Javed MY, Saba T, Rehman A (2017) A framework of human detection and action recognition based on uniform segmentation and combination of Euclidean distance and joint entropy-based features selection. EURASIP J Image Video Process 2017:89 28. Sharif M, Khan MA, Faisal M, Yasmin M, Fernandes SL (2018) A framework for offline signature verification system: best features selection approach. Pattern Recogn Lett 29. Khan MA, Akram T, Sharif M, Javed MY, Muhammad N, Yasmin M (2018) An implementation of optimized framework for action classification using multilayers neural network on selected fused features. Pattern Anal Applic:1–21 30. Nasir M, Attique Khan M, Sharif M, Lali IU, Saba T, Iqbal T (2018) An improved strategy for skin lesion detection and classification using uniform segmentation and feature selection based approach. Microsc Res Tech 81:528–543 31. Sharif M, Tanvir U, Munir EU, Khan MA, Yasmin M (2018) Brain tumor segmentation and classification by improved binomial thresholding and multi-features selection. J Ambient Intell Humaniz Comput:1–20 32. Jain D, Singh V (2018) An efficient hybrid feature selection model for dimensionality reduction. Proc Comput Sci 132:333–341 33. Selvapandian A, Manivannan K (2018) Fusion based glioma brain tumor detection and segmentation using ANFIS classification. Comput Methods Prog Biomed 166:33–38 34. Lahmiri S (2017) Glioma detection based on multi-fractal features of segmented brain MRI by particle swarm optimization techniques. Biomed Signal Process Control 31:148–155 35. Soltaninejad M, Yang G, Lambrou T, Allinson N, Jones TL, Barrick TR et al (2017) Automated brain tumour detection and segmentation using superpixel-based extremely randomized trees in FLAIR MRI. Int J Comput Assist Radiol Surg 12:183–203 36. Reddy DJ, Prasath TA, Rajasekaran MP, Vishnuvarthanan G (2019) Brain and pancreatic tumor classification based on GLCM—k-NN approaches. 
In: International conference on intelligent computing and applications, pp 293–302 37. Tchoketch Kebir S, Mekaoui S, Bouhedda M (2019) A fully automatic methodology for MRI brain tumour detection and segmentation. Imaging Sci J 67:42–62 38. Rajesh T, Malar RSM, Geetha M (2018) Brain tumor detection using optimisation classification based on rough set theory. Clust Comput:1–7
39. Khan SA, Nazir M, Khan MA, Saba T, Javed K, Rehman A et al (2019) Lungs nodule detection framework from computed tomography images using support vector machine. Microsc Res Tech 40. Khan MA, Javed MY, Sharif M, Saba T, Rehman A (2019) Multi-model deep neural network based features extraction and optimal selection approach for skin lesion classification. In: 2019 international conference on Computer and Information Sciences (ICCIS), pp 1–7 41. Adhikari SK, Sing JK, Basu DK, Nasipuri M (2015) Conditional spatial fuzzy C-means clustering algorithm for segmentation of MRI images. Appl Soft Comput 34:758–769 42. Camlica Z, Tizhoosh HR, Khalvati F (2015) Medical image classification via SVM using LBP features from saliency-based folded data. In: 2015 IEEE 14th international conference on Machine Learning and Applications (ICMLA), pp 128–132 43. Satpathy A, Jiang X, Eng H-L (2014) LBP-based edge-texture features for object recognition. IEEE Trans Image Process 23:1953–1964 44. Tasdemir SBY, Tasdemir K, Aydin Z (2018) ROI detection in mammogram images using wavelet-based Haralick and HOG features. In: 2018 17th IEEE international conference on Machine Learning and Applications (ICMLA), pp 105–109 45. Kimmel R, Zhang C, Bronstein AM, Bronstein MM (2011) Are MSER features really interesting? IEEE Trans Pattern Anal Mach Intell 33:2316–2320 46. Ghasemzadeh A, Azad SS, Esmaeili E (2018) Breast cancer detection based on Gabor-wavelet transform and machine learning methods. Int J Mach Learn Cybern:1–10 47. Mahmood M, Jalal A, Evans HA (2018) Facial expression recognition in image sequences using 1D transform and Gabor wavelet transform. In: 2018 international conference on Applied and Engineering Mathematics (ICAEM), pp 1–6 48. Yang J, Yang J-y, Zhang D, Lu J-f (2003) Feature fusion: parallel strategy vs. serial strategy. Pattern Recogn 36:1369–1381 49. Goldberger J, Hinton GE, Roweis ST, Salakhutdinov RR (2005) Neighbourhood components analysis. In: Advances in neural information processing systems, pp 513–520 50. Shang Q, Tan D, Gao S, Feng L (2019) A hybrid method for traffic incident duration prediction using BOA-optimized random forest combined with neighborhood components analysis. J Adv Transp 2019 51. Ferdinando H, Seppänen T, Alasaarela E (2017) Enhancing emotion recognition from ECG signals using supervised dimensionality reduction. ICPRAM:112–118 52. Ayoobkhan MUA, Chikkannan E, Ramakrishnan K (2018) Feed-forward neural network-based predictive image coding for medical image compression. Arab J Sci Eng 43:4239–4247 53. Amin J, Sharif M, Yasmin M, Fernandes SL (2018) Big data analysis for brain tumor detection: deep convolutional neural networks. Futur Gener Comput Syst 87:290–297 54. Kamnitsas K, Ledig C, Newcombe VF, Simpson JP, Kane AD, Menon DK et al (2017) Efficient multi-scale 3D CNN with fully connected CRF for accurate brain lesion segmentation. Med Image Anal 36:61–78
2 Hybrid Image Processing-Based Examination of 2D Brain MRI Slices to Detect Brain Tumor/Stroke Section: A Study

David Lin, V. Rajinikanth, and Hong Lin
Abstract
In human physiology, the brain is the most important internal organ; a disease in the brain affects the individual severely, and an untreated disease can lead to death. In the brain, tumors as well as ischemic stroke are the leading causes of death or permanent disability in the elderly. The clinical-level diagnosis of these diseases is normally carried out with the well-known imaging technique called magnetic resonance imaging (MRI), due to its multimodality and proven nature. Radiology helps to generate a 3D structure of the MRI, from which 2D slices are extracted and examined to detect the abnormality; the examination of 2D slices is quite simple compared to the 3D image. This work implements a hybrid imaging technique combining thresholding and segmentation. Thresholding helps to improve the visibility of the abnormal part and is executed with the brain storm optimization-based Otsu's/Kapur's function. Later, the abnormal part of the image is extracted using a chosen segmentation technique. In this work, a detailed assessment of the existing segmentation procedures, such as watershed, active contour, level set, and region growing, is presented. Finally, a study comparing the ground truth and the extracted part is executed, and based on the Jaccard, Dice, and accuracy values, the performance of the proposed technique is validated.
D. Lin AP Research, Seven Lakes High School, Katy, TX, USA
V. Rajinikanth (*) Electronics and Instrumentation Engineering, St. Joseph's College of Engineering, Chennai, Tamil Nadu, India e-mail: [email protected]
H. Lin Department of Computer Science and Engineering Technology, University of Houston-Downtown, Houston, TX, USA
© Springer Nature Singapore Pte Ltd. 2021 E. Priya, V. Rajinikanth (eds.), Signal and Image Processing Techniques for the Development of Intelligent Healthcare Systems, https://doi.org/10.1007/978-981-15-6141-2_2
Keywords
Brain disease · MRI · Brain tumor · Brain stroke · Otsu · Kapur · Substantiation
2.1 Introduction
In humans, the brain is the fundamental internal organ that conscientiously monitors the complete physiological system. It receives bioelectric potentials from the other sensory organs, processes the information, and executes the essential control operations. The functioning of the brain decides the condition of an individual: a disease in the brain severely affects the individual's entire basic operation, and an untreated abnormality can lead to death [1–4]. Because of this severity, a considerable number of brain MRI assessment procedures have been implemented in the literature [5–8]. Most of the earlier works relate to the segmentation and assessment of the abnormal section [9, 10], machine learning schemes, and, recently, deep learning procedures [11–14] to examine the brain abnormality existing in the MRI slices. Recently, a substantial number of awareness programs and precautionary actions have been taken to protect the brain from abnormality. However, due to unavoidable factors, such as modern lifestyle, food habits, heredity, and age, many people suffer from various brain abnormalities, among which brain tumor (BT) and brain stroke (BS) are the leading causes of death and disability in elderly people. MRI-based assessment is one of the clinically proven procedures, in which the MRI slices of a chosen modality, such as Flair, T1, T1C, T2, or diffusion-weighted (DW), are examined by an experienced doctor, who plans the essential treatment to control the abnormality in the brain. During a mass screening operation, the number of patients is large, and the number of images obtained from these patients is also very large, which increases the complexity of evaluating the brain MRI with the help of the doctor alone. Further, far fewer doctors are available than patients, and hence it is necessary to get the computer's help to diagnose brain abnormalities. Computer-based assessment is very fast, and the outcome attained by the computer is then sent for the approval of the doctor, which considerably reduces the diagnosis burden. Due to its clinical significance, a considerable number of brain MRI examination techniques have been proposed and implemented to examine a class of brain abnormalities [15–20]. The existing procedures implement a class of semi-automated [21–24] and automated techniques to extract and evaluate the abnormal sections of the brain. The common procedure implemented in earlier works is as follows: (i) extraction of the 2D MRI slice from the 3D MRI, (ii) enhancement of the abnormal section with a chosen image improvement technique, (iii) mining of the
segment of interest, (iv) feature extraction and selection, and (v) classification and obtaining the support of the doctor for further assessment. The aim of the research work in this study is to develop an image processing method to extract and evaluate the abnormal section existing in brain MRI slices with better accuracy. A hybrid methodology discussed in recent works is considered to examine the abnormal parts, such as the brain tumor and the stroke section, from 2D MRI slices of a chosen modality. The proposed work includes the following: (i) pre-processing based on the brain storm optimization-based Otsu's/Kapur's function, (ii) implementation of a chosen segmentation procedure to mine the abnormal part of the brain MRI, (iii) comparison of the extracted section and the ground truth (GT) to compute the essential performance values, and (iv) validation of the implemented technique based on the values of Jaccard, Dice, and accuracy. In this research, the necessary brain MRI slices are collected from benchmark image datasets, namely the Multimodal Brain Tumor Segmentation Challenge (BRATS2015) [25, 26] and the Ischemic Stroke Lesion Segmentation Challenge (ISLES2015) [27, 28]. The images of these datasets are available with the GT offered by an expert, and the comparison of the GT with the extracted section helps to identify the performance of the proposed technique. If the proposed work provides better results on the benchmark images, it can be used for analyzing clinical-grade MRI in the future. In this work, a detailed analysis of the threshold procedures, Otsu and Kapur, is initially presented, followed by a detailed study of the existing segmentation techniques: watershed, active contour, level set, and region growing. The experimental work confirms that the proposed hybrid technique works well on the BRATS2015 and ISLES2015 databases. This work is arranged as follows: Sect. 2.2 delineates the associated existing techniques, Sects. 2.3 and 2.4 present the problem description and the implemented method, Sect. 2.5 presents the experimental results, and the conclusion is discussed in Sect. 2.6.
2.2 Related Research Works
In the literature, a considerable number of brain MRI assessment procedures have been discussed by researchers. Each practice has presented its merits and demerits with a chosen machine learning (ML) or deep learning (DL) work. In most of these works, the segmentation-assisted approach is proven to be the better methodology. The segmentation-based method helps to obtain the necessary shape and aspect features of the abnormal brain segment, which improves the classification accuracy of ML and DL methods; this procedure plays a vital role in ensemble-assisted classification [29, 30]. An outline of the accessible brain abnormality detection techniques is given in Table 2.1, which offers the details concerning the approach, the outcome, and the validation process.
Table 2.1 Summary of brain abnormality evaluation methods

| Reference | Methodology | Abnormality |
|---|---|---|
| [4] | A detailed framework for the early detection of the brain abnormality with a heuristic algorithm-assisted technique using brain MRI slices | Tumor |
| [5] | Implemented hybrid image examination technique for the brain abnormality assessment | Tumor |
| [6] | Implementation of a machine learning procedure for the classification of brain MRI into normal and tumor class | Tumor |
| [7] | Jaya algorithm-assisted thresholding and segmentation of brain tumor from MRI | Tumor |
| [31] | Assessment of MRI slices with fuzzy entropy and level set | Tumor |
| [32] | Kapur's threshold and Chan-Vese segmentation for tumor extraction and evaluation | Tumor |
| [33] | A hybrid imaging procedure to examine brain CT/MRI | Tumor |
| [34] | Tsallis entropy-assisted analysis of brain MRI | Tumor |
| [35] | MRI of Flair/DW modality assessment with a chosen heuristic algorithm | Stroke |
| [36] | Implementation of Kapur's thresholding and Markov random field-based segmentation to extract the brain abnormality | Tumor |
Along with the brain tumor detection procedures presented in Table 2.1, a considerable number of research works have also been executed to detect the stroke section using brain MRI slices. The details of the image examination procedures extracting the stroke section from a chosen MRI slice can be found in [37–40]. Further, a detailed review of the existing brain MRI detection procedures with deep learning and machine learning methods can be found in [29]. The existing outcomes confirm that the assessment of the brain MRI is a crucial task, which should assist the doctor in planning the possible treatment procedures. Based on the outcome of the implemented tool, the doctor takes the decision, such as surgery, chemotherapy, or radiotherapy, to control and remove the abnormal cell growth in the brain region. In the proposed work, a hybrid imaging approach is developed to examine the tumor/stroke section from 2D MRI slices attained from the BRATS and ISLES databases. The objective of the proposed work is to implement a technique that offers better values of Jaccard, Dice, and accuracy.
2.3 Problem Formulations and Methodology
Normally, the clinical-level evaluation is performed by an experienced radiologist, and the radiologist's evaluation report is then sent to a doctor for further examination. The clinical-stage examination involves recognizing the position and the severity of the tumor/stroke in order to plan the necessary treatment process.
Fig. 2.1 Stages involved in the proposed assessment technique with Otsu's thresholding (flowchart: 3D MRI → 2D MRI slice → BSO-assisted Otsu's thresholding → segmentation with WS/AC/LS/RG → comparison with the GT → performance value computation and validation)
The doctor-level assessment of the MRI slices depends on the availability of the doctor; in most circumstances this examination needs additional time, and the assessment of a large number of images is time-consuming. Therefore, a large number of computerized analytical methods have recently been implemented to support the doctor during the assessment procedure. This work aims to implement a brain storm optimization (BSO)-assisted brain abnormality assessment practice using an integrated threshold and segmentation technique. The system extracts the abnormal part from the tumor/stroke images using a chosen segmentation technique, and its outcome is then evaluated based on the performance values computed against the GT. The proposed methodology is presented in Figs. 2.1 and 2.2, in which Fig. 2.1 depicts the approach with Otsu's threshold and Fig. 2.2 depicts the approach with Kapur's function. The stages of the proposed technique are as follows: collection of the BRATS/ISLES 3D image, extraction of the 2D slice from the 3D image, tri-level thresholding with the BSO-assisted Otsu/Kapur technique, execution of the chosen segmentation procedure to extract the abnormal region, comparison of the extracted section with the GT, and performance computation and validation.
Fig. 2.2 Stages involved in the proposed assessment technique with Kapur’s entropy thresholding
2.3.1 Brain Image Collection
In the medical domain, most disease detection techniques require bio-images collected using a chosen imaging procedure. Collecting clinical-level images from hospitals is a very time-consuming procedure, and most of the images attained from hospitals are also associated with various defects, such as irregular illumination, noise, and improper reconstruction. Further, the images collected from a hospital require the authorization of the ethical committee. Due to these issues, medical-grade images are prepared and shared for research purposes, and these images are then considered as benchmarks to test and validate the image-assisted diagnostic procedures developed by researchers. This work collected the necessary test pictures of brain MRI slices from BRATS2015 [25, 26] and ISLES2015 [27, 28]. These datasets are free from the skull section and also provide the GT for comparison and validation purposes. This work considered 200 images of the BRATS2015 database with the Flair, T1C, and T2 modalities and 200 images of the ISLES2015 database with the DW modality. The sample test images collected from these databases are depicted in Figs. 2.3 and 2.4, respectively. Figure 2.3 presents sample test pictures of BRATS2015 with the Flair, T1, T1C, and T2 modalities together with the GT existing in the database. In the proposed study, the MRIs with Flair and T2 modalities are considered for the assessment due to their better visibility.
Fig. 2.3 Sample test pictures of BRATS2015 database
Similarly, the DW and Flair modality images of ISLES2015 are presented in Fig. 2.4, along with the actual GT and the low-grade GT (GTL). The other images considered in this work are similar to the sample images presented in Figs. 2.3 and 2.4.
2.3.2 Image Thresholding
Image thresholding is one of the commonly accepted image assessment procedures, largely considered to enhance the abnormal sections in a class of medical images [41–45]. Even though a large number of image examination procedures exist in the literature, Otsu's and Kapur's methods are largely considered due to their proven performance [46–50]. In this work, a tri-level threshold discussed by earlier researchers is considered to pre-process the test image.
Fig. 2.4 Sample test images of ISLES2015 dataset
Further, to minimize the complexity of the threshold process, the brain storm optimization (BSO) is considered; this algorithm helps to identify the optimal threshold values by maximizing the objective value. The particulars of the BSO and the threshold selection procedures are presented in the following subsections.
Brain Storm Optimization
BSO is a heuristic technique proposed by Shi (2011) [51], based on mimicking the human decision-making capability in hard situations [52, 53]. Traditional BSO works with the following steps:

Step 1: Construct a brainstorming human cluster with different backgrounds
Step 2: Begin the algorithm to create multiple ideas for the chosen problem
Step 3: Recognize and select a fitting idea to resolve the chosen problem
Step 4: Choose the suggestion that provides the best probability of success
Step 5: Permit the decision-making person to accumulate as many ideas as possible to resolve the given task
Step 6: Repeat the procedure until an improved solution is attained for the selected problem
In the proposed study, the BSO is employed to recognize the optimal threshold by maximizing the objective value of Otsu's/Kapur's function. Other details on BSO can be found in [51–53]. The parameters of the BSO are assigned as follows: agent size = 30, iteration count = 2500, search dimension = 3, and stopping criterion = maximized Otsu's/Kapur's function.
Otsu’s Function The established Otsu’s practice is shown below [46]. The between-class variance for a bi-level dilemma can be offered with the subsequent statement: let the mission is to separate the picture in to the background (B0) and the object (B1) using a preferred threshold Th. The probability allotment for a picture with L1 thresholds, B0 and B1, is denoted as B0 ¼ where u0 ðT Þ ¼
T1 P
P0 P P P . . . T1 and B1 ¼ T . . . L1 u0 ð T Þ u0 ð T Þ u1 ðT Þ u1ðT Þ
P j , u1 ð T Þ ¼
L1 P
ð2:1Þ
P j ,and L ¼ 256.
j¼T
j¼0
The means μ0 and μ1 of B0 and B1 are expressed as: μ0 ¼
T 1 L1 X X jP j jP j and μ1 ¼ u ð T Þ ω1 ðT Þ 0 j¼T j¼0
ð2:2Þ
The mean intensity (μT) of the picture is: μT ¼ ω0 μ0 þ ω1 μ1 and u0 þ u1 ¼ 1 The maximized objective value is: Maximize J ðT Þ ¼ σ 0 þ σ 1
ð2:3Þ
where σ 0 ¼ u0(μ0 μT)2 and σ 1 ¼ u1(μ1 μT)2. Similar state is attained in tri-level threshold and its expression will be: Maximize J ðT Þ ¼ σ 0 þ σ 1 þ σ 2
ð2:4Þ
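To make Eq. (2.4) and the BSO-driven threshold search concrete, here is a minimal Python sketch. It is not the authors' MATLAB implementation: the single-cluster, Gaussian-perturbation update is a strong simplification of full BSO [51], and the step size of 10 gray levels is an assumed parameter, while the agent and iteration counts follow the values assigned above.

```python
import numpy as np

def otsu_objective(hist, thresholds):
    """Between-class variance J(T) of Eqs. (2.3)-(2.4) for multi-level thresholds."""
    p = hist.astype(float) / hist.sum()            # gray-level probabilities P_j
    levels = np.arange(p.size)
    mu_T = (levels * p).sum()                      # global mean intensity
    bounds = [0, *sorted(int(t) for t in thresholds), p.size]
    J = 0.0
    for lo, hi in zip(bounds[:-1], bounds[1:]):
        u_k = p[lo:hi].sum()                       # class probability u_k
        if u_k > 0:
            mu_k = (levels[lo:hi] * p[lo:hi]).sum() / u_k
            J += u_k * (mu_k - mu_T) ** 2          # sigma_k contribution
    return J

def bso_search(hist, objective, n_thresh=3, agents=30, iters=2500, seed=1):
    """Single-cluster BSO-style search: perturb the best idea with Gaussian noise."""
    rng = np.random.default_rng(seed)
    ideas = rng.integers(1, 255, size=(agents, n_thresh))
    scores = np.array([objective(hist, t) for t in ideas])
    for _ in range(iters):
        best = ideas[scores.argmax()]
        cand = np.clip(best + rng.normal(0, 10, ideas.shape), 1, 254).astype(int)
        cand_scores = np.array([objective(hist, t) for t in cand])
        better = cand_scores > scores
        ideas[better], scores[better] = cand[better], cand_scores[better]
    return np.sort(ideas[scores.argmax()])
```

For a 2D gray-level slice `img`, `hist = np.bincount(img.ravel(), minlength=256)` supplies the histogram, and `bso_search(hist, otsu_objective)` returns the three sorted thresholds.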
Kapur’s Function Kapur’s approach is a proven technique to enhance the image based on the chosen threshold value [48].
38
D. Lin et al.
Mathematical model of the KE is defined below. Letting T ¼ [t1, t2, . . ., tL 1] designate thresholds of the picture, then the universal entropy will be: Cost function ¼ J max ¼ J Kapur ðT Þ ¼
L X
ORj
for Rf1, 2, 3g
ð2:5Þ
J¼1
Equation 2.5 presents the maximized entropy with valued to the chosen threshold value. In multiple threshold task, the goal is denoted as: OR1
t1 X PoRj
¼
j¼1
θR0
ln
t2 X PoRj
OR2 ¼
j¼t lþ1
θR1
⋮ L X PoRj
ORk ¼
j¼t kþ1
θRL1
ln
ln
PoRj
!
θR0 PoRj
, !
θR1 PoRj
,
ð2:6Þ
!
θRK1
where PoRj indicates likelihood distribution and θR0 , θR1 , . . . , θRL1 depicts likelihood occurrence in L-levels. The main function of BSO is to recognize best threshold based on maximized entropy. Other related information can be found in [32, 36].
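A matching sketch of Kapur's objective, written so it can be passed to the `bso_search` routine above in place of `otsu_objective`; the positive-entropy sign convention is an assumption taken from the usual maximum-entropy reading of Kapur's method.

```python
import numpy as np

def kapur_objective(hist, thresholds):
    """Sum of per-class entropies O_k of Eqs. (2.5)-(2.6)."""
    p = hist.astype(float) / hist.sum()
    bounds = [0, *sorted(int(t) for t in thresholds), p.size]
    J = 0.0
    for lo, hi in zip(bounds[:-1], bounds[1:]):
        theta = p[lo:hi].sum()                     # class likelihood theta_k
        q = p[lo:hi][p[lo:hi] > 0]
        if theta > 0 and q.size:
            q = q / theta
            J -= (q * np.log(q)).sum()             # positive class entropy
    return J
```

Calling `bso_search(hist, kapur_objective)` then yields Kapur-optimal thresholds with the same search loop.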
2.3.3 Image Segmentation
Segmentation is a necessary scheme in image analysis, which helps to extract the required division of the image for additional assessment. A substantial number of segmentation techniques are available in the literature to extract the abnormal sections from brain MRI. The proposed work implements the watershed (WS), active contour (AC), level set (LS), and region growing (RG) methods existing in the literature [54–60], in which WS is an automated technique and AC, LS, and RG are grouped as semi-automated procedures. Each method has its own merits and demerits, and choosing a particular segmentation process is a challenging task. This work considers all of these segmentation techniques to extract the abnormal brain part, and the performance of the chosen segmentation is confirmed using the attained values of Jaccard, Dice, and accuracy.
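As one hedged illustration of the WS branch, the sketch below implements a marker-controlled watershed in the spirit of the edge-map → watershed-fill → morphological-enhancement → extraction sequence reported for Fig. 2.6; the Sobel edge map and the `low`/`high` marker thresholds are assumptions, since the chapter does not specify its marker strategy (SciPy and scikit-image assumed available).

```python
import numpy as np
from scipy import ndimage as ndi
from skimage.filters import sobel
from skimage.segmentation import watershed

def watershed_extract(img, low, high):
    """Marker-controlled watershed: edge map, flood from markers, clean up."""
    edges = sobel(img)                             # gradient-magnitude edge map
    markers = np.zeros(img.shape, dtype=int)
    markers[img < low] = 1                         # confident background seeds
    markers[img > high] = 2                        # confident abnormal-region seeds
    labels = watershed(edges, markers)             # fill basins from the seeds
    mask = labels == 2
    mask = ndi.binary_opening(mask, iterations=2)  # morphological enhancement
    return ndi.binary_fill_holes(mask)             # extracted abnormal section
```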
2.3.4 Evaluation and Validation
The performance appraisal and confirmation is a necessary procedure to verify the operation of the image assessment method. In this work, a validation is executed
between the extracted abnormal section and the GT, and the necessary performance values shown below are computed [5–7]:

$$\text{Jaccard index} = \text{JAC} = \frac{|I_{gti} \cap I_t|}{|I_{gti} \cup I_t|} \qquad (2.7)$$

$$\text{Dice} = \text{DIC} = \frac{2\,|I_{gti} \cap I_t|}{|I_{gti}| + |I_t|} \qquad (2.8)$$

$$\text{Accuracy} = \text{ACC} = \frac{TP + TN}{TP + TN + FP + FN} \qquad (2.9)$$

$$\text{Precision} = \text{PRE} = \frac{TP}{TP + FP} \qquad (2.10)$$

$$\text{Sensitivity} = \text{SEN} = \frac{TP}{TP + FN} \qquad (2.11)$$

$$\text{Specificity} = \text{SPE} = \frac{TN}{TN + FP} \qquad (2.12)$$

$$\text{F1-score} = \text{F1S} = \frac{2\,TP}{2\,TP + FN + FP} \qquad (2.13)$$

where $I_{gti}$ denotes the GT, $I_t$ the extracted tumor/stroke section, $\cup$ the union operation, and $\cap$ the intersection operation. TP, TN, FP, and FN indicate the true positives, true negatives, false positives, and false negatives, respectively.
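For binary masks, Eqs. (2.7)–(2.13) reduce to simple pixel counts, as in the minimal NumPy sketch below (note that the Dice score and the F1-score coincide for binary masks); guards against empty masks are omitted for brevity.

```python
import numpy as np

def overlap_metrics(gt, seg):
    """Eqs. (2.7)-(2.13) from two binary masks (ground truth vs. segmentation)."""
    gt, seg = np.asarray(gt, bool), np.asarray(seg, bool)
    tp = int(np.sum(gt & seg)); tn = int(np.sum(~gt & ~seg))
    fp = int(np.sum(~gt & seg)); fn = int(np.sum(gt & ~seg))
    return {
        "JAC": tp / (tp + fp + fn),
        "DIC": 2 * tp / (2 * tp + fp + fn),   # equals F1S for binary masks
        "ACC": (tp + tn) / (tp + tn + fp + fn),
        "PRE": tp / (tp + fp),
        "SEN": tp / (tp + fn),
        "SPE": tn / (tn + fp),
        "F1S": 2 * tp / (2 * tp + fp + fn),
    }
```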
2.4 Results and Discussions
This part of the research presents the experimental outcomes and their discussion. All results are attained using the MATLAB software. Initially, test images of dimension 216 × 176 × 1 pixels (BRATS2015) and a resized dimension of 256 × 256 × 1 pixels (ISLES2015) are considered for the assessment. The original images of ISLES2015 are of dimension 77 × 77 × 1 pixels and are resized to 256 × 256 × 1 pixels for better visibility. The proposed work considers only the axial view of the brain MRI, with the Flair and T2 modalities for BRATS and the DW and Flair modalities for ISLES. Figure 2.5 depicts the gray histograms for the sample test images and their GT; a similar procedure is implemented for the brain stroke images of ISLES2015. The histograms of the chosen Flair, T2, and GT images are presented in this figure: the Flair image histogram has a better pixel distribution compared to T2, and the GT histogram has only a few pixel groups, which represent the background, tumor core, tumor, and edema. Initially, the proposed MRI examination system is implemented on the BRATS2015 database using Otsu's approach and later executed using Kapur's function; the related results are presented in Fig. 2.6.
Fig. 2.5 Sample images and the related gray histogram. (a) Image. (b) histogram
The Otsu-based threshold is initiated by assigning the BSO parameters discussed in section "Brain Storm Optimization". The BSO is allowed to randomly vary the thresholds of the chosen test picture until it finds the maximal value of Otsu's function. After finding this value, it reports the optimal thresholds, and this process enhances the test image by grouping the pixels into three sections: (i) background, (ii) tumor part, and (iii) normal brain section. After thresholding, a chosen segmentation procedure is implemented to extract the tumor segment from the brain MRI. Figure 2.6a, b presents the converged optimization search and the attained optimal thresholds, respectively. Figure 2.6c presents the outcome of Otsu's thresholding, and the results of Kapur's threshold are depicted in Fig. 2.6d. Later, the watershed (WS) segmentation is implemented to extract the tumor section from Fig. 2.6c, and the related results are depicted in Fig. 2.6e–h. The extracted tumor in Fig. 2.6h is then compared with the GT, and the essential performance values are computed. After computing all the necessary parameters, the segmentation task is repeated with the AC, LS, and RG methods chosen in this work.
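For completeness, a one-line sketch of the pixel-grouping step the thresholding performs; `img` is assumed to be a 2D gray-level array and `thresholds` the BSO-selected values (three thresholds yield four gray-level bands, of which background, tumor, and normal brain are the sections of interest).

```python
import numpy as np

def apply_thresholds(img, thresholds):
    """Label each pixel with the index of the gray-level band it falls into."""
    return np.digitize(img, np.sort(np.asarray(thresholds)))
```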
Fig. 2.6 Experimental results attained with the BRATS2015 database. (a) Convergence of optimization search based on Otsu’s function, (b) attained threshold values, (c) outcome of Otsu’s threshold, (d) result with Kapur’s threshold, (e) detected edge, (f) watershed fill, (g) morphological enhancement, (h) extracted tumor section
Fig. 2.7 Results attained with the ISLES2015 database. (a) Test image, (b) GT, (c) thresholded image, (d) extracted tumor

Table 2.2 Average performance attained with the BRATS2015 dataset with WS segmentation

| Modality | Function | JAC | DIC | SEN | SPE | ACC | PRE | F1S |
|---|---|---|---|---|---|---|---|---|
| Flair | Otsu | 0.8937 | 0.9132 | 0.9775 | 0.9825 | 0.9644 | 0.9295 | 0.9416 |
| Flair | Kapur | 0.9153 | 0.9207 | 0.9683 | 0.9775 | 0.9702 | 0.9317 | 0.9469 |
| T2 | Otsu | 0.8795 | 0.9005 | 0.9397 | 0.9558 | 0.9372 | 0.9166 | 0.9382 |
| T2 | Kapur | 0.8993 | 0.9095 | 0.9669 | 0.9794 | 0.9814 | 0.9314 | 0.9446 |
The work executed on Otsu’s tresholded image is repeated from the images attained with Kapur’s threshold, and the attained values are noted. Similar procedure is employed for the ISLES database, and the attained results are shown in Fig. 2.7. The outcome of this research confirmed that the proposed hybrid image processing technique works well on the brain MRI with modalities, like Flair, T2, and DW. Later, the proposed work executes a relative assessment among the GT and the extracted abnormal section, and the average value of the attained results is considered for the assessment. The attained results of BRATS and ISLES are depicted in Tables 2.2 and 2.3, respectively. The results of these tables confirm that the accuracy attained using Kapur’s threshold image is slightly better compared to Otsu’s thresholded image. Further, Table 2.2 and 2.3 confirmed that the proposed work helped to get better results on both the benchmark datasets. Further, a graphical representation shown in Fig. 2.8 also confirms that, for the BRATS images, Kapur’s approach helped to attain better values compared to Otsu, and similar results are obtained for the ISLES. In the future, the performance of the proposed system can be enhanced by considering other threshold methods, such as Tsallis entropy and Shannon entropy, existing in the literature. Further, the performance of the developed system can be
Table 2.3 Average performance attained with the ISLES2015 dataset with WS segmentation

| Modality | Function | JAC | DIC | SEN | SPE | ACC | PRE | F1S |
|---|---|---|---|---|---|---|---|---|
| Flair | Otsu | 0.8848 | 0.9048 | 0.9638 | 0.9722 | 0.9594 | 0.9338 | 0.9513 |
| Flair | Kapur | 0.9036 | 0.9221 | 0.9683 | 0.9774 | 0.9612 | 0.9311 | 0.9352 |
| T2 | Otsu | 0.8906 | 0.8998 | 0.9464 | 0.9662 | 0.9511 | 0.9066 | 0.9328 |
| T2 | Kapur | 0.8837 | 0.9084 | 0.9747 | 0.9837 | 0.9648 | 0.9438 | 0.9558 |
Fig. 2.8 Graphical representation of chosen performance measures
Further, the developed system can be applied to evaluate the high-grade brain tumor images of BRATS2015 and the stroke images existing in the Radiopaedia database [61] (refer to Figs. 2.9, 2.10, and 2.11 in the Appendix). In addition, the performance of the BSO can be validated against other heuristic approaches.
2.5 Conclusions
The proposed research aims to develop an image examination procedure to examine the abnormal section from brain MRI slices. The common brain abnormalities, the tumor and the stroke section, are examined using the hybrid imaging technique. This work executed pre-processing and segmentation practices on the BRATS and ISLES databases (200 tumor + 200 stroke = 400 images). The preliminary stage enhances the test picture using the BSO-assisted Otsu's/Kapur's function. After thresholding, the tumor/stroke section is extracted using a segmentation technique chosen from WS, AC, LS, and RG. The comparison of the tumor/stroke segment and the GT is then employed, and the performance measures are calculated. The attained results confirm that the result attained with Kapur's threshold is better compared to Otsu's, and hence, in the future, other entropy-assisted thresholds are to be executed to improve the diagnosis accuracy.
Appendix
Fig. 2.9 Results attained with LS. (a) Initiation of bounding box, (b) bounding box, (c) converged LS on tumor, (d) 3D view of the extracted tumor, (e) binary image of the tumor (similar kind of results were attained with the active contour and region growing segmentation techniques)
Fig. 2.10 Sample images from high-grade BRATS2015 database
Fig. 2.11 Sample images from Radiopaedia clinical grade database
References 1. Louis DN et al (2016) The 2016 world health organization classification of tumors of the central nervous system: a summary. Acta Neuropathol 131:803–820. https://doi.org/10.1007/s00401016-1545-1 2. El-Dahshan, E.S.A, Mohsen, H.M., Revett, K. et al. (2014) Computer-aided diagnosis of human brain tumor through MRI: a survey and a new algorithm. Expert Syst Appl, vol.41, no.11, pp.5526–5545 3. Amin J, Sharif M, Yasmin M et al (2018) Big data analysis for brain tumor detection: deep convolutional neural networks. Future Gener Comput Syst 87:290–297 4. Fernandes SL et al (2019) A reliable framework for accurate brain image examination and treatment planning based on early diagnosis support for clinicians. Neural Comput Appl:1–12. https://doi.org/10.1007/s00521-019-04369-5 5. Dey N et al (2019) Social-group-optimization based tumor evaluation tool for clinical brain MRI of flair/diffusion-weighted modality. Biocybern Biomed Eng 39(3):843–856. https://doi. org/10.1016/j.bbe.2019.07.005 6. Pugalenthi R et al (2019) Evaluation and classification of the brain tumor MRI using machine learning technique. Control Eng Appl Inf 21(4):12–21 7. Satapathy SC, Rajinikanth V (2018) Jaya algorithm guided procedure to segment tumor from brain MRI. J Opt 2018:12. https://doi.org/10.1155/2018/3738049 8. He T, Pamela MB, Shi F (2016) Curvature manipulation of the spectrum of a valence–arousalrelated fMRI dataset using a Gaussian-shaped fast fourier transform and its application to fuzzy KANSEI adjective modeling. Neurocomputing 174:1049–1059 9. Hore S, Chakroborty S, Ashour AS, Dey N, Ashour AS, Sifakipistolla D, Bhattacharya T, Bhadra Chaudhuri SR (2015) Finding contours of hippocampus brain cell using microscopic image analysis. J Adv Microsc Res 10(2):93–103 10. Kovalev V, Kruggel F (2007) Texture anisotropy of the brain’s white matter as revealed by anatomical MRI. IEEE Trans Med Imaging 26(5):678–685 11. Liu M, Zhang J, Nie D et al (2018) Anatomical landmark based deep feature representation for MR images in brain disease diagnosis. IEEE J Biomed Health 22(5):1476–1485 12. Gudigar A, Raghavendra U, San TR, Ciaccio EJ, Acharya UR (2019) Application of multiresolution analysis for automated detection of brain abnormality using MR images: a comparative study. Futur Gener Comput Syst 90:359–367 13. Buda M et al (2019) Association of genomic subtypes of lower-grade gliomas with shape features automatically extracted by a deep learning algorithm. Comput Biol Med 109:218–225. https://doi.org/10.1016/j.compbiomed.2019.05.002 14. Sharif M, Tanvir U, Munir EU, Khan MA, Yasmin M (2018) Brain tumor segmentation and classification by improved binomial thresholding and multi-features selection. J Ambient Intell Humaniz Comput. https://doi.org/10.1007/s12652-018-1075-x 15. Moldovanu S, Moraru L, Biswas A (2016) Edge-based structural similarity analysis in brain MR images. J Med Imaging Health Inf 6:1–8 16. Tatla SK, Radomski A, Cheung J, Maron M, Jarus T (2012) Wii-habilitation as balance therapy for children with acquired brain injury. Dev Neurorehabil:1–15. http://www.ncbi.nlm.nih.gov/ pubmed/23231377 17. Sullivan JR, Riccio CA (2010) Language functioning and deficits following pediatric traumatic brain injury. Appl Neuropsychol 17(2):93–98. http://www.ncbi.nlm.nih.gov/pubmed/ 20467948 18. McKinlay A, Grace RC, Horwood LJ, Fergusson DM, Ridder EM, MacFarlane MR (2008) Prevalence of traumatic brain injury among children, adolescents and young adults: prospective evidence from a birth cohort. 
Brain Inj 22(2):175–181. http://www.ncbi.nlm.nih.gov/pubmed/ 18240046
19. Rajinikanth V, Dey N, Satapathy SC, Ashour AS (2018) An approach to examine magnetic resonance angiography based on Tsallis entropy and deformable snake model. Futur Gener Comput Syst 85:160–172 20. Acharya UR et al (2019) Automated detection of Alzheimer’s disease using brain MRI images– a study with various feature extraction techniques. J Med Syst 43(9):302. https://doi.org/10. 1007/s10916-019-1428-9 21. Jahmunah V et al (2019) Automated detection of schizophrenia using nonlinear signal processing methods. Artif Intell Med 100:101698. https://doi.org/10.1016/j.artmed.2019.07. 006 22. Rajinikanth V, Satapathy SC, Fernandes SL, Nachiappan S (2017) Entropy based segmentation of tumor from brain MR images – a study with teaching learning based optimization. Pattern Recogn Lett 94:87–95. https://doi.org/10.1016/j.patrec.2017.05.028 23. Rajinikanth V, Satapathy SC, Dey N, Lin H (2018) Evaluation of ischemic stroke region from CT/MR images using hybrid image processing techniques. In: Intelligent multidimensional data and image processing, pp 194–219. https://doi.org/10.4018/978-1-5225-5246-8.ch007 24. Palani TK, Parvathavarthini B, Chitra K (2016) Segmentation of brain regions by integrating meta heuristic multilevel threshold with Markov random field. Curr Med Imaging Rev 12 (1):4–12 25. Menze BH, Jakab A, Bauer S et al (2015) The multimodal brain tumor image segmentation benchmark (BRATS). IEEE Trans Med Imaging 34(10):1993–2024 26. Brain Tumour Database (BraTS-MICCAI). http://hal.inria.fr/hal-00935640. Accessed 20 Aug 2019 27. Maier O, Wilms M, Von der Gablentz J, Krämer UM, Münte TF, Handels H (2015) Extra tree forests for sub-acute ischemic stroke lesion segmentation in MR sequences. J Neurosci Methods 240:89–100. https://doi.org/10.1016/j.jneumeth.2014.11.011 28. Maier O et al (2017) ISLES 2015 – a public evaluation benchmark for ischemic stroke lesion segmentation from multispectral MRI. Med Image Anal 35:250–269 29. Tandel GS et al (2019) A review on a deep learning perspective in brain cancer classification. Cancers (Basel) 11(1):111. https://doi.org/10.3390/cancers11010111 30. Nadeem MW et al (2020) Brain tumor analysis empowered with deep learning: a review, taxonomy, and future challenges. Brain Sci 10(2):E118. https://doi.org/10.3390/ brainsci10020118 31. Roopini TI, Vasanthi M, Rajinikanth V, Rekha M, Sangeetha M (2018) Segmentation of tumor from brain MRI using fuzzy entropy and distance regularised level set. Lect Notes Electr Eng 490:297–304. https://doi.org/10.1007/978-981-10-8354-9_27 32. Manic KS, Hasoon FA, Shibli NA, Satapathy SC, Rajinikanth V (2019) An approach to examine brain tumor based on Kapur’s entropy and Chan–Vese algorithm. AISC 797:901–909 33. Rajinikanth V, Satapathy SC, Dey N, Lin H (2018) Evaluation of ischemic stroke region from CT/MR images using hybrid image processing techniques. Intell Multidimens Data Image Process:194–219. https://doi.org/10.4018/978-1-5225-5246-8.ch007 34. Rajinikanth V, Fernandes SL, Bhushan B, Sunder NR (2018) Segmentation and analysis of brain tumor using Tsallis entropy and regularised level set. Lect Notes Electr Eng 434:313–321 35. Revanth K et al (2018) Computational investigation of stroke lesion segmentation from flair/ DW modality MRI. In: Fourth international conference on Biosignals, Images and Instrumentation (ICBSII), IEEE 206–212. https://doi.org/10.1109/icbsii.2018.8524617 36. 
Rajinikanth V, Raja NSM, Kamalanand K (2017) Firefly algorithm assisted segmentation of tumor from brain MRI using Tsallis function and Markov random field. J Control Eng Appl Inform 19(3):97–106 37. Kanchana R, Menaka R (2015) Computer reinforced analysis for ischemic stroke recognition: a review. Indian J Sci Technol 8(35):81006 38. Usinskas A, Gleizniene R (2006) Ischemic stroke region recognition based on ray tracing. In: Proceedings of international baltic electronics conference. https://doi.org/10.1109/BEC.2006. 311103
39. Tang F-H, Ng DKS, Chow DHK (2011) An image feature approach for computer-aided detection of ischemic stroke. Comput Biol Med 41:529–536 40. Rajini NH, Bhavani R (2013) Computer aided detection of ischemic stroke using segmentation and texture features. Measurement 46:1865–1874 41. Rajinikanth V, Thanaraj PK, Satapathy SC, Fernandes SL, Dey N (2019) Shannon’s entropy and watershed algorithm based technique to inspect ischemic stroke wound. SIST 105:23–31. https://doi.org/10.1007/978-981-13-1927-3_3 42. Raja NSM et al (2019) A study on segmentation of leukocyte image with Shannon’s entropy. Histopathol Image Anal Med Decis Mak:1–27. https://doi.org/10.4018/978-1-5225-6316-7. ch001 43. Rajinikanth V, Dey N, Kavallieratou E, Lin H (2020) Firefly algorithm-based Kapur’s thresholding and Hough transform to extract leukocyte section from hematological images. Applications of firefly algorithm and its variants: case studies and new developments, pp 221–235. https://doi.org/10.1007/978-981-15-0306-1_10 44. Rajinikanth V, Dey N, Satapathy SC, Kamalanand K (2020) Inspection of crop-weed image database using Kapur’s entropy and spider monkey optimization. Adv Intell Syst Comput 1048:405–414. https://doi.org/10.1007/978-981-15-0035-0_32 45. Rajinikanth V, Raja NSM, Satapathy SC, Dey N, Devadhas GG (2018) Thermogram assisted detection and analysis of ductal carcinoma in situ (DCIS). In: International conference on Intelligent Computing, Instrumentation and Control Technologies (ICICICT), IEEE 1641–1646. https://doi.org/10.1109/icicict1.2017.8342817 46. Otsu N (1979) A threshold selection method from gray-level histograms. IEEE Trans Syst Man Cybern 9(1):62–66 47. Raj SPS et al (2018) Examination of digital mammogram using Otsu’s function and watershed segmentation. In: Fourth international conference on Biosignals, Images and Instrumentation (ICBSII), IEEE 206–212. https://doi.org/10.1109/ICBSII.2018.8524794 48. Kapur JN, Sahoo PK, Wong AKC (1985) A new method for gray-level picture thresholding using the entropy of the histogram. Comput Vis Graph Image Process 29:273–285 49. Rajinikanth V, Satapathy SC, Dey N, Fernandes SL, Manic KS (2019) Skin melanoma assessment using Kapur’s entropy and level set – a study with bat algorithm. Smart Innov Syst Technol 104:193–202. https://doi.org/10.1007/978-981-13-1921-1_19 50. Shriranjani D et al (2018) Kapur’s entropy and active contour-based segmentation and analysis of retinal optic disc. Lect Notes Electr Eng 490:287–295. https://doi.org/10.1007/978-981-108354-9_26 51. Shi Y (2011) Brain storm optimization algorithm. Lect Notes Comput Sci 6728:303–309. https://doi.org/10.1007/978-3-642-21515-5_36 52. Cheng S, Shi Y, Qin Q, Zhang Q, Bai R (2014) Population diversity maintenance in brain storm optimization algorithm. J Artif Intell Soft Comput Res 4(2):83–97 53. Cheng S, Qin Q, Chen J, Shi Y (2016) Brain storm optimization algorithm: a review. Artif Intell Rev 46(4):445–458 54. Manic KS, Priya RK, Rajinikanth V (2016) Image multithresholding based on Kapur/Tsallis entropy and firefly algorithm. Indian J Sci Technol 9(12):89949 55. Raja NSM, Rajinikanth V, Fernandes SL, Satapathy SC (2017) Segmentation of breast thermal images using Kapur’s entropy and hidden Markov random field. J Med Imaging Health Inform 7(8):1825–1829 56. Fernandes SL, Rajinikanth V, Kadry S (2019) A hybrid framework to evaluate breast abnormality. IEEE Consum Electron Mag 8(5):31–36. https://doi.org/10.1109/MCE.2019.2905488 57. 
Dey N, Rajinikanth V, Ashour AS, Tavares JMRS (2018) Social group optimization supported segmentation and evaluation of skin melanoma images. Symmetry 10(2):51. https://doi.org/10. 3390/sym10020051
58. Dey N, Shi F, Rajinikanth V (2019) Leukocyte nuclei segmentation using entropy function and Chan-Vese approach. Inf Technol Intell Transp Syst 314:255–264. https://doi.org/10.3233/9781-61499-939-3-255 59. Dey N (ed) (2017) Advancements in applied metaheuristic computing. IGI Global, Hershey 60. Dey N (2020) Applications of firefly algorithm and its variants. Springer, Singapore 61. https://radiopaedia.org/articles/ischaemic-stroke. Accessed 25 Jan 2020
3 Edge-Enhancing Coherence Diffusion Filter for Level Set Segmentation and Asymmetry Analysis Using Curvelets in Breast Thermograms

S. Prabha
Abstract
Thermal patterns showing asymmetry play a vital role in diagnosing cancer in breast thermograms. In this work, the asymmetrical patterns of thermal images are analysed using coherence-enhanced diffusion filtering (CEDF)-based reaction diffusion level set and the Curvelet transform (CT). The breast tissues are segmented using the reaction diffusion level set method (RDLSM), with the edge map generated by CEDF acting as the edge indicator in this level set. The left and right breast regions are separated from the segmented images, and the abnormal and normal sets are established from the pathological and healthy conditions of the separated regions. Three levels of Curvelet decomposition are performed on these tissues, and texture features such as contrast, dissimilarity and difference of variance are determined from the extracted Curvelet coefficients. The results show that the coherence-enhanced diffusion filter-based RDLSM is able to segment the breast regions; this technique shows a higher correlation between the ground truth and the segmented output than the conventional level set. CT offers more directional information than the wavelet and ridgelet transforms; therefore, the grey-level co-occurrence features extracted from the CT coefficients are found to be important in delineating normal and abnormal breast tissues. Hence, this approach can be used in the automated detection of asymmetry in breast thermograms for efficient diagnosis of breast cancer.

Keywords
Breast thermogram · Coherence diffusion filter · Curvelet transform · Asymmetry analysis

S. Prabha (*) Department of ECE, Hindustan Institute of Technology and Science, Chennai, India e-mail: [email protected]
© Springer Nature Singapore Pte Ltd. 2021 E. Priya, V. Rajinikanth (eds.), Signal and Image Processing Techniques for the Development of Intelligent Healthcare Systems, https://doi.org/10.1007/978-981-15-6141-2_3
3.1 Introduction
Breast cancer is a type of cancer that originates from breast tissues, mostly from the inner lining of the milk ducts. It is reported as the second leading cause of cancer death among women [1]. Infrared imaging, mammography, MRI, CT, ultrasound and PET scans are used to detect breast cancer. MRI and PET are not popularly adopted for this purpose due to their high cost, complexity and accessibility issues. Mammography, the gold-standard screening tool, and the other imaging modalities have limitations such as high cost, radiation exposure, and a painful, uncomfortable imaging procedure. Mammographic screening misses one in five breast cancers and in turn increases false-positive findings by 20%, which leads to more aggressive treatment for patients [2]. Infrared imaging is a non-invasive, non-ionizing, painless, low-cost imaging modality that is used along with mammography for better diagnosis of breast cancer [3]. The thermal patterns emitted by the human body, captured as images, are called thermograms [4]. Thermography is a therapeutic decision-making tool used for the earliest breast cancer detection [5]. Breast cancer detected earliest by infrared imaging has an 85% chance of cure, compared with 10% if the cancer is detected late [6, 7].

Fourier, wavelet and steerable pyramid decomposition methods and soft-thresholding techniques have been implemented as noise reduction techniques for breast thermograms [8–11]. Asymmetry analysis plays a major role in the identification of abnormalities present in breast tissues [12, 13], and for proper analysis of asymmetry in thermograms, segmentation and feature extraction are crucial steps [14]. The challenges that exist in the segmentation of breast thermograms are low contrast, low SNR and the absence of clear edges [15, 16]. Different segmentation techniques have been reviewed for breast thermography [17]. Commonly used methods for breast segmentation are morphological, edge-based and thresholding techniques [18, 19]. Optimization with multithresholding techniques has been used to examine breast thermograms [20]. Clustering techniques such as K-means and FCM have proven their effectiveness for diagnosing breast abnormalities [21]. The edge-based method proves its efficiency in segmenting the nearly accurate shape of the structures [22, 23]. An anisotropic diffusion filter with the Canny operator in an LSM has been used to detect the breast region alone from other neighbouring regions [24]. Automatic thresholding, automatic border detection and cubic spline techniques have been reported to separate breast tissues [25]. Level set-based segmentation has been proposed by Zhou for the extraction of blood vessels in thermal images, and the Hough transform has been used to extract lower breast boundaries [26–30]. Identification of the bottom breast borders and inflammatory wrinkles requires complex and accurate segmentation algorithms [31, 32].

LSMs are extensively used for segmentation in medical image processing to capture dynamic interfaces and shapes [30]. In order to sustain steady LS evolution and regularity of the LS function, re-initialization is necessary. Re-initialization, however, can move the zero LS far away from its expected position and thus result in numerical errors. These problems led to the further development of LSMs. A re-initialization-free theoretical analysis has been reported via the reaction diffusion
method [33]. A modified diffusion rate introduced into the level set equation offers a piecewise constant solution. Implementation of a two-step splitting method in reaction diffusion regularizes the level set function to reach stability. Higher boundary anti-leakage and anti-noise capabilities are achieved by this method [31]. In the conventional level set, a Gaussian edge map is used as the boundary stopping function [34]. To stop the diffusion process automatically and preserve boundaries near edges, an efficient edge indicator is needed [35]. CED filters are most effective on fairly coherent structures, enhancing boundaries at diverse orientations [22]. Orientation smoothing is achieved by coherence diffusion through the incorporation of diffusion and the structure tensor. This integration leads to a significantly good scale space representation with high coherence [36]. A nonlinear coherence diffusion model has been applied to ultrasound medical images, performing noise reduction and preservation of structured regions simultaneously [37].

Feature extraction techniques including statistical [25, 38], histogram [39, 40] and fractal [41, 42] analyses can extract features found to be beneficial in detecting diverse conditions of breast tissues. Higher order statistics such as skewness, kurtosis, variance and difference variance have been reported as most significant in the asymmetry investigation of breast thermograms [43, 44]. Bispectral features and Hilbert transform features from Riesz and quaternion representations are used to differentiate between non-cancerous and cancerous thermal patterns [45, 46]. Radial basis functions and the learning-to-rank method with texture serve as significant tools for detecting breast abnormalities [47, 48]. It has been reported that non-separable and complex wavelet features are useful in differentiating between different breast pathologies [49]. The Curvelet detects the activity of an image along curves more efficiently than the wavelet and the ridgelet [50]. Among multiresolution analyses such as the wavelet, ridgelet and Curvelet transforms, the Curvelet proves its effectiveness for detecting abnormality [51]. The Curvelet yields a high accuracy rate for the classification of normal and abnormal tissues in medical images compared to other multiresolution techniques [52]. A Curvelet-based feature extraction method has been used to detect abnormalities present in breast thermograms [53]. Adaptive fusion of medical images from CT and MRI is effectively denoised using the Curvelet transform and the total variation method [54].

In this work, asymmetry in breast thermograms is analysed using a coherence-enhanced diffusion filter-based RDLSM and Curvelet analysis. The breast tissues are segmented from the background using RDLSM, with CEDF implemented as the edge indicator in the level set. The left and right breast regions, under healthy and pathological conditions, are correctly identified using the midpoint of the inframammary folds. Grey-level co-occurrence texture features are then extracted from the Curvelet transform of the separated left and right breast regions.
3.2 Materials and Methods
Breast thermal images for this study are obtained from the online database of the project PROENG (http://visual.ic.uff.br/proeng/). The image acquisition techniques, protocol particulars and camera details are explained in [31]. Forty-seven breast images are considered for this analysis. CEDF is used as the edge map in RDLSM, and the segmented breast tissues are obtained from the proposed level set method. The left and right regions are separated using the midpoint of the inframammary folds. Regions with pathology are grouped as abnormal tissues, and healthy regions are considered normal tissues. Curvelet transform features are extracted from these regions and analysed for the characteristics of the different tissues. The workflow of the method is described in Fig. 3.1.
3.2.1 Reaction Diffusion-Based Level Set Method
The evolution of a curve in active contour models [36] is expressed as $K(s,t): [0,1] \to \mathbb{R}^2$, with $s \in [0,1]$ and parameter $t$. The curve or surface under evolution is given by

$$\frac{\partial K(s,t)}{\partial t} = B\,Z \qquad (3.1)$$

Contour motion is influenced by the parameter B, and Z is the normal vector to the contour K. The rate equation is described as

$$\frac{\partial \varphi}{\partial t} = \mathrm{Reg}(\varphi) + B\,\delta(\varphi) \qquad (3.2)$$

where $\mathrm{Reg}(\varphi) = \alpha\, \mathrm{div}\left[r(\varphi)\,\nabla\varphi\right]$, $r(\varphi)$ denotes the diffusion rate, $\alpha$ is a constant and B is the force term. Boundary leakage is an unavoidable problem that renders the solution unstable. An additional diffusion term $\varepsilon\Delta\varphi$ is added to the conventional level set to generate a stable solution [37]. The equation created using the RD rate is described as
Fig. 3.1 Representation of the framework carried out: breast thermograms → reaction diffusion based level set method with coherence edge map → segmentation of breast tissues → separation of left and right breast regions → Curvelet transform → feature extraction
$$\frac{\partial \Phi}{\partial t} = \varepsilon\,\Delta\Phi - \frac{1}{\varepsilon}\,L(\Phi), \quad x \in \Omega \subset \mathbb{R}^n \qquad (3.3)$$

The second term in the above equation is the reaction term, expressed as $\varepsilon^{-1}L(\varphi) = B\,\delta(\varphi)$, where $\Delta$ is the Laplacian operator and $\varepsilon$ is a constant. The force term B in the active contour model equation $L(\Phi) = F\,\delta(\Phi)$ is given by

$$F = \mathrm{div}\left(\frac{g(|\nabla I_\sigma|)\,\nabla\Phi}{|\nabla I_\sigma|}\right) + v\,g(|\nabla I_\sigma|) \qquad (3.4)$$

where v is a constant and $g(|\nabla I_\sigma|)$ is the edge indicator function, which is given by

$$g(|\nabla I_\sigma|) = \exp\left(-\frac{|\nabla I_\sigma|}{m}\right) \qquad (3.5)$$
A Gaussian kernel of standard deviation σ is convolved with the original image to generate a smoothed image, denoted Iσ, and m is a contrast parameter. Generally, the evolution of the curve is terminated by the edge indicator function.
3.2.2 Coherence-Enhancing Reaction Diffusion-Based Level Set Method
In the conventional level set formulation, the coherence-enhanced diffusion filter is an integration of the structure tensor and anisotropic nonlinear diffusion filtering. For an image u(x, t), the structure is represented as

$$J_0(\nabla u_\sigma) := \nabla u_\sigma\, \nabla u_\sigma^{T} \qquad (3.6)$$

where $\nabla u_\sigma$ denotes the gradient of the smoothed version of the image. The structure tensor smoothed with a Gaussian kernel $K_\rho$ is expressed as

$$J_\rho(\nabla u_\sigma) := K_\rho * \left(\nabla u_\sigma\, \nabla u_\sigma^{T}\right) \quad (\rho \ge 0) \qquad (3.7)$$
The eigenvalues of Jρ are a measure of anisotropy that affords valuable information on structure consistency. A strong eigenvalue difference indicates large anisotropy, whereas the difference is zero for isotropic structures. To investigate the movement of structures in an image, the structure tensor Jρ(∇uσ) is used as an edge indicator [36–38]. By integrating this image gradient into the reaction diffusion level set formulation, the force term F is given by
$$F = \mathrm{div}\left(\frac{g(|\nabla I_\sigma|)\,\nabla\Phi}{|\nabla I_\sigma|}\right) + v\,g(|\nabla I_\sigma|) \qquad (3.8)$$
In the above equation, v is a constant, and the edge indicator function is expressed through Jρ(∇uσ).
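To make the construction concrete, the following is a minimal NumPy/SciPy sketch of a coherence-based edge indicator derived from the structure tensor of Eqs. (3.6) and (3.7). It is not the authors' implementation: the smoothing scales sigma and rho and the final normalization into an edge-stopping function are illustrative assumptions.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def coherence_edge_map(image, sigma=1.0, rho=3.0):
    """Sketch of a coherence-based edge indicator from the structure tensor."""
    # Gradient of the Gaussian-smoothed image (the smoothed gradient of Eq. 3.6)
    u = gaussian_filter(image.astype(float), sigma)
    uy, ux = np.gradient(u)
    # Structure tensor components, smoothed with the Gaussian kernel K_rho (Eq. 3.7)
    j11 = gaussian_filter(ux * ux, rho)
    j12 = gaussian_filter(ux * uy, rho)
    j22 = gaussian_filter(uy * uy, rho)
    # Closed-form eigenvalues of the 2x2 tensor at every pixel
    trace = j11 + j22
    disc = np.sqrt((j11 - j22) ** 2 + 4.0 * j12 ** 2)
    lam1, lam2 = (trace + disc) / 2.0, (trace - disc) / 2.0
    # Coherence: large where the eigenvalue difference is strong (oriented structure)
    coherence = (lam1 - lam2) ** 2
    # Map to an edge-stopping function: small on strongly oriented edges
    return 1.0 / (1.0 + coherence / (coherence.mean() + 1e-12))
```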
3.2.3 Curvelet Transform
The radial window W(r) and the angular window V(θ) are elaborated using the Curvelet transform. This window pair supports the polar wedge, described in the Fourier domain as

$$U_j(r,\theta) = 2^{-3j/4}\, W\!\left(2^{-j} r\right) V\!\left(\frac{2^{\lfloor j/2 \rfloor}\,\theta}{2\pi}\right) \qquad (3.9)$$
The Curvelet transform of a function of x = (x₁, x₂) is described by scale, orientation and position, given by 2^(−j), θ and x_k^(j,l). The fast discrete Curvelet transform is applied in this work using the wrapping algorithm. The transform uses a multiscale pyramid structure with several orientations at each scale, described as wrapping-based CT. This structure captures the different nature of the sub-bands at varied scales in the frequency domain; the lower and higher level sub-bands consist of varied positions and orientations [11, 49–52]. The Curvelet coefficients are derived at several orientations and scales. Statistical features such as difference of variance, contrast and dissimilarity are described below. The probability of incidence of the grey levels of an image is given by

$$P(i,j) = h(i,j)\,/\,(N \times M) \qquad (3.10)$$
where h(i, j) is the value of the Curvelet coefficient matrix and N × M is the dimension of the corresponding sub-band. The variation in grey level across the texture regions of an image is captured by the contrast feature: low contrast reflects poor edges in an image, whereas high values reflect sharp edges.

$$\mathrm{Contrast} = \sum_i \sum_j (i - j)^2\, P(i,j) \qquad (3.11)$$
Grey-level variations are also measured by the dissimilarity feature, given by

$$\mathrm{Dissimilarity} = \sum_{i,j} |i - j|\, P(i,j) \qquad (3.12)$$
This feature shows a high value for texture samples that span the extremes of the possible grey levels [55].
The dispersion of the combined neighbour and reference pixel values about their mean is measured as the difference variance:

$$\mathrm{Diff.\ Variance} = \sum_i (i - f)^2\, P_{x-y}(i) \qquad (3.13)$$

where

$$P_{x-y}(k) = \sum_i \sum_j P(i,j), \quad k = |i - j|, \qquad f = \sum_i i\, P_{x-y}(i) \qquad (3.14)$$
3.2.4 Asymmetry Analysis
The threshold binary decomposition (TBD) method is employed to perform the asymmetry analysis. A set of thresholds is computed from the grey-level distribution of the input image. The multi-level Otsu algorithm (MLOA) is adopted for this purpose. The basic principle behind MLOA is to separate the image's pixels into many classes and find the best threshold values by maximizing the interclass variance (equivalently, minimizing the intraclass variance) of the input image. The number of thresholds chosen for this method is four. The TBD method decomposes the greyscale input into a set of binary images. The input greyscale image I is modelled by the binary decomposition algorithm as a 2D function I(x, y), and each binary image is given by

$$I_b(x,y) = \begin{cases} 1, & \text{if } I(x,y) \ge t \\ 0, & \text{otherwise} \end{cases} \qquad (3.15)$$
where I(x, y) is the greyscale value. The fractal dimension is then estimated from the resulting binary image, and the mean grey level and pixel count are calculated for the corresponding region.
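A minimal sketch of the TBD step with multi-level Otsu follows, using scikit-image's threshold_multiotsu; mapping "four thresholds" to five classes follows the text, while everything else is illustrative.

```python
import numpy as np
from skimage.filters import threshold_multiotsu

def threshold_binary_decomposition(image, n_thresholds=4):
    """Decompose a greyscale image into binary images via multi-level Otsu (Eq. 3.15)."""
    # classes = n_thresholds + 1 yields n_thresholds threshold values
    thresholds = threshold_multiotsu(image, classes=n_thresholds + 1)
    # One binary image per threshold t: I_b(x, y) = 1 where I(x, y) >= t
    return [(image >= t).astype(np.uint8) for t in thresholds]
```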
3.3 Results and Discussion
Figure 3.2a, b shows typical normal and abnormal greyscale breast thermal images. These images are characterized by low-intensity variations and low contrast, which makes clear identification of the lower breast boundaries and inframammary folds difficult. Figure 3.2b shows the thermal image of a subject with carcinoma in the left breast region. Asymmetry is difficult to identify, which complicates the interpretation of the abnormality. Even though asymmetry is induced by abnormal tissues, accurate delineation of the breast tissues is highly important in order to extract this information quantitatively. Accurate identification of breast tissue boundaries is an extremely complex task.
Fig. 3.2 (a) Typical normal image. (b) Typical abnormal image
Fig. 3.3 Edge maps. (a) Typical normal image. (d) Typical abnormal image. (b and e) Gaussian edge map. (c and f) Coherence enhanced diffusion edge map
RDLSM is exploited for the segmentation of breast tissues. The contour is initially placed in the image and then evolves until it reaches the iteration endpoint. Finally, a binary mask created from the contour is combined with the original image to obtain the desired output. Generally, the level set contour evolves using knowledge of the edge map. The CEDF and Gaussian filter edge maps used for the evolution of the contour are shown in Fig. 3.3. It is observed that the edge map obtained from the Gaussian filter shows unclear edges in the inner breast regions. The edge pattern appears discontinuous and has weak boundaries, especially in the inframammary fold regions of abnormal images.
Fig. 3.4 Segmented mask and breast tissue using coherence edge map. (a and c) Segmented mask. (b and d) Segmented breast tissue
The sigma value assigned to the edge-stopping function in CEDF is fixed at 10. The extent of diffusion is decided by a time step of 3, and the pre- and post-process smoothing of the gradient is chosen as 1. The number of CEDF iterations is limited to 100 to conclude the process. The CEDF edge map is found to be sharp, well defined and better separated from adjacent structures. The method highlights distinct edges against their nearby framework and eliminates the indistinguishable edges that arise with the Gaussian edge map. To achieve the desired boundary, this efficient edge map is used as the stopping function for the level set evolution, and the alpha value drives the contour towards the ROI. The masks generated for the breasts of Fig. 3.3a, d using the coherence diffusion filter are shown in Fig. 3.4. Inspection shows that CEDF in the level set can extract the breast tissues with definite and clear borders. The midpoint of the inframammary folds is used as the centre reference to separate the left and right breast tissues. The separated regions are classified into normal and abnormal based on the coefficients obtained from the Curvelet transform. GLCM features such as contrast, dissimilarity and difference of variance are extracted from each normal and abnormal tissue (Table 3.1).
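For reference, the parameter values reported above can be collected as follows; this is a hypothetical configuration record used only to summarize the text, not a real filter API.

```python
# Hypothetical parameter set for the CEDF stage, mirroring the values in the text.
cedf_params = {
    "sigma": 10,            # standard deviation for the edge-stopping function
    "time_step": 3,         # extent of diffusion per iteration
    "grad_smoothing": 1,    # pre- and post-process smoothing of the gradient
    "max_iterations": 100,  # iteration limit that concludes the process
}
```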
Table 3.1 The details of Curvelet sub-band distribution at each scale (Curvelet transform, three-level decomposition)

Scale   Total no. of sub-bands   Sub-bands considered for statistical operations
1       1                        1
2       16                       8
3       32                       16
Fig. 3.5 Scatter plot representation of segmented and ground truth area. (a) Gaussian filter. (b) Coherence diffusion filter
The scatter plots of segmented versus ground truth area for the Gaussian edge map and the proposed method are shown in Fig. 3.5. The normalized mean values of the GLCM features derived from the Curvelet coefficients at three different scales, for the CED-based RD and Gaussian-based RD level sets, are shown in Figs. 3.6 and 3.7. In the Gaussian-based level sets, the features derived at all three scales are higher for normal subjects and exhibit distinct variations between normal and abnormal subjects. The features derived at the second scale show the highest differentiation, which could be due to the presence of both low- and high-frequency details. In the coherence diffusion-based RD level sets, the features derived at the three scales exhibit high variation between normal and abnormal subjects. This indicates that the method performs smoothing well and extracts an edge map with distinct, sharp boundaries. The highest variations are obtained from the features derived at the second scale, followed by the first and third scales. The features derived at the first scale extract low-frequency information associated with changes in the metabolic activity of breast tissues. Similarly, the detail information represented by the features at the higher scale preserves the structural changes associated with normal and pathological conditions. The dissimilarity (DIS) feature shows high separability between normal and abnormal subjects, owing to its effective detection of the extremes of the grey values. The significance of asymmetry is analysed by evaluating the absolute difference between the features derived using the Gaussian- and coherence diffusion-based level sets, shown in Fig. 3.8. The maximum variation observed is 9% for the
Fig. 3.6 Mean feature values derived using the Gaussian-based level set at different scales
Fig. 3.7 Mean feature values derived using the coherence-based level set at different scales
Fig. 3.8 Percentage variation of features for the Gaussian- and coherence-based level sets at different scales
dissimilarity feature at the second scale and 7% at the first and third scales. The maximum variation observed is 6% for the contrast and difference of variance features.
3.4 Conclusion
In this paper, an attempt has been made to segment breast tissues using a reaction diffusion-based level set method. The edge map extracted using the coherence diffusion filter is used as the edge indicator in the level set function to segment the breast tissues. Separation of the left and right breast regions is carried out after effective removal of non-breast regions. Depending on health and pathology, the separated images are grouped into normal and abnormal tissues. GLCM features such as contrast, difference of variance and dissimilarity are extracted from the Curvelet coefficients of the segmented breast tissues. The results show that the coherence diffusion filter can extract a well-connected edge map. The breast tissues segmented using the coherence-based level set have distinct boundaries for both normal and abnormal images, and a high correlation coefficient is observed between the segmented output and the ground truth. This can be attributed to the extraction of magnitude, direction and orientation information by the coherence diffusion filter. The Curvelet transform is effective at detecting image activity in almost all directions, not only the radial, horizontal, vertical and diagonal directions. Therefore, Curvelet transform-based GLCM features are found to be significant in demarcating normal and carcinoma subjects. It is observed that these features can reveal the structural changes due to the differently vascularized nature of the various
tissues. It is found that the integration of coherence diffusion-based level set with Curvelet features could be used to identify normal and various pathological conditions in breast thermograms.
References

1. NBCF, National Breast Cancer Foundation, Inc. Available: http://www.nationalbreastcancer.org/about-breast-cancer/what-isbreast-cancer.aspx. 2010
2. Qi H, Diakides N (2007) Infrared imaging in medicine, pp 1–10
3. Lahiri BB, Bagavathiappan S, Jayakumar T, Philip J (2012) Medical applications of infrared thermography: a review. Infrared Phys Technol 55(4):221–235
4. Minkina W, Dudzik S (2009) Measurements in infrared thermography. Infrared thermography: errors and uncertainties, 1st edn. Wiley, pp 15–60
5. Prabha S, Suganthi SS, Sujatha CM (2017) Analysis of breast thermal images using anisotropic diffusion filter based modified level sets and efficient fractal algorithm. In: Cognitive computing and medical information processing, Communications in Computer and Information Science. Springer
6. Ng EYK (2009) A review of thermography as promising non-invasive detection modality for breast tumor. Int J Therm Sci 48:849–859
7. Minkina W, Dudzik S (2009) Infrared thermography: errors and uncertainties, 1st edn. Wiley, pp 15–60
8. Candes EJ, Donoho DL (2000) Curvelets: a surprisingly effective non-adaptive representation for objects with edges. Stanford University, Department of Statistics
9. Do MN, Vetterli M (2005) The contourlet transform: an efficient directional multiresolution image representation. IEEE Trans Image Process 14(12):2091–2106
10. Donoho DL (1995) De-noising by soft-thresholding. IEEE Trans Inf Theory 41(3):613–627
11. Dettori L, Semler L (2007) A comparison of wavelet, ridgelet and curvelet-based texture classification algorithms in computed tomography. Comput Biol Med 37:486–498
12. Prabha S, Sujatha CM (2018) Proposal of index to estimate breast similarities in thermograms using fuzzy C means and anisotropic diffusion filter based fuzzy C means clustering. Infrared Phys Technol 93:316–325
13. Diakides NA, Bronizino JD (2007) Detecting breast cancer from thermal infrared images by asymmetry analysis. In: Medical infrared imaging. Taylor and Francis Group, pp 11–14
14. Prabha S, Suganthi SS, Sujatha CM (2015) An approach to analyze the breast tissues in infrared images using nonlinear adaptive level sets and Riesz transform features. Technol Health Care 23(4):429–442
15. Prabha S, Sujatha CM, Ramakrishnan S (2014) Asymmetry analysis of breast thermograms using BM3D technique and statistical texture features. In: 3rd IEEE international conference on informatics, electronics & vision, Dhaka, Bangladesh
16. Kafieh R, Rabbani H (2011) Wavelet-based medical infrared image noise reduction using local model for signal and noise. In: IEEE statistical signal processing workshop, pp 549–552
17. Borchartt T, Conci A, Lima R, Resmini A, Sanchez (2013) Breast thermography from an image processing viewpoint: a survey. Signal Process 93:2785–2803
18. Wang B, Chen LL, Zhang ZY (2019) A novel method on the edge detection of infrared image. Optik 180:610–614
19. Duarte A, Carrão L, Espanha M, Viana T, Freitas D, Bartolo P, Faria P, Almeida HA (2014) Segmentation algorithms for thermal images. Proc Technol 16:1560–1569
20. Díaz-Cortés MA, Ortega-Sánchez N, Hinojosa S, Oliva D, Cuevas E, Rojas R, Demin A (2018) A multi-level thresholding method for breast thermograms analysis using Dragonfly algorithm. Infrared Phys Technol 93:346–361
21. Bhowmik MK, Gogoi UR, Majumdar G, Bhattacharjee D, Datta D, Ghosh AK (2017) Designing of ground-truth-annotated DBT-TU-JU breast thermogram database toward early abnormality prediction. IEEE J Biomed Health Inform 22(4):1238–1249
22. Zhang K, Lei Z, Huihui S, David Z (2013) Reinitialization-free level set evolution via reaction diffusion. IEEE Trans Image Process 22(1):258–271
23. Jiang M (2018) Edge enhancement and noise suppression for infrared image based on feature analysis. Infrared Phys Technol 91:142–152
24. Josephine Selle J, Shenbagavalli A, Sriraam N, Venkatraman B, Jayashree M, Menaka M (2018) Automated recognition of ROIs for breast thermograms of lateral view-a pilot study. Q InfraRed Thermogr J 15(2):194–213
25. Motta L, Conci A, Diniz E, Luís R (2010) Automatic segmentation on thermograms in order to aid diagnosis and 2D modeling. In: Proceedings of 10th workshop em Informática Médica, pp 1610–1619
26. Lipari C, Head J. Advanced infrared image processing for breast cancer risk assessment. In: Conference of the IEEE Engineering in Medicine and Biology Society, vol 2, pp 673–676
27. Scales N, Herry C, Frize M (2004) Automated image segmentation for breast analysis using infrared images. In: Conference of the IEEE Engineering in Medicine and Biology Society, pp 1737–1740
28. Qi H, Kuruganti PT, Snyder WE (2008) Detecting breast cancer from thermal infrared images by asymmetry analysis. In: The biomedical engineering handbook: medical devices and systems, 3rd edn. CRC Press, Boca Raton, pp 27.1–27.14
29. Zadeh HG, Kazerouni IA, Haddadnia J (2011) Distinguish breast cancer based on thermal features in infrared images. Can J Image Proc Comput Vis 2(6):54–58
30. Machado DA, Giraldi G, Novotny AA, Marques RS, Conci A. Topological derivative applied to automatic segmentation of frontal breast thermograms
31. Li C, Chenyang X, Changfeng G, Martin DF (2005) Level set evolution without re-initialization: a new variational formulation. Proc IEEE Conf Comput Vis Pattern Recognit 1:430–436
32. Prabha S, Sujatha CM, Ramakrishnan S (2015) Robust anisotropic diffusion based edge enhancement for level set segmentation and asymmetry analysis of breast thermograms using Zernike moments. J Biomed Sci Instrum 51:341–348
33. Prabha S, Anandh KR, Sujatha CM, Ramakrishnan S (2014) Total variation based edge enhancement for level set segmentation and asymmetry analysis in breast thermograms. In: 36th IEEE international conference on Engineering in Medicine and Biology Society (EMBS), Chicago, USA, pp 6438–6441
34. Li C, Xu C, Gui C, Martin DF (2010) Distance regularized level set evolution and its application to image segmentation. IEEE Trans Image Process 19:154–164
35. Chao SM, Tsai DM (2010) An improved anisotropic diffusion model for detail- and edge-preserving smoothing. Pattern Recogn Lett 31(13):2012–2023
36. Weickert J (1999) Coherence-enhancing diffusion of colour images. Image Vis Comp 17:201–212
37. Abd-Elmoniem KZ, Youssef A, Kadah YM (2002) Real-time speckle reduction and coherence enhancement in ultrasound imaging via nonlinear anisotropic diffusion. IEEE Trans Biomed Eng 49(9):997–1014
38. Weickert J (1999) Coherence-enhancing diffusion filtering. Int J Comp Vis 31:111–127
39. Wiecek B, Zwolenik S, Jung A, Zuber J (1998) Advanced thermal, visual and radiological image processing for clinical diagnostics. Conf Proc IEEE Eng Med Biol Soc 8(4):139–144
40. Kuruganti PT, Qi H (2002) Asymmetry analysis in breast cancer detection using thermal infrared images. Conf Proc IEEE Eng Med Biol Soc 2:1129–1130
41. Acharya UR, Ng EYK, Tan H, Sree V (2012) Thermography based breast cancer detection using texture features and support vector machine. J Med Syst 36(3):1503–1510
42. Mahnaz T, Caro L, Saeed S, Ng E (2010) Analysis of breast thermography using fractal dimension to establish possible difference between malignant and benign patterns. J Healthcare Eng 1(1):27–43
43. Serrano R, Ulysses C, Ribeiro J, Lima RCF (2010) Using Hurst coefficient and lacunarity for diagnosis of breast diseases considering thermal images. Conf Proc Syst Signals Image Process:550–553
44. Koay J, Herry C, Frize M (2004) Analysis of breast thermography with an artificial neural network. Conf Proc IEEE Eng Med Biol Soc 1(1):1159–1162
45. Tavakol ME, Ng EYK, Chandran V, Rabbani H (2013) Separable and nonseparable discrete wavelet transform based texture features and image classification of breast thermograms. Infrared Phys Technol 61:274–286
46. Prabha S, Suganthi SS, Sujatha CM (2015) Differentiation of breast abnormalities in infrared images using Riesz and quaternion Hilbert transform based features. Int J Biomed Eng Technol 19(3):255–265
47. Gogoi UR, Majumdar G, Bhowmik MK, Ghosh AK (2019) Evaluating the efficiency of infrared breast thermography for early breast cancer risk prediction in asymptomatic population. Infrared Phys Technol 99:201–211
48. Abdel-Nasser M, Moreno A, Puig D (2019) Breast cancer detection in thermal infrared images using representation learning and texture analysis methods. Electronics 8(1):100
49. Tavakol ME, Chandran V, Ng EYK, Kafieh Z (2013) Breast cancer detection from thermal images using bispectral invariant features. Int J Therm Sci 69:21–36
50. Starck JL, Candes E, Donoho DL (2002) The curvelet transform for image denoising. IEEE Trans Image Process 11(6):670–684
51. AlZubi S, Islam N, Abbod M (2011) Multiresolution analysis using wavelet, ridgelet, and curvelet transforms for medical image segmentation. J Biomed Imaging 4
52. Francis SV, Sasikala M, Saranya S (2014) Detection of breast abnormality from thermograms using curvelet transform based feature extraction. J Med Syst 38(4):1–9
53. Bhadauria HS, Dewal ML (2013) Medical image denoising using adaptive fusion of curvelet transform and total variation. Comput Electr Eng 39(5):1451–1460
54. Motta L, Conci A, Lima R, Diniz E, Luis S (2010) Automatic segmentation on thermograms in order to aid diagnosis and 2D modeling. In: Proceedings of 10th workshop em Informática Médica, pp 1610–1619
55. Lofstedt T, Brynolfsson P, Asklund T, Nyholm T, Garpebring A (2019) Gray-level invariant Haralick texture features. PLoS One 14(2)
4 Lung Cancer Diagnosis Based on Image Fusion and Prediction Using CT and PET Image

J. Dafni Rose, K. Jaspin, and K. Vijayakumar
Abstract
Capturing analytical data from the fusion of medical images is a demanding and emerging area of exploration. The vast range of applications of image fusion demonstrates its importance in the examination of diseases and in surgical planning. This work intends to establish a simple as well as effective fusion routine to examine lung cancer. A methodology is developed to strengthen lung tumour examination for mass screening, in which CT (computerized tomography) and PET (positron emission tomography) images are fused effectively. The existing system automatically differentiates lung cancer in PET/CT images using framework study, for which Fuzzy C-Means (FCM) was developed successfully. The pre-processing approach strengthens the certainty of cancer revelation, and semantic operations implement authentic lung ROI extraction. The elementary problem, however, is that directionality and phase information cannot be resolved. To defeat this problem, the Dual-Tree Complex Wavelet Transform (DTCWT) is used in the proposed model as the method for image fusion. When the DTCWT is compared with the Discrete Wavelet Transform (DWT), the fusion outcome is much improved; it also improves the PSNR, entropy and similarity values. Segmentation supports the positive evaluation of the outline of the cancer cells and spots the exact location of those cells. The prediction of lung cancer is done by the decision tree algorithm. The proposed system presents a region growing technique for segmentation, and the CT, PET and fused images are segmented.
J. Dafni Rose · K. Jaspin · K. Vijayakumar (*) St. Joseph's Institute of Technology, Chennai, Tamil Nadu, India
Keywords
CT/PET images · DTCWT · NSCT · Region growing · Prediction · Tumour examination
4.1 Introduction
The two types of cancer commonly seen in humans are small cell lung cancer and non-small cell lung cancer. The symptoms can be spotted quite easily in the earlier stages, when a person starts coughing up a lot of blood, has intermittent chest pain, and experiences weight loss and a complete loss of appetite. Other symptoms include feeling tired all the time and having trouble breathing after long, strenuous physical activity. Identifying these symptoms as early as possible can improve the survival rate from 15% to 50%. This is still not ideal and can be greatly improved upon to present better odds of survival. Some of the commonly used techniques to detect these tumour cells in the early stages of growth include image generation by X-rays, computed tomography (CT) scans, and magnetic resonance imaging (MRI). Of these techniques, the CT scan is the most recommended method; it produces 3D images of the lungs almost instantaneously. There could be a greater reduction in the number of deaths associated with this disease by providing immediate treatment to patients, mainly because early detection of cancer plays an important role in preventing cancer cells from multiplying and spreading to other regions. The present techniques of lung cancer detection have low accuracy and are simply inadequate given the number of variations and cases of cancer that keep coming up every day. We thus have to adopt new methods of cancer detection and treatment to meet the changing times.

Lung cancer is a carcinogenic inflammation that develops in human beings due to their environment. The main causes of lung cancer in males and females are cigarette smoking and alcohol consumption. Lung cancer has the highest death toll in developed countries such as the United States. A recent survey by the Centers for Disease Control and Prevention has stated that there have been more deaths due to lung cancer than breast, prostate, and colon cancer combined. In the same survey, it was found that around 40% of people diagnosed with lung cancer had the disease reach a developed state by the time of diagnosis; about one-third of those people had been diagnosed at stage 3 of this disease. According to the American Cancer Society, more than 80% of lung cancers are discovered to be non-small cell lung cancer (NSCLC). The remaining 10–15% are diagnosed as small cell lung cancer (SCLC). These are two different types of lung cancer that require different treatments. While survival rates vary, stage 3 lung cancer is treatable. Many factors affect an individual's outlook, including the stage
of cancer, treatment plan, and overall health. To observe the cancer in a better way, the system takes both CT (computerized tomography) and PET (positron emission tomography) images as inputs and fuses them using the dual-tree complex wavelet transform. CT lung screening is a harmless scanning technique that uses mild X-rays to screen the lungs for cancer in less than a minute. This scan allows the radiologist to examine different cross-sections of the lungs with the help of a moving X-ray beam. A spiral computed tomography (CT) scanner can detect smaller lumps and cancerous growths than standard chest X-rays. The growths detected by the system can be benign (noncancerous) or malignant (cancerous). In case an identified tumour growth is found to be malignant, it can be removed easily, improving the patient's survival rate. A CT scan image contributes information about the bone structure of the human body, whereas a PET scan image delivers information about the flow of blood in the human body at low spatial resolution.

Positron emission tomography (PET) is another advanced technique used for cancer-cell identification. With the help of a radioactive tracing agent, tissues and cells can easily be viewed even at the molecular level. When a PET scan is performed thoroughly, it can help to study simple body functions such as the blood pressure and the oxygen and sugar levels of a person. This helps us understand how certain organs of the body function, and the results enable the doctor to see the functioning of specific organs. For lung issues, the doctor can look specifically at the lung area while interpreting the PET scan images. It is standard practice to combine a lung PET scan with a CT scan, and lung cancer is thus easily detected in the earlier stages by viewing the increasing metabolic activity of different regions of the body. This process of identification is referred to as image fusion. The stages of lung cancer can also be identified with the help of a lung PET scan. Lung cancer tumours are tissues with a comparatively higher metabolic rate (higher energy usage) and tend to stand out on the PET scan. Usually, when a cancerous growth is identified, a threat level ranging from 0 to 4 is assigned to it, indicating the stage of the cancer. These levels indicate how developed the cancer tumours are: level 4 means that the cancer has reached its fully developed state, and level 0 means that it is in its earlier stages of growth. To put it plainly, a person identified to have stage 4 of the disease will face greater risks in terms of survival, but a person with stage 0 will have a better chance of survival. Based on the results of the lung PET scan, a doctor can determine the most appropriate course of treatment.

Fusion is the mechanism in which two or more images are united into one unique image called the fused image [1–5]. The fused image incorporates more detailed and significant information than the isolated input images, while the influential information of the original images is retained. The image fusion technique is very important in digital image processing. Data fusion can be divided into three levels: pixel-level fusion, feature-level fusion, and decision-level fusion. Image fusion is an important part of the
medical field for diagnosing diseases in the human body so that practical treatment can be enforced. To inspect the cancer cells and to fuse the resulting image, the wavelet is used. A prediction technique is also used to detect whether a person is likely to be affected by cancer based on their habits. For the clustering process of the image, the k-means clustering algorithm is used.
4.1.1 Related Works
Sara Lioba Volpi has developed an automatic system for the detection of malignant tumours in the lungs by mediastinum segmentation. The mediastinum borders specify the regions along the lungs' borders through which restorative areas pass and which contain the lymph nodes [6]. Paul Miller has developed a system for tumour segmentation in the lungs from PET/CT images using an image classification technique, applied to a large number of scanned PET/CT images of people affected with malignant tumours [7]. New York State had the fourth highest death rate from breast cancer in 1995–1999, though it was 17th in colorectal cancer and 39th in lung cancer [8, 9]. Jinman Kim developed a system to find the textural appearance used in differentiating tumours as normal or abnormal. The system can be applied only to PET or CT images separately; they cannot be processed at the same time. Localization of the disease is easy in this method [10]. Julien Work proposed medical imaging used for the detection of tumours and also analysis for treatment. This system can be applied to a large number of images; the most challenging task is the manual partition of 3D images [11]. Ashnil Kumar has proposed a graph-based algorithm which facilitated the regeneration of volumetric CT/PET images through the examination of the region of interest and the separation of 3D images. These specifications were indexed through a graph representation to find the similarity of images based on the structural measures of tumours [12]. Marhaban and A. J. Nordin have developed a system combining PET and computed tomography (CT), which merges the objective and practical details of the patients. This method is most commonly used in the diagnosis of cancer and gives efficient results compared with other systems [13]. Gergely Szabo developed an automatic system based on lung nodules in PET-CT images; the area of the nodules can be found using the foreground-to-background mean ratio [14]. Hui Cui and Xiueing Wang developed a system which separates the malignant tumours in the lungs from the normal tissues. The intensity values are the same for both normal and abnormal tissues; this system is used to obtain the functional information from PET and the intensity information from CT [15]. J. Garcia, J. Fdez-Valdivia, and F. Cortijo have developed a system built on a novel algorithm for performing k-means clustering. Its main feature is that all prototypes are viewed as potential candidates for the closest prototype at the root level itself, which improves the computational speed of the k-means algorithm [16]. R. T. Ng and J. Han have developed a system based on spatial data mining and characteristics which exist implicitly in the spatial
database. CLARANS, the most efficient of these methods, is used here [17]. To eliminate the possibility of getting various contradicting solutions to a single problem during diagnosis, Vijayakumar K., Pradeep Mohan Kumar K., and Daniel Jesline proposed replacing a single regular Agent oriented Approach (AoA) with Intelligent Artificial Agents that act like humans and dynamically make smart decisions with an intelligent searching approach (ISA) [18]. An advanced approach is obtained that maintains minimal time usage and intercommunication between the agents without compromising the perfection of the solution. Ries LAG et al. describe the statistics of cancer reviews from 1975 to 2002, covering more than 25 types of cancer [8]. Vinod Jagannath Kadam, Shivajirao Manikrao Jadhav, and Vijayakumar K. proposed a feature ensemble learning model based on Stacked Sparse Autoencoders and Softmax Regression (FE-SSAE-SM model) for early breast cancer diagnosis [9]. This system allowed breast cancers to be detected in their early stages and subjected to treatment before advancing to untreatable stages.
4.2 Methodology
4.2.1 Pre-processing for CT and PET Image
Pre-processing is executed to improve the quality of the image. Here, pre-processing is performed for both the CT and PET images, since both images are going to be used and only better-quality images should be passed on. The intent of pre-processing is an improvement of the image data that suppresses undesirable distortions or strengthens the image features that will be used for further processing. The CT image is taken as the input, and the consecutive steps described in the diagrams are applied. The average filter is used for denoising, and region growing is used for segmentation; the segmented and filtered images constitute the output. Several pre-processing techniques are used to improve image quality, for example applying filters to smooth and denoise the image and performing image normalization; apart from the histogram, normalization is also an acceptable way to begin. In this module, the average filter is used to smooth the image. Before applying the average filter, the image needs to be converted from RGB to grayscale. A true-colour image can be transformed into a grayscale image by preserving the luminance of the image.

1. Smoothing: Smoothing is a procedure regularly used to reduce the noise in an image or to produce a less pixelated image. Most smoothing methods are based on low-pass filters; smoothing can also be done with average and median values. Here, smoothing is performed to weaken the noise in both the CT and PET images.
Fig. 4.1 Pre-processing for CT image: CT image → convert RGB to grayscale → smoothing → enhancement → filtered image
Fig. 4.2 Pre-processing for PET image: PET image → convert RGB to grayscale → smoothing → enhancement → filtered image
2. Enhancement: Image enhancement is another method to boost image quality. It is done by adjusting the digital images so that the results are suitable for display or further image analysis. For example, noise can be eliminated, the image sharpened, or the image brightened, making it easier to pinpoint indispensable features. Here, the average filter is used to smooth and enhance the image.

3. Average filter: The average filter is regularly used to perform both smoothing and enhancement, and it is used to clear away the noise from both images. It works by taking an input vector of values and estimating an average for each value in the vector. The output vector and input vector are equivalent in size and shape (Figs. 4.1 and 4.2).
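A minimal sketch of this pre-processing stage, assuming OpenCV: cv2.blur implements the mean (average) filter described above, and the 3×3 kernel size is an assumption for the sketch.

```python
import cv2

def preprocess(image_bgr, ksize=3):
    """Sketch of the pre-processing stage: grayscale conversion plus average filtering."""
    # Convert the true-colour image to grayscale, preserving the luminance
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    # Average (mean) filter: each output pixel is the mean of its ksize x ksize neighbourhood
    filtered = cv2.blur(gray, (ksize, ksize))
    return filtered
```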
4.2.2 Dual-Tree Complex Wavelet Transform Algorithm
1. Wavelet: A wavelet is comprehensively distinct from a curvelet and is very appropriate here. It typically starts at zero, increases, and then decreases back to zero; it can occasionally be pictured as a "brief oscillation" like one recorded by a seismograph or a heart monitor. Broadly, wavelets are purposely constructed to have definite features that make them very favourable for signal processing. Among the many wavelet transforms, we use the dual-tree complex wavelet transform (DTCWT).

2. Introduction to the dual-tree complex wavelet transform (DTCWT): The DTCWT is a recent improvement of the discrete wavelet transform (DWT). The DTCWT has some supplementary properties, namely shift-invariance, directionality, and phase information.

• Shift-invariance or sensitivity: if the input image or signal shifts from one region to another, it should not cause any unreliable variation in the result.
Fig. 4.3 DTCWT. LL means Low Low, HL means High Low, LH means Low High, HH means High High
• Directionality: it is directionally selective in two and higher dimensions. It accomplishes this with the repetition factor of the 2D image.
• Phase information: it is very practical for image processing tasks such as edge and corner detection, and phase information is not distressed by noise.

These are the main reasons why the DTCWT is used: it upholds the important data of the original image (Fig. 4.3). Only the LL (Low Low) sub-band is decomposed into further levels. Here the detailed information is achieved at the first level itself, so the decomposition is halted at the level 1 tree; the tree can be decomposed up to level 6.

3. Image fusion and segmentation: Image fusion is the process that combines the information in multiple images of the same scene. These images may be captured from different sensors, acquired at different times, or have different spatial and spectral characteristics. For image fusion, the NSCT algorithm is used. The contourlet transform is a multidirectional and multiscale transform, constructed by joining the Laplacian pyramid and the directional filter bank (DFB), which can be used to capture the geometrical properties of images. The processes of upsampling and downsampling, however, are not good for shift invariance, so to defeat this hindrance the input images are decomposed using the non-subsampled contourlet transform (NSCT), which leads to better shift invariance. NSCT is a flexible, multiscale, multi-direction, and shift-invariant image decomposition that can be efficiently implemented in image fusion. It is the fully shift-invariant interpretation of the contourlet transform, using a pyramid filter and a directional filter bank (DFB). NSCT eliminates the downsampling and upsampling: it splits the image into a low-pass sub-band and a high-pass sub-band, and the DFB decomposes the high-pass sub-band into assorted directional sub-bands (horizontal, vertical, diagonal). This is repeated on the low-pass sub-band, and the result is the non-subsampled contourlet transform.
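The following sketch shows a DTCWT-based fusion in the spirit described above, assuming the open-source Python dtcwt package (its Transform2d/Pyramid API); the fusion rule (average the lowpass bands, keep the larger-magnitude highpass coefficient) is a common illustrative choice, not necessarily the chapter's exact rule.

```python
import numpy as np
import dtcwt  # assumed: the PyPI `dtcwt` package

def fuse_dtcwt(ct_img, pet_img, nlevels=3):
    """Sketch of DTCWT-based CT/PET fusion with a max-magnitude highpass rule."""
    t = dtcwt.Transform2d()
    ct = t.forward(ct_img.astype(float), nlevels=nlevels)
    pet = t.forward(pet_img.astype(float), nlevels=nlevels)
    # Average the lowpass (LL) approximations of the two modalities
    lowpass = (ct.lowpass + pet.lowpass) / 2.0
    # In each directional subband, keep the coefficient with the larger magnitude
    highpasses = tuple(
        np.where(np.abs(hc) >= np.abs(hp), hc, hp)
        for hc, hp in zip(ct.highpasses, pet.highpasses)
    )
    # Reconstruct the fused image from the combined pyramid
    return t.inverse(dtcwt.Pyramid(lowpass, highpasses))
```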
Fig. 4.4 Image fusion: filtered CT and PET images → wavelet transform (DTCWT) → fusion rules → fused image → segmentation → segmented image
Subsampling means a reduction of the data to a level convenient for the new sampling rate. Segmentation is accomplished in this system for the exact localization and characterization of the cancer. A region growing algorithm, a simple image segmentation procedure carried out on a region basis, is used in this system. It involves the selection of initial seed points and follows a segmentation procedure that segments images on a pixel basis (Fig. 4.4). This approach to segmentation analyses the pixels neighbouring the initial seed points and resolves whether each neighbouring pixel should be added to the region; the procedure is iterated over the images, in the same manner as universally accepted data clustering algorithms; a minimal sketch is given below. Co-registration may be enabled using the spatial information of the brief arterial phase for localization during a procedure if a tumour only presents during arterial phase imaging. Physicians are able to use off-line prior imaging in the procedure room with the help of image registration. The CT space has provisions for any imaging data set to be registered, which can then be used to guide robotic needle placements for point-and-click tumour destruction [19]. This can be really useful in the future when tumour-specific and cell-specific contrast agents are developed. Fusion may also make it possible to biopsy metabolically active regions of a tumour, facilitating a more accurate biopsy. It can also result in gaining information on the temporal and spatial evolution of a tumour's genomic or proteomic profile, which is helpful in tailoring patient-specific drug regimens (Fig. 4.5).

4. Clustering: Clustering is one of the most common techniques of analysis applied to extract information from data, providing an idea of the structure of the data. In simple terms, clustering means dividing a given set of patterns into parts or groups. Patterns in the same cluster are meant to be similar, and patterns in different clusters are meant to be dissimilar. The objective is to try to locate homogeneous subgroups within the data sets, with the data points or patterns in each cluster being as similar as possible under a previously defined similarity measure.
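Returning to the region growing step described above, here is a minimal sketch; the 4-connectivity and the fixed intensity tolerance are assumptions for illustration, not the chapter's exact criteria.

```python
import numpy as np
from collections import deque

def region_grow(image, seed, tol=10):
    """Grow a region from a seed pixel, absorbing 4-connected neighbours
    whose intensity is within `tol` of the seed intensity."""
    h, w = image.shape
    grown = np.zeros((h, w), dtype=bool)
    seed_val = float(image[seed])
    queue = deque([seed])
    grown[seed] = True
    while queue:
        y, x = queue.popleft()
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if 0 <= ny < h and 0 <= nx < w and not grown[ny, nx]:
                if abs(float(image[ny, nx]) - seed_val) <= tol:
                    grown[ny, nx] = True
                    queue.append((ny, nx))
    return grown
```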
Fig. 4.5 System architecture of image fusion and segmentation
The similarity measure used is application-specific and differs from process to process. Clustering has many applications in the organization of image databases, especially for image segmentation and pattern prediction, since their human interface is a set of images and therefore proves very compatible with the clustering algorithms used. Some of the related works are RHC99, MK01, WWFW97, SMM, GGG02, and some recent works in the field. Clustering can also be used to segment a movie, for example [GFT98], and to ease image data mining, for example searching for stimulating partitions of medical images [THI00]. [BDF02] performs unsupervised clustering and also uses extra information. When non-overlapping clusters are present, with each instance belonging to one cluster and one cluster only, we have hard clustering, whereas with soft clustering single individuals can belong to more than one cluster. Clustering is primarily used for the analysis of data in machine learning: the analysed data is divided into groups of similar data called clusters. Cluster analysis looks at algorithms that can identify clusters automatically, for example hierarchical and partitional clustering algorithms. Algorithms that break up the data as a hierarchy of clusters are termed hierarchical clustering algorithms; algorithms that produce mutually disjoint partitions by dividing the data set are called partitional algorithms. There are many different ways to do this partitioning, based on distinct models, with distinct algorithms applied to every model, differing in their characteristics and results. These models are distinguished by their organization and the type of relationships between them. Based on the cluster model analysed, many clustering algorithms can be used to separate the information in a data set. The most important processes are discussed here. Every method
has its advantages and disadvantages, and it is important to mention them here. The choice of algorithm will always depend on the quantity and quality of the data set and what we want to do with it. Detecting distinct kinds of patterns in image data is one of the important applications of clustering to medical images. This can be very effective and efficient in biology research for distinguishing objects and identifying patterns. Another important use is the separation of medical exams. Such a system can be utilized to carry out the analysis of the personal data of patients or people, combined with shopping, location, interests, actions, and an infinite number of indicators; the result of the analysis will provide very important information and trends. Climatology, robotics, recommender systems, and mathematical and statistical analysis are other venues for applying clustering algorithms. Generally, the spectrum of utilization of clustering algorithms is broad. Clustering analysis does not possess a ground truth on which to base the evaluation of model performance, as supervised learning models do; there is no solid evaluation metric that can be used to compare the results of different clustering algorithms. Additionally, k-means requires k as an input; k is not learnt from the data, which means there is no right answer as to how many clusters there should be in any problem. We can evaluate how well the models are performing based on different K clusters in the cluster-predict methodology, since the clusters are used in downstream modelling.

5. k-means clustering algorithm: the k-means algorithm can be used to partition the input data set into k partitions called clusters. K-means clustering is used for unlabelled data lacking defined categories. The algorithm aims to find K groups in the data, where K is a variable, and iteratively assigns each data point to one of the K groups based on the features provided, resulting in clusters derived from feature similarity. The main idea is to define k centres, one for each cluster. Each centroid of a cluster is a collection of feature values that defines the resulting group; examining the centroid feature weights can be used to qualitatively interpret what kind of group each cluster represents. Each point belonging to the given data set is then taken and connected to the nearest centre, and this is repeated until no point is awaiting a decision. After the first step is complete and an early grouping is done, k new centroids are recalculated as barycentres of the clusters obtained. Upon deriving k new centroids, a new binding is carried out between the same data set points and the nearest new centre, and a loop is obtained. As a result of this loop, we may notice that the k centres change their location step by step until no more changes are made, in other words until the centres do not move anymore. Finally, this algorithm aims at minimizing an objective function known as the squared error function, which is given by
$$J(V) = \sum_{i=1}^{c} \sum_{r=1}^{c_i} \left(\| x_i - v_r \|\right)^2 \qquad (4.1)$$
where ‘||xi − vr||’ is the Euclidean distance between xi and vr, ‘ci’ is the number of data points in the ith cluster, and ‘c’ is the number of cluster centres.

Algorithmic steps for k-means clustering: let A = {a1, a2, a3, . . ., an} be the set of data points and C = {c1, c2, . . ., cv} be the set of centres.

1. Randomly select ‘c’ cluster centres.
2. Calculate the distance between each data point and the cluster centres.
3. Assign each data point to the cluster centre whose distance from it is the minimum over all the cluster centres.
4. Recalculate the new cluster centre using

$$v_i = \frac{1}{c_i} \sum_{j=1}^{c_i} x_j \qquad (4.2)$$

where ‘ci’ denotes the number of data points in the ith cluster.
5. Recompute the distance between each data point and the newly obtained cluster centres.
6. If no data point was reassigned, then stop; otherwise repeat from step 3.

A minimal implementation of these steps is sketched below. There are two main approaches described especially for the distance calculations:

1. One method uses the information from the previous iteration to reduce the number of distance calculations. CLUSTER is a k-means-based clustering algorithm that makes full use of the fact that the change in the allocation of patterns to clusters is relatively small after the first few iterations. Using a simple check, it determines whether the closest prototype of a pattern has changed; distance calculations are not carried out if the assignment has not changed. After the first few iterations, the movement of the cluster centroids between consecutive iterations is also small.
2. Another method arranges the prototype vectors in an appropriate data structure so that finding the closest prototype for a given pattern is more efficient. This reduces the problem to calculating the nearest neighbour of a given pattern in the prototype space, making the number of distance calculations performed per iteration proportional. Prototype vectors for vector quantization and many other applications are fixed, allowing for the construction of optimal data structures that, for a given input test pattern, evaluate the closest vector.
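A plain NumPy sketch of the basic k-means loop above (Eqs. 4.1 and 4.2); the initialization and stopping details are illustrative.

```python
import numpy as np

def kmeans(data, k, n_iter=100, seed=0):
    """Minimal k-means following the algorithmic steps listed above."""
    rng = np.random.default_rng(seed)
    # Step 1: randomly select k cluster centres from the data points
    centres = data[rng.choice(len(data), size=k, replace=False)]
    labels = np.zeros(len(data), dtype=int)
    for _ in range(n_iter):
        # Steps 2-3: assign each point to its nearest centre (Euclidean distance)
        dists = np.linalg.norm(data[:, None, :] - centres[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Step 4: recompute each centre as the mean of its assigned points (Eq. 4.2)
        new_centres = np.array([
            data[labels == i].mean(axis=0) if np.any(labels == i) else centres[i]
            for i in range(k)
        ])
        # Steps 5-6: stop when no centre moves any more
        if np.allclose(new_centres, centres):
            break
        centres = new_centres
    return centres, labels
```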
$$p(|\nabla\varphi|) = \begin{cases} \dfrac{1}{(2\pi)^2}\left(1 - \cos(2\pi|\nabla\varphi|)\right), & \text{if } |\nabla\varphi| \le 1 \\[4pt] \dfrac{1}{2}\left(|\nabla\varphi| - 1\right)^2, & \text{if } |\nabla\varphi| \ge 1 \end{cases} \qquad (5.3)$$
where p has two minimum points, at |∇φ| = 0 and |∇φ| = 1. The distance regularized term has a forward and backward diffusion effect, which maintains the desired shape near the zero level set. A large time step ensures numerical stability by reducing the number of iterations and the complexity [22]. An external energy functional is used to move the zeroth level set curve towards the object boundary [19] and is given by

$$E_{ext}(\varphi) = \lambda_{LS}\,\mathcal{L}_g(\varphi) + \nu\,\mathcal{A}_g(\varphi) \qquad (5.4)$$

where λLS > 0 and ν are constants and the terms ℒg(φ) and 𝒜g(φ) represent the length integral and the area integral, respectively. The energy minimization technique integrated into the variational level set framework for the segmentation [22] is given by

$$\frac{\partial\varphi}{\partial t} = \mu\,\mathrm{div}\left(d_p(|\nabla\varphi|)\,\nabla\varphi\right) + \lambda\,\delta_\varepsilon(\varphi)\,\mathrm{div}\left(g\,\frac{\nabla\varphi}{|\nabla\varphi|}\right) + \nu\,\delta_\varepsilon(\varphi)\,g \qquad (5.5)$$
where μ > 0 and λ > 0 are constants, δε is the Dirac function, ν is the variable that controls the speed of the contour, g is the edge indicator and dp is the distance regularization term, which provides stability in the level set evolution.
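For illustration, the distance-regularization factor dp(s) = p′(s)/s induced by the double-well potential of Eq. (5.3) can be evaluated as follows; this is a sketch of the standard DRLSE term under that assumption, not the authors' code.

```python
import numpy as np

def dp(s):
    """Distance-regularization factor d_p(s) = p'(s)/s from the double-well
    potential of Eq. (5.3); it yields the forward/backward diffusion effect."""
    s = np.asarray(s, dtype=float)
    out = np.ones_like(s)                 # limit of the sinc-like term as s -> 0
    small = (s > 0) & (s <= 1.0)
    large = s > 1.0
    # p'(s) = sin(2*pi*s) / (2*pi) for s <= 1, and (s - 1) for s >= 1
    out[small] = np.sin(2 * np.pi * s[small]) / (2 * np.pi * s[small])
    out[large] = (s[large] - 1.0) / s[large]
    return out
```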
5.2.3 Weighted Level Set Evolution
The weighted level set evolution [23] uses a weighting function that depends on edge intensities and edge orientation. The gradient vector flow (GVF) field of the image is employed to measure the local edge orientations. The weighting function controls the effect of the length and area terms of the external energy functional, and it is obtained using Eq. (5.6):

ω(ϕ, k) = I(ϕ, k)^(1 − γ(ϕ, k))    (5.6)
where I ∈ [0, 1] denotes the average edge intensity, and γ ∈ [−1, 1] denotes the average difference between the direction of the image's GVF and the normal direction of movement of the contour C. γ indicates the presence of edges and makes the level set contour capture the desired boundary, and k is a constant that determines the size of the region adjacent to C from which the local edge features are extracted. Although the weighting function depends on the parameters ϕ and k, its value lies in the range [0, 1]. The energy minimization integrated into the variational level set framework for segmentation is given by

∂φ/∂t = μ div(d_p(|∇φ|)∇φ) + (1 − ω(ϕ, k)) λ δ_ε(φ) div(g ∇φ/|∇φ|) + ω(ϕ, k) ν δ_ε(φ) g    (5.7)
where μ > 0 and λ > 0 are constants, δ_ε is the Dirac function, ν is the variable that controls the speed of the contour, g is the edge indicator and d_p is the distance regularization term, which provides stability in the level set evolution. The edge indicator function g in Eqs. (5.5) and (5.7) is defined as

g = 1 / (1 + |∇G_σ * I|²)    (5.8)
where G_σ is a Gaussian function with standard deviation σ, I is a given 2D image, the symbol * represents convolution, ∇ is the gradient operator and |·| is the modulus of the smoothed image gradients. The level set method uses this intensity-based edge function: the level set function ϕ stops moving as the image grey-level gradient becomes large and g approaches zero. A small value of σ is sensitive to noise and results in unstable evolution, while a larger value of σ leads to boundary leakage and an inaccurately extracted boundary. To overcome this, phase-based edge detection is also attempted in this paper. The Fourier components of an image are maximally in phase along edges and less so elsewhere. Maximum phase congruency corresponds to peaks in the local energy function, and the two are related by
E(x) = PC(x) Σ_n A_n    (5.9)
where A_n are the Fourier amplitude coefficients, PC is the phase congruency and E is the energy function. The phase congruency is obtained by rearranging:

PC(x) = E(x) / (ε + Σ_n A_n)    (5.10)
As ∑_n A_n(x) becomes small, the expression becomes unstable; to prevent this instability, a small constant ε of order 0.01 is added. To obtain the features in all orientations, PC(x) is computed by summing over all orientations:

PC(x) = Σ_o W_o(x) ⌊E_o(x) − T_o⌋ / Σ_o Σ_n (A_no(x) + ε)    (5.11)
where o indexes the orientation, E is the energy, W is the weighting function that reduces phase congruency at narrow filter response regions, T_o is the noise threshold along the orientation, A_no are the amplitudes of the frequency components of S(x), ε is a small constant, and ⌊·⌋ denotes that the enclosed quantity is kept when positive and set to zero otherwise. This phase congruency value is used as the edge detector g in the level set contour evolution.
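A minimal sketch of the Gaussian edge indicator of Eq. (5.8) is shown below; the σ value is an illustrative assumption, and the phase congruency map of Eq. (5.11), computed with a bank of oriented filters, could be substituted for the returned array as the chapter proposes.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def edge_indicator(image, sigma=1.5):
    """g = 1 / (1 + |grad(G_sigma * I)|^2), Eq. (5.8)."""
    smoothed = gaussian_filter(image.astype(float), sigma=sigma)
    gy, gx = np.gradient(smoothed)        # image gradients after smoothing
    return 1.0 / (1.0 + gx**2 + gy**2)    # close to 0 on strong edges
```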
5.2.4 Validation of Segmented Results
The efficiency of segmentation algorithms can be quantitatively analysed by computing metrics that describe how similar or dissimilar the regions of the segmented image are to the expected image. The expected image is the ground truth image, a reference image manually segmented by a radiologist or a trained professional. The performance of the proposed segmentation frameworks on breast thermal images is evaluated quantitatively using similarity metrics based on geometry and overlap measures. These measures take values between 0 and 1; a high similarity measure indicates good agreement between the segmented and ground truth images.

Regional statistics measures consider the number of pixels segmented as ROI and non-ROI. Four counts are calculated: true positives (TP), the pixels correctly segmented as ROI; true negatives (TN), the pixels correctly identified as non-ROI; false positives (FP), the pixels incorrectly identified as ROI; and false negatives (FN), the pixels incorrectly identified as non-ROI. Based on these values, the statistical measures accuracy, sensitivity, specificity, positive predictive rate (PPR) and negative predictive rate (NPR) are calculated. Accuracy is defined as the ratio of the number of correctly identified pixels to the total number of pixels, and it gives the degree of similarity between the ground truth and segmented images.
Sensitivity measures the proportion of positive results among all actual positives, and specificity measures the proportion of negative results among all actual negatives [24]. These measures are defined as follows:

Accuracy = (TP + TN)/(FN + FP + TN + TP)    (5.12)
Sensitivity = TP/(TP + FN)    (5.13)
Specificity = TN/(TN + FP)    (5.14)
PPR = TP/(TP + FP)    (5.15)
NPR = TN/(TN + FN)    (5.16a)
To condense these regional statistics measures, indices derived from them, namely the EFI, Youden and ROI indices, are used for validation. The segmented outputs validated against the ground truth using the EFI, Youden and ROI indices [25] are calculated using:

EFI = ½ (SEN + ESP)    (5.16b)
Y-index = SEN + ESP − 1    (5.17)
ROI index = 100 × (1 − ACC × PPR × NPR)    (5.18)
The indices are combinations of the regional statistics measures accuracy (ACC), sensitivity (SEN), specificity (ESP), positive predictive rate (PPR) and negative predictive rate (NPR). Overlap measures are determined by identifying the intersecting and non-intersecting regions of the segmented results and ground truth images. The classic measures commonly used are the Jaccard coefficient (JC), Tanimoto coefficient (TM), Dice similarity coefficient (DC) and volume similarity (VS) indices [24]. If X_V represents the segmented result and Y_V represents the ground truth image, these measures are defined as follows:

JC = |X_V ∩ Y_V| / |X_V ∪ Y_V|    (5.19)
DC = 2|X_V ∩ Y_V| / (|X_V| + |Y_V|)    (5.20)
TM = (|X_V ∩ Y_V| + |∁(X_V ∪ Y_V)|) / (|X_V ∪ Y_V| + |∁(X_V ∩ Y_V)|)    (5.21)
VS = 1 − ||X_V| − |Y_V|| / (|X_V| + |Y_V|)    (5.22)

where ∁(·) denotes the complement of a set within the image domain.
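These regional and overlap measures are straightforward to compute from binary masks. The sketch below is illustrative: it assumes Boolean arrays of equal shape and implements Eqs. (5.12)–(5.22), with TM in the complement-including form reconstructed in Eq. (5.21); degenerate masks (empty ROI) would need guards against division by zero.

```python
import numpy as np

def validation_metrics(seg, gt):
    """Regional and overlap measures from binary masks (Eqs. 5.12-5.22).
    seg, gt: arrays of the same shape, truthy inside the ROI."""
    seg, gt = seg.astype(bool), gt.astype(bool)
    tp = np.sum(seg & gt)      # correctly segmented ROI pixels
    tn = np.sum(~seg & ~gt)    # correctly identified non-ROI pixels
    fp = np.sum(seg & ~gt)     # pixels wrongly marked as ROI
    fn = np.sum(~seg & gt)     # ROI pixels that were missed
    acc = (tp + tn) / (tp + tn + fp + fn)
    sen = tp / (tp + fn)       # sensitivity
    esp = tn / (tn + fp)       # specificity
    ppr = tp / (tp + fp)
    npr = tn / (tn + fn)
    return {
        "accuracy": acc, "sensitivity": sen, "specificity": esp,
        "PPR": ppr, "NPR": npr,
        "EFI": 0.5 * (sen + esp),
        "Y-index": sen + esp - 1,
        "ROI-index": 100 * (1 - acc * ppr * npr),
        "JC": tp / (tp + fp + fn),                     # Jaccard, Eq. (5.19)
        "DC": 2 * tp / (2 * tp + fp + fn),             # Dice, Eq. (5.20)
        "TM": (tp + tn) / (tp + tn + 2 * (fp + fn)),   # Tanimoto, Eq. (5.21)
        "VS": 1 - abs((tp + fp) - (tp + fn)) / ((tp + fp) + (tp + fn)),
    }
```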
5.3 Results and Discussion
A representative set of input images with varying breast boundary characteristics is shown in Fig. 5.1a–d. Images with clear and distinguishable lower breast boundaries and inframammary folds are shown in Fig. 5.1a, b, while images with indistinguishable and vague lower breast boundaries and inframammary folds are shown in Fig. 5.1c, d. All these images exhibit a breast region distinguishable from the background tissues. These images are subjected to the DRLSE and weighted level set segmentation methods. The edge maps extracted using the Gaussian filter are shown in Fig. 5.2e–h; they are found to be thick and spurious. The DRLSE segmented outputs with the Gaussian edge map are shown in Fig. 5.2i–l. The weighted area term coefficient α is fixed at 0.6 throughout the study. The breast contours segmented using DRLSE are observed to be irregular and show contour leakage near the lower breast boundaries and inframammary fold, which may be due to limited control over the speed of curve evolution; hence, the DRLSE method fails to capture the desired weak and indistinguishable breast boundaries. Similarly, the WLSE segmented output is shown in Fig. 5.2m–p. Compared to the DRLSE output, the weighted level set method is observed to yield smoother segmentation, which may be due to the weighting function of the WLSE method being efficient in capturing the desired boundary. Some of the segmented breast tissues show undersegmentation, which may be due to the thick and spurious Gaussian edge map. To mitigate this problem, both level set methods are also applied with a phase-based edge map. The phase edge maps are shown in Fig. 5.3e–h, and the corresponding DRLSE and WLSE segmented outputs for breast images of varying boundary characteristics are shown in Fig. 5.3i–l and m–p. The phase edge maps are observed to be thin and highly directional, and the segmented outputs are smooth with fewer contour leakages. This may be due to the presence of the image feature weighting function in the WLSE method and the distance regularizer term in the DRLSE method. In spite of the thin and highly directional edges, the DRLSE segmented outputs include non-breast tissues, which may be due to limited control over the speed function. In the WLSE method, the weighting function depends on the edge intensities and edge orientation: the edge intensities are derived from the thin and distinct phase edge map, and the edge orientations are measured using the gradient vector flow (GVF) field of
Fig. 5.1 Representative set of input breast images with varying shapes and size
Fig. 5.2 Representative set of (a–d) input images, (e–h) Gaussian edge maps, (i–l) DRLSE segmented output images and (m–p) WLSE segmented output images
the image. Thus, the weighting function was efficient in capturing the weak and indistinguishable lower breast boundaries and inframammary fold precisely. Hence the weighted level set method yields smooth segmentation without significant breast loss. The validation of the segmented images against the ground truth using the Gaussian and phase edge maps with the DRLSE method is shown in Fig. 5.4. The regional statistics indices of the Gaussian and phase edge maps with the DRLSE method are shown in Fig. 5.4a. The regional statistics index of the Gaussian edge map with the DRLSE framework is observed to be low compared to the phase edge map, and some segmented images show breast loss due to improper segmentation and undersegmentation. The low regional statistics indices of 95% and 94% may be due to thick and discontinuous edges. Hence, the DRLSE method fails to capture the true breast boundaries, which in turn leads to segmentation inconsistency over the entire image set.
Fig. 5.3 Representative set of (a–d) input images, (e–h) phase edge maps, (i–l) DRLSE segmented output images, (m–p) WLSE segmented output images
Similarly, the regional statistics index of the phase edge map with DRLSE, validated against the ground truth image, is slightly higher than that of the Gaussian edge map. This may be due to the thin edges of the phase edge map, which aid the distance regularizer term of the DRLSE method in maintaining the stability of the contour and capturing the true boundaries near the lower breast boundaries and inframammary fold. The overlap measures of the Gaussian and phase edge maps with the DRLSE framework are shown in Fig. 5.4b. DRLSE with the phase edge map shows higher overlap measures compared to the Gaussian edge map; among the measures, DC shows more than 95% and VS 98% similarity between the segmented and ground truth images. The validation of the segmented images against the ground truth using the Gaussian and phase edge maps with the WLSE method is shown in Fig. 5.5. The regional statistics indices of the Gaussian and phase edge maps with the WLSE method are shown in Fig. 5.5a. The
Fig. 5.4 Average values of (a) EFI and Y-index and (b) overlap measures of DRLSE-based segmentation using Gaussian and PC edge map
regional statistics index is observed to be high for the phase edge map with the WLSE framework. The thin edge map and the new weighting function in the WLSE method are efficient in capturing the indistinguishable and vague inframammary fold and lower breast boundaries, leading to high regional measures of 98% between the segmented and ground truth images. Similarly, the overlap measures are shown in Fig. 5.5b, with a high DC of 98% and VS of 99% for the phase edge map with the WLSE framework. This
Fig. 5.5 Average values of (a) EFI and Y-index and (b) overlap measures of weighted level set segmentation using Gaussian and PC edge map
may be due to the thin and directional edges of the phase congruency edge map; the GVF field aids the level set in capturing these directional and thin edges near the inframammary fold and lower breast boundaries. This result shows that WLSE with the phase edge map segments the breast from the other regions without any significant breast loss.
The DRLSE and WLSE segmented outputs with the phase edge map framework are validated against the ground truth using regional and overlap measures. From Fig. 5.6a, it is observed that, compared to DRLSE, the WLSE segmented output yields higher overlap, EFI and Y-index values and a lower ROI index. The high magnitudes of EFI and
Fig. 5.6 Average values of (a) EFI and Y-index and (b) overlap measures of DRLSE and WLSE segmented images
Fig. 5.7 Average values of (a) ROI index of DRLSE and WLSE and (b) scatter plot between the ground truth and WLSE segmented area images
Y-index depict the accuracy and sensitivity of the WLSE segmentation. Similarly, the overlap measures shown in Fig. 5.6b are observed to be higher for the WLSE than for the DRLSE segmented output. Among the overlap measures, VS is observed to be high, indicating reduced misclassification of ROI and non-ROI elements. The low ROI index of 6.5 in Fig. 5.7a indicates that the segmented image is close to the ground truth
images. The correlation coefficient of 0.91 in Fig. 5.7b indicates good correlation between the WLSE segmented area and the ground truth area. This could be due to the presence of the edge-dependent weighting function, based on local edge features derived from the images, which drives the level set contour towards the weak breast boundary.
5.4 Conclusion
Breast thermal images are amorphous in nature, and segmentation of breast regions with varying boundary characteristics is a challenging task. In this work, an attempt is made to segment the breast tissues using the WLSE method with a phase edge map. The WLSE integrated with the phase edge map framework is observed to be efficient in segmenting the breast region with less information loss. The overlap measures and the indices derived from the regional statistics measures yield high accuracy, sensitivity and specificity for the above-mentioned framework. The high sensitivity and specificity of segmentation resulted in EFI and Y-index values of 0.97 and 0.95, and the average overlap measures are observed to be 0.98. The linear fit between the segmented and ground truth areas of the WLSE method indicates high correlation, with an R value of 0.91. Hence, the integration of the phase edge map and the WLSE method can aid accurate segmentation for clinical diagnosis.
References
1. Qi H, Kuruganti PT, Snyder WE, Nicholas A, Diakides M, Bronzino JD (2007) Detecting breast cancer from thermal infrared images by asymmetry analysis. In: Medical infrared imaging: principles and practice, the biomedical engineering handbook. CRC Press, Taylor & Francis Group, pp 1–11
2. Tanner C, Schnabel JA, Smith AC, Sonoda LI, Hill DLG, Hawkes DJ, Degenhard A, Hayes C, Leach MO, Hose DR (2002) The comparison of biomechanical breast models: initial results. In: ANSYS proceedings
3. Etehadtavakol M, Ng EYK, Chandran V, Rabbani H (2013) Separable and non-separable discrete wavelet transform based texture features and image classification of breast thermograms. Infrared Phys Technol 61:274–286
4. Borchartt TB, Conci A, Lima RC, Resmini R, Sanchez A (2013) Breast thermography from an image processing viewpoint: a survey. Signal Process 93(10):2785–2803
5. Machado D, Giraldi G, Novotny A, Marques R, Conci A (2013) Topological derivative applied to automatic segmentation of frontal breast thermograms. In: Workshop de Visao Computacional, Rio de Janeiro, vol 350
6. Motta L, Conci A, Lima R, Diniz E, Luís S (2010) Automatic segmentation on thermograms in order to aid diagnosis and 2D modeling. In: Proceedings of 10th workshop em Informática Médica, vol 1, pp 1610–1619
7. Lipari CA, Head JF (1997) Advanced infrared image processing for breast cancer risk assessment. In: Proceedings of the 19th annual international conference of the IEEE Engineering in Medicine and Biology Society. 'Magnificent milestones and emerging opportunities in medical engineering' (Cat. No. 97CH36136). IEEE, vol 2, pp 673–676
8. Herry CL, Frize M (2002) Digital processing techniques for the assessment of pain with infrared thermal imaging. In: Proceedings of the second joint 24th annual conference and the annual fall meeting of the Biomedical Engineering Society. Engineering in medicine and biology. IEEE, vol 2, pp 1157–1158
9. Qi H, Head JF (2001) Asymmetry analysis using automatic segmentation and classification for breast cancer detection in thermograms. In: 2001 conference proceedings of the 23rd annual international conference of the IEEE Engineering in Medicine and Biology Society. IEEE, vol 3, pp 2866–2869
10. Borchartt TB, Conci A, Lima RC, Resmini R, Sanchez A (2013) Breast thermography from an image processing viewpoint: a survey. Signal Process 93(10):2785–2803
11. Zhang B, Fadili JM, Starck JL (2008) Wavelets, ridgelets, and curvelets for Poisson noise removal. IEEE Trans Image Process 17(7):1093–1108
12. Suganthi SS, Ramakrishnan S (2014) Anisotropic diffusion filter based edge enhancement for segmentation of breast thermogram using level sets. Biomed Signal Process Control 10:128–136
13. Zhou Q, Li Z, Aggarwal JK (2004) Boundary extraction in thermal images by edge map. In: Proceedings of the 2004 ACM symposium on Applied computing. ACM, pp 254–258
14. Guidotti P, Lambers JV (2009) Two new nonlinear nonlocal diffusions for noise reduction. J Math Imaging Vis 33(1):25–37
15. Zhang K, Zhang L, Song H, Zhang D (2012) Reinitialization-free level set evolution via reaction diffusion. IEEE Trans Image Process 22(1):258–271
16. Huang YL, Jiang YR, Chen DR, Moon WK (2007) Level set contouring for breast tumor in sonography. J Digit Imaging 20(3):238–247
17. Kovesi P (2000) Phase congruency: a low-level image invariant. Psychol Res 64(2):136–148
18. PROENG (2012) Image processing and image analyses applied to mastology
19. Li C, Xu C, Gui C, Fox MD (2005) Level set evolution without re-initialization: a new variational formulation. In: 2005 IEEE computer society conference on Computer Vision and Pattern Recognition (CVPR'05), vol 1. IEEE, pp 430–436
20. Caselles V, Kimmel R, Sapiro G (1997) Geodesic active contours. Int J Comput Vis 22(1):61–79
21. Li C, Xu C, Gui C, Fox MD (2010) Distance regularized level set evolution and its application to image segmentation. IEEE Trans Image Process 19(12):3243–3254
22. Khadidos A, Sanchez V, Li CT (2017) Weighted level set evolution based on local edge features for medical image segmentation. IEEE Trans Image Process 26(4):1979–1991
23. Machado D, Giraldi G, Novotny A, Marques R, Conci A (2013) Topological derivative applied to automatic segmentation of frontal breast thermograms. In: Workshop de Visao Computacional, Rio de Janeiro, vol 350
24. Conci A, Galvão SS, Sequeiros GO, Saade DC, MacHenry T (2015) A new measure for comparing biomedical regions of interest in segmentation of digital images. Discret Appl Math 197:103–113
25. Cardemes R, Rd LG, Cuadra MB (2009) A multidimensional segmentation evaluation for medical image data. Comput Methods Prog Biomed 96:108–124
6 Analysis of Material Profile for Polymer-Based Mechanical Microgripper for Thin Plate Holding
T. Aravind, S. Praveen Kumar, G. Dinesh Ram, and D. Lingaraja
Abstract
A MEMS-based gripper tool for handling thin components with a large surface area is not available in the market; the proposed tool, devised from a polymer material, is capable of functioning even in moist environments. The proposed device has a default in-plane displacement of less than 500 microns when the tool arms are completely closed. The device works on the principle of a push-pull actuation method and can hold components of thickness of about 300 microns, with the entire device controlled precisely by a screw-based actuation mechanism. The device can be fabricated by a rapid prototyping process, and a structural mechanics simulation study is carried out with the COMSOL Multiphysics software to identify the appropriate results. Various polymers were chosen and their results compared in terms of displacement and stress-strain components, and the suitable material is identified to be poly tetra fluoro ethylene (PTFE). For an applied pressure of 0.1 Pa, PTFE produces a displacement of 6.94 × 10⁻⁷ μm while sustaining a stress of around 1.09 N/m², which is the best among the materials under consideration.
Keywords
MEMS · Mechanical gripper · Push-pull mechanisms · Normally open and closed arm · Polymer materials
T. Aravind (*) · S. Praveen Kumar · G. Dinesh Ram · D. Lingaraja
Department of Electronics and Communication Engineering, Saveetha Engineering College, Chennai, Tamil Nadu, India
© Springer Nature Singapore Pte Ltd. 2021
E. Priya, V. Rajinikanth (eds.), Signal and Image Processing Techniques for the Development of Intelligent Healthcare Systems, https://doi.org/10.1007/978-981-15-6141-2_6
6.1 Introduction
Over the past few years, the burgeoning technology of micro- and nanodevices has found applications in many fields such as industrial, consumer, automobile, and medical products. To handle micro-sized objects, MEMS-based microgrippers are used [9, 13]. Microgrippers generally grip an object with adequate force, transfer it from one place to another without damaging it, and finally release it at a specific position. Actuation mechanisms employed in microgrippers include piezoelectric [19], mechanical [11, 16], electrostatic [20], electromagnetic [6], electrothermal [4], shape memory alloy [14], and vacuum type [8]. MEMS applications of the microgripper are increasing day by day due to the automation of object gripping in industries and in medical equipment and components. Xu designed a compliant microgripper integrated with position- and force-sensing capabilities, in which a piezoelectric stack actuator (PSA) drives the gripper and two strain gauges provide sensing. The piezoelectric microgripper provides a large displacement amplification ratio for gripping large-scale objects; its static and dynamic performances are characterized and analyzed by calibrated sensors, and both position and grasping force are used to assemble micro objects [19]. In electrothermal grippers, the hot- and cold-arm actuators are two mirror images fabricated in a surface micromachining process. An electrothermally activated polymer microgripper has been used for the manipulation of single cells in physiological ionic solution [3]. More recently, an electrostatic microgripper based on compliant rotary mechanisms and linear guiding has been reported [10]. Wei et al. and Chronis et al. presented in-plane and out-of-plane U-beam electrothermal microgripper actuators for grasping micro objects at low temperature and low voltage. Larger motion in the gripper leads to deflection in the arms, and asymmetric heating results from unequal arm widths [4, 17]; the current flux, and hence the temperature, is higher in the thinner arms. Shivhare et al. designed and analyzed a gripper with two in-plane chevron electrothermal actuators coated with a polymer material to produce a higher gripping force at a low input voltage [15]; since the temperature induced in the gripping arms is high, a heat sink is introduced in the shuttle to reduce it. Sijie Yang and Qingsong Xu designed and analyzed an integrated electrothermal actuator and electrothermal force sensor for grasping micro-sized objects with a compliant mechanism [21]. A novel metallic microgripper has been designed with a metallic chevron actuator and interdigitated comb drives: the chevron actuator grasps the object, transports nanoparticles from one position to another, and finally releases them at the target place, while during release an interdigitated comb-drive electrostatic actuator produces vibration in the arms [5]. Recent nanomanipulation technology offers high force with large strokes in a compact gripper size. MEMS-based nanorobotic manipulation is used to shrink the size of the gripper, and micro- and nanoscale robots are manipulated by recent actuation schemes such as electromagnetic, piezoelectric, and shape memory alloy
[4]. This paper presents an overview of recent microgrippers in research institutes and of commercial microgrippers; in research institutes, work has focused on electrostatic, thermal, thermal-and-magnetic, SMA-actuated fluid, and piezoelectric microgrippers. Micromanipulation based on the MMOC microgripper has been fabricated and analyzed [18]. Alogla et al. designed a microgripper for manipulating objects that also senses the gripping force; this pneumatic microgripper has both static and dynamic models that are measured and controlled [1]. Mechanical microtweezers are used in cloning technology for manipulating single cells such as embryos, and a new way of picking and placing cell aggregates has been fabricated from two stainless steel grippers of 100 μm for biofabricating synthetic tissues [12]. In this gripper design, the material selection provides better actuation to the gripper in terms of the pull-in and pull-out mechanisms [2]. The mechanical microgripper is fabricated with push and pull mechanisms, and parameters like stress, strain, and displacement decide the stability of the microgripper. The gripper tool proposed here is a purely mechanical gripper whose shear force can be manually controlled with a screw-type setup. The proposed system is intended to hold a thin plate sample of 150 μm thickness in a moist environment. The structural specification of the gripper is discussed below: the actuator arms are initially in a relaxed position with a displacement gap of 300 microns, to hold samples up to a size of 150 μm, and increased stress may result in structural deformation of the collapse arm and tampering arm.
6.2 Mechanisms of Actuation

6.2.1 Compliant Mechanisms
When force, motion, or energy is transmitted to an object, the deflection in each part reflects the entire or partial motion of the compliant parts. Compliant mechanisms differ from conventional mechanisms in that they replace links and hinges with flexible material. According to Ho, compliant mechanisms are classified into two sorts: lumped compliant mechanisms and distributed compliant mechanisms; this gripper uses the distributed compliant mechanism for the microgripper device design. One reason for utilizing a compliant mechanism for this device is that it is a simple three-dimensional structure, which makes it suitable for microfabrication; another is that it can be designed to specific definitions. Compliance is distributed throughout the entire body, so there are no hinges and, therefore, no localized fatigue points [7].
6.2.2 Push and Pull Mechanism
In the proposed polymer-based mechanical microgripper, the actuation is based on a controlled push and pull mechanism [16]. Figure 6.1 shows the structure of the proposed mechanical microgripper. When the base of the
Fig. 6.1 Structure of mechanical microgripper
gripper is pushed forward, the gripping arms of the gripper open; when the base of the gripper is pulled backward, the gripping arms experience an inward force and all eight gripping arms close. The design is unique in gripping a larger surface area with eight gripping arms, and the central part of the microgripper controls the push and pull mechanism. The actuation of the microgripper is controlled by the cylindrical support: the middle part of each gripping arm is attached to the hollow cylinder base at one end and to the shaft at the other. The opening and closing of the microgripper depend mainly on the hollow cylinder, and the pressure is applied to the base of the hollow cylinder for even distribution of pressure over the whole microgripper.
6.2.3 Normally Open and Closed Arms
In a normally open gripper, the arms rest with some gap. When the actuation force is applied at the boundary load, the gripping arms move toward the object, the gap between the gripping arms decreases, and the object is grasped. The actuation force must be transmitted equally to all gripping arms, which also provides good precision control of the gripping force; the disadvantage of this transmission is its high energy consumption. In a normally closed gripper, the gripping arms are initially closed. When the actuation force is applied, the gap increases to fit the object and the gripping arms grasp it; after gripping the thin object, with some further actuation force the gap of the gripping arms increases again to release the object. In this mechanism, elastic strain energy is supplied to the gripping arms when the transmission occurs. The normally open gripper is mostly preferred, because the opening gap depends on the size of the object: if the object is larger, more energy is stored through the elastic strain energy mechanism and there is a higher probability of damage to the object [13].
6.2.4 Motion of Jaws
In the mechanical microgripper, the gripping arms grasp the object by one of two motions: rotational motion or parallel motion. In rotational motion, the gripping arms hold the object according to the applied reaction force; the longitudinal axis of the component is parallel to the gripping arms, and the reaction force pushes the object at the middle of the gripping arms. With rotational motion there is a chance of the object slipping. To overcome this disadvantage, parallel jaw motion is used: the reaction force is omitted and a uniform stress distribution is obtained while grasping the object. Achieving purely parallel motion in the gripping arms is very complicated, because some amount of rotational motion always remains in the arms; hence, an important task while designing the microgripper is to reduce the rotational motion in the gripping arms [13].
6.3 Design and Simulation of Mechanical Microgripper
In this paper, the MEMS-based mechanical microgripper is designed mainly for gripping thin objects with a large surface area. The size of the microgripper is 150 × 130 μm², and its dimensions are shown in Fig. 6.2. The microgripper has three major parts: the gripping arms, the fixed part, and the actuation part. The gripper structure is designed mainly to grip thin metal plates that cannot be handled by hand. The cylindrical support at the base of the gripper is the actuation

Fig. 6.2 Dimension of mechanical microgripper
part of the gripper. When the gripper base moves inward or outward, it actuates the gripping arms, which opens or closes the structure. When the microgripper is actuated, the opening of the gripping arms varies and is expressed by [11]

G = g₀ + 2Δg    (6.1)
where g₀ is the initial gap separation. According to the applied pressure, the structural deformation leads to a displacement of the gripping arms. Δg, the per-arm opening of the gripping arms in the mechanical microgripper, is expressed by

Δg = √(L² sin²(θ₀) − (Δs)² + 2L cos(θ₀) Δs) − L sin(θ₀)    (6.2)
where Δs is the displacement induced by the shaft on the backbone (with a maximum useful value of L cos(θ₀)), L is the length of an arm, and θ₀ is the starting angle between the arms and flexures. Different polymer materials are considered for the mechanical gripper: verowhite, polycarbonate, polyimide, fullcure 720, poly tetra fluoro ethylene (PTFE), and poly methyl methacrylate (PMMA). Verowhite is a rigid, high-resolution, opaque white material widely used in applications like medical devices and components, electronic housings, and industry; it withstands pressures of up to about 58 MPa. Polycarbonates are transparent amorphous thermoplastics; they are engineering plastics used for robust parts such as impact-resistant glass-like surfaces, can be combined with flame-retardant materials without significant degradation, and are used in medical devices, bulletproof glass, automotive components, etc. Polyimide is a high-performance polymer, infusible and predominantly aromatic, with high thermal stability; it has excellent heat-radiation resistance but low impact strength and is widely used in aircraft engines, engine components, and electrical components. Fullcure 720 is a rigid, semi-translucent material used in general-purpose products and medical devices. Poly tetra fluoro ethylene (PTFE) is an opaque, versatile, ivory-white fluoropolymer plastic; it has excellent chemical resistance, withstands high temperatures, and is used in industry, aerospace, and medical devices. Poly methyl methacrylate (PMMA), also called acrylic, is an amorphous thermoplastic with excellent clarity and UV resistance, used in general products such as aircraft glazing. The specification of the different materials is given in Table 6.1.
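As a quick numerical illustration of Eq. (6.2), the per-arm opening can be evaluated as below; the arm length, initial angle, and shaft displacement used in the example are illustrative values, not the device's actual dimensions.

```python
import numpy as np

def arm_opening(delta_s, L, theta0):
    """Per-arm opening delta_g from Eq. (6.2) for shaft displacement delta_s,
    arm length L and initial flexure angle theta0 (radians)."""
    return (np.sqrt(L**2 * np.sin(theta0)**2 - delta_s**2
                    + 2 * L * np.cos(theta0) * delta_s)
            - L * np.sin(theta0))

# Illustrative numbers only: a 125 um arm at 45 degrees, shaft moved by 20 um
print(arm_opening(20e-6, 125e-6, np.radians(45)))  # ~1.6e-5 m of opening
```

At Δs = L cos(θ₀) the derivative of Δg with respect to Δs vanishes, which is consistent with the maximum-displacement observation made in the displacement analysis below.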
6.4 Results and Discussion

6.4.1 Stress and Strain Analysis
The stress, strain, and mechanical impact on the microgripper are analyzed with the finite element analysis (FEA) simulation tool in COMSOL Multiphysics. When pressure from minimum to maximum is applied to the gripper device, the flexure pushes in
Table 6.1 Specification of different polymer material properties

Specifications         Verowhite  Polycarbonate  Polyimide  Fullcure 720  PTFE   PMMA
Young modulus (MPa)    2500       2300           2500       2870          500    2450
Density (kg/m3)        1175       1200           1420       1180          2200   1185
Poisson ratio          0.3        0.38           0.34       0.3           0.46   0.35
Table 6.2 Stress analysis for different polymer material properties

Stress (N/m²)
Pressure (Pa)  Verowhite  Polycarbonate  Fullcure 720  Polyimide  PTFE   PMMA
0.1            1.17       1.19           1.2           1.18       1.09   1.185
0.5            5.85       5.79           6.02          5.89       5.43   5.9
1              11.7       11.9           12            11.8       10.9   11.85
2              23.4       27             24.1          23.6       21.7   23.65
3              35.1       34.9           36.1          35.3       32.6   35.35
4              46.8       50             48.1          47.1       43.4   47.5
5              58.5       57.9           60.2          58.9       54.3   59
6              70.3       73             72.2          70.7       65.2   71
7              82         80.9           84.2          82.4       76     82.9
8              93.7       96             96.3          94.2       86.9   94.7
9              105        107            108           106        97.8   106.5
10             117        119            120           118        109    118.5
25             293        291            301           294        272    294.4
50             585        579            602           589        543    589.5
and produces structural deformation with some angular variation from the original structure. The deformation of the gripper makes the gripping arms open wider. The maximum stress occurs at the hinges between the middle parts of the gripping arms. To choose the best material for the microgripper, the different polymers verowhite, polycarbonate, polyimide, fullcure 720, PTFE, and PMMA are analyzed and compared below. When the pressure applied to the microgripper device is between 0.1 and 50 Pa, the device acts with rotational motion in the jaws. The stress analysis of the microgripper for the different polymer materials as a function of the applied pressure is shown in Table 6.2. Comparing the different polymer materials, PTFE shows the best stress results. PTFE has a low Young's modulus of 500 MPa compared to the other materials (verowhite, polycarbonate, fullcure 720, polyimide, and PMMA), as shown in Fig. 6.3. It has excellent chemical resistance and high-temperature capability, and it is used in applications like medical devices, aerospace, and manufacturing industries; due to this temperature capability, the microgripper can grasp objects without human help where the temperature is high or low. When a pressure of 0.1 Pa is applied to the microgripper, the structure deforms and produces a stress of about 1.09 N/m², as shown in Fig. 6.4. The strain analysis of the different polymer materials according to the applied pressure is given in Table 6.3. The load is applied to the gripper tool through the screwing method attached at the bottom portion as a post-simulation process. The maximum stress is observed at the hinges of the beam, and the maximum strain is observed near the fixed-side arm positions. To analyze the gripping efficiency for an object, the main considerations for the microgripper are its size and dimensions, so the gripper size is increased to about 300 × 150 μm². When the same pressure of about 0.1 Pa is applied to the
Fig. 6.3 Pressure versus stress for different polymer materials
Fig. 6.4 Stress result of mechanical microgripper for poly tetra fluoro ethylene
Table 6.3 Strain analysis for different polymer materials

Pressure (Pa)  Verowhite   Polycarbonate  Fullcure 720  Polyimide   PTFE        PMMA
0.1            2.98×10⁻¹⁰  2.75×10⁻¹⁰     3.16×10⁻¹⁰    3.12×10⁻¹⁰  5.25×10⁻¹⁰  3.04×10⁻¹⁰
0.5            1.49×10⁻⁹   1.37×10⁻⁹      1.56×10⁻⁹     1.56×10⁻⁹   1.05×10⁻⁹   9.11×10⁻¹⁰
1              2.98×10⁻⁹   2.75×10⁻⁹      3.16×10⁻⁹     3.12×10⁻⁹   5.25×10⁻⁹   3.04×10⁻⁹
2              5.95×10⁻⁹   5.49×10⁻⁹      6.32×10⁻⁹     6.23×10⁻⁹   1.05×10⁻⁸   6.08×10⁻⁹
3              8.93×10⁻⁹   8.24×10⁻⁹      9.48×10⁻⁹     9.35×10⁻⁹   1.57×10⁻⁹   9.11×10⁻⁹
4              1.19×10⁻⁸   1.1×10⁻⁸       1.26×10⁻⁸     1.25×10⁻⁸   2.1×10⁻⁸    1.22×10⁻⁸
5              1.49×10⁻⁸   1.37×10⁻⁸      1.58×10⁻⁸     1.56×10⁻⁸   2.62×10⁻⁸   1.52×10⁻⁸
6              1.79×10⁻⁸   1.65×10⁻⁸      1.9×10⁻⁸      1.87×10⁻⁸   3.15×10⁻⁸   1.82×10⁻⁸
7              2.08×10⁻⁸   1.92×10⁻⁸      2.21×10⁻⁸     2.18×10⁻⁸   3.67×10⁻⁸   2.13×10⁻⁸
8              2.38×10⁻⁸   2.2×10⁻⁸       2.53×10⁻⁸     2.49×10⁻⁸   4.2×10⁻⁸    2.43×10⁻⁸
9              2.68×10⁻⁸   2.47×10⁻⁸      2.85×10⁻⁸     2.8×10⁻⁸    4.72×10⁻⁸   2.73×10⁻⁸
10             2.98×10⁻⁸   2.75×10⁻⁸      3.16×10⁻⁸     3.12×10⁻⁸   5.25×10⁻⁸   3.04×10⁻⁸
25             7.44×10⁻⁸   6.87×10⁻⁸      7.9×10⁻⁸      7.79×10⁻⁸   1.31×10⁻⁷   7.59×10⁻⁸
50             1.49×10⁻⁷   1.37×10⁻⁷      1.58×10⁻⁷     1.56×10⁻⁷   2.62×10⁻⁷   1.52×10⁻⁷
Fig. 6.5 Stress analysis for the different dimension of 300 μm × 150 μm
base of the gripper, the stress obtained is 5.62 N/m², as shown in Fig. 6.5. Since the stress obtained for a pressure of 0.1 Pa is only 1.09 N/m² for the 150 × 130 μm² microgripper, that size gives the best result for handling a thin component with a large surface area.
6.4.2 Displacement Analysis
The displacement profiles for the different polymer materials are given in Table 6.4. Comparing the different polymers (verowhite, polycarbonate, fullcure 720, polyimide, PTFE, and PMMA), PTFE produces the best displacement result. When a pressure of 0.1 Pa is applied to the base of the cylindrical support, the gripping arms are displaced by about 6.94 × 10⁻⁷ μm, as shown in Fig. 6.6. The maximum displacement of the gripping arms occurs at Δs = L cos(θ₀), where the rate of change of the gap approaches zero. When the gripper is at rest, the angular positions of the arms are at 225° and 135°. After the deformation of the microgripper, the flexures in the middle are displaced from about 45° at rest to 55°, and on the other side from 135° at rest to 125°; the main gripping arm angle varies from 225° at rest to 215° when open, and on the other side from 135° to 145°. As the gripper works according to the normally open and closed mechanism, thin objects with a large surface area can be gripped.
Table 6.4 Displacement analysis for different polymer materials

Displacement (μm)
Pressure (Pa)  Verowhite   Polycarbonate  Fullcure 720  Polyimide   PTFE        PMMA
0.1            1.42×10⁻⁷   1.53×10⁻⁷      1.24×10⁻⁷     1.43×10⁻⁷   6.94×10⁻⁷   2.89×10⁻⁷
0.5            7.09×10⁻⁷   7.67×10⁻⁷      6.21×10⁻⁷     7.3×10⁻⁷    1.39×10⁻⁷   4.34×10⁻⁷
1              1.42×10⁻⁶   1.53×10⁻⁶      1.24×10⁻⁶     1.43×10⁻⁶   6.94×10⁻⁶   1.45×10⁻⁶
2              2.85×10⁻⁶   3.07×10⁻⁶      2.48×10⁻⁶     2.89×10⁻⁶   1.39×10⁻⁵   2.89×10⁻⁶
3              4.25×10⁻⁶   4.6×10⁻⁶       3.72×10⁻⁶     4.28×10⁻⁶   2.08×10⁻⁵   4.34×10⁻⁶
4              5.67×10⁻⁶   6.14×10⁻⁶      4.96×10⁻⁶     5.7×10⁻⁶    2.78×10⁻⁵   5.79×10⁻⁶
5              7.09×10⁻⁶   7.67×10⁻⁶      6.21×10⁻⁶     7.6×10⁻⁶    3.47×10⁻⁵   7.23×10⁻⁶
6              8.5×10⁻⁶    9.21×10⁻⁶      7.45×10⁻⁶     8.55×10⁻⁶   4.17×10⁻⁵   8.68×10⁻⁶
7              9.92×10⁻⁶   1.07×10⁻⁵      8.69×10⁻⁶     9.98×10⁻⁶   4.86×10⁻⁵   1.01×10⁻⁵
8              1.13×10⁻⁵   1.23×10⁻⁵      9.93×10⁻⁶     1.143×10⁻⁵  5.55×10⁻⁵   1.16×10⁻⁵
9              1.28×10⁻⁵   1.38×10⁻⁵      1.12×10⁻⁵     1.3×10⁻⁵    6.25×10⁻⁵   1.3×10⁻⁵
10             1.42×10⁻⁵   1.53×10⁻⁵      1.24×10⁻⁵     1.49×10⁻⁵   6.94×10⁻⁵   1.45×10⁻⁵
25             3.54×10⁻⁵   3.84×10⁻⁵      3.1×10⁻⁵      3.6×10⁻⁵    1.74×10⁻⁴   3.62×10⁻⁵
50             7.09×10⁻⁵   7.67×10⁻⁵      6.21×10⁻⁵     7.8×10⁻⁵    3.47×10⁻⁴   7.23×10⁻⁵
Fig. 6.6 Displacement result of mechanical microgripper for poly tetra fluoro ethylene
Fig. 6.7 Pressure versus stress and displacement
Figure 6.7 shows the stress and displacement profiles versus applied pressure for the poly tetra fluoro ethylene material.
6.5 Conclusion
A PTFE-based mechanical microgripper has been designed, and its displacement, stress, and strain responses were analyzed. A suitable material with the desired qualities was identified by comparing various materials: verowhite, fullcure 720, polycarbonate, polyimide, PTFE, and PMMA. Among all the chosen materials, PTFE showed the best response in all the desired respects: its Young's modulus is desirably small, it possesses a higher density than most materials of the same class, and it has a favourable Poisson ratio of about 0.46. Hence, the device is responsive even to a minimal force, and it is more stable because the density of the material is quite high. For applied pressures from the minimum to the maximum value, the device responds through the push and pull mechanism. The stress obtained for the mechanical microgripper is 1.09 N/m² and the displacement is 6.94 × 10⁻⁷ μm for an applied pressure of 0.1 Pa.
Acknowledgement We thank Saveetha MEMS Design Centre, Saveetha Engineering College, Chennai for providing the facility to complete this project successfully.
References
1. Alogla AF, Amalou F, Balmer C et al (2015) Micro-tweezers: design, fabrication, simulation and testing of a pneumatically actuated micro-gripper for micromanipulation and microtactile sensing. Elsevier 236:394–404. https://doi.org/10.1016/j.sna.2015.06.032
2. Aravind T, Ramesh R, Kumar SP (2016) Design and simulation of a novel polymer based 4 arms mechanical microgripper for micromanipulation. World Appl Sci J 34:1318–1325. https://doi.org/10.5829/idosi.wasj.2016.1318.1325
3. Chronis N, Lee LP (2004) Polymer mems-based microgripper for single cell manipulation. 17th IEEE Int Conf Micro Electro Mech Syst Maastricht MEMS 2004 Tech Dig
4. Chronis N, Lee LP (2015) Electrothermally activated SU-8 microgripper for single cell manipulation in solution. J Microelectromech Syst 14:857–863. https://doi.org/10.1109/JMEMS.2005.845445
5. Demaghsi H, Mirzajani H, Ghavifekr HB (2014) Design and simulation of a novel metallic microgripper using vibration to release nano objects actively. Microsyst Technol 20:65–72. https://doi.org/10.1007/s00542-013-1888-7
6. Feddema JT, Ogden AJ, Warne LK et al (2002) Electrostatic/electromagnetic gripper. Proc 5th Biannu World Autom Congr, pp 268–274. https://doi.org/10.1109/WAC.2002.1049452
7. Ho NL, Dao T, Huang S, Le HG (2016) Design and optimization for a compliant gripper with force regulation mechanism. Int J Mech Aerospace, Ind Mechatron Manuf Eng 10:1927–1933
8. Jaiswal AK, Kumar B (2017) Vacuum gripper – an important material handling tool. Int J Sci Technol 7:1–8
9. Jia Y, Xu Q (2013) MEMS microgripper actuators and sensors: the state-of-the-art survey. Recent Patents Mech Eng 6:132–142. https://doi.org/10.2174/2212797611306020005
10. Kim K, Liu X, Zhang Y et al (2008) Mechanical characterization of polymeric microcapsules using a force-feedback MEMS microgripper. Conf Proc Annu Int Conf IEEE Eng Med Biol Soc 2008:1845–1848. https://doi.org/10.1109/IEMBS.2008.4649539
11. Martínez JA, Panepucci RR (2007) Design, fabrication and characterization of a microgripper device. Florida Conf Recent Adv Robot FCRAR 2007:1–6
12. Mehesz AN, Brown J, Hajdu Z et al (2011) Scalable robotic biofabrication of tissue spheroids. Biofabrication 3(2):025002. https://doi.org/10.1088/1758-5082/3/2/025002
13. Nikoobin A, Hassani Niaki M (2012) Deriving and analyzing the effective parameters in microgrippers performance. Sci Iran 19:1554–1563. https://doi.org/10.1016/j.scient.2012.10.020
14. Roch I, Bidaud P, Collard D, Buchaillot L (2003) Fabrication and characterization of an SU-8 gripper actuated by a shape memory alloy thin film. J Micromech Microeng 13:330–336. https://doi.org/10.1088/0960-1317/13/2/323
15. Shivhare P, Uma G, Umapathy M (2016) Design enhancement of a chevron electrothermally actuated microgripper for improved gripping performance. Microsyst Technol 22:2623–2631. https://doi.org/10.1007/s00542-015-2561-0
16. Thangavel A, Rengaswamy R, Sukumar P (2018) Design and material analysis for prototyping of four arm mechanical microgripper with self-locking and anti-slipping capability. Microsyst Technol 25:851–860. https://doi.org/10.1007/s00542-018-4025-9
17. Wei J, Duc TC, Sarro PM (2008) An electro-thermal silicon-polymer micro-gripper for simultaneous in-plane and out-of-plane motions, pp 1466–1469
18. Wester BA, Rajaraman S, Ross JD et al (2011) Development and characterization of a packaged mechanically actuated microtweezer system. Sensors Actuators A Phys 167:502–511. https://doi.org/10.1016/j.sna.2011.01.005
19. Xu Q (2013) A new compliant microgripper with integrated position and force sensing. 2013 IEEE/ASME Int Conf Adv Intell Mechatronics, AIM 2013:591–596. https://doi.org/10.1109/AIM.2013.6584156
20. Xu Q (2015) Design, fabrication, and testing of an MEMS microgripper with dual-axis force sensor. IEEE Sensors J 15:6017–6026. https://doi.org/10.1109/JSEN.2015.2453013
21. Yang S, Xu Q (2016) Design and simulation a MEMS microgripper with integrated electrothermal actuator and force sensor. ICARM 2016 – 2016 Int Conf Adv Robot Mechatronics, pp 271–276. https://doi.org/10.1109/ICARM.2016.7606931
7 Design and Testing of Elbow-Actuated Wearable Robotic Arm for Muscular Disorders
D. Manamalli, M. Mythily, and A. Karthi Raja
Abstract
Musculoskeletal disorders are a major concern globally, not just for the pain and suffering of the individual, but also for their adverse impact on the economy of the individual and of society. They cause productivity losses in both the manufacturing and service sectors, which adversely affect the economy. The main aim of this work is to design and test an upper-body exoskeleton arm for empowering able-bodied, i.e., healthy, users. The exoskeleton arm is intended to power or amplify the ability of the human elbow. The inverse dynamic model of the system is simulated through kinematic analysis and workspace analysis. Upon validation of the model, the mechanical system design covers the material selection for the exoskeleton frame, the design for joint imitation, the load sharing of the completely mechanical structure, and the dimensioning. The electrical system of the prototype covers the important issues of actuator selection, power supply requirement analysis, control scheme design, component selection, and controller selection. The two systems are integrated, and the orchestrated motion of the prototype and human arm is tested under different conditions.
Keywords
Musculoskeletal disorders · Upper body exoskeleton · Mechanical design · Electrical design
D. Manamalli · M. Mythily (*) · A. K. Raja
Department of Instrumentation Engineering, MIT Campus, Anna University, Chennai, Tamil Nadu, India
e-mail: [email protected]
© Springer Nature Singapore Pte Ltd. 2021
E. Priya, V. Rajinikanth (eds.), Signal and Image Processing Techniques for the Development of Intelligent Healthcare Systems, https://doi.org/10.1007/978-981-15-6141-2_7
7.1 Introduction
Musculoskeletal disorders (MSDs) are considered the most common work-related health problem around the world, affecting tens of millions of workers. A statistical study by the European Union reports that 25% of workers suffer from backache and 23% complain of muscular pain, caused mainly by heavy physical work, repetitive movements, manual handling, and even improper posture. Such strenuous activities result in MSDs and can provoke pain intense enough to make it difficult or impossible to carry out even day-to-day tasks [1]. Given the impact of MSDs on the individual, the organization, and the whole economy, attention must be paid to the ways in which these problems can be solved. In the twenty-first century, work-related MSDs are being tackled in many ways through the enormous developments in the fields of automation and robotics. Certain developing countries like India show interest in this issue and in rehabilitation, but these efforts remain at the basic level of prevention, which can only be practiced in parallel with the much-needed, robust technical solutions. In this paper, a wearable robot with a single degree of freedom (DOF) is proposed. The robotic system is intended to power or amplify the ability of the human elbow. The research was carried out by first obtaining the mathematical model of the system and finally performing real-time design and testing of the whole prototype, for which preliminary results were obtained.
7.1.1 Methodology
Analysis and modeling of human movement has become a topic of great importance, as it helps in understanding complex systems through the efficient use of mathematics, mechanics, and concepts of physiology. The key features are extracted and used to create a simplified representation of the system. Such a model allows one to closely observe the insights of the system and to make predictions regarding its performance under bounded input conditions and various system parameters; data from real physical models can be used to validate the derived models and increase reliability. To model any system, its fundamental concepts are required; hence, the anatomy of the human arm provides the data needed regarding the functionality of the upper arm. The different ranges of motion of the human arm are given in Fig. 7.1. The upper limb has three important articulations, namely the shoulder, elbow, and wrist. The proposed model is dedicated only to elbow actuation with a single degree of freedom. The ranges of motion of all the articulations are given in Table 7.1; these values represent the average range of joint motion, and our interest is the range of motion of the human elbow while performing strenuous tasks. This has been finalized, based on the joint range of motion of the elbow, as 0°–145°, a parameter that is considered as a boundary for the modeling and design of the prototype [1, 5].
Fig. 7.1 Different range of motions of the human arm

Table 7.1 Range of motions of shoulder and elbow joint

                  Shoulder abduction  Shoulder flexion  Shoulder rotation  Elbow flexion
Range of motion   0°–180°             50°–180°          80°–100°           0°–145°

7.2 Real-Time Prototype Design
The real-time wearable robot was designed with the aim of actuating the human joint with an effective actuator that amplifies the human elbow, so as to lift and hold loads beyond the capacity of the bare human elbow. With this aim, the design began by exploring a joint that can be actuated easily, with modest design requirements and an undemanding mechanical structure. This model is intended as a proof of concept, and its results are used to validate the design process. There is a wide gap between real-world design and design in simulation tools; to understand these constraints and bridge the gap between the two domains, a fully working, enhanced anthropomorphic prototype is needed and was therefore designed. The real-time model of the exoskeleton arm is shown in Fig. 7.2 [2, 6, 8]. The design consideration begins with the overall classification of the system into the mechanical system design and the electrical system design. The mechanical system design covers the material selection for the exoskeleton frame, the design for joint imitation, the load sharing of the completely mechanical structure, and the dimensioning. In an equivalent way, the electrical system of the prototype covers the important issues of actuator selection, power supply requirement analysis, control scheme design, component selection, and controller selection. Together, the two systems perform the orchestrated motion of the prototype and the human arm.
Fig. 7.2 Real-time model of the exoskeleton arm and its operation
7.2.1 Mechanical System Design
The mechanical system design began with the selection of a fabrication material that satisfies the requirements of the mechanical structure, such as the load-bearing properties; the yield stress and strain must be considered. Aluminum exhibited favorable metallurgical properties and was therefore selected as the metal for fabrication: it has low weight and high strength, and its yield strength is very high compared with its counterparts. The wearable robot was designed with the perspective of matching the range of motion and biomechanics of the upper limb; no model can ever fully imitate a functional human arm, but the imitation can be attempted to an acceptable extent. Thus, the design has two main parts: the upper arm and the lower arm. In this design, no support is given to the hand, which is left externally composed. The upper arm is responsible for the load sharing of the whole structure, as it is fastened to the human upper arm, transmitting the load to the ground through the human body. The fabrication dimensions were decided considering both the biomechanical aspects and the material strength of the human arm and of the aluminum alloy used. For the biomechanical aspects, participants had to be selected, so two volunteers, a male and a female, were chosen, with their weight and height as the base parameters. The link lengths were calculated and found to be comparable; the design was therefore made for the person with the larger arm length, so that both participants could be tested. The feasibility of the design was tested by modeling the prototype in the SOLIDWORKS design tool, with positive results. The SOLIDWORKS model was designed with the real-time dimensions and a provision for the motor shaft, and it enables viewing the range of motion between the two links; the whole structure is a composite with two links and one joint. Figure 7.3 below shows the model designed in SOLIDWORKS [3, 10, 14]. The software validation is followed by the design of the elbow joint. As seen earlier, the human elbow has an intelligent joint design with an excellent inbuilt mechanical stopper that restricts the lower arm from extending beyond the upper arm. Thus, our aim was to reproduce the same mechanism in the real-time
Fig. 7.3 Model for fabrication
arm that we are trying to fabricate. The joint to be designed needs to imitate the human elbow joint, so the inspiration for the design was taken from the human elbow itself. The upper arm is considered the principal support for the whole structure and was made with dimensions of 400 mm × 30 mm × 8 mm; the lower arm, which is connected to the actuator, has dimensions of 250 mm × 20 mm × 8 mm. Once the dimensions were finalized, the joint could be designed with a mechanical stopper mechanism, represented in Fig. 7.3, which can be visually appreciated as similar to the elbow joint [7, 9]. The joint model drawings shown in Fig. 7.4 give a novel two-point support for the actuator, which provides extra support to the actuator and the whole frame; passive support is then given to the upper arm and forearm. The design itself efficiently inherits a stopper mechanism for out-of-range motions, thus giving complete protection to the human wearer [4, 11–13].
Fig. 7.4 Joint model drawings: upper arm with support system, forearm with support system, elbow joint limiter, joint protection cap, and mechanical elbow joint with joint limitations (dimensions are representative only)
7.2.2 Electrical System Design
The major component of the exoskeleton arm's electrical system is the actuator, a brushed permanent magnet direct current (PMDC) motor. The motor used for driving the exoskeletal system differs in structure from ordinary PMDC motors: it has two electrical inputs and one mechanical output, driving a worm-and-wheel gear configuration that decreases the speed and increases the torque delivered to the driven load of the arm,
Fig. 7.5 Elbow actuator
mechanical structure, and external load or payload. The actuator has two built-in speeds; the lower speed was chosen, since this work aims at lifting higher loads and the operation time is not a constraint [12, 15-17]. The actuator was selected after verifying that it can meet the torque requirement of the human elbow. Its speed is an important parameter to keep under control, as working with a high-torque motor at high speed would be hazardous. The selected actuator is shown in Fig. 7.5. It has a torque characteristic well above what is required; an actuator with less torque would suffer a notable torque reduction during peak loads, so this is an optimal selection for the requirement. Because a human-machine interface is involved, the actuator demands tight control to avoid any accidents, and this must be taken very seriously. The control structure should be flawless, efficient, and fault-tolerant, with safety features that protect the user even during an acute power failure and other undesirable circumstances. These requirements led to the design of a novel microcontroller-less control scheme for the exoskeleton system. The position control of the exoskeleton can be treated as an open-loop system, since the operator or user controls the position directly and actuates the entire forearm with a single button press. The open-loop control is basically a relay-based structure in which the relay reverses the supply polarity to the actuator and thereby the direction of the motor. Speed is another variable to be controlled, so a separate experiment was conducted to find the current and voltage that must be supplied for the motor to provide sufficient torque at peak loads. The experiment started with a high supply current, which was gradually decreased using a rheostat until the motor reached the necessary speed. As a result, the speed was reduced from 50 rpm to 6 rpm with a supply of 8 V and 0.9 A at normal load conditions, with the rheostat
Fig. 7.6 Open loop relay-based control unit schematic and fabrication
set at 8.75 Ω, corresponding to a power dissipation of about 7 W. At peak load the actuator drew more current, approaching the rated current of the motor itself. Since the speed never needs to change during operation, it was fixed permanently by connecting a high-power resistor in series with the motor supply: to match the experimental resistance, a 10 Ω, 10 W resistor was connected in series with the actuator's input power supply, producing the required speed reduction. Although motor performance is known to degrade when the supply voltage is reduced, and this system is not immune to that, the torque from the motor after speed control still meets the peak load requirement for the designed payload and arm. Figure 7.6 shows the control unit of the system.
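As a back-of-the-envelope check of the series-resistor sizing described above, the short Python sketch below (illustrative only, not from the chapter) recomputes the resistor dissipation from the reported values: 0.9 A at normal load, a rheostat setting of 8.75 Ω, and the 10 Ω / 10 W part actually fitted.

```python
# Illustrative check of the series-resistor speed control, using only the
# values reported in the text.
I_normal = 0.9      # A, motor current at normal load (reported)
R_rheostat = 8.75   # Ohm, experimental rheostat setting (reported)
R_chosen = 10.0     # Ohm, standard resistor actually fitted (reported)
P_rating = 10.0     # W, power rating of the chosen resistor (reported)

p_exp = I_normal**2 * R_rheostat   # ~7.1 W, consistent with the ~7 W reported
p_fit = I_normal**2 * R_chosen     # ~8.1 W at normal load

print(f"experimental dissipation ~ {p_exp:.1f} W")
print(f"10 Ohm resistor dissipates ~ {p_fit:.1f} W (rating {P_rating:.0f} W)")
# Note: dissipation grows with the square of the current, so the margin
# shrinks at peak load, when the current approaches the motor's rated value.
```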
7.3 Experimental Validation
The designed system was tested on the two volunteers for whom the design was tailored. The experiment comprised two cases: first, lifting the arm without human effort, and second, lifting the arm with a load and then holding that load for a recorded time from which efficiency is calculated. The experiment used each user's left hand, as both were right-handed; this prevents them from contributing effort during the experiment. The experiment was repeated with and without the exoskeleton, and the payload holding time was recorded. The volunteers had heights of 152 cm and 165 cm and weights of 49 kg and 53 kg. Both were able to lift the arm effortlessly, so the first experiment was accomplished successfully. The second experiment showed a roughly threefold increase in efficiency over the bare hand: it
was observed that a payload of 3 kg was held with one hand for 38 s with the exoskeleton, versus just 10 s with the bare hand, approximately a threefold improvement over unassisted human effort.
7.4 Conclusion
In this work, an upper-body exoskeleton arm for empowering people with muscular disorders was designed and tested. The exoskeleton arm is intended to power, or amplify the ability of, the human elbow. The inverse dynamic model of the system was simulated through kinematic analysis and workspace analysis. Upon validation of the model, the mechanical system design covered material selection for the exoskeleton frame, joint-imitating design, load sharing of the complete mechanical structure, and dimensioning. The electrical system of the prototype covered the important issues of actuator selection, power supply requirement analysis, control scheme design, and component and controller selection. The two systems were integrated, and the coordinated motion of the prototype and the human arm was tested under different conditions.
References

1. Okubanjo AA et al (2017) Modeling of 2-DOF robot arm and control. Futo J Series (FUTOJNLS) 3(2):80–92
2. Shah J et al (2015) Dynamic analysis of two-link robot manipulator for control design using computed torque control. Int J Res Comput Appl Robot 3(1):52–59
3. Khalate AA et al (2011) An adaptive fuzzy controller for trajectory tracking of robot manipulator. Intell Control Autom 2:364–370
4. Ward K (2001) Rapid simultaneous learning of multiple behaviours with a mobile robot. Proc Australian Conference on Robotics and Automation, Sydney, pp 14–15, Nov 2001
5. Sciavicco L, Siciliano B (2000) Modelling and control of robot manipulators, 2nd edn. Springer, London/New York
6. Ebrahimi A (2017) Stuttgart Exo-Jacket: an exoskeleton for industrial upper body applications. IEEE International Conference on Human System Interactions (HSI)
7. Xiloyannis M et al (2017) Preliminary design and control of a soft exo-suit for assisting elbow movements and hand grasping in activities of daily living. J Rehab Assist Technol Eng (RATE) 4:1–15
8. Thomas N et al (2016) Development of an assistive robot for the torque analysis of upper extremity joints. Int J Sci Eng Res 7(12):1738–1743
9. Martinez F et al (2008) Design of a five actuated DoF upper limb exoskeleton oriented to workplace help. IEEE/RAS-EMBS International Conference on Biomedical Robotics and Biomechatronics
10. Almomani A et al (2016) The 1st pneumatic fluidic muscles based exoskeleton suit in the U.A.E. IEEE/American Society of Engineering Education
11. Parasuraman S et al (2009) Human upper limb and arm kinematics for robot based rehabilitation. IEEE/ASME International Conference on Advanced Intelligent Mechatronics
12. Munasinghe et al (2014) Reduced jerk joint space trajectory planning method using 5-3-5 spline for robot manipulators. IEEE International Conference on Information and Automation for Sustainability
13. Stambolian D et al (2016) Development and validation of a three dimensional dynamic biomechanical lifting model for lower back evaluation for careful box placement. Int J Ind Ergon 45:10–18
14. Seo NJ et al (2009) A comparison of two methods of measuring static coefficient of friction at low normal forces: a pilot study. Ergonomics 52(1):121–135
15. Kumra S et al (2012) Design and development of 6 DOF robotic arm controlled by man machine interface. IEEE International Conference on Computational Intelligence and Computing Research
16. Shaari LA et al (2015) Torque analysis of the lower limb exoskeleton robot design. ARPN J Eng Appl Sci 10(19):9140–9149
17. Gopura et al (2016) Developments in hardware systems of active upper-limb exoskeleton robots: a review. Robot Auton Syst 75:203–220
8 A Comprehensive Study of Image Fusion Techniques and Their Applications

R. Indhumathi, S. Nagarajan, and T. Abimala
Abstract
The method by which data from different images are amalgamated into a single image, so as to upgrade the quality of the image and reduce artifacts, randomness, and redundancy, is known as image fusion. Image fusion plays a vital role in the medical field. Its objective is to process the information at every pixel position in the input images and retain the data that constitute the genuine scene or enhance the suitability of the fused image for a given application. The fused image offers, in a single image, an insight into the data contained within multiple images, which helps physicians diagnose diseases more effectively. Although numerous individual fusion strategies yield optimum results, the focus of researchers is moving toward hybrid fusion techniques, which can exploit the attributes of both multi-scale and non-multi-scale decomposition methods. The illustrative study in this chapter forms a basis for nurturing advanced research ideas in the field of image fusion.

Keywords

Fused image · Image fusion · Multi-scale decomposition · Multi-scale and non-multi-scale decomposition
R. Indhumathi (*) EEE Department, Jerusalem College of Engineering, Chennai, Tamil Nadu, India
S. Nagarajan Surendra Institute of Engineering and Management, Siliguri, India
T. Abimala (*) ICE Department, St. Joseph's College of Engineering, Chennai, Tamil Nadu, India
Abbreviations

CT Computed Tomography
DTCWT Dual-Tree Complex Wavelet Transform
DWT Discrete Wavelet Transform
GA Genetic Algorithm
MRI Magnetic Resonance Imaging
NSCT Non-Subsampled Contourlet Transform
NSST Non-Subsampled Shearlet Transform
PCA Principal Component Analysis
PCNN Pulse Coupled Neural Network
PET Positron Emission Tomography
SWT Stationary Wavelet Transform

8.1 Introduction
Medical diagnosis is the process of identifying a disease from a person's symptoms and signs [1]. The data required for diagnosis are typically collected from the history and physical examination of the individual. Medical diagnosis can be regarded as an attempt to classify an individual's condition into separate and distinguishable categories that allow decisions about treatment and prognosis to be made. Hence, these techniques play a vital role. However, a single medical diagnostic technique is often not sufficient to diagnose a particular disease, so medical practitioners employ a secondary diagnostic procedure to obtain complete data about a specific condition [2]. In such cases, two distinct images must be analyzed. Some researchers try to integrate the two different images at the hardware level to provide complete information about the disease, but hardware-level integration is computationally complex. Instead, complementary information from the two distinct images is integrated, thereby organizing the facts contained within the respective images [3]. A fused image offers, in a single image, an insight into the data contained within multiple images, which helps physicians diagnose diseases more effectively. Multi-modal image fusion is an easy avenue for doctors to recognize damage by anatomizing images from the original modalities.

Medical imaging is the methodology of creating visual representations of the interior of the body for medical intervention. It reveals information about internal structures hidden by the skin and bones and supports the diagnosis and treatment of disease. Distinctive imaging techniques such as computed tomography (CT), magnetic resonance imaging (MRI), positron emission tomography (PET), single-photon emission computed tomography (SPECT), etc. provide distinctive information about the human body that is crucial for diagnosing diseases. However, owing to observational constraints, these modalities individually cannot give complete data. The intention of image fusion is to process the content at each pixel position in the input images and retain the information that represents the genuine scene or
enhances the suitability of the fused image for a given application. Computed tomography (CT) provides information about dense structures such as bone, magnetic resonance imaging (MRI) provides information about soft tissue, and positron emission tomography (PET) provides information about metabolic activity within the body. PET/CT is a superior diagnostic procedure but is not accessible everywhere, because it is a recent technology in the field of medicine. Though the PET and CT acquisitions are done independently, they use the same machine, so the CT and PET images can be overlaid to yield an output containing anatomical information overlaid with the metabolic activity acquired by PET. Diagnosis is much easier if the anatomic region of concern is metabolically active, which is an indication of cancer, and no registration errors arise from using distinct hardware. Cancer may be diffuse and not sufficiently concentrated in one area, so at times physicians will operate guided by a clear PET. Although false positives and negatives may occur with PET images, an accurate picture of what is going on is obtained most of the time; hence, the PET scan is a very valuable diagnostic tool.

Multi-sensor data fusion has become a more and more formal solution for various applications. Several situations in image processing require high spatial and high spectral information in a single image. Image fusion is used in various applications such as remote sensing, medical imaging, robot guidance, military applications, etc.

Image fusion techniques are broadly divided into two classes: spatial domain and frequency domain. Spatial domain techniques include average and maximum fusion algorithms and principal component analysis. Frequency domain techniques comprise pyramid-based decomposition and wavelet transforms. Pyramidal methods were proposed to overcome the drawbacks of spatial domain fusion and include the Gaussian pyramid, Laplacian pyramid, ratio-of-low-pass pyramid, and morphological pyramid methods. Wavelet-based fusion transforms include the discrete wavelet transform (DWT), stationary wavelet transform (SWT), and non-subsampled contourlet transform (NSCT). DWT is a multi-scale decomposition of a signal: it applies low-pass and high-pass filters to rows and columns, the low-pass filter extracting the low-frequency components and the high-pass filter the high-frequency components. Though DWT offers advantages such as flexibility and a high compression ratio, it suffers from a major drawback called shift variance. SWT was proposed to overcome this: it upsamples the filter coefficients instead, so the decomposed image has the same size as the original, eliminating downsampling in the forward direction and upsampling in the reverse direction. Though DWT and SWT handle isolated discontinuities well, they do not provide sufficient information about edges. NSCT was proposed to overcome these downsides; it is a geometric analysis technique that uses the geometric consistency present in the source images, thereby improving localization, shift invariance, etc.
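The shift-variance point can be seen numerically. The following minimal sketch (not from the chapter; it assumes the PyWavelets package, `pywt`) shifts a spike by one sample and compares the DWT and SWT detail coefficients:

```python
# Minimal illustration of DWT shift variance versus SWT shift invariance.
import numpy as np
import pywt

x = np.zeros(64)
x[20] = 1.0                 # an isolated spike
x_shift = np.roll(x, 1)     # the same signal shifted by one sample

_, d1 = pywt.dwt(x, "db2")
_, d2 = pywt.dwt(x_shift, "db2")
# Typically unequal: detail energy leaks between sub-bands as the signal shifts.
print("DWT detail energy:", np.sum(d1**2), "vs", np.sum(d2**2))

(_, s1), = pywt.swt(x, "db2", level=1)
(_, s2), = pywt.swt(x_shift, "db2", level=1)
# Without downsampling, the SWT coefficients simply shift with the input.
print("SWT details are a pure shift:", np.allclose(np.roll(s1, 1), s2))
```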
Image fusion can be accomplished at three levels: pixel level, feature level, and decision level. Pixel-level image fusion is usually performed by combining pixel values from the individual source images. Feature-level image fusion performs fusion only after segmenting the source images into features such as pixel intensity, edges, etc. Decision-level image fusion is a high-level fusion strategy that uses fuzzy rules, heuristic algorithms, etc. Pixel-level image fusion is the most widely used technique in medicine, remote sensing, computer vision, etc. The image fusion process generally comprises three important steps (a minimal code sketch follows below):

1. Mapping the pixels of the images into a transform domain
2. Fusing the low-frequency and high-frequency coefficients using suitable fusion rules
3. Applying the inverse transform to obtain the fused image

Pixel-level image fusion is classified into two major categories: multi-scale decomposition and non-multi-scale decomposition methods [4-6]. Fusion rules play a vital role in fusing the low- and high-frequency components. The most commonly used fusion rules are weighted averaging, the min-max rule, component substitution, machine learning, region-based consistency verification, cross-scale fusion, and coefficient- and window-based strategies based on activity-level measurement [7]. In this chapter, a detailed literature review of various image fusion strategies lays the foundation for nurturing research ideas in the field of image fusion.
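The sketch below (illustrative, not from the chapter; it assumes PyWavelets and pre-registered grayscale inputs) implements the three steps with two of the simple rules named above: averaging for the low-frequency band and absolute-maximum selection for the high-frequency bands.

```python
# Minimal three-step pixel-level fusion pipeline using the DWT.
import numpy as np
import pywt

def dwt_fuse(img_a, img_b, wavelet="db4", level=2):
    """Fuse two pre-registered, same-size grayscale images."""
    # Step 1: map both images into the wavelet domain.
    ca = pywt.wavedec2(img_a.astype(float), wavelet, level=level)
    cb = pywt.wavedec2(img_b.astype(float), wavelet, level=level)

    # Step 2a: average the low-frequency (approximation) coefficients.
    fused = [(ca[0] + cb[0]) / 2.0]

    # Step 2b: keep the larger-magnitude detail coefficient ("max" rule)
    # in every sub-band at every level.
    for (ha, va, da), (hb, vb, db) in zip(ca[1:], cb[1:]):
        fused.append(tuple(np.where(np.abs(x) >= np.abs(y), x, y)
                           for x, y in ((ha, hb), (va, vb), (da, db))))

    # Step 3: inverse transform back to the spatial domain.
    return pywt.waverec2(fused, wavelet)

# Example with random stand-ins for two registered source images.
a = np.random.rand(256, 256)
b = np.random.rand(256, 256)
f = dwt_fuse(a, b)
```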
8.2 Literature Review of Various Image Fusion Techniques

8.2.1 Wavelet Transform (WT)
Praveen Kumar Reddy Yelampalli et al. (2018) put forward a novel image fusion technique using the Daubechies wavelet transform [8]. A new attribute descriptor called the recursive Daubechies pattern (RDbW) was developed. Image registration was first done using the RDbW technique; after registration, the images were fused by employing the wavelet transform. The strategy applies the basis function recursively to the neighborhood pixels, from which the relationship between the center pixel and its neighbors is obtained; the RDbW feature vector is obtained by encoding this local pixel relationship into a binary sequence. The one-dimensional daub4 wavelet served as the mother wavelet and was applied iteratively to yield the local texture of the images. The proposed strategy produced good quantitative results for various quality metrics such as SIE, entropy, standard deviation, etc. Its performance was further investigated on multimodal images in the presence of Gaussian noise and AWGN. The altered pixel values were associated to yield the pertinence between the centermost pixel and
the neighboring pixels; the relationship obtained was then encoded into a binary sequence, and histograms were used to calculate the RDbW feature vector. Experimental results illustrated that the recursive Daubechies pattern (RDbW) outperformed existing image fusion strategies.

Paul Hill et al. (2017) put forward a perceptual image fusion strategy that uses explicit luminance and contrast masking models [9]. The source images were decomposed using the dual-tree complex wavelet transform (DTCWT), and a Noticeability Index of the image coefficients was calculated. This strategy keeps the perceptually important coefficients from the source images in the resultant fused image. Subjective and objective analyses illustrated that the DTCWT approach gave better results than extant image fusion strategies.

Image fusion is an emerging technology in the medical field [10]: fusing two or more images yields more of the information content required for accurate diagnosis and treatment. Rajarshi et al. (2016) put forward a strategy to improve the performance of the discrete wavelet transform (DWT). The strategy used approximation- and detail-layer fusion rules, and lastly the inverse DWT reconstructed the fused image. Compared with the multi-level local extrema (MLE) method, quantitative investigation illustrated that the proposed strategy produced better results in terms of PSNR, MI, and SSIM.

Neetu Mittal et al. (2015) utilized the discrete wavelet transform for fusing medical images [11]. The authors examined several wavelet families, including biorthogonal, Coiflet, Dmeyer, reverse biorthogonal, and Symlet. Among these, the Daubechies wavelet is the most widely used and is the basic foundation of wavelet signal processing. The biorthogonal wavelet exhibits the linear-phase property; the Mexican Hat, Morlet, and Meyer wavelets exhibit symmetry. Comparative analysis showed that entropy and standard deviation were worse for the Dmeyer (dmey) and Coiflet (coif) wavelet transforms and better for the Symlet (sym) wavelet transform. The authors concluded that the Symlet wavelet, which uses the maximum wavelet coefficients, outperformed the other fusion strategies.

Vani et al. (2015) put forward a novel approach to fuse multimodal images using the dual-tree discrete wavelet transform (DTDWT) [12]. The images were first decomposed by the DTDWT; an average fusion rule was employed for the low-frequency coefficients and a maximum fusion rule for the high-frequency coefficients. The fuzzy local information C-means algorithm (FLICM) was then run on the fused output, enabling easy detection of tumors. FLICM integrates spatial as well as gray-level information efficiently, enabling proper segmentation even in the presence of noise. Quantitative analysis demonstrated better results in terms of entropy, peak signal-to-noise ratio (PSNR), root mean square error (RMSE), standard deviation (SD), fusion factor (FF), and fusion symmetry (FS).

Tannaz Akbarpour et al. (2015) put forward a novel approach to extract regions affected by Alzheimer's disease [13]. Two modalities of MRI images
were fused using the dual-tree wavelet transform (DTWT), in which shift invariance is achieved by doubling the sampling rate. After fusion, the features of the fused image were extracted using the fuzzy C-means (FCM) algorithm. Visual and quantitative outcomes demonstrated that combining fusion and segmentation strategies produced better results, enabling accurate diagnosis of the affected area.

Ischemic stroke is a condition in which brain cells are destroyed owing to lack of blood supply [14], and detecting ischemic stroke from a CT image alone is difficult. To overcome this drawback, CT and MRI images were fused to yield a compound image that gives more detailed information than the inputs. Mirajkar et al. (2015) put forward a new image fusion methodology using the wavelet transform. The algorithm involves four phases: preprocessing of the CT and MRI images; determining the CT image equivalent to the input MRI image; image enrollment and fusion; and finally segmentation of the stroke lesion. Objective and subjective analyses illustrated the effectiveness of the wavelet transform.

Babu et al. (2015) put forward a novel filtering methodology combined with the curvelet and wavelet transforms [15]. The main objective was to reduce noise in the output image, since noise can significantly increase root mean square error and reduce peak signal-to-noise ratio. The authors used a non-local means filter to remove speckle noise, and a shrinkage rule was employed to shrink the noise content: when the shrinkage coefficient was smaller than the threshold value, noise could be removed and edge information easily preserved. Comparing the two transforms, the curvelet transform produced better results than the wavelet transform. Experimental outcomes demonstrated that the proposed strategy successfully eliminated noise, reduced RMSE, and increased the PSNR value.

Image fusion is a strategy of combining complementary information from two or more distinct images into a single image [16]. Arnika et al. (2014) put forward the MWT algorithm for fusing images from distinct modalities: the source images were first registered, the multi-wavelet transform was applied, and the resultant image was obtained through various fusion methodologies. Experimental analysis demonstrated that the fused image provided more detail with clear texture; the authors concluded that the methodology could be an effective approach to fusing medical images.

Elizabeth Thomas (2014) put forward a new image fusion strategy called the Daubechies complex wavelet transform (DCxWT) to blow away the drawbacks of the wavelet transform, namely sensitivity, poor directionality, and phase-information lag [17]. The multi-resolution principle was utilized, with a maximum selection rule for fusing the complex wavelet coefficients. The performance of DCxWT was verified both visually and quantitatively against LWT. The major advantages of DCxWT are its accurate reconstruction property, non-redundant wavelet transform,
and symmetric property; additionally, it offers phase information, which carries most of the structural information of an image.

The most commonly utilized fusion methodology is the biorthogonal wavelet transform (BWT) [18]. Maruturi Haribabu et al. (2014) proposed a biorthogonal wavelet transform using absolute-maximum and energy selection rules. Orthogonal filters generally do not have linear-phase characteristics, and phase distortion causes distortion along the edges; to overcome this, the biorthogonal wavelet, which has linear-phase characteristics and symmetry, was introduced. The strategy used wavelet and scaling functions to analyze an image, with the maximum selection rule for fusing the low-frequency coefficients and the energy fusion rule for the high-frequency coefficients. Compared with DWT and PCA, the proposed strategy performed better in terms of standard deviation, mean, and entropy.

Yang Yanchun et al. (2014) proposed a new image fusion methodology using the lifting wavelet transform (LWT) and a dual-channel pulse coupled neural network (PCNN) to meet the necessities of medical diagnosis [19]. Region spatial frequency was adopted to fuse the low-frequency coefficients, while the dual-channel PCNN, chosen for its simpler architecture and adaptability, fused the high-frequency coefficients. Quantitative analysis illustrated that the technique improved output quality over existing traditional fusion techniques while retaining the source and detail information with reduced complexity.

Tannaz Akbarpour et al. (2014) put forward a novel approach to fuse medical images [20]. To overcome the drawbacks of 2D wavelets, the authors employed complex wavelet transforms; the dual-tree complex wavelet transform is shift invariant and acts in six directions. After decomposition, the images were fused with maximum and average fusion rules. Quantitative parameters such as mutual information, standard deviation, and entropy illustrated that the strategy was superior to existing wavelet transforms.

Tian Lan et al. (2014) designed multimodal image fusion using the wavelet transform (WT) and the human visual system (HVS) [21]. The source images were first decomposed using WT and HVS techniques, and the inverse wavelet transform yielded the output image. Quantitative analysis illustrated that the WT-HVS combination performed better than existing fusion algorithms, which can be credited to fusing the approximation coefficients based on the human visual system. HVS models are also often used in digital watermarking, image compression, etc.

Shutao Li et al. (2013) put forward a strategy based on two-scale decomposition of an image using a guided filter [22]. The decomposition comprises two layers: a base layer containing large-scale intensity variations and a detail layer containing small-scale details. A weighted-average technique was proposed to exploit the spatial consistency of the base and detail layers. In addition, this technique introduced a fast two-scale
fusion method. Guided filtering was utilized as a local, edge-preserving filtering method; the computation time of the guided filter is independent of the filter size, and it exploits the strong interrelationships between neighboring pixels. The authors observed that this two-scale guided-filter decomposition could be used effectively for image registration while preserving the original and complementary data of the individual source images.

Rui Shen et al. (2013) proposed a new fusion rule for multi-scale decomposition of volumetric medical images that takes into account both intrascale and interscale consistencies [23]. An exemplar representation of coefficients was determined using appropriate information from the source images, from which an efficient color fusion scheme was proposed. The cross-scale (CS) fusion rule passes data between decomposition levels to accomplish intrascale and interscale consistency, so that the resultant image preserves most of the information from the sources. Experiments on volumetric medical images demonstrated the effectiveness and flexibility of the cross-scale fusion rule, and the authors noted that it could be extended to 4-D medical images.

Sharmila et al. (2013) put forward a new fusion technique called the discrete wavelet transform-averaging-entropy-principal component analysis method (DWT-A-EN-PCA) [24]. Existing fusion algorithms share familiar shortcomings: the contrast pyramid method loses an excessive amount of data from the input images, the ratio pyramid strategy delivers heaps of false information that does not exist in the inputs, and the morphological pyramid method creates numerous wrong edges. Though the curvelet transform diminishes the fusion error to roughly O((log n)^3 / n^2), the computational cost of its fusion process is O(n^2 log n). To beat these confinements, DWT-A-EN-PCA was proposed. Experimental analysis illustrated that the entropy (EN), signal-to-noise ratio (SNR), and fusion symmetry (FS) parameters were better for the proposed strategy.

Sohaib Afzal et al. (2013) put forward a new two-stage medical image fusion scheme [25]. First, multi-scale fusion procedures were applied to yield fused images; second, the individual results were consolidated using a weighted average with a local structural similarity measure as the weights, computed at each pixel location. This procedure gave a superior-quality fused image. Performance analysis showed that the strategy produced better fused images than individual multi-scale techniques such as the discrete wavelet transform, dual-tree complex wavelet transform, Laplacian pyramid, contourlet transform, and curvelet-transform-based fusion. Fused images were evaluated quantitatively using different quality metrics; the analysis showed that MI improved by 3-4% and PSNR increased by 2% with the proposed strategy. The authors inferred that greater control over the features of the input images could be gained by properly training the classifiers.

Rajiv Singh et al. (2012) put forward fusion of multimodal medical images using the Daubechies complex wavelet transform [26]. Existing image fusion strategies suffer from issues such as sensitivity, phase-information lag, and poor
directionality. To conquer these issues, the authors used the complex wavelet transform, a shift-invariant technique. The DCxWT preserves edge information and produces phase information, and it is highly immune to noise and contrast distortions. The DCxWT algorithm was compared with wavelet-domain methods (dual-tree complex wavelet transform (DTCWT), lifting wavelet transform (LWT), multiwavelet transform (MWT), stationary wavelet transform (SWT)) and spatial-domain methods (principal component analysis (PCA), contourlet transform (CT), and non-subsampled contourlet transform (NSCT)-based image fusion). Quantitative analysis used suitable quality metrics such as entropy, edge strength, standard deviation, fusion factor, and fusion symmetry, and showed that the proposed method was superior to existing fusion methods. DCxWT was further tested against Gaussian, salt-and-pepper, and speckle noise to inspect its validity.

Huimin Lu et al. (2012) put forward the maximum local energy method to measure the low-frequency coefficients of images, with results compared against the wavelet transform [27, 28]. The coefficients of two different types of images were obtained through beyond-wavelet transforms; the low-frequency coefficients were chosen using the maximum local energy technique, while the high-frequency coefficients were selected using the modified Laplacian method. Finally, the inverse beyond-wavelet transform produced the fused image. Three types of images (multifocus, multimodal medical, and remote sensing) were used, and the outcomes were analyzed quantitatively. The analysis demonstrated that maximum local energy can be a novel methodology for obtaining fused images with adequate performance; the MLE-Bandelet and MLE-contourlet transforms were superior for analyzing multifocus images, while the MLE-contourlet transform was better than the MLE-Bandelet transform for processing CT/MRI and remote sensing images.

Haozheng Ren et al. (2011) put forward an image fusion method based on the wavelet transformation [28]. After explaining the fundamentals of multi-focus image fusion, they introduced enhanced wavelet transforms such as the multi-wavelet and the multi-band multi-wavelet, and applied the multi-band multi-wavelet in image fusion with the wavelet fusion strategy. They compared methods based on images, windows, and regions and adopted a fusion norm based on grades and characteristic measurement of regional energy. They also compared different wavelet transforms with respect to entropy, peak signal-to-noise ratio, square root error, and standard error. The experimental outcomes demonstrated that the multi-band multi-wavelet was extremely viable in image fusion. Certain post-processing strategies, namely anisotropic diffusion based on partial differential equations, were applied to the fused image; the experiments demonstrated that this diffusion improved the PSNR of the image and suppressed the block effects brought about by the wavelet fusion method.
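As a sketch of the two-scale base/detail idea discussed above for [22], the following simplified Python code splits each image into base and detail layers, builds saliency-based weights, and refines them with a guided filter so they follow image edges. This is not the authors' exact method: the box-filter base layer, Laplacian saliency, window sizes, and the single shared weight map are illustrative assumptions.

```python
# Simplified two-scale fusion with guided-filter weight refinement.
import numpy as np
from scipy.ndimage import uniform_filter, laplace, gaussian_filter

def guided_filter(guide, src, radius, eps):
    """Grayscale guided filter (He et al.): smooth src along guide's edges."""
    size = 2 * radius + 1
    mean_i = uniform_filter(guide, size)
    mean_p = uniform_filter(src, size)
    cov_ip = uniform_filter(guide * src, size) - mean_i * mean_p
    var_i = uniform_filter(guide * guide, size) - mean_i ** 2
    a = cov_ip / (var_i + eps)
    b = mean_p - a * mean_i
    return uniform_filter(a, size) * guide + uniform_filter(b, size)

def two_scale_fuse(img_a, img_b, base_size=31):
    a, b = img_a.astype(float), img_b.astype(float)

    # Two-scale decomposition: base = local mean, detail = residual.
    base_a, base_b = uniform_filter(a, base_size), uniform_filter(b, base_size)
    det_a, det_b = a - base_a, b - base_b

    # Saliency from smoothed Laplacian magnitude -> binary weight map.
    sal_a = gaussian_filter(np.abs(laplace(a)), 3)
    sal_b = gaussian_filter(np.abs(laplace(b)), 3)
    w = (sal_a >= sal_b).astype(float)

    # Edge-aware refinement (large window for base, small for detail),
    # then weighted recombination of the layers.
    w_base = np.clip(guided_filter(a, w, radius=15, eps=0.1), 0, 1)
    w_det = np.clip(guided_filter(a, w, radius=7, eps=1e-3), 0, 1)
    base = w_base * base_a + (1 - w_base) * base_b
    detail = w_det * det_a + (1 - w_det) * det_b
    return base + detail
```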
Kunal Narayan Chaudhury et al. (2010) put forward the dual-tree complex wavelet transform (DT-WT), which exhibits better shift-invariance than the discrete wavelet transform [29]. The work used an amplitude-phase representation of the DT-WT that offers an immediate explanation for the improvement in shift invariance. The representation depends on the modifying action of the group of fractional Hilbert transform (fHT) operators, which extends the idea of arbitrary phase shifts from sinusoids to finite-energy signals. The authors characterized the shiftability of the DT-WT in terms of the shifting property of the fHTs and, by introducing a generalization of the Bedrosian theorem for the fHT operator, derived an explicit understanding of its shifting action for the family of wavelets obtained by modulating low-pass functions. Finally, they extended these ideas to the multidimensional setting by introducing a directional extension of the fHT, concluding that the shiftability of the dual-tree transform can be exploited for higher-dimensional images.

Image fusion is a kind of information-integration technology that has been used in numerous fields [30]; pathological changes can be detected and located exactly in the fused image. Cheng Shangli et al. (2008) proposed an image fusion strategy employing the wavelet transform. The input images were first decomposed into low- and high-frequency coefficients, and the decomposed data can be reconstructed perfectly with a suitable inverse wavelet transform. The authors noted that fast algorithms such as the MALLAT algorithm, as important to wavelet decomposition as the fast Fourier transform (FFT) is elsewhere, can be used; because the fast MALLAT algorithm upgraded the wavelet, it became easy to implement through VLSI. The localization property of wavelets proved good at isolating singularities and irregular structures in signals. Numerous experiments illustrated that the wavelet transform with weighted fusion yielded a good fused PET/CT image, from which the diseased region could be discovered and located more precisely than from a single CT or PET image.
8.2.2 Non-Subsampled Contourlet Transform (NSCT)
Zhiqin Zhu et al. (2019) put forward a novel image fusion technique utilizing phase congruency and local Laplacian energy [31]. The strategy uses the non-subsampled contourlet transform to decompose the source images into low- and high-frequency sub-bands; a local-Laplacian-energy-based rule fuses the low-frequency coefficients, and a phase-congruency-based rule fuses the high-frequency coefficients. Phase congruency (PC), a dimensionless quantification of the sharpness of an image, was implemented to produce instructive detailed information. Experimental analysis established better outcomes in both subjective and objective analyses. The
authors concluded that the strategy could be extended by streamlining the fusion rule for the low-frequency coefficients.

Weiwei Kong et al. (2019) put forward a novel image fusion framework based on local difference (LD) in the non-subsampled domain [32]. The source images were first decomposed into low- and high-frequency components using non-subsampled techniques and then fused using the local difference (LD) operator. Compared with the non-subsampled contourlet transform (NSCT) and non-subsampled shearlet transform (NSST), the strategy produced superior results in both qualitative and quantitative analyses. The authors found that it had extraordinary potential to amend and improve the data within the image, which could upgrade the clarity of disease detection.

Yong Yang et al. (2017) put forward a novel fusion framework utilizing the non-subsampled contourlet transform (NSCT) and sparse representation (SR) [33]. The source images were decomposed using the NSCT technique, and an SR-based scheme was proposed for fusing the low-frequency coefficients. A dictionary was built by combining numerous informative and compact sub-dictionaries, each learned by extracting a few principal component analysis bases from the jointly grouped patches acquired from the low-pass sub-images. A multi-scale morphology focus-measure (MSMF) was used to fuse the high-frequency coefficients. Experiments conducted on a group of multifocus images illustrated that the NSCT strategy outperformed existing state-of-the-art methods.

Image fusion is the strategy of amalgamating complementary data from distinct images to form a single image comprising both spectral and spatial information [34]. Vikrant Bhateja et al. (2016) utilized an NSCT-based fusion model integrating principal component analysis, phase congruency, directive contrast, and entropy: a phase congruency rule was applied to the low-frequency coefficients, while a combination of directive contrast and normalized Shannon entropy was applied to the high-frequency coefficients. Qualitative and quantitative analyses illustrated that the contourlet transform with PCA outperformed existing state-of-the-art fusion techniques.

Medical imaging has undergone tremendous development in the field of medical diagnosis [35], and the primary objective of liver perfusion imaging is to detect abnormalities in the liver accurately. Lakshmi Priya et al. (2016) put forward an approach for liver perfusion in the NSCT domain using phase congruency and a log-Gabor fusion rule. The NSCT approach preserves the information content of the source images, considerably improving the quality of the fused image. Quantitative analysis illustrated superior performance over the existing wavelet and contourlet transforms in terms of spatial frequency, SSIM, cross-correlation, cross-entropy, and PSNR.

Image fusion algorithms most commonly use the choose-max fusion rule to select the foremost coefficient [36]; however, the choose-max rule introduces distortion in the fused output, while the mean fusion rule blurs the edges. To beat these downsides, Egfin Nirmala et al. (2015) put
forward an innovative approach in which the fusion rules are replaced by a soft-computing methodology to enhance the accuracy of the fusion process. The NSCT strategy was employed to accomplish multi-resolution decomposition, the underlying objective being to extract significant visual information from the source images. An AdaBoost-SVM classifier was utilized, since it can select the essential features required for classification; coefficients were selected so that the fused image retains data from the individual source images. Experimental results revealed that the NSCT-SVM/AdaSVM strategy maximized mutual information and outperformed existing state-of-the-art fusion strategies.

Rupali Mankar et al. (2015) put forward a new image fusion strategy based on the discrete non-subsampled contourlet transform [37]. The aim was to reduce errors between the input and fused images. The images were decomposed into low- and high-frequency components using the NSCT technique, fused using averaging and gradient fusion rules, and reconstructed with the inverse NSCT. Visual inspection showed that the strategy preserves edges, while quantitative analysis illustrated satisfactory performance in terms of correlation coefficient, PSNR (peak signal-to-noise ratio), and mean square error.

Nikhil Dhengre et al. (2015) put forward a multimodal medical image fusion approach utilizing log-Gabor and guided filters with the non-subsampled contourlet transform (NSCT) [38]. The input images were first decomposed using the NSCT strategy. It has been demonstrated that the Gabor filter yields an effective encoding of natural images, and the log-Gabor filter overcomes the bandwidth limitation of the Gabor filter: it can be constructed with arbitrary bandwidth, and the improved bandwidth can be used to build a filter with minimal spatial extent. The log-Gabor filter was therefore used to extract the important low-frequency information from the low-frequency sub-bands. Likewise, the guided filter, an edge-preserving smoothing filter, was used to extract the high-frequency components from the high-frequency sub-bands. The local variance between the filtered outputs of the two source images was computed to fuse the low- and high-frequency sub-bands, and the inverse contourlet transform yielded the output image. Experimental outcomes illustrated better low- and high-frequency details than existing image fusion strategies.

Gaurav Bhatnagar et al. (2013) put forward a fusion framework based on the non-subsampled contourlet transform [39]. The input images were first decomposed using the NSCT technique, and two fusion rules based on phase congruency and directive contrast were used to fuse the low- and high-frequency components, respectively; the fused image was then produced by the inverse NSCT. Experimental outcomes showed that the methodology produced foremost results from both visual and quantitative viewpoints. An outstanding benefit of phase congruency is that the low-frequency
coefficients produce a contrast- and brightness-invariant representation that can finally be combined and contrasted.

Jianwen Hu et al. (2012) put forward a new multi-scale geometrical analysis known as the multi-scale directional bilateral filter (MDBF) [40], which incorporates the non-subsampled directional filter bank into the multi-scale bilateral filter: the MDBF is constructed by combining the MBF with the NSDFB, and the MBF is applied to the original image to obtain the detail sub-bands and the rough approximation sub-band. Gaussian filtering is one of the most commonly used routines for image smoothing; its underlying assumption is that images vary slowly over space, but this assumption fails at edges. To sweep away this drawback, the bilateral filter was developed: a non-linear, non-iterative, local procedure that smooths images while protecting the edges. By combining the edge-preserving characteristic of the bilateral filter with the directional filter bank's ability to capture directional information, the MDBF can better represent the inherent geometrical structure of images. To confirm its effectiveness, the MDBF was applied to multi-sensor images, and the experimental outcomes confirmed better quantitative output. On the other hand, the MDBF's run time is longer than that of traditional methods, because the process uses numerous operations and is space-variant as well as non-subsampled.

Single-photon emission computed tomography images have low resolution, and anatomical structures are absent from the data [41]; studies have therefore fused SPECT images with magnetic resonance (MR) images. Owing to the minimal similarity among the images, fused outputs are often obscured, bringing about the loss of some critical anatomical structures. To trounce these issues, Tianjin Li et al. (2012) proposed a variable-weight matrix assessed by minimizing a cost function using the simplex method. The methodology avoids the insufficiency of the traditional transparency technique in the fusion of MR and SPECT images and is synchronized with a multi-scale fusion rule to obtain fused images with intense luminance and exact details. The best part of the framework is its spontaneous properties, which permit smooth transitions between the original images and simple control of detail rendition; moreover, the process retains the comprehensible appearance of the original images, making the results easy to distinguish and comprehend. The fusion rule is presented within the GIHS framework and the non-subsampled contourlet transform (NSCT) to obtain clear visual understanding, improved feature preservation, and controlled detail rendition. Experiments on a standard brain atlas demonstrated the superiority of the proposed system over existing image fusion techniques.

Wang Xin et al. (2011) proposed a novel image fusion technique based on the contourlet transform [42]. The contourlet transform produces a pseudo-Gibbs effect for lack of translation invariance, so the authors united the contourlet transform with image-blocking fusion to avoid the pseudo-Gibbs effect. First, a new fusion rule based on the contourlet transform was
proposed to obtain the primary fused image. The base images and the primary fused image were then partitioned into equal-size image blocks, and the base image blocks most similar to the primary fused image blocks were chosen as the final fused image blocks. Experimental analysis confirmed that the method successfully eliminated the image distortion arising from the contourlet transform; the fusion effect was enhanced over both image-blocking and contourlet-transform fusion alone, and the algorithm efficiently overcame the translation variance of the contourlet transform, producing a superior image compared with conventional fusion methods.

Nemir AL-Azzawi et al. (2009) proposed a new methodology utilizing the dual-tree complex contourlet transform [43]. The images formed by the dual-tree complex contourlet transform, with superior contours and textures, retain the shift-invariance property. Fusion rules based on principal component analysis, relying on the frequency components of the dual-tree complex contourlet coefficients, were proposed: the PCA methodology fused the low-frequency coefficients, while a local energy model fused the high-frequency coefficients. Finally, the inverse dual-tree complex contourlet transform produced the final fused image. The images were decomposed into two levels using biorthogonal Daubechies wavelets. Visual and statistical analyses proved that the outcomes attained have additional comprehensive information with little information distortion.
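A minimal sketch of the classic PCA weighting that underlies low-frequency rules like the one just described (an illustrative generic form, not the exact method of [43]; in practice the weights would be applied to sub-band coefficients rather than raw images):

```python
# PCA-weighted fusion: weights come from the principal eigenvector of the
# 2x2 covariance of the two (registered, same-size) source images.
import numpy as np

def pca_weights(img_a, img_b):
    data = np.stack([img_a.ravel(), img_b.ravel()]).astype(float)
    cov = np.cov(data)                # 2x2 covariance matrix
    _, vecs = np.linalg.eigh(cov)     # eigenvectors, eigenvalues ascending
    v = np.abs(vecs[:, -1])           # principal component
    return v / v.sum()                # normalized fusion weights

def pca_fuse(img_a, img_b):
    w_a, w_b = pca_weights(img_a, img_b)
    return w_a * img_a + w_b * img_b
```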
8.2.3 Shearlet Transform (ST)
Medical image fusion plays an essential part in providing improved visualization for diagnosing diseases [44] and helps in diagnosing various critical diseases; however, the performance of fusion methodologies is often compromised by noise interference in the input images. Sneha Singh et al. (2019) put forward a novel image fusion strategy for neurological images that can capture even small-scale details efficiently. The source images were decomposed using the non-subsampled shearlet transform (NSST); the low-frequency components were fused using a sparse representation model, the high-frequency components using a guided filtering technique, and the inverse NSST reconstructed the fused image. Experiments were conducted on real magnetic resonance-single-photon emission computed tomography, magnetic resonance-positron emission tomography, and computed tomography-magnetic resonance neurological image data sets, and the results illustrated that the strategy outperformed present techniques in both qualitative and quantitative analyses.

C. S. Asha et al. (2019) put forward a multi-modal medical image fusion approach in the non-subsampled shearlet transform (NSST) domain via a chaotic grey wolf optimization algorithm [45]. The input images were first decomposed using the NSST technique; the low-frequency components were fused using the choose-
maximum fusion rule, while the high-frequency components were fused through a weighted combination using a recent chaotic grey wolf optimization algorithm to reduce the difference between the output and input images; the fused image was obtained by the inverse NSST transform. Validated on 100 datasets, the strategy performed better than existing state-of-the-art methods.

Emimal Jabason et al. (2019) put forward an innovative image fusion technique using the non-subsampled shearlet transform (NSST) with a novel energy maximization rule [46]. The marginal distributions of the high-frequency NSST coefficients showed heavier tails than the Gaussian distribution, so the authors used a location-scale distribution to portray the non-Gaussian statistics of the empirical NSST coefficients, estimating the parameters by maximum likelihood. They then used this model to build a maximum a posteriori estimator to obtain the noise-free coefficients. Experimental analysis illustrated that the signal intensities in the fused image were better than with present strategies; the method provided improved outputs for diagnosing neurological disorders such as Alzheimer's disease, epilepsy, and multiple sclerosis.

Image fusion is commonly utilized as an assistive approach for medical practitioners: computed tomography (CT) and magnetic resonance imaging (MRI) help practitioners diagnose diseases effectively [47]. Sneha Singh et al. (2018) recommended a novel image fusion strategy that uses the ripplet transform and shearlet transform in a cascaded manner, yielding numerous directional decomposition coefficients and thereby increasing the shift-invariance property. At the first stage, the low- and high-frequency components were fused using the sum-modified Laplacian and spatial frequency; at the second stage, a maximum fusion rule based on regional energy was exploited. The strategy helps preserve information from the individual source images, and experimental analysis illustrated that the cascaded strategy outperformed present image fusion techniques.

Yanyu Liu et al. (2018) suggested a novel fusion technique using the non-subsampled shearlet transform (NSST) and dictionary learning with sparse representation (SR) [48]. The source images were decomposed using the NSST technique; the high-frequency components were fused using a maximum fusion rule, the low-frequency components using an SR-based rule, and the inverse NSST reconstructed the fused image. Experimental outcomes illustrated better results in both visual and quantitative analyses.

Feng Wang et al. (2016) recommended a novel fusion technique using the non-subsampled shearlet transform (NSST) [49]. The input images were decomposed using the NSST technique; the local least root mean square error served as a new fusion weight for the low-frequency components, while a novel edge-preserving weight fused the high-frequency components. To conclude, the inverse NSST produced the fused image. Experimental outcomes
illustrated that the fused image preserved both rich details and structural contents from the individual source images. Jingming Yang et al. (2016) proposed multimodal image fusion using the non-subsampled shearlet transform and compressive sensing theory [50]. Initially, the NSST technique was engaged to disintegrate the base images into low- and high-frequency components. Then, a weighted fusion rule was employed to fuse the low-frequency coefficients while a fuzzy rule was employed to fuse the high-frequency coefficients. The directional sub-band coefficients were fused using compressive sensing theory, thereby increasing the speed of execution. Finally, inverse NSST was used to acquire the fused image. Trial analysis revealed that the recommended strategy offered numerous advantages such as reduced computational complexity, preservation of detailed information, and elimination of the Gibbs effect. Biswajit Biswas et al. (2015) put forward a novel image fusion methodology called spine medical image fusion which utilized a Wiener filter in the shearlet domain [51]. The proposed strategy amalgamated anatomical as well as functional information of CT and MRI images, thereby providing complementary information which was more useful for medical diagnosis and treatment. Initially, the low-frequency components were decomposed by utilizing singular value decomposition, which produced a singular value component of the low-frequency sub-bands. The singular values produced were diminished by utilizing a weight factor. Finally, the inverse shearlet transform was applied to create a representation of the low-frequency sub-bands, while the high-frequency sub-bands were produced by choosing the largest directional sub-bands from the CT and MRI images. Experimental analysis demonstrated that the proposed strategy outperformed the existing image fusion strategies. The approach put forward by Esin Karahan et al. (2015) utilized Markov–Penrose diagrams, an integration of Bayesian DAG and tensor network strategies [52]. The authors extended matrix-type EEG/fMRI fusion in order to couple tensor decompositions of EEG and fMRI. The proposed strategy helped to analyze time, frequency, and inverse problems. In order to fuse electrical and metabolic signals, bio-physical models were utilized. The indirect nature of the signals led to various inverse problems, which could be solved by utilizing the above approach. The proposed methodology gave the authors a clear perspective on multimodal fusion through partial least squares and matrix-tensor factorization. Experimental analysis illustrated that the proposed strategy was more useful in fusing metabolic and electrical signals. Lei Wang et al. (2012) recommended a multi-modal medical image fusion system which utilized the shift-invariant shearlet transform (SIST) [53]. The SIST was implemented by utilizing a non-subsampled pyramid filter scheme and shift-invariant shearing filters. The probability density function and standard deviation of the SIST coefficients were utilized to compute the fused coefficients. A key advantage of the shearlet technique over NSCT is that there are no limits on the number of directions for the shearing as well as the size of the supports. Furthermore, the inverse discrete shearlet transform requires only a summation of the shearing filters. This results in a performance that is computationally more efficient. The authors also considered the
inter- and intra-scale dependencies of the SIST coefficients across sub-bands in the recommended fusion rule. By doing so, more information from the base images was transferred into the fused image.
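The pipeline shared by the NSST-based methods in this section—decompose, fuse the low- and high-frequency sub-bands with separate rules, then invert—can be sketched compactly. The following is a minimal Python sketch in which a single-level DWT from PyWavelets stands in for NSST (which has no standard Python implementation); the averaging and choose-maximum rules, the function name fuse_dwt, and the db2 wavelet are illustrative assumptions, not taken from any cited paper.

```python
# Minimal sketch of the generic transform-domain fusion pipeline described
# above: decompose -> fuse low/high sub-bands with separate rules -> inverse
# transform. A single-level DWT (PyWavelets) stands in for NSST here.
import numpy as np
import pywt

def fuse_dwt(img_a: np.ndarray, img_b: np.ndarray, wavelet: str = "db2"):
    # Decompose both co-registered source images of equal size.
    lo_a, (lh_a, hl_a, hh_a) = pywt.dwt2(img_a, wavelet)
    lo_b, (lh_b, hl_b, hh_b) = pywt.dwt2(img_b, wavelet)

    # Low-frequency rule: simple averaging (stand-in for SR/energy rules).
    lo_f = 0.5 * (lo_a + lo_b)

    # High-frequency rule: choose-maximum by absolute coefficient value.
    def choose_max(c1, c2):
        return np.where(np.abs(c1) >= np.abs(c2), c1, c2)

    highs = tuple(choose_max(x, y) for x, y in
                  [(lh_a, lh_b), (hl_a, hl_b), (hh_a, hh_b)])

    # Inverse transform reconstructs the fused image.
    return pywt.idwt2((lo_f, highs), wavelet)
```

Replacing the two sub-band rules is the main design lever: the surveyed papers swap in sparse representation, regional energy, or guided filtering at exactly these two points while keeping the decompose/reconstruct scaffold unchanged.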
8.2.4
Neuro Fuzzy Techniques
Multimodal image fusion plays a vital role in medical analysis and treatment [54]. The pulse coupled neural network (PCNN) is one of the widely used image fusion techniques. Earlier, the PCNN parameters were adjusted manually, which did not yield satisfying results. Lu Tang et al. (2019) therefore proposed a quality-guided adaptive optimization strategy, where the pulse coupled neural network (PCNN) was optimized by utilizing a multi-swarm fruit fly optimization algorithm (MFOA). Quality assessment was utilized as a hybrid fitness function in order to enhance the functioning of the MFOA. The recommended strategy automatically selects the parameters in order to enhance the fusion effect. Experimental outcomes illustrated the efficacy of the recommended strategy, which gave better results than the existing image fusion techniques. Hajer Ouerghi et al. (2018) proposed a multimodal image fusion approach based on the non-subsampled shearlet transform (NSST) and a simplified pulse-coupled neural network model (S-PCNN) [55]. Initially, the images were converted into YIQ components. The registered MRI image and the Y component of the PET image were disintegrated into low- and high-frequency components using the NSST strategy. Low-frequency components were fused by utilizing weight region standard deviation (SD) and local energy, and high-frequency components were fused by utilizing the S-PCNN strategy. To conclude, the inverse NSST and inverse YIQ techniques were applied. Trial results illustrated that the recommended strategy performed better than the present image fusion techniques in terms of feature metrics such as mutual information, entropy, SD, fusion quality, and spatial frequency. Yulin Xiong et al. (2017) recommended an image fusion method which utilized the shift-invariant shearlet transform (SIST) and an adaptive pulse coupled neural network (PCNN) [56]. At first, the input images were disintegrated into low- and high-frequency components by using the SIST strategy. Low-frequency components were fused by utilizing a local variance and energy fusion rule, while the adaptive PCNN was utilized to fuse the high-frequency components. Spatial frequency in the SIST domain was utilized as the input of the PCNN, and the linking strength of the PCNN was taken from the gradient energy. Inverse SIST was utilized to reconstruct the fused image. Experimental outputs illustrated that the recommended strategy provided improved results in terms of qualitative and quantitative analysis. To enhance the performance of image fusion, Hu Shaohai et al. (2016) put forward a distinct strategy which utilized the block-matching and 3D filtering (BM3D) algorithm for image de-noising [57]. The proposed strategy initially organized groups of data into three-dimensional arrays by utilizing a block-matching strategy. Secondly, a three-dimensional transform which comprised a two-dimensional
non-subsampled shearlet transform and a one-dimensional discrete cosine transform was applied to yield the low- and high-frequency components. Low-frequency components were fused by using an average fusion rule while high-frequency components were fused by the sum-modified Laplacian (SML) technique. The firing maps of the PCNN were calculated by the SML strategy. The inverse transform was applied to reconstruct the fused image. Trial outcomes demonstrated the effectiveness of the recommended strategy in terms of qualitative and quantitative analysis. Medical image fusion is characterized as the procedure by which a single image is created by combining images from two distinct modalities [58]. Though the wavelet transform performs well at isolated discontinuities, it fails to capture smoothness along the edges. To overcome the downsides of the wavelet transform, Rajkumar et al. (2014) put forward two fusion techniques: an iterative neuro-fuzzy approach (INFA) and a lifting wavelet transform with neuro-fuzzy approach (LWT-NFA). In the suggested method, the authors adopted a general process where images were given a priority with a certain index. This index determined the number of times an image was to be fused to yield the final image. Quantitative analysis illustrated that the INFA methodology produced clearer and better information than the existing wavelet transforms. Behzad Kalafje Nobariyan et al. (2014) suggested a novel approach to enhance the resolution of the output image [59]. After image registration, YCbCr conversion was performed on the multispectral images to determine the luminance. A DWT fusion algorithm based on a pulse coupled neural network (PCNN) was employed to fuse the MRI image. Finally, inverse YCbCr conversion was employed to yield the output image. The YCbCr model was used to convert the multi-spectral image, which contains red, green, and blue (RGB) channels, into the components Y, Cb, and Cr, where Y is the luminance component and Cb and Cr are the blue-difference and red-difference chroma components. The major benefit of the suggested approach is its global coupling and pulse synchronization characteristics. Also, the PCNN is an efficient methodology to choose "better" high-frequency coefficients. The authors compared the proposed strategy with the DWT, contourlet, and curvelet methodologies in order to demonstrate the efficacy of the suggested strategy. The proposed approach not only preserved spectral and spatial resolution but also reduced spectral distortion. Das et al. (2011) put forward an innovative multimodality medical image fusion (MIF) strategy based on the ripplet transform (RT) which utilized a pulse-coupled neural network (PCNN) [60]. The suggested MIF scheme exploited the benefits of RT and PCNN to yield enhanced outcomes. The input medical images were initially disintegrated by the discrete RT (DRT). The low-frequency sub-bands (LFSs) were fused by means of the "max selection" rule, while for the high-frequency sub-bands (HFSs) the PCNN model was effectively used. Modified spatial frequency (MSF) in the DRT domain was given as input to stimulate the PCNN, and coefficients in the DRT domain with large firing times were chosen as the coefficients of the fused image. Finally, the inverse DRT (IDRT) was implemented to yield the fused image. The DRT was capable of capturing two-dimensional singularities and representing image edges more effectively. Experimental analysis demonstrated that the suggested method
could conserve more functional information in the fused image with higher spatial resolution and less deviation from the input images. The execution of the proposed algorithm was evaluated by various quantitative measures like mutual information (MI), spatial frequency (SF), entropy (EN), etc. Experimental analysis and comparisons proved the usefulness of the proposed scheme in fusing multimodal medical images. Medical image fusion is a vital step after registration; it is an integrative display strategy for two images that compensates for the lack of combined anatomic and functional information [61]. Building on the original image fusion methods and principal component analysis (PCA) theory, Jinzhu Yang et al. (2011) put forward a new block-enhanced fusion algorithm for medical images. The basic concept of the principal component analysis (PCA) image fusion method was to compute the covariance matrix of the two images and then determine the eigenvalues and their corresponding eigenvectors. To conclude, the fused image was obtained by using weighted coefficients derived from the eigenvalues and eigenvectors. The proposed algorithm not only removed the influence of image imbalance but also eliminated the distortion effect, which could provide more reference information for medical diagnosis. Experimental analysis proved that the performance of the proposed algorithm was superior to the traditional PCA algorithm. Qualitative results confirmed that the suggested technique was superior to the original images, producing higher contrast and clearer outlines. In medical imaging, the collection of multiple-task brain imaging data from the same subject has now become an extremely common practice [62]. Jing Sui et al. (2010) put forward a valuable model called "CCA + ICA" which was a powerful tool for multitask information fusion. This joint source separation model exploited two multivariate methodologies, canonical correlation analysis and independent component analysis. The purpose of the proposed methodology was to accomplish both high estimation precision and an accurate depiction of the relationship between the two data sets, in which the sources could have either shared or distinct inter-dataset relationships. Here, the authors focused on multi-task brain imaging data fusion, which is a second-level investigation using "features". Most of the present methods maximize (1) inter-subject/direction co-variation or (2) statistical independence among the components, or both, to unite two datasets. Yet such a prerequisite is not met in practice, so the two assumptions will not be satisfied simultaneously, bringing about a trade-off solution. To overcome this issue, a joint blind source separation (BSS) model was suggested whose assumptions are less stringent and which exploits the data to the fullest. The performance of CCA + ICA was analyzed by comparing it with joint-ICA and mCCA. As expected, the three methods effectively extracted diverse views of the data, with CCA + ICA seeming to emphasize both task-common and task-distinct aberrant brain regions in schizophrenia. Most medical images are fuzzy for two reasons: first, the noise signal blurs the high-frequency signal of image edges; second, the fringe of a tumor with normal tissues cannot be clearly characterized on the images [63]. To overcome these
drawbacks, Yang-ping Wang et al. (2007) put forward a novel approach using fuzzy radial basis function neural networks (Fuzzy-RBFNN). A global genetic algorithm (GA) was engaged to train the networks. Experimental results showed that the introduced approach performed better for multimodal medical images in both visual effect and objective evaluation parameters. Results showed that the introduced approach outperformed the gradient pyramid method in both visual and quantitative analyses, especially for blurred images.
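The PCA fusion rule described above for Jinzhu Yang et al. [61] reduces to weighting the two source images by the normalized components of the leading eigenvector of their 2x2 covariance matrix. The following is a minimal sketch under the assumption of co-registered, equal-size grayscale inputs whose intensities are positively correlated (so the leading eigenvector's components share a sign); the function name pca_fuse is ours.

```python
# Minimal sketch of the classical PCA fusion rule described above: the
# normalized leading eigenvector of the 2x2 covariance matrix of the two
# source images supplies the fusion weights.
import numpy as np

def pca_fuse(img_a: np.ndarray, img_b: np.ndarray) -> np.ndarray:
    # Treat each image as one variable; observations are the pixels.
    data = np.vstack([img_a.ravel(), img_b.ravel()]).astype(np.float64)
    cov = np.cov(data)                  # 2x2 covariance matrix

    # Eigen-decomposition; eigh returns ascending eigenvalues, so the
    # last column is the eigenvector of the largest eigenvalue.
    vals, vecs = np.linalg.eigh(cov)
    principal = vecs[:, -1]

    # Normalize the eigenvector components into weights summing to 1
    # (assumes positively correlated sources, the typical medical case).
    w = principal / principal.sum()
    return w[0] * img_a + w[1] * img_b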
8.2.5
Hybrid Technology
Ming Yin et al. (2019) introduced an image fusion framework which utilized the non-subsampled shearlet transform (NSST) [64]. Initially, the source images were disintegrated using the NSST technique into low- and high-frequency sub-bands. Low-frequency components were fused by employing an energy fusion rule while high-frequency components were fused by using a parameter-adaptive pulse-coupled neural network (PA-PCNN) model. Finally, inverse NSST was applied to reconstruct the fused image. To validate the efficacy of the proposed strategy, the authors tested it using 80 pairs of multimodal medical images. Experimental outcomes established the effectiveness of the proposed strategy in view of visual and objective analyses. Niladri Shekhar Mishra et al. (2018) proposed an image fusion technique which integrates fuzzy membership values as input to a pulse coupled neural network (PCNN) [65]. The inputs utilized were fuzzy in nature. Initially, the input images were disintegrated by utilizing the non-subsampled shearlet transform. After decomposition, the low-frequency components were fused with a maximum fusion rule and the high-frequency components were fused with fuzzy membership functions. An aggregation operation was performed to find the resultant fuzzy membership function. Finally, inverse NSST was applied to reconstruct the fused image. Visual and quantitative analyses illustrated that the suggested strategy outperformed the existing image fusion techniques. Zhiying Song et al. (2017) introduced a distinct image fusion strategy which utilized a pulse coupled neural network (PCNN) model in the non-subsampled shearlet transform (NSST) domain [66]. The authors utilized PET and CT images for their research database. To begin with, the base images were disintegrated into low- and high-frequency components using the NSST strategy. Secondly, a PCNN was utilized whose input was the energy of edges and whose linking strength was the average gradient. A maximum selection rule was used to fuse the low-frequency components while maximum region energy was utilized to fuse the high-frequency components. Inverse NSST yielded the fused image. The experiments conducted proved the superiority of the proposed strategy. Yong Yang et al. (2016) put forward an image fusion method which utilized the non-subsampled contourlet transform (NSCT) with type-2 fuzzy logic techniques [67]. Initially, the source images were decomposed by utilizing the NSCT strategy. Later, an
energy fusion rule was used to fuse the low-frequency coefficients while a type-2 fuzzy logic rule was used to fuse the high-frequency coefficients. Finally, inverse NSCT was applied to reconstruct the fused image. Subjective and objective evaluation confirmed the effectiveness of the proposed approach. In addition, the authors also presented an efficient color medical image fusion scheme. The color image fusion technique produced less distortion and an improved visual effect. Rasha Ibrahim et al. (2015) put forward a new image fusion strategy which integrated sparse representation with the robust principal component analysis (RPCA) algorithm [68]. Initially, the images were decomposed by utilizing the RPCA algorithm. Then the orthogonal matching pursuit (OMP) algorithm was used to fuse the low- and high-frequency coefficients by using suitable fusion rules. In order to retain the contrast of the image, an average fusion rule was employed to fuse the low-frequency coefficients while a maximum-norm fusion rule was employed to fuse the high-frequency coefficients. The maximum-norm fusion rule was used to preserve information about the edges. As a final point, the fused image was reconstructed from the sparse coefficients and an adaptive dictionary. On analysis, the proposed strategy was found to produce better results than the existing image fusion methodologies. Paramanandham et al. (2015) introduced a novel multi-focus image fusion technique combining the discrete wavelet transform and the stationary wavelet transform [69]. Primarily, the input images were disintegrated by utilizing the wavelet transforms and the coefficients were fused using suitable fusion rules. The fused image was attained by applying the suitable inverse wavelet transforms. The introduced algorithm was evaluated using suitable quality metrics such as RMSE, PSNR, SF, and entropy. The experimental outcomes demonstrated that the introduced fusion approach was valuable from both visual and quantitative inspections. Parmar Arpita et al. (2015) suggested a new image fusion strategy which involved the hybridization of the SWT and PCA methodologies [70]. Initially, the images were disintegrated into low- and high-frequency sub-bands by utilizing the stationary wavelet transform. The decomposed images were then fused by utilizing a PCA fusion rule. Finally, the inverse wavelet transform was applied to reconstruct the image. On analyzing the experimental results, it was clear that the suggested strategy produced better results than the present methodologies. Vikrant Bhateja et al. (2015) put forward a novel image fusion framework which utilized the stationary wavelet transform and the non-subsampled contourlet transform to fuse images from two diverse modalities [71]. In order to minimize redundancy, principal component analysis was utilized in the SWT stage. Following PCA, a maximum fusion rule was applied in the NSCT stage to improve the image contrast. Though the SWT technique yielded an image with enhanced frequency and time localization, it yielded shift variance in the fused image; in order to provide a shift-invariant output, NSCT was employed. On comparing the introduced strategy with the existing up-to-date fusion strategies, it was evident that cascading the SWT and NSCT domains outperformed the existing image fusion strategies. Though the existing image fusion methodologies provide good quantitative results, they cause spatial noise in the fused output [72]. To overcome this
downside, Richa Gupta et al. (2014) put forward a novel approach using the discrete wavelet packet transform (DWPT) and optimized the results obtained by utilizing a genetic algorithm (GA). DWPT is an extension of the traditional discrete wavelet transform; it decomposes an image into small frequency levels by utilizing entropy as a selection criterion. The effectiveness of the proposed approach was demonstrated by comparing the DWPT methodology with the IHS method. Abhinav Krishn et al. (2014) put forward a novel image fusion methodology which utilized principal component analysis (PCA) and wavelets [73]. The proposed methodology decomposed the source images by utilizing the 2D discrete wavelet transform. In order to make the best use of the spatial resolution, PCA was applied to the decomposed coefficients. To obtain improved fusion results, the Daubechies wavelet family was utilized. Simulation results demonstrated that the suggested technique worked better than the existing modern fusion approaches. The preservation of time and frequency components in the wavelet transform and the feature enhancement property of PCA made this approach more appropriate for medical image fusion. Himanshi et al. (2014) put forward an efficient image fusion methodology based on principal component analysis (PCA) and the dual-tree complex wavelet transform (DTCWT) [74]. DTCWT has various advantages such as shift invariance, high directional selectivity, etc. Additionally, DTCWT delivered phase information which improved the fusion procedure. The fused output furnished spectral and spatial information as well as details about the soft tissues. Comparative analysis made it clear that DTCWT showed a significant improvement over the existing strategies by restoring information content from the individual modalities. The DTCWT technique was more precise and was utilized for multimodal medical image fusion. Bhutada et al. (2011) suggested a new approach which exploited the attributes of the wavelet and curvelet transforms independently and adaptively in "homogeneous", "non-homogeneous" and "neither homogeneous nor non-homogeneous" regions, which were distinguished by a separate approach [75]. The edge information which is not captured by the wavelet transform was extracted by denoising with the curvelet transform. This extracted data was utilized as edge structure information (ESI) for combining the respective regions of the denoised images procured by utilizing the wavelet and curvelet transforms. The fused image obtained from the proposed algorithm offered the following advantages: preservation of the edge information, and better smoothness in the background owing to the removal of fuzzy edges developed during the denoising procedure by the curvelet transform. Not much improvement was seen in the SNR and PSNR of the intended technique when evaluated against the curvelet transform-based denoising methodology. From the experimental outcomes, it was obvious that denoising the non-homogeneous region preserved edge information. The exclusion of fuzzy edges from the homogeneous region was the main novelty of the proposed approach. It could be viewed from the results that for almost all denoised images with diverse Gaussian noise levels, there was a similar rate of improvement in SNR, PSNR, UQI, and SSIM as compared with the other recent techniques like WT, WT1, CT, and CT1. Finally, the authors concluded that the proposed approach outperformed the other approaches by preserving edges. To
conquer the lack of shift invariance in the contourlet transform and to enable image fusion in accordance with human vision properties, LIU Fu et al. (2011) utilized the non-subsampled contourlet transform (NSCT) and pulse coupled neural network (PCNN) jointly in an image fusion strategy, yielding a high-efficiency fusion algorithm suited to human vision properties [76]. The authors decomposed the original images to obtain the coefficients of the low- and high-frequency sub-bands. The coefficients of the low- and high-frequency sub-bands were processed by a modified PCNN. The fused image was achieved by applying the inverse NSCT transformation. Trial results proved that this process performed better than the wavelet, contourlet, and traditional PCNN methods, since it achieved greater mutual information. Also, the proposed methodology could preserve edge information and texture well, and the fused image conserved more information content. Though the proposed strategy has various advantages, the effective choice of parameters still needs to be researched meticulously. An overview of various multi-modal image fusion techniques is presented in Table 8.1 for the better understanding of the reader. A comparative analysis of various image fusion techniques in terms of advantages and drawbacks is presented in Table 8.2.
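Many of the hybrid methods above use the PCNN firing map as the activity measure that decides which source's sub-band coefficient survives. The following is a minimal sketch of a simplified PCNN of that kind; the parameter values and the linking kernel are illustrative assumptions, not taken from any of the cited papers, and the model omits the feeding leak term of the full PCNN.

```python
# Minimal sketch of a simplified pulse-coupled neural network (PCNN) used as
# an activity measure for fusion: each sub-band drives one neuron per pixel,
# and the accumulated firing map decides which source's coefficient is kept.
import numpy as np
from scipy.ndimage import convolve

def pcnn_firing_map(stimulus, iters=30, alpha_t=0.2, v_t=20.0, beta=0.1):
    s = stimulus / (np.abs(stimulus).max() + 1e-12)   # normalized input
    link_kernel = np.array([[0.5, 1.0, 0.5],
                            [1.0, 0.0, 1.0],
                            [0.5, 1.0, 0.5]])
    y = np.zeros_like(s)          # neuron pulses
    theta = np.ones_like(s)       # dynamic threshold
    fire_count = np.zeros_like(s)
    for _ in range(iters):
        l = convolve(y, link_kernel, mode="constant")  # linking input
        u = s * (1.0 + beta * l)                       # internal activity
        y = (u > theta).astype(float)                  # pulse generation
        theta = theta * np.exp(-alpha_t) + v_t * y     # decay, reset on fire
        fire_count += y
    return fire_count

def pcnn_fuse(band_a, band_b):
    # Keep the coefficient whose neuron fired more often.
    fa = pcnn_firing_map(np.abs(band_a))
    fb = pcnn_firing_map(np.abs(band_b))
    return np.where(fa >= fb, band_a, band_b)
```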
8.3
Conclusion
From the literature review conducted, it was inferred that the discrete wavelet transform (DWT) suffers from shift variance, which leads to loss of information. The stationary wavelet transform (SWT) does not provide good information along contours and edge regions. The non-subsampled contourlet transform (NSCT) suffers from the complications of designing the non-subsampled pyramid and directional filter bank, the requirement of proper filter tuning for specific applications, and perfect reconstruction filters. The non-subsampled shearlet transform (NSST) suffers from complications in the design of the shearing filters. Neuro-fuzzy techniques must be trained properly before performing image fusion. Hybrid strategies combine the attributes and advantages of diverse methods which are complementary to one another. In conclusion, pixel-level image fusion has gained the utmost importance in recent years, which demonstrates the importance of image fusion in various fields such as medicine, remote sensing, military surveillance, weapon detection, etc. At present, integrating different image fusion techniques, i.e., hybrid technology, plays a vital role in research. Though there are numerous advantages, image fusion techniques still suffer from certain drawbacks related to imaging hardware, computational complexity, noise, evaluation metrics, dissimilarity between images, training data sets, resolution differences between images, and environmental conditions. In this context, it is expected that innovative ideas and novel research contributions will keep surfacing and growing in the upcoming years.
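The shift-variance drawback attributed to the DWT above can be illustrated numerically. The following small sketch (assuming PyWavelets is available; signal, wavelet, and shift size are arbitrary choices) shows that a one-sample shift of the input noticeably changes the detail coefficients, which is why shift-variant transforms can misplace fused detail near edges.

```python
# Small numerical illustration of DWT shift variance: a one-sample shift of
# the input changes the detail-coefficient energy, unlike a shift-invariant
# transform, whose coefficients would simply translate.
import numpy as np
import pywt

rng = np.random.default_rng(0)
x = rng.standard_normal(256)
x_shift = np.roll(x, 1)                      # one-sample circular shift

_, d = pywt.dwt(x, "db2")
_, d_shift = pywt.dwt(x_shift, "db2")

# Relative change in detail-coefficient energy caused by the shift.
rel_change = abs(np.sum(d**2) - np.sum(d_shift**2)) / np.sum(d**2)
print(f"relative detail-energy change under 1-sample shift: {rel_change:.3f}")
```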
Table 8.1 Overview of various multi-modal image fusion strategies
(Dataset for all entries: CT, MRI, SPECT, PET, X-ray, ultrasound, mammograms. Application for all entries: detection of tumor, lung cancer detection, study of abdomen area, sarcoma, remote sensing, military surveillance, etc.)

| Transform | Author | Fusion rule |
|---|---|---|
| Wavelet transform (WT) | Praveen Kumar Reddy Yelampalli [8] | Recursive Daubechies pattern |
| | Paul Hill [9] | Dual-tree complex wavelet transform |
| | Rajarshi [10] | Discrete wavelet transform |
| | Neetu Mittal [11] | Discrete wavelet transform |
| | Vani [12] | Dual-tree discrete wavelet transform |
| | Tannaz Akbarpour [13] | Dual-tree wavelet transform |
| | Mirajkar [14] | Discrete wavelet transform |
| | Babu [15] | Curvelet transform |
| | Arnika [16] | Multi-wavelet transform |
| | Elizabeth Thomas [17] | Daubechies complex wavelet transform |
| | Maruturi Haribabu [18] | Biorthogonal wavelet transform |
| | Yang Yanchun [19] | Lifting wavelet transform |
| | Tannaz Akbarpour [20] | Dual-tree complex wavelet transform |
| | Tian Lan [21] | Wavelet transform (WT) and human vision system |
| | Shutao Li [22] | Discrete wavelet transform |
| | Rui Shen [23] | Cross-scale fusion |
| | Sharmila [24] | Discrete wavelet transform, averaging |
| | Sohaib Afzal [25] | Multi-scale wavelet |
| | Rajiv Singh [26] | Daubechies complex wavelet |
| | Huimin Lu [27, 28] | Beyond wavelet transform |
| | Haozheng Ren [28] | Multi-wavelet and multiband multi-wavelet |
| | Kunal Narayan Chaudhury [29] | Dual-tree complex wavelet transform |
| | Cheng Shangli [30] | Discrete wavelet transform |
| Non-subsampled contourlet transform (NSCT) | Zhiqin Zhu [31] | Phase congruency and local Laplacian |
| | Weiwei Kong [32] | Local difference (LD) |
| | Yong Yang [33] | SR and multiscale morphology focus measure |
| | Vikrant Bhateja [34] | Phase congruency & directive contrast |
| | Lakshmi Priya [35] | Phase congruency and log-Gabor rule |
| | Egfin Nirmala [36] | Soft computing methodology |
| | Rupali Mankar [37] | Averaging and gradient rule |
| | Nikhil Dhengre [38] | Log-Gabor and guided filter |
| | Gaurav Bhatnagar [39] | Phase congruency & directive contrast |
| | Jianwen Hu [40] | Bilateral filter |
| | Tianjin Li [41] | Multi-scaled fusion rule |
| | Wang Xin [42] | PCA, maximum fusion rule |
| | Nemir AL-Azzawi [43] | Maximum fusion rule |
| Non-subsampled shearlet transform (NSST) | Sneha Singh [44] | Sparse representation model and guided filtering |
| | C. S. Asha [45] | Maximum fusion rule and chaotic grey wolf optimization |
| | Emimal Jabason [46] | Energy maximization rule |
| | Sneha Singh [47] | Sum-modified Laplacian and spatial frequency |
| | Yanyu Liu [48] | Sparse representation and maximum fusion rule |
| | Feng Wang [49] | Local least root mean square error |
| | Jingming Yang [50] | Weighted fusion rule and fuzzy rule |
| | Biswajit Biswas [51] | Singular value decomposition |
| | Karahan [52] | Markov–Penrose diagrams |
| | Lei Wang [53] | Probability density function and standard deviation |
| Neuro-fuzzy algorithm | Lu Tang [54] | Multi-swarm fruit fly optimization |
| | Hajer Ouerghi [55] | Weight region standard deviation (SD) |
| | Yulin Xiong [56] | Local variance and energy fusion rule |
| | Hu Shaohai [57] | Block-matching and 3D filtering |
| | Rajkumar [58] | Iterative neuro-fuzzy approach |
| | Behzad Kalafje Nobariyan [59] | YCbCr model |
| | Das [60] | Maximum fusion rule and PCNN |
| | Jinzhu Yang [61] | Principal component analysis |
| | Jing Sui [62] | CCA + ICA |
| | Yang-ping Wang [63] | Genetic algorithm |
| Hybrid technology | Ming Yin [64] | NSST and PCNN using energy rule |
| | Niladri Shekhar Mishra [65] | NSST and PCNN using fuzzy rules |
| | Zhiying Song [66] | NSST and PCNN using maximum region energy |
| | Yong Yang [67] | NSCT with fuzzy using energy and type-2 fuzzy rules |
| | Rasha Ibrahim [68] | SR with PCA using OMP algorithm |
| | Paramanandham [69] | DWT and SWT |
| | Parmar Arpita [70] | SWT and PCA |
| | Vikrant Bhateja [71] | SWT and NSCT |
| | Richa Gupta [72] | DWPT and GA using entropy criterion |
| | Abhinav Krishn [73] | PCA and wavelets |
| | Himanshi [74] | PCA and DTCWT |
| | Bhutada [75] | Wavelet and curvelet |
| | LIU Fu [76] | NSCT and PCNN |
Table 8.2 Comparison analysis of various image fusion techniques

| S. No | Fusion transform | Advantages | Drawbacks |
|---|---|---|---|
| 1 | Wavelet transform [8] | Multi-resolution nature, higher scalability and flexibility | Shift variant |
| 2 | Non-subsampled contourlet transform [31] | Shift invariant | Complication in designing the non-subsampled pyramid and directional filter bank; requirement of proper filter tuning for specific applications; perfect reconstruction filters |
| 3 | Non-subsampled shearlet transform [44] | Use of shearing filters helps to extract data from all directions | Complication in designing shearing filters |
| 4 | Neuro-fuzzy algorithms [54] | Considers the relationship between the pixels; the images need not be entirely registered before fusion | Poor quantitative results |
| 5 | Hybrid technology [64] | Better qualitative and quantitative results | — |
References
1. Zhoue Y, Gao K, Dou Z et al (2018) Target-aware fusion of infrared and visible images. IEEE Access 6:79039–79049
2. Kong W, Miao Q, Yang L (2018) Multimodal medical sensor medical image fusion based on local difference in non-subsampled domain. IEEE Trans Instrum Meas 68(4):938–951
3. Zhang K, Wang M, Yang S et al (2018) Spatial–spectral-graph-regularized low-rank tensor decomposition for multispectral and hyperspectral image fusion. IEEE J Sel Topics Appl Earth Observ Remote Sens 11(4):1030–1040
4. Ghassemian H (2016) A review of remote sensing image fusion methods. Inf Fusion 32:75–89
5. Li S, Kang X, Fang L, Hu J, Yin H (2017) Pixel-level image fusion: a survey of the state of the art. Inf Fusion 33:100–112
6. James AP, Dasarathy BV (2014) Medical image fusion: a survey of the state of the art. Inf Fusion 19:4–19
7. Dogra A, Goyal B, Agrawal S (2017) From multi-scale decomposition to non-multi-scale decomposition methods: a comprehensive survey of image fusion techniques and its applications. IEEE Access 5:16040–16067
8. Yelampalli PKR, Nayak J, Gaidhane VH (2018) Daubechies wavelet-based local feature descriptor for multimodal medical image registration. IET Image Process 12(10):1692–1702
9. Hill P, Al-Mualla ME, Bull D (2016) Perceptual image fusion using wavelets. IEEE Trans Image Process 26(3):1076–1088
10. Rajarshi K, Himabindu CH. DWT based medical image fusion with maximum local extrema. IEEE International Conference on Computer Communication and Informatics (ICCCI), 2016 Jan 7–9; Coimbatore, India, pp 1–5
11. Mittal N, Singh HP, Gupta R. Decomposition & reconstruction of medical images in MATLAB using different wavelet parameters. IEEE International Conference on Futuristic Trends on Computational Analysis and Knowledge Management, 2015 Feb 25–27; Noida, India, pp 647–653
12. Vani M, Saravanakumar S. Multi focus and multi modal image fusion using wavelet transform. 3rd International Conference on Signal Processing, Communication and Networking, 2015 Mar 26–28; Chennai, India, pp 1–6
13. Akbarpour T, Shamsi M, Daneshvar S. Extraction of brain regions affected by Alzheimer disease via fusion of brain multispectral MR images. IEEE 7th International Conference on Information and Knowledge Technology, 2015 May 26–28; Urmia, Iran, pp 1–6
14. Mirajkar P, Ruikar SD (2013) Image fusion based on stationary wavelet transform. Int J Adv Eng Res Stud 2013:99–101
15. Babu G, Siva Kumar R, Praveena B. Design of spatial filter for fused CT and MRI brain images. IEEE International Conference on Advanced Computing and Communication Systems, 2015 Jan 5–7; Coimbatore, India, pp 1–6
16. Arnika M, Jayamathura V. Image fusion on different modalities using multi wavelet transforms. International Conference on Electronics and Communication Systems, 2014 Feb 13–14; Coimbatore, India, pp 1–4
17. Thomas E, Nair PB, John SN, Dominic M. Image fusion using Daubechies complex wavelet transform and lifting wavelet transform: a multiresolution approach. IEEE International Conference on Magnetics, Machines and Drives, 2014 July 24–26; Kottayam, India, pp 1–5
18. Haribabu M, Hima Bindu CH, Prasad KS. Image fusion with biorthogonal wavelet transform based on maximum selection and region energy. International Conference on Computer Communication and Informatics, 2014 Jan 3–5; Coimbatore, India, pp 1–6
19. Yang Y, Dang J, Wang Y. Medical image fusion method based on lifting wavelet transform and dual-channel PCNN. 9th IEEE Conference on Industrial Electronics and Applications, 2014 Jun 9–11; Hangzhou, China, pp 1179–1182
20. Akbarpour T, Shamsi M, Daneshvar S. Structural medical image fusion by means of dual tree complex wavelet. IEEE 22nd Iranian Conference on Electrical Engineering, 2014 May 20–22; Tehran, Iran, pp 1970–1975
21. Lan T, Xiao Z, Li Y, Ding Y, Qin Z. Multimodal medical image fusion using wavelet transform and human vision system. IEEE International Conference on Audio, Language and Image Processing, 2014 July 7–9; Shanghai, China, pp 491–495
22. Li S, Kang X, Hu J (2013) Image fusion with guided filtering. IEEE Trans Image Process 22(7):2864–2875
23. Shen R, Cheng I, Basu A (2013) Cross-scale coefficient selection for volumetric medical image fusion. IEEE Trans Biomed Eng 60(4):1069–1079
24. Sharmila K, Rajkumar S, Vijayarajan V. Hybrid method for multimodality medical image fusion using discrete wavelet transform and entropy concepts with quantitative analysis. IEEE International Conference on Communication and Signal Processing, 2013 April 3–5; Melmaruvathur, India, pp 489–493
25. Afzal S, Majid A, Kausar N. A novel medical image fusion scheme using weighted sum of multi-scale fusion results. IEEE 11th International Conference on Frontiers of Information Technology, 2013 Dec 16–18; Islamabad, Pakistan, pp 113–118
26. Singh R, Khare A (2014) Fusion of multimodal medical images using Daubechies complex wavelet transform – a multiresolution approach. Inf Fusion 19:49–60
27. Sahu A, Bhateja V, Krishn A, Himanshi. Medical image fusion with Laplacian pyramids. IEEE International Conference on Medical Imaging, m-Health and Emerging Communication Systems, 2014 Nov 7–8; Greater Noida, India, pp 448–453
28. Ren H, Lan Y, Zhang Y. Research of multi-focus image fusion based on M-band wavelet transformation. IEEE Fourth International Workshop on Advanced Computational Intelligence, 2011 Oct 19–21; Wuhan, China, pp 395–398
29. Chaudhury KN, Unser M (2009) On the shiftability of dual-tree complex wavelet transforms. IEEE Trans Signal Process 58(1):1–21
30. Shangli C, Junmin HE, Zhngwei LV. Medical image of PET/CT weighted fusion based on wavelet transform. IEEE 2nd International Conference on Bioinformatics and Biomedical Engineering, 2008 May 16–18; Shanghai, China, pp 2523–2525
31. Zhu Z, Zheng M, Qi G, Wang D, Xiang Y (2019) A phase congruency and local Laplacian energy based multi-modality medical image fusion method in NSCT domain. IEEE Access 7:20811–20824
32. Kong W, Miao Q, Yang L (2019) Multimodal sensor medical image fusion based on local difference in non-subsampled domain. IEEE Trans Instrum Meas 68(4):938–951
33. Yang YQ, Huang S, Pan L (2016) Multimodal sensor medical image fusion based on type-2 fuzzy logic in NSCT domain. IEEE Sensors J 16(10):3735–3745
34. Bhateja V, Srivastava A, Moin A, Lay-Ekuakille A. NSCT based multispectral medical image fusion model. IEEE International Symposium on Medical Measurements and Applications (MeMeA), 2016 May 15–18; Benevento, Italy, pp 1–5
35. Priya BL, Adaikalamarie SJ, Jayanthi K. Multi-temporal fusion of abdominal CT images for effective liver cancer diagnosis. International Conference on Wireless Communications, Signal Processing and Networking (WiSPNET), 2016 March 23–25; Chennai, India, pp 1452–1457
36. Egfin Nirmala D, Vaidehi V. Comparison of pixel-level and feature level image fusion methods. IEEE 2nd International Conference on Computing for Sustainable Global Development, 2015 March 11–13; New Delhi, India, pp 743–748
37. Mankar R, Daimiwal N. Multimodal medical image fusion under nonsubsampled contourlet transform domain. IEEE International Conference on Communications and Signal Processing, 2015 April 2–4; Melmaruvathur, India, pp 0592–0596
38. Dhengre N, Upla KP, Trivedi RD. Multimodal biomedical image fusion: use of log-Gabor and guided filters with non-subsampled contourlet transform. International Conference on Image Information Processing (ICIIP), 2015 Dec 21–24; Waknaghat, India, pp 6–11
39. Gaurav B, Jonathan Wu QM, Liu Z (2013) Directive contrast based multimodal medical image fusion in NSCT domain. IEEE Trans Multimedia 15(5):1014–1024
40. Hu J, Li S (2012) The multiscale directional bilateral filter and its application to multisensor image fusion. Inf Fusion 13(3):196–206
41. Li T, Wang Y (2012) Multiscaled combination of MR and SPECT images in neuroimaging: a simplex method based variable-weight fusion. Comput Methods Programs Biomed 105(1):35–39
42. Xin W, Yingfang L. A new method for multi-focus image fusion using countourlet transform. IEEE International Conference on Transportation, Mechanical, and Electrical Engineering (TMEE), 2011 Dec 16–18; Changchun, China, pp 2319–2322
43. AL-Azzawi N, Mat Sakim HA, Wan Abdullah AK, Ibrahim H. Medical image fusion scheme using complex contourlet transform based on PCA. Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 2009 Sept 3–6; Minneapolis, MN, USA, pp 5813–5816
44. Singh S, Anand RS (2019) Multimodal medical image sensor fusion model using sparse K-SVD dictionary learning in nonsubsampled shearlet domain. IEEE Trans Instrum Meas PP(99):1–15
45. Asha CS, Lal S, Gurupur VP, Saxena PUP (2019) Multi-modal medical image fusion with adaptive weighted combination of NSST bands using chaotic grey wolf optimization. IEEE Access 7:40782–40796
46. Singh S, Anand RS, Gupta D (2018) CT and MR image information fusion scheme using a cascaded framework in ripplet and NSST domain. IET Image Process 12(5):696–707
47. Liu Y, Zhou D, Nie R et al. Brain CT and MRI medical image fusion scheme using NSST and dictionary learning. IEEE 4th International Conference on Computer and Communications (ICCC), 2018 Dec 7–10; Chengdu, China, pp 1579–1583
48. Wang F, Cheng Y. A novel weight fusion approach for multi-focus image based on NSST transform domain. IEEE Chinese Guidance, Navigation and Control Conference (CGNCC), 2016 Aug 12–14; Nanjing, China, pp 2250–2254
49. Yang J, Wu Y, Wang Y, Xiong Y. A novel fusion technique for CT and MRI medical image based on NSST. Chinese Control and Decision Conference (CCDC), 2016 May 28–30; Yinchuan, China, pp 4367–4372
50. Biswas B, Chakrabarti A, Dey KN. Spine medical image fusion using wiener filter in shearlet domain. IEEE 2nd International Conference on Recent Trends in Information Systems (ReTIS), 2015 July 9–11; Kolkata, India, pp 387–392
51. Karahan E, Rojas-López PA et al (2015) Tensor analysis and fusion of multimodal brain images. Proc IEEE 103(9):1531–1559
52. Karahan E, Rojas López PA, Bringas-Vega ML, Valdés-Hernández PA (2015) Tensor analysis and fusion of multimodal brain images. Proc IEEE 103(9):1531–1559
53. Wang L, Li B, Tian L-f (2012) Multi-modal medical image fusion using the inter-scale and intra-scale dependencies between image shift-invariant shearlet coefficients. Inf Fusion 19:20–28
54. Lu T, Tian C, Kai X (2019) Exploiting quality-guided adaptive optimization for fusing multimodal medical images. IEEE Access 7:96048–96059
55. Ouerghi H, Mourali O, Zagrouba E (2018) Non-subsampled shearlet transform based MRI and PET brain image fusion using simplified pulse coupled neural network and weight local features in YIQ colour space. IET Image Process 12(10):1873–1880
56. Xiong Y, Wu Y, Wang Y, Wang Y. A medical image fusion method based on SIST and adaptive PCNN. 29th Chinese Control and Decision Conference (CCDC), 2017 May 28–30; Chongqing, China, pp 5189–5194
57. Shaohai H, Dongsheng Y, Shuaiqi L, Xiaole M. Block-matching based multimodal medical image fusion via PCNN with SML. IEEE 13th International Conference on Signal Processing (ICSP), 2016 Nov 6–10; Chengdu, China, pp 13–18
58. Rajkumar S, Bardhan P, Akkireddy SK, Munshi C. CT and MRI image fusion based on wavelet transform and neuro-fuzzy concepts with quantitative analysis. IEEE International Conference on Electronics and Communication Systems, 2014 Feb 13–14; Coimbatore, India, pp 1–6
59. Nobariyan BK, Daneshvar S, Forough A. A new MRI and PET image fusion algorithm based on pulse coupled neural network. IEEE 22nd Iranian Conference on Electrical Engineering (ICEE), 2014 May 20–22; Tehran, Iran, pp 1950–1955
60. Das S, Kundu MK (2011) A neuro-fuzzy approach for medical image fusion. IEEE Trans Biomed Eng 60(12):3347–3353
61. Yang J, Han F, Zhao D. A block advanced PCA fusion algorithm based on PET/CT. IEEE Fourth International Conference on Intelligent Computation Technology and Automation, 2011 March 28–29; Shenzhen, Guangdong, China, pp 925–928
62. Sui J, Adali T, Pearlson G et al (2010) A CCA+ICA based model for multi-task brain imaging data fusion and its application to schizophrenia. NeuroImage 51(1):123–134
63. Wang Y, Dang J, Li Q, Li S. Multimodal medical image fusion using fuzzy radial basis function neural networks. Proceedings of the 2007 IEEE International Conference on Wavelet Analysis and Pattern Recognition, 2007 Nov 2–4; Beijing, China, pp 778–782
64. Yin M, Liu X, Yu L (2019) Medical image fusion with parameter-adaptive pulse coupled neural network in nonsubsampled shearlet transform domain. IEEE Trans Instrum Meas 68(1):1–16
65. Mishra NS, Dhabal S. On combination of fuzzy memberships for medical image fusion using NSST based fuzzy-PCNN. Fifth International Conference on Emerging Applications of Information Technology (EAIT), 2018 Jan 12–13; Kolkata, India, pp 1–4
66. Song Z, Jiang H, Li S. An improved medical image fusion method based on PCNN in NSST domain. International Conference on Virtual Reality and Visualization (ICVRV), 2017 Oct 21–22; Zhengzhou, China, pp 442–443
67. Yang Y, Que Y, Huang S, Pan L (2016) Multimodal sensor medical image fusion based on type-2 fuzzy logic in NSCT domain. IEEE Sensors J 16(10):3735–3745
68. Ibrahim R, Alirezaie J, Babyn P. Pixel level jointed sparse representation with RPCA image fusion algorithm. IEEE International Conference on Telecommunications and Signal Processing, 2015 July 9–11; Prague, Czech Republic, pp 592–595
69. Shenoy PR, Shih M-C, Rose K. Hidden Markov model-based multi-modal image fusion with efficient training. IEEE International Conference on Image Processing, 2014 Oct 27–30; Paris, France, pp 3582–3586
70. Parmar Arpita G, Jadav Kalpesh R, Parmar NN. Design and implementation of novel wavelet approach for improving quality of medical images. IEEE Fifth International Conference on Communication Systems and Network Technologies, 2015 April 4–6; Gwalior, India, pp 515–520
71. Bhateja V, Patel H, Krishn A, Sahu A, Lay-Ekuakille A (2015) Multimodal medical image sensor fusion framework using cascade of wavelet and contourlet transform domains. IEEE Sensors J 15(12):6783–6790
72. Gupta R, Awasthi D. Wave-packet image fusion technique based on genetic algorithm. IEEE 5th International Conference on Confluence: The Next Generation Information Technology Summit, 2014 Sept 25–26; Noida, India, pp 280–285
73. Krishn A, Bhateja V, Himanshi, Sahu A. Medical image fusion using combination of PCA and wavelet analysis. IEEE International Conference on Advances in Computing, Communications and Informatics, 2014 Sept 24–27; New Delhi, India, pp 986–991
74. Himanshi, Bhateja V, Krishn A, Sahu A. An improved medical image fusion approach using PCA and complex wavelets. IEEE International Conference on Medical Imaging, m-Health and Emerging Communication Systems, 2014 Nov 7–8; Greater Noida, India, pp 442–447
75. Bhutada GG, Anand RS, Saxena SC (2011) Edge preserved image enhancement using adaptive fusion of images denoised by wavelet and curvelet transform. Digital Signal Processing 21(1):118–130
76. Fu L, Yifan L, Xin L. Image fusion based on nonsubsampled contourlet transform and pulse coupled neural networks. IEEE Fourth International Conference on Intelligent Computation Technology and Automation, 2011 March 28–29; Shenzhen, Guangdong, China, pp 180–183
9
Multilevel Mammogram Image Analysis for Identifying Outliers: Misclassification Using Machine Learning
K. Vijayakumar and C. Saravanakumar
Abstract
Nowadays, medical issues are growing at a higher pace than ever because of various factors that surround the well-being of human life. This chapter deals with an important problem that women wrestle with—breast cancer. This disease is not only painful but has also cut short the lifespan of many women. One important diagnostic technique for breast cancer is mammography, which is used to detect and analyze the level of cancer spread. The existing methods use a specific level of analysis, which is not suitable for accurate prediction. Normally, cancer is diagnosed based on the organ in which the cancerous cells proliferate; this method does not consider outliers. This shortfall can be addressed by introducing a multilevel convex hull-based analysis. The hulls are formed based on the closeness of the image pixels. In this method, the outlier pixels are also considered by taking the sub-convex hull, which is used to detect the cancer. It can help prevent the cancer cells from spreading to other portions of the body. Various levels of analysis are carried out: texture based, correlation based, and statistics based. The main objective of the proposed method is to perform diagnosis both inside and outside the region using a convex hull-based approach. It provides more accuracy than the single-level detection method. Keywords
Image processing · Machine learning · Mammogram · Medical imaging · Convex hull
K. Vijayakumar (*) Department of Computer Science and Engineering, St. Joseph's Institute of Technology, Chennai, Tamil Nadu, India
C. Saravanakumar Department of Information Technology, St. Joseph's Institute of Technology, Chennai, Tamil Nadu, India
9.1
Introduction
A mammogram is an image that depicts the breast and is used to identify breast cancer in its early stages, protecting women from serious problems. There are various types of mammography available for detecting breast cancer, namely traditional, digital, and three-dimensional. The traditional type uses a minimal radiation dose for detecting the cancer and carrying out the diagnosis process. The digital type of mammography uses a chip-based device to record the images for further analysis; in this case, processing is faster from image capture to producing the final document when compared with traditional mammograms. The three-dimensional type of mammogram uses various layers which are arranged in series; it supports accurate analysis and interpretation of the cancer. The reasons for taking a mammogram test may vary: an irregular breast, changes from a previous abnormality, swelling in the breast, and so on. Symptoms of breast cancer can differ from one person to another. It can be classified as early type, spreadable (invasive) type, carcinoma type (ductal, lobular), inflammatory type, and metastatic type. Men can also have breast cancer, with the following symptoms: pain in the breast, swelling in the breast, thickening, abnormal changes in the skin and nipple, and change in color of the nipple and breast. There are different types of breast cancer, namely common type, molecular type, and rare type. Common-type cancer starts in the lining of the various organs of the breast. Molecular cancer affects the breast based on the proteins and gene profile of the breast; it is classified into subtypes such as HER2 and triple-negative. Rare-type breast cancer starts in the soft tissue and spreads up to the lymph system, which causes blockage of the skin vessels. Breast cells are categorized as benign and malignant: benign cells are normal cells without any cancer, whereas abnormal cells are considered malignant. The proposed algorithm classifies the cancer type, giving an accurate result so the patient knows the level of cancer.
9.2
Literature Survey
Wavelet-based mammogram detection is used with lossless compression. The compression reduces the mammogram size for faster diagnosis, and the footprint components are taken for further analysis [1]. The mammogram image segmentation process is carried out for handling complex images by cutting out regions. The texture features are extracted and suitable weights are applied to the features in order to partition sub-images for accurate analysis [2]. Cluster-based image features are identified for qualitative analysis of various classes such as benign and malignant. There are two views of analysis, namely the CC-based view and the MLO-based view. Various parameters are assessed by using machine learning and deep learning algorithms. Micro-level classification is done by mapping all features at the statistical level [3]. Mammogram analysis is done based on various features by segmenting the images at the mass level. There could be problems in the region at the image pixel level, such as contrast, which is not suitable for an optimized image.
The image quality is maximized by performing image enhancement, which splits the image into two layers—foreground and background. The classified features are validated and observed by human vision with corresponding values [4]. An automated version of mammogram detection and interpretation is required for better performance. The micro-classification identified plays a vital role in further treatment. The script classifies the cancer based on its level by considering the location where the cancer originated. This method provides a better detection rate when compared to other methods [5]. Normally, the detection of breast cancer focuses on the particular area in which the disease exists. The problem of inaccuracy is overcome by using outlier-based detection. This method uses clustering for grouping the features based on relevant and outlier-based instances. Rules are written to establish the relationship between different regions [6]. CBIR-based mammogram detection is introduced so that a doctor can take a clinical decision with a clear view of the level of cancer. This is done by considering compression, encoding, and searching the features in the corresponding space. The framework used for this analysis is an unsupervised machine learning algorithm with a graph-based hashing technique [7]. Traditional detection methods suffer from a lack of scalability in image processing. This method uses features that are highly cohesive in order to detect the accurate cancer regions. This method proves more accurate; hence, an SVM-based classifier with high efficiency is used [8]. A CAD-based tool is used to recognize the mammogram image for early detection with a complete description of the anomaly in the image. The scalable blob tool provides support for identification through learning and discovery [9]. Efficient co-occurrence features of the mammogram images are extracted and classified at the grayscale level. The SVM supervised learning method is used for classification with high accuracy, enabling early treatment of cancer [10]. The BLHE algorithm is used to remove noise and irrelevant features which may exist in the mammogram through various image morphing operations. It also enhances the image to high quality by considering the contrast of the image. This method uses a threshold for segmenting the images in order to detect the cancer region [11]. Architectural distortion analysis is applied for the early detection of interval cancers. The texture features and fractal analysis are assessed at the proper level. Sign-based detection is done by using the localization property with various-level architectural formations of mass [12]. Breast cancer identification is organized into various levels: image filtering, image morphing, erosion, edge detection, rough estimation, and threshold fixation. The regions are adjusted through a snake-based approach with high accuracy [13]. Stacked sparse auto-encoder and softmax regression-based breast cancer diagnosis using feature ensemble learning is also used [14]. An agent-based data analysis process is performed using a disease database with suitable features [15]. The existing methods of analysis and detection suffer from various performance-related issues. These are overcome by implementing the novel image classification with a high detection rate.
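The grayscale co-occurrence plus SVM pipeline cited above [10] can be sketched briefly. The following is a minimal Python sketch using scikit-image and scikit-learn; the function names glcm_features and train_classifier, the choice of GLCM offsets and properties, and the patches/labels inputs are all illustrative assumptions (recent scikit-image spells the calls graycomatrix/graycoprops; older releases use greycomatrix/greycoprops).

```python
# Minimal sketch of a GLCM-texture + SVM mammogram classification pipeline:
# co-occurrence features per image patch, classified as benign/malignant.
# `patches` (list of 2D uint8 arrays) and `labels` are assumed inputs.
import numpy as np
from skimage.feature import graycomatrix, graycoprops
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split

def glcm_features(patch: np.ndarray) -> np.ndarray:
    # Co-occurrence matrices at a few distances/angles, then scalar props.
    g = graycomatrix(patch.astype(np.uint8), distances=[1, 2],
                     angles=[0, np.pi / 4, np.pi / 2], levels=256,
                     symmetric=True, normed=True)
    props = ["contrast", "homogeneity", "energy", "correlation"]
    return np.hstack([graycoprops(g, p).ravel() for p in props])

def train_classifier(patches, labels):
    X = np.array([glcm_features(p) for p in patches])
    X_tr, X_te, y_tr, y_te = train_test_split(X, labels, test_size=0.3,
                                              random_state=0)
    clf = SVC(kernel="rbf").fit(X_tr, y_tr)
    return clf, clf.score(X_te, y_te)   # classifier and held-out accuracy
```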
9.3
Problem Formulation
The proposed method is formulated based on mammogram images from various persons and at different levels. The data set is collected from the Digital Database for Screening Mammography (DDSM). The data set has features like volume, number of cases, size of the mammogram image, scanner used for image acquisition, number of bits for image representation, resolution in microns, and the actual images. The images are accessed from the DDSM, and then preprocessing is performed for further analysis. The convex hull is identified for various types of images for accurate analysis. The region is marked, and then the relevant and irrelevant features of the mammogram images are extracted. Figures 9.1 and 9.2 show an original image and a convex hull image, respectively. The convex hull is a geometric analysis which divides a plane into a number of half-planes. These planes form a closed loop around a set of points which is covered by a polygon. Let P = {p1, p2, p3, ..., pn} be a set of points over the image. The plane is divided by connecting two points that form a straight line with coordinates {(X1, Y1), (X2, Y2), ..., (Xn, Yn)}. The straight line formed is referred to as AX + BY = C. Each such line yields two regions with respect to a threshold: points satisfying AX + BY > C lie on one side, and points satisfying AX + BY < C lie on the other.
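The construction and the half-plane membership test described above can be sketched directly. The following is a minimal Python sketch using SciPy's ConvexHull; the random coordinates, the function name inside_hull, and the 50-point "marked region" are illustrative assumptions standing in for actual mammogram pixel coordinates.

```python
# Minimal sketch of the convex hull construction described above, with the
# half-plane test (AX + BY vs. C) deciding inside vs. outside (outliers).
import numpy as np
from scipy.spatial import ConvexHull

pixels = np.random.default_rng(1).random((200, 2))   # stand-in pixel coords
hull = ConvexHull(pixels[:50])                       # hull of a marked region

def inside_hull(p: np.ndarray, hull: ConvexHull, eps: float = 1e-12) -> bool:
    # hull.equations stores each facet line as A*x + B*y + C0 <= 0 for
    # interior points, i.e., exactly the half-plane test in the text.
    return bool(np.all(hull.equations[:, :2] @ p + hull.equations[:, 2] <= eps))

mask = np.array([inside_hull(p, hull) for p in pixels])
outliers = pixels[~mask]   # pixels outside the hull, fed to sub-hull analysis
```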
EPILAB software demonstrates that preictal states can be distinguished from interictal ones using a classifier based on a support vector machine (SVM) with high sensitivity and specificity. The proposed methodology gives an average sensitivity of 89.467% and a specificity of 84.213%. Using EPILAB, a personalized classification algorithm is suggested to discriminate SUDEP cases from controls based on the localization of epileptogenic structures and EEG-ECG features. When the MRI volume features, localization features, and EEG-ECG features are combined, the sensitivity and specificity improve to 94.5% and 83.2%,
respectively. This shows that the combined MRI volume, MRI-EEG localization, and EEG-ECG features can be used as a biomarker for the diagnosis of SUDEP.
12.11 Conclusion

In this chapter, different nonlinear techniques for the analysis of the EEG neuronal signal were studied, and a complete evaluation of multimodal analysis was carried out. Multimodal analysis gives better results than nonlinear analysis of a single signal. The literature shows that seizures cause lung and heart anomalies. The work carried out discloses the altered autonomic activity in seizure patients who later died of SUDEP. The combination of biosignals with structural changes is a useful biomarker to assist in determining the pathophysiology of such disorders.
13 Classification of sEMG Signal-Based Arm Action Using Convolutional Neural Network
C. N. Savithri, E. Priya, and J. Sudharsanan
Abstract
Prosthetic arms are rapidly gaining attention because of their use in the development of robotic prostheses. The surface electromyography (sEMG) signal acquired from the residual limb of an amputee helps to control the movement of the prosthesis. In this work, the sEMG signal is acquired using a dual-channel amplifier (Olimex EMG shield) from the below-elbow muscles of one amputee for six actions. The acquired signal is pre-processed using band-pass and band-stop filters to eliminate noise. Classification is accomplished using both machine learning and deep learning approaches, implemented on a Raspberry Pi 3 in a Python script. In the machine learning approach, 11 relevant time domain features are extracted from the pre-processed signal and fed as input to a linear support vector machine (SVM) for classification. In the second approach, the signals are converted into images, and a convolutional neural network (CNN) is used to classify the six hand actions. The model is trained and tested by varying the number of steps per epoch and the number of epochs, and its accuracy is compared with that of the linear SVM. Results demonstrate that the mean accuracy of the linear SVM is 76.66%, whereas for the CNN model with 1000 steps per epoch and 10 epochs it is 91.66%. Since the classification accuracy of the CNN is higher than that of the machine learning model, the proposed methodology with the CNN model could act as an indigenous system for sEMG signal acquisition and for activating a signal-controlled prosthetic arm that aids in the rehabilitation of amputees.
C. N. Savithri (*) · E. Priya · J. Sudharsanan Department of Electronics and Communication Engineering, Sri Sai Ram Engineering College, Chennai, Tamilnadu, India e-mail: [email protected]; [email protected] # Springer Nature Singapore Pte Ltd. 2021 E. Priya, V. Rajinikanth (eds.), Signal and Image Processing Techniques for the Development of Intelligent Healthcare Systems, https://doi.org/10.1007/978-981-15-6141-2_13
Keywords
sEMG signal · Arm action · Amputee · Time domain features · Linear support vector machine · Convolutional neural network
13.1 Introduction
The human hand is one of the most remarkable parts of the human body, used for interaction with the outside world; humans use their hands to perform their routine activities. The loss of a hand affects a person mentally as well as physically: the mobility of amputees is reduced, their confidence is lowered, and they face many social problems. Researchers therefore focus on proposing suitable solutions to overcome the disabilities of individuals suffering from upper limb amputations. The foremost idea is to control electrically powered prostheses with the myoelectric signals detected from the remaining musculature. A myoelectric signal, produced by contractions in the muscle, conveys information relevant to the prosthetic user's intention. Surface EMG signals are the prime control strategy for gesture recognition interfaces and multifunctional prostheses. Pattern recognition methods are promising in several fields of biomedical engineering, including myoelectric prostheses, and involve stages such as pre-processing, segmentation, feature extraction, feature selection, and classification [1, 2].

Recent technological developments play a vital role in the acquisition of bio-signals. The sEMG signal is one such bio-signal, generated by muscular activity such as contraction and relaxation. It can be acquired using Ag/AgCl electrodes placed on the muscle surface. The extracted sEMG signal is noisy and of very low amplitude, and each muscle movement corresponds to a unique pattern in the signal [3, 4]. Research related to noise falls into several key areas: analyzing EMG signal quality and identifying the presence, type, and level of noise contamination; minimizing the effect of noise; extracting features that are robust against noise; and developing noise-tolerant classification techniques. The inevitable problem encountered while acquiring the sEMG signal is contamination by noise, which may compromise the effectiveness of EMG processing [5, 6].

The raw sEMG signal usually contains useful information as well as unwanted noise, resulting in ambiguity. Pre-processing is required to clean the signal by removing the noise components (artifacts) with the help of mathematical tools and to enhance the spectral resolution; it comprises the elimination of confounding information such as noise and mitigates its impact before further processing or analysis of the input. Motion artifact, power-line interference, analog-to-digital converter (ADC) clipping, amplifier saturation, and the inherent noise of electronic equipment are the most common sources of noise encountered in EMG applications [7–9]. Random noise, whose frequency ranges from 0 Hz to a few thousand hertz, cannot be removed by conventional filters.
Random noise can only be minimized by using high-quality electronic components and designing the circuit carefully [10, 11], or by using adaptive filters and wavelet transforms, since the EMG signal is also nonstationary. Wavelet-based de-noising methods have proven to minimize the effect of random noise by discarding the wavelet coefficients containing noise before reconstruction [12, 13]. De-noising of EMG signals depends on the basis function and the level of decomposition; wavelet-based de-noising reduces the effect of noise while increasing classifier performance by around 30% [14, 15]. An approach to assess the quality of bio-signals introduced a pattern classification method to distinguish clean from contaminated electromyography (EMG) signals: the authors proposed a top-down approach to automatically detect the presence of noise using a feature vector and a one-class SVM, in which a signal is identified as clean EMG as long as its signal-to-noise ratio (SNR) is greater than 10 dB [7]. Suitable pre-processing techniques are essential to improve the quality of the sEMG signal when it is found to be insufficient. It is also apparent that the dominant energy band of the EMG signal differs from that of several types of noise, such as 50 Hz or 60 Hz power-line interference or 0–20 Hz motion artifact. Conventional finite impulse response (FIR) and infinite impulse response (IIR) filters have been proposed to minimize the effect of several types of noise without affecting the EMG signal, as they are efficient, simple, and cost-effective [6, 16]. De Luca et al. recommended the Butterworth filter to eliminate motion artifacts and baseline wandering [17]. A novel pre-processing step, minimum entropy deconvolution adjusted (MEDA), improves the signal for feature extraction and thus for classification of different upper limb motions: MEDA finds the coefficients of an FIR filter that retrieves the EMG signal with a high kurtosis value and eliminates noise components with low kurtosis, increasing classifier performance by 20.5% [18]. In addition to IIR and FIR filters, adaptive digital filters have been proposed to eliminate several types of noise [19–21].

The required information is a mixture of the EMG signal with various noises and artifacts inside the raw recording, and classifier efficacy decreases when the raw EMG signal is applied directly as input. Different types of EMG features have therefore been proposed that can be applied as classifier input to improve performance. Feature extraction is a dimensionality reduction technique that transforms the sEMG signal into a lower-dimensional feature space intended to reveal the characteristics of the input signal: it highlights the distinctive information of the signal and increases the information density of the processed signal [22, 23]. The features are intended to be nonredundant, facilitating the classifier in recognizing different hand actions, classifying them into predefined classes, and aiding human interpretation. The choice of suitable features has a tremendous impact on the performance of the classification task. The three common categories of features in the myoelectric prosthesis literature are the time domain, the frequency (spectral) domain, and the time-scale (time-frequency) domain [24, 25].
The Hudgins’ set of time domain features, which is the most universally used feature set, involves two kinds of
features that extract information associated with signal amplitude and with frequency [26]. A high-quality EMG feature must possess maximum class separability, resulting in a low misclassification rate; low computational complexity, so that it can be implemented with minimal hardware; and robustness against dynamic factors [27, 28]. Machine learning techniques, owing to their accuracy and robustness, have been implemented for the classification of arm actions based on sEMG signals. The classification of hand actions based on the EMG signal has been explored extensively, leading to numerous approaches such as mathematical models, discriminative learning models, and approaches based on genetic algorithms [29, 30]. The common classifiers for hand-action classification in myoelectric prostheses, including linear discriminant analysis (LDA) and support vector machines (SVMs), have proven to improve classification accuracy [1]. The shortcomings of these classifiers are computational time, the choice of an appropriate kernel, and so on; better results are obtained for SVM with appropriate selection of time domain features and of the kernel [31, 32]. However, the selection of features for classification is challenging and impacts long-term performance, and classification accuracy is affected by the loss of information during feature extraction. Deep learning is introduced to obtain higher accuracy: deep learning algorithms can automatically learn discriminant features from large amounts of data, and deep learning has recently revolutionized numerous fields of machine learning, including computer vision and hand motion recognition. One example of a deep learning framework is the convolutional neural network (CNN). CNNs are the most common since they learn classification tasks directly from the raw data [33]. CNNs have been employed in the estimation of hand movements from EMG signals [34, 35] and applied to the classification of hand gestures with high-density spatial EMG data matrices as model input [36]. Recurrent CNNs have been used on the EMG frequency transform for the classification of hand movements [37], and a CNN has been used on the EMG spectrogram to classify hand/wrist gestures and to control a robotic arm to pick up a cube and place it in a specified location [38].
13.2 Methodology
13.2.1 Signal Acquisition

sEMG signals are acquired from two different muscles in the hand using Ag/AgCl surface electrodes. The sEMG signal is acquired from one subject for six different actions: open, close, supination, pronation, flexion, and extension. Informed written consent is obtained from the amputee. The subject repeats each action 60 times. The signal is acquired five times per trial with a "rest-motion-rest" pattern, and the subject relaxes for 5 minutes between trials to avoid muscle fatigue and stress. The signal is captured through the Olimex EMG shield, which amplifies the low-amplitude raw sEMG signal, of the order of millivolts, to a signal of amplitude 5 volts. The amplification factor of the Olimex EMG shield is 2800. The shield also has a high-pass filter to remove the low-frequency noise produced by skin movement on the surface of the hand muscles.
Fig. 13.1 Block diagram of the Olimex EMG shield (blocks: high-voltage protection; HR rejection; instrumentation amplifier, G = 10; third-order "Besselworth" filter, fc = 40 Hz, G = 3.56; two single-pole high-pass filters, fc = 0.16 Hz; op-amp with regulated gain, G = 101)
The block diagram of the Olimex shield is illustrated in Fig. 13.1. The sEMG signal is then passed to an STM32 controller, whose inbuilt 12-bit analog-to-digital converter converts the analog sEMG signal into digital values in the range 0–4095. The raw sEMG signal has a frequency range of 10–500 Hz. Since the maximum frequency of the analog signal is 500 Hz, the sampling rate is fixed at 1000 samples per second according to the Nyquist theorem. The sEMG signal is acquired for 2.048 seconds, which gives 2048 samples in the digital signal; this duration is chosen since all six actions can be performed within it. The digital signal is stored as an integer array of size 2048 [39]. These integer arrays are passed to a Raspberry Pi 3 for processing the signal and extracting features. The block diagram of the proposed system is illustrated in Fig. 13.2.
13.2.2 sEMG Signal Pre-processing

The major noise component in the sEMG signal is power-line interference at 50 Hz. The main purpose of pre-processing the sEMG signal is to eliminate this interference together with high-frequency noise components. The frequency spectrum of the sEMG signal clearly shows that the higher-frequency components carry less power than the lower-frequency components. The raw sEMG signal from the ADC is passed through a fifth-order Butterworth band-pass filter with cutoff frequencies of 10 Hz and 120 Hz and a fifth-order Butterworth band-stop filter with cutoff frequencies of 48 Hz and 52 Hz. The harmonics of the 50 Hz power signal are eliminated by a further Butterworth band-stop filter with cutoff frequencies of 100 Hz and 104 Hz [40]. A sketch of this filtering chain follows.
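The snippet below is a minimal sketch of the filtering chain just described, using SciPy; the chapter does not state which filtering routine was used, so the zero-phase `filtfilt` call and the mean-offset removal order are assumptions.

```python
import numpy as np
from scipy.signal import butter, filtfilt

FS = 1000  # sampling rate in Hz (Sect. 13.2.1)

def butter_filter(sig, band, btype, fs=FS, order=5):
    b, a = butter(order, band, btype=btype, fs=fs)
    return filtfilt(b, a, sig)  # zero-phase filtering (an assumption)

def preprocess(raw):
    sig = raw - np.mean(raw)                          # remove DC offset
    sig = butter_filter(sig, [10, 120], 'bandpass')   # keep the 10-120 Hz band
    sig = butter_filter(sig, [48, 52], 'bandstop')    # 50 Hz mains interference
    sig = butter_filter(sig, [100, 104], 'bandstop')  # its second harmonic
    return sig
```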
13.2.3 Feature Extraction for Machine Learning

Features are the characteristic properties of a signal that help in classifying the different signal patterns corresponding to different hand actions [41–43]. The discriminating information in the pre-processed sEMG signal is obtained by feature extraction. Time domain features are selected owing to their performance in noisy environments and their simplicity of computation compared with frequency domain and time-frequency domain features.
Fig. 13.2 Block diagram of the proposed system (two sEMG channels → Olimex EMG shields for channel 1 and channel 2 → STM32 microcontroller → Raspberry Pi 3)
The mathematical formulae are enumerated in this section as Eq. (13.1) to Eq. (13.11).

The mean absolute value (MAV) of the signal is obtained by averaging the absolute values of the samples over the number of samples:

\mathrm{MAV} = \frac{1}{N} \sum_{k=1}^{N} |x_k| \qquad (13.1)

where x_k is the kth sample of the sEMG signal in an array of size N.

The root mean square (RMS) of the sEMG signal is the square root of the arithmetic mean of the squares of the samples:

\mathrm{RMS} = \sqrt{\frac{1}{N} \sum_{k=1}^{N} x_k^2} \qquad (13.2)

Waveform length (WL) reflects the complexity of the sEMG signal. It is given as the cumulative length of the waveform:

\mathrm{WL} = \sum_{k=1}^{N} |\Delta x_k|, \quad \Delta x_k = x_k - x_{k-1} \qquad (13.3)

Skewness is the asymmetry of a statistical distribution, in which the curve appears distorted or skewed either to the left or to the right. It quantifies the extent to which a distribution differs from a normal distribution:

\mathrm{skewness} = \frac{1}{N \sigma^3} \sum_{k=1}^{N} (x_k - \bar{x})^3 \qquad (13.4)

where the standard deviation is \sigma = \sqrt{\frac{1}{N} \sum_{k=1}^{N} (x_k - \bar{x})^2} and the mean is \bar{x} = \frac{1}{N} \sum_{k=1}^{N} x_k.

Kurtosis describes the distribution and identifies the tendency of the data to peak; it is determined by comparing the peak of the data distribution with the normal curve:

\mathrm{kurtosis} = \frac{1}{N \sigma^4} \sum_{k=1}^{N} (x_k - \bar{x})^4 \qquad (13.5)

Variance is the mean squared deviation of the samples from the average value of the sEMG signal:

\sigma^2 = \frac{1}{N} \sum_{k=1}^{N} (x_k - \bar{x})^2 \qquad (13.6)

Integrated EMG (iEMG) is defined as the area under the curve of the rectified sEMG signal, that is, the integral of the absolute value of the raw sEMG signal:

\mathrm{IEMG} = \sum_{k=1}^{N} |x_k| \qquad (13.7)

Zero crossing (ZC) is the number of times the signal x crosses zero within an analysis window; it is a simple measure associated with the frequency of the signal. To avoid counts due to low-level noise, a threshold \varepsilon is included:

\mathrm{ZC} = \#\{k : (x_k > 0 \text{ and } x_{k+1} < 0) \text{ or } (x_k < 0 \text{ and } x_{k+1} > 0), \; |x_k - x_{k+1}| \ge \varepsilon\} \qquad (13.8)

Slope sign change (SSC) is related to signal frequency and is defined as the number of times the slope of the EMG waveform changes sign within an analysis window. A count threshold \varepsilon is used to reduce noise-induced counts (\varepsilon = 0.015 V):

\mathrm{SSC} = \sum_{k=2}^{N-1} f\big[(x_k - x_{k-1})(x_k - x_{k+1})\big], \quad f(x) = \begin{cases} 1, & x \ge \varepsilon \\ 0, & x < \varepsilon \end{cases} \qquad (13.9)

Willison amplitude (WAMP) is defined as the number of times that the change in EMG signal amplitude exceeds a threshold:

\mathrm{WA} = \sum_{k=1}^{N} f(|\Delta x_k|), \quad \Delta x_k = x_k - x_{k-1} \qquad (13.10)

Log detect is the exponential of the mean of the logarithm of each sample of the sEMG signal:

\mathrm{LogDetect} = \exp\!\left( \frac{1}{N} \sum_{k=1}^{N} \log |x_k| \right) \qquad (13.11)

A sketch computing these features follows.
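The function below is a minimal sketch of Eqs. (13.1)–(13.11) in NumPy, not the chapter's code. Using the same value \varepsilon for the ZC, SSC, and WAMP thresholds, and the small constant guarding log(0), are assumptions.

```python
import numpy as np

def time_domain_features(x, eps=0.015):
    """The 11 time-domain features of Eqs. (13.1)-(13.11) for one window x."""
    x = np.asarray(x, dtype=float)
    dx = np.diff(x)                      # successive differences x_k - x_{k-1}
    mean, sd = x.mean(), x.std()
    return {
        'MAV':    np.mean(np.abs(x)),                                 # Eq. 13.1
        'RMS':    np.sqrt(np.mean(x ** 2)),                           # Eq. 13.2
        'WL':     np.sum(np.abs(dx)),                                 # Eq. 13.3
        'SKEW':   np.mean((x - mean) ** 3) / sd ** 3,                 # Eq. 13.4
        'KURT':   np.mean((x - mean) ** 4) / sd ** 4,                 # Eq. 13.5
        'VAR':    np.mean((x - mean) ** 2),                           # Eq. 13.6
        'IEMG':   np.sum(np.abs(x)),                                  # Eq. 13.7
        # sign change between neighbours, with an amplitude guard
        'ZC':     int(np.sum((x[:-1] * x[1:] < 0) &
                             (np.abs(x[:-1] - x[1:]) >= eps))),       # Eq. 13.8
        # slope sign change, thresholded to suppress noise-induced counts
        'SSC':    int(np.sum((x[1:-1] - x[:-2]) *
                             (x[1:-1] - x[2:]) >= eps)),              # Eq. 13.9
        'WAMP':   int(np.sum(np.abs(dx) >= eps)),                     # Eq. 13.10
        'LOGDET': float(np.exp(np.mean(np.log(np.abs(x) + 1e-12)))),  # Eq. 13.11
    }
```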
13.2.4 Classification Using Machine Learning Algorithms

The time domain features extracted from the sEMG signal are used to train a support vector machine (SVM), a popular machine learning algorithm, which classifies the different sEMG signals into the six hand actions. In this work, a linear kernel is selected; the SVM is trained with 240 trials, based on which it classifies the remaining 120 trials into the 6 actions. A minimal sketch of this train/test split follows.
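The snippet below sketches the 240/120 split with a linear SVM using scikit-learn; `X` (the 360 x 11 normalized feature matrix) and `y` (the 360 action labels) are assumed to have been built from the feature extraction above, and the stratified split and random seed are assumptions.

```python
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# X: (360, 11) normalized time-domain features; y: 360 action labels (assumed)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, train_size=240, test_size=120, stratify=y, random_state=0)

clf = SVC(kernel='linear')   # linear kernel, as chosen in the chapter
clf.fit(X_train, y_train)
print('mean accuracy:', clf.score(X_test, y_test))
```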
Support Vector Machines

The linear support vector machine (linear SVM) is a supervised machine learning algorithm used for the classification and regression of data [44]. The crux of the SVM algorithm is the kernel, which transforms the low-dimensional input space into a high-dimensional one. For small sample models, SVM has the benefit of greater stability and a smaller number of training parameters, and it was found preferable to neural networks.

Convolutional Neural Networks

Another way of classifying the EMG signal is to use convolutional neural networks (CNNs), a deep learning approach. CNNs find wide application in fields such as machine vision, image classification, natural language processing, and medical image recognition [35]. The crux of deep learning is its multilayer feature representation, which learns high-level features from low-level ones in a hierarchical manner while eliminating the need for feature engineering [45]. Convolutional networks are designed so that the network takes an input image and passes it through a series of convolution layers with filters.
Fig. 13.3 Architecture of the convolutional neural network (input 1@128x128 → 2@64x64 → 4@32x32 → 8@16x16 → 16@8x8 → 16@4x4 → dense layers 1x256 and 1x128 → output)
There are different layers for modeling the network. The first convolution layer converts the image into features by convolving the entire image with a smaller filter. After the convolution, the image size is reduced by pooling, which subsamples the image according to the filter dimension. Max pooling, which takes the largest element from the rectified feature map, is used in forming this CNN [46]. Figure 13.3 shows the architecture of the proposed convolutional neural network. In general, increasing the number of layers of a CNN gives higher classification accuracy. All the layers must be fully connected to form a neural network. The basic elements of a neural network are its building blocks, the neurons; neural networks were inspired by the neural architecture of the human brain [47]. In purely mathematical terms, a neuron is a placeholder for a mathematical function, and its only job is to provide an output by applying that function to the inputs provided. The function used in a neuron is generally termed an activation function.

ReLU Function

The rectified linear unit is the most commonly used activation function in deep learning models. The function returns 0 if it receives any negative input; for any positive value x, it returns that value back, as shown in Fig. 13.4. Mathematically, ReLU can be represented by Eq. (13.12). In general practice, ReLU has been found to perform better than the sigmoid or tanh functions.

R(x) = \max(0, x) \qquad (13.12)
A layer is a collection of neurons which take in an input and provide an output. Inputs to each of these neurons are processed through the activation functions assigned to the neurons. Any neural network has one input and one output layer; the number of hidden layers differs between networks depending on the complexity of the problem to be solved.
Fig. 13.4 The ReLU function
A fully connected network is one in which all the convolution layers are connected, or flattened to form a vector, to make the predictions for the different classes. A minimal sketch of such a network follows.
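The model below is a minimal Keras sketch of the CNN described above: 128x128 input images, four convolution/max-pooling stages with ReLU activations, the 1x256 and 1x128 dense layers of Fig. 13.3, and a six-class output. The 3x3 kernel size (which reproduces the 126x126 and 63x63 dimensions quoted in Sect. 13.3), the optimizer, and the loss function are assumptions, since the chapter does not state them.

```python
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Input(shape=(128, 128, 1)),
    layers.Conv2D(2, (3, 3), activation='relu'),   # 128x128 -> 126x126
    layers.MaxPooling2D((2, 2)),                   # 126x126 -> 63x63
    layers.Conv2D(4, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(8, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(16, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(256, activation='relu'),
    layers.Dense(128, activation='relu'),
    layers.Dense(6, activation='softmax'),         # six hand actions
])
model.compile(optimizer='adam', loss='categorical_crossentropy',
              metrics=['accuracy'])  # training setup is an assumption
```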
13.3 Results and Discussion
The raw sEMG signal is recorded for six hand actions, namely, open, close, supination, pronation, flexion, and extension, as shown in Fig. 13.5, from one below-elbow mid-forearm amputee. Primitive actions frequently performed in the daily routine are considered, so that the modeled prosthetic hand can perform accurately. The skin on the forearm is prepared before placing the surface electrodes. The subject is informed about the procedure, and each action is iterated for 60 trials. The database thus created consists of 60 trials of 6 hand actions to facilitate training and testing during classification. In this work, the classification results obtained with the convolutional neural network are compared against the traditional classification approach in which hand-engineered features are used for training and testing. Figures 13.6a and 13.6b show the raw sEMG signal for the hand action flexion acquired using the Olimex EMG shield from channel 1 and channel 2, its pre-processed signal, and the corresponding frequency domain plots. It is observed that the signals acquired from channel 1 and channel 2 for the same action are different. The electrodes of channel 1 are placed over the flexor carpi radialis muscle, and those of channel 2 on the palmaris longus muscle; the function of these muscles is to aid in flexing the hand. The offset in the raw sEMG signal for all hand actions is removed by subtracting the mean value from all samples. The power-line interference and its harmonics are eliminated by the fifth-order Butterworth band-stop filter, as shown in the frequency domain plot. In the first approach, 11 time domain features are extracted from the pre-processed signal: mean absolute value, root mean square value, waveform length, skewness, kurtosis, variance, integrated EMG, zero crossings, slope sign change, Willison amplitude, and log detect.
Fig. 13.5 Different hand actions: (a) flexion, (b) extension, (c) open, (d) close, (e) pronation, (f) supination
The features extracted from the pre-processed sEMG signal are tabulated in Table 13.1. It is evident that the feature values differ for each hand action, as the potential generated by the muscles differs for each action. The range of the time domain features is very wide, and hence the 11 feature values are normalized between 0 and 1. In this work, hand actions are identified with a machine learning algorithm, namely the linear support vector machine. A total of 60 trials are acquired from the subject for each hand action, giving a database of 360 signals over the 6 actions. The normalized feature values are used to train the classifier. For calculating the accuracy of the machine learning and deep learning models, the total database is split into training and testing data: 240 of the 360 trials are used for training, and 120 trials are used for testing. Table 13.2 shows the confusion matrix for the classification of the six actions using the SVM classifier. The error rate for SVM-based classification is 23%. SVM-based classification performs satisfactorily, with a clear margin of separation in the higher-dimensional space, and is memory efficient. In the SVM classifier, the misclassification rate of open and supination is high; the foremost cause is that the muscles involved in the two hand motions have a high coincidence, and hence the classifier is unable to differentiate between the two hand actions. On the other hand, the overall recognition rate of the SVM is high; as a result, the prosthetic arm can be activated to mimic the recorded arm actions. The performance metrics, mean accuracy, F1 score, and precision score for the six hand actions, are calculated from the confusion matrix shown. The accuracy of identification of each hand action is described by the precision score.
Fig. 13.6a sEMG signal of flexion channel 1
Fig. 13.6b sEMG signal of flexion channel 2
Table 13.1 Time domain features of the filtered sEMG signal

Feature   Flexion   Extension   Close    Open     Supination   Pronation
MAV       77        22          35       23       28           61
IEMG      159,515   46,501      71,811   47,732   57,571       125,852
ZC        313       283         276      269      288          257
SSC       854       881         808      960      761          824
WAMP      2012      1247        1017     568      1378         1987
LOGDET    6230      4894        5758     4989     5507         6036
RMS       166       38          56       46       47           119
WL        83,469    24,009      35,682   31,790   33,026       54,659
VAR       27,885    1467        3213     2152     2245         14,283
SKEW      0.31      3.34        0.73     4.00     1.48         0.07
KURT      9.02      6.91        5.39     68.62    29.61        7.88
Table 13.2 Confusion matrix of the linear support vector machine (rows: linear SVC output; columns: actual action, 20 test trials per action)

Linear SVC   Flexion   Extension   Close   Open   Supination   Pronation
Flexion      16        2           0       0      0            0
Extension    0         18          0       0      0            0
Close        0         0           8       0      0            0
Open         2         0           12      20     8            2
Supination   2         0           0       0      12           0
Pronation    0         0           0       0      0            18
A simple quality measure that uses the harmonic mean and helps end users interpret the performance of the classifier is the F1 score. The hand action open is detected excellently, followed by the extension and pronation actions, by the linear SVM classifier. The average F1 score over the six types of hand actions is 77.32. The experimental results reveal that SVM-based classification provides good accuracy in upper limb action classification and facilitates the rehabilitation of the amputee. It is promising to use sEMG signals to activate the prosthetic hand, which can thus support the impaired arm of amputees. The maximum accuracy obtained for the SVM-based machine learning approach is 76.66%; the mean accuracy is computed over all six hand actions. SVM performs better for small data sets and a small number of classes, so it is worth moving toward higher-capacity classification algorithms such as deep learning. A convolutional neural network (CNN), a deep learning network, provides higher accuracy than the machine learning models. In the second approach, the pre-processed signal is converted into an image of size 128x128, which is fed as input to the CNN as shown in Fig. 13.7. The CNN model comprises two outer layers (input and output) and four hidden layers. The first convolution layer convolves the input image, reducing its dimension to 126x126; it extracts low-level features, whereas the last convolution layer extracts high-level features and combines them to improve accuracy.
Fig. 13.7 Images for different hand actions for training CNN model
After convolution, the images are passed to a max pooling layer of filter size 2x2, which further reduces the image dimension to 63x63. This step is repeated four times, giving four hidden layers. The final layer consists of six neurons representing the six different hand actions. All the convolution layers are flattened to form a network. The dimension of the features determines the number of units in the input layer, and the number of classes (six arm actions) determines the dimension of the output layer. ReLU is used as the activation function, as learning with it is faster. The CNN model is trained and validated by varying the number of epochs and the number of steps per epoch. Figure 13.8 shows the accuracy and loss curves of the model during training and testing, without an over-fitting phenomenon. The number of steps per epoch is incremented from 100 to 500 in steps of 100. It is evident from the plot that a higher number of steps per epoch during training provides higher accuracy while the loss reduces; models with more steps per epoch have higher accuracy. Therefore, 1000 steps per epoch are fixed for training the CNN model. The other criterion for finalizing the CNN model is the number of epochs: the maximum accuracy of 91.66% is obtained with 10 epochs. Table 13.3 shows the confusion matrix for the classification of the six actions using the CNN. The error rate for CNN-based classification is 8.34%. The results demonstrate that CNNs are capable of extracting features automatically from EMG using convolution layers and rectified linear units. Figure 13.9 compares the performance metrics of the conventional SVM and the CNN. The ability to identify arm actions is given by the F1 score, and with a simple architecture, the CNN outperformed in classifying the six hand actions with an F1 score of 91%, as shown in Fig. 13.9.
Fig. 13.8 Variation of loss and accuracy with change in steps per epoch
Table 13.3 Confusion matrix for the six arm actions (rows: CNN output; columns: actual action, 20 test trials per action)

CNN          Flexion   Extension   Close   Open   Supination   Pronation
Flexion      16        0           0       2      0            0
Extension    0         18          0       0      0            0
Close        0         0           20      0      0            0
Open         4         2           0       18     0            2
Supination   0         0           0       0      20           0
Pronation    0         0           0       0      0            18
Fig. 13.9 Comparison of performance metrics (accuracy, precision score, and F1 score) for the linear SVC and the CNN
13.4 Conclusion
Out of the 6.7 billion world population, nearly 1.4 million people suffer from below-elbow mid-forearm amputation. A prosthetic arm gives amputees increased confidence and the ability to carry out their day-to-day functional activities. An attempt has been made to classify six arm actions from one amputee using a CNN. Though EMG classification is subject specific, the results are comparable with the accuracy reported for normal subjects, owing to the fact that two channels are used for acquisition of the sEMG signals, along with an increased number of trials per action. In this work, the mean accuracy over six hand actions is compared between two techniques: a support vector machine with hand-engineered features and a convolutional neural network. The image of the raw EMG signal is applied directly to the CNN, which automatically derives features as effective as the time domain features. The results indicate that the CNN, with its simple architecture, outperforms the traditional classifier. Thus, the proposed system can act as an indigenous system for sEMG signal acquisition and for actuating a prosthetic arm that aids in the rehabilitation of amputees.
References

1. Scheme E, Englehart K (2011) Electromyogram pattern recognition for control of powered upper-limb prostheses: state of the art and challenges for clinical use. J Rehab Res Dev 48(6)
2. Sapsanis C, Georgoulas G, Tzes A (2013, June) EMG based classification of basic hand movements based on time-frequency features. In 21st Mediterranean conference on control and automation (pp. 716–722). IEEE
3. Subasi A, Yilmaz M, Ozcalik HR (2006) Classification of EMG signals using wavelet neural network. J Neurosci Methods 156(1–2):360–367
4. Ismail R, Wijaya GD, Ariyanto M, Suriyanto A, Caesarendra W (2018, October) Development of myoelectric prosthetic hand based on Arduino IDE and visual C# for trans-radial amputee in Indonesia. In 2018 international conference on applied engineering (ICAE) (pp. 1–5). IEEE
5. Chowdhury RH, Reaz MB, Ali MABM, Bakar AA, Chellappan K, Chang TG (2013) Surface electromyography signal processing and classification techniques. Sensors 13(9):12431–12466
6. Reaz MBI, Hussain MS, Mohd-Yasin F (2006) Techniques of EMG signal analysis: detection, processing, classification and applications (correction). Biol Proced Online 8(1):163–163
7. Fraser GD, Chan AD, Green JR, MacIsaac DT (2014) Automated biosignal quality analysis for electromyography using a one-class support vector machine. IEEE Trans Instrum Meas 63(12):2919–2930
8. McCool P, Fraser GD, Chan AD, Petropoulakis L, Soraghan JJ (2014) Identification of contaminant type in surface electromyography (EMG) signals. IEEE Trans Neural Syst Rehabil Eng 22(4):774–783
9. Thongpanja S, Phinyomark A, Quaine F, Laurillau Y, Limsakul C, Phukpattaranont P (2016) Probability density functions of stationary surface EMG signals in noisy environments. IEEE Trans Instrum Meas 65(7):1547–1557
10. Hamedi M, Salleh SH, Ting CM, Astaraki M, Noor AM (2016) Robust facial expression recognition for MuCI: a comprehensive neuromuscular signal analysis. IEEE Trans Affect Comput 9(1):102–115
11. Khezri M, Jahed M (2008, August) Surface electromyogram signal estimation based on wavelet thresholding technique. In 2008 30th annual international conference of the IEEE engineering in medicine and biology society (pp. 4752–4755). IEEE
12. Maier J, Naber A, Ortiz-Catalan M (2017) Improved prosthetic control based on myoelectric pattern recognition via wavelet-based de-noising. IEEE Trans Neural Syst Rehabil Eng 26(2):506–514
13. Phinyomark A, Phukpattaranont P, Limsakul C (2011) Wavelet-based denoising algorithm for robust EMG pattern recognition. Fluct Noise Lett 10(02):157–167
14. Phinyomark A, Limsakul C, Phukpattaranont P (2009, March) A comparative study of wavelet denoising for multifunction myoelectric control. In 2009 international conference on computer and automation engineering (pp. 21–25). IEEE
15. Phinyomark A, Limsakul C, Phukpattaranont P (2009, May) An optimal wavelet function based on wavelet denoising for multifunction myoelectric control. In 2009 6th international conference on electrical engineering/electronics, computer, telecommunications and information technology (Vol. 2, pp. 1098–1101). IEEE
16. Hargrove L, Scheme E, Englehart K, Hudgins B (2008) Filtering strategies for robust myoelectric pattern classification. CMBES Proceedings 31
17. De Luca CJ, Gilmore LD, Kuznetsov M, Roy SH (2010) Filtering the surface EMG signal: movement artifact and baseline noise contamination. J Biomech 43(8):1573–1579
18. Powar OS, Chemmangat K, Figarado S (2018) A novel pre-processing procedure for enhanced feature extraction and characterization of electromyogram signals. Biomed Sig Process Cont 42:277–286
19. Fraser GD, Chan AD, Green JR, Abser N, MacIsaac D (2011) CleanEMG—power line interference estimation in sEMG using an adaptive least squares algorithm. In 2011 annual international conference of the IEEE engineering in medicine and biology society (pp. 7941–7944). IEEE
20. Ortolan RL, Mori RN, Pereira RR, Cabral CM, Pereira JC, Cliquet A (2003) Evaluation of adaptive/nonadaptive filtering and wavelet transform techniques for noise reduction in EMG mobile acquisition equipment. IEEE Trans Neural Syst Rehabil Eng 11(1):60–69
21. Zhou P, Lock B, Kuiken TA (2007) Real time ECG artifact removal for myoelectric prosthesis control. Physiol Meas 28(4):397
22. Phinyomark A, Limsakul C, Phukpattaranont P (2011) Application of wavelet analysis in EMG feature extraction for pattern classification. Measure Sci Rev 11(2):45–52
23. Phinyomark A, Nuidod A, Phukpattaranont P, Limsakul C (2012) Feature extraction and reduction of wavelet transform coefficients for EMG pattern classification. Elektronika ir Elektrotechnika 122(6):27–32
24. Phinyomark A, Phukpattaranont P, Limsakul C (2012) Feature reduction and selection for EMG signal classification. Expert Syst Appl 39(8):7420–7431
25. Phinyomark A, Quaine F, Charbonnier S, Serviere C, Tarpin-Bernard F, Laurillau Y (2013) EMG feature evaluation for improving myoelectric pattern recognition robustness. Expert Syst Appl 40(12):4832–4840
26. Ahsan MR, Ibrahimy MI, Khalifa OO (2011, May) Electromyography (EMG) signal based hand gesture recognition using artificial neural network (ANN). In 2011 4th international conference on mechatronics (ICOM) (pp. 1–6). IEEE
27. Boostani R, Moradi MH (2003) Evaluation of the forearm EMG signal features for the control of a prosthetic hand. Physiol Meas 24(2):309
28. Zardoshti-Kermani M, Wheeler BC, Badie K, Hashemi RM (1995) EMG feature evaluation for movement control of upper extremity prostheses. IEEE Trans Rehabil Eng 3(4):324–333
29. Zhang YT, Herzog W, Liu MM (1995, September) A mathematical model of myoelectric signals obtained during locomotion. In proceedings of 17th international conference of the engineering in medicine and biology society (Vol. 2, pp. 1403–1404). IEEE
30. Oskoei MA, Hu H (2006, December) GA-based feature subset selection for myoelectric classification. In 2006 IEEE international conference on robotics and biomimetics (pp. 1465–1470). IEEE
31. Oskoei MA, Hu H (2008) Support vector machine-based classification scheme for myoelectric control applied to upper limb. IEEE Trans Biomed Eng 55(8):1956–1965
32. Hudgins B, Parker P, Scott RN (1993) A new strategy for multifunction myoelectric control. IEEE Trans Biomed Eng 40(1):82–94
33. Deng L, Yu D (2014) Deep learning: methods and applications. Found Trends Signal Process 7(3–4):197–387
34. Park KH, Lee SW (2016, February) Movement intention decoding based on deep learning for multiuser myoelectric interfaces. In 2016 4th international winter conference on brain-computer interface (BCI) (pp. 1–2). IEEE
35. Atzori M, Cognolato M, Müller H (2016) Deep learning with convolutional neural networks applied to electromyography data: a resource for the classification of movements for prosthetic hands. Front Neurorobot 10:9
36. Geng W, Du Y, Jin W, Wei W, Hu Y, Li J (2016) Gesture recognition by instantaneous surface EMG images. Sci Rep 6:36571
37. Xia P, Hu J, Peng Y (2018) EMG-based estimation of limb movement using deep learning with recurrent convolutional neural networks. Artif Organs 42(5):E67–E77
38. Allard UC, Nougarou F, Fall CL, Giguère P, Gosselin C, Laviolette F, Gosselin B (2016, October) A convolutional neural network for robotic arm guidance using sEMG based frequency-features. In 2016 IEEE/RSJ international conference on intelligent robots and systems (IROS) (pp. 2464–2470). IEEE
39. Samarawickrama K, Ranasinghe S, Wickramasinghe Y, Mallehevidana W, Marasinghe V, Wijesinghe K (2018) Surface EMG signal acquisition analysis and classification for the operation of a prosthetic limb. Int J Biosci Biochem Bioinformatics 8:32–41
40. Sharma S, Farooq H, Chahal N (2016) Feature extraction and classification of surface EMG signals for robotic hand simulation. Commun Appl Electron 4:27
41. Sathish S, Nithyakalyani K, Vinurajkumar S, Vijayalakshmi C, Sivaraman J (2016) Control of robotic wheel chair using EMG signals for paralysed persons. Indian J Sci Technol 9(1):1–3
42. Savithri CN, Priya E (2019) Statistical analysis of EMG-based features for different hand movements. In: Smart intelligent computing and applications. Springer, Singapore, pp 71–79
43. Savithri CN, Priya E (2017, January) Correlation coefficient based feature selection for actuating myoelectric prosthetic arm. In 2017 trends in industrial measurement and automation (TIMA) (pp. 1–6). IEEE
44. Vapnik V (2013) The nature of statistical learning theory. Springer
45. LeCun Y, Bengio Y, Hinton G (2015) Deep learning. Nature 521(7553):436–444
46. Zia ur Rehman M, Waris A, Gilani SO, Jochumsen M, Niazi IK, Jamil M et al (2018) Multiday EMG-based classification of hand motions with deep learning techniques. Sensors 18(8):2497
47. Said AB, Mohamed A, Elfouly T, Harras K, Wang ZJ (2017, March) Multimodal deep learning approach for joint EEG-EMG data compression and classification. In 2017 IEEE wireless communications and networking conference (WCNC) (pp. 1–6). IEEE
14 An Automated Approach for the Identification of TB Images Enhanced by Non-uniform Illumination Correction
E. Priya
Abstract
Tuberculosis (TB) is a contagious airborne infectious disease for which early diagnosis is essential for control. The conventional method employed for TB identification is microscopy-based screening. Screening manually with microscopes is tedious and requires well-trained technicians; apart from the labor concern, large variation in sensitivity is observed when TB identification is performed manually. Also, depending on the stage of infection, the process becomes time-consuming, as it requires a huge number of images to be processed per slide. Hence there is a tremendous need in TB-burden countries to automate the identification of TB in sputum smear images so as to enhance the sensitivity and efficiency of the test. The visual perception of digital TB images is degraded by the non-uniform illumination effect caused by the long working hours of the microscope lamp and by turbulence in the camera. Decomposition-based methods, namely bi-dimensional empirical mode decomposition (BEMD) and the discrete wavelet transform (DWT), are attempted to pre-process the sputum smear images. The apt correction method to overcome the illumination effect is identified by qualitative and quantitative assessments: the intensity profile is used as a qualitative analysis, and histogram-based statistical features serve as a quantitative measure to validate the illumination correction methods. The pre-processed sputum smear images are then subjected to threshold-based segmentation methods. Results demonstrate that BEMD performs better than DWT in removing the non-uniform illumination in the sputum smear images. This helps in identifying the TB objects by the Otsu segmentation method. It is observed from the results that Otsu-based
segmentation resulted in a closer match with the ground truth than maximum entropy-based segmentation. Thus the developed workflow results in better identification of the disease-causing objects, namely, the tubercle bacilli. This will further enhance the classification of these images into positive and negative, aiding mass screening of pulmonary tuberculosis.

E. Priya (*) Department of Electronics and Communication Engineering, Sri Sai Ram Engineering College, Chennai, Tamilnadu, India e-mail: [email protected] # Springer Nature Singapore Pte Ltd. 2021 E. Priya, V. Rajinikanth (eds.), Signal and Image Processing Techniques for the Development of Intelligent Healthcare Systems, https://doi.org/10.1007/978-981-15-6141-2_14

Keywords
Tuberculosis · Sputum smear images · Non-uniform illumination · Histogram-based analysis · Discrete wavelet · Empirical mode decomposition · Otsu · Segmentation indices
14.1 Introduction
14.1.1 Tuberculosis

Tuberculosis (TB) is an airborne contagious infection and remains one of the top ten causes of mortality globally. Every year millions of people fall sick from TB. It is the second leading killer worldwide, next to human immunodeficiency virus (HIV), due to a single infectious agent. Because of its huge incidence, the disease remains a key global health crisis. On average, 95% of deaths due to TB arise in low- and middle-income countries [1–6]. TB is caused by the Koch bacillus, popularly known as Mycobacterium tuberculosis. TB spreads when people with TB expel bacteria into the air while coughing or sneezing; a patient with smear-positive pulmonary TB can infect ten people in the vicinity. TB affecting the lungs is termed pulmonary TB; at other sites it is termed extrapulmonary TB [4, 7]. The prime challenge in reducing the spread of TB is early diagnosis of the disease, which is the crucial step in the global control of TB. Diagnosis of tuberculosis at an early stage is essential to attain better health outcomes; if TB is diagnosed at the right time, it can be treated with appropriate follow-ups. Despite being a preventable and curable disease, TB causes more deaths per year than HIV [8–11]. Several TB diagnostic procedures exist, yet diagnosis remains a challenging task, especially in low- and middle-income countries, which primarily depend on manual diagnosis of TB by visual screening of stained sputum smears [8].
14.1.2 Other Methods to Test TB

There are several tests available for detecting TB, such as chest X-rays, culture tests, the Mantoux skin test, interferon-gamma release assays, GeneXpert, sputum smear microscopy, and serological tests. Other tests include blood, skin, antigen, and molecular deoxyribonucleic acid (DNA) tests [2, 8, 12].
Low-resource countries, where TB cases are high, have very limited access to the latest diagnostic technologies, despite the existence of rapid diagnostics based on automated cultures or molecular tests. Because of this, the costly molecular tests and lengthy culture tests are unaffordable. Although culture is termed the gold standard, it takes at least 4 weeks to give a conclusive result. Thus, in developing countries like India, culture and GeneXpert tests are often not feasible [1, 9, 11].
14.1.3 Sputum Smear Microscopy

Currently, sputum smear microscopy is the first and prime choice of screening method for TB detection in developing countries. It is the most widely used, cost-effective, simple, and efficient approach, since results can be available within hours, and it requires minimal bio-safety standards compared with other tests [2, 9, 11]. Sputum smear microscopy is a noninvasive procedure developed for the diagnosis of TB. The special dyes that are used differentiate the disease-causing bacillus from the background, and there is a very good relation between the number of bacilli identified in the smear and the count established by microscopy [3]. The sputum smear procedure is normally accomplished using auramine-stain fluorescence microscopy or Ziehl–Neelsen (ZN) stain bright-field (conventional) microscopy. Conventional microscopy uses a conventional light source and carbol fuchsin or Kinyoun acid-fast staining. It is the preferred microscopy procedure in developing nations where reported TB cases are high, owing to its convenience and cost-effectiveness; however, its sensitivity is low. This procedure is the crucial method for TB detection in secluded areas, and the ZN staining method is simple, rapid, and low cost [3, 5, 8, 9, 11]. Compared with the ZN method, auramine fluorescence smear microscopy has a higher sensitivity and consumes less time. The fluorescence technique adopts an acid-fast fluorochrome dye and a halogen or mercury vapor lamp as the light source, and it requires special skill sets for operating and maintaining the optical equipment. The recent development of inexpensive, robust, long-lifespan fluorescence microscopes based on light-emitting diodes motivated the World Health Organization (WHO) to recommend auramine fluorescence smear microscopy as an alternative to ZN [3, 11]. The WHO suggests viewing 300 fields per sputum sample. The number of bacilli, counted when the TB sputum smear is visualized under the microscope, illustrates the seriousness of the disease. Hence, from a large number of view fields, a very good image in terms of quality needs to be captured before it is processed. About 40 minutes to 3 hours is needed to manually visualize the 40 to 100 view-field images in a single slide [1, 5, 13].
The TB screening procedure becomes an exhausting effort when performed manually. It is liable to error due to workload and a deficiency of well-trained lab technicians. These experts visualize the smeared slides with microscopes to capture the rod-shaped bacteria responsible for TB disease. Technician fatigue may lead to a TB-positive slide being diagnosed as smear negative due to sparseness of acid-fast bacilli or when too few fields are examined. This often leads to low recall rates and false alarms in demarcating TB-positive from TB-negative slides [8, 11, 14].
14.1.4 Automation

It is evident from the literature that manual screening may misdiagnose 33–50% of TB-positive cases. These problems can be overcome by automated methods, which not only increase the sensitivity and specificity of TB identification but also reduce the time of diagnosis. Hence there is a need for an automatic system capable of analyzing the image captured by a microscope and identifying the presence of mycobacteria [5, 6, 10, 13]. Automatic screening is the apt procedure to improve sensitivity for the diagnosis of TB; it may reduce human variability in slide analysis and speed up the screening process. The key benefits of an automatic TB detection system are improved patient care, better healthcare outcome quality, and reduced processing time for detection. Moreover, automation overcomes the physical and mental fatigue of visual examination by a medical professional and reduces the time required to view the fields [2, 8, 9, 15]. Image analysis of patient sputum smears is the most cost-effective and extensively adopted procedure for TB detection, particularly in developing countries. Automatic processing of the stained sputum smear digital images reduces the burden on the pathologist or technician, reduces human error, and improves the sensitivity of the test [12].
14.1.5 Literature Review

Authors report that automatic screening of bacilli yields more reproducible test values and a quicker screening procedure than the manual approach. Several works have shown the potential of image analysis techniques for finding bacilli in sputum smear TB images [3, 16]. Automatic procedures for bacillus screening were first developed for fluorescence microscopy images to identify Mycobacterium tuberculosis (TB bacilli). The literature describes the identification of TB bacilli in sputum images from a fluorescence microscope using edge-pixel linking and boundary tracing, and later using the Canny edge detector and Fourier descriptors to segment the images. Veropoulos et al. used a neural network technique for TB bacilli classification. Authors have used adaptive color thresholding in the red-green-blue (RGB) space for color segmentation, a Gaussian mixture model to describe bacilli features, and a Bayesian classifier [17, 18].
The first method for automatic screening of bacilli in conventional microscopy was reported in 2008. Studies reveal a combination of pixel classifiers being used, with accuracy above 95% achieved using an object classifier for the identification of bacillus objects. A hue color component-based approach has been adopted to identify a valid single bacillus, and color segmentation of bacilli with classification in the hue-saturation-intensity (HSI) space has also been reported. An innovative divide-and-conquer method improves the potential of color-based classification [3, 19–23]. Some authors have attempted image processing algorithms such as image binarization; naive Bayesian-based color segmentation; color space-based segmentation in RGB, HSI, YCbCr, and CIE Lab; k-means clustering; and Otsu segmentation. The segmentation procedure is followed by identification of regions using decision tree and random forest classifiers, multilayer perceptron neural networks, support vector machines, and convolutional neural networks [6, 7, 15, 24–26].
14.1.6 Pre-processing

The acquired sputum smear images may have non-uniform illumination because of faulty power supply voltage and, mainly, the long-run use of the microscope optics. This non-uniform illumination needs to be corrected for effective recognition and categorization of TB sputum smear images [7, 10]. Hence pre-processing is essential to improve image quality for further processing of these images.

BEMD and DWT
The bi-dimensional empirical mode decomposition (BEMD) method represents nonstationary data by intrinsic mode functions (IMFs), which are extracted using the EMD technique. The method finds a number of applications, including background correction [27]. Discrete wavelet transform (DWT) analysis is a multi-scale, multi-resolution analysis that expresses the image in both the spatial and frequency domains. This makes it possible to segregate the low- and high-frequency content of the image: the approximation coefficients in the wavelet domain carry the low-frequency background information, while the detail coefficients at different levels of decomposition contain the high-frequency information [28].
14.1.7 Segmentation

Image segmentation plays an important role in medical image applications, machine vision, and object recognition. Accurate delineation of medical images is a prime goal, and successful segmentation supports appropriate diagnosis and detection of various diseases [29–31]. Suitable
segmentation techniques need to be chosen to detect tubercle bacilli in the sputum smear images. Segmentation of an anatomical object is not an easy task, because the intensity values of the object to be segmented and its surrounding region often have hazy contours. Many traditional image segmentation algorithms follow intensive procedures or at times need manual intervention. Over the years, various segmentation methods have been reported in the literature for different classes of medical images [32]. Choosing an appropriate segmentation procedure is another important concern, as this is mostly application specific and depends on the modality by which the image is acquired, among other aspects. Image segmentation algorithms are established based on the analysis of edges, regions, or their combination. Edge-based algorithms respond to rapid changes in pixel intensity within a small neighborhood, whereas region-based segmentation depends on the presence of homogeneity in the image [32]. Reported categories include threshold-based segmentation, region merging and splitting, the watershed algorithm, histogram-based approaches, cluster analysis, and wavelet transform methods. Among these, threshold-based segmentation is a popular tool because of its simplicity, especially in real-time image processing. Thresholding can be either bi-level or multilevel; the basic bi-level threshold divides the image into two distinct regions of interest. Maximum entropy-based thresholding is a method in which entropy is calculated using the Shannon, Renyi, or Tsallis formulations, and entropy-based thresholding has been used for segmenting bacteria images from a low-intensity background [29–31, 33–35]. Threshold-based segmentation procedures are also attractive in terms of computational complexity compared with other segmentation methods. Otsu's thresholding-based image segmentation is one of the most preferred because of its favorable results: it identifies homogeneous regions in the image from the histogram and extracts the region of interest [36].
14.1.8 Image Segmentation Indices

Segmented images are validated by various difference or similarity measures that have well-defined mathematical and computational properties. The most common segmentation indices include the sum of absolute differences (SAD), the sum of squared differences (SSD), and normalized cross-correlation (NCC). Despite their popularity, these measures are prone to outliers and are not robust to variations in the template that occur at overlapping edges in the image. NCC is accurate but computationally more expensive; it is more powerful than SAD and SSD under uniform illumination changes, so it finds use in object recognition and industrial inspection. Other measures include the probabilistic Rand index (PRI), which quantifies the agreement and disagreement between the segmentation output and the ground truth. Global consistency error (GCE) measures the error between
the segmentation output and gold standard ground truth, and variation of information (VoI) is a distance measure. These measures have been used in a number of applications, including object recognition, the creation of disparity maps for stereo images, and motion estimation for video compression [37–40].
14.2 Methodology
14.2.1 Acquisition of TB Sputum Images

The collected sputum sample is smeared on a slide free of dirt. The smear is air-dried and then fixed by passing the slide two to three times through a low flame. The heat-fixed slide is treated with auramine O stain, which targets the acid-fast bacilli's cell wall, and thereafter it is dried for 10 minutes. The slide is then washed with running water once it has been decolorized with acid-alcohol. The slide is counter-stained with potassium permanganate to obtain a contrasting background. Before the slide is viewed under the microscope, it is washed with running tap water and air-dried. The stained slides were prepared at Groote Schuur Hospital in the South African National Health Laboratory Services, Cape Town.
About a hundred images are acquired from the microscope coupled with a camera in black-and-white mode, with a 20× objective lens and 0.5 numerical aperture. The AxioCam HR camera used has a resolution of 4164 × 3120 pixels with a pixel size of 6.45 μm (h) × 6.45 μm (v). A spotlight persists in these images because of the microscope light optics, camera gain, and exposure time, leaving a bright mark in the middle of each image. Images of size 256 × 256 pixels are pre-processed by the decomposition-based methods BEMD and DWT.
Figure 14.1 shows the block diagram of the work carried out in this chapter. The images are pre-processed by the BEMD and DWT methods. The pre-processed images are segmented by thresholding techniques, namely the maximum entropy and Otsu segmentation methods. The segmented results are validated against ground-truth images using performance indices.
Fig. 14.1 Pipeline flow of the work: 100 sputum smear TB images of size 256 × 256 → pre-processing (BEMD, DWT) → segmentation (maximum entropy, Otsu) → validation (PRI, GCE, VoI, NCC, SAD, SSD)
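The pipeline of Fig. 14.1 can also be read as a simple driver loop. The sketch below is illustrative only: the correction, thresholding, and index functions are placeholders standing in for the BEMD/DWT, maximum entropy/Otsu, and validation steps detailed later in this chapter, not code from the original work.

```python
# Illustrative driver for the Fig. 14.1 pipeline; all helpers are placeholders.
import numpy as np

def run_pipeline(images, ground_truths, correct, threshold, indices):
    """Pre-process, segment, and validate a batch of 256 x 256 smear images."""
    scores = {name: [] for name in indices}
    for img, gt in zip(images, ground_truths):
        corrected = correct(img)                  # e.g. BEMD- or DWT-based correction
        mask = corrected > threshold(corrected)   # e.g. maximum entropy or Otsu
        for name, index_fn in indices.items():    # PRI, GCE, VoI, NCC, SAD, SSD
            scores[name].append(index_fn(mask, gt))
    # Report mean and standard deviation per index, as in Table 14.1.
    return {name: (np.mean(v), np.std(v)) for name, v in scores.items()}
```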
14.2.2 Pre-processing of Sputum Smear Images

Pre-processing is a significant step before any analysis of the digital images is carried out. An effective method increases segmentation accuracy and is essential for the subsequent processing steps [41].
BEMD
EMD is appropriate for systems with nonlinear and nonstationary data. The method decomposes a time series into mono-components called intrinsic mode functions (IMFs), which identify the individual frequency components. By a repeated procedure known as sifting, defined on the extrema of the function, EMD decomposes the given function into successive IMFs. The sifting process is continued until all, or a sufficient number of, IMFs are computed; EMD extracts frequencies progressively from higher to lower values [27]. The 2D extension of EMD (BEMD) processes two-dimensional images by computing the algebraic mean of the maximal and minimal envelope surfaces fitted through the local extrema points [42]. Let $I(x, y)$ be the input image and $a(x, y)$ the algebraic mean; BEMD then iterates on the difference

$$d(x, y) = I(x, y) - a(x, y) \tag{14.1}$$

The above procedure is repeated until $d(x, y)$ is an IMF. The residues $r_1(x, y)$ through $r_n(x, y)$ are separated from the input image by eliminating the IMFs:

$$r_1(x, y) = I(x, y) - \mathrm{IMF}_1(x, y) \tag{14.2}$$

The original image can then be written as:

$$I(x, y) = \sum_{i=1}^{n} \mathrm{IMF}_i(x, y) + r_n(x, y) \tag{14.3}$$
The first IMF carries the highest frequencies, followed by progressively lower frequencies in the subsequent IMFs. The illumination-corrected image is reconstructed by retaining the higher-frequency IMFs and discarding the low-frequency components that contribute to the illumination.
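As a rough illustration of Eqs. (14.1)–(14.3), the sketch below extracts a first IMF with a simplified sifting loop, using morphological extrema detection and linear envelope interpolation. It is a minimal sketch under our own simplifications (fixed sifting count, single IMF, our own function names), not the chapter's implementation; production BEMD codes use more careful extrema handling and stopping criteria.

```python
import numpy as np
from scipy.ndimage import maximum_filter, minimum_filter
from scipy.interpolate import griddata

def envelope(img, extrema_mask):
    """Interpolate a smooth envelope surface through the given extrema."""
    ys, xs = np.nonzero(extrema_mask)
    grid_y, grid_x = np.mgrid[0:img.shape[0], 0:img.shape[1]]
    return griddata((ys, xs), img[ys, xs], (grid_y, grid_x),
                    method="linear", fill_value=float(img.mean()))

def bemd_first_imf(img, win=8, n_sift=5):
    """Extract the first (highest-frequency) IMF of an image by sifting."""
    d = img.astype(float)
    for _ in range(n_sift):
        upper = envelope(d, d == maximum_filter(d, size=win))
        lower = envelope(d, d == minimum_filter(d, size=win))
        a = (upper + lower) / 2.0        # algebraic mean surface, Eq. (14.1)
        d = d - a                        # candidate IMF
    return d

def correct_illumination(img, win=8):
    """Keep the high-frequency IMF; the residue holds the illumination."""
    imf1 = bemd_first_imf(img, win)
    residue = img.astype(float) - imf1   # low-frequency part, cf. Eq. (14.2)
    # Re-add the mean brightness so the corrected image stays displayable.
    return imf1 + residue.mean()
```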
DWT
DWT is a well-known method that provides an efficient time-scale representation for functions with characteristics similar to the functions in the wavelet basis. The orthogonal DWT decomposition of a signal $s(n)$ is represented by:

$$s(n) = \sum_{j=1}^{K} \sum_{k=1}^{\infty} w_j(k)\, \psi\!\left(2^j n - k\right) \tag{14.4}$$
where the function ψ(n) represents a discrete wavelet and the coefficients wj(k) represent the signal at level j. The Haar wavelet is chosen in this work for image decomposition because of its low computing requirements. The Haar wavelet is a symmetric wavelet among the orthogonal types; it is discontinuous and resembles a step function. It is suitable for the sputum smear images because the foreground objects, the white rod-shaped bacilli, are readily distinguishable from the background. The DWT acts as a filter, passing low frequencies into the approximation portion and high frequencies into the detail portion, followed by decimation. In general, the DWT can be viewed as a set of band-pass filter banks that decompose the given function into levels of meaningful components with progressively lower frequencies. Hence the time resolution is good at high frequencies, and the frequency resolution is good at low frequencies. There are some key differences between the decompositions performed by BEMD and DWT. DWT scales frequency content in a fixed manner, determined by the sampling frequency and the decomposition level, whereas the IMFs may carry varying frequency components depending on local properties of the function. The performance of DWT depends on the selection of the wavelet and its similarity to the analyzed function, while BEMD has no basis functions and is largely data dependent [27]. The TB objects (outliers and bacilli) present in the sputum smear images are then identified by the segmentation procedure.
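A minimal sketch of the DWT-based background estimation described above, assuming the PyWavelets package is available: the image is decomposed with the Haar wavelet, the detail coefficients are zeroed so that only the second-level approximation survives, and the reconstructed background is subtracted from the image. Function names are ours.

```python
import numpy as np
import pywt

def dwt_background(img, level=2):
    """Estimate the low-frequency background from the level-2 Haar
    approximation, zeroing all detail coefficients before reconstruction."""
    coeffs = pywt.wavedec2(img.astype(float), "haar", level=level)
    # Keep only the approximation; replace every detail triple with zeros.
    coeffs = [coeffs[0]] + [
        tuple(np.zeros_like(d) for d in detail) for detail in coeffs[1:]
    ]
    background = pywt.waverec2(coeffs, "haar")
    return background[: img.shape[0], : img.shape[1]]

def dwt_correct(img):
    """Subtract the estimated background and restore mean brightness."""
    bg = dwt_background(img)
    return img.astype(float) - bg + bg.mean()
```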
14.2.3 Segmentation of Digital TB Images

Image segmentation is a useful tool for image analysis. It partitions the image into groups of pixels that share similar characteristics, assigning each group a label, so that the image becomes easier to analyze. The result of segmentation is a collection of segments, or a set of contours extracted from the image, that together cover the entire image. The pixels in a particular region can be associated with some characteristic or computed property such as color, intensity, or texture. Segmentation can be carried out by edge-based methods, region-based methods, or entropy-based methods, among others [29, 43–45].
Image Thresholding
Thresholding is a well-known method because it is simple and easily implemented. The histogram of the input image is extracted so that distinct peaks and valleys representing foreground and background objects can be distinguished. Spatial correlation information between pixels is the prime concern in thresholding-based
methods [46]. Image thresholding based on entropy is a significant method for image segmentation; it performs well because it measures the randomness and variance within the image [29].

Entropy-Based Thresholding
Entropy is calculated to find the threshold value for segmentation, and it provides a good level of information to express an image. The entropy can be computed from the distribution of gray levels, from which a pertinent partition of the image can be obtained. The calculation can use entropy measures such as Shannon, Tsallis, and Renyi. Maximum entropy is measured when maximum variance is encountered; compared with gradient-based methods, which are sensitive to noise, entropy is considered an effective criterion. Maximum entropy-based segmentation finds the optimum threshold, which helps medical practitioners in the detection of different diseases. An image that exhibits a low entropy value carries little meaningful information: a homogeneous region in an image results in minimum entropy, while a heterogeneous region yields maximum entropy. The probabilities describing pixel randomness in the image, needed for computing the entropy, are obtained from histograms. In the field of medicine, entropy-based segmentation finds application in identifying abnormal conditions, enabling demarcation from control subjects [29–32].

Entropy Computation
Kapur entropy-based thresholding is also known as maximum entropy thresholding. It selects the threshold that maximizes the cumulative sum of the entropies of the background and foreground objects [46]. The entropy of an image with G gray levels, where the k-th gray level has probability $P_k$, is

$$E = \sum_{k=0}^{G-1} P_k \log_2(1/P_k) = -\sum_{k=0}^{G-1} P_k \log_2(P_k) \tag{14.5}$$
where $P_k = n_k/(M \times N)$, $n_k$ is the number of pixels with gray level k, and $M \times N$ is the size of the image [32]. Shannon's definition of the entropy of an image assumes the image to be represented by its gray-level histogram H over the range $\{0, 1, \ldots, L-1\}$. The entropies are computed from the global and local average gray values of the image by considering the eight-neighborhood pixels [31, 47].
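A compact sketch of Kapur's maximum entropy thresholding built directly on Eq. (14.5): for each candidate threshold, the entropies of the normalized background and foreground histograms are summed, and the maximizing threshold is returned. The function names are ours, not the chapter's.

```python
import numpy as np

def kapur_threshold(img, levels=256):
    """Kapur's maximum entropy threshold: pick t maximizing the sum of
    background and foreground entropies computed from the histogram."""
    hist, _ = np.histogram(img, bins=levels, range=(0, levels))
    p = hist / hist.sum()                       # P_k as in Eq. (14.5)
    best_t, best_e = 0, -np.inf
    for t in range(1, levels - 1):
        q1, q2 = p[:t].sum(), p[t:].sum()
        if q1 <= 0 or q2 <= 0:
            continue
        p1, p2 = p[:t] / q1, p[t:] / q2         # class-conditional histograms
        e1 = -np.sum(p1[p1 > 0] * np.log2(p1[p1 > 0]))
        e2 = -np.sum(p2[p2 > 0] * np.log2(p2[p2 > 0]))
        if e1 + e2 > best_e:
            best_t, best_e = t, e1 + e2
    return best_t

# Usage sketch: mask = img > kapur_threshold(img)
```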
Otsu-Based Segmentation
Global thresholding-based image segmentation is suitable when the intensity of the foreground objects is entirely dissimilar from the background: a single threshold is then enough to distinguish the foreground objects. But when there are many unlike objects of interest, a single global threshold will not produce the expected outcome. Thus
multilevel thresholding, with more than one threshold value, is opted for. Choosing an appropriate threshold value is essential to segregate the region of interest in an image [48]. In this work, Otsu-based thresholding is used to segregate the TB objects from the sputum smear image. The prime purpose of Otsu's method is to derive the optimal threshold value. It is computed by gathering the pixels into two classes, as the histogram has a bimodal pattern. Otsu's algorithm utilizes the variance property of the image, because a greater variance indicates a big difference between background and foreground objects. The optimal threshold value is chosen by minimizing the within-class variance or, equivalently, maximizing the between-class variance. The within-class variance over the two clusters is

$$\sigma_w^2(t) = q_1(t)\,\sigma_1^2(t) + q_2(t)\,\sigma_2^2(t) \tag{14.6}$$

where the weights $q_i$ refer to the probability of each class:

$$q_1(t) = \sum_{i=1}^{t} P_i \qquad q_2(t) = \sum_{i=t+1}^{l} P_i \tag{14.7}$$

$$\mu_1(t) = \sum_{i=1}^{t} \frac{i\,P(i)}{q_1(t)} \qquad \mu_2(t) = \sum_{i=t+1}^{l} \frac{i\,P(i)}{q_2(t)} \tag{14.8}$$

And the individual class variances are computed by:

$$\sigma_1^2(t) = \sum_{i=1}^{t} \left[i - \mu_1(t)\right]^2 \frac{P_i}{q_1(t)} \tag{14.9}$$

$$\sigma_2^2(t) = \sum_{i=t+1}^{l} \left[i - \mu_2(t)\right]^2 \frac{P_i}{q_2(t)} \tag{14.10}$$

The algorithm is run for all threshold values t of the within-class variance, and the between-class variance is

$$\sigma_b^2(t) = \sigma^2 - \sigma_w^2(t) = q_1(t)\,q_2(t)\left[\mu_1(t) - \mu_2(t)\right]^2 \tag{14.11}$$

This assessment thus minimizes the intra-class variance and maximizes the between-class variance [36, 48].
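Equations (14.6)–(14.11) translate almost directly into a scan over the histogram; the sketch below, with our own naming, maximizes the between-class variance of Eq. (14.11). Library routines such as skimage.filters.threshold_otsu implement the same criterion.

```python
import numpy as np

def otsu_threshold(img, levels=256):
    """Otsu's method: pick t maximizing the between-class variance
    of Eq. (14.11), computed from the normalized histogram."""
    hist, _ = np.histogram(img, bins=levels, range=(0, levels))
    p = hist / hist.sum()
    i = np.arange(levels)
    best_t, best_var = 0, 0.0
    for t in range(1, levels):
        q1, q2 = p[:t].sum(), p[t:].sum()       # class probabilities, Eq. (14.7)
        if q1 == 0 or q2 == 0:
            continue
        mu1 = (i[:t] * p[:t]).sum() / q1        # class means, Eq. (14.8)
        mu2 = (i[t:] * p[t:]).sum() / q2
        var_b = q1 * q2 * (mu1 - mu2) ** 2      # between-class variance, Eq. (14.11)
        if var_b > best_var:
            best_t, best_var = t, var_b
    return best_t
```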
14.2.4 Image Segmentation Indices

The segmentation result is validated against the expert's ground-truth image by quantitative assessment. Six standard image segmentation indices, PRI, GCE, VoI, SAD, SSD, and NCC, are used in this work to compare the segmentation results with the ground-truth images.
The PRI counts the fraction of pixel pairs whose labels are consistent between the segmentation output and the ground truth. The Rand index is defined as

$$R = \frac{a + b}{a + b + c + d} = \frac{a + b}{\binom{n}{2}} \tag{14.12}$$

where a + b is the number of agreements between the two images X and Y, and c + d is the number of disagreements between them. Its value lies between 0 and 1, with 0 indicating that the two labelings agree on no pair of points and 1 indicating an exact match.
The GCE measures the extent to which one segmentation can be viewed as a refinement of the other. If one segment is a proper subset of the other, the pixel lies in an area of refinement, and the error is 0. If there is no subset relationship, the two regions overlap in an inconsistent manner:

$$e = \sum_{i=1}^{n} \sum_{j=1}^{m} \left\| S(i, j) - T(i, j) \right\|^2 \tag{14.13}$$
where the segmentation error measure takes two segmentations and produces an output in the range 0 to 1, with 0 signifying no error.
The VoI metric defines the distance between two segmentations as the average conditional entropy of the resultant segmentation given the expert's ground truth; it thus roughly measures the amount of randomness in the segmentation output that cannot be explained by the ground truth. For two clusterings with subsets X and Y, the variation of information between them is

$$VI(X; Y) = H(X) + H(Y) - 2\,I(X, Y) \tag{14.14}$$

where:

$$X = \{X_1, X_2, \ldots, X_k\}, \qquad P_i = |X_i|/n, \qquad n = \sum_{k} |X_k| \tag{14.15}$$

Here H(X) is the entropy of X, and I(X, Y) is the mutual information between X and Y [49].
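Both the pair-counting and the information-theoretic indices can be computed from the contingency (joint label) table of the two segmentations. The sketch below, with our own function names, evaluates the plain Rand index of Eq. (14.12), of which PRI is the probabilistic generalization, and the VoI of Eq. (14.14).

```python
import numpy as np

def contingency(labels_a, labels_b):
    """Joint histogram of two integer labelings over the same pixels."""
    a = labels_a.ravel().astype(int)
    b = labels_b.ravel().astype(int)
    table = np.zeros((a.max() + 1, b.max() + 1))
    np.add.at(table, (a, b), 1)
    return table

def rand_index(labels_a, labels_b):
    """Rand index of Eq. (14.12), computed from pair counts."""
    t = contingency(labels_a, labels_b)
    n = t.sum()
    same_both = (t * (t - 1) / 2).sum()              # pairs joined in both
    same_a = (t.sum(1) * (t.sum(1) - 1) / 2).sum()
    same_b = (t.sum(0) * (t.sum(0) - 1) / 2).sum()
    total = n * (n - 1) / 2
    diff_both = total - same_a - same_b + same_both  # pairs split in both
    return (same_both + diff_both) / total

def variation_of_information(labels_a, labels_b):
    """VoI = H(X) + H(Y) - 2 I(X, Y), Eq. (14.14), from the joint histogram."""
    p_xy = contingency(labels_a, labels_b)
    p_xy = p_xy / p_xy.sum()
    h = lambda p: -np.sum(p[p > 0] * np.log2(p[p > 0]))
    h_x, h_y, h_xy = h(p_xy.sum(1)), h(p_xy.sum(0)), h(p_xy)
    mi = h_x + h_y - h_xy                            # mutual information I(X, Y)
    return h_x + h_y - 2 * mi
```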
The SAD and SSD distance measures account for global gray-level variations by setting the average gray-level difference to zero. In the SAD method, the sum of absolute differences between the segmented result T of size n × m pixels and blocks of size n × m pixels in the ground-truth image S is expressed as:

$$\mathrm{SAD}(i, j) = \sum_{i=1}^{n} \sum_{j=1}^{m} \left| S(i, j) - T(x, y) \right| \tag{14.16}$$
Computing SAD(i, j) requires a few steps over the segmented result area (n × m). These computations are performed for each (i, j) in the gold standard ground-truth image, where $1 \leq i < (p - n)$ and $1 \leq j < (q - m)$.
SSD is a measure used to quantify the dissimilarity between two images; its minimum value is 0. SSD is more sensitive to noise in low-contrast regions and relies on the assumption of similar intensity profiles in the two images [37–39]:

$$\mathrm{SSD}(i, j) = \sum_{i=1}^{n} \sum_{j=1}^{m} \left( I(i, j) - J(i, j) \right)^2 \tag{14.17}$$
The standard cross-correlation (CC) is very prone to noise and is usually replaced by its normalized version (NCC). All these measurements are performed on the local gray-level values. Cross-correlation is ideal for images whose intensities are linearly related [38, 50]:

$$\mathrm{NCC}(i, j) = \sum_{i=1}^{n} \sum_{j=1}^{m} \hat{F}(i, j)\, \hat{G}(i, j) \tag{14.18}$$

where

$$\hat{F} = \frac{F - \bar{F}}{\sqrt{\sum \left(F - \bar{F}\right)^2}} \quad \text{and} \quad \hat{G} = \frac{G - \bar{G}}{\sqrt{\sum \left(G - \bar{G}\right)^2}} \tag{14.19}$$

The normalized version is computed by subtracting the mean of the patch intensities and dividing by the standard deviation. The NCC value ranges from −1 to +1, where +1 represents a perfect match and −1 complete anti-correlation. SAD, in contrast, computes the absolute difference between each pixel in the segmented image and the respective pixel in the ground truth; these differences are summed to create a simple similarity metric [39, 40].
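When evaluated over a full image pair rather than a sliding window, Eqs. (14.16)–(14.19) reduce to a few lines of NumPy. A minimal sketch with our own function names:

```python
import numpy as np

def sad(seg, gt):
    """Sum of absolute differences, Eq. (14.16)."""
    return np.abs(seg.astype(float) - gt.astype(float)).sum()

def ssd(seg, gt):
    """Sum of squared differences, Eq. (14.17)."""
    return ((seg.astype(float) - gt.astype(float)) ** 2).sum()

def ncc(seg, gt):
    """Normalized cross-correlation, Eqs. (14.18) and (14.19):
    mean-center each image, then normalize by the energy."""
    f = seg.astype(float) - seg.mean()
    g = gt.astype(float) - gt.mean()
    return (f * g).sum() / np.sqrt((f ** 2).sum() * (g ** 2).sum())
```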
14.3 Results and Discussion
The microscopy setup used in this work is presented in Fig. 14.2. The digital camera coupled with the microscope acquires the digital images, which are stored in the computer. The motorized stage assembly moves the slide so that images from various view fields are acquired. A portion of each image (256 × 256 pixel resolution) is used for further processing. Figure 14.3 (a) and (b) shows typical TB-positive and TB-negative images. The TB-positive image shows the presence of the bright rod-shaped disease-causing agent, the bacilli; the TB-negative image shows either the absence or only a scanty presence of bacilli. Outliers such as dirt, food rests, and clumps of sputum due to uneven smearing may be present in both TB-positive and TB-negative images. Identification of the rod-shaped bacilli in these images is therefore quite challenging, and appropriate image processing tools must be chosen so that bacilli identification helps segregate the TB images into positive and negative.
Fig. 14.2 Microscopy setup at Medical Imaging Research Unit, University of Cape Town, South Africa (digital camera, eyepiece, objective, slide, motorized XYZ stage assembly, light source)
The non-uniform illumination component present in these images needs to be corrected; otherwise, categorization of these images would be a difficult task. The images are processed using the BEMD and DWT methods, and the performance of the two methods is compared qualitatively. A line is drawn across the image, as shown in Fig. 14.4 (a), and the intensity profile is extracted along it. The 2D and 3D intensity profiles before illumination correction, presented in Fig. 14.4 (b) and (c), show baseline wandering, which is due to the non-uniform illumination in these images.
Figure 14.5 (a) shows the output of the BEMD method: the non-uniform illumination is absent. The same is reflected in the 2D and 3D intensity profiles presented in Fig. 14.5 (b) and (c); the baseline wandering is removed, so the image is free from non-uniform illumination. Figure 14.6 (a) shows the output of the DWT algorithm; as can be seen, the contrast of the image is lower than in Fig. 14.5 (a), and the 2D and 3D intensity profiles still show baseline wandering. This demonstrates that the non-uniform illumination component is not completely removed by the DWT method. It is inferred from the qualitative analysis that BEMD performs better in removing the non-uniform illumination in the TB sputum smear images.
The IMF1 of the BEMD method, which aids in reconstructing the illumination-corrected image, is shown in Fig. 14.7 (a). The background estimated by the DWT method, computed from the second level of decomposition, is presented in Fig. 14.7 (b); this background image is subtracted from the non-uniformly illuminated image to obtain the corrected image. As a value addition, the respective histograms are plotted in Fig. 14.8: (a) before illumination correction, and after illumination correction by (b) BEMD and (c) DWT. A sharp peak is visualized in
Fig. 14.8 (b) compared with Fig. 14.8 (c); this shows that the brightness of the illumination-corrected image from the BEMD method is better than that from DWT.

Fig. 14.3 Typical (a) TB-positive and (b) TB-negative image

Apart from the qualitative analysis, quantitative performance is also studied using histogram-based statistical features. The bar plot presenting the mean values of the histogram features is shown in Fig. 14.9. The higher mean for the BEMD method reflects the brightness of the illumination-corrected image. Similarly, the higher variance represents an increase in image contrast, and the decrease in skewness represents a more uniform background intensity in the sputum smear images. The kurtosis value is high for the BEMD-corrected images, so they tend to have sharp and better image quality. Removal of the low-frequency (non-uniform illumination) component minimizes energy, which helps in the identification of foreground objects.
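The histogram-based statistical features compared in Fig. 14.9 can all be computed from the normalized gray-level histogram. A sketch assuming 8-bit images (function name ours):

```python
import numpy as np

def histogram_features(img, levels=256):
    """First-order statistical features from the gray-level histogram:
    mean, variance, skewness, kurtosis, energy, and entropy."""
    hist, _ = np.histogram(img, bins=levels, range=(0, levels))
    p = hist / hist.sum()
    i = np.arange(levels)
    mean = (i * p).sum()
    var = ((i - mean) ** 2 * p).sum()
    skew = ((i - mean) ** 3 * p).sum() / var ** 1.5
    kurt = ((i - mean) ** 4 * p).sum() / var ** 2
    energy = (p ** 2).sum()
    entropy = -np.sum(p[p > 0] * np.log2(p[p > 0]))
    return dict(mean=mean, variance=var, skewness=skew,
                kurtosis=kurt, energy=energy, entropy=entropy)
```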
Fig. 14.4 (a) A line in the TB-positive image (before illumination correction) to extract (b) 2D and (c) 3D intensity profile along the line (axes: intensity vs. distance along profile)
Fig. 14.5 (a) TB-positive image after illumination correction by BEMD method and (b) 2D and (c) 3D intensity profile along the line (axes: intensity vs. distance along profile)
Fig. 14.6 (a) TB-positive image after illumination correction by DWT method and (b) 2D and (c) 3D intensity profile along the line (axes: intensity vs. distance along profile)
Fig. 14.7 (a) IMF1 of BEMD and (b) estimated background by DWT
Fig. 14.8 Histogram plot (a) before and after illumination correction by (b) BEMD and (c) DWT methods (axes: frequency distribution vs. pixel intensity)
The homogeneous illumination component in the images before correction results in a low entropy value. The quantitative measures also confirm the goodness of the BEMD method in correcting the non-uniform illumination.
Figure 14.10 (a) and (b) shows the results of segmentation by the maximum entropy and Otsu methods. The maximum entropy method captures very light gray-level variations that are not of interest, whereas Otsu-based threshold segmentation succeeds in identifying the TB objects in these images. The segmentation indices PRI, GCE, VoI, NCC, SAD, and SSD are computed for the segmented images with respect to the ground-truth expert images. Figure 14.11 presents box plots of the normalized average values of the segmentation indices; only a small difference is observed between the maximum entropy and Otsu segmentation methods. Hence, the normalized
Fig. 14.9 Bar plot of histogram-based statistical features
Fig. 14.10 Threshold-based segmentation results: (a) maximum entropy and (b) Otsu segmentation resultant images
Fig. 14.11 Image segmentation indices for (a) maximum entropy and (b) Otsu segmentation (box plots of normalized values for PRI, GCE, VoI, NCC, SAD, SSD)
numerical values of these indices are averaged, and their standard deviation values are presented in Table 14.1. The values in Table 14.1 demonstrate that, for PRI, the agreement between the segmentation and the ground truth is higher for Otsu than for maximum entropy, as is also the case for the NCC measure. No difference is observed for the GCE measure, as both segmentations are considered consistent. The distance measure VoI is lower for Otsu, indicating closeness to the expert's ground truth. Both similarity measures, SAD and SSD, justify the nearest match of the Otsu segmented result with the ground truth. Hence Otsu is a better choice than maximum entropy for segmenting the TB objects from sputum smear images.
Table 14.1 Average ± standard deviation values of image segmentation indices

Segmentation method                  PRI             GCE             VoI             NCC             SAD             SSD
Maximum entropy-based segmentation   0.971 ± 0.007   0.005 ± 0.002   0.175 ± 0.043   0.417 ± 0.03    0.019 ± 0.01    0.1 ± 0.015
Otsu segmentation                    0.98 ± 0.005    0.005 ± 0.002   0.133 ± 0.035   0.418 ± 0.043   0.028 ± 0.005   0.118 ± 0.013

14.4 Conclusion

According to the WHO report, TB is the top contagious disease worldwide, with ten million people falling ill in 2018. Progress toward the End TB Strategy milestones for 2020 was declared at the UN high-level meeting on TB. The diagnostic tests for TB include sputum smear microscopy, which was developed more than 100 years ago but still remains the backbone of diagnosis [4]. In this work, sputum smear images captured under a fluorescence microscope are considered. Because of the microscope optics, a low-frequency illumination component is present in these images. An attempt has been made to remove the non-uniform illumination with the BEMD and DWT decomposition methods. The foreground TB objects are separated from the background using threshold-based segmentation, namely the maximum entropy and Otsu methods. It is observed that BEMD performs better in correcting the non-uniform illumination component; the intensity profiles and histogram-based statistical features prove the efficiency of BEMD in improving image brightness. The EMD method
has no basis function, unlike DWT, and is thus fully adaptive to the image under consideration. The results demonstrate that Otsu segmentation segregates the TB objects from the background in terms of the indices: the similarity, error, and correlation measures highlight the better performance of Otsu over the maximum entropy method. Thus the proposed workflow will aid clinicians in improving the clinical outcome of TB.
References

1. Mithra KS, Emmanuel WS (2018) FHDT: fuzzy and Hyco-entropy-based decision tree classifier for tuberculosis diagnosis from sputum images. Sādhanā 43(8):125
2. Mithra KS, Emmanuel WS (2018) Automatic methods for Mycobacterium detection on stained sputum smear images: a survey. Patt Recogn Imag Anal 28(2):310–320
3. Costa Filho CFF, Levy PC, Xavier CDM, Fujimoto LBM, Costa MGF (2015) Automatic identification of tuberculosis mycobacterium. Res Biomed Eng 31(1):33–43
4. World Health Organization (2019) Global tuberculosis report 2019. World Health Organization
5. Shah MI, Mishra S, Yadav VK, Chauhan A, Sarkar M, Sharma SK, Rout C (2017) Ziehl–Neelsen sputum smear microscopy image database: a resource to facilitate automated bacilli detection for tuberculosis diagnosis. J Med Imag 4(2):027503
6. de Assis Soares L, Côco KF, Salles EOT, Bortolon S (2015, March) Automatic identification of Mycobacterium tuberculosis in Ziehl-Neelsen stained sputum smear microscopy images using a two-stage classifier. VISAPP 3:186–191
7. Rulaningtyas R, Suksmono AB, Mengko TL, Saptawati P (2015, April) Identification of mycobacterium tuberculosis in sputum smear slide using automatic scanning microscope. In: AIP conference proceedings (vol 1656, no 1, p 060011). AIP Publishing
8. El-Melegy M, Mohamed D, ElMelegy T, Abdelrahman M (2019) Identification of tuberculosis bacilli in ZN-stained sputum smear images: a deep learning approach. In: Proceedings of the IEEE conference on computer vision and pattern recognition workshops
9. Shah MI, Mishra S, Sarkar M, Rout C (2017) Identification of robust focus measure functions for the automated capturing of focused images from Ziehl–Neelsen stained sputum smear microscopy slide. Cytometry A 91(8):800–809
10. Sheeba F, Thamburaj R, Mammen JJ, Nithish R, Karthick S (2015) Detection of overlapping tuberculosis bacilli in sputum smear images. In: 7th WACBE World congress on bioengineering 2015. Springer, Cham, pp 54–56
11. Coronel JE, Del Carpio CC, Dianderas EJ, Florentini EA, Kemper GL, Sheen P, Zimic MJ (2019) Evaluation of microbiological variants of sputum processing and concentration of mycobacteria to optimize the microscopic and imaging diagnosis of tuberculosis. Int J Mycobacteriol 8(1):75
12. Goyal A, Roy M, Gupta P, Dutta MK, Singh S, Garg V (2015) Automatic detection of mycobacterium tuberculosis in stained sputum and urine smear images. Arch Clin Microbiol 6(3):1
13. Shah MI, Mishra S, Sarkar M, Rout C (2016) Automatic detection and classification of tuberculosis bacilli from ZN-stained sputum smear images using watershed segmentation
14. Sugirtha GE, Murugesan G (2017, March) Detection of tuberculosis bacilli from microscopic sputum smear images. In: 2017 Third International Conference on Biosignals, Images and Instrumentation (ICBSII). IEEE, pp 1–6
15. del Carpio C, Dianderas E, Zimic M, Sheen P, Coronel J, Lavarello R, Kemper G (2019) An algorithm for detection of tuberculosis bacilli in Ziehl-Neelsen sputum smear images. Int J Elect Comp Eng 9(4):2968–2981
16. Sheeba F, Thamburaj R, Michael JS, Maqlin P, Mammen JJ (2012) Segmentation of sputum smear images for detection of tuberculosis bacilli. BMC Infect Dis 12(S1):O14
17. Veropoulos K, Campbell C, Learmonth G, Knight B, Simpson J (1998, September) The automated identification of tubercle bacilli using image processing and neural computing techniques. In: International conference on artificial neural networks. Springer, London, pp 797–802
18. Forero M, Cristobal G, Alvarez-Borrego J (2003, November) Automatic identification techniques of tuberculosis bacteria. In: Applications of digital image processing XXVI, vol 5203. International Society for Optics and Photonics, pp 71–81
19. Khutlang R, Krishnan S, Dendere R, Whitelaw A, Veropoulos K, Learmonth G, Douglas TS (2009) Classification of Mycobacterium tuberculosis in images of ZN-stained sputum smears. IEEE Trans Inf Technol Biomed 14(4):949–957
20. Makkapati V, Agrawal R, Acharya R (2009, August) Segmentation and classification of tuberculosis bacilli from ZN-stained sputum smear images. In: 2009 IEEE international conference on automation science and engineering. IEEE, pp 217–220
21. Lenseigne B, Brodin P, Jeon HK, Christophe T, Genovesio A (2007, April) Support vector machines for automatic detection of tuberculosis bacteria in confocal microscopy images. In: 2007 4th IEEE international symposium on biomedical imaging: from nano to macro. IEEE, pp 85–88
22. Nayak R, Shenoy VP, Galigekere RR (2010, December) A new algorithm for automatic assessment of the degree of TB-infection using images of ZN-stained sputum smear. In: 2010 international conference on systems in medicine and biology. IEEE, pp 294–299
23. Osman MK, Mashor MY, Jaafar H (2012) Performance comparison of extreme learning machine algorithms for mycobacterium tuberculosis detection in tissue sections. J Med Imag Heal Inform 2(3):307–312
24. Panicker RO, Kalmady KS, Rajan J, Sabu MK (2018) Automatic detection of tuberculosis bacilli from microscopic sputum smear images using deep learning methods. Biocybern Biomed Eng 38(3):691–699
25. Sadaphal P, Rao J, Comstock GW, Beg MF (2008) Image processing techniques for identifying Mycobacterium tuberculosis in Ziehl-Neelsen stains. Int J Tuberc Lung Dis 12(5):579–582
26. Zhai Y, Liu Y, Zhou D, Liu S (2010, December) Automatic identification of mycobacterium tuberculosis from ZN-stained sputum smear: algorithm and system design. In: 2010 IEEE international conference on robotics and biomimetics. IEEE, pp 41–46
27. Janušauskas A, Jurkonis R, Lukoševičius A, Kurapkienė S, Paunksnis A (2005) The empirical mode decomposition and the discrete wavelet transform for detection of human cataract in ultrasound signals. Informatica 16(4):541–556
28. Shen X, Li Q, Tian Y, Shen L (2015) An uneven illumination correction algorithm for optical remote sensing images covered with thin clouds. Remote Sens 7(9):11848–11862
29. Khattak SS, Saman G, Khan I, Salam A (2015) Maximum entropy based image segmentation of human skin lesion. World Acad Sci Eng Technol Int J Comp Elect Autom Cont Info Eng 9(5):1094–1098
30. Jia H, Peng X, Song W, Oliva D, Lang C, Yao L (2019) Masi entropy for satellite color image segmentation using tournament-based lévy multiverse optimization algorithm. Remote Sens 11(8):942
31. Qi C (2014) Maximum entropy for image segmentation based on an adaptive particle swarm optimization. Appl Math Info Sci 8(6):3129
32. Bandyopadhyay O, Chanda B, Bhattacharya BB (2011, June) Entropy-based automatic segmentation of bones in digital X-ray images. In: International conference on pattern recognition and machine intelligence. Springer, Berlin/Heidelberg, pp 122–129
33. Feng D, Wenkang S, Liangzhou C, Yong D, Zhenfu Z (2005) Infrared image segmentation with 2-D maximum entropy method based on particle swarm optimization (PSO). Pattern Recogn Lett 26(5):597–603
34. Sezgin M, Sankur B (2004) Survey over image thresholding techniques and quantitative performance evaluation. J Elect Imag 13(1):146–166
35. Yan C, Sang N, Zhang T (2003) Local entropy-based transition region extraction and thresholding. Pattern Recogn Lett 24(16):2935–2941
36. Nyma A, Kang M, Kwon YK, Kim CH, Kim JM (2012) A hybrid technique for medical image segmentation. BioMed Res Int 2012
37. Fouda YM (2014) One-dimensional vector based pattern matching. arXiv preprint arXiv:1409.3024
38. Giachetti A (2000) Matching techniques to compute image motion. Image Vis Comput 18(3):247–260
39. Ledig C, Rueckert D (2016) Semantic parsing of brain MR images. In: Medical image recognition, segmentation and parsing. Academic Press, pp 307–335
40. Bhat M, Kapoor P, Raina BL (2012) Application of SAD algorithm in image processing for motion detection and simulink blocksets for object tracking. Int J Eng Sci Adv Technol 2(3):731–736
41. Akar E, Kara S, Akdemir H, Kırış A (2017) Fractal analysis of MR images in patients with Chiari malformation: the importance of preprocessing. Biomed Sign Process Cont 31:63–70
42. Qin X, Liu S, Wu Z, Jun H (2008, May) Medical image enhancement method based on 2D empirical mode decomposition. In: 2008 2nd international conference on bioinformatics and biomedical engineering. IEEE, pp 2533–2536
43. Merjulah R, Chandra J (2019) Classification of myocardial ischemia in delayed contrast enhancement using machine learning. In: Intelligent data analysis for biomedical applications. Academic Press, pp 209–235
44. Sengur A, Budak U, Akbulut Y, Karabatak M, Tanyildizi E (2019) A survey on neutrosophic medical image segmentation. In: Neutrosophic set in medical image analysis. Academic Press, pp 145–165
45. Caselles V, Kimmel R, Sapiro G (2005) Geometric active contours for image segmentation. In: Handbook of image and video processing. Academic Press, pp 613–627
46. Jiang C, Yang W, Guo Y, Wu F, Tang Y (2018) Nonlocal means two dimensional histogram-based image segmentation via minimizing relative entropy. Entropy 20(11):827
47. Zhou M, Hong X, Tian Z, Dong H, Wang M, Xu K (2014) Maximum entropy threshold segmentation for target matching using speeded-up robust features. J Elect Comp Eng 2014:24
48. Malathi M, Sinthia P, Jalaldeen K (2019) Active contour based segmentation and classification for pleura diseases based on Otsu's thresholding and support vector machine (SVM). Asian Pacif J Cancer Prevent APJCP 20(1):167
49. Yang AY, Wright J, Ma Y, Sastry SS (2008) Unsupervised segmentation of natural images via lossy data compression. Comput Vis Image Underst 110(2):212–225
50. Golkar E, Rahni AAA, Sulaiman R (2014, December) Comparison of image registration similarity measures for an abdominal organ segmentation framework. In: 2014 IEEE Conference on Biomedical Engineering and Sciences (IECBES). IEEE, pp 442–445