Computational Intelligence for Machine Learning and Healthcare Informatics
Intelligent Biomedical Data Analysis (IBDA)
Edited by Deepak Gupta, Nhu Gia Nguyen, Ashish Khanna, Siddhartha Bhattacharyya
Volume 1
Computational Intelligence for Machine Learning and Healthcare Informatics Edited by Rajshree Srivastava, Pradeep Kumar Mallick, Siddharth Swarup Rautaray and Manjusha Pandey
Editors Rajshree Srivastava DIT University Diversion Road Mussoorie, 248009 Uttarakhand, India [email protected]
Siddharth Swarup Rautaray KIIT University KIIT Road Bhubaneswar, 751024 Odisha, India [email protected]
Pradeep Kumar Mallick KIIT University KIIT Road Bhubaneswar, 751024 Odisha, India [email protected]
Manjusha Pandey KIIT University KIIT Road Bhubaneswar, 751024 Odisha, India [email protected]
ISBN 978-3-11-064782-2 e-ISBN (PDF) 978-3-11-064819-5 e-ISBN (EPUB) 978-3-11-064927-7 ISSN 2629-7140 Library of Congress Control Number: 2020934485 Bibliographic information published by the Deutsche Nationalbibliothek The Deutsche Nationalbibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data are available on the Internet at http://dnb.dnb.de. © 2020 Walter de Gruyter GmbH, Berlin/Boston Cover image: gettyimages/thinkstockphotos, Abalone Shell Typesetting: Integra Software Services Pvt. Ltd. Printing and binding: CPI books GmbH, Leck www.degruyter.com
Rajshree Srivastava would like to dedicate this book to her father Dr. R. L. Srivastava, mother Smt. Nandini Srivastava, and brother Sarvesh Lal Srivastava. Pradeep Kumar Mallick would like to dedicate this book to his parents and students. Siddharth Swarup Rautaray would like to dedicate this book to his parents. Manjusha Pandey would like to dedicate this book to her parents.
Preface The amount of data collected today from various applications all over the world, across a wide variety of fields, is expected to double every two years. These data have no utility unless they are analyzed to extract useful information. The development of powerful computers is a boon for implementing these techniques, leading to automated systems. The transformation of data into knowledge is by no means an easy task for high-performance, large-scale data processing. Moreover, these data may involve uncertainty in many different forms. Many models, such as fuzzy sets, rough sets, soft sets, neural networks, their generalizations, and hybrid models obtained by combining two or more of these models, have been found to be fruitful in representing data. These models are also very fruitful for analysis. More often than not, the data used for machine learning (ML) are reduced to include only the characteristics that are important from a particular study's point of view or for the application area, so reduction techniques have been developed. Often the data collected have missing values, and tuples with missing values are usually eliminated from the dataset before analysis. More importantly, these new challenges may compromise, and sometimes even deteriorate, the performance, efficiency, and scalability of dedicated data-intensive computing systems. This brings up many research issues for the industry and research community in the form of capturing and accessing data effectively. In addition, fast processing while achieving high performance and high throughput, together with efficient storage for future use, is another issue. Further, programming for ML is an important and challenging issue. Expressing the data access requirements of applications and designing programming language abstractions to exploit parallelism are immediate needs. While these are only a few examples of the issues, our intention in this book is to offer the concepts of computational intelligence for ML and healthcare informatics in a precise and clear manner to the research community, that is, the conceptual basis required to achieve in-depth knowledge in the field of computer science and information technology. The application of ML in healthcare is an emerging research issue related to the digital transformation of the healthcare paradigm. Today's world is wired differently from that of the past few years: almost everything can be made ubiquitous. Extreme connectivity enables more universal, global, and close-to-instant communication. New technological models based on these emerging topics are mainly focused on collecting and connecting health-related data from all available sources, extracting meaningful information from those data, and providing that information to other players. Extreme automation, on the other hand, can be coupled with extreme connectivity or even with extreme environments, allowing computing systems to control and manage physical processes and respond in ever more "human" ways in a highly connected normal environment or a challenging extreme environment. The book will help researchers interested in this field gain insight into different concepts and their importance for applications in real life. This has been done to make the book more flexible and to stimulate further interest in its topics. https://doi.org/10.1515/9783110648195-202
All these motivated us toward computational intelligence for ML and healthcare informatics. This book is organized into 15 chapters. Chapter 1 is a review of bone tissue engineering for the application of AI (artificial intelligence) in cellular adhesion prediction. The chapter shows that the most used AI tools were artificial neural networks of different types, followed by cellular automata and multiagent systems. The intended use varies, but it is mainly related to understanding the variables involved and adjusting a model that provides insight and allows for a better and more informed design process of the scaffold. Health informatics primarily deals with the methodologies that help to acquire, store, and use information in health and medicine. Chapter 2 deals with the various types of ML techniques, approaches, challenges, and their future scope in healthcare informatics. Further, these techniques can also be used to build models for quick and precise healthcare discovery. Chapter 3 reports a new method that combines the Stockwell transform (morphological features) with heart rate variability (dynamic features). The final feature vector is applied as input to artificial bee colony-optimized twin support vector machines for the automated recognition of heartbeats in 16 classes. The developed method can be utilized to monitor long-term heartbeat recordings and analyze the nonstationary behavior of heartbeats. The method is validated on the Physionet data and evaluated under a patient-specific scheme, achieving an improved accuracy of 88.22%. Chapter 4 discusses the stringing together of three major techniques, namely automatic speech recognition, automated machine translation, and conversion of text into spoken utterance (text to speech), for seamless communication in healthcare services. Besides this, the technological developments and implementation challenges at each step are identified and briefly discussed. The performance of the resulting speech-to-speech (S2S) system is evaluated in the healthcare domain. Chapter 5 focuses on recent advancements of ML and deep learning in the field of healthcare systems. Chapter 6 explains the classification techniques in ML (supervised, unsupervised, and semisupervised) and how effectively these techniques are used in psychological disorder prediction. The accuracy levels of each technique are reported. The recent advancements in ML techniques for predicting disorders are explained, and the challenges faced while predicting psychological disorders are also studied and explained in this chapter. Chapter 7 focuses on the automatic analysis of cardiovascular diseases using empirical mode decomposition (EMD) and support vector machines. This chapter employs the Hilbert–Huang transform for feature selection (FS), which is demonstrated to be an effective technique capable of providing a frequency spectrum that varies with time. The output coefficients are used to extract different features, such as weighted mean frequency, skewness, central moment, and many more, computed from the intrinsic mode functions obtained by the EMD calculation. The validation of the proposed methodology
is performed on the Physionet data to identify six categories of heartbeats. The methodology reports a higher accuracy of 99.18% in comparison with previously reported methodologies. It can be utilized as a solution for the computerized diagnosis of cardiac diseases to serve cardiac healthcare. Chapter 8 discusses different architectures of near-data processing (NDP) that are compatible with ML methods and the ways in which these architectures can be applied. Chapter 9 discusses the classification of various image fusion algorithms and their performance evaluation metrics. The authors also survey the various types of images (single sensor and multimodal) that can be fused together and their corresponding image fusion methods. As argued in this chapter, various application areas of image fusion are explored. The strengths and weaknesses of different image fusion methods are discussed with examples. Image fusion results may be verified using quantitative and qualitative metric parameters, and various quantitative performance metrics are also discussed in this chapter. Chapter 10 provides a literature study of the health recommender systems (HRS) domain in general, which includes the literature, innovations, purpose, and methods of HRS, along with the new concept of HRS being used for medication purposes. Chapter 11 describes two case studies on a dense convolutional neural network approach for medical diagnosis. From the study, it is found that the approach performs better than all the state-of-the-art approaches in this field, giving higher accuracy. In the future, the authors aim to apply this approach to various types of medical images; if the desired result is achieved, it can change our way of approaching medical reports. From time to time, surveys and data have clearly revealed the hesitation and reluctance of patients to undergo robotic-assisted surgeries. While many factors determine the automation of human effort in industry, the authors analyze a few specific ones common to this field. The analysis further shows that robots do provide various advantages in industry. Today, robotic surgery is a major topic of research. Chapter 12 shows how sentiment analysis tools can improve patients' lives in critical diseases. The study shows the various types of tools used in each case and the different media sources, and examines their impact and the improvement in diseases such as obesity, diabetes, cardiovascular disease, hypertension, schizophrenia, Alzheimer's disease, and cancer using sentiment analysis, as well as its impact on one's life. Sentiment analysis helps in designing strategies to improve the understanding and behavior of patients. Chapter 13 provides a multilevel image thresholding approach for object segmentation. The suggested algorithm is evaluated on standard image sets using FA, DE, and particle swarm optimization, and the results are compared with Shannon and fuzzy entropy approaches. The suggested approach shows better efficiency than state-of-the-art approaches in terms of the objective function, structural similarity index, PSNR, and standard deviation. ML is an application of AI that deals with the capability of a computer to learn from the given data in order to gain knowledge for making predictions and decisions based on its experience. Chapter 14 shows how ML can be used in healthcare.
Chapter 15 is a recent survey of evolutionary computation (EC)-based FS techniques, whose objective is mainly to improve the accuracy of ML algorithms with minimized computation time. The idea is to bring forth the main strengths of EC as a nature-inspired optimization technique for FS in the ML process. The progressive advancement in the modeling of biological and natural intelligence over the recent decade motivates the authors to review state-of-the-art FS techniques and add to the area of computational intelligence. This book is intended to be used as a reference for undergraduate and postgraduate students in the disciplines of computer science, electronics and telecommunication, information security, and electrical engineering. April 2020
Rajshree Srivastava Pradeep Kumar Mallick Siddharth Swarup Rautaray Manjusha Pandey
Contents
Preface VII
List of contributors XIII
María de Lourdes Sánchez, Adrián Will, Andrea Rodríguez, Luis O. Gónzalez-Salcedo
1 A review of bone tissue engineering for the application of artificial intelligence in cellular adhesion prediction 1
Divya Gaba, Nitin Mittal
2 Implementation and classification of machine learning algorithms in healthcare informatics: approaches, challenges, and future scope 21
Sandeep Raj, Arvind Choubey
3 Cardiac arrhythmia recognition using Stockwell transform and ABC-optimized twin SVM 35
Shweta Sinha, Shweta Bansal
4 Computational intelligence approach to address the language barrier in healthcare 53
Yogesh Kumar, Manish Mahajan
5 Recent advancement of machine learning and deep learning in the field of healthcare system 77
Prabhsimar Kaur, Vishal Bharti, Srabanti Maji
6 Predicting psychological disorders using machine learning 99
Sweta Kumari, Sneha Kumari
7 Automatic analysis of cardiovascular diseases using EMD and support vector machines 131
J Naveenkumar, Dhanashri P. Joshi
8 Machine learning approach for exploring computational intelligence 153
Simrandeep Singh, Nitin Mittal, Harbinder Singh
9 Classification of various image fusion algorithms and their performance evaluation metrics 179
Dhrubasish Sarkar, Medha Gupta, Premananda Jana, Dipak K. Kole
10 Recommender system in healthcare: an overview 199
Purva Ekatpure, Shivam
11 Dense CNN approach for medical diagnosis 217
Dhaval Bhoi, Amit Thakkar
12 Impact of sentiment analysis tools to improve patients' life in critical diseases 239
Diksha Thakur, Nitin Mittal, Simrandeep Singh, Rajshree Srivastva
13 A fuzzy entropy-based multilevel image thresholding using neural network optimization algorithm 253
Daiyaan Ahmed Shaik, Vihal Mohanty, Ramani Selvanambi
14 Machine learning in healthcare 277
Vanaja Ramaswamy, Saswati Mukherjee
15 Computational health informatics using evolutionary-based feature selection 309
Index 329
List of contributors Shivam Pune Institute of Computer Technology Indian Institute of Information Technology Allahabad, Pune, India [email protected] Shweta Bansal KIIT College of Engineering, Gurugram, India [email protected] Vishal Bharti Department of Computer Science and Engineering, DIT University, Dehradun, India [email protected] Dhaval Bhoi U & P U Patel Department of Computer Engineering, Chandubhai S. Patel Institute of Technology, Charotar University of Science and Technology, Gujarat, India [email protected] Arvind Choubey Department of Electronics and Communication Engineering, Indian Institute of Information Technology Bhagalpur Bhagalpur, India [email protected] Joshi P. Dhanashri Department of Computer Engineering, Bharati Vidyapeeth (Deemed to Be University) College of Engineering, Pune, India [email protected]
Luis O. Gónzalez-Salcedo Materials Catalysis and Environmental Research Group, Faculty of Engineering and Administration, National University of Colombia – Palmira Headquarters Palmira, Colombia [email protected] Medha Gupta Amity Institute of Information Technology Amity University, Kolkata, India [email protected] Premananda Jana Netaji Subhas Open University Kalyani, West Bengal, India [email protected] Prabhsimar Kaur Department of Computer Science and Engineering, DIT University, Dehradun, India [email protected] Dipak K. Kole Department of CSE, Jalpaiguri Government Engineering College, Jalpaiguri West Bengal, India [email protected] Sneha Kumari Department of Electrical Engineering, Indian Institute of Technology Patna, Patna, India [email protected]
Purva Ekatpure Pune Institute of Computer Technology, Indian Institute of Information Technology Allahabad, Pune, India [email protected]
Sweta Kumari Department of Computer Science and Engineering, Swami Devi Dayal Institute of Engineering and Technology Kurukshetra, India [email protected]
Divya Gaba Electronics and Communication Engineering Chandigarh University, Punjab, India [email protected]
Yogesh Kumar Chandigarh Engineering College, Landran Mohali, India [email protected]
https://doi.org/10.1515/9783110648195-204
Manish Mahajan Chandigarh Engineering College, Landran Mohali, India [email protected] Srabanti Maji Department of Computer Science and Engineering, DIT University, Dehradun, India [email protected] Nitin Mittal Electronics and Communication Engineering Chandigarh University, Punjab, India [email protected] Vihal Mohanty School of Computer Science and Engineering Vellore Institute of Technology, Vellore Tamil Nadu, India [email protected] Saswati Mukherjee Department of Information Science and Technology, College of Engineering Guindy Anna University, Chennai, Tamil Nadu, India [email protected] Jayakumar Naveenkumar Department of Computer Engineering Bharati Vidyapeeth (Deemed to Be University) College of Engineering, Pune, India [email protected] Sandeep Raj Department of Electronics and Communication Engineering, Indian Institute of Information Technology Bhagalpur Bhagalpur, India [email protected] Vanaja Ramaswamy Department of Information Science and Technology, College of Engineering Guindy Anna University, Chennai, Tamil Nadu, India [email protected]
Andrea Rodríguez Media and Interfaces Laboratory (LAMEIN) Faculty of Sciences and Technology, National University of Tucumán, San Miguel de Tucumán, Argentina [email protected] María de Lourdes Sánchez Advanced Informatics Technology Research Group (GITIA), National Technological University, Tucumán Regional Faculty San Miguel de Tucumán, Argentina [email protected] Dhrubasish Sarkar Amity Institute of Information Technology Amity University, Kolkata, India [email protected] Ramani Selvanambi School of Computer Science and Engineering Vellore Institute of Technology, Vellore Tamil Nadu, India [email protected] Harbinder Singh Electronics and Communication Engineering CEC Landran, Punjab, India [email protected] Simrandeep Singh Electronics and Communication Engineering Chandigarh University, Punjab, India [email protected] Shweta Sinha Amity University Haryana, India [email protected] Rajshree Srivastva Department of Computer Science Engineering, DIT University, Dehradun, India [email protected]
Amit Thakkar Smt. Kundanben Dinsha Patel Department of Information Technology, Chandubhai S. Patel Institute of Technology, Charotar University of Science and Technology, Gujarat, India [email protected] Diksha Thakur Department of Electronics and Communication Engineering, Chandigarh University, Mohali, Punjab, India [email protected]
Daiyaan Ahmed Shaik School of Computer Science and Engineering, Vellore Institute of Technology, Vellore Tamil Nadu, India [email protected] Adrián Will Advanced Informatics Technology Research Group (GITIA), National Technological University, Tucumán Regional Faculty San Miguel de Tucumán, Argentina [email protected]
María de Lourdes Sánchez, Adrián Will, Andrea Rodríguez, Luis O. Gónzalez-Salcedo
1 A review of bone tissue engineering for the application of artificial intelligence in cellular adhesion prediction Abstract: Artificial intelligence (AI) is changing, at a fast pace, all aspects of science, technology, and society in general, giving rise to what is known as the 4th Industrial Revolution. In this chapter, we review the literature regarding AI applications to bone tissue engineering and, more particularly, to cell adhesion in bone scaffolds. The works found are very few (only six works), and we classify them according to the AI technique used. The question we want to address in this chapter is what AI techniques were used and what exactly they have been used for. The chapter shows that the most used AI tools were artificial neural networks, of different types, followed by cellular automata and multiagent systems. The intended use varies, but it is mainly related to understanding the variables involved and adjusting a model that provides insight and allows for a better and more informed design process of the scaffold. Keywords: bone tissue engineering, artificial intelligence, stem cells, scaffolds, cell adhesion
1.1 Introduction Regenerative medicine is a multidisciplinary specialty, which seeks the maintenance, improvement, or restoration of the function of cells, tissues, and organs. It is based on four pillars: cell therapy, organ transplantation, biomedical engineering, and, finally, tissue engineering (Rodríguez et al., 2013). Bone tissue engineering (BTE) is a constitutive part of regenerative medicine. Its main objective is to repair both the shape and function of the damaged bone.
María de Lourdes Sánchez, Adrián Will, Advanced Informatics Technology Research Group (GITIA), National Technological University, Tucumán Regional Faculty, San Miguel de Tucumán, Argentina Andrea Rodríguez, Advanced Informatics Technology Research Group (GITIA), National Technological University, Tucumán Regional Faculty, San Miguel de Tucumán, Argentina Luis O. Gónzalez-Salcedo, Materials Catalysis and Environmental Research Group, Faculty of Engineering and Administration, National University of Colombia – Palmira Headquarters, Palmira, Colombia https://doi.org/10.1515/9783110648195-001
The size of the bone defect constitutes a critical factor because large defects do not regenerate spontaneously and require surgical intervention. In this regard, at present, many people are affected by bone or joint problems. In elderly people, these conditions represent almost 50% of the chronic diseases that can develop, causing pain and physical disability and, in some cases, requiring surgery, bone grafts, or implants (Moreno et al., 2016). In addition, one of the problems of regenerative medicine is that many organ transplants are required, but donors are very few. This leads to an important cooperation with tissue engineering, which offers a promising strategy for bone reconstruction and the development of bioengineered structures as possible solutions (Roseti et al., 2017). According to Moreno et al. (2016) and Roseti et al. (2017), blood is the most transplanted tissue, followed by bone. Although the therapeutic solutions described have been used successfully in the clinical environment for more than a decade, some complications can take place, because infections can occur after the implant is placed in the body (Gaviria Arias et al., 2018). Tissue engineering plays a major role in overcoming these limitations, becoming a favorable area for repairing bone lesions using porous three-dimensional matrices seeded with growth factors and mesenchymal stem cells (MSC). These matrices are built using different technologies and are known as scaffolds. Once constructed and implanted, the MSCs or other types of cells (e.g., pre-osteoblasts) are seeded on the surface of the scaffold, and the natural process of human tissue regeneration is stimulated and helped by the growth factors in order to produce new bone (Moreno et al., 2016; Suárez Vega et al., 2017). Tissue engineering takes advantage of the natural ability of the body to regenerate, using engineering and biology to replace or repair damaged tissues (Moreno et al., 2016; Granados et al., 2017). Therefore, we can say that tissue engineering dramatically increases the capabilities of regenerative medicine. Furthermore, if tissue engineering is combined with cell therapy, the capabilities are even higher. For example, embryonic therapeutic cells or living stem cells can be used alone or in association with scaffolds of biomaterials (Moreno et al., 2016). In this regard, Roseti et al. (2017) mention that, alternatively, different types of cells can be used or combined with scaffolds in vivo, promoting osteogenic differentiation or releasing the necessary soluble molecules. To achieve bone regeneration, knowledge of cells, three-dimensional scaffolds, and growth factors or signaling molecules is required (Gordeladze et al., 2017). This leads to a series of important questions: (1) the type of cells, biological products, biomaterials, and internal microarchitecture of the scaffold to be used; (2) the selection of optimal physiological and therapeutic doses; (3) the temporal and/or spatial distribution of the mentioned criteria for tissue reconstruction; (4) its dynamics and kinetics; (5) the application related to the visualization of customized and performance-related design specifications; and (6) the manipulation of the pathways involved in the
requirement of sophisticated tissue engineering therapies (Gordeladze et al., 2017). Answering these questions will lead us to the unequivocal identification of the fundamental factors considered necessary to complete the successful regeneration of the tissue. In addition, for a better understanding of the interaction of cells, scaffolds, and growth factors, the availability of bioinformatics systems is extremely important, because these systems can study what happens with the different variables and thus propose a simulation and/or prediction model, as mentioned in Gordeladze et al. (2017). These concerns were raised years ago (Estrada et al., 2006), defining the need for sophisticated experimental analysis tools, the inclusion of more realistic in vitro models, and better forms of noninvasive in vivo image acquisition, which leads to the development of computational models capable of processing a large amount of information. In this regard, Narváez Tovar et al. (2011) review the different computational models of bone differentiation and adaptation, describing how the models have increased in complexity when moving from two-dimensional to three-dimensional representations, have included new factors or variables as experimental research developed, and have gone from mechanistic considerations to models that consider biological aspects of the bone adaptation process. However, for the same authors, the mathematical relationships that support these models represent only a small part of all the mechanisms involved in the problem. Scaffolds must be biomimetic and functional. That is, their internal microarchitecture must mimic the natural microenvironment to which cells are accustomed in order to achieve the cellular responses necessary to form the new tissue. Although there are several methodologies for scaffold manufacturing, many of these methods produce deficient scaffolds, which fail to promote three-dimensional healing and the formation of a blood vessel network within the scaffold, as expressed by Eltom et al. (2019). This leads to the need to predict the result of cell adhesion in the rehabilitation process, which requires computational tools for this purpose. However, the different factors to consider in the modeling process create a complexity that is generally better solved in the field of artificial intelligence (AI). AI corresponds to the mathematical and computational techniques that give artifacts the capacity to exhibit intelligent behavior, the main areas being artificial neural networks (ANN), evolutionary programming, fuzzy logic, data mining, machine learning, expert systems, artificial life, and swarm intelligence, among others. Various AI applications have been used in regenerative medicine and tissue engineering, as expressed by Biswal et al. (2013). However, most of them have focused on other types of scaffolds (Ramírez López et al., 2019). We present in this chapter a review focused on applications of AI to BTE. We will restrict ourselves to works from the last 10 years, focused on bone scaffolds
and cell adhesion (as the first process that must occur toward tissue formation), in order to answer one research question: What are the AI techniques used and what exactly have they been used for? This is because, for scaffolds in tissue engineering, there exist dozens of different materials and several construction techniques, but the complexity and high price of the process to build, seed, and measure bone scaffolds mean that very little data are available. Therefore, the techniques used, the problems they attack, and the variables present must be analyzed in order to design future AI applications. The rest of the chapter is structured as follows: in Section 1.2, we present a concise review of what BTE is; in Section 1.3, we emphasize the importance of cell adhesion as a critical process toward tissue formation; Section 1.4 constitutes our main section, in which we analyze different AI developments applied to BTE and, especially, cell adhesion; finally, in Section 1.5, we give our conclusions on the subject.
1.2 Bone tissue engineering and its key factors Figure 1.1 shows the essential basic parts for tissue engineering: cells, scaffolds, and growth factors, as we have mentioned. The cells can be classified according to the origin and according to the degree of differentiation. According to the origin they can be (1) the patient’s cells, called autologous; (2) cells of another organism of the same species, known as allogenic; and (3) cells of an organism of a different species, called xenogenic. According to their degree of differentiation, they are classified into embryogenic cells and somatic or adult cells (Moreno et al., 2016; Suárez Vega et al., 2017).
Figure 1.1: The fundamental pieces on which tissue engineering is based (cells, scaffolds, and growth factors). Adapted from Moreno et al. (2016).
The scaffolds must be structures with adequate mechanical properties; in particular, they must present a high degree of porosity, that is, a large proportion of void spaces within their volume, and provide the adequate physical environment for the cells to adhere, migrate, proliferate, and finally reach a cellular conformation similar to that of living tissue. Finally, growth factors are mostly proteins that help and stimulate cells in these functions (Moreno et al., 2016). Roseti et al. (2017) mention that the scaffolds with the best performance for tissue engineering would be those designed to improve cell adhesion, migration and proliferation, osteogenic lineage differentiation, angiogenesis, host integrity, and load. This means that the scaffold must respect biological and physical guidelines (Altamirano Valencia et al., 2016). In this regard, Fernández (2011) mentioned that the requirements are (1) biocompatibility, (2) porosity and pore size, (3) biodegradability, (4) mechanical properties, (5) osteoinduction, and (6) feasibility of manufacturing (Altamirano Valencia et al., 2016). Moreno et al. (2016) extend these characteristics to (1) biocompatibility, (2) controlled and adjusted biodegradability, (3) porous structure represented in the pore size, (4) mechanical properties such as stiffness, resistance, and resistance to live tension, (5) osteoconductivity and osteoinduction, and (6) anisotropic structure for the correct adaptation to anatomical forms. Roseti et al. (2017) show in the graphic representation of Figure 1.2 that the characteristics of the scaffolds for BTE can be addressed based on four classifications: (1) biological requirements, (2) structural characteristics, (3) composition of biomaterials, and (4) manufacturing technologies. The lower part of Figure 1.2 indicates the primary purposes of bone regeneration through suitable scaffolds: to allow cell grafting and restore physiological structure and functions. Note that the first stage is cell adhesion. On the other hand, the potential for surgery lies in the technological advances that are necessary to implement it (Tarassoli, 2019). The same author mentions that, in a futuristic vision of the next 50 years, four key technologies will emerge from technological advancement: (1) AI, (2) robotics, (3) genomics, and (4) regenerative medicine. In the field of AI (from which robotics is derived) and in one of the disciplines of regenerative medicine, tissue engineering, and in particular bone scaffolds, these technological advances are expected to provide the medicine–patient relationship with a capacity for precise planning and technical management, data analysis in both the presurgical and diagnostic stages, and the production of "custom organs" to replace damaged and/or unhealthy tissues through the use of three-dimensional bioprinting (scaffolds). However, one of the main obstacles is not so much the availability of biomaterials but their effective implementation. Despite all the research efforts underway around the world, there are still challenges in getting the cells to behave as desired.
Figure 1.2: Properties that must be present in a scaffold for bone tissue engineering (biological requirements, structural features, composition, and manufacturing technologies, together with the aims of cell attachment, host integration, cell viability, load bearing, cell homing, osteogenic differentiation, cell proliferation, and vascular ingrowth). Adapted from Roseti et al. (2017).
1.3 Cell adhesion: definition and importance Cellular adhesion is a natural behavior of cells. Cells can adhere to each other through direct intercellular junctions or to extracellular materials that they secrete themselves. In some way, cells must coalesce to form an organized multicellular structure (Alberts, 2008). When these junctions between cells are established, communication paths are generated. The resistance, shape, and arrangement of the different types of cells in an organism are controlled by these cohesion mechanisms. This, in turn, allows cells to exchange signals, which coordinate their behavior and regulate their gene expression patterns. Adhesion to other cells and to the extracellular matrix controls the orientation of the internal structure of each cell. The formation and destruction of these junctions, as well as the remodeling of the extracellular matrix (ECM), control the way that cells move within the organism, guiding them while the organism
grows, develops, and repairs itself. A wide variety of diseases can occur if this cohesion apparatus is affected (Alberts, 2008). Cells are small, deformable, and often mobile elements that contain an aqueous medium and are bounded by a thin plasma membrane; nevertheless, millions of them can combine to form a very massive, stable, and robust structure such as an animal, a tree, or a human being. This is due to the ability of cells to stay together (Alberts, 2008). In mammals, cell adhesion is vital because it regulates proliferation, differentiation, and phenotypic behavior (Fisher et al., 2007); the same process must occur when cells are seeded in a scaffold. If cell adhesion is poor or deficient, different pathologies can occur, causing disorders or physiological problems (Vivas et al., 2015). We could say that cell attachment is the gateway to tissue regeneration. Analyzing all this, the questions that arise are: Which is the ideal scaffold? And how do we measure its efficacy? Fisher et al. (2007) establish that the efficacy of a biomaterial lies in its ability to regulate cell adhesion and in the fact that its internal microarchitecture supports the other cellular behaviors. A literature review shows that the variables that influence the success of the cell adhesion process include the following: (1) surface energy of the material (Gentleman and Gentleman, 2014; Nakamura et al., 2016), (2) wettability of the material, measured through the contact angle (Arima and Iwata, 2007; Tamada and Ikada, 1993; Lampin et al., 1997; Harbers and Grainger, 2011), (3) rigidity of the material (Phipps et al., 2011), (4) porosity (Danilevicius et al., 2015), (5) characteristics and diameter of the fiber, in the case of fibrillar scaffolds (Chen et al., 2009), (6) relationship between the surface area of the scaffold and its volume, or specific surface area (Chen et al., 2009), (7) roughness of the scaffold surface (Lampin et al., 1997), and (8) pore size (Murphy et al., 2010; O'Brien et al., 2005). Other variables are also mentioned in the literature, which under certain conditions must be taken into account, such as tortuosity (Chang and Wang, 2011); interconnectivity (Chang and Wang, 2011); surface charge (Chang and Wang, 2011; Lee et al., 2005); and permeability, pore shape, and heterogeneity (Boccaccio et al., 2016).
1.4 Analysis of literature: what are the AI techniques used and what exactly have they been used for? AI is the ability of machines to perform tasks that usually require human intelligence, such as speech recognition, visual perception, and language translation (Dibyajyoti et al., 2013). We restrict ourselves to the following techniques:
– Agent-based and multiagent systems
– Cellular automata
– Evolutionary algorithms
– ANN of different types
– Fuzzy logic
where evolutionary algorithms include particle swarm optimization, genetic algorithms, and artificial immune systems. As for the review, we will restrict ourselves to the following conditions:
– Works published after 01/01/2009 (last 10 years)
– Including one of the AI techniques mentioned above
– Focused on bone scaffolds
– Studying cell adhesion as an important part of the chapter
In this section, we will present the results found, applying these criteria to our search. There are very few works fulfilling all conditions set, perhaps because most interesting AI applications require expensive experiments in order to adjust or train the models. No works fulfilling the restrictions were found using fuzzy logic, genetic algorithms, immune systems, or other evolutionary algorithm techniques other than what is described below. In the case of multiagent systems, no work fulfilling all conditions was found, so the closest work was included (BTE on scaffolds, focused on or contemplating cell adhesion as an important part of their work, and applying one or more of the AI methods mentioned).
1.4.1 Artificial neural networks ANNs are, in their most basic definition, computational networks that simulate the decision process of networks of neurons in the biological human central nervous system (Graupe, 2013). There has been tremendous progress in different aspects and types of neural networks, particularly deep neural networks. The latter can perform complicated tasks, such as speech recognition or language translation, effectively. Nevertheless, they require a considerable amount of data in order to be effectively trained. They have been successfully applied to medical imaging, for example, by a huge and fast-growing Israeli company dedicated to AI analysis of medical images (http://www.zebra-tech.com); in the case of scaffolds, however, data are scarce. Therefore, deep learning techniques are complicated to apply at present.
1.4.2 Aggregate artificial neural networks and particle swarm optimization As for traditional ANNs, we found a study in which scaffolds manufactured with powder bed and inkjet head 3D printing were considered (Asadi-Eydivand et al., 2016). For these scaffolds, they proposed an aggregated artificial neural network (AANN) to investigate the effects of print orientation, layer thickness, and delay time between each layer. The output variables studied in this work were the percentage of open porosity and the compressive strength of the resulting scaffolds. As a first step, the authors applied two optimization methods to obtain the optimal parameter settings for printing scaffolds. First, particle swarm optimization (PSO) was implemented to obtain the topology for the AANN. Then, Pareto front optimization (PFO) was used to determine the parameters for a scaffold with the required porosity and compressive strength. The study starts from the premise that a porous scaffold is required in BTE, as we have already mentioned throughout this work. This condition will act as a guide for cell adhesion, proliferation, differentiation, and eventual tissue growth. We emphasize cell adhesion here because, as we explained previously, the other phenomena cannot occur if the cells have not previously adhered to the scaffold. Scaffold permeability constitutes another critical factor for the ability to build living tissue. Permeability refers to the capacity of a material, owing to its physical characteristics, to be crossed by some type of fluid without altering its structure. In addition, nutrient diffusion, waste disposal from the regeneration site, and an appropriate mechanical environment must be guaranteed for successful BTE implementations. A balance must be achieved between excellent permeability and the mechanical properties of the scaffold: interconnected pores will allow permeability and cell growth but decrease compressive strength, whereas closed pores only reduce strength. The variables used were layer thickness, orientation, and delay time in the printing as input variables, and open porosity and compressive strength as output variables. They constructed 48 scaffolds, in the shape of 6 by 12 mm cylinders with pores of predefined size, covering three orientations (x, y, z), four values of layer thickness (127, 114, 102, and 89 µm), and four delay times (500, 300, 100, and 50 ms). They use these data and a combination of PSO and a backpropagation method to find an optimal topology for several individual single-layer feedforward neural networks. The network produced is, in fact, an ensemble of three single-layer feedforward neural networks, with five, four, and eight neurons in the hidden layers. All three networks receive the same input variables and produce two output variables (compressive strength and open porosity), and the results of the three networks are combined in a final output layer of the ensemble to produce the final result. This output layer is also tuned using a second PSO algorithm. The networks were trained for 300 epochs, and the data division was fixed, 40 training and 8 testing data (83% and 17%). As for the PSO, very few details are included.
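To make the aggregation scheme easier to follow, the sketch below (Python with scikit-learn) is a minimal illustration and not the authors' implementation: it trains three small feedforward sub-networks with the reported hidden-layer sizes on the three print parameters and combines their outputs with a linear stage. The data, the toy relations used to generate them, and the linear combiner are assumptions introduced only for illustration; the PSO topology search itself is omitted.

```python
# Minimal sketch of an aggregated ANN: three small feedforward sub-networks trained on the
# same inputs (layer thickness, orientation, delay time), whose predictions of open porosity
# and compressive strength are combined by a linear output stage. Synthetic placeholder data
# stand in for the 48 printed scaffolds; the PSO search that selected the topology is omitted.
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.uniform([89, 0, 50], [127, 2, 500], size=(48, 3))    # thickness (µm), orientation code, delay (ms)
porosity = 0.5 + 0.003 * X[:, 0] + rng.normal(0, 0.02, 48)   # toy relation, placeholder only
strength = 2.0 - 0.010 * X[:, 0] + rng.normal(0, 0.05, 48)   # toy relation, placeholder only
y = np.c_[porosity, strength]

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=8, random_state=0)

# Sub-networks with the hidden-layer sizes reported in the study (5, 4, and 8 neurons).
subnets = [MLPRegressor(hidden_layer_sizes=(h,), max_iter=300, random_state=0).fit(X_tr, y_tr)
           for h in (5, 4, 8)]

# Aggregation stage: a linear combination of the sub-network outputs
# (PSO-tuned in the paper, plain least squares here).
Z_tr = np.hstack([m.predict(X_tr) for m in subnets])
Z_te = np.hstack([m.predict(X_te) for m in subnets])
combiner = LinearRegression().fit(Z_tr, y_tr)
print("ensemble R^2 on the held-out scaffolds:", combiner.score(Z_te, y_te))
```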
They obtain a training error of R = 0.91 and RMSE = 0.22 and a testing error of R = 0.97 and RMSE = 0.13 for the compressive strength of the scaffolds and, similarly, a training error of R = 0.94 and RMSE = 0.17 and a testing error of R = 0.94 and RMSE = 0.25 for open porosity. These results are combined in a plotted Pareto front in order to detect a proper combination of compressive strength and open porosity, which are clearly conflicting variables (more pores, or more open pores, will clearly decrease the compressive strength of the material). The best result obtained is for orientation X, thickness 117.8 µm, and delay time 137 ms, which produces the highest porosity, at 89%, but only 0.33 MPa compressive strength. The work manages to predict two significant design variables with high accuracy, in this case using practical, easy-to-control input variables. The use of PSO increases the amount of processing but allows for much smaller networks that make better use of the small amount of data available, since the optimization of the geometry tends to discard useless neurons. Nevertheless, more experimental work is needed in order to validate the proper cellular adhesion of the cases studied in the paper. The authors include the complete dataset, which is a significant contribution to the area: the dataset is complete for the work, variables, and shape introduced (cylinders), and compressive strength, which is the main variable, is analyzed.
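As a side note on this last step, the Pareto front used to trade off the two conflicting objectives can be computed in a few lines. The Python sketch below only illustrates that idea under assumed candidate values (the (porosity, strength) pairs are placeholders, apart from the 89%/0.33 MPa point quoted above); it is not the paper's actual predictions or procedure.

```python
# Keep only non-dominated (open porosity, compressive strength) candidates, both maximized:
# a candidate is dominated if another one is at least as good in both objectives and strictly
# better in at least one of them.
def pareto_front(points):
    front = []
    for i, (p1, s1) in enumerate(points):
        dominated = any(p2 >= p1 and s2 >= s1 and (p2 > p1 or s2 > s1)
                        for j, (p2, s2) in enumerate(points) if j != i)
        if not dominated:
            front.append(i)
    return front

# (open porosity fraction, compressive strength in MPa) for hypothetical print settings
candidates = [(0.89, 0.33), (0.75, 0.80), (0.60, 1.10), (0.70, 0.70), (0.55, 0.90)]
print([candidates[i] for i in pareto_front(candidates)])   # the surviving trade-off points
```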
1.4.3 Machine learning: support vector machines and support vector regression Support vector machines (SVMs) represent a data-based method for solving classification tasks using various mathematical and computational tools, including special types of neural networks. Their use in prediction has shown a lower error compared with classifiers based on other methods, including even the neural networks commonly used to estimate variables. In this way, SVMs have become a powerful method to build classifiers that recognize subtle patterns in complex data sets. They have been used for genomics and chemogenomics, among many other successful applications. Support vector regression (SVR) is a modification of SVM specially designed for regression problems. Applied to cell adhesion and tissue engineering scaffolds, we found a work in which the authors used SVMs to investigate the response of adherent cells as a function of substrate dimensionality (2D vs. 3D) and porous microarchitecture (Tourlomousis et al., 2019). The first part of the study consisted of fabricating four different scaffolds with fiber-based structural features, namely:
− SES-1 min: Two-dimensional nonwoven fibrous mesh made by solution electrospinning (SES), random mesh, and time of spinning of 1 min.
− SES-3 min: Two-dimensional nonwoven mesh made by SES, random mesh, and time of spinning of 3 min.
− MEW|0–90°: Three-dimensional woven fibrous mesh with "0–90°" pore microarchitecture made by direct melt electrowriting (MEW) and a precision-stacked substrate.
− MEW|0–45°: Three-dimensional woven fibrous mesh with "0–45°" pore microarchitecture made by direct MEW and a precision-stacked substrate.
Flat glass surfaces were used as control. The work proposes a methodology based on SVM to assess the geometry of the scaffold and cell adhesion across these different substrates and architectures. The authors use seven different input variables on 88 samples to perform three different classification tasks:
– Seven variables as input, classifying into four different classes (type A = glass, type B = SES-1 min, type C = SES-3 min, and type D = MEW 0–45°). They include the confusion matrix and obtain an average classification accuracy of 64.4%.
– Seven variables as input, classifying into three different classes (A, C, and D). Classification accuracy boosts to 90.6%.
– A binary classification task focused on using the seven input variables to separate classes A, B, and C from D (i.e., glass and SES from MEW). Classification accuracy improves to 93.0%.
The variables are normalized using a Z-score function, a linear kernel SVM is used in MATLAB©, and the reported accuracies are calculated using k-fold cross-validation with k = 5. In conclusion, the key factors considered in this work were dimensionality (2D vs. 3D), pore size, fiber diameter, and topography (randomly stacked vs. precisely stacked). The authors obtained good classification precision, and the use of k-fold cross-validation allowed for better utilization of the small data set and low bias. In another work, the authors used SVM for estimating characteristics of nanofibrous scaffolding (Adebileje et al., 2019). The study aimed to acquire the pattern that exists in the protein band intensity of nanofibrous poly(L-lactic) acid (PLLA) scaffolds and to adopt the acquired pattern for estimating protein adsorption and cell adhesion. The work was conducted under the MATLAB© computing environment, using SVR (a variant of SVM specifically adapted for regression problems) for training and testing. Specifically, they used a Gaussian kernel on fifteen values of experimental band intensity data taken from the literature. They started from the premise that cell attachment, growth, and migration on polymer surfaces are believed to be assisted by proteins, either secreted by the cells or adsorbed from serum proteins. They analyze their results, producing the following conclusions:
− Scaffolds possessing nanofibrous pore walls adsorbed serum proteins four times more when compared to solid pore wall scaffolds.
− The nanofibrous architecture selectively mediated the adsorption of proteins such as vitronectin and fibronectin, even though both scaffolds were fabricated from the same PLLA material.
− Nanofibrous band intensity scaffolds indicated 1.7 times more osteoblastic cell attachment when contrasted with solid pore wall scaffolds.
− Biomimetic nanofibrous architectures with estimated band intensity constitute excellent scaffolding materials for tissue engineering.
There exist a few methodological problems with this work, the most noticeable being that they obtain a training accuracy of 76.63% and a testing accuracy of 99.51%. This comes from using such a small sample and only one fixed train/test data split. A sounder methodological approach, in this case, would be to use k-fold cross-validation with k = 4 or 5, which would produce more sensible results. Even then, more data are probably necessary in order to produce consistent results.
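The evaluation protocol discussed in this section (z-scored features, a linear-kernel SVM, and k-fold cross-validation rather than a single fixed split) can be expressed compactly. The following Python sketch uses scikit-learn instead of the original MATLAB© code, and the feature matrix is random placeholder data with 7 descriptors for 88 samples rather than the published measurements.

```python
# k-fold cross-validated linear SVM on z-scored features: the sounder evaluation option
# for small datasets, as argued above. Placeholder data only.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
X = rng.normal(size=(88, 7))          # 7 cell-shape metrics per sample (random stand-ins)
y = rng.integers(0, 4, size=88)       # 4 substrate classes (A-D)

clf = make_pipeline(StandardScaler(), SVC(kernel="linear"))
scores = cross_val_score(clf, X, y, cv=5)     # k = 5 folds
print("mean cross-validated accuracy:", scores.mean())
```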
1.4.4 Multiagent systems Multiagent systems are a computational paradigm of AI that has been applied to a vast number of problems, from robotics to distributed problem-solving. A multiagent system is composed of multiple intelligent agents that interact with each other: "Smart agents are software entities that carry out a set of operations on behalf of a user or program, with a certain degree of independence or autonomy, and in doing so, employ some knowledge or representation of desires or user goals" (Araujo and Rodríguez, 2018). The authors highlight four basic properties of the agents: autonomy, reactivity, proactivity, and social capacity. It is one of the most modern tools in AI. Although it does not refer to bone, we include the following work as the only agent-based modeling work we could find that applies to scaffold design (Ramírez López et al., 2019). They used MSCs, cells that are also used in BTE. This work has, as its final objective, the study of the design process of biodevices for the treatment of infarcted myocardium in biomodels. For this reason, the authors sought to evaluate the behavior of differentiated and undifferentiated MSCs. The agent-based model is developed and intended to reproduce the behavior of individual MSCs by simulating apoptosis, differentiation, proliferation, and migration. The scaffolds considered were collagen. The cells considered were rat MSCs. With the agent-based model, the authors intended to model cell viability after a certain period of time, the differentiation of MSCs, cell migration, and how the cells might arrange over the myocardial tissue. Once again, at this point, we could say that the study and observation of these phenomena indirectly give us an idea of cell adhesion because, as Fisher et al. (2007) said, cells must be able to
establish contacts with the biomaterial surface so that anchorage-dependent functions, such as spreading, proliferation, and migration, can take place. The system of intelligent agents represents the cell population, where the reactions at the micro level cause emergent behaviors. Agents have attributes and, in this case, these attributes correspond to biological cell properties. The interactions between the set of multiple agents represent the different processes that occur at the cellular level. The software used for the simulations was NetLogo©. There were two types of simulated agents: on the one hand, the substrate (or environment) and, on the other hand, the cells. The first type of agent represented either cardiac tissue or blood vessels and had the attributes "damage" and "inflammation." The second type of agent represented either MSCs, endothelial cells, or cardiomyocytes. For the MSC agents, the attributes were "cardiomyocyte differentiation probability," "endothelial differentiation probability," "differentiation time counter," "duration of the differentiation process," and "following cell type." For the endothelial cell and cardiomyocyte agents, the attributes were "probability of dying by apoptosis" and "influence over the differentiation decision." The simulations were computationally expensive and slow: a 50 GB RAM server was required to reduce by 83% the time the simulations would have taken on an 8 GB RAM desktop computer. Even then, this multiagent system was developed as a general model based on empirical knowledge and available research. The authors recognize that some simplifications were introduced in order to get the simulations to produce good results in the available timeframe. The simulations were validated by considering the physical behavior of the system: the authors performed two simulations that behaved as expected, namely, the anti-inflammatory effect that MSCs exert over the infarcted myocardium and the population size of each cell type (MSC, endothelial, and cardiomyocyte) over time. Both were 60-day simulations. Finally, there was no cell migration in any of the collagen scaffolds; the authors attribute this situation to the lack of significant tensile forces for focal adhesions to occur.
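To give a flavor of what such an agent-based model looks like in code, the toy Python sketch below implements only the barest version of the idea: each cell agent carries an apoptosis probability and a differentiation probability and updates its state once per simulated day. It is not the authors' NetLogo© model, and all rates and population sizes are invented for illustration only.

```python
# Toy agent-based model: MSC agents may die, differentiate into cardiomyocytes or
# endothelial cells, or stay as MSCs at each step. Rates and counts are placeholders.
import random

random.seed(0)

class CellAgent:
    def __init__(self):
        self.cell_type = "MSC"
        self.p_apoptosis = 0.01        # probability of dying per day (assumed)
        self.p_differentiate = 0.03    # probability of differentiating per day (assumed)

    def step(self):
        if random.random() < self.p_apoptosis:
            self.cell_type = "dead"
        elif self.cell_type == "MSC" and random.random() < self.p_differentiate:
            self.cell_type = random.choice(["cardiomyocyte", "endothelial"])

population = [CellAgent() for _ in range(200)]
for day in range(60):                  # 60-day horizon, as in the simulations described
    for cell in population:
        if cell.cell_type != "dead":
            cell.step()

counts = {}
for cell in population:
    counts[cell.cell_type] = counts.get(cell.cell_type, 0) + 1
print(counts)                          # population size of each cell type at day 60
```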
1.4.5 Cellular automata Cellular automata are a set of discrete elements that evolve in time according to predefined rules. Specifically, a distance or neighborhood is defined, and each cell acquires a new state (usually living or dead) according to its previous value and those of the cells in its immediate predefined neighborhood. This has allowed them to be used to successfully model complex phenomena (Pérez Betancourt et al., 2015). The authors of Czarnecki et al. (2014) built a computer model based on cellular automata. The objective of the work is twofold: (1) to study the effects of crystallinity,
orientation, and surface roughness on cell growth in three carbon fiber scaffolds, and (2) to model this effect using cellular automata, as a first step toward developing a proper scaffold design tool. As for the experimental data, three different types of carbon fibers were tested (T300, P25, and P120). The total number of samples per type of fiber was six, because six-well tissue culture plates were used, and each experiment was repeated three times. The cells considered were primary human osteoblasts. The authors propose a two-dimensional grid for cellular automata modeling of the adhesion, differentiation, and proliferation processes, where the size of each grid element is the diameter of an osteoblast as well as the diameter of the fiber, 8 µm. As for the reproduction and life or death of each cell, they use two functions, a Lagrange polynomial and an exponential function, both influenced by the physical parameters of the fiber and using linear correlation with the experimental results to adjust the three parameters found in each model. Each lattice site can assume one of three states (empty, occupied, or blocked). The obtained results showed that the polynomial obtains better results than the exponential function. The authors use the IBM SPSS© software to make a statistical comparison between the numerical results and the obtained images, obtaining a maximum difference of 13.2% among them in the polynomial case. The MATLAB© software package was used for all algorithms and simulations. In this work, the impact of the considered variables (crystallinity, surface roughness, and material orientation) was studied. The results obtained by the authors propose carbon fiber as suitable for osteoblast growth, obtaining a good simulation validation and providing most of the necessary experimental and methodological details. They also confirm that material alignment is critical, since multidirectional scaffolds showed better cell growth than unidirectional ones. Vivas et al. (2015) also implemented a cellular automata model to simulate the cell adhesion and cell proliferation of an initial cell population seeded on a scaffold. The cells considered were 3T3 murine fibroblasts. The model was a two-dimensional
substrate. The results obtained were as follows: focal adhesions (FAs) were observed after 100 min. During this time, cells preserve their rounded morphology. After that, cell membrane molecules reorganize, and the cytoskeleton and glycocalyx are also reorganized to minimize repulsion between cell and substrate. This allows the focal adhesions to mature, increasing the number of anchoring sites. A total of 236 iterations were required to simulate the cell adhesion process. The simulation results were contrasted with an experimental part consisting of a 3T3 fibroblast culture plated into T25 cell culture flasks; the evaluations of cell adhesion and cell proliferation were then made in triplicate and quadruplicate, respectively. This work only makes a two-dimensional approximation of a scaffold, and although some details are provided for the cellular automata rules and grid, more details of the computational experimental setup should have been provided.
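A minimal sketch of the kind of lattice update rule used in these cellular automata models is given below. The grid dimensions echo the 67 × 33 discretization reported by Vivas et al., but the seeding density, proliferation probability, and neighborhood rule are illustrative assumptions rather than the published models.

```python
import numpy as np

# Minimal 2D cellular automaton for cell proliferation on a scaffold surface.
# States: 0 = empty lattice site, 1 = occupied by a cell.
# Seeding density and proliferation probability are illustrative assumptions.

rng = np.random.default_rng(0)

def seed_grid(rows=33, cols=67, n_cells=50):
    """Place n_cells cells at random empty sites (cf. the 67 x 33 grid in Vivas et al.)."""
    grid = np.zeros((rows, cols), dtype=np.int8)
    idx = rng.choice(rows * cols, size=n_cells, replace=False)
    grid.flat[idx] = 1
    return grid

def step(grid, p_divide=0.05):
    """One iteration: each occupied site may place a daughter cell
    into a randomly chosen empty site of its Moore neighborhood."""
    new = grid.copy()
    rows, cols = grid.shape
    for r, c in zip(*np.nonzero(grid)):
        if rng.random() >= p_divide:
            continue
        # Collect empty neighbors (8-connected, clipped at the borders).
        nbrs = [(rr, cc)
                for rr in range(max(r - 1, 0), min(r + 2, rows))
                for cc in range(max(c - 1, 0), min(c + 2, cols))
                if (rr, cc) != (r, c) and new[rr, cc] == 0]
        if nbrs:
            rr, cc = nbrs[rng.integers(len(nbrs))]
            new[rr, cc] = 1
    return new

grid = seed_grid()
for t in range(236):          # number of iterations, mirroring the adhesion run length
    grid = step(grid)
print("occupied sites:", int(grid.sum()))
```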
1.5 Conclusions and future works There are very few examples of AI applications to bone scaffolds in tissue engineering that consider cell adhesion as an important part of their analysis. In this work, we have presented the results of a literature review focused on answering the question: what AI techniques have been used, and what exactly have they been used for? In response to those questions, we found that different types of neural networks (combined with PSO, SVM, and SVR) have been used, and that cellular automata and multiagent systems have been applied in works focusing on scaffolds and cell adhesion. They are used mainly to model part of the process in order to perceive and interpret the role of each variable (as opposed to directly trying to optimize some aspect of the construction process). Table 1.1 shows a summary of the works analyzed and a comparison of their main characteristics: method used, input variables, output variables, validation of results, and so on. One key aspect to consider is the small number of samples used (from 15 to 88), which seriously complicates the application of many of these techniques, which usually require hundreds to thousands of samples. Some important methodological issues have been observed, mainly that only some of the works use k-fold cross-validation, which is the standard way of improving results and avoiding bias in the case of small datasets. In this respect, several works share the full list of their (hard to obtain) data and the details of the experiment, which represents a significant contribution by itself, since future research can be conducted upon those data. The inclusion of meta-optimization methods to enhance the optimization method used (as in the first work, where PSO determines the topology and number of networks used) is very useful for improving results and making better use of the data.
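As noted above, k-fold cross-validation is the standard safeguard against biased performance estimates on small datasets. A minimal sketch with scikit-learn is shown below; the synthetic data stand in for the small scaffold datasets discussed here, and the classifier choice is an assumption made only for illustration.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.svm import SVC

# Synthetic stand-in for a small scaffold dataset (a few dozen samples);
# the real input/output variables are those listed in Table 1.1.
rng = np.random.default_rng(1)
X = rng.normal(size=(60, 4))                     # 4 hypothetical input parameters
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)

# Stratified 5-fold cross-validation gives a far less biased estimate than a
# single fixed train/test split when only a few dozen samples are available.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(SVC(kernel="rbf", C=1.0), X, y, cv=cv)
print("fold accuracies:", np.round(scores, 3), "mean:", scores.mean().round(3))
```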
Table 1.1: Summary of the works analyzed.
Asadi-Eydivand et al. (2016). Optimal design of a 3D-printed scaffold using intelligent evolutionary algorithms. Method: PSO and ANN. Input parameters: layer thickness, delay time, orientation. Output variables: % open porosity, compressive strength. k-Fold cross-validation: no.
Tourlomousis et al. (2019). Machine learning metrology of cell confinement in melt electrowritten three-dimensional biomaterial substrates. Method: SVM. Input parameters: ellipticity (cell), rectangularity (cell), area (FA), size (FA), aspect ratio (cell), E-slope (cell), mean G-function, approximate size of proteins. Output variables: different classifications. k-Fold cross-validation: fivefold.
Adebileje et al. (2019). Classification based computation: nanofibrous scaffold architecture using support vector regression analysis. Method: SVR and SVM. Input parameters: does not apply. Output variables: band intensity of a nanofibrous scaffold. Amount of data: very small dataset; fixed train/test set divisions produce a biased result. k-Fold cross-validation: no.
Ramírez López et al. (2019). Isolation, characterization, and agent-based modeling of mesenchymal stem cells in a bioconstruct for myocardial regeneration scaffold design. Method: multiagent system. Output variables: mean inflammation along time, cell population along time. k-Fold cross-validation: no.
Czarnecki et al. (2014). Cellular automata simulation of osteoblast growth on microfibrous carbon-based scaffolds. Method: 2D cellular automata. Input parameters: crystallinity, orientation, surface roughness. Output variables: osteoblast growth. k-Fold cross-validation: does not apply. Validation of results: yes, with micrography of in vitro tests.
Vivas et al. (2015). Modeling cell adhesion and proliferation: a cellular-automata based approach. Method: 2D cellular automata. Input parameters: does not apply. Output variables: cell settling time, cell adhesion time, cell proliferation time. k-Fold cross-validation: does not apply. Validation of results: yes.
The meta-optimization method discards the unnecessary networks, diminishing the number of parameters to train and making better use of the data. Another important aspect, often overlooked yet clearly important for bone scaffolds, is the mechanical properties of the material, the design, and the shape. Uniaxial compressive strength is a critical characteristic in the case of bone (since the scaffold will be loaded, and severely so, depending on the placement of the implant in the body). As shown in the first work analyzed, there is a maximum compressive strength that can be achieved while maintaining porosity for a given material. If the compressive strength is not enough for the loads expected for the position in the body and the particular patient, then another material must be used. This aspect, as well as the impact of the shape of the scaffold, requires more study and can be effectively assisted by AI, provided critical data are collected. Finally, the wide variety of construction techniques, materials, and cells used requires different variables for the problem. Further study is needed in order to determine and classify all the variables involved in the cell adhesion process of the bone scaffold, and the relations among them and with the different mathematical models available. Porosity, open porosity, and interconnectivity have been studied, but mainly in limited geometrical shapes (chiefly cylindrical and prismatic); the shape of the bone to be replaced will change these values and play an important part in cell adhesion, which will require further study. As future work, it is clearly necessary to conduct a study and survey of the variables that most influence cell adhesion in a BTE scaffold, and a collection of the available data, in order to facilitate future applications of AI to this case. BTE is still a largely unexplored field with a lot of potential. Acknowledgments: The work contained in this chapter is part of the master's thesis of María de Lourdes Sánchez under the direction of Dr. Adrián Will and Dr. Andrea Rodríguez, in the Maestría en Ingeniería en Sistemas de Información, Universidad Tecnológica Nacional, Facultad Regional Tucumán, Tucumán, Argentina.
References Adebileje S.A., Aiyelabegan H.T., Adebileje T.A., and Olakunle T.O. Classification based computation: Nanofibrous scaffold architecture using support vector regression analysis. Acta Scientific Medical Sciences, (2019), 7, 93–98. Alberts B., Johnson A., Lewis J., Raff M., Roberts K., and Walter P. Biología Molecular de LA CÉLULA – 5ta. edición, (2008), Ediciones Omega. Altamirano Valencia A.A., Vargas Becerril N., Vázquez F.C., Vargas Koudriavtsev T., Montesinos J.J., Alfaro Mayorga E., et al. Biocompatibilidad de andamios nanofibrilares con diferentes concentraciones de PLA/Hidroxiapatita. ODOVTOS-International Journal of Dental Sciences, (2016), 39–50.
Araujo P.B. and Rodríguez S.A. Tesis Doctoral: Patrones de Diseño Organizacionales para Sistemas Multiagentes, (2018). Arima Y. and Iwata H. Effect of wettability and surface functional groups on protein adsorption and cell adhesion using well-defined mixed self-assembled monolayers. Biomaterials, (2007), 28 (20), 3074–3082. Asadi-Eydivand M., Solati-Hashjin M., Fathi A., Padashi M., and Osman N.A. Optimal Design of a 3D-Printed Scaffold Using Intelligent Evolutionary Algorithms Applied Soft Computing, (http://dx.doi.org/10.1016/j.asoc.2015.11.011), (2016), 36–47. Biswal D., Kumar N.S., Yadav P., and Moharana A. Artificial intelligence in advancement of regenerative medicine & tissue engineering. 2nd International Conference on Tissue Engineering and Regenerative Medicine (ICTERM), (2013). Boccaccio A., Uva A.E., Fiorentino M., Lamberti L., and Monno G. A mechanobiology-based algorithm to optimize the microstructure geometry of bone tissue scaffolds. International Journal of Biological Sciences, (2016), 12(1), 1–17. Chang H.-I. and Wang Y. Cell responses to surface and architecture of tissue engineering scaffolds. Regenerative Medicine and Tissue Engineering – Cells and Biomaterials, (2011). Chen M., Patra P.K., Lovett M.L., Kaplan D.L., and Bhowmick S. Role of electrospun fibre diameter and corresponding specific surface area (SSA) on cell attachment. Journal of Tissue Engineering and Regenerative Medicine, (2009), 3(4), 269–279. Czarnecki J., Jolivet S., Blackmore M., Lafdi K., and Tsonis P. Cellular automata simulation of osteoblast growth on microfibrous-carbon-based scaffolds. Tissue Engineering Part A, (2014). Danilevicius P., Georgiadi L., Pareman C., Claeyssens F., Chatzinikolaidou M., and Farsari M. The effect of porosity on cell ingrowth into accurately defined, laser-made, polylactide-based 3D scaffolds. Applied Surface Science, (2015), 336, 2–10. Dibyajyoti B., Pritha Y., Suraj Kumar N., and Aasis M. Artificial intelligence in advancement of regenerative medicine & tissue engineering. Department of Biotechnology and Medical Engineering, NIT Rourkela, (2013). Eltom A., Zhong G., and Muhammad A. Scaffold techniques and designs in tissue engineering functions and purposes: A review. Hindawi Advances in Materials Science and Engineering, (2019). Estrada C., Paz A.C., and López L.E. Ingeniería de tejido óseo: Consideraciones básicas. Revista EIA, (2006), 93–100. Fernández J.M. Estudio de biocompatibilidad de polímeros sintéticos y su aplicación en ingeniería de tejido óseo. Tesis de Doctorado – Universidad Nacional de La Plata – Facultad de Ciencias Exactas – Departamento de Ciencias Biológicas, (2011). Fisher J.P., Mikos A.G., and Bronzino J.D. (Edits.). (2007). CRC Press. Gaviria Arias D., Guevara Agudelo A., and Cano López E. Evaluación del crecimiento de fibroblastos humanos en andamios de fibroína de Bombyx mori L. Revista Colombiana de Biotecnología, (2018), 47–56. Gentleman M.M. and Gentleman E. The role of surface free energy in osteoblast–biomaterial interactions. International Materials Review, (2014), 59(8), 417–429. Gordeladze J.O., Haugen H.J., Lyngstadaas S.P., and Reseland J.E. (2017). Bone Tissue Engineering: State of the Art, Challenges, and Prospects. En tissue engineering for artificial organs: Regenerative medicine, smart diagnostics and personalized medicine, first Edition (págs. 525–551). Anwarul Hasan. Granados M.V., Montesinos-Montesinos J.J., and Álvarez-Pérez M.A. 
Adhesión y proliferación de las células troncales mesenquimales de médula ósea en andamios fibrilares de poli (ácido L-láctico) (PLA). Revista Mexicana de Ingeniería Biomédica, (2017), 38(1), 288–296.
Graupe D. Principles of Aritificial Neural Networks – 3rd edition (Vol. Volume 7), (2013), World Scientific. Harbers G.M. and Grainger D.W. Cell-Material interactions: Fundamental design issues for tissue engineering and clinical considerations. Hollinger E.J.O., An Introduction to Biomaterials, Second Edition, (pág. 63), (2011), CRC Press. Lampin M., Warocquier-Clèurot R., Legris C., Degrange M., and Sigot-Luizard M.F. Correlation between substratum roughness and wettability, cell adhesion, and cell migration. Journal of Biomedical Materials Research, (1997), 36(1), 99–108. Lee M.H., Brass D.A., Morris R., Composto R.J., and Ducheyne P. The effect of non-specific interactions on cellular adhesion using model surfaces. Biomaterials, (2005), 26(14), 1721–1730. Moreno M., Amaral M.H., Sousa Lobo J.M., and Silva A.C. Scaffolds for bone regeneration: State of the art. Current Pharmaceutical Design, (2016). Murphy C.M., Haugh M.G., and O’Brien F.J. The effect of mean pore size on cell attachment, proliferation and migration in collagen–glycosaminoglycan scaffolds for bone tissue engineering. Biomaterials, (2010), 31(3), 461–466. Nakamura M., Hori N., Ando H., Namba S., Toyama T., Nishimiya N., et al. Surface free energy predominates in cell adhesion to hydroxyapatite through wettability. Materials Science and Engineering: C, (2016), 62, 283–292. Narváez Tovar C.A., Velasco Peña M.A., and Garzón Alvarado D.A. Modelos computacionales de diferenciación y adaptación ósea. Revista Cubana de Investigaciones Biomédicas, (2011), 126–140. O’Brien F.J., Harley B.A., Yannas I.V., and Gibson L.J. The effect of pore size on cell adhesion in collagen-GAG scaffolds. Biomaterials, (2005), 26(4), 433–441. Pérez Betancourt Y.G., Rodríguez Puente R., and Mufeti T.K. Cellular automata and its applications in modeling and simulating the evolution of diseases, (2015). Phipps M.C., Clem W.C., Catledge S.A., Xu Y., Hennessy K.M., Thomas V., et al. Mesenchymal stem cell responses to bone-mimetic electrospun matrices composed of polycaprolactone, collagen I and nanoparticulate hydroxyapatite. PloS One, (2011), 6(2), e16813. Ramírez López D.V., Melo Escobar M.I., Peña-Reyes C.A., Rojas Arciniegas Á.J., and Neuta Arciniegas P.A. Isolation, characterization, and agent-based modeling of mesenchymal stem cells in a bio-construct for myocardial regeneration scaffold design. Data, (2019). Rodríguez A.P., Felice C., and Ruiz G. Universidad Nacional de Tucumán – Facultad de Ciencias Exactas y Tecnología. Seminario Introducción a la Ingeniería de Tejidos, (2013), Tucumán. Roseti L., Parisi V., Petretta M., Cavallo C., and Desando G. Scaffolds for bone tissue engineering: State of the art and new perspectives. Material Science and Engineering C, (2017), 1246–1262. Suárez Vega D.V., Velazco de Maldonado G.J., and Yépez Guillén J.D. Histomorfometría de la regeneración ósea obtenida con sistema liposoma-membrana de quitosano en un modelo experimental. Revista Virtual de la Sociedad Paraguaya De Medicina Interna, (2017), 4, 12–34. Tamada Y. and Ikada Y. Effect of preadsorbed proteins on cell adhesion to polymer surfaces. Journal of Colloid and Interface Science, (1993), 334–339. Tarassoli S.P. Artificial intelligence, regenerative surgery, robotics? What is realistic for the future of surgery?. Annals of Medicine and Surgery, (2019), 53–55. Tourlomosis F., Jia C., Karydis T., Mershin A., Wang H., Kalyon D.M., et al. Machine learning metrology of cell confinement in melt electrowritten three-dimensional biomaterial substrates. 
Microsystems & Nanoengineering, (2019). Vivas J., Garzón-Alvarado D., and Cerrolaza M. Modelling cell adhesion and proliferation: A cellularautomata based approach. Advanced Modeling and Simulation in Engineering Sciences, (2015).
Divya Gaba, Nitin Mittal
2 Implementation and classification of machine learning algorithms in healthcare informatics: approaches, challenges, and future scope Abstract: Health informatics primarily deals with the methodologies that help to acquire, store, and use information in health and medicine. Large databases of heterogeneous and complex healthcare data can be very beneficial, but it is difficult for humans to interpret such "big data." With such large datasets, machine learning (ML) algorithms can work very well in predicting diseases and treatment methods. ML algorithms learn from past experience and make predictions and decisions for current problems. Many challenges are encountered while applying ML in healthcare, mostly in applications where the dataset is very complex and of varied type (ranging from texts to scans). A possible alternative is the use of interactive ML, where doctors can be kept in the loop. Hence, an integrated and extensive approach is required for the application of ML in health informatics. This chapter deals with the various types of ML techniques, their approaches, challenges, and future scope in healthcare informatics. Further, these techniques can be used to build models for quick and precise healthcare discovery. Keywords: Machine learning, Healthcare informatics, Healthcare applications
2.1 Introduction Healthcare informatics deals with the acquisition, transmission, processing, storage, and retrieval of information relevant to healthcare for the timely detection and quick diagnosis of disease. The scope of healthcare informatics is bound to information related to diseases, healthcare records, and the systems used to process such information. Besides the aim of providing affordable, high-quality, and consistent healthcare, conventional medical practice across the United States in recent times has contributed to improved technology and technical support for analysts, medical experts, and patients. These kinds of
Divya Gaba, Nitin Mittal, Electronics and Communication Engineering, Chandigarh University, Punjab, India https://doi.org/10.1515/9783110648195-002
efforts have proved to be beneficial for treatment and for creating technological advances such as computerized image restoration. Machine learning is a natural extension of artificial intelligence (AI). Specialists and medical experts regularly resort to AI to address complex statistical analyses. The practice of combining healthcare data and AI with the objective of recognizing patterns is commonly referred to as healthcare informatics. The objective of healthcare informatics is therefore to identify patterns found in data and then learn from what has been recognized (Clifton et al., 2012).
2.2 Techniques of machine learning for health informatics Numerous AI methods have been developed for health informatics. Some systems are based on wavelets and fuzzy logic; the different strategies have various applications and run on various sorts of databases. In this part, we discuss a few of them. The wavelet transform is a tool for better representation of a signal, as it provides not only the frequency detail but also the spatial (temporal) information at the same time. Numerous diseases have been recognized by electrocardiographic (ECG) analysis. However, the ECG signal is a nonlinear function, so its analysis is performed with the wavelet transform. Besides the wavelet transform, the AI algorithm comprises R-point location using various algorithms. Concealed information is better perceived in the discrete wavelet transform (DWT) domain than in the time and frequency domains. The reason for this superiority comes from the fact that the DWT domain provides a precise time and frequency representation of a signal, and this enables both detection and extraction of hidden information. For this task, two distinct classes of ECG signals were examined: left bundle branch block and normal heartbeat. A digital processing structure was devised to extract significant information from these signals. The first processing step was DWT-based noise reduction followed by dimension reduction with principal component analysis (PCA). The PCA output was fed to a support vector machine (SVM), which was used to classify the signal as either normal or diseased. The 10-fold cross-validated classification results show 99.93% sensitivity, 100% specificity, 100% positive predictive value, and 99.96% accuracy. The strong classification results demonstrate that automated processing of ECG signals can uncover relevant information for disease inference (Guyton et al., 2006).
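The pipeline just described (DWT-based denoising, PCA for dimension reduction, and an SVM evaluated with 10-fold cross-validation) can be sketched as follows. This is a minimal illustration assuming the PyWavelets and scikit-learn libraries; the synthetic beats, the db4 wavelet, the thresholding rule, and the number of principal components are assumptions made for the sketch, not the settings of the study cited above.

```python
import numpy as np
import pywt
from sklearn.decomposition import PCA
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def dwt_denoise(beat, wavelet="db4", level=4):
    """Soft-threshold the detail coefficients (universal threshold) and reconstruct."""
    coeffs = pywt.wavedec(beat, wavelet, level=level)
    sigma = np.median(np.abs(coeffs[-1])) / 0.6745        # noise estimate from finest scale
    thr = sigma * np.sqrt(2.0 * np.log(len(beat)))
    coeffs[1:] = [pywt.threshold(c, thr, mode="soft") for c in coeffs[1:]]
    return pywt.waverec(coeffs, wavelet)[: len(beat)]

# X: one row per segmented heartbeat, y: 0 = normal, 1 = left bundle branch block.
# Synthetic placeholders stand in for beats segmented from real recordings.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 256))
y = rng.integers(0, 2, size=200)

X_clean = np.vstack([dwt_denoise(b) for b in X])
clf = make_pipeline(StandardScaler(), PCA(n_components=12), SVC(kernel="rbf"))
scores = cross_val_score(clf, X_clean, y, cv=10)          # 10-fold cross-validation
print("mean accuracy: %.3f" % scores.mean())
```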
2.2.1 Fuzzy logic tool Fuzzy logic control systems are basically rule-based expert systems that comprise a set of linguistic "IF–THEN" rules. The fuzzy logic technique is used for the regulation of many diseases. One such application is the regulation of the glucose level of persons suffering from diabetes. A mathematical model that has been developed shows a connection between the glucose level in humans and food intake. A generalized fuzzy logic controller includes a set of rules introduced to manage glucose levels for persons suffering from diabetes; the output is then produced by the fuzzy logic technique. There are two kinds of diabetes: type 1 and type 2 (Van den Berghe et al., 2001). Different fuzzy logic rules are applied for detecting both types of diabetes from the database (Figures 2.1 and 2.2).
Figure 2.1: Wavelet transformation tool (machine learning techniques for AI diagnosis: wavelet transformation, fuzzy logic, genetic algorithm, pixel based, rule learning).
Figure 2.2: Fuzzy logic rule base (rule base mapping Boolean logic inputs true/yes/1 and false/no/0).
Adaptive fuzzy logic control systems comprise a collection of linguistic rules, fuzzy implications, fuzzy model identification components, and an adaptive algorithm. Such an adaptive system can be a multilevel system: the first level is the fuzzy logic itself, and the next, higher level is the adaptation mechanism used to cope with changing process conditions. In a simple system, the measured nonfuzzy state variable is compared with a given nonfuzzy set point. The crisp nonfuzzy value is then converted into two fuzzy controller inputs: the error and the error difference. Since it is ultimately necessary to compute a usable value of the controller output, a defuzzifier is required, which converts the output fuzzy set into a crisp value and passes this signal to the final control element.
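A minimal sketch of the fuzzification, rule base, and defuzzification steps just described is given below, applied to a glucose error signal. All membership function shapes, numeric ranges, and insulin output values are illustrative assumptions only and have no clinical validity.

```python
import numpy as np

def tri(x, a, b, c):
    """Triangular membership function on the interval [a, c] with peak at b."""
    return np.maximum(np.minimum((x - a) / (b - a + 1e-9), (c - x) / (c - b + 1e-9)), 0.0)

# Universe of discourse for the controller output (insulin infusion rate, arbitrary units).
u = np.linspace(0.0, 10.0, 201)

def fuzzy_insulin(glucose_error):
    """glucose_error = measured glucose - set point; returns a crisp infusion rate.
    The rule base and all numeric ranges are illustrative assumptions."""
    # 1) Fuzzification of the crisp input.
    neg  = tri(glucose_error, -80, -40,   0)     # below set point
    zero = tri(glucose_error, -20,   0,  20)     # near set point
    pos  = tri(glucose_error,   0,  40,  80)     # above set point

    # 2) IF-THEN rule base (Mamdani min implication, max aggregation).
    low  = np.minimum(neg,  tri(u, 0, 1, 3))     # IF error negative THEN infusion low
    med  = np.minimum(zero, tri(u, 2, 5, 8))     # IF error zero     THEN infusion medium
    high = np.minimum(pos,  tri(u, 7, 9, 10))    # IF error positive THEN infusion high
    aggregated = np.maximum.reduce([low, med, high])

    # 3) Defuzzification by centroid of the aggregated output set.
    if aggregated.sum() == 0:
        return 0.0
    return float((u * aggregated).sum() / aggregated.sum())

print(fuzzy_insulin(55.0))   # high glucose  -> large infusion rate
print(fuzzy_insulin(-30.0))  # low glucose   -> small infusion rate
```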
The glucose level used to manage diabetes can be regulated by applying a fuzzy logic algorithm. A simulation study is additionally carried out, which incorporates the investigation of various parameters. The outcomes demonstrate that feedback regulation is feasible if measurement of the glucose level is feasible. This is challenging because it is hard to monitor and acquire continuous glucose levels. Three types of insulin are accessible to healthcare providers. Currently, human insulin is the most used, because some patients develop a resistance to insulin extracted from animals due to the minor amino-acid difference between animal insulin and human insulin. If the glucose level of a patient is uncontrolled, it may affect the heart, brain, and other parts of the body; hence, it needs to be controlled. From atherosclerosis, it is understood that cardiovascular failures and other types of attacks frequently occur. Small aneurysms of the retina can form due to prolonged high glucose levels, and the resulting decrease of eyesight can cause visual impairment. Blood flow in the feet may also decrease, leading to artery hardening, ulcers, infection, and even gangrene. Hence, diabetic patients need to monitor their daily intake and activity carefully to control the glucose level; this step could help maintain their glucose at adequate levels. Unfortunately, this strict lifestyle may cause an "institutional" psychology, and it may be hard to reliably maintain a severe daily routine over several years. Long-acting insulin may be injected into a diabetic patient undergoing treatment several times a day, and to decrease the post-meal blood glucose spike, fast-acting insulin may be injected. Also, the glucose-sensing devices that are available are invasive, and the blood glucose content can be measured by a blood test.
2.2.2 Genetic algorithm In a genetic algorithm, an optimized solution is determined from a population of candidate solutions. The procedure goes through the following steps: initialization of the population, evaluation, selection of candidates, modification of traits (crossover), and mutation. These steps are illustrated in the flowchart shown in Figure 2.3. The ECG is the device used to measure heartbeats and hence can monitor the function of the heart. It gives both structural and functional causes of irregularities. Under typical conditions, the doctor observes the pattern of the evolving ECG, understands the disease process, and determines the underlying condition. The ECG therefore has a significant role in screening for heart abnormalities. Early detection and treatment of heart diseases are important.
Figure 2.3: Genetic algorithm flowchart (initialize population → evolution → selection → modification → mutation).
However, in many regions, because of the enormous population and limited medical resources, it is costly for medical experts to screen each individual. There is, therefore, a need to create automated screening instruments that use feature extractors and AI algorithms. Feature extraction techniques, for example, principal component analysis (Van den Berghe et al., 2001) and linear discriminant analysis, are used before classification. After feature extraction, the next step is classification. The significant weakness of the k-means algorithm, first given by MacQueen, is that it consistently converges to a local optimum of the objective function. Another supervised classification algorithm is the error back-propagation neural network (EBPNN), which can separate complex data patterns; however, the EBPNN also performs only a local optimization of the objective function. A class of methods called evolutionary algorithms are population-based rather than sample-based techniques and have heuristically adapted structures. These algorithms consistently converge toward the global optimum of the objective function. Genetic algorithms are evolutionary algorithms that borrow principles from natural genetics. There are numerous works in the literature (Rumelhart et al., 1986) on similar investigations of the GA.
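The flowchart in Figure 2.3 can be illustrated with a short sketch that uses a genetic algorithm to select a subset of features for a heartbeat classifier. The synthetic feature matrix, the fitness function (cross-validated accuracy of a k-nearest-neighbor classifier), and all population settings are assumptions made for this example; they do not reproduce any particular study.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)

# Synthetic stand-in for ECG-derived features (rows = beats, columns = candidate features).
X = rng.normal(size=(300, 20))
y = (X[:, 0] - X[:, 3] + 0.5 * rng.normal(size=300) > 0).astype(int)

def fitness(mask):
    """Cross-validated accuracy of a classifier restricted to the selected features."""
    if mask.sum() == 0:
        return 0.0
    clf = KNeighborsClassifier(n_neighbors=5)
    return cross_val_score(clf, X[:, mask.astype(bool)], y, cv=3).mean()

def genetic_feature_selection(n_features=20, pop_size=20, generations=15, p_mut=0.05):
    # 1) Initialize a population of random binary feature masks.
    pop = rng.integers(0, 2, size=(pop_size, n_features))
    for _ in range(generations):
        scores = np.array([fitness(ind) for ind in pop])           # 2) evaluation
        order = np.argsort(scores)[::-1]
        parents = pop[order[: pop_size // 2]]                      # 3) selection (truncation)
        children = []
        while len(children) < pop_size - len(parents):
            a, b = parents[rng.integers(len(parents), size=2)]
            cut = rng.integers(1, n_features)                      # 4) one-point crossover
            child = np.concatenate([a[:cut], b[cut:]])
            flip = rng.random(n_features) < p_mut                  # 5) mutation
            child[flip] ^= 1
            children.append(child)
        pop = np.vstack([parents, children])
    best = max(pop, key=fitness)
    return best, fitness(best)

mask, acc = genetic_feature_selection()
print("selected features:", np.nonzero(mask)[0], "cv accuracy: %.3f" % acc)
```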
2.2.3 Pixel-based technique AI plays a fundamental role in computer-aided diagnosis (CAD) since objects such as lesions and organs in medical images may not be represented precisely by a simple equation. For instance, a lung nodule is commonly modeled as a solid sphere, yet there are nodules of different shapes and nodules with internal inhomogeneities, for example, spiculated or nonsolid nodules (Lostumbo et al., 2010). A polyp in the colon is modeled as a round object; however, there are also polyps that show a flat morphology. Therefore, diagnostic tasks in medical imaging essentially require learning from examples (or data).
AI has been used to classify lesions into specific categories (e.g., abnormal or normal, lesions or nonlesions, and malignant or benign) in CAD. AI algorithms for classification include linear discriminant analysis, quadratic discriminant analysis (Fukunaga, 1990), multilayer perceptrons (Rumelhart et al., 1986), and SVMs (Vapnik, 1995). As available computational power increased significantly, pixel/voxel-based machine learning (PML), which does not use features derived from segmented regions as input data, has developed in medical image processing and analysis; therefore, feature calculation or segmentation is not required. Since PML can avoid errors caused by inaccurate feature computation and segmentation, its performance can potentially exceed that of feature-based approaches. This line of work includes neural filters (Schwartz et al., 2016) and "neural edge enhancers" (Braithwaite et al., 2016), which are artificial neural network (ANN)-based (Kandel et al., 2012) supervised nonlinear image processing methods, as well as massive-training ANN systems (Lostumbo et al., 2010).
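As a rough illustration of the pixel/voxel-based idea, the sketch below feeds raw pixel patches, rather than features computed from segmented regions, directly to a small neural network classifier. The synthetic image, patch size, and network settings are assumptions made for this example only.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)

# Synthetic "image": bright blobs on a noisy background stand in for nodule-like objects.
img = rng.normal(0.0, 0.2, size=(128, 128))
truth = np.zeros_like(img, dtype=int)
for cy, cx in [(40, 40), (90, 70)]:
    yy, xx = np.ogrid[:128, :128]
    blob = (yy - cy) ** 2 + (xx - cx) ** 2 < 8 ** 2
    img[blob] += 1.0
    truth[blob] = 1

def extract_patches(image, labels, half=4, n=2000):
    """Sample raw (2*half+1)^2 pixel patches and the label of their centre pixel."""
    ys = rng.integers(half, image.shape[0] - half, size=n)
    xs = rng.integers(half, image.shape[1] - half, size=n)
    P = np.stack([image[y - half:y + half + 1, x - half:x + half + 1].ravel()
                  for y, x in zip(ys, xs)])
    return P, labels[ys, xs]

X, y = extract_patches(img, truth)
Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=0.3, random_state=0, stratify=y)
clf = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=0).fit(Xtr, ytr)
print("patch-level accuracy: %.3f" % clf.score(Xte, yte))
```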
2.2.4 Rule learning AI strategies have a vast scope of utilizations in human services conveyance, research, organization, and the board. AI has a wide area of applications in all fields. A considerable number of these applications are gradually increasing as the social insurance network turns out to be increasingly acquainted with AI and its huge potential. Then again, most of the AI scientists are curious about medicinal services settings and overtrivialize them. This common absence of comprehension among medicinal services and AI people group brings about the absence of cutting edge AI techniques appropriation. Among the medicinal services territories that advantage the most from AI are those that depend on computerized forms or that can be robotized. The capacity of AI techniques to adjust to powerfully evolving conditions, previously inconspicuous circumstances, and new difficulties make them perfect for these sorts of uses. Two of the most widely recognized uses of AI in social insurance are choice emotionally supportive networks and learning revelation. Choice emotionally supportive networks depend on computational models that guide chiefs in an assortment of circumstances. These models can be developed and kept up utilizing AI. AI has, in this manner, extraordinary potential when accurately applied to difficult issues that cannot be comprehended with increasingly customary computational strategies or physically without the utilization of PCs. In any case, for AI to be received in social insurance, strategies need to satisfy a few prerequisites. These prerequisites are prominent and pertinent to all intents and purposes in all areas where AI is or can be utilized. Be that as it may, a portion of these prerequisites is especially significant in social insurance at the time of the reception of innovations that are recent and results are incredibly testing.
2.2.4.1 Acceptability Models should be acknowledged by their potential clients. While in part identified with straightforwardness, agreeableness necessitates that the models that do not negate the learning of existing specialists are something else “sensibly” consistent with what is presently being done, and relate to the existing work processes. One of the primary issues in social insurance is adequacy, which really needs to be taken into consideration. It should be noted that the created models being utilized are exact and better than the techniques that are right now being utilized but no one likes to change the manner in which they work. The utilization of machine learning (ML) calculations ought to prompt improved work and give motivations to members; generally, results may not be embraced. Capacity deals with complex sorts of information. Social insurance information is mind-boggling. Indeed, even generally basic utilizations of AI to social insurance information require making various transformations, information prepreparing, encoding of factors, and others. To have boundless acknowledgment in human services, AI techniques ought to have the option to work legitimately with medicinal services information without the requirement of misleadingly encode. Social insurance information is not, and ought-not be, taken by ML instruments as an accumulation of quantity without importance. Albeit further developed ML techniques perceive a wide scope of information types (ostensible, organized, ordinal, interim, proportion, outright, compound, and so on), pervasive gauges. PCs require monstrous measures of information to settle on basic choices or find straightforward certainties. People do precisely the inverse – we can settle on significant choices and find significant actualities dependent on negligible data. Despite the fact that there are numerous contrast features in human and PC deduction/learning forms, one such significance is the capacity to utilize foundation information to put issues into the proper setting. Correspondingly, AI calculations that are furnished with huge information bases and an abundance of foundation learning need not approach tremendous measures of information. This permits AI calculations to concentrate on the revelation of novel actualities and not what is as of now known to specialists. Incredibly huge archives of restorative and human services information (is frequently not coded and much of the time is just accessible as the content of distributed original copies) can be fused into the AI procedure.
2.2.4.2 Accuracy Models need to give solid expectations and in addition dependably depict information, which is, by and large, their primary capacity. Various proportions of precision are accessible, all of which play out some types of tallying/scoring of right and
inaccurate expectations and mixes thereof. Some ordinarily utilized proportions of exactness incorporate accuracy, review, affectability, explicitness, F-score, and others.
2.2.4.3 Straight forwardness Restorative and human service studies expect models to be effectively comprehended by individuals not prepared in AI, insights, and other propelled information examination techniques. In this sense, giving only the solid forecasts is not adequate, as models ought to likewise “clarify” why a particular expectation is done; hence, it does not relate to techniques that lead to making of new information, yet in addition to self-sufficient frameworks that due to their basic job need to leave a “review trail” and be investigated/confirmed intermittently. There has been to a great extent overlooked by numerous cutting-edge AI techniques that give rise to the idea of understandability, and interpretation has been noted on master works. One explanation behind this is it is difficult to gauge the unpredictability of made models and speculations, and utilize that estimation as one of the learning portrayal determination criteria. It is for all intents and purposes difficult to reliably gauge and think about the straightforwardness of models learned in various portrayals. Moreover, compound information portrayals, which are normal to individuals, will in general be hard to learn through AI strategies.
2.2.4.4 Adequacy Model acceptance and its application calculations should be effective. AI calculations applied in medicinal services ought to have the option to adapt to a lot of information. The information may have numerous models (here and there called records or data points), qualities (at times called factors or includes), or both. The hypothetical assessments of calculation multifaceted nature are frequently accessible for some strategies. All the more significantly, clients need the strategies to be executed in a particular timeframe, regardless of whether it implies that outcomes are just inexact or “sufficient.”
2.2.4.5 Exportability Consequences of AI ought to be legitimately transferable to choice help, and different frameworks in which they can be applied at the earliest. It is not abnormal that the educated models will work alongside previously existing models and hence should be perfect. For instance, models for learning can be deciphered or legitimately taken as standards in Arden Syntax, a well-known portrayal language in clinical choice
emotionally supportive networks. On the off chance that models are found out in totally various portrayals, they should be deciphered (normally around) to the objective structure. In the course of recent decades, numerous standard learning calculations and programming have been created. Numerous kinds of principles are considered in AI research contingent upon their utilization and structure, including affiliation rules (which are utilized to speak to regularities in information), choice standards (which are utilized to help choices) and its grouping protocols (used to order models into ideas), rules with special cases (that incorporate part portraying when the standard does not matter), and attribution protocols. The AQ21 framework is especially appropriate for dangerous medicinal services circumstances in light of its adaptability, capacity to manage various kinds of characteristics, handle both huge and little datasets, use foundation information in various structures, gain from individual and amassed information, oversee meta-values, adapt to commotion, per structure productive enlistment, create elective speculations, and numerous different highlights. AQ21 utilizes attribution runs as the primary type of learning representation. The accompanying sections quickly present attribution principles, and blueprints AQ21 primary calculations (Michalski, 2004). Characteristic acceptance necessitates that information should be proportionate to articulations in common language (e.g., English), so the individuals who are not specialists in AI or learning mining, or do not have a specialized foundation may get it. Subsequently, medicinal specialists, social insurance chairmen, attendants, and scientists ought to have the option to comprehend, decipher, alter, and apply information learned by processer frameworks. Such an objective requires that learning revelation projects utilize a language that can either be consequently meant regular language or effectively comprehend without anyone else (Hripcsak, 1994). Choice emotionally supportive networks are comprehensively characterized as PC frameworks that guide decision producers. This statement can take all the applications related to spreadsheets and other models that lead to master framework. In this section, we center around information-based choice emotionally supportive networks in which computers offer help to their clients dependent on the substance of their insight bases and these clients can hence make them beneficial for their use and make them application based also. Customarily, choice emotionally supportive networks are static as in their insight does not change after some time without expressing mediation by the client. AI-based choice emotionally supportive networks can, in any case, advance and adjust to powerfully changing situations in which they work and make it useful in all kinds of environment. Versatility is another characteristics that can be achieved and in which AI can help in choice support. This is a very important feature and is really beneficial in many spheres. This can be illustrated by the following example. Consider a ready framework that furnishes clinicians with messages advising them about significant occasions identified in a particular patient, that is, sensitivities, sedate medication connections, and unusual
outcomes. An oversensitive ready framework that presentations such as large number of messages cause is a notable wonder called ready exhaustion. In such a case, doctors never again read cautions, yet rather disregard every one of them. An ordinary way to deal with the issue is to make a framework-wide strategy/limit, so alarms try not to overpower clients. This one-size-fits-all methodology disregards every one of the contrasts among doctors and the manner in which they practice. An AI-based arrangement can adjust to explicit clients (doctors) and show just alarms that have the most reduced shot of not being overwritten. The second significant zone wherein AI can be utilized in human services is learning age. Most of the choice emotionally supportive networks depend on principles. These standards are designed by boards of specialists dependent on those tasks that are best and proved. Their creation is a long and troublesome procedure. One of the significant uses of AI is information age – the learning if present in the correct structures can help in arrangement of medical logic modules (MLMs). Since guidelines made by the AQ21 framework are free (e.g., unordered), they can be effectively joined into choice emotionally supportive networks. The real principles are written in the “rationale” space of MLMs while the “information” opening is utilized to infer characteristic qualities and make an interpretation of them into the necessary configuration. Since one MLM is compared with a total choice, it incorporates different principles framing a total rule set family.
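The following sketch shows one simple way in which learned IF–THEN rules can be represented and fired against a patient record so that they can later be exported to a decision-support module. The attribute names, thresholds, and alert messages are illustrative assumptions; this is neither AQ21 output nor Arden Syntax, only a schematic of the same rule-based idea.

```python
# Minimal representation of learned IF-THEN rules for a decision-support module.
# Each rule is a list of attribute conditions plus an alert message; the clinical
# thresholds below are illustrative assumptions only.

RULES = [
    {
        "name": "hyperglycemia-alert",
        "if": [("glucose_mg_dl", ">=", 200), ("on_insulin", "==", False)],
        "then": "Flag possible uncontrolled diabetes; review insulin therapy.",
    },
    {
        "name": "drug-interaction-alert",
        "if": [("on_warfarin", "==", True), ("new_prescription", "==", "aspirin")],
        "then": "Warn: increased bleeding risk (warfarin + aspirin).",
    },
]

_OPS = {">=": lambda a, b: a >= b, "<=": lambda a, b: a <= b, "==": lambda a, b: a == b}

def fire_rules(record, rules=RULES):
    """Return the messages of every rule whose conditions all hold for the record.
    Assumes every attribute referenced by a rule is present in the record."""
    messages = []
    for rule in rules:
        if all(_OPS[op](record[attr], value) for attr, op, value in rule["if"]):
            messages.append((rule["name"], rule["then"]))
    return messages

patient = {"glucose_mg_dl": 240, "on_insulin": False,
           "on_warfarin": True, "new_prescription": "aspirin"}
for name, msg in fire_rules(patient):
    print(name, "->", msg)
```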
2.3 Diagnosis of AD through machine learning techniques One of the widely recognized types of dementia is Alzheimer’s disease (AD), portrayed by psychological and scholarly deficiencies that meddle with day-by-day life if viable treatment is not accessible (Zhang et al., 2010). Advertisement deteriorates after some time by step-by-step decimating synapses, making misfortune in the brain capacity to memorize and the ability to reason, ability to take decisions, and communicate. The number of individuals who develop AD is relied upon to keep on expanding as future increments. With the maturing of the total populace, AD has turned into a major issue and a tremendous weight to the human services framework. Perceiving the critical need to back off or totally forestall the event of an overall social insurance emergency, exertion has been in progress to create and direct viable pharmacological and conduct mediations for postponing the beginning and movement of the ailment. A noteworthy collection of writing proposes that neurotic appearance of AD starts numerous prior years it tends to be analyzed utilizing intellectual tests. At the phase where side effects can be watched, noteworthy neurodegeneration has just happened. In contrast to AD, Mild cognitive impairment (MCI) is increasingly hard to analyze because of its exceptionally gentle subjective disability side effects.
Right now, AD-related neurodegeneration, for example, auxiliary decay, neurotic amyloid testimonies, and metabolic adjustments have been distinguished as potential biomarkers. Progressed factual AI and example acknowledgment strategies have been effectively applied to delineate examples during the beginning period. Instances of AI systems that are generally utilized in medicinal imaging examination incorporate help counterfeit neural systems, k-closest neighbor classifier, and straight discriminant investigation. Notwithstanding deciding gathering contrasts, design characterization strategies can be prepared to recognize people who are in danger for AD. An ongoing report exhibited that grouping techniques are fit for distinguishing AD patients by means of their MRI examination and accomplished precision tantamount to that obtained by experienced doctors. Endeavors have additionally been attempted to create relapse methods for relating clinical scores to imaging information, encouraging consistent checking of AD movement (Johnson et al., 2006).
2.4 Approaches 2.4.1 Human–computer interaction – knowledge data mining AI/ML thinks about the standards of computational gaining from information to comprehend insight. Computational learning has been of great interest for quite a long time, yet we are far away from understanding knowledge: realities are not information and portrayals are not understanding Human computer interaction (HCI) and Knowledge discovery (KDD) did not orchestrate previously. HCI had its emphasis on specific exploratory ideal models, implanted profoundly in cognitive science; and meant to be intellectually/impartially conceivable. Thus, a coordinated effort of the two universes and a complete under-remaining of the information biological system alongside a multidisciplinary range of abilities, encompassing various specializations like information science, well-being, and security. They can be profoundly beneficial for taking care of the previously mentioned issues.
2.5 Data collection techniques All of the above ML algorithms and approaches that have been discussed above rely on data that may be collected directly or indirectly. Every one of the chosen articles depend on a dataset straightforwardly or in a roundabout way obtained from informal communities. We distinguished two wide ways to deal with information assortment: (1) gathering information legitimately from the members with their assent utilizing reviews and electronic information
assortment instruments (e.g., Facebook applications) and (2) total information extricated from open posts. The strategies for gathering information legitimately from members fluctuated with the end goal of the investigations and the objective stage. These strategies included posting venture data on important sites welcoming members to participate in the undertaking (Tsugawa et al., 2015) and posting assignments on publicly supporting stages requesting venture volunteers (Schwartz et al., 2016). For publicly supporting, analysts posted nitty gritty data about their investigations on stages, for example, Amazon Mechanical Turk (Kittur et al., 2008) to pull in members. As a component of a poll, the members would regularly be approached to give educated assent permitting assortment regarding their informal community information. A scope of surveys were utilized to gauge members’ degrees of wretchedness and life fulfilment, including the Center for Epidemiologic Studies Depression Scale, Patient Health Questionnaire-9, and Beck Depression Inventory.
2.6 Conclusion and future challenges There are many challenges in the existing ML techniques that have been discussed. Existing techniques for inductive bias for the most part require the contribution of a human expert in the loop, in the form of heuristics and domain knowledge, to guarantee the choice of a suitable set of features; such features are vital to learning and comprehension. However, such techniques are constrained by the precision and reliability of the expert's knowledge (the human in the loop) and also by the degree to which that knowledge can be transferred to new tasks. A practical example is regularized multitask learning (MTL), which depends on the minimization of a regularization functional, as in SVMs, which have been effectively used in the past for single-task learning. The regularized MTL approach permits modeling the relation between tasks in terms of a novel kernel function that uses a task-coupling parameter, and it largely outperforms single-task learning using SVMs. However, multitask SVMs are inherently limited by the fact that they require each class to be expressed in its own weight vector. An alternative formulation for MTL is an extension of the large-margin nearest neighbor algorithm. Rather than depending on separating hyperplanes, its decision function depends on the nearest neighbor rule, which inherently extends to numerous classes and becomes a natural fit for MTL. This methodology outperforms state-of-the-art MTL classifiers; nevertheless, many research challenges remain open in this area.
2.7 Future work This chapter aimed at reviewing the various types of ML techniques, approaches, challenges, and their future scope in healthcare informatics. After reviewing the various types of ML algorithms, it becomes imperative to study the nature of the data and how it is associated with the efficiency of the ML algorithms applied. Hence, future work includes reviewing the association between the nature of the data and the efficiency of the ML algorithm applied. Further, the ethics and reliability of data imported from online sites for applying ML algorithms also need to be taken into consideration. Thus, the efficiency of ML techniques for health informatics and data science, and their association, need to be reviewed from a broad perspective.
Bibliography American Diabetes Association. Economic consequences of diabetes mellitus in the US in 1997. Diabetes Care, (1998), 21(2), 296–309. Baxter J. A model of inductive bias learning. Journal of Artificial Intelligence Research, (2000), 12, 149–198. Braithwaite S.R., Giraud-Carrier C., West J., Barnes M.D., and Hanson C.L. Validating machine learning algorithms for Twitter data against established measures of suicidality. JMIR Mental Health, (2016) May 16, 3(2), e21. Clifton D.A., Gibbons J., Davies J., and Tarassenko L. (2012) Machine learning and software engineering in health informatics. In: First international workshop on realizing artificial intelligence synergies in software engineering (RAISE), Zurich, Switzerland, 5 June 2012. Doi K. Current status and future potential of computer-aided diagnosis in medical imaging. The British Journal of Radiology, (2005), 78(1), S3–S19. Fauci A.S., Braunwald E., Kesper D.L., Hauser S.L., Longo D.L., Jamesonn J.L., and Loscalzo J. Harrison’s Principles of Internal Medicine. 17th. New York: Mc-Graw Hill, (2008). Fukunaga K. Introduction to Statistical Pattern Recognition. 2nd. San Diego: Academic Press, (1990). Ghahramani Z. Probabilistic machine learning and artificial intelligence. Nature, (2015), 521, 452–459. Guyton A.C. and Hall J.E. Textbook of Medical Physiology. 11th. Philadelphia: W. B Saunders Co, (2006). Hastie T., Tibshirani R., and Friedman J.The Elements of Statistical Learning: Data Mining, Inference, and Prediction. 2nd. New York: Springer, (2009). Houlsby N., Huszar F., Ghahramani Z., and Hernndez-lobato J.M. Collaborative Gaussian processes for preference learning. In: Pereira F., Burges C., Bottou L., and Weinberger K. (eds.), Advances in Neural Information Processing Systems (NIPS 2012), (2012), 2096–2104. Hripcsak G. Writing Arden Syntax medical logic modules. Computers in Biology and Medicine, (1994), 5(24), 331–363. Huppertz B. and Holzinger A. Biobanks a source of large biological data sets: Open problems and future challenges. In: Holzinger A. and Jurisica I. (eds.), Knowledge Dis- covery and Data Mining, (2014), LNCS, Vol. 8401, Springer, Heidelberg, 317–330. Johnson S.C. et al. Activation of brain regions vulnerable to Alzheimer’s disease: The effect of mild cognitive impairment. Neurobiology of Aging, (2006), 27(11), 1604–1612.
Kandel E.R., Schwartz J.H., Jessell T.M., Siegelbaum S.A., and Hudspeth A. Principles of Neural Science. 5th. ed. New York: McGraw-Hill, (2012), 1760 pages. Kittur A., Chi E., and Suh B. Crowdsourcing user studies with Mechanical Turk. 2008 Presented at: SIGCHI Conference on Human Factors in Computing Systems – CHI ’08; Apr 5–10, 2008; Florence, Italy p. 453–456. LeCun Y., Bengio Y., and Hinton G. Deep learning. Nature, (2015), 521, 436–444. Lostumbo A., Wanamaker C., Tsai J., Suzuki K., and Dachman A.H. Comparison of 2D and 3D views for evaluation of flat lesions in CT colonography. Academic Radiology, (2010), 17(1), 39–47. Michalski R.S. (2004) Attributional calculus: a logic and representation language for natural induction. Reports of the machine learning and inference laboratory: MLI 04-2 George Mason University, Fairfax. Park K. Park’s Textbook of Preventive and Social Medicine. 18th. India: Banarsidas Bhanot publishers, (2005). Rumelhart D.E., Hinton G.E., and Williams R.J. Learning representations by back-propagating errors. Nature, (1986), 323, 533–536. Samuel A.L. Some studies in machine learning using the game of checkers. IBM Journal of Research and Development, (1959), 3, 210–229. Schwartz H.A., Sap M., Kern M.L., Eichstaedt J.C., Kapelner A., Agrawal M. et al. Predicting individual well-being through the language of social media. Pacific Symposium on Biocomputing. Pacific Symposium on Biocomputing, (2016), 21, 516–527. Shiraishi J., Li Q., Suzuki K., Engelmann R., and Doi K. Computer-aided diagnostic scheme for the detection of lung nodules on chest radiographs: Localized search method based on anatomical classification. Medical Physics, (2006), 33(7), 2642–2653. Su X., Kang J., Fan J., Levine R.A., and Yan X. Facilitating score and causal inference trees for large observational studies. Journal of Machine Learning Research, (2012), 13, 2955–2994. Teh Y.W., Jordan M.I., Beal M.J., and Blei D.M. Hierarchical Dirichlet processes. Journal of the American Statistical Association, (2006), 101, 1566–1581. Tsugawa S., Kikuchi Y., Kishino F., Nakajima K., Itoh Y., and Ohsaki H. Recognizing depression from Twitter activity. 2015 Presented at: 33rd Annual ACM Conference on Human Factors in Computing Systems; Apr 18–23, 2015; Seoul, Republic of Korea p. 3187–3196. Valiant L.G. A theory of the learnable. Communications of the ACM, (1984), 27, 1134–1142. Van den Berghe G., Wouters P., Weekers F., Verwaest C., Bruyninckx F., Schietz M. et al. Intensive insulin therapy in critically ill patients. New England Journal of Medicine, (2001), 345(19), 1359–1367.
Sandeep Raj, Arvind Choubey
3 Cardiac arrhythmia recognition using Stockwell transform and ABC-optimized twin SVM Abstract: The World Health Organization has reported that cardiac disease is a major cause of mortality worldwide and shall remain so over the next decade. Therefore, cardiac healthcare demands automated computer-aided diagnosis of long-duration electrocardiogram (ECG) signals or heartbeats. In this chapter, an efficient feature representation and machine learning methods are combined and developed to process ECG signals. Initially, the raw heartbeats are preprocessed to eliminate the various kinds of noise inherent in them. Then, the QRS wave is located within the signals by applying the Pan–Tompkins technique. Following QRS-wave localization, a rectangular window of fixed size is selected for segmenting the heartbeats. The Stockwell transform method is then employed to extract time–frequency (TF) information from the heartbeats as features. A few coefficients are selected for an efficient representation of the heartbeats, which further reduces the complexity of processing in the classifier. These output coefficients represent the characteristics of individual heartbeats and support distinguishing between them based on their morphology. Further, the R-peak-to-R-peak information between heartbeats is captured and concatenated with the output TF coefficients. The resulting final feature vector representing each heartbeat is applied to a twin support vector machine classifier to classify these features into their corresponding categories. The classifier performance is also enhanced, as its parameters are optimized by employing the artificial bee colony algorithm under a patient-specific scheme. The proposed methodology is validated on the PhysioNet database, and the output of the classifier model is compared with the labels of the corresponding heartbeats of the database to formulate the results. The experiments conducted reported an overall accuracy of 88.22%, higher than existing state-of-the-art methods. Keywords: Electrocardiogram (ECG), Arrhythmias, Stockwell transform, RR-intervals, Twin support vector machines.
Sandeep Raj, Arvind Choubey, Department of Electronics and Communication Engineering, Indian Institute of Information Technology Bhagalpur, Bhagalpur, India https://doi.org/10.1515/9783110648195-003
3.1 Introduction The World Health Organization statistics indicate that there has been a significant growth in the number of global mortalities due to cardiac abnormalities. In 2008, approximately 17.3 million people died due to cardiovascular diseases (CVDs), a figure expected to reach 23.3 million by 2030. Approximately 80% of these mortalities occurred in low- or moderate-income countries, and most of these deaths are caused by misdiagnosis or late diagnosis of CVDs. The electrocardiogram (ECG) is a low-cost, accurate, and noninvasive tool widely used to diagnose CVDs and commonly used to analyze the behavior of cardiac processes (Garge et al., 2018; Raj, 2018a; Raj et al., 2018f). The ECG is the graphical recording of the electric potentials produced by the pumping action of the heart. It is the depolarization and repolarization of the heart tissue that generates the potential difference to be examined. This mechanism starts with the triggering of the SA node, the pacemaker of the heart, which generates an electrical impulse that triggers a series of electrical events in the heart. These events are sensed by the electrodes, and the fluctuations are represented in the wave components of the ECG signal. An ECG signal usually consists of a P-wave, which is indicative of depolarization of the atria; a QRS complex, which is indicative of depolarization of the ventricles; an ST segment, which reflects proper blood flow throughout the human body; and a T-wave, which is indicative of repolarization of the ventricles. The detection of cardiac diseases from ECG recordings is performed using rhythm, heart rate variability measures, and QRS duration (Raj, 2018a). R-wave peak detection is essential in automatic signal classification, especially under critical conditions and cardiac abnormalities. Today the ECG is the most promising cardiac diagnostic approach around the world because it reveals essential clinical information (Raj, 2018a). It diagnoses the rhythmic episodes of the ECG, and arrhythmias are further classified if the patient has any cardiac disease. Arrhythmias arise due to damage of the heart muscles, diabetes, tobacco use, abnormal respiration, blood pressure, and so on (Raj, 2018a). These arrhythmias are categorized as life-threatening (critical) and non-life-threatening (noncritical) (Raj, 2018a). Severe cardiac abnormalities do not allow any time to undergo treatment, whereas noncritical cardiac abnormalities require special treatment to save life. Simple diagnosis with the naked eye may mislead the detection; therefore, computerized diagnosis is required at that stage (Raj, 2018a). In the last few decades, several methodologies have been reported to overcome the limitations of classical heartbeat recognition methods and obtain a useful diagnosis. Most often, four steps (Hu et al., 1993, 1997; Raj, 2016a, 2017a, 2018a, 2019; Raj and Ray, 2017b; Raj and Ray, 2018b, 2018c, 2018d, 2018e; Raj et al., 2016b, 2018f) are integrated to perform the automated diagnosis of heartbeats: filtering, R-wave localization, feature extraction, and feature recognition. Various robust filters have been designed to eliminate the noises associated
within the heartbeats. However, the main focus of the current study is on the signal processing and machine learning algorithms. The feature extraction step extracts information to represent the heartbeats, which are then categorized into different classes using machine learning tools (Chui et al., 2016; Faezipour et al., 2010; Fraser et al., 2014). Several feature extraction algorithms have been reported in the literature. In most techniques, features in the time (Chazal et al., 2004; Ince et al., 2009; Melgani and Bazi, 2008; Minami et al., 1999), frequency (Minami et al., 1999), TF (Pourbabaee et al., 2018; Raj et al., 2015a), and nonlinear dynamics (Lagerholm et al., 2000; Linh et al., 2003; Raj et al., 2015b; Raj and Ray, 2015c, 2017a) domains are extracted and passed to the classification stage to achieve higher accuracy. Sedjelmaci et al. gathered the varying R-wave to R-wave and ST segment information from every heartbeat as features, considering the variable heart rate of a subject; this time-domain information is applied to detect four kinds of cardiac abnormality, that is, arrhythmia. Rai et al. employed the discrete wavelet transform for feature extraction along with morphological features and an artificial neural network (ANN) to classify the different categories of heartbeats. George et al. filtered the heartbeat recordings and computed the fractal dimension of the ECG using Hurst power to identify four kinds of arrhythmias. Martis et al. (2013) employed the classical Pan–Tompkins (PT) algorithm (Pan and Tompkins, 1985) for determining the position of the R-peak within the ECG signals; thereafter, the wavelet transform is applied to filter the signals, and the features representing each heartbeat are classified into three classes of arrhythmias by employing support vector machines (SVMs), with a reported accuracy of more than 99%. Vafaie et al. classified the features using ANNs whose parameters are optimized using a genetic algorithm and reported an accuracy of 98.67% in detecting potential arrhythmias. Mary et al. employed the detrended fluctuation analysis (DFA) technique to compute fractal dimensions, which are further used to distinguish between normal and abnormal heartbeats. Mhetre et al. gathered the morphological information of the heartbeats as features and applied them for identification of ventricular beats using an expert system. Raj and Ray (2017a) proposed a discrete cosine transform (DCT) based discrete orthogonal Stockwell transform (DOST) method for extracting significant characteristics from the heartbeats; these features are then classified by SVMs whose parameters are gradually tuned by particle swarm optimization. This chapter aims to develop a method that combines feature extraction and machine learning algorithms to analyze heartbeats efficiently in a real scenario. Here, the Stockwell transform (ST) is utilized to extract the TF information from the heartbeats. These features represent each ECG signal and are further identified using artificial bee colony (ABC) tuned twin support vector machines (TSVMs) into different categories of heartbeats. The developed methodology is validated on the PhysioNet data under the patient-specific scheme to estimate its accuracy. The experimental results report higher recognition accuracy than previous studies.
The rest of the chapter is organized as follows: the methods are summarized in Section 3.2, while the proposed methodology for automated heartbeat diagnosis is explained in Section 3.3. The simulation results are presented in Section 3.4, and lastly, Section 3.5 concludes the chapter.
3.2 Methods

This section highlights the methods used in this study: the S-transform for feature extraction, the TSVM for recognition, and ABC for optimizing the TSVM parameters.
3.2.1 Stockwell transform

The ST (Stockwell, 2007) is a TF analysis tool for one-dimensional signals. It can be considered a generalization of the short-time Fourier transform and an extension of the continuous wavelet transform (CWT), and it addresses the drawbacks of the CWT. The ST provides a frequency-dependent resolution of the TF space. Further, it provides absolutely referenced phase information of the input signal, and this combination is unique. The ST (Stockwell, 2007) can be derived as a phase-corrected CWT with a Gaussian window function:

S_x(t, f) = \int_{-\infty}^{+\infty} x(\tau)\, |f|\, e^{-\pi (t-\tau)^2 f^2}\, e^{-j 2\pi f \tau}\, d\tau

The above expression shows that the S-transform can be expressed as the convolution of x(\tau) e^{-j 2\pi f \tau} and |f| e^{-\pi t^2 f^2}. Applying the Fourier transform to both sides gives

S_x(t, f) = \int_{-\infty}^{+\infty} X(f + \alpha)\, e^{-\pi \alpha^2 / f^2}\, e^{j 2\pi \alpha t}\, d\alpha

The discrete Stockwell transform (Stockwell, 2007) can be written as

S_x(n\Delta_T, m\Delta_F) = \sum_{p=0}^{N-1} X[(p + m)\Delta_F]\, e^{-\pi p^2 / m^2}\, e^{j 2\pi p n / N}

Here, t = n\Delta_T, f = m\Delta_F, and \alpha = p\Delta_F, where \Delta_T is the sampling interval and \Delta_F is the frequency sampling interval. The ST localizes the real and imaginary spectra and determines the local amplitude and phase spectrum (Stockwell, 2007), which makes the multiresolution decomposition of the ST useful.
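To make the discrete formulation above concrete, the following NumPy sketch computes the voices of the discrete S-transform from the DFT of the signal. The function name, the 1/N normalization of the DFT, and treating the zero-frequency voice as the signal mean are implementation assumptions rather than details taken from the chapter.

```python
import numpy as np

def stockwell_transform(x):
    """Discrete S-transform of a 1-D signal x of length N.

    Returns a complex array S of shape (N // 2 + 1, N): row m is the voice at
    frequency m*dF, column n the time index n*dT, following
    S[n, m] = sum_p X[(p + m) dF] * exp(-pi p^2 / m^2) * exp(j 2 pi p n / N).
    """
    x = np.asarray(x, dtype=float)
    N = len(x)
    X = np.fft.fft(x) / N                      # DFT with 1/N normalization
    n_voices = N // 2 + 1
    S = np.zeros((n_voices, N), dtype=complex)
    S[0, :] = x.mean()                         # zero-frequency voice: signal mean
    p = np.arange(N)
    for m in range(1, n_voices):
        gauss = np.exp(-np.pi * p ** 2 / m ** 2)    # Gaussian window in frequency
        shifted = X[(p + m) % N]                    # X[(p + m) dF]
        S[m, :] = N * np.fft.ifft(shifted * gauss)  # sum over p with e^{j2*pi*p*n/N}
    return S

# usage sketch: S = stockwell_transform(one_heartbeat); np.abs(S) gives the TF map
```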
3.2.2 Twin support vector machines

The TSVM (Hsu and Lin, 2002; Tomar and Agarwal, 2015) is a machine learning tool employed here for classifying the input data. It was originally proposed to solve binary classification problems as a binary classifier built from two nonparallel planes. It generates two nonparallel hyperplanes (Hsu and Lin, 2002; Tomar and Agarwal, 2015) around which the data points of the corresponding classes cluster: each plane is closer to one of the two classes and as far as possible from the other. In TSVMs (Hsu and Lin, 2002; Tomar and Agarwal, 2015), the data points are split such that the objective function is built from the patterns of one class while the constraints are formed from the patterns of the other class (Hsu and Lin, 2002; Tomar and Agarwal, 2015). The two hyperplanes are found by formulating a pair of quadratic programming problems (QPPs), one for each class, in contrast to the standard SVM in which a single QPP is formulated. A new point is classified according to which of the two hyperplanes it lies closest to. The TSVM (Hsu and Lin, 2002; Tomar and Agarwal, 2015) is approximately four times faster than the traditional SVM. Figure 3.1 shows the hyperplanes separating the two classes of a data problem in the higher-dimensional space.
Figure 3.1: Hyperplanes separating the two classes (Class I and Class II) in the twin support vector machine classification scheme.
In the real space of d dimensions, R^d, the matrix X1 represents the positive-class data samples while X2 represents the negative-class data samples. A set of two hyperplanes that are nonparallel to each other (Hsu and Lin, 2002; Tomar and Agarwal, 2015) in this space can be represented as

x^T w_1 + z_1 = 0 and x^T w_2 + z_2 = 0

Here, z_1 and z_2 represent the bias terms of the hyperplanes, whereas w_1 and w_2 denote the vectors normal to the hyperplanes. The TSVM formulations for linear and nonlinear classification are reported in (Hsu and Lin, 2002; Tomar and Agarwal, 2015). Here, the pth class is trained against the qth class: in the training phase, the data samples of the pth class are taken as the positive class, whereas the data samples of the qth class are taken as the negative class, and vice versa (Hsu and Lin, 2002; Tomar and Agarwal, 2015). This
work employs the TSVM classifier model because its computational complexity is about four times lower than that of the standard SVM while it achieves comparable accuracy (Hsu and Lin, 2002; Tomar and Agarwal, 2015). Therefore, the TSVM method is chosen as an alternative for the analysis of heartbeats.
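As an illustration of the two-hyperplane idea, the snippet below sketches a linear least-squares variant of the twin SVM (in the spirit of Tomar and Agarwal, 2015), which replaces the two QPPs by closed-form linear systems. The regularization constants and the small ridge term are assumptions; this is not the exact solver used in the chapter.

```python
import numpy as np

def fit_lstsvm(A, B, c1=1.0, c2=1.0):
    """Fit a linear least-squares twin SVM.

    A: rows are positive-class samples, B: rows are negative-class samples.
    Returns (w1, b1), (w2, b2) for the two nonparallel hyperplanes
    x.w1 + b1 = 0 (close to class A) and x.w2 + b2 = 0 (close to class B).
    """
    e1 = np.ones((A.shape[0], 1))
    e2 = np.ones((B.shape[0], 1))
    H = np.hstack([A, e1])                     # [A  e1]
    G = np.hstack([B, e2])                     # [B  e2]
    ridge = 1e-8 * np.eye(H.shape[1])          # numerical safeguard
    # closed-form solutions of the two regularized least-squares problems
    z1 = -np.linalg.solve(H.T @ H / c1 + G.T @ G + ridge, G.T @ e2)
    z2 = np.linalg.solve(G.T @ G / c2 + H.T @ H + ridge, H.T @ e1)
    return (z1[:-1, 0], z1[-1, 0]), (z2[:-1, 0], z2[-1, 0])

def predict_lstsvm(X, plane1, plane2):
    """Assign each row of X to the class whose hyperplane lies closer."""
    (w1, b1), (w2, b2) = plane1, plane2
    d1 = np.abs(X @ w1 + b1) / np.linalg.norm(w1)
    d2 = np.abs(X @ w2 + b2) / np.linalg.norm(w2)
    return np.where(d1 <= d2, 1, -1)           # +1 -> class A, -1 -> class B
```

A kernelized version replaces A and B by kernel matrices evaluated against the training data, which is how nonlinear (e.g., RBF) decision surfaces are obtained.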
3.2.3 Artificial bee colony

ABC (Karaboga, 2005) is a nature-inspired algorithm that is widely used as an optimization tool. An optimization algorithm follows a set of rules to allocate scarce resources in the best possible way. The ABC technique is a population-based search algorithm in which artificial bees modify the food-source positions over time. The aim of the bees is to determine the positions of the food sources with the largest amounts of nectar and, finally, the single source with the highest nectar amount. The bee system consists of two significant components: (a) food sources and (b) foragers (Karaboga, 2005). There are three types of foragers: unemployed foragers, employed foragers, and experienced foragers. The ABC technique comprises four main stages, that is, the initialization phase, the employed bee phase, the onlooker bee phase, and the scout phase (Karaboga, 2005). The flowchart of the ABC algorithm is depicted in Figure 3.2.
Figure 3.2: Flowchart of the artificial bee colony method (generation of P random solutions, followed by the employed bee, onlooker bee, and scout bee phases repeated until convergence).
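A compact sketch of the four ABC phases for minimizing a generic objective is given below. The colony size, abandonment limit, and greedy-selection details follow common descriptions of Karaboga's method and are assumptions rather than the chapter's exact settings.

```python
import numpy as np

def abc_minimize(objective, bounds, colony_size=20, limit=20, max_iter=100, seed=0):
    """Artificial bee colony minimization of `objective` over a box `bounds`.

    bounds: sequence of (low, high) pairs, one per dimension.
    Returns the best solution found and its objective value.
    """
    rng = np.random.default_rng(seed)
    bounds = np.asarray(bounds, float)
    dim, n_food = len(bounds), colony_size // 2    # one employed bee per source
    low, high = bounds[:, 0], bounds[:, 1]

    foods = rng.uniform(low, high, size=(n_food, dim))   # initialization phase
    costs = np.array([objective(f) for f in foods])
    trials = np.zeros(n_food, dtype=int)
    best_x, best_c = foods[costs.argmin()].copy(), costs.min()

    def try_neighbour(i):
        nonlocal best_x, best_c
        k = rng.choice([j for j in range(n_food) if j != i])
        d = rng.integers(dim)
        phi = rng.uniform(-1, 1)
        cand = foods[i].copy()
        cand[d] = np.clip(cand[d] + phi * (cand[d] - foods[k][d]), low[d], high[d])
        c = objective(cand)
        if c < costs[i]:                               # greedy selection
            foods[i], costs[i], trials[i] = cand, c, 0
            if c < best_c:
                best_x, best_c = cand.copy(), c
        else:
            trials[i] += 1

    for _ in range(max_iter):
        for i in range(n_food):                        # employed bee phase
            try_neighbour(i)
        fitness = 1.0 / (1.0 + costs - costs.min())    # onlooker bee phase
        probs = fitness / fitness.sum()
        for i in rng.choice(n_food, size=n_food, p=probs):
            try_neighbour(i)
        for i in np.where(trials > limit)[0]:          # scout phase: abandon sources
            foods[i] = rng.uniform(low, high)
            costs[i], trials[i] = objective(foods[i]), 0
            if costs[i] < best_c:
                best_x, best_c = foods[i].copy(), costs[i]
    return best_x, best_c
```

For example, `abc_minimize(lambda v: (v ** 2).sum(), bounds=[(-5, 5)] * 3)` returns a point close to the origin.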
3.3 Proposed method

The proposed methodology comprises a combination of four significant stages, namely preprocessing, R-wave localization, feature extraction, and feature recognition, depicted in Figure 3.3, to classify different categories of heartbeats. Initially, the raw ECG signals extracted from the human body are preprocessed to improve their quality by removing the various noises introduced during data acquisition. The preprocessing step is followed by the feature extraction stage, where significant characteristics of the heartbeats are extracted to represent them in lower dimensions. Finally, these features are applied to a learning algorithm for classification.
Figure 3.3: Steps involved in automated classification of cardiac arrhythmias (input from the PhysioNet database, preprocessing, R-peak detection and ECG segmentation, feature extraction using the S-transform, feature classification using the ABC-optimized TSVM, and the resulting confusion matrix).
3.3.1 ECG data

The proposed methodology is validated on a well-known database, that is, the Massachusetts Institute of Technology–Beth Israel Hospital (MIT-BIH) arrhythmia database (Moody and Mark, 2001). The database contains the recordings of 47 subjects comprising 48 records with a total of 110,109 heartbeat labels (Moody and Mark, 2001). The data contain annotations of the signals, which are used to formulate the results in the supervised learning classifier mechanism. A total of 16 classes of heartbeats, including normal, are available in the database for experimental purposes. The data are sampled at a rate of 360 Hz, and the signals are digitized using an analog-to-digital converter with 11-bit resolution over a 10 mV range (Moody and Mark, 2001). For the experiments, all the records comprising different classes of signals are used, filtered using a band-pass filter with cut-off frequencies of 0.1 Hz and 100 Hz (Moody and Mark, 2001). To determine the training and testing datasets, a certain fraction of all 16 categories of heartbeats is selected and mapped as per the AAMI recommendations under the subject-specific scheme. Under this scheme, the paced records, that is, 102, 104, 107, and 217, are not considered in either the training or the testing datasets. Hence, only 44 records are considered to conduct the experiments. Here, two different types of analysis are performed. In the first, the records are divided equally (i.e., 22 records each) to constitute the datasets. The mapping of ECG beat categories from the PhysioNet database to the AAMI recommendation is presented in (Raj and Ray, 2018e). Overfitting of the classifier model is avoided by conducting 22-fold cross-validation (CV) (Stone, 1974) over the entire dataset.
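For illustration, the records and their beat annotations could be fetched with the open-source wfdb Python package as sketched below; the chapter does not state which tooling was used, so the package, the channel choice, and the placeholder ALL_RECORD_NAMES list are assumptions.

```python
import wfdb  # pip install wfdb

EXCLUDED = {"102", "104", "107", "217"}   # paced records, dropped as described above

def load_mitbih_record(name, pn_dir="mitdb"):
    """Fetch one MIT-BIH record and its beat annotations from PhysioNet."""
    record = wfdb.rdrecord(name, pn_dir=pn_dir)        # signals + header
    ann = wfdb.rdann(name, "atr", pn_dir=pn_dir)       # reference beat labels
    signal = record.p_signal[:, 0]                     # first channel samples
    return signal, record.fs, ann.sample, ann.symbol   # fs is 360 Hz

# usage sketch over the 44 usable records (ALL_RECORD_NAMES is hypothetical):
# for name in sorted(set(ALL_RECORD_NAMES) - EXCLUDED):
#     sig, fs, r_locs, labels = load_mitbih_record(name)
```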
3.3.1.1 Preprocessing

The performance of any diagnosis system is greatly affected by the quality of the heartbeats. The noise associated with a heartbeat may include baseline drift, power-line interference, muscle artifacts, contact noise, electrosurgical noise, and quantization noise. It is necessary to eliminate these different kinds of noise, failing which false alarms can occur. Further, this step enhances the signal-to-noise ratio, which helps in accurate detection of the fiducial points within the heartbeats. Different filters are employed to remove the different kinds of noise. A set of two median filters is employed for eliminating the baseline wander within the heartbeats (Raj and Ray, 2018e): a primary 200 ms median filter is used to demarcate the QRS-wave and P-wave (Raj and Ray, 2018e), whereas a secondary 600 ms filter demarcates the T-wave within the heartbeat. Finally, the baseline wander is removed by subtracting the output of the secondary filter from the raw ECG data (Raj and Ray, 2018e). Thereafter, the power-line interference and high-frequency noises are removed by passing the baseline-corrected heartbeat through a 12-tap low-pass filter (LPF) (Raj and Ray, 2018e). This LPF has a cut-off frequency of 35 Hz with equal ripples in the pass and stop bands. The output of this filter is the preprocessed heartbeat, which is passed to the R-wave localization and segmentation steps for automated recognition of ECG signals (Raj and Ray, 2018e). Figures 3.4 and 3.5 show the raw heartbeat and the filtered ECG output from record #237 of the database, respectively.
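A sketch of this preprocessing chain using SciPy is shown below. The FIR design by the window method is an assumption, since the chapter only states that a 12-tap, 35 Hz equiripple low-pass filter is used.

```python
import numpy as np
from scipy.signal import medfilt, firwin, filtfilt

def preprocess_ecg(x, fs=360):
    """Remove baseline wander and high-frequency noise from one ECG channel."""
    # two-stage median filtering: 200 ms then 600 ms windows (odd sample counts)
    w1 = int(0.200 * fs) | 1
    w2 = int(0.600 * fs) | 1
    baseline = medfilt(medfilt(x, w1), w2)     # estimate of the wandering baseline
    x_bw = np.asarray(x) - baseline            # baseline-corrected signal
    # 12-tap low-pass FIR filter with ~35 Hz cut-off
    taps = firwin(12, 35, fs=fs)
    return filtfilt(taps, [1.0], x_bw)
```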
Figure 3.4: Raw electrocardiogram signal corrupted with noise (record #237).
3.3.2 Localization of R-wave and heartbeat segmentation

This study classifies the different types of arrhythmias based on the localization of R-waves within the ECG signals. Prior to segmenting the ECG signals for feature extraction, it is necessary to determine the locations of the R-waves. Many research works have been reported in the literature for detecting R-peaks (Raj et al., 2018f).
Figure 3.5: Preprocessed electrocardiogram signal (record #237).
Among these, this study employs the well-established PT algorithm (Pan and Tompkins, 1985). It is chosen for its proven low computational burden and high performance in noisy environments. The detected R-waves are verified against the positions of the R-peak annotations provided in the database. Figure 3.6 depicts the R-wave localization within the heartbeats of record #237 of the database.
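For orientation, a simplified Pan–Tompkins-style detector (band-pass filtering, differentiation, squaring, moving-window integration, and peak picking) is sketched below; the fixed thresholds and the use of SciPy's find_peaks stand in for the original adaptive-threshold logic and are assumptions.

```python
import numpy as np
from scipy.signal import butter, filtfilt, find_peaks

def detect_r_peaks(x, fs=360):
    """Approximate R-peak locations (sample indices) in a preprocessed ECG."""
    # 1) band-pass around the QRS energy (5-15 Hz)
    b, a = butter(2, [5, 15], btype="bandpass", fs=fs)
    y = filtfilt(b, a, x)
    # 2) derivative, squaring, and moving-window integration (~150 ms window)
    dy = np.diff(y, prepend=y[0])
    sq = dy ** 2
    win = int(0.150 * fs)
    mwi = np.convolve(sq, np.ones(win) / win, mode="same")
    # 3) peak picking with a ~200 ms refractory period and an amplitude floor
    peaks, _ = find_peaks(mwi, distance=int(0.200 * fs), height=0.35 * mwi.max())
    # 4) refine each detection to the largest raw-ECG sample nearby (+/- 75 ms)
    half = int(0.075 * fs)
    r_peaks = [max(range(max(p - half, 0), min(p + half, len(x))),
                   key=lambda i: x[i]) for p in peaks]
    return np.array(r_peaks)
```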
Figure 3.6: Fiducial point (P, Q, R, S, T) detection within the ECG.
In this study, the segmentation step differs from almost all reported works, which use a rectangular window of constant duration or a fixed number of samples. A new segmentation step is proposed in which proportional segmentation around the QRS-wave is considered: for each heartbeat, 35% of the interval toward the preceding (anterior) R-peak is taken on the left and 65% of the interval toward the following (posterior) R-peak is taken on the right to estimate the length of every heartbeat, as sketched below. This ensures that all the information of the ECG from the starting P-wave to the ending T-wave is preserved and no information regarding any wave is lost.
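A minimal sketch of this proportional segmentation, under the assumption that each window runs from 35% of the preceding R-R interval before the current R-peak to 65% of the following R-R interval after it:

```python
import numpy as np

def segment_beats(x, r_peaks, left_frac=0.35, right_frac=0.65):
    """Cut one window per beat, proportional to the neighbouring R-R intervals."""
    beats = []
    for i in range(1, len(r_peaks) - 1):
        prev_rr = r_peaks[i] - r_peaks[i - 1]
        next_rr = r_peaks[i + 1] - r_peaks[i]
        start = int(r_peaks[i] - left_frac * prev_rr)
        end = int(r_peaks[i] + right_frac * next_rr)
        beats.append(x[max(start, 0):min(end, len(x))])
    return beats   # list of 1-D arrays, one per inner beat
```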
3.3.3 Input representation

For any pattern recognition system, a significant role is played by the feature extraction stage. These features represent the characteristics of an input heartbeat.
In the literature, several types of features have been extracted from the heartbeats, such as fiducial points within the ECG signals, wavelet features, RR-interval features, and higher order cumulants. Many works concatenate different features together, which results in higher accuracy. On the contrary, concatenating different types of features increases the computational complexity of the classification system. Therefore, an efficient representation of the input is essential for any classification system. This chapter employs the ST (Stockwell, 2007) to capture significant TF information as features of the heartbeats to detect potential arrhythmias. The ST (Stockwell, 2007) provides a frequency-dependent resolution of the TF space, and its combination with the absolutely referenced local phase characteristics of the input data is unique. The ST (Stockwell, 2007) applied to an input heartbeat of length N provides a TF coefficient matrix of N × N entries. From these ST coefficients, the following statistics are extracted as features: energy, standard deviation, mean, kurtosis, entropy, and skewness. These features are concatenated to form a feature set representing the heartbeats and correspond to the morphological characteristics of every heartbeat.
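A sketch of this feature reduction step is given below; taking the magnitude of the complex ST coefficients and using a normalized Shannon entropy are assumptions.

```python
import numpy as np
from scipy.stats import kurtosis, skew

def st_features(S):
    """Six summary statistics of the |S-transform| coefficients of one beat."""
    mag = np.abs(S).ravel()
    p = mag / (mag.sum() + 1e-12)                  # normalize for the entropy term
    entropy = -np.sum(p * np.log2(p + 1e-12))
    return np.array([
        np.sum(mag ** 2),      # energy
        np.std(mag),           # standard deviation
        np.mean(mag),          # mean
        kurtosis(mag),         # kurtosis
        entropy,               # entropy
        skew(mag),             # skewness
    ])
```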
3.3.4 Heartbeat variability features

Additionally, the R-wave to R-wave information between consecutive heartbeats is measured to estimate the heart rate variability. Four types of R-R features that characterize the rhythm of a heartbeat are calculated, namely the previous, post, local, and average R-peak to R-peak intervals. The period between the previous and the present R-wave gives the previous R-R feature, and the period between the present R-wave and the upcoming R-wave gives the post R-R feature; the combination of the previous and post R-R periods can be regarded as an instantaneous rhythm feature. The average R-R feature is estimated by averaging the R-R intervals over the previous three-minute episode of each heartbeat. Similarly, the local R-R feature is estimated by averaging all the R-R intervals over the previous eight-second episode of the current heartbeat. Both the average and local features therefore correspond to average characteristics of a series of heartbeats. Finally, these heartbeat variability features are concatenated with the output coefficients obtained from the S-transform of the particular heartbeat.
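A sketch of these four R-R features follows; the 8 s and 3 min windows are taken from the text, while the 360 Hz sampling rate and the fallback for beats with an empty window are assumptions.

```python
import numpy as np

def rr_features(r_peaks, i, fs=360):
    """Previous, post, local (8 s) and average (3 min) R-R features of beat i.

    Valid for 1 <= i <= len(r_peaks) - 2 (beats with both neighbours present).
    """
    r = np.asarray(r_peaks)
    rr = np.diff(r) / fs                       # R-R intervals in seconds
    prev_rr = rr[i - 1]                        # previous R to current R
    post_rr = rr[i]                            # current R to next R
    t = r[i] / fs
    starts = r[:-1] / fs                       # start time of every R-R interval
    local = rr[(starts >= t - 8) & (starts < t)]      # last 8 seconds
    avg = rr[(starts >= t - 180) & (starts < t)]      # last 3 minutes
    local_rr = local.mean() if local.size else prev_rr
    avg_rr = avg.mean() if avg.size else prev_rr
    return np.array([prev_rr, post_rr, local_rr, avg_rr])
```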
3.3.4.1 Feature classification

In this work, a TSVM classifier model (Hsu and Lin, 2002; Tomar and Agarwal, 2015) is used to recognize the extracted ST and heart rate variability features representing each heartbeat among 16 classes. TSVMs were initially designed to solve binary recognition problems; this study addresses the multicategory recognition problem by using the one-against-one (OAO) scheme (Hsu and Lin, 2002; Tomar and Agarwal, 2015). Under this scheme, the selection of the kernel argument and the cost-function parameter plays a significant role in reaching a higher classification performance. The use of kernel functions enables nonlinear classification of the features; in other words, kernels help achieve better classification performance on nonlinear or overlapping features in the high-dimensional space. Accordingly, the linear, radial basis function (RBF), multilayer perceptron, and polynomial kernels are all used to analyze the accuracy. A summary of the classification accuracy of these different kernel functions is reported in Table 3.1, which shows that the RBF kernel achieved the highest accuracy among all the kernels utilized for classification. Note that the training and testing datasets assumed to measure this performance are the same as presented in Section 3.3.1. Therefore, only the classification performance of the RBF kernel is reported and studied in detail.
Table 3.1: Accuracy reported using different kernel functions.

Kernel                     Training performance (%)    Testing performance (%)
Linear                     .                           .
Radial basis function      .                           .
Multilayer perceptron      .                           .
Polynomial                 .                           .

*The performance is computed in terms of accuracy.
Further, the kernel (γ) and cost function (C) parameters are gradually optimized using the ABC algorithm to enhance the accuracy of the classifier model beyond the results reported in Table 3.1. The steps involved in the implementation of the ABC technique are presented in Karaboga (2005). The fitness of each food source is evaluated by the ABC technique (Karaboga, 2005), which aims to determine the optimal classifier model. Since the current study involves the classification of five categories of heartbeats under the OAO scheme, the learning parameters of 10 binary classifiers are tuned by the ABC algorithm. Here, the simple support-vector count is chosen as the fitness criterion of the ABC framework, restricting the error bound and resulting in unbiased performance of the recognition model. During the training stage,
the learning parameters of the TSVM model, that is, C and γ, are optimized by employing the ABC technique. The parameters are selected based on the m-fold CV technique (Stone, 1974) in the training stage. Biasing of the classifier model is avoided by conducting 22-fold CV over the training and testing datasets in order to obtain better estimates of the C and γ parameters, which results in the best classification performance of the developed model.
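For concreteness, the kind of cross-validated objective such a search minimizes is sketched below over log-scaled (C, γ); the use of scikit-learn's RBF SVC as a stand-in for the TSVM and the abc_minimize helper from the sketch in Section 3.2.3 are assumptions.

```python
import numpy as np
from sklearn.svm import SVC                       # stand-in for the twin SVM
from sklearn.model_selection import cross_val_score

def cv_error(log_params, X, y, folds=22):
    """Cross-validated error for log10-scaled (C, gamma) of an RBF classifier."""
    C, gamma = 10.0 ** np.asarray(log_params)
    clf = SVC(C=C, gamma=gamma, kernel="rbf")
    return 1.0 - cross_val_score(clf, X, y, cv=folds).mean()

# search log10(C) in [-2, 4] and log10(gamma) in [-4, 2] with the ABC sketch:
# best, err = abc_minimize(lambda p: cv_error(p, X_train, y_train),
#                          bounds=[(-2, 4), (-4, 2)], max_iter=30)
```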
3.3.5 Performance metrics

After the confusion matrix is estimated, the performance for every category of ECG beat is determined in terms of five metrics, namely accuracy (AC), sensitivity (SE), F-score (Fs), error (ER), and positive predictivity (PP). The SE is defined as the ratio of correctly recognized instances of a class to the total number of instances of that class, SE = TP/(TP + FN) (Raj and Ray, 2018e). The PP is the ratio of correctly recognized ECG beats to the total number of detected ECG beats, PP = TP/(TP + FP) (Raj and Ray, 2018e). The AC is the fraction of correctly identified instances among all classified instances, AC = (TP + TN)/(TP + TN + FP + FN), and the F-score is Fs = 2TP/(2TP + FN + FP). The aforesaid performance metrics are calculated on the benchmark PhysioNet data for the developed method, which is analyzed under the patient-specific scheme (Raj and Ray, 2018e).
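A sketch that derives these per-class metrics directly from a confusion matrix whose rows are the ground truth and whose columns are the predictions (the convention used in Tables 3.2 and 3.3):

```python
import numpy as np

def per_class_metrics(cm):
    """SE, PP, Fs, AC, and ER per class from a (rows=truth, cols=predicted) matrix."""
    cm = np.asarray(cm, dtype=float)
    tp = np.diag(cm)
    fn = cm.sum(axis=1) - tp          # missed beats of each class
    fp = cm.sum(axis=0) - tp          # beats wrongly assigned to each class
    tn = cm.sum() - tp - fn - fp
    se = tp / (tp + fn)
    pp = tp / (tp + fp)
    fs = 2 * tp / (2 * tp + fn + fp)
    ac = (tp + tn) / cm.sum()
    return {"SE": se, "PP": pp, "Fs": fs, "AC": ac, "ER": 1 - ac}
```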
3.4 Simulation results

The developed method is implemented on a personal computer (Windows 10 platform, Intel Core i5 CPU, 2.5 GHz, 8 GB RAM) in the MATLAB (version 8.2, R2018b) simulation environment, and its performance is estimated by identifying various classes of heartbeats under the patient-specific assessment scheme. The recognition performance of the proposed method is demonstrated by performing the experiments on the benchmark MIT-BIH arrhythmia data described in Section 3.3.1. The performance of the trained classifier is reported in terms of a confusion matrix, as presented in Tables 3.2 and 3.3, for every category of heartbeat under the patient-specific strategy. The confusion matrix is formulated by mapping the correctly identified and misidentified instances into the corresponding categories assigned by the developed methodology (Raj and Ray, 2018e). The columns of the confusion matrix denote the cardiac events identified by the proposed method, whereas the rows denote the ground truth, or annotations, used for reference as provided in the benchmark database (Moody and Mark, 2001). Of the 43,112 and 89,414 testing ECG instances, 81,803 instances are correctly identified by the developed methodology, which reported overall accuracies of 86.53% and 88.22%, respectively, for the equally split dataset and the 22-fold CV strategy, along with error rates of 13.47% and 11.78%.
Table 3.2: Accuracy reported using the equally split dataset (confusion matrix: rows give the ground-truth categories N, S, V, F, and Q; columns give the classified categories n, s, v, f, and q).
Table 3.3: Accuracy reported using 22-fold CV (confusion matrix: rows give the ground-truth categories N, S, V, F, and Q; columns give the classified categories n, s, v, f, and q).
In Tables 3.2 and 3.3, it is observed that the accuracy of classes "e" and "q" is considerably lower than that of the other categories of heartbeats, owing to the smaller number of ECG instances available for training in these two classes. It must be noted that the experiments are conducted on all the data available for these two classes in the benchmark PhysioNet database. After the confusion matrix is computed, the performance metrics, namely sensitivity (SE), F-score (Fs), and positive predictivity (PP), are calculated for every category of heartbeat. Before determining these metrics, it is necessary to calculate the true positive (TP), false positive (FP), and false negative (FN) counts for each category of heartbeat, as shown in Tables 3.4 and 3.5. Under the patient-specific scheme, Tables 3.4 and 3.5
Table 3.4: Metrics under the equally split dataset (TP, FP, FN, SE, PP, and Fs for each heartbeat category).

Table 3.5: Metrics under 22-fold CV (TP, FP, FN, SE, PP, and Fs for each heartbeat category).
present the Fs, PP, and SE parameters computed for all five categories of heartbeats, which are reported to be overall 86.53% each for the equally split dataset and 88.22% each for the 22-fold CV strategy, respectively. In Figures 3.7 and 3.8, the three bars for each class represent the values of sensitivity, positive predictivity, and F-score, respectively.
Figure 3.7: Metrics plot under the equally split datasets.
Figure 3.8: Metrics plot under 22-fold cross-validation.
Here, class 1 indicates the normal category of heartbeat, and similarly 2, 3, 4, and 5 indicate the s, f, v, and q categories of heartbeats along the x-axis. In addition, VL denotes the value of each metric (out of 100) along the y-axis.
3.4.1 Comparative study

A brief comparison is presented between the accuracy achieved by the developed method and that of previous methods available in the literature under the patient-specific assessment scheme. A fair comparison is quite tedious to make because different works have been evaluated on different databases and with different numbers and classes of heartbeats. A few works have validated their methods on datasets obtained from healthcare units or patients. Taking these factors into account, a comparative study against existing methodologies evaluated on the benchmark PhysioNet arrhythmia data is presented in Table 3.6.
Feature extraction
Classifier
Classes
Accuracy (%)
Nambhash et al. ()
WT
Fuzzy
(avg.)
Oresko et al. ()
RR intervals
BPNN
Proposed
ST
ABC + TSVM
All MIT
.
WT, wavelet transform; BPNN, backpropagation neural networks; ABC, artificial bee colony; TSVM, twin support vector machines.
Table 3.6 shows that the proposed methodology achieves higher accuracy under the patient-specific scheme when compared with the methodologies available
in the literature. In comparison with some of the works, the current study also identifies a greater number of cardiac event classes. Since the number of heartbeats varies greatly among the different classes in the datasets, the metrics reported for a particular class can be considered more reliable and significant. As the current work achieves a higher accuracy than the existing works, it can be concluded that the features extracted in the TF space using the ST technique, together with the heart rate variability features, are efficient and significant in discriminating between the various categories of heartbeats when combined with the ABC-optimized TSVM to detect and recognize heartbeats.
3.5 Conclusion and future scope

This chapter reported a new method combining ST-based morphological features with heart rate variability features as dynamic features. The final feature vector is applied as input to ABC-optimized TSVMs for automated recognition of heartbeats among 16 classes. The developed method can be utilized to monitor long-term heartbeat recordings and to analyze the nonstationary behavior of heartbeats. The developed method is validated on the PhysioNet data and evaluated under the patient-specific scheme, where it achieves an improved accuracy of 88.22%. In the future, this work is intended to include a greater number of heart rhythms for monitoring and analysis, and to develop more efficient algorithms and their implementations on mobile platforms. The developed methodology can be considered an efficient technique for computer-aided diagnosis of cardiovascular diseases, allowing subjects to lead a healthy lifestyle. The scope of the chapter can be extended to incorporate more heartbeat classes and analyze them using novel deep learning methodologies. Further, the proposed method can be prototyped on suitable hardware platforms to perform real-time analysis in a laboratory experimental setup. The prototyped hardware can then be fabricated into an embedded device that performs real-time analysis of heartbeats integrated with Internet-of-Things technology.
References

Chazal P.D., O'Dwyer M., and Reilly R.B. Automatic classification of heartbeats using ECG morphology and heartbeat interval features. IEEE Transactions on Biomedical Engineering, (2004), 51(7), 1196–1206.
Chui K.T., Tsang K.F., Chi H.R., Ling B.W.K., and Wu C.K. An accurate ECG-based transportation safety drowsiness detection scheme. IEEE Transactions on Industrial Informatics, (2016), 12(4), 1438–1452.
Faezipour M., Saeed A., Bulusu S., Nourani M., Minn H., and Tamil L. A patient-adaptive profiling scheme for ECG beat classification. IEEE Transactions on Information Technology in Biomedicine, (2010), 14(5), 1153–1165. Fraser G.D., Chan A.D., Green J.R., and Macisaac J.R. Automated biosignal quality analysis for electromyography using a one class support vector machine. IEEE Transactions on Instrumentation and Measurement, (2014), 63(12), 2919–2930. Garge G.K., Balakrishna C., and Datta S.K. Consumer health care: Current trends in consumer health monitoring. IEEE Consumer Electronics Magazine, (2018), 7(1), 38–46. Hsu C.W. and Lin C.-J. A comparison of methods for multiclass support vector machines. IEEE Transactions on Neural Networks, (2002), 13(2), 415–425. Hu Y.H., Palreddy S., and Tompkins W.J. “A patient-adaptable ECG beat classifier using a mixture of experts approach.” IEEE Transactions on Biomedical Engineering, (1997), 44(9), 891–900. Hu Y.H., Tompkins W.J., Urrusti J.L., and Afonso V.X. Applications of artificial neural networks for ECG signal detection and classification. Journal of Electrocardiology, (1993), 26, 66–73. Ince T., Kiranyaz S., and Gabbouj M. A generic and robust system for automated patient-specific classification of ECG signals. IEEE Transactions on Biomedical Engineering, (2009), 56(5), 1415–1426. Karaboga D. (2005), An idea based on honey bee swarm for numerical optimization. Comput. Eng. Dept., Erciyes Univ., Kayseri, Turkey, Rep. TR06, Oct. 2005. Lagerholm M., Peterson C., Braccini C., Edenbrandt L., and Sornmo L. Clustering ECG complexes using Hermite functions and self-organizing maps. IEEE Transactions on Biomedical Engineering, (2000), 47(7), 838–848. Linh T.H., Osowski S., and Stodoloski M. On-line heart beat recognition using Hermite polynomials and neuro-fuzzy network. IEEE Transactions on Instrumentation and Measurement, (2003), 52(4), 1224–1231. Martis R., Acharya U., Mandana K., Ray A., and Chakraborty C. Cardiac decision making using high order spectra. Biomedical Signal Processing and Control, (2013), 8(2), 193–203. Melgani F. and Bazi Y. Classification of electrocardiogram signals with support vector machines and particle swarm optimization. EEE Transactions on Information Technology in Biomedicine, (2008), 12(5), 667–677. Minami K., Nakajima H., and Toyoshima T. Real-time discrimination of ventricular tachyarrhythmia with Fourier-transform neural network. IEEE Transactions on Biomedical Engineering, (1999), 46(2), 179–185. Moody G.B. and Mark R.G. The impact of the MIT-BIH arrhythmia database. IEEE Engineering in Medicine and Biology Magazine, (2001), 20(3), 45–50. Nambakhsh M.S., Tavakoli V., and Sahba N. FPGA-core defibrillator using wavelet-fuzzy ECG arrhythmia classification. In Proceedings of IEEE Engineering in Medicine and Biology Society. Annual Conference, (2008), 2673–2676. Oresko J.J. A wearable smartphone-based platform for real-time cardiovascular disease detection via electrocardiogram processing. EEE Transactions on Information Technology in Biomedicine, (2010), 14(3), 734–740. Pan J. and Tompkins W.J. A real-time QRS detection algorithm. IEEE Transactions on Biomedical Engineering, (1985), 32(3), 230–236. Pourbabaee B., Roshtkhari M.J., and Khorasani K. Deep convolutional neural networks and learning ECG features for screening paroxysmal atrial fibrillation patients. IEEE Transactions on Systems Man Cybernetics-Systems, (2018), 48(12), 2095–2104. Raj S. 
(2018a), Development and hardware prototype of an efficient method for handheld arrhythmia monitoring device. Ph.D. Thesis IIT Patna. 2018, pp. 1–181.
Raj S. (2019). A real-time ECG processing platform for telemedicine applications Advances in Telemedicine for Health Monitoring: Technologies, Design, and Applications, IET, 2019, (Accepted). Raj S., Chand G.S.S., and Ray K.C. Arm-based arrhythmia beat monitoring system. Microprocessors and Microsystems, (2015a), 39(7), 504–511. Raj S., Luthra S., and Ray K.C. Development of handheld cardiac event monitoring system. IFAC Papers-On-Line, (2015b), 48(4), 71–76. Raj S. and Ray K.C. A comparative study of multivariate approach with neural networks and support vector machines for arrhythmia classification. In International Conference on Energy, Power and Environment: Towards Sustainable growth, (2015c), 1–6. Raj S. and Ray K.C. Cardiac arrhythmia beat classification using DOST and PSO tuned SVM. Computer Methods and Programs in Biomedicine, (2016a), 136, 163–177. Raj S. and Ray K.C. Application of variational mode decomposition and ABC optimized DAG-SVM in arrhythmia analysis. In Proceedings of 7th International Symposium in Embedded Computer System Design (ISED), (2017a), 1–5. Raj S. and Ray K.C. ECG signal analysis using DCT-based dost and PSO optimized SVM. IEEE Transactions on Instrumentation and Measurement, (2017b), 66(3), 470–478. Raj S. and Ray K.C. A personalized arrhythmia monitoring platform. Scientific Reports, (2018b), 8(11395), 1–11. Raj S. and Ray K.C. Automated recognition of cardiac arrhythmias using sparse decomposition over composite dictionary. Computer Methods and Programs in Biomedicine, (2018c), 165, 175–186. Raj S. and Ray K.C. Sparse representation of ECG signals for automated recognition of cardiac arrhythmias. Expert Systems with Applications, (2018d), 105, 49–64. Raj S. and Ray K.C. A personalized point-of-care platform for real-time ECG monitoring. IEEE Transactions on Consumer Electronics, (2018e), 66(4), 1–9. Raj S., Ray K.C., and Shankar O. Cardiac arrhythmia beat classification using DOST and PSO tuned SVM. Computer Methods and Programs in Biomedicine, (2016b), 136, 163–177. Raj S., Ray K.C., and Shankar O. Development of robust, fast and efficient QRS complex detector: A methodological review. Journal of the Australasian College of Physical Scientists and Engineers in Medicine, (2018f), 41(3), 581–600. Stockwell R.G. A basis for efficient representation of the S-transform. Digital Signal Processing, (2007), 17(1), 371–393. Stone M. Cross-validatory choice and assessment of statistical predictions. Journal of the Royal Statistical Society, Series B (Methodology), (1974), 36(2), 111–147. Tomar D. and Agarwal S. A comparison on multi-class classification methods based on least squares twin support vector machine. Knowledge Based Systems, (2015), 81, 131–147.
Shweta Sinha, Shweta Bansal
4 Computational intelligence approach to address the language barrier in healthcare

Abstract: In this aeon of globalization and economic growth, the fixed geographic boundaries of countries and continents no longer confine the mobility of people. Education and healthcare are two service sectors that have seen major changes in this respect. However, language diversity across the globe acts as an obstruction to a smooth transition from one part of the globe to another. The challenges due to linguistic diversity pose especially severe difficulties in the healthcare sector: miscommunication of any form in this sector can have far-reaching consequences that may turn out to be irreversible. These difficulties are amplified when a patient moves to a part of the world where very few people speak or understand his or her language. Solutions that translate speech in one language into speech in another can help overcome these difficulties. Automatic speech-to-speech (S2S) translation can make communication seamless and expand the horizon of the healthcare sector. This chapter discusses the advancements in natural language processing, the chief focus being the spoken aspect of language during communication. The chapter discusses the stringing together of three major technologies: automatic speech recognition, automated translation by machine, and conversion of text into spoken utterance, that is, text-to-speech, for seamless communication in healthcare services. Besides this, the technological developments and the implementation challenges at each step are identified and briefly discussed. The performance of the S2S system is evaluated in the healthcare domain.

Keywords: Healthcare, Speech-to-speech translation, Language barrier, Spoken language technology
Shweta Sinha, Amity University Haryana, India
Shweta Bansal, KIIT College of Engineering, Gurugram, India
https://doi.org/10.1515/9783110648195-004

4.1 Introduction

Communication through speech has been and will always remain a dominant mode of information sharing and social bonding among humans. In this era of the World Wide Web, big data, and powerful computing technologies, voice technology has taken the central place in human–machine interaction. This prominence is due to the naturalness of speech as a communication medium. Out of several artificial
intelligence applications, natural language processing (NLP) is the specialized branch focused on human-generated text and speech data. Uses of NLP ranging from simple functionality, like transcribing physician notes, to advanced applications, such as robot-assisted surgery guided by human instructions, are prevalent in the healthcare sector. On the contrary, the disparity in the healthcare sector due to communication difficulties is dominant (Casserly et al., 2010). The use of interpreters for bridging the language gap between the two parties is an old and traditional approach (Figure 4.1). Considerable advancement in the field of language technology has motivated the research community to work toward the automation of language translation.
Figure 4.1: Translation system use scenario in the medical domain (communication mediated either by a human interpreter or by an automatic translation device).
Developments in the domains of automatic speech recognition (ASR) and text-to-speech (TTS) have paved the way for extending the focus of machine translation (mainly text conversion) to spoken language translation. Speech-to-speech (S2S) translation is considered one of only a handful of advancements that are expected to influence the world positively. End-to-end translation of a speech signal from one language to the other (S2S) and vice versa will remove the language barrier arising from geographic differences. For a country like India that is making its place in the global market, speech translation technology will be a boon to the people. Language has its own place in healthcare. The importance of language in healthcare can be summarized through its role in:
– Exact and proper communication: In the medical domain, miscommunication can lead to dangerous outcomes. It is essential that the information disseminated by the doctor is exactly gauged by the patient and vice versa.
– Trust building: Allowing someone to operate upon your body involves a trust factor. The patient and doctor must be able to trust each other, and the language barrier is the major hindrance in this regard.
– Sharing past information: Initiating any treatment requires information regarding medical history, symptoms, and evidence. Only proper communication between the doctor and the patient can enable effective sharing of this information, and both parties can do it most conveniently in their own language.
– Time, money, and lives: Proper communication between patients and doctors can save time, money, and, more importantly, lives. Speech, the most natural mode of human communication, becomes an important medium in this regard.

Because language is such a critical factor influencing the quality of services in the healthcare sector, language interpreters have become an essential entity in this field, and owing to globalization, the demand for them is growing day by day. Undoubtedly, they ease the communication between doctors and patients by converting data from one language to the other, but to some extent this manual process lacks confidentiality. The communication between patient and doctor may involve discussions that require secrecy, which the manual translation process cannot guarantee because of the involvement of a third person. Apart from this, the correctness of manual translation depends on the individual interpreter's efficiency and knowledge, which may lead to mistaken diagnoses with severe consequences. Also, manual interpretation involves a high cost that may not be easily acceptable to all. These issues motivate speech researchers to extend voice technology for barrier-free communication in the healthcare sector. Although speech technology has progressed to a very advanced level and the accuracy of speech-based applications has reached new heights, serious issues remain in building automatic translation systems for healthcare because of the exceptional characteristics of the medical domain. The unique language and medical terminologies, together with the demand for high accuracy, pose several challenges to the development of an automatic translation system. These challenges can be sketched out as follows:
– The healthcare sector requires reliable, timely, and accurate translation, as any failure can have a devastating effect.
– Medical terminologies are difficult, as they are mainly multiword terms or eponyms.
– The vocabulary is continuously growing due to the addition of new diseases and treatment options.
– A large number of medical domain words and acronyms add complexity to the language.
– Extra-linguistic knowledge is required to overcome the grammar divergence between the source and the target language; parts may be added or deleted during translation to make it appropriate for the target language.
The associated issues and challenges require extra effort in the development of a translation system. Identifying the exact domain of communication and the jargon associated with it can help overcome many of these challenges.
4.2 Background and related research

Speech translation is a technology that translates the content of spoken language embedded in the signal into speech of the target language. Its importance lies in its ability to erase the language divide, enabling the global exchange of culture and information. The inception of speech translation took place in 1983, when it first grabbed attention: the NEC Corporation demonstrated speech translation as a concept exhibit at ITU Telecom World. In 1986, the Advanced Telecommunication Research Institute International (ATR) was founded with the purpose of performing the basic research required for speech translation. Around 1993, three sites, namely Carnegie Mellon University (CMU), ATR, and Siemens, collectively performed a speech translation experiment. Soon, projects in the area of speech translation started across the globe (Waibel et al., 2003; Zhou et al., 2004). In 2006, Germany, the United States, and the European Union launched the projects Verbmobil, TransTac, and Nespole, respectively. In 2007, a consortium of ATR, NICT Japan, CDAC India, and technical institutes of China, Korea, Thailand, Indonesia, and Vietnam initiated a project titled Asian Speech Translation Research (A-STAR), and in 2009 the first network-based Asian-language S2S system was launched (Arora et al., 2013). This research consortium was further expanded to cover Middle Eastern and European languages, and countries from those parts of the globe also became members; the project was later named the U-STAR project. In 2016, the group was further extended to cover ASEAN-language speech translation and included Malaysia, Myanmar, Singapore, Thailand, and Vietnam as participants. The aim of the consortium is to develop an integrated speech translation service for the ASEAN community and also to develop a common evaluation resource for ASEAN speech translation systems. The timeline diagram in Figure 4.2 shows the current state and indicates the future of speech technology. Speech translation is an architectural cluster of three technologies: speech recognition, language translation, and speech synthesis. Automatic translation of speech requires the automation of these three technologies with seamless integration to complete the task. All of them have their own difficulties. Compared to text translation, speech translation is far more complex due to the involvement of colloquial expressions, the influence of nativity on speech, and the presence of ungrammatical phrases. Considering the complex nature of the translation task, the research community initially started work in limited domains involving simple conversations and gradually progressed to more advanced jobs.
Figure 4.2: Timeline diagram representing speech translation past and future (milestones from around 2000, spanning travel-conversation prototypes, commercial mobile-phone services, and networked multilingual services, through projected multilanguage simultaneous interpretation with situational awareness by 2025).
Many disparities in healthcare have been reported in the literature, and communication difficulties have been identified as one of the most important factors (Jacobs et al., 2001; Fassil, 2000; Bischoff et al., 2003; Flores et al., 2005). Owing to these, several efforts in the direction of speech translation have been carried out in the medical domain. Due to the enormous challenges in this field, all projects have focused on specific communication scenarios in this domain and are biased toward a few user profiles. Not much success has yet been achieved in the medical domain, and there has long been debate regarding the success of technology-oriented spoken language techniques in a field that must be receptive to the varying needs of its principal users, be they doctors, patients, or other healthcare service providers. Table 4.1 outlines the projects working on speech translation in the medical domain.
Table 4.1: Projects working on speech translation in the medical domain.

System | Developers | Domains | Languages
CCLNC | MIT Lincoln Lab (Lee et al.) | Doctor–patient dialogue | English, Korean
MASTOR | IBM Yorktown Heights (Zhou et al.) | Medical emergencies | English, Mandarin
Speechalator | Carnegie Mellon University (Waibel et al.) | Medical interviews | English, Arabic
Laser ACTD | Carnegie Mellon University (Schulz et al.) | Doctor–patient dialogue | English, Thai
No name | SRI, Menlo Park, CA (Kathol et al.) | First medical exchanges | English, Pashto
Transonics | University of Southern California (Ettelale et al.) | Doctor–patient dialogue | English, Farsi
Accultran | A-Life Medical Inc., San Diego, CA (Heinze et al.) | Doctor–patient dialogue | English, Spanish
S-MINDS | Sehda Inc., Mountain View, CA (Ehsani et al.) | Medical disaster recovery | English, Korean
Converser | Spoken Translation, Berkeley, CA (www.Spokentranslation.com/products/healthcare) | Pharmacy, emergency, physical therapy, admissions, ob-gyn, oncology | English, Spanish
MASTOR | IBM Yorktown Heights (domino.watson.ibm.com/comm/pr.nst/pagesnews_mastor.html) | Medical-oriented conversations with members of the Iraqi security forces, in hospital settings and during daily interactions with Iraqi citizens | English, Arabic (Iraqi and MSA)
MedSLT | ISSCO, University of Geneva (www.issco.unige.ch/projects/medst/) | Headache, chest pain, abdominal pain | English, French, Japanese, Finnish, Spanish, Greek
S-MINDS | Fluential Inc., Sunnyvale, CA (www.fluentialinc.com/therapy.swf) | Radiology, physical therapy | English, Spanish
MediBabble | University of California, San Francisco (http://medibabble.com/) | Cardiovascular, pulmonary | Spanish, Cantonese, Mandarin, Russian, and Haitian Creole
BabelDr | University of Geneva (P. Bouillon et al.) | Abdomen | French to Arabic
Some of them are available as fully functional systems in the commercial market. The domains specified in the table highlight their usability and also identify the intended users of each system.
4.3 Speech translation technology: an overview

Automatic speech translation is a cluster of three major technologies in the NLP domain. The overall process requires understanding the spoken utterance and then translating it into the target language, consistent with that language's grammar rules. This complete process creates a pathway for patients with limited knowledge of the language spoken where the service is provided. It is achieved by the sequential execution of three technologies, that is, ASR, machine translation (MT), and TTS; the pipeline is depicted in Figure 4.3. This technological cluster will also help patients with visual or other physical impairments to communicate with doctors or care providers, as well as to connect with the digital world and take a more direct role in their own health and treatment. Spoken language communication needs a system that can recognize, interpret, and respond.
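The chain can be pictured as a simple composition of the three stages, as in the sketch below; the component callables are placeholders for whatever ASR, MT, and TTS engines are actually deployed, not real library APIs.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class SpeechToSpeechPipeline:
    """ASR -> MT -> TTS chain; each stage is injected as a callable."""
    recognize: Callable[[bytes], str]         # source-language audio -> text
    translate: Callable[[str], str]           # source text -> target-language text
    synthesize: Callable[[str], bytes]        # target text -> audio

    def run(self, audio_in: bytes) -> bytes:
        source_text = self.recognize(audio_in)
        target_text = self.translate(source_text)
        return self.synthesize(target_text)

# usage sketch (my_asr, my_mt, my_tts are hypothetical engine wrappers):
# pipeline = SpeechToSpeechPipeline(my_asr, my_mt, my_tts)
# reply_audio = pipeline.run(patient_utterance_audio)
```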
4.3.1 Speech production and perception

One of the major concerns in the field of automatic speech processing is the gap between the performance of machines and humans. Incorporating knowledge into the machine about how humans produce and perceive sounds helps improve machine performance. Before moving into the details of using technology to remove the language barrier, the anatomy of speech production must be understood. Automating the whole process of speech processing and understanding is a complicated task, not only because of domain complexities but also because of the complexities associated with the human vocal apparatus. The system must handle the vocal tract (VT) characteristics of the individual, accent due to nativity, background noise, and the microphone characteristics, in addition to coping with the ambiguity of language and the ungrammatical nature of spoken conversations. Human voice production is a complex system capable of producing several distinct sounds. The lungs, larynx, and VT combine to constitute the human voice production system. The VT, that is, the oral/nasal cavity, soft/hard palate, tongue, lips, jaw, and teeth, forms the most important set of organs affecting speech production. The respiratory (lungs) and laryngeal (larynx) subsystems provide the source (air pressure) for the speech signal. The articulatory subsystem works upon the generated signal to produce different phones. These sound units can be categorized as voiced, unvoiced, or
Figure 4.3: Pipeline architecture of speech-to-speech translation (multilingual speech recognition, translation of the spoken language, and multilingual speech synthesis, illustrated for an English–Hindi exchange built from large speech and text corpora).
plosives, depending upon the articulators responsible for their production. The varying nature of sound units creates challenges for their automated processing. Speech perception is the other vertical of spoken language understanding. It has been shown that the acoustic speech signal does not map directly to phonological segments (Casserly et al., 2010). On the other hand, the human auditory system combines and transforms hearing and voice in a way that makes them inseparable. The inherent connection between the two verticals leads to a better understanding of speech characteristics and can reduce the gap between human and machine performance. Figure 4.4 presents the relation between speech production and perception.
Figure 4.4: Speech production and perception system in humans (lungs as the power supply, larynx, and the pharynx, oral, and nasal cavities of the vocal tract).
4.3.2 Automatic speech recognition

Speech is the most natural and convenient mode of communication between humans. Using this medium for communication between human and machine has now become highly desirable. This interaction requires the conversion of spoken human utterances into a machine-understandable form; automatic speech recognition is the method for doing this. ASR converts spoken utterances into text with the use of computer algorithms. Several potential applications, such as dictation systems, command-and-control instructions, and information retrieval, use this technology. These systems range from isolated-word recognition to the recognition of highly conversational speech signals. ASR systems can also be categorized by vocabulary size, ranging from small (up to 1,000 words) to extra-large (more than a hundred thousand words) (Whittaker and Woodland, 2001). Figure 4.5 presents the general architecture of a standard ASR system. The ASR system is the integration of three major components: the acoustic model, the lexical model, and the language model. ASR systems work in two phases, starting with the parameterization of input utterances followed by training and testing of the system using those features. At the core of the system, the parameters representing various
Figure 4.5: The general architecture of the ASR system (feature extraction from the speech signal, and a recognition engine/decoder driven by the acoustic model, pronunciation lexicon, and language model built from speech and text corpora).
During the first phase, that is, the training process, models are created for speech acoustics (recorded samples from a large number of users are required), language statistics (text data is used for grammar knowledge and n-gram models), and the recognition dictionary (a lexicon of recognizable tokens and their phonetic transcriptions with several variations). The acoustic models created here can be context-dependent (allophone, triphone, etc.) as well as context-independent (monophone) models that capture speech, speaker, and channel variability. During training, features in the form of vectors are extracted for building the acoustic models. These speech features can be any of the linear prediction coefficients, Mel frequency cepstral coefficients (MFCC), perceptual linear prediction, gammatone frequency cepstral coefficients, and basilar-membrane frequency-band cepstral coefficients, or hybrid features (MF-PLP) obtained as the combination of multiple feature sets (Dua et al., 2018). The feature extraction process helps in dimensionality reduction as well as probabilistic modeling. Recognition dictionary creation is a step toward lexical modeling. It assigns each orthographic token of the dictionary to a spoken unit (which can be a phoneme, syllable, or word). Language modeling helps in the selection of the most appropriate recognition hypothesis obtained during acoustic modeling. The language model also helps in modeling the syntax, semantics, and structure of the target language. The goodness of the language model depends on the size and variety of the text data used for the empirical estimate. For acoustic model creation, today's ASR uses techniques like the hidden Markov model (HMM) (Young et al., 2008), dynamic programming (Jing et al., 2010), support vector machines (SVM) (Solera-Urena et al., 2007), and artificial neural networks
(Mohamed et al., 2011). Recently, the model creation approach has been extended to use deep neural networks (Upadhyaya et al., 2019) and dynamic belief networks. The system depicted in Figure 4.5 is based upon the stochastic HMM-based approach. These speech decoders use a token-passing method built on the Viterbi algorithm (Young et al., 2008). They can generate word/phoneme N-best lists representing the recognition hypotheses, which are rescored using the language model, and the hypothesis with the best score is selected. The availability of easily adaptable, open-source toolkits, such as Julius, Sphinx, HTK, and Kaldi, has helped developers produce ASR applications in any target language.

ASR performance: The most common measure to evaluate the performance of any ASR system is the word error rate (WER%). This parameter is evaluated by comparing the transcription output with a reference transcript. The errors fall into three categories:
– Insertion (I): a unit present in the output transcript that does not exist in the input sample.
– Deletion (D): a unit missing from the output transcript although it exists in the input sample.
– Substitution (S): a word/unit confused with, and hence replaced by, another unit in the output transcript.
With these three error categories, the WER% is computed as WER% = (I + D + S)/N × 100, where N is the number of reference words. Apart from that, in a few languages the performance is computed in terms of the character error rate and the utterance error rate.
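To make the metric concrete, the following minimal Python sketch (not from the chapter) computes WER% by aligning an ASR hypothesis against a reference transcript with edit distance; the example phrases are placeholders.

```python
# Minimal WER% computation: total edits (insertions + deletions + substitutions)
# found by dynamic-programming alignment, divided by the reference length.

def wer(reference, hypothesis):
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = minimum edits to turn ref[:i] into hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i                      # deletions
    for j in range(len(hyp) + 1):
        dp[0][j] = j                      # insertions
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = dp[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])  # substitution or match
            dele = dp[i - 1][j] + 1                              # deletion
            ins = dp[i][j - 1] + 1                               # insertion
            dp[i][j] = min(sub, dele, ins)
    return 100.0 * dp[len(ref)][len(hyp)] / max(len(ref), 1)

print(wer("list of my prescribed medicine", "list of prescribed medicines"))
```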
4.3.3 Machine translation

With economic growth increasingly becoming a borderless phenomenon across the globe, the demand for translation from one language to another has surged. MT uses computer software to convert text or speech of one language into another while preserving the meaning of the source language. At its simplest, it can be understood as the substitution of words in one language with words in another. Soon after the advent of the ENIAC computer, research was initiated in 1947 on using computers as devices for translating natural languages. These efforts ranged from developing a simple translation dictionary to a full-fledged MT system (Dorr et al., 1999; Doddington, 2002). But for long no success was achieved in this area, and in 1966 it was declared that the semantic barrier between languages could not be resolved by machine and hence the goal of MT could not be achieved. Around 1988, IBM's Candide system was launched in the market, providing translation for around 60 Russian sentences (Antony, 2013). After this success, research continued using advanced methods and new techniques, and around the 1990s statistical
machine translation (SMT) technology came into existence. This development changed the way the world viewed the language barrier (Bouillon et al., 2017). The success of SMT grabbed the attention of the IT sector, and applications based on MT were developed using complex hybrid methods (Nair et al., 2016). MT systems can be created for one language pair, called a bilingual system, which can also be bidirectional, or they can be created for multiple language pairs. A system capable of handling translation between multiple source and target languages is called a multilingual system. Creation of a bidirectional multilingual system is difficult, so such systems are mainly unidirectional. One of the major challenges in creating an MT system is the demand for large parallel corpora for the two languages. Another challenge in the development of automatic translation systems relates to ambiguities among the languages. Any syntactic, semantic, lexical, or contextual ambiguity can make the process difficult. To handle these linguistic challenges, the text is kept limited to a few domains so that the lexicon and grammar become more restricted. If further extensions to the domain are required in the future, the system has to be adapted to the new domain and its lexicon. However, to date no state-of-the-art MT system exists that can be used with high confidence in real-world scenarios (Nair et al., 2016). The use of neural machine translation with long short-term memory units has performed remarkably well toward meeting the real-time goal. The initiative by Google to develop Google Translate for translation across more than 100 languages is noteworthy, but again its performance in the open domain needs attention. To obtain a quality result from an MT system, postediting efforts are applied, which can be computer-aided support such as grammar and spell checkers or human-assisted translation, that is, manual intervention. Success in the field of automatic translation can help remove the language barrier and can make the rich literature of one language available in other languages of the world. The translation from one language to another can follow a direct conversion architecture or the interlingual architecture. In the former case, the syntactic conversion from one language to the other is done based on specified rules, whereas the latter focuses on semantic transfer. In the case of the interlingua architecture, the input data is first represented in a form that is independent of the source language. Figure 4.6 presents the pyramid of MT approaches. The text of the second language, that is, the target language, is generated using this language-independent intermediate representation, making it easier to translate into several languages. Between the two architectures exists a technique that works on a transfer-based architecture. Here, the source language is analyzed syntactically to obtain an intermediate representation, and then rules are used to map the representation to the target language. Another approach to MT is the statistical approach. SMT can be word-based, phrase-based, or syntax-based. These systems do not require an explicit rule description for the conversion process. Apart from these knowledge-based MT approaches, empirical MT (Antony, 2013) approaches are also in use in translation systems. The hybrid MT approach, which combines multiple approaches to build a single system, has been widely used recently.
Figure 4.6: Machine translation pyramid (the source text is analyzed upward through syntactic and semantic structure toward the interlingua, transfer rules link the intermediate representations, and the target text is generated downward through the corresponding levels).
These approaches perform satisfactorily because they exploit the strengths of the individual approaches to complete the task.
4.3.3.1 MT performance

Performance evaluation of MT can be done manually as well as by using metrics defined for machine-based evaluation. Manual evaluation requires evaluators with knowledge of both the source and the target languages. Fluency and adequacy are the two major parameters used in manual evaluation. Apart from this, BLEU (Papineni et al., 2002) and METEOR (Banerjee and Lavie, 2005) are methods for automatic evaluation. Each of these methods is based upon n-grams (unigram, bigram, trigram) in the corpus. Different weight assignment approaches are used by these scoring techniques to obtain precision and recall values.
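As an illustration of n-gram-based scoring, the following hand-rolled sketch computes a simplified, single-reference BLEU-style score from clipped n-gram precisions and a brevity penalty; it is not the official BLEU implementation, and the example sentences are placeholders.

```python
# Simplified BLEU-style score: geometric mean of clipped n-gram precisions
# (n = 1..4) multiplied by a brevity penalty, for one candidate and one reference.
import math
from collections import Counter

def ngrams(tokens, n):
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def simple_bleu(reference, candidate, max_n=4):
    ref, cand = reference.split(), candidate.split()
    precisions = []
    for n in range(1, max_n + 1):
        cand_ngrams, ref_ngrams = ngrams(cand, n), ngrams(ref, n)
        overlap = sum((cand_ngrams & ref_ngrams).values())   # clipped matching counts
        total = max(sum(cand_ngrams.values()), 1)
        precisions.append(max(overlap, 1e-9) / total)        # avoid log(0)
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / max(len(cand), 1))
    return bp * math.exp(sum(math.log(p) for p in precisions) / max_n)

print(simple_bleu("the patient needs a list of prescribed medicines",
                  "the patient needs list of prescribed medicine"))
```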
4.3.4 Text-to-speech synthesis

The end objective of TTS is to generate natural-sounding speech from arbitrary text. Speech synthesis refers to the artificial production of the human voice. Different categories of speech synthesizers exist, with diverse technological variations in software
and hardware. The excellence of any TTS system is based on two prime aspects: naturalness and intelligibility. Naturalness is the resemblance to natural human speech, while intelligibility is the ease of understanding by the listener. A TTS system always strives to maximize both aspects (Raitio et al., 2010). TTS conversion is done in two phases. During the first phase, a phonetic transcription of the input text is obtained, and the second phase uses the information obtained in the first phase to produce voice output. The front end of the TTS system is responsible for the preprocessing of text and for grapheme-to-phoneme (g2p) conversion. During this process, written word forms are produced for any digits, numerals, and abbreviations present in the text. At the time of g2p conversion, the phonetic representations are mapped into prosodic units, that is, sentences, clauses, or phrases. At the back end, the output produced is used to obtain the speech signal. The most prominent application area of speech synthesizers is the creation of screen readers for visually impaired and blind users. They can also be used to develop communication aids for speech-impaired people, and TTS synthesizers can be very helpful for dyslexic people. Several methods exist for speech synthesis, and they can be categorized as articulatory synthesis, formant synthesis, or concatenative synthesis. The first of these, articulatory speech synthesis, is based on biomechanical models of the organs responsible for initiating the speech signal in human beings. Articulatory synthesizers imitate the influence of the articulators (tongue, lips, glottis, and VT). This reproduction is governed by time-dependent, three-dimensional differential equations that compute the artificial speech output. The major requirement of such models is the availability of enormously high computational resources. Besides this, the output obtained from these models is not fluent and lacks naturalness (Beutnagel et al., 1999). Formant speech synthesis uses the source-filter model, based on the hypothesis that the source (glottal) and the filter can be treated independently (Jilka et al., 2003). The action of the filter is based on the bandwidth and formant frequencies. The advantage of the formant synthesis method is that it requires moderate computational resources for generating extremely comprehensible speech, though it does not produce entirely natural speech. The last approach, concatenative speech synthesis, uses minimally sized speech units, such as phonemes. The phonemes are recorded in a noise-free environment to develop a speech database. These recorded phonemes can be stored either as waveforms or as encoded signals obtained with a suitable speech-encoding algorithm. The speech produced with this method uses recorded elementary units of actual sounds; therefore, this method has the highest potential for generating natural speech (Klatt, 1987). HMM is one of the most common techniques for simulating the observed behavior of any process. This capability of the HMM is also utilized for modeling human speech behavior in speech technology. The model works in two passes. In the first pass, called the training phase, the HMM learns the acoustic-prosodic aspects of the speech. Subsequently, in the second phase, that is, the speech generation phase, the HMMs
extract the most likely parameters for the text to be synthesized. HMMs can be trained in two different ways: first, with a long speech corpus of around 2–4 h of speech from a single speaker; second, with an adaptive training approach that uses speech samples from multiple speakers and adapts them to a specific speaker's voice characteristics. The latter approach requires only a small corpus for synthesizing speech. Whichever approach is followed, in the training phase the HMM uses the speech corpus, its phonetic transcription extended to cover context dependency, and marked phoneme boundaries. Speech features such as MFCC and pitch, along with their derivatives, are extracted from the waveform. With the help of these parameters, the HMMs learn the spectral and excitation parameters. Figure 4.7 presents the HMM-based training of the speech synthesizer.
Figure 4.7: HMM-based speech synthesizer training module (the text to be read is given a phonetic transcription and context-dependent labelling; spectral (Mel-cepstrum), excitation (pitch), and duration parameters are generated from the HMM database and drive an excitation-plus-filter stage that produces the synthesized speech).
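The feature-extraction step described above (MFCC and pitch with their derivatives) can be sketched as follows; this is a minimal illustration rather than the authors' implementation, and it assumes the librosa library is installed and that "sample.wav" is a placeholder file name.

```python
# Extract MFCCs, their first derivatives (deltas), and a rough pitch (F0) track
# from a waveform, as used when training HMM-based acoustic/synthesis models.
import librosa
import numpy as np

y, sr = librosa.load("sample.wav", sr=16000)          # read waveform at 16 kHz
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)    # 13 MFCCs per frame
delta = librosa.feature.delta(mfcc)                   # first derivatives of the MFCCs
f0 = librosa.yin(y, fmin=60, fmax=400, sr=sr)         # frame-wise pitch estimate
features = np.vstack([mfcc, delta])                   # spectral features + deltas
print(features.shape, f0.shape)
```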
Once the training is over, the speech generation process starts. This process requires the phonetic transcription and context-dependent labelling of the text to be read. The HMMs create the spectral and excitation parameters from the phone durations extracted from the state duration density function. Synthesized speech is obtained using these parameters. To obtain good-quality speech, a mixed excitation model is applied. Gnuspeech, FreeTTS, and the Festival Speech Synthesis system are a few tools that can be used for building a TTS system.
4.4 Performance evaluation

A TTS system can be evaluated in several respects. The evaluation metric is selected based on the purpose for which the system was built. The performance of a TTS system can be measured for naturalness using the mean opinion score (MOS) associated with the quality of the speech. The second evaluation parameter is the intelligibility of the system, which is measured in terms of MOS or WER. Apart from these, accuracy and comprehensibility are other measures for evaluating a TTS system.
4.5 Addressing language barriers in healthcare

The language barrier is considered one of the most dominant obstructions to globalization, and the need to resolve it is even more pressing in the medical domain. Migration of the Indian population to other parts of the world is substantial and still increasing day by day. Not all of those moving out of the country are literate, and many can communicate only in their mother tongue. On the other hand, most countries have their own language or use English as one of their media of communication. India also uses English as one of its official languages, but the section of the population that is not well educated is unable to communicate in English. To overcome this language barrier, speech translation is needed between Indian languages and English. Several efforts (Vemula et al., 2010) have been made in this direction, but most of them are in the travel domain. Medical facilities being among the most crucial and needed services, the problem demands the attention of the research community. Research to overcome the language barrier in healthcare services has started throughout the world (Ettelaie et al., 2005; Ehsani et al., 2006). An S2S system in the medical domain will be helpful to Indian migrants in Middle Eastern countries and in other developed countries of the world where people move to seek the best possible healthcare services for critical diseases. Speech technology-based systems are highly data-driven, and their success depends mainly on the corpus available in the concerned languages and the domain database (Kathol et al., 2005). This section presents an S2S translation system that works for Hindi-to-English translation and vice versa. Hindi being an under-resourced language, the creation of such a system is nontrivial, mainly due to the unavailability of data, particularly in the medical domain. Analysis of work done in the medical domain shows that applications in the medical/healthcare domain that use speech technologies can be placed in different categories:
– Dictation systems: mainly used by doctors to provide notes to nurses and receptionists.
– Interactive systems: voice-enabled systems used to handle the interaction between patients and caretakers, patient and doctor, or patient and receptionist for making an appointment.
– Command and control systems: generally used for controlling medical equipment.
– Language interpretation systems: systems mainly used to reduce the language barrier by providing translation services to communication partners speaking different languages.
All the work done in this area shows that the medical domain uses a large vocabulary drawn from clinical/medicinal subdomains such as radiology, dental science, and pediatrics. A specialized corpus for the development of S2S in the medical domain therefore has to be created. We define the pipeline architecture used for the S2S system to provide translation from Hindi to English and vice versa. The system discussed here works for speech translation between the patient's language and that of the medical practitioner. The system provides end-to-end translation but has the facility at each intermediate level for evaluation and correction by human interpreters. This feature helps in identifying the error patterns and directs further improvements. With time, as the errors are rectified, we look forward to removing the provision for human intervention. This can only be achieved when all three components (ASR, MT, and TTS) are tuned to perform perfectly.
4.5.1 Medical corpus creation

In the first phase of system creation, we focus on the development of a system that deals with phrasal communication in the medical domain and interactions in question–answer format. In this initial phase we have avoided fully conversational speech because of disfluencies, fillers, and other factors that make it difficult. The question–answer corpus mainly consists of queries raised by patients to be answered by medical practitioners, and also questions about history and symptoms asked by doctors and answered by patients. Table 4.2 shows the statistics of the corpus created for this purpose. One peculiar feature of the corpus is related to medical terminology: the frequency of medical terms is low in the corpus because we have used the doctor–patient interaction pattern, and such conversations are more likely to use general language than heavy medical terminology.
Table 4.2: Corpus specifications for the speech translation system (speaker statistics and age ranges).

A Gaussian process is a collection of random variables such that any finite subset (f(s1), f(s2), ..., f(sk)) follows a joint Gaussian distribution, for every k > 0 and s1, s2, ..., sk ∈ R. The curve for a Gaussian process is given in Figure 6.10.
Figure 6.10: Gaussian process curve (observations and the predictive mean plotted as output against input).
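To make this concrete, the following minimal sketch (not from the chapter) fits a Gaussian process regressor to noisy one-dimensional observations and predicts a mean curve of the kind shown in Figure 6.10; it assumes scikit-learn is available and uses synthetic data.

```python
# Gaussian process regression on noisy 1-D observations: RBF kernel plus a
# white-noise term, returning a predictive mean and standard deviation.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.RandomState(0)
X = rng.uniform(0, 5, 20).reshape(-1, 1)            # input values s
y = np.sin(X).ravel() + 0.1 * rng.randn(20)         # noisy observations f(s)

gp = GaussianProcessRegressor(kernel=RBF() + WhiteKernel(), random_state=0)
gp.fit(X, y)

X_test = np.linspace(0, 5, 100).reshape(-1, 1)
mean, std = gp.predict(X_test, return_std=True)     # predictive mean and uncertainty
print(mean[:3], std[:3])
```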
Ensemble learning, also known as meta-algorithms, merges several machine learning models into one more accurate model in order to decrease variance (bagging), decrease bias (boosting), or improve predictions (stacking). These algorithms are basically divided into two categories: sequential ensemble methods and parallel ensemble methods (Dietterich, 2002). Many common types of ensembles are used, such as bootstrap aggregating, boosting, and the Bayes optimal classifier. One of these, the Bayes optimal classifier, is expressed by the following equation:

y = argmax_{c_j ∈ C} Σ_{h_i ∈ H} P(c_j | h_i) P(T | h_i) P(h_i)   (6.12)
where y is the output label being predicted, C is the set of possible outcomes, H is the hypothesis space, P refers to probability, and T refers to the data used for training. A block diagram of ensemble learning is given in Figure 6.11.
Figure 6.11: Block diagram of ensemble learning (models 1 to k are trained on the training data and the average of all model outputs gives the final prediction).
AdaBoost, also known as Adaptive Boosting, is the first boosting algorithm; it combines rough and moderately inaccurate rules to produce highly accurate prediction rules (Rätsch et al., 2001). An AdaBoost algorithm can be combined with other ML algorithms to enhance their efficiency. It works by combining the outputs of the weaker algorithms into a weighted sum that gives the final, accurate output. The mathematical equation for AdaBoost classification is given as:

F(x) = sign( Σ_{m=1}^{M} θ_m f_m(x) )   (6.13)

where F(x) is the final classifier output, obtained as the weighted combination of M weak (soft) classifiers, f_m is the mth weak classifier, and θ_m is the corresponding weight. A block diagram of AdaBoost is given in Figure 6.12.
Figure 6.12: Block diagram of AdaBoost (weak classifiers 1 to k are trained on the dataset and their weighted sum gives the final classification result).
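A brief illustration of AdaBoost in practice is sketched below; it is not the authors' experiment, it assumes scikit-learn is available, and it uses synthetic data. The default weak learner is a depth-1 decision stump, playing the role of f_m in equation (6.13).

```python
# Fit an AdaBoost ensemble of decision stumps on a synthetic binary task and
# report held-out accuracy.
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

clf = AdaBoostClassifier(n_estimators=100, random_state=0)  # stumps as weak learners
clf.fit(X_tr, y_tr)
print("test accuracy:", clf.score(X_te, y_te))
```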
Gradient boosting is a useful algorithm for building predictive models, although it can quickly overfit a training dataset; XGBoost is a widely used implementation of this technique. The main objective of gradient boosting is to combine the predictive strength of many weak learners to obtain more accurate results (Friedman, 2002). The most common type of weak learner used in gradient boosting is the decision tree. Let F_m be the imperfect model; a new, more accurate model is then built with an estimator h as in the following equation:

F_{m+1}(x) = F_m(x) + h(x)   (6.14)

where h(x) = y − F_m(x) and y is the output variable. The structure of gradient boosting using decision trees is shown in Figure 6.13.
Figure 6.13: Gradient boosting structure using decision trees (successive trees h1(x), h2(x), ..., hk(x) are added to the model).
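A similarly hedged sketch of gradient boosting, again assuming scikit-learn and synthetic data, is given below; each added tree plays the role of h(x) in equation (6.14), fitted to the residual of the current model.

```python
# Fit a gradient-boosted tree ensemble on a synthetic binary task and report
# held-out accuracy; a small learning rate slows down overfitting.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=20, random_state=1)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=1)

gb = GradientBoostingClassifier(n_estimators=200, learning_rate=0.05,
                                max_depth=3, random_state=1)
gb.fit(X_tr, y_tr)
print("test accuracy:", gb.score(X_te, y_te))
```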
6.2.3 Disease prediction using semisupervised machine learning

A. What is semisupervised learning
Semisupervised learning comprises algorithms that are trained on data sets containing both labeled and unlabeled data. Generally, such combinations contain a small set of labeled data and a very large set of unlabeled data. These algorithms help overcome the basic disadvantage of supervised learning, namely the hand labeling of data sets by a machine learning engineer or data scientist, which is a very costly affair when large data sets are involved. Also, since the application spectrum of unsupervised learning is limited, semisupervised learning algorithms help overcome these disadvantages (Zhu and Goldberg, 2009). The basic procedure of semisupervised learning is to cluster the data using unsupervised learning algorithms and then to label the remaining unlabeled data using the existing labeled data. Unlabeled data sets are relatively cheap to acquire, whereas labeled data sets are very expensive. Semisupervised learning algorithms rest on certain assumptions, such as the continuity assumption, the cluster assumption, and the manifold assumption.
B. Related work on diseases predicted using semisupervised machine learning
Table 6.2 describes the various semisupervised ML techniques used to predict various mental disorders, with the data type and accuracy range for each technique.

Table 6.2: Disease prediction using semisupervised ML techniques.
Disease | ML techniques | Data type | Accuracy range
Alzheimer's disease | GA (Vandewater et al.; Brasil et al.; Johnson et al.) | Biological (Vandewater et al.), clinical assessment (Brasil et al.; Johnson et al.) | –
Dementia | GA (Kumari et al.) | Imaging (Kumari et al.) | – (GA with NN)
Attention deficit hyperactivity disorder | GA (Yaghoobi and Azadi) | Clinical assessment (Yaghoobi and Azadi) | –
Depression | GA (Mohammadi et al.) | Imaging (Mohammadi et al.) | –
Schizophrenia | GA (Kaufmann et al.) | Imaging (Kaufmann et al.) | –
Suicide/self-harm | GA (Poulin et al.) | Clinical notes (Poulin et al.) | –
C. Techniques used for disease prediction in semisupervised machine learning
The genetic algorithm (GA) is a stochastic search algorithm introduced in 1975 by John Holland for solving optimization problems. A GA uses recombination and selection criteria for generating new sample points. It operates on encodings that are finite-length strings called chromosomes, and each string contains a finite number of symbols called genes. New data points are generated by further applying "mutation" and "crossover" operations to them. In the medical field, GAs are used for classification and for feature extraction from high-dimensional patterns. Figure 6.14 shows the basic steps of a GA.
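A toy genetic-algorithm sketch following the steps of Figure 6.14 is given below; it is illustrative only, and the fitness function is a placeholder standing in for, e.g., classifier accuracy on a selected feature subset.

```python
# Minimal GA loop: initialize a population of binary chromosomes, evaluate,
# select the fittest half, and create children via crossover and mutation.
import random

N_GENES, POP_SIZE, GENERATIONS, MUTATION_RATE = 20, 30, 50, 0.02
random.seed(0)

def fitness(chromosome):
    # Placeholder fitness: number of "useful" genes selected.
    return sum(chromosome)

def crossover(a, b):
    point = random.randint(1, N_GENES - 1)     # single-point crossover
    return a[:point] + b[point:]

def mutate(c):
    return [1 - g if random.random() < MUTATION_RATE else g for g in c]

population = [[random.randint(0, 1) for _ in range(N_GENES)] for _ in range(POP_SIZE)]
for _ in range(GENERATIONS):
    population.sort(key=fitness, reverse=True)             # evaluate each individual
    parents = population[:POP_SIZE // 2]                   # selection
    children = [mutate(crossover(random.choice(parents), random.choice(parents)))
                for _ in range(POP_SIZE - len(parents))]
    population = parents + children                        # next generation

print("best chromosome:", max(population, key=fitness))
```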
Figure 6.14: Steps for GA (initialize the population, evaluate each individual, and check the selection criteria; if they are not met, apply crossover and mutation and repeat; otherwise return the best result).

6.2.4 Disease prediction using unsupervised machine learning

A. What is unsupervised learning
Compared with supervised learning, unsupervised machine learning involves unlabeled data. The primary aim of unsupervised learning is not to predict target outcome variables but to extract meaningful information from the data without supervision.
Some instances where unsupervised learning is used are finding similarities among data objects, figuring out connections among attributes, and reshaping attributes to decrease dimensionality. Clustering is an exploratory technique aimed at identifying possible subgroups in the relevant data without prior information about existing groups. For instance, a suicide analyst may want to know which clusters of patients come from the suicidal pool (Coates et al., 2011).

B. Related work on diseases predicted using unsupervised machine learning
Table 6.3 describes the various unsupervised machine learning techniques used to predict various mental disorders, with the data type and accuracy range for each technique.

Table 6.3: Disease prediction using unsupervised machine learning techniques.
Disease | ML techniques | Data type | Accuracy range
Anxiety | k-means clustering (Park et al.) | Social media (Park et al.) | –
Autism spectrum disorder | k-means clustering (Liu, Li and Yi) | Video/photo (Liu et al.) | – (k-means clustering with SVM classifier)
Depression | k-means clustering (Park et al.; Farhan et al.), principal component analysis (Chen et al.) | Social media (Park et al.), sensors (Farhan et al.), imaging (Chen et al.) | – (k-means clustering); – (PCA)
Posttraumatic stress | k-means clustering (Park et al.) | Social media (Park et al.) | –
Schizophrenia | Principal component analysis (Chen et al.) | Imaging (Chen et al.) | – (PCA)
Stress | k-means clustering (Hagad et al.) | Survey (Hagad et al.) | –
C. Techniques used for disease prediction in unsupervised machine learning
The k-means clustering algorithm, introduced in 1967 by MacQueen (Haraty et al., 2015), is one of the unsupervised ML techniques that solves clustering problems. Given k centers, the data set is partitioned into k clusters. The result depends on the distances from the cluster centers, so for better results the initial centers should be positioned far apart from each other. In the next step, each data point is associated with the nearest center and thereby with the corresponding cluster. When all points have been assigned, the k centroids are recalculated. This process is repeated in a loop, with the k centers changing their locations at each step, and continues until no further changes occur. The main objective of the algorithm is to minimize an objective function, also called the square error function, which is given in equation (6.15); an example of k-means clustering is shown in Figure 6.15.

J(V) = Σ_{i=1}^{c} Σ_{j=1}^{c_i} ||x_j − v_i||²   (6.15)
where ||x_j − v_i|| denotes the Euclidean distance between data point x_j and cluster center v_i, c_i refers to the number of data points in the ith cluster, and c refers to the total number of cluster centers.
Figure 6.15: Example of K-means clustering (data in the original X–Y space are grouped into clusters, shown in the X′–Y′ plot).
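As an illustration, the following minimal sketch (assuming scikit-learn and synthetic data, not the authors' experiments) runs k-means and reports the squared-error objective J(V) of equation (6.15) through the fitted model's inertia.

```python
# Cluster three synthetic 2-D groups with k-means; inertia_ is the within-cluster
# sum of squared distances, i.e., the objective J(V).
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.RandomState(0)
X = np.vstack([rng.randn(50, 2) + [0, 0],
               rng.randn(50, 2) + [5, 5],
               rng.randn(50, 2) + [0, 5]])        # three synthetic groups

km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
print("cluster centers:\n", km.cluster_centers_)
print("objective J(V):", km.inertia_)
```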
Principal component analysis (PCA) is an unsupervised learning technique used to find the interrelations between groups of variables. It helps in reducing the dimensionality of a dataset by converting a set of correlated variables into uncorrelated variables (Rokhlin and Tygert, 2009). This procedure yields a whole new set of variables, known as principal components, which are orthogonal. PCA is also called general factor analysis, where regression finds a line of best fit. The basic matrix equation for PCA is given by:

Y = W′X   (6.16)

where X is the input matrix with n rows and p columns, Y is the score matrix, and W is the coefficient matrix. The PCA equation can also be written in linear form as a set of p linear equations:

y_ij = w_{1i} x_{1j} + w_{2i} x_{2j} + ... + w_{pi} x_{pj}   (6.17)
Let the original dimensions of the dataset be X and Y. By using PCA, the dataset is re-expressed along the new axes X′ and Y′; the whole illustration is given in Figure 6.16.
Figure 6.16: PCA representation (the original X–Y axes are rotated onto the principal axes X′ and Y′).
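A short PCA sketch is given below; it assumes scikit-learn and uses synthetic correlated data to show the rotation onto uncorrelated principal components illustrated in Figure 6.16.

```python
# Project correlated 2-D data onto its principal components; the transformed
# scores are (nearly) uncorrelated, and most variance lies on the first axis.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.RandomState(0)
x1 = rng.randn(200)
X = np.column_stack([x1, 0.8 * x1 + 0.2 * rng.randn(200)])   # correlated variables

pca = PCA(n_components=2).fit(X)
scores = pca.transform(X)                    # score matrix Y in the chapter's notation
print("explained variance ratio:", pca.explained_variance_ratio_)
print("correlation after PCA:", np.corrcoef(scores.T)[0, 1])
```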
6.3 Comparative measures for validation of machine learning techniques

To assess the prediction rate, there are a few indicators, e.g., precision, sensitivity, specificity, and accuracy, that evaluate a model's validity. These indicators are determined from the confusion matrix given in Table 6.4. The confusion matrix is a very useful tool for evaluating the quality of different algorithms for disease prediction (Linden, 2006). Ideally, the significant counts should lie on the matrix's main diagonal, whereas the remaining values should be zero or close to zero.
Table 6.4: Confusion matrix.
 | Class A (predicted) | Class B (predicted)
Class A (actual) | True positive (TP) | False negative (FN)
Class B (actual) | False positive (FP) | True negative (TN)

True positive (TP): the observation is positive, and the prediction is also positive.
False negative (FN): the observation is positive, but the prediction is negative.
True negative (TN): the observation is negative, and the prediction is also negative.
False positive (FP): the observation is negative, but the prediction is positive.
6.3.1 The formulas for the various indicators are described below

1. Accuracy – the ratio of correct predictions to the total number of predictions made.
   Accuracy = (TP + TN)/(TP + FN + FP + TN)   (6.18)
2. Recall or sensitivity – the ratio of correctly predicted positive data points to the total number of actual positive data points. It is also referred to as the TPR, which stands for True Positive Rate.
   Recall = TP/(TP + FN)   (6.19)
3. Precision – the ratio of correctly predicted positive data points to the total number of positive predictions.
   Precision = TP/(TP + FP)   (6.20)
4. Specificity – the ratio of correctly predicted negative data points to the total number of actual negative data points. It is also referred to as the TNR, the True Negative Rate, and equals 1 − FPR, where FPR is the False Positive Rate.
   Specificity = TN/(FP + TN)   (6.21)
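The indicators of equations (6.18)–(6.21) can be computed directly from the four confusion-matrix counts, as in the following minimal sketch; the counts used in the example are hypothetical.

```python
# Compute accuracy, recall, precision, and specificity from confusion-matrix counts.
def indicators(tp, fn, fp, tn):
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    recall = tp / (tp + fn)              # sensitivity / TPR
    precision = tp / (tp + fp)
    specificity = tn / (fp + tn)         # TNR = 1 - FPR
    return accuracy, recall, precision, specificity

# Hypothetical counts for illustration only.
print(indicators(tp=40, fn=10, fp=5, tn=45))
```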
Performance comparison of the machine learning techniques for three mental disorders, namely autism spectrum disorder, schizophrenia, and traumatic brain injury, is shown in Figures 6.17–6.19, with 1 − specificity (i.e., FPR) on the x-axis and sensitivity (i.e., TPR) on the y-axis.
Figure 6.17: Accuracy graph for the techniques used to predict autism spectrum disorder (SVM, LR, LDA, PCA, and the genetic algorithm; sensitivity (TPR) versus 1 − specificity (FPR)).
Figure 6.18: Accuracy graph for the techniques used to predict schizophrenia (DT, LDA, SVM, and LR; sensitivity (TPR) versus 1 − specificity (FPR)).
Figure 6.19: Accuracy graph for the techniques used to predict traumatic brain injury (SVM, RF, DT, KNN, and k-means clustering; sensitivity (TPR) versus 1 − specificity (FPR)).
6.4 Challenges in machine learning for predicting psychological disorders

Despite significant progress in designing methodologies that strengthen the importance of ML in medical science applications, several challenges remain, which can be described as follows:
A. Data availability: The main limitation in using machine learning to predict mental disorders in advance is the limited size of the datasets and their insufficient feature descriptions.
B. Need for traceable conclusions: In real-life situations, doctors are not only concerned with the prediction results given by the ML model; they are also concerned about the factors or parameters that led to the final prediction. For better treatment of the patient, the doctor needs to know the in-depth working of the model.
C. Reproducibility: Prototyping, iterating, and benchmarking machine learning techniques is another challenge in real-life applications. This workflow is very demanding when done manually and thus places increasing demands on computer programming skills.
D. Simple is better: It is well established that properly preprocessed data tend to give more accurate results. However, applying feature extraction to a large dataset and ensembling simpler models into a better predictive model is one of the challenging tasks when making predictions with machine learning techniques.
E. Data management: Datasets gathered from EEG or ECG tests are very large and require proper feature extraction before machine learning algorithms can be applied. Managing such huge data and separating out the data useful for research is another challenging task.
F. High level of customization required: In the real world, health data come in very complex forms and data resources are limited. Thus, implementing any ML algorithm on health data requires a high level of customization, which is another challenge for researchers. Integrating predictive models into the existing healthcare system is also a very complex task.
G. Utilize expertise, not only data: Publicly available datasets are small compared with the extensive training that doctors receive in medical colleges. Thus, even after so much advancement in machine learning techniques, doctors can still perform much better than any predictive model.
6.5 Conclusion and future work

From this chapter we can conclude that machine learning techniques in the area of mental health are making significant progress and revealing exciting advances. This chapter has provided information about the ML techniques that are frequently used for predicting mental health conditions. We have reviewed around 15 mental disorders that were predicted using three categories of machine learning, and the accuracy range for each such disease using different ML algorithms has been reported. Supervised ML techniques are most often used for the prediction of mental health diseases, with SVM being the most commonly used technique; the data in the mental healthcare domain being sparse, the SVM technique provides more accurate predictions. We further conclude that unsupervised machine learning techniques are generally used less in the mental healthcare domain than the other categories. Various challenges faced in applying ML to the analysis of mental disorders have also been described.
References Andrews J.A., Harrison R.F., Brown L.J.E., MacLean L.M., Hwang F., Smith T., and Astell A.J. Using the NANA toolkit at home to predict older adults’ future depression. Journal of Affective Disorders, (2017), 213, 187–190. Bailey N.W., Hoy K.E., Rogasch N.C., Thomson R.H., McQueen S., Elliot D., and Fitzgerald P.B. Responders to rTMS for depression show increased fronto-midline theta and theta connectivity compared to non-responders. Brain Stimulation, (2018), 11, 190–203. Bang S., Son S., Roh H., Lee J., Bae S., Lee K., . . . Shin H. Quad-phased data mining modeling for dementia diagnosis. BMC Medical Informatics and Decision Making, (2017), 17, 60. Bermejo P., Lucas M., Rodríguez-Montes J.A., Tárraga P.J., Lucas J., Gámez J.A., and Puerta J.M. Single- and multi-label prediction of burden on families of schizophrenia patients. In: Peek N., Marín Morales R., and Peleg M. (eds), Artificial Intelligence in Medicine. AIME 2013. Lecture Notes in Computer Science, (2013), Vol. 7885, Springer, Berlin, Heidelberg, 115–124.
Besga A., Gonzalez I., Echeburua E., Savio A., Ayerdi B., Chyzhyk D., and Gonzalez-Pinto A.M. Discrimination between Alzheimer’s disease and late onset bipolar disorder using multivariate analysis. Frontiers in Aging Neuroscience, (2015), 7, 231. Bruce B.G. and Edward S.H. Rule-based Expert Systems: The MYCIN Experiments of the Stanford Heuristic Proramming Project. Addison-Wesley Reading MA, (1984). Bosl W.J., Loddenkemper T., and Nelson C.A. Nonlinear EEG Biomarker Profiles for Autism And Absence Epilepsy. (2017). Brasil Filho A.T., Pinheiro P.R., and Coelho A.L. (2009, April). Towards the early diagnosis of Alzheimer’s disease via a multicriteria classification model. In International conference on evolutionary multi-criterion optimization (pp. 393–406). Springer, Berlin, Heidelberg. Breiman L. Random forests. Machine Learning, (2001), 45 (1), 5–32. Broek E.L., Sluis F., and Dijkstra T. Cross-validation of bimodal health-related stress assessment. Personal and Ubiquitous Computing, (2013), 17, 215–227. Bruining H., Eijkemans M.J., Kas M.J., Curran S.R., Vorstman J.A., and Bolton P.F. Behavioral signatures related to genetic disorders in autism. Molecular Autism, (2014), 5, 11. Burnham S.C., Faux N.G., Wilson W., Laws S.M., Ames D., Bedo J., and Villemagne V.L. Alzheimer’s disease neuroimaging initiative, Australian imaging, biomarkers and lifestyle study research group. A Blood-Based Predictor for Neocortical Aß Burden in Alzheimer’s Disease: Results From The AIBL Study, (2014), 19, 519–526. Molecular Psychiatry. Cao L., Guo S., Xue Z., Hu Y., Liu H., Mwansisya T.E., . . . Liu Z. Aberrant functional connectivity for diagnosis of major depressive disorder: A discriminant analysis. Psychiatry and Clinical Neurosciences, (2014), 68, 110–119. Chalmers C., Hurst W., Mackay M., and Fergus P. A smart health monitoring technology. In: Huang D.S., Bevilacqua V., and Premaratne P. (eds), ICIC 2016. Lecture Notes in Computer Science, Intelligent Computing Theories and Application, (2016), Vol. 9771, Springer, Cham, 832–842. Chen X., Liu C., He H., Chang X., Jiang Y., Li Y., . . . Yao D. Transdiagnostic differences in the resting-state functional connectivity of the prefrontal cortex in depression and schizophrenia. Journal of Affective Disorders, (2017a), 217, 118–124. Chiang H.-S., Liu L.-C., and Lai C.-Y. The diagnosis of mental stress by using data mining technologies. In: Park J., Barolli L., Xhafa F., and Jeong H.Y. (eds), Lecture Notes in Electrical Engineering, Information Technology Convergence, (2013), Vol. 253, Springer, Dordrecht, 761–769. Coates A., Ng A., and Lee H. An Analysis of Single-Layer Networks in Unsupervised Feature Learning, (2011) June. Coleman J.C., (1950), Abnormal psychology and modern life. Cook R.D. Detection of influential observation in linear regression. Technometrics, (1977), 19 (1), 15–18. Cortes C. and Vapnik V. Support-vector networks. Machine Learning, (1995), 20 (3), 273–297. Costafreda S.G., Dinov I.D., Tu Z., Shi Y., Liu C.-Y., Kloszewska I., . . . Simmons A. Automated hippocampal shape analysis predicts the onset of dementia in mild cognitive impairment. NeuroImage, (2011a), 56, 212–219. Deng F., Wang Y., Huang H., Niu M., Zhong S., Zhao L., . . . Huang R. Abnormal segments of right uncinate fasciculus and left anterior thalamic radiation in major and bipolar depression. Progress in Neuro-Psychopharmacology & Biological Psychiatry, (2018), 81, 340–349. Dietterich T.G. Ensemble learning. The Handbook of Brain Theory and Neural Networks, (2002), 2, 110–125. 
Dimitriadis S.I., Liparas D., and Tsolaki M.N. Alzheimer’s disease neuroimaging initiative (2018) Random forest feature selection, fusion and ensemble strategy: Combining multiple morphological MRI measures to discriminate among healthy elderly, MCI, cMCI and
Alzheimer’s disease patients: From the Alzheimer’s disease neuroimaging initiative (ADNI) database. Journal of Neuroscience Methods, 302, 14–23. Dipnall J.F., Pasco J.A., Berk M., Williams L.J., Dodd S., Jacka F.N., and Meyer D. Into the bowels of depression: Unravelling medical symptoms associated with depression by applying machinelearning techniques to a community based population sample. PloS One, (2016b), 11, e0167055. Dyrba M., Barkhof F., Fellgiebel A., Filippi M., Hausner L., Hauenstein K., and Teipel S.J. EDSD study group (2015) Predicting prodromal Alzheimer’s disease in subjects with mild cognitive impairment using machine learning classification of multimodal multicenter diffusion-tensor and magnetic resonance imaging data. Journal of Neuroimaging, 25, 738–747. Dyrba M., Ewers M., Wegrzyn M., Kilimann I., Plant C., Oswald A., and Teipel S.J. EDSD study group (2013) Robust automated detection of microstructural white matter degeneration in Alzheimer’s disease using machine learning classification of multicenter DTI data. PloS One, 8, e64925. El Naqa I. and Murphy M.J. What is machine learning?. In: El Naqa I., Li R., and Murphy M. (eds), Machine Learning in Radiation Oncology, Cham, (2015), Springer, 3–11 Epidemiology and Psychiatric Sciences, Vol. 26, 01, 22–36. Erguzel T.T., Ozekes S., Sayar G.H., Tan O., and Tarhan N. A hybrid artificial intelligence method to classify trichotillomania and obsessive compulsive disorder. Neurocomputing, (2015), 161, 220–228. Ertek G., Tokdil B., and Günaydın İ. Risk Factors and Identifiers for Alzheimer’s disease: A Data Mining Analysis. In: Perner P. (ed.), Advances in Data Mining, Applications and Theoretical Aspects. ICDM 2014. Lecture Notes in Computer Science, (2014), Vol. 8557, Cham, Springer. Falahati F., Ferreira D., Soininen H., Mecocci P., Vellas B., Tsolaki M., and Simmons A. Westman E and addneuromed consortium and the Alzheimer’s disease neuroimaging initiative. The effect of Age correction on multivariate classification in Alzheimer’s disease, with a focus on the characteristics of incorrectly and correctly classified subjects. Brain Topography, (2016), 29, 296–307. Farhan A.A., Lu J., Bi J., Russell A., Wang B., and Bamis A. (2016) Multi-view bi-clustering to identify smartphone sensing features indicative of depression. In 2016 IEEE First International Conference on Connected Health: Applications, Systems and Engineering Technologies (CHASE), pp. 264–273. Friedman J.H. Stochastic gradient boosting. Computational Statistics & Data Analysis, (2002), 38 (4), 367–378. Goch C.J., Oztan B., Stieltjes B., Henze R., Hering J., Poustka L., and Maier-Hein K.H. Global changes in the connectome in autism spectrum disorders. In: Schultz T., Nedjati-Gilani G., Venkataraman A., O’Donnell L., and Panagiotaki E. (eds), Computational Diffusion MRI and Brain Connectivity, (2013), Cham, Springer, Mathematics and Visualization, 239–247. Hagad J.L., Moriyama K., Fukui K., and Numao M. Modeling work stress using heart rate and stress coping profiles. In: Baldoni M. et al. (eds), Principles and Practice of Multi-Agent Systems, CMNA 2015, IWEC 2015, IWEC 2014. Lecture Notes in Computer Science, (2014), Vol. 9935, Springer, Cham, 108–118. Hajek T., Cooke C., Kopecek M., Novak T., Hoschl C., and Alda M. Using structural MRI to identify individuals at genetic risk for bipolar disorders: A 2-cohort, machine learning study. Journal of Psychiatry & Neuroscience: JPN, (2015), 40, 316–324. Hansen L.K. and Salamon P. Neural network ensembles. 
IEEE Transactions on Pattern Analysis & Machine Intelligence, (1990), 10, 993–1001. Haraty R.A., Dimishkieh M., and Masud M. An enhanced k-means clustering algorithm for pattern discovery in healthcare data. International Journal of Distributed Sensor Networks, (2015), 11 (6), 615740.
Hatton C.M., Paton L.W., McMillan D., Cussens J., Gilbody S., and Tiffin P.A. Predicting persistent depressive symptoms in older adults: A machine learning approach to personalised mental healthcare. Journal of Affective Disorders, (2019), 246, 857–860. Hoogendoorn M., Berger T., Schulz A., Stolz T., and Szolovits P. Predicting social anxiety treatment outcome based on therapeutic email conversations. IEEE Journal of Biomedical and Health Informatics, (2017), 21, 1449–1459. Iannaccone R., Hauser T.U., Ball J., Brandeis D., Walitza S., and Brem S. Classifying adolescent attention – deficit/hyperactivity disorder (ADHD) based on functional and structural imaging. European Child & Adolescent Psychiatry, (2015), 24, 1279–1289. Iliou T., Konstantopoulou G., Ntekouli M., Lymperopoulou C., Assimakopoulos K., Galiatsatos D., and Anastassopoulos G. ILIOU machine learning preprocessing method for depression type prediction. Evolving Systems, (2017), 475, 53–60. Jiao Y., Chen R., Ke X., Chu K., Lu Z., and Herskovits E.H. Predictive models of autism spectrum disorder based on brain regional cortical thickness. NeuroImage, (2010), 50, 589–599. Jie N.-F., Osuch E.A., Zhu M.-H., Wammes M., Ma X.-Y., Jiang T.-Z., and Calhoun V.D. Discriminating bipolar disorder from major depression using whole-brain functional connectivity: A feature selection analysis with SVM-FoBa algorithm. Journal of Signal Processing Systems, (2018), 90, 259–271. Jin C., Jia H., Lanka P., Rangaprakash D., Li L., Liu T., . . . Deshpande G. Dynamic brain connectivity is a better predictor of PTSD than static connectivity. Human Brain Mapping, (2017), 38, 4479–4496. Johnson P., Vandewater L., Wilson W., Maruff P., Savage G., Graham P., and Zhang P. Genetic algorithm with logistic regression for prediction of progression to Alzheimer’s disease. BMC Bioinformatics, (2014), 15 (16), S11. (suppl. Karamzadeh N., Amyot F., Kenney K., Anderson A., Chowdhry F., Dashtestani H., and Gandjbakhche A.H. A machine learning approach to identify functional biomarkers in human prefrontal cortex for individuals with traumatic brain injury using functional nearinfrared spectroscopy. Brain and Behavior, (2016), 6, e00541. Kaufmann T., Alnaes D., Brandt C.L., Doan N.T., Kauppi K., Bettella F., and Westlye L.T. Task modulations and clinical manifestations in the brain functional connectome in 1615 fMRI datasets. NeuroImage, (2017), 147, 243–252. Kessler R.C., van Loo H.M., Wardenaar K.J., Bossarte R.M., Brenner L.A., Cai T., and Nierenberg A.A. Testing a machine-learning algorithm to predict the persistence and severity of major depressive disorder from baseline self-reports. Molecular Psychiatry, (2016), 21 (10), 1366. Khondoker M., Dobson R., Skirrow C., Simmons A., and Stahl D. A comparison of machine learning methods for classification using simulation with multiple real data examples from mental health studies. Statistical Methods in Medical Research, (2016), 25, 1804–1823. Kim H., Chun H.-W., Kim S., Coh B.-Y., Kwon O.-J., and Moon Y.-H. Longitudinal study-based dementia prediction for public health. International Journal of Environmental Research and Public Health, (2017), 14, 983. Kipli K., Kouzani A.Z., and Hamid I.R.A. Investigating machine learning techniques for detection of depression using structural MRI volumetric features. International Journal of Bioscience, Biochemistry and Bioinformatics, (2013), 3 (5), 444–448. König A., Satt A., Sorin A., Hoory R., Toledo-Ronen O., Derreumaux A., . . . David R. 
Automatic speech analysis for the assessment of patients with predementia and Alzheimer’s disease. Alzheimer’s & Dementia: The Journal of the Alzheimer’s Association, (2015), 1, 112–124. Kotsilieris T., Pintelas E., Livieris I.E., and Pintelas P. (2018). Reviewing machine learning techniques for predicting anxiety disorders. Technical Report TR01-18. University of Patras.
Koutsouleris N., Meisenzahl E.M., Davatzikos C., Bottlender R., Frodl T., Scheuerecker J., . . . Gaser C. Use of neuroanatomical pattern classification to identify subjects in at-risk mental states of psychosis and predict disease transition. Archives of General Psychiatry, (2009), 66, 700–712. Kumari R.S., Sheela K.R., Varghese T., Kesavadas C., Albert Singh N., and Mathuranath P.S. A genetic algorithm optimized artificial neural network for the segmentation of MR images in frontotemporal dementia. In: Panigrahi B.K., Suganthan P.N., Das S., and Dash S.S. (eds), SEMCCO 2013. Lecture Notes in Computer Science, Swarm, Evolutionary, and Memetic Computing, (2013), Vol. 8298, Springer, Cham, 268–276. Kumari R.S., Varghese T., Kesavadas C., Singh N.A., and Mathuranath P.S. (2014). Longitudinal evaluation of structural changes in frontotemporal dementia using artificial neural networks. In Proceedings of the International Conference on Frontiers of Intelligent Computing: Theory and Applications (FICTA) 2013 (pp. 165–172). Springer, Cham. Lenhard F., Sauer S., Andersson E., Månsson K.N., Mataix-Cols D., Rück C., and Serlachius E. Prediction of outcome in internet-delivered cognitive behaviour therapy for paediatric obsessive-compulsive disorder: A machine learning approach. International Journal of Methods in Psychiatric Research, (2018), 27, e1576. Li Q., Zhao L., Xue Y., Jin L., and Feng L. Exploring the impact of co-experiencing stressor events for teens stress forecasting. In: Bouguettaya A. et al. (eds), Web Information Systems Engineering – WISE 2017, WISE 2017. Lecture Notes in Computer Science, (2017b), Vol. 10570, Springer, Cham, 313–328. Liang S., Brown M.R.G., Deng W., Wang Q., Ma X., Li M., . . . Li T. Convergence and divergence of neurocognitive patterns in schizophrenia and depression. Schizophrenia Research, (2018a), 192, 327–334. Linden A. Measuring diagnostic and predictive accuracy in disease management: An introduction to receiver operating characteristic (ROC) analysis. Journal of Evaluation in Clinical Practice, (2006), 12 (2), 132–139. Linthicum K.P., Schafer K.M., and Ribeiro J.D. Machine learning in suicide science: Applications and ethics. Behavioral Sciences & The Law, (2019). doi: 10.1002/bsl.2392. Liu F., Guo W., Fouche J.-P., Wang Y., Wang W., Ding J., . . . Chen H. Multivariate classification of social anxiety disorder using whole brain functional connectivity. Brain Structure & Function, (2015a), 220, 101–115. Liu F., Xie B., Wang Y., Guo W., Fouche J.-P., Long Z., . . . Chen H. Characterization of posttraumatic stress disorder using resting-state fMRI with a multi-level parametric classification approach. Brain Topography, (2015b), 28, 221–237. Luo J., Wu M., Gopukumar D., and Zhao Y. Big data application in biomedical research and health care: A literature review. Biomedical Informatics Insights, (2016), 8, 1–10. Maji S. and Garg D. Hybrid approach using SVM and MM2 in splice site junction identification. Current Bioinformatics, (2014), 9 (1), 76–85. Maraş A. and Aydin S. Discrimination of psychotic symptoms from controls through data mining methods based on emotional principle components. In: CMBEBIH 2017, (2017), Springer, Singapore, 26–30. Maxhuni A., Hernandez-Leal P., Morales E.F., Enrique Sucar L., Osmani V., Muńoz-Meléndez A., and Mayora O. Using intermediate models and knowledge learning to improve stress prediction. In: Sucar E., Mayora O., and Munoz de Cote C.E. (eds), Applications for Future Internet. 
Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, (2016), Vol. 179, Springer, Cham, 140–151. Mohammadi M., Al-Azab F., Raahemi B., Richards G., Jaworska N., Smith D., . . . Knott V. Data mining EEG signals in depression for their diagnostic value. BMC Medical Informatics and Decision Making, (2015), 15, 108.
Murphy K.P. Naive Bayes Classifiers. Vol. 3, University of British Columbia, 18, 60.Neuropsychiatric Electrophysiology, (2006), 1. Nguyen T., Venkatesh S., and Phung D. Textual cues for online depression in community and personal settings. In: Li J., Li X., Wang S., Li J., and Sheng Q. (eds), ADMA 2016. Lecture Notes in Computer Science, Advanced Data Mining and Applications, (2016b), Vol. 10086, Springer, Cham, 19–34. Oh D.H., Kim I.B., Kim S.H., and Ahn D.H. Predicting autism spectrum disorder using blood-based gene expression signatures and machine learning. Clinical Psychopharmacology and Neuroscience, (2017), 15, 47–52. Park A., Conway M., and Chen A.T. Examining thematic similarity, difference, and membership in three online mental health communities from Reddit: A text mining and visualization approach. Computers in Human Behavior, (2018), 78, 98–112. Parrado-Hernández E., Gómez-Verdejo V., Martinez-Ramon M., Alonso P., Pujol J., Menchón J.M., and Soriano-Mas C. Identification of OCD-relevant brain areas through multivariate feature selection. In: Langs G., Rish I., Grosse-Wentrup M., and Murphy B. (eds), Lecture Notes in Computer Science, Machine Learning and Interpretation in Neuroimaging, (2012), Vol. 7263, Springer, Berlin, Heidelberg, 60–67. Pedersen M., Curwood E.K., Archer J.S., Abbott D.F., and Jackson G.D. Brain regions with abnormal network properties in severe epilepsy of Lennox-Gastaut phenotype: Multivariate analysis of task-free fMRI. Epilepsia, (2015), 56, 1767–1773. Plitt M., Barnes K.A., and Martin A. Functional connectivity classification of autism identifies highly predictive brain features but falls short of biomarker standards. NeuroImage Clinical, (2015), 7, 359–366. Poulin C., Shiner B., Thompson P., Vepstas L., Young-Xu Y., Goertzel B., . . . McAllister T. Predicting the risk of suicide by analyzing the text of clinical notes. PloS One, (2014), 9, e85733. Provost F. and Kohavi R. Glossary of terms. Journal of Machine Learning, (1998), 30 (2–3), 271–274. Rasmussen C.E. Gaussian Processes in Machine Learning. Berlin, Heidelberg: Springer, (2003) February. In Summer School on Machine Learning, 63–71. Rätsch G., Onoda T., and Müller K.R. Soft margins for AdaBoost. Machine Learning, (2001), 42 (3), 287–320. Reece A.G. and Danforth C.M. Instagram photos reveal predictive markers of depression. EPJ Data Science, (2017), 6. Rikandi E., Pamilo S., Mäntylä T., Suvisaari J., Kieseppä T., Hari R., and Raij T.T. Precuneus functioning differentiates first-episode psychosis patients during the fantasy movie Alice in Wonderland. Psychological Medicine, (2017), 47, 495–506. Roberts G., Lord A., Frankland A., Wright A., Lau P., Levy F., . . . Breakspear M. Functional dysconnection of the inferior frontal gyrus in young people with bipolar disorder or at genetic high risk. Biological Psychiatry, (2017), 81, 718–727. Rokhlin V., Szlam A., and Tygert M. A randomized algorithm for principal component analysis. SIAM Journal on Matrix Analysis and Applications, (2009), 31 (3), 1100–1124. Safavian S.R. and Landgrebe D. A survey of decision tree classifier methodology. IEEE Transactions on Systems, Man, and Cybernetics, (1991), 21 (3), 660–674. Sato J.R., Moll J., Green S., Deakin J.F.W., Thomaz C.E., and Zahn R. Machine learning algorithm accurately detects fMRI signature of vulnerability to major depression. Psychiatry Research, (2015), 233, 289–291. Saxe G.N., Ma S., Ren J., and Aliferis C. Machine learning methods to predict child posttraumatic stress: A proof of concept study. 
BMC Psychiatry, (2017), 17, 223.
6 Predicting psychological disorders using machine learning
129
Sheela Kumari R., Varghese T., Kesavadas C., Albert Singh N., and Mathuranath P.S. Longitudinal Evaluation of Structural Changes in Frontotemporal Dementia Using Artificial Neural Networks, (2014). Fook V.F.S., Jayachandran M., Wai A.A.P., Tolstikov A., Biswas J., and Kiat P.Y.L. iCOPE: Intelligent context-aware patient management systems for elderly with cognitive and functional impairment. In: Intelligent Patient Management, (2009), Springer, Berlin, Heidelberg, 259–278. Smets E., Casale P., Großekathöfer U., Lamichhane B., De Raedt W., Bogaerts K., and Van Hoof C. Comparison of machine learning techniques for psychophysiological stress detection. In: Serino S., Matic A., Giakoumis D., Lopez G., and Cipresso P. (eds), MindCare 2015. Communications in Computer and Information Science, Pervasive Computing Paradigms for Mental Health, (2016), Vol. 604, Springer, Cham, 13–22. Song I., Dillon D., Goh T.J., and Sung M. A Health Social Network Recommender System. Berlin, Heidelberg: Springer, (2011), 361–372. Agents in Principle, Agents in Practice. Souillard-Mandar W., Davis R., Rudin C., Au R., Libon D.J., Swenson R., and Penney D.L. Learning classification models of cognitive conditions from subtle behaviors in the digital clock drawing test. Machine Learning, (2016), 102, 393–441. Sundermann B., Bode J., Lueken U., Westphal D., Gerlach A.L., Straube B., . . . Pfleiderer B. Support vector machine analysis of functional magnetic resonance imaging of interoception does not reliably predict individual outcomes of cognitive behavioral therapy in panic disorder with agoraphobia. Frontiers in Psychiatry/Frontiers Research Foundation, (2017), 8, 99. Takagi Y., Sakai Y., Lisi G., Yahata N., Abe Y., Nishida S., and Tanaka S.C. A neural marker of obsessive-compulsive disorder from whole-brain functional connectivity. Scientific Reports, (2017), 7, 7538. Thin N., Hung N., Venkatesh S., and Phung D. Estimating support scores of autism communities in large-scale web information systems. In: Bouguettaya A. et al. (eds), Web Information Systems Engineering – WISE 2017, WISE 2017. Lecture Notes in Computer Science, (2017), Vol. 10569, Springer, Cham, 347–355. Tremblay S., Iturria-Medina Y., Mateos-Pérez J.M., Evans A.C., and De Beaumont L. Defining a multimodal signature of remote sports concussions. The European Journal of Neuroscience, (2017), 46, 1956–1967. Turing A.M. Computing machinery and intelligence. Dordrecht: Springer, (2009), 23–65. In Parsing the Turing Test (pp. Vakorin V.A., Doesburg S.M., da Costa L., Jetly R., Pang E.W., and Taylor M.J. Detecting mild traumatic brain injury using resting state magnetoencephalographic connectivity. PLoS Computational Biology, (2016), 12, e1004914. Vandewater L., Brusic V., Wilson W., Macaulay L., and Zhang P. An adaptive genetic algorithm for selection of blood-based biomarkers for prediction of Alzheimer’s disease progression. BMC Bioinformatics, (2015), 16 (18), S1. Vigneron V., Kodewitz A., Tome A.M., Lelandais S., and Lang E. Alzheimer’s disease brain areas: The machine learning support for blind localization. Current Alzheimer Research, (2016), 13, 498–508. Wang X., Zhang C., Ji Y., Sun L., Wu L., and Bao Z. A depression detection model based on sentiment analysis in micro-blog social network. In: Li J. et al. (eds), Trends and Applications in Knowledge Discovery and Data Mining, PAKDD 2013, Lecture Notes in Computer Science, vol 7867, (2013), Springer, Berlin, Heidelberg, 201–213.
130
Prabhsimar Kaur, Vishal Bharti, Srabanti Maji
Westman E., Aguilar C., Muehlboeck J.-S., and Simmons A. Regional magnetic resonance imaging measures for multivariate analysis in Alzheimer’s disease and mild cognitive impairment. Brain Topography, (2013), 26, 9–2. Xiao X., Fang H., Wu J., Xiao C., Xiao T., Qian L., . . . Ke X. Diagnostic model generated by MRIderived brain features in toddlers with autism spectrum disorder. Autism Research, (2017), 10, 620–630. Yaghoobi Karimu R. and Azadi S. Diagnosing the ADHD using a mixture of expert fuzzy models. International Journal of Fuzzy Systems, (2018), 20, 1282–1290. Zhu, X., and Goldberg, A B. Introduction to semi-supervised learning. Synthesis Lectures on Artificial Intelligence and Machine Learning 3.1, (2009), 1–130.
Sweta Kumari, Sneha Kumari
7 Automatic analysis of cardiovascular diseases using EMD and support vector machines
Abstract: There has been a significant growth in global mortality due to cardiac diseases. Electrocardiography (ECG) is a low-cost, efficient, and noninvasive tool widely used to study cardiac status. The ECG signal is non-linear and non-stationary in nature, which makes it difficult to analyze with the naked eye. Cardiac healthcare therefore demands an automated tool to analyze long-term heartbeat records. A vast majority of the methodologies reported in the literature perform analysis of the ECG; however, they fail to provide a complete solution to this challenge. This chapter employs the Hilbert-Huang transform (HHT) for feature selection; it is an effective technique because it is capable of providing a frequency spectrum that varies with time. The output coefficients are used to extract different features, such as the weighted mean frequency, skewness, and central moment, computed from the intrinsic mode functions extracted using the empirical mode decomposition (EMD) algorithm. The extracted features are applied to a support vector machine (SVM) model for classification into the corresponding classes of ECG signals. The proposed methodology is validated on the Physionet data to identify six categories of heartbeats. The methodology reported a higher accuracy of 99.18% in comparison with previously reported methodologies and can be utilized as a solution for the computerized diagnosis of cardiac diseases to serve cardiac healthcare.
Keywords: Electrocardiography (ECG), Hilbert-Huang transform, RR-interval, support vector machines (SVMs)
7.1 Introduction
In 2008, approximately 17.3 million people died of cardiovascular diseases (CVDs), a figure that is expected to reach 23.3 million by 2030. The World Health Organization (WHO) statistics indicate that there has been a significant growth in the count of global mortalities due to cardiac abnormalities. Electrocardiography (ECG) is a graph-based representation of cardiac activity and serves as a noninvasive tool to diagnose
Sweta Kumari, Department of Computer Science and Engineering, Swami Devi Dayal Institute of Engineering and Technology, Kurukshetra, India Sneha Kumari, Department of Electrical Engineering, Indian Institute of Technology Patna, Patna, India https://doi.org/10.1515/9783110648195-007
cardiac abnormalities. It is widely utilized in clinical settings as a basic instrument for the identification of heart abnormalities. It gives significant information regarding the operational parts of the heart and the cardiovascular framework. The ECG is the most frequently recorded signal for patient surveillance and examination. Surface electrodes are placed on the limbs or chest of a subject to capture and record the ECG. It is regarded as a descriptive signal of cardiovascular physiology, helpful for heart arrhythmia diagnosis. An irregularity in the ECG shape is generally referred to as an arrhythmia; arrhythmia is a general term for any cardiac rhythm that deviates from the usual sinus rhythm. Early diagnosis of cardiovascular disease can extend life through suitable therapy and improve the quality of life. Analyzing lengthy ECG records in a brief period is very hard for doctors, and the human eye is also unsuitable for continually detecting the morphological variations in the ECG signal. From a practical perspective, the study of the ECG sequence may need to be conducted over several hours for appropriate tests. Since the amount of ECG data is huge, its analysis is tiresome and complicated, and there is a high chance of losing essential information. Therefore, the prevention of cardiac abnormality needs a strong computer-aided diagnosis (CAD) system. Various research studies have been reported in the domain of automatic analysis of the ECG (Raj et al., 2018a; Saxena et al., 2002; Guler, 2005; Raj et al., 2018b; Raj et al., 2018c; Raj et al., 2018d). Electrocardiogram signals are irregular, unsystematic, and transient in nature, so it is very important to adopt analysis systems that accommodate these properties in their quantitative scheme. The Fourier transform (FT) is used to process stationary signals in the usual methods, whereas wavelet analysis is useful for processing non-stationary signals. The classical wavelet transform (WT) owes its popularity to its multi-resolution character and its ability to display local signal characteristics in the time-frequency space (Ercelebi, 2004; Raj et al., 2015a; Raj et al., 2018a). Since the WT analysis is not self-adaptive and is founded on an appropriate choice of wavelet function, the accuracy of the conclusions is constrained. Thus, for an adequate and accurate test, the fundamental characteristics of the ECG signal should be identified and the evaluation method should be selected accordingly. A number of studies used different approaches to evaluate major aspects of an ECG signal; a few of them are presented here. In 2001 (Osowski et al., 2001), the authors extracted second-, third-, and fourth-order cumulants to represent a single heartbeat and used a hybrid fuzzy neural network for characterization, while Acır (Acır, 2005) used amplitude values, the Discrete Fourier Transform (DFT), Discrete Cosine Transform (DCT), Discrete Wavelet Transform (DWT), and adaptive autoregressive coefficients for building the feature set. Mahmoodabadi et al. (Mahmoodabadi et al., 2005) primarily emphasized the extraction of ECG features using the wavelet transform with multi-resolution. However, assumptions about the stationarity of the signal are made in all of these methods, and thus they lose self-adaptability to the data being processed. Another scientific data-analysis tool, the Hilbert-Huang transform (HHT)
introduced by Huang et al. (Huang et al., 1971), produces physically meaningful data representations of irregular and non-stationary processes. This method is founded on an empirically determined theory and generally consists of two parts: EMD and Hilbert spectral analysis (HSA). The empirical mode decomposition (EMD) is a self-adaptive technique and is driven completely by the information content of the input data. EMD overcomes the limitations of the classical DWT method. The EMD approach decomposes any input data into a set of distinctive intrinsic mode functions (IMFs) (Huang, 2005), which are then used in the combined time-frequency-energy specification of the IMFs to perform HSA. Moreover, the adaptive EMD approach remains an obvious choice for selecting a suitable set of features to provide a generic solution for adequate ECG beat recognition. Kolmogorov complexity (KC) is a useful method in high-dimensional complex systems to characterize dynamic behaviour. Even though ECG records act as an inherently non-linear spatio-temporal framework, KC is an extremely helpful tool to determine the complexity, entropy, or randomness of this kind of lengthy sequence. This is a significant measure for determining the differences between the beats that belong to distinct classes. Artificial neural networks (ANNs) (Yegnanarayana, 2009) were commonly used for identification due to their self-determined recognition of random patterns and perceptual capacity. However, they suffer from disadvantages such as numerous local minima and a computational complexity that depends on the dimension of the input space. The SVM (Vapnik, 1995) is employed as a fresh computing technique, with a basic geometric description and a sparse solution, to overcome such limitations. The SVM is superior to the ANN because an SVM has less chance of overfitting than an ANN. A fresh set of features is built utilizing empirical mode decomposition to guarantee that the considered set of beats is classified more accurately. Once the EMD process starts, it generates five IMFs to represent a particular heartbeat. The HHT is employed to extract the instantaneous amplitude and frequency from each IMF, and thereafter the weighted mean frequency of the actual signal is calculated using the IMFs. In addition, for the first four IMFs, a constrained selection of statistical attributes together with the resulting KCs is determined to efficiently differentiate between the various categories of ECG beats. An SVM classifier is trained with an RBF kernel for the categorization. The respective categorization precision is calculated from the identification of the test information collection by obtaining the confusion matrix. The remainder of this chapter is organized as follows. Section 7.2 provides the methods for extraction and classification of features. Section 7.3 provides the proposed methodology, along with the structure of the database and the experiment. Section 7.4 illustrates the outcome and performance analysis, and finally Section 7.5 concludes the chapter.
7.2 Feature extraction and classification algorithms
The Hilbert-Huang transform and Kolmogorov complexity algorithms used to obtain the ECG characteristics are described in this section, while the SVM uses these characteristics to classify the ECG beats into the six considered types.
7.2.1 Hilbert-Huang transform
The HHT is commonly used as a data-analysis method for random and irregular time series signals and has two parts: EMD and HSA. EMD decomposes a complicated time series into several IMFs, ordered from elements with higher frequency to elements with lower frequency. An IMF behaves as a counterpart of simple harmonic motion. The following properties should be met by any IMF: (I) the difference between the number of extrema and the number of zero crossings must be either zero or one, and (II) the mean value of the upper envelope defined by the local maxima and the lower envelope defined by the local minima, obtained using an interpolating function such as a cubic spline, should be zero. The EMD method applies the following adaptive procedure for separating IMFs from an initial signal:
i. Find all the local maxima and minima of the signal a(t), marked by Mi, i ∈ {1, 2, . . .} and mj, j ∈ {1, 2, . . .}, respectively.
ii. Compute the upper envelope P(t) and the lower envelope p(t) by interpolating through the maxima and minima, respectively.
iii. Estimate the local mean of the envelopes as M(t) = (P(t) + p(t))/2 using these interpolating signals.
iv. Subtract the mean from the signal, q(t) = a(t) − M(t). If q(t) does not meet the IMF properties, skip steps (v) and (vi) and repeat from step (i) with q(t) as the new input in place of a(t).
v. If q(t) satisfies the properties of an IMF, it is taken as an IMF, that is, zi(t) = q(t), and it is subtracted from the signal, that is, a(t) = a(t) − zi(t), where i indexes the ith IMF.
vi. Start again from step (i) with the updated signal as the new input, and store zi(t) as an IMF of the particular ECG beat.
The adaptive procedure is generally stopped once a monotonic residual is obtained; however, in this work the stopping criterion for the sifting in the EMD algorithm is the one mentioned in (Rilling et al., 2003). The EMD method therefore produces a sequence of IMFs plus a final residual element, r(t).
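As a minimal illustration of the sifting procedure above (the chapter's experiments use MATLAB; this Python sketch is ours and uses a crude fixed-iteration stop in place of the Rilling et al. criterion, with helper names that are assumptions):

```python
import numpy as np
from scipy.interpolate import CubicSpline
from scipy.signal import argrelextrema

def mean_envelope(x):
    """Mean of the upper/lower cubic-spline envelopes, or None if too few extrema."""
    t = np.arange(len(x))
    maxima = argrelextrema(x, np.greater)[0]
    minima = argrelextrema(x, np.less)[0]
    if len(maxima) < 3 or len(minima) < 3:
        return None
    upper = CubicSpline(maxima, x[maxima])(t)   # P(t)
    lower = CubicSpline(minima, x[minima])(t)   # p(t)
    return (upper + lower) / 2.0                # M(t)

def emd(signal, max_imfs=5, sift_iters=10):
    """Decompose `signal` into IMFs plus a residual, following steps (i)-(vi)."""
    residual = signal.astype(float)
    imfs = []
    for _ in range(max_imfs):
        h = residual.copy()
        for _ in range(sift_iters):             # crude fixed-iteration sifting stop
            m = mean_envelope(h)
            if m is None:
                break
            h = h - m                            # q(t) = a(t) - M(t)
        if mean_envelope(h) is None:             # residual is (nearly) monotonic
            break
        imfs.append(h)                           # z_i(t)
        residual = residual - h                  # subtract the IMF from the signal
    return np.array(imfs), residual
```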
The second stage involves rebuilding the initial signal by summing all IMFs and the residual element. Once the analyzed signal is obtained, the Hilbert transform (HT) is applied to generate an analytic signal. From the nature of the analytic signal, the instantaneous amplitude and frequency may then be calculated as follows, to create the combined distribution of amplitude, frequency, and energy over time. The Hilbert transform r(t) of the original signal s(t) is obtained as

r(t) = H[s(t)] = \frac{1}{\pi}\,\mathrm{PV}\int_{-\infty}^{+\infty} \frac{s(\tau)}{t-\tau}\,d\tau \qquad (7.1)

where PV indicates the principal value of the singular integral. The analytic signal is therefore described as

u(t) = s(t) + i\,r(t) = v(t)\,e^{i\theta(t)} \qquad (7.2)

with

v(t) = \sqrt{s^2(t) + r^2(t)}, \qquad \theta(t) = \arctan\frac{r(t)}{s(t)} \qquad (7.3)

In this case, v(t) is the amplitude at any instant of time and θ(t) is the phase function, whereas the frequency at any instant is given by

\omega = \frac{d\theta}{dt} \qquad (7.4)

In this way, the instantaneous amplitude and frequency of every IMF can be calculated using the HHT method, and the essential characteristics ingrained in the ECG beats can be extracted further.
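As a small illustration (not the authors' MATLAB implementation), the analytic signal and the instantaneous quantities of eqs. (7.1)-(7.4) can be obtained with SciPy's Hilbert-transform helper; the 360 Hz default sampling rate is an assumption taken from the database description later in the chapter:

```python
import numpy as np
from scipy.signal import hilbert

def instantaneous_amp_freq(imf, fs=360.0):
    """Return v(t) and the instantaneous frequency (Hz) of an IMF sampled at fs."""
    analytic = hilbert(imf)                    # u(t) = s(t) + i*r(t)
    amplitude = np.abs(analytic)               # v(t)
    phase = np.unwrap(np.angle(analytic))      # theta(t)
    freq = np.diff(phase) / (2 * np.pi) * fs   # omega = d(theta)/dt, in Hz
    return amplitude, freq
```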
7.2.2 Kolmogorov complexity
The idea of how to estimate the complexity of a finite-length random sequence was originally proposed by Andrey Kolmogorov. KC is used to measure the regularity contained in a strongly random signal and is therefore popular in ECG signal assessment. It generally represents the amount of computing resources required to describe an item, that is, it refers to the number of steps needed to reproduce the given series by a self-delimiting production process. Because beats belonging to distinct classes differ in their complexity, this index can readily be used to convert a time-domain property into a feature in a high-dimensional space. The computation suggested by Lempel and Ziv (Lempel and Ziv, 1976) uses a mathematically less costly approach to estimate KC effectively. It first converts a finite-amplitude time series into a binary sequence by acquiring from the user a
threshold T. If the signal amplitude is greater than T, the value one is allotted, while if it is less than T, the binary sequence is assigned zero. The computation iterates until the subsequence Q reaches the last component, and when it stops, KC(n) provides the complexity measure of the pattern S. As mentioned in Lempel and Ziv (1976), the entropy of the records should be normalized so that, for sequences of any size, the algorithm obeys the asymptotic equipartition principle. The steps therefore depend directly on the number of unique substrings needed to reproduce the whole sequence and on their frequency of occurrence throughout the process. The upper limit of KC(n) is defined as follows:

KC(n) < \frac{n}{(1 - \varepsilon_n)\,\log_{\alpha}(n)} \qquad (7.5)

where α = |Γ| indicates the size of the alphabet Γ and ε_n → 0 as n → ∞. So, when the length of the pattern approaches infinity, the limiting value of KC(n) becomes

b(n) = \lim_{n \to \infty} KC(n) = \frac{n}{\log_{\alpha}(n)} \qquad (7.6)

Thereby, the normalized KC is obtained by dividing (7.5) by (7.6):

KC = \frac{KC(n)}{b(n)} \qquad (7.7)

As KC is normalized, KC = 1 indicates the signal's highest variability.
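A compact sketch of this computation (our own illustration, binarizing against the mean rather than a user-supplied threshold T) is shown below; it follows the classic Lempel-Ziv parsing and the normalization of eqs. (7.5)-(7.7) for a binary alphabet (α = 2):

```python
import numpy as np

def lempel_ziv_count(bits):
    """Number of distinct phrases in the Lempel-Ziv (1976) parsing of a binary sequence."""
    s = ''.join('1' if b else '0' for b in bits)
    n = len(s)
    if n == 0:
        return 0
    c, l, i, k, kmax = 1, 1, 0, 1, 1
    while True:
        if s[i + k - 1] == s[l + k - 1]:
            k += 1
            if l + k > n:               # ran off the end while copying a prefix
                c += 1
                break
        else:
            kmax = max(kmax, k)
            i += 1
            if i == l:                  # no prefix reproduces the phrase: new phrase
                c += 1
                l += kmax
                if l + 1 > n:
                    break
                i, k, kmax = 0, 1, 1
            else:
                k = 1
    return c

def normalized_kc(x):
    """Normalized Kolmogorov complexity of a real-valued sequence, eq. (7.7)."""
    bits = x >= np.mean(x)              # binarization with T = mean of the sequence
    n = len(bits)
    return lempel_ziv_count(bits) * np.log2(n) / n   # KC(n) / b(n), b(n) = n / log2(n)
```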
7.2.3 Support vector machines (SVMs)
The SVM suggested by Vapnik (Vapnik, 2000) was designed for binary classification; however, a number of works have been conducted to extend it efficiently to multi-class classification (Weston, 1998). Using a certain a priori non-linear mapping, the support vector machine maps the input patterns to a higher-dimensional feature space. A linear decision surface is then designed to separate the groups, so the SVM classifies the input data of two classes as a linear classifier in that space. The SVM is also capable of separating non-linearly distributed input patterns, for which it uses kernel functions. These kernel functions project the non-linear input patterns into a higher-dimensional space where they can be distinguished from one another. The SVM method focuses on building a hyperplane w^T x + b = 0, in which w indicates the hyperplane parameter (weight vector) whereas b represents the bias term. The constraint in the constructed optimization problem is that the margin between the hyperplane and the support vectors should be maximal. Training the SVM is posed as a quadratic optimization problem by implementing the objective and the constraint functions.
SVMs are considerably better as a classification mechanism compared with ANNs. They are less likely to overfit, admit an easy geometric interpretation, and provide a sparse solution. It has also been observed that the support vector machine offers a larger capacity for generalization. The selection of the kernel for non-linear classification in an SVM depends purely on the nature of the data, and it is therefore essential to select a proper kernel function to achieve higher accuracy. Two approaches are usually taken to address the challenge of classifying more than two classes using an SVM: one using multiple binary classifiers and the other solving a more advanced optimization problem. The two frequently used techniques based on multiple binary classifiers are one-against-all (OAA) and one-against-one (OAO). The OAA approach was the earliest multi-class SVM (MCSVM) implementation; it builds b binary SVM models, where b is the number of classes. In this strategy, the ith SVM model is trained by treating the data of the ith class as one group and the remaining data as the other group. The OAO approach (Kreßel, 1999) is an appropriate as well as effective methodology for applying the MCSVM. This technique constructs k(k−1)/2 classifiers, each of which is trained on two-class data. The OAO technique has been found more suitable for practical use (Hsu et al., 2002) because it solves smaller dual problems, causing them to converge quickly. In this study, the OAO strategy is used for the MCSVM classification, and the choice of kernel function is studied objectively. The selection of the RBF kernel is noted to produce the best outcomes.
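For illustration only (the chapter's classifier is built in MATLAB), a one-against-one multi-class SVM with an RBF kernel can be set up with scikit-learn as sketched below; the feature matrix X_train and labels y_train are placeholders, and the C and gamma values shown are assumptions to be tuned empirically as discussed above:

```python
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# SVC trains the C(C-1)/2 one-against-one binary classifiers internally.
clf = make_pipeline(
    StandardScaler(),
    SVC(kernel="rbf", C=10.0, gamma="scale",
        decision_function_shape="ovo"),
)
# X_train: (n_beats, n_features) array, y_train: beat-class labels
# clf.fit(X_train, y_train)
# y_pred = clf.predict(X_test)
```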
7.3 Proposed methodology
The suggested techniques are shown in Figure 7.1 and discussed in the subsections below.
Figure 7.1: Block diagram of the proposed method: input (MIT-BIH data) → filtering → R-wave detection and windowing → Hilbert-Huang transform + time-domain features → feature classification using OAO-SVM → confusion matrix.
7.3.1 Database formation
In this analysis, the database for feature extraction and classification is generated from 25 records (of patients) of the MIT-BIH arrhythmia database (Mark, 1988). The main reason for choosing this database is its broad use in research articles covering the
same area. The records and the number of data sets used by the SVM classifier for training and testing are presented in Table 7.1. In this chapter, the signals are those collected from the MLII lead at a 360 Hz sampling rate and of nearly 30 min duration. The sample indices with the specified annotation of each peak are used to separate the sample beats into any of six classes, namely 1. Normal (N), 2. Left bundle branch block (LBBB), 3. Right bundle branch block (RBBB), 4. Premature ventricular contraction (PVC), 5. Paced beat (PB), and 6. Atrial premature beat (APB). Once the beats have been separated, the data of each class are divided into two equitable halves.
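As an aside, a record and its beat annotations can be pulled from PhysioNet in Python with the wfdb package (assuming it is installed; the chapter itself works in MATLAB), for example:

```python
import wfdb

# Read record 118 of the MIT-BIH arrhythmia database and its beat annotations
record = wfdb.rdrecord("118", pn_dir="mitdb", channels=[0])
ann = wfdb.rdann("118", "atr", pn_dir="mitdb")

ecg = record.p_signal[:, 0]     # MLII lead samples (mV)
fs = record.fs                  # 360 Hz sampling rate
r_locations = ann.sample        # annotated beat positions (sample indices)
beat_symbols = ann.symbol       # beat labels, e.g. 'N', 'L', 'R', 'V', '/', 'A'
```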
7.3.2 Denoising
The performance of any diagnosis system is greatly affected by the quality of the heartbeats. The noise associated with a heartbeat may contain baseline drift, power-line interference, muscle artifacts, contact noise, electrosurgical noise, and quantization noise. It is necessary to eliminate these different kinds of noise, failing which false alarms may result. Further, this step enhances the signal-to-noise ratio (SNR), which helps in accurate detection of the fiducial points within the heartbeats. To eliminate noise, different filters are employed to remove the different kinds of noise. A set of two median filters is employed for eliminating the baseline wander (Raj et al., 2018a) within the heartbeats. A 200 ms primary median filter is used to demarcate the QRS wave (Raj et al., 2018a) and P wave, whereas a 600 ms secondary filter demarcates the T wave within the heartbeat. Finally, the baseline wander is removed by subtracting the output of the secondary filter from the raw ECG data (Raj et al., 2018a). Thereafter, the power-line interference and high-frequency noise are removed from the heartbeats by passing the baseline-corrected heartbeat through a 12-tap low-pass filter (Raj et al., 2018a). This filter has a cut-off frequency of 35 Hz with equal ripples in the pass and stop bands. The output of this filter is considered the pre-processed heartbeat, which is passed to the R-wave localization and segmentation steps for automated recognition of the ECG signals (Raj et al., 2018a). Figures 7.2 and 7.3 show the raw heartbeat and the filtered ECG output from record #118 of the database, respectively.
Figure 7.2: Pre-processing of heartbeat recordings (record #118).
Figure 7.3: Filtered heartbeat (record #118).
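A rough Python equivalent of this denoising chain is sketched below; it is our illustration, not the authors' filters: a windowed FIR low-pass stands in for the equiripple design mentioned above, and the 360 Hz rate is taken from the database description.

```python
import numpy as np
from scipy.signal import filtfilt, firwin, medfilt

def denoise(ecg, fs=360):
    """Remove baseline wander (two median filters) and high-frequency noise."""
    w1 = int(0.2 * fs) | 1                      # ~200 ms primary median (odd length)
    w2 = int(0.6 * fs) | 1                      # ~600 ms secondary median
    baseline = medfilt(medfilt(ecg, kernel_size=w1), kernel_size=w2)
    corrected = ecg - baseline                  # baseline-corrected heartbeat
    lowpass = firwin(12, 35, fs=fs)             # 12-tap low-pass, 35 Hz cut-off
    return filtfilt(lowpass, [1.0], corrected)
```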
7.3.3 QRS wave localization and windowing
This study classifies the different types of arrhythmias on the basis of the localization of R-waves within the ECG signals. Prior to segmenting the ECG signals for feature extraction, it is necessary to determine the locations of the R-waves. A lot of research has been reported in the literature for detecting the R-peaks (Raj et al., 2018e), among which this study employs the well-established Pan-Tompkins (PT) algorithm (Pan, 1985). It is chosen due to its proven lower computational burden and higher performance in noisy environments. The detected R-waves are verified against the positions of the R-peak annotations provided in the database. Figure 7.4 depicts the R-wave localization within the heartbeats of record #118 of the database.
Figure 7.4: QRS wave detection (P, Q, R, S, and T waves marked on the filtered ECG of record #118).
In this study, the segmentation step is a bit different from that of almost all the reported works, which use a rectangular window of constant duration or number of samples. Here, 65% of the samples toward the posterior (following) R-peak are taken on the right and 35% of the samples toward the anterior (preceding) R-peak on the left to estimate the length of every heartbeat. This ensures that all the information of the ECG from the starting P wave to the ending T wave is preserved and no information regarding any wave is lost.
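A simple sketch of the detection-plus-windowing idea is given below; SciPy's generic peak picker stands in for the Pan-Tompkins detector used in the chapter, and the 65%/35% split follows one reading of the rule described above, so treat the thresholds as assumptions:

```python
import numpy as np
from scipy.signal import find_peaks

def segment_beats(ecg, fs=360):
    """Locate R-peaks and cut one variable-length window per heartbeat."""
    r_peaks, _ = find_peaks(ecg,
                            distance=int(0.25 * fs),          # refractory gap
                            height=np.percentile(ecg, 95))    # crude amplitude threshold
    beats = []
    for prev_r, r, next_r in zip(r_peaks[:-2], r_peaks[1:-1], r_peaks[2:]):
        start = r - int(0.35 * (r - prev_r))   # 35% toward the preceding R-peak
        stop = r + int(0.65 * (next_r - r))    # 65% toward the following R-peak
        beats.append(ecg[start:stop])
    return r_peaks, beats
```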
7.3.4 Feature selection
The input data contain a substantial amount of useful information about their nature and characteristics. Processing the raw input data directly can increase the difficulty of the algorithm and the complexity of the classifier model. Therefore, it is essential to extract the significant information as features from the input data to represent the heartbeat for classification purposes; these features are fed as inputs to the classifier model, which is then trained and tested. EMD's signal decomposition generates (N − 1) IMFs and a residual element. Since lower-order IMFs (meaning higher-frequency IMFs) capture fast oscillation modes whereas higher-order IMFs capture slower oscillation modes, it was experimentally found that the first four IMFs of an ECG beat contain the required information about the initial signal. Thus, only the first four IMFs are used to determine the input characteristics for this experiment. In this way we decrease not only the complexity of the process but also the computation time. The other elements included in this study of ECG beats are shown in Tables 7.1 and 7.2.
Table 7.1: Datasets (for each class, the MIT-BIH records used and the number of beats used for training and testing; rows: N, RBBB, LBBB, PVC, APB, PB, and Total).
Table 7.2: Feature values for all six classes (Features I–IX for the N, RBBB, LBBB, PB, PVC, and APB classes).
i. Weighted mean frequency (Feature I): The Hilbert spectrum of the ECG beat data is calculated utilizing the HHT throughout this chapter. The mean frequency of the original signal is calculated using both the instantaneous frequency and the instantaneous amplitude of all IMFs (with the exception of the residual element). With m IMFs, the mean instantaneous frequency MIF(j) of c_j(t), where j ∈ [1, m] and t ∈ [1, n] index the jth IMF of n samples, is defined as the weighted mean of the instantaneous frequency w_j(t) and instantaneous amplitude a_j(t) of the Hilbert spectrum as follows:

\mathrm{MIF}(j) = \frac{\sum_{t=1}^{n} w_j(t)\,a_j^2(t)}{\sum_{t=1}^{n} a_j^2(t)} \qquad (7.8)

The weighted mean frequency of the original signal is then determined by

\mathrm{WMF} = \frac{\sum_{j=1}^{m} \mathrm{MIF}(j)\,\lVert a_j \rVert}{\sum_{j=1}^{m} \lVert a_j \rVert} \qquad (7.9)
ii. KC (Feature II): After the EMD decomposition of a signal x(t) is obtained, its constituent IMFs c_j(t), j ∈ [1, m] and t ∈ [1, n], are converted to a binary signal by adopting the following rule:

c_j(t) = \begin{cases} 1, & \text{if } c_j(t) \geq E(c_j(t)) \\ 0, & \text{if } c_j(t) < E(c_j(t)) \end{cases}

iii. R-peak amplitude (Feature III): The R-peak refers to the peak amplitude of an ECG beat. The R-peak is located in the QRS complex of the beat, and its amplitude is a significant feature of a beat.
iv. Absolute IMF (Feature IV): The absolute IMF measures the average of the absolute amplitudes of an IMF, that is,

\mathrm{AIMF} = \frac{1}{N}\sum_{k=1}^{N} |x_k| \qquad (7.10)
v. Median (Feature V): The median is the centrally placed value of a set of IMF sample values (in the case of an odd-length sequence) or the mean of the two centrally placed values (in the case of an even-length sequence).
vi. Standard deviation (Feature VI): It is the square root of the variance of the series, where the variance describes the distribution of an IMF's sample values.
vii. Kurtosis (Feature VII): It is a measure of the extent to which a distribution is outlier prone. The kurtosis of a distribution is characterized as

k = \frac{E(x - \mu)^4}{\sigma^4} \qquad (7.11)

where μ denotes the mean of x, σ represents the standard deviation of x, and E(t) denotes the expected value of the quantity t. Figures 7.5 and 7.6 present the decomposition of the different ECG signals into intrinsic mode functions.
Figure 7.5: IMFs of LBBB, normal, and RBBB heartbeats.
Figure 7.6: EMD decomposition of PVC, paced, and APB signals.
viii. Skewness (Feature VIII): It is a measure of the asymmetry of the data around the sample mean. The skewness of a distribution is described as follows:

S = \frac{E(x - \mu)^3}{\sigma^3} \qquad (7.12)

ix. Central moment (Feature IX): The central moment of order k of a distribution is defined as follows:

m_k = E(x - \mu)^k \qquad (7.13)
Experimentally, it is observed in this analysis that the third-order central moment produces good outcomes (a brief code sketch of Features I and IV–IX is given below).
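The following minimal Python sketch (ours, not the chapter's MATLAB code) shows how Feature I and the per-IMF statistical Features IV–IX could be computed; the use of the instantaneous-amplitude norm for ||a_j|| in eq. (7.9) and the 360 Hz rate are assumptions:

```python
import numpy as np
from scipy.signal import hilbert
from scipy.stats import kurtosis, moment, skew

def weighted_mean_frequency(imfs, fs=360.0):
    """Feature I: WMF of a beat from its IMFs, eqs. (7.8)-(7.9)."""
    mifs, norms = [], []
    for imf in imfs:
        analytic = hilbert(imf)
        amp = np.abs(analytic)                                     # a_j(t)
        freq = np.diff(np.unwrap(np.angle(analytic))) / (2 * np.pi) * fs
        a2 = amp[:-1] ** 2                                         # align with diff()
        mifs.append(np.sum(freq * a2) / np.sum(a2))                # MIF(j), eq. (7.8)
        norms.append(np.linalg.norm(amp))                          # ||a_j||
    mifs, norms = np.array(mifs), np.array(norms)
    return np.sum(mifs * norms) / np.sum(norms)                    # eq. (7.9)

def statistical_features(imf):
    """Features IV-IX for one IMF."""
    return [
        np.mean(np.abs(imf)),          # Feature IV: absolute IMF, eq. (7.10)
        np.median(imf),                # Feature V: median
        np.std(imf),                   # Feature VI: standard deviation
        kurtosis(imf, fisher=False),   # Feature VII: kurtosis, eq. (7.11)
        skew(imf),                     # Feature VIII: skewness, eq. (7.12)
        moment(imf, moment=3),         # Feature IX: third central moment, eq. (7.13)
    ]
```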
7.3.4.1 Performance metrics
The efficiency of the proposed methodology can be determined by calculating the following parameters. Sensitivity (Se), specificity (Sp), and positive predictivity (Pp) are defined as

\mathrm{Se} = \frac{TP}{TP + FN} \qquad (7.14)

\mathrm{Sp} = \frac{TN}{TN + FP} \qquad (7.15)

\mathrm{Pp} = \frac{TP}{TP + FP} \qquad (7.16)
where TP, TN, FP, and FN denote true positives, true negatives, false positives, and false negatives, respectively. True positives are beats that are correctly allocated to a specific class, while false negatives are beats that should have been categorized into that class yet were misidentified into another class. False positives measure the number of beats falsely allocated to the class in question, and true negatives are beats that belong to other classes and are correctly categorized into those other classes. The values of these metrics express, in various terms, the percentage of accurate detections or misclassifications and hence the method's overall performance.
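A small sketch of these per-class metrics, computed from a multiclass confusion matrix (rows = ground truth, columns = predicted), is given below for illustration:

```python
import numpy as np

def class_metrics(cm):
    """Per-class Se, Sp, and Pp from a confusion matrix, eqs. (7.14)-(7.16)."""
    cm = np.asarray(cm, dtype=float)
    tp = np.diag(cm)
    fn = cm.sum(axis=1) - tp          # beats of the class sent elsewhere
    fp = cm.sum(axis=0) - tp          # beats of other classes sent to the class
    tn = cm.sum() - tp - fn - fp
    se = tp / (tp + fn)
    sp = tn / (tn + fp)
    pp = tp / (tp + fp)
    return se, sp, pp
```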
7.4 Experiments and simulation performance
The proposed technique is implemented in MATLAB. The ECG signal, being a series of small-amplitude samples, is quite tough to deal with directly. To allow subsequent analysis of the data, additional pre-processing of the ECG is required. This includes the following steps: (I) shift the signal so that its mean becomes zero, and (II) normalize the signal to unit amplitude. Table 7.3 presents the confusion matrix, which provides the information about correctly identified and misidentified instances.
Table 7.3: Performance of the proposed method (confusion matrix of ground truth versus predicted class for the N, LBBB, RBBB, PVC, PB, and APB beats).
The experiments reported a total classification accuracy of 99.18%. To build the database of ECG beats, which comprises six different classes, a window size of 256 samples is implemented. EMD decomposition of all beats is conducted and the first four IMFs are obtained. For further study, these IMFs are loaded into the HHT block. This block computes the analytic form of the signal and is responsible for measuring the instantaneous amplitude and the instantaneous frequency for each IMF sample. Beats assigned to the six separate classes are illustrated together with their first four IMFs. In addition, the remaining attributes are estimated and a feature vector is designed for training the OAO-SVM classifier model. A feature matrix optimization using a third validation set is not done in this analysis, but it can be considered as a possible extension of this project. For each IMF that corresponds to a beat, the importance of the individual features varies, but a composite selection is given in Table 7.4. As discussed above, the OAO strategy, shown in Figure 7.7, is used for the MCSVM classification. Every class is compared with the remaining (C − 1) classes, for C classes, and no repetitions are allowed, so in total the strategy builds C(C − 1)/2 binary classifiers.
Table 7.4: Performance metrics (TN, TP, FP, FN, Sp (%), Se (%), and Pp (%) for the N, LBBB, RBBB, PVC, PB, and APB beat classes).
Figure 7.7: Support vector machine architecture (the input x is fed to the C(C−1)/2 one-against-one binary SVMs f1(x), . . ., fC(C−1)/2(x) trained on the training set, and a winner-takes-all stage outputs the class C).
Sample data are inserted consecutively into each block, where learning and validation occur for each data point, keeping in mind just the two classes listed in the block. A decision criterion for the classification process is used to produce the performance of every binary SVM. During the testing phase, the majority rule is used to allocate the final category to every data point: the class that obtains the majority of votes for a given data point determines the class assigned to the input vector at that point. Because the SVM implementation relies on quadratic programming solvers, they need to be configured for better results, for which a total of 1,000 simulations is used at a time. All possible combinations of OAO class pairs are formed and the same number of binary SVM configurations is instantiated. After each binary SVM is trained and examined using the RBF kernel, an actual vote is performed on the classified testing data set. The category to which each section of the test data set corresponds is determined by the highest number of votes cast for each class in the selection system. After the entire classification
method is established, the confusion matrix is developed for each training set of data and the subsequent identification precision is determined. As listed in Table 7.4, the MCSVM classifier is trained and tested using the data sets from different records. The test data set is fed to the classifier once the classifier has been trained, in order to determine the outcomes of the evaluation. The classified results are expressed in the form of the confusion matrix given in Table 7.3, obtained using the above ECG beat dataset and the suggested method. The labels of the columns are the specific groups of beats, while the entries in a row reflect the number of beats belonging to the column's group that are marked as the class illustrated by the row.
Table 7.5: Classification accuracy (%) reported by cumulative feature subsets (Feature I; Features I + II; . . .; Features I + II + III + IV + V + VI + VII + VIII + IX).
The efficiency of classification is determined by calculating the accuracy, sensitivity, specificity, and positive predictivity. For a category, the classification accuracy is specified as follows:

\eta = \frac{\text{Total beats} - \text{total misclassified beats}}{\text{Total beats}} \qquad (7.17)
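For completeness, the confusion matrix and the overall accuracy of eq. (7.17) can be obtained with scikit-learn (assumed available); the label arrays below are placeholders, not the chapter's data:

```python
from sklearn.metrics import accuracy_score, confusion_matrix

y_true = ["N", "LBBB", "PVC", "N", "APB", "PB"]   # placeholder ground truth
y_pred = ["N", "LBBB", "PVC", "N", "PVC", "PB"]   # placeholder predictions

cm = confusion_matrix(y_true, y_pred)   # rows = ground truth, columns = predicted
eta = accuracy_score(y_true, y_pred)    # (total beats - misclassified beats) / total beats
```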
The classification accuracies for each class are similar to the sensitivity values computed in Table 7.4, which resulted in an overall classification performance with an accuracy of 99.18%. The performance metrics such as sensitivity, specificity, and positive predictivity (Raj et al., 2018b) are calculated using (7.14), (7.15), and (7.16), respectively, where the values of the TP, FP, TN, and FN parameters are taken from Table 7.4. These computed metrics can be interpreted for any instance, such as for PVC. The sensitivity in this case mirrors the correct recognition of patients experiencing PVC in percent; according to (7.14), a sensitivity of 99.14% is reported by the proposed method. Correspondingly, the specificity indicates the subjects accurately recognized as not experiencing PVC, which by (7.15) is 99.18%. The positive predictivity is determined by the true positive
(TP) and false positive (FP) parameters. Here, TP implies that a subject suffering from PVC has been accurately recognized by the proposed method, while FP indicates that a subject has been inaccurately identified, for example a patient actually experiencing LBBB. Along these lines, utilizing (7.16), Pp for LBBB is 1,986/(1,986 + 9) = 99.54%. Table 7.5 summarizes the performance of the SVM classifier model for the different kinds of features extracted and concatenated together. The higher accuracy reported by the proposed methodology reflects the fact that each piece of information carried by the mentioned characteristics is helpful for better results. When used collectively, all nine features accomplish an accuracy of 99.51%, proving the full feature set's significance. The wide classification precision of the suggested methodology justifies the claim that the feature vector designed to represent the heartbeats is efficient, which is also supported by the confusion matrix reported while performing the experiments. It can also be concluded from Table 7.5 that the different features extracted have contributed significantly to achieving a higher performance; using these features in the classification offers a good separation between the classes considered. Therefore, the method can be utilized in hospitals for an efficient computerized diagnosis of cardiovascular diseases. During the training phase of the OAO-SVM, the majority of the computing effort is spent on determining the support vectors by formulating and solving the quadratic optimization problem. As discussed, the suggested features depict the category of heartbeats, and mapping the appropriate information into the high-dimensional feature space improves the classification accuracy as well as giving important hints for separating the different categories of heartbeats. For training the SVM classifier, the selection of the kernel function is fully empirical and is therefore made on an exploratory premise. However, it is concluded in the current research that the RBF kernel function produces better results; therefore, training the classifier model with proper values of the scaling factor σ and the C parameter is significant to achieve a higher performance. Table 7.6 shows the prediction performance of the proposed method in comparison with the previous studies reported for efficient recognition of different heartbeat classes. De Chazal and Reilly (De Chazal and Reilly, 2006) suggested an adaptive ECG processing scheme for classifying heartbeats into one of five groups. The morphological and heartbeat-interval characteristics are utilized to form the feature vector that represents the heartbeats. The beats are then categorized using a linear discriminant classifier. The classification accuracy achieved with this method is 85.90%. In 2009, the writers (Ince et al., 2009) introduced principal component analysis and the discrete wavelet transform (DWT) to distinguish five classes using multidimensional particle swarm optimization (PSO) and SVM as evaluation techniques, obtaining accuracies of 95% and 93%, respectively. Osowski and Linh (Osowski and Linh, 2001) categorized a total of seven classes with 96.06% accuracy, while a binary classification accuracy of 94% and a three-class classification accuracy of 98%, respectively, are reported in (Martis et al., 2013).
Table 7.6: Comparison with state-of-the-art methodologies (columns: literature, feature extraction, classifier, number of classes, and accuracy (%)):
Osowski and Linh (Osowski and Linh, 2001): HOSA features; hybrid fuzzy neural network.
Hu et al. (Hu et al., 1997): time-domain approach; mixture of experts.
Llamedo and Martinez (Llamedo and Martinez, 2012): RR-interval and its derived features; linear discriminant analysis and expectation maximization.
Martis et al. (Martis et al., 2013): bio-spectrum + PCA; SVM with RBF kernel.
Ince et al. (Ince et al., 2009): WT + PCA; multidimensional PSO.
Raj et al. (Raj et al., 2015a): DWT using FFT; BPNN.
Raj et al. (Raj et al., 2015b): R-to-R wave; rules.
de Chazal and Reilly (de Chazal and Reilly, 2006): morphology and heartbeat interval; linear discriminant analysis.
Proposed: HHT + time-domain features; SVM.
† WT: wavelet transform; BPNN: back-propagation neural network; PSO: particle swarm optimization; FFT: fast Fourier transform; RBF: radial basis function; ABC: artificial bee colony; SVM: support vector machine; PCA: principal component analysis; DWT: discrete wavelet transform.
In (Raj et al., 2015a), the authors used the discrete wavelet transform for extracting significant features from the heartbeats and identified them with a back-propagation neural network, categorizing eight classes of heartbeats with an accuracy of 97.40%. In (Raj et al., 2015b), comparable research is performed in real time for four classes by implementing a rule-based method and achieving 97.96% accuracy. In this chapter, the suggested methodology produces better outcomes, with a 99.51% classification accuracy, than the present methods, as shown in Table 7.6.
7.5 Conclusion and future scope
This chapter introduces a unique technique for the classification of arrhythmia beats using an HHT-SVM strategy that is simulated and evaluated in MATLAB. Compared with the other current methods, the suggested method is highly efficient, as it reports an accuracy of 99.51%. A third validation set can be used to optimize the feature matrix, which would further decrease the computational burden of the training phase and of the recognition system. It is also possible to train the developed SVM model using a cross-validation strategy to decide the selection of the kernel function. In this chapter, the proposed methodology is validated on the Physionet data to identify six generic kinds of heartbeats: normal (N), left bundle branch block (LBBB), right bundle branch block (RBBB), premature ventricular contraction (PVC), paced beat (PB), and atrial premature beat (APB), from the different records of subjects available in the database. The implemented method is suitable for computer-aided diagnosis of cardiac diseases and can be utilized in hospitals for providing proper care, alongside other state-of-the-art methods. In future, a greater number of heartbeat categories will be considered to evaluate the proposed method and to develop a more efficient methodology. Using an embedded platform for a real-time heartbeat monitoring system, the suggested method can be used as an application for hardware execution. In future, this work can also be extended to classify a larger set of potential arrhythmias, and the proposed method can be validated on the data of real-time patients. Further, more computationally efficient techniques can be developed to increase the efficiency and provide a more generic solution compared with the state-of-the-art techniques.
References Acır N. Classification of ECG beats by using a fast least square support vector machines with a dynamic programming feature selection algorithm. Neural Computing & Applications, (2005), 14(4), 299–309. Cortes C. and Vapnik V. Support-vector networks. Machine Learning, (1995), 20(3), 273–297. de Chazal P. and Reilly R.B. A patient-adapting heartbeat classifier using ECG morphology and heartbeat interval features. IEEE Transactions on Biomedical Engineering, (2006), 53(12), 2535–2543. Ercelebi E. Electrocardiogram signals de-noising using lifting-based discrete wavelet transform. Computers in Biology and Medicine, (2004), 34(6), 479–493. Güler I. and Übeylı E.D. ECG beat classifier designed by combined neural network model. Pattern Recognition, (2005), 38(2), 199–208. Hsu C.-W. and Lin C.-J. A comparison of methods for multiclass support vector machines. IEEE Transactions on Neural Networks, (2002), 13(2), 415–425. Hu Y.H., Palreddy S., and Tompkins W.J. A patient-adaptable ECG beat classifier using a mixture of experts approach. IEEE Transactions on Biomedical Engineering, (1997), 44(9), 891–900. Huang N.E. Introduction to the Hilbert- Huang transform and its related mathematical problems, Interdisciplinary Mathematics, (2005), 5, pp. 1–26.
Huang N.E., Shen Z., and Long S.R. The empirical mode decomposition and the Hilbert spectrum for nonlinear and non-stationary time series analysis. Proceedings of the Royal Society of London A, Mathematical Physical and Engineering Science, (1971), 454(1971), 903–995. Ince T., Kiranyaz S., and Gabbouj M. A generic and robust system for automated patient-specific classification of ECG signals. IEEE Transactions on Biomedical Engineering, (2009), 56(5), 1415–1426. Kreßel U.H.G.. Pairwise classification and support vector machines. In: Schölkopf B., Burges C.J.C., and Smola A.J. (Eds.), Advances in Kernel Methods, (1999), MIT Press, 1999, 255–268. Lempel A. and Ziv J. On the complexity of finite sequences. IEEE Transaction on Information Theory, (1976), 22(1), 75–81. Llamedo M. and Martínez J.P. Án automatic patient-adapted ECG heartbeat classifier allowing expert assistance. IEEE Transactions on Biomedical Engineering, (2012), 59(8), 2312–2320. Mahmoodabadi S., Ahmadian A., and Abolhasani M. ECG feature extraction based on multiresolution wavelet transform. 27th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, (2005), 3902–3905. Mark R. and Moody G. MIT-BIH Arrhythmia Database Directory. Cambridge: Massachusetts Institute of Technology, (1988) Martis R.J., Acharya U.R., and Mandana K. Cardiac decision making using higher order spectra. Biomedical Signal Processing and Control, (2013), 8(2), 193–203. Osowski S. and Linh T.H. ECG beat recognition using fuzzy hybrid neural network. IEEE Transactions on Biomedical Engineering, (2001), 48(11), 1265–1271. Pan J. and Tompkins W.J. A real-time QRS detection algorithm. IEEE Transactions on Biomedical Engineering, (1985 March), BME-32(3), 230–236. Raj S. (2018a), Development and hardware prototype of an efficient method for handheld arrhythmia monitoring device. Ph.D. Thesis IIT Patna. 2018, pp. 1–181. Raj S., Chand G.P., and Ray K.C. Arm-based arrhythmia beat monitoring system. Microprocessors and Microsystems, (2015a), 39(7), 504–511. Raj S., Maurya K., and Ray K.C. A knowledge-based real time embedded platform for arrhythmia beat classification. Biomedical Engineering Letter, (2015b), 5(4), 271–280. Raj S., Ray K. C., and Shankar O., “Development of robust, fast and efficient QRS complex detector: a methodological review”, Aust. Phys. Eng. Sci. Med., (2018 Sept.), 41(3), 581–600. Raj S. and Ray K.C. A personalized arrhythmia monitoring platform. Scientific Reports, (2018b), 8(11395), 1–11. Raj S. and Ray K.C. Sparse representation of ECG signals for automated recognition of cardiac arrhythmias. Expert Systems with Applications, (2018c), 105, 49–64. Raj S. and Ray K.C. A personalized point-of-care platform for real-time ECG monitoring. IEEE Transactions on Consumer Electronics, (2018d), 66(4), 452–460. Rilling G., Flandrin P., and Goncalves P. On empirical mode decomposition and its algorithms. IEEEEURASIP Workshop on Nonlinear Signal and Image Processing NSIP, (2003), 3, 8–11. Saxena S., Kumar V., and Hamde S. Feature extraction from ECG signals using wavelet transforms for disease diagnostics. International Journal of Systems Science, (2002), 33(13), 1073–1085. Vapnik V. The Nature of Statistical Learning Theory. Springer, (2000). Weston J. and Watkins C. (1998), Multi-class support vector machines. Technical Report, Citeseer, 1998. Yegnanarayana B. Artificial Neural Networks. PHI Learning Pvt. Ltd, (2009).
J Naveenkumar, Dhanashri P. Joshi
8 Machine learning approach for exploring computational intelligence
Abstract: The concept inspired by genetic and natural characteristics, such as the neuron structure of the brain [neural network (NN)], is known as computational intelligence (CI). CI rests upon three main pillars, namely NNs, fuzzy systems, and evolutionary computing (EC). A fuzzy system utilizes natural language, or more specifically the language used by human beings, to represent linguistic ambiguity and to overcome the difficulties of approximate-reasoning computation. An NN can be defined as a computational model designed in line with the structure and working of the human brain, which can be trained and then performs its job accordingly. This chapter focuses on EC and the various facets of EC for machine learning (ML) applications. EC provides a way out of optimization problems by generating, evaluating, and transforming different likely solutions, for example through evolutionary programming, genetic algorithms, multi-objective optimization, and evolvable hardware (EH). In the recent years of ML and BigData, computer systems and applications are creating, gathering, and consuming a huge amount of data. Since such enormous data are utilized by ML functions and by data-demanding applications, the rate at which data are transmitted from the storage unit to the host unit becomes a bottleneck. Evolvable programming and EH jointly could be a key to solving the above-mentioned optimization problem. A solution can be provided by using near-data processing (NDP), or in-situ processing. NDP allows the execution of compute-centered applications in situ, which means within or very close to the storage unit (memory/storage). As a result, NDP seeks to decrease costly data movements and thus to improve performance. NDP is made up of a grouping of a software system plus a hardware system derived from the combination of a central processing unit (host unit) and a smart storage system. A smart storage system has various elements, namely multiple disks for storage, a cache, a frontend, and a backend. The cache memory has a large capacity and is used for temporarily storing hot data (frequently used data). The cold data (data used less frequently) are moved to the underlying storage layer. The frontend is intended for the connection between the host system and the storage system. The backend is intended to support communication among the cache and the multiple storage disks. One part of the backend is the backend controller, which drives these disks with Redundant Array of Independent Disks (RAID) capability and with error-detection as well as error-correction methods. This chapter talks about different architectures of NDP that are compatible with ML methods and the ways in which
J Naveenkumar, Dhanashri P. Joshi, Department of Computer Engineering, Bharati Vidyapeeth (Deemed to Be University) College of Engineering, Pune, India https://doi.org/10.1515/9783110648195-008
the mentioned NDP architectures boost the execution of ML methods. The chapter focuses on recent research developments in NDP designed for various ML methods.
Keywords: in-situ processing, near-data processing (NDP), intelligent storage system, computational intelligence
8.1 Introduction to computational intelligence
Electronic devices are capable of computing large quantities of diverse varieties of data at extremely high speed when weighed against the computing capacity of a human being, and they also do not tire out as humans do. Human contribution is not fully capable in some circumstances, such as weather forecasting, prediction of health-related issues, automatic collision systems, or several similar conditions in which a very large quantity of data handling is necessary. For this reason, the requirement for computational intelligence (CI) and artificial intelligence is rising gradually (Rattadilok et al., 2014). Applications that require real-time data predominantly employ various sensors creating and gathering data at very high speed. These applications are generally called Internet of Things (IoT) applications, and the focal point of researchers has moved to intelligent data computing and the discovery of useful fragments of facts from that data at the expected moment. This useful information is possibly utilized to manage some computation, to activate various sensors, to control various physical devices, or to support decision-making. In real-time applications, the complexity of controlling or triggering jobs rises significantly. CI has been effectively practiced for such BigData or ML-based applications. CI is a concept similarly encouraged by biological characteristics such as the neural network (NN). It is founded on strong pillars, namely EC, fuzzy systems, and NNs. A fuzzy system provides a key to the approximate-reasoning problem of conventional logic by exploiting the natural language used by humans to model linguistic haziness. An NN is a paradigm that is based on the structure and functioning of the human brain and that can gain knowledge and work accordingly. EC offers a way out of optimization problems by producing, approximately evaluating, and transforming different likely solutions. EC comprises evolvable hardware (EH), genetic algorithms (GAs), multi-objective optimization, evolutionary programming, and the like. Even though CI techniques require some dataset that provides consistent and precise outcomes as expected by the system, in CI the expected examples or circumstances shape the system such that it is able to highlight apparent abnormalities in the respective dataset, so that the system developed for the core practice works autonomously. These jagged conditions that occur are said to be anomalies.
The above-mentioned anomalies are intricate to discover beforehand, which makes it hard to outline a suitable dataset for training. So, CI schemes that are meant to cope with dynamic problems must be able to adapt to conditional as well as environmental transformations (Rattadilok et al., 2014).
8.2 Computational intelligence research aspects
In this section, some key facets of CI research are described.
8.2.1 Budding platforms for algorithm development in computational intelligence
In this field, the focus of study is to accomplish reliable, low-latency processing of data for data-demanding applications by discovering novel CI system platforms, hardware elements of the CI system, infrastructure, and scalable networks as per the needs of the CI system. Fog and cloud computing are employed to accomplish proficient administration of data processing as well as to cut down the architectural constraints of large-scale applications. Usually, in data-demanding systems, data are composed and produced using sensors and IoT devices. The data composed or produced by IoT devices are huge, and to accumulate them an organization requires extra memory. The facility to store and process such voluminous data is offered by datacentres. By means of virtualization, cloud-based facilities offer software resources. When sensors or smart devices produce or gather data constantly, extra network bandwidth is needed and more delay occurs in transferring data to and from the cloud or storage server. In these scenarios, data are filtered near the edge of the network by an edge device, and only the constructive, filtered data are transmitted to the cloud or server over the network; this arrangement is called active storage, fog computing, or edge computing in IoT applications, and it is the successor of Cloud-IoT incorporation. Edge computing is beneficial in the phase of transmitting data from the edge device to the cloud or storage server. However, in circumstances of data transmission from the storage unit to the computation unit (possibly a server or federated system that utilizes data stored at the storage unit), conventionally the complete dataset needs to be provided as input to the application irrespective of the dataset size, instead of sending only the constructive dataset from storage to the processing unit. This is not reasonable, as it amplifies latency and requires comparatively greater network bandwidth. To defeat this issue, NDP comes into the frame. In NDP systems, a piece of the data processing is run in situ, which means inside or close to the storage unit of the data. This is elaborated further in Section 8.2.4.
8.2.2 Security
In the in-storage processing scenario, additional security is essential at several levels: data, hardware framework, processor context switching, and the operating system (OS). The data that need to be stored with security checks are the same data used by data-intensive ML applications. Earlier approaches to security included encryption, role-based authentication, and random tokenization and control; however, these methods are not sufficient for ML applications. To secure sensitive data in ML applications, other methods can help, namely removing sensitive data, covering or masking data, and data coarsening (reducing the granularity or precision of the data so that sensitive data within the dataset become hard to recognize) (Google Cloud, online at https://cloud.google.com/solutions/sensitive-data-and-ml-datasets). The vast data used by ML applications are usually stored at datacenters, and datacenter security is attained by combining physical security with logical/digital security, where logical security concerns the data themselves. The International Organization for Standardization (ISO) has created ISO 27001 explicitly as a model for information security. ISO 27001 addresses information security across a variety of infrastructures and frameworks, although it does not supply a standard specific to datacenter security (Achmadi et al., 2018). A security standard designed in Lontsikh et al. (2016), the information security management system (ISMS), is intended primarily to manage data integrity, data availability, and data privacy for datacenter frameworks. Like other ISO standards, it deals with running the process and technology framework that protects the information stored in the datacenter. A few more advanced solutions have also been offered, for instance the decentralized datacenter (DDC). To overcome the major security issues of the conventional datacenter approach, the datacenter can be virtualized using the DDC concept, which relies on blockchain technology and smart contracts (Rygula et al., 2017); a blockchain is a kind of data structure or distributed-ledger technology that generates a tamper-proof digital record of a contract. Another approach, proposed by IBM, is the trusted virtual datacenter (TVDc). TVDc was designed and created by IBM to meet the need for data reliability as well as data isolation, which is mandatory for offering more safety in virtual environments (Perez et al., 2008).
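As a concrete illustration of the masking and coarsening techniques mentioned above, the sketch below (an assumption-laden example, not taken from the cited Google Cloud guide) hashes a direct identifier and coarsens a numeric field before the records are handed to an ML pipeline. The field names, salt, and bucket width are hypothetical.

```python
# Hedged sketch of masking and coarsening sensitive fields before ML use.
import hashlib

def mask_identifier(value, salt="demo-salt"):
    # Replace a direct identifier with a salted hash (pseudonymization).
    return hashlib.sha256((salt + str(value)).encode()).hexdigest()[:12]

def coarsen_age(age, bucket=10):
    # Reduce precision: report only the decade, e.g. 37 -> "30-39".
    low = (age // bucket) * bucket
    return f"{low}-{low + bucket - 1}"

def sanitize(record):
    return {
        "patient_id": mask_identifier(record["patient_id"]),
        "age_band": coarsen_age(record["age"]),
        "diagnosis_code": record["diagnosis_code"],  # kept as-is for training
    }

raw = {"patient_id": "P-10032", "age": 37, "diagnosis_code": "E11"}
print(sanitize(raw))
```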
8.2.3 Satisfying the explosion of data
IoT applications are nowadays also implemented and used in the social domain.
These real-time applications were formerly used in application domains where latency is not an issue or where huge amounts of data are not involved, such as operating streetlights or displaying advertisements. As the demand for data has increased, application complexity has grown, and it is directly proportional to the production and collection of BigData. For example, applications for monitoring and predicting human fitness, or decision support systems (DSS), require extra data for analytics. Such massive amounts of data have to be accumulated at a datacenter or on a cloud storage server. Computation over such large data can be expensive for an organization in terms of security, network bandwidth, and so on, because the data must be transmitted across the network. For smart or intelligent applications, ML may also be paired with EH, and active investigation is under way in that area as well. Datacenters are mainly intended to accumulate large and scattered data, and NDP is well suited to address this issue: in an NDP system, part of the data processing runs in-situ, that is, inside or close to the node where the data are stored. NDP improves system performance by shrinking the network-bandwidth constraint.
8.2.4 Near-data processing (NDP)/processing data in-situ
Because the data used in ML and other data-intensive systems are massive, the cost of transmitting data from the storage system to the host processing unit is a growing concern. In-situ processing, or NDP, is a combination of a hardware system and a software stack that brings storage nodes and the computational unit together. There are two sub-types of NDP, namely in-memory processing and in-storage processing.
8.2.4.1 Processing in-memory
Bounded Staleness Sync (BSSync) is a framework explained in Lee et al. (2015). It provides a hardware stack and a software stack to reduce the overhead of the fine-grained tasks performed in parallel ML systems. These small tasks are delegated to logic layers in memory, so that the central processing unit can be fully exploited for the ML algorithm itself. In some ML scenarios, atomic operations are implemented for read, write, modify, and append; this is common practice in parallel computing so that every thread running in parallel can alter the value of the same memory location for a particular variable. Five different ML workloads (least-squares matrix (LSM) factorization, single-source shortest path, latent Dirichlet allocation (LDA), LASSO regression, and breadth-first search (BFS)) were used, with varying inputs, to assess the framework. A further framework is proposed by Lee et al. (2017). It is derived from a similarity search associative memory (SSAM) and speeds up similarity search via NDP.
Similarity search is the key idea on which the k-nearest neighbors (kNN) algorithm is based, and the notion is used in a variety of applications such as natural language processing (NLP) and image processing. That paper focuses on one specific ML algorithm, namely kNN, whose bottleneck generally arises in the similarity-search phase. The paper finds that SSAM offers a throughput improvement of 170 times at 50% search precision and of 13 times at 99% search precision. The many Dynamic Random Access Memory (DRAM) tiers are partitioned vertically into multiple parts called vaults, and a driver on the host processing unit is responsible for handling, configuring, and compiling the SSAM accelerator. A similar SSAM framework is proposed in Lee et al. (2018). In that paper, the architecture is assessed, on a restricted basis, using the Euclidean distance between numerous parameters and feature-vector constraints for kNN, with three datasets, namely the AlexNet dataset, the Global Vectors for Word Representations (GloVe) dataset, and the GIST dataset. In the framework given in Lee et al. (2018), SSAM is incorporated with the host unit and DRAM, and it is assessed for linear as well as approximate kNN. The SSAM framework is built on Hybrid Memory Cube 2.0 (HMC 2.0) to benefit from its enhanced memory transmission capacity. The calculated cost savings do not account for additional datacenter overheads such as electricity usage and network equipment. A Heterogeneous Reconfigurable Logic (HRL) architecture is proposed in Gao et al. (2016). HRL contains many diverse parts linked by a statically reconfigurable routing fabric. The HRL architecture has a grid of Coarse-Grained Reconfigurable Array (CGRA)-style functional units (coarse-grained FUs) used to execute common mathematical functions; the architecture also uses field-programmable gate array (FPGA)-style configurable logic blocks (fine-grained CLBs) to perform irregular logic and particular functions. HRL uses a statically reconfigurable network to reduce the energy overhead; the system contains a coarse-grained bus-based data network in addition to a fine-grained bit-level control network, and the architecture does not include distributed block RAM (BRAM). The authors planned a scalable NDP system with many memory stacks and a central HRL-based computation system to handle analytical workloads. Similar systems are explained in Gao et al. (2015) and Gao et al. (2016) and are used for deep NNs (DNNs), graph processing, and the MapReduce framework; the important parts of the hardware stack are the vault hardware, the host processor, and the 3D memory stack. Gao et al. (2015) give details of the software and hardware stack for an NDP system that provides in-memory analytical methods for applications on the MapReduce framework, graph processing, and NNs. Hong et al. (2016) propose a framework that shifts the execution of linked-list traversal (LLT) close to the data storage and decreases LLT latency by reducing data accesses; a dedicated on-chip LLT engine also amplifies the effectiveness of parallel execution and the energy efficiency. The proposed system is suited to every sort of linked list.
The architecture described in Hong et al. (2016), which uses NDP to accelerate LLT, was evaluated with the LLU, Graph500, Memcached, and hash-join workloads. At the processing/host unit, the network is configured as a star topology together with the memory (inter-chip) network, since only the computational unit is entitled to use the memory units and no interconnection among the HMC memory units is necessary. The use of LLT engines, data localization, and batching all affect system performance. For all workloads except Memcached, only the completion of the entire set of LLT events matters for the finishing time; in Memcached, however, each LLT operation is linked to a client request, so the delay of any individual LLT can influence the latency of the user-level request. The system explained in Vermij et al. (2017) investigates the performance of the Graph500 and HPCG algorithms on an NDP framework based on a shared-memory concept. It considers various aspects such as program optimization, the system parameter plan, the management of special architectural features such as user-enhanced logic, and the exploitation of data locality for data transfer between multiple processors. The system's major modules are the NDP manager, the NDP cores, and the NDP access point (AP). The framework is assessed using a custom simulator, and nested OpenMP parallelization with shared memory is used. Performance measurement of this architecture shows a 3.5-times acceleration compared with a conventional host unit. An auto-scaling and adjustment platform for cloud-based systems (ASAPCS) is given in Kampars et al. (2017). It is an architecture created to handle data-assimilation and processing issues for data-driven applications and is deployed on CloudStack-based remote terminal units (RTUs). Its parts are an Apache Spark cluster, a Kafka cluster, the data-driven system, an adjustment engine, a Cassandra cluster, and the ASAPCS core unit; the Kafka cluster provides an additional layer of system security. A CNN logic unit (CLU) is proposed in Das et al. (2018) with 3D-stacked, NDP-capable memory such as HMC. The system was evaluated by simulation, which demonstrates improved performance for CNN procedures when computing in-memory compared with a host computation unit with DRAM: 76-times faster execution and a 55-times drop in power requirement.
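Since several of the accelerators above (SSAM in particular) target the similarity-search step of kNN, a short reference implementation helps to show where that bottleneck lies: for every query, a distance must be computed against every stored vector. The sketch below is a plain Euclidean-distance kNN in Python, not the SSAM design itself; the dataset and query are made up.

```python
# Hedged sketch: brute-force kNN similarity search, the step NDP accelerators
# such as SSAM aim to speed up (every query scans the whole stored dataset).
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn(query, dataset, k=3):
    # dataset: list of (label, feature_vector) pairs held in memory/storage
    scored = sorted(dataset, key=lambda item: euclidean(query, item[1]))
    return [label for label, _ in scored[:k]]

data = [("a", [0.1, 0.2]), ("b", [0.9, 0.8]), ("c", [0.2, 0.1]),
        ("d", [0.8, 0.9]), ("e", [0.5, 0.5])]
print(knn([0.15, 0.18], data, k=2))   # nearest neighbours of the query
```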
8.2.4.2 Processing in-storage
In Gu et al. (2016), an NDP architecture for BigData applications, named Biscuit, is designed. Data-computing jobs communicate with one another through data ports, and the framework does not distinguish between jobs executing on the host module and those executing on the storage unit. Biscuit consists of two main subsystems: the host unit and the storage unit, that is, the solid-state drive (SSD).
Moreover, the framework contains a distributed software unit. An SSDlet is a specific application component executed within the SSD. The SSD encloses two sorts of memory: on-chip SRAM, which is small but fast, and DRAM, which is large but slower. The lack of cache coherence, processing ability lower than that of an ordinary system, and limited on-chip SRAM are some hardware constraints of the framework, so this architecture cannot meet all the requirements expected of conventional software concepts. Choe et al. (2017) illustrate an NDP system for stochastic gradient descent (SGD) in ML; this NDP framework is based on the in-storage processing (ISP) model and uses an SSD as the storage node. A similar system is explained in Meesad et al. (2013). Here the authors propose NDP for ML using ISP with Samsung SSDs as storage nodes. The paper suggests two means of using the hardware for ISP: (1) to design dedicated logic chips according to the application/developer constraints, and (2) to employ the embedded cores within the SSD controller unit. The channel controller has the capacity to handle read and write operations to and from the respective NAND flash, and the authors use each channel controller to carry out basic operations on the data stored in its NAND channel; the kind of basic operation depends on the form of the ML algorithm. A cache controller is employed in the proposed system to collect the results generated by every channel controller, in addition to its inherent role within the SSD controller as cache (DRAM) manager. Algorithms are designed for ISP-based execution of downpour SGD, synchronous SGD, and elastic averaging SGD (EASGD). A remaining concern is that if the amount of data exceeds the page capacity, data fragmentation occurs frequently, and this can affect the performance of the method. Sun et al. (2018) design and describe a system named DStore. DStore exploits the different processing capabilities of the two modules of the system, namely the storage node and the host system, and dynamically divides compaction jobs between the parts of the architecture according to their respective processing capabilities; the proposed system thereby attains parallelism in job execution. The host-system unit offers local application programming interfaces (APIs) compatible with LevelDB's APIs, and the device unit receives information about the communication of the compaction drive through the host subsystem. DStore concurrently performs operations across multiple levels of the database, the metadata set, and the ratio of processing capacities; an on-demand compaction task-scheduling algorithm is designed by the authors. DStore is compared with LevelDB and a collaborative key-value store (Co-KV); micro- and macro-benchmarks were finalized with respect to network bandwidth, processing capacity, and so on. Kaplan et al. (2016) provide a system that moves from processing in-memory toward processing in-storage. In this paper, an in-storage computing (ISC) system is designed based on a resistive content-addressable memory (ReCAM), which acts in parallel as storage for an enormous quantity of data and as a computing unit for small data. The article achieves ISP without reassigning the system to the memory hierarchy.
The proposed ReCAM architecture is assessed against GPU, Xeon Phi, and FPGA implementations. Riedel et al. (2001) propose an architecture named the active disk, intended for data processing on a large scale, while Acharya et al. (1998) propose a programming model for active disks, algorithms for active disks, and an evaluation model. The latter paper considers an active-disk architecture that incorporates significant computation capability and the necessary memory inside the disk drive, and lets selected system code be delegated to the disk and run in place. To program active disks, the authors designed a stream-based software model that permits disklets to execute efficiently and securely. Kang et al. (2013) present a smart SSD with the property of cost-effective data processing; a flash memory array, an SSD controller, and DRAM are used to build the system. Awan et al. (2017) propose an NDP architecture holding hybrid 2D integrated processing in-memory based on programmable logic, together with processing in-storage, for Apache Spark-based applications running on an Ivy Bridge server. Ruan et al. (2019) propose the INSIDER system. One of INSIDER's modules is a storage system with a number of tiers. On the execution side, INSIDER effectively traverses the "data movement wall" and fully uses the elevated performance of the drive; on the software side, INSIDER offers simple yet supportive abstractions for developers and gives the essential system support that allows a common execution environment.
8.2.5 Algorithmic improvement
Algorithms are usually expressed as progressively refined descriptions of procedures or as code in a computational/programming language. In recent times, algorithms rooted in the genetic structure of human genes have been designed and created to offer optimized solutions to problems. There has been vast progress in the domain of algorithms, as ML algorithms have been created that can make a machine capable of discovering facts and understanding them by itself, using its earlier understanding together with the input data present in the training dataset; the smartness offered to the machine is attained through AI. In recent years, GAs have been used to resolve optimization issues in control systems. As more and more systems are robotically controlled, the difficulty of encoding rises. A GA provides a good solution that can use resources optimally for such systems; consequently, the difficulty of the GA grows with the difficulty of the functions it addresses.
8.3 Developments in evolutionary computing
Enhancing a procedure that can be simulated or virtualized on a computer is called EC; EC is a practice for building intelligent and smart machines. Network analysis (Merkle et al., 2008; Jalili-Kharaajoo, 2003; Jalili-Kharaajoo et al., 2004), cyber security (Khan et al., 2014), and gaming applications (Loiacono et al., 2012) are also application areas of EC. The main streams of research within EC are as follows.
8.3.1 Genetic algorithms
Evolutionary algorithms (EAs) are optimization algorithms that may use techniques derived from the structure of neurons in the human brain along with the theory of survival of the fittest. Search algorithms founded on the genetic processes of organisms, which have adapted and evolved within continuously changing surroundings, are known as GAs. In recent times, GAs have been used to work out control-system optimization problems. The difficulty of encoding is growing, since many scenarios now involve robotically controlled systems; GAs provide improved solutions to such problem statements, and as a result the difficulty of the GA rises with the difficulty of the jobs it addresses. Analogous to a human cell's lifecycle, GAs have genes, chromosomes, a fitness function, a population, mutation, and so on (Malhotra et al., 2011). Living beings on Earth can be uniquely described by their genetic encoding (genotype) and by their responses in the form of morphological, physiological, and behavioral activities (phenotype). The contrast between genotype and phenotype provides two different techniques for simulating evolution. In the genotype view, the representation concentrates on the organism's gene structure: the decisions for the job are encoded so that they are comparable to the chromosomes and genes of organisms, and genetic operators are designed to perform the chromosomal transformations usually observed in living species, for example mutation and crossover, on the data structures encoded in the solution. In the phenotype view, the simulation focuses on the behavior of the candidate solution drawn from a collection of solutions called the population; varied techniques for changing the respective behaviors can be used to create new behaviors while preserving a strong relationship between parents and their children in terms of behavior. In computational terms, the universal basic GA can be summarized as follows (Malhotra et al., 2011), as represented in Figure 8.1:
I. Start: Produce a suitable set of candidate solutions for the problem statement (in genetic terms, a random population of chromosomes).
Figure 8.1: General flowchart indicating the usual flow of a GA (generate a random population, calculate the fitness of each solution, select two individuals, apply crossover or mutation, accept the new solution set, and repeat until the stopping condition is satisfied).
II. Fitness: Evaluate the performance of every solution in the solution set (in genetic language, the fitness of each chromosome in the population).
III. New population: Generate a fresh population by repeating the following steps until the new population is complete.
a) Selection: Pick two solutions from the population (in genetic terms, select parent chromosomes from the population) according to their suitability for the application (in genetic terms, their fitness).
b) Crossover: With a given crossover probability, merge the two solutions to generate a new solution (in genetic terms, cross over the parents to form children or offspring); if crossover is not performed, the child is an exact copy of its parents.
c) Mutation: With a given mutation probability, alter the new solution (in genetic terms, mutate the new offspring at each locus).
d) Accepting: Place the new solution (offspring) in the refreshed collection of solutions (the new population).
IV. Replace: Use the newly produced collection of solutions (population) for the next run of the algorithm.
V. Test: If the stopping condition is fulfilled, return the best solution in the current population.
VI. Loop: Jump to step II.
The mechanisms of mutation and crossover strongly influence the performance of a GA. In game theory, many algorithms and applications have been created using GAs. Other engineering fields, such as chemical or mechanical engineering, also adopt GAs to develop applications, for example using a GA to predict the properties of certain chemicals or gases (Cheng et al., 2012). Current research and implementation in the GA domain is growing toward the creation of evolvable hardware and toward making alterations in evolvable hardware by means of GAs.
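To make steps I-VI concrete, the following is a minimal, generic GA sketch (an illustration, not code from the chapter): it evolves bit strings toward a simple one-max fitness function using tournament selection, single-point crossover, and bit-flip mutation. The fitness function, population size, and rates are arbitrary assumptions chosen only for demonstration.

```python
# Hedged sketch of the basic GA loop (steps I-VI): selection, crossover,
# mutation, acceptance, and replacement, run until a condition is satisfied.
import random

def fitness(chromosome):
    # Toy objective ("one-max"): count of 1-bits; a real GA would plug in
    # the application-specific objective here.
    return sum(chromosome)

def tournament(population, k=3):
    return max(random.sample(population, k), key=fitness)

def crossover(p1, p2, rate=0.9):
    if random.random() < rate:
        point = random.randint(1, len(p1) - 1)
        return p1[:point] + p2[point:]
    return p1[:]                      # no crossover: child copies a parent

def mutate(chromosome, rate=0.01):
    return [bit ^ 1 if random.random() < rate else bit for bit in chromosome]

def genetic_algorithm(n_bits=20, pop_size=30, generations=50):
    population = [[random.randint(0, 1) for _ in range(n_bits)]
                  for _ in range(pop_size)]                 # step I: start
    for _ in range(generations):                            # step VI: loop
        new_population = []
        while len(new_population) < pop_size:               # step III
            parent1, parent2 = tournament(population), tournament(population)
            child = mutate(crossover(parent1, parent2))     # steps b and c
            new_population.append(child)                    # step d: accept
        population = new_population                         # step IV: replace
    return max(population, key=fitness)                     # step V: test

best = genetic_algorithm()
print(best, fitness(best))
```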
8.3.2 Evolutionary programming
Evolution is an optimization process that can be simulated by computer and used to create engineering applications; it resembles a practice for creating intelligence in a machine. Evolutionary programming (EP) is a subdivision of EAs, just like GAs (Fogel, 1995; Chiong et al., 2014), but it belongs to a different class of evolution. In EP, the attention is on the variation of behaviors that can be proposed for a given experiment, so that the behavior of the solution can be usefully fine-tuned with respect to the performance measurements of the job. Intelligent behavior requires the combined ability to forecast the relevant specified situation and to convert those forecasts into a suitable response toward the expected result. In general, the environment is described as a sequence of symbols chosen from a fixed alphabet. The evolutionary problem is then to create a program that operates on this sequence of symbols (the environment) and produces an output symbol that is likely to maximize the program's performance with respect to the next symbol expected to appear in the sequence, given a clearly defined payoff function. A finite state machine (FSM) is a mechanism offering a pictorial representation of the predictable actions of a machine or program, and FSMs are one of the classic applications of evolutionary programming. The population of FSMs is exposed to the symbol sequence; in other words, the symbols of the sequence are the incoming (input) symbols of the FSMs.
For each parent FSM, the resulting output symbol is assessed against the following input symbol as each incoming symbol is presented to the machine; the outcome of this forecast is evaluated against the payoff function, and the fitness of the machine is calculated by this function after the final evaluation. Parent machines are then mutated to generate child machines. Several different techniques can be used to mutate a parent machine, namely modifying a state transition, deleting a state, inserting a new state, and modifying the initial state or an output symbol; a suitable probability distribution can be used to select and apply the mutations. The newly created offspring are subsequently evaluated against the given environment, and in subsequent iterations the machines that provide the largest payoff are retained as parents. This process is repeated until a definite forecast of the next symbol in the sequence is required; the best machine generates this prediction, the newly observed symbol is appended to the experienced sequence of symbols, and the procedure is repeated (Fogel, 1999). Like nearly all stochastic optimization algorithms, EP relies on a set of parameters (the population size, the selection pressure, and the mutation distribution) that govern the events in the method. In the process of creating a child, variation is applied to the parent. Originally, EP was intended to be used and examined with a few general tasks, such as deciding whether a number in a sequence is prime. Later, EP was evaluated on the travelling salesman problem, the optimal subset selection problem, and so on. Recent efforts have applied EP to the design and training of NNs, to image-processing uses, and to other tasks, and EP is used to create systems for medical applications such as disease diagnosis and drug design.
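The following is a minimal sketch of the EP loop described above, reduced to its essential mutation-plus-selection structure (it evolves real-valued parameter vectors rather than full FSMs, which keeps the example short). The payoff function, population size, and mutation scale are assumptions made for illustration only.

```python
# Hedged sketch of evolutionary programming: each parent produces one child by
# mutation alone (no crossover), and the individuals with the best payoff survive.
import random

def payoff(x):
    # Toy payoff: higher is better; minimizes the sphere function via negation.
    return -sum(v * v for v in x)

def mutate(parent, sigma=0.3):
    # Gaussian perturbation plays the role of FSM state/transition mutation.
    return [v + random.gauss(0.0, sigma) for v in parent]

def evolutionary_programming(dim=5, pop_size=20, generations=200):
    population = [[random.uniform(-5, 5) for _ in range(dim)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        children = [mutate(p) for p in population]       # one child per parent
        combined = population + children
        # Selection pressure: keep the pop_size individuals with best payoff.
        population = sorted(combined, key=payoff, reverse=True)[:pop_size]
    return population[0]

best = evolutionary_programming()
print([round(v, 3) for v in best], round(payoff(best), 4))
```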
8.3.3 Evolvable hardware
For many years, computer scientists and engineers have used ideas from biology to propose innovative soft-computing algorithms: for instance, NNs, inspired by the structure and functioning of the neurons in the human brain, and EC, which finds solutions to computational problems in a way inspired by the theory of evolution (Fogel, 1999). Likewise, theories from the natural world have been used in the domain of electronics engineering for some years; a nice example is simulated annealing, exploited in a variety of partitioning algorithms and modeled on the physical annealing process of cooling metal. Hardware developed using genetic concepts to obtain error tolerance and reconfigurability in contemporary devices is referred to as bio-inspired hardware (Gordon et al., 2006).
EH is the combination of programmable hardware and evolutionary computation. The motivation for developing EH is that the usual hardware framework can simply be reconfigured to enhance its performance. The ability to reconfigure itself autonomously is the key difference between EH and traditional hardware, where altering the functionality of the hardware after manufacture is not possible. Programmable hardware devices, such as field-programmable gate arrays and programmable logic devices (FPGAs and PLDs), permit a number of practical alterations after deployment on a printed circuit board (PCB), but these modifications are not autonomous: they require physical alterations made by designers. Thanks to EC, EH can modify particular hardware functionalities automatically (Stomeo et al., 2006). The key advantage of a device that can be handled and operated by programming is that the hardware organization can be changed any number of times by running a program (a software bit string) that determines the hardware design. This advantage results in lower maintenance and repair expenses and in speedy prototyping of hardware devices; FPGA applications are growing quickly because of it (Higuchi et al., 2006). There are several viewpoints on the procedures and research in EH. Broadly, they can be categorized as EH (or circuit design) and adaptive hardware. EH can be explained as a design strategy derived from EAs that is subsequently realized in the target technology, while adaptive hardware refers to the ability of hardware to alter its own operation (Salvador et al., 2016). Research in EH falls under the following categories: digital hardware evolution, analog hardware evolution, and mechanical hardware evolution, with advanced study moving in the directions of semiconductor engineering and mechanical engineering (Higuchi et al., 2006). Some areas of application for EH are as follows (Gordon et al., 2006):
– formation of adaptive systems;
– making of fault-tolerant frameworks;
– dealing with poorly specified problems;
– automatic invention of low-cost hardware;
– innovation in poorly understood design domains.
EH is still in its infancy, and many challenges must be managed before it can be used for large-scale industrial purposes; consequently, research centers on improving the algorithms evolved in this domain.
8.3.4 Multi-objective optimization
EC is an area of computation used for resolving optimization problems. Typically, an optimization problem involves several objectives, for example cost reduction together with an improvement in execution performance. The objectives are frequently in conflict with one another: cost reduction may reduce the efficiency of system execution, whereas enhanced performance may inflate project expenses.
One possible method for resolving this situation is to convert the several goals into one: for example, maximize the return (with performance measured against development expenditure), where the return is a combination of expenditure and execution measures. Multi-objective optimization refers to selecting the best solution while taking the contradictory objectives into account. Usually no single solution can be applied; instead, a collection of solutions is available that perform equally well. Techniques for multi-objective optimization can be a posteriori (the decision-maker finalizes preferences after optimization, in stages of modeling, optimizing, and deciding) or a priori (the decision-maker finalizes preferences before optimization) (Carlos A. et al., Presentation; Deb et al., 2001). Suppose a condition yields its solutions as a collection of feasible candidates; this collection is called the decision space. A solution can be judged with the help of objective functions, which are typically rooted in data-estimating equations but may also be results of experiments. In the end, the result ought to be the solution that is best in some sense, and the decision-makers should be satisfied with it. While searching for the estimated solution, it can be useful to compute in advance an approximate collection of results in order to expose the required trade-off between the contradictory aims. This approach avoids Pareto-dominated solutions (solutions that could be improved in some objective without deteriorating the performance of any other objective). The major aim of multi-objective optimization and multi-criterion decision-making is to discover the collection of non-dominated results and choose among them; furthermore, if the collection of such non-dominated results is already known, helping the decision-maker choose results from this collection is the domain of decision analysis and multi-criterion decision-making (Emmerich et al., 2018).
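The sketch below (illustrative only, assuming that both objectives are to be minimized) shows how Pareto dominance is checked and how the non-dominated set mentioned above can be extracted from a collection of candidate solutions.

```python
# Hedged sketch: Pareto dominance and extraction of the non-dominated set,
# assuming all objectives are to be minimized (e.g. cost and latency).

def dominates(a, b):
    # a dominates b if a is no worse in every objective and strictly better
    # in at least one objective.
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def non_dominated(solutions):
    front = []
    for s in solutions:
        if not any(dominates(other, s) for other in solutions if other is not s):
            front.append(s)
    return front

# Candidate solutions as (cost, latency) pairs -- made-up values.
candidates = [(10, 5.0), (8, 6.5), (12, 4.0), (9, 5.5), (8, 7.0), (15, 3.5)]
print(non_dominated(candidates))   # the Pareto front among the candidates
```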
8.4 Contemporary developments in methods of implementation
Nowadays ML, data analytics, and IoT are booming domains with numerous real-time applications. Almost all time-synchronized systems cope with enormous quantities of data produced by IoT devices and sensors, and the bulk of the data gathered and produced by sensors and other devices is unstructured and massive in amount. The IoT-cloud integration model has to provide computation over instantaneous and growing data for such real-time applications.
Making resource management available to upload or offload data dynamically between the edge or IoT device and the storage server (which might be the cloud) is another issue for such applications, as is the need to store and process data in a geographically distributed manner. Besides these, there are further difficulties such as service portability (Hong et al., 2016). The Cloud-IoT amalgamation has created numerous opportunities, but it also faces various restrictions caused by the centralized processing facility at the server, scattered data storage, networking, and the relatively large distances between remotely positioned devices or datacenters. A conventional Cloud-IoT framework may introduce more latency than a number of real-time applications can tolerate; this challenge can be overcome using fog or edge computing. The IoT framework enlarges the V3 (volume, variety, and velocity) of data, and such a large amount of data with these properties is also called BigData. The challenge with BigData is to resolve the difficulties of data collection, storage, assimilation, and analytics. Large-scale IoT applications used for real-time data need to handle and compute BigData produced, collected, and stored at geographically scattered datacenters. Apart from the problems mentioned here, the competence of the existing IoT framework is limited compared with the demands of IoT-BigData real-time applications. This problem can be tackled by edge or fog computing, by which computing capability is brought closer to the edge of the network or to the users, diminishing the necessity of transmitting all data to the cloud or server and thereby reducing latency and amplifying the effectiveness of the system in terms of network bandwidth and data processing. For a sustainable IoT-fog system, there are presently two main problems: first, fog or edge devices have only a limited degree of processing capability, and second, the BigData transmitted through the network and other system components must be handled; such enormous data flowing through the communication network leads to network overhead. The usual cloud infrastructure and datacenter framework may be described as distributed and parallel processing, while edge computing can be described as distributed processing. By incorporating innovative CI in the form of NDP, we can conquer the difficulties of the existing system, as explained in Section 8.2.4. In-situ processing/NDP is a mixture of a software stack and a hardware model derived from the combination of the host processing unit and the storage node. This framework permits computation-intensive scenarios to be executed in-situ, which means inside or close to the unit dedicated to data storage; consequently, NDP attempts to decrease costly data transmissions and enhance system performance.
8.4.1 Case study: ISP-ML (in-storage processing – machine learning) framework
The ISP-ML architecture is designed so that ML techniques can be implemented using in-storage processing. ISP-ML is a system-level simulator created on the Synopsys platform, which is based on the SystemC language. Using the ISP-ML architecture, it is possible to write and execute a range of ML algorithms on the ISP-ML module; these algorithms can be written in programming languages such as C or C++ with a few essential alterations (Lee et al., 2017). The framework uses Samsung SSDs as the storage node. Different ways can be adopted to make use of hardware in the ISP-ML framework: (1) the SSD controller may use the embedded core contained within it, or (2) dedicated hardware logic chips may be designed according to the user's or developer's needs. NAND channels are used for data collection. The channel controller executes common functions on the data by using its ability to handle read/write (R/W) operations to and from the respective NAND flash; the characteristics of the function depend entirely on the sort of ML algorithm to be performed. A cache controller is used by ISP-ML to collect and combine the results generated by each channel controller; its inherent role within the SSD controller is that of cache manager (DRAM manager). A block diagram of the system is given in Figure 8.2.
Figure 8.2: Block diagram of ISP-ML (a host with application, OS, CPU, and DRAM connected to an SSD comprising the SSD controller and NAND flash).
The performance setup uses a PC equipped with an 8-core Intel(R) Core i7 processor running Ubuntu 14.04 and a Samsung SSD 840 Pro; in addition, the system uses an ARM 926EJ-S embedded processor (400 MHz frequency). The system is examined against host-processing memory configurations from 2 GB to 32 GB. In-host processing (IHP) is run on real hardware, while ISP is operated virtually (simulated) by means of the newly produced simulator; the two systems are then compared directly on their individual performance attributes. SGD is used to calculate the gradient of the predefined parameters; SGD is an ML algorithm in which the parameters are updated using a single training sample per execution cycle. The system implements different types of parallel SGD algorithms: EASGD, synchronous SGD, and downpour SGD.
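For readers unfamiliar with the update rule the case study parallelizes, the sketch below shows plain (single-worker) SGD for linear regression: one training sample per update, which is the per-cycle step the parallel variants distribute across NAND channels. The dataset, learning rate, and model are toy assumptions, not the case study's configuration.

```python
# Hedged sketch: plain SGD for a linear model, updating parameters from one
# training sample per cycle (the basic step parallelized by downpour,
# synchronous, and elastic-averaging SGD in the ISP-ML case study).
import random

def sgd(samples, lr=0.05, epochs=30):
    w, b = 0.0, 0.0
    for _ in range(epochs):
        random.shuffle(samples)
        for x, y in samples:             # one sample per parameter update
            error = (w * x + b) - y
            w -= lr * error * x          # gradient of squared error w.r.t. w
            b -= lr * error              # gradient of squared error w.r.t. b
    return w, b

# Toy data drawn from y = 2x + 1 with a little noise.
data = [(x / 10.0, 2 * (x / 10.0) + 1 + random.uniform(-0.05, 0.05))
        for x in range(-20, 21)]
print(sgd(data))   # should be close to (2.0, 1.0)
```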
The new system presented in this case study is examined against traditional IHP on the basis of performance; ISP was shown to work very efficiently even when memory is insufficient. Distributed processing platforms usually experience communication bottlenecks, but ISP does not suffer from them because its communication overhead is very low. A high degree of parallelism can be attained by raising the number of channels within the SSD. In this system, data fragmentation happens regularly whenever the quantity of data becomes greater than the page size, and this can directly influence system performance. Another open issue is data security, which is a major worry for ML applications because the majority of the data may be sensitive; it was not considered while developing the ISP-ML architecture.
8.4.2 Issues and challenges
Already implemented NDP-based systems are typically extended so that they are compatible with, and perform well for, a few specific systems or applications, with attention directed predominantly toward field-specific datacenter architectures because of power limitations. Systems are assessed by means of particular simulators appropriate for the respective software applications and hardware. The system designs illustrated in Awan et al. (2017), Hong et al. (2016), Vermij et al. (2017), Kampars et al. (2017), Lee et al. (2015), Gao et al. (2016), Gao et al. (2015), Das et al. (2018), Lee et al. (2017), and Lee et al. (2018) are founded on in-memory processing; these architectures employ elements such as the hybrid memory cube (HMC) and DRAM. The architectures presented in Awan et al. (2017), Acharya et al. (1998), Gu et al. (2016), Riedel et al. (2001), Sun et al. (2018), Choe et al. (2017), Lee et al. (2017), Kaplan et al. (2016), Kang et al. (2013), and Ruan et al. (2019) are derived from in-storage processing. Among all these systems, several architectures use HDDs, a few use SSDs, and FPGAs are also used in the development of a number of systems; a new system could adopt emerging NVMe SSDs with RAID capability (Gu et al., 2016). The software stack of these existing frameworks holds the software applications for which the NDP hardware architectures are created, simulators intended for evaluating the execution of the respective framework, and programs for the programmable hardware (EH). A few issues or challenges in existing schemes are as follows:
– Existing systems are not developed with multiple storage units at geographically dispersed places in mind and consequently lack a load-balancing mechanism.
– Existing frameworks are tuned for only specific types of ML applications or algorithms.
– In the in-storage processing scenario, added security is essential at various stages such as data, hardware framework, processor context switching, and the OS.
– A scalable system is necessary, with application scalability in terms of both hardware and software.
8.5 Proposed implementation paradigm
CI is expected to be useful for discovering a collection of solutions by means of NDP, derived from the constraints of contemporary applications. This section offers a prospective solution to the concerns specified in Section 8.4.2.
8.5.1 Projected working ML paradigm
The framework explained below contains two major components, a software stack and a hardware stack, both of which are elaborated here.
8.5.1.1 Hardware components
Figure 8.3 shows the hardware components of the proposed architecture: the processing/host unit (CPU), memory, and the storage nodes/devices, since the system is derived from the concept of in-storage processing.
Figure 8.3: Proposed system block diagram (hardware components: CPU and memory on the host side, and a storage drive comprising a controller unit and an in-storage computing part).
The CPU of a computer (PC) can be used as the processing/host unit for the purpose of in-host processing, with memory coupled to the host CPU by default. The key component to design and implement is the storage node. The storage drive contains two major divisions, namely the controller unit and the in-storage computing part. Here, an SSD can be used as the storage unit; the controller unit is composed of the SSD's firmware, and the in-storage computing part includes the compiler needed to run the data-filtration program that has to be installed on the SSD. A kernel must be loaded on the storage device to mount the compiler on the SSD; this can be achieved using the docker and container model, which will help to append a software tier over the firmware of the disk.
8.5.1.2 Software stack
In broad terms, the planned system's software stack contains several program units designed for the diverse functions described hereafter:
– Compiler: Compilers have to be implemented for both sides of the system, namely the host processing system and the drive (storage) unit.
– Simulator: The simulator decreases the execution time substantially, from many hours to a few minutes. It can be developed at the system level and used to simulate the implementation.
– Program unit for the host side: This part of the software stack includes the program that activates the program on the storage drive and that balances the load among several storage nodes.
– Program unit for the drive side: This component encloses the data computation applied to the data accumulated at the storage unit or node; some security techniques applicable to storage may also be included.
– Drivers for the Linux kernel: The system drive can be accessed like a typical storage device because it is treated as a block device within the Linux OS; another driver administers the virtual-file functionality.
The software stack is divided into two components that reside on the host and on the disk, respectively. The application module on the host side is responsible for creating the instances of the ML program or for dividing the techniques into simpler modules that are forwarded to the storage node for further processing. A remote procedure call (RPC) may be used to migrate the application or its modules from the host to the storage. A boomerang pattern may be created, whereby the calling function at the host offloads the modules to storage, the data are processed locally on the storage, and the result set is returned to the calling function. The second implementation pattern of the remote procedure call can be agent-based: in this scenario, the offloading function at the host migrates an object around the various storage nodes; on each node, the mobile-agent object processes the data and moves on to the next storage node where data are available for processing. The main function on the host should be aware of the various locations of the data so that the mobile agent can use that data and move around to process the data on the distributed storage nodes. To deploy this kind of software stack, an application programming interface is required that can manage the end-to-end execution; a proposed set of API modules is as follows (a sketch of the offloading pattern is given after this list): (1) mRPC: provides access to the controller that manages the main RPC library and provides the main API; it runs on a Java virtual machine installed on the host as well as on the storage node. (2) Controller: manages all the instances of the RPC library and the applications, facilitates the sessions between the host and storage, and helps in binding and unbinding the ports/networks to allow connections between the host and the storage node.
(3) Sessions: these help the applications/modules to send and receive objects between the host and the storage. This method has performance benefits, which are discussed in the following section.
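The sketch below illustrates the boomerang offloading pattern described above in plain Python (the proposal targets a JVM-hosted RPC library with mRPC/Controller/Session modules, so the class and method names here are purely hypothetical stand-ins): the host sends a small task description, a storage-side stub applies it to locally held data, and only the reduced result set travels back.

```python
# Hedged sketch of the "boomerang" offload: the host sends a small task
# description to the storage node, the data are processed where they reside,
# and only the result set returns to the caller. All names are hypothetical.

class StorageNodeStub:
    """Stands in for the drive-side RPC endpoint holding the data locally."""
    def __init__(self, records):
        self._records = records               # data resident on the drive

    def run_filter(self, column, threshold):  # executed in-storage
        return [r for r in self._records if r[column] > threshold]

class HostSession:
    """Stands in for a session object binding the host to one storage node."""
    def __init__(self, node):
        self._node = node

    def offload(self, column, threshold):
        # Only the task parameters go out; only the filtered rows come back.
        return self._node.run_filter(column, threshold)

node = StorageNodeStub([{"id": i, "value": i % 97} for i in range(100_000)])
session = HostSession(node)
result = session.offload("value", 95)
print(len(result), "rows returned instead of 100000")
```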
8.5.2 Advantages of the intended system
The proposed system is evaluated against a conventional system using performance measurement parameters such as speed, required bandwidth, and security. In a conventional system, the whole dataset required for a computation is transmitted from the storage system to the host unit as input, so exceptionally high network bandwidth is consumed; as the time required to transmit the data rises, the response time is delayed accordingly. In other words, the system's response time and the network bandwidth consumed by a run increase together. In the intended system, the data are filtered within or close to the storage unit, which reduces data movement as well as the required bandwidth: the system transmits only the filtered data to the processing or host system for further handling. The proposed system can perform even better when many SSDs are used as storage units or nodes, and a load-balancing module can be developed to balance the workload across the storage nodes. A further performance aspect is system reliability: whenever data are transmitted from one site to another (physically or logically), the exposure of the data increases. As discussed earlier, in the projected system the quantity of data that needs to move becomes smaller, so the system becomes more secure.
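A back-of-envelope calculation (with made-up numbers for dataset size, selectivity, and link speed; none of these values come from the chapter) illustrates the kind of saving the reduced data movement targets.

```python
# Hedged back-of-envelope estimate of transfer savings from in-storage
# filtering; dataset size, selectivity, and link speed are assumed values.
dataset_gb  = 500          # data resident on the storage node
selectivity = 0.02         # fraction of the data that is actually useful
link_gbps   = 10           # network link between storage and host (Gbit/s)

full_transfer_s     = dataset_gb * 8 / link_gbps
filtered_transfer_s = dataset_gb * selectivity * 8 / link_gbps

print(f"full dataset:  {full_transfer_s:.0f} s on the wire")
print(f"filtered data: {filtered_transfer_s:.0f} s on the wire")
print(f"bandwidth-time saved: {100 * (1 - selectivity):.0f}%")
```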
8.6 Further research roadmap in in-storage processing
Several potential directions of study in the NDP domain are mentioned below:
– Security and integrity of data: It is vital to create a protected environment for the program that resides on the storage unit, in order to protect the integrity of the data and to ensure the expected outcome if part of the program misbehaves. For in-storage processing, extra safety measures are necessary at several levels, for instance data, hardware, processor context switching, and the operating system. The essential provision needed at the storage location (datacenters, cloud servers, and so on) is data security, and in addition the network over which the data are transmitted must be secured; integrity and authentication of the data ought to be made available at the datacenter.
Network obstruction caused by cyber-attacks such as denial of service is another network-security concern.
– Storage-device (pervasive-device) intelligence: Machine intelligence has been attained to some extent in the form of AI, but in the NDP hypothesis we want intelligence to be accomplished by the storage nodes as well. In contrast to the current paradigm, in which common NDP systems are built with one or more storage nodes and one processing node managing them, the near-term need is to extend existing frameworks so that any one of the storage units can behave as the master while the rest of the storage nodes behave as slaves of that master node. In the intended system, the master should change according to the workload allocated to the storage nodes, both to avoid a single point of failure and to balance the workload among the units of the system.
– Distributed data: As the global digitization of data and the space required to store such digitized data grow, the processing of enormous amounts of data will continue to increase. Since the computing field is moving toward data-intensive applications, the need for digitization of data rises; and as the amount of digitized data grows, the likelihood that it is stored in geographically dispersed datacenters intensifies. Processing data that reside at geographically distributed locations is therefore also one of the open issues in in-storage processing.
8.7 Conclusion and future scope
Evolution can be regarded as an optimization method, and EC is derived from the hypothesis of growth or evolution in nature. In the evolutionary era, computing uses biological concepts to build and propose programming and hardware systems. EP was primarily projected as a practice for producing machine intelligence, and it strongly influences prospective domains such as ML and data science. Intelligence is observed from the perspective of familiarized activities, forecasting the situation, and taking the proper action. Applications based on machine intelligence require data in enormous quantities as input, and the system ought to work with minimum latency. However, the data movement of conventional approaches to data storage and processing increases response time and has a huge influence on the required network transmission capacity (bandwidth). To conquer these issues, in-situ processing has grown into a scenario with significant consequences for processing speed, latency, and network bandwidth. The first systems were restricted to processing in-memory, where the computation on the data is executed in the memory of the host processing unit.
In recent data-centric systems, however, extremely fast secondary storage drives could be the perfect place to execute such computation. At present, many important applications are deployed at a central server unit, while the data necessary to perform the task may reside at a variety of datacenters. The need of this era of computing is therefore to build a system that can accomplish the requirements of in-situ computing over data scattered on a large scale. In future, the system can be developed with security constraints in mind: instead of moving whole sensitive datasets across the network, in-storage processing moves code, or blocks of code, from the host to the storage (that is, through the network). To protect the data from malicious code moving through the network, concepts such as sandboxing or blockchain can be used. Smart and dynamic scheduling for multiple storage devices can also be developed.
References Acharya A., Uysal M., et. al.; Active Disks: Programming Model, Algorithms and Evaluation; Proceedings of the eighth international conference on Architectural support for programming languages and operating systems; San Jose, California, USA — October 02–07, 1998; 81–91. Achmadi D., Suryanto Y., and Ramli K.; On Developing Information Security Management System (ISMS) Framework for ISO 27001-based Data Center; IWBIS 2018; pp. 149–157. Awan A.J., Ayguade E., et.al. Identifying the potential of Near Data Processing for Apache Spark; Proceeding MEMSYS ‘17 Proceedings of the International Symposium on Memory Systems; 2017; 60–67. Boncheol G., Yoon A.S., et. al.; Biscuit: A Framework for Near-Data Processing of Big Data Workloads; 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA); Seoul, South Korea; 18–22 June 2016; IEEE. Carlos A. and Silva S.; Multi objective Optimization; MIT Portugal. Cheng J. and Sun X. Using Electronic Technology and Computational Intelligence to Predict Properties of Dangerous Hydrocarbon. Advances in Mechanical and Electronic Engineering, LNEE 177, Springer-Verlag Berlin Heidelberg, (2012), 621–626. Chiong R. and Beng O.K.; A Comparison between Genetic Algorithms and Evolutionary Programming based on Cutting Stock Problem; Engineering Letters; 2014. Choe H., Lee S et. al. Near-Data Processing for Differentiable Machine Learning Models. Cornell University, (2017), 1–12. Das P., Lakhotia S., et.al.; Towards Near Data Processing of Convolutional Neural Networks; 2018 31st International Conference on VLSI Design and 2018 17th International Conference on Embedded Systems (VLSID); Pune, India; 6–10 Jan. 2018; IEEE. Deb K. Multi-ObjectiveOptimization. Multi-Objective Optimization using EvolutionaryAlgorithms, John Wiley & Sons, Inc, (2001). Emmerich M.T.M. and Deutz A.H. A tutorial on multiobjective optimization: Fundamentals and evolutionary methods. Natural Computing, (2018), 585–609.
Fogel D.A.V.I.D.B. Natural Selection. An overview of evolutionary programming. In: Davis L.D. et al (eds.), Evolutionary Algorithms, (1999), Springer-Verlag New York,Inc, 89–109. Fogel D.B.; A Comparison of Evolutionary Programming and Genetic Algorithms on Selected Constrained Optimization Problems; SIMULATIONS; Society for Modeling and Simulation International (SCS);1995; pp. 394–404. Gao M. and Kozyrakis C.; HRL: Efficient and flexible reconfigurable logic for near-data processing; 2016 IEEE International Symposium on High Performance Computer Architecture (HPCA); Barcelona, Spain; 12–16 March 2016; IEEE. Gao M., Ayers G., et. al.; Practical Near-Data Processing for In-memory Analytics Frameworks; 2015 International Conference on Parallel Architecture and Compilation (PACT); San Francisco, CA, USA; 18–21 Oct. 2015; IEEE. Google, “Considerations for Sensitive Data within Machine Learning Datasets”; online available: https://cloud.google.com/solutions/sensitive-data-and-ml-datasets; Creative Commons Attribution 4.0 License. Gordon T.G.W. and Bentley P.J. Evolving hardware. In: Zomaya A.Y. (eds), Handbook of Nature-Inspired and Innovative Computing, (2006), Springer, Boston, MA, 387–432. Higuchi T., Liu Y., Iwata M., and Yao X. Introduction to evolvable hardware. In: Higuchi T., Liu Y., and Yao X. (eds), Evolvable Hardware Genetic and Evolutionary Computation, (2006), Springer, Boston, MA. Hong B., Kim G., et. al.; Accelerating linked-list traversal through near-data processing; 2016 International Conference on Parallel Architecture and Compilation Techniques (PACT); 11–15 Sept. 2016; Haifa, Israel; IEEE. Jalili-Kharaajoo M. and Moshiri B.; Active Networks and Computational Intelligence; First International Workshop Service Assurance with Partial and Intermittent Resources 2004 Fortaleza, Brazil, August 1–6, 2004 Proceedings; Springer-Verlag Berlin Heidelberg 2004; pp. 128–134. Jalili-Kharaajoo M.; On the Application of Computational Intelligence Methods on Active Networking Technology; GCC 2003; Springer-Verlag Berlin Heidelberg 2004; pp. 459–463. Kampars J. and Grabis J.; Near Real-Time Big-Data Processing for Data Driven Applications; 2017 International Conference on Big Data Innovations and Applications (Innovate-Data); Prague, Czech Republic; 21–23 Aug. 2017; IEEE. Kang Y., Kee Y.-S., et. al.; Enabling Cost-effective Data Processing with Smart SSD; 2013 IEEE 29th Symposium on Mass Storage Systems and Technologies (MSST); Long Beach, CA, USA; 6–10 May 2013; IEEE. Kaplan R., Yavits L., et.al.; From processing-in-Memory to Processing-in-Storage; 2016 International Conference on Parallel Architecture and Compilation Techniques (PACT); Haifa, Israel; 11–15 Sept. 2016; IEEE. Khan Z.H., Ali A.B.M.S., and Riaz Z. Editors; “Computational Intelligence for Decision Support in Cyber-Physical Systems”; Springer Singapore Heidelberg New York Dordrecht London; 2014. Lee J.H., Sim J., et.al.; BSSync: Processing Near Memory for Machine Learning Workloads with Bounded Staleness Consistency Models; 2015 International Conference on Parallel Architecture and Compilation; San Francisco, CA, USA; 18–21 Oct. 2015; IEEE. Lee S. and Choe H., et. al.; Near-data processing for machine learning; ICLR 2017; 1–12. Lee V.T., Mazumdar A., et. al.; Application Codesign of Near-Data Processing for Similarity Search; 2018 IEEE International Parallel and Distributed Processing Symposium (IPDPS); Vancouver, BC, Canada; 21–25 May 2018; IEEE. Lee V.T., Mazumdar A., et. 
al.; Poster: Application-Driven Near-Data Processing for Similarity Search; 2017 26th International Conference on Parallel Architectures and Compilation Techniques (PACT); Portland, OR, USA; 9–13 Sept. 2017; IEEE.
8 Machine learning approach for exploring computational intelligence
177
Loiacono D. and PreuB M.; Computational Intelligence in Games; GECCO’12 Companion; July 7–11; 2012; Philadelphia, PA, USA; pp- 1139. Lontsikh P.A. and Karaseva V.A., et.al.; “Implementation of Information Security and Data Processing Center Protection Standards”; IEEE; 2016; pp. 138–143. Malhotra R., Singh N., and Singh Y. Genetic algorithms: Concepts, design for optimization of process controllers. Computer and Information Science, (2011) March, 4(2), Canadian Center of Science and Education, 39–54. Meesad P., Unger H., et.al.; Advances in intelligent systems and computing; The 9th International Conference on Computing and Information Technology (IC IT2013); King Mongkut’s University of Technology North Bangkok, Bangkok, Thailand; May 9th–10th, 2013. Merkle L.D.; Automated Network Forensics; GECCO’08; July 12–16, 2008, Atlanta, Georgia, USA; pp. 1929–1931. Perez R., Schildhauer W., Srinivasan D et. al. TVDc: Managing security in the trusted virtual datacenter. ACM SIGOPS Operating Systems Review, (2008), 42(1), 40–47. Rattadilok P. and Petrovski A.; Self-learning Data Processing Framework Based on Computational Intelligence enhancing autonomous control by machine intelligence; 2014 IEEE Symposium on Evolving and Autonomous Learning Systems (EALS); 2014. Riedel E., Faloutsos C et. al. Active disks for large-scale data processing. Computer, (2001) Jun, 34(6), 68–74. Ruan Z., Tong H., et. al.; Insider: Designing In-Storage Computing System for Emerging HighPerformance Drive; Proceedings of 2019 USENIX Annual Technical Conference; Renton, WA, USA; July 10–12, 2019; 379–394. Rygula P., Speicher A., et. al.; Decentralized Data Center (DDC); IEEE MIT Undergraduate Research Technology Conference 2017. Salvador R.; Evolvable Hardware in FPGAs: Embedded tutorial; 11 th International Conference on DTIS; 2016; pp. 1–6. Stomeo E. and Kalganova T.; A Novel Genetic Algorithm for Evolvable Hardware; 2006 IEEE Congress on Evolutionary Computation Sheraton Vancouver Wall Centre Hotel, Vancouver, BC, Canada; July 16–21, 2006; pp. 134–141. Sun H., Liu W et. al. DStore: A holistic key-value store exploring near-data processing and on-demand scheduling for compaction optimization. IEEE Access, (2018), 6, 61233–61253. Vermij E., Fiorin L., et. al.; Boosting the Efficiency of HPCG and Graph500 with Near-Data Processing; 2017 46th International Conference on Parallel Processing (ICPP); Bristol, UK; 14–17 Aug. 2017; IEEE. Yafrani M.E., Chand S., et.al.; Multi-objectiveness in the Single-objective Traveling Thief Problem; GECCO ’17 Companion, Berlin, Germany; 2017; pp 107–108.
Simrandeep Singh, Nitin Mittal, Harbinder Singh
9 Classification of various image fusion algorithms and their performance evaluation metrics
Abstract: Image fusion is the process of enhancing the perception of a scene by combining substantial information captured by different sensors, at different exposure values, and at different focus points. Typical source images include infrared and visible images, positron emission tomography and computed tomography scans, multifocus images with different focal points, and images taken by a static camera at different exposure values. Image fusion is currently one of the most promising areas of image processing. An image fusion method seeks to combine two or more pictures into a single picture that contains better information than each source picture, without introducing artifacts. It plays an essential role in diverse applications, namely medical diagnostics, pattern detection and identification, navigation, military and civilian surveillance, robotics, and remote sensing. Three elements are considered in this review: spatial domain fusion methods, different transform domain techniques, and image fusion performance metrics such as entropy, mean, standard deviation, average gradient, peak signal-to-noise ratio, and structural similarity index (SSIM). Many image fusion applications are also explored in this chapter.
Keywords: Image fusion, Multifocus image fusion, Multiexposure image fusion, Multimodal image fusion
9.1 Introduction
Nowadays, digital image processing is used in various application domains. Because of the inherent drawbacks of digital cameras and their image quality, there is great scope for developing techniques that enhance picture quality. Images captured using the same camera may also vary in numerous ways, and a single image is often not enough to analyze the exact situation (Jiang et al., 2017; Jin et al., 2017; Lahat et al., 2015; Systems, 2014). Image fusion is a procedure in which several images are pooled together to form a single fused image and to enhance the information or detail in an image.
Simrandeep Singh, Nitin Mittal, Electronics and Communication Engineering, Chandigarh University, Punjab, India Harbinder Singh, Electronics and Communication Engineering, CEC Landran, Punjab, India https://doi.org/10.1515/9783110648195-009
Normally, two or more images are captured through different sensors, and a composite image with deeper knowledge of the object and scene is generated. The composite image has a large depth of field and contains objects that are clearer, more detectable, and more recognizable (Zhang and Hong, 2005). The main aim of image fusion is to extract useful details from an image data set. A well-designed and optimal algorithm is required that transfers maximum information from the set of source images to the composite image. These images cater to a variety of applications in different domains. The main domains for image fusion are medical (CT, MRI, and PET images) (Daneshvar and Ghassemian 2010; Shen et al., 2005), astronomy (spatial-resolution and spectral-resolution sensors) (Wang et al., 2014), surveillance (visible and IR images) (Bavirisetti and Dhuli 2016a; Feng et al., 2013; Ma et al., 2019; Saeedi and Faez 2012; Zhang et al., 2018a; Zhou et al., 2016), and so on. Image registration, in which two or more different images are aligned to a common coordinate system, is a very significant preprocessing task in image processing; most of the time, already-registered images are used for image fusion. Along with other parameters, light intensity is a major factor responsible for image quality. Infrared (IR) images are often a great substitute for visible images under poor light conditions. IR imaging requires an IR sensor that captures IR radiation, which is normally not visible to the human eye; a very simple application of IR is the remote control, and every remote control works in a different IR range. A low-light-sensitive CCD camera is used to capture IR images. Image fusion is thus a method that fuses distinct images into a single picture comprising rich data from various sensors. The merged picture can be used for feature extraction, segmentation, target detection, and identification. Target sensing (Xin et al., 2013), military surveillance (Sadjadi 2006), robotics (Science n.d.), and medical diagnosis (Haddadpour et al., 2017) are now common image fusion application areas. Spatial domain (SD) and transform domain (TD) are its two major categories (Yang and Li 2010). Direct manipulation of pixels is performed in SD, and conversion from the spatial to the frequency domain happens in TD (Chandra 2015; Wang et al., 2014). SD algorithms are computationally cheap, but the resultant image suffers from contrast distortion and loss of information. Image fusion is the art of merging different pictures into one picture that complements the source data; the general approach adopted is shown in Figure 9.1. The images captured using the same sensor may vary in time coordinates, so in order to combine multiple images it is necessary to bring the input images to a common coordinate system. The process used for this purpose is image registration, where two or more images are aligned geometrically (Jiang et al., 2017; Kubota and Aizawa, 2005; Manchanda and Sharma, 2018; Wu et al., 2011). Registration is not within the scope of this chapter, but it is assumed throughout that all source images are either already registered or acquired with the help of a tripod from a static scene.
Figure 9.1: Approach to image fusion (images a, b, …, n captured by different cameras are combined into a single fused image).
In practice, all fusion algorithms perform the fusion process directly on source images carrying complementary information, as shown in Figure 9.2.
Figure 9.2: Fused image after acquiring complementary information (complementary information from images A and B is combined into the fused image).
The structure of the chapter is as follows (Figure 9.3).
9.2 Related work
In SD analysis, all algorithms are applied directly to the pixels; that is, image fusion is performed at the pixel level and information is extracted only from the pixel values. Although this approach is quite easy and computationally cheap, the resultant images suffer from contrast disorder, distortion, and information loss, and the fused image can show artifacts due to a noisy weighting map. In TD methods (Bhatnagar and Wu, 2011; Li et al., 2013a; Jiang et al., 2018; Wang et al., 2004; Wu et al., 2011), the image is shifted to the frequency domain through the discrete Fourier transform (Kazemi and Moghaddam, 2003), curvelet transform (Yang and Shutao, 2010), nonsubsampled contourlet transform (Yin et al., 2010), and so on. The image is segregated into different frequency components, that is, high- and low-frequency components.
Figure 9.3: Structure of survey paper (review of various image fusion algorithms: 1. Introduction; 2. Related work: multifocus, multiexposure, visible and infrared, and biomedical image fusion; 3. Image fusion methods: sparse representation based methods, multiscale decomposition methods, hybrid transforms, and other transform domains; 4. Quantitative analysis metric; 5. Conclusion).
Figure 9.4: Fusion process for spatial domain (source images A and B → fusion rule: maximum, minimum, averaging, weighted summation, or weighted average → fused image → qualitative measure). Yang B. and Shutao L. 2010.
The low-frequency components carry the overall image content, while information related to edges is given by the high-frequency components. In order to apply TD methods to image fusion, a subblock size must be defined, and the results may vary with the size of this mask; it is very difficult to find the optimum window size.
Figure 9.5: Fusion process for transform domain (source images A and B → decomposition method: pyramid, wavelet, or multiscale geometric analysis → fusion rule → fused image → qualitative measure). Yang B. and Shutao L. 2010.
TD image fusion methods can be further subdivided into three categories: pyramid methods, wavelet methods, and multiscale decomposition (MSD). Burt and Adelson proposed the Gaussian and Laplacian pyramid techniques in 1983 (Burt and Adelson 1983); these provide good spectral information about the image, but the resultant image loses information and also shows artifacts. The main drawbacks of this technique are that edges are not preserved and that texture details, phase, and directional information are missing (Pohl and Genderen 2010); it may also introduce blocking effects in the resultant image. The decomposition block is followed by the fusion rule block, whose aim is to fuse the source images based on the extracted information. According to an extensive literature review, fusion rules fall into three categories: pixel-level, feature-level, and decision-level image fusion. In pixel-level fusion, the pixel values of the source images are processed directly. In feature-level fusion, the higher and lower frequency components of the image are separated. In decision-level fusion, wavelet and transform vectors are used for detection and classification. The image fusion process may be applied to unimodal and multimodal images: unimodal images are captured from a single sensor, whereas multimodal images are captured from different sensors. Examples of unimodal image fusion, which include multifocus and multiexposure fusion, are discussed in the forthcoming sections.
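As a concrete illustration of the pixel-level fusion rules named above (maximum, minimum, averaging, and weighted summation/average), the following is a minimal NumPy sketch. It assumes two co-registered grayscale source images supplied as floating-point arrays; the function and variable names are illustrative and not taken from any of the surveyed works.

```python
import numpy as np

def fuse_pixel_level(img_a, img_b, rule="average", w=0.5):
    """Pixel-level fusion of two co-registered grayscale images.

    img_a, img_b : 2-D float arrays of identical shape.
    rule         : 'average', 'max', 'min', or 'weighted'.
    w            : weight of img_a when rule == 'weighted'.
    """
    a = np.asarray(img_a, dtype=float)
    b = np.asarray(img_b, dtype=float)
    if a.shape != b.shape:
        raise ValueError("source images must be registered to the same size")
    if rule == "average":
        return (a + b) / 2.0
    if rule == "max":          # choose-max: keep the stronger response per pixel
        return np.maximum(a, b)
    if rule == "min":
        return np.minimum(a, b)
    if rule == "weighted":     # weighted summation / weighted average
        return w * a + (1.0 - w) * b
    raise ValueError(f"unknown fusion rule: {rule}")

# Example with synthetic data standing in for two registered source images.
rng = np.random.default_rng(0)
src1 = rng.random((64, 64))
src2 = rng.random((64, 64))
fused = fuse_pixel_level(src1, src2, rule="max")
```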
9.2.1 Multifocus image fusion
In the multifocus fusion process, images are taken from the same camera at different focal lengths, so the foreground (objects) and the background are in focus in different images.
In the multifocus image fusion process, multiple images acquired from the same camera, each focused on different objects, are joined/fused together as shown in Figure 9.6. The foremost objective is to improve the various properties of an image (Feng et al., 2013).
Figure 9.6: Multifocus image fusion: (a) focused background, (b) focused foreground, and (c) fused image.
In various machine vision, medical imaging, and remote sensing applications, it is desirable to overcome the limitation of the ray optics used in the camera, which focuses on different objects when shooting from different distances. The most significant issue concerning multifocus image fusion is deciding how to select the in-focus parts across the source images and how to combine them to produce a single output image.
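One common way to make this in-focus selection, sketched below under stated assumptions (it is not the specific method of any work surveyed here), is to compute a local sharpness score such as the local energy of the Laplacian for each source and then pick, per pixel, the source that is locally sharper; the helper names are hypothetical.

```python
import numpy as np
from scipy.ndimage import laplace, uniform_filter

def focus_measure(img, win=9):
    """Local energy of the Laplacian: a simple per-pixel sharpness score."""
    lap = laplace(img.astype(float))
    return uniform_filter(lap ** 2, size=win)

def fuse_multifocus(img_a, img_b, win=9):
    """Choose, per pixel, the source image that is locally sharper."""
    fm_a = focus_measure(img_a, win)
    fm_b = focus_measure(img_b, win)
    mask = fm_a >= fm_b                 # True where img_a is in focus
    return np.where(mask, img_a, img_b)

# Synthetic example: two images with complementary blurred halves.
rng = np.random.default_rng(1)
sharp = rng.random((128, 128))
img_a = sharp.copy(); img_a[:, 64:] = uniform_filter(sharp, 7)[:, 64:]   # right half blurred
img_b = sharp.copy(); img_b[:, :64] = uniform_filter(sharp, 7)[:, :64]   # left half blurred
fused = fuse_multifocus(img_a, img_b)
```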
9.2.2 Multiexposure image fusion
Generally, natural scenes have widely varying luminance values. In multiexposure image fusion, multiple images acquired from the same camera under different luminance or light intensities are combined together, as shown in Figure 9.7.
Figure 9.7: Multiexposure image fusion: (a) overexposed image, (b) underexposed images, and (c) fused image.
Some images can be overexposed, where the light intensity is high, and some can be underexposed, where the light intensity is low; that is, they are captured at different exposure values (Li and Kang, 2012). Most multiexposure image fusion algorithms use weighted averages, pyramids, gradient direction, guided filters, bilateral filters, and so on.
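To make the weighted-average strategy concrete, here is a minimal sketch that weights each exposure per pixel by a simple "well-exposedness" score (a Gaussian around mid-gray, in the spirit of exposure-fusion weights). This particular weight choice and the helper names are assumptions of the sketch, not the prescribed algorithm of any cited work.

```python
import numpy as np

def well_exposedness(img, sigma=0.2):
    """Weight pixels by how close their intensity (in [0, 1]) is to mid-gray."""
    return np.exp(-((img - 0.5) ** 2) / (2 * sigma ** 2))

def fuse_multiexposure(images, eps=1e-12):
    """Per-pixel weighted average of differently exposed, registered images."""
    stack = np.stack([np.asarray(im, dtype=float) for im in images])   # (K, H, W)
    weights = well_exposedness(stack)
    weights /= weights.sum(axis=0) + eps       # normalize weights across exposures
    return (weights * stack).sum(axis=0)

# Example: simulate under-, mid-, and over-exposed versions of one scene.
rng = np.random.default_rng(2)
scene = rng.random((64, 64))
exposures = [np.clip(scene * g, 0, 1) for g in (0.4, 1.0, 1.8)]
fused = fuse_multiexposure(exposures)
```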
9.2.3 Infrared and visible image fusion
Fusing an IR and a visible image together is an example of multimodal image fusion. In this process, multiple images captured by IR and visible cameras are fused together (Bavirisetti and Ravindra, 2016a, 2016c; Jin et al., 2017; Ma et al., 2019; Sadjadi, 2006; Saeedi and Faez, 2012; Zhang et al., 2018; Zhou et al., 2016). Based on radiation differences, that is, across day/night and different weather conditions, IR images can discriminate targets from their backgrounds, as shown in Figure 9.8.
Figure 9.8: Infrared and visible image fusion: (a) infrared image, (b) visible image, and (c) fused image.
By contrast, visible images are best suited to the human visual system and give a great deal of texture detail. In this type of fusion, the details of both images are merged to form a single image containing all the details. It has vast application in the fields of defense and navigation for target detection. Visible images do not perform well at night or in fog, rain, and many other bad weather conditions, whereas images captured using an IR camera are rich in information about the foreground (target movement).
9.2.4 Biomedical image fusion
Biomedical images may be divided into two categories, structural and functional, according to the information provided by the images. The categorization of biomedical imaging sensors is shown in Figure 9.9.
Figure 9.9: Different biomedical imaging sensors and images (structural and functional sensor modalities: X-ray, CT, MRI, PET, SPECT).
Different biomedical images captured using different sensors, such as positron emission tomography (PET) (Daneshvar and Ghassemian, 2010), single-photon emission computed tomography (SPECT) (Majumdar and Patil, 2013), computed tomography (CT) (Haddadpour et al., 2017), and magnetic resonance imaging (MRI) (Haddadpour et al., 2017), are fused together as shown in Figure 9.10. The complementary properties of these sensors are combined: information regarding hard tissues is provided by CT, soft tissues by MRI, organ functionality by PET scans, and blood flow at a specific position by SPECT imaging.
Figure 9.10: Different biomedical image fusion: (a) MRI-CT, (b) MRI-PET, (c) MRI-SPECT.
All these fused images show greater depth of field and help medical practitioners diagnose cancer and other diseases in human beings. Since human beings cannot be exposed to multimodal scanning waves for long durations, multimodal image fusion is a cheaper and more patient-friendly method. The organs most commonly investigated for disease are the brain, neck, lungs, breast, prostate, pelvis, bone marrow, ovaries, and liver, as these organs are more prone to disorders. The brain is the central processing unit of the human being and mediates feeling and sensation in all parts of the body; a small disturbance inside the brain may lead to major strokes. The lungs are the most important organs of the respiratory system and come into direct contact with the surroundings through the air they take in, so pollutants and viruses are liable to harm them. An image of the lungs can usually show several kinds of information reflecting the condition of the inner tissues, and in early diagnosis it is not a simple task to discern damaged tissue, cancerous tissue, and healthy tissue. Image fusion methods have been demonstrated to improve diagnostic and screening efficiency and, in particular, clinical results. Many researchers have conducted experiments on the breast because of the elevated rate of breast cancer among females; mammography (both analog and digital) and MRI and/or CT are the most common methods for breast research. Combinations of PET (functional imaging) and XRT (CT, anatomic location) have shown important improvements in diagnostic precision.
9.3 Image fusion methods
According to an extensive literature review, image fusion methods may be broadly categorized into those based on sparse representation (SR) (Zhang et al., 2018), MSD (Bavirisetti and Dhuli 2016b; Jin and Gu 2017; Pati et al., 1993; Yang et al., 2007; Yang and Shutao, 2010; Zhou et al., 2016), hybrid transforms, and other domains. These methods are reviewed in the subsequent sections.
9.3.1 Methods based on sparse representation
An SR-based framework was first implemented by Yang and Shutao (2010). Sparsity implies that the reconstruction of a signal requires only a small number of atoms, that is, the coefficients become sparse. Later on, SR was introduced into many medical image fusion, denoising, interpolation, and recognition tasks to remove noise from signals (Yin, 2011; Yang and Shutao, 2010). Finer details such as texture and edges are not well retained by conventional SR. SR exploits the natural sparsity of images, a physiological property grounded in the human visual system, as shown in Figure 9.11.
Figure 9.11: SR-based image fusion (the infrared and visible images are vectorized into pixel vectors, sparse-coded over a dictionary, the sparse coefficients are combined by a fusion rule, and the fused sparse coefficients are reconstructed into the fused image).
Figure 9.12: Image fusion based on MSD (each input image I1, I2 is decomposed into Gaussian and Laplacian pyramids, the detail layers D1, D2 and base layers B1, B2 are fused, and the fused image F is reconstructed). Singh S et al. 2020.
The main transforms used in this domain are simultaneous orthogonal matching pursuit (OMP) (Bin Yang and Li 2012), the joint sparsity model (Yin, 2011), spectral and spatial detail dictionaries (Wang et al., 2014), SR with an over-complete dictionary (Soomro et al., 2017), and gradient-constrained SR (Sreeja and Hariharan, 2018). SR is extensively used in various image processing tasks such as image classification, image fusion, image identification, and character recognition. Novel sparse coding and dictionary-based learning systems are nowadays being studied for SR-based image fusion.
Figure 9.13: Image fusion based on fuzzy logic (sensor A, an MRI image, and sensor B, a SPECT image, are decomposed with the DWT, fused using fuzzy logic, and reconstructed with the IDWT into the fused image). Jiang Q. et al. 2018.
Most of these techniques are based on fusion approaches such as window-based activity level measurement (Liu et al., 2015), weighted averaging (Saeedi and Faez, 2012; Zhang and Levine, 2016), coefficient combining (Wu et al., 2011) based on choose-max (Yang and Shutao, 2012), and sparse coefficient substitution (Kim et al., 2016). However, SR also has some disadvantages: the resultant fused image may lack very fine details, because SR is not able to capture all the fine and edge details of an image. It is therefore often better to apply an SR-based algorithm to the low-frequency components and extract the spatial information from them.
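Since OMP is repeatedly named as the workhorse of SR-based fusion, the following is a minimal sketch of plain (single-signal) OMP, assuming a fixed, unit-norm dictionary and one vectorized image patch; dictionary learning, patch extraction, and the coefficient fusion rule are omitted, and all names are illustrative rather than taken from the cited works.

```python
import numpy as np

def omp(D, y, n_atoms, tol=1e-6):
    """Orthogonal matching pursuit: approximate y as a sparse combination of
    columns of the dictionary D (columns assumed to have unit norm).

    Returns a sparse coefficient vector with at most n_atoms nonzeros.
    """
    residual = y.copy()
    support = []
    x = np.zeros(D.shape[1])
    for _ in range(n_atoms):
        # Pick the atom most correlated with the current residual.
        idx = int(np.argmax(np.abs(D.T @ residual)))
        if idx not in support:
            support.append(idx)
        # Least-squares fit of y on the selected atoms.
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coef
        if np.linalg.norm(residual) < tol:
            break
    x[support] = coef
    return x

# Example: sparse-code a random 8x8 patch over a random unit-norm dictionary.
rng = np.random.default_rng(3)
D = rng.standard_normal((64, 256))
D /= np.linalg.norm(D, axis=0)
patch = rng.random((8, 8)).ravel()
coeffs = omp(D, patch, n_atoms=10)
```

In an SR-based fusion pipeline, the sparse coefficient vectors of corresponding patches from each source image would then typically be combined (for instance by a choose-max rule on their activity) and reconstructed as `D @ fused_coefficients`.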
9.3.2 Multiscale decomposition methods
Burt and Adelson introduced the MSD approach in 1983 by using low-pass or band-pass filters as part of the process (Hospital, 2006). The most dominant family of methods in image fusion is multiscale transform methods, which presume that images can be represented by different layers in various modes. In MSD, the source images are decomposed into several levels, the corresponding layers are merged with specific rules, and the target image is reconstructed accordingly. The Gaussian pyramid (Wang and Chang, 2011), Laplacian pyramid (Krishnamoorthy and Soman, 2010), discrete wavelet decomposition (Song et al., 2006), gradient pyramid (Sreeja and Hariharan, 2018), ratio-of-low-pass pyramid (Indhumadhi and Padmavathi, 2011), contrast pyramid (Wang et al., 2004), and stationary wavelet decomposition are found to be the most widely used MSD methods in the wavelet domain. The categorization of the various classical wavelet-based and pyramid-based techniques was initially proposed by Zhang and
Blum (Zhang et al., 2018). Curves and edges are not well preserved or represented by these methods. To remove these disadvantages, researchers have proposed more sophisticated image fusion tools such as multiscale geometric analysis. Edge-preserving filtering has also been used effectively in recent years to build multiscale image representations. For multiexposure image fusion, an edge-preserving MSD based on a weighted least squares filter was proposed by Farbman et al. (Li and Kang, 2012). Bilateral and directional filters were combined by Hu et al. for multiscale image fusion representations (Singh et al., 2013). For the fusion of IR images (Sadjadi, 2006), Zhou et al. merged Gaussian and bilateral filters. Li et al. introduced a guided-filtering-based image fusion technique that achieves state-of-the-art performance in a number of image fusion applications (Pal et al., 2015). Weighted averaging, optimization-based methods (Wang et al., 2014), and coefficient combining based on substitution are widely used strategies in MSD. The main strategies used in image fusion are guided-filtering-based weighted averaging (Pal et al., 2015; Singh et al., 2014; Zhou et al., 2016), cross-scale fusion rules, coefficient windows, choose-max (Li et al., 2003), and window- and region-based consistency verification. Generally speaking, multiscale IR and visible image fusion methods involve three steps. First, each source image is decomposed into a series of multiscale representations. Then, a fusion rule is used to fuse the multiscale representations of the source images. Finally, the fused image is obtained by applying the corresponding multiscale inverse transformation to the fused representations. The selection of transformations and fusion rules is the key to multiscale transform fusion schemes. Conventional pyramid methods are nowadays combined with optimization methods and are widely used in image fusion; the performance of these methods has improved drastically.
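The three steps just described can be made concrete with a small sketch of Laplacian-pyramid fusion, assuming two registered grayscale images whose sides are divisible by two at every level; the choose-max rule on detail layers and the averaging of base layers are common defaults, not the prescription of any particular surveyed method, and all names are illustrative.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, zoom

def laplacian_pyramid(img, levels=3, sigma=1.0):
    """Decompose an image into `levels` Laplacian detail layers plus a base layer."""
    pyr, current = [], img.astype(float)
    for _ in range(levels):
        low = gaussian_filter(current, sigma)[::2, ::2]        # blur + downsample
        up = zoom(low, 2, order=1)[:current.shape[0], :current.shape[1]]
        pyr.append(current - up)                               # detail (high-pass) layer
        current = low
    pyr.append(current)                                        # coarsest approximation
    return pyr

def reconstruct(pyr):
    """Invert the decomposition by upsampling and adding back the detail layers."""
    current = pyr[-1]
    for detail in reversed(pyr[:-1]):
        current = zoom(current, 2, order=1)[:detail.shape[0], :detail.shape[1]] + detail
    return current

def fuse_pyramids(img_a, img_b, levels=3):
    pa, pb = laplacian_pyramid(img_a, levels), laplacian_pyramid(img_b, levels)
    fused = [np.where(np.abs(a) >= np.abs(b), a, b) for a, b in zip(pa[:-1], pb[:-1])]
    fused.append((pa[-1] + pb[-1]) / 2.0)      # average the coarse base layers
    return reconstruct(fused)

rng = np.random.default_rng(4)
f = fuse_pyramids(rng.random((128, 128)), rng.random((128, 128)))
```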
9.3.3 Hybrid transforms
Recently, researchers have proposed hybrid methods, in which different transform methods are combined with one another, such as intensity-hue-saturation-wavelet (Pohl and Van Genderen, 1998), multiscale transform-SR (Zhou et al., 2016), wavelet-contourlet (Indhumadhi and Padmavathi, 2011), contourlet-SR (Yang et al., 2007), and morphological component analysis-SR (Krishnamoorthy and Soman, 2010). The fusion strategies adopted in these transforms are component substitution (Zhang and Hong, 2005), coefficient- and window-based activity level measurement (Liu et al., 2015), the integration of component substitution and weighted averaging (Ramakanth and Babu 2014), and choose-max and weighted-average-based coefficient combining (Bhatnagar and Wu, 2011; Saeedi and Faez, 2012; Wu et al., 2011).
9.3.4 Methods performed in other TDs
Principal component analysis (PCA) (Pohl and Van Genderen, 1998), IHS (Daneshvar and Ghassemian, 2010), the gradient domain (Wang et al., 2004), independent component analysis (Zhou and Prasad, 2017), fuzzy theory (Jiang et al., 2018), hue-saturation-value (Lu et al., 2017), the IHS transform (Majumdar and Patil, 2013), SD (nontransform) methods (Jiang et al., 2017; Kazemi and Moghaddam, 2003; Zhang et al., 2011), and Gram-Schmidt matting decomposition (Li et al., 2013) are the major methods used in this domain. PCA is a mathematical mechanism that reduces multidimensional data sets to a smaller number of analytical dimensions. In PCA-based fusion, this technique determines a weight factor for each input picture; at each pixel position the weighted pixel values are combined, and their weighted average is taken as the result for that position. In the field of remote sensing, high-resolution panchromatic images are fused with low-resolution multispectral images, which leads to the "pansharpening" problem (Loncan et al., 2015). Component substitution (Rahmani et al. 2010), spatial-context-based weighted averaging (Liu et al., 2015; Zhang et al., 2014; Li and Kang, 2012), model-based methods (Algorithm 2010), machine-learning-based weighted averaging (Li et al., 2002; Song et al., 2006), and region-based activity level measurement (Li and Yang, 2008) are the fusion strategies. People make decisions based on rules, and fuzzy machines that imitate human conduct work in the same manner, except that crisp decisions are replaced by fuzzy sets of decisions and crisp rules by fuzzy rules. Fuzzy rules take the form of a number of if-then statements, for example, "if X then A" and "if Y then B", where A and B are fuzzy sets defined over X and Y. Fuzzy rules describe fuzzy patches, which are the central concept of fuzzy logic.
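A minimal sketch of how PCA-derived weights can be obtained for two source images is given below. It assumes the common construction in which the principal eigenvector of the 2x2 covariance matrix of the flattened images is normalized into per-image weights; this is one standard reading of PCA fusion, not necessarily the exact procedure of the cited works, and the names are illustrative.

```python
import numpy as np

def pca_fusion(img_a, img_b):
    """Weighted-average fusion with weights taken from the principal component
    of the two source images' covariance matrix."""
    a = np.asarray(img_a, dtype=float).ravel()
    b = np.asarray(img_b, dtype=float).ravel()
    cov = np.cov(np.stack([a, b]))                   # 2x2 covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)           # eigh: symmetric matrix
    principal = eigvecs[:, np.argmax(eigvals)]       # eigenvector of the largest eigenvalue
    w = np.abs(principal) / np.abs(principal).sum()  # normalize to weights summing to 1
    return w[0] * np.asarray(img_a, float) + w[1] * np.asarray(img_b, float)

rng = np.random.default_rng(5)
fused = pca_fusion(rng.random((64, 64)), rng.random((64, 64)))
```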
9.4 Quantitative analysis metric
Qualitative (i.e., visual) analysis alone is not adequate to check the efficiency of a fusion algorithm; certain quantitative measures such as entropy (Sreeja and Hariharan, 2018), mean, standard deviation (Manchanda and Sharma, 2018), average gradient (Gosain and Singh, 2015), and PSNR (Sreeja and Hariharan, 2018) are also required (Tsai et al., 2008). Based on the scores of these statistical evaluation metrics, the performance of algorithms can be judged. The average amount of information conveyed by an image is referred to as its entropy (Sreeja and Hariharan, 2018). Proximity of the entropy values of the fused and reference images indicates better fusion; if a reference image (ground truth) is unavailable, higher entropy values are desirable for better fusion results. Entropy may be defined as in eq. (9.1):
E_I = -\sum_{j=1}^{2^L - 1} p(s_j) \log_2 p(s_j)    (9.1)
where L is the number of bits used to quantize the gray levels (so that there are 2^L gray levels) and p(s_j) is the probability of occurrence of gray level s_j in an image I of size M × N. The second metric is the mean, which measures the central tendency of the data; the dispersion around the mean is referred to as the standard deviation (Manchanda and Sharma, 2018). Higher values of the mean and standard deviation are expected for the fused image in order to have better fusion results. For an image I of size M × N, the mean \mu_I and standard deviation \sigma_I are given by

\mu_I = \frac{1}{MN} \sum_{i=1}^{M} \sum_{j=1}^{N} I(i,j)    (9.2)

\sigma_I = \left[ \frac{1}{MN-1} \sum_{i=1}^{M} \sum_{j=1}^{N} \big( I(i,j) - \mu_I \big)^2 \right]^{1/2}    (9.3)
For an image I, \Delta_{I,x} is the gradient in the horizontal direction and \Delta_{I,y} is the gradient in the vertical direction. The average gradient, denoted \bar{\Delta}_I, is given by eq. (9.4):

\bar{\Delta}_I = \frac{1}{MN} \sum_{i=1}^{M} \sum_{j=1}^{N} \left( \frac{\Delta_{I,x}^2 + \Delta_{I,y}^2}{2} \right)^{1/2}    (9.4)
The average gradients of the reference and resultant images should be close for better fusion results; if a reference image is not available, a higher average gradient is expected for better fusion. PSNR, the peak signal-to-noise ratio, is one of the most important parameters in image quality analysis and indicates the similarity between two images. The PSNR between a fused image I_F and the ground truth I_G, each of size M × N, is calculated as in eq. (9.5):

\mathrm{PSNR} = 10 \log_{10} \left( \frac{255^2}{\frac{1}{MN} \sum_{i=1}^{M} \sum_{j=1}^{N} \big( I_F(i,j) - I_G(i,j) \big)^2} \right)    (9.5)

A higher PSNR value indicates greater similarity between the two images. The ideal PSNR value is infinity (∞) in the case of a perfect match; as the PSNR (Kaur and Singh, 2017) value decreases, the match also decreases. The structural similarity (SSIM) index is indicative of the perceptual structural similarity between a ground-truth image I_G and a fused image I_F. Mathematically, the SSIM map between I_F and I_G is defined as follows (Jin et al., 2017; Krishnamoorthy and Soman, 2010):
\mathrm{SSIM}_{\mathrm{map}}(I_F, I_G) = \frac{2\mu_{I_F}\mu_{I_G}}{\mu_{I_F}^2 + \mu_{I_G}^2} \cdot \frac{2\sigma_{I_F}\sigma_{I_G}}{\sigma_{I_F}^2 + \sigma_{I_G}^2} \cdot \frac{\sigma_{I_F I_G}}{\sigma_{I_F}\sigma_{I_G}}    (9.6)
Here \sigma_{I_F I_G} is the blockwise covariance between I_F and I_G, whereas \mu_{I_F} and \sigma_{I_F} are the blockwise mean and standard deviation of the image I_F. The mean of the SSIM map is the SSIM value between I_F and I_G. For closely similar and for dissimilar images, SSIM equals 1 and 0, respectively. PSNR and SSIM can be used only in the presence of a ground truth. Using all these quantitative analysis parameters, the performance of fusion algorithms may be checked and a comparative analysis of different techniques can be carried out.
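As a worked illustration, the following sketch computes entropy, mean, standard deviation, average gradient, and PSNR roughly as defined in eqs. (9.1)-(9.5) for 8-bit grayscale images. The use of NumPy's gradient operator and the guard against empty histogram bins are implementation choices of this sketch, not part of the chapter's definitions.

```python
import numpy as np

def entropy(img, levels=256):
    """Eq. (9.1): Shannon entropy of the gray-level histogram."""
    hist, _ = np.histogram(img, bins=levels, range=(0, levels))
    p = hist / hist.sum()
    p = p[p > 0]                                   # skip empty bins to avoid log(0)
    return -np.sum(p * np.log2(p))

def mean_std(img):
    """Eqs. (9.2)-(9.3): mean and (sample) standard deviation."""
    img = np.asarray(img, dtype=float)
    return img.mean(), img.std(ddof=1)

def average_gradient(img):
    """Eq. (9.4): mean of sqrt((dx^2 + dy^2) / 2) over the image."""
    img = np.asarray(img, dtype=float)
    dy, dx = np.gradient(img)
    return np.mean(np.sqrt((dx ** 2 + dy ** 2) / 2.0))

def psnr(fused, reference, peak=255.0):
    """Eq. (9.5): peak signal-to-noise ratio against a ground-truth image."""
    mse = np.mean((np.asarray(fused, float) - np.asarray(reference, float)) ** 2)
    return np.inf if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)

# Example on synthetic 8-bit images.
rng = np.random.default_rng(6)
ref = rng.integers(0, 256, size=(64, 64))
fused = np.clip(ref + rng.normal(0, 5, ref.shape), 0, 255)
print(entropy(fused), mean_std(fused), average_gradient(fused), psnr(fused, ref))
```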
9.5 Conclusion
Image fusion is a popular research area in image processing and has attracted much attention recently, with wide-ranging research carried out in this domain in recent years. We are entering an era in which it is practically impossible to overlook the existence of several associated datasets, owing to the abundance of different data sources. In this chapter, we have surveyed the various types of images (single-sensor and multimodal) that can be fused together and their corresponding image fusion methods. Various application areas of image fusion have been explored, and the strengths and weaknesses of different image fusion methods have been discussed with examples. Image fusion results may be verified using quantitative and qualitative metrics, and various quantitative performance metrics have also been discussed in this chapter. A future perspective for image fusion is to design fusion methods with fast, effective, and specific responses, as required by rapidly developing technologies. In the future, results may be improved many fold by combining image fusion techniques with optimization algorithms.
References Algorithm, Using Genetic. 2010. Correspondence. British Journal of Hospital Medicine (London, England : 2005), 71(12), 714-Unknown. http://www.ncbi.nlm.nih.gov/pubmed/21135774. Bavirisetti D.P. and Ravindra D. 2016a. Fusion of Infrared and Visible Sensor Images Based on Anisotropic Diffusion and Karhunen-Loeve Transform. IEEE Sensors Journal, 16(1), 203–9. Bavirisetti D.P. and Ravindra D. 2016b. Multi-Focus Image Fusion Using Multi-Scale Image Decomposition and Saliency Detection. Ain Shams Engineering Journal, http://dx.doi.org/ 10.1016/j.asej.2016.06.011.
Bavirisetti D.P. and Ravindra D. 2016c. Two-Scale Image Fusion of Visible and Infrared Images Using Saliency Detection. Infrared Physics and Technology, 76, 52–64. http://dx.doi.org/ 10.1016/j.infrared.2016.01.009. Bhatnagar G. and WU Q.J. 2011. An Image Fusion Framework Based on Human Visual System in Framelet Domain. International Journal of Wavelets, Multiresolution and Information Processing, 10(01), 1250002. Chandra, Satish B. 2015. Digital Camera Image Fusion Algorithm Using Laplacian Pyramid, 4(7), 43–49. Daneshvar S. and Ghassemian H. 2010. MRI and PET Image Fusion by Combining IHS and Retina-Inspired Models. Information Fusion, 11(2), 114–23. http://dx.doi.org/10.1016/ j.inffus.2009.05.003. Farbman Z., Fattal R., Lischinski D. and Szeliski R. Edge-preserving Decompositions for Multi-scale Tone and Detail Manipulation. ACM Transactions on Graphics (TOG), (2008) Aug 1, 27(3), 1–0. Feng Z.J., Zhang X.L., Yuan L.Y., and Wang J.N. 2013. Infrared Target Detection and Location for Visual Surveillance Using Fusion Scheme of Visible and Infrared Images. Mathematical Problems in Engineering 2013. Gosain A. and Singh J. 2015. Proceedings of the 3rd International Conference on Frontiers of Intelligent Computing: Theory and Applications (FICTA) 2014. 327, 305–16. http://link. springer.com/10.1007/978-3-319-11933-5. Haddadpour M., Daneshvar S., and Seyedarabi H. 2017. ScienceDirect PET and MRI Image Fusion Based on Combination of 2-D Hilbert Transform and IHS Method. Biomedical Journal, 40(4), 219–25. https://doi.org/10.1016/j.bj.2017.05.002. Hospital T.K. 2006. Technology, Faculty. 1469–72. Indhumadhi N. and Padmavathi G. 2011. Enhanced Image Fusion Algorithm Using Laplacian Pyramid and Spatial Frequency Based Wavelet Algorithm. International Journal on Computer Science and Engineering, 1(5), 298–303. http://core.kmi.open.ac.uk/display/1083402. Jiang Q. et al. 2018. Multi-Sensor Image Fusion Based on Interval Type-2 Fuzzy Sets and Regional Features in Nonsubsampled Shearlet Transform Domain. IEEE Sensors Journal, 18(6), 2494–2505. Jiang Q., Jin X., Lee S.J., and Yao S. 2017. A Novel Multi-Focus Image Fusion Method Based on Stationary Wavelet Transform and Local Features of Fuzzy Sets. IEEE Access, 5, 20286–302. Jin X. et al. 2017. A Survey of Infrared and Visual Image Fusion Methods. Infrared Physics & Technology. Jin X. and Yanfeng G. 2017. Superpixel-Based Intrinsic Image Decomposition of Hyperspectral Images. IEEE Transactions On Geoscience And Remote Sensing, 55(8). Kaur R. and Singh S. 2017. An Artificial Neural Network Based Approach to Calculate BER in CDMA for Multiuser Detection Using MEM. Proceedings on 2016 2nd International Conference on Next Generation Computing Technologies, NGCT 2016 (October): 450–55. Kazemi K. and Moghaddam H.A. 2003. Fusion of Multifocus Images Using Discrete Multiwavelet Transform. IEEE International Conference on Multisensor Fusion and Integration for Intelligent Systems 2003-Jan: 167–72. Kim M., Han D.K., and Hanseok K. 2016. Joint Patch Clustering-Based Dictionary Learning for Multimodal Image Fusion. Information Fusion, 27, 198–214. http://dx.doi.org/10.1016/ j.inffus.2015.03.003. Krishnamoorthy S. and Soman K.P. 2010. Implementation and Comparative Study of Image Fusion Algorithms. International Journal of Computer Applications, 9(2), 25–35. Kubota A. and Aizawa K. 2005. Reconstructing Arbitrarily Focused Images from Two Differently Focused Images Using Linear Filters. IEEE Transactions on Image Processing, 14(11), 1848–59.
Lahat D. et al. 2015. Multimodal Data Fusion: An Overview of Methods, Challenges and Prospects To Cite This Version : HAL Id : Hal-01179853 Multimodal Data Fusion : An Overview of Methods, Challenges and Prospects. XX: 1–26. Li H., Chai Y., Ling R., and Yin H. 2013a. Multifocus Image Fusion Scheme Using Feature Contrast of Orientation Information Measure in Lifting Stationary Wavelet Domain. Journal of Information Science and Engineering, 29(2), 227–47. Li M., Yan W., and Shunjun W. 2003. Multi-Focus Image Fusion Based on Wavelet Decomposition and Evolutionary Strategy. Proceedings of 2003 International Conference on Neural Networks and Signal Processing, ICNNSP’03 2(3): 951–55. Li S. and Kang X. 2012. Fast Multi-Exposure Image Fusion with Median Filter and Recursive Filter. IEEE Transactions on Consumer Electronics, 58(2), 626–32. Li S., Kang X., Jianwen H., and Yang B. 2013b. Image Matting for Fusion of Multi-Focus Images in Dynamic Scenes. Information Fusion, 14(2), 147–62. http://dx.doi.org/10.1016/ j.inffus.2011.07.001. Li S., Kwok J.T., and Wang Y. 2002. Multifocus Image Fusion Using Artificial Neural Networks. Pattern Recognition Letters, 23(8), 985–97. Li S. and Yang B. 2008. Multifocus Image Fusion Using Region Segmentation and Spatial Frequency. Image and Vision Computing, 26(7), 971–79. Liu Y., Liu S., and Wang Z. 2015. Multi-Focus Image Fusion with Dense SIFT. Information Fusion, 23(May): 139–55. http://dx.doi.org/10.1016/j.inffus.2014.05.004. Loncan L. et al. 2015. Hyperspectral Pansharpening: A Review. IEEE Geoscience and Remote Sensing Magazine. Lu T. et al. 2017. From Subpixel to Superpixel: A Novel Fusion Framework for Hyperspectral Image Classification. IEEE Transactions On Geoscience And Remote Sensing, 55(8). Ma J., Yong M., and Chang L. 2019. Infrared and Visible Image Fusion Methods and Applications: A Survey. Information Fusion, 45: 153–78. https://doi.org/10.1016/j.inffus.2018.02.004. Majumdar J. and Patil B.S. 2013. A Comparative Analysis of Image Fusion Methods Using Texture. Lecture Notes in Electrical Engineering 221 LNEE(VOL. 1): 339–51. Manchanda M. and Sharma R. 2018. An Improved Multimodal Medical Image Fusion Algorithm Based on Fuzzy Transform. Journal of Visual Communication and Image Representation, 51(December 2016): 76–94. https://doi.org/10.1016/j.jvcir.2017.12.011. Burt P.J. and Adelson E.H. 1983. The Laplacian Pyramid as a Compact Image Code. IEEE Transactions on Communications, 31(4), 532–40. Pal C., Chakrabarti A., and Ghosh R. 2015. A Brief Survey of Recent Edge-Preserving Smoothing Algorithms on Digital Images. Pati Y.C., Rezaiifar R., and Krishnaprasad P.S. 1993. Orthogonal Matching Pursuit: Recursive Function Approximation with Applications to Wavelet Decomposition. Conference Record of the Asilomar Conference on Signals, Systems & Computers 1: 40–44. Pohl C. and Van Genderen J.L. 1998. 19 International Journal of Remote Sensing Review Article Multisensor Image Fusion in Remote Sensing: Concepts, Methods and Applications. Pohl C. and Van Genderen J.L. 2010. 1161 Review Article Multisensor Image Fusion in Remote Sensing: Concepts, Methods and Applications. Rahmani S. et al. 2010. IEEE Geoscience and Remote Sensing Letters, 2010, 7(4). Rahmani S., Strait M., Merkurjev D., Moeller M., and Wittman T. An Adaptive IHS Pan-Sharpening Method.Pdf. 7(4), 746–50. Avinash R.S. and Venkatesh Babu R. 2014. FeatureMatch: A General ANNF Estimation Technique and Its Applications. IEEE Transactions on Image Processing, 23(5): 2193–2205. Sadjadi F. 2006. 
Comparative Image Fusion Analysis. 8–8.
Saeedi J. and Faez K. 2012. Infrared and Visible Image Fusion Using Fuzzy Logic and Population-Based Optimization. Applied Soft Computing Journal, 12(3), 1041–54. http://dx.doi.org/10.1016/ j.asoc.2011.11.020. Science C. For Autonomous Robot Systems Lecture 13: Pyramids in Image Processing Class Objectives / Announcements One Motivation. 1–22. Shen S., Sandham W., Granat M., and Sterr A. 2005. MRI Fuzzy Segmentation of Brain Tissue Using Neighborhood Attraction with Neural-Network Optimization. IEEE Transactions on Information Technology in Biomedicine, 9(3), 459–67. Singh H., Kumar V., and Bhooshan S. 2013. Anisotropic Diffusion for Details Enhancement in Multiexposure Image Fusion. ISRN Signal Processing, 2013, 1–18. Singh H., Kumar V., and Bhooshan S. 2014. A Novel Approach for Detail-Enhanced Exposure Fusion Using Guided Filter. The Scientific World Journal, 2014, 1–8. Singh S., Mittal N. and Singh H. Multifocus Image Fusion based on Multiresolution Pyramid and Bilateral Filter. IETE Journal of Research, (2020) Jan 17, 1–2. Song Y., Mantian L., Qingling L., and Sun L. 2006. A New Wavelet Based Multi-Focus Image Fusion Scheme and Its Application on Optical Microscopy. 2006 IEEE International Conference on Robotics and Biomimetics, ROBIO 2006: 401–5. Soomro B.N. et al. 2017. Local and Nonlocal Context-Aware Elastic Net Representation-Based Classification for Hyperspectral Images. IEEE Journal Of Selected Topics In Applied Earth Observations And Remote Sensing, 10(6). http://ieeexplore.ieee.org. Sreeja P. and Hariharan S. 2018. An Improved Feature Based Image Fusion Technique for Enhancement of Liver Lesions. Biocybernetics and Biomedical Engineering, 38(3), 611–23. https://doi.org/10.1016/j.bbe.2018.03.004. Systems E. 2014. Region-Based Multi-Focus Image Fusion Using Image Histogram Comparison. Tsai D.Y., Lee Y., and Matsuyama E. 2008. Information Entropy Measure for Evaluation of Image Quality. Journal of Digital Imaging, 21(3), 338–47. Wang W. and Chang F. 2011. A Multi-Focus Image Fusion Method Based on Laplacian Pyramid. Journal of Computers, 6(12), 2559–66. Wang W., Jiao L., and Yang S. 2014. Fusion of Multispectral and Panchromatic Images via Sparse Representation and Local Autoregressive Model. Information Fusion, 20(1), 73–87. http:// dx.doi.org/10.1016/j.inffus.2013.11.004. Wang -W.-W., Shui P.-L., and Song G.-X. 2004. Multifocus Image Fusion in Wavelet Domain. (November): 2887–90. Wu W. et al. 2011. Objective Assessment of Multiresolution Image Fusion Algorithms for Context Enhancement in Night Vision: A Comparative Study. IEEE Transactions on Pattern Analysis and Machine Intelligence, 34(1), 94–109. Xin W., Wei Y.L., and Liu F. 2013. A New Multi-Source Image Sequence Fusion Algorithm Based on Sidwt. Proceedings – 2013 7th International Conference on Image and Graphics, ICIG 2013 (3): 568–71. Yang B. and Shutao L. 2010. Multifocus Image Fusion and Restoration with Sparse Representation. IEEE Transactions on Instrumentation and Measurement, 59(4), 884–92. Yang B. and Shutao L. 2012. Pixel-Level Image Fusion with Simultaneous Orthogonal Matching Pursuit. Information Fusion, 13(1), 10–19. http://dx.doi.org/10.1016/j.inffus.2010.04.001. Yang B., Jing Z.L., and Zhao H.T. 2010. Review of Pixel-Level Image Fusion. Journal of Shanghai Jiaotong University (Science), 15(1): 6–12. Yang L., Guo B., and Wei N. 2007. Multifocus Image Fusion Algorithm Based on Contourlet Decomposition and Region Statistics. 
Proceedings of the 4th International Conference on Image and Graphics, ICIG 2007: 707–12.
Yin H. 2011. Multimodal Image Fusion with Joint Sparsity Model. Optical Engineering, 50(6), 067007. Yin S., Cao L., Tan Q., and Jin G. 2010. Infrared and Visible Image Fusion Based on NSCT and Fuzzy Logic.: 671–75. Zhang D.C., Chai S., and Gooitzen V.D.W. 2011. Method of Image Fusion and Enhancement Using Mask Pyramid. 14th International Conference on Information Fusion: 1–8. Zhang P. et al. 2018a. Infrared and Visible Image Fusion Using Co-Occurrence Filter. Infrared Physics and Technology, 93(August): 223–31. https://doi.org/10.1016/j.infrared.2018.08.004. Zhang Q. et al. 2018b. Sparse Representation Based Multi-Sensor Image Fusion for Multi-Focus and Multi-Modality Images: A Review. Information Fusion, 40, 57–75. http://dx.doi.org/10.1016/ j.inffus.2017.05.006. Zhang Q. and Levine M.D. 2016. Robust Multi-Focus Image Fusion Using Multi-Task Sparse Representation and Spatial Context. IEEE Transactions on Image Processing, 25(5), 2045–58. Zhang X., Lin H., Kang X., and Shutao L. 2014. Multi-Modal Image Fusion with KNN Matting. Communications in Computer and Information Science, 484, 89–96. Zhang Y. and Hong G. 2005. An IHS and Wavelet Integrated Approach to Improve Pan-Sharpening Visual Quality of Natural Colour IKONOS and QuickBird Images. Information Fusion, 6(3), 225–34. Zhou X. and Prasad S. 2017. Active and Semisupervised Learning With Morphological Component Analysis for Hyperspectral Image Classification. 14 (8). Zhou Z., Dong M., Xie X., and Gao Z. 2016. Fusion of Infrared and Visible Images for Night-Vision Context Enhancement. Applied Optics, 55(23), 6480. https://www.osapublishing.org/ab stract.cfm?URI=ao-55-23-6480.
Dhrubasish Sarkar, Medha Gupta, Premananda Jana, Dipak K. Kole
10 Recommender system in healthcare: an overview
Abstract: Today, information technologies have led to a vast number of innovations and developments in almost every field. In this context, recommender systems (RS) have been setting milestones in the service industry. Looking at web-based services, RS have contributed greatly to increasing the reachability of products and to providing a sea of options for potential customers, and they can also be used as tools to support decision-making. With this advancement of RS in the service industry, healthcare systems do not lag behind. Health recommender systems (HRS) are becoming an important platform for providing healthcare services, cutting down the hectic schedule of visiting doctors and waiting for hours to get checked. In the healthcare industry, RS already play a very significant role in supporting decision-making about an individual's health. Keeping in mind the very limited availability of resources and the potential of HRS, it is important to provide a body of knowledge and information for researchers interested in HRS studies who can make substantial advancements in this domain. Using HRS to suggest the most probable and appropriate medicines after taking the individual's history into consideration is one subdomain worth serious attention. Hence, this chapter provides a literature study of the HRS domain in general, covering the literature, innovations, purpose, and methods of HRS, along with the new concept of HRS being used for medication purposes.
Keywords: Healthcare Recommender System, Recommendation System, Collaborative Filtering, Content Based Filtering, Hybrid Recommendation System
10.1 Introduction The primary care of any patient serves as the patient’s first point of contact with the primitive healthcare system. The primitive healthcare system includes the patients Dhrubasish Sarkar, Medha Gupta, Amity Institute of Information Technology, Amity University, Kolkata, India Premananda Jana, Netaji Subhas Open University, Kalyani, West Bengal, India Dipak K. Kole, Department of CSE, Jalpaiguri Government Engineering College, Jalpaiguri, West Bengal, India https://doi.org/10.1515/9783110648195-010
going to the hospital for any sort of medication or treatment. Receiving any recommendation about doctors, specialists, medicines, treatment, and so on traditionally involved a long process of individuals going to the hospital and waiting in queues for their appointment to be called, after which they could interact with the doctors (Isinkaye et al, 2015; Riyaz et al, 2016). With the advancement of technology and the idea of integrating this advanced technology with healthcare, various studies have been conducted to date in which the challenges of primitive healthcare have always been in the limelight. A number of algorithms put into practice, individually and together, have given remarkable results. The basic concept underlying the health recommender system (HRS) is almost the same as that of the e-commerce recommendation system (RS). The HRS, like any other RS, takes in all the details of the individual patient as input, looks into the patient's history, records all the necessary details, and then works on the basis of a particular algorithm that ultimately produces the result (Burke et al, 2011). These recommendations, when used with the correct measures, can be very useful. For example, if an individual needs immediate medication and going to the doctor would take too long, such an RS can be of great use: it takes in the history of the patient along with other personal details and can recommend a medicine for the time being. Also, if an individual is confused about selecting the best doctor, the RS can take in the personal details of the patient and then recommend some of the best doctors, making it easier for the person to select one. The incorporation of RS into the healthcare domain has indeed been a turning point in the advancement of the healthcare sector, making the primitive healthcare sector smarter (Burke et al, 2011; Sahoo et al, 2019; Sarwar et al, 2001). Collaborative filtering, content-based recommendation, and hybrid recommendation (Cano and Morisio, 2017) are some of the most important algorithms that form the building blocks of the HRS. Despite all the benefits that the HRS grants, there are some challenges that it still faces, and work is being done to overcome them in the near future. This chapter is organized as follows: the next section discusses the basic concepts of RS and HRS, the following sections discuss related work in this field and the challenges and future scope, and the last section contains the conclusion.
10.2 Basic concepts of RS and HRS
A recommendation system, or RS as it is often referred to, is at heart a set of fairly simple algorithms that aim to provide the most relevant and accurate
suggestions or items to the user by filtering all the possibilities from the huge sea of available information (Ouaftouh et al, 2019). E-commerce sites such as Amazon and Flipkart, the Google search engine, and most social media sites such as Facebook and Instagram use RS in their respective domains to provide a faster and better experience to their users. The vast growth of information in today's world allows an individual to make a choice from a wide variety of available options, and the RS helps by filtering the best options from this vast set. The core concept of every RS is that it collects an individual's data explicitly or implicitly, looks at his/her history and likes, filters the relevant choices, and then provides the results (Ma et al, 2017; Paul et al, 2017; Sahoo et al, 2019; Wang and Haili, 2017). Every RS is guided by some very basic steps, which are as follows (Sezgin and Ozkan, 2013):
1. DATA COLLECTION: The most crucial step in building any RS. Data is collected by two methods: implicitly and explicitly. Implicit data is information that is not supplied intentionally but is gathered from the user's behavior in the course of interaction, whereas explicit data is information that the user provides directly, for example through ratings or profile details.
2. DATA STORAGE: The various attributes of the collected data determine how and where the data is stored. For example, the amount of data collected determines how good the recommendations become, and the type of data collected determines the type of storage to be used.
3. FILTERING THE DATA: The collection and storage of the data are followed by filtering of the collected data to extract the most relevant information for making the final recommendations. This step can be carried out by applying a number of algorithms (Ma et al, 2017; Paul et al, 2017; Sahoo et al, 2019; Wang and Haili, 2017):
– Content-Based Filtering (Zafrani et al, 2014): This algorithm focuses largely on the history of the individual and recommends products that are similar to the ones the user has liked in the past.
– Collaborative Filtering: This algorithm focuses on the interests of the person; it compares the interests of one person to those of others, finds users with almost similar interests, and uses them to produce the most relevant results. It is widely used in the technology industry. There are two types of collaborative filtering (CF) techniques:
I. User-User CF: A similarity score is computed between pairs of users, after which the most similar user pairs are found.
II. Item-Item CF: Similarity is calculated and compared between each selected pair of items.
– Hybrid Filtering: Both content-based filtering and collaborative filtering have their drawbacks, which hybrid filtering tries to overcome by combining the two.
In today's technically advanced and fast-moving society, where we hardly have a minute to spare, it is very necessary to keep our bodies healthy and working. No one has the time to sit and wait for their turn in the hospital to meet the doctor and get very minor doubts cleared, unless it is truly important and can only be resolved through a meeting in person. For example, for a small cough and cold it would be a waste of time to spend hours in the hospital. This is where an HRS can help: taking into account the patient's history and present condition, it can suggest some medicines as an interim measure. The HRS can also provide an individual who is confused between a number of doctors with a list of the specialists he or she is looking for. In case of an emergency, where immediate medication is necessary, the HRS can provide some immediate remedies until the issue is resolved later. Taking all these instances into consideration, a number of studies have been conducted in the healthcare domain, and the hope is that in the next couple of years the HRS will evolve into a widely used system that benefits mankind and society in every possible way (Grover and Kar, 2017).
10.3 Overview on HRS
With the rapid development of data mining and analytics, there has been a rise in the application of big data analytics in various domains, and the healthcare domain is a very promising one that also gives promising results (Calero et al, 2016; Harsh and Ravi, 2014; Portugal et al, 2018; Paul and Dey, 2017; Sahoo and Pradhan, 2017; Sharma and Shadabi, 2014). Though the Internet is a great platform for getting answers to every question, every individual needs to be careful not to collect wrong or harmful information. The Internet consists of large quantities of unprocessed data, both harmless and harmful, and it is very necessary that a person gets only the harmless data; an RS filters the data and provides only trustworthy information as output to the patients. During the last few years, a large amount of data has accumulated in healthcare databases; it represents patients' health status and is largely scattered across different sites. The RS, or the HRS, provides very crisp data after a detailed study and analysis of the patient's history and interests.
In the HRS, there are two main entities that play the primary roles: the patients and the products (Harsh and Ravi, 2014; Portugal et al, 2018). The patients provide their preferences about certain domains, products, and topics, and these preferences are searched for in the large sea of data. The whole collected data is represented as a matrix which ultimately supplies the output for each patient-item pair. The HRS mainly works on predictive analysis: it predicts the recommendation/outcome considering all the preferences and interests of the individual patients as input. It works in three basic steps.
– First, the system collects a healthcare dataset from the individual patients.
– Second, different classification algorithms are applied, which filter the data according to the patient's interests and preferences.
– In the final step, the preferences are analyzed and the outputs are determined accordingly.
The HRS can be classified into two categories:
I. PATIENT BASED: The RS works on the patients' choices, their given ratings of the items, and so on; the input is taken directly from the patients.
II. ITEM BASED: The RS works as a recommender engine that calculates the similarity between different preferred items to make recommendations for the patients; the input is not taken directly from the patients.
The working of the HRS is broadly divided into three phases, which are described in the following (Isinkaye et al, 2015; Riyaz et al, 2016; Zhou et al, 2015):
– DATA COLLECTION PHASE: Vital information is collected from the patients and their feedback is analyzed, based on which the patient's profile is created. The main attributes of the patient's profile are behavior, interests, and the resources accessed. There are three ways in which the HRS collects information and analyzes feedback:
1. EXPLICIT FEEDBACK: The feedback is input by the patients themselves, taking the patient's interests into consideration.
2. IMPLICIT FEEDBACK: The patient's behavior is analyzed; this is an indirect collection of data.
3. HYBRID FEEDBACK: The explicit and implicit feedback are merged together.
– LEARNING PHASE: The input is processed and analyzed together with the feedback to predict the outputs.
– RECOMMENDATION PHASE: Analyzing the result of the learning phase, the recommendation phase gives the output.
The learning (or filtering) phase is brought about by three major techniques, as already discussed in the previous section.
– Content-Based Filtering: It focuses on the patients' profiles, their history, feedback, and the preferred items to generate predictions.
– Collaborative Filtering: It creates a patient-item matrix of choices or preferences of items, and calculations are done based on the similarities. If ratings of five movies are considered, where 'A+' denotes that the person has liked the movie and 'A−' denotes that the person has disliked the movie, then the predicted ratings of a person, say 'P', would be compared to the ratings of the person whose ratings almost match those of 'P', say 'Q'. Instead of formulating predictions based on comparison with a single other person, the weighted average of the recommendations of several people is taken into account. The weight given to a person's ratings is based on the correlation between that person and the person for whom the prediction is to be made. This correlation can be computed by the Pearson correlation technique, as given in eq. (10.1). If the rating of an item k is to be calculated for two persons X and Y, where x̄ and ȳ denote the mean values of their ratings and x_i and y_i the individual ratings, then

$$r(X, Y) = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_i (x_i - \bar{x})^2 \, \sum_i (y_i - \bar{y})^2}} \tag{10.1}$$

If k ranges over all the items that both X and Y have rated, then the prediction of X's rating of an item i, based on the ratings of the people who have rated i, is given in eq. (10.2) as

$$P(X, i) = \frac{\sum_Y y_i \cdot r(X, Y)}{n} \tag{10.2}$$

A small code sketch of this computation is given at the end of this section.
– Hybrid Filtering: To increase the accuracy and precision of the results, the above two algorithms are merged together to produce hybrid filtering. Hybrid filtering can be done in two major ways: (a) applying some collaborative filtering in content-based applications, or (b) utilizing some content-based filtering in collaborative applications. Based on the environment, and considering all the possible attributes, suitable methods are chosen to recommend the best results.
To sum up, the primary goals of the HRS are (Burke et al., 2011; Li et al., 2009; Sahoo and Pradhan, 2019):
1. To retrieve harmless and trusted facts, data, and information from the Internet.
2. To analyze and recommend the correct and appropriate information for the patient after a detailed study and analysis of the patient's profile.
3. To adapt their selection methods according to the suitable knowledge domains.
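As a concrete illustration of the collaborative filtering computation in eqs. (10.1) and (10.2), the following minimal Python sketch predicts one patient's rating from Pearson-weighted neighbours. The rating dictionaries and function names are illustrative only, and the prediction is normalized by the sum of absolute weights, a common variant of the plain division by n in eq. (10.2).

# Minimal sketch of user-based collaborative filtering with Pearson correlation.
# The toy ratings and the function names are illustrative, not from the chapter.
import numpy as np

def pearson(x, y):
    """Pearson correlation between two rating vectors (eq. 10.1)."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    xc, yc = x - x.mean(), y - y.mean()
    denom = np.sqrt((xc ** 2).sum() * (yc ** 2).sum())
    return 0.0 if denom == 0 else float((xc * yc).sum() / denom)

def predict(target, others, item):
    """Weighted average of other patients' ratings for `item`,
    weighted by their correlation with `target` (in the spirit of eq. 10.2)."""
    num, den = 0.0, 0.0
    for ratings in others:
        common = [k for k in target if k in ratings and k != item]
        if item not in ratings or not common:
            continue
        w = pearson([target[k] for k in common], [ratings[k] for k in common])
        num += w * ratings[item]
        den += abs(w)
    return num / den if den else None

# Toy example: predict patient P's rating of "movie5" from Q and R.
P = {"movie1": 5, "movie2": 4, "movie3": 1}
Q = {"movie1": 5, "movie2": 5, "movie3": 2, "movie5": 4}
R = {"movie1": 1, "movie2": 2, "movie3": 5, "movie5": 1}
print(predict(P, [Q, R], "movie5"))

Here Q, whose ratings correlate strongly with P, pulls the prediction toward its own rating of the target item, while the negatively correlated R pulls it away.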
10.4 Related work
Here we study and analyze the different health-related issues faced by the general population and try to develop an intelligent HRS which predicts and gives nearly accurate results to the patients. Such systems are mainly built using machine learning algorithms. Figure 10.1 shows the basic architecture of an HRS (Sahoo et al., 2019), in which health record providers such as hospitals, medical clinics, medical researchers, community health staff, and doctors feed patients' records into a centralized server used by the HRS.

Figure 10.1: Basic architecture of health recommender system (Sahoo et al., 2019).
Wiesner and Pfeifer (2014) have presented a basic architecture of a proposed HRS (Figure 10.2). With the rapid development of big data analytics and data mining, the application of big data has spread rapidly across various domains of technology. Considering the various application prospects, healthcare has emerged as a very promising sector in this respect. Big data analytics in healthcare has three main sectors, as mentioned in Table 10.1. Along with these 3Vs there is VERACITY, which largely governs the HRS. The information that is collected and available is highly unprocessed, and in today's world it is necessary to have highly processed and well-organized data to produce rapid outputs; the idea of the HRS was thus conceived. Various works have been conducted worldwide to bring HRS into the spotlight within data mining and big data analytics (Adams et al., 2010; Harper and Konstan, 2015).
1) Basics of HRS: The HRS is a decision-making system designed to recommend appropriate treatment and proper healthcare to patients in a short time. It also provides high-quality health remedies for patients in any urgent condition.
Figure 10.2: Basic architecture of a proposed health recommender system (Wiesner and Pfeifer, 2014), where
Q: PHR user data in (semi-structured) text form,
Q′: set of query terms indicating user-specific facts,
R: set of possibly recommended health information artifacts,
ri′: set of terms representing preprocessed elements in ri,
S: set of selected recommendations for a certain query Q.
Table 10.1: Big data analytics in healthcare.

Sector 1. Volume – the amount of data generated on a regular basis and collected worldwide from all the different organizations on a daily basis.
Sector 2. Velocity – the rate at which the data is generated in every organization on a daily basis.
Sector 3. Variety – the presence of data from a huge sea of different sources and the various domains and kinds of data available.
2) Phases: There are various phases through which any item goes before it is recommended. These are:
a) TRAINING PHASE: Data is collected from the patients and a dataset is created, to which various classification methods are applied later on.
b) PATIENT PROFILE PROCESS PHASE: This phase prepares a profile health record and is the most important phase of the HRS.
c) SENTIMENTAL PHASE: This phase studies the profiles of the patients, their history, and their sentiments, and passes them on to the next phase. Opinions are taken from the patients to determine each patient's areas of concern.
d) PRIVACY PRESERVATION and SECURITY PHASE: This phase is required to preserve any important data so that it remains restricted and unchangeable. It also provides various working algorithms to process the further outputs.
e) RECOMMENDER PHASE: This is the last phase, which works on the inputs of the previous phases and predicts outputs that are nearly accurate results.
3) HRS DESIGN (Calero et al., 2016; Paul and Dey, 2017; Priyadarshini et al., 2018; Sahoo and Pradhan, 2017): Among the different methodologies, the one discussed below is considered one of the most practical and realistic approaches. It is divided into various segments.
– Objectives of the project are defined by a software development team.
– Descriptions of the project's importance and aims are discussed.
– Once these questions are settled, the work passes on to the project development stage, where the minute details of the project and the software development framework are discussed.
– In the last phase, the design is implemented serially, step by step.
It can be summarized in Table 10.2 (Sahoo and Pradhan, 2019):

Table 10.2: HRS design.

Segment 1. Concept development – here the Vs, mainly volume, variety, velocity, and veracity, are dealt with.
Segment 2. Proposal – why is big data used, and how will it be beneficial to society?
Segment 3. Methodology – this is the most descriptive phase, which includes discussion of objectives, data collection, data transformation, and platform/tool selection.
Segment 4. Deployment – includes evaluation and testing.
4) HRS FRAMEWORK: A generalized framework of an HRS is shown in Figure 10.3 (Sahoo et al., 2019).
Figure 10.3: A generalized framework of HRS (Sahoo et al., 2019). Data sources (biometric scans, patients' records, clinics, hospitals) feed into data processing (data selection, data transformation, outlier analysis), data analysis with big data tools (Hadoop ecosystem components), and a recommender engine that produces recommendations and data visualization (usage patterns of drugs, identification of high-risk patients, medical image diagnosis, medical reports, health monitoring of patients, suitable medicine recommendations, health insurance plans).
To build any HRS it is necessary to have a suitable framework that supports a strong and concrete establishment of the HRS in the present and the near future. The framework of the HRS is also the backbone of the system, which can be illustrated in the following way:
a) DATA COLLECTION: Three types of data are retrieved from the vast pool of information available:
i) STRUCTURED DATA: data retrieved from sensors, CT scans, X-rays, and so on.
ii) SEMI-STRUCTURED DATA: data which have no definite configuration under any particular data model but are largely processed.
iii) UNSTRUCTURED DATA: data which are highly unprocessed and unstructured.
The data collected come from various prescriptions, hospital records, patient profiles, CT scans, X-rays, MRIs, blood tests, and so on. The healthcare automation system, an integrated system of IoT and intelligence, transforms this retrieved data in the next phase.
b) DATA TRANSFORMATION: The output of the previous phase is taken as the input of this phase, where the data is categorized into:
i. NUTRITIONAL DATA: data falls under this category when the doctor suggests that the patient change their food habits, switch to a balanced diet, and so on.
ii. PHYSICAL EXERCISE: depending on the weather, health conditions, and location, the doctor may suggest that the patient exercise in order to cut down on the various diseases that he/she might be suffering from or to keep fit and healthy.
iii. DIAGNOSIS DATA: this category takes in the data used to diagnose the symptoms of the disease the patient might be suffering from.
iv. THERAPY: this data includes the possible remedies to the diagnosed disease.
c) DATA ANALYSIS: This mainly deals with the analysis of the data and the beneficial returns and outputs of the process. Apart from the patients, there are pharmacists, clinicians, and hospitals which also benefit a lot from the HRS.
d) VISUALIZATION: This consists of all those elements that affect how the recommended items should be presented.
5) HRS EVALUATION: The transformed and analyzed data should be evaluated taking into consideration all of the necessary attributes mentioned below:
– Precision
– Recall
– RMSE: difference between the predicted data and the known data.
– F-measure: this measures every individual test's accuracy.
– ROC curve: this is the relationship between the sensitivity and the specificity of the data.
6) HRS APPROACHES: There are different approaches to the basic algorithms used in the HRS for predicting nearly accurate output. The most important and useful of these three approaches are discussed in the remainder of this segment.
– MATRIX FACTORIZATION: Matrix factorization is the oldest and easiest method for running the algorithms to give accurate outputs. It is a subclass of the collaborative filtering algorithm, which creates a patient-item matrix of choices and preferences of items. Matrix factorization decomposes the original patient-item matrix into the product of two rectangular matrices. From the patient rating patterns, characteristics of both items and patients are derived as an array of factors, and each item and each patient is referred to by a vector of these factors, which are mainly used to retrieve information. If A is the patient-item matrix, with m = number of patients/queries and n = number of items, this is represented in eq. (10.3) as

$$A \in \mathbb{R}^{m \times n} \tag{10.3}$$
– SVD (SINGULAR VALUE DECOMPOSITION) (Zhou et al., 2015; Sahoo and Pradhan, 2019): This is another class of collaborative filtering that is mainly used to resolve the problems which matrix factorization and the primitive collaborative filtering technique bring. It predicts a patient's rating of an item based on the patient's profile, after which it predicts the outcome. Here, each item is referred to as q_i and each patient as p_u, and SVD estimates the rating from the patient's profile by minimizing the squared error between the known ratings and the corresponding dot products. Therefore, the dot product (expected rating) is

$$\hat{r}_{ui} = q_i^T p_u \tag{10.4}$$

A small code sketch of this latent-factor prediction is given at the end of this section.
– DEEP LEARNING METHOD (Wei et al., 2017; Betru et al., 2017; Dai et al., 2018; Mu, 2018; Priyadarshini et al., 2018; Yuan et al., 2018; Zhang et al., 2019): Deep learning originated as a new field of machine learning research. It emulates the mechanism of the human brain. Considering the machine learning mechanism, the deep learning method is divided into:
1) Supervised learning
2) Unsupervised learning
To construct an HRS using the deep learning method, its main components are the multilayer perceptron with auto-encoder (Jiang and Yang, 2017), convolutional neural network, recurrent neural network, restricted Boltzmann machine (Yedder et al., 2017; Belle et al., 2015), adversarial network, neural autoregressive distribution estimation, and so on. The monolithic deep learning method is the main form of deep learning based HRS, where the HRS integrates several recommendation strategies within one algorithm.
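Returning to the latent-factor approach of eqs. (10.3) and (10.4), the following minimal sketch factorizes a toy patient-item matrix with NumPy's SVD and reads off predicted ratings as dot products. The matrix values and the chosen rank are illustrative assumptions; a production recommender would fit the factors only on the observed ratings, with regularization, rather than decomposing the full matrix.

# Minimal sketch of SVD-style latent-factor prediction (eq. 10.4): the patient-item
# matrix A (eq. 10.3) is factorized and a missing rating is estimated as q_i^T p_u.
# The toy matrix and the rank are illustrative, not taken from the chapter.
import numpy as np

A = np.array([[5, 4, 0, 1],
              [4, 5, 0, 2],
              [1, 1, 0, 5],
              [1, 0, 4, 4]], dtype=float)   # rows = patients, cols = items (0 = unknown)

k = 2                                        # number of latent factors
U, s, Vt = np.linalg.svd(A, full_matrices=False)
P = U[:, :k] * np.sqrt(s[:k])                # patient factors p_u
Q = Vt[:k, :].T * np.sqrt(s[:k])             # item factors q_i

R_hat = P @ Q.T                              # expected ratings r_ui = q_i^T p_u
print(np.round(R_hat, 2))
print("predicted rating of patient 0 for item 2:", round(R_hat[0, 2], 2))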
10.5 Challenges
Sezgin and Özkan (2013) have presented the challenges and opportunities in HRS. According to their study, cyber-attacks, popularity of resources, lack of integration in terms of data exchange, and ethical issues related to data mining are the major challenges in HRS. On the other hand, the low level of expertise required to operate an HRS, the enhanced predictive power of HRS, consistency, knowledge gathering, health education, and so on are the perceived opportunities of HRS. Wiesner and Pfeifer (2014) have discussed some of the key success factors of a successful personal health record (PHR) system, such as patient and caregiver engagement, technological user-friendliness, knowledge of medical terminology, and concern about health record privacy, which can majorly influence the success of an HRS. Choosing relevant PHR entries for relevant recommendations has been considered an open problem in their study.
Work on this concept in the healthcare sector has faced a number of challenges, such as security issues, algorithm filtering issues, complexity issues, time management issues, and privacy issues (Canny, 2002; Wu et al., 2015). When working with such a broad concept, it is very important to consider all the issues whose resolution would help the HRS flourish in both business and economic terms. Some of the challenges that the HRS has faced over the years, and is still facing now, are given as follows:
– There are a number of situations in which the HRS may confuse patient profiles, for example if two or more patients share the same name or the same address. Thus, it is necessary that the HRS segments all the patient profiles correctly and then proceeds with its actions.
– Today's world is updated with new information every second. A medicine that was launched two days back may become obsolete today, its popularity replaced by a newer and more advanced medicine that cures the same problem faster and may even be a substitute for other problems too. Thus, it is very necessary for the HRS software to update itself at regular intervals and replace old information when something more suitable is found.
– It is very necessary for the HRS to be user friendly. Since it is a user-end interactive process in which the users play the major role, the software should be easily accessible to the users. Thus, newly developed HRS software should be built with the users placed first.
– The existing HRS is not very adaptable; that is, it does not work efficiently in every surrounding in which it is placed. This is a very important concern and should be resolved.
– Building an HRS may be costly. It should be built after considering the economic status of both the urban and rural areas of the world so that it can be included in every hospital.
It is vital that the HRS is easily accessible to every individual, and thus it should be very easy to use. Elderly people, the younger generation, and rural and urban people alike should be able to reap the benefits that the HRS is supposed to bring to society, so that in the near future it becomes a prominent asset to society.
10.6 Future scope
The HRS is still not fully developed. Considering all the challenges that the HRS is entangled with, it is now very important to simplify the system and resolve all the backlogs to bring out a very clear HRS that benefits not only society but also the economy of the world. Some of the future directions that can greatly help the HRS gain prominence are as follows:
– As the HRS functions much the way doctors do, it is expected that the HRS will be used as a substitute for doctors in the near future. This would be really helpful economically, as it would cut down on everyone's medical expenses.
– It would no longer be necessary for everyone to go to the doctor for regular health check-ups or for planning regular diets. The HRS would bring everything to their doorstep, which would also help regulate every individual's health.
– From the technical side, using the HRS would make every single individual technically smarter, which would also help them keep up with the advancing technical world.
– Not only would the users be benefitted; the long queues in front of every doctor in the hospitals would be reduced, which would also reduce the workload for doctors.
– Implementing the HRS in every hospital would also make the working of the HRS very efficient and well scheduled.
After considering all the challenges discussed in the previous section, the proper and effective use of various data mining techniques, deep learning algorithms, and various hybrid approaches will make the HRS more efficient and effective in the coming days. The use of natural language processing (NLP) and various behavioural analysis techniques will help to develop patient-centric and mass-customized HRS in the near future. If applied in the right ways, the HRS will bring huge benefits to society and will also positively influence the economic status of the world.
10.7 Conclusion
After a very detailed study and analysis, we can conclude that the HRS is a very promising domain within data analytics and big data management systems. The HRS not only implements various algorithms to filter hundreds of options down to each individual's nearly accurate personal choices; it also decreases the workload of doctors and helps in giving immediate results that can serve as a substitute in emergencies. The core idea of the HRS benefits society in positive ways and also promises highly positive outcomes through its integration with the economics and business of the world.
References Adams R.J., Sadasivam R.S., Balakrishnan K., Kinney R.L., Houston T.K., and Marlin B.M. PERSPeCT: Collaborative filtering for tailored health communications. In Proceedings of the 8th ACM Conference on Recommender systems (RecSys ’14), Foster City, Silicon Valley, CA, USA, 6–10 October 2010; 329–332. Belle A., Thiagarajan R., Soroushmehr S.M., Navidi F., Beard D.A., and Najarian K. Big data analytics in healthcare. BioMed Research International, (2015), 2015, 1–16. Betru B.T., Onana C.A., and Batchakui B. Deep learning methods on recommender system: A survey of state-of-the-art. International Journal of Computer Applications, (2017), 162, 17–22. Burke R., Felfernig A., and Goker M.H. Recommender systems: An overview. AI Mag, (2011), 32(13–18), 0738–4602. ISSN. Calero V.A., Ziefle M., Verbert K., Felfernig A., and Holzinger A. Recommender systems for health informatics: State-of-the-art and future perspectives. Machine learning for health informatics. In: Holzinger A. (Ed.), Lecture Notes in Computer Science, (2016), Vol. 9605, Springer, 391–414. Canny J.. Collaborative filtering with privacy via factor analysis. In Proceedings of the 25th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, Tampere, Finland, 11–15 August 2002; 238–245. Cano E. and Morisio M. Hybrid Recommender Systems: A systematic literature review. Intelligent Data Analysis, (2017), 21(6), 1487–1524. Dai Y. and Wang G. A deep inference learning framework for healthcare. Pattern Recognition Letters, (2018). doi: https://doi.org/10.1016/j.patrec.2018.02.009.
Grover P. and Kar A.K. Big data analytics: A review on theoretical contribution & tools used in literature. Global Journal of Flexible Systems Management, (2017). Harper F.M. and Konstan J.A. The health care datasets: History and context. ACM Transactions on Interactive Intelligent Systems, (2015), 5(19), 1–19. Harsh K. and Ravi S.. Big data security and privacy issues in healthcare. In Proceedings of the 2014 IEEE International Congress on Big Data, Anchorage, AK, USA, 27 June–2 July 2014; 762–765. Isinkaye F.O.R., Folajimi Y.O., and Ojokoh B.A. Recommender systems: Principles, methods and evaluation. Egyptian Informatics Journal, (2015), 16, 261–273. ISSN 1110-8665. Jiang L. and Yang C.C. User recommendation in healthcare social media by assessing user similarity in heterogeneous network. Artificial Intelligence in Medicine, (2017), 81, 63–77. Li T., Gao C., and Du J. A NMF-based privacy-preserving recommender algorithm. In Proceedings of the 2009 First International Conference on Information Science and Engineering, Nanjing, China, 26–28 December 2009; 754–757. Ma X., Lu H., Gan Z., and Zeng J. An explicit trust and distrust clustering based collaborative filtering recommender approach. Electronic Commerce Research and Applications, (2017), 25, 29–39. Mu R. A survey of recommender systems based on deep learning. IEEE Access, (2018), 6, 69009–69022. Ouaftouh S., Zellou A., and Idri A. Social Recommendation: A user profile clustering-based approach. Concurrency and Computation: Practise and Experience, (2019). Paul P.K. and Dey J.L. Data Science Vis-à-Vis efficient healthcare and medical systems: A technomanagerial perspective. In Proceedings of the 2017 Innovations in Power and Advanced Computing Technologies (i-PACT), Vellore, India, 2017, 1–8. Portugal I., Alencar P., and Cowan D. The use of machine learning algorithms in recommender systems: A systematic review. Expert Systems with Applications, (2018), 97, 205–227. Priyadarshini R., Barik R.K., Panigrahi C., Dubey H., and Mishra B.K. An investigation into the efficacy of deep learning tools for big data analysis in healthcare. International Journal of Grid and High Performance Computing, (2018), 10, 1–13. Riyaz P.A. and Varghese S.M. A scalable product recommenders using collaborative filtering in hadoop for bigdata. Procedia Technology, (2016), 24, 1393–1399. Sahoo A.K., Pradhan C., Barik R.K., and Dubey H. Deep learning based health recommender system. Computation, (2019), 7(2), 25. doi: https://doi.org/10.3390/computation7020025. Sahoo A.K. and Pradhan C.R. A novel approach to optimized hybrid item-based collaborative filtering recommender model using R. In Proceedings of the 2017 9th International Conference on Advanced Computing (ICoAC), Chennai, India, 14–16 December 2017; 468–472. Sharma D. and Shadabi F. The potential use of multi-agent and hybrid data mining approaches in social informatics for improving e-Health services. In Proceedings of the 4th IEEE International Conference on Big Data and Cloud Computing, Sydney, NSW, Australia, 3–5 December 2014; IEEE: New York, NY, USA, 2014; 350–354. Sarwar B., Karypis G., Konstan J., and Riedl J. Item-based collaborative filtering recommender algorithms. In Proceedings of the ACM Digital Library International World Wide Web Conferences, Hong Kong, China, 1–5 May 2001; 285–295, ISBN 1-58113-348-0. Sezgin E. and Özkan S. A systematic literature review on health recommender systems. 
In Proceedings of the 4th IEEE International Conference on E-Health and Bioengineering (EHB 2013), Romania, 21–23 November 2013; 1–4. Wang Y. and Hajli N. Exploring the path to big data analytics success in healthcare. Journal of Business Research, (2017), 70, 287–299.
Wei J., He J., Chen K., Zhou Y., and Tang Z. Collaborative filtering and deep learning based recommender system for cold start items. Expert Systems with Applications, (2017), 69, 29–39. Wiesner M. and Pfeifer D. Health recommender systems: Concepts, requirements, technical basics and challenges. International Journal of Environmental Research and Public Health, (2014), 11, 2580–2607. doi: 10.3390/ijerph110302580. Wu J., Yang L., and Li Z. Variable weighted bsvd-based privacy-preserving collaborative filtering. In Proceedings of the 2015 10th International Conference on Intelligent Systems and Knowledge Engineering (ISKE), Taipei, Taiwan, 24–27 November 2015; IEEE: Piscataway, NJ, USA, 2015; 144–148. Yedder H.B., Zakia U., Ahmed A., and Trajkovic L. Modeling prediction in recommender systems using restricted Boltzmann machine. In Proceedings of the 2017 IEEE International Conference on Systems, Man, and Cybernetics (SMC), Banff, AB, Canada, 5–8 October 2017. Yuan W., Li C., Guan D., Han G., and Khattak A.M. Socialized healthcare service recommendation using deep learning. Neural Computing and Applications, (2018), 30(7), 2071–2082. Zafrani R., Abbasi M.A., and Liu H. Social Media Mining: An Introduction. Draft Version. Cambridge University Press, (2014). Zhang S., Yao L., Sun A., and Tay Y. Deep learning based recommender system: A survey and new perspectives. ACM Computing Surveys (CSUR), (2019), 52(1). Zhou X., He J., Huang G., and Zhang Y. SVD-based incremental approaches for recommender systems. Journal of Computer and System Sciences, (2015), 81, 717–733.
Purva Ekatpure, Shivam
11 Dense CNN approach for medical diagnosis
Abstract: Healthcare is one of the domains that has truly advanced to use the latest technologies, like machine learning, to help in diagnosis. Owing to the complexity of medical images, extracting features correctly makes the problem even tougher. The earlier image processing algorithms using descriptors were unable to detect disease accurately, and choosing the correct form of descriptor for a given dataset was an even bigger challenge, which further reduced the accuracy of the machine learning algorithms trained over the dataset. However, recent advancements in the field of deep learning are able to give better results for these classifications. Convolutional neural networks (CNNs) have proved to be a great algorithmic choice for extracting spatial features, making them suitable for medical diagnosis. However, as the number of layers increases in CNNs, the complexity of the network increases and the information passed from one layer to another eventually decreases, causing information loss. In order to overcome this, dense CNNs can be considered. We present a couple of case studies: one using local binary patterns and the other using a dense CNN, on two different types of medical images. The dense CNN work was also recognized by the IEEE Computer Society.
Keywords: Breast Cancer, Healthcare, Computational Intelligence, Neural Networks, Convolutional Neural Networks, Dense Convolutional Neural Networks, Artificial Intelligence
11.1 Introduction
In the field of machine learning and computational intelligence, applications of medical imaging techniques have highlighted multiple issues with medical image datasets. These issues all revolve around a single problem: data starvation. There is a lack of proper ways to collect, label, and reuse this medical imaging data (MID). Properly coordinated efforts among medical domain specialists, MI information analysts, and data scientists from both industry and government are urgently needed for future developments in the field.
Purva Ekatpure, Shivam, Pune Institute of Computer Technology, Indian Institute of Information Technology Allahabad, Pune, India https://doi.org/10.1515/9783110648195-011
Product lifecycle development tie-ups with interested commercial, educational, and government entities are also necessary. High-level features of reusable MIDs suitable to train, test, validate, verify, and regulate machine learning products should be better described. Various medical agencies, such as the NIH, should provide access to reliable MIDs to train the models necessary for proper classification. A platform is necessary to facilitate information exchange and quick communication among the various MI domain experts, data scientists, and interested commercial bodies. The effectiveness of machine learning in medical image analysis is hampered by two challenges:
A. Heterogeneous raw data
B. Relatively small sample size
A. Heterogeneous raw data
Medical images are hard to generalize within a particular class because of the differences within their classes. Pattern recognition experts have recently been attracted to this field because of recent developments in medical diagnosis, such as antinuclear antibody (ANA) testing. In common ANA analysis techniques, the staining pattern of HEp-2 cells is observed in a detailed inspection by experts. The technique used in this method is referred to as indirect immunofluorescence (IIF) imaging. Regardless of how accurate this procedure is, the manual process has the inherent disadvantage of being nonscalable, because of the exponentially increasing number of IIF images that require proper validation and checking. To tackle this problem of manual verification, local binary pattern (LBP) histograms and their variants are used. LBP histograms allow entire images to be represented by various LBPs. Notable variations of LBP include rotation-invariant LBP. In particular situations, merging two different LBP techniques, namely co-occurrence among LBP (CoALBP) and rotation-invariant co-occurrence among LBP (RIC-LBP), allows the generation of a single feature from the data. In addition, to classify the resulting feature vectors, support vector machines (SVM), K-nearest neighbors (KNN), and other supervised learning methods might be implemented. We will discuss all the descriptors in detail as we proceed through the chapter, alongside a case study for demonstration.
B. Relatively small sample size
The other problem we see with medical images is small sample size. We often do not get enough training examples, which leads to overfitting. In this chapter we discuss the dense convolutional neural network, in which we take care not to miss out on important information from the image. We will discuss the corresponding case study in the later part of the chapter.
11.2 Image-based feature extraction using descriptors
Machine learning algorithms require features for classification. These features can be obtained using descriptors and can then be passed on to classifiers. The techniques used here for feature extraction are local binary patterns (LBP), CoALBP, and RIC-LBP. These image feature extraction techniques, along with the respective algorithms, are discussed in detail in the coming subsections.
11.2.1 Local binary pattern
LBP is a grayscale texture operator. It is used to describe an image's locality as binary digits. The pattern is calculated by thresholding the pixel difference between the center pixel and its neighbors:

$$LBP(r) = \sum_{i=0}^{N-1} \operatorname{signum}(I(r + \delta s_i) - I(r)) \cdot 2^i \tag{11.1}$$

where signum(x) is one for x greater than zero and zero otherwise, N is the number of neighbors, and δs_i is the displacement vector from the center pixel, given by δs_i = (s · cos x, s · sin x), where x is a multiple of 360°/N and s is the magnitude of the displacement vector (refer to Figure 11.1).
Figure 11.1: Local binary pattern.
The main property of LBP is that it is invariant to any uniform change in image intensity. However, the LBP feature does not capture the spatial properties of an image; to take the spatial properties into account, we can use CoALBP. The algorithm for LBP is:

numcols ← number of columns
numrows ← number of rows
for i = s to numrows − s do
    for j = s to numcols − s do
        sum ← 0
        p ← 0
        θ ← 0.0
        for count = 1 to N do        (N = 4 neighbors here)
            k ← s * cos(θ)
            l ← s * sin(θ)
            if img[i + k][j + l] > img[i][j] then
                sum ← sum + 2^p
            end if
            p ← p + 1
            θ ← θ + π/2
        end for
        LBP[i][j] ← sum
    end for
end for
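The following NumPy sketch mirrors the pseudocode above and eq. (11.1) for N = 4 neighbors at distance s; the demo image is a random array standing in for a real HEp-2 image.

# Minimal NumPy sketch of the LBP operator in eq. (11.1) for N = 4 neighbors at
# distance s, following the pseudocode above. The input array is illustrative.
import numpy as np

def lbp_map(img, s=1, n_neighbors=4):
    img = np.asarray(img, dtype=float)
    rows, cols = img.shape
    out = np.zeros_like(img, dtype=np.int32)
    angles = [2 * np.pi * i / n_neighbors for i in range(n_neighbors)]
    for i in range(s, rows - s):
        for j in range(s, cols - s):
            code = 0
            for p, theta in enumerate(angles):
                k = int(round(s * np.cos(theta)))
                l = int(round(s * np.sin(theta)))
                if img[i + k, j + l] > img[i, j]:   # signum(I(neighbor) - I(center)) = 1
                    code += 2 ** p
            out[i, j] = code
    return out

demo = np.random.randint(0, 256, size=(8, 8))
print(lbp_map(demo, s=1, n_neighbors=4))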
11.2.2 Co-occurrence among local binary patterns
The CoALBP histogram feature extends the LBP histogram feature to consider the spatial relationship between different positions of the LBP. The CoALBP histogram feature is represented by a histogram of many LBP pairs. An LBP pair at two different LBP positions is represented by

$$P(r, \delta r_\theta) = (LBP(r), LBP(r + \delta r_\theta)) \tag{11.2}$$
where r is the position of the current pixel and δr_θ represents the displacement between the two LBPs in a pair, defined by δr_θ = (d · cos θ, d · sin θ), where d is the magnitude of the displacement vector and θ its direction. Four CoALBP histogram features are extracted for a single value of d with θ = 0°, 45°, 90°, 135°, so we get N² × 4 features, where N represents the number of neighbors. In order to make the CoALBP invariant to any sort of rotation, we propose another method called RIC-LBP. The algorithm for CoALBP is:

numcols ← number of columns
numrows ← number of rows
for i = s to numrows − s do
    for j = s to numcols − s do
        k ← d * cos(θ)
        l ← d * sin(θ)
        if s ≤ i + k < numrows − s and s ≤ j + l < numcols − s then
            coalbp[lbp[i][j]][lbp[i + k][j + l]] ← coalbp[lbp[i][j]][lbp[i + k][j + l]] + 1
        end if
    end for
end for
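A minimal sketch of the co-occurrence counting in eq. (11.2) is given below: it accumulates the histogram of LBP pairs for one displacement magnitude d and direction θ, given a precomputed LBP map. The random stand-in LBP map and the parameter values are illustrative.

# Minimal sketch of the CoALBP co-occurrence histogram (eq. 11.2): counts pairs
# (LBP(r), LBP(r + δr_θ)) for one displacement magnitude d and direction θ.
# The stand-in LBP map and parameters are illustrative.
import numpy as np

def coalbp_hist(lbp, d=2, theta=0.0, s=1, n_neighbors=4):
    rows, cols = lbp.shape
    n_codes = 2 ** n_neighbors
    hist = np.zeros((n_codes, n_codes), dtype=np.int64)
    k = int(round(d * np.cos(theta)))
    l = int(round(d * np.sin(theta)))
    for i in range(s, rows - s):
        for j in range(s, cols - s):
            if s <= i + k < rows - s and s <= j + l < cols - s:   # keep the pair in bounds
                hist[lbp[i, j], lbp[i + k, j + l]] += 1
    return hist.ravel()   # one of the four histograms (θ = 0°, 45°, 90°, 135°)

demo_lbp = np.random.randint(0, 16, size=(8, 8))    # stand-in LBP map (N = 4 gives codes 0..15)
print(coalbp_hist(demo_lbp, d=2, theta=0.0).shape)  # (256,) for N = 4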
11.2.3 Rotation invariance co-occurrence among local binary pattern As seen in the previous works that rotation invariance is an important characteristic feature of medical images classification for high accuracy. So In this we define CoALBP as rotation invariant, the pair of LBP is written as Pφ ðr, δrθÞ = ðLBPφ ðrÞ, LBPφ ðr + δrθÞÞ N −1 X signumðIðr′ + δSi, Φ − I ðrÞÞ*2i LBPΦ r′ =
(11:3) (11:4)
i=0
δsi, φ = ðs*cosðθ + φÞ, s*sinðθ + φÞÞ
(11:5)
where r′ represents r or r + δr_θ and φ represents the angle of rotation of the entire LBP pair. In order to make it rotation invariant, the same label is provided for all values of φ belonging to 0°, 45°, 90°, 135°, and 180°. Then an id is given to every LBP pair P by the algorithm mentioned below. The algorithm for building the mapping table is:

N ← number of neighbors
id ← 1
limit ← 2^N
m[i][j] ← Null for all i, j
for i = 0 to limit − 1 do
    for j = 0 to limit − 1 do
        if m[i][j] = Null then
            i′ ← i circularly right-shifted by N/2 bits
            j′ ← j circularly right-shifted by N/2 bits
            m[i][j] ← id
            m[i′][j′] ← id
            id ← id + 1
        end if
    end for
end for

The total number of ids generated is limit × (limit + 1)/2. The RIC-LBP histogram has a comparably lower dimension than CoALBP, so the RIC-LBP feature can be computed at low cost. RIC-LBP also has an additional property compared to CoALBP: it is robust against rotation. In the next section we will discuss the machine learning algorithms used for classification.
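Before moving on, here is a minimal Python sketch of the mapping-table algorithm above. It assumes N-bit LBP codes, so that a circular shift by N/2 bits corresponds to rotating the pattern by 180°; the function name is illustrative.

# Minimal sketch of the RIC-LBP mapping table built by the algorithm above:
# an LBP pair (i, j) and its 180°-rotated counterpart (i', j') share one id.
def ric_lbp_mapping(n_neighbors=4):
    limit = 2 ** n_neighbors
    half = n_neighbors // 2
    mask = limit - 1

    def rot(code):
        # circular right shift of an N-bit code by N/2 bits
        return ((code >> half) | (code << (n_neighbors - half))) & mask

    mapping = {}
    next_id = 1
    for i in range(limit):
        for j in range(limit):
            if (i, j) not in mapping:
                mapping[(i, j)] = next_id
                mapping[(rot(i), rot(j))] = next_id
                next_id += 1
    return mapping

m = ric_lbp_mapping(4)
print(max(m.values()))   # limit * (limit + 1) / 2 = 136 distinct ids for N = 4

For N = 4 this yields 136 distinct ids, matching the limit × (limit + 1)/2 count stated above.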
11.3 Classification using extracted features
Classification based on medical images can be done with the help of different classifiers. These classifiers require features for classifying; they can use features provided by humans or can learn their own features. As we have seen in the previous section, descriptors like LBP, CoALBP, and RIC-LBP are used to extract spatial information from the medical images as features. These features are then passed to classifiers like KNN and SVM for classification. Let us understand these classifiers and how they are used for medical image classification.
11.3.1 K-nearest neighbors
KNN takes the features extracted by the descriptors and uses them for classification. It is a nonparametric algorithm, that is, it does not assume anything about the distribution of the data. In the case of healthcare, since medical image datasets do not follow theoretical mathematical assumptions, this nonparametric nature of KNN is useful. KNN is based on the concept of "feature similarity." It is a lazy algorithm, that is, it does not generate a model during the training phase; instead it stores all the training data and waits until the testing phase to classify. It checks the similarity of the test data with its k nearest neighbors and predicts the output class of the test data based on the majority of the output classes of those k neighbors (k is the number of nearest neighbors). To explain with the help of an example, consider the following case. There are two classes, coarse speckled (CS) and fine speckled (FS), represented by blue circles and purple stars, respectively. The class of the orange rectangle can be either CS or FS; refer to Figure 11.2.
Figure 11.2: KNN example: data distribution.
Let us take k = 3. The three closest points to the orange rectangle will determine its class. We can conclude with good confidence that the orange rectangle belongs to CS, since all three of its neighbors belong to CS (Figure 11.3). Thus, it can be clearly seen that k is important in the classification of the test data. There is no single "optimal" value of k; it changes from dataset to dataset. Generally, what is observed is that a smaller value of k makes the prediction unstable, whereas a larger value of k gives stable predictions due to majority voting, up to a certain value of k beyond which the error increases again. Usually k is chosen as an odd number to have a tiebreaker.
Figure 11.3: KNN.
To find the k nearest neighbors in KNN, different types of distance measures can be used, such as Euclidean distance, Manhattan distance, Hamming distance, Minkowski distance, and so on. Although KNNs are simple and require few parameters to tune, in the case of classification of medical images they do not seem to be the best option: the amount of data is high, which makes the method slower and also makes it difficult to store all of the data.
KNN often leads to specialization within a class rather than generalization, and since medical images also have differences within a class, this leads to overfitting. Hence, algorithms like KNN are not considered the first-choice algorithms for medical diagnosis.
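For completeness, a minimal scikit-learn sketch of KNN classification over descriptor histograms is shown below; the random feature vectors stand in for real HEp-2 features, and k = 5 with the Euclidean metric are illustrative choices rather than values from the case study.

# Minimal sketch of classifying descriptor histograms (e.g., RIC-LBP features)
# with k-nearest neighbors in scikit-learn. The random features stand in for
# real HEp-2 feature vectors; k and the metric are illustrative choices.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
X = rng.random((120, 64))                 # 120 images, 64-dimensional histograms
y = rng.integers(0, 6, size=120)          # 6 staining-pattern classes

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

knn = KNeighborsClassifier(n_neighbors=5, metric="euclidean")
knn.fit(X_tr, y_tr)                       # lazy learner: just stores the training data
print("test accuracy:", knn.score(X_te, y_te))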
11.3.2 Support vector machines
SVM takes the features extracted by the descriptors as input and uses them for classification. Suppose there are n features; then each data item is plotted as a point in n-dimensional space, with the coordinates representing the values of the respective features. The algorithm finds the optimal hyperplane which separates the data according to the classes. The hyperplane is a line when the number of input features is two, and if there are three input features, then the hyperplane is a plane. Consider the following example (Figure 11.4). There are two classes, CS and FS, represented by blue circles and green triangles, respectively. The goal is to find the hyperplane which correctly separates the two classes.
Figure 11.4: SVM example: data distribution.
In the case of Figure 11.5, the hyperplane is a straight line. Thus, the hyperplane separating the two classes is found by SVM. Not only this, it finds the best hyperplane. Consider the following example (Figure 11.6): There are three hyperplanes possible, but the algorithm chooses the hyperplane with maximum margin, that is, hyperplane C in this case. This ensures robustness, thus reducing the chance of misclassification. However, in cases like below (Figure 11.7), although hyperplane B is the one with maximum margin, still since hyperplane A classifies the classes more accurately, it is chosen.
Figure 11.5: SVM: Straight line as a Hyperplane.
Figure 11.6: SVM: Best Hyperplane.
There can be cases when there are outliers. In such cases (Figure 11.8), SVM chooses the hyperplane with the maximum margin, as it is robust to outliers. In real life, the data is not as clean as in the example above; there are linearly nonseparable datasets. In such cases, kernels are used to transform nonseparable problems into separable problems. Kernels are transformation functions: they convert a low-dimensional input space into a higher-dimensional input space. For example, by kernelling, in the example below (Figure 11.9), the hyperplane obtained is a circle.
Figure 11.7: SVM example: Choice of hyperplanes.
Figure 11.8: SVM: Outliers case.
For SVM, we can specify how much we want to avoid misclassifying each training example. It is determined by the regularization parameter (“C”). If the value of C is large, it means that avoiding misclassification is the priority, rather than finding larger margin plane. Hence, even if a hyperplane has a smaller margin but it does a good job of separating the classes, then SVM selects it. On the other hand, if the value of C is small, then although it might happen that some points are misclassified, still it finds the larger margin hyperplane. Among other parameters is the Gamma parameter. It defines how far the influence of a single training example reaches. High value of Gamma means only nearby points are considered, whereas low value means that far away points are also considered.
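A minimal scikit-learn sketch tying together the kernel, C, and gamma parameters discussed above is shown below; the random data is a placeholder for real descriptor features, and the parameter values are illustrative rather than tuned.

# Minimal sketch of an SVM classifier with the C, gamma, and kernel parameters
# discussed above, using scikit-learn. The random data is a placeholder for
# real descriptor features; the parameter values are illustrative.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(1)
X = rng.random((120, 64))                  # feature vectors (e.g., RIC-LBP histograms)
y = rng.integers(0, 6, size=120)           # 6 staining-pattern classes

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=1)

# The RBF kernel maps the features to a higher-dimensional space; C trades margin
# width against misclassification, and gamma controls the reach of each example.
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=10.0, gamma=0.01))
clf.fit(X_tr, y_tr)
print("test accuracy:", clf.score(X_te, y_te))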
Figure 11.9: SVM: Kernelling- Circle as a hyperplane.
The advantage of SVM is that its classification accuracy is high; it also has a good capability to handle faults and provides generalization. This makes it better than KNN in many cases, but the major disadvantage is that it cannot easily be extended to diverse datasets, as there are several key parameters which need to be set accordingly. Thus, as we have seen, in the case of KNN and SVM we need to extract the features and pass them to the algorithms for training. In the case of images, especially medical images, there are many features which need to be extracted. Extracting them via descriptors does not ensure that all features are captured, and classifying on such features does not give the best results. As a result, we consider using a neural network approach for feature extraction as well as classification, since the networks learn the features on their own, thus taking into account many combinations of features and thereby giving good classification results.
11.4 Artificial neural networks
ANNs are inspired by biological neural networks, that is, they are brain-inspired systems which are intended to replicate the way that we humans learn. Neural networks consist mainly of three types of layers, namely, the input layer, the output layer, and the hidden layers. Layers are just sets of neurons. The input layer receives the data which we provide to the neural network. The hidden layers are the layers which actually extract information and learn the features. The output layer is where the finished computations of the network are placed and the results are given.
Neural networks prove to be an excellent tool in cases where it is difficult for the human programmer to extract the information due to its complexity, leaving the machine to learn the features and recognize the patterns. Following is the basic architecture diagram of a neural network (Figure 11.10).
Figure 11.10: Artificial Neural Network-Basic Architecture.
There are different types of neural networks like convolutional neural networks (CNNs), recurrent neural networks and so on. Among these, for image classification, CNNs give promising results. Let us understand more about CNNs.
11.5 Convolutional neural networks Convolutional Neural Networks (CNNs) are a special type of neural network. Like a normal neural network, they are made up of neurons. Weights are assigned to each of the connections among the neurons, which change as and when the network learns. Each neuron in the layer receives some inputs, performs a dot product, and optionally follows it with a nonlinearity. The output layer predicts the class. CNNs are the most effective in case of image recognition and classification. Following is the basic architecture of a CNN (trying to classify as boat, cat, house, tree) (Figure 11.11):
Figure 11.11: Basic architecture of a convolutional neural network (feature maps produced by alternating convolution + ReLU and pooling stages, followed by a fully connected layer and the output layer with class scores such as boat, house, tree, and cat).
CNNs have four major components:
a. Convolution layers: Convolutional layers learn from the input images by extracting features. Convolution uses small squares of input data to learn the features, thus preserving the relationship between the pixels. It performs a dot operation between the image matrix and the kernel/filter to produce the feature maps. Convolution with different filters extracts different information from the image, thus producing different feature maps.
b. Nonlinearity: There are various functions which can be used to introduce nonlinearity, such as the rectified linear unit (ReLU), sigmoid, and tanh. The most commonly used is ReLU. For all the pixels in the feature map, ReLU replaces every negative-valued pixel with zero. Most of the real-world data we would want our ConvNet to learn is nonlinear; thus, ReLU introduces nonlinearity into the ConvNet.
c. Pooling or subsampling: When the input images are too large, the number of parameters generated is also large, which makes them difficult to process. While reducing the dimensionality of each feature map, retaining the information is very important to get correct results. This is achieved by pooling. Spatial pooling, also called subsampling or downsampling, reduces the dimensionality of each map while retaining the important information. Max pooling, average pooling, and sum pooling are the different types of spatial pooling.
d. Classification (fully connected (FC) layer): In the FC layer, all the neurons of the preceding layer are connected to all the neurons of the succeeding layer. The high-level features of an input image are learnt by the convolutional and pooling layers. The FC layer uses these extracted features and classifies the input image into its class, based on the training dataset. The FC layer is followed by a softmax activation function in the output layer.
Although CNNs are considered a good choice for spatial feature extraction (images), they do have the problem of vanishing gradients. In order to overcome this drawback of CNNs, dense CNNs can be considered.
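Before turning to dense CNNs, a small Keras sketch of the four components above is given; the input size and the four example classes (boat, house, tree, cat) follow Figure 11.11 and are illustrative, assuming TensorFlow/Keras is available.

# Minimal Keras sketch of the four CNN components described above:
# convolution + ReLU, pooling, a fully connected layer, and a softmax output.
# Input size and the four classes (boat, house, tree, cat) are illustrative.
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Input(shape=(64, 64, 3)),
    layers.Conv2D(32, (3, 3), activation="relu"),   # convolution + ReLU
    layers.MaxPooling2D((2, 2)),                    # pooling / subsampling
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),           # fully connected layer
    layers.Dense(4, activation="softmax"),          # class probabilities
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()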
11.6 Dense convolutional neural networks
We have seen in the previous section that CNNs are a good choice for extracting spatial features, making them suitable for healthcare-related static data. However, since medical images have many minute details, a larger number of layers is required in order to extract all the features. In the case of CNNs, as the number of layers increases, the complexity of the network increases, and the information passed from one layer to another eventually decreases due to dropout, thus causing information loss. In order to overcome the vanishing-gradient drawback of CNNs, dense CNNs can be used (Gao Huang et al., Densely Connected Convolutional Networks). In the case of traditional CNNs with M layers, there are M connections, one between every layer and its next layer. In the case of dense CNNs, on the other hand, a particular layer uses the feature maps of all its preceding layers to produce its own feature maps, and the subsequent layers receive all of these feature maps as input. Hence there are M(M + 1)/2 connections. Thus the concept of concatenation is used instead of summing. Following is the basic architecture diagram of a dense CNN. It consists of dense blocks and transition layers in between the dense blocks (Figure 11.12).
Figure 11.12: Basic architecture of a dense convolutional neural network (dense blocks 1-3 separated by convolution and pooling transition layers; pooling reduces the feature map sizes between blocks, while feature map sizes match within each block; a final pooling and linear layer produce the output).
As can be seen from Figure 11.12, the dimensions of the feature maps remain the same within a block so that they can be concatenated together easily. The convolutional and pooling layers act as transition layers. Because of this structure, a dense CNN can be thinner and more compact; it also requires fewer parameters and makes accurate predictions. One of the biggest advantages of DenseNets is the improvement in the flow of information as well as gradients throughout the network, which makes them easy to train. There is implicit deep supervision, as each layer has direct access to the gradients from the loss function and to the original input signal, which helps the training of deeper network architectures. Further, in cases where training sets are small, there is a risk of overfitting; here the dense connections help by having a regularizing effect. For all these reasons, DenseNets are a promising choice for medical image diagnosis.
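The concatenation-based connectivity can be sketched with the Keras functional API as follows; the growth rate, number of layers per block, and input shape are illustrative assumptions rather than the exact DenseNet configuration.

# Minimal Keras sketch of a dense block: each layer receives the concatenation of
# all preceding feature maps, as described above. Growth rate, number of layers,
# and input shape are illustrative, not the DenseNet paper's exact configuration.
from tensorflow.keras import layers, Model

def dense_block(x, num_layers=4, growth_rate=12):
    for _ in range(num_layers):
        y = layers.BatchNormalization()(x)
        y = layers.ReLU()(y)
        y = layers.Conv2D(growth_rate, (3, 3), padding="same")(y)
        x = layers.Concatenate()([x, y])   # concatenate instead of summing
    return x

inputs = layers.Input(shape=(56, 56, 24))
x = dense_block(inputs)
x = layers.Conv2D(32, (1, 1))(x)           # transition: 1x1 convolution ...
x = layers.AveragePooling2D((2, 2))(x)     # ... followed by pooling
model = Model(inputs, x)
model.summary()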
Let us see the following case studies with respect to image classification to get better insight:
11.7 Case studies
11.7.1 Case study 1
11.7.1.1 Classification of IIF images
There are various tests for antinuclear antibodies (ANAs). One of the most commonly used is indirect IIF. Usually, human epithelial type 2 (HEp-2) cells are used as a substrate for detecting antibodies in human serum. HEp-2 cells are used to coat the microscope slides, and the serum is incubated with the cells. If antibodies are present, they will bind to the antigens on the cells; in the case of ANAs, the antibodies will bind to the nucleus. These bindings need to be visualized to see the results, which is done by adding a fluorescently tagged (usually FITC or rhodamine B) antihuman antibody that binds to the antibodies. Distinct patterns of fluorescence are seen on the HEp-2 cells depending on the antibody present in the human serum and the localization of the antigen in the cell. Various staining patterns are observed on HEp-2 cells, but the major six among them are cytoplasmic, homogeneous, nucleolar, centromere, fine speckled, and coarse speckled (refer to Figure 11.13).
Figure 11.13: Staining patterns HEP2 cells (homogeneous, cytoplasmic, centromere, fine speckled, coarse speckled, and nucleolar).
11.7.1.2 Results
In this section we discuss the results obtained in detail. Initially we discuss the results obtained by using a single feature, which are given in Table 11.1, where "ChiSquare 1" indicates the chi-square distance metric with k = 1.

Table 11.1: Accuracy using a single feature (RIC-LBP, CoALBP, and LBP descriptors under the chi-square and Euclidean distance metrics).
The results show that RIC-LBP gave better results than LBP and almost the same results as CoALBP; since the computation cost of RIC-LBP is lower than that of CoALBP, RIC-LBP would be our first choice. The next set of results was obtained by considering three features simultaneously, taking the values of (s, d) as (1, 2), (2, 4), and (4, 8) and the number of neighbors as four. It was noticed that the accuracy increased by 16%, and the maximum accuracy, about 61%, was obtained by the RIC-LBP histogram feature. The next table shows the first twenty images with their predicted and actual patterns (refer to Table 11.2).
Table 11.2: Predicted patterns and their actual patterns.

Image   Predicted pattern   Actual pattern
1       Nucleolar           Nucleolar
2       Nucleolar           Nucleolar
3       Homogeneous         Homogeneous
4       Cytoplasmic         Homogeneous
5       Coarse speckled     Coarse speckled
6       Homogeneous         Homogeneous
7       Nucleolar           Centromere
8       Cytoplasmic         Nucleolar
9       Centromere          Nucleolar
10      Coarse speckled     Coarse speckled
11      Cytoplasmic         Cytoplasmic
12      Coarse speckled     Coarse speckled
13      Coarse speckled     Fine speckled
14      Centromere          Nucleolar
15      Nucleolar           Centromere
16      Homogeneous         Homogeneous
17      Centromere          Centromere
18      Centromere          Centromere
19      Centromere          Centromere
20      Fine speckled       Coarse speckled
It was seen that FS and CS were often interchanged, giving misclassifications as each other in many cases because of the very small difference between the two patterns; nucleolar patterns were classified as centromere for the same reason. It was also observed that the accuracy for the cytoplasmic and centromere patterns was relatively higher than for the others. The reason is that the cytoplasmic pattern has a more characteristic, fiber-like shape compared to the others, and similarly the centromere pattern has a larger number of spots than the others, which makes it distinctive.
11.7.2 Case study 2
11.7.2.1 Breast cancer phase detection using dense convolutional neural network
The cells of the body maintain a regeneration process cycle, which is responsible for maintaining the natural functioning of the body. However, in abnormal situations, a few cells may start growing aberrantly, leading to cancer.
Cancer cells can be formed in any part of the body and can spread to other parts of the body. Cancer has many types, among which breast cancer creates a major health concern. Generally, the chances of women acquiring breast cancer are higher compared to men. Breast cancer depends on multiple factors like age, breast density, obesity, and so on. Generally, cancer tumors are classified into two classes: (i) benign (noncancerous) and (ii) malignant (cancerous). For the treatment of cancer, identifying the type of tissue (normal, benign, or malignant) is an important step. Based on the infiltration damage of the tissue, it can be classified into two groups: (i) noninvasive and (ii) invasive.
Histopathology is used to study diseases by examining tissues. In clinical medicine, the specimens are processed and histological sections are placed onto glass slides. Histopathology is used to examine the surgical specimen by a pathologist. The preparation process of these histopathology slides maintains the underlying tissue architecture, which helps to get a detailed view of the disease and its effect on tissues. Some disease characteristics can be deduced only from a histopathology image, for example, lymphocytic infiltration of cancer. Also, in general, analysis of these images is considered the best way of diagnosing many diseases, including almost all cancer types (Gurcan MN, Histopathological Image Analysis: A review, 2009).
In this example we have followed a dense CNN approach to classify the dataset into four classes, examples of which are shown clockwise from the left in Figure 11.14: (1) normal, (2) benign, (3) in situ, and (4) invasive carcinoma.
For the preprocessing, we used the Macenko stain normalization technique to obtain normalized histopathological images (M. Macenko et al., A method for normalizing histology slides for quantitative analysis, 2009, Boston). After the stain normalization, an augmented dataset is created from the normalized images in the training set. Usually, CNN classification requires a dataset with many samples, but the dataset used here has few samples (ICIAR 2015 Grand Challenge on Breast Cancer Histopathology Images); hence, there are chances of overfitting. In order to overcome this problem, the images are divided into patches, which makes the dataset more complex, and the data is augmented. Augmentation through rotation in all directions further improves the dataset.
Figure 11.14: Histopathology image.
Various studies have shown that data augmentation and patching do not hamper the results, owing to the rotation-invariant nature of the histological textures. As a result, physicians can study the images from various angles without affecting the diagnosis. Rotation thus also increases the size of the dataset, which further helps to avoid overfitting while preserving all the information. First, each image is divided into patches of 512 × 512 pixels with a 50 percent overlap, and each patch is normalized by subtracting the average value of each RGB channel separately. Each patch is then rotated in steps of 90 degrees and mirrored vertically to create eight such patches. A total of about 70,000 different patches are formed from the original 250 training images, and each patch is assigned the class of its original image.
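The patching and augmentation step can be sketched in a few lines of Python. This is only an illustrative sketch under the description above (512 × 512 patches, 50 percent overlap, per-channel mean subtraction, and eight rotation/flip variants), not the authors' original code; the function and variable names are our own.

```python
import numpy as np

def extract_patches(image, size=512, stride=256):
    """Split an RGB image into size x size patches with 50% overlap (stride = size // 2)."""
    h, w, _ = image.shape
    patches = []
    for y in range(0, h - size + 1, stride):
        for x in range(0, w - size + 1, stride):
            patch = image[y:y + size, x:x + size].astype(np.float32)
            # Subtract the mean of each RGB channel separately.
            patch -= patch.mean(axis=(0, 1), keepdims=True)
            patches.append(patch)
    return patches

def augment(patch):
    """Create eight variants: four 90-degree rotations, each with and without a vertical flip."""
    variants = []
    for k in range(4):
        rotated = np.rot90(patch, k)
        variants.append(rotated)
        variants.append(np.flipud(rotated))
    return variants

# Usage sketch: every augmented patch inherits the class label of its source image.
# patches_and_labels = [(v, label)
#                       for img, label in zip(training_images, labels)
#                       for p in extract_patches(img)
#                       for v in augment(p)]
```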
11.7.2.2 DenseNet
We use the DenseNet-121 architecture for our prediction model. We resize our 512 × 512 pixel patches to 224 × 224 and pass them to a convolutional layer, which produces feature maps of size 112 × 112 pixels. These are then passed through a pooling layer, which reduces them to a size of 56 × 56. This architecture has four dense blocks and, between them, three transition layers. To make the model more compact, we reduce the number of feature maps at the transition layers. The dense blocks apply filters of different sizes in each block during training and halve the output size each time. The final dense block outputs a 7 × 7 feature map, which is vectorized into a 1 × 1 output by the classification layer.
The weights are initialized using the Glorot method (Glorot X. and Bengio Y., Understanding the difficulty of training deep feedforward neural networks, 2010). To prevent overfitting, batch normalization is applied and the dropout technique (Srivastava N., Dropout: a simple way to prevent neural networks from overfitting, 2014) is used, dropping out units at random with a dropout rate of 0.8.
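A minimal Keras sketch of such a model is given below, assuming TensorFlow is available. It is not the authors' original implementation: the 224 × 224 input, the four-class softmax output, Glorot initialization, batch normalization, and the dropout rate of 0.8 follow the description above, while the optimizer and other details are illustrative choices.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_densenet121_classifier(num_classes=4):
    # DenseNet-121 backbone (4 dense blocks, 3 transition layers), trained from scratch.
    backbone = tf.keras.applications.DenseNet121(
        include_top=False, weights=None, input_shape=(224, 224, 3), pooling="avg")
    model = models.Sequential([
        backbone,
        layers.BatchNormalization(),
        layers.Dropout(0.8),  # drop units at random with rate 0.8
        layers.Dense(num_classes, activation="softmax",
                     kernel_initializer="glorot_uniform"),  # Glorot (Xavier) initialization
    ])
    model.compile(optimizer="adam",
                  loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model

model = build_densenet121_classifier()
model.summary()
```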
11.7.2.3 Results
We assess the performance of the model on the test set comprising 1400 images from each class. Our model achieves an accuracy of 89.50%. From Table 11.3 we see that the model achieves an AUROC score of almost 1 for three classes (invasive, in situ, and normal) and a very high value of 0.98 for the benign class (refer to Table 11.3 and Figure 11.15).

Table 11.3: Performance of the model.
Type | Precision | Recall | F-score | Support | AUROC
Benign | – | – | – | – | 0.98
In situ | – | – | – | – | 1.00
Invasive | – | – | – | – | 1.00
Normal | – | – | – | – | 1.00
Figure 11.15: ROC curves, true positive rate versus false positive rate (class 0 = benign, area = 0.98; class 1 = in situ, area = 1.00; class 2 = invasive, area = 1.00; class 3 = normal, area = 1.00).
11.8 Conclusion and future work for case study 1
In this project we obtained an initial approach to the classification of HEp-2 cells; there remains considerable scope for improving the accuracy by using other parameters and by modifying the existing algorithm to suit the requirements. We plan to work on another algorithm, RIC-LBP, in which the LBPs are first made rotation invariant and co-occurrence pairs are then computed among the rotation-invariant LBPs. Moreover, the existing algorithms can be customized to give better results; for example, the pattern of selected neighbors was made flexible because the implementation was written in C. More accurate classifiers such as SVM can also be used to produce better results. From the results it is clear that the RIC-LBP histogram gives the best result; the results obtained using the CoALBP histogram feature were almost as good as RIC-LBP, whereas the plain LBP histogram gave very low accuracy. Different values of K were evaluated for the KNN classifier, and values of 4 and 5 gave the best results. The cytoplasmic, centromere, and homogeneous patterns were predicted with good accuracy because of their distinctive texture patterns, while the CS and FS patterns were predicted with lower accuracy. The time complexity was very low because a low-level programming language (C) was used. The main feature of this project, as mentioned earlier, is that it was implemented purely in C, which reduced its complexity and increased its flexibility. A systematic description is given in this report, presenting a brief view of the entire project, and the algorithms can be modified according to need.
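To make the classification pipeline concrete, a minimal Python sketch of an LBP-histogram-plus-KNN classifier is given below (the original project was implemented in C). The uniform-LBP variant, feature settings, and data variables are illustrative assumptions; only the choice of a KNN classifier with K around 4 or 5 follows the text.

```python
import numpy as np
from skimage.feature import local_binary_pattern
from sklearn.neighbors import KNeighborsClassifier

def lbp_histogram(image, points=8, radius=1):
    """Compute a normalized LBP histogram for a grayscale cell image."""
    lbp = local_binary_pattern(image, points, radius, method="uniform")
    n_bins = points + 2  # uniform patterns plus the non-uniform bin
    hist, _ = np.histogram(lbp, bins=n_bins, range=(0, n_bins), density=True)
    return hist

# Usage sketch with hypothetical data:
# train_images / test_images are lists of 2D grayscale arrays, train_labels the pattern classes.
# X_train = np.array([lbp_histogram(img) for img in train_images])
# X_test = np.array([lbp_histogram(img) for img in test_images])
# knn = KNeighborsClassifier(n_neighbors=5)   # K = 4 or 5 gave the best results
# knn.fit(X_train, train_labels)
# predictions = knn.predict(X_test)
```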
11.9 Conclusion and future work for case study 2
Here we aimed to predict the class of breast cancer from Macenko stain-normalized images using a type of deep neural network (DenseNet). The major problem when using CNNs for classification is feature loss as the number of layers increases. Moreover, even the smallest section of a histopathological image contains immense information, so it becomes difficult and at times meaningless to apply plain CNNs to medical imaging. With the use of DenseNet we were able to overcome this problem. Further, patching and augmentation not only helped us to increase the size of the training set, which was important for avoiding overfitting, but also let us focus on every minute detail in the image, as only a very small part of the tissue is considered in each instance. Our results are better than the state-of-the-art approaches in this field, giving a higher accuracy. In the future we aim to apply this approach to various other types of medical images as well. If the desired results are achieved, it could change the way medical reports are approached.
References
Glorot X. and Bengio Y. (2010, March). Understanding the difficulty of training deep feedforward neural networks. In Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics (pp. 249–256). http://www.jmlr.org/proceedings/papers/v9/glorot10a/glorot10a.pdf?hc_location=ufi
Huang G., Liu Z., van der Maaten L., and Weinberger K.Q. Densely connected convolutional networks. 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA. https://ieeexplore.ieee.org/xpl/conhome/8097368/proceeding
Gurcan M.N., Boucheron L., Can A., Madabhushi A., Rajpoot N., and Yener B. Histopathological image analysis: a review. IEEE Reviews in Biomedical Engineering, (2009), 2, 147–171. doi: 10.1109/RBME.2009.2034865. [NCBI].
ICIAR 2015 Grand Challenge on Breast Cancer Histology Images. https://rdm.inesctec.pt/dataset/nis-2017-003
Ioffe S. and Szegedy C. (2015). Batch normalization: accelerating deep network training by reducing internal covariate shift. arXiv preprint arXiv:1502.03167. [arXiv].
Macenko M. et al. A method for normalizing histology slides for quantitative analysis. 2009 IEEE International Symposium on Biomedical Imaging: From Nano to Macro, Boston, MA, 2009, pp. 1107–1110. doi: 10.1109/ISBI.2009.5193250. URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=5193250&isnumber=5192959
Srivastava N., Hinton G., Krizhevsky A., Sutskever I., and Salakhutdinov R. Dropout: a simple way to prevent neural networks from overfitting. The Journal of Machine Learning Research, (2014), 15(1), 1929–1958. http://www.jmlr.org/papers/volume15/srivastava14a/srivastava14a.pdf
URL: https://medium.com/@RaghavPrabhu/understanding-of-convolutional-neural-network-cnndeep-learning-99760835f148, Date of Access: 6th October 2019
URL: https://www.analyticsvidhya.com/blog/2018/03/introduction-k-neighbours-algorithmclustering/, Date of Access: 1st October 2019
URL: https://medium.com/machine-learning-101/chapter-2-svm-support-vector-machine-theory-f0812effc72, Date of Access: 3rd October 2019
Dhaval Bhoi, Amit Thakkar
12 Impact of sentiment analysis tools to improve patients’ life in critical diseases
Abstract: The development, approbation, and acceptance of various social media tools and applications have opened new doors of opportunity for gaining crucial insight from unstructured information. Sentiment analysis and opinion mining have become popular in recent years and can be applied in diversified application areas such as healthcare informatics, sports, the financial sector, politics, tourism, and consumer activities and behavior. In this regard, this chapter presents how sentiment analysis can help improve the lives of people suffering from critical diseases. Healthcare-related unstructured tweets shared on Twitter are becoming a popular source of information for healthcare research. Sentiment analysis is becoming a metric for measuring the feelings and opinions of patients suffering from severe diseases. Various tools and methodologies are used, from which a color-coded Word Cloud can be formed based on sentiment. Exploring the methods used for sentiment analysis in healthcare research allows us to gain better insight into and understanding of human feelings, psychology, and mindset. The study surveys the types of tools used in each case and the different media sources, and examines the impact of sentiment analysis on diseases like obesity, diabetes, cardiovascular disease, hypertension, schizophrenia, Alzheimer’s disease, and cancer, and its effect on patients’ lives. Sentiment analysis helps in designing strategies to improve patients’ understanding and behavior.
Keywords: Sentiment analysis, Healthcare informatics, Social media, Word Cloud
12.1 Introduction
A huge amount of textual data is collected by the healthcare industry. In 2013, a California-based health network collected medical records in electronic form containing various images and annotations (W. Raghupathi and V. Raghupathi). Most of the information is stored as text records (Gupta and Lehal, 2010) in e-health records, handwritten observations made by physicians regarding patients’ visits, social media, prescriptions, and letters.
Dhaval Bhoi, U & P U. Patel Department of Computer Engineering, Chandubhai S. Patel Institute of Technology, Charotar University of Science and Technology, Gujarat, India
Dr. Amit Thakkar, Smt. Kundanben Dinsha Patel Department of Information Technology, Chandubhai S. Patel Institute of Technology, Charotar University of Science and Technology, Gujarat, India
https://doi.org/10.1515/9783110648195-012
Further, “the majority of medical doctors spend half of their time processing huge quantities of administrative forms, credentials, and classifying hundreds of missing labels and imaging orders” (Kelso, 2018). Analysis of such massive and diverse data can provide actionable insights. The industry uses these data to reduce costs, personalize and reorganize services, and improve the quality of patient care to a significant level. The information gathered from data analysis helps to promote clinical and research initiatives with fewer medical errors and lower costs while fulfilling mandatory compliance and regulatory requirements. As an example, such analysis is estimated to save about $300 billion per year in the US healthcare domain. However, according to Gartner’s survey, only around 5% of the textual data is being evaluated by the industry (Just, 2018). The available data also pose quite a few challenges, such as the identification of key clinical variables, terminological variability, and vagueness. The data also vary in length, complexity, and use of technical vocabulary, so information discovery becomes multifarious. There is a torrential, rapidly growing flow of data. The tremendous growth of social networking sites like Twitter and Facebook, together with communication devices such as PCs, mobile phones, and laptops, allows people to interact with one another and generate vast amounts of data. The wide Twitter network, with more than 400 million users, produces about 170 million tweets per day (Yasin et al., 2014). By the year 2020, the quantity of produced data is expected to reach around 44 zettabytes, with at least half of it being unstructured data (Khoso, 2016) generated through social media technologies, including Facebook, Twitter, and mobile instant messaging applications such as WhatsApp, Telegram, and many more. Among the available data sources, Twitter appears to dominate all other data sets. Available data are largely classified as either structured or unstructured. It is comparatively easy to deal with structured data, whereas unstructured data create challenges because of their unformatted, free-form nature. Yet both structured and unstructured data play a key role in effective statistical data analysis.
12.1.1 What is sentiment analysis?
Sentiment analysis is the domain that deals with people’s judgments, responses, and feelings extracted from unstructured data. Sentiment analysis is also known as emotional artificial intelligence or opinion mining. It draws on natural language processing, the computational study of language, and text mining. It helps to collect subjective information from content such as comments on blogs, Twitter posts, social media, Facebook, and so on.
Generally, there are two main types of tasks related to sentiment analysis or opinion mining:
1. Detection of polarity: This helps us to identify whether a view or an opinion of a person is positive (+ve), neutral, or negative (–ve). To refine basic sentiment polarity, a numerical score within a certain range can be assigned based on the strength of the derived sentiment.
2. Characteristic-based sentiment analysis: Sentiment analysis models can detect a positive or a negative polarity in a given text based on its characteristics, irrespective of the type of textual data. The textual data may be a word, a sentence, a paragraph, or a corpus of documents.
In recent years, owing to the enormous popularity of opinion mining, emotion analysis has been applied in numerous expanded areas like sports, the economic sector, politics, travel, customer behavior, and the healthcare domain.
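As an illustration of polarity detection, the short Python sketch below uses the VADER lexicon shipped with NLTK. This is only one of many possible tools and is not specific to the studies discussed in this chapter; the example sentence and the thresholds on the compound score are illustrative assumptions.

```python
import nltk
from nltk.sentiment import SentimentIntensityAnalyzer

nltk.download("vader_lexicon", quiet=True)  # one-time download of the sentiment lexicon
analyzer = SentimentIntensityAnalyzer()

text = "The nurses were very kind, but the waiting time was far too long."
scores = analyzer.polarity_scores(text)  # returns neg/neu/pos proportions and a compound score

# Map the compound score (range -1 to +1) to a polarity label.
if scores["compound"] >= 0.05:
    polarity = "positive"
elif scores["compound"] <= -0.05:
    polarity = "negative"
else:
    polarity = "neutral"

print(polarity, scores)
```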
12.2 Importance of sentiment analysis
Sentiment analysis refers to the techniques and processes that help an organization retrieve information about how its customers are reacting to a particular service or product. Sentiment analysis is important because emotions and attitudes toward a topic can turn out to be actionable pieces of information useful in many areas like business and research. Medical sentiment analysis has nowadays become an emerging topic of research. It helps in effectively building a patient-assisted healthcare system. Nearly half of Internet users search for a medical treatment or procedure on Google, and more than 60% of people look for health-related issues over the Internet. Sentiment analysis tools are also becoming smarter day by day: they become more precise at extracting sentiment as we provide them with more data. Considering all of the above, sentiment analysis drives overall improvement in value.
12.3 Sentiment analysis challenges
Due to language complexity and the huge variety of languages available, sentiment analysis faces several major issues related to natural language processing at different levels of analysis, such as document-level, sentence-level, aspect-level, and lexicon-level analysis. They are depicted in Table 12.1.
Table 12.1: Natural language processing-related challenges for sentiment analysis.
Analysis entity | Natural language processing-related problem faced
Document | Opinion related to a specific domain; style of writing; spamming
Sentence | Conditional and comparative sentences; opinion word sharing; identification of subjectivity; source and target of the opinion; handling negation
Aspect level | Pruning; stemming
Lexicon level | How a word is orientated; orientation strength; sarcasm detection; entity reference resolution; words having dual meaning
12.4 Sentiment analysis uses in various industry sectors
Sentiment analysis can be applied in various fields (Shayaa, 2018), including sports, the financial sector, politics, tourism, consumer behavior, and the healthcare domain.
1. Sports
Twitter data can be extremely valuable when combined with other sports-related data like historical team performance and current team injuries. Relating sentiment or emotional artificial intelligence (AI) to sports, for example, we live in the epoch of the “experts” of football, of deep analysis, and of special guests who are ex-technicians and ex-football players who manage and weave countless possibilities. Wise men of this generation analyze corner shots, make sketches on blackboards, assemble alignments between questions and answers, and predict results. It is the intellectual age of football that has had a notable influence on managers, coaches, and footballers. Emotion or sentiment analysis can play a vital role in predicting the winner of a match based on the sentiment expressed by groups of people.
2. Financial sector
Before the latest technology evolved, banks took the usual approach to understanding customers by sampling, transactions, and so on, to decide which customers have a credit card or mortgage that could benefit from refinancing at a competitor and then make an offer when the customer contacts them. Financial sentiment analysis is an important research area of financial technology. This research focuses on investigating the influence of using different financial resources on investment and on how to improve the accuracy of forecasting through deep learning.
3. Politics
In recent years, there has been growing interest in mining online political sentiment in order to predict the outcome of elections. If the accuracy of sentiment analysis of political tweets can be improved (or its limitations at least better understood), then this will likely have a positive effect on its usefulness as an alternative or complement to traditional opinion polling.
4. Tourism
Travel forums can summarize and consolidate the opinions or sentiments of prospective and existing customers and can be used for rating prediction of a travel company. Uber uses social listening on a daily basis, which allows them to understand how users feel about the changes they are implementing. Guest reviews can clearly influence people’s decision making. Companies not only want good reviews but also the insight that helps them learn about customers and their expectations.
5. Consumer behavior
Corporations perform sentiment analysis to analyze customers’ opinions. Consumers use social media to share both their positive and negative experiences with the manufacturer and with other users. This also has a great impact on the business performance of the companies.
6. Healthcare
Emotion analysis is one of the emerging trends in the healthcare domain that helps to gain a reasonable upper hand over others. It will eventually help to understand and improve the patient’s experience. Sentiment analysis helps hospitals to realize clearly whether patients are pleased with the service offered to them or whether there is scope for improvement. Sentiment analysis was also found to be useful in locating mentions of adverse drug reactions; for that purpose, data on 81 drugs were collected from various sources such as the DailyStrength forum and Twitter. The results indicate that sentiment analysis helps to slightly boost the performance of detecting adverse drug reaction mentions in both tweets and health-related posts in the forum.
The results in Tighe et al. (2015) suggest that pain-related tweets exchanged within communities have special characteristics that reflect unique content and communication patterns among tweeters.
12.5 Patient sentiment analysis tools and methods
12.5.1 What is patient sentiment analysis?
Patient emotion analysis is the process of collecting, processing, and interpreting the huge amount of patient data generated as free-text comments and measurable ratings. It is utilized by healthcare providers to evaluate and examine positive or negative opinions or emotions in order to monitor and improve their routine practices. Sentiment analysis is the most practical way to understand your patients’ emotions at large. Emotions are instinctive or intuitive feelings that are hard to express in words, whereas sentiment, being a thought influenced by emotions, can be expressed through written or spoken words.
12.5.2 How sentiment analysis helps in understanding
12.5.2.1 Patients’ emotions
Sentiment analysis uses AI and machine learning techniques. It performs a comparative study of written comments and ratings produced by the same patients to identify and categorize patients’ opinions and emotions. The AI technology is able to extract human emotions from written patient feedback by identifying words and patterns generated over time. Today, patients report their experiences on different online platforms such as review websites, social sites, patient portals, and so on. But these are largely unstructured and non-standardized free-text information captured in a non-systematic way, which makes them hard to use for sentiment analysis. The introduction of modern sentiment analysis tools like RepuGen has changed this scenario. These tools are AI-enabled, use machine learning techniques to process bulk data, and capture feedback from your patients by allowing them to openly share thoughts and feelings following a visit through a quick survey. All of this makes it easy to understand your patients’ experiences and emotions and to utilize the information more effectively to improve your quality of care. A RepuGen survey conducted on over 29,000 patients over a 6-month period (June 2017–December 2017) revealed that happiness, trust, anticipation, anger, distrust, and sadness were the typical emotions felt by patients right after their visit to
their providers. It also gave insights into the intensity of each emotion marked as high, medium, or low. This nuanced understanding of the patient’s emotions with their intensity allows for a customized response to patient concerns. Providers can also benefit by launching a marketing campaign to convert the happy patients with high intensity of their emotions into advocates for the center.
12.5.2.2 Sentiment analysis for an improved patient experience
Healthcare providers can use sentiment analysis as a strategy to improve the patient experience by looking at specific emotions. For example, in the RepuGen study (discussed earlier) on urgent care patients, the feeling of “anticipation” increased by 8.31% when compared to the previous version of the study (Jan 2017–June 2017). This is a sign that patients are expecting more from their providers. Providers can use this information to launch an internal “behavior change” campaign for the staff, focused on training them to provide improved customer care that meets the different dimensions of patients’ expectations.
12.5.2.3 Sentiment analysis to improve healthcare
Physicians, healthcare facilities, and health centers can use sentiment analysis to improve clinical quality. Proper and careful analysis leads to correct identification of patients’ feelings and behavior, so that a health organization can plan according to the emotions and feelings that matter most to patients. Since patients with a critical disease may become depressed, such signs can be extracted from their tweets and a proper counseling session can be arranged as an improvement step.
12.5.2.4 Using sentiment analysis data for strategic advantage
Providers who use sentiment analysis data are always at a strategic advantage over their competitors. As sentiment analysis data give a deep insight into patients’ emotions and the intensity of their feelings, providers can develop very specific and highly targeted marketing campaigns that can benefit their healthcare business by acquiring and retaining patients.
1. Retaining your patients – By using in-depth sentiment analysis data, providers can easily target their existing patients with interesting offers and information. You can join your patients in social media chatter to impart health information, or send them newsletters or links to your new blog posts through e-mail.
This will engage your existing patients in an indirect conversation with your brand, which will improve patient retention by building patient loyalty.
2. Acquiring new patients – By utilizing the emotional intensity analysis breakdown, you can identify your high-intensity happy or trusting patients and develop new marketing programs focused on acquiring new patients. One simple example could be distributing referral forms or links through e-mail to your existing patients, so that they can acquaint friends and relatives with your service. This will help you gain new patients on a regular basis.
Results on these two fronts will keep your healthcare business up and running. Sentiment analysis does not only help to track your business; you can also monitor industry trends and your competitors, and use this to build a better brand strategy and improve business intelligence. The right utilization of opinion analysis and big data analytics allows you to personalize and improve patient experiences, which will improve loyalty and revenues for your facility.
12.6 Tools and methods
There are numerous applications available for sentiment analysis, and a significant amount of research has been done to assess and evaluate various methods of opinion mining and sentiment analysis. Marketing is the leading field in terms of uses of sentiment analysis; other fields where it is used significantly are hospitality, tourism, finance, education, and the healthcare domain, and sentiment analysis applications are also significant in the political and government sectors. Table 12.2 shows the different works studied, their descriptions, and the various data sources, methods, and approaches used. It can be noted that sentiment analysis can help to improve patients’ health and emotions, especially in critical diseases like cancer. A patient’s interaction with people on social media also plays a key role in his or her health. Different reviews posted by people, the content of the posts, and the sources of such posts are depicted in Table 12.3. There are several tools that may be helpful for analyzing people’s reactions on social media. Tools like RepuGen help you to find out what patients think and feel when they visit your clinic. The software can analyze patients’ responses across multiple geographical regions and service providers. It also uses well-known machine learning algorithms, like support vector machines and decision trees, to analyze the data and condense them into different kinds of reports and charts for each place and source. From the produced graphical reports, trends and the strength of the sentiment can be measured. This gives a clear picture of people’s sentiment, so that improvements in patient care and service delivery can be properly planned.
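The kind of supervised analysis mentioned above (support vector machines and decision trees over patient feedback) can be sketched with scikit-learn as follows. The tiny inline dataset and the TF-IDF feature choice are purely illustrative assumptions; they are not taken from RepuGen or from any study cited in this chapter.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC
from sklearn.tree import DecisionTreeClassifier

# Illustrative labeled patient comments (in practice: thousands of survey or social media texts).
comments = [
    "The staff were friendly and the doctor listened to me",
    "I waited three hours and nobody explained anything",
    "Very clean clinic and quick service",
    "The billing process was confusing and frustrating",
]
labels = ["positive", "negative", "positive", "negative"]

# Two alternative classifiers over TF-IDF features: an SVM and a decision tree.
svm_model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LinearSVC())
tree_model = make_pipeline(TfidfVectorizer(), DecisionTreeClassifier(max_depth=5))

svm_model.fit(comments, labels)
tree_model.fit(comments, labels)

print(svm_model.predict(["The nurse was rude and unhelpful"]))
print(tree_model.predict(["Excellent care, highly recommended"]))
```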
Table 12.2: Work/description/methods (approaches).
Work studied | Description | Data source/methods/approaches
Portier et al. | The primary goal of this study is to understand sentiment and detect pessimistic emotional states, as well as ups and downs in a person’s behavior and mood when interacting within a community. It is found that giving support to patients is good for their health. | Cancer online community posts + sentiment analysis techniques
Abdul-Mageed et al. | Subjectivity and sentiment analysis of social media is performed. | Different types of Arabic tweets, Web forum sentences, and chat turns are used.
Ramon et al. | The SentiHealth-Cancer tool is used to find out the emotional state of patients in online cancer communities located in Brazil. Language: Portuguese. | SentiHealth also helps to work in different languages.
Table 12.3: Patient posts example on social media.
Post sequence | Post example | Identified post type | Post source
I | “Everyone is very friendly and the waiting time is brilliant – my consultant is lovely and I always feel at ease.” Great feedback for Surgical Outpatients, SJUH! | Patient experience | Twitter
II | Over-eaten AGAIN just before bed. Stuffed. Good chance I will choke on my own vomit during sleep. I blame #Olanzapine #timetochange #bipolar | Drug/product accessibility issue | –
III | Big week ahead? Start your mornings with a bout of HIIT. New Australian research shows it boosts your cognitive powers & decision-making ability later in the day | Event | –
12.7 Word Cloud
Word Clouds are visual representations of words that give greater prominence to words that appear more frequently or have more impact on the topic of discussion. If a word is displayed larger and in bold, it is the most important and most discussed one among the group of people. Word Clouds can also represent word frequency, giving greater prominence to words that appear more frequently in the source text.
Several data preprocessing steps like stop-word removal, tokenization, lemmatization, and stemming are performed in order to obtain better results and to reduce outliers. This visualization can really assist evaluators with exploratory textual analysis by identifying words that frequently appear in a set of conversations, documents, or other texts. It also helps to communicate the most relevant points or themes in the reporting stage. Figure 12.1 demonstrates one such example of an experiment performed on a healthcare- and medicine-related article. Mainly words like healthcare, information, medical, insurance, diseases, system, and informatics are highlighted.
Figure 12.1: Healthcare Word Cloud concept.
Moreover, this type of visualization can help presenters to quickly collect data from their audience, highlight the most common answers, and present the data in a way that everyone can understand. Preprocessing steps are applied as shown in Figure 12.2 to generate the Word Cloud from the corpus. Preprocessing takes care of words, punctuation, whitespace, regular expressions, stop words, and the most frequent tokens. It also performs transformations such as converting to lowercase, removing accents, parsing HTML, and removing URLs. Figure 12.3 contains a Word Cloud generated from patients’ community conversations on social media. Words like health, patient, mental, diseases, healthcare, relation, insurance, treatment, and hope draw special attention, as shown in Figure 12.3.
Figure 12.2: Word Cloud generation after preprocessing text data (flow: import documents → corpus → preprocess text → Word Cloud).
Figure 12.3: Mental health concept in word tag cloud.
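A minimal sketch of this preprocessing and Word Cloud generation in Python is given below, using the `wordcloud` package together with basic text cleaning. The sample corpus and the extra stop words are illustrative assumptions rather than the exact pipeline used for Figures 12.1–12.3.

```python
import re
from wordcloud import WordCloud, STOPWORDS
import matplotlib.pyplot as plt

corpus = """Patients discussed mental health, insurance coverage, treatment options,
healthcare access and hope for recovery on the community forum."""

# Basic preprocessing: lowercase, strip URLs and punctuation, remove stop words.
text = corpus.lower()
text = re.sub(r"https?://\S+", " ", text)
text = re.sub(r"[^a-z\s]", " ", text)
stopwords = set(STOPWORDS) | {"discussed", "options"}

cloud = WordCloud(width=800, height=400, background_color="white",
                  stopwords=stopwords).generate(text)

plt.imshow(cloud, interpolation="bilinear")
plt.axis("off")
plt.show()
```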
12.8 Conclusion
Sentiment analysis is still an emerging concept in the healthcare industry, although it has proved its worth and utility in other industries. Using sentiment analysis, the overall experience of a patient and his or her state of mind can be improved to a great extent. Proper sentiment analysis using Word Clouds helps to highlight salient hidden aspects of a patient’s health and may also generate more revenue in the healthcare industry. Providers will start recognizing the importance of understanding the true emotional state of patients, and this will help them to provide a better and desired quality of care with better customer service. This will identify the sentiment of patients and employees as well. Eventually, as long as your employees are motivated and feel satisfied in their working environment, their attention toward patient care will certainly improve. With this in mind, the importance and utility of sentiment analysis
tools and methodologies will be strongly felt, and they will soon become a staple for hospitals and healthcare providers.
12.9 Future work
Sentiment analysis can help to design fruitful strategies. As future work, the heterogeneous characteristics of big data and the analysis of sparse, uncertain, and incomplete data also require careful consideration for proper decision making. Building an appropriate domain-specific lexicon would allow users to improve the performance of sentiment-based features for polarity classification. Careful consideration of features like emoticons can be valuable in the identification of emotions and sentiment in patients. Another interesting line of future work will be the automatic identification of drugs and treatments in patient conversations, and the automatic detection of adverse effects for a drug mentioned in posts written on social media.
Acknowledgments: The authors would like to thank the Head of Department, Principal, and Dean of Faculty of Technology and Engineering, Charotar University of Science and Technology, Charusat, for their continuous suggestions, encouragement, guidance, and support for this work. Special thanks to the management for their moral support and continuous encouragement.
References
Abdul-Mageed M., Diab M., and Kübler S. SAMAR: Subjectivity and sentiment analysis for Arabic social media. Computer Speech & Language, (2014), 28, 20–37. doi: 10.1016/j.csl.2013.03.001.
Just E. How to use text analytics in healthcare to improve outcomes – why you need more than NLP. Health Catalyst, 2017. [Online]. Available: https://www.healthcatalyst.com/how-to-usetext-analytics-in-healthcare-to-improve-outcomes. [Accessed: 18-Feb-2018].
Grissette H., Nfaoui E.H., and Bahir A. Sentiment analysis tool for pharmaceutical industry & healthcare. Transactions on Machine Learning and Artificial Intelligence, (2017), 5. doi: 10.14738/tmlai.54.3339.
Gupta V. and Lehal G.S. A survey of text mining techniques and applications. International Journal of Computer Sciences and Engineering, (2010), 02(06).
Kelso L. “DC/OS helps athenahealth build flexible and powerful cloud-based services,” Mesosphere, 2018. [Online]. Available: https://mesosphere.com/blog/dcos-athenahealth/. [Accessed: 24-Feb-2018].
Khoso M. (2016). How Much Data is Produced Every Day? [Online]. Available: http://www.northeastern.edu/levelblog/2016/05/13/how-much-data-produced-every-day/
Portier K., Greer G.E., Rokach L., Ofek N., Wang Y., Biyani P., Yu M., Banerjee S., Zhao K., Mitra P., and Yen J. Understanding topics and sentiment in an online cancer survivor community. JNCI Monographs, (2013 December), 2013(47), 195–198. https://doi.org/10.1093/jncimonographs/lgt025
Raghupathi W. and Raghupathi V. Big data analytics in healthcare: promise and potential. Health Information Science and Systems, (2014), 2, 3. https://doi.org/10.1186/2047-2501-2-3
Rodrigues R.G., das Dores R.M., Camilo-Junior C.G., and Rosa T.C. SentiHealth-Cancer: A sentiment analysis tool to help detecting mood of patients in online social networks. International Journal of Medical Informatics, (2016), 85(1), 80–95. ISSN 1386-5056. https://doi.org/10.1016/j.ijmedinf.2015.09.007
Shayaa S. et al. Sentiment analysis of big data: Methods, applications, and open challenges. IEEE Access, (2018), 6, 37807–37827. doi: 10.1109/ACCESS.2018.2851311
Tighe P.J., Goldsmith R.C., Gravenstein M., Bernard H.R., and Fillingim R.B. The painful tweet: Text, sentiment, and community structure analyses of tweets pertaining to pain. Journal of Medical Internet Research, (2015) Apr 02, 17(4), e84. doi: 10.2196/jmir.3769.
Yasin A., Ben-Asher Y., and Mendelson A. “Deep-dive analysis of the data analytics workload in CloudSuite,” in Proc. IEEE Int. Symp. Workload Characterization (IISWC), Oct. 2014, pp. 202–211.
Diksha Thakur, Nitin Mittal, Simrandeep Singh, Rajshree Srivastva
13 A fuzzy entropy-based multilevel image thresholding using neural network optimization algorithm
Abstract: In the analysis and preprocessing of images, image segmentation is a very important step. Standard thresholding techniques are efficient for bilevel thresholds owing to their simplicity, robustness, reduced convergence time, and accuracy. However, when extended to multiple levels, considerable computational expenditure is needed and efficiency breaks down, since an exhaustive search is used to determine the optimum thresholds; this has motivated the use of evolutionary algorithms and swarm intelligence (SI) to obtain the optimum thresholds. The primary objective of object segmentation is to distinguish the foreground from the background. By optimizing Shannon or fuzzy entropy with the neural network optimization algorithm, this chapter presents a multilevel image thresholding approach for object segmentation. The suggested algorithm is evaluated on standard image sets against the Firefly algorithm (FA), Differential Evolution (DE), and particle swarm optimization, and the results are compared for the Shannon and fuzzy entropy approaches. The suggested approach shows better performance than state-of-the-art approaches in terms of the objective function, structural similarity index, peak signal-to-noise ratio (PSNR), and standard deviation.
Keywords: fuzzy entropy, particle swarm optimization, image segmentation, Shannon entropy, neural network algorithm (NNA)
13.1 Introduction
Image segmentation is an active area of computer vision research; it produces a set of segments covering the entire image or a collection of contours of objects. Each pixel in a region is similar to the others with respect to certain characteristics, for example, color, intensity, or texture. The goal of segmentation is to simplify or alter the representation of an image into something that is more meaningful and easier to analyze. Existing image segmentation techniques can be summarized into four primary categories: region-based methods, boundary-based methods, cluster-based methods (Chiranjeevi and Jena, 2016; Kapur et al., 1985a; Karri and Jena, 2016), and threshold-based methods.
Diksha Thakur, Nitin Mittal, Simrandeep Singh, Department of Electronics and Communication Engineering, Chandigarh University, Mohali, India
Rajshree Srivastva, Department of Computer Science Engineering, DIT University, Dehradun, India
https://doi.org/10.1515/9783110648195-013
Thresholding is one of the quickest, simplest, and most efficient methods of object segmentation; it can discriminate objects from the background by setting pixel-level thresholds. Automatic separation of objects from their context is one of the most dynamic and fascinating areas of image processing and pattern recognition (Aja-Fernandez et al., 2015). Thresholding can not only delineate well-defined areas with minimal overlap and good aggregation efficiency but also provide an initial prediction or preprocessing step for more complex segmentation methods (Shi and Pan, 2016). Threshold segmentation is considered a primary link in image analysis and image perception and is widely used in many fields such as clinical analysis (Masood et al., 2015), image classification (Torres-Sanchez et al., 2015), object recognition (Sonka et al., 1993), and image duplication detection (Li et al., 2015). Threshold segmentation is the most commonly used technique for object segmentation (Oliva et al., 2014). Thresholding approaches can be divided into parametric and nonparametric classes (Oliva et al., 2014; Akay, 2013; Osuna-Enciso et al., 2013). The parametric approach is usually time-consuming, whereas the nonparametric approach is more accurate; consequently, more attention is paid to determining the appropriate thresholds by optimizing certain criteria such as between-class variance, entropy, and error rate (Oliva et al., 2014; Akay, 2013; Osuna-Enciso et al., 2013; Kurban et al., 2014; Sarkar et al., 2015). Over the years, the related literature has recorded countless thresholding techniques (Dirami et al., 2013). In 1979, Otsu's method (Otsu, 1979a) introduced a thresholding technique that maximizes the between-class variance to obtain optimum threshold values. Tsai (1985) suggested a thresholding scheme based on the moment-preserving rule for gray-level objects. Kapur et al. (1985) used histogram entropy to obtain optimal thresholds, known as Kapur's entropy method, and this technique is often used for object threshold segmentation problems. The minimum cross-entropy approach calculates the optimum thresholds by minimizing the cross-entropy between the original image and the segmented image (Li and Lee, 1993). Such methods can easily be extended to multilevel threshold segmentation. However, when expanded to multilevel thresholds, the computational time increases quickly as they search for the threshold values that optimize the objective function. Optimization refers to discovering the best solution to a problem subject to certain constraint conditions (Beekman et al., 2008). The search for the optimal threshold values of a given image is considered a constrained optimization problem. SI algorithms with distinct objective functions are frequently used to search for ideal threshold values in multilevel thresholding problems, in order to address the computational inefficiency of standard thresholding methods. For multilevel models, algorithms such as the genetic algorithm (GA) (Hammouche et al., 2008; Feng et al., 2005); particle swarm optimization (PSO) (Maitra et al., 2008; Gao, 2010; Ghamisi et al., 2014; Horng, 2011); the Artificial Bee Colony algorithm (ABC) (Ma et al., 2011; Zhang and Wu, 2011; Cuevas et al., 2012; Cuevas et al., 2010; Sarkar et al., 2011a); DE (Sarkar and Das, 2013; Brajevic et al., 2012; Agrawal et al., 2013a; Horng, 2014); Cuckoo Search algorithms (Zhiwei et al.,
2015; Zhou et al., 2015); and Glowworm swarm optimization algorithms (Beni and Wang, 1993; Riseman et al., 1977), along with their enhanced variants, are used. Image thresholding is one of the most important tools for the assessment of an image: it separates a target from its context at the pixel level. In the field of pattern recognition and image processing, the automatic separation of objects from context remains one of the most demanding and fascinating areas. Extensive surveys of such methods can be found in Fu et al. (1981); Haralick et al. (1985); Borisenko et al. (1987); Sahoo et al. (1988); Pal et al. (1993); Otsu (1979); Kapur et al. (1985b). The scientific literature provides different methods of object segmentation, including gray-level thresholding, dynamic pixel recognition, techniques based on neural networks, edge detection, and fuzzy segmentation. In this chapter, neural network optimization (NNO) is applied to the existing fuzzy entropy research problem to achieve a multilevel object segmentation approach (Storn and Price, 1997; Das et al., 2011). NNO is probably one of today's most powerful real-parameter optimizers (Sarkar et al., 2011b; 2012). It has been shown that using NNO for multilevel threshold-based image segmentation can outperform GA and PSO (Karri et al., 2014; Sezgin and Sankur, 2004b).
13.2 Thresholding techniques
Of all the current methods used to segment different kinds of images, thresholding has been found to be the most common. There are two distinct types of thresholding: bilevel thresholding and multilevel thresholding. If the objects are clearly separated from the image background by a single threshold value, bilevel thresholding is used, while multilevel thresholds are used to divide an image into several separate parts. Multilevel thresholding methods have played a significant part in image analysis over the years, and many scientists and academics have worked on them. Image thresholding extracts items from the background of a scene, which helps to analyze and interpret the image. Selecting a preeminent gray-level threshold is a difficult task in image processing; threshold selection is not simple because of the multimodality of the histograms of many images. It is an NP-hard problem, and researchers have therefore suggested many methods for finding the preeminent gray-level threshold (Sezgin and Sankur, 2004).
13.2.1 Bilevel thresholding
A gray image has intensity values between 0 and N − 1, where N is the highest possible intensity value. Bilevel image thresholding detects an intensity value separating
the foreground object from the background of the image. Several techniques have been implemented for bilevel threshold estimation, such as pixel-value variance estimation and entropy measures (Arora et al., 2008; Gao et al., 2010b):

\[ k_0 = \{\, F(x, y) \in X \mid 0 \le F(x, y) \le k_1 - 1 \,\} \tag{13.1} \]
\[ k_1 = \{\, F(x, y) \in X \mid k_1 \le F(x, y) \le N - 1 \,\} \tag{13.2} \]

where X is the image and F(x, y) is the gray value of the pixel at coordinates (x, y).
13.2.2 Multilevel thresholding
Bilevel image thresholding is ineffective for images in which the background and the subjects of interest cannot be differentiated by a single threshold; we are therefore compelled to move toward a multilevel image thresholding scheme (Otsu, 1979a; Tsai, 1985b). Multilevel thresholding detects more than one gray level to differentiate the objects of interest in the image. It can be mathematically articulated as follows:

\[ k_0 = \{\, F(x, y) \in X \mid 0 \le F(x, y) \le k_1 - 1 \,\} \tag{13.3} \]
\[ k_1 = \{\, F(x, y) \in X \mid k_1 \le F(x, y) \le k_2 - 1 \,\} \tag{13.4} \]
\[ k_p = \{\, F(x, y) \in X \mid k_p \le F(x, y) \le k_{p+1} - 1 \,\} \tag{13.5} \]
\[ k_q = \{\, F(x, y) \in X \mid k_q \le F(x, y) \le N - 1 \,\} \tag{13.6} \]

where X is the image, F(x, y) is the gray value of the pixel at coordinates (x, y), and p = 1, 2, ..., M.
13.2.2.1 Otsu's method
Otsu's interclass (between-class) variance method is a nonparametric segmentation technique that selects thresholds by maximizing the variance between the pixel classes (Tsai, 1985). Let μ_0, μ_1, ..., μ_m denote the mean values of the pixels in groups 0, 1, ..., m for the multilevel segmentation problem:

\[ \sigma_0^2 = \omega_0 (\mu_0 - \mu_T)^2, \qquad \sigma_1^2 = \omega_1 (\mu_1 - \mu_T)^2 \tag{13.7} \]
\[ \sigma_j^2 = \omega_j (\mu_j - \mu_T)^2, \qquad \sigma_m^2 = \omega_m (\mu_m - \mu_T)^2 \tag{13.8} \]
\[ \mu_0 = \sum_{i=0}^{t_1 - 1} i\,p_i / \omega_0, \qquad \mu_1 = \sum_{i=t_1}^{t_2 - 1} i\,p_i / \omega_1 \tag{13.9} \]
\[ \mu_j = \sum_{i=t_j}^{t_{j+1} - 1} i\,p_i / \omega_j, \qquad \mu_m = \sum_{i=t_m}^{N - 1} i\,p_i / \omega_m \tag{13.10} \]

Therefore, the segmentation technique aims at maximizing f(t), the sum of the between-class variance terms, to obtain the optimal threshold values in eq. (13.11):

\[ t^{*} = \arg\max_{t} f(t), \qquad \text{where } f(t) = \sum_{i=0}^{m} \sigma_i^2 \tag{13.11} \]
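A small NumPy sketch of the between-class variance objective in eqs. (13.7)–(13.11) is given below for an arbitrary set of thresholds. It is an illustrative implementation rather than the chapter's MATLAB code, and an exhaustive or metaheuristic search over `thresholds` would be added on top of it.

```python
import numpy as np

def otsu_objective(hist, thresholds):
    """Between-class variance f(t) for thresholds t1 < t2 < ... < tm over a gray-level histogram."""
    p = hist / hist.sum()                       # gray-level probabilities p_i
    levels = np.arange(len(p))
    mu_T = np.sum(levels * p)                   # global mean
    bounds = [0] + list(thresholds) + [len(p)]  # classes [0, t1), [t1, t2), ..., [tm, N)
    f = 0.0
    for lo, hi in zip(bounds[:-1], bounds[1:]):
        w = p[lo:hi].sum()                      # class probability omega_k
        if w > 0:
            mu = np.sum(levels[lo:hi] * p[lo:hi]) / w   # class mean mu_k
            f += w * (mu - mu_T) ** 2           # omega_k * (mu_k - mu_T)^2
    return f

# Usage with an 8-bit image histogram:
# hist, _ = np.histogram(image, bins=256, range=(0, 256))
# print(otsu_objective(hist, thresholds=[85, 170]))
```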
13.2.3 Kapur's entropy method
Kapur's entropy method for object segmentation is based on the assumption that an image consists of a foreground and a background region, each with its own intensity distribution (Agrawal et al., 2013a). The entropy of each region is calculated individually, and an optimal threshold value is then chosen to maximize the sum of the entropies. The same idea readily extends to multilevel image thresholding and can be formulated mathematically as:

\[ H_0 = -\sum_{i=0}^{t_1 - 1} \frac{p_i}{\omega_0} \log_2 \frac{p_i}{\omega_0}, \qquad H_1 = -\sum_{i=t_1}^{t_2 - 1} \frac{p_i}{\omega_1} \log_2 \frac{p_i}{\omega_1} \tag{13.12} \]
\[ H_j = -\sum_{i=t_j}^{t_{j+1} - 1} \frac{p_i}{\omega_j} \log_2 \frac{p_i}{\omega_j}, \qquad H_m = -\sum_{i=t_m}^{N - 1} \frac{p_i}{\omega_m} \log_2 \frac{p_i}{\omega_m} \tag{13.13} \]
\[ \omega_0 = \sum_{i=0}^{t_1 - 1} p_i, \qquad \omega_1 = \sum_{i=t_1}^{t_2 - 1} p_i \tag{13.14} \]
\[ \omega_j = \sum_{i=t_j}^{t_{j+1} - 1} p_i, \qquad \omega_m = \sum_{i=t_m}^{N - 1} p_i \tag{13.15} \]

where H_0, H_1, ..., H_m are the entropy values of the m + 1 separate regions, p_i is the probability of gray level i with i between 0 and 255, and N is the peak gray intensity level.
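Analogously, the Kapur entropy objective of eqs. (13.12)–(13.15) can be sketched as follows; the function is an illustrative NumPy implementation that a metaheuristic would maximize over the threshold vector.

```python
import numpy as np

def kapur_objective(hist, thresholds):
    """Sum of class entropies H_0 + H_1 + ... + H_m for a given set of thresholds."""
    p = hist / hist.sum()
    bounds = [0] + list(thresholds) + [len(p)]
    total = 0.0
    for lo, hi in zip(bounds[:-1], bounds[1:]):
        w = p[lo:hi].sum()                    # omega_k
        if w > 0:
            q = p[lo:hi][p[lo:hi] > 0] / w    # normalized in-class probabilities
            total += -np.sum(q * np.log2(q))  # H_k
    return total

# hist, _ = np.histogram(image, bins=256, range=(0, 256))
# print(kapur_objective(hist, thresholds=[64, 128, 192]))
```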
13.3 Optimum thresholding methods problem
Image thresholding is a method of transforming a gray image into a black-and-white image using peak thresholds. Thresholding can be local or global, but in computational terms these methods are costly, so optimization techniques are required to optimize the objective function within local and global computational time constraints. By optimizing the objective function, the optimization techniques seek the optimum thresholds so that the segmented image clearly distinguishes the background and the foreground. Consider a gray-level image with G gray levels {0, 1, 2, ..., G − 1}. The probability of gray level i is p_i = A_i / P (0 ≤ i ≤ G − 1), where A_i denotes the number of pixels with gray level i and P denotes the total number of pixels in the image, P = \sum_{i=0}^{G-1} A(i).
13.3.1 Fuzzy entropy
A measure of uncertainty is taken as a measure of information. Accordingly, measures of fuzzy information are called fuzzy entropy (Herencia and Lamata, 1997; Rudas and Kaynak, 1998). Fuzzy entropy is used to evaluate the amount of fuzzy information gathered from a fuzzy system or fuzzy set. It should be noted that the idea of fuzzy entropy is quite different from Shannon's classical entropy, since its representation does not require a probabilistic notion (McCulloch and Pitts, 1943; Garrett, 1994; Hasanipanah et al., 2016). This is because fuzzy entropy captures the uncertainties of vagueness and ambiguity, while Shannon entropy captures random (probabilistic) uncertainty.
13.3.2 Multilevel fuzzy entropy
In a classical set A, an object either belongs or does not belong to the set. According to fuzzy set theory, which is a generalization of the classical set, an object may belong to a set A partially. A fuzzy set can be described as

\[ A = \{\, (x, \mu_A(x)) \mid x \in X \,\} \tag{13.16} \]

where 0 ≤ μ_A(x) ≤ 1 and μ_A(x) is the membership function that measures the closeness of x to A. For simplicity, this chapter uses the trapezoidal membership function to estimate the memberships μ_1, μ_2, ..., μ_n of the n segmented regions by means of 2(n − 1) unknown parameters, namely a_1, c_1, ..., a_{n−1}, c_{n−1}, where

\[ p_1 = \sum_{i=0}^{L-1} p_i^{*}\, \mu_1(i), \quad p_2 = \sum_{i=0}^{L-1} p_i^{*}\, \mu_2(i), \quad \ldots, \quad p_n = \sum_{i=0}^{L-1} p_i^{*}\, \mu_n(i) \tag{13.17} \]

By maximizing the total entropy, the optimal values of the parameters can be acquired:

\[ \varphi(a_1, c_1, \ldots, a_{n-1}, c_{n-1}) = \arg\max \big( H_1(t) + H_2(t) + \cdots + H_n(t) \big) \tag{13.18} \]

A global optimization technique is needed to maximize eq. (13.18). To reduce the time complexity of the suggested method effectively, the (n − 1) threshold values are obtained from the fuzzy parameters as follows:

\[ t_1 = \frac{a_1 + c_1}{2}, \quad t_2 = \frac{a_2 + c_2}{2}, \quad \ldots, \quad t_{n-1} = \frac{a_{n-1} + c_{n-1}}{2} \tag{13.19} \]
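To make the fuzzy-entropy formulation concrete, the sketch below builds trapezoidal membership functions from the parameter pairs (a_k, c_k), computes the fuzzy entropy of each region, and recovers the thresholds as in eq. (13.19). It is an illustrative reading of eqs. (13.16)–(13.19), assuming ordered, non-overlapping ramps and a natural-logarithm entropy, and is not the authors' exact formulation.

```python
import numpy as np

def memberships(params, L=256):
    """Trapezoidal memberships mu_1..mu_n over gray levels 0..L-1.
    params = [a1, c1, a2, c2, ..., a_{n-1}, c_{n-1}], assumed sorted in ascending order."""
    a, c = params[0::2], params[1::2]
    g = np.arange(L, dtype=float)
    n = len(a) + 1
    # down[k] is the falling ramp between a_k and c_k (1 before a_k, 0 after c_k).
    down = [np.clip((c[k] - g) / max(c[k] - a[k], 1e-12), 0, 1) for k in range(n - 1)]
    mu = np.zeros((n, L))
    mu[0] = down[0]
    for k in range(1, n - 1):
        mu[k] = np.minimum(1 - down[k - 1], down[k])
    mu[n - 1] = 1 - down[n - 2]
    return mu

def fuzzy_entropy(hist, params):
    """Total fuzzy entropy H_1 + ... + H_n, to be maximized by the optimizer."""
    p = hist / hist.sum()
    mu = memberships(params, L=len(p))
    total = 0.0
    for k in range(mu.shape[0]):
        pk = np.sum(p * mu[k])            # region probability as in eq. (13.17)
        if pk > 0:
            q = p * mu[k] / pk
            q = q[q > 0]
            total += -np.sum(q * np.log(q))
    return total

def thresholds_from(params):
    """t_k = (a_k + c_k) / 2 as in eq. (13.19)."""
    return [(params[2 * k] + params[2 * k + 1]) / 2 for k in range(len(params) // 2)]
```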
13.4 Neural network algorithm
Artificial neural networks (ANNs) map input data to target data through an iterative update of the weights w_ij of the ANN, so as to reduce the mean-square error between the predicted output and the target output. The neural network algorithm (NNA) uses the concepts and the structure of ANNs to generate new solutions, where the best searching agent in the population is considered as the target and the procedures of the algorithm try to make all searching agents follow that target solution (Sadollah et al., 2018). The NNA is a population-based algorithm that is initialized with randomly generated solutions within the search space. Each individual or searching agent in the population is called a "pattern solution"; each pattern solution is a 1 × D vector representing the input data of the NNA: Pattern Solution_i = [x_{i,1}, x_{i,2}, x_{i,3}, ..., x_{i,D}]. To begin the NNA, a pattern solution matrix X of size N_pop × D is generated randomly between the lower and upper search boundaries. The pattern solution population X is given by

\[
X = \begin{bmatrix} X_1 \\ X_2 \\ \vdots \\ X_{N_{pop}} \end{bmatrix}
  = \begin{bmatrix}
      x_{1,1} & x_{1,2} & \cdots & x_{1,D} \\
      x_{2,1} & x_{2,2} & \cdots & x_{2,D} \\
      \vdots  & \vdots  & \ddots & \vdots  \\
      x_{N_{pop},1} & x_{N_{pop},2} & \cdots & x_{N_{pop},D}
    \end{bmatrix}
\]
where

\[ x_{i,j} = LB_j + \mathrm{rand} \cdot (UB_j - LB_j), \quad i = 1, 2, \ldots, N_{pop}, \; j = 1, 2, \ldots, D \tag{13.20} \]

and LB and UB are 1 × D vectors representing the lower and upper bounds of the problem under consideration. As in ANNs, in the NNA each pattern solution X_i has a corresponding weight vector W_i, where

\[ W_i = \big[ w_{i,1}, w_{i,2}, w_{i,3}, \ldots, w_{i,N_{pop}} \big]^{T} \]

The weight array W is given by

\[
W = \big[ W_1, W_2, \ldots, W_i, \ldots, W_{N_{pop}} \big]
  = \begin{bmatrix}
      w_{1,1} & \cdots & w_{i,1} & \cdots & w_{N_{pop},1} \\
      w_{1,2} & \cdots & w_{i,2} & \cdots & w_{N_{pop},2} \\
      \vdots  &        & \vdots  &        & \vdots        \\
      w_{1,N_{pop}} & \cdots & w_{i,N_{pop}} & \cdots & w_{N_{pop},N_{pop}}
    \end{bmatrix} \tag{13.21}
\]

where W is a square N_pop × N_pop matrix of random numbers distributed uniformly between 0 and 1. The weight of a pattern solution is involved in the generation of a new candidate solution. In the NNA, the initial weights are random numbers, and their values are updated as the iteration number increases, according to the estimated network error. The weight values are constrained so that the summation of the weights does not exceed one for any pattern solution, defined mathematically as follows:
\[ w_{ij} \sim U(0, 1), \quad i, j = 1, 2, 3, \ldots, N_{pop} \tag{13.22} \]
\[ \sum_{j=1}^{N_{pop}} w_{ij} = 1, \quad i = 1, 2, 3, \ldots, N_{pop} \tag{13.23} \]
Such weight restrictions are used to control the bias of movement when creating new pattern solutions. The fitness C_i of each pattern solution is calculated by evaluating the objective function f_obj at the corresponding pattern solution X_i:

\[ C_i = f_{obj}(X_i) = f_{obj}(x_{i,1}, x_{i,2}, x_{i,3}, \ldots, x_{i,D}), \quad i = 1, 2, 3, \ldots, N_{pop} \tag{13.24} \]
After the fitness calculation for all pattern solutions, the pattern solution with the best fitness is called the target solution, with target position X^{Target}, target fitness F^{Target}, and target weight W^{Target}. The NNA thus models an ANN with N_pop inputs, each of dimension D, and only one target output X^{Target} (Sadollah et al., 2018).
The new pattern solutions are generated using the weighted summation technique of ANNs as follows:

\[ X_j^{New}(k + 1) = \sum_{i=1}^{N_{pop}} w_{ij}(k)\, X_i(k), \quad j = 1, 2, 3, \ldots, N_{pop} \tag{13.25} \]
\[ X_i(k + 1) = X_i(k) + X_i^{New}(k + 1), \quad i = 1, 2, 3, \ldots, N_{pop} \tag{13.26} \]
where k is the iteration index. Once the previous population has created the new pattern solutions, the weight matrix is updated using the following equation:

\[ W_i^{Updated}(k + 1) = W_i(k) + 2 \cdot \mathrm{rand} \cdot \big( W^{Target}(k) - W_i(k) \big), \quad i = 1, 2, 3, \ldots, N_{pop} \tag{13.27} \]

where the weight constraints (13.22) and (13.23) must remain satisfied during the optimization process. In the NNA, a bias operator is used to help explore the search space. The bias operator adjusts some of the pattern solutions produced in the new population X_i(k + 1) as well as the updated weight matrix W_i^{Updated}(k + 1). By moving some individuals of the population to places in the search space that the population has not yet reached, the bias operator prevents the algorithm from premature convergence. A modification factor β_NNA determines the percentage of the pattern solutions to be modified by the bias operator. The initial value of β_NNA is set to 1, which means that all individuals in the population are biased. The value of β_NNA is adaptively reduced at each iteration using any feasible reduction strategy, such as:

\[ \beta_{NNA}(k + 1) = 1 - \frac{k}{\text{Max\_iteration}}, \quad k = 1, 2, 3, \ldots, \text{Max\_iteration} \tag{13.28} \]
(13:29)
where α_NNA is a positive number smaller than 1, originally selected as 0.99. The reduction of the modification factor β_NNA enhances the exploitation ability of the algorithm as the iterations increase, by allowing the algorithm to search for the optimum solution near the target solution, especially in the final iterations. By analogy with ANNs, a transfer function (TF) operator is used in the NNA to produce solutions of better quality. The TF operator is defined as follows:

\[ X_i^{*}(k + 1) = TF\big(X_i(k + 1)\big) = X_i(k + 1) + 2 \cdot \mathrm{rand} \cdot \big( X^{Target}(k) - X_i(k + 1) \big), \quad i = 1, 2, 3, \ldots, N_{pop} \tag{13.30} \]
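The core update rules of the NNA in eqs. (13.20)–(13.30) can be sketched in Python as follows. This is a simplified illustrative implementation (for example, the bias operator is reduced to random re-initialization of one component of a β-fraction of solutions), not the reference code of Sadollah et al. (2018); it maximizes whatever objective function is supplied, such as the fuzzy entropy sketched earlier.

```python
import numpy as np

def nna_maximize(objective, dim, lb, ub, n_pop=30, max_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    X = lb + rng.random((n_pop, dim)) * (ub - lb)                 # eq. (13.20)
    W = rng.random((n_pop, n_pop))
    W /= W.sum(axis=1, keepdims=True)                             # eqs. (13.22)-(13.23)
    fit = np.array([objective(x) for x in X])
    best = np.argmax(fit)
    x_target, f_target, w_target = X[best].copy(), fit[best], W[best].copy()
    beta = 1.0
    for k in range(max_iter):
        X_new = W.T @ X                                           # eq. (13.25): sum_i w_ij * X_i
        X = X + X_new                                             # eq. (13.26)
        W = W + 2.0 * rng.random((n_pop, n_pop)) * (w_target - W) # eq. (13.27)
        W = np.abs(W)
        W /= W.sum(axis=1, keepdims=True)                         # re-impose eq. (13.23)
        for i in range(n_pop):
            if rng.random() < beta:                               # bias operator (simplified)
                idx = rng.integers(dim)
                X[i, idx] = lb[idx] + rng.random() * (ub[idx] - lb[idx])
            else:                                                 # transfer function, eq. (13.30)
                X[i] = X[i] + 2.0 * rng.random() * (x_target - X[i])
        X = np.clip(X, lb, ub)
        fit = np.array([objective(x) for x in X])
        best = np.argmax(fit)
        if fit[best] > f_target:
            x_target, f_target, w_target = X[best].copy(), fit[best], W[best].copy()
        beta *= 0.99                                              # eq. (13.29) with alpha = 0.99
    return x_target, f_target

# Illustrative usage: maximize fuzzy entropy over 2*(n-1) fuzzy parameters for n = 3 regions.
# lb, ub = np.zeros(4), np.full(4, 255.0)
# best_params, best_val = nna_maximize(lambda p: fuzzy_entropy(hist, np.sort(p)), 4, lb, ub)
```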
13.5 Results of the experiment and proposed work
Simulations are conducted in MATLAB R2014b on an Intel® Core™ i3 2.8 GHz processor. Segmented gray images are produced using the threshold values obtained as defined above. We selected the "Sea Star," "Cameraman," "Hunter," and "Lena" images for quality evaluation, assessing the robustness, effectiveness, and convergence of the suggested algorithm. The data sets are accessible from the image segmentation server (http://www.imageprocessingplace.com/root_files_V3/image_databases.htm), and all images are in .jpg format, as shown in Figure 13.3. We propose an NNO-based image thresholding method for efficient and accurate segmentation of the objects in these images by optimizing fuzzy entropy. The efficiency and accuracy of the suggested NNO algorithm surpass other optimization techniques such as DE, PSO, and FA.
13.5.1 Selection of parameters for NNO, DE, PSO, and FA

These optimization algorithms use the same population size and the same total number of iterations. The maximum number of iterations is 30, and the solution cost is ten times higher. The efficiency of the PSO algorithm depends on two tuning parameters, the velocity constants and the inertia weight; at the chosen values (Table 13.1), PSO gave the highest fitness scores experimentally, with the inertia weight (W) lying between 0 and 1. The weighting factor (F) in NNO is 0.5 and the convergence likelihood is 0.99, as NNO performs best at these values.
Figure 13.1: Image segmentation methods: threshold-based, region-based, edge-based, clustering-based, watershed-based, and ANN-based methods.
Figure 13.2: Flow diagram for Otsu's method: input 2D image → gray scale → histogram → Otsu thresholding.
Figure 13.3: Test images: (a) sea star; (b) camera man; (c) hunter; (d) Lena.
FA efficiency relies on parameters such as the number of candidate solutions (population size), the maximum number of iterations, the randomization parameter, the attractiveness, and the absorption coefficient. These control parameters should be selected carefully in order to implement FA successfully. Successive runs were carried out to pick these parameters and to determine the values at which the highest objective function value was found.
13.5.2 Quantitative validation

Fuzzy entropy is used as the fitness function to analyze the effect of the NNO algorithm on the multilevel thresholding problem.

13.5.2.1 Maximizing fuzzy entropy

In this case, the objective optimized by the optimization approach is fuzzy entropy, whose value with NNO is consistently good compared to the others
like DE, PSO, and FA. All algorithms are designed to maximize the objective function. Tables 13.3–13.6 list the objective and threshold values obtained with NNO, DE, PSO, and FA. Table 13.2 shows that, for the different test images, NNO with fuzzy entropy reaches an objective value higher than DE, PSO, and FA.

Table 13.1: Parameter settings.

Algorithm   Parameters
NNO         NP; D; Gmax; αNNA = 0.99; βNNA = dynamic
DE          NP; D; Gmax; CR
PSO         NP; D; Gmax; ωmin; ωmax; c1 = c2
FA          NP; D; Gmax; α; γ

Here, NP is the number of populations, D is the dimension of the population, and Gmax is the number of iterations.
Table 13.2: Verification of PSNR values for NNO, DE, PSO, and FA methods. Columns: Image, N, NNO, DE, PSO, FA; rows give the PSNR values for the sea star, camera man, hunter, and Lena images at each number of thresholds N.
Table 13.3: Optimum threshold values, PSNR, and STD values achieved by the NNO algorithm based on fuzzy entropy. Columns: Image, N, Threshold, PSNR, STD; rows cover the sea star, camera man, hunter, and Lena images at different numbers of thresholds N.
Table 13.4: Optimum threshold values, PSNR, and STD values achieved by the DE algorithm based on fuzzy entropy. Columns: Image, N, Threshold, PSNR, STD; rows cover the sea star, camera man, hunter, and Lena images at different numbers of thresholds N.
Table 13.5: Optimum threshold values, PSNR, and STD values achieved by the PSO algorithm based on fuzzy entropy. Columns: Image, N, Threshold, PSNR, STD; rows cover the sea star, camera man, hunter, and Lena images at different numbers of thresholds N.
Table 13.6: Optimum threshold values, PSNR, and STD values achieved by the FA algorithm based on fuzzy entropy. Columns: Image, N, Threshold, PSNR, STD; rows cover the sea star, camera man, hunter, and Lena images at different numbers of thresholds N.
13.5.2.2 Qualitative outcomes

In this section, using fuzzy entropy with the NNO, DE, PSO, and FA algorithms, we concentrate on the visual clarity of the segmented objects at specific threshold counts, that is, Th = 2, Th = 3, Th = 4, and Th = 5. Figures 13.4–13.7 display the segmented (thresholded) images and the corresponding histogram thresholds for the DE and NNO algorithms with fuzzy entropy at levels 2, 3, 4, and 5. From these figures, we note that the visual quality of the segmented images improves as the number of thresholds increases from Th = 2 to Th = 5. Examining the visual quality of the fuzzy entropy-segmented images confirms the efficacy and robustness of the suggested NNO: its visual quality is greater than that of DE. In terms of visual image quality, the proposed algorithm also performs well on all the other images relative to the previous algorithms. The effect of multilevel thresholding is clear from the different images.
Figure 13.4: Results after the NNO is applied over the sea star image.
Figure 13.5: Results after the NNO is applied over the camera man image.
Figure 13.6: Results after the NNO is applied over the hunter image.
Figure 13.7: Results after the NNO is applied over the Lena image.
13.5.2.3 Comparison with other methods

The NNA maximizes fuzzy entropy for object thresholding effectively and accurately. The proposed algorithm is tested for validity on natural images, and its results are compared with those of optimization techniques such as DE, PSO, and FA combined with fuzzy entropy. Compared to DE, PSO, and FA, the proposed algorithm has a higher (maximum) fitness value. The PSNR values obtained with the proposed algorithm are higher than those of DE, PSO, and FA, and thus the proposed method yields better image quality. The NNA scheme is used to calculate the threshold levels efficiently. The parameter settings of NNA, DE, PSO, and FA were chosen following the literature.

13.5.2.3.1 Analyzing stability

The result of an optimization approach is inherently unpredictable, as randomness is involved in the process, and the outcomes of individual runs differ. The efficacy of the algorithm is therefore verified with more than one run and with distinct initial values. An algorithm is said to be robust if its result is consistent under the same conditions. So we ran the same algorithm 30 times and considered the outcomes of the 30 separate runs. The algorithm's consistency is evaluated by the mean and the standard deviation (S.D.).

13.5.2.3.2 PSNR

PSNR is a visual similarity measure between two images, expressed in decibels (dB); it indicates the dissimilarity between the thresholded output and the original image. A higher PSNR value suggests better image quality of the restored (thresholded) image:

PSNR = 10 log10(255² / MSE)  (dB)    (13.31)
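As a quick illustration of eq. (13.31), the following Python/NumPy sketch computes the PSNR between an 8-bit gray image and its thresholded version; it is an illustrative helper, with the peak value of 255 taken from the equation above.

```python
import numpy as np

def psnr(original: np.ndarray, segmented: np.ndarray) -> float:
    """PSNR in dB between an 8-bit gray image and its thresholded version, eq. (13.31)."""
    original = original.astype(np.float64)
    segmented = segmented.astype(np.float64)
    mse = np.mean((original - segmented) ** 2)  # mean squared error between the two images
    if mse == 0.0:
        return float("inf")                     # identical images
    return 10.0 * np.log10((255.0 ** 2) / mse)
```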
13.6 Conclusion and future work

From the above discussion, it can be concluded that multilevel fuzzy entropy-based segmentation thresholding techniques perform significantly better than Shannon-based techniques. Fuzzy entropy-based thresholding techniques provide adequate visual comparison outcomes. The success of the fuzzy-based strategy can be evaluated by comparing the results with Shannon entropy in terms of both visual and numerical analysis. These claims are supported by the state-of-the-art Feature Similarity Index Measurement (FSIM) and Complex Wavelet Structural Similarity Index Measurement (CW-SSIM) image quality evaluation metrics. The NNA undoubtedly adds speed and precision to this algorithm. However, for better performance, better
self-adaptive DE or other untested NNA variants could be used. Additionally, for better separation of the segmented regions, several additional membership functions could be assessed. In the future, the capability of the segmentation algorithms could be demonstrated with more image quality metrics. Last but not least, to achieve better results, a method based on 2D histograms could be introduced. The convergence time and performance of the NNA will be improved by refining the algorithm in future work.
References Agrawal S., Panda R., Bhuyan S., and Panigrahi B.K. Tsallis entropy based optimal multilevel thresholding using cuckoo search algorithm. Swarm and Evolutionary Computation, (2013), 11, 16–30. Aja-Fernandez S., Curiale A.H., and Vegas-Sanchez-Ferrero G. A local fuzzy thresholding methodology for multi region image segmentation. Knowledge-Based Systems, (2015), 83(1), 1–12. Akay B. A study on particles warm optimization and artificial bee colony algorithms for multilevel thresholding. Applied Soft Computing Journal, (2013), 13(6), 3066–3091. Arora S., Acharya J., Verma A., and Panigrahi P.K. Multilevel thresholding for image segmentation through a fast statistical recursive algorithm. Pattern Recognition Letters, (2008), 29(2), 119–125. Beekman M., Sword G.A., and Simpson S.J. Biological Foundations of Swarm Intelligence. 1th. Berlin Heidelberg: Springer Press, (2008). Beni G. and Wang J. Swarm intelligence in cellular robotic systems. NATO Advanced Workshop on Robots and Biological Systems(1993), 703–712. Borisenko V.I., Zlatotol A.A., and Muchnik I.B. Image segmentation (state of the art survey). Automation and Remote Control, (1987), 48, 837–879. Brajevic I., Tuba M., and Bacanin N. Multilevel image thresholding selection based on the cuckoo search algorithm. In Advances in Sensors, Signals, Visualization, Imaging and Simulation, (2012), 9, 217–222. Chiranjeevi K. and Jena U. Fast vector quantization using a Bat algorithm for image compression. International Journal of Engineering Science Technologies, (2016), 19(2), 769–781. Cuevas E., Zaldivar D., and Pérez-Cisneros M. A novel multi-threshold segmentation approach based on differential evolution optimization. Expert Systems with Applications, (2010), 37, 5265–5271. Cuevas E., Sención F., Zaldivar D., Pérez-Cisneros M., and Sossa H. A multi-threshold segmentation approach based on artificial bee colony optimization. Applied Intelligence, (2012), 37, 321–336. Das S. and Suganthan P.N. Differential evolution – a survey of the state-of-the-art. IEEE Transaction on Evolutionary Computation, (2011), 15(1), 4–31. Dirami A., Hammouche K., Diaf M., and Siarry P. Fast multilevel thresholding for image segmentation through a multiphase level set method. Signal Processing, (2013), 93, 139–153. Feng D., Wenkang S., Liangzhou C., Yong D., and Zhenfu Z. Infrared image segmentation with 2-D maximum entropy method based on particle swarm optimization (PSO). Pattern Recognition Letters, (2005), 26, 597–603. Fu K.S. and Mui J.K. A survey on image segmentation. Pattern Recognition, (1981), 13, 3–16.
Gao H., Xu W., Sun J., and Tang Y. Multilevel thresholding for image segmentation through an improved quantum-behaved particle swarm algorithm. IEEE Transactions on Instrumentation and Measurement, (2010), 59(4), 934–946. Garrett J.H. Where and why artificial neural networks are applicable in civil engineering. Journal of Computing in Civil Engineering, (1994), 8, 129–130. Ghamisi P., Couceiro M.S., Martins F.M., and Benediktsson J.A. Multilevel image segmentation based on fractional-order Darwinian particle swarm optimization. IEEE Transactions on Geoscience and Remote Sensing, (2014), 52, 2382–2394. Hammouche K., Diaf M., and Siarry P. A multilevel automatic thresholding method based on a genetic algorithm for a fast image segmentation. Computer Vision and Image Understanding, (2008), 109, 163–175. Haralharick R.M. and Shapiro L.G. Survey: Image segmentation techniques. CVGIP, (1985), 29, 100–132. Hasanipanah M., Faradonbeh R.S., Amnieh H.B. et al. Forecasting blast-induced ground vibration developing a CART model. Engineering with Computers, (2016), 33, 1–10. Herencia J. and Lamata M.. Entropy measure associated with fuzzy basic probability assignment. In IEEE int. Conf. On fuzzy systems, volume 2, 863–868, 1997. Horng M.H. Multilevel image thresholding with glowworm swam optimization algorithm based on the minimum cross entropy. Advances in Information Sciences and Service Sciences, (2013), 5, 1290–1298. [38] L. Qifang, O. Zhe, C. Xin, Z. Yongquan, A multilevel threshold image segmentation algorithm based on glowworm swarm optimization, J. Comput. Inf. Syst., 10 (2014)1621-1628. Horng M.H. Multilevel thresholding selection based on the artificial bee colony algorithm for image segmentation. Expert Systems with Applications, (2011), 38, 13785–13791. Kapur J.N., Sahoo P.K., and Wong A.K. A new method for gray-level picture thresholding using the entropy of the histogram. Computer vision, graphics, and image processing, (1985), 29(3), 273–285. Karri C. and Jena U. Image compression based on vector quantization using cuckoo search optimization technique. Ain Shams Engineering Journal, (2016). Karri, C., Umaranjan, J., Prasad, P.M.K. Hybrid Cuckoo search based evolutionary vector quantization for image compression. Artif. Intell. Comput. Vision Stud. Comput. Intell., 89–113 (2014). Kurban T., Civicioglu P., Kurban R., and Besdok E. Comparison of evolutionary and swarm based computational techniques for multilevel color image thresholding. Applied Soft Computing Journal, (2014), 23, 128–143. Li C.H. and Lee C.K. Minimum cross entropy thresholding. Pattern Recognition, (1993), 26, 617–625. Li J., Li X., Yang B., and Sun X. Segmentation-based image copy-move forgery detection scheme. IEEE Transactions on Information Forensics and Security, (2015), 10(3), 507–518. Ma M., Liang J., Guo M., Fan Y., and Yin Y. SAR image segmentation based on artificial bee colony algorithm. Applied Soft Computing, (2011), 11, 5205–5214. Maitra M. and Chatterjee A. A hybrid cooperative–comprehensive learning based PSO algorithm for image segmentation using multilevel thresholding. Expert Systems with Applications, (2008), 34, 1341–1350. Masood S., Sharif M., Masood A., Yasmin M., and Raza M. A survey on medical image segmentation. Current Medical Imaging Reviews, (2015), 11,(1), 3–14. Maulik U. Medical image segmentation using genetic algorithms. IEEE Transactions on Information Technology in Biomedicine, (2009), 13, 166–173. McCulloch W.S. and Pitts W. A logical calculus of the ideas immanent in nervous activity. 
The Bulletin of Mathematical Biophysics, (1943), 5, 115–133.
Oliva D., Cuevas E., Pajares G., Zaldivar D., and Osuna V. A multilevel thresholding algorithm using electromagnetism optimization. Neuro Computing, (2014), 139, 357–381. Osuna-Enciso V., Cuevas E., and Sossa H. A comparison of nature inspired algorithms for multithreshold image segmentation. Expert Systems with Applications, (2013), 40(4), 1213–1219. Otsu N. A threshold selection method from gray-level histograms. IEEE Transactions on Systems, Man, and Cybernetics, (1979), 9(1), 62–66. Pal N.R. and Pal S.K. A review on image segmentation. Pattern Recognition, (1993), 26(9), 1277–1294. Riseman E.M. and Arbib M.A. Survey: Computational techniques in the visual segmentation of static scenes. Computer Vision, Graphics, and Image Processing, (1977), 6, 221–276. Rudas I. and Kaynak M. Entropy-based operations on fuzzy sets. IEEE Transactions on Fuzzy Systems, (1998), 6(1), 33–39. Sadollah A., Sayyaadi H., and Yadav A. A dynamic metaheuristic optimization model inspired by biological nervous systems: Neural network algorithm. Applied Soft Computing, (2018), 71, 747–782. Sahoo P.K., Soltani S., Wong A.K.C., and Chen Y.C. A survey of thresholding techniques. CVGIP, (1988), 41, 233–260. Sarkar S. and Das S. Multilevel image thresholding based on 2D histogram and maximum Tsallis entropy – a differential evolution approach. IEEE Transactions on Image Processing, (2013), 22, 4788–4797. Sarkar S., Das S., and Chaudhuri S.S. A multilevel color image thresholding scheme based on minimum cross entropy and differential evolution. Pattern Recognition Letters, (2015), 54, 27–35. Sarkar S., Das S., and Chaudhuri S.S. Multilevel image thresholding based on Tsallis entropy and differential evolution. In: Panigrahi B.K., Das S., Suganthan P.N., and Nanda P.K. (eds.), SEMCCO 2012. LNCS, (2012), Vol. 7677, Springer, Heidelberg, 17–24. Sarkar S., Patra G.R., and Das S. A differential evolution based approach for multilevel image segmentation using minimum cross entropy thresholding. In: Panigrahi B.K., Suganthan P.N., Das S., and Satapathy S.C. (eds.), SEMCCO 2011, Part I. LNCS, (2011), Vol. 7076, Springer, Heidelberg, 51–58. Sezgin M. and Sankur B. Survey over image thresholding techniques and quantitative performance evaluation. Journal of Electronic Imaging, (2004), 13(1), 146–168. Shi N. and Pan J. An improved active contours model for image segmentation by level set method. Optik, (2016), 127(3), 1037–1042. Sonka M., Hlavac V., and Boyle R. Image Processing, Analysis and Machine Vision. Boston, Mass, USA: Springer US, (1993). Storn R. and Price K. Differential evolution – a simple and efficient heuristic for global optimization over continuous spaces. Journal of Global Optimization, (1997), 11, 341–359. Torres-Sanchez J., López-Granados F., and Pena J.M. An automatic object-based method for optimal thresholding in UAV images: Application for vegetation detection in herbaceous crops. Computers and Electronics in Agriculture, (2015), 114, 43–52.
Tsai W.H. Moment-preserving thresholding: A new approach. Computer Vision, Graphics, and Image Processing, (1985), 29(3), 377–393. Zhang Y. and Wu L. Optimal multi-level thresholding based on maximum Tsallis entropy via an artificial bee colony approach. Entropy, (2011), 13, 841–859. Zhiwei Y., Mingwei W., Wei L., and Shaobin C. Fuzzy entropy based optimal thresholding using bat algorithm. Applied Soft Computing, (2015), 31, 381–395. Zhou Y., Li L., and Ma M. A novel hybrid bat algorithm for the multilevel thresholding medical image segmentation. Journal of Medical Imaging and Health Informatics, (2015), 5, 1742–1746.
Daiyaan Ahmed Shaik, Vihal Mohanty, Ramani Selvanambi
14 Machine learning in healthcare

Abstract: Machine learning (ML) is an application of AI (artificial intelligence) that deals with the study of a computer's capability to learn from given data in order to gain knowledge for making predictions and decisions based on its experience. Such technology can benefit the healthcare industry to a great extent. It is the fastest growing industry, with high rates of progress in the field of health and with new technologies emerging rapidly. ML can be extended to a wide range of clinical and prediction tasks, since the performance of ML algorithms has been shown to exceed that of humans. Nowadays, all patient data are recorded on computers, and the existing patient data can be used by doctors and examiners for follow-ups. ML algorithms use these existing data and analyze them to identify patterns that are used to make precise diagnoses and provide better care to patients. With the invention of wearables, patient data are continuously monitored and stored, and then used by ML for better patient management. ML algorithms are also being used to accurately predict the progress of a disease. This innovation can provide opportunities to improve the efficiency and quality of healthcare.

Keywords: Machine learning, Artificial intelligence, Patient care, Electronic medical records, Big data
14.1 Introduction Machine learning (ML) is an application of AI (artificial intelligence), which deals with the study of capability of a computer to learn from the given data to gain knowledge in making predictions and decisions based on its experience. Such technology can benefit healthcare industry to a great extent. It is the fastest growing industry with high rates of progress in the field of health with new technologies emerging rapidly. Nowadays, all of the patient data has been recorded on computers, and the existing patient data can be used by the doctors and examiners for follow-ups. The ML algorithms use this existing data and analyze them to identify patterns that are used to make precise diagnosis and to provide patients with better care. With the invention of wearables, all the patient data is monitored and stored, which is then used by ML for better patient management. ML algorithms are also being used to accurately predict the progress of a disease. There is
Daiyaan Ahmed Shaik, Vihal Mohanty, Ramani Selvanambi, School of Computer Science and Engineering, Vellore Institute of Technology, Vellore, India https://doi.org/10.1515/9783110648195-014
a drastic increase in the percentage of money that governments allot for healthcare, owing to frequent appointments and follow-ups and to the difficulty of having a very large number of specialists available. Innovations like swallowable chips let patients know when their medication is to be taken, while large-scale innovations like data analysis help doctors make precise predictions about a disease; such innovations are very effective and help cut down healthcare costs. Technologies like these and ML can never entirely replace doctors, but they can be highly beneficial to the sector. Currently, doctors are mainly focused on satisfying their patients with care and giving them appropriate medication, with regular follow-up appointments to help nullify their symptoms completely. With the exponentially increasing population, it is becoming difficult for doctors to keep track of each and every patient they treat, and allotting less time to each patient results in a poor relationship between the doctor and the patient. To overcome this, newer technologies like ML must be developed and utilized to provide better patient-centered care.
14.2 Machine learning

At present, we live in a world in which the amount of data is increasing rapidly, and newer methods and algorithms are being invented to gain insight and to find hidden patterns in the data. These methods can be highly helpful to the healthcare industry, as we can find patterns that recognize particular symptoms in a patient in order to diagnose a disease. ML is one of these methods of analyzing data to gain insight and to make accurate predictions. ML algorithms learn and evolve by themselves and need not be directed to look at a particular part of the data; they learn and develop thanks to their iterative nature. There are mainly two ways in which a machine can learn: supervised learning and unsupervised learning. Supervised learning uses existing examples to train the algorithms. A sufficient number of inputs and correct outputs are provided to the machine, and it learns by comparing the calculated output with the actual output. This type of learning uses information from the past to predict future events. In unsupervised learning, the machine goes through the data and tries to recognize patterns. These techniques uncover hidden patterns enclosed in a dataset or cluster, and hence are exploratory techniques that often apply methods like principal component analysis. ML techniques are being used in various departments, are very advantageous, and can produce excellent results. They are popularly used in finance departments to predict whether credit card transactions are fraudulent, and they are also used by online shopping sites to suggest items to customers based on their previous searches. Technology like ML has been burgeoning in various diverse departments and can be enormously promising in the healthcare industry, where health records and patients' requirements for clinical experts are increasing by the minute. Such technology can help reduce the cost of healthcare to a huge
extent and provide better care and treatment to the patient. In the medical industry, ML techniques can be essential for predicting and diagnosing a disease and for tracking its progress. As gigantic amounts of data are available in the present technological era, ML can be highly advantageous by using these data as training datasets and analyzing them to find important patterns. Nowadays, governments spend millions of dollars on research into, and improvement of, treatment and care. This expenditure can also be reduced by implementing ML in the medical sector, as it cuts down the cost of diagnosis and care (Rohan et al., 2017).
14.3 Main algorithms of ML

14.3.1 Decision tree methods

This method creates a decision tree from the inputs and data in the training sets. Decision trees are used to classify data: the classification of a particular record is based on the path it follows down the tree, and the tree classifies target data according to the best feature at each split. The shape of the tree depends on the number of branches formed and on when splitting stops. Because of its greedy approach, it is not best suited for complex classifications. The required data preprocessing is minimal: no normalization or scaling of the data is needed for this method, and building the decision tree is not greatly affected even if some data are missing. However, the decision tree's structure is quite unstable, as any minute change in the data can affect it considerably.
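As a brief, hedged illustration of the method described above, the following Python sketch trains a small decision tree with scikit-learn on a synthetic dataset; the data and the depth limit are invented for illustration and stand in for real clinical features.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for a small clinical dataset (features and labels are artificial)
X, y = make_classification(n_samples=200, n_features=5, n_informative=3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

# Grow a depth-limited tree; splits are chosen greedily on the best feature at each node
tree = DecisionTreeClassifier(max_depth=3, random_state=0)
tree.fit(X_train, y_train)

print("test accuracy:", tree.score(X_test, y_test))
```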
14.3.2 Naive Bayes classification

This method is based purely on probability theory. It treats the target attributes as events and calculates the conditional (posterior) probability from the prior probability and the likelihood using Bayes' theorem:

P(Y|X) = P(X|Y) P(Y) / P(X)

This method is widely used due to its high prediction efficiency, because it is based on a simple principle, and a relatively small amount of data is enough for it. There is, however, a chance of loss of accuracy with naive Bayes classification, since it cannot model the dependencies that are present between the data variables.
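A minimal scikit-learn sketch of the above; GaussianNB applies Bayes' theorem under the conditional-independence assumption discussed here, and the two "symptom" features and labels are invented.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

# Toy data: two numeric symptoms per patient and a binary diagnosis label (all invented)
X = np.array([[37.0, 80], [39.5, 110], [36.8, 72], [40.1, 120], [38.9, 105], [36.6, 68]])
y = np.array([0, 1, 0, 1, 1, 0])

model = GaussianNB()
model.fit(X, y)

# Posterior probabilities P(Y | X) for a new observation, via Bayes' theorem
print(model.predict_proba([[39.0, 100]]))
```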
14.3.3 k-Means

This is an iterative unsupervised learning method used to solve the clustering problem. It involves choosing a value for k (the number of clusters), assigning a centroid to each cluster, computing the distance between these centroids and the objects, and assigning each object to the closest cluster. Once all the clusters are defined, the centroid of each cluster is recalculated. This process is iterated until the centroids remain constant. The method is easy to implement, requires little programming, is relatively faster than other clustering algorithms, scales to very large datasets, and is adaptable. However, choosing k manually is an issue, and the result depends on the initial centroid values.
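The following hedged sketch runs the iterative procedure just described using scikit-learn's KMeans on synthetic 2D points; the choice k = 3 and the data are arbitrary and for illustration only.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Three artificial groups of 2D points standing in for patient measurements
points = np.vstack([rng.normal(loc=c, scale=0.5, size=(50, 2)) for c in ([0, 0], [4, 4], [0, 5])])

kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)  # k is chosen manually
labels = kmeans.fit_predict(points)

print("centroids:\n", kmeans.cluster_centers_)
print("first ten cluster assignments:", labels[:10])
```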
14.3.4 Artificial neural network

An artificial neural network (ANN) is an arithmetic model that mimics the functioning of the human brain. It consists of nodes connected to each other, just like neurons in a biological neural network. A unique function, called the activation function, is associated with each node, and a weight is assigned to the path between one node and another. Such networks are divided into feedforward networks and feedback networks. The neural network works by multiplying the set of inputs by their respective weights; the threshold value is subtracted from the sum of all weighted inputs to obtain the total net input, and the activation function converts this net input into the output. Using this method, information can be stored throughout the network, and it produces good results even with incomplete knowledge. On the other hand, the way the network works and behaves cannot easily be explained, and the amount of time the network needs to produce an output cannot be predicted.
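A tiny NumPy sketch of the single-node computation described above; the input values, weights, threshold, and the sigmoid activation are illustrative choices, not a prescribed design.

```python
import numpy as np

def node_output(inputs: np.ndarray, weights: np.ndarray, threshold: float) -> float:
    """Single artificial neuron: weighted sum minus threshold, passed through an activation."""
    net_input = np.dot(inputs, weights) - threshold      # total net input
    return 1.0 / (1.0 + np.exp(-net_input))              # sigmoid activation function

x = np.array([0.5, 1.2, -0.3])        # example inputs (arbitrary)
w = np.array([0.8, -0.4, 0.6])        # weights on the paths into the node
print(node_output(x, w, threshold=0.1))
```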
14.3.5 Support vector machines

Support vector machines (SVMs) are supervised learning models that learn by mapping the input space into a high-dimensional space. They are used for regression and classification. An SVM classifies a given training example into one of two categories; hence, it is a nonprobabilistic binary classifier. SVMs are often used to diagnose medical issues and to predict tumors. An SVM treats its training data as points in a space and separates the categories with as wide a gap as possible; any new examples introduced into the space are assigned to the category on whose side of the gap they fall. For unlabeled data, it naturally clusters the data and follows an unsupervised approach. It is quite difficult to apply SVMs to large-scale data, as they involve very heavy computation (Brownlee, 2016).
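A short, hedged scikit-learn sketch of a binary SVM classifier; the RBF kernel, the feature scaling step, and the synthetic data are assumptions made for the example.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = make_classification(n_samples=300, n_features=6, random_state=1)  # artificial data
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)

# Scale features, then fit a maximum-margin classifier with an RBF kernel
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
clf.fit(X_train, y_train)
print("test accuracy:", clf.score(X_test, y_test))
```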
14.4 Benefits of ML in healthcare

ML can play a vital role in healthcare. ML techniques open up a massively wide range of possibilities in the medical field, and all of their possible applications are extensively beneficial to the medical industry. Basically, they are applied to reduce the time and cost of treatment by analyzing vast amounts of patient data and finding patterns that help them adapt and improve in disease prediction and task automation. They reduce the workload drastically by helping to maintain health records and to follow up on unpaid medical bills, which eventually lowers the expenditure of the medical firm. ML techniques gather insights on a patient's treatment by analyzing their medical records, resulting in better predictive analysis. These insights help in providing better treatment, as they help in finding the exact areas where treatment is necessary. Smart wearables are highly compatible with ML techniques, as they provide essential data about the patient, which are then utilized by ML algorithms. Smart wearables like Fitbits, smartwatches, and other devices can alert patients and their doctors about any health risk or threat by analyzing the gathered data. Such technology reduces the work of medical experts and reduces the number of hospital visits or admissions. Implementation of ML in healthcare gives rise to limitless opportunities. The benefits of implementing ML techniques in the medical field are given in the following sections.
14.4.1 Prevent unnecessary hospital visits and readmissions

Models built using ML can reduce patient readmissions: doctors are informed about the patients most likely to be readmitted, constantly monitor those patients' condition, and provide the necessary medicines and remedies to prevent the risk. This results in more efficient and patient-centered treatment.
14.4.2 Reduces the stay in hospital

ML algorithms are implemented to reduce the length of a patient's stay by identifying the patient's problem early and providing relevant care in less time. This generally results in better outcomes; the health of patients is monitored remotely, and necessary medications are flagged when required.
14.4.3 Lessen hospital-acquired infections Hospital-acquired infections are common and can be deadly. The risk of being infected at the hospitals can be prevented completely by implementing ML techniques that help in monitoring a patient remotely without the burden of visiting the hospital often.
14.4.4 Predict diseases

Chronic diseases can be identified early with the help of ML techniques, and preventive measures can be taken so that they are cured completely. Diagnoses can be cross-verified using ML prediction techniques to find any undiagnosed or misdiagnosed diseases and to take the necessary treatment. Regular checks can prevent the onset of a chronic disease, which can be done easily by implementing ML.
14.4.5 Cost reduction With the help of ML, the workload on the doctors is reduced and it also helps in the accurate prediction of diseases and also in alerting if the patient has any health risks or issues. All these applications result in faster and better care and eventually lower the cost of treatment tremendously.
14.4.6 Reduced workload on doctors ML helps cut down the work on doctors by performing all their mundane tasks at a greater speed. It helps the doctors in prediction of a disease and also eliminates any need for entering patient data, which is a very tedious task. All these tasks by ML eventually result in more time available for the doctors to treat the patients, which results in better treatment, personalized care, and patient-oriented care. As the whole medical industry is slowly digitalizing to the digital age where all the information and data are stored online, the use of ML techniques and wearables is amplified. Medical records and data are easily accessed by doctors and patients as everything is available online. Administering ML in medical field has opened a wide range of possibilities where each one is more advantageous than the previous one. Newer start-ups and companies are being formed on the basis of ML in healthcare to improve the existing technology and provide faster and better treatment with lower costs (Yichuan et al., 2018).
14.5 Limitations of ML in healthcare ML is a highly fruitful technology in medical field and results in better prediction of disease and better maintenance of medical records and is greatly helpful to doctors. It increases the overall efficiency and care of the treatment. Even such technology can have its own limitations that constrain it from functioning to its maximum extent. Some of the limitations of ML in medical industry are in the following sections.
14.5.1 Data quality Data quality plays a vital role in ML as ML algorithms and prediction techniques are data-driven. Any miscalculation or error due to poor quality of data may have very serious consequences like the death of a patient. Noisy or incomplete data will not yield expected insights and will affect the entire analysis. Therefore, the quality of data is very essential as any anomaly might result in skewed conclusions.
14.5.2 Algorithms must be transparent One of the major drawbacks of ML algorithms is their black box–like approach as they do not provide any clear insights about how they came to a particular conclusion. Medical experts and doctors are required to know the explanation for each insight given by the ML algorithm for cross-verification. Most of the companies disapprove the implementation of ML due to this lack of transparency. Major algorithms are accepted in the medical field only if they provide the clear explanation of each outcome.
14.5.3 Risk of manipulation

Data might be manipulated intentionally or accidentally, either of which can cause havoc. Even a minute error in the neural networks or programs that are used to train the ML algorithms will return a false outcome. Since these outcomes are used to predict diseases and to prescribe medicines to a patient, any kind of error can have very dangerous consequences.
14.5.4 Algorithms must be credible The ML algorithms must be able to provide scientific explanation and credible insights to each output it produces. Any algorithm that fails to do so is considered unreliable and is not adopted by the medical firm. Such level of credibility is required as the medical firms deal with patients, and any error or miscalculation might result in unnecessary complications that are harmful to the patient.
14.5.5 Confidence scores ML algorithms must accompany their outcomes with a confidence score, which explains and supports their conclusion. The confidence scores determine the extent to which an ML algorithm can be trusted. An algorithm with a good confidence score
shows that the predictions and results given by it are accurate. The black box problem is solved by attaching a confidence score to an algorithm.
14.5.6 Results must be reproducible Data provided to the ML determines the accuracy of the result it produces. Additionally, any new insights or improved statistics improve the performance of the algorithm. The results must be generated in such a way that they must be informative and must provide consistent results to improve the efficiency and authenticity of the algorithm.
14.5.7 Algorithms must be fair and must demonstrate impact Algorithms implemented in the healthcare field should be trustworthy and must not contain any kind of anomaly, which affects the final result of the prediction or the analysis. The algorithms must be tested in regular intervals to meet up with the technological advancements and should be simultaneously improved to provide better performance. To utilize ML algorithms to their maximum extent, most of these drawbacks should be addressed. The impact of ML in medical field has been increasing day by day, and their true potential can be achieved in the near future. ML has improved care by a huge extent and can further improve its efficiency by overcoming some of these drawbacks (Pratik et al., 2019).
14.6 Eventual fate of ML in healthcare

ML in healthcare helps doctors make precise decisions, uncover hidden symptoms, and provide overall better treatment of the patient. Such technology can be taken to greater heights in the future with upcoming technological advancements. Some of the future applications of incorporating ML in healthcare include the following. ML algorithms are being improved continuously to become more intelligent so that they can perform complex tasks. This is achieved by providing extensive, rich, high-level data to the ML algorithms, as they grow and develop based on the data they receive. This results in algorithms that perform better, produce accurate imaging, make precise decisions, and maintain health records better. It also helps ML to build better learning models for solving complex problems.
14.6.1 Improved electronic medical records

Data from patients who have been treated based on ML predictions are summarized and added to the electronic records for further understanding and to improve the predictive power of the algorithms. As data increase exponentially every day, the amount of data available for computation and for training ML techniques is massive and can lead to various advancements in the maintenance of electronic records. The onset of a deadly disease can be predicted at a very early stage using ML algorithms. This can be very useful, as preventive steps can be taken in the early stages to avoid the situation entirely. In the modern digital age, relevant data are available for any possible malady and are used to train the algorithms to produce more accurate predictions that doctors can rely on.
14.6.2 Drug discovery

It takes many years for a drug to be discovered, and the procedure requires a huge amount of time, energy, and expenditure. To discover a drug, companies initially search for a compound that cures the disease by reacting with the target molecules of the body; in many cases, however, the compound tends to react with other molecules that are not the target, which might result in life-threatening circumstances. Therefore, companies find it difficult to discover a drug for a particular disease due to the lack of predictive methods. This problem can be addressed using ML, as it can computationally predict the appropriate compound without affecting a test subject and can even provide insights to rule out complications that might occur. This also saves a lot of time and financial resources and produces more accurate results.
14.6.3 Improving patient experience

ML algorithms eliminate the need for frequent visits to the hospital, as a patient's health can be monitored remotely and the necessary medication can be prescribed by the doctor. They also reduce the duration of treatment by helping doctors predict the disease without delay, preventing any chance of the situation worsening. The algorithms analyze the enormous data and recognize patterns that are used to provide better and more efficient care to the patient, improving their overall experience.
14.6.4 Utilizing patient data Computational ability of these algorithms when combined with the enormous data available can produce extraordinary results. All the patient data, including their
symptoms, medications, treatment procedure, and outcomes, are stored online; this huge amount of data is utilized by ML for analyzing and learning, and is used as training datasets for improving their performance. The prediction accuracy is drastically increased after iterations of the algorithm through various enormous datasets that ultimately result in better and faster treatment of the patient. With the help of such technology, doctors can focus more on the care and experience of the patients instead of thoroughly going through the patient records to gather insights. Many companies are on the verge of adopting ML as they are highly beneficial in the medical field. ML is showing great potential in the medical industry and can reach greater peaks if enough budgets are allotted by the government for its growth as it significantly benefits the economic state of the world (Johnson et al., 2016).
14.7 Will ML replace doctors?

ML is an emerging technology in the field of healthcare, and many companies are keen on adopting it. From the benefits of incorporating ML in healthcare discussed earlier, it can be seen that ML tends to perform parts of the doctors' job more efficiently and accurately. This raises the question of whether ML is capable of replacing doctors in the future. ML can do tasks that are humanly impossible, and with great efficiency; considering this, the question arises whether the medical sector can be taken over entirely by machines. ML algorithms can analyze and process massive data blocks at an intense rate, and they can produce results with better accuracy continuously without becoming worn out. ML can be far more productive than humans, as it can predict and prescribe medications for a large number of people at once and reduce the workload on employees. It can be very useful in maintaining and managing electronic records, which is a hectic and time-consuming task, and it drastically reduces the error rate even though it functions and predicts faster than doctors. Undoubtedly, ML algorithms are showing great potential to play an essential role in the medical industry, but even after considering all these benefits, we encounter scenarios where the presence of a doctor becomes mandatory. For example, ML cannot advise a patient about personal hygiene and cannot provide the care required by a patient with complex issues. In such cases, a doctor is definitely required, as the patient can communicate and explain his/her symptoms better. The presence of a doctor increases the confidence of a patient more than entering his/her issues into a computer does. Patients can express their concerns and issues better to a doctor and feel cared for in their presence. ML algorithms can only analyze the symptoms and predict the onset of a disease; they cannot actually cure the disease. A doctor is required to perform the rest of the tasks after the prediction by the algorithm. There have been situations where the algorithm could not diagnose the cause and suggested treatments that only worsened the patient's condition; a medical expert is
required in such cases to supervise the prescribed medications and prevent any unwanted issues, as this deals with the life of a person. ML algorithms find it difficult to predict or solve complex medical issues, as they have not yet been developed that far, and incorporating such technology without any supervisor would be a grave risk. Cases with multiple issues and complex diagnoses cannot have purely digital solutions, as they require creative problem-solving skills that ML algorithms have not acquired. Tests also suggest that a patient would rather wait and take an appointment with a physician instead of directly taking advice from a machine. This clearly shows that algorithms can never replace the empathy and trust provided by doctors. Since the medical industry involves revealing life-changing decisions, any person would prefer to hear them with care from a doctor instead of as a blunt response from a digital screen. ML algorithms cannot replace doctors as of now and were never intended to, but they can always be very useful to doctors in time management and overall treatment. Doctors can work together with ML algorithms and utilize the computational power of machines for their benefit, as machines can analyze massive amounts of data in very little time. The time a doctor generally spends arranging and maintaining records is now taken care of by the ML algorithms that manage electronic records. This in turn gives doctors time to diagnose more patients and treat them with patience. With the help of algorithms, diseases can be identified easily without any hassle, and mundane tasks can be automated so that the time saved can be allotted to treating patients with better care. Smarter technology is implemented that does not require the doctor to spare time for frequent blood pressure or temperature checks; these are done automatically with the help of smart wearables, which heavily reduce the workload on doctors. In countries with a scarcity of doctors, this technology can be very useful, as it massively reduces the amount of time a doctor takes to diagnose a patient, providing more time for each patient. As ML algorithms only deal with the prediction of a disease, there is no risk to doctors' jobs, except perhaps for the line of pathologists or radiologists, whose job is to diagnose a disease. But even they would not become unemployed, as every machine requires a medical expert to supervise and approve its decisions.
14.8 Applications

14.8.1 Detection of fraud in healthcare insurance

Governments all over the globe have been deeply affected by the intrusion of fraudulent activities into the healthcare management system. As a result of this fraud, many patients who are in need of serious healthcare become victims, owing to the unavailability of proper medical services and of an adequate budget. To avoid
such activities, ML algorithms along with data mining methods can be used to analyze irregularities in health insurance databases. The types of such fraudulent activities are as follows:
– Medical service providers: adding extra expenses to the bills for the services provided by the unit, and intentionally adding services that are not even required by the patient.
– Medical resource providers: ordering unnecessary tests for the patient, changing original bills to false ones, and providing equipment to a patient that is not required in any way.
– Insurance policy holders: providing fake information regarding employment so as to gather good benefits and a low premium, and claiming medical insurance for cases that never actually happened.
– Insurance policy providers: paying smaller amounts against the claims received from insurance holders, delaying the dates on which the claims of insurance holders are fulfilled, and misrepresenting their rules and policies in order to receive the maximum premium from insurance holders.

The algorithms used in detecting fraud in healthcare are as follows:
– Neural networks: A neural network is used to make a system work like a human brain so that it can take decisions on its own and separate fraudulent data from genuine data. A training set is fed into the neural network, according to which the system makes decisions. A neural network consists of several layers that are interconnected, with a unidirectional flow of data from the input layer to the output layer. A fraud identification framework comprises a group of multilayer networks for every entity, including medical claims, medical professionals, and businesses, that can be associated with the fraudulent activities. Once one of the entities is given as input, it is checked periodically and the training set is updated accordingly.
– Bayesian network: Bayesian networks are networks that store data about entities having probabilistic relationships. In this method, some data are known beforehand. A detection tool based on a Bayesian network can detect fraud in the healthcare system; it also gives warnings about future frauds that may happen.
– Logistic regression: This is a classification method used to classify a particular set of entities. For huge data sets it is generally used to analyze the data, producing a classification resulting in either 0s or 1s. In the healthcare system, this method can be used to
detect whether a person needs to be readmitted or not, whether a person has a particular disease or not, and so on; a minimal sketch of such a classifier is given below. To maintain the well-being of a healthcare system, it is very important to remove such fraudulent activities from the system. This is vital for the smooth functioning of the whole healthcare system and for the well-being of the patients (Prerna et al., 2014).
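As a hedged illustration of the logistic regression step mentioned in the list above, the following Python sketch classifies insurance claims as fraudulent (1) or genuine (0) from invented numeric features; the features, labels, and decision rule are hypothetical.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
# Invented claim features: billed amount, number of services, days in hospital (standardized)
X = rng.normal(size=(500, 3))
# Invented labels: 1 = fraudulent claim, 0 = genuine claim (synthetic rule plus noise)
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=500) > 1.0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)
clf = LogisticRegression().fit(X_train, y_train)

print("test accuracy:", clf.score(X_test, y_test))
print("fraud probability of first test claim:", clf.predict_proba(X_test[:1])[0, 1])
```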
14.8.2 Prediction of infections in the healthcare system using ML

The infections that are present in a healthcare unit can easily spread and affect a patient's health during his/her treatment in that particular unit. This can be a real problem for a patient who has come in with another ailment. ML can play an important role in evading such issues occurring in hospitals these days; solutions had to be found, so technology took a sharp turn toward advanced data science, and the use of ML in this sector became a boon for the problems faced daily. Let us consider the various clinical settings and the information required to carry out this work, which includes patient identification, sex, age, appointment identification number, appointment date, and the health status of the patient. The job is to identify the factors by which patients become infected who were not infected at admission. These patients are divided into a few categories:
– Patients with a stay period of 1 day
– Patients with no attack of infection
– Patients not affected by infection even after 2 days in the ICU but infected at the time of discharge
– Patients initially infected during admission but completely fine at discharge

The methodology used to find a solution to the problem is described as follows:
– Support vector machine: SVM is used to apply ML in day-to-day life. It works with very high-dimensional data and is a supervised learning algorithm. It represents the training examples as points in space, after which further points are mapped into that space, which helps in the subsequent prediction step.
– Decision trees: Working with an SVM can be demanding, so a simpler method that can be used is a decision tree. It is basically a model built from a combination of many small steps, consisting of nodes that are split according to the best among all candidate splits. This is a repetitive method of separating patients having a disease from those not having it. When the training dataset is changed, the tree may be affected, which happens to be a major drawback of this solution.
– Random forest: To offer better classification in healthcare applications, a layer is added to the decision tree, which acts as a random layer. A bootstrap sample is created for each tree, and a classification tree is grown on each sample.
– Imbalanced datasets: This basically means classifying the number of patients in the positive and negative classes and then trying to balance the two in one of two ways: by removing samples from the majority class or by adding more samples to the minority class. A balanced dataset gives better classification performance.
– NoSQL database: A database like this stores the basic information of the patients. It consists of a table with columns such as patient ID, name, age, gender, and disease.

Thus, we can say that ML plays a major role in the healthcare industry and makes people's tasks easier. It helps remove the problems faced by hospitals nowadays through the use of various algorithms that were not used before; a short sketch combining the random forest and class-imbalance ideas is given below.
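The sketch below is a hedged illustration, not the chapter's actual system: it combines the random forest and class-imbalance ideas above by training a class-weighted random forest on an artificially imbalanced dataset.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

# Artificial, highly imbalanced data: roughly 5% "infected" cases (class 1)
X, y = make_classification(n_samples=2000, n_features=8, weights=[0.95, 0.05], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=0)

# class_weight="balanced" reweights the minority class instead of resampling the data
forest = RandomForestClassifier(n_estimators=200, class_weight="balanced", random_state=0)
forest.fit(X_train, y_train)

print(classification_report(y_test, forest.predict(X_test)))
```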
14.8.3 Use of ML for healthcare professionals in making preauthorization decisions

Preauthorization is the process of controlling the waste of resources and of setting aside process requests that are not mandatory to execute. For this task, a reviewer is assigned to check each and every case and to look after the equipment or machines used, which results in added labor. To avoid such issues, ML comes to the rescue. The methods used here are preprocessing techniques, which help to increase the data quality in a database. To carry out the preauthorization process, the data obtained are processed with ML algorithms. The preprocessing and classification methods include filtering algorithms, random tree classification, naive Bayes, SVM, and nearest neighbor; a minimal pipeline along these lines is sketched below. In this way, many unwanted services and resources are removed from the database using recent technology, and the unneeded manpower is also diminished by this methodology. This process not only reduces labor but also saves a huge amount of money in the healthcare sector by preauthorizing data. As a result, a tool can be created using recent technology to generate results with great precision.
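As a hedged sketch of such a preauthorization classifier, the following Python code builds a scikit-learn pipeline that preprocesses (scales) the request features and compares the naive Bayes, SVM, and nearest-neighbor classifiers mentioned above by cross-validation; the data and the approve/reject labels are synthetic.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic preauthorization requests: label 1 = approve, 0 = reject (illustrative only)
X, y = make_classification(n_samples=600, n_features=10, random_state=3)

candidates = {
    "naive Bayes": GaussianNB(),
    "SVM": SVC(),
    "nearest neighbor": KNeighborsClassifier(n_neighbors=5),
}
for name, model in candidates.items():
    pipeline = make_pipeline(StandardScaler(), model)   # preprocessing + classifier
    scores = cross_val_score(pipeline, X, y, cv=5)
    print(f"{name}: mean CV accuracy = {scores.mean():.3f}")
```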
14.8.4 Improvisation in security in healthcare unit using ML

Because image processing is a costly process that requires a lot of time and memory, a new idea clearly had to be found that is cheap and convenient to use and that makes the services easier to access. It would be beneficial to the healthcare system to replace the old technology with cloud computing; in this way, the cost of ongoing operations can be reduced. To prevent the data from being leaked or intruded upon by an external body, data security is very important; otherwise, unauthorized people can access the data and make changes to it, which may cause great loss to the healthcare unit. Cryptosystems come to the rescue here, in combination with parallel and distributed systems. The various security mechanisms used are as follows:
– Service-oriented architecture
– Secure multiparty computation
– Secret sharing schemes

The main goal is to eradicate unauthorized access to the medical database. This information is very important and should not fall into the wrong hands. Adding layers to the data improves the security and inaccessibility of the data. In general, SVM and clustering algorithms are used for the image processing. A cloud security layer added to the existing layers helps protect valuable data from the outer world. The architecture used here is a three-tier architecture for data security purposes, consisting of a client, a cloud security layer, and a cloud provider. In this model, the cloud security layer first encrypts all health information by means of the HTTPS/SSL protocol to further secure the data. Second, this module uses a segmentation approach to protect medical images. Once the data are encrypted, the cloud security layer sends the clients' data to an external cloud provider to be processed safely. In this context, the module is responsible for guaranteeing adequate privacy and security for the clients' data while cloud resources are being used. To provide better security, a list of rules and regulations is made, containing the list of persons who are authorized and the permissions granted to the users. Several access control models help govern access to the data:
– Discretionary access control
– Attribute-based access control
– Mandatory access control
– Role-based access control

The original image undergoes a series of processes. First, it is passed through a pixel-level color extractor, which helps in knowing the color of each pixel. For further classification, the fuzzy C-means algorithm and SVM are used.
The original image undergoes a series of processes. First, it is passed through a pixel-level color extractor, which identifies the color of each pixel. For further classification, the fuzzy C-means algorithm and SVM are used. The fuzzy C-means step consists of parameter initialization, pixel clustering, and training sample selection. The result is a segmented image that no longer exposes all the details of the data, so the inner details remain accessible to authorized persons only (Mbarek et al., 2018).
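To make the pixel-clustering step concrete, a minimal numpy sketch of fuzzy C-means on grayscale intensities is given below. The input image, number of clusters, and fuzziness parameter are assumptions for illustration; a production pipeline would typically use a library implementation and combine the segmentation with an SVM classifier as described above.

```python
# Minimal fuzzy C-means clustering of pixel intensities (illustrative sketch).
import numpy as np

def fuzzy_c_means(pixels, n_clusters=3, m=2.0, max_iter=100, tol=1e-5, seed=0):
    """Cluster 1-D pixel intensities; returns cluster centers and memberships."""
    rng = np.random.default_rng(seed)
    x = pixels.reshape(-1).astype(float)          # flatten image to (N,)
    u = rng.random((n_clusters, x.size))
    u /= u.sum(axis=0, keepdims=True)             # memberships sum to 1 per pixel

    for _ in range(max_iter):
        um = u ** m
        centers = um @ x / um.sum(axis=1)         # weighted cluster centers (c,)
        dist = np.abs(x[None, :] - centers[:, None]) + 1e-12
        new_u = 1.0 / (dist ** (2 / (m - 1)))
        new_u /= new_u.sum(axis=0, keepdims=True) # renormalize memberships
        if np.max(np.abs(new_u - u)) < tol:
            u = new_u
            break
        u = new_u
    return centers, u

# Hypothetical 8-bit grayscale image; label each pixel by its strongest cluster.
image = np.random.default_rng(1).integers(0, 256, size=(64, 64))
centers, memberships = fuzzy_c_means(image)
segmented = memberships.argmax(axis=0).reshape(image.shape)
```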
14.8.5 Using ML in selection of risk factors of ventriculitis and meningitis in a neuro-ICU
Ventriculitis and meningitis often occur together in a patient, typically as a consequence of invasive neurosurgical procedures or penetrating head trauma. Using ML, three objectives can be fulfilled:
– Detection of ventriculitis and meningitis in patients who have been in the hospital for more than 48 h
– Comparison of patients exposed to the various risk factors present in the hospital
– Identification of healthcare-associated ventriculitis and meningitis (HAVM) risks and ranking of those risks using various ML algorithms
ML is well suited to identifying HAVM risk. In the first step, four decision trees are built. Weighted classifiers are used so that performance is boosted while type 1 and type 2 errors stay balanced, and stratified k-fold cross-validation is applied to prevent the trees from overfitting. The second step deals with feature selection and ranking; the methods include greedy algorithms, backward elimination, and forward selection. XGBoost computes an F-score for each feature, from which a ranked list is created, as in the sketch below. In this way, the probability of HAVM is calculated for each patient, and according to this score a patient is marked as safe or at risk. A set of risk factors is also identified so that the next patient admitted can be protected from developing ventriculitis or meningitis during the hospital stay. The models for identifying risk factors are:
– Linear models: principal component analysis (PCA) is applied to the dataset.
– Tree-based models: decision tree algorithms are used.
Research on this topic has produced statistics drawn from real data. Using ML, certain factors were found to have no effect on the increase in HAVM risk; these include gender, blood infection, medical sedation, and spinal diseases (Ivan et al., 2018).
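The sketch below illustrates the feature-ranking step using XGBoost's F-score (weight) importance under stratified k-fold cross-validation. The synthetic dataset, feature names, and model settings are assumptions for illustration and do not reproduce the study by Ivan et al. (2018).

```python
# Sketch: ranking candidate HAVM risk factors by XGBoost F-score
# under stratified k-fold cross-validation. Data here are synthetic.
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.model_selection import StratifiedKFold
from xgboost import XGBClassifier

feature_names = ["age", "icu_days", "drainage", "csf_leak", "surgery_count"]  # assumed
X, y = make_classification(n_samples=500, n_features=5, n_informative=3,
                           random_state=0)
X = pd.DataFrame(X, columns=feature_names)

scores = {name: 0.0 for name in feature_names}
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

for train_idx, _ in cv.split(X, y):
    model = XGBClassifier(n_estimators=100, max_depth=3)
    model.fit(X.iloc[train_idx], y[train_idx])
    # F-score = number of times a feature is used to split, accumulated per fold.
    for name, f in model.get_booster().get_score(importance_type="weight").items():
        scores[name] += f

ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
print("Feature ranking by accumulated F-score:", ranked)
```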
14.9 Challenges faced by the multidimensional clinical team
Healthcare monitoring systems within the hospital and at home generate enormous amounts of rich phenotypic data from a wide array of sources. Typical sources include clinical observations, continuous waveforms, laboratory results, medical images, and free-text notes. One fundamental set of techniques for working with these data is machine learning: systems that analyze and interpret measurements in a way that accurately recognizes the underlying patterns and trends. These patterns are useful for predicting future clinical events, such as hospital readmission, and for driving the policies inside clinical decision support tools. The key task for the clinical team is to interpret these data in a way that helps improve the overall quality of patient care. However, the size and complexity of the datasets, which are often multidimensional and continuously evolving, mean that interpretation is extremely difficult, even for experienced clinicians. In this section, we assess the machine learning approaches currently used for event prediction and decision support in healthcare, and in particular we highlight how these approaches handle multidimensional data. We also examine some of the main problems in deploying machine learning systems: missing or corrupted data, integration of heterogeneous and multimodal records, and generalization across patient populations and care settings.
14.9.1 Corrupted and missing data
Corrupted data arise when a recorded measurement no longer accurately reflects the true state of the object or person being measured. Missing data arise when there is no recorded measurement at a particular time. If left unrecognized, an analysis that does not account for missing or corrupted records can lead to flawed decision making; in the worst cases, this can mean patients being assigned the wrong course of treatment. Within the UK healthcare system, poor data quality has recently been linked to unnecessarily cancelled operations and to undetected incidents. To understand why health data can be of poor quality, it is helpful to first consider the steps typically involved in data collection. In general, collecting clinical data from patients involves several stages, each of which can introduce data corruption. Incorrect use of medical equipment has been associated with inaccurate measurements across a wide range of clinical conditions.
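As a simple illustration of how missing or corrupted values might be flagged and handled before modeling, the snippet below uses pandas on a hypothetical vital-signs table; the column names, plausibility limits, and imputation strategy are assumptions, not recommendations from the chapter.

```python
# Sketch: flagging corrupted values and imputing missing ones in vital signs.
# Column names and physiological limits below are illustrative assumptions.
import numpy as np
import pandas as pd

vitals = pd.DataFrame({
    "heart_rate": [72, 0, 88, np.nan, 310, 95],   # 0 and 310 look corrupted
    "spo2":       [98, 97, np.nan, 96, 92, 101],  # 101% is not physiological
})

# Treat physiologically implausible readings as corrupted -> mark them missing.
limits = {"heart_rate": (20, 250), "spo2": (50, 100)}
for col, (lo, hi) in limits.items():
    vitals.loc[~vitals[col].between(lo, hi), col] = np.nan

print("Missing values per column:\n", vitals.isna().sum())

# Simple imputation: forward fill, then column median for any leading gaps.
# A real pipeline would report imputation rates alongside any model output.
vitals = vitals.ffill().fillna(vitals.median())
print(vitals)
```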
The issue of poor device attachment is becoming especially critical as efforts are made toward medium- and long-term ambulatory monitoring outside the hospital environment, in order to reduce the strain on emergency services. Patients monitored in this way tend to be more physically active than those monitored in hospital, which leads to greater levels of movement artifact. These artifacts are compounded by the practical problem of sensor attachment degrading over time. Even if reliable device attachment could be guaranteed, the accuracy of the measurement methods themselves may vary. In many cases, medical equipment is subject to regulations that specify tolerances. For example, the US Food and Drug Administration mandates that all pulse oximeters for clinical use have a maximum root mean squared error of